Linguistics, Psycholinguistics and Semantics

Language, the storehouse of all human Knowledge, is represented by words and meanings. Language by itself has an Ontological structure, Epistemological underpinnings and Grammar. Across languages, even though words and usages differ, the underlying meanings remain the same in the respective communications. Yet "Meanings" are understood by human beings on a Contextual, Relative, Tonal and Gestural basis. The dictionary or 'as it is' meanings are rarely taken into consideration; thus human language is ambiguous in one sense and flexible in another.

Computers, on the other hand, are hard-coded to go by the dictionary meanings. Thus teaching (programming) Computers to understand natural (human) language has been the biggest challenge haunting Scientists ever since the idea of Artificial Intelligence (AI) came into existence. In addition, this has led to the obvious question of "What is intelligence?" from a Computational perspective. Since defining intelligence precisely is impossible, this field of study has taken many shapes, such as Computational Linguistics, Natural Language Processing and "Machine Learning". Artificial Intelligence, instead of being used as a blanket term, is now increasingly being used under the name "Analytics" in many critical applications.

Sanskrit, being the oldest, is also the most Scientific and Structured language. Since time immemorial, Sanskrit has had many hidden Algorithms built into it, as part of its vast scientific treatises, for analysing "Meanings" or "Word sense" from many perspectives. "It is perhaps our job to discover and convert the scientific methods inherent in Sanskrit into usable Computational models and Tools for Natural Language Processing rather than reinventing the wheel" - as some Scientists put it. The purpose of this blog is to expose some of the hidden, intricate tools and methodologies used in Sanskrit for centuries to derive precise meanings of human language to a larger audience, particularly Computational Linguists, for further study, analysis and deployment in Natural Language Processing.

In addition, Sanskrit, even though it is flexible as a human language, is the least ambiguous, as the structure of the language is precisely defined from a semantic and syntactic point of view. From a Psycholinguistic perspective, this blog could also give us a glimpse of the advanced linguistic capabilities of our forefathers, as well as their highly disciplined approach towards structure and usage.

Friday, February 22, 2013

Lost in Translation - Yogaartha vs. Rooddyartha

Meanings get lost in translation. This generally happens, and is accepted to some degree, in other languages. But with Sanskrit, translations can sometimes be completely wrong, particularly with respect to shastras (sciences). What is so special here, and why?

Prakritih (root - both the verb root, called Dhatu, and the noun root, called Praatipatika) joins with Pratyayah (loosely termed a suffix, but it is more than just a suffix) to give "Padam", the word in Sanskrit. The entire Sanskrit language is nothing but a mixture of Prakritih and Pratyayah. Here the word Prakritih is of feminine gender and the word Pratyayah of masculine gender (connecting with the higher principle of Prakriti - Purusha).

Similarly, both Prakritih and Pratyayah contribute meaning to a Padam (word). One conveys the conceptual (root) meaning and the other its (vyavahara) meaning in worldly usage. The original meaning of a word (Prakritih + Pratyayah) is called Yogaarthah. The word is Yogah (not Yogh or Yogaa - both are wrong pronunciations, one widely used in Northern India and the other by people in Western countries). Yogah means enjoinment; thus Yogaarthah is the original meaning of a word, obtained when Prakritih and Pratyayah are enjoined.

However, when a word is used over a long time for a specific purpose, its meaning gets associated with that purpose. This superimposing of a meaning on a word is called Rooddyarthah. Such superimposing (change of meaning) is dealt with in Sanskrit texts that are more than 2000 years old - thus this is another proof that the language is very ancient and was also widely used.

When we look at dictionaries, the first choice of meaning is always the Rooddyarthah and not the Yogaarthah. But in Sanskrit Shastras (scientific treatises) it is invariably the Yogaarthah that is used and not the Rooddyarthah - this is the case even with shastric texts written as late as the 17th century. Thus when we read or translate Sanskrit scientific texts we have to be mindful of the yogaartha and very careful about the contextual meaning as well. The Bhagavadgita, which is a Yogashastra as well as a Gitopanishad, and the Yogasutra of Patanjali Maharishi are also subject to these rules when translated, as well as to the important "Rule of studying shastra" in Sanskrit.

The rule of shastra study is that before one embarks on a study of Vedanta one should study Vyakarana, Mimamsa and Nyaya - to understand the Shankara bhashyam of the Bhagavadgita one needs these three shaastras. But nowadays people who have not studied even the basics make free-flowing translations of the Bhagavadgita, Yogasutra, Yogavaashista and many other texts with the help of somebody else's translation, which is again based on Rooddyarthas. This is wrong, as the meanings get diluted.

One needs a strong understanding of the meanings of Dhatu, Upasarga (prefix) and Pratyayah, in addition to the Sanjna /Paribhasha (the nomenclature and codewords /acronyms) of the specific shastra that is being translated. Without such elaborate preparation, the translation and the effort become unworthy and useless.

Some examples of Yogaartha vs. Rooddyartha.

The word "Ooha" is generally used for "Guess", but its Yogaartha is "Application". The word "Laavanyam" is used to describe exceptional beauty; in Yogaartha it actually means "Saltiness". The word "Vyakti" is used to refer to a person, whereas its Yogaartha is "manifest" or "known". Even the most talked-about word, "Yogah", is currently used for "exercise", while its Yogaartha is "Union" or "enjoinment". Similarly, the Yogaartha of the word "Bhoo" is "be", and its derivative Bhoota means "Being" (in the sense of life and life-form - Life is eternal and always exists; only the forms get formed or changed). Similarly, the word Dhyaanam is popularly used for Meditation, whereas its Yogaartha is "Brood-over". Similarly, the word "gamanam" (gam /gach dhatu) means not going /travel, but reaching or attaining.

In the same way, Upadesah, Upavaasah and Upanyaasah - all 3 are kridanta words - in Yogaartha mean being /placing near to the object of focus (God), yet in Rooddyartha Upadesam means advice, Upavaasam means restraint from food and Upanyaasam means spiritual discourse. Similarly, Avataara means descent /getting down, but it has come to mean manifestation, and now, after the popular movie, it has come to mean one's image in a digital /virtual world.

Another interesting point is that there are degrees of meaning for a word. E.g.: the word Shariram generally means body, but the body of a youth is called "Dehah", a man's body is called "Gaatram", then "Kaaya", and an old man's body "Kalevaram". A form /Devata form (male or female), and in some cases a female body, is called "Vapuh" (Vapuz stem), and a female body "Tanuh". Also, 'Aakaara' is used for form and 'Aakritih' for body in a general sense.

Another corrupted word is "Aarya" - "Ri" Dhatu + Nyat pratyayanta kridanta roopam = Aryam (Aaryam yasya sah = Aaryah). As per yogaartha, Aaryah is the one who instills order, yet in rooddyartha it is applied to a noble person or a person of a higher race. Here it is to be noted that, as per yogaartha, only a kshatriya in pravritti (in action) or God as an Avataarah (again in action) can be Aarya - like Sri Rama, Sri Krishna, Maharajah Vikramaditya, Maharajah Bhoja, Maharajah Shivaji and Maharaja Krishnadevaraya, as they established Dharma. A renunciate cannot, because renunciates have gone beyond Dharma and are on the path of Moksha or have attained it. If they happen to be social reformers as well, then they can be addressed as Aarya. If they are pure enlightened beings then they can be considered equivalent to God, but not Aarya. Nature is the biggest Aaryaa.

The earlier rooddyartha of Aarya became noble person; later, due to the influence of Western indologists, it became invaders. Now, as per the convenience of Tamil Nadu politicians, Aarya means a fair-skinned person (North Indian) who displaced the so-called native population down south - the height of ignorance and gullibility!

Prithvee - "prith" Dhatu - unaadi - "Prithu" - its Stree lingam (feminine gender) is Prithvee - yogaartha = manifold (that which is one yet manifolds into many - vyakarana itself teaches vedanta!). In Rooddyarthah this word is used for Earth or Big.

Ajinam - this word (taddhita compound) means something connected with a sheep - used for sheep wool (sweater) etc. or sheep skin. Later this word came to mean skin in general. Now people associate Ajinam with tiger skin.

Similarly, the word avagamanam means understanding; gnanam means awareness; Buddhih means intellect; Matam means opinion or abhiprayah - this has now become Religion, etc. There are more words for the various degrees /grades of human knowledge, such as pratipattih, prateetih, sampratyayah, dheeh, bodha, samvit, gnanam, etc. - each one is of a higher order than the previous one. There are no equivalent English words for these, thus it is difficult or impossible to translate Sanskrit Shastras into other languages.

Thus, for a correct understanding of a particular Shastra one has to study it in its original language. Translation is such a poor alternative that in some cases we would be better off without studying it. E.g.: if one tries to translate the khaNDana-khaNDa-khAdya of the great poet Sriharsha, one will understand. Similarly, many scholars admit that there is only one good translation (in English) of Sri Nagarjuna's Moola Madhyamika kaarika in all these years of Buddhist studies - no wonder Buddha is misunderstood - and so there is a fight between 2 wrong understandings, which is still going on! (The same is the case with the writings of J Krishnamurthi, which are in English; translating them into non-European languages would be a herculean task.)

All these collectively highlight the mistakes in our understanding born out of not studying Sanskrit properly. Our entire culture is based on Sanskrit, yet we don't learn it! How then will we know the hidden values and Scientific rationale behind our culture? Not knowing is certainly a shame.

Widespread study of Sanskrit shastras stopped in the middle of the 19th century. Some of our fathers and grandfathers in the past 2-4 generations must have studied the Vedas /Sanskrit, but they didn't study the shastras, particularly Vyakarana, which is the foundation stone of the language. Though Veda paaTashaalas somehow survived, many of the shastra paaTashaalas were closed at the beginning of the last century. Thanks to the efforts of many traditional MaTas and Acharyaas, some shastra paaTashaalas were revived, where traditional shaastraas are taught in a traditional way. My humble salutations to them - but for them I wouldn't be writing this, and I'm merely a pipe carrying the thoughts of teachers.

By saying all this I'm not saying Rooddyartha is wrong. All I'm saying is that one should be mindful of which meaning is used in a particular Shastra in a particular context, and translate accordingly. This is also important for Computational Linguists who are developing Machine Translation systems.
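For Machine Translation, the distinction could be sketched as a lexicon that stores both meanings for each word and picks one according to the kind of text. The following is only a minimal Python sketch under stated assumptions, not a real system: the `LEXICON` structure, the `gloss` function and its `shastric` flag are hypothetical names introduced for illustration, and the glosses are simply the examples given earlier in this post.

```python
# Toy lexicon: each entry keeps both senses discussed in this post.
# The structure and all names here are illustrative assumptions.
LEXICON = {
    "ooha": {"yogaartha": "application", "rooddyartha": "guess"},
    "laavanyam": {"yogaartha": "saltiness", "rooddyartha": "exceptional beauty"},
    "vyakti": {"yogaartha": "manifest", "rooddyartha": "person"},
    "yogah": {"yogaartha": "union", "rooddyartha": "exercise"},
    "dhyaanam": {"yogaartha": "brood over", "rooddyartha": "meditation"},
}


def gloss(word: str, shastric: bool) -> str:
    """Return the yogaartha for shastric text, else the rooddyartha."""
    entry = LEXICON[word.lower()]
    return entry["yogaartha"] if shastric else entry["rooddyartha"]


print(gloss("Yogah", shastric=True))   # union
print(gloss("Yogah", shastric=False))  # exercise
```

In practice the choice would depend on the specific Shastra and context rather than on a single flag; the flag only stands in for that decision.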

Thursday, February 7, 2013

Why Sanskrit? in Computational Linguistics - Part 1

This is a concise introduction to "How is Sanskrit the most suitable language for Computing?" and now "In what way is Sanskrit suitable for Computational Linguistics?"

When I first heard, a few years back, that Sanskrit is the most ideally suited language for Computing, I was curious to know how, but I couldn't get any straightforward answer. Later I found out on my own, with a bit of research on the Web and discussions with Linguistic scholars.

Two linguists, Dr. Leonard Bloomfield and Dr. Zellig Harris, who lived in the early 20th century, were responsible for the theories of Structural Linguistics - a main reason for the development of Computer programming languages. These theories were widely used in the first and second generations of programming languages.

I found that these two linguists, Leonard Bloomfield and Zellig Harris, both went to Germany during the late 19th /early 20th century and studied intensely both Vedic Grammar (Pratisakyam) and the Paninian system - for 7 years! - in their post-doctoral research /studies. They both studied in detail the works of Dr. Otto von Böhtlingk, a German Indologist and Sanskrit scholar specializing in Vyakarana.

From Wikipedia page -
"Böhtlingk was one of the most distinguished scholars of the nineteenth century, and his works are of pre-eminent value in the field of Indian and comparative philology. His first great work was an edition of the Sanskrit grammar of Panini, Aṣṭādhyāyī, with a German commentary, under the title Acht Bücher grammatischer Regeln (Bonn, 1839–1840)."

From Wikipedia page -
"The idea of describing the structure of language with rewriting rules can be traced back to at least the work of Pāṇini (before the 4th century BC), who used it in his description of Sanskrit word structure. American linguists such as Leonard Bloomfield and Zellig Harris took this idea a step further by attempting to formalize language and its study in terms of formal definitions and procedures (around 1920–60)."

IAL (International Algebraic Language) was the original name of ALGOL 58, one of the early popular programming languages. John Backus, a programmer at IBM, developed a notation for describing its syntax, similar in method to the rules of Sanskrit Grammar. Later, when ALGOL 58 was developed further into ALGOL 60, Peter Naur adopted and simplified this notation, which became known as Backus Normal Form (BNF notation) - it became a huge success and brought major developments to the computer field.

From Wikipedia page -
"Further development of ALGOL led to ALGOL 60; in its report (1963), Peter Naur named Backus's notation Backus Normal Form, and simplified it to minimize the character set used. However, Donald Knuth argued that BNF should rather be read as Backus–Naur Form, as it is "not a normal form in any sense" unlike, for instance, Chomsky Normal Form. The name Pāṇini Backus form has also been suggested in view of the facts that the expansion Backus Normal Form may not be accurate, and that Pāṇini had independently discovered a similar notation centuries earlier"

At a later date, programming languages and linguistics developed further when Noam Chomsky introduced Generative Grammar. (Noam Chomsky was a student of Dr. Zellig Harris - linguist and Sanskrit Vyakarana scholar.) The speciality of the Sanskrit language is itself its Generative Grammar & Morphology.
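To make the idea of rewriting rules and generation concrete, here is a small Python sketch. The toy grammar is an assumption made up purely for illustration (it is neither Panini's rules nor the ALGOL grammar); in BNF-style notation it would read `<sentence> ::= <noun> <verb>`, `<noun> ::= birds | fish`, `<verb> ::= fly | swim`. The hypothetical `expand` function rewrites each non-terminal by its alternatives until only terminals remain.

```python
from itertools import product

# Toy rewriting rules (hypothetical, for illustration only).
# Keys are non-terminals; each alternative is a sequence of symbols.
RULES = {
    "<sentence>": [["<noun>", "<verb>"]],
    "<noun>": [["birds"], ["fish"]],
    "<verb>": [["fly"], ["swim"]],
}


def expand(symbol: str) -> list[str]:
    """Return every string derivable from `symbol` by rewriting."""
    if symbol not in RULES:  # terminal: nothing left to rewrite
        return [symbol]
    results = []
    for alternative in RULES[symbol]:
        # Expand each symbol of the alternative, then combine the choices.
        parts = [expand(s) for s in alternative]
        results.extend(" ".join(choice) for choice in product(*parts))
    return results


print(expand("<sentence>"))
# ['birds fly', 'birds swim', 'fish fly', 'fish swim']
```

This sketch only terminates because the toy grammar has no recursive rules; a recursive grammar would need a depth limit or a lazy generator.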

Thus it is very clear that Maharishi Panini not only helped to protect Sanskrit grammar by writing his linguistic canon "Ashtadhyayi"; he also helped create Computer Programming languages. Panini - the first Computer Scientist.

Part 2 - How the rules of Ashtadhyayi helped the programming languages, or how many of Panini's ideas are used "as is" in programming languages.

Friday, February 1, 2013

Modern Linguistic Terms and their Sanskrit equivalents

This table gives non-Sanskrit linguists a short overview of the Sanskrit equivalents of popular Linguistics terms. It contains both applied and non-applied linguistics references - more to come.

Linguistics Term (Subsystem)
Sanskrit equivalent and  references
Phoneme (sound)
Lemma (psycholinguistics)
Abstract idea
Sphota - Vakyapadiya
Pashyanti (vision /flash)
Mental image of a meaning
Madhyama (thought /medium)
Sound (meaningful)
Vaikari (uttered)
Word (Lexicon)
Verbalized /meaningful
Literal Meanings and Synonyms
Padam Many subdivisions depending on action, tense, mood, gender, etc.
Word creation /derivation (morphology)
Root, Suffix and Stem meanings
Vyutpatti = Vyakarana – Prakriti + Pratyaya
Part of Speech = Word(s) in sentences
Prefix, Suffix and prepositions
Phrase meanings and changes
Vritti and Vigraha - meaning
Syntactic structure
Phrases and sentences
Literal Meanings of Phrases /words
Abhidha-meanings, Vibhaktis and relationship with Verb
Sound and Sense relation
Fitness compatibility
Causality relations
Literal meaning of a unit of speech - phoneme
Word (stand alone)
Literal meaning /dictionary meaning
Two types – Yogaartha meaning from root, Rooddhyartha -meaning from usage – dictionary meaning
Word family
Meanings of Universals
Word sense
Sentence meaning and word meanings
Changes in word meanings in sentences
Kaaraka wrt. Akanksha
Sentence meanings on the whole
Combined meanings from words and concepts
Kaaraka – Sentence meanings Vaachyaartha
Sentence meanings - contextual
Combined meanings based on context
Avaachyaartha (in between words), Tatparyaartha, Vyangyartha – Samyogadi, Vaktraadi (Vakyapadiya, Paramalaghumnjusha)
Kosha – Amaram, Medhini, etc.
Conceptual Understanding

Many of the modern concepts in Linguistics, such as Semantics, Pragmatics, Psycholinguistics, etc., have been dealt with in detail in ancient Sanskrit treatises, from multiple viewpoints - those of 14 different philosophical schools: 6 Dharsana + 4 Bouddha + 1 Jaina + 1 Charvaka + 1 Rasa + 1 Vyakarana (which contains the original language structure). Each of these schools had its own Ontology, Epistemology, Linguistic theories, etc.

Of these schools only 3 emerged as popular - Vyakarana, Mimamsa and Nyaya (the latter two belong to the Dharsana). These 3 philosophical schools of linguistic analysis developed their own abstract methods of meaning analysis - the mechanism is called "Saabda Bodha". From the concepts of Navya Nyaya and Saabda Bodha, western logicians and mathematicians developed the symbolic representation of language units & linguistics - that will be covered later.

Will add more in this subject of linguistics terms -CGK