Vowel – Vowels of Malayalam. Any of the set: [അആഇഈഉഊഋഎഏഐഒഓഔഅം]
VowelSign – Vowel signs. Any of the set [ാിീുൃെേൊോൗൂൈ]
Consonant – Consonants. Any of the set [കഖഗഘങചഛജഝഞടഠഡഢണതഥദധനപഫബഭമയരലവശഷസഹളഴറ]
Virama – The sign ്.
Visarga – The sign ഃ.
Anuswara – The vowel sign of അം, i.e. ം. It shares some properties of Chillu.
Chillu – Pure consonants, without any vowels. Chillus are any of ൻ, ർ, ൽ, ൾ, ൺ, ൿ, ൔ, ൕ, ൖ. The last four chillus are rarely used or archaic, but we can still consider them for our modeling. For historical encoding reasons, a Chillu can also appear in the base Consonant + Virama + ZWJ form. That means ൻ = ന + ് + ZWJ. Chillus never appear at the beginning of a word, but that is not relevant for a syllable analyser.
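One way to deal with the two encodings of a Chillu is to normalize the older Consonant + Virama + ZWJ sequence to the single-codepoint form before any analysis. The snippet below is a minimal sketch of that idea, not part of the model itself. The function name `normalize_chillus` and the mapping table are illustrative, and the table covers only the six commonly used chillus.

```python
VIRAMA = "\u0d4d"
ZWJ = "\u200d"

# Base consonant -> single-codepoint (atomic) chillu.
# Illustrative table: only the six commonly used chillus are listed.
LEGACY_CHILLU_BASES = {
    "\u0d23": "\u0d7a",  # ണ -> ൺ
    "\u0d28": "\u0d7b",  # ന -> ൻ
    "\u0d30": "\u0d7c",  # ര -> ർ
    "\u0d32": "\u0d7d",  # ല -> ൽ
    "\u0d33": "\u0d7e",  # ള -> ൾ
    "\u0d15": "\u0d7f",  # ക -> ൿ
}


def normalize_chillus(text: str) -> str:
    """Replace each Consonant + Virama + ZWJ sequence with its atomic chillu."""
    for base, chillu in LEGACY_CHILLU_BASES.items():
        text = text.replace(base + VIRAMA + ZWJ, chillu)
    return text


# ന + ് + ZWJ becomes the single codepoint ൻ (U+0D7B).
print(normalize_chillus("\u0d28\u0d4d\u200d") == "\u0d7b")
```

A plain Consonant + Virama without the ZWJ is left untouched, so ordinary conjuncts are not affected by this step.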
Signs – A term used to address various signs that modify a Consonant. Any of VowelSign, Virama, Anuswara, Visarga.
Conjunct – Refer to the formal definition we discussed in the previous blog post. We defined it as a Consonant combined with another Consonant or Conjunct using a Virama. Examples: സ + ് + ത = സ്ത, സ്ത + ് + ര = സ്ത്ര, ദ്ധ + ് + ര = ദ്ധ്ര, ദ്ധ്ര + ് + യ = ദ്ധ്ര്യ. But we need a more advanced version: that definition did not support DotReph (ൎ), which combines with a consonant or conjunct to form a Conjunct. To support DotReph as well, we will redefine Conjunct as
HalfConsonant Conjunct / Consonant
DotReph – The sign (ൎ). It combines with other consonants, as in this example: ൎ + യ = ൎയ in ഭാൎയ.
HalfConsonant – A Consonant followed by a Virama. Example: പ്, ര്, മ് etc. Or a DotReph.
A Vowel is a syllable. Vowels are often found at the beginning of a word. Example: അമ്മ. But for the specific case of syllables, we can relax the requirement of being at the start of a word and state generally that a vowel is a syllable. Note that a vowel appearing as a vowel sign is not what we are considering here. Vowel signs have their own properties.
A Chillu letter is a syllable.
A Consonant or Conjunct without any Signs is a syllable. For example, in the word തറ, both ത and റ are syllables.
A Consonant or Conjunct followed by Signs is a syllable. Here the Signs can be repeated more than once, but not freely. This syllable has the following characteristics:
It can end with a Virama only if it is the last item of a given word. For example, അത് has അ, ത് as syllables, but അത്ഭുതം has അ, ത്ഭു, തം as syllables.
Signs can occur two times in the following cases: (a) The first Sign is ു and the second is Virama. This combination is also called Samvruthokaram. Example: തു് in അതു്. (b) The first Sign is a VowelSign and the second is Anuswara. Examples: താം, തീം, തോം, തും etc.
ZWNJ marks a syllable boundary. A ZWNJ inserted between two blocks of text prevents a ligature and also marks a syllable boundary. For example, in തമിഴ്‌നാട്, the ZWNJ inserted after ഴ് and before നാ prevents a possible ഴ്ന Conjunct and hence also indicates that the pronunciation should break at that point. It is a bit weird to say that a ZWNJ forms a syllable, since it is just a separator. But while analysing a series of letters from beginning to end, it is technically okay to treat ZWNJ as a syllable block.
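The syllable rules described so far can be approximated with a regular expression. What follows is only a rough sketch under several assumptions, not the actual parser. It reads a Conjunct as one or more HalfConsonants followed by a Consonant, it lets a Vowel take an optional Anuswara (to cover അം), and it treats a Virama as word-final when it is followed by the end of the input or by a ZWNJ. All names (`SYLLABLE`, `split_syllables`) are hypothetical.

```python
import re

C = r"[\u0d15-\u0d28\u0d2a-\u0d39]"                  # Consonant
VOWEL = r"[\u0d05-\u0d0b\u0d0e-\u0d10\u0d12-\u0d14]"  # independent vowels
VOWEL_SIGN = r"[\u0d3e-\u0d43\u0d46-\u0d48\u0d4a\u0d4b\u0d57]"
VIRAMA = r"\u0d4d"
ANUSWARA = r"\u0d02"
VISARGA = r"\u0d03"
DOT_REPH = r"\u0d4e"
CHILLU = r"[\u0d7a-\u0d7f\u0d54-\u0d56]"
ZWJ = r"\u200d"
ZWNJ = r"\u200c"

# Assumed reading of the Conjunct rule: HalfConsonant* Consonant.
HALF = rf"(?:{C}{VIRAMA}|{DOT_REPH})"
CLUSTER = rf"{HALF}*{C}"

# Allowed sign sequences; a bare Virama only at the end or before a ZWNJ.
SIGNS = rf"""(?:
    \u0d41{VIRAMA}
  | {VOWEL_SIGN}{ANUSWARA}
  | {VOWEL_SIGN}
  | {ANUSWARA}
  | {VISARGA}
  | {VIRAMA}(?={ZWNJ}|\Z)
)"""

SYLLABLE = re.compile(rf"""
    {ZWNJ}
  | {CHILLU}
  | {C}{VIRAMA}{ZWJ}
  | {VOWEL}{ANUSWARA}?
  | {CLUSTER}{SIGNS}?
""", re.VERBOSE)


def split_syllables(word: str) -> list[str]:
    """Split a word into syllables; raise ValueError if it cannot be parsed."""
    syllables, pos = [], 0
    while pos < len(word):
        match = SYLLABLE.match(word, pos)
        if match is None:
            raise ValueError(f"cannot parse {word!r} at index {pos}")
        syllables.append(match.group())
        pos = match.end()
    return syllables


print(split_syllables("\u0d05\u0d24\u0d4d"))  # അത് -> അ, ത്
print(split_syllables("\u0d05\u0d24\u0d4d\u0d2d\u0d41\u0d24\u0d02"))  # അത്ഭുതം -> അ, ത്ഭു, തം
```

A regular expression is enough here because the Conjunct recursion only ever nests on one side. A PEG or hand-written parser would express the same rules more directly.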
signs. For example, it does not allow a
visarga. If that happens, the parser will fail to parse a word. It permits a Virama after a VowelSign, but only for Samvruthokaram (vowel sign = ു).
When a Virama comes in the middle of a word, it has the nature of combining consonants.
The order of Signs is also enforced. For example, you cannot have a Virama followed by the VowelSign ു, even though the reverse order is permitted.
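The ordering constraint on two consecutive signs can be written as a small predicate. This is a minimal sketch assuming that the only permitted pairs are the two named earlier: ു followed by Virama (Samvruthokaram), and a VowelSign followed by Anuswara. The function name `is_allowed_sign_pair` is hypothetical.

```python
VOWEL_SIGNS = set(
    "\u0d3e\u0d3f\u0d40\u0d41\u0d42\u0d43\u0d46\u0d47\u0d48\u0d4a\u0d4b\u0d57"
)
VIRAMA = "\u0d4d"
ANUSWARA = "\u0d02"
SIGN_U = "\u0d41"  # the vowel sign ു


def is_allowed_sign_pair(first: str, second: str) -> bool:
    """Return True if `first` followed by `second` is a permitted sign pair."""
    # (a) Samvruthokaram: ു followed by Virama.
    if first == SIGN_U and second == VIRAMA:
        return True
    # (b) Any VowelSign followed by Anuswara.
    return first in VOWEL_SIGNS and second == ANUSWARA


print(is_allowed_sign_pair(SIGN_U, VIRAMA))  # ു then ്
print(is_allowed_sign_pair(VIRAMA, SIGN_U))  # ് then ു
```

The first call returns `True` and the second `False`, which mirrors the asymmetry described above: the same two signs are accepted in one order and rejected in the other.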