Vowel
– Vowels of Malayalam. Any of the set [അആഇഈഉഊഋഎഏഐഒഓഔഔഅം].
VowelSign
– Vowel signs. Any of the set [ാിീുൃെേൊോൗൂൈ].
Consonant
– Consonants. Any of the set [കഖഗഘങചഛജഝഞടഠഡഢണതഥദധനപഫബഭമയരലവശഷസഹളഴറ].
Virama
– The sign ്.
Visarga
– The sign ഃ.
Anuswara
– The vowel sign of അം, i.e. ം. It shares some properties with Chillu.
Chillu
– Pure consonants, without any vowels. Chillus are any of ൻ, ർ, ൽ, ൾ, ൺ, ൿ, ൔ, ൕ, ൖ. The last four are rarely used or archaic, but we can still include them in our model. For historical encoding reasons, a Chillu can also appear in the form base Consonant + Virama + ZWJ. That means ൻ = ന + ് + ZWJ. Chillus never appear at the beginning of a word, but that is not relevant for a syllable analyser.
Signs
– A term for the various signs that modify a Consonant. Any of VowelSign, Virama, Anuswara, Visarga.
Conjunct
– See the formal definition we discussed in the previous blog post. There we defined it as a Consonant combined with another Conjunct or Consonant using Virama. Examples: സ + ് + ത = സ്ത, സ്ത + ് + ര = സ്ത്ര, ദ്ധ + ് + ര = ദ്ധ്ര, ദ്ധ്ര + ് + യ = ദ്ധ്ര്യ. But we need a more advanced version. That definition did not support DotReph (ൎ), which combines with a consonant or conjunct to form a Conjunct. To support DotReph as well, we redefine Conjunct as HalfConsonant (Conjunct / Consonant), that is, a HalfConsonant followed by either a Conjunct or a Consonant.
DotReph
– The sign ൎ. It combines with other consonants as in this example: ൎ + യ -> ൎയ in ഭാൎയ.
HalfConsonant
– A Consonant followed by Virama. Example: പ്, ര്, മ് etc. Or a DotReph.
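The character classes defined above can be sketched as plain Python sets. This is only an illustration, not the grammar used by the analyser: the names `classify`, `normalize_chillus` and `LEGACY_CHILLUS` are hypothetical, the Vowel set leaves out the two-character അം, and the legacy chillu table lists only the five common chillus, extending the ൻ = ന + ് + ZWJ example to the other four as an assumption.

```python
# Character classes from the definitions above (illustrative sketch).
VOWELS = set("അആഇഈഉഊഋഎഏഐഒഓഔ")
CONSONANTS = set("കഖഗഘങചഛജഝഞടഠഡഢണതഥദധനപഫബഭമയരലവശഷസഹളഴറ")
CHILLUS = set("ൻർൽൾൺൿൔൕൖ")
# Vowel signs are combining marks, so they are written as escapes here.
VOWEL_SIGNS = set(
    "\u0D3E\u0D3F\u0D40\u0D41\u0D42\u0D43\u0D46\u0D47\u0D48\u0D4A\u0D4B\u0D57"
)
VIRAMA = "\u0D4D"
VISARGA = "\u0D03"
ANUSWARA = "\u0D02"
DOT_REPH = "\u0D4E"
ZWJ = "\u200D"

# Signs: any of VowelSign, Virama, Anuswara, Visarga.
SIGNS = VOWEL_SIGNS | {VIRAMA, ANUSWARA, VISARGA}

# Legacy "Consonant + Virama + ZWJ" sequences and their atomic chillus.
# Assumption: only the five common chillus are mapped.
LEGACY_CHILLUS = {
    "\u0D23" + VIRAMA + ZWJ: "\u0D7A",  # ണ + ് + ZWJ -> ൺ
    "\u0D28" + VIRAMA + ZWJ: "\u0D7B",  # ന + ് + ZWJ -> ൻ
    "\u0D30" + VIRAMA + ZWJ: "\u0D7C",  # ര + ് + ZWJ -> ർ
    "\u0D32" + VIRAMA + ZWJ: "\u0D7D",  # ല + ് + ZWJ -> ൽ
    "\u0D33" + VIRAMA + ZWJ: "\u0D7E",  # ള + ് + ZWJ -> ൾ
}


def classify(ch: str) -> str:
    """Return the class name of a single code point, or 'Other'."""
    if ch in VOWELS:
        return "Vowel"
    if ch in VOWEL_SIGNS:
        return "VowelSign"
    if ch in CONSONANTS:
        return "Consonant"
    if ch in CHILLUS:
        return "Chillu"
    if ch == VIRAMA:
        return "Virama"
    if ch == VISARGA:
        return "Visarga"
    if ch == ANUSWARA:
        return "Anuswara"
    if ch == DOT_REPH:
        return "DotReph"
    return "Other"


def normalize_chillus(text: str) -> str:
    """Rewrite legacy chillu sequences as atomic chillu code points."""
    for legacy, atomic in LEGACY_CHILLUS.items():
        text = text.replace(legacy, atomic)
    return text
```

Normalizing first means that later steps only need to recognise the atomic Chillu form.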
Vowel. Vowels are often found at the beginning of a word, as in അമ്മ. For syllables, though, we can relax the start-of-word condition and state in general that a Vowel is a syllable. Note that a vowel appearing as a vowel sign is not what we are considering here; vowel signs have their own properties.
A Chillu letter is a syllable.
A Consonant without any Signs is a syllable. For example, in the word തറ, both ത and റ are syllables.
A Consonant or Conjunct with Signs is a syllable. Here the Signs can occur more than once, but not freely. This syllable has the following characteristics:
– Signs can be Virama only if it is the last item of a given word. For example, അത് has അ, ത് as syllables, but അത്ഭുതം has അ, ത്ഭു, തം as syllables.
– Signs can occur twice in the following cases: (a) The first Sign is ു and the second is Virama. This combination is also called Samvruthokaram. Example: തു് in അതു്. (b) The first Sign is a VowelSign and the second is Anuswara. Examples: താം, തീം, തോം, തും etc.
ZWNJ marks a syllable boundary. A ZWNJ inserted between two blocks of text breaks a ligature and also inserts a syllable boundary. For example, in തമിഴ്‌നാട്, the ZWNJ inserted after ഴ് and before നാ prevents a possible ഴ്ന Conjunct and hence also signals that the pronunciation should break at that point. It is a bit weird to say that a ZWNJ forms a syllable, since it is just a separator. But while analysing a series of letters from beginning to end, it is technically okay to consider ZWNJ as a syllable block.
The model is strict about signs. For example, it does not allow a VowelSign, virama or anuswara after a visarga. If that happens, the parser will fail to parse the word. It permits a virama after a VowelSign, but only for Samvruthokaram (vowel sign = ു).
When a virama comes in the middle of a word, it has the nature of consonant combining.
The order of Signs is also enforced. For example, you cannot have a virama followed by the VowelSign ു, even though the reverse order is permitted.
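The rules above can be sketched as a small regular-expression based syllabifier. This is a minimal illustration under stated assumptions, not the parser described in this post: `syllabify` and all constant names are hypothetical, the input is a single word with atomic chillus, the Vowel class omits the two-character അം, and a Virama is accepted as a final Sign either at the end of the word or directly before a ZWNJ (the second case is an assumption made so that the തമിഴ്‌നാട് example parses).

```python
import re

# Character classes, written as code point ranges matching the sets above.
VOWEL = "[\u0D05-\u0D0B\u0D0E-\u0D10\u0D12-\u0D14]"
VOWEL_SIGN = "[\u0D3E-\u0D43\u0D46-\u0D48\u0D4A\u0D4B\u0D57]"
CONSONANT = "[\u0D15-\u0D28\u0D2A-\u0D39]"
CHILLU = "[\u0D7A-\u0D7F\u0D54-\u0D56]"
VIRAMA, ANUSWARA, VISARGA = "\u0D4D", "\u0D02", "\u0D03"
DOT_REPH, SIGN_U, ZWNJ = "\u0D4E", "\u0D41", "\u200C"

# HalfConsonant: Consonant + Virama, or a DotReph.
HALF_CONSONANT = f"(?:{CONSONANT}{VIRAMA}|{DOT_REPH})"
# A bare Consonant, or a Conjunct (HalfConsonants followed by a Consonant).
CLUSTER = f"{HALF_CONSONANT}*{CONSONANT}"
# Permitted Signs, with the two-sign cases tried first.
SIGNS = (
    f"(?:{SIGN_U}{VIRAMA}"                 # Samvruthokaram
    f"|{VOWEL_SIGN}{ANUSWARA}"             # VowelSign then Anuswara
    f"|{VOWEL_SIGN}|{ANUSWARA}|{VISARGA}"  # a single sign
    f"|{VIRAMA}(?={ZWNJ}|$))"              # Virama at word end (or before ZWNJ)
)
SYLLABLE = re.compile(f"{VOWEL}|{CHILLU}|{ZWNJ}|{CLUSTER}{SIGNS}?")


def syllabify(word: str) -> list[str]:
    """Split a word into syllable blocks; raise ValueError if it cannot be parsed."""
    syllables, pos = [], 0
    while pos < len(word):
        match = SYLLABLE.match(word, pos)
        if match is None:
            raise ValueError(f"cannot parse {word!r} at offset {pos}")
        syllables.append(match.group())
        pos = match.end()
    return syllables


print(syllabify("തറ"))        # ['ത', 'റ']
print(syllabify("അത്ഭുതം"))   # ['അ', 'ത്ഭു', 'തം']
```

Because a Virama is only consumed inside a HalfConsonant or as a word-final Sign, a sequence such as virama followed by ു, or a VowelSign after a Visarga, leaves an unmatched character and the sketch raises `ValueError`, mirroring the parse failure described above.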