Issue
Although the Malayalam script is an abugida, its collation is phonemic. This creates a problem, described through the examples below. It may well be a general Indic problem, since all the scripts in that block share the relevant properties.
The default collation table assumes the following relationship:
Full vowel < Consonant < Vowel Sign < Virama
As a result, the sample words below collate in this order:
വാക < വാകമരം < വാക്
വാക്കുകള് < വാക്കു്
A Malayalam user, however, expects:
വാക് < വാക < വാകമരം
വാക്കു് < വാക്കുകള്
The problematic cases are words ending in chandrakkala (the visible virama). The bottom line is that the community expects ക് < ക, that is, <KA, VIRAMA> sorts before <KA>.
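The mismatch can be reproduced with a small sketch. It relies on the fact that the Malayalam block is laid out as full vowels, then consonants, then vowel signs, then virama, so a plain code point sort stands in for the default relationship here. This is only a model of that relationship, not a real collation implementation, and the variable names are illustrative.

```python
# Sample words from the text, spelled with explicit code points.
VA, KA, MA, RA, LLA = "\u0D35", "\u0D15", "\u0D2E", "\u0D30", "\u0D33"
SIGN_AA, SIGN_U, ANUSVARA, VIRAMA = "\u0D3E", "\u0D41", "\u0D02", "\u0D4D"

vaak = VA + SIGN_AA + KA + VIRAMA                     # ends in chandrakkala
vaaka = VA + SIGN_AA + KA
vaakamaram = VA + SIGN_AA + KA + MA + RA + ANUSVARA
vaakku = VA + SIGN_AA + KA + VIRAMA + KA + SIGN_U + VIRAMA
vaakkukal = VA + SIGN_AA + KA + VIRAMA + KA + SIGN_U + KA + LLA + VIRAMA

# Python compares strings by code point, which mirrors
# Full vowel < Consonant < Vowel Sign < Virama for this block.
default_order_1 = sorted([vaak, vaaka, vaakamaram])
default_order_2 = sorted([vaakku, vaakkukal])

# The order a Malayalam user expects, as stated in the text.
expected_order_1 = [vaak, vaaka, vaakamaram]
expected_order_2 = [vaakku, vaakkukal]
```

In both groups the word ending in chandrakkala lands last under the default model, while the expected order puts it first.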
A Solution
The essential idea is to contract <consonant, virama> to a value less than <consonant>. The primary weights also follow this inequality:
Vowel Signs < Full Vowels < Consonants
The solution is illustrated through the following examples:
-x = symbol of vowel x
x_ = chillu of consonant x
m_ = anuswara
~ = virama
The second column shows how each sequence is viewed conceptually to arrive at the contraction.
n_ = n ~ zw-space = [1A2B.0018]
n~ = n ~ = [1A2B.0019]
nu~ = n ~ = [1A2B.0020]
na = na = [1A2B.0021]
naa = na -aa = [1A2B.0021], [1A0A.0020]
...
nka = na ~ ka = [1A2B.0019], [1A18.0021]
...
Together with anuswara
m_ka = nga ~ ka = [1A1C.0019], [1A18.0021]
...
m_ja = nya ~ ja = [1A21.0019], [1A1F.0021]
...
m_ta = na ~ ta = [1A2B.0019], [1A27.0021]
...
m_ya = ma ~ ya = [1A30.0019], [1A31.0021]
...
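The anuswara rows can be read as a rewrite step that runs before weights are assigned: an anuswara followed by a consonant is viewed as a nasal consonant plus virama. The sketch below covers only the four rows listed above. The function name `expand_anusvara` is hypothetical, "ta" is assumed to mean the dental ത, and leaving any other anuswara unchanged is an assumption, since the table elides the remaining rows.

```python
VIRAMA = "\u0D4D"
ANUSVARA = "\u0D02"

# Following consonant -> nasal used in the conceptual view (four rows only).
NASAL_BEFORE = {
    "\u0D15": "\u0D19",  # ka -> nga
    "\u0D1C": "\u0D1E",  # ja -> nya
    "\u0D24": "\u0D28",  # ta -> na (dental ta assumed)
    "\u0D2F": "\u0D2E",  # ya -> ma
}

def expand_anusvara(word):
    """Rewrite anuswara + consonant as nasal + virama + consonant."""
    out = []
    for i, ch in enumerate(word):
        nxt = word[i + 1:i + 2]  # empty string at the end of the word
        if ch == ANUSVARA and nxt in NASAL_BEFORE:
            out.append(NASAL_BEFORE[nxt] + VIRAMA)
        else:
            out.append(ch)  # assumption: everything else is left as-is
    return "".join(out)
```

After this step the <nasal, virama> pair can be weighted by the same contraction as any other <consonant, virama> pair.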
Notes
- 'n' and '~' together form a contraction.
- n_, n~ and nu~ differ only at the secondary level.
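A minimal sketch of a two-level sort key built on this contraction follows. The secondary values 0x19, 0x20 and 0x21 are taken from the table above. The primary values are stand-ins (a class rank plus the code point) rather than the real table weights, the chillu form (0x18) and the anuswara rewrite are omitted, and the function names are hypothetical.

```python
VIRAMA = "\u0D4D"
SIGN_U = "\u0D41"

def is_consonant(ch):
    return "\u0D15" <= ch <= "\u0D39"

def primary(ch):
    # Stand-in primary weight honouring Vowel Signs < Full Vowels < Consonants.
    if "\u0D3E" <= ch <= "\u0D4C":
        rank = 0  # vowel signs
    elif "\u0D05" <= ch <= "\u0D14":
        rank = 1  # full vowels
    elif is_consonant(ch):
        rank = 2  # consonants
    else:
        rank = 0  # assumption: anything else sorts with the signs
    return (rank, ord(ch))

def sort_key(word):
    primaries, secondaries = [], []
    i = 0
    while i < len(word):
        ch = word[i]
        if is_consonant(ch) and word[i + 1:i + 3] == SIGN_U + VIRAMA:
            sec, step = 0x20, 3  # nu~ : consonant, u sign, virama
        elif is_consonant(ch) and word[i + 1:i + 2] == VIRAMA:
            sec, step = 0x19, 2  # n~ : consonant, virama
        elif is_consonant(ch):
            sec, step = 0x21, 1  # na : bare consonant
        else:
            sec, step = 0x20, 1  # vowel signs and everything else
        primaries.append(primary(ch))
        secondaries.append(sec)
        i += step
    # All primaries are compared before any secondary weight is consulted.
    return (tuple(primaries), tuple(secondaries))

VA, KA, MA, RA, LLA = "\u0D35", "\u0D15", "\u0D2E", "\u0D30", "\u0D33"
SIGN_AA, ANUSVARA = "\u0D3E", "\u0D02"
vaak = VA + SIGN_AA + KA + VIRAMA
vaaka = VA + SIGN_AA + KA
vaakamaram = VA + SIGN_AA + KA + MA + RA + ANUSVARA
vaakku = VA + SIGN_AA + KA + VIRAMA + KA + SIGN_U + VIRAMA
vaakkukal = VA + SIGN_AA + KA + VIRAMA + KA + SIGN_U + KA + LLA + VIRAMA

tailored_1 = sorted([vaaka, vaakamaram, vaak], key=sort_key)
tailored_2 = sorted([vaakkukal, vaakku], key=sort_key)
```

Because the virama is absorbed into the consonant's collation element, the words ending in chandrakkala tie with their neighbours at the primary level and are separated only by the lower secondary weight, which places them first.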