Malayalam/Comparison of Chillu encoding proposals
From Unicode discussion
Introduction
Please read the following background documents:
- Malayalam/Representation of Vowellessness
- L2/06-207 on the proposed solution for issues in Malayalam Chillu encoding
Definitions
- 'ID' denotes systems in which format control characters (such as ZWJ) are stripped.
- 'IDN' is an ID system with additional restrictions.
- 'Text' means a sequence of Unicode characters without any filtering or restriction.
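
To make the 'ID' definition concrete, the following minimal sketch strips format control characters from a string. It assumes that "format control character" can be approximated by Unicode general category `Cf` (which includes ZWJ, U+200D); real ID or IDN processing may use a different rule. The function name `strip_format_controls` is hypothetical, and U+0D7B stands in for an atomically encoded Chillu-NA.

```python
import unicodedata

# Code points used in this sketch.
NA = "\u0D28"        # MALAYALAM LETTER NA
VIRAMA = "\u0D4D"    # MALAYALAM SIGN VIRAMA
ZWJ = "\u200D"       # ZERO WIDTH JOINER (general category Cf)
CHILLU_N = "\u0D7B"  # atomic Chillu-NA, used here to illustrate Option 4


def strip_format_controls(text: str) -> str:
    """Hypothetical 'ID' filter: drop every character of category Cf.

    Assumption: category Cf approximates "format control character".
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


# Option 1 spells the chillu as <NA, VIRAMA, ZWJ>.
option1_chillu = NA + VIRAMA + ZWJ

# In plain 'Text' the ZWJ is kept, so the chillu stays distinct from <NA, VIRAMA>.
text_form = option1_chillu

# In an 'ID' system the ZWJ is stripped, leaving only <NA, VIRAMA>.
id_form = strip_format_controls(option1_chillu)

# An atomic chillu contains no format control character and is unaffected.
atomic_id_form = strip_format_controls(CHILLU_N)
```

Under these assumptions, `id_form` is indistinguishable from the plain <NA, VIRAMA> sequence, while the atomic character passes through the filter unchanged.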
Comparison table
| Option# | Option 1 | Option 2 | Option 3 | Option 4 |
|---|---|---|---|---|
| Proposals: | <Consonant, VIRAMA, ZWJ> | <Consonant, VIRAMA, special joiner>. This special joiner is ignored in some text processing, such as IDN, and preserved in the rest. | <Consonant, Chillu Sign> | Atomic encoding of Chillus |
| Current Status | Current standard | Discussed in the Indic list | Rejected | Proposed new standard |
| 1. Can represent Chillu of റ (RRA) | No | No | No | Yes |
| 2. Has a non-transparent representation of Chillu | No for IDN (words with Chillu cannot be registered as IDN) | No for IDN (words with Chillu cannot be registered as IDN) | Yes | Yes |
| 3. Words like നന്മ and നന്മ that are visually different, but meaning the same, can be encoded the same | Yes; barring format control characters (നന്മ cannot be used in IDN) | No in text; Yes in IDN (നന്മ cannot be used in IDN) | No (needs higher level logic to establish the equivalence) | No (needs higher level logic to establish the equivalence) |
| 4. If a new chillu letter is discovered, no need to encode it specifically | Yes in text; no chillu representation available for IDN or ID | Yes in text and ID; no chillu representation available in IDN | Yes | No |
| 5. ന്റ (/nta/) does not need a non-intuitive character sequence | False: <NA, VIRAMA, ZWJ, VIRAMA, RRA> with 2 consecutive VIRAMAs | False: <NA, VIRAMA, special joiner, VIRAMA, RRA> | Partly True: <NA, chillu-sign, VIRAMA, RRA>. Two consecutive VIRAMAs are not required; however, this equates ന്റ to ന്്റ. | Partly True: <Chillu-NA, VIRAMA, RRA>. Two consecutive VIRAMAs are not required; however, this equates ന്റ to ന്്റ. |
| 6. Chillu-RA/RRA (ര്) is intuitive to native user | No. Native user assumes that ര് is based on RRA | No. Native user assumes that ര് is based on RRA | No. Native user assumes that ര് is based on RRA | Yes. Base character is not relevant at the encoding level |
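
Rows 3 and 5 of the table can be illustrated by spelling out the code point sequences for Option 1 and Option 4. This is only a sketch: the helper names `drop_zwj` and `code_points` are hypothetical, U+0D7B is used as the atomic Chillu-NA, and Options 2 and 3 are left out because their "special joiner" and "chillu sign" have no code points to write down.

```python
NA = "\u0D28"        # MALAYALAM LETTER NA
RRA = "\u0D31"       # MALAYALAM LETTER RRA
MA = "\u0D2E"        # MALAYALAM LETTER MA
VIRAMA = "\u0D4D"    # MALAYALAM SIGN VIRAMA
ZWJ = "\u200D"       # ZERO WIDTH JOINER
CHILLU_N = "\u0D7B"  # atomic Chillu-NA (Option 4)


def drop_zwj(text: str) -> str:
    """Hypothetical filter that removes ZWJ, as an ID system would."""
    return text.replace(ZWJ, "")


def code_points(text: str) -> str:
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Row 3: the two spellings of the same word.
nanma_conjunct = NA + NA + VIRAMA + MA             # conjunct form
nanma_chillu_opt1 = NA + NA + VIRAMA + ZWJ + MA    # Option 1 chillu form
nanma_chillu_opt4 = NA + CHILLU_N + MA             # Option 4 chillu form

# Option 1: once ZWJ is removed, both spellings are the same sequence.
same_under_opt1 = drop_zwj(nanma_chillu_opt1) == nanma_conjunct

# Option 4: the sequences stay different, so higher-level logic is needed.
same_under_opt4 = drop_zwj(nanma_chillu_opt4) == nanma_conjunct

# Row 5: the /nta/ sequences listed in the table.
nta_opt1 = NA + VIRAMA + ZWJ + VIRAMA + RRA
nta_opt4 = CHILLU_N + VIRAMA + RRA

# Option 1 without the ZWJ leaves two consecutive VIRAMAs.
double_virama_in_opt1 = (VIRAMA * 2) in drop_zwj(nta_opt1)

if __name__ == "__main__":
    print("Option 1 /nta/:", code_points(nta_opt1))
    print("Option 4 /nta/:", code_points(nta_opt4))
```

The sketch compares raw code point sequences only; it says nothing about how a font or shaping engine would render either spelling.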
