Tamil Discussion archive


definitions of glyphs, modifiers and characters

Just a couple of clarifications on the font encoding scheme
version 1.1 that I put out this weekend, particularly
to clarify the difference between glyphs that are
pure 'alphabets' and those that act as modifiers.

( Muthu: I am putting up these definitions here so that
we all speak/understand the same language. We need to have
these definitions correct in order to understand the
inter-relationship between the font encoding for a Tamil font
and that used in the Unicode scheme. Please correct me if I am
wrong in any of the following statements. We did discuss
some of these before discussions stopped for the Singapore
TamilNet97 conference.)

i) Uyir, mei and uyirmei letters of Tamil are all alphabets
or characters.

ii) A "glyph" is a geometric representation of whole or
part of these alphabet(s)/character(s).

iii) Every object that is included in the font encoding 
scheme is a glyph.

iv) A glyph can be a character by itself, as is the
case with all uyir and mei alphabets and Tamil numerals.
(Note: here ka, ca, ta, .. are considered as mei, and
not ik, ic, it, ..)

v) Some of the uyirmei alphabets are included as such
as glyphs, while others are obtained by sequential
typing of two or more glyphs. Some of these glyphs
(such as the "kaal" used for the aakara varisai) stand
on their own (without kerning), while others modify
the actual appearance of the previously typed glyph
by the process known as 'kerning' (e.g. the kokki/kombu
used to generate the ikara and iikara varisai).

vi) The modifier "glyphs" in the Tamil font encoding
scheme work slightly differently from the way "glyphs"
of Unicode are used.

In the present font encoding scheme, kA, ku, kE and kO 
will be stored as follows:
kA:  (glyph k)(modifier for aa, viz. kaal)
ku:  (glyph ku)
kE:  (modifier for E)(glyph k)
kO:  (modifier for E)(glyph k)(modifier for aa, viz. kaal)

In the Unicode scheme, the glyphs used to generate uyirmeis
are all "modifiers", whether they involve kerning or not.
Thus there is one modifier for each series of uyirmeis:
virama dot (to get ik/il/il,..), aakara, ikara, iikara,
ukara, uukara, ekara, eekara, okara, Okara varisais.

In Unicode text saving, the modifiers appear on the right,
after the corresponding mei, irrespective of the final
screen/print appearance of the resulting uyirmei. Thus,
for the above four cases, the Unicode storing format will be:
kA:  (glyph k) (modifier for vowel A)
ku:  (glyph k) (modifier for vowel u)
kE:  (glyph k) (modifier for vowel E)
kO:  (glyph k) (modifier for vowel O)
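
To make the storage order concrete, here is a minimal Python sketch
that spells out the four examples as Unicode code points from the
Tamil block. Reading kE and kO as the long vowel signs EE (U+0BC7)
and OO (U+0BCB) is an assumption based on the transliteration used
in this post; the names `unicode_storage` and `describe` are only
for illustration.

```python
import unicodedata

KA = "\u0b95"  # TAMIL LETTER KA (the mei "k" of the examples)

# Logical (storage) order: the consonant first, then exactly one vowel sign.
# kE -> EE and kO -> OO is an assumed reading of the transliteration.
unicode_storage = {
    "kA": KA + "\u0bbe",  # vowel sign AA
    "ku": KA + "\u0bc1",  # vowel sign U
    "kE": KA + "\u0bc7",  # vowel sign EE
    "kO": KA + "\u0bcb",  # vowel sign OO
}


def describe(text):
    """Return the Unicode character names of each code point in text."""
    return [unicodedata.name(ch) for ch in text]


for label, text in unicode_storage.items():
    print(label, describe(text))
```

In every case the stored text is two code points long and starts
with the consonant, even for kE and kO, where the modifier is
drawn to the left of the consonant.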

The Unicode modifiers for okara and Okara are difficult
to implement in Tamil fonts, even using kerning techniques,
and so each of these uyirmeis will require three glyphs
in total in a font.
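
The three-glyph count can be checked against Unicode's own character
data, where the okara and Okara vowel signs carry canonical
decompositions into a left part and a right part. The sketch below
uses Python's `unicodedata` module; the helper name `visual_parts`
is hypothetical, and the assumption that a font would reuse its
ekara/eekara modifier and kaal glyphs for the two parts follows the
kO example above rather than any particular font.

```python
import unicodedata

KA = "\u0b95"       # TAMIL LETTER KA
SIGN_O = "\u0bca"   # TAMIL VOWEL SIGN O  (okara)
SIGN_OO = "\u0bcb"  # TAMIL VOWEL SIGN OO (Okara)


def visual_parts(consonant, two_part_sign):
    """Return [left part, consonant, right part] for a two-part vowel sign.

    NFD normalization splits the sign into its two canonical parts;
    placing the consonant between them gives the visual (font) order.
    """
    parts = unicodedata.normalize("NFD", two_part_sign)
    if len(parts) != 2:
        raise ValueError("not a two-part vowel sign")
    left, right = parts
    return [left, consonant, right]


for sign in (SIGN_O, SIGN_OO):
    names = [unicodedata.name(ch) for ch in visual_parts(KA, sign)]
    print(len(names), names)
```

Each result has three elements: the left-hand modifier, the
consonant, and the kaal-shaped right-hand part.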

In Tamil font text <---> Unicode text conversion,
the sequences of glyphs are first searched for, tagged,
and then interchanged.
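
A rough sketch of the font text to Unicode text direction, covering
only the four "k" examples from this post. The glyph names ("k",
"ku", "kaal", "E_mod") are hypothetical stand-ins and not the actual
slot names of version 1.1, and the mapping of E and O to the long
vowel signs is assumed as before. A real converter would need the
full glyph table.

```python
KA = "\u0b95"  # TAMIL LETTER KA
AA = "\u0bbe"  # TAMIL VOWEL SIGN AA
U = "\u0bc1"   # TAMIL VOWEL SIGN U
EE = "\u0bc7"  # TAMIL VOWEL SIGN EE (assumed reading of "E")
OO = "\u0bcb"  # TAMIL VOWEL SIGN OO (assumed reading of "O")


def font_to_unicode(glyphs):
    """Convert a list of hypothetical glyph names (visual order)
    into a Unicode string (consonant first, modifier after)."""
    out = []
    i = 0
    while i < len(glyphs):
        g = glyphs[i]
        if g == "E_mod":
            # Search: a left-hand modifier must be followed by its consonant.
            if i + 1 >= len(glyphs) or glyphs[i + 1] != "k":
                raise ValueError("E modifier without a following consonant")
            # Tag: a trailing kaal marks the O series, otherwise the E series.
            if i + 2 < len(glyphs) and glyphs[i + 2] == "kaal":
                out.append(KA + OO)  # interchange: consonant moves to the front
                i += 3
            else:
                out.append(KA + EE)
                i += 2
        elif g == "k":
            if i + 1 < len(glyphs) and glyphs[i + 1] == "kaal":
                out.append(KA + AA)
                i += 2
            else:
                out.append(KA)
                i += 1
        elif g == "ku":
            # A precomposed uyirmei glyph maps to consonant + vowel sign.
            out.append(KA + U)
            i += 1
        else:
            raise ValueError(f"unknown or misplaced glyph: {g}")
    return "".join(out)


print(font_to_unicode(["E_mod", "k", "kaal"]) == KA + OO)
```

The E modifier has to be checked before the plain consonant case so
that the lookahead for a trailing kaal is resolved as part of kO
and not as a separate kA.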


PS:  Muthu, if you agree with the above definitions, then
there is no need for the (okara, Okara, aukara) glyphs shown
in slot positions 147-149 of version 1.1. Correct?
