Tamil Discussion archive


Possible font encoding scheme-version 1.1



Dear Friends:
I am pleased to note that some general consensus is building
up on version 1.0 of the possible slot assignments in the proposed
font encoding scheme for Tamil. Let us continue to discuss
all related issues and agree on a standard font encoding scheme.

Of the weekend mails, I take note of the following specifics on
the earlier version 1.0:
 --  glyphs for old style Ra and the Tamil numeral for 1000 are missing
 --  majority preference to keep grantha characters in the Tamil font
 --  still a split on the place for old style lai, nai, Rai etc.
 --  no discussion yet on Tamil font composition and its relation to
     Unicode fonts
 --  suggestions to include some additional characters, such as the
     copyright sign, the all rights reserved sign etc., to make life
     easier. (A related topic to be discussed here: the default
     replacement of straight quotes by curly quotes in most
     word-processors.)

I have reviewed version 1.0 extensively myself and have now
come up with a revised version, labelled 1.1.
It is available at the URL
http://www.geocities.com/Athens/5180/charset11.gif
The modifications in this revised version are as follows:

a) Time and again the point has been raised that slots 128-159 are
"unsafe". Many of the 8859-X schemes have left these two rows
(9 and 10) vacant.
So I regrouped the glyphs in such a way that all the essential
glyphs for Tamil sit in the main part, in positions 160-255.
Things such as Tamil numerals, grantha characters, old style lai, nai,
Ra and diacritical markers are all now grouped in slots 128-159.
This should ensure that, even in primitive implementations,
there are no problems in getting the main Tamil text through intact.
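
The split described in (a) can be sketched in Python as a simple check on
encoded text. This is only an illustration: the two ranges come from the
proposal, but the function names and the sample byte values are
hypothetical and do not correspond to actual glyph assignments in
version 1.1.

```python
# Ranges from the proposal: 128-159 is the "unsafe" block that now holds the
# optional glyphs, 160-255 is the main block with the essential Tamil glyphs.
UNSAFE_SLOTS = range(128, 160)
MAIN_SLOTS = range(160, 256)


def unsafe_positions(data: bytes) -> list[int]:
    """Return the offsets of bytes that fall in slots 128-159."""
    return [i for i, b in enumerate(data) if b in UNSAFE_SLOTS]


def survives_primitive_transport(data: bytes) -> bool:
    """True if the text uses only ASCII and the main 160-255 block."""
    return not unsafe_positions(data)


# Hypothetical slot values, for illustration only.
main_only = bytes([0xA5, 0xB0, 0x20, 0xC1])
with_optional = bytes([0xA5, 0x85, 0xB0])  # 0x85 (133) is in the unsafe block

print(survives_primitive_transport(main_only))  # True
print(unsafe_positions(with_optional))          # [1]
```

A text that returns True here uses none of the optional glyphs, so it
should pass through an implementation that mishandles 128-159 unchanged.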

b) The missing old style Ra and a slot for the Tamil numeral 1000 are
now included.

c) Also included are some of the glyphs (modifiers) for the okara and
Okara varisais that are there in the Unicode scheme.
I am still very much confused about which of the Unicode
glyphs should be there to ensure Tamil font-Unicode font compatibility (!)
We should be able to save Tamil text files in Unicode format, and also
to convert any Unicode font based Tamil text into the corresponding
Tamil font text. The addition of Tamil numerals was done mainly with
this thought in mind.

In the Unicode mailing list discussions, for example, Jeroen Hellingman said:
"I disagree with the decision that VOWEL SIGN O is equivalent with 
VOWEL SIGN E + VOWEL SIGN AA, although in Tamil this may 
look the same it is not logically the same, and the decomposition 
must be very much discouraged, as it may cause problems in searching, 
transliteration to other scripts where this is not the case, and
sorting. "

In Tamil fonts we do not use the modifiers shown for the okara and Okara
varisais. If a Mylai type implementation is envisaged,
ko will have three glyphs in this scheme while it will involve only
two in Unicode. Should we leave this point to the implementation of
conversion software and delete these extra/unwanted glyphs?
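
The two-versus-three count for ko can be illustrated with Python's
standard unicodedata module. This is a sketch of the Unicode side only:
the code points are the standard Tamil ones, and canonical decomposition
(NFD) is used here as a stand-in for the three-piece view. It says
nothing about the glyph slots in the proposed font scheme.

```python
import unicodedata

KA = "\u0b95"            # TAMIL LETTER KA
VOWEL_SIGN_O = "\u0bca"  # TAMIL VOWEL SIGN O

ko = KA + VOWEL_SIGN_O


def names(text: str) -> list[str]:
    """Return the Unicode character name of each code point in text."""
    return [unicodedata.name(ch) for ch in text]


composed = unicodedata.normalize("NFC", ko)
decomposed = unicodedata.normalize("NFD", ko)

print(len(composed), names(composed))
print(len(decomposed), names(decomposed))
```

In composed form ko is stored as two code points (KA, VOWEL SIGN O). The
canonical decomposition gives three (KA, VOWEL SIGN E, VOWEL SIGN AA),
which is the equivalence the quoted objection is about. Note that even
the decomposed form keeps the consonant first, unlike the glyph order
used in a Tamil font.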

In the present font encoding scheme, the glyphs are used as modifiers
to get the uyirmeis. In Unicode these glyphs are used in a different
practical sense (kE is written in the Tamil font as the modifier for E
followed by ka, while in Unicode kE will be saved as ka followed by the
glyph corresponding to E).
Can we have some discussion and consensus on this point?
Someone like Muthu can clarify and make appropriate recommendations.
As before, version 1.1 is still a draft and so is up for revision.
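
The ordering difference for kE is the kind of thing a conversion program
would have to handle. A minimal sketch in Python follows, assuming
well-formed Unicode input in which every vowel sign directly follows its
consonant. The function name is hypothetical, and the output is kept as
Unicode characters in visual (font) order because the actual slot values
of the proposed scheme are not reproduced here.

```python
# Vowel signs that are drawn to the LEFT of the consonant.
PREFIX_SIGNS = {
    "\u0bc6",  # TAMIL VOWEL SIGN E
    "\u0bc7",  # TAMIL VOWEL SIGN EE
    "\u0bc8",  # TAMIL VOWEL SIGN AI
}

# Two-part vowel signs: (part drawn on the left, part drawn on the right).
TWO_PART_SIGNS = {
    "\u0bca": ("\u0bc6", "\u0bbe"),  # O  = E  + AA
    "\u0bcb": ("\u0bc7", "\u0bbe"),  # OO = EE + AA
    "\u0bcc": ("\u0bc6", "\u0bd7"),  # AU = E  + AU LENGTH MARK
}


def logical_to_visual(text: str) -> str:
    """Reorder Unicode (consonant-first) text into left-to-right glyph order."""
    out: list[str] = []
    for ch in text:
        if ch in TWO_PART_SIGNS and out:
            left, right = TWO_PART_SIGNS[ch]
            base = out.pop()          # the consonant just emitted
            out.extend([left, base, right])
        elif ch in PREFIX_SIGNS and out:
            base = out.pop()
            out.extend([ch, base])    # move the sign in front of the consonant
        else:
            out.append(ch)
    return "".join(out)


KA = "\u0b95"
visual_ke = logical_to_visual(KA + "\u0bc7")  # modifier first, then ka
visual_ko = logical_to_visual(KA + "\u0bca")  # left part, ka, right part
print([f"U+{ord(c):04X}" for c in visual_ke])
print([f"U+{ord(c):04X}" for c in visual_ko])
```

The reverse direction (font order back to Unicode order) would need the
same tables applied the other way round, plus a decision on whether the
two-part signs are saved composed or decomposed.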

Another point I would like discussed on the font encoding
scheme is the *nature of any vacant slots that we have in the scheme*.
I think it will be a good idea if we leave at least 4-6 slots vacant.
As Prof. Hart clearly pointed out, 'no standard can be perfect nor
eternal'. The Tamil font standard should be reviewed again in a decade.

But we should define the vacant slots if we choose to have some.
Unicode schemes have many vacant slots: some are defined as
 "free/open slots for the end-user to use the way he wants" and others as
 "reserved for future revisions and hence prohibited to place anything
  in these slots".
In the former scenario, we can even think of "special derivative
fonts" tailored for specific applications, such as electronic
archiving of ancient literature and old/palm-leaf manuscripts
carrying additional characters (to be identified).
One possibility could be to place the old style lai/Nai, Raa etc.
in these "user-definable" slots, so that they do not form
part of the primary font encoding scheme.
We can also have local fonts that carry OM or whatever additional
glyphs one would like to have.
The point to decide: should we make this option available or not?
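
The two kinds of vacant slot could be written down explicitly as part of
the scheme. The Python sketch below is only one way of doing so: the
slot numbers, the names and the choice of three slots of each kind are
all hypothetical and are not taken from version 1.1.

```python
from enum import Enum


class SlotStatus(Enum):
    ASSIGNED = "assigned"              # part of the primary encoding scheme
    USER_DEFINABLE = "user-definable"  # free for end-users and derivative fonts
    RESERVED = "reserved"              # kept empty for future revisions


# Hypothetical slot numbers, chosen only for illustration.
USER_SLOTS = {128, 129, 130}
RESERVED_SLOTS = {157, 158, 159}


def build_slot_table() -> dict[int, SlotStatus]:
    """Label every slot from 128 to 255 with its status."""
    table = {}
    for slot in range(128, 256):
        if slot in USER_SLOTS:
            table[slot] = SlotStatus.USER_DEFINABLE
        elif slot in RESERVED_SLOTS:
            table[slot] = SlotStatus.RESERVED
        else:
            table[slot] = SlotStatus.ASSIGNED
    return table


def may_place_local_glyph(table: dict[int, SlotStatus], slot: int) -> bool:
    """A local or derivative font may add glyphs only in user-definable slots."""
    return table[slot] is SlotStatus.USER_DEFINABLE


table = build_slot_table()
vacant = [s for s, status in table.items() if status is not SlotStatus.ASSIGNED]

print(may_place_local_glyph(table, 128))  # True: user-definable
print(may_place_local_glyph(table, 158))  # False: reserved
print(len(vacant))                        # 6
```

With such a table, a font carrying OM or the old style glyphs in a
user-definable slot would still conform to the primary scheme, while a
font that fills a reserved slot would not.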

Kalyan

