Tamil Discussion archive


Re: glyph choices for char. encoding-version 1.2

Dear Muthu and Friends:
While we are still reviewing version 1.2 of the proposed
character encoding scheme, I have the following questions:

i) If my limited understanding of tamil is correct, the basic mei /
consonants of the tamil alphabet are the series ik, ing, ic, ing,...
and not the series ka, nga, ca,....
So wouldn't it be better if the former series appeared first, right
after the vowel (uyir) series, followed by the ka, nga series?
(Nagu: I will invert the Ra/La in the next version).

ii) Regarding the slots 149-152, I fully agree with Prof. Hart 
that once and for all we should decide on what goes there -
either the four diacritical markers or the set of old style characters.
Going for a series of fonts where each one has different glyphs
in these slots will significantly undermine all current efforts to have
a world standard.

My personal preferences are for the diacritical markers. I have
stated the reasons for this clearly several times: all major libraries
use them to catalogue tamil books; indologists all around the world
use them; practically all south asian journals use them in place of
tamil or other indic scripts; all major tamil research centers
(particularly Inst. of Asian studies and International Inst. of Tamil
Studies) use predominantly transliterated tamil with these markers
(including monumental reference works such as Encyclopaedia).
Having these four glyphs in the scheme will allow integration of all
these efforts under a single umbrella. I can even think of OCR
packages for tamil based on this unified, polyvalent font that can
scan all these texts, whether in tamil script or transliterated,
and save them in electronic form.

As for the old style characters, they can still go in the other spots
(currently blanked off as X) or be made available as a special
pull-down option in dedicated DTP packages, in some form
of "font substitution". All DTP packages involving romanized or
phonetic input must put out the current versions of lai, nai etc. as the
default option.
Such a procedure would be less controversial and more easily digestible.
Please throw in your views so that we can decide on this soon.

iii) having the meis of grantha characters (is, ih, ij,...) also in
of the modifier 'dot':  Since we have the special situation that, all
uyirmeis of grantha are to be generated using the modifiers 
(aakara, ikara, iikara, ukara and uukara varisais), having the meis
also generated this way is consistent. If the mapping/correspondence
table is clearly defined on how the entire 256 tamil alphabets are
to be generated using the present character encoding scheme
(I am currently working on this), I do not see any problem.
In any case, we need to generate some slots (delete existing ones)
if we want to do this. 

iv) Smart quote replacement in most software:
I can talk only about Mac software.
Yes, Microsoft Word and ClarisDraw in particular
have this automatic replacement of straight quotes by curly quotes.
I have written hundreds of emails to Mylai users telling them to
turn off this default replacement so that the tamil vowels e and E
in Mylai appear properly on screen and in print.
I find it unnecessary that this option is forced on us as a default.
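
For text that has already been through such a replacement, the substitution can in principle be undone afterwards. What follows is only a minimal sketch in Python, not something Mylai or any of the packages above provides. It assumes the text is available as Unicode and that the font expects the plain straight quote characters; the function name undo_smart_quotes is purely illustrative.

```python
# Map the four typographic ("smart") quotes back to straight ASCII quotes.
# Assumption: the font expects the straight quote characters, so any curly
# quote in the text is the result of an unwanted automatic replacement.
SMART_TO_STRAIGHT = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def undo_smart_quotes(text: str) -> str:
    """Return text with curly quotes replaced by straight quotes."""
    return text.translate(SMART_TO_STRAIGHT)
```

Such a pass would also straighten any curly quotes that were intended as real punctuation, so it is only suitable for text typed entirely for a font of this kind.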


We are witnessing an unprecedented, healthy situation for
tamil computing where people from four corners of the
world are actively participating in the standardisation debate
via electronic mail. We are sampling a large, representative
mass involved/interested in tamil computing.
Yesterday I quoted part of the summary report of the 
discussion panel of the last TamilNet'97 conference held in
Singapore regarding their preferences for an 8-bit scheme.
I would like to draw particular attention to one line there
"The TamilNadu Computer Standardization 
Committee will work with developers towards a unified 
8-bit character set."
Since we are now kind of going around in circles on what
should go in the character set, it is high time that the members
of TNC break their 'silent spectator role' and throw in their
viewpoints/preferences. This would be in the spirit of the
above decisions made at the last Singapore conference
(in the presence of majority of the TNC members and the
Hon'ble Minister of Tamil Culture for Tamilnadu, Prof.
I stated yesterday: "Deciding on whether we go for 
7-bit font or 8-bit fonts, 
with or without diacritical markers,
with or without old style tamil alphabets, 
with or without grantha characters
is a difficult issue, since the choices are more at the
level of personal preference."
Bottom line: we can live with any of these choices.
It will be a futile exercise if we all agree on one and
the TNC comes up with something else for reasons best
known to them.

