Tamil Discussion archive
Re: [WMASTERS] Tamil language and glyph codes standardization
*Selva wrote on 6 Oct 97:
*> My thoughts are as follows:
*> [A] Let us define a basic set of glyphs for Tamil (without granthas)
*> in contiguous slots and call it the standard primary
*> 'tamil space' set and then
*> define an extended set of glyph codes needed for non-tamil
*> characters and put them in the 'non-tamil space'.
*It may be recalled that in my first posting version 1.4,
*I have indicated in the footnote that, as an 8-bit encoding
*scheme with 256 glyphs, the scheme contains a mixture
*of glyphs of roman, grantha and tamil (all the
*arabic numerals, roman alphabets, punctuation marks,
*tamil glyphs and some grantha glyphs). For this reason,
*I have labelled the scheme explicitly as
*8-bit ROMAN-TAMIL-GRANTHA scheme.
*Please consult the gif again at
*The encoding scheme in full was never at any time
*labelled as a pure tamil encoding scheme. It was a collection
*of glyphs of different languages to facilitate tamil texts to be
*handled in several situations: email, transliterated tamil,
My understanding was a Tamil Nadu Standardization committee
was formed for standardizing Tamil character/glyph codes and
they wanted to have bilingual facility (Tamil and English),
the two official languages of TN. Not for standardizing
'a collection of glyphs of *different languages*'.
Now your claim that it was never 'labelled as a pure tamil
encoding scheme' raises more questions. Was a Tamil Nadu
committee then formed for an 'impure tamil encoding scheme'?
I'm truly sorry for raising such questions; Kalyan's
unwarranted claim raises them.
*webpages, electronic archiving, optional save in unicode etc etc.
*(Those who have carefully read my earlier postings must
*have noticed that, right from the beginning I was referring
*to a polyvalent font scheme and not a "pure tamil scheme")
Maybe I didn't read your declaration that it was not
a "pure tamil scheme".
*I guess Mani's recent posting best summarises the design
*Given the above picture, it has been a mystery to me as to
*why people get all charged up and keep labelling the scheme
*as a "pure tamil scheme".
I had offered my reasons in this forum and I'm looking
forward to a reliable tamil-english bilingual scheme.
Some opinions were shared here as to what are the useful
non-tamil characters that can also be added. The fact
that I did suggest to include 'fa' and two diacritical
markers (single and double dots at the bottom of a letter)
to possibly denote other sounds *and* five grantha letters
and greek letter mu, must tell you that I did not label
the present exercise as 'pure tamil scheme'. My point
is that a set of character/glyph codes needed for *tamil* should
be defined, so that future software developers will
try to comply with it for compatibility reasons. This
set need NOT include any non-tamil characters, including
grantha. Now a set of glyphs for some non-tamil characters
that may be needed by several groups of tamils, such as
grantha letters or other symbols (say, OM), copyright
symbols etc., can be 'recommended'. This non-tamil glyph
space can be less rigidly defined. But for ease of implementation
etc. it is best to arrive at some consensus about at least
a few of the more commonly used non-tamil glyphs/characters,
for example J, sh etc. This way, if some want to implement
ksha they can, but if some want to substitute some other
useful symbol (say OM), they can. Market forces will
dictate whether it is useful to have ksha there or some other
character. I hope my point comes across clearly.
*> Examples of glyph codes for Tamil space:
*> 12 uyir
*> 1 aytham
*> 18 mey
*> 18 'akaram ERiya mey
*> 6 diacritical markers (for aa, i, ii, e, E, ai)
*> 2 for tamil di and dii
*> 36 ukaram Ukaaram ERiya meykaL
*> 10 tamil numerals including 0 (if we want to claim
*> *full* compatibility with Unicode, then we may have to
*> reserve 3 more spaces for 10, 100, 1000).
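As a quick check on the list above, the counts can be tallied in a few lines of Python. This is only a sketch: the group labels are paraphrased from the list, and the figure of 128 available slots assumes the Tamil glyphs would occupy the upper half of an 8-bit font, which the posting does not state.

```python
# Glyph counts for the proposed 'tamil space', as given in the list above.
TAMIL_SPACE_COUNTS = {
    "uyir": 12,
    "aytham": 1,
    "mey": 18,
    "akaram ERiya mey": 18,
    "modifiers (aa, i, ii, e, E, ai)": 6,
    "di and dii": 2,
    "ukaram/Ukaaram ERiya meykaL": 36,
    "tamil numerals 0-9": 10,
}

# Extra numeral slots (10, 100, 1000) mentioned for full Unicode compatibility.
EXTRA_NUMERAL_SLOTS = 3

# Assumption, not from the posting: tamil glyphs sit in the upper half
# of the 8-bit code space, leaving the lower half for roman.
ASSUMED_AVAILABLE_SLOTS = 128


def slots_needed(include_extra_numerals=False):
    """Return the number of slots the tamil space would occupy."""
    total = sum(TAMIL_SPACE_COUNTS.values())
    if include_extra_numerals:
        total += EXTRA_NUMERAL_SLOTS
    return total


if __name__ == "__main__":
    for extra in (False, True):
        used = slots_needed(extra)
        left = ASSUMED_AVAILABLE_SLOTS - used
        print(f"extra numerals={extra}: {used} slots used, {left} left over")
```

Under that assumption, whatever is left over is what remains for the non-tamil glyphs discussed in this thread.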
*I presume by diacritical markers, Selva is referring to modifiers.
Yes modifiers :-)
The tamil word 'aravu' is closer to the dictionary meaning
of 'diacritical marker', though mostly applied to only
'aa' (what we call commonly as 'kaal').
I know diacritical marker has a different meaning as well.
Diacritical markers are modifiers too.
*The above collection of glyphs IS EXACTLY what is there in the
*version 1.4 (only difference is slots allocation for 10, 100 and 1000).
*So I really do not see any problem in glyph choices for tamil part.
*On the grantha part the only difference is inclusion of "ksha" and
*Can someone please clarify if, with the most recent proposal
*of Selva, the differences between his proposal and version 1.4
*are just these two grantha glyphs and specific slot assignments?
*I would like to get the present situation straight.
I had elaborated a little bit above about non-tamil space.
I have no problem with ksha, except that it is a sheer waste
and gives room for people to misuse it. The non-tamil space
need not be rigidly defined, in my opinion. A symbol for
'fa' may be more useful than ksha, and there should not be
any claims of 'incompatibility with Tamil Nadu government or
Singapore government standard' because someone had decided to
drop ksha and add fa or something else. This is why I say
let the 'standard' be for tamil-space (roman is not decided by us),
and 'recommend' but not define specifics of what goes where
for the non-tamil-space. Any compatibility (sought by government
bodies) should be only with tamil (and automatically roman).
It will be useful to recommend S,sh,J,h, and possibly sri as well.
If you feel strongly about ksha include it as well.
But even these should be considered 'optional' from the
point of view of the 'standard'. Or call these non-tamil-space
slots user-defined, with 4/5/6 slots having default values.
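The idea of a fixed tamil-space plus user-defined non-tamil slots with defaults could be sketched roughly as follows. All slot numbers and glyph names below are hypothetical and chosen only for illustration; no slot assignments are given in this posting.

```python
# Hypothetical slot numbers; the posting assigns none.
# Tamil-space: fixed by the standard (tiny illustrative subset).
TAMIL_SPACE = {0xA0: "a", 0xA1: "aa", 0xA2: "i"}

# Non-tamil space: recommended defaults only, free to be overridden.
NON_TAMIL_DEFAULTS = {
    0xF0: "S", 0xF1: "sh", 0xF2: "J",
    0xF3: "h", 0xF4: "sri", 0xF5: "ksha",
}


def build_font(overrides=None):
    """Return a slot -> glyph map, applying any user overrides."""
    font = {**TAMIL_SPACE, **NON_TAMIL_DEFAULTS}
    font.update(overrides or {})
    return font


def is_compliant(font):
    """Compliance looks only at tamil-space; non-tamil slots are ignored."""
    return all(font.get(slot) == glyph for slot, glyph in TAMIL_SPACE.items())


default_font = build_font()
fa_font = build_font({0xF5: "fa"})      # ksha dropped in favour of fa
broken_font = build_font({0xA1: "fa"})  # tamil-space slot altered

if __name__ == "__main__":
    print(is_compliant(default_font), is_compliant(fa_font), is_compliant(broken_font))
```

In this sketch a font that swaps ksha for fa still counts as compliant, while one that changes a tamil-space slot does not.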
*If it helps in any way, I am prepared to redraw the
*chart 1.4 in full (indicating all 256 slots) with SPECIFIC
*COLOR CODING for different components there:
It is also called 'Hindu-Arabic' or simply 'Hindu' numerals.
*punctuation and other marks (such as copyright sign and others)
*tamil alphabet part (incl. tamil numerals)
Call it non-tamil-glyph-space ('notag' space), or something like that,
so that even letters like fa or other useful
symbols can be included in this space. Grantha is only one kind of
'notag'; others may want something else in addition to grantha.
*I thought the composition was obvious and labelling was adequate.
I think if we adopt the way I'm suggesting, it might find more
I've not yet read Mani's mail, so some of the things I've said here
may not be applicable.