Tamil Discussion archive

Re: [WMASTERS] Tamil language and glyph codes standardization


*Selva wrote on 6 Oct 97:
*>  My thoughts are as follows:
*>  [A] Let us define a basic set of glyphs for Tamil (without granthas)
*>      in contiguous slots and call it the standard primary 
*>      'tamil space' set and then
*>      define an extended set of glyph codes needed for non-tamil 
*>      characters and put them in the 'non-tamil space'. 
*It may be recalled that in my first posting version 1.4,
*I have indicated in the footnote that, as an 8-bit encoding
*scheme with 256 glyphs, the scheme contains a mixture
*of glyphs of roman, grantha and tamil (all the
*arabic numerals, roman alphabets, punctuation marks,
*tamil glyphs and some grantha glyphs). For this reason, 
*I have labelled the scheme explicitly as 
*8-bit ROMAN-TAMIL-GRANTHA scheme.
*Please consult the gif again at
*The encoding scheme in full was never at any time 
*labelled as a pure tamil encoding scheme. It was a collection
*of glyphs of different languages to facilitate tamil texts to be 
*handled in several situations: email, transliterated tamil, 

    My understanding was that a Tamil Nadu standardization committee
    was formed to standardize Tamil character/glyph codes, and that
    it wanted a bilingual facility (Tamil and English, the two
    official languages of TN), not to standardize
    'a collection of glyphs of *different languages*'.
    Now your claim that it was never 'labelled as a pure tamil
    encoding scheme' raises more questions. Was a Tamil Nadu
    committee then formed for an 'impure tamil encoding scheme'?
    I'm truly sorry for raising such questions; Kalyan's
    unwarranted claim raises them.

*webpages, electronic archiving, optional save in unicode etc etc.
*(Those who have carefully read my earlier postings must
*have noticed that, right from the beginning I was referring 
*to a polyvalent font scheme and not a "pure tamil scheme")

   Maybe I didn't read your declaration that it was not
   a "pure tamil scheme".

*I guess Mani's recent posting best summarises the design
*Given the above picture, it has been a mystery to me as to
*why people get all charged up and keep labelling the scheme
*as a "pure tamil scheme". 

    I had offered my reasons in this forum, and I'm looking
    forward to a reliable tamil-english bilingual scheme.
    Some opinions were shared here as to which non-tamil
    characters would be useful to add. The fact that I
    suggested including 'fa', two diacritical markers
    (single and double dots at the bottom of a letter) to
    possibly denote other sounds, *and* five grantha letters
    and the greek letter mu must tell you that I did not label
    the present exercise a 'pure tamil scheme'. My point
    is that a set of character/glyph codes needed for *tamil*
    should be defined so that future software developers will
    try to comply with it for compatibility reasons. This
    set need NOT include any non-tamil characters, including
    grantha. A set of glyphs for some non-tamil characters
    that may be needed by several groups of tamils, such as
    grantha letters or other symbols (say OM, the copyright
    symbol, etc.), can then be 'recommended'. This non-tamil glyph
    space can be less rigidly defined. But for ease of implementation
    it is best to arrive at some consensus about at least
    a few of the more commonly used non-tamil glyphs/characters,
    for example J, sh, etc. This way, if some want to implement
    ksha they can, and if some want to substitute some other
    useful symbol (say OM), they can. Market forces will
    dictate whether it is useful to have ksha there or some other
    character. I hope my point comes across clearly.
*Selva writes:
*> Examples of glyph codes for Tamil space:
*> 12 uyir
*>  1 aytham
*> 18 mey
*> 18 'akaram ERiya mey
*>  6 diacritical markers (for aa, i, ii, e, E, ai)
*>  2 for tamil di and dii
*> 36 ukaram Ukaaram ERiya meykaL
*> 10 tamil numerals including 0 (if we want to claim
*>   *full* compatibility with Unicode, then we may have to
*>   reserve 3 more slots for 10, 100, 1000).
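
[Editorial aside: the counts in the list above can be tallied in a few
lines of Python. This is only a sketch to check the arithmetic. The
dictionary labels and the function name are made up for illustration
and do not come from version 1.4 or from any standard; the figure of
256 total slots is the 8-bit size mentioned earlier in the thread.]

```python
# Counts are taken from the list above. The labels are descriptive
# names chosen for this sketch, not identifiers from any standard.
TAMIL_SPACE_COUNTS = {
    "uyir": 12,
    "aytham": 1,
    "mey": 18,
    "akaram ERiya mey": 18,
    "modifiers (aa, i, ii, e, E, ai)": 6,
    "glyphs for di and dii": 2,
    "ukaram/Ukaaram ERiya mey": 36,
    "tamil numerals 0-9": 10,
}

# Extra slots for 10, 100 and 1000, needed only if full
# compatibility with Unicode is claimed.
UNICODE_EXTRA_NUMERALS = 3

# An 8-bit scheme has 256 code points in total.
TOTAL_SLOTS_8BIT = 256


def tamil_space_size(include_unicode_extras=False):
    """Return the number of slots the Tamil space would occupy."""
    size = sum(TAMIL_SPACE_COUNTS.values())
    if include_unicode_extras:
        size += UNICODE_EXTRA_NUMERALS
    return size


print(tamil_space_size())                              # 103
print(tamil_space_size(include_unicode_extras=True))   # 106
print(TOTAL_SLOTS_8BIT - tamil_space_size(True))       # 150
```

[On these counts the Tamil space needs 103 slots, or 106 with the three
extra numerals, leaving 150 of the 256 slots for roman letters,
punctuation and any non-tamil glyphs.]
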
*I presume by diacritical markers, Selva is referring to modifiers.

    Yes, modifiers :-)
    The tamil word 'aravu' is closer to the dictionary meaning
    of 'diacritical marker', though it is mostly applied only to
    'aa' (what we commonly call 'kaal').
    I know 'diacritical marker' has a different meaning as well.
    Diacritical markers are modifiers too.

*The above collection of glyphs IS EXACTLY what is there in the
*version 1.4 (only difference is slots allocation for 10, 100 and 1000).
*So I really do not see any problem in glyph choices for tamil part.
*On the grantha part the only difference is inclusion of "ksha" and
*Can someone please clarify if, with the most recent proposal
*of Selva, the differences between his proposal and version 1.4
*are just these two grantha glyphs and specific slot assignments? 
*I would like to get the present situation straight.

     I elaborated a little above on the non-tamil space.
     I have no problem with ksha, except that it is a sheer waste
     and gives room for misuse. The non-tamil space
     need not be rigidly defined, in my opinion. A symbol for
     'fa' may be more useful than ksha, and there should not be
     any claims of 'incompatibility with the Tamil Nadu government or
     Singapore government standard' because someone has decided to
     drop ksha and add fa or something else. This is why I say
     let the 'standard' be for the tamil-space (roman is not decided by us),
     and 'recommend', but not define, the specifics of what goes where
     in the non-tamil-space. Any compatibility (sought by government
     bodies) should be only with tamil (and automatically roman).
     It will be useful to recommend S, sh, J, h, and possibly sri as well.
     If you feel strongly about ksha, include it as well.
     But even these should be considered 'optional' from the
     point of view of the 'standard'. Or call these non-tamil-space
     slots user-defined, with 4/5/6 slots having default values.
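
[Editorial aside: the idea that compatibility is judged on the
tamil-space alone, with the non-tamil slots treated as user-defined
defaults, could be sketched as below. Everything concrete here is an
assumption made for illustration: the slot numbers, the four-glyph
Tamil space, the glyph names and the function names are hypothetical
and are not taken from version 1.4 or from any proposal in this thread.
Only the default list S, sh, J, h, sri comes from the paragraph above.]

```python
# Hypothetical slot numbers, for illustration only.
TAMIL_SPACE = range(0xA0, 0xA4)  # pretend the Tamil space has 4 slots

# Recommended defaults for the user-defined (non-tamil) slots.
USER_DEFINED_DEFAULTS = {
    0xF0: "S",
    0xF1: "sh",
    0xF2: "J",
    0xF3: "h",
    0xF4: "sri",
}


def make_font(tamil_glyphs, overrides=None):
    """Build a slot -> glyph table from Tamil glyphs plus defaults.

    `overrides` lets an implementer replace any user-defined slot,
    e.g. put 'fa' where the default is 'sri'.
    """
    font = dict(tamil_glyphs)
    font.update(USER_DEFINED_DEFAULTS)
    font.update(overrides or {})
    return font


def is_standard_compatible(font_a, font_b):
    """Two fonts are compatible if they agree on every Tamil-space slot.

    Slots outside TAMIL_SPACE are deliberately ignored.
    """
    return all(font_a.get(s) == font_b.get(s) for s in TAMIL_SPACE)


tamil = {0xA0: "a", 0xA1: "aa", 0xA2: "i", 0xA3: "ii"}
font_default = make_font(tamil)
font_with_fa = make_font(tamil, overrides={0xF4: "fa"})
font_changed = make_font({**tamil, 0xA3: "u"})

print(is_standard_compatible(font_default, font_with_fa))  # True
print(is_standard_compatible(font_default, font_changed))  # False
```

[Under this sketch, swapping a default such as sri for fa does not
affect compatibility, while changing any Tamil-space slot does.]
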

*If it helps in any way, I am prepared to redraw the
*chart 1.4 in full (indicating all 256 slots) with SPECIFIC
*COLOR CODING for different components there: 
*arabic numerals,

    They are also called 'Hindu-Arabic' or simply 'Hindu' numerals.

*roman alphabets,
*punctuation and other marks (such as copyright sign and others)
*tamil alphabet part (incl. tamil numerals)
*grantha part
  Call it the non-tamil-glyph-space ('notag' space), or something like
  that, so that even letters like fa or other useful
  symbols can be included in this space. Grantha is only one kind of
  'notag'; others may want other glyphs in addition to grantha.

*I thought the composition was obvious and labelling was adequate.

  I think if we adopt the way I'm suggesting, it might find more

   I've not yet read Mani's mail, so some of the things I've said here
   may not be applicable.

   anbudan selvaa

