Tamil Discussion archive


[WMASTERS] Re: Can we please develop a standard?



Dear friends:
Thanks for your continued participation in the ongoing efforts to define
possible standards for tamil. I have clarified several times the
guidelines by which Muthu, others and I have been proceeding. As the
discussions become emotional, some of these guidelines are either
forgotten or not seen as such. So I will try to clarify them here.

i) Let us take up the question of 'who should define the standards?'
I may add that there are two viewpoints here.
One opinion is that today the majority of those who speak Tamil are
based in Tamilnadu, the majority of printed tamil materials originate
from Tamilnadu, and so it is the Tamilnadu Govt that should set the
standards and no-one else. The second school of thought feels that
international standards in computing, particularly those applicable
to the internet, are evolved in world forums involving worldwide
participation and are not mandates of any individual government.
A larger percentage of overseas Tamils already do tamil computing.
(I have not personally checked if the standards for french, german, 
greek, dutch etc are defined in the first place by their respective 
governments and internet community simply accepted them later). 
In an earlier posting, I have elaborated on how internet community
"establishes" standards through "established procedures".

Ravi Paul wrote
>1.      This discussion is just between a minority group of
>individuals who have no "authority" to implement any standard
>or reform. As such, this attempt is limited to a group of 
> individuals attempting to set a standard. 
Yes. None of the present ring of participants in Webmasters has
a written mandate to define standards from "anyone". It is all the
VOLUNTARY EFFORT of individuals (pure academics, commercial
software developers and end-users) who feel that it is high time
that we all agree on some standards that unify the world-wide
efforts. A few members of the Tamilnadu Advisory Committee
are 'occasional participants' and they have indicated that if the
community collectively defines a standard they will give
due consideration to it in their own deliberations. That is it.

So within the above scenario, the present efforts are attempts
to define possible 'standards' in a world forum open without any
restrictions to anyone who cares to participate. It is a collective,
unprecedented effort.
Frankly none of the three of us (myself, Muthu and Bala) yet has
clear ideas on how the standards are to be launched even if
we all agree on one. In the common interest of all, it would be
better if subscribers to the above two schools collectively launch
these standards.

ii) Many of the participants feel that "defining standards" is also
a good opportunity to introduce "language reforms". All arguments
for dropping a single glyph "ksha" or the entire grantha set 
are proposals of "language reform". (Though I am still in
favour of keeping ORNL, I agreed to its dropping, since ORNL
was dropped officially quite a while ago). In tamil.net we
had several such "language reform proposals" floated. Anu made
one suggesting the use of only uyir and mei alphabets. Now
Selva proposes alternative ways of replacing grantha letters.

Languages do evolve but changes come about very slowly.
"Language reform" is a complex, highly sensitive issue. It is not
at all a wise idea in my opinion for a small group on the internet
to engage in such exercises of "introducing reforms".
Such tasks are better left to Govt-mandated bodies such as TNC.
For those interested in reforms, it will be a good idea to float ideas
on the internet and seek responses. Where there is a large positive
response, the proposals can be forwarded to appropriate
government bodies for follow-up. Fortunately TNC is willing
to hear such propositions. In fact they have officially listed this
as one of their mandated tasks (but only after deciding
on a working tamil computing standard that fully
implements the present day practice of writing tamil - this includes
the minorities). The present approach follows this sequence.

Many academics and software vendors have clearly stated here
that they will not use "any standards that leave out the grantha set".
Software is end-user/market driven. If the standards do not
take into account the needs of the majority of end-users they will
stay only on paper. The present situation of anarchy with many
user-defined fonts will continue to flourish. Do we want this? No.
So let us NOT forever keep mixing up issues of "language
reforms" while we discuss a standard based on the present day
style of written tamil.
Even at the risk of repeating myself, I would like to say this again.
I have no special vested interest to "sanskritise tamil" nor any
grudge against any "thani thamizh" efforts. The proposed scheme
is an effort to accommodate present day writing of tamil script.
So please refrain from giving the present standardising efforts
any such colour.

iii) choosing one of the existing fonts as a standard
Right from the beginning, we have been trying to
define a standard that takes into consideration the field experience
we got from many tamil fonts and DTP packages. 
Many of us did try out many of these packages and have
chosen to work with one or the other.  It is not
a personality contest and no-one is trying to push indirectly
his/her product. A standard collectively defined by the
entire spectrum (pure academics including computer professionals,
software developers and end-users) has a very good chance
of being accepted quickly. The proposed standard has the
features of many fonts. We ALL will have to update our software/font
for tamil DTP soon but the effort will be well worth it.
Here again, let us not get side-tracked with specific
suggestions of "why not simply accept this or that?" and
counter-arguments for something else.

Another practical problem raised by Muthu, Ravi Paul
and others quite early is connected with the "proprietary status"
of many of the fonts and DTP packages. Many tamil fonts
that are distributed free on the internet are "proprietary
materials" with their own restrictions! This is like Microsoft
distributing Arial and other fonts. Only for a handful of tamil fonts
have the authors publicly stated that their fonts are open
to the public for incorporation in others. Instead of opting for
a good, working proprietary font and fighting with its author
to release it free to the public, we are all better off defining one
publicly but collectively.

iv) tamil numerals:  
As far as implementing the present-day
usage of tamil goes, the only deviation the present scheme
introduces is to include the tamil numerals. Through no fault of
their own, tamil numerals have died out from current usage.

The world is going towards multi-lingual computing. 
The necessity for this in a country like India with its
vast language diversity needs no clarification. Unicode is
fast evolving as 'the' approach to take with no other serious
contender. The Indian Govt has already taken a stand in defining
the Indian Standard Code (ISC) for these and fortunately Unicode
is implementing the ISC in its indian language segment.
Repeatedly we have stressed that we will all benefit a lot
if the proposed standard is defined in such a way as to co-exist
happily with unicode.

It is "not" the revolutionary idea of me or Muthu to bring back
tamil numerals. They are there already in ISC and Unicode!!!
Irrespective of the details of implementation (at the glyph
level or character level) if we do leave out a dozen glyphs
that are in Unicode/ISCII we cannot claim any compatibility.
Tamil numerals 10, 100 and 1000 are there in Unicode.
Nagu's proposal to "add a glyph for 0" so that these numerals
can be re-activated one day to do real mathematics is good.
It is now up to TNC to reflect on this. (If my understanding
is correct, tamil numerals are not based on the decimal system!).
If "tamil numerals" are to be dropped, revisions have to come
first in Unicode and ISCII. Then we will have 10+ slots
liberated for other things. 
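
A minimal Python sketch of the point about the numerals, for
illustration only. It reads the numeric values that Unicode assigns to
the Tamil signs for 10, 100 and 1000, and shows how a number is
commonly described as being written with them in the traditional,
non-positional way (a digit multiplies the sign that follows it, so no
zero is needed). The helper name to_traditional_tamil and its 1-9999
range are assumptions made for this example, not part of any proposal
discussed here.

```python
import unicodedata

# Tamil digits 1-9 occupy U+0BE7..U+0BEF in Unicode.
TAMIL_DIGITS = {i: chr(0x0BE6 + i) for i in range(1, 10)}

# Separate signs for ten, one hundred and one thousand.
TEN, HUNDRED, THOUSAND = "\u0BF0", "\u0BF1", "\u0BF2"


def to_traditional_tamil(n: int) -> str:
    """Write n (1..9999) with the 10/100/1000 signs, without a zero.

    Hypothetical helper: a multiplier of 1 is left out, so 1000 is just
    the thousand sign and 2000 is digit 2 followed by that sign.
    """
    if not 1 <= n <= 9999:
        raise ValueError("this sketch only handles 1..9999")
    parts = []
    for value, sign in ((1000, THOUSAND), (100, HUNDRED), (10, TEN)):
        count, n = divmod(n, value)
        if count:
            if count > 1:
                parts.append(TAMIL_DIGITS[count])
            parts.append(sign)
    if n:
        parts.append(TAMIL_DIGITS[n])
    return "".join(parts)


if __name__ == "__main__":
    for ch in (TEN, HUNDRED, THOUSAND):
        print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.numeric(ch))
    print(to_traditional_tamil(2013))
```

Doing "real mathematics" with place values, as in Nagu's suggestion,
would need the digits 1-9 plus a zero rather than these three signs.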

Unicode is a comprehensive package but will require
state-of-the-art computers to implement it. Also it is not yet
fully mature, particularly for indic languages. The proposed 8-bit
scheme, on the contrary, will allow the making of a self-standing
truetype tamil font that will work on any computer, even a
not-too-old one (anything bought in the last ten years or so).
Still the tamil text files generated using this can be saved in
unicode format if one chooses to do so. (Proposed large scale
archiving projects of UC Berkeley want such unicode compatibility)
So we are converging towards an encoding scheme that
we can be proud of. 
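
To make the "save in unicode format" step concrete, here is a small
Python sketch of a table-driven conversion from an 8-bit font encoding
to Unicode text. The slot numbers in HYPOTHETICAL_SLOTS are invented
for illustration and are NOT the assignments of the proposed scheme;
only the Unicode code points on the right-hand side are real. A slot
that holds a combined glyph maps to a sequence of Unicode characters.

```python
# HYPOTHETICAL slot assignments, invented for this example only.
# They do not reproduce any version of the proposed 8-bit scheme.
HYPOTHETICAL_SLOTS = {
    0xAB: "\u0B85",        # TAMIL LETTER A
    0xAC: "\u0B86",        # TAMIL LETTER AA
    0xB8: "\u0B95",        # TAMIL LETTER KA
    0xB9: "\u0B95\u0BC1",  # combined "ku" glyph -> KA + VOWEL SIGN U
}


def to_unicode(data: bytes) -> str:
    """Convert bytes in the hypothetical 8-bit encoding to a Unicode string."""
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))                 # lower half stays plain ASCII
        elif b in HYPOTHETICAL_SLOTS:
            out.append(HYPOTHETICAL_SLOTS[b])  # upper half goes through the table
        else:
            raise ValueError(f"slot {b} is not assigned in this sketch")
    return "".join(out)


if __name__ == "__main__":
    text = to_unicode(bytes([0x41, 0x20, 0xB9]))
    print([f"U+{ord(c):04X}" for c in text])
```

A real converter would need one entry for every assigned slot, and any
glyph with no Unicode equivalent would have to be handled separately.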

> [1] The character set that needs to be assigned character
>  codes has not been discussed with any rationality.
>  There was debate about whether to include grantha
>  letters or not. There were at least four views aired.
>  First group(Kalyan et al) thinks we should have 
>  all six S,sh,h,j,ksha, sri
> The second group(Kumar Kumarappan et al) expressed 
>  doubts about the need to include Sri and ksha. 
>  The third group (Anu, Kathir et. al) oppose grantha 
>   letters. The fourth ( may be just only me?) 'group' suggested
>   to include only four or five grantha letters and not the
>  ksha but recommended including fa etc. 
>  I think here people have to consider a number of
>  questions and provide supporting reasons that 
>  prevail over the other choices. 
> I don't think this has happened and I feel 'due
>  consideration or thought' had NOT been given.
>  [2] Re: The tamil numerals. Why do we need 10, 100, 1000 
>   symbols ? Did anyone cogently argue for the inclusion 
>  of these. 
Points (i) - (iv) on the design considerations answer
the questions raised above.

>   [3] Why do we need nju, njU, ngu, ngU ?
>  What was the answer to my suggestion that when needed
>  we can write as nj+u etc. ? This suggestion arises
>  because we don't use these letters. 
>  We can provide these symbols in some glyph table.
As I stated earlier, these are there to have some uniform
approach - one that allows facile sorting, search etc.
No "language reforms" now of writing nj+u for nju etc.
Many such proposals (e.g thU as thu + kaal) have been 
posted earlier.

>  [4] Why do we need a separate character code for 'au' ?
>  can't we write o+La  ? This again only because we
>  are going to use it very few times. Is it very
>  different from getting kO or kau for example ?
I have discussed this point specifically with Muthu in
private correspondence. Muthu does admit that there may be
some confusion here. He prefers to keep au as such for
consistency with unicode schemes and sees it as similar to the
confusion we live with when people use upper case o
(O) for the numeral zero (0).

>  [5] Do we need copyright and trademark symbols ?
Copyright and trademark symbols were proposed by Leong
Kok Yong to facilitate their usage in WWW. Many who
put up tamil webpages will need them and having them as
part of the tamil code will allow us to invoke "charset"
definitions early in HTML. It is a very good idea.

>[6] What room have we provided for future expansion ?
In earlier emails, I did raise the question of leaving a few slots
vacant and also how we can define such vacant slots. 
Unicode has two types: code points that are forbidden for all, and
a separate "user-defined" block where anyone
can put special glyphs for specific purposes. My personal
wish is to leave at least four to six slots vacant.

Right now, in scheme 1.4, the last slot (255) is left vacant.
Muthu has indicated his agreement to indicate this slot as 
"user-defined" - an escape route through which software 
developers can introduce other glyphs for ORNL or 
aesthetics or whatever. Nagu or someone else suggested
such possibilities.
Having slots 145-149 filled up with single and double
curly quotes can make DTP processing easier (ANSI 
character set has these glyphs in these slots).  So we have
placed these there.  
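
As a quick check of what sits in these slots, the short Python sketch
below decodes byte values 145-149 using the "cp1252" codec. It assumes
that "ANSI character set" here means Windows code page 1252. In that
code page, 145-148 are the single and double curly quotes and 149 is
the bullet.

```python
import unicodedata

# Assumption: the "ANSI" set referred to is Windows code page 1252.
ansi_slots = {slot: bytes([slot]).decode("cp1252") for slot in range(145, 150)}

if __name__ == "__main__":
    for slot, ch in ansi_slots.items():
        print(slot, f"U+{ord(ch):04X}", unicodedata.name(ch))
```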

Sorry for this long posting. 


