Tamil Discussion archive
[WMASTERS] Anbu Arasan mail-repost
>>On Sep 17 Anbu Arasan clearly explained that the proposed encodings should
>>be based on basic characters, not glyphs, and he gave the reasons. At
>>this point we have to look at his mail seriously. Can any one of you
>>repost that mail?
>Unfortunately I don't have that post. I second this request.
Here I repost the posting of Anbu Arasan referred to above.
Received-Date: Wed, 17 Sep 97 19:55:53 +0530
Posted-Date: Wed Sep 17 19:29:09 1997
Date: Wed Sep 17 19:29:09 1997
X-Mailer: aXcess Mail (version 2.0)
Subject: Thought provoking on Tamil encoding
I want to make it very clear that I in no way intend to lament anyone;
my interest is purely and solely to put forward my views for the
enshrinement of the sweet Tamil. If my writing, in any way at any place,
puts off someone doing their mighty service for the enshrinement of
Tamil, or those using it, I once again ask them to forgo and forgive it
for the sake of the prosperity of the language for which such
unprecedented discussions are taking place. No doubt these discussions
would pave the way for better understanding and the best solutions.
The Tamil language has evolved and been reformed since time immemorial.
Since we believe in what we see, some of the participants in this
discussion believe that representing (displaying and printing) Tamil on
computers is Tamil encoding.
Tamil has witnessed and withstood many changes in its script form as
well as in its character set.
Before starting to encode Tamil glyphs, a requirement (aim) has to be
formulated about what it is that is going to be encoded, without which it
is not advisable to select and segregate the Tamil glyphs as per one's taste.
I want to make one point very clear. There appears to be some confusion
somewhere within the ambit of the discussion between font/glyph and
character set. A set of glyphs is not itself a Tamil character set. The
character set of Tamil is "uyir eluthukkal and mei eluthukkal". These
thirty letters are the basis for Tamil, and the combinations of these
letters form hundreds of characters; it is not possible to encode all
these characters on computers. The basic common characters are considered
the character set for Indian languages and are encoded in ISCII (new
standard).
It is because of some misinterpretation by some people involved in the
earlier versions of the ISCII Standards that an unnecessary coding appears
to have been done for matra characters. These matras (vowel signs) stand
for the corresponding vowel present in the "uyir mei eluthukkal". These
signs could be just one sign, or two or three, among the Indian languages
(in Tamil, only up to two signs are used). A sign can come only on the
right side, as in "kA, ki, kI" etc., or only on the left side, as in
"kai, ke, kE" etc., or on both sides, as in "ko, kO, kou" etc. This is so
not only in Tamil, but also in some of the other Tamil-influenced
languages like Malayalam, Bengali (Bangla), Assamese, etc.
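As a small illustration of the right-side, left-side, and two-sided vowel signs described above, here is a minimal Python sketch. It assumes the Unicode Tamil block rather than ISCII (the code points and the helper name `sign_parts` are illustrative choices, not something taken from the post) and relies on the canonical decompositions that Unicode defines for the two-part signs.

```python
import unicodedata

KA = "\u0b95"  # TAMIL LETTER KA

# Vowel signs (matras) grouped by where they are drawn relative to the
# consonant. Code points are from the Unicode Tamil block (an assumption).
SIGNS = {
    "right": ["\u0bbe", "\u0bbf", "\u0bc0"],  # AA, I, II -> kA, ki, kI
    "left": ["\u0bc6", "\u0bc7", "\u0bc8"],   # E, EE, AI -> ke, kE, kai
    "both": ["\u0bca", "\u0bcb", "\u0bcc"],   # O, OO, AU -> ko, kO, kau
}

def sign_parts(sign):
    """Return the Unicode names of the canonical (NFD) pieces of a vowel sign."""
    return [unicodedata.name(c) for c in unicodedata.normalize("NFD", sign)]

for side, signs in SIGNS.items():
    for sign in signs:
        # A two-sided sign decomposes into a left piece and a right piece.
        print(side, KA + sign, sign_parts(sign))
```

Every sign in the table yields either one or two pieces, which is consistent with the remark that Tamil uses at most two signs.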
Even though we call the composite characters "uyir mei", their
composition is consonant (mei) plus vowel (uyir). Using this as
a basis, Indian scripts are being coded on computers. This is true
even of the earlier ISCII Standard, ISCII-91 (called level 1). In it,
consonants are followed by matras. It is the same in Unicode also.
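To make the "consonant followed by matra" storage order concrete, here is a short Python sketch using Unicode, which the post says follows the same rule. The helper name `describe` is hypothetical; the example picks "ke", whose vowel sign is drawn to the left of the consonant but is still stored after it.

```python
import unicodedata

def describe(text):
    """List the code point and Unicode name of each stored character."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c)) for c in text]

# "ke" is displayed with the vowel sign to the LEFT of the consonant,
# yet the stored sequence is consonant first, then the vowel sign.
ke = "\u0b95\u0bc6"
for code, name in describe(ke):
    print(code, name)
```

The stored order reflects the characters (mei, then uyir); placing the sign on the left is left to the font and the rendering software.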
Kanpadthum poi....... theera vicharippadhe mei (what is seen may be false;
only what is thoroughly inquired into is true). It seems that most of the
participants did not understand the encoding followed in ISCII, and so
I humbly repeat: character encoding and font design are two different
things, and they are not to be mixed up together.
The current discussion on glyphs is widely misunderstood as encoding
Tamil on computers and as the basis for enshrining Tamil
electronically. This appears to be a misconception that has
engulfed the discussion.
Font encoding cannot solve many issues like sorting, searching,
and preserving Tamil itself. Font encoding is just one way of displaying
(rendering) Tamil on computers (since lot of maturing desired on
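The sorting and searching point can be sketched in Python. The function `to_visual_order` below models a purely hypothetical glyph-order storage in which a left-side sign is stored before its consonant; it is not any specific font encoding proposed on the list. Plain code point sorting is used only to show the effect and is not a proper Tamil collation.

```python
KA, MA = "\u0b95", "\u0bae"   # TAMIL LETTER KA, TAMIL LETTER MA
SIGN_E = "\u0bc6"             # vowel sign drawn to the left of its consonant

def to_visual_order(text):
    """Hypothetical glyph-order storage: put a left-side sign before its consonant."""
    out = list(text)
    for i in range(1, len(out)):
        if out[i] == SIGN_E:
            out[i - 1], out[i] = out[i], out[i - 1]
    return "".join(out)

words = [MA + SIGN_E, KA + SIGN_E, MA, KA]   # me, ke, ma, ka

# Character-based storage: each "k..." word sorts next to "ka".
logical = sorted(words)

# Glyph-order storage: every word starting with the left-side sign is
# grouped together, away from its consonant.
visual = sorted(to_visual_order(w) for w in words)

# A prefix search for the consonant also misses the glyph-order form.
print(to_visual_order(KA + SIGN_E).startswith(KA))
```

With character-based storage the order is ka, ke, ma, me; with the glyph-order model it becomes ka, ma, ke, me, and a search for words beginning with "k" no longer finds "ke".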
Regarding "glyph substitution" (wrongly stated as font substitution; a font
substitution means substituting one font, say 'Arial', in an
environment with 'Times New Roman'), I feel we can think of it as one
option. Since glyph substitution is already implemented in Windows NT and
Windows 95 TrueType fonts (this is not OpenType), it
is the best option. It all depends on our requirement (all of us -
we have not yet decided for what environment we are discussing the issue).
If we are talking about the future, including the present-day computers
capable of running Windows 95 for PCs or System 7.x on Apple, we can
adopt the "glyph substitution method". If our target is something else,
glyph substitution will fail to support us. "Future international extensions
to True type may require a unique Glyph" is as mentioned in the TrueType
documentation "True type 1.0 font files - Technical specification revision
1.66" by Microsoft. Since TrueType is being promoted by both Microsoft and
Apple, it seems that glyph substitution will continue.
The glyph ordering followed by Dr Kalyan seems illogical; for the
correct order, just follow the thamizh nedunkanakku.
I feel the Glyph encoding has to be discussed, whether we need 8 bit or
bit, whether to support only GUI computers or at least machines from the
AT 286 up (most of the Government offices in India still use these
outdated machines), or to cater to all electronic gadgets, as someone
pointed out about POS. I have implemented a few Indian languages on pagers.
Someone may wonder why we cannot use 128-160.
128-160 is just a replication of 0-32. It means that 160 (no-break space)
has to be the same as 32 (space), with the same advance width.
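A quick way to see the "replication" is to check the character categories in Python. This sketch assumes the ISO 8859-1 layout, which the first 256 Unicode code points mirror; the helper name `is_control` is illustrative. Windows code pages differ here, since they place printable characters in part of the 128-159 range.

```python
import unicodedata

def is_control(code):
    """True if the code point is a control character (Unicode category Cc)."""
    return unicodedata.category(chr(code)) == "Cc"

# 0-31 are the C0 controls and 128-159 are the C1 controls.
c0 = [c for c in range(0, 32) if is_control(c)]
c1 = [c for c in range(128, 160) if is_control(c)]
print(len(c0), len(c1))

# 32 and 160 are both spaces: SPACE and NO-BREAK SPACE.
print(unicodedata.name(chr(32)), "/", unicodedata.name(chr(160)))
```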
Regarding Dr Harold Schiffman's requirement and that of like-minded
linguists (old Tamil letters are nothing but different 'varivadivam',
i.e. written forms, for the same constituents), this is taken care of in
the ISCII Standard. That is, any literature could be stored using the
ISCII encoding scheme to preserve it. Since no common interface software
is available yet, the developers can provide a kind of converter to
store text in ISCII.
(Apple has implemented ISCII - level in their machine and Microsoft is
for ISCII level 2).
I remember Dr Harold Schiffman referring to quote marks. I would like
to present my view here. I feel his requirement is for us to have the quote
marks as used in Tamil texts (and in Indian languages and English in
that is the single quote will look like as if the comma is shifted to
the ascender of the character. Since the Glyphs encoding is round about
bit encoding retaining English, now, it is to accommodate in the upper
Quote marks used in Tamil is different from the one used in
There are two different single quote marks, the open quote and the close
quote marks. They are similar to an inverted comma and a comma, as seen at
character positions 145 and 146 in the Arial fonts used in Windows.
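The two positions mentioned above can be looked up in Python. This sketch assumes that "Windows" here means the Windows Latin-1 code page (cp1252), which the post does not name explicitly.

```python
import unicodedata

# Decode byte values 145 and 146 under Windows code page 1252 (an assumption)
# and show which Unicode characters they correspond to.
for position in (145, 146):
    ch = bytes([position]).decode("cp1252")
    print(position, f"U+{ord(ch):04X}", unicodedata.name(ch))
```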
In India, the Indian language numerals (except Tamil) are seen to be
gaining ground because of the effort pushed behind them and because they
are being treated as part of the language itself (I feel a language
cannot be complete without its own numbering system).
I have not seen the romanised keyboard proposed by the
Standardisation Committee (has it been finalised?). If it is finalised,
does it use only English alphabets, or diacritic marks as well? If it is
based only on the English alphabets, it provides keyboarding without any
'extras', and that is the end of the transliteration subject. I feel the
transliteration should make it possible to key in Tamil without any extra
font or software.
In conclusion, I suggest encoding Tamil based on its
character set. Tamil is not like English, which has a one-to-one
correspondence between character coding and display. Tamil has to be
handled in two parts, i.e., an encoding based on Tamil characters and a
font to render the Tamil script. In the present scenario, it is not
possible to have a single character encoding scheme and a single font
encoding scheme to cater to all
the living computers and its operating systems. ASCII in the
environment and ANSI in the WINDOWS ( and other) environment are
different encoding scheme.
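The closing remark about ASCII and ANSI can be illustrated with a short Python sketch. It assumes that the "ASCII" environment means a DOS OEM code page such as 437 and that "ANSI" means Windows code page 1252; neither code page is named in the post, and `decode_both` is a hypothetical helper.

```python
def decode_both(byte_value):
    """Decode one byte under code page 437 (DOS) and code page 1252 (Windows)."""
    raw = bytes([byte_value])
    return raw.decode("cp437"), raw.decode("cp1252")

# The printable ASCII range decodes identically under both code pages.
ascii_bytes = bytes(range(32, 127))
same_in_ascii_range = ascii_bytes.decode("cp437") == ascii_bytes.decode("cp1252")
print(same_in_ascii_range)

# In the upper half the same byte means different characters.
for value in (0x82, 0xE9):
    dos, win = decode_both(value)
    print(hex(value), repr(dos), repr(win))
```

Any 8-bit glyph encoding that places Tamil in the upper half is subject to the same dependence on the environment's code page.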