Tamil Discussion archive

Re: glyph choices for char.scheme

Dear Kalyan,

Please read further...

Dr.K. Kalyanasundaram wrote:
> Regarding the old style characters, the way I see things is as follows:
> old style lai is a distinct beast (graphical representation) by itself.
> If  the kokki has a separate code of its own, la following this kokki
> is a different object from the same la following the aikara modifier
> (irandu chuzhi). If an old text had this lai occurring in the old form
> it will be keyed  in and stored in that form. Even in the same file
> if it appears in the new form it will be typed and stored in that form.
> Of course, in practical reading of the tamil script these two are one
> and the same. We can bring in software intervention if necessary
> to identify the old style characters and replace them by the new format.
> So for people's understanding, these old-style alphabets (six in
> all, three with unique glyphs and three derived using the kokki)
> are same but for storage and handling they are not the same.
> Probably this is where we differ in our conceptual role of how the
> old style characters are handled in the character encoding scheme.

Just one note about my points regarding this. I overlooked the
three unique glyphs. Each of them needs only one code, but the new
forms need two codes.

But, I have **Only One** more question on this subject:

       I **Really** do not understand why you are particular
       about **storing** old text in old style format when
       it can be mapped to the new format just before storing,
       and then stored.

       I am assuming that electronic devices, e.g. OCR systems,
       can recognize a shape and store two codes (more than one!)
       to represent it.
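
       A rough sketch of the mapping I have in mind, in Python. All the
       code names are made-up placeholders, since no code table is fixed
       yet, and I am assuming that the new form of each unique old glyph
       is simply a base consonant followed by a modifier:

```python
# Hypothetical placeholder names; none of these come from an agreed code table.
KOKKI = "KOKKI"    # old-style aikara hook
AIKARA = "AIKARA"  # new-style aikara modifier (irandu chuzhi)

# Old-style glyphs that occupy ONE code, mapped to the TWO codes of their
# new form (assumed here to be base consonant + modifier).
UNIQUE_OLD_TO_NEW = {
    "OLD_GLYPH_1": ("CONSONANT_1", "MODIFIER"),
    "OLD_GLYPH_2": ("CONSONANT_2", "MODIFIER"),
    "OLD_GLYPH_3": ("CONSONANT_3", "MODIFIER"),
}


def to_new_style(codes):
    """Return a new list in which old-style forms are replaced by new-style ones."""
    out = []
    for code in codes:
        if code == KOKKI:
            # The kokki precedes its consonant, so swapping it in place
            # for the aikara modifier keeps the order intact.
            out.append(AIKARA)
        elif code in UNIQUE_OLD_TO_NEW:
            out.extend(UNIQUE_OLD_TO_NEW[code])
        else:
            out.append(code)
    return out
```

       With this, to_new_style(["KOKKI", "LA"]) gives ["AIKARA", "LA"],
       and the file only ever contains the new format.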

> The usage and implications of the old tamil numerals are also in the
> same spirit. Tamil numeral one is a different object altogether and has
> no relation whatsoever to the roman numeral 1.  The former series,
> in electronic texts are pure graphic objects representing the way the
> numerals were used. They cannot be used to input numbers or do
> mathematical operations using the numeric keypad of the standard
> keyboard for example.


> I am very reluctant to say anything more on grantha letters. Since
> you raised a specific point on the need to have "ksha" vis-a-vis
> having it replaced by ka + sha. Bottom line it is a question of
> personal preference and taste. If there is no space problems,
> I do not mind having it.

Even if it is only one, I really don't want to have a ***Redundant***
***Borrowed*** letter which will not serve any special purpose
in my language anyway.

I think I have explained it to the extent that it is more than a
**Personal Preference**.

Please convince me ...

> If I want to argue (I do not want to)
> we can extend this argument and go back to Muthu's earlier
> propositions of writing thuu as thu followed by aakara kaal and
> so on.

Yes. I had a question on this. I did not get any reply from you.
Will it not be too difficult (won't they look disproportionate)
to stuff those big letters onto character terminals?

Will it not be helpful to have the half-kaal as a separate glyph?

> I already posted my clarification on diacritical markers so I will
> not repeat them here. I fully agree with Muthu that using
> translators/convertors one can always go back between tamil
> script text and a romanized text. My preferences are going for
> a general /comprehensive truetype font that can directly implement
>  input in both formats without the need for any additional
> software gimmicks.
> >We have a set of codes that will map to all the characters in Tamil.
> >That is we have a **representation** of all the characters that can
> > be used to store all the Tamil characters.
> If the input is #107 (roman k)  followed by #97 (roman a),
> you may be able to read as ka on screen but in the reference
> character code, the tamil alphabet will be there only if the glyphs
> corresponding to tamil glyph ka is keyed followed by the aakara
> modifier glyph.
> So when you do a search for the tamil alphabet ka, why should the
> sequence of codes (#107)(#97) bother anyone.
> I still do not understand the problem if any. Please explain so that
> we can converge.

Just (only) one more explanation (a modified/refined thought that
aligns with Muthu's points):

1. If we have a transliteration scheme FOR Y (Tamil) language USING
   X (English) language, it is for the convenience of the people of
   X language. They use the software written for X language and
   store the text without any conversion in X language code, and
   the file continues to stay in X language format, so that an
   X language person/software can always process it using
   X language. They continue to enjoy all the facilities their
   software gives them for dealing with X.

2. Now, if a person/software of Y language has to read that file,
   then they (Y) should have a standard scheme to convert from
   one to the other.

Now, X can be any language. Ideally we should standardize a scheme
for all the languages. Then they can automatically input the scheme
in their own (unique) input standard. So there is no user input
standard problem.
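
The conversion in point 2 could be sketched roughly like this in Python.
This is only an illustration: the roman spellings and the numeric Tamil
codes in the table are invented, not taken from any proposed scheme, and
a real table would be far larger. The idea is a fixed table plus a
longest-match scan over the 7-bit text:

```python
# Hypothetical table: roman (7-bit) spellings -> Tamil codes.
# The numeric values are invented purely for illustration.
ROMAN_TO_TAMIL = {
    "ka": [0xB0],
    "kaa": [0xB0, 0xA1],  # assumed: consonant code followed by a modifier code
    "ma": [0xB1],
}

MAX_KEY_LEN = max(len(key) for key in ROMAN_TO_TAMIL)


def roman_to_tamil(text):
    """Convert romanized text to a list of Tamil codes by longest match."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest possible spelling first so "kaa" wins over "ka".
        for n in range(min(MAX_KEY_LEN, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in ROMAN_TO_TAMIL:
                out.extend(ROMAN_TO_TAMIL[chunk])
                i += n
                break
        else:
            # Not in the table: pass the character's code through unchanged.
            out.append(ord(text[i]))
            i += 1
    return out
```

The file itself stays in X's format; only the reader on the Y side
runs the conversion.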

Going by this logic, we are not doing the right thing by occupying
the other half of another language (ASCII (English)). Now I understand
what Muthu meant by "Overloading"! As Dr. Srinivasan has put it,
Microsoft can screw up any position in this table any day without
any legal problems. Who can control them in the Lower ASCII part?
Let's hope that they don't do it.

Anyway, going by the Bilingual Table we have, shouldn't we just use the
symbols in the Upper ASCII part? Are WE to add some more symbols in THEIR
language for them to transliterate? At least let that be a 7-bit code,
and let the user enjoy all the facilities that 7-bit English enjoys.
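
By "7-bit" I mean that every byte of the transliterated file stays
below 128. A small check, sketched in Python (the function name is
just my own choice):

```python
def is_seven_bit(data: bytes) -> bool:
    """Return True if every byte lies in the 7-bit range 0..127."""
    return all(byte < 0x80 for byte in data)
```

A file that passes this check can go through any software that
handles plain English text.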

> Kalyan

