[WMASTERS] Re. 8-bit scheme and Unicode 2.0 Tamil

To: Webmasters@tamil.net
Subject: [WMASTERS] Re. 8-bit scheme and Unicode 2.0 Tamil
From: "Dr.K. Kalyanasundaram" <kalyan@igcsun3.epfl.ch>
Date: Thu, 25 Sep 1997 11:56:03 +0000
CC: "K. Srinivasan" <srini@ireq.ca>
Content-Length: 4211
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=us-ascii
Organization: Swiss Federal Inst. of Technology
Reply-To: "Dr.K. Kalyanasundaram" <kalyan@igcsun3.epfl.ch>, webmasters@tamil.net
Sender: owner-webmasters@tamil.net

Srinivasan wrote:
>I suggest that we propose this scheme (version 1.4) for the
>UNICODE Tamil set. Positions U+0B80 to U+0BFF
>
>I feel that suggesting a standard where none exist
>is more urgent than arguing about where there are
>already many.

First, a small clarification.
It is true that we do not have any standards as far as the
7-bit and 8-bit font-based DTP packages are concerned.
But the Tamil segment of Unicode 2.0 is an established 16-bit
standard, and a handful of packages have already implemented
this current version of Unicode Tamil. If we are not happy with
its contents, there is scope for revision, though I gather that
introducing revisions into Unicode will be a long, drawn-out process.
Apparently, for Indic languages the official point of contact
for Unicode is the Govt. of India. So for Tamil, even the TN Govt.
has to go through the DOE, Govt. of India! Revisions come about
first through a revision of ISCII (Indian Script Code for
Information Interchange), followed by incorporation of those
revisions into Unicode.
I do not know whether any revisions were introduced for Tamil during
the current (1997) revision of ISCII. Earlier revisions of ISCII
were in 1988 and 1992. (Anyway, this was the picture Anbarasan
gave during the last TamilNet'97 conference.)

As some of the participants have pointed out, the philosophy
of the Unicode standard is quite different from that of the present
approach. The former is based on character encoding; the latter
starts from glyph encoding, with character encoding emerging from
it only indirectly.
I will try to illustrate the differences between the two approaches
with an analogy (I would much appreciate it if the Unicode experts
in the forum would correct me wherever my presentation goes wrong).
The analogy is an architect trying to build a house consisting of
many walls. The Unicode scheme is like the architect defining the
dimensions of each wall without getting into the details of
how the wall is constructed (the size and number of bricks
used). The architect defines only a minimum set of foundation
blocks/bricks. The walls can be made of red clay, marble
or concrete, from a handful of large bricks or from many small ones.
Unicode tries to define the basic characters that constitute the
alphabet of the language but leaves the details of implementation
(the type and number of glyphs) to the software developers.
Like large bricks, you can use a single glyph to represent each
character. Or, like small bricks, you can use many glyphs to
construct the same single character.
Unicode sees the different Indic languages under one umbrella,
just as an architect sees the many houses that make up a housing
colony. For the sake of aesthetics, the architect can impose some
uniformity on the way all the houses of the colony are built.
Unicode starts with Devanagari as the most complex house to
build and sees all the others (incl. Tamil) as simplified versions
of it.
So we have many basic/foundation bricks in Devanagari but only
a handful in Tamil (only one ka in Tamil instead of the four k's:
ka, kha, ga, gha).
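
For those who like to see this in code, here is a small sketch in
Python (my choice of language, using only its standard unicodedata
module). It assumes the code points of the current Unicode Tamil
block, and it uses canonical decomposition merely as a stand-in
for the pieces a font might draw; the helper names are made up for
this illustration and are not part of any proposal.

```python
import unicodedata

def names(text):
    """Return the Unicode character name of every code point in text."""
    return [unicodedata.name(ch) for ch in text]

def assigned(first, last):
    """Return names of the assigned code points in [first, last]."""
    found = []
    for cp in range(first, last + 1):
        name = unicodedata.name(chr(cp), None)  # None if unassigned
        if name:
            found.append(name)
    return found

# The syllable "ko" stored as characters: consonant + one vowel sign.
ko = "\u0b95\u0bca"  # TAMIL LETTER KA + TAMIL VOWEL SIGN O

# On screen the vowel sign is drawn in two parts, one on each side
# of the consonant. Canonical decomposition (NFD) exposes the same
# two parts as separate code points; it is used here only as an
# approximation of the "small bricks" a renderer could use.
pieces = unicodedata.normalize("NFD", ko)

print(names(ko))
print(names(pieces))

# The "one ka instead of four" point: the same four slots in the
# Devanagari and Tamil blocks.
devanagari_k = assigned(0x0915, 0x0918)
tamil_k = assigned(0x0B95, 0x0B98)

print(devanagari_k)
print(tamil_k)
```

The text stores two characters for "ko" however many pieces are
used to draw it, and only one of the four Tamil slots that
parallel the Devanagari ka, kha, ga, gha is actually assigned.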

To assist all of us, I have put up a gif at the URL
http://www.geocities.com/Athens/5180/examples.gif
This gives examples of how various Tamil words will be
displayed on screen, with the corresponding storage formats
under the proposed 8-bit encoding (version 1.4) and under
Unicode 2.0. Please consult this gif and comment.
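
For the Unicode side only, a rough sketch of what "storage format"
means can be written in Python (an assumption of mine, as is the
choice of the sample word; UTF-16 big-endian stands in here for
the 16-bit storage). The 8-bit values of version 1.4 are not
reproduced, since they have to be read off the gif itself.

```python
import unicodedata

# The word "Tamil" as Unicode characters, in logical (spoken) order.
word = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"

# Store it as 16-bit units (UTF-16, big-endian, no byte-order mark).
stored = word.encode("utf-16-be")

# Split the stored bytes into one 4-digit hex value per character.
hex_units = [stored[i:i + 2].hex().upper() for i in range(0, len(stored), 2)]

# Every character should fall in the Tamil block U+0B80 to U+0BFF.
in_tamil_block = all(0x0B80 <= ord(ch) <= 0x0BFF for ch in word)

for unit, ch in zip(hex_units, word):
    print(unit, unicodedata.name(ch))
print("all in Tamil block:", in_tamil_block)
```

Each line of output pairs one stored 16-bit value with the name of
the character it represents, which is the kind of listing the gif
gives for the Unicode 2.0 column.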

Muthu gave a nice presentation at the last TamilNet'97 on how
the Unicode scheme is envisaged. You can read his paper at
http://www.irdu.nus.sg/tamilweb/tamilnet97/paper/html/muthu.htm

Kalyan