Re: [Indic-computing-devel] UTF-8

On Wed, Jan 30, 2002 at 12:05:35AM +0530, Tapan S. Parikh wrote:
> Now I agree there is a strong argument that this should be done server
> side b/c of portability, but it is not always true that Unicode
> transfers should take 2x as long.  This is only currently true b/c the
> current standard byte-oriented Unicode encoding (UTF-8) is biased
> towards Latin scripts, in that Latin chars take up one byte and all
> others 2-3.  One could very easily imagine and implement an encoding
> that would be biased towards Indian scripts, in that ISCII chars would
> take up only one byte and all others 2 or 3.  Now I'm not saying we
> should do this, and obviously there is the issue of how to distinguish
> between different scripts, and related issues in Unicode round trips,
> but it's something to think about...

If you use gzip'ed Unicode HTML files, I'm sure whatever size
advantage Latin scripts have in UTF-8 will be neutralized. I think
there are more important issues (sort order, OpenType font support)
that we need to worry about.
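
A rough way to check the gzip point is to compress the same text in
UTF-8 and in a one-byte-per-character encoding and compare the sizes.
The sketch below is only an illustration: to_one_byte() is a made-up
stand-in for an ISCII-like encoding (it is not real ISCII), and the
sample is a short phrase repeated many times, which compresses far
better than a real page would.

```python
import gzip


def to_one_byte(text: str) -> bytes:
    """Hypothetical ISCII-like stand-in (not real ISCII): ASCII is kept
    as is, and each Devanagari code point U+0900..U+097F is mapped to a
    single byte in the range 0x80..0xFF."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(cp)
        elif 0x0900 <= cp <= 0x097F:
            out.append(0x80 + (cp - 0x0900))
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return bytes(out)


def report(text: str) -> dict:
    """Return raw and gzip'ed sizes in bytes for both encodings."""
    utf8 = text.encode("utf-8")
    one_byte = to_one_byte(text)
    return {
        "utf8_raw": len(utf8),
        "one_byte_raw": len(one_byte),
        "utf8_gzip": len(gzip.compress(utf8)),
        "one_byte_gzip": len(gzip.compress(one_byte)),
    }


# Illustrative sample only: one Hindi sentence repeated 200 times.
sample = "भारत एक विशाल देश है। " * 200
sizes = report(sample)

for name, size in sizes.items():
    print(f"{name:>14}: {size} bytes")
```

The raw UTF-8 size should come out well over twice the one-byte size,
while the two gzip'ed sizes should be much closer to each other. To
get meaningful numbers, run report() on real page content.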

	-Arun