RE: [Phpvideopro-developers] About UTF-8
From: Tom A. <to...@ko...> - 2001-09-17 15:55:37
Maybe I missed something, but the other day I saw some functions in PHP
to convert text to UTF (or something). But isn't XML rather standard, and
won't it cover this problem? Not that I like it, but ...

Tom

> -----Original Message-----
> From: php...@li...
> [mailto:php...@li...urceforge.net] On Behalf Of Leszek Boroch
> Sent: Wednesday, 5 September 2001 21:25
> To: phpvideopro-developers
> Subject: [Phpvideopro-developers] About UTF-8
>
> From http://www.cl.cam.ac.uk/~mgk25/unicode.html
>
> Because of these difficulties, the major Linux distributors and
> application developers now foresee and hope that Unicode will
> eventually replace all these older legacy encodings, primarily in the
> UTF-8 form. UTF-8 will be used in
>
>   text files (source code, HTML files, email messages, etc.)
>   file names
>   standard input and standard output, pipes
>   environment variables
>   cut and paste selection buffers
>   telnet, modem, and serial port connections to terminal emulators
>
> and in any other places where byte sequences used to be interpreted
> in ASCII.
>
> In UTF-8 mode, terminal emulators such as xterm or the Linux console
> driver transform every keystroke into the corresponding UTF-8 sequence
> and send it to the stdin of the foreground process. Similarly, any
> output of a process on stdout is sent to the terminal emulator, where
> it is processed with a UTF-8 decoder and then displayed using a 16-bit
> font.
>
> Full Unicode functionality with all bells and whistles (e.g.
> high-quality typesetting of the Arabic and Indic scripts) can only be
> expected from sophisticated multi-lingual word-processing packages.
> What Linux will use on a broad base to replace ASCII and the other
> 8-bit character sets is far simpler. Linux terminal emulators and
> command line tools will in the first step only switch to UTF-8. This
> means that only a Level 1 implementation of ISO 10646-1 is used (no
> combining characters), and only scripts such as Latin, Greek,
> Cyrillic, Armenian, Georgian, CJK, and many scientific symbols are
> supported that need no further processing support. At this level, UCS
> support is very comparable to ISO 8859 support and the only
> significant difference is that we have now thousands of different
> characters available, that characters can be represented by multibyte
> sequences, and that ideographic Chinese/Japanese/Korean characters
> require two terminal character positions (double-width).
>
> ---
> Leszek Boroch
> Technical University of Lublin
> mailto: bo...@aj...
>
> eng: KISS! - Keep It Simple Stupid!
> pol: BUZI! - Bez Udziwnien Zapisu Idioto! [No Weird Notation, Idiot!]
> or:  BUZI! - Bez Urzywania Zakreconego Interfejsu! [No Using a Convoluted Interface!]
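
The last two points of the quoted text (characters become multibyte sequences, and CJK ideographs take two terminal columns) can be checked with a small Python sketch. The helper names `utf8_length` and `terminal_width` are made up for this illustration, and the width rule is only an approximation based on the Unicode East Asian Width property; it is not a claim about how xterm or the Linux console actually compute widths.

```python
import unicodedata


def utf8_length(ch: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def terminal_width(ch: str) -> int:
    """Rough column count: 2 for Wide/Fullwidth characters, otherwise 1.

    Assumption: East Asian Width classes 'W' and 'F' stand in for
    "double-width"; combining characters and ambiguous-width cases
    are ignored, in line with the Level 1 scope of the quoted text.
    """
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


# Latin A, Latin e with acute, Cyrillic Zhe, CJK ideograph "middle"
samples = ["A", "\u00e9", "\u0416", "\u4e2d"]

for ch in samples:
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} byte(s), "
          f"{terminal_width(ch)} column(s)")
```

Plain ASCII stays at one byte per character, which is why existing ASCII files remain valid UTF-8; only characters outside ASCII grow to two or more bytes.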