From: Rafael L. <rla...@us...> - 2004-05-18 17:25:20
* Andrew Roach <aro...@ya...> [2004-05-18 22:05]:

> I agree with Rafael, and believe unicode is the way to go long term - it's
> only an accident of 1980s 8 bit architecture we are stuck with ASCII's 256
> characters. Translating between the Hershey fonts, unicode, then finding
> freetype compatible fonts with the symbols would be the interesting part of
> the exercise.

In practical terms, we will have to make the functions pllab, plmtex, and
plptex become Unicode-aware. (N.B.: the remaining problem is with plsym,
which retains the semantics of the Hershey encoding.)

This should not be very complicated. For instance, the following call in
x03c.c:

    plmtex("t", 2.0, 0.5, 0.5, "#frPLplot Example 3 - r(#gh)=sin 5#gh");

would become:

    plmtex("t", 2.0, 0.5, 0.5, "#frPLplot Example 3 - r(θ)=sin 5θ");

where the funny sequence of 8-bit characters "θ" means "the lower-case
Greek letter theta". (Just cut & paste the line above, put it into a file
and open the file in a Web browser with character coding set to Unicode and
you will see a nice "theta"; at least, it works with my Mozilla 1.6.)

Of course, the old method would still be valid, and escape sequences #g*
would be converted to the equivalent Unicode sequences.

To prove that it is really simple to use Unicode nowadays, I prepared a C
program using libunicode that decodes the string above. It is attached
below along with its Makefile. If you have the libunicode-dev package
installed (in Debian, at least), just type make and you will see:

    $ make
    gcc `unicode-config --libs --cflags` plunicode.c -o plunicode
    ./plunicode
    UTF-8 decoding of string: "#frPLplot Example 3 - r(θ)=sin 5θ"
    1: 35
    2: 102
    3: 114
    4: 80
    5: 76
    6: 112
    7: 108
    8: 111
    9: 116
    10: 32
    11: 69
    12: 120
    13: 97
    14: 109
    15: 112
    16: 108
    17: 101
    18: 32
    19: 51
    20: 32
    21: 45
    22: 32
    23: 114
    24: 40
    25: 952
    26: 41
    27: 61
    28: 115
    29: 105
    30: 110
    31: 32
    32: 53
    33: 952

The characters at positions 25 and 33 have the code number 952.
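For readers without libunicode at hand, the decoding step can be sketched in plain C. This is not the attached plunicode.c; it is a hand-rolled illustration, and the names utf8_decode_one, utf8_decode and print_code_points are hypothetical. It assumes a NUL-terminated UTF-8 string and, for brevity, does not reject overlong encodings or surrogates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Decode one UTF-8 sequence at s into *cp.
 * Returns the number of bytes consumed, or 0 if the sequence is malformed.
 * Simplification: overlong forms and surrogates are not rejected. */
size_t utf8_decode_one(const unsigned char *s, unsigned long *cp)
{
    size_t len, i;

    assert(s != NULL && cp != NULL);
    if (s[0] < 0x80) {                    /* 0xxxxxxx: plain ASCII */
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {   /* 110xxxxx: 2-byte sequence */
        *cp = s[0] & 0x1F;
        len = 2;
    } else if ((s[0] & 0xF0) == 0xE0) {   /* 1110xxxx: 3-byte sequence */
        *cp = s[0] & 0x0F;
        len = 3;
    } else if ((s[0] & 0xF8) == 0xF0) {   /* 11110xxx: 4-byte sequence */
        *cp = s[0] & 0x07;
        len = 4;
    } else {
        return 0;                         /* stray continuation or bad lead */
    }
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)        /* also catches a premature NUL */
            return 0;
        *cp = (*cp << 6) | (s[i] & 0x3F); /* append 6 payload bits */
    }
    return len;
}

/* Decode a whole string into out[0..max-1].
 * Returns the number of code points, or -1 on malformed input or overflow. */
long utf8_decode(const char *str, unsigned long *out, size_t max)
{
    const unsigned char *s = (const unsigned char *) str;
    size_t n = 0;

    assert(str != NULL && out != NULL);
    while (*s != '\0') {
        unsigned long cp;
        size_t used = utf8_decode_one(s, &cp);

        if (used == 0 || n == max)
            return -1;
        out[n++] = cp;
        s += used;
    }
    return (long) n;
}

/* Print one "position: code point" line per character, 1-based. */
int print_code_points(const char *str)
{
    unsigned long cps[256];
    long i, n = utf8_decode(str, cps, 256);

    if (n < 0)
        return -1;
    printf("UTF-8 decoding of string: \"%s\"\n", str);
    for (i = 0; i < n; i++)
        printf("%ld: %lu\n", i + 1, cps[i]);
    return 0;
}
```

Writing the theta as the byte escapes "\xCE\xB8" keeps the source file independent of the editor's encoding, for example print_code_points("#frPLplot Example 3 - r(\xCE\xB8)=sin 5\xCE\xB8"). Decoding the example string this way gives 33 code points, with 952 at positions 25 and 33.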
This is indeed the code for the lower-case Greek theta (0x03B8), according
to the Unicode chart for the Greek glyphs
(http://www.unicode.org/charts/PDF/U0370.pdf).

The PLplot core routines would do different things with the Unicode
characters: either sending them directly to be plotted when using a
Unicode-aware driver (like gd with Unicode fonts), or looking them up in a
Unicode->Hershey table and proceeding with the plotting as is the case
today.

-- 
Rafael
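The two conversions described in this message (legacy #g* escapes to Unicode, and the choice between a Unicode-aware driver and the Hershey table) can be sketched as follows. Everything here is hypothetical and is not PLplot code: the table, the function names and the route enum are invented for illustration. Only the mapping #gh -> theta (0x03B8) comes from the x03c.c example; a real table would cover the rest of the Greek alphabet. The sketch also assumes that ASCII characters keep going through the existing Hershey path, and that the legacy input string is plain ASCII.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table: letter after "#g" -> Unicode code point.
 * Only 'h' -> theta is taken from the message; other entries are omitted. */
struct greek_map {
    char letter;
    unsigned long code_point;
};

static const struct greek_map greek_table[] = {
    { 'h', 0x03B8UL }   /* #gh = lower-case theta */
};

#define GREEK_TABLE_LEN (sizeof greek_table / sizeof greek_table[0])

/* Return the code point for "#g<letter>", or 0 if the letter is unknown. */
unsigned long greek_escape_to_unicode(char letter)
{
    size_t i;

    for (i = 0; i < GREEK_TABLE_LEN; i++)
        if (greek_table[i].letter == letter)
            return greek_table[i].code_point;
    return 0;
}

/* Expand "#g*" escapes in an ASCII string into code points.
 * Other escapes (such as "#fr") pass through unchanged.
 * Returns the number of code points written, or -1 if out[] is too small. */
long expand_greek_escapes(const char *in, unsigned long *out, size_t max)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    while (*in != '\0') {
        unsigned long cp = (unsigned char) *in;

        if (in[0] == '#' && in[1] == 'g' && in[2] != '\0') {
            unsigned long greek = greek_escape_to_unicode(in[2]);

            if (greek != 0) {
                cp = greek;
                in += 2;            /* skip "#g"; the letter is skipped below */
            }
        }
        if (n == max)
            return -1;
        out[n++] = cp;
        in++;
    }
    return (long) n;
}

/* Hypothetical routing decision for one character. */
enum route { SEND_TO_DRIVER, USE_HERSHEY_TABLE, NO_GLYPH };

enum route choose_route(unsigned long cp, int driver_is_unicode_aware)
{
    size_t i;

    if (driver_is_unicode_aware)
        return SEND_TO_DRIVER;      /* e.g. gd with Unicode fonts */
    if (cp < 0x80)
        return USE_HERSHEY_TABLE;   /* assumption: ASCII plots as today */
    for (i = 0; i < GREEK_TABLE_LEN; i++)
        if (greek_table[i].code_point == cp)
            return USE_HERSHEY_TABLE;
    return NO_GLYPH;                /* no Unicode->Hershey entry known */
}
```

With this sketch, expanding "#frPLplot Example 3 - r(#gh)=sin 5#gh" yields the same 33 code points as decoding the UTF-8 version of the string, so both spellings of the label could share one internal representation.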