From: John H. <jdh...@ac...> - 2004-10-29 14:21:14
>>>>> "Carl" == Carl Dr Kleffner <cmk...@gm...> writes:

    Carl> I would like to try this. Due to time constraints, it may
    Carl> take some time. As far as I understand I have to use the
    Carl> GlyphIDs as well as the map code from cmap_format_4 to
    Carl> create a latex_to_umbelleek dictionary. Any hints from font
    Carl> experts are appreciated.

The minimum you need to do is provide a dictionary that maps each TeX
symbol name to the fontname/glyphindex for that symbol. E.g., for \pm in
bakoma, the font name is cmsy10.ttf, the glyph index is 8, the character
code is 167 (hex 0xa7), and the glyph name is plusminus. The entry in
the latex_to_bakoma dict is

    r'\pm' : ('cmsy10', 8),

From the fontname and glyph index, we can get the character code and
glyphname from the ttf file.

I have written a little helper script for you. It's brute force and
ain't terribly pretty, but it (mostly, see below) works.

    http://matplotlib.sf.net/share/font_table.py

This creates a font grid table png using the agg backend and
matplotlib's ft2font module - you'll probably want to get the latest CVS
matplotlib for this to work properly - I'm not 100% sure this is
required but it is at least strongly recommended. It will produce a font
grid image for the font specified on the command line, like the
following for umr10.ttf

    http://matplotlib.sf.net/share/umr10.ttf.png

You can use these grid tables to get the hex character code of the
symbol you want, and the output of the script lists the glyphind, ccode,
hex(ccode), and name, sorted by charcode, so you can look up the
glyphind from the hex code. I.e.,

  1) Pick a new tex symbol.

  2) Find the corresponding character in one of the umbellek font table
     pngs, or by using the glyph names listed when you run the
     font_table script.

  3) Use the font_table output to get the glyphind corresponding to the
     symbol/name of interest.

  4) GOTO 1

There is probably a better way, but with a combination of glyphnames and
grid tables you can knock this out in several hours of tedious work. Any
other information you want to attach while you are in the thick of it
(mathml names, unicode chars) would be great, but is not necessary.

    Carl> I would like to add codes for accented chars: r'ä':
    Carl> ('umr10', <code>) Should _mathtext_data.py contain an
    Carl> encoding line, i.e. # -*- coding: latin1 -*- to allow
    Carl> non-Ascii chars?

Perhaps others can give input here about what would be the best way to
proceed. My inclination is to use the TeX names like \"a where possible,
but by all means add them if you have them - getting the codes is the
relatively tedious part; providing the proper interface to them can be
worked out later. It may require some changes to the parser to support
\"a and friends, but this is no problem.

Now, on to the "mostly working" part of the font_table script, which is
why I CCd Paul on this email. The font_table script is working on the
um*.ttf fonts but failing on the bl*.ttf fonts. The reason it is failing
is that FT2Font::get_charmap is returning an empty dict. These fonts are
not empty, e.g. ft2font reports 1 face, 2 charmaps, and 124 glyphs for
blsy.ttf, but get_charmap is returning empty, because the call to

    FT_ULong code = FT_Get_First_Char(face, &index);

is returning 0 for both code and index. Any ideas?

JDH
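
P.S. The lookup in steps 1) to 3) above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows stand in for the (glyphind, ccode, name) lines that font_table prints, only the \pm row comes from this email, and the helper names glyphind_by_hex and make_entry are made up for the example rather than taken from matplotlib.

```python
# Rows in the shape the font_table output is described as having:
# (glyphind, ccode, name). Only the \pm row is taken from the email;
# a real run would list every glyph in the font.
rows = [
    (8, 167, "plusminus"),
]


def glyphind_by_hex(rows):
    """Index the rows by hex(ccode) so a code read off the grid png
    can be turned into a glyph index (hypothetical helper)."""
    return {hex(ccode): (glyphind, name) for glyphind, ccode, name in rows}


def make_entry(texname, fontname, hexcode, table):
    """Build one (key, value) pair for a latex_to_<font> dictionary
    (hypothetical helper)."""
    glyphind, _name = table[hexcode]
    return texname, (fontname, glyphind)


table = glyphind_by_hex(rows)

# Step 1: pick a TeX symbol; step 2: read its hex code off the grid;
# step 3: look up the glyph index and record the entry.
latex_to_bakoma = dict([make_entry(r'\pm', 'cmsy10', '0xa7', table)])
print(latex_to_bakoma)
```

Keying the table by the hex string keeps it in the same form as the codes shown in the grid images, so no conversion is needed while transcribing.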