From: Paul B. <ba...@st...> - 2004-11-01 20:21:13
John Hunter wrote:

>>>>>>"Carl" == Carl Dr Kleffner <cmk...@gm...> writes:
>
>    Carl> I would like to try this. Due to time constraints, it may
>    Carl> take some time. As far as I understand I have to use the
>    Carl> GlyphIDs as well as the map code from cmap_format_4 to
>    Carl> create a latex_to_umbelleek dictionary. Any hints from font
>    Carl> experts are appreciated.
>
>The minimum you need to do is provide a dictionary that maps TeX
>symbol name to the fontname/glyphindex for that symbol. Eg for \pm in
>bakoma, the font name is cmsy10.ttf, the glyph index is 8, the
>character code is 167 (hex is 0xa7) and the glyph name is plusminus.
>The entry in the latex_to_bakoma dict is
>
>    r'\pm' : ('cmsy10', 8),
>
>From the fontname and glyph index, we can get the character code and
>glyphname from the ttf file. I have written a little helper script
>for you. It's brute force and ain't terribly pretty, but it (mostly,
>see below) works.
>
>    http://matplotlib.sf.net/share/font_table.py
>
>This creates a font grid table png using the agg backend and
>matplotlib's ft2font module - you'll probably want to get the latest
>CVS matplotlib for this to work properly - I'm not 100% sure this is
>required but it is at least strongly recommended.
>
>It will produce font grid images for the font specified on the command
>line, like the following for umr10.ttf
>
>    http://matplotlib.sf.net/share/umr10.ttf.png
>
>You can use these grid tables to get the hex charcode of the
>symbol you want, and the output of the script lists the glyphind,
>ccode, hex(ccode), and name, sorted by charcode, so you can look up
>the glyphind from the hex code. Ie
>
>  1) Pick a new tex symbol.
>
>  2) Find the corresponding character in one of the umbellek font
>     table pngs, or by using the glyph names listed when you run the
>     font_table script.
>
>  3) Use the font_table output to get the glyphind corresponding to
>     the symbol/name of interest.
>
>  4) GOTO 1
>
>There is probably a better way, but with a combination of glyphnames
>and grid tables you can knock this out in several hours of tedious
>work. Any other information you want to attach while you are in the
>thick of it (mathml names, unicode chars) would be great, but is not
>necessary.
>
>    Carl> I would like to add codes for accented chars: r'ä':
>    Carl> ('umr10', <code>) Should _mathtext_data.py contain an
>    Carl> encoding line, i.e. # -*- coding: latin1 -*- to allow
>    Carl> non-ASCII chars?
>
>Perhaps others can give input here about what would be the best way to
>proceed. My inclination is to use the TeX names like \"a where
>possible, but by all means add them if you have them - getting the
>codes is the relatively tedious part, providing the proper interface
>to them can be worked out later. It may require some changes to the
>parser to support \"a and friends, but this is no problem.
>

A possible alternative approach to getting the proper glyph from the
TTF file is to map the LaTeX name to the PostScript name and then use
the PS name to find the glyph index with ft2font::get_name_index().
This or a similar approach is what I had in mind when I first
implemented the TTF code. It assumes that the glyphs associated with
the PS names adhere to the Adobe PS naming definition. In that case,
the PS name could be used to create a lookup dictionary of
fontname/index on the fly.

My memory is a bit hazy on this issue, but I seem to recall that the
TeX font names are not completely consistent with the Adobe PS names.
In addition, there needed to be a mechanism to distinguish between the
same glyph in different Bakoma fonts. I'm guessing that the more
recent fonts adhere to the PS font naming convention, so it might be
worthwhile pursuing this approach again. It sure would make it easier
to create the math font tables and to use other fonts that contain
such mathematical glyphs.
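
The PS-name approach described above might look roughly like the
following. This is only a sketch under stated assumptions, not the
matplotlib implementation: TEX_TO_PS, build_tex_table and fake_cmsy10
are hypothetical names, and each font is represented by a plain
callable that returns a glyph index for a PS glyph name, or 0 when the
font has no such glyph (the role ft2font::get_name_index() would play
for a real font). Only the \pm / plusminus / cmsy10 / 8 values come
from this thread.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical TeX name -> PostScript glyph name table. The \pm entry is
# taken from the thread; '\notaglyph' is made up to demonstrate a miss.
TEX_TO_PS: Dict[str, str] = {
    r'\pm': 'plusminus',
    r'\notaglyph': 'notaglyph',
}

# A lookup takes a PS glyph name and returns a glyph index, 0 if absent.
GlyphLookup = Callable[[str], int]


def build_tex_table(
    tex_to_ps: Dict[str, str],
    fonts: List[Tuple[str, GlyphLookup]],
) -> Dict[str, Tuple[str, int]]:
    """Build {tex_name: (fontname, glyph_index)} from PS glyph names.

    Fonts are searched in the order given, so the first font that
    contains the glyph wins. Symbols found in no font are left out.
    """
    table: Dict[str, Tuple[str, int]] = {}
    for tex_name, ps_name in tex_to_ps.items():
        for fontname, lookup in fonts:
            glyph_index = lookup(ps_name)
            if glyph_index:  # 0 is treated as "glyph not in this font"
                table[tex_name] = (fontname, glyph_index)
                break
    return table


def fake_cmsy10(ps_name: str) -> int:
    """Stand-in for a real font lookup; knows only the glyph from the thread."""
    return {'plusminus': 8}.get(ps_name, 0)


table = build_tex_table(TEX_TO_PS, [('cmsy10', fake_cmsy10)])
print(table)
```

The search order of the font list is one simple way to pick between
fonts that carry the same glyph name; it does not solve the naming
inconsistencies mentioned above, which would still need a hand-written
exceptions table.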

>Now, on to the "mostly working" part of the font_table script, which
>is why I CCd Paul on this email. The font_table script is working on
>the um*.ttf fonts but failing on the bl*.ttf fonts. The reason it is
>failing is that FT2Font::get_charmap is returning an empty dict.
>These fonts are not empty, eg ft2font reports 1 face, 2 charmaps, and
>124 glyphs for blsy.ttf, but get_charmap is returning empty, because
>the call to
>
>    FT_ULong code = FT_Get_First_Char(face, &index);
>
>is returning 0 for code and index.
>
>Any ideas?
>

John, you appear to have solved this one yourself.

 -- Paul

-- 
Paul Barrett, PhD      Space Telescope Science Institute
Phone: 410-338-4475    ESS/Science Software Branch
FAX:   410-338-4767    Baltimore, MD 21218