From: Carl D. K. <cmk...@gm...> - 2004-10-29 06:47:23
> >>>>> "Carl" == Carl Dr Kleffner <cmk...@gm...> writes:
>
>     Carl> Any thoughts on using Richard Kinch's Universal Modern fonts
>     Carl> as well as the Belleek fonts on his site (www.truetex.com)
>     Carl> instead of the bakoma fonts? It seems that um and Belleek
>     Carl> are free to use and redistributable. This is not the case
>     Carl> for bakoma in commercial use. The quality of the fonts is
>     Carl> excellent. Unlike bakoma, they include umlauts and more
>     Carl> special characters. The charnums are different, however,
>     Carl> and are not listed on Mozilla's encoding page.
>
> I would be very happy to support these fonts, especially if someone
> (you, perhaps) provided the dictionary mapping tex symbol names to
> name/charnum, as in the latex_to_bakoma dictionary in
> matplotlib._mathtext_data. The rest is easy, and I could provide an
> rc param allowing you to select which fonts to include.
>
> ...
>
> JDH

I would like to try this. Due to time constraints, it may take some
time. As far as I understand, I have to use the GlyphIDs as well as the
map codes from cmap_format_4 to create a latex_to_umbelleek dictionary.
Any hints from font experts are appreciated.

I would like to add codes for accented chars:

    r'ä': ('umr10', <code>)

Should _mathtext_data.py contain an encoding line, i.e.

    # -*- coding: latin1 -*-

to allow non-ASCII chars?

Regards

Carl
From: John H. <jdh...@ac...> - 2004-10-29 14:21:14
>>>>> "Carl" == Carl Dr Kleffner <cmk...@gm...> writes:

    Carl> I would like to try this. Due to time constraints, it may
    Carl> take some time. As far as I understand I have to use the
    Carl> GlyphIDs as well as the map code from cmap_format_4 to
    Carl> create a latex_to_umbelleek dictionary. Any hints from font
    Carl> experts are appreciated.

The minimum you need to do is provide a dictionary that maps each TeX
symbol name to the fontname/glyphindex for that symbol. E.g. for \pm in
bakoma, the font name is cmsy10.ttf, the glyph index is 8, the
character code is 167 (hex is 0xa7) and the glyph name is plusminus.
The entry in the latex_to_bakoma dict is

    r'\pm' : ('cmsy10', 8),

From the fontname and glyph index, we can get the character code and
glyphname from the ttf file. I have written a little helper script
for you. It's brute force and ain't terribly pretty, but it (mostly,
see below) works.

    http://matplotlib.sf.net/share/font_table.py

This creates a font grid table png using the agg backend and
matplotlib's ft2font module - you'll probably want to get the latest
CVS matplotlib for this to work properly - I'm not 100% sure this is
required but it is at least strongly recommended.

It will produce font grid images for the font specified on the command
line, like the following for umr10.ttf

    http://matplotlib.sf.net/share/umr10.ttf.png

You can use these grid tables to get the hex character code of the
symbol you want, and the output of the script lists the glyphind,
ccode, hex(ccode), and name, sorted by charcode, so you can look up
the glyphind from the hex code. I.e.

  1) Pick a new tex symbol.

  2) Find the corresponding character in one of the umbelleek font
     table pngs, or by using the glyph names listed when you run the
     font_table script.

  3) Use the font_table output to get the glyphind corresponding to
     the symbol/name of interest.

  4) GOTO 1

There is probably a better way, but with a combination of glyphnames
and grid tables you can knock this out in several hours of tedious
work. Any other information you want to attach while you are in the
thick of it (mathml names, unicode chars) would be great, but is not
necessary.

    Carl> I would like to add codes for accented chars: r'ä':
    Carl> ('umr10', <code>) Should _mathtext_data.py contain a
    Carl> encoding line, i.e. # -*- coding: latin1 -*- to allow
    Carl> non-Ascii chars?

Perhaps others can give input here about what would be the best way to
proceed. My inclination is to use the TeX names like \"a where
possible, but by all means add them if you have them - getting the
codes is the relatively tedious part, providing the proper interface
to them can be worked out later. It may require some changes to the
parser to support \"a and friends, but this is no problem.

Now, on to the "mostly working" part of the font_table script, which
is why I CCd Paul on this email. The font_table script is working on
the um*.ttf fonts but failing on the bl*.ttf fonts. The reason it is
failing is that FT2Font::get_charmap is returning an empty dict.
These fonts are not empty, e.g. ft2font reports 1 face, 2 charmaps, and
124 glyphs for blsy.ttf, but get_charmap is returning empty, because
the call to

    FT_ULong code = FT_Get_First_Char(face, &index);

is returning 0 for code and index.

Any ideas?

JDH
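A minimal sketch of steps 1) to 3) above, assuming the rows printed by the font_table script have been collected by hand into (glyphind, ccode, name) tuples. The names CMSY10_ROWS, TEX_TO_GLYPHNAME and build_latex_table are hypothetical and not part of matplotlib; the only data used is the \pm entry for cmsy10 quoted in the message.

```python
# Rows in the shape the font_table output is described as having:
# (glyphind, ccode, glyph name). Only this one row comes from the thread.
CMSY10_ROWS = [
    (8, 167, "plusminus"),
]

# TeX symbol name -> glyph name, paired up by hand (steps 1 and 2).
TEX_TO_GLYPHNAME = {
    r"\pm": "plusminus",
}


def build_latex_table(fontname, rows, tex_to_glyphname):
    """Return {tex name: (fontname, glyphind)} for each name found in rows."""
    glyphind_by_name = {name: glyphind for glyphind, _ccode, name in rows}
    table = {}
    for tex, glyphname in tex_to_glyphname.items():
        # Step 3: look up the glyph index for the chosen glyph name.
        if glyphname in glyphind_by_name:
            table[tex] = (fontname, glyphind_by_name[glyphname])
    return table


latex_to_font = build_latex_table("cmsy10", CMSY10_ROWS, TEX_TO_GLYPHNAME)
print(latex_to_font)
```

Symbols whose glyph name does not appear in the rows are simply left out, so a missing entry shows up as a missing key rather than a wrong index.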
From: Carl D. K. <cmk...@gm...> - 2004-10-29 14:59:37
Dear John

> ...
> Perhaps others can give input here about what would be the best way to
> proceed. My inclination is to use the TeX names like \"a where
> possible, but by all means add them if you have them - getting the
> codes is the relatively tedious part, providing the proper interface
> to them can be worked out later. It may require some changes to the
> parser to support \"a and friends, but this is no problem.
>

I will take that script (and use the output of the ttx fonttool as
well) to work out the glyph ids for each latex symbol. This will take a
week or so. If it is not possible to reach the charmap values with
get_charmap for the Belleek fonts, one could use an additional dict
with the charmap values.

> Now, on to the "mostly working" part of the font_table script, which
> is why I CCd Paul on this email. The font_table script is working on
> the um*.ttf fonts but failing on the bl*.ttf fonts. The reason it is
> failing is that FT2Font::get_charmap is returning an empty dict.
> These fonts are not empty, eg ft2font reports 1 face, 2 charmaps, and
> 124 glyphs for blsy.ttf, but get_charmap is returning empty, because
> the call to
>
>     FT_ULong code = FT_Get_First_Char(face, &index);
>
> is returning 0 for code and index.
>
> Any ideas?
>
> JDH
>

Maybe because these fonts use charmap values below 0x20:

    <cmap_format_4 platformID="3" platEncID="0" version="0">
      <map code="0x1" name="Delta"/><!-- <control> -->
      <map code="0x2" name="Theta"/><!-- <control> -->
      <map code="0x3" name="Lambda"/><!-- <control> -->

A cmap value for 0x0 is missing. The um fonts have:

    <map code="0x0" name=".null"/>

But this is just a wild guess.

Regards

Carl
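The ttx output quoted above can be read mechanically rather than by eye. A small sketch, assuming the cmap_format_4 element has been cut out of the ttx dump and closed so that it parses on its own; TTX_EXCERPT and read_cmap are hypothetical names, and the three map entries are the ones quoted in the message.

```python
import xml.etree.ElementTree as ET

# The excerpt from the message, with the closing tag added so it parses alone.
TTX_EXCERPT = """
<cmap_format_4 platformID="3" platEncID="0" version="0">
  <map code="0x1" name="Delta"/><!-- <control> -->
  <map code="0x2" name="Theta"/><!-- <control> -->
  <map code="0x3" name="Lambda"/><!-- <control> -->
</cmap_format_4>
"""


def read_cmap(xml_text):
    """Return {character code: glyph name} from one cmap subtable element."""
    root = ET.fromstring(xml_text)
    return {int(m.get("code"), 16): m.get("name") for m in root.iter("map")}


code_to_name = read_cmap(TTX_EXCERPT)

# Codes in the control-character range, and whether 0x0 is mapped at all.
low_codes = sorted(code for code in code_to_name if code < 0x20)
has_null_entry = 0x0 in code_to_name
print(low_codes, has_null_entry)
```

This only inspects the table; it does not confirm or rule out the guess about why get_charmap comes back empty.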
From: John H. <jdh...@ac...> - 2004-10-29 18:03:47
>>>>> "Carl" == Carl Dr Kleffner <cmk...@gm...> writes:

    Carl> I will take that script (and use the output of the ttx
    Carl> fonttool as well) to estimate the glyph ids for each latex
    Carl> symbol. This will take a week or so. If it is not possible
    Carl> to reach the charmap values by get_charmap for the belleek
    Carl> fonts one could use an additional dict with the charmaps
    Carl> values.

I believe I have fixed the problem with the bl*.ttf fonts. I exposed
FT_Set_Charmap via font.set_charmap in ft2font, and calling
font.set_charmap(0) seems to cure the problem with the bl* fonts.
Apparently, there isn't a default charmap set for those fonts - not
exactly sure.

In any case, I updated CVS -- make sure you have at least revision
1.10 of ft2font.cpp and the updated script at

    http://matplotlib.sf.net/share/font_table.py

Have fun :-)

Perhaps we should move further discussion on this issue over to the
matplotlib-devel list.

JDH
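A sketch of how a script might apply this fix only when it is needed, assuming a font object that offers the get_charmap and set_charmap methods named above. charmap_with_fallback is a hypothetical helper, and StubFont is a hypothetical stand-in (with made-up entries) for a font that has no default charmap, so the example runs without a font file.

```python
def charmap_with_fallback(font, charmap_index=0):
    """Return font.get_charmap(), selecting a charmap first if it is empty."""
    charmap = font.get_charmap()
    if not charmap:
        # No default charmap appears to be selected: pick one and retry.
        font.set_charmap(charmap_index)
        charmap = font.get_charmap()
    return charmap


class StubFont:
    """Hypothetical stand-in: returns nothing until a charmap is selected."""

    def __init__(self, entries):
        self._entries = entries
        self.selected = None

    def set_charmap(self, index):
        self.selected = index

    def get_charmap(self):
        return dict(self._entries) if self.selected is not None else {}


stub = StubFont({1: 1, 2: 2})  # made-up entries, not real font data
charmap = charmap_with_fallback(stub)
print(len(charmap), stub.selected)
```

With a real font, the object passed in would be whatever the font_table script constructs from the .ttf file; fonts that already return a non-empty charmap are left untouched.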
From: Paul B. <ba...@st...> - 2004-11-01 20:21:13
John Hunter wrote:

> >>>>> "Carl" == Carl Dr Kleffner <cmk...@gm...> writes:
>
>     Carl> I would like to try this. Due to time constraints, it may
>     Carl> take some time. As far as I understand I have to use the
>     Carl> GlyphIDs as well as the map code from cmap_format_4 to
>     Carl> create a latex_to_umbelleek dictionary. Any hints from font
>     Carl> experts are appreciated.
>
> The minimum you need to do is provide a dictionary that maps TeX
> symbol name to the fontname/glyphindex for that symbol. Eg for \pm in
> bakoma, the font name is cmsy10.ttf, the glyph index is 8, the
> character code is 167 (hex is 0xa7) and the glyph name is plusminus.
> The entry in the latex_to_bakoma dict is
>
>     r'\pm' : ('cmsy10', 8),
>
> From the fontname and glyph index, we can get the character code and
> glyphname from the ttf file. I have written a little helper script
> for you. It's brute force and ain't terribly pretty, but it (mostly,
> see below) works.
>
>     http://matplotlib.sf.net/share/font_table.py
>
> This creates a font grid table png using the agg backend and
> matplotlib's ft2font module - you'll probably want to get the latest
> CVS matplotlib for this to work properly - I'm not 100% sure this is
> required but it is at least strongly recommended.
>
> It will produce font grid images for the font specified on the command
> line, like the following for umr10.ttf
>
>     http://matplotlib.sf.net/share/umr10.ttf.png
>
> You can use these grid tables to get the hex charcode of the
> symbol you want, and the output of the script lists the glyphind,
> ccode, hex(ccode), and name, sorted by charcode, so you can look up
> the glyphind from the hex code. Ie
>
>   1) Pick a new tex symbol.
>
>   2) Find the corresponding character in one of the umbelleek font
>      table pngs, or by using the glyph names listed when you run the
>      font_table script.
>
>   3) Use the font_table output to get the glyphind corresponding to
>      the symbol/name of interest.
>
>   4) GOTO 1
>
> There is probably a better way, but with a combination of glyphnames
> and grid tables you can knock this out in several hours of tedious
> work. Any other information you want to attach while you are in the
> thick of it (mathml names, unicode chars) would be great, but is not
> necessary.
>
>     Carl> I would like to add codes for accented chars: r'ä':
>     Carl> ('umr10', <code>) Should _mathtext_data.py contain a
>     Carl> encoding line, i.e. # -*- coding: latin1 -*- to allow
>     Carl> non-Ascii chars?
>
> Perhaps others can give input here about what would be the best way to
> proceed. My inclination is to use the TeX names like \"a where
> possible, but by all means add them if you have them - getting the
> codes is the relatively tedious part, providing the proper interface
> to them can be worked out later. It may require some changes to the
> parser to support \"a and friends, but this is no problem.

A possible alternative approach to getting the proper glyph from the
TTF file is to map the LaTeX name into the PostScript name and then use
the PS name to find the glyph index from ft2font::get_name_index().
This or a similar approach is what I had in mind when I first
implemented the TTF code. This assumes that the glyphs associated with
the PS names adhere to the Adobe PS naming definition. In this case,
the PS name could be used to create, on the fly, a lookup dictionary of
the fontname/index.

My memory is a bit hazy on this issue, but I seem to recall that the
TeX font names are not completely consistent with the Adobe PS names.
In addition, there needed to be a mechanism to distinguish between the
same glyph in different Bakoma fonts. I'm guessing that the more recent
fonts probably adhere to the PS font naming convention, and therefore it
might be worthwhile pursuing this approach again. It sure would make it
easier to create the math font tables and to use other fonts that
contain such mathematical glyphs.

> Now, on to the "mostly working" part of the font_table script, which
> is why I CCd Paul on this email. The font_table script is working on
> the um*.ttf fonts but failing on the bl*.ttf fonts. The reason it is
> failing is that FT2Font::get_charmap is returning an empty dict.
> These fonts are not empty, eg ft2font reports 1 face, 2 charmaps, and
> 124 glyphs for blsy.ttf, but get_charmap is returning empty, because
> the call to
>
>     FT_ULong code = FT_Get_First_Char(face, &index);
>
> is returning 0 for code and index.
>
> Any ideas?

John, you appear to have solved this one yourself.

 -- Paul

--
Paul Barrett, PhD      Space Telescope Science Institute
Phone: 410-338-4475    ESS/Science Software Branch
FAX:   410-338-4767    Baltimore, MD 21218
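The name-based lookup described in this message could be sketched as follows, assuming a font object with the get_name_index method mentioned above and assuming that an index of 0 means the name was not found. build_table_from_ps_names, TEX_TO_PS and StubFont are hypothetical; the stub replaces a real font so the example runs without a .ttf file, and the single entry reuses the \pm / plusminus / glyph index 8 values quoted for cmsy10.

```python
def build_table_from_ps_names(fontname, font, tex_to_ps):
    """Build {tex name: (fontname, glyph index)} on the fly from PS names."""
    table = {}
    for tex, psname in tex_to_ps.items():
        index = font.get_name_index(psname)
        if index:  # assumption: 0 means "no glyph with that name"
            table[tex] = (fontname, index)
    return table


class StubFont:
    """Hypothetical stand-in for a font object, backed by a plain dict."""

    def __init__(self, name_to_index):
        self._name_to_index = name_to_index

    def get_name_index(self, name):
        return self._name_to_index.get(name, 0)


# LaTeX name -> PostScript glyph name; this is the only hand-written table.
TEX_TO_PS = {r"\pm": "plusminus"}

cmsy10 = StubFont({"plusminus": 8})
latex_table = build_table_from_ps_names("cmsy10", cmsy10, TEX_TO_PS)
print(latex_table)
```

Passing the font name alongside the font object is one way to keep the same glyph in different fonts apart; whether the glyph names in a given font follow the Adobe convention still has to be checked font by font.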