The commits I just completed finish most of the core infrastructure
changes that are required for the new font characterization scheme. That
includes parsing the new "#<" escapes, which stand for various mid-string
changes to the FCI (font characterization integer), and wrapping the
single-character unicode results of the three escape sequences
#(nnn), #[nnn], and #g so that each result is an FCI which changes to the
symbol font, then the unicode character, then an FCI which restores the
prior font. Also, the old Hershey #f[nris] escapes are now made
equivalent to the appropriate FCI inserted into the unicode stream.
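
To make the wrapping of a symbol character concrete, here is a minimal sketch
in C. It is not the code in plcore.c; the function name, the buffer layout,
and all parameter names are hypothetical, and the symbol-font FCI is passed in
by the caller rather than assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: append "switch to symbol font", the character, and
 * "restore prior font" to a stream of 32-bit integers.
 * out/cap describe the buffer, n is the number of entries already used.
 * Returns the new entry count, or n unchanged if there is no room. */
size_t wrap_symbol_char(uint32_t *out, size_t n, size_t cap,
                        uint32_t symbol_fci, uint32_t prior_fci,
                        uint32_t ucs4_char)
{
    if (n > cap || cap - n < 3)
        return n;              /* not enough space for all three entries */

    out[n++] = symbol_fci;     /* FCI that selects the symbol font */
    out[n++] = ucs4_char;      /* the single unicode character */
    out[n++] = prior_fci;      /* FCI that restores the previous font */
    return n;
}
```

The point of writing all three entries together is that the text following
the escape is rendered in whatever font was active before it.
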
The meanings of the various hex digits of the FCI are documented in
plstrm.h (look for fci). In particular, the leftmost hex digit of 1 makes
these integers easy to distinguish from unicode (UCS-4) integers (which have
a maximum value of 0x10FFFF, a little over a million).
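
The test on the leftmost hex digit can be sketched as below. This is an
illustration only: is_fci and count_characters are hypothetical names, and the
sketch assumes that FCIs and UCS-4 values share one stream of 32-bit integers
with eight hex digits each.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: an FCI is a 32-bit value whose leftmost (most significant)
 * hex digit is 1; anything else in the stream is a UCS-4 character. */
int is_fci(uint32_t value)
{
    return ((value >> 28) & 0xFu) == 0x1u;
}

/* Count the stream entries that are characters rather than font commands. */
size_t count_characters(const uint32_t *stream, size_t n)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++) {
        if (!is_fci(stream[i]))
            count++;
    }
    return count;
}
```

Since the largest unicode value is 0x10FFFF, its leftmost hex digit (out of
eight) is always 0, so a character can never be mistaken for an FCI.
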
There are three variations of the new '#<' form of escape sequence.
(1 and 2) If the character that comes after the < is in the range [0-9], then the
escape sequence is interpreted numerically (either hex or decimal) as in
(1) If that converted numerical result has a highest-order hex digit
of one (e.g., 0x10003000), it is assumed to be an FCI and is simply inserted
into the sequence of unicode integers as a direct command to change the fonts
(for the example given, change the fonts to (script, upright, normal,
(2) If that converted numerical result has a highest-order hex digit that is
not 1 (e.g., 0x21), then the numerical result is interpreted as a command to
change one of the hex digits of the FCI, with the hex shift in the second
least significant hex digit and the hex value in the least significant hex
digit. So the 0x21 example is an instruction to change the font style to italic
while leaving font-family, font-variant, and font-weight strictly alone.
(3) #<command-string/>. This variant is parsed in text2fci (in plcore.c).
See that code for the 11 command-string possibilities. Currently, I just
scan through the command-string possibilities until I get a match. With
such a small number of possibilities this is probably the most efficient way
to do the search. However, if the list of font-changing commands starts
getting a lot bigger, we may want to change this to a binary search
algorithm. Also, we may want to transform the input string to lower-case to
make the match case-insensitive.
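
The three variants can be sketched roughly as follows. This is an
illustration under stated assumptions, not the text2fci code: every function
and type name is hypothetical, the guard that protects the leftmost hex digit
is an assumption, and the two table entries are placeholders rather than the
actual 11 command strings (their codes follow the 0x21 and 0x10003000 examples
above).

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Variants (1) and (2): compute the FCI that results from a numerical
 * escape, given the FCI currently in effect. */
uint32_t apply_numeric_escape(uint32_t fci, uint32_t code)
{
    uint32_t shift, value;

    if (((code >> 28) & 0xFu) == 0x1u)
        return code;                   /* (1) a complete FCI, used as is */

    shift = (code >> 4) & 0xFu;        /* (2) which hex digit to change */
    value = code & 0xFu;               /*     new value for that digit */
    if (shift > 6)
        return fci;                    /* assumed guard: keep the leading 1 */

    fci &= ~((uint32_t)0xFu << (4 * shift));   /* clear the old digit */
    fci |= value << (4 * shift);               /* store the new digit */
    return fci;
}

/* Variant (3): a small table scanned linearly. Placeholder entries only. */
struct fci_command {
    const char *name;
    uint32_t code;
};

static const struct fci_command commands[] = {
    { "<italic/>",  0x21u },   /* matches the 0x21 example in the text */
    { "<upright/>", 0x20u },   /* assumed from the 0x10003000 example */
};

/* Compare two strings, ignoring the case of ASCII letters. */
static int equal_ignore_case(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;           /* equal only if both strings ended */
}

/* Returns 1 and stores the command code on a match, 0 otherwise. */
int lookup_command(const char *text, uint32_t *code)
{
    size_t i;

    for (i = 0; i < sizeof commands / sizeof commands[0]; i++) {
        if (equal_ignore_case(text, commands[i].name)) {
            *code = commands[i].code;
            return 1;
        }
    }
    return 0;
}
```

With these definitions, applying 0x21 to the FCI 0x10003000 gives 0x10003100:
only the third hex digit from the right changes. Comparing without regard to
case at match time is one alternative to transforming the input string to
lower-case first.
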
The current code compiles fine, and also seems to execute without any major
problems for some simple examples. For example, valgrind shows no problems.
However, the ps.c and plfreetype.c back ends do not have any interpretation
in place for the FCIs that are in the middle of the unicode sequences, so
they are currently just interpreted as unicode glyph indices which the font
cannot render. So if you use this cutting-edge version tonight for unicode,
you will see some extra boxes or spaces (undefined glyphs) in your captions,
and the old #f[nris] font-changing sequence will also just produce an FCI
and therefore a box in your captions without actually changing the font.
I plan to address all these back-end FCI interpretation issues for unicode
fonts shortly, but meanwhile don't use CVS HEAD and unicode fonts except to
experiment with what I am doing. CVS HEAD and Hershey fonts should still be
Alan W. Irwin
Astronomical research affiliation with Department of Physics and Astronomy,
University of Victoria (astrowww.phys.uvic.ca).
Programming affiliations with the FreeEOS equation-of-state implementation
for stellar interiors (freeeos.sf.net); PLplot scientific plotting software
package (plplot.org); the Yorick front-end to PLplot (yplot.sf.net); the
Loads of Linux Links project (loll.sf.net); and the Linux Brochure Project