Re: [Db2latex-devel] normalize-scape.mod.xsl

In message <m3a...@wi...>
on Tue, Jul 01, 2003 at 03:57:53PM +0200, Torsten Bronger wrote:
> Do you mean something like
> <http://xml.coverpages.org/unicodeRahtz19981008.xml>?

If anyone knows how to use this and would like to write notes
about how it can be used with DB2LaTeX, feel free ;-)

In message <200...@vz...>
on Tue, Jul 01, 2003 at 04:45:13PM +0400, Vitaly Ostanin wrote:
> What you think about creating xml file for symbols and
> replacements? Such xml will easy contributed and maintained for
> generating xslt from it.
[...]
> I already fix normalize-scape.mod.xml for using latex.mapping.xml
> and it worked.

I, too, would like to have had this. But DocBook XSL stylesheets, in
general, are slow enough already. The problem with using a recursive
template is that it can easily increase processing time by a factor of
five. Yet it only benefits developers. So I dropped the idea.

However, thanks to your prompting, perhaps we can come to a compromise:
we will still use the long, monolithic "scape" template but it will be
generated from a mapping file (not hand-coded).
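
As a rough illustration of that compromise, here is a minimal sketch in
Python (not part of DB2LaTeX) that turns a small mapping file into one
long "scape" template. The mapping format (<map from="..." to="..."/>),
the helper template name "string-replace" and its parameter names are
all assumptions made for the example; the real latex.mapping.xml may
look quite different.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Hypothetical mapping file; the real latex.mapping.xml may differ.
MAPPING_XML = """<mapping>
  <map from="&amp;" to="\\&amp;"/>
  <map from="%" to="\\%"/>
  <map from="#" to="\\#"/>
</mapping>"""


def load_mapping(xml_text):
    """Return (character, replacement) pairs in mapping-file order."""
    root = ET.fromstring(xml_text)
    return [(m.get("from"), m.get("to")) for m in root.findall("map")]


def build_scape_template(pairs):
    """Generate a stylesheet holding one monolithic 'scape' template.

    Each pair becomes a call to an assumed helper template named
    'string-replace'. The calls are nested so that the first pair in
    the mapping is applied first and the last pair is applied last.
    """
    ET.register_namespace("xsl", XSL_NS)

    def q(tag):
        return "{%s}%s" % (XSL_NS, tag)

    stylesheet = ET.Element(q("stylesheet"), {"version": "1.0"})
    template = ET.SubElement(stylesheet, q("template"), {"name": "scape"})
    ET.SubElement(template, q("param"), {"name": "string"})

    parent = template
    for source, target in reversed(pairs):
        call = ET.SubElement(parent, q("call-template"),
                             {"name": "string-replace"})
        ET.SubElement(call, q("with-param"), {"name": "from"}).text = source
        ET.SubElement(call, q("with-param"), {"name": "to"}).text = target
        # The next (inner) call supplies the string for this one.
        parent = ET.SubElement(call, q("with-param"), {"name": "string"})
    ET.SubElement(parent, q("value-of"), {"select": "$string"})
    return ET.tostring(stylesheet, encoding="unicode")


if __name__ == "__main__":
    print(build_scape_template(load_mapping(MAPPING_XML)))
```

The generator runs once at build time, so the stylesheet that ships is
still a single hand-free but monolithic template, and no mapping lookup
happens while a document is being processed.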

> LaTeX doesn't support unicode characters by their numbers,
> so each character need to be translated into valid latex.

I haven't found that to be possible (but I'm not an XSLT expert). If you
have any idea how to do this portably in XSLT without using extensions,
I would really love to know. If you have a method that relies on
commonly-available extensions, we could include that as an option. Our
current approach is to say "we can't do this with XSLT, so we'll do it
with LaTeX".

For DB2LaTeX, there are three graceful options built in (though none
is enabled by default). The test_entities folder (which should probably
have been named test_characters) demonstrates this. Including the
default of doing nothing, the current options are:

 - Do nothing to handle Unicode characters. This is the default. You
   will get LaTeX error messages and the output won't be correct.
 - Enable output escaping and handle some 'essential' English-language
   characters. For unrecognised characters, spell out the character
   codes in the text (to alert the reader). This is the best way of
   providing support for the bulk of English-language documents. "Odd"
   characters will appear in a way that proof-readers can recognise. The
   example files for this are test_entities/catcode.*
 - Enable output escaping, use the LaTeX 'unicode' package, but keep the
   output encoding in a Latin-alphabet character set.  This is for
   Latin-alphabet users. For them, it may be preferable to use an ISO
   Latin output encoding and have the 'babel' package handle Latin
   characters. Other characters, if present, will be intercepted and
   passed to the 'unicode' package. The example files for this are
   test_entities/ucs.*
 - Use Unicode characters directly. E.g. <xsl:output encoding="utf-8"/>.
   This allows fullest use of the DocBook localisations as-is (though
   you will need to install the 'unicode' LaTeX package). This option is
   intended for documents where the incidence of non-Latin characters is
   high. The example files for this are test_entities/utf-8.*
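
To make the difference between the last two options more concrete, the
following Python sketch (an illustration only, it does not use or
reproduce DB2LaTeX) separates the characters that an ISO Latin output
encoding can carry directly from those that would have to be
intercepted and handed to the 'unicode' package. The function name and
the choice of iso-8859-1 are assumptions made for the example.

```python
def split_by_encoding(text, encoding="iso-8859-1"):
    """Partition characters by whether `encoding` can represent them.

    Returns (direct, intercepted): `direct` holds characters that fit
    the output encoding, `intercepted` holds the rest.
    """
    direct, intercepted = [], []
    for ch in text:
        try:
            ch.encode(encoding)
            direct.append(ch)
        except UnicodeEncodeError:
            intercepted.append(ch)
    return direct, intercepted


if __name__ == "__main__":
    # "\u00e9" is e-acute (in Latin-1); "\u0416" is Cyrillic Zhe (not).
    direct, intercepted = split_by_encoding("caf\u00e9 \u0416")
    print("direct:", direct)
    print("intercepted:", intercepted)
```

A document where almost everything lands in the first list is a
candidate for the Latin-alphabet option; one where the second list is
large is probably better served by UTF-8 output.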

See also (incomplete documentation):
$latex.entities <http://db2latex.sourceforge.net/reference/rn45re81.html>
$latex.inputenc <http://db2latex.sourceforge.net/reference/rn45re81.html>
$latex.use.ucs <http://db2latex.sourceforge.net/reference/rn45re81.html>
$latex.ucs.options <http://db2latex.sourceforge.net/reference/rn45re101.html>
$latex.babel.language <http://db2latex.sourceforge.net/reference/rn45re102.html>

James.