From: Torsten B. <br...@ph...> - 2005-04-28 05:38:03
Hallöchen!
nico <nic...@li...> writes:
> [...]
>
> Well, the perl processing is used for three purposes:
>
> - for fast systematic string replacement, e.g. a "\" is changed to
> a "\\" to make latex happy. Doing this task with XSL is quite
> possible but really slow. I wanted something much faster because
> I need to process books containing more than 200 pages. That said,
> maybe EXSLT functions now exist that can do this far
> better. Is that the case?
I don't know EXSLT well enough, but while I too once thought that
it's good to have everything in XSLT, I quickly realised that
there's a right tool for each purpose, and doing everything in
XSLT means that you do it sub-optimally in one way or another.
So I wrote tbrplent (in C++) which replaces UTF-8 sequences with
LaTeX commands. It works well. The idea is that the XSLT
stylesheet deploys delimiters. Every text node is enclosed by them,
every formula, and every text-within-formula. So the replacements
can fit in their context. My XSLT typically contains:
<xsl:variable name="start-delimiter" select="'˜'"/>
<xsl:variable name="end-delimiter" select="'œ'"/>
<xsl:template match="text()" priority="0">
  <xsl:value-of select="$start-delimiter"/>
  <xsl:value-of select="."/>
  <xsl:value-of select="$end-delimiter"/>
</xsl:template>
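To illustrate the idea (this is not the tbrplent source), here is a
minimal Python sketch of the post-processing step: characters are
replaced only between the two delimiters, and the delimiters
themselves are dropped. The delimiter code points, the tiny table and
the name `replace_delimited` are assumptions made up for this example.

```python
# Placeholder delimiters (assumption): two private-use code points that
# cannot occur in ordinary text. The real stylesheet defines its own pair.
START = "\ue000"
END = "\ue001"

# Tiny illustrative table; a real replacement table would be far larger.
TEXT_TABLE = {
    "\\": r"\textbackslash{}",
    "%": r"\%",
    "&": r"\&",
    "#": r"\#",
    "_": r"\_",
    "\u00e9": r"\'e",  # e with acute accent
}


def replace_delimited(source, table):
    """Replace characters found between START and END; drop the delimiters.

    Everything outside a delimited region (i.e. LaTeX markup written by
    the stylesheet itself) is copied through unchanged.
    """
    out = []
    inside = False
    for ch in source:
        if ch == START:
            inside = True
        elif ch == END:
            inside = False
        elif inside:
            out.append(table.get(ch, ch))
        else:
            out.append(ch)
    return "".join(out)


# The stylesheet's own "\section{...}" stays intact; only the text node
# between the delimiters is escaped.
sample = r"\section{" + START + "50% caf\u00e9" + END + "}"
print(replace_delimited(sample, TEXT_TABLE))
```

A single pass with a dictionary lookup per character is enough here,
so no regular expressions are needed.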
Unfortunately, my ambitions have grown since then, and I plan a
re-write. I needed three pairs of delimiters, but I want to replace
them with one pair plus a numerical parameter, in order to have as
many "modes" as I need. For example, there must be a special mode
for headings, because in PDF bookmarks some characters must be
written differently.
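One possible shape for the "one pair plus numerical parameter" scheme,
as a rough Python sketch. Everything concrete in it is an assumption
for illustration: the placeholder delimiters, the `;` that ends the
mode number, the mode numbering (0 = text, 1 = formula, 2 = heading),
and the three toy tables. The heading entry uses hyperref's
`\texorpdfstring` to give the bookmark a plain-text alternative.

```python
START = "\ue000"  # placeholder start delimiter (assumption)
END = "\ue001"    # placeholder end delimiter (assumption)
MODE_SEP = ";"    # hypothetical terminator of the numeric mode parameter

# Toy tables, one per mode; the same character maps differently by context.
TABLES = {
    0: {"\u03b1": r"$\alpha$", "_": r"\_", "%": r"\%"},   # running text
    1: {"\u03b1": r"\alpha ", "%": r"\%"},                # inside a formula
    2: {"\u03b1": r"\texorpdfstring{$\alpha$}{alpha}"},   # heading/bookmark
}


def wrap(mode, text):
    """Build a delimited region the way the stylesheet would emit it."""
    return "{}{}{}{}{}".format(START, mode, MODE_SEP, text, END)


def replace_with_modes(source, tables):
    """Replace characters in each delimited region using that region's table.

    Raises ValueError on a malformed region and KeyError on an unknown mode.
    """
    out = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch != START:
            out.append(ch)  # markup outside regions passes through
            i += 1
            continue
        sep = source.index(MODE_SEP, i + 1)   # end of the mode number
        mode = int(source[i + 1:sep])
        end = source.index(END, sep + 1)      # end of the region
        table = tables[mode]
        out.extend(table.get(c, c) for c in source[sep + 1:end])
        i = end + 1
    return "".join(out)


print(replace_with_modes(r"\section{" + wrap(2, "\u03b1 decay") + "}", TABLES))
```

Adding a further mode then only means adding another table; the
delimiters and the scanning loop stay the same.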
tbrplent was written for my tbook project ("TBook RePLace ENTities"),
but I now use it for texi2latex, too.
If you are interested, we can try a re-implementation (under a new
name) together. It's not a big thing after all. The replacement
tables exist already, they just need to be expanded to the new
modes. And we have to decide on a language. I planned to use
Python, but probably most of you are Perl hackers.
C++ would produce a small executable program, which is especially
pleasing for a Windows distribution. Since no regexes are involved,
it should be as easy to program and as fast as in Perl. So I vote
for C++.
Tschö,
Torsten.
-- 
Torsten Bronger, aquisgrana, europa vetus