From: nico <nic...@li...> - 2005-04-28 20:57:12
On Thu, 28 Apr 2005 07:37:53 +0200, Torsten Bronger <br...@ph...> wrote:

> Hallöchen!

Salut ! ;-)

>
> nico <nic...@li...> writes:
>
>> [...]
>>
>> Well, the perl processing is used for three purposes:
>>
>> - for fast systematic string replacement, e.g. a "\" is changed to
>>   a "\\" to make latex happy. Doing this task with XSL is quite
>>   possible but really slow. I wanted something much faster because
>>   I need to handle books containing more than 200 pages. That said,
>>   maybe EXSLT functions now exist that can do this far
>>   better. Is that the case?
>
> I don't know EXSLT well enough, but while I too had once thought that
> it's good to have everything in XSLT, I quickly realised that
> there's always the right tool for the right purpose, and doing
> everything in XSLT means that you do it sub-optimally in one way or
> the other.

That's why I use the perl processing, indeed.

>
> So I wrote tbrplent (in C++) which replaces UTF-8 sequences with
> LaTeX commands. It works well. The idea is that the XSLT
> stylesheet deploys delimiters. Every text node is enclosed by them,
> every formula, and every text-within-formula. So the replacements
> can fit in their context. My XSLT typically contains:

That's exactly the way I do it.

>
> <xsl:variable name="start-delimiter" select="'˜'"/>
> <xsl:variable name="end-delimiter" select="'œ'"/>
>
> <xsl:template match="text()" priority="0">
>   <xsl:value-of select="$start-delimiter"/>
>   <xsl:value-of select="."/>
>   <xsl:value-of select="$end-delimiter"/>
> </xsl:template>
>
> Unfortunately, my ambitions have grown since then, and I plan a
> re-write. I needed three pairs of delimiters, but I want to replace
> them by one pair plus numerical parameter, in order to have as many
> "modes" as I need. For example, there must be a special mode for
> headings, because in PDF bookmarks some characters must be written
> differently.

I understand.
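To make the "one pair of delimiters plus numerical parameter" idea concrete, here is a rough Python sketch. It is not tbrplent's actual behaviour: the delimiter code points, the single mode digit placed right after the start delimiter, the tiny tables, and the name `replace_entities` are all assumptions made for illustration. Like the approach described above, it scans character by character and uses no regexes.

```python
# Assumed delimiters (private-use code points), not the ones the real tools use.
START = "\uE000"
END = "\uE001"

# Hypothetical per-mode replacement tables; real tables would be much larger.
TABLES = {
    0: {"\\": r"\textbackslash{}", "%": r"\%", "&": r"\&", "α": r"$\alpha$"},  # text
    1: {"α": r"\alpha ", "β": r"\beta "},                                      # formula
}


def replace_entities(source: str) -> str:
    """Replace characters inside delimited spans, using the table for the span's mode."""
    out = []
    table = None  # None means we are outside any delimited span
    chars = iter(source)
    for ch in chars:
        if ch == START:
            # Assumption: exactly one digit after the start delimiter selects the mode.
            table = TABLES[int(next(chars))]
        elif ch == END:
            table = None
        elif table is not None:
            out.append(table.get(ch, ch))  # replace if listed, else copy unchanged
        else:
            out.append(ch)  # stylesheet-generated LaTeX passes through untouched
    return "".join(out)


heading = r"\section{" + START + "0" + "50% & more" + END + "}"
print(replace_entities(heading))
```

The point of the delimiters shows up in the example: the backslash of `\section`, written by the stylesheet, is left alone, while the `%` and `&` coming from the document text are escaped. Adding a headings mode would only mean adding another table under a new number.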
>
> tbrplent was written for my tbook project ("TBook RePLace ENTities"),
> but I now use it for texi2latex, too.
>
> If you are interested, we can try a re-implementation (under a new
> name) together. It's not a big thing after all. The replacement
> tables exist already, they just need to be expanded to the new
> modes. And we have to decide on a language. I planned to use
> Python, but probably most of you are Perl hackers.

Yes, it's interesting. Personally I would vote for Python: powerful,
clean and maintainable code, well suited for text processing, compilable
if raw performance is needed, available on almost every platform.

>
> C++ would produce a small executable program, which is especially
> pleasing for a Windows distribution. Since no regexes are involved,
> it should be as easy to program and as fast as in Perl. So I vote
> for C++.
>
> Tschö,
> Torsten.
>

Regards,
BG