From: <ben...@id...> - 2004-05-21 10:14:05
Dear Open Source developer,

I am doing a research project on "Fun and Software Development", in which I kindly invite you to participate. You will find the online survey at http://fasd.ethz.ch/qsf/. The questionnaire consists of 53 questions and takes about 15 minutes to complete.

With the FASD project (Fun and Software Development) we want to determine the motivational significance of fun when software developers decide to engage in Open Source projects. What is special about our research project is that a similar survey is planned with software developers in commercial firms. This allows a direct comparison between the individuals involved and the conditions of production in these two development models. We therefore hope to obtain substantial new insights into the phenomenon of Open Source development.

With many thanks for your participation,
Benno Luthiger

PS: The results of the survey will be published at http://www.isu.unizh.ch/fuehrung/blprojects/FASD/. We have set up the mailing list fa...@we... for this study. Please see http://fasd.ethz.ch/qsf/mailinglist_en.html for registration to this mailing list.

_______________________________________________________________________
Benno Luthiger
Swiss Federal Institute of Technology Zurich
8092 Zurich
Mail: benno.luthiger(at)id.ethz.ch
_______________________________________________________________________
From: Oleg A. P. <ol...@da...> - 2004-03-25 15:15:53
Hi!

On Thu, 25 Mar 2004 13:43:51 +0100 Torsten Bronger <br...@ph...> wrote:

> Halloechen!
>
> Oleg Paraschenko <ol...@da...> writes:
>
> > On Thu, 25 Mar 2004 12:42:49 +0100
> > Torsten Bronger <br...@ph...> wrote:
> >
> >> [...] Interesting, but how is it implemented? In XSLT, or a
> >> scripting language, or what?
> >
> > It is implemented in the Python scripting language.
>
> I don't know Python. How easy can this be installed on a Windows
> system?

It should not be a problem. You can download Python from http://www.python.org/download/, install it and run scripts from a command line. For example:

d:\python23\python.exe texml.py -e ascii test.xml test.tex

> > It uses only core Python modules (expat XML parser, unicode
> > database, something else), so it should work on any recent
> > system. Mapping from Unicode characters to LaTeX commands is taken
> > from an attachment to the MathML specification
> > (http://www.w3.org/Math/characters/unicode.xml (note: 1.5 Mb)).
>
> And is it mode-aware?

Yes, it is mode-aware. It knows text and math.

> Does an alpha become \alpha in formulae and a
> Greek letter elsewhere?

I tested and found that in both modes the result is "\alpha ". (Or the letter alpha itself if the output is in a Greek encoding. I consider this ok because I see very little difference between "$\alpha $" and "$a$".)

> What about ligatures like "--"? Is this an
> en-dash or two hyphens?

As ligatures in TeX are a property of fonts, not of a document, and as the TeXML processor can't guess which font will be used, the processor ignores ligatures entirely. As a result, "--" in TeXML is translated into "--" in TeX, which is interpreted as an en-dash. At the time of development I considered this correct behaviour. Now I'm changing my mind and adding handling of "--" and "---" to the list of bugs. In any case, I don't plan to break ligatures like "fi", "fl" etc.

> What about typographic things like thin
> spaces, soft hyphens, zero-width non-joiner and "break permitted
> here"? How much of Unicode is covered yet?

There are two translation tables, one for text mode, another for math mode. There are 2361 symbols for text mode and 195 symbols for math mode (math mode reuses text mode if a symbol is not found). For the mentioned typographic things, here is a test:

| TeXML:
|
| <TeXML>α<math>α</math>
| thin space: [ ]
| soft hyphens: [­]
| zero-width non-joiner: [‌] oops here ...
| break permitted here: [‚] ... and here
| </TeXML>
|
| TeX:
| \alpha $\alpha $
| thin space: [\hspace{0.167em}]
| soft hyphens: [\-]
| zero-width non-joiner: [‌] oops here ...
| break permitted here: [‚] ... and here

As we see, not all characters are mapped. If this is an issue, then it is an issue for the maintainers of the Unicode map of the MathML specification. After they approve and fix a problem, the TeXML processor will also be updated.

> >> How fast is it (I'm not prepared to accept a further significant
> >> drop down in speed)?
> >
> > It is hard to say exactly, but I think it is fast. In any case,
> > it should be faster than processing of specials by XSLT.
>
> Okay; I asked because using it would mean to translate
> XML--XML-->text instead of XML-->text-->filter-->text, where
> "filter" is *very* fast. But faster than XSLT may be enough.
>
> >> How are different \usepackage[???]{inputenc}'s dealt with?
> >
> > The processor does not know about \usepackage, it only translates
> > characters. It is the task of the XSLT to insert a \usepackage command
> > into the output, if required.
>
> So I always have to include things like wasy, pifont, textcomp etc?
> Wouldn't be a problem, I just need a complete list.

Maybe I don't understand the question well, so repeat the question if I give no answer. The TeXML processor does not add anything. So (for example), if the processor generates "\alpha", and usage of "\alpha" in a TeX document requires the package "greekfont", you will probably get an error from LaTeX. I have no good solution yet.

> > The user can specify an output encoding. The processor attempts to make
> > as good a translation as possible for it.
>
> Sounds nice. Are you aware of the very new utf-8 that was added to
> the LaTeX core two months ago? How good does it work?

I don't know yet how well it works. One of the problems is that Unicode itself is not enough. There are right-to-left languages, dynamic ligatures and other issues, so I'm investigating omega/lambda, not the LaTeX core.

> Tschoe,
> Torsten.
>
> --
> Torsten Bronger, aquisgrana, europa vetus

Bye!

-- Oleg
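The mode-aware lookup Oleg describes (a text-mode table and a math-mode table, with math mode falling back to the text table) can be sketched in Python. The tables below are tiny invented samples, not the actual TeXML data derived from the MathML unicode.xml mapping:

```python
# Sketch of a mode-aware Unicode -> LaTeX lookup, as described above.
# The tables are placeholder samples; the real TeXML processor loads
# roughly 2361 text-mode and 195 math-mode entries from unicode.xml.

TEXT_MAP = {
    "\u03b1": "\\alpha ",  # Greek small alpha (same result in both modes)
    "\u2013": "--",        # en-dash
}
MATH_MAP = {
    "\u03b1": "\\alpha ",
}

def to_latex(ch, mode="text"):
    """Translate one character; math mode reuses the text table on a miss."""
    if mode == "math":
        result = MATH_MAP.get(ch) or TEXT_MAP.get(ch)
    else:
        result = TEXT_MAP.get(ch)
    return result if result is not None else ch  # pass unmapped chars through
```

The fallback order matches the behaviour stated in the mail: math mode consults its own table first and reuses the text table only when a symbol is not found.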
From: Torsten B. <br...@ph...> - 2004-03-25 12:45:50
Halloechen! Oleg Paraschenko <ol...@da...> writes: > On Thu, 25 Mar 2004 12:42:49 +0100 > Torsten Bronger <br...@ph...> wrote: > >> [...] Interesting, but how is it implemented? In XSLT, or a >> scripting language, or what? > > It is implemented in the Python scripting language. I don't know Python. How easy can this be installed on a Windows system? > It uses only core Python modules (expat XML parser, unicode > database, something other), so it should work on any recent > system. Mapping from Unicode characters to LaTeX commands is taken > from attachment for the MathML specification > (http://www.w3.org/Math/characters/unicode.xml (note: 1,5 Mb)). And is it mode-aware? Does an alpha become \alpha in formulae and a Greek letter elsewhere? What about ligatures like "--"? Is this an en-dash or two hyphens? What about typographic things like thin spaces, soft hyphens, zero-width non-joiner and "break permitted here"? How much of Unicode is covered yet? >> How fast is it (I'm not prepared to accept a further significant >> drop down in speed)? > > It is hard to said exactly, but I think it is fast. In any case, > it should be faster then processing of specials by xslt. Okay; I asked because using it would mean to translate XML--XML-->text instead of XML-->text-->filter-->text, where "filter" is *very* fast. But faster than XSLT may be enough. >> How are different \usepackage[???]{inputenc}'s dealt with? > > The processor does not know about \usepackage, it only translates > characters. It is a task of an xslt to insert \usepackage command into > the output, if required. So I always have to include things like wasy, pifont, textcomp etc? Wouldn't be a problem, I just need a complete list. > User can specify an output encoding. The processor attempts to make as > good translation as possible for it. Sounds nice. Are you aware of the very new utf-8 that was added to the LaTeX core two months ago? How good does it work? Tschoe, Torsten. 
-- Torsten Bronger, aquisgrana, europa vetus |
From: Oleg P. <ol...@da...> - 2004-03-25 12:26:21
Hi!

On Thu, 25 Mar 2004 12:42:49 +0100 Torsten Bronger <br...@ph...> wrote:

> Halloechen!
> ...
> > One of the main benefits of TeXML usage is an automatic translation
> > of the TeX special symbols.
>
> Interesting, but how is it implemented? In XSLT, or a scripting
> language, or what?

It is implemented in the Python scripting language. It uses only core Python modules (expat XML parser, unicode database, something else), so it should work on any recent system. Mapping from Unicode characters to LaTeX commands is taken from an attachment to the MathML specification (http://www.w3.org/Math/characters/unicode.xml (note: 1.5 Mb)).

> How fast is it (I'm not prepared to accept a
> further significant drop down in speed)?

It is hard to say exactly, but I think it is fast. In any case, it should be faster than processing of specials by XSLT.

> How are different \usepackage[???]{inputenc}'s dealt with?

The processor does not know about \usepackage, it only translates characters. It is the task of the XSLT to insert a \usepackage command into the output, if required.

The user can specify an output encoding. The processor attempts to make as good a translation as possible for it. For example, for the letter ß: if the output encoding is ascii, the processor outputs "\ss "; if the output encoding is latin1, the processor outputs "ß". In the latter case the correct header is \usepackage[latin1]{inputenc}, but it is not the task of the processor to create this header.

> Tschoe,
> Torsten.
>
> --
> Torsten Bronger, aquisgrana, europa vetus

Bye!

-- Oleg
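The encoding-dependent behaviour described here (ß passes through when the output encoding can represent it, and becomes "\ss " otherwise) can be sketched as a small Python function. The escape table and function name are mine, not TeXML's actual internals:

```python
# Sketch of encoding-aware output as described above: a character that
# fits the requested output encoding is emitted as-is; otherwise a LaTeX
# command is substituted from a (here, tiny placeholder) mapping table.

LATEX_ESCAPES = {"\u00df": "\\ss "}  # ß, as in the example in the mail

def emit(ch, encoding="ascii"):
    try:
        ch.encode(encoding)
        return ch                       # representable: pass through
    except UnicodeEncodeError:
        return LATEX_ESCAPES.get(ch, ch)
```

As the mail notes, emitting the raw latin1 byte only works if the LaTeX document also loads the matching inputenc option; this function, like the real processor, does not generate that header.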
From: Torsten B. <br...@ph...> - 2004-03-25 11:43:03
Halloechen! Oleg Paraschenko <ol...@da...> writes: > I think that you can use TeXML to some extent in the your project. > > | Example of TeXML to TeX translation > | > | TeXML: > | > | <cmd name="documentclass"> > | <opt>12pt</opt> > | <parm>letter</parm> > | </cmd> > | > | TeX: > | > | \documentclass[12pt]{letter} > > One of the main benefits of TeXML usage is an automatical translation > of the TeX special symbols. Interesting, but how is it implemented? In XSLT, or a scripting language, or what? How fast is it (I'm not prepared to accept a further significant drop down in speed)? How are different \usepackage[???]{inputenc}'s dealt with? Tschoe, Torsten. -- Torsten Bronger, aquisgrana, europa vetus |
From: Oleg P. <ol...@da...> - 2004-03-25 11:28:48
Hello colleagues,

I'd like to introduce you to TeXML, an XML vocabulary for TeX: http://getfo.sourceforge.net/texml/. I think you can use TeXML to some extent in your project.

| Example of TeXML to TeX translation
|
| TeXML:
|
| <cmd name="documentclass">
| <opt>12pt</opt>
| <parm>letter</parm>
| </cmd>
|
| TeX:
|
| \documentclass[12pt]{letter}

One of the main benefits of TeXML is automatic translation of the TeX special symbols.

| Example of translation of special TeX symbols
|
| TeXML:
|
| <TeXML>\section{No break}</TeXML>
|
| TeX:
|
| $\backslash$section\{No~break\}

The default output encoding is utf8. The TeXML processor escapes out-of-encoding characters automatically.

| Example of translation of non-ASCII characters
|
| TeXML:
|
| <TeXML>ТеХ</TeXML>
|
| TeX in ASCII encoding:
|
| \cyrchar\CYRT \cyrchar\cyre \cyrchar\CYRH
|
| TeX in Russian encoding:
|
| TeX

There are several benefits to generating TeXML instead of TeX:

* you avoid painful handling of TeX special characters,
* you need not bother about encodings,
* there are better chances of writing error-free code.

About the last item. Suppose, for example, you want to generate:

| {\bf bold}

One approach is to generate "{", then "\bf " (with a trailing space) and then "}". It is easy to miss the space, forget a brace or write an incorrect brace. But when you use TeXML, it takes care of this for you:

| <group><cmd name="bf"/>bold</group>

Your comments are welcome.

Regards, Oleg
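The first benefit above, escaping TeX special characters, can be sketched with a simple per-character substitution. The table below is my own partial stand-in for TeXML's real mapping (in particular it does not handle the space-to-"~" translation visible in the "No~break" example):

```python
# Minimal sketch of TeX special-character escaping, the first benefit
# described above. The table is a partial placeholder, not TeXML's own.

TEX_SPECIALS = {
    "\\": "$\\backslash$",
    "{": "\\{",
    "}": "\\}",
    "%": "\\%",
    "&": "\\&",
    "#": "\\#",
    "_": "\\_",
}

def escape_tex(text):
    return "".join(TEX_SPECIALS.get(ch, ch) for ch in text)
```

This is exactly the kind of fiddly, easy-to-get-wrong work (trailing spaces, matched braces) that the mail argues is better delegated to a tool than hand-generated.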
From: James D. <j-d...@us...> - 2004-02-01 07:52:02
Hi everyone, The next DB2LaTeX snapshot will also be known as 0.8pre1. There are many changes since 0.7. We would appreciate any feedback you can give us, either based on the prerelease or based upon snapshots. There are a few things that still need to be done before 0.8, including: - tie up some loose ends with localisation / currency symbols / quotation marks / charsets - tutorials / user-guides / LaTeX notes - regeneration of sample files (test_* directories) For people who have not used DB2LaTeX in a while and wish to try 0.8pre1 or the snapshots, I would recommend that you 'redo' your XSL customisation layer (DB2LaTeX now comes with features that users have been implementing with their own customisations). <http://db2latex.sourceforge.net/> |
From: bvh <bvh...@ir...> - 2003-12-13 13:34:41
On Sat, Dec 13, 2003 at 10:09:02AM +0800, James Devenish wrote: > Ah yes, found it. It's taken care of by the trim-outer template in > normalize-scape.mod.xsl (snapshots / CVS only). This answers my other question in the thread. Didn't realize that XPATH is part of xslt. cu bart -- http://www.irule.be/bvh/ |
From: bvh <bvh...@ir...> - 2003-12-13 13:30:14
On Sat, Dec 13, 2003 at 10:00:13AM +0800, James Devenish wrote: > Some examples are available in the snapshot 'samples' tarball or can be > viewed on the web: > <http://cvs.sourceforge.net/viewcvs.py/db2latex/db2latex/xsl/sample/test_entities/>. OK. I'll try these. > > However when I convert to LaTeX with db2latex the single white line is > > still there and becomes a hard line break in the typeset document. > This doesn't happen to me. Maybe I worked around it in CVS. Could you > try a snapshot? <http://db2latex.sourceforge.net/snapshot/> I tried > to track down where this was fixed, but I couldn't find it. Confusing. Yep. Is fixed in the snapshot. The last released version (0.7?) still contains this problem, but the snapshot from 13-12 does not have this problem anymore. Thanks! Just out of curiosity : how do text transformations like these work in xsl? I've only just started to play with xsl-transformation so I still have a _very_ hard time navigating around the complex xslts like db2latex. cu bart -- http://www.irule.be/bvh/ |
From: James D. <j-d...@us...> - 2003-12-13 02:09:06
In message <200...@ma...> > > 2. Line breaks in paragraphs [...] > > However when I convert to LaTeX with db2latex the single white line is > > still there and becomes a hard line break in the typeset document. [...] > This doesn't happen to me. Maybe I worked around it in CVS. Ah yes, found it. It's taken care of by the trim-outer template in normalize-scape.mod.xsl (snapshots / CVS only). |
From: James D. <j-d...@us...> - 2003-12-13 02:00:20
Hi,

In message <200...@ns...> on Fri, Dec 12, 2003 at 01:13:23PM +0100, bvh wrote:

> 1. Use of character entities [...]
> Entity references in the latin1 set like (à etc) are converted to
> latin1 encoding by db2latex

It's the way XML works :-/

> Character entities like € (I believe this to be the euro symbol) are
> left alone by db2latex. Again the problem is that (my installation of) LaTeX
> doesn't know how to handle these. What should I do to get that working?

There are several solutions provided by DB2LaTeX. You will need to turn these features on, though. I personally go for Unicode handling. This means that I have to install the 'ucs' ('unicode') LaTeX package and then use the following variables in my XSL files:

<xsl:output encoding="UTF-8"/>
<xsl:variable name="latex.inputenc">utf8</xsl:variable>
<xsl:variable name="latex.use.ucs">1</xsl:variable>
<xsl:variable name="latex.ucs.options">postscript</xsl:variable>

Some examples are available in the snapshot 'samples' tarball or can be viewed on the web: <http://cvs.sourceforge.net/viewcvs.py/db2latex/db2latex/xsl/sample/test_entities/>.

> 2. Line breaks in paragraphs [...]
> <para>
> This is
>
> one paragraph
> </para>
[...]
> However when I convert to LaTeX with db2latex the single white line is
> still there and becomes a hard line break in the typeset document.

This doesn't happen to me. Maybe I worked around it in CVS. Could you try a snapshot? <http://db2latex.sourceforge.net/snapshot/> I tried to track down where this was fixed, but I couldn't find it. Confusing.

> 3. We have documents where the scaling factor for the imagedata has a fraction.
> This didn't work with db2latex. However when rereading the docbook specification
> before posting I saw that the scaling factor must be an integer so I'll have
> to change those sources then.

Yep, the @scale attribute is a percentage (0--100). See also <http://db2latex.sourceforge.net/reference/rn30re79.html>.
From: bvh <bvh...@ir...> - 2003-12-12 12:05:49
Hi,

I am interested in docbook->latex conversion. I wrote a simple C++ program for that (naartex). Unfortunately, only after I first published it did someone point me to db2latex. Now I am trying to use db2latex instead, because it is much more mature than my first rudimentary conversion and it would save me a lot of time if I could use something existing instead of brewing my own solution. (I have to use this in a commercial project.) However, I've run into 3 small issues.

1. Use of character entities

The documents I have to translate use character and entity references for special characters like accents, the euro symbol and many more. Entity references in the latin1 set (like à etc.) are converted to latin1 encoding by db2latex (I believe?). I know about the latinenc package to solve this problem, but what about characters outside that set? Are there other LaTeX packages I need to use?

Character entities like € (I believe this to be the euro symbol) are left alone by db2latex. Again the problem is that (my installation of) LaTeX doesn't know how to handle these. What should I do to get that working? The FAQ mentions this, but it is not fully clear to me how to handle the character entity references (in numerical form).

2. Line breaks in paragraphs

My documents sometimes contain things like this:

<para>
This is

one paragraph
</para>

I believe the intent is to render this as one single paragraph. However, when I convert to LaTeX with db2latex, the single blank line is still there and becomes a hard line break in the typeset document.

3. We have documents where the scaling factor for the imagedata has a fraction. This didn't work with db2latex. However, when rereading the docbook specification before posting I saw that the scaling factor must be an integer, so I'll have to change those sources.

Thanks for taking the time to reply and for developing db2latex.

cu bart

-- http://www.irule.be/bvh/
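Issue 2 above (a blank line inside <para> becoming a hard break) is a whitespace-normalization problem. A minimal sketch of the kind of fix involved, trimming outer whitespace and collapsing internal runs, might look like this; it is only an illustration, not DB2LaTeX's actual trim-outer template:

```python
import re

# Sketch of the whitespace normalization that addresses issue 2 above:
# trim leading/trailing whitespace and collapse internal runs (including
# blank lines) to single spaces, so a blank line inside <para> does not
# become a hard line break in the LaTeX output. Illustrative only; the
# real fix lives in an XSLT template, not Python.

def normalize_para(text):
    return re.sub(r"\s+", " ", text).strip()
```

After normalization, LaTeX never sees an empty line inside the paragraph, so it cannot interpret one as a paragraph break.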
From: Wynn W. <wy...@im...> - 2003-10-07 23:49:55
On Tuesday 07 October 2003 04:22 pm, James Devenish wrote:

> Yes, appendixes are mapped to \chapter or \section, depending on where
> <appendix> appears in the DocBook file. For \chapter-style appendixes,
> the LaTeX command \appendix is issued a few lines before the first
> appendix. The \appendix command is provided by LaTeX (for DB2LaTeX
> books, the report.cls file provides it by default). The \appendix
> command redefines the name of chapters. If I recall correctly, this
> should be sufficient. I can't remember how we handle \section-style
> appendices.

Thanks for the hint. I'm using a *.cls file that does a lot of customizations, one of which is to make chapter names "Chapter X", and then that's what appendix names become as well. I fixed that and everything is fine now.

Thanks for the response, and congrats on an excellent project. It's the best docbook publishing tool I've found.

Wynn
From: James D. <j-d...@us...> - 2003-10-07 23:22:31
In message <200...@im...> on Tue, Oct 07, 2003 at 02:19:32PM -0700, Wynn Wilkes wrote:

> When using an appendix, the table of contents has an entry for that appendix
> but the name of the entry is "Chapter A" instead of "Appendix A". From
> looking at the stylesheets, it looks like appendices are just matched to
> chapters- although I'm not a latex expert by any means.

Yes, appendixes are mapped to \chapter or \section, depending on where <appendix> appears in the DocBook file. For \chapter-style appendixes, the LaTeX command \appendix is issued a few lines before the first appendix. The \appendix command is provided by LaTeX (for DB2LaTeX books, the report.cls file provides it by default). The \appendix command redefines the name of chapters. If I recall correctly, this should be sufficient. I can't remember how we handle \section-style appendices.

One subtle point is that the name of appendices is provided by LaTeX's localisation, not XML localisation. This is certainly what I had intended, though that is not to say it is the best decision in all circumstances.

> Does anyone know a quick fix for that or a way to work around it so the TOC
> gets generated correctly?

"Works on my test systems" ;-)
From: Wynn W. <wy...@im...> - 2003-10-07 20:20:34
Hello,

I tried out the latest daily snapshot, and things look really good except for one thing.

When using an appendix, the table of contents has an entry for that appendix, but the name of the entry is "Chapter A" instead of "Appendix A". From looking at the stylesheets, it looks like appendices are just matched to chapters, although I'm not a latex expert by any means.

Does anyone know a quick fix for that, or a way to work around it so the TOC gets generated correctly?

Thanks,
Wynn
From: Vitaly O. <vy...@vz...> - 2003-07-14 14:29:53
On Mon, 14 Jul 2003 18:17:52 +0400 Vitaly Ostanin <vy...@vz...> wrote:

<skipped/>

> I tested this schema with xsltproc:
> original styles (bad for Russian symbols and possibly for other
> non Latin-1 users):
> Applying stylesheet 20 times took 29235 ms
>
> Alternative schema:
> Applying stylesheet 20 times took 41683 ms

Sorry, my mistake. That was a test of the synchronized localization. The real test of the alternative schema is:

Applying stylesheet 20 times took 71162 ms

> Yes, it is slower, but it adds optional replacements for all
> needs.

<skipped/>

-- Regards, Vyt
mailto: vy...@vz...
JID: vy...@vz...
From: Vitaly O. <vy...@vz...> - 2003-07-14 14:19:23
Hello, All!

I made a patch (against db2latex CVS, 02 Jul 2003) from my additions to db2latex-xsl:

1. Added:

xsl/docbook-alt.xsl
xsl/unicode.mapping.dtd
xsl/unicode.mapping.xml
xsl/unicode.mapping.ru.xml

This is another schema for replacing Unicode entities. It imports, but does not change, the original docbook.xsl. Entities are replaced from a default file and from files for specified languages. It is also possible to specify a local file for replacements. See the comments in the style. If this patch is accepted, I can write documentation for it like in the original stylesheets. This style also fixes replacing entities in localization values and has some other fixes.

I tested this schema with xsltproc:

original styles (bad for Russian symbols and possibly for other non Latin-1 users):
Applying stylesheet 20 times took 29235 ms

Alternative schema:
Applying stylesheet 20 times took 41683 ms

Yes, it is slower, but it adds optional replacements for all needs.

2. Added:

xsl/common/locale-leave-uniq-keys.xsl
xsl/common/locale-generate-working-files.xsl
xsl/common/Makefile

Styles and a Makefile for keeping the unique keys from db2latex localization in the source files and generating synchronized localization files. See the comments in the styles. Also added the result of these styles:

$lang.xml
locale-source.$lang.xml

Localization synchronized with the DocBook localization (from xsl-stylesheets 1.61.3).

The patch is available at: http://www.vzljot.ru/vyt/db2latex/db2latex-xsl.vyt.2003-07-14.patch.bz2

A CVS log of the changes is available at: http://www.vzljot.ru/vyt/db2latex/vyt.cvs.db2latex.2003-07-14.log.bz2

PS: I can create read-only access to our CVS for DB2LaTeX developers if needed.

PPS: I can't grab the current db2latex from CVS:

$ cvs -z3 \
-d:pserver:ano...@cv...:/cvsroot\
/db2latex co db2latex
cvs [checkout aborted]: end of file from server (consult above messages if any)

-- Regards, Vyt
mailto: vy...@vz...
JID: vy...@vz...
From: ben <nic...@li...> - 2003-07-02 18:46:48
James Devenish wrote:

> However, thanks to your prompting, perhaps we can come to a compromise:
> we will still use the long, monolithic "scape" template but it will be
> generated from a mapping file (not hand-coded).

Hi,

If it can be of some help: in dblatex I use an engine to build the entity translation template. The attached files (used for MathML translation): map2func.xsl is the template to apply to mapmmlent.xml. The output is a stylesheet containing the templates, each template name corresponding to a <mapgroup> name. Of course the code produced is ugly, but it works fine, and it is fast enough. For instance, applying the scape template on a test file takes 147.1 s; using the produced template takes 96 s.

Bye,
BG
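The generate-the-template-from-a-mapping idea described here can be illustrated in Python: emit the source code of an escape function from a mapping, then compile it. This is only an analogy to what map2func.xsl does (the real engine emits XSLT templates from mapmmlent.xml); the function and variable names are mine:

```python
# Illustration of generating the "scape" code from a mapping file rather
# than hand-coding it, as described above. The real dblatex engine emits
# XSLT templates; here the same idea is transposed to Python source.

def build_escape_source(mapping, name="escape"):
    """Emit the source of a function that applies the given char mapping."""
    lines = [f"def {name}(text):", "    table = {"]
    for char, repl in mapping.items():
        lines.append(f"        {char!r}: {repl!r},")
    lines += ["    }", "    return ''.join(table.get(c, c) for c in text)"]
    return "\n".join(lines)

# Compile the generated source into a usable function.
namespace = {}
exec(build_escape_source({"&": "\\&", "%": "\\%"}), namespace)
escape = namespace["escape"]
```

As in the mail, the generated code is ugly but fast: the cost of interpreting the mapping is paid once, at generation time, instead of on every input character.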
From: Torsten B. <br...@ph...> - 2003-07-02 15:10:42
Halloechen! James Devenish <j-d...@us...> writes: > [...] As far as I can see, you're suggesting that we use > substring(...) to iterate over every character in text() nodes and > then do lookups in a 65000-element mapping document (most > characters will require LaTeX packages to be loaded -- so there is > always a need for a LaTeX-based solution). It simply isn't > practical (time, space, software support) to do that. Do you mean something like this: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/xsltml/xsltml/entities.xsl?rev=1.13&content-type=text/vnd.viewcvs-markup However I fully agree that this doesn't seem to be a very wise thing to do. Tschoe, Torsten. -- Torsten Bronger, aquisgrana, europa vetus |
From: Torsten B. <br...@ph...> - 2003-07-02 15:07:15
Halloechen! James Devenish <j-d...@us...> writes: > [...] > > For DB2LaTeX, there are three graceful options built in (though neither > is enabled by default). The test_entities folder (which should probably > have been named test_characters) demonstrates this. The current options > are: > > [...] > - Use Unicode characters directly. E.g. <xsl:output encoding="utf-8"/>. > This allows fullest use of the DocBook localisations as-is (though > you will need to install the 'unicode' LaTeX package). This option is > intended for documents where the incidence of non-Latin characters is > high. The example files for this are test_entities/utf-8.* This sounds perfect. Then what is the disadvantage of this option? Why isn't it used always? BTW, recently the LaTeX3 project team introduced the new inputenc option utf-8 (or utf8?) for testing. Tschoe, Torsten. -- Torsten Bronger, aquisgrana, europa vetus |
From: Vitaly O. <vy...@vz...> - 2003-07-02 13:59:51
On Wed, 2 Jul 2003 21:10:18 +0800 James Devenish <j-d...@us...> wrote:

> Hi Vitaly,
>
> New ideas are welcome -- and it would be great for us to
> improve language support -- but each idea needs to be assessed
> for its practical value.
>
> In message <200...@vz...>
> on Wed, Jul 02, 2003 at 03:03:49PM +0400, Vitaly Ostanin wrote:
> > > I, too, would like to have had this. But DocBook XSL
> > > stylesheets, in general, are slow enough already. The
> > > problem with using a recursive template is that it can
> > > easily increase processing time by a factor of five.
> >
> > You're right, the modified style is slow, but XSLT is not about speed.
>
> It's not for slowness, either! :)

I optimized my version of the template name="scape" (attached). It now uses the key() functionality from XSLT. The top of my normalize-scape.mod.xsl:

<xsl:key name="entity" match="mapping" use="@key"/>
<xsl:variable name="latex.mapping.vyt" select="document('latex.mapping.xml')"/>

The speed statistics are now (tested with xsltproc --timing --repeat):

original db2latex:
Applying stylesheet 20 times took 15469 ms

vyt first (scape.xsl):
Applying stylesheet 20 times took 88864 ms
Saving result took 1 ms

vyt second (scape2.xsl):
Applying stylesheet 20 times took 34364 ms

> > > we will still use the long, monolithic "scape"
> > > template but it will be generated from a mapping file (not
> > > hand-coded).
> >
> > I'm not sure that is the right way.
>
> From what I have seen, it is the most practical way so far.

Maybe.

> > > > LaTeX doesn't support unicode characters by their
> > > > numbers, so each character needs to be translated into
> > > > valid LaTeX.
> > >
> > > I haven't found that to be possible (but I'm not an XSLT
> > > expert).
> >
> > It's easily done with character mapping, without any
> > extensions.
>
> I don't believe you! If you can find someone who has
> demonstrated that it is practical (or can explain how it could
> be done) that would help us find a new solution for DB2LaTeX.

I'll try :)

> As far as I can see, you're suggesting that we use
> substring(...) to iterate over every character in text() nodes
> and then do lookups in a 65000-element mapping document (most
> characters will require LaTeX packages to be loaded -- so there
> is always a need for a LaTeX-based solution). It simply isn't
> practical (time, space, software support) to do that.

You can split the whole mapping base by languages (number ranges) and include only the specified ones. BTW, you can have 2 alternative variants: one with the monolithic "scape", and one with replacement from a mapping base (separate from latex.mapping.xml).

<skipped/>

-- Regards, Vyt
mailto: vy...@vz...
JID: vy...@vz...
From: James D. <j-d...@us...> - 2003-07-02 13:10:25
Hi Vitaly,

New ideas are welcome -- and it would be great for us to improve language support -- but each idea needs to be assessed for its practical value.

In message <200...@vz...> on Wed, Jul 02, 2003 at 03:03:49PM +0400, Vitaly Ostanin wrote:

> > I, too, would like to have had this. But DocBook XSL
> > stylesheets, in general, are slow enough already. The problem
> > with using a recursive template is that it can easily increase
> > processing time by a factor of five.
>
> You're right, the modified style is slow, but XSLT is not about speed.

It's not for slowness, either! :)

> > we will still use the long, monolithic "scape"
> > template but it will be generated from a mapping file (not
> > hand-coded).
>
> I'm not sure that is the right way.

From what I have seen, it is the most practical way so far.

> > > LaTeX doesn't support unicode characters by their numbers,
> > > so each character needs to be translated into valid LaTeX.
> >
> > I haven't found that to be possible (but I'm not an XSLT
> > expert).
>
> It's easily done with character mapping, without any
> extensions.

I don't believe you! If you can find someone who has demonstrated that it is practical (or can explain how it could be done) that would help us find a new solution for DB2LaTeX. As far as I can see, you're suggesting that we use substring(...) to iterate over every character in text() nodes and then do lookups in a 65000-element mapping document (most characters will require LaTeX packages to be loaded -- so there is always a need for a LaTeX-based solution). It simply isn't practical (time, space, software support) to do that.
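The per-character lookup James objects to is expensive as a recursive XSLT template, but the same idea is cheap in a general-purpose language: Python's built-in str.translate performs one hash lookup per character against a codepoint-keyed dict. The mapping below is a tiny invented sample, not the 65000-entry table under discussion:

```python
# The character-by-character lookup discussed above, done the cheap way:
# str.translate takes a dict keyed by Unicode codepoints and does a single
# lookup per character. The entries here are illustrative placeholders.

UNI2LATEX = {
    ord("\u00df"): "\\ss ",        # ß
    ord("\u20ac"): "\\texteuro ",  # € (needs the textcomp package)
    ord("&"): "\\&",
}

def to_latex(text):
    return text.translate(UNI2LATEX)
```

This does not resolve James's other point, that most mapped characters still require the right LaTeX packages to be loaded, which is why a purely character-level solution is only half of the problem.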
From: Vitaly O. <vy...@vz...> - 2003-07-02 11:04:12
On Wed, 2 Jul 2003 17:51:21 +0800
James Devenish <j-d...@us...> wrote:

> In message <m3a...@wi...>
> on Tue, Jul 01, 2003 at 03:57:53PM +0200, Torsten Bronger wrote:
> > Do you mean something like
> > <http://xml.coverpages.org/unicodeRahtz19981008.xml>?
>
> If anyone knows how to use this and would like to write notes
> about how it can be used with DB2LaTeX, feel free ;-)

1. Check the license of this file and the terms of use.
2. Transform this file into a form like latex.mapping.xml (replace
   the unicode numbers by their entities). Easy.
3. Include the result of the transform in latex.mapping.xml and use
   it with the other replacements of special LaTeX symbols.

> In message <200...@vz...>
> on Tue, Jul 01, 2003 at 04:45:13PM +0400, Vitaly Ostanin wrote:
> > What do you think about creating an XML file for symbols and
> > replacements? Such XML would be easy to contribute to and
> > maintain, for generating XSLT from it.
> [...]
> > I already fixed normalize-scape.mod.xsl to use latex.mapping.xml
> > and it worked.
>
> I, too, would like to have had this. But DocBook XSL
> stylesheets, in general, are slow enough already. The problem
> with using a recursive template is that it can easily increase
> processing time by a factor of five.

You're right, the modified style is slow, but XSLT is not for speed.

> Yet it only benefits developers. So I dropped the idea.
>
> However, thanks to your prompting, perhaps we can come to a
> compromise: we will still use the long, monolithic "scape"
> template but it will be generated from a mapping file (not
> hand-coded).

I'm not sure that is the right way.

> > LaTeX doesn't support unicode characters by their numbers,
> > so each character needs to be translated into valid LaTeX.
>
> I haven't found that to be possible (but I'm not an XSLT
> expert).

It's easily done with character mapping, without any extensions.
And a base for it already exists:
http://xml.coverpages.org/unicodeRahtz19981008.xml

> If you have any idea how to do this portably in XSLT
> without using extensions, I would really love to know. If you
> have a method that relies on commonly-available extensions, we
> could include that as an option. Our current approach is to say
> "we can't do this with XSLT, so we'll do it with LaTeX".

It's not right.

<skipped/>

--
Regards,
Vyt
mailto: vy...@vz...
JID: vy...@vz...
|
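Step 2 of the plan above (replacing numeric code points so the data can be merged into a latex.mapping.xml-style table) could be prototyped in a few lines. The element and attribute names in this sketch (`<character dec="..." latex="..."/>`) are invented for illustration; the real unicodeRahtz19981008.xml format is not reproduced here and may differ.

```python
import xml.etree.ElementTree as ET

# Invented sample standing in for a fragment of a character list;
# the actual source file's schema is an assumption here.
SAMPLE = """<charlist>
  <character dec="945"  latex="\\alpha"/>
  <character dec="8594" latex="\\rightarrow"/>
</charlist>"""

def build_mapping(xml_text):
    """Replace numeric code points by the literal characters,
    yielding a character -> LaTeX-command table."""
    root = ET.fromstring(xml_text)
    return {chr(int(c.get("dec"))): c.get("latex")
            for c in root.iter("character")}
```

This is the "replace unicode numbers by their entities" transform in miniature: once keyed by the character itself, the entries can sit alongside the existing special-symbol replacements.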
From: James D. <j-d...@us...> - 2003-07-02 09:52:10
|
In message <m3a...@wi...>
on Tue, Jul 01, 2003 at 03:57:53PM +0200, Torsten Bronger wrote:
> Do you mean something like
> <http://xml.coverpages.org/unicodeRahtz19981008.xml>?

If anyone knows how to use this and would like to write notes about
how it can be used with DB2LaTeX, feel free ;-)

In message <200...@vz...>
on Tue, Jul 01, 2003 at 04:45:13PM +0400, Vitaly Ostanin wrote:
> What do you think about creating an XML file for symbols and
> replacements? Such XML would be easy to contribute to and maintain,
> for generating XSLT from it.
[...]
> I already fixed normalize-scape.mod.xsl to use latex.mapping.xml
> and it worked.

I, too, would like to have had this. But DocBook XSL stylesheets, in
general, are slow enough already. The problem with using a recursive
template is that it can easily increase processing time by a factor
of five. Yet it only benefits developers. So I dropped the idea.

However, thanks to your prompting, perhaps we can come to a
compromise: we will still use the long, monolithic "scape" template
but it will be generated from a mapping file (not hand-coded).

> LaTeX doesn't support unicode characters by their numbers,
> so each character needs to be translated into valid LaTeX.

I haven't found that to be possible (but I'm not an XSLT expert).
If you have any idea how to do this portably in XSLT without using
extensions, I would really love to know. If you have a method that
relies on commonly-available extensions, we could include that as an
option. Our current approach is to say "we can't do this with XSLT,
so we'll do it with LaTeX".

For DB2LaTeX, there are three graceful options built in (though none
of them is enabled by default). The test_entities folder (which
should probably have been named test_characters) demonstrates this.
The current options are:

- Do nothing to handle Unicode characters. This is the default. You
  will get LaTeX error messages and the output won't be correct.

- Enable output escaping and handle some 'essential'
  English-language characters. For unrecognised characters, spell
  out the character codes in the text (to alert the reader). This is
  the best way of providing support for the bulk of English-language
  documents. "Odd" characters will appear in a way that
  proof-readers can recognise. The example files for this are
  test_entities/catcode.*

- Enable output escaping, use the LaTeX 'unicode' package, but keep
  the output encoding in a Latin-alphabet character set. This is for
  Latin-alphabet users. For them, it may be preferable to use an ISO
  Latin output encoding and have the 'babel' package handle Latin
  characters. Other characters, if present, will be intercepted and
  passed to the 'unicode' package. The example files for this are
  test_entities/ucs.*

- Use Unicode characters directly, e.g. <xsl:output
  encoding="utf-8"/>. This allows fullest use of the DocBook
  localisations as-is (though you will need to install the 'unicode'
  LaTeX package). This option is intended for documents where the
  incidence of non-Latin characters is high. The example files for
  this are test_entities/utf-8.*

See also (incomplete documentation):
$latex.entities <http://db2latex.sourceforge.net/reference/rn45re81.html>
$latex.inputenc <http://db2latex.sourceforge.net/reference/rn45re81.html>
$latex.use.ucs <http://db2latex.sourceforge.net/reference/rn45re81.html>
$latex.ucs.options <http://db2latex.sourceforge.net/reference/rn45re101.html>
$latex.babel.language <http://db2latex.sourceforge.net/reference/rn45re102.html>

James.
|
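The third option above (a Latin output encoding with other characters handed to the 'unicode' package) can be sketched as a filter. This is an illustration of the dispatch logic only, not DB2LaTeX's implementation; treat the exact `\unichar{...}` macro as an assumption about how the ucs package accepts code points.

```python
# Sketch: keep output in Latin-1 and intercept only the characters
# that encoding cannot represent, wrapping them for the LaTeX
# 'unicode' (ucs) package. The \unichar macro name is an assumption.

def latin1_with_ucs_fallback(text):
    """Pass Latin-1-representable characters through unchanged;
    wrap everything else as a \\unichar{<codepoint>} call."""
    out = []
    for ch in text:
        try:
            ch.encode("latin-1")
            out.append(ch)
        except UnicodeEncodeError:
            out.append(r"\unichar{%d}" % ord(ch))
    return "".join(out)
```

The design point is that accented Latin letters stay literal (so 'babel' and the input encoding handle them), while only the genuinely non-Latin characters incur the package-based fallback.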
From: Vitaly O. <vy...@vz...> - 2003-07-01 14:18:45
|
On Tue, 01 Jul 2003 15:57:53 +0200
Torsten Bronger <br...@ph...> wrote:

> Halloechen!
>
> Vitaly Ostanin <vy...@vz...> writes:
>
> > I looked into xsl/normalize-scape.mod.xsl.
> >
> > The template name="scape" is really big... and will get very big.
> >
> > I don't know another way to parse a string multiple times (and
> > parse the result of the parsing).
> >
> > What do you think about creating an XML file for symbols and
> > replacements? Such XML would be easy to contribute to and
> > maintain, for generating XSLT from it.
> >
> > LaTeX doesn't support unicode characters by their numbers, so
> > each character needs to be translated into valid LaTeX.
>
> Do you mean something like
> <http://xml.coverpages.org/unicodeRahtz19981008.xml>?

Thanks, it's cool! I'll look at it.

> I think the best solution is an external tool.

I already fixed normalize-scape.mod.xsl to use latex.mapping.xml and
it worked. scape.xsl with the fixed template name="scape" is
attached.

<skipped/>

--
Regards,
Vyt
mailto: vy...@vz...
JID: vy...@vz...
|