Thread: [Vim-latex-devel] tex-refs project and xml to vim-help conversion
From: Benji F. <be...@me...> - 2003-01-26 18:08:15
|
Peter: A while ago, you wrote to the vim-latex group, asking for a volunteer to work on an XML-to-vim-help converter. I think we are willing to give this a try. Unfortunately, none of us has much experience in XML so far, so we would appreciate some pointers to help us get started: XML, DocBook, XSLT. Any advice will be appreciated. --Benji Fisher |
From: Srinath A. <sr...@fa...> - 2003-01-26 20:58:53
|
Hello, Sorry for not writing earlier about this... I downloaded 0.2.1 of the tex-refs xml source. It looks like you are using the DocBook DTD. See: http://www.docbook.org/tdg/en/html/docbook.html But I am quite unclear about how exactly they process the xml source to generate the various formats. It looks like they would use the docbook xsl stylesheets available at http://docbook.sourceforge.net/projects/xsl/ But thats about all I know... I do not know the details of which tools they have used to apply the stylesheets etc. That would be helpful information.... Actually a quick google search just revealed that docbook recommends saxon which has a windows installer... Another thing: Since vim-help is plain text, it will be in general hard to create stylesheets for it, because we will have to do all the rendering ourselves. Therefore, it is of importance to know how many elements of the docbook DTD definition the tex-refs project uses. Each new element used will be more work. For a simple vim help file itself, we need not more than a couple of elements: 1. table 2. option 3. tag 4. para and maybe a few more. If you take a look at the tex-refs.xml source, you will see a much richer xml. Writing a custom xsl stylesheet to convert that to plain text might be a huge undertaking. Srinath On Sun, 26 Jan 2003, Benji Fisher wrote: > Peter: > > A while ago, you wrote to the vim-latex group, asking for a > volunteer to work on an XML-to-vim-help converter. I think we are > willing to give this a try. Unfortunately, none of us has much > experience in XML so far, so we would appreciate some pointers to help > us get started: XML, DocBook, XSLT. Any advice will be appreciated. > > --Benji Fisher > |
From: Peter K. <pe...@ka...> - 2003-01-27 22:32:16
|
Hi Srinath and Benji,

thank you for your e-mails. I'm glad that you would like to contribute to our project :-) (I'm no longer subscribed to the vim-developer list, so I can't read posts there, but I'm sending a copy there too.)

> I downloaded 0.2.1 of the tex-refs xml source. It looks like you are
> using the DocBook DTD.
>
> See:
> http://www.docbook.org/tdg/en/html/docbook.html

That's right. Michael maintains the technical part of tex-refs and produces the various output formats. He's quite busy these days (with a move) and I personally can't answer all your questions. I hope Michael will find time to respond in detail soon. Today and tomorrow I don't have time for a longer answer, but I'll try to answer as many questions as I can as soon as possible.

May I suggest that everyone who is interested in taking part subscribe to the tex-refs mailing list? Then everybody interested automatically gets a copy of each e-mail. The list is low frequency, so you won't get "bombarded" with too many e-mails ;-)

You can subscribe by sending email to tex...@ml... containing the word subscribe in the subject or body of the email.

All the best
Peter
From: Peter K. <pma...@ka...> - 2003-01-31 13:34:47
|
Hi Benji and Srinath, I'll answer your two e-mails in one: > I downloaded 0.2.1 of the tex-refs xml source. It looks like you are > using the DocBook DTD. Yes. > But I am quite unclear about how exactly they process the xml source to > generate the various formats. It looks like they would use the docbook > xsl stylesheets available at > > http://docbook.sourceforge.net/projects/xsl/ I'm not sure about that. Michael can answer that. > Since vim-help is plain text, it will be in general hard to create > stylesheets for it, because we will have to do all the rendering > ourselves. I hope that it won't be that hard, because the stylesheet for the plain ascii text output could be used as a starting point. > Therefore, it is of importance to know how many elements of > the docbook DTD definition the tex-refs project uses. Each new element > used will be more work. As far as I've understand the number of elements won't add too much work, because many can be handled the same way to create a simple ascii output. I'm more worried if it's hard to create a stylesheet which creates an output as half as nice as the very good (in terms of ease of reading) formatted Vim help files. :-) > For a simple vim help file itself, we need not more than a couple of > elements: > > 1. table > 2. option > 3. tag > 4. para > > and maybe a few more. I'm not sure which option and tag you refer to. You don't mean the help tags which are generated, do you? > A while ago, you wrote to the vim-latex group, asking for a > volunteer to work on an XML-to-vim-help converter. I've forwarded your e-mail to Michael (at the tex-refs mailing list). As I've written he's pretty busy these days, but I hope he can find some time to give us a few pointers. If you don't mind it would be great to subscribe to the tex-refs mailing list, so it's easier to reach everyone "at once". I personally had started to improve upon the LaTeX reference which Mikolaij prepared to the Vim help format. 
I myself have no XML experience. I've started to learn about XML/XSL and so on, so that I can better contribute to the project. Have you already taken a look into the readme files of the source file from tex-refs? Some questions are answered there already. But I myself still need to see/learn which tags are used in the XML source and to think about how they should be translated into the Vim help format. I guess Michael is right and it's easier to create an XSLT transformation instead of writing a Vim macro (awk or whatever script) to do all the formatting, but maybe I'm wrong -- especially considering that you both have experience to write vim macros, but not with XSLT. The next best bet would be to use the text output format and reformat that to the Vim help format. That's not as nice as doing a direct translation, but might be a much easier solution -- more a (I hope so) quick hack to have the information at least inside vim so that the help tags file can be generated. What do you think about those two options? Which way do you think is the best way to go? Are there others who are interested in an XML to Vim help converter (besides tex-refs)? As XML is gaining more ground each day it might be useful in the future for others too? Thanks for your efforts to help us to get the tex-references accesible direct inside Vim :-) Regards Peter |
From: Benji F. <be...@me...> - 2003-01-31 14:08:40
|
Peter Karp wrote: > Hi Benji and Srinath, > > I'll answer your two e-mails in one: Thanks. >>But I am quite unclear about how exactly they process the xml source to >>generate the various formats. It looks like they would use the docbook >>xsl stylesheets available at >> >>http://docbook.sourceforge.net/projects/xsl/ > > I'm not sure about that. Michael can answer that. I hope he can find time to answer. >>Since vim-help is plain text, it will be in general hard to create >>stylesheets for it, because we will have to do all the rendering >>ourselves. > > I hope that it won't be that hard, because the stylesheet for the plain > ascii text output could be used as a starting point. That sounds promising. >>Therefore, it is of importance to know how many elements of >>the docbook DTD definition the tex-refs project uses. Each new element >>used will be more work. > > As far as I've understand the number of elements won't add too much work, > because many can be handled the same way to create a simple ascii output. > > I'm more worried if it's hard to create a stylesheet which creates an > output as half as nice as the very good (in terms of ease of reading) > formatted Vim help files. :-) [snip] > If you don't mind it would be great to subscribe to the tex-refs mailing > list, so it's easier to reach everyone "at once". [snip] I just subscribed. > I guess Michael is right and it's easier to create an XSLT transformation > instead of writing a Vim macro (awk or whatever script) to do all the > formatting, but maybe I'm wrong -- especially considering that you both > have experience to write vim macros, but not with XSLT. I am willing to learn a new tool if it seems like the right one for the job. > What do you think about those two options? Which way do you think is the > best way to go? Are there others who are interested in an XML to Vim help > converter (besides tex-refs)? As XML is gaining more ground each day it > might be useful in the future for others too? 
We are thinking of writing the documentation for LaTeX Suite in XML format, so that we can get LaTeX, HTML, and Vim-help output. If it works out, we can post a link on http://vimdoc.sourceforge.net/ in case others find it useful. --Benji |
From: Michael W. <mw...@mi...> - 2003-01-31 20:07:38
|
* Peter Karp <pma...@ka...> [030131 14:34]:

> > I downloaded 0.2.1 of the tex-refs xml source. It looks like you are
> > using the DocBook DTD.

We are using DocBook XML V4.2, currently the latest version.

> > But I am quite unclear about how exactly they process the xml source to
> > generate the various formats. It looks like they would use the docbook
> > xsl stylesheets available at
> >
> > http://docbook.sourceforge.net/projects/xsl/

The HTML (chunked and non-chunked) transformation process uses the latest version of the DocBook XSL Stylesheets (V1.60.1 at the time of this writing), with Saxon as the XSLT processor (I'd prefer to use xsltproc from libxml2 because it would be much faster, but it is still buggy).

For TXT output we take the HTML non-chunked output and run 'lynx -dump' on it. Note: there are no XSL stylesheets that generate TXT output directly!

RTF is generated using openjade and the DSSSL stylesheets (there are no XSL stylesheets for RTF output available).

PDF is still experimental. The XSL-FO output is quite good, but the backends that transform it into high-quality PDF are not yet ready. There is currently some work under way in the ConTeXt community, which might lead to good PDF in the near future.

> > stylesheets for it, because we will have to do all the rendering
> > ourselves.
>
> I hope that it won't be that hard, because the stylesheet for the plain
> ascii text output could be used as a starting point.

See the note above: there are _no_ XSL stylesheets that generate TXT output directly.

> > Therefore, it is of importance to know how many elements of
> > the docbook DTD definition the tex-refs project uses. Each new element
> > used will be more work.
...
> > For a simple vim help file itself, we need not more than a couple of
> > elements:
> >
> > 1. table
> > 2. option
> > 3. tag
> > 4. para
> >
> > and maybe a few more.

IMHO it's _not_ a good idea to restrict the use to a subset of all available DocBook tags! If a backend cannot handle specific tags, it should simply copy the contents (everything between start and end tag) of the unknown tags unchanged to the output file.

> I guess Michael is right and it's easier to create an XSLT transformation
> instead of writing a Vim macro (awk or whatever script) to do all the
> formatting, but maybe I'm wrong -- especially considering that you both
> have experience to write vim macros, but not with XSLT.
>
> The next best bet would be to use the text output format and reformat that
> to the Vim help format. That's not as nice as doing a direct translation,
> but might be a much easier solution -- more a (I hope so) quick hack to
> have the information at least inside vim so that the help tags file can be
> generated.

Maybe you should consider splitting the source XML file into separate files. At the beginning it can be easier to handle only a part of the complete XML file (e.g. Chapter 1, TeX). I could provide an XSL stylesheet which automatically extracts a specific section of the complete source file into a separate file.

Please let me know if you need any further information or help (as Peter already mentioned, I'm quite busy at the moment but will do my best to assist you in your work).

Michael
--
mw...@mi... http://www.miwie.org
mw...@mi...
From: Benji F. <be...@me...> - 2003-01-31 20:58:22
|
Michael Wiedmann wrote: > * Peter Karp <pma...@ka...> [030131 14:34]: > >>>I downloaded 0.2.1 of the tex-refs xml source. It looks like you are >>>using the DocBook DTD. > > We are using DocBook XML V4.2, currently the most actual version. > > The HTML (chunked and non-chunked) transformation process uses the > most actual version of the DocBook XSL Stylesheets (V1.60.1 at the > time of this writing), using Saxon as XSLT processor (I'd prefer to > use xsltproc from libxml2 because it would be much faster, but it > is still buggy). > > For TXT output we use the HTML non-chunked output and use 'lynx -dump' > to create TXT output. Note: there are no XSL stylesheets to generate > directly TXT output! > > RTF is generated using openjade and the DSSSL stylesheets (there are > no XSL stylesheets for RTF output available). > > PDF is still experimental, the XSL-FO output is quite good, but the > backends to transform this into high quality PDF are not yet ready. > Currently there is some work undergoing in the ConTeXt community, > which might lead to good PDF in the near future. That's too bad. I was under the impression that LaTeX output was already standard. Especially for the tex-refs and vim-latex projects, it would be nice to have some form of TeX output! > IMHO it's _not_ a good idea to restrict the use to a subset of all available > DocBook tags! If a backend cannot handle specific tags it should simply > copy the contents (all between start and end tag) of the unknown tags > unchanged to the output file. > > Maybe you should consider to split the source XML file into separate > files. It can be easier to handle only a part (e.g. Chapter 1, TeX) > of the complete XML file at the beginning. I could provide a XSL > stylesheets which auomatically extracts a specific section of the > complete source file into a separate file. 
> > Please let me know if you need any further information or help > (as Peter already mentioned I'm quite busy at the moment but will > do my best to assist you in your work). > > Michael Thanks for the information. It is beginning to sound as if XSLT is not the right tool for us to use, after all. I am inclined to write a custom filter from DocBook directly to vim help, rather than use some intermediate format. We might use Python, which already has support for processing XML. We in the vim-latex project know very little about XML and DocBook. What we really want at this point is a reading list to get started. What is the structure of a DocBook document? What is the idea behind it? What needs to be done to generate a specific output format? Just point me to some good "getting started" guides, and I think I can go from there ... or at least, it will keep me busy for a while. ;) --Benji |
From: Michael W. <mw...@mi...> - 2003-02-01 12:41:52
|
* Benji Fisher <be...@me...> [030131 16:12]:

> That's too bad. I was under the impression that LaTeX output was
> already standard. Especially for the tex-refs and vim-latex projects,
> it would be nice to have some form of TeX output!

There is at least one project which tries to convert DocBook XML to LaTeX (DB2LaTeX: http://www.sourceforge.net/projects/db2latex), but it's not yet usable. The author promised to start working on it again, but AFAIK nothing has happened during the last months.

The DocBook XSL Stylesheets that generate XSL-FO work quite well, but the backends that transform FO to PDF are not completely satisfying, at least if you have a 'real life' document using tables, etc. For less complicated documents, FOP from the Apache project should already give good results.

...
> We in the vim-latex project know very little about XML and
> DocBook. What we really want at this point is a reading list to get
> started. What is the structure of a DocBook document? What is the idea
> behind it? What needs to be done to generate a specific output format?
> Just point me to some good "getting started" guides, and I think I can
> go from there ... or at least, it will keep me busy for a while. ;)

The following links might be a bit out of date but should give you a start:

Writing Documentation Using DocBook
David Rugge; Mark Galassi; Eric Bischoff
http://www.caldera.de/~eric/crash-course/HTML/index.html

Introduction in DocBook from Marc Galassi
http://nis-www.lanl.gov/~rosalia/mydocs/docbook-intro.html

The Debian SGML/XML HOWTO
http://people.debian.org/~bortz/SGML-HOWTO/potato/howto.html

The Using DocBook HOWTO of the Linux Documentation Project
http://metalab.unc.edu/godoy/using-docbook/using-docbook.html

DocBook FAQ
Dave Pawson
http://www.dpawson.co.uk/docbook/

DocBook Wiki
http://docbook.org/wiki/moin.cgi/

Michael
--
mw...@mi... http://www.miwie.org
mw...@mi...
From: Benji F. <be...@me...> - 2003-02-11 15:23:41
|
I have a few questions today.

1. I have been thinking about how to translate DocBook tags into *tags* (hyperlinks) for vim help files. Let's look at an example:

===from tex-refs.xml===
<section id="bslash-addtocounter">
<title id="bslash-addtocounter-title">\addtocounter</title>
<indexterm><primary>\addtocounter</primary></indexterm>
<para><literal>\addtocounter{counter}{value}</literal>
</para>
<para>The <literal>\addtocounter</literal> command increments the
<literal>counter</literal> by the amount specified by the
<literal>value</literal> argument. The <literal>value</literal>
argument can be negative.
</para>
</section>
========================

===from tex-refs.html===
<div class="section" lang="en">
<div class="titlepage">
<div>
<div>
<h6 class="title"><a name="bslash-addtocounter"></a>1.3.2.1.1 \addtocounter</h6></div></div></div>
<a class="indexterm" name="d0e264"></a>
<p><tt class="literal">\addtocounter{counter}{value}</tt></p>
<p>The <tt class="literal">\addtocounter</tt> command increments the
<tt class="literal">counter</tt> by the amount specified by the
<tt class="literal">value</tt> argument. The <tt class=
"literal">value</tt> argument can be negative.</p></div>
=========================

===from latexhelp.txt===
\addtocounter{counter}{value}                     *\addtocounter*
        Increments the {counter} by the amount specified by the {value}
        argument. The {value} argument can be negative.
========================

It looks to me as if the HTML <a name="bslash-addtocounter"></a> is generated from the XML <section id="bslash-addtocounter">. Is this right? If so, I think we should plan to generate the *\addtocounter* from the same element.

2. I think the main reason to prefer Python over XSL is that we need plain text with the correct indentation and line breaks. Am I missing something? Perhaps it would be easier to use a two-step process: first, an XSL style sheet that does most of the processing, producing something like this:

===from latexhelp.xml===
\addtocounter{counter}{value}                     *\addtocounter*
<block indent=2 tw=80>
Increments the {counter} by the amount specified by the {value} argument. The {value} argument can be negative.
</block>
========================

Then a simple Python or vim script could do the rest of the work. Does this sound feasible?

3. If we want a customized format for vim documentation, we could make up our own simple DTD and then use XSL to convert this to DocBook. Is this as easy as I make it sound? Or would it be simpler to use attributes in standard DocBook tags? Srinath proposed

===from options.xml===
<option>
<optionName>statusline</optionName>
<optionNameAlias>stl</optionNameAlias>
<optionType>string</optionType>
<optionDefault>empty</optionDefault>
<optionScope>global</optionScope>
<notInVi/>
<notInVi>not available when compiled without the
<tag>+statusline</tag> feature</notInVi>
<optionDescription>
When nonempty, this option determines the content of the status
line. Also see <featureTag>status-line</featureTag>
</optionDescription>
</option>
======================

A variant would be to use attributes:

===from options.xml===
<option name="statusline" alias="stl" type="string"
        default="empty" scope="global">
<compatibility>not in vi
</compatibility>
<compatibility>not available when compiled without the
<tag>+statusline</tag> feature
</compatibility>
<optionDescription>
When nonempty, this option determines the content of the status
line. Also see <featureTag>status-line</featureTag>
</optionDescription>
</option>
=======================

Is there enough flexibility in DocBook (with attributes and PI's) to do something equivalent, or does the custom format, converted to DocBook, sound like a good idea?

4. I spent some time trying to set up my system (Red Hat, not Debian) to handle the tex-refs conversions. I adjusted the paths in Makefile.cfg and downloaded Saxon from sourceforge. I think my problem is that I do not know what to do to install Saxon. I copied saxon.jar to /usr/share/java but I get this error message:

[benji@localhost tex-refs-0.2.2]$ make tex-refs.html
java -classpath /usr/share/java/saxon.jar com.icl.saxon.StyleSheet tex-refs.xml tex-refs.xsl > tex-refs-saxon.html
Exception in thread "main" java.lang.VerifyError: verification failed at PC 65 in com.icl.saxon.style.XSLTemplate:preprocess(()V): incompatible type on stack
   at 0x4028115f: java.lang.Throwable.Throwable(java.lang.String) (/usr/lib/libgcj.so.3)

Can someone tell me what else I should do?

--Benji
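Editorial sketch for questions 1 and 2 above: one way the direct Python route could look, using only the standard library (xml.dom.minidom and textwrap). It is a rough illustration and not the converter anyone in the thread wrote. The assumptions are that the help tag is taken from the section's <title> text, that the first <para> is the usage line, and that the width of 78 and indent of 8 are reasonable defaults; the function names text_of and section_to_help are made up for this example.

```python
import textwrap
from xml.dom import minidom

# The <section> quoted in the message above, from tex-refs.xml.
SAMPLE = r"""<section id="bslash-addtocounter">
<title id="bslash-addtocounter-title">\addtocounter</title>
<indexterm><primary>\addtocounter</primary></indexterm>
<para><literal>\addtocounter{counter}{value}</literal>
</para>
<para>The <literal>\addtocounter</literal> command increments the
<literal>counter</literal> by the amount specified by the
<literal>value</literal> argument. The <literal>value</literal>
argument can be negative.
</para>
</section>"""


def text_of(node):
    """Concatenate all text below node and collapse runs of whitespace."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            parts.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            parts.append(text_of(child))
    return " ".join("".join(parts).split())


def section_to_help(section, width=78, indent=8):
    """Render one <section> as vim-help text (hypothetical layout)."""
    title = text_of(section.getElementsByTagName("title")[0])
    paras = [text_of(p) for p in section.getElementsByTagName("para")]
    usage, body = paras[0], paras[1:]
    # Assumption: the help tag is the title wrapped in asterisks,
    # right-aligned on the usage line.
    tag = "*" + title + "*"
    head = usage + " " + tag.rjust(max(0, width - len(usage) - 1))
    pad = " " * indent
    wrapped = [
        textwrap.fill(p, width=width, initial_indent=pad, subsequent_indent=pad)
        for p in body
    ]
    return "\n".join([head] + wrapped)


if __name__ == "__main__":
    root = minidom.parseString(SAMPLE).documentElement
    print(section_to_help(root))
```

The indentation and line breaks come from textwrap.fill, so nothing XSL-specific is needed for that part; whether the tag should come from the title or from the id attribute is exactly the open question in item 1.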
From: Srinath A. <sr...@fa...> - 2003-02-01 02:58:15
|
Hey Michael,

Thanks for the information!

On Fri, 31 Jan 2003, Michael Wiedmann wrote:

> For TXT output we use the HTML non-chunked output and use 'lynx -dump'
> to create TXT output. Note: there are no XSL stylesheets to generate
> directly TXT output!
>

That's what it looked like :) A question: How hard is it to tweak the xsl stylesheets for html non-chunked so that instead of generating sections like:

    <h1 class=section><a name="#section-tag">Overview</h1>

it will do something like:

    <table>
    <tr><td colspan=2>=====================================================================</td></tr>
    <tr><td>Section name</td><td align=right>*section-tag*</td></tr>
    </table>

I hope you see what I'm trying to say... This way, hopefully lynx -dump will produce the sections in the familiar way:

    =====================================================================
    OVERVIEW                                                *section-tag*

So... although an xsl stylesheet for creating plain text does not exist, how hard is it to tweak the html-nonchunked stylesheet so that lynx -dump will look reasonable? Is this a better option? Less work maybe?

> > > For a simple vim help file itself, we need not more than a couple of
> > > elements:
> > >
> > > 1. table
> > > 2. option
> > > 3. tag
> > > 4. para
> > >
> > > and maybe a few more.
>
> IMHO it's _not_ a good idea to restrict the use to a subset of all available
> DocBook tags! If a backend cannot handle specific tags it should simply
> copy the contents (all between start and end tag) of the unknown tags
> unchanged to the output file.
>

It's just that supporting each tag becomes more and more work... Well, we could always just get the text data from all unsupported tags... Dunno how good an idea that is though.... Might even be better just completely ignoring unsupported tags.

> Maybe you should consider to split the source XML file into separate
> files. It can be easier to handle only a part (e.g. Chapter 1, TeX)
> of the complete XML file at the beginning. I could provide a XSL
> stylesheets which automatically extracts a specific section of the
> complete source file into a separate file.
>

I do not really see how splitting the XML sources into separate files will be much help. A typical vim help file is a single text file with sections... And unlike the tex-refs project, a vim help file is well within maybe 10-20 kb.

I will also reply to Benji's mail here...

He asked whether it will be a good idea to process DocBook XML directly from within python. If you remember, I had spent some time mocking up a tiny little python script to do this. My experience was that it's very easy to get started and get some results if you use python. The xml.dom and xml.dom.minidom packages are extremely nice and let you get something done without having to know much XML. The problem is that after a while, it will become a bit of grunt work supporting each tag. Actually, over the last couple of weeks, I have been very slowly hacking away at that primitive python code. It has gotten to a pretty respectable size now (unfortunately)... It is able to do some good stuff now...

Benji, are you interested in taking a look at it? If you are interested in going the python way, it's definitely worth a look...

I am very surprised to find that there is no docbook-latex converter already!!! But it will be a big gain even if we can have just vim-help and html...

Srinath

PS: I am subscribed to the tex-refs mailing list too...

--
Srinath Avadhanula Jan 31 1:04pm
"Not only is this incomprehensible, but the ink is ugly and the paper is from the wrong kind of tree." -- Professor W.
From: Michael W. <mw...@mi...> - 2003-02-01 12:49:37
|
* Srinath Avadhanula <sr...@fa...> [030131 18:58]:

> That's what it looked like :) A question: How hard is it to tweak the xsl
> stylesheets for html non-chunked so that instead of generating sections
> like:
>
> <h1 class=section><a name="#section-tag">Overview</h1>
>
> it will do something like:
>
> <table>
> <tr><td colspan=2>=====================================================================</td></tr>
> <tr><td>Section name</td><td align=right>*section-tag*</td></tr>
> </table>

This might need some complicated customization, but I can ask for help on the DocBook-Apps ML. Could you please test first whether this HTML really produces the wanted TXT output when dumped with lynx/w3m?

...
> It's just that supporting each tag becomes more and more work... Well, we
> could always just get the text data from all unsupported tags... Dunno
> how good an idea that is though.... Might even be better just completely
> ignoring unsupported tags.

You cannot just ignore unknown tags, otherwise you'll lose content! Imagine a piece of source like:

<para> foo foo foo foo <emphasis>bar bar</emphasis> foo foo foo</para>

If you completely ignore the contents of the <emphasis> tag, then you'll lose the information inside the tag. Just copy the text content of any unknown tag to the output file and don't do any special treatment for this tag.

Michael
--
mw...@mi... http://www.miwie.org
mw...@mi...
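Editorial sketch of the rule Michael describes (copy the text of unknown tags through, apply markup only to tags the backend knows), written with xml.dom.minidom since that is the package mentioned earlier in the thread. The HANDLERS table and the vim-help markup chosen for <option> and <replaceable> are assumptions made for illustration; they are not taken from anyone's actual converter.

```python
from xml.dom import minidom

# Hypothetical table of the few DocBook tags this toy backend "knows".
# Every other tag falls through and only its text content is kept.
HANDLERS = {
    "option": lambda text: "'" + text + "'",       # e.g. 'statusline'
    "replaceable": lambda text: "{" + text + "}",  # e.g. {value}
}


def render(node):
    """Return the text of node, with markup applied for known tags only."""
    if node.nodeType in (node.TEXT_NODE, node.CDATA_SECTION_NODE):
        return node.data
    if node.nodeType != node.ELEMENT_NODE:
        return ""  # comments, processing instructions, ...
    inner = "".join(render(child) for child in node.childNodes)
    handler = HANDLERS.get(node.tagName)
    # Unknown tag: no special treatment, but the content is not lost.
    return handler(inner) if handler else inner


if __name__ == "__main__":
    doc = minidom.parseString(
        "<para> foo foo foo foo <emphasis>bar bar</emphasis> foo foo foo</para>"
    )
    print(render(doc.documentElement))
```

With this shape, supporting a new tag means adding one entry to the table, and a tag that is never added still contributes its text.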
From: Srinath A. <sr...@fa...> - 2003-02-01 20:30:09
|
On Sat, 1 Feb 2003, Michael Wiedmann wrote:

> > <table>
> > <tr><td colspan=2>=====================================================================</td></tr>
> > <tr><td>Section name</td><td align=right>*section-tag*</td></tr>
> > </table>
>
> This might need some complicate customization, but I can ask for some
> help in the DocBook-Apps ML. Could you please test before, whether
> this output would produce the wanted TXT output if dumped with lynx/w3m?

Lynx behaves somewhat strangely... Doing

lynx -dump test.html > test.txt

produces something like:

   =====================================================================
   Section name                                            *section-tag*

But if I do lynx test.html and from within lynx do
print -> save to local file -> test.txt
then I correctly get:

=====================================================================
Section name                                            *section-tag*

I still do not quite see why it produces extra leading spaces, but that is easy to take care of...

> <para> foo foo foo foo <emphasis>bar bar</emphasis> foo foo foo</para>
>
> If you completely ignore the contents of the <emphasis> tag, then you'll
> loose the information inside the tag. Just copy the text content of
> any unknown tag to the output file and don't do any sepcial treatment
> for this tag.

Okay... It's trivial enough to just process all text within unknown tags in python...

It looks like customizing docbook xsl stylesheets is a complicated affair, from what you said about customizing the output of the section tag. So another question: how hard is it to add tags of our own... For example, in a vim help file, we typically describe vim options in a consistent manner which lends itself nicely to markup. (For example try :help 'statusline'). We could do the same thing with a series of <table>'s, but then we'll have to take care of the indexing and stuff ourselves, which kind of defeats the purpose of docbook.

Srinath
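Editorial sketch of the "easy to take care of" step: removing the extra leading spaces from a lynx -dump result. It assumes the extra spaces are a uniform left margin shared by every non-blank line, which the thread does not confirm; the function name strip_dump_indent and the sample text are made up for the example.

```python
import textwrap


def strip_dump_indent(dump):
    """Remove the margin common to all non-blank lines, and trailing blanks."""
    lines = [line.rstrip() for line in textwrap.dedent(dump).splitlines()]
    return "\n".join(lines)


# Stand-in for the contents of test.txt; the three-space margin is an assumption.
SAMPLE = "\n".join([
    "   " + "=" * 69,
    "   Section name                          *section-tag*",
    "",
])

if __name__ == "__main__":
    print(strip_dump_indent(SAMPLE))
```

textwrap.dedent only removes whitespace that is common to all lines, so text that lynx indents further (nested lists, for instance) keeps its relative indentation.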
From: Michael W. <mw...@mi...> - 2003-02-02 15:19:19
|
* Srinath Avadhanula <sr...@fa...> [030201 12:29]:
...
> But if I do lynx test.html and from within lynx do
> print -> save to local file -> test.txt
> then I correctly get:
>
> =====================================================================
> Section name *section-tag*

OK, so I will try to get some help on customizing the XSL stylesheets to generate the HTML output you mentioned. This may take a while....

...
> It looks like customizing docbook xsl stylesheets is a complicated
> affair from what you said about customizing the output of the section
> tag. So another question: how hard is it to add tags of our own... For
> example, in a vim help file, we typically describe vim options in a
> consistent manner which lends itself nicely to markup.
> (For example try :help 'statusline'). We could do the same thing with a
> series of <table>'s but then we'll have to take care of the indexing
> and stuff ourselves which kind of defeats the purpose of docbook.

If you want a 'valid DocBook XML' document, you are _not_ allowed to add tags of your own. The reason is obvious: usually you want to process your XML document with several tools to produce different kinds of output formats, and each tool must know which tags are valid so that it can handle the source file.

One possibility might be the use of so-called 'processing instructions', e.g.:

<?vimhelp some information for the vim help file backend?>

Processing instructions contain information required by a specific application expected to process the XML data. But then your 'application' must be capable of handling such processing instructions.

Let me know if this might be a way to ease the development of a special 'tex-refs to vim-help filter application'. I could add such PIs to the XML source file.

Michael
--
mw...@mi... http://www.miwie.org
mw...@mi...
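Editorial sketch of how a Python backend could pick up such processing instructions with xml.dom.minidom. The target name vimhelp and its free-form payload come from Michael's example; what a backend would do with the payload is left open here, and the function name vimhelp_instructions is made up for illustration.

```python
from xml.dom import minidom

SAMPLE = (
    "<para>foo"
    "<?vimhelp some information for the vim help file backend?>"
    " bar<?other ignored by this backend?></para>"
)


def vimhelp_instructions(node):
    """Collect the data of every <?vimhelp ...?> PI at or below node."""
    found = []
    for child in node.childNodes:
        if child.nodeType == child.PROCESSING_INSTRUCTION_NODE:
            # Other applications' PIs are simply skipped.
            if child.target == "vimhelp":
                found.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            found.extend(vimhelp_instructions(child))
    return found


if __name__ == "__main__":
    print(vimhelp_instructions(minidom.parseString(SAMPLE)))
```

Because PIs are not elements, the document stays valid DocBook, and tools that do not know the vimhelp target can skip them.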
From: Benji F. <be...@me...> - 2003-02-07 18:52:39
|
Srinath Avadhanula wrote:

> I will also reply to Benji's mail here...
>
> He asked whether it will be a good idea to process DocBook XML directly
> from within python. If you remember, I had spent some time mocking up a
> tiny little python script to do this. My experience was that its very
> easy to get started and get some results if you use python. The xml.dom
> and xml.dom.minidom packages are extremely nice and let you get
> something done without having to know much XML. The problem is that
> after a while, it will become a bit of grunt work supporting each tag.
> Actually, over the last couple of weeks, I have been very slowly hacking
> away at that primitive python code. It has gotten to a pretty
> respectable size now (unfortunately)... It is able to do some good stuff
> now...
>
> Benji, are you interested in taking a look at it? If you are interested
> in going the python way, its definitely worth a look...

Yes, please send me what you have.

> I am very surprised to find that there is no docbook-latex converter
> already!!! But it will be a big gain even if we can have just vim-help
> and html...

There are two approaches I have seen. I have not actually tried either one.

* One, which Peter Karp pointed us to, is tBook: http://tbookdtd.sourceforge.net/
This is yet another source format; it can be translated into LaTeX or DocBook. Even if we ultimately decide to use this as our primary source format, I think the vim help may as well be generated from DocBook, since that will be useful to more people. Advantages: they claim fewer (but enough) elements than DocBook, a lower tag/text ratio, and tag names that are similar to LaTeX commands.

* The other approach is to use PassiveTeX: http://www.tei-c.org.uk/Software/passivetex/
The idea is to start with DocBook, and then use standard tools to create an FO (Formatting Objects) file. This seems to be another flavor (or DTD) of XML. FO is the opposite of DocBook: it describes physical markup instead of logical markup. PassiveTeX is then a set of TeX macros that read in the FO file. I am not really interested in this path, since I would rather have something that actually produces a LaTeX file.

--Benji
From: Srinath A. <sr...@fa...> - 2003-02-07 21:29:15
|
Well, the more I look at DocBook, the more I get the feeling that we will not be able to use it without some kind of modifications. The modifications are minor enough that it's trivial to modify the xsl stylesheets as well, but they are kind of essential. I know that Michael Wiedmann objects to such modifications, but unless someone suggests a constructive way to get around it, I do not see any option but to make some modifications.

On Fri, 7 Feb 2003, Benji Fisher wrote:

> > Benji, are you interested in taking a look at it? If you are interested
> > in going the python way, its definitely worth a look...
>
> Yes, please send me what you have.
>

I have been using the vim-latex cvs tree to maintain what I've done... Check out the following module:

TODO/srinath-files/python

just as you would check out anything else. Make sure your python has the xml.dom and xml.dom.minidom packages installed. Just for good measure, install the PyXML package... (I assume you know how to do these installs. Python makes it really easy.) After doing that, run

$ python option.py

from the command line. That should produce some formatted text on stdout. It processes the latex-suite.xml file to produce the output. As of now, latex-suite.xml is fully legal docbook, but as I have said, that might need to change soon. For one, we will definitely need a <tag> markup tag to specify vim-help tags (repetition intended).

You could redirect to /tmp/a.txt and ':set ft=help' to get a better idea... Obviously, this is very much a work in progress... I only work on it for very little time each day...

If you have saxon or some other tool, you can also use the standard docbook xsl stylesheet to produce html from latex-suite.xml. I have not even begun thinking of anything other than docbook.

> * One, which Peter Karp pointed us to, is tBook:
> http://tbookdtd.sourceforge.net/
> This is yet another source format; it can be translated into LaTeX or
> DocBook. Even if we ultimately decide to use this as our primary source
> format, I think the vim help may as well be generated from DocBook,
> since that will be useful to more people. Advantages: they claim fewer
> (but enough) elements than DocBook, lower tag/text ratio, and tag names
> that are similar to LaTeX commands.

This looks interesting. I'll take a look at it next time I get some time.

Srinath