From: James F. <jam...@gm...> - 2010-03-06 02:01:05
|
Oh, the thing I want to get off my mind: has anyone considered restructuring the command-line tools for docutils? The "<A>2<B>.py" design would result in a hella lot of short scripts were the number of readers to increase from 1 (RST). Does the following make more sense:

    james@pc:~ docutils --in=rst --out=html [infile [outfile]]

Here --in and --out are guessed from the file extension if not given; failing that, --in defaults to RST and --out to, say, HTML. Output, as usual, goes to stdout by default.

It'd take some rethinking of format-specific options, I guess, though I do think it would mean less typing and fewer command names to remember. It might also give docutils more brand awareness; not everyone using it is importing it in Python scripts. People could even be using RST without having heard of docutils. (I do remember thinking, when I first found docutils, 'so, where's the "docutils" command?')

The reason I bring this up in this thread is that AFAIK there's no braindead way to work with XML doctrees. My instinct was to write a "doctree2rst.py" for the above. This kinda thing can get messy.

James

On Sat, Mar 6, 2010 at 1:36 AM, James Fisher <jam...@gm...> wrote:

> Hi Stefan and Stefan! Sorry for my slow reply. I'm quite pleased at the
> interest there is for this. I've tried out xml2rst.xsl (0.4.0) using xalan,
> xsltproc and Firefox. It does look promising, though:
>
> * With xsltproc, I get namespace warnings (and the output has no
> linebreaks):
>
>     namespace warning : Namespace prefix xsl was not found
>     <xsl:text>&CR;</xsl:text>
>     ^
>     namespace warning : Namespace prefix xsl was not found
>     <xsl:text>&CR;</xsl:text>
>     ^
>     namespace warning : Namespace prefix xsl was not found
>     <xsl:text> </xsl:text>
>     ^
>
> * With both xalan and Firefox, I get crazy amounts of extra carriage
> returns and spaces. I can send you the specifics if you like.
>
> I'm guessing both of those whitespace problems are connected, but I don't
> know any XSLT (though I've just spent 30 minutes reading up and imagine I
> could pick it up quickly) so can't really debug. Otherwise, the output
> looks pretty promising!
>
> A couple of separate things I'm wondering:
>
> 1. Could someone come up with a list of the non-semantic pieces of
> information that the RST reader discards? Things like amount of whitespace,
> what kind of table syntax has been used, ... . In an old post it was
> written "A really lossless transformation would also need a subclass of the
> rst parser (losslessrst)". Has anyone tried anything in this regard? Also,
> would there really be harm in expanding the normal doctree to include all
> this info, instead of subclassing to a new reader? Part of me says "yes" on
> the grounds that it makes the doctree RST-specific (e.g. "table syntax type"
> has no meaning if the doctree was generated from HTML). Has anyone
> considered having separate format-specific namespaces in the XML to allow
> these non-semantic elements? E.g.,
>
>     <paragraph>
>     A paragraph.
>     </paragraph>
>     <rst:newlines count="3" />
>     <paragraph>
>     Another paragraph.
>     </paragraph>
>
> 2. I'm in a couple of minds about XSLT; on one hand it seems a really
> perfect application of its strengths (AFAIK); on the other, what's the
> chance of anything getting into the core that isn't pure-python? That's a
> question more for David I guess. A question for Stefan: have you found XSLT
> has any particular in-principle difficulties; how would you estimate the
> effort level compared to pure-python had you gone that route?
>
> Also: has anyone made a comparison of all the existing attempts at an RST
> writer? What do the authors think of other authors' attempts? (Sorry if
> any of this sounds patronizing; I'm trying to get some high-level context
> before I dig in and do anything myself.) (And are the authors of the two
> html2rst modules on the mailing list?)
>
>
> I had more to say a minute ago, but it's been a long day and I'm tired.
> Speak tomorrow. Night all!
>
>
> James
>
>
> On Fri, Mar 5, 2010 at 4:14 PM, Stefan Merten <sm...@oe...> wrote:
>
>> Hi James and David!
>>
>> Yesterday James Fisher wrote:
>> > * xml2rst.py (http://www.merten-home.de/FreeSoftware/xml2rst/manual.html) --
>> > converts an XML doctree file to RST
>>
>> I'm maintaining this.
>>
>> > * it provides an ideal route for people's ad hoc crazyformat2rst.py
>> > converters.
>> > Just parse your file format into a doctree form, and docutils does the
>> > rest.
>>
>> That is exactly what I created it for and what I'm actually using it
>> for to convert my old SDF files to reST.
>>
>> I also thought of a converter from OOo formats to reST or from HTML
>> to reST using this way. Using XSLT it is quite easy to generate
>> Docutils XML from other XML like those.
>>
>> Yesterday David Goodger wrote:
>> > On Thu, Mar 4, 2010 at 13:34, James Fisher <jam...@gm...> wrote:
>> >> So, is anyone interested in such a tool? If I were to work on one (by
>> >> heavily borrowing from the above implementations), would people be
>> >> enthusiastic? Is there any chance of this getting into the codebase?
>> >
>> > There has been much interest in this over the years. If the
>> > implementation is solid (quality code with tests & docs), sure, it can
>> > get into the core.
>>
>> I'll use this as a nice occasion to report the latest developments in
>> xml2rst. First of all: The tool is stable, it's up to date (i.e.
>> supports more or less all Docutils features of today) and it is now
>> accompanied by a test suite (using filterunit_).
>>
>> .. _filterunit: http://www.merten-home.de/FreeSoftware/filterunit/
>>
>> Lately I also added a Python wrapper for the whole stuff using `lxml`
>> for the XML related stuff. I.e.: No external XSLT processor is needed
>> any more - which may make the usage somewhat friendlier.
>> I also started using the EXSLT extensions offered by lxml / xsltproc. This
>> makes the XSLT somewhat more readable.
>>
>> I did not put this stuff to the sandbox yet because when I announced
>> xml2rst originally David and others didn't like it and so I thought
>> why bother. However, if there is interest now I could easily upload
>> the latest stuff. With the new Python wrapper and using lxml it embeds
>> in a Python environment much nicer than before and the wrapper may
>> give enough rope to create a real writer module from the existing
>> stuff.
>>
>> If anyone is interested I can check this into the sandbox. Just drop
>> me a note please.
>>
>>
>> Grüße
>>
>> Stefan
>>
>>
>> _______________________________________________
>> Docutils-users mailing list
>> Doc...@li...
>> https://lists.sourceforge.net/lists/listinfo/docutils-users
>>
>> Please use "Reply All" to reply to the list.
>>
>>
>
|
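
For what it's worth, the format-guessing rule from the top of this message (explicit --in/--out win, then the file extension, then RST in and HTML out) can be sketched in a few lines of Python. This is only a rough sketch under stated assumptions: no such "docutils" command exists as far as this thread says, the extension tables and the names READERS, WRITERS, guess_format and parse_args are made up for illustration, and the sketch stops at resolving the formats without calling into docutils at all.

```python
import argparse
import os

# Hypothetical extension tables, invented for this sketch (not part of docutils).
READERS = {".rst": "rst", ".txt": "rst", ".xml": "xml"}
WRITERS = {".html": "html", ".htm": "html", ".tex": "latex", ".xml": "xml"}


def guess_format(path, table, default):
    """Look up the format for path by file extension, else return default."""
    if path:
        ext = os.path.splitext(path)[1].lower()
        return table.get(ext, default)
    return default


def parse_args(argv):
    """Parse 'docutils [--in=FMT] [--out=FMT] [infile [outfile]]'."""
    parser = argparse.ArgumentParser(prog="docutils")
    # 'in' is a Python keyword, so store the value under another name.
    parser.add_argument("--in", dest="in_format")
    parser.add_argument("--out", dest="out_format")
    parser.add_argument("infile", nargs="?")   # stdin when omitted
    parser.add_argument("outfile", nargs="?")  # stdout when omitted
    args = parser.parse_args(argv)
    # Explicit options win; otherwise guess from the extension; failing
    # that, fall back to RST for input and HTML for output.
    if args.in_format is None:
        args.in_format = guess_format(args.infile, READERS, "rst")
    if args.out_format is None:
        args.out_format = guess_format(args.outfile, WRITERS, "html")
    return args


if __name__ == "__main__":
    import sys
    opts = parse_args(sys.argv[1:])
    print(opts.in_format, "->", opts.out_format)
```

With this, "docutils notes.rst notes.tex" would resolve to an RST reader and a LaTeX writer with no options given, while a bare "docutils" reads RST and writes HTML. Format-specific options are the part this leaves unsolved.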