Re: [Docutils-users] A reStructuredText writer?


Oh, the thing I want to get off my mind:

Has anyone considered restructuring the command-line tools for docutils?
Because the "<A>2<B>.py" design would result in a hella lot of short scripts
were the number of readers to increase from 1 (RST).  Does the following
make more sense:

james@pc:~ docutils --in=rst --out=html [infile [outfile]]

Where the --in and --out are guessed from file extension if not given, and
failing that, --in defaults to RST, --out to, say, HTML; and output as usual
is stdout by default.  It'd take some rethinking of format-specific options,
I guess.  Though I do think it would take less typing and less remembering
command names.  Also, might give more docutils brand awareness; not everyone
using it is importing it in Python scripts.  People could even be using RST
without having heard of docutils.  (I do remember when I first found
docutils, thinking, 'so, where's the "docutils" command?')
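
A minimal sketch of the front-end idea above, using only the standard library. The extension tables, the `guess_format` and `parse_args` names, and the format labels are all assumptions made for illustration; nothing here is an existing docutils command or API. It only shows how `--in` and `--out` could fall back to the file extension and then to the RST/HTML defaults.

```python
import argparse
import os

# Hypothetical extension-to-format tables; illustrative only, not part of docutils.
READERS = {".rst": "rst", ".txt": "rst"}
WRITERS = {".html": "html", ".htm": "html", ".tex": "latex", ".xml": "xml"}


def guess_format(path, table, default):
    """Return the format for path's extension, or default if unknown or absent."""
    if path is None:
        return default
    ext = os.path.splitext(path)[1].lower()
    return table.get(ext, default)


def parse_args(argv):
    """Parse 'docutils [--in=FMT] [--out=FMT] [infile [outfile]]'."""
    parser = argparse.ArgumentParser(prog="docutils")
    # "in" is a Python keyword, so store the values under explicit dest names.
    parser.add_argument("--in", dest="in_format")
    parser.add_argument("--out", dest="out_format")
    parser.add_argument("infile", nargs="?")
    parser.add_argument("outfile", nargs="?")
    args = parser.parse_args(argv)
    # Fall back to the file extension, then to the defaults (RST in, HTML out).
    if args.in_format is None:
        args.in_format = guess_format(args.infile, READERS, "rst")
    if args.out_format is None:
        args.out_format = guess_format(args.outfile, WRITERS, "html")
    return args
```

For example, `parse_args(["notes.txt", "notes.tex"])` would select the `rst` reader and the `latex` writer, and the result could then be handed to whichever reader and writer docutils provides for those formats. Format-specific options are left out entirely, which is exactly the part that would need rethinking.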

The reason I bring this up in this thread is that AFAIK there's no braindead
way to work with XML doctrees.  My instinct was to write a "doctree2rst.py"
for the above.  This kinda thing can get messy.
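
To give a feel for what a "doctree2rst.py" might involve, here is a rough sketch that walks an XML doctree with the standard library and emits RST for section titles and paragraphs only. The function name `doctree_to_rst` and the choice of underline characters are assumptions; it ignores every other element and flattens inline markup to plain text, so it is nowhere near a real writer.

```python
import xml.etree.ElementTree as ET

# Underline characters per nesting depth; an arbitrary choice for this sketch.
UNDERLINES = "=-~"


def doctree_to_rst(xml_text):
    """Convert titles and paragraphs of an XML doctree to RST (sketch only)."""
    root = ET.fromstring(xml_text)
    lines = []

    def walk(node, depth):
        for child in node:
            if child.tag == "section":
                # Titles inside a nested section get the next underline style.
                walk(child, depth + 1)
            elif child.tag == "title":
                text = "".join(child.itertext()).strip()
                char = UNDERLINES[min(depth, len(UNDERLINES) - 1)]
                lines.extend([text, char * len(text), ""])
            elif child.tag == "paragraph":
                # Inline markup is dropped and whitespace is collapsed.
                text = " ".join("".join(child.itertext()).split())
                lines.extend([text, ""])
            # Every other element type is silently skipped here.

    walk(root, 0)
    return "\n".join(lines).rstrip("\n") + "\n"
```

Even this toy version shows where the mess starts: lists, tables, directives and inline markup each need their own handling, and the original source layout is already gone by the time the doctree exists.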

James

On Sat, Mar 6, 2010 at 1:36 AM, James Fisher <jam...@gm...> wrote:

> Hi Stefan and Stefan!  Sorry for my slow reply.  I'm quite pleased at the
> interest there is for this.  I've tried out xml2rst.xsl (0.4.0) using xalan,
> xsltproc and Firefox.  It does look promising, though:
>
> * With xsltproc, I get namespace warnings (and the output has no
> linebreaks):
>
> namespace warning : Namespace prefix xsl was not found
> <xsl:text>&CR;</xsl:text>
>          ^
> namespace warning : Namespace prefix xsl was not found
> <xsl:text>&CR;</xsl:text>
>          ^
> namespace warning : Namespace prefix xsl was not found
> <xsl:text> </xsl:text>
>          ^
>
> * With both xalan and Firefox, I get crazy amounts of extra carriage
> returns and spaces.  I can send you the specifics if you like.
>
> I'm guessing both of those whitespace problems are connected, but I don't
> know any XSLT (though I've just spent 30 minutes reading up and imagine I
> could pick it up quickly) so can't really debug.  Otherwise, the output
> looks pretty promising!
>
> A couple of separate things I'm wondering:
>
> 1. Could someone come up with a list of the non-semantic pieces of
> information that the RST reader discards?  Things like amount of whitespace,
> what kind of table syntax has been used, ... .  In an old post it was
> written "A really lossless transformation would also need a subclass of the
> rst parser (losslessrst)".  Has anyone tried anything in this regard?  Also,
> would there really be harm in expanding the normal doctree to include all
> this info, instead of subclassing to a new reader?  Part of me says "yes" on
> the grounds that it makes the doctree RST-specific (e.g. "table syntax type"
> has no meaning if the doctree was generated from HTML).  Has anyone
> considered having separate format-specific namespaces in the XML to allow
> these non-semantic elements?  E.g.,
>
> <paragraph>
>         A paragraph.
> </paragraph>
> <rst:newlines count="3" />
> <paragraph>
>         Another paragraph.
> </paragraph>
>
> 2. I'm in a couple of minds about XSLT; on one hand it seems a really
> perfect application of its strengths (AFAIK); on the other, what's the
> chance of anything getting into the core that isn't pure-python?  That's a
> question more for David I guess.  A question for Stefan: have you found XSLT
> has any particular in-principle difficulties; how would you estimate the
> effort level compared to pure-python had you gone that route?
>
> Also: has anyone made a comparison of all the existing attempts at an RST
> writer?  What do the authors think of other authors' attempts?  (Sorry if
> any of this sounds patronizing; I'm trying to get some high-level context
> before I dig in and do anything myself.)  (And are the authors of the two
> html2rst modules on the mailing list?)
>
>
> I had more to say a minute ago, but it's been a long day and I'm tired.
> Speak tomorrow. Night all!
>
>
> James
>
>
> On Fri, Mar 5, 2010 at 4:14 PM, Stefan Merten <sm...@oe...> wrote:
>
>> Hi James and David!
>>
>> Yesterday James Fisher wrote:
>> > * xml2rst.py (http://www.merten-home.de/FreeSoftware/xml2rst/manual.html) --
>> > converts an XML doctree file to RST
>>
>> I'm maintaining this.
>>
>> > * it provides an ideal route for people's ad hoc crazyformat2rst.py
>> >   converters. Just parse your file format into a doctree form, and
>> >   docutils does the rest.
>>
>> That is exactly what I created it for and what I'm actually using it
>> for to convert my old SDF files to reST.
>>
>> I also thought of a converter from OOo formats to reST or from HTML
>> to reST using this way. Using XSLT it is quite easy to generate
>> Docutils XML from other XML like those.
>>
>> Yesterday David Goodger wrote:
>> > On Thu, Mar 4, 2010 at 13:34, James Fisher <jam...@gm...> wrote:
>> >> So, is anyone interested in such a tool?  If I were to work on one (by
>> >> heavily borrowing from the above implementations), would people be
>> >> enthusiastic?  Is there any chance of this getting into the codebase?
>> >
>> > There has been much interest in this over the years. If the
>> > implementation is solid (quality code with tests & docs), sure, it can
>> > get into the core.
>>
>> I'll use this as a nice occasion to report the latest developments in
>> xml2rst. First of all: The tool is stable, it's up to date (i.e.
>> supports more or less all Docutils features of today) and it is now
>> accompanied by a test suite (using filterunit_).
>>
>> .. _filterunit: http://www.merten-home.de/FreeSoftware/filterunit/
>>
>> Lately I also added a Python wrapper for the whole stuff using `lxml`
>> for the XML related stuff. I.e.: No external XSLT processor is needed
>> any more - which may make the usage somewhat friendlier. I also
>> started using the EXSLT extensions offered by lxml / xsltproc. This
>> makes the XSLT somewhat more readable.
>>
>> I did not put this stuff into the sandbox yet because when I announced
>> xml2rst originally David and others didn't like it and so I thought
>> why bother. However, if there is interest now I could easily upload
>> the latest stuff. With the new Python wrapper and using lxml it embeds
>> in a Python environment much nicer than before and the wrapper may
>> give enough rope to create a real writer module from the existing
>> stuff.
>>
>> If anyone is interested I can check this into the sandbox. Just drop
>> me a note please.
>>
>>
>>                                                Grüße
>>
>>                                                Stefan
>>
>>
>
