From: David G. <go...@us...> - 2002-09-23 04:59:48
Dethe Elza wrote:
> But reST actually converts to the DOM when you call asdom(), right?
Yes; what else should it do?
> So it's a potentially expensive operation.
Theoretically it's O(n), and you should only have to do it once.
>> Please don't think of *any* part of the current Docutils
>> implementation as written in stone. It's all experimental, all
>> subject to change if we discover it's broken or deficient in some way.
>> That's what the "0." part of the "0.2" ("0.2.3", currently) version
>> number is meant to imply.
>
> Point taken, and I'm glad we're able to have these discussions in that
> light.
I've found that recognizing that all code is disposable has been a great
liberator. Someone said that any decent system needs to be written three
times: once to discover the issues, again to discover the *real* issues by
trying to solve the initial ones, and a third time to write an elegant,
general solution that really works.
> Some things jump out at me because I'm new to the project. If I find they're
> important enough to me, I may jump in and fix them.
Please feel free!
> Why is node.nodeName == 'Body' more expensive than isinstance(node, Body)? If
> it's because of a string compare rather than a pointer compare, then the
> strings can be interned and it's a pointer comparison again. Or is there some
> other reason?
I neglected to point out that ``nodes.Body`` is an abstract superclass of
*many* concrete node classes. There are many such "element category"
superclasses, roughly corresponding to those listed in
http://docutils.sf.net/spec/doctree.html#element-hierarchy. The difference
is between ``isinstance(node, nodes.Body)`` and ``node.tagName in
('paragraph', 'bullet_list', 'enumerated_list', 'definition_list', ...)``.
Multiply by the maintenance required every time a new element class is
added, and the difference should be clear.
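To make the maintenance point concrete, here is a toy sketch. It is not Docutils code: the classes below and the ``BODY_TAGNAMES`` tuple are hypothetical stand-ins, reduced to the bare minimum needed to compare the two tests.

```python
class Node:
    """Toy base class for every node (hypothetical, not the Docutils class)."""
    tagName = "node"


class Body:
    """Abstract "element category" superclass; never instantiated itself."""


class paragraph(Node, Body):
    tagName = "paragraph"


class bullet_list(Node, Body):
    tagName = "bullet_list"


class emphasis(Node):
    """An inline element, so it does not inherit from Body."""
    tagName = "emphasis"


# The string-based test depends on a hand-maintained list of names.
BODY_TAGNAMES = ("paragraph", "bullet_list")


def is_body_by_name(node):
    return node.tagName in BODY_TAGNAMES


def is_body_by_class(node):
    return isinstance(node, Body)


# Adding an element class later: inheriting from Body is enough for the
# isinstance() test, while BODY_TAGNAMES silently goes stale.
class definition_list(Node, Body):
    tagName = "definition_list"
```

With these definitions, ``is_body_by_class(definition_list())`` is true, but ``is_body_by_name(definition_list())`` stays false until someone remembers to extend the tuple.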
> But once I'm done with the include directives I want to take a look at a reST
> DOM -> reST text writer to get a feel for how tricky it will be.
One thing to remember: it's not a "reST DOM", it's a "Docutils DOM" (or
"Docutils doc tree", as I usually refer to it to avoid misinterpretation).
The doc tree (internal document representation) is independent of
reStructuredText.
>> In addition, there's a lot going on behind the scenes that the
>> Docutils DTD doesn't expose. Try running ``html.py --dump-internals
>> input.txt ...`` to see what I mean. ("--dump-internals" is an
>> internal, hidden option, for debugging.)
>
> OK, but without knowing more about *why* that is, it looks like a bug
> to me.
It's mostly bookkeeping. In the source text, we may see::

    Here is a reference_ to a web site.

    .. _reference: http://www.example.com/

Once parsed, the target URL has to be moved over to the "reference" text.
The internal data structures keep track of stuff like this. Any browser or
other "user agent" has to do the same: create an internal database of
references and targets in order to make hyperlinks work. That internal
database is merely an implementation detail.
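A rough sketch of that kind of bookkeeping, for illustration only: this is not how the Docutils parser works, and the two regular expressions and the ``resolve`` function are assumptions that cover just the simple named reference and external target shown above.

```python
import re

# Hypothetical patterns for this sketch only; the real syntax is much richer.
TARGET_RE = re.compile(r"^\.\. _([^:]+):\s*(\S+)$")
REFERENCE_RE = re.compile(r"(?<!\w)([A-Za-z0-9]+)_(?!\w)")


def resolve(source):
    """Return (reference name, target URL or None) pairs found in source."""
    targets = {}
    body_lines = []
    # First pass: collect every target into the internal "database", so a
    # reference may appear before the target that defines it.
    for line in source.splitlines():
        match = TARGET_RE.match(line.strip())
        if match:
            targets[match.group(1)] = match.group(2)
        else:
            body_lines.append(line)
    # Second pass: attach the recorded URL to each reference.
    resolved = []
    for line in body_lines:
        for match in REFERENCE_RE.finditer(line):
            name = match.group(1)
            resolved.append((name, targets.get(name)))
    return resolved
```

The ``targets`` dictionary plays the role of the internal database: it is needed while resolving, but nothing outside ``resolve`` ever has to see it.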
> It's Tim Peters's koan, "explicit is better than implicit." Once we start
> getting too much magic going on under the covers, Python starts veering
> towards Perl. IMHO that's one of the major problems in Zope, and makes work
> in Zope, beyond the very trivial, much harder to do and to understand than it
> would otherwise be.
I don't think that's a valid comparison, but perhaps we're talking at
cross-purposes here. There's nothing wrong with not exposing implementation
details where they're not relevant, such as in the output XML. However,
these details do need more documentation. I've tried to document them in
docstrings, but until there's a docstring extraction system in place you
have to read the source.
> One of my upcoming projects is to pull one of my unfinished novels out
> and serialize it on my weblog using reST. I really like the way I can
> more or less forget about the tool and focus on writing--it's almost as
> good as a typewriter that way!
I had occasion on Friday to write a document unrelated to Docutils. I used
reStructuredText in Emacs, and the text just flowed. I was quite pleased by
how easy it was to write the words *and* markup out without having the
markup get in the way. It's good to hear that I'm not the only one. I'm
biased, so I can't trust my own good experiences without corroboration.
> Of course, a typewriter never gives you compile errors. That's a major
> problem with XML and currently with reST. I have little hope of seeing
> it solved in XML, but I think we can and should make an effort in reST
> to make errors rarer, clearer, and more easily found/fixed.
I agree completely. In fact, I'm currently working on improving the
Reporter system to always report line numbers, and to report to stderr in
the GNU Tools format: "file:lineno: message". Should be checked in soon.
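For illustration, a minimal sketch of a message in that "file:lineno: message" shape. This is not the Reporter code: ``format_message``, ``report``, and the ``LEVELS`` table are hypothetical names, and putting the level name at the front of the message part is an assumption of the sketch.

```python
import sys

# Level names assumed for this sketch; only "SEVERE" (level 4) is taken
# from the discussion itself.
LEVELS = {1: "INFO", 2: "WARNING", 3: "ERROR", 4: "SEVERE"}


def format_message(source, lineno, level, text):
    """Build one line in the GNU-style "file:lineno: message" form."""
    return f"{source}:{lineno}: {LEVELS[level]}: {text}"


def report(source, lineno, level, text, stream=sys.stderr):
    """Write the formatted message to stderr (or another stream)."""
    print(format_message(source, lineno, level, text), file=stream)
```

The payoff of this shape is that editors and build tools which already understand compiler-style output can jump straight to the offending line.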
> A casual user should never have to see a python stack trace, for instance.
Again, agreed. The only time a stack trace should occur at present is with
a "SEVERE" (level-4) system message. I suppose even that ought to be
suppressed though; just output the system message plus a line saying
"Processing stopped due to the problems reported above" or some such.
If there are any other stack traces happening, they're bugs and should be
reported and squashed. I/O error handling could use some improvement; if
you specify a nonexistent file, you'll see a stack trace.
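One possible shape for that, as a sketch only and not the current Docutils front end: ``SevereError``, ``read_source``, and ``run`` are hypothetical names. The idea is that the top level catches the expected failures and prints a short message instead of a traceback, while genuine bugs still propagate.

```python
import sys


class SevereError(Exception):
    """Hypothetical stand-in for a SEVERE (level-4) system message."""


def read_source(path):
    # open() raises an OSError subclass (FileNotFoundError) if path is missing.
    with open(path, encoding="utf-8") as source:
        return source.read()


def run(process, path, stream=sys.stderr):
    """Call process(path); report expected failures and return an exit status."""
    try:
        process(path)
    except SevereError as error:
        # The system message itself, then the closing line suggested above.
        print(error, file=stream)
        print("Processing stopped due to the problems reported above.",
              file=stream)
        return 1
    except OSError as error:
        # I/O problems (such as a nonexistent file) get a one-line message.
        print(f"{path}: {error.strerror or error}", file=stream)
        return 1
    return 0
```

Any other exception is deliberately left uncaught here, so that a real bug still produces a stack trace that can be reported.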
Another consequence of a "0.x" version: a bit of roughness around the edges.
> Fortunately, what's there is rich enough and stable enough to think
> about things like catching all the exceptions. And it may only be that
> I see a lot of exceptions because I'm working on a) very long and
> complex documents, and b) actively changing the guts of the system.
(b) I can understand, but (a) shouldn't cause any stack traces (except
perhaps a MemoryError if the document is *that* long!).
--
David Goodger <go...@us...> Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
(includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/