>>>>> Just some minor assumptions--which require some work to
>>>>> check through--prevent us from saying "yes, you can do that".
>>>> The "minor assumptions" you mention are fundamental to the
>>> I think this is the crux of the disagreement.
>>> I do not see that.
>> You do not see what?
> I do not see a necessity for the three processes to always be
> present upon any single invocation.
(I assume you mean the three major components. It's only one
process.)
That's the architecture. It simplifies a lot, because it's not *just*
the three components. The Reader has a source (docutils.io.Input
object) attached to it, the Writer has a destination
(docutils.io.Output object) attached, and all 3 major components
(Reader, Parser, Writer) can specify transforms.
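The architecture described above can be sketched in a few lines of toy Python. This is illustrative only, not the real docutils API (the class and method names here are invented stand-ins), but it mirrors the wiring: a Reader with an attached source, a Writer with an attached destination, and components that can each contribute transforms.

```python
# Illustrative sketch only -- NOT the real docutils API.
# It mirrors the described architecture: a Reader (with an attached
# input source), a Parser, and a Writer (with an attached output
# destination), each of which may contribute transforms.

class Component:
    """Base class; each component can declare transforms it needs."""
    def get_transforms(self):
        return []

class Reader(Component):
    def __init__(self, source):
        self.source = source            # stands in for a docutils.io.Input
    def read(self, parser):
        return parser.parse(self.source)

class Parser(Component):
    def parse(self, text):
        return {"doctree": text.split()}   # stand-in for a real doctree

class Writer(Component):
    def __init__(self, destination):
        self.destination = destination  # stands in for a docutils.io.Output
    def write(self, doctree):
        self.destination.append(" ".join(doctree["doctree"]))

# Wire the three components together, as the architecture prescribes:
# source -> Reader -> Parser -> doctree -> Writer -> destination.
out = []
reader, parser, writer = Reader("hello transform world"), Parser(), Writer(out)
doctree = reader.read(parser)
writer.write(doctree)
```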
>>> To me, separating the stages explicitly makes the architecture
>> Not to me. :-)
>> If you feel strongly about it, start a branch and show us some
>> code.
> Not much needs to be done:
Then start a branch, and do it. I'm not interested. And even if you
*do* start a branch, I'm not guaranteeing that I'll OK it. I want to
see concrete evidence that it's *useful* and *solves a real problem*
first. Hypotheticals aren't enough.
> all we need to do is make the list of transforms separate for
> source/reader/parser and writer/destination. Two lists.
That's fine, from a purely theoretical standpoint. My position is
practical: the transforms have a natural ordering, and that ordering
works. Careful analysis of the transforms themselves tells us what
order they have to be applied in. The priorities of the transforms
are a direct result of this.
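The priority mechanism can be sketched like this (a simplified toy version; the real docutils Transformer works along similar lines, but the class names and priority values here are only illustrative):

```python
# Simplified sketch of priority-ordered transforms.  Names and
# priority numbers are illustrative, not the real docutils values.

class Transform:
    default_priority = 500
    def apply(self, doctree): ...

class Substitutions(Transform):
    default_priority = 220          # early: must run before later passes
    def apply(self, doctree):
        doctree.append("substitutions")

class Contents(Transform):
    default_priority = 720          # late: depends on the resolved tree
    def apply(self, doctree):
        doctree.append("contents")

def apply_transforms(transforms, doctree):
    # Lower priority number runs first; ties keep registration order.
    for t in sorted(transforms, key=lambda t: t.default_priority):
        t.apply(doctree)

doctree = []
# Registration order does not matter: Substitutions (220) still runs
# before Contents (720), because priority determines the ordering.
apply_transforms([Contents(), Substitutions()], doctree)
```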
If dependencies do exist between transforms, any attempt to reorder
them will fail.
If and when something doesn't work, *then* we'll deal with it. I'm
really not interested in hypotheticals.
> With the explicit assumption that what is in-between is
> always the same. (More detailed explanation below.)
> A case where the writer runs a transform intermingled with the
> reader/parser could cause a dependency that is difficult to break
> when running the transforms in two processes.
> Case 1: normal uninterrupted invocation
> config: reader R1, parser P1, writer W1.
> transforms (in order): W1.t1, R1.t1, W1.t2, R1.t2
> Case 2: two-step invocation, storing the doctree in-between:
> step 1, parse the source and store the doctree:
> config: reader R1, parser P1, null writer.
> transforms (in order): R1.t1, R1.t2
> step 2, read the doctree, and then:
> config: dummy reader, writer W1
> transforms (in order): W1.t1, W1.t2
> In case 1, the transforms run are:
> W1.t1, R1.t1, W1.t2, R1.t2
> In case 2, the effect is:
> R1.t1, R1.t2, W1.t1, W1.t2
> Note: the order of the transforms is determined by the priority of
> each transform, and it is possible that a writer transform has a
> higher priority (and so comes first) than a reader transform. The
> example above shows this: W1.t1 comes before R1.t1.
> Right now, there is no guarantee that the changes to the tree made
> by W1.t1 and R1.t1 do not interact. If they do, the results will
> differ, because the transforms run in a different order depending
> on whether we interrupt the conversion or not.
That's hypothetically true. But is there any such case in concrete
code?
If there is, then I see two possibilities:
1) The transform priorities are wrong, or the transforms themselves
are flawed. They should be fixed.
2) The transform priorities are correct. The natural transform
ordering prevents the division you seek. IOW, the dependencies
prevent two-stage processing, and cannot be fixed.
> The whole point of this discussion can be summarized as "a
> guarantee" that you will obtain the SAME result if you convert in
> one step or in two steps. I think that it holds in practice.
Show me a case where this doesn't hold, and then we'll talk.
> There is no guarantee, however, and until we split the list of
> transforms into two lists, one on each side of the point where you
> would interrupt the conversion (e.g. to store the tree in a blob in
> a database, like I'm doing), we cannot make this guarantee. By
> splitting the list of transforms in two, we allow a point at which
> we know the tree is independent of the configuration of the writer.
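The proposed split can be sketched as follows. This is a toy illustration of the idea, not the real docutils API; the function names and priority numbers are invented:

```python
# Sketch of the proposed split: keep reader/parser transforms and
# writer transforms in two separate lists, guaranteeing everything
# in the first list runs before anything in the second.  Names and
# priorities are illustrative only.

def apply(transforms, doctree):
    # Within each list, lower priority number still runs first.
    for priority, transform in sorted(transforms, key=lambda p: p[0]):
        transform(doctree)
    return doctree

def to_storage_form(reader_transforms, doctree):
    # Phase 1: only reader/parser transforms -- the resulting tree is
    # writer-independent and safe to store (e.g. in a database blob).
    return apply(reader_transforms, doctree)

def to_output_form(writer_transforms, doctree):
    # Phase 2: writer transforms run later, possibly in another process.
    return apply(writer_transforms, doctree)

doctree = []
reader_side = [(100, lambda d: d.append("R1.t1")),
               (300, lambda d: d.append("R1.t2"))]
writer_side = [(50,  lambda d: d.append("W1.t1")),
               (200, lambda d: d.append("W1.t2"))]
to_storage_form(reader_side, doctree)
to_output_form(writer_side, doctree)
# W1.t1 has the lowest priority overall (50), but the split still
# forces it to run after both reader transforms.
```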
We make no claim of offering any such guarantee! Docutils was never
designed to do two-pass processing. Either it works, or it doesn't.
If it works, we have nothing to talk about. If it doesn't, we have
one of the two cases above: either it can be fixed, or it can't. If
it can be fixed, we'll fix it. If it can't, too bad.
I have no evidence that the system doesn't work, either in regular or
multiple-pass processing. Show me some concrete evidence otherwise,
and I'll change my tune.
> Do you see any good reason not to split the list of transforms into
> two lists, to ensure that the transforms in the first list are
> always run before all the transforms in the second list?
Yes: it works as it is now. Without a real, concrete example of it
not working, it's not worth the effort.
>>> I'm sure you've got good reasons for maintaining this standpoint,
>>> I'm just not sure what they are.
>> See http://docutils.sf.net/docs/peps/pep-0258.html#docutils-project-model
> This graph itself does not tell the whole story.
No, of course not. But it does tell the story of the data path.
There's a clear entry point for input, and a clear exit point for
output. That model works well. Without it, Docutils would probably
not exist today.
> The Transformer stage depends on the configuration of both the
> reader and the writer, by way of the list of transforms which is
> ordered by its priority number.
> Choose a different writer, and you might get a different doctree
> in-between. Not a problem if you know a-priori where the output
> should go, but it *may* be a problem if you store in-between and
> might output to different media.
"Might" and "may" aren't enough to warrant reimplementing a subsystem
that works just fine now. Show it to be "do" and "is" first.
>>> The problem is: there is no guarantee that in the future someone
>>> will create some kind of ordering or other dependency between a
>>> particular writer and a particular reader, and that it will break
>>> applications which choose to do a two-step process to convert a
>>> document.
>> Although that would probably be considered a bug, it just proves my
>> point. In any case, let's deal with real problems if and when they
>> occur, and not waste time worrying about or designing around
>> hypothetical future problems.
> Sure, it just works now, but the fact that you do not acknowledge
> the existence of a quirk in the design only stimulates more
> questioning and debate.
I don't see any design quirks. I see a design that tackles a
real-world problem, and real-world problems are sometimes dirty. To a
certain extent Docutils grew organically. I thought long and hard
about the initial design of Docutils, and it has evolved since then.
The addition of multi-pass processing functionality is just another
stage in its evolution.
And BTW, I'm not defending the design because it's mine, or refusing
to own up to design quirks that I'm emotionally attached to. If a
real problem does present itself, we'll tackle it. If it requires a
redesign, so be it. I've thrown away a lot of my own code in the
past, and I'm sure I'll throw away a lot more in the future. But not
without good reason.
I'm perfectly willing to see more, and more radical, evolution of the
Docutils architecture, **iff** it's warranted.
In any case, I've had enough of debate on this issue. Show me the
evidence (input data and output results, clearly showing the problem),
or give me peace!
> I hope the example I give above makes clear precisely what my
> concern is with the potential problem that could occur.
Yes, thank you for detailing the issue. Up until now, I haven't
understood what the real issue was.
I agree that if there really is a problem, we should fix it. But as
far as I know, it works just fine now, and I see no reason to fix
it.

David Goodger <http://python.net/~goodger>