From: David A. <DavidA@ActiveState.com> - 2002-12-10 23:55:55
|
Is there an option I don't know about to let docutils barrel through
and never raise an exception when processing a document?

I'm all for strictness in formatting in 99% of cases, but I'd like to
use docutils to convert old documents written in "vague structured
text" into HTML, in cases where there is no reason to go back to the
original texts to make them conform.

It appears that the failures I'm encountering in these old corpora are
things like:

    When referencing an external web page in the body of a SPEC, you
    should include the title of the page in the text, with a footnote
    reference to the URL. Do not include the URL in the body text of
    the SPEC. E.g.

        Refer to the Python Language web site [1] for more details.
        ...
        [1] http://www.python.org

Where the ... is interpreted as a heading marker -- if I change the
"."'s to "1"'s, docutils has no problems.

Note that while this is from an old PEP, I have other documents which
have the same idiom:

    # if unknown message -> do mime parsing, regex rules, eval tests, etc.
    ...

Maybe we could make ... a special case?

Thoughts?

--david |
From: David G. <go...@py...> - 2002-12-11 01:24:40
|
David Ascher wrote:
> Is there an option I don't know about to let docutils barrel through
> and never raise an exception when processing a document?

"Never" may not be possible, but "almost never" is. Use the
"--halt none" option or ``settings.halt_level = 5``. Similarly for
"--report" if you don't even want to *see* the warnings (which I
don't recommend).

> Where the ... is interpreted as a heading marker -- if I change the
> "."'s to "1"'s, docutils has no problems.
>
> Note that while this is from an old PEP, I have other documents
> which have the same idiom:
>
> # if unknown message -> do mime parsing, regex rules, eval tests, etc.
> ...
>
> Maybe we could make ... a special case?

It is already a special case, actually. A line of punctuation marks
is recognized as a title underline if it is flush left and as long as
the title text or longer. If the line of punctuation marks is shorter
than the title text, but at least 4 characters long, it will be
recognized as an underline but a warning is generated. If it is 3
characters long or shorter, it is *not* recognized as a title
underline, and an info-level system message is generated. Info-level
messages are normally filtered out of final output (use "--report" to
adjust). To illustrate::

    $ tools/publish.py << EOF
    > blahblah
    > ...
    > EOF
    <document source="<stdin>">
        <system_message level="1" line="2" source="<stdin>" type="INFO">
            <paragraph>
                Possible title underline, too short for the title.
                Treating it as ordinary text because it's so short.
        <paragraph>
            blahblah
            ...

But then I tried your example, in which the "..." is indented in a
block quote. You've discovered a bug in the parser. Expect a fix in
a day or two.

-- 
David Goodger <go...@py...>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/ |
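The underline rule described in the message above can be sketched in a few lines of Python. This is only an illustration of the rule as stated there, not the docutils parser itself: the function name `classify_underline` and the returned labels are hypothetical, and the definition of "a line of punctuation marks" (one ASCII punctuation character repeated) is an assumption.

```python
import string


def classify_underline(title: str, candidate: str) -> str:
    """Classify the line under `title` following the rule quoted above.

    Hypothetical helper, not part of docutils. Returns one of:
    "underline", "underline-with-warning", "text-with-info", "text".
    """
    line = candidate.rstrip()
    # Assumption: a "line of punctuation marks" is a single ASCII
    # punctuation character repeated. It must also be flush left, so a
    # leading space fails the first-character check.
    if not line or line[0] not in string.punctuation:
        return "text"
    if line != line[0] * len(line):
        return "text"

    if len(line) >= len(title.rstrip()):
        # As long as the title text or longer: a normal underline.
        return "underline"
    if len(line) >= 4:
        # Shorter than the title but at least 4 characters: accepted,
        # with a warning.
        return "underline-with-warning"
    # 3 characters or fewer: ordinary text, info-level message only.
    return "text-with-info"


if __name__ == "__main__":
    for under in ("........", "....", "...", "  ..."):
        print(repr(under), "->", classify_underline("blahblah", under))
```

Under these assumptions, the flush-left "..." beneath "blahblah" lands in the info-level branch, which matches the transcript in the message. The sketch treats any indented line as plain text and does not model block quotes, the case in which the parser bug was found.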
From: David G. <go...@py...> - 2002-12-12 02:56:46
|
I wrote:
> But then I tried your example, in which the "..." is indented in a
> block quote. You've discovered a bug in the parser. Expect a fix
> in a day or two.

This was an easy one; I just had to move a conditional. Bug fixed in
CVS and snapshot: http://docutils.sf.net/docutils-snapshot.tgz

-- 
David Goodger <go...@py...> |
From: Beni C. <cb...@te...> - 2002-12-11 09:22:01
|
[New to the list, rST rocks :]

On 2002-12-10, David Ascher wrote:

> Is there an option I don't know about to let docutils barrel through
> and never raise an exception when processing a document? I'm all for
> strictness in formatting in 99% of cases, but I'd like to use
> docutils to convert old documents written in "vague structured text"
> into HTML, in cases where there is no reason to go back to the
> original texts to make them conform.
>
I don't think that skipping errors is the right approach, since it can
guess wrongly what you meant. DWIMs are known to make people sorry for
using them :-). However, the cycle of running docutils and fixing the
source is not very convenient when you are in a hurry. TeX's approach
of stopping and suggesting interactive fix possibilities could be more
efficient (if the messages are less cryptic ;-) but doesn't save the
corrections in any way. So how about a ``--halt=edit`` mode that will
spawn your text editor on the line where the first error happened and,
after you exit the editor, restart from the beginning (or maybe
continue if the input file's timestamp hasn't changed)?

-- 
Beni Cherniavsky <cb...@tx...>

What's lower level than machine code? A spreadsheet: not only
addresses are numeric and hand-allocated but also all loops are
hand-unrolled and all calls hand-inlined... (and macros are unheard
of, of course). |
From: David A. <DavidA@ActiveState.com> - 2002-12-11 17:53:15
|
Beni Cherniavsky wrote:

> I don't think that skipping errors is the right approach since it can
> guess wrongly what you meant. DWIMs are known to make people sorry for
> using them :-). However the cycle of running docutils and fixing the
> source is not very convenient when you are in a hurry. TeX's approach of
> stopping and suggesting interactive fix possibilities could be more
> efficient (if the messages are less cryptic ;-) but doesn't save the
> corrections in any way. So how about a ``--halt=edit`` mode that will
> spawn your text editor on the line where the first error happened and
> after you exit the editor, it will restart from the beginning (or maybe
> continue if the input file's timestamp hasn't changed)?

None of this helps me, as the docutils phase is run by a cron job at 4
in the morning. Trust me, I know what I'm doing =).

--david |
From: Beni C. <cb...@te...> - 2002-12-11 21:43:03
|
On 2002-12-11, David Ascher wrote:

> Beni Cherniavsky wrote:
>
> > I don't think that skipping errors is the right approach since it can
> > guess wrongly what you meant. DWIMs are known to make people sorry for
> > using them :-). However the cycle of running docutils and fixing the
> > source is not very convenient when you are in a hurry. TeX's approach of
> > stopping and suggesting interactive fix possibilities could be more
> > efficient (if the messages are less cryptic ;-) but doesn't save the
> > corrections in any way. So how about a ``--halt=edit`` mode that will
> > spawn your text editor on the line where the first error happened and
> > after you exit the editor, it will restart from the beginning (or maybe
> > continue if the input file's timestamp hasn't changed)?
>
> None of this helps me, as the docutils phase is run by a cron job at 4
> in the morning. Trust me, I know what I'm doing =).
>
Oh, now I understand =). But I could use an edit mode in any case;
maybe I'll write one.

-- 
Beni Cherniavsky <cb...@tx...> |
From: David G. <go...@py...> - 2002-12-12 02:59:01
|
[Beni Cherniavsky]
> [New to the list, rST rocks :]

Welcome, and thanks!

[David Ascher]
>> Is there an option I don't know about to let docutils barrel
>> through and never raise an exception when processing a
>> document?

[Beni Cherniavsky]
> I don't think that skipping errors is the right approach since it
> can guess wrongly what you meant. DWIMs are known to make people
> sorry for using them :-).

I agree, which is why the defaults are set up as they are. I do not
recommend changing them permanently. Suppressing errors and warnings
is treating the symptom, not the cause. In this case, the cause was a
bug; suppressing the errors would just mask the bug. Posting bug
reports on SourceForge or this list gives much better results.

> However the cycle of running docutils and fixing the source is not
> very convenient when you are in a hurry. TeX's approach of stopping
> and suggesting interactive fix possibilities could be more efficient
> (if the messages are less cryptic ;-) but doesn't save the
> corrections in any way. So how about a ``--halt=edit`` mode that
> will spawn your text editor on the line where the first error
> happened and after you exit the editor, it will restart from the
> beginning (or maybe continue if the input file's timestamp hasn't
> changed)?

I think the complexity of such a feature would outweigh its
usefulness. Docutils' error output ("filename:lineno: message") is
already designed to be compatible with the output of many GNU tools,
which have support in Emacs and probably in other tools as well.

-- 
David Goodger <go...@py...> |
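Because the error output follows the "filename:lineno: message" shape mentioned above, an external script can pick out the file and line to jump to without any support from docutils. A minimal sketch, assuming only that shape: the names `Diagnostic` and `parse_diagnostic` are hypothetical, and the sample line in the demo is made up for illustration rather than copied from real docutils output.

```python
import re
from typing import NamedTuple, Optional


class Diagnostic(NamedTuple):
    """One parsed "filename:lineno: message" line (hypothetical type)."""
    filename: str
    lineno: int
    message: str


# Non-greedy filename so the first ":<digits>:" group is taken as the
# line number; this also tolerates a drive-letter colon in the path.
_PATTERN = re.compile(r"^(?P<filename>.+?):(?P<lineno>\d+):\s*(?P<message>.*)$")


def parse_diagnostic(line: str) -> Optional[Diagnostic]:
    """Return the parsed diagnostic, or None if the line has another shape."""
    match = _PATTERN.match(line.rstrip("\n"))
    if match is None:
        return None
    return Diagnostic(match["filename"], int(match["lineno"]), match["message"])


if __name__ == "__main__":
    # Made-up sample line, used only to exercise the parser.
    sample = "old_pep.txt:12: Title underline too short."
    print(parse_diagnostic(sample))
```

A wrapper along the lines of the proposed edit mode could feed each stderr line through such a parser and open an editor at the first reported location, which keeps that logic outside docutils itself.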