From: David G. <go...@py...> - 2002-11-12 01:52:24
Mikhail Sobolev wrote:
> Having downloaded the 0.2 version, I tried to write a simple program
> to convert from text to HTML.  It turned out to be not that easy.
> Fortunately, in the mailing list archives I found a suggestion to
> download the snapshot and proceed with the publish_string helper.
> This works great, thank you.

Thank you for checking the archives!

> I have two questions, however.
>
> It looks like I cannot get only the body of the text (what is located
> between <body> ... </body>) without some additional programming,

Correct.  You'll need a specialized Writer component.  Take a look at
the files in http://docutils.sf.net/sandbox/oliverr/ht/ .  This seems
to be a common requirement for people, so a custom HTML-body-only
Writer could be useful.  I don't know what to do about the DocTitle
transform in this case though (in docutils/transforms/frontmatter.py).

> nor is it possible to get rid of the use of stylesheets at all.

I'm not sure what you mean by this or what you want.  Please
elaborate.  The html4css1.py Writer is designed to use a stylesheet,
as recommended by the latest HTML specs.  If you want HTML that
doesn't require a stylesheet at all, a new Writer would be needed.

> The second comes from my rather extensive use of Outlook (yes, a
> Microsoft product) "highlighting".  In cases when the path or the
> file name contains spaces, it's very convenient to just enclose the
> whole construction in angle brackets (like,
> <schema://some path/with/spaces/and a file>), and you do not really
> have to worry about converting those to %20.  What would you say
> about such a feature?

According to RFC 2396 "Uniform Resource Identifiers (URI): Generic
Syntax", spaces are not valid URI/URL characters.  It does say this:

    In some cases, extra whitespace (spaces, linebreaks, tabs, etc.)
    may need to be added to break long URI across lines.  The
    whitespace should be ignored when extracting the URI.
    ...
    Using <> angle brackets around each URI is especially recommended
    as a delimiting style for URI that contain whitespace.

The syntax you propose would conflict with this, especially if the
MS-style URL were to break across lines:

    <http://www.example.com/a/very/long/
    path/broken/across/lines>

Is the whitespace after "long/" significant or not?  The RFC says it's
not.  The reStructuredText parser also joins long multi-line URLs in
targets.  I wouldn't mind adding the ability to join broken URLs in
free text as well, if surrounded by brackets.

So the answer to your question is, I think I'd say no thanks.
Whitespace in URLs is a pain; I think it's better just to avoid it.

--
David Goodger  <go...@py...>    Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/
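
As a rough illustration of the RFC 2396 advice quoted above, the sketch
below extracts angle-bracketed URIs from free text and discards any
whitespace inside the brackets.  It is not Docutils code: the function
name extract_bracketed_uris and the regular expression are assumptions
made up for this example, and the pattern is deliberately loose (a
scheme, a colon, then anything up to the closing bracket).

```python
import re

# Hypothetical helper, not part of Docutils.  Matches "<scheme:...>",
# where the part after the colon may span several lines.
BRACKETED_URI = re.compile(r"<([A-Za-z][A-Za-z0-9+.-]*:[^<>]*)>")


def extract_bracketed_uris(text):
    """Return bracketed URIs with all inner whitespace removed."""
    return ["".join(match.group(1).split())
            for match in BRACKETED_URI.finditer(text)]


sample = """See <http://www.example.com/a/very/long/
path/broken/across/lines> for details."""

print(extract_bracketed_uris(sample))
```

Under this rule the line break after "long/" simply disappears, which
is the behaviour the RFC describes.  The same rule applied to
<schema://some path/with/spaces/and a file> would also remove the
spaces that were meant to be part of the path, which is the conflict
described in the reply.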