From: Handzsuj,Thomas <tho...@dr...> - 2002-11-08 10:34:21
|
Hi,

I have a suggestion for using ht2html together with docutils
(http://docutils.sourceforge.net/).

The advantage is that I have to deal only with simple and readable ASCII
files.

The problem is that docutils creates normal html files. The information in
the header of these files (e.g. title, charset) is lost if I use them as ht
files.

To transfer this information I wrote a simple Parser based on SGMLParser. It
collects information from the body and the get_.. functions take this
information. (My generator is based on PDOGenerator.py)

Here it is:

    class extraParser(SGMLParser):
        def __init__(self):
            SGMLParser.__init__(self)
            self.text = ''
            self.startRec = 0

        def start_title(self, attrs):
            self.startRec = 1

        def end_title(self):
            self.title = self.text
            self.startRec = 0

        def start_meta(self, attrs):
            for k, v in attrs:
                if k == 'content' and v.find('charset') >= 0:
                    self.charset = v.split('=')[1]

        def handle_data(self, text):
            if self.startRec:
                self.text += text
            else:
                self.text += ''

It is used after reading the body (in __grokbody of my generator):

    self.__extraParser.feed(text)
    self.__extraParser.close()

Information is used like this:

    def get_title(self):
        try:
            return self.__extraParser.title
        except AttributeError:
            return self.__parser.get('title')

Best regards / Mit freundlichen Grüßen

Thomas Handzsuj
Design and Development Intensive Care
----------------------------------------------------
DRÄGER MEDICAL
Dräger Medical AG & Co. KGaA
Moislinger Allee 53-55
D-23542 Lübeck
Tel: + 49-451-882-1524
Fax: + 49-451-882-71524 (PC)
Fax: + 49-451-882-2856 (paper print)
-----------------------------------------------------
|
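For readers on Python 3, where the sgmllib module is no longer available, the same idea (pull the title and charset out of the HTML that docutils produced) might be sketched with html.parser.HTMLParser from the standard library. This is only an illustration, not the code from the message above: the names HeadInfoParser and head_info are hypothetical, and it assumes the charset appears in a meta tag's content attribute as "...; charset=NAME".

```python
from html.parser import HTMLParser


class HeadInfoParser(HTMLParser):
    """Collect the <title> text and the charset named in a <meta> tag.

    Hypothetical helper; not part of ht2html or docutils.
    """

    def __init__(self):
        super().__init__()
        self.title = None
        self.charset = None
        self._in_title = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            # Start recording character data until </title>.
            self._in_title = True
            self._chunks = []
        elif tag == "meta":
            for name, value in attrs:
                # Assumed form: content="text/html; charset=utf-8"
                if name == "content" and value and "charset=" in value:
                    self.charset = value.split("charset=", 1)[1].strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self.title = "".join(self._chunks).strip()
            self._in_title = False

    def handle_data(self, data):
        # Text outside <title> is ignored.
        if self._in_title:
            self._chunks.append(data)


def head_info(html_text):
    """Return (title, charset); either may be None if not found."""
    parser = HeadInfoParser()
    parser.feed(html_text)
    parser.close()
    return parser.title, parser.charset
```

A generator could call head_info(text) once after reading the body and fall back to the .ht header value whenever the returned title is None.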
From: David G. <go...@py...> - 2002-11-08 21:39:39
|
Handzsuj,Thomas wrote:
> I have a suggestion for using ht2html together with docutils
> (http://docutils.sourceforge.net/)
>
> The advantage is that I have to deal only with simple and readable ASCII
> files.

I agree. There has been some interest in integrating the two recently.
Oliver Rutherfurd has written a preliminary Writer component for .ht output:
http://docutils.sourceforge.net/sandbox/oliverr/ht/

> The problem is that docutils creates normal html files. The information in
> the header of these files (e.g. title, charset) is lost if I use them as ht
> files.
>
> To transfer this information I wrote a simple Parser based on SGMLParser. It
> collects information from the body and the get_.. functions take this
> information.

I think this is a roundabout way of doing the job. Please take a look at
the link above. It produces real .ht files, with the title in the RFC822
header. The encoding could go there too (assuming it's ASCII-compatible).
I don't know how encoding-friendly ht2html is.

Ideally, I'd like to see a system which programmatically controls both
Docutils and ht2html, so that intermediate files don't have to be created.
I haven't looked into the ht2html code enough yet to know if this is
feasible.

-- 
David Goodger <go...@py...>
Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
  (includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/
|
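To make the "title in the RFC822 header" layout concrete, here is a minimal sketch of splitting such a file into headers and HTML body with the standard email module. It assumes only what is described above (RFC822-style headers, a blank line, then HTML). The function name split_ht and the "Charset" header are hypothetical illustrations, and this is not a description of how ht2html itself reads its input.

```python
from email import message_from_string


def split_ht(ht_text):
    """Split .ht-style text into (headers dict, HTML body string).

    Assumes RFC822-style "Name: value" lines, a blank line, then the body.
    """
    msg = message_from_string(ht_text)
    headers = dict(msg.items())
    body = msg.get_payload()
    return headers, body


# Hypothetical sample; "Charset" is an assumed header name, not a documented one.
sample = (
    "Title: My Page\n"
    "Charset: utf-8\n"
    "\n"
    "<h1>Hello</h1>\n"
    "<p>Body text.</p>\n"
)

headers, body = split_ht(sample)
```

With the metadata carried in the header block like this, nothing has to be recovered from the HTML head afterwards.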
From: Fred L. D. Jr. <fd...@ac...> - 2002-11-08 22:28:35
|
David Goodger writes:
> I agree. There has been some interest in integrating the two recently.
> Oliver Rutherfurd has written a preliminary Writer component for .ht output:
> http://docutils.sourceforge.net/sandbox/oliverr/ht/

Cool!

> I think this is a roundabout way of doing the job. Please take a look at
> the link above. It produces real .ht files, with the title in the RFC822
> header. The encoding could go there too (assuming it's ASCII-compatible).
> I don't know how encoding-friendly ht2html is.

I doubt it is, but it could be made more so.

> Ideally, I'd like to see a system which programmatically controls both
> Docutils and ht2html, so that intermediate files don't have to be created.
> I haven't looked into the ht2html code enough yet to know if this is
> feasible.

Is the goal to have the content portion of .ht files be reST, or to
allow HT2HTML to start with reST content to begin with?

I'm sure either could be done, but using .ht files with reST for the
body would be the easier of the two.

  -Fred

-- 
Fred L. Drake, Jr. <fdrake at acm.org>
PythonLabs at Zope Corporation
|
From: David G. <go...@py...> - 2002-11-08 23:12:30
|
Fred L. Drake, Jr. wrote:
> Is the goal to have the content portion of .ht files be reST, or to
> allow HT2HTML to start with reST content to begin with?

What's the distinction?

Actually, the Docutils component that Oliver wrote produces ordinary .ht
intermediate files: RFC822 headers plus regular HTML. It's an ht2html
"Writer". The way it's written now, one would have to run Docutils and
then ht2html. It would almost require a Makefile. Not ideal.

But it should be easy to make an ht2html "Reader" (Docutils code calls
ht2html) or a generator module for ht2html (which would call Docutils
code). One of these days I'd like to actually *use* ht2html, and dig
through the code; then I'll have a better idea of which side should be
"in control".

> I'm sure either could be done, but using .ht files with reST for the
> body would be the easier of the two.

Yes, they would look *exactly* like reStructuredText PEP source files.

-- 
David Goodger <go...@py...>
Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
  (includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/
|
From: <ba...@zo...> - 2002-11-08 22:41:30
|
>>>>> "DG" == David Goodger <go...@py...> writes:

    DG> I don't know how encoding-friendly ht2html is.

Probably not at all. Or to put it another way, it's completely ignorant of
all such issues.

-Barry
|