From: abel d. <ade...@gm...> - 2002-10-23 22:16:22
Bob Woodside wrote:
> On Wed, 23 Oct 2002 12:59:45 +0200
> abel deuring <a.d...@sa...> wrote:
>
>>As far as I understand wvWare, the approach is to convert every Word
>>doc paragraph into an HTML paragraph, i.e., that thingy that starts
>>with "<p>". Font attributes defined in the Word doc should be converted
>>into corresponding HTML tags -- but the font size is still missing
>>(or I missed the corresponding tag in the XML conversion spec
>>files and the sources...)
>
> Yes, after I sent my message, I took a quick glance through the code,
> and I don't see anywhere that a size or font-size attribute is being
> handled. (I certainly don't pretend to have acquired a significant
> understanding of the code in 15 minutes, but this is my first
> impression.)
>
>>I think that the approach to convert every Word doc paragraph into a
>><p> tag with more or less complete font specs is reasonable, because
>>you can define arbitrarily named style sheets in Word docs, so wvWare
>>cannot use something like a comprehensive style sheet list in order to
>>map the style sheets to HTML tags.
>
> I think I need to do a little research into the Word doc format(s)
> :-(. In general, I think it's reasonable to have a <p> tag with an
> inline style for each paragraph. But that doesn't quite catch everything
> that's needed, since an arbitrary substring of a paragraph (like a
> single word or phrase) might have different font attributes set in the
> document - italic, bold, different point size, etc. One could argue the
> relative merits of using styles for this sort of thing as well (by
> including an inline style in the <font> tag), but wv seems already to
> have logic to do it with the old-fashioned <font> attributes. Well, it
> picks up color information this way, but not size or face.

Most if not all of how the wvWare output looks is specified in XML
files; there you could easily switch between <font> and <style> tags.
(Well, aside from the problem of different units used for the font size
in the <style> and <font> tags.)

>>This bothered me a bit too, so I
>>wrote a little Python script which does some post-processing of wvWare
>>output. It can replace <p> tags with things like <h1 class="xyz">. Of
>>course you need to specify how Word styles are mapped to HTML tags.
>>I am not sure if this script fits your need, because its
>>concept is to keep only structural information from Word files and to
>>remove anything explicitly presentation specific, like font
>>attributes. Anyway, you can find it here:
>>http://www.zope.org/Members/abel/wvZopeFilter . (don't panic about
>>Zope -- while I use the script with a Zope server, you can use it as a
>>standalone program ;)
>
> This looks neat for some applications, but for my purposes the inline
> styles (or traditional <font> attributes) would be fine. I was hoping wv
> might be a quick and dirty solution for a client who hasn't a strong
> grasp of the difference between a Word document and an HTML document
> (curse you, M'soft!), and would like to be able to upload a Word
> document to a Web site and have it magically be a Web page. But it looks
> like wv needs a bit of work before it can be used this way.

I'm afraid that you're right, if your client insists that he/she wants
to specify every little design aspect of a web page...

Abel
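
The unit problem mentioned above (font size in a style versus in a
<font> tag) can be sketched in Python. This is only an illustration, not
wvWare code. It assumes the font size is available in half-points, the
unit Word's binary format uses for character size. The table for
<font size="1"> to <font size="7"> is a common approximation, since HTML
does not define exact sizes for these values. All function and constant
names are made up for this example.

```python
# Nominal point sizes for <font size="1"> .. <font size="7">.
# Assumption: a widely used approximation, not defined by any standard.
FONT_SIZE_PT = [8, 10, 12, 14, 18, 24, 36]


def half_points_to_pt(half_points):
    """Convert a Word font size in half-points to points."""
    return half_points / 2.0


def css_font_size(half_points):
    """Return an inline CSS declaration, e.g. 'font-size: 10.5pt'."""
    return "font-size: %gpt" % half_points_to_pt(half_points)


def legacy_font_size(half_points):
    """Return the <font size> value (1..7) closest to the given size."""
    pt = half_points_to_pt(half_points)
    # Pick the table entry with the smallest distance to the real size.
    index = min(range(len(FONT_SIZE_PT)),
                key=lambda i: abs(FONT_SIZE_PT[i] - pt))
    return index + 1


def open_tag(half_points, use_style=True):
    """Build the opening tag for a run of text with the given size."""
    if use_style:
        return '<span style="%s">' % css_font_size(half_points)
    return '<font size="%d">' % legacy_font_size(half_points)


if __name__ == "__main__":
    print(open_tag(21))                    # style variant keeps 10.5pt
    print(open_tag(21, use_style=False))   # <font> variant must round
```

The style variant keeps the exact size, while the <font> variant has to
round to one of seven steps, so a round trip through <font> loses
information.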