From: Bob W. <mad...@wo...> - 2002-10-22 20:46:11
|
Newbie Alert: I just discovered wvWare, and after installing 0.7.2 I tried testing conversion of a few simple Word docs to HTML. I find that no FONT or Hn tags are produced in the resulting HTML to reflect changes in text size in the Word document.

Then I tried running wvHtml on the example file, supported-font-features.doc, in the feature-examples directory, with the same result: no font-size or heading-level formatting, everything is all the same size. The colors come out OK (there is a font tag that overrides the "black" color spec in the inline paragraph style), but nowhere is there any size specification. The resulting HTML bears no resemblance to the converted version, supported-font-features.doc.html, in that directory.

Does anyone have any idea what might be wrong, or where to look? Is there possibly some configuration option or adjunct package requirement I may have overlooked? (I'm running on a Linux box, with libiconv-1.7, expat-1.95.5, and libwmf-0.2.7-Darwin.) Or is this stuff known not to be working currently (and if not, can anyone else duplicate the problem with this file)?

Cheers,
Bob |
From: F J F. <F.J...@sh...> - 2002-10-23 08:48:57
|
> Newbie Alert: I just discovered wvWare, and after installing 0.7.2 I
> tried testing conversion of a few simple Word docs to HTML. I find that
> no FONT or Hn tags are produced in the resulting HTML to reflect changes
> in text size in the Word document.
>
> Then I tried running wvHtml on the example file,
> supported-font-features.doc, in the feature-examples directory, with the
> same result: no font-size or heading-level formatting, everything is all
> the same size. The colors come out OK (there is a font tag that
> overrides the "black" color spec in the inline paragraph style), but
> nowhere is there any size specification. The resulting HTML bears no
> resemblance to the converted version, supported-font-features.doc.html,
> in that directory.
>
> Does anyone have any idea what might be wrong, or where to look? Is
> there possibly some configuration option or adjunct package requirement
> I may have overlooked? (I'm running on a Linux box, with libiconv-1.7,
> expat-1.95.5, and libwmf-0.2.7-Darwin.) Or is this stuff known not to be
> working currently (and if not, can anyone else duplicate the problem
> with this file)?

libwmf-0.2.7-Darwin, huh? oh, well, if it works...

Let me warn you here and now that there are no experts when it comes to wv. Most questions go unanswered here. (It doesn't help that the principal developer got unsubscribed from the list by SF for no discernible reason.)

That said, I do read most messages, even if I have no clue what to do about most of them.

Frank

Francis James Franklin
F.J...@sh...

`Medium atomic weights are available: Gold, Lead, Copper, Jet, Diamond,
Radium, Sapphire, Silver and Steel.
`Sapphire and Steel have been assigned...' |
From: abel d. <a.d...@sa...> - 2002-10-23 10:40:17
|
> [Bob Woodside:]
> > Newbie Alert: I just discovered wvWare, and after installing 0.7.2 I
> > tried testing conversion of a few simple Word docs to HTML. I find that
> > no FONT or Hn tags are produced in the resulting HTML to reflect changes
> > in text size in the Word document.

Bob,

As far as I understand wvWare, the approach is to convert every Word doc paragraph into an HTML paragraph, i.e., that thingy that starts with "<p>". Font attributes defined in the Word doc should be converted into corresponding HTML tags -- but the font size really is still missing (or I missed the corresponding tag in the XML conversion spec files and the sources...)

I think that the approach of converting every Word doc paragraph into a <p> tag with more or less complete font specs is reasonable, because you can define arbitrarily named style sheets in Word docs, so wvWare cannot use something like a comprehensive style sheet list in order to map the style sheets to HTML tags.

This bothered me a bit too, so I wrote a little Python script which does some post-processing of wvWare output. It can replace <p> tags with things like <h1 class="xyz">. Of course you need to specify how Word styles are mapped to HTML tags. I am not sure whether this script fits your needs, because its concept is to keep only structural information from Word files and to remove anything explicitly presentation-specific, like font attributes. Anyway, you can find it here: http://www.zope.org/Members/abel/wvZopeFilter . (don't panic about Zope -- while I use the script with a Zope server, you can use it as a standalone program ;)

> > Then I tried running wvHtml on the example file,
> > supported-font-features.doc, in the feature-examples directory, with the
> > same result: no font-size or heading-level formatting, everything is all
> > the same size. The colors come out OK (there is a font tag that
> > overrides the "black" color spec in the inline paragraph style), but
> > nowhere is there any size specification. The resulting HTML bears no
> > resemblance to the converted version, supported-font-features.doc.html,
> > in that directory.
> >
> > Does anyone have any idea what might be wrong, or where to look? Is
> > there possibly some configuration option or adjunct package requirement
> > I may have overlooked? (I'm running on a Linux box, with libiconv-1.7,
> > expat-1.95.5, and libwmf-0.2.7-Darwin.) Or is this stuff known not to be
> > working currently (and if not, can anyone else duplicate the problem
> > with this file)?

> F J Franklin wrote:
> libwmf-0.2.7-Darwin, huh? oh, well, if it works...
>
> Let me warn you here and now that there are no experts when it comes to
> wv. Most questions go unanswered here. (It doesn't help that the
> principal developer got unsubscribed from the list by SF for no
> discernible reason.)

That's a pity...

Abel |
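The kind of post-processing described in the message above (mapping Word styles to structural HTML tags and dropping presentational markup) could be sketched roughly as follows. This is not the wvZopeFilter code. It assumes, purely for illustration, that each paragraph in the converter output carries its Word style name in a class attribute; real wvWare output may look different. The names STYLE_MAP, remap_paragraphs and strip_font_tags are hypothetical.

```python
import re

# Hypothetical mapping from Word style names to (HTML tag, CSS class).
# The user has to supply this, since Word styles can be named arbitrarily.
STYLE_MAP = {
    "Heading 1": ("h1", "title"),
    "Heading 2": ("h2", "section"),
}

# Assumed input shape (an illustration, not documented wvWare output):
#   <p class="STYLE NAME">...</p>
PARA_RE = re.compile(r'<p class="([^"]*)">(.*?)</p>', re.DOTALL)

# Matches opening and closing <font> tags, whatever their attributes.
FONT_RE = re.compile(r'</?font\b[^>]*>', re.IGNORECASE)


def remap_paragraphs(html, style_map=STYLE_MAP):
    """Replace <p class="style"> with the tag mapped to that style."""
    def repl(match):
        style, body = match.group(1), match.group(2)
        if style not in style_map:
            # Unknown style: leave the paragraph untouched.
            return match.group(0)
        tag, css_class = style_map[style]
        return '<%s class="%s">%s</%s>' % (tag, css_class, body, tag)
    return PARA_RE.sub(repl, html)


def strip_font_tags(html):
    """Remove <font> tags but keep the text they enclose."""
    return FONT_RE.sub("", html)
```

A regular expression is enough for a sketch like this, but it will not cope with nested or malformed markup; a real filter would more likely use an HTML parser.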
From: Bob W. <mad...@wo...> - 2002-10-23 21:53:42
|
On Wed, 23 Oct 2002 12:59:45 +0200
abel deuring <a.d...@sa...> wrote:

> As far as I understand wvWare, the approach is to convert every Word
> doc paragraph into an HTML paragraph, i.e., that thingy that starts
> with "<p>". Font attributes defined in the Word doc should be converted
> into corresponding HTML tags -- but the font size really is still
> missing (or I missed the corresponding tag in the XML conversion spec
> files and the sources...)

Yes, after I sent my message, I took a quick glance through the code, and I don't see anywhere that a size or font-size attribute is being handled. (I certainly don't pretend to have acquired a significant understanding of the code in 15 minutes, but this is my first impression.)

> I think that the approach to convert every Word doc paragraph into a
> <p> tag with more or less complete font specs is reasonable, because
> you can define arbitrarily named style sheets in Word docs, so wvWare
> cannot use something like a comprehensive style sheet list in order to
> map the style sheets to HTML tags.

I think I need to do a little research into the Word doc format(s) :-(. In general, I think it's reasonable to have a <p> tag with an inline style for each paragraph. But that doesn't quite catch everything that's needed, since an arbitrary substring of a paragraph (like a single word or phrase) might have different font attributes set in the document - italic, bold, different point size, etc. One could argue the relative merits of using styles for this sort of thing as well (by including an inline style in the <font> tag), but wv seems already to have logic to do it with the old-fashioned <font> attributes. Well, it picks up color information this way, but not size or face.

> This bothered me too a bit, so I
> wrote a little Python script which does some post-processing of wvWare
> output. It can replace <p> tags with things like <h1 class="xyz">. Of
> course you need to specify how Word styles are mapped to HTML tags.
> I am not sure whether this script fits your needs, because its
> concept is to keep only structural information from Word files and to
> remove anything explicitly presentation-specific, like font
> attributes. Anyway, you can find it here:
> http://www.zope.org/Members/abel/wvZopeFilter . (don't panic about
> Zope -- while I use the script with a Zope server, you can use it as a
> standalone program ;)

This looks neat for some applications, but for my purposes the inline styles (or traditional <font> attributes) would be fine. I was hoping wv might be a quick and dirty solution for a client who hasn't a strong grasp of the difference between a Word document and an HTML document (curse you, M'soft!), and would like to be able to upload a Word document to a Web site and have it magically be a Web page. But it looks like wv needs a bit of work before it can be used this way.

Is anyone actively maintaining the code at this point, or is it somewhat in limbo?

Cheers,
Bob |
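The point made above, that a paragraph-level style cannot capture a single word or phrase with its own formatting, can be illustrated with a small sketch that renders a paragraph as a sequence of runs, each with optional character attributes. This is not how wv is implemented. The run dictionaries and the names render_run and render_paragraph are hypothetical, and the sketch uses <span> with inline styles rather than <font> attributes.

```python
from html import escape


def render_run(text, bold=False, italic=False, color=None, size_pt=None):
    """Render one run of text, wrapping it in a styled <span> if needed."""
    css = []
    if bold:
        css.append("font-weight: bold")
    if italic:
        css.append("font-style: italic")
    if color:
        css.append("color: %s" % color)
    if size_pt:
        css.append("font-size: %gpt" % size_pt)
    body = escape(text)
    if not css:
        # No character formatting: emit the escaped text as-is.
        return body
    return '<span style="%s">%s</span>' % ("; ".join(css), body)


def render_paragraph(runs, para_style=""):
    """Render a list of run dicts as one <p>, with an optional inline style."""
    inner = "".join(render_run(**run) for run in runs)
    if para_style:
        return '<p style="%s">%s</p>' % (para_style, inner)
    return "<p>%s</p>" % inner
```

For example, `render_paragraph([{"text": "plain "}, {"text": "big", "bold": True, "size_pt": 14}])` yields a paragraph in which only the second run is wrapped in a styled span.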
From: F J F. <F.J...@sh...> - 2002-10-24 08:23:08
|
> This looks neat for some applications, but for my purposes the inline
> styles (or traditional <font> attributes) would be fine. I was hoping wv
> might be a quick and dirty solution for a client who hasn't a strong
> grasp of the difference between a Word document and an HTML document
> (curse you, M'soft!), and would like to be able to upload a Word
> document to a Web site and have it magically be a Web page. But it looks
> like wv needs a bit of work before it can be used this way.

Ah. Word and HTML. Hmm. I loathe the HTML that Word produces. Still, getting them to save as doc *and* save as HTML is possibly the only sensible solution.

> Is anyone actively maintaining the code at this point, or is it
> somewhat in limbo?

Not quite in limbo. It gets patched from time to time. Most recent version always to be found in AbiWord's CVS repository. Martin Junius has submitted some nice stuff recently to do with XML output. So. But development is not fast.

Best way to solve problems like this one is to fix it yourself...

Frank

Francis James Franklin
F.J...@sh...

`Medium atomic weights are available: Gold, Lead, Copper, Jet, Diamond,
Radium, Sapphire, Silver and Steel.
`Sapphire and Steel have been assigned...' |
From: abel d. <ade...@gm...> - 2002-10-23 22:16:22
|
Bob Woodside wrote:
> On Wed, 23 Oct 2002 12:59:45 +0200
> abel deuring <a.d...@sa...> wrote:
>
>>As far as I understand wvWare, the approach is to convert every Word
>>doc paragraph into an HTML paragraph, i.e., that thingy that starts
>>with "<p>". Font attributes defined in the Word doc should be converted
>>into corresponding HTML tags -- but the font size really is still
>>missing (or I missed the corresponding tag in the XML conversion spec
>>files and the sources...)
>
> Yes, after I sent my message, I took a quick glance through the code,
> and I don't see anywhere that a size or font-size attribute is being
> handled. (I certainly don't pretend to have acquired a significant
> understanding of the code in 15 minutes, but this is my first
> impression.)
>
>>I think that the approach to convert every Word doc paragraph into a
>><p> tag with more or less complete font specs is reasonable, because
>>you can define arbitrarily named style sheets in Word docs, so wvWare
>>cannot use something like a comprehensive style sheet list in order to
>>map the style sheets to HTML tags.
>
> I think I need to do a little research into the Word doc format(s)
> :-(. In general, I think it's reasonable to have a <p> tag with an
> inline style for each paragraph. But that doesn't quite catch everything
> that's needed, since an arbitrary substring of a paragraph (like a
> single word or phrase) might have different font attributes set in the
> document - italic, bold, different point size, etc. One could argue the
> relative merits of using styles for this sort of thing as well (by
> including an inline style in the <font> tag), but wv seems already to
> have logic to do it with the old-fashioned <font> attributes. Well, it
> picks up color information this way, but not size or face.

Most if not all of how the wvWare output looks is specified in XML files; there you could easily switch between <font> and <style> tags. (Well, aside from the problem of the different units used for the font size in the <style> and <font> tags.)

>> This bothered me too a bit, so I
>>wrote a little Python script which does some post-processing of wvWare
>>output. It can replace <p> tags with things like <h1 class="xyz">. Of
>>course you need to specify how Word styles are mapped to HTML tags.
>>I am not sure whether this script fits your needs, because its
>>concept is to keep only structural information from Word files and to
>>remove anything explicitly presentation-specific, like font
>>attributes. Anyway, you can find it here:
>>http://www.zope.org/Members/abel/wvZopeFilter . (don't panic about
>>Zope -- while I use the script with a Zope server, you can use it as a
>>standalone program ;)
>
> This looks neat for some applications, but for my purposes the inline
> styles (or traditional <font> attributes) would be fine. I was hoping wv
> might be a quick and dirty solution for a client who hasn't a strong
> grasp of the difference between a Word document and an HTML document
> (curse you, M'soft!), and would like to be able to upload a Word
> document to a Web site and have it magically be a Web page. But it looks
> like wv needs a bit of work before it can be used this way.

I'm afraid that you're right, if your client insists that he/she wants to specify every little design aspect of a web page...

Abel |
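The unit mismatch mentioned above can be made concrete with a small sketch. It assumes the converter has the character size available in half-points (the unit the Word binary format uses for font size; whether and how wv exposes it is not established in this thread). CSS can take a point value directly, whereas the <font size> attribute only accepts the steps 1 to 7, so the sketch picks the nearest step from a table of nominal point sizes. That table is a common browser convention rather than a guarantee, and both function names are hypothetical.

```python
# Nominal point sizes commonly associated with <font size="1"> .. "7".
# Browsers are free to render these differently; treat as an approximation.
FONT_SIZE_POINTS = [8, 10, 12, 14, 18, 24, 36]


def half_points_to_css(half_points):
    """Return a CSS declaration for a size given in half-points."""
    points = half_points / 2
    return "font-size: %gpt" % points


def half_points_to_font_size(half_points):
    """Return the <font size> step (1..7) nearest to the given size."""
    points = half_points / 2
    nearest = min(range(len(FONT_SIZE_POINTS)),
                  key=lambda i: abs(FONT_SIZE_POINTS[i] - points))
    return nearest + 1
```

The CSS form preserves the exact size (21 half-points becomes 10.5pt), while the <font size> form is lossy, which is one argument for inline styles if size fidelity matters.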
From: Bob W. <mad...@wo...> - 2002-10-23 20:25:50
|
On Wed, 23 Oct 2002 09:48:15 +0100 (BST)
F J Franklin <F.J...@sh...> wrote:

> Let me warn you here and now that there are no experts when it comes
> to wv. Most questions go unanswered here. (It doesn't help that the
> principal developer got unsubscribed from the list by SF for no
> discernible reason.)

That's a shame...do you mean Dom? After a tedious search for legitimate posts among the spam on the archive, I was beginning to wonder if he'd graduated, gotten preoccupied with mundane things like finding a job in a dismal economy, and pretty much left the project an orphan. Is anyone actually in charge of wv at the moment?

In any case, it's good to know that there are at least a couple of people to bounce ideas off of.

Cheers,
Bob |