From: Dom L. <ci...@ho...> - 2002-04-08 12:22:54
Kris,

Please resend with sample documents that don't work attached to the
bugreport. Otherwise, no doc == no bug.

Dom

>From: Kris Koskelin <kr...@an...>
>To: wvw...@li...
>Subject: [Wvware-users] wvHtml problem with Word2000 docs
>Date: Tue, 02 Apr 2002 15:34:50 -0600
>
>Forgive me if this is somewhere in the docs or in the mailing list
>archives, but I've not found an answer to my problem yet.
>
>I've been attempting to convert some Word 2000 documents to HTML using
>wvWare utilities v.0.7.1 (also tried 0.7.0) without success. An attempt of
>    'wvHtml filename.doc results.html'
>will create a file results.html size 0 bytes, and the following is printed
>to STDERR:
>    "Diagnostic: (./wvWare.c:389) "
>(with no error message/number after it)
>
>If I were a bit more of a C hacker, I would try to fix this problem, but
>all I can tell is that the file format is unrecognized as a Word 2000
>document. The command
>    'wvVersion filename.doc'
>yields the error
>    "filename.doc couldn't be opened as any known word document"
>
>Despite this error, I am indeed able to open the file without error in Word
>2000. Only after saving the file to Word 95 format is wvWare able to parse
>it - and the resulting Word 95 document has lost some formatting, among
>which are line graphics and boxes used in a flowchart.
>
>I do not need these graphics converted, but only want the text of the
>document. Surely the presence of these graphics would not prevent the
>document format from being properly detected?
>
>I appreciate any help or feedback!
>
>Regards,
>
>
>--
>Kris Koskelin
>kr...@an...
>
>
>_______________________________________________
>Wvware-users mailing list
>Wvw...@li...
>https://lists.sourceforge.net/lists/listinfo/wvware-users
From: Kris K. <kr...@an...> - 2002-04-08 14:11:09
The document that was giving me problems was NOT in fact created by me, but
by a user who claimed that it was a Word 2000 document. Upon further
inspection of this document, it was revealed that despite being saved with
a ".doc" extension, the file was in fact an RTF file.

Since I do not own Office 2000, I am not aware of the default file-type. I
thought it would be word 9 version by default, but these were RTF
nonetheless. It may be outside the scope of wvHtml, but perhaps a hook
could be made to look for "{\rtf" in the first 5 bytes of the document if
all other document identification schemes should fail? The only reason I
suggest this is that all other attempts to identify the document seemed to
suggest that it WAS word-9 format. Or perhaps that is something best left
out of the project, but I should be happy to allow the developers to make
this choice.

Thank you for your assistance, and a great library.

--Kris Koskelin

--
Kris Koskelin
kr...@an...
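The check suggested above, looking for "{\rtf" in the first 5 bytes, can be sketched roughly as follows. This is an illustration in Python, not wvWare code, and the names `sniff_format`, `RTF_MAGIC` and `OLE2_MAGIC` are hypothetical. The second signature is the standard OLE2 compound-file header that binary Word documents are normally stored in; matching it says nothing about which Word version wrote the file.

```python
RTF_MAGIC = b"{\\rtf"                            # the 5 bytes "{\rtf"
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")   # OLE2 compound-file header


def sniff_format(path):
    """Classify a file by its leading bytes only, ignoring the extension.

    Returns "rtf", "ole2", or "unknown". Hypothetical helper for
    illustration; it does not parse or validate the document.
    """
    with open(path, "rb") as f:
        head = f.read(8)  # long enough for either signature
    if head.startswith(RTF_MAGIC):
        return "rtf"
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    return "unknown"
```

A file that reports "rtf" here despite a ".doc" extension would be in the same situation as the document described in this message.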
From: Ian B. <ia...@co...> - 2002-04-08 16:15:10
There are other (entirely separate) libraries and utilities for dealing
with RTF files -- it would be best for you to check for {\rtf yourself,
and call whatever utility is appropriate.
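One way to follow this advice is a small wrapper that checks for {\rtf itself and then picks a converter. The sketch below is only an illustration under stated assumptions: `pick_converter` is a hypothetical name, and `rtf-to-html` is a placeholder for whatever RTF utility is actually installed, not a real program. The `wvHtml` argument order is the one used earlier in this thread.

```python
def pick_converter(path, out_path, rtf_tool="rtf-to-html"):
    """Return the argv list for the converter that suits the file.

    rtf_tool is a placeholder (assumption): substitute the name and
    arguments of a real RTF converter. Nothing is executed here.
    """
    with open(path, "rb") as f:
        is_rtf = f.read(5) == b"{\\rtf"  # RTF files begin with "{\rtf"
    if is_rtf:
        return [rtf_tool, path, out_path]
    # Anything else is handed to wvHtml, invoked as in the original report.
    return ["wvHtml", path, out_path]
```

The returned list could then be passed to `subprocess.run(..., check=True)`; returning the command instead of running it keeps the choice of RTF tool, and its real arguments, with the caller.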