Klaus,
The most important part is that htdig is set up with the correct mime-type mappings.
I would suggest that you check the headers actually returned by your server. Options include the 'Live HTTP Headers' extension in Firefox, a direct telnet to port 80, or the verbose options of something like wget.
That would eliminate any remaining confusion over what mime-type is being used - as you can see, there is a degree of faith required that the web server picks the 'correct' mime-type in each case!
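
If none of those tools is to hand, the same check can be scripted. The following is only a minimal Python sketch, not part of htdig or doc2html: it sends a HEAD request and reports the mime-type from the Content-Type header. The function names are my own invention for illustration.

```python
import urllib.request


def media_type(content_type_header):
    """Return the bare mime-type from a Content-Type header value."""
    # Drop parameters such as "; charset=..." and normalise the case.
    return content_type_header.split(";", 1)[0].strip().lower()


def served_mime_type(url):
    """Send a HEAD request and return the mime-type the server reports."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return media_type(response.headers.get("Content-Type", ""))
```

Calling served_mime_type() with the URL of one of your WordPerfect documents should show exactly what htdig will be told about that file.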
 
If it does turn out to be returning application/octet-stream, that needs to be fixed. Otherwise I would just put a line in your config file for every possible variation of the WordPerfect mime-type. As you spotted, doc2html makes its own attempt to identify which kind of document it has been given, and that check is independent of the mime-type.
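
To illustrate what "a line for every variation" could look like, here is a rough Python sketch that prints an external_parsers entry covering several WordPerfect mime-types. It assumes the usual "type->text/html converter" layout of that attribute. The path to doc2html.pl is hypothetical, and only the first two type names come from the /etc/mime.types quoted below; the other two are variants I am assuming you might meet.

```python
# Hypothetical install path for doc2html.pl: adjust to your system.
CONVERTER = "/usr/local/bin/doc2html.pl"

# The first two names appear in the /etc/mime.types quoted in this thread;
# the last two are assumed variants, included for illustration only.
WORDPERFECT_TYPES = [
    "application/wordperfect",
    "application/wordperfect5.1",
    "application/vnd.wordperfect",
    "application/x-wordperfect",
]


def external_parsers(mime_types, converter=CONVERTER):
    """Build one external_parsers attribute with an entry per mime-type."""
    entries = [f"{mime}->text/html {converter}" for mime in mime_types]
    # Continuation lines end with a backslash, as in other htdig attributes.
    return "external_parsers: " + " \\\n                  ".join(entries)


if __name__ == "__main__":
    print(external_parsers(WORDPERFECT_TYPES))
```

Whatever list you end up with, the names only matter in so far as one of them matches what the server really sends.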
 
Hope that helps,
Mike 
 
 


From: htdig-general-bounces@lists.sourceforge.net [mailto:htdig-general-bounces@lists.sourceforge.net] On Behalf Of Klaus Bömken
Sent: Sunday, May 06, 2007 7:14 PM
To: htdig-general@lists.sourceforge.net
Subject: Re: [htdig] doc2html unable to convert wordperfect

That is a point that I am very uncertain about. Here is what I can tell after reading more of the documentation.

This is part of my /usr/share/file/magic:

>#WordPerfect type files Version 1.6 - PLEASE DO NOT REMOVE THIS LINE
>0       string  \377WPC\020\000\000\000\022\012\001\001\000\000\000\000 (WP) loadable text
>>15     byte    0       Optimized for Intel
>>15     byte    1       Optimized for Non-Intel
>1       string  WPC     (Corel/WP)
>>8      short   257     WordPerfect macro
(and so on)

This is part of my /etc/mime.types:

>application/wordperfect                         wpd
>application/wordperfect5.1                      wp5

Now if I use the file command:

>file mswordfile.doc
>mswordfile.doc: Microsoft Office Document

>file -i mswordfile.doc
>mswordfile.doc: application/msword

That's what I would expect. But:

>file wordperfectfile.wpd
>wordperfectfile.wpd: (Corel/WP)

>file -i wordperfectfile.wpd
>wordperfectfile.wpd: application/octet-stream

... seems wrong to me.

On the other hand: how does doc2html identify file-types?

I'm not familiar with Perl, but it seems to me that the Perl script does some sort of comparison against "\377".

In this respect:

>od -b wordperfectfile.wpd | head
>0000000 377 127 120 103 365 007 000 000 001 012 002 001 000 000 000 002

... seems correct to me.
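
The od output can be checked against the \377WPC signature from the magic file: octal 377 127 120 103 is the byte 0xFF followed by the letters W, P, C. A minimal Python sketch of that comparison follows. It is not doc2html's actual Perl code, and the function names are made up for illustration.

```python
# Signature taken from the magic entries above: octal \377, then "WPC".
WP_MAGIC = b"\xffWPC"


def od_bytes(od_line):
    """Convert one line of `od -b` output into the bytes it describes."""
    fields = od_line.split()
    # The first field is the file offset; the rest are octal byte values.
    return bytes(int(field, 8) for field in fields[1:])


def looks_like_wordperfect(data):
    """Return True if the data starts with the WordPerfect signature."""
    return data.startswith(WP_MAGIC)


if __name__ == "__main__":
    line = "0000000 377 127 120 103 365 007 000 000 001 012 002 001 000 000 000 002"
    print(looks_like_wordperfect(od_bytes(line)))
```

A check of this kind depends only on the file contents, so it can succeed even when the reported mime-type is application/octet-stream.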

Would this be the information you asked about? What can I do next?

Thanks for your help,
Klaus


>-------in reply to
Re: [htdig] doc2html unable to convert wordperfect

michael.brockington
Fri, 04 May 2007 01:58:35 -0700

Have you got any idea what mime-type your server is putting out for these 
documents?
The key is that what you have in your config file must match what your server 
is saying, regardless of whether either of them is technically correct.

Regards,
Mike
 

> -----Original Message-----
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On 
> Behalf Of Klaus Bömken
> Sent: Wednesday, May 02, 2007 7:56 PM
> To: htdig-general@lists.sourceforge.net
> Subject: [htdig] doc2html unable to convert wordperfect
> 

<snip>

> 
> I read some stuff about mime-types that I don't understand.
> Is it a matter of WordPerfect versions?
> 
> Any hints?
> 
> Klaus
> 