From: Fridrich S. <fri...@bl...> - 2011-04-24 05:18:01
Thanks, Edward,

On 24/04/2011 07:05, Edward Mendelson wrote:
> 1. Quite a few WP characters have no unicode equivalents, and there is no
> way to fix that.

Yeah, as they say in Swahili: "Maisha ndyvio, alivio" or as we say here in socialist Europe, "C'est la vie" :) Indeed, we cannot do much about this apart from approximating wherever it seems useful.

> 2. In TEST.WP (the WP5.1 file), 6,56 through 6,234 didn't convert at all;
> but these characters are correctly converted in the WP6.x CHARACT6.DOC.
> You evidently have different tables for 5.x and 6+, and I think you can
> simply copy the 6,56 through 6,234 mappings from the 6+ table to the 5.x
> table.

Will do that this week. It will then need one more comparison run, because there might be subtle differences we would like to catch.

> 3. In the converted CHARACT6.DOC, I think it may be possible to add these:
> 2,44 seems to be 0361
> 2,45 seems to be 035C
> 4,100 seems to be 1D11E
> 4,101 seems to be 1D122

Don't know what to do with these ones though. We store the conversion results as UCS2. Let me see what we can do.

> 6,83 seems to be 2A38
> 9,83 seems to be 05AA

Will correct this too. Do you mind being credited as the author of the contribution when I commit it to git?

> I'll have to check the Hebrew tomorrow, but since I don't know any Hebrew,
> I'll be guessing.

I did something similar with the WP5 charset: comparing pictures :) In the end it might be fun :)

> I finally tested the Arabic WP 5.1 files from this page:
> http://www.un.org/popin/unpopcom/32ndsess/gass.htm
> wpd2odt says they are not WordPerfect files. Apparently libwpd doesn't
> handle documents created by Arabic WP5.1 or Hebrew WP5.1.

Let me see those ones; I remember I was able to somehow read some of them. Nevertheless, I have already seen documents on UN web sites that were declared to be WP documents when in reality they were Word documents.

Cheers

F.
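
To illustrate the UCS2 limitation mentioned above: a UCS-2 code unit is 16 bits wide, so it can only hold code points up to U+FFFF, whereas two of the proposed values (1D11E and 1D122) lie above that range and would need a UTF-16 surrogate pair. The sketch below is not libwpd code; the table name, its layout and the helper functions are hypothetical, and only the code point values are taken from the message.

```python
# Hypothetical excerpt: (WP character set, character number) -> Unicode code point.
# The values are those proposed in the thread; the table itself is illustrative only.
PROPOSED_MAPPINGS = {
    (2, 44): 0x0361,
    (2, 45): 0x035C,
    (4, 100): 0x1D11E,
    (4, 101): 0x1D122,
    (6, 83): 0x2A38,
    (9, 83): 0x05AA,
}


def fits_in_ucs2(code_point):
    """Return True if the code point fits in a single 16-bit UCS-2 unit."""
    is_surrogate = 0xD800 <= code_point <= 0xDFFF
    return 0 <= code_point <= 0xFFFF and not is_surrogate


def to_utf16_units(code_point):
    """Encode one code point as a list of UTF-16 code units (one or two)."""
    if code_point <= 0xFFFF:
        return [code_point]
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits go into the high surrogate
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits go into the low surrogate
    return [high, low]


# Entries that cannot be stored in a UCS-2 table as a single unit.
needs_surrogates = sorted(
    key for key, cp in PROPOSED_MAPPINGS.items() if not fits_in_ucs2(cp)
)

if __name__ == "__main__":
    for key in sorted(PROPOSED_MAPPINGS):
        cp = PROPOSED_MAPPINGS[key]
        units = " ".join(f"{u:04X}" for u in to_utf16_units(cp))
        print(f"WP {key[0]},{key[1]} -> U+{cp:04X} -> UTF-16 units: {units}")
```

Under these assumptions only 4,100 and 4,101 end up in `needs_surrogates`; the other four values fit in a single 16-bit unit.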