[Wvware-devel] wvConvert, bidi, and xml

In my explorations of Hebrew Doc files and the wv sources I have found
that:

* A paragraph is RTL (right-to-left) if it is preceded by the sprm
  sprmPFBiDi with the argument value 1.

I would like the argument of sprmPFBiDi to make the HTML output
contain DIR=LTR if sprmPFBiDi=0 and DIR=RTL if sprmPFBiDi=1.
From checking the xml files I understand that I can do this through
something like:

    <paragraph>
    <begin>
    &lt;p PROPS=&quot;text-align:<just/>&quot; DIR=<bididir/>&gt;
    </begin>
    <end>
    &lt;/p&gt;
    </end>
    </paragraph>

(The above was taken out of wvConfig.xml). But what I don't understand
is how to set the <bididir/> xml variable. I understand that this is
done in wvConfig.c, but I don't understand the syntax. Was it you,
Don, who wrote it? Could you explain how it works?
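
To make the intent concrete, here is a minimal sketch in C of the
mapping I have in mind. The function names bidi_dir and open_paragraph
are hypothetical, made up for this mail; they are not taken from
wvConfig.c, and I am only assuming that the sprmPFBiDi argument is
available as an int at the point where the paragraph tag is written.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of wv): map the sprmPFBiDi argument
 * to the value of the HTML DIR attribute. */
const char *bidi_dir(int fBidi)
{
    return fBidi ? "RTL" : "LTR";
}

/* Hypothetical helper (not part of wv): write the opening <p> tag that
 * the <begin> template above would expand to. Returns what snprintf
 * returns, i.e. the number of characters the full tag needs. */
int open_paragraph(char *buf, size_t len, const char *just, int fBidi)
{
    assert(buf != NULL && just != NULL);
    return snprintf(buf, len, "<p PROPS=\"text-align:%s\" DIR=%s>",
                    just, bidi_dir(fBidi));
}
```

With this, open_paragraph(buf, sizeof buf, "right", 1) would fill buf
with an opening tag carrying DIR=RTL. What I am missing is where the
equivalent of bidi_dir has to be hooked in for <bididir/>.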

I think I can also fill in some of the question marks in sprm.c. As I
already understood from the behaviour of Word, it stores separate
attributes for the LTR and the RTL fonts. The RTL font is known to
Word as a "BiDi font". Therefore there are separate SPRMs to change
the attributes of the BiDi font. These are all the entries with BiDi
or Bi in their names. The following are some of my guesses at the
meaning of these attributes.

    sprmPFBiDi         Change the BiDi direction of the paragraph
    sprmCFBoldBi       Set the BiDi font to bold
    sprmCFBiDi         Change the BiDi face
    sprmCHpsBi         ??
    sprmSFBiDi         ??? What is SF?
    sprmTFBiDi         ??? What is TF?
    sprmCFItalicBi     Make the BiDi font italic.
    sprmCFtcBi         ??? What is tc? sprmCFtcDefault is just doing bold...
    sprmCLidi          Change Bidi Lid
    sprmCIcoBi         ???
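
The model I am guessing at can be sketched in C as follows. This is
only an illustration of the "two sets of font attributes" idea: the
struct and enum names are hypothetical, and this is not the layout wv
actually uses for character properties.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical attribute set for one font. */
struct font_attrs {
    bool bold;
    bool italic;
};

/* Hypothetical character properties: one attribute set per direction. */
struct char_props {
    struct font_attrs ltr;   /* ordinary left-to-right font */
    struct font_attrs bidi;  /* what Word calls the "BiDi font" */
};

/* Toy stand-ins for the real sprm codes. */
enum toy_sprm { TOY_CFBold, TOY_CFItalic, TOY_CFBoldBi, TOY_CFItalicBi };

/* Apply one toy sprm: the *Bi variants touch only the BiDi font. */
void apply_toy_sprm(struct char_props *cp, enum toy_sprm op, bool value)
{
    assert(cp != NULL);
    switch (op) {
    case TOY_CFBold:     cp->ltr.bold = value;    break;
    case TOY_CFItalic:   cp->ltr.italic = value;  break;
    case TOY_CFBoldBi:   cp->bidi.bold = value;   break;
    case TOY_CFItalicBi: cp->bidi.italic = value; break;
    }
}
```

Under this guess, applying the toy equivalent of sprmCFBoldBi makes
only the BiDi font bold and leaves the LTR font untouched.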

I still have to figure out how HTML 4.0 handles different fonts for
RTL and LTR. My guess is that it only supports one font at a time,
which is fine if you have a Unicode font. In my first attempt I would
just make CFBiDi equivalent to CFBi and see what happens.

Another error that I ran into came up when running wvHtml on the
file this_is_bold.html, whereupon I got this error message three times:

    wvWarning: There is no paragraph due to open but one should 
    be, plugging the gap.

The source says about this error in decode_complex.c:

    "if there's no paragraph open, but there should be then I believe
    that the fcFirst search has failed me, so I set it to now. I need
    to investigate this further. I believe it occurs when a the last
    piece ended simultaneously with the last paragraph, and that the
    algorithm for finding the beginning of a para breaks under that
    condition. I need more examples to be sure, but it happens is very
    large complex files so its hard to find"

Well, now it has happened in a small example. Btw, what it does to the
translated file is that it ends the font attributes too early.

If this is the doc file:

    <p>
    <red>This is red<br>
    And so is this<br>
    And this is too</red>
    </p>

Then it is translated into HTML as follows:

    <p>
    <red>This is red</red><br>
    And so is this<br>
    And this is too
    </p>

That is, the red attribute is applied to the first line only.

As a reminder, the file this_is_red.html that I mentioned above is
available from http://imagic.weizmann.ac.il/~dov/freesw/wv .

Regards,
Dov