#5 xml2msi failed to read Japanese chars

closed
xml2msi (2)
9
2012-07-18
2002-01-11
Yi Ye
No

I have an MSI database with Japanese strings in it
(good.msi). When I convert it to an XML file (good.xml),
the resulting XML file is correct and contains the Japanese
strings. When I convert the XML file back to MSI
(bad.msi), though, all the Japanese characters are
corrupted in the resulting MSI file. This can be
verified by converting the new MSI file to another XML
file (bad.xml): this time, the XML file contains the
corrupted Japanese strings.

Here is what I did:
- good.msi (msi2xml)-> good.xml OK
- good.xml (xml2msi)-> bad.msi Corrupted
- bad.msi (msi2xml)-> bad.xml Corrupted
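
One way to confirm where good.xml and bad.xml diverge is a byte-wise comparison. The following is a minimal sketch, not part of msi2xml or xml2msi; `firstDifference` is a hypothetical helper, and it assumes the two files would otherwise be identical (any unrelated difference, such as generated metadata, would be reported first).

```cpp
#include <fstream>
#include <string>

// Returns the zero-based offset of the first differing byte,
// -1 if the files are byte-for-byte identical,
// or -2 if either file cannot be opened.
long long firstDifference(const std::string& pathA, const std::string& pathB) {
    std::ifstream a(pathA, std::ios::binary);
    std::ifstream b(pathB, std::ios::binary);
    if (!a || !b) return -2;

    long long offset = 0;
    for (;;) {
        const int ca = a.get();
        const int cb = b.get();
        if (ca != cb) return offset;  // also covers one file ending early
        if (ca == std::char_traits<char>::eof()) return -1;  // both ended
        ++offset;
    }
}
```

Calling `firstDifference("good.xml", "bad.xml")` and inspecting the bytes around the returned offset should show whether the first mismatch falls inside a Japanese string.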

I did some debugging with your source code. It seems
to be a problem with the Microsoft XML Parser. The bug
appears at line 497 of xml2msi.cpp:

_bstr_t bstrData(pTd->nodeTypedValue);

The Japanese characters in pTd->nodeTypedValue, obtained by
GetnodeTypedValue(), are already corrupted. I also
downloaded and installed XML Parser 4.0 and modified
the xml2msi code to use it, but that doesn't solve
the problem. I don't know what else I can do to
address this. Any suggestions?

The file is attached. My email is yi.ye@citrix.com.
Thanks in advance,

Yi

Discussion

  • Yi Ye

    Yi Ye - 2002-01-11

    Zip file containing the 4 files mentioned.

     
  • Yi Ye

    Yi Ye - 2002-01-11

    This one has msi.xsl. The old one doesn't.

     
  • Daniel Gehriger

    Daniel Gehriger - 2002-01-11

    I don't think it's a bug in MSXML. The converted XML files
    contain an encoding declaration at the top, which I set to
    UTF-8. This means that I should check all text to ensure
    it's valid UTF-8, and otherwise put it into a
    <![CDATA[ ... ]]> section. Alternatively, I could change the
    encoding to UTF-16.
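
The validity check mentioned above could look roughly like the following. This is a minimal sketch of a well-formedness test for UTF-8, not code from msi2xml; the function name `isValidUtf8` is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Returns true if `s` is well-formed UTF-8. Rejects stray continuation
// bytes, truncated sequences, overlong forms, UTF-16 surrogates, and
// code points above U+10FFFF.
bool isValidUtf8(const std::string& s) {
    // Smallest code point that legitimately needs a sequence of each length.
    static const unsigned long minCp[] = {0, 0, 0x80, 0x800, 0x10000};

    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        unsigned long cp = 0;
        if (b0 < 0x80)                { len = 1; cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
        else return false;            // continuation or invalid lead byte

        if (i + len > n) return false;  // sequence cut off at end of input
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp < minCp[len]) return false;               // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate range
        if (cp > 0x10FFFF) return false;                 // beyond Unicode
        i += len;
    }
    return true;
}
```

Japanese text stored in a legacy encoding such as Shift-JIS generally fails this check, which is one way text labeled as UTF-8 can turn out not to be.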

    Please try this:

    In the files template.xml and template_dt.xml, change UTF-8
    to UTF-16, then recompile everything. Let me know if this
    fixes the problem. In any case, I'll probably keep UTF-8 and
    use a <![CDATA[ section. You are welcome to implement this
    yourself and send me the patch if you feel comfortable
    with the source code of msi2xml; otherwise let me know.

     
  • Yi Ye

    Yi Ye - 2002-01-11

    Thanks for your quick response. Really appreciate it.

    I tried your suggestion. I couldn't find UTF-8 in the two XML
    files, but I did find it in the two XSL files. After I changed
    UTF-8 to UTF-16 in the XSL files and ran msi2xml on my
    good.msi, the resulting good.xml still had UTF-8 in it, and
    the problem is still there. Did I do anything wrong?

    I guess I will just wait for your CDATA change, then. I am
    sorry, but I really don't have the expertise to implement
    that change. I do not know a lot about XML. I will try to
    do it, but I am not optimistic about the outcome. Thanks a lot
    for your help.

     
  • Shamsul Shaikh

    Shamsul Shaikh - 2002-01-14

    I have tried the steps in both Japanese (W2K SP2, MSXML 3.0
    SP2, MSI2XML 1.2.1) and English (W2K SP2, MSXML 4.0, WI
    2.0, MSI2XML 1.2.1) environments. The problem happens only
    in the English environment, as Yi mentioned in his
    report. It does not occur in the Japanese
    environment, but after converting the XML back to MSI,
    the code page has to be changed (in this case, to 932 for
    Japanese) before the Japanese characters can be viewed
    during installation of the MSI; refer to bug 502029.
    In the English environment, the Japanese characters are not
    displayed correctly even if the code page is corrected.
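
The observation that the corruption shows up only in the English environment is consistent with text being squeezed through a single-byte system code page somewhere along the way, though the thread does not confirm where this happens. The sketch below illustrates that kind of loss. It uses Latin-1 as a stand-in for an English ANSI code page, and `toLatin1Lossy` is a hypothetical helper, not part of xml2msi.

```cpp
#include <string>

// Narrows UTF-16 text to a single-byte encoding (Latin-1 here, as an
// assumed stand-in for an English ANSI code page). Any code unit the
// target cannot represent becomes '?', so the original character is
// lost for good. A surrogate pair yields two '?' characters.
std::string toLatin1Lossy(const std::u16string& text) {
    std::string out;
    out.reserve(text.size());
    for (char16_t unit : text) {
        if (unit <= 0xFF) {
            out.push_back(static_cast<char>(static_cast<unsigned char>(unit)));
        } else {
            out.push_back('?');  // not representable in a single byte
        }
    }
    return out;
}
```

With this helper, ASCII text such as `u"MSI"` survives unchanged, while the two-character Japanese string `u"\u65E5\u672C"` comes back as two question marks. Changing the code page afterwards cannot restore characters that were already replaced, which would match the behavior reported for the English environment.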

     
  • Daniel Gehriger

    Daniel Gehriger - 2002-01-14

    Fixed in release 1.3.0. A new "encoding" attribute of the
    <msi> tag holds the database encoding.

    Note that you must use the Unicode version of
    msi2xml/xml2msi if you need Unicode character support.

     