Oriental language problem

Help
grandiose
2009-08-14
2013-05-15
  • grandiose
    2009-08-14

    package com.skcc.ilm.examples.Indexing;

    import com.ximpleware.VTDGen;

    // This example shows how to create a VTD+XML index.
    public class indexWrite {
        public static void main(String[] args) {
            try {
                VTDGen vg = new VTDGen();
                // Parse once and reuse the result instead of parsing the file twice.
                boolean parsed = vg.parseFile("D:/eclipse-ilm/workspace/ximpleware_2.6_java/files/CONTRACT_SERVICE.xml", true);
                System.out.println(parsed);
                if (parsed) {
                    // The recommended extension is .vxl.
                    vg.writeIndex("D:/eclipse-ilm/workspace/ximpleware_2.6_java/files/CONTRACT_SERVICE.vxl");
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    Source XML format:
    <?xml version="1.0" encoding="euc-kr"?> or <?xml version="1.0"?>
    <test>korean chars</test>
    With either declaration, vg.parseFile returns false.

    When we changed the encoding declaration to "utf-8", parseFile threw a ParseException:
    com.ximpleware.ParseException: UTF 8 encoding error: should never happen
    at com.ximpleware.VTDGen$UTF8Reader.handleUTF8(VTDGen.java:53
    at com.ximpleware.VTDGen$UTF8Reader.getChar(VTDGen.java:504)
    at com.ximpleware.VTDGen.getCharAfterSe(VTDGen.java:1204)
    at com.ximpleware.VTDGen.parse(VTDGen.java:189
    at com.ximpleware.VTDGen.parseFile(VTDGen.java:2387)

    How can we parse Korean XML files, or XML files in other oriental languages?

     
    • jimmy zhang
      2009-08-14

      Can you write down a list of the oriental languages that you think need to be supported?
      We will investigate and get back to you.
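
A possible workaround for the EUC-KR file in the question, until the parser handles that encoding itself: the "UTF 8 encoding error" most likely appears because only the declaration was changed to "utf-8" while the bytes in the file were still EUC-KR. The sketch below converts the bytes as well. It uses only the JDK; the class name Utf8Transcoder and the method toUtf8 are hypothetical helpers and not part of VTD-XML, and it assumes the JDK in use ships the EUC-KR charset and that the input really is EUC-KR (malformed bytes are silently replaced during decoding).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of VTD-XML.
class Utf8Transcoder {

    // Decodes raw bytes with sourceCharset, rewrites the encoding named in
    // the XML declaration (if any) to UTF-8, and re-encodes the text as UTF-8.
    static byte[] toUtf8(byte[] raw, String sourceCharset) {
        String text = new String(raw, Charset.forName(sourceCharset));
        String fixed = text.replaceFirst(
                "(<\\?xml[^>]*?encoding\\s*=\\s*)([\"'])[^\"']*\\2",
                "$1$2UTF-8$2");
        return fixed.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: Utf8Transcoder <euc-kr-input.xml> <utf8-output.xml>");
            return;
        }
        // Assumption: the input file is encoded as EUC-KR.
        byte[] converted = toUtf8(Files.readAllBytes(Paths.get(args[0])), "EUC-KR");
        Files.write(Paths.get(args[1]), converted);
    }
}
```

The converted file can then be given to vg.parseFile as before, or the byte array can be handed to the parser directly with vg.setDoc(bytes) followed by vg.parse(true). A declaration without an encoding attribute is left unchanged, since XML treats such a document as UTF-8 by default.".getBytes(java.nio.charset.Charset.forName("EUC-KR"));
String out1 = new String(Utf8Transcoder.toUtf8(withDecl, "EUC-KR"), java.nio.charset.StandardCharsets.UTF_8);
assert out1.equals("<?xml version=\"1.0\" encoding=\"UTF-8\"?><test>\uD55C\uAE00</test>");
byte[] noDecl = "<?xml version=\"1.0\"?><test>\uD55C\uAE00</test>".getBytes(java.nio.charset.Charset.forName("EUC-KR"));
String out2 = new String(Utf8Transcoder.toUtf8(noDecl, "EUC-KR"), java.nio.charset.StandardCharsets.UTF_8);
assert out2.equals("<?xml version=\"1.0\"?><test>\uD55C\uAE00</test>");
</test>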