internal entity doesn't work

Help
2010-11-28
2013-05-15
  • Gabor Juhasz
    2010-11-28

    Hi.
    I tried to parse a simple XML document with VTD-XML, but if the XML contains an entity declaration, parsing throws this error:
    com.ximpleware.EntityException: Errors in Entity: Illegal builtin reference

    This is the XML file, named "testdata.xml":

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE sgml [
      <!ELEMENT sgml ANY>
      <!ENTITY q  "Sample">
    ]>
    <sgml>&q;</sgml>
    

    and this is the essential part of my code:

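                // Needs java.io.* and com.ximpleware.* imports.
                // Read the whole document into memory; VTDGen parses from a byte array.
                File f = new File("testdata6.xml");
                byte[] b = new byte[(int) f.length()];
                DataInputStream in = new DataInputStream(new FileInputStream(f));
                try {
                    in.readFully(b); // a single read() call may return fewer bytes
                } finally {
                    in.close();
                }
                VTDGen vg = new VTDGen();
                vg.setDoc(b);
                vg.parse(true); // true = namespace-aware parsing
                VTDNav vn = vg.getNav();
                vn.toElement(VTDNav.ROOT);
                AutoPilot ap = new AutoPilot();
                ap.bind(vn);
                ap.selectXPath("/sgml");
                // Print every element matched by the XPath expression.
                while (ap.evalXPath() != -1) {
                    vn.getElementFragmentNs().writeToOutputStream(System.out);
                    System.out.println();
                }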
    

    Could you help out? What have I done wrong?

     
  • jimmy zhang
    2010-11-28

    VTD-XML is performance oriented; it doesn't support external entity declarations because external entities are a performance drag. One way to get around this is to remove the external entity declaration using a SAX parser…

     
  • Gabor Juhasz
    2010-11-29

    External? Isn't it an internal entity declaration?

     
  • jimmy zhang
    2010-11-29

    VTD-XML only supports built-in and character entities. Entity references like the one you used are generally deemed too complex and a drag on performance… Would an entity reference resolver/replacer help?
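
To make the distinction above concrete: XML predefines exactly five named entities (`lt`, `gt`, `amp`, `apos`, `quot`), and character references look like `&#65;` or `&#x41;`. Any other named reference, such as `&q;`, relies on a DTD declaration. The sketch below is a rough pre-check for such references before handing a document to the parser. The class and method names are hypothetical, and the scan is deliberately naive: it is a plain regex over the text, limited to ASCII names, and it does not skip comments or CDATA sections.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of VTD-XML.
class EntityReferenceCheck {
    // The five entities predefined by the XML specification.
    private static final Set<String> PREDEFINED =
            new HashSet<>(Arrays.asList("lt", "gt", "amp", "apos", "quot"));

    // Named references such as &q; (ASCII names only). Character references
    // start with '#', which is not a valid name start, so they never match.
    private static final Pattern NAMED_REF =
            Pattern.compile("&([A-Za-z_:][A-Za-z0-9_:.\\-]*);");

    // Returns the distinct names of referenced entities that are not predefined.
    static List<String> customEntityNames(String xml) {
        List<String> names = new ArrayList<>();
        Matcher m = NAMED_REF.matcher(xml);
        while (m.find()) {
            String name = m.group(1);
            if (!PREDEFINED.contains(name) && !names.contains(name)) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE sgml [ <!ENTITY q \"Sample\"> ]><sgml>&q; &amp; &#65;</sgml>";
        System.out.println(customEntityNames(xml));
    }
}
```

A non-empty result suggests the document needs its entities expanded by another parser first.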

     
  • Gabor Juhasz
    2010-11-29

    I see. It would help, but is it possible to load the result from the resolver into a byte array?
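
One possible way to get the resolved document into a byte array, sketched with JDK classes only: parse with a DOM parser (which expands internal entities), then serialize the root element into a `ByteArrayOutputStream`. The class and method names are hypothetical. The sketch assumes the document fits in memory twice, and serializing only the root element drops the DOCTYPE along with any comments or processing instructions outside the root.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical helper, not part of VTD-XML.
class EntityResolverSketch {
    // Returns a copy of the document in which internal entity references
    // have been replaced by their text.
    static byte[] resolveEntities(byte[] xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        dbf.setExpandEntityReferences(true); // the default, stated for clarity
        Document doc = dbf.newDocumentBuilder().parse(new ByteArrayInputStream(xml));

        // An identity transform serializes the DOM back to bytes.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        identity.transform(new DOMSource(doc.getDocumentElement()), new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE sgml [ <!ENTITY q \"Sample\"> ]><sgml>&q;</sgml>";
        byte[] clean = resolveEntities(xml.getBytes(StandardCharsets.UTF_8));
        // The returned array could then be passed to VTDGen.setDoc(...).
        System.out.println(new String(clean, StandardCharsets.UTF_8));
    }
}
```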

     
  • jimmy zhang
    2010-11-29

    The entity reference syntax is inherited from SGML, which is ancient, complex, and very bad for performance… It is far cleaner to replace XML that uses entities with clean XML without entities… You lose virtually nothing in the conversion and gain many benefits.

     
  • Gabor Juhasz
    2010-11-29

    The problem is that the XML file is given, and I don't want to change it, because then I would have to redo the change every time the source file changes.
    It's basically a dictionary that contains various information about words, and I wanted to do simple lookups.
    The main reason I chose VTD-XML was that it supports XPath, so I don't have to mark the beginning of an entry to be able to go back and export the information I need. Plus, the code is much clearer with VTD-XML.
    That's why I thought it would be best to stream the resolved XML into VTD-XML without writing it out to a file.

    But since I have to use SAX/StAX either way, I guess I can write my own handler as well.

    Thanks for your help.
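
For the "stream it without writing a file" idea, here is a minimal StAX sketch using only JDK classes: it copies events from a reader to a writer backed by a `ByteArrayOutputStream`, lets the reader expand internal entity references, and skips the DTD event so the output carries no entity declarations. The class and method names are hypothetical, and the behavior described is that of the JDK's bundled StAX implementation as far as I know; other implementations may differ in details.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper, not part of VTD-XML.
class StaxEntityExpander {
    // Copies the document event by event, expanding internal entities
    // and dropping the DOCTYPE declaration.
    static byte[] expand(byte[] xml) throws Exception {
        XMLInputFactory inFactory = XMLInputFactory.newInstance();
        inFactory.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.TRUE);
        inFactory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
        inFactory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);

        XMLEventReader reader = inFactory.createXMLEventReader(new ByteArrayInputStream(xml));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out, "UTF-8");
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.getEventType() == XMLStreamConstants.DTD) {
                    continue; // skip the DOCTYPE and its entity declarations
                }
                writer.add(event);
            }
        } finally {
            writer.close();
            reader.close();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE sgml [ <!ENTITY q \"Sample\"> ]><sgml>&q;</sgml>";
        byte[] clean = expand(xml.getBytes(StandardCharsets.UTF_8));
        // The returned array could then be passed to VTDGen.setDoc(...).
        System.out.println(new String(clean, StandardCharsets.UTF_8));
    }
}
```

Because nothing touches the disk, the conversion can run each time the dictionary file is loaded, so the original file stays unchanged.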

     
  • jimmy zhang
    2010-11-29

    I'll have to think about this…