loading xml files into databases created with

  • Anonymous - 2011-05-17


    Can someone describe how to load XML files into databases after creating them with "xsd2db"? I tried "xpdutils-demo.jar", but an error occurred (cannot find the main class). I don't mind whether the solution uses an end-user application (like "xpdutils-demo.jar") or Java code.


  • dondi - 2011-05-23


    Sorry for the delay in getting back to you.  You were definitely on the right track to first look at the demo source code and to run it, so that you can see how things work with the “canned” books database.

    However, upon examining the released XMLPipeDB Utilities 2.3 .zip file, it looks like the release file’s assembly was incorrect and the build product is missing some critical files and settings.

    I will look into fixing this build then let you know when something is ready for you to try.

  • dondi - 2011-05-24


    I have released XMLPipeDB Utilities 2.3.1.  This release addresses the errors you found, and also updates the demo application to use the latest utilities API.

    First off, the application should run correctly from the .jar now.  Just invoke java -jar xpdutils-demo.jar from within the xpdutils folder.

    To go any further, you must do the following:

    1. Prepare a compatible database to which you will connect.  The build defaults to PostgreSQL.  For others, you will need to add a suitable database driver and configure accordingly.

    2. Make sure to load up the sample database schema (in sample/sql).  This defines the tables required by the sample application.

    There is a sample XML file in the sample/data folder that you can use for importing.

    Now, if the sample works out OK, adapting this to your own xsd2db-generated database .jar involves:

    a. Loading the SQL generated by xsd2db into the database.
    b. Having XML data formatted according to the XSD you used.
    c. Adapting the sample code, particularly the configuration of ImportEngine, to use the elements in your XSD.
    d. If you want to use the TallyEngine, that will also need to be adapted.

    Hope this gets things started better for you.

  • Anonymous - 2011-05-26

    Thank you very much.
    Things certainly started better for me. The sample works correctly now, but how to adapt it to my situation is still not very clear.
    Points "a" and "b" are clear, but "c" (adapting the sample code, particularly the configuration of ImportEngine, to use the elements in your XSD) still needs some demystification.
    And what about "Building the database library using the supplied Ant file, then use that library to test XML import, queries, and other database functions"?
    Thanks again.

  • dondi - 2011-05-27


    Let's start with the latter issue, pertaining to building the database library.  Note that in the end, XMLPipeDB really is just a wrapper around two other technologies: JAXB and Hibernate.  It may be helpful for you to learn about those separately, but I will try to explain them here.

    JAXB is the Java standard for expressing Java objects as XML markup and vice versa.  Hibernate is a very popular technology for reading and writing Java objects to and from a relational database (sometimes called ORM for object-relational mapping).  XMLPipeDB uses a "combo" technology called HyperJAXB to combine the two: it defines Java objects that can be read from XML markup (via JAXB) which, at the same time, can also be read/written to/from a relational database (via Hibernate).

    Thus, the statement "Building the database library using the supplied Ant file, then use that library to test XML import, queries, and other database functions" simply recognizes that the core functionality is driven by a set of specially-defined Java objects.  What xsd2db creates for you is the source code for precisely those Java objects.  To use these objects, you need to compile the source code into a .jar.  The Ant build.xml file that xsd2db produces can generally take care of this "out of the box."  If you have additional knowledge of Ant, you can customize this to your liking.

    So, the first step for using XMLPipeDB with your own data is to build this library.  In the sample, this library is bookdb.jar.  This is the compiled version of the source code generated by xsd2db based on the sample books.xsd file that is frequently used to illustrate Java/XML technologies, especially XML schemas (go ahead and google books.xsd and see what comes up).

    Now, once you have your equivalent of bookdb.jar (for whatever data you'd like to work with), you then need to write code that uses your custom XML+Hibernate-ready objects.  This is where the sample code comes in.  You can write your own basic application by replacing bookdb.jar with the .jar that you created.

    The next step is to modify the sample code so that it uses the objects in your new .jar.  This centers almost completely on the line that initializes the ImportEngine, somewhere around line 200 of the MainController class:

                        ImportEngine importEngine = new ImportEngine(AppResources
                                .optionString("jaxbContextPath"), hibernateConfiguration,
                                "bookstore/book", rootElement);

    Note that the constructor takes four parameters.  This is what they mean:

    - jaxbContextPath is the package that holds the JAXB ObjectFactory class.  In bookdb.jar, this is the package named "generated".  You can see this by looking at the edu/lmu/xmlpipedb/util/resources/options.properties file in the sample/ source tree.  Your custom jar may have a different package; I recommend looking at the compiled classes to find that out.

    - hibernateConfiguration is the Hibernate Configuration object to use.  This one can be quite complex due to its flexibility.  For the XMLPipeDBUtils sample code, we have tried to simplify it using the ConfigurationPanel class.  Essentially this class gathers up a collection of database settings and builds a Configuration object from it.  For this part, you don’t need to change any sample code, but you will want to enter the correct database settings when running the application.

    - The third argument is the entryElement.  In the sample code, this is "bookstore/book".  This argument is the XML path for the object you are planning to import into the database.  In the sample code, this object is the Book class, and this class in turn is derived from all XML elements delimited by the "book" tag within the "bookstore" tag.  Thus, "bookstore/book".  You will have to decide what objects you are loading, then supply the XML path corresponding to that object.

    - The fourth argument, rootElement, needs a little context.  One of the things we discovered with XMLPipeDB is that, by default, JAXB tries to read the ENTIRE XML file, and thus the ENTIRE corresponding object tree, into memory.  But, in our work, we deal with HUGE XML files, which cannot be practically read into memory completely.  Unfortunately, we were unable to find a way to get JAXB to read an XML file incrementally.  The rootElement argument represents the way we tackled this problem.

    rootElement is supposed to be a map with two keys: "head" and "tail".  The value of each key is the XML tag that delimits a particular subset of the objects that we need to read.  ImportEngine then (transparently) breaks up the XML file into smaller pieces that start with the value of "head" and end with the value of "tail" -- effectively "pretending" that a single huge XML file is actually multiple smaller ones.  This way, even a very large XML file can be read with a single ImportEngine invocation, without reading the entire XML file into memory.
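
    To make the head/tail idea concrete, here is a minimal sketch of that kind of splitting.  It is not the actual ImportEngine code: the class name HeadTailSplitter and its method are hypothetical, and it matches the tags as plain literal text (so a head of "<book>" would not match "<book id='1'>").  It only shows how a stream can be cut into fragments while holding one fragment in memory at a time.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

/** Illustrative only; a hypothetical helper, not part of XMLPipeDB. */
class HeadTailSplitter {

    /**
     * Reads the input one character at a time and hands every fragment that
     * starts with head and ends with the next tail to the handler.
     * Only the fragment currently being built is kept in memory.
     */
    static void split(Reader in, String head, String tail, Consumer<String> handler) {
        StringBuilder buffer = new StringBuilder();
        boolean inside = false;
        try {
            int c;
            while ((c = in.read()) != -1) {
                buffer.append((char) c);
                if (!inside) {
                    if (endsWith(buffer, head)) {
                        // Found the start tag: begin a new fragment with it.
                        buffer.setLength(0);
                        buffer.append(head);
                        inside = true;
                    } else if (buffer.length() >= head.length()) {
                        // Keep just enough trailing text to detect head.
                        buffer.deleteCharAt(0);
                    }
                } else if (endsWith(buffer, tail)) {
                    // Found the end tag: emit the fragment and start over.
                    handler.accept(buffer.toString());
                    buffer.setLength(0);
                    inside = false;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static boolean endsWith(StringBuilder sb, String suffix) {
        int start = sb.length() - suffix.length();
        return start >= 0 && sb.indexOf(suffix, start) == start;
    }

    public static void main(String[] args) {
        String xml = "<bookstore><book><title>A</title></book>"
                + "<book><title>B</title></book></bookstore>";
        // Each fragment is printed as soon as its end tag is seen.
        split(new StringReader(xml), "<book>", "</book>", System.out::println);
    }
}
```

    In a real importer, the handler would pass each fragment on to be parsed and saved instead of printing it.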

    I'm not sure how large your XML files are; if they are not very large, it suffices to put the top-level tag in "head" and its corresponding end tag in "tail".  As you can see from the sample code, with bookdb.jar, these values are "<bookstore>" and "</bookstore>", which are precisely the outermost tags of any valid XML file that conforms to the books.xsd schema.
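
    For reference, the map described above could be built along these lines.  This is a sketch only: the class and method names are hypothetical, and whether the ImportEngine constructor expects a Map<String, String> or a more specific type is an assumption that you should check against the sample's MainController.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only; the names here are hypothetical. */
class RootElementExample {

    /** Builds the two-key map for the books sample: whole file as one chunk. */
    static Map<String, String> bookstoreRootElement() {
        Map<String, String> rootElement = new HashMap<>();
        rootElement.put("head", "<bookstore>");  // outermost start tag
        rootElement.put("tail", "</bookstore>"); // matching end tag
        return rootElement;
    }

    public static void main(String[] args) {
        Map<String, String> rootElement = bookstoreRootElement();
        System.out.println("head = " + rootElement.get("head"));
        System.out.println("tail = " + rootElement.get("tail"));
    }
}
```

    For your own data, you would substitute the outermost tags of your XML files (or, for very large files, the tags of a smaller repeating section).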

    So, to get a (minimally functional) version customized for your data, you should:
    - Build the .jar for your custom data using xsd2db
    - Replace bookdb.jar with the custom .jar
    - Modify the ImportEngine initialization with the right jaxbContextPath, entryElement, and rootElement map for your custom data (as explained above, and hopefully with the sample code being clearer now with that explanation)
    - Run the modified sample application
    - Configure your database settings
    - Choose a file to import
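
    When deciding on the entryElement for the third step, it can help to see which element paths actually occur in one of your XML files.  The following is a small standalone sketch for that purpose, using the standard StAX parser that ships with Java.  It is not part of XMLPipeDB; the class name ElementPathCounter is hypothetical, and it ignores XML namespaces by using local names only.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Illustrative only; a hypothetical helper, not part of XMLPipeDB. */
class ElementPathCounter {

    /** Counts how often each slash-separated element path occurs. */
    static Map<String, Integer> countPaths(Reader xml) {
        Map<String, Integer> counts = new TreeMap<>();
        Deque<String> stack = new ArrayDeque<>(); // currently open elements, root first
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    stack.addLast(reader.getLocalName());
                    counts.merge(String.join("/", stack), 1, Integer::sum);
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    stack.removeLast();
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return counts;
    }

    public static void main(String[] args) {
        String xml = "<bookstore><book><title>A</title></book>"
                + "<book><title>B</title></book></bookstore>";
        // Prints one line per distinct path, with its number of occurrences.
        countPaths(new StringReader(xml))
                .forEach((path, n) -> System.out.println(path + " x " + n));
    }
}
```

    A path that repeats many times and corresponds to the objects you want stored (like "bookstore/book" in the sample) is a likely candidate for entryElement.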

    Now, having done this with a variety of XSDs, I can tell you that having everything work without a hitch after following the above steps is not guaranteed.  Sometimes XSDs are not well-formed; sometimes their naming choices cause Java-level or table-level issues; sometimes the XML files you have turn out to not actually follow the XSD.  But, the above steps are the minimal things to do.  You would then deal with whatever happens after that.

    Sorry this has become long, but I wanted to give you as many details as possible.  If some of this is unclear, then some separate reading on XML, XML Schemas, JAXB, Hibernate, and Hibernate configuration may be called for.  But, assuming that you have a well-formed XML Schema, with good XML files to accompany it, the above steps should move you forward.

    Hope all of this helps.

