
3 separate issues - not necessarily related

Anonymous
2006-10-23
2012-10-04
  • Anonymous

    Anonymous - 2006-10-23

    I've had to convert around 7.5 GB of text data into an Access database, and this project has been a blessing. However, I've run into several issues.

    1. Writing the data to the files is extremely slow. Should the write buffer be larger, or possibly user-configurable? To give you an idea, the conversion has been running for 9 days and is only now on the second-to-last file. Luckily, I haven't had any other tasks to do during this time. In case you were curious, I've been using the Database.import() function for each file.

    2. I'm creating a total of 49 databases. In each database, I'm adding 74 tables (I have 74 files in each of 49 directories) using the Database.import() function. However, each time Jackcess attempts to add the 72nd table to a database, I get this error:

    java.lang.UnsupportedOperationException: FIXME cannot write large index yet
    at com.healthmarketscience.jackcess.Index.write(Index.java:314)
    at com.healthmarketscience.jackcess.Index.update(Index.java:286)
    at com.healthmarketscience.jackcess.Table.addRows(Table.java:740)
    at com.healthmarketscience.jackcess.Table.addRow(Table.java:660)
    at com.healthmarketscience.jackcess.Database.addToSystemCatalog(Database.java:588)
    at com.healthmarketscience.jackcess.Database.createTable(Database.java:405)
    at com.healthmarketscience.jackcess.Database.importReader(Database.java:744)
    at com.healthmarketscience.jackcess.Database.importFile(Database.java:695)
    at IETConversion.IETToMSAccess.parseDir(IETToMSAccess.java:66)
    at IETConversion.IETToMSAccess.main(IETToMSAccess.java:303)

    I tried looking through the source code, and I saw the source of the error, but I don't have the time right now to trace the code well enough to know exactly how to fix it. Luckily, this isn't a big issue because I can manually import the 3 missing tables into each database, but if I were trying to add, say, 150 tables to a database, this would be a much larger issue.

    3. For each database, one of the tables I'm attempting to write is called "Group." Jackcess works perfectly with all other table names, but for some reason, when I add this particular table, it shows up as "xGroup" when I view the database in Access. Note that this only happens with the Database.import() function. If I create the table manually, it is created as "Group."

    Other than those 3 complaints (the main one being the 1st issue - the slow performance), this library has been fantastic so far. Thanks for the hard work!

     
    • James Ahlborn

      James Ahlborn - 2007-04-26

      just wanted to follow up on the first point in your original post. another user determined that a "flush" call that jackcess was making caused jackcess to run much slower for large inserts. the HEAD in CVS now has an "autoSync" flag which can disable this automatic flushing. the user noted a dramatic speed improvement when the forced flushing was disabled.
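
To illustrate why forcing a flush after every write can dominate the run time of a large insert, here is a minimal, self-contained sketch using only the Java standard library. It is not Jackcess code: the class name PageWriterSketch, the writePages method, the 4096-byte page size, and the meaning given to the autoSync parameter are all assumptions made for this example. With autoSync set to true the file channel is forced to disk after every page; with false it is forced once at the end.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative only: a hypothetical page writer, NOT Jackcess's implementation.
final class PageWriterSketch {
    // Assumed page size for this sketch.
    static final int PAGE_SIZE = 4096;

    /**
     * Writes pageCount zero-filled pages to the given file.
     * If autoSync is true, the channel is forced to disk after every page;
     * otherwise it is forced once after the last page.
     * Returns the number of force() calls that were made.
     */
    static int writePages(Path file, int pageCount, boolean autoSync) throws IOException {
        int forces = 0;
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer page = ByteBuffer.allocate(PAGE_SIZE);
            for (int i = 0; i < pageCount; i++) {
                page.clear(); // reset position and limit so the whole page is written
                while (page.hasRemaining()) {
                    channel.write(page);
                }
                if (autoSync) {
                    channel.force(true); // flush this page to the storage device
                    forces++;
                }
            }
            if (!autoSync) {
                channel.force(true); // single flush for the whole batch
                forces++;
            }
        }
        return forces;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("pages", ".bin");
        try {
            for (boolean autoSync : new boolean[] {true, false}) {
                long start = System.nanoTime();
                int forces = writePages(tmp, 2000, autoSync);
                long millis = (System.nanoTime() - start) / 1_000_000;
                System.out.println("autoSync=" + autoSync
                        + " forces=" + forces + " elapsedMs=" + millis);
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The timings printed by main depend on the disk and operating system, so treat them as a rough comparison only. The trade-off of deferring the flush is that data written since the last force() may be lost if the process or machine crashes.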

       
    • Anonymous

      Anonymous - 2006-10-23

      In the above post, I meant to write that I was using the Database.importFile() function, not the Database.import() function (since it doesn't exist!). Whoops!

       
    • James Ahlborn

      James Ahlborn - 2006-10-24

      1. jackcess certainly has some room for optimization. the buffer sizes are most likely not the issue. my guesses: depending on the type of data, it may get copied 2 or 3 times before getting written into the final buffer. and, i believe all page writes force disk writes. and, i believe every chunk of row-writes re-writes the table header. anyways, i think you get the idea. to this point, most development effort has just been towards supporting a decent feature set and stability. maybe at some point in the future someone could put some effort into optimization.

      2. this is a missing feature. it's missing because writing a one-page index is easy, and writing multi-page indexes is hard. that exception is thrown when you cross the boundary from "easy" to "hard". :) i just recently added support for reading multi-page indexes, so there is some code to look at if you want to tackle the problem (hint, hint).

      3. the fact that it is "Group" when manually added is probably the bug :). "group" is a reserved word in access (the "x" prefix is the escape prefix jackcess adds). for some reason, the import code correctly handles reserved words, but the manual addtable code does not.
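
The escaping behaviour described in point 3 can be sketched as follows. This is a hypothetical illustration, not Jackcess's actual code: the class ReservedWordSketch, the escapeIdentifier method, and the five-word reserved list are assumptions made for the example (the real Access reserved-word list is much longer). Only the "x" prefix and the fact that "group" is reserved come from the post above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Illustrative only: a hypothetical helper, NOT Jackcess's implementation.
final class ReservedWordSketch {
    // A tiny assumed subset of Access reserved words, stored in lower case.
    static final Set<String> RESERVED = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("group", "order", "select", "table", "where")));

    // The escape prefix mentioned in the thread.
    static final String ESCAPE_PREFIX = "x";

    /** Returns the name prefixed with "x" if it is reserved (case-insensitive), else unchanged. */
    static String escapeIdentifier(String name) {
        boolean reserved = RESERVED.contains(name.toLowerCase(Locale.ROOT));
        return reserved ? ESCAPE_PREFIX + name : name;
    }

    public static void main(String[] args) {
        System.out.println(escapeIdentifier("Group"));     // reserved, so it is prefixed
        System.out.println(escapeIdentifier("Customers")); // not reserved, left as-is
    }
}
```

Under this sketch, an import path that calls escapeIdentifier yields "xGroup", while a code path that skips the check keeps "Group", which matches the difference reported in the original post.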

       
