#142 Error extracting text from application/x-msaccess mime type

Milestone: 2.1.8
Status: closed
Owner: nobody
Labels: None
Priority: 1
Updated: 2017-05-31
Created: 2017-05-26
Private: No

I am using Tika 1.14 to extract text from an MS Access file (MIME type application/x-msaccess). The stack trace is below, and a copy of the test file is attached.

05/25/17 18:51:30,246 WARN host=127.0.0.1,app=ReviewManager,env=DEV,Failed parsing query (Query: ~sq_cStudent List~sq_ccboFilterFavorites)
java.lang.IllegalStateException: Unexpected query type 9 (Query: ~sq_cStudent List~sq_ccboFilterFavorites)
at com.healthmarketscience.jackcess.impl.query.QueryImpl.<init>(QueryImpl.java:70)
at com.healthmarketscience.jackcess.impl.query.BaseSelectQueryImpl.<init>(BaseSelectQueryImpl.java:36)
at com.healthmarketscience.jackcess.impl.query.SelectQueryImpl.<init>(SelectQueryImpl.java:34)
at com.healthmarketscience.jackcess.impl.query.QueryImpl.create(QueryImpl.java:394)
at com.healthmarketscience.jackcess.impl.DatabaseImpl.getQueries(DatabaseImpl.java:1213)
at org.apache.tika.parser.microsoft.JackcessExtractor.parse(JackcessExtractor.java:158)
at org.apache.tika.parser.microsoft.JackcessParser.parse(JackcessParser.java:102)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:188)
at com.raytheon.tika.RecursiveParser.parse(RecursiveParser.java:141)
05/25/17 18:51:30,249 WARN host=127.0.0.1,app=ReviewManager,env=DEV,Failed parsing query (Query: ~sq_cStudentBad~sq_ccboFilterFavorites)
java.lang.IllegalStateException: Unexpected query type 9 (Query: ~sq_cStudentBad~sq_ccboFilterFavorites)
at com.healthmarketscience.jackcess.impl.query.QueryImpl.<init>(QueryImpl.java:70)
at com.healthmarketscience.jackcess.impl.query.BaseSelectQueryImpl.<init>(BaseSelectQueryImpl.java:36)
at com.healthmarketscience.jackcess.impl.query.SelectQueryImpl.<init>(SelectQueryImpl.java:34)
at com.healthmarketscience.jackcess.impl.query.QueryImpl.create(QueryImpl.java:394)
at com.healthmarketscience.jackcess.impl.DatabaseImpl.getQueries(DatabaseImpl.java:1213)
at org.apache.tika.parser.microsoft.JackcessExtractor.parse(JackcessExtractor.java:158)
at org.apache.tika.parser.microsoft.JackcessParser.parse(JackcessParser.java:102)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:188)
at com.raytheon.tika.RecursiveParser.parse(RecursiveParser.java:141)
at com.raytheon.tika.TikaUtil.extractContent(TikaUtil.java:435)
05/25/17 18:51:30,250 WARN host=127.0.0.1,app=ReviewManager,env=DEV,Failed parsing query (Query: ~sq_cCopy of Student List~sq_ccboFilterFavorites)
java.lang.IllegalStateException: Unexpected query type 9 (Query: ~sq_cCopy of Student List~sq_ccboFilterFavorites)
at com.healthmarketscience.jackcess.impl.query.QueryImpl.<init>(QueryImpl.java:70)
at com.healthmarketscience.jackcess.impl.query.BaseSelectQueryImpl.<init>(BaseSelectQueryImpl.java:36)
at com.healthmarketscience.jackcess.impl.query.SelectQueryImpl.<init>(SelectQueryImpl.java:34)
at com.healthmarketscience.jackcess.impl.query.QueryImpl.create(QueryImpl.java:394)
at com.healthmarketscience.jackcess.impl.DatabaseImpl.getQueries(DatabaseImpl.java:1213)
at org.apache.tika.parser.microsoft.JackcessExtractor.parse(JackcessExtractor.java:158)
at org.apache.tika.parser.microsoft.JackcessParser.parse(JackcessParser.java:102)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:188)
at com.raytheon.tika.RecursiveParser.parse(RecursiveParser.java:141)
at com.raytheon.tika.TikaUtil.extractContent(TikaUtil.java:435)

org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.JackcessParser@5761bff
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:282)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:188)
at com.raytheon.tika.RecursiveParser.parse(RecursiveParser.java:141)
at com.raytheon.tika.TikaUtil.extractContent(TikaUtil.java:435)
at com.raytheon.tika.TikaUtil.extractContentToFile(TikaUtil.java:367)
at com.raytheon.reviewmanager.hrmtextextractorwidget.HRMTextExtractorThread.extractTextAndMetadataToFiles(HRMTextExtractorThread.java:325)
at com.raytheon.reviewmanager.hrmtextextractorwidget.HRMTextExtractorThread.run(HRMTextExtractorThread.java:161)
Caused by: java.lang.UnsupportedOperationException
at com.healthmarketscience.jackcess.impl.query.QueryImpl$UnknownQueryImpl.toSQLString(QueryImpl.java:561)
at com.healthmarketscience.jackcess.impl.query.QueryImpl.toSQLString(QueryImpl.java:356)
at org.apache.tika.parser.microsoft.JackcessExtractor.parse(JackcessExtractor.java:160)
at org.apache.tika.parser.microsoft.JackcessParser.parse(JackcessParser.java:102)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
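
For readers stuck on an affected version, the last "Caused by" frame suggests one possible caller-side guard: the query object is still returned after the "Unexpected query type 9" warning, but calling toSQLString() on it throws UnsupportedOperationException. The sketch below is only an illustration of that pattern, not the fix that went into 2.1.8. QueryLike is a hypothetical stand-in for the two Jackcess Query methods involved (getName() and toSQLString()), so that the example runs without the library; the class name, helper names, and placeholder text are likewise assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SafeQueryText {

    /** Hypothetical stand-in for the Jackcess Query methods used in the trace. */
    public interface QueryLike {
        String getName();
        String toSQLString();
    }

    /**
     * Collects "name: sql" for every query. A query whose SQL cannot be
     * rendered gets a placeholder instead of aborting the whole extraction.
     */
    public static List<String> collectSql(List<? extends QueryLike> queries) {
        List<String> out = new ArrayList<>();
        for (QueryLike q : queries) {
            try {
                out.add(q.getName() + ": " + q.toSQLString());
            } catch (UnsupportedOperationException e) {
                // Same exception type as the "Caused by" frame in the report.
                out.add(q.getName() + ": <unsupported query type>");
            }
        }
        return out;
    }

    /** Test helper (assumption): a null sql simulates an unknown query type. */
    public static QueryLike query(final String name, final String sql) {
        return new QueryLike() {
            @Override public String getName() { return name; }
            @Override public String toSQLString() {
                if (sql == null) {
                    throw new UnsupportedOperationException();
                }
                return sql;
            }
        };
    }

    public static void main(String[] args) {
        List<QueryLike> queries = Arrays.asList(
                query("Student List", "SELECT * FROM Students"),
                query("~sq_cStudent List~sq_ccboFilterFavorites", null));
        for (String line : collectSql(queries)) {
            System.out.println(line);
        }
    }
}
```

With the real library, the same try/catch would wrap the toSQLString() call on each element of Database.getQueries(). Whether skipping such queries is acceptable depends on how much of the database text the caller needs.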

1 Attachments

Discussion

  • James Ahlborn - 2017-05-31
    • status: open --> closed
    • Group: Unassigned --> 2.1.8

  • James Ahlborn - 2017-05-31

    Fixed in trunk, will be in the 2.1.8 release.
