#128 some multipart/report emails are parsed incorrectly

1.5.0 - bugs
closed-fixed
None
5
2010-06-25
2010-06-25
No

We have found multipart/report emails that trigger a NullPointerException:

#3081 in folder Inbox
java.lang.NullPointerException: null
at org.semanticdesktop.aperture.crawler.mail.DataObjectFactory.handleRfc822SinglePart(DataObjectFactory.java:640)
at org.semanticdesktop.aperture.crawler.mail.DataObjectFactory.handleSinglePart(DataObjectFactory.java:528)
at org.semanticdesktop.aperture.crawler.mail.DataObjectFactory.handleMailPart(DataObjectFactory.java:492)
at org.semanticdesktop.aperture.crawler.mail.DataObjectFactory.handleMixedPart(DataObjectFactory.java:1026)
at org.semanticdesktop.aperture.crawler.mail.DataObjectFactory.handleMultipart(DataObjectFactory.java:966)
at org.semanticdesktop.aperture.crawler.mail.DataObjectFactory.handleMailPart(DataObjectFactory.java:484)
at org.semanticdesktop.aperture.crawler.mail.DataObjectFactory.handleRfc822SinglePart(DataObjectFactory.java:623)
at org.semanticdesktop.aperture.crawler.mail.DataObjectFactory.handleSinglePart(DataObjectFactory.java:528)
at org.semanticdesktop.aperture.crawler.mail.DataObjectFactory.handleMailPart(DataObjectFactory.java:492)
at org.semanticdesktop.aperture.crawler.mail.DataObjectFactory.handleMixedPart(DataObjectFactory.java:1026)
at org.semanticdesktop.aperture.crawler.mail.DataObjectFactory.handleMultipart(DataObjectFactory.java:966)
at org.semanticdesktop.aperture.crawler.mail.DataObjectFactory.handleMailPart(DataObjectFactory.java:484)
at org.semanticdesktop.aperture.crawler.mail.DataObjectFactory.createDataObjects(DataObjectFactory.java:400)
at org.semanticdesktop.aperture.crawler.mail.DataObjectFactory.<init>(DataObjectFactory.java:244)
at org.semanticdesktop.aperture.crawler.mail.DataObjectFactory.<init>(DataObjectFactory.java:272)
at org.semanticdesktop.aperture.crawler.mail.DataObjectFactory.<init>(DataObjectFactory.java:290)

Discussion

  • Antoni Mylka

    Antoni Mylka - 2010-06-25

    Identified the issue.

    The processing of text/rfc822-headers MIME parts, introduced during the work on issue 3020798, relies on something of a hack. JavaMail treats such a part as an attachment, and the getContent method returns an InputStream. We create a MimeMessage instance from that stream and process it as a normal message. The problem is that such an instance is invalid from the parser's point of view: it may declare a multipart content type, but it has no content and therefore no part boundaries. This causes an error in the parser.

    I tried to make the DataObjectFactory immune to such errors, so that all the multipart handling methods (handleMixedPart, handleRelatedPart, etc.) work correctly even when there are in fact no subparts to crawl.

    It seems that handleAlternativePart does not implement this correctly. Instead of parsing the parent part, then parsing the subparts, and then returning the parent (with children if need be), it parses the children first and the parent only afterwards. If an exception occurs, it returns null instead of the parent, which is the cause of the NullPointerException.

    I need to refactor the handleAlternativePart method to work in the same way as handleMixedPart or handleRelatedPart.

     
  • Antoni Mylka

    Antoni Mylka - 2010-06-25
    • status: open --> closed-fixed
     
  • Antoni Mylka

    Antoni Mylka - 2010-06-25

    Fixed in rev 2360.

    This closes the issue.
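
The ordering problem described in the first comment can be illustrated with a minimal sketch. This is not the Aperture code: `Part`, `Node`, `childrenFirst` and `parentFirst` are hypothetical names made up for the illustration, and the missing part boundaries are simulated with an `IllegalStateException`. It only shows why building the parent before touching the subparts avoids handing null to the caller.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class AlternativePartSketch {

    /** Hypothetical stand-in for a crawled data object; not an Aperture type. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** Hypothetical stand-in for a MIME part whose subparts may be unparseable. */
    static final class Part {
        final String name;
        final boolean hasBoundaries;
        final List<Part> declaredSubparts;

        Part(String name, boolean hasBoundaries, List<Part> declaredSubparts) {
            this.name = name;
            this.hasBoundaries = hasBoundaries;
            this.declaredSubparts = declaredSubparts;
        }

        /** Simulates the parser error for a multipart type without boundaries. */
        List<Part> subparts() {
            if (!hasBoundaries) {
                throw new IllegalStateException("no part boundaries in " + name);
            }
            return declaredSubparts;
        }
    }

    /** Children first: a failure while reading subparts loses the parent too. */
    static Node childrenFirst(Part part) {
        try {
            List<Node> kids = new ArrayList<>();
            for (Part sub : part.subparts()) {
                kids.add(new Node(sub.name));
            }
            Node parent = new Node(part.name);
            parent.children.addAll(kids);
            return parent;
        } catch (IllegalStateException e) {
            return null; // the caller dereferences this and fails
        }
    }

    /** Parent first: the parent survives even if the subparts cannot be read. */
    static Node parentFirst(Part part) {
        Node parent = new Node(part.name);
        try {
            for (Part sub : part.subparts()) {
                parent.children.add(new Node(sub.name));
            }
        } catch (IllegalStateException e) {
            // No subparts to crawl; return the parent without children.
        }
        return parent;
    }

    public static void main(String[] args) {
        // A headers-only message: declares a multipart type but has no body.
        Part headersOnly =
                new Part("multipart/alternative", false, Collections.emptyList());

        System.out.println(childrenFirst(headersOnly)); // prints "null"

        Node safe = parentFirst(headersOnly);
        System.out.println(safe.name + " with " + safe.children.size() + " children");
    }
}
```

In the parent-first variant the caller always receives a non-null object, which matches how the comment describes handleMixedPart and handleRelatedPart behaving.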

     
