#389 pst file with embedded email not normalising correctly

Milestone: Next Release
Status: open
Priority: High
Updated: 2014-07-22
Created: 2014-06-27
Private: No

When attempting to normalise a pst email file containing an embedded email, the error below was displayed. In addition, only two Xena files were output, and they did not include the embedded email.

Error message:

java.io.FileNotFoundException: C:\TestData\Test2-3.pst\mbox\2\3 (The system cannot find the path specified)
java.lang.RuntimeException: java.io.FileNotFoundException: C:\TestData\Test2-3.pst\mbox\2\3 (The system cannot find the path specified)
at au.gov.naa.digipres.xena.kernel.XenaInputSource.getByteStream(XenaInputSource.java:272)
at au.gov.naa.digipres.xena.util.MetadataExtraction.extractMetadataWithTika(MetadataExtraction.java:67)
at au.gov.naa.digipres.xena.kernel.metadata.DefaultMetaData.useMetadataExtractionTool(DefaultMetaData.java:74)
at au.gov.naa.digipres.xena.kernel.metadata.DefaultMetaData.parse(DefaultMetaData.java:196)
at au.gov.naa.digipres.xena.kernel.metadatawrapper.DefaultWrapper.endDocument(DefaultWrapper.java:352)
at au.gov.naa.digipres.xena.plugin.email.MessageNormaliser.parse(MessageNormaliser.java:177)
at au.gov.naa.digipres.xena.kernel.normalise.NormaliserManager.parse(NormaliserManager.java:877)
at au.gov.naa.digipres.xena.plugin.email.EmailToXenaEmailNormaliser.doFolder(EmailToXenaEmailNormaliser.java:345)
at au.gov.naa.digipres.xena.plugin.email.EmailToXenaEmailNormaliser.parse(EmailToXenaEmailNormaliser.java:232)
at au.gov.naa.digipres.xena.kernel.normalise.NormaliserManager.parse(NormaliserManager.java:829)
at au.gov.naa.digipres.xena.kernel.normalise.NormaliserManager.normalise(NormaliserManager.java:1063)
at au.gov.naa.digipres.xena.core.Xena.normalise(Xena.java:667)
at au.gov.naa.digipres.xena.core.Xena.normalise(Xena.java:572)
at au.gov.naa.digipres.xena.litegui.NormalisationThread.normaliseFile(NormalisationThread.java:337)
at au.gov.naa.digipres.xena.litegui.NormalisationThread.normaliseStandard(NormalisationThread.java:255)
at au.gov.naa.digipres.xena.litegui.NormalisationThread.run(NormalisationThread.java:195)
Caused by: java.io.FileNotFoundException: C:\TestData\Test2-3.pst\mbox\2\3 (The system cannot find the path specified)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(Unknown Source)
at java.io.FileInputStream.<init>(Unknown Source)
at sun.net.www.protocol.file.FileURLConnection.connect(Unknown Source)
at sun.net.www.protocol.file.FileURLConnection.getInputStream(Unknown Source)
at au.gov.naa.digipres.xena.kernel.XenaInputSource.getByteStream(XenaInputSource.java:264)
... 15 more

Discussion

  • Terry O'Neill

    Terry O'Neill - 2014-06-27

    After some quick testing, it appears that this occurs on Windows but not on Linux. This is most likely for one of two reasons:

    1. I am using a newer version of readpst in my Linux environment.
    2. The readpst.exe executable which we ported and include in Xena is not working as desired on Windows.
     
  • Terry O'Neill

    Terry O'Neill - 2014-06-30

    I have found that I was previously building Xena with the NAA wrapper on Linux by mistake, and I can now confirm that the issue occurs on both Linux and Windows. The issue appears to be in the following section of the MessageNormaliser class:

    // There is a requirement for the meta-data to contain the real source location of the attachment,
    // thus this hack specially for Trim where attachments are separate files.
    if (bp.getContent() instanceof Message) {
        xis = lastInputSource = new XenaInputSource(nuri, localType);
    } else if (bp instanceof au.gov.naa.digipres.xena.plugin.email.trim.TrimAttachment) {
        xis = lastInputSource = new XenaInputSource(((au.gov.naa.digipres.xena.plugin.email.trim.TrimPart) bp).getFile(), null);
    } else {
        xis = lastInputSource = new ByteArrayInputSource(bp.getInputStream(), null);
        xis.setSystemId(nuri);
    }
    

    Unfortunately, this code, which caters for TRIM exports and the like, breaks embedded emails, presumably because an email embedded in a pst file has no separate file at that location, which is what the FileNotFoundException above reports. An easy fix would be to check whether such a separate file for the attachment exists, although there is a small chance that we end up with a file which exists but which is not meant to be the attachment for this email.
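
    The existence check described above could look roughly like the following. This is a minimal sketch, not the committed fix: the class AttachmentSourceChooser, the SourceKind enum and both method names are hypothetical and do not exist in Xena, and it assumes that the system id passed to XenaInputSource (nuri in the snippet) is a file: URI string.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of Xena: decides which kind of input source
// the MessageNormaliser branches should create for a body part.
final class AttachmentSourceChooser {

    /** Hypothetical labels for the three branches of the original snippet. */
    public enum SourceKind { SEPARATE_FILE, TRIM_FILE, IN_MEMORY_STREAM }

    /**
     * Returns true only if systemId is a file: URI naming an existing regular file.
     * Assumption: the system id is a file: URI string; anything else yields false.
     */
    public static boolean separateFileExists(String systemId) {
        if (systemId == null) {
            return false;
        }
        try {
            URI uri = new URI(systemId);
            if (!"file".equalsIgnoreCase(uri.getScheme())) {
                return false;
            }
            // new File(URI) rejects URIs with a query, fragment or authority.
            return new File(uri).isFile();
        } catch (URISyntaxException | IllegalArgumentException e) {
            return false;
        }
    }

    /**
     * Mirrors the original if/else chain, but only takes the "separate file"
     * branch when that file is really on disk. An embedded email without a
     * separate file falls through to the later branches.
     */
    public static SourceKind choose(boolean contentIsMessage,
                                    boolean isTrimAttachment,
                                    String systemId) {
        if (contentIsMessage && separateFileExists(systemId)) {
            return SourceKind.SEPARATE_FILE;
        }
        if (isTrimAttachment) {
            return SourceKind.TRIM_FILE;
        }
        return SourceKind.IN_MEMORY_STREAM;
    }
}
```

    With a check like this, an embedded email from a pst file would be read from bp.getInputStream() instead of from a path that does not exist. The caveat above still applies: an unrelated file that happens to exist at that location would be picked up as the attachment.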

     
  • Terry O'Neill

    Terry O'Neill - 2014-07-01

    I have implemented the fix, for now, as a basic check that a separate attachment file exists. I have attached the email.jar plugin for external users who wish to test this change. Note that this jar also includes the fix for bug #386 (pst files not correctly normalising international characters).

    Note that this still does not fix bug #387 (Email multipart in multipart not correctly handled).

     
  • Kirti Chennareddy

    • assigned_to: Kirti Chennareddy
     
