From: Leo S. <leo...@df...> - 2008-02-25 16:09:26
It was Antoni Mylka who said at the right time 25.02.2008 16:54 the following words:
> I've been working with the MboxCrawler. There are two issues with it.
>
> 1. The mbox files from the Ubuntu mailing list archives - brought up
> by Jose - don't work with the crawler because they use the ' at ' string
> instead of the proper @ sign, for spam avoidance. This breaks because
> the DataObjectFactory uses the getFrom() method, which tries to convert
> that string into a javax.mail.Address instance - this fails, obviously.
> We'd need to rewrite the DataObjectFactory to work with
>
> String [] getHeader(String id)
>
> .. and perform the conversion ourselves. This shouldn't be too difficult.
>
> 2. I don't quite understand the mapping between the Message structure
> and the list of data objects. I noticed it when the validator started
> complaining. It turns out that each part in a multipart email is
> translated into a separate data object. These data objects don't have
> any types (only the first, the message itself, and the attachments
> have proper types). What to do with them? The validator won't stand
> them and there is no proper class for a message part in NMO (yet at
> least). I'd go for adding a MessagePart or MimePart class in NMO (a
> subclass of InformationElement). What do you think?
>
I wonder whether they are really data objects, but I am fine with you
creating them as InformationElements, go ahead.
MessagePart or MimePart are both fine; MimePart sounds more like the RFC.

Best regards,
Leo

> All kinds of comments welcome.
>

--
____________________________________________________
DI Leo Sauermann       http://www.dfki.de/~sauermann

Deutsches Forschungszentrum fuer
Kuenstliche Intelligenz DFKI GmbH
Trippstadter Strasse 122
P.O. Box 2080           Fon:   +49 631 20575-116
D-67663 Kaiserslautern  Fax:   +49 631 20575-102
Germany                 Mail:  leo...@df...

Geschaeftsfuehrung:
Prof.Dr.Dr.h.c.mult. Wolfgang Wahlster (Vorsitzender)
Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats:
Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313
____________________________________________________
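
For issue 1 in the quoted message, the "perform the conversion ourselves" step might look roughly like the sketch below. It is only an illustration, not the actual DataObjectFactory code: the class name ObfuscatedFromSketch and the method deobfuscate are hypothetical, and the assumed header shape (local part, the literal ' at ', then a dotted domain) is a guess at what the archives contain. The input would be one of the raw strings returned by getHeader("From"); the output could then be handed to an address parser as before.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the crawler: turns a spam-protected
// "local at domain.tld" sender back into "local@domain.tld".
public class ObfuscatedFromSketch {

    // Assumption: local part, the literal " at ", then a domain containing
    // at least one dot. Requiring the dot avoids rewriting phrases such as
    // "me at noon", but a phrase like "look at example.org" would still match.
    private static final Pattern OBFUSCATED = Pattern.compile(
            "([A-Za-z0-9._%+-]+) at ([A-Za-z0-9-]+(?:\\.[A-Za-z0-9-]+)+)");

    // Rewrites the first obfuscated address in a raw header value.
    // Values without a match (including null) are returned unchanged.
    static String deobfuscate(String rawHeader) {
        if (rawHeader == null) {
            return null;
        }
        Matcher m = OBFUSCATED.matcher(rawHeader);
        return m.replaceFirst("$1@$2");
    }

    public static void main(String[] args) {
        // In the crawler this value would come from getHeader("From")[0].
        System.out.println(deobfuscate("jose at example.org (Jose Example)"));
        // An already well-formed address passes through untouched.
        System.out.println(deobfuscate("Jose Example <jose@example.org>"));
    }
}
```

Only the first match is rewritten, on the assumption that a From header carries a single sender; a header with several obfuscated addresses would need replaceAll instead.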