2008/10/29 Christiaan Fluit <email@example.com>:
Antoni Mylka wrote:
Now that 1.2.0 is out, it's time to brainstorm about what to do next.
First, what I know about:
1. The subcrawler patch submitted to the tracker a week before the
release that didn't make it - should be merged.
Tested and committed it. It does contain a bug for which I added a bug
report. I suspect that archive entries with spaces in their name are the
problem. I understand that you are looking into this problem right now.
2. I've been receiving signals that what you're using Aperture for most
is email. Even though many improvements have been made, further ones are
needed. This includes http://tinyurl.com/sf1989505 and further
customizations of the DataObjectFactory (2147955,2147921,2147901), and
a MessageDataObject idea - Christiaan will probably elaborate more.
Allow me to plug something here :) We will be releasing a new product
soon that is aimed at the forensic market. A use case is for example
investigating the mails found on a confiscated laptop's hard drive. This
product will contain a number of proprietary mail crawlers that were
developed by a partner company.
Because we needed access to all the mail headers inside a mail, we had
to modify the DataObjectFactory so that all headers are modeled as
metadata in the DataObject. Of course, we'd rather use the standard
Aperture build. There are a number of ways in which that can be
realized, all of which make sense. For example, a switch in the mail
data source that indicates that all headers should be exported, or a
DataObject subclass that embeds the Message instance, so that an
integrator can do some post-processing himself.
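
To make the "export all headers" switch a bit more concrete, here is a minimal sketch. It parses a raw header block with plain Java instead of going through JavaMail, and every name in it (HeaderExportSketch, parseHeaders, exportAll, DEFAULT_HEADERS) is made up for illustration; none of this is the existing DataObjectFactory code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class HeaderExportSketch {
    // Hypothetical default: the headers exported when the switch is off.
    static final Set<String> DEFAULT_HEADERS =
            new HashSet<>(Arrays.asList("from", "to", "subject", "date"));

    // Maps lower-cased header names to their values, in order of appearance.
    // With exportAll == false only DEFAULT_HEADERS are kept.
    static Map<String, List<String>> parseHeaders(String rawHeaders, boolean exportAll) {
        Map<String, List<String>> metadata = new LinkedHashMap<>();
        String name = null;
        StringBuilder value = new StringBuilder();
        for (String line : rawHeaders.split("\r?\n")) {
            if (line.isEmpty()) {
                break; // a blank line ends the header block
            }
            boolean folded = line.startsWith(" ") || line.startsWith("\t");
            if (folded && name != null) {
                value.append(' ').append(line.trim()); // continuation of the previous header
                continue;
            }
            store(metadata, name, value.toString(), exportAll);
            value.setLength(0);
            int colon = line.indexOf(':');
            if (colon <= 0) {
                name = null; // not a header line, skip it
                continue;
            }
            name = line.substring(0, colon).trim();
            value.append(line.substring(colon + 1).trim());
        }
        store(metadata, name, value.toString(), exportAll);
        return metadata;
    }

    private static void store(Map<String, List<String>> metadata, String name,
                              String value, boolean exportAll) {
        if (name == null) {
            return;
        }
        String key = name.toLowerCase(Locale.ROOT);
        if (exportAll || DEFAULT_HEADERS.contains(key)) {
            metadata.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }
    }
}
```

Repeated headers such as Received keep all their values, which is probably what a forensic use case would want; whether the real switch should behave that way is an open question.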
I am in favor of at least implementing the MessageDataObject, as it
allows for more advanced message processing without Aperture getting in
This is possible, I imagine it like this:
1. the first data object from an email is a MessageDataObject and it
contains a MimeMessage
2. subsequent data objects are normal data objects, like they were before
3. there are no problems with this approach because a
javax.mail.internet.MimeMessage instance is always backed by an in-memory
buffer, so it doesn't care how many streams you open or when you
close them. It's also probably thread-safe if you don't modify it
4. we may add a switch in all mail-related datasources to suppress all
data objects from the email apart from the first one
Would this solve your problem?
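
The four points above could look roughly like this. It is only a sketch under assumptions: SketchDataObject stands in for Aperture's real DataObject interface, the type parameter M stands in for the MimeMessage so that the example compiles without JavaMail, and PartDataObject, MailCrawlSketch, crawlMessage and messageOnly are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Aperture's DataObject; the real interface is richer.
interface SketchDataObject {
    String getId();
}

// An ordinary data object for a body part or attachment, as before.
final class PartDataObject implements SketchDataObject {
    private final String id;

    PartDataObject(String id) {
        this.id = id;
    }

    public String getId() {
        return id;
    }
}

// The first data object of an email: it embeds the parsed message so an
// integrator can do further processing. M would be the MimeMessage in practice.
final class MessageDataObject<M> implements SketchDataObject {
    private final String id;
    private final M message;

    MessageDataObject(String id, M message) {
        this.id = id;
        this.message = message;
    }

    public String getId() {
        return id;
    }

    M getMessage() {
        return message;
    }
}

final class MailCrawlSketch {
    // Emits the MessageDataObject first, then one plain object per part,
    // unless messageOnly (the proposed switch) suppresses the parts.
    static <M> List<SketchDataObject> crawlMessage(String id, M message,
                                                   List<String> partNames,
                                                   boolean messageOnly) {
        List<SketchDataObject> out = new ArrayList<>();
        out.add(new MessageDataObject<>(id, message));
        if (!messageOnly) {
            for (String part : partNames) {
                out.add(new PartDataObject(id + "#" + part));
            }
        }
        return out;
    }
}
```

With messageOnly set, a caller gets exactly one object per mail and can walk the message itself.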
3. Seven new bugs
4. And lots of open feature requests.
5. ??? any ideas, wishes, requests, threats - now is the time.
I would personally be for accelerating the release cycle, which means two things:
1. Release more often (obviously)
2. Reduce the amount of work it takes to make a release, i.e. invest
some work in the build to reduce the time needed for maintenance.
- split the codebase in two (not more, Herko tried this and failed,
two is enough, two jars, two osgi bundles, two licenses)
I don't think he failed, he just didn't have time to finish it.
:) He didn't have time to finish it because doing it the way it probably
should be done, with a separate Maven artifact for each component,
would take much more time than we can spend on it :)
- automate the license headers checking (and make it clear that core
is AFL, impl is OSL)
- drop selectors.xml (which is a nightmare)
- automate the osgi manifests generation (which is really tricky to
get right manually)
- have daily builds on Windows AND some Unix (e.g. this time I lost an
hour because tests failed on Solaris). With every new release I get to
learn more about the platform-independence of Java in real life
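
As an illustration of the license header item in the list above, a check could be as small as the following. This is a sketch under assumptions: it guesses that core and impl sources can be told apart by a "/core/" or "/impl/" path segment, and that matching on the license name inside the leading block comment is good enough. LicenseHeaderCheckSketch and its methods are hypothetical names.

```java
final class LicenseHeaderCheckSketch {
    // Assumed convention: core sources carry the AFL, impl sources the OSL.
    static String expectedLicense(String path) {
        String p = path.replace('\\', '/'); // treat Windows and Unix paths alike
        if (p.contains("/core/")) {
            return "Academic Free License";
        }
        if (p.contains("/impl/")) {
            return "Open Software License";
        }
        return null; // outside both trees
    }

    // True if the file starts with a block comment naming the expected license.
    static boolean hasExpectedHeader(String path, String source) {
        String expected = expectedLicense(path);
        if (expected == null) {
            return true; // nothing to check
        }
        int end = source.indexOf("*/");
        if (!source.trim().startsWith("/*") || end < 0) {
            return false; // no leading block comment at all
        }
        return source.substring(0, end).contains(expected);
    }
}
```

A build step could run this over every .java file and fail on the first mismatch.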
It might not be a bad idea to squeeze Maven in here, regardless of the
build tool - doing the above needs some work (outlook dlls, jars
inside osgi bundles vs. outside etc), while maven has some clear
I would expect Maven to be able to take care of handling the
OSGi-specific stuff, but I don't have any experience to back that up.
In the two-bundle setup, we only need Maven to generate
Export-Package, Import-Package and Require-Bundle statements. With the
multibundle setup (as Herko tried), we would also need it to merge
entries from many bundles, and to generate an activator for the merged
bundle on the fly (like the CoreImplementationsActivator).
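
To show what merging entries from many bundles would involve at the manifest level, here is a rough sketch using only java.util.jar. It assumes the Export-Package and Import-Package entries are bare package names: it splits on commas, so entries with quoted version ranges or other directives would break it. It also says nothing about generating the activator. ManifestMergeSketch and merge are hypothetical names, not part of any Maven plugin.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

final class ManifestMergeSketch {
    // Builds one manifest whose exports are the union of all exports and whose
    // imports are the union of all imports minus what the merged bundle provides.
    static Manifest merge(String symbolicName, List<Manifest> bundles) {
        Set<String> exports = new LinkedHashSet<>();
        Set<String> imports = new LinkedHashSet<>();
        for (Manifest m : bundles) {
            Attributes attrs = m.getMainAttributes();
            collect(attrs.getValue("Export-Package"), exports);
            collect(attrs.getValue("Import-Package"), imports);
        }
        imports.removeAll(exports); // satisfied inside the merged bundle

        Manifest merged = new Manifest();
        Attributes main = merged.getMainAttributes();
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.putValue("Bundle-ManifestVersion", "2");
        main.putValue("Bundle-SymbolicName", symbolicName);
        if (!exports.isEmpty()) {
            main.putValue("Export-Package", String.join(",", exports));
        }
        if (!imports.isEmpty()) {
            main.putValue("Import-Package", String.join(",", imports));
        }
        return merged;
    }

    // Naive split: only valid for entries without commas inside directives.
    private static void collect(String header, Set<String> into) {
        if (header == null) {
            return;
        }
        for (String entry : header.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                into.add(trimmed);
            }
        }
    }
}
```

A real merge would need a proper header parser and a policy for conflicting versions, which is where most of the work would be.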
The manifest generation is there, with the maven-osgi-plugin, and it
works all right; I use it for hacking the sesame-onejar for Aperture.
For bundle merging we'd need to write it ourselves. (The same problem
today as it was two years ago when we started OSGi-enabling Aperture.)
Also, updating dependencies, performing automatic builds, automatic
deployments, etc. gets easier this way.
OK, created an issue
will take a look at this