Thanks for the diagram - there are definitely some dependencies we could cut/split out (io -> isomorphism).

> I have no idea how Maven handles this, and if it does such copying
> automatically to overcome this limitation of Java in modularized
> building. Of course, if you put every CDK module into a separate
> cdk/$module/src/main code folder you overcome this problem, but that
> was a clear no-go for the CDK community at the time.

Yes, that's the way it is done. Out of interest, why was it a no-go?

Perhaps Jigsaw may actually get done and we'll have a modular JDK too… http://openjdk.java.net/projects/jigsaw/

J

On 9 Jun 2013, at 12:49, Egon Willighagen <egon.willighagen@gmail.com> wrote:

Hi John, all,

attached is an overview of module dependencies. I have worked hard in
the past to make as many classes use the data model interfaces instead
of implementations. The module approach enforces that people do not
accidentally introduce new such dependencies. That has been rather
critical for maintenance: the more people get warned about issues
before the code hits the main repository the better.

For qsarionpot, the dependencies on additional code are many. For
example, it depends on atom and bond descriptors, which can only be
solved by putting all descriptor code in one module. Additionally, it
depends on many modules that most molecular descriptors do not depend
on at all. The attached diagram shows this. For example, qsarionpot
depends on the full CML stack with Xerces and all. That's a heavy
dependency.

For qsarprotein a similar situation exists, but the biggest problem is
that it depends on the "data" module, and thus on a specific data model
implementation. Without depending purely on the interfaces, it is too
easy to mess up and introduce class cast exception bugs.
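
To illustrate the class cast problem, here is a minimal sketch. All names in it (Atom, DefaultAtom, OtherAtom and the helper methods) are hypothetical and are not the actual CDK API. It only shows why code that casts to one implementation breaks as soon as a different implementation of the same interface is passed in, while code written purely against the interface keeps working.

```java
final class InterfaceDependencyDemo {

    /** Hypothetical data model interface (an assumption, not the CDK API). */
    interface Atom {
        String getSymbol();
    }

    /** First hypothetical implementation of the interface. */
    static final class DefaultAtom implements Atom {
        private final String symbol;
        DefaultAtom(String symbol) { this.symbol = symbol; }
        public String getSymbol() { return symbol; }
    }

    /** Second, unrelated hypothetical implementation of the same interface. */
    static final class OtherAtom implements Atom {
        private final String symbol;
        OtherAtom(String symbol) { this.symbol = symbol; }
        public String getSymbol() { return symbol; }
    }

    /** Depends only on the interface, so any implementation works. */
    static String symbolViaInterface(Atom atom) {
        return atom.getSymbol();
    }

    /** Depends on one implementation; the cast fails for any other one. */
    static String symbolViaCast(Atom atom) {
        return ((DefaultAtom) atom).getSymbol();
    }

    /** Returns true if the implementation-specific code throws a ClassCastException. */
    static boolean castFails(Atom atom) {
        try {
            symbolViaCast(atom);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(symbolViaInterface(new OtherAtom("C")));
        System.out.println(castFails(new DefaultAtom("C")));
        System.out.println(castFails(new OtherAtom("C")));
    }
}
```

If the module containing symbolViaCast could not see DefaultAtom at compile time, the cast would not compile at all, which is the kind of early warning the module approach is meant to give.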

There is no such thing as a free lunch. By putting everything in a
single source tree, you cannot take advantage of modular software
engineering approaches. The current system holds middle ground between
making completely separated modules (with different source trees), and
having things in one source tree. It has been a deliberate choice to
fulfill both requirements.

In my experience, the extra time up front has saved me much more
time later in tracking down and fixing bugs. Of course, if we all were
clean coders, all these tools to enforce our modular design would not
be necessary. Sadly, in practice this is not the case, and we need peer
review; even our automated peer review (this build system) is not
enough to do without human peer review, with the latter still
occasionally finding bugs in new code.

I stress that the extra time spent up front on the module approach
has uncovered a plethora of bugs, tremendously increased code quality
and stability, and significantly reduced the cost of maintenance.

One interesting anecdote is that even if you ask Java to compile a
particular class against some jars, it will silently pick up any other
dependency it can find in the source tree that class came from. This is
why the current code copies the source code from cdk/src into
cdk/$build/src so that it cannot sneak in other classes.
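
As a rough sketch of that copying step (not the actual CDK build code): the class below copies only an explicitly listed set of source files into a separate directory, so that a compiler pointed at that directory cannot see anything else from the original tree. The class name, method names and example paths are all made up for illustration, and how a module's file list is obtained is left open.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class ModuleSourceCopier {

    /** Copies only the listed files from sourceRoot into buildRoot, keeping package directories. */
    static void copyModuleSources(Path sourceRoot, Path buildRoot, List<String> relativePaths) {
        try {
            for (String relative : relativePaths) {
                Path target = buildRoot.resolve(relative);
                Files.createDirectories(target.getParent());
                Files.copy(sourceRoot.resolve(relative), target, StandardCopyOption.REPLACE_EXISTING);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Lists regular files under root as sorted, '/'-separated relative paths. */
    static List<String> listFiles(Path root) {
        try (Stream<Path> walk = Files.walk(root)) {
            return walk.filter(Files::isRegularFile)
                       .map(p -> root.relativize(p).toString().replace('\\', '/'))
                       .sorted()
                       .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a tiny two-file source tree, isolates one file, and returns what ended up isolated. */
    static List<String> demo() {
        try {
            Path src = Files.createTempDirectory("src");
            Path build = Files.createTempDirectory("build");
            // Hypothetical file names standing in for two different modules.
            for (String name : Arrays.asList("org/example/core/A.java", "org/example/extra/B.java")) {
                Path file = src.resolve(name);
                Files.createDirectories(file.getParent());
                Files.write(file, "// placeholder".getBytes(StandardCharsets.UTF_8));
            }
            copyModuleSources(src, build, Collections.singletonList("org/example/core/A.java"));
            return listFiles(build);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Compiling from the isolated directory then fails loudly if A.java refers to B.java, unless B is supplied as a declared jar dependency.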

I have no idea how Maven handles this, and if it does such copying
automatically to overcome this limitation of Java in modularized
building. Of course, if you put every CDK module into a separate
cdk/$module/src/main code folder you overcome this problem, but that
was a clear no-go for the CDK community at the time.

We don't have a list of requirements the build system should comply
with, which complicates the discussion around it. I hope that these
clarifications about qsarionpot and qsarprotein help with getting such
requirements clear.

Grtz,

Egon

--
Dr E.L. Willighagen
Postdoctoral Researcher
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
<deps.png>