On Monday 19 September 2005 04:30 pm, Rich Apodaca wrote:
> On Mon, 19 Sep 2005 09:12:15 +0200, Egon Willighagen wrote
> > From what I've seen, I think CDKTools and Octet use a builder as stand in
> > for the data classes, correct?
> Yes - and no. My take on the AtomContainer class (the main "data class") is
> that it combines two completely unrelated behaviours: construction
> (addAtom, addBond, etc.) and query (getAtomCount, getBondCount, etc.). In
> other words, what I propose is:
> AtomContainer -> AtomContainer + CDKBuilder
This might be good in the long term... but in the short term I don't want to
step away from having things out of sight for users.
> In those situations in which AtomContainer behaves as its own builder (in
> file i/o, for example), CDKBuilder is designed to be a replacement.
> However, AtomContainer remains the main data class in those situations in
> which information needs to be extracted from an AtomContainer (in
> UniversalIsomorphismTester, for example).
> > This is not the step I would like to make, though I do want it's
> > functionality. The reason is that this would change the user
> > experience regarding the CDK API. Something I don't want to change.
> It needn't change the CDK API at all, just add to it. Check out the current
> release of CDKTools (http://cdktools.sf.net). I've introduced CDKBuilder as
> a layer on top of CDK. Clients that want to continue letting AtomContainer
> build itself can continue to do so. But clients wanting the flexibility of
> the Builder Pattern can take advantage of CDKBuilder. The more extreme
> option would be to remove the mutator methods (addAtom, addBond, etc.) from
> the AtomContainer interface. And I agree that this would very much change
> the CDK API. But what I'm doing with CDKTools and CDKBuilder just gives
> developers a more flexible option for building AtomContainers, without
> changing the CDK experience.
> > My main 'problem' I have with this design is that this indeed is a
> > replacement for the AtomContainer.
> Not at all. The two can co-exist side-by-side (see above).
> > What I would prefer, is something like this:
> > public ChemObjectBuilder cdk.interfaces.ChemObject.getBuilder()
> > and
> > public ChemObject ChemObjectReader.read(ChemObject)
> > // like it is now, except that it will take a class, not an instance
> > So people do not have to think about builders, just about data classes.
> > The builder would be a somewhat large helper class with methods like
> > static public Atom newAtom(...)
> > where for each current constructor there will be an equivalent in
> > this interface. So, a lot of methods, but the class can be singular
> > (only one instance can exist in the VM world), so I don't expect too
> > many memory problems.
> > In the end all library classes in the CDK do not instantiate data
> > classes directly, but use the builder instead.
> I realize that this is in line with the current CDK setup. Unfortunately, I
> don't see how this moves the ball forward. If the object returned by
> ChemObject.getBuilder() is a static Singleton instance of a class with a
> large interface, then the same effect can be achieved by adding the
> following method to any ChemObjectBuilder interface implementation:
> public interface ChemObjectBuilder
> // ... methods
> public class BasicChemObjectBuilder implements ChemObjectBuilder
> private static ChemObjectBuilder instance = null;
> // ... implement ChemObjectBuilder interface
> public static ChemObjectBuilder getInstance()
> // .. the usual Singleton Pattern code
> if (instance == null)
> instance = new BasicChemObjectBuilder();
> return instance;
Yes, this is what I had in mind. Except that the the ChemObjectReader does not
know which ChemObjectBuilder class to instantiate. Hence the
> Although this can be done, there really is no need for it. Java has a
> highly-optimized garbage collector that can easily handle a few unused
> ChemObjectBuilders. I'm not against it per se, I just think it's
> unnecessary. And this is the kind of optimization that is best performed at
> a late stage with a good profiling tool, so its performance benefit, if
> any, can be quantified.
That might be true, though I do not see why using a singleton would make it
possibly slower. Is there a reason not to use it?
> Aside from this, the solution you propose still couples builder-like
> behavior to query-like behavior.
Yes, that's the current design. I know that Octet uses immutable classes,
which might be better, but *I* don't want to make that transition now. That
is, if someone comes with a good patch, I won't object to applying it at all.
But I'm not going to write that patch at this moment myself.
I'm just trying to tweak things such that I can fullfill my personal needs
here, *without* modifying the API and breaking things too much...
> Why should Atom (which currently inherits
> from ChemObject) need to know anything about its builder? I'm advocating
> completely decoupling these behaviors because I've found this to be the
> most flexible solution.
Can you give a practical example where this flexibility comes in?
> I think what you're getting at is that interface implementations should
> never directly instantiate the objects they need to work with?
Yes, I think that about formulates one of the goals...
> But there
> are better ways to accomplish this, for example, through "Dependency
> Injection" (http://www.martinfowler.com/articles/injection.html). There are
> entire frameworks designed to make this simpler, such as Spring and
> PicoContainer. I'm planning on implementing a solution like this in Octet
> (http://octet.sf.net), but only as a late-stage optimization, when I'm
> certain the entire package is stable.
Will have to look at this.
> This is a very interesting discussion. I don't want to give the impression
> that what I'm proposing is the only way to do it. There are may viable
> solutions, but I'm interested in how you and the rest of the CDK developers
> see the relative advantages and disadvantages of the approaches we discuss.
Likewise, I do not overly prefer my solution to others, but just being
pragmatic here, considering time limitations (no time for thinking through
and explaning a large API design) and trying not to break anything...
It's an interesting discussion indeed. What do others think about this?