From: SourceForge.net <no...@so...> - 2010-03-26 14:59:13
|
Patches item #2977074, was opened at 2010-03-26 15:59 Message generated for change (Tracker Item Submitted) made by egonw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: master Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Egon Willighagen (egonw) Assigned to: Nobody/Anonymous (nobody) Summary: new IChemObjectBuilder API Initial Comment: Hi all, as recently announced, I have written a new API for the IChemObjectBuilder infrastructure... (email attached below). Patch is massive (2MB) and targeted at master. One area where it could benefit other problems, is the dependency of \'isomorphism\' on data... using this new builder architecture, the module can implement those classes it itself, and only those it really needs. That way, we can remove the dependency if it on the data module, cleaning up things considerably. The email: The current model has some flaws: 1. it is a enormously large builder 2. making a new constructor for one of the IChemObject classes requires updating the IChemObjectBuilder 3. ... and requires all implementations to be updated 4. a new implementation (for whatever reason) of the IChemObject interfaces requires you to implement everything 5. currently some data classes do not extend IChemObject but are required for an IChemObjectBuilder, but do not share a common interface Particularly, the last is rather annoying, as it basically requires one to make a full implementation or none. The design I have been working on removes these limitations, and does the following. It introduces a new builder with a single method: public interface IChemObjectBuilder { public <T extends ICDKObject>T newInstance(Class<T> clazz, Object... params) throws IllegalArgumentException; } So, instead of: DefaultChemObjectBuilder.getInstance().newAtom(\"C\"); you would now do (given builder = DefaultChemObjectBuilder.getInstance()): builder.newInstance(IAtom.class, \"C\"); I am tempted to rename the newInstance() to just new()... comments on that most welcome. Passing a Point2d, would simply be: builder.newInstance(IAtom.class, \"C\", new Point2d(1,2)); If the caller would pass incorrect arguments, an IllegalArgumentException is called, such as happens if I would call: builder.newInstance(IAtom.class, new AtomContainer()); At the same time, this introduces a new ICDKObject interface, with only the getBuilder() method which used to be in the IChemObject... this allows IAtomParity to share a common interface, without inheriting listener stuff, etc. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 |
From: SourceForge.net <no...@so...> - 2010-03-29 18:00:30
|
Patches item #2977074, was opened at 2010-03-26 10:59 Message generated for change (Settings changed) made by rajarshi You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: master >Group: Needs Revision Status: Open Resolution: None Priority: 5 Private: No Submitted By: Egon Willighagen (egonw) Assigned to: Nobody/Anonymous (nobody) Summary: new IChemObjectBuilder API Initial Comment: Hi all, as recently announced, I have written a new API for the IChemObjectBuilder infrastructure... (email attached below). Patch is massive (2MB) and targeted at master. One area where it could benefit other problems, is the dependency of \'isomorphism\' on data... using this new builder architecture, the module can implement those classes it itself, and only those it really needs. That way, we can remove the dependency if it on the data module, cleaning up things considerably. The email: The current model has some flaws: 1. it is a enormously large builder 2. making a new constructor for one of the IChemObject classes requires updating the IChemObjectBuilder 3. ... and requires all implementations to be updated 4. a new implementation (for whatever reason) of the IChemObject interfaces requires you to implement everything 5. currently some data classes do not extend IChemObject but are required for an IChemObjectBuilder, but do not share a common interface Particularly, the last is rather annoying, as it basically requires one to make a full implementation or none. The design I have been working on removes these limitations, and does the following. It introduces a new builder with a single method: public interface IChemObjectBuilder { public <T extends ICDKObject>T newInstance(Class<T> clazz, Object... params) throws IllegalArgumentException; } So, instead of: DefaultChemObjectBuilder.getInstance().newAtom(\"C\"); you would now do (given builder = DefaultChemObjectBuilder.getInstance()): builder.newInstance(IAtom.class, \"C\"); I am tempted to rename the newInstance() to just new()... comments on that most welcome. Passing a Point2d, would simply be: builder.newInstance(IAtom.class, \"C\", new Point2d(1,2)); If the caller would pass incorrect arguments, an IllegalArgumentException is called, such as happens if I would call: builder.newInstance(IAtom.class, new AtomContainer()); At the same time, this introduces a new ICDKObject interface, with only the getBuilder() method which used to be in the IChemObject... this allows IAtomParity to share a common interface, without inheriting listener stuff, etc. ---------------------------------------------------------------------- >Comment By: Rajarshi Guha (rajarshi) Date: 2010-03-29 14:00 Message: I'd go with new() rather than newInstance (but will the compiler allow a function name with the same name as a keyword?) What is the need for NewDefaultChemObjectBuilder in addition to DefaultChemObjectBuilder? Is the latter meant to be deprecated at one point? Might it be a good idea to just rename the former to the latter? Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with the old method names? What is the motivation for ICDKObject being a super class of IChemObject? Couldn't IChemObject have been the base class for all CDK objects? You mention the case of IAtomParity as a case where ICDKObject would be useful - but can you explain some more? If no listeners are added, why would a class have a problem? A more fundamental question I have is that the current design (vai newInstance) seems to simply be a way of writing: IAtom anAtom = new Atom("C"); or IDebugAtom aDebugAtom = new DebugAtom("C"); In otherwords, in your refactoring you'd write builder.newInstance(IAtom.class, "C"); which is just a rearrangement of what I wrote just above. So why do we need a builder? OK, you'd say that the builder pattern allows us to have different implementaitons - but how does your refactoring support that? It looks like I can only specify the interface for an atom (say), i.e., IAtom and the code returns an Atom object. How would do I specify a different implementaition of the IAtom interface? Do I need to implement a new builder? But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Sorry for all these questions, but I have never really wrapped my head around the builder architecture and this looks like a good time to clear up my fuzziness. (Also it might be useful to have this discussion on the list) Iobviously haven't gone through it line by line, but it seems OK. A second review will be useful ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 |
From: SourceForge.net <no...@so...> - 2010-03-31 11:02:50
|
Patches item #2977074, was opened at 2010-03-26 15:59 Message generated for change (Comment added) made by egonw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: master Group: Needs Revision Status: Open Resolution: None Priority: 5 Private: No Submitted By: Egon Willighagen (egonw) Assigned to: Nobody/Anonymous (nobody) Summary: new IChemObjectBuilder API Initial Comment: Hi all, as recently announced, I have written a new API for the IChemObjectBuilder infrastructure... (email attached below). Patch is massive (2MB) and targeted at master. One area where it could benefit other problems, is the dependency of \'isomorphism\' on data... using this new builder architecture, the module can implement those classes it itself, and only those it really needs. That way, we can remove the dependency if it on the data module, cleaning up things considerably. The email: The current model has some flaws: 1. it is a enormously large builder 2. making a new constructor for one of the IChemObject classes requires updating the IChemObjectBuilder 3. ... and requires all implementations to be updated 4. a new implementation (for whatever reason) of the IChemObject interfaces requires you to implement everything 5. currently some data classes do not extend IChemObject but are required for an IChemObjectBuilder, but do not share a common interface Particularly, the last is rather annoying, as it basically requires one to make a full implementation or none. The design I have been working on removes these limitations, and does the following. It introduces a new builder with a single method: public interface IChemObjectBuilder { public <T extends ICDKObject>T newInstance(Class<T> clazz, Object... params) throws IllegalArgumentException; } So, instead of: DefaultChemObjectBuilder.getInstance().newAtom(\"C\"); you would now do (given builder = DefaultChemObjectBuilder.getInstance()): builder.newInstance(IAtom.class, \"C\"); I am tempted to rename the newInstance() to just new()... comments on that most welcome. Passing a Point2d, would simply be: builder.newInstance(IAtom.class, \"C\", new Point2d(1,2)); If the caller would pass incorrect arguments, an IllegalArgumentException is called, such as happens if I would call: builder.newInstance(IAtom.class, new AtomContainer()); At the same time, this introduces a new ICDKObject interface, with only the getBuilder() method which used to be in the IChemObject... this allows IAtomParity to share a common interface, without inheriting listener stuff, etc. ---------------------------------------------------------------------- >Comment By: Egon Willighagen (egonw) Date: 2010-03-31 13:02 Message: > I'd go with new() rather than newInstance (but will the compiler allow a > function name with the same name as a keyword?) OK. Had considered that, but thought to stick with the Java newInstance() naming convention... > What is the need for NewDefaultChemObjectBuilder in addition to > DefaultChemObjectBuilder? Is the latter meant to be deprecated at one > point? Might it be a good idea to just rename the former to the latter? > Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with > the old method names? So you looked at the individual patches? :) That was temporary only, so that I could do things in steps... later I renamed it to the old naming conventions, dropping the 'new' bits again. > What is the motivation for ICDKObject being a super class of IChemObject? Because not all classes currently extend IChemObject, presumably because it was decided that they should not have listener functionality, etc.. The ICDKObject class is there to simplify the builder API, as they now have a common ancestor. > Couldn't IChemObject have been the base class for all CDK objects? Yes, but I think the choice to not do that for some classes was deliberate... would need to dig up email archives on why IAtomParity, IMolecularFormular, ... do not extend IChemObject, but would like to keep that decision separate from the builder refactoring anyway... > You mention the case of IAtomParity as a case where ICDKObject would be useful > - but can you explain some more? If no listeners are added, why would a > class have a problem? Overhead I guess. > A more fundamental question I have is that the current design (vai > newInstance) seems to simply be a way of writing: > > IAtom anAtom = new Atom("C"); > > or > > IDebugAtom aDebugAtom = new DebugAtom("C"); These two alternative do not allow you to select an alternative implementation, like NNAtom... > In otherwords, in your refactoring you'd write > > builder.newInstance(IAtom.class, "C"); > > which is just a rearrangement of what I wrote just above. indeed, with the difference that algorithms can now do stuff like: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } and therefore not force you to use the Atom, but allow any implementation. > So why do we need a builder? OK, you'd say that the builder pattern allows us to have > different implementaitons Yes, but the previous IChemObjectBuilder did that too. > - but how does your refactoring support that? The difference now is, that the new API allows one to implement a subset of classes... for example, it could implement only IMolecule and related stuff, but not IChemModel etc. Now, relating this to the UIT... the UIT needs only a few classes... IMolecule, IAtom, IBond,,, for setting up a query atomcontainer... with the new API it could use a shortlist of classes in the isomorphism module itself, and therefore removing the dependency on the data module. > It looks like I can only specify the interface for an atom (say), i.e., IAtom > and the code returns an Atom object. How would do I specify a different > implementaition of the IAtom interface? Because each IChemObject implementation has a specific builder... so, we can use the code: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } This way, if the IMolecule was a NNMolecule, it returns an NNAtom... if the input was a DebugMolecule it would return a DebugAtom. > Do I need to implement a new builder? If you implement new classes for the IChemObject interfaces you also have to implement a new IChemObjectBuilder... the old IChemObjectBuilder had a million methods, making it much more work to make a new implementation. The new API only has one method, and allows you to throw an Exception for IChemObject's that cannot be build... > But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Yes. But the new API requires you much less copy/pasting than the 1.2.x builders. > Sorry for all these questions, Not at all! I should apologize for not providing enough info in the first place. > but I have never really wrapped my head around the builder architecture and this looks > like a good time to clear up my fuzziness. (Also it might be useful to have this discussion > on the list) Please read my answers, and decide then if we should copy/paste this message to cdk-devel... > obviously haven't gone through it line by line, but it seems OK. A second review will be useful OK. Meanwhile, I will prepare a "newInstance() -> new()" patch. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-03-29 20:00 Message: I'd go with new() rather than newInstance (but will the compiler allow a function name with the same name as a keyword?) What is the need for NewDefaultChemObjectBuilder in addition to DefaultChemObjectBuilder? Is the latter meant to be deprecated at one point? Might it be a good idea to just rename the former to the latter? Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with the old method names? What is the motivation for ICDKObject being a super class of IChemObject? Couldn't IChemObject have been the base class for all CDK objects? You mention the case of IAtomParity as a case where ICDKObject would be useful - but can you explain some more? If no listeners are added, why would a class have a problem? A more fundamental question I have is that the current design (vai newInstance) seems to simply be a way of writing: IAtom anAtom = new Atom("C"); or IDebugAtom aDebugAtom = new DebugAtom("C"); In otherwords, in your refactoring you'd write builder.newInstance(IAtom.class, "C"); which is just a rearrangement of what I wrote just above. So why do we need a builder? OK, you'd say that the builder pattern allows us to have different implementaitons - but how does your refactoring support that? It looks like I can only specify the interface for an atom (say), i.e., IAtom and the code returns an Atom object. How would do I specify a different implementaition of the IAtom interface? Do I need to implement a new builder? But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Sorry for all these questions, but I have never really wrapped my head around the builder architecture and this looks like a good time to clear up my fuzziness. (Also it might be useful to have this discussion on the list) Iobviously haven't gone through it line by line, but it seems OK. A second review will be useful ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 |
From: SourceForge.net <no...@so...> - 2010-04-03 19:30:52
|
Patches item #2977074, was opened at 2010-03-26 10:59 Message generated for change (Comment added) made by rajarshi You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: master Group: Needs Revision Status: Open Resolution: None Priority: 5 Private: No Submitted By: Egon Willighagen (egonw) Assigned to: Nobody/Anonymous (nobody) Summary: new IChemObjectBuilder API Initial Comment: Hi all, as recently announced, I have written a new API for the IChemObjectBuilder infrastructure... (email attached below). Patch is massive (2MB) and targeted at master. One area where it could benefit other problems, is the dependency of \'isomorphism\' on data... using this new builder architecture, the module can implement those classes it itself, and only those it really needs. That way, we can remove the dependency if it on the data module, cleaning up things considerably. The email: The current model has some flaws: 1. it is a enormously large builder 2. making a new constructor for one of the IChemObject classes requires updating the IChemObjectBuilder 3. ... and requires all implementations to be updated 4. a new implementation (for whatever reason) of the IChemObject interfaces requires you to implement everything 5. currently some data classes do not extend IChemObject but are required for an IChemObjectBuilder, but do not share a common interface Particularly, the last is rather annoying, as it basically requires one to make a full implementation or none. The design I have been working on removes these limitations, and does the following. It introduces a new builder with a single method: public interface IChemObjectBuilder { public <T extends ICDKObject>T newInstance(Class<T> clazz, Object... params) throws IllegalArgumentException; } So, instead of: DefaultChemObjectBuilder.getInstance().newAtom(\"C\"); you would now do (given builder = DefaultChemObjectBuilder.getInstance()): builder.newInstance(IAtom.class, \"C\"); I am tempted to rename the newInstance() to just new()... comments on that most welcome. Passing a Point2d, would simply be: builder.newInstance(IAtom.class, \"C\", new Point2d(1,2)); If the caller would pass incorrect arguments, an IllegalArgumentException is called, such as happens if I would call: builder.newInstance(IAtom.class, new AtomContainer()); At the same time, this introduces a new ICDKObject interface, with only the getBuilder() method which used to be in the IChemObject... this allows IAtomParity to share a common interface, without inheriting listener stuff, etc. ---------------------------------------------------------------------- >Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-03 15:30 Message: > That was temporary only, so that I could do things in steps... later I > renamed it to the old naming conventions, dropping the 'new' bits again. That makes sense. >> What is the motivation for ICDKObject being a super class of IChemObject? > Because not all classes currently extend IChemObject, presumably because > it was decided that they should not have listener functionality, etc.. > The ICDKObject class is there to simplify the builder API, as they now > have a common ancestor. Wouldn't it be a good to make things consistent and make everything derived from ICDKObject? Given that the current patch is a significant change, it seems reasonable to follow up with another change that would help things become more consistent. Obviously, this doesn't apply to this patch, but I think it should follow on from this >> A more fundamental question I have is that the current design (vai >> newInstance) seems to simply be a way of writing: >> >> IAtom anAtom = new Atom("C"); >> >> or >> >> IDebugAtom aDebugAtom = new DebugAtom("C"); > These two alternative do not allow you to select an alternative > implementation, like NNAtom... Not sure I understand; wouldn't one just do NNAtom nnatom = new NNAtom("C") I realize that in your version, you have a single function that is parametrized on the class of the desired object, but it seems to be fundamentally the same - to say choose between 3 possible classes you'd right if (desired == 'debug) return builder.newInstance(DebugAtom.class,...) else if (desired == 'nn') return builder.newInstance(NNAtom.class, ...) which seems to be the same as if (desired == 'debug) return new DebugAtom("C") else if (desired == 'nn') return new NNAtom("C") > indeed, with the difference that algorithms can now do stuff like: > > public IAtom someMethod(IMolecule mol) { > return mol.getBuilder().new(IAtom.class, "C"); > } > > and therefore not force you to use the Atom, but allow any > implementation. But you are specifying IAtom and not say DebugAtom or NNAtom. If I want NNAtom, I have to change that function call, right? I think I get the design issues here, but I have to say that this system seems to lead to an overly complex object hierarchy. ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-03-31 07:02 Message: > I'd go with new() rather than newInstance (but will the compiler allow a > function name with the same name as a keyword?) OK. Had considered that, but thought to stick with the Java newInstance() naming convention... > What is the need for NewDefaultChemObjectBuilder in addition to > DefaultChemObjectBuilder? Is the latter meant to be deprecated at one > point? Might it be a good idea to just rename the former to the latter? > Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with > the old method names? So you looked at the individual patches? :) That was temporary only, so that I could do things in steps... later I renamed it to the old naming conventions, dropping the 'new' bits again. > What is the motivation for ICDKObject being a super class of IChemObject? Because not all classes currently extend IChemObject, presumably because it was decided that they should not have listener functionality, etc.. The ICDKObject class is there to simplify the builder API, as they now have a common ancestor. > Couldn't IChemObject have been the base class for all CDK objects? Yes, but I think the choice to not do that for some classes was deliberate... would need to dig up email archives on why IAtomParity, IMolecularFormular, ... do not extend IChemObject, but would like to keep that decision separate from the builder refactoring anyway... > You mention the case of IAtomParity as a case where ICDKObject would be useful > - but can you explain some more? If no listeners are added, why would a > class have a problem? Overhead I guess. > A more fundamental question I have is that the current design (vai > newInstance) seems to simply be a way of writing: > > IAtom anAtom = new Atom("C"); > > or > > IDebugAtom aDebugAtom = new DebugAtom("C"); These two alternative do not allow you to select an alternative implementation, like NNAtom... > In otherwords, in your refactoring you'd write > > builder.newInstance(IAtom.class, "C"); > > which is just a rearrangement of what I wrote just above. indeed, with the difference that algorithms can now do stuff like: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } and therefore not force you to use the Atom, but allow any implementation. > So why do we need a builder? OK, you'd say that the builder pattern allows us to have > different implementaitons Yes, but the previous IChemObjectBuilder did that too. > - but how does your refactoring support that? The difference now is, that the new API allows one to implement a subset of classes... for example, it could implement only IMolecule and related stuff, but not IChemModel etc. Now, relating this to the UIT... the UIT needs only a few classes... IMolecule, IAtom, IBond,,, for setting up a query atomcontainer... with the new API it could use a shortlist of classes in the isomorphism module itself, and therefore removing the dependency on the data module. > It looks like I can only specify the interface for an atom (say), i.e., IAtom > and the code returns an Atom object. How would do I specify a different > implementaition of the IAtom interface? Because each IChemObject implementation has a specific builder... so, we can use the code: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } This way, if the IMolecule was a NNMolecule, it returns an NNAtom... if the input was a DebugMolecule it would return a DebugAtom. > Do I need to implement a new builder? If you implement new classes for the IChemObject interfaces you also have to implement a new IChemObjectBuilder... the old IChemObjectBuilder had a million methods, making it much more work to make a new implementation. The new API only has one method, and allows you to throw an Exception for IChemObject's that cannot be build... > But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Yes. But the new API requires you much less copy/pasting than the 1.2.x builders. > Sorry for all these questions, Not at all! I should apologize for not providing enough info in the first place. > but I have never really wrapped my head around the builder architecture and this looks > like a good time to clear up my fuzziness. (Also it might be useful to have this discussion > on the list) Please read my answers, and decide then if we should copy/paste this message to cdk-devel... > obviously haven't gone through it line by line, but it seems OK. A second review will be useful OK. Meanwhile, I will prepare a "newInstance() -> new()" patch. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-03-29 14:00 Message: I'd go with new() rather than newInstance (but will the compiler allow a function name with the same name as a keyword?) What is the need for NewDefaultChemObjectBuilder in addition to DefaultChemObjectBuilder? Is the latter meant to be deprecated at one point? Might it be a good idea to just rename the former to the latter? Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with the old method names? What is the motivation for ICDKObject being a super class of IChemObject? Couldn't IChemObject have been the base class for all CDK objects? You mention the case of IAtomParity as a case where ICDKObject would be useful - but can you explain some more? If no listeners are added, why would a class have a problem? A more fundamental question I have is that the current design (vai newInstance) seems to simply be a way of writing: IAtom anAtom = new Atom("C"); or IDebugAtom aDebugAtom = new DebugAtom("C"); In otherwords, in your refactoring you'd write builder.newInstance(IAtom.class, "C"); which is just a rearrangement of what I wrote just above. So why do we need a builder? OK, you'd say that the builder pattern allows us to have different implementaitons - but how does your refactoring support that? It looks like I can only specify the interface for an atom (say), i.e., IAtom and the code returns an Atom object. How would do I specify a different implementaition of the IAtom interface? Do I need to implement a new builder? But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Sorry for all these questions, but I have never really wrapped my head around the builder architecture and this looks like a good time to clear up my fuzziness. (Also it might be useful to have this discussion on the list) Iobviously haven't gone through it line by line, but it seems OK. A second review will be useful ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 |
From: SourceForge.net <no...@so...> - 2010-04-03 21:57:13
|
Patches item #2977074, was opened at 2010-03-26 15:59 Message generated for change (Comment added) made by egonw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: master Group: Needs Revision Status: Open Resolution: None Priority: 5 Private: No Submitted By: Egon Willighagen (egonw) Assigned to: Nobody/Anonymous (nobody) Summary: new IChemObjectBuilder API Initial Comment: Hi all, as recently announced, I have written a new API for the IChemObjectBuilder infrastructure... (email attached below). Patch is massive (2MB) and targeted at master. One area where it could benefit other problems, is the dependency of \'isomorphism\' on data... using this new builder architecture, the module can implement those classes it itself, and only those it really needs. That way, we can remove the dependency if it on the data module, cleaning up things considerably. The email: The current model has some flaws: 1. it is a enormously large builder 2. making a new constructor for one of the IChemObject classes requires updating the IChemObjectBuilder 3. ... and requires all implementations to be updated 4. a new implementation (for whatever reason) of the IChemObject interfaces requires you to implement everything 5. currently some data classes do not extend IChemObject but are required for an IChemObjectBuilder, but do not share a common interface Particularly, the last is rather annoying, as it basically requires one to make a full implementation or none. The design I have been working on removes these limitations, and does the following. It introduces a new builder with a single method: public interface IChemObjectBuilder { public <T extends ICDKObject>T newInstance(Class<T> clazz, Object... params) throws IllegalArgumentException; } So, instead of: DefaultChemObjectBuilder.getInstance().newAtom(\"C\"); you would now do (given builder = DefaultChemObjectBuilder.getInstance()): builder.newInstance(IAtom.class, \"C\"); I am tempted to rename the newInstance() to just new()... comments on that most welcome. Passing a Point2d, would simply be: builder.newInstance(IAtom.class, \"C\", new Point2d(1,2)); If the caller would pass incorrect arguments, an IllegalArgumentException is called, such as happens if I would call: builder.newInstance(IAtom.class, new AtomContainer()); At the same time, this introduces a new ICDKObject interface, with only the getBuilder() method which used to be in the IChemObject... this allows IAtomParity to share a common interface, without inheriting listener stuff, etc. ---------------------------------------------------------------------- >Comment By: Egon Willighagen (egonw) Date: 2010-04-03 23:57 Message: 1. IChemObject extends ICDKObject, so, I think 'everything' already extends it... 2. you still can do new NNAtom()... if that's what you want, it's what you do. The point of the builder (the new and the current) is to make an object without having to hardcode the actual implementation. That way, CDK code can be written independently from a particular implementation. That is, we want to be able to run the same code in Debug, NoNotify, and Default. Or with any new (faster) implementation. The old and new builder make it possible to *not* have to choose a particular implementation. 3. the function call will always be the same... it is the IChemObjectBuilder implementation that determines what impl class you get. The DefaultChemObjectBuilder gives an Atom, the NoNotificationChemObjectBuilder a NNAtom, and the DebugChemObjectBuilder a DebugAtom. All calls are simply taking IAtom.class as argument. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-03 21:30 Message: > That was temporary only, so that I could do things in steps... later I > renamed it to the old naming conventions, dropping the 'new' bits again. That makes sense. >> What is the motivation for ICDKObject being a super class of IChemObject? > Because not all classes currently extend IChemObject, presumably because > it was decided that they should not have listener functionality, etc.. > The ICDKObject class is there to simplify the builder API, as they now > have a common ancestor. Wouldn't it be a good to make things consistent and make everything derived from ICDKObject? Given that the current patch is a significant change, it seems reasonable to follow up with another change that would help things become more consistent. Obviously, this doesn't apply to this patch, but I think it should follow on from this >> A more fundamental question I have is that the current design (vai >> newInstance) seems to simply be a way of writing: >> >> IAtom anAtom = new Atom("C"); >> >> or >> >> IDebugAtom aDebugAtom = new DebugAtom("C"); > These two alternative do not allow you to select an alternative > implementation, like NNAtom... Not sure I understand; wouldn't one just do NNAtom nnatom = new NNAtom("C") I realize that in your version, you have a single function that is parametrized on the class of the desired object, but it seems to be fundamentally the same - to say choose between 3 possible classes you'd right if (desired == 'debug) return builder.newInstance(DebugAtom.class,...) else if (desired == 'nn') return builder.newInstance(NNAtom.class, ...) which seems to be the same as if (desired == 'debug) return new DebugAtom("C") else if (desired == 'nn') return new NNAtom("C") > indeed, with the difference that algorithms can now do stuff like: > > public IAtom someMethod(IMolecule mol) { > return mol.getBuilder().new(IAtom.class, "C"); > } > > and therefore not force you to use the Atom, but allow any > implementation. But you are specifying IAtom and not say DebugAtom or NNAtom. If I want NNAtom, I have to change that function call, right? I think I get the design issues here, but I have to say that this system seems to lead to an overly complex object hierarchy. ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-03-31 13:02 Message: > I'd go with new() rather than newInstance (but will the compiler allow a > function name with the same name as a keyword?) OK. Had considered that, but thought to stick with the Java newInstance() naming convention... > What is the need for NewDefaultChemObjectBuilder in addition to > DefaultChemObjectBuilder? Is the latter meant to be deprecated at one > point? Might it be a good idea to just rename the former to the latter? > Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with > the old method names? So you looked at the individual patches? :) That was temporary only, so that I could do things in steps... later I renamed it to the old naming conventions, dropping the 'new' bits again. > What is the motivation for ICDKObject being a super class of IChemObject? Because not all classes currently extend IChemObject, presumably because it was decided that they should not have listener functionality, etc.. The ICDKObject class is there to simplify the builder API, as they now have a common ancestor. > Couldn't IChemObject have been the base class for all CDK objects? Yes, but I think the choice to not do that for some classes was deliberate... would need to dig up email archives on why IAtomParity, IMolecularFormular, ... do not extend IChemObject, but would like to keep that decision separate from the builder refactoring anyway... > You mention the case of IAtomParity as a case where ICDKObject would be useful > - but can you explain some more? If no listeners are added, why would a > class have a problem? Overhead I guess. > A more fundamental question I have is that the current design (vai > newInstance) seems to simply be a way of writing: > > IAtom anAtom = new Atom("C"); > > or > > IDebugAtom aDebugAtom = new DebugAtom("C"); These two alternative do not allow you to select an alternative implementation, like NNAtom... > In otherwords, in your refactoring you'd write > > builder.newInstance(IAtom.class, "C"); > > which is just a rearrangement of what I wrote just above. indeed, with the difference that algorithms can now do stuff like: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } and therefore not force you to use the Atom, but allow any implementation. > So why do we need a builder? OK, you'd say that the builder pattern allows us to have > different implementaitons Yes, but the previous IChemObjectBuilder did that too. > - but how does your refactoring support that? The difference now is, that the new API allows one to implement a subset of classes... for example, it could implement only IMolecule and related stuff, but not IChemModel etc. Now, relating this to the UIT... the UIT needs only a few classes... IMolecule, IAtom, IBond,,, for setting up a query atomcontainer... with the new API it could use a shortlist of classes in the isomorphism module itself, and therefore removing the dependency on the data module. > It looks like I can only specify the interface for an atom (say), i.e., IAtom > and the code returns an Atom object. How would do I specify a different > implementaition of the IAtom interface? Because each IChemObject implementation has a specific builder... so, we can use the code: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } This way, if the IMolecule was a NNMolecule, it returns an NNAtom... if the input was a DebugMolecule it would return a DebugAtom. > Do I need to implement a new builder? If you implement new classes for the IChemObject interfaces you also have to implement a new IChemObjectBuilder... the old IChemObjectBuilder had a million methods, making it much more work to make a new implementation. The new API only has one method, and allows you to throw an Exception for IChemObject's that cannot be build... > But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Yes. But the new API requires you much less copy/pasting than the 1.2.x builders. > Sorry for all these questions, Not at all! I should apologize for not providing enough info in the first place. > but I have never really wrapped my head around the builder architecture and this looks > like a good time to clear up my fuzziness. (Also it might be useful to have this discussion > on the list) Please read my answers, and decide then if we should copy/paste this message to cdk-devel... > obviously haven't gone through it line by line, but it seems OK. A second review will be useful OK. Meanwhile, I will prepare a "newInstance() -> new()" patch. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-03-29 20:00 Message: I'd go with new() rather than newInstance (but will the compiler allow a function name with the same name as a keyword?) What is the need for NewDefaultChemObjectBuilder in addition to DefaultChemObjectBuilder? Is the latter meant to be deprecated at one point? Might it be a good idea to just rename the former to the latter? Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with the old method names? What is the motivation for ICDKObject being a super class of IChemObject? Couldn't IChemObject have been the base class for all CDK objects? You mention the case of IAtomParity as a case where ICDKObject would be useful - but can you explain some more? If no listeners are added, why would a class have a problem? A more fundamental question I have is that the current design (vai newInstance) seems to simply be a way of writing: IAtom anAtom = new Atom("C"); or IDebugAtom aDebugAtom = new DebugAtom("C"); In otherwords, in your refactoring you'd write builder.newInstance(IAtom.class, "C"); which is just a rearrangement of what I wrote just above. So why do we need a builder? OK, you'd say that the builder pattern allows us to have different implementaitons - but how does your refactoring support that? It looks like I can only specify the interface for an atom (say), i.e., IAtom and the code returns an Atom object. How would do I specify a different implementaition of the IAtom interface? Do I need to implement a new builder? But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Sorry for all these questions, but I have never really wrapped my head around the builder architecture and this looks like a good time to clear up my fuzziness. (Also it might be useful to have this discussion on the list) Iobviously haven't gone through it line by line, but it seems OK. A second review will be useful ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 |
From: SourceForge.net <no...@so...> - 2010-04-03 23:40:49
|
Patches item #2977074, was opened at 2010-03-26 10:59 Message generated for change (Comment added) made by rajarshi You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: master Group: Needs Revision Status: Open Resolution: None Priority: 5 Private: No Submitted By: Egon Willighagen (egonw) Assigned to: Nobody/Anonymous (nobody) Summary: new IChemObjectBuilder API Initial Comment: Hi all, as recently announced, I have written a new API for the IChemObjectBuilder infrastructure... (email attached below). Patch is massive (2MB) and targeted at master. One area where it could benefit other problems, is the dependency of \'isomorphism\' on data... using this new builder architecture, the module can implement those classes it itself, and only those it really needs. That way, we can remove the dependency if it on the data module, cleaning up things considerably. The email: The current model has some flaws: 1. it is a enormously large builder 2. making a new constructor for one of the IChemObject classes requires updating the IChemObjectBuilder 3. ... and requires all implementations to be updated 4. a new implementation (for whatever reason) of the IChemObject interfaces requires you to implement everything 5. currently some data classes do not extend IChemObject but are required for an IChemObjectBuilder, but do not share a common interface Particularly, the last is rather annoying, as it basically requires one to make a full implementation or none. The design I have been working on removes these limitations, and does the following. It introduces a new builder with a single method: public interface IChemObjectBuilder { public <T extends ICDKObject>T newInstance(Class<T> clazz, Object... params) throws IllegalArgumentException; } So, instead of: DefaultChemObjectBuilder.getInstance().newAtom(\"C\"); you would now do (given builder = DefaultChemObjectBuilder.getInstance()): builder.newInstance(IAtom.class, \"C\"); I am tempted to rename the newInstance() to just new()... comments on that most welcome. Passing a Point2d, would simply be: builder.newInstance(IAtom.class, \"C\", new Point2d(1,2)); If the caller would pass incorrect arguments, an IllegalArgumentException is called, such as happens if I would call: builder.newInstance(IAtom.class, new AtomContainer()); At the same time, this introduces a new ICDKObject interface, with only the getBuilder() method which used to be in the IChemObject... this allows IAtomParity to share a common interface, without inheriting listener stuff, etc. ---------------------------------------------------------------------- >Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-03 19:40 Message: Aah, OK I got it (I think :). The patch itself looks good to me ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-04-03 17:57 Message: 1. IChemObject extends ICDKObject, so, I think 'everything' already extends it... 2. you still can do new NNAtom()... if that's what you want, it's what you do. The point of the builder (the new and the current) is to make an object without having to hardcode the actual implementation. That way, CDK code can be written independently from a particular implementation. That is, we want to be able to run the same code in Debug, NoNotify, and Default. Or with any new (faster) implementation. The old and new builder make it possible to *not* have to choose a particular implementation. 3. the function call will always be the same... it is the IChemObjectBuilder implementation that determines what impl class you get. The DefaultChemObjectBuilder gives an Atom, the NoNotificationChemObjectBuilder a NNAtom, and the DebugChemObjectBuilder a DebugAtom. All calls are simply taking IAtom.class as argument. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-03 15:30 Message: > That was temporary only, so that I could do things in steps... later I > renamed it to the old naming conventions, dropping the 'new' bits again. That makes sense. >> What is the motivation for ICDKObject being a super class of IChemObject? > Because not all classes currently extend IChemObject, presumably because > it was decided that they should not have listener functionality, etc.. > The ICDKObject class is there to simplify the builder API, as they now > have a common ancestor. Wouldn't it be a good to make things consistent and make everything derived from ICDKObject? Given that the current patch is a significant change, it seems reasonable to follow up with another change that would help things become more consistent. Obviously, this doesn't apply to this patch, but I think it should follow on from this >> A more fundamental question I have is that the current design (vai >> newInstance) seems to simply be a way of writing: >> >> IAtom anAtom = new Atom("C"); >> >> or >> >> IDebugAtom aDebugAtom = new DebugAtom("C"); > These two alternative do not allow you to select an alternative > implementation, like NNAtom... Not sure I understand; wouldn't one just do NNAtom nnatom = new NNAtom("C") I realize that in your version, you have a single function that is parametrized on the class of the desired object, but it seems to be fundamentally the same - to say choose between 3 possible classes you'd right if (desired == 'debug) return builder.newInstance(DebugAtom.class,...) else if (desired == 'nn') return builder.newInstance(NNAtom.class, ...) which seems to be the same as if (desired == 'debug) return new DebugAtom("C") else if (desired == 'nn') return new NNAtom("C") > indeed, with the difference that algorithms can now do stuff like: > > public IAtom someMethod(IMolecule mol) { > return mol.getBuilder().new(IAtom.class, "C"); > } > > and therefore not force you to use the Atom, but allow any > implementation. But you are specifying IAtom and not say DebugAtom or NNAtom. If I want NNAtom, I have to change that function call, right? I think I get the design issues here, but I have to say that this system seems to lead to an overly complex object hierarchy. ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-03-31 07:02 Message: > I'd go with new() rather than newInstance (but will the compiler allow a > function name with the same name as a keyword?) OK. Had considered that, but thought to stick with the Java newInstance() naming convention... > What is the need for NewDefaultChemObjectBuilder in addition to > DefaultChemObjectBuilder? Is the latter meant to be deprecated at one > point? Might it be a good idea to just rename the former to the latter? > Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with > the old method names? So you looked at the individual patches? :) That was temporary only, so that I could do things in steps... later I renamed it to the old naming conventions, dropping the 'new' bits again. > What is the motivation for ICDKObject being a super class of IChemObject? Because not all classes currently extend IChemObject, presumably because it was decided that they should not have listener functionality, etc.. The ICDKObject class is there to simplify the builder API, as they now have a common ancestor. > Couldn't IChemObject have been the base class for all CDK objects? Yes, but I think the choice to not do that for some classes was deliberate... would need to dig up email archives on why IAtomParity, IMolecularFormular, ... do not extend IChemObject, but would like to keep that decision separate from the builder refactoring anyway... > You mention the case of IAtomParity as a case where ICDKObject would be useful > - but can you explain some more? If no listeners are added, why would a > class have a problem? Overhead I guess. > A more fundamental question I have is that the current design (vai > newInstance) seems to simply be a way of writing: > > IAtom anAtom = new Atom("C"); > > or > > IDebugAtom aDebugAtom = new DebugAtom("C"); These two alternative do not allow you to select an alternative implementation, like NNAtom... > In otherwords, in your refactoring you'd write > > builder.newInstance(IAtom.class, "C"); > > which is just a rearrangement of what I wrote just above. indeed, with the difference that algorithms can now do stuff like: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } and therefore not force you to use the Atom, but allow any implementation. > So why do we need a builder? OK, you'd say that the builder pattern allows us to have > different implementaitons Yes, but the previous IChemObjectBuilder did that too. > - but how does your refactoring support that? The difference now is, that the new API allows one to implement a subset of classes... for example, it could implement only IMolecule and related stuff, but not IChemModel etc. Now, relating this to the UIT... the UIT needs only a few classes... IMolecule, IAtom, IBond,,, for setting up a query atomcontainer... with the new API it could use a shortlist of classes in the isomorphism module itself, and therefore removing the dependency on the data module. > It looks like I can only specify the interface for an atom (say), i.e., IAtom > and the code returns an Atom object. How would do I specify a different > implementaition of the IAtom interface? Because each IChemObject implementation has a specific builder... so, we can use the code: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } This way, if the IMolecule was a NNMolecule, it returns an NNAtom... if the input was a DebugMolecule it would return a DebugAtom. > Do I need to implement a new builder? If you implement new classes for the IChemObject interfaces you also have to implement a new IChemObjectBuilder... the old IChemObjectBuilder had a million methods, making it much more work to make a new implementation. The new API only has one method, and allows you to throw an Exception for IChemObject's that cannot be build... > But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Yes. But the new API requires you much less copy/pasting than the 1.2.x builders. > Sorry for all these questions, Not at all! I should apologize for not providing enough info in the first place. > but I have never really wrapped my head around the builder architecture and this looks > like a good time to clear up my fuzziness. (Also it might be useful to have this discussion > on the list) Please read my answers, and decide then if we should copy/paste this message to cdk-devel... > obviously haven't gone through it line by line, but it seems OK. A second review will be useful OK. Meanwhile, I will prepare a "newInstance() -> new()" patch. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-03-29 14:00 Message: I'd go with new() rather than newInstance (but will the compiler allow a function name with the same name as a keyword?) What is the need for NewDefaultChemObjectBuilder in addition to DefaultChemObjectBuilder? Is the latter meant to be deprecated at one point? Might it be a good idea to just rename the former to the latter? Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with the old method names? What is the motivation for ICDKObject being a super class of IChemObject? Couldn't IChemObject have been the base class for all CDK objects? You mention the case of IAtomParity as a case where ICDKObject would be useful - but can you explain some more? If no listeners are added, why would a class have a problem? A more fundamental question I have is that the current design (vai newInstance) seems to simply be a way of writing: IAtom anAtom = new Atom("C"); or IDebugAtom aDebugAtom = new DebugAtom("C"); In otherwords, in your refactoring you'd write builder.newInstance(IAtom.class, "C"); which is just a rearrangement of what I wrote just above. So why do we need a builder? OK, you'd say that the builder pattern allows us to have different implementaitons - but how does your refactoring support that? It looks like I can only specify the interface for an atom (say), i.e., IAtom and the code returns an Atom object. How would do I specify a different implementaition of the IAtom interface? Do I need to implement a new builder? But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Sorry for all these questions, but I have never really wrapped my head around the builder architecture and this looks like a good time to clear up my fuzziness. (Also it might be useful to have this discussion on the list) Iobviously haven't gone through it line by line, but it seems OK. A second review will be useful ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 |
From: SourceForge.net <no...@so...> - 2010-04-04 09:55:43
|
Patches item #2977074, was opened at 2010-03-26 14:59 Message generated for change (Comment added) made by mark_rynbeek You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: master Group: Needs Revision Status: Open Resolution: None Priority: 5 Private: No Submitted By: Egon Willighagen (egonw) Assigned to: Nobody/Anonymous (nobody) Summary: new IChemObjectBuilder API Initial Comment: Hi all, as recently announced, I have written a new API for the IChemObjectBuilder infrastructure... (email attached below). Patch is massive (2MB) and targeted at master. One area where it could benefit other problems, is the dependency of \'isomorphism\' on data... using this new builder architecture, the module can implement those classes it itself, and only those it really needs. That way, we can remove the dependency if it on the data module, cleaning up things considerably. The email: The current model has some flaws: 1. it is a enormously large builder 2. making a new constructor for one of the IChemObject classes requires updating the IChemObjectBuilder 3. ... and requires all implementations to be updated 4. a new implementation (for whatever reason) of the IChemObject interfaces requires you to implement everything 5. currently some data classes do not extend IChemObject but are required for an IChemObjectBuilder, but do not share a common interface Particularly, the last is rather annoying, as it basically requires one to make a full implementation or none. The design I have been working on removes these limitations, and does the following. It introduces a new builder with a single method: public interface IChemObjectBuilder { public <T extends ICDKObject>T newInstance(Class<T> clazz, Object... params) throws IllegalArgumentException; } So, instead of: DefaultChemObjectBuilder.getInstance().newAtom(\"C\"); you would now do (given builder = DefaultChemObjectBuilder.getInstance()): builder.newInstance(IAtom.class, \"C\"); I am tempted to rename the newInstance() to just new()... comments on that most welcome. Passing a Point2d, would simply be: builder.newInstance(IAtom.class, \"C\", new Point2d(1,2)); If the caller would pass incorrect arguments, an IllegalArgumentException is called, such as happens if I would call: builder.newInstance(IAtom.class, new AtomContainer()); At the same time, this introduces a new ICDKObject interface, with only the getBuilder() method which used to be in the IChemObject... this allows IAtomParity to share a common interface, without inheriting listener stuff, etc. ---------------------------------------------------------------------- Comment By: Mark Rijnbeek (mark_rynbeek) Date: 2010-04-04 10:55 Message: Quote " That way, CDK code can be written independently from a particular implementation. That is, we want to be able to run the same code in Debug, NoNotify, and Default" That makes sense, but I still don't understand the need for subclasses with regards to debugging and notifying. For example I never ecountered an object model where debugging is implemented through separate subclasses. Unless debugging is very elaborate and introduces lots of code, why put it in a subclass to start with? My question would be the samish for the notification. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-04 00:40 Message: Aah, OK I got it (I think :). The patch itself looks good to me ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-04-03 22:57 Message: 1. IChemObject extends ICDKObject, so, I think 'everything' already extends it... 2. you still can do new NNAtom()... if that's what you want, it's what you do. The point of the builder (the new and the current) is to make an object without having to hardcode the actual implementation. That way, CDK code can be written independently from a particular implementation. That is, we want to be able to run the same code in Debug, NoNotify, and Default. Or with any new (faster) implementation. The old and new builder make it possible to *not* have to choose a particular implementation. 3. the function call will always be the same... it is the IChemObjectBuilder implementation that determines what impl class you get. The DefaultChemObjectBuilder gives an Atom, the NoNotificationChemObjectBuilder a NNAtom, and the DebugChemObjectBuilder a DebugAtom. All calls are simply taking IAtom.class as argument. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-03 20:30 Message: > That was temporary only, so that I could do things in steps... later I > renamed it to the old naming conventions, dropping the 'new' bits again. That makes sense. >> What is the motivation for ICDKObject being a super class of IChemObject? > Because not all classes currently extend IChemObject, presumably because > it was decided that they should not have listener functionality, etc.. > The ICDKObject class is there to simplify the builder API, as they now > have a common ancestor. Wouldn't it be a good to make things consistent and make everything derived from ICDKObject? Given that the current patch is a significant change, it seems reasonable to follow up with another change that would help things become more consistent. Obviously, this doesn't apply to this patch, but I think it should follow on from this >> A more fundamental question I have is that the current design (vai >> newInstance) seems to simply be a way of writing: >> >> IAtom anAtom = new Atom("C"); >> >> or >> >> IDebugAtom aDebugAtom = new DebugAtom("C"); > These two alternative do not allow you to select an alternative > implementation, like NNAtom... Not sure I understand; wouldn't one just do NNAtom nnatom = new NNAtom("C") I realize that in your version, you have a single function that is parametrized on the class of the desired object, but it seems to be fundamentally the same - to say choose between 3 possible classes you'd right if (desired == 'debug) return builder.newInstance(DebugAtom.class,...) else if (desired == 'nn') return builder.newInstance(NNAtom.class, ...) which seems to be the same as if (desired == 'debug) return new DebugAtom("C") else if (desired == 'nn') return new NNAtom("C") > indeed, with the difference that algorithms can now do stuff like: > > public IAtom someMethod(IMolecule mol) { > return mol.getBuilder().new(IAtom.class, "C"); > } > > and therefore not force you to use the Atom, but allow any > implementation. But you are specifying IAtom and not say DebugAtom or NNAtom. If I want NNAtom, I have to change that function call, right? I think I get the design issues here, but I have to say that this system seems to lead to an overly complex object hierarchy. ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-03-31 12:02 Message: > I'd go with new() rather than newInstance (but will the compiler allow a > function name with the same name as a keyword?) OK. Had considered that, but thought to stick with the Java newInstance() naming convention... > What is the need for NewDefaultChemObjectBuilder in addition to > DefaultChemObjectBuilder? Is the latter meant to be deprecated at one > point? Might it be a good idea to just rename the former to the latter? > Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with > the old method names? So you looked at the individual patches? :) That was temporary only, so that I could do things in steps... later I renamed it to the old naming conventions, dropping the 'new' bits again. > What is the motivation for ICDKObject being a super class of IChemObject? Because not all classes currently extend IChemObject, presumably because it was decided that they should not have listener functionality, etc.. The ICDKObject class is there to simplify the builder API, as they now have a common ancestor. > Couldn't IChemObject have been the base class for all CDK objects? Yes, but I think the choice to not do that for some classes was deliberate... would need to dig up email archives on why IAtomParity, IMolecularFormular, ... do not extend IChemObject, but would like to keep that decision separate from the builder refactoring anyway... > You mention the case of IAtomParity as a case where ICDKObject would be useful > - but can you explain some more? If no listeners are added, why would a > class have a problem? Overhead I guess. > A more fundamental question I have is that the current design (vai > newInstance) seems to simply be a way of writing: > > IAtom anAtom = new Atom("C"); > > or > > IDebugAtom aDebugAtom = new DebugAtom("C"); These two alternative do not allow you to select an alternative implementation, like NNAtom... > In otherwords, in your refactoring you'd write > > builder.newInstance(IAtom.class, "C"); > > which is just a rearrangement of what I wrote just above. indeed, with the difference that algorithms can now do stuff like: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } and therefore not force you to use the Atom, but allow any implementation. > So why do we need a builder? OK, you'd say that the builder pattern allows us to have > different implementaitons Yes, but the previous IChemObjectBuilder did that too. > - but how does your refactoring support that? The difference now is, that the new API allows one to implement a subset of classes... for example, it could implement only IMolecule and related stuff, but not IChemModel etc. Now, relating this to the UIT... the UIT needs only a few classes... IMolecule, IAtom, IBond,,, for setting up a query atomcontainer... with the new API it could use a shortlist of classes in the isomorphism module itself, and therefore removing the dependency on the data module. > It looks like I can only specify the interface for an atom (say), i.e., IAtom > and the code returns an Atom object. How would do I specify a different > implementaition of the IAtom interface? Because each IChemObject implementation has a specific builder... so, we can use the code: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } This way, if the IMolecule was a NNMolecule, it returns an NNAtom... if the input was a DebugMolecule it would return a DebugAtom. > Do I need to implement a new builder? If you implement new classes for the IChemObject interfaces you also have to implement a new IChemObjectBuilder... the old IChemObjectBuilder had a million methods, making it much more work to make a new implementation. The new API only has one method, and allows you to throw an Exception for IChemObject's that cannot be build... > But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Yes. But the new API requires you much less copy/pasting than the 1.2.x builders. > Sorry for all these questions, Not at all! I should apologize for not providing enough info in the first place. > but I have never really wrapped my head around the builder architecture and this looks > like a good time to clear up my fuzziness. (Also it might be useful to have this discussion > on the list) Please read my answers, and decide then if we should copy/paste this message to cdk-devel... > obviously haven't gone through it line by line, but it seems OK. A second review will be useful OK. Meanwhile, I will prepare a "newInstance() -> new()" patch. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-03-29 19:00 Message: I'd go with new() rather than newInstance (but will the compiler allow a function name with the same name as a keyword?) What is the need for NewDefaultChemObjectBuilder in addition to DefaultChemObjectBuilder? Is the latter meant to be deprecated at one point? Might it be a good idea to just rename the former to the latter? Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with the old method names? What is the motivation for ICDKObject being a super class of IChemObject? Couldn't IChemObject have been the base class for all CDK objects? You mention the case of IAtomParity as a case where ICDKObject would be useful - but can you explain some more? If no listeners are added, why would a class have a problem? A more fundamental question I have is that the current design (vai newInstance) seems to simply be a way of writing: IAtom anAtom = new Atom("C"); or IDebugAtom aDebugAtom = new DebugAtom("C"); In otherwords, in your refactoring you'd write builder.newInstance(IAtom.class, "C"); which is just a rearrangement of what I wrote just above. So why do we need a builder? OK, you'd say that the builder pattern allows us to have different implementaitons - but how does your refactoring support that? It looks like I can only specify the interface for an atom (say), i.e., IAtom and the code returns an Atom object. How would do I specify a different implementaition of the IAtom interface? Do I need to implement a new builder? But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Sorry for all these questions, but I have never really wrapped my head around the builder architecture and this looks like a good time to clear up my fuzziness. (Also it might be useful to have this discussion on the list) Iobviously haven't gone through it line by line, but it seems OK. A second review will be useful ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 |
From: SourceForge.net <no...@so...> - 2010-04-04 10:39:09
|
Patches item #2977074, was opened at 2010-03-26 15:59 Message generated for change (Comment added) made by egonw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: master Group: Needs Revision Status: Open Resolution: None Priority: 5 Private: No Submitted By: Egon Willighagen (egonw) Assigned to: Nobody/Anonymous (nobody) Summary: new IChemObjectBuilder API Initial Comment: Hi all, as recently announced, I have written a new API for the IChemObjectBuilder infrastructure... (email attached below). Patch is massive (2MB) and targeted at master. One area where it could benefit other problems, is the dependency of \'isomorphism\' on data... using this new builder architecture, the module can implement those classes it itself, and only those it really needs. That way, we can remove the dependency if it on the data module, cleaning up things considerably. The email: The current model has some flaws: 1. it is a enormously large builder 2. making a new constructor for one of the IChemObject classes requires updating the IChemObjectBuilder 3. ... and requires all implementations to be updated 4. a new implementation (for whatever reason) of the IChemObject interfaces requires you to implement everything 5. currently some data classes do not extend IChemObject but are required for an IChemObjectBuilder, but do not share a common interface Particularly, the last is rather annoying, as it basically requires one to make a full implementation or none. The design I have been working on removes these limitations, and does the following. It introduces a new builder with a single method: public interface IChemObjectBuilder { public <T extends ICDKObject>T newInstance(Class<T> clazz, Object... params) throws IllegalArgumentException; } So, instead of: DefaultChemObjectBuilder.getInstance().newAtom(\"C\"); you would now do (given builder = DefaultChemObjectBuilder.getInstance()): builder.newInstance(IAtom.class, \"C\"); I am tempted to rename the newInstance() to just new()... comments on that most welcome. Passing a Point2d, would simply be: builder.newInstance(IAtom.class, \"C\", new Point2d(1,2)); If the caller would pass incorrect arguments, an IllegalArgumentException is called, such as happens if I would call: builder.newInstance(IAtom.class, new AtomContainer()); At the same time, this introduces a new ICDKObject interface, with only the getBuilder() method which used to be in the IChemObject... this allows IAtomParity to share a common interface, without inheriting listener stuff, etc. ---------------------------------------------------------------------- >Comment By: Egon Willighagen (egonw) Date: 2010-04-04 12:39 Message: Mark, there are two separate issues here: 1. to work against interfaces, which is what this patch is about; 2. do we need more than one implementation. Regarding your question if we really need subclasses. I do not know. The system makes it possible. Now, let's turn around the question. Consider that the CDK are already not of the fastest, what would be your preferred way of introducing debugging functionality. My choice at the time was to not burden the existing classes with additional if statements, making the code more difficult to read, and further decrease the performance of the data classes. But, again, that's outside the scope of this patch. The builder merely allows us to work against interfaces, instead of implementations. ---------------------------------------------------------------------- Comment By: Mark Rijnbeek (mark_rynbeek) Date: 2010-04-04 11:55 Message: Quote " That way, CDK code can be written independently from a particular implementation. That is, we want to be able to run the same code in Debug, NoNotify, and Default" That makes sense, but I still don't understand the need for subclasses with regards to debugging and notifying. For example I never ecountered an object model where debugging is implemented through separate subclasses. Unless debugging is very elaborate and introduces lots of code, why put it in a subclass to start with? My question would be the samish for the notification. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-04 01:40 Message: Aah, OK I got it (I think :). The patch itself looks good to me ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-04-03 23:57 Message: 1. IChemObject extends ICDKObject, so, I think 'everything' already extends it... 2. you still can do new NNAtom()... if that's what you want, it's what you do. The point of the builder (the new and the current) is to make an object without having to hardcode the actual implementation. That way, CDK code can be written independently from a particular implementation. That is, we want to be able to run the same code in Debug, NoNotify, and Default. Or with any new (faster) implementation. The old and new builder make it possible to *not* have to choose a particular implementation. 3. the function call will always be the same... it is the IChemObjectBuilder implementation that determines what impl class you get. The DefaultChemObjectBuilder gives an Atom, the NoNotificationChemObjectBuilder a NNAtom, and the DebugChemObjectBuilder a DebugAtom. All calls are simply taking IAtom.class as argument. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-03 21:30 Message: > That was temporary only, so that I could do things in steps... later I > renamed it to the old naming conventions, dropping the 'new' bits again. That makes sense. >> What is the motivation for ICDKObject being a super class of IChemObject? > Because not all classes currently extend IChemObject, presumably because > it was decided that they should not have listener functionality, etc.. > The ICDKObject class is there to simplify the builder API, as they now > have a common ancestor. Wouldn't it be a good to make things consistent and make everything derived from ICDKObject? Given that the current patch is a significant change, it seems reasonable to follow up with another change that would help things become more consistent. Obviously, this doesn't apply to this patch, but I think it should follow on from this >> A more fundamental question I have is that the current design (vai >> newInstance) seems to simply be a way of writing: >> >> IAtom anAtom = new Atom("C"); >> >> or >> >> IDebugAtom aDebugAtom = new DebugAtom("C"); > These two alternative do not allow you to select an alternative > implementation, like NNAtom... Not sure I understand; wouldn't one just do NNAtom nnatom = new NNAtom("C") I realize that in your version, you have a single function that is parametrized on the class of the desired object, but it seems to be fundamentally the same - to say choose between 3 possible classes you'd right if (desired == 'debug) return builder.newInstance(DebugAtom.class,...) else if (desired == 'nn') return builder.newInstance(NNAtom.class, ...) which seems to be the same as if (desired == 'debug) return new DebugAtom("C") else if (desired == 'nn') return new NNAtom("C") > indeed, with the difference that algorithms can now do stuff like: > > public IAtom someMethod(IMolecule mol) { > return mol.getBuilder().new(IAtom.class, "C"); > } > > and therefore not force you to use the Atom, but allow any > implementation. But you are specifying IAtom and not say DebugAtom or NNAtom. If I want NNAtom, I have to change that function call, right? I think I get the design issues here, but I have to say that this system seems to lead to an overly complex object hierarchy. ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-03-31 13:02 Message: > I'd go with new() rather than newInstance (but will the compiler allow a > function name with the same name as a keyword?) OK. Had considered that, but thought to stick with the Java newInstance() naming convention... > What is the need for NewDefaultChemObjectBuilder in addition to > DefaultChemObjectBuilder? Is the latter meant to be deprecated at one > point? Might it be a good idea to just rename the former to the latter? > Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with > the old method names? So you looked at the individual patches? :) That was temporary only, so that I could do things in steps... later I renamed it to the old naming conventions, dropping the 'new' bits again. > What is the motivation for ICDKObject being a super class of IChemObject? Because not all classes currently extend IChemObject, presumably because it was decided that they should not have listener functionality, etc.. The ICDKObject class is there to simplify the builder API, as they now have a common ancestor. > Couldn't IChemObject have been the base class for all CDK objects? Yes, but I think the choice to not do that for some classes was deliberate... would need to dig up email archives on why IAtomParity, IMolecularFormular, ... do not extend IChemObject, but would like to keep that decision separate from the builder refactoring anyway... > You mention the case of IAtomParity as a case where ICDKObject would be useful > - but can you explain some more? If no listeners are added, why would a > class have a problem? Overhead I guess. > A more fundamental question I have is that the current design (vai > newInstance) seems to simply be a way of writing: > > IAtom anAtom = new Atom("C"); > > or > > IDebugAtom aDebugAtom = new DebugAtom("C"); These two alternative do not allow you to select an alternative implementation, like NNAtom... > In otherwords, in your refactoring you'd write > > builder.newInstance(IAtom.class, "C"); > > which is just a rearrangement of what I wrote just above. indeed, with the difference that algorithms can now do stuff like: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } and therefore not force you to use the Atom, but allow any implementation. > So why do we need a builder? OK, you'd say that the builder pattern allows us to have > different implementaitons Yes, but the previous IChemObjectBuilder did that too. > - but how does your refactoring support that? The difference now is, that the new API allows one to implement a subset of classes... for example, it could implement only IMolecule and related stuff, but not IChemModel etc. Now, relating this to the UIT... the UIT needs only a few classes... IMolecule, IAtom, IBond,,, for setting up a query atomcontainer... with the new API it could use a shortlist of classes in the isomorphism module itself, and therefore removing the dependency on the data module. > It looks like I can only specify the interface for an atom (say), i.e., IAtom > and the code returns an Atom object. How would do I specify a different > implementaition of the IAtom interface? Because each IChemObject implementation has a specific builder... so, we can use the code: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } This way, if the IMolecule was a NNMolecule, it returns an NNAtom... if the input was a DebugMolecule it would return a DebugAtom. > Do I need to implement a new builder? If you implement new classes for the IChemObject interfaces you also have to implement a new IChemObjectBuilder... the old IChemObjectBuilder had a million methods, making it much more work to make a new implementation. The new API only has one method, and allows you to throw an Exception for IChemObject's that cannot be build... > But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Yes. But the new API requires you much less copy/pasting than the 1.2.x builders. > Sorry for all these questions, Not at all! I should apologize for not providing enough info in the first place. > but I have never really wrapped my head around the builder architecture and this looks > like a good time to clear up my fuzziness. (Also it might be useful to have this discussion > on the list) Please read my answers, and decide then if we should copy/paste this message to cdk-devel... > obviously haven't gone through it line by line, but it seems OK. A second review will be useful OK. Meanwhile, I will prepare a "newInstance() -> new()" patch. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-03-29 20:00 Message: I'd go with new() rather than newInstance (but will the compiler allow a function name with the same name as a keyword?) What is the need for NewDefaultChemObjectBuilder in addition to DefaultChemObjectBuilder? Is the latter meant to be deprecated at one point? Might it be a good idea to just rename the former to the latter? Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with the old method names? What is the motivation for ICDKObject being a super class of IChemObject? Couldn't IChemObject have been the base class for all CDK objects? You mention the case of IAtomParity as a case where ICDKObject would be useful - but can you explain some more? If no listeners are added, why would a class have a problem? A more fundamental question I have is that the current design (vai newInstance) seems to simply be a way of writing: IAtom anAtom = new Atom("C"); or IDebugAtom aDebugAtom = new DebugAtom("C"); In otherwords, in your refactoring you'd write builder.newInstance(IAtom.class, "C"); which is just a rearrangement of what I wrote just above. So why do we need a builder? OK, you'd say that the builder pattern allows us to have different implementaitons - but how does your refactoring support that? It looks like I can only specify the interface for an atom (say), i.e., IAtom and the code returns an Atom object. How would do I specify a different implementaition of the IAtom interface? Do I need to implement a new builder? But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Sorry for all these questions, but I have never really wrapped my head around the builder architecture and this looks like a good time to clear up my fuzziness. (Also it might be useful to have this discussion on the list) Iobviously haven't gone through it line by line, but it seems OK. A second review will be useful ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 |
From: SourceForge.net <no...@so...> - 2010-04-04 10:42:33
|
Patches item #2977074, was opened at 2010-03-26 15:59 Message generated for change (Comment added) made by egonw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: master >Group: Needs Review Status: Open Resolution: None Priority: 5 Private: No Submitted By: Egon Willighagen (egonw) Assigned to: Nobody/Anonymous (nobody) Summary: new IChemObjectBuilder API Initial Comment: Hi all, as recently announced, I have written a new API for the IChemObjectBuilder infrastructure... (email attached below). Patch is massive (2MB) and targeted at master. One area where it could benefit other problems, is the dependency of \'isomorphism\' on data... using this new builder architecture, the module can implement those classes it itself, and only those it really needs. That way, we can remove the dependency if it on the data module, cleaning up things considerably. The email: The current model has some flaws: 1. it is a enormously large builder 2. making a new constructor for one of the IChemObject classes requires updating the IChemObjectBuilder 3. ... and requires all implementations to be updated 4. a new implementation (for whatever reason) of the IChemObject interfaces requires you to implement everything 5. currently some data classes do not extend IChemObject but are required for an IChemObjectBuilder, but do not share a common interface Particularly, the last is rather annoying, as it basically requires one to make a full implementation or none. The design I have been working on removes these limitations, and does the following. It introduces a new builder with a single method: public interface IChemObjectBuilder { public <T extends ICDKObject>T newInstance(Class<T> clazz, Object... params) throws IllegalArgumentException; } So, instead of: DefaultChemObjectBuilder.getInstance().newAtom(\"C\"); you would now do (given builder = DefaultChemObjectBuilder.getInstance()): builder.newInstance(IAtom.class, \"C\"); I am tempted to rename the newInstance() to just new()... comments on that most welcome. Passing a Point2d, would simply be: builder.newInstance(IAtom.class, \"C\", new Point2d(1,2)); If the caller would pass incorrect arguments, an IllegalArgumentException is called, such as happens if I would call: builder.newInstance(IAtom.class, new AtomContainer()); At the same time, this introduces a new ICDKObject interface, with only the getBuilder() method which used to be in the IChemObject... this allows IAtomParity to share a common interface, without inheriting listener stuff, etc. ---------------------------------------------------------------------- >Comment By: Egon Willighagen (egonw) Date: 2010-04-04 12:42 Message: Rajarshi, how do you suggest we take this further? You now approved the patch, which is enough for application to master. I agree we should continue talking about the design of the CDK cores; however, I do not think this patch does more than changing the syntax of the currently existing structure, rather than changing the design. As such, I would ask this patch to be applied. ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-04-04 12:39 Message: Mark, there are two separate issues here: 1. to work against interfaces, which is what this patch is about; 2. do we need more than one implementation. Regarding your question if we really need subclasses. I do not know. The system makes it possible. Now, let's turn around the question. Consider that the CDK are already not of the fastest, what would be your preferred way of introducing debugging functionality. My choice at the time was to not burden the existing classes with additional if statements, making the code more difficult to read, and further decrease the performance of the data classes. But, again, that's outside the scope of this patch. The builder merely allows us to work against interfaces, instead of implementations. ---------------------------------------------------------------------- Comment By: Mark Rijnbeek (mark_rynbeek) Date: 2010-04-04 11:55 Message: Quote " That way, CDK code can be written independently from a particular implementation. That is, we want to be able to run the same code in Debug, NoNotify, and Default" That makes sense, but I still don't understand the need for subclasses with regards to debugging and notifying. For example I never ecountered an object model where debugging is implemented through separate subclasses. Unless debugging is very elaborate and introduces lots of code, why put it in a subclass to start with? My question would be the samish for the notification. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-04 01:40 Message: Aah, OK I got it (I think :). The patch itself looks good to me ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-04-03 23:57 Message: 1. IChemObject extends ICDKObject, so, I think 'everything' already extends it... 2. you still can do new NNAtom()... if that's what you want, it's what you do. The point of the builder (the new and the current) is to make an object without having to hardcode the actual implementation. That way, CDK code can be written independently from a particular implementation. That is, we want to be able to run the same code in Debug, NoNotify, and Default. Or with any new (faster) implementation. The old and new builder make it possible to *not* have to choose a particular implementation. 3. the function call will always be the same... it is the IChemObjectBuilder implementation that determines what impl class you get. The DefaultChemObjectBuilder gives an Atom, the NoNotificationChemObjectBuilder a NNAtom, and the DebugChemObjectBuilder a DebugAtom. All calls are simply taking IAtom.class as argument. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-03 21:30 Message: > That was temporary only, so that I could do things in steps... later I > renamed it to the old naming conventions, dropping the 'new' bits again. That makes sense. >> What is the motivation for ICDKObject being a super class of IChemObject? > Because not all classes currently extend IChemObject, presumably because > it was decided that they should not have listener functionality, etc.. > The ICDKObject class is there to simplify the builder API, as they now > have a common ancestor. Wouldn't it be a good to make things consistent and make everything derived from ICDKObject? Given that the current patch is a significant change, it seems reasonable to follow up with another change that would help things become more consistent. Obviously, this doesn't apply to this patch, but I think it should follow on from this >> A more fundamental question I have is that the current design (vai >> newInstance) seems to simply be a way of writing: >> >> IAtom anAtom = new Atom("C"); >> >> or >> >> IDebugAtom aDebugAtom = new DebugAtom("C"); > These two alternative do not allow you to select an alternative > implementation, like NNAtom... Not sure I understand; wouldn't one just do NNAtom nnatom = new NNAtom("C") I realize that in your version, you have a single function that is parametrized on the class of the desired object, but it seems to be fundamentally the same - to say choose between 3 possible classes you'd right if (desired == 'debug) return builder.newInstance(DebugAtom.class,...) else if (desired == 'nn') return builder.newInstance(NNAtom.class, ...) which seems to be the same as if (desired == 'debug) return new DebugAtom("C") else if (desired == 'nn') return new NNAtom("C") > indeed, with the difference that algorithms can now do stuff like: > > public IAtom someMethod(IMolecule mol) { > return mol.getBuilder().new(IAtom.class, "C"); > } > > and therefore not force you to use the Atom, but allow any > implementation. But you are specifying IAtom and not say DebugAtom or NNAtom. If I want NNAtom, I have to change that function call, right? I think I get the design issues here, but I have to say that this system seems to lead to an overly complex object hierarchy. ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-03-31 13:02 Message: > I'd go with new() rather than newInstance (but will the compiler allow a > function name with the same name as a keyword?) OK. Had considered that, but thought to stick with the Java newInstance() naming convention... > What is the need for NewDefaultChemObjectBuilder in addition to > DefaultChemObjectBuilder? Is the latter meant to be deprecated at one > point? Might it be a good idea to just rename the former to the latter? > Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with > the old method names? So you looked at the individual patches? :) That was temporary only, so that I could do things in steps... later I renamed it to the old naming conventions, dropping the 'new' bits again. > What is the motivation for ICDKObject being a super class of IChemObject? Because not all classes currently extend IChemObject, presumably because it was decided that they should not have listener functionality, etc.. The ICDKObject class is there to simplify the builder API, as they now have a common ancestor. > Couldn't IChemObject have been the base class for all CDK objects? Yes, but I think the choice to not do that for some classes was deliberate... would need to dig up email archives on why IAtomParity, IMolecularFormular, ... do not extend IChemObject, but would like to keep that decision separate from the builder refactoring anyway... > You mention the case of IAtomParity as a case where ICDKObject would be useful > - but can you explain some more? If no listeners are added, why would a > class have a problem? Overhead I guess. > A more fundamental question I have is that the current design (vai > newInstance) seems to simply be a way of writing: > > IAtom anAtom = new Atom("C"); > > or > > IDebugAtom aDebugAtom = new DebugAtom("C"); These two alternative do not allow you to select an alternative implementation, like NNAtom... > In otherwords, in your refactoring you'd write > > builder.newInstance(IAtom.class, "C"); > > which is just a rearrangement of what I wrote just above. indeed, with the difference that algorithms can now do stuff like: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } and therefore not force you to use the Atom, but allow any implementation. > So why do we need a builder? OK, you'd say that the builder pattern allows us to have > different implementaitons Yes, but the previous IChemObjectBuilder did that too. > - but how does your refactoring support that? The difference now is, that the new API allows one to implement a subset of classes... for example, it could implement only IMolecule and related stuff, but not IChemModel etc. Now, relating this to the UIT... the UIT needs only a few classes... IMolecule, IAtom, IBond,,, for setting up a query atomcontainer... with the new API it could use a shortlist of classes in the isomorphism module itself, and therefore removing the dependency on the data module. > It looks like I can only specify the interface for an atom (say), i.e., IAtom > and the code returns an Atom object. How would do I specify a different > implementaition of the IAtom interface? Because each IChemObject implementation has a specific builder... so, we can use the code: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } This way, if the IMolecule was a NNMolecule, it returns an NNAtom... if the input was a DebugMolecule it would return a DebugAtom. > Do I need to implement a new builder? If you implement new classes for the IChemObject interfaces you also have to implement a new IChemObjectBuilder... the old IChemObjectBuilder had a million methods, making it much more work to make a new implementation. The new API only has one method, and allows you to throw an Exception for IChemObject's that cannot be build... > But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Yes. But the new API requires you much less copy/pasting than the 1.2.x builders. > Sorry for all these questions, Not at all! I should apologize for not providing enough info in the first place. > but I have never really wrapped my head around the builder architecture and this looks > like a good time to clear up my fuzziness. (Also it might be useful to have this discussion > on the list) Please read my answers, and decide then if we should copy/paste this message to cdk-devel... > obviously haven't gone through it line by line, but it seems OK. A second review will be useful OK. Meanwhile, I will prepare a "newInstance() -> new()" patch. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-03-29 20:00 Message: I'd go with new() rather than newInstance (but will the compiler allow a function name with the same name as a keyword?) What is the need for NewDefaultChemObjectBuilder in addition to DefaultChemObjectBuilder? Is the latter meant to be deprecated at one point? Might it be a good idea to just rename the former to the latter? Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with the old method names? What is the motivation for ICDKObject being a super class of IChemObject? Couldn't IChemObject have been the base class for all CDK objects? You mention the case of IAtomParity as a case where ICDKObject would be useful - but can you explain some more? If no listeners are added, why would a class have a problem? A more fundamental question I have is that the current design (vai newInstance) seems to simply be a way of writing: IAtom anAtom = new Atom("C"); or IDebugAtom aDebugAtom = new DebugAtom("C"); In otherwords, in your refactoring you'd write builder.newInstance(IAtom.class, "C"); which is just a rearrangement of what I wrote just above. So why do we need a builder? OK, you'd say that the builder pattern allows us to have different implementaitons - but how does your refactoring support that? It looks like I can only specify the interface for an atom (say), i.e., IAtom and the code returns an Atom object. How would do I specify a different implementaition of the IAtom interface? Do I need to implement a new builder? But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Sorry for all these questions, but I have never really wrapped my head around the builder architecture and this looks like a good time to clear up my fuzziness. (Also it might be useful to have this discussion on the list) Iobviously haven't gone through it line by line, but it seems OK. A second review will be useful ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 |
From: SourceForge.net <no...@so...> - 2010-04-04 11:58:00
|
Patches item #2977074, was opened at 2010-03-26 10:59 Message generated for change (Comment added) made by rajarshi You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: master Group: Needs Review Status: Open Resolution: None Priority: 5 Private: No Submitted By: Egon Willighagen (egonw) Assigned to: Nobody/Anonymous (nobody) Summary: new IChemObjectBuilder API Initial Comment: Hi all, as recently announced, I have written a new API for the IChemObjectBuilder infrastructure... (email attached below). Patch is massive (2MB) and targeted at master. One area where it could benefit other problems, is the dependency of \'isomorphism\' on data... using this new builder architecture, the module can implement those classes it itself, and only those it really needs. That way, we can remove the dependency if it on the data module, cleaning up things considerably. The email: The current model has some flaws: 1. it is a enormously large builder 2. making a new constructor for one of the IChemObject classes requires updating the IChemObjectBuilder 3. ... and requires all implementations to be updated 4. a new implementation (for whatever reason) of the IChemObject interfaces requires you to implement everything 5. currently some data classes do not extend IChemObject but are required for an IChemObjectBuilder, but do not share a common interface Particularly, the last is rather annoying, as it basically requires one to make a full implementation or none. The design I have been working on removes these limitations, and does the following. It introduces a new builder with a single method: public interface IChemObjectBuilder { public <T extends ICDKObject>T newInstance(Class<T> clazz, Object... params) throws IllegalArgumentException; } So, instead of: DefaultChemObjectBuilder.getInstance().newAtom(\"C\"); you would now do (given builder = DefaultChemObjectBuilder.getInstance()): builder.newInstance(IAtom.class, \"C\"); I am tempted to rename the newInstance() to just new()... comments on that most welcome. Passing a Point2d, would simply be: builder.newInstance(IAtom.class, \"C\", new Point2d(1,2)); If the caller would pass incorrect arguments, an IllegalArgumentException is called, such as happens if I would call: builder.newInstance(IAtom.class, new AtomContainer()); At the same time, this introduces a new ICDKObject interface, with only the getBuilder() method which used to be in the IChemObject... this allows IAtomParity to share a common interface, without inheriting listener stuff, etc. ---------------------------------------------------------------------- >Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-04 07:58 Message: Egon, yes, I agree that the patch doesn't change the fundamental design and instead just the syntax. So I'm OK with applying it. But right now it fails error: patch failed: src/main/org/openscience/cdk/io/MDLV2000Reader.java:414 error: src/main/org/openscience/cdk/io/MDLV2000Reader.java: patch does not apply error: patch failed: src/test/org/openscience/cdk/io/MDLWriterTest.java:34 error: src/test/org/openscience/cdk/io/MDLWriterTest.java: patch does not apply Patch failed at 0001. I think you need to regenearte it for the latest master ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-04-04 06:42 Message: Rajarshi, how do you suggest we take this further? You now approved the patch, which is enough for application to master. I agree we should continue talking about the design of the CDK cores; however, I do not think this patch does more than changing the syntax of the currently existing structure, rather than changing the design. As such, I would ask this patch to be applied. ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-04-04 06:39 Message: Mark, there are two separate issues here: 1. to work against interfaces, which is what this patch is about; 2. do we need more than one implementation. Regarding your question if we really need subclasses. I do not know. The system makes it possible. Now, let's turn around the question. Consider that the CDK are already not of the fastest, what would be your preferred way of introducing debugging functionality. My choice at the time was to not burden the existing classes with additional if statements, making the code more difficult to read, and further decrease the performance of the data classes. But, again, that's outside the scope of this patch. The builder merely allows us to work against interfaces, instead of implementations. ---------------------------------------------------------------------- Comment By: Mark Rijnbeek (mark_rynbeek) Date: 2010-04-04 05:55 Message: Quote " That way, CDK code can be written independently from a particular implementation. That is, we want to be able to run the same code in Debug, NoNotify, and Default" That makes sense, but I still don't understand the need for subclasses with regards to debugging and notifying. For example I never ecountered an object model where debugging is implemented through separate subclasses. Unless debugging is very elaborate and introduces lots of code, why put it in a subclass to start with? My question would be the samish for the notification. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-03 19:40 Message: Aah, OK I got it (I think :). The patch itself looks good to me ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-04-03 17:57 Message: 1. IChemObject extends ICDKObject, so, I think 'everything' already extends it... 2. you still can do new NNAtom()... if that's what you want, it's what you do. The point of the builder (the new and the current) is to make an object without having to hardcode the actual implementation. That way, CDK code can be written independently from a particular implementation. That is, we want to be able to run the same code in Debug, NoNotify, and Default. Or with any new (faster) implementation. The old and new builder make it possible to *not* have to choose a particular implementation. 3. the function call will always be the same... it is the IChemObjectBuilder implementation that determines what impl class you get. The DefaultChemObjectBuilder gives an Atom, the NoNotificationChemObjectBuilder a NNAtom, and the DebugChemObjectBuilder a DebugAtom. All calls are simply taking IAtom.class as argument. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-03 15:30 Message: > That was temporary only, so that I could do things in steps... later I > renamed it to the old naming conventions, dropping the 'new' bits again. That makes sense. >> What is the motivation for ICDKObject being a super class of IChemObject? > Because not all classes currently extend IChemObject, presumably because > it was decided that they should not have listener functionality, etc.. > The ICDKObject class is there to simplify the builder API, as they now > have a common ancestor. Wouldn't it be a good to make things consistent and make everything derived from ICDKObject? Given that the current patch is a significant change, it seems reasonable to follow up with another change that would help things become more consistent. Obviously, this doesn't apply to this patch, but I think it should follow on from this >> A more fundamental question I have is that the current design (vai >> newInstance) seems to simply be a way of writing: >> >> IAtom anAtom = new Atom("C"); >> >> or >> >> IDebugAtom aDebugAtom = new DebugAtom("C"); > These two alternative do not allow you to select an alternative > implementation, like NNAtom... Not sure I understand; wouldn't one just do NNAtom nnatom = new NNAtom("C") I realize that in your version, you have a single function that is parametrized on the class of the desired object, but it seems to be fundamentally the same - to say choose between 3 possible classes you'd right if (desired == 'debug) return builder.newInstance(DebugAtom.class,...) else if (desired == 'nn') return builder.newInstance(NNAtom.class, ...) which seems to be the same as if (desired == 'debug) return new DebugAtom("C") else if (desired == 'nn') return new NNAtom("C") > indeed, with the difference that algorithms can now do stuff like: > > public IAtom someMethod(IMolecule mol) { > return mol.getBuilder().new(IAtom.class, "C"); > } > > and therefore not force you to use the Atom, but allow any > implementation. But you are specifying IAtom and not say DebugAtom or NNAtom. If I want NNAtom, I have to change that function call, right? I think I get the design issues here, but I have to say that this system seems to lead to an overly complex object hierarchy. ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-03-31 07:02 Message: > I'd go with new() rather than newInstance (but will the compiler allow a > function name with the same name as a keyword?) OK. Had considered that, but thought to stick with the Java newInstance() naming convention... > What is the need for NewDefaultChemObjectBuilder in addition to > DefaultChemObjectBuilder? Is the latter meant to be deprecated at one > point? Might it be a good idea to just rename the former to the latter? > Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with > the old method names? So you looked at the individual patches? :) That was temporary only, so that I could do things in steps... later I renamed it to the old naming conventions, dropping the 'new' bits again. > What is the motivation for ICDKObject being a super class of IChemObject? Because not all classes currently extend IChemObject, presumably because it was decided that they should not have listener functionality, etc.. The ICDKObject class is there to simplify the builder API, as they now have a common ancestor. > Couldn't IChemObject have been the base class for all CDK objects? Yes, but I think the choice to not do that for some classes was deliberate... would need to dig up email archives on why IAtomParity, IMolecularFormular, ... do not extend IChemObject, but would like to keep that decision separate from the builder refactoring anyway... > You mention the case of IAtomParity as a case where ICDKObject would be useful > - but can you explain some more? If no listeners are added, why would a > class have a problem? Overhead I guess. > A more fundamental question I have is that the current design (vai > newInstance) seems to simply be a way of writing: > > IAtom anAtom = new Atom("C"); > > or > > IDebugAtom aDebugAtom = new DebugAtom("C"); These two alternative do not allow you to select an alternative implementation, like NNAtom... > In otherwords, in your refactoring you'd write > > builder.newInstance(IAtom.class, "C"); > > which is just a rearrangement of what I wrote just above. indeed, with the difference that algorithms can now do stuff like: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } and therefore not force you to use the Atom, but allow any implementation. > So why do we need a builder? OK, you'd say that the builder pattern allows us to have > different implementaitons Yes, but the previous IChemObjectBuilder did that too. > - but how does your refactoring support that? The difference now is, that the new API allows one to implement a subset of classes... for example, it could implement only IMolecule and related stuff, but not IChemModel etc. Now, relating this to the UIT... the UIT needs only a few classes... IMolecule, IAtom, IBond,,, for setting up a query atomcontainer... with the new API it could use a shortlist of classes in the isomorphism module itself, and therefore removing the dependency on the data module. > It looks like I can only specify the interface for an atom (say), i.e., IAtom > and the code returns an Atom object. How would do I specify a different > implementaition of the IAtom interface? Because each IChemObject implementation has a specific builder... so, we can use the code: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } This way, if the IMolecule was a NNMolecule, it returns an NNAtom... if the input was a DebugMolecule it would return a DebugAtom. > Do I need to implement a new builder? If you implement new classes for the IChemObject interfaces you also have to implement a new IChemObjectBuilder... the old IChemObjectBuilder had a million methods, making it much more work to make a new implementation. The new API only has one method, and allows you to throw an Exception for IChemObject's that cannot be build... > But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Yes. But the new API requires you much less copy/pasting than the 1.2.x builders. > Sorry for all these questions, Not at all! I should apologize for not providing enough info in the first place. > but I have never really wrapped my head around the builder architecture and this looks > like a good time to clear up my fuzziness. (Also it might be useful to have this discussion > on the list) Please read my answers, and decide then if we should copy/paste this message to cdk-devel... > obviously haven't gone through it line by line, but it seems OK. A second review will be useful OK. Meanwhile, I will prepare a "newInstance() -> new()" patch. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-03-29 14:00 Message: I'd go with new() rather than newInstance (but will the compiler allow a function name with the same name as a keyword?) What is the need for NewDefaultChemObjectBuilder in addition to DefaultChemObjectBuilder? Is the latter meant to be deprecated at one point? Might it be a good idea to just rename the former to the latter? Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with the old method names? What is the motivation for ICDKObject being a super class of IChemObject? Couldn't IChemObject have been the base class for all CDK objects? You mention the case of IAtomParity as a case where ICDKObject would be useful - but can you explain some more? If no listeners are added, why would a class have a problem? A more fundamental question I have is that the current design (vai newInstance) seems to simply be a way of writing: IAtom anAtom = new Atom("C"); or IDebugAtom aDebugAtom = new DebugAtom("C"); In otherwords, in your refactoring you'd write builder.newInstance(IAtom.class, "C"); which is just a rearrangement of what I wrote just above. So why do we need a builder? OK, you'd say that the builder pattern allows us to have different implementaitons - but how does your refactoring support that? It looks like I can only specify the interface for an atom (say), i.e., IAtom and the code returns an Atom object. How would do I specify a different implementaition of the IAtom interface? Do I need to implement a new builder? But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Sorry for all these questions, but I have never really wrapped my head around the builder architecture and this looks like a good time to clear up my fuzziness. (Also it might be useful to have this discussion on the list) Iobviously haven't gone through it line by line, but it seems OK. A second review will be useful ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 |
From: SourceForge.net <no...@so...> - 2010-04-04 12:45:09
|
Patches item #2977074, was opened at 2010-03-26 15:59 Message generated for change (Comment added) made by egonw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: master Group: Needs Review Status: Open Resolution: None Priority: 5 Private: No Submitted By: Egon Willighagen (egonw) Assigned to: Nobody/Anonymous (nobody) Summary: new IChemObjectBuilder API Initial Comment: Hi all, as recently announced, I have written a new API for the IChemObjectBuilder infrastructure... (email attached below). Patch is massive (2MB) and targeted at master. One area where it could benefit other problems, is the dependency of \'isomorphism\' on data... using this new builder architecture, the module can implement those classes it itself, and only those it really needs. That way, we can remove the dependency if it on the data module, cleaning up things considerably. The email: The current model has some flaws: 1. it is a enormously large builder 2. making a new constructor for one of the IChemObject classes requires updating the IChemObjectBuilder 3. ... and requires all implementations to be updated 4. a new implementation (for whatever reason) of the IChemObject interfaces requires you to implement everything 5. currently some data classes do not extend IChemObject but are required for an IChemObjectBuilder, but do not share a common interface Particularly, the last is rather annoying, as it basically requires one to make a full implementation or none. The design I have been working on removes these limitations, and does the following. It introduces a new builder with a single method: public interface IChemObjectBuilder { public <T extends ICDKObject>T newInstance(Class<T> clazz, Object... params) throws IllegalArgumentException; } So, instead of: DefaultChemObjectBuilder.getInstance().newAtom(\"C\"); you would now do (given builder = DefaultChemObjectBuilder.getInstance()): builder.newInstance(IAtom.class, \"C\"); I am tempted to rename the newInstance() to just new()... comments on that most welcome. Passing a Point2d, would simply be: builder.newInstance(IAtom.class, \"C\", new Point2d(1,2)); If the caller would pass incorrect arguments, an IllegalArgumentException is called, such as happens if I would call: builder.newInstance(IAtom.class, new AtomContainer()); At the same time, this introduces a new ICDKObject interface, with only the getBuilder() method which used to be in the IChemObject... this allows IAtomParity to share a common interface, without inheriting listener stuff, etc. ---------------------------------------------------------------------- >Comment By: Egon Willighagen (egonw) Date: 2010-04-04 14:45 Message: I'm updating the patch against the current master... I just remembered your comment on new() versus newInstance()... new is indeed a reserved word, and cannot be used for a method name. We can easily give the method a different name when we come up with a good one... ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-04 13:58 Message: Egon, yes, I agree that the patch doesn't change the fundamental design and instead just the syntax. So I'm OK with applying it. But right now it fails error: patch failed: src/main/org/openscience/cdk/io/MDLV2000Reader.java:414 error: src/main/org/openscience/cdk/io/MDLV2000Reader.java: patch does not apply error: patch failed: src/test/org/openscience/cdk/io/MDLWriterTest.java:34 error: src/test/org/openscience/cdk/io/MDLWriterTest.java: patch does not apply Patch failed at 0001. I think you need to regenearte it for the latest master ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-04-04 12:42 Message: Rajarshi, how do you suggest we take this further? You now approved the patch, which is enough for application to master. I agree we should continue talking about the design of the CDK cores; however, I do not think this patch does more than changing the syntax of the currently existing structure, rather than changing the design. As such, I would ask this patch to be applied. ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-04-04 12:39 Message: Mark, there are two separate issues here: 1. to work against interfaces, which is what this patch is about; 2. do we need more than one implementation. Regarding your question if we really need subclasses. I do not know. The system makes it possible. Now, let's turn around the question. Consider that the CDK are already not of the fastest, what would be your preferred way of introducing debugging functionality. My choice at the time was to not burden the existing classes with additional if statements, making the code more difficult to read, and further decrease the performance of the data classes. But, again, that's outside the scope of this patch. The builder merely allows us to work against interfaces, instead of implementations. ---------------------------------------------------------------------- Comment By: Mark Rijnbeek (mark_rynbeek) Date: 2010-04-04 11:55 Message: Quote " That way, CDK code can be written independently from a particular implementation. That is, we want to be able to run the same code in Debug, NoNotify, and Default" That makes sense, but I still don't understand the need for subclasses with regards to debugging and notifying. For example I never ecountered an object model where debugging is implemented through separate subclasses. Unless debugging is very elaborate and introduces lots of code, why put it in a subclass to start with? My question would be the samish for the notification. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-04 01:40 Message: Aah, OK I got it (I think :). The patch itself looks good to me ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-04-03 23:57 Message: 1. IChemObject extends ICDKObject, so, I think 'everything' already extends it... 2. you still can do new NNAtom()... if that's what you want, it's what you do. The point of the builder (the new and the current) is to make an object without having to hardcode the actual implementation. That way, CDK code can be written independently from a particular implementation. That is, we want to be able to run the same code in Debug, NoNotify, and Default. Or with any new (faster) implementation. The old and new builder make it possible to *not* have to choose a particular implementation. 3. the function call will always be the same... it is the IChemObjectBuilder implementation that determines what impl class you get. The DefaultChemObjectBuilder gives an Atom, the NoNotificationChemObjectBuilder a NNAtom, and the DebugChemObjectBuilder a DebugAtom. All calls are simply taking IAtom.class as argument. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-03 21:30 Message: > That was temporary only, so that I could do things in steps... later I > renamed it to the old naming conventions, dropping the 'new' bits again. That makes sense. >> What is the motivation for ICDKObject being a super class of IChemObject? > Because not all classes currently extend IChemObject, presumably because > it was decided that they should not have listener functionality, etc.. > The ICDKObject class is there to simplify the builder API, as they now > have a common ancestor. Wouldn't it be a good to make things consistent and make everything derived from ICDKObject? Given that the current patch is a significant change, it seems reasonable to follow up with another change that would help things become more consistent. Obviously, this doesn't apply to this patch, but I think it should follow on from this >> A more fundamental question I have is that the current design (vai >> newInstance) seems to simply be a way of writing: >> >> IAtom anAtom = new Atom("C"); >> >> or >> >> IDebugAtom aDebugAtom = new DebugAtom("C"); > These two alternative do not allow you to select an alternative > implementation, like NNAtom... Not sure I understand; wouldn't one just do NNAtom nnatom = new NNAtom("C") I realize that in your version, you have a single function that is parametrized on the class of the desired object, but it seems to be fundamentally the same - to say choose between 3 possible classes you'd right if (desired == 'debug) return builder.newInstance(DebugAtom.class,...) else if (desired == 'nn') return builder.newInstance(NNAtom.class, ...) which seems to be the same as if (desired == 'debug) return new DebugAtom("C") else if (desired == 'nn') return new NNAtom("C") > indeed, with the difference that algorithms can now do stuff like: > > public IAtom someMethod(IMolecule mol) { > return mol.getBuilder().new(IAtom.class, "C"); > } > > and therefore not force you to use the Atom, but allow any > implementation. But you are specifying IAtom and not say DebugAtom or NNAtom. If I want NNAtom, I have to change that function call, right? I think I get the design issues here, but I have to say that this system seems to lead to an overly complex object hierarchy. ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-03-31 13:02 Message: > I'd go with new() rather than newInstance (but will the compiler allow a > function name with the same name as a keyword?) OK. Had considered that, but thought to stick with the Java newInstance() naming convention... > What is the need for NewDefaultChemObjectBuilder in addition to > DefaultChemObjectBuilder? Is the latter meant to be deprecated at one > point? Might it be a good idea to just rename the former to the latter? > Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with > the old method names? So you looked at the individual patches? :) That was temporary only, so that I could do things in steps... later I renamed it to the old naming conventions, dropping the 'new' bits again. > What is the motivation for ICDKObject being a super class of IChemObject? Because not all classes currently extend IChemObject, presumably because it was decided that they should not have listener functionality, etc.. The ICDKObject class is there to simplify the builder API, as they now have a common ancestor. > Couldn't IChemObject have been the base class for all CDK objects? Yes, but I think the choice to not do that for some classes was deliberate... would need to dig up email archives on why IAtomParity, IMolecularFormular, ... do not extend IChemObject, but would like to keep that decision separate from the builder refactoring anyway... > You mention the case of IAtomParity as a case where ICDKObject would be useful > - but can you explain some more? If no listeners are added, why would a > class have a problem? Overhead I guess. > A more fundamental question I have is that the current design (vai > newInstance) seems to simply be a way of writing: > > IAtom anAtom = new Atom("C"); > > or > > IDebugAtom aDebugAtom = new DebugAtom("C"); These two alternative do not allow you to select an alternative implementation, like NNAtom... > In otherwords, in your refactoring you'd write > > builder.newInstance(IAtom.class, "C"); > > which is just a rearrangement of what I wrote just above. indeed, with the difference that algorithms can now do stuff like: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } and therefore not force you to use the Atom, but allow any implementation. > So why do we need a builder? OK, you'd say that the builder pattern allows us to have > different implementaitons Yes, but the previous IChemObjectBuilder did that too. > - but how does your refactoring support that? The difference now is, that the new API allows one to implement a subset of classes... for example, it could implement only IMolecule and related stuff, but not IChemModel etc. Now, relating this to the UIT... the UIT needs only a few classes... IMolecule, IAtom, IBond,,, for setting up a query atomcontainer... with the new API it could use a shortlist of classes in the isomorphism module itself, and therefore removing the dependency on the data module. > It looks like I can only specify the interface for an atom (say), i.e., IAtom > and the code returns an Atom object. How would do I specify a different > implementaition of the IAtom interface? Because each IChemObject implementation has a specific builder... so, we can use the code: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } This way, if the IMolecule was a NNMolecule, it returns an NNAtom... if the input was a DebugMolecule it would return a DebugAtom. > Do I need to implement a new builder? If you implement new classes for the IChemObject interfaces you also have to implement a new IChemObjectBuilder... the old IChemObjectBuilder had a million methods, making it much more work to make a new implementation. The new API only has one method, and allows you to throw an Exception for IChemObject's that cannot be build... > But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Yes. But the new API requires you much less copy/pasting than the 1.2.x builders. > Sorry for all these questions, Not at all! I should apologize for not providing enough info in the first place. > but I have never really wrapped my head around the builder architecture and this looks > like a good time to clear up my fuzziness. (Also it might be useful to have this discussion > on the list) Please read my answers, and decide then if we should copy/paste this message to cdk-devel... > obviously haven't gone through it line by line, but it seems OK. A second review will be useful OK. Meanwhile, I will prepare a "newInstance() -> new()" patch. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-03-29 20:00 Message: I'd go with new() rather than newInstance (but will the compiler allow a function name with the same name as a keyword?) What is the need for NewDefaultChemObjectBuilder in addition to DefaultChemObjectBuilder? Is the latter meant to be deprecated at one point? Might it be a good idea to just rename the former to the latter? Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with the old method names? What is the motivation for ICDKObject being a super class of IChemObject? Couldn't IChemObject have been the base class for all CDK objects? You mention the case of IAtomParity as a case where ICDKObject would be useful - but can you explain some more? If no listeners are added, why would a class have a problem? A more fundamental question I have is that the current design (vai newInstance) seems to simply be a way of writing: IAtom anAtom = new Atom("C"); or IDebugAtom aDebugAtom = new DebugAtom("C"); In otherwords, in your refactoring you'd write builder.newInstance(IAtom.class, "C"); which is just a rearrangement of what I wrote just above. So why do we need a builder? OK, you'd say that the builder pattern allows us to have different implementaitons - but how does your refactoring support that? It looks like I can only specify the interface for an atom (say), i.e., IAtom and the code returns an Atom object. How would do I specify a different implementaition of the IAtom interface? Do I need to implement a new builder? But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Sorry for all these questions, but I have never really wrapped my head around the builder architecture and this looks like a good time to clear up my fuzziness. (Also it might be useful to have this discussion on the list) Iobviously haven't gone through it line by line, but it seems OK. A second review will be useful ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 |
From: SourceForge.net <no...@so...> - 2010-04-04 12:57:34
|
Patches item #2977074, was opened at 2010-03-26 15:59 Message generated for change (Comment added) made by egonw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: master Group: Needs Review Status: Open Resolution: None Priority: 5 Private: No Submitted By: Egon Willighagen (egonw) Assigned to: Nobody/Anonymous (nobody) Summary: new IChemObjectBuilder API Initial Comment: Hi all, as recently announced, I have written a new API for the IChemObjectBuilder infrastructure... (email attached below). Patch is massive (2MB) and targeted at master. One area where it could benefit other problems, is the dependency of \'isomorphism\' on data... using this new builder architecture, the module can implement those classes it itself, and only those it really needs. That way, we can remove the dependency if it on the data module, cleaning up things considerably. The email: The current model has some flaws: 1. it is a enormously large builder 2. making a new constructor for one of the IChemObject classes requires updating the IChemObjectBuilder 3. ... and requires all implementations to be updated 4. a new implementation (for whatever reason) of the IChemObject interfaces requires you to implement everything 5. currently some data classes do not extend IChemObject but are required for an IChemObjectBuilder, but do not share a common interface Particularly, the last is rather annoying, as it basically requires one to make a full implementation or none. The design I have been working on removes these limitations, and does the following. It introduces a new builder with a single method: public interface IChemObjectBuilder { public <T extends ICDKObject>T newInstance(Class<T> clazz, Object... params) throws IllegalArgumentException; } So, instead of: DefaultChemObjectBuilder.getInstance().newAtom(\"C\"); you would now do (given builder = DefaultChemObjectBuilder.getInstance()): builder.newInstance(IAtom.class, \"C\"); I am tempted to rename the newInstance() to just new()... comments on that most welcome. Passing a Point2d, would simply be: builder.newInstance(IAtom.class, \"C\", new Point2d(1,2)); If the caller would pass incorrect arguments, an IllegalArgumentException is called, such as happens if I would call: builder.newInstance(IAtom.class, new AtomContainer()); At the same time, this introduces a new ICDKObject interface, with only the getBuilder() method which used to be in the IChemObject... this allows IAtomParity to share a common interface, without inheriting listener stuff, etc. ---------------------------------------------------------------------- >Comment By: Egon Willighagen (egonw) Date: 2010-04-04 14:57 Message: Rajarshi, new patch attached. There were some 'conflicts' with the recent R code update, where I needed to now also patch the new code introduced by the Rgroup support. ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-04-04 14:45 Message: I'm updating the patch against the current master... I just remembered your comment on new() versus newInstance()... new is indeed a reserved word, and cannot be used for a method name. We can easily give the method a different name when we come up with a good one... ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-04 13:58 Message: Egon, yes, I agree that the patch doesn't change the fundamental design and instead just the syntax. So I'm OK with applying it. But right now it fails error: patch failed: src/main/org/openscience/cdk/io/MDLV2000Reader.java:414 error: src/main/org/openscience/cdk/io/MDLV2000Reader.java: patch does not apply error: patch failed: src/test/org/openscience/cdk/io/MDLWriterTest.java:34 error: src/test/org/openscience/cdk/io/MDLWriterTest.java: patch does not apply Patch failed at 0001. I think you need to regenearte it for the latest master ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-04-04 12:42 Message: Rajarshi, how do you suggest we take this further? You now approved the patch, which is enough for application to master. I agree we should continue talking about the design of the CDK cores; however, I do not think this patch does more than changing the syntax of the currently existing structure, rather than changing the design. As such, I would ask this patch to be applied. ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-04-04 12:39 Message: Mark, there are two separate issues here: 1. to work against interfaces, which is what this patch is about; 2. do we need more than one implementation. Regarding your question if we really need subclasses. I do not know. The system makes it possible. Now, let's turn around the question. Consider that the CDK are already not of the fastest, what would be your preferred way of introducing debugging functionality. My choice at the time was to not burden the existing classes with additional if statements, making the code more difficult to read, and further decrease the performance of the data classes. But, again, that's outside the scope of this patch. The builder merely allows us to work against interfaces, instead of implementations. ---------------------------------------------------------------------- Comment By: Mark Rijnbeek (mark_rynbeek) Date: 2010-04-04 11:55 Message: Quote " That way, CDK code can be written independently from a particular implementation. That is, we want to be able to run the same code in Debug, NoNotify, and Default" That makes sense, but I still don't understand the need for subclasses with regards to debugging and notifying. For example I never ecountered an object model where debugging is implemented through separate subclasses. Unless debugging is very elaborate and introduces lots of code, why put it in a subclass to start with? My question would be the samish for the notification. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-04 01:40 Message: Aah, OK I got it (I think :). The patch itself looks good to me ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-04-03 23:57 Message: 1. IChemObject extends ICDKObject, so, I think 'everything' already extends it... 2. you still can do new NNAtom()... if that's what you want, it's what you do. The point of the builder (the new and the current) is to make an object without having to hardcode the actual implementation. That way, CDK code can be written independently from a particular implementation. That is, we want to be able to run the same code in Debug, NoNotify, and Default. Or with any new (faster) implementation. The old and new builder make it possible to *not* have to choose a particular implementation. 3. the function call will always be the same... it is the IChemObjectBuilder implementation that determines what impl class you get. The DefaultChemObjectBuilder gives an Atom, the NoNotificationChemObjectBuilder a NNAtom, and the DebugChemObjectBuilder a DebugAtom. All calls are simply taking IAtom.class as argument. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-03 21:30 Message: > That was temporary only, so that I could do things in steps... later I > renamed it to the old naming conventions, dropping the 'new' bits again. That makes sense. >> What is the motivation for ICDKObject being a super class of IChemObject? > Because not all classes currently extend IChemObject, presumably because > it was decided that they should not have listener functionality, etc.. > The ICDKObject class is there to simplify the builder API, as they now > have a common ancestor. Wouldn't it be a good to make things consistent and make everything derived from ICDKObject? Given that the current patch is a significant change, it seems reasonable to follow up with another change that would help things become more consistent. Obviously, this doesn't apply to this patch, but I think it should follow on from this >> A more fundamental question I have is that the current design (vai >> newInstance) seems to simply be a way of writing: >> >> IAtom anAtom = new Atom("C"); >> >> or >> >> IDebugAtom aDebugAtom = new DebugAtom("C"); > These two alternative do not allow you to select an alternative > implementation, like NNAtom... Not sure I understand; wouldn't one just do NNAtom nnatom = new NNAtom("C") I realize that in your version, you have a single function that is parametrized on the class of the desired object, but it seems to be fundamentally the same - to say choose between 3 possible classes you'd right if (desired == 'debug) return builder.newInstance(DebugAtom.class,...) else if (desired == 'nn') return builder.newInstance(NNAtom.class, ...) which seems to be the same as if (desired == 'debug) return new DebugAtom("C") else if (desired == 'nn') return new NNAtom("C") > indeed, with the difference that algorithms can now do stuff like: > > public IAtom someMethod(IMolecule mol) { > return mol.getBuilder().new(IAtom.class, "C"); > } > > and therefore not force you to use the Atom, but allow any > implementation. But you are specifying IAtom and not say DebugAtom or NNAtom. If I want NNAtom, I have to change that function call, right? I think I get the design issues here, but I have to say that this system seems to lead to an overly complex object hierarchy. ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-03-31 13:02 Message: > I'd go with new() rather than newInstance (but will the compiler allow a > function name with the same name as a keyword?) OK. Had considered that, but thought to stick with the Java newInstance() naming convention... > What is the need for NewDefaultChemObjectBuilder in addition to > DefaultChemObjectBuilder? Is the latter meant to be deprecated at one > point? Might it be a good idea to just rename the former to the latter? > Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with > the old method names? So you looked at the individual patches? :) That was temporary only, so that I could do things in steps... later I renamed it to the old naming conventions, dropping the 'new' bits again. > What is the motivation for ICDKObject being a super class of IChemObject? Because not all classes currently extend IChemObject, presumably because it was decided that they should not have listener functionality, etc.. The ICDKObject class is there to simplify the builder API, as they now have a common ancestor. > Couldn't IChemObject have been the base class for all CDK objects? Yes, but I think the choice to not do that for some classes was deliberate... would need to dig up email archives on why IAtomParity, IMolecularFormular, ... do not extend IChemObject, but would like to keep that decision separate from the builder refactoring anyway... > You mention the case of IAtomParity as a case where ICDKObject would be useful > - but can you explain some more? If no listeners are added, why would a > class have a problem? Overhead I guess. > A more fundamental question I have is that the current design (vai > newInstance) seems to simply be a way of writing: > > IAtom anAtom = new Atom("C"); > > or > > IDebugAtom aDebugAtom = new DebugAtom("C"); These two alternative do not allow you to select an alternative implementation, like NNAtom... > In otherwords, in your refactoring you'd write > > builder.newInstance(IAtom.class, "C"); > > which is just a rearrangement of what I wrote just above. indeed, with the difference that algorithms can now do stuff like: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } and therefore not force you to use the Atom, but allow any implementation. > So why do we need a builder? OK, you'd say that the builder pattern allows us to have > different implementaitons Yes, but the previous IChemObjectBuilder did that too. > - but how does your refactoring support that? The difference now is, that the new API allows one to implement a subset of classes... for example, it could implement only IMolecule and related stuff, but not IChemModel etc. Now, relating this to the UIT... the UIT needs only a few classes... IMolecule, IAtom, IBond,,, for setting up a query atomcontainer... with the new API it could use a shortlist of classes in the isomorphism module itself, and therefore removing the dependency on the data module. > It looks like I can only specify the interface for an atom (say), i.e., IAtom > and the code returns an Atom object. How would do I specify a different > implementaition of the IAtom interface? Because each IChemObject implementation has a specific builder... so, we can use the code: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } This way, if the IMolecule was a NNMolecule, it returns an NNAtom... if the input was a DebugMolecule it would return a DebugAtom. > Do I need to implement a new builder? If you implement new classes for the IChemObject interfaces you also have to implement a new IChemObjectBuilder... the old IChemObjectBuilder had a million methods, making it much more work to make a new implementation. The new API only has one method, and allows you to throw an Exception for IChemObject's that cannot be build... > But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Yes. But the new API requires you much less copy/pasting than the 1.2.x builders. > Sorry for all these questions, Not at all! I should apologize for not providing enough info in the first place. > but I have never really wrapped my head around the builder architecture and this looks > like a good time to clear up my fuzziness. (Also it might be useful to have this discussion > on the list) Please read my answers, and decide then if we should copy/paste this message to cdk-devel... > obviously haven't gone through it line by line, but it seems OK. A second review will be useful OK. Meanwhile, I will prepare a "newInstance() -> new()" patch. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-03-29 20:00 Message: I'd go with new() rather than newInstance (but will the compiler allow a function name with the same name as a keyword?) What is the need for NewDefaultChemObjectBuilder in addition to DefaultChemObjectBuilder? Is the latter meant to be deprecated at one point? Might it be a good idea to just rename the former to the latter? Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with the old method names? What is the motivation for ICDKObject being a super class of IChemObject? Couldn't IChemObject have been the base class for all CDK objects? You mention the case of IAtomParity as a case where ICDKObject would be useful - but can you explain some more? If no listeners are added, why would a class have a problem? A more fundamental question I have is that the current design (vai newInstance) seems to simply be a way of writing: IAtom anAtom = new Atom("C"); or IDebugAtom aDebugAtom = new DebugAtom("C"); In otherwords, in your refactoring you'd write builder.newInstance(IAtom.class, "C"); which is just a rearrangement of what I wrote just above. So why do we need a builder? OK, you'd say that the builder pattern allows us to have different implementaitons - but how does your refactoring support that? It looks like I can only specify the interface for an atom (say), i.e., IAtom and the code returns an Atom object. How would do I specify a different implementaition of the IAtom interface? Do I need to implement a new builder? But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Sorry for all these questions, but I have never really wrapped my head around the builder architecture and this looks like a good time to clear up my fuzziness. (Also it might be useful to have this discussion on the list) Iobviously haven't gone through it line by line, but it seems OK. A second review will be useful ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 |
From: SourceForge.net <no...@so...> - 2010-04-04 13:17:21
|
Patches item #2977074, was opened at 2010-03-26 10:59 Message generated for change (Comment added) made by rajarshi You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: master >Group: Accepted >Status: Closed >Resolution: Fixed Priority: 5 Private: No Submitted By: Egon Willighagen (egonw) Assigned to: Nobody/Anonymous (nobody) Summary: new IChemObjectBuilder API Initial Comment: Hi all, as recently announced, I have written a new API for the IChemObjectBuilder infrastructure... (email attached below). Patch is massive (2MB) and targeted at master. One area where it could benefit other problems, is the dependency of \'isomorphism\' on data... using this new builder architecture, the module can implement those classes it itself, and only those it really needs. That way, we can remove the dependency if it on the data module, cleaning up things considerably. The email: The current model has some flaws: 1. it is a enormously large builder 2. making a new constructor for one of the IChemObject classes requires updating the IChemObjectBuilder 3. ... and requires all implementations to be updated 4. a new implementation (for whatever reason) of the IChemObject interfaces requires you to implement everything 5. currently some data classes do not extend IChemObject but are required for an IChemObjectBuilder, but do not share a common interface Particularly, the last is rather annoying, as it basically requires one to make a full implementation or none. The design I have been working on removes these limitations, and does the following. It introduces a new builder with a single method: public interface IChemObjectBuilder { public <T extends ICDKObject>T newInstance(Class<T> clazz, Object... params) throws IllegalArgumentException; } So, instead of: DefaultChemObjectBuilder.getInstance().newAtom(\"C\"); you would now do (given builder = DefaultChemObjectBuilder.getInstance()): builder.newInstance(IAtom.class, \"C\"); I am tempted to rename the newInstance() to just new()... comments on that most welcome. Passing a Point2d, would simply be: builder.newInstance(IAtom.class, \"C\", new Point2d(1,2)); If the caller would pass incorrect arguments, an IllegalArgumentException is called, such as happens if I would call: builder.newInstance(IAtom.class, new AtomContainer()); At the same time, this introduces a new ICDKObject interface, with only the getBuilder() method which used to be in the IChemObject... this allows IAtomParity to share a common interface, without inheriting listener stuff, etc. ---------------------------------------------------------------------- >Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-04 09:17 Message: Applied to master and pushed ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-04-04 08:57 Message: Rajarshi, new patch attached. There were some 'conflicts' with the recent R code update, where I needed to now also patch the new code introduced by the Rgroup support. ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-04-04 08:45 Message: I'm updating the patch against the current master... I just remembered your comment on new() versus newInstance()... new is indeed a reserved word, and cannot be used for a method name. We can easily give the method a different name when we come up with a good one... ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-04 07:58 Message: Egon, yes, I agree that the patch doesn't change the fundamental design and instead just the syntax. So I'm OK with applying it. But right now it fails error: patch failed: src/main/org/openscience/cdk/io/MDLV2000Reader.java:414 error: src/main/org/openscience/cdk/io/MDLV2000Reader.java: patch does not apply error: patch failed: src/test/org/openscience/cdk/io/MDLWriterTest.java:34 error: src/test/org/openscience/cdk/io/MDLWriterTest.java: patch does not apply Patch failed at 0001. I think you need to regenearte it for the latest master ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-04-04 06:42 Message: Rajarshi, how do you suggest we take this further? You now approved the patch, which is enough for application to master. I agree we should continue talking about the design of the CDK cores; however, I do not think this patch does more than changing the syntax of the currently existing structure, rather than changing the design. As such, I would ask this patch to be applied. ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-04-04 06:39 Message: Mark, there are two separate issues here: 1. to work against interfaces, which is what this patch is about; 2. do we need more than one implementation. Regarding your question if we really need subclasses. I do not know. The system makes it possible. Now, let's turn around the question. Consider that the CDK are already not of the fastest, what would be your preferred way of introducing debugging functionality. My choice at the time was to not burden the existing classes with additional if statements, making the code more difficult to read, and further decrease the performance of the data classes. But, again, that's outside the scope of this patch. The builder merely allows us to work against interfaces, instead of implementations. ---------------------------------------------------------------------- Comment By: Mark Rijnbeek (mark_rynbeek) Date: 2010-04-04 05:55 Message: Quote " That way, CDK code can be written independently from a particular implementation. That is, we want to be able to run the same code in Debug, NoNotify, and Default" That makes sense, but I still don't understand the need for subclasses with regards to debugging and notifying. For example I never ecountered an object model where debugging is implemented through separate subclasses. Unless debugging is very elaborate and introduces lots of code, why put it in a subclass to start with? My question would be the samish for the notification. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-03 19:40 Message: Aah, OK I got it (I think :). The patch itself looks good to me ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-04-03 17:57 Message: 1. IChemObject extends ICDKObject, so, I think 'everything' already extends it... 2. you still can do new NNAtom()... if that's what you want, it's what you do. The point of the builder (the new and the current) is to make an object without having to hardcode the actual implementation. That way, CDK code can be written independently from a particular implementation. That is, we want to be able to run the same code in Debug, NoNotify, and Default. Or with any new (faster) implementation. The old and new builder make it possible to *not* have to choose a particular implementation. 3. the function call will always be the same... it is the IChemObjectBuilder implementation that determines what impl class you get. The DefaultChemObjectBuilder gives an Atom, the NoNotificationChemObjectBuilder a NNAtom, and the DebugChemObjectBuilder a DebugAtom. All calls are simply taking IAtom.class as argument. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-04-03 15:30 Message: > That was temporary only, so that I could do things in steps... later I > renamed it to the old naming conventions, dropping the 'new' bits again. That makes sense. >> What is the motivation for ICDKObject being a super class of IChemObject? > Because not all classes currently extend IChemObject, presumably because > it was decided that they should not have listener functionality, etc.. > The ICDKObject class is there to simplify the builder API, as they now > have a common ancestor. Wouldn't it be a good to make things consistent and make everything derived from ICDKObject? Given that the current patch is a significant change, it seems reasonable to follow up with another change that would help things become more consistent. Obviously, this doesn't apply to this patch, but I think it should follow on from this >> A more fundamental question I have is that the current design (vai >> newInstance) seems to simply be a way of writing: >> >> IAtom anAtom = new Atom("C"); >> >> or >> >> IDebugAtom aDebugAtom = new DebugAtom("C"); > These two alternative do not allow you to select an alternative > implementation, like NNAtom... Not sure I understand; wouldn't one just do NNAtom nnatom = new NNAtom("C") I realize that in your version, you have a single function that is parametrized on the class of the desired object, but it seems to be fundamentally the same - to say choose between 3 possible classes you'd right if (desired == 'debug) return builder.newInstance(DebugAtom.class,...) else if (desired == 'nn') return builder.newInstance(NNAtom.class, ...) which seems to be the same as if (desired == 'debug) return new DebugAtom("C") else if (desired == 'nn') return new NNAtom("C") > indeed, with the difference that algorithms can now do stuff like: > > public IAtom someMethod(IMolecule mol) { > return mol.getBuilder().new(IAtom.class, "C"); > } > > and therefore not force you to use the Atom, but allow any > implementation. But you are specifying IAtom and not say DebugAtom or NNAtom. If I want NNAtom, I have to change that function call, right? I think I get the design issues here, but I have to say that this system seems to lead to an overly complex object hierarchy. ---------------------------------------------------------------------- Comment By: Egon Willighagen (egonw) Date: 2010-03-31 07:02 Message: > I'd go with new() rather than newInstance (but will the compiler allow a > function name with the same name as a keyword?) OK. Had considered that, but thought to stick with the Java newInstance() naming convention... > What is the need for NewDefaultChemObjectBuilder in addition to > DefaultChemObjectBuilder? Is the latter meant to be deprecated at one > point? Might it be a good idea to just rename the former to the latter? > Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with > the old method names? So you looked at the individual patches? :) That was temporary only, so that I could do things in steps... later I renamed it to the old naming conventions, dropping the 'new' bits again. > What is the motivation for ICDKObject being a super class of IChemObject? Because not all classes currently extend IChemObject, presumably because it was decided that they should not have listener functionality, etc.. The ICDKObject class is there to simplify the builder API, as they now have a common ancestor. > Couldn't IChemObject have been the base class for all CDK objects? Yes, but I think the choice to not do that for some classes was deliberate... would need to dig up email archives on why IAtomParity, IMolecularFormular, ... do not extend IChemObject, but would like to keep that decision separate from the builder refactoring anyway... > You mention the case of IAtomParity as a case where ICDKObject would be useful > - but can you explain some more? If no listeners are added, why would a > class have a problem? Overhead I guess. > A more fundamental question I have is that the current design (vai > newInstance) seems to simply be a way of writing: > > IAtom anAtom = new Atom("C"); > > or > > IDebugAtom aDebugAtom = new DebugAtom("C"); These two alternative do not allow you to select an alternative implementation, like NNAtom... > In otherwords, in your refactoring you'd write > > builder.newInstance(IAtom.class, "C"); > > which is just a rearrangement of what I wrote just above. indeed, with the difference that algorithms can now do stuff like: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } and therefore not force you to use the Atom, but allow any implementation. > So why do we need a builder? OK, you'd say that the builder pattern allows us to have > different implementaitons Yes, but the previous IChemObjectBuilder did that too. > - but how does your refactoring support that? The difference now is, that the new API allows one to implement a subset of classes... for example, it could implement only IMolecule and related stuff, but not IChemModel etc. Now, relating this to the UIT... the UIT needs only a few classes... IMolecule, IAtom, IBond,,, for setting up a query atomcontainer... with the new API it could use a shortlist of classes in the isomorphism module itself, and therefore removing the dependency on the data module. > It looks like I can only specify the interface for an atom (say), i.e., IAtom > and the code returns an Atom object. How would do I specify a different > implementaition of the IAtom interface? Because each IChemObject implementation has a specific builder... so, we can use the code: public IAtom someMethod(IMolecule mol) { return mol.getBuilder().new(IAtom.class, "C"); } This way, if the IMolecule was a NNMolecule, it returns an NNAtom... if the input was a DebugMolecule it would return a DebugAtom. > Do I need to implement a new builder? If you implement new classes for the IChemObject interfaces you also have to implement a new IChemObjectBuilder... the old IChemObjectBuilder had a million methods, making it much more work to make a new implementation. The new API only has one method, and allows you to throw an Exception for IChemObject's that cannot be build... > But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Yes. But the new API requires you much less copy/pasting than the 1.2.x builders. > Sorry for all these questions, Not at all! I should apologize for not providing enough info in the first place. > but I have never really wrapped my head around the builder architecture and this looks > like a good time to clear up my fuzziness. (Also it might be useful to have this discussion > on the list) Please read my answers, and decide then if we should copy/paste this message to cdk-devel... > obviously haven't gone through it line by line, but it seems OK. A second review will be useful OK. Meanwhile, I will prepare a "newInstance() -> new()" patch. ---------------------------------------------------------------------- Comment By: Rajarshi Guha (rajarshi) Date: 2010-03-29 14:00 Message: I'd go with new() rather than newInstance (but will the compiler allow a function name with the same name as a keyword?) What is the need for NewDefaultChemObjectBuilder in addition to DefaultChemObjectBuilder? Is the latter meant to be deprecated at one point? Might it be a good idea to just rename the former to the latter? Similarly why was getBuilder renamed to getNewBuilder? Can't we stay with the old method names? What is the motivation for ICDKObject being a super class of IChemObject? Couldn't IChemObject have been the base class for all CDK objects? You mention the case of IAtomParity as a case where ICDKObject would be useful - but can you explain some more? If no listeners are added, why would a class have a problem? A more fundamental question I have is that the current design (vai newInstance) seems to simply be a way of writing: IAtom anAtom = new Atom("C"); or IDebugAtom aDebugAtom = new DebugAtom("C"); In otherwords, in your refactoring you'd write builder.newInstance(IAtom.class, "C"); which is just a rearrangement of what I wrote just above. So why do we need a builder? OK, you'd say that the builder pattern allows us to have different implementaitons - but how does your refactoring support that? It looks like I can only specify the interface for an atom (say), i.e., IAtom and the code returns an Atom object. How would do I specify a different implementaition of the IAtom interface? Do I need to implement a new builder? But then it looks like I need to reimplement the whole of the DefaultChemObjectBuilder? Sorry for all these questions, but I have never really wrapped my head around the builder architecture and this looks like a good time to clear up my fuzziness. (Also it might be useful to have this discussion on the list) Iobviously haven't gone through it line by line, but it seems OK. A second review will be useful ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=2977074&group_id=20024 |