No problem,

So looking at the patch neither reader actually sets the bond order as null but instead does use a new enumeration as I suggested. Sorry for my mistake I read "bonds like the '4' query bond type in MDL molfiles will now be read as UNSET" as - "will be set as CDKConstants.UNSET" but this is not the case. It actually adds a new enum Order.UNSET.

public enum Order {
       SINGLE,
       DOUBLE,
       TRIPLE,
       QUADRUPLE,
       UNSET  <- new
}

For the patch I believe naming this as UNKNOWN instead of UNSET would follow enum naming conventions more closely. After all the order isn't unset for a query bond, it's "unknown" as we don't "know" whether it's up or down?

Egon, you didn't address this but what do you make of having different conventions to the query types in IBond.Stereo? As I said previously having the SINGLE_OR_DOUBLE enum opposed to the flag would align to the same storage of query IBond.Stereo. I know there are already inconsistencies (IAtomParity/setAtomParity springs to mind) do we want to introduce another one? I understand that this would require major rewrites but these are probably already needed if we introduce the Order.UNDER or null (see. below).

public enum Stereo {
NONE,
UP,
UP_INVERTED,
DOWN,
DOWN_INVERTED,
UP_OR_DOWN,
        UP_OR_DOWN_INVERTED,
E_OR_Z,
        E,
         Z,
E_Z_BY_COORDINATES
}

Having null or Order.UNSET will break some core methods. For example:

        AtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(container);

will break if the order is null or a new Order.UNSET enum is introduced as it this method in the manipulator is used:

public Order getMaximumBondOrder(IAtom atom) {
IBond.Order max = IBond.Order.SINGLE;
for (int i = 0; i < bondCount; i++) {
if (bonds[i].contains(atom) && 
bonds[i].getOrder().ordinal() > max.ordinal()) {
max = bonds[i].getOrder();
}
}
return max;
}

if null is introduced we get a NPE, if Order.UNSET is introduced ordinal() will equal 5 and thus be the maximum bond order which I think would then lead to being unable to type the atom.
I'm sure there are more cases like this.

Cheers,
J

On 15 Jun 2012, at 10:02, Egon Willighagen wrote:

Hi John,

just some random comments, not meant as arguments...

On Fri, Jun 15, 2012 at 9:13 AM, John May <john.wilkinsonmay@gmail.com> wrote:
My biggest gripe with this is setting the order to null. I'm of the opinion that you should never use null to represent the absence of a value.

This is actually a deliberate design in the CDK: to use null for unset
values. This was chosen over the alternative of things having UNSET
for all sorts of enums etc.

Not using null for 'unset' everywhere would make the interface complexer...

Setting the order to null and using a flag would provide more null pointer exceptions and require more defensive and cluttered coding with (!= null) statements. I'm aware the bond order can be null already but this would amplify the problem.

At the same time, to surfaces problems... dedicated UNSET values can
more easily be ignored by code, leading to unsuspected behavior...

Here's my suggestions.

Set this as a new order, we already do this with the stereo (UP_OR_DOWN) and would make for a more consistent API. We could also have a new UNKNOWN enumeration that would remove all null checking
on bond order.

This is a very significant rewrite of the CDK, and many algorithms
would need to be updated...

public enum Order {
       UNKNOWN,                        <- new
       SINGLE,
       SINGLE_OR_DOUBLE,               <- new
       DOUBLE,
       TRIPLE,
       QUADRUPLE
}

IBond.Order is for fixed bond orders... how should algorithms handle
this. That makes your second proposal more attractive to me...

another options is to make the order interface and have two types of order?

public interface IOrder;

public enum BasicOrder implements IOrder {
       SINGLE,
       DOUBLE,
       TRIPLE,
       QUADRUPLE
}

public enum QueryOrder implements IOrder {
       SINGLE_OR_DOUBLE
}

Because in this proposal you clearly have an indication of the query
concept, much like in the IQueryBond...

The patch by Klas takes a middle road, that does not break the current
design, but does introduce the concept. Please review the patch as
such.

Egon

--
Dr E.L. Willighagen
Postdoctoral Researcher
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
Cdk-devel mailing list
Cdk-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cdk-devel