Thanks so much for this.
Currently I'm travelling, but as soon as I'm back I'll see if we can get
the Java version running.
BTW, I only got the .py thingy as an attachment, not the java skeleton
that you werer talking about.
Cheers,
Chris
Brian Kelley wrote:
> As stated before, there are two steps to canonicalize a molecule.
>
> The first step is to create the symmtery class and symmetry orders for
> each atom in the molecule.
>
> A symmetry class is defined to be a unique integer for all symmetrically
> identical atoms in the graph.
> A symmetry order is defined to be a unique integer for all atoms in the
> graph.
>
> Symmetry classes are defined in the first pass and a tie breaking scheme
> is used in the second pass to generate the symmetry orders.
>
> The symmetry orders are used to traverse the molecule in order to
> generate the ordered smiles string.
>
> The algorithm is fairly simple conceptually:
> 1 start at the atom with the lowest symmetry order
> 2 sort the connected atoms using symmetry orders ignoring atoms that
> have been visited
> 3 recursivly traverse to the next connected atom with the lowest
> symmetry order
> 4 go to 2 until all atoms are visited
>
> Repeat until all atoms have been visited. Then sort all the generated
> fragments.
>
> Enclosed is the python version of generating symmetry orders and an
> incomplete first pass at a java implementation.
>
> The java implementation is include for purposes of critique and is not
> functional at this point.
>
> A brief explanation of the molecule model
>
> molecule.atoms > atoms in a molecule
> molecule.bonds > bonds in a molecule
>
> atom.bonds > connected bonds
> atom.oatoms > connected atoms in the same order as atom.bonds
>
> bond.atoms > atoms at either end of a bond
>
> atom.xatom(bond) > returns the atom at the other end of bond
> bond.xatom(atom) > returns the atom at the other end of bond
>
> atom.handle
> bond.handle
> molecule.handle > mimic of daylights handle system for keeping track
> of objects
> each object has a unique handle that I
> generally use for lookups
>
>
> For the java code the SymmetryOrders is just a constructor the test
> purposes. The real constructor will take a cdk molecule.
>
> The code should be a good start for discussion.
>
>
> 
>
> """Disambiguate
>
> FreedDisambiguate(graph)
>
> given atom and bond equiv_classes use Freed's technique to
> find symmettry classes and symmetry orders for the graph
> """
> import Primes
>
> class FreedDisambiguate:
> def __init__(self, graph):
> """(graph)
>
> given atom and bond equiv_classes use Freed's technique to
> find symmettry classes and symmetry orders for the graph
> """
> self.graph = graph
> atoms = graph.atoms
> self.range = range(len(atoms))
> offsets = self.offsets = []
>
> # symmetry classes are the symmetrical points in a graph
> # so if two vertices share the same value, they are symmetric
> symclasses = self.symclasses = []
>
> # symmetry orders describe the traversal order to canonicalize
> # the graph. All of these should be unique values
> symorders = self.symorders = [0] * len(atoms)
> indices = {}
>
> # cache the atom indices in a lookup table
> index = 0
> for atom in atoms:
> indices[atom.handle] = index
> index += 1
>
> # initialize the necessary data
> for atom in atoms:
> # For each atom, get the neighbors offsets and the connected
> # bondtype for a given atom
> # N.B. the oatom (adjacent atom)
> # list and the bond list must be kept in sync
> # bondtype must be an integer!!!!
>
> neighbors = zip([indices[x.handle] for x in atom.oatoms],
> [x.bondtype for x in atom.bonds])
>
> # store the symorder index of the neighboring atoms
> # and the bondtype used to traverse to each neighbor
>
> # This forms a list of the form
> # (neighbor_symorder_index, bondtype),
> # ...(neighbor_symorder_index, bondtype) ...
> offsets.append(neighbors)
>
> # the symmetry classes start out as the atom's equivalence
> # class
> symclasses.append(atom.equiv_class)
>
> #########################################################
> # find the symmetry classes
> # rank the current symmetry orders
> self.rank()
>
> # Find the sym classes invariant for the current
> self.symclasses = self.findInvariant(self.symclasses)
>
> # now find the symmetry orders from the symmetry classes
> self.symorders = self.symclasses[:]
>
> self.findInvariantPartitioning()
>
> # Give them back to the atoms and bonds
> symclasses = self.symclasses
> symorders = self.symorders
> i = 0
> for atom in atoms:
> atom.symclass = symclasses[i]
> atom.symorder = symorders[i]
> i = i + 1
>
> def disambiguate(self, symclasses):
> """Use the connection to the atoms around a given vertex
> as a multiplication function to disambiguate a vertex"""
> offsets = self.offsets
> result = symclasses[:]
> for index in self.range:
> try:
> val = 1
>

Dr. Christoph Steinbeck (http://www.ice.mpg.de/departments/ChemInf)
MPI of Chemical Ecology, Winzerlaer Str. 10, Beutenberg Campus, 07745
Jena, Germany
Tel: +49(0)3641 571263  Fax: +49(0)3641 571202
What is man but that lofty spirit  that sense of enterprise.
... Kirk, "I, Mudd," stardate 4513.3..
