From: Christoph Steinbeck <steinbeck@ic...>  20020205 10:25:38

Thanks so much for this. Currently I'm travelling, but as soon as I'm back I'll see if we can get the Java version running. BTW, I only got the .py thingy as an attachment, not the java skeleton that you werer talking about. Cheers, Chris Brian Kelley wrote: > As stated before, there are two steps to canonicalize a molecule. > > The first step is to create the symmtery class and symmetry orders for > each atom in the molecule. > > A symmetry class is defined to be a unique integer for all symmetrically > identical atoms in the graph. > A symmetry order is defined to be a unique integer for all atoms in the > graph. > > Symmetry classes are defined in the first pass and a tie breaking scheme > is used in the second pass to generate the symmetry orders. > > The symmetry orders are used to traverse the molecule in order to > generate the ordered smiles string. > > The algorithm is fairly simple conceptually: > 1 start at the atom with the lowest symmetry order > 2 sort the connected atoms using symmetry orders ignoring atoms that > have been visited > 3 recursivly traverse to the next connected atom with the lowest > symmetry order > 4 go to 2 until all atoms are visited > > Repeat until all atoms have been visited. Then sort all the generated > fragments. > > Enclosed is the python version of generating symmetry orders and an > incomplete first pass at a java implementation. > > The java implementation is include for purposes of critique and is not > functional at this point. > > A brief explanation of the molecule model > > molecule.atoms > atoms in a molecule > molecule.bonds > bonds in a molecule > > atom.bonds > connected bonds > atom.oatoms > connected atoms in the same order as atom.bonds > > bond.atoms > atoms at either end of a bond > > atom.xatom(bond) > returns the atom at the other end of bond > bond.xatom(atom) > returns the atom at the other end of bond > > atom.handle > bond.handle > molecule.handle > mimic of daylights handle system for keeping track > of objects > each object has a unique handle that I > generally use for lookups > > > For the java code the SymmetryOrders is just a constructor the test > purposes. The real constructor will take a cdk molecule. > > The code should be a good start for discussion. > > >  > > """Disambiguate > > FreedDisambiguate(graph) > > given atom and bond equiv_classes use Freed's technique to > find symmettry classes and symmetry orders for the graph > """ > import Primes > > class FreedDisambiguate: > def __init__(self, graph): > """(graph) > > given atom and bond equiv_classes use Freed's technique to > find symmettry classes and symmetry orders for the graph > """ > self.graph = graph > atoms = graph.atoms > self.range = range(len(atoms)) > offsets = self.offsets = [] > > # symmetry classes are the symmetrical points in a graph > # so if two vertices share the same value, they are symmetric > symclasses = self.symclasses = [] > > # symmetry orders describe the traversal order to canonicalize > # the graph. All of these should be unique values > symorders = self.symorders = [0] * len(atoms) > indices = {} > > # cache the atom indices in a lookup table > index = 0 > for atom in atoms: > indices[atom.handle] = index > index += 1 > > # initialize the necessary data > for atom in atoms: > # For each atom, get the neighbors offsets and the connected > # bondtype for a given atom > # N.B. the oatom (adjacent atom) > # list and the bond list must be kept in sync > # bondtype must be an integer!!!! > > neighbors = zip([indices[x.handle] for x in atom.oatoms], > [x.bondtype for x in atom.bonds]) > > # store the symorder index of the neighboring atoms > # and the bondtype used to traverse to each neighbor > > # This forms a list of the form > # (neighbor_symorder_index, bondtype), > # ...(neighbor_symorder_index, bondtype) ... > offsets.append(neighbors) > > # the symmetry classes start out as the atom's equivalence > # class > symclasses.append(atom.equiv_class) > > ######################################################### > # find the symmetry classes > # rank the current symmetry orders > self.rank() > > # Find the sym classes invariant for the current > self.symclasses = self.findInvariant(self.symclasses) > > # now find the symmetry orders from the symmetry classes > self.symorders = self.symclasses[:] > > self.findInvariantPartitioning() > > # Give them back to the atoms and bonds > symclasses = self.symclasses > symorders = self.symorders > i = 0 > for atom in atoms: > atom.symclass = symclasses[i] > atom.symorder = symorders[i] > i = i + 1 > > def disambiguate(self, symclasses): > """Use the connection to the atoms around a given vertex > as a multiplication function to disambiguate a vertex""" > offsets = self.offsets > result = symclasses[:] > for index in self.range: > try: > val = 1 >  Dr. Christoph Steinbeck (http://www.ice.mpg.de/departments/ChemInf) MPI of Chemical Ecology, Winzerlaer Str. 10, Beutenberg Campus, 07745 Jena, Germany Tel: +49(0)3641 571263  Fax: +49(0)3641 571202 What is man but that lofty spirit  that sense of enterprise. ... Kirk, "I, Mudd," stardate 4513.3.. 