OPSIN - Open Parser for Structural IUPAC Nomenclature
version 0.5.2 (see ReleaseNotes.txt for what's new in this version)

Daniel Lowe (current maintainer), Dr. Peter Corbett and Prof. Peter Murray-Rust

Contact address: dl387@cam.ac.uk

Factored out of OSCAR3 (again) by Daniel Lowe. Thanks to Richard Apodaca for doing this previously.

***NB*** OPSIN is not OSCAR3. OPSIN was developed as an OSCAR3
   component but can also be used standalone - hence this package.

This is a library for IUPAC name-to-structure conversion. It should
currently be considered to be under development, although the interface for using it will remain constant.

The workings of OPSIN are more fully described in:

Peter Corbett, Peter Murray-Rust. High-throughput identification of
chemistry in life science texts. Proceedings of Computational Life
Sciences (CompLife) 2006, Cambridge, UK, pp. 107-118.

The following lists broadly summarise what OPSIN can currently do and what will be worked on in the future.

Supported nomenclature includes:
alkanes/alkenes/alkynes/heteroatom chains e.g. hexane, hex-1-ene, tetrasiloxane and their cyclic analogues e.g. cyclopropane
All IUPAC 1993 recommended rings
Trivial acids
Hantzsch-Widman e.g. 1,3-oxazole
Spiro systems (using von Baeyer brackets)
All von Baeyer rings e.g. bicyclo[2.2.2]octane
Hydro e.g. 2,3-dihydropyridine
Indicated hydrogen e.g. 1H-benzoimidazole
Heteroatom replacement
Specification of charge e.g. ium/ide
Multiplicative nomenclature e.g. ethylenediaminetetraacetic acid
Fused ring systems with some exceptions e.g. imidazo[4,5-d]pyridine
Ring assemblies e.g. biphenyl
Most prefix and infix functional replacement nomenclature
The following functional classes: esters, diesters, glycols, acids, azides, bromides, chlorides, cyanates, cyanides, fluorides, fulminates, hydroperoxides, iodides, isocyanates, isocyanides, isoselenocyanates, isothiocyanates, selenocyanates, thiocyanates, alcohols, selenols, thiols, ethers, ketones, peroxides, selenides, selenones, selenoxides, selones, selenoketones, sulfides, sulfones, sulfoxides, tellurides, telluroketones, tellurones, telluroxides and thioketones
Locanted E/Z/R/S stereochemistry

Currently UNsupported nomenclature includes:
Any stereochemistry other than locanted E/Z/R/S stereochemistry
Greek letters
Lambda convention
Amino Acids (simple substitutive operations are allowed)
Carbohydrates
Steroids
Nucleic acids
Bridged rings 
Fused ring systems built from more than one fusion or that involve non 6-membered rings AND are not in a chain
Some conjunctive operations e.g. cyclohexaneethanol
Some functional replacement nomenclature
The following functional classes: hydrazides, lactones, lactams, acetals, hemiacetals, oximes, oxides, ketals, hydrazones, anhydrides and semicarbazones

To use OPSIN, you'll first need to build it, using the accompanying
ant build file. (A standalone jar file is also available from SourceForge if you are not familiar with
ant and do not wish to alter the source code.)

The command:

ant dist

will make a combined .jar file which also contains OPSIN's
dependencies (these are supplied with this package).

To run OPSIN, the class you want is uk.ac.cam.ch.wwmm.NameToStructure. It should be chosen automatically when the jar is run, even if not specified.
This class has a main method, so you can run:

java -jar opsin-0.5.2.jar

then type names in and get CML (chemical markup language) back.

To use within Java

1) Learn about XOM (http://xom.nu), the XML processing framework used
   by OPSIN
2) Create an OPSIN instance by calling the following static method:

NameToStructure nameToStructure = NameToStructure.getInstance();

3) Get CML (as XOM Elements), thus:

Element cmlElement = nameToStructure.parseToCML("acetonitrile");

4) Whatever you like. Maybe print it out, thus:

System.out.println(cmlElement.toXML());

NOTE: For efficiency, reuse the same instance of NameToStructure. parseToCML will typically take 5-10 ms to convert a name to CML, making OPSIN suitable for use on a large number of names.
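
The advice to reuse one instance might be sketched as below. So that the sketch compiles without the OPSIN jar on the classpath, the converter is passed in as a plain Function<String, String>; in real use one would presumably build it once from a single NameToStructure instance, e.g. name -> nameToStructure.parseToCML(name).toXML(). The class BatchConverter and the method convertAll are hypothetical names and not part of OPSIN. How OPSIN reports a name it cannot convert is also an assumption here: the sketch treats both a null result and a runtime exception as a failure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper (not part of OPSIN): applies ONE shared converter to many names.
class BatchConverter {

    // Converts every name with the same converter and returns name -> result.
    // Assumption: a failed conversion appears as null or as a RuntimeException;
    // such names are appended to 'failures' instead of the result map.
    static Map<String, String> convertAll(List<String> names,
                                          Function<String, String> converter,
                                          List<String> failures) {
        Map<String, String> results = new LinkedHashMap<>();
        for (String name : names) {
            try {
                String cml = converter.apply(name);
                if (cml == null) {
                    failures.add(name);
                } else {
                    results.put(name, cml);
                }
            } catch (RuntimeException e) {
                failures.add(name);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Stand-in converter so the sketch runs without OPSIN; replace it with a
        // function built once around a single NameToStructure instance.
        Function<String, String> standIn =
            name -> name.isEmpty() ? null : "<cml><!-- " + name + " --></cml>";

        List<String> failures = new ArrayList<>();
        Map<String, String> converted =
            convertAll(Arrays.asList("acetonitrile", "", "hexane"), standIn, failures);

        System.out.println(converted.size() + " converted, " + failures.size() + " failed");
    }
}
```

The point of the design is only that the converter is created once, outside the loop, and then shared by every call.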

CML can, if desired, be converted to other formats such as SD, SMILES, InChI etc. by toolkits such as CDK, OpenBabel and JUMBO.
(NOTE: if you want InChI, the simplest and fastest way is to use the separately available NameToInchi jar in conjunction with an OPSIN jar.)
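
For a quick look at the CML without a full toolkit, the XML can also be read with the JDK's built-in parser. The sketch below counts atoms per element by reading the elementType attribute of each CML atom element. The class CmlAtomCounter is a hypothetical name, and the CML fragment in main is hand-written for illustration; it is not actual OPSIN output, which may differ in detail (e.g. explicit hydrogens or extra attributes).

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper (not part of OPSIN): tallies atoms in a CML string by element symbol.
class CmlAtomCounter {

    // Returns element symbol -> number of <atom> elements carrying that elementType.
    // Malformed XML is reported as an IllegalArgumentException.
    static Map<String, Integer> countElements(String cml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(cml)));

            // "*" matches <atom> elements in any namespace.
            NodeList atoms = doc.getElementsByTagNameNS("*", "atom");
            Map<String, Integer> counts = new TreeMap<>();
            for (int i = 0; i < atoms.getLength(); i++) {
                Element atom = (Element) atoms.item(i);
                counts.merge(atom.getAttribute("elementType"), 1, Integer::sum);
            }
            return counts;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse CML", e);
        }
    }

    public static void main(String[] args) {
        // Hand-written illustrative fragment (heavy atoms of acetonitrile only).
        String cml =
            "<cml xmlns='http://www.xml-cml.org/schema'><molecule id='m1'>"
          + "<atomArray>"
          + "<atom id='a1' elementType='C'/>"
          + "<atom id='a2' elementType='C'/>"
          + "<atom id='a3' elementType='N'/>"
          + "</atomArray>"
          + "<bondArray>"
          + "<bond atomRefs2='a1 a2' order='1'/>"
          + "<bond atomRefs2='a2 a3' order='3'/>"
          + "</bondArray>"
          + "</molecule></cml>";
        System.out.println(countElements(cml)); // prints {C=2, N=1}
    }
}
```

For real format conversion (SD, SMILES, InChI) one of the toolkits named above remains the appropriate route; this only shows that the CML is ordinary XML.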

Good luck, and let us know if you have problems, comments or suggestions!
You can contact us by posting a message on SourceForge or you can email me directly (dl387@cam.ac.uk).
Source: README.txt, updated 2009-10-04