Home

Anonymous Igor

Description

OSRA (Optical Structure Recognition Application) is a utility that converts graphical representations of chemical structures and reactions, as they appear in journal articles, patent documents, textbooks, trade magazines, etc., into SMILES or MOL, which are computer-readable molecular structure formats. OSRA can read a document in any of the more than 90 graphical formats parseable by GraphicsMagick, including GIF, JPEG, PNG, TIFF, PDF, and PS, and generate a SMILES or MOL representation of each molecular structure image it finds in that document, or RSMI/RXN for reactions.

Note that no optical recognition software is likely to be perfect: the output might, and probably will, contain errors, so curation by a human knowledgeable in chemical structures is highly recommended.
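As a rough illustration of the workflow described above, the following Python sketch wraps a command-line OSRA run and collects its output for human review. It assumes an `osra` executable is on the PATH, that it accepts an input file name as its only argument, and that it prints one SMILES string per line to standard output; check these assumptions against the Usage wiki page. The function names are hypothetical helpers, not part of OSRA.

```python
import shutil
import subprocess
from typing import List, Optional


def build_osra_command(image_path: str, executable: str = "osra") -> List[str]:
    """Return the argument list for a plain OSRA run on one input file.

    Assumption: OSRA takes the input file name as a positional argument.
    """
    return [executable, image_path]


def parse_smiles_lines(stdout_text: str) -> List[str]:
    """Split text output into non-empty, whitespace-stripped lines.

    Assumption: each non-empty line is one recognized structure.
    """
    return [line.strip() for line in stdout_text.splitlines() if line.strip()]


def recognize_structures(image_path: str, executable: str = "osra") -> Optional[List[str]]:
    """Run OSRA if it is installed; return candidate SMILES, or None if it is not found."""
    if shutil.which(executable) is None:
        return None
    result = subprocess.run(
        build_osra_command(image_path, executable),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return parse_smiles_lines(result.stdout)


if __name__ == "__main__":
    candidates = recognize_structures("structure.png")  # hypothetical file name
    if candidates is None:
        print("osra executable not found on PATH")
    else:
        # The output may contain errors, so list it for a chemist to review.
        for index, smiles in enumerate(candidates, start=1):
            print(f"{index}: {smiles}  (needs human review)")
```

The results are treated as candidates rather than final answers, in line with the recommendation that a human curate the output.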

OSRA can process the following types of images:

  • Computer-generated 2D structures, such as those found on the PubChem website, in black-and-white or color.
  • Black-and-white PDF and PostScript files, including multi-page ones.
  • Black-and-white scanned images. A resolution of 300 dpi is recommended, though 150 dpi can also produce fair results. Please make sure the scanned image is of reasonable quality: an input that is too noisy will only generate garbage output.
  • Reactions and polymers.
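The resolution guidance for scanned images can be turned into a simple pre-check before submitting a batch of scans. This is only a sketch: the two thresholds come from the list above, while the label for scans below 150 dpi is an assumption, since the guidance does not say how OSRA behaves there. The function name is hypothetical.

```python
RECOMMENDED_DPI = 300   # resolution recommended for scanned images
MINIMUM_FAIR_DPI = 150  # resolution that "can also produce fair results"


def classify_scan_resolution(dpi: float) -> str:
    """Map a scan resolution to the guidance given for scanned images."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    if dpi >= RECOMMENDED_DPI:
        return "recommended"
    if dpi >= MINIMUM_FAIR_DPI:
        return "fair"
    # Assumption: below 150 dpi is outside the stated guidance.
    return "below guidance"


if __name__ == "__main__":
    for dpi in (600, 300, 200, 72):
        print(dpi, classify_scan_resolution(dpi))
```

A check like this addresses resolution only; it says nothing about noise, which still has to be judged by looking at the scan.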

You can download the source code for free, or support OSRA development by purchasing binary installation executables for Windows, Linux, and OS X.

Getting started with OSRA

OSRA is Free and Open Source Software. You are welcome to download and use it, provided that you understand the terms described above. Participation in the development is highly encouraged! We also welcome your feedback: send your comments, suggestions, criticism, or praise to the contact emails listed on the Contact_information wiki page.

Web Interface

To demonstrate the capabilities (and limitations) of OSRA we have created an OSRA Web Interface. Try this sample image from the US Patent Office website first.


Related

Wiki: Batch_Processing_and_Filtering
Wiki: Compilation
Wiki: Contact_information
Wiki: Dependencies
Wiki: Download
Wiki: License
Wiki: Main_Page
Wiki: News
Wiki: Plugins
Wiki: Success stories
Wiki: Usage
Wiki: Validation