Usage

Anonymous Igor

Brief usage

osra [--learn] [-w <output file or folder name>] [--preview <filename>]
     [-s <dimensions, 300x400>] [-o <filename prefix>] [-v] [-d]
     [--ocr <oOcCnNHFsSBuUgMeEXYZRPp23456789AmThDGQ>] [-a <configfile>]
     [-l <configfile>] [-b] [-c] [-e] [-g] [-p]
     [--embedded-format <inchi/smi/can>] [--v3000] [-f <can/smi/sdf>]
     [--timeout <default: no timeout>] [-k] [-i] [-j]
     [-u <default: 0 rounds>] [-t <0.2..0.8>] [--pdf <default: 300>]
     [-r <default: auto>] [-n] [-R <0..360>] [--] [--version] [-h]
     <input file or folder name> ...

Most common use: osra filename

Use reaction output formats such as rxn, rsmi, or cmlr to run OSRA in reaction recognition mode instead of the default molecule recognition.
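
As a rough illustration of the two invocations described above, the following Python sketch builds the argument list for the plain "osra filename" call and for a reaction-mode call. It assumes that the osra executable is on the PATH, that reaction formats are selected through the -f option like the other output formats, and that results are printed to stdout when -w is not given. The helper names and the file name scheme.png are hypothetical.

```python
import subprocess


def osra_command(input_path, output_format=None):
    """Build the argument list for a basic OSRA call.

    With no output_format this is the "most common use": osra filename.
    Passing a reaction format such as "rsmi" is assumed to switch OSRA
    to reaction recognition mode via the -f option.
    """
    cmd = ["osra"]
    if output_format is not None:
        cmd += ["-f", output_format]
    cmd.append(input_path)
    return cmd


def run_osra(input_path, output_format=None):
    """Run OSRA and return whatever it printed to stdout (assumes osra is installed)."""
    result = subprocess.run(
        osra_command(input_path, output_format),
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling run_osra("scheme.png") would then correspond to "osra scheme.png", and run_osra("scheme.png", "rsmi") to "osra -f rsmi scheme.png".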

Description

OSRA is a utility designed to convert graphical representations of chemical structures into SMILES or SDF. OSRA can read a document in any of the over 90 graphical formats parseable by GraphicsMagick and generate the SMILES or SDF representation of the molecular structure images encountered within that document.

Options

The program follows the usual GNU command line syntax, with long options starting with two dashes (`--').

  --learn
     Print out all structure guesses with confidence parameters

   -w <filename>,  --write <filename>
     Write recognized structures to a file or folder

   --preview <filename>
     Preview Image

   -s <dimensions, 300x400>,  --size <dimensions, 300x400>
     Resize image on output

   -o <filename prefix>,  --output <filename prefix>
     Write recognized structures to image files with given prefix

   -v,  --verbose
     Be verbose and print the program flow

   -d,  --debug
     Print out debug information on spelling corrections

   --ocr <oOcCnNHFsSBuUgMeEXYZRPp23456789AmThDGQ>
     OCR character filter

   -a <configfile>,  --superatom <configfile>
     Superatom label map to SMILES

   -l <configfile>,  --spelling <configfile>
     Spelling correction dictionary

   -b,  --bond
     Show average bond length in pixels (only for SDF/SMI/CAN output
     format)

   -c,  --coordinates
     Show surrounding box coordinates (only for SDF/SMI/CAN output format)

   -e,  --page
     Show page number for PDF/PS/TIFF documents (only for SDF/SMI/CAN
     output format)

   -g,  --guess
     Print out resolution guess

   -p,  --print
     Print out confidence estimate

   --embedded-format <inchi/smi/can>
     Embedded format

   --v3000
     Use V3000 format for MDL MOL and SDF output

   -f <can/smi/sdf>,  --format <can/smi/sdf>
     Output format

   --timeout <default: no timeout>
     Timeout in seconds per file processing

   -k,  --keep
     Keep image unsegmented, do not separate molecules from text

   -i,  --adaptive
     Adaptive thresholding pre-processing, useful for low light/low
     contrast images

   -j,  --jaggy
     Additional thinning/scaling down of low quality documents

   -u <default: 0 rounds>,  --unpaper <default: 0 rounds>
     Pre-process image with unpaper algorithm, rounds

   -t <0.2..0.8>,  --threshold <0.2..0.8>
     Gray level threshold

   --pdf <default: 300>
     Resolution in dots per inch for PDF rendering

   -r <default: auto>,  --resolution <default: auto>
     Resolution in dots per inch

   -n,  --negate
     Invert color (white on black)

   -R <0..360>,  --rotate <0..360>
     Rotate image clockwise by specified number of degrees

   --,  --ignore_rest
     Ignores the rest of the labeled arguments following this flag.

   --version
     Displays version information and exits.

   -h,  --help
     Displays usage information and exits.

   <input file or folder name>  (accepted multiple times)
     (required)  input file(s) or a single folder
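
The options above can be combined freely on one command line. The sketch below shows one possible way to assemble such a command from Python while checking the documented ranges for -t and -R before OSRA is started. It is only an illustration: the function name build_osra_args and its parameters are hypothetical, the range limits are assumed to be inclusive, and only options listed above are used.

```python
# Boolean switches taken from the Options list above.
BOOLEAN_FLAGS = {"-v", "-d", "-b", "-c", "-e", "-g", "-p", "-k", "-i", "-j", "-n"}


def build_osra_args(inputs, fmt=None, threshold=None, rotate=None,
                    resolution=None, write_to=None, flags=()):
    """Return an argument list for OSRA; inputs must be a list of paths."""
    if not inputs:
        raise ValueError("at least one input file or folder is required")
    args = ["osra"]
    if fmt is not None:
        args += ["-f", fmt]
    if threshold is not None:
        # Documented range for the gray level threshold is 0.2..0.8.
        if not 0.2 <= threshold <= 0.8:
            raise ValueError("threshold must be within 0.2..0.8")
        args += ["-t", str(threshold)]
    if rotate is not None:
        # Documented range for clockwise rotation is 0..360 degrees.
        if not 0 <= rotate <= 360:
            raise ValueError("rotate must be within 0..360")
        args += ["-R", str(rotate)]
    if resolution is not None:
        args += ["-r", str(resolution)]
    if write_to is not None:
        args += ["-w", write_to]
    for flag in flags:
        if flag not in BOOLEAN_FLAGS:
            raise ValueError("unknown switch: " + flag)
        args.append(flag)
    args += list(inputs)
    return args
```

For example, build_osra_args(["page.png"], fmt="sdf", threshold=0.5, flags=["-p"]) yields the list for "osra -f sdf -t 0.5 -p page.png", which can be handed to subprocess.run.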

Files

  • /opt/local/osra/2.2.1/spelling.txt

This is the system-wide spelling correction dictionary for atom labels and abbreviations that might not be parsed correctly by the OCR engine. The default location of the file can be overridden with the -l option. Run with the -d option for more debug output on OCR processing and spelling correction.

  • /opt/local/osra/2.2.1/superatom.txt

This is the system-wide map of superatom labels to SMILES codes. The default location of the file can be overridden with the -a option.
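
To show how the two default files might be overridden for a single run, here is a small Python sketch that adds -l and -a (and optionally -d) to an OSRA command. The function name and the example file names are hypothetical, and nothing here assumes a particular format for the dictionary files themselves.

```python
import os


def osra_with_dictionaries(image, spelling=None, superatom=None, debug=False):
    """Build an OSRA command that points at custom dictionary files.

    spelling  -> passed with -l (spelling correction dictionary)
    superatom -> passed with -a (superatom label map to SMILES)
    debug     -> adds -d for debug output on spelling corrections
    """
    cmd = ["osra"]
    if spelling is not None:
        cmd += ["-l", os.fspath(spelling)]
    if superatom is not None:
        cmd += ["-a", os.fspath(superatom)]
    if debug:
        cmd.append("-d")
    cmd.append(os.fspath(image))
    return cmd
```

With a local copy of the spelling file named my_spelling.txt, osra_with_dictionaries("page.png", spelling="my_spelling.txt", debug=True) corresponds to "osra -l my_spelling.txt -d page.png".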

Diagnostics

The following diagnostics may be issued on stdout:

  • Cannot open file "dummy.png": the given input file does not exist or is not readable.

OSRA also provides return codes that can be used in scripts:

  • Code 0: Program exited successfully.
  • Code 1 or other non-zero: Some error occurred. See the error messages mentioned above.
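
A script can act on these return codes roughly as in the Python sketch below. The names OsraError and run_checked are hypothetical. The helper takes a complete argument list so that it works for any option combination, treats any non-zero code as failure, and keeps the text of stdout and stderr so that a message such as the one quoted above is not lost.

```python
import subprocess


class OsraError(RuntimeError):
    """Raised when the command exits with a non-zero return code."""

    def __init__(self, returncode, message):
        super().__init__("exit code %d: %s" % (returncode, message))
        self.returncode = returncode
        self.message = message


def run_checked(argv):
    """Run argv (for example ["osra", "dummy.png"]) and return its stdout lines.

    Code 0 is treated as success; any other code raises OsraError with
    the captured output, since diagnostics may be issued on stdout.
    """
    proc = subprocess.run(argv, capture_output=True, text=True)
    if proc.returncode != 0:
        raise OsraError(proc.returncode, (proc.stdout + proc.stderr).strip())
    return proc.stdout.splitlines()
```

If the osra executable itself cannot be found, subprocess.run raises FileNotFoundError before any return code is available, so a calling script may want to handle that case separately.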

Bugs

Report functionality-related bugs to the SourceForge BugTracker. Report Debian-packaging or Debian-specific bugs to the Debian BugTracker.


