osra [--learn] [-w <output file or folder name>] [--preview <filename>]
     [-s <dimensions, 300x400>] [-o <filename prefix>] [-v] [-d]
     [--ocr <oOcCnNHFsSBuUgMeEXYZRPp23456789AmThDGQ>] [-a <configfile>]
     [-l <configfile>] [-b] [-c] [-e] [-g] [-p]
     [--embedded-format <inchi/smi/can>] [--v3000] [-f <can/smi/sdf>]
     [--timeout <default: no timeout>] [-k] [-i] [-j]
     [-u <default: 0 rounds>] [-t <0.2..0.8>] [--pdf <default: 300>]
     [-r <default: auto>] [-n] [-R <0..360>] [--] [--version] [-h]
     <input file or folder name> ...
Most common use: osra filename
Use reaction output formats such as rxn, rsmi, or cmlr to run OSRA in reaction recognition mode instead of the default molecule recognition.
OSRA is a utility designed to convert graphical representations of chemical structures into SMILES or SDF. OSRA can read a document in any of the more than 90 graphical formats parseable by GraphicsMagick and generate the SMILES or SDF representation of the molecular structure images found in that document.
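The invocations described above can be sketched as a small Python helper that assembles an argument list for osra without running it. This is only an illustration: build_osra_command and is_reaction_mode are hypothetical names, not part of OSRA, and the set of reaction formats contains only the three named on this page (other formats may exist).

```python
from typing import List, Optional

# Reaction formats named on this page; the real list may be longer (assumption).
REACTION_FORMATS = {"rxn", "rsmi", "cmlr"}


def build_osra_command(input_path: str,
                       output_format: Optional[str] = None,
                       output_path: Optional[str] = None) -> List[str]:
    """Assemble an argv list for osra (hypothetical helper, not part of OSRA)."""
    argv = ["osra"]
    if output_format is not None:
        argv += ["-f", output_format]   # -f/--format: output format
    if output_path is not None:
        argv += ["-w", output_path]     # -w/--write: output file or folder
    argv.append(input_path)             # input file or folder goes last
    return argv


def is_reaction_mode(output_format: Optional[str]) -> bool:
    """True if the format is one of the reaction formats named on this page."""
    return output_format in REACTION_FORMATS
```

With no options the helper reproduces the most common use, osra filename; passing a format such as rxn yields a command line that requests reaction recognition.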
The program follows the usual GNU command line syntax, with long options starting with two dashes (`--').
--learn
Print out all structure guesses with confidence parameters
-w <filename>, --write <filename>
Write recognized structures to a file or folder
--preview <filename>
Preview Image
-s <dimensions, 300x400>, --size <dimensions, 300x400>
Resize image on output
-o <filename prefix>, --output <filename prefix>
Write recognized structures to image files with given prefix
-v, --verbose
Be verbose and print the program flow
-d, --debug
Print out debug information on spelling corrections
--ocr <oOcCnNHFsSBuUgMeEXYZRPp23456789AmThDGQ>
OCR character filter
-a <configfile>, --superatom <configfile>
Superatom label map to SMILES
-l <configfile>, --spelling <configfile>
Spelling correction dictionary
-b, --bond
Show average bond length in pixels (only for SDF/SMI/CAN output format)
-c, --coordinates
Show surrounding box coordinates (only for SDF/SMI/CAN output format)
-e, --page
Show page number for PDF/PS/TIFF documents (only for SDF/SMI/CAN output format)
-g, --guess
Print out resolution guess
-p, --print
Print out confidence estimate
--embedded-format <inchi/smi/can>
Embedded format
--v3000
Use V3000 format for MDL MOL and SDF output
-f <can/smi/sdf>, --format <can/smi/sdf>
Output format
--timeout <default: no timeout>
Timeout in seconds per file processing
-k, --keep
Keep image unsegmented, do not separate molecules from text
-i, --adaptive
Adaptive thresholding pre-processing, useful for low light/low contrast images
-j, --jaggy
Additional thinning/scaling down of low quality documents
-u <default: 0 rounds>, --unpaper <default: 0 rounds>
Pre-process image with the unpaper algorithm for the given number of rounds
-t <0.2..0.8>, --threshold <0.2..0.8>
Gray level threshold
--pdf <default: 300>
Resolution in dots per inch for PDF rendering
-r <default: auto>, --resolution <default: auto>
Resolution in dots per inch
-n, --negate
Invert color (white on black)
-R <0..360>, --rotate <0..360>
Rotate image clockwise by specified number of degrees
--, --ignore_rest
Ignores the rest of the labeled arguments following this flag.
--version
Displays version information and exits.
-h, --help
Displays usage information and exits.
<input file or folder name> (accepted multiple times)
(required) input file(s) or a single folder
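Several of the options above take bounded values (-t within 0.2..0.8, -R within 0..360). A script that drives osra might check those bounds before launching it. The sketch below assumes both ranges are inclusive, which this page does not state; preprocessing_args is a hypothetical helper name.

```python
from typing import List, Optional


def preprocessing_args(threshold: Optional[float] = None,
                       rotate: Optional[int] = None,
                       adaptive: bool = False,
                       negate: bool = False) -> List[str]:
    """Build pre-processing flags for osra, checking the documented ranges.

    Hypothetical helper; inclusive bounds are an assumption.
    """
    args: List[str] = []
    if threshold is not None:
        if not 0.2 <= threshold <= 0.8:
            raise ValueError("threshold must be within 0.2..0.8")
        args += ["-t", str(threshold)]  # -t/--threshold: gray level threshold
    if rotate is not None:
        if not 0 <= rotate <= 360:
            raise ValueError("rotate must be within 0..360")
        args += ["-R", str(rotate)]     # -R/--rotate: clockwise rotation in degrees
    if adaptive:
        args.append("-i")               # -i/--adaptive: adaptive thresholding
    if negate:
        args.append("-n")               # -n/--negate: invert color
    return args
```

The returned list is meant to be placed between the program name and the input file when assembling the full command line.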
This is the system-wide spelling correction dictionary for atom labels and abbreviations that the OCR engine might not parse correctly. The default location of the file can be overridden with the -l option. Run with the -d option for more debug output on OCR processing and spelling correction.
This is the system-wide table of translations from superatom labels to SMILES codes. The default location of the file can be overridden with the -a option.
The following diagnostic may be issued on stdout: Cannot open file "dummy.png" - the given input file does not exist or is not readable. OSRA also provides return codes that can be used in scripts.
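A script can combine the return code with the stdout diagnostic when deciding whether a run succeeded. The following Python sketch makes two assumptions that this page does not confirm: that a non-zero return code signals failure, and that the diagnostic line begins with the text Cannot open file. The names interpret_osra_output and run_osra are hypothetical.

```python
import subprocess
from typing import List, Optional, Tuple


def interpret_osra_output(returncode: int, stdout: str) -> Tuple[bool, List[str]]:
    """Classify a finished osra run.

    Returns (True, structure lines) on success, or (False, diagnostic lines).
    Assumes a non-zero return code means failure (not stated on this page).
    """
    lines = [line.strip() for line in stdout.splitlines() if line.strip()]
    errors = [line for line in lines if line.startswith("Cannot open file")]
    if returncode != 0 or errors:
        return False, errors
    return True, lines


def run_osra(input_path: str,
             timeout_seconds: Optional[int] = None) -> Tuple[bool, List[str]]:
    """Run osra on one input and interpret the result (requires osra on PATH)."""
    argv = ["osra"]
    if timeout_seconds is not None:
        argv += ["--timeout", str(timeout_seconds)]  # per-file timeout in seconds
    argv.append(input_path)
    completed = subprocess.run(argv, capture_output=True, text=True)
    return interpret_osra_output(completed.returncode, completed.stdout)
```

Keeping the interpretation separate from the process launch lets the parsing logic be exercised without OSRA installed.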
Report functionality-related bugs to the SourceForge BugTracker. Report Debian-packaging or Debian-specific bugs to the Debian BugTracker.