Enabling chemists to submit experimental or theoretical data together with a publication requires software (commercial and open access) that can create, handle or transform chemistry-related data. This includes chemical drawings, reactions, spectral data and chemical property data.
Data for publication supplements should be submitted in open data formats (XML, CML, ThermoML, JCAMP) or at least in well-defined data formats (such as SD format V3000). Chemical data supplements should not be submitted in PDF, a format that destroys chemical information and hinders automated machine reading. Publishing molecule and reaction drawings as picture data (TIFF, BMP, PNG) is needed for the print process, but it prevents any simple automated capture of the structures. Such bitmap data must instead be run through an optical character recognition (OCR) process to recover the chemical formulas. This process is error-prone, has poor accuracy, and would not be needed if the chemical metadata were submitted as CML.
Modern chemistry software should be able to export CML for molecule and reaction drawings, and software that captures experimental thermodynamic or spectroscopic data should support open data exchange formats (JCAMP, netCDF, ThermoML and others).
This section covers tools and data formats for molecules (mol, sdf, cml, SMILES) and reaction data (rxn, rdf, cml, SMARTS, SMIRKS). Chemical drawings should be exported as CML (Chemical Markup Language) or in mol format. Software- or vendor-specific formats and, even worse, picture formats (BMP, JPEG, TIFF) should not be used. If possible, a list of InChI codes (InChIKey) should be generated for all molecules. See the examples below.
| Name | Vendor | Open/Closed Source | Operating System | Note |
|---|---|---|---|---|
| ISISDraw | MDL | Closed | Windows | deprecated; no CML import/export; copy/paste into other programs is possible |
| ChemDraw | CambridgeSoft | Closed | Windows | |
| ChemSketch | ACDLabs | Closed | Windows | |
| MarvinSketch | ChemAxon | Closed | Windows | |
| KnowItAll | BioRad | Closed | Windows | |
| XDrawChem | <http://xdrawchem.sourceforge.net/> | Open | Windows/Linux/OS X | |
| JChemPaint | <http://cdk.sourceforge.net/> | Open | Platform independent | |
| Bioclipse | <http://www.bioclipse.net/> | Open | Windows/Linux/OS X | |
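To illustrate what a CML export contains, the following minimal sketch builds a small CML molecule by hand with the Python standard library. It assumes the commonly used CML namespace `http://www.xml-cml.org/schema` and the elements `molecule`, `atomArray`, `atom`, `bondArray` and `bond`; the function name `ethanol_cml` and the choice of ethanol are purely illustrative. In practice the file would come from the export function of a drawing tool and should be validated against the CML schema.

```python
import xml.etree.ElementTree as ET

# Assumption: the commonly used CML namespace URI.
CML_NS = "http://www.xml-cml.org/schema"


def ethanol_cml() -> str:
    """Return a small CML molecule (heavy atoms of ethanol) as an XML string."""
    ET.register_namespace("", CML_NS)  # serialise CML as the default namespace
    mol = ET.Element(f"{{{CML_NS}}}molecule", {"id": "ethanol"})

    # Heavy atoms with implicit hydrogen counts: CH3, CH2, OH.
    atoms = ET.SubElement(mol, f"{{{CML_NS}}}atomArray")
    for atom_id, element, h_count in [("a1", "C", 3), ("a2", "C", 2), ("a3", "O", 1)]:
        ET.SubElement(
            atoms,
            f"{{{CML_NS}}}atom",
            {"id": atom_id, "elementType": element, "hydrogenCount": str(h_count)},
        )

    # Two single bonds: C-C and C-O.
    bonds = ET.SubElement(mol, f"{{{CML_NS}}}bondArray")
    for first, second in [("a1", "a2"), ("a2", "a3")]:
        ET.SubElement(
            bonds,
            f"{{{CML_NS}}}bond",
            {"atomRefs2": f"{first} {second}", "order": "1"},
        )

    return ET.tostring(mol, encoding="unicode")


cml_text = ethanol_cml()
print(cml_text)
```

Because the structure is stored as atoms and bonds rather than pixels, a reader can parse it back without any recognition step, which is the point of preferring CML over picture formats.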
Chemical drawings of reactions should be exported in CML or RXN format. See the examples below.
| Name | Vendor | Open/Closed Source | Operating System | Note |
|---|---|---|---|---|
| ISISDraw | MDL | Closed | Windows | deprecated; no CML import/export; copy/paste into other programs is possible |
| ChemDraw | CambridgeSoft | Closed | Windows | |
| ChemSketch | ACDLabs | Closed | Windows | |
| MarvinSketch | ChemAxon | Closed | Windows | |
| KnowItAll | BioRad | Closed | Windows | |
| JChemPaint | <http://cdk.sourceforge.net/> | Open | Platform independent | |
| Bioclipse | <http://www.bioclipse.net/> | Open | Windows/Linux/OS X | |
The table below is only intended to give an overview of the functionality of a limited number of codes. The pages linked in the "Special Features" column are places for users and developers to highlight particular strengths or unique features of a code.
For a more comprehensive list of the various builders and visualisers that are available, please see the Linux4Chemistry list or Mario Valle's list of Free Chemistry Visualisation Tools.
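| Program | Building: small mol. | Building: large struct. | Building: periodic struct. | Building: internal minimiser | Visualising: molecules | Visualising: isosurfaces | Visualising: vector fields | Windows | Mac OSX | Linux | Open | Special Features |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Aten | y | y | y | y | y | y | - | - | y | y | ? | [AtenFeatures] |
| Avogadro | y | y | y | y | y | y | - | y | y | y | y | [AvogadroFeatures] |
| CCP1GUI | y | - | - | y | y | y | y | y | y | y | ? | [CCP1GUIFeatures] |
| Jmol | y | y | y | y | y | y | - | y | y | y | y | [JmolFeatures] |
| Molden | y | - | - | y | y | y | - | y | y | y | ? | [MoldenFeatures] |
| Zeobuilder | y | y | y | - | y | - | - | - | - | y | ? | [ZeobuilderFeatures] |

The entries for Molekel and Jamberoo are incomplete; their values are listed in the recorded order, without column assignment:

Molekel: - y y - y y y ? [MolekelFeatures]
Jamberoo: y y y y y y [ZeobuilderFeatures]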
Such converter tools can be used to convert chemical data into accepted data formats (CML, MOL, SDF, PDB).
Pure experimental and calculated molecular property data (mp, bp, logP, pKa, solubility, toxicity, molecular descriptors) should be supplied in open data formats such as XML. TXT (tab-separated) and XLS (BIFF4 or later) are allowed but discouraged. If structural data is available, the files should be exported in SDF format so that the property data is stored together with the molecular information. Supplements in PDF format are forbidden. Large files should be compressed in .gz or .zip format.
| Name | Vendor | Open/Closed Source | Note |
|---|---|---|---|
| Bioclipse | Bioclipse team | Open | |
| EXCEL | Microsoft | Closed | |
| Calc Spreadsheet | OpenOffice | Open | |
| Instant-JChem | ChemAxon | Closed | |
| ACDLabs | | Closed | several spectral data packages |
| 7ZIP | | ? | free compression and decompression tool for WIN/LINUX/OSX |
| TRC | | ? | tools for ThermoML conversion, capturing of experimental data and data format conversions |
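As an illustration of moving property data from a tab-separated text file into XML and compressing the result, here is a minimal sketch using only the Python standard library. The element names `propertyTable`, `compound` and `property`, the column names, and the function `tsv_to_xml` are hypothetical and chosen for this example only; they do not follow ThermoML or any other published schema, which a real submission should use where one exists.

```python
import csv
import gzip
import io
import xml.etree.ElementTree as ET

# Illustrative tab-separated input: one compound per row, one property per column.
TSV_TEXT = "name\tmp_C\tbp_C\nbenzene\t5.5\t80.1\ntoluene\t-95\t110.6\n"


def tsv_to_xml(tsv_text: str) -> bytes:
    """Convert a tab-separated property table into a simple XML document.

    The XML vocabulary used here is hypothetical, not a published schema.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    root = ET.Element("propertyTable")
    for row in reader:
        compound = ET.SubElement(root, "compound", {"name": row["name"]})
        for key, value in row.items():
            if key == "name":
                continue  # the name is stored as an attribute, not a property
            prop = ET.SubElement(compound, "property", {"id": key})
            prop.text = value
    return ET.tostring(root, encoding="utf-8")


xml_bytes = tsv_to_xml(TSV_TEXT)
compressed = gzip.compress(xml_bytes)  # same content a .gz supplement would hold

print(xml_bytes.decode("utf-8"))
print(f"{len(xml_bytes)} bytes of XML, {len(compressed)} bytes after gzip")
```

For very small files such as this one the gzip container overhead can outweigh the savings; compression pays off for the large data sets mentioned above.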
The spectral data covered here includes NMR, MS, UV, IR, GC-MS, LC-MS and LC-UV.
Vendor-specific formats (like .skc in the case of ISIS Draw, or SMILES) are allowed but discouraged. Large files should be compressed in .gz (GNU zip) or .zip format.
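Open spectral formats such as JCAMP-DX are plain text, so their metadata can be read without vendor software. The following minimal sketch reads the labelled data records (lines of the form `##LABEL=value`) from a JCAMP-DX text. The sample spectrum is made up for illustration, the function name `read_jcamp_header` is hypothetical, and the sketch deliberately does not decode the `XYDATA` block or handle compressed data forms.

```python
# Made-up JCAMP-DX text for illustration; the values are not measured data.
JCAMP_TEXT = """##TITLE=example IR spectrum (made-up values)
##JCAMP-DX=4.24
##DATA TYPE=INFRARED SPECTRUM
##XUNITS=1/CM
##YUNITS=TRANSMITTANCE
##XFACTOR=1
##YFACTOR=1
##FIRSTX=4000
##LASTX=3996
##NPOINTS=3
##XYDATA=(X++(Y..Y))
4000 0.91 0.88 0.85
##END=
"""


def read_jcamp_header(text: str) -> dict:
    """Collect the '##LABEL=value' records of a JCAMP-DX text into a dict.

    Data lines (those not starting with '##') are skipped, and anything
    after a '$$' comment marker on a record line is dropped.
    """
    header = {}
    for line in text.splitlines():
        if not line.startswith("##"):
            continue
        label, _, value = line[2:].partition("=")
        header[label.strip().upper()] = value.split("$$")[0].strip()
    return header


header = read_jcamp_header(JCAMP_TEXT)
for label in ("TITLE", "DATA TYPE", "XUNITS", "NPOINTS"):
    print(f"{label}: {header[label]}")
```

A complete reader would also need to interpret the data block according to the form given in the `XYDATA` record; established libraries or converter tools are a better choice for that.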