sdsorter is a tool for manipulating molecular data within an SDF file
based on the contents of its SD tags. There are three variants:
sdsorter - raw processing of sdf files (no other formats supported) *RECOMMENDED*
obsdsorter - parses inputs with OpenBabel, slower but supports many file formats *DEPRECATED*
oesdsorter - parses inputs with OpenEye, faster than OpenBabel, but requires OpenEye license *DEPRECATED*
The basic sdsorter still has a dependency on the OpenBabel libraries.
Gzipped inputs and outputs are fully supported.
This is just a utility I wrote to be helpful in my own work, so there
is somewhat of a hodgepodge of features, some of which may not have
received much attention lately. Feel free to make suggestions.
Note that the program takes exactly one input file and one output file.
If you accidentally give it multiple input files (for example, through a
shell glob), you may be disappointed to find the contents of one of them
gone, because the program treats that file as the output file and
overwrites it.
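One way to guard against this is a small shell wrapper that inspects the arguments before calling sdsorter. The sketch below is not part of sdsorter: the name safe_sdsorter is made up for illustration, and it assumes the input and output files are always the last two arguments. It refuses to run unless exactly one argument names an existing file and that argument is the input, so it also refuses to overwrite an existing output file, which is stricter than sdsorter itself.

```shell
# safe_sdsorter (hypothetical helper, not shipped with sdsorter):
# run sdsorter only if exactly one argument is an existing file
# and that file sits in the input position (second to last).
safe_sdsorter() {
    if [ "$#" -lt 2 ]; then
        echo "usage: safe_sdsorter [options] <input file> <output file>" >&2
        return 2
    fi

    _existing=0   # how many arguments name existing regular files
    _in=""        # ends up holding the second-to-last argument
    _out=""       # ends up holding the last argument
    for _arg in "$@"; do
        if [ -f "$_arg" ]; then
            _existing=$((_existing + 1))
        fi
        _in=$_out
        _out=$_arg
    done

    if [ ! -f "$_in" ]; then
        echo "safe_sdsorter: input '$_in' is not an existing file" >&2
        return 1
    fi
    if [ "$_existing" -ne 1 ]; then
        echo "safe_sdsorter: found $_existing existing files among the arguments, expected 1" >&2
        return 1
    fi

    sdsorter "$@"
}
```

The check is a heuristic: an option value that happens to match a file name in the current directory will also cause a refusal, which errs on the safe side.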
Example:
sdsorter -print -c -sort minimizedAffinity -reduceconfs 1 -max minimizedAffinity,-2 -max minimizedRMSD,2 docked.sdf sorted.sdf > out.txt
OVERVIEW: Sort and filter SDF files based on SD data.
USAGE: sdsorter [options] <input file> <output file>
OPTIONS:
-add=<tag,value> - add SD tag with specified value
-addToTitle=<sdkey> - add value of sd field to title
-average=<comma separated sdtags> - compute average of sdtag values
-chunksize=<uint> - Chunk-size for writing molecules. Larger uses more memory but is faster.
-clearAll - clear all sd info
-concatToTitle=<val> - concatenate passed value to title
-divide=<a,b> - compute and store aDIVIDEPLUSEONEb
-efficiency=<tag> - Compute efficiency by dividing tag by molecular weight
-extract=<title> - extract molecules with given title
-extract-match=<regex> - extract mols whose title matches the provided regex
-extractRange=<start,end> - extract molecule within range [start,end) (applied at end)
-extractRank=<rank> - extract molecule with given rank (applied at end)
-help - Display available options (--help-hidden for more)
-keep-tag=<tag> - Remove all sddata tags except for these.
-max=<sdkey,sdval> - filter out all results with sdkey value more than sdval
-min=<sdkey,sdval> - filter out all results with sdkey value less than sdval
-nbest=<n> - keep only the n best results of current sort
-omit-header - omit header in columnized output
-print - print results to screen
-printCnt - Print out number of resulting conformations
-randomize - randomize order of results
-reduceCluster - Reduce size of clusters
  =first - Choose conformation listed first in file.
  =center - Choose conformation with minimum average cluster RMSD.
  =both - Choose both first and center conformations.
  =none - No cluster reduction (default).
-reduceconfs=<n> - keep only the first n conformations of each stereoisomer
-remove=<title> - remove molecules with given title
-remove-match=<regex> - remove mols whose title matches the regex
-remove-tag=<tag> - remove sd data with specified tag(s)
-rename=<old,new> - rename SD tag from old to new
-replaceTitle=<sdkey> - replace title with value of sd field
-reversesort=<sdkey> - sort from highest to lowest value of sdkey
-sort=<sdkey> - sort from lowest to highest value of sdkey
-subtract=<a,b> - compute and store aMINUSb
-unique=<sdkey> - keep single conformer for each unique value of sdkey
-v - Verbose output