[Rdkit-discuss] Using SQLAlchemy with the RDKit database cartridge


Hi all,

I've started working on an extension of the SQLAlchemy database
toolkit that aims to support direct access from Python to the
functions and data types exposed by the database chemical cartridge.
In brief, this means that instead of interacting with the RDBMS through
raw SQL queries, it may become possible to execute the entire workflow
(data preprocessing and cleanup, insertion, selection and further
processing) without leaving the Python interpreter, while
delegating the construction of the required SQL expressions to a
higher-level API. To give a simple example, instead of using

select count(*) from molecules where structure @> 'O=C1OC2=CC=CC=C2C=C1';

one might type something like the following:

>>> constraint = Molecule.structure.contains('O=C1OC2=CC=CC=C2C=C1')
>>> print(session.query(Molecule).filter(constraint).count())

(ok, in this specific case the python expression is a bit more
verbose, but it's a very simple SQL query :-)
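
To show how a Python expression can be turned into the SQL above, here
is a minimal sketch that uses only stock SQLAlchemy (1.4 or later), not
the extension itself. The Molecule class, its table layout and the use
of a plain String column in place of the cartridge's molecule type are
assumptions made for illustration; the generic op() method stands in
for the extension's contains().

```python
from sqlalchemy import Column, Integer, String, func, select
from sqlalchemy.dialects import postgresql
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Molecule(Base):
    # Hypothetical mapping; the real table layout may differ.
    __tablename__ = "molecules"
    id = Column(Integer, primary_key=True)
    # Assumption: String stands in for the cartridge's molecule type.
    structure = Column(String)


# op() builds a binary expression around an arbitrary SQL operator,
# here the cartridge's substructure operator '@>'.
constraint = Molecule.structure.op("@>")("O=C1OC2=CC=CC=C2C=C1")

# Equivalent of: select count(*) from molecules where structure @> '...'
stmt = select(func.count()).select_from(Molecule).where(constraint)

# Render the statement as text for inspection; no database is needed.
sql = str(
    stmt.compile(
        dialect=postgresql.dialect(),
        compile_kwargs={"literal_binds": True},
    )
)
print(sql)
```

Compiling with literal_binds is only a way to inspect the generated
statement; in normal use the statement would be executed through a
session or connection with bound parameters.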

The project is still in an initial phase and the code is far from
mature. Development is currently focused on the
RDKit PostgreSQL extension. Structure searches and molecular
descriptors should be fully supported, and bit fingerprints and
the associated similarity operators are also available (but modifying the
default similarity threshold values is not yet possible). The code is
currently hosted on GitHub:

https://github.com/rvianello/razi

and some draft documentation (at the moment intended more to
illustrate the idea than to provide a detailed reference) is also
available:

http://razi.readthedocs.org

If you use the RDKit chemical cartridge or SQLAlchemy (or both), I
hope you will find the idea interesting and I'd love to hear from you.
Comments, ideas and suggestions would be very welcome.

Cheers,
Riccardo
