[Rdkit-discuss] Issues with rdkit and postgresql cartridge conda installation
From: Danes, L. <lar...@un...> - 2016-10-14 16:50:00
Hello all,

First and foremost, please excuse any inaccuracies, as I am new to the world of cheminformatics. I'll start with some background on my issue.

I've got a MySQL database with chemical information such as CASRNs, Annotation Class, and SMILES strings. I have a web app currently in development that takes a list of CASRNs as input and performs an enrichment analysis on those chemicals. The goal is now to take a SMILES string as input and query the rdkit postgresql database (which will consist of a table whose rows hold CASRNs and SMILES strings) to find any chemicals whose similarity to the input SMILES string is above a set threshold. These chemicals' CASRNs will then become the input for enrichment, and the rest of the app will be agnostic toward how the CASRNs were obtained.

I suppose I should say this: if the description above isn't something rdkit is suited for and I'm totally off base, please let me know! As I stated earlier, I'm new to this realm.

Anyway, the issues I'm currently facing involve the installation of rdkit and the postgresql cartridge. I've tried this on both Windows 7 and Windows 10 (my work desktop runs 7 and my laptop runs 10).

My first question: when running the command

    conda install -c https://conda.binstar.org/rdkit rdkit-postgresql

I'm faced with a "PackageNotFoundError". Conda then suggests maybe I meant "rdkit-postgresql: postgresql". So I run

    conda install -c https://conda.binstar.org/rdkit postgresql

which seems to install just fine. My question is simply, is this OK? Do I specifically need the "rdkit-postgresql" package? I searched anaconda.org for the rdkit-postgresql package and found it. It said it could be installed with

    conda install -c rdkit rdkit-postgresql=2016.03.4

but this resulted in a similar error. I was wondering whether it might be a platform difference, because anaconda.org shows the package as "linux-64".
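The similarity lookup described above (SMILES string in, CASRNs out) could be sketched roughly as follows. This is only an illustration under stated assumptions: the table name `chemicals` and its columns `casrn` and `smiles` are hypothetical, the `%s` placeholders assume a psycopg2-style driver, and the helper `build_similarity_query` is not part of rdkit. The SQL functions `mol_from_smiles`, `morganbv_fp`, and `tanimoto_sml` are the cartridge's own, so the query only runs once `create extension rdkit` has succeeded.

```python
def build_similarity_query(smiles, threshold=0.5):
    """Return (sql, params) for a Tanimoto similarity lookup.

    Assumes a hypothetical table chemicals(casrn text, smiles text) in a
    database where the rdkit extension is installed.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")

    # Fingerprints are computed per row here for simplicity; rows whose
    # SMILES cannot be parsed yield NULL and drop out of the WHERE filter.
    sql = """
        SELECT casrn, similarity
        FROM (
            SELECT casrn,
                   tanimoto_sml(
                       morganbv_fp(mol_from_smiles(smiles::cstring)),
                       morganbv_fp(mol_from_smiles(%s::cstring))
                   ) AS similarity
            FROM chemicals
        ) AS scored
        WHERE similarity >= %s
        ORDER BY similarity DESC
    """
    return sql, (smiles, threshold)
```

With a database connection in hand, the pair would be passed to the driver as `cursor.execute(sql, params)`, and the returned `casrn` values could feed the existing enrichment step unchanged. For a large table, storing precomputed fingerprints in an indexed column would likely be faster than computing them in every query.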

I'd also like to mention a few other issues that I was (seemingly) able to work around, as they may provide context for my next question.

The path to the rdkit bin folder for me is "C:\Users\Larson\Anaconda3\envs\my-rdkit-env\Library\bin", while the documentation specifies it should follow the "[conda folder]/envs/my-rdkit-env/bin" convention. Perhaps it's a change in the directory structure that hasn't been updated in the documentation?

Next, to actually start the postgresql server, the command I had to use was

    C:\Users\Larson\Anaconda3\envs\my-rdkit-env\Library\bin\pg_ctl -D path/to/db/data -l logfile start

as opposed to

    [conda folder]/envs/my-rdkit-env/bin/postgres -D /folder/where/data/should/be/stored

as outlined in the documentation.

I believe that brings us to my current issue/question. The documentation specifies that to create a database you should do the following:

    createdb my_rdkit_db
    psql my_rdkit_db
    # create extension rdkit;

The first two lines I can get through just fine, but the "create extension" command gives me a file not found error, with the file in question being "rdkit.control". I searched the rdkit github page, found one such file, and copied it to the appropriate location ("C:\Users\Larson\Anaconda3\envs\my-rdkit-env\Library\share\extension", I believe). After doing so I got a new error that I can't replicate at the moment, but it was another file not found error, and the file it was looking for was "rdkit--3.5.sql", I'm fairly sure. I could not find any such file on github, which is what prompted me to create this post.

Please let me know if I'm doing anything incorrectly, or if there is anything else that might help. I will also do my best to provide clarification if necessary.

Thanks very much for your time,
Larson Danes
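The two "file not found" errors above fit how PostgreSQL loads an extension: `create extension rdkit` first reads `rdkit.control` from the `extension` folder under the server's share directory, then looks for an install script named `rdkit--<default_version>.sql` in the same folder. A minimal sketch of a check for both files follows. The helper name `missing_extension_files` is hypothetical, and the share directory must be supplied by the caller (`pg_config --sharedir` reports it for a given installation).

```python
import re
from pathlib import Path


def missing_extension_files(share_dir, name="rdkit"):
    """Return the extension files PostgreSQL would fail to find.

    share_dir is the PostgreSQL share directory; the control file and
    install script are expected in its "extension" subfolder.
    """
    ext_dir = Path(share_dir) / "extension"
    control = ext_dir / f"{name}.control"
    if not control.is_file():
        return [str(control)]

    # The control file names the version to install, e.g.
    #   default_version = '3.5'
    match = re.search(
        r"^\s*default_version\s*=\s*'([^']+)'",
        control.read_text(),
        re.MULTILINE,
    )
    if match is None:
        # No default version declared, so no script name can be derived.
        return []

    script = ext_dir / f"{name}--{match.group(1)}.sql"
    return [] if script.is_file() else [str(script)]
```

Called on the environment's share folder, an empty list means both files are present; otherwise the list names the first file that is missing. Even with both in place, the extension also depends on the compiled cartridge library being installed, so copying the control file alone is unlikely to be enough.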