From: Noel O'B. <bao...@gm...> - 2008-05-22 15:50:06
---------- Forwarded message ----------
From: Nick England <nic...@gm...>
Date: 2008/5/22
Subject: Re: [OpenBabel-scripting] Python bindings in 2.1.1 missing "fs" format
To: Noel O'Boyle <bao...@gm...>

Noel,

I am trying to perform a substructure search on a large number of
molecules. I can do the SMARTS matching from python, but would need the
fast search ideally.

Would it be possible for me to write a C++ class to call the
OBConversion with input and output streams, and call that from a
Python script instead?

Calling the babel executable directly from Python is an option, but I
believe that has issues with portability?

- Nick

2008/5/22 Noel O'Boyle <bao...@gm...>:
> There's no way to do streams in Python. That's why we have this whole
> ReadFile, Read, business. Can you describe what the overall problem is
> that you're working on, and maybe some of the wise heads on this list
> can suggest some alternative solutions...
>
> Noel
>
> 2008/5/22 Nick England <nic...@gm...>:
>> Chris,
>>
>> Would it be possible to do something like
>>
>> infile=open("input.smi","r")
>> outfile=open("index.fs","w")
>> obconversion = openbabel.OBConversion(infile,outfile) (although these
>> need to be wrapped into streams somehow, SWIG has a function for
>> this?)
>> obconversion.Convert()
>>
>> I am trying to be able to make and search an index from calls from a
>> scripting environment.
>>
>> Thanks,
>>
>> Nick
>>
>>
>> 2008/5/21 Chris Morley <c.m...@ds...>:
>>> When fs is used as an output format it makes an index of the input file.
>>> This is a list of the fingerprint of each molecule in the input file
>>> and the number of bytes it is from the beginning of that file. It wasn't
>>> written to be used like you are doing it, so the failure is not
>>> surprising. There is an API of C++ classes for fastsearch but even with
>>> scripting I think it would be easier to use the conversion framework to
>>> make an index (if that is what you are trying to do).
>>> On the command line it would be
>>>   babel input.smi index.fs
>>> which would index all the molecules in input.smi
>>> To use the index
>>>   babel index.fs -osmi -s"CO"
>>> which would display all the molecules that match the SMARTS.
>>>
>>> The SMARTS actually has to be a valid SMILES (a molecular fragment)
>>> because in the searching its fingerprint is calculated (to be compared
>>> with all the fingerprints in the input file).
>>>
>>> If you are doing a simple SMARTS filter
>>>   babel input.smi -osmi -s"[#6]O"
>>> you can use a full SMARTS.
>>>
>>> Coming back to scripting, the fastsearch only has any point if you do it
>>> all in one go, i.e. scan the index using compiled code. Looping in a
>>> scripting language is too slow. This probably applies also when
>>> indexing a file. Both would be best done by calling the conversion
>>> framework. I'm afraid I haven't thought how to recover the result
>>> molecules one by one.
>>>
>>> It may be that your best approach is not to use fastsearch and use an
>>> ordinary SMARTS search instead. From the command line this is probably
>>> the best way anyway for fewer than 10,000 molecules.
>>>
>>> Chris
>>>
>>> Noel O'Boyle wrote:
>>>> Sounds like a bug. Could you file one?
>>>>
>>>> I didn't realise that 'fs' was a proper format. I always thought it
>>>> was just some sort of index used to search a large SDF file, or
>>>> something.
>>>>
>>>> Noel
>>>>
>>>> 2008/5/21 Nick England <nic...@gm...>:
>>>>> Noel,
>>>>>
>>>>> The "obconversion.SetInFormat("fs")" returns true, and the input file
>>>>> is valid in that it was made on this computer and works fine on the
>>>>> command line babel (both under Linux)
>>>>>
>>>>> However, trying to do:
>>>>>
>>>>> import pybel
>>>>> allmols=[mol for mol in pybel.readfile("smi","input.smi")]
>>>>> smarts=pybel.Smarts("[#6]O")
>>>>> for mol in allmols:
>>>>>     if(smarts.findall(mol)):
>>>>>         print mol.write("fs")
>>>>>
>>>>> results in the message
>>>>>
>>>>> "Not a valid output format"
>>>>>
>>>>>
>>>>> using the interpreter:
>>>>> >>> obconversion.SetInFormat("fs")
>>>>> True
>>>>> >>> obconversion.SetOutFormat("fs")
>>>>> True
>>>>>
>>>>> This problem cannot be due to a problem with the input files, since it
>>>>> won't even output a simple CCCO smiles string to the fs format. The
>>>>> obconversion seems to understand the format though.
>>>>>
>>>>>
>>>>> 2008/5/21 Noel O'Boyle <bao...@gm...>:
>>>>>> 2008/5/21 Nick England <nic...@gm...>:
>>>>>>> Hello all,
>>>>>>>
>>>>>>> I am experiencing some odd behaviour with the python bindings. A simple
>>>>>>> program to read in an index file:
>>>>>>>
>>>>>>> #! /usr/bin/env python
>>>>>>> import openbabel
>>>>>>> import pybel
>>>>>>> allmols=[]
>>>>>>> obconversion = openbabel.OBConversion()
>>>>>>> obconversion.SetInFormat("fs")
>>>>>>> obmol = openbabel.OBMol()
>>>>>>>
>>>>>>> notatend = obconversion.ReadFile(obmol,"index.fs")
>>>>>>>
>>>>>>> while notatend:
>>>>>>>
>>>>>>>     allmols.append(obmol)
>>>>>>>     obmol=openbabel.OBMol()
>>>>>>>     notatend=obconversion.Read(obmol)
>>>>>>>     pybel.Molecule(obmol).write("smi","results.smi",True)
>>>>>>>
>>>>>>> is failing with the message:
>>>>>>> "Not a valid input format"
>>>>>> What line is failing? Try "success = obconversion.SetInFormat("fs")"
>>>>>> and check its value.
>>>>>>
>>>>>> If the value is True, then the problem is not setting the format, but
>>>>>> rather reading the file. Is the input file valid? Is it from the same
>>>>>> operating system? If so, it sounds like a bug. Can you file one and
>>>>>> provide the input file (use a short example, if possible).
>>>>>>
>>>>>>> However typing babel -Hfs and obconversion.GetSupportedInputFormat()
>>>>>>> both list the fastsearch format as being present. The command babel
>>>>>>> -ifs index.fs -osmi works fine, and the python program above works if
>>>>>>> the format isn't fs.
>>>>>>> I would also like to add the line
>>>>>>> obconversion.AddOption('s',openbabel.OBConversion::GENOPTIONS,searchstring)
>>>>>>> (but this is throwing a syntax error on the OBConversion::GENOPTIONS
>>>>>>> part, this is my first attempt at using python however so it's not
>>>>>>> unexpected!)
>>>>>> Try using the interactive Python prompt:
>>>>>> >>> dir(openbabel.OBConversion())
>>>>>> ['ALL', 'AddChemObject', 'AddOption', 'CloseOutFile', 'Convert', 'CopyOptions',
>>>>>> ...
>>>>>> GENOPTIONS', 'GetAuxConv', 'GetChemObject', 'GetDefaultFormat', 'GetInFilename',
>>>>>> 'GetInFormat', 'GetInLen', 'GetInPos', 'GetInStream', 'GetOptionParams', 'GetOp
>>>>>> ...
>>>>>> 'thisown']
>>>>>> >>> openbabel.OBConversion.GENOPTIONS
>>>>>> 2
>>>>>>
>>>>>>> Any help would be appreciated!
>>>>>>>
>>>>>>> -------------------------------------------------------------------------
>>>>>>> This SF.net email is sponsored by: Microsoft
>>>>>>> Defy all challenges. Microsoft(R) Visual Studio 2008.
>>>>>>> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
>>>>>>> _______________________________________________
>>>>>>> OpenBabel-scripting mailing list
>>>>>>> Ope...@li...
>>>>>>> https://lists.sourceforge.net/lists/listinfo/openbabel-scripting
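
For reference, a minimal sketch of the "call the babel executable from Python" option raised in the thread, built only from the two command lines Chris gives (babel input.smi index.fs to make the index, babel index.fs -osmi -s"CO" to search it). The helper names build_index_cmd, fastsearch_cmd and run_babel are hypothetical and not part of Open Babel. The sketch assumes Python 3.7 or later, a babel executable on the PATH, and that babel prints the matching molecules to standard output as described above. The executable name is a parameter because installations may differ.

```python
import shutil
import subprocess


def build_index_cmd(molfile, indexfile, babel="babel"):
    """argv equivalent of the thread's command: babel input.smi index.fs"""
    return [babel, molfile, indexfile]


def fastsearch_cmd(indexfile, query, babel="babel"):
    """argv equivalent of the thread's command: babel index.fs -osmi -s"CO"

    A shell strips the quotes, so babel receives the single argument -sCO.
    """
    return [babel, indexfile, "-osmi", "-s" + query]


def run_babel(argv):
    """Run the command and return the non-empty lines it prints to stdout.

    Raises FileNotFoundError if the executable is not on the PATH and
    subprocess.CalledProcessError if it exits with a non-zero status.
    """
    if shutil.which(argv[0]) is None:
        raise FileNotFoundError("executable not found on PATH: " + argv[0])
    done = subprocess.run(argv, capture_output=True, text=True, check=True)
    return [line for line in done.stdout.splitlines() if line.strip()]
```

Typical use would be run_babel(build_index_cmd("input.smi", "index.fs")) once, then hits = run_babel(fastsearch_cmd("index.fs", "CO")) for each query. Passing the arguments as a list rather than a single shell string avoids shell quoting differences between platforms, which covers part of the portability concern; the whole index scan still runs in compiled code, as Chris recommends.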