From: Noel O'B. <bao...@gm...> - 2006-12-18 09:11:37
---------- Forwarded message ----------
From: Andreas Karwath <ka...@in...>
Date: 18-Dec-2006 08:25
Subject: Re: [OpenBabel-scripting] Trying to get access to a gzipped
SDF file in Python
To: Noel O'Boyle <bao...@gm...>
Cheers,
I also traced the bug back to mdlformat.cpp, where false is returned
because ifs.getline fails in this check:
    if(!ifs.getline(buffer,BUFF_SIZE)) return(false);
But thanks for the workaround....
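
Since I only need to pull out specific molecules by name, here is a rough sketch of a variation on the workaround that avoids reading the whole file into memory. It uses only the standard gzip module and assumes a current Python 3, that every record ends with a line containing just "$$$$", and that the molecule name is the first line of each record. The function names (iter_sdf_records, record_name, select_by_name) are made up for illustration and are not part of openbabel or pybel.

```python
import gzip


def iter_sdf_records(path):
    """Yield one SDF record at a time, as a string, from a gzipped file."""
    lines = []
    # "rt" decompresses on the fly and decodes to text; nothing is unpacked to disk.
    with gzip.open(path, "rt", encoding="latin-1") as handle:
        for line in handle:
            lines.append(line)
            # Assumption: every record ends with a line holding only "$$$$".
            if line.strip() == "$$$$":
                yield "".join(lines)
                lines = []
    # Keep a trailing record that lacks the terminator, if it has any content.
    if any(chunk.strip() for chunk in lines):
        yield "".join(lines)


def record_name(record):
    """Return the molecule title, assumed to be the first line of the record."""
    return record.split("\n", 1)[0].strip()


def select_by_name(path, wanted):
    """Return {name: record} for the records whose title is in `wanted`."""
    wanted = set(wanted)
    hits = {}
    for record in iter_sdf_records(path):
        name = record_name(record)
        if name in wanted:
            hits[name] = record
    return hits
```

Each yielded record is a string holding a single molecule, so it should be suitable for something like pybel.readstring("sdf", record), although I have not tested that combination here.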
ak
On 17.12.2006, at 22:20, Noel O'Boyle wrote:
> I have reproduced the bug. Unfortunately, I don't anticipate an easy
> solution. Streams and Python don't mix very well in the first place.
>
> But here's a workaround in the meantime:
>
> import gzip # a standard Python module
> text = gzip.open("test.sdf.gz").read()
>
> You now need to use openbabel to read this text (I'm afraid that
> Pybel's readstring currently only reads strings containing a single
> molecule).
>
> import openbabel as ob
> obmol = ob.OBMol()
> obconversion = ob.OBConversion()
> formatok = obconversion.SetInFormat("sdf")
> notatend = obconversion.ReadString(obmol, text)
> while notatend:
>     # Do something with obmol
>     obmol = ob.OBMol()
>     notatend = obconversion.Read(obmol)
>
> Hope this helps...
>
> Noel
>
> On 16/12/06, Andreas Karwath <ka...@in...>
> wrote:
>> I guess you mean:
>>
>> babel -isdf <zipped SDF file> -osmi
>>
>> that works fine...
>>
>> I guess it has something to do with the internal stream (or however
>> it is called).
>>
>> Regards,
>>
>> ak
>>
>>
>> On 15.12.2006, at 18:06, Noel O'Boyle wrote:
>>
>> > Thanks Andreas for letting us know about this problem.
>> >
>> > First of all, can you let us know whether this problem occurs if you
>> > use the babel executable itself to convert the file? (If so, the
>> > problem is nothing to do with the Python bindings)
>> >
>> > Noel
>> >
>> > On 15/12/06, Andreas Karwath <ka...@in...>
>> > wrote:
>> >> Hi all,
>> >>
>> >> I hope I don't have to send this to the developer list...
>> >>
>> >> I have installed the latest version of openbabel (2.1.0b3), including
>> >> the python bindings (openbabel and pybel), on Linux Suse 9.2.
>> >> I wanted to parse a gzipped SDF file (zipped file size 14 M) to extract
>> >> specific molecules (by name). For this I first tried the normal
>> >> routine, i.e.:
>> >>
>> >> sdfFileName = <someGzippedSDFFile>
>> >> obconversion = OBConversion()
>> >> obconversion.SetInFormat("sdf")
>> >> obconversion.SetOutFormat("smi")
>> >> obmol = OBMol()
>> >>
>> >> notatend = obconversion.ReadFile(obmol,sdfFileName)
>> >> export = obconversion.WriteFile(obmol,'myTest.smi')
>> >> while notatend:
>> >>     obconversion.Write(obmol)
>> >>     obmol = OBMol()
>> >>     notatend = obconversion.Read(obmol)
>> >>
>> >> On a normal (i.e. unzipped) SDF file it works fine, but on a gzipped
>> >> one it ends in a segmentation fault. Only the first molecule can be
>> >> accessed.
>> >>
>> >> The same is true when using the new pybel lib: segmentation fault!
>> >>
>> >> I assume that OpenBabel keeps a pointer to the last read molecule in
>> >> the SDF file, which would not work when accessing the zipped one...
>> >>
>> >> I don't want to unpack the file, as I have a few hundred of those
>> >> (disk space!).
>> >>
>> >> Has anyone had the same problem and found an elegant workaround?
>> >> I guess the problem should occur for other scripting languages as
>> >> well...
>> >>
>> >> Regards,
>> >>
>> >> A. Karwath
>> >> -----------------
>> >>
>> >> Dr. Andreas Karwath
>> >> Machine Learning Lab
>> >> Institute for Computer Science
>> >> Albert-Ludwigs-Universität Freiburg
>> >> Georges-Köhler-Allee 079
>> >> D-79110 Freiburg
>> >> Germany
>> >>
>> >>
>> >>
>> >> _______________________________________________
>> >> OpenBabel-scripting mailing list
>> >> Ope...@li...
>> >> https://lists.sourceforge.net/lists/listinfo/openbabel-scripting
>> >>
>> >>
>> >>
>>
>> Dr. Andreas Karwath
>> Machine Learning Lab
>> Institute for Computer Science
>> Albert-Ludwigs-Universität Freiburg
>> Georges-Köhler-Allee 079
>> D-79110 Freiburg
>> Germany
>> +49 761 203 8029 (office)
>> +49 761 203 8007 (fax)
>> http://www.informatik.uni-freiburg.de/~karwath/ (web)
>> ka...@in... (email)
>> theKnoedel (skype)
>>
>>
>>