From: Till S. <til...@tu...> - 2013-11-29 11:28:17
Hi,

On Friday, 29 November 2013, 12:26:16, Till Schäfer wrote:
> Hi Markus,
>
> On Thursday, 28 November 2013, 11:37:35, Markus Heller wrote:
> > Hi all,
> >
> > when importing data (SDF file containing docking results) into a new or
> > existing database, SH will only import structures that have not been
> > imported yet. In the process, the number of docking results gets cut down
> > considerably, in my case here from over 400 to just over 200, which has a
> > big impact on my ability to analyze the scaffolds. The manual doesn't seem
> > to address this; at least I haven't been able to find it if it does.
> >
> > How can I get SH to import *all* structures in a given SDF file, regardless
> > of whether they've been imported before?
> If you create a new dataset, all structures from the SDF should be imported. Therefore, the manual cannot mention it :-). That means this is either a bug in the software or some misunderstanding about how the import works.
> I tried to reproduce the behavior here with version 2.2.0 and was not able to do so with either the mysql or the hsqldb backend.
>
> That means we have to dig a bit deeper into what is going wrong:
>
> If you create a new database schema (you can just use a fresh temporary hsqldb) and import the dataset: are all structures imported, or at least most of them? I ask because there are several circumstances in which some entries of an SDF are not imported, or in which the imported dataset and the SDF differ in size:
>
> 1. There are errors parsing the file.
> This should be visible in the progress window. Can you provide a quick summary of the top-level entries listed there?
>
> 2. There are several entries with the same structure.
> We consider two structures identical iff they have the same SMILES string. That also means that we do NOT distinguish between different conformations, etc.
If several structures with the same SMILES occur during the import, their structure properties are merged according to the merge strategy you select. I forgot to mention that this is indicated by a log entry during import, which tells you that a structure with the same SMILES was already imported.
>
> 3. There is another software bug because you are using a different version of some software, such as the database. I think we should first look into points 1 and 2, but maybe you can give me some information about the database you are using (name, version).
>
> If none of that helps and you are able to send me your SDF (no private data), I can also try it out here and see what is happening. If the error is located in the file (or in the way we interact with that file), we should be able to reproduce the behavior here.
>

Greetings,
Till

--
Dipl.-Inf. Till Schäfer
Technische Universität Dortmund
Chair 11 - Algorithm Engineering
Otto-Hahn-Str. 14 / Raum 237
44227 Dortmund, Germany
e-mail: til...@cs...
phone: +49(231)755-7706
fax: +49(231)755-7740
web: http://ls11-www.cs.uni-dortmund.de/staff/schaefer
pgp: https://keyserver2.pgp.com/vkd/SubmitSearch.event?&&SearchCriteria=0xD84DED79
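
One way to check point 2 outside of Scaffold Hunter might look like the Python sketch below. It is not Scaffold Hunter's own code: it only splits an SDF on its `$$$$` record delimiter and counts how many records share the same value in one data item. It assumes the file already carries a SMILES string in a data item tagged `<SMILES>`; that tag name, the function names, and the stub records in the sample are all hypothetical. A proper check would compute canonical SMILES with a cheminformatics toolkit rather than trust a stored field.

```python
import re
from collections import Counter

# Header line of an SDF data item, e.g. "> <SMILES>" or ">  <SMILES> (1)".
TAG_LINE = re.compile(r"^>.*<([^<>]+)>")


def split_records(sdf_text):
    """Split SDF text into records; each record ends with a '$$$$' line."""
    records, current = [], []
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            records.append(current)
            current = []
        else:
            current.append(line)
    # Keep a trailing record that is missing its final delimiter.
    if any(line.strip() for line in current):
        records.append(current)
    return records


def data_field(record, tag):
    """Return the first value line of data item `tag`, or None if absent."""
    for i, line in enumerate(record):
        match = TAG_LINE.match(line)
        if match and match.group(1) == tag and i + 1 < len(record):
            return record[i + 1].strip()
    return None


def count_by_smiles(sdf_text, tag="SMILES"):
    """Count records per value of `tag`; records without it count under None."""
    return Counter(data_field(rec, tag) for rec in split_records(sdf_text))


# Stub records for illustration only: the molblocks are placeholders and the
# <SMILES> tag is an assumed field name, not something every SDF contains.
sample = "\n".join([
    "pose1", "", "", "M  END", "> <SMILES>", "CCO", "", "$$$$",
    "pose2", "", "", "M  END", "> <SMILES>", "CCO", "", "$$$$",
    "pose3", "", "", "M  END", "> <SMILES>", "c1ccccc1", "", "$$$$",
])
counts = count_by_smiles(sample)
print(len(split_records(sample)), "records,", len(counts), "distinct SMILES")
```

If a docking SDF contains several poses per ligand, the number of distinct SMILES will be smaller than the number of records, which would be consistent with an import that merges identical structures.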