From: Joerg W. <we...@in...> - 2004-06-04 15:15:46
|
Hi,

this is interesting :-)

1. Profiling is recommended. I would be happy if you have time for this, once you have found a good, free Java profiling tool.

2. Only the CVS history can help. Around that time I added all the descriptors, so the bottleneck could be the parsing process, which uses the regular expression patterns in
   joelib/src/joelib/data/plain/knownResults.txt
   Replacing this file with one that contains no regular expressions may be a solution.

As I have already tested myself, the descriptors are the first bottleneck. Possibly an SDF reader class that does not convert them can help: in the previous version they were stored as unparsed entries, but I changed this to avoid problems with CML2 export.

Try your own MyMDLSD.java (with descriptor parsing removed) and replace the current MDLSD reader in joelib.properties.

Kind regards, Joerg

On Thu, 3 Jun 2004, Oliva, Ambrogio wrote:
> Hi.
>
> I've found big differences in the performance of SimpleReader from older and newer versions of the joelib library. Running a simple application that counts the number of molecules in a SDF file with different versions of joelib I've got the following results.
>
> - with version 20040116:
>
> C:\eclipse\workspace\MolCounter>java -cp .;log4j.jar;itext-0.94.jar;joelib-20040116.jar MolCounter sample.sdf
> 16:04:49 [INFO ] joelib.data.JOEElementTable - Using element table: joelib/data/plain/element.txt
> 16:04:49 [INFO ] joelib.io.IOTypeHolder - 13 input/output types loaded.
> 16:04:50 [INFO ] joelib.io.SimpleReader - ... 500 molecules successful loaded in 922 ms.
> Done: 500 found
>
> - with version 20040323:
> C:\eclipse\workspace\MolCounter>java -cp .;log4j.jar;itext-0.94.jar;joelib-20040323.jar MolCounter sample.sdf
> 16:04:37 [INFO ] joelib.data.JOEElementTable - Using element table: joelib/data/plain/element.txt
> 16:04:37 [INFO ] joelib.io.IOTypeHolder - 22 input/output types loaded.
> 16:04:38 [INFO ] joelib.desc.DescriptorHelper - 78 descriptor informations loaded.
> 16:04:38 [INFO ] joelib.data.JOEAtomTyper - Using atom type model: joelib/data/plain/atomtype.txt
> 16:04:38 [INFO ] joelib.data.JOEPhModel - Using pH value correction model: joelib/data/plain/phmodel.txt
> 16:04:42 [INFO ] joelib.io.SimpleReader - ... 500 molecules successful loaded in 4844 ms.
> Done: 500 found
>
> Could someone explain me the different behaviour of the two libraries, and how to speed up the process using the newer versions?
>
> The source code of MolCounter is below
>
> // Imports
> import org.apache.log4j.*;
> import joelib.io.*;
> import joelib.molecule.*;
> import java.io.*;
>
> public class MolCounter {
>
>     //Obtain a suitable logger.
>     private static Category logger = Category.getInstance("MolCounter");
>
>     public static void main(String[] args) {
>
>         SimpleReader reader = null; //input SDF file
>         IOType inputType = IOTypeHolder.instance().getIOType("SDF");
>         try {
>             reader = new SimpleReader(new FileInputStream(args[0]), inputType);
>         } catch (Exception ex) {
>             ex.printStackTrace();
>         }
>         JOEMol mol = new JOEMol(inputType, inputType);
>         long lCounter = 0;
>         try {
>             while (reader.readNext(mol)) {
>                 lCounter++;
>             }
>         }
>         catch (IOException ex) {
>             // occurs if file can not be found
>             ex.printStackTrace();
>         }
>         catch (MoleculeIOException ex) {
>             // occurs if molecule entry is invalid
>             ex.printStackTrace();
>         }
>         reader.close();
>
>         System.out.println("Done: "+ lCounter + " found");
>         System.exit(0);
>     }
>
> }
>
> Thanks in advance.
>
> Ambrogio

Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970
Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action. (E. Hemingway)
Never mistake action for meaningful action. (Hugo Kubinyi, 2004)
|
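The suggestion above (keep descriptor entries unparsed and convert them only when needed) can be sketched roughly as follows. This is a minimal illustration in plain Java, not JOELib code: the class `LazyFieldStore` and its methods are hypothetical names, and it assumes that the data fields of a record arrive as name/text pairs.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: store data fields as raw text, parse a field only on first access. */
final class LazyFieldStore {
    private final Map<String, String> raw = new HashMap<>();
    private final Map<String, Double> parsed = new HashMap<>();
    private int parseCount = 0;

    /** Stores the unparsed text; this is all a reader has to do per field. */
    void putRaw(String name, String text) {
        raw.put(name, text);
        parsed.remove(name); // drop any stale parsed value
    }

    /** Parses the field on first access and caches the result. */
    double getDouble(String name) {
        Double cached = parsed.get(name);
        if (cached != null) {
            return cached;
        }
        String text = raw.get(name);
        if (text == null) {
            throw new IllegalArgumentException("unknown field: " + name);
        }
        double value = Double.parseDouble(text.trim());
        parseCount++;
        parsed.put(name, value);
        return value;
    }

    /** Number of parse operations actually performed so far. */
    int parseCount() {
        return parseCount;
    }

    public static void main(String[] args) {
        LazyFieldStore store = new LazyFieldStore();
        store.putRaw("LogP", " 2.31 ");
        store.putRaw("Comment", "free text that is never parsed");
        System.out.println(store.getDouble("LogP"));
        System.out.println(store.getDouble("LogP")); // served from the cache
        System.out.println("parse operations: " + store.parseCount());
    }
}
```

An application that only counts molecules never calls `getDouble`, so it pays nothing for parsing; whether this matches the actual cost profile of a given JOELib version is something only a profiler run can confirm.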
From: Oliva, A. <Amb...@ct...> - 2004-06-03 14:59:23
|
Hi.

I've found big differences in the performance of SimpleReader from older and newer versions of the joelib library. Running a simple application that counts the number of molecules in a SDF file with different versions of joelib I've got the following results.

- with version 20040116:

C:\eclipse\workspace\MolCounter>java -cp .;log4j.jar;itext-0.94.jar;joelib-20040116.jar MolCounter sample.sdf
16:04:49 [INFO ] joelib.data.JOEElementTable - Using element table: joelib/data/plain/element.txt
16:04:49 [INFO ] joelib.io.IOTypeHolder - 13 input/output types loaded.
16:04:50 [INFO ] joelib.io.SimpleReader - ... 500 molecules successful loaded in 922 ms.
Done: 500 found

- with version 20040323:
C:\eclipse\workspace\MolCounter>java -cp .;log4j.jar;itext-0.94.jar;joelib-20040323.jar MolCounter sample.sdf
16:04:37 [INFO ] joelib.data.JOEElementTable - Using element table: joelib/data/plain/element.txt
16:04:37 [INFO ] joelib.io.IOTypeHolder - 22 input/output types loaded.
16:04:38 [INFO ] joelib.desc.DescriptorHelper - 78 descriptor informations loaded.
16:04:38 [INFO ] joelib.data.JOEAtomTyper - Using atom type model: joelib/data/plain/atomtype.txt
16:04:38 [INFO ] joelib.data.JOEPhModel - Using pH value correction model: joelib/data/plain/phmodel.txt
16:04:42 [INFO ] joelib.io.SimpleReader - ... 500 molecules successful loaded in 4844 ms.
Done: 500 found

Could someone explain me the different behaviour of the two libraries, and how to speed up the process using the newer versions?

The source code of MolCounter is below

    // Imports
    import org.apache.log4j.*;
    import joelib.io.*;
    import joelib.molecule.*;
    import java.io.*;

    public class MolCounter {

        //Obtain a suitable logger.
        private static Category logger = Category.getInstance("MolCounter");

        public static void main(String[] args) {

            SimpleReader reader = null; //input SDF file
            IOType inputType = IOTypeHolder.instance().getIOType("SDF");
            try {
                reader = new SimpleReader(new FileInputStream(args[0]), inputType);
            } catch (Exception ex) {
                ex.printStackTrace();
            }
            JOEMol mol = new JOEMol(inputType, inputType);
            long lCounter = 0;
            try {
                while (reader.readNext(mol)) {
                    lCounter++;
                }
            }
            catch (IOException ex) {
                // occurs if file can not be found
                ex.printStackTrace();
            }
            catch (MoleculeIOException ex) {
                // occurs if molecule entry is invalid
                ex.printStackTrace();
            }
            reader.close();

            System.out.println("Done: "+ lCounter + " found");
            System.exit(0);
        }

    }

Thanks in advance.

Ambrogio

THIS MESSAGE IS FOR THE SOLE USE OF THE INTENDED RECIPIENT AND MAY CONTAIN INFORMATION WHICH IS CONFIDENTIAL, PRIVILEGED, PROPRIETARY AND/OR COVERED BY THE PROVISIONS OF ITALIAN LEGISLATIVE DECREE N. 196 OF JUNE 30, 2003 (CODE FOR THE PROTECTION OF PERSONAL DATA). ANY UNAUTHORIZED REVIEW, USE, DISCLOSURE OR DISTRIBUTION OF THIS MESSAGE OR ITS CONTENTS IS PROHIBITED. IF YOU ARE NOT THE INTENDED RECIPIENT, PLEASE NOTIFY US BY TELEFAX OR BY E-MAIL, CONFIRMING THAT THE MESSAGE AND ALL COPIES HAVE BEEN DESTROYED. UPON YOUR REQUEST, WE SHALL REIMBURSE YOU ALL REASONABLE COST BORNE IN CONNECTION WITH THE ABOVE.
|
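For a timing comparison like the one in this message, it can help to know how long the file takes to read with no chemistry parsing at all. The sketch below counts SDF records by their `$$$$` terminator lines using only the Java standard library; the class name `SdfRecordCounter` is made up for illustration, and a final record without a terminator line is deliberately not counted.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Baseline sketch: counts SDF records without parsing atoms, bonds or data fields. */
final class SdfRecordCounter {

    /** Counts the lines that start with the SDF record terminator "$$$$". */
    static long countRecords(Reader source) {
        long count = 0;
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith("$$$$")) {
                    count++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Without an argument, a tiny in-memory sample with two records is used.
        Reader source = args.length > 0
                ? Files.newBufferedReader(Paths.get(args[0]))
                : new StringReader("mol1\n$$$$\nmol2\n$$$$\n");
        long start = System.nanoTime();
        long found = countRecords(source);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("Done: " + found + " found in " + elapsedMs + " ms");
    }
}
```

The difference between this baseline and the SimpleReader timings gives a rough upper bound on how much of the run time is spent on parsing rather than on file I/O.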
From: Joerg K. W. <we...@in...> - 2004-05-26 15:03:56
|
Hi,

> sounds good. Does this mean to create an .xml file having a target that launches my .class file
> provided with the information where it can find the library?
> Can I do this with something like:
> ...
> <classpath>
>   <fileset dir="${lib}">
>     <include name="**/*.jar" />
>   </fileset>
> </classpath>
> ...

Yes. I have no preferred tutorial. Have a look at the Apache Ant homepage and at the Ant files that are already available, and remove all the functionality you do not need.

> Don't know exactly what this means. What do you mean by "'local' classpath"?
> You don't speak of the environment variable $CLASSPATH, do you?

I mean building the classpath in a script and creating a backup variable, see e.g. joelib/build.sh

Kind regards, Joerg
|
From: Andreas M. <ma...@in...> - 2004-05-26 14:35:04
|
Hi Joerg,

On Wed, May 26, 2004 at 02:49:42PM +0200, Joerg K. Wegner wrote:
> 1. append all required *.jar files to the CLASSPATH environment (not
> recommended)

Since I don't know the setup of potential users, I could only do this on my own system, so I won't do it.

> 2. use ant and catch all *.jar files to build a 'local' classpath for
> your program

Sounds good. Does this mean creating an .xml file with a target that launches my .class file and tells it where to find the library? Can I do this with something like:

...
<classpath>
  <fileset dir="${lib}">
    <include name="**/*.jar" />
  </fileset>
</classpath>
...

Of course, I would additionally have to search the joelib tree recursively to catch all jars. Do you know a good resource on the web about ant?

> 3. use shell/batch script to build a 'local' classpath for your program
> (you can use one of JOELib's as base). But under windows this is not a
> good solution, because all dependencies are hard-coded.
> So i recommend ant or shell-scripts using cygwin.

I don't know exactly what this means. What do you mean by "'local' classpath"? You are not talking about the environment variable $CLASSPATH, are you?

Greets, Andreas
|
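To make the idea of a 'local' classpath concrete: the `**/*.jar` fileset pattern amounts to walking a directory tree and joining every jar found into one path string, which is then handed to a single program run instead of being written into $CLASSPATH. The following is a rough plain-Java equivalent; the class name `JarClasspath` and the default directory `lib` are assumptions made for this sketch.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Sketch: collects all *.jar files below a directory into one classpath string. */
final class JarClasspath {

    /** Walks libDir recursively and joins the jar paths with the platform path separator. */
    static String build(Path libDir) {
        try (Stream<Path> paths = Files.walk(libDir)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".jar"))
                    .map(p -> p.toAbsolutePath().toString())
                    .sorted()
                    .collect(Collectors.joining(File.pathSeparator));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path lib = Paths.get(args.length > 0 ? args[0] : "lib");
        if (!Files.isDirectory(lib)) {
            System.err.println("not a directory: " + lib);
            return;
        }
        System.out.println(build(lib));
    }
}
```

The printed string can be passed to `java -cp` together with the directory that holds your own classes, so nothing has to be changed in the global environment.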
From: Joerg K. W. <we...@in...> - 2004-05-26 12:47:18
|
Hi Andreas,

it depends on your setup:

1. Append all required *.jar files to the CLASSPATH environment variable (not recommended).
2. Use ant and collect all *.jar files to build a 'local' classpath for your program.
3. Use a shell/batch script to build a 'local' classpath for your program (you can use one of JOELib's as a base). Under Windows this is not a good solution, because all dependencies are hard-coded.

So I recommend ant, or shell scripts using cygwin.

Kind regards, Joerg

> Hi there,
>
> Having installed joelib as a binary in /some/directory/joelib , how can
> I use it if my own JAVA-Program resides in /some/other/directory/myprog
> , i.e. is it possible to use the import statement for libraries which don't
> reside in a subdir of my developer dir?
> I know this is merely a JAVA-Question and I should pose it to SUN, but
> perhaps you have a quick answer at hand...
> Greets,
>
> Andreas
|
From: Andreas M. <ma...@in...> - 2004-05-26 12:42:35
|
Hi there,

Having installed joelib as a binary in /some/directory/joelib, how can I use it if my own Java program resides in /some/other/directory/myprog? That is, is it possible to use the import statement for libraries which don't reside in a subdirectory of my development directory?

I know this is really a Java question and I should pose it to Sun, but perhaps you have a quick answer at hand...

Greets, Andreas
|
From: Joerg K. W. <we...@in...> - 2004-05-24 13:08:18
|
Hi Andreas,

there are two problems:

1. log4j.properties is missing from the classpath or the starting directory.
2. You do not have ANY 2D coordinates. Creating them is not an easy task. You can e.g. use CDK under JOELib with:
   joelib.util.cdk.CDKTools
   joelib.util.cdk.TestLayout
   but you will need all CDK libraries under joelib/lib.

You may also be able to use Ghemical for 3D coordinates (IBM JRE on Linux only, or Windows/Cygwin from the command line), but that might be tricky because it uses JNI:
joelib.util.ghemical.TestInterface

Or implement your own 2D layout (with constraints) as proposed by Rarey et al.:
http://dx.doi.org/10.1021/ci049958u

Kind regards, Joerg

>> please try
>> joelib.gui.render.MoleculeViewer2D
>>
>> Have a look at the main method and for details:
>> display(JOEMol molecule, String _title,
>>         JOESmartsPattern smarts, String eTransfer, String retroSynth,
>>         String conjRing, String labels)
>
> Hi Joerg,
>
> I tried like this:
>
> import joelib.molecule.*;
> import joelib.smiles.*;
> import joelib.gui.render.*;
>
> public class SmilesSmartsViewer {
>
>     public static void main(String[] args) {
>
>         // obtain smiles string and create mol object
>         String smiles="C1CCCCC1"; // the smiles string
>         JOEMol mol=new JOEMol(); // create new mol-object
>
>         // convert mol object to represent the smiles information
>         if (!JOESmilesParser.smiToMol(mol, smiles, smiles))
>         {
>             System.err.println("SMILES entry \"" + smiles + "\" could not be loaded.");
>         }
>
>         // display mol object
>         MoleculeViewer2D.display(mol);
>     }
> }
>
> Output is as follows here:
>
> log4j:WARN No appenders could be found for logger (joelib.io.IOType).
> log4j:WARN Please initialize the log4j system properly.
> t1 t2 t3 t4 t5 t6
> t1 t2 t3 t4 t5 t6
> t1 t2 t3 t4 t5 t6
> t1 t2 t3 t4 t5 t6
> t1 t2 t3 t4 t5 t6
> t1 t2 t3 t4 t5 t6
> t1 t2 t3 t4 t5 t6
> frag 0 has 6 atoms
> t1 t2 t3 t4 t5 t6
>
> A window opens but remains blank!
> What's the error?
>
> Kind regards,
>
> Andreas
|
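A SMILES string such as C1CCCCC1 carries connectivity only, so a viewer has nothing to draw until every atom has been given x/y coordinates. For the single ring in this example, the simplest possible layout places the atoms on a regular polygon. The sketch below is plain Java with a made-up class name `RingLayout`; it is not the CDK or JOELib layout code and it ignores chains, substituents and fused rings, which is where the real difficulty lies.

```java
/** Sketch: 2D coordinates for a single ring, laid out as a regular polygon. */
final class RingLayout {

    /**
     * Returns n points {x, y} on a circle such that neighbouring points are
     * bondLength apart. The radius follows from edge = 2 * r * sin(pi / n).
     */
    static double[][] regularRing(int n, double bondLength) {
        if (n < 3) {
            throw new IllegalArgumentException("a ring needs at least 3 atoms");
        }
        double radius = bondLength / (2.0 * Math.sin(Math.PI / n));
        double[][] xy = new double[n][2];
        for (int i = 0; i < n; i++) {
            double angle = 2.0 * Math.PI * i / n;
            xy[i][0] = radius * Math.cos(angle);
            xy[i][1] = radius * Math.sin(angle);
        }
        return xy;
    }

    public static void main(String[] args) {
        // Six ring atoms as in C1CCCCC1; the bond length of 1.5 is an arbitrary drawing unit.
        double[][] xy = regularRing(6, 1.5);
        for (int i = 0; i < xy.length; i++) {
            System.out.printf("atom %d: x=%.3f y=%.3f%n", i + 1, xy[i][0], xy[i][1]);
        }
    }
}
```

Coordinates like these would still have to be written into the molecule object before it is displayed; how that is done depends on the JOELib version and is not shown here.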
From: Andreas M. <ma...@in...> - 2004-05-24 12:56:37
|
On Fri, May 21, 2004 at 11:17:50AM +0200, Joerg K. Wegner wrote:
> Hi Andreas,
>
> please try
> joelib.gui.render.MoleculeViewer2D
>
> Have a look at the main method and for details:
> display(JOEMol molecule, String _title,
>         JOESmartsPattern smarts, String eTransfer, String retroSynth,
>         String conjRing, String labels)

Hi Joerg,

I tried like this:

    import joelib.molecule.*;
    import joelib.smiles.*;
    import joelib.gui.render.*;

    public class SmilesSmartsViewer {

        public static void main(String[] args) {

            // obtain smiles string and create mol object
            String smiles="C1CCCCC1"; // the smiles string
            JOEMol mol=new JOEMol(); // create new mol-object

            // convert mol object to represent the smiles information
            if (!JOESmilesParser.smiToMol(mol, smiles, smiles))
            {
                System.err.println("SMILES entry \"" + smiles + "\" could not be loaded.");
            }

            // display mol object
            MoleculeViewer2D.display(mol);
        }
    }

Output is as follows here:

    log4j:WARN No appenders could be found for logger (joelib.io.IOType).
    log4j:WARN Please initialize the log4j system properly.
    t1 t2 t3 t4 t5 t6
    t1 t2 t3 t4 t5 t6
    t1 t2 t3 t4 t5 t6
    t1 t2 t3 t4 t5 t6
    t1 t2 t3 t4 t5 t6
    t1 t2 t3 t4 t5 t6
    t1 t2 t3 t4 t5 t6
    frag 0 has 6 atoms
    t1 t2 t3 t4 t5 t6

A window opens but remains blank! What's the error?

Kind regards, Andreas
|
From: Joerg K. W. <we...@in...> - 2004-05-21 09:15:24
|
Hi Andreas,

please try
joelib.gui.render.MoleculeViewer2D

Have a look at the main method and, for details, at:
display(JOEMol molecule, String _title,
        JOESmartsPattern smarts, String eTransfer, String retroSynth,
        String conjRing, String labels)

Note that this can only be used for visualization. There is no editor functionality available as in CDK. The sources here are adapted CDK sources, so it should not be too much effort, but ... as always, time is too scarce to combine all the functionality at once. So a basic event model is available, but it is simply not activated.

If you use CDK parts, do not forget to mention their license! Open source developers work more or less only for the reward of reading their names in the sources, so they will be really upset if we forget these things. I know this: I once forgot the correct license when I replaced a header by copy-and-paste.

Kind regards, Joerg
|
From: Andreas M. <ma...@in...> - 2004-05-19 15:26:09
|
Hi there,

how do I use the display routine in MoleculeViewer2D correctly? After creating my mol object, I tried

    RenderingAtoms ra = new RenderingAtoms();
    ra.add(mol);
    MoleculeViewer2D mv2D = new MoleculeViewer2D(ra);
    mv2D.display();

but that just opens a blank screen...

Thanks! Andreas
|
From: Joerg W. <we...@in...> - 2004-04-19 16:54:07
|
Greetings,

> well, then the first question - what about Weka performance ?
> (It eats a lot of memory when working with large data sets)
>
> > R is similar and a long time ago i've used the interface under Java very
> > shortly ... we're matlab based !
>
> i like using matlab and it is quite useful;
> but matlab itself is not open source , it could be obstacle

The same holds as for the representation of molecules. Why?

1. Weka splits everything into attributes and instances, and distinguishes nominal from numeric attributes. This costs memory, but it is quite useful, because it is not clear from a series such as
   1,NaN,3,4,2,1
   whether this is a nominal classification problem or a numeric regression problem! I understand your point. In fact I have implemented a DescriptorMatrix class for JOELib (joelib.desc.data) which holds only the matrix with descriptor names and molecules, but this causes a lot of problems for algorithm development, because the interface cannot distinguish the above series by default. I simply used a matrix2weka mapping tool. That is why a student of mine developed a second interface, which also holds the molecules directly in a Weka-related context, so that both possibilities are available. For my current problem I need a wild mix of nominal and numeric attributes, and it is clearer if the attributes already hold this information, so that I do not always have to implement helper classes for both cases.

2. In general it is useful to cache data sets (already available as DescriptorMatrixCache) to avoid multiple copies in memory. The cross-validation sets can be fetched from the cached versions. Furthermore, optimization algorithms need a common DB-like interface or caching mechanism to load the required data sets only once (singleton class interface).

3. It is not possible to compete with fast matrix operations; for those, R or Matlab should be used, since specifically optimized code is needed. Java has Jama and COLT, and some Weka add-ons use them, but this can never be compared to assembler-optimized code.

Kind regards, Joerg
|
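The ambiguity of the series 1,NaN,3,4,2,1 in point 1 can be shown with a small typed column: the same raw values give a set of class labels under one declaration and a mean under the other, and only the declared kind tells an algorithm which reading is intended. This is a plain-Java sketch with a hypothetical class `TypedColumn`; it is not the Weka attribute API and not JOELib's DescriptorMatrix, and it treats the literal text NaN as a missing value.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Sketch: a data column that carries its own interpretation (nominal or numeric). */
final class TypedColumn {
    enum Kind { NOMINAL, NUMERIC }

    private final String name;
    private final Kind kind;
    private final String[] values;

    TypedColumn(String name, Kind kind, String... values) {
        this.name = name;
        this.kind = kind;
        this.values = values.clone();
    }

    Kind kind() {
        return kind;
    }

    /** Nominal reading: the distinct labels, with "NaN" treated as missing. */
    Set<String> categories() {
        if (kind != Kind.NOMINAL) {
            throw new IllegalStateException(name + " is not nominal");
        }
        Set<String> seen = new LinkedHashSet<>();
        for (String v : values) {
            if (!"NaN".equals(v)) {
                seen.add(v);
            }
        }
        return seen;
    }

    /** Numeric reading: the mean over all non-missing values. */
    double mean() {
        if (kind != Kind.NUMERIC) {
            throw new IllegalStateException(name + " is not numeric");
        }
        double sum = 0.0;
        int count = 0;
        for (String v : values) {
            double d = Double.parseDouble(v); // "NaN" parses to Double.NaN
            if (!Double.isNaN(d)) {
                sum += d;
                count++;
            }
        }
        return count == 0 ? Double.NaN : sum / count;
    }

    public static void main(String[] args) {
        String[] series = {"1", "NaN", "3", "4", "2", "1"};
        TypedColumn asClasses = new TypedColumn("activity", Kind.NOMINAL, series);
        TypedColumn asNumbers = new TypedColumn("activity", Kind.NUMERIC, series);
        System.out.println("labels: " + asClasses.categories());
        System.out.println("mean:   " + asNumbers.mean());
    }
}
```

A bare matrix of numbers cannot carry this distinction, which is the memory-versus-clarity trade-off described in the message.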
From: Joerg W. <we...@in...> - 2004-04-18 10:45:56
|
Hi all,

> > I do not agree to open an own project, there is much code out there:
> > Weka, YALE (includes Weka interface) and XML, Commercial stuff with Weka
> > interface (Xalopy or what was the correct name ?)
> A new project does not mean that available pieces cannot be used...

So who can decide which classes are the best main classes? Do you know a critical mass of the Weka and JOELib classes you can use? Do I know all the CDK classes I can use? What about R, Yale, JavaNNS (SNNS successor), JavaEVA (EVA successor), libSVM, 'feature extraction', clustering ...

> > I think, we do not want to invent an new data mining standard, such
> > discussions are more useful for the Weka mailing list and all
> > available Matlab algorithm providers (toolboxes !!!) ...
> Not everyone prefer to work with Matlab... Matlab is not free, neither is the
> PLS Toolbox... What's the URL for Weka?

Google: Weka, Java, Data Mining

That is irrelevant. I have plenty of 'feature extraction' methods; you do not have to buy commercial toolboxes, there is a lot of free stuff, or use R ...
... the problem is mixing it all together ... I use these things and I am far from feeling experienced enough to define a common interface! I think this is more of an evolutionary process: you use it and then you find ways to facilitate the usage, but facilitated usage causes a more complex interface, so ... every new API requires time to understand its approach ... and can save development time ...

> > - the MaximumCommonSubstructure (MCS) algorithms
> Is this an improved algorithm, or similar to that in CDK?

1. I can assign different chemical graph labels
   1.1. basic atom types
   1.2. general PATTY
   1.3. atom properties threshold
   1.4. atom properties difference
2. MCS by clique detection
   2.1. Bron-Kerbosch (exact)
   2.2. DFMax (fast heuristic, non-exact)
3. multiple MCS
   3.1. HSCS (Sheridan approach)
   3.2. stochastic version
4. feature reduction step available for 1.1.-1.3.

Besides these things, there is also the incremental graph isomorphism algorithm for SMARTS matching (Ullmann variant with backtracking).

> > Sorry, CDK for descriptors is not obvious to me, please explain. As you
> > can mention, i do not agree for several reasons, as already discussed
> > previously, e.g. missing atom typer and missing substructure search !
> CDK *has* substructure search, implemented in a rather flexible way.

Graph isomorphism is not the same as substructure search! (See the definition of subgraph/substructure by Rücker/Rücker.) Or which expert systems do you use to assign the graph labels of the 'attributed graph'? (In general: things I criticize in my submitted paper!) In fact, nearly every software package uses its own 'labelling', so which one is correct? Which is standard? The isomorphism is not the problem, because we are talking about exact matching. Of course there exist other kinds of matching, like ... here you will need an optimization algorithm, like our JavaEVA library ...

> > Descriptor dependencies
> > are NOT all linear 2D dependencies as already excellently mentioned by
> > Nikolova/Jaworska. So where is the advantage to show them in 2D or 3D ?
> > That's mainly irrelevant and misleading ! A 2D plot is only one
> > possibility for the model quality, and not always the best one !!!
> What kind of 2D are you talking about here?

E.g. plain correlation plots between descriptorXYZ and predictedVALUE. Such things can be helpful, but such an approach is similar to visual 'feature selection' on one feature, and it is well known that important features are not the best ones from the standpoint of generalization ability (see Eibe/Witten or my submitted paper, if accepted :-)

> I have no idea what a data mining API is... data mining is a rather vague
> term... like chemometrics API.

That's the point!!! I am more interested in implementing all the required methods and extensions in JOELib/CDK, because the hypothetical interface will access these methods anyway! Furthermore, I am more interested in implementing access/algorithm speed-ups. That is what I call 'maintenance' problems. The libraries are still complex, so I am more interested in writing more examples and more tutorials, including more literature references, ...

Kind regards, Joerg
|
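Item 2.1 in the list above names Bron-Kerbosch as the exact clique detection method. In clique-based MCS approaches the cliques are searched in a compatibility graph built from pairs of matching atoms of the two molecules; building that graph is left out here. The sketch below shows only the basic Bron-Kerbosch recursion (without pivoting) on a small hand-made graph. The class name and the example graph are illustrative assumptions, and this is not the JOELib implementation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Sketch: basic Bron-Kerbosch enumeration of all maximal cliques (no pivoting). */
final class BronKerbosch {

    /** Builds adjacency sets for an undirected graph with vertices 0..n-1. */
    static List<Set<Integer>> graph(int n, int[][] edges) {
        List<Set<Integer>> adjacency = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            adjacency.add(new HashSet<>());
        }
        for (int[] e : edges) {
            adjacency.get(e[0]).add(e[1]);
            adjacency.get(e[1]).add(e[0]);
        }
        return adjacency;
    }

    /** Returns every maximal clique of the graph as a sorted set of vertex indices. */
    static List<Set<Integer>> maximalCliques(List<Set<Integer>> adjacency) {
        List<Set<Integer>> result = new ArrayList<>();
        Set<Integer> candidates = new HashSet<>();
        for (int v = 0; v < adjacency.size(); v++) {
            candidates.add(v);
        }
        expand(new TreeSet<>(), candidates, new HashSet<>(), adjacency, result);
        return result;
    }

    private static void expand(Set<Integer> clique, Set<Integer> candidates,
                               Set<Integer> excluded, List<Set<Integer>> adjacency,
                               List<Set<Integer>> result) {
        if (candidates.isEmpty() && excluded.isEmpty()) {
            result.add(new TreeSet<>(clique)); // nothing can extend it: maximal
            return;
        }
        // Iterate over a snapshot, because candidates shrinks inside the loop.
        for (Integer v : new ArrayList<>(candidates)) {
            Set<Integer> neighbours = adjacency.get(v);
            Set<Integer> nextCandidates = new HashSet<>(candidates);
            nextCandidates.retainAll(neighbours);
            Set<Integer> nextExcluded = new HashSet<>(excluded);
            nextExcluded.retainAll(neighbours);

            clique.add(v);
            expand(clique, nextCandidates, nextExcluded, adjacency, result);
            clique.remove(v);

            candidates.remove(v); // v is done: never start another clique from it
            excluded.add(v);
        }
    }

    public static void main(String[] args) {
        // Triangle 0-1-2 with a tail 2-3-4.
        List<Set<Integer>> g = graph(5, new int[][] {{0, 1}, {0, 2}, {1, 2}, {2, 3}, {3, 4}});
        for (Set<Integer> clique : maximalCliques(g)) {
            System.out.println(clique);
        }
    }
}
```

The largest clique found corresponds to the largest set of mutually compatible atom pairs. Worst-case run time grows exponentially with graph size, which is the usual motivation for heuristic alternatives such as the DFMax variant listed under 2.2.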
From: Egon W. <eg...@sc...> - 2004-04-17 19:46:18
|
On Saturday 17 April 2004 19:25, Joerg Wegner wrote:
> > > I suggest starting not with deciding what program to write but with
> > > what the components of a QSAR system are and then deciding what who
> > > wants to be involved, we have got and setting some realistic scope to
> > > what is achievable
>
> Of course, i like QSAR .. but time is rare and who will implement things
> ... you know that's my default comment ...
>
> Egon i've read your mail ... and yes i'm still in holiday ... and i do check
> e-mails and i work since 3 years on QSAR ... so holiday means i can read
> fantasy books and can do things i like, e.g. read some QSAR papers !:-)
> Holiday and spare-time are some curious things .. aren't they :-)

:)

> > It seems there is general agreement that an SF project in this area is
> > valuable and I'll make a few comments which I hope are helpful. Please
> > ignore if they aren't.
>
> I do not agree to open an own project, there is much code out there:
> Weka, YALE (includes Weka interface) and XML, Commercial stuff with Weka
> interface (Xalopy or what was the correct name ?)

A new project does not mean that available pieces cannot be used...

> I think, we do not want to invent an new data mining standard, such
> discussions are more useful for the Weka mailing list and all
> available Matlab algorithm providers (toolboxes !!!) ...

Not everyone prefers to work with Matlab... Matlab is not free, and neither is the PLS Toolbox... What's the URL for Weka?

> ... and such discussions are not new (see Weka mailing-list) !!!
> I think we are interested to provide the best useable approach
> with implemented algorithms available, so let's use the already
> available ones and extend them !!!

Absolutely. If that has not been clear so far, I prefer to use existing stuff as much as possible, but I do prefer some tools over others, as is generally the case, so we need to develop wrappers using a unified interface...

> IMHO:
> !!! The problem is not the missing 'data mining'-standard. The problem
> is the misuse of
> 1. a general molecular-structure-coding with these standard algorithms !!!
> 2. applying these algorithms correctly
> So let's focus this problem first !!!

Not everyone agrees on how methods/algorithms should be applied... but I agree that there is plenty of weird use of methods in some QSAR research... I think that providing people with an easy-to-use, clear and well-defined program will make it much easier to teach others what should be taken into account when making models...

> This is a problem of CDK and JOELib
> and only if we have solved this, we can solve the next one.
> Furthermore i will publish in the next time:
> - the extended Weka interface

Looking forward to reading that...

> - the MaximumCommonSubstructure (MCS) algorithms

Is this an improved algorithm, or similar to that in CDK?

> - The Metric-Interface is still available and is used by the AtomPair-
> descriptor
> Weka-Clusterers with Molecular-Metrics are planned and will be
> implemented next. The Cluster-Matlab-Molecule connection is too difficult
> at the moment, because the similarity metric must be coded under Matlab
> or we use indices ...

Not sure what you mean here...

> So again, i'm using a lot of interfaces and i do not like another one !!!

Fine. I don't think we will need to reinvent what you did. I, and I guess others too, am fine with using interfaces similar or identical to yours...

> Will it not be easier to add CDK- and JOELib-PlugIns.
> Do not make the algorithms too easy for chemists, probably they think
> hypothesis-testing is an easy task and the molecular structure is the
> most important thing ... IMHO ... that's badly wrong !!!

Mmm... not sure I agree here... chemists are our target... likely even biologists (no offense... :)

> So force them
> to read the data mining/interface manual carefully.

Ok, can you explain to me what the goal is here? I.e. what should they learn from understanding the interface?

> Descriptor dependencies
> are NOT all linear 2D dependencies as already excellently mentioned by
> Nikolova/Jaworska. So where is the advantage to show them in 2D or 3D ?
> That's mainly irrelevant and misleading ! A 2D plot is only one
> possibility for the model quality, and not always the best one !!!

What kind of 2D are you talking about here?

> > A. Current QSAR practice has severe problems. They include:
> > - almost all codes are closed. Many are not free.
>
> Exact:
> Descriptors: Dragon, MolConnZ, ...
> Algorithms: Often unpublished code with hiding most of the parameters,
> also important ones
>
> > - it is impossible to repeat any experiment. Therefore QSAR ceases to be
> > scientific but relies on reputation, trust and power
> > - the objects used are badly designed, irreproducible and have variable
> > interpretation
> > - data selection is arbitrary. There are few (no?) standard test sets.
> > It is impossible to verify whether data have be modified consciously or
> > unconsciously to increase apparent success
> > - algorithms are closed, even if the data are well defined.
>
> Agree fully, four times !
> Oh, i've some nice slides i can present for these points ... :-)
>
> > B. The mainstream QSAR community is not taking effective steps to remedy
> > the errors. Our current group believes that through an OpenSource approach
> > we can catalyse a change in thinking and practice. We do this by creating a
> > system and practice that demonstrates the increased **quality** available
> > through OpenSource. IMO quality is the most important - more so than
> > platform, language, ease of use, performance, etc. If it is easier and
> > faster to create more garbage on every platform what have we achieved?
>
> 1. Correct, but surely you know the No-Free-Lunch-Theorem ... i know that not
> everybody like this theorem (still apriori) ... BUT ...
now we have a huge > amount of algorithms ... which one to pick ? It's 'easy' to find one > algorithm and one feature set to explain one data set perfectly ! > > 2. And we are not all algorithm developers, so use the existing libraries > which the main-stream user can use. There is still enough room to make > errors, also if we must not reimplement algorithms !!! > > 3. A QSAR framework is not easy, because there are a lot of different > opinions: Correct. Hence the proposed the new SF project to discuss and implement these things... > 3.1. how to present structures, e.g. CDK<->JOELib > 3.2. models (hypothesis building algorithms) are really abstract and do > not > forget the nested and highly interesting meta algorithms with > recursive > character, so let's forget the C++ libraries and concentrate on the > Java and Matlab (Java GUI) libraries (R?) with their flexible > reflection > mechanism! > 3.3. results ... uhhh ... cross-validation, feature selection, data set > splitting ? > Do not forget that we talk about molecular structures, so ... > 3.4. Big descriptor files with normalized descriptors, missing values, if > instable numeric descriptors or they depend on molecule size, ... > 3.5. Are we working in memory or on files ??? For hypothesis building we > are hopefully are working on memory, but the preprocessing steps do > not underly this restriction. Much of this has already be discussed in the thread. True nevertheless. > Sorry, CDK for descriptors is not obvious to me, please explain. As you > can mention, i do not agree for several reasons, as already discussed > previously, e.g. missing atom typer and missing substructure search ! CDK *has* substructure search, implemented in a rather flexible way. > (molecular-structure-coding ... is restricted to applied expert systems) > > Why do we need again a new project, As said above, a new project does not equal starting from scratch. 
> do we not have enough interface > maintenance 'problems' with the actual projects !? > 1. I think the standard should be a file format or CML, but this does not > help at all, this can only save time by using more space ! > You-Know: Time-Space-Complexity I have not seen the Heisenberg relation for this yet... > 2. Often on-the-fly calculations are required, so this will require > JOELib or CDK or > external JOELib module (which exists already: Corina, Petra, XLogP,...) > So we need a molecule data structure, so which one to use ? > Again implement a new interface ? Why ? I can't see the advantage ? See thread. > 2.1. Interface to Molecules: > - JOELib (available) > - CDK (available) > - Ghemical/Mopac (available in JOELib) > - OpenBabel (JNI, same object structure as JOELib, but is this > usefull ?) > - Tinker > > 2.2. Interface to data mining packages > - Weka (available in JOELib/JCompChem) > - JavaNNS (SNNS sucessor, available in JOELib/JCompChem) > - LibSVM (available in JOELib/JCompChem) > - Matlab and it's 1001 free-packages (available in JOELib/JCompChem) Too bad Matlab itself is not... > - Yale uses Weka > - Data mining API Let's use a chemometrics API. :) I have no idea what a data mining API is... data mining is a rather vague term... like chemometrics API. > - ... to much such stuff ... all mostly incompatible ... let's use > Weka, that's the most serious used OpenSource approach. > Data Miners will implement their algorithms for it, we can use them ! > - let's use Matlab and/or R Let's have that plugable. So that anyone can choose whatever program they like. > 3. Visualization: > 3.1. Molecules: Can be done with CDK and with JOELib also highlighted > SMARTS substructures: > 2D layout CDK > 3D layout JOElib (Corina, Ghemical, orYourInterface) > 3.2. Data: what, histograms, plots, 3D plots , ... 
> no interest to implement such things, that's boring and does not > help at all, because Weka, Matlab, R have all their own tools > and which one do you prefer ? > What's with independent packages, like libSVM, our JavaNNS > (SNNS successor), ... > So we nedd an interface for all, that's nearly impossible in a short > time period. > I use most often the Java->Matlab interface, this is nothing special > only the adapted JMatLink connection. > > ... and another advantage of holiday and weekeend ... i can write really > long e-mails :-) Thanx for this analysis. And don't spend to much of your holiday on these kinds of emails... though it is difficult not to respond. :) > Kind regards, Joerg Have a nice continuation of you holiday! Egon |
From: Joerg W. <we...@in...> - 2004-04-17 17:25:49
|
Hi all, > > I suggest starting not with deciding what program to write but with what > > the components of a QSAR system are and then deciding what who wants to be > > involved, we have got and setting some realistic scope to what is > > achievable Of course, i like QSAR .. but time is rare and who will implement things ... you know that's my default comment ... Egon i've read your mail ... and yes i'm still in holiday ... and i do check e-mails and i work since 3 years on QSAR ... so holiday means i can read fantasy books and can do thinks i like, e.g. read some QSAR papers !:-) Holiday and spare-time are some curious things .. aren't they :-) > It seems there is general agreement that an SF project in this area is > valuable and I'll make a few comments which I hope are helpful. Please > ignore if they aren't. I do not agree to open an own project, there is much code out there: Weka, YALE (includes Weka interface) and XML, Commercial stuff with Weka interface (Xalopy or what was the correct name ?) I think, we do not want to invent an new data mining standard, such discussions are more usefull for the Weka mailing list and all avaliable Matlab algorithm providers (toolboxes !!!) ... ... and such discussions are not new (see Weka mailing-list) !!! I think we are interested to provide the best useable appraoch with implemented algorithms available, so let's use the already available ones and extend them !!! IMHO: !!! The problem is not the missing 'data mining'-standard. The problem is the misuse of 1. a general molecular-structure-coding with these standard algorithms !!! 2. applying these algorithms correctly So let's focus this problem first !!! This is a problam of CDK and JOELib and only if we have solved this, we can solve the next one. 
Furthermore i will publish in the next time: - the extended Weka interface - the MaximumCommonSubstructure (MCS) algorithms - The Metric-Interface is still available and is used by the AtomPair- descriptor Weka-Clusterers with Molecular-Metrics are planned and will be implemented next. The Cluster-Matlab-Molecule connection is to difficult at the moment, because the similarity metric must be coded under Matlab or we use indices ... So again, i'm using a lot of interfaces and i do not like another one !!! Will it not be easier to add CDK- and JOELib-PlugIns. Do not make the algorithms to easy for chemists, probably they think hypothesis-testing is an easy tasks and the molecular structure is the most important thing ... IMHO ... that's badly wrong !!! So force them to read the data mining/interface manual carefully. Descriptor dependencies are NOT all linear 2D dependencies as already excellently mentioned by Nikolova/Jaworska. So where is the advantage to show them in 2D or 3D ? That's mainly irrelevant and misleading ! A 2D plot is only one possibility for the model quality, and not always the best one !!! > A. Current QSAR practice has severe problems. They include: > - almost all codes are closed. Many are not free. Exact: Descriptors: Dragon, MolConnZ, ... Algorithms: Often unpublished code with hiding most of the paramaters, also important ones > - it is impossible to repeat any experiment. Therefore QSAR ceases to be > scientific but relies on reputation, trust and power > - the objects used are badly designed, irreproducible and have variable > interpretation > - data selection is arbitrary. There are few (no?) standard test sets. It > is impossible to verify whether data have be modified consciously or > unconsciously to increase apparent success > - algorithms are closed, even if the data are well defined. Agree fully, four times ! Oh, i've some nice slides i can present for these points ... :-) > B. 
The mainstream QSAR community is not taking effective steps to remedy > the errors. Our current group believes that through an OpenSource approach > we can catalyse a change in thinking and practice. We do this by creating a > system and practice that demonstrates the increased **quality** available > through OpenSource. IMO quality is the most important - more so than > platform, language, ease of use, performance, etc. If it is easier and > faster to create more garbage on every platform what have we achieved? 1. Correct, but surely you know the No-Free-Lunch-Theorem ... i know that not everybody like this theorem (still apriori) ... BUT ... now we have a huge amount of algorithms ... which one to pick ? It's 'easy' to find one algorithm and one feature set to explain one data set perfectly ! 2. And we are not all algorithm developers, so use the existing libraries which the main-stream user can use. There is still enough room to make errors, also if we must not reimplement algorithms !!! 3. A QSAR framework is not easy, because there are a lot of different opinions: 3.1. how to present structures, e.g. CDK<->JOELib 3.2. models (hypothesis building algorithms) are really abstract and do not forget the nested and highly interesting meta algorithms with recursive character, so let's forget the C++ libraries and concentrate on the Java and Matlab (Java GUI) libraries (R?) with their flexible reflection mechanism! 3.3. results ... uhhh ... cross-validation, feature selection, data set splitting ? Do not forget that we talk about molecular structures, so ... 3.4. Big descriptor files with normalized descriptors, missing values, if instable numeric descriptors or they depend on molecule size, ... 3.5. Are we working in memory or on files ??? For hypothesis building we are hopefully are working on memory, but the preprocessing steps do not underly this restriction. Sorry, CDK for descriptors is not obvious to me, please explain. 
As you can mention, i do not agree for several reasons, as already discussed previously, e.g. missing atom typer and missing substructure search ! (molecular-structure-coding ... is restricted to applied expert systems) Why do we need again a new project, do we not have enough interface maintenance 'problems' with the actual projects !? 1. I think the standard should be a file format or CML, but this does not help at all, this can only save time by using more space ! You-Know: Time-Space-Complexity 2. Often on-the-fly calculations are required, so this will require JOELib or CDK or external JOELib module (which exists already: Corina, Petra, XLogP,...) So we need a molecule data structure, so which one to use ? Again implement a new interface ? Why ? I can't see the advantage ? 2.1. Interface to Molecules: - JOELib (available) - CDK (available) - Ghemical/Mopac (available in JOELib) - OpenBabel (JNI, same object structure as JOELib, but is this usefull ?) - Tinker 2.2. Interface to data mining packages - Weka (available in JOELib/JCompChem) - JavaNNS (SNNS sucessor, available in JOELib/JCompChem) - LibSVM (available in JOELib/JCompChem) - Matlab and it's 1001 free-packages (available in JOELib/JCompChem) - Yale uses Weka - Data mining API - ... to much such stuff ... all mostly incompatible ... let's use Weka, that's the most serious used OpenSource approach. Data Miners will implement their algorithms for it, we can use them ! - let's use Matlab and/or R 3. Visualization: 3.1. Molecules: Can be done with CDK and with JOELib also highlighted SMARTS substructures: 2D layout CDK 3D layout JOElib (Corina, Ghemical, orYourInterface) 3.2. Data: what, histograms, plots, 3D plots , ... no interest to implement such things, that's boring and does not help at all, because Weka, Matlab, R have all their own tools and which one do you prefer ? What's with independent packages, like libSVM, our JavaNNS (SNNS successor), ... 
So we nedd an interface for all, that's nearly impossible in a short time period. I use most often the Java->Matlab interface, this is nothing special only the adapted JMatLink connection. ... and another advantage of holiday and weekeend ... i can write really long e-mails :-) Kind regards, Joerg > C. The OpenSource community has made some small, useful steps in this > direction. They now wish to pool their efforts and produce a single point > of contact for their own development and to show to the world. This does > NOT necessarily mean a single program. IMO it is much more likely to mean > an infrastructure on which a variety of operations can be carried out > ("glueware"?). They wish to create a project at SF which leads to: > - active constructive discussion > - agreed representation of objects > * molecules, atoms, fragments, etc. > * descriptors > * properties > - creation, cataloguing, annotating, high-quality information objects: > * dictionaries > * properties (e.g. of atoms) > * datasets > - creation, cataloguing, annotation of algorithms related to QSAR > * chemical perception > * statistics, optimisation, etc > - creation of software: > * as toolkit components > * as demonstrators of the *quality* of the system > > That is as far as I have got... > > I think it's important to be inclusive and I would therefore suggest that > we review the current OpenSource efforts in this area. My knowledge extends to: > - CDK, etc. > - JOELib > - OpenBabel > - Weka > - Nina's work (does this have a label?) > > In projects of this sort everyone has something to contribute and also > something to give up. For example I did a lot of work on visual display of > CML (Jumbo3) - and some of this functionality is not provided by other > sources. Nevertheless I decided to give up JUMBO3 and use JCP and Jmol for > display. JUMBO4.3 has now developed in a more structured form as a flexible > XML DOM and Tools library which can be reconfigured easily and rapidly. 
It > is component based rather than application based. > > I suggest starting not with deciding what program to write but with what > the components of a QSAR system are and then deciding what who wants to be > involved, we have got and setting some realistic scope to what is achievable. > > Best > > P. > Dipl. Chem. Joerg K. Wegner Center of Bioinformatics Tuebingen (ZBIT) Department of Computer Architecture Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany Phone: (+49/0) 7071 29 78970 Fax: (+49/0) 7071 29 5091 E-Mail: mailto:we...@in... WWW: http://www-ra.informatik.uni-tuebingen.de -- Never mistake motion for action. (E. Hemingway) Never mistake action for meaningful action. (Hugo Kubinyi,2004) |
From: Joerg W. <we...@in...> - 2004-04-17 08:16:17
|
Dear Nina Nikolova, Dear All, yes of course, it is definitely recommended to introduce a model validation facility, as already discussed in my last two papers. And also heavily criticized by Agrafiotis. Model comparison without data comparison is not really useful, so the models and the data must be stored; I would prefer a benchmark database or at least a public web page. I also discussed this topic with the JCICS editor, but ... you know chemists and their data. So, for a useful model validation we first need benchmark data sets, because we can only compare the hypotheses if the data sets are the same. Furthermore, a basic 'guideline' must be available to avoid over-/underfitted models, especially when applying feature selection. See the feature selection papers: http://www-ra.informatik.uni-tuebingen.de/software/joelib/users.html The first paper contains two benchmark data sets with nearly 3000 descriptors ? For models I recommend using Weka, because these models can be stored as Java objects, which is transparent enough, or, if possible, an XML mapping tool can be used. For our JavaNNS interface there is already a text export, and also for the libSVM interface, ... For Matlab, things can be stored in Matlab objects. No, sorry, I'm not at the ADMET conference, but I'm going to: - Chemoinformatics, Sheffield, just to talk to others, and - Analytica, Munich, Lecture: 'Model quality' !!! Kind regards, Joerg Dipl. Chem. Joerg K. Wegner Center of Bioinformatics Tuebingen (ZBIT) Department of Computer Architecture Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany Phone: (+49/0) 7071 29 78970 Fax: (+49/0) 7071 29 5091 E-Mail: mailto:we...@in... WWW: http://www-ra.informatik.uni-tuebingen.de -- Never mistake motion for action. (E. Hemingway) Never mistake action for meaningful action. (Hugo Kubinyi,2004) |
From: Joerg W. <we...@in...> - 2004-04-15 08:37:04
|
Hi all, there is already a kekulizing method available in JOEMol, which is now used for: - visualizing structures without aromatic rings in Kekule mode - image export and PDF export - MDL SDF export without the aromatic bond flag The changes have been added to CVS and the tutorial has been updated. The flags in the properties file now work correctly: 1. joelib.gui.render.Renderer2DModel.useKekuleStructure=true 2. #SD Files joelib.io.types.MDLSD.writeAromaticityAsConjugatedSystem=false Kind regards, Joerg Dipl. Chem. Joerg K. Wegner Center of Bioinformatics Tuebingen (ZBIT) Department of Computer Architecture Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany Phone: (+49/0) 7071 29 78970 Fax: (+49/0) 7071 29 5091 E-Mail: mailto:we...@in... WWW: http://www-ra.informatik.uni-tuebingen.de -- Never mistake motion for action. (E. Hemingway) Never mistake action for meaningful action. (Hugo Kubinyi,2004) |
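The two flags named in the message above are ordinary Java property entries. As a minimal sketch of how such entries can be read and interpreted as booleans, the following uses only java.util.Properties. The class KekuleFlagsSketch and its methods are hypothetical names for illustration; this is not how JOELib itself loads joelib.properties, only the two property keys are taken from the message.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of JOELib: parses property text and reads boolean flags.
final class KekuleFlagsSketch {

    // Parse property-file text; lines starting with '#' (such as "#SD Files") are comments.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Read a boolean flag, falling back to a default when the key is absent.
    static boolean flag(Properties props, String key, boolean defaultValue) {
        return Boolean.parseBoolean(props.getProperty(key, Boolean.toString(defaultValue)));
    }

    public static void main(String[] args) {
        String text =
            "joelib.gui.render.Renderer2DModel.useKekuleStructure=true\n"
            + "#SD Files\n"
            + "joelib.io.types.MDLSD.writeAromaticityAsConjugatedSystem=false\n";
        Properties props = parse(text);
        System.out.println(flag(props, "joelib.gui.render.Renderer2DModel.useKekuleStructure", false));
        System.out.println(flag(props, "joelib.io.types.MDLSD.writeAromaticityAsConjugatedSystem", true));
    }
}
```

The explicit default in flag() is a design choice of this sketch, so that a missing key is handled visibly rather than silently read as false.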
From: Peter Murray-R. <pm...@ca...> - 2004-04-03 13:42:24
|
At 11:02 29/03/2004 +0200, Joerg K. Wegner wrote: >Hi all, > >which features do you prefer for the next JOELib release ? > >Voting: >1. Tautomers > - Based on SMARTS Transformation rules, analogue to pH > correction module in JOELib, just use combinatorial > generation and not only one rule. > So this is analogue to other approaches, like > Agent2 (no SMARTS?, hard coded?), Docking programs, > Daylight, CACTVS > - BTW, i would be happy if you can submit requested > tautomer patterns in SMARTS notation. Assuming this is (a) the only Open Java code for tautomers and (b) mirrors an accepted practice (e.g. SMARTS) I would favour this. P. >2. Rotamers > - Porting the OELib rotamer generation to Java, > 60-75% already finished. > > >No voting: >A. MCS based pharmacophore detection is finished and > will be probably published in quartal 4 of this year. > >Kind regards, Joerg > >-- >Dipl. Chem. Joerg K. Wegner >Center of Bioinformatics Tuebingen (ZBIT) >Department of Computer Architecture >Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany >Phone: (+49/0) 7071 29 78970 >Fax: (+49/0) 7071 29 5091 >E-Mail: mailto:we...@in... >WWW: http://www-ra.informatik.uni-tuebingen.de >-- >Never mistake motion for action. > (E. Hemingway) > >Never mistake action for meaningful action. > (Hugo Kubinyi,2004) > Peter Murray-Rust Unilever Centre for Molecular Informatics Chemistry Department, Cambridge University Lensfield Road, CAMBRIDGE, CB2 1EW, UK Tel: +44-1223-763069 |
From: Joerg K. W. <we...@in...> - 2004-03-29 09:01:28
|
Hi all, which features would you prefer for the next JOELib release ? Voting: 1. Tautomers - Based on SMARTS transformation rules, analogous to the pH correction module in JOELib, but using combinatorial generation and not only one rule. So this is analogous to other approaches, like Agent2 (no SMARTS?, hard coded?), docking programs, Daylight, CACTVS - BTW, I would be happy if you could submit requested tautomer patterns in SMARTS notation. 2. Rotamers - Porting the OELib rotamer generation to Java, 60-75% already finished. No voting: A. MCS based pharmacophore detection is finished and will probably be published in the fourth quarter of this year. Kind regards, Joerg -- Dipl. Chem. Joerg K. Wegner Center of Bioinformatics Tuebingen (ZBIT) Department of Computer Architecture Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany Phone: (+49/0) 7071 29 78970 Fax: (+49/0) 7071 29 5091 E-Mail: mailto:we...@in... WWW: http://www-ra.informatik.uni-tuebingen.de -- Never mistake motion for action. (E. Hemingway) Never mistake action for meaningful action. (Hugo Kubinyi,2004) |
From: E.L. W. <eg...@sc...> - 2004-03-24 13:27:43
|
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Friday 19 March 2004 12:33, Joerg K. Wegner wrote: > 4. CDK CML core wishlist (Egon): > 4.1. Please move (from MDL CDO) or add in any form of stereochemistry > support to core CML and the RSS plugin. This is important for > visualizing drugs! Elaborate... MDL style wedge bonds are read from CML... what are you referring to? CML fragment please... > 4.2. Please accept RSS entries without structures without throwing an > error. Cannot reproduce that behaviour. I don't have that problem. Can you tell me how to reproduce the bug? > Please crosscheck the GPLed Java RSS viewer on SF.net: > https://sourceforge.net/projects/rssview/ > They can also add and edit RSS feed properties and have proxy support, > additionally all informations are stored in a XML file. Thanx for the ideas... this does not have a high priority ... > 4.3. Visualization for activity and ADMET data in this RSS feed example > (xsd:string, xsd:double, xsd:integer descriptors). Yes, that is being worked out... Will report later... Egon - -- eg...@sc... PhD on Molecular Representation in Chemometrics Nijmegen University http://www.cac.sci.kun.nl/people/egonw/ GPG: 1024D/D6336BA6 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.7 (SunOS) iD8DBQFAYYyxd9R8I9Yza6YRApgHAKCF0AFHwQ0M6YT2YaT61BfaQLB03wCfe692 dvrhPTw6+TVijhwfHByzOuo= =yZi7 -----END PGP SIGNATURE----- |
From: Joerg K. W. <we...@in...> - 2004-03-24 13:17:41
|
Here is the API link: http://www-ra.informatik.uni-tuebingen.de/software/joelib/api/index.html > Dear JOELib users, > > i've added the literature references for the donor/acceptor descriptors. > Unfortunately i do not know a common or 'general' formulation of these > properties. > > Does anybody of you know other references and argumentations for these > properties ? > If yes, please submit any suggestions ! > > src/joelib/desc/types/AtomInAcceptor.java > src/joelib/desc/types/AtomInDonAcc.java > src/joelib/desc/types/AtomInDonor.java > src/joelib/desc/types/AtomInConjEnvironment.java > src/joelib/desc/types/HBA1.java > src/joelib/desc/types/HBA2.java > src/joelib/desc/types/HBD1.java > src/joelib/desc/types/HBD2.java > src/joelib/process/filter/RuleOf5Filter.java > > Kind regards, Joerg Kurt Wegner > -- Dipl. Chem. Joerg K. Wegner Center of Bioinformatics Tuebingen (ZBIT) Department of Computer Architecture Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany Phone: (+49/0) 7071 29 78970 Fax: (+49/0) 7071 29 5091 E-Mail: mailto:we...@in... WWW: http://www-ra.informatik.uni-tuebingen.de -- Never mistake motion for action. (E. Hemingway) Never mistake action for meaningful action. (Hugo Kubinyi,2004) |
From: Joerg K. W. <we...@in...> - 2004-03-24 13:16:35
|
Dear JOELib users, I've added the literature references for the donor/acceptor descriptors. Unfortunately I do not know of a common or 'general' formulation of these properties. Do any of you know other references and arguments for these properties ? If yes, please submit any suggestions ! src/joelib/desc/types/AtomInAcceptor.java src/joelib/desc/types/AtomInDonAcc.java src/joelib/desc/types/AtomInDonor.java src/joelib/desc/types/AtomInConjEnvironment.java src/joelib/desc/types/HBA1.java src/joelib/desc/types/HBA2.java src/joelib/desc/types/HBD1.java src/joelib/desc/types/HBD2.java src/joelib/process/filter/RuleOf5Filter.java Kind regards, Joerg Kurt Wegner -- Dipl. Chem. Joerg K. Wegner Center of Bioinformatics Tuebingen (ZBIT) Department of Computer Architecture Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany Phone: (+49/0) 7071 29 78970 Fax: (+49/0) 7071 29 5091 E-Mail: mailto:we...@in... WWW: http://www-ra.informatik.uni-tuebingen.de -- Never mistake motion for action. (E. Hemingway) Never mistake action for meaningful action. (Hugo Kubinyi,2004) |
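The message above lists hydrogen-bond donor/acceptor descriptors next to a rule-of-five filter. As a rough illustration of how such counts feed a filter, here is a minimal sketch of Lipinski's rule of five with the commonly cited thresholds (molecular weight at most 500, logP at most 5, at most 5 donors, at most 10 acceptors). The class RuleOfFiveSketch is a hypothetical name and not the code in RuleOf5Filter.java; the descriptor values are assumed to be computed elsewhere, and, as the message notes, donor and acceptor counts depend on which definition is chosen.

```java
// Hypothetical illustration, not JOELib's RuleOf5Filter: counts rule-of-five violations
// from precomputed descriptor values.
final class RuleOfFiveSketch {

    // Number of violated criteria (0 to 4) under the commonly cited thresholds.
    static int violations(double molWeight, double logP, int hDonors, int hAcceptors) {
        int count = 0;
        if (molWeight > 500.0) count++;   // molecular weight above 500
        if (logP > 5.0) count++;          // octanol/water partition coefficient above 5
        if (hDonors > 5) count++;         // more than 5 hydrogen-bond donors
        if (hAcceptors > 10) count++;     // more than 10 hydrogen-bond acceptors
        return count;
    }

    // A molecule passes if it has no more than maxViolations violated criteria.
    static boolean passes(double molWeight, double logP, int hDonors, int hAcceptors,
                          int maxViolations) {
        return violations(molWeight, logP, hDonors, hAcceptors) <= maxViolations;
    }

    public static void main(String[] args) {
        // Illustrative, made-up descriptor values for a small molecule.
        System.out.println(violations(180.16, 1.2, 1, 4));
        System.out.println(passes(520.0, 3.0, 2, 5, 1));
    }
}
```

Keeping maxViolations as a parameter makes the tolerance explicit, since some users allow one violation and others none.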
From: Joerg K. W. <we...@in...> - 2004-03-23 19:49:41
|
Hi all, here is the proposed CML2 bugfix release. BTW, I've added: - Amber prep - Mopac out And fixed: - Sybyl partial charges that are read in are no longer overwritten by JOELib's partial charges, so you can now store this information in CML2. All partial charges in CML2 now contain vendor information. Kind release regards, Jörg -- Dipl. Chem. Joerg K. Wegner Center of Bioinformatics Tuebingen (ZBIT) Department of Computer Architecture Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany Phone: (+49/0) 7071 29 78970 Fax: (+49/0) 7071 29 5091 E-Mail: mailto:we...@in... WWW: http://www-ra.informatik.uni-tuebingen.de -- Never mistake motion for action. (E. Hemingway) Never mistake action for meaningful action. (Hugo Kubinyi,2004) |
From: Peter Murray-R. <pm...@ca...> - 2004-03-22 09:18:40
|
At 16:17 21/03/2004 +0100, Joerg Wegner wrote: >Hi all, > >1. I've added Amber,Mopac import to JOELib. >2. The partial charges entries in Mopac and Sybyl are now not any more >overwritten by JOELib's Gasteiger-Marsili partial charges. So they can be >forwarded to e.g. CML2. >I've added two test files. Please try: >sh convert.sh src/joelib/test/1bhf-ligand.mol2 sybyl.cml >sh convert.sh src/joelib/test/Ethanol.mopout mopac.cml Thanks very much. I reply before looking at the details... >Please attend that at the actual implementation the partial charges are >used to calculate the hashcode and the molecule ID for CML (hopefully >nearly unique), if you do not use the canonical SMILES hashcode, you will >get different results for different partial charges. >To avoid this, we simply use the canonicalized SMILES hashcode in >CMLIDCreator. > >So we get: ><cml:scalar units="units:electron" dataType="xsd:float" >dictRef="MOPAC">-0.0192</cml:scalar> This is valid CML but we are gently moving towards more structured dictRefs. I suggest: <molecule xmlns:MOPAC="http://www.tuebingen.de/dict/MOPAC">... ><cml:scalar units="units:electron" dataType="xsd:float" >dictRef="MOPAC:partialCharge73">-0.0192</cml:scalar> the MOPAC:partialCharge is an XML QName. You are in control of it. it acts as a pointer to an ID in a dictionary - i.e. somewhere you have something like: <dictionary id="mopacDict" dictRef="MOPAC" xmlns:MOPAC="http://www.tuebingen.de/dict/MOPAC"> <entry id="partialCharge73" term="partial charge from MOPAC"> <appinfo> <scalar type="xsd:float"/> </appinfo> <description>The partial charge as defined on p. 333 of the manual...</description> </entry> I am writing this without looking at the details but to give the idea. The appinfo stuff is new and still being developed. Your input will be important. The main thing is that you have a dictionary defining the concept. 
The actual ID doesn't matter - it could be JW:id376 as long as this was linked to a JW namespace and there was an entry id376. ><cml:scalar units="units:electron" dataType="xsd:float" >dictRef="MMFF94_CHARGES">0.569</cml:scalar> I would suggest either JW:MMFF94_CHARGES or if you have a large list of MMFF94 terms MMFF94:charges >If no charges are defined we get, as previously: ><cml:scalar units="units:electron" dataType="xsd:float" >dictRef="joelib:partialCharge">-0.2686233439216685</cml:scalar> > >3. Peter: Can these dictRef entries be used ? If not what then, because >Sybyl itself contains a string line where the 'partial charge vendor' is >already given. I've simply replaced ' ' by '_'. If manufacturers have IDs that conform to XML we can use them. If not there has to be a mapping. In haste P. >Kind regards, Joerg > >Dipl. Chem. Joerg K. Wegner >Center of Bioinformatics Tuebingen (ZBIT) >Department of Computer Architecture >Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany >Phone: (+49/0) 7071 29 78970 >Fax: (+49/0) 7071 29 5091 >E-Mail: mailto:we...@in... >WWW: http://www-ra.informatik.uni-tuebingen.de >-- >Never mistake motion for action. > (E. Hemingway) > >Never mistake action for meaningful action. > (Hugo Kubinyi,2004) > Peter Murray-Rust Unilever Centre for Molecular Informatics Chemistry Department, Cambridge University Lensfield Road, CAMBRIDGE, CB2 1EW, UK Tel: +44-1223-763069 |
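The reply above asks for dictRef values of the form prefix:localName and notes that vendor labels that do not conform to XML need a mapping. The following is a minimal sketch of such a mapping, assuming a conservative ASCII subset of XML name characters; the class DictRefSketch and its methods are hypothetical and are not taken from JOELib or from any CML library. Only the idea of replacing spaces by underscores and the example JW:MMFF94_CHARGES come from the thread.

```java
// Hypothetical helper: maps a free-text vendor label to a prefixed dictRef value.
final class DictRefSketch {

    // Turn a vendor label into a conservative XML-safe local name:
    // whitespace runs become '_', any other character outside [A-Za-z0-9_.-]
    // becomes '_', and a leading '_' is added if the name would not start
    // with a letter or underscore.
    static String toLocalName(String vendorLabel) {
        String name = vendorLabel.trim().replaceAll("\\s+", "_");
        name = name.replaceAll("[^A-Za-z0-9_.-]", "_");
        if (name.isEmpty() || !(Character.isLetter(name.charAt(0)) || name.charAt(0) == '_')) {
            name = "_" + name;
        }
        return name;
    }

    // Combine a namespace prefix (which must be declared in the CML document)
    // with the sanitized local name.
    static String toDictRef(String prefix, String vendorLabel) {
        return prefix + ":" + toLocalName(vendorLabel);
    }

    public static void main(String[] args) {
        System.out.println(toDictRef("JW", "MMFF94 CHARGES"));
    }
}
```

Restricting names to an ASCII subset is stricter than XML requires; it is chosen here only to keep the mapping simple and predictable. The mapping is not reversible, so a dictionary entry would still need to record the original vendor label.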
From: Joerg W. <we...@in...> - 2004-03-21 15:18:11
|
Hi all,

1. I've added Amber and Mopac import to JOELib.
2. The partial charge entries in Mopac and Sybyl files are no longer overwritten by JOELib's Gasteiger-Marsili partial charges, so they can be forwarded to e.g. CML2.
I've added two test files. Please try:
sh convert.sh src/joelib/test/1bhf-ligand.mol2 sybyl.cml
sh convert.sh src/joelib/test/Ethanol.mopout mopac.cml

Please note that in the current implementation the partial charges are used to calculate the hashcode and the molecule ID for CML (hopefully nearly unique). If you do not use the canonical SMILES hashcode, you will get different results for different partial charges.
To avoid this, we simply use the canonicalized SMILES hashcode in CMLIDCreator.

So we get:
<cml:scalar units="units:electron" dataType="xsd:float"
dictRef="MOPAC">-0.0192</cml:scalar>
<cml:scalar units="units:electron" dataType="xsd:float"
dictRef="MMFF94_CHARGES">0.569</cml:scalar>

If no charges are defined we get, as previously:
<cml:scalar units="units:electron" dataType="xsd:float"
dictRef="joelib:partialCharge">-0.2686233439216685</cml:scalar>

3. Peter: Can these dictRef entries be used? If not, what should be used instead? Sybyl itself contains a string line where the 'partial charge vendor' is already given; I've simply replaced ' ' by '_'.

Kind regards, Joerg

Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970
Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action.
 (E. Hemingway)

Never mistake action for meaningful action.
 (Hugo Kubinyi,2004)
|
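The ' ' to '_' replacement in point 3 could be generalised along these lines. This is a sketch only, written in Python rather than taken from JOELib: the function names are hypothetical, the raw vendor strings are made-up inputs, and the name check is an ASCII-only approximation of the XML NCName rules (the real grammar admits many non-ASCII letters). The `JW` prefix is the one used as an example in the reply.

```python
import re

# ASCII-only approximation of an XML NCName: starts with a letter or '_',
# continues with letters, digits, '_', '.' or '-'.
_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def to_local_name(vendor):
    """Map a free-text 'partial charge vendor' string to an XML-safe local name."""
    name = re.sub(r"\s+", "_", vendor.strip())     # whitespace -> '_', as in the message
    name = re.sub(r"[^A-Za-z0-9_.\-]", "_", name)  # any other unsafe character -> '_'
    if not name or not re.match(r"[A-Za-z_]", name):
        name = "_" + name                          # must not start with a digit, '.' or '-'
    return name


def to_dict_ref(prefix, vendor):
    """Build a 'prefix:localName' dictRef value for a vendor string."""
    if not _NCNAME.fullmatch(prefix):
        raise ValueError(f"prefix {prefix!r} is not a valid XML name")
    return f"{prefix}:{to_local_name(vendor)}"


for raw in ("MMFF94 CHARGES", "3D charges"):       # hypothetical raw vendor strings
    print(repr(raw), "->", to_dict_ref("JW", raw))
```

Because the mapping is lossy, two different vendor strings can collapse to the same name, so a real mapping would still need a dictionary entry per resulting ID, and the prefix has to be bound to a namespace in the CML output.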
From: Joerg K. W. <we...@in...> - 2004-03-20 16:11:16
|
Hi,

sorry, the sequential CML2 reader (for uncompressed files ONLY!!!) contains a parsing bug for complex descriptors such as arrays, matrices, and atom-pair descriptors. I've fixed it and checked the fix in. I will not build a new release for this, so either check out the current sources if you want to try it, compress your CML2 file before loading it :-), or wait until the next release.

BTW, I've updated the homepage and added P. Murray-Rust to the acknowledgements, because of the valuable CML discussion.

Kind regards, Joerg
--
Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970
Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action.
 (E. Hemingway)

Never mistake action for meaningful action.
 (Hugo Kubinyi,2004)
|