joelib-help Mailing List for JOELib/JOELib2 (Page 7)

Hi,

this is interesting :-)

1. Profiling is recommended.
I would be happy if you have time for this, once you have found a good,
free Java profiling tool.

2. Only the CVS can help. Around this time I added all the descriptors,
so the bottleneck could be the parsing process, which uses the regular
expression patterns in joelib/src/joelib/data/plain/knownResults.txt.
Replacing this file with one that contains no regular expressions could be a
solution.

As I have already tested myself, the descriptors are the first bottleneck.
An SDF reader class that does not convert them might help: in the previous
version they were stored as unparsed entries, but I changed this to avoid
problems with the CML2 export.
Try your own MyMDLSD.java (with descriptor parsing removed) and use it to
replace the current MDLSD reader in joelib.properties.
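
To illustrate what "stored as unparsed entries" means, here is a minimal sketch in plain Java. It does not use any JOELib classes, and the names RawSdfScan and readRawDataItems are hypothetical, made up for this example. It assumes the usual SD file layout: each data item starts with a "> <NAME>" header line, its value ends at a blank line, and each record ends with a "$$$$" line. Values are kept as raw strings and never converted, so no regular expressions run while reading.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of JOELib: scans SD records and keeps
// every data item as an unparsed string.
class RawSdfScan {

    static List<Map<String, String>> readRawDataItems(Reader in) throws IOException {
        List<Map<String, String>> records = new ArrayList<>();
        Map<String, String> current = new LinkedHashMap<>();
        BufferedReader reader = new BufferedReader(in);
        String name = null;                       // data item currently being read
        StringBuilder value = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.startsWith("$$$$")) {
                // End of record: flush an unterminated item, then store the record.
                if (name != null) {
                    current.put(name, value.toString());
                }
                records.add(current);
                current = new LinkedHashMap<>();
                name = null;
                value.setLength(0);
            } else if (name == null && line.startsWith(">")) {
                // Data header such as "> <LogP>": take the text between '<' and '>'.
                int open = line.indexOf('<');
                int close = line.indexOf('>', open + 1);
                if (open >= 0 && close > open) {
                    name = line.substring(open + 1, close);
                }
            } else if (name != null) {
                if (line.trim().isEmpty()) {
                    // A blank line terminates the value; store it unparsed.
                    current.put(name, value.toString());
                    name = null;
                    value.setLength(0);
                } else {
                    if (value.length() > 0) {
                        value.append('\n');
                    }
                    value.append(line);
                }
            }
            // All other lines belong to the molfile block and are skipped here.
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        try (Reader in = new FileReader(args[0])) {
            System.out.println("Done: " + readRawDataItems(in).size() + " found");
        }
    }
}
```

Run it as "java RawSdfScan sample.sdf". A descriptor value would then be converted only when it is actually requested, which is the trade-off described above: reading gets cheaper, but any export that needs typed values has to parse them later.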

Kind regards, Joerg

On Thu, 3 Jun 2004, Oliva, Ambrogio wrote:

> Hi.
>
> I've found big differences in the performance of SimpleReader between older and newer versions of the joelib library. Running a simple application that counts the number of molecules in an SDF file with different versions of joelib, I got the following results.
>
> - with version 20040116:
>
> C:\eclipse\workspace\MolCounter>java -cp .;log4j.jar;itext-0.94.jar;joelib-20040116.jar  MolCounter sample.sdf
> 16:04:49 [INFO ] joelib.data.JOEElementTable              - Using element table: joelib/data/plain/element.txt
> 16:04:49 [INFO ] joelib.io.IOTypeHolder                   - 13 input/output types loaded.
> 16:04:50 [INFO ] joelib.io.SimpleReader                   - ... 500 molecules successful loaded in 922 ms.
> Done: 500 found
>
> - with version 20040323:
>
> C:\eclipse\workspace\MolCounter>java -cp .;log4j.jar;itext-0.94.jar;joelib-20040323.jar  MolCounter sample.sdf
> 16:04:37 [INFO ] joelib.data.JOEElementTable              - Using element table: joelib/data/plain/element.txt
> 16:04:37 [INFO ] joelib.io.IOTypeHolder                   - 22 input/output types loaded.
> 16:04:38 [INFO ] joelib.desc.DescriptorHelper             - 78 descriptor informations loaded.
> 16:04:38 [INFO ] joelib.data.JOEAtomTyper                 - Using atom type model: joelib/data/plain/atomtype.txt
> 16:04:38 [INFO ] joelib.data.JOEPhModel                   - Using pH value correction model: joelib/data/plain/phmodel.txt
> 16:04:42 [INFO ] joelib.io.SimpleReader                   - ... 500 molecules successful loaded in 4844 ms.
> Done: 500 found
>
> Could someone explain to me the different behaviour of the two libraries, and how to speed up the process using the newer versions?
>
> The source code of MolCounter is below.
>
> // Imports
> import org.apache.log4j.*;
> import joelib.io.*;
> import joelib.molecule.*;
> import java.io.*;
>
> public class MolCounter {
>
>     // Obtain a suitable logger.
>     private static Category logger = Category.getInstance("MolCounter");
>
>     public static void main(String[] args) {
>         SimpleReader reader = null;  // input SDF file
>         IOType inputType = IOTypeHolder.instance().getIOType("SDF");
>         try {
>             reader = new SimpleReader(new FileInputStream(args[0]), inputType);
>         } catch (Exception ex) {
>             // the reader could not be created; stop here instead of
>             // running into a NullPointerException below
>             ex.printStackTrace();
>             System.exit(1);
>         }
>         JOEMol mol = new JOEMol(inputType, inputType);
>         long lCounter = 0;
>         try {
>             while (reader.readNext(mol)) {
>                 lCounter++;
>             }
>         } catch (IOException ex) {
>             // occurs if the file cannot be read
>             ex.printStackTrace();
>         } catch (MoleculeIOException ex) {
>             // occurs if a molecule entry is invalid
>             ex.printStackTrace();
>         }
>         reader.close();
>
>         System.out.println("Done: " + lCounter + " found");
>         System.exit(0);
>     }
> }
>
> Thanks in advance.
>
> Ambrogio

Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970
Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW:    http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action.
                                    (E. Hemingway)

Never mistake action for meaningful action.
                               (Hugo Kubinyi,2004)
