From: Joerg K. W. <we...@in...> - 2004-02-19 09:17:35
|
Hi karthikeyan,

> 1) thank you for your immediate response. I can wait till I can see the
> atom symbols in 3D (i feel it is very much required for chemist to understand the
> molecule in 3D view)
> I will try the 2D technique suggested by you.

I agree, but the priority is still very low for me :-)

> 2) Second query: (pl.)
> I have problem in reading Molconnz (.s) file in Joelib. I get an error:
> Molecule entry (#1) skipped: .. io.MoleculeIOexception: Line 14(1)
> should contain 7
> descriptor not 11: 2.828427 2.00000 1.4142 1.000 0.0000 etc.,
> I am using the molconnz output directly

This depends on your MolConnZ version! The supported format is defined in
joelib\src\joelib\data\plain\molconnz350.txt
and the first lines should contain:
id nvx nrings ncirc nelem fw aname ...
So if you are using any other version, you must supply your own file definition, e.g. molconnz400.txt, and define these descriptors in
joelib\src\joelib\data\plain\knownResults.txt
where you must declare whether each descriptor is double, integer, boolean, or whatever. If you do not define these things, JOELib can of course still load the entries and select them, but always as strings! So if you plan to normalize your data, apply filters, or do something else with the values, I recommend the definition in knownResults.txt.

> or should I convert into some format before submitting to joelib?

That is also possible, e.g. SDF, which is my preferred format, but still not the best one, because you must still define the descriptors in knownResults.txt. CML in JOELib is the most verbose format, because the descriptors already carry their type (once defined in knownResults.txt)! They can then be loaded without any further definitions! Furthermore, the CML reader/writer is consistent within JOELib, but possibly not up to date with the Murray-Rust CML2 implementation in his Java library and in OpenBabel, because he develops a huge amount of code and I'm not able to follow so fast.
And I'm not sure about their descriptor abilities, because OpenBabel has no descriptor storing facility. Anybody, please correct me if this is not true. The conversion of XML files should not be too difficult, so ...

> The final objective is: to read all molconnz descriptor and 'optionally'
> write in a clean format (col/row)
>
> mol1 d1 d2 d3 d4..
> mol2 d1 d2 d3 d4..

The SMILES/flat file format is also supported, in
sh convertSkip.sh
where the flat file format should be defined in a separate file format.txt with
mol1-ID d1 d2 d3 d4
or in joelib.properties for the SMILES, but I think I've also added a command line switch!!!

> 3) finally... regarding compiling using ant, the output is directed to
> build directory
> and how to run from main *.bat files if the output is in build
> directory?
> as a shortcut I copied all the *.bat files to build directory.. and it
> is working ok

As already discussed, I do not like the bat files, and they are not really supported or up to date. If it is possible in any way, I recommend cygwin!!! It is a Unix shell for Windows, so you can use the shell scripts!!!
Of course the bat files will work if all required libraries are added to the classpath (the classes in the build directory and all lib/*jar files), but that is boring work and changes often ...
The shell scripts or ant will resolve the dependencies automatically!

Regards, Joerg

> regards
>
> --
> M. Karthikeyan, Ph.D., Scientist
> National Chemical Laboratory
> Pune - 411 008, INDIA
> Ph: +91-(0)20-5893 457 FAX: 5893 973
> http://www.ncl-india.org/

--
Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970
Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action.
E. Hemingway
|
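The typing step described in this message (values load as strings unless a type is declared) can be illustrated with a minimal Java sketch. This is not JOELib code and does not use the JOELib API: the class `FlatDescriptorReader`, its `Type` enum, and the `parseLine` method are hypothetical names, the in-memory type map only stands in for the role the message assigns to knownResults.txt, and the sample values are made up. Only the column names `id`, `nvx` and `fw` are taken from the header line quoted above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of JOELib. */
class FlatDescriptorReader {

    /** Declared column types (an assumed stand-in for knownResults.txt entries). */
    enum Type { STRING, INTEGER, DOUBLE, BOOLEAN }

    /** Converts one token to its declared type. */
    static Object convert(String token, Type type) {
        switch (type) {
            case INTEGER: return Integer.valueOf(token);
            case DOUBLE:  return Double.valueOf(token);
            case BOOLEAN: return Boolean.valueOf(token);
            default:      return token;
        }
    }

    /**
     * Parses one whitespace-separated line against the expected column names.
     * Columns without a declared type stay strings; a wrong column count
     * fails loudly instead of silently shifting values.
     */
    static Map<String, Object> parseLine(String line, List<String> columns,
                                         Map<String, Type> types) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length != columns.size()) {
            throw new IllegalArgumentException("Line should contain "
                    + columns.size() + " values, not " + tokens.length + ": " + line);
        }
        Map<String, Object> row = new LinkedHashMap<>();
        for (int i = 0; i < tokens.length; i++) {
            Type type = types.getOrDefault(columns.get(i), Type.STRING);
            row.put(columns.get(i), convert(tokens[i], type));
        }
        return row;
    }

    public static void main(String[] args) {
        // Made-up sample row; "id" has no declared type and stays a string.
        List<String> columns = List.of("id", "nvx", "fw");
        Map<String, Type> types = Map.of("nvx", Type.INTEGER, "fw", Type.DOUBLE);
        System.out.println(parseLine("mol1 6 78.11", columns, types));
    }
}
```

The column-count check mirrors the kind of mismatch reported in the quoted error: a file written by a different MolConnZ version simply has a different number of columns than the definition expects.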
From: E.L. W. <eg...@sc...> - 2004-02-21 00:55:40
|
On Thursday 19 February 2004 10:12, Joerg K. Wegner wrote:
> > 1) thank you for your immediate response. I can wait till I can see the
> > atom symbols in 3D (i feel it is very much required for chemist to
> > understand the molecule in 3D view)
> > I will try the 2D technique suggested by you.
>
> I agree, but it the priority is still very low for me :-)

Hi karthikeyan,

Jmol (jmol.sf.net) is an excellent 3D viewer which can label atoms by element
and by number ... it's not based on Java3D and has excellent performance. It
shares at least the CML formats with JOELib, so it should interoperate
without much trouble....

Joerg, have you considered distributing Jmol with JOELib? It's not based on
Java3D (a plus or minus, does not really matter), but is still actively
developed...

Egon

--
eg...@sc...
PhD on Molecular Representation in Chemometrics
Nijmegen University
http://www.cac.sci.kun.nl/people/egonw/
GPG: 1024D/D6336BA6
|
From: Joerg K. W. <we...@in...> - 2004-02-19 11:23:00
|
Hi,

sorry EGON! I was focused on the technical question.
That would be great!!! Some time ago I also used the commercial Marvin as an interface for testing purposes, but never really needed this functionality, because I used other tools.

If you have a current JMol package which uses the JOELib import/export from the GUI, I will be glad to add it as an optional package to the file downloads, in addition to Ghemical, Weka and the software design libraries.
http://sourceforge.net/project/showfiles.php?group_id=39708

If such things are already available, I can add a short description to the XML DocBook tutorial, or you can add it with your name as author ... as you like. In fact I'm using SGML, not XML, ...

But this does not change my time priorities. I would be really happy to add any interfaces, but I'm not able to maintain them actively, because my current focus lies on our internal JCompChem cheminformatics library with data mining and maximum common substructure search algorithms. These things are really alpha, because the packages are being refactored right now: we have found some blind alleys which restrict further algorithm development, so ... in progress. Possibly these things will become publicly available in the future, too ... as part of JOELib ... and after (hopefully) publishing some nice combinations ...

Regards, Joerg

P.S.: This reminds me to update the CDK installation instructions ... and to add some more instructions to the tutorial ... my current, outdated CDK version has some problems with 2D layout ...
|
From: E.L. W. <eg...@us...> - 2004-02-22 02:14:34
|
On Thursday 19 February 2004 12:17, Joerg K. Wegner wrote:
> sorry EGON ! I was focused on the technical question.
> Would be great !!! I've also some time ago also used the commercial
> Marvin as interface for testing purpose, but never really needed this
> functionality, because i used other tools.
>
> If you have an actual JMol package, which uses the JOELib import/export
> from GUI i will be glad to add this as optional package to the file
> downloads, additional to Ghemical, Weka and the Software design libraries.
> http://sourceforge.net/project/showfiles.php?group_id=39708

(Jmol with a lower case m... :)

What I would love to see is a JOELib plugin for Jmol... I'm not sure what the
main function of JOELib is (descriptor calculation, I guess...), but it is
relatively easy to put that into a Jmol plugin... but, second uncertainty...
I'm not sure what the best way would be to interact with JOELib...
In other words, what would be most interesting for Jmol to depict?

Anyway, I'm seeing a plugin that would calculate the JOELib descriptors for
the shown structure... more ideas? Have a look at
http://cdk.sf.net/plugins.html to see what the plugins are about...

> If such things are already available i can add a short description to
> the XML DocBook tutorial or you can add it and your name as author ...
> as you like. In fact im using SGML not XML, ...
>
> But this does not change my time priorities, so i would be really happy
> to add any interfaces, but i'm not able to maintain they actively,
> because my actual focus lies on our interal JCompChem cheminformatics
> library with data mining and maximum common substructure search
> algoritms.

Sure. I have your two recent articles on my desk, but have not found time yet
to read them...

BTW, CDK already provides MCSS code... why not use that? It has the best
algorithm available at this moment... unless you're interested in finding a
better one...

> These things are really alpha, because the packages are
> refactored yet, because we have found some blind alleys, which restricts
> further algorithm development, so ... in progress. Eventually these
> things will be publicly available in the future, too ... as part of
> JOELib ... and after (hopefully) publishing some nice combinations ...
>
> Regards, Joerg
>
> P.S.: This reminds me to update the CDK installation instruction ... and
> add some more instructions to the tutorial ... my actual deprecated CDK
> version has some problems in 2D layout ...

Yes, please do keep up... we're gaining momentum every month...

Egon
|
From: Joerg K. W. <we...@in...> - 2004-02-19 12:09:50
|
Hi Egon,

first: if the e-mail address of C. Steinbeck is not correct, please feel free to forward this message. He might be interested in such things, too. I've read his recent JCICS CASE paper.

> (Jmol with a lower case m... :)

Jmol :-)

> In other words, what would be most interesting for Jmol to depict?

Import & export and descriptors. As you surely know, I use the OELib kernel, which is also the OpenBabel kernel, so we can easily add all the types they support, if anyone finds the time to port the C++ to Java, which is VERY easy ... but still costs time ... the important point is the atom types, and these are available.

> Anyway, I'm seeing a plugin that would calculate the JOELib descriptors for
> the shown structure... more ideas? Have a look at
> http://cdk.sf.net/plugins.html to see what the plugins are about...

I'll have a short look ... and again ...

>>But this does not change my time priorities, so i would be really happy
>>to add any interfaces, but i'm not able to maintain they actively,
>>because my actual focus lies on our interal JCompChem cheminformatics
>>library with data mining and maximum common substructure search
>>algoritms.

> Sure. I have your two recent articles on my desk, but have not found time yet
> to read them...

Machine learning and algorithm stuff ...

> BTW, CDK already provides MCSS code... why not use that? It has the best
> algorithm available at this moment... unless you're interesting in finding a
> better one...

I'm not interested in finding a better one ... I'm interested in using them with a generally defined approach, so that we can adapt the algorithm to our cheminformatics requirements. The CDK implementation uses, if I understand it correctly, the association graph method, and I use the same association matrix method with a generalized atomType assignment, which also covers the ESTATE or my improved CESTATE one ... which already works great using some hashing ...
The association graph is not the problem; the problem is the missing atom types and atom properties (descriptors) in CDK.
The relevant clique detection algorithm in my implementation is just an interface and can be replaced by any other implementation ... it depends on your needs, because fast implementations use, of course, heuristic approaches!

BTW, the original reference for this kind of MCS dates from 1976! The performance depends on the clique detection algorithm used. References can also be found in my papers, because I already use two implementations ... a lot of them exist ... but I do not believe that it is possible to improve the performance easily, because a lot of work has already been done by graph experts.

We are working on a multiple interpretation and are reimplementing three other approaches from the literature, to be able to compute multiple MCS, which is much more interesting if you want to find a pharmacophore description based on ligands ... so this is in progress ... or if you like some CASE-relevant analysis stuff ... at the moment we have reached pre-alpha, so we are testing, testing and testing ... to find suitable combinations and parameters ...

Regards, Joerg
|
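The association graph approach with a replaceable clique search, as described in this message, can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the JOELib, JCompChem or CDK implementation: the names `AssociationGraphMcs`, `CliqueFinder`, `BronKerbosch`, `mcs` and `bonds` are hypothetical, atom types are plain strings, bond orders are ignored, and the clique search is the basic Bron-Kerbosch algorithm without pivoting (exact, but exponential in the worst case). The sketch finds a maximum common induced subgraph, which need not be connected.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch for illustration; not JOELib, JCompChem or CDK code. */
class AssociationGraphMcs {

    /** Replaceable clique search, so exact and heuristic finders can be swapped. */
    interface CliqueFinder {
        List<Integer> maximumClique(boolean[][] adjacency);
    }

    /** Basic Bron-Kerbosch without pivoting. */
    static class BronKerbosch implements CliqueFinder {
        private boolean[][] adj;
        private List<Integer> best;

        public List<Integer> maximumClique(boolean[][] adjacency) {
            adj = adjacency;
            best = new ArrayList<>();
            List<Integer> candidates = new ArrayList<>();
            for (int v = 0; v < adjacency.length; v++) candidates.add(v);
            expand(new ArrayList<>(), candidates, new ArrayList<>());
            return best;
        }

        /** r: current clique, p: candidates, x: already processed vertices. */
        private void expand(List<Integer> r, List<Integer> p, List<Integer> x) {
            if (p.isEmpty() && x.isEmpty()) {          // r is a maximal clique
                if (r.size() > best.size()) best = new ArrayList<>(r);
                return;
            }
            for (Integer v : new ArrayList<>(p)) {
                List<Integer> p2 = new ArrayList<>();
                List<Integer> x2 = new ArrayList<>();
                for (Integer u : p) if (adj[v][u]) p2.add(u);
                for (Integer u : x) if (adj[v][u]) x2.add(u);
                r.add(v);
                expand(r, p2, x2);
                r.remove(r.size() - 1);                // remove by index
                p.remove(v);                           // remove by value
                x.add(v);
            }
        }
    }

    /** Returns matched atom index pairs {i, j} of a maximum common induced subgraph. */
    static List<int[]> mcs(String[] types1, boolean[][] bonds1,
                           String[] types2, boolean[][] bonds2, CliqueFinder finder) {
        // Association graph nodes: atom pairs with equal atom type.
        List<int[]> nodes = new ArrayList<>();
        for (int i = 0; i < types1.length; i++)
            for (int j = 0; j < types2.length; j++)
                if (types1[i].equals(types2[j])) nodes.add(new int[] {i, j});

        // Two pairs are compatible if they use distinct atoms on both sides
        // and the atoms are bonded in both molecules or in neither.
        int n = nodes.size();
        boolean[][] adj = new boolean[n][n];
        for (int a = 0; a < n; a++) {
            for (int b = 0; b < n; b++) {
                int[] p = nodes.get(a), q = nodes.get(b);
                adj[a][b] = p[0] != q[0] && p[1] != q[1]
                        && bonds1[p[0]][q[0]] == bonds2[p[1]][q[1]];
            }
        }
        List<int[]> result = new ArrayList<>();
        for (int v : finder.maximumClique(adj)) result.add(nodes.get(v));
        return result;
    }

    /** Builds a symmetric bond matrix from a list of atom index pairs. */
    static boolean[][] bonds(int atomCount, int[][] pairs) {
        boolean[][] m = new boolean[atomCount][atomCount];
        for (int[] p : pairs) { m[p[0]][p[1]] = true; m[p[1]][p[0]] = true; }
        return m;
    }

    public static void main(String[] args) {
        // Heavy atoms only: ethanol C-C-O versus dimethyl ether C-O-C.
        String[] t1 = {"C", "C", "O"};
        boolean[][] b1 = bonds(3, new int[][] {{0, 1}, {1, 2}});
        String[] t2 = {"C", "O", "C"};
        boolean[][] b2 = bonds(3, new int[][] {{0, 1}, {1, 2}});
        List<int[]> match = mcs(t1, b1, t2, b2, new BronKerbosch());
        System.out.println("common atoms: " + match.size());
    }
}
```

Replacing the string comparison in `mcs` with a comparison of richer atom types or atom properties is where a generalized atom type assignment of the kind mentioned above would plug in, and a heuristic `CliqueFinder` could be substituted when exact search is too slow.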