From: Joerg W. <we...@in...> - 2004-06-04 15:19:43
|
Hi,

sorry, I'm not familiar with the version numbers.
Only uncompressed formats! No JCAMP CS!
At least all label-data tags should be parsed.

JCAMP isn't that easy to implement with all its variants and nested elements, which are or should be supported.

Kind regards, Joerg

On Thu, 3 Jun 2004, Stefan Kuhn wrote:
> Hi everybody,
> I have a question about the jcamp parser: Which Versions of jcamp data can be
> read? Because one of our users says: "I have tried V 5.0, 5.01 and 6.0, but
> none of the files, which have been generated by our NMR-software works."
> Could you shed some light on this?
> Thanks
> Stefan
> --
> Stefan Kuhn M. A.
> Cologne University BioInformatics Center (http://www.cubic.uni-koeln.de)
> Zülpicher Str. 47, 50674 Cologne
> Tel: +49(0)221-470-7428 Fax: +49 (0) 221-470-7786
> My public PGP key is available at http://pgp.mit.edu

Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970 Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action. (E. Hemingway)
Never mistake action for meaningful action. (Hugo Kubinyi, 2004)
From: Stefan K. <ste...@un...> - 2004-06-03 18:10:10
|
Hi everybody,

I have a question about the jcamp parser: Which Versions of jcamp data can be read? Because one of our users says: "I have tried V 5.0, 5.01 and 6.0, but none of the files, which have been generated by our NMR-software works."
Could you shed some light on this?

Thanks
Stefan
--
Stefan Kuhn M. A.
Cologne University BioInformatics Center (http://www.cubic.uni-koeln.de)
Zülpicher Str. 47, 50674 Cologne
Tel: +49(0)221-470-7428 Fax: +49 (0) 221-470-7786
My public PGP key is available at http://pgp.mit.edu
From: Joerg K. W. <we...@in...> - 2004-05-10 17:00:13
|
Hi all,

>>Are we (QSAR, CDK, JOELib, Octet, Jumbo) trying to do too much at the same
>>time?
> Maybe, but I think that things are going fine as they go now... we approach
> things step by step... I guess we are mostly just glueing existing tools
> together...

Maybe ... step by step ... and we first need a common merged interface, before any concrete implementation helps us to improve the current design.

>>(3) Wouldn't it be even more useful if project we're planning interacted
>>with a single "standard" Java API for accessing and manipulating Molecular
>>information?
>>(4) Yes it would,
> focus on chemical entities only... very difficult to make the 'single
> standard API'... Chemistry is too fuzzy, too broad...
> But this does not mean that we can define 'a standard Java API' which
> glues together a few existing projects...

Let's start with the 'glued' interface; if people have plans to write their own implementation, they can do that. But first we must find a common interface... combining the currently available open source projects may become interesting at a later stage.

>>but such a thing doesn't exist! How can we ensure that
>>the new API will be general enough, robust, and useful?
> I don't think we can...

At the moment, I don't think we have time ... hey, these are open source projects, so in future we have the ability to refactor things ...

>>My point is this: would it be useful to tackle the problem of developing a
>>single standard Molecular API separately from the development of a QSAR
>>framework?
> Interesting, but I don't think we can easily come up with the solution to this
> problem... (if it was easy, it was already done...)

Correct. Of course refactoring is much easier than developing functionality, but there are still some really nasty problems out there. I'm optimistic that we can iterate to a common interface and a common API, but this will need time ... it's still hard enough to maintain the currently available projects, because there are still some open performance problems or bad designs in them. And simply 'merging' the functionality is difficult, because it may demand a difficult refactoring.

You surely know the current lines of code:
ChemicalMarkupLanguage: 30285
CDK: 43772
JOELib: 63761
http://pmd.sourceforge.net/scoreboard.html

So, assuming that a good developer reads 1000 LOC/day and understands them and all the dependencies, he will need 30+44+64=138 days (about 4 1/2 months) to understand all the projects. Then he can start with refactoring and testing, so ... hope you get paid for one year producing nothing :-)
So are LOC a good measure for productivity? No, but ... that's another problem, and out of the QSAR project focus.

> Interesting, too... OpenBabel is struggling with atom types in file conversion
> (i.e., I think they still are...)... which indicates only part of the
> problems...

I've discussed this topic with Geoff, but as always ... there are some other things to do. We have exactly the same chemistry 'kernels', but this was checked 'by hand', because we have partially hard-coded assignment algorithms, so it is still suboptimal.

> Jakarta is a much simpler working area... all the results are artificial...
> that is, they don't have to match with nature... so they don't really care on
> how things should be interpreted, only that they work...

I agree ... chemoinformatics is still strongly connected to science, because we still need standards, which are in progress ... CML, 'expert systems', interfaces, ... Unfortunately, as already criticized by Kubinyi (or at least cited by him), the contribution of the pharmaceutical industry to setting a standard could be higher.
So refactoring does not help me publish papers, and it does not help the pharmaceutical industry reduce their data piles. Of course it can be helpful for the future, but financial pressure might be high for them and for us ... so who cares about a good hypothetical standard in the future which facilitates the maintenance?
So let's work with shell scripts, they are fast and have an included copy protection, but that's unrealistic :-)

As already said by Egon ... let's iterate ... step by step ... nothing is excluded ... but also nothing should be included too early ...

Kind regards, Joerg
From: E.L. W. <eg...@sc...> - 2004-05-10 12:26:47
|
On Monday 10 May 2004 00:05, rich apodaca wrote:
> I almost don't want to bring this up because the discussion around the QSAR
> project is pretty involved as it is. But I can't resist...
>
> Egon, your comment about QSAR being a "meta project" hit home with me in a
> big way.
>
> The thought occurs:
>
> Are we (QSAR, CDK, JOELib, Octet, Jumbo) trying to do too much at the same
> time?

Maybe, but I think that things are going fine as they go now... we approach things step by step... I guess we are mostly just glueing existing tools together...

> Here's my impression of the line of discussion that led to where we are now
> (which I believe is a good place, by the way):
>
> (1) Wouldn't it useful to have an open-source project devoted exclusively
> to QSAR with open implementations based on existing projects, a GUI, and
> which makes use of open-source data mining tools (such as weka)?
>
> (2) Yes it would.

Yes, that's a nice summary of the goal of qsar.sf.net :)

> (3) Wouldn't it be even more useful if project we're planning interacted
> with a single "standard" Java API for accessing and manipulating Molecular
> information?
>
> (4) Yes it would,

Mmmm... people have tried that... there are some articles in which they focus on chemical entities only... very difficult to make the 'single standard API'... Chemistry is too fuzzy, too broad...

But this does not mean that we can define 'a standard Java API' which glues together a few existing projects...

> but such a thing doesn't exist! How can we ensure that
> the new API will be general enough, robust, and useful?

I don't think we can...

> How can we meet
> this objective AND minimize refactorings of existing cheminformatics
> projects to accomodate this new API?
>
> This is where we are now, in my view. The problem is, just tackling point
> (4) will be a very big job in itself.

Agreed. And I do not think we should make this our focus... I very much liked your suggestion of splitting up APIs which can be merged for some specific application...:

Very basic Atom API
3DRenderingAPI
2DRenderingAPI

> My point is this: would it be useful to tackle the problem of developing a
> single standard Molecular API separately from the development of a QSAR
> framework?

Interesting, but I don't think we can easily come up with the solution to this problem... (if it was easy, it was already done...)

> Would it be even more helpful to devote a separate project toward
> cheminformatics standardization and/or integration in general? This project
> could start off by trying address our point (4), but could easily expand to
> deal with any number of standardization/integration issues currently
> plaguing cheminformatics research. The focus of the project needn't be
> Java-centric either, although it would probably start out that way.

Interesting, too... OpenBabel is struggling with atom types in file conversion (i.e., I think they still are...)... which indicates only part of the problems...

But, I think doing this for the QSAR field only reduces the problem size, and would make a very interesting test case...

> As a model for such an effort, how about the Apache Jakarta project
> (http://jakarta.apache.org/)? This project nicely ties together a lot of
> technologies and serves as an essential resource for experienced developers
> and newcomers alike. More importantly, experiences in one project often
> lead to new projects that address novel problems.
>
> Any thoughts?

Jakarta is a much simpler working area... all the results are artificial... that is, they don't have to match with nature... so they don't really care on how things should be interpreted, only that they work...

But the resources that such a thing provides are applicable to our situation too... I'm hoping that the qsar.sf.net project can serve such a function to the QSAR field of science...

Egon
--
eg...@sc...
PhD on Molecular Representation in Chemometrics
Nijmegen University
http://www.cac.sci.kun.nl/people/egonw/
GPG: 1024D/D6336BA6
From: rich a. <che...@ya...> - 2004-05-09 22:05:52
|
Hello All,

I almost don't want to bring this up because the discussion around the QSAR project is pretty involved as it is. But I can't resist...

Egon, your comment about QSAR being a "meta project" hit home with me in a big way.

The thought occurs:

Are we (QSAR, CDK, JOELib, Octet, Jumbo) trying to do too much at the same time?

Here's my impression of the line of discussion that led to where we are now (which I believe is a good place, by the way):

(1) Wouldn't it be useful to have an open-source project devoted exclusively to QSAR with open implementations based on existing projects, a GUI, and which makes use of open-source data mining tools (such as weka)?

(2) Yes it would.

(3) Wouldn't it be even more useful if the project we're planning interacted with a single "standard" Java API for accessing and manipulating molecular information?

(4) Yes it would, but such a thing doesn't exist! How can we ensure that the new API will be general enough, robust, and useful? How can we meet this objective AND minimize refactorings of existing cheminformatics projects to accommodate this new API?

This is where we are now, in my view. The problem is, just tackling point (4) will be a very big job in itself.

My point is this: would it be useful to tackle the problem of developing a single standard Molecular API separately from the development of a QSAR framework?

Would it be even more helpful to devote a separate project toward cheminformatics standardization and/or integration in general? This project could start off by trying to address our point (4), but could easily expand to deal with any number of standardization/integration issues currently plaguing cheminformatics research. The focus of the project needn't be Java-centric either, although it would probably start out that way.

As a model for such an effort, how about the Apache Jakarta project (http://jakarta.apache.org/)? This project nicely ties together a lot of technologies and serves as an essential resource for experienced developers and newcomers alike. More importantly, experiences in one project often lead to new projects that address novel problems.

Any thoughts?

cheers,
rich

Egon Willighagen <eg...@sc...> wrote:
> The interfaces and the wrappers can be in Octet, but personally I prefer to do
> this in the common, implementation neutral, QSAR project... The compile scheme
> is identical, the only difference is where people get added as developer.
> I prefer this setup, because it more clearly shows that the QSAR part is sort
> of meta project which tries to connect available OS tools for QSAR research.
> Please comment.
From: <gra...@sy...> - 2004-05-06 13:29:32
|
Folks,

just in case it's worthwhile, crystallographers, for example the CCDC (http://www.ccdc.cam.ac.uk/products/csd/), have had to worry about how to handle complex "bonding" representations programmatically. See if they can add anything to your understanding of not just the range of problems but the possible solutions.

For example, the now pretty ancient 'FCON' file format
http://www.ccdc.cam.ac.uk/support/documentation/quest/volume3/z317.html
covers the representation of a variety of atom and bond properties. (The more object-oriented among you may wish to wash your hands after reading the (FORTRAN) format statements ;-)

Also related is the CIF (Crystallographic Information File) format, see
http://www.iucr.org/iucr-top/cif/home.html

regards,
Graham

Graham Mullier
Chemoinformatics Team Leader,
Chemistry Design Group, Syngenta,
Bracknell, RG42 6EY, UK.
direct line: +44 (0) 1344 414163

-----Original Message-----
From: rich apodaca [mailto:che...@ya...]
Sent: 05 May 2004 19:46
To: joe...@li...; qsa...@li...
Subject: Re: [QSAR-devel] Data storage API summary -> net.sf.qsar.api.data package
From: rich a. <che...@ya...> - 2004-05-05 18:45:52
|
Hello All,

Joerg, I appreciate your concerns regarding the implementation of Octet model-level interfaces by JOELib clients, and now I have a much clearer idea of why you think it's a good idea to first solve that problem before moving on to implement qsar functionality. I now agree with your position on this.

I also think I understand how you'll be developing and why you need developer access to Octet. So within a day or so, I'll add you as a developer.

If you think it would be helpful, I'd be interested in working with you (and any others...?) to implement the Octet interfaces with JOELib classes. As I posted on the cdk-devel list, I believe the main challenge will be in reconciling Octet's bonding model with those of CDK and JOELib, but I think the problem can be solved. Interestingly, this issue goes beyond CDK or JOELib, and arises from fundamentally different views of how bonding should be modeled computationally.

cheers,
rich

"Joerg K. Wegner" <we...@in...> wrote:
> Hi,
>
> > I don't want to spend time getting things to compile that I do not
> > need to use...
>
> So are we here doing things we like, or things we need?
> But what if I need them, that's the point. An Octet-CDK interface will work
> and from the current design there are no problems, but I promise that an
> Octet-JOELib interface will cause problems, as the CDK-JOELib interface
> already does, because we have, as expected, different opinions about our
> requirements, because we have started from different design criteria.
>
> So kick JOELib and you can proceed as you like. If you want to use it, I
> would possibly (not sure at all) prefer Octet interface changes, and this
> means that you must change the current Octet-CDK interface definition. If you
> will not complain about this, I'm fine with your approach. I would prefer to
> be a direct Octet developer to try to reduce complications for Octet, CDK,
> Jumbo and JOELib, because I think this is much more elegant and more
> efficient than resolving every personally preferred complication.
>
> > I thought that that was what we were working on... a common API???
>
> I don't think we already know what we want and need in common. Everyone
> prefers his already available, well-known object structure, that's clear.
> And we have different opinions on how we would like to proceed. Let's stop
> this discussion here, and continue in two weeks.
>
> Again, Rich, I would like to be an Octet developer and would like to check in
> the Octet-JOELib interface implementations directly to Octet or in a new CVS
> directory at the Octet project. So I can start with implementing and filling
> the stubs and change things I will need directly. Then we can discuss these
> implementations before they are taken to Octet, and discuss on the mailing
> list how to proceed and iterate our consensus on the current interface. Or we
> can use the tracking system with upload ability.
>
> > I think that MDL was interested in posting their substructure set
> > which is used to calculate their fingerprint... that could be used as
> > a standard.
>
> Fine, and:
> The expert systems are the main critical step in my opinion. I've also heard
> people saying, 'who is interested in slight differences?'. At least from the
> scientific standpoint I disagree heavily with such comments.
>
> So what does [a;$(#6:#6-N)] express if we use another expert system?
> Surely not a random match, but possibly not what we've expected.
>
> Kind regards, Joerg
From: Egon W. <eg...@sc...> - 2004-05-05 16:25:38
|
On Wednesday 05 May 2004 04:35, rich apodaca wrote:
> One of the problems I came up against in using option (2) was that the CDK
> IO classes create their own instance of org.openscience.cdk.Molecule,
> rather than allow for one to be passed in and operated on. If this could be
> enabled, then clients could go:
>
> BasicCDKMolecule cyclohexane = new BasicCDKMolecule();
> SMILESReader reader = new SMILESReader(new StringReader("C1CCCCC1"));
>
> reader.read(cyclohexane); // or possibly reader.read(chemobject,
> cyclohexane);
>
> // now do something with cyclohexane

Yes, that should be changed. One complicating problem is that we also need the BasicCDKMolecule *and* the CDK Molecule to create Atoms! Or does the BasicCDKMolecule accept a CDK Atom, and convert that internally?

> // we now have cyclohexane that can be used in either Octet or CDK - and
> the beauty is that CDK knew nothing about what just happened!
>
> Of course, there are other strategies for CDK-Octet interoperability that
> could be followed, and variants on the above two, but I think this give a
> feel for what might be possible.

No, this looks fine.

Egon
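The "pass the molecule in" idea discussed here can be illustrated with a small sketch. Everything in it is a hypothetical stand-in: `SimpleMolecule` and `AtomListReader` are not CDK or Octet classes, and the input is a whitespace-separated list of element symbols, not SMILES. It only shows a reader that fills a caller-supplied instance, so that a subclass chosen by the caller comes back fully populated.

```java
import java.util.ArrayList;
import java.util.List;

// Small demo entry point for the sketch below.
class ReaderDemo {
    public static void main(String[] args) {
        // The caller decides which concrete molecule type gets populated.
        HybridStandIn ring = new AtomListReader("C C C C C C").read(new HybridStandIn());
        System.out.println("atoms read: " + ring.atomCount());
    }
}

// Hypothetical stand-in for a toolkit molecule class (NOT org.openscience.cdk.Molecule).
class SimpleMolecule {
    private final List<String> atoms = new ArrayList<>();

    void addAtom(String symbol) {
        atoms.add(symbol);
    }

    int atomCount() {
        return atoms.size();
    }
}

// Hypothetical subclass, playing the role a class like BasicCDKMolecule would play.
class HybridStandIn extends SimpleMolecule {
}

// Hypothetical reader: it never creates its own molecule, it fills the one it is given.
class AtomListReader {
    private final String input;

    AtomListReader(String input) {
        this.input = input;
    }

    // Populates the supplied target and returns it with its original static type.
    <T extends SimpleMolecule> T read(T target) {
        for (String token : input.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                target.addAtom(token);
            }
        }
        return target;
    }
}
```

Because the reader only calls methods declared on the base type, it needs no knowledge of the subclass it is filling, which is the property rich describes.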
From: rich a. <che...@ya...> - 2004-05-05 02:35:26
|
Hello All,

As per a discussion Egon, Joerg, and I had on the qsar-devel list, I have put together a package that illustrates the use of CDK with Octet. It is called "cdktools-0.0.1.zip" and can be downloaded from sourceforge:

http://sourceforge.net/project/showfiles.php?group_id=96108

This code implements two basic ideas:

(1) Conversion back and forth between a CDK Molecule and an Octet Molecule through the use of static methods (CDKKit).

(2) Use of a hybrid Molecule that extends org.openscience.cdk.Molecule _and_ implements the net.sourceforge.octet.molecule.Molecule interface (BasicCDKMolecule). This class is at home either in the CDK environment or in the Octet environment, with no conversion necessary, because Java lets a class extend one class and implement an interface at the same time. This provides a useful "abstraction layer" that shields clients from having to worry about what kind of a Molecule they have a reference to. It also simplifies debugging somewhat because conversion takes place at the time the Molecule is created.

One of the problems I came up against in using option (2) was that the CDK IO classes create their own instance of org.openscience.cdk.Molecule, rather than allow for one to be passed in and operated on. If this could be enabled, then clients could go:

    BasicCDKMolecule cyclohexane = new BasicCDKMolecule();
    SMILESReader reader = new SMILESReader(new StringReader("C1CCCCC1"));

    reader.read(cyclohexane); // or possibly reader.read(chemobject, cyclohexane);

    // now do something with cyclohexane

    // we now have cyclohexane that can be used in either Octet or CDK - and
    // the beauty is that CDK knew nothing about what just happened!

Of course, there are other strategies for CDK-Octet interoperability that could be followed, and variants on the above two, but I think this gives a feel for what might be possible.

Of course, this discussion is equally applicable to JOELib, just replace org.openscience.cdk.Molecule with joelib.molecule.Molecule.

cheers,
rich
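A minimal sketch of idea (2), the hybrid molecule, might look like the following. The class and interface names are hypothetical stand-ins (`ToolkitMolecule` for a toolkit base class such as the CDK Molecule, `NeutralMolecule` for an implementation-neutral interface such as Octet's), and the method names are assumptions made for illustration, not the real CDK or Octet APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Demo: the same object is handed to code written against either type.
class HybridDemo {
    // Code that only knows the toolkit base class.
    static int countViaToolkit(ToolkitMolecule molecule) {
        return molecule.getAtomCount();
    }

    // Code that only knows the neutral interface.
    static int countViaInterface(NeutralMolecule molecule) {
        return molecule.countAtoms();
    }

    public static void main(String[] args) {
        HybridMolecule water = new HybridMolecule();
        water.addAtom("O");
        water.addAtom("H");
        water.addAtom("H");
        System.out.println(countViaToolkit(water) + " / " + countViaInterface(water));
    }
}

// Hypothetical stand-in for a toolkit class such as org.openscience.cdk.Molecule.
class ToolkitMolecule {
    protected final List<String> atoms = new ArrayList<>();

    public void addAtom(String symbol) {
        atoms.add(symbol);
    }

    public int getAtomCount() {
        return atoms.size();
    }
}

// Hypothetical stand-in for an implementation-neutral, read-only interface.
interface NeutralMolecule {
    int countAtoms();

    String getAtomLabel(int index);
}

// The hybrid: one superclass plus one interface, so no conversion step is needed.
class HybridMolecule extends ToolkitMolecule implements NeutralMolecule {
    @Override
    public int countAtoms() {
        return getAtomCount();
    }

    @Override
    public String getAtomLabel(int index) {
        return atoms.get(index);
    }
}
```

The design trade-off is that the hybrid is tied to one toolkit's class hierarchy, whereas the static-conversion approach of idea (1) keeps the two models separate at the cost of copying.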
From: Joerg K. W. <we...@in...> - 2004-05-04 13:18:19
|
Hi Nicolas,

sorry :-)

1. CDK has implemented its own logging implementation (not standard log4j), so this cannot cause the problem, I think.

2. ant is part of JOELib. Simply try:
   joelib> sh build.sh compile
   or
   joelib> sh build.sh dist
   You may have to set the base directory with
   setenv JOELIBDIR yourBase/joelib
   or, if you're a bash user,
   export JOELIBDIR=yourBase/joelib

3. The Java property files
   joelib.properties
   log4j.properties
   should be in the base classpath or in the .jar file. Have you also copied them to your CLASS_PATH directory?

Example in log4j.properties:
   log4j.category.joelib.data.JOEAromaticTyper=ERROR
So only errors are printed. Try:
   log4j.category.joelib.data.JOEAromaticTyper=DEBUG
and DEBUG messages will be shown as well.

Much more screen-filling are things like:
   log4j.category.joelib.smarts.JOESmartsPattern=DEBUG
   log4j.category.joelib.smarts.JOESSMatch=DEBUG
   log4j.category.joelib.smarts.ParseSmart=DEBUG
or
   log4j.category.joelib.molecule.JOEMol=DEBUG

I know that dependencies are not easy, but on the other side there is a huge benefit from using default open-source packages, so be patient and try it from different directions ...

Kind regards, Joerg

> I've tried to update all my classes this morning (log4j and JOELib),
> but the result is the same. I haven't any ANT installation so I can't build
> anything for the moment. If I've really need this new CDK update, I'll
> install ANT correctly. However, what I don't understand is the first time I
> would launch CDK and JOELib Classes, I 've just make a copy of all JOELib
> Classes and put them in my CLASS_PATH directory. I have also put the log4j
> classes in org/apache/ dir. And all ran correctly. Now, I have just put a
> new version of CDK, and I have these errors messages. Moreover, I'm not an
> expert in logging methods, so it's difficult for me to set params like you
> said (switch on/off info/warn/error etc...).
>
> I'll see following my needs what I can realy do with that pb.
>
> Thanks,
> Nico
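To make the `log4j.category.<name>=<LEVEL>` convention from this message a little more concrete, here is a small sketch that reads such entries with plain `java.util.Properties`. It does not use log4j itself; the helper name `levelFor` and the demo text are made up for illustration, and the handling of a comma-separated appender list after the level is an assumption about how such lines are typically written.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Reads log4j-style category entries from properties text (log4j itself is not involved).
class CategoryLevels {
    static final String PREFIX = "log4j.category.";

    // Returns the level configured for a category, or null if there is no entry.
    static String levelFor(String propertiesText, String category) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            // A StringReader should not fail, but load() declares the exception.
            throw new UncheckedIOException(e);
        }
        String value = props.getProperty(PREFIX + category);
        if (value == null) {
            return null;
        }
        // The value may be "LEVEL" or "LEVEL, appenderName, ..."; keep the level only.
        return value.split(",")[0].trim();
    }

    public static void main(String[] args) {
        String text = "log4j.category.joelib.data.JOEAromaticTyper=ERROR\n"
                + "log4j.category.joelib.molecule.JOEMol=DEBUG\n";
        System.out.println(levelFor(text, "joelib.data.JOEAromaticTyper"));
        System.out.println(levelFor(text, "joelib.molecule.JOEMol"));
    }
}
```

Switching a category between ERROR and DEBUG, as suggested above, is then just a matter of editing the value on the matching line of log4j.properties.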
From: Joerg K. W. <we...@in...> - 2004-05-04 08:15:54
|
Hi Nicolas,

> log4j:ERROR No appenders could be found for category (joelib.io.IOType).
> log4j:ERROR Please initialize the log4j system properly.

I think you're not using the supported ant build file mechanism. So check whether you also have the log4j.properties file in your classpath. There you can easily switch info/warn/error on and off and change the output type, e.g. console, file, e-mail, ...

> Before this update all were running correctly; Is somebody has met the same
> pb ?

Not that I remember in the last 2 months, so I can only guess.
Have you tried the distribution builder in ant
   joelib/ant> ant dist
and picked the joelib.jar from the binary distribution?

> thanks,

It was a pleasure.

Kind regards, J. K.
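One quick way to follow the advice above is to ask the JVM whether it can see the property files at all. The sketch below uses only the standard `ClassLoader.getSystemResource` call; the class and method names are made up for illustration. Note the assumption: it checks the system class loader, which approximates but is not necessarily identical to the lookup log4j performs in every deployment.

```java
import java.net.URL;

// Reports where (or whether) configuration files are visible on the classpath.
class ClasspathCheck {
    // Returns the URL of the resource, or null if the system class loader cannot find it.
    static URL locate(String resourceName) {
        return ClassLoader.getSystemResource(resourceName);
    }

    public static void main(String[] args) {
        String[] names = {"log4j.properties", "joelib.properties"};
        for (String name : names) {
            URL url = locate(name);
            System.out.println(name + " -> " + (url == null ? "NOT FOUND on classpath" : url.toString()));
        }
    }
}
```

Running this with the same classpath as the failing program shows whether log4j.properties is missing, or whether an unexpected copy (for example one inside another .jar) is being picked up first.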
From: <nic...@sy...> - 2004-05-04 08:07:43
|
Hi All,

I'm a CDK and JOELib user, and since I updated the CDK classes I get the messages below:

log4j:ERROR No appenders could be found for category (joelib.io.IOType).
log4j:ERROR Please initialize the log4j system properly.

Before this update everything was running correctly. Has anybody met the same problem?

thanks,
Nico
|
From: Joerg K. W. <we...@in...> - 2004-04-29 09:18:42
|
Hi all,

> I don't think anyone can restrict atom pair to match a specific descriptor... it's something like making "windows" a registered trade mark. "Atom pair" is a general term and cannot be restricted to just denote one descriptor.

As I said, it's a question of object-oriented design. I prefer objects that are as general as possible, so if the current AtomPair also supports a distance variable (which need not be 1) and an occurrence count variable, I would like to have only one object. If you think this is overhead for all (including primitive) Atom-BondWhatever-Atom definitions (greater space complexity), I would prefer two different inherited interfaces.

Nikolas, any comments? :-)

> Fortunately, we will have dictionaries... where the descriptor "AtomPair" can be explained...

Which one?

Kind regards, Joerg

--
Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970 Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action. (E. Hemingway)
Never mistake action for meaningful action. (Hugo Kubinyi,2004)
|
From: Joerg K. W. <we...@in...> - 2004-04-29 09:10:11
|
Hi Rich,

I think I must first (re)read the 'Gang of Four' book :-) Until I have done that, I'm not really able to comment reasonably on your pattern questions.

Kind regards, Joerg

> I found the api you mentioned here: http://www.xml-cml.org/cmldom/htmlDoc/index.html (but I still wasn't able to find MoleculeTool). It's very interesting that interface definitions can be produced from an xml specification.
>
> One of the points I considered when developing Octet was whether the core model-level interfaces such as Atom, AtomPair, and Molecule should be immutable or not. I ended up choosing the "Read-Only Object" variant of the "Immutable" design pattern described by Grand.
>
> I won't go into the many reasons behind this decision here other than to mention that once Molecules are created, there was no situation (other than a graphical structure editor) that I could think of where the need to connect/disconnect atoms, change bond orders, or otherwise tinker with the Molecule's internal state would come up post creation. More importantly, the entire process of working with Molecules becomes simpler, more robust, and less error-prone if immutability of model-level objects can be assumed by clients.
>
> What I noticed from the cml interface definitions is that they define read-write access for every property. This means that no assumptions about Molecule, Atom, or Bond immutability can be made. What would you think about an interface that removes the mutator methods? In Octet, I handled the need to get Molecules created in the first place by writing a concrete implementation of the Molecule interface (BasicMolecule) that has mutator methods, and by using the "Builder" pattern for general molecule construction. Of course, clients can always try to guess the concrete implementation of Molecule they encounter, downcast to it, and use the mutator methods, but they really have to consider if this is a good thing to be doing.
> > Another thing that struck me when looking at the CMLMolecule, CMLAtom, and CMLBond interface definitions was that they are essentially interfaces to a data structure. If that data structure were to change for some reason, the resulting refactorings could be somewhat painful. Since I'm a fan of OO programming and encapsulation, I've been conditioned to avoid this kind of situation. What are your thoughts on this? > > One more thing: there are features like 2-D and 3-D coordinates that will be unused in many cheminformatics applications but which are defined in the CML interfaces. This means that overhead will be incurred when it might not be necessary. In Octet/Structure, I've handled this with a "pay-as-you-go" approach. The net.sourceforge.octet.molecule.Molecule interface defines the bare minimal functionality needed to work with molecular graph objects. If 2-D coordinates are desired, then clients can apply the "Decorator" design pattern and use a net.sourceforge.structure.molecule.Molecule2D implementation which is itself a subclass of Molecule with additional 2-D coordinate functionality. What are your thoughts on this kind of approach to defining a Molecule interface? > > cheers, > rich > > Peter Murray-Rust <pm...@ca...> wrote: > [Reply to QSAR list only] > > At 19:03 27/04/2004 -0700, rich apodaca wrote: > >>Thanks for your comments, Peter. I'm especially interested in your >>comments on cml. I've been watching cml at a distance for some time, but I >>didn't realize you had defined interfaces for molecule and atom behavior. >>Could you more precisely point me to where these interfaces are? I visited >>the link you sent but wasn't able to find them. > > > http://wwmm.ch.cam.ac.uk/moin/ChemicalMarkupLanguage > > You will find schema elements for about 100 concepts. > http://wwmm.ch.cam.ac.uk/moin/CmlElements > This list is autogenerated from the schema, so can be updated every time > the schema is modified. 
There are similar lists for > http://wwmm.ch.cam.ac.uk/moin/CmlAttributes > and > http://wwmm.ch.cam.ac.uk/moin/CmlSimpleComplexTypes > > These are then automatically compiled into target code (Java, C++, Python, > F90). In Java this results in a (Java) interface for every Element > including appropriate methods for every attribute. The code obviously > generates Javadoc. Rather than displaying this we distribute the complete > system: > http://wwmm.ch.cam.ac.uk/moin/CmlAtNesc > and ask people to generate their own. > > I now believe that we should try to define interfaces, etc in XML rather > than a target language. I am not a fan of UML (costs money) so somewhat > reluctantly use XMLSchema. > > >>I'm not very familiar with xml, but if I understand correctly, a DOM is >>used to produce an in-memory representation of the structure of an XML >>document. Minimally, > > > Absolutely right > > >>it provides an exact representation of the content of the XML document. If >>I'm correct so far, then I imagine that a CML DOM provides an exact >>representation of the structure of a CML document. >> > > > Yes. > > >>In addition to providing an interface to access the data, what behaviors >>do the CML interfaces define for model-level objects like Atom and >>Molecule? To me, an example of pure Atom data would be an atom label >>property, whereas an example of Atom behavior is the capability to report >>what bonding systems an Atom belongs to and what Atoms it is a neighbor >>of. The choice of behavior is critical: too much functionality and the >>interface becomes bloated and hard to understand - too little and >>developers are frustrated at how much work it takes to do simple things. >>I'm very interested in knowing what the right balance is. >> > > > Fully agreed. That is why I have developed a Tool approach. Every element > has a Tool which adds functionality.Thus Molecule has MoleculeTool. The > tool has behavioural methods like: > MoleculeTool.getMolecularMass(). 
> MoleculeTool.get2DCentroid(). > > I originally wrote these in Java but am now starting to develop a > pseudocode so that the other target languages can be supported. In this > way we get a complete interface for behaviour which - hopefully - will > lead to increasingly consistency of implementation > > >>It sounds like the approach you've taken in using interfaces is similar to >>mine. Like you, I am keenly interested in taking advantage of the rich >>functionality of CDK and JOELib. As a first pass, I've been working on a >>two-way adapter class for CDK. Its definition looks something like this: > > > See MoleculeTool in our distrib. At present it uses CDK as the engine but > could easily use JOELib, etc. I am sure this is the right way to go > > >>public class CDKMolecule extends org.openscience.cdk.Molecule >>implements net.sourceforge.octet.molecule.Molecule >>{ >>// override org.openscience.cdk.Molecule methods where appropriate >> >>// implement net.sourceforge.octet.molecule.Molecule interface >>} > > > Yes. Seems reasonable > > I tend to use a delegation method: > >>public class MoleculeToolImpl implements MoleculeTool { > > // body is implementor dependent > org.openscience.cdk.Molecule theMolecule; // used for computation > > } > > >>The advantage here is that a CDKMolecule can be used from within either >>CDK or Octet without the need for a conversion step. I plan to do the same >>thing for joelib.molecule.JOEMol. > > > Yes. > I am now starting to use workflow tools (Kepler, Taverna - see sf) and > these require small atomic units (in the CS sense) It is important that > their interface to the external world is implementation independent > > >>In particular, it would be helpful to use the file format read/write >>capabilities of CDK. The problem I'm currently facing is that IO classes >>such as org.openscience.cdk.io.MDLReader provide their own instance of >>org.openscience.cdk.Molecule that is created during a call to read(). 
If >>this method used an instance of org.openscience.cdk.Molecule passed into >>the read() method instead, then I could just pass in my CDKMolecule, and >>the reader would not be the wiser. What would be the consequences of >>modifying the IO classes to allow for this? >> >>With regard to directly supporting CML, I'm interested in trying my hand >>at it with Octet. The Octet model for bonding is somewhat different from >>the other Java cheminformatics packages I've seen in that it directly >>supports multicenter, multielectron bonding arrangements. > > > So does CML. A bond can be between 2, 3, 4 or many atoms. It can also be > between atoms and bonds or bonds and bonds. To be fair we haven't > implemented this > > >>So, the bonding arrangement of ferrocene, benzyne, borane clusters, or the >>homotropylium cation are handled exactly the same way as those of hexane. >>This implementation is based on a paper by Dietz (JCICS 1995, 35, 787). >>What are your thoughts on CML providing the syntax necessary to represent >>these "non-traditional" kinds of bonding arrangements? > > > See if it works!!! > > > > P. > > > Peter Murray-Rust > Unilever Centre for Molecular Informatics > Chemistry Department, Cambridge University > Lensfield Road, CAMBRIDGE, CB2 1EW, UK > Tel: +44-1223-763069 > > > > ------------------------------------------------------- > This SF.Net email is sponsored by: Oracle 10g > Get certified on the hottest thing ever to hit the market... Oracle 10g. > Take an Oracle 10g class now, and we'll give you the exam FREE. > http://ads.osdn.com/?ad_id=3149&alloc_id=8166&op=click > _______________________________________________ > Qsar-devel mailing list > Qsa...@li... > https://lists.sourceforge.net/lists/listinfo/qsar-devel > > > --------------------------------- > Do you Yahoo!? > Win a $20,000 Career Makeover at Yahoo! HotJobs -- Dipl. Chem. Joerg K. Wegner Center of Bioinformatics Tuebingen (ZBIT) Department of Computer Architecture Univ. 
Tuebingen, Sand 1, D-72076 Tuebingen, Germany Phone: (+49/0) 7071 29 78970 Fax: (+49/0) 7071 29 5091 E-Mail: mailto:we...@in... WWW: http://www-ra.informatik.uni-tuebingen.de -- Never mistake motion for action. (E. Hemingway) Never mistake action for meaningful action. (Hugo Kubinyi,2004) |
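The read-only interface plus "Builder" idea discussed in the quoted message can be illustrated with a small sketch. All names here (ReadOnlyMolecule, MoleculeBuilder) are hypothetical and are not taken from Octet, CML, CDK, or JOELib; the molecular graph is reduced to labels and neighbor lists purely for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical read-only view: no mutator methods are exposed to clients.
interface ReadOnlyMolecule {
    int countAtoms();
    String atomLabel(int index);
    List<Integer> neighbors(int index);
}

// Hypothetical builder: the only place where the graph can be changed.
final class MoleculeBuilder {
    private final List<String> labels = new ArrayList<>();
    private final List<List<Integer>> adjacency = new ArrayList<>();

    // Adds an atom and returns its index.
    int addAtom(String label) {
        labels.add(label);
        adjacency.add(new ArrayList<>());
        return labels.size() - 1;
    }

    // Connects two existing, distinct atoms.
    MoleculeBuilder connect(int a, int b) {
        if (a == b || a < 0 || b < 0 || a >= labels.size() || b >= labels.size()) {
            throw new IllegalArgumentException("invalid atom pair " + a + "-" + b);
        }
        adjacency.get(a).add(b);
        adjacency.get(b).add(a);
        return this;
    }

    // Returns a snapshot; later builder calls do not affect it.
    ReadOnlyMolecule build() {
        final List<String> frozenLabels =
                Collections.unmodifiableList(new ArrayList<>(labels));
        final List<List<Integer>> frozenAdjacency = new ArrayList<>();
        for (List<Integer> n : adjacency) {
            frozenAdjacency.add(Collections.unmodifiableList(new ArrayList<>(n)));
        }
        return new ReadOnlyMolecule() {
            @Override public int countAtoms() { return frozenLabels.size(); }
            @Override public String atomLabel(int index) { return frozenLabels.get(index); }
            @Override public List<Integer> neighbors(int index) { return frozenAdjacency.get(index); }
        };
    }
}

final class ReadOnlyMoleculeDemo {
    public static void main(String[] args) {
        MoleculeBuilder builder = new MoleculeBuilder();
        int c1 = builder.addAtom("C");
        int c2 = builder.addAtom("C");
        int o = builder.addAtom("O");
        ReadOnlyMolecule ethanol = builder.connect(c1, c2).connect(c2, o).build();
        builder.addAtom("N"); // does not change the molecule built above
        System.out.println(ethanol.countAtoms() + " atoms, neighbors of C2: "
                + ethanol.neighbors(c2));
    }
}
```

Because build() hands out defensive copies wrapped as unmodifiable lists, a client cannot change the molecule even by holding on to the builder, which is roughly the guarantee the "Read-Only Object" variant is after.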
From: E.L. W. <eg...@sc...> - 2004-04-29 07:01:23
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wednesday 28 April 2004 11:05, Joerg K. Wegner wrote:
> > Do you consider AtomPair a "descriptor"? I noticed it is present in JOELib. Octet also has an AtomPair interface. However, in Octet, AtomPair simply represents an association between two atoms (no electrons involved - that happens through BondingSystem). To find all the atoms that are associated in a Molecule, use Molecule.iterateAtomPairs(). Your point about hashCode() is well-taken.
>
> Not i'm taking it as a descriptor. IT IS a descriptor in medicinal chemistry community. So i've no problem with this definition, but there are already some papers out there which uses the same name !
> You can generalize, if you add distance and occurrence variables. But here also a atom can be a more abstract label than a simple atom, e.g. a mix of special atom properties.
> So i would prefer a redesign or at least another name.

I don't think anyone can restrict atom pair to match a specific descriptor... it's something like making "windows" a registered trade mark. "Atom pair" is a general term and cannot be restricted to just denote one descriptor.

Fortunately, we will have dictionaries... where the descriptor "AtomPair" can be explained...

Egon

- --
eg...@sc...
PhD on Molecular Representation in Chemometrics
Nijmegen University
http://www.cac.sci.kun.nl/people/egonw/
GPG: 1024D/D6336BA6
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (SunOS)

iD8DBQFAkKg4d9R8I9Yza6YRAnBEAJ45WFMUlkBrYxAII5wR/om+hD/obgCguvDt
Rt5MxFFsSquEA8lwT1dkKzU=
=o4ja
-----END PGP SIGNATURE-----
|
From: Joerg K. W. <we...@in...> - 2004-04-28 09:03:54
|
Hello Rich,

> What are the advantages of an Octet Atom inheriting from Node? The definition of the Atom interface is very short and contains mainly methods for identifying neighboring atoms and bonding systems. Octet doesn't use Bond, but rather BondingSystem, which allows for the connection of any number of Atoms using any number of electrons so that structures like ferrocene and transition metal complexes can be handled the same way as any purely organic molecule.

I also have some (internal) code for maximum common substructure search, and in such cases we work on abstract nodes. Of course, these can use typical atom labels, but also a lot of other stuff. So for general graph algorithms I would prefer a more abstract interface.

> I thought about being able to store keys, properties, etc. in Atom, Molecule. However, since the design of Octet is based on the implementation of interfaces, doing so puts a burden on the implementor to provide this functionality. Marking atoms, bonding systems, and molecules could just as easily be done externally to those interfaces using a vector of visited atoms, for example. In fact, Octet uses this approach in, for example, the DepthFirstTraverser class. Is there something else I'm missing?

Let's iterate ...

> Do you consider AtomPair a "descriptor"? I noticed it is present in JOELib. Octet also has an AtomPair interface. However, in Octet, AtomPair simply represents an association between two atoms (no electrons involved - that happens through BondingSystem). To find all the atoms that are associated in a Molecule, use Molecule.iterateAtomPairs(). Your point about hashCode() is well-taken.

It's not that I'm taking it as a descriptor; it IS a descriptor in the medicinal chemistry community. So I have no problem with this definition, but there are already some papers out there which use the same name! You can generalize if you add distance and occurrence variables. But here, too, an atom can be a more abstract label than a simple atom, e.g. a mix of special atom properties. So I would prefer a redesign or at least another name.

> Your point about copy() clone() is also well-taken. However, this can't be forced through the interface definition but can be incorporated into the reference implementations.

I know.

> Can you give me an example of the readAsString() method and its advantages in handling corrupted file entries compared to just throwing an exception with the existing MoleculeReader methods? You're right about these methods needing to declare an exception.

For example: suppose you start reading a file normally with a LineReader. When a corrupted line is found, either everything is skipped, or you allow special skipping rules, which are difficult to handle. Reading a full molecule entry from start tag to end tag is bad for runtime, but pretty easy to implement. So for convertSkip.sh I simply load every molecule as a String and then parse it into a molecule. Corrupted entries can be saved in a skip file without having to know the error and without cancelling the read. Of course an error/warning is written, so if a company converts 100000 molecules, they can convert them all at once and look at the corrupted entries in the skip file later. Otherwise you force them to correct every error by hand. I promise, if you do that they will flame you.

> I'm currently working on implementing some of the other features you asked about such as a descriptor framework, substructure/similarity searching. However, these features are independent of the interface definitions for the key model-level objects (Molecule, BondingSystem, and AtomPair). I've had a look at JOELib's descriptor framework, and it looks like a flexible way implement descriptor functionality.

Yes, of course. See joelib/desc, joelib/math/similarity and the helper classes in joelib/util. Take what you need.

> Can you explain what a "descriptor IO helper class" is and why it is necessary?

There are three relevant main classes:

1. DescriptorFactory (factory pattern): loads the calculation class for a descriptor by its name. There is no caching at the moment, but clear() methods are already available.

2. DescriptorHelper: allows you to load/calculate descriptors. E.g. if I'm interested in 'PSA' I say
   descFromMol(mol,"PolarSurfaceArea", true);
   If the descriptor already exists, it is just returned. If not, it is calculated and immediately added to the molecule (caching). This is important for expensive matrix or array descriptors, to avoid calculating them several times.

3. ResultFactory (parser factory): if the descriptor type was not already assigned in the loading process (this happens only partially for CML, because it stores SOME types explicitly), we need to map the unparsed descriptors to a data type, e.g. int, double, int matrix, boolean array, atom pair, ... The mapping can be defined by name or regular expression in joelib/desc/data/plain/extKnownResults.txt, so we are also able to load external descriptors from other programs, like MolConnZ, Petra, MOE, Dragon, whatever ... The only things we need are the mapping and Java reflection!

Special IO functionality, e.g. CML properties such as the delimiter, is stored directly in each descriptor result class.

Kind regards, Joerg

--
Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970 Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action. (E. Hemingway)
Never mistake action for meaningful action. (Hugo Kubinyi,2004)
|
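The "load each entry as a String, then parse, and collect failures in a skip file" approach described in the message above might look roughly like the following sketch. This is not the code behind convertSkip.sh: the class name SkipFileDemo, the toy entry format (a name line followed by an atom-count line), and the parser are assumptions made only for illustration. The only real-world detail used is the "$$$$" line that terminates each record in MDL SD files.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not JOELib's reader or its convertSkip implementation.
final class SkipFileDemo {

    // Parsed values plus the raw text of every entry that could not be parsed.
    static final class Result {
        final List<Integer> parsed = new ArrayList<>();
        final List<String> skipped = new ArrayList<>();
    }

    // Step 1: split the input into whole entries without interpreting them.
    static List<String> readEntriesAsStrings(String fileContent) {
        List<String> entries = new ArrayList<>();
        for (String raw : fileContent.split("\\$\\$\\$\\$")) {
            String entry = raw.trim();
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    // Step 2: toy parser. Line 1 is a name, line 2 must be an integer atom count.
    static int parseAtomCount(String entry) {
        String[] lines = entry.split("\\R");
        if (lines.length < 2) {
            throw new IllegalArgumentException("missing atom count line");
        }
        // NumberFormatException is a subclass of IllegalArgumentException.
        return Integer.parseInt(lines[1].trim());
    }

    // Step 3: parse every entry; corrupted ones are reported and kept verbatim.
    static Result convert(String fileContent) {
        Result result = new Result();
        for (String entry : readEntriesAsStrings(fileContent)) {
            try {
                result.parsed.add(parseAtomCount(entry));
            } catch (IllegalArgumentException e) {
                System.err.println("warning: skipping entry (" + e.getMessage() + ")");
                result.skipped.add(entry);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String content = "ethanol\n9\n$$$$\nbroken\nnot-a-number\n$$$$\nwater\n3\n$$$$\n";
        Result result = convert(content);
        System.out.println("parsed atom counts: " + result.parsed);
        System.out.println("entries for the skip file: " + result.skipped.size());
    }
}
```

The point of the design is that one bad entry costs a warning and a line in the skip list rather than aborting the whole conversion run.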
From: Joerg K. W. <we...@in...> - 2004-04-26 07:21:30
|
Hi,

I've had a short look, and I'm missing some functionality in Octet:

- I would prefer Node and Edge objects as the base of Atom and Bond.
- I would prefer general NodeKey, EdgeKey, MoleculeKey and RingKey objects for labelling the attributed molecular graph.

  Both things are required for general graph algorithms. For the keys a factory pattern could/should be used, especially for assigning default labels. This avoids calculating e.g. a ring search twice by using:
  if(!mol.hasKey(myRingSearchKey))mol.calculateRingSearch()
- The AtomPair is ambiguous: there exists a descriptor with an additional distance parameter, whereas here the distance is always one. Hashing is important here.
- Force Copy/Clone/Hash methods.
- The reader should provide readAsString and readToMoleculeObject, so that we can catch corrupted file entries. Don't ask me why there are so many corrupted entries, but they exist.
- Add a MoleculeIOException to read/write to catch these corrupted entries; this will enable us to write skip files.
- A general SubstructureSearch object would be fine, and also a UniqueSubstructureSearch object or a transformer object.
- General descriptor objects are missing completely. They could be handled by the hashed MoleculeKey objects, but we may have to distinguish between keys which can handle only one object (hashed) and keys which can handle multiple objects, so we need a GeneralPropertyHandler which accepts single and multiple entries by key.
- For descriptors, IO helper classes are required, which have read(IOType) and write(IOType).

Kind regards, Joerg

--
Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970 Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action. (E. Hemingway)
Never mistake action for meaningful action. (Hugo Kubinyi,2004)
|
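The hasKey() check suggested in the list above, which avoids running an expensive calculation such as a ring search twice, could be sketched as follows. KeyedGraph and getOrCompute are hypothetical names, not part of Octet or JOELib, and the "ring search" is a stand-in that only counts how often it is called.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical keyed property store attached to a molecular graph.
final class KeyedGraph {
    private final Map<String, Object> properties = new HashMap<>();

    boolean hasKey(String key) {
        return properties.containsKey(key);
    }

    // Returns the value stored under key, computing and storing it on first use only.
    @SuppressWarnings("unchecked")
    <T> T getOrCompute(String key, Supplier<T> calculation) {
        if (!hasKey(key)) {
            properties.put(key, calculation.get());
        }
        return (T) properties.get(key);
    }
}

final class KeyedGraphDemo {
    static int ringSearchRuns = 0;

    // Stand-in for an expensive ring search; returns a made-up ring count.
    static Integer expensiveRingSearch() {
        ringSearchRuns++;
        return 2;
    }

    public static void main(String[] args) {
        KeyedGraph mol = new KeyedGraph();
        Integer first = mol.getOrCompute("ringSearch", KeyedGraphDemo::expensiveRingSearch);
        Integer second = mol.getOrCompute("ringSearch", KeyedGraphDemo::expensiveRingSearch);
        System.out.println("rings=" + first + "/" + second + ", searches run=" + ringSearchRuns);
    }
}
```

A string key is used here only to keep the sketch short; typed key objects created by a factory, as proposed in the message, would additionally give default labels and type safety.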
From: Joerg K. W. <we...@in...> - 2004-04-24 09:37:32
|
Hi Thijs,

please use the developer mailing list, so that I don't have to answer questions multiple times, and a more descriptive subject than 'bug'. You can also use the bug-tracking system.

Good point :-) Actually, I've never tested this class, because the rotamer generation sources are still not completely ported from the stalled OELib. The original C++ class is ctransform.cpp. So, if you're familiar with C++ and pointer references, you can have a look for yourself at OELib or its successor OpenBabel.

I will have a look, but I'm currently not working actively on the rotamer generation. I think it should be:

System.arraycopy(in_xyz, 3 * 2, y, 0, 3);
System.arraycopy(in_xyz, 3 * 3, z, 0, 3);

because the API documentation, which I've now added, says:

in_xyz A length 12 array containing 4 coordinates (0,0,0), (1,0,0), (0,1,0) and (0,0,1) from the initial reference frame transformed into the final reference frame

If this does not work, a closer look at the matrix transformation is required.

Kind regards, Joerg

> Hi,
>
> I was playing around a bit with your resources and found what i think is a bug.
>
> class:joelib.math.JOECoordTrans
> method: public boolean setup(double[] in_xyz)
> line 509
>
> public boolean setup(double[] in_xyz)
> {
> //Copy coordinate array
> double[] xyz = new double[12];
> double[] y = new double[4];
>
> //;xyz[3*2];
> double[] z = new double[4];
>
> //xyz[3*3];
> int i;
>
> //for (i=0 ; i<12 ; i++) xyz[i] = in_xyz[i];
> System.arraycopy(in_xyz, 0, xyz, 0, 12);
> System.arraycopy(in_xyz, 3 * 2, y, 0, 4);
> System.arraycopy(in_xyz, 3 * 3, z, 0, 4);
>
> Here it goes wrong, since element (3*3) + 4 = 12 does not exist in xyz, and i get an arrayindexoutofbounds exception
>
> The solution is not obvious to me,
>
> Hopefully you can help me out.
>
> Thijs Beuming

--
Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970 Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action. (E. Hemingway)
Never mistake action for meaningful action. (Hugo Kubinyi,2004)
|
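The array bounds problem from this thread can be reproduced in isolation. The sketch below is not the JOECoordTrans code; FrameAxesDemo and its methods are hypothetical and only show why copying 4 values from offset 9 of a 12-element array fails, while copying 3 values per point (as proposed in the reply) stays within bounds.

```java
import java.util.Arrays;

// Hypothetical stand-alone demo; not the actual joelib.math.JOECoordTrans code.
final class FrameAxesDemo {

    // Copies the 3 coordinates of point pointIndex (0..3) out of a flat
    // 12-element array laid out as x0,y0,z0, x1,y1,z1, ...
    static double[] point(double[] inXyz, int pointIndex) {
        if (inXyz.length != 12) {
            throw new IllegalArgumentException("expected 12 values, got " + inXyz.length);
        }
        double[] p = new double[3];
        System.arraycopy(inXyz, 3 * pointIndex, p, 0, 3);
        return p;
    }

    // Reproduces the reported failure: copying 4 values starting at offset 9
    // would need index 12, which a 12-element array does not have.
    static boolean copyingFourFromOffsetNineFails(double[] inXyz) {
        try {
            System.arraycopy(inXyz, 3 * 3, new double[4], 0, 4);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // The four reference points named in the API documentation.
        double[] inXyz = {0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1};
        System.out.println("point 2: " + Arrays.toString(point(inXyz, 2)));
        System.out.println("point 3: " + Arrays.toString(point(inXyz, 3)));
        System.out.println("length-4 copy fails: " + copyingFourFromOffsetNineFails(inXyz));
    }
}
```

Whether the rest of the transformation setup expects 3-element or 4-element vectors is a separate question that, as the reply notes, needs a closer look at the matrix code.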
From: Joerg K. W. <we...@in...> - 2004-04-24 09:18:35
|
Hi Nicolas,

please use the developer mailing list, so that I don't have to answer questions multiple times.

A more general way would be to get an 'atom property' object through the general descriptor interface! The problem with your approach is that you are not automatically allowed to overwrite partial charges in the molecule. So the array you obtain contains the correct partial charges, but the molecule still contains the loaded ones, e.g. from MOPAC, Sybyl, ... mol.automaticPartialCharge() checks this flag. You can set it with mol.setAutomaticPartialCharge(boolean flag). If overwriting is not allowed, the molecule is cloned before the partial charges are calculated.

You've forgotten the initialization:

JOEPhModel.instance().assignSeedPartialCharge(useMol);

O.k., now the general way:

String propertyName ="Gasteiger_Marsili";
AtomProperties properties;
if (JOEHelper.hasInterface(tmpPropResult, "AtomProperties"))
{
    properties = (AtomProperties) tmpPropResult;
}
else
{
    logger.error("Property '" + propertyName + "' must be an atom type for calculating " + DESC_KEY + " but it's " + tmpPropResult.getClass().getName() + ".");
    return null;
}

Then you can access the properties by using:

properties.getDoubleValue(atomIndex)

Kind regards, Joerg

> I'm a JOELib and CDK user and I've tried to use the available Gasteiger calculation classes, but I met some difficulties;
> First I have convert my CDK Molecule in a JOEMol molecule:
>
> Convertor conv = new Convertor();
> joelib.molecule.JOEMol joemolecule = conv.convert(molecule);
>
> (Where "molecule" is a valid CDK molecule). Second I have created a GasteigerMarsili object using:
>
> GasteigerMarsili gm = new GasteigerMarsili ();
> double[] atomprop = gm.getDoubleAtomProperties(joemolecule);
>
> But when I ask:
>
> System.out.println(joemolecule.getAtom(i).getPartialCharge());
>
> I have only 0.0 results.
>
> I have also try with:
>
> JOEGastChrg jgc = new JOEGastChrg();
> jgc.assignPartialCharges(joemolecule);
>
> But the result is the same. What do you think I forget in my source ? could you please comment just a little bit more about these methods ?
>
> Thanks a lot for your response,
>
> Nicolas Job

--
Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970 Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action. (E. Hemingway)
Never mistake action for meaningful action. (Hugo Kubinyi,2004)
|
From: Joerg K. W. <we...@in...> - 2004-04-24 09:06:45
|
Hi all,

I agree. A common molecule interface would be fantastic. I will have a look at your code.

Kind regards, Joerg

--
Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970 Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action. (E. Hemingway)
Never mistake action for meaningful action. (Hugo Kubinyi,2004)
|
From: Christoph S. <c.s...@un...> - 2004-04-22 08:10:20
|
Rich,

it was interesting to learn about your projects, and of course your point about interfaces is a valid one. We have had this discussion for CDK again and again, and we are likely to move to core class (Atom, Bond, Molecule, etc.) interfaces in the future. With now three Chemoinformatics Java frameworks on Sourceforge and this interesting discussion about the QSAR project, we have a decent chance to agree on a well-defined interface for those core classes.

Cheers, Chris

--
Dr. rer. nat. habil. Christoph Steinbeck (c.s...@un...)
Groupleader Junior Research Group for Applied Bioinformatics
Cologne University BioInformatics Center (http://www.cubic.uni-koeln.de)
Zülpicher Str. 47, 50674 Cologne
Tel: +49(0)221-470-7426 Fax: +49 (0) 221-470-7786

What is man but that lofty spirit - that sense of enterprise. ... Kirk, "I, Mudd," stardate 4513.3..

rich apodaca wrote:
> I agree that a common method for the representation of molecular objects is critical for the development of portable and verifiable cheminformatics protocols.
>
> A core principle of object-oriented design is that designs are most reusable when you program to interfaces, not implementations.
>
> I would propose that any discussion of a QSAR framework should take into consideration the need to first define Java interfaces for core objects such as Atom and Molecule. The QSAR framework would be useful to the greatest number of developers if each developer is free to provide their own implementation of the core interfaces that will work without modification in the QSAR framework. Defining these interfaces means that the irreducible core functionality of Molecule, Atom, etc. with which the framework will need to work must be decided on.
>
> The advantage of this approach is true design reuse. Because the QSAR framework only knows about Java interfaces, all a developer needs to do to use all of the functionality of the framework is to provide an implementation of those interfaces. Of course, reference implementations should be provided by the framework as well.
>
> I've taken this approach in a cheminformatics framework called "Octet" (http://octet.sourceforge.net) and in a 2-D molecular visualization framework called "Structure" (http://structure.sourceforge.net). The approach in these frameworks differs significantly from both JOELib and CDK in that a developer is never required to use my reference implementations of Molecule or Atom.
>
> For example, it is possible to provide performance-optimized implementations of these interfaces that would be suitable for large numbers of molecules, or the rapid construction of molecules. The framework only knows about interfaces, and this is the key to code reuse.
>
> I would be willing to provide any code and/or experiences from these projects to the development of a QSAR framework.
>
> cheers,
> rich
|
From: rich a. <che...@ya...> - 2004-04-21 15:03:32
|
I agree that a common method for the representation of molecular objects is critical for the development of portable and verifiable cheminformatics protocols.

A core principle of object-oriented design is that designs are most reusable when you program to interfaces, not implementations.

I would propose that any discussion of a QSAR framework should take into consideration the need to first define Java interfaces for core objects such as Atom and Molecule. The QSAR framework would be useful to the greatest number of developers if each developer is free to provide their own implementation of the core interfaces that will work without modification in the QSAR framework. Defining these interfaces means that the irreducible core functionality of Molecule, Atom, etc. with which the framework will need to work must be decided on.

The advantage of this approach is true design reuse. Because the QSAR framework only knows about Java interfaces, all a developer needs to do to use all of the functionality of the framework is to provide an implementation of those interfaces. Of course, reference implementations should be provided by the framework as well.

I've taken this approach in a cheminformatics framework called "Octet" (http://octet.sourceforge.net) and in a 2-D molecular visualization framework called "Structure" (http://structure.sourceforge.net). The approach in these frameworks differs significantly from both JOELib and CDK in that a developer is never required to use my reference implementations of Molecule or Atom.

For example, it is possible to provide performance-optimized implementations of these interfaces that would be suitable for large numbers of molecules, or the rapid construction of molecules. The framework only knows about interfaces, and this is the key to code reuse.

I would be willing to provide any code and/or experiences from these projects to the development of a QSAR framework.

cheers,
rich

Peter Murray-Rust <pm...@ca...> wrote:

C. The OpenSource community has made some small, useful steps in this direction. They now wish to pool their efforts and produce a single point of contact for their own development and to show to the world. This does NOT necessarily mean a single program. IMO it is much more likely to mean an infrastructure on which a variety of operations can be carried out ("glueware"?). They wish to create a project at SF which leads to:

- active constructive discussion
- agreed representation of objects
  * molecules, atoms, fragments, etc.
  * descriptors
  * properties
- creation, cataloguing, annotating, high-quality information objects:
  * dictionaries
  * properties (e.g. of atoms)
  * datasets
- creation, cataloguing, annotation of algorithms related to QSAR
  * chemical perception
  * statistics, optimisation, etc
- creation of software:
  * as toolkit components
  * as demonstrators of the *quality* of the system
|
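The argument above, that framework code should only know about interfaces so that any implementation works unmodified, can be made concrete with a small sketch. CoreMolecule, ListMolecule, PackedMolecule and HeavyAtomCounter are hypothetical names chosen for this illustration; they are not the interfaces of Octet, CDK, or JOELib.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical minimal core interface that the "framework" programs against.
interface CoreMolecule {
    int countAtoms();
    String atomSymbol(int index);
}

// Reference-style implementation backed by a list.
final class ListMolecule implements CoreMolecule {
    private final List<String> symbols;

    ListMolecule(List<String> symbols) {
        this.symbols = new ArrayList<>(symbols);
    }

    @Override public int countAtoms() { return symbols.size(); }
    @Override public String atomSymbol(int index) { return symbols.get(index); }
}

// Alternative implementation that stores symbols in an array built from "C,O,H".
final class PackedMolecule implements CoreMolecule {
    private final String[] symbols;

    PackedMolecule(String commaSeparated) {
        this.symbols = commaSeparated.split(",");
    }

    @Override public int countAtoms() { return symbols.length; }
    @Override public String atomSymbol(int index) { return symbols[index]; }
}

// "Framework" code: a trivial descriptor that depends only on the interface.
final class HeavyAtomCounter {
    static int count(CoreMolecule molecule) {
        int heavy = 0;
        for (int i = 0; i < molecule.countAtoms(); i++) {
            if (!"H".equals(molecule.atomSymbol(i))) {
                heavy++;
            }
        }
        return heavy;
    }
}

final class CoreInterfaceDemo {
    public static void main(String[] args) {
        CoreMolecule a = new ListMolecule(Arrays.asList("C", "O", "H", "H"));
        CoreMolecule b = new PackedMolecule("C,O,H,H");
        System.out.println(HeavyAtomCounter.count(a) + " " + HeavyAtomCounter.count(b));
    }
}
```

HeavyAtomCounter never names a concrete class, so a performance-optimized implementation could be swapped in without touching it, which is the kind of reuse the message argues for.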
From: Joerg K. W. <we...@in...> - 2004-04-20 14:53:29
|
Hi Nikolas, Hi all, I've added the untested Molecule-Weka interfaces to JOELib and have modified the Matrix class as suggested by Nikolas. Good work, thanks! I've added incomplete API documentation to MoleculesDescriptorMatrix. Nikolas, if you can find time (no hurry), can you please add analogous documentation to the MoleculeCache interface and the MolInstancesCache class? Please use the current CVS versions, or I can send you the current versions via e-mail. Nikolas, if you like, you can join the developer mailing lists: http://sourceforge.net/projects/qsar/ (see lists) http://lists.sourceforge.net/lists/listinfo/joelib-devel Here is the full changelog: - joelib.algo.datamining.weka.* Weka-specific Instances with molecular data. The first step towards general similarity/dissimilarity metrics for the already introduced joelib.math.similarity interfaces. - joelib.desc.data.* Introduced MoleculeCache interface to store data. - joelib.util.MDMatrixCache renamed to MoleculeDataCacheHolder Molecule data holder for matrices AND Weka-Molecule-instances. - Corrected usage of kekulization for MDL SD export and visualization using JOEMol.kekulize() - Slightly updated DocBook-tutorial (Web-version). Kind regards, Joerg P.S. Nikolas: The Smoothed AP has not been published yet, so I've commented out the lines of code which caused a compiler error. -- Dipl. Chem. Joerg K. Wegner Center of Bioinformatics Tuebingen (ZBIT) Department of Computer Architecture Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany Phone: (+49/0) 7071 29 78970 Fax: (+49/0) 7071 29 5091 E-Mail: mailto:we...@in... WWW: http://www-ra.informatik.uni-tuebingen.de -- Never mistake motion for action. (E. Hemingway) Never mistake action for meaningful action. (Hugo Kubinyi,2004) |
From: Joerg W. <we...@in...> - 2004-04-19 17:18:48
|
Hi, > toolkits. So, for example, if there is a routine that extracts descriptors > from a molecule it would be useful to be able to load the molecule and > print out a list of the descriptors - maybe from a commandline in the first > instance or from a simple test program. This is pretty much done already! Try JOELib with: sh statistic.sh yourMoleculeFileWithDesc and sh descSelection.sh --help. Normalization and calculation are also available, with calculateDescriptors.sh and normalize.sh, as are filters (e.g. hydrogen donors), with: sh convert.sh --help (see the example given and/or the tutorial). Kind regards, Joerg On Mon, 19 Apr 2004, Peter Murray-Rust wrote: > At 09:42 19/04/2004 +0200, E.L. Willighagen wrote: > >-----BEGIN PGP SIGNED MESSAGE----- > >Hash: SHA1 > > > > > >Good morning all, > > > >Last Friday I requested a new project, QSAR - > >http://www.sf.net/projects/qsar, > >which got approved. This project's goal is to guide the development of the > >discussion and software development which has been discussed on the > >cdk...@li... and joe...@li... list last week. > > Well done! > > please add petermr to the list > > >I would like to stress that this new SF project does not intend to reinvent > >the wheel at all, but is aimed at: > > > >- - writing down a requirement analysis > >- - developing a GUI that uses CDK, JChemPaint, Jmol, JOELib (alphabetical > > order) and other projects for QSAR model building > > Can I add "collecting and annotating resources - especially data and > ontologies" > > >(More details are available in the thread and on the website soon.) > > > >Furthermore, keep in mind that though I set up this project, it is not my > >intent to 'lead' the project such that my vote counts more than others. > > Thanks! It is very difficult to set up projects with people you have never > met! Henry and I did this for XML-DEV. The site (or list) owner is often > the servant of the community, spending large amounts of time in boring work > (editing and moving pages, mending mail lists, etc.) 
for which they get few > public thanks! > > >I've set up a mailing list (has still to be approved) to which can be > >subscribed at this page: > > > >http://lists.sourceforge.net/lists/listinfo/qsar-devel > > > >If you'd like to join (which I hope), please send me your SF account name, so > >that I can add you to the project. I would also like to repeat Peter's > >suggestions to join the IRC chat channel (for newbies: XChat is a very good > >IRC client which runs on most platforms) at #cdk on the irc.freenode.net > >server. (Note, that when joining a channel the '#' is part of the name.) > > IRC is extremely good if you have a topic which is perhaps too undefined to > have a full mailing list discussion or casual queries. It also gives people > a good idea of how the project is actually managed and serviced. > > What I would find useful in the first few days/weeks is: > - a realistic definition of the scope. We have to be really ruthless here! > The suggestions already made are more than we have resources to manage. > - a catalog of components > - demonstrations of what already works. I'd like to be able to run *demos* > of CDK and/or JOELib without having to read the APIs. I find both of them > are large and navigation is not trivial. This is true of all libraries and > toolkits. So, for example, if there is a routine that extracts descriptors > from a molecule it would be useful to be able to load the molecule and > print out a list of the descriptors - maybe from a commandline in the first > instance or from a simple test program. > > Good luck > > P. > > > Peter Murray-Rust > Unilever Centre for Molecular Informatics > Chemistry Department, Cambridge University > Lensfield Road, CAMBRIDGE, CB2 1EW, UK > Tel: +44-1223-763069 
Dipl. Chem. Joerg K. Wegner Center of Bioinformatics Tuebingen (ZBIT) Department of Computer Architecture Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany Phone: (+49/0) 7071 29 78970 Fax: (+49/0) 7071 29 5091 E-Mail: mailto:we...@in... WWW: http://www-ra.informatik.uni-tuebingen.de -- Never mistake motion for action. (E. Hemingway) Never mistake action for meaningful action. (Hugo Kubinyi,2004) |
From: Peter Murray-R. <pm...@ca...> - 2004-04-19 16:59:36
|
At 09:42 19/04/2004 +0200, E.L. Willighagen wrote: >-----BEGIN PGP SIGNED MESSAGE----- >Hash: SHA1 > > >Good morning all, > >Last Friday I requested a new project, QSAR - >http://www.sf.net/projects/qsar, >which got approved. This project's goal is to guide the development of the >discussion and software development which has been discussed on the >cdk...@li... and joe...@li... list last week. Well done! please add petermr to the list >I would like to stress that this new SF project does not intend to reinvent >the wheel at all, but is aimed at: > >- - writing down a requirement analysis >- - developing a GUI that uses CDK, JChemPaint, Jmol, JOELib (alphabetical > order) and other projects for QSAR model building Can I add "collecting and annotating resources - especially data and ontologies" >(More details are available in the thread and on the website soon.) > >Furthermore, keep in mind that though I set up this project, it is not my >intent to 'lead' the project such that my vote counts more than others. Thanks! It is very difficult to set up projects with people you have never met! Henry and I did this for XML-DEV. The site (or list) owner is often the servant of the community, spending large amounts of time in boring work (editing and moving pages, mending mail lists, etc.) for which they get few public thanks! >I've set up a mailing list (has still to be approved) to which can be >subscribed at this page: > >http://lists.sourceforge.net/lists/listinfo/qsar-devel > >If you'd like to join (which I hope), please send me your SF account name, so >that I can add you to the project. I would also like to repeat Peter's >suggestions to join the IRC chat channel (for newbies: XChat is a very good >IRC client which runs on most platforms) at #cdk on the irc.freenode.net >server. (Note, that when joining a channel the '#' is part of the name.) IRC is extremely good if you have a topic which is perhaps too undefined to have a full mailing list discussion or casual queries. 
It also gives people a good idea of how the project is actually managed and serviced. What I would find useful in the first few days/weeks is: - a realistic definition of the scope. We have to be really ruthless here! The suggestions already made are more than we have resources to manage. - a catalog of components - demonstrations of what already works. I'd like to be able to run *demos* of CDK and/or JOELib without having to read the APIs. I find both of them are large and navigation is not trivial. This is true of all libraries and toolkits. So, for example, if there is a routine that extracts descriptors from a molecule it would be useful to be able to load the molecule and print out a list of the descriptors - maybe from a commandline in the first instance or from a simple test program. Good luck P. Peter Murray-Rust Unilever Centre for Molecular Informatics Chemistry Department, Cambridge University Lensfield Road, CAMBRIDGE, CB2 1EW, UK Tel: +44-1223-763069 |
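The kind of demo asked for in the message above (load a molecule, print a list of its descriptors from the command line) might look roughly like the following Java sketch. It is deliberately toolkit-independent and makes several assumptions: a molecule is reduced to a list of element symbols passed as command-line arguments, and the three descriptor names and the `DescriptorDemo` class are hypothetical. None of this is the CDK or JOELib API.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToIntFunction;

final class DescriptorDemo {
    // Hypothetical descriptor registry: descriptor name -> function over element symbols.
    static final Map<String, ToIntFunction<List<String>>> DESCRIPTORS = new LinkedHashMap<>();
    static {
        DESCRIPTORS.put("atomCount", List::size);
        DESCRIPTORS.put("heavyAtomCount",
                atoms -> (int) atoms.stream().filter(s -> !s.equals("H")).count());
        DESCRIPTORS.put("heteroAtomCount",
                atoms -> (int) atoms.stream()
                        .filter(s -> !s.equals("H") && !s.equals("C")).count());
    }

    // Runs every registered descriptor and keeps the registration order.
    static Map<String, Integer> calculateAll(List<String> atoms) {
        Map<String, Integer> result = new LinkedHashMap<>();
        DESCRIPTORS.forEach((name, fn) -> result.put(name, fn.applyAsInt(atoms)));
        return result;
    }

    public static void main(String[] args) {
        // Default input: the nine atoms of ethanol (C2H6O), used when no arguments are given.
        List<String> atoms = args.length > 0
                ? Arrays.asList(args)
                : Arrays.asList("C", "C", "O", "H", "H", "H", "H", "H", "H");
        calculateAll(atoms).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Run without arguments, this prints `atomCount = 9`, `heavyAtomCount = 3` and `heteroAtomCount = 1`. A real demo would replace the symbol list with a molecule read through the chosen toolkit's own file readers.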