From: Egon W. <el...@ca...> - 2003-02-26 16:11:27
|
Hi all,

while working on implementing RFC #8, I encountered this problem:

the interface Atom (see RFC) cannot require fields available in ChemObject
like flag[] and pointer[], because interfaces in general cannot do that.
But all cdk core entity classes extend ChemObject and thus have these
fields. Many other classes use these fields for calculations. The use of
direct access to the fields instead of get and set methods is assumed to
speed up the algorithms.

Suggestions are needed on how to solve this.

As I cannot come up with something better, I propose:

- Use get/set methods instead: get/setFlag, get/setPointer

Egon |
From: Egon W. <el...@ca...> - 2003-02-26 16:14:28
|
On Wednesday 26 February 2003 4:11 pm, Egon Willighagen wrote:
> while working on implementing RFC #8, I encountered this problem:
>
> the interface Atom (see RFC) cannot require fields available in ChemObject
> like flag[] and pointer[], because interfaces in general cannot do that.
> But all cdk core entity classes extend ChemObject and thus have these
> fields. Many other classes use these fields for calculations. The use of
> direct access to the fields instead of get and set methods is assumed to
> speed up the algorithms.
>
> Suggestions are needed on how to solve this.
>
> As I cannot come up with something better, I propose:
>
> - Use get/set methods instead: get/setFlag, get/setPointer

I've just realized that casting the Atom to a ChemObject works as well, but
that is ugly...

Egon |
From: Egon W. <el...@ca...> - 2003-02-26 16:17:49
|
On Wednesday 26 February 2003 4:11 pm, Egon Willighagen wrote:
> Suggestions are needed on how to solve this.
>
> As I cannot come up with something better, I propose:
>
> - Use get/set methods instead: get/setFlag, get/setPointer

BTW, I forgot to mention why this option works: these methods can be added to
the Atom interface ;)

E. |
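The accessor proposal can be sketched roughly as follows. This is only an illustration, not the RFC #8 code: the thread names get/setFlag but not their parameters, so the signatures `getFlag(int)` and `setFlag(int, boolean)` are assumptions, and `Atom`, `ChemObject` and `BasicAtom` here are simplified stand-ins rather than the real cdk classes. The point it shows is that a Java interface cannot declare an instance field such as flag[], but it can declare the accessor methods, and a class that inherits public implementations from its superclass satisfies the interface without a cast.

```java
// Simplified stand-ins for the types discussed in the thread (hypothetical, not the cdk sources).
interface Atom {
    boolean getFlag(int index);              // assumed signature
    void setFlag(int index, boolean value);  // assumed signature
}

class ChemObject {
    // The thread states the field is a boolean[100]; it is private here so that
    // callers have to go through the accessors.
    private final boolean[] flag = new boolean[100];

    public boolean getFlag(int index) {
        return flag[index];
    }

    public void setFlag(int index, boolean value) {
        flag[index] = value;
    }
}

// The inherited public methods of ChemObject fulfil the Atom interface,
// so code written against Atom never needs to cast to ChemObject.
class BasicAtom extends ChemObject implements Atom {
}
```

Code that only knows the `Atom` type can then call `atom.setFlag(3, true)` directly. The same pattern would apply to get/setPointer, which is left out here because the thread does not say what type pointer[] holds.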
From: Miguel <mt...@mt...> - 2003-02-26 17:07:42
|
> the interface Atom (see RFC) cannot require fields available in
> ChemObject like flag[] and pointer[], because interfaces in general
> cannot do that. But all cdk core entity classes extend ChemObject and
> thus have these fields. Many other classes use these fields for
> calculations. The use of direct access to the fields instead of
> get and set methods is assumed to speed up the algorithms.
>
> Suggestions are needed on how to solve this.
>
> As I cannot come up with something better, I propose:
>
> - Use get/set methods instead: get/setFlag, get/setPointer

Personally, I wouldn't worry about performance in this case. I strongly
suspect that the overhead of the function call is insignificant compared
to the operations surrounding the setting/testing of flags.

However, I must confess that I do have an ulterior motive; in this case it
is not the speed that bothers me, but rather the wasted space.

flag is currently defined as boolean[100]. That means that it is 100
*bytes* (not 100 bits as one might hope). And this boolean[] is allocated
for every atom.

So my ulterior motive is to change the interface to get/setFlag so that we
can subsequently consider alternate implementations:
 - a single long where we set/test individual bits
 - a BitSet or a boolean[] which is allocated only if/when we need it
 - a set of parallel BitSets on the side
 - others?

Miguel

--------------------------------------------------
Miguel Howard                  mi...@ho...
c/Peña Primera 11-13 esc dcha 6B
37002 Salamanca
España Spain
--------------------------------------------------
telefono casa 923 27 10 82
         movil 650 52 54 58
--------------------------------------------------
To call from the US dial       9:00 am Pacific US =
home 011 34 923 27 10 82       12:00 noon Eastern US =
cell 011 34 650 52 54 58       6:00 pm Spain
-------------------------------------------------- |
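As a rough illustration of the first alternative in the list above (a single long with individual bits), here is a minimal sketch. The class name `LongFlags` and the method signatures are assumptions made for this example; nothing like it is quoted from the cdk sources. One trade-off is visible immediately: a long holds 64 flags in 8 bytes, which is fewer than the 100 slots of the current boolean[100], so this variant only fits if no more than 64 flag indices are actually in use.

```java
// Hypothetical sketch: flags packed into one long (64 bits, 8 bytes per object).
class LongFlags {
    private long bits; // bit i set means flag i is true

    boolean getFlag(int index) {
        checkIndex(index);
        return (bits & (1L << index)) != 0;
    }

    void setFlag(int index, boolean value) {
        checkIndex(index);
        if (value) {
            bits |= (1L << index);   // switch the bit on
        } else {
            bits &= ~(1L << index);  // switch the bit off
        }
    }

    // A long has room for indices 0..63 only.
    private static void checkIndex(int index) {
        if (index < 0 || index >= 64) {
            throw new IndexOutOfBoundsException("flag index out of range: " + index);
        }
    }
}
```

Because callers only see getFlag/setFlag, swapping this in for a boolean[] (or swapping it out again later) would not touch calling code, which is the flexibility the message argues for.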
From: Edgar L. <ed...@up...> - 2003-02-27 08:21:17
|
Hi,

>>the interface Atom (see RFC) cannot require fields available in
>>ChemObject like flag[] and pointer[], because interfaces in general
>>cannot do that. But all cdk core entity classes extend ChemObject and
>>thus have these fields. Many other classes use these fields for
>>calculations. The use of direct access to the fields instead of
>>get and set methods is assumed to speed up the algorithms.
>>
>>Suggestions are needed on how to solve this.
>>
>>As I cannot come up with something better, I propose:
>>
>>- Use get/set methods instead: get/setFlag, get/setPointer
>>
>Personally, I wouldn't worry about performance in this case. I strongly
>suspect that the overhead of the function call is insignificant compared
>to the operations surrounding the setting/testing of flags.
>
>However, I must confess that I do have an ulterior motive; in this case it
>is not the speed that bothers me, but rather the wasted space.
>
>flag is currently defined as boolean[100]. That means that it is 100
>*bytes* (not 100 bits as one might hope). And this boolean[] is allocated
>for every atom.
>
>So my ulterior motive is to change the interface to get/setFlag so that we
>can subsequently consider alternate implementations:
> - a single long where we set/test individual bits
> - a BitSet or a boolean[] which is allocated only if/when we need it
> - a set of parallel BitSets on the side
> - others?
>
I agree with these ideas (typing by an interface is generally good,
and of course hiding the internals from the user via the ever so popular
get/set mechanism gives us all the flexibility we might need).

Looking at the memory consumption is also important, because 100 bytes
per atom is part of the problems I run into when using cdk
functionality with medium-sized proteins (10,000 atoms).

So my suggestion is:
1.) get/setFlag or get/setPointer is fine, and to answer any questions
    regarding speed:
2.) some method like:
        public boolean[] getFlags()
    which returns an array like the one currently used. (Of course this
    would need the associated setFlags(boolean[] array) method.)
3.) Switching to some more efficient way to store the flags, like Miguel
    suggested, and
4.) Allocating this array only when it is needed for the first time, which
    can also easily be handled by a wrapper method.

This would provide the same functionality as there is now and could
improve some of the current weaknesses.

Best regards

Edgar

--
Edgar Luttmann

University of Paderborn, Germany
Department of organic chemistry
office: J6.302
Warburger Str. 100      phone: (+49) 5251 60-2498
33098 Paderborn         eMail: ed...@up... |
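Points 2.) to 4.) above could be combined along the following lines. This is a sketch under assumptions, not a proposal for the actual cdk code: the class name `LazyFlags`, the `isAllocated()` helper and the choice of `java.util.BitSet` as the compact storage are all made up for illustration. Storage is created only when a flag is first set to true, and `getFlags()`/`setFlags(boolean[])` offer the bulk array access for code that wants to work on a plain boolean[]. With the figures from the thread, 10,000 atoms at 100 bytes each come to about 1,000,000 bytes of flag data before any array overhead, all of which is avoided for atoms whose flags are never touched.

```java
import java.util.BitSet;

// Hypothetical sketch: compact, lazily allocated flag storage behind accessors.
class LazyFlags {
    static final int MAX_FLAGS = 100; // size of the current boolean[] according to the thread

    private BitSet bits; // stays null until a flag is first set to true

    boolean getFlag(int index) {
        checkIndex(index);
        return bits != null && bits.get(index);
    }

    void setFlag(int index, boolean value) {
        checkIndex(index);
        if (bits == null) {
            if (!value) {
                return; // clearing a flag that was never set needs no storage
            }
            bits = new BitSet(MAX_FLAGS);
        }
        bits.set(index, value);
    }

    // Bulk read: returns a fresh boolean[] copy, so callers cannot alter internal state.
    boolean[] getFlags() {
        boolean[] copy = new boolean[MAX_FLAGS];
        if (bits != null) {
            for (int i = bits.nextSetBit(0); i >= 0; i = bits.nextSetBit(i + 1)) {
                copy[i] = true;
            }
        }
        return copy;
    }

    // Bulk write: replaces all flags with the given values.
    void setFlags(boolean[] values) {
        if (values.length > MAX_FLAGS) {
            throw new IllegalArgumentException("at most " + MAX_FLAGS + " flags are supported");
        }
        bits = null;
        for (int i = 0; i < values.length; i++) {
            if (values[i]) {
                setFlag(i, true);
            }
        }
    }

    boolean isAllocated() {
        return bits != null;
    }

    private static void checkIndex(int index) {
        if (index < 0 || index >= MAX_FLAGS) {
            throw new IndexOutOfBoundsException("flag index out of range: " + index);
        }
    }
}
```

Note that `getFlags()` copies on every call, so an inner loop that cares about speed should fetch the array once and reuse it rather than calling it per flag test.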
From: Joerg K. W. <we...@in...> - 2003-02-27 08:42:05
|
Hello,

Memory issue: see the last mails from Miguel and Edgar.
I agree completely!

Something like a singleton helper class could work:

AtomFlagHelper with
    addAtomFlag("AtomFlag234", new Integer(234))
    Integer getFlagIndex(String flag);

In Atom:
    boolean hasAtomFlag(String flag)    // e.g. flag = "AtomFlag234"
    {
        Integer integer = AtomFlagHelper.instance().getFlagIndex(flag);
        if (integer != null) return bitSet.get(integer.intValue());
        else return false;
    }

Performance issue:
http://sourceforge.net/mailarchive/message.php?msg_id=1986607

Regards, Joerg

--
Dipl. Chem. Joerg K. Wegner
Univ. Tuebingen, Computer Architecture, Sand 1, D-72076 Tuebingen, Germany
Tel. (+49/0) 7071 29 78970, Fax (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de |
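The helper idea from this message could look roughly like the following when written out in full. The names `AtomFlagHelper`, `addAtomFlag`, `getFlagIndex`, `instance()` and `hasAtomFlag` come from the message; everything else is an assumption for illustration, including the `FlaggedAtom` class, the `setAtomFlag` method, the use of a `HashMap`, the synchronization, and the `int` parameter in place of `new Integer(234)`. It is written in present-day Java (generics, autoboxing), which was not available when the thread was written.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Singleton registry that maps flag names to bit positions.
final class AtomFlagHelper {
    private static final AtomFlagHelper INSTANCE = new AtomFlagHelper();

    private final Map<String, Integer> indexByName = new HashMap<>();

    private AtomFlagHelper() {
    }

    static AtomFlagHelper instance() {
        return INSTANCE;
    }

    synchronized void addAtomFlag(String name, int index) {
        indexByName.put(name, index);
    }

    // Returns null when no flag with this name has been registered.
    synchronized Integer getFlagIndex(String name) {
        return indexByName.get(name);
    }
}

// Hypothetical atom-like class that stores its flags in a BitSet.
class FlaggedAtom {
    private final BitSet bitSet = new BitSet();

    boolean hasAtomFlag(String flag) {
        Integer index = AtomFlagHelper.instance().getFlagIndex(flag);
        return index != null && bitSet.get(index);
    }

    // Assumed companion to hasAtomFlag; not mentioned in the message.
    void setAtomFlag(String flag, boolean value) {
        Integer index = AtomFlagHelper.instance().getFlagIndex(flag);
        if (index == null) {
            throw new IllegalArgumentException("unknown flag: " + flag);
        }
        bitSet.set(index, value);
    }
}
```

A name lookup on every flag test costs a map access, so code in a tight loop would probably resolve the index once via `getFlagIndex` and then work with the integer.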
From: Christoph S. <c.s...@un...> - 2003-02-27 15:01:21
|
Miguel wrote:
>>the interface Atom (see RFC) cannot require fields available in
>>ChemObject like flag[] and pointer[], because interfaces in general
>>cannot do that. But all cdk core entity classes extend ChemObject and
>>thus have these fields. Many other classes use these fields for
>>calculations. The use of direct access to the fields instead of
>>get and set methods is assumed to speed up the algorithms.
>>
>>Suggestions are needed on how to solve this.
>>
>>As I cannot come up with something better, I propose:
>>
>>- Use get/set methods instead: get/setFlag, get/setPointer
>
> Personally, I wouldn't worry about performance in this case. I strongly
> suspect that the overhead of the function call is insignificant compared
> to the operations surrounding the setting/testing of flags.
>
> However, I must confess that I do have an ulterior motive; in this case it
> is not the speed that bothers me, but rather the wasted space.
>
> flag is currently defined as boolean[100]. That means that it is 100
> *bytes* (not 100 bits as one might hope). And this boolean[] is allocated
> for every atom.
>
> So my ulterior motive is to change the interface to get/setFlag so that we
> can subsequently consider alternate implementations:
>  - a single long where we set/test individual bits
>  - a BitSet or a boolean[] which is allocated only if/when we need it
>  - a set of parallel BitSets on the side
>  - others?

These suggestions are well founded and OK for me.

Cheers,

Chris

--
Dr. Christoph Steinbeck (e-mail: c.s...@un...)
Groupleader Junior Research Group for Applied Bioinformatics
Cologne University BioInformatics Center (http://www.cubic.uni-koeln.de)
Zülpicher Str. 47, 50674 Cologne
Tel: +49(0)221-470-7426
Fax: +49 (0) 221-470-5092

What is man but that lofty spirit - that sense of enterprise.
... Kirk, "I, Mudd," stardate 4513.3.. |