[Cdk-devel] Molecule objects have size problems?

I've been noticing that the Molecule data structure - as available on the 
trunk - leaves a larger memory footprint than it did in the July snapshot 
(last stable release: 14 July 2006). 
I have 337 SMILES strings available, which I parsed into Molecule objects and 
then serialized to byte arrays (using an ObjectOutputStream). I did that using 
(1) the 20060714 release and (2) the latest trunk sources. 
About 2/3 of the molecules increased in size; some of them are now 3 times as 
big. My personal favorite is 
c12-c3:c(-C(-c4:c(:c(-[H]):c5:c(:c:2:4):c(-C(=O)-c2:c(:c(:c(:c(-[H]):c-5:2)-[H])-[H])-[H]):c(-[H]):c:1-[H])-[H])=O):c(:c(:c(:c:3-[H])-[H])-[H])-[H]
whose byte array size increased from 88 kB (using the July code) to 426 kB 
(using cdk-svn-20061125.jar). 
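
The measurement itself can be sketched roughly as follows. This is a minimal, 
CDK-free illustration and not the actual test class: the class name 
SerializedSize and its method of() are placeholders, and plain lists of 
strings stand in for the Molecule objects, so that the snippet runs without 
any CDK jar on the classpath.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

// Hypothetical helper: reports how many bytes an object occupies once serialized.
public class SerializedSize {

    /** Serializes obj into an in-memory byte array and returns its length in bytes. */
    public static int of(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } // closing the stream flushes any buffered data into 'bytes'
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        // Stand-ins for two object graphs of different size
        // (in the real test these would be Molecule objects).
        ArrayList<String> small = new ArrayList<>();
        small.add("c1ccccc1");

        ArrayList<String> large = new ArrayList<>(small);
        for (int i = 0; i < 100; i++) {
            large.add("atom" + i);
        }

        System.out.println("small: " + of(small) + " bytes");
        System.out.println("large: " + of(large) + " bytes");
    }
}
```

Running the same helper once against each CDK jar, with the same input 
molecule, gives the two numbers being compared.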

Does anyone have a clue why that has changed so dramatically? In KNIME 
(http://www.knime.org) we use serialization to avoid holding objects in 
memory, i.e. for us "size does matter".

Thanks
  Bernd

PS: If requested, I can send the little test class and the data around.