From: Christoph S. <c.s...@un...> - 2005-05-24 14:19:57
Dear All,

I ran a few preliminary CDK performance tests. The questions to answer were:

- How large are CDK molecules in memory?
- What happens to their size when algorithms using the flags are run on them?
- Would a java.util.BitSet be a good replacement for the current flag implementation?

The current, very crude answers are:

- The average molecule from NMRShiftDB (27 atoms) takes about 13 kB.
- One with only 19 atoms takes about 8 kB.
- Running the fingerprint on all of these takes up an additional 1.5 kB for each molecule, but that might also be other objects created during fingerprinting. I'll do better profiling there later.
- A BitSet is no improvement when we stay with 5 flag bits as we do now, but will perform much better if we use more flags.

Test results below.

Cheers,

Chris

Reading 1000 molecules from d:/documents/Work/Documentation/nmrshiftdb.sdf/nmrshiftdb.sdf.
Average atom count: 27
AtomContainers from SDF file size in bytes: 13481
AtomContainers from SDF file construction time for 1000 instances: 2584 ms
*********Next Test*********
Creating 1000 molecules...
Average atom count: 19
Molecules from Factory size in bytes: 8522
Molecules from Factory construction time for 1000 instances: 621 ms
*********Next Test*********
Creating 1000 molecules...
Average atom count: 19
Now creating fingerprints
Molecules from Factory after fingerprint creation size in bytes: 10837
Molecules from Factory after fingerprint creation construction time for 1000 instances: 7200 ms
*********Next Test*********
Boolean Array[5] size in bytes: 41
Boolean Array[5] construction time for 1000 instances: 10 ms
*********Next Test*********
Boolean Array[100] size in bytes: 128
Boolean Array[100] construction time for 1000 instances: 20 ms
*********Next Test*********
Boolean Array[100] all true size in bytes: 128
Boolean Array[100] all true construction time for 1000 instances: 10 ms
*********Next Test*********
BitSet size in bytes: 40
BitSet construction time for 1000 instances: 20 ms
*********Next Test*********
Example bitset: {5}
Filled BitSet with one bit at position 5 size in bytes: 40
Filled BitSet with one bit at position 5 construction time for 1000 instances: 20 ms
*********Next Test*********
Example bitset: {0, 1, 2, 3, 4}
Filled BitSet with all true bits up to position 5 size in bytes: 40
Filled BitSet with all true bits up to position 5 construction time for 1000 instances: 0 ms
*********Next Test*********
Example bitset: {100}
Filled BitSet with one bit at position 100 size in bytes: 48
Filled BitSet with one bit at position 100 construction time for 1000 instances: 20 ms
*********Next Test*********
Example bitset: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99}
Filled BitSet with all true bits up to position 100 size in bytes: 48
Filled BitSet with all true bits up to position 100 construction time for 1000 instances: 20 ms
*********Next Test*********
Example bitset: {1000}
Filled BitSet with one bit at position 1000 size in bytes: 160
Filled BitSet with one bit at position 1000 construction time for 1000 instances: 20 ms
*********Next Test*********
Example bitset: {10000}
Filled BitSet with one bit at position 10000 size in bytes: 1288
Filled BitSet with one bit at position 10000 construction time for 1000 instances: 40 ms

-- 
Priv. Doz. Dr. Christoph Steinbeck (c.s...@un...)
Head of the Research Group for Molecular Informatics
Cologne University BioInformatics Center (http://www.cubic.uni-koeln.de)
Zülpicher Str. 47, 50674 Cologne
Tel: +49(0)221-470-7426 Fax: +49 (0) 221-470-7786

What is man but that lofty spirit - that sense of enterprise.
... Kirk, "I, Mudd," stardate 4513.3..
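
A note on reading the BitSet figures in the test output above: the growth in size looks consistent with a BitSet that keeps its bits in 64-bit words, so the payload only grows with the index of the highest bit set. The following is a minimal sketch under that assumption, not CDK code and not the code used for the measurements. The class name BitSetPayloadSketch and the helpers wordsNeeded and payloadBytes are hypothetical; only java.util.BitSet and its set() and size() methods come from the JDK.

```java
import java.util.BitSet;

// Hypothetical sketch, not part of CDK: estimates the payload of a bit set
// that stores its bits in 64-bit words (an assumption about the storage layout).
class BitSetPayloadSketch {

    // Number of 64-bit words needed to hold a bit at index highestBit.
    static int wordsNeeded(int highestBit) {
        return (highestBit >> 6) + 1;
    }

    // Estimated payload in bytes (8 bytes per word), excluding any object overhead.
    static int payloadBytes(int highestBit) {
        return wordsNeeded(highestBit) * 8;
    }

    public static void main(String[] args) {
        // Same bit positions as in the test output above.
        int[] positions = {5, 100, 1000, 10000};
        for (int pos : positions) {
            BitSet flags = new BitSet();
            flags.set(pos);
            // size() reports the bits of space the BitSet currently uses;
            // the exact value depends on the JDK implementation.
            System.out.println("bit " + pos
                    + ": estimated payload " + payloadBytes(pos) + " bytes"
                    + ", BitSet.size() = " + flags.size() + " bits");
        }
    }
}
```

Under this assumption the estimated payloads are 8, 16, 128 and 1256 bytes. The differences between them match the differences between the measured sizes (40, 48, 160 and 1288 bytes), which would leave a constant 32 bytes of overhead per BitSet in these measurements. It would also explain why a BitSet costs the same for any number of flags up to 64.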