
Possible Memory Leak

2006-09-07
2013-06-03
  • Hi guys,

    another memory problem using the BTree.

    This is my test code:

        public void testWrite() throws Exception{
            BaseRecordManager recordManager = new BaseRecordManager( DATABASE );
            recordManager.disableTransactions();
            long btreeId=recordManager.getNamedObject(INDEX_NAME);
            BTree tree=null;
            if ( btreeId != 0 ) {
                tree = BTree.load( recordManager, btreeId );
                System.out.println( "Reloaded existing BTree with " + tree.size());
            } else {
                tree = BTree.createInstance( recordManager, new LongComparator());
                recordManager.setNamedObject( INDEX_NAME, tree.getRecid() );
                System.out.println( "Created a new empty BTree" );
            }
            Random r= new Random(System.currentTimeMillis());
            for (int i = 0; i < 100000; i++) {
                Long id = new Long(r.nextLong());
                Object values= randomBytes(i, 10000);
                tree.insert(id, values, false);
                if (i%1000==0){
                    System.gc();
                    // approximate free heap in MB: unallocated headroom plus free space in the current heap
                    long freeMem = (Runtime.getRuntime().maxMemory() - Runtime.getRuntime().totalMemory()
                            + Runtime.getRuntime().freeMemory()) / (1024 * 1024);
                    System.out.println("Free " + freeMem + " MB");
                    recordManager.commit();
                }
            }
            recordManager.commit();
            recordManager.close();

        }

        private Object randomBytes(int i, int length) {
            byte[] bytes= new byte[length];
            new Random(i).nextBytes(bytes);
            return bytes;
        }

    If I execute this code the output will look like this:
    Do persist 0:[#.........] (253M free/254M max, 90% free )
    Do persist 1000:[#.........] (239M free/254M max, 90% free )
    Do persist 2000:[##........] (227M free/254M max, 80% free )
    Do persist 3000:[##........] (215M free/254M max, 80% free )
    Do persist 4000:[##........] (203M free/254M max, 80% free )
    Do persist 5000:[###.......] (198M free/254M max, 70% free )
    Do persist 6000:[###.......] (191M free/254M max, 70% free )
    Do persist 7000:[###.......] (184M free/254M max, 70% free )
    Do persist 8000:[####......] (175M free/254M max, 60% free )
    Do persist 9000:[####......] (173M free/254M max, 60% free )
    Do persist 10000:[####......] (167M free/254M max, 60% free )
    Do persist 11000:[####......] (165M free/254M max, 60% free )
    Do persist 12000:[####......] (162M free/254M max, 60% free )
    Do persist 13000:[####......] (159M free/254M max, 60% free )
    Do persist 14000:[####......] (158M free/254M max, 60% free )
    Do persist 15000:[####......] (154M free/254M max, 60% free )
    Do persist 16000:[#####.....] (149M free/254M max, 50% free )
    Do persist 17000:[#####.....] (150M free/254M max, 50% free )
    Do persist 18000:[#####.....] (146M free/254M max, 50% free )
    Do persist 19000:[#####.....] (143M free/254M max, 50% free )

    and in my regular program it will throw an OutOfMemoryError at some point.
    It also gets really, really slow at that point.

    Is this expected behaviour or a memory leak?

    Here is the Heap Dump at the time of program completion:
    Space      Count      Class Signature
    ---------- ---------- ----------------------
        214952        602 Ljava/lang/Class;
        167720        435 [I
        114200        600 [B
         96040       1152 [C
         47936        789 [S
         27960       1165 Ljava/lang/String;
          8512        221 [Ljava/lang/Object;
          5760        180 Ljava/util/LinkedHashMap$Entry;
          5424        226 Ljava/util/Hashtable$Entry;
          5360         51 [Ljava/util/HashMap$Entry;
          4248        177 Ljava/util/HashMap$Entry;
          2560         80 Ljava/util/concurrent/ConcurrentHashMap$Segment;
          1944         15 [Ljava/util/Hashtable$Entry;
          1920         80 Ljava/util/concurrent/locks/ReentrantLock$NonfairSync;
          1840         46 Ljava/util/HashMap;
          1664         26 Ljava/lang/reflect/Constructor;
          1472         46 Ljava/lang/ref/SoftReference;
          1400         31 [Ljava/lang/String;
          1360         80 [Ljava/util/concurrent/ConcurrentHashMap$HashEntry;
          1360          5 [Z
          1056         44 Ljava/security/Provider$ServiceKey;
    ---------- ---------- ----------------------

    Thanks,

    Alexander

     
    • Kevin Day
      2006-09-14

      Hey gents - I don't have time to deal with this right now, so someone else will have to take a look at it.

      Off hand, this sounds like a caching issue, but the heap dump doesn't show any classes from jdbm, so it's kind of hard to say definitively.

      Alexander - if you set the jdbm.disableTransactions.autoCommitInterval property to "50", does it make any difference in your run?
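
      If you want to try that, here is a minimal sketch of one way to pass the property, assuming it is honored through the same Properties object that RecordManagerFactory accepts (the mechanism used for RecordManagerOptions.DISABLE_TRANSACTIONS later in this thread); whether the interval is read from these options rather than from a system property is an assumption:

          // Sketch only: property name as quoted above, wiring is assumed.
          Properties options = new Properties();
          options.put(RecordManagerOptions.DISABLE_TRANSACTIONS, "true");
          options.put("jdbm.disableTransactions.autoCommitInterval", "50");
          RecordManager recordManager =
              RecordManagerFactory.createRecordManager(DATABASE, options);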

      - K

       
    • Thanks for your reply. When I tried your solution, I had to get the source from CVS and build it, which actually fixed the problem.
      The current code does not seem to have the memory problems.

      But another question: what about performance? If I compare just serializing the bytes, it takes about 1 second for 8000 bytes; with jdbm it takes about 30 seconds. I don't think this is expected behaviour, or are other people seeing the same performance? What could I do to speed this up?
      Regards,

      Alexander

       
      • Kevin Day
        2006-09-19

        Not sure about the performance - that sounds pretty bad - we probably ought to run diagnostics on sample code that exhibits this to see where the delay is.  Go ahead and post your exact sample code and I'll run it on my system here.

        In terms of serialization, are you using the built-in serialization method?

        I strongly recommend implementing your own serializer (implementing jdbm.helper.Serializer) for your business objects, instead of relying on default serialization.

        But the performance delta you are talking about is pretty bad (30 seconds to serialize an 8k byte array is crazy... heck, 1 second to serialize that array is crazy).

        - K

         
        • Kevin, thanks for the quick reply!
          I'm sorry, I went to the Oktoberfest yesterday, so I'm not in the best condition. What I meant is 8K bytes * 1000: that's 30 seconds for jdbm and 1 second for regular serialization.
          The serialization shouldn't matter, since I use regular Java serialization in both cases.

          I have now created two sample programs, where strangely enough I now only see a difference of about 7x (my regular Java object serialization takes 2 seconds and jdbm takes 15 seconds). This is still probably a lot more than expected.

          The JDBM sample is as follows:
              public void testWrite() throws Exception{
                  Properties options= new Properties();
                  options.put(RecordManagerOptions.DISABLE_TRANSACTIONS, "true");
                  RecordManager recordManager = RecordManagerFactory.createRecordManager(DATABASE, options);
                  long btreeId=recordManager.getNamedObject(INDEX_NAME);
                  BTree tree=null;
                  if ( btreeId != 0 ) {
                      tree = BTree.load( recordManager, btreeId );
                      System.out.println( "Reloaded existing BTree with " + tree.size());
                  } else {
                      tree = BTree.createInstance( recordManager, new LongComparator());
                      recordManager.setNamedObject( INDEX_NAME, tree.getRecid() );
                      System.out.println( "Created a new empty BTree" );
                  }
                  writeEntries(recordManager, tree);
                  recordManager.commit();
                  recordManager.close();
                  System.out.println("Free "+RtAgLogUtils.getMemoryInfo());
              }

              private void writeEntries(RecordManager recordManager, BTree tree) throws IOException {
                  Random r= new Random(System.currentTimeMillis());
                  long startTime=System.currentTimeMillis();
                  for (int i = 0; i < 50000; i++) {

                      Long id = new Long(r.nextLong());
                      tree.insert(id, randomBytes(i, 8000), false);
                     
                      if (i%1000==0){
                          System.gc();
                          recordManager.commit();
                          long endTime = System.currentTimeMillis();
                          System.out.println(" Took "+(endTime-startTime));
                          startTime=endTime;
                      }
                  }
              }

          // END JDBM Sample
          ------------------------------------------------

          Here is the regular java serialization test:
              public void testWrite() throws Exception{
                  ObjectOutputStream stream = new ObjectOutputStream( new BufferedOutputStream( new FileOutputStream( DATABASE)));
                  writeEntries(stream);
                  stream.close();
                  System.out.println("Free "+RtAgLogUtils.getMemoryInfo());

              }

              private void writeEntries(ObjectOutputStream stream) throws IOException {
                  long startTime=System.currentTimeMillis();
                  for (int i = 0; i < 50000; i++) {
                      stream.writeObject(randomBytes(i, 8000));
                      if (i%1000==0){
                          System.gc();
                          long endTime = System.currentTimeMillis();
                          System.out.println(" Took "+(endTime-startTime));
                          startTime=endTime;
                      }
                  }
              }

          ------------------------------------------------
          Maybe you spot the wtf here.

          Thanks a lot,

          Alexander

           
          • Kevin Day
            2006-09-19

            OK - a couple of very important things to understand:

            If you use default serialization to write 1000 objects into a BTree, it is VERY different from writing 1000 objects into a single serialized stream.  With a serialized stream, you have a single header for all of the objects.  With the BTree, each leaf of the tree might hold 10 or 20 objects.  Each leaf node has its own serialization stream (when using default serialization), so you wind up with a header (and the stream initialization, etc. that goes with it) per leaf.
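
            To see that overhead concretely, here is a small, self-contained sketch (plain java.io, no jdbm involved; the class name is just for illustration) that writes the same 8000-byte payload through one shared ObjectOutputStream versus a fresh stream per object - roughly what a per-leaf stream amounts to:

                import java.io.ByteArrayOutputStream;
                import java.io.IOException;
                import java.io.ObjectOutputStream;

                public class StreamHeaderOverhead {
                    public static void main(String[] args) throws IOException {
                        byte[] payload = new byte[8000];
                        int objects = 1000;

                        // One shared stream: a single stream header for all objects.
                        ByteArrayOutputStream shared = new ByteArrayOutputStream();
                        ObjectOutputStream sharedOut = new ObjectOutputStream(shared);
                        for (int i = 0; i < objects; i++) {
                            sharedOut.writeObject(payload);
                            sharedOut.reset(); // forget back-references so each write stands alone
                        }
                        sharedOut.close();

                        // A fresh stream per object: header and class descriptor repeated every time.
                        long perObjectTotal = 0;
                        for (int i = 0; i < objects; i++) {
                            ByteArrayOutputStream single = new ByteArrayOutputStream();
                            ObjectOutputStream out = new ObjectOutputStream(single);
                            out.writeObject(payload);
                            out.close();
                            perObjectTotal += single.size();
                        }

                        System.out.println("Shared stream:      " + shared.size() + " bytes");
                        System.out.println("Per-object streams: " + perObjectTotal + " bytes");
                    }
                }

            The raw byte difference is small relative to an 8 KB payload, so the headers alone probably do not account for a 7x gap; the repeated stream setup per leaf (and the free record allocator suspected below) is where the time is more likely to go.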

            In addition, you are writing some pretty big objects to the tree.  If the nodes are configured to hold 16 entries, then you are winding up with leaf nodes that are 16 physical disk pages long each...  Not sure if that would result in any inefficiency, but it could.

            I would recommend that you remove the setting change that I recommended in my first post (that was really intended to just push the problem into a corner to make sure we understood it - the default value for that setting should be sufficient).

            Anyway, when I have time, I'll run your code with a profiler and see where the time is being spent.  I suspect it's happening in the free record allocator (which has some efficiency problems that would be particularly evident in the test case you are running - i.e. huge objects).

            With transactions disabled, I would expect the performance to be maybe half that of writing directly to disk, but 7 times worse is a surprise...

            - K

             
    • Kevin,

      excuse my ignorance, but how do you actually set a different serializer?
      I would like to try it out to see if it improves performance, though I doubt it: as far as I can see, the ObjectOutputStream only adds a few bytes for an array - 4 bytes for a header, then a couple of bytes for the type of the array and so on. So overall it should be less than 1% overhead for my 8K array, don't you think?
      Best regards,

      Alexander Kainz

       
      • Kevin Day
        2006-10-02

        No worries. 

        You will need to implement a class that implements the jdbm.helper.Serializer interface, then use the expanded form of BTree.createInstance to specify the serializer (you will also have to specify a comparator).

        Depending on the objects you are storing, you may also want to consider calling setKeyCompressionProvider on the BTree, as follows:

        tree.setKeyCompressionProvider(new LeadingValueCompressionProvider());

        There's a unit test in CVS called BTreeKeyCompressionTest that may give you some ideas on this (the getNewTree() method shows you how to use a simple comparator and serializer when you create the tree).

        If all you are storing are byte arrays, then the ByteArraySerializer and ByteArrayComparator classes may be of use.  If your keys are just longs, then the LongPackedSerializer may be helpful.

        - K
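
        Putting that together, here is a rough sketch of the wiring described above. Treat the signatures as approximate (they reflect the CVS version discussed in this thread), and note that RawBytesSerializer and CustomSerializerExample are hypothetical names; the ByteArraySerializer mentioned above would serve the same purpose as the hand-rolled serializer:

            import java.io.IOException;
            import java.util.Properties;
            import jdbm.RecordManager;
            import jdbm.RecordManagerFactory;
            import jdbm.btree.BTree;
            import jdbm.helper.LongComparator;
            import jdbm.helper.Serializer;

            // Hypothetical value serializer for raw byte arrays; assumes the
            // jdbm.helper.Serializer interface exposes serialize()/deserialize().
            class RawBytesSerializer implements Serializer {
                public byte[] serialize(Object obj) throws IOException {
                    return (byte[]) obj;        // values are already byte arrays
                }
                public Object deserialize(byte[] serialized) throws IOException {
                    return serialized;
                }
            }

            public class CustomSerializerExample {
                public static void main(String[] args) throws IOException {
                    RecordManager recman =
                        RecordManagerFactory.createRecordManager("testdb", new Properties());

                    // Expanded createInstance form mentioned above: comparator plus
                    // key and value serializers (argument order is an assumption).
                    BTree tree = BTree.createInstance(recman,
                            new LongComparator(),       // key comparator, as in the samples above
                            null,                       // null = default key serialization (assumed)
                            new RawBytesSerializer());  // custom value serializer

                    // Optional, per the suggestion above:
                    // tree.setKeyCompressionProvider(new LeadingValueCompressionProvider());

                    tree.insert(new Long(42L), new byte[8000], false);
                    recman.commit();
                    recman.close();
                }
            }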