Work at SourceForge, help us to make it a better place! We have an immediate need for a Support Technician in our San Francisco or Denver office.
This post is to kickoff a common location for discussion on the best way to implement a weak reference cache (let's call it WRC for short).
I posted some code a little while back for a WeakValueHashMap that we can use as a starting place.
The current implementation has a couple of things about it that I'm not crazy about. First, it has to maintain a reverse lookup to locate locate the key that should be purged when a given reference becomes available. It seems like a better approach would be to subclass WeakReference and give it a member variable for storing the key (i.e. recid).
That removes the need for the reverse map entirely.
Another thing that I'd like to do is to only clearUnreferencedValues every X gets or Y puts, whichever come first. The current implementation calls every time.
Finally, to move this along in the direction of efficiency, I think we should consider re-implementing the Map so it stores primitive longs as keys. I have permission from another SF project to borrow their code on this, so we should be able to do that fairly easily.
Until the rest of the JDBM record manager is updated to use primitives for keys, we will need an adapter layer to make it work.
With that in place, we will have a mechanism for storing objects by recid and still allow the garbage collector to free them as needed.
That is good for an initial cut, but I think we need to take this further. For performance reasons, it may be desirable to use an MRU cache in conjunction with the WRC. I think that the appropriate architecture here is to have a "ChainedCache" implementation that accepts a series of caches and pulls items evicted from one and places them into the next.
As a quick note: It is not possible to retrieve the object of a WeakReference after it as been collected, so it will not be possible to evict from a WRC and then add to an MRU. We'd have to use soft or phantom references for that, and I'm not sure we really need to go there.
I strongly suspect that the MRC will provide sufficient performance to where an MRU isn't needed, but we'll see. After all, the GC is really nothing more than a big, complicated MRU...
Can you go ahead and commit the existing code for the weak value hash map and the "long" value hash map from that other project? I'll check them out and start looking at options. Until then, I will start thinking about some test cases.
One use case when you will need chained caches is a pre-fetch facility. If you pre-fetch records (based on whatever guidence) and you only have an WRC, then they will be immediately collected by the JVM.
I can - but before I do that, I'd like you to take a look at the jdbm.helper.SoftCache class in CVS.
Doesn't this do what we want already? It actually is almost a line for line implementation of my class, and it already handles chaining of caches.
The only issue with this class (and it would be an issue with my design as well), is that there is no way to fire cache eviction events - by the time you know that the object has been freed, you no longer have a reference to the object to send in the event. That means that changes to the object would not get stored.
The workaround is to couple this cache with an MRU cache. When an object is found in the SoftCache, and it isn't in the MRU cache, it is added to the MRU cache. This forces the object to have a reference until it is no longer available in the MRU - and around and around we go.... Each time an object that is marked as dirty falls off the MRU, it gets stored to disk - even though a reference to it is still held in the softcache.
In a long running transaction, this means that some objects may get stored multiple times, but for all practical purposes, it should work well.
What do you think? Are we worrying about developing something that already exists?
Sure. I'll patch the Provider class and start writing
some tests. One way to approach testing is by explictly invoking garbage collection, which I can look into. However reading up on SoftReference, it is clear that even explicit GC does not require that any given SoftReference get collected. Another approach would be to constrain the available memory during the test and then run the heap up until the GC is required to collect the soft references.
But those are just some possible techniques to exploit. We have to figure out what the contract is that we are going to test. As far as I can see that contract has both a deterministic and a probabilistic/performance aspects.
To my mind the primary deterministic contract is that we MUST NOT see two runtime objects that are both instances of the same persistent data.
There is a second deterministic aspect which is that the MRU cache MUST NOT permit the N most recently accessed records to be evicted.
I think that we could test for these by obscuring the data under test. For example, we can not hold a hard reference to an object whose cache eviction we are testing, but we could serialize a representation of the Java reference as a String and pairing it up with the recid and use that data to test for violations of the contract.
The probabilistic aspect basically has to do with performance, so it might be more a matter of characterizing performance than testing for failure. I'll give this some more thought.
Do go ahead and commit the native "long" hash map though.
OK - native long hashmap may take some doing - it's part of a larger system and there may be some dependencies that need to be removed.
I'll see what I can do.
Primitive long map has now been added to CVS. The resultant .class files will add another 60 KB to the distro (33 KB zipped).
Right now, I have included two map implementations: chained and open. When we release, we can probably pick just one of these (most likely, the chained version, but I'm open to discussion on that - there's a good article at wikipedia that describes the two approaches: http://en.wikipedia.org/wiki/Hash_table#Collision_resolution\)
In addition, there is a lot of "architectural" overhead in the classes for making them similar to the Java collections classes. If size is an issue, we could do some refactoring, decide what functionality we actually need from these maps, etc... and probably get it shrunk down.
For the time being, go ahead and play with the LongKeyChainedHashMap and go from there.
I've looked over the code some more. There is an existing test suite. I've tried running that test suite under four JVMs (Sun JDK 1.5, the JVM bundled with eclipse 3.0, and JRockit 1.4.2_05 and 1.5.0_03). The test suite succeeds for each of the JVMs, but there is one test (testL2Recovery) that causes an out of memory condition under the eclipse JVM. The test succeeds anyway and is allocating large objects, so I am not that suprised that it stimulates an out of memory condition. The error message generated by the eclipse JVM for those who are interested is:
JVMDG217: Dump Handler is Processing OutOfMemory - Please Wait.
JVMDG315: JVM Requesting Heap dump file
JVMDG318: Heap dump file written to C:\Documents and Settings\thompsonbry\IBM\rationalsdp6.0\workspace\jdbm-dev\heapdump.20050823.055145.3460.phd
JVMDG303: JVM Requesting Java core file
JVMDG304: Java core file written to C:\Documents and Settings\thompsonbry\IBM\rationalsdp6.0\workspace\jdbm-dev\javacore.20050823.055145.3460.txt
JVMDG274: Dump Handler has Processed OutOfMemory.
I have also written to the author, Dilum Ranatunga, to see what he thinks the status is of the SoftCache class and test suite. If things look good, then I will commit the patch to jdbm.recman.Provider to support "soft" caches.
I would like to run the jdbm test suite using the "soft" cache to see if it causes any failures in the total system. However, several parts of the test suite are currently failing. I believe that you were looking into this? Any progress on this front?
The latest code in CVS (including updated tests) allows all unit tests to pass.
I suspect that the out of memory error you encountered was due to heap space. From looking at the code, it is creating about 400 MB of data. On my machine (with 786 MB RAM), the test passes without problems.
I would like to propose that we consider some changes to the unit tests. Specifically, refactor the creation of an empty record manager into it's own utility method in the TestUtil class. This will allow us to change the properties in one place (and allow easy addition of tests for other property settings).
Are you sure that your test case modifications have all been committed? I am seeing problems with many of the test cases - an example is below. I am also using eclipse. I right click on the "src/tests" folder and select Run => JUnit Test. Perhaps you are not running all the tests?
With respect to refactoring the tests - yes. I would like to be able to run the entire test suite with different configurations of the parameters.
With respect to running out of memory during the SoftCache tests, yes, it is creating a lot of data. What was suprising is how different the performance was for that test under the different JVMs.
Here is one example of a test that fails in the current CVS code:
java.lang.Error: CRITICAL: file header magic not OK 25168
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
I found another problem with the tests (one of the hash tests was not closing the record manager). I believe that this causes your other problems.
I am definitely running all tests...
I'll patch that test now, go ahead and check out, then try again.
Ok - the close() definately helps matters. However, I am not convinced that all tests are being run, at least, not by AllTests. For example, there is a performance test suite for insert, fetch, update, and delete. How are you running the test suite?
I am also using eclipse, and I execute the tests against the entire tests folder. It's actually running several of the tests multiple times, because it is picking up the individual tests, and the test suite. I am almost positive that all tests are getting run.
What performance suite are you refering to, and what is it's containing class? I see recman.TestStress...
Alex found a problem with the AllTests suite yesterday (it wasn't running the TestStress test), and submitted a change to CVS so it does include it now.
I was referring to src/tests/recman/TestPerformance, which is not run by AllTests.
I was running the src/tests folder, but, as you say, it runs some of the tests multiple times. This should get cleaned up for the next release so that there are clear entry points for the test suite and perhaps a clear entry point for any performance (vs correctness) tests.
OK - TestPerformance is not getting run with the technique I'm using. Good catch.
Running it manually, I am getting the out of memory error that you are seeing. The problem is that the test is invalid. It does not call commit(), so everything is getting stored up in one huge transaction.
Fix is to insert a commit() every XXX insertions. You'll probably want to add this to all tests in the class.
I agree that a restructuring of the unit test suites to make the differentiation between performance and correctness makes sense.
If this is just a matter of adding support for SoftCache, it will be trivial to add to the Provider class.
What I think we will really need are a bunch of unit tests to make sure that things are working properly.
Bryan - are you up for putting together some tests?
What makes testing this insanely tricky is that some of the tests are at the mercy of the garbage collector.