On Wed, 2012-07-18 at 15:08 +0100, Jan Stette wrote:
> After upgrading from Sesame 2.6.6 to 2.6.8, we're seeing a huge number
> of errors as below in our logs. This is making some applications
> unusably slow, not sure if it's just down to the volume of logging or
> if there is something else that's slowing them down or locking them
> INFO 15:00:44,198 [MemoryStore snapshot cleanup] locks.LockManager
> Thread MemoryStore snapshot cleanup is waiting on an active Read lock
> acquired in pool-1-thread-1
This is just an INFO message to let you know that the thread
"MemoryStore snapshot cleanup" is blocked, waiting for a lock to be
released. This general log message is meant to help users troubleshoot
deadlocks. However, in this case, this thread is expected to be blocked
while there is activity on the store. Perhaps this particular message
for this thread should be silenced, I'm not sure.
The stack trace is shown because you have lock tracking enabled. This
adds a small overhead for each lock created.
> We also see some cases of uncaught IllegalArgumentExceptions in the
> MemoryStore sync thread:
> ERROR 11:13:39,490 [Thread-5] STDERR Exception in thread "MemoryStore
> snapshot cleanup"
> ERROR 11:13:39,490 [Thread-5] STDERR
> java.lang.IllegalArgumentException: timeout value is negative
This is a bug, and I have opened an issue about this.
> Would someone be able to have a look at this please? As it is, the
> 2.6.8 release seems unusable for us.
When a thread is blocked waiting for a (particular type of) lock, it
waits for a while before scanning memory to see if a lock was
abandoned. If nothing is found, it doubles the amount of time it waits.
It appears that this wait time grew past the maximum int value and
became negative, causing this error. Once this happened, the lock
system became unstable and therefore unusable.
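To illustrate, here is a minimal sketch (hypothetical names, not the actual Sesame LockManager code) of how a repeatedly doubled int wait time overflows to a negative value, and why Object.wait(timeout) then throws exactly the IllegalArgumentException seen in the log above:

```java
public class WaitOverflowSketch {
    public static void main(String[] args) {
        int waitMs = 1000;
        int doublings = 0;
        // Double the wait each time no abandoned lock is found, as the
        // lock manager does; eventually the signed int overflows.
        while (waitMs > 0) {
            waitMs *= 2;
            doublings++;
        }
        System.out.println("negative after " + doublings + " doublings: " + waitMs);

        // Object.wait rejects negative timeouts with the same message
        // that appears in the error log.
        Object monitor = new Object();
        try {
            synchronized (monitor) {
                monitor.wait(waitMs);
            }
        } catch (IllegalArgumentException | InterruptedException e) {
            System.out.println("caught: " + e.getMessage());
        }

        // One defensive fix: do the doubling in long arithmetic and cap it.
        long capped = Math.min(Integer.MAX_VALUE, ((long) Integer.MAX_VALUE) * 2);
        System.out.println("capped=" + capped);
    }
}
```

Capping the backoff (or switching the field to a long) avoids the overflow entirely while preserving the exponential-backoff behaviour.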
> Also, maybe someone could explain what the changes to LockManager are
> and why there were done?
Back in May there was a discussion about making the lock management in
Sesame more forgiving.
> I also see that in the method LockManager.releaseAbanded() [sic],
> there's an explicit call to System.gc(). We're running some systems
> that have very large heap sizes indeed, and doing explicit GC's is
> very expensive, it can cause slowdowns of several minutes. Is this
> really necessary here?
Unfortunately, when the JVM potentially enters a deadlock, an explicit
System.gc() is needed to scan memory for abandoned locks so they can be
recovered. However, we can increase the time a thread is blocked before
it goes into recovery mode.
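The mechanism can be sketched as follows (hypothetical names, not the Sesame code itself): the manager holds only a WeakReference to the lock token it hands out, so if the caller drops its strong reference without calling release(), a garbage collection clears the weak reference, and that is the signal that the lock was abandoned rather than merely held:

```java
import java.lang.ref.WeakReference;

public class AbandonedLockSketch {
    static final class Lock {} // the token handed out to a caller

    public static void main(String[] args) throws InterruptedException {
        Lock token = new Lock();
        WeakReference<Lock> tracked = new WeakReference<>(token);

        token = null; // caller "forgets" to release and drops the token

        // An explicit GC clears the weak reference; that is why the
        // recovery path has to invoke System.gc() before scanning.
        for (int i = 0; tracked.get() != null && i < 10; i++) {
            System.gc();
            Thread.sleep(50);
        }
        System.out.println("abandoned=" + (tracked.get() == null));
    }
}
```

Since System.gc() is only a hint to the collector, the sketch retries a few times; this is also why the real recovery path cannot detect abandonment without triggering a collection.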
On Wed, 2012-07-18 at 16:45 +0200, Arjohn Kampman wrote:
> I had a quick look at the changes to the lock managers. My gut feeling
> is that the modifications that have been made as part of SES-1028 make
> the mechanism quite a lot heavier, where this was meant as a light
> weight alternative to java's Lock's. Did anyone compare the performance
> of the memory store before and after this change?
> I agree with you that calling System.gc() from production code is dubious.
There is obviously a trade-off here between forgiving users that
"forget" to release locks and lock efficiency. The gc() call searches
for abandoned locks and only occurs after a thread has been blocked for
a period of time. This period doubles every time no abandoned lock is
found. Although the performance impact of WeakReferences is not well
documented, the current implementation tries to strike a balance
between the risk of a deadlock and using object finalization, which is
known to have a significant performance impact. A performance benchmark
would be useful.
The new LockManager was introduced in 2.6.7.