I use HSQLDB for an image caching system. The images themselves are stored on disk but an image statistics (last access, usages etc) and location on disk are stored in HSQLDB. I experience deadlocks at the point the logfile is processed and emptied.
Even though not sure this is because of my own code (it locks rows when updating statistics and when records are evicted, Spring and Hibernate should commit the sessions), I haven't experienced any deadlock issues with MYSQL so far.
I have a unittest that reproduces this issue on my machine. It runs 10 concurrent threads that read from the database (read and update same record's stats) and 1 thread that inserts records. I tried both direct jdbc:hsqldb:file: connections and standalone instance with jdbc:hsqldb:hsql connections.
My connection string (issue happens with both cached and normal table settings):
jdbc:hsqldb:file:${java.io.tmpdir}/appdata/imageserver/registry;hsqldb.default_table_type=cached;hsqldb.tx=MVCC;hsqldb.tx_level=READ_COMMITTED
Furthermore I use a 20 connection pool (dbcp2).
I captured the session data together with a threaddump, see attachment.
It looks like the connection with SESSION_ID : 4545 has inserted or deleted a row and has not been committed. The internal session performing the checkpoint is waiting for this to commit. The rest of the sessions are waiting for the internal checkpoint session.
Hi Fred,
Thanks for the reply. I didn't have time to look into this but I'm looking at it now=) Shouldn't the uncommitted session be visible in the SYSTEM_SESSIONS table? The SESSION that seems to block everything doesn't show the query it performed. Is there a way I can log this in more detail?
The uncommitted session is visible in the SYSTEM_SESSIONS. A query that was completed is not shown, only a query or statement that is current.
So I did some investigation. I use spring and hibernate and I found that when setting Transaction Propagation to NEVER, the 'deadlock' doesn't occur (I put hsqldb.log_size to 1 so that checkpoint occurs often enough).
I tried to narrow down a bit and it seems the issue occurs at a point where data is being inserted together with Propagation REQUIRED or Propagation REQUIRES_NEW. Something I didn't expect at all; I expected it to happen with deletes being rolled back
I plan to rewrite the cache in pure JDBC to see if I get similar issues
You can enable SQL logging and find out exactly what each session does.
Thanks I have a look at it
Hi Fred,
I think I found the issue (without the help of logging=). The image cache is limited to a certain disk space usage. So when an image needs to be cached, the calling thread asks the DiskSpaceEvaluator to ensure enough disk available. The latter is executed in a permanent thread. The calling thread 'joins' the DiskSpaceEvaluator until it is sure enough diskspace is available. In order to free up diskspace, old cached items are evicted. So, some records are deleted from HSQLDB
The calling thread though, performs its action in a transaction; it inserts the cache record in HSQLDB, then requests the diskspace and stores the file. If all succeeded, the transaction is committed. So, when the file store operation fails, the transaction is rolled back and HSQLDB remains unchanged.
The deadlock is caused because the CHECKPOINT operation is sometimes started at the moment the cache is evicted due to an incoming cache request. In that case the DiskSpaceEvaluator thread is blocked by the checkpoint operation; it can't delete record from HSQLDB and free up diskspace. Therefore, the calling thread cannot finish its cache transaction (which was started before eviction ofcourse). As a result, the CHECKPOINT can't finish because it is waiting for the cache transaction to be committed. Which in turn is waiting for the cache eviction, which in turn is waiting for the CHECKPOINT....
Nice =) This explains why the issue didn't occur with autocommit on inserts.
If my analysis is correct, I also think that HSQLDB could be improved a bit by not blocking all transactions during checkpoint operation. Not sure if such a thing is feasible?
I think I can work around it by requesting diskspace before inserting the cache record (with small risk that the cache is drained), or do a manual rollback in case file storage fails. In the worst case I could use transactiontimeout to make the cache operation fail. So fortunately I can still use HSQLDB (which is a great product, thanks for your efforts!!!).
reg,
Danny
You can create an SQL TRIGGER written in Java to delete the old files and the HSQLDB records after each insert into the table that holds the file records.
Thanks! How does that work with concurrent inserts? I'd guess that when inserting two or more cache record at nearly the same time, the SQL TRIGGERS would run at nearly the same time and all of them would select the same records for eviction. This would result in not deleting enough records and exceeding the maximum diskspace allowed to be used. This is why I use a single eviction thread that handles all request for diskspace and cleanup.
But for now I swapped the eviction and cache insert order; so diskspace is freed first and then the insert occurs. This way the deadlock can never occur. I'm happy enough with that approach.
Bottom line of what I learned is, I should prevent the following situation from happening:
1) thread A to start a transaction and join thread B
2) thread B to start a transaction and commit it
3) thread A to commit the transaction after thread B exits
because if the CHECKPOINT operation happens to occur between step 1 and 2, it blocks thread B from committing the transaction while it waits for thread A to commit, which won't happen until thread B exits.
From version 2.3.4 CHECKPOINT does not start untill all transactions that have started at that point perform a commit or rollback.This includes transactions that have not changed the data. After CHECKPOINT starts new transactions wait for its completion.
Wow nice! Thanks for the notification. I'll try to test it this month.
Please test with the latest SVN code for version 2.3.5 / 2.4.0 and report.