Be similar to https://sourceforge.net/p/hsqldb/bugs/1418/ {Checkpoint deadlock in mvcc read_committed mode versions 2.2.6, 2.3.2, 2.3.3].
If an active transaction is executed, checkpoint will cause a deadlock.
Viewing the image, you can see that the active transaction and checkpoint wait for each other and deadlock
Thanks for reporting. Your screenshots clearly show the deadlock Will be fixed for the next release.
You can try setting the sessions that perform SELECTs to auto-commit, if this doesn't affect your application.
thks for your reply
I am using HSQL 2.7.1 in embedded (file) mode. Most of the tables are in mem, except for a couple of Cached tables. MVCC as transaction control mode.
I notice regularly during performance/load tests that the application goes unresponsive and whenever this happens, the DB log file size has just crossed the hsqldb.log_size limit. The Checkpoint happens atleast for a couple of rounds when the log file size crosses the log_size limit. I suspect some deadlock but since the application does not respond, I am unable to fetch details from the SYSTEM_SESSIONS table. Checked with multiple settings for the log_size.
To confirm, I set hsqldb.log_data=false and I do not see this issue.
Is this related to the issue mentioned in this thread? If so, is there a fix possible? Any workarounds that can be tried?
Last edit: Sai 2023-02-28
This may be related as there is a checkpoint when the log_size is reached. You could set the log_size to zero to avoid auto checkpoints and do the checkpoints from your app when there is no other database access.
Hi @fredt
May I know is a fix for this issue is planned?
Next release will probably go out in May.
Thanks!
Sadly, it looks like the fix didn't make the 2.7.2 release. Here's a project that demonstrates the problem, if it helps.
https://github.com/pwinckles/hsqldb-deadlock
Hello Peter. Thanks for the test case. I will run it and see.
I looked at Main.java and noticed you have c.setAutoCommit(false); but there is no call to commit on the connection used for the SELECT statements. Obviously connections must all commit to allow a CHECKPOINT to proceed.
Sorry about that. I was porting the example from a test case that was using Hibernate. It doesn't really affect the test one way or the other, but I did just add the manual commit.
To be clear, the code is demonstrating that there is a deadlock with MVCC in the following circumstance, which does not happen with the other strategies:
This is probelmatic for our application because we'd like to move to MVCC because there are are circumstances where the LOCKS strategy results in a deadlock. However, moving to MVCC resolves these deadlocks, but is untenable because there is now a new deadlock on CHECKPOINT.
Last edit: Peter Winckles 2023-06-06
Thanks Peter. When using HSQLDB in embedded mode, there must be one connection per thread. The threads are put to wait by a latch belonging to the connection. I suppose your thread 3 is using its own connection. I don't understand why it joins thread 1, but I have not yet run the code.
Peter, I have run the test and do not understand what the threads in your tests are doing. The execution never reaches the point of the COMMIT that you added. I am sure if the first thread actually executes the COMMIT, it will unblock. After your step 3, the CHECKPOINT is waiting for the first uncommitted SELECT and the second SELECT is waiting for the CHECKPOINT.
@fredt Yes, that's the problem, it will never reach the commit because it's deadlocked. This deadlock happens because a thread with an open tx is waiting to commit the tx on a second thread that also has an open tx. The tx in the second thread is in turn blocked by the pending CHECKPOINT (requested by a third thread).
The code in the example is contrived to demonstrate the problem. This problem could happen anytime an application waits on another thread within a tx, and the other thread performs db operations.
Last edit: Peter Winckles 2023-06-28
Peter, with your scenario there will be deadlocks in LOCKS and MVCC. HSQLDB uses threads and latches to schedule the execution. You cannot do something with the threads that interferes with this. You can probably try the client/server model to separate the HSQLDB's threads from your app's.
@fredt That test code works against LOCKS. Try it.
Last edit: Peter Winckles 2023-06-28
Peter, Transactions are scheduled differently with LOCKS. In this case, Thread 3 needs the same READ lock (a shareable lock) that is held by Thread 1's open transaction, so it goes ahead and executes the SELECT without being paused on the latch.
@fredt The difference seems to be that with MVCC CHECKPOINT blocks all transactions that come in behind it, but with LOCKS this does not happen. The order is:
Yes. MVCC uses a CATALOG LOCK for CHECKPOINT and DDL. There is no other lock. With LOCKS, there are READ and WRITE locks for each table as well.
Is the LOCKS CHECKPOINT lock the same as MVCC? Would you expect them to behave the same? If so, I'm sorry, but I really don't understand your explaination of why the test case works with LOCKS.
Here's the output of the test with LOCKS:
java -jar target/hsqldb-deadlock-1.0-SNAPSHOT-exec.jar LOCKS
18:05:50.051 [pool-1-thread-1] INFO com.pwinckles.Main - Before query 1
18:05:50.057 [main] INFO com.pwinckles.Main - Before checkpoint
18:05:51.058 [pool-1-thread-2] INFO com.pwinckles.Main - Before query 2
18:05:51.060 [pool-1-thread-2] INFO com.pwinckles.Main - After query 2
18:05:51.060 [pool-1-thread-1] INFO com.pwinckles.Main - After query 1
18:05:51.060 [main] INFO com.pwinckles.Main - After checkpoint
Here's the output with MVCC:
java -jar target/hsqldb-deadlock-1.0-SNAPSHOT-exec.jar MVCC
18:06:43.518 [pool-1-thread-1] INFO com.pwinckles.Main - Before query 1
18:06:43.524 [main] INFO com.pwinckles.Main - Before checkpoint
18:06:44.526 [pool-1-thread-2] INFO com.pwinckles.Main - Before query 2
I must be not understanding something, but it sure looks like to me that with LOCKS, CHECKPOINT does not prevent queries that come in behind it from executing, which is not the case with MVCC.
You understand it correctly. With LOCKS, the second SELECT needs the same shareable lock that is held by the first SELECT, so it goes ahead. If instead of the second SELECT, there was a DML statement modifying a separate table, it would be paused until the CHECKPOINT finishes.
@fredt Ah, thank you for taking the time to explain that to me. So, what would it take to reproduce the deadlock with LOCKS? I modified the code to do the following:
But, it still didn't deadlock. I wouldn't have expected that writing to a different table would have required a new lock.
I'm going to try to get detailed logging from a real life MVCC deadlock tomorrow that will hopefully prove to be more interesting than my test scenario.
I may have given the wrong example but a CHECKPOINT with LOCKS can block.
Locks in SQL can be considered as sets. A transaction may be waiting to obtain multiple locks and it will eventually go through when it can obtain the whole set at once. Or it may obtain the locks one at a time. It releases them all at commit.. There is logically no possibility of deadlock as this is avoided with an exception.
@fredt Okay, I finally have a real (not a contrived test case) MVCC deadlock. The app is running and I can query the system_session table and get any diagnostics as is helpful.
So, from that I see that there are 51 sessions blocked on session 29, which is attempting to run CHECKPOINT DEFRAG. Session 29 is in turn blocked by session 20. Session 20 does not appear to be blocked by anything.
Using the value in LAST_IDENTITY, I was able to locate the corresponding record in the hsqldb log file. I took a thread dump, and I do not see a thread in the dump that corresponds with the insertion of this record. To me, this reads like that from the application's perspective the session was successfully committed and it moved on, but for some reason the book keeping got off in HSQLDB and it still thinks the session is active.
Last edit: Peter Winckles 2023-08-30