We're running HSQLDB version 2.7.1 using MVCC on Java 21. We occasionally experience a deadlock when CHECKPOINT, is waiting to run. I have not been able to reliably reproduce it. It happened again today.
Here's a view of SYSTEM_SESSIONS at the time of deadlock:
sql> SELECT SESSION_ID, THIS_WAITING_FOR, WAITING_FOR_THIS, LATCH_COUNT, TRANSACTION, LAST_IDENTITY FROM INFORMATION_SCHEMA.SYSTEM_SESSIONS;
SESSION_ID THIS_WAITING_FOR WAITING_FOR_THIS LATCH_COUNT TRANSACTION LAST_IDENTITY
---------- ---------------- --------------------------------------------------- ----------- ----------- -------------
6 19 1 FALSE 36
7 19 1 FALSE 30
10 19 1 FALSE 2147485201
11 19 0 TRUE 0
12 19 1 FALSE 0
13 19 1 FALSE 0
14 19 1 FALSE 0
15 19 1 FALSE 0
16 19 1 FALSE 0
17 19 1 FALSE 0
18 19 1 FALSE 0
19 11 20,18,12,13,7,17,16,15,10,6,14,23,22,21,24,27,26,25 1 FALSE 2147484642
20 19 1 FALSE 0
21 19 1 FALSE 0
22 19 1 FALSE 0
23 19 1 FALSE 0
24 19 1 FALSE 0
25 19 1 FALSE 0
26 19 1 FALSE 0
27 19 1 FALSE 0
28 0 FALSE 0
Session 28 is the sqltool session I used to query the data. Session 19 is the session that's waiting to run CHECKPOINT. Session 11 is the session that's blocking session 19, even though its latch is 0. The rest of the sessions are blocked on session 19.
The following is a portion of the stack trace of the thread executing session 11:
"ajp-nio-127.0.0.1-5678-exec-2" #68 [9803] daemon prio=5 os_prio=0 cpu=2813.78ms elapsed=4961.99s allocated=260M defined_classes=1344 tid=0x00007f6612d7ec70 nid=9803 waiting on condition [0x00007f65e47fa000]
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@21/Native Method)
- parking to wait for <0x00000000c803e318> (a org.hsqldb.lib.CountUpDownLatch$Sync)
at java.util.concurrent.locks.LockSupport.park(java.base@21/LockSupport.java:221)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@21/AbstractQueuedSynchronizer.java:754)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(java.base@21/AbstractQueuedSynchronizer.java:1099)
at org.hsqldb.lib.CountUpDownLatch.await(Unknown Source)
at org.hsqldb.Session.executeCompiledStatement(Unknown Source)
at org.hsqldb.Session.execute(Unknown Source)
- locked <0x00000000c803e358> (a org.hsqldb.Session)
at org.hsqldb.jdbc.JDBCPreparedStatement.fetchResult(Unknown Source)
at org.hsqldb.jdbc.JDBCPreparedStatement.executeQuery(Unknown Source)
- locked <0x00000000c803e5b8> (a org.hsqldb.jdbc.JDBCPreparedStatement)
at com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeQuery(NewProxyPreparedStatement.java:431)
at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:60)
at org.hibernate.loader.Loader.getResultSet(Loader.java:2265)
at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:2028)
at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1990)
at org.hibernate.loader.Loader.doQuery(Loader.java:949)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:351)
at org.hibernate.loader.Loader.doList(Loader.java:2787)
at org.hibernate.loader.Loader.doList(Loader.java:2770)
at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2604)
at org.hibernate.loader.Loader.list(Loader.java:2599)
at org.hibernate.loader.hql.QueryLoader.list(QueryLoader.java:505)
at org.hibernate.hql.internal.ast.QueryTranslatorImpl.list(QueryTranslatorImpl.java:395)
at org.hibernate.engine.query.spi.HQLQueryPlan.performList(HQLQueryPlan.java:220)
at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1526)
at org.hibernate.query.internal.AbstractProducedQuery.doList(AbstractProducedQuery.java:1598)
at org.hibernate.query.internal.AbstractProducedQuery.list(AbstractProducedQuery.java:1566)
In the SYSTEM_SESSIONS table, the CURRENT_STATEMENT for session 11 is blank. However, I extracted it from the heap dump. Here's a sanitized version:
select t1.f1, t1.f2, t1.f3, t1.f4, t1.f5, t1.f6, t1.f7, t2.f1, t2.f2, t2.f3, t2.f4, t2.f5, t2.f6, t2.f7, t2.f8, t2.f9, t2.f10, t2.f11, t2.f12, t2.f13, t2.f14, t2.f15, t2.f16, case when t2.f1 is not null then 1 when t1.f1 is not null then 0 end as clazz_ from table1 t1 left outer join table2 t2 on t1.f1=t2.f2 where t1.f3='SOMETHING'
So, it's just doing a select. The transaction was created to execute this query and it is committed as soon as it returns (though it never returns).
Every time this has happened, there's always been a CHECKPOINT waiting to run, which is why I think it's related to CHECKPOINT.
Hi Peter,
You had previously reported this. I made changes to transaction management in version 2.7.2 to fix this issue with CHECKPOINT. Now please use version 2.7.3 and report if there is still an issue.
Thanks, Fred! I appreciate your work and I hadn't realized that you'd made changes to attempt to address the issue. I'll upgrade.