Menu

#1708 MVCC deadlock

current-release
open-fixed
None
5
4 days ago
2024-06-19
No

We're running HSQLDB version 2.7.1 using MVCC on Java 21. We occasionally experience a deadlock when CHECKPOINT, is waiting to run. I have not been able to reliably reproduce it. It happened again today.

Here's a view of SYSTEM_SESSIONS at the time of deadlock:

sql> SELECT SESSION_ID, THIS_WAITING_FOR, WAITING_FOR_THIS, LATCH_COUNT, TRANSACTION, LAST_IDENTITY FROM INFORMATION_SCHEMA.SYSTEM_SESSIONS;
SESSION_ID  THIS_WAITING_FOR  WAITING_FOR_THIS                                     LATCH_COUNT  TRANSACTION  LAST_IDENTITY
----------  ----------------  ---------------------------------------------------  -----------  -----------  -------------
         6  19                                                                               1  FALSE                   36
         7  19                                                                               1  FALSE                   30
        10  19                                                                               1  FALSE           2147485201
        11                    19                                                             0  TRUE                     0
        12  19                                                                               1  FALSE                    0
        13  19                                                                               1  FALSE                    0
        14  19                                                                               1  FALSE                    0
        15  19                                                                               1  FALSE                    0
        16  19                                                                               1  FALSE                    0
        17  19                                                                               1  FALSE                    0
        18  19                                                                               1  FALSE                    0
        19  11                20,18,12,13,7,17,16,15,10,6,14,23,22,21,24,27,26,25            1  FALSE           2147484642
        20  19                                                                               1  FALSE                    0
        21  19                                                                               1  FALSE                    0
        22  19                                                                               1  FALSE                    0
        23  19                                                                               1  FALSE                    0
        24  19                                                                               1  FALSE                    0
        25  19                                                                               1  FALSE                    0
        26  19                                                                               1  FALSE                    0
        27  19                                                                               1  FALSE                    0
        28                                                                                   0  FALSE                    0

Session 28 is the sqltool session I used to query the data. Session 19 is the session that's waiting to run CHECKPOINT. Session 11 is the session that's blocking session 19, even though its latch is 0. The rest of the sessions are blocked on session 19.

The following is a portion of the stack trace of the thread executing session 11:

"ajp-nio-127.0.0.1-5678-exec-2" #68 [9803] daemon prio=5 os_prio=0 cpu=2813.78ms elapsed=4961.99s allocated=260M defined_classes=1344 tid=0x00007f6612d7ec70 nid=9803 waiting on condition  [0x00007f65e47fa000]
   java.lang.Thread.State: WAITING (parking)
        at jdk.internal.misc.Unsafe.park(java.base@21/Native Method)

        - parking to wait for  <0x00000000c803e318> (a org.hsqldb.lib.CountUpDownLatch$Sync)
        at java.util.concurrent.locks.LockSupport.park(java.base@21/LockSupport.java:221)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@21/AbstractQueuedSynchronizer.java:754)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(java.base@21/AbstractQueuedSynchronizer.java:1099)
        at org.hsqldb.lib.CountUpDownLatch.await(Unknown Source)
        at org.hsqldb.Session.executeCompiledStatement(Unknown Source)
        at org.hsqldb.Session.execute(Unknown Source)
        - locked <0x00000000c803e358> (a org.hsqldb.Session)
        at org.hsqldb.jdbc.JDBCPreparedStatement.fetchResult(Unknown Source)
        at org.hsqldb.jdbc.JDBCPreparedStatement.executeQuery(Unknown Source)
        - locked <0x00000000c803e5b8> (a org.hsqldb.jdbc.JDBCPreparedStatement)
        at com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeQuery(NewProxyPreparedStatement.java:431)
        at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:60)
        at org.hibernate.loader.Loader.getResultSet(Loader.java:2265)
        at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:2028)
        at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1990)
        at org.hibernate.loader.Loader.doQuery(Loader.java:949)
        at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:351)
        at org.hibernate.loader.Loader.doList(Loader.java:2787)
        at org.hibernate.loader.Loader.doList(Loader.java:2770)
        at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2604)
        at org.hibernate.loader.Loader.list(Loader.java:2599)
        at org.hibernate.loader.hql.QueryLoader.list(QueryLoader.java:505)
        at org.hibernate.hql.internal.ast.QueryTranslatorImpl.list(QueryTranslatorImpl.java:395)
        at org.hibernate.engine.query.spi.HQLQueryPlan.performList(HQLQueryPlan.java:220)
        at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1526)
        at org.hibernate.query.internal.AbstractProducedQuery.doList(AbstractProducedQuery.java:1598)
        at org.hibernate.query.internal.AbstractProducedQuery.list(AbstractProducedQuery.java:1566)

In the SYSTEM_SESSIONS table, the CURRENT_STATEMENT for session 11 is blank. However, I extracted it from the heap dump. Here's a sanitized version:

select t1.f1, t1.f2, t1.f3, t1.f4, t1.f5, t1.f6, t1.f7, t2.f1, t2.f2, t2.f3, t2.f4, t2.f5, t2.f6, t2.f7, t2.f8, t2.f9, t2.f10, t2.f11, t2.f12, t2.f13, t2.f14, t2.f15, t2.f16, case when t2.f1 is not null then 1 when t1.f1 is not null then 0 end as clazz_ from table1 t1 left outer join table2 t2 on t1.f1=t2.f2 where t1.f3='SOMETHING'

So, it's just doing a select. The transaction was created to execute this query and it is committed as soon as it returns (though it never returns).

Every time this has happened, there's always been a CHECKPOINT waiting to run, which is why I think it's related to CHECKPOINT.

Discussion

  • Fred Toussi

    Fred Toussi - 2024-06-20

    Hi Peter,

    You had previously reported this. I made changes to transaction management in version 2.7.2 to fix this issue with CHECKPOINT. Now please use version 2.7.3 and report if there is still an issue.

     
    • Peter Winckles

      Peter Winckles - 2024-06-20

      Thanks, Fred! I appreciate your work and I hadn't realized that you'd made changes to attempt to address the issue. I'll upgrade.

       
  • Fred Toussi

    Fred Toussi - 4 days ago
    • status: open --> open-fixed
    • assigned_to: Fred Toussi
     

Log in to post a comment.

MongoDB Logo MongoDB