OpenSAF / Tickets / #1526 imm: 1PBE can see db as locked

Description has changed:

Diff:

--- old
+++ new
@@ -1,4 +1,3 @@
-
 when the disk is full the sqlite will return error.

 Sep 18 13:42:02 SC-2 osafimmpbed: ER SQL statement ('COMMIT TRANSACTION') failed because:  disk I/O error
@@ -27,6 +26,6 @@

 Solution(1PBE):

-For the 1PBE case, which is not multi threaded, them if the sqlite db locked case is reached abort the PBE and let the PBE be re-generated(instead of blocking the PBE process).
+For the 1PBE case, which is not multi threaded, if the sqlite db locked case is reached abort the PBE and let the PBE be re-generated(instead of blocking the PBE process).

Neelakanta Reddy - 2015-10-08

summary: imm: abort the 1PBE when pbeBeginTrans sees db as locked --> imm: exit the 1PBE when pbeBeginTrans sees db as locked

status: accepted --> review

Anders Bjornerstedt - 2015-10-08

Question: How can this happen in the 1PBE case, when only one user thread is using the sqlite instance?

Another relevant question is why, and under what conditions, you observe this now.
The test case or test setup must be special somehow.

With only one thread this case should be impossible.
It suggests heap corruption could be the cause.

Some years ago we did see problems, although not exactly of this kind, in conjunction with
repeated failovers, where the new PBE managed to start while the old PBE (on the other SC) was
still executing (slow to terminate). But the distributed file-level protection uses file system locking,
and the symptoms should be different.

Anders Bjornerstedt - 2015-10-08

I guess it could be that the PBE-level message "Sqlite db locked by other thread" is plain wrong,
i.e. misleading.

Anders Bjornerstedt - 2015-10-08

I looked at the code and the error message is correct but the "lock" is the PBE "spin lock" created
for handling 2PBE. The fact that it finds it locked in 1PBE means there is a logical bug somewhere
in 1PBE.

Most likely some error case where there is a bailout from commit processing without correct cleanup.
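
To illustrate the kind of bug suspected above, here is a minimal sketch, not taken from the OpenSAF sources. All names in it (TransLock, commitToDisk, commitLeaky, commitGuarded, beginTrans, LockGuard) are hypothetical stand-ins. It assumes a single-threaded process that sets a lock flag around commit processing, and shows how an early return on a failed commit (for example a disk I/O error) can leave the flag set, so that the next begin-transaction check sees "locked" even though no other thread exists. The guarded variant releases the flag on every exit path.

```cpp
#include <cassert>

// Hypothetical stand-in for the PBE transaction lock; not OpenSAF code.
struct TransLock {
    bool locked = false;
};

// Stand-in for the sqlite COMMIT; fails when the disk is full.
bool commitToDisk(bool diskFull) { return !diskFull; }

// Suspected pattern: the error path bails out without releasing the lock.
bool commitLeaky(TransLock& lock, bool diskFull) {
    lock.locked = true;
    if (!commitToDisk(diskFull)) {
        return false;  // bailout without cleanup: lock stays set
    }
    lock.locked = false;
    return true;
}

// RAII guard: the destructor releases the lock on every exit path.
class LockGuard {
public:
    explicit LockGuard(TransLock& l) : lock_(l) { lock_.locked = true; }
    ~LockGuard() { lock_.locked = false; }
    LockGuard(const LockGuard&) = delete;
    LockGuard& operator=(const LockGuard&) = delete;
private:
    TransLock& lock_;
};

bool commitGuarded(TransLock& lock, bool diskFull) {
    LockGuard guard(lock);
    return commitToDisk(diskFull);
}

// Mimics the begin-transaction check: refuses to start if the lock is set.
bool beginTrans(const TransLock& lock) { return !lock.locked; }

// Self-check showing the difference between the two variants.
void demo() {
    TransLock leaky;
    assert(!commitLeaky(leaky, /*diskFull=*/true));
    assert(!beginTrans(leaky));    // next transaction sees the db as locked

    TransLock guarded;
    assert(!commitGuarded(guarded, /*diskFull=*/true));
    assert(beginTrans(guarded));   // lock was released despite the failure
}
```

Calling demo() exercises both paths. Whether the real 1PBE code has an exit path of this shape is exactly what would need to be confirmed by reproducing the problem with trace.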

Anders Bjornerstedt - 2015-10-08

Changed ticket slogan to describe the problem.

Anders Bjornerstedt - 2015-10-08

summary: imm: exit the 1PBE when pbeBeginTrans sees db as locked --> imm: 1PBE can see db as locked

Anders Bjornerstedt - 2015-10-13

I nack'ed the patch because the imm service already has a restart mechanism for the PBE if
it gets stuck and the symptom shown here must result from a bug (if this truly is on 1PBE).

If there is not enough information to locate the bug, then the problem needs to be reproduced
with trace.

If it cannot be reproduced, then we close the ticket as not reproducible.

Anders Bjornerstedt - 2015-10-13

status: review --> accepted

Anders Widell - 2015-11-02

Milestone: 4.5.2 --> 4.6.2

Neelakanta Reddy - 2015-11-02

status: accepted --> assigned

Mathi Naickan - 2016-05-04

Milestone: 4.6.2 --> 4.7.2

Anders Widell - 2016-09-20

Milestone: 4.7.2 --> 5.0.2

Neelakanta Reddy - 2016-11-07

status: assigned --> not-reproducible

Neelakanta Reddy - 2016-11-07

Since the problem is not reproducible, closing the defect.
