#357 Crash in autovacuum

Milestone: V1.0 maintenance
Status: closed
Priority: 5
Updated: 2012-12-10
Created: 2012-11-30
Private: No

The datanode autovacuum process was reported to crash occasionally. The crash is a SEGV inside malloc(), which most likely means the heap was corrupted by an earlier buffer overflow or underflow.

The reported crash occurred while running the DBT-2 benchmark.

Discussion

  • Koichi Suzuki

    Koichi Suzuki - 2012-11-30

    Tracing with e-fence revealed one buffer overflow and identified the code section in question. I am testing the improved code to confirm that no further overflow is observed, and will report the result together with a patch.

     
  • Koichi Suzuki

    Koichi Suzuki - 2012-11-30
    • assigned_to: nobody --> koichi-szk
     
  • Koichi Suzuki

    Koichi Suzuki - 2012-12-04

    768402786001578ff4d938cf69d5ebc7bafcb266 fixes it for REL1_0_STABLE.
    6457e8d7b2bda1081c44e0242aea00a05925f8a1 fixes it for the master.

     
  • Koichi Suzuki

    Koichi Suzuki - 2012-12-04
     
  • Koichi Suzuki

    Koichi Suzuki - 2012-12-04
    • status: open --> closed
     
  • Koichi Suzuki

    Koichi Suzuki - 2012-12-07

    This is a patch to fix the autovacuum crash on the REL1_0_STABLE branch. Regression tested. DBT-2 and e-fence checks should be run before commit.

     
  • Koichi Suzuki

    Koichi Suzuki - 2012-12-07

    The previous fix resolved the DBT-2 long-term test failure and the autovacuum crash, but the patch turned out to crash many regression tests, so the fix was incomplete. I will re-open this issue.

     
  • Koichi Suzuki

    Koichi Suzuki - 2012-12-07
    • status: closed --> open
     
  • Koichi Suzuki

    Koichi Suzuki - 2012-12-07

    After reviewing procarray.c carefully, I found much more code that does not handle XC specifics: the size of the xip member of SnapshotData should not be fixed at maxProcs, because in XC it can be larger.

    I will submit a patch to fix it.
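
The sizing problem described above can be illustrated with a minimal C sketch. This is an editorial illustration only, not the committed patch. `DemoSnapshot`, its `max_xcnt` capacity field, and both `demo_snapshot_*` helpers are hypothetical names invented for this example; only `xip` and the idea of an entry count come from the discussion. The point is that the array is grown to fit the incoming count before any write, instead of assuming a fixed upper bound such as maxProcs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;

/* Hypothetical, simplified stand-in for SnapshotData (assumption). */
typedef struct DemoSnapshot
{
    TransactionId *xip;      /* in-progress transaction IDs */
    uint32_t       xcnt;     /* number of valid entries in xip */
    uint32_t       max_xcnt; /* allocated capacity of xip (assumed field) */
} DemoSnapshot;

/* Ensure xip can hold at least `needed` entries; 0 on success, -1 on failure. */
static int
demo_snapshot_reserve(DemoSnapshot *snap, uint32_t needed)
{
    TransactionId *grown;
    uint32_t       new_cap;

    if (needed <= snap->max_xcnt)
        return 0;               /* current allocation is already large enough */

    /* Start from 16 entries and double until the request fits. */
    new_cap = snap->max_xcnt ? snap->max_xcnt : 16;
    while (new_cap < needed)
    {
        if (new_cap > UINT32_MAX / 2)
        {
            new_cap = needed;   /* doubling would overflow; use exact size */
            break;
        }
        new_cap *= 2;
    }

    /* Guard the byte-size multiplication on platforms with a small size_t. */
    if ((size_t) new_cap > SIZE_MAX / sizeof(TransactionId))
        return -1;

    grown = realloc(snap->xip, (size_t) new_cap * sizeof(TransactionId));
    if (grown == NULL)
        return -1;              /* the old array is left untouched */

    snap->xip = grown;
    snap->max_xcnt = new_cap;
    return 0;
}

/* Copy `count` XIDs into the snapshot, growing xip first if necessary. */
static int
demo_snapshot_set_xids(DemoSnapshot *snap, const TransactionId *xids,
                       uint32_t count)
{
    uint32_t i;

    if (demo_snapshot_reserve(snap, count) != 0)
        return -1;

    for (i = 0; i < count; i++)
        snap->xip[i] = xids[i];
    snap->xcnt = count;

    assert(snap->xcnt <= snap->max_xcnt);   /* never write past the allocation */
    return 0;
}
```

A caller would start from a zero-initialized `DemoSnapshot`, call `demo_snapshot_set_xids()` with whatever number of XIDs arrives, and `free()` the `xip` array when done. Whether the real fix grows the array on demand or allocates a larger bound up front is not stated in this ticket.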

     
  • Koichi Suzuki

    Koichi Suzuki - 2012-12-07

    This is similar to procarray_c_20121206_02.patch but for the current master. Regression tested. DBT-2 and e-fence check should be done before commit.

     
  • Koichi Suzuki

    Koichi Suzuki - 2012-12-07

    I have uploaded two patches for the fix, one for REL1_0_STABLE and one for master. Both passed the regression tests. I will run DBT-2 with and without e-fence.

     
  • Koichi Suzuki

    Koichi Suzuki - 2012-12-10
    • status: open --> closed
     
  • Koichi Suzuki

    Koichi Suzuki - 2012-12-10

    The patches procarray_c_20121206_02.patch and procarray_c_20121207_01_master.patch fix the bug for REL1_0_STABLE and master respectively.

    Commit ID for REL1_0_STABLE: 742ec027820a1cb4a72e7f30eefd8d3c895ef947
    Commit ID for master: 07e15efd88e29673064585cb5aeaf5043273ddcd

     
