STXXL 3.1: persistent vector corrupted after

  • Ingo

    First of all, thanks for the great and useful work.

    Unfortunately, I encounter a certain problem with stxxl::vector created with a given file for persistence.

    1. I create the vector of - say - doubles and add data.
    2. I close the program gracefully.
    3. I run the program, it initialises the vector from the file: size() < capacity() as expected.
    4. The program crashes.
    5. I restart the program, but now the size() of the vector equals its capacity(), and zeroes have been added beyond the expected size().

    Have you any hint what I could do to get rid of this behaviour?

    typedef stxxl::VECTOR_GENERATOR<double, PAGE_SIZE, PAGES, PAIR_BSIZE, stxxl::RC, stxxl::lru>::result VT;
    stxxl::syscall_file file(path, stxxl::file::CREAT | stxxl::file::RDWR);
    VT vec(&file);

  • Andreas

    If the vector contains constant data once it has been created successfully, you can open it read-only and use a const stxxl::vector<…> reference to access it.

    If you modify the vector and the program crashes, the file is left in an inconsistent state. The mismatched size is only the most obvious symptom: any block in the file may be a) old, b) new, or c) partially updated. There are no "transactions" or similar mechanisms in stxxl that would let you tell afterwards what happened to the file or replay the missing parts.
    To be safe, you'll have to copy the data: write it to a temporary file, and rename that file only after it has been successfully written and closed.
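
    The write-to-a-temporary-file-then-rename idea can be sketched in plain standard C++ (no STXXL calls involved). The helper names replace_file_safely and read_file and the ".tmp" suffix are made up for this illustration. The sketch also assumes a POSIX-like system, where std::rename replaces an existing target; on other platforms the rename may fail if the target already exists.

```cpp
#include <cstdio>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper, not part of STXXL: write `data` to "<path>.tmp" and
// rename it to `path` only after the write succeeded and the file was closed.
// A crash before the rename leaves the previous `path` untouched.
// Note: nothing is fsync'ed here, so this does not protect against power loss.
void replace_file_safely(const std::string& path, const std::vector<double>& data)
{
    const std::string tmp = path + ".tmp";
    {
        std::ofstream out(tmp.c_str(), std::ios::binary | std::ios::trunc);
        if (!out)
            throw std::runtime_error("cannot open " + tmp);
        if (!data.empty())
            out.write(reinterpret_cast<const char*>(data.data()),
                      static_cast<std::streamsize>(data.size() * sizeof(double)));
        out.close();  // flushes buffered data; sets failbit on error
        if (!out) {
            std::remove(tmp.c_str());
            throw std::runtime_error("write failed: " + tmp);
        }
    }
    // Only now does the new content become visible under the real name.
    if (std::rename(tmp.c_str(), path.c_str()) != 0) {
        std::remove(tmp.c_str());
        throw std::runtime_error("rename failed: " + tmp);
    }
}

// Hypothetical helper: read the file back as a sequence of raw doubles.
std::vector<double> read_file(const std::string& path)
{
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in)
        throw std::runtime_error("cannot open " + path);
    std::vector<double> data;
    double value;
    while (in.read(reinterpret_cast<char*>(&value), sizeof(value)))
        data.push_back(value);
    return data;
}
```

    With an stxxl::vector the same pattern would presumably mean copying the elements into a vector backed by the temporary file, destroying that vector and its file object, and only then renaming. The cost is a full copy of the data on every update.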


  • Ingo

    Thanks a lot, Andreas! I changed the file mode to RDONLY wherever possible and added const where I had forgotten it, and the whole thing is much more robust now.

    BTW, I was already using the backup trick for certain data, but thanks anyway. Using it for all data structures is not really feasible given the huge amount of data involved. The application deals with data at TB scale, which would be much harder without stxxl in the toolbox.

    Many thanks,