Menu

Question about space reuse strategy

Help
silvio
2017-02-17
2018-07-17
  • silvio

    silvio - 2017-02-17

    Hello all,

    I am running some comparisons between HyperSQL (which we currently use in our development setups) and our current production database engine (H2), both using file based (cached) tables. One test gives a striking difference. It drops a table and then recreates and populates it with data. In the H2 case we see an almost constant database size, no matter how often we run the test. My HyperSQL database is only about 60% of the size of my H2 database after the first run (which is good). But after every rerun of the test the database size increases linearly (which is not good). After 4 runs it has grown to 2.5 times the size of my H2 database after 4 runs.

    Therefore my question is what the space reuse strategy is with cached tables. Our application manages sets of tables with a specific usage scenario. Dropping and recreating/populating tables is very common in that scenario. I am afraid my databases will keep growing so I would like to know what to expect.

    Cheers,

    Silvio

     
  • silvio

    silvio - 2017-02-17

    Some closer observation revealed that database size does not grow linearly but exponentially (it doubles). The number of runs per doubling increases, which suggests it is filling up the available space. But as said: the table is dropped and recreated every time so reusing that space would make increasing the database unnecessary.

     

    Last edit: silvio 2017-02-17
  • silvio

    silvio - 2017-02-17

    I have finished my tests. HyperSQL can insert records at about 2.5 times the speed H2 can. Unfortunately the database has grown to 257Mb versus 19Mb for H2 after exactly the same number of runs...

     
  • Fred Toussi

    Fred Toussi - 2017-02-17

    With the next version of HSQLDB a setting will allow full reuse of space after dropping a table. Please compile from SVN trunk/HEAD and execute SET FILES SPACE 2 on the empty database and SET TABLE <name> NEW SPACE after creating the table.

     
  • silvio

    silvio - 2017-02-17

    Hello Fred.

    Thanks for the response. I have never compiled HyperSQL from source but will look into that.

    Unfortunately I need to do everything from the connection setup level since the application code needs to stay database agnostic. We need to run in hybrid database mode supporting HyperSQL for new stuff and H2 for existing stuff to gradually phase-in HyperSQL. Each new user "project" has its own database so we can use HyperSQL for new projects while having the existing projects run as they already did on H2.

    Is it (or will it be) possible to manipulate FILES SPACE and NEW SPACE through the connection string?

     
  • Fred Toussi

    Fred Toussi - 2017-02-17

    The SET FILES SPACE can be on the connection url with hsqldb_files_space=2 but the other setting is per table and must be used individually.

    If the per-table setting is not used, the space is still recovered, but not fully and not as fast.

    You should enable your app to execute additional SQL after table creation.

     
  • silvio

    silvio - 2017-02-21

    I understand that NEW SPACE is a per table setting. I was just hoping for something like hsqldb.default_table_type=cached since that allows setting the default for the per table type setting thus making specific SQL syntax redundant.

    I am looking into enabling the app to execute additional SQL after table creation. The problem is that the app uses some SQL-abstraction layer that allows per-backend tweaking of connection strings and, within strict boundaries, SQL syntax like column type names but something like this has not been foreseen. Not saying it can't be done but it will be quite an endeavour.

     

    Last edit: silvio 2017-02-22
  • silvio

    silvio - 2018-06-05

    Well, it has taken some time but I have managed to implement this. But now I get:

    Access is denied in statement [SET TABLE "PUBLIC"."DEMO" NEW SPACE]

    Any ideas?

     
  • silvio

    silvio - 2018-06-05

    BTW: I am using 2.4.1 for this now.

     
  • Fred Toussi

    Fred Toussi - 2018-06-05

    You need to set SET FILES SPACE 2 before the other command.

     
  • silvio

    silvio - 2018-06-06

    I have tried that approach using

    hsqldb.files_space=2

    in the connection string. In that case the error does not occur but a very strange thing happens: during the first test-run a database.data file is created and the data is written to it. The second run succesfully retrieves the data, drops, recreates and repopulates the tables. But after that any subsequent run does no longer see the tables created during the previous runs. After the first run the database.data file is never touched again. It remains the exact same size and the timestamp does not change. Neither does the file database.script change.
    It almost appears as if everything is done in memory tables. I use the connection property

    hsqldb.default_table_type=cached

    to enforce cached tables. My connection string is the same during the first and subsequent connects. I delete the complete database before the first run.

     
  • Fred Toussi

    Fred Toussi - 2018-07-11

    Do a SHUTDOWN at the end of the tests. Do a SELECT on the INFORMATION_SCHEMA.SYSTEM_TABLESTATS to see how the space is used.

     

    Last edit: Fred Toussi 2018-07-11
  • silvio

    silvio - 2018-07-17

    Thank you, that does force a write of the database files. And the database seems to stay at the same size so the space is indeed reused.

    Unfortunately the shutdown subsequently triggers failures on connections that are pulled out of my connection pool for the same database. So I need to shutdown the database and purge all connections for this database from the pool to force a database write.

    Is there any other way to force the writes without shutting down the database?

     
  • Fred Toussi

    Fred Toussi - 2018-07-17

    You can do a CHECKPOINT instead at any time during the tests.

     

Log in to post a comment.