I am running some comparisons between HyperSQL (which we currently use in our development setups) and our current production database engine (H2), both using file based (cached) tables. One test gives a striking difference. It drops a table and then recreates and populates it with data. In the H2 case we see an almost constant database size, no matter how often we run the test. My HyperSQL database is only about 60% of the size of my H2 database after the first run (which is good). But after every rerun of the test the database size increases linearly (which is not good). After 4 runs it has grown to 2.5 times the size of my H2 database after 4 runs.
Therefore my question is what the space reuse strategy is with cached tables. Our application manages sets of tables with a specific usage scenario. Dropping and recreating/populating tables is very common in that scenario. I am afraid my databases will keep growing so I would like to know what to expect.
Cheers,
Silvio
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Some closer observation revealed that database size does not grow linearly but exponentially (it doubles). The number of runs per doubling increases, which suggests it is filling up the available space. But as said: the table is dropped and recreated every time so reusing that space would make increasing the database unnecessary.
Last edit: silvio 2017-02-17
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I have finished my tests. HyperSQL can insert records at about 2.5 times the speed H2 can. Unfortunately the database has grown to 257Mb versus 19Mb for H2 after exactly the same number of runs...
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
With the next version of HSQLDB a setting will allow full reuse of space after dropping a table. Please compile from SVN trunk/HEAD and execute SET FILES SPACE 2 on the empty database and SET TABLE <name> NEW SPACE after creating the table.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Thanks for the response. I have never compiled HyperSQL from source but will look into that.
Unfortunately I need to do everything from the connection setup level since the application code needs to stay database agnostic. We need to run in hybrid database mode supporting HyperSQL for new stuff and H2 for existing stuff to gradually phase-in HyperSQL. Each new user "project" has its own database so we can use HyperSQL for new projects while having the existing projects run as they already did on H2.
Is it (or will it be) possible to manipulate FILES SPACE and NEW SPACE through the connection string?
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I understand that NEW SPACE is a per table setting. I was just hoping for something like hsqldb.default_table_type=cached since that allows setting the default for the per table type setting thus making specific SQL syntax redundant.
I am looking into enabling the app to execute additional SQL after table creation. The problem is that the app uses some SQL-abstraction layer that allows per-backend tweaking of connection strings and, within strict boundaries, SQL syntax like column type names but something like this has not been foreseen. Not saying it can't be done but it will be quite an endeavour.
Last edit: silvio 2017-02-22
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
in the connection string. In that case the error does not occur but a very strange thing happens: during the first test-run a database.data file is created and the data is written to it. The second run succesfully retrieves the data, drops, recreates and repopulates the tables. But after that any subsequent run does no longer see the tables created during the previous runs. After the first run the database.data file is never touched again. It remains the exact same size and the timestamp does not change. Neither does the file database.script change.
It almost appears as if everything is done in memory tables. I use the connection property
hsqldb.default_table_type=cached
to enforce cached tables. My connection string is the same during the first and subsequent connects. I delete the complete database before the first run.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Thank you, that does force a write of the database files. And the database seems to stay at the same size so the space is indeed reused.
Unfortunately the shutdown subsequently triggers failures on connections that are pulled out of my connection pool for the same database. So I need to shutdown the database and purge all connections for this database from the pool to force a database write.
Is there any other way to force the writes without shutting down the database?
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Hello all,
I am running some comparisons between HyperSQL (which we currently use in our development setups) and our current production database engine (H2), both using file based (cached) tables. One test gives a striking difference. It drops a table and then recreates and populates it with data. In the H2 case we see an almost constant database size, no matter how often we run the test. My HyperSQL database is only about 60% of the size of my H2 database after the first run (which is good). But after every rerun of the test the database size increases linearly (which is not good). After 4 runs it has grown to 2.5 times the size of my H2 database after 4 runs.
Therefore my question is what the space reuse strategy is with cached tables. Our application manages sets of tables with a specific usage scenario. Dropping and recreating/populating tables is very common in that scenario. I am afraid my databases will keep growing so I would like to know what to expect.
Cheers,
Silvio
Some closer observation revealed that database size does not grow linearly but exponentially (it doubles). The number of runs per doubling increases, which suggests it is filling up the available space. But as said: the table is dropped and recreated every time so reusing that space would make increasing the database unnecessary.
Last edit: silvio 2017-02-17
I have finished my tests. HyperSQL can insert records at about 2.5 times the speed H2 can. Unfortunately the database has grown to 257Mb versus 19Mb for H2 after exactly the same number of runs...
With the next version of HSQLDB a setting will allow full reuse of space after dropping a table. Please compile from SVN trunk/HEAD and execute SET FILES SPACE 2 on the empty database and SET TABLE <name> NEW SPACE after creating the table.
Hello Fred.
Thanks for the response. I have never compiled HyperSQL from source but will look into that.
Unfortunately I need to do everything from the connection setup level since the application code needs to stay database agnostic. We need to run in hybrid database mode supporting HyperSQL for new stuff and H2 for existing stuff to gradually phase-in HyperSQL. Each new user "project" has its own database so we can use HyperSQL for new projects while having the existing projects run as they already did on H2.
Is it (or will it be) possible to manipulate FILES SPACE and NEW SPACE through the connection string?
The SET FILES SPACE can be on the connection url with hsqldb_files_space=2 but the other setting is per table and must be used individually.
If the per-table setting is not used, the space is still recovered, but not fully and not as fast.
You should enable your app to execute additional SQL after table creation.
I understand that NEW SPACE is a per table setting. I was just hoping for something like hsqldb.default_table_type=cached since that allows setting the default for the per table type setting thus making specific SQL syntax redundant.
I am looking into enabling the app to execute additional SQL after table creation. The problem is that the app uses some SQL-abstraction layer that allows per-backend tweaking of connection strings and, within strict boundaries, SQL syntax like column type names but something like this has not been foreseen. Not saying it can't be done but it will be quite an endeavour.
Last edit: silvio 2017-02-22
Well, it has taken some time but I have managed to implement this. But now I get:
Access is denied in statement [SET TABLE "PUBLIC"."DEMO" NEW SPACE]
Any ideas?
BTW: I am using 2.4.1 for this now.
You need to set SET FILES SPACE 2 before the other command.
I have tried that approach using
hsqldb.files_space=2
in the connection string. In that case the error does not occur but a very strange thing happens: during the first test-run a database.data file is created and the data is written to it. The second run succesfully retrieves the data, drops, recreates and repopulates the tables. But after that any subsequent run does no longer see the tables created during the previous runs. After the first run the database.data file is never touched again. It remains the exact same size and the timestamp does not change. Neither does the file database.script change.
It almost appears as if everything is done in memory tables. I use the connection property
hsqldb.default_table_type=cached
to enforce cached tables. My connection string is the same during the first and subsequent connects. I delete the complete database before the first run.
Do a SHUTDOWN at the end of the tests. Do a SELECT on the INFORMATION_SCHEMA.SYSTEM_TABLESTATS to see how the space is used.
Last edit: Fred Toussi 2018-07-11
Thank you, that does force a write of the database files. And the database seems to stay at the same size so the space is indeed reused.
Unfortunately the shutdown subsequently triggers failures on connections that are pulled out of my connection pool for the same database. So I need to shutdown the database and purge all connections for this database from the pool to force a database write.
Is there any other way to force the writes without shutting down the database?
You can do a CHECKPOINT instead at any time during the tests.