But to maintain the correct order, that is, to have the log file saved before anything is written to the data file, the sync is only required in TransactionManager#synchronizeLogFromMemory (or in TransactionManager#close). A usual commit only puts new records into the log and does not touch the data file, so a sync is not required on every transaction.
It would be a good optimization to sync the log only right before the data file begins to be modified (keeping in mind that the size of the log is configurable), and only a few source modifications are needed to achieve this.
The database will still be consistent (it will not be damaged), although some of the last transactions might be lost after a system failure.
Could you please confirm that I am right about consistency?
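For readers following the thread, the ordering described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not JDBM's TransactionManager: the class DeferredSyncLog, its methods, and the maxPending parameter (a stand-in for the configurable log size) are all invented for this example. Record framing and the startup recovery step that would replay a surviving log are left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: sync the log only right before the data file is modified.
// NOT JDBM code; every name here is an assumption made for illustration.
final class DeferredSyncLog implements AutoCloseable {
    private final FileChannel log;
    private final FileChannel data;
    private final List<byte[]> pending = new ArrayList<>(); // committed, not yet in the data file
    private final int maxPending;                            // stand-in for the configurable log size
    private int logSyncs = 0;

    DeferredSyncLog(Path logPath, Path dataPath, int maxPending) {
        try {
            log = FileChannel.open(logPath, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
            data = FileChannel.open(dataPath, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        this.maxPending = maxPending;
    }

    // Demo helper: opens a fresh log/data pair in a temporary directory.
    static DeferredSyncLog openInTempDir(int maxPending) {
        try {
            Path dir = Files.createTempDirectory("deferred-sync");
            return new DeferredSyncLog(dir.resolve("db.log"), dir.resolve("db.dat"), maxPending);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void writeFully(FileChannel ch, ByteBuffer buf) throws IOException {
        while (buf.hasRemaining()) {
            ch.write(buf);
        }
    }

    // Commit appends to the log but does NOT force it to disk, so a crash
    // may lose the most recent commits (durability is traded away).
    void commit(byte[] record) {
        try {
            writeFully(log, ByteBuffer.wrap(record));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        pending.add(record.clone());
        if (pending.size() >= maxPending) {
            checkpoint();
        }
    }

    // The only place the log is synced: right before the data file is touched.
    void checkpoint() {
        if (pending.isEmpty()) {
            return;
        }
        try {
            log.force(true);                 // 1. log reaches disk first
            logSyncs++;
            for (byte[] record : pending) {  // 2. only now modify the data file
                writeFully(data, ByteBuffer.wrap(record));
            }
            data.force(true);                // 3. data file reaches disk
            log.truncate(0);                 // 4. log contents are no longer needed
            pending.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int logSyncs() {
        return logSyncs;
    }

    long dataSize() {
        try {
            return data.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        checkpoint();
        try {
            log.close();
            data.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        try (DeferredSyncLog db = openInTempDir(100)) {
            for (int i = 0; i < 5; i++) {
                db.commit(new byte[] {(byte) i});
            }
            System.out.println("log syncs after 5 commits: " + db.logSyncs());
        }
    }
}
```

The point of the sketch is the order inside checkpoint(): the log is forced before the first write to the data file, so a crash at any moment leaves either an untouched data file or a complete log to replay from.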
Transactional systems typically provide all ACID properties (atomicity, consistency, isolation and durability).
You are right that syncing the log only before the main database file is modified preserves consistency. But as you point out, it's possible to lose the last transactions that have been written to the log but not yet synced to disk.
This is generally considered to break the "durability" contract. In short, durability implies that once commit() returns, your data has been made durable. If you accepted an order for a customer and added it to a JDBM data structure, then you're guaranteed the order won't be lost after you call commit().
Some systems reduce the overhead of sync'ing for every transaction and instead adopt a 'group commit' policy where multiple concurrent (and therefore independent) transactions will be written to the log and synchronized together. This approach can reduce the number of I/O writes as well as increase throughput because sync is performed less often.
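One possible shape for such a group commit is sketched below, purely as an illustration. GroupCommitLog and all of its members are hypothetical names, not part of JDBM. Each committer appends its record, then either waits for a sync that another thread is already running or performs the sync itself on behalf of everything appended so far.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical group-commit sketch; NOT JDBM code.
final class GroupCommitLog implements AutoCloseable {
    private final FileChannel log;
    private final Object lock = new Object();
    private long appended = 0;        // sequence number of the last appended record
    private long durable = 0;         // highest sequence number known to be on disk
    private boolean syncInProgress = false;
    private int syncCalls = 0;

    GroupCommitLog(Path path) {
        try {
            log = FileChannel.open(path, StandardOpenOption.CREATE,
                    StandardOpenOption.WRITE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo helper: opens a log in a fresh temporary file.
    static GroupCommitLog openTemp() {
        try {
            return new GroupCommitLog(Files.createTempFile("group-commit", ".log"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns only after the record is on disk, but one force() may cover many commits.
    void commit(byte[] record) {
        long mySeq;
        synchronized (lock) {
            try {
                ByteBuffer buf = ByteBuffer.wrap(record);
                while (buf.hasRemaining()) {
                    log.write(buf);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            mySeq = ++appended;
        }
        while (true) {
            long target;
            synchronized (lock) {
                // Follower: another thread is syncing, wait for its result.
                while (syncInProgress && durable < mySeq) {
                    try {
                        lock.wait();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        throw new IllegalStateException("interrupted before record became durable", e);
                    }
                }
                if (durable >= mySeq) {
                    return; // someone else's sync already covered this record
                }
                syncInProgress = true; // leader: sync for everyone appended so far
                target = appended;
            }
            boolean ok = false;
            try {
                log.force(true); // performed outside the lock so others can keep appending
                ok = true;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } finally {
                synchronized (lock) {
                    if (ok) {
                        durable = Math.max(durable, target);
                    }
                    syncCalls++;
                    syncInProgress = false;
                    lock.notifyAll();
                }
            }
        }
    }

    long durable() {
        synchronized (lock) {
            return durable;
        }
    }

    int syncCalls() {
        synchronized (lock) {
            return syncCalls;
        }
    }

    @Override
    public void close() {
        try {
            log.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        GroupCommitLog g = openTemp();
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> g.commit(new byte[] {42}));
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println(g.durable() + " commits, " + g.syncCalls() + " sync calls");
        g.close();
    }
}
```

With a single thread this degenerates to one sync per commit. Any saving comes only from commits that overlap in time, and the exact number of sync calls depends on thread timing.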
Thanks, this is what I thought.
To me, consistency after a failure is in most places far more important than durability.
Are you interested in a patch providing this optional (configurable) optimization?
> Some systems reduce the overhead of sync'ing
> for every transaction and instead adopt a 'group
> commit' policy where multiple concurrent (and
> therefore independent) transactions will be
> written to the log and synchronized together.
> This approach can reduce the number of I/O
> writes as well as increase throughput because
> sync is performed less often.
Not applicable in my case, as I have a lot of non-concurrent databases (I don't even need the JDBM classes to be thread-safe, as I synchronize outside).
If you have a patch that applies cleanly to the current CVS code, please post it and I'll commit it to CVS.