For convenience, the following repeats the MoelnirFile class description.
Complete Java API documentation is available in the developer file under the 'doc' folder.
General
Moelnir is a database file which stores and retrieves user data records in
a heap organisation, including the administration and re-use of garbage
space. The advantage of Moelnir is that each record is stored in a
contiguous segment of the file, while the size of each record remains
variable without limit and can be modified. Garbage space, which results
from removed or truncated records, is organised automatically, as are the
extension and truncation of valid data space. This design allows for very
fast reading and writing of data. A disadvantage of this design is reduced
scalability, since an administration entry is kept in core memory for each
record. The memory demand for administration is on the order of 100 bytes
per record. A practical upper limit can be expected at about 1 million
entries. Reading and writing performance is fairly constant, while
insertions and removals slow down linearly with the number of entries.
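As a rough illustration of the scalability limit, the core memory needed for
administration can be estimated from the two figures given above. The sketch
below is plain arithmetic; the class and method names are hypothetical and
not part of the Moelnir API, and 100 bytes per record is the stated order of
magnitude, not a measured value.

```java
// Hypothetical helper for illustration only; not part of the Moelnir API.
final class AdminMemoryEstimate {
    // Order-of-magnitude figure taken from the description above (assumption).
    static final long BYTES_PER_RECORD = 100;

    /** Returns the estimated administration memory in bytes. */
    static long estimateBytes(long records) {
        return records * BYTES_PER_RECORD;
    }

    public static void main(String[] args) {
        long bytes = estimateBytes(1_000_000L);
        // 1 million records at about 100 bytes each
        System.out.println(bytes / 1_000_000 + " MB"); // prints "100 MB"
    }
}
```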
Data Chunks
The storage units for user records are called "chunks" in the parlance of
this class's description.
The class generates identifiers (type long integer) for allocated chunks;
alternatively, the user can suggest an identifier, which is accepted if it
is not in use. The system guarantees that all chunk identifiers in a file
are unique. Chunks as such are a rigid structure. They are, however,
automatically resized and moved in the file as needed. Chunks have a buffer
size and a data length (user data), which are different values. At the
user level, data of any size can be written to and read from a data chunk.
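The identifier rule and the two size values of a chunk can be pictured with
the following minimal sketch. It is not the MoelnirFile implementation: the
class names ChunkIdRegistry and ChunkInfo and all their methods are
hypothetical, and the sketch assumes that the data length never exceeds the
buffer size.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; names are illustrative and not the MoelnirFile API.
final class ChunkIdRegistry {
    private final Set<Long> inUse = new HashSet<>();
    private long next = 1;

    /** Generates an identifier that is not in use yet. */
    long generate() {
        while (inUse.contains(next)) {
            next++;
        }
        inUse.add(next);
        return next++;
    }

    /** Accepts the suggested identifier only if it is not in use. */
    boolean suggest(long id) {
        return inUse.add(id);
    }

    /** Frees an identifier, for example when its chunk is removed. */
    void release(long id) {
        inUse.remove(id);
    }

    public static void main(String[] args) {
        ChunkIdRegistry ids = new ChunkIdRegistry();
        System.out.println(ids.suggest(1L)); // true: 1 was free
        System.out.println(ids.suggest(1L)); // false: 1 is now in use
        System.out.println(ids.generate());  // 2: the next free identifier
    }
}

// Hypothetical descriptor showing buffer size and data length side by side.
final class ChunkInfo {
    final long id;
    final long bufferSize; // space reserved for the chunk in the file
    final long dataLength; // user data actually stored

    ChunkInfo(long id, long bufferSize, long dataLength) {
        if (dataLength > bufferSize) {
            // Assumption of this sketch: user data must fit into the buffer.
            throw new IllegalArgumentException("data exceeds buffer");
        }
        this.id = id;
        this.bufferSize = bufferSize;
        this.dataLength = dataLength;
    }

    /** Buffer space not occupied by user data. */
    long unusedSpace() {
        return bufferSize - dataLength;
    }
}
```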
Garbage Chunks
When a data chunk is removed, it becomes a garbage chunk. Garbage chunks
are nameless, which means that their former identifiers become available
for new purposes at the user interface as soon as the chunks are invalidated.
Garbage chunks are simply free data space and are merged with other garbage
where feasible.
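Merging of garbage can be sketched as the coalescing of adjacent free
segments. The following is only one possible way to do it, assuming garbage
is tracked as non-overlapping (offset, length) pairs; the class GarbageMap
and its methods are hypothetical and do not describe Moelnir's internal
data structures.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of merging adjacent garbage; not the Moelnir internals.
final class GarbageMap {
    // file offset -> length of the free segment starting at that offset
    private final TreeMap<Long, Long> free = new TreeMap<>();

    /** Adds a free segment and merges it with directly adjacent neighbours. */
    void add(long offset, long length) {
        Map.Entry<Long, Long> before = free.floorEntry(offset);
        if (before != null && before.getKey() + before.getValue() == offset) {
            // The preceding segment ends exactly where the new one starts.
            offset = before.getKey();
            length += before.getValue();
            free.remove(before.getKey());
        }
        Map.Entry<Long, Long> after = free.ceilingEntry(offset + length);
        if (after != null && after.getKey() == offset + length) {
            // The following segment starts exactly where the new one ends.
            length += after.getValue();
            free.remove(after.getKey());
        }
        free.put(offset, length);
    }

    int segmentCount() {
        return free.size();
    }

    /** Length of the segment starting at offset, or null if there is none. */
    Long lengthAt(long offset) {
        return free.get(offset);
    }
}
```

Adding the segments (0, 100) and (200, 50) leaves two separate entries;
adding (100, 100) afterwards closes the gap, and the three collapse into a
single segment of length 250 at offset 0.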
Chunk Creation and Life
New chunks are preferably created from existing garbage chunks, either by
taking over their buffer size or by dividing them into two separate chunks,
where the returned chunk's buffer is dimensioned to fit the requested
size. If no matching garbage is available, the new chunk is allocated
at the end of the file. A chunk (as referenced by its name) exists until the
user explicitly removes it; as a data set it is moved around in the file as
required, and a fixed location cannot be assumed.
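The creation rule can be sketched as follows. This is a simplified model
under stated assumptions, not the actual allocation code: it uses a
first-fit search, and it takes over a garbage chunk whole when the
remainder after a split would be smaller than a minimum chunk size. Both
choices, as well as the names ChunkAllocator and Segment, are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical first-fit sketch; the real selection strategy may differ.
final class ChunkAllocator {
    static final class Segment {
        final long offset;
        final long length;

        Segment(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    private final List<Segment> garbage = new ArrayList<>();
    private final long minChunkSize;
    private long fileEnd;

    ChunkAllocator(long fileEnd, long minChunkSize) {
        this.fileEnd = fileEnd;
        this.minChunkSize = minChunkSize;
    }

    void addGarbage(long offset, long length) {
        garbage.add(new Segment(offset, length));
    }

    /** Returns the buffer of a new chunk that holds at least size bytes. */
    Segment allocate(long size) {
        for (int i = 0; i < garbage.size(); i++) {
            Segment g = garbage.get(i);
            if (g.length < size) {
                continue; // too small, try the next garbage chunk
            }
            long rest = g.length - size;
            if (rest < minChunkSize) {
                garbage.remove(i); // take over the whole buffer
                return g;
            }
            // Split: the front part becomes the chunk, the rest stays garbage.
            garbage.set(i, new Segment(g.offset + size, rest));
            return new Segment(g.offset, size);
        }
        // No matching garbage: allocate at the end of the file.
        Segment s = new Segment(fileEnd, size);
        fileEnd += size;
        return s;
    }

    long fileEnd() {
        return fileEnd;
    }

    int garbageCount() {
        return garbage.size();
    }
}
```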
Garbage Collection
Garbage collection has two operation modes: IMMEDIATE and ON_DEMAND.
In IMMEDIATE mode, each action that removes a data chunk (which may also
occur implicitly in write operations) checks whether the chunk
neighbourhood can be optimised, which takes time.
In ON_DEMAND mode, garbage optimisation is never performed automatically
(until the file closes); the user has to trigger it by calling the method
garbageCollect().
Normally it is safe to keep the default mode IMMEDIATE. If your project has
special demands on speed, a huge number of chunks and many insertions and
removals of elements, switching to ON_DEMAND mode becomes attractive, with
garbage collections performed during quieter times. Garbage collection is
always implied by the 'close()' method.
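The difference between the two settings can be illustrated with a small
model. It only counts how often the costly optimisation step would run; the
class CollectPolicySketch, its Mode enum and its counters are hypothetical,
and only the names garbageCollect() and close() are taken from the
description above.

```java
// Hypothetical model of the two collection settings; not the MoelnirFile code.
final class CollectPolicySketch {
    enum Mode { IMMEDIATE, ON_DEMAND }

    private final Mode mode;
    private int pendingRemovals; // removals whose garbage is not yet optimised
    private int optimisePasses;  // how often the costly step has run

    CollectPolicySketch(Mode mode) {
        this.mode = mode;
    }

    /** Removes a chunk; in IMMEDIATE mode the optimisation runs right away. */
    void remove() {
        pendingRemovals++;
        if (mode == Mode.IMMEDIATE) {
            garbageCollect();
        }
    }

    /** Runs the optimisation if there is anything to optimise. */
    void garbageCollect() {
        if (pendingRemovals == 0) {
            return;
        }
        // Placeholder for the costly neighbourhood optimisation.
        pendingRemovals = 0;
        optimisePasses++;
    }

    /** Closing always implies a garbage collection. */
    void close() {
        garbageCollect();
    }

    int optimisePasses() {
        return optimisePasses;
    }

    int pendingRemovals() {
        return pendingRemovals;
    }
}
```

In this model, three removals cost three optimisation passes in IMMEDIATE
mode, but only a single pass in ON_DEMAND mode, paid when garbageCollect()
or close() is eventually called.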
Streaming Devices
IO streams are available for reading and writing chunks. Multiple
read streams may be open on a single chunk at a time, but a write stream
blocks any other write or remove action through the interface.
Care has to be taken that write streams are closed when output is
finished. Write streams are an elegant way to transfer large amounts of
data to a chunk, as size restrictions play no role. They can also be used
to append to existing chunk data.
Reading and writing on a chunk may occur in parallel through IO streams.
The rule is that a flush or close operation on a write stream makes new
content available in the chunk, while a read stream throws a
ConcurrentModificationException if its chunk was modified while the
stream was open.
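The visibility rule for parallel streams can be modelled in memory as shown
below. This is a sketch of the behaviour only, assuming a simple version
counter; ChunkSketch, openRead() and openWrite() are hypothetical names and
not the stream methods of MoelnirFile. Each write stream in the sketch
starts empty and replaces the chunk content when it publishes.

```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ConcurrentModificationException;

// Hypothetical in-memory model of the stream rule; not the MoelnirFile API.
final class ChunkSketch {
    private byte[] data = new byte[0];
    private int version; // incremented whenever new content is published

    /** Read stream over the current content; fails once the chunk changes. */
    InputStream openRead() {
        final byte[] snapshot = data;
        final int openedAt = version;
        return new InputStream() {
            private int pos;

            @Override
            public int read() {
                if (ChunkSketch.this.version != openedAt) {
                    throw new ConcurrentModificationException("chunk modified");
                }
                return pos < snapshot.length ? (snapshot[pos++] & 0xFF) : -1;
            }
        };
    }

    /** Write stream; content becomes visible on flush() or close(). */
    OutputStream openWrite() {
        return new ByteArrayOutputStream() {
            private int committed; // bytes already published

            @Override
            public void flush() {
                if (size() == committed) {
                    return; // nothing new to publish
                }
                ChunkSketch.this.data = toByteArray();
                committed = size();
                ChunkSketch.this.version++;
            }

            @Override
            public void close() {
                flush();
            }
        };
    }

    int length() {
        return data.length;
    }
}
```

Data written to the stream returned by openWrite() does not change the
chunk until flush() or close() is called; from that moment on, any read
stream opened earlier throws on its next read.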
Instance Properties
The following properties can be set on a single file instance.
SECTORISED - boolean, whether storage buffer sizes align to a sector
ceiling (read only; determined by SECTOR-SIZE)
SECTOR-SIZE - int, the sector size of data storage
MIN-CHUNK-SIZE - long, the minimum buffer length of data chunks
MAX-WRITE-BUFFER - int, maximum core buffer size for output streams
POSITIVE-IDENTIFIERS - boolean, whether all generated chunk-IDs are positive
AUTO-CRC - boolean, whether CRC-32 on user data is automatically stored in chunks
ACCESS-SPEED - enum, setting for urgency of execution acceleration
COLLECT-MODUS - enum, time strategy when garbage collection is performed
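Two of these properties lend themselves to a short illustration: rounding a
buffer size up to a sector ceiling, and computing a CRC-32 over user data.
The sketch below uses the standard java.util.zip.CRC32 class; the class
StoragePropertySketch and its methods are hypothetical, and treating a
sector size of 0 as "no alignment" is an assumption of this sketch, not a
documented rule.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical helpers for illustration; not part of the MoelnirFile API.
final class StoragePropertySketch {
    /**
     * Rounds size up to the next multiple of sectorSize.
     * Assumption: a sector size of 0 (or less) means no alignment.
     */
    static long alignToSector(long size, int sectorSize) {
        if (sectorSize <= 0) {
            return size;
        }
        return (size + sectorSize - 1) / sectorSize * sectorSize;
    }

    /** Computes the CRC-32 checksum of the given user data. */
    static long crc32(byte[] userData) {
        CRC32 crc = new CRC32();
        crc.update(userData, 0, userData.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        System.out.println(alignToSector(1000, 512)); // prints 1024
        byte[] check = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Long.toHexString(crc32(check))); // prints cbf43926
    }
}
```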
Thread Safety
This class is not thread-safe.