First, let me note that there is at least one bug in the database.
Having fixed several known errors, I find that there are still
occasional problems. It's a catch-22 situation, in part at least. Until
we start using the database more, we have no experience with what the
bugs are, so they are very hard to find. But without reliability, we
don't want to use the database. Still, it is not quite as bad as
that. We can develop tools for capturing and restoring content, which
would significantly reduce the amount of data at risk.
There are two nice things about having completed the citation logic.
First, the data structures should now be stable for some time, at least
until I start working on classifiers--and that will likely not be for a
few months. Second, the data structures for references (used by the
citation logic) push the database pretty hard, causing errors to
surface that would otherwise remain hidden for quite some time.
One thought here is to develop some code to exercise the database. The
problem is in knowing what to exercise. For example, creating a cabinet
already creates a lot of structures but it runs reliably. I don't see
this approach as gaining us much in the short term.
Now the database already creates a log file, which is not currently
being used. (So it is likely buggy.) This log file contains (or
should/will contain) enough information to rebuild the database from
scratch. But there may be some problems caused when the AwServer is
restarted. So I'm thinking that the first step should be to begin a new
log file (with a date stamp) each time the server is restarted. Then I
can focus on rebuilding the database using a collection of log files.
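
As a rough sketch of that first step and of the replay that follows it
(Python here purely for illustration; the "awserver-" file prefix, the
one-entry-per-line log format, and the apply_entry callback are all
assumptions, not the actual AwServer code):

```python
import os
from datetime import datetime


def new_log_path(log_dir, now=None):
    """Return a date-stamped log file path for this server start.

    The 'awserver-YYYYMMDD-HHMMSS.log' naming is hypothetical; the only
    requirement is that names sort in the order the server was started.
    """
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d-%H%M%S")
    return os.path.join(log_dir, f"awserver-{stamp}.log")


def logs_in_replay_order(log_dir):
    """List the log files oldest first, relying on the sortable date stamp."""
    names = [
        name
        for name in os.listdir(log_dir)
        if name.startswith("awserver-") and name.endswith(".log")
    ]
    return [os.path.join(log_dir, name) for name in sorted(names)]


def replay(log_dir, apply_entry):
    """Feed every non-empty log line, in order, to apply_entry.

    apply_entry stands in for whatever code would re-apply one logged
    operation to an empty database. Returns the number of entries replayed.
    """
    count = 0
    for path in logs_in_replay_order(log_dir):
        with open(path, encoding="utf-8") as log_file:
            for line in log_file:
                entry = line.rstrip("\n")
                if entry:
                    apply_entry(entry)
                    count += 1
    return count
```

The server would call new_log_path() once at startup and write only to
that file until the next restart. Two restarts within the same second
would produce the same name, so a real version would need to guard
against that.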
Bill