The Database Update Data Stream is a stream of data received from the CodeSnip Database Update web service that is used to update the local copy of the main database.
The stream is plain text and consists of a concatenation of text files from the online database along with some housekeeping information. The text files are recreated in the main database directory.
The data stream is received from the web server as single- or multi-byte ANSI encoded text. The encoding must be such that characters from the ASCII character set occupy one byte each. Therefore encodings that use two bytes for such characters, such as UTF-16, cannot be used.
The actual encoding used is determined by the web server and should be specified in the HTTP headers. If the HTTP headers do not specify the encoding then ISO-8859-1 is assumed.
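The encoding rule can be illustrated with a short Python sketch. It is not taken from CodeSnip's own code: the function name `stream_encoding` and the use of the `Content-Type` header's `charset` parameter are assumptions made for illustration. The sketch falls back to ISO-8859-1 when no charset is given and rejects encodings in which an ASCII character does not occupy exactly one byte.

```python
from email.message import Message
from typing import Optional

DEFAULT_ENCODING = "iso-8859-1"  # assumed when the headers name no encoding


def stream_encoding(content_type: Optional[str] = None) -> str:
    """Return the encoding to use for the stream (hypothetical helper).

    `content_type` is the raw value of the HTTP Content-Type header,
    or None if the server did not send one.
    """
    charset = DEFAULT_ENCODING
    if content_type:
        # Let the standard library parse the header parameters.
        msg = Message()
        msg["Content-Type"] = content_type
        charset = msg.get_content_charset() or DEFAULT_ENCODING
    # ASCII characters must occupy one byte each, which rules out
    # encodings such as UTF-16.
    if len("A".encode(charset)) != 1:
        raise ValueError(f"unsuitable stream encoding: {charset}")
    return charset
```

For example, `stream_encoding("text/plain; charset=UTF-8")` yields `"utf-8"`, while a header naming UTF-16 raises `ValueError`.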
The encoding used for the files recreated in the main database directory is UTF-8 with byte order mark.
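As a rough illustration, the following Python sketch re-encodes one file's content from the stream encoding to UTF-8 with a byte order mark. The function name `write_database_file` and its parameters are hypothetical. Python's `utf-8-sig` codec writes the BOM automatically.

```python
def write_database_file(path: str, raw: bytes, source_encoding: str) -> None:
    """Write `raw` (bytes in the stream's encoding) as UTF-8 with BOM.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    text = raw.decode(source_encoding)
    # newline="" stops Python from translating line endings, so the
    # file content is preserved exactly apart from the re-encoding.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        f.write(text)
```

A file written this way begins with the three bytes EF BB BF, followed by the UTF-8 encoded text.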
Data is converted between several formats on its journey from the web server to the final database file. See the appendix for details.
The stream contains structured plain text comprising both numeric and string information. Variable length strings are preceded by numeric values that indicate the length of the following string in bytes. Numeric values are encoded as hexadecimal characters. The format is as follows:
FileCount
Followed by FileCount file records of:
    Name
    UnixDate
    Content
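The layout above can be read with a simple cursor over the raw bytes. The Python sketch below is illustrative only. This section does not state how many hexadecimal characters each numeric field occupies, so the widths (`count_w`, `name_len_w`, `date_w`, `content_len_w`) are hypothetical parameters with placeholder defaults, and the names `Reader`, `FileRecord` and `parse_stream` are likewise invented for the example. It also assumes that UnixDate is a plain numeric field and that Name and Content are both length-prefixed strings.

```python
from typing import List, NamedTuple


class FileRecord(NamedTuple):
    name: str
    unix_date: int
    content: bytes  # still in the stream's encoding


class Reader:
    """Cursor over the raw bytes of the stream."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0

    def take(self, n: int) -> bytes:
        chunk = self.data[self.pos:self.pos + n]
        if len(chunk) != n:
            raise ValueError("unexpected end of stream")
        self.pos += n
        return chunk

    def number(self, width: int) -> int:
        # Numeric values are hexadecimal characters, one byte each.
        return int(self.take(width).decode("ascii"), 16)


def parse_stream(
    data: bytes,
    encoding: str = "iso-8859-1",
    count_w: int = 4,        # ASSUMED width of FileCount
    name_len_w: int = 4,     # ASSUMED width of the Name length prefix
    date_w: int = 8,         # ASSUMED width of UnixDate
    content_len_w: int = 8,  # ASSUMED width of the Content length prefix
) -> List[FileRecord]:
    reader = Reader(data)
    records = []
    for _ in range(reader.number(count_w)):
        # String lengths are given in bytes, so slice before decoding.
        name = reader.take(reader.number(name_len_w)).decode(encoding)
        unix_date = reader.number(date_w)
        content = reader.take(reader.number(content_len_w))
        records.append(FileRecord(name, unix_date, content))
    return records
```

Because string lengths are counted in bytes, the sketch slices the byte stream first and decodes afterwards. Decoding the whole stream up front would give wrong offsets for multi-byte encodings.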
The following flowchart shows the various encodings used for downloaded data on its journey from web server to main database file.