Currently there is one parser in lucy, and it has its own
internal buffer. There is a clear desire to put more
parsers into lucy to handle different types of data.
Only one parser needs to be in use at a time (possibly
two with an email parser, since emails can contain other
documents). Other document types might also allow
embedded documents, so nesting should probably be limited
to two levels. Because so few parsers are active at once,
a single central buffer shared between them is
possible. Another related change is document type
detection from document contents. Text could be read
into the central buffer and examined <b>before</b> a
parser is selected, so that the appropriate parser is
chosen from the content itself. The same check could
also identify bzipped and gzipped data and
transparently decompress it (adding it to the
repository if necessary).
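
As a rough illustration of the detection step, the sketch below inspects the leading bytes of a buffer, unwraps gzip or bzip2 data if it finds their signatures, and then guesses a document type. It is written in Python for brevity and is not lucy code. The function names (unwrap, detect_type, choose_parser) and the html/email/text rules are hypothetical; only the gzip signature (0x1f 0x8b) and the bzip2 signature ("BZh") come from the formats themselves.

```python
import bz2
import gzip

GZIP_MAGIC = b"\x1f\x8b"   # first two bytes of a gzip stream
BZIP2_MAGIC = b"BZh"       # first three bytes of a bzip2 stream


def unwrap(data):
    """Decompress data if it starts with a gzip or bzip2 signature."""
    if data.startswith(GZIP_MAGIC):
        return gzip.decompress(data)
    if data.startswith(BZIP2_MAGIC):
        return bz2.decompress(data)
    return data


def detect_type(data):
    """Guess a document type from the leading bytes (hypothetical rules)."""
    head = data[:512].lstrip().lower()
    if head.startswith((b"<!doctype html", b"<html")):
        return "html"
    if head.startswith((b"from ", b"return-path:", b"received:")):
        return "email"
    return "text"


def choose_parser(raw):
    """Return (parser name, uncompressed bytes) for a raw buffer."""
    data = unwrap(raw)
    return detect_type(data), data
```

A short signature can match by accident (plain text that happens to begin with "BZh", for example), in which case the decompression call raises an exception. A real implementation would probably want to fall back to treating such data as uncompressed.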