From: Paul R. <pa...@ro...> - 2003-01-21 17:16:43
> > 2. We will rarely, if ever, need to delete a message from
> > the middle of the file
>
> Is this correct? If I'm getting what you're saying, we will need to delete
> messages from the beginning middle and end of a file if they contain more
> than one message.

But in the normal (quota management) case, you'd delete oldest-first -- i.e. from the beginning. If you batch this up a bit, it's not too bad.

> Also, there's the issue of advertising posts and just plain inappropriate or
> offensive posts.

True, but hopefully not *that* frequent.

> > Building the index can be quite fast if we store content-length
> > (probably as a line count) as above.
>
> I'm guessing that what you are proposing is fixed-length records. This could
> be substantially faster on most systems, but is it easily portable?

Not fixed-length records at all. Either store the length in bytes, or the length in (variable) lines.

> This kinda brings up another concern/thought. If we use either this format
> or Simon's, don't I lose the benefit of a data-defined hierarchy? I
> understand that they both try to solve this using indexing, but I still have
> to deal with multi-level indents/nesting. In other words, with either of
> these formats, don't I have to presort all of the data, assigning index
> levels to each entry based upon the index level of the parent. With my
> format (XML or not), the hierarchy is predetermined by the nesting of each
> element, which means that they are already presorted. This means I only have
> to increment/decrement the indent level based upon the data tags
> encountered. I may be wrong, but it just sounds like a much simpler
> implementation to me (and where I'm doing it, that sounds good :-). KISS
> principle?

You're assuming (a) that we'll precompute the indent level, which I don't think is a great idea, and (b) that the indent level will continue to be correct when earlier (potentially parent) messages are deleted.
Pre-computing, storing, and re-computing what is essentially a presentation detail seems anti-KISS to me.

> > Message display can still be quite efficient if we further extend
> > the index to include the offset (by line, byte, whatever) of the
> > message in the larger file. We seek (or quickly skip lines), we grab
> > exactly content-length lines, and off we go.
>
> I do like the idea of this, but I would like to know it will work
> everywhere.

Everywhere that the file seek functions work. Or everywhere you can read a line at a time, which is... everywhere. In the skip-lines case you're still reading the lines, but not parsing them.

-paul
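
A minimal sketch of the index-and-seek idea discussed above, in Python. Nothing here comes from the thread beyond the general approach: the `Lines: <n>` header, the function names `build_index` and `read_message`, and the sample data are all hypothetical, chosen only to make the example runnable. It assumes each message is stored as one header line giving the body length as a line count, followed by exactly that many body lines.

```python
import io


def build_index(f):
    """Scan a binary file object once and return a list of
    (body_offset, line_count) pairs, one per message.

    Assumed (hypothetical) layout: a header line b"Lines: <n>\n"
    followed by exactly n body lines.
    """
    index = []
    while True:
        header = f.readline()
        if not header:  # end of file
            break
        count = int(header.split(b":", 1)[1].decode("ascii").strip())
        index.append((f.tell(), count))  # byte offset where the body starts
        for _ in range(count):
            f.readline()  # body lines are read past, but never parsed
    return index


def read_message(f, entry):
    """Seek straight to a message body and read exactly line_count lines."""
    offset, count = entry
    f.seek(offset)
    return [f.readline() for _ in range(count)]


# Two variable-length messages in one in-memory "file".
store = io.BytesIO(b"Lines: 2\nhello\nworld\nLines: 1\nsecond message\n")
index = build_index(store)
second = read_message(store, index[1])
```

The records are variable-length; only the stored line count and the offset recorded in the index are needed to fetch a message. A real file opened with `open(path, "rb")` should behave the same way as the `io.BytesIO` object used here.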