If I might jump in...
I like the basic ideas going on here, but let me throw out a few suggestions
to be shot down. I'll skip the whole XML-or-not question for now. Not
obvious that it buys us much here, though.
Not clear that we really gain much by the one-message-per-file method,
either. You quickly end up with huge directories, and so forth. I realize
there are pros and cons either way, but consider some of the characteristics
of this app vs., say, email storage:
1. Reads are much more frequent than posts
2. We will rarely, if ever, need to delete a message from the middle
of the file
3. Unlike email, we can completely track and control content-length
information for each message
Appending to the end of a file (even a large one) is obviously no big
performance problem. So what about index building, index display, and
message display?
Building the index can be quite fast if we store content-length (probably as
a line count) as above. We know exactly how many lines to skip (and
therefore not parse) at any time. Appending to the index is a one-shot if
we store an in-reply-to ID with each message, rather than a list of replies
with each parent (see the JWZ article below).
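To make that concrete, here's a rough sketch (Python purely for
illustration; the file layout, the "|" delimiter and the function names
are all my own assumptions, not a proposal for the final format). Each
message gets a one-line header carrying its ID, its in-reply-to ID and
its line count, so rebuilding the index means reading headers and
skipping bodies unparsed. Locking is left out entirely.

```python
def append_message(msg_path, idx_path, msg_id, parent_id, body):
    """Append one message to the big file and one record to the index.

    Hypothetical layout: a header line "id|parent|line_count", then the body.
    """
    lines = body.splitlines()
    record = f"{msg_id}|{parent_id}|{len(lines)}\n"
    with open(msg_path, "a", encoding="utf-8") as msgs:
        msgs.write(record)
        for line in lines:
            msgs.write(line + "\n")
    # One-shot index append: we only store our own parent ID,
    # so no existing index record ever has to be rewritten.
    with open(idx_path, "a", encoding="utf-8") as idx:
        idx.write(record)


def rebuild_index(msg_path):
    """Recover (id, parent, line_count) tuples from the message file alone."""
    entries = []
    with open(msg_path, encoding="utf-8") as msgs:
        while True:
            header = msgs.readline()
            if not header:
                break
            msg_id, parent_id, count = header.rstrip("\n").split("|")
            count = int(count)
            for _ in range(count):
                msgs.readline()  # skip the body line without parsing it
            entries.append((msg_id, parent_id, count))
    return entries
```

Because the reader skips by count, a body line that happens to look like
a header can't confuse it.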
Index display is easy if we extend the previously-mentioned format to
include Poster, Date and Subject line. The index file will still be quite
small relative to the main message text, which will never be read when we're
displaying our message tree / list.
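A minimal sketch of the display side, again in Python just for
illustration. I'm assuming one index record per line in the form
"id|parent|poster|date|subject", with "0" as the parent of a top-level
post and no "|" in any field but the subject; all of that is made up for
the example. Note that this is only the trivial parent-pointer case, not
JWZ's full algorithm.

```python
from collections import defaultdict


def load_index(lines):
    """Parse index records of the assumed form id|parent|poster|date|subject."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # maxsplit=4 lets the subject (last field) contain the delimiter
        msg_id, parent_id, poster, date, subject = line.split("|", 4)
        entries.append({"id": msg_id, "parent": parent_id,
                        "poster": poster, "date": date, "subject": subject})
    return entries


def render_tree(entries, root_id="0"):
    """Return indented display lines, one per message, in thread order."""
    children = defaultdict(list)
    for entry in entries:
        children[entry["parent"]].append(entry)

    out = []

    def walk(parent_id, depth):
        for entry in children.get(parent_id, []):
            out.append("  " * depth +
                       f'{entry["subject"]} - {entry["poster"]}, {entry["date"]}')
            walk(entry["id"], depth + 1)

    walk(root_id, 0)
    return out
```

The message text file is never opened here; everything the tree needs
comes from the index.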
Message display can still be quite efficient if we further extend the index
to include the offset (by line, byte, whatever) of the message in the larger
file. We seek (or quickly skip lines), we grab exactly content-length
lines, and off we go.
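Roughly like this, as a sketch only (Python for illustration; I've
picked a byte offset, and the function names are hypothetical). The
writer notes where the file ended before appending and hands back the
offset and line count for the index; the reader seeks there and reads
exactly that many lines.

```python
import os


def append_message(msg_path, body):
    """Append a message body; return (byte_offset, line_count) for the index."""
    lines = body.splitlines()
    with open(msg_path, "ab") as f:
        f.seek(0, os.SEEK_END)   # make the end-of-file position explicit
        offset = f.tell()
        for line in lines:
            f.write(line.encode("utf-8") + b"\n")
    return offset, len(lines)


def read_message(msg_path, offset, line_count):
    """Seek straight to the message and read exactly line_count lines."""
    with open(msg_path, "rb") as f:
        f.seek(offset)
        return [f.readline().decode("utf-8").rstrip("\n")
                for _ in range(line_count)]
```

Binary mode keeps the offsets honest regardless of platform newline
translation. Nothing before the offset is read or parsed.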
On thread-building, by the way, I highly recommend a look at Jamie
Zawinski's algorithm and pseudocode for that (for email, which is a
more-complicated version of the task we face):
Just my two cents ( 0.0187410 EUR, 0.0124266 GBP, 0.03 CAD).
----- Original Message -----
From: "Wizard" <wizard@...>
To: "Simon Wilcox" <essuu@...>
Cc: "nms-devel" <nms-cgi-devel@...>
Sent: Tuesday, January 21, 2003 10:34 AM
Subject: RE: [Nms-cgi-devel] WWWBoard2 - PLEASE RESPOND
> > Fair point but there is still an overhead in doing the regexes when my
> > format uses simple splits. Plus you need to factor in the recursiveness
> > of the format. Not that it won't work just fine but mine seems simpler
> > to implement and less resource hungry.
> If you're saying that the index file will only contain the hierarchy (by
> ID?) and no message data, then I think I get what you're saying. That
> would be much faster for posting, but for viewing the index page we're then
> talking about open & close on each and every message file to get the
> subject, user, email and date. I'd guess that much disk access is going to
> be much slower than the benefit of speeding up edits. If you're talking
> about including everything but the message text, then I can see the split
> being a benefit over regexes. I think I would use an alternate delimiter,
> though (maybe '::' or '||').
> > I agree that looking to the future is good but since it will be a major
> > upgrade to use a different backend or whatever I would suggest that as
> > long as we have the file IO in a module that provides a standard
> > interface, the implementation can be changed at any time.
> > Munging the datastore to fit whatever new strategy you want to use
> > should be straightforward.
> I honestly could care less about XML, but I thought that being an in-use
> standard might lead to greater acceptance over a proprietary format. It's
> 'Give the people what they want', as far as I'm concerned. What is it they
> want?
> > Did you see one big xml file ?
> Yes, or one wwwindex.xml with everything but individual message text,
> which would be separate files.
> > If so, it will need to be locked while you write out a new one with the
> > new post in it. This may have performance issues.
> Yes, if it includes the message text. If it's just the index with an XML
> (or whatever) thread similar to what we have now in wwwboard.html, then it
> should be faster than what we have now due to lack of XHTML
> formatting/header/form/footer output.
> > I'll try and work up some examples of what I mean as soon as possible
> > but I'm at work right now and I should be doing other things :)
> Ok, when you have a moment. I don't believe this will be resolved today
> Grant M.