From: Geoff H. <ghu...@ws...> - 2002-11-08 15:57:24
On Fri, 8 Nov 2002, Lachlan Andrew wrote:

> Regarding the flags, I can see why it makes sense to store
> the information, but it doesn't need to be as a bit-field.

I do think it makes sense to have a bit field. Remember that we're not
just planning a database for HTML documents. Yes, some of the current
bits are exclusive, but I can imagine that some XML documents might want
combined bits, e.g.:

<foo ...>
text to be indexed
<bar> more text </bar>
</foo>

Yes, some of the current flags could be in a lookup, but some (e.g.
FLAG_CAPITAL) are clearly a bit field. I could also see some situations
where FLAG_AUTHOR and FLAG_KEYWORDS are combined, and conceivably the
parser should be smart enough to decide whether FLAG_LINK_TEXT and
FLAG_URL should be combined, e.g.

<a href="http://foo.com/">foo.com</a>

Yes, you might argue these are somewhat contrived. But when we were
first planning the database format for 3.2, we considered that arbitrary
documents and XML might be included in a "3.2" release with user-defined
bits and field-restricted searching.

> can thank Mr Gates for that one... However, it could also
> be treated as "level 3 heading", unless it is already given
> extra weight somehow.

It is not given extra weight currently. Again, the catch would be with
field-restricted searches. If we treat things as a level-3 heading or
whatever, then we have to block a search at that level, as you'll get
more than you asked for.

-Geoff
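
To make the combined-bits case and the field-restricted search problem
concrete, here is a minimal sketch in Python. It is not the ht://Dig
code: the flag names follow the FLAG_* constants mentioned above, but
the numeric values, the HEADING flag, the index layout and both helper
functions are assumptions made purely for illustration.

```python
from enum import IntFlag


class WordFlag(IntFlag):
    # Names follow the FLAG_* constants mentioned in this thread; the
    # numeric values and HEADING are assumptions for illustration only.
    CAPITAL = 1
    HEADING = 2
    KEYWORDS = 4
    AUTHOR = 8
    LINK_TEXT = 16
    URL = 32


def matches_field(word_flags, wanted):
    """Return True if every bit in `wanted` is set in `word_flags`."""
    return (word_flags & wanted) == wanted


def restricted_search(index, term, wanted):
    """Return the ids of entries for `term` whose flags include `wanted`."""
    return [doc for (word, doc, flags) in index
            if word == term and matches_field(flags, wanted)]


# <a href="http://foo.com/">foo.com</a>: the link text is also the URL,
# so a parser could set both bits on the same word.
link_word = WordFlag.LINK_TEXT | WordFlag.URL

# Hypothetical index of (word, document id, flags) entries.
index = [
    ("foo.com", "doc1", link_word),
    ("release", "doc2", WordFlag.HEADING),  # a genuine heading
    ("release", "doc3", WordFlag.HEADING),  # other markup folded into "heading"
]
```

With a bit field, "foo.com" is found by a search restricted to URL and
by one restricted to LINK_TEXT, which a single-valued lookup could not
express. The last two entries show the catch with folding other markup
into a heading level: once both are stored under the same flag, a search
restricted to headings cannot tell them apart and returns doc2 and doc3.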