Re: [Ebiness-crawler] Disk storage
From: Mike D. <md...@ki...> - 2001-05-08 20:18:49
On Thu, 3 May 2001, Sellaro wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On Tue, 1 May 2001, Mike Davis wrote:
> <snip>
> > The reason I suggest using gdbms is purely because they do work well
> > and reliably when their size is below 100Mb and also we haven't got masses
> > of time to spend on writing something from scratch (that's already been
> > done anyway!)
>
> It is a very delicate point. If we choose to use gdbm (even with the
> pre-hash algorithm) we may be setting a limit on future expansion. I
> suggest we try to implement both strategies (simplified) and to take a
> look at some benchmarks.
>

Ummm, would be interesting, if you've got the time... What was your exact
strategy going to be? A B-Tree based system? It seems to me that you're quite
interested in the storage side and as such, I hereby delegate that aspect of
the project to you ;->

> Is this too bureaucratic?
>

Not sure, sounds reasonable, but we'll have to be careful we don't waste too
much time, otherwise we'll still be here next year discussing the same stuff!
From some of the other projects I've been involved with, it seems important
to keep things moving along, even if it's just at a slow pace, otherwise
everyone gets real bored and disappears...

So I think what I will do is organise a list of things that need doing, and
we can talk about them and then split them up amongst us (the interested
parties). At the moment, I'm quite interested in pushing on and starting work
on the browser portion of the project, as we've now got a semi-functional
crawler and quite a bit of data to play with.

BTW, apologies for taking so long to respond - went on a bit of a holiday :-)

Mike
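
For anyone reading the archive later: the thread does not spell out the "pre-hash" scheme, so the following is only a rough sketch under one assumption, namely that it means hashing each key first to pick one of several small database files, so that no single file grows past the size where gdbm is said to slow down. Everything here (the ShardedStore class, shard_index, the shard count of 16, the file names) is hypothetical, and Python's portable dbm.dumb module stands in for gdbm purely so the sketch runs anywhere.

```python
import dbm.dumb
import hashlib
import os

NUM_SHARDS = 16  # hypothetical value, chosen only for illustration


def shard_index(key: bytes, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard number using a stable hash of the key."""
    digest = hashlib.sha1(key).digest()
    # Use the first four bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:4], "big") % num_shards


class ShardedStore:
    """Spread records over several small dbm files instead of one big one."""

    def __init__(self, directory: str, num_shards: int = NUM_SHARDS):
        self.num_shards = num_shards
        # One database file per shard; "c" creates the file if it is missing.
        self.shards = [
            dbm.dumb.open(os.path.join(directory, f"shard{i:02d}"), "c")
            for i in range(num_shards)
        ]

    def put(self, key: bytes, value: bytes) -> None:
        self.shards[shard_index(key, self.num_shards)][key] = value

    def get(self, key: bytes) -> bytes:
        return self.shards[shard_index(key, self.num_shards)][key]

    def close(self) -> None:
        for shard in self.shards:
            shard.close()
```

A benchmark along the lines Sellaro suggests could time put and get on a store like this against a single-file store or a B-Tree based one, using the data the crawler has already collected. Note that the shard count is fixed when the store is created, which is exactly the kind of expansion limit raised in the quoted message.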