From: Richard J. <ri...@an...> - 2006-03-09 09:49:16
On Thu, Mar 09, 2006 at 10:18:55AM +0100, Bruno De Fraine wrote:
> While this is a nice prototype, I obviously doubt this is what you
> want when counting "gigabytes of things". But then again, it's not
> entirely clear what you're asking for...

Specifically we're parsing Apache logfiles from [a very large company
which you will have heard of]. They produce about 1 GB of raw
logfiles / day, which we read in, line at a time, and attempt to
deduce interesting things. There's no possibility of fitting the
logfiles into memory.

Much of the problem involves counting how many times certain events
happen.

Rich.

-- 
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Team Notepad - intranets and extranets for business - http://team-notepad.com
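
To make the shape of the problem concrete, here is a minimal sketch of line-at-a-time event counting. It is only an illustration, not the code under discussion in this thread: the regular expression, the notion of an "event" (HTTP method plus status code), and the names `REQUEST_RE`, `classify` and `count_events` are all assumptions made for the example.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Hypothetical pattern: picks the request method and the status code out of
# an Apache access-log line. Real log formats vary, so treat this as a guess.
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3})')


def classify(line: str) -> Optional[str]:
    """Map one log line to an event name, or None if the line is not of interest."""
    m = REQUEST_RE.search(line)
    if m is None:
        return None
    # Assumed definition of an "event": method plus status, e.g. "GET 200".
    return f"{m.group('method')} {m.group('status')}"


def count_events(lines: Iterable[str]) -> Counter:
    """Count events while holding only one line in memory at a time."""
    counts: Counter = Counter()
    for line in lines:
        event = classify(line)
        if event is not None:
            counts[event] += 1
    return counts


if __name__ == "__main__":
    import sys

    # A file object yields its lines lazily, so the file is never loaded whole.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
        for event, n in count_events(f).most_common():
            print(n, event)
```

With this approach memory use grows with the number of distinct events rather than with the size of the logfile, which is what makes it workable when the input cannot fit in memory.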