From: Kord C. <ko...@gr...> - 2002-11-20 19:15:31
Otis,

Good questions. We expect you, our clients, to hold us accountable for what we do with the data, since it is either your data that we are collecting or your machines that we are using to collect it.

One thing that has been holding us up is the Windows client. Now that we have it done (and hopefully stable as of today), we should be able to retain more clients for crawling. This also lets us start marketing the client without having to worry about it crashing on new users. Crashing programs tend to turn people off, strangely enough. ;)

Right now we are crawling about 3M URLs a day, with about 30-40 clients running per day. This is an average of about 100,000 URLs per day, per client. We currently have about 30M URLs in the database, so that puts our re-crawl rate at once every 10 days or so. We think that a good goal for re-crawl is about once every 7 days.

The plan is to scale the number of URLs in the database to the number of crawlers currently running. As the number of crawlers running goes up, so does the number of URLs that we can re-crawl each week.

Expect an announcement from us next week concerning our plans for making the returned data more accessible. I think you guys are going to like what we are going to make available to you.

Later,

Kord

--
--------------------------------------------------------------
Kord Campbell            Grub, Inc.
President                5500 North Western Avenue #101C
                         Oklahoma City, OK 73118
ko...@gr...              Voice: (405) 848-7000
http://www.grub.org      Fax:   (405) 848-5477
--------------------------------------------------------------

Today's Topics:

   1. Grub goals, ETA, etc. (otisg)

--__--__--

Message: 1
From: "otisg" <ot...@iV...>
To: <gru...@li...>
Cc:
Date: Mon, 18 Nov 2002 22:34:35 -0800
Subject: [Grub-general] Grub goals, ETA, etc.

Hello,

I've been running the Grub client for a while, and I am curious when some of the things mentioned at http://www.grub.org/investors.php will start happening?

Also, I am curious, what is the number of URLs that Grub has crawled so far, and I'm also wondering whether Grub is capable of re-fetching every page it knows at least once a month? I'm asking this because that's how often Google, Alltheweb, etc. do it, so I assume Grub has to do better if it wants to appear attractive to the big search engines, no?

Thanks,
Otis

--__--__--

_______________________________________________
Grub-general mailing list
Gru...@li...
https://lists.sourceforge.net/lists/listinfo/grub-general

End of Grub-general Digest
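
For anyone who wants to check the re-crawl numbers in Kord's reply, here is a rough sketch of the arithmetic in Python. The function names are made up for illustration and are not part of any Grub code; the inputs are the approximate figures quoted in the reply (about 3M URLs per day, about 30M URLs in the database, about 100,000 URLs per client per day, and a 7-day re-crawl goal).

```python
# Back-of-the-envelope re-crawl arithmetic using the approximate figures
# from the reply. All names here are hypothetical, not Grub APIs.

def recrawl_days(urls_in_db: int, urls_per_day: int) -> float:
    """Days needed to fetch every URL in the database once."""
    return urls_in_db / urls_per_day


def max_urls_for_target(urls_per_day: int, target_days: int) -> int:
    """Largest database that can be fully re-crawled within target_days."""
    return urls_per_day * target_days


def clients_needed(urls_in_db: int, per_client_per_day: int, target_days: int) -> int:
    """Clients required to re-crawl the whole database within target_days
    (ceiling division, since a fraction of a client cannot run)."""
    capacity_per_client = per_client_per_day * target_days
    return -(-urls_in_db // capacity_per_client)


if __name__ == "__main__":
    # ~30M URLs at ~3M URLs/day gives the "once every 10 days or so" figure.
    print(recrawl_days(30_000_000, 3_000_000))
    # Database size that today's crawl rate could refresh every 7 days.
    print(max_urls_for_target(3_000_000, 7))
    # Clients needed to refresh 30M URLs every 7 days at ~100,000 URLs each.
    print(clients_needed(30_000_000, 100_000, 7))
```

Under these assumptions the current crawl rate covers about 21M URLs per week, so hitting the 7-day goal means either holding the database near that size or adding clients, which matches the stated plan of scaling the database to the number of running crawlers.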