Re: [Phpgedview-talk] Google
From: Matthew G. <ma...@po...> - 2005-08-28 12:45:24
On Saturday 27 August 2005 18:29, Joe Tellup wrote:
> Me too. this past week google used up 2.3 gig of my bandwidth.
>
> Put this in you robots text file and they are gone
>
> User-agent: googlebot
> Disallow: /
>

I found that using robots.txt to partially block bots is quite
effective at reducing traffic while still permitting your site to be
indexed effectively. The trick is to find which files on the site are
causing the most traffic and block just those. I also blocked all the
charts. My idea is that I want there to be at least one entry point to
my site for each name and/or place in the database.

Here's the robots.txt (my phpGedView installation is in the directory
"gedview"):

User-agent: *
Disallow: /gedview/media/
Disallow: /gedview/timeline.php
Disallow: /gedview/fanchart.php
Disallow: /gedview/pedigree.php
Disallow: /gedview/clippings.php
Disallow: /gedview/family.php
Disallow: /gedview/ancestry.php
Disallow: /gedview/descendancy.php
Disallow: /gedview/reportengine.php
Disallow: /gedview/hourglass.php
Disallow: /gedview/calendar.php
Disallow: /gedview/patriarchlist.php

Further tuning may be helped by this command, which shows where the
crawlers are generating the most traffic. "access_log" is the name of
the Apache log file; run this command in the web log directory. If you
don't have an ssh login to your web host, copy the web log to your
local Linux machine. If you don't have a Linux machine, install Cygwin!

grep Googlebot access_log \
  | grep gedview \
  | awk -F\" '{ print $2 }' \
  | awk '-F[/?]' '{ print $3 }' \
  | sort | uniq -c

The "grep gedview" will need to be changed for your site so that it
filters only the pages for phpGedView (if you have other parts to your
website). The $3 in the second awk command might need to be changed if
your phpGedView install isn't in a sub-directory of the root of your
web server. Mine (/gedview) is a single subdirectory. If yours is in a
second-level directory (e.g. /stuff/gedview) you need to change the $3
to $4.
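For anyone who would rather not chain grep and awk, here is a rough Python sketch of the same kind of tally. It is an approximation of the shell pipeline above, not a drop-in copy of it. It assumes the log is in Apache common/combined format; the names tally_bot_hits, format_report and SAMPLE_LOG, and the sample log lines themselves, are made up for illustration.

```python
import re
from collections import Counter

# Quoted request field of an Apache common/combined log line,
# e.g. "GET /gedview/indilist.php?alpha=A HTTP/1.1"; group 1 is the URL.
REQUEST_RE = re.compile(r'"[A-Z]+ (\S+)[^"]*"')


def tally_bot_hits(lines, bot="Googlebot", prefix="/gedview/"):
    """Count requests per script under `prefix` on lines mentioning `bot`."""
    counts = Counter()
    for line in lines:
        if bot not in line:
            continue
        match = REQUEST_RE.search(line)
        if match is None:
            continue
        path = match.group(1).split("?", 1)[0]  # drop the query string
        if not path.startswith(prefix):
            continue
        # First path component after the install directory.
        script = path[len(prefix):].split("/", 1)[0]
        counts[script or "(index)"] += 1
    return counts


def format_report(counts):
    """Render counts busiest-first, in a `uniq -c`-like layout."""
    return "\n".join(f"{n:7d} {name}" for name, n in counts.most_common())


# Hypothetical sample lines (invented for illustration, not real log data).
SAMPLE_LOG = [
    '192.0.2.1 - - [27/Aug/2005:18:29:00 +0000] "GET /gedview/indilist.php?alpha=A HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    '192.0.2.1 - - [27/Aug/2005:18:29:05 +0000] "GET /gedview/indilist.php?alpha=B HTTP/1.1" 200 498 "-" "Googlebot/2.1"',
    '192.0.2.1 - - [27/Aug/2005:18:29:09 +0000] "GET /gedview/placelist.php HTTP/1.1" 200 731 "-" "Googlebot/2.1"',
    '192.0.2.7 - - [27/Aug/2005:18:30:00 +0000] "GET /gedview/individual.php?pid=I1 HTTP/1.1" 200 900 "-" "Mozilla/5.0"',
    '192.0.2.1 - - [27/Aug/2005:18:31:00 +0000] "GET /blog/index.html HTTP/1.1" 200 100 "-" "Googlebot/2.1"',
]

if __name__ == "__main__":
    print(format_report(tally_bot_hits(SAMPLE_LOG)))
```

To run it on a real log, pass the open file instead of the sample list, e.g. `with open("access_log") as f: print(format_report(tally_bot_hits(f)))`. If phpGedView lives in a deeper directory, change `prefix` (say to "/stuff/gedview/") rather than adjusting a field number. The report keeps the count-then-name layout of `uniq -c`, with the busiest script first.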
The output of the grep/awk pipeline looks something like this:

   1188 indilist.php
   1109 placelist.php
    527 individual.php
    337 aliveinyear.php
     73 repo.php
     16 repolist.php
      5 famlist.php
      4 HTTP
      1 relationship.php HTTP

> Google is now making mirror images of all websites, I guess they are
> going for a stock split.

How can you tell?

>
> -----Original Message-----
> From: php...@li...
> [mailto:php...@li...] On Behalf Of Ken Lowther
> Sent: Saturday, August 27, 2005 1:52 PM
> To: php...@li...
> Subject: [Phpgedview-talk] Google
>
>
> Googlebot has been pounding my site.
>
> http://genealogy.lowther.org/cgi-bin/awstats.pl
>
> 330 individuals on file:
>
> http://genealogy.lowther.org/
>
> Anyone else had this problem? The only thing I can think is that is
> chasing links around in circles. Googlebot was connecte 24/7 using a
> MINUMUM of 20% processor on a dual amd 64 2.3 ghz machine. I was
> experiencing times when the machine reponded to very litte so that is
> what started me digging.
>
> Ken
>
>
> -------------------------------------------------------
> SF.Net email is Sponsored by the Better Software Conference & EXPO
> September 19-22, 2005 * San Francisco, CA * Development Lifecycle
> Practices Agile & Plan-Driven Development * Managing Projects & Teams *
> Testing & QA Security * Process Improvement & Measurement *
> http://www.sqe.com/bsce5sf
> _______________________________________________
> Phpgedview-talk mailing list
> Php...@li...
> https://lists.sourceforge.net/lists/listinfo/phpgedview-talk