From: Reini U. <ru...@x-...> - 2005-02-14 11:37:06
Charles Corrigan wrote:
> On Mon, January 31, 2005 17:43, Charles Corrigan said:
>>On Thu, January 27, 2005 17:31, Charles Corrigan said:
>>>I just looked at the access log for my site and noticed
>>>that google was indexing it. And it was trying every
>>>possible link from every page!
>>>
>>>Would it make sense to add a "rel='nofollow'" to links
>>>such as "edit text" where it makes no sense for a robot
>>>to follow?
>>
>>When I wrote that, it was already 7 days after Reini had started putting
>>the nofollow attribute onto some of the links! My only excuse (hah!) is
>>that CVS and then the lists had problems last week.
>
> Google just re-indexed my site and, again, followed all links, including
> action=edit etc. I looked into Google's spec and realised that the
> rel="nofollow" only means that the link does not contribute to pagerank.
> It does not mean that the link will not be followed.

Thanks for the clarification!

> It looks like the only way to handle this is via the robots.txt. Google
> support an extension to the specification that allows wildcards to be
> specified in the Disallow field (see
> http://www.google.com/intl/en/webmasters/3.html ).

We also use the robots meta tag. With it, a robot still follows the
initial action link, but indexing the action page and following any
further links on it are forbidden. This should be easier to set up
(cost: none) than a hardcoded robots.txt file, at the price of letting
the robot fetch one more link.

normal + RecentChanges:
  <meta name="robots" content="index,follow" />
most action pages:
  <meta name="robots" content="noindex,nofollow" />

PS: We might want to move action=BackLinks under the first rule, as
another exception alongside RecentChanges.
-- 
Reini Urban
http://xarch.tu-graz.ac.at/home/rurban/
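
For illustration only, the two meta-tag rules above could be sketched roughly as follows. This is not the wiki's actual code: the function name `robots_meta`, the set `FOLLOW_ACTIONS`, and the use of "browse" as the name of a normal page view are all assumptions made for the example, and RecentChanges is simply treated as one more exempt key.

```python
# Hypothetical sketch of the two rules; names are illustrative only.
# Assumption: "browse" stands for a normal page view, and RecentChanges
# is treated as an exempt key next to it.
FOLLOW_ACTIONS = {"browse", "RecentChanges"}


def robots_meta(action="browse"):
    """Return the robots meta tag to emit for the given action."""
    if action in FOLLOW_ACTIONS:
        content = "index,follow"        # normal pages + RecentChanges
    else:
        content = "noindex,nofollow"    # most action pages (edit, diff, ...)
    return '<meta name="robots" content="%s" />' % content


if __name__ == "__main__":
    for name in ("browse", "RecentChanges", "edit"):
        print(name, "->", robots_meta(name))
```

Under this sketch, the change suggested in the PS would amount to adding "BackLinks" to `FOLLOW_ACTIONS`.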