From: Jim W. <spi...@us...> - 2005-05-17 14:29:39
Hi Ken,

We noticed this too. In the one case where it was desirable to have calendar entries spidered, we were able to inhibit the search engines by eliminating many of the views and forward/backward links from the templates. They just were not necessary for this particular site.

There are a couple of things that I thought might help:

One that is simple to implement is a configurable limit to the calendar. In many cases you might only need a calendar that goes a year or even a few months or weeks into the future. What we see the spiders doing is going multiple decades ahead and going through thousands of empty calendar days. On the other hand, if it was limited to a year forward and back you'd only be looking at about 700 GETs a month.

Another would be to eliminate the links for days with no events scheduled. They really are not necessary. All you get when you click on one is a screen that says "No events today". Eliminate those and the users will save a bunch of wasted mouse clicking and the spiders won't be attacking your calendars. Also, in most cases, pages like the year view will display much quicker since the browsers will not need to process the wasted anchor tags.

Best,
Jim Wilson

> -----Original Message-----
> From: Ken Nordquist <ke...@co...>
> Sent: Monday, 16. May 2005 15:01 -0400
> To: phpwebsite-dev <php...@li...>
> Subject: [Phpwebsite-developers] phpWebsite, SEO, and Meta Tags
>
> Over the past couple of months, I have noticed search engines crawling
> the site and concentrating on the calendar. Each search engine crawls
> each calendar entry (year / month /day) regardless of whether or not
> there is content. This uses up a considerable amount of bandwidth which
> I would like to conserve.
>
> Currently, I have two solutions available... To disallow search engine
> bots access to the calendar via the robots.txt file, or to set the robot
> meta tag to either "noindex, nofollow" or "index, nofollow."
> I want calendar events crawled, but I want them restricted so disallowing via
> robots.txt or the robot meta tag of "noindex, nofollow" are not viable
> solutions. The problem with "index, nofollow" is that search engines
> will not crawl other appropriate pages from links on a page the search
> engine robot does crawl.
>
> The solution I believe is appropriate is to have the meta tag variables
> global which can be set by the module (as needed / desired). This can
> be accomplished ala the addPageTitle function in Layout.php. Following
> is a patch of Layout.php (for phpWebsite v0.10.1) which will enable
> modules to set meta tags (with the exception of generator and
> content-type). The added functions are:
> setKeywordsTag
> setDescriptionTag
> setRobotTag
> setAuthorTag
> setOwnerTag
>
> Of course, to implement the changes in a module, a field (column) would
> have to be created in the mod's db for each metatag which would be
> changed in the module or hard-coded in.
>
> I will work on a calendar-specific hack over the next few days.
>
> One of the reasons I worked on this was to 1) see how easy / difficult
> it would be to implement in the current version of phpWebsite and 2) to
> bring attention to the issue before the release of phpWebsite v1.0.
>
> As always, any and all comments are welcomed. I am not the world's most
> proficient coder, so if anyone sees an easier solution, please let me
> know.
>
> Ken Nordquist
>
> ***** start patch *****
> --- Layout.php.old 2005-05-15 20:43:09.000000000 -0400
> +++ Layout.php 2005-05-16 07:36:19.000000000 -0400
> @@ -10,6 +10,7 @@
>   * Controls the layout and themes
>   *
>   * @version $Id: Layout.php,v 1.72 2005/03/07 14:05:42 steven Exp $
> + * changed to allow meta tag(s) to be changed per module (172-2 patch)
>   * @author Matthew McNaney <ma...@NO...>
>   * @package phpWebSite
>   */
> @@ -498,17 +499,56 @@
>      }
>    }
>
> +  function setKeywordsTag($keywords_tag) {
> +    $title = strip_tags($keywords_tag);
> +    $GLOBALS['keywords_tag'] = $keywords_tag;
> +  }
> +
> +  function setDescriptionTag($description_tag) {
> +    $title = strip_tags($description_tag);
> +    $GLOBALS['description_tag'] = $description_tag;
> +  }
> +
> +  function setRobotTag($robot_tag) {
> +    $title = strip_tags($robot_tag);
> +    $GLOBALS['robot_tag'] = $robot_tag;
> +  }
> +
> +  function setAuthorTag($author_tag) {
> +    $title = strip_tags($author_tag);
> +    $GLOBALS['author_tag'] = $author_tag;
> +  }
> +
> +  function setOwnerTag($owner_tag) {
> +    $title = strip_tags($owner_tag);
> +    $GLOBALS['owner_tag'] = $owner_tag;
> +  }
> +
> +
>    function getMetaTags(){
>      $metatags = '<meta name="generator" content="phpWebSite" />
>  ';
>
> +    if (isset($GLOBALS['keywords_tag'])) {
> +      $metatags .= '<meta name="keywords" content="'.$GLOBALS["keywords_tag"].'" />
> +';
> +    } else {
> +
>      if ($this->meta_keywords)
>        $metatags .= '<meta name="keywords" content="'.$this->meta_keywords.'" />
>  ';
> +}
> +
> +
> +    if (isset($GLOBALS['description_tag'])) {
> +      $metatags .= '<meta name="description" content="'.$GLOBALS["description_tag"].'" />
> +';
> +    } else {
>
>      if ($this->meta_description)
>        $metatags .= '<meta name="description" content="'.$this->meta_description.'" />
>  ';
> +}
>
>      if (isset($GLOBALS['block_robot'])) {
>        $robot = '00';
> @@ -516,6 +556,29 @@
>        $robot = &$this->meta_robots;
>      }
>
> +    if (isset($GLOBALS['robot_tag'])) {
> +      switch ($GLOBALS['robot_tag']){
> +      case '00':
> +        $metatags .= '<meta name="robots" content="noindex, nofollow" />
> +';
> +        break;
> +
> +      case '01':
> +        $metatags .= '<meta name="robots" content="noindex, follow" />
> +';
> +        break;
> +
> +      case '10':
> +        $metatags .= '<meta name="robots" content="index, nofollow" />
> +';
> +        break;
> +
> +      case '11':
> +        $metatags .= '<meta name="robots" content="index, follow" />
> +';
> +        break;
> +      }
> +    } else {
>
>      if ($this->meta_robots){
>        switch ($robot){
> @@ -540,19 +603,38 @@
>          break;
>        }
>      }
> +    }
> +
> +    if (isset($GLOBALS['author_tag'])) {
> +      $metatags .= '<meta name="author" content="'.$GLOBALS["author_tag"].'" />
> +';
> +    } else {
>
>      if ($this->meta_author)
>        $metatags .= '<meta name="author" content="'.$this->meta_author.'" />
>  ';
> +}
>
> -    if ($this->meta_owner)
> -      $metatags .= '<meta name="owner" content="'.$this->meta_owner.'" />
> +    if (isset($GLOBALS['owner_tag'])) {
> +      $metatags .= '<meta name="owner" content="'.$GLOBALS["owner_tag"].'" />
> +';
> +    } else {
> +      if ($this->meta_owner)
> +        $metatags .= '<meta name="owner" content="'.$this->meta_owner.'" />
>  ';
> +    }
>
>      if ($this->meta_content)
>        $metatags .= '<meta http-equiv="content-type" content="text/html; charset=' .
>  $this->meta_content.'" />
>  ';
>
> +    // set meta tag globals to "null" so they do not affect other pages
> +    $GLOBALS["keywords_tag"] = '';
> +    $GLOBALS['description_tag'] = '';
> +    $GLOBALS['robot_tag'] = '';
> +    $GLOBALS['author_tag'] = '';
> +    $GLOBALS['owner_tag'] = '';
> +
>      return $metatags;
>    }
>
> ***** end patch *****
>
> _______________________________________________
> Phpwebsite-developers mailing list
> Php...@li...
> https://lists.sourceforge.net/lists/listinfo/phpwebsite-developers
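
For anyone wanting to try the two ideas from the reply at the top (a configurable range limit on the calendar, and no links for days without events), here is a rough sketch in present-day PHP. It is only an illustration under stated assumptions: the function names, the `$eventCounts` array, the 12-month defaults, and the URL scheme are all invented for this example and are not taken from phpWebsite's calendar module, which would need its own template changes.

```php
<?php
// Hypothetical helpers for illustration; none of these names exist in phpWebsite.

/**
 * True when $date falls inside the configured window around $today.
 * The 12-month defaults are assumptions; a site could make them settings.
 */
function calendarDateInRange(
    DateTimeImmutable $date,
    DateTimeImmutable $today,
    int $monthsBack = 12,
    int $monthsAhead = 12
): bool {
    $earliest = $today->modify("-{$monthsBack} months");
    $latest   = $today->modify("+{$monthsAhead} months");
    return $date >= $earliest && $date <= $latest;
}

/**
 * Render the contents of one day cell.
 * $eventCounts maps 'Y-m-d' strings to the number of events on that day.
 * A link is emitted only for in-range days that actually have events;
 * every other day is plain text, so a crawler has nothing to follow.
 */
function renderDayCell(
    DateTimeImmutable $date,
    DateTimeImmutable $today,
    array $eventCounts
): string {
    $key   = $date->format('Y-m-d');
    $label = $date->format('j');

    if (empty($eventCounts[$key]) || !calendarDateInRange($date, $today)) {
        return $label;
    }

    // Assumed URL scheme, for illustration only.
    $url = 'index.php?module=calendar&view=day&date=' . $key;
    return '<a href="' . htmlspecialchars($url) . '">' . $label . '</a>';
}
```

The same range check could also guard the day, month, and year views themselves, so that a request for a date decades ahead gets a short "out of range" page instead of another screenful of forward links.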