I am transferring HTML files to PhpWiki.
It works great with http://diberri.dyndns.org/wikipedia/html2wiki/ :-)
However, I am concerned about pages that contained the meta tag <meta name="robots" content="nofollow">.
It is supposed to prevent search-engine spiders from following links and overloading our database with unwanted requests embedded in those links.
Is there a way I can implement the same thing in PhpWiki?
I am using version 1.3.13p1
Thanks for any clue!
If you want to protect something, you'd better do it right. I'm not sure about the ACL capability of PhpWiki, but the <meta ...nofollow> tag and rel="nofollow" are two things that do only in part what you want them to do.
The meta tag <meta name="robots" content="...,nofollow" /> will prevent well-behaved search engines (those that honor the robots exclusion protocol) from following the links on the page. However, if they discover a link in any other way, they might still visit and/or index the page it points to.
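If the goal is simply to keep compliant spiders away from the expensive URLs, a robots.txt at the web root is the more direct tool than per-page meta tags. A minimal sketch, assuming the wiki lives under /wiki/ and that the costly pages are the ones named in this thread (the paths are assumptions, not your actual layout; matching is plain prefix matching on the URL path):

```
# robots.txt — served from the site root (assumed paths)
User-agent: *
Disallow: /wiki/RecentChanges
Disallow: /wiki/AllPages
```

Note that this only restrains crawlers that choose to obey it; it is not access control.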
The GOOGLE_LINKS_NOFOLLOW setting, I presume, adds rel="nofollow" to every link on a page. This is a rather misguided and misnamed invention of Google's to prevent comment spam on blogs and wikis. Its purpose is not to prevent the spider from using the link to discover another web page and eventually visit it; rather, it is supposed to keep the link from giving any weight of importance to the page it points to. In order to rank the many pages that match a given search keyword, Google takes into account how many links lead to each page, and this attribute means: don't count this link in that calculation. So it has nothing to do with "don't follow" in the everyday English sense of the words.
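To make the difference concrete: the meta tag is a page-wide hint, while rel="nofollow" sits on individual anchors. For illustration (the URL and page name are made up):

```html
<!-- page-wide hint to compliant crawlers: index this page, but follow none of its links -->
<meta name="robots" content="index,nofollow" />

<!-- per-link hint: this particular link should carry no ranking weight -->
<a href="http://example.org/SomePage" rel="nofollow">SomePage</a>
```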
Neither of these methods prevents visitors from seeing the pages. That has to be done through ACLs and/or through separate web sites with internal and external IP addresses.
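For the internal/external split, one common way is to restrict the sensitive part of the URL space at the web server. A sketch for Apache 2.4, assuming the intranet pages live under /wiki/LPLintranet and internal clients come from 192.168.0.0/16 (both the path and the address range are assumptions you would adapt):

```apacheconf
# Apache 2.4 sketch: only internal addresses may reach the intranet area
<Location "/wiki/LPLintranet">
    Require ip 192.168.0.0/16
</Location>
```

Unlike robots hints, this actually blocks outside visitors, spiders included.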
Busy, teaching users Open Office at http://plan-b-for-openoffice.org/
Robots meta tags are added automatically to each and every page within phpwiki.
Most innocent pages contain <meta name="robots" content="index,follow" />,
because ethically we like robots to promote wiki content.
Pages which lead to possible robot confusion, like certain actions and costly ActionPages, contain <meta name="robots" content="noindex,nofollow" />
You can set GOOGLE_LINKS_NOFOLLOW to not follow ALL ActionPages, which will prevent indexing through RecentChanges, AllPages, Backlinks and so forth. See lib/display.php
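In config.ini that would look something like the following. The key name comes from this thread; whether your 1.3.13p1 release expects true/false or 1/0, and which section it belongs in, may differ, so treat this as a sketch to check against lib/display.php:

```ini
; config/config.ini — assumed syntax; verify against your install
GOOGLE_LINKS_NOFOLLOW = true
```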
We also add most other established meta tags for auto-discovery of features which phpwiki offers. Some new Dublin Core-style meta tags are being discussed right now in interwiki circles: SisterWiki for better InterWiki linking, and the SemanticWeb.
MicroFormats are probably too hackish. We can do the full thing without disturbing theme and CSS authors by taking magic class names away from them.
>Pages which lead to possible robot confusion, like certain actions and costly ActionPages, contain <meta name="robots" content="noindex,nofollow" />
>You can set GOOGLE_LINKS_NOFOLLOW to not follow ALL ActionPages, which will prevent indexing through RecentChanges, AllPages, Backlinks and so forth. See lib/display.php
>We also add most other established meta tags for auto-discovery of features phpwiki offers. Some new Dublin Core-style meta tags are being discussed right now in interwiki circles: SisterWiki for better InterWiki linking, and the SemanticWeb.
>MicroFormats are probably too hackish. We can do the full thing without disturbing theme and CSS authors by taking magic class names away from them.
I looked at "display.php". It makes sense regarding ActionPages, but I am not sure what to do with this code in practice.
In "config.ini" I changed the GOOGLE_LINKS_NOFOLLOW setting to "true", as I understand that engines then won't follow external links, which is what I need for this particular page; I do not mind for the other pages.
To make things clear, the page I want to protect is:
In that page, most internal links are launching potentially heavy queries on our database.
I hope it will work this way, because it's difficult to check until Google has indexed the wrong pages. :-/
(Thanks for your reply concerning meta-tags!)
I am trying to use part of our Wiki space as an intranet: pages should only be visible to people who are logged in as the owner and/or the creator. I have applied the following ACL to a test page:
Selected Pages: LPLintranet
Type: individual page permission
ACL: view:_CREATOR; edit:_CREATOR; create:_CREATOR; list:_OWNER; remove:_ADMIN,_OWNER; change:_ADMIN,_OWNER; dump:_EVERY
But... as you can check, the page is still visible to all users!
What am I doing wrong?