Each project on our community page has one or more categories through which
it can be listed and linked to. The category is added to the URL as a parameter.
But only one category is the main one, and that one is added to the page as a canonical link:
<link rel="canonical" href="/Projekt/Bygga-stug-i-Norrland/?category=30" />
Only the main one should be in the search results of course, to avoid
duplicates. For this purpose I figured I would use the attribute "Ignore
non canonical pages" on the HTML Parser. It seems, however, that it does not
ignore only the pages whose URL differs from the canonical link, as I had hoped,
but rather ignores every page that has any canonical link in the header.
Is there a way to use this setting as I intend?
OpenSearchServer first looks for a canonical URL in the web page. If there is one, the canonical URL is added to the URL database and will be crawled in the next crawl session. If the option "Ignore non canonical pages" is set to false, the crawler ignores the canonical link and crawls the current web page.
Hello, thanks for your reply!
Could my problem be that the canonical link is relative? I'm thinking it might not match because it is missing the base domain of the URL.
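
To illustrate what I mean, here is a minimal Python sketch of the comparison I would expect. It is not OpenSearchServer's actual code. The domain www.example.com and the function names are made up for the example. It shows that a plain string comparison fails for a relative canonical link, while resolving the link against the page URL first gives the match I intend.

```python
from urllib.parse import urljoin


def resolve_canonical(page_url: str, canonical_href: str) -> str:
    """Resolve a possibly relative canonical href against the page URL."""
    return urljoin(page_url, canonical_href)


def is_canonical_page(page_url: str, canonical_href: str) -> bool:
    """Return True if the page URL equals its resolved canonical URL."""
    return page_url == resolve_canonical(page_url, canonical_href)


# Hypothetical URLs: the same project reached through two categories.
canonical_href = "/Projekt/Bygga-stug-i-Norrland/?category=30"
main_page = "http://www.example.com/Projekt/Bygga-stug-i-Norrland/?category=30"
other_page = "http://www.example.com/Projekt/Bygga-stug-i-Norrland/?category=12"

# A naive comparison never matches, because the href lacks scheme and host.
naive_match = main_page == canonical_href

# After resolving, only the main category page counts as canonical.
main_is_canonical = is_canonical_page(main_page, canonical_href)
other_is_canonical = is_canonical_page(other_page, canonical_href)
```

If the parser compares the raw href with the full page URL, as in the naive case, every page with a canonical link would look non-canonical, which would explain what I am seeing.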