Only index certain pages but crawl all pages?

Help
Oskar
2010-11-05
2012-09-13
  • Oskar
    2010-11-06

    Hello!

    Great, thank you so much for your reply! I'll definitely pay for your service
    if my project is successful.

    If I enter patterns like:

    http://www.homepage2.nu/forum

    http://www.homepage2.nu/comments

    While crawling the site, would it look for links containing "forum" and
    "comments" at the same time, or would it first crawl the site looking for
    "forum" and then crawl again looking for "comments"? I ask because it would
    first need to visit the links containing "forum" to be able to find the
    links containing "comments".

    Regards

    Oskar

     
  • You should also provide a URL without a wildcard to give OpenSearchServer
    an entry point.

    Perhaps something like: http://www.homepage2.nu/forum/topiclist.php

    If you enter http://www.homepage2.nu/forum in the pattern list,
    OpenSearchServer will use http://www.homepage2.nu/forum as the entry point
    URL. This URL may not exist.

    If you are not sure where to find all the available URLs that lead to
    /forum and /comments, the best way is to crawl all the pages of the web
    site: http://www.homepage2.nu/*. Then use the plugin to filter the pages
    you want to keep, using your own rules.
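
The "crawl everything, then filter" idea can be sketched roughly as follows. This is only an illustration in plain Python, not the OpenSearchServer plugin API: the names KEEP_PATTERNS and should_index, and the /comments/12 and /about URLs, are assumptions made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical keep-rules; in OpenSearchServer the real filtering
# would be done by the plugin, with your own rules.
KEEP_PATTERNS = [
    "http://www.homepage2.nu/forum*",
    "http://www.homepage2.nu/comments*",
]


def should_index(url, patterns=KEEP_PATTERNS):
    """Return True if the URL matches at least one keep-pattern."""
    return any(fnmatchcase(url, pattern) for pattern in patterns)


# Pretend these URLs were all found by crawling http://www.homepage2.nu/*
# (the last two are invented for the example).
crawled = [
    "http://www.homepage2.nu/",
    "http://www.homepage2.nu/forum/topiclist.php",
    "http://www.homepage2.nu/forum/viewtopic.php?id=500",
    "http://www.homepage2.nu/comments/12",
    "http://www.homepage2.nu/about",
]

# Every page is crawled, but only the matching ones are kept for the index.
indexed = [url for url in crawled if should_index(url)]
```

Here every page is visited, so links under /comments can still be discovered through other pages, while only the URLs matching a keep-rule end up in the index.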

    Regards,

    Emmanuel Keller

     
  • Only the first one is a valid link, but would this mean that the crawler
    will use http://www.homepage2.nu/forum/topiclist.php as the entry point
    URL and then index that page and all links containing "view", for example
    http://www.homepage2.nu/forum/viewtopic.php?id=500?

    That's true.
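
As a rough illustration of that behaviour, here is a toy simulation in plain Python. It does not use OpenSearchServer itself: the LINKS graph, the login.php and id=501 pages, and the crawl function are all assumptions invented for the sketch.

```python
from collections import deque

# Toy link graph standing in for the real site (hypothetical pages and links).
LINKS = {
    "http://www.homepage2.nu/forum/topiclist.php": [
        "http://www.homepage2.nu/forum/viewtopic.php?id=500",
        "http://www.homepage2.nu/forum/login.php",
    ],
    "http://www.homepage2.nu/forum/viewtopic.php?id=500": [
        "http://www.homepage2.nu/forum/viewtopic.php?id=501",
    ],
}


def crawl(entry_point, keyword, links=LINKS):
    """Index the entry point, then follow only links containing `keyword`."""
    indexed = []
    seen = {entry_point}
    queue = deque([entry_point])
    while queue:
        url = queue.popleft()
        indexed.append(url)
        for link in links.get(url, []):
            # Skip links that do not match or were already queued.
            if keyword in link and link not in seen:
                seen.add(link)
                queue.append(link)
    return indexed


result = crawl("http://www.homepage2.nu/forum/topiclist.php", "view")
```

In this sketch the entry point is indexed first, the two viewtopic.php pages are reached through it, and login.php is never followed because it does not contain "view".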

    Also, would you recommend running the crawler on a Windows or Linux server?

    There is no preferred answer. OpenSearchServer runs well on both operating
    systems.