robots.txt rules ignored

Help
Anonymous
2015-01-10
2015-01-23
  • Anonymous

    Anonymous - 2015-01-10

    I have installed the spider, but the rules in the robots.txt were ignored.
    If I have
    User-agent: PHPCrawl
    Disallow: /
    nothing is crawled. But if I add a file like:
    User-agent: PHPCrawl
    Disallow: /calendar.php

    the crawler still spiders the calendar.php file :(
    I use PHP 5.4, but I changed it to 5.3 for testing.
    Maybe the lines in the robots file are not parsed the right way.

     
  • Anonymous

    Anonymous - 2015-01-16

    Hi!

    Sorry for the late answer, I overlooked your post.

    So, what does your robots.txt look like? Like this?

    User-agent: PHPCrawl
    Disallow: /
    
    User-agent: PHPCrawl
    Disallow: /calendar.php
    

    With this file the crawler would be expected to crawl nothing, right?
    Or did I misunderstand you?
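
As a cross-check of the expected behaviour for the two robots.txt variants discussed above, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not PHPCrawl's parser and says nothing about where the bug was; it only shows how a separate, independent parser reads the same rules. The host `example.com`, the path `/index.php`, and the helper name `is_allowed` are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# File with only the single rule from the original report.
SINGLE_RULE = """\
User-agent: PHPCrawl
Disallow: /calendar.php
"""

# File with both groups, as quoted in the reply.
COMBINED = """\
User-agent: PHPCrawl
Disallow: /

User-agent: PHPCrawl
Disallow: /calendar.php
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # example.com and /index.php are hypothetical placeholders.
    for name, rules in (("single rule", SINGLE_RULE), ("combined", COMBINED)):
        for path in ("/calendar.php", "/index.php"):
            url = "http://example.com" + path
            print(name, path, is_allowed(rules, "PHPCrawl", url))
```

With the single-rule file, this parser refuses only `/calendar.php`; with the combined file it refuses everything, since `Disallow: /` already covers every path.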

     
  • Anonymous

    Anonymous - 2015-01-17

    Hi, I fixed the problem with a link from your bug list.

     
  • Uwe Hunfeld

    Uwe Hunfeld - 2015-01-20

    Ah ok.
    So which bug was it?

     
