Hello,
I am getting the status "HTTP/1.1 400 Bad Request".
What does this mean?
How can I avoid it?
Hello!
Could you please explain when exactly you get this status?
What did you do?
Did you use the example script included in the package?
Which URL did you want to crawl?
I changed the $crawler->setURL() call in the original example.php to:
$crawler->setURL("http://nuernberg.gelbeseiten-regional.de/yp/quickSearch.yp?subject=Schreiner&location=Berlin");
When I run example.php, the following lines appear:
--- snipp ---
Page requested: http://nuernberg.gelbeseiten-regional.de/yp/quickSearch.yp?subject=Schreiner&location=Berlin
Status: HTTP/1.1 400 Bad Request
Referer-page:
Content received: 439 bytes
Summary:
Links followed: 1
Files received: 1
Bytes received: 603
--- snipp ---
Other sites with parameters in the URL work fine, e.g.:
$crawler->setURL("http://www.hotscripts.com/cgi-bin/search.cgi?bool=AND&query=crawl&catid=2");
Thanks for the help - Mark
Hi Mark,
You definitely found a bug in phpcrawl there.
The crawler sends an empty "Cookie:" line in the request header when there is no cookie to send.
I haven't seen a webserver so far that cared about this,
but the server at http://nuernberg.gelbeseiten-regional.de/ does and returns a "Bad Request" header.
Without the "Cookie:" line everything works fine with that site; I just tried it.
I will fix that today or tomorrow, I guess (it's not much work), and upload the patched version here.
I hope that's okay for you,
and thank you for your report.
Greetings,
huni.
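The fix described above boils down to only emitting the "Cookie:" line when there is actually a cookie to send. A minimal sketch of that idea, assuming a hypothetical header-building helper (function and variable names are illustrative, not phpcrawl's actual internals):

```php
<?php
// Sketch: build a raw HTTP request header, skipping the "Cookie:" line
// when there are no cookies. An empty "Cookie:" line makes some servers
// (like the one above) answer with "400 Bad Request".
function buildRequestHeader(string $host, string $path, array $cookies): string
{
    $header  = "GET " . $path . " HTTP/1.0\r\n";
    $header .= "Host: " . $host . "\r\n";

    // Only add the Cookie: line if there is something to send.
    if (count($cookies) > 0) {
        $pairs = array();
        foreach ($cookies as $name => $value) {
            $pairs[] = $name . "=" . $value;
        }
        $header .= "Cookie: " . implode("; ", $pairs) . "\r\n";
    }

    $header .= "\r\n"; // blank line terminates the header
    return $header;
}

// With no cookies, the header contains no "Cookie:" line at all:
echo buildRequestHeader(
    "nuernberg.gelbeseiten-regional.de",
    "/yp/quickSearch.yp?subject=Schreiner&location=Berlin",
    array()
);
```

The same guard applied inside the crawler's request code should make the site above respond normally.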