Check with Google Labs gives code 403

rick
2010-03-25
2013-05-30
  • rick

    rick - 2010-03-30

    Dear Greg,
    I did some more research. At the point where the script aborts and returns the 403 error, I added print statements for the variables this check uses.
    Modified session_spider.php:

    if ($SEARCH_SPIDER && !in_array(PGV_SCRIPT_NAME, $bots_not_allowed)) {
        header("HTTP/1.0 403 Forbidden");
        print "Sorry spider, this page is not available for search engine bots.";
        // Debug output added to see why the check fires
        print $SEARCH_SPIDER;
        print $bot_session;
        print PGV_SCRIPT_NAME;
        exit;
    }

    When I simulate the Google spider, the output is:

    Sorry spider, this page is not available for search engine bots.
    Googlebot/ http://www.google.com/bot.html
    xxGOOGLEBOTfsHTTPcffWWWdGOOGLxx
    index.php

    The last line is the PGV script name (index.php), which is not in the $bots_not_allowed array.
    I do not understand why PhpGedView is blocking this spider.

    Rick.

     
  • Greg Roach

    Greg Roach - 2010-03-30

    Maybe ask Gerry.  This was his area.  He seems to be online now.

     
  • rick

    rick - 2010-03-31

    Hi All,
    I continued testing to find the problem.
    The system seems to work now with the following modifications:

    In session_spider.php, I changed:

    if ($SEARCH_SPIDER && !in_array(PGV_SCRIPT_NAME, $bots_not_allowed)) {
        header("HTTP/1.0 403 Forbidden");
        print "Sorry , this page is not available for search engine bots.";
        exit;
    }

    into:

    if ($SEARCH_SPIDER && in_array(PGV_SCRIPT_NAME, $bots_not_allowed)) {
        header("HTTP/1.0 403 Forbidden");
        print "Sorry , this page is not available for search engine bots.";
        exit;
    }

    Note: the ! before in_array has been removed.
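
To illustrate why the negation matters, here is a minimal sketch that compares the two conditions side by side. The function names and the contents of $bots_not_allowed are hypothetical placeholders, not PGV's actual values; the script name index.php and the spider string are taken from the debug output earlier in the thread.

```php
<?php
// Hypothetical stand-in for the list that PGV defines elsewhere.
$bots_not_allowed = array('admin.php', 'editconfig.php');

// Original condition: blocks a spider on every script NOT in the list.
function blocked_before($search_spider, $script_name, array $list) {
    return (bool) $search_spider && !in_array($script_name, $list);
}

// Corrected condition: blocks a spider only on scripts that ARE in the list.
function blocked_after($search_spider, $script_name, array $list) {
    return (bool) $search_spider && in_array($script_name, $list);
}

$spider = 'Googlebot/ http://www.google.com/bot.html';

var_dump(blocked_before($spider, 'index.php', $bots_not_allowed)); // bool(true)
var_dump(blocked_after($spider, 'index.php', $bots_not_allowed));  // bool(false)
var_dump(blocked_after($spider, 'admin.php', $bots_not_allowed));  // bool(true)
```

With the negation in place, index.php is refused precisely because it is absent from the list, which matches the 403 seen above.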

    In functions_print.php, I changed:

    if ($SEARCH_SPIDER) {
        if (
            !(PGV_SCRIPT_NAME=='/individual.php' ||
            PGV_SCRIPT_NAME=='/indilist.php' ||
            PGV_SCRIPT_NAME=='/login.php' ||
            PGV_SCRIPT_NAME=='/family.php' ||
            PGV_SCRIPT_NAME=='/famlist.php' ||
            PGV_SCRIPT_NAME=='/help_text.php' ||
            PGV_SCRIPT_NAME=='/source.php' ||
            PGV_SCRIPT_NAME=='/search_engine.php' ||
            PGV_SCRIPT_NAME=='/index.php')
        ) {
            header("Location: search_engine.php");
            exit;
        }
    }

    into:

    if ($SEARCH_SPIDER) {
        if (
            !(PGV_SCRIPT_NAME=='individual.php' ||
            PGV_SCRIPT_NAME=='indilist.php' ||
            PGV_SCRIPT_NAME=='login.php' ||
            PGV_SCRIPT_NAME=='family.php' ||
            PGV_SCRIPT_NAME=='famlist.php' ||
            PGV_SCRIPT_NAME=='help_text.php' ||
            PGV_SCRIPT_NAME=='source.php' ||
            PGV_SCRIPT_NAME=='search_engine.php' ||
            PGV_SCRIPT_NAME=='index.php')
        ) {
            header("Location: search_engine.php");
            exit;
        }
    }

    Note: the leading / has been removed from each script name.
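
The effect of the leading slash can be shown with a small sketch. It assumes, as the debug output earlier in the thread suggests, that PGV_SCRIPT_NAME holds a bare file name such as index.php. The helper spider_must_redirect() and the array form of the whitelist are hypothetical; this is an equivalent way to write the comparison, not PGV's own code.

```php
<?php
// Pages a spider may view, taken from the list in the post above.
$spider_pages = array(
    'individual.php', 'indilist.php', 'login.php',
    'family.php', 'famlist.php', 'help_text.php',
    'source.php', 'search_engine.php', 'index.php',
);

// Returns true when a spider should be redirected to search_engine.php.
function spider_must_redirect($script_name, array $allowed) {
    // Strict string match: '/index.php' and 'index.php' are different strings.
    return !in_array($script_name, $allowed, true);
}

var_dump(spider_must_redirect('index.php', $spider_pages));  // bool(false)
var_dump(spider_must_redirect('/index.php', $spider_pages)); // bool(true)
```

If the constant never carries a leading slash, a comparison against '/index.php' can never match, so every page looks disallowed and the spider is always redirected.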

    I do not know whether this issue only affects my site; if it does not, a bug report may be needed.

    Thanks for the support so far.

    Rick.

     
  • Greg Roach

    Greg Roach - 2010-03-31

    Rick,

    Both these changes are correct, and are both my own coding errors (in SVN 6879), so apologies.

    Greg

     
  • rick

    rick - 2010-03-31

    Hi Greg,

    No apologies needed! I know how it works, and having these challenges makes it interesting for me to dig into the source code. Who knows, there may come a time when I can assist in development ;-)
    Glad that this is solved and the website can be spidered by Google again.

     
  • Wes Groleau

    Wes Groleau - 2010-04-01

    I kind of skimmed through quickly, so forgive me if this has already been addressed.  One of the posts, perhaps without meaning to, gave an incorrect impression.  A 403 is NOT caused by anything in robots.txt.  robots.txt is essentially instructions to robots equivalent to "if you are this, then don't do that."  Good robots obey, bad robots don't.

    Because bad robots exist, PGV also includes code that attempts to detect "they are this" and then prevent them from doing that.  Or rather, prevent them from doing anything: since they are (believed to be) uncivilized, we slap them silly with a 403.
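
As a rough illustration of the difference, robots.txt is only a text file that a robot may choose to read, whereas a 403 is produced by server-side code like the sketch below. This is not PGV's actual detection logic: the function names, the signature list, and the blocked-script list are all hypothetical.

```php
<?php
// Hypothetical User-Agent fragments used to recognise robots.
$signatures = array('googlebot', 'slurp');
// Hypothetical scripts that robots must not fetch.
$blocked = array('admin.php');

// Case-insensitive substring match against the User-Agent header.
function looks_like_spider($user_agent, array $signatures) {
    foreach ($signatures as $sig) {
        if (stripos($user_agent, $sig) !== false) {
            return true;
        }
    }
    return false;
}

// Decide the HTTP status for a request: 403 for a detected robot on a
// blocked script, 200 otherwise.
function status_for($user_agent, $script_name, array $signatures, array $blocked) {
    if (looks_like_spider($user_agent, $signatures) && in_array($script_name, $blocked, true)) {
        return 403;
    }
    return 200;
}

var_dump(status_for('Googlebot/ http://www.google.com/bot.html', 'admin.php', $signatures, $blocked)); // int(403)
var_dump(status_for('Googlebot/ http://www.google.com/bot.html', 'index.php', $signatures, $blocked)); // int(200)
```

In a real page the 403 branch would call header("HTTP/1.0 403 Forbidden") and exit, as in the session_spider.php snippet quoted earlier in the thread.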

     
  • KosherJava

    KosherJava - 2010-04-01

    I checked your patch into SVN (6953).
    Thanks

     