

Form post

  • Stephen M
    2012-11-27

    Thank you for creating this great tool. I am having a problem with it though.

    I have set it up and it starts to crawl the login page of the site. The login form posts its data to a different PHP page, which then redirects the user to another page. The crawler just returns a 302 error on the redirect. What am I doing wrong? I have tried combinations of index.php and router.php.

    The form is below (https://www.example.com/index.php):

    <form action="router.php" method="post" name="login_form">
                <div style="width:350px;" class="ui-dialog-content ui-widget-content">
                    <div>
                        <label class="loginLabel" for="sessionlogin">Username</label>
                        <input type="text" name="sessionlogin" id="sessionlogin">
                    </div>
                    <div>
                        <label class="loginLabel" for="sessionpass">Password</label>
                        <input type="password" name="sessionpass" id="sessionpass">
                    </div>
                </div>
                <div id="submitContainer">
                    <p><input type="submit" value="Login" id="submitButton" class="ui-button ui-widget ui-state-default ui-corner-all" role="button" aria-disabled="false"></p>
                    <p><small><a onclick="$('#passwordMessage').dialog('open');" href="#">Problems logging in?</a></small></p>
                </div>
            </form>
    

    And the setup for the crawler is below (copied and adapted from example.php):

    // URL to crawl
    $crawler->setURL("https://www.example.com/router.php");
    $post_data = array("securelogin" => "username", "securepass" => "password", "" => "login");
    $crawler->addPostData("#https://www.example.com/router.php#", $post_data);
    // Only receive content of files with content-type "text/html"
    $crawler->addContentTypeReceiveRule("#text/html#");
    // Ignore links to pictures, don't even request pictures
    $crawler->addURLFilterRule("#\.(jpg|jpeg|gif|png|bmp)$# i");
    $crawler->addURLFilterRule("#\.(css|js)$# i");
    // Store and send cookie-data like a browser does
    $crawler->enableCookieHandling(true);
    $crawler->setUserAgentString('Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:17.0) Gecko/17.0 Firefox/17.0');
    $crawler->setFollowMode(1);
    $crawler->setFollowRedirects(TRUE);
    // Set the traffic-limit to 1 MB (in bytes,
    // for testing we don't want to "suck" the whole site)
    $crawler->setTrafficLimit(1000 * 1024);
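
    For reference, a minimal self-contained sketch of such a setup could look like the following, assuming the PHPCrawl class file lives under libs/ and using the placeholder URL and the field names from the form quoted above (the LoginCrawler class name is made up here). The handleDocumentInfo() callback prints every requested URL together with its HTTP status code, which makes it easy to see whether the page behind the redirect is ever fetched:

    // Minimal sketch, not the exact code from the thread
    require_once("libs/PHPCrawler.class.php");  // include path is an assumption

    class LoginCrawler extends PHPCrawler
    {
        // Called once for every document the crawler requests
        function handleDocumentInfo($DocInfo)
        {
            // Shows the 302 on router.php as well as any follow-up request
            echo $DocInfo->url." -> ".$DocInfo->http_status_code."\n";
        }
    }

    $crawler = new LoginCrawler();
    $crawler->setURL("https://www.example.com/router.php");

    // POST data for the login; the keys mirror the quoted form
    // (sessionlogin / sessionpass), the values are placeholders
    $post_data = array("sessionlogin" => "username", "sessionpass" => "password");
    $crawler->addPostData("#https://www.example.com/router.php#", $post_data);

    $crawler->addContentTypeReceiveRule("#text/html#");
    $crawler->enableCookieHandling(true);
    $crawler->go();

    With cookie handling enabled, the session cookie set by router.php should be sent along when the crawler requests the redirect target, so the output above should show whether that request happens at all.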
    
     
  • Uwe Hunfeld
    2012-11-28

    Hi!

    Your setup looks OK; it's difficult to say why it is not working as expected without the real URL you are using.
    Could you post it here? Otherwise I can't test it.

    And by the way: a 302 code does not mean that an error occurred, it's just the HTTP status code for a redirect, so that's fine.
    But afterwards the crawler doesn't follow the redirect?
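
    (Side note: a quick way to check this outside the crawler is to replay the login POST with plain cURL and see where the redirect ends up. The URL, field names and credentials below are the placeholders from the form quoted above.)

    // Standalone check (plain PHP cURL, not PHPCrawl)
    $ch = curl_init("https://www.example.com/router.php");
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(array(
        "sessionlogin" => "username",
        "sessionpass"  => "password",
    )));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);            // follow the 302
    curl_setopt($ch, CURLOPT_COOKIEJAR, "/tmp/cookies.txt");   // keep the session cookie
    curl_setopt($ch, CURLOPT_COOKIEFILE, "/tmp/cookies.txt");

    $body = curl_exec($ch);
    echo "Final URL: ".curl_getinfo($ch, CURLINFO_EFFECTIVE_URL)."\n";
    echo "HTTP code: ".curl_getinfo($ch, CURLINFO_HTTP_CODE)."\n";
    curl_close($ch);

    If the final HTTP code is 200 and the effective URL is the page behind the login, the 302 itself works as intended, and the open question is only whether the crawler sends the session cookie when it follows that redirect.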

     
  • Stephen M
    2012-12-11

    Hi

    Sorry for not replying sooner; I have been focusing on other projects. I will send you a message with the URL, the username and password, and some more information.

    Thank you.

     
  • Uwe Hunfeld
    2013-04-09

    Hi,

    I just wanted to say that I got your mail, and I will take a look at it soon (tomorrow or in the next few days).

    Thanks!

     

