Retrieving links from a remote web site

Help
2007-05-03
2013-05-30
  • AAGTHosting
    2007-05-03

    I am using Snoopy to retrieve specific links from a remote web site.
    The script retrieves links from other pages, but the same script fails
    on the page I actually need links from.

    It gives me the following error. Does anyone know why I am getting it?
    When it outputs links, it should give me a list much like this:

    0:link.com/directory
    1:link.com/directory

    But when it gives me this error, it looks like it outputs a couple of links and then
    quits, printing the error with all the links on one line. It does not give me the rest of the links on the page.

    Quote:
    response code: HTTP/1.1 401 Unauthorized

    Warning: Variable passed to each() is not an array or object in /home/content/a/l/e/directory/html/gregsCode/test.php on line 58

    https://www.domain.com/members/powersearch/control/interresults/https://www.domain.com/register/https://www.domain.com/register/https://www.domain.com/register/https://www.domain.com/register/https://www.domain.com/register/ 

    When I try to grab all the links from a domain such as http://www.domain.com it works.

    Here is the code I am using to call the Snoopy class.

    Also, if you want to look at the Snoopy code, go to:

    http://snoopy.sourceforge.net

    PHP Code:
    include "Snoopy.class.php";
    $snoopy = new Snoopy;

    if($snoopy->fetchlinks("https://www.domain.com/members/powersearch/control/interresults?fromyear=2007&toyear=2008&region=&make=&modeltext=&auction=&numresultsperpage=50&x=33&y=4&mileage=&interior=&engine=&top=&transmission=&radio=&certification=&consignor=&presalechannel=1&dealerexchangechannel=5&cyberlotchannel=2&cyberauctionchannel=4&encorechannel=3&numresultsperpage=50"))
    {
        echo "response code: ".$snoopy->response_code."<br>\n";

        while (list($key, $val) = each($snoopy->results))
            echo $key.": ".$val."<br>\n";
        echo "<p>\n";
        echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
    }
    else
        echo "error fetching document: ".$snoopy->error."\n";
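
    A 401 Unauthorized response means the server refused the request for lack of valid credentials, so the fetched value may not be the list of links the loop expects. As a minimal sketch (the helper name formatLinks is hypothetical and not part of Snoopy), the output step could check the type before iterating. It uses foreach rather than each(), which also avoids the warning when the value is not an array:

```php
<?php
// Hypothetical helper, not part of Snoopy: format a fetched result defensively.
function formatLinks($results): string
{
    if (!is_array($results)) {
        // Not a list of links: show the raw value so the cause can be inspected.
        return "no link list returned; raw result: "
            . htmlspecialchars((string) $results) . "<br>\n";
    }

    $out = '';
    foreach ($results as $key => $val) {
        $out .= $key . ": " . htmlspecialchars((string) $val) . "<br>\n";
    }
    return $out;
}

// Sample values for illustration only.
echo formatLinks(["link.com/directory", "link.com/directory"]);
echo formatLinks("HTTP/1.1 401 Unauthorized");
```

    In the script above, the call would be formatLinks($snoopy->results). This only makes the failure visible; it does not get past the 401 itself.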

    I also have another issue when trying to retrieve links from this site. Some of the links that I need are not normal links. The href attribute contains a call to a JavaScript function, like this:

    HTML Code:
    <a href="javascript:goToPresaleResults('18', 'ALBA', '05/03/2007', '05%2F03%2F2007+Albuquerque+AA+-+THURS+SALE+FLEET%2FLEASE%2FCONS%2FGMAC%2FPOST+CARD+SALE', '1');">28 Found</a>
    Is there a PHP function that I can use to retrieve links that call a JavaScript function? If I can get each argument in the link, then I can build the URL myself, fetch it, and retrieve the info I need from the page.
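
    One possible approach, sketched here under assumptions, is to pull the arguments out with preg_match_all. The helper name extractPresaleArgs is hypothetical, and the pattern assumes every argument is a single-quoted string with no embedded quotes, as in the one sample link above:

```php
<?php
// Hypothetical helper: collect the quoted arguments of every
// javascript:goToPresaleResults(...) href found in a chunk of HTML.
function extractPresaleArgs(string $html): array
{
    $calls = [];
    // Capture everything between the parentheses of the call in the href.
    $pattern = '/href="javascript:goToPresaleResults\((.*?)\);?"/i';
    if (preg_match_all($pattern, $html, $matches)) {
        foreach ($matches[1] as $argList) {
            // Assumption: arguments are single-quoted with no embedded quotes.
            preg_match_all("/'([^']*)'/", $argList, $args);
            $calls[] = $args[1];
        }
    }
    return $calls;
}

// Shortened sample link for illustration.
$html = '<a href="javascript:goToPresaleResults(\'18\', \'ALBA\', '
      . '\'05/03/2007\', \'05%2F03%2F2007+Albuquerque+AA\', \'1\');">28 Found</a>';
print_r(extractPresaleArgs($html));
```

    Each entry in the returned array holds the five arguments of one link, in the order they appear in the call.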

    Would it be better for me to download the page to the local system and then search through it to find what I need?

    This is the JavaScript function that is called from within the href attribute.

    Code:
    function goToPresaleResults(saleID, auctionID, saleDate, iResultName, saleChannel)
    {
        document.PSPageSelector.action = "/members/presale/control/powersearchList";
        document.PSPageSelector.saleID.value = saleID;
        document.PSPageSelector.saleNumber.value = saleID;
        document.PSPageSelector.saleDate.value = saleDate;
        document.PSPageSelector.auctionID.value = auctionID;
        if(auctionID == null || auctionID == ''){
            loadOriginalAuctions();
        }else{
            clearAuctions();
        }
        document.PSPageSelector.salechannel.value = saleChannel;
       
        document.PSPageSelector.irname.value = iResultName;
        checkEngineValue();
        document.PSPageSelector.submit();
        return;
    }
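
    Going by the function above, the link does not point to a URL of its own. It fills in fields of the PSPageSelector form and submits it to /members/presale/control/powersearchList. A rough sketch of mirroring those assignments in PHP follows. The helper name presaleFormFields is hypothetical, and the sketch assumes the form needs no other fields and that clearAuctions(), loadOriginalAuctions() and checkEngineValue() do not change what is sent, none of which can be confirmed from the code shown:

```php
<?php
// Hypothetical helper: mirror the assignments made by goToPresaleResults()
// as an array of form fields. Other fields the form may contain are unknown.
function presaleFormFields(array $args): array
{
    list($saleID, $auctionID, $saleDate, $iResultName, $saleChannel) = $args;

    return [
        'saleID'      => $saleID,
        'saleNumber'  => $saleID,       // the script copies saleID into both fields
        'saleDate'    => $saleDate,
        'auctionID'   => $auctionID,
        'salechannel' => $saleChannel,
        'irname'      => $iResultName,  // passed through as-is, like the script does
    ];
}

// Sample arguments for illustration only.
$fields = presaleFormFields(['18', 'ALBA', '05/03/2007', 'SALE', '1']);

// Path taken from the form action set in the script.
$action = '/members/presale/control/powersearchList';
echo $action . "\n" . http_build_query($fields) . "\n";
```

    Whether the form is sent by GET or POST is not visible here. If it is a POST, the same array could probably be handed to Snoopy's submit() method along with the full URL.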