Added fetchimages function :)

irbobo
2006-06-21
2013-05-30
  • irbobo
    2006-06-21

    $snoopy->fetchlinks($URL);
    $links = $snoopy->results;

    // keep only the links that point at .jpg files
    $image_links = array();
    $count_links = count($links);
    for ($i = 0; $i < $count_links; ++$i) {
        if (strstr(strtolower($links[$i]), ".jpg")) {
            $image_links[] = $links[$i];
        }
    }

    # discard other links...
    $links = $image_links;

    // so special... next time just post what u have
    // instead of looking like a loser. N-O-N-E
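
    The filter above only catches ".jpg" and matches it anywhere in the link.
    A minimal sketch of a stricter filter follows, assuming you already have an
    array of absolute URLs. The name filterImageLinks and the default extension
    list are hypothetical and not part of Snoopy.

    ```php
    <?php
    // Hypothetical helper (not part of Snoopy): keep only URLs whose path
    // ends in one of the given image extensions.
    function filterImageLinks(array $links, array $extensions = array('jpg', 'jpeg', 'png', 'gif'))
    {
        $imageLinks = array();
        foreach ($links as $link) {
            // Look only at the path so a query string cannot hide the extension.
            $path = parse_url($link, PHP_URL_PATH);
            if (!is_string($path)) {
                continue; // malformed URL or no path component
            }
            $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
            if (in_array($ext, $extensions, true)) {
                $imageLinks[] = $link;
            }
        }
        return $imageLinks;
    }

    // Sample links invented for illustration.
    $links = array(
        'http://example.com/photo.JPG',
        'http://example.com/page.html',
        'http://example.com/logo.png?v=2',
        'http://example.com/jpg-gallery/',
    );
    print_r(filterImageLinks($links));
    ```

    This still only finds images that the page links to with <a href>, not
    images embedded with <img src>.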

     
  • Tim Wood
    2006-12-11

    I'm not sure what links have to do with images, but the above code doesn't seem to work. I adapted some of Snoopy's code to get this. If you run into problems, note that I originally wrote this as a wrapper around Snoopy and had to make tweaks to use it within Snoopy.

    function _stripImages($document) {
        // Purpose: extract the img elements from an html document
        // Input:   $document - document to search.
        // Output:  $match    - a string of the matched <img> tags, one per line
        // Created by: Tim Wood of the Data Wranglers (www.datwranglers.com)

        preg_match_all("/<IMG [^<>]*>/i",$document,$elements);

        // concatenate the matched tags, separated by CRLF
        $match = implode("\r\n",$elements[0]);

        // return the tags
        return $match;
    }
       
    function fetchImages( $URI ) {
        // This behaves exactly like fetch( ) except that it only returns
        // the images from the page.
       
        // Function:   fetchImages
        // Purpose:    fetch the images from a web page
        // Input:      $URI    where you are fetching from
        // Output:     $this->results    the <img> tags found, as returned by _stripImages
        // Created by: Tim Wood of the Data Wranglers (www.datwranglers.com)
       
        // get the text of a page or fail
        if( ! $this->fetch( $URI ) ) {
            return false;
        }
        $results = $this->results;
       
        if( is_array( $results ) ) {
            for($x=0;$x<count($results);$x++) {
                $results[$x] = $this->_stripImages($results[$x]);
            }
        } else {
            $results = $this->_stripImages($results);
        }

        // update the results
        $this->results = $results;
       
        return true;
    }
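
    For reference, the pattern in _stripImages matches whole <img> tags, not
    URLs. The standalone sketch below runs the same pattern on made-up markup
    and then pulls the src value out of each tag. The second step is an
    assumption about what a caller would want; it is not in the code above and
    only handles quoted src values.

    ```php
    <?php
    // Sample markup invented for illustration.
    $html = '<p><img src="a.png" alt="A"><IMG SRC=\'b.jpg\'></p>';

    // Same pattern as _stripImages: grab each complete <img ...> tag.
    preg_match_all("/<IMG [^<>]*>/i", $html, $elements);
    $tags = $elements[0];

    // Hypothetical follow-up step: extract the quoted src value from each tag.
    $urls = array();
    foreach ($tags as $tag) {
        if (preg_match('/\ssrc\s*=\s*(["\'])(.*?)\1/i', $tag, $m)) {
            $urls[] = $m[2];
        }
    }

    print_r($tags); // the full tags
    print_r($urls); // just the src values
    ```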

     
  • A_L_I_E_N
    2010-08-23

    The solution from coyote4til7 above isn't working correctly.
    Below is the working solution:
    ===============================

    function fetchimages($URI)
    {
        if ($this->fetch($URI))
        {
            if(is_array($this->results))
            {
                for($x=0;$x<count($this->results);$x++)
                    $this->results[$x] = $this->_stripimages($this->results[$x]);
            }
            else
                $this->results = $this->_stripimages($this->results);

            return true;
        }
        else
            return false;
    }
    

    ===============================

    function _stripimages($document)
    {
        preg_match_all("'<\s*img\s.*?src\s*=\s*         # find <img ... src=
                        ([\"\'])?                   # find single or double quote
                        (?(1) (.*?)\\1 | ([^\s\>]+))        # if quote found, match up to next matching
                                                    # quote, otherwise match up to next space
                        'isx",$document,$links);

        // collect the non-empty matches from the conditional subpattern;
        // foreach is used because each() was removed in PHP 8
        $match = array();

        foreach($links[2] as $val)
        {
            if(!empty($val))
                $match[] = $val;
        }

        foreach($links[3] as $val)
        {
            if(!empty($val))
                $match[] = $val;
        }

        // return the image URLs
        return $match;
    }
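
    As an alternative to the regular expression above, the src values can also
    be collected with PHP's DOM extension. This is a different technique from
    the one Snoopy uses, shown only as a sketch: the function name
    extractImageSources is hypothetical, and it assumes the dom extension is
    installed.

    ```php
    <?php
    // Hypothetical helper (not part of Snoopy): collect img src values with
    // DOMDocument instead of a regular expression.
    function extractImageSources($html)
    {
        $sources = array();
        if (trim($html) === '') {
            return $sources; // loadHTML() rejects empty input
        }

        $doc = new DOMDocument();
        // Real-world HTML is often invalid; keep parser warnings off the output.
        $previous = libxml_use_internal_errors(true);
        $doc->loadHTML($html);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        foreach ($doc->getElementsByTagName('img') as $img) {
            $src = $img->getAttribute('src');
            if ($src !== '') {
                $sources[] = $src; // skip img elements without a src
            }
        }
        return $sources;
    }

    // Sample markup invented for illustration.
    $html = '<html><body><img src="a.png"><IMG SRC=b.jpg alt=x><img alt="no source"></body></html>';
    print_r(extractImageSources($html));
    ```

    Either way the returned values are whatever the page wrote in src, so
    relative paths still need to be resolved against the page URL.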