Added fetchimages function :)

2004-02-24
2013-05-30
  • Mikael Bladh
    2004-02-24

    I just wanted to let you know that I have added a "fetchimages" function to the excellent Snoopy class. It works exactly like the "fetchlinks" function, except that it returns an array of image URLs. I don't know if anyone has already done this, but if someone is interested, I have the function.

    //O-N-E

     
    • Kelly McLellan
      2004-05-24

      Could you post it?

       
    • irbobo
      2006-06-21

      $snoopy->fetchlinks($URL);
      $links = $snoopy->results;
             
      $image_links = array();
      $count_links = count($links);
      for($i=0; $i < $count_links; ++$i){
          // keep only the links that mention a .jpg file
          if(strstr(strtolower($links[$i]),".jpg"))
              $image_links[] = $links[$i];
      }
             
      # discard other links...
      $links = $image_links;

      // so special... next time just post what u have
      // instead of looking like a loser. N-O-N-E
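
A slightly more general take on the filter above, as a rough sketch only. The helper name filterImageLinks and the default extension list are made up for illustration and are not part of Snoopy. It checks the extension of the URL path rather than searching the whole link for ".jpg", so links with query strings still match and links that merely contain "jpg" somewhere do not.

```php
<?php
// Hypothetical helper (not part of Snoopy): keep only the links whose
// path ends in one of the given image extensions.
function filterImageLinks(array $links, array $extensions = array('jpg', 'jpeg', 'png', 'gif'))
{
    $images = array();
    foreach ($links as $link) {
        // Look at the path only, so "photo.jpg?size=large" still counts.
        $path = parse_url($link, PHP_URL_PATH);
        if (!is_string($path)) {
            continue; // no path component, or a URL parse_url cannot handle
        }
        $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
        if (in_array($ext, $extensions, true)) {
            $images[] = $link;
        }
    }
    return $images;
}
```

Assuming $snoopy->results holds the array produced by fetchlinks, it could be called as $links = filterImageLinks($snoopy->results);. Note that this still only sees images that are linked to, not images embedded with img tags.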

       
    • Tim Wood
      2006-12-11

      I'm not sure what links have to do with images, but the above code doesn't seem to work.  I adapted some of Snoopy's code to get this.  If you run into problems, note that I created this as a wrapper to Snoopy and had to make tweaks to use it within Snoopy.

      function _stripImages($document) {
          // Purpose: strip the img elements from an html document
          // Input:   $document - document to strip.
          // Output:  $match    - the matched img tags, one per line, as a single string
          // Created by: Tim Wood of the Data Wranglers (www.datwranglers.com)
         
          preg_match_all("/<IMG [^<>]*>/i",$document,$elements);
         
          // concatenate the matches
          $match = implode("\r\n",$elements[0]);
         
          // return the matched tags
          return $match;
      }
         
      function fetchImages( $URI ) {
          // This behaves exactly like fetch( ) except that it only returns
          // the images from the page.
         
          // Function:   fetchImages
          // Purpose:    fetch the images from a web page
          // Input:      $URI    where you are fetching from
          // Output:     $this->results    the img tags found on the page (whole tags, not bare URLs)
          // Created by: Tim Wood of the Data Wranglers (www.datwranglers.com)
         
          // get the text of a page or fail
          if( ! $this->fetch( $URI ) ) {
              return false;
          }
          $results = $this->results;
         
          if( is_array( $results ) ) {
              for($x=0;$x<count($results);$x++) {
                  $results[$x] = $this->_stripImages($results[$x]);
              }
          } else {
              $results = $this->_stripImages($results);
          }

          // update the results
          $this->results = $results;
         
          return true;
      }

       
  • A_L_I_E_N
    2010-08-23

    The solution from coyote4til7 above isn't working correctly.
    Below is the working solution:
    ===============================

    function fetchimages($URI)
    {
            if ($this->fetch($URI))
            {           
                if(is_array($this->results))
                {
                    for($x=0;$x<count($this->results);$x++)
                        $this->results[$x] = $this->_stripimages($this->results[$x]);
                }
                else
                    $this->results = $this->_stripimages($this->results);
    
                return true;
            }
            else
                return false;
    }
    

    ===============================

    function _stripimages($document)
    {
            preg_match_all("'<\s*img\s.*?src\s*=\s*         # find <img ... src=
                            ([\"\'])?                   # find single or double quote
                            (?(1) (.*?)\\1 | ([^\s\>]+))        # if quote found, match up to next matching
                                                        # quote, otherwise match up to next space
                            'isx",$document,$links);
    
            // collect the non-empty matches from the conditional subpattern
            // (each() was removed in PHP 8, so use foreach)
            $match = array();
            foreach($links[2] as $val)
            {
                if(!empty($val))
                    $match[] = $val;
            }
    
            foreach($links[3] as $val)
            {
                if(!empty($val))
                    $match[] = $val;
            }
    
            // return the image URLs (an empty array if none were found)
            return $match;
    }
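
For anyone who wants to try the src extraction without patching Snoopy, here is a standalone sketch of the same idea. The function name extractImageSources is hypothetical, and the pattern is a simplified variant of the one above, not the one Snoopy ships. It handles double-quoted, single-quoted and unquoted src values and returns them in document order.

```php
<?php
// Hypothetical standalone helper; not part of Snoopy.
function extractImageSources($html)
{
    // Match src="...", src='...' or an unquoted src value inside an <img> tag.
    // Requiring whitespace before "src" avoids picking up attributes such as data-src.
    preg_match_all(
        '/<img\b[^>]*?\ssrc\s*=\s*(?:"([^"]*)"|\'([^\']*)\'|([^\s>]+))/i',
        $html,
        $matches,
        PREG_SET_ORDER
    );

    $sources = array();
    foreach ($matches as $set) {
        // At most one of the three capture groups holds a non-empty value.
        foreach (array(1, 2, 3) as $group) {
            if (isset($set[$group]) && $set[$group] !== '') {
                $sources[] = $set[$group];
                break;
            }
        }
    }
    return $sources;
}
```

With Snoopy this could be fed from a plain fetch, for example $images = extractImageSources($snoopy->results); after a successful $snoopy->fetch($URL). Like any regex approach it can be fooled by unusual markup, so treat the result as best effort.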