scraper_element

kavulix

Element name: scraper

The scraper element is supported by all items that display a springboard screen (audio, video, slideshow). Use it to fill in missing content on the springboard screen before the screen is displayed. You specify the URL of an HTML page and a regular expression whose parentheses capture a specific portion of that page. The captured value is then either added as an attribute on the parent element or used to create a new sibling element. In the example below, the URL is extracted from the src attribute of a video element on an HTML page and added to the parent element (i.e., the video item) as the url attribute. To use scrapers, you must set the enablescrapers attribute on the parent item.


Example:

<item type="video" enablescrapers="true">
   <scraper type="parent" ename="item" aname="url" url="http://yoursite.com/page.htm" regex="&lt;video\s.*?src=\x22(.+?)\x22" limit="1"/>
</item>
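
To make the extraction step concrete, here is a minimal Python sketch of what the example above describes. It is not the application's actual implementation: the sample HTML, the function name apply_parent_scraper, and the use of Python's re and xml.etree.ElementTree modules are assumptions made for illustration. It only shows how the parenthesized part of the regex yields the value that ends up in the attribute named by aname.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the page that the scraper's url attribute points to.
SAMPLE_HTML = (
    '<html><body>'
    '<video width="640" src="http://yoursite.com/clip.mp4"></video>'
    '</body></html>'
)

# The item from the example. The XML parser decodes &lt; to < in the regex attribute.
ITEM_XML = r'''<item type="video" enablescrapers="true">
   <scraper type="parent" ename="item" aname="url" url="http://yoursite.com/page.htm" regex="&lt;video\s.*?src=\x22(.+?)\x22" limit="1"/>
</item>'''


def apply_parent_scraper(item, html):
    """Copy the first parenthesized match into the attribute named by aname."""
    scraper = item.find("scraper")
    match = re.search(scraper.get("regex"), html)
    if match:
        # group(1) is the text matched inside the parentheses.
        item.set(scraper.get("aname"), match.group(1))
    return item


item = apply_parent_scraper(ET.fromstring(ITEM_XML), SAMPLE_HTML)
print(item.get("url"))
```

In this sketch \x22 is the hex escape for a double quote, which lets the regex match quoted attribute values without placing a literal quote inside the XML attribute.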

Supported Attributes

Name: type
Type: string
Supported values:
parent|sibling

Name: ename
Type: string
Supported values:
the element name
Notes:
If the type attribute is set to parent, the ename attribute is always item.

Name: aname
Type: string
Supported values:
the attribute name

Name: url
Type: string
Supported values:
any URL pointing to an HTML or text document

Name: regex
Type: string
Supported values:
a regular expression that includes parentheses around the portion to extract

Name: limit
Type: integer
Supported values:
any integer
Notes:
This attribute limits the number of matches returned by the regex. For example, if you are scraping image URLs to create a slideshow and want to display only the first 5 image URLs found, set this attribute to 5.
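
The effect of limit can be sketched in Python as follows. This is an illustration under stated assumptions, not the application's own code: the sample page, the img regex, and the helper name scrape are hypothetical, and the sketch assumes matches are taken in page order.

```python
import re
from itertools import islice

# Hypothetical page content containing seven image tags.
SAMPLE_HTML = "".join(
    f'<img src="http://yoursite.com/photo{n}.jpg">' for n in range(1, 8)
)


def scrape(html, regex, limit):
    """Return the parenthesized group of at most `limit` matches, in page order."""
    matches = re.finditer(regex, html)
    return [m.group(1) for m in islice(matches, limit)]


# Seven images are on the page, but only the first five are kept.
urls = scrape(SAMPLE_HTML, r"<img\s.*?src=\x22(.+?)\x22", 5)
print(len(urls))
print(urls[0])
```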


Supported Child Nodes

NONE

