new to web harvest need your help

Antony raj
2011-03-11
2012-09-04
  • Antony raj

    Antony raj - 2011-03-11

    How to give a dynamic host URL in config ...

     
  • Antony raj

    Antony raj - 2011-03-11

    Hi folks

    <config charset="UTF-8">
        <var-def name="startUrl">http://news.bbc.co.uk</var-def>

        <var-def name="urlList">
            <xpath expression="//img/@src">
                <html-to-xml>
                    <http url="${startUrl}"/>
                </html-to-xml>
            </xpath>
        </var-def>

        <loop filter="unique" item="link" maxloops="10" index="i">
            <list>
                <var name="urlList"/>
            </list>
            <body>
                <file action="write" path="images/${i}.gif" type="binary">
                    <http url="${sys.fullUrl(startUrl, link)}"/>
                </file>
            </body>
        </loop>
    </config>

    The config file above has a single start URL, but I want to give it 10
    start URLs. Is that possible?

     
  • newbee

    newbee - 2011-03-11

    Do something like:

    <config charset="UTF-8">
        <var-def name="startUrl"><![CDATA[
            <urls>
                <url>...</url>
                <url>...</url>
            </urls>
        ]]></var-def>

        <loop item="url">
            <list>
                <xpath expression="data(//url)">
                    <var name="startUrl"/>
                </xpath>
            </list>
            <body>
                <empty>
                    <var-def name="urlList">
                        <xpath expression="//img/@src">
                            <html-to-xml>
                                <http url="${url}"/>
                            </html-to-xml>
                        </xpath>
                    </var-def>

                    <loop filter="unique" item="link" maxloops="10" index="i">
                        <list>
                            <var name="urlList"/>
                        </list>
                        <body>
                            <file action="write" path="images/${i}.gif" type="binary">
                                <http url="${link}"/>
                            </file>
                        </body>
                    </loop>
                </empty>
            </body>
        </loop>
    </config>

     
  • Antony raj

    Antony raj - 2011-03-14

    Hi nakoned

    Thanks for your post; it was useful to me.

    Can you tell me how to split a string using a delimiter?

    for example:

    inside <xq-expression><![CDATA[
        ...
        return
            <school>
                {data($fulladdress)}
            </school>
    ]]></xq-expression>

    Here I get fulladdress as "india , chennai , 123rd street".

    I want to split this.

    I tried an XQuery function, but it returns an error.

    Thanks

     
  • newbee

    newbee - 2011-03-14

    You probably want to take a look at the XPath 'tokenize' function.

     
  • Alex Wajda

    Alex Wajda - 2011-03-14

    Re 1st question - there is absolutely no need for CDATA or XPath in this
    case :)

    The following straightforward way works:

    <def var="urls">
        url1
        url2
        url3
    </def>
    <loop ....>
        <list>
            <get var="urls"/>
        </list>
    </loop>
    

    Re 2nd question - yes, XPath tokenize would do.
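
A minimal sketch of the `tokenize` suggestion, written as a Web-Harvest `xquery` block. It assumes the Web-Harvest 2.0 element names (`xquery`, `xq-param`, `xq-expression`) and that `xq-param` accepts `type="string"`. The `part` element name and the hard-coded address are illustrative only; in a real config the parameter body would be whatever processor produces `fulladdress`.

```xml
<xquery>
    <!-- Assumption: the address is passed in as a plain string parameter -->
    <xq-param name="fulladdress" type="string">india , chennai , 123rd street</xq-param>
    <xq-expression><![CDATA[
        declare variable $fulladdress as xs:string external;
        (: split on commas, swallowing the whitespace around each one :)
        for $part in tokenize(normalize-space($fulladdress), "\s*,\s*")
        return <part>{$part}</part>
    ]]></xq-expression>
</xquery>
```

The pattern `\s*,\s*` treats a comma plus any surrounding whitespace as the delimiter, so the pieces should come back as `india`, `chennai` and `123rd street` with no stray spaces to trim afterwards.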

     
