WebHarvest - web data extraction tool / Discussion / Help: Need Help

I am trying to scrape The Hindu newspaper site, as in the NYTimes example. My config file is below, but the output contains only the date.

<var-def name="startUrl">http://www.hinduonnet.com/</var-def>
<file action="write" path="hindu/hindu.xml" charset="UTF-8">
    <template>
        <![CDATA[ <Hindu date="${sys.datetime("dd.MM.yyyy")}"> ]]>
    </template>

    <loop item="articleUrl" index="i">
        <!-- collects URLs of all articles from the front page -->
        <list>
            <xpath expression="//td/a[@class='topstory']/@href">
                <html-to-xml>
                    <http url="${startUrl}"/>
                </html-to-xml>
            </xpath>
            <xpath expression="//div[@class='bluebk']/a[1]/@href">
                <html-to-xml>
                    <http url="${startUrl}"/>
                </html-to-xml>
            </xpath>
        </list>

        <!-- downloads each article and extract data from it -->
        <body>
            <xquery>
                <xq-param name="doc">
                    <html-to-xml>
                        <http url="${sys.fullUrl(startUrl, articleUrl)}?&amp;pagewanted=print"/>
                    </html-to-xml>
                </xq-param>
                <xq-expression><![CDATA[
                    let $author := data($doc//div[@class="otherstory"])
                    let $title := data($doc/font[@class="storyhead"])
                    let $text := data($doc//p/a[1])
                    return
                        <article>
                            <title>{normalize-space($title)}</title>
                            <author>{normalize-space($author)}</author>
                            <text>{normalize-space($text)}</text>
                        </article>
                ]]></xq-expression>
            </xquery>
        </body>
    </loop>
    <![CDATA[ </Hindu> ]]>

Please give me the correct XPath expressions so that I can syndicate the online newspaper.
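
One way to narrow this down (a sketch, not a confirmed fix): since the output has the date but no article elements at all, the loop is probably receiving an empty list, which would mean the two front-page XPath expressions match nothing. Writing the cleaned front page to a file makes it possible to check which class names and links are actually there. The output path hindu/frontpage.xml is a hypothetical name chosen for illustration, and the enclosing config element is assumed; the other elements are the ones already used in the config above.

```xml
<config charset="UTF-8">
    <var-def name="startUrl">http://www.hinduonnet.com/</var-def>

    <!-- Diagnostic only: save the cleaned-up front page so the real
         element and class names can be inspected by hand.
         "hindu/frontpage.xml" is a hypothetical output path. -->
    <file action="write" path="hindu/frontpage.xml" charset="UTF-8">
        <html-to-xml>
            <http url="${startUrl}"/>
        </html-to-xml>
    </file>
</config>
```

If topstory or bluebk do not appear in that file, the front-page expressions need to change. Separately, $doc/font[...] only matches a font element directly under the document node, while $doc//font[...] searches at any depth, so the title expression may also need the double slash.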
