no change in url for next page in multipage

Help
2010-06-15
2012-09-04
  • kalyan kumar

    kalyan kumar - 2010-06-15

    Hi,

    How can I scrape a site where the URL does not change when we click the
    next-page button?

    For example

    http://www.rupapublications.co.in/client/Category/Biography.aspx

    On this site the URL doesn't change when we move to the next page. What can
    be done for this kind of multi-page scraping?

    Thanks in advance

    Kalyan

     
  • Anonymous

    Anonymous - 2010-06-15

    Looks like AJAX.

    Use Firebug (or a similar tool) to trace the HTTP "conversation" between
    client and server so you can latch on to the appropriate URL.

     
  • kalyan kumar

    kalyan kumar - 2010-06-15

    Thank you very much...

    I will try to find out how I can use Firebug to find the appropriate URL...

    If anybody can give some pointers on how to do it, that would be very useful
    to me.

    Thanks in advance

    Kalyan

     
  • kalyan kumar

    kalyan kumar - 2010-06-15

    The XHR tab under Firebug's Net tab shows 0 requests when I navigate to the
    next page.

    Does that mean no AJAX requests are being made?

     
  • Anonymous

    Anonymous - 2010-06-15

    Use the "ALL" filter to see all traffic.

    There are two possibilities:

    1. The data is sent from the server to the client for every page, in which case it should show up in Firebug.
    2. The data is already loaded with the initial page and is only displayed in pages (do a view-source on the page to confirm).
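
    A quick way to tell the two cases apart, as a minimal sketch: take the
    HTML of the first page, pick some text that only appears after clicking
    "next", and search for it. The class name and the sample strings below
    are made up for illustration. A miss is only a hint, since the text could
    be escaped or assembled by script.

```java
class SourceCheckSketch {

    enum Verdict { ALREADY_IN_SOURCE, FETCHED_LATER }

    // initialHtml: the source of the first page (what "view source" shows).
    // textFromLaterPage: a title or phrase seen only after clicking "next".
    static Verdict check(String initialHtml, String textFromLaterPage) {
        return initialHtml.contains(textFromLaterPage)
                ? Verdict.ALREADY_IN_SOURCE
                : Verdict.FETCHED_LATER;
    }

    public static void main(String[] args) {
        // Hypothetical page source: "Book K" is present but hidden by CSS.
        String initialHtml =
                "<ul><li>Book A</li><li style=\"display:none\">Book K</li></ul>";
        System.out.println(check(initialHtml, "Book K"));
        System.out.println(check(initialHtml, "Book Z"));
    }
}
```

    If the text is already in the source, the whole list can be parsed from
    the first page; otherwise the request that fetches it has to be found.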
     
  • Kalpesh Gada

    Kalpesh Gada - 2010-06-17

    Hi Kalyan,

    The website that you have mentioned is developed using ASP.NET. So when you
    click on that "Next" icon, it fires an onclick event that runs code on the
    server side, and the modified content is returned at the same URL.
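
    To illustrate what such a click usually sends, here is a rough sketch,
    assuming the page follows the common ASP.NET Web Forms pattern: a link
    calling __doPostBack('target','argument') plus hidden fields such as
    __VIEWSTATE that are posted back to the same URL. Whether this particular
    site works that way is an assumption. The control name ctl00$Pager$lnkNext
    and the sample HTML are hypothetical, and the regexes stand in for a real
    HTML parser.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PostbackSketch {

    // Hidden inputs in the attribute order ASP.NET usually renders: type, name, ..., value.
    private static final Pattern HIDDEN = Pattern.compile(
            "<input[^>]*type=\"hidden\"[^>]*name=\"([^\"]+)\"[^>]*value=\"([^\"]*)\"",
            Pattern.CASE_INSENSITIVE);

    // A link or handler of the form __doPostBack('target','argument').
    private static final Pattern DO_POSTBACK = Pattern.compile(
            "__doPostBack\\('([^']*)','([^']*)'\\)");

    // Collects hidden form fields (e.g. __VIEWSTATE) in document order.
    static Map<String, String> hiddenFields(String html) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = HIDDEN.matcher(html);
        while (m.find()) {
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }

    // Returns {target, argument} of the first __doPostBack call, or null if none is found.
    static String[] postbackCall(String html) {
        Matcher m = DO_POSTBACK.matcher(html);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    // Builds the URL-encoded form body that the browser would submit on the click.
    static String formBody(Map<String, String> hidden, String target, String argument) {
        Map<String, String> form = new LinkedHashMap<>(hidden);
        form.put("__EVENTTARGET", target);
        form.put("__EVENTARGUMENT", argument);
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : form.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
                .append('=')
                .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    // Prepares (but does not send) the POST to the same URL as the original page.
    static HttpRequest postbackRequest(String url, String body) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical fragment of a first page; real pages carry more hidden fields.
        String html = "<form><input type=\"hidden\" name=\"__VIEWSTATE\" "
                + "id=\"__VIEWSTATE\" value=\"/wEPabc=\" />"
                + "<a href=\"javascript:__doPostBack('ctl00$Pager$lnkNext','')\">Next</a></form>";
        String[] call = postbackCall(html);
        String body = formBody(hiddenFields(html), call[0], call[1]);
        HttpRequest request = postbackRequest(
                "http://www.rupapublications.co.in/client/Category/Biography.aspx", body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body);
    }
}
```

    Sending the request with java.net.http.HttpClient (Java 11+) should
    return the HTML of the next page if the assumptions hold. The hidden
    fields have to be re-read from each response before the next postback.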

    Web-Harvest cannot perform these onclick-based events. You might want to
    use the HtmlUnit API to perform the onclick action and get the HTML content
    of the page. Once you have the content of the page, you can use Web-Harvest
    for parsing.

    Hope this makes sense to you.

    Thanks

     
  • alosada

    alosada - 2011-11-28

    Hi,

    Could you show us an example of integration between Web-Harvest and
    HtmlUnit?

    I have the same issue when requesting detail information at a specific URL.
    It seems that those links are created by an onclick event. I'd like to
    integrate HtmlUnit into a Web-Harvest script, so that once HtmlUnit has led
    me to the detail information, I can go back to parsing that page with
    Web-Harvest.

    Thanks in advance

     
