Re: [Htmlunit-user] Help Extracting Schedule from a Website


If you open the page in a browser, you get all sorts of information
on a piece of legislation. My code returns only the template, without
any of the data/HTML populated. Does this make sense?

On Wednesday, August 12, 2015, Ahmed Ashour <asa...@ya...> wrote:

> Hi,
>
> You can also use Thread.sleep().
>
> What is missing (not loaded)?
>
> Please read http://htmlunit.sourceforge.net/submittingJSBugs.html
>
> Ahmed
>
> ------------------------------
> *From:* Rich Goldman <ri...@um...>
> *To:* htm...@li...
> *Sent:* Wednesday, August 12, 2015 10:22 PM
> *Subject:* Re: [Htmlunit-user] Help Extracting Schedule from a Website
>
>
>
> I am trying to get the populated HTML of another site, but it is not
> loading, even with the wait time set to 30 seconds.
>
> The address is
> http://alisondb.legislature.state.al.us/Alison/SESSBillResult.aspx?BILL=HB1&WIN_TYPE=SELECTED_STATUS
>
>
> Are any of you able to get page.asXml() to produce populated HTML for this
> address?
>
> I've updated to 2.18 and I've tried calling
> waitForBackgroundJavaScript in multiple places without success. My code
> is:
>
> public String getWebsiteTextWithJavaScript(String url) {
>     WebClient webClient = new WebClient(BrowserVersion.INTERNET_EXPLORER_6);
>     HtmlPage page = null;
>     try {
>         webClient.waitForBackgroundJavaScript(30000);
>         page = webClient.getPage(url);
>     } catch (FailingHttpStatusCodeException e) {
>         // TODO Auto-generated catch block
>         e.printStackTrace();
>     } catch (MalformedURLException e) {
>         // TODO Auto-generated catch block
>         e.printStackTrace();
>     } catch (IOException e) {
>         // TODO Auto-generated catch block
>         e.printStackTrace();
>     }
>     //
>     // // Thread.sleep(10000);
>     webClient.waitForBackgroundJavaScript(30000);
>     String text = page.asXml();
>     webClient.waitForBackgroundJavaScript(30000);
>     page.cleanUp();
>     webClient.closeAllWindows();
>
>     return text;
> }
>
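
For anyone following the thread: instead of a single fixed wait or Thread.sleep(), one option is to poll the page markup until it contains something that only appears once the data has loaded. The sketch below is plain Java with no HtmlUnit dependency; the class name PopulatedPagePoller, the method pollUntil, and the idea of using a marker string are all hypothetical and not part of HtmlUnit or the code above. It also cannot help if the page's JavaScript never runs successfully in HtmlUnit at all.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical helper (not an HtmlUnit class): re-reads a page snapshot
// until it looks populated or the timeout expires.
final class PopulatedPagePoller {

    private PopulatedPagePoller() {
    }

    /**
     * @param snapshot       supplies the current markup each time it is called
     * @param isPopulated    decides whether the markup contains the expected data
     * @param timeoutMillis  maximum total time to keep polling
     * @param intervalMillis pause between two snapshots
     * @return the last snapshot taken, populated or not
     */
    static String pollUntil(Supplier<String> snapshot,
                            Predicate<String> isPopulated,
                            long timeoutMillis,
                            long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        String latest = snapshot.get();
        // Subtraction keeps the comparison correct even if nanoTime overflows.
        while (!isPopulated.test(latest) && deadline - System.nanoTime() > 0) {
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and stop waiting.
                Thread.currentThread().interrupt();
                break;
            }
            latest = snapshot.get();
        }
        return latest;
    }
}
```

With HtmlUnit this would presumably be called after getPage(url) has returned, along the lines of pollUntil(page::asXml, xml -> xml.contains("HB1"), 30000, 500). Whether the bill number actually appears in the populated markup is an assumption to verify in a browser first. If the result is still the bare template after the timeout, that points at a JavaScript error rather than a timing problem.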
