From: Rich G. <ri...@um...> - 2015-08-12 22:30:53
If you go to the webpage in a browser you'll get all sorts of information on a piece of legislation. This code simply returns the template without any data/html populated. Does this make sense?

On Wednesday, August 12, 2015, Ahmed Ashour <asa...@ya...> wrote:

> Hi,
>
> You can also use Thread.sleep().
>
> What is missing (not loaded)?
>
> Please read http://htmlunit.sourceforge.net/submittingJSBugs.html
>
> Ahmed
>
> ------------------------------
> *From:* Rich Goldman <ri...@um...>
> *To:* htm...@li...
> *Sent:* Wednesday, August 12, 2015 10:22 PM
> *Subject:* Re: [Htmlunit-user] Help Extracting Schedule from a Website
>
> I am trying to get the populated HTML of another site but it is not
> loading, despite putting the wait time to 30 seconds.
>
> The address is
> http://alisondb.legislature.state.al.us/Alison/SESSBillResult.aspx?BILL=HB1&WIN_TYPE=SELECTED_STATUS
>
> Are you all able to get page.asXml(); to produce populated html for this
> address?
>
> I've updated to 2.18 and I've tried putting
> the waitForBackgroundJavaScript in multiple places without success. My code
> is:
>
> public String getWebsiteTextWithJavaScript(String url) {
>     WebClient webClient = new WebClient(BrowserVersion.INTERNET_EXPLORER_6);
>     HtmlPage page = null;
>     try {
>         webClient.waitForBackgroundJavaScript(30000);
>         page = webClient.getPage(url);
>     } catch (FailingHttpStatusCodeException e) {
>         // TODO Auto-generated catch block
>         e.printStackTrace();
>     } catch (MalformedURLException e) {
>         // TODO Auto-generated catch block
>         e.printStackTrace();
>     } catch (IOException e) {
>         // TODO Auto-generated catch block
>         e.printStackTrace();
>     }
>     //
>     // // Thread.sleep(10000);
>     webClient.waitForBackgroundJavaScript(30000);
>     String text = page.asXml();
>     webClient.waitForBackgroundJavaScript(30000);
>     page.cleanUp();
>     webClient.closeAllWindows();
>
>     return text;
> }