From: Rich G. <ri...@um...> - 2015-03-06 01:22:10
Sorry, ignore this...another newbie error.
-Rich

On Thu, Mar 5, 2015 at 1:09 PM, Rich Goldman <ri...@um...> wrote:

> One followup:
>
> I'm trying to get the HTML (mainly the link URLs) included in some of the
> agendas but it's not coming through using:
>
> WebClient webClient = new WebClient(BrowserVersion.CHROME);
> HtmlPage page = webClient.getPage("http://www.house.leg.state.mn.us/schedules/schedule.aspx#03/04/2015");
> Thread.sleep(10000);
> System.out.println(page.asXml());
>
> Is there something I need to do in order to get/keep the HTML rendering?
>
> I get (3/4/2015) HF0416-A15-0112.pdf instead of that text with the link to
> the pdf file...
> -Rich
>
> On Wed, Mar 4, 2015 at 11:45 AM, Ahmed Ashour <asa...@ya...> wrote:
>
>> Hi Rich,
>>
>> Well, waitForBackgroundJavaScript() is actually better than
>> Thread.sleep(), and it works.
>>
>> It didn't work for you before because you called it 'before'
>> webClient.getPage(); it should be called 'after', to allow JavaScript/AJAX
>> to run.
>>
>> Hope that clarifies,
>>
>> Ahmed
>> ------------------------------
>> *From:* Rich Goldman <ri...@um...>
>> *To:* htm...@li...
>> *Sent:* Wednesday, March 4, 2015 4:56 PM
>> *Subject:* Re: [Htmlunit-user] Help Extracting Schedule from a Website
>>
>> I think I was confused between using Thread.sleep(10000)
>> and webClient.waitForBackgroundJavaScript(10000).
>>
>> Thanks again.
>> -Rich
>>
>> On Wed, Mar 4, 2015 at 10:21 AM, Alain BUFERNE <alb...@gm...>
>> wrote:
>>
>> With HtmlUnit, you generally just program what a normal human being
>> would do to use the website. Since you just need the information sent by
>> the server in response to clicking this or selecting that, you don't need
>> to execute JS code yourself.
>>
>> 2015-03-04 7:05 GMT+01:00 Rich Goldman <ri...@um...>:
>>
>> Doing a bit more digging, it seems the JavaScript functions for
>> populating the agenda items are in:
>> http://www.house.leg.state.mn.us/schedules/ScheduleElements0.js?v=1.12
>>
>> I don't know enough JavaScript to know how to execute these functions
>> appropriately though.
>> -Rich
>>
>> On Wed, Mar 4, 2015 at 12:41 AM, Rich Goldman <ri...@um...> wrote:
>>
>> I'm trying to get the schedule information posted at:
>>
>> http://www.house.leg.state.mn.us/schedules/schedule.aspx#03/06/2015
>>
>> The content is loaded dynamically (presumably via AJAX) and I've tried
>> the following code:
>>
>> final WebClient webClient = new WebClient(BrowserVersion.CHROME);
>> webClient.waitForBackgroundJavaScript(10000);
>> final HtmlPage page = webClient
>>         .getPage("http://www.house.leg.state.mn.us/schedules/schedule.aspx#03/06/2015");
>> String javaScriptCode = "SchedJSx.Init();";
>>
>> ScriptResult result = page.executeJavaScript(javaScriptCode);
>> result.getJavaScriptResult();
>> System.out.println("result: " + result.getJavaScriptResult());
>>
>> I can get some of the dynamic content:
>> Friday, March 06, 2015
>> 10:30 AM
>> Health and Human Services Reform
>> Chair: Rep. Tara Mack
>> Location: Basement State Office Building
>> Note:
>> ***Additional bills may be added
>>
>> but not the agenda/bill list.
>>
>> I feel like I'm missing something simple that I'm not aware of as a
>> newbie. I would appreciate a skilled HtmlUnit user looking at the source
>> code of the source website and pointing out what I'm missing so I can
>> extract the agenda for this meeting as well.
>>
>> Thanks for any help you can provide.
>> -Rich
>>
>> ------------------------------------------------------------------------------
>> Dive into the World of Parallel Programming The Go Parallel Website,
>> sponsored by Intel and developed in partnership with Slashdot Media, is
>> your hub for all things parallel software development, from weekly thought
>> leadership blogs to news, videos, case studies, tutorials and more. Take a
>> look and join the conversation now. http://goparallel.sourceforge.net/
>> _______________________________________________
>> Htmlunit-user mailing list
>> Htm...@li...
>> https://lists.sourceforge.net/lists/listinfo/htmlunit-user
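
Ahmed's point about call order can be illustrated with a toy model. The sketch below uses only the JDK and is not HtmlUnit code: `ToyClient`, its `getPage()` and `waitForBackground()` methods, and the 200 ms delay are all hypothetical stand-ins invented for this illustration. It only models the idea that a wait can do nothing useful until the page load has started some background work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Toy model only: ToyClient is a hypothetical stand-in, not HtmlUnit's WebClient.
class WaitOrderDemo {

    static final class ToyClient {
        private final List<CompletableFuture<Void>> jobs = new ArrayList<>();
        private final StringBuffer content = new StringBuffer(); // thread-safe appends

        // Static markup is available at once; the agenda arrives from a background job.
        void getPage() {
            content.append("header;");
            jobs.add(CompletableFuture.runAsync(() -> {
                try {
                    Thread.sleep(200); // simulated AJAX latency (assumed value)
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                content.append("agenda;");
            }));
        }

        // Blocks until the jobs started so far finish or the timeout expires;
        // returns the number of jobs still running.
        int waitForBackground(long timeoutMillis) {
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            for (CompletableFuture<Void> job : jobs) {
                long remaining = Math.max(0L, deadline - System.nanoTime());
                try {
                    job.get(remaining, TimeUnit.NANOSECONDS);
                } catch (TimeoutException e) {
                    break; // out of time, stop waiting
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                } catch (ExecutionException e) {
                    // a failed job counts as finished for the purpose of waiting
                }
            }
            int running = 0;
            for (CompletableFuture<Void> job : jobs) {
                if (!job.isDone()) {
                    running++;
                }
            }
            return running;
        }

        String content() {
            return content.toString();
        }
    }

    // Waiting first: nothing has been started yet, so the wait returns immediately.
    static String waitThenLoad() {
        ToyClient client = new ToyClient();
        client.waitForBackground(1000);
        client.getPage();
        return client.content();
    }

    // Loading first: the wait now has a background job to wait for.
    static String loadThenWait() {
        ToyClient client = new ToyClient();
        client.getPage();
        client.waitForBackground(1000);
        return client.content();
    }

    public static void main(String[] args) {
        System.out.println("wait, then load: " + waitThenLoad());
        System.out.println("load, then wait: " + loadThenWait());
    }
}
```

In this model the wait-then-load order reads the content before the background job has had a chance to add the agenda, while load-then-wait picks it up. For the code in the original post, the corresponding change is to move `webClient.waitForBackgroundJavaScript(10000)` below the `webClient.getPage(...)` call, as Ahmed describes.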