From: Xue-Feng Y. <xy...@no...> - 2017-07-13 17:58:51
|
Yes, I use it to download data with the different parameters. Thanks. |
From: Albu G. <alb...@gm...> - 2017-07-13 17:48:10
|
I don't understand what you mean by "load a few hundred remote pages". HtmlUnit is used to interact with pages; it is a silent (headless) browser. Are you interacting with hundreds of pages? |
From: Xue-Feng Y. <xy...@no...> - 2017-07-13 17:44:18
|
Thanks. That solution is a little complicated, since I need to load a few hundred remote pages. I'll try it later if my current method doesn't work. |
From: Albu G. <alb...@gm...> - 2017-07-13 16:48:30
|
You are really testing my memory, man...

My idea is that there are some timers set in the page (auto refresh, update and so on), and as explained here:
http://www.webdeveloper.com/forum/showthread.php?233448-Is-there-a-way-to-find-if-any-intervals-are-still-open

"You cannot reliably tell if there are any unnamed intervals running, but you can shut down any that are open."

In my previous answer you can see a call to a method named attendPourJavascriptSaufTimers ("wait for JavaScript except timers"), for example in:

    // add a fake submit button to be able to submit the form (translated from French)
    loginForm.appendChild(fauxBouton);
    pageEnCours = fauxBouton.click();
    // Original call, but I got trouble with it:
    // webClient.waitForBackgroundJavaScript(AttentePourJavascript.CINQ_SECONDES.getTempo());
    // so I use this instead; it waits for 5 seconds but can return earlier if nothing is running
    webClient.attendPourJavascriptSaufTimers(pageEnCours, AttentePourJavascript.CINQ_SECONDES.getTempo());
    print.save(NomsFichiersPagesSauvegardees.APRES_LOGGING.getUrl(), pageEnCours.asXml(), original);

What this method does:

    public int attendPourJavascriptSaufTimers(HtmlPage page, long tempo) {
        // The scripts are described in an enumeration
        String texteDuScript = ScriptAExecuter.ANNULE_LES_TIMERS.getScript();
        Object result = page.executeJavaScript(texteDuScript).getJavaScriptResult();
        int retour = this.waitForBackgroundJavaScript(tempo);
        return retour;
    }

The script executed (ANNULE_LES_TIMERS, "cancel the timers") is the following:

    limit = 10;
    var np, n = setInterval(function(){}, 100000);
    np = Math.max(0, n - limit);
    while (n > np) {
        clearInterval(n--);
    }

I wrote all this because I was running into problems like yours, not getting all the page content I should, so my advice is to follow my track a little, even if I don't remember all the details.

I think you can also check whether the website you are scraping sets any intervals by using the DevTools console of your browser. I remember going back and forth between DevTools and HtmlUnit; you really have to understand completely what is running on the site if you want to mimic it. |
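The arithmetic of the ANNULE_LES_TIMERS script above can be modelled in plain Java to see which timer IDs it actually clears. This is only a sketch: the class and method names are hypothetical, and it assumes that timer IDs are small increasing integers, which is what the script itself relies on but which the thread does not confirm for every page.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of which timer IDs the ANNULE_LES_TIMERS script passes to clearInterval. */
final class TimerIdRange {
    private TimerIdRange() {}

    /**
     * @param newestId the ID returned by the script's own setInterval call (n in the script)
     * @param limit    how many IDs to walk back from newestId (limit in the script)
     * @return the IDs that would be cleared, newest first
     */
    static List<Integer> idsToClear(int newestId, int limit) {
        // np = Math.max(0, n - limit)
        int np = Math.max(0, newestId - limit);
        List<Integer> cleared = new ArrayList<>();
        // while (n > np) { clearInterval(n--); }
        for (int n = newestId; n > np; n--) {
            cleared.add(n);
        }
        return cleared;
    }
}
```

Under that assumption, the script clears its own dummy interval plus at most `limit - 1` older timers. With `limit = 10`, a page that has already registered more timers than that would keep its oldest ones, so the limit may need to be raised for such pages.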
From: José R. | I. S.L. <jr...@id...> - 2017-07-13 15:56:25
|
I am new to HtmlUnit (I have read about it but am just starting hands-on). I have a web page with a form whose "next" button is a simple image input with neither a name nor an id:

    <input src="https://static.XXX.com/imagenes/siguiente.gif" alt="SIGUIENTE" type="image">

The form head is as follows:

    <form action="/lugar/" method="PUT" onsubmit="return validaFormulario()" id="frmStep2" name="frmStep2">

And the validation JavaScript is:

    var _localidadObligatoria = true;
    var _skipValidation = false;

    function validaFormulario() {
        if (_skipValidation == true) return true;
        if ((document.getElementById('p').value == "df") ||
            (_localidadObligatoria && document.getElementById("l").value == '')) {
            // "Fill in the required fields marked with an asterisk."
            alert('Rellene los campos necesarios marcados con asterisco.');
            return false;
        }
        return true;
    }

I have tried different proposals, such as this one, which looked promising:

HtmlUnit, how to post form without clicking submit button?
https://stackoverflow.com/questions/7573558/htmlunit-how-to-post-form-without-clicking-submit-button

But when I click on the newly created fake button I only go back to the same page.

Can someone explain the logic behind correctly submitting this form, from an HtmlUnit/HTML/JavaScript perspective?

Thanks in advance,

Jose |
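To reason about when the form's onsubmit handler blocks the submission, the condition in validaFormulario can be restated as a pure Java function. This is a sketch for illustration only: the class name is hypothetical, the parameters `p` and `l` stand for the current values of the fields with ids `p` and `l`, and the thread does not confirm that a failing validation is what sends the page back to itself.

```java
/** Hypothetical Java restatement of the page's validaFormulario() check. */
final class FormValidationSketch {
    private FormValidationSketch() {}

    /**
     * @param skipValidation       mirrors the page's _skipValidation flag
     * @param localidadObligatoria mirrors the page's _localidadObligatoria flag
     * @param p                    current value of the field with id "p"
     * @param l                    current value of the field with id "l"
     * @return true when onsubmit would let the form be submitted
     */
    static boolean validaFormulario(boolean skipValidation, boolean localidadObligatoria,
                                    String p, String l) {
        if (skipValidation) {
            return true;
        }
        // Blocked when "p" still holds the value "df", or when "l" is required and empty.
        if ("df".equals(p) || (localidadObligatoria && "".equals(l))) {
            return false;
        }
        return true;
    }
}
```

With the flags as declared on the page (`_localidadObligatoria = true`, `_skipValidation = false`), the handler returns false unless `p` differs from "df" and `l` is non-empty, so it may be worth checking the values of both fields just before the click.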
From: Xue-Feng Y. <xy...@no...> - 2017-07-13 15:36:46
|
I did more experiments on the issue. I added the following:

    webClient.getOptions().setUseInsecureSSL(true);
    webClient.getCookieManager().setCookiesEnabled(true);
    webClient.setAjaxController(new NicelyResynchronizingAjaxController());

    JavaScriptJobManager manager = htmlPage.getEnclosingWindow().getJobManager();
    int count = 0;
    while (manager.getJobCount() > 0) {
        System.out.println(count + "@" + manager.getJobCount());
        webClient.waitForBackgroundJavaScript(10000);
        count++;
    }

Then I went to sleep. It has been running for a few hours. The job count dropped from 20 to 3 and has stayed at 3.

Any thoughts?

Thanks |
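A loop that waits until the job count reaches zero never ends if some jobs are repeating timers, which would be consistent with the count staying at 3. One way to avoid running for hours is to cap the number of wait rounds. The sketch below is library-independent so that it can be run on its own; the class name, method name and parameters are hypothetical, and wiring it to HtmlUnit with `manager::getJobCount` and `webClient::waitForBackgroundJavaScript` is an assumption about how it would be used.

```java
import java.util.function.IntSupplier;
import java.util.function.LongConsumer;

/** Hypothetical helper: wait for background jobs, but give up after a fixed number of rounds. */
final class BoundedJobWait {
    private BoundedJobWait() {}

    /**
     * @param jobCount   supplies the current number of pending jobs
     * @param waitStep   blocks for up to the given number of milliseconds
     * @param stepMillis how long each wait round may take
     * @param maxRounds  maximum number of wait rounds before giving up
     * @return the number of jobs still pending when the loop stopped
     */
    static int waitForJobs(IntSupplier jobCount, LongConsumer waitStep,
                           long stepMillis, int maxRounds) {
        int remaining = jobCount.getAsInt();
        for (int round = 0; round < maxRounds && remaining > 0; round++) {
            waitStep.accept(stepMillis);
            remaining = jobCount.getAsInt();
        }
        return remaining;
    }
}
```

A non-zero return value then means "some jobs never finished", and the caller can decide to read the page anyway instead of blocking indefinitely.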
From: Xue-Feng Y. <no...@gm...> - 2017-07-13 03:37:13
|
Hi, I have used HtmlUnit to fetch some other web pages, and it works great.

However, when I tried https://weather.com/weather/monthly/l/27560:4:US , I got the page source without the JavaScript code having been run.

Here is a summary of my system:

OS: win 10
Java: jdk1.8.0_131
HtmlUnit: htmlunit-2.27-bin

eclipse-debug gives the result HtmlUnit got. The main code is as follows:

    webClient = new WebClient(BrowserVersion.FIREFOX_45);
    webClient.getOptions().setTimeout(600 * 1000);
    webClient.waitForBackgroundJavaScript(600 * 1000);
    webClient.getOptions().setRedirectEnabled(true);
    webClient.getOptions().setJavaScriptEnabled(true);
    webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
    webClient.getOptions().setThrowExceptionOnScriptError(false);
    webClient.getOptions().setCssEnabled(false);

    htmlPage = webClient.getPage(_url);
    page = htmlPage.asXml();

The returned page is a subset of view-source in Firefox. I found that the JavaScript inspector in Firefox has the full HTML tree.

My question is how to get the HTML tree of this page using HtmlUnit.

Thanks,

Xuefeng |
From: EricWong <ykw...@ya...> - 2017-07-11 13:47:50
|
Thanks for sharing your source code. The programming technique used is quite advanced and not easy to understand, but it's quite interesting. Thanks.

-- View this message in context: http://htmlunit.10904.n7.nabble.com/Failing-to-load-the-complete-html-content-of-a-page-with-ajax-tp42303p42312.html Sent from the HtmlUnit - General mailing list archive at Nabble.com. |
From: albu77 <alb...@gm...> - 2017-07-11 09:44:57
|
When I created my WebClient factory, I checked for that: I pass asynchrone as a parameter and it is set to true. So it's strange. One more thing to mention is that I set the browser version of the WebClient to BrowserVersion.FIREFOX_24.

And last but not least, I also have some code that I can call:

    webClient.attendPourJavascriptSaufTimers(pageAffichageLicence, AttentePourJavascript.BEAUCOUP.getTempo());
    webClient.waitForBackgroundJavaScript(AttentePourJavascript.DIX_SECONDES.getTempo());

These are two methods that let any background JavaScript execute for a given time; in some cases the wait is long, in others shorter. The first method kills any timer running on the page:

    public int attendPourJavascriptSaufTimers(HtmlPage page, long tempo) {
        String texteDuScript = ScriptAExecuter.ANNULE_LES_TIMERS.getScript();
        Object result = page.executeJavaScript(texteDuScript).getJavaScriptResult();
        int retour = this.waitForBackgroundJavaScript(tempo);
        return retour;
    }

    public enum ScriptAExecuter {
        // Creates a dummy interval to learn the newest timer id, then clears the last `limit` ids.
        ANNULE_LES_TIMERS(" limit= 10; \r\n var np, n= setInterval(function(){},100000); \r\n np= Math.max(0, n-limit);\r\n while(n> np){\r\n clearInterval(n--);\r\n }");

        final private String script;

        ScriptAExecuter(String script) {
            this.script = script;
        }

        public String getScript() {
            return script;
        }
    }

As I said, this was a long time ago, so I don't even remember the why and how of this code, but what I do know is that it is still in production and running well with HtmlUnit 2.14. I hope it helps.

-- View this message in context: http://htmlunit.10904.n7.nabble.com/Failing-to-load-the-complete-html-content-of-a-page-with-ajax-tp42303p42311.html Sent from the HtmlUnit - General mailing list archive at Nabble.com. |
From: EricWong <ykw...@ya...> - 2017-07-11 09:25:11
|
Thanks for your information. The problem is solved.

My program already included this line of code:

    webClient.setAjaxController(new NicelyResynchronizingAjaxController());

Following your advice, I focused on this line. I commented it out:

    //webClient.setAjaxController(new NicelyResynchronizingAjaxController());

and the complete html page can now be loaded successfully.

In your program, you decide whether to use it with:

    if(ajaxSynchrone) ...

Could you say a little about how to determine whether "ajaxSynchrone" should be true or false?

-- View this message in context: http://htmlunit.10904.n7.nabble.com/Failing-to-load-the-complete-html-content-of-a-page-with-ajax-tp42303p42310.html Sent from the HtmlUnit - General mailing list archive at Nabble.com. |
From: albu77 <alb...@gm...> - 2017-07-11 07:49:47
|
The chart-controls div is not showing either. I think you are not getting the page at the right time: there should be an ajax call with a DOM append operation when the ajax call succeeds. If I were you, I would dig in this direction. Perhaps have a look at this link:
<https://stackoverflow.com/questions/19551043/process-ajax-request-in-htmlunit>

It's a long time since I've used HtmlUnit, but looking in my sources I found that I used my own class:

    public class MyWebClient extends WebClient

...and also

    if(ajaxSynchrone){
        webClient.setAjaxController(new NicelyResynchronizingAjaxController());
    }

There is nothing more I can tell you; it was too long ago and I don't have any way of building a solution now. Good luck...

-- View this message in context: http://htmlunit.10904.n7.nabble.com/Failing-to-load-the-complete-html-content-of-a-page-with-ajax-tp42303p42309.html Sent from the HtmlUnit - General mailing list archive at Nabble.com. |
From: EricWong <ykw...@ya...> - 2017-07-11 06:28:58
|
I have now uploaded a text file with the result of HtmlPage.asXml(): AsXmlResult.txt <http://htmlunit.10904.n7.nabble.com/file/n42308/AsXmlResult.txt>

As it shows, the image tag described above and shown in the screen capture is not found.

-- View this message in context: http://htmlunit.10904.n7.nabble.com/Failing-to-load-the-complete-html-content-of-a-page-with-ajax-tp42303p42308.html Sent from the HtmlUnit - General mailing list archive at Nabble.com. |
From: albu77 <alb...@gm...> - 2017-07-11 05:05:33
|
And can you show what you get from your Page save? -- View this message in context: http://htmlunit.10904.n7.nabble.com/Failing-to-load-the-complete-html-content-of-a-page-with-ajax-tp42303p42307.html Sent from the HtmlUnit - General mailing list archive at Nabble.com. |
From: EricWong <ykw...@ya...> - 2017-07-11 01:36:33
|
Thanks for your reply. I have tried "webClient.getOptions().setDownloadImages(true);" but it does not work. I am using the latest version 2.27. I do not need to download any image from the page by Htmlunit. I just want to get the complete html code result just as that shown on the F12 panel of Chrome. -- View this message in context: http://htmlunit.10904.n7.nabble.com/Failing-to-load-the-complete-html-content-of-a-page-with-ajax-tp42303p42306.html Sent from the HtmlUnit - General mailing list archive at Nabble.com. |
From: albu77 <alb...@gm...> - 2017-07-10 15:03:31
|
I used HtmlUnit in the past and it was not loading the images. But you can have a look at this link anyway, because it now depends on the version you are using:
<https://stackoverflow.com/questions/3425697/does-htmlunit-load-images-when-it-browses-page>

-- View this message in context: http://htmlunit.10904.n7.nabble.com/Failing-to-load-the-complete-html-content-of-a-page-with-ajax-tp42303p42304.html Sent from the HtmlUnit - General mailing list archive at Nabble.com. |
From: EricWong <ykw...@ya...> - 2017-07-10 14:28:43
|
<http://htmlunit.10904.n7.nabble.com/file/n42303/ECMWF.png>

Hello. I am trying to load a page with HtmlUnit:
https://www.ecmwf.int/en/forecasts/charts/catalogue/medium-mslp-wind850?time=2017070900,0,2017070900&projection=classical_europe
(It is a page of a major meteorological agency in Europe.)

I try to get the complete html content with "HtmlPage.asXml()". However, the image tag:

    img class="chart-image" id="map_1_image" src="..."

is not loaded, even after waiting for a period of time.

The whole page loads successfully in both Chrome and Firefox. The attached screenshot shows the page loaded by Chrome with the F12 panel open. This page does not require any user click: just type the URL and wait for the ajax to load.

(If necessary, please substitute the YYYYMMDD component of the URL with the previous day's date when testing. E.g. if today is 15 Jul 2017, please use 20170714.)

May I know how the page can be loaded completely by HtmlUnit? Thanks.

-- View this message in context: http://htmlunit.10904.n7.nabble.com/Failing-to-load-the-complete-html-content-of-a-page-with-ajax-tp42303.html Sent from the HtmlUnit - General mailing list archive at Nabble.com. |
From: Ronald B. <rb...@rb...> - 2017-06-30 06:19:23
|
>> I'm using HtmlUnit 2.23. With 2.27 and the 2.28 SNAPSHOT I get an error
>> due to an illegal character inside the javascript pages.

Have you tried it with the latest TeamCity build? The illegal character should be fixed.

On Tue, 27 Jun 2017 20:15:11 +0000 Guillaume Lepinay wrote:
>
>Hello, any more ideas now that you have the example?
>If you need more information, feel free to ask me :)
>
>On Mon, Jun 26, 2017 at 12:11, Guillaume Lepinay <gu...@gm...> wrote:
>
>> Hello,
>>
>> I also tried this:
>>
>> HtmlAnchor link = mapDigits.get(chiffre);
>> link.click();
>> webClient.waitForBackgroundJavaScript(2000);
>> webClient.waitForBackgroundJavaScriptStartingBefore(2000);
>>
>> but I still have the same problem.
>>
>> Note that there is NO ajax call in the script. It just changes the
>> values of 2 hidden fields of a form.
>> But after the call, even after waiting, the values have not changed.
>>
>> On Mon, Jun 26, 2017 at 11:16, Rural Hunter <rur...@gm...> wrote:
>>
>>> Hi Guillaume,
>>>
>>> You'd better wait for some time after the click using WebClient.waitXXXXXX,
>>> especially for scripts with ajax calls.
>>>
>>> On 2017/6/26 16:59, Guillaume Lepinay wrote:
>>>
>>> Hello everybody,
>>>
>>> I have a link like this:
>>>
>>> <a tabindex="2" href="javascript:raf()">
>>> 0
>>> </a>
>>>
>>> I can find it with HtmlUnit and I use the click() method to execute the
>>> javascript (method raf() in this example).
>>> But nothing happens.
>>>
>>> The script doesn't run any ajax request. It only changes the value of
>>> another field.
>>> But after clicking, when I get the value of the input that should have
>>> changed, it is still the same.
>>>
>>> Is there something special needed to click this kind of link?
>>>
>>> Thank you for your help.
>>>
>>> I'm using HtmlUnit 2.23. With 2.27 and the 2.28 SNAPSHOT I get an error
>>> due to an illegal character inside the javascript pages.
>>>
>>> ------------------------------------------------------------------------------
>>> Check out the vibrant tech community on one of the world's most
>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>> _______________________________________________
>>> Htmlunit-user mailing list
>>> Htm...@li...
>>> https://lists.sourceforge.net/lists/listinfo/htmlunit-user
>>>
>> --
>> Regards,
>> Guillaume Lepinay
>> www.clangen.com
>> 09 52 95 97 99
>> |
From: Rural H. <rur...@gm...> - 2017-06-28 01:42:06
|
Hi Ronald,

Thanks a lot. That's enough for me. :)

On 2017/6/28 0:31, Ronald Brill wrote:
> Ok, you are right - at least partly ;-)
>
> I have updated the documentation -> please check if this is sufficient for you.
>
> This method is there for js support, so I have added one more test. It looks like the browsers can handle both ':before' and '::before'. This is now also fixed; we are in sync with the browsers now.
>
> Thanks for your report
>
> RBRi
>
> On Tue, 27 Jun 2017 09:43:55 +0800 Rural Hunter wrote:
>> Thanks Ronald. I already got it working like this:
>> HTMLElement element=(HTMLElement)he.getScriptableObject();
>> ComputedCSSStyleDeclaration style=jscript.getComputedStyle(element, ":before");
>> The important thing is that the pseudo parameter of the getComputedStyle method has to be ":before", not "before" or "::before". I think this should be documented.
>>
>> On 2017/6/27 0:01, Ronald Brill wrote:
>>> Hi Rural,
>>>
>>> not sure what you are looking for. Maybe ask the element for the computed css style and
>>> then check for the attribute....
>>>
>>> On Tue, 20 Jun 2017 15:34:39 +0800 Rural Hunter wrote:
>>>> Hi,
>>>>
>>>> Does htmlunit support the CSS pseudo elements ::before and ::after? If yes, how do I get them
>>>> on the html element?
>>>>
>> > |
From: Guillaume L. <gu...@gm...> - 2017-06-27 20:15:29
|
Hello, any more ideas now that you have the example?
If you need more information, feel free to ask me :)

On Mon, Jun 26, 2017 at 12:11, Guillaume Lepinay <gu...@gm...> wrote:

> Hello,
>
> I also tried this:
>
> HtmlAnchor link = mapDigits.get(chiffre);
> link.click();
> webClient.waitForBackgroundJavaScript(2000);
> webClient.waitForBackgroundJavaScriptStartingBefore(2000);
>
> but I still have the same problem.
>
> Note that there is NO ajax call in the script. It just changes the
> values of 2 hidden fields of a form.
> But after the call, even after waiting, the values have not changed.
>
> On Mon, Jun 26, 2017 at 11:16, Rural Hunter <rur...@gm...> wrote:
>
>> Hi Guillaume,
>>
>> You'd better wait for some time after the click using WebClient.waitXXXXXX,
>> especially for scripts with ajax calls.
>>
>> On 2017/6/26 16:59, Guillaume Lepinay wrote:
>>
>> Hello everybody,
>>
>> I have a link like this:
>>
>> <a tabindex="2" href="javascript:raf()">
>> 0
>> </a>
>>
>> I can find it with HtmlUnit and I use the click() method to execute the
>> javascript (method raf() in this example).
>> But nothing happens.
>>
>> The script doesn't run any ajax request. It only changes the value of
>> another field.
>> But after clicking, when I get the value of the input that should have
>> changed, it is still the same.
>>
>> Is there something special needed to click this kind of link?
>>
>> Thank you for your help.
>>
>> I'm using HtmlUnit 2.23. With 2.27 and the 2.28 SNAPSHOT I get an error
>> due to an illegal character inside the javascript pages.
>>
> --
> Regards,
> Guillaume Lepinay
> www.clangen.com
> 09 52 95 97 99
>

--
Regards,
Guillaume Lepinay
www.clangen.com
09 52 95 97 99 |
From: Ronald B. <rb...@rb...> - 2017-06-27 16:32:16
|
Ok, you are right - at least partly ;-)

I have updated the documentation -> please check if this is sufficient for you.

This method is there for js support, so I have added one more test. It looks like the browsers can handle both ':before' and '::before'. This is now also fixed; we are in sync with the browsers now.

Thanks for your report

RBRi

On Tue, 27 Jun 2017 09:43:55 +0800 Rural Hunter wrote:
>
>Thanks Ronald. I already got it working like this:
> HTMLElement element=(HTMLElement)he.getScriptableObject();
> ComputedCSSStyleDeclaration style=jscript.getComputedStyle(element, ":before");
>The important thing is that the pseudo parameter of the getComputedStyle method has to be ":before", not "before" or "::before". I think this should be documented.
>
>On 2017/6/27 0:01, Ronald Brill wrote:
>>Hi Rural,
>>
>>not sure what you are looking for. Maybe ask the element for the computed css style and
>>then check for the attribute....
>>
>>On Tue, 20 Jun 2017 15:34:39 +0800 Rural Hunter wrote:
>>>Hi,
>>>
>>>Does htmlunit support the CSS pseudo elements ::before and ::after? If yes, how do I get them
>>>on the html element?
>>>
>>
>
> |
From: Rural H. <rur...@gm...> - 2017-06-27 01:44:08
|
Thanks Ronald. I already got it working like this:

    HTMLElement element=(HTMLElement)he.getScriptableObject();
    ComputedCSSStyleDeclaration style=jscript.getComputedStyle(element, ":before");

The important thing is that the pseudo parameter of the getComputedStyle method has to be ":before", not "before" or "::before". I think this should be documented.

On 2017/6/27 0:01, Ronald Brill wrote:
> Hi Rural,
>
> not sure what you are looking for. Maybe ask the element for the computed css style and
> then check for the attribute....
>
> On Tue, 20 Jun 2017 15:34:39 +0800 Rural Hunter wrote:
>> Hi,
>>
>> Does htmlunit support the CSS pseudo elements ::before and ::after? If yes, how do I get them
>> on the html element?
>>
> |
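According to the message above, the build Rural Hunter was using accepted only the single-colon form ":before" for the pseudo parameter. For code that has to run against such a build, one option is to normalize whatever form the caller supplies before passing it on. The helper below is a minimal sketch in plain Java; `PseudoElementName` is a hypothetical name for illustration and is not part of HtmlUnit.

```java
import java.util.Locale;

/** Hypothetical helper: normalizes a CSS pseudo-element name to the single-colon form. */
final class PseudoElementName {

    /** Turns "before", ":before" or "::before" into ":before". */
    static String toSingleColon(String pseudo) {
        if (pseudo == null) {
            throw new IllegalArgumentException("pseudo must not be null");
        }
        String name = pseudo.trim();
        // Skip any leading colons so that all three spellings reduce to the bare name.
        int start = 0;
        while (start < name.length() && name.charAt(start) == ':') {
            start++;
        }
        name = name.substring(start);
        if (name.isEmpty()) {
            throw new IllegalArgumentException("empty pseudo-element name");
        }
        return ":" + name.toLowerCase(Locale.ROOT);
    }
}
```

The result of `PseudoElementName.toSingleColon("::before")` could then be passed as the second argument of the `getComputedStyle` call shown in the message.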
From: minersail <jkw...@gm...> - 2017-06-26 22:50:14
|
I have created a dummy account on MarineTraffic to use.

Method 1: Manipulating the form and clicking the button:

    String username = "htm...@gm...";
    String password = "password1";

    WebClient webClient = new WebClient(BrowserVersion.CHROME);
    webClient.getOptions().setJavaScriptEnabled(true);
    webClient.getOptions().setThrowExceptionOnScriptError(false);
    webClient.getOptions().setCssEnabled(false);
    webClient.setAjaxController(new NicelyResynchronizingAjaxController());

    HtmlPage webPage = (HtmlPage)webClient.getPage("https://www.marinetraffic.com/");
    HtmlForm loginForm = (HtmlForm)webPage.getElementById("login_form_REACT");
    loginForm.getInputByName("data[email]").setValueAttribute(username);
    loginForm.getInputByName("data[password]").setValueAttribute(password);
    HtmlPage webPage2 = (HtmlPage)((HtmlButton)loginForm.getFirstByXPath("//button[@type='submit']")).click();
    webClient.waitForBackgroundJavaScript(20 * 1000);
    System.out.println(webPage2.asXml());

----------------------------------------------------

Method 2: Logging in using a POST WebRequest:

    WebClient webClient = new WebClient(BrowserVersion.CHROME);
    webClient.getOptions().setJavaScriptEnabled(true);
    webClient.getOptions().setThrowExceptionOnScriptError(false);
    webClient.getOptions().setCssEnabled(false);
    webClient.setAjaxController(new NicelyResynchronizingAjaxController());

    HtmlPage webPage = (HtmlPage)webClient.getPage("https://www.marinetraffic.com/");
    URL cookieURL = new URL("https://www.marinetraffic.com/");
    String cookies = webClient.getCookies(cookieURL).toString();

    URL loginurl = new URL("https://www.marinetraffic.com/en/users/ajax_login");
    WebRequest requestSettings = new WebRequest(loginurl, HttpMethod.POST);
    requestSettings.setAdditionalHeader(":authority", "www.marinetraffic.com");
    requestSettings.setAdditionalHeader(":method", "POST");
    requestSettings.setAdditionalHeader(":path", "/en/users/ajax_login");
    requestSettings.setAdditionalHeader(":scheme", "https");
    requestSettings.setAdditionalHeader("accept", "*/*");
    requestSettings.setAdditionalHeader("accept-encoding", "gzip,deflate,sdch");
    requestSettings.setAdditionalHeader("accept-language", "en-US,en;q=0.8");
    requestSettings.setAdditionalHeader("content-type", "application/x-www-form-urlencoded; charset=UTF-8");
    requestSettings.setAdditionalHeader("cookie", cookies);
    requestSettings.setAdditionalHeader("origin", "https://www.marinetraffic.com");
    requestSettings.setAdditionalHeader("referer", "https://www.marinetraffic.com/en/ais/home/centerx:-33.1/centery:21.4/zoom:4");
    requestSettings.setAdditionalHeader("user-agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36");
    requestSettings.setAdditionalHeader("x-requested-with", "XMLHttpRequest");
    requestSettings.setRequestBody("_method=POST&email=htmlunit888%40gmail.com&password=password1&is_ajax=true");

    Page redirectPage = webClient.getPage(requestSettings);
    webClient.waitForBackgroundJavaScript(10 * 1000);
    System.out.println(redirectPage.getWebResponse());

------------------------------------------------------------------------

With Method 1, I have not gotten anything to work beyond verifying that the inputs were filled in. With Method 2, I have been able to log in, but have not been able to navigate the website afterwards.

-- View this message in context: http://htmlunit.10904.n7.nabble.com/Navigate-website-after-logging-in-using-WebRequest-tp42263p42271.html Sent from the HtmlUnit - General mailing list archive at Nabble.com. |
From: Ronald B. <rb...@rb...> - 2017-06-26 16:01:43
|
Hi Rural,

not sure what you are looking for. Maybe ask the element for the computed css style and then check for the attribute....

On Tue, 20 Jun 2017 15:34:39 +0800 Rural Hunter wrote:
>
>Hi,
>
>Does htmlunit support the CSS pseudo elements ::before and ::after? If yes, how do I get them on the html element?
>
> |
From: Ronald B. <rb...@rb...> - 2017-06-26 15:56:35
|
Please provide a complete sample (code and maybe account data, by private mail). We will try to reproduce it.

RBRi

On Sun, 25 Jun 2017 15:44:06 -0700 (MST) minersail wrote:
>
>I have tried to do the typical login with HtmlUnit by filling in the inputs and
>clicking the submit button. However, that didn't work, even after giving the
>webpage 20 seconds to load.
>
>So instead, I've tried to log in using a POST WebRequest, which has worked.
>The cookies were successfully updated, and the site itself in Chrome told me that
>I was logged in elsewhere. The page returned by the POST was not a valid
>html page though, so I could not navigate directly from there.
>
>When I now use the webclient to navigate through the website, it does not
>register me as logged in. I've tried both navigating to the website
>regularly with client.getPage() and a GET request with the updated
>cookies attached.
>
>How would I get the logged-in state from the POST request to carry over
>when navigating through the rest of the website?
>
>I have omitted code for brevity but can post any relevant code if necessary.
>
>--
>View this message in context: http://htmlunit.10904.n7.nabble.com/Navigate-website-after-logging-in-using-WebRequest-tp42263.html
>Sent from the HtmlUnit - General mailing list archive at Nabble.com.
> |
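The hand-built POST in minersail's "Method 2" in this thread copies headers from the browser's network panel and writes the form body as a literal string. Two details can be handled mechanically, sketched below in plain Java. First, header names beginning with a colon (":authority", ":method", ":path", ":scheme") are HTTP/2 pseudo-header fields rather than ordinary request headers, so filtering them out before calling `setAdditionalHeader` seems reasonable; whether they contribute to the login problem is not established in this thread. Second, the body can be built from name/value pairs so that characters such as "@" are percent-encoded consistently. `LoginRequestParts` is a hypothetical helper, not an HtmlUnit class.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for assembling the pieces of a form POST by hand. */
final class LoginRequestParts {

    /** Encodes name/value pairs as an application/x-www-form-urlencoded body. */
    static String formBody(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        try {
            for (Map.Entry<String, String> field : fields.entrySet()) {
                if (body.length() > 0) {
                    body.append('&');
                }
                body.append(URLEncoder.encode(field.getKey(), "UTF-8"))
                    .append('=')
                    .append(URLEncoder.encode(field.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
        return body.toString();
    }

    /** Returns a copy of the headers without HTTP/2 pseudo-headers (names starting with ':'). */
    static Map<String, String> withoutPseudoHeaders(Map<String, String> headers) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> header : headers.entrySet()) {
            if (!header.getKey().startsWith(":")) {
                result.put(header.getKey(), header.getValue());
            }
        }
        return result;
    }
}
```

With the four fields from the posted request (`_method`, `email`, `password`, `is_ajax`), `formBody` reproduces the body string passed to `setRequestBody` in that message, and the remaining headers can be applied in a loop over the filtered map.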
From: Guillaume L. <gu...@gm...> - 2017-06-26 10:12:15
|
Hello,

I also tried this:

    HtmlAnchor link = mapDigits.get(chiffre);
    link.click();
    webClient.waitForBackgroundJavaScript(2000);
    webClient.waitForBackgroundJavaScriptStartingBefore(2000);

but I still have the same problem.

Note that there is NO ajax call in the script. It just changes the values of 2 hidden fields of a form. But after the call, even after waiting, the values have not changed.

On Mon, Jun 26, 2017 at 11:16, Rural Hunter <rur...@gm...> wrote:

> Hi Guillaume,
>
> You'd better wait for some time after the click using WebClient.waitXXXXXX,
> especially for scripts with ajax calls.
>
> On 2017/6/26 16:59, Guillaume Lepinay wrote:
>
> Hello everybody,
>
> I have a link like this:
>
> <a tabindex="2" href="javascript:raf()">
> 0
> </a>
>
> I can find it with HtmlUnit and I use the click() method to execute the
> javascript (method raf() in this example).
> But nothing happens.
>
> The script doesn't run any ajax request. It only changes the value of
> another field.
> But after clicking, when I get the value of the input that should have
> changed, it is still the same.
>
> Is there something special needed to click this kind of link?
>
> Thank you for your help.
>
> I'm using HtmlUnit 2.23. With 2.27 and the 2.28 SNAPSHOT I get an error
> due to an illegal character inside the javascript pages.
>

--
Regards,
Guillaume Lepinay
www.clangen.com
09 52 95 97 99 |