From: Stephen P. <st...@lo...> - 2016-03-08 19:19:37
|
AT LONG LAST... This works:

public static void main(String[] args) throws Exception{
    try (final WebClient webClient = new WebClient(BrowserVersion.CHROME)) {
        java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(Level.OFF);
        System.setProperty("org.apache.commons.logging.Log", "org.apache.commons.logging.impl.NoOpLog");

        String url = "http://www.bhphotovideo.com/c/search?Ntt=kas-";
        HtmlPage page = webClient.getPage(url);
        List<DomNode> nodeProduct = (List<DomNode>) page.getByXPath("//*[@data-selenium='itemDetail']");

        if (nodeProduct.size() > 0) {
            for (DomNode e : nodeProduct) {
                List<DomNode> b = (List<DomNode>) e.getByXPath(".//*[@itemprop='brand']");
                List<DomNode> n = (List<DomNode>) e.getByXPath(".//*[@itemprop='name']");
                List<DomNode> p = (List<DomNode>) e.getByXPath(".//*[@data-selenium='price']");
                System.out.println(b.get(0).getTextContent() + " / " + n.get(0).getTextContent() + " / " + p.get(0).getTextContent().trim());
            }
        }
    }
}

OUTPUT:

run:
Kino Flo / KAS-D2-C Diva-Lite 200 Travel Case / $218.63
Canon / CN7x17 KAS S Cine-Servo 17-120mm T2.95 (PL Mount) / $29,850.00
Canon / CN7x17 KAS S Cine-Servo 17-120mm T2.95 (EF Mount) / $29,850.00
Kino Flo / KAS-CL6 6-Lamp Carry Case / $45.38
Kino Flo / KAS-D42 Diva-Lite 400 Wheeled Flight Case - for Two each Kino-Flo Diva-Lite 400 Fixtures, Stands, Mounts, Flozier and Lamp Cases / $483.00
Kino Flo / KAS-CE2 Flight Case (Yellow) / $371.25
Kino Flo / KAS-CE2-C Clamshell Travel Case (Yellow) / $288.75
Kino Flo / KAS-D4-C Diva-Lite 400 Travel Case - for Kino Flo Diva-Lite 400 Lighting Kit / $282.00
Kino Flo / KAS-GAF2 Gaffer Kit Ship Case / $536.25
Kino Flo / KAS-V31-Y Yoke Shipping Case - for VistaBeam 300 Fluorescent Fixture with Yoke Mount / $649.95
Kino Flo / KAS-V61-Y Yoke Shipping Case - for VistaBeam 600 Fluorescent Fixture with Yoke Mount / $881.95
Kino Flo / KAS-24S Telescoping Shipping Case, Small - for up to three Kino-Flo 2.0' Fixtures / $136.13
Kino Flo / KAS-41 Telescoping Shipping Case - for 1 Kino-Flo 4' Bank System / $212.50
Kino Flo / KAS-CE2-Y Yoke Ship Case (Black) / $433.13
Kino Flo / KAS-D22 Diva-Lite 200 Flight Case - for Two each Kino-Flo Diva-Lite 200 Fixtures, Stands, Mounts, Flozier and Lamp Cases / $441.95
Kino Flo / KAS-B4-C Clamshell Travel Case for One BarFly 400D Kit (Black) / $325.88
Kino Flo / KAS-B41 BarFly 400 Ship Case / $488.75
Kino Flo / KAS-D4-CS Diva-Lite 401 Travel Case / $325.88
Kino Flo / KAS-INT2 Interview Ship Case / $474.38
Kino Flo / KAS-INT3 Interview Ship Case / $489.88
Kino Flo / KAS-V31 Center Shipping Case - for VistaBeam 300 Fluorescent Fixture with Center Mount / $629.95
Kino Flo / KAS-V61 Center Shipping Case - for VistaBeam 600 Fluorescent Fixture with Center Mount / $827.50
Kino Flo / KAS-V62 Center Shipping Case - for Two VistaBeam 600 Fluorescent Fixtures with Center Mount / $1,133.95
Kino Flo / KAS-VH2 Vista Single Louver Carry Case / $45.95
BUILD SUCCESSFUL (total time: 5 seconds)

Ahmed, I question why the implementation of getByXPath is demanding I use the wildcards for the searches instead of being able to specify the elements. But, for now, as long as it works, I'm going to run with it.

Thank you all for the input and suggestions.

~ Steve

-----
Stephen M. Paulsen
Lowing Light & Grip |
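The working code above calls get(0) on each result list, which would throw an IndexOutOfBoundsException if an item lacked a brand, name, or price. A minimal sketch of a more defensive variant follows. It assumes the JDK's built-in javax.xml.xpath API rather than HtmlUnit so that it runs standalone; the class name ItemSummarySketch, the helper firstText, the "n/a" fallback, and the two-item sample markup are all made up for illustration and are not part of the thread's code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ItemSummarySketch {

    // Hypothetical sample: the second item deliberately has no price element.
    static final String XML =
        "<root>"
      + "<div data-selenium='itemDetail'>"
      +   "<span itemprop='brand'> Kino Flo </span>"
      +   "<span itemprop='name'> KAS-CL6 6-Lamp Carry Case </span>"
      +   "<span data-selenium='price'> $45.38 </span>"
      + "</div>"
      + "<div data-selenium='itemDetail'>"
      +   "<span itemprop='brand'> Canon </span>"
      +   "<span itemprop='name'> Sample Lens </span>"
      + "</div>"
      + "</root>";

    /** Returns the trimmed text of the first match under context, or "n/a" if nothing matches. */
    static String firstText(XPath xpath, Node context, String expr) throws XPathExpressionException {
        Node hit = (Node) xpath.evaluate(expr, context, XPathConstants.NODE);
        return hit == null ? "n/a" : hit.getTextContent().trim();
    }

    /** Builds one "brand / name / price" line per itemDetail element. */
    static List<String> summarize() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList items = (NodeList) xpath.evaluate(
                "//*[@data-selenium='itemDetail']", doc, XPathConstants.NODESET);

        List<String> lines = new ArrayList<>();
        for (int i = 0; i < items.getLength(); i++) {
            Node item = items.item(i);
            // The leading "." keeps each lookup inside the current item.
            String brand = firstText(xpath, item, ".//*[@itemprop='brand']");
            String name = firstText(xpath, item, ".//*[@itemprop='name']");
            String price = firstText(xpath, item, ".//*[@data-selenium='price']");
            lines.add(brand + " / " + name + " / " + price);
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        summarize().forEach(System.out::println);
    }
}
```

In this namespace-free sample, `.//span[@itemprop='brand']` would match just as well as the wildcard form. Why the live page needed `*` under HtmlUnit 2.20 is not established anywhere in this thread.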
From: Stephen P. <st...@lo...> - 2016-03-07 21:32:02
|
Thank you for all your help. I'm still having trouble. Two things are on my mind:

Thing 1 - The reason I grabbed itemDetail is because I am looking for three pieces of information that are all enclosed in that tree:

<div data-selenium="itemDetail">
  <div data-selenium="img-zone"> ... Stuff I don't care about ... </div>
  <div data-selenium="itemInfo-zone">
    <div data-selenium="itemHeader">
      <h3 data-selenium="itemHeading">
        <a href=""blah blah blah>
          <span itemprop="brand"> I WANT THIS BRAND </span>
          <span itemprop="name"> I WANT THIS NAME </span>
        </a>
      </h3>
      <p> ... More stuff I don't care about ... </p>
      <p> ... Even More Stuff I Don't Care About ... </p>
    </div> <!-- itemHeader -->
    <div data-selenium="highlights"> ... Yet Even More Stuff I Don't Care About ... </div>
    <div data-selenium="itemSection"> ... Whatever ... </div>
  </div> <!-- itemInfo-zone -->
  <div data-selenium="conversion-zone">
    <div data-selenium="price-zone">
      <div data-selenium="prices"> ... Don't care ...</div>
      <div data-selenium="addToCartPrice">
        <p data-selenium="finalPrice">
          <span data-selenium="youpayPrice"> ... Yawn ... </span>
          <span data-selenium="price"> THIS IS THE PRICE I WANT </span>
        </p> <!-- finalPrice -->
      </div> <!-- addToCartPrice -->
    </div> <!-- price-zone -->
  </div> <!-- conversion-zone -->

NOTE: For purposes of brevity, I have removed the additional attributes of the elements. Since I am climbing this learning curve, tell me if that is interfering with my use of XPath. You can see from the code that I am asking XPath to search for only those attributes which I think are the way to the information I want. Tell me if I should be using a different syntax, or even a different algorithm for this.

Thing 2 - I applied the changes Ahmed suggested from the last reply.
While I can get all 24 sections of itemHeading on the page, when I use the syntax he specified to get my brand information, I get an empty list: SOURCE CODE: public static void main(String[] args) throws Exception{ try (final WebClient webClient = new WebClient(BrowserVersion.CHROME)) { java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(Level.OFF); System.setProperty("org.apache.commons.logging.Log", "org.apache.commons.logging.impl.NoOpLog"); // String url = "http://localhost:8888/vendor.html"; String url = "http://www.bhphotovideo.com/c/search?Ntt=kas-"; HtmlPage page = webClient.getPage(url); // List<DomNode> nodeProduct = (List<DomNode>) page.getByXPath("//*[@data-selenium='itemDetail']"); List<DomNode> nodeProduct = (List<DomNode>) page.getByXPath("//*[@data-selenium='itemHeading']"); if (nodeProduct.size() > 0) { for (DomNode e : nodeProduct) { System.out.println("***** Contents of e"); System.out.println(e.asXml()); System.out.println("***** End Contents of e"); List<DomNode> b = (List<DomNode>) page.getByXPath("//span[@itemprop='brand']"); System.out.println("--- Contents of b"); System.out.println(b); System.out.println("--- End Contents of b"); } } } } OUTPUT: run: ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/258616-REG/Kino_Flo_KAS_D2_C_KAS_D2C_Diva_Lite_200_Travel.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-D2-C Diva-Lite 200 Travel Case </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/1043631-REG/canon_9785b002_cn7x17_kas_s_cine_servo.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Canon </span> <span itemprop="name"> CN7x17 KAS S Cine-Servo 17-120mm T2.95 (PL Mount) </span> 
</a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/1043629-REG/canon_9785b001_cn7x17_kas_s_cine_servo.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Canon </span> <span itemprop="name"> CN7x17 KAS S Cine-Servo 17-120mm T2.95 (EF Mount) </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/258606-REG/Kino_Flo_KAS_CL6_Compact_Carry_Case_for.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-CL6 6-Lamp Carry Case </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/505449-REG/Kino_Flo_KAS_D42_KAS_D42_Diva_Lite_400_Flight.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-D42 Diva-Lite 400 Wheeled Flight Case - for Two each Kino-Flo Diva-Lite 400 Fixtures, Stands, Mounts, Flozier and Lamp Cases </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/884961-REG/kino_flo_kas_ce2_c_clamshell_travel_case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-CE2-C Clamshell Travel Case (Yellow) </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a 
href="http://www.bhphotovideo.com/c/product/429918-REG/Kino_Flo_KAS_24S_KAS_24S_Small_Telescoping_Shipping.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-24S Telescoping Shipping Case, Small - for up to three Kino-Flo 2.0' Fixtures </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/258673-REG/Kino_Flo_KAS_41_KAS_41_Shipping_Case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-41 Telescoping Shipping Case - for 1 Kino-Flo 4' Bank System </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/258605-REG/Kino_Flo_KAS_D4_C_KAS_D4_C_Diva_Lite_400_Travel.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-D4-C Diva-Lite 400 Travel Case - for Kino Flo Diva-Lite 400 Lighting Kit </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/656771-REG/Kino_Flo_KAS_GAF2_KAS_GAF2_Gaffer_Kit_Ship.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-GAF2 Gaffer Kit Ship Case </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/884959-REG/kino_flo_kas_ce2_flight_case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> 
Kino Flo </span> <span itemprop="name"> KAS-CE2 Flight Case (Yellow) </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/672726-REG/Kino_Flo_KAS_VH2_KAS_VH2_Vista_Single_Louver.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-VH2 Vista Single Louver Carry Case </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/507360-REG/Kino_Flo_KAS_V31_Y_KAS_V31_Y_Yoke_Shipping_Case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-V31-Y Yoke Shipping Case - for VistaBeam 300 Fluorescent Fixture with Yoke Mount </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/507361-REG/Kino_Flo_KAS_V61_Y_KAS_V61_Y_Yoke_Shipping_Case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-V61-Y Yoke Shipping Case - for VistaBeam 600 Fluorescent Fixture with Yoke Mount </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/884962-REG/kino_flo_kas_ce2_y_yoke_ship_case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-CE2-Y Yoke Ship Case (Black) </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 
data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/507295-REG/Kino_Flo_KAS_D22_KAS_D22_Diva_Lite_200_Flight.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-D22 Diva-Lite 200 Flight Case - for Two each Kino-Flo Diva-Lite 200 Fixtures, Stands, Mounts, Flozier and Lamp Cases </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/840408-REG/Kino_Flo_KAS_B4_C_Clamshell_Travel_Case_for.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-B4-C Clamshell Travel Case for One BarFly 400D Kit (Black) </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/580907-REG/Kino_Flo_KAS_B41_KAS_B41_BarFly_400_Ship.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-B41 BarFly 400 Ship Case </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/858258-REG/Kino_Flo_KAS_D4_CS_KAS_D4_CS_Diva_Lite_401_Travel.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-D4-CS Diva-Lite 401 Travel Case </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/656769-REG/Kino_Flo_KAS_INT2_KAS_INT2_Interview_Ship_Case.html" class="c5" 
data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-INT2 Interview Ship Case </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/656770-REG/Kino_Flo_KAS_INT3_KAS_INT3_Interview_Ship_Case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-INT3 Interview Ship Case </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/434699-REG/Kino_Flo_KAS_V31_KAS_V31_Center_Shipping_Case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-V31 Center Shipping Case - for VistaBeam 300 Fluorescent Fixture with Center Mount </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/434700-REG/Kino_Flo_KAS_V61_KAS_V61_Center_Shipping_Case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-V61 Center Shipping Case - for VistaBeam 600 Fluorescent Fixture with Center Mount </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b ***** Contents of e <h3 data-selenium="itemHeading" class="bold fourteen"> <a href="http://www.bhphotovideo.com/c/product/434701-REG/Kino_Flo_KAS_V62_KAS_V62_Center_Shipping_Case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url"> <span itemprop="brand"> Kino Flo </span> <span itemprop="name"> KAS-V62 Center Shipping Case - for Two VistaBeam 600 Fluorescent Fixtures with 
Center Mount </span> </a> </h3> ***** End Contents of e --- Contents of b [] --- End Contents of b BUILD SUCCESSFUL (total time: 6 seconds)

You can see why I am all confused. I have confirmed in my settings that I am using version 2.20.

Thank you for all your help.

~ Steve

-----
Stephen M. Paulsen
Lowing Light & Grip |
From: Ahmed A. <asa...@ya...> - 2016-03-04 21:25:11
|
Hi Stephen,

'brand' is a descendant of 'itemHeading', not of 'itemDetail'.

The below works with the latest version (with a workaround for the failing JavaScript; a bug report should be created for this).

public static void main(String[] args) throws Exception {
    try (final WebClient webClient = new WebClient(BrowserVersion.CHROME)) {

        String url = "http://www.bhphotovideo.com/c/search?Ntt=kas-";
        // Yes, this is a live site. Be nice.
        HtmlPage page = webClient.getPage(url);
        List<DomNode> nodeProduct = (List<DomNode>) page.getByXPath("//*[@data-selenium='itemHeading']");

        if (nodeProduct.size() > 0) {
            for (DomNode e : nodeProduct) {
                System.out.println(e.asXml());
                List<DomNode> b = (List<DomNode>) page.getByXPath("//span[@itemprop='brand']");
                System.out.println(b);
            }
        }
    }
}
|
From: Stephen P. <st...@lo...> - 2016-03-04 20:05:22
|
I have set my production code to ignore all the errors. They are irrelevant to my purpose.

I included a brief summary of the errors to show that yes, I am running against the live site and it has things in it that HtmlUnit does not like. However, I am able to get a clean-enough copy of the site into my page object and parse it out for the pieces that I need, up to this point where I am having trouble.

Thank you. Merci!

~ SMP

-----
Stephen M. Paulsen
Lowing Light & Grip |
From: Albu G. <alb...@gm...> - 2016-03-04 19:55:58
|
http://stackoverflow.com/questions/16754752/java-htmlunit-failing-to-load-javascript

On 04/03/2016 19:18, Stephen Paulsen wrote:
> om.gargoylesoftware.htmlunit.DefaultCssErrorHandler error |
From: Stephen P. <st...@lo...> - 2016-03-04 18:18:16
|
Hi, Ahmed.

That's all well and good, but when you run against the full HTML of the whole page, or against the live site, this is what I get as output:

/*
 *
 *
 *
 */
package hutesting;

import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.DomNode;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import java.util.List;

/**
 *
 * @author spaulsen
 */
public class HUTesting {

    /**
     * @param args the command line arguments
     * @throws java.lang.Exception
     */
    public static void main(String[] args) throws Exception {
        try (final WebClient webClient = new WebClient(BrowserVersion.CHROME)) {

            // String url = "http://localhost:8888/vendor.html";
            String url = "http://www.bhphotovideo.com/c/search?Ntt=kas-";
            // Yes, this is a live site. Be nice.
            HtmlPage page = webClient.getPage(url);
            List<DomNode> nodeProduct = (List<DomNode>) page.getByXPath("//*[@data-selenium='itemDetail']");

            if (nodeProduct.size() > 0) {
                for (DomNode e : nodeProduct) {
                    List<DomNode> b = (List<DomNode>) e.getByXPath("//span[@itemprop='brand']");
                    System.out.println(b);
                }
            }
        }
    }
}

Output:

run:
Mar 04, 2016 1:06:55 PM com.gargoylesoftware.htmlunit.html.HtmlPage loadExternalJavaScriptFile
( Irrelevant and can be ignored )
Mar 04, 2016 1:06:55 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
Mar 04, 2016 1:06:56 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
Mar 04, 2016 1:06:56 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler warning
Mar 04, 2016 1:06:56 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
Mar 04, 2016 1:06:56 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler warning
Mar 04, 2016 1:06:56 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
Mar 04, 2016 1:06:56 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
Mar 04, 2016 1:06:56 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
Mar 04, 2016 1:06:56 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
Mar 04, 2016 1:06:56 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
Mar 04, 2016 1:06:56 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
Mar 04, 2016 1:06:56 PM com.gargoylesoftware.htmlunit.javascript.host.css.CSSStyleSheet pixelValue
Mar 04, 2016 1:06:57 PM com.gargoylesoftware.htmlunit.javascript.host.css.CSSStyleSheet pixelValue
Mar 04, 2016 1:06:57 PM com.gargoylesoftware.htmlunit.javascript.host.css.CSSStyleSheet pixelValue
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
BUILD SUCCESSFUL (total time: 6 seconds)

-----
Stephen M. Paulsen
Lowing Light & Grip |
From: Ahmed A. <asa...@ya...> - 2016-03-04 09:42:35
|
Hi,

As hinted earlier, you need to add "//" before span

The below code prints something:

    public static void main(String[] args) throws Exception {
        try (final WebClient webClient = new WebClient(BrowserVersion.CHROME)) {

            String url = "http://localhost:8080/snippet.html";
            HtmlPage page = webClient.getPage(url);
            List<DomNode> nodeProduct = (List<DomNode>) page.getByXPath("//*[@data-selenium='itemDetail']");

            if (nodeProduct.size() > 0) {
                for (DomNode e : nodeProduct) {
                    List<DomNode> b = (List<DomNode>) e.getByXPath("//span[@itemprop='brand']");
                    System.out.println(b);
                }
            }
        }
    }

From: Stephen Paulsen <st...@lo...>
To: Ahmed Ashour <asa...@ya...>; htm...@li...
Sent: Thursday, March 3, 2016 7:45 PM
Subject: Re: [Htmlunit-user] Nested getByXPath Has Me All Confused

Hi, Ahmed.

Attached is a ZIP file which includes 3 text files:

    vendor.html
    snippet.html
    analyzeResults.txt

I've obscured the obvious information about the vendor.

You can see in the full vendor.html that there is a lot going on. I have been able to separate out the 24 snippets that I need with the data-selenium='itemDetail', however even though the documentation, and your note, indicates the //div... should work, it does not. I've not yet tried the "contains" construction of the parameter, but I do not think that would explain why the search path doesn't work as is.

When I apply the itemprop='brand' to the snippet, I get zero results. When I apply the //span to the snippet alone, I get *all* 24 brand listings from the complete page, even though I am asking only about the specific element in e.

The point is to scrape the brand, name, and price from all 24 results returned by the search.

The analyzeResults.txt is the Java I have been using. You can see some of the variations I have used in constructing the search for the brand. Until that works, I have given up on the searches for the related product name and price.

Your thoughts?

Thanks!

~ Steve

-----
Stephen M. Paulsen
Lowing Light & Grip
|
From: Ahmed A. <asa...@ya...> - 2016-03-03 08:20:13
|
Hi Stephen,

>> It *should* be //div[@data-selenium='itemDetail'],

This will happen only if you have <div data-selenium='itemDetail'>, it will not work if it is not 'div'.

>> e.getByXPath("span[@itemprop='brand']");

Always use "//span", because it will search for sub-children, check XPath tutorials.

If you can share the URL with your code, it will be better, or at least send the relevant page HTML (via element.asXml()).

Ahmed

From: Stephen Paulsen <st...@lo...>
To: htm...@li...
Sent: Wednesday, March 2, 2016 9:02 PM
Subject: [Htmlunit-user] Nested getByXPath Has Me All Confused

Hello, HtmlUnit Users.

I am not even sure how to ask this question. Tell me if I am leaving out any important information. Here it goes.

I am using HtmlUnit to scrape information from various web sites in order to see what my competitors are charging for products we also sell.

On one vendor's site, I am able to get the page I expect with the standard

    this.page = this.webClient.getPage(myURL);

Digging into the structure of the page, I am able to identify the 24 products that come back from my search simulated by the getPage

    List<DomNode> nodeProduct = (List<DomNode>) this.page.getByXPath("//*[@data-selenium='itemDetail']");

I can confirm that the nodeProduct list has 24 elements in it. I do not know why I have to use the wildcard in the XPath. It *should* be //div[@data-selenium='itemDetail'], but that always returns zero entries in the List.

For my next trick, I start to iterate through the list to examine each individual entry in the list:

    for (DomNode e : nodeProduct) { stuff }

By way of debugging, I include in "stuff" this line:

    System.out.println(e.asXml());

This shows me that I do, in fact, have one of the 24 possible products for which I searched. It compares to the HTML source from my browser correctly.

For purposes of asking this question, this is the HTML in question:

    <span itemprop="brand">Kino Flo</span>

When I try to get HtmlUnit to tell me what the brand name is of the product in question, I have tried several options. I started with this because it seemed to make the most sense, and appears to be what all the documentation and examples indicate will work:

    List<DomNode> b = (List<DomNode>) e.getByXPath("span[@itemprop='brand']");

No matter what I have tried, the list of b DomNodes either comes back as length=zero, or false. Yes, the span element I am looking for is buried beneath several other divs. According to the documentation I have read about XPath, this search *should* find it even if buried. I may be misunderstanding the requirements.

I come to you on the mailing list after having tried DomNode, DomElement, HtmlElement, wildcards, fully qualified paths, and everything else I can think of or find in an example.

What am I misunderstanding?

Thank you!

~ Steve

-----
Stephen M. Paulsen
Lowing Light & Grip

------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
_______________________________________________
Htmlunit-user mailing list
Htm...@li...
https://lists.sourceforge.net/lists/listinfo/htmlunit-user
|
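A note on the "//span" advice in the reply above: in standard XPath, a path that starts with "//" is evaluated from the document root even when the call is made on an inner element, which would be consistent with the report elsewhere in this thread of getting all 24 brands back from a single product node. A path that starts with ".//" is limited to descendants of the context node. The sketch below shows the three variants using the JDK's own XPath engine rather than HtmlUnit; the miniature page and the class name XPathScopeDemo are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathScopeDemo {

    // Hypothetical miniature of a search-results page: two products, one brand each,
    // with the brand span nested one level below the product element.
    static final String PAGE =
          "<html><body>"
        + "<div data-selenium='itemDetail'><div><span itemprop='brand'>Kino Flo</span></div></div>"
        + "<div data-selenium='itemDetail'><div><span itemprop='brand'>Other Brand</span></div></div>"
        + "</body></html>";

    /** Counts the nodes that expr selects when the FIRST product is the context node. */
    static int countFromFirstProduct(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Parse the page and get its document node.
            Node root = (Node) xpath.evaluate(
                    "/", new InputSource(new StringReader(PAGE)), XPathConstants.NODE);
            // Pick the first product element as the context node.
            Node first = (Node) xpath.evaluate(
                    "(//*[@data-selenium='itemDetail'])[1]", root, XPathConstants.NODE);
            NodeList hits = (NodeList) xpath.evaluate(expr, first, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Direct children of the product only: the span is nested deeper, so nothing matches.
        System.out.println(countFromFirstProduct("span[@itemprop='brand']"));    // 0
        // Absolute path: searches the whole document, ignoring the context node.
        System.out.println(countFromFirstProduct("//span[@itemprop='brand']"));  // 2
        // Relative descendant path: searches only below the context node.
        System.out.println(countFromFirstProduct(".//span[@itemprop='brand']")); // 1
    }
}
```

Assuming HtmlUnit's getByXPath follows the same rules when called on a DomNode, the inner lookup in the loop would become e.getByXPath(".//span[@itemprop='brand']"). This has not been checked against the vendor page discussed in the thread.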
From: Stephen P. <st...@lo...> - 2016-03-03 04:03:14
|
Hello, HtmlUnit Users.

I am not even sure how to ask this question. Tell me if I am leaving out any important information. Here it goes.

I am using HtmlUnit to scrape information from various web sites in order to see what my competitors are charging for products we also sell.

On one vendor's site, I am able to get the page I expect with the standard

    this.page = this.webClient.getPage(myURL);

Digging into the structure of the page, I am able to identify the 24 products that come back from my search simulated by the getPage

    List<DomNode> nodeProduct = (List<DomNode>) this.page.getByXPath("//*[@data-selenium='itemDetail']");

I can confirm that the nodeProduct list has 24 elements in it. I do not know why I have to use the wildcard in the XPath. It *should* be //div[@data-selenium='itemDetail'], but that always returns zero entries in the List.

For my next trick, I start to iterate through the list to examine each individual entry in the list:

    for (DomNode e : nodeProduct) { stuff }

By way of debugging, I include in "stuff" this line:

    System.out.println(e.asXml());

This shows me that I do, in fact, have one of the 24 possible products for which I searched. It compares to the HTML source from my browser correctly.

For purposes of asking this question, this is the HTML in question:

    <span itemprop="brand">Kino Flo</span>

When I try to get HtmlUnit to tell me what the brand name is of the product in question, I have tried several options. I started with this because it seemed to make the most sense, and appears to be what all the documentation and examples indicate will work:

    List<DomNode> b = (List<DomNode>) e.getByXPath("span[@itemprop='brand']");

No matter what I have tried, the list of b DomNodes either comes back as length=zero, or false. Yes, the span element I am looking for is buried beneath several other divs. According to the documentation I have read about XPath, this search *should* find it even if buried. I may be misunderstanding the requirements.

I come to you on the mailing list after having tried DomNode, DomElement, HtmlElement, wildcards, fully qualified paths, and everything else I can think of or find in an example.

What am I misunderstanding?

Thank you!

~ Steve

-----
Stephen M. Paulsen
Lowing Light & Grip
|
From: Ahmed A. <asa...@ya...> - 2016-02-28 16:36:11
|
Hi all,

It is a pleasure to announce the immediate availability of HtmlUnit 2.20.

The main features are:

- Internet Explorer 8 is removed. Now WebClient default constructor uses the most supported browser (Firefox 38 currently).
- Various bug fixes.

The detailed list can be found in [1].

Happy testing!

The HtmlUnit team

[1] http://htmlunit.sourceforge.net/changes-report.html#a2.20
|
From: Aaron B. <Aar...@ad...> - 2016-01-18 23:15:35
|
Thanks for the response, but it turns out there IS a way to do this, and it's actually officially supported! After much more searching, I found this FAQ: http://htmlunit.sourceforge.net/faq.html#HowToModifyRequestOrResponse.

So simply using the WebConnectionWrapper, I implemented some WebResponse caching. For those interested, here is the code I used. It can probably be cleaned up a bit more, but it works just fine. Just be sure to call stopCaching(), which clears the cache; otherwise you risk running out of memory after extended usage.

    // Setting the object on the client
    webClient.setWebConnection(new RequestCachingWebConnection(ret.getWebConnection()));

    // Make a call to a web page while caching all of the WebRequests that are made,
    // including loading resources on the web page
    ((RequestCachingWebConnection) webClient.getWebConnection()).startCaching();
    page = webClient.getPage(url);
    List<WebResponse> responses = ((RequestCachingWebConnection) webClient.getWebConnection()).stopCaching();
    for (WebResponse response : responses) {
        // do stuff
    }

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.CopyOnWriteArrayList;

    import com.gargoylesoftware.htmlunit.WebClient;
    import com.gargoylesoftware.htmlunit.WebConnection;
    import com.gargoylesoftware.htmlunit.WebRequest;
    import com.gargoylesoftware.htmlunit.WebResponse;
    import com.gargoylesoftware.htmlunit.util.WebConnectionWrapper;

    public class RequestCachingWebConnection extends WebConnectionWrapper {

        // Responses collected while caching is switched on
        private final List<WebResponse> responses = new CopyOnWriteArrayList<>();
        private boolean doCache = false;

        RequestCachingWebConnection(WebClient webClient) {
            super(webClient);
        }

        RequestCachingWebConnection(WebConnection webConnection) {
            super(webConnection);
        }

        @Override
        public WebResponse getResponse(WebRequest request) throws IOException {
            WebResponse response = super.getResponse(request);
            if (doCache) {
                responses.add(response);
            }
            return response;
        }

        // Clears anything left over and starts collecting responses
        public void startCaching() {
            responses.clear();
            doCache = true;
        }

        // Stops collecting and returns a copy of what was collected
        public List<WebResponse> stopCaching() {
            if (doCache) {
                doCache = false;
                List<WebResponse> tmp = new ArrayList<>(responses);
                responses.clear();
                return tmp;
            }
            return new ArrayList<>();
        }

        public List<WebResponse> getCurrentResponses() {
            return new ArrayList<>(responses);
        }
    }

Aaron Baff
Lead Java Developer
e: aar...@ad...<mailto:aar...@ad...>
t: 310.556.4440 x29
f: 310. 556. 4441

On Jan 17, 2016, at 7:24 AM, Ronald Brill <rb...@rb...<mailto:rb...@rb...>> wrote:

> On Fri, 15 Jan 2016 12:11:01 -0800 Aaron Baff wrote:
>
>> I'm trying to figure out how to get the WebResponse (or similar type of data) for resources found on an HtmlPage. Specifically in this case, I'd like to get the HTTP Headers returned by a <script> tag with a src to a js file on the server. Is there any way to do this?
>
> Hi Aaron,
>
> i fear, there is no way to do it at the moment by using the HtmlUnit API. Internally the WebResponse is thrown away at the moment the string content was created from the response.
>
> If you only like to see the headers, you can enable wire logging for HttpClient. That is simple and you can see everything send out or received from HtmlUnit. Or if you need a bit more convenience try to work with some WebProxy like Charles or Fiddler.
>
> Hope this helps
> RBRi
> --------------------------
> WETATOR
> Smart web application testing
> http://www.wetator.org

________________________________
CONFIDENTIALITY NOTICE
This communication (and/or the documents accompanying it) may contain confidential information. If you are not the intended recipient, you are hereby notified that any disclosure, copying, distribution or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received it in error, please advise the sender by reply e-mail and immediately delete the message and any attachments without copying or disclosing the contents.
|
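On the memory warning in the message above: one way to limit the risk, independent of HtmlUnit, is to cap how many entries the collector keeps and drop the oldest once the cap is reached. The following is only a sketch of that idea in plain Java; BoundedRecorder and its methods are hypothetical names, not part of HtmlUnit or of the code posted above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical helper: keeps at most 'capacity' items, discarding the oldest first. */
public class BoundedRecorder<T> {

    private final int capacity;
    private final Deque<T> items = new ArrayDeque<>();

    public BoundedRecorder(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Adds an item; if the recorder is full, the oldest item is dropped first. */
    public synchronized void record(T item) {
        if (items.size() == capacity) {
            items.removeFirst();
        }
        items.addLast(item);
    }

    /** Returns a copy of the recorded items, oldest first, and empties the recorder. */
    public synchronized List<T> drain() {
        List<T> copy = new ArrayList<>(items);
        items.clear();
        return copy;
    }
}
```

A wrapper like the one posted above could hold such a recorder in place of its unbounded list, at the cost of losing the earliest responses on very busy pages.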
From: Ronald B. <rb...@rb...> - 2016-01-17 15:24:30
|
On Fri, 15 Jan 2016 12:11:01 -0800 Aaron Baff wrote:

>I'm trying to figure out how to get the WebResponse (or similar type of data) for resources found on an HtmlPage. Specifically in this case, I'd like to get the HTTP Headers returned by a <script> tag with a src to a js file on the server. Is there any way to do this?

Hi Aaron,

I fear there is no way to do it at the moment by using the HtmlUnit API. Internally the WebResponse is thrown away at the moment the string content is created from the response.

If you only want to see the headers, you can enable wire logging for HttpClient. That is simple, and you can see everything sent out or received by HtmlUnit. Or, if you need a bit more convenience, try working with a web proxy like Charles or Fiddler.

Hope this helps
RBRi
--------------------------
WETATOR
Smart web application testing
http://www.wetator.org
|
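The logging suggestion in the reply above can be sketched as follows. This assumes the HtmlUnit build in use sits on Apache HttpClient 4.x with commons-logging and its SimpleLog backend, as described in the HttpClient logging guide; with another logging backend (log4j, java.util.logging) the logger names stay the same but the configuration mechanism differs. The class name WireLoggingSetup is made up for illustration.

```java
public class WireLoggingSetup {

    /**
     * Sets the commons-logging system properties that make HttpClient 4.x log
     * request and response headers. Must run before the first HttpClient
     * logger is created, i.e. before the first WebClient request.
     */
    static void enableHeaderLogging() {
        // Route commons-logging to its built-in SimpleLog (writes to stderr).
        System.setProperty("org.apache.commons.logging.Log",
                "org.apache.commons.logging.impl.SimpleLog");
        System.setProperty("org.apache.commons.logging.simplelog.showdatetime", "true");
        // Headers only; switch the logger name to org.apache.http.wire for full wire content.
        System.setProperty(
                "org.apache.commons.logging.simplelog.log.org.apache.http.headers", "DEBUG");
    }

    public static void main(String[] args) {
        enableHeaderLogging();
        System.out.println(System.getProperty(
                "org.apache.commons.logging.simplelog.log.org.apache.http.headers"));
    }
}
```

The same properties can be passed on the command line as -D options instead of being set in code.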
From: Aaron B. <Aar...@ad...> - 2016-01-15 20:24:01
|
I’m trying to figure out how to get the WebResponse (or similar type of data) for resources found on an HtmlPage. Specifically in this case, I’d like to get the HTTP Headers returned by a <script> tag with a src to a js file on the server. Is there any way to do this? Aaron Baff Lead Java Developer e: aar...@ad...<mailto:aar...@ad...> t: 310.556.4440 x29 f: 310. 556. 4441 [cid:6E2...@wa...] ________________________________ CONFIDENTIALITY NOTICE This communication (and/or the documents accompanying it) may contain confidential information. If you are not the intended recipient, you are hereby notified that any disclosure, copying, distribution or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received it in error, please advise the sender by reply e-mail and immediately delete the message and any attachments without copying or disclosing the contents. |
From: Gabriel S. <gab...@gm...> - 2015-12-29 18:37:16
|
I contacted the webmasters of the site and found out that they upgraded their server to require TLS 1.1 or higher. Apparently htmlunit doesn't try these protocols by default. I resolved the problem by adding the following line after instantiating the WebClient.

    webClient.getOptions().setSSLClientProtocols(new String[] { "TLSv1.2", "TLSv1.1" });

Perhaps the developers should consider changing the defaults of htmlunit so that it tries the newest and strongest protocols first, mimicking the behavior of common browsers such as Firefox and Chrome.

Thanks again to the developers and the user community.

- Gabriel

On Mon, Dec 28, 2015 at 11:43 PM, Gabriel E. Sánchez Martínez <gab...@gm...> wrote:

> Hi,
>
> I want to thank the developers of htmlunit for their time and effort. I'm hoping you will be able to help me with an issue that I'm stuck in. I've been using htmlunit for a few months to download a file from a password-protected site, without issues. This week, without any changes to the code, it stopped working. I am getting a "java.net.SocketException: Connection reset" error as soon as I call WebClient.getPage(), before trying to log in. However, I am able to access the page on Firefox. I can also access other pages on htmlunit without issues. Any help you can give me is going to be greatly appreciated!
|
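As background to the fix above: on Java 7, TLSv1.1 and TLSv1.2 are implemented by the JDK but are not enabled by default for client connections, which would be consistent with the handshake being reset until the protocols were named explicitly. The sketch below uses only the JDK to check which of a preferred list of protocols the running JVM supports, and what it enables by default; TlsProtocolCheck is a made-up name, and nothing here is HtmlUnit API.

```java
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.net.ssl.SSLContext;

public class TlsProtocolCheck {

    /** Returns the entries of 'preferred' that this JVM supports, in the given order. */
    static String[] usableProtocols(String... preferred) {
        try {
            List<String> supported = Arrays.asList(
                    SSLContext.getDefault().getSupportedSSLParameters().getProtocols());
            List<String> usable = new ArrayList<>();
            for (String protocol : preferred) {
                if (supported.contains(protocol)) {
                    usable.add(protocol);
                }
            }
            return usable.toArray(new String[0]);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the protocols this JVM enables by default for new connections. */
    static String[] defaultProtocols() {
        try {
            return SSLContext.getDefault().getDefaultSSLParameters().getProtocols();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Enabled by default: " + Arrays.toString(defaultProtocols()));
        System.out.println("Usable from wish list: "
                + Arrays.toString(usableProtocols("TLSv1.2", "TLSv1.1")));
    }
}
```

The array returned by usableProtocols could then be passed to webClient.getOptions().setSSLClientProtocols(...), the call shown in the message above.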
From: Gabriel E. S. M. <gab...@gm...> - 2015-12-29 04:43:14
|
Hi,

I want to thank the developers of htmlunit for their time and effort. I'm hoping you will be able to help me with an issue that I'm stuck in. I've been using htmlunit for a few months to download a file from a password-protected site, without issues. This week, without any changes to the code, it stopped working. I am getting a "java.net.SocketException: Connection reset" error as soon as I call WebClient.getPage(), before trying to log in. However, I am able to access the page on Firefox. I can also access other pages on htmlunit without issues. Any help you can give me is going to be greatly appreciated!

The code below reproduces the error:

    final String path = "https://passprogram.mbta.com";
    try {
        final WebClient webClient = new WebClient();
        final HtmlPage page1 = webClient.getPage(path);
        System.out.println(page1.getTitleText());
    } catch (IOException | FailingHttpStatusCodeException ex) {
        Logger.getLogger(Main.class.getName()).log(Level.SEVERE, null, ex);
    }

I get the following output:

    Dec 28, 2015 11:35:15 PM org.apache.http.impl.execchain.RetryExec execute
    INFO: I/O exception (java.net.SocketException) caught when processing request to {s}->https://passprogram.mbta.com:443: Connection reset
    Dec 28, 2015 11:35:15 PM org.apache.http.impl.execchain.RetryExec execute
    INFO: Retrying request to {s}->https://passprogram.mbta.com:443
    Dec 28, 2015 11:35:16 PM org.apache.http.impl.execchain.RetryExec execute
    INFO: I/O exception (java.net.SocketException) caught when processing request to {s}->https://passprogram.mbta.com:443: Connection reset
    Dec 28, 2015 11:35:16 PM org.apache.http.impl.execchain.RetryExec execute
    INFO: Retrying request to {s}->https://passprogram.mbta.com:443
    Dec 28, 2015 11:35:16 PM org.apache.http.impl.execchain.RetryExec execute
    INFO: I/O exception (java.net.SocketException) caught when processing request to {s}->https://passprogram.mbta.com:443: Connection reset
    Dec 28, 2015 11:35:16 PM org.apache.http.impl.execchain.RetryExec execute
    INFO: Retrying request to {s}->https://passprogram.mbta.com:443
    Dec 28, 2015 11:35:16 PM main.Main test
    SEVERE: null
    java.net.SocketException: Connection reset
        at java.net.SocketInputStream.read(SocketInputStream.java:196)
        at java.net.SocketInputStream.read(SocketInputStream.java:122)
        at sun.security.ssl.InputRecord.readFully(InputRecord.java:442)
        at sun.security.ssl.InputRecord.read(InputRecord.java:480)
        at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:946)
        at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1344)
        at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1371)
        at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1355)
        at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:394)
        at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:353)
        at com.gargoylesoftware.htmlunit.httpclient.HtmlUnitSSLConnectionSocketFactory.connectSocket(HtmlUnitSSLConnectionSocketFactory.java:188)
        at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:134)
        at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:353)
        at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:380)
        at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
        at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
        at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
        at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:71)
        at com.gargoylesoftware.htmlunit.HttpWebConnection.getResponse(HttpWebConnection.java:177)
        at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1324)
        at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1241)
        at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:348)
        at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:417)
        at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:402)
        at main.Main.test(Main.java:160)
        at main.Main.main(Main.java:65)

I am running Ubuntu 14.04 and have the following Java installed:

    java version "1.7.0_91"
    OpenJDK Runtime Environment (IcedTea 2.6.3) (7u91-2.6.3-0ubuntu0.14.04.1)
    OpenJDK 64-Bit Server VM (build 24.91-b01, mixed mode)

Thanks,
Gabriel
|
From: Pablo L. <pab...@ho...> - 2015-12-28 10:13:43
|
Ahmed,

If you think I can help you with this bug, don't hesitate to ask me ... ;)

Regards,

Pablo.

On 14/12/2015 at 14:09, Pablo León wrote:
> Hi,
>
> As it seems that zip attachments are not allowed in this forum, I have created a github repository with the sources ...
>
> https://github.com/hangorn/vaadinsTables/tree/master/vaadinsTables
>
> Regards,
>
> Pablo
>
> On 14/12/2015 at 13:08, Pablo León wrote:
>> Hi, Ahmed:
>>
>> I can't disclose our project's sources, but I have prepared a tiny maven project showing the problem. Though the test passes ok, the click event isn't fired on the server side, as it is when you manually click on a row ...
>>
>> Regards,
>>
>> Pablo.
>>
>> On 11/12/2015 at 18:28, Ahmed Ashour wrote:
>>> Hi,
>>>
>>> Is your project publicly available? Or can you post minimal Vaadin and HtmlUnit code?
>>>
>>> Ahmed
>>> ------------------------------------------------------------------------
>>> *From:* Pablo León <pab...@ho...>
>>> *To:* htm...@li...
>>> *Sent:* Friday, December 11, 2015 5:53 PM
>>> *Subject:* [Htmlunit-user] Vaadin's tables
>>>
>>> Hi:
>>>
>>> I'm using htmlunit in a Vaadin based project, and have many Vaadin components correctly responding to htmlunit fired events.
>>>
>>> But I'm stuck with tables ... while I get no exceptions running this code and DomElement e is found, no ValueChangeEvent nor ItemClickEvent is fired on server side's table.
>>>
>>>     DomElement e = page.getFirstByXPath("//div[@id='tableid']//table[@class='v-table-table']//tr[3]");
>>>     e.click();
>>>
>>> Does someone have some experience with Vaadin's tables? Any clue?
>>>
>>> Regards,
>>>
>>> Pablo.
|
From: Евгений П. <pec...@ya...> - 2015-12-26 05:39:06
|
Hello, dear sirs!

Thank you for the awesome HtmlUnit. Unfortunately, this time it doesn't help me.
Can you check whether this is a bug in HtmlUnit, or whether the site is just too ugly?

This is not my website; I just extract information from it under a paid subscription.
http://spark3.spark-interfax.ru/
I cannot even log in to it.

This severe error, reported by HtmlUnit, may be the cause of the problem.

    Dec 26, 2015 12:18:56 PM com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter runtimeError
    SEVERE: runtimeError: message=[An invalid or illegal selector was specified (selector: '*,:x' error: Invalid selector: *:x).] sourceName=[http://spark3.spark-interfax.ru/content/build/18122015/js/vendor.js] line=[1] lineSource=[null] lineOffset=[0]

This js file is downloadable, but unfortunately it is compressed.

I'm not a JS expert, so I struggle to figure out whether the selector "*:x" is allowed by the standards or not. Please comment.

My code is roughly:

    protected WebClient webClient;

    webClient = new WebClient(BrowserVersion.CHROME);
    webClient.getOptions().setCssEnabled(true);
    webClient.getOptions().setJavaScriptEnabled(true);
    webClient.getOptions().setThrowExceptionOnScriptError(false);
    webClient.getOptions().setTimeout(30000);
    webClient.setJavaScriptTimeout(30000);
    webClient.getOptions().setPopupBlockerEnabled(true);
    webClient.getOptions().setRedirectEnabled(true);
    webClient.getCookieManager().setCookiesEnabled(true);
    webClient.setCssErrorHandler(new com.gargoylesoftware.htmlunit.SilentCssErrorHandler());
    webClient.getOptions().setPrintContentOnFailingStatusCode(true);

    HtmlPage activePage = webClient.getPage("http://spark3.spark-interfax.ru/");
    webClient.waitForBackgroundJavaScript(20000);

    java.io.FileWriter writer = new java.io.FileWriter("/tmp/1.html", false);
    writer.write(activePage.asXml());
    writer.close();

I tried turning CSS, popups, and redirects on and off, downgrading from HtmlUnit 2.19 to 2.18, and all of the emulated "browsers" in HtmlUnit. The result is always the same.
From: Ahmed A. <asa...@ya...> - 2015-12-21 08:22:18
|
Hi Pablo, I see the issue, however isolating a minimal JavaScript case would take more time. Please open a bug report to be tracked. Ahmed From: Pablo León <pab...@ho...> To: Ahmed Ashour <asa...@ya...>; htm...@li... Sent: Monday, December 14, 2015 2:09 PM Subject: Re: [Htmlunit-user] Vaadin's tables Hi, As it seems that zip attachments are not allowed in this forum, I have created a github repository with the sources ... https://github.com/hangorn/vaadinsTables/tree/master/vaadinsTables Regards, Pablo |
From: Pablo L. <pab...@ho...> - 2015-12-14 13:10:02
|
Hi,

As it seems that zip attachments are not allowed in this forum, I have created a GitHub repository with the sources ...

https://github.com/hangorn/vaadinsTables/tree/master/vaadinsTables

Regards,
Pablo

On 14/12/2015 at 13:08, Pablo León wrote:
> Hi, Ahmed:
>
> I can't disclose our project's sources, but I have prepared a tiny
> Maven project showing the problem. Though the test passes OK, the click
> event isn't fired on the server side, as it is when you manually click
> on a row ...
>
> Regards,
>
> Pablo.
>
> On 11/12/2015 at 18:28, Ahmed Ashour wrote:
>> Hi,
>>
>> Is your project publicly available? Or can you post minimal
>> Vaadin and HtmlUnit code?
>>
>> Ahmed
>> ------------------------------------------------------------------------
>> *From:* Pablo León <pab...@ho...>
>> *To:* htm...@li...
>> *Sent:* Friday, December 11, 2015 5:53 PM
>> *Subject:* [Htmlunit-user] Vaadin's tables
>>
>> Hi:
>>
>> I'm using HtmlUnit in a Vaadin-based project, and have many Vaadin
>> components correctly responding to events fired by HtmlUnit.
>>
>> But I'm stuck with tables ... while I get no exceptions running this
>> code and DomElement e is found, neither a ValueChangeEvent nor an
>> ItemClickEvent is fired on the server-side table.
>>
>> DomElement e =
>> page.getFirstByXPath("//div[@id='tableid']//table[@class='v-table-table']//tr[3]");
>> e.click();
>>
>> Does anyone have experience with Vaadin's tables? Any clue?
>>
>> Regards,
>>
>> Pablo.
>>
>> _______________________________________________
>> Htmlunit-user mailing list
>> Htm...@li...
>> https://lists.sourceforge.net/lists/listinfo/htmlunit-user
 |
From: Ahmed A. <asa...@ya...> - 2015-12-11 17:28:54
|
Hi,

Is your project publicly available? Or can you post minimal Vaadin and HtmlUnit code?

Ahmed

From: Pablo León <pab...@ho...>
To: htm...@li...
Sent: Friday, December 11, 2015 5:53 PM
Subject: [Htmlunit-user] Vaadin's tables

Hi:

I'm using HtmlUnit in a Vaadin-based project, and have many Vaadin components correctly responding to events fired by HtmlUnit.

But I'm stuck with tables ... while I get no exceptions running this code and DomElement e is found, neither a ValueChangeEvent nor an ItemClickEvent is fired on the server-side table.

DomElement e = page.getFirstByXPath("//div[@id='tableid']//table[@class='v-table-table']//tr[3]");
e.click();

Does anyone have experience with Vaadin's tables? Any clue?

Regards,
Pablo. |
From: Pablo L. <pab...@ho...> - 2015-12-11 16:53:35
|
Hi:

I'm using HtmlUnit in a Vaadin-based project, and have many Vaadin components correctly responding to events fired by HtmlUnit.

But I'm stuck with tables ... while I get no exceptions running this code and DomElement e is found, neither a ValueChangeEvent nor an ItemClickEvent is fired on the server-side table.

DomElement e = page.getFirstByXPath("//div[@id='tableid']//table[@class='v-table-table']//tr[3]");
e.click();

Does anyone have experience with Vaadin's tables? Any clue?

Regards,
Pablo. |
From: Christoff, K. <kch...@no...> - 2015-12-02 19:29:08
|
Here's the whole exception: com.gargoylesoftware.htmlunit.ScriptException: Error: Mismatched anonymous define() module: function (e, t, n) { return e.extend({init: function (e) { var t = this; t.options = $.extend({}, {container: ".upload-form", iframeElement: "#hiddenIframeForUpload"}, e), t.$container = $(t.options.container), t.getFileInput().change(function (e) { var n = t.getFileInput().val().replace("C:\\fakepath\\", ""); t.getFileNameInput().val(n), t.enableDisableUploadButton(); }), t.getFileNameIcon().click(function (e) { t.showFileSelection(); }), t.getFileNameInput().click(function (e) { t.showFileSelection(); }), t.getUploadButton().click(function (e) { t.upload(); }), t.enableDisableUploadButton(); }, showFileSelection: function () { this.getFileInput().trigger("click"); }, getFileInput: function () { return this.$container.find("input[type=file]"); }, getFileNameIcon: function () { return this.$container.find(".icon-file"); }, getFileNameInput: function () { return this.$container.find("input[type=text]"); }, getUploadButton: function () { return this.$container.find("button.upload-btn"); }, enableDisableUploadButton: function () { var e = this.getFileInput().val(); e === "" ? 
this.getUploadButton().attr("disabled", "disabled") : this.getUploadButton().removeAttr("disabled"); }, upload: function () { this.$container.submit(), this.clearPostOutput(), this.checkPostOutput(); }, getSuccessContainer: function () { return this.$container.find(".success"); }, getErrorContainer: function () { return this.$container.find(".error"); }, getIFrameContent: function () { return $(this.options.iframeElement).contents(); }, clearPostOutput: function () { var e = this.getIFrameContent(); $.publish("fileUploaded", [!1]), e.find("input[name=error]").val(""), e.find("input[name=message]").val(""); }, checkPostOutput: function () { var e = this, r = e.options, i = e.getIFrameContent(), s = i.find("input[name=error]"), o = i.find("input[name=message]"); if (s.length > 0 && s.val()) { var u = s.val(); u === "false" ? ($.publish("fileUploaded", [!0]), e.getSuccessContainer().show(), e.getErrorContainer().hide()) : (e.getSuccessContainer().hide(), e.getErrorContainer().html(o.val()).show()); } else { t.getTestMode() || n.delay(n.bind(e.checkPostOutput, e), 1000); } }}); } http://requirejs.org/docs/errors.html#mismatch ( http://localhost:7180/static/ext/require.js#141) Kyle |
From: Ronald B. <rb...@rb...> - 2015-12-02 18:57:46
|
Please post the whole exception RBRi On Wed, 2 Dec 2015 13:43:05 -0500 Christoff, Kyle wrote: > >Ahmed, > >I updated my code with setThrowExceptionOnScriptError(true). When I re-run >it, after clicking on the "Login" button, I get this exception. Should I >conclude that htmlunit cannot be used to automate this internal cloudera >installation web site? If so, could you tell me the reason? > >Kyle > >javascript enabled = true >---------- newpage after loading login page ---------- >Login - Cloudera Manager > > Cloudera Manager >Community Forums >Help > > > >Login >Username: Password: unchecked Remember me on this computer. Login >username=[HtmlTextInput[<input type="text" class="input-large" >id="username" name="j_username" autofocus="">]] >password=[HtmlPasswordInput[<input type="password" class="input-large" >id="password" name="j_password">]] >login=[HtmlButton[<button type="submit" class="btn btn-primary btn-large >btn-block" name="submit">]] >com.gargoylesoftware.htmlunit.ScriptException: Error: Mismatched anonymous >define() module: function (e, t, n) { > return e.extend({init: function (e) { > var t = this; > t.options = $.extend({}, {container: ".upload-form", iframeElement: >"#hiddenIframeForUpload"}, e), t.$container = $(t.options.container), >t.getFileInput().change(function (e) { > var n = t.getFileInput().val().replace("C:\\fakepath\\", ""); > t.getFileNameInput().val(n), t.enableDisableUploadButton(); > }), t.getFileNameIcon().click(function (e) { > t.showFileSelection(); > }), t.getFileNameInput().click(function (e) { > t.showFileSelection(); > }), t.getUploadButton().click(function (e) { > t.upload(); > }), t.enableDisableUploadButton(); > }, showFileSelection: function () { > this.getFileInput().trigger("click"); > }, getFileInput: function () { > return this.$container.find("input[type=file]"); > }, getFileNameIcon: function () { > return this.$container.find(".icon-file"); > }, getFileNameInput: function () { > return 
this.$container.find("input[type=text]"); > }, getUploadButton: function () { > return this.$container.find("button.upload-btn"); > }, enableDisableUploadButton: function () { > var e = this.getFileInput().val(); > e === "" ? this.getUploadButton().attr("disabled", "disabled") : >this.getUploadButton().removeAttr("disabled"); > }, upload: function () { > this.$container.submit(), this.clearPostOutput(), >this.checkPostOutput(); > }, getSuccessContainer: function () { > return this.$container.find(".success"); > }, getErrorContainer: function () { > return this.$container.find(".error"); > }, getIFrameContent: function () { > return $(this.options.iframeElement).contents(); > }, clearPostOutput: function () { > var e = this.getIFrameContent(); > $.publish("fileUploaded", [!1]), >e.find("input[name=error]").val(""), e.find("input[name=message]").val(""); > }, checkPostOutput: function () { > var e = this, r = e.options, i = e.getIFrameContent(), s = >i.find("input[name=error]"), o = i.find("input[name=message]"); > if (s.length > 0 && s.val()) { > var u = s.val(); > u === "false" ? ($.publish("fileUploaded", [!0]), >e.getSuccessContainer().show(), e.getErrorContainer().hide()) : >(e.getSuccessContainer().hide(), >e.getErrorContainer().html(o.val()).show()); > } else { > t.getTestMode() || n.delay(n.bind(e.checkPostOutput, e), 1000); > } > }}); >} >http://requirejs.org/docs/errors.html#mismatch ( >http://localhost:7180/static/ext/require.js#141) |
From: Christoff, K. <kch...@no...> - 2015-12-02 18:43:13
|
Ahmed, I updated my code with setThrowExceptionOnScriptError(true). When I re-run it, after clicking on the "Login" button, I get this exception. Should I conclude that htmlunit cannot be used to automate this internal cloudera installation web site? If so, could you tell me the reason? Kyle javascript enabled = true ---------- newpage after loading login page ---------- Login - Cloudera Manager Cloudera Manager Community Forums Help Login Username: Password: unchecked Remember me on this computer. Login username=[HtmlTextInput[<input type="text" class="input-large" id="username" name="j_username" autofocus="">]] password=[HtmlPasswordInput[<input type="password" class="input-large" id="password" name="j_password">]] login=[HtmlButton[<button type="submit" class="btn btn-primary btn-large btn-block" name="submit">]] com.gargoylesoftware.htmlunit.ScriptException: Error: Mismatched anonymous define() module: function (e, t, n) { return e.extend({init: function (e) { var t = this; t.options = $.extend({}, {container: ".upload-form", iframeElement: "#hiddenIframeForUpload"}, e), t.$container = $(t.options.container), t.getFileInput().change(function (e) { var n = t.getFileInput().val().replace("C:\\fakepath\\", ""); t.getFileNameInput().val(n), t.enableDisableUploadButton(); }), t.getFileNameIcon().click(function (e) { t.showFileSelection(); }), t.getFileNameInput().click(function (e) { t.showFileSelection(); }), t.getUploadButton().click(function (e) { t.upload(); }), t.enableDisableUploadButton(); }, showFileSelection: function () { this.getFileInput().trigger("click"); }, getFileInput: function () { return this.$container.find("input[type=file]"); }, getFileNameIcon: function () { return this.$container.find(".icon-file"); }, getFileNameInput: function () { return this.$container.find("input[type=text]"); }, getUploadButton: function () { return this.$container.find("button.upload-btn"); }, enableDisableUploadButton: function () { var e = this.getFileInput().val(); 
e === "" ? this.getUploadButton().attr("disabled", "disabled") : this.getUploadButton().removeAttr("disabled"); }, upload: function () { this.$container.submit(), this.clearPostOutput(), this.checkPostOutput(); }, getSuccessContainer: function () { return this.$container.find(".success"); }, getErrorContainer: function () { return this.$container.find(".error"); }, getIFrameContent: function () { return $(this.options.iframeElement).contents(); }, clearPostOutput: function () { var e = this.getIFrameContent(); $.publish("fileUploaded", [!1]), e.find("input[name=error]").val(""), e.find("input[name=message]").val(""); }, checkPostOutput: function () { var e = this, r = e.options, i = e.getIFrameContent(), s = i.find("input[name=error]"), o = i.find("input[name=message]"); if (s.length > 0 && s.val()) { var u = s.val(); u === "false" ? ($.publish("fileUploaded", [!0]), e.getSuccessContainer().show(), e.getErrorContainer().hide()) : (e.getSuccessContainer().hide(), e.getErrorContainer().html(o.val()).show()); } else { t.getTestMode() || n.delay(n.bind(e.checkPostOutput, e), 1000); } }}); } http://requirejs.org/docs/errors.html#mismatch ( http://localhost:7180/static/ext/require.js#141) |
From: Ahmed A. <asa...@ya...> - 2015-12-02 15:54:06
|
Hi Kyle,

You shouldn't use setThrowExceptionOnScriptError(false), since it can hide the exception that shows the details of what fails.

I am not sure whether there is a public website for testing against Cloudera Manager.

Ahmed

From: "Christoff, Kyle" <kch...@no...>
To: htm...@li...
Sent: Wednesday, December 2, 2015 4:35 PM
Subject: [Htmlunit-user] cannot read html updated source

Hello,

I'm trying to automate a Cloudera installation by using htmlunit-2.19. I've downloaded cloudera-manager-installer.bin from Cloudera. After executing it, a Cloudera Manager installer web site is available at http://localhost:7180. I'm trying to navigate this web site and supply inputs using htmlunit-2.19 to complete my installation.

I can use HtmlUnit to log in and navigate to the page where I can enter the hostnames of what will become my Hadoop cluster. After setting a textArea on this page with my comma-separated list of hostnames, I find and click a "Search" button on the page. The page, I believe, executes some JavaScript to go out and find these hosts. After finding them, the page is updated with a list of the nodes. At this point, a previously unavailable "Continue" button on the page becomes available. I'd like to click it to move to the next page and continue my installation. However, my Java program never sees the updated list of nodes and I cannot move forward.

I know that HtmlUnit normally does not wait for JavaScript to finish, so I've taken steps to allow for that. As you'll see in my attached code, I've set:

    webClient.setAjaxController(new NicelyResynchronizingAjaxController());

and, after clicking the "Search" button:

    webClient.waitForBackgroundJavaScript(10000);

but I've not been successful. How can I capture the page updates and continue my installation?

Thank you,
Kyle |
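A single call to `waitForBackgroundJavaScript` returns once the background jobs known at that moment have finished or the timeout expires, which does not by itself guarantee that the page has reached the state you expect. One common approach is to poll for the expected change instead. Below is a minimal sketch of such a polling helper using only the JDK. `WaitFor` and `until` are hypothetical names, not part of HtmlUnit; in real use the condition would be a lambda that re-checks the current page (for example, whether the "Continue" button has become enabled), which is not shown here.

```java
import java.util.function.BooleanSupplier;

public class WaitFor {
    /**
     * Polls the condition until it returns true or the timeout elapses.
     * Returns true if the condition was met, false on timeout or interruption.
     */
    public static boolean until(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            // Overflow-safe comparison of nanoTime values.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag for the caller and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in condition: becomes true after roughly 300 ms.
        boolean met = until(() -> System.currentTimeMillis() - start >= 300, 2000, 50);
        System.out.println("condition met: " + met);
    }
}
```

The condition is checked once more after the last sleep before giving up, so a change that lands just before the deadline is still seen. Returning a boolean instead of throwing leaves it to the caller to decide whether a timeout is an error.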