From: Stephen P. <st...@lo...> - 2016-03-07 21:32:02
Thank you for all your help. I'm still having trouble. Two things are on my mind:

Thing 1 - The reason I grabbed itemDetail is that I am looking for three pieces of information that are all enclosed in that tree:

    <div data-selenium="itemDetail">
      <div data-selenium="img-zone"> ... Stuff I don't care about ... </div>
      <div data-selenium="itemInfo-zone">
        <div data-selenium="itemHeader">
          <h3 data-selenium="itemHeading">
            <a href="blah blah blah">
              <span itemprop="brand"> I WANT THIS BRAND </span>
              <span itemprop="name"> I WANT THIS NAME </span>
            </a>
          </h3>
          <p> ... More stuff I don't care about ... </p>
          <p> ... Even More Stuff I Don't Care About ... </p>
        </div> <!-- itemHeader -->
        <div data-selenium="highlights"> ... Yet Even More Stuff I Don't Care About ... </div>
        <div data-selenium="itemSection"> ... Whatever ... </div>
      </div> <!-- itemInfo-zone -->
      <div data-selenium="conversion-zone">
        <div data-selenium="price-zone">
          <div data-selenium="prices"> ... Don't care ...</div>
          <div data-selenium="addToCartPrice">
            <p data-selenium="finalPrice">
              <span data-selenium="youpayPrice"> ... Yawn ... </span>
              <span data-selenium="price"> THIS IS THE PRICE I WANT </span>
            </p> <!-- finalPrice -->
          </div> <!-- addToCartPrice -->
        </div> <!-- price-zone -->
      </div> <!-- conversion-zone -->

NOTE: For brevity, I have removed the additional attributes of the elements. Since I am still climbing this learning curve, tell me if that is interfering with my use of XPath. You can see from the code that I am asking XPath to search only for those attributes which I think lead to the information I want. Tell me if I should be using a different syntax, or even a different algorithm for this.

Thing 2 - I applied the changes Ahmed suggested from the last reply.
While I can get all 24 sections of itemHeading on the page, when I use the syntax he specified to get my brand information, I get an empty list:

SOURCE CODE:

    public static void main(String[] args) throws Exception {
        try (final WebClient webClient = new WebClient(BrowserVersion.CHROME)) {
            java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(Level.OFF);
            System.setProperty("org.apache.commons.logging.Log", "org.apache.commons.logging.impl.NoOpLog");

            // String url = "http://localhost:8888/vendor.html";
            String url = "http://www.bhphotovideo.com/c/search?Ntt=kas-";
            HtmlPage page = webClient.getPage(url);
            // List<DomNode> nodeProduct = (List<DomNode>) page.getByXPath("//*[@data-selenium='itemDetail']");
            List<DomNode> nodeProduct = (List<DomNode>) page.getByXPath("//*[@data-selenium='itemHeading']");

            if (nodeProduct.size() > 0) {
                for (DomNode e : nodeProduct) {
                    System.out.println("***** Contents of e");
                    System.out.println(e.asXml());
                    System.out.println("***** End Contents of e");
                    List<DomNode> b = (List<DomNode>) page.getByXPath("//span[@itemprop='brand']");
                    System.out.println("--- Contents of b");
                    System.out.println(b);
                    System.out.println("--- End Contents of b");
                }
            }
        }
    }

OUTPUT:

run:
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/258616-REG/Kino_Flo_KAS_D2_C_KAS_D2C_Diva_Lite_200_Travel.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-D2-C Diva-Lite 200 Travel Case </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/1043631-REG/canon_9785b002_cn7x17_kas_s_cine_servo.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Canon </span>
    <span itemprop="name"> CN7x17 KAS S Cine-Servo 17-120mm T2.95 (PL Mount) </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/1043629-REG/canon_9785b001_cn7x17_kas_s_cine_servo.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Canon </span>
    <span itemprop="name"> CN7x17 KAS S Cine-Servo 17-120mm T2.95 (EF Mount) </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/258606-REG/Kino_Flo_KAS_CL6_Compact_Carry_Case_for.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-CL6 6-Lamp Carry Case </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/505449-REG/Kino_Flo_KAS_D42_KAS_D42_Diva_Lite_400_Flight.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-D42 Diva-Lite 400 Wheeled Flight Case - for Two each Kino-Flo Diva-Lite 400 Fixtures, Stands, Mounts, Flozier and Lamp Cases </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/884961-REG/kino_flo_kas_ce2_c_clamshell_travel_case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-CE2-C Clamshell Travel Case (Yellow) </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/429918-REG/Kino_Flo_KAS_24S_KAS_24S_Small_Telescoping_Shipping.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-24S Telescoping Shipping Case, Small - for up to three Kino-Flo 2.0' Fixtures </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/258673-REG/Kino_Flo_KAS_41_KAS_41_Shipping_Case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-41 Telescoping Shipping Case - for 1 Kino-Flo 4' Bank System </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/258605-REG/Kino_Flo_KAS_D4_C_KAS_D4_C_Diva_Lite_400_Travel.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-D4-C Diva-Lite 400 Travel Case - for Kino Flo Diva-Lite 400 Lighting Kit </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/656771-REG/Kino_Flo_KAS_GAF2_KAS_GAF2_Gaffer_Kit_Ship.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-GAF2 Gaffer Kit Ship Case </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/884959-REG/kino_flo_kas_ce2_flight_case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-CE2 Flight Case (Yellow) </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/672726-REG/Kino_Flo_KAS_VH2_KAS_VH2_Vista_Single_Louver.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-VH2 Vista Single Louver Carry Case </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/507360-REG/Kino_Flo_KAS_V31_Y_KAS_V31_Y_Yoke_Shipping_Case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-V31-Y Yoke Shipping Case - for VistaBeam 300 Fluorescent Fixture with Yoke Mount </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/507361-REG/Kino_Flo_KAS_V61_Y_KAS_V61_Y_Yoke_Shipping_Case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-V61-Y Yoke Shipping Case - for VistaBeam 600 Fluorescent Fixture with Yoke Mount </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/884962-REG/kino_flo_kas_ce2_y_yoke_ship_case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-CE2-Y Yoke Ship Case (Black) </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/507295-REG/Kino_Flo_KAS_D22_KAS_D22_Diva_Lite_200_Flight.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-D22 Diva-Lite 200 Flight Case - for Two each Kino-Flo Diva-Lite 200 Fixtures, Stands, Mounts, Flozier and Lamp Cases </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/840408-REG/Kino_Flo_KAS_B4_C_Clamshell_Travel_Case_for.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-B4-C Clamshell Travel Case for One BarFly 400D Kit (Black) </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/580907-REG/Kino_Flo_KAS_B41_KAS_B41_BarFly_400_Ship.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-B41 BarFly 400 Ship Case </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/858258-REG/Kino_Flo_KAS_D4_CS_KAS_D4_CS_Diva_Lite_401_Travel.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-D4-CS Diva-Lite 401 Travel Case </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/656769-REG/Kino_Flo_KAS_INT2_KAS_INT2_Interview_Ship_Case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-INT2 Interview Ship Case </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/656770-REG/Kino_Flo_KAS_INT3_KAS_INT3_Interview_Ship_Case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-INT3 Interview Ship Case </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/434699-REG/Kino_Flo_KAS_V31_KAS_V31_Center_Shipping_Case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-V31 Center Shipping Case - for VistaBeam 300 Fluorescent Fixture with Center Mount </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/434700-REG/Kino_Flo_KAS_V61_KAS_V61_Center_Shipping_Case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-V61 Center Shipping Case - for VistaBeam 600 Fluorescent Fixture with Center Mount </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
***** Contents of e
<h3 data-selenium="itemHeading" class="bold fourteen">
  <a href="http://www.bhphotovideo.com/c/product/434701-REG/Kino_Flo_KAS_V62_KAS_V62_Center_Shipping_Case.html" class="c5" data-selenium="itemHeadingLink" itemprop="url">
    <span itemprop="brand"> Kino Flo </span>
    <span itemprop="name"> KAS-V62 Center Shipping Case - for Two VistaBeam 600 Fluorescent Fixtures with Center Mount </span>
  </a>
</h3>
***** End Contents of e
--- Contents of b
[]
--- End Contents of b
BUILD SUCCESSFUL (total time: 6 seconds)

You can see why I am so confused. I have confirmed in my settings that I am using version 2.20.

Thank you for all your help.

~ Steve

-----
Stephen M. Paulsen
Lowing Light & Grip

> On Mar 4, 2016, at 4:25 PM, Ahmed Ashour <asa...@ya...> wrote:
>
> Hi Stephen,
>
> 'brand' is a descendant of 'itemHeading', not of 'itemDetail'.
>
> The below works with the latest version (with a workaround for the failing JavaScript, a bug report should be created for this).
>
> public static void main(String[] args) throws Exception {
>     try (final WebClient webClient = new WebClient(BrowserVersion.CHROME)) {
>
>         String url = "http://www.bhphotovideo.com/c/search?Ntt=kas-";
>         // Yes, this is a live site. Be nice.
>         HtmlPage page = webClient.getPage(url);
>         List<DomNode> nodeProduct = (List<DomNode>) page.getByXPath("//*[@data-selenium='itemHeading']");
>
>         if (nodeProduct.size() > 0) {
>             for (DomNode e : nodeProduct) {
>                 System.out.println(e.asXml());
>                 List<DomNode> b = (List<DomNode>) page.getByXPath("//span[@itemprop='brand']");
>                 System.out.println(b);
>             }
>         }
>     }
> }
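
A note on Thing 1: one possible way to pull brand, name, and price out of each itemDetail block is to select the blocks first and then evaluate relative XPath expressions (starting with ".//") against each block. In standard XPath 1.0, an expression that starts with "//" is evaluated from the document root no matter which node it is called on, so the leading dot is what scopes the search to one item. The sketch below is not HtmlUnit code: it uses the JDK's built-in javax.xml.xpath on a small, well-formed stand-in for the page, purely so that it is self-contained. The class name RelativeXPathSketch, the helper item(), the SAMPLE markup, and the PRICE-1/PRICE-2 values are all hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RelativeXPathSketch {

    // Hypothetical stand-in for the search page: two itemDetail blocks with
    // the same nesting as in the post. The price values are placeholders.
    static final String SAMPLE =
        "<body>"
        + item("Kino Flo", "KAS-D2-C Diva-Lite 200 Travel Case", "PRICE-1")
        + item("Canon", "CN7x17 KAS S Cine-Servo 17-120mm T2.95 (PL Mount)", "PRICE-2")
        + "</body>";

    // Builds one well-formed itemDetail block (illustration only).
    static String item(String brand, String name, String price) {
        return "<div data-selenium='itemDetail'>"
            + "<div data-selenium='itemInfo-zone'><div data-selenium='itemHeader'>"
            + "<h3 data-selenium='itemHeading'><a href='#'>"
            + "<span itemprop='brand'> " + brand + " </span>"
            + "<span itemprop='name'> " + name + " </span>"
            + "</a></h3></div></div>"
            + "<div data-selenium='conversion-zone'><div data-selenium='price-zone'>"
            + "<div data-selenium='addToCartPrice'><p data-selenium='finalPrice'>"
            + "<span data-selenium='youpayPrice'> ignored </span>"
            + "<span data-selenium='price'> " + price + " </span>"
            + "</p></div></div></div>"
            + "</div>";
    }

    // Returns one "brand | name | price" string per itemDetail block.
    static List<String> extractItems(String xml) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Step 1: select every itemDetail block in the document.
        NodeList items = (NodeList) xpath.evaluate(
            "//*[@data-selenium='itemDetail']",
            new InputSource(new StringReader(xml)),
            XPathConstants.NODESET);

        List<String> rows = new ArrayList<>();
        for (int i = 0; i < items.getLength(); i++) {
            Node item = items.item(i);
            // Step 2: the leading "." makes each lookup relative to this item;
            // normalize-space() trims the padding around the text.
            String brand = xpath.evaluate("normalize-space(.//span[@itemprop='brand'])", item);
            String name = xpath.evaluate("normalize-space(.//span[@itemprop='name'])", item);
            String price = xpath.evaluate("normalize-space(.//span[@data-selenium='price'])", item);
            rows.add(brand + " | " + name + " | " + price);
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        for (String row : extractItems(SAMPLE)) {
            System.out.println(row);
        }
    }
}
```

HtmlUnit's DomNode also has a getByXPath method, so the same relative expressions could presumably be evaluated on each node inside the loop instead of on page. That has not been tested against the live site or against version 2.20.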