From: Gaurab P. <gau...@gm...> - 2013-12-20 04:12:17
Hello all,

I am new to HtmlUnit and am using HtmlUnit 2.13 to crawl a web site. I can handle JavaScript pages easily with HtmlUnit, but for the past couple of days I have been having a problem. I am currently trying to crawl some information from a web page that is built with Ajax and other techniques, and I am not getting the desired pages after clicking an anchor tag.

On the web page, the link is embedded in the following way:

    <a id="bidSearchForm:linksBidSearchResults:0:j_idt108" href="#" class="ui-commandlink" onclick="PrimeFaces.ab({source:'bidSearchForm:linksBidSearchResults:0:j_idt108'});return false;">

There are multiple links of this type, and I need to follow each of them.

Here is my code:

    WebClient client = new WebClient(BrowserVersion.CHROME); // I have tried with FIREFOX_17 as well
    LogFactory.getFactory().setAttribute("org.apache.commons.logging.Log", "org.apache.commons.logging.impl.NoOpLog");
    java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(Level.OFF);
    java.util.logging.Logger.getLogger("org.apache.commons.httpclient").setLevel(Level.OFF);
    client.setAjaxController(new NicelyResynchronizingAjaxController());
    client.waitForBackgroundJavaScript(10000);
    client.waitForBackgroundJavaScriptStartingBefore(10000);
    client.getOptions().setCssEnabled(true);
    client.setCssErrorHandler(new SilentCssErrorHandler());
    client.getOptions().setThrowExceptionOnFailingStatusCode(false);
    client.getOptions().setThrowExceptionOnScriptError(false);
    client.getOptions().setRedirectEnabled(true);
    client.getOptions().setAppletEnabled(false);
    client.getOptions().setJavaScriptEnabled(true);
    client.getOptions().setPopupBlockerEnabled(true);
    client.getOptions().setTimeout(5000);
    client.getOptions().setPrintContentOnFailingStatusCode(false);
    client.getOptions().setUseInsecureSSL(true);

    HtmlPage homePage = client.getPage("SomeURL");
    synchronized (homePage) {
        homePage.wait(5000);
    }

    List<HtmlAnchor> anchors = new ArrayList<HtmlAnchor>();
    anchors = homePage.getAnchors();
    HtmlPage tempPage = null;
    for (int i = 0; i < anchors.size(); i++) {
        tempPage = anchors.get(i).click();
        String secondLink = homePage.getUrl().toString();
        System.out.println("LINK TO GET DOCUMENTS = " + secondLink);
        client.waitForBackgroundJavaScript(10000);
        client.waitForBackgroundJavaScriptStartingBefore(10000);
    }

--
Best Regards,
Gaurab Pradhan
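
A note on the waiting in the code above: the onclick handler calls PrimeFaces.ab(...) and then returns false, which suggests the click triggers an in-place Ajax update rather than a navigation, so the URL may well stay the same after click(). In that situation, polling for a visible change can be more reliable than a fixed wait. What follows is only a minimal sketch of such polling in plain Java. PollingWait and waitUntil are hypothetical names, not part of HtmlUnit, and the condition used in main is a stand-in for "the Ajax update has arrived".

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper (not an HtmlUnit class): polls a condition until it
// holds or a timeout expires, instead of sleeping for a fixed time.
final class PollingWait {
    private PollingWait() {}

    // Returns true as soon as the condition holds, or false if the timeout
    // elapses first or the waiting thread is interrupted.
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            // Overflow-safe comparison recommended for System.nanoTime().
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in condition: becomes true roughly 200 ms after start.
        boolean updated = waitUntil(
                () -> System.currentTimeMillis() - start >= 200, 2000, 50);
        System.out.println("updated = " + updated);
    }
}
```

With HtmlUnit, the condition could for example compare the page's asXml() output against a snapshot taken just before click(). That pairing is an assumption and has not been tried against this site.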