From: Ahmed A. <asa...@ya...> - 2015-09-30 13:20:05
Hi,

You need to get a reference to WebClient first:

    {
        // Read the driver's private "webClient" field via reflection
        WebClient webClient = (WebClient) get(driver, "webClient");
        HtmlPage htmlPage = (HtmlPage) webClient.getTopLevelWindows().get(0).getEnclosedPage();
    }

    @SuppressWarnings("unchecked")
    private static <T> T get(final Object o, final String fieldName) throws Exception {
        final Field field = o.getClass().getDeclaredField(fieldName);
        field.setAccessible(true);
        return (T) field.get(o);
    }

Ahmed

From: "htm...@li..." <htm...@li...>
To: htm...@li...
Sent: Wednesday, September 30, 2015 2:07 PM
Subject: Auto-discard notification

----- Forwarded Message -----
The attached message has been automatically discarded.

Maybe a stupid question, but how do I combine your two suggestions?

    htmlPage.save(File);

and

    lSeleniumDriver.get(parsedArgs.getWebUrl());

which is intercepted by your suggested code (and working well):

    WebDriver driver = new HtmlUnitDriver(BrowserVersion.CHROME) {
        @Override
        protected WebClient newWebClient(BrowserVersion version) {
            WebClient webClient = super.newWebClient(version);
            // Wrap the connection so that every request passes through getResponse()
            new WebConnectionWrapper(webClient) {
                @Override
                public WebResponse getResponse(WebRequest request) throws IOException {
                    long time = System.currentTimeMillis();
                    WebResponse response = super.getResponse(request);
                    String url = request.getUrl().toExternalForm();
                    String content = response.getContentAsString();
                    // Elapsed time in milliseconds (end minus start)
                    long duration = System.currentTimeMillis() - time;
                    return response;
                }
            };
            return webClient;
        }
    };

I can't see the forest for the trees.

On 30.09.2015 09:12, Ahmed Ashour wrote:

Hi,

You can also use htmlPage.save(File);

Ahmed

From: "htm...@li..." <htm...@li...>
To: htm...@li...
Sent: Wednesday, September 30, 2015 8:00 AM
Subject: Auto-discard notification

----- Forwarded Message -----
The attached message has been automatically discarded.

Hello Ahmed,

thanks for the details. The point is, I do not know the images in advance, and for my requirement it does not matter how many images are downloaded. I just need all images to be downloaded (to measure the size, the number of requests, and the download time).

The images are referenced either by HTML, by JS, or by CSS, so I also cannot say which source they come from. In your example I need to name all images, so I would have to know them all before downloading. Have I understood that correctly?

Is there a way to say: just download the page as a classical browser would (with all the files a classical browser would download)?

Best wishes
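
The measurement asked for above (number of requests, total size, download time) could be collected in a small tally object that the getResponse() override in the connection wrapper calls once per request. What follows is a minimal sketch using only the JDK. RequestTally and its method names are hypothetical and not part of HtmlUnit or Selenium. Atomic counters are used on the assumption that requests may arrive from more than one thread.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper (not HtmlUnit API): running totals over all recorded requests. */
final class RequestTally {
    private final AtomicInteger requestCount = new AtomicInteger();
    private final AtomicLong totalBytes = new AtomicLong();
    private final AtomicLong totalMillis = new AtomicLong();

    /**
     * Records one finished request.
     *
     * @param startMillis  timestamp taken just before the request was sent
     * @param endMillis    timestamp taken just after the response arrived
     * @param contentBytes size of the response body in bytes
     */
    void record(long startMillis, long endMillis, long contentBytes) {
        requestCount.incrementAndGet();
        totalBytes.addAndGet(contentBytes);
        totalMillis.addAndGet(endMillis - startMillis);
    }

    int getRequestCount() {
        return requestCount.get();
    }

    long getTotalBytes() {
        return totalBytes.get();
    }

    /** Sum of the individual request durations, not wall-clock time for the page. */
    long getTotalMillis() {
        return totalMillis.get();
    }

    String summary() {
        return getRequestCount() + " requests, " + getTotalBytes() + " bytes, "
                + getTotalMillis() + " ms";
    }
}
```

Inside getResponse() this would amount to taking one timestamp before and one after super.getResponse(request), then passing both to record() together with the response size. How the size is obtained is left open here. One option would be the byte length of the content that the wrapper already reads.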
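
One point about combining the reflection helper get(driver, "webClient") with the anonymous HtmlUnitDriver subclass used in this thread: Class.getDeclaredField only looks at the class it is called on, not at its superclasses. Assuming the webClient field is declared in HtmlUnitDriver itself, as the helper's usage implies, calling it on an anonymous subclass instance would throw NoSuchFieldException. The sketch below is a variant that walks up the class hierarchy. FieldLookup and DemoBase are hypothetical names used only for illustration, and DemoBase merely stands in for the driver class.

```java
import java.lang.reflect.Field;

/** Hypothetical variant of the get() helper that also searches superclasses. */
final class FieldLookup {
    private FieldLookup() {
    }

    @SuppressWarnings("unchecked")
    static <T> T get(final Object o, final String fieldName) throws Exception {
        // Start at the runtime class and move up until the field is found.
        for (Class<?> c = o.getClass(); c != null; c = c.getSuperclass()) {
            try {
                final Field field = c.getDeclaredField(fieldName);
                field.setAccessible(true);
                return (T) field.get(o);
            } catch (NoSuchFieldException notDeclaredHere) {
                // Not declared in this class; continue with the superclass.
            }
        }
        throw new NoSuchFieldException(fieldName);
    }
}

/** Stand-in for a driver class with a private field (illustration only). */
class DemoBase {
    private final String webClient = "client";
}
```

With these definitions, FieldLookup.get(new DemoBase() {}, "webClient") finds the field declared in DemoBase even though the runtime class is an anonymous subclass. The original single-class lookup would fail in that case.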