From: gaurab.pradhan <mer...@gm...> - 2015-08-03 11:26:16
Dear Ahmed,

Thanks for your reply. I am trying to parse a site that relies on JavaScript and AJAX. First I need to log in, and then, within the same login session, I have to parse more than 300 links and download the required information in a loop. I have used HtmlUnit for this project and it worked fine for the last 2 years, but now I am getting an out-of-memory exception. I am currently using HtmlUnit 2.17.

My case is as follows:

    public class Test {

        static WebClient client;

        static {
            LogFactory.getFactory().setAttribute("org.apache.commons.logging.Log",
                    "org.apache.commons.logging.impl.NoOpLog");
            java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(Level.OFF);
            java.util.logging.Logger.getLogger("org.apache.commons.httpclient").setLevel(Level.OFF);
            java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(Level.OFF);
            client = new WebClient(BrowserVersion.FIREFOX_38);
            include(); // this method contains client.waitForBackgroundJavaScript(10000); etc etc
        }

        public boolean login() throws IOException, InterruptedException {
            HtmlPage page1 = client.getPage(loginUrl);
            Thread.sleep(5000);
            HtmlForm form = page1.getFirstByXPath("//form");
            form.getInputByName("username").setValueAttribute("username");
            form.getInputByName("password").setValueAttribute("password");
            HtmlPage homePage = form.getInputByValue("login").click();
            homePage = null;
            client.close();
            return true;
        }

        public void parseData(String url) {
            try {
                while (flag) {
                    HtmlPage homePage = client.getPage(url);
                    for () {
                        // some process and again I have to use same client to access sub links.
                    }
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
            closeAll();
        }
    }

I found that webClient is taking up a lot of memory:
<http://htmlunit.10904.n7.nabble.com/file/n36833/abc.jpg>

Thanks,
Gaurab Pradhan
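
To make the intended flow easier to discuss (log in once, then visit every link with the same session and let go of each page before moving on), here is a minimal sketch using plain Java stand-ins only. `FakeSession`, `FakePage`, `crawl` and the `link-N` strings are hypothetical names invented for illustration; they are not HtmlUnit classes or methods, and which HtmlUnit calls would actually release a page is left open here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: plain-Java stand-ins, NOT HtmlUnit classes.
class SessionLoopSketch {

    /** Hypothetical page handle that must be released after use. */
    static final class FakePage implements AutoCloseable {
        private final FakeSession owner;
        final String url;

        FakePage(FakeSession owner, String url) {
            this.owner = owner;
            this.url = url;
            owner.open.add(this); // the session holds a reference until close()
        }

        @Override
        public void close() {
            owner.open.remove(this); // drop the session's reference to this page
        }
    }

    /** Hypothetical logged-in session, created once and reused for every link. */
    static final class FakeSession implements AutoCloseable {
        private final List<FakePage> open = new ArrayList<>();
        private boolean closed;

        FakePage getPage(String url) {
            if (closed) {
                throw new IllegalStateException("session already closed");
            }
            return new FakePage(this, url);
        }

        int openPages() {
            return open.size();
        }

        @Override
        public void close() {
            open.clear();
            closed = true;
        }
    }

    /** Visits every link with the same session, releasing each page per iteration. */
    static int crawl(FakeSession session, List<String> urls) {
        int processed = 0;
        for (String url : urls) {
            try (FakePage page = session.getPage(url)) {
                // extract the required information from 'page' here
                processed++;
            } // page.close() runs here, even if the extraction throws
        }
        return processed;
    }

    public static void main(String[] args) {
        // The session stays open until all links are done, then is closed once.
        try (FakeSession session = new FakeSession()) {
            int n = crawl(session, Arrays.asList("link-1", "link-2", "link-3"));
            System.out.println(n + " pages processed, " + session.openPages() + " still open");
        }
    }
}
```

In this sketch the session is closed only after the last link, and the number of pages it still references is back to zero after every iteration, so memory held per page does not accumulate across the loop.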