Dear Ahmed,
Thanks for your reply.
I am trying to parse a site that relies on JavaScript and AJAX.
First I need to log in, and then, using the same login session, I have to visit
more than 300 links in a loop and download the required information from each.
I have used HtmlUnit for this project and it worked fine for the last 2
years, but now I am getting an out-of-memory error (OutOfMemoryError: Java heap space).
I am currently using HtmlUnit 2.17.
My case is as follows:
import java.io.IOException;
import java.util.logging.Level;

import org.apache.commons.logging.LogFactory;

import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlForm;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

// loginUrl, flag, include() and closeAll() are defined elsewhere (omitted here).
public class Test {
    // One shared client so that every request uses the same login session.
    static WebClient client;

    static {
        // Silence HtmlUnit and HttpClient logging.
        LogFactory.getFactory().setAttribute("org.apache.commons.logging.Log",
                "org.apache.commons.logging.impl.NoOpLog");
        java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(Level.OFF);
        java.util.logging.Logger.getLogger("org.apache.commons.httpclient").setLevel(Level.OFF);
        java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(Level.OFF);
        client = new WebClient(BrowserVersion.FIREFOX_38);
        // This method contains client.waitForBackgroundJavaScript(10000); etc.
        include();
    }

    public boolean login() throws IOException, InterruptedException {
        HtmlPage page1 = client.getPage(loginUrl);
        Thread.sleep(5000);
        HtmlForm form = page1.getFirstByXPath("//form");
        form.getInputByName("username").setValueAttribute("username");
        form.getInputByName("password").setValueAttribute("password");
        HtmlPage homePage = form.getInputByValue("login").click();
        homePage = null;
        // Note: the client is closed here but reused in parseData().
        client.close();
        return true;
    }

    public void parseData(String url) {
        try {
            while (flag) {
                HtmlPage homePage = client.getPage(url);
                for (...) { // loop header omitted
                    // Some processing; the same client is used again
                    // to access the sub links.
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        closeAll();
    }
}
I found that the WebClient is occupying most of the memory:
<http://htmlunit.10904.n7.nabble.com/file/n36833/abc.jpg>
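In case it helps to reproduce the numbers, heap usage can also be logged after each link with plain java.lang.Runtime calls. The following is only a minimal sketch: HeapProbe and its methods are hypothetical names that are not part of HtmlUnit, and the byte arrays in main() merely stand in for whatever the real loop keeps alive.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative heap-usage probe; not part of HtmlUnit. */
final class HeapProbe {
    private static final long MB = 1024L * 1024L;

    /** Bytes currently in use on the heap (allocated minus free), as reported by the JVM. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Builds a one-line report of used and maximum heap for the given label. */
    static String report(String label) {
        long maxBytes = Runtime.getRuntime().maxMemory();
        return String.format("%s: used=%d MB, max=%d MB",
                label, usedBytes() / MB, maxBytes / MB);
    }

    public static void main(String[] args) {
        // Stand-in for objects that stay reachable after each link is processed.
        List<byte[]> retained = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            retained.add(new byte[(int) MB]);
            System.out.println(report("after link " + i));
        }
    }
}
```

Calling HeapProbe.report(...) after each client.getPage(url) in parseData() should show whether the used heap keeps growing from link to link.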
Thanks,
Gaurab Pradhan
--
View this message in context: http://htmlunit.10904.n7.nabble.com/OutOfMemoryError-Java-heap-space-tp36797p36833.html
Sent from the HtmlUnit - General mailing list archive at Nabble.com.