#1638 WebClient.closeAllWindows() does not terminate runaway Javascript thread

Milestone: 2.15
Status: closed
Owner: RBRi
Labels: None
Priority: 1
Updated: 2014-09-26
Created: 2014-08-21
Private: No

Unfortunately, I cannot supply details about the specific website involved: my server periodically processes a number of websites via HtmlUnit, and the problem is not 100% reproducible.

What I do know is this: my code takes web clients from a pool, following this pattern:

try
{
    webClient = getWebClient(); // gets a WebClient from a pool
    ... // initialise web client and do some work with it
}
finally
{
    webClient.closeAllWindows(); // release all the web client's resources
    releaseWebClient(webClient); // release the web client back into the pool
}
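
One detail of the pattern above: if getWebClient() throws, webClient is still unset when the finally block runs. A minimal sketch of a null-safe variant follows, using JDK classes only. `SimplePool`, `use` and `idleCount` are hypothetical names for illustration and not part of HtmlUnit; the `cleanup` callback stands in for closeAllWindows().

```java
import java.util.Collection;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical pool; stands in for getWebClient() / releaseWebClient(). */
final class SimplePool<T> {
    private final BlockingQueue<T> idle;
    private final Consumer<T> cleanup; // stands in for closeAllWindows()

    SimplePool(Collection<T> resources, Consumer<T> cleanup) {
        this.idle = new ArrayBlockingQueue<>(Math.max(1, resources.size()), false, resources);
        this.cleanup = cleanup;
    }

    /** Borrows a resource, runs the work, then always cleans up and returns it. */
    <R> R use(Function<T, R> work) {
        T resource = null;
        try {
            resource = idle.poll();
            if (resource == null) {
                throw new IllegalStateException("pool exhausted");
            }
            return work.apply(resource);
        } finally {
            // Only touch the resource if it was actually acquired.
            if (resource != null) {
                try {
                    cleanup.accept(resource);
                } finally {
                    // Return it to the pool even if the cleanup itself throws.
                    idle.add(resource);
                }
            }
        }
    }

    int idleCount() {
        return idle.size();
    }
}
```

Note that this only protects against a failed acquisition or an exception during cleanup. If the cleanup call never returns, as described below, the resource still never gets back into the pool.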

In this one instance, the closeAllWindows() method never returned. I was able to get a thread dump, and the thread that called closeAllWindows() was blocked with the following stack trace:

"Thread-11" Id=31 BLOCKED on com.gargoylesoftware.htmlunit.html.HtmlPage@f7e2eb owned by "JS executor for com.gargoylesoftware.htmlunit.WebClient@116ffed" Id=1225
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:686)
-  blocked on com.gargoylesoftware.htmlunit.html.HtmlPage@f7e2eb
at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:620)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:513)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:637)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:612)
at com.gargoylesoftware.htmlunit.html.HtmlPage.executeJavaScriptFunctionIfPossible(HtmlPage.java:1001)
at com.gargoylesoftware.htmlunit.javascript.host.EventListenersContainer.executeEventListeners(EventListenersContainer.java:179)
at com.gargoylesoftware.htmlunit.javascript.host.EventListenersContainer.executeBubblingListeners(EventListenersContainer.java:239)
at com.gargoylesoftware.htmlunit.javascript.host.Node.fireEvent(Node.java:824)
at com.gargoylesoftware.htmlunit.javascript.host.Node.fireEvent(Node.java:748)
at com.gargoylesoftware.htmlunit.html.HtmlElement$1.run(HtmlElement.java:920)
at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:620)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:513)
at com.gargoylesoftware.htmlunit.html.HtmlElement.fireEvent(HtmlElement.java:925)
at com.gargoylesoftware.htmlunit.html.HtmlPage.executeEventHandlersIfNeeded(HtmlPage.java:1298)
at com.gargoylesoftware.htmlunit.html.HtmlPage.cleanUp(HtmlPage.java:328)
at com.gargoylesoftware.htmlunit.WebWindowImpl.removeChildWindow(WebWindowImpl.java:211)
at com.gargoylesoftware.htmlunit.WebWindowImpl.destroyChildren(WebWindowImpl.java:193)
at com.gargoylesoftware.htmlunit.TopLevelWindow.close(TopLevelWindow.java:125)
at com.gargoylesoftware.htmlunit.WebClient.closeAllWindows(WebClient.java:1748)

...
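
The dump above shows the calling thread BLOCKED on the HtmlPage monitor while the JS executor thread owns it. The following is a minimal sketch of that situation with plain JDK threads. `MonitorBlockDemo` and everything in it are hypothetical names, and a plain Object stands in for the HtmlPage instance. It also illustrates that interrupting a thread does not release it from waiting to enter a synchronized block.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo; a plain Object stands in for the HtmlPage monitor. */
final class MonitorBlockDemo {

    /** Returns the caller thread's state before and after it is interrupted. */
    static Thread.State[] observe() {
        final Object page = new Object();
        final CountDownLatch owned = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        // Plays the JS executor: takes the monitor and keeps it until released.
        Thread owner = new Thread(() -> {
            synchronized (page) {
                owned.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "JS executor (demo)");

        // Plays the thread calling closeAllWindows(): needs the same monitor.
        Thread caller = new Thread(() -> {
            synchronized (page) {
                // page clean-up would run here
            }
        }, "caller (demo)");

        owner.setDaemon(true);
        caller.setDaemon(true);
        try {
            owner.start();
            owned.await();
            caller.start();
            // Wait (bounded) until the caller is parked on the monitor.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (caller.getState() != Thread.State.BLOCKED
                    && deadline - System.nanoTime() > 0) {
                Thread.sleep(5);
            }
            Thread.State before = caller.getState();
            caller.interrupt(); // does not end a wait for a monitor
            Thread.sleep(50);
            Thread.State after = caller.getState();
            release.countDown(); // only the owner letting go unblocks the caller
            owner.join();
            caller.join();
            return new Thread.State[] {before, after};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("demo interrupted", e);
        }
    }
}
```

Calling `MonitorBlockDemo.observe()` returns the two observed states. In the ticket's case the owner never lets go, so the caller stays blocked indefinitely.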

Checking thread 1225 at different times yielded different stack traces similar to the following:

"JS executor for com.gargoylesoftware.htmlunit.WebClient@116ffed" Id=1225 RUNNABLE
at java.lang.Throwable.fillInStackTrace(Native Method)
at java.lang.Throwable.<init>(Throwable.java:213)
at java.lang.Exception.<init>(Exception.java:58)
at java.lang.RuntimeException.<init>(RuntimeException.java:60)
at com.gargoylesoftware.htmlunit.ElementNotFoundException.<init>(ElementNotFoundException.java:38)
at com.gargoylesoftware.htmlunit.html.HtmlPage.getElementById(HtmlPage.java:1729)
at com.gargoylesoftware.htmlunit.html.HtmlPage.getHtmlElementById(HtmlPage.java:1679)
at com.gargoylesoftware.htmlunit.javascript.host.Window.getWithFallback(Window.java:1362)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$FallbackCaller.get(JavaScriptEngine.java:843)
at net.sourceforge.htmlunit.corejs.javascript.ScriptableObject.getProperty(ScriptableObject.java:2296)
at com.gargoylesoftware.htmlunit.javascript.host.Window.call(Window.java:1324)
at net.sourceforge.htmlunit.corejs.javascript.Delegator.call(Delegator.java:220)
at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpretLoop(Interpreter.java:1531)
at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpret(Interpreter.java:798)
at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call(InterpretedFunction.java:105)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.doTopCall(ContextFactory.java:411)
at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory.doTopCall(HtmlUnitContextFactory.java:309)
at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:3057)
at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call(InterpretedFunction.java:103)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$4.doRun(JavaScriptEngine.java:630)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:690)
at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:620)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:513)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:637)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:612)
at com.gargoylesoftware.htmlunit.html.HtmlPage.executeJavaScriptFunctionIfPossible(HtmlPage.java:1001)
at com.gargoylesoftware.htmlunit.javascript.background.JavaScriptFunctionJob.runJavaScript(JavaScriptFunctionJob.java:53)
at com.gargoylesoftware.htmlunit.javascript.background.JavaScriptExecutionJob.run(JavaScriptExecutionJob.java:102)
at com.gargoylesoftware.htmlunit.javascript.background.JavaScriptJobManagerImpl.runSingleJob(JavaScriptJobManagerImpl.java:328)
at com.gargoylesoftware.htmlunit.javascript.background.DefaultJavaScriptExecutor.run(DefaultJavaScriptExecutor.java:162)
at java.lang.Thread.run(Thread.java:679)

This ran for a few days with CPU maxed out at 100%, so it seems evident that this Javascript thread was executing some bad Javascript code that never exited.

Looking through the HtmlUnit API, I can't see any setting that would impose a timeout or otherwise kill errant JavaScript processing. I can see code in closeAllWindows() that attempts to shut down the JavaScript engine, but at least in this instance that did not happen, which makes me think it is not implemented quite correctly.
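
As a caller-side stopgap, the cleanup call can be run on a helper thread with a deadline, so that the pool's worker thread is not stuck forever. This is a minimal sketch under stated assumptions: `BoundedCall` and `runWithTimeout` are hypothetical names, only JDK classes are used, and the Runnable stands in for `webClient.closeAllWindows()`. It does not stop the runaway script. A helper thread that is blocked on a monitor will not react to the interrupt sent by `cancel(true)` and may stay stuck; the sketch only lets the caller continue, for example to discard the client instead of returning it to the pool.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper, not part of HtmlUnit. */
final class BoundedCall {

    /** Runs the task on a daemon thread; returns true if it finished in time. */
    static boolean runWithTimeout(Runnable task, long timeout, TimeUnit unit) {
        ExecutorService helper = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-call");
            t.setDaemon(true); // a stuck helper must not keep the JVM alive
            return t;
        });
        Future<?> future = helper.submit(task);
        try {
            future.get(timeout, unit);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // best effort: interrupts the helper thread
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return false;
        } finally {
            helper.shutdownNow();
        }
    }
}
```

A caller could then write something like `if (!BoundedCall.runWithTimeout(webClient::closeAllWindows, 30, TimeUnit.SECONDS)) { /* drop this client */ }`, where the 30-second limit is an arbitrary example value.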

Discussion

  • Melloware Inc

    Melloware Inc - 2014-09-05

    See test case below...

     

    Last edit: Melloware Inc 2014-09-11
  • Melloware Inc

    Melloware Inc - 2014-09-10

    OK, I think I have narrowed down this issue and created a reproducible test case. I borrowed some code from your test classes to check whether the JavaScript thread is still running, and I have verified that it is in my specific test case. Of all my screen scrapers, this is the only one that leaves its JS thread stuck, which causes out-of-memory errors when the collector runs over and over again.

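       import java.util.ArrayList;
       import java.util.HashMap;
       import java.util.List;
       import java.util.Map;

       import com.gargoylesoftware.htmlunit.BrowserVersion;
       import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
       import com.gargoylesoftware.htmlunit.SilentCssErrorHandler;
       import com.gargoylesoftware.htmlunit.ThreadedRefreshHandler;
       import com.gargoylesoftware.htmlunit.WebClient;
       import com.gargoylesoftware.htmlunit.html.HtmlPage;

       public void testJavascriptMemoryLeak() throws Exception {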
          WebClient webClient = new WebClient(BrowserVersion.FIREFOX_24);
          webClient.getOptions().setAppletEnabled(false);
          webClient.getOptions().setCssEnabled(false);
          webClient.getOptions().setJavaScriptEnabled(true);
          webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
          webClient.getOptions().setThrowExceptionOnScriptError(false);
          webClient.getOptions().setTimeout(60000);
          webClient.getOptions().setUseInsecureSSL(true);
          webClient.setAjaxController(new NicelyResynchronizingAjaxController());
          webClient.setCssErrorHandler(new SilentCssErrorHandler());
          webClient.setHTMLParserListener(null);
          webClient.setJavaScriptErrorListener(null);
          webClient.setJavaScriptTimeout(30000);
          webClient.setRefreshHandler(new ThreadedRefreshHandler());
    
          try {
             final HtmlPage htmlPage = webClient
                      .getPage("https://www.dom.com/storm-center/dominion-electric-outage-summary.jsp");
             String pageXml = htmlPage.asXml();
             System.out.println(pageXml);
          } finally {
             // clean up HTML screen scraper resources
             webClient.closeAllWindows();
             webClient.getCookieManager().clearCookies();
          }
    
          webClient = null;
    
          // check for JavaScript threads that are still running; the test fails if any are found
          final List<Thread> jsThreads = getJavaScriptThreads();
          // collect stack traces
          // caution: a thread may terminate after getJavaScriptThreads() has returned it
          // and before its stack trace is retrieved
          if (jsThreads.size() > 0) {
             final Map<String, StackTraceElement[]> stackTraces = new HashMap<String, StackTraceElement[]>();
             for (final Thread t : jsThreads) {
                final StackTraceElement[] elts = t.getStackTrace();
                // getStackTrace() returns an empty array, never null, for a terminated thread
                if (elts.length > 0) {
                   stackTraces.put(t.getName(), elts);
                }
             }
    
             if (!stackTraces.isEmpty()) {
                System.err.println("JS threads still running:");
                for (final Map.Entry<String, StackTraceElement[]> entry : stackTraces.entrySet()) {
                   System.err.println("Thread: " + entry.getKey());
                   final StackTraceElement elts[] = entry.getValue();
                   for (final StackTraceElement elt : elts) {
                      System.err.println(elt);
                   }
                }
                throw new RuntimeException("JS threads are still running: " + jsThreads.size());
             }
          }
       }
    
       protected List<Thread> getJavaScriptThreads() {
          final Thread[] threads = new Thread[Thread.activeCount() + 10];
          Thread.enumerate(threads);
          final List<Thread> jsThreads = new ArrayList<Thread>();
          for (final Thread t : threads) {
             if (t != null && t.getName().startsWith("JS executor for")) {
                jsThreads.add(t);
             }
          }
    
          return jsThreads;
       }
    
     
  • RBRi

    RBRi - 2014-09-17
    • status: open --> accepted
    • assigned_to: RBRi
     
  • RBRi

    RBRi - 2014-09-17

    analyzing...

     
  • RBRi

    RBRi - 2014-09-18

    Thanks a lot for the test case. It looks like the cause is some kind of onunload processing that restarts the JS thread after the thread was closed.
    I have changed the implementation and added a simpler unit test.

    Please try the next build and report back if this helps.

     
  • RBRi

    RBRi - 2014-09-18
    • status: accepted --> pending
     
  • Melloware Inc

    Melloware Inc - 2014-09-19

    Thanks for getting to this. I will check it out!

     
  • Melloware Inc

    Melloware Inc - 2014-09-23

    Looks good to me. Thanks for fixing. I can't speak for everyone, but it definitely closed my issue.

     
  • RBRi

    RBRi - 2014-09-26

    Thanks for reporting back. At least we have two more unit tests now, so I will close this for now.

     
  • RBRi

    RBRi - 2014-09-26
    • status: pending --> closed
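
The thread check in the test case above uses Thread.enumerate(), which only covers the current thread group and its subgroups and silently drops threads that do not fit into the supplied array. A possible alternative, sketched here with JDK calls only, is Thread.getAllStackTraces(), which returns a snapshot of all live threads together with their stack traces. `JsThreadProbe`, `findByNamePrefix` and `describe` are hypothetical names; the prefix "JS executor for" is taken from the thread names in the dumps above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for spotting leftover JS executor threads. */
final class JsThreadProbe {

    /** Returns live threads whose name starts with the prefix, with their stack traces. */
    static Map<Thread, StackTraceElement[]> findByNamePrefix(String prefix) {
        Map<Thread, StackTraceElement[]> matches = new LinkedHashMap<>();
        // getAllStackTraces() is a snapshot: a thread may exit right after it is taken.
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            if (thread.getName().startsWith(prefix)) {
                matches.put(thread, entry.getValue());
            }
        }
        return matches;
    }

    /** Formats the matches as one thread name per block, followed by its frames. */
    static String describe(Map<Thread, StackTraceElement[]> matches) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : matches.entrySet()) {
            out.append("Thread: ").append(entry.getKey().getName()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }
}
```

In a test, `JsThreadProbe.findByNamePrefix("JS executor for")` could be checked for emptiness after closeAllWindows() returns, with `describe` used for the failure message.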
     

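
RBRi's diagnosis in the discussion was that onunload processing restarted the JS thread after it had been closed. One general way to guard against that kind of late restart is a one-way shutdown flag that is checked before a worker thread is started. The sketch below illustrates the idea with JDK classes only; it is not the actual HtmlUnit change, and `GuardedWorker`, `startIfAllowed`, `shutdown` and `isRunning` are hypothetical names.

```java
/** Hypothetical illustration of a worker that cannot be restarted after shutdown. */
final class GuardedWorker {
    private final Object lock = new Object();
    private boolean shutDown; // one-way flag: set once, never reset
    private Thread worker;

    /** Starts the worker unless shutdown() was already called. */
    boolean startIfAllowed(Runnable loop) {
        synchronized (lock) {
            if (shutDown) {
                return false; // a late request (e.g. from an unload handler) is refused
            }
            if (worker == null || !worker.isAlive()) {
                worker = new Thread(loop, "guarded-worker");
                worker.setDaemon(true);
                worker.start();
            }
            return true;
        }
    }

    /** Marks the worker as shut down and interrupts it if it is running. */
    void shutdown() {
        Thread toStop;
        synchronized (lock) {
            shutDown = true;
            toStop = worker;
            worker = null;
        }
        if (toStop != null) {
            toStop.interrupt();
        }
    }

    boolean isRunning() {
        synchronized (lock) {
            return worker != null && worker.isAlive();
        }
    }
}
```

The design choice here is that the flag and the thread reference are changed under the same lock, so a start request cannot slip in between the shutdown decision and the thread being dropped.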