#1541 HtmlImage and ImageIO leak file descriptor

Milestone: 2.12
Status: closed
Labels: image (1)
Priority: 1
Updated: 2015-01-10
Created: 2013-09-09
Private: No

Hello,

Seems related to #1514.
Images are automatically retrieved by HtmlUnit and read with ImageIO. ImageIO caches images on disk (in the tmp dir) and leaks file descriptors.

It may be related to
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7166379

I have attached a quick patch that overrides finalize() to close and dispose of the ImageReader.
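
To illustrate the kind of cleanup involved, here is a minimal sketch. It is not the attached patch and not HtmlUnit code: the class `ImageHandle` and its methods are hypothetical names, and only standard `javax.imageio` calls are used. It shows a holder that owns an `ImageReader` and its `ImageInputStream` and releases both in one place, which a `finalize()` override could delegate to.

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

/** Hypothetical holder (not HtmlUnit code): owns an ImageReader and its input stream. */
class ImageHandle implements Closeable {
    private ImageInputStream stream;
    private ImageReader reader;

    ImageHandle(byte[] encodedImage) throws IOException {
        // With ImageIO's default settings this stream may be backed by a temp file.
        stream = ImageIO.createImageInputStream(new ByteArrayInputStream(encodedImage));
        if (stream == null) {
            throw new IOException("No ImageInputStream available for this input");
        }
        Iterator<ImageReader> readers = ImageIO.getImageReaders(stream);
        if (!readers.hasNext()) {
            stream.close();
            stream = null;
            throw new IOException("No ImageReader found for this data");
        }
        reader = readers.next();
        reader.setInput(stream);
    }

    int getWidth() throws IOException {
        if (reader == null) {
            throw new IllegalStateException("already closed");
        }
        return reader.getWidth(0);
    }

    boolean isClosed() {
        return reader == null;
    }

    /** Releases the reader and the stream; calling it twice is harmless. */
    @Override
    public void close() throws IOException {
        if (reader != null) {
            reader.setInput(null);
            reader.dispose();
            reader = null;
        }
        if (stream != null) {
            stream.close();
            stream = null;
        }
    }
}
```

Calling `close()` explicitly (or via try-with-resources) releases the resources deterministically; relying only on finalization leaves the timing to the garbage collector.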

Here is a manual test (difficult to automate):

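@Test
public void imageIoLeak() {
    // Point java.io.tmpdir at a known directory so the ImageIO cache files can be inspected
    System.setProperty("java.io.tmpdir", "D:\\temp\\io\\")

    WebClient webClient = Helper.newWebClient()
    final HtmlPage page = webClient.getPage("http://www.pcinpact.com")

    DomNodeList<HtmlImage> elements = page.getElementsByTagName("img")

    // getWidth() makes HtmlUnit download each image and read it with ImageIO
    elements.each {
        try {
            it.getWidth()
        } catch (Exception e) {
            e.printStackTrace()
        }
    }

    webClient.closeAllWindows()

    // Request a GC, then keep the JVM alive so the temp directory can be checked by hand
    System.gc()
    Thread.sleep(100000000)
}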
1 Attachments

Discussion

  • Nicolas Labrot

    Nicolas Labrot - 2013-09-09

    The test body must be enclosed in braces with System.gc() called outside them (or the variables must be set to null), so that the objects are unreachable when the GC runs.

     
  • Ahmed Ashour

    Ahmed Ashour - 2013-09-26
    • status: open --> pending
     
  • Ahmed Ashour

    Ahmed Ashour - 2013-09-26
    • Is this example Java?
    • Does the issue still occur with the latest Java 6 or 7?

    I need to reproduce it to justify adding the patch.

     
  • Nicolas Labrot

    Nicolas Labrot - 2013-09-27

    Here is a runnable test that fails. The issue is reproduced on JDK 7u40.

    Images from the Yahoo website are downloaded (via downloadImageIfNeeded/readImageIfNeeded). The "htmlunit" temp directory then contains "imageioXXX.tmp" files. ImageIO holds file descriptors on these files, so they cannot be deleted until the JVM exits.

    With the patch, the finalize method disposes of the ImageIO image and releases the file descriptor.

    @Test
    public void imageIoLeak() throws IOException {

        // Redirect java.io.tmpdir to a dedicated "htmlunit" directory so the cache files are easy to find
        File tmpDir = new File(System.getProperty("java.io.tmpdir"), "htmlunit");
        tmpDir.mkdirs();
        System.setProperty("java.io.tmpdir", tmpDir.getAbsolutePath());

        WebClient webClient = Helper.newWebClient();
        final HtmlPage page = webClient.getPage("http://www.yahoo.com");

        DomNodeList<DomElement> elements = page.getElementsByTagName("img");

        // getWidth() makes HtmlUnit download each image and read it with ImageIO
        for (DomElement domElement : elements) {
            HtmlImage htmlImage = (HtmlImage) domElement;
            try {
                htmlImage.getWidth();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        webClient.closeAllWindows();

        File[] imgs = tmpDir.listFiles();

        // ImageIO should have written at least one cache file
        Assert.assertTrue(imgs.length > 0);

        // delete() returns false while ImageIO still holds the file open
        for (File file : imgs) {
            Assert.assertTrue(file.delete());
        }

    }
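
The temp-file behaviour described above can be observed without HtmlUnit. The following is a minimal sketch using only standard `javax.imageio` calls; the class and method names (`ImageIoCacheDemo`, `countCacheFiles`) are hypothetical, and it assumes a JDK where `ImageIO` backs `InputStream` sources with a file cache when `useCache` is enabled. It counts the files in a private cache directory while a stream is open and again after it is closed.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import javax.imageio.ImageIO;
import javax.imageio.stream.ImageInputStream;

/** Hypothetical demo class: shows ImageIO's on-disk cache for InputStream sources. */
class ImageIoCacheDemo {

    /** Returns {files in the cache dir while the stream is open, files after close()}. */
    static int[] countCacheFiles(byte[] data) throws IOException {
        File cacheDir = Files.createTempDirectory("imageio-demo").toFile();
        File previousDir = ImageIO.getCacheDirectory();
        boolean previousUseCache = ImageIO.getUseCache();

        // Use a private, initially empty cache directory so the counts are meaningful.
        ImageIO.setCacheDirectory(cacheDir);
        ImageIO.setUseCache(true);
        try {
            ImageInputStream stream =
                    ImageIO.createImageInputStream(new ByteArrayInputStream(data));
            int whileOpen = cacheDir.list().length;

            // Closing the stream is what releases (and removes) the cache file.
            stream.close();
            int afterClose = cacheDir.list().length;

            return new int[] {whileOpen, afterClose};
        } finally {
            // Restore the global ImageIO settings and clean up the directory.
            ImageIO.setCacheDirectory(previousDir);
            ImageIO.setUseCache(previousUseCache);
            cacheDir.delete();
        }
    }
}
```

If the stream is never closed, the cache file stays open for as long as the stream object is reachable, which matches the symptom reported here. Calling `ImageIO.setUseCache(false)` is another option to consider, since it asks ImageIO to cache in memory instead of on disk.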
    
     
  • Ahmed Ashour

    Ahmed Ashour - 2013-10-10
    • status: pending --> open
    • assigned_to: Ahmed Ashour
     
  • Ahmed Ashour

    Ahmed Ashour - 2013-10-10

    I guess you meant "Assert.assertTrue(imgs.length == 0);"

    Even with the cleanup in finalize(), not all images are deleted. I used the code below.

    @all: what do you think?

            // Redirect java.io.tmpdir to a dedicated "htmlunit" directory
            File tmpDir = new File(System.getProperty("java.io.tmpdir"), "htmlunit");
            System.out.println(tmpDir.getAbsolutePath());
            tmpDir.mkdirs();
            System.setProperty("java.io.tmpdir", tmpDir.getAbsolutePath());

            WebClient webClient = new WebClient();
            final HtmlPage page = webClient.getPage("http://www.yahoo.com");

            DomNodeList<DomElement> elements = page.getElementsByTagName("img");
            for (DomElement domElement : elements) {
                HtmlImage htmlImage = (HtmlImage) domElement;
                try {
                    System.out.println("getting image...");
                    htmlImage.getWidth();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }

            webClient.closeAllWindows();
            System.out.println("Sleeping...");

            // Request a single GC, then wait 60 seconds for finalizers to run
            System.gc();
            long toWait = System.currentTimeMillis() + 60 * 1000;
            while (System.currentTimeMillis() < toWait) {
                Thread.sleep(100);
            }
            File[] imgs = tmpDir.listFiles();

            // Expect every cache file to have been removed by now
            Assert.assertTrue(imgs.length == 0);
    
     
  • Ahmed Ashour

    Ahmed Ashour - 2013-10-10
    • status: open --> closed
     
  • Ahmed Ashour

    Ahmed Ashour - 2013-10-10

    HtmlImage.finalize() has been added in SVN.

    However, my tests in Eclipse weren't promising, as finalize() was called for only 5 images.

    Anyhow, as it doesn't hurt, it has been committed.

     
    • Nicolas Labrot

      Nicolas Labrot - 2013-10-10

      GC is hard to predict. In my test I got 0 images, but I call gc more than once.

      Maybe you can try calling it inside the while loop. Anyway, thanks for the addition.
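
The suggestion of calling gc inside the wait loop could be sketched as below. This is an illustration only: `GcPoller` and `awaitWithGc` are hypothetical names, and `System.gc()` is merely a request, so the loop makes the check more likely to pass but cannot guarantee that finalizers have run.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: polls a condition, requesting a GC between checks. */
class GcPoller {

    /**
     * Returns true as soon as the condition holds, or false once the timeout
     * has elapsed without the condition becoming true.
     */
    static boolean awaitWithGc(BooleanSupplier condition, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            // Ask for another collection on every iteration, not just once up front.
            System.gc();
            Thread.sleep(100);
        }
    }
}
```

In the test above, the fixed 60-second wait could then become something like `GcPoller.awaitWithGc(() -> tmpDir.listFiles().length == 0, 60 * 1000)`, which also returns early once the directory is empty.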

       
