
#1921 onClick() freezes indefinitely

2.27
accepted
RBRi
None
1
2017-09-21
2017-09-13
No

Small example. The website is from the Turkish Government.

try (WebClient client = new WebClient()) {
    client.setJavaScriptTimeout(10000);

    HtmlPage page = client.getPage("http://ssd.dhmi.gov.tr/page.aspx?mn=388");
    List<HtmlAnchor> anchors = page.getByXPath("//div[@id='dvPage']/ul/li/a");

    // Airports
    for (HtmlAnchor htmlAnchor : anchors) {
        String name = htmlAnchor.getTextContent();
        System.out.println("[" + name + "] requesting file");
        Page p = htmlAnchor.click();
        System.out.println("[" + name + "] waiting for server response");
        client.waitForBackgroundJavaScript(10000);
        System.out.println("[" + name + "] request acknowledged!");
    }
} catch (IOException e) {
    e.printStackTrace();
}

Notice how it freezes indefinitely(!) when it hits "LTBA", but not on the other entries, which execute the same script.
Also note that setJavaScriptTimeout has no effect.
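
Until the cause is found, one possible way to keep the loop from hanging is to run each click on a worker thread with a hard deadline. The following is only a sketch using java.util.concurrent: the class ClickWatchdog and the helper callWithTimeout are hypothetical names, and the lambdas in main stand in for htmlAnchor.click(). Cancelling only interrupts the worker, so a CPU-bound call may keep running in the background after the timeout.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ClickWatchdog {

    /** Runs the task on a worker thread; returns null if it does not finish in time. */
    public static <T> T callWithTimeout(Callable<T> task, long timeoutMillis) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "click-watchdog");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // only interrupts; CPU-bound work may ignore it
            return null;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-ins for htmlAnchor.click(): one fast call, one that hangs.
        String fast = callWithTimeout(() -> "done", 1000);
        String slow = callWithTimeout(() -> { Thread.sleep(60_000); return "never"; }, 200);
        System.out.println(fast + " / " + slow);
    }
}
```

In the loop above, the call would look roughly like callWithTimeout(() -> htmlAnchor.click(), 10000), with a null result treated as "gave up on this entry".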

Discussion

  • RBRi

    RBRi - 2017-09-13
    • status: open --> accepted
    • assigned_to: RBRi
     
  • RBRi

    RBRi - 2017-09-13

    OK, I'm able to reproduce your problem. The reason is that the document is returned as XML. HtmlUnit seems to parse it, and this parsing does not scale. For large documents the parser runs a bit longer :-)
    Will have a look...

     
  • RBRi

    RBRi - 2017-09-17

    Did some more analysis. This looks really strange, because parsing the single document is fast. Maybe there is a memory leak somewhere.

     
  • TMIndustries

    TMIndustries - 2017-09-21

    Thank you for looking into this.
    I also tried some in-depth parsing with specialised DOM and SAX parsers for those file types, and even though the files can be immensely large, it takes less than a second for me.

    Following up on your first answer: replacing the PageCreator with one that simply returns a TextPage for text/xml solves the problem right away.
    There is no need to parse a file that should have been downloaded anyway, as a normal browser does in this situation.
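
A minimal sketch of what such a replacement might look like, assuming the HtmlUnit 2.x package names (com.gargoylesoftware.htmlunit) and that DefaultPageCreator is used as the base class. The class name XmlAsTextPageCreator and the helper isXml are hypothetical, and this is not necessarily the exact code used here.

```java
import java.io.IOException;

import com.gargoylesoftware.htmlunit.DefaultPageCreator;
import com.gargoylesoftware.htmlunit.Page;
import com.gargoylesoftware.htmlunit.WebResponse;
import com.gargoylesoftware.htmlunit.WebWindow;

// Hypothetical class: returns text/xml responses as a TextPage instead of parsing them.
public class XmlAsTextPageCreator extends DefaultPageCreator {

    @Override
    public Page createPage(WebResponse webResponse, WebWindow webWindow) throws IOException {
        if (isXml(webResponse.getContentType())) {
            // Skip XML parsing and hand back the raw response as plain text.
            return createTextPage(webResponse, webWindow);
        }
        // Everything else keeps the default behaviour.
        return super.createPage(webResponse, webWindow);
    }

    // Assumption: only the exact "text/xml" content type is rerouted.
    static boolean isXml(String contentType) {
        return "text/xml".equalsIgnoreCase(contentType);
    }
}
```

It would be installed once on the client before loading the page, for example with client.setPageCreator(new XmlAsTextPageCreator()).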

    This solves the problem for me, at least, even though it does not explain the original issue.
    I will leave that to you now.

    Thank you for your response and commitment.
    Best regards

     
