
#1615 Error parsing page "Please update to a modern browser for the best Los Angeles Times viewing experience."

Group: Latest SVN
Status: closed
Owner: RBRi
Labels: None
Priority: 1
Updated: 2015-11-11
Created: 2014-06-09
Creator: Kunal Singh
Private: No

I am using HtmlUnit for text extraction:

    final WebClient webClient = new WebClient();
    final HtmlPage page = webClient.getPage("http://www.latimes.com/topic/politics/government/barack-obama-PEPLT007408-topic.html");
    final String pageAsText = page.asText();
    System.out.println(pageAsText);

It gives me this error:

Exception in thread "main" ======= EXCEPTION START ========
EcmaError: lineNumber=[11] column=[0] lineSource=[<no source>] name=[TypeError] sourceName=[http://www.trbas.com/jive/prod/common/javascripts/mainInit.1q2w3_13a2f8e11fc6a6ec0e18d71a6f3e8dd9.min.js] message=[TypeError: Cannot find function addEventListener in object [object HTMLDocument]. (http://www.trbas.com/jive/prod/common/javascripts/mainInit.1q2w3_13a2f8e11fc6a6ec0e18d71a6f3e8dd9.min.js#11)]
com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot find function addEventListener in object [object HTMLDocument]. (http://www.trbas.com/jive/prod/common/javascripts/mainInit.1q2w3_13a2f8e11fc6a6ec0e18d71a6f3e8dd9.min.js#11)
at

Discussion

  • Ahmed Ashour

    Ahmed Ashour - 2014-06-09
    • status: open --> closed
    • assigned_to: Ahmed Ashour
     
  • Ahmed Ashour

    Ahmed Ashour - 2014-06-09

    IE8 doesn't support addEventListener.

    Please use:

        WebClient webClient = new WebClient(BrowserVersion.INTERNET_EXPLORER_11);
    
     
  • Kunal Singh

    Kunal Singh - 2014-06-09

    Hey,

    Thanks for the prompt reply. When I try with

        WebClient webClient = new WebClient(BrowserVersion.INTERNET_EXPLORER_11);
        final HtmlPage page = webClient.getPage("http://www.latimes.com/topic/politics/government/barack-obama-PEPLT007408-topic.html");
        System.out.println(page.asText());

    It returns:

    Please update to a modern browser for the best Los Angeles Times viewing experience.
    or, you can view an alternate view of this site on your current browser by clicking here.

    Is it the case that HtmlUnit works only with particular browsers?

     
  • Ahmed Ashour

    Ahmed Ashour - 2014-06-09

    I see. It would be better if you could isolate the root cause, as described in http://htmlunit.sourceforge.net/submittingJSBugs.html

     
  • Ahmed Ashour

    Ahmed Ashour - 2014-06-09
    • summary: com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot find function addEventListener in object [object HTMLDocument]. --> Error parsing page "Please update to a modern browser for the best Los Angeles Times viewing experience."
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -10,102 +10,4 @@
     Exception in thread "main" ======= EXCEPTION START ========
     EcmaError: lineNumber=[11] column=[0] lineSource=[<no source>] name=[TypeError] sourceName=[http://www.trbas.com/jive/prod/common/javascripts/mainInit.1q2w3_13a2f8e11fc6a6ec0e18d71a6f3e8dd9.min.js] message=[TypeError: Cannot find function addEventListener in object [object HTMLDocument]. (http://www.trbas.com/jive/prod/common/javascripts/mainInit.1q2w3_13a2f8e11fc6a6ec0e18d71a6f3e8dd9.min.js#11)]
     com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot find function addEventListener in object [object HTMLDocument]. (http://www.trbas.com/jive/prod/common/javascripts/mainInit.1q2w3_13a2f8e11fc6a6ec0e18d71a6f3e8dd9.min.js#11)
    -   at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:705)
    -   at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:620)
    -   at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:513)
    -   at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:591)
    -   at com.gargoylesoftware.htmlunit.html.HtmlPage.loadExternalJavaScriptFile(HtmlPage.java:1078)
    -   at com.gargoylesoftware.htmlunit.html.HtmlScript.executeScriptIfNeeded(HtmlScript.java:393)
    -   at com.gargoylesoftware.htmlunit.html.HtmlScript$3.execute(HtmlScript.java:268)
    -   at com.gargoylesoftware.htmlunit.html.HtmlScript.onAllChildrenAddedToPage(HtmlScript.java:288)
    -   at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:741)
    -   at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
    -   at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:701)
    -   at org.cyberneko.html.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1137)
    -   at org.cyberneko.html.HTMLTagBalancer.endElement(HTMLTagBalancer.java:1039)
    -   at org.cyberneko.html.filters.DefaultFilter.endElement(DefaultFilter.java:206)
    -   at org.cyberneko.html.filters.NamespaceBinder.endElement(NamespaceBinder.java:330)
    -   at org.cyberneko.html.HTMLScanner$ContentScanner.scanEndElement(HTMLScanner.java:3126)
    -   at org.cyberneko.html.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2093)
    -   at org.cyberneko.html.HTMLScanner.scanDocument(HTMLScanner.java:920)
    -   at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:499)
    -   at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:452)
    -   at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    -   at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.parse(HTMLParser.java:965)
    -   at com.gargoylesoftware.htmlunit.html.HTMLParser.parse(HTMLParser.java:247)
    -   at com.gargoylesoftware.htmlunit.html.HTMLParser.parseHtml(HTMLParser.java:193)
    -   at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage(DefaultPageCreator.java:268)
    -   at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage(DefaultPageCreator.java:156)
    -   at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:468)
    -   at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:342)
    -   at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:407)
    -   at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:392)
    -   at com.test.HtmlUnitExtractor.main(HtmlUnitExtractor.java:11)
    -Caused by: net.sourceforge.htmlunit.corejs.javascript.EcmaError: TypeError: Cannot find function addEventListener in object [object HTMLDocument]. (http://www.trbas.com/jive/prod/common/javascripts/mainInit.1q2w3_13a2f8e11fc6a6ec0e18d71a6f3e8dd9.min.js#11)
    -   at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.constructError(ScriptRuntime.java:3629)
    -   at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.constructError(ScriptRuntime.java:3613)
    -   at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.typeError(ScriptRuntime.java:3634)
    -   at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.typeError2(ScriptRuntime.java:3650)
    -   at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.notFunctionError(ScriptRuntime.java:3714)
    -   at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.getPropFunctionAndThisHelper(ScriptRuntime.java:2233)
    -   at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.getPropFunctionAndThis(ScriptRuntime.java:2215)
    -   at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpretLoop(Interpreter.java:1333)
    -   at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpret(Interpreter.java:798)
    -   at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call(InterpretedFunction.java:105)
    -   at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.doTopCall(ContextFactory.java:411)
    -   at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory.doTopCall(HtmlUnitContextFactory.java:309)
    -   at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:3057)
    -   at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.exec(InterpretedFunction.java:115)
    -   at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$3.doRun(JavaScriptEngine.java:582)
    -   at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:690)
    -   ... 30 more
    -Enclosed exception: 
    -net.sourceforge.htmlunit.corejs.javascript.EcmaError: TypeError: Cannot find function addEventListener in object [object HTMLDocument]. (http://www.trbas.com/jive/prod/common/javascripts/mainInit.1q2w3_13a2f8e11fc6a6ec0e18d71a6f3e8dd9.min.js#11)
    -   at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.constructError(ScriptRuntime.java:3629)
    -   at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.constructError(ScriptRuntime.java:3613)
    -   at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.typeError(ScriptRuntime.java:3634)
    -   at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.typeError2(ScriptRuntime.java:3650)
    -   at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.notFunctionError(ScriptRuntime.java:3714)
    -   at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.getPropFunctionAndThisHelper(ScriptRuntime.java:2233)
    -   at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.getPropFunctionAndThis(ScriptRuntime.java:2215)
    -   at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpretLoop(Interpreter.java:1333)
    -   at script(http://www.trbas.com/jive/prod/common/javascripts/mainInit.1q2w3_13a2f8e11fc6a6ec0e18d71a6f3e8dd9.min.js:11)
    -   at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpret(Interpreter.java:798)
    -   at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call(InterpretedFunction.java:105)
    -   at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.doTopCall(ContextFactory.java:411)
    -   at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory.doTopCall(HtmlUnitContextFactory.java:309)
    -   at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:3057)
    -   at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.exec(InterpretedFunction.java:115)
    -   at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$3.doRun(JavaScriptEngine.java:582)
    -   at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:690)
    -   at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:620)
    -   at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:513)
    -   at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:591)
    -   at com.gargoylesoftware.htmlunit.html.HtmlPage.loadExternalJavaScriptFile(HtmlPage.java:1078)
    -   at com.gargoylesoftware.htmlunit.html.HtmlScript.executeScriptIfNeeded(HtmlScript.java:393)
    -   at com.gargoylesoftware.htmlunit.html.HtmlScript$3.execute(HtmlScript.java:268)
    -   at com.gargoylesoftware.htmlunit.html.HtmlScript.onAllChildrenAddedToPage(HtmlScript.java:288)
    -   at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:741)
    -   at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
    -   at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:701)
    -   at org.cyberneko.html.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1137)
    -   at org.cyberneko.html.HTMLTagBalancer.endElement(HTMLTagBalancer.java:1039)
    -   at org.cyberneko.html.filters.DefaultFilter.endElement(DefaultFilter.java:206)
    -   at org.cyberneko.html.filters.NamespaceBinder.endElement(NamespaceBinder.java:330)
    -   at org.cyberneko.html.HTMLScanner$ContentScanner.scanEndElement(HTMLScanner.java:3126)
    -   at org.cyberneko.html.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2093)
    -   at org.cyberneko.html.HTMLScanner.scanDocument(HTMLScanner.java:920)
    -   at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:499)
    -   at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:452)
    -   at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    -   at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.parse(HTMLParser.java:965)
    -   at com.gargoylesoftware.htmlunit.html.HTMLParser.parse(HTMLParser.java:247)
    -   at com.gargoylesoftware.htmlunit.html.HTMLParser.parseHtml(HTMLParser.java:193)
    -   at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage(DefaultPageCreator.java:268)
    -   at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage(DefaultPageCreator.java:156)
    -   at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:468)
    -   at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:342)
    -   at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:407)
    -   at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:392)
    -   at com.test.HtmlUnitExtractor.main(HtmlUnitExtractor.java:11)
    -======= EXCEPTION END ========
    +   at 
    
    • status: closed --> accepted
    • assigned_to: Ahmed Ashour --> nobody
     
  • RBRi

    RBRi - 2015-10-31

    Have done a test with the latest code from SVN. Now HtmlUnit is able to get the content of the page.

     
    • Vernon Singleton

      I am still getting the error above. Using 2.19-SNAPSHOT. Using the same test page you used above. Looks like this is still a bug.

      Caused by: net.sourceforge.htmlunit.corejs.javascript.EcmaError: TypeError: Cannot find function addEventListener in object [object HTMLDocument]. (script in http://www.latimes.com/topic/politics-government/government/barack-obama-PEPLT007408-topic.html from (34, 9) to (44, 2)#42)
          at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.constructError(ScriptRuntime.java:3935)
          at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.constructError(ScriptRuntime.java:3919)
          at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.typeError(ScriptRuntime.java:3944)
          at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.typeError2(ScriptRuntime.java:3960)
          at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.notFunctionError(ScriptRuntime.java:4027)
          at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.getPropFunctionAndThisHelper(ScriptRuntime.java:2426)
          at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.getPropFunctionAndThis(ScriptRuntime.java:2408)
          at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpretLoop(Interpreter.java:1337)
          at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpret(Interpreter.java:798)
          at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call(InterpretedFunction.java:105)
          at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.doTopCall(ContextFactory.java:411)
          at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory.doTopCall(HtmlUnitContextFactory.java:309)
          at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:3286)
          at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.exec(InterpretedFunction.java:115)
          at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$3.doRun(JavaScriptEngine.java:827)
          at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:939)
          ... 64 more
      

      Please consider reopening this ticket.
      Or if you like I will open another.

       

      Last edit: Vernon Singleton 2015-11-10
  • RBRi

    RBRi - 2015-10-31
    • status: accepted --> closed
    • assigned_to: RBRi
     
  • RBRi

    RBRi - 2015-11-10

    You have to specify a different browser (see Ahmed's comment above).

     
    • Vernon Singleton

      I have tried with INTERNET_EXPLORER_11 and FIREFOX_31, as noted below, both fail with the same error noted above using 2.19-SNAPSHOT.

      public class MyArtifactIdTester {
      
         private WebClient webClient = new WebClient(BrowserVersion.FIREFOX_31);
         // private WebClient webClient = new WebClient(BrowserVersion.INTERNET_EXPLORER_11);
      
         @Test
         public void myArtifactIdTest() throws Exception {
      
            HtmlPage initialPage = webClient.getPage("http://www.latimes.com/topic/politics/government/barack-obama-PEPLT007408-topic.html");
            final String pageAsText = initialPage.asText();
            System.out.println(pageAsText);
      
         }
      }
      
       

      Last edit: Vernon Singleton 2015-11-10
  • RBRi

    RBRi - 2015-11-10

    Sorry Vernon, but I can't reproduce your problem with your code. Your sample code works fine for me. Can you please check your classpath?

     
  • Vernon Singleton

    Wow, please excuse my stupidity. It was my fault. In addition to the above code, I overlooked this block of code in my class:

       @Before
       public void setUp() {
          webClient = new WebClient();
          webClient.getOptions().setJavaScriptEnabled(true);
          java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(Level.OFF);
       }
    

    Obviously, this causes the issue noted in this thread: setUp() replaces the webClient field with a new WebClient() that uses the default browser. As you said, you need to use a browser other than the default. Thank you for the quick response. So please ignore my previous request and keep this ticket closed as solved.
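
The pitfall in the comment above can be modelled without HtmlUnit or JUnit on the classpath. This is only a sketch, assuming JUnit 4 semantics (field initializers run when the test instance is constructed, and `@Before` methods run afterwards, before each test). `FakeClient`, `Overwritten`, `Consistent` and `run` are hypothetical names made up for illustration; they are not part of the HtmlUnit or JUnit API.

```java
public class SetUpOrderDemo {

    /** Hypothetical stand-in for WebClient; it only records which browser it was built with. */
    static final class FakeClient {
        final String browser;

        FakeClient() {
            this("DEFAULT"); // the no-arg constructor falls back to the default browser
        }

        FakeClient(String browser) {
            this.browser = browser;
        }
    }

    /** Minimal shape of a test class: a set-up hook plus a test body. */
    interface TestCase {
        void setUp();
        String browserInTest();
    }

    /** Mirrors the failing class: the field initializer is silently replaced in setUp(). */
    static final class Overwritten implements TestCase {
        private FakeClient client = new FakeClient("FIREFOX_31");

        public void setUp() {
            client = new FakeClient(); // discards the FIREFOX_31 client created above
        }

        public String browserInTest() {
            return client.browser;
        }
    }

    /** One possible fix: create the client in exactly one place, with an explicit browser. */
    static final class Consistent implements TestCase {
        private FakeClient client;

        public void setUp() {
            client = new FakeClient("FIREFOX_31");
        }

        public String browserInTest() {
            return client.browser;
        }
    }

    /** Runs the hooks in the assumed order: construction first, then setUp(), then the test body. */
    static String run(TestCase test) {
        test.setUp();
        return test.browserInTest();
    }

    public static void main(String[] args) {
        System.out.println(run(new Overwritten())); // prints "DEFAULT"
        System.out.println(run(new Consistent()));  // prints "FIREFOX_31"
    }
}
```

In the real test class, the equivalent change would presumably be to pass the desired BrowserVersion to the WebClient constructor inside setUp(), or to drop the reassignment there altogether.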

     
    • RBRi

      RBRi - 2015-11-11

      Ok no problem. Enjoy using HtmlUnit.
