
HTMLParser 1.2/1.3 JavaScript comment problem

2003-04-12
2003-04-13
  • Kedar Panse

    Kedar Panse - 2003-04-12

    I am trying to parse an HTML file that contains JavaScript. I found that parsing fails if a JavaScript single-line comment ends with a word in single quotes. In the following trace it failed at the comment ending with the word 'page.' (notice that the line ends with a ' ). Is there something I am doing wrong, or is it a bug? Has anyone else had the same problem? I would really appreciate your help.

    org.htmlparser.util.HTMLParserException: Unexpected Exception occurred while reading file://localhost/C:/DOCUME~1/Owner/LOCALS~1/Temp/tmp46206htm.html, in nextHTMLNode
    at Line 98 :         //New dummy form required if dest == 'service' to insure that http_referer exists in Internet Explorer upon return to the service 'page.'
    Previous Line 97 :         };
    org.htmlparser.util.HTMLParserException: HTMLReader.readElement() : Error occurred while trying to read the next element,
    at Line 98 :         //New dummy form required if dest == 'service' to insure that http_referer exists in Internet Explorer upon return to the service 'page.'
    Previous Line 97 :         };
    org.htmlparser.util.HTMLParserException: HTMLReader.readElement() : Error occurred while trying to decipher the tag using scanners
    at Line 98 :         //New dummy form required if dest == 'service' to insure that http_referer exists in Internet Explorer upon return to the service 'page.'
    Previous Line 97 :         };
    org.htmlparser.util.HTMLParserException: HTMLTag.scan() : Error while scanning tag, tag contents = script LANGUAGE="Javascript", tagLine = <script LANGUAGE="Javascript">;
    org.htmlparser.util.HTMLParserException: HTMLScriptScanner.scan() : Error while scanning a script tag, currentLine = <script LANGUAGE="Javascript">;
    org.htmlparser.util.HTMLParserException: HTMLReader.readElement() : Error occurred while trying to read the next element,
    at Line 98 :         //New dummy form required if dest == 'service' to insure that http_referer exists in Internet Explorer upon return to the service 'page.'
    Previous Line 97 :         };
    java.lang.StringIndexOutOfBoundsException: String index out of range: 139
        at java.lang.String.charAt(String.java:455)
        at org.htmlparser.HTMLStringNode.find(HTMLStringNode.java:102)
        at org.htmlparser.HTMLReader.readElement(HTMLReader.java:181)
        at org.htmlparser.scanners.HTMLScriptScanner.scan(HTMLScriptScanner.java:127)

     
    • Kedar Panse

      Kedar Panse - 2003-04-12

      I think I found the problem. It is in HTMLStringNode.

      Here's the fixed code snippet:

      if (ch=='\'') {
          if (state==PARSE_IGNORE_STATE) state=PARSE_HAS_BEGUN_STATE;
          else {
              //Added this to remove the bug (comment ending with a ')
              if ((i+1)<inputLen)
                  if (input.charAt(i+1)=='<') {
                      state = PARSE_IGNORE_STATE;
                  }
          }
      }
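
      The failure mode behind the StringIndexOutOfBoundsException in the trace can be reproduced in isolation. The following is only a minimal sketch, not the HTMLParser source: the class QuoteLookahead and both method names are hypothetical, and it assumes the crash comes from peeking at input.charAt(i+1) when the quote is the last character on the line.

```java
public class QuoteLookahead {
    // Hypothetical stand-in for the unguarded lookahead: peeks one character
    // past index i without checking that i + 1 is inside the string.
    static boolean unguardedNextIsTagStart(String input, int i) {
        return input.charAt(i + 1) == '<';
    }

    // Guarded variant, mirroring the (i+1)<inputLen check in the snippet above:
    // the lookahead only happens when a next character actually exists.
    static boolean guardedNextIsTagStart(String input, int i) {
        return (i + 1) < input.length() && input.charAt(i + 1) == '<';
    }

    public static void main(String[] args) {
        // A single-line comment whose last character is a single quote.
        String line = "// upon return to the service 'page.'";
        int last = line.length() - 1; // index of the trailing quote

        // The guarded lookahead simply reports that no '<' follows.
        System.out.println("guarded: " + guardedNextIsTagStart(line, last));

        // The unguarded lookahead reads past the end of the string and throws.
        try {
            unguardedNextIsTagStart(line, last);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("unguarded lookahead failed: " + e.getMessage());
        }
    }
}
```

      The guard relies on short-circuit evaluation of &&, so charAt is never called with an index equal to the string length.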

       
    • Somik Raha

      Somik Raha - 2003-04-13

      I think you are using an old version of the parser. Please get the latest version and try again.

      Regards
      Somik

       
    • Kedar Panse

      Kedar Panse - 2003-04-13

      Indeed I was :) thanks

      Kedar

       
