Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/beans
In directory sc8-pr-cvs1:/tmp/cvs-serv3574/beans
Modified Files:
StringBean.java
Log Message:
Fix bug #874175 StringBean doesn't handle charset change well
Add EncodingChangeException to distinguish a recoverable character set change
occurring after the lexer has already coughed up some characters using the wrong
encoding. Added testEncodingChange in LexerTests to exercise it.
Changed IteratorImpl to not wrap a ParserException with another ParserException.
Changed StringBean to retry the URL when an encoding change exception is caught.
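
The retry described above can be sketched in isolation roughly as follows.
This is a minimal, self-contained illustration and not the actual StringBean
code: the nested ParserException and EncodingChangeException classes are
stand-ins for the org.htmlparser.util types, and FakeParser, extract() and
passes() are hypothetical names made up for the example. It assumes the new
encoding stays in force after the exception, so a second pass after reset()
can succeed.

```java
public class EncodingRetrySketch {

    /** Stand-in for org.htmlparser.util.ParserException (assumption, not the real class). */
    static class ParserException extends Exception {
        ParserException(String message) { super(message); }
    }

    /** Stand-in for the new exception: a recoverable, mid-stream character set change. */
    static class EncodingChangeException extends ParserException {
        EncodingChangeException(String message) { super(message); }
    }

    /** Hypothetical parser that discovers the real charset partway through its first pass. */
    static class FakeParser {
        private String encoding = "ISO-8859-1";
        private int passes = 0;

        /** A real parser would rewind its input here; the fake has nothing to rewind. */
        void reset() { }

        String parse() throws ParserException {
            passes++;
            if (!encoding.equals("UTF-8")) {
                encoding = "UTF-8"; // the new encoding remains in force for the next pass
                throw new EncodingChangeException("character set changed to UTF-8");
            }
            return "text decoded as " + encoding;
        }

        int passes() { return passes; }
    }

    /** Parse once; on a recoverable encoding change, reset and try exactly once more. */
    static String extract(FakeParser parser) {
        try {
            return parser.parse();
        } catch (EncodingChangeException ece) {
            try {
                parser.reset(); // try again with the encoding now in force
                return parser.parse();
            } catch (ParserException pe) {
                return pe.toString(); // second failure is reported, not retried
            }
        } catch (ParserException pe) {
            return pe.toString();
        }
    }

    public static void main(String[] args) {
        FakeParser parser = new FakeParser();
        System.out.println(extract(parser));
        System.out.println("passes: " + parser.passes());
    }
}
```

The more specific catch clause has to come before the ParserException one,
since the sketch (like the import in the diff suggests) treats the encoding
change as a kind of parser exception. Retrying only once keeps a page that
keeps changing its mind about the charset from looping forever.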
Index: StringBean.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/beans/StringBean.java,v
retrieving revision 1.35
retrieving revision 1.36
diff -C2 -d -r1.35 -r1.36
*** StringBean.java 2 Jan 2004 16:24:53 -0000 1.35
--- StringBean.java 10 Jan 2004 15:23:33 -0000 1.36
***************
*** 37,40 ****
--- 37,41 ----
import org.htmlparser.tags.Tag;
import org.htmlparser.util.ParserException;
+ import org.htmlparser.util.EncodingChangeException;
import org.htmlparser.util.Translate;
import org.htmlparser.visitors.NodeVisitor;
***************
*** 306,309 ****
--- 307,330 ----
}
}
+ catch (EncodingChangeException ece)
+ {
+ mIsPre = false;
+ mIsScript = false;
+ try
+ { // try again with the encoding now in force
+ mParser.reset ();
+ mBuffer = new StringBuffer (4096);
+ mParser.visitAllNodesWith (this);
+ updateStrings (mBuffer.toString ());
+ }
+ catch (ParserException pe)
+ {
+ updateStrings (pe.toString ());
+ }
+ finally
+ {
+ mBuffer = null;
+ }
+ }
catch (ParserException pe)
{