[Htmlparser-cvs] htmlparser/src/org/htmlparser/beans StringBean.java,1.35,1.36
Brought to you by:
derrickoswald
From: <der...@us...> - 2004-01-10 15:23:36
Update of /cvsroot/htmlparser/htmlparser/src/org/htmlparser/beans
In directory sc8-pr-cvs1:/tmp/cvs-serv3574/beans

Modified Files:
    StringBean.java
Log Message:
Fix bug #874175 StringBean doesn't handle charset change well
Add EncodingChangeException to distinguish a recoverable character set change
occurring after the lexer has already coughed up some characters using the
wrong encoding. Added testEncodingChange in LexerTests to exercise it.
Changed IteratorImpl to not wrap a ParserException with another ParserException.
Changed StringBean to retry the URL when an encoding change exception is caught.

Index: StringBean.java
===================================================================
RCS file: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/beans/StringBean.java,v
retrieving revision 1.35
retrieving revision 1.36
diff -C2 -d -r1.35 -r1.36
*** StringBean.java    2 Jan 2004 16:24:53 -0000    1.35
--- StringBean.java    10 Jan 2004 15:23:33 -0000    1.36
***************
*** 37,40 ****
--- 37,41 ----
  import org.htmlparser.tags.Tag;
  import org.htmlparser.util.ParserException;
+ import org.htmlparser.util.EncodingChangeException;
  import org.htmlparser.util.Translate;
  import org.htmlparser.visitors.NodeVisitor;
***************
*** 306,309 ****
--- 307,330 ----
              }
          }
+         catch (EncodingChangeException ece)
+         {
+             mIsPre = false;
+             mIsScript = false;
+             try
+             {   // try again with the encoding now in force
+                 mParser.reset ();
+                 mBuffer = new StringBuffer (4096);
+                 mParser.visitAllNodesWith (this);
+                 updateStrings (mBuffer.toString ());
+             }
+             catch (ParserException pe)
+             {
+                 updateStrings (pe.toString ());
+             }
+             finally
+             {
+                 mBuffer = null;
+             }
+         }
          catch (ParserException pe)
          {
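
For readers who want to try the retry pattern from this change in isolation, here is a minimal sketch. It is not the StringBean code: ParserException and EncodingChangeException below are simplified stand-ins for the org.htmlparser.util classes, and FakeParser, its passes counter, and extract are hypothetical names made up for illustration. The sketch assumes EncodingChangeException is a subclass of ParserException, which is why its catch clause has to come first, as it does in the diff.

```java
public class EncodingRetrySketch {

    /** Simplified stand-in for org.htmlparser.util.ParserException. */
    static class ParserException extends Exception {
        ParserException(String message) { super(message); }
    }

    /** Simplified stand-in for the new exception; assumed to be a subclass. */
    static class EncodingChangeException extends ParserException {
        EncodingChangeException(String message) { super(message); }
    }

    /** Hypothetical parser whose first pass trips over a charset change. */
    static class FakeParser {
        private String encoding = "ISO-8859-1";
        int passes = 0;

        void reset() {
            // A real parser would rewind its input here; the new encoding is kept.
        }

        void visitAllNodesWith(StringBuilder sink) throws ParserException {
            passes++;
            sink.append("he"); // characters already handed out under the current encoding
            if (!encoding.equals("UTF-8")) {
                encoding = "UTF-8"; // the page declared a different charset
                throw new EncodingChangeException("charset changed to UTF-8");
            }
            sink.append("llo");
        }
    }

    /** Collects text, retrying once from the start if the encoding changes. */
    static String extract(FakeParser parser) {
        StringBuilder buffer = new StringBuilder(4096);
        try {
            parser.visitAllNodesWith(buffer);
            return buffer.toString();
        } catch (EncodingChangeException ece) {
            // Recoverable: discard what was collected with the wrong encoding.
            try {
                parser.reset();
                buffer = new StringBuilder(4096);
                parser.visitAllNodesWith(buffer);
                return buffer.toString();
            } catch (ParserException pe) {
                return pe.toString();
            }
        } catch (ParserException pe) {
            // Any other parse failure is reported as text, as in the diff.
            return pe.toString();
        }
    }
}
```

Calling EncodingRetrySketch.extract(new EncodingRetrySketch.FakeParser()) makes two passes over the input. The partial text from the first pass is thrown away with the old buffer, so only text decoded under the final encoding is returned.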