From: dman <ds...@ri...> - 2001-11-28 17:24:49
On Mon, Nov 26, 2001 at 08:25:12PM +0000, Finn Bock wrote:
| [dman]
|
| >| As an additional information point, my JDK1.2 and JDK1.3 also throw
| >| exceptions, but JDK1.4 silently transforms the character into the
| >| unicode-undefined character.
| >
| >I'm not sure that is a good thing (jdk1.4), but maybe you don't have
| >to deal with it. Consider someone who has some source in latin1 (or
| >something else) and has
| >
| >a=F6c =3D "foo"
| >a=FCc =3D "bar"
|
| [I find it a little ironic that my mail agent can't deal with any of
| the newer mail encodings]
I didn't do anything special with my mailer (mutt), but it shows the
message as "ISO-8859-1" encoded. I simply picked two characters near
the end of the latin1 encoding. They are vowels with some funny
decorations (I think they're called umlauts, but I'm really not sure).
I use vim 6 (with the less.vim macro) as my pager, and it showed it
correctly. Interestingly enough, that copy of vim was built without
multibyte support, so the 'enc' and 'fenc' settings weren't available.
| >If java uses UTF-8 as the encoding, then those two names
|
| Non-ascii chars in identifiers? I know CPython sometimes allows that, but
| that is not a feature I plan on adding.
I thought that would be nice to have for non-English developers, but
someone has already said otherwise.
| >will end up
| >being the same if jython will treat the unicode-undefined character as
| >a regular character. This would be an additional condition that
| >should raise an exception.
|
| If you put the non-ascii chars inside the quotes then I agree with your
| example and with your conclusion.
Yeah, that would do it too.
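To make the collision concrete, here is a minimal Java sketch (not Jython's actual code) that takes the two latin1 source lines from the example above, decodes them as UTF-8, and compares the resulting names. It assumes the java.nio.charset API that arrived with JDK 1.4, plus StandardCharsets from later JDKs; the class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; nothing here is taken from the Jython sources.
final class Latin1AsUtf8Demo {

    // Encode a source line the way a latin1 editor would save it.
    static byte[] latin1Bytes(String line) {
        return line.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Decode bytes as UTF-8 with the given policy for malformed input.
    static String decodeUtf8(byte[] bytes, CodingErrorAction action)
            throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    // Lenient decode (REPLACE), then return the text before " = ".
    static String lenientName(String line) {
        try {
            String text = decodeUtf8(latin1Bytes(line), CodingErrorAction.REPLACE);
            int idx = text.indexOf(" = ");
            return idx < 0 ? text : text.substring(0, idx);
        } catch (CharacterCodingException e) {
            // Not expected with REPLACE; surfaced as unchecked just in case.
            throw new IllegalStateException(e);
        }
    }

    // True if a strict decode (REPORT) rejects the bytes as malformed.
    static boolean strictDecodeFails(byte[] bytes) {
        try {
            decodeUtf8(bytes, CodingErrorAction.REPORT);
            return false;
        } catch (MalformedInputException e) {
            return true;
        } catch (CharacterCodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String first = "a\u00f6c = \"foo\"";   // a, o-umlaut, c
        String second = "a\u00fcc = \"bar\"";  // a, u-umlaut, c

        System.out.println("names collide when replaced: "
                + lenientName(first).equals(lenientName(second)));
        System.out.println("strict decode rejects first line: "
                + strictDecodeFails(latin1Bytes(first)));
    }
}
```

With REPLACE, both names should come out as a, U+FFFD, c, so the second assignment would silently rebind the first name. With REPORT, the decoder raises MalformedInputException instead, which is the behaviour I'd prefer.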
| >| Yes. The generated tokenmanager catches all IOExceptions
| >| (MalformedInputException is a subclass of IOException) and interprets
| >| that as eof.
| > [...]
| >
| >Couldn't you just catch that exception and print out a message then
| >exit right before catching IOException?
|
| There are 43 instances of caught IOException in
| PythonGrammarTokenManager such as:
|
|     try { curChar = input_stream.readChar(); }
|     catch(java.io.IOException e) {
|         jjStopStringLiteralDfa_10(0, 0L, active1);
|         return 1;
|     }
|
| We probably have to catch the MalformedInputException in the
| ReaderCharStream and throw something that will get past most of the
| catch clauses in the parser.
What if the exception gets turned into IOError (the python exception)?
I just noticed that you said "generated" parser. That may make it
easier or harder to add the proper catches.
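A rough sketch of that idea, assuming nothing about the real ReaderCharStream: wrap the Reader so that a MalformedInputException is rethrown as an unchecked exception, which a catch(java.io.IOException) clause like the generated ones cannot swallow. StrictReader, SourceEncodingError and scan are hypothetical names; scan only imitates the generated code's habit of treating IOException as EOF, and it uses java.nio.charset.MalformedInputException from JDK 1.4 and later.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Jython.
final class DecodingErrorDemo {

    // Unchecked, so catch (IOException e) blocks do not intercept it.
    static final class SourceEncodingError extends RuntimeException {
        SourceEncodingError(Throwable cause) {
            super("source is not valid in the expected encoding", cause);
        }
    }

    // Reader wrapper that converts decoding failures into SourceEncodingError.
    static final class StrictReader extends FilterReader {
        StrictReader(Reader in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            try {
                return super.read();
            } catch (MalformedInputException e) {
                throw new SourceEncodingError(e);
            }
        }

        @Override
        public int read(char[] buf, int off, int len) throws IOException {
            try {
                return super.read(buf, off, len);
            } catch (MalformedInputException e) {
                throw new SourceEncodingError(e);
            }
        }
    }

    // Open the bytes as strict UTF-8: malformed input is reported, not replaced.
    static Reader open(byte[] source) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return new StrictReader(
                new InputStreamReader(new ByteArrayInputStream(source), decoder));
    }

    // Stand-in for the generated token manager: any IOException means EOF.
    static String scan(byte[] source) {
        StringBuilder seen = new StringBuilder();
        Reader reader = open(source);
        while (true) {
            int c;
            try {
                c = reader.read();
            } catch (IOException e) {
                break; // same pattern as the generated catch clauses
            }
            if (c < 0) {
                break;
            }
            seen.append((char) c);
        }
        return seen.toString();
    }

    public static void main(String[] args) {
        byte[] latin1 = {'a', (byte) 0xF6, 'c'}; // latin1 bytes, invalid as UTF-8
        try {
            scan(latin1);
            System.out.println("scan finished without complaint");
        } catch (SourceEncodingError e) {
            System.out.println("reported: " + e.getMessage());
        }
    }
}
```

Whatever sits above the parser could then catch the unchecked exception in one place and turn it into a Python-level error, instead of editing every generated catch clause.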
I should probably file a bug report, right?
-D
--
(E)ighteen (M)egs (A)nd (C)onstantly (S)wapping