From: Stand T. <sta...@gm...> - 2010-02-26 14:42:35
So there are a few issues going on here.

The first one is that non-ASCII characters in a Java source file only survive compilation if the compiler knows the file's encoding. javac reads source in the platform default encoding unless you pass -encoding (e.g. "javac -encoding UTF-8 Main.java"). If you can't rely on that, _you_ must escape the characters yourself: by going online and doing it on a website; by running your file through native2ascii; or by writing a converter that reads the file and converts the non-ASCII characters to \uXXXX escape sequences.

The second problem is: what exactly do you mean by "it displays properly"? What OS are you using? Standard System.out.println will generally not display these characters correctly in a Windows cmd window with its default code page. There are flavors of Linux, and even some Mac users, that tout xterm or a custom version of the terminal that displays Unicode characters. Going back to the Java class, though, the compiler still has to read the string correctly to begin with.

Java natively handles all text as Unicode, but the compiler does not guess the encoding of the source file, even if the file itself is Unicode (UTF-8 or UTF-16). So you would take your class and run "native2ascii -encoding UTF-8 Main.java Main2.java" (or whatever you want to call the output), and it will take the Greek text below and transform it into escape sequences:

System.out.println("\u0394\u03bf\u03ba\u03b9\u03bc\u03ae");

Without -encoding, native2ascii assumes the platform default, and on a Windows-1252 machine the UTF-8 bytes come out as mojibake such as "\u00ce\u201d\u00ce\u00bf\u00ce\u00ba\u00ce\u00b9\u00ce\u00bc\u00ce\u00ae". Even the correct escapes will not display in the sysout as anything but garbage on a console that cannot render Greek.

To help me debug what issues you might be seeing and how to reproduce them, please include the following:

1) OS: Windows (mention if it's a special version, e.g. the Japanese OS, or what have you), Mac and the version, or flavor of Linux.
2) How you're viewing the sysout: is it an xterm you're running your code in? A Windows cmd? An IDE debugger pane?
3) Which version of Java you're compiling and running against.
This part isn't as important, since Unicode handling in Java works just about the same on all of the JDKs since 1.1, but it might help.

thx
timo

On Fri, Feb 26, 2010 at 5:43 AM, Panayotis Katsaloulis <pan...@pa...> wrote:

>
> On 26 Feb 2010, at 10:23 AM, Sascha Haeberling wrote:
>
>> Hi Panayotis,
>>
>> can you give me a concrete example on how to reproduce this?
>>
>> Thank you
>> // Sascha
>>
>
> Yes, just enter any utf-8 string in a System.out directive
>
> For simplicity, I've attached a demo java source, which can be found (it is
> under package "test").
>
> Under Java it properly displays "Δοκιμή", while after the conversion it is
> something like r4w7w2w1w4u6 and in the source code
> @"\1624\1677\1672\1671\1674\1656"
>
> I really don't know why it is like this, but a rough suggestion is that the
> character is escaped but not with something like "\u"
>
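
For reference, the "write a converter" option from the reply above could look roughly like the following. This is only a minimal sketch, not how native2ascii itself is implemented: the class name UnicodeEscaper and the method escape are made up for illustration, and it assumes the text has already been decoded correctly into a Java String (which is exactly the step that goes wrong when the file encoding is guessed).

```java
public class UnicodeEscaper {

    // Hypothetical helper: copies 7-bit ASCII chars through unchanged and
    // rewrites every other char as a six-character backslash-u escape.
    public static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                // Four lowercase hex digits, zero padded.
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The Greek test word from the original report, written with
        // escapes so this file compiles under any source encoding.
        String greek = "\u0394\u03bf\u03ba\u03b9\u03bc\u03ae";
        System.out.println(escape(greek));
    }
}
```

Running main should print the escaped form \u0394\u03bf\u03ba\u03b9\u03bc\u03ae, which is pure ASCII and therefore shows up the same on any console.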