Thread: [q-lang-users] Q/Qpad 7.8 bug with Unicode on Windows?
From: Rob H. <hub...@gm...> - 2007-11-24 18:38:05
Dear Albert,

the Unicode support on Windows XP seems to be broken. I've tried using a
Unicode source file, but Q will not import it. I've tried Q7.7, and upgraded
to Q7.8. The latter behaves slightly worse, as it hangs. I've not tried this
before, so I don't know if it was broken in earlier versions.

I've tried running in Cygwin/Bash as usual, but also in a plain DOS box; the
behaviour is the same. I've appended the outputs from the runs below.

Qpad doesn't like Unicode files either, but shows the files as 8-bit. (I
assume Qpad doesn't support Unicode yet.)

[This problem is *not* urgent for me. I just thought I'd try it out of
curiosity.]

Best wishes,
Rob.

-------------------------------------------------------------------------------
In DOS or Cygwin/Bash with Q7.7:

$ q
   ____
  / __ \   Q interpreter version 7.7 (i386-pc-mingw32)
 / /_/ /   Copyright (c) 1991-2007 by Albert Graef
 \___\_\   http://q-lang.sourceforge.net

This software is distributed under the terms of the GNU General Public
License version 2 or later; type `copying' for details.

==> import empty_Unicode
Error empty_Unicode.q, line 1: parse error at or near symbol `+'
! Error compiling script

==> import empty
Error empty_Unicode.q, line 1: parse error at or near symbol `+'
! Error compiling script

==>

[where the `+' is shown in a DOS box *like* a "Box Drawings Double Down And
Left" U+2557; that's probably just one of the DOS/Windows-specific characters
from the non-7-bit (non-ASCII) part of the set.]

-------------------------------------------------------------------------------
In DOS or Cygwin/Bash with Q7.8:

D:\q\Unicode> q
   ____
  / __ \   Q interpreter version 7.8 (i386-pc-mingw32)
 / /_/ /   Copyright (c) 1991-2007 by Albert Graef
 \___\_\   http://q-lang.sourceforge.net

This software is distributed under the terms of the GNU General Public
License version 2 or later; type `copying' for details.

==> import empty_Unicode
Error empty_Unicode.q, line 1: parse error at or near symbol `+'

[note the hang.]

-------------------------------------------------------------------------------
The <empty_Unicode.q> file contains the following hex bytes, representing a
UTF-8 header and a newline.

EF BB BF 0D 0A
From: Albert G. <Dr....@t-...> - 2007-11-25 03:06:21
Dear Rob,

> the Unicode support on Windows XP seems to be broken. I've tried using a
> Unicode source file, but Q will not import it.

That's correct, usage of the BOM (byte order mark) at the beginning of a
UTF-8 file is not portable, and in fact for good reasons many Unix-compatible
tools will choke on it. To quote from the "byte order mark" Wikipedia article:

  "While UTF-8 does not have byte order issues, a BOM encoded in UTF-8 may be
  used to mark text as UTF-8. It only identifies a file as UTF-8 and does not
  state anything about byte order.[1] Quite a lot of Windows software
  (including Windows Notepad) adds one to UTF-8 files. However in Unix-like
  systems (which make heavy use of text files for configuration) this
  practice is not recommended, as it will interfere with correct processing
  of important codes such as the hash-bang at the start of an interpreted
  script. It may also interfere with source for programming languages that
  don't recognise it."

Maybe there's some option to turn the BOM off in Notepad? I haven't checked.

> I've tried Q7.7, and upgraded to Q7.8. The latter behaves slightly
> worse, as it hangs.

This is the real issue here, thanks for reporting. Apparently the Q 7.8 lexer
loops when it tries to read past an illegal UTF-8 character near eof. I'll
fix this asap.

> Qpad doesn't like Unicode files either, but shows the files as 8-bit. (I
> assume Qpad doesn't support Unicode yet.)

That's also true, and it probably never will (Qpad uses the simple MFC editor
widget which is rather limited and only supports ASCII). As soon as Qt/Q has
been ported to Windows, the current Qpad will probably be replaced by
something portable and much nicer. :)

Thanks,
Albert

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr....@t-..., ag...@mu...
WWW: http://www.musikinformatik.uni-mainz.de/ag
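If the editor cannot be told to leave out the BOM, the three bytes EF BB BF can also be removed from the file afterwards. The following is only a minimal sketch in Python, not something that ships with Q; the function names `strip_utf8_bom` and `strip_bom_in_file` are made up for illustration.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_file(path: str) -> bool:
    """Rewrite the file at path without its UTF-8 BOM.

    Returns True if a BOM was found and removed, False if the file
    was left untouched. (Hypothetical helper, for illustration only.)
    """
    with open(path, "rb") as f:
        data = f.read()
    stripped = strip_utf8_bom(data)
    if len(stripped) == len(data):
        return False
    with open(path, "wb") as f:
        f.write(stripped)
    return True
```

Applied to the five bytes of the reported empty_Unicode.q, `strip_utf8_bom` would leave only the 0D 0A line ending; for example, `strip_bom_in_file("empty_Unicode.q")` would rewrite that file in place.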
From: Albert G. <Dr....@t-...> - 2007-12-07 22:59:34
Albert Graef wrote:
> This is the real issue here, thanks for reporting. Apparently the Q 7.8
> lexer loops when it tries to read past an illegal UTF-8 character near
> eof. I'll fix this asap.

This is fixed in cvs now. I'm currently putting together a bugfix release
(Q 7.9), did anyone notice any other things to be addressed in this release?

Albert

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr....@t-..., ag...@mu...
WWW: http://www.musikinformatik.uni-mainz.de/ag
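For readers wondering how a reader loop can hang on a bad sequence near the end of input: the general remedy is to make sure every iteration consumes at least one byte, even when decoding fails. The sketch below is in Python and is not the Q lexer code or the actual fix committed to cvs; `decode_utf8_lenient` is a hypothetical name, and replacing bad bytes with U+FFFD is an assumption made for the example.

```python
def decode_utf8_lenient(data: bytes) -> str:
    """Decode UTF-8, replacing each undecodable byte with U+FFFD.

    The index i advances on every iteration, so the loop terminates
    even if the input ends in the middle of a multi-byte sequence.
    """
    out = []
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        # Expected sequence length, judged from the lead byte.
        if b < 0x80:
            length = 1
        elif 0xC2 <= b <= 0xDF:
            length = 2
        elif 0xE0 <= b <= 0xEF:
            length = 3
        elif 0xF0 <= b <= 0xF4:
            length = 4
        else:
            length = 0  # stray continuation byte or invalid lead byte
        chunk = data[i:i + length]
        try:
            if length == 0 or len(chunk) < length:
                raise ValueError("invalid or truncated sequence")
            # bytes.decode raises UnicodeDecodeError (a ValueError subclass)
            # for bad continuation bytes, overlong forms and surrogates.
            out.append(chunk.decode("utf-8"))
            i += length
        except ValueError:
            out.append("\ufffd")
            i += 1  # consume one byte so the loop always makes progress
    return "".join(out)
```

A loop that left `i` unchanged in the error branch would spin forever on a truncated sequence at the end of the data, which is the kind of hang described above.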