From: Arno G. <arn...@gm...> - 2010-11-03 12:40:29
Michalis Kamburelis wrote:
> Looks like a bug slipped into 0.12.0 release that prevents reading
> files starting with UTF-8 BOM. It's fixed now of course in SVN.
>
> I know that Lazarus and most text editors in general do not add UTF-8
> BOM, so I think this isn't a problem for FPC/Lazarus users.

Using UTF-8 without a BOM is evil IMO, since there is no way to detect
the charset reliably. A test can only verify that the data is validly
encoded UTF-8; it might be a different charset nevertheless, and testing
is expensive.

> large problem for Delphi users?

I do not know how many users actually use UTF-8 source files.

> Does some new Delphi version automatically add UTF-8 BOM?

Yes. When it opens a UTF-8 source file without a BOM (and detects it as
UTF-8), it silently saves the file with a BOM (just tested in XE). In all
other cases you have to convert a unit to UTF-8, UCS-2, UCS-2BE, UCS-4 or
UCS-4BE explicitly in the editor's context menu before Delphi saves it
with a BOM; the default is still ANSI.

IMO PasDoc should detect all BOMs used by Delphi and raise an exception
if a charset is not supported.

> Bear in mind that I do not use/own Delphi
> since a long time, so I'm asking you.
>
> If you guys think so, we can make a 0.12.1 bugfix release, even today.

That would be nice for the non-svn users.

--
Arno Garrels
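
To illustrate the suggestion above (detect every BOM, raise on an unsupported charset), here is a minimal sketch in Python rather than Pascal. It is not PasDoc's code. The names `detect_bom`, `decode_source`, `SUPPORTED` and `UnsupportedCharsetError` are hypothetical, the set of supported encodings is an assumption, and `latin-1` is only a stand-in for the "ANSI" default. The last lines show why a file without a BOM stays ambiguous: the same bytes decode cleanly under two charsets.

```python
import codecs

# Check the 4-byte BOMs first: the UTF-32 LE BOM (FF FE 00 00) begins
# with the UTF-16 LE BOM (FF FE). A UTF-16 LE file whose first character
# is U+0000 would be misread as UTF-32 LE; this sketch accepts that.
KNOWN_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Assumption for illustration: the reader only handles UTF-8.
SUPPORTED = {"utf-8"}


class UnsupportedCharsetError(Exception):
    """Raised when a BOM names an encoding the reader cannot handle."""


def detect_bom(data):
    """Return (encoding, bom_length), or (None, 0) if no known BOM."""
    for bom, name in KNOWN_BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_source(data, fallback="latin-1"):
    """Decode source bytes, using the BOM if present, else the fallback."""
    encoding, skip = detect_bom(data)
    if encoding is None:
        # No BOM: nothing to detect, so use the configured default.
        return data.decode(fallback)
    if encoding not in SUPPORTED:
        raise UnsupportedCharsetError("unsupported charset: " + encoding)
    return data[skip:].decode(encoding)


# Without a BOM the bytes alone do not identify the charset:
raw = "é".encode("utf-8")            # b"\xc3\xa9"
as_utf8 = raw.decode("utf-8")        # "é"
as_cp1252 = raw.decode("cp1252")     # "Ã©", also decodes without error
```

With a BOM the choice is a cheap prefix comparison; without one, both decodes above succeed, so a validity test cannot tell which was intended.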