On 21 August 2010 10:10, Teemu Likonen <tlikonen@...> wrote:
> Thanks. I'm a bit confused, though. It seems that I can get the current
> character encoding (as a string) with a function like this:
>
> (defun current-character-encoding ()
>   (alien-funcall
>    (extern-alien "nl_langinfo"
>                  (function (c-string :external-format :default)
>                            int))
>    sb-unix:codeset))
>
> But what effect does "munging" the variable
> SB-IMPL::*DEFAULT-EXTERNAL-FORMAT* have? Who might munge it? The
> variable itself seems to point to the current encoding (as a keyword) so
> I'm not sure why this (ALIEN-FUNCALL ...) stuff is needed. I would guess
> that SB-IMPL::*DEFAULT-EXTERNAL-FORMAT* is not a public API and thus not
> recommended. Care to clarify?

SB-IMPL::*DEFAULT-EXTERNAL-FORMAT* is not a supported interface, and
you are not supposed to munge it. SBCL sets it at startup so that it
doesn't have to call nl_langinfo after startup. Currently SBCL never
munges it afterwards.

Unofficially, it specifies the meaning of the :DEFAULT external format
(which is specified to exist by the standard, and is unsurprisingly
the default). Munging it works and isn't terribly dangerous, but it is
definitely not supported and is likely to break sooner or later. Reading
it isn't supported either, and will break when the internal interface
changes.

Reading *D-E-F* instead of using the alien call should work just fine,
but if, e.g., some library does the nasty and munges it, then it will no
longer specify what the OS thinks is the default encoding -- whereas
calling nl_langinfo will always retrieve that information.

What you should do depends on what you want the external format for.

If you just want an external format argument to use that corresponds
to the OS's idea of the default encoding, you can just use :DEFAULT,
unless you have reason to believe that a library is misbehaving and
munging SB-IMPL::*DEFAULT-EXTERNAL-FORMAT*, in which case you may want
to call nl_langinfo.
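
For the common case, a minimal sketch of just passing :DEFAULT looks
like this. FIRST-LINE is a made-up helper name for illustration only;
the point is simply that the stream is opened with :EXTERNAL-FORMAT
:DEFAULT and the implementation picks the encoding:

```lisp
;; Sketch only: FIRST-LINE is a hypothetical helper, not part of SBCL.
;; It opens PATH as a character stream with the :DEFAULT external format
;; and returns the first line, or NIL if the file is empty.
(defun first-line (path)
  (with-open-file (in path :direction :input
                           :external-format :default)
    (read-line in nil nil)))
```

Nothing here touches SB-IMPL or the alien interface, so it keeps working
regardless of how :DEFAULT is implemented internally.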

Then again, if you want to know what the default encoding is, instead of
just passing it as an argument, currently your best bet is to read
SB-IMPL::*DEFAULT-EXTERNAL-FORMAT*, as that tells you what :DEFAULT
means -- or, if you want to know what the OS's idea of the default
encoding is, then call nl_langinfo, which tells you exactly that and is
future-proof to boot.

So, my order of preference is :DEFAULT, nl_langinfo, *D-E-F*.

By the by, don't use the SB-UNIX:CODESET constant if you want your code
to remain future-proof. Either grovel the local CODESET #define from
langinfo.h, or assume that it will always be zero everywhere -- which is
what SBCL currently does.
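
One way to keep the alien call free of SB-UNIX:CODESET is to take the
item number as an argument, roughly like this. OS-CODESET-NAME and
CODESET-ITEM are names invented for this sketch; where the number comes
from (groveled from the local langinfo.h, or assumed) is left to the
caller:

```lisp
;; Sketch only, SBCL-specific. OS-CODESET-NAME is a hypothetical name.
;; CODESET-ITEM must be the integer value of the C CODESET macro on the
;; target platform; it is passed in by the caller instead of being read
;; from the internal SB-UNIX:CODESET constant.
(defun os-codeset-name (codeset-item)
  (sb-alien:alien-funcall
   (sb-alien:extern-alien "nl_langinfo"
                          (function sb-alien:c-string sb-alien:int))
   codeset-item))
```

The return value is whatever string the C library reports for that item,
so it reflects the OS's view rather than SBCL's.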

Cheers,
-- Nikodemus