#478 Win32: unicode support for files with public API for externs

bugfix
closed-accepted
puredata (385)
7
2012-12-17
2012-12-11
No

Right now, Pd on Windows behaves badly if the path or filename contains any non-ASCII characters: it freezes for a while and sometimes crashes. These patches fix that.

Pd and Tcl/Tk are UTF-8 internally, and UNIXes all use UTF-8 for filenames
and paths. Windows uses UCS-2 everywhere, which is a 16-bit format. The
only place this affects Pd is reading and writing filenames, and printing
to the console. The POSIX-style functions open() and fopen() exist on
Windows, but only work for ASCII filenames. To support Unicode filenames,
we have to convert the UTF-8 to UCS-2, then use Win32-specific functions.
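
To make the conversion step concrete, here is a minimal sketch of decoding UTF-8 into 16-bit UCS-2 units. The function name utf8_to_ucs2_sketch is hypothetical and this is not the converter in s_utf8.c; it only illustrates the idea, and it does not reject overlong encodings or surrogate values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (an assumption for illustration, not Pd's code):
   decode a NUL-terminated UTF-8 string into 16-bit UCS-2 units.
   Returns the number of units written, not counting the terminator,
   or -1 if the input is malformed, needs more than 16 bits
   (4-byte sequences), or does not fit in dst. */
static int utf8_to_ucs2_sketch(const char *src, uint16_t *dst, size_t dstlen)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    if (dstlen == 0)
        return -1;

    while (*s) {
        uint32_t cp;
        if (s[0] < 0x80) {                          /* 1 byte: ASCII */
            cp = s[0];
            s += 1;
        } else if ((s[0] & 0xE0) == 0xC0 &&
                   (s[1] & 0xC0) == 0x80) {         /* 2-byte sequence */
            cp = ((uint32_t)(s[0] & 0x1F) << 6) | (uint32_t)(s[1] & 0x3F);
            s += 2;
        } else if ((s[0] & 0xF0) == 0xE0 &&
                   (s[1] & 0xC0) == 0x80 &&
                   (s[2] & 0xC0) == 0x80) {         /* 3-byte sequence */
            cp = ((uint32_t)(s[0] & 0x0F) << 12) |
                 ((uint32_t)(s[1] & 0x3F) << 6) |
                 (uint32_t)(s[2] & 0x3F);
            s += 3;
        } else {
            return -1;          /* malformed, or outside the 16-bit range */
        }
        if (n + 1 >= dstlen)    /* keep one slot for the terminator */
            return -1;
        dst[n++] = (uint16_t)cp;
    }
    dst[n] = 0;
    return (int)n;
}
```

The resulting buffer is the kind of wide string that the Win32-specific file functions expect in place of a char path.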

Since any external that opens files will also be affected the same
way, this patch provides a public API: sys_open()/sys_close(), and
sys_fopen()/sys_fclose(). For non-Win32 platforms, they are just
names that point to the normal POSIX versions. On Win32, they are
special functions to handle UTF-8 to UCS-2 conversion.
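
The shape of such a wrapper could look roughly like the sketch below. The name fopen_utf8_sketch and the buffer sizes are assumptions made for illustration; the actual sys_fopen() in the patches may be implemented differently (for example, using Pd's own converter instead of MultiByteToWideChar).

```c
#include <assert.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>   /* MultiByteToWideChar, CP_UTF8 */
#include <wchar.h>     /* _wfopen */

/* Hypothetical stand-in for a sys_fopen()-style wrapper: convert the
   UTF-8 path and mode to wide strings, then call the wide-character
   variant of fopen(). The buffer sizes are arbitrary assumptions. */
static FILE *fopen_utf8_sketch(const char *path, const char *mode)
{
    wchar_t wpath[1024];
    wchar_t wmode[16];

    assert(path != NULL && mode != NULL);
    /* A length of -1 means "convert up to and including the NUL".
       A return value of 0 signals failure (bad input or buffer too small). */
    if (!MultiByteToWideChar(CP_UTF8, 0, path, -1, wpath, 1024))
        return NULL;
    if (!MultiByteToWideChar(CP_UTF8, 0, mode, -1, wmode, 16))
        return NULL;
    return _wfopen(wpath, wmode);
}
#else
/* Elsewhere the filesystem already takes UTF-8 bytes, so the wrapper
   is just another name for the normal function. */
#define fopen_utf8_sketch fopen
#endif
```

An external would then call the wrapper wherever it previously called fopen() directly, and the same source would build on all platforms.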

I have built and run this on Windows XP, Mac OS X 10.6, and Debian/squeeze amd64. These patches are also included in Pd-extended 0.43.4.

Discussion

  • Hans-Christoph Steiner

    patch to apply with 'git am'

     
  • Hans-Christoph Steiner

    and a couple more for good measure ;)

     
  • Miller Puckette

    Miller Puckette - 2012-12-16

    For some reason I can't apply the first patch. I get:
    error: patch failed: src/s_utf8.c:26
    error: src/s_utf8.c: patch does not apply
    error: patch failed: src/s_utf8.h:1
    error: src/s_utf8.h: patch does not apply
    even though my eyeballs can't see any reason that would fail.

    Anyway, I'd like not to have m_pd.h be in the business of pulling in other include files
    (except as needed to define its own data structures) as I think that's a threat to future
    portability - can this be fixed to leave m_pd.h, d_soundfile.h, etc., alone and just "do the
    necessary" to the UTF-8 code?

    thanks
    Miller

     
  • Hans-Christoph Steiner

    I agree that it's good to avoid putting #includes in m_pd.h. stdint.h is as stable and widespread as stddef.h, which has been part of m_pd.h for a long time. I put '#include <stdint.h>' there because m_pd.h is the only header that was already included everywhere that needs the types defined in stdint.h.

    Another option would be to put the int32_t, etc. definitions in s_stuff.h or some other header, and then add that header everywhere it's needed.

    As for the patches not applying, that's because you accepted 3 patches from Marvin Humpreys that remove lots of code from s_utf8.* and conflict with these patches.

     
  • Miller Puckette

    Miller Puckette - 2012-12-16

    OK -- I'll see if I can unwind the other patch and apply this one - if that works I'll go and fix the
    stdint stuff somehow that I can live with :)

     
  • Miller Puckette

    Miller Puckette - 2012-12-17

    Got 'em & pushed to repo... will re-apply the other patches later.

     
  • Hans-Christoph Steiner

    • status: open --> closed-accepted
     