From: goat <moo...@gm...> - 2013-11-25 18:42:37
|
Hi there,

The business I work for uses py2exe for its Windows builds. I've been tasked with providing Unicode support for the software on Windows, which mostly means supporting path names composed of characters from different locales.

It turns out that a py2exe executable cannot run from a folder whose name contains characters outside the current Windows installation's locale (for example, the frozen exe can't run from a folder with Chinese characters on an en-US Windows).

I've modified the source to support that behavior, and I would love to have people review the changes, with a view to including them in the official release.

For starters, I would appreciate it if someone could show me where to post the modified source.

Regards,
goatpig
|
From: Thomas H. <th...@ct...> - 2013-11-25 20:03:32
|
On 25.11.2013 19:42, goat wrote:
> Hi there,
>
> The business I work for uses py2exe for its Windows builds. I've been
> tasked with providing unicode support for the software in Windows, which
> mostly has to do with supporting path names comprised of characters from
> different locales.
>
> It turns out that py2exe cannot run from a folder with characters
> outside of the current Windows installation's locale (can't run the
> frozen exe from a folder with chinese characters on a en-us Windows).
>
> I've modified the source to support that behavior, and I would love to
> have people review the changes, to possibly include it in the official
> release.
>
> For starters I would appreciate if someone could show to me where to
> post the modified source.

As I'm going to release a new version, I'm interested in your patch. Can you post it to this mailing list?

BTW: Python3 - I'm working on py2exe for it - will not have such a problem.

Thanks,
Thomas
|
From: Werner <wer...@gm...> - 2013-11-26 10:25:15
|
Thomas,

Great news to hear that an update for py2exe with Py3 support is in the works. I can take "searching for a freeze tool" off my list then :)

Werner
|
From: goat <moo...@gm...> - 2013-11-26 12:33:09
Attachments:
goatpig_unicode_pathing.patch
|
Mistakenly sent to Thomas Heller only, when I meant to send it to the mailing list:

"As per Thomas's request, the patch for my changes is attached to this email. I'm at your disposal to discuss the content. Currently built and tested for my needs on x86.

goatpig"

Sorry about that, my bad ='(
|
From: Aahz <aa...@py...> - 2013-11-26 22:05:09
|
On Tue, Nov 26, 2013, goat wrote:
>
> Mistakenly sent to Thomas Heller only, when I meant to send to the
> mailing list:
>
> "As per Thomas request, the patch for my changes is attached to this
> email. I'm at your disposition to discuss the content. Currently built
> and tested for my needs in x86.

You should probably clean this up and resubmit. I know this can't be a clean patch because of lines like this:

> +//#define wchar_t __wchar_t
> +

--
Aahz (aa...@py...) <*> http://www.pythoncraft.com/

A house is a machine to keep your cat dry.
|
From: goat <moo...@gm...> - 2013-11-27 16:17:02
Attachments:
goatpig_unicode_pathhing
|
Oh sorry about that. Thanks for the advice. Attached a cleaned up version. -goatpig |
From: Aahz <aa...@py...> - 2013-11-28 01:19:52
|
On Wed, Nov 27, 2013, goat wrote:
>
> Oh sorry about that. Thanks for the advice. Attached a cleaned up version.

Still not completely cleaned up, please really try to generate an absolutely minimal patch that follows the current code style. For example:

> @@ -33,6 +31,7 @@
> struct IMPORT *p = imports;
> HMODULE hmod;
>
> +

Shouldn't add extraneous blank lines

> @@ -46,7 +45,12 @@
> ++dllbase;
> hmod = GetModuleHandle(dllbase);
> if (hmod == NULL)
> - hmod = LoadLibrary(dllname);
> + {

Style seems to be brace on same line as ``if``.

> @@ -1,5 +1,5 @@
> /*
> - * Copyright (c) 2000-2013 Thomas Heller, Mark Hammond
> + * Copyright (c) 2000, 2001 Thomas Heller

Copyright changed for no clear reason, possibly because you started with an old version of the code.

Particularly with that last point, I suggest that you start over with a fresh checkout of py2exe and copy over the minimal amount of code from your changes needed to make it work.

--
Aahz (aa...@py...) <*> http://www.pythoncraft.com/

A house is a machine to keep your cat dry.
|
From: goat <moo...@gm...> - 2013-11-28 10:22:36
|
Sorry about that again. So that I get it right this time around: am I expected to add comments or not? |
From: Thomas H. <th...@ct...> - 2013-11-28 11:45:24
|
I'm more concerned about the functionality of the patch, not so much about these minor issues (of course it is nicer to have a patch that applies cleanly and only changes what is really required, but for me it is not a big deal to handle that myself).

About the functionality: I tried your patch (with some modifications by me) and it seems to work; I can indeed start an exe created by py2exe when the exe is in a directory that contains Chinese characters, on my German Windows installation.

The problem is that it only works when the exe is started with the FULL pathname, which apparently is what happens when you start it by double-clicking in Explorer. This is probably caused by the fact that the Python path now includes the current directory as '.'.

Of course this is not acceptable: you cannot start the exe from the command line, even if you are *inside* the exe's directory.

When I have time I will experiment a little to find out if this restriction can be removed. Will Python work with a sys.path that contains a UTF-8 encoded string? I'm not sure - it must be able to bootstrap itself even though the encodings are not available, since they have to be loaded from the library, which is in the zip archive that py2exe creates...

Thanks for the patch anyway, it contains an interesting idea.

Thomas
|
From: goat <moo...@gm...> - 2013-11-28 12:22:35
|
Interesting find, I'll look into this issue on my own. I assume forcing the current directory with SetCurrentDirectoryW could fix it, if the result of GetCurrentDirectoryW doesn't match the base directory returned by GetModuleFileNameW.

Using UTF-8 to carry wide-char path names is just my approach, and I don't expect Python to comply with it, even though it has proven itself very versatile with string encodings.

Is it possible to force Python's preferred encoding? Maybe that would allow it to deal natively with the UTF-8 paths.
|
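The check proposed above (compare the current directory with the executable's base directory and switch only on a mismatch) can be sketched roughly as follows. This is a Python analogue of the logic, not the C start-up code from the patch: `os.getcwd()` and `os.chdir()` stand in for GetCurrentDirectoryW and SetCurrentDirectoryW, the base directory is passed in as a parameter rather than derived from GetModuleFileNameW, and the helper name `ensure_cwd_matches` is made up for illustration.

```python
import os


def ensure_cwd_matches(base_dir):
    """Switch the working directory to base_dir unless it is already there.

    Returns True if a switch was performed, False otherwise.
    'ensure_cwd_matches' is a hypothetical helper, not part of py2exe.
    """
    current = os.path.realpath(os.getcwd())
    wanted = os.path.realpath(base_dir)
    # normcase lower-cases and normalizes separators on Windows only,
    # so the comparison is case-insensitive where the file system is.
    if os.path.normcase(current) == os.path.normcase(wanted):
        return False
    os.chdir(wanted)
    return True
```

Note that this changes process-wide state, so it only makes sense for a stand-alone executable that owns its process.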
From: Thomas H. <th...@ct...> - 2013-11-28 12:56:28
|
On 28.11.2013 13:22, goat wrote:
> Interesting find, I'll look up this issue on my own. I assume forcing
> the current directory with SetCurrentDirectoryW could fix it, if the
> result of GetCurrentDirectoryW doesn't match the base directory returned
> by GetModuleFileNameW.

Interesting idea. For executables; but this cannot be used for dlls...

> Using utf-8 for wide char path names traversal is just my approach, and
> I don't expect Python to comply to it, even though it has proven itself
> very versatile with string encoding.

Well, python's filesystemencoding on Windows is 'mbcs':

Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.getfilesystemencoding()
'mbcs'
>>>

I have not yet investigated if it is possible to change it. Possibly related is sys.setdefaultencoding():

http://stackoverflow.com/questions/3828723/why-we-need-sys-setdefaultencodingutf-8-in-a-py-script

> Is it possible to force Python's preferred encoding? Maybe it would
> allow it to just natively deal with the utf-8 paths.

Thomas
|
From: Thomas H. <th...@ct...> - 2013-11-28 16:00:02
|
BTW: Please post the answers to the mailing list!

On 28.11.2013 14:47, goat wrote:
>
>> Interesting idea. For executables; but this cannot be used for dlls...
>
> It is possible to retrieve the literal path of a DLL regardless of the
> exe it is attached to.

Sure, but it is at least *not nice* to change the current directory of an exe from a dll when it is loaded.

> I have to test the functionality around, not 100%
> of the calls to perform. Would also need to detect whether the code is
> called as an exe or a DLL, nothing too hard either.
>
> I'm going to try and implement these fixes and submit a new patch
> somewhere around tonight or tomorrow morning.
>
> Apparently you can set the environment variable PYTHONIOENCODING before
> calling Py_Initialize(), I'll give it a look as well.
>

Thomas
|
From: goat <moo...@gm...> - 2013-11-28 16:58:06
|
Right right, I'm messing up this answering part, I'm not too used to mailing lists, as you may see.

Well, true, I shouldn't go around setting the current dir to an attached DLL's literal path.

PYTHONIOENCODING isn't the right way to go either. Maybe PYTHONSTARTUP could be acceptable? Setting this environment variable with a file name as value will have Python run this file prior to running the "main" script. It could be used to set the sys encoding.

> BTW: Please post the answers to the mailing list!
>
> On 28.11.2013 14:47, goat wrote:
>> Sure, but it is at least *not nice* to change the current directory
>> of an exe from a dll when it is loaded.
|
From: goat <moo...@gm...> - 2013-11-28 17:15:33
|
Oddly enough, I have no issue spawning frozen exes from a command prompt with the current code. As a matter of fact, the current directory is already set to the exe's literal base dir, regardless of whether the app is double-clicked or called from the command line. How did you arrive at that behavior?

On 11/28/2013 4:37 PM, Thomas Heller wrote:
> BTW: Please post the answers to the mailing list!
>
> On 28.11.2013 14:47, goat wrote:
>>> Interesting idea. For executables; but this cannot be used for dlls...
>> It is possible to retrieve the literal path of a DLL regardless of the
>> exe it is attached to.
> Sure, but it is at least *not nice* to change the current directory
> of an exe from a dll when it is loaded.
>
>> I have to test the functionality around, not 100%
>> of the calls to perform. Would also need to detect whether the code is
>> called as an exe or a DLL, nothing too hard either.
>>
>> I'm going to try and implement these fixes and submit a new patch
>> somewhere around tonight or tomorrow morning.
>>
>> Apparently you can set the environment variable PYTHONIOENCODING before
>> calling Py_Initialize(), I'll give it a look as well.
>>
> Thomas
>
> ------------------------------------------------------------------------------
> Rapidly troubleshoot problems before they affect your business. Most IT
> organizations don't have a clear picture of how application performance
> affects their revenue. With AppDynamics, you get 100% visibility into your
> Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics Pro!
> http://pubads.g.doubleclick.net/gampad/clk?id=84349351&iu=/4140/ostg.clktrk
> _______________________________________________
> Py2exe-users mailing list
> Py2...@li...
> https://lists.sourceforge.net/lists/listinfo/py2exe-users
|
From: goat <moo...@gm...> - 2013-11-28 18:47:49
|
Ran into the spawn error in an odd way. My frozen exe wouldn't start when called by its relevant URI, regardless of where it was situated on the file system. Implementing the SetCurrentDirectory approach fixed it. The issue may be rooted somewhere else.

On 11/28/2013 4:37 PM, Thomas Heller wrote:
> BTW: Please post the answers to the mailing list!
>
> On 28.11.2013 14:47, goat wrote:
>>> Interesting idea. For executables; but this cannot be used for dlls...
>> It is possible to retrieve the literal path of a DLL regardless of the
>> exe it is attached to.
> Sure, but it is at least *not nice* to change the current directory
> of an exe from a dll when it is loaded.
>
>> I have to test the functionality around, not 100%
>> of the calls to perform. Would also need to detect whether the code is
>> called as an exe or a DLL, nothing too hard either.
>>
>> I'm going to try and implement these fixes and submit a new patch
>> somewhere around tonight or tomorrow morning.
>>
>> Apparently you can set the environment variable PYTHONIOENCODING before
>> calling Py_Initialize(), I'll give it a look as well.
>>
> Thomas
|
From: Thomas H. <th...@ct...> - 2013-11-28 20:40:18
|
I experimented a bit. Replaced all the Windows API calls by their _W variant, and used wchar_t throughout the code in start.c.

Then I constructed a PyObject* containing a list with one unicode object in it, created with PyUnicode_FromWideChar(). Extended the Python boot code (in py2exe\boot_common.py) to print sys.path; but the result was this (a byte path):

PATH ['c:\\Users\\thomas\\devel\\py2exe\\test\\??\\sme.exe']

So, it seems that Python converts unicode objects on sys.path to byte strings, probably with the mbcs encoding, and this byte string cannot represent these Chinese characters on my system.

The only idea that I still have is to construct the zipimporter object (which contains the archive with the *strange* filename already) in the py2exe startup code (in C), and inject it into sys.meta_path or whatever before any imports in Python take place. No idea if this will work or not.

On 28.11.2013 16:37, Thomas Heller wrote:
> BTW: Please post the answers to the mailing list!
>
> On 28.11.2013 14:47, goat wrote:
>>
>>> Interesting idea. For executables; but this cannot be used for dlls...
>>
>> It is possible to retrieve the literal path of a DLL regardless of the
>> exe it is attached to.
>
> Sure, but it is at least *not nice* to change the current directory
> of an exe from a dll when it is loaded.
>
>> I have to test the functionality around, not 100%
>> of the calls to perform. Would also need to detect whether the code is
>> called as an exe or a DLL, nothing too hard either.
>>
>> I'm going to try and implement these fixes and submit a new patch
>> somewhere around tonight or tomorrow morning.
>>
>> Apparently you can set the environment variable PYTHONIOENCODING before
>> calling Py_Initialize(), I'll give it a look as well.

PYTHONIOENCODING is used only for stdin/stdout/stderr.
|
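The '??' in the sys.path entry reported above is what a lossy conversion to a locale-bound code page looks like. A minimal sketch of the effect, with two assumptions stated loudly: 'cp1252' is used as a stand-in for what 'mbcs' resolves to on a Western-European Windows (the 'mbcs' codec itself only exists on Windows), and the two Chinese characters in the directory name are made up, since the original ones were lost in the report.

```python
# Hypothetical path: the directory name (two Chinese characters) is invented
# for illustration; the rest mirrors the path printed in the message.
path = u"c:\\Users\\thomas\\devel\\py2exe\\test\\\u6d4b\u8bd5\\sme.exe"

# Assumption: cp1252 stands in for the locale-bound 'mbcs' code page.
# Characters the code page cannot represent are replaced by '?'.
lossy = path.encode("cp1252", "replace")
print(lossy)

# UTF-8 can represent every character, so the round trip is lossless.
utf8 = path.encode("utf-8")
assert utf8.decode("utf-8") == path

# The code-page round trip cannot recover the original directory name.
assert lossy.decode("cp1252") != path
```

Once the characters have been replaced, the byte string no longer names an existing directory, which would be consistent with the start-up failure described in the thread.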
From: goat <moo...@gm...> - 2013-11-28 20:57:43
|
mbcs is locale-bound, which is the very reason I'm trying to feed paths to Python in UTF-8.

Is it possible to overload import's behavior? One could open all the required scripts, pass the stream id to Python, then overload import to first check for an existing id and otherwise defer to the usual behavior.

On 11/28/2013 9:42 PM, Thomas Heller wrote:
> I experimented a bit. Replaced all the windows api calls by their _W
> variant, and used wchar_t throughout the code in start.c.
>
> Then I constructed a PyObject* containing a list with one unicode
> object in it, created with PyUnicode_FromWideChar().
> Extended the python boot code (in py2exe\boot_common.py) to
> print sys.path; but the result was this (a byte path):
>
> PATH ['c:\\Users\\thomas\\devel\\py2exe\\test\\??\\sme.exe']
>
> So, it seems that Python converts unicode objects on sys.path
> to byte strings, probably with the mbcs encoding, and this byte string
> cannot represent these chinese characters on my system.
>
> The only idea that I still have is to construct the zipimporter object
> (which contains the archive with the *strange* filename already) in the
> py2exe startup code (in C), and inject it into sys.metapath or whatever
> before any imports in Python takes place. No idea if this will work or
> not.
>
> On 28.11.2013 16:37, Thomas Heller wrote:
>> BTW: Please post the answers to the mailing list!
>>
>> On 28.11.2013 14:47, goat wrote:
>>>> Interesting idea. For executables; but this cannot be used for dlls...
>>> It is possible to retrieve the literal path of a DLL regardless of the
>>> exe it is attached to.
>> Sure, but it is at least *not nice* to change the current directory
>> of an exe from a dll when it is loaded.
>>
>>> I have to test the functionality around, not 100%
>>> of the calls to perform. Would also need to detect whether the code is
>>> called as an exe or a DLL, nothing too hard either.
>>>
>>> I'm going to try and implement these fixes and submit a new patch
>>> somewhere around tonight or tomorrow morning.
>>>
>>> Apparently you can set the environment variable PYTHONIOENCODING before
>>> calling Py_Initialize(), I'll give it a look as well.
> PYTHONIOENCODING is used only for stdin/stdout/stderr.
|
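The "check for a preloaded script first, otherwise defer to the usual behavior" idea can be sketched with an import hook on sys.meta_path. Several assumptions apply: the thread concerns Python 2.7, where the hook protocol is find_module/load_module, while this sketch uses the Python 3 importlib API; the "stream id" is swapped for an in-memory dict of source strings; and the names `PreloadedSourceFinder` and `hello_preloaded` are invented for illustration. It is not how py2exe does it.

```python
import importlib.abc
import importlib.util
import sys


class PreloadedSourceFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Serve modules from an in-memory {name: source} mapping.

    Hypothetical class for illustration. Returning None from find_spec
    hands the request on to the regular import machinery.
    """

    def __init__(self, sources):
        self._sources = dict(sources)

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self._sources:
            return None  # not preloaded: defer to the usual importers
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # let the import system create a default module object

    def exec_module(self, module):
        source = self._sources[module.__name__]
        code = compile(source, "<preloaded %s>" % module.__name__, "exec")
        exec(code, module.__dict__)


# Register the finder ahead of the default ones.
finder = PreloadedSourceFinder({"hello_preloaded": "GREETING = 'hi'\n"})
sys.meta_path.insert(0, finder)

import hello_preloaded  # served from the in-memory mapping
import json  # not in the mapping, so the standard importers handle it
```

Because the module source never has to be located through a path string, a hook of this kind sidesteps the path-encoding question for the modules it serves; whether that is practical at interpreter start-up is left open in the thread.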