From: SourceForge.net <no...@so...> - 2010-10-28 10:40:58
Feature Requests item #3090894, was opened at 2010-10-19 17:41
Message generated for change (Comment added) made by jimfcarroll
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=351645&aid=3090894&group_id=1645

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: None
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Jim (jimfcarroll)
Assigned to: Nobody/Anonymous (nobody)
Summary: Python 2.x unicode handling for std::string

Initial Comment:
I'm not sure why, but while SWIG handles Unicode in Python 3, it doesn't
handle it in Python 2.x. The following test case should work:

C++ code
-----------------
...
void func(std::string val)
{
  printf ("%s",val.c_str());
}
...
-----------------

python 2.x code:
-----------------
...
module.func(unicode('hello world'))
...

I have attached a patch that allows this to work. The patch applies to
[src]/Lib/python/pystrings.swg.

----------------------------------------------------------------------

>Comment By: Jim (jimfcarroll)
Date: 2010-10-28 05:40

Message:
nitrogenycs,

I do not believe you are correct: this patch doesn't change existing
behavior. It doesn't convert the input to UTF-8 UNLESS the input is a
Unicode object. Currently, if you pass a Unicode object to a C++ parameter
that takes a std::string, you get an exception. With the patch, this will
now work.

Everything else should remain the same. That is, if you pass a Python
string (no matter what it contains), it operates as before. Note:

[code]
if (PyUnicode_Check(obj))
{
  ...
}
else
  PyString_AsStringAndSize(obj, &cstr, &len);
[/code]

Also, this is much more consistent with the Python 3 behavior, which
handles Unicode strings ONLY and handles them the same way this patch
makes Python 2.x behave.
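
The two-branch dispatch described in the comment above can be modeled outside SWIG. What follows is only a rough sketch under stated assumptions, not the attached patch: it is written for Python 3, where `str` stands in for the Python 2 `unicode` type and `bytes` stands in for the Python 2 `str` type. The function name `as_std_string_bytes` is hypothetical and exists only for illustration.

```python
def as_std_string_bytes(obj):
    """Return the bytes a std::string parameter would receive.

    Hypothetical helper for illustration only; not part of SWIG.
    Assumes Python 3: str models Python 2 unicode, bytes models Python 2 str.
    """
    if isinstance(obj, str):
        # Models the PyUnicode_Check branch: text is encoded as UTF-8.
        return obj.encode("utf-8")
    if isinstance(obj, bytes):
        # Models the PyString_AsStringAndSize branch: bytes pass through unchanged.
        return obj
    raise TypeError("expected text or bytes, got %s" % type(obj).__name__)


if __name__ == "__main__":
    print(as_std_string_bytes("hello world"))
    print(as_std_string_bytes(b"\x00\xff raw bytes"))
```

Under this model, byte strings reach the C++ side untouched, which matches the claim that existing callers are unaffected. It also makes the objection raised in the thread concrete: a text object passed to a function that expects binary data would be encoded rather than rejected.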
----------------------------------------------------------------------

Comment By: nitro (nitrogenycs)
Date: 2010-10-27 20:04

Message:
Hmm, this patch seems to try and convert the input argument to UTF-8
implicitly. For my own projects I do not want such behaviour. Sometimes
std::strings are used as containers for binary data. Passing a unicode
object into such a function should fail. There might be more examples where
implicit conversion to UTF-8 is not desirable, e.g. somebody expecting
their function to take a different encoding than UTF-8.
The patch will cause subtle breakage if it performs implicit encoding
conversions.

----------------------------------------------------------------------

Comment By: Jim (jimfcarroll)
Date: 2010-10-27 18:06

Message:
It's in 1.6 (Sept 2000).

http://docs.python.org/release/1.6/api/unicodeObjects.html

It's not in 1.5. 1.5 was the first real release (or at least the oldest one
referenced on the official Python site other than 0.9.2, which they only
have through a third-party reference).

Do we need to check if it's been available for that long?

----------------------------------------------------------------------

Comment By: Jim (jimfcarroll)
Date: 2010-10-27 17:54

Message:
I'll do some digging. I'm using 2.6 but I know it was available in 2.4.

Thanks

----------------------------------------------------------------------

Comment By: William Fulton (wsfulton)
Date: 2010-10-27 17:25

Message:
Your patch assumes PyUnicode_Check is available. In which version of Python
did PyUnicode_Check first become available? Probably any use of it needs to
be within a #ifdef PY_VERSION_HEX. If it has always been in Python, then we
can probably apply this patch.

----------------------------------------------------------------------