pyxser-users Mailing List for Python XML Serialization
Brought to you by:
damowe
Archive by month:
2010: May (8), Aug (18), Nov (2)
2011: Mar (2)

From: Daniel M. W. <dm...@co...> - 2011-03-24 18:28:27

On Thursday 24 March 2011, "Lee Tambeau" <Lee...@so...> wrote:
> Gang,

Hello Gang,

> Is there a release that runs on Windows 7 ... is there an install for it
> ... do I need to rebuild under Windows ... Advise

No, this is only the source distribution. You must tune the setup.py script, which uses Python utilities to build binaries (I don't know if it works on Windows Python distributions), and then run the setup scripts:

    python setup.py build --force && python setup.py install --force

It requires a C compiler (I don't know which one can be used on Windows).

> Lee

Best regards,
-- 
Daniel Molina Wegener <dmw [at] coder [dot] cl>
System Programmer & Web Developer
Phone: +56 (2) 979-0277 | Blog: http://coder.cl/

From: Lee T. <Lee...@so...> - 2011-03-24 17:23:06

Gang,

Is there a release that runs on Windows 7 ... is there an install for it ... do I need to rebuild under Windows ... Advise

Lee

From: Daniel M. W. <dm...@co...> - 2010-11-11 12:52:23

On Thursday 11 November 2010, Sean Cameron <snc...@gm...> wrote:
> I have tried both SetupTools and downloading the tarball and using
> setup.py and get the same error on two machines. The first machine is
> a Windows 7 x64 box running the amd64 version of Python 2.7. The
> second is a Windows XP box running the x86 version of Python. Both
> produce exactly the same error. I have found a few references to it
> when searching, but not real solutions. It seems to point towards
> incompatibilities with Unicode strings.

The problem is that pyxser requires pkg-tools to install. I don't know whether that kind of package is available on Windows, but without package tools the pyxser installer cannot detect its dependencies, such as libxml2. If you find a better way to detect dependencies, other than using pkg-tools, please let me know.

> C:\pyxser-1.5.1r>python ./setup.py --verbose clean --all
> *** library_dirs reconfigured...
> *** include_dirs reconfigured...
> Traceback (most recent call last):
>   File "./setup.py", line 103, in <module>
>     **pyxser_params)
> TypeError: __init__() keywords must be strings
>
> Regards,
> Sean Cameron
> Procom
> www.procomdesign.com
> Tel +27 21 715 4000
> Fax +27 721 715 2535

Best regards,
-- 
Daniel Molina Wegener <dmw [at] coder [dot] cl>
System Programmer & Web Developer
Phone: +56 (2) 979-0277 | Blog: http://coder.cl/

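One possible answer to the dependency-detection question above, as a minimal sketch only. It assumes that "pkg-tools" refers to pkg-config, and it falls back to libxml2's own xml2-config script when pkg-config is missing. The function names `parse_flags` and `detect_libxml2` are hypothetical and are not part of pyxser's setup.py.

```python
import shlex
import shutil
import subprocess


def parse_flags(output):
    """Split compiler/linker flags into distutils-style keyword lists."""
    params = {"include_dirs": [], "library_dirs": [], "libraries": []}
    for flag in shlex.split(output):
        if flag.startswith("-I"):
            params["include_dirs"].append(flag[2:])
        elif flag.startswith("-L"):
            params["library_dirs"].append(flag[2:])
        elif flag.startswith("-l"):
            params["libraries"].append(flag[2:])
    return params


def detect_libxml2():
    """Return build parameters for libxml2, or None if no helper tool works."""
    candidates = (
        ["pkg-config", "--cflags", "--libs", "libxml-2.0"],
        ["xml2-config", "--cflags", "--libs"],
    )
    for command in candidates:
        if shutil.which(command[0]) is None:
            continue  # tool not installed, try the next candidate
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode == 0:
            return parse_flags(result.stdout)
    return None  # caller must supply the paths by hand


if __name__ == "__main__":
    print(detect_libxml2())
```

When neither tool is available (the likely case on a stock Windows install), the function returns None, so the build script can fail with a clear message or accept manually supplied paths. Note that `shlex.split` uses POSIX quoting rules, so Windows paths containing backslashes would need different handling.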
From: Sean C. <snc...@gm...> - 2010-11-11 07:57:59

I have tried both SetupTools and downloading the tarball and using setup.py and get the same error on two machines. The first machine is a Windows 7 x64 box running the amd64 version of Python 2.7. The second is a Windows XP box running the x86 version of Python. Both produce exactly the same error. I have found a few references to it when searching, but not real solutions. It seems to point towards incompatibilities with Unicode strings.

C:\pyxser-1.5.1r>python ./setup.py --verbose clean --all
*** library_dirs reconfigured...
*** include_dirs reconfigured...
Traceback (most recent call last):
  File "./setup.py", line 103, in <module>
    **pyxser_params)
TypeError: __init__() keywords must be strings

Regards,
Sean Cameron
Procom
www.procomdesign.com
Tel +27 21 715 4000
Fax +27 721 715 2535

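The TypeError in the traceback above is what Python raises when a dictionary unpacked with `**` has a key that is not acceptable as a keyword name. The sketch below reproduces that class of error on a current Python 3 interpreter. The thread does not establish which key in `pyxser_params` was at fault, so `configure`, `call_with_checked_keywords`, and the sample dictionaries are all hypothetical stand-ins.

```python
def configure(**params):
    """Stand-in for a setup()-style function that accepts keyword arguments."""
    return sorted(params)


def call_with_checked_keywords(func, params):
    """Call func(**params), naming any non-string keys before Python rejects them."""
    bad_keys = [key for key in params if not isinstance(key, str)]
    if bad_keys:
        raise TypeError("non-string keyword names: %r" % (bad_keys,))
    return func(**params)


good = {"include_dirs": ["/usr/include/libxml2"], "libraries": ["xml2"]}
bad = {None: ["/usr/include/libxml2"]}  # a key that is not a string

print(call_with_checked_keywords(configure, good))

try:
    configure(**bad)  # Python itself refuses this call
except TypeError as exc:
    print("TypeError:", exc)
```

Checking the keys before the call does not fix the underlying problem, but it turns an opaque message into one that names the offending keys.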
From: Daniel M. W. <dm...@co...> - 2010-08-25 06:59:51

On Wednesday 25 August 2010, Vardan Akopian <vak...@gm...> wrote:
> Hi Daniel,

Hello,

I think that it is fixed, please try again. There were unnecessary argument swapping instructions, I've removed them...

Thanks for your feedback.

> The latest version (r177 I think), works ok on 32bit CentOS: all the
> tests pass ok, and my own app works ok.
> But on 64bit ubuntu the test-utf8.py segfaults, even though all the
> other tests pass ok.
>
> I think it happens in test-utf8.py:90
> It looks like in pyxser_collections.c:154 dupItems is an invalid pointer
> (0xb). Here is the top of the back-trace for the segfault:
>
> #0  0x00007ffff676bf47 in pyxserList_CheckExact (o=<unknown at remote 0xb>) at ./src/pyxser_tools.c:515
> #1  0x00007ffff676d480 in pyxser_PyListContains (lst=0xb, o=<value optimized out>) at ./src/pyxser_tools.c:384
> #2  0x00007ffff676f3c5 in pyxser_RunSerializationCol (args=0x7fffffffde30) at ./src/pyxser_collections.c:160
> #3  0x00007ffff676fa8f in pyxser_GlobalDictSerialization (args=0x7fffffffde30) at ./src/pyxser_collections.c:462
> #4  0x00007ffff67717fe in pyxser_RunSerialization (args=0x7fffffffde30) at ./src/pyxser_serializer.c:297
> #5  0x00007ffff67714d1 in pyxser_SerializeXml (args=0x7fffffffde30) at ./src/pyxser_serializer.c:213
> #6  0x00007ffff676b44a in pyxserxml (self=<value optimized out>, args=<value optimized out>, keywds=<value optimized out>) at ./src/pyxser.c:598
>
> -Vardan

Best regards,
-- 
Daniel Molina Wegener <dmw [at] coder [dot] cl>
System Programmer & Web Developer
Phone: +56 (2) 979-0277 | Blog: http://coder.cl/

From: Vardan A. <vak...@gm...> - 2010-08-25 05:50:10

Hi Daniel,

The latest version (r177 I think), works ok on 32bit CentOS: all the tests pass ok, and my own app works ok.
But on 64bit ubuntu the test-utf8.py segfaults, even though all the other tests pass ok.

I think it happens in test-utf8.py:90
It looks like in pyxser_collections.c:154 dupItems is an invalid pointer (0xb). Here is the top of the back-trace for the segfault:

#0  0x00007ffff676bf47 in pyxserList_CheckExact (o=<unknown at remote 0xb>) at ./src/pyxser_tools.c:515
#1  0x00007ffff676d480 in pyxser_PyListContains (lst=0xb, o=<value optimized out>) at ./src/pyxser_tools.c:384
#2  0x00007ffff676f3c5 in pyxser_RunSerializationCol (args=0x7fffffffde30) at ./src/pyxser_collections.c:160
#3  0x00007ffff676fa8f in pyxser_GlobalDictSerialization (args=0x7fffffffde30) at ./src/pyxser_collections.c:462
#4  0x00007ffff67717fe in pyxser_RunSerialization (args=0x7fffffffde30) at ./src/pyxser_serializer.c:297
#5  0x00007ffff67714d1 in pyxser_SerializeXml (args=0x7fffffffde30) at ./src/pyxser_serializer.c:213
#6  0x00007ffff676b44a in pyxserxml (self=<value optimized out>, args=<value optimized out>, keywds=<value optimized out>) at ./src/pyxser.c:598

-Vardan

From: Vardan A. <vak...@gm...> - 2010-08-23 16:34:54

On Mon, Aug 23, 2010 at 9:00 AM, Daniel Molina Wegener <dm...@co...> wrote:
> [...]
> Hello Vardan,
>
> I was reviewing and enhancing pyxser. I've made some changes since
> I talked with you. Please checkout and review r167 from the
> SVN repository and install the trunk version.
>
> I've added latin-1 and ascii tests and a memory profiling test.
> Everything went fine with all tests here, also I've added more
> checks and reviewed the extension with valgrind, so I think
> that it does not have bugs. Also it passed the memory leak checking
> test.
>
> The next release of pyxser will be more stable and fast, I've
> gained around 15% of performance with the new lazy initialization
> routines.
>
> Best regards,
> --
> Daniel Molina Wegener <dmw [at] coder [dot] cl>
> System Programmer & Web Developer
> Phone: +56 (2) 979-0277 | Blog: http://coder.cl/

Hi Daniel,

Just tested r167 and it worked well. It even fixed the problem I was seeing with properties starting with underscores. So it really was not an encoding-related issue on my side (I really only use ascii in this project, so any encoding should work).

Thanks.
-Vardan

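A regression like the empty or wrong name attributes discussed in this thread can be caught with a small check over the serialized XML. The following is a rough sketch under stated assumptions: the namespace URI is a placeholder invented for the example (the thread never shows the real one), the sample document is hand-written in the shape of the quoted output, and `props_with_bad_names` is a hypothetical helper, not a pyxser function.

```python
import xml.etree.ElementTree as ET

# Assumption: placeholder namespace URI, chosen only for this sketch.
PYXS_NS = "urn:example:pyxser"

SAMPLE = (
    '<pyxs:obj xmlns:pyxs="urn:example:pyxser" module="app" type="User" objid="id1">'
    '<pyxs:prop type="unicode" name="name" size="5">admin</pyxs:prop>'
    '<pyxs:prop type="unicode" name="" size="5">Admin</pyxs:prop>'
    '<pyxs:prop type="unicode" name="_token" size="3">abc</pyxs:prop>'
    "</pyxs:obj>"
)


def props_with_bad_names(xml_text):
    """Return the text of every prop element whose name is missing, empty, or ':'."""
    root = ET.fromstring(xml_text)
    bad = []
    for prop in root.iter("{%s}prop" % PYXS_NS):
        name = prop.get("name", "")
        if name in ("", ":"):
            bad.append(prop.text)
    return bad


print(props_with_bad_names(SAMPLE))
```

Including an attribute that starts with an underscore, as in the sample, covers the case mentioned in the message above.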
From: Daniel M. W. <dm...@co...> - 2010-08-23 16:01:16

On Sunday 22 August 2010, Vardan Akopian <vak...@gm...> wrote:
> [...]
> r164 did not fix the problem, the names are still wrong. In my
> understanding this means that PyUnicode_FromObject does not return
> NULL, but whatever it returns is not good for passing to
> PyUnicode_Encode. I'm actually a bit surprised that you used
> PyUnicode_FromObject, since that returns PyObject*, while
> PyUnicode_Encode is expecting a Py_UNICODE*. Is casting from PyObject*
> to Py_UNICODE* ok?
>
> -Vardan

Hello Vardan,

I was reviewing and enhancing pyxser. I've made some changes since I talked with you. Please checkout and review r167 from the SVN repository and install the trunk version.

I've added latin-1 and ascii tests and a memory profiling test. Everything went fine with all tests here, also I've added more checks and reviewed the extension with valgrind, so I think that it does not have bugs. Also it passed the memory leak checking test.

The next release of pyxser will be more stable and fast, I've gained around 15% of performance with the new lazy initialization routines.

Best regards,
-- 
Daniel Molina Wegener <dmw [at] coder [dot] cl>
System Programmer & Web Developer
Phone: +56 (2) 979-0277 | Blog: http://coder.cl/

From: Vardan A. <vak...@gm...> - 2010-08-23 03:30:02
|
On Sun, Aug 22, 2010 at 7:54 PM, Daniel Molina Wegener <dm...@co...> wrote: > On Sunday 22 August 2010, > Vardan Akopian <vak...@gm...> wrote: > >> On Sun, Aug 22, 2010 at 4:20 PM, Vardan Akopian <vak...@gm...> > wrote: >> > On Sun, Aug 22, 2010 at 3:12 PM, Daniel Molina Wegener <dm...@co...> > wrote: >> >> On Sunday 22 August 2010, >> >> >> >> Vardan Akopian <vak...@gm...> wrote: >> >>> Hi Daniel, >> >>> >> >>> > > The fix is to avoid "from ... import *" constructs. Please, see >> >>> > > the >> >>> > > >> >>> > > attached patch for this. >> >>> > >> >>> > OK, seems that the list filtering has a problem with attachments, >> >>> > >> >>> > can you send it as gzip archive? >> >>> >> >>> Please see the attached gz file. >> >>> >> >>> > > Then I tried using this version with my real world application >> >>> > > that >> >>> > > >> >>> > > actually loads objects through sqlalchemy and tries to serialize >> >>> > > them. I >> >>> > > >> >>> > > encountered another segfault. With a bit of debugging (gdb and >> >>> > > valgrind) >> >>> > > >> >>> > > I narrowed down the problem to pyxser_collections.c:138, where >> >>> > > you have: >> >>> > > >> >>> > > PyListObject *dupItems = *args->dupSrcItems >> >>> > >> >>> > OK, it was fixed on r161 >> >>> >> >>> Not to be too pedantic, but I think now the check on lines 146-148 is >> >>> redundant since it's already checked on line 152. >> >>> >> >>> > > In my case args->dupSrcItems is NULL, so this will cause a >> >>> > > problem. Once >> >>> > > >> >>> > > I added a null check with an early return (similar to the check >> >>> > > on line >> >>> > > >> >>> > > 142), the problem got resolved and serialization worked. Please >> >>> > > let me >> >>> > > >> >>> > > know if you'd like a patch for this. >> >>> > >> >>> > Yep, I didn't see that bug before. 
At other side, I've made many >> >>> > >> >>> > enhancements to the serialization algorithm and I've added some >> >>> > checks >> >>> > >> >>> > to make the serialization process a little bit more strict. So, you >> >>> > >> >>> > can test the r161 and see what happens to SQL Alchemy objects. >> >>> >> >>> Works ok with the provided test files. >> >>> >> >>> > > After this I tried to serialize the same object, but using >> >>> > > enc="ascii" or >> >>> > > >> >>> > > enc="latin1", and got segfaults with both. This time it was in >> >>> > > >> >>> > > pyxser_strings.c:107. The debugger shows that name is not NULL, >> >>> > > but has >> >>> > > >> >>> > > an invalid pointer (0x14). Something is probably going wrong in >> >>> > > >> >>> > > pyxser_serializer.c:281, where name is calculated using the >> >>> > > >> >>> > > PYXSER_GET_ATTR_NAME macro. But I could not narrow down much >> >>> > > more. I >> >>> > > >> >>> > > could send you back trace for this, so please let me know. >> >>> > >> >>> > OK, those errors were removed, now it is serializing any encoding >> >>> > >> >>> > supported by both, Python codecs and LibXML 2 codecs. Please for >> >>> > >> >>> > /latin-./ encodings, use /iso-8859-.*/ form, since it is recognized >> >>> > >> >>> > by both, Python and LibXML2, by default it handles as ascii codec >> >>> > >> >>> > if you try with enc = 'latin-1', you need to use enc = 'iso-8859-1' >> >>> > >> >>> > instead. >> >>> >> >>> I tried with 'iso-8859-1', and got the same segfault. So I debugged a >> >>> bit more, and here is what I found out: I have a situation where in >> >>> the PYXSER_GET_ATTR_NAME macro, even tough >> >>> pyxserUnicode_Check(currentKey) returns 1, PyUnicode_Encode still >> >>> returns NULL. After this it segfauls pretty quickly, since args->name >> >>> becomes an invalid pointer. According to the Python docs, >> >>> PyUnicode_Encode will return NULL if "an exception is raised by the >> >>> codec". 
I could not find anything in the docs that explains how to >> >>> retrieve the error from the codec (but then again I am a complete >> >>> python noob ;-)). I also don't know if this situation can be handled. >> >>> But I think pyxser should at least check for it and raise a python >> >>> exception, instead of letting it segfault. Of course any other >> >>> information for the reason of the failure would be great. >> >>> >> >>> > > And finally, I attach here the output of the profiling command, >> >>> > > as you >> >>> > > >> >>> > > asked. >> >>> > >> >>> > Thanks for the profiling command, this is very useful on what >> >>> > refers to >> >>> > >> >>> > performance enhancements. As I've said, I've added some lazy >> >>> > initializations >> >>> > >> >>> > and pyxser now runs a little bit faster, and also it has less hard >> >>> > disc >> >>> > >> >>> > reads :) >> >>> > >> >>> > Tell what happens with r161, and take a look on this page: >> >>> > >> >>> > http://coder.cl/2010/08/ann-pyxser-1-4-6r-released/ >> >>> > >> >>> > There is a small tip on how to serialize any SQL Alchemy DTO. Be >> >>> > careful >> >>> > >> >>> > with those objects, the default serialization, with 50 nodes, can >> >>> > go very >> >>> > >> >>> > deep in the object tree, without the desired results. But test that >> >>> > serialization, it will help to know if the changes that I've added >> >>> > to r161 >> >>> > >> >>> > are OK or not. >> >>> >> >>> Thanks for that, I will definitely take a look at the selectors and >> >>> will use them. For now I'm just trying to get the simple usage >> >>> working. >> >>> >> >>> BTW, is pyxser really incompatible with python 2.7? Currently the >> >>> version check in setup.py excludes 2.7. But when I change it, it >> >>> seams to work fine. Unless you know for a fact that 2.7 cannot be >> >>> supported, it might be a good idea to allow it in setup.py. >> >> >> >> I think that I've killed that bug. Please, can you test the r163. 
>> >> I've finished my tests with your patch over SQL Alchemy subtests and >> >> also removed some memory leaks from r160. >> >> >> >> Thanks again for your feedback :) >> > >> > Looks good, no more segfaults, and I even get an error message about >> > the encoding problem ;-) >> > I'm gonna start playing with the selectors. I'll let you know if I >> > find any problems. >> > >> > Thanks for all the support. >> > -Vardan >> >> I think I spoke too soon when I said it looks good: with the current >> trunk all the name attributes are either empty or wrong (I use >> enc="utf-8"). So I get properties like this: >> <pyxs:obj module="datetime" type="datetime" name=":" >> objid="id152358784"/> <pyxs:prop type="unicode" name="" >> size="5">admin</pyxs:prop> >> <pyxs:prop type="unicode" name="" size="5">Admin</pyxs:prop> >> >> This was ok in r161, but got broken in r162. By looking at the code I >> saw that in r161 you used PyUnicode_AS_UNICODE(currentKey), but in >> r162 you used PyUnicode_FromObject(currentKey). If I change it back to >> PyUnicode_AS_UNICODE, the names become ok again: >> <pyxs:obj module="datetime" type="datetime" name="created" >> objid="id172220336"/> >> <pyxs:prop type="unicode" name="name" size="5">admin</pyxs:prop> >> <pyxs:prop type="unicode" name="last_name" size="5">Admin</pyxs:prop> >> >> I'm not sure if this is the proper fix, but it helps in my case. BTW, >> your test-utf8-sqlalchemy.py works ok with or without this change. > > OK, it was quick to patch, I've added PyUnicode_AS_UNICODE as an alternate > method, please review: > Transmitting file data . > Committed revision 164. > > Current revision is r164, if it works, please let me know... > r164 did not fix the problem, the names are still wrong. In my understanding this means that PyUnicode_FromObject does not return NULL, but whatever it returns is not good for passing to PyUnicode_Encode. 
I'm actually a bit surprised that you used PyUnicode_FromObject, since that returns PyObject*, while PyUnicode_Encode is expecting a Py_UNICODE*. Is casting from PyObject* to Py_UNICODE* ok? -Vardan |
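A note for readers following the PyUnicode_Encode failure discussed in this thread. In the C API, a NULL return means an exception has already been set, and it can be inspected with PyErr_Occurred() or PyErr_Fetch(). At the Python level the same condition (a codec that cannot represent a character) surfaces as a UnicodeEncodeError. The following is only a minimal Python sketch of "check and raise instead of crashing"; it is not pyxser's code, and the helper name encode_attr_name is hypothetical:

```python
def encode_attr_name(name, enc):
    """Encode an attribute name, failing early with a readable error.

    Hypothetical helper for illustration only; pyxser does this work in C.
    """
    try:
        return name.encode(enc)
    except UnicodeEncodeError as exc:
        # The codec exists but cannot represent at least one character.
        raise ValueError(
            "cannot encode attribute name %r as %s: %s" % (name, enc, exc.reason)
        )
    except LookupError:
        # The codec name itself is unknown to Python.
        raise ValueError("unknown encoding: %r" % (enc,))


if __name__ == "__main__":
    print(encode_attr_name(u"last_name", "ascii"))
    try:
        encode_attr_name(u"n\u00e9", "ascii")
    except ValueError as err:
        print(err)
```

The point of the design is that the caller always gets either bytes or an exception, never an unusable value that fails later.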
From: Daniel M. W. <dm...@co...> - 2010-08-23 02:54:22
|
On Sunday 22 August 2010, Vardan Akopian <vak...@gm...> wrote: > On Sun, Aug 22, 2010 at 4:20 PM, Vardan Akopian <vak...@gm...> wrote: > > On Sun, Aug 22, 2010 at 3:12 PM, Daniel Molina Wegener <dm...@co...> wrote: > >> On Sunday 22 August 2010, > >> > >> Vardan Akopian <vak...@gm...> wrote: > >>> Hi Daniel, > >>> > >>> > > The fix is to avoid "from ... import *" constructs. Please, see > >>> > > the > >>> > > > >>> > > attached patch for this. > >>> > > >>> > OK, seems that the list filtering has a problem with attachments, > >>> > > >>> > can you send it as gzip archive? > >>> > >>> Please see the attached gz file. > >>> > >>> > > Then I tried using this version with my real world application > >>> > > that > >>> > > > >>> > > actually loads objects through sqlalchemy and tries to serialize > >>> > > them. I > >>> > > > >>> > > encountered another segfault. With a bit of debugging (gdb and > >>> > > valgrind) > >>> > > > >>> > > I narrowed down the problem to pyxser_collections.c:138, where > >>> > > you have: > >>> > > > >>> > > PyListObject *dupItems = *args->dupSrcItems > >>> > > >>> > OK, it was fixed on r161 > >>> > >>> Not to be too pedantic, but I think now the check on lines 146-148 is > >>> redundant since it's already checked on line 152. > >>> > >>> > > In my case args->dupSrcItems is NULL, so this will cause a > >>> > > problem. Once > >>> > > > >>> > > I added a null check with an early return (similar to the check > >>> > > on line > >>> > > > >>> > > 142), the problem got resolved and serialization worked. Please > >>> > > let me > >>> > > > >>> > > know if you'd like a patch for this. > >>> > > >>> > Yep, I didn't see that bug before. At other side, I've made many > >>> > > >>> > enhancements to the serialization algorithm and I've added some > >>> > checks > >>> > > >>> > to make the serialization process a little bit more strict. So, you > >>> > > >>> > can test the r161 and see what happens to SQL Alchemy objects. 
> >>> > >>> Works ok with the provided test files. > >>> > >>> > > After this I tried to serialize the same object, but using > >>> > > enc="ascii" or > >>> > > > >>> > > enc="latin1", and got segfaults with both. This time it was in > >>> > > > >>> > > pyxser_strings.c:107. The debugger shows that name is not NULL, > >>> > > but has > >>> > > > >>> > > an invalid pointer (0x14). Something is probably going wrong in > >>> > > > >>> > > pyxser_serializer.c:281, where name is calculated using the > >>> > > > >>> > > PYXSER_GET_ATTR_NAME macro. But I could not narrow down much > >>> > > more. I > >>> > > > >>> > > could send you back trace for this, so please let me know. > >>> > > >>> > OK, those errors were removed, now it is serializing any encoding > >>> > > >>> > supported by both, Python codecs and LibXML 2 codecs. Please for > >>> > > >>> > /latin-./ encodings, use /iso-8859-.*/ form, since it is recognized > >>> > > >>> > by both, Python and LibXML2, by default it handles as ascii codec > >>> > > >>> > if you try with enc = 'latin-1', you need to use enc = 'iso-8859-1' > >>> > > >>> > instead. > >>> > >>> I tried with 'iso-8859-1', and got the same segfault. So I debugged a > >>> bit more, and here is what I found out: I have a situation where in > >>> the PYXSER_GET_ATTR_NAME macro, even tough > >>> pyxserUnicode_Check(currentKey) returns 1, PyUnicode_Encode still > >>> returns NULL. After this it segfauls pretty quickly, since args->name > >>> becomes an invalid pointer. According to the Python docs, > >>> PyUnicode_Encode will return NULL if "an exception is raised by the > >>> codec". I could not find anything in the docs that explains how to > >>> retrieve the error from the codec (but then again I am a complete > >>> python noob ;-)). I also don't know if this situation can be handled. > >>> But I think pyxser should at least check for it and raise a python > >>> exception, instead of letting it segfault. 
Of course any other > >>> information for the reason of the failure would be great. > >>> > >>> > > And finally, I attach here the output of the profiling command, > >>> > > as you > >>> > > > >>> > > asked. > >>> > > >>> > Thanks for the profiling command, this is very useful on what > >>> > refers to > >>> > > >>> > performance enhancements. As I've said, I've added some lazy > >>> > initializations > >>> > > >>> > and pyxser now runs a little bit faster, and also it has less hard > >>> > disc > >>> > > >>> > reads :) > >>> > > >>> > Tell what happens with r161, and take a look on this page: > >>> > > >>> > http://coder.cl/2010/08/ann-pyxser-1-4-6r-released/ > >>> > > >>> > There is a small tip on how to serialize any SQL Alchemy DTO. Be > >>> > careful > >>> > > >>> > with those objects, the default serialization, with 50 nodes, can > >>> > go very > >>> > > >>> > deep in the object tree, without the desired results. But test that > >>> > serialization, it will help to know if the changes that I've added > >>> > to r161 > >>> > > >>> > are OK or not. > >>> > >>> Thanks for that, I will definitely take a look at the selectors and > >>> will use them. For now I'm just trying to get the simple usage > >>> working. > >>> > >>> BTW, is pyxser really incompatible with python 2.7? Currently the > >>> version check in setup.py excludes 2.7. But when I change it, it > >>> seams to work fine. Unless you know for a fact that 2.7 cannot be > >>> supported, it might be a good idea to allow it in setup.py. > >> > >> I think that I've killed that bug. Please, can you test the r163. > >> I've finished my tests with your patch over SQL Alchemy subtests and > >> also removed some memory leaks from r160. > >> > >> Thanks again for your feedback :) > > > > Looks good, no more segfaults, and I even get an error message about > > the encoding problem ;-) > > I'm gonna start playing with the selectors. I'll let you know if I > > find any problems. > > > > Thanks for all the support. 
> > -Vardan
>
> I think I spoke too soon when I said it looks good: with the current
> trunk all the name attributes are either empty or wrong (I use
> enc="utf-8"). So I get properties like this:
> <pyxs:obj module="datetime" type="datetime" name=":"
> objid="id152358784"/> <pyxs:prop type="unicode" name=""
> size="5">admin</pyxs:prop>
> <pyxs:prop type="unicode" name="" size="5">Admin</pyxs:prop>
>
> This was ok in r161, but got broken in r162. By looking at the code I
> saw that in r161 you used PyUnicode_AS_UNICODE(currentKey), but in
> r162 you used PyUnicode_FromObject(currentKey). If I change it back to
> PyUnicode_AS_UNICODE, the names become ok again:
> <pyxs:obj module="datetime" type="datetime" name="created"
> objid="id172220336"/>
> <pyxs:prop type="unicode" name="name" size="5">admin</pyxs:prop>
> <pyxs:prop type="unicode" name="last_name" size="5">Admin</pyxs:prop>
>
> I'm not sure if this is the proper fix, but it helps in my case. BTW,
> your test-utf8-sqlalchemy.py works ok with or without this change.

OK, it was quick to patch, I've added PyUnicode_AS_UNICODE as an alternate
method, please review:

Transmitting file data .
Committed revision 164.

Current revision is r164, if it works, please let me know...

>
> Thanks.
> -Vardan

Best regards,
--
Daniel Molina Wegener <dmw [at] coder [dot] cl>
System Programmer & Web Developer
Phone: +56 (2) 979-0277 | Blog: http://coder.cl/ |
From: Daniel M. W. <dm...@co...> - 2010-08-23 02:46:31
|
On Sunday 22 August 2010, Vardan Akopian <vak...@gm...> wrote: > On Sun, Aug 22, 2010 at 4:20 PM, Vardan Akopian <vak...@gm...> wrote: > > On Sun, Aug 22, 2010 at 3:12 PM, Daniel Molina Wegener <dm...@co...> wrote: > >> On Sunday 22 August 2010, > >> > >> Vardan Akopian <vak...@gm...> wrote: > >>> Hi Daniel, > >>> > >>> > > The fix is to avoid "from ... import *" constructs. Please, see > >>> > > the > >>> > > > >>> > > attached patch for this. > >>> > > >>> > OK, seems that the list filtering has a problem with attachments, > >>> > > >>> > can you send it as gzip archive? > >>> > >>> Please see the attached gz file. > >>> > >>> > > Then I tried using this version with my real world application > >>> > > that > >>> > > > >>> > > actually loads objects through sqlalchemy and tries to serialize > >>> > > them. I > >>> > > > >>> > > encountered another segfault. With a bit of debugging (gdb and > >>> > > valgrind) > >>> > > > >>> > > I narrowed down the problem to pyxser_collections.c:138, where > >>> > > you have: > >>> > > > >>> > > PyListObject *dupItems = *args->dupSrcItems > >>> > > >>> > OK, it was fixed on r161 > >>> > >>> Not to be too pedantic, but I think now the check on lines 146-148 is > >>> redundant since it's already checked on line 152. > >>> > >>> > > In my case args->dupSrcItems is NULL, so this will cause a > >>> > > problem. Once > >>> > > > >>> > > I added a null check with an early return (similar to the check > >>> > > on line > >>> > > > >>> > > 142), the problem got resolved and serialization worked. Please > >>> > > let me > >>> > > > >>> > > know if you'd like a patch for this. > >>> > > >>> > Yep, I didn't see that bug before. At other side, I've made many > >>> > > >>> > enhancements to the serialization algorithm and I've added some > >>> > checks > >>> > > >>> > to make the serialization process a little bit more strict. So, you > >>> > > >>> > can test the r161 and see what happens to SQL Alchemy objects. 
> >>> > >>> Works ok with the provided test files. > >>> > >>> > > After this I tried to serialize the same object, but using > >>> > > enc="ascii" or > >>> > > > >>> > > enc="latin1", and got segfaults with both. This time it was in > >>> > > > >>> > > pyxser_strings.c:107. The debugger shows that name is not NULL, > >>> > > but has > >>> > > > >>> > > an invalid pointer (0x14). Something is probably going wrong in > >>> > > > >>> > > pyxser_serializer.c:281, where name is calculated using the > >>> > > > >>> > > PYXSER_GET_ATTR_NAME macro. But I could not narrow down much > >>> > > more. I > >>> > > > >>> > > could send you back trace for this, so please let me know. > >>> > > >>> > OK, those errors were removed, now it is serializing any encoding > >>> > > >>> > supported by both, Python codecs and LibXML 2 codecs. Please for > >>> > > >>> > /latin-./ encodings, use /iso-8859-.*/ form, since it is recognized > >>> > > >>> > by both, Python and LibXML2, by default it handles as ascii codec > >>> > > >>> > if you try with enc = 'latin-1', you need to use enc = 'iso-8859-1' > >>> > > >>> > instead. > >>> > >>> I tried with 'iso-8859-1', and got the same segfault. So I debugged a > >>> bit more, and here is what I found out: I have a situation where in > >>> the PYXSER_GET_ATTR_NAME macro, even tough > >>> pyxserUnicode_Check(currentKey) returns 1, PyUnicode_Encode still > >>> returns NULL. After this it segfauls pretty quickly, since args->name > >>> becomes an invalid pointer. According to the Python docs, > >>> PyUnicode_Encode will return NULL if "an exception is raised by the > >>> codec". I could not find anything in the docs that explains how to > >>> retrieve the error from the codec (but then again I am a complete > >>> python noob ;-)). I also don't know if this situation can be handled. > >>> But I think pyxser should at least check for it and raise a python > >>> exception, instead of letting it segfault. 
Of course any other > >>> information for the reason of the failure would be great. > >>> > >>> > > And finally, I attach here the output of the profiling command, > >>> > > as you > >>> > > > >>> > > asked. > >>> > > >>> > Thanks for the profiling command, this is very useful on what > >>> > refers to > >>> > > >>> > performance enhancements. As I've said, I've added some lazy > >>> > initializations > >>> > > >>> > and pyxser now runs a little bit faster, and also it has less hard > >>> > disc > >>> > > >>> > reads :) > >>> > > >>> > Tell what happens with r161, and take a look on this page: > >>> > > >>> > http://coder.cl/2010/08/ann-pyxser-1-4-6r-released/ > >>> > > >>> > There is a small tip on how to serialize any SQL Alchemy DTO. Be > >>> > careful > >>> > > >>> > with those objects, the default serialization, with 50 nodes, can > >>> > go very > >>> > > >>> > deep in the object tree, without the desired results. But test that > >>> > serialization, it will help to know if the changes that I've added > >>> > to r161 > >>> > > >>> > are OK or not. > >>> > >>> Thanks for that, I will definitely take a look at the selectors and > >>> will use them. For now I'm just trying to get the simple usage > >>> working. > >>> > >>> BTW, is pyxser really incompatible with python 2.7? Currently the > >>> version check in setup.py excludes 2.7. But when I change it, it > >>> seams to work fine. Unless you know for a fact that 2.7 cannot be > >>> supported, it might be a good idea to allow it in setup.py. > >> > >> I think that I've killed that bug. Please, can you test the r163. > >> I've finished my tests with your patch over SQL Alchemy subtests and > >> also removed some memory leaks from r160. > >> > >> Thanks again for your feedback :) > > > > Looks good, no more segfaults, and I even get an error message about > > the encoding problem ;-) > > I'm gonna start playing with the selectors. I'll let you know if I > > find any problems. > > > > Thanks for all the support. 
> > -Vardan
>
> I think I spoke too soon when I said it looks good: with the current
> trunk all the name attributes are either empty or wrong (I use
> enc="utf-8"). So I get properties like this:
> <pyxs:obj module="datetime" type="datetime" name=":"
> objid="id152358784"/> <pyxs:prop type="unicode" name=""
> size="5">admin</pyxs:prop>
> <pyxs:prop type="unicode" name="" size="5">Admin</pyxs:prop>
>
> This was ok in r161, but got broken in r162. By looking at the code I
> saw that in r161 you used PyUnicode_AS_UNICODE(currentKey), but in
> r162 you used PyUnicode_FromObject(currentKey). If I change it back to
> PyUnicode_AS_UNICODE, the names become ok again:
> <pyxs:obj module="datetime" type="datetime" name="created"
> objid="id172220336"/>
> <pyxs:prop type="unicode" name="name" size="5">admin</pyxs:prop>
> <pyxs:prop type="unicode" name="last_name" size="5">Admin</pyxs:prop>

It seems that you have a very strange distribution...

<?xml version="1.0" encoding="iso-8859-1"?>
<pyxs:obj xmlns:pyxs="http://projects.coder.cl/pyxser/model/" version="1.0" type="User" module="testpkg.sample" objid="id169965932">
  <pyxs:prop type="str" name="password">password</pyxs:prop>
  <pyxs:prop type="str" name="fullname">Ed Jones</pyxs:prop>
  <pyxs:prop type="str" name="name">ed</pyxs:prop>
</pyxs:obj>

Here the serialization works fine. I will add PyUnicode_AS_UNICODE as an
alternate method, but tomorrow; I need to take a rest.

>
> I'm not sure if this is the proper fix, but it helps in my case. BTW,
> your test-utf8-sqlalchemy.py works ok with or without this change.
>
> Thanks.
> -Vardan

Best regards,
--
Daniel Molina Wegener <dmw [at] coder [dot] cl>
System Programmer & Web Developer
Phone: +56 (2) 979-0277 | Blog: http://coder.cl/ |
From: Vardan A. <vak...@gm...> - 2010-08-23 02:12:41
|
On Sun, Aug 22, 2010 at 4:20 PM, Vardan Akopian <vak...@gm...> wrote: > On Sun, Aug 22, 2010 at 3:12 PM, Daniel Molina Wegener <dm...@co...> wrote: >> On Sunday 22 August 2010, >> Vardan Akopian <vak...@gm...> wrote: >> >>> Hi Daniel, >>> >>> > > The fix is to avoid "from ... import *" constructs. Please, see the >>> > > >>> > > attached patch for this. >>> > >>> > OK, seems that the list filtering has a problem with attachments, >>> > >>> > can you send it as gzip archive? >>> >>> Please see the attached gz file. >>> >>> > > Then I tried using this version with my real world application that >>> > > >>> > > actually loads objects through sqlalchemy and tries to serialize >>> > > them. I >>> > > >>> > > encountered another segfault. With a bit of debugging (gdb and >>> > > valgrind) >>> > > >>> > > I narrowed down the problem to pyxser_collections.c:138, where you >>> > > have: >>> > > >>> > > PyListObject *dupItems = *args->dupSrcItems >>> > >>> > OK, it was fixed on r161 >>> >>> Not to be too pedantic, but I think now the check on lines 146-148 is >>> redundant since it's already checked on line 152. >>> >>> > > In my case args->dupSrcItems is NULL, so this will cause a problem. >>> > > Once >>> > > >>> > > I added a null check with an early return (similar to the check on >>> > > line >>> > > >>> > > 142), the problem got resolved and serialization worked. Please let >>> > > me >>> > > >>> > > know if you'd like a patch for this. >>> > >>> > Yep, I didn't see that bug before. At other side, I've made many >>> > >>> > enhancements to the serialization algorithm and I've added some checks >>> > >>> > to make the serialization process a little bit more strict. So, you >>> > >>> > can test the r161 and see what happens to SQL Alchemy objects. >>> >>> Works ok with the provided test files. >>> >>> > > After this I tried to serialize the same object, but using >>> > > enc="ascii" or >>> > > >>> > > enc="latin1", and got segfaults with both. 
This time it was in >>> > > >>> > > pyxser_strings.c:107. The debugger shows that name is not NULL, but >>> > > has >>> > > >>> > > an invalid pointer (0x14). Something is probably going wrong in >>> > > >>> > > pyxser_serializer.c:281, where name is calculated using the >>> > > >>> > > PYXSER_GET_ATTR_NAME macro. But I could not narrow down much more. I >>> > > >>> > > could send you back trace for this, so please let me know. >>> > >>> > OK, those errors were removed, now it is serializing any encoding >>> > >>> > supported by both, Python codecs and LibXML 2 codecs. Please for >>> > >>> > /latin-./ encodings, use /iso-8859-.*/ form, since it is recognized >>> > >>> > by both, Python and LibXML2, by default it handles as ascii codec >>> > >>> > if you try with enc = 'latin-1', you need to use enc = 'iso-8859-1' >>> > >>> > instead. >>> >>> I tried with 'iso-8859-1', and got the same segfault. So I debugged a >>> bit more, and here is what I found out: I have a situation where in >>> the PYXSER_GET_ATTR_NAME macro, even tough >>> pyxserUnicode_Check(currentKey) returns 1, PyUnicode_Encode still >>> returns NULL. After this it segfauls pretty quickly, since args->name >>> becomes an invalid pointer. According to the Python docs, >>> PyUnicode_Encode will return NULL if "an exception is raised by the >>> codec". I could not find anything in the docs that explains how to >>> retrieve the error from the codec (but then again I am a complete >>> python noob ;-)). I also don't know if this situation can be handled. >>> But I think pyxser should at least check for it and raise a python >>> exception, instead of letting it segfault. Of course any other >>> information for the reason of the failure would be great. >>> >>> > > And finally, I attach here the output of the profiling command, as >>> > > you >>> > > >>> > > asked. >>> > >>> > Thanks for the profiling command, this is very useful on what refers to >>> > >>> > performance enhancements. 
As I've said, I've added some lazy >>> > initializations >>> > >>> > and pyxser now runs a little bit faster, and also it has less hard disc >>> > >>> > reads :) >>> > >>> > Tell what happens with r161, and take a look on this page: >>> > >>> > http://coder.cl/2010/08/ann-pyxser-1-4-6r-released/ >>> > >>> > There is a small tip on how to serialize any SQL Alchemy DTO. Be >>> > careful >>> > >>> > with those objects, the default serialization, with 50 nodes, can go >>> > very >>> > >>> > deep in the object tree, without the desired results. But test that >>> > serialization, it will help to know if the changes that I've added to >>> > r161 >>> > >>> > are OK or not. >>> >>> Thanks for that, I will definitely take a look at the selectors and >>> will use them. For now I'm just trying to get the simple usage >>> working. >>> >>> BTW, is pyxser really incompatible with python 2.7? Currently the >>> version check in setup.py excludes 2.7. But when I change it, it seams >>> to work fine. Unless you know for a fact that 2.7 cannot be supported, >>> it might be a good idea to allow it in setup.py. >> >> I think that I've killed that bug. Please, can you test the r163. >> I've finished my tests with your patch over SQL Alchemy subtests and >> also removed some memory leaks from r160. >> >> Thanks again for your feedback :) >> > > Looks good, no more segfaults, and I even get an error message about > the encoding problem ;-) > I'm gonna start playing with the selectors. I'll let you know if I > find any problems. > > Thanks for all the support. > -Vardan > I think I spoke too soon when I said it looks good: with the current trunk all the name attributes are either empty or wrong (I use enc="utf-8"). So I get properties like this: <pyxs:obj module="datetime" type="datetime" name=":" objid="id152358784"/> <pyxs:prop type="unicode" name="" size="5">admin</pyxs:prop> <pyxs:prop type="unicode" name="" size="5">Admin</pyxs:prop> This was ok in r161, but got broken in r162. 
By looking at the code I saw that in r161 you used
PyUnicode_AS_UNICODE(currentKey), but in r162 you used
PyUnicode_FromObject(currentKey). If I change it back to
PyUnicode_AS_UNICODE, the names become ok again:

<pyxs:obj module="datetime" type="datetime" name="created" objid="id172220336"/>
<pyxs:prop type="unicode" name="name" size="5">admin</pyxs:prop>
<pyxs:prop type="unicode" name="last_name" size="5">Admin</pyxs:prop>

I'm not sure if this is the proper fix, but it helps in my case. BTW,
your test-utf8-sqlalchemy.py works ok with or without this change.

Thanks.
-Vardan |
From: Daniel M. W. <dm...@co...> - 2010-08-23 00:51:36
|
On Sunday 22 August 2010, Vardan Akopian <vak...@gm...> wrote:
> Hi Daniel,
>
> It looks like pyxser has a problem serializing built-in types when
> they are at the top level. For example each of the following will
> throw a "ValueError: Argument given for serialization/deserialization
> is not a Python Object or is None.":
>
> pyxser.serialize(obj = 1, enc = 'utf-8')
> pyxser.serialize(obj = 'foo', enc = 'utf-8')
> pyxser.serialize(obj = ['foo'], enc = 'utf-8')
> pyxser.serialize(obj = ('foo'), enc = 'utf-8')
> pyxser.serialize(obj = {1:'foo'}, enc = 'utf-8')

pyxser, the pyxser XML schema and the pyxser XML DTD are designed to
serialize objects with a constructor. Since those types do not have a
constructor accessible from the C/API, pyxser can't deserialize those
kinds of objects.

>
> Also, an empty class also has the same problem:
> class Test: pass
> pyxser.serialize(obj = Test(), enc = 'utf-8')

The same happens here. Those objects require a defined constructor
accessible from the Python C/API. The Python C/API just provides me with
PyInstance_NewRaw() for objects whose constructors take arguments and
PyObject_CallFunctionObjArgs() for objects whose constructors take none.
If you read the documentation you will notice that it is not possible to
restore those instances.

>
> But if I do
> x = Test()
> x.foo = 'foo'
> pyxser.serialize(obj = x, enc = 'utf-8')
>
> then it works fine.
>
> Is this by design, or am I using it wrongly?
> I encountered this when I was trying to serialize the result of
> User.all() that gets all the users through sqlalchemy.

Yep, that works. If you want to serialize a list, like the one User.all()
returns, I suggest using a dummy object:

    x = Test()
    x.foo = User.all()
    pyxser.serialize(obj = x, enc = 'utf-8')

This will work :)

>
> Thanks.
> -Vardan

Best regards,
--
Daniel Molina Wegener <dmw [at] coder [dot] cl>
System Programmer & Web Developer
Phone: +56 (2) 979-0277 | Blog: http://coder.cl/ |
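The dummy-object workaround suggested above can be wrapped in a small helper. This is only a sketch under assumptions: the names Container and wrap_top_level are made up for illustration and are not part of pyxser, and the only pyxser call used is the serialize(obj=..., enc=...) form that appears in this thread. The import is guarded so the wrapping logic can be tried without pyxser installed.

```python
try:
    import pyxser  # optional; only needed for the final serialize call
except ImportError:
    pyxser = None

# Types reported in this thread as failing at the top level (plus a few
# close relatives, which is an assumption).
BUILTIN_TYPES = (int, float, str, bytes, list, tuple, dict, set)


class Container(object):
    """Hypothetical holder with a no-argument constructor."""

    def __init__(self):
        self.value = None


def wrap_top_level(value):
    """Wrap built-ins and attribute-less instances in a Container.

    Objects that already carry attributes are returned unchanged.
    """
    if isinstance(value, BUILTIN_TYPES) or not getattr(value, "__dict__", None):
        holder = Container()
        holder.value = value
        return holder
    return value


if __name__ == "__main__":
    wrapped = wrap_top_level(["foo"])
    if pyxser is not None:
        print(pyxser.serialize(obj=wrapped, enc="utf-8"))
    else:
        print(type(wrapped).__name__, wrapped.value)
```

Keeping the constructor argument-free mirrors the Test() example in the message; whether pyxser needs that for deserialization in every version is not confirmed here.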
From: Vardan A. <vak...@gm...> - 2010-08-22 23:54:19
|
Hi Daniel,

It looks like pyxser has a problem serializing built-in types when
they are at the top level. For example each of the following will
throw a "ValueError: Argument given for serialization/deserialization
is not a Python Object or is None.":

    pyxser.serialize(obj = 1, enc = 'utf-8')
    pyxser.serialize(obj = 'foo', enc = 'utf-8')
    pyxser.serialize(obj = ['foo'], enc = 'utf-8')
    pyxser.serialize(obj = ('foo'), enc = 'utf-8')
    pyxser.serialize(obj = {1:'foo'}, enc = 'utf-8')

Also, an empty class also has the same problem:

    class Test: pass
    pyxser.serialize(obj = Test(), enc = 'utf-8')

But if I do

    x = Test()
    x.foo = 'foo'
    pyxser.serialize(obj = x, enc = 'utf-8')

then it works fine.

Is this by design, or am I using it wrongly?
I encountered this when I was trying to serialize the result of
User.all() that gets all the users through sqlalchemy.

Thanks.
-Vardan |
From: Daniel M. W. <dm...@co...> - 2010-08-22 23:36:22
|
On Sunday 22 August 2010, Vardan Akopian <vak...@gm...> wrote: > On Sun, Aug 22, 2010 at 3:12 PM, Daniel Molina Wegener <dm...@co...> wrote: > > On Sunday 22 August 2010, > > > > Vardan Akopian <vak...@gm...> wrote: > >> Hi Daniel, > >> > >> > > The fix is to avoid "from ... import *" constructs. Please, see > >> > > the > >> > > > >> > > attached patch for this. > >> > > >> > OK, seems that the list filtering has a problem with attachments, > >> > > >> > can you send it as gzip archive? > >> > >> Please see the attached gz file. > >> > >> > > Then I tried using this version with my real world application > >> > > that > >> > > > >> > > actually loads objects through sqlalchemy and tries to serialize > >> > > them. I > >> > > > >> > > encountered another segfault. With a bit of debugging (gdb and > >> > > valgrind) > >> > > > >> > > I narrowed down the problem to pyxser_collections.c:138, where you > >> > > have: > >> > > > >> > > PyListObject *dupItems = *args->dupSrcItems > >> > > >> > OK, it was fixed on r161 > >> > >> Not to be too pedantic, but I think now the check on lines 146-148 is > >> redundant since it's already checked on line 152. > >> > >> > > In my case args->dupSrcItems is NULL, so this will cause a > >> > > problem. Once > >> > > > >> > > I added a null check with an early return (similar to the check on > >> > > line > >> > > > >> > > 142), the problem got resolved and serialization worked. Please > >> > > let me > >> > > > >> > > know if you'd like a patch for this. > >> > > >> > Yep, I didn't see that bug before. At other side, I've made many > >> > > >> > enhancements to the serialization algorithm and I've added some > >> > checks > >> > > >> > to make the serialization process a little bit more strict. So, you > >> > > >> > can test the r161 and see what happens to SQL Alchemy objects. > >> > >> Works ok with the provided test files. 
> >> > >> > > After this I tried to serialize the same object, but using > >> > > enc="ascii" or > >> > > > >> > > enc="latin1", and got segfaults with both. This time it was in > >> > > > >> > > pyxser_strings.c:107. The debugger shows that name is not NULL, > >> > > but has > >> > > > >> > > an invalid pointer (0x14). Something is probably going wrong in > >> > > > >> > > pyxser_serializer.c:281, where name is calculated using the > >> > > > >> > > PYXSER_GET_ATTR_NAME macro. But I could not narrow down much more. > >> > > I > >> > > > >> > > could send you back trace for this, so please let me know. > >> > > >> > OK, those errors were removed, now it is serializing any encoding > >> > > >> > supported by both, Python codecs and LibXML 2 codecs. Please for > >> > > >> > /latin-./ encodings, use /iso-8859-.*/ form, since it is recognized > >> > > >> > by both, Python and LibXML2, by default it handles as ascii codec > >> > > >> > if you try with enc = 'latin-1', you need to use enc = 'iso-8859-1' > >> > > >> > instead. > >> > >> I tried with 'iso-8859-1', and got the same segfault. So I debugged a > >> bit more, and here is what I found out: I have a situation where in > >> the PYXSER_GET_ATTR_NAME macro, even tough > >> pyxserUnicode_Check(currentKey) returns 1, PyUnicode_Encode still > >> returns NULL. After this it segfauls pretty quickly, since args->name > >> becomes an invalid pointer. According to the Python docs, > >> PyUnicode_Encode will return NULL if "an exception is raised by the > >> codec". I could not find anything in the docs that explains how to > >> retrieve the error from the codec (but then again I am a complete > >> python noob ;-)). I also don't know if this situation can be handled. > >> But I think pyxser should at least check for it and raise a python > >> exception, instead of letting it segfault. Of course any other > >> information for the reason of the failure would be great. 
> >> > >> > > And finally, I attach here the output of the profiling command, as > >> > > you > >> > > > >> > > asked. > >> > > >> > Thanks for the profiling command, this is very useful on what refers > >> > to > >> > > >> > performance enhancements. As I've said, I've added some lazy > >> > initializations > >> > > >> > and pyxser now runs a little bit faster, and also it has less hard > >> > disc > >> > > >> > reads :) > >> > > >> > Tell what happens with r161, and take a look on this page: > >> > > >> > http://coder.cl/2010/08/ann-pyxser-1-4-6r-released/ > >> > > >> > There is a small tip on how to serialize any SQL Alchemy DTO. Be > >> > careful > >> > > >> > with those objects, the default serialization, with 50 nodes, can go > >> > very > >> > > >> > deep in the object tree, without the desired results. But test that > >> > serialization, it will help to know if the changes that I've added > >> > to r161 > >> > > >> > are OK or not. > >> > >> Thanks for that, I will definitely take a look at the selectors and > >> will use them. For now I'm just trying to get the simple usage > >> working. > >> > >> BTW, is pyxser really incompatible with python 2.7? Currently the > >> version check in setup.py excludes 2.7. But when I change it, it seams > >> to work fine. Unless you know for a fact that 2.7 cannot be supported, > >> it might be a good idea to allow it in setup.py. > > > > I think that I've killed that bug. Please, can you test the r163. > > I've finished my tests with your patch over SQL Alchemy subtests and > > also removed some memory leaks from r160. > > > > Thanks again for your feedback :) > > Looks good, no more segfaults, and I even get an error message about > the encoding problem ;-) That's great :D > I'm gonna start playing with the selectors. I'll let you know if I > find any problems. OK, no problem... let me know if something /strange/ happens... > > Thanks for all the support. 
No problem, thanks for your feedback :D

> -Vardan

Best regards,
--
Daniel Molina Wegener <dmw [at] coder [dot] cl>
System Programmer & Web Developer
Phone: +56 (2) 979-0277 | Blog: http://coder.cl/ |
From: Vardan A. <vak...@gm...> - 2010-08-22 23:20:33
|
On Sun, Aug 22, 2010 at 3:12 PM, Daniel Molina Wegener <dm...@co...> wrote:
> I think that I've killed that bug. Please, can you test the r163.
> I've finished my tests with your patch over SQL Alchemy subtests and
> also removed some memory leaks from r160.
>
> Thanks again for your feedback :)

Looks good, no more segfaults, and I even get an error message about the encoding problem ;-)

I'm gonna start playing with the selectors. I'll let you know if I find any problems.

Thanks for all the support.

-Vardan
|
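The behaviour reported here, an error message instead of a crash when a name cannot be encoded, can be illustrated in pure Python. This is only a sketch of the idea, not pyxser's C code; `encode_names` is a hypothetical helper introduced for illustration.

```python
def encode_names(names, enc):
    """Encode each name with codec `enc`, reporting which name failed."""
    encoded = []
    for name in names:
        try:
            encoded.append(name.encode(enc))
        except UnicodeEncodeError as exc:
            # Surface the offending name and the codec's reason so the
            # caller gets a normal exception rather than a later crash.
            raise ValueError(
                "cannot encode %r as %s: %s" % (name, enc, exc.reason)
            ) from exc
    return encoded
```

A name such as `'sòludò4'` encodes fine with `'iso-8859-1'` but makes the helper raise `ValueError` with `'ascii'`, which is the kind of diagnostic asked for earlier in the thread.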
From: Daniel M. W. <dm...@co...> - 2010-08-22 22:12:55
|
On Sunday 22 August 2010, Vardan Akopian <vak...@gm...> wrote:
> I tried with 'iso-8859-1', and got the same segfault. So I debugged a
> bit more, and here is what I found out: I have a situation where in
> the PYXSER_GET_ATTR_NAME macro, even though
> pyxserUnicode_Check(currentKey) returns 1, PyUnicode_Encode still
> returns NULL. After this it segfaults pretty quickly, since args->name
> becomes an invalid pointer. According to the Python docs,
> PyUnicode_Encode will return NULL if "an exception is raised by the
> codec". I could not find anything in the docs that explains how to
> retrieve the error from the codec (but then again I am a complete
> python noob ;-)). I also don't know if this situation can be handled.
> But I think pyxser should at least check for it and raise a python
> exception, instead of letting it segfault. Of course any other
> information for the reason of the failure would be great.
>
> BTW, is pyxser really incompatible with python 2.7? Currently the
> version check in setup.py excludes 2.7. But when I change it, it seems
> to work fine. Unless you know for a fact that 2.7 cannot be supported,
> it might be a good idea to allow it in setup.py.

I think that I've killed that bug. Please, can you test the r163. I've finished my tests with your patch over SQL Alchemy subtests and also removed some memory leaks from r160.

Thanks again for your feedback :)

> Thanks.
> -Vardan

Best regards,
--
Daniel Molina Wegener <dmw [at] coder [dot] cl>
System Programmer & Web Developer
Phone: +56 (2) 979-0277 | Blog: http://coder.cl/
|
From: Vardan A. <vak...@gm...> - 2010-08-22 19:49:50
|
Hi Daniel,

>> The fix is to avoid "from ... import *" constructs. Please, see the
>> attached patch for this.
>
> OK, it seems that the list filtering has a problem with attachments.
> Can you send it as a gzip archive?

Please see the attached gz file.

>> Then I tried using this version with my real world application that
>> actually loads objects through sqlalchemy and tries to serialize them. I
>> encountered another segfault. With a bit of debugging (gdb and valgrind)
>> I narrowed down the problem to pyxser_collections.c:138, where you have:
>>   PyListObject *dupItems = *args->dupSrcItems
>
> OK, it was fixed in r161.

Not to be too pedantic, but I think now the check on lines 146-148 is redundant since it's already checked on line 152.

>> In my case args->dupSrcItems is NULL, so this will cause a problem. Once
>> I added a null check with an early return (similar to the check on line
>> 142), the problem got resolved and serialization worked. Please let me
>> know if you'd like a patch for this.
>
> Yep, I didn't see that bug before. On the other hand, I've made many
> enhancements to the serialization algorithm and I've added some checks
> to make the serialization process a little bit more strict. So, you
> can test r161 and see what happens with SQL Alchemy objects.

Works ok with the provided test files.

>> After this I tried to serialize the same object, but using enc="ascii" or
>> enc="latin1", and got segfaults with both. This time it was in
>> pyxser_strings.c:107. The debugger shows that name is not NULL, but has
>> an invalid pointer (0x14). Something is probably going wrong in
>> pyxser_serializer.c:281, where name is calculated using the
>> PYXSER_GET_ATTR_NAME macro. But I could not narrow down much more. I
>> could send you back trace for this, so please let me know.
>
> OK, those errors were removed; now it is serializing any encoding
> supported by both Python codecs and LibXML2 codecs. For /latin-.*/
> encodings, please use the /iso-8859-.*/ form, since it is recognized
> by both Python and LibXML2. By default, enc = 'latin-1' is handled as
> the ascii codec, so you need to use enc = 'iso-8859-1' instead.

I tried with 'iso-8859-1', and got the same segfault. So I debugged a bit more, and here is what I found out: I have a situation where in the PYXSER_GET_ATTR_NAME macro, even though pyxserUnicode_Check(currentKey) returns 1, PyUnicode_Encode still returns NULL. After this it segfaults pretty quickly, since args->name becomes an invalid pointer. According to the Python docs, PyUnicode_Encode will return NULL if "an exception is raised by the codec". I could not find anything in the docs that explains how to retrieve the error from the codec (but then again I am a complete python noob ;-)). I also don't know if this situation can be handled. But I think pyxser should at least check for it and raise a python exception, instead of letting it segfault. Of course any other information for the reason of the failure would be great.

>> And finally, I attach here the output of the profiling command, as you
>> asked.
>
> Thanks for the profiling command, this is very useful for performance
> enhancements. As I've said, I've added some lazy initializations
> and pyxser now runs a little bit faster, and it also has fewer hard disk
> reads :)
>
> Tell me what happens with r161, and take a look at this page:
>
>   http://coder.cl/2010/08/ann-pyxser-1-4-6r-released/
>
> There is a small tip on how to serialize any SQL Alchemy DTO. Be careful
> with those objects: the default serialization, with 50 nodes, can go very
> deep in the object tree, without the desired results. But test that
> serialization, it will help to know if the changes that I've added to r161
> are OK or not.

Thanks for that, I will definitely take a look at the selectors and will use them. For now I'm just trying to get the simple usage working.

BTW, is pyxser really incompatible with python 2.7? Currently the version check in setup.py excludes 2.7. But when I change it, it seems to work fine. Unless you know for a fact that 2.7 cannot be supported, it might be a good idea to allow it in setup.py.

Thanks.
-Vardan
|
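One way a setup script can avoid excluding a newer minor release by accident is to test a lower bound instead of listing accepted versions. The sketch below is hypothetical: the names `MIN_VERSION`, `SUPPORTED_MAJOR` and `version_supported` are invented here, and the bounds are assumptions, not what pyxser's setup.py actually contains.

```python
import sys

# Assumed bounds for illustration only; the real setup.py may differ.
MIN_VERSION = (2, 4)
SUPPORTED_MAJOR = 2


def version_supported(version_info=None):
    """Return True if the interpreter version is in the accepted range."""
    major, minor = (version_info or sys.version_info)[:2]
    # A lower bound within one major series accepts 2.7 without
    # having to name it explicitly.
    return major == SUPPORTED_MAJOR and (major, minor) >= MIN_VERSION
```

With these assumed bounds, `version_supported((2, 7))` is true while `version_supported((2, 3))` is false.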
From: Daniel M. W. <dm...@co...> - 2010-08-22 18:21:21
|
On Sunday 22 August 2010, Vardan Akopian <vak...@gm...> wrote:
> Hi Daniel,

Hello Vardan...

> Thanks for the quick reply.

No, thanks again for your feedback :)

> First the good news: indeed version r160 fixes the segfaults with the
> included test-utf8*.py.
> BTW, I had to modify the test-utf8-sqlalchemy.py a little bit, since with
> the current version I was getting
>
>   Traceback (most recent call last):
>     File "test-utf8-sqlalchemy.py", line 16, in <module>
>       from sqlalchemy.orm.properties import *
>   AttributeError: 'module' object has no attribute 'BackRef'
>
> The fix is to avoid "from ... import *" constructs. Please, see the
> attached patch for this.

OK, it seems that the list filtering has a problem with attachments. Can you send it as a gzip archive?

> Then I tried using this version with my real world application that
> actually loads objects through sqlalchemy and tries to serialize them. I
> encountered another segfault. With a bit of debugging (gdb and valgrind)
> I narrowed down the problem to pyxser_collections.c:138, where you have:
>   PyListObject *dupItems = *args->dupSrcItems

OK, it was fixed in r161.

> In my case args->dupSrcItems is NULL, so this will cause a problem. Once
> I added a null check with an early return (similar to the check on line
> 142), the problem got resolved and serialization worked. Please let me
> know if you'd like a patch for this.

Yep, I didn't see that bug before. On the other hand, I've made many enhancements to the serialization algorithm and I've added some checks to make the serialization process a little bit more strict. So, you can test r161 and see what happens with SQL Alchemy objects.

> After this I tried to serialize the same object, but using enc="ascii" or
> enc="latin1", and got segfaults with both. This time it was in
> pyxser_strings.c:107. The debugger shows that name is not NULL, but has
> an invalid pointer (0x14). Something is probably going wrong in
> pyxser_serializer.c:281, where name is calculated using the
> PYXSER_GET_ATTR_NAME macro. But I could not narrow down much more. I
> could send you back trace for this, so please let me know.

OK, those errors were removed; now it is serializing any encoding supported by both Python codecs and LibXML2 codecs. For /latin-.*/ encodings, please use the /iso-8859-.*/ form, since it is recognized by both Python and LibXML2. By default, enc = 'latin-1' is handled as the ascii codec, so you need to use enc = 'iso-8859-1' instead.

> And finally, I attach here the output of the profiling command, as you
> asked.

Thanks for the profiling command, this is very useful for performance enhancements. As I've said, I've added some lazy initializations and pyxser now runs a little bit faster, and it also has fewer hard disk reads :)

Tell me what happens with r161, and take a look at this page:

  http://coder.cl/2010/08/ann-pyxser-1-4-6r-released/

There is a small tip on how to serialize any SQL Alchemy DTO. Be careful with those objects: the default serialization, with 50 nodes, can go very deep in the object tree, without the desired results. But test that serialization, it will help to know if the changes that I've added to r161 are OK or not.

> Thanks.
> -Vardan

Thanks for your feedback and best regards,
--
Daniel Molina Wegener <dmw [at] coder [dot] cl>
System Programmer & Web Developer
Phone: +56 (2) 979-0277 | Blog: http://coder.cl/
|
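The advice above about preferring the iso-8859 spelling can be applied mechanically on the Python side, since the standard `codecs` module already knows that `latin-1` and `iso-8859-1` are the same codec. The helper below is a sketch; `to_iso_name` is a hypothetical name, and whether pyxser or libxml2 accepts a given spelling is not verified here.

```python
import codecs


def to_iso_name(enc):
    """Map a Latin alias such as 'latin-1' to its 'iso-8859-N' spelling."""
    # codecs.lookup() resolves aliases; for Latin-1 the canonical
    # Python name is 'iso8859-1' (no hyphen after 'iso').
    canonical = codecs.lookup(enc).name
    prefix = "iso8859-"
    if canonical.startswith(prefix):
        return "iso-8859-" + canonical[len(prefix):]
    return canonical
```

For example, `to_iso_name('latin-1')` gives `'iso-8859-1'`, which could then be passed as the `enc` value, while names such as `'utf-8'` pass through unchanged. An unknown name raises `LookupError` from `codecs.lookup`.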
From: Vardan A. <vak...@gm...> - 2010-08-22 07:43:55
|
On Sat, Aug 21, 2010 at 7:13 PM, Daniel Molina Wegener <dm...@co...> wrote:
> On Saturday 21 August 2010,
> Vardan Akopian <vak...@gm...> wrote:
> > Hello,
>
> Hello Vardan,
>
> > Is there a known open bug with the 1.4.6r and trunk? In my tests running
> > any of the test-utf8*.py tests generates a segfault. I attach here a gdb
> > session when running test-utf8.py (back-trace is at the end), using the
> > current trunk (r159). Python version is 2.6.5 on kubuntu 10.04. Please
> > let me know if more information is needed. I get similar results with
> > python 2.7 as well.
>
> Thanks for your feedback. Today I was working on pyxser. Also, I'm
> currently testing pyxser on Kubuntu 10.04.1. Please check out revision
> 160 (trunk) and let me know if the same error happens. I think that you
> have found a bug, but I'm sure that it is now corrected on the trunk branch.
> In that case I will release pyxser-1.4.8r on Monday, since I've enhanced
> its performance using lazy initialization.
>
> For a better test case, please use the following command:
>
>   python2.6 -m cProfile ./test-utf8-profiling.py
>
> It will dump the timings on 1000 calls for each internal function
> of pyxser. If you can bring me the output, it will be great.
>
> I still need to do more tests and find memory leaks, if any.
>
> > Thanks.
> > -Vardan
>
> Best regards,
> --
> Daniel Molina Wegener <dmw [at] coder [dot] cl>
> System Programmer & Web Developer
> Phone: +56 (2) 979-0277 | Blog: http://coder.cl/

Hi Daniel,

Thanks for the quick reply.

First the good news: indeed version r160 fixes the segfaults with the included test-utf8*.py.
BTW, I had to modify the test-utf8-sqlalchemy.py a little bit, since with the current version I was getting

  Traceback (most recent call last):
    File "test-utf8-sqlalchemy.py", line 16, in <module>
      from sqlalchemy.orm.properties import *
  AttributeError: 'module' object has no attribute 'BackRef'

The fix is to avoid "from ... import *" constructs. Please, see the attached patch for this.

Then I tried using this version with my real world application that actually loads objects through sqlalchemy and tries to serialize them. I encountered another segfault. With a bit of debugging (gdb and valgrind) I narrowed down the problem to pyxser_collections.c:138, where you have:

  PyListObject *dupItems = *args->dupSrcItems

In my case args->dupSrcItems is NULL, so this will cause a problem. Once I added a null check with an early return (similar to the check on line 142), the problem got resolved and serialization worked. Please let me know if you'd like a patch for this.

After this I tried to serialize the same object, but using enc="ascii" or enc="latin1", and got segfaults with both. This time it was in pyxser_strings.c:107. The debugger shows that name is not NULL, but has an invalid pointer (0x14). Something is probably going wrong in pyxser_serializer.c:281, where name is calculated using the PYXSER_GET_ATTR_NAME macro. But I could not narrow down much more. I could send you back trace for this, so please let me know.

And finally, I attach here the output of the profiling command, as you asked.

Thanks.
-Vardan
|
From: Daniel M. W. <dm...@co...> - 2010-08-22 02:40:56
|
On Saturday 21 August 2010, Vardan Akopian <vak...@gm...> wrote:
> Hello,

Hello Vardan,

> Is there a known open bug with the 1.4.6r and trunk? In my tests running
> any of the test-utf8*.py tests generates a segfault. I attach here a gdb
> session when running test-utf8.py (back-trace is at the end), using the
> current trunk (r159). Python version is 2.6.5 on kubuntu 10.04. Please
> let me know if more information is needed. I get similar results with
> python 2.7 as well.

Thanks for your feedback. Today I was working on pyxser. Also, I'm currently testing pyxser on Kubuntu 10.04.1. Please check out revision 160 (trunk) and let me know if the same error happens. I think that you have found a bug, but I'm sure that it is now corrected on the trunk branch. In that case I will release pyxser-1.4.8r on Monday, since I've enhanced its performance using lazy initialization.

For a better test case, please use the following command:

  python2.6 -m cProfile ./test-utf8-profiling.py

It will dump the timings on 1000 calls for each internal function of pyxser. If you can bring me the output, it will be great.

I still need to do more tests and find memory leaks, if any.

> Thanks.
> -Vardan

Best regards,
--
Daniel Molina Wegener <dmw [at] coder [dot] cl>
System Programmer & Web Developer
Phone: +56 (2) 979-0277 | Blog: http://coder.cl/
|
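For readers who want per-function timings from inside a script rather than through `python -m cProfile`, the standard library exposes the same profiler programmatically. This is a generic sketch, not the contents of test-utf8-profiling.py (which is not shown in this thread); `profile_calls` is a hypothetical helper and the default of 1000 calls only mirrors the figure mentioned above.

```python
import cProfile
import io
import pstats


def profile_calls(func, calls=1000):
    """Run func() `calls` times under cProfile and return a text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(calls):
        func()
    profiler.disable()

    # Render the ten most expensive entries, sorted by cumulative time.
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(10)
    return out.getvalue()
```

Calling `print(profile_calls(some_function))` prints a table like the one the command-line form produces, which is convenient to paste into a mail.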
From: Max S. <ms...@gm...> - 2010-05-11 05:33:14
|
Hi, Daniel!

Thank you for your work! I've tested revision 145, it works fine. But there is a really simple issue. When I'm trying to serialize any sqlalchemy-based object, pyxser segfaults. So I looked at the sources and added a simple check for a NULL pointer:

--- src/pyxser_collections.c (revision 145)
+++ src/pyxser_collections.c (working copy)
@@ -195,7 +195,9 @@
 args->rootNode = &csn;
 args->currentNode = &csn;
 newSerNode = pyxser_SerializeXml(args);
-PYXSER_FREE_OBJECT(unic);
+if (unic != NULL) {
+    PYXSER_FREE_OBJECT(unic);
+}
 args->o = oold;
 args->item = oold;
 args->currentNode = currentNodeOld;

I don't know the C language well enough to fix it the right way; maybe there is a memory leak now.

All the other tests I ran generate valid xml. Now we have a cool tool to serialize python objects to xml-)

Thank you!

Best regards, Max Sinelnikov.

On Mon, May 10, 2010 at 9:23 PM, Daniel Molina Wegener <dm...@co...> wrote:
> NP Max, I think that the problem is solved, please try r145 on
> trunk. Please notify me if that works. I've made some small tests here,
> but I need to be sure that it works with your application :)
>
> Thanks in advance...
>
> Best regards,
> --
> Daniel Molina Wegener <dmw [at] coder [dot] cl>
> Software Architect, System Programmer & Web Developer
> Phone: +56 (2) 979-0277 | Blog: http://coder.cl/
|
From: Daniel M. W. <dm...@co...> - 2010-05-10 14:24:06
|
On Friday 07 May 2010, Max Sinelnikov <ms...@gm...> wrote:
> Oh, sorry, I clicked 'Reply' instead of 'Reply to all' and forgot to
> attach files.

NP Max, I think that the problem is solved, please try r145 on trunk. Please notify me if that works. I've made some small tests here, but I need to be sure that it works with your application :)

Thanks in advance...

> Original message:
>
> Hi, Daniel.
>
> > Thanks for your feedback. The code is fixed, pyxser now supports
> > unicode keys on dictionaries. I've refactored the code, so pyxser
> > now runs faster. I want to release pyxser-1.4.4r tomorrow, so if I can
> > have your feedback about how the trunk branch is working, it
> > would be very helpful, since I don't have access to the modules
> > that you are using...
>
> I'm happy with revision 143. All my objects serialized successfully (I
> don't need deserializing now, so I haven't tested this functionality
> yet, sorry-))
> But there is the same problem when I'm serializing one of my objects
> with depth=12 and more. I attached two files to this mail:
> depth-11.xml is a valid xml document I got with depth=11, and
> depth-12.xml is the same serialized object with depth=12. Problem is
> with Resource object objid="id33427536" (depth-12.xml, Line 58).
>
> A few words about objects I have. I'm using sqlalchemy on top of
> mysql, there are several tables with relations. SQLAlchemy creates
> objects related to tables and, for example, I have an object named
> "company" with attribute "roles" which looks like a list. Every item of
> company.roles is a company_role instance and so on. It's possible to
> do something like this (as example):
>
>   >>> r = Resource.get_by_id(1)
>   >>> company.roles[1].permissions[0].pack.resources[0] is r
>   True
>
> I'm trying to reproduce the described issue with a simple schema, but with
> no success. I thought that there may be a problem with nested
> dictionaries, but things like
>
>   t.a = {'hehe':{'ololo':{'haha':{u'хехе':{u'наан':{u'чочо':{u'хрхрх':11}}}}}}}
>
> work fine. Will try again during these holidays.
>
> Best regards, Max Sinelnikov.

Best regards,
--
Daniel Molina Wegener <dmw [at] coder [dot] cl>
Software Architect, System Programmer & Web Developer
Phone: +56 (2) 979-0277 | Blog: http://coder.cl/
|
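The two ingredients of this report, a depth limit and an object that is reachable through more than one path, can be explored with a small pure-Python walker. This is only an illustration of the concepts under stated assumptions: it is not pyxser's serialization algorithm, `walk` is a hypothetical function, and its `depth` argument is assumed to count nesting levels starting at 1 for the root.

```python
def walk(obj, depth, seen=None, level=1):
    """Yield (level, object) pairs down to `depth`, visiting each object once."""
    if seen is None:
        seen = set()
    # Stop below the depth limit and skip objects already reached by
    # another path (shared references or cycles).
    if level > depth or id(obj) in seen:
        return
    seen.add(id(obj))
    yield level, obj

    if isinstance(obj, dict):
        children = list(obj.values())
    elif isinstance(obj, (list, tuple)):
        children = list(obj)
    else:
        # Plain instances expose their attributes through __dict__.
        children = list(getattr(obj, "__dict__", {}).values())
    for child in children:
        yield from walk(child, depth, seen, level + 1)
```

Running it over a nested dictionary like the one quoted above, or over instances that share a child, shows how many levels a given `depth` actually reaches and which objects are visited only once.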
From: Daniel M. W. <dm...@co...> - 2010-05-05 14:25:25
|
On Wednesday 05 May 2010, Max Sinelnikov <ms...@gm...> wrote:
> Maybe this will be helpful: I'm running Debian stable (Lenny) 5.0 with
> python2.5 and libxml2 2.6.32 and pyxser revision 139.
> test-utf8.py passed, but when I'm adding a unicode key python segfaults:

Hello Max, thanks for your feedback. I've fixed pyxser, and the fix will be released this week. In the meantime you can try revision 140 (r140), which is in trunk. It is quite stable; I still need to do more tests, searching for memory leaks and similar tasks, but it is almost done.

> --- test-utf8.py (revision 139)
> +++ test-utf8.py (working copy)
> @@ -362,7 +362,7 @@
>  another.dyn_prop1 = thisa
>  test.dyn_prop1 = [u'holá', u'chaó', another]
>  test.dyn_prop2 = (u'hol`', u'sïn', 'trip', other)
> -test.dyn_prop3 = {'sáludó1': u'hólà', 'sáludó2': u'chäó', 'sòludò4': 'goodbye', 'saludo5': thisc}
> +test.dyn_prop3 = {'sáludó1': u'hólà', 'sáludó2': u'chäó', u'sòludò4': 'goodbye', 'saludo5': thisc}
>  test.dyn_prop4 = u'sómé tèxtè ïñ Unicodè'
>  test.dyn_prop5 = u'Añother Texé Iñ ÜnìcóDËc'
>  test.dyn_prop6 = 1.5
>
> Best regards, Max Sinelnikov.

Best regards,
--
Daniel Molina Wegener <dmw [at] coder [dot] cl>
Software Architect, System Programmer & Web Developer
Phone: +56 (2) 979-0277 | Blog: http://coder.cl/
|
From: Max S. <ms...@gm...> - 2010-05-05 05:54:47
|
Maybe this will be helpful: I'm running Debian stable (Lenny) 5.0 with python2.5 and libxml2 2.6.32 and pyxser revision 139.
test-utf8.py passed, but when I'm adding a unicode key python segfaults:

--- test-utf8.py (revision 139)
+++ test-utf8.py (working copy)
@@ -362,7 +362,7 @@
 another.dyn_prop1 = thisa
 test.dyn_prop1 = [u'holá', u'chaó', another]
 test.dyn_prop2 = (u'hol`', u'sïn', 'trip', other)
-test.dyn_prop3 = {'sáludó1': u'hólà', 'sáludó2': u'chäó', 'sòludò4': 'goodbye', 'saludo5': thisc}
+test.dyn_prop3 = {'sáludó1': u'hólà', 'sáludó2': u'chäó', u'sòludò4': 'goodbye', 'saludo5': thisc}
 test.dyn_prop4 = u'sómé tèxtè ïñ Unicodè'
 test.dyn_prop5 = u'Añother Texé Iñ ÜnìcóDËc'
 test.dyn_prop6 = 1.5

Best regards, Max Sinelnikov.
|
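The dictionary in this report mixes byte-string keys with one unicode key, which is the case that crashed. One defensive approach on the caller's side is to normalize all keys to text before handing the object to a serializer. The sketch below uses Python 3 semantics (where the distinction is `bytes` versus `str`) purely for illustration; `text_keys` is a hypothetical helper, and the thread itself concerns Python 2.

```python
def text_keys(mapping, enc="utf-8"):
    """Return a copy of `mapping` whose bytes keys are decoded to str."""
    result = {}
    for key, value in mapping.items():
        if isinstance(key, bytes):
            # Decode byte-string keys so every key has the same type.
            key = key.decode(enc)
        result[key] = value
    return result
```

The `enc` default of UTF-8 is an assumption about how the byte-string keys were written; a different source encoding would need a different value.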