Re: [pyxser-users] segfault when running the utf8 tests
From: Daniel M. W. <dm...@co...> - 2010-08-22 22:12:55
On Sunday 22 August 2010, Vardan Akopian <vak...@gm...> wrote:
> Hi Daniel,
>
> > > The fix is to avoid "from ... import *" constructs. Please, see the
> > > attached patch for this.
> >
> > OK, seems that the list filtering has a problem with attachments,
> > can you send it as gzip archive?
>
> Please see the attached gz file.
>
> > > Then I tried using this version with my real world application that
> > > actually loads objects through sqlalchemy and tries to serialize
> > > them. I encountered another segfault. With a bit of debugging (gdb
> > > and valgrind) I narrowed down the problem to
> > > pyxser_collections.c:138, where you have:
> > >
> > > PyListObject *dupItems = *args->dupSrcItems
> >
> > OK, it was fixed on r161
>
> Not to be too pedantic, but I think now the check on lines 146-148 is
> redundant since it's already checked on line 152.
>
> > > In my case args->dupSrcItems is NULL, so this will cause a problem.
> > > Once I added a null check with an early return (similar to the
> > > check on line 142), the problem got resolved and serialization
> > > worked. Please let me know if you'd like a patch for this.
> >
> > Yep, I didn't see that bug before. On the other hand, I've made many
> > enhancements to the serialization algorithm and I've added some
> > checks to make the serialization process a little bit more strict.
> > So, you can test the r161 and see what happens to SQL Alchemy
> > objects.
>
> Works ok with the provided test files.
>
> > > After this I tried to serialize the same object, but using
> > > enc="ascii" or enc="latin1", and got segfaults with both. This time
> > > it was in pyxser_strings.c:107. The debugger shows that name is not
> > > NULL, but has an invalid pointer (0x14).
> > > Something is probably going wrong in pyxser_serializer.c:281, where
> > > name is calculated using the PYXSER_GET_ATTR_NAME macro. But I
> > > could not narrow it down much more. I could send you a back trace
> > > for this, so please let me know.
> >
> > OK, those errors were removed, now it is serializing any encoding
> > supported by both Python codecs and LibXML 2 codecs. Please, for
> > /latin-./ encodings, use the /iso-8859-.*/ form, since it is
> > recognized by both Python and LibXML2. If you try enc = 'latin-1',
> > it is handled as the ascii codec by default; you need to use
> > enc = 'iso-8859-1' instead.
>
> I tried with 'iso-8859-1', and got the same segfault. So I debugged a
> bit more, and here is what I found out: I have a situation where in
> the PYXSER_GET_ATTR_NAME macro, even though
> pyxserUnicode_Check(currentKey) returns 1, PyUnicode_Encode still
> returns NULL. After this it segfaults pretty quickly, since args->name
> becomes an invalid pointer. According to the Python docs,
> PyUnicode_Encode will return NULL if "an exception is raised by the
> codec". I could not find anything in the docs that explains how to
> retrieve the error from the codec (but then again I am a complete
> python noob ;-)). I also don't know if this situation can be handled.
> But I think pyxser should at least check for it and raise a python
> exception, instead of letting it segfault. Of course any other
> information about the reason for the failure would be great.
>
> > > And finally, I attach here the output of the profiling command, as
> > > you asked.
> >
> > Thanks for the profiling command, this is very useful when it comes
> > to performance enhancements.
> > As I've said, I've added some lazy initializations and pyxser now
> > runs a little bit faster, and it also does fewer hard disk reads :)
> >
> > Tell me what happens with r161, and take a look at this page:
> >
> > http://coder.cl/2010/08/ann-pyxser-1-4-6r-released/
> >
> > There is a small tip on how to serialize any SQL Alchemy DTO. Be
> > careful with those objects: the default serialization, with 50
> > nodes, can go very deep in the object tree, without the desired
> > results. But test that serialization, it will help to know if the
> > changes that I've added to r161 are OK or not.
>
> Thanks for that, I will definitely take a look at the selectors and
> will use them. For now I'm just trying to get the simple usage
> working.
>
> BTW, is pyxser really incompatible with python 2.7? Currently the
> version check in setup.py excludes 2.7. But when I change it, it seems
> to work fine. Unless you know for a fact that 2.7 cannot be supported,
> it might be a good idea to allow it in setup.py.

I think that I've killed that bug. Could you please test r163? I've
finished my tests with your patch over the SQL Alchemy subtests and
also removed some memory leaks from r160. Thanks again for your
feedback :)

> Thanks.
> -Vardan

Best regards,
--
Daniel Molina Wegener <dmw [at] coder [dot] cl>
System Programmer & Web Developer
Phone: +56 (2) 979-0277 | Blog: http://coder.cl/
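
For reference, the situation Vardan describes (PyUnicode_Encode
returning NULL even though the key passes the unicode check) is the
C-level form of a codec raising an exception. The Python sketch below
shows the same failure modes from the Python side. It is only an
illustration: try_encode is a hypothetical helper, not part of pyxser,
and the sample strings are assumptions, since the thread does not say
which attribute name failed to encode.

```python
def try_encode(text, encoding):
    """Encode text, returning (bytes, None) on success or (None, error).

    Hypothetical helper for illustration only; it is not pyxser code.
    """
    try:
        return text.encode(encoding), None
    except UnicodeEncodeError as exc:
        # The codec exists, but a character has no representation in it.
        return None, exc
    except LookupError as exc:
        # The codec name itself is unknown to Python.
        return None, exc


# A Latin-1 character encodes without trouble.
encoded, error = try_encode(u"caf\u00e9", "iso-8859-1")

# The euro sign is outside ISO-8859-1, so the codec raises.
encoded_euro, error_euro = try_encode(u"\u20ac", "iso-8859-1")

# An unknown codec name fails before any character is looked at.
encoded_bad, error_bad = try_encode(u"abc", "no-such-codec")
```

In a C extension the equivalent step is to test the result of
PyUnicode_Encode for NULL before using it. The pending exception can be
inspected with PyErr_Occurred() or PyErr_Fetch(), and returning NULL
from the extension function hands it to the Python caller rather than
letting an invalid pointer be dereferenced later.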