#23 segfault running test suite on py2.3 x86_64

Errors
closed-fixed
7
2005-02-28
2005-02-10
Matthew L Daniel
No

Testing: BuildClass, FindClass, ClassList, {...} ... ok
Testing: Clear, Reset, Class.PPForm, Class.Description,
Class.Module ... ok
Testing: Class.BuildSubclass ... ok
Testing: Class.WatchSlots, Class.WatchInstances ... ok
Testing: Class.MessageHandlerIndex,
Class.MessageHandlerWatched ... ok
Testing: Class.UnwatchMessageHandler,
Class.WatchMessageHandler ... ok
Testing: Class.MessageHandlerName,
Class.MessageHandlerType ... ok
Testing: Class.NextMessageHandlerIndex,
Class.MessageHandlerDeletable ... ok
Testing: BuildMessageHandler, Class.AddMessageHandler,
{...} ... ok
Testing: Class.BuildInstance, Class.RawInstance ... ok
Testing: Class.MessageHandlerList,
Class.AllMessageHandlerList ... ok
Testing: BuildInstance, Class.Deletable,
Instance.Slots, Instance.PPForm ... ok
Testing: FindInstance, Instance.Class, Instance.Name ... ok
Testing: LoadInstancesFromString ... ok
Testing: Slots.Names, Slots.Exists,
Slots.ExistsDefined, {...} ... ok
Testing: Slots.Cardinality, Slots.AllowedValues ... ok
Testing: Slots.Types, Slots.Sources ... ok
Testing: Slots.IsPublic, Slots.IsInitable, Slots.Range
... ok
Testing: Slots.IsWritable, Slots.HasDirectAccess,
Slots.Facets ... ok
Testing: ClassList, InitialClass, FindClass, Class.Next
... ok
Testing: InitialDeffacts, DeffactsList, Deffacts.Next,
Deffacts.Name ... ok
Testing: FindDeffacts, Deffacts.PPForm,
Deffacts.Deletable, {...} ... ok
Testing: InitialDefinstances, DefinstancesList,
Definstances.Next, {...} ... ok
After installing the module, I ran `cd testsuite;
python tests.py` and it yielded the following segfault.

If I can help further, please contact me.

===
Testing: Definstances.Name, Definstances.Module,
Definstances.Deletable ... ok
Testing: FactList, InitialFact, Fact.Next, Fact.Index
... ok
Testing: InitialFunction, FunctionList, Function.Name,
Function.Next ... ok
Testing: FindFunction, Function.PPForm, Function.Watch,
{...} ... ok
Testing: BuildGeneric, Generic.Name, Generic.PPForm,
Generic.Watch ... ok
Testing: InitialGeneric, GenericList, FindGeneric,
{...} ... ok
Testing: BuildGlobal, Global.Name, Global.PPForm, {...}
... ok
Testing: InitialGlobal, GlobalList, FindGlobal,
Global.Watch, {...} ... ok
Testing: InstancesChanged, InitialInstance,
FindInstance, Instance.Next ... ok
Testing: Instance.IsValid, Instance.Remove,
Instance.DirectRemove ... ok
Testing: Instance.Class, Instance.Slots, Instance.Send
... ok
Testing: Class.InitialInstance,
Class.InitialSubclassInstance, {...} ... Segmentation
fault (core dumped)

====
(gdb) bt
#0 EnvGetNextInstanceInClassAndSubclasses_PY
(theEnv=0x57ff80,
cptr=0x100000000, iptr=0x1,
iterationInfo=0x7fbfffde40) at inscptch_py.c:98
#1 0x0000002a97f1c11a in
g_getNextInstanceInClassAndSubclasses (
self=0x57ff80, args=0x2a9821a3b0) at clipsmodule.c:5719
#2 0x0000003f0218973f in _PyEval_SliceIndex ()
from /usr/lib64/libpython2.3.so.1.0

Discussion

  • Logged In: YES
    user_id=328337

    Hi,

    I just checked into the CVS repository a version that (among
    other things) swaps the order of some checks: the module now
    checks for a valid instance before incrementing the CLIPS
    reference count for it. I don't remember why I did the opposite
    before. It should raise an exception where it currently
    segfaults. This is still not the behaviour I was expecting, but
    fixing that requires some more step-by-step debugging, and
    maybe some patching of the CLIPS sources. I will try to debug
    the code thoroughly... but maybe I will need to spend part of
    this weekend in a more constructive way than pizzas and social
    life.

    OTOH, the information you posted is really useful for me to
    see what happens: this kind of bug is usually the result of
    poor design, and a good debugging session will probably
    help me isolate it.

    I hope to pop up with a solution as soon as possible, within
    the next two or three days. Meanwhile I'll stay tuned on SF,
    in case you test the CVS version and find some more clues. I
    still have to sign up for the amd64 compile server on SF;
    I'll probably do it tomorrow.

    Have a nice weekend,

    F.
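
The reordering described in this comment (validate the instance first, only then touch its reference count) can be sketched roughly as below. This is a simplified illustration, not the PyCLIPS or CLIPS code: `sketch_instance`, `sketch_instance_is_valid` and `sketch_acquire` are hypothetical names, and the real validity check lives inside the CLIPS engine.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a CLIPS instance; not the real CLIPS struct. */
typedef struct {
    bool valid;      /* in the real engine a validity call would decide this */
    int  ref_count;  /* engine-side reference count */
} sketch_instance;

/* Hypothetical validity check: rejects NULL and stale instances. */
bool sketch_instance_is_valid(const sketch_instance *inst)
{
    return inst != NULL && inst->valid;
}

/* Returns 0 on success, -1 if the instance must not be touched.
 * The check runs before the increment, so a rejected instance is never
 * modified; a wrapper could map -1 to a Python exception. */
int sketch_acquire(sketch_instance *inst)
{
    if (!sketch_instance_is_valid(inst))
        return -1;
    inst->ref_count++;
    return 0;
}
```

With this ordering a rejected instance keeps its reference count unchanged and the caller gets an error code instead of a write through a bad pointer. It does not help when the validity check itself is handed a garbage address, which is what the second backtrace in the next comment shows.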

     
  • Logged In: YES
    user_id=88251

    cvs update from 2004-02-25 13:34 EST

    #####

    Testing PyCLIPS top level module
    Building classes
    Initial/Next Instance 1
    Initial/Next SubclassInstance 1
    Test01
    Test02
    Test03
    Test04
    Initial/Next Instance 2
    Initial/Next SubclassInstance 2
    Test05
    Test06
    Testing ends of lists
    Test07 ok
    Test08 ok
    Test09 ok
    Segmentation fault (core dumped)

    #######

    #0 0x0000002a9577f9e2 in EnvValidInstanceAddress
    (theEnv=0x520530, iptr=0x4554415254535f48) at
    ./clipssrc/inscom.c:648
    No locals.
    #1 0x0000002a9574a41a in
    g_getNextInstanceInClassAndSubclasses (self=0x520530,
    args=0x0) at clipsmodule.c:5760
    p = (clips_InstanceObject *) 0x2a955c7290
    q = (clips_InstanceObject *) 0x2a955c72b0
    c = (clips_DefclassObject *) 0x2a955c52b8
    o = {supplementalInfo = 0x0, type = 4, value =
    0x522cd0, begin = 1, end = 4294967295, next = 0x0}
    ptr = (void *) 0x4554415254535f48
    #2 0x0000003f0218973f in _PyEval_SliceIndex () from
    /usr/lib64/libpython2.3.so.1.0
    No symbol table info available.

    #####

    Testing: InstancesChanged, InitialInstance, FindInstance,
    Instance.Next ... ok
    Testing: Instance.IsValid, Instance.Remove,
    Instance.DirectRemove ... ok
    Testing: Instance.Class, Instance.Slots, Instance.Send ... ok
    Testing: Class.InitialInstance,
    Class.InitialSubclassInstance, {...} ... Segmentation fault
    (core dumped)

    #####

    #0 EnvGetNextInstanceInClassAndSubclasses_PY
    (theEnv=0x51dda0, cptr=0x100000000, iptr=0x1,
    iterationInfo=0x7fbfffded0) at inscptch_py.c:98
    nextInstance = (INSTANCE_TYPE *) 0x0
    theClass = (DEFCLASS *) 0x0
    #1 0x0000002a959973da in
    g_getNextInstanceInClassAndSubclasses (self=0x51dda0,
    args=0x2a95c983f8) at clipsmodule.c:5754
    p = (clips_InstanceObject *) 0x2a955c7f30
    q = (clips_InstanceObject *) 0x6031e0
    c = (clips_DefclassObject *) 0x2a95c8fd20
    o = {supplementalInfo = 0x0, type = 4, value =
    0x1648e70, begin = 1, end = 4294967295, next = 0x0}
    ptr = (void *) 0x1643320
    #2 0x0000003f0218973f in _PyEval_SliceIndex () from
    /usr/lib64/libpython2.3.so.1.0
    No symbol table info available.

     
  • Logged In: YES
    user_id=328337

    Well, the *same* error in different conditions strengthens my
    suspicions about this riddle. In fact I followed the code
    flow, and there is a point where a (long)((register
    unsigned) "could-be-0" - 1) operation is performed. Maybe
    this is the guilty piece of code. If you want to look, I bet it is
    in file clipssrc/classinf.c:614, where a register unsigned is
    used in SetpDOEnd(...). If you look at the SetpDOEnd _macro_, it
    casts the result of a subtraction to long *after* performing the
    subtraction. But "register unsigned int" might not be the
    same size as long... I don't know the details of the x86_64
    platform, but in my opinion it could affect situations like
    this one.

    All of this just to say that I have a small patch for the "[Env]
    Set?DO*" macros (clipssrc/evaluatn.h), which I enclosed in a
    small shell file. The script (it just invokes patch against a
    homemade diff) is attached as "pev_ia64.sh" at the bottom of
    the page: download it to the same directory as setup.py, and
    run it. What it does is add a cast to long to the macros'
    argument before the operation: maybe that's enough. If it
    is, it should also correct other possible error conditions on
    this platform. I tested that the patched CLIPS source works
    on 32-bit platforms, and asked the folks@SF for access
    to the compile farm (an AMD64 for x86_64).

    If you want to see what happens, grab the patch and try it.
    As soon as my CF account is enabled I will do the same.
    For now this is the most that I can do, but there is still some
    time before the weekend ends.

    Have a nice weekend,

    F.
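
The difference between casting after and before the subtraction can be reproduced in isolation. The sketch below is not the CLIPS source: the macro and struct names are hypothetical, and fixed-width types stand in for the platform types (`uint32_t` for a 32-bit `register unsigned`, `int64_t` for a 64-bit `long`) so the result does not depend on the machine it is built on.

```c
#include <stdint.h>

/* Hypothetical stand-in for the data object whose "end" field is set. */
typedef struct {
    int64_t end;   /* plays the role of a 64-bit long */
} sketch_data_object;

/* Shape described in the comment: subtract first, then cast.
 * With a 32-bit unsigned value of 0, the subtraction wraps around
 * before the result is widened. */
#define SKETCH_SET_END_CAST_AFTER(target, val) \
    ((target)->end = (int64_t)((val) - 1))

/* Shape of the patch as described: widen the argument first,
 * then subtract in the wider signed type. */
#define SKETCH_SET_END_CAST_BEFORE(target, val) \
    ((target)->end = (int64_t)(val) - 1)

int64_t end_with_cast_after(uint32_t count)
{
    sketch_data_object o;
    SKETCH_SET_END_CAST_AFTER(&o, count);
    return o.end;
}

int64_t end_with_cast_before(uint32_t count)
{
    sketch_data_object o;
    SKETCH_SET_END_CAST_BEFORE(&o, count);
    return o.end;
}
```

For a count of 0, `end_with_cast_after` returns 4294967295 while `end_with_cast_before` returns -1; for non-zero counts the two agree. The first value is consistent with the `end = 4294967295` visible in the gdb dumps earlier in this thread. When `long` is also 32 bits, the wrapped value converts back to -1, which would explain why the problem does not show up on 32-bit builds.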

     
  • Possible patch for IA64 architectures

     
    Attachments
  • Logged In: YES
    user_id=328337

    Good news!

    I got my compile farm account, and have just successfully
    built and tested PyCLIPS there. I tried the
    unpatched version first, and got exactly your error. After
    patching evaluatn.h I ran the test suite and it completed
    successfully. Too bad the error is not in my source tree, but
    I'll provide the patch in following releases for people lucky
    enough to have access to 64-bit platforms. Nothing is known
    yet about Win64 - I can't afford it.

    I will just wait for your feedback to close the bug, and go
    back to my memory leak hunting.

    Thank you again for submitting the annoyance; if it still
    persists I'll keep working on it with your help. And if you find
    some other unexpected behaviour, bug notices and
    suggestions are always welcome...

    Hope to hear from you soon,

    F.

     
    • status: open --> closed
     
  • Logged In: YES
    user_id=88251

    That patch works great. All test suites I have tried work
    without issue.

    I appreciate your help with this and hope it helps the CLIPS
    folks, too.

    Plus now you have an x86_64 compile farm account. Go you. :-)

     
    • status: closed --> closed-fixed
     