Menu

#378 where on dice of pdl bad results

normal
closed-fixed
None
5
2015-09-28
2015-03-24
No

Running the following code on RHEL7:

#!/usr/bin/perl
use PDL;

print "OS: ".`uname -a`;
print "Perl: ".$^V."\n";
print "PDL: ".$PDL::VERSION."\n";

my $pdl = PDL->new([ [[0, 0]], [[1, 1]], [[2, 2]] ]);
print sprintf("\nPDL:%s\n", $pdl);
my $pslice     = $pdl->where($pdl == 0);
print sprintf("\nPDL Slice:\n%s\n", $pslice);

my $data      = $pdl->dice("X","X",[1, 0]);
print sprintf("\nData:%s\n", $data);

my $dslice     = $data->where($data == 0);
print sprintf("Should be same as PDL Slice\nData Slice:\n%s\n", $dslice);

Generates the following output:

OS: Linux riskrh7 3.10.0-123.9.3.el7.x86_64 #1 SMP Thu Oct 30 00:16:40 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux
Perl: v5.16.3
PDL: 2.007_13

PDL:
[
 [
  [0 0]
 ]
 [
  [1 1]
 ]
 [
  [2 2]
 ]
]


PDL Slice:
[0 0]

Data:
[
 [
  [1 1]
 ]
 [
  [0 0]
 ]
]

Should be same as PDL Slice
Data Slice:
[7.897708e-28 3.1620201e-322]

Expected the "Data Slice" values to be the same as "PDL Slice".

Discussion

  • Derek Lamb

    Derek Lamb - 2015-03-27

    Thank you for the bug report. I verified this also in stock 2.007, but it is not present in 2.006. So this appears to be a regression.

    In the meantime, you can get around this by severing the dataflow connection between $pdl and $data:

    my $data      = $pdl->dice("X","X",[1, 0])->sever;
    
     
  • Matthew McGillis

    Happy to hear you are able to reproduce the same problem.

    Unfortunately for what we do we need the connection. The next thing we do generally is assign a value to that slice and expect it to update the original PDL. What we are finding is this never works and in some cases will cause a segmentation fault.

     
  • Derek Lamb

    Derek Lamb - 2015-03-27

    Running your test script through the perl debugger, it appears that the garbage comes in from clump. Specifically, where() (Primitive.pm line 2499)

    $data = $_[0]->clump(-1) if $_[0]->getndims>1;
    

    which then goes to PDL::clump, and then

    return &PDL::_clump_int(@_);
    

    After that, we'll need to debug the pp_def'd _clump_int in slices.pd. I'm out of tuits for the moment (and don't have a perl with debugging symbols handy).

     
  • Matthew McGillis

    Just wondering if there is anything else I can do/provide to help move this forward.

     
  • Matthew McGillis

    Well I grabbed a copy of 2.012 and gave the code another try it has the same problem. So I took a look at Basic/Slices/slices.pd and modified like this:

    pp_def(
            '_clump_int',
            P2Child => 1,
            NoPdlThread => 1,
            DefaultFlow => 1,
            Reversible => 1,
            AffinePriv => 1,
            OtherPars => 'int n',
            RedoDims => 'int i; PDL_Indx d1;
                    int nrem; int nnew;
                    if($COMP(n) > $PARENT(ndims)) {
                            /* Now with more flavor:  truncate clumping in this case to
                             * the total number of dimensions that actually exist...
                             *  --CED 17-Mar-2002
                             */
                            $COMP(n) = -1;
    
    #ifdef older_croaking_code
                            $SETNDIMS(0);  /* fix to make sure we do not get problems later */
                            $PRIV(offs) = 0;
                            $SETDIMS();
                            $CROAK("Too many dimensions %d to clump from %d",
                                    $COMP(n),$PARENT(ndims));
    #endif
                    }
                     nrem = ($COMP(n)< 0 ? $PARENT(threadids[0]) + 1 + ($COMP(n)) : $COMP(n));
                     if(nrem < 0) {
                            $CROAK("Too many dimensions %d to leave behind when clumping from %d",-$COMP(n),$PARENT(ndims));
                     }
                     fprintf(stderr,"START n=%d,i=%d,nrem=%d,nnew=%d,ndims=%d,...\n",$COMP(n),i,nrem,nnew,$PARENT(ndims));
    
                     nnew = $PARENT(ndims) - nrem + 1;
                     fprintf(stderr,"SET n=%d,i=%d,nrem=%d,nnew=%d,ndims=%d,...\n",$COMP(n),i,nrem,nnew,$PARENT(ndims));
                     $SETNDIMS(nnew);
                     $DOPRIVDIMS();
                     $PRIV(offs) = 0;
                     d1=1;
                     for(i=0; i<nrem; i++) {
                            d1 *= $PARENT(dims[i]);
                     }
                     fprintf(stderr,"CHILD/PRIV n=%d,i=%d,nrem=%d,nnew=%d,ndims=%d,...\n",$COMP(n),i,nrem,nnew,$PARENT(ndims));
                     $CHILD(dims[0]) = d1;
                     $PRIV(incs[0]) = 1;
                     for(; i<$PARENT(ndims); i++) {
                            fprintf(stderr,"Inside CHILD/PRIV n=%d,i=%d,nrem=%d,nnew=%d,ndims=%d,...\n",$COMP(n),i,nrem,nnew,$PARENT(ndims));
                            $CHILD(dims[i-nrem+1]) = $PARENT(dims[i]);
                            $PRIV(incs[i-nrem+1]) = $PARENT(dimincs[i]);
                     }
                     $SETDIMS();
                     $SETDELTATHREADIDS(1-nrem);
                     fprintf(stderr,"n=%d,i=%d,nrem=%d,nnew=%d,ndims=%d,...\n",$COMP(n),i,nrem,nnew,$PARENT(ndims));
                     ',
            Doc => 'internal',
    );
    

    Then I ran the script which generated this:

    $ perl -I PDL-2.012/blib/lib -I PDL-2.012/blib/arch bug378 
    OS: Linux rh7 3.10.0-123.13.2.el7.x86_64 #1 SMP Fri Dec 12 19:51:03 EST 2014 x86_64 x86_64 x86_64 GNU/Linux
    Perl: v5.16.3
    PDL: 2.012
    
    PDL:
    [
     [
      [0 0]
     ]
     [
      [1 1]
     ]
     [
      [2 2]
     ]
    ]
    
    START n=-1,i=0,nrem=3,nnew=0,ndims=3,...
    SET n=-1,i=0,nrem=3,nnew=1,ndims=3,...
    CHILD/PRIV n=-1,i=3,nrem=3,nnew=1,ndims=3,...
    n=-1,i=3,nrem=3,nnew=1,ndims=3,...
    START n=-1,i=0,nrem=3,nnew=0,ndims=3,...
    SET n=-1,i=0,nrem=3,nnew=1,ndims=3,...
    CHILD/PRIV n=-1,i=3,nrem=3,nnew=1,ndims=3,...
    n=-1,i=3,nrem=3,nnew=1,ndims=3,...
    
    PDL Slice:
    [0 0]
    
    Data:
    [
     [
      [1 1]
     ]
     [
      [0 0]
     ]
    ]
    
    START n=-1,i=0,nrem=3,nnew=0,ndims=3,...
    SET n=-1,i=0,nrem=3,nnew=1,ndims=3,...
    CHILD/PRIV n=-1,i=3,nrem=3,nnew=1,ndims=3,...
    n=-1,i=3,nrem=3,nnew=1,ndims=3,...
    START n=-1,i=0,nrem=3,nnew=0,ndims=3,...
    SET n=-1,i=0,nrem=3,nnew=1,ndims=3,...
    CHILD/PRIV n=-1,i=3,nrem=3,nnew=1,ndims=3,...
    n=-1,i=3,nrem=3,nnew=1,ndims=3,...
    Should be same as PDL Slice
    Data Slice:
    [2.3715151e-322 2.4694508e-317]
    

    Nothing in the above jumps out at me as the potential problem. My guess is the issue is in the expansion of $PARENT in this for a slice vs pdl:

                     for(; i<$PARENT(ndims); i++) {
                            fprintf(stderr,"Inside CHILD/PRIV n=%d,i=%d,nrem=%d,nnew=%d,ndims=%d,...\n",$COMP(n),i,nrem,nnew,$PARENT(ndims));
                            $CHILD(dims[i-nrem+1]) = $PARENT(dims[i]);
                            $PRIV(incs[i-nrem+1]) = $PARENT(dimincs[i]);
                     }
    

    But so far I'm not finding much documentation on $PARENT.

    Probably dig into the Basic/Slices/Slices.xs next it looks like that may have more of the details about potential differences between PARENT being the original PDL vs Slice of another PDL.

     
  • Matthew McGillis

    After tracing through the code... the initial problem is:

    pdl_get_offset from Basic/Core/pdlsections.c

    For the above example when pdl_at calls pdl_get_offset it returns a value outside the pdl_at data range which is from 0-3.

    The reason it is getting values outside that range is the indx value is 2 so when we have a pos of 2 and 3 those values get multiplied by 2 resulting in a ioff value outside the possible range for the data resulting in bad values. We are determing the incs value to use in

    XS_PDL__Core_listref_c Basic/Core/Core.c

    My gues something in the logic for assigning the incs is probable not right in this context for some reason. I would assume we also have a similar issue in what ever code does assingments as well which is where core dumps are currently happening if attempt to write to these sorts of pdl's.

    Let me know how else I can help to move this forward this seems like a fairly significant bug to have pdl core dumping from using clump results.

     
  • Chris Marshall

    Chris Marshall - 2015-08-16
    • Group: critical --> normal
     
  • Matthew McGillis

    One more step closer.

    It looks like PDL sets incs inside:

    pdl_make_physvaffine Basic/Core/pdlapi.c

    From comparing execution of good code vs bad code what I'm seeing is this:

    Good Code:
    it->trans->pdsl[0]->trans->incs[0]=1
    it->trans->incs[0]=1

    Bad Code:
    it->trans->pdsl[0]->trans->incs[0]=2
    it->trans->incs[0]=2

    For the Bad Code because the slice now has incs of 2 pdl_make_physvaffine is using that in:

    newinc += at->incs[j]*ninced

    Next thing i can think of is to take a closer look at the dice code to sse how it is generating the pdls[0]->trans->incs[0] value of 2.

     
  • Matthew McGillis

    One thing I have noticed from the above is that with bad code the dice prints as:


    [
    [1 1

    ]

    [0 0

    ]
    ]

    But with good code the dice prints as:


    [
    [0 0

    ]

    [1 1

    ]
    ]

    Which sort of makes since givne that the code uses different incs values.

     
  • Matthew McGillis

    That previous post was incorrect both good code and bad code will produce the same exact results for the dice with the 1s first then the 0s.

    But the dice transform is very different between the good and bad code.

    For good code it looks something like this:

    ## from ndims 3 nvals 4
    ## data 0 1.000000
    ## data 1 1.000000
    ## data 2 0.000000
    ## data 3 0.000000
    ## ndims 2 offset 0 ioff 0
    ## dim 0 pos 0 dims 2 incs 1
    ## dim 1 pos 0 dims 1 incs 2
    ## ndims 2 offset 0 ioff 1
    ## dim 0 pos 1 dims 2 incs 1
    ## dim 1 pos 0 dims 1 incs 2
    ## from ndims 3 nvals 4
    ## data 0 1.000000
    ## data 1 1.000000
    ## data 2 0.000000
    ## data 3 0.000000
    ## ndims 2 offset 2 ioff 2
    ## dim 0 pos 0 dims 2 incs 1
    ## dim 1 pos 0 dims 1 incs 2
    ## ndims 2 offset 2 ioff 3
    ## dim 0 pos 1 dims 2 incs 1
    ## dim 1 pos 0 dims 1 incs 2
    

    For bad code it looks something like this:

    ## from ndims 3 nvals 4
    ## data 0 1.000000
    ## data 1 0.000000
    ## data 2 1.000000
    ## data 3 0.000000
    ## ndims 2 offset 0 ioff 0
    ## dim 0 pos 0 dims 2 incs 2
    ## dim 1 pos 0 dims 1 incs 1
    ## ndims 2 offset 0 ioff 2
    ## dim 0 pos 1 dims 2 incs 2
    ## dim 1 pos 0 dims 1 incs 1
    ## from ndims 3 nvals 4
    ## data 0 1.000000
    ## data 1 0.000000
    ## data 2 1.000000
    ## data 3 0.000000
    ## ndims 2 offset 1 ioff 1
    ## dim 0 pos 0 dims 2 incs 2
    ## dim 1 pos 0 dims 1 incs 1
    ## ndims 2 offset 1 ioff 3
    ## dim 0 pos 1 dims 2 incs 2
    ## dim 1 pos 0 dims 1 incs 1
    

    The above is created by adding the following debug output to

    Basic/Core/Core.c line 1392

       if (PDL_VAFFOK(x)) {
         PDLDEBUG_f(printf("## from ndims %i nvals %i\n",x->vafftrans->from->ndims,
                           x->vafftrans->from->nvals));
         for(ind=0; ind < x->vafftrans->from->nvals; ind++) {
           PDLDEBUG_f(printf("## data %i %f\n",ind,((PDL_Double *)data)[ind]));
         }
       } else {
         PDLDEBUG_f(printf("## x ndims %i nvals %i\n",x->ndims,x->nvals));
         for(ind=0; ind < x->nvals; ind++) {
           PDLDEBUG_f(printf("## data %i %f\n",ind,((PDL_Double *)data)[ind]));
         }
       }
    

    Basic/Core/pdlsections.c line 223

       PDLDEBUG_f(printf("## ndims %i offset %i ioff %i\n",ndims,offset,ioff));
       for (i=0;i<ndims;i++) {
         PDLDEBUG_f(printf("## dim %i pos %i dims %i incs %i\n",i,
                           pos[i],dims[i],incs[i]));
       }
    
     

    Last edit: Matthew McGillis 2015-08-26
  • Matthew McGillis

    This commit is what broke clump with dice:

    http://sourceforge.net/p/pdl/code/ci/ef5728b56ed998658c774f3abe8c7598d471f2dd/

    The change from:

      for ($a..$n-1) {
         $ix = $ix->dummy(0);
      }
    
      return $self->mv($axis,0)->index($ix)->mv($n-1,$axis);
    

    To:

      return $self->mv($axis,0)->index1d($ix)->mv(0,$axis);
    

    Is not compatible with clump. Would be ncie to revert back or find another solution that is compatible with clump results.

     
  • Matthew McGillis

    I added this to my t/slices.t to check for this issue:

    # Test of dice clump compatability
    my $xxx = PDL->new([[[0,0]],[[1,1]],[[2,2]]]);
    is_deeply($xxx->where($xxx == 0)->unpdl,[0,0],"dice clump base zero");
    my $dice = $xxx->dice("X","X",[1,0]);
    is_deeply($dice->clump(-1)->unpdl,[1,1,0,0],"dice clump correct");
    is_deeply($dice->where($dice == 0)->unpdl,[0,0],"dice clump where zero");
    
     
  • Chris Marshall

    Chris Marshall - 2015-09-28
    • status: open --> closed-fixed
    • assigned_to: Craig DeForest
     
  • Chris Marshall

    Chris Marshall - 2015-09-28

    Fixed in git master and will appear in coming PDL-2.014 release. Thanks for reporting and debugging the problem, Matt.

     
  • Matthew McGillis

    Thanks for getting the patch Craig put together into the next release.

    Glanced at the commit and didn't see any additions to the test cases. The above provided test cases would help detect these sorts of issues if you add them in to the regression tests. If you didn't notice them you might want to consider induding them.

     
    • Chris Marshall

      Chris Marshall - 2015-09-28

      Added, thanks for the reminder. I didn't realize they weren't in the patch.

      On Mon, Sep 28, 2015 at 3:01 PM, Matthew McGillis mmcgillis@users.sf.net
      wrote:

      Thanks for getting the patch Craig put together into the next release.

      Glanced at the commit and didn't see any additions to the test cases. The
      above provided test cases would help detect these sorts of issues if you
      add them in to the regression tests. If you didn't notice them you might
      want to consider induding them.


       

Log in to post a comment.