From: Karl G. <kar...@ma...> - 2024-01-07 00:27:11
|
Hi all,

This dinosaur just upgraded from PDL v2.025 to v2.084 (yes, I know that is lame).

I noticed a few things when running one of my complicated codes; I will start separate email threads for them.

First, there seems to be a serious rcols bug. E.g. create a file:

    # tmp.dat
    1
    2
    3
    4

Then:

    Loaded PDL v2.084 (supports bad values)
    pdl> $x = rcols 'tmp.dat'
    Reading data into ndarrays of type: [ Double ]
    Read in 4 elements.

    pdl> p $x
    [1 2 3 4]
    pdl> $x *= 100

    pdl> p $x
    [100 200 300 400]
    pdl> p median($x)
    0
    pdl> p $x
    [100 200 300 400]

It seems the median function sees the values BEFORE the in-place multiplication, whereas print does not. This is very bad. min() and max() are similar. No idea what is going on here! The behaviour is absent from v2.025.

Notes
- making a $x->copy() removes the effect
- creating $x using sequence also removes it, so it is something to do with rcols() and not in-place operations in general?

I’d be interested to know if others can reproduce this. It definitely needs a fix.

best

Karl
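The workaround mentioned in the notes (taking a copy before computing statistics) can be sketched as a standalone script. This is only a minimal sketch: it assumes the data file holds just the four numbers, and the variable name `$safe` is purely illustrative. No expected values are shown for the statistics on the original `$x`, since those depend on the installed PDL version.

```perl
use strict;
use warnings;
use PDL;

# Recreate the four-line data file from the report (numbers only).
open my $fh, '>', 'tmp.dat' or die "cannot write tmp.dat: $!";
print {$fh} "$_\n" for 1 .. 4;
close $fh or die "cannot close tmp.dat: $!";

my $x = rcols('tmp.dat');
$x *= 100;

# Workaround noted in the report: compute statistics on a copy,
# which has its own freshly allocated data.
my $safe = $x->copy;

print "x             = $x\n";
print "median(copy)  = ", $safe->median, "\n";
print "min/max(copy) = ", $safe->min, " / ", $safe->max, "\n";
```

On an affected version, calling `median`, `min` or `max` directly on `$x` is what the report shows going wrong; the copy is reported to behave correctly.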
From: Karl G. <kar...@ma...> - 2024-01-07 01:32:38
|
Further to this, I looked through the rcols() diff. I could find no significant change in the code that smelled like it could cause this.

Here is a visual diff (v2.025, v2.084): https://www.diffchecker.com/w2FX8O61/

I am afraid it must be a subtle bug to do with the internal routines that rcols uses (buffering and extending of ndarrays?) and perhaps the underlying dataflow engine. Arghh!

Probably worth tracking down, as it might be causing other badness...

Karl
From: Luis M. <mo...@ic...> - 2024-01-07 01:56:06
|
I noticed that medover and maxover do work as expected in this case.

--
W. Luis Mochán, Instituto de Ciencias Físicas, UNAM
Av. Universidad s/n CP 62210, Cuernavaca, Morelos, México
tel:(52)(777)329-1734  fax:(52)(777)317-5388  mo...@fi...
GPG: 791EB9EB, C949 3F81 6D9B 1191 9A16 C2DF 5F0A C52B 791E B9EB
From: Luis M. <mo...@ic...> - 2024-01-07 02:29:18
|
I was able to reproduce it, and I don't understand it. Median, min and max fail even after printing the updated $x, as if there were two different variables $x.
From: Karl G. <kar...@ma...> - 2024-01-07 04:42:01
|
Ah! I believe the difference between medover and median is a clump(-1) to collapse the dimensions.

There does indeed seem to be something wrong with clump too, so that is probably the underlying cause:

    pdl> $x = rcols 'tmp.dat'
    Reading data into ndarrays of type: [ Double ]
    Read in 4 elements.

    pdl> p $x
    [1 2 3 4]
    pdl> $x *= 100

    pdl> p $x
    [100 200 300 400]
    pdl> p $x->clump(-1)
    [0 0 0 0]

What could be happening in rcols() that produces an ndarray that behaves like that?

Karl
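The relationship Karl describes between medover and median can be checked on a freshly created ndarray, which the thread reports as unaffected. The following is a minimal sketch: it assumes, as Karl believes, that median amounts to collapsing all dimensions with clump(-1) and then applying medover, and the 2x3 test array is chosen only for illustration.

```perl
use strict;
use warnings;
use PDL;

# A freshly created 2x3 ndarray holding the values 1..6.
my $x = sequence(2, 3) + 1;

my $per_row = $x->medover;     # median over the first dimension only
my $flat    = $x->clump(-1);   # collapse all dimensions into one
my $overall = $flat->medover;  # median over every element

print "medover            = $per_row\n";
print "clump(-1)          = $flat\n";
print "clump(-1)->medover = $overall\n";
print "median             = ", $x->median, "\n";
```

If the two last lines agree, any difference between median and medover on the rcols output points at the clump step rather than at the median calculation itself.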
From: Karl G. <kar...@ma...> - 2024-01-07 05:32:48
|
OK, here is some deeper diving into the problem:

    use PDL;
    $x = rcols 'tmp.dat'; # This causes the error
    #$x = sequence(4)+1; # This works
    print $x, "\n";
    $x *= 100;
    print $x, "\n";
    $y=&PDL::_clump_int($x,-1);
    print $y, "\n"; # prints [0 0 0 0]

It looks like the error is happening in the internal PP routine _clump_int in slices.pd.

I also found ->sever() had the same behaviour with rcols:

    pdl> $x = rcols 'tmp.dat'
    Reading data into ndarrays of type: [ Double ]
    Read in 4 elements.

    pdl> $x *= 100

    pdl> p $x
    [100 200 300 400]
    pdl> p $x->sever
    [0 0 0 0]

Note this is not a general problem with dataflow: if I make a sequence and slice or index it, then the ops work fine. It is just something weird about the ndarray produced by rcols.

Karl
From: Karl G. <kar...@ma...> - 2024-01-07 06:30:12
|
Seems I can’t help myself! I have now found the offending line in rcols() and can now reproduce this without rcols. That should at least make it easier to track down.

    use PDL;
    $x = sequence(100)+1; # This works
    $x = $x->mv(-1,0)->slice("0:3");
    print $x, "\n";
    $x *= 100;
    print $x, "\n";
    $y=&PDL::_clump_int($x,-1);
    print $y, "\n"; # prints [0 0 0 0]
    1;

It seems to be the particular combination of ->mv and ->slice (that was on line 693 of misc.pd). This is a 1D ndarray, so the ->mv(-1,0) should do nothing. Removing it, however, makes the problem go away.

Works fine in PDL-2.025. Some bug that has been introduced in ->mv?

Sorry for the stream-of-consciousness series of emails. I will stop looking now...

Karl
From: Ed . <ej...@ho...> - 2024-01-13 14:16:30
|
Hi Karl,

Thank you for both reporting this issue and then doing this deep investigation as well.

In order to track down exactly when this changed, I stripped down your repro code to:

    use PDL::LiteF; # not full PDL so can just build “make core”
    $x = sequence(4)+1;
    $x = $x->mv(-1,0)->slice("0:2");
    $x *= 100;
    $y=&PDL::_clump_int($x,-1);
    print "x=$x y=$y\n";
    $wrong = "$y" eq "[0 0 0]";
    exit $wrong;

As you can see in that comment, if one comments out the multiplication, it all still works correctly. The “sequence” can be replaced with hardcoded “pdl [1..5]”. But if one changes either the “mv”, or the “slice”, or the inplace “*=” (including replacing with “$x = $x * 100”), or the “_clump_int”, it doesn’t misbehave.

Anyway, using the above code to tell me whether I’d found where it started failing, I did “git bisect” as follows (noted here to help anyone who wants to do this themselves; I’d forgotten and had to look up how):

    git bisect start
    git bisect bad # current “master” is bad
    git bisect good 2.025 # tell it it was working as of 2.025
    perl Makefile.PL && time make core && perl -Mblib repro-script; echo $? # kept running this, then:
    git bisect bad # if failing
    git bisect good # if working correctly
    git bisect reset # when finished, to close down the bisect

Note the use of “make core”, which takes about 2 mins from scratch on my system, vs about 6 to “make” everything, saving lots of time.

It turns out it was this, which was released with 2.078:

    commit a4678091acf7e450c02a7b0feaf3c7578f37e53f
    Author: Ed J <mo...@us...>
    Date: Sun Apr 3 22:27:28 2022 +0100

        parents of non-flowing trans also track trans_children so can de-register on destroy

     Basic/Core/pdlapi.c | 46 +++++++++++++++++++---------------------------
     1 file changed, 19 insertions(+), 27 deletions(-)

I am now investigating a fix, which, given how specific the trigger is, will probably be small, and is surely related to book-keeping of parents vs children, and flowing transformations.

Best regards,
Ed
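The manual build-and-test loop in Ed's message could in principle be handed to "git bisect run", which reads the exit code of a command: 0 marks the commit good, 125 asks git to skip it, and other codes from 1 to 127 mark it bad. Below is a minimal sketch of such a wrapper. The file name bisect-step.pl and the helper bisect_code are hypothetical; the build commands and the repro-script name are taken from Ed's message, and the choice to skip commits that fail to build is an assumption, not something Ed describes doing.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Map a build result and the repro script's wait status to an exit code
# understood by "git bisect run":
#   0 = good, 1 = bad, 125 = this commit cannot be tested (skip).
sub bisect_code {
    my ($build_ok, $repro_status) = @_;
    return 125 unless $build_ok;
    return $repro_status == 0 ? 0 : 1;
}

# Only build and test when a repro script path is given on the command line.
if (my $repro = shift @ARGV) {
    # Same build step as in the message: core only, to save time.
    my $build_ok = system('perl Makefile.PL && make core') == 0;
    # Run the repro against the freshly built blib; it exits non-zero when wrong.
    my $status = $build_ok ? system($^X, '-Mblib', $repro) : 0;
    exit bisect_code($build_ok, $status);
}
```

After "git bisect start", "git bisect bad" and "git bisect good 2.025", this would be invoked as "git bisect run perl bisect-step.pl repro-script", with both scripts kept untracked so they survive each checkout.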
From: Karl G. <kar...@ma...> - 2024-01-15 07:17:40
|
Thanks Ed.

Yes, I can confirm the code snippet misbehaves as you describe on PDL-2.084.

Good detective work and impressive use of git! Hope it is just a bug.

I can also confirm it works on 2.077 and fails on 2.078.

Karl
From: Ed . <ej...@ho...> - 2024-01-28 20:21:34
Attachments:
montage.png
|
Hi Karl, I can confirm it’s definitely “just” a bug (which isn’t surprising, given it used to work and intuitively feels like it still should). I don’t have a fix quite yet, but wanted to share my findings so far. In the latest git master, I’ve expanded the ndarray API slightly so you can achieve similar to “$pdl->dump” just in Perl. Using that, there is now an implementation of what I threatened to do some time ago: code to take an ndarray, and walk its graph of connected PDL operations, other ndarrays, repeat. There is also code to take that graph, and visualise it using GraphViz2 (the Perl module) and graphviz (the excellent utility). This is the script that exercises the bug (the graph-visualising bit of which is now in the docs for the new “PDL::Core::pdumpgraphvizify”), including use of “PDL::Core::set_debugging(1)”, which makes PDL tell you what it’s doing step-by-step: use PDL; $MV = 1; $MULT = 1; $count = 1; $format = 'png'; sub output { $g = PDL::Core::pdumpgraph(PDL::Core::pdumphash($_[0])); require GraphViz2; $gv = GraphViz2->from_graph(PDL::Core::pdumpgraphvizify($g)); $gv->run(format => $format, output_file => 'output'.$count++.".$format"); } $x_orig = pdl [1..4]; output($x_orig); $x_mv = $MV ? $x_orig->mv(-1,0) : $x_orig; output($x_orig); $x_slice = $x_mv->slice("0:2"); output($x_orig); $x_slice *= 100 if $MULT; output($x_orig); $y = &PDL::_clump_int($x_slice,-1); output($x_orig); PDL::Core::set_debugging(1); print "DUMPX y: $y\n"; # prints [0 0 0 0] PDL::Core::set_debugging(0); output($x_orig); I also attach the image made by ImageMagick “montage”, partly because I think it’s pretty neat. The dashed line shows for a “vaffine” ndarray, where its data-pointer points to. PDL operations are red-outlined, ndarrays are blue. As the image shows, between state 5 (before printing the “clump” output) and 6 (after that), things go a bit wrong. 
Before, the slice-output ndarray has no data of its own, and points to the original ndarray, modified by the multiplication of some of its elements. After, it does have data of its own allocated, which is zeroes, but still points to the original (and printing it looks there). However, its “child”, via “clump”, looks in its data, and gets zeroes. There are several subtle (which is why it takes such specific circumstances to trigger), interacting bugs: * the printing does “make_physical” which it shouldn’t * the “make_physical” goes recursively, which is more than it needs to * when “make_physical” happens to a vaffine trans, it allocates data but doesn’t load the right data into it Best regards, Ed ________________________________ From: Karl Glazebrook <kar...@ma...> Sent: Monday, January 15, 2024 7:17:15 AM To: Ed . <ej...@ho...> Cc: perldl <pdl...@li...> Subject: Re: [Pdl-general] Changes I noted PDL2.025 -> PDL2.084 - rcols issue Thanks Ed. Yes I can confirm the code snippet bugs as you describe on PDL-2.084 Good detective work and impressive use of git! Hope it is just a bug. I can also confirm it works on 2.077 and bugs on 2.078 Karl On 14 Jan 2024, at 1:16 am, Ed . <ej...@ho...<mailto:ej...@ho...>> wrote: Hi Karl, Thank you for both reporting this issue, then doing this deep investigation as well. In order to track down exactly when this changed, I stripped down your repro code to: use PDL::LiteF; # not full PDL so can just build “make core” $x = sequence(4)+1; $x = $x->mv(-1,0)->slice("0:2"); $x *= 100; $y=&PDL::_clump_int($x,-1); print "x=$x y=$y\n"; $wrong = "$y" eq "[0 0 0]"; exit $wrong; As you can see in that comment, if one comments out the multiplication, it all still works correctly. The “sequence” can be replaced with hardcoded “pdl [1..5]”. But if one changes either the “mv”, or the “slice”, or the inplace “*=” (including replacing with “$x = $x * 100” ), or the “_clump_int”, it doesn’t misbehave. 
Anyway, using the above code to tell me whether I’d found where it started failing, I did “git bisect” as follows (noted here to help anyone who wants to do this themselves; I’d forgotten and had to look up how):

    git bisect start
    git bisect bad # current “master” is bad
    git bisect good 2.025 # tell it it was working as of 2.025
    perl Makefile.PL && time make core && perl -Mblib repro-script; echo $? # kept running this, then:
    git bisect bad # if failing
    git bisect good # if working correctly
    git bisect reset # when finished, to close down the bisect

Note the use of “make core” which takes about 2 mins from scratch on my system, vs about 6 to “make” everything, saving lots of time. It turns out it was this, which was released with 2.078:

    commit a4678091acf7e450c02a7b0feaf3c7578f37e53f
    Author: Ed J <mo...@us...>
    Date: Sun Apr 3 22:27:28 2022 +0100

        parents of non-flowing trans also track trans_children so can de-register on destroy

     Basic/Core/pdlapi.c | 46 +++++++++++++++++++---------------------------
     1 file changed, 19 insertions(+), 27 deletions(-)

I am now investigating a fix, which given how specific it is to trigger, will probably be small, and is surely related to book-keeping of parents vs children, and flowing transformations.

Best regards,
Ed

From: Karl Glazebrook via pdl-general <pdl...@li...>
Sent: 07 January 2024 06:30
To: perldl <pdl...@li...>
Subject: Re: [Pdl-general] Changes I noted PDL2.025 -> PDL2.084 - rcols issue

Seems I can’t help myself! I have now found the offending line in rcols() and can now reproduce this without rcols. That should at least make it easier to track down.

    use PDL;
    $x = sequence(100)+1; # This works
    $x = $x->mv(-1,0)->slice("0:3");
    print $x, "\n";
    $x *= 100;
    print $x, "\n";
    $y=&PDL::_clump_int($x,-1);
    print $y, "\n"; # prints [0 0 0 0]
    1;

It seems to be the particular combination of ->mv and ->slice (that was on line 693 of misc.pd). This is a 1D ndarray so the ->mv(-1,0) should do nothing.
Removing it however makes the problem go away. Works fine in PDL-2.025. Some bug that has been introduced in ->mv ?

Sorry for the stream-of-consciousness series of emails. I will stop looking now...

Karl

On 7 Jan 2024, at 4:32 pm, Karl Glazebrook <kar...@ma...> wrote:

OK here is some deeper diving into the problem

    use PDL;
    $x = rcols 'tmp.dat'; # This does cause the error
    #$x = sequence(4)+1; # This works
    print $x, "\n";
    $x *= 100;
    print $x, "\n";
    $y=&PDL::_clump_int($x,-1);
    print $y, "\n"; # prints [0 0 0 0]

Looks like the error is happening in the internal pp routine _clump_int in slices.pd.

I also found ->sever() had the same behaviour with rcols:

    pdl> $x = rcols 'tmp.dat'
    Reading data into ndarrays of type: [ Double ]
    Read in 4 elements.
    pdl> $x *= 100
    pdl> p $x
    [100 200 300 400]
    pdl> p $x->sever
    [0 0 0 0]

Note this is not a general problem with dataflow: if I make a sequence and slice or index it then the ops work fine. It is just something weird on the ndarray produced by rcols.

Karl

On 7 Jan 2024, at 3:41 pm, Karl Glazebrook via pdl-general <pdl...@li...> wrote:

Ah! I believe the difference between medover and median is a clump(-1) to collapse the dimensions

There does indeed seem to be something wrong with clump too, so that is probably the underlying cause

    pdl> $x = rcols 'tmp.dat'
    Reading data into ndarrays of type: [ Double ]
    Read in 4 elements.
    pdl> p $x
    [1 2 3 4]
    pdl> $x *= 100
    pdl> p $x
    [100 200 300 400]
    pdl> p $x->clump(-1)
    [0 0 0 0]

What could be happening in rcols() that produces an ndarray that behaves like that?

Karl

On 7 Jan 2024, at 12:48 pm, Luis Mochan <mo...@ic...> wrote:

I noticed that medover and maxover do work as expected in this case.
On Sun, Jan 07, 2024 at 11:26:56AM +1100, Karl Glazebrook via pdl-general wrote:

Hi all,

This dinosaur just upgraded from PDL v2.025 to v2.084 (yes, I know that is lame)

I noticed a few things when running one of my complicated codes, I will start separate email threads

First there seems to be a serious rcols bug:

e.g. create a file

    # tmp.dat
    1
    2
    3
    4

    Loaded PDL v2.084 (supports bad values)
    pdl> $x = rcols 'tmp.dat'
    Reading data into ndarrays of type: [ Double ]
    Read in 4 elements.
    pdl> p $x
    [1 2 3 4]
    pdl> $x *= 100
    pdl> p $x
    [100 200 300 400]
    pdl> p median($x)
    0
    pdl> p $x
    [100 200 300 400]

It seems the median function sees the values BEFORE the inplace multiplication, whereas print does not. This is very bad. min() and max() are similar. No idea what is going on here! The behaviour is absent from v2.025

Notes
- making a $x->copy() removes the effect
- creating $x using sequence also removes, so it is something to do with rcols() and not inplace in general?

I’d be interested to know if others can reproduce this. It definitely needs a fix

best

Karl

_______________________________________________
pdl-general mailing list
pdl...@li...
https://lists.sourceforge.net/lists/listinfo/pdl-general

--
                                                                  o
W. Luis Mochán,                      | tel:(52)(777)329-1734     /<(*)
Instituto de Ciencias Físicas, UNAM  | fax:(52)(777)317-5388     `>/   /\
Av. Universidad s/n CP 62210         |                           (*)/\/  \
Cuernavaca, Morelos, México          | mo...@fi...               /\_/\__/
GPG: 791EB9EB, C949 3F81 6D9B 1191 9A16 C2DF 5F0A C52B 791E B9EB

_______________________________________________
pdl-general mailing list
pdl...@li...
https://lists.sourceforge.net/lists/listinfo/pdl-general
|
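The manual bisect loop described in the message above (build, run the repro, then mark the commit good or bad by hand) can in principle be handed over to “git bisect run”, which reads the exit status of a command: 0 means good, 125 means “cannot test, skip”, and any other value from 1 to 127 means bad. What follows is only a sketch and is not from the thread: the function name “bisect_step” and the variables BUILD_CMD and REPRO_CMD are assumptions made for illustration. Only the build and repro commands used as defaults are taken from Ed’s message.

```shell
#!/bin/sh
# Hypothetical helper for "git bisect run"; all names here are illustrative.
# Exit-status convention of git bisect run:
#   0 = commit is good, 125 = commit cannot be tested (skip),
#   any other value in 1..127 = commit is bad.

# Defaults are the commands quoted in the thread; override via the environment.
BUILD_CMD=${BUILD_CMD:-"perl Makefile.PL && make core"}
REPRO_CMD=${REPRO_CMD:-"perl -Mblib repro-script"}

bisect_step() {
    # A commit that fails to build says nothing about the bug, so skip it.
    if ! sh -c "$BUILD_CMD" >/dev/null 2>&1; then
        return 125
    fi
    # The repro script exits 0 when behaviour is correct, non-zero when wrong.
    if sh -c "$REPRO_CMD"; then
        return 0
    fi
    # Normalise every failure to 1 so it can never collide with 125.
    return 1
}
```

Saved under a name such as “bisect-step.sh” (also hypothetical), it could be driven after the “git bisect start”, “git bisect bad” and “git bisect good 2.025” steps with: git bisect run sh -c '. ./bisect-step.sh; bisect_step'. Treating build failures as “skip” rather than “bad” is a design choice, so that unrelated build breakage in old commits does not mislead the search.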
From: Ed . <ej...@ho...> - 2024-02-10 20:43:04
|
Hi Karl (and all),

I am pleased (and somewhat relieved) that the latest “git master” has a fix that seems to comprehensively sort this out. See https://github.com/PDLPorters/pdl/issues/461 for discussion of what was happening (dataflow was being impeded by “DATACHANGED” not getting propagated right).

See attached for my slightly-expanded script to exercise this (all captured in PDL tests so it can’t happen again); the “+=” bit was an addition by me that revealed further “challenges”, also captured in tests. Please also see attached the result of running the 24 images that it creates through “montage output* -tile 8x3 -geometry '1x1<' montage.png”.

There will be a dev-release of PDL shortly. I’m also going to address:

* things getting “over-physicalised” (i.e. be lazier)
* make “clump” be a proper affine transformation to further reduce physical copies of data
* vaffine transformation-outputs that do get physicalised will just become a normal flowing transformation, which will make tracking these things easier

Best regards,
Ed

From: Ed . <ej...@ho...>
Sent: 28 January 2024 20:21
To: Karl Glazebrook <kar...@ma...>
Cc: perldl <pdl...@li...>
Subject: RE: [Pdl-general] Changes I noted PDL2.025 -> PDL2.084 - rcols issue

Hi Karl,

I can confirm it’s definitely “just” a bug (which isn’t surprising, given it used to work and intuitively feels like it still should). I don’t have a fix quite yet, but wanted to share my findings so far.

In the latest git master, I’ve expanded the ndarray API slightly so you can achieve something similar to “$pdl->dump” just in Perl. Using that, there is now an implementation of what I threatened to do some time ago: code to take an ndarray, and walk its graph of connected PDL operations, other ndarrays, repeat. There is also code to take that graph, and visualise it using GraphViz2 (the Perl module) and graphviz (the excellent utility).
|