#511 Histogram gives wrong results

Status: closed-fixed
Owner: nobody
Labels: None
Priority: 5
Updated: 2013-02-26
Created: 2013-01-30
Creator: jmurthy
Private: No

Histogram gives different results than the explicit summing up of data. This works correctly in IDL and for simpler cases but fails for my data file. I couldn't add the file but will send if needed.

$ gdl --version
GDL - GNU Data Language, Version 0.9.2
$ more test.pro
fuv_nrows=13582379
fuv=fltarr(5,fuv_nrows)
openr,1,"all_scst_fuv_sa.csv"
readf,1,fuv
close,1
h=histogram(fuv(0,*),min=0,bin=1)
hq=where(h gt 0)
t=lonarr(15039)
for i=0,15038 do t(i)=n_elements(where(fuv(0,*) eq i))

Discussion

  • jmurthy

    jmurthy - 2013-01-30

    Here's an example with no data file:
    GDL> fuv_nrows=13582379
    GDL> fuv=fltarr(5,fuv_nrows)
    GDL> fuv(0,*) = lindgen(fuv_nrows) mod 1300
    GDL> h=histogram(fuv(0,*),min=0,bin=1)
    GDL> help,h
    H LONG = Array[1300]
    GDL> for i=0,1299 do t(i)=n_elements(where(fuv(0,*) eq i))
    GDL> q=where(h ne t)
    GDL> print,q(0)
    2
    GDL> print,h(0),t(0)
    10448 10448
    GDL> print,h(2),t(2)
    20896 10448

     
  • Alain C.

    Alain C. - 2013-02-01

    z=lindgen(13000) mod 1300
    plot, histogram(z, bin=1)
    oplot, histogram(z), psym=2

    Thank you for this bug report. This is a serious problem, triggered by the use of the keyword bin=.
    (A solution should arrive soon.)

    Alain

     
  • Josh Sixsmith

    Josh Sixsmith - 2013-02-05

    I can't replicate the result using version 0.9.3

    fuv_nrows=13582379
    fuv=fltarr(5,fuv_nrows)
    fuv(0,*) = lindgen(fuv_nrows) mod 1300
    h=histogram(fuv(0,*),min=0,bin=1)
    t=lonarr(15039)
    for i=0,1299 do t(i)=n_elements(where(fuv(0,*) eq i))
    q=where(h ne t)
    print,q
    -1

    Alain's example produces identical plots for me (the overplot coincides with the first curve). Using total() to show differences:

    z=lindgen(13000) mod 1300
    h1 = histogram(z, bin=1)
    h2 = histogram(z)
    print, total(h2 - h1)
    0.00000

    Josh

     
  • Alain C.

    Alain C. - 2013-02-19

    Still busy with urgent tasks, and also with Eigen3!

    Using the same CVS version (today's), I see clear differences between 32-bit and 64-bit OSes
    (no problem on 32-bit).

    I think the test TOTAL(h2-h1) is not meaningful: the total number of counted values is the same in both histograms, so counts that land in the wrong bin cancel out in the sum. I would use TOTAL(ABS(h2-h1)) instead.

    Alain
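
To illustrate the point about TOTAL(h2-h1) versus TOTAL(ABS(h2-h1)), here is a minimal sketch in Python (not GDL). The `histogram` helper and the "faulty" histogram `h_bad` are hypothetical and built by hand. They only mimic the kind of symptom reported above, where one bin holds twice its expected count, and are not output from GDL or the GSL.

```python
# Illustration in Python, not GDL. The misplaced counts are a made-up example.

def histogram(values, nbins):
    """Count how many values fall in each unit-width bin [i, i+1)."""
    counts = [0] * nbins
    for v in values:
        counts[int(v)] += 1
    return counts

# Same ramp as in the thread: 13000 values cycling through 0..1299.
z = [i % 1300 for i in range(13000)]
h_ref = histogram(z, 1300)  # every bin holds 10 values

# Hypothetical faulty result: the counts of bin 1 end up in bin 2.
h_bad = list(h_ref)
h_bad[2] += h_bad[1]
h_bad[1] = 0

signed_total = sum(b - r for b, r in zip(h_bad, h_ref))
abs_total = sum(abs(b - r) for b, r in zip(h_bad, h_ref))

print(signed_total)  # 0: the misplaced counts cancel each other
print(abs_total)     # 20: 10 missing from bin 1 plus 10 extra in bin 2
```

The signed sum stays at zero whenever counts are only moved between bins, so it cannot detect this class of error. The sum of absolute differences is nonzero as soon as any bin differs.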

     
  • Alain C.

    Alain C. - 2013-02-26

    A temporary solution was put in the CVS. Please test it, then report any problems in a new bug report (with an example and, please, information on whether your GDL binary is a 32-bit or 64-bit ELF).

    The problem comes from the GSL and an inaccuracy in the bin computations (integer versus float/double). We succeeded in reproducing it outside GDL, just by calling the "gsl-histogram" code on a ramp made of integers only.
    On the same input data, the 32-bit ELF "gsl-histogram" gives correct results and the 64-bit ELF gives wrong results.

    The bug is now known on the GSL side. See the thread started at
    http://lists.gnu.org/archive/html/bug-gsl/2013-02/msg00005.html
    (If you have any ideas for improving the GSL code, please post them on the GSL thread!)

    Alain
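
As a rough illustration of the "integer versus float/double" issue described above, the following Python sketch compares two ways of locating the bin of an integer value that sits exactly on a bin edge. It is not the GSL code. The functions `bin_index_float` and `bin_index_exact` are hypothetical names, and the floating-point formula is only an assumed, generic way of computing a uniform bin index. Whether any mismatch actually shows up depends on the values and on how the platform evaluates floating-point expressions, so no output is claimed here.

```python
# Illustration in Python, not GDL or GSL. Function names are hypothetical.

def bin_index_float(x, xmin, xmax, nbins):
    """Bin index via floating point: scale x to [0, 1), then multiply by nbins."""
    u = (float(x) - xmin) / (xmax - xmin)
    return int(u * nbins)  # truncation: a result just below an integer drops one bin

def bin_index_exact(x, xmin, xmax, nbins):
    """Bin index via integer arithmetic only, so there is no rounding error."""
    return ((x - xmin) * nbins) // (xmax - xmin)

# Integer ramp with unit-width bins, as in the examples of this report.
xmin, xmax, nbins = 0, 1300, 1300
ramp = range(xmin, xmax)

# Values for which the two methods disagree (possibly none).
mismatches = [x for x in ramp
              if bin_index_float(x, xmin, xmax, nbins)
              != bin_index_exact(x, xmin, xmax, nbins)]

print(len(mismatches), mismatches[:5])
```

Because every value of the ramp lies exactly on a bin edge, a rounding error of a single unit in the last place is enough for truncation to put it in the bin below. This is why integer input is the most sensitive test case for this kind of computation.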

     
  • Alain C.

    Alain C. - 2013-02-26
    • status: open --> closed-fixed
     
