#328 WWZ statistic goes to zero in the presence of gaps

Defect
open
Algorithms (37)
8
2018-03-05
2012-07-10
David Benn
No

Brian Kloppenborg reported a bug in WWZ when I was at NACAA 2012. Here are some email excerpts.

"I have found an annoying bug in the WWZ algorithm that leads to premature termination of the WWZ algorithm. This happens both in WinWWZ and in VStar. Steps to reproduce:

1) Load the V_Vstar.[csv|txt] file into VStar or WinWWZ
2) Choose a period range from 20 to 300 days in 0.5 day steps. Decay of 0.0125
3) Run it.

VStar will terminate after the first block of data. WinWWZ does better, but still appears to terminate before the data ends."

The data file V_Vstar.txt is attached.

Brian also reported the following which may or may not be related:

"I also found a performance issue in VStar. In both VStar and WinWWZ, select JD 2446000-2455000 and run the WWZ with the aforementioned parameters except crank the decay down to 0.0005. This process is SIGNIFICANTLY faster in WinWWZ than it is in VStar on my machine (I'm even running Windows in a Virtual Machine)."

Discussion

  • David Benn

    David Benn - 2012-07-10

    Example data file with gaps in JD range.

     
  • David Benn

    David Benn - 2012-07-10

    Here is some additional email commentary from Brian:

    For our discussion, let's use:
    P_min: 20 day
    P_max: 250 day
    P_step: 1 day
    decay: 0.0125
    Time division: 500

    This data has three large gaps. With the parameters specified, I would expect WWZ to return zeros in the regions where there are gaps, and non-zero WWZ coefficients where there is data. Contrary to my anticipated behavior, WWZ returns ONLY zero values for the WWZ output after it encounters a gap that is greater than the decay coefficient.

    A) This is easily demonstrated in the data file you have attached. In the file max_wwz_bk_20-300.txt, there are non-zero WWZ coefficients (column 4) up to JD 2436800 after which the first gap is encountered. At this point the WWZ output remains zero despite the large block of data at JD 2445200 - 2449500.

    B) Just to check, let's start WWZ at JD 2445200 instead (using the same parameters specified above). Using WinWWZ you will get mostly non-zero output for the entire interval JD 2445200-2456000 (this is fine, expected behavior). With VStar, all coefficients are suddenly zero starting at ~2449500. Again, this is where there is a gap in the data. This contradicts the results from paragraph (A), which implied that the WWZ output in this time interval was zero. (see attached max_wwz_... file).

    C) Now, start WWZ after the gap (JD 2450000+) with the same parameters. Suddenly the coefficients that were just identically zero are mostly non-zero! This contradicts the results from paragraphs (A) and (B), which implied that the WWZ output in this region was zero.

    Note that this gap problem isn't limited to the VStar implementation. WinWWZ also chokes on the first large gap in this data set. I remember having similar issues with the Fortran version, but haven't confirmed the problem exists there (19 days until my dissertation is due).
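
One way to check observations (A) to (C) against an output file is to locate where the trailing run of zero WWZ values begins. The following is a minimal sketch only: the function name `zero_tail_start` and its arguments are hypothetical, and it assumes the time and WWZ columns have already been read into two equal-length sequences, since the exact layout of the output files is not described here.

```python
def zero_tail_start(times, wwz, tol=0.0):
    """Return the time at which the final unbroken run of zero WWZ values
    begins, or None if the last value is non-zero.

    times, wwz: equal-length sequences (hypothetical inputs, e.g. the time
    column and the WWZ column of an output file).
    tol: values with absolute value <= tol are treated as zero.
    """
    start = None
    for t, w in zip(times, wwz):
        if abs(w) <= tol:
            # Remember only the first time of the current zero run.
            if start is None:
                start = t
        else:
            # A non-zero value ends any zero run seen so far.
            start = None
    return start


# Synthetic illustration (not real WWZ output): zeros inside the series
# are ignored, only the trailing run counts.
example_times = [1.0, 2.0, 3.0, 4.0, 5.0]
example_wwz = [0.5, 0.0, 0.3, 0.0, 0.0]
tail = zero_tail_start(example_times, example_wwz)
```

If the returned time coincides with the start of a data gap and does not change when the run is started later in the series, that would match the behaviour described in (A) to (C).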

     
  • David Benn

    David Benn - 2012-07-10
    • summary: WWZ statistic goes to zero in the presence of large gaps --> WWZ statistic goes to zero in the presence of gaps
     
  • David Benn

    David Benn - 2012-07-10

    I can certainly reproduce the problem and am in the process of debugging.

     
  • David Benn

    David Benn - 2012-08-07

    I recently saw that the WWZ 1.1 Fortran documentation mentions this bug, so it's a known problem.

    In section 3: [Note, the data should not contain data gaps larger than about 2 cycles.]

    In section 4: WWZ scans the data set starting from the earliest data and progressing to the latest. If you notice that the program returns zero values of the WWZ statistic, then you probably have a large data gap just prior to the point where the zero values begin. Consider truncating the data set to include only data before or after the gap, or split the data and analyze both sets separately.

    This is consistent with what Matt has already told us.

     
  • David Benn

    David Benn - 2012-08-07
    • priority: 9 --> 8
     
  • David Benn

    David Benn - 2018-02-10

    This came up again in email correspondence with Gary Walker, who also encountered the bug. It occurred to me that I should compare against an R implementation by Grant Foster, if one exists. I know one exists for DCDFT, but I'm not sure about WWZ. I should also just talk with Grant about this bug.

     
