Menu

#446 PSRFITS scale/offset 16-bit overflow

next release
open
nobody
None
5
2021-02-24
2021-02-23
No

I've noticed profile artifacts occasionally appearing in FITS files created by recent versions of PSRCHIVE. I believe this is due to recent changes introduced to the 16-bit integer data rescaling (ticket #440). One of the changes was to use the full 16-bit range rather than 15 bits as before. Data values seem to occasionally overflow the range now, I think likely due to floating-point precision issues in the computation.

For example see the attached ASCII profile, and plots showing how it looks originally, and after conversion to PSRFITS using psrconv -o PSRFITS -e fits. The original profile minimum point has "wrapped around" and now takes a value near the profile max. This was done with PSRCHIVE 2021-02-13 but more recent builds still show this.

One solution would be to reduce the range slightly, ie in ProfileColumn.C change the following two lines:

  double the_min = 1-pow(2,15);
  double the_max = pow(2,15)-2;

to (for example)

  double the_min = 1-pow(2,14.8);
  double the_max = pow(2,14.8)-2;

I've tested this change and it does fix this specific example, however it might be worth thinking a bit more about what the proper amount of padding should be.

1 Attachments

Related

Bugs: #446

Discussion

  • Paul Demorest

    Paul Demorest - 2021-02-23

    Here is the ASCII profile file from the example.

     
  • Paul Demorest

    Paul Demorest - 2021-02-23

    Checking that the values are in range (and clipping them if not) after applying scale/offset but before casting to int16 may also be a good idea. Perhaps in combination with a slight reduction in range as I suggested above.

     
  • Willem van Straten

    Hi Paul,

    I'm really sorry about the trouble!

    I'm also surprised that the 24-bit mantissa of a 32-bit floating point number does not have the precision to round to a 16-bit integer with a maximum error of +/- 1

    "pow(2,15)-2" was intended to be one less than the maximum possible +ve 16-bit signed integer and "1-pow(2,15)" was intended to be one more than the minimum possible -ve 16-bit signed integer.

    Unfortunately, I can't recreate the error on my laptop with the file that you provided. Could you please test if setting

    the_max = pow(2,15)-(1+N);
    the_min = N-pow(2,15);

    where N = 1, 2, 3, ...

    also solves the problem on your end? I'm not sure why, but I'd like to keep the min and max set to integer values (as opposed to 2^14.8).

    It would be nice to have a theoretical derivation for what N should be. I thought that N=1 would be sufficient ...

     
  • Paul Demorest

    Paul Demorest - 2021-02-24

    OK I've checked in a fix in cc245bd28, using N=16 and also clipping the range at INT16_MIN, INT16_MAX. Still running some tests but so far I have not run into any cases where this fails. If you want to make any additional adjustments please go ahead!

     
  • Paul Demorest

    Paul Demorest - 2021-02-24

    Regarding the expected level of error in this calculation (ie "what N should be"), I believe the error in the packed (16-bit) values at the profile min or max will be on the order of:

    int16_err ~ few * INT16_MAX * (D/R) * 1e-7

    Where INT16_MAX = 32767, D is the DC level in the profile data, R is the range (max-min) of the profile data. 1e-7 is the dynamic range of single-precision floats. The "few" is to account for accumulated error over the course of the scale/offset computation, it's maybe something like 2.

    In my example data, D/R ~ 1000, so int16_err ~ 6 counts. This seems consistent with the observation that this overflowed the range, given that the tolerance (N) was only 1 count. Also consistent with N=16 fixing this case.

    I don't think these values are unreasonable so I'm still a bit surprised this wasn't noticed before. I wonder if different compilers act differently when asked to convert an out-of-range float to an int. It's possible some clip/saturate rather than wrapping around, which would tend to mask this issue.

    I'm not sure what this implies in terms of a more robust fix. In principle D/R could be very large, although as it approaches 1e7 there will be other issues to floating point precision.

     
  • Paul Demorest

    Paul Demorest - 2021-02-24

    One last(?) thought about this is that, longer-term, it might be useful for PSRCHIVE to optionally be able to just write the data as floats and avoid this conversion entirely. PSRFITS supports this and PSRCHIVE can already read this flavor of file. Saving a factor of 2 in disk space is nice but not always necessary, depends on the context.

     
    • Willem van Straten

      Hi Paul,

      Thanks for working out the potential overflow. I agree that it would be
      nice to optionally output floats instead of int16.

      Also, I wonder if we could cast all floats to double-precision floats
      during all of the internal computations that take place before casting to
      int16 ... there might also be some clever re-ordering of operations that
      reduces accumulated rounding errors.

      ... I'll have to think more about this later! (teaching starts next week
      and I have been procrastinating on tackling the mountain of preparations.)

      Cheers,
      Willem

      On Thu, 25 Feb 2021 at 09:50, Paul Demorest demorest@users.sourceforge.net
      wrote:

      One last(?) thought about this is that, longer-term, it might be useful
      for PSRCHIVE to optionally be able to just write the data as floats and
      avoid this conversion entirely. PSRFITS supports this and PSRCHIVE can
      already read this flavor of file. Saving a factor of 2 in disk space is
      nice but not always necessary, depends on the context.


      Status: open
      Group: next release
      Created: Tue Feb 23, 2021 06:26 PM UTC by Paul Demorest
      Last Updated: Wed Feb 24, 2021 07:02 PM UTC
      Owner: nobody
      Attachments:

      I've noticed profile artifacts occasionally appearing in FITS files
      created by recent versions of PSRCHIVE. I believe this is due to recent
      changes introduced to the 16-bit integer data rescaling (ticket #440). One
      of the changes was to use the full 16-bit range rather than 15 bits as
      before. Data values seem to occasionally overflow the range now, I think
      likely due to floating-point precision issues in the computation.

      For example see the attached ASCII profile, and plots showing how it looks
      originally, and after conversion to PSRFITS using psrconv -o PSRFITS -e
      fits. The original profile minimum point has "wrapped around" and now
      takes a value near the profile max. This was done with PSRCHIVE 2021-02-13
      but more recent builds still show this.

      One solution would be to reduce the range slightly, ie in ProfileColumn.C
      change the following two lines:

      double the_min = 1-pow(2,15); double the_max = pow(2,15)-2;

      to (for example)

      double the_min = 1-pow(2,14.8); double the_max = pow(2,14.8)-2;

      I've tested this change and it does fix this specific example, however it
      might be worth thinking a bit more about what the proper amount of padding
      should be.


      Sent from sourceforge.net because you indicated interest in
      https://sourceforge.net/p/psrchive/bugs/446/

      To unsubscribe from further messages, please visit
      https://sourceforge.net/auth/subscriptions/

       

      Related

      Bugs: #446


Log in to post a comment.

MongoDB Logo MongoDB