
How is FCS3 data transformed for display?

Support
Anonymous
2021-07-14
2021-07-15
  • Anonymous

    Anonymous - 2021-07-14

    Dear Sven,
    I've acquired a set of 96 FCS3 files in batch mode (96 well plate) on a Beckman Coulter Cytoflex. Strangely, when I load these into FCSalyzer, one (and only one) of them displays the FSC-A parameter 'squashed-down' by about 3-fold. Looking closely, the FSC-A axis limits for this file only are much higher than the others (from zero to about 58000k, instead of from zero to about 16000k for all the others).
    Looking at some of the actual data (using the 'View File Data' menu), I see that the FSC-A 'Events as in File' have been transformed into the 'Events as displayed' by dividing by a consistent factor of 4096 for most of the files, but that the single squashed-down-FSC-A file has had the FSC-A 'Events as in File' transformed to 'Events as displayed' by dividing by a factor of about 14170. If I manually multiply the FSC-A values by 3 (using the 'Parameter multipliers' option in the 'Format data files' menu) I can 'rescue' the strangely displayed file, which then appears similar to all the others: the populations appear at similar x,y positions in a dotplot (although the axis limits are still different), and go 'through' the same gates.
    I hunted around a little bit in the metadata, but I couldn't immediately find any parameters that explain why the axis limits should be different, nor why the 'Events as displayed' should be calculated differently for this one file.
    Please could you explain a little bit how FCSalyzer 'decides' the axis limits, or how it sets the factor to transform data for display? Is there something in the metadata that I've missed?
    I'm attaching two example FCS3 files: '01-617-C11.fcs' is the 'funny' file that displays the FSC-A 'squashed-down', '01-617-C10.fcs' is representative of all the other 95 files. I've also attached a screenshot showing how they are displayed differently.
    Thanks very, very much if you're able to shed any light on this!
    Dominic

    Dominic van Essen
    Setup: FCSalyzer 0.9.22-alpha running with Java version 1.8.0_291, same results on several Mac computers running various versions of MacOS X.

    • Sven Mostböck

      Sven Mostböck - 2021-07-14

      Hi Dominic,

      thanks for providing the report and especially the example data files.
      It took me a while and I had to dig deep into old source code I had not looked at in ages, but I found the reason.

      In short: the bug with the FSC-A in your datafile happened because I tried to repair a problem with another parameter in FCS datafiles.

      In long: the range of each parameter is defined by the $PnR keyword of the datafile (n being the number of the parameter). In your datafile, the measured parameters, including $P2 (= FSC-A), have a range of 16,777,216. The FSC-Width parameter has a range of only 10,000, and the "Time" parameter has a range of 900,000,000. And here we have a problem: the range of a parameter does not necessarily match the values of its events. In your files, FSC-A and SSC-A both have values that are much higher than the official range of the parameter, up to 91 million. On the other hand, the time stamps don't even come close to their 900 million range, with the top value being 1.45 million.
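
      To make the keyword lookup concrete, here is a minimal sketch of reading $PnR values from an already-extracted TEXT segment. This is my own illustration, not FCSalyzer's parser; it ignores the FCS escaping rule for doubled delimiters, and the parameter numbers in the example string are made up:

      ```java
      import java.util.HashMap;
      import java.util.Map;
      import java.util.regex.Pattern;

      public class RangeReader {
          // Parse a delimiter-separated FCS TEXT segment into a keyword map.
          // (Real FCS files escape the delimiter by doubling it; skipped here.)
          static Map<String, String> parseTextSegment(String text) {
              String delim = String.valueOf(text.charAt(0)); // first char defines the delimiter
              String[] parts = text.substring(1).split(Pattern.quote(delim));
              Map<String, String> keywords = new HashMap<>();
              for (int i = 0; i + 1 < parts.length; i += 2) {
                  keywords.put(parts[i].trim(), parts[i + 1]);
              }
              return keywords;
          }

          // Declared range of parameter n, e.g. $P2R for FSC-A in these files.
          static long declaredRange(Map<String, String> kw, int n) {
              return Long.parseLong(kw.get("$P" + n + "R").trim());
          }

          public static void main(String[] args) {
              String text = "/$P2N/FSC-A/$P2R/16777216/$P7N/Time/$P7R/900000000/";
              Map<String, String> kw = parseTextSegment(text);
              System.out.println(declaredRange(kw, 2)); // 16777216
              System.out.println(declaredRange(kw, 7)); // 900000000
          }
      }
      ```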

      To account for that, there is a check that adjusts the range of the "time" or "event" parameters to better match the actual values. That is reasonably simple, as these numbers increase steadily across all events in the file, with the last event having the highest value. So the adjustment is easy.
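
      A sketch of that adjustment, assuming the events are held in a float[events][parameters] table (my names and layout, not the actual source):

      ```java
      class TimeRange {
          // For a monotonically increasing parameter like "Time", the last
          // event holds the maximum, so the declared range can simply be
          // replaced by the observed maximum.
          static double adjustedTimeRange(float[][] events, int timeIndex,
                                          double declaredRange) {
              if (events.length == 0) return declaredRange; // nothing to adjust
              // e.g. declared 900,000,000 but the last time stamp is only
              // ~1.45 million, so the axis would otherwise be almost empty
              return events[events.length - 1][timeIndex];
          }
      }
      ```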

      However, that routine also contained a check whether the last or first event of any parameter is higher than the proposed range. If so, the range is adjusted. Please note: it tested only the first and the last event, not any of the middle events. According to my comments in the code, this was done because I had stumbled across an example FCS file where the defined range did not match the event values.
      Unfortunately, for FSC-A in your datafile C11, the first event has a value of 54 million, so the range got increased to 54 million.
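
      Schematically, the problematic check looked something like this (hypothetical names, reconstructed from the description above, not the real FCSalyzer code):

      ```java
      class RangeCheck {
          // Only the first and last events of parameter p are inspected, so a
          // single large first event silently inflates the whole display range.
          static double checkedRange(float[][] events, int p, double declaredRange) {
              double range = declaredRange;
              double first = events[0][p];
              double last = events[events.length - 1][p];
              if (first > range) range = first; // C11: first FSC-A event ~54 million
              if (last > range) range = last;   // range jumps from ~16.8M to ~54M
              return range;                     // middle events are never consulted
          }
      }
      ```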

      My plan: I guess I will remove that check and use only the pre-defined ranges, except for the "time" and "event" parameters. But I have to take another look at my various example data files to see why exactly I added that check. It was probably a misunderstanding on my side. It will take me a little while to release an updated version.

      Regards,
      Sven


      Last edit: Sven Mostböck 2021-07-14
  • Anonymous

    Anonymous - 2021-07-14

    Sven -
    Thanks very much for the explanation. It makes me more relaxed about my dirty work-around of multiplying up the FSC-A values so that this sample can be analysed, at least until the next version of FCSalyzer!
    One other thing I noticed arising from this: after the dirty multiplying-up puts the displayed dots visually in the 'right' place, they now pass through the same gate as the other, un-multiplied samples, even though they now have FSC-A values that are completely different (or, put another way, without multiplying-up, the events that should pass through a gate don't do so). This suggests to me that the gating uses the 'as displayed' x-axis position values (which are affected by the range of the display axis), rather than the actual data values. Perhaps this is deliberate, but it's surprising and unintuitive to me...

    All the best,
    Dominic

    • Sven Mostböck

      Sven Mostböck - 2021-07-14

      Hi Dominic,

      first, attached please find a "special" version of 0.9.22 without the bug. I simply removed the quick check that I mentioned above. I hope that did not break anything else ;-)
      I thought it stupid that you should have to tweak around that one file in your analysis ...

      For displaying samples: the raw FCS data is taken, compensated, then transformed (log or logicle) and finally mapped from its original range to a range of 0-4095. The plots then show these values, reduced to the pixel size of the plot. Please note that this is only used for displaying the data in the plots, not for calculating statistics and such.
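
      As a rough sketch of that mapping (my own simplified code; compensation and the real logicle transform are omitted, and a plain log10 stands in for the transformation step):

      ```java
      class DisplayMapper {
          // Map a (compensated) value onto the fixed 0-4095 display scale.
          static int toDisplay(double value, double range, boolean logScale) {
              double fraction;
              if (logScale) {
                  double v = Math.max(value, 1.0);              // log of <=0 undefined
                  fraction = Math.log10(v) / Math.log10(range);
              } else {
                  fraction = value / range;                     // linear display
              }
              int d = (int) Math.round(fraction * 4095.0);
              return Math.min(Math.max(d, 0), 4095);            // clamp out-of-range events
          }
      }
      ```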

      Regions, markers and quadrants are all placed onto the values for display, i.e. values from 0-4095 for each parameter. Thus you are absolutely right: if you mix datafiles that have different ranges, the regions do not match the event values but the transformed data for display. Also, very important: if you use different data transformations for the same parameter in different datafiles, the regions would likewise filter on the displayed, transformed data, and therefore select different events in the two files! Please see the tutorial, where this is shown early in the video: https://youtu.be/RWIh2mgQCcM

      There are multiple reasons why I did it like that. Other flow software has found solutions for these, but I reached my limits as a hobby programmer:
      1) FCS datafiles can have very large ranges. However, for regions I use the Java-internal "Shape" object, as it has an "is this dot inside the shape" function that I use for filtering events (see the sketch below). Writing such a function myself sounds very complicated, especially once the user draws weird shapes with crossing lines and what-not. But the "Shape" object uses the Integer data type, which can't hold ranges as large as some FACS machines generate. So I have to map the data to a reduced range.
      2) The log or logicle transformation means that the region lines map onto the actual FCS data not as straight/linear lines but as log/logicle curves. I have no clue how to take a shape that is entered on the displayed data and apply it to the raw data properly.
      3) Event data can lie above the display range (like your FSC-A). For display, these values above the range are mapped onto the highest value, 4095. The calculations, however, use the original FCS data. So for the regions, I would use the original raw data, but would then have to map the values that are higher than the range to the highest value.

      (2) is my biggest problem; (1) would require reprogramming an internal Java object (never fun), including a "find the inside" algorithm that I probably don't understand; and (3) is solvable but cumbersome.
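
      For illustration, here is a tiny sketch of display-space gating with java.awt.Polygon, which implements "Shape" and stores int coordinates (hence the Integer limitation in (1)). The gate vertices are made up; the point is only that containment is tested against the 0-4095 display values, not the raw FCS values:

      ```java
      import java.awt.Polygon;

      public class GateDemo {
          public static void main(String[] args) {
              // A region drawn in display space, i.e. on the 0-4095 mapped values
              Polygon gate = new Polygon(
                      new int[] {500, 3000, 2500},  // x vertices (display units)
                      new int[] {500, 600, 3500},   // y vertices (display units)
                      3);
              // Events are gated on their *display* coordinates, not raw values
              System.out.println(gate.contains(1800, 1500)); // true: inside the gate
              System.out.println(gate.contains(4000, 4000)); // false: outside
          }
      }
      ```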

      So I decided: "no analysis should mix and match data files of different ranges and apply different transformations to the same parameter". As long as all data files are of the same range and use the same data transformation, all is well :-)
      You only noticed because of the bug ;-)

      Regards,
      Sven

  • Anonymous

    Anonymous - 2021-07-15

    Thanks a lot again for the explanation. All of that makes sense, and I kind-of assumed that reasons like those would be the case (and that it was only the 'surprise' of data files with different ranges that caused the effect to appear). It seems pretty logical to me to use 'display' data to work with gates drawn on the displays, especially if the ranges shouldn't be different.
    And super-thanks for the 'special' 0.9.22! This goes well beyond the reasonable expectations from a self-described 'hobby programmer'! Thank you!

    Cheers,
    Dominic

