Session splitting results in different glms - why?

Help
qwer1304
2013-03-04
2013-03-08
  • qwer1304

    qwer1304 - 2013-03-04

    Hi,

    I'm having trouble understanding why splitting a session into sub-sessions results in different betas after executing GLM.

    Here's what I do:
    1. I have an SCR recording of 180 DISTINCT conditions (each condition appears EXACTLY once). The duration of each condition is either 10 or 5 seconds.
    2. GLM:
    a. I run GLM on a SINGLE file holding the whole session (1x180) w/o normalization.
    b. I split the recording into two files of 90 conditions each (2x90) and run GLM w/o normalization on EACH file SEPARATELY (not in multi-session mode).
    3. I compare the betas from 2.a with concatenated betas from 2.b. There's some resemblance, but they're definitely different. The question is - WHY? (IMHO, they should be identical, except for perhaps the stitching point between the two sub-sessions of 2.b, but they're not).

    I can provide all the files (input & output) and/or output plots demonstrating the mismatch, if needed.

    Any help would be highly appreciated.

    Thx,
    David

     
  • Dominik Bach

    Dominik Bach - 2013-03-04

    Hi David

    The data will be different, so the parameter estimates are also likely to be different.

    This is because the data are filtered differently in the two alternatives that you tried out, and the estimates are surely imprecise anyway.

    But if the difference appears too dramatic to you, I would be happy to have a look if you send me the data, and onset files - just to make sure it's not a bug in the code.

    Best
    Dom

     
  • qwer1304

    qwer1304 - 2013-03-04

    Hi,

    I sent you the files - thanx for agreeing to take a look.

    I'm not sure I understand why you think that the estimates would be different if the data is the SAME, just split differently.

    This could happen at the stitching point (and even then the effect should be very limited in time), but I already see a difference at the beginning of the 1st segment, where the data are identical to the non-split recording.

    See the attached figure that shows the difference between the two runs and the two plots in the same figure.

    Thx,
    David

     

    Last edit: qwer1304 2013-03-04
  • qwer1304

    qwer1304 - 2013-03-05

    Hi,

    More info:
    1. The imported data is IDENTICAL in the two cases (1x180 and 2x90).
    2. If I concatenate the imported data and run GLM, the result is IDENTICAL to 1x180.

    Conclusion:
    It is the fact that the data is split, EVEN THOUGH IT'S IDENTICAL TO THE NON-SPLIT CASE, that causes a DIFFERENT result at the GLM stage.

    Note that the conditions in the data are unique (occur ONLY once).

    Hope this helps to pinpoint the problem,
    Thx,
    David

     
  • qwer1304

    qwer1304 - 2013-03-07

    Hi,

    I think I understand the root cause of the phenomenon (should have been obvious :( ).

    GLM is an offline batch analysis tool and therefore is, by definition, non-causal in the sense that temporally future values affect temporally past values.

    When a sequence is split, the initial conditions of the 2nd segment change and therefore results of GLM application to the 2nd segment change; this was expected.

    However, since GLM analysis looks at all available data, the absence of the 2nd segment affects the analysis of the 1st segment (this was unexpected :( ).
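
    A minimal sketch of this coupling, on synthetic data only. The kernel, onsets, and noise level are assumptions made up for illustration; this is not the SCRF or the toolbox's GLM code. Each event gets its own regressor, and the betas of the first half are estimated once from the full recording and once from the truncated one:

```python
import numpy as np

rng = np.random.default_rng(0)

def response(n, tau=20.0):
    # Hypothetical slowly decaying response kernel (NOT the actual SCRF).
    t = np.arange(n)
    return (t / tau) * np.exp(1.0 - t / tau)

def design(onsets, n_samples, kernel):
    # One column per event: a stick at the onset convolved with the kernel.
    X = np.zeros((n_samples, len(onsets)))
    for j, onset in enumerate(onsets):
        stick = np.zeros(n_samples)
        stick[onset] = 1.0
        X[:, j] = np.convolve(stick, kernel)[:n_samples]
    return X

n = 400
onsets = np.arange(10, n, 50)        # 8 events, each occurring exactly once
kernel = response(150)               # tails overlap the following events
X_full = design(onsets, n, kernel)
true_beta = rng.normal(1.0, 0.3, size=len(onsets))
y_clean = X_full @ true_beta
y = y_clean + rng.normal(0.0, 0.2, size=n)

half = n // 2
first = onsets < half                # events that fall in the 1st segment
X_half = design(onsets[first], half, kernel)

def fit(X, data):
    # Ordinary least squares.
    return np.linalg.lstsq(X, data, rcond=None)[0]

# Noisy data: the 1st-segment betas depend on whether the 2nd segment is present.
diff = fit(X_full, y)[first] - fit(X_half, y[:half])

# Noise-free data and a correct model: both fits recover the true betas.
beta_half_clean = fit(X_half, y_clean[:half])
beta_full_clean = fit(X_full, y_clean)

print("max |beta_full - beta_split| on 1st segment:", np.max(np.abs(diff)))
```

    In this sketch the two fits agree when the data are noise-free and the model is exact, and differ once noise is added, because the response tails of the 1st-segment events extend into the 2nd segment and those samples contribute to the full-recording estimate.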

    It could be interesting to quantify this effect.

    Please confirm the above analysis.

    Thx,
    David

     
  • Dominik Bach

    Dominik Bach - 2013-03-07

    Hi David

    I haven't been able to look into this in detail; it's on my list.

    What could affect the estimation in the first session is the bidirectional filtering (which you could turn off) such that removing the second session affects the data of the first session, and the long tail of the SCRF that reaches into the second session, such that removing the second session changes parameter estimation.
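
    The filtering part of this can be sketched as follows, assuming a simple first-order low-pass applied forward and then backward. The filter, its coefficient, and the synthetic trace are illustrative assumptions, not the toolbox's actual filter settings:

```python
import numpy as np

def one_pole_lowpass(x, alpha):
    # Causal first-order low-pass: y[i] = alpha * x[i] + (1 - alpha) * y[i-1],
    # with the state initialised to the first sample.
    y = np.empty(len(x))
    acc = x[0]
    for i, v in enumerate(x):
        acc = alpha * v + (1.0 - alpha) * acc
        y[i] = acc
    return y

def bidirectional(x, alpha):
    # Forward pass, then the same filter on the time-reversed output.
    forward = one_pole_lowpass(x, alpha)
    return one_pole_lowpass(forward[::-1], alpha)[::-1]

rng = np.random.default_rng(1)
signal = np.cumsum(rng.normal(size=1000))   # synthetic drifting trace
half = len(signal) // 2
alpha = 0.05

# One-way (causal) filtering: the first half does not depend on the second.
causal_gap = np.max(np.abs(one_pole_lowpass(signal, alpha)[:half]
                           - one_pole_lowpass(signal[:half], alpha)))

# Two-way filtering: the backward pass carries later samples into the first half.
two_way_gap = np.abs(bidirectional(signal, alpha)[:half]
                     - bidirectional(signal[:half], alpha))

print("causal mismatch:", causal_gap)
print("two-way mismatch at start / at cut:", two_way_gap[0], two_way_gap[-1])
```

    For this particular filter the mismatch is largest at the cut and decays geometrically towards the start of the first segment; how far it reaches in practice depends on the filter actually used.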

    Best
    Dom

     
  • qwer1304

    qwer1304 - 2013-03-07

    Hi,

    I tried playing w/ the 2-way filtering, but it didn't have a significant effect. It is the act of splitting that creates this change at GLM step.

    The long tail of SCRF indeed affects the 2nd segment; still, I was puzzled that the 1st segment was affected as well.

    I think (as I wrote above) that this is due to GLM being an offline batch non-causal analysis tool, so removal of any data affects all results (not just temporally future).

    Cheers,
    David

     
  • Dominik Bach

    Dominik Bach - 2013-03-08

    Hi David

    I looked at your data. On my machine, GLM did not produce any usable parameter estimates at all. Recall that the design matrix is not of full rank; in fact the linear dependencies are rather pronounced for this design matrix. It seems that the numerical procedure to compute the pseudo-inverse does not come up with a meaningful solution. This means that the differences between using one session and splitting the session cannot really be interpreted at all.
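
    Why betas from a rank-deficient design are hard to interpret can be illustrated with a small sketch. The design matrix below is a made-up toy example with one exactly dependent column, not the design from this thread, and the `rcond` cutoff is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical design: 3 independent regressors plus a 4th that is an exact
# linear combination of the first two, so the matrix has rank 3, not 4.
A = rng.normal(size=(50, 3))
X = np.column_stack([A, A[:, 0] + A[:, 1]])
y = X @ np.array([1.0, 2.0, 3.0, 0.0]) + rng.normal(0.0, 0.1, size=50)

rank = np.linalg.matrix_rank(X)

# Pseudo-inverse with an explicit singular-value cutoff; this picks the
# minimum-norm solution among all least-squares solutions.
beta_pinv = np.linalg.pinv(X, rcond=1e-10) @ y

# Moving along the null space changes the betas but not the fitted values.
null_dir = np.array([1.0, 1.0, 0.0, -1.0])   # X @ null_dir is zero
beta_alt = beta_pinv + 5.0 * null_dir

same_fit = np.allclose(X @ beta_pinv, X @ beta_alt)

print("rank:", rank, "of", X.shape[1], "columns")
print("same fitted values:", same_fit)
print("beta (pinv):", beta_pinv)
print("beta (alt): ", beta_alt)
```

    Both sets of betas fit the data equally well, so the individual values carry no unique meaning. Which one a numerical routine returns depends on its tolerance and on small changes in the data, which would also include a change in how the recording is split.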

    I should say that this procedure was not programmed to cope with such cases; I'd use DCM for this.

    Hope this helps
    Dom

     
  • qwer1304

    qwer1304 - 2013-03-08

    Hi,

    Thx for taking a look.

    I noticed that too. However, SCRF + 2 derivatives (time and dispersion) produce (for me) nice within-segment results, which remain nice irrespective of how the top-level segment is split, as long as the analysis is performed on a single segment.

    David

     
