Hi,
I'm having trouble understanding why splitting a session into sub-sessions results in different betas after executing GLM.
Here's what I do:
1. I have an SCR recording of 180 DISTINCT conditions (each condition appears EXACTLY once). Each condition lasts either 5 or 10 seconds.
2. GLM:
a. I run GLM on a SINGLE file holding the whole session (1x180) w/o normalization.
b. I split the recording into two files of 90 conditions each (2x90) and run GLM w/o normalization on EACH file SEPARATELY (not in multi-session mode).
3. I compare the betas from 2.a with concatenated betas from 2.b. There's some resemblance, but they're definitely different. The question is - WHY? (IMHO, they should be identical, except for perhaps the stitching point between the two sub-sessions of 2.b, but they're not).
I can provide all the files (input & output) and/or output plots demonstrating the mismatch, if needed.
Any help would be highly appreciated.
Thx,
David
Hi David
The data will be different, so the parameter estimates are also likely to be different.
This is because the data are filtered differently in the two alternatives that you tried out, and the estimates are surely rather imprecise anyway.
But if the difference appears too dramatic to you, I would be happy to have a look if you send me the data, and onset files - just to make sure it's not a bug in the code.
Best
Dom
Hi,
I sent you the files - thanx for agreeing to take a look.
I'm not sure I understand why you think that the estimates would be different if the data is the SAME, just split differently.
This could happen at the stitching point (and still the effect should be very temporally limited), but I already see a difference at the beginning of the 1st segment which is identical to the non-split segment.
See the attached figure that shows the difference between the two runs and the two plots in the same figure.
Thx,
David
Last edit: qwer1304 2013-03-04
Hi,
More info:
1. The imported data is IDENTICAL in the two cases (1x180 and 2x90).
2. If I concatenate the imported data and run GLM, the result is IDENTICAL to 1x180.
Conclusion:
It is the fact that the data is split, EVEN THOUGH IT'S IDENTICAL TO THE NON-SPLIT CASE, that causes a DIFFERENT result at the GLM stage.
Note that the conditions in the data are unique (occur ONLY once).
Hope this helps to pinpoint the problem,
Thx,
David
Hi,
I think I understand the root cause of the phenomenon (should have been obvious :( ).
GLM is an offline batch analysis tool and therefore is, by definition, non-causal in the sense that temporally future values affect temporally past values.
When a sequence is split, the initial conditions of the 2nd segment change and therefore results of GLM application to the 2nd segment change; this was expected.
However, since GLM analysis looks at all available data, the absence of the 2nd segment affects the analysis of the 1st segment (this was unexpected :( ).
It could be interesting to quantify this effect.
Please confirm the above analysis.
Thx,
David
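One way to quantify the effect described above is a small simulation. This is only a sketch in Python/NumPy, not the toolbox's code: the response function, the onsets, the sampling rate and the noise level are all made-up assumptions, and the fit is plain least squares with no filtering. It fits the same toy recording once as a whole and once cut in half, then compares the betas of the first-half conditions.

```python
import numpy as np

def response(t):
    """Toy response with a long tail (an assumed stand-in, not the real SCRF)."""
    tt = np.clip(t, 0.0, None)          # zero before the onset
    return tt * np.exp(-tt / 4.0)

def design(onsets, n_samples, sr):
    """One column per condition: the response shifted to that condition's onset."""
    t = np.arange(n_samples) / sr
    return np.column_stack([response(t - o) for o in onsets])

def split_effect(noise_sd, n_cond=20, sr=2.0, seed=0):
    """Return |beta_full - beta_split| for the conditions in the first half."""
    rng = np.random.default_rng(seed)
    onsets = 7.5 * np.arange(n_cond)    # seconds; each condition occurs once
    n = int((onsets[-1] + 30.0) * sr)
    X = design(onsets, n, sr)
    y = X @ rng.uniform(0.5, 1.5, n_cond) + noise_sd * rng.standard_normal(n)

    # Fit on the whole recording
    beta_full = np.linalg.lstsq(X, y, rcond=None)[0]

    # Fit on the first half only, cut at the onset of condition `half`
    half = n_cond // 2
    cut = int(onsets[half] * sr)
    X1 = design(onsets[:half], cut, sr)
    beta_first = np.linalg.lstsq(X1, y[:cut], rcond=None)[0]
    return np.abs(beta_full[:half] - beta_first)

print("noise-free:", split_effect(0.0).round(6))
print("with noise:", split_effect(0.05).round(6))
```

In this toy setup the design matrix has full column rank, so without noise both fits recover the same betas. With noise they differ, because the whole-recording fit also uses samples after the cut, where the tails of the late first-half responses overlap with the second-half regressors.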
Hi David
I haven't been able to look into this in detail; it's on my list.
Two things could affect the estimation in the first session. One is the bidirectional filtering (which you could turn off): removing the second session changes the filtered data of the first session. The other is the long tail of the SCRF, which reaches into the second session, so that removing the second session changes parameter estimation.
Best
Dom
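The filtering point can be illustrated with a toy example. This is a sketch only: it uses a hand-written first-order low-pass in Python/NumPy as an assumed stand-in for the actual filter, and a random-walk trace instead of real SCR data. A causal (one-way) filter gives the same first-half output whether or not the second half is present, while a forward-backward filter does not, because its backward pass starts from the end of whatever data it is given.

```python
import numpy as np

def one_pole(x, alpha):
    """Causal first-order low-pass: y[i] = alpha*x[i] + (1-alpha)*y[i-1]."""
    y = np.empty(len(x))
    acc = x[0]                          # start the filter state at the first sample
    for i, v in enumerate(x):
        acc = alpha * v + (1.0 - alpha) * acc
        y[i] = acc
    return y

def forward_backward(x, alpha=0.2):
    """Bidirectional filtering: filter forward, then filter the result backward."""
    return one_pole(one_pole(x, alpha)[::-1], alpha)[::-1]

rng = np.random.default_rng(1)
signal = np.cumsum(rng.standard_normal(400))    # toy slowly drifting trace
cut = 200                                       # split point (samples)

# Difference in the first half between "filter everything" and "filter first half only"
causal_diff = np.abs(one_pole(signal, 0.2)[:cut] - one_pole(signal[:cut], 0.2))
twoway_diff = np.abs(forward_backward(signal)[:cut] - forward_backward(signal[:cut]))

print("causal, max difference:     ", causal_diff.max())
print("two-way, last 5 samples:    ", twoway_diff[-5:])
print("two-way, first 100 samples: ", twoway_diff[:100].max())
```

With this filter the two-way difference is confined to the samples just before the cut and dies out quickly toward the start of the segment; how far it reaches with a real filter depends on the filter's order and cutoff.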
Hi,
I tried playing w/ the 2-way filtering, but it didn't have a significant effect. It is the act of splitting that creates this change at the GLM step.
The long tail of SCRF indeed affects the 2nd segment; still, I was puzzled that the 1st segment was affected as well.
I think (as I wrote above) that this is due to GLM being an offline batch non-causal analysis tool, so removal of any data affects all results (not just temporally future).
Cheers,
David
Hi David
I looked at your data. On my machine, GLM did not produce any usable parameter estimates at all. Recall that the design matrix is not of full rank; in fact the linear dependencies are rather pronounced for this design matrix. It seems that the numerical procedure to compute the pseudo-inverse does not come up with a meaningful solution. This means that the differences between using one session and splitting the session cannot really be interpreted at all.
I should say that this procedure was not programmed to cope with such cases; I'd use DCM for this.
Hope this helps
Dom
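To see why a rank-deficient design matrix makes individual betas uninterpretable, here is a generic sketch in Python/NumPy. The matrix is random and made up for illustration, not the design matrix from the files discussed here. It checks rank and condition number, then shows that two different beta vectors produce exactly the same fitted values.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((50, 3))
# Fourth column is an exact linear combination of the first two: rank deficiency
X = np.column_stack([A, A[:, 0] + A[:, 1]])
y = rng.standard_normal(50)

rank = np.linalg.matrix_rank(X)     # number of linearly independent columns
cond = np.linalg.cond(X)            # very large when columns are (nearly) dependent

beta_pinv = np.linalg.pinv(X) @ y               # minimum-norm least-squares solution
null_dir = np.array([1.0, 1.0, 0.0, -1.0])      # X @ null_dir == 0 by construction
beta_alt = beta_pinv + 3.0 * null_dir           # a different beta vector ...

# ... that gives the same fitted values, so the data cannot tell them apart
same_fit = np.allclose(X @ beta_pinv, X @ beta_alt)

print("columns:", X.shape[1], "rank:", rank)
print("condition number:", cond)
print("same fit with different betas:", same_fit)
```

The pseudo-inverse picks one solution out of infinitely many (the one with the smallest norm), and which one that is depends on the whole matrix. Changing the matrix, for example by splitting the session, can therefore move the individual betas even when the fit to the data is unchanged.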
Hi,
Thx for taking a look.
I noticed that too. However, SCRF + 2 derivatives (time and dispersion) produce (for me) nice within-segment results, which remain nice irrespective of how the top-level segment is split, as long as the analysis is performed on a single segment.
David