
Interpret two-step sequential PCA

2021-01-11
  • Casper Kerren

    Casper Kerren - 2021-01-11

    Dear Dr. Dien,

    I have followed the tutorial for the two-step sequential PCA. I have two conditions, correct and incorrect. In a first step using the temporal PCA, I retain 20 factors. I then compute the spatial PCA of the temporal factors and I retain 2 factors. Looking at the temporal factors first, only two are significant. Looking at the spatiotemporal, I find a few factors that are significant. However, some of them are really on the edge (e.g., p-value of ~.045). I have a few questions, but I start with an example.

    The first temporal factor is significant (p-value of .0002). However, only one of the spatial factors of this temporal factor is significant. Is it correct that I can only say something conclusive about that significant spatiotemporal factor, or can I actually say something conclusive about the non-significant spatiotemporal factor too, as it comes from a significant temporal factor? Similarly, if the temporal factor is not significantly different between correct and incorrect, but the spatiotemporal factor is, can I say something about the spatiotemporal factor then?

    Second, since I have so many factors, what do I do about multiple testing? Is it OK to regard them as separate, and therefore not correct for multiple comparisons, and regard .05 as significant?

    Third, as far as I understand, the sign of a factor is arbitrary, and when correlating it with another vector you can take the absolute value of the factor. Is this correct also in the case of a spatiotemporal factor? That is, if I correlate the factor with a behavioural index, can I do corr(abs(factor), behavioural)? If not, why?

    Thank you!

    Casper

     
  • Joe Dien

    Joe Dien - 2021-01-11

    Complicated questions! The initial temporal PCA is separating EEG activity based on timecourse, so anything that has the same timecourse will end up in the same temporal factor, even if they are entirely different processes. When you say that the first temporal factor is significant, what you really mean is that it was significant at the channel with the largest amplitude, which is what the EP Toolkit automatically finds for you. That doesn't mean that the activity captured by the first temporal PCA was significant at every channel. The second, spatial, step split up the temporal factor on the basis of scalp topographies. That showed you that in fact there were likely two different things in that temporal window and only one of them was significant, and that one was likely centered on that channel used by the EP Toolkit to measure the factor. So what you found is what you found. And yes, the reverse could happen as well. It is possible that the maximum amplitude channel is missing a significant effect at some other channel(s) reflecting a different EEG process.
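    The data flow of the two-step procedure described above can be sketched in a few lines. This is a minimal, hypothetical NumPy illustration using unrotated PCA via SVD with made-up array sizes; the EP Toolkit itself applies rotations (e.g., Promax) and other options, so this only shows how time points serve as variables in step one and channels as variables in step two:

    ```python
    import numpy as np

    # Hypothetical dataset: 50 observations x 64 channels x 200 time points.
    rng = np.random.default_rng(0)
    data = rng.standard_normal((50, 64, 200))

    # Step 1: temporal PCA. Variables are the time points; observations are
    # every combination of subject/condition and channel.
    X_t = data.reshape(-1, 200)             # (50*64, 200)
    X_t = X_t - X_t.mean(axis=0)            # center each time point
    _, _, Vt = np.linalg.svd(X_t, full_matrices=False)
    n_temporal = 20
    loadings_t = Vt[:n_temporal].T          # (200, 20) temporal loadings
    scores_t = X_t @ loadings_t             # (50*64, 20) temporal factor scores

    # Step 2: spatial PCA on one temporal factor's scores. Variables are
    # now the channels; observations are the subjects/conditions.
    factor = 0
    S = scores_t[:, factor].reshape(50, 64)  # observations x channels
    S = S - S.mean(axis=0)
    _, _, Vs = np.linalg.svd(S, full_matrices=False)
    n_spatial = 2
    loadings_s = Vs[:n_spatial].T           # (64, 2) spatial loadings
    scores_st = S @ loadings_s              # (50, 2) spatiotemporal scores
    ```

    Each column of `scores_st` corresponds to one spatiotemporal factor, which is why two scalp topographies sharing a single timecourse can be teased apart in the second step.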

    Multiple comparisons are indeed a problem for PCA. My recommended approach is to first identify your ERP components of a priori interest (e.g., P300, which is well known to have a maximum at Pz and a latency that has a predictable relationship to task reaction times). When you find a factor that corresponds to that ERP component, you don't need to do MCP. But you have to be able to make a very solid argument for this decision and be able to document on what basis you have made this judgment. Everything else is subject to Type I error rate inflation. Use the Bonferroni correction control in the EP Toolkit for them, if you are using it to run the ANOVAs (or equivalent if not). So if you have 20*2=40 factors and two of them can be identified as being of a priori interest, then use MCP on the rest. So if using Bonferroni, the alpha would be .05/38=.0013. Of course, that is pretty stringent. Most of the factors are tiny junk factors that don't even look like ERPs and likely reflect residual artifact. So what the EP Toolkit does during the autoPCA is automatically ignore any factors that account for less than half a percent of the total variance. You simply don't even look at them. So perhaps of those 40, only 23 are large enough to be of potential interest. Minus the two of a priori interest. Now your alpha is .05/21=.0024. So still stringent, but better. But no, those .045 p-values are not going to hack it, and they generally shouldn't. If you found something interesting, you could still flag it as being worth a replication attempt. I probably wouldn't mention it in the original write-up, though you could add it as a note in proof in the successful replication.
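    The Bonferroni arithmetic above works out like this (the factor counts are the ones from this example; your own numbers will differ):

    ```python
    # Bonferroni-adjusted alphas for the two scenarios described above.
    n_factors = 20 * 2          # temporal factors x spatial factors
    a_priori = 2                # factors identified in advance, exempt from MCP

    # Correcting across all remaining factors:
    alpha_all = 0.05 / (n_factors - a_priori)
    print(round(alpha_all, 4))        # 0.0013

    # After autoPCA drops factors under 0.5% of total variance,
    # suppose only 23 remain large enough to consider:
    retained = 23
    alpha_retained = 0.05 / (retained - a_priori)
    print(round(alpha_retained, 4))   # 0.0024
    ```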

    The polarity of factor loadings, which is what most people look at if they’re not using my EP Toolkit, is arbitrary. The polarity of the reconstituted factor (which is the product of the factor loadings and the factor scores plus some scaling from the SDs) is not arbitrary. The reconstituted factors are literally the partitioning of the original data and added together should reproduce it exactly, minus any variance in the factors that were dropped. So to answer your question, you need to be more specific. A factor doesn't have an inherent polarity. But if you multiply the factor loadings by the factor scores to regenerate the data that it accounts for, then it does have a polarity at a specific channel at a specific time point. So what exactly are you correlating here? I'm not clear. I certainly wouldn't take the absolute of the factor scores, if that is what you are doing. That would obliterate the difference between the two extremes of the scores, like saying there is no difference between low and high anxious people. The polarity of the pure factor score is abstracted in the absence of the factor loading information (a large positive value for a temporal PCA factor score denotes both a very positive value at time points with a positive loading and a very negative value at time points with a negative loading), but the relative difference between high and low is meaningful and you would destroy that meaningfulness by taking the absolute value. I'd only consider doing something like that if I was hypothesizing a non-linear relationship, like a U-shaped curve.
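    The two points above (the arbitrary sign cancels out in the reconstituted factor, and taking the absolute value of the scores destroys the high/low ordering) can be demonstrated with a toy example. This is a hypothetical sketch with random numbers, not the EP Toolkit's actual computation, and it omits the SD scaling mentioned above:

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    # Toy temporal-PCA decomposition for a single factor:
    # loadings over 200 time points, scores for 30 observations.
    loadings = rng.standard_normal((200, 1))
    scores = rng.standard_normal((30, 1))

    # Reconstituted factor: the product of scores and loadings
    # partitions the original data.
    recon = scores @ loadings.T                  # (30, 200)

    # Flipping the sign of BOTH the loadings and the scores (the
    # arbitrary part) leaves the reconstituted waveform unchanged:
    recon_flipped = (-scores) @ (-loadings).T
    assert np.allclose(recon, recon_flipped)

    # Taking the absolute value of the scores, by contrast, collapses
    # the two extremes together and changes any correlation with a
    # behavioural measure (made-up vector here):
    behaviour = rng.standard_normal(30)
    r_signed = np.corrcoef(scores[:, 0], behaviour)[0, 1]
    r_abs = np.corrcoef(np.abs(scores[:, 0]), behaviour)[0, 1]
    ```

    `r_signed` and `r_abs` will generally differ, which is the concrete sense in which abs() obliterates the low-versus-high information in the scores.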

    Also, remember that the polarity of even the original ERP data is arbitrary. As I’m sure you know, all electrical fields are actually bipolar, so they have both a negative and a positive pole. When we call something like an N400 a negativity, what we really mean is that it is a negativity at the channels that we are analyzing, either because they are more centrally located or because of tradition or whatever.

    You may find my 2012 paper helpful if you haven't read it yet:

    Dien, J. (2012). Applying Principal Components Analysis to Event Related Potentials: A Tutorial. Developmental Neuropsychology, 37(6), 497–517.

     
