I am trying to use BETR on my timecourse data (Affymetrix arrays, normalized with RMA, no missing values, no negative values). I have 10 samples: 5 timepoints, 2 replicates each.
I use the following parameters for BETR: 1 condition, 5 timepoints, alpha = 0.05. My problem is that during the analysis (shrinkage loop, iteration #2), I get the following error: "The data you are using contains invalid data. To remove the invalid genes and re-run the analysis, click 'Continue'".
So I click 'Continue', and a few seconds later I get another error message: "Not enough valid genes", and the analysis stops there.
Do you have any clue why it does not work?
It sounds like your data might contain a significant number of NAs. You might try filtering your data first and running it again, or imputing the missing values.
Thanks for answering. As I said, my data has no missing values and no "NA"s.
I checked this twice:
- in R, just after the RMA normalisation: sum(is.na(exp_rma)) returns 0
- in the shell: grep NA exp_rma.txt returns nothing
Oops, I didn't see you mention that in your first post.
Another possibility is that you have a lot of "flat" genes: genes for which all the expression values are equal (and which therefore have variance = 0). If this is the case, you can try using the variance filter to ensure that flat genes are removed.
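For reference, the same flat-gene filter could be sketched outside MeV in R (a rough sketch, with `exp_rma` standing in for your RMA-normalised genes-by-samples matrix, as in your earlier check):

```r
# Sketch: drop "flat" genes (zero variance across samples) before the analysis.
# A tiny toy matrix stands in for the real expression data.
exp_rma <- matrix(c(1, 1, 1, 1,    # flat gene: all values equal, variance = 0
                    2, 3, 4, 5),   # varying gene
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("flat_gene", "var_gene"), NULL))

gene_sd <- apply(exp_rma, 1, sd)               # per-gene standard deviation
exp_filtered <- exp_rma[gene_sd > 0, , drop = FALSE]

nrow(exp_filtered)   # only "var_gene" survives; the flat gene is removed
```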
This worked, thanks. Well, at least it removed the error.
I applied a variance filter to keep the 50% of genes with the highest SD, and ran BETR again. But the result is strange: _all_ the genes are called significant, with a significance value of 0.0. That would mean that at least 50% of the genes are not "flat", so why did I need to run the variance filter in the first place? I'm confused.
I also have another concern: is BETR supposed to perform well with only 2 replicates per time point? Are there more suitable / robust methods for this kind of data?
I did another test, keeping the 90% of genes with the highest SD. BETR worked well, and called 30035 of the 40590 genes significant.
I really don't understand why it didn't work the first time, when I didn't use the variance filter…
As an experiment, I used the variance filter but kept 100% of the genes. Then I launched the BETR algorithm, and it worked, whereas it had failed with the original data (same genes).
This raises another question: does the variance filter apply any transformation to the data? I wouldn't think so, but I'm new to MeV and I'm really confused. Did I miss something?
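One way to check this yourself would be to export the expression matrix from MeV before and after filtering and compare the surviving rows in R. A minimal sketch of the comparison (the two in-memory matrices here are hypothetical stand-ins for the exported files):

```r
# Hypothetical sanity check: verify that a row filter only drops genes and
# does not transform the values of the genes it keeps.
before <- matrix(1:12, nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), paste0("s", 1:4)))
after  <- before[c("g1", "g3"), ]   # pretend the filter dropped gene g2

# Compare the rows present in both matrices:
common <- intersect(rownames(before), rownames(after))
identical(before[common, ], after[common, ])   # TRUE: values untouched
```

If the same comparison on your real exported matrices returns TRUE, the filter only removed rows and left the remaining values unchanged.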