BETR : invalid data

Help
thepepper
2010-05-03
2013-05-20
  • thepepper
    2010-05-03

    Hello,

    I am trying to use BETR on my time-course data (Affymetrix arrays, normalized with RMA, no missing values, no negative values). I have 10 samples: 5 timepoints, 2 replicates each.
    I use the following parameters for BETR: 1 condition, 5 timepoints, alpha = 0.05. During the analysis (shrinkage loop, iteration #2), I get the following error: "The data you are using contains invalid data. To remove the invalid genes and re-run the analysis, click 'Continue'".
    So I click 'Continue', and a few seconds later I get another error message, "Not enough valid genes", and the analysis stops there.

    Do you have any clue why it does not work?

     
  • Hello,

    It sounds like your data might contain a significant number of NAs.  You might try filtering your data first and running it again, or imputing the missing values.

     
  • thepepper
    2010-05-03

    Thanks for answering. As I said, my data has no missing values, no "NA".
    I checked this twice:
    - in R, just after the RMA normalisation: sum(is.na(exp_rma)) returns 0
    - in the shell: grep NA exp_rma.txt returns nothing
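A further check along the same lines, as a minimal sketch: besides a literal "NA", a cell can also be empty, non-numeric, or infinite. The code below counts such cells. The tab-delimited layout (header row, gene IDs in the first column), the function name `count_invalid_cells`, and the example values are all assumptions made for illustration, not taken from the actual file or from MeV.

```python
import csv
import io
import math


def count_invalid_cells(handle, delimiter="\t"):
    """Count cells that are empty, non-numeric, NaN, or infinite.

    Assumes the first row is a header and the first column holds gene IDs.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    next(reader, None)  # skip the header row
    invalid = 0
    for row in reader:
        for cell in row[1:]:  # skip the gene ID column
            try:
                value = float(cell)
            except ValueError:
                # Empty strings and text such as "NA" end up here
                invalid += 1
                continue
            if not math.isfinite(value):
                # NaN and +/-Inf parse as floats but are not usable values
                invalid += 1
    return invalid


# Made-up three-gene example, not real data
example = io.StringIO(
    "gene\tt1_a\tt1_b\n"
    "g1\t7.2\t7.4\n"
    "g2\tInf\t6.1\n"
    "g3\t\t5.0\n"
)
invalid_cells = count_invalid_cells(example)
```

For the real matrix, the same function could be called on `open("exp_rma.txt")`, assuming that file follows the layout described above.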

     
  • Oops, I didn't see you mention that in your first post.

    Another possibility is that you have a lot of "flat" genes: genes whose expression values are identical across all samples (and therefore have variance = 0). If so, you can try using the variance filter to remove them.
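To see how many genes are flat before filtering, something like the following sketch could be used. It assumes the expression matrix has already been loaded as a dictionary mapping gene IDs to lists of values; the function name `find_flat_genes` and the example values are hypothetical.

```python
import statistics


def find_flat_genes(expression, tolerance=0.0):
    """Return the IDs of genes whose variance across samples is <= tolerance."""
    flat = []
    for gene_id, values in expression.items():
        # Population variance is exactly 0 when all values are identical
        if statistics.pvariance(values) <= tolerance:
            flat.append(gene_id)
    return flat


# Made-up example: g2 and g3 have the same value in every sample
example_expression = {
    "g1": [7.2, 7.4, 8.1, 8.0],
    "g2": [5.0, 5.0, 5.0, 5.0],
    "g3": [6.1, 6.1, 6.1, 6.1],
}
flat_genes = find_flat_genes(example_expression)
```

If the resulting list is empty, flat genes are probably not the cause of the error.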

     
  • thepepper
    2010-05-03

    This worked, thanks. Well, at least it removed the error.
    I applied a variance filter to keep the 50% of genes with the highest SD, and ran BETR again. But the result is strange: _all_ the genes are called significant, with a significance value of 0.0. That would also mean that at least 50% of the genes are not "flat", so why did I need to run the variance filter in the first place? I'm confused.
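For reference, the kind of filter described here (keep a given fraction of genes, ranked by SD) can be sketched as below. This is only an illustration of the idea and is not MeV's implementation; the function name `keep_top_sd_fraction`, the dictionary input, and the example values are assumptions.

```python
import statistics


def keep_top_sd_fraction(expression, fraction):
    """Keep the given fraction of genes, ranked by sample SD (highest first)."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(
        expression,
        key=lambda gene_id: statistics.stdev(expression[gene_id]),
        reverse=True,
    )
    n_keep = int(len(ranked) * fraction)  # rounds down
    return {gene_id: expression[gene_id] for gene_id in ranked[:n_keep]}


# Made-up example with one flat gene (g2)
example_expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [5.0, 5.0, 5.0, 5.0],
    "g3": [2.0, 2.0, 3.0, 3.0],
    "g4": [0.0, 10.0, 0.0, 10.0],
}
top_half = keep_top_sd_fraction(example_expression, 0.5)
```

In this sketch the values themselves are never modified, and with `fraction=1.0` every gene is kept, flat ones included.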

    I also have another concern: is BETR supposed to perform well with only 2 replicates for each time point? Are there more suitable / robust methods for this kind of data?

     
  • thepepper
    2010-05-03

    I did another test, keeping the 90% of genes with the highest SD. BETR worked well, and called 30035 of the 40590 genes significant.
    I really don't understand why it didn't work the first time, when I didn't use the variance filter…

     
  • thepepper
    2010-05-03

    Sorry for the multiple posting.

    As an experiment, I used the variance filter and kept 100% of the genes. Then I launched the BETR algorithm, and it worked, whereas it didn't work with the original data (same genes).
    This raises another question: does the variance filter apply any transformation to the data? I don't think it would, but I'm new to MeV and I'm really confused. Did I miss something?
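One way to test this would be to export the matrix before and after the filter and compare the two. The sketch below assumes both exports have been loaded as dictionaries mapping gene IDs to lists of values in the same sample order; the function name `compare_matrices` and the example values are hypothetical.

```python
def compare_matrices(before, after):
    """Return genes missing after filtering and the largest absolute value change."""
    missing = sorted(set(before) - set(after))
    max_abs_diff = 0.0
    for gene_id in set(before) & set(after):
        # Compare the two versions of each shared gene, sample by sample
        for x, y in zip(before[gene_id], after[gene_id]):
            max_abs_diff = max(max_abs_diff, abs(x - y))
    return missing, max_abs_diff


# Made-up example: g2 was dropped, g1 is unchanged
before_filter = {"g1": [7.2, 7.4], "g2": [5.0, 5.0]}
after_filter = {"g1": [7.2, 7.4]}
missing_genes, largest_change = compare_matrices(before_filter, after_filter)
```

An empty `missing_genes` list together with a `largest_change` of 0.0 would suggest that the filter neither dropped nor altered anything.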