## Re: [Rpy] Repeated measure ANOVA : formula and summary problem

 From: Etienne Gaudrain - 2007-12-18 18:52:37

Hi everyone,

I'm new to RPy. I came to this terrific module because I used to do some of my analyses in R, and I moved to Python as a replacement for Matlab. Formerly, I manipulated data with Matlab, put it in a MySQL database, and ran my stats in R via ODBC. I'm now thinking about skipping one step by calling R directly from Python with RPy.

The analysis I almost always have to do is a repeated-measures ANOVA. The way I do this in R is:

# after ODBC connection and SQL query, Res contains my data

library('stats')
av <- aov( score~factor+Error(id_subject/factor), data=Res)
summary(av)

Now I tried the same in RPy:

# retrieve data from SQL query, Res is a dictionary

r.library('stats')
av = r.aov("score~factor+Error(id_subject/factor)", data=Res)

This fails, saying that "Error" isn't defined in the data frame...
After reading some R documentation about GLM, I found that using the R function formula() seemed to solve this problem:

av = r.aov(r.formula("score~factor+Error(id_subject/factor)"), data=Res)
r.summary(av)

However, a new problem arose in r.summary(). This function returns something that isn't readable and that doesn't contain the p-values or anything similar. It seems that the r.summary_aov() function might be adequate, but it returns an error saying that there is a NaN somewhere...

Does anybody have advice on how to perform the repeated-measures ANOVA?
Thanks!
-Etienne

PS: I use Windows XP, Python 2.5.1, Numpy 1.0.3.1, RPy 1.0.1-Numpy-py2.5 and R 2.6.1.
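
The post says that Res is a dictionary built from an SQL query. As a rough sketch of one way to get there, the helper below turns row tuples into a dictionary of columns keyed by the names used in the formula. The function name `rows_to_columns`, the sample rows and the column order are all assumptions made for illustration; they are not from the original post. Subject ids are written as strings here on the assumption that R should treat them as categories rather than numbers.

```python
def rows_to_columns(rows, names):
    """Turn a list of row tuples into a dict mapping column name -> list."""
    columns = dict((name, []) for name in names)
    for row in rows:
        if len(row) != len(names):
            raise ValueError("row %r does not match columns %r" % (row, names))
        for name, value in zip(names, row):
            columns[name].append(value)
    return columns


# Hypothetical query result, one tuple per (id_subject, factor, score).
rows = [
    ("s1", "low", 0.42),
    ("s1", "high", 0.61),
    ("s2", "low", 0.38),
    ("s2", "high", 0.70),
]

# Column names match the ones used in the formula from the post.
Res = rows_to_columns(rows, ["id_subject", "factor", "score"])
```

The resulting dictionary has one equal-length list per variable, which is the shape the `data=Res` argument in the post appears to rely on.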
 From: Toby Hocking - 2007-12-18 18:57:44

Why don't you put your R code in file.R:

library('stats')
my.summary = function(Res) {
    av <- aov( score~factor+Error(id_subject/factor), data=Res)
    summary(av)
}

then from Python:

r.source("file.R")
r.my_summary(Res)

???
 From: Etienne Gaudrain - 2007-12-18 19:17:28

OK, I guess this should work, thank you very much!
However, it means that I have to write an R file every time I run a different analysis. Of course, I could easily write a Python function that creates the R file and calls it, etc.
Maybe I'm a purist, but this just looks like hacking... isn't there a more straightforward way to do it?

Thank you again, your solution definitely solves my issue!!
-Etienne
 From: Toby Hocking - 2007-12-18 19:21:30

Hi Etienne,

I don't think you have to make a new R file every time; you just have to make it once and call r.source("file.R") every time. Then use your R function with a new dataset. This type of data flow works great for me, and I think it is rather the opposite of hacking, since you achieve separation of R code and Python code to a large extent.

Toby
 From: Etienne Gaudrain - 2007-12-18 19:39:07

Hi Toby,

You're right that I can call the R file as many times as I want, even if I add some subjects to my experiment. However, I need Python because I do some processing on my data before analysing it (e.g. some fitting with nonlinear functions...). I could probably do the same directly in R, but I don't know the language well enough... (shame on me, boo!). I also plan to cross my subjects' responses with some characteristics of the auditory signal I presented to them, and for signal handling I'm definitely more comfortable in SciPy than in R. Anyway, the point is that the number and the names of the parameters I have to include in the analysis vary every time. This could be solved by writing a Python function such as:

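def aov(r_instance, formula, data):
    # Write an R file that defines my.summary() for the given formula.
    f = open("file.R", "w")  # write mode; "rb" would open read-only
    f.write("""
        library('stats')
        my.summary = function(Res) {
            av <- aov( %s, data=Res)
            summary(av)
        }
    """ % formula)
    f.close()
    r_instance.source("file.R")
    # RPy maps the Python name my_summary to the R name my.summary.
    return r_instance.my_summary(data)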

Then I just have to call:

aov(r, "score~factor+Error(id_subject/factor)", Res)
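
Since the number and names of the parameters change from one analysis to the next, the formula string itself could also be generated. The sketch below is only an illustration: the helper name `repeated_measures_formula` is hypothetical, and crossing several within-subject factors with `*` is an assumption about the design, not something stated in this thread. For a single factor it reproduces the formula used above.

```python
def repeated_measures_formula(response, within_factors, subject):
    """Build an R formula string for a repeated-measures aov() call.

    Assumes every factor is within-subject and fully crossed (A*B*...).
    """
    if not within_factors:
        raise ValueError("at least one within-subject factor is required")
    terms = "*".join(within_factors)
    # Parentheses are only needed when several factors are crossed.
    if len(within_factors) == 1:
        nested = terms
    else:
        nested = "(%s)" % terms
    return "%s~%s+Error(%s/%s)" % (response, terms, subject, nested)


one_factor = repeated_measures_formula("score", ["factor"], "id_subject")
two_factors = repeated_measures_formula("score", ["level", "speaker"], "id_subject")
```

The returned string can then be passed as the `formula` argument of the `aov()` wrapper. The factor names `level` and `speaker` are made up for the example.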

I'll try this. Thanks!
-Etienne
 From: Gregory Warnes - 2007-12-18 19:25:27

Hi Etienne,

The basic problem is that under the default conversion mode (BASIC_CONVERSION) all R objects are converted to roughly equivalent Python structures. As a consequence, the object 'av' isn't actually an R object, so r.summary(av) won't treat it as such. The simplest solution is to change the conversion mode to NO_CONVERSION and then explicitly request conversion of an object when you need the Python version, i.e.:

> set_default_mode(NO_CONVERSION)
> av = r.aov(r.formula("score~factor+Error(id_subject/factor)"), data=Res)
> set_default_mode(BASIC_CONVERSION)
> r.summary(av)

-G
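
The mode switch described above can be wrapped in a small function so that the default mode is put back even if the aov() call raises. This is a minimal sketch, not part of RPy: the function name `summary_of_aov` is hypothetical, and the `rpy` module is passed in as an argument so the snippet does not import anything itself. It only uses the names that appear in this thread (`r`, `set_default_mode`, `NO_CONVERSION`, `BASIC_CONVERSION`).

```python
def summary_of_aov(rpy_module, formula, data):
    """Fit aov() with conversion switched off, then summarise the fit.

    rpy_module is expected to be the imported `rpy` package (RPy 1.x).
    """
    r = rpy_module.r
    rpy_module.set_default_mode(rpy_module.NO_CONVERSION)
    try:
        # With NO_CONVERSION the result stays a genuine R object.
        av = r.aov(r.formula(formula), data=data)
    finally:
        # Always restore the default mode, even if aov() failed.
        rpy_module.set_default_mode(rpy_module.BASIC_CONVERSION)
    # The summary is converted to Python structures as usual.
    return r.summary(av)


# Usage with RPy 1.x installed (not executed here):
#   import rpy
#   table = summary_of_aov(rpy, "score~factor+Error(id_subject/factor)", Res)
```

Restoring BASIC_CONVERSION in a `finally` clause is a design choice of this sketch; it assumes BASIC_CONVERSION was the mode in effect beforehand, as in the message above.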
 From: Etienne Gaudrain - 2007-12-18 19:42:59

Hi Gregory,

This is good to know! I'll try this. Thank you very much!!

I imagine that having av as a true R object will help the summary() function find the right method, summary.aov(). However, I tried calling summary_aov() explicitly with no success. I have to test this before going further.
Thanks again!!
-Etienne
 From: Gregory Warnes - 2007-12-18 19:46:44

Hi Etienne,

> I imagine that having av as a true R object
> will help the summary() function find the right method, summary.aov().

Not only do you need to have R call the correct R function, it must also receive a correct R object. When the R-->Python conversion happens, some information is lost, so the av object itself isn't a valid aov R object when the Python-->R conversion happens.

-G
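
The information loss described above can be pictured with a toy model in plain Python. Nothing here is RPy or R code: `ToyRObject`, `to_python`, `to_r` and `summary` are hypothetical stand-ins, and the model only assumes that the class attribute is what lets a generic function pick its method and that a plain dictionary has nowhere to keep that attribute.

```python
class ToyRObject(object):
    """Toy stand-in for an R value: named components plus a class attribute."""
    def __init__(self, components, r_class):
        self.components = components
        self.r_class = r_class


def to_python(obj):
    """Model of R --> Python: only the named components survive, as a dict."""
    return dict(obj.components)


def to_r(value):
    """Model of Python --> R: a plain dict comes back as an unclassed list."""
    return ToyRObject(dict(value), "list")


def summary(obj):
    """Dispatch on the class attribute, loosely like an R generic function."""
    if obj.r_class == "aovlist":
        return "ANOVA table per error stratum"
    return "generic listing of components"


# A fit with an Error() term is modelled here as class "aovlist".
fit = ToyRObject({"Within": [1.0, 2.0]}, "aovlist")
round_tripped = to_r(to_python(fit))
```

In this model `summary(fit)` reaches the ANOVA method, while `summary(round_tripped)` falls back to the generic one even though the components are identical, which mirrors why keeping av unconverted matters.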
 From: Etienne Gaudrain - 2007-12-18 20:21:12

Ok! Thank you very much!!!
-Etienne
 From: Etienne Gaudrain - 2007-12-19 09:11:55

And... it works!! Thanks again!!
-Etienne

