## Rotational Invariance in Factor Models

Created: 2013-05-03 · Last updated: 2013-07-25

• Georg Hosoya
2013-05-03

Hi!

I am new to this forum, and my background is mostly in psychological methods. At the moment one of my side projects is to explore the MCMC framework for estimating latent variable models, and I have stumbled upon the problem of rotational invariance described in the article by Erosheva and Curtis.

Here is some code for a simple factor model with one continuous latent node eta (e.g. a trait) 'influencing' the manifest indicators y (e.g. responses to items).

```
model{
##Likelihood
# i: Persons
# j: Items

for(i in 1:n)
{
for(j in 1:k)
{
mu[i,j]<-beta[j]+lambda[j]*eta[i]
y[i,j]~dnorm(mu[i,j], tau.y[j])
}
}

##Priors
for(j in 1:k)
{
# Item-Residuals
tau.y[j] <- pow(sigma.y[j], -2)
sigma.y[j] ~ dunif (0, 100)
# Intercepts
beta[j]~ dnorm(0,1.0E-3)
lambda[j]~dnorm(0,1.0E-3)
}

# Factor-Values
for(i in 1:n)
{
eta[i]~dnorm(0,1)
}

}
```

In this case the likelihood surface is bimodal whenever the lambdas contain a mix of positive and negative values: there are two equivalent solutions in which the signs of all lambdas and all etas are flipped jointly.
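The invariance is easy to check numerically: flipping the signs of all lambdas and all etas at once leaves every mu[i,j], and therefore the likelihood, unchanged, because only the product lambda[j]*eta[i] enters the model. A minimal base-R sketch with made-up parameter values:

```r
# Reflection invariance: (lambda, eta) and (-lambda, -eta)
# yield exactly the same model-implied means mu[i, j].
set.seed(1)
n <- 50; k <- 4
beta   <- c(0.5, -0.2, 1.0, 0.0)   # illustrative intercepts
lambda <- c(0.8, -0.6, 1.2, 0.4)   # loadings with mixed signs
eta    <- rnorm(n)

mu1 <- matrix(beta, n, k, byrow = TRUE) + outer(eta, lambda)
mu2 <- matrix(beta, n, k, byrow = TRUE) + outer(-eta, -lambda)

all.equal(mu1, mu2)   # TRUE: both sign patterns fit equally well
```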

I am just curious whether anyone knows of a solution that does not involve post-processing the Markov chains as described in the article above, perhaps by introducing another set of constraints?

It would be really cool if MCMC could also be applied to more complex latent variable models.

If anyone is interested, I can also provide some simple R code to simulate data from a known distribution to check for consistency.
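For instance, data from the one-factor model above can be simulated in a few lines of base R (a sketch; all parameter values are made up):

```r
# Simulate y[i, j] ~ Normal(beta[j] + lambda[j] * eta[i], sigma.y)
# from a known one-factor structure.
set.seed(42)
n <- 300; k <- 5
beta    <- c(1.0, 2.0, 0.0, -1.0, 0.5)   # item intercepts
lambda  <- c(1.0, 0.8, 0.6, 0.9, 0.7)    # item loadings
sigma.y <- 0.5                           # common residual sd
eta     <- rnorm(n, 0, 1)                # latent trait values

y <- matrix(beta, n, k, byrow = TRUE) +
  outer(eta, lambda) +
  matrix(rnorm(n * k, 0, sigma.y), n, k)

# A simple consistency check: the true etas should correlate
# strongly with the person means of y.
cor(eta, rowMeans(y))
```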

Thanks and best,

Georg

Last edit: Georg Hosoya 2013-05-03

• Georg Hosoya
2013-07-09

Dear all,

thanks to this forum I had an interesting e-mail exchange with Vincent Arel-Bundock from the University of Michigan, who, together with one of his co-authors, Walter R. Mebane, greatly helped to solve the problem.

The trick seems to be fixing the loading of one item to one and its intercept to zero. The variance of the latent trait distribution is then estimated freely. Perhaps the mean should be estimated freely as well.
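As far as I understand it, this works because the likelihood is invariant under rescaling: for any constant s ≠ 0 (including s = -1, the reflection from my first post), replacing lambda by s*lambda and eta by eta/s leaves all model-implied means unchanged, so fixing lambda[1] = 1 pins down both the scale and the sign of the factor. A base-R sketch with made-up values:

```r
# Scale/sign indeterminacy: (lambda, eta) and (s * lambda, eta / s)
# imply the same means for any s != 0.
set.seed(7)
eta    <- rnorm(10)
lambda <- c(1.0, 0.7, 1.3)   # lambda[1] fixed to 1 (marker item)
s      <- -2                 # an arbitrary rescaling

mu_a <- outer(eta, lambda)
mu_b <- outer(eta / s, s * lambda)
all.equal(mu_a, mu_b)        # TRUE: the data cannot tell them apart

# But the rescaled loadings violate the constraint lambda[1] == 1,
# so only s = 1 remains admissible: the marker item fixes
# both the scale and the sign of eta.
(s * lambda)[1]              # -2, not 1
```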

I tried it out with the Holzinger & Swineford dataset, here is the code:

```
# 1_fa.R
# Experimental

library(R2jags)
library(MBESS)
data(HS.data)
# Select item-subset of HS.data
y<-as.matrix(HS.data[,26:30])
n<-dim(y)[1]
k<-dim(y)[2]

data<-list("y", "n", "k")

parameters<-c("beta", "sigma.y", "lambda", "eta", "sigma.eta", "eta.mean")

output<-jags.parallel(data, inits=NULL, parameters,
model.file="1_fab.txt",n.chains=2, n.iter=2000, n.burnin=1000)

attach.jags(output)
plot(output)

cor(apply(eta,2,mean), apply(y,1,mean))
```

This is the JAGS script:

```
# Experimental
# 1_fab.txt

model{
## Likelihood
# i: Persons
# j: Items

for(i in 1:n)
{
for(j in 1:k)
{
mu[i,j]<-beta[j]+lambda[j]*eta[i]
y[i,j]~dnorm(mu[i,j], tau.y[j])
}
}

## Constraints
# loading of the first item fixed to one,
# its intercept fixed to zero

lambda[1]<-1
beta[1]<-0

for(j in 2:k)
{
# Intercepts
beta[j]~ dnorm(0,1.0E-3)
lambda[j]~dnorm(0,1.0E-3)
}

# Priors for item residuals
for(j in 1:k)
{
tau.y[j] <- pow(sigma.y[j], -2)
sigma.y[j] ~ dunif (0, 100)
}

# Priors for factor values
for(i in 1:n)
{
eta[i]~dnorm(eta.mean, tau.eta)
}

# Hyperprior for latent score distribution
eta.mean~dnorm(0,1.0E-4)
tau.eta<- pow(sigma.eta, -2)
sigma.eta ~ dunif (0, 100)
}
```

Big thanks! If I find the time, I will try out multiple latent variables with a scaled inverse Wishart prior on the variance-covariance matrix of the factor scores.

I am also curious whether this solution transfers to the Birnbaum (2PL) model.

Georg

Last edit: Georg Hosoya 2013-07-09

• Martyn Plummer
2013-07-10

I'm glad someone was able to help you. I stayed silent because I am not too familiar with factor analysis models.

• Georg Hosoya
2013-07-25

Thanks Martyn! Meanwhile I have tried a three-factor model with a scaled inverse Wishart prior (after Gelman & Hill, 2007) on the variance-covariance matrix of the factor values, using the Holzinger-Swineford dataset that is also used in the tutorial for the R package lavaan by Yves Rosseel et al.

Using the traditional practice of setting one loading per factor to one for identification seems to work nicely in this case and gives results comparable to lavaan.
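For readability, the constraint pattern used in the JAGS script below (one marker loading per factor fixed to one, cross-loadings fixed to zero, the rest free) can be summarized as a small pattern matrix; a sketch, with NA marking freely estimated loadings:

```r
# Loading pattern for the three-factor Holzinger-Swineford model:
# rows = constructs (visual, textual, speed), columns = items x1..x9;
# 1 = marker loading fixed to one, 0 = fixed to zero, NA = free.
pattern <- rbind(
  visual  = c(1, NA, NA, 0,  0,  0, 0,  0,  0),
  textual = c(0,  0,  0, 1, NA, NA, 0,  0,  0),
  speed   = c(0,  0,  0, 0,  0,  0, 1, NA, NA)
)

# Sanity check: every item loads on exactly one construct.
colSums(pattern != 0 | is.na(pattern))
```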

In the following code I simply run lavaan first; in a second step the same model is fit with JAGS for comparison. I think the scripts are pretty much self-explanatory.

However, keep in mind that this is somewhat experimental.

Here is the R-script:

```
## holzinger.R
# Experimental!

library(lavaan)

# From the lavaan tutorial
HS.model <- ' visual  =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed   =~ x7 + x8 + x9 '

fit <- cfa(HS.model, data=HolzingerSwineford1939, meanstructure=TRUE)

summary(fit, standardized=TRUE)

library(R2jags)
data<-HolzingerSwineford1939
## Transform data to long format----
#  the complicated way

# Dependent variable
y<-c(data$x1, data$x2, data$x3, data$x4, data$x5, data$x6, data$x7, data$x8, data$x9)

# Item identifier
item<-c(rep(1, 301), rep(2, 301), rep(3, 301), rep(4, 301), rep(5, 301), rep(6, 301), rep(7, 301), rep(8, 301), rep(9, 301))

# Construct identifier (1: visual, 2: textual, 3: speed)
const<-c(rep(1,3*301 ), rep(2,3*301 ), rep(3,3*301 ))

# Person id
id<-rep(seq(1:301),9)
## ---------------------------------
# Fitting the model with JAGS
W <- diag(3)

dat<-list("y", "item", "const", "id", "W")
parameters<-(c("beta", "sigma.y", "lam", "sigma.eta1", "sigma.eta2", "sigma.eta3", "rho_eta13", "rho_eta12", "rho_eta23", "eta"))

output<-jags.parallel(dat, inits=NULL, parameters,  model.file="3fa.txt",n.chains=2, n.iter=3000, n.burnin=2000)
plot(output)
```

Here is the JAGS script:

```
## 3fa.txt
# Experimental!
model{

# Likelihood
for(i in 1:2709) # Loop over observations
{
# Linear decomposition; could probably be done
# more elegantly with matrix algebra.
# Sum over the three constructs; the cross-loadings
# are fixed to zero below.
mu[i]<-beta[item[i]] +
lam[1, item[i]]*eta[1, id[i]] +
lam[2, item[i]]*eta[2, id[i]] +
lam[3, item[i]]*eta[3, id[i]]

# Distributional assumptions for the
# observed variables
y[i]~dnorm(mu[i], tau.y[item[i]])
}

# Priors for the items
for(i in 1:9)
{
beta[i]~dnorm(0,1.0E-3)
tau.y[i] <- pow(sigma.y[i], -2)
sigma.y[i] ~ dunif (0, 100)
}

# Constraints and priors
lam[1,1]<-1
lam[2,1]<-0
lam[3,1]<-0
lam[1,2]~dnorm(0,1.0E-3)
lam[2,2]<-0
lam[3,2]<-0
lam[1,3]~dnorm(0,1.0E-3)
lam[2,3]<-0
lam[3,3]<-0
lam[1,4]<-0
lam[2,4]<-1
lam[3,4]<-0
lam[1,5]<-0
lam[2,5]~dnorm(0,1.0E-3)
lam[3,5]<-0
lam[1,6]<-0
lam[2,6]~dnorm(0,1.0E-3)
lam[3,6]<-0
lam[1,7]<-0
lam[2,7]<-0
lam[3,7]<-1
lam[1,8]<-0
lam[2,8]<-0
lam[3,8]~dnorm(0,1.0E-3)
lam[1,9]<-0
lam[2,9]<-0
lam[3,9]~dnorm(0,1.0E-3)

# Scaled inverse Wishart prior for
# variance-covariance matrix of factor values
# after Gelman and Hill (2007)

for(i in 1:301)
{
eta[1,i]<-xi.eta1*eta.raw[i,1]
eta[2,i]<-xi.eta2*eta.raw[i,2]
eta[3,i]<-xi.eta3*eta.raw[i,3]
eta.raw[i,1:3]~dmnorm(eta.raw.hat[i,],tau.eta.raw[,])
eta.raw.hat[i,1]<-mu.eta1.raw
eta.raw.hat[i,2]<-mu.eta2.raw
eta.raw.hat[i,3]<-mu.eta3.raw
}
mu.eta1<-xi.eta1*mu.eta1.raw
mu.eta2<-xi.eta2*mu.eta2.raw
mu.eta3<-xi.eta3*mu.eta3.raw
mu.eta1.raw<-0
mu.eta2.raw<-0
mu.eta3.raw<-0
xi.eta1~dunif(0,500)
xi.eta2~dunif(0,500)
xi.eta3~dunif(0,500)
tau.eta.raw[1:3,1:3] ~ dwish(W[,],df)
df<-4
sigma.eta.raw[1:3,1:3]<-inverse(tau.eta.raw[,])
sigma.eta1<-xi.eta1*sqrt(sigma.eta.raw[1,1])
sigma.eta2<-xi.eta2*sqrt(sigma.eta.raw[2,2])
sigma.eta3<-xi.eta3*sqrt(sigma.eta.raw[3,3])
rho_eta12<-sigma.eta.raw[1,2]/sqrt(sigma.eta.raw[1,1]*sigma.eta.raw[2,2])
rho_eta13<-sigma.eta.raw[1,3]/sqrt(sigma.eta.raw[1,1]*sigma.eta.raw[3,3])
rho_eta23<-sigma.eta.raw[2,3]/sqrt(sigma.eta.raw[2,2]*sigma.eta.raw[3,3])
}
```