Nick Bene - 2009-08-06

Hi,

I initially posted these questions in the opennlp forum and only later realised that there is a separate forum for maxent. Here we go:

Many sources say that the maximum entropy model used in NLP is, essentially, multinomial logistic regression. For the latter, one has to specify a reference category (http://en.wikipedia.org/wiki/Multinomial_logit), e.g. a certain class of Named Entity. The reference category has all of its feature coefficients fixed at zero and, essentially, serves as the baseline against which the logit scores of the other categories are computed.

How is this done in the opennlp.maxent tool? How is the reference category determined?

And most importantly: I would like to use the coefficients stored in the maxent model file to run a corresponding multinomial logistic regression on unseen data. Would such use be appropriate?

If so, would a multinomial logistic regression using those coefficients (say, in R) produce roughly the same probabilities as the maxent tool running the same model file?
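
To make the question concrete, here is a minimal sketch in Python of what I have in mind. It is not taken from opennlp.maxent: the outcome names, feature names, and coefficient values are all invented for illustration, and I am assuming the model assigns one coefficient per (outcome, feature) pair with binary features. It only shows that subtracting the reference category's coefficients from every category leaves the normalised probabilities unchanged, which is why I would expect the two formulations to agree.

```python
import math


def softmax_probs(weights, features):
    """Normalised exponential scores per outcome.

    weights:  dict outcome -> dict feature -> coefficient (hypothetical layout)
    features: list of active binary features for one event
    """
    scores = {o: sum(w.get(f, 0.0) for f in features) for o, w in weights.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {o: math.exp(s - top) for o, s in scores.items()}
    total = sum(exps.values())
    return {o: e / total for o, e in exps.items()}


def to_reference_form(weights, reference):
    """Re-express coefficients relative to a reference outcome.

    After this step the reference outcome has all-zero coefficients,
    as in the usual multinomial logit parameterisation.
    """
    ref = weights[reference]
    all_feats = set().union(*(w.keys() for w in weights.values()))
    return {
        o: {f: w.get(f, 0.0) - ref.get(f, 0.0) for f in all_feats}
        for o, w in weights.items()
    }


# Invented coefficients, NOT from a real model file.
weights = {
    "PERSON": {"cap": 1.2, "prev=Mr.": 2.0},
    "LOCATION": {"cap": 0.9, "prev=in": 1.5},
    "OTHER": {"cap": -0.4},
}
features = ["cap", "prev=in"]

p_full = softmax_probs(weights, features)
p_ref = softmax_probs(to_reference_form(weights, "OTHER"), features)

for outcome in weights:
    print(outcome, round(p_full[outcome], 6), round(p_ref[outcome], 6))
```

Both columns should agree up to rounding, whichever category is picked as the reference. What I cannot tell from this is whether the model file stores coefficients in the first form or the second.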

thanks,

Nick