This is a consistency question. I found that enet_path has a clever behaviour for this case.
The logic there is:
> if center_data changes X, then X wasn't centered.
> If this is the case, and the user supplied a precomputed Gram (and maybe an Xy),
> then we can assume that the user is wrong.
> If the user is actually not wrong, that would mean that he took the effort to
> build the Gram matrix from a centered, standardized X, but then passed the uncentered
> X to the enet_path method. This is unlikely.
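To make the quoted heuristic concrete, here is a minimal sketch of it. This is not the scikit-learn code: center_columns and usable_gram are hypothetical names used only for illustration, and only centering (not scaling) is checked.

```python
import numpy as np


def center_columns(X):
    """Return a column-centered copy of X together with the column means."""
    mean = X.mean(axis=0)
    return X - mean, mean


def usable_gram(X, gram):
    """Return the user-supplied Gram only if centering leaves X unchanged.

    Hypothetical helper: if centering changes X, the Gram is assumed to have
    been computed from the uncentered X, so it is dropped (None is returned).
    """
    X_centered, _ = center_columns(X)
    if gram is not None and not np.allclose(X, X_centered):
        return None
    return gram
```

With an uncentered X the helper returns None, so the caller would recompute the Gram from the centered data; with an already centered X the supplied Gram is passed through untouched.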
The problem is that LassoLars, for example (https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/least_angle.py#L427), and probably many other estimators don't behave like this.
I find it a bit impolite to ignore the Gram parameter that a user supplies just because it is wrong, even though ignoring it is the helpful thing to do. I see a couple of clean ways out of this:
1. Always use the Gram if the user passed it, on the assumption that a user who passes their own Gram is in pretty deep swamps already and knows what they are doing.
2. Discard the Gram everywhere in this situation, but also notify the user.
2'. Maybe pass the Gram and the Xy to the _center_data method so that we only have one point of entry? Then maybe we could code a clever way to update the Gram too, instead of recomputing it (is that possible?).
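On the question in option 2': for centering alone, an update without recomputing does seem possible, since (X - 1 m^T)^T (X - 1 m^T) = X^T X - n m m^T, and likewise X_c^T y_c = X^T y - n m ybar, where m is the vector of column means and ybar the mean of y. A minimal sketch, assuming the column means are available; center_gram is a hypothetical name, not an existing scikit-learn function:

```python
import numpy as np


def center_gram(gram, Xy, X_mean, y_mean, n_samples):
    """Turn an uncentered Gram and Xy into their centered counterparts.

    Hypothetical helper. Uses the identities
        Xc^T Xc = X^T X - n * outer(m, m)
        Xc^T yc = X^T y - n * m * ybar
    so X itself is never touched.
    """
    gram_centered = gram - n_samples * np.outer(X_mean, X_mean)
    Xy_centered = Xy - n_samples * X_mean * y_mean
    return gram_centered, Xy_centered
```

One caveat: the subtraction can lose precision when the column means are large relative to the spread of the data, so recomputing from the centered X may still be the safer default.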
thanks for raising the problem.
how about forcing the user to explicitly set fit_intercept=False and
normalize=False when gram matrix and Xy are provided. If any of these
is True, they are both discarded anyway.
To avoid breaking backward compat I would start by raising a warning
that they are discarded and in 2 releases making this mandatory.
what do others think?
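A minimal sketch of the transitional behaviour proposed above, i.e. warn and discard for now. check_precomputed is a hypothetical helper, and the warning category and message are assumptions, not existing scikit-learn behaviour:

```python
import warnings


def check_precomputed(gram, Xy, fit_intercept, normalize):
    """Discard a precomputed Gram and Xy, with a warning, if X will be altered.

    Hypothetical helper: when fit_intercept or normalize is True, X gets
    centered or scaled, so a Gram/Xy computed by the user no longer matches.
    """
    if gram is None or not (fit_intercept or normalize):
        return gram, Xy
    warnings.warn(
        "Precomputed Gram and Xy are discarded because fit_intercept or "
        "normalize is True; set both to False to use them.",
        UserWarning,
    )
    return None, None
```

After the deprecation period, the warnings.warn call could be replaced by raising a ValueError to make the explicit setting mandatory.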
2012/6/26 Alexandre Gramfort <alexandre.gramfort@...>:
> hi vlad,
> thanks for raising the problem.
> how about forcing the user to explicitly set fit_intercept=False and
> normalize=False when gram matrix and Xy are provided. If any of these
> is True, they are both discarded anyway.
> To avoid breaking backward compat I would start by raising a warning
> that they are discarded and in 2 releases making this mandatory.
> what do others think?
http://twitter.com/ogrisel - http://github.com/ogrisel