From: Justin R <jus...@gm...> - 2012-06-12 06:03:47
|
operating system Windows 7
matplotlib version : 1.1.0
obtained from sourceforge

The class seems to generate the same Wt matrix for every input. Every element of the weight matrix is either +sqrt(1/2) or -sqrt(1/2).

dat1 = 4*np.random.randn(200,1) + 2
dat2 = dat1*.25 + 1*np.random.randn(200,1)
pcaObj1 = PCA(np.hstack((dat1,dat2)))
print pcaObj1.Wt

dat3 = 2*np.random.randn(200,1) + 2
dat4 = dat3*2 + 3*np.random.randn(200,1)
pcaObj2 = PCA(np.hstack((dat1,dat2)))
print pcaObj2.Wt

The output Y seems to be correct, and the projection function works. Only the Wt matrix seems to be messed up. Am I using this class incorrectly, or could this be a bug?

thanks,
Justin
|
From: Paul H. <pmh...@gm...> - 2012-06-12 16:59:30
|
On Mon, Jun 11, 2012 at 11:03 PM, Justin R <jus...@gm...> wrote:
> operating system Windows 7
> matplotlib version : 1.1.0
> obtained from sourceforge
>
> the class seems to generate the same Wt matrix for every input. The
> every element of the weight matrix is either +sqrt(1/2) or -sqrt(1/2).
>
> dat1 = 4*np.random.randn(200,1) + 2
> dat2 = dat1*.25 + 1*np.random.randn(200,1)
> pcaObj1 = PCA(np.hstack((dat1,dat2)))
> print pcaObj1.Wt
>
> dat3 = 2*np.random.randn(200,1) + 2
> dat4 = dat3*2 + 3*np.random.randn(200,1)
> pcaObj2 = PCA(np.hstack((dat1,dat2)))
> print pcaObj2.Wt
>
> The output Y seems to be correct, and the projection function works.
> only the Wt matrix seems to be messed up. Am I using this class
> incorrectly, or could this be a bug?
> thanks,
> Justin

Justin, could you post a self-contained script that demonstrates the
issue? Where does this PCA function come from?

In [1]: from pylab import *

In [2]: PCA
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
C:\Users\phobson\<ipython-input-2-dcf6991f51c0> in <module>()
----> 1 PCA

NameError: name 'PCA' is not defined

-paul
|
From: Warren W. <war...@en...> - 2012-06-14 14:47:31
|
On Tue, Jun 12, 2012 at 11:59 AM, Paul Hobson <pmh...@gm...> wrote:
> On Mon, Jun 11, 2012 at 11:03 PM, Justin R <jus...@gm...> wrote:
> > operating system Windows 7
> > matplotlib version : 1.1.0
> > obtained from sourceforge
> >
> > the class seems to generate the same Wt matrix for every input. The
> > every element of the weight matrix is either +sqrt(1/2) or -sqrt(1/2).
> >
> > dat1 = 4*np.random.randn(200,1) + 2
> > dat2 = dat1*.25 + 1*np.random.randn(200,1)
> > pcaObj1 = PCA(np.hstack((dat1,dat2)))
> > print pcaObj1.Wt
> >
> > dat3 = 2*np.random.randn(200,1) + 2
> > dat4 = dat3*2 + 3*np.random.randn(200,1)
> > pcaObj2 = PCA(np.hstack((dat1,dat2)))
> > print pcaObj2.Wt
> >
> > The output Y seems to be correct, and the projection function works.
> > only the Wt matrix seems to be messed up. Am I using this class
> > incorrectly, or could this be a bug?
> > thanks,
> > Justin
>
> Justin, could you post a self-contained script that demonstrates the
> issue? Where does this PCA function come from?
>
> In [1]: from pylab import *
>
> In [2]: PCA
> ---------------------------------------------------------------------------
> NameError                                 Traceback (most recent call last)
> C:\Users\phobson\<ipython-input-2-dcf6991f51c0> in <module>()
> ----> 1 PCA
>
> NameError: name 'PCA' is not defined
>

Paul,

In case you never got an answer to this: PCA is in the mlab submodule, so if you do "from pylab import *", you would use mlab.PCA. (At least that's the case in matplotlib 1.1.0).

Warren

> -paul
|
From: Goyo <goy...@gm...> - 2012-06-13 19:01:39
|
2012/6/12 Paul Hobson <pmh...@gm...>:
> On Mon, Jun 11, 2012 at 11:03 PM, Justin R <jus...@gm...> wrote:
> Justin, could you post a self-contained script that demonstrates the
> issue? Where does this PCA function come from?

It comes from matplotlib.mlab. Just add these imports before the OP's code:

import numpy as np
from matplotlib.mlab import PCA

But I don't know much about PCA and can't comment on this.

Goyo
|
From: Aronne M. <aro...@gm...> - 2012-06-16 15:38:56
|
On Tue, Jun 12, 2012 at 1:03 AM, Justin R <jus...@gm...> wrote:
> operating system Windows 7
> matplotlib version : 1.1.0
> obtained from sourceforge
>
> the class seems to generate the same Wt matrix for every input. The
> every element of the weight matrix is either +sqrt(1/2) or -sqrt(1/2).
>
> dat1 = 4*np.random.randn(200,1) + 2
> dat2 = dat1*.25 + 1*np.random.randn(200,1)
> pcaObj1 = PCA(np.hstack((dat1,dat2)))
> print pcaObj1.Wt
>
> dat3 = 2*np.random.randn(200,1) + 2
> dat4 = dat3*2 + 3*np.random.randn(200,1)
> pcaObj2 = PCA(np.hstack((dat1,dat2)))
> print pcaObj2.Wt
>
> The output Y seems to be correct, and the projection function works.
> only the Wt matrix seems to be messed up. Am I using this class
> incorrectly, or could this be a bug?

Hi,

I wouldn't call myself a PCA expert - so don't weight my answer too heavily - but here is what I think is happening:

Looking at the code, the input data array is centered and scaled to unit variance in each dimension. The attribute .a of the class is a copy of the array that is actually sent to the SVD; note the centering/scaling. I don't have a proof of this, but intuitively I expect that the PCA axes associated with a 2-dimensional centered/scaled array will always be at 45° angles (e.g., [1,1], [-1,1], etc., which are normalized to [sqrt(1/2), sqrt(1/2)], etc). I think one way to describe this is that after centering/scaling there are no degrees of freedom left if you only started with 2 dimensions.

So I don't think there is a bug, but it is maybe unclear what the PCA class is doing.
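The intuition above can be checked with a little algebra. After centering and scaling, the covariance matrix of two columns is the correlation matrix [[1, r], [r, 1]], and for any nonzero r its eigenvectors are [1, 1] and [1, -1] up to sign and normalization, whatever the data. The following is only a sketch using the standard library: the helper name first_axis_after_standardizing is hypothetical, the seed is arbitrary, and it uses the closed-form axis angle of a symmetric 2x2 matrix rather than the SVD mentioned above.

```python
import math
import random


def first_axis_after_standardizing(x, y):
    """Unit vector along the first principal axis of two standardized columns.

    Hypothetical helper for illustration only; it is not part of matplotlib.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((v - mx) ** 2 for v in x) / n)
    sy = math.sqrt(sum((v - my) ** 2 for v in y) / n)
    # Center each column and scale it to unit variance.
    xs = [(v - mx) / sx for v in x]
    ys = [(v - my) / sy for v in y]
    # Entries of the 2x2 covariance matrix [[a, b], [b, d]]; here a and d are
    # both 1 (up to rounding) and b is the correlation coefficient r.
    a = sum(v * v for v in xs) / n
    d = sum(v * v for v in ys) / n
    b = sum(u * v for u, v in zip(xs, ys)) / n
    # Closed-form orientation of the major axis of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    return math.cos(theta), math.sin(theta)


random.seed(0)  # arbitrary seed, only for repeatability

# Two data sets modeled on dat1/dat2 and dat3/dat4 from the original post.
x1 = [4 * random.gauss(0, 1) + 2 for _ in range(200)]
y1 = [0.25 * v + random.gauss(0, 1) for v in x1]
x2 = [2 * random.gauss(0, 1) + 2 for _ in range(200)]
y2 = [2 * v + 3 * random.gauss(0, 1) for v in x2]

axis1 = first_axis_after_standardizing(x1, y1)
axis2 = first_axis_after_standardizing(x2, y2)
print(axis1)
print(axis2)
```

Both printed axes should come out at roughly (0.7071, 0.7071) even though the two data sets have different spreads and slopes, which is consistent with the "no degrees of freedom left" description.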
If you increase to > 2 dimensions, you can see there is random fluctuation in Wt:

In [102]: pcaObj = PCA(np.random.randn(200,2))

In [103]: pcaObj.Wt
Out[103]:
array([[-0.70710678, -0.70710678],
       [-0.70710678,  0.70710678]])

In [104]: pcaObj = PCA(np.random.randn(200,3))

In [105]: pcaObj.Wt
Out[105]:
array([[ 0.65456366, -0.24141116, -0.7164266 ],
       [ 0.39843462,  0.91551401,  0.05553329],
       [ 0.64249223, -0.32179924,  0.69544877]])

In [106]: pcaObj = PCA(np.random.randn(200,3))

In [107]: pcaObj.Wt
Out[107]:
array([[-0.29885902, -0.67436982,  0.67521007],
       [-0.95428685,  0.21449891, -0.20815098],
       [-0.00446109, -0.70655189, -0.70764718]])

Hope that helps,
Aronne
|
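For readers who want to experiment without matplotlib, here is a NumPy-only sketch of the steps described in this thread (center, scale to unit variance, SVD). The function pca_weights and its scale flag are hypothetical stand-ins written for illustration; they are assumed to mirror the description above and are not the mlab.PCA implementation. The seed is arbitrary.

```python
import numpy as np


def pca_weights(data, scale=True):
    """Return a matrix whose rows are the principal axes of `data`.

    Hypothetical stand-in for illustration; not matplotlib's mlab.PCA.
    """
    a = np.asarray(data, dtype=float)
    a = a - a.mean(axis=0)          # center each column
    if scale:
        a = a / a.std(axis=0)       # scale each column to unit variance
    # Rows of the third SVD factor are the right singular vectors.
    _, _, wt = np.linalg.svd(a, full_matrices=False)
    return wt


rng = np.random.RandomState(0)  # arbitrary seed, only for repeatability

# Two correlated columns, modeled on dat1/dat2 from the original post.
dat1 = 4 * rng.randn(200, 1) + 2
dat2 = dat1 * 0.25 + rng.randn(200, 1)
pair = np.hstack((dat1, dat2))

wt_scaled = pca_weights(pair)                 # centered and scaled
wt_centered = pca_weights(pair, scale=False)  # centered only
wt_three = pca_weights(rng.randn(200, 3))     # three columns, scaled

print(wt_scaled)
print(wt_centered)
print(wt_three)
```

With scaling, every entry of the two-column result has magnitude sqrt(1/2). With centering only, the weights follow the relative spread of the columns, which may be closer to what the original question expected. The three-column result cannot have all entries at sqrt(1/2), because each row has unit length.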