From: Fabian Schoenfeld <fabian.schoenfeld@in...> - 2011-10-26 15:03:11

Gnah, now I finally understand. Thanks for your patience! I guess I didn't see the forest for all the trees anymore.

Well, back to reworking my network and trying again. At least that should be pretty straightforward now :)

Cheers!
Fabian

On 10/26/11, Niko Wilbert wrote:
> in our current implementation the covariance matrix update is
> vectorized over all samples in the batch, so you have to multiply that
> memory number with the number of samples.
From: Niko Wilbert <mail@ni...> - 2011-10-26 14:45:39

Hi Fabian,

in our current implementation the covariance matrix update is vectorized over all samples in the batch, so you have to multiply that memory number by the number of samples.

I guess in principle we could add a check in CovarianceMatrix, so that a single-sample incremental update is used if the samples dimension is larger than a threshold value. We could write a little benchmark to determine a good threshold value, since it would probably not only save memory but could also be somewhat faster (as I remember, beyond a certain size vectorization in numpy has a negative effect on performance).

Cheers,
Niko

On Wed, Oct 26, 2011 at 4:34 PM, Fabian Schoenfeld <fabian.schoenfeld@...> wrote:
> So the quadratic expansion only works on 64 channels and
> produces 2144 output channels, which makes the sizes quite manageable again -
> a 2144*2144 covariance matrix of float32 fits into a mere ~17.5 MB.
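A minimal sketch of the difference between a one-shot covariance computation and an incremental one that only keeps running sums. This is not MDP's CovarianceMatrix code; the function names are made up for illustration, and the update is shown per block of samples (a block size of 1 gives the single-sample case mentioned above).

```python
import numpy as np

def cov_full_batch(x):
    """Covariance from the whole (n_samples, dim) array in one shot."""
    xc = x - x.mean(axis=0)
    return xc.T.dot(xc) / (x.shape[0] - 1)

def cov_chunked(chunks):
    """Covariance accumulated block by block.

    Only the running sums (a dim x dim matrix and a dim vector) stay in
    memory between blocks, whatever the total number of samples is.
    """
    n, sum_x, sum_xx = 0, None, None
    for chunk in chunks:
        if sum_x is None:
            dim = chunk.shape[1]
            sum_x = np.zeros(dim)
            sum_xx = np.zeros((dim, dim))
        n += chunk.shape[0]
        sum_x += chunk.sum(axis=0)
        sum_xx += chunk.T.dot(chunk)
    mean = sum_x / n
    # sum_i (x_i - mean)(x_i - mean)^T == sum_xx - n * mean mean^T
    return (sum_xx - n * np.outer(mean, mean)) / (n - 1)

rng = np.random.RandomState(0)
x = rng.randn(1000, 5)                      # toy data, not the real setup
blocks = [x[i:i + 100] for i in range(0, 1000, 100)]
print(np.allclose(cov_full_batch(x), cov_chunked(blocks)))  # True
```

Whether small blocks are also faster than one large vectorized call is exactly what the proposed benchmark would have to show; the sketch only demonstrates that the result is the same.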
From: Fabian Schoenfeld <fabian.schoenfeld@in...> - 2011-10-26 14:35:44

Yes, so far you are right. However, here is what I left out of the original post to make it more readable: the first SFA node within the SFA* flow node is there to reduce the dimensionality, and thus it only relays the 64 slowest signals found within the 240 input channels. So the quadratic expansion only works on 64 channels and produces 2144 output channels, which makes the sizes quite manageable again - a 2144*2144 covariance matrix of float32 fits into a mere ~17.5 MB.

I will probably try to send the data in iterative batches, but with my current setup I still can't rationalize where all the memory usage comes from - and I would rather avoid working with multiple batches, as I'm outsourcing my computations to graphics hardware, and thus I'd like to minimize the large transfers that would result from splitting my data into batches.

(If you remember my prior postings, I was talking about a new graphics card and promised some timings. I have the hardware now, but not the time to do a nicely measured timing series. So far, however, I can already say the thing is a real beast :)

Cheers,
Fabian

On 10/26/11, Tiziano Zito wrote:
> - SFA node after expansion: input_dim = mdp.nodes._expanded_dim(2,240) = 29160
>   covariance matrix = 29160x29160
From: Niko Wilbert <mail@ni...> - 2011-10-26 10:29:50

Hi Fabian,

it sounds like your problem is that you feed in all the data at once. With a smaller chunk size it shouldn't be a problem (e.g., we used 200 or 500 data points per chunk in our object recognition work). But of course you are right that CloneLayer will only use a single instance of each SFANode.

Note that you can also use hinet.show_flow to get a quick impression of the network dimensions (there is even a switch to display the rough node size).

Cheers,
Niko
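A minimal sketch of feeding data in chunks rather than all at once. The helper name iter_chunks is hypothetical (it is not part of MDP), and the chunk size of 500 is simply the value mentioned above.

```python
# Hypothetical helper: yield consecutive row blocks of a 2D array (or any
# sliceable sequence) so that only one block is handed to training at a time.
def iter_chunks(data, chunk_size):
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]

# Toy stand-in for the real data matrix: 1200 "rows".
rows = list(range(1200))
sizes = [len(chunk) for chunk in iter_chunks(rows, 500)]
print(sizes)  # [500, 500, 200]
```

With an MDP node one would presumably call node.train(chunk) once per chunk and node.stop_training() at the end; how chunks are best passed to a whole flow should be checked against the MDP documentation.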
From: Tiziano Zito <tiziano.zito@bc...> - 2011-10-26 10:20:17

> However, I still don't quite understand: Since I'm using a clone layer, isn't
> there just a single node, which is reused / retrained over and over again
> during the training of the clone layer?
>
> "each SFANode during training has to keep in memory the covariance matrix"
>
> That I fully understand, but since there should be only one SFA node, there
> should be only one covariance matrix (which gets big, true)? So after a clone
> node was trained, the next clone node is trained - which however is the exact
> same node, and thus does not require its own covariance matrix?

you are right, but look at the numbers:

- first SFA node: input_dim=240, cov_mat=240x240=57600 items
    for dtype('f') -> numpy.dtype('f').itemsize*240*240/(1024*1024.) = 0.21M
    for dtype('d') -> numpy.dtype('d').itemsize*240*240/(1024*1024.) = 0.43M
  ok, this is small
- SFA node after expansion: input_dim = mdp.nodes._expanded_dim(2,240) = 29160
  covariance matrix = 29160x29160:
    (numpy.dtype('f').itemsize*29160*29160)/(1024*1024.) = 3243.7M
    (numpy.dtype('d').itemsize*29160*29160)/(1024*1024.) = 6487.3M
  which is already quite big

now add to that the full input data, some overhead, the fact that (if you don't have scipy) the eigenvalue decomposition casts the single precision matrix to a double precision one, and I think you can easily explain the 8G.
unless I seriously misunderstood your setup, that is :)

ciao,
tiziano
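The arithmetic above can be reproduced without numpy. The sketch below is only an illustration: the helper name cov_matrix_mib is made up and is not part of MDP. It assumes a dense square matrix with 4-byte (float32) or 8-byte (float64) items; 2144 is the reduced dimension Fabian mentions elsewhere in the thread.

```python
# Hypothetical helper (not part of MDP): size of a dense dim x dim
# covariance matrix in MiB, for a given item size in bytes.
def cov_matrix_mib(dim, itemsize):
    return itemsize * dim * dim / (1024 * 1024.)

FLOAT32, FLOAT64 = 4, 8  # same values as numpy.dtype('f') / ('d') itemsize

for dim in (240, 2144, 29160):
    print("dim=%5d  float32: %8.1f MiB  float64: %8.1f MiB"
          % (dim, cov_matrix_mib(dim, FLOAT32), cov_matrix_mib(dim, FLOAT64)))
```

The cost grows with the square of the input dimension, which is why reducing 240 channels to 64 before the expansion makes such a difference.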
From: Fabian Schoenfeld <fabian.schoenfeld@in...> - 2011-10-26 10:19:42

Hi,

you're right of course, but I left a bit out in order to make the original post more readable. The numbers are all roughly correct, since the first SFA node within one of the SFA* nodes has an output dimension of only 64, which after expansion becomes a still-manageable 2144.

Cheers,
Fabian

On 10/26/11, Pietro Berkes wrote:
> So the second layer of SFANodes has to compute a 29160x29160 matrix,
> which is *huge*.
From: Pietro Berkes <berkes@ga...> - 2011-10-26 10:13:21

Hi Fabian!

After the quadratic expansion, 80x3 becomes 29160:

In [5]: mdp.nodes.QuadraticExpansionNode(input_dim=240)
Out[5]: QuadraticExpansionNode(input_dim=240, output_dim=29160, dtype=None)

So the second layer of SFANodes has to compute a 29160x29160 matrix, which is *huge*.

P.

On Wed, Oct 26, 2011 at 10:54 AM, Fabian Schoenfeld <fabian.schoenfeld@...> wrote:
> Hi,
>
> thanks for the fast answer!
>
> However, I still don't quite understand: Since I'm using a clone layer, isn't
> there just a single node, which is reused / retrained over and over again
> during the training of the clone layer?
>
> "each SFANode during training has to keep in memory the covariance matrix"
>
> That I fully understand, but since there should be only one SFA node, there
> should be only one covariance matrix (which gets big, true)? So after a clone
> node was trained, the next clone node is trained - which however is the exact
> same node, and thus does not require its own covariance matrix?
>
> Cheers,
> Fabian
>
> On 10/26/11, Tiziano Zito wrote:
>>
>> hi fabian,
>>
>> keep in mind that each SFANode during training has to keep in memory
>> the covariance matrix of the input data, which can become quite huge
>> if you have a large number of dimensions. after expansion the
>> covariance matrix of the expanded data is even larger, so I am
>> absolutely not surprised that you land at 8GB. the covariance
>> matrices are deleted on stop_training. to significantly reduce the
>> amount of memory you need you can: 1) send the data in chunks and
>> not in one shot, 2) insert a PCANode with automatic or fixed
>> dimensionality reduction right after the expansion node: you really
>> don't/shouldn't care about dimensions which explain 0.0001% of the
>> total variance.
>>
>> ciao,
>> tiziano
>>
>> On Wed 26 Oct, 11:32, Fabian Schoenfeld wrote:
>> > Hi!
>> >
>> > I have a question regarding the inner memory workings of some of the mdp hinet
>> > modules. Here's my scenario: I'm training several clone layers of SFA* nodes -
>> > it's essentially an expanded version of the hinet tutorial - where each SFA*
>> > node is a flow node, containing an SFA node, a quadratic expansion, and
>> > another SFA node:
>> >
>> >     SFA* flow node:
>> >     Input -> [ SFANode -> Quadr. Exp. Node -> SFANode ] -> Output
>> >
>> > My data matrix is a simple 2D numpy array of (current test case) dimension
>> > 15k x 38k, which is of size ~549.3 MB. The data gets routed through a hinet
>> > switchboard into a clone layer of around 500 clones/copies(?) of a single SFA*
>> > node:
>> >
>> >     Output
>> >       ^
>> >     Clone layer of SFA* nodes
>> >       ^
>> >     Switchboard
>> >       ^
>> >     Data
>> >
>> > Now my actual question is how much memory this setup should require. When
>> > training the network, it takes about 8.5 GB of space - and I don't really see
>> > why. Of course it needs to hold the original data, that's about 0.5 GB. The
>> > switchboard just does the routing, so it shouldn't use any significant amount of
>> > memory at all. The clone layer probably does, but since there is only one 'real'
>> > SFA* node, the clone nodes should have to be trained separately, i.e., only the
>> > footprint of a single SFA* node should be observed..?
>> >
>> > A single SFA* node operates on an image patch of dimension 10x8 (x3 for color),
>> > which means it has to deal with a data matrix of 15k x 80x3, that's about 3MB.
>> > Then there's the quadratic expansion, and the second SFA node within the SFA*
>> > node has to deal with a data matrix of about 30MB of size.
>> >
>> > Again: it's a clone layer, so when training the layer, all nodes should be
>> > trained one after the other, and each node works on about 33MB of raw data.
>> > And here, I presume, my error of thinking lies.
>> >
>> > As far as I can tell, my memory footprint should consist of about the following:
>> >
>> >     0.5 GB (raw data)
>> >     3 MB (data the first SFA node within a SFA* node has to deal with)
>> >     33 MB (data the second SFA node within an SFA* node has to deal with)
>> >     ?? MB (overhead memory to maintain all active structures)
>> >
>> > that yields about 533 MB + overhead, which is nowhere near the 8.5 GB of memory
>> > that the system tells me is used by the python process.
>> >
>> > Since I plan to work on a much larger dataset further down the line, I would be
>> > very thankful for any clarification on this matter! (And I hope I presented my
>> > case in a somewhat easy to read manner.)
>> >
>> > Regards,
>> > Fabian
>> >
>> > (On reading through, I realized I calculated with one-byte values, which of
>> > course makes no sense. However, when calculating with (32-bit) floats instead,
>> > that's only a factor of four, which still is far away from the observed 8.5 GB.)
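The output dimension reported by QuadraticExpansionNode can be checked by hand. The sketch below assumes the expansion keeps all n linear terms plus all distinct pairwise products x_i*x_j with i <= j, which is consistent with the 29160 shown above; the helper name is made up and is not an MDP function.

```python
# Hypothetical helper (not MDP code): number of monomials of degree 1 and 2
# in n variables, i.e. n linear terms plus n*(n+1)/2 products x_i*x_j, i <= j.
def quadratic_expanded_dim(n):
    return n + n * (n + 1) // 2

print(quadratic_expanded_dim(240))  # 29160, the output_dim for input_dim=240
print(quadratic_expanded_dim(64))   # 2144
```

The second value matches the 2144 channels Fabian reports elsewhere in the thread for an expansion that follows a reduction to 64 signals.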
From: Fabian Schoenfeld <fabian.schoenfeld@in...> - 2011-10-26 09:55:41

Hi,

thanks for the fast answer! However, I still don't quite understand: Since I'm using a clone layer, isn't there just a single node, which is reused / retrained over and over again during the training of the clone layer?

"each SFANode during training has to keep in memory the covariance matrix"

That I fully understand, but since there should be only one SFA node, there should be only one covariance matrix (which gets big, true)? So after a clone node was trained, the next clone node is trained - which however is the exact same node, and thus does not require its own covariance matrix?

Cheers,
Fabian
From: Tiziano Zito <tiziano.zito@bc...> - 2011-10-26 09:47:47

hi fabian,

keep in mind that each SFANode during training has to keep in memory the covariance matrix of the input data, which can become quite huge if you have a large number of dimensions. after expansion the covariance matrix of the expanded data is even larger, so I am absolutely not surprised that you end up at 8 GB. the covariance matrices are deleted on stop_training. to significantly reduce the amount of memory you need you can:

1) send the data in chunks and not in one shot,
2) insert a PCANode with automatic or fixed dimensionality reduction right after the expansion node: you really don't/shouldn't care about dimensions which explain 0.0001% of the total variance.

ciao,
tiziano
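The first suggestion, sending the data in chunks, works because a covariance matrix can be built from running sums, so the memory needed depends on the input dimension rather than on how many samples are held at once. The following is a minimal plain-Python sketch of that idea under that assumption; it is not MDP's own covariance code, and the names `accumulate` and `covariance` are made up for illustration.

```python
def accumulate(chunks, dim):
    """Collect running sums over an iterable of chunks (lists of samples)."""
    n = 0
    sum_x = [0.0] * dim                          # running sum of x
    sum_xx = [[0.0] * dim for _ in range(dim)]   # running sum of x * x^T
    for chunk in chunks:                         # only one chunk is needed at a time
        for x in chunk:
            n += 1
            for i in range(dim):
                sum_x[i] += x[i]
                for j in range(dim):
                    sum_xx[i][j] += x[i] * x[j]
    return n, sum_x, sum_xx


def covariance(n, sum_x, sum_xx):
    """Turn the running sums into an unbiased covariance matrix."""
    dim = len(sum_x)
    mean = [s / n for s in sum_x]
    return [[(sum_xx[i][j] - n * mean[i] * mean[j]) / (n - 1)
             for j in range(dim)] for i in range(dim)]


data = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 6.0], [6.0, 4.0]]
one_shot = covariance(*accumulate([data], dim=2))
chunked = covariance(*accumulate([data[:2], data[2:5], data[5:]], dim=2))
```

Both calls give the same matrix, whether the six samples arrive in one piece or in three. With MDP itself the usual way to get this effect is to call a node's `train` method several times on successive slices of the data.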
From: Fabian Schoenfeld <fabian.schoenfeld@in...> - 2011-10-26 09:33:23

Hi!

I have a question regarding the inner memory workings of some of the mdp hinet modules. Here's my scenario: I'm training several clone layers of SFA* nodes - it's essentially an expanded version of the hinet tutorial - where each SFA* node is a flow node, containing an SFA node, a quadratic expansion, and another SFA node:

    SFA* flow node:
    Input -> [ SFANode -> Quadr. Exp. Node -> SFANode ] -> Output

My data matrix is a simple 2D numpy array of (current test case) dimension 15k x 38k, which is of size ~549.3 MB. The data gets routed through a hinet switchboard into a clone layer of around 500 clones/copies(?) of a single SFA* node:

                ^
    Clone layer of SFA* nodes
                ^
           Switchboard
                ^
               Data

Now my actual question is how much memory this setup should require. When training the network, it takes about 8.5 GB of space - and I don't really see why. Of course it needs to hold the original data, that's about 0.5 GB. The switchboard just does the routing, so it shouldn't use any significant amount of memory at all. The clone layer probably does, but since there is only one 'real' SFA* node, the clone nodes should have to be trained separately, i.e., only the footprint of a single SFA* node should be observed..?

A single SFA* node operates on an image patch of dimension 10x8 (x3 for color), which means it has to deal with a data matrix of 15k x 80x3, that's about 3 MB. Then there's the quadratic expansion, and the second SFA node within the SFA* node has to deal with a data matrix of about 30 MB.

Again: it's a clone layer, so when training the layer, all nodes should be trained one after the other, and each node works on about 33 MB of raw data. And here, I presume, my error of thinking lies.

As far as I can tell, my memory footprint should consist of about the following:

    0.5 GB (raw data)
    3 MB (data the first SFA node within an SFA* node has to deal with)
    33 MB (data the second SFA node within an SFA* node has to deal with)
    ?? MB (overhead memory to maintain all active structures)

that yields about 533 MB + overhead, which is nowhere near the 8.5 GB of memory that the system tells me is used by the python process.

Since I plan to work on a much larger dataset further down the line, I would be very thankful for any clarification on this matter! (And I hope I presented my case in a somewhat easy to read manner.)

Regards,
Fabian

(On reading through, I realized I calculated with one-byte values, which of course makes no sense. However, when calculating with (32-bit) floats instead, that's only a factor of four, which still is far away from the observed 8.5 GB.)
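The estimate in this post can be reproduced with a few lines of arithmetic. The sketch below rests on assumptions that are not stated in the message itself: the second data dimension is taken as 38400 (which is what gives ~549.3 MB at one byte per value), the first SFA node is taken to keep 64 outputs (a figure Fabian gives later in the thread), and a quadratic expansion of n inputs is taken to produce n + n(n+1)/2 outputs.

```python
MIB = 1024.0 ** 2


def mib(n_values, bytes_per_value=1):
    """Size in MiB of an array holding n_values entries."""
    return n_values * bytes_per_value / MIB


def quadratic_dim(n):
    """Linear terms plus all distinct products x_i * x_j with i <= j."""
    return n + n * (n + 1) // 2


samples = 15000
raw = mib(samples * 38400)                       # full data matrix, one byte per value
patch = mib(samples * 10 * 8 * 3)                # input of one SFA* node (10x8 patch, 3 colors)
expanded_dim = quadratic_dim(64)                 # assumes 64 outputs from the first SFANode
expanded = mib(samples * expanded_dim)           # input of the second SFANode
cov = mib(expanded_dim ** 2, bytes_per_value=4)  # one float32 covariance matrix

print("raw data      %.1f MiB" % raw)
print("patch data    %.1f MiB" % patch)
print("expanded data %.1f MiB (%d channels)" % (expanded, expanded_dim))
print("covariance    %.1f MiB" % cov)
```

Under these assumptions the stored arrays come out close to the figures in the post (about 549 MiB, 3.4 MiB and 30.7 MiB), so the stored data alone does not explain 8.5 GB. Note also that the list in the post gives 33 MB for the second node, while its total of 533 MB corresponds to 3 MB + 30 MB on top of the raw data.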
From: Tiziano Zito <tiziano.zito@bc...> - 2011-10-24 13:59:14

We are glad to announce release 3.2 of the Modular toolkit for Data Processing (MDP).

MDP is a Python library of widely used data processing algorithms that can be combined according to a pipeline analogy to build more complex data processing software. The base of available algorithms includes signal processing methods (Principal Component Analysis, Independent Component Analysis, Slow Feature Analysis), manifold learning methods ([Hessian] Locally Linear Embedding), several classifiers, probabilistic methods (Factor Analysis, RBM), data preprocessing methods, and many others.

What's new in version 3.2?

- improved sklearn wrappers
- update sklearn, shogun, and pp wrappers to new versions
- do not leave temporary files around after testing
- refactoring and cleaning up of HTML exporting features
- improve export of signature and docstring to public methods
- fixed and updated FastICANode to closely resemble the original Matlab version (thanks to Ben Willmore)
- support for new numpy version
- new NeuralGasNode (thanks to Michael Schmuker)
- several bug fixes and improvements

We recommend all users to upgrade.

Resources

Download: http://sourceforge.net/projects/mdp-toolkit/files
Homepage: http://mdp-toolkit.sourceforge.net
Mailing list: http://lists.sourceforge.net/mailman/listinfo/mdp-toolkit-users

Acknowledgments

We thank the contributors to this release: Michael Schmuker, Ben Willmore.

The MDP developers,
Pietro Berkes
Zbigniew Jędrzejewski-Szmek
Rike-Benjamin Schuppner
Niko Wilbert
Tiziano Zito
From: Pietro Berkes <berkes@ga...> - 2011-10-19 15:57:57

Dear Noam,

there is no implementation of GAs or simulated annealing in MDP at the moment...

Best,
Pietro
From: Noam Peled <peled.noam@gm...> - 2011-10-19 14:36:49

Hello MDP users,

Does anyone know if I can use MDP with genetic algorithms or simulated annealing?

Thanks!
From: Fabian Schoenfeld <fabian.schoenfeld@in...> - 2011-10-19 13:29:26

Ah of course, that makes a lot of sense. Thanks :)
From: Tiziano Zito <tiziano.zito@bc...> - 2011-10-19 13:13:08

> Just a very quick question about the internal expansion of the SFA
> nodes: Where do I set the degree of the expansion?
>
> I.e., when using a SFA2Node, the signals go through quadratic
> expansion. Can this degree be set arbitrarily, or is the SFANode
> only there to do linear SFA, while the SFA2Node can be used for
> quadratic SFA, and any higher degrees are simply not supported?

for higher degrees you simply build a flow. for example, to expand
in the space of polynomials of degree 6:

    flow = mdp.nodes.PolynomialExpansionNode(6) + mdp.nodes.SFANode()

there are also other expansion nodes available, for example:
mdp.nodes.RBFExpansionNode

ciao,
tiziano
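Before raising the degree, it can help to check how quickly the expanded dimension grows, since the SFANode that follows the expansion has to handle a covariance matrix with that many rows and columns. A small sketch, assuming the expansion contains every monomial of degree 1 up to the chosen degree; the helper name `expanded_dim` is used here for illustration only.

```python
from math import factorial


def expanded_dim(degree, n_inputs):
    """Number of monomials of degree 1..degree in n_inputs variables."""
    # C(n + d, d) counts all monomials up to degree d, including the constant 1,
    # which is subtracted again at the end.
    total = factorial(n_inputs + degree) // (factorial(degree) * factorial(n_inputs))
    return total - 1


for degree in (1, 2, 3, 6):
    print(degree, expanded_dim(degree, 10))
```

Under this assumption, 10 input channels already expand to 8007 dimensions at degree 6, so a dimensionality reduction step before or after the expansion is usually worth considering.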
From: Fabian Schoenfeld <fabian.schoenfeld@in...> - 2011-10-19 13:07:44

Hi!

Just a very quick question about the internal expansion of the SFA nodes: Where do I set the degree of the expansion?

I.e., when using an SFA2Node, the signals go through quadratic expansion. Can this degree be set arbitrarily, or is the SFANode only there to do linear SFA, while the SFA2Node can be used for quadratic SFA, and any higher degrees are simply not supported?

Cheers,
Fabian
From: Pietro Berkes <berkes@ga...> - 2011-10-18 07:22:33

Hi Alejandro!

it looks like you are missing the python numerical libraries. The easiest solution is to install one of the scientific python bundles, either Python(x,y)

http://www.pythonxy.com/

or EPD

http://www.enthought.com/products/getepd.php

(there is an 'academic' and a 'free' version button on the upper right side of the page).

P.
From: Alejandro Coca Castro <acocac@gm...> - 2011-10-17 22:04:17

Zbyszek, sorry, but my OS is Windows 7, so how can I check if numpy and scipy are installed?

Best,

--
Alejandro Coca
UN
From: Zbigniew Jędrzejewski-Szmek <zbyszek@in...> - 2011-10-17 21:51:06

On 10/17/2011 11:43 PM, Alejandro Coca Castro wrote:
> Hi, i want to run the GNG example
> (http://mdp-toolkit.sourceforge.net/examples/gng/gng.html#gng),
> however when i run the python script, i have the next errror:
>
> File "C:\Python25\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py",
> line 325, in RunScript
> exec codeObject in __main__.__dict__
> File "C:\Users\ALEJANDRO\Desktop\GNG\gng1.py", line 4, in <module>
> import mdp
> File "C:\Python25\Lib\site-packages\mdp\__init__.py", line 118, in <module>
> numx_rand, numx_version) = configuration.get_numx()
> File "C:\Python25\Lib\site-packages\mdp\configuration.py", line 169,
> in get_numx
> raise ImportError(msg)
> ImportError: Could not import any of the numeric backends.
> Import errors:
> scipy: No module named scipy
> numpy: No module named numpy

Hi Alejandro,

it's like it says: you probably have no numpy and no scipy installed.

Please first try to run

    python -c 'import numpy; print numpy.version.version'

Best,
Zbyszek
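On Windows, `cmd.exe` does not treat single quotes as quoting characters, so a one-line command written that way tends to fail there. One alternative is to save a short script and run it with the same interpreter that runs the MDP example. The sketch below uses only built-in features; `check_backends` is a hypothetical helper name, not part of MDP.

```python
def check_backends(names=("numpy", "scipy")):
    """Map each module name to its version string, or None if it cannot be imported."""
    status = {}
    for name in names:
        try:
            module = __import__(name)
        except ImportError:
            status[name] = None
        else:
            status[name] = getattr(module, "__version__", "unknown")
    return status


if __name__ == "__main__":
    for name, version in sorted(check_backends().items()):
        print("%s: %s" % (name, version if version else "NOT installed"))
```

If either line reports "NOT installed", the `import mdp` step will fail with the ImportError shown in the traceback, because MDP needs at least one of the two numeric backends.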
From: Alejandro Coca Castro <acocac@gm...> - 2011-10-17 21:43:14

Hi, I want to run the GNG example
(http://mdp-toolkit.sourceforge.net/examples/gng/gng.html#gng),
however when I run the python script, I get the following error:

  File "C:\Python25\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py", line 325, in RunScript
    exec codeObject in __main__.__dict__
  File "C:\Users\ALEJANDRO\Desktop\GNG\gng1.py", line 4, in <module>
    import mdp
  File "C:\Python25\Lib\site-packages\mdp\__init__.py", line 118, in <module>
    numx_rand, numx_version) = configuration.get_numx()
  File "C:\Python25\Lib\site-packages\mdp\configuration.py", line 169, in get_numx
    raise ImportError(msg)
ImportError: Could not import any of the numeric backends.
Import errors:
scipy: No module named scipy
numpy: No module named numpy

I have already installed the MDP toolkit on my system (Windows 7).

If someone can help me, I'll be grateful.

Best regards,

--
Alejandro Coca
UN
From: Fabian Schoenfeld <fabian.schoenfeld@in...> - 2011-10-17 10:12:45

Ah, excellent, that's what I needed to know. Since multiplication access is that centralized, I'll definitely keep it on my todo list. As long as it's merely a "bonus feature", however, it probably will take some time until I get around to it. But good to know that implementing it should be pretty straightforward (famous last words).

Cheers,
Fabian
From: Tiziano Zito <tiziano.zito@bc...> - 2011-10-17 09:58:23

> One other thing, and thereby going back to MDP: My current plan is
> to very selectively insert CUDA calls wherever I need them,
> instead of completely replacing the core matrix multiplication
> with a CUBLAS call. The main reason for that is: I don't know how
> I would do this - fix something deep within Numpy, I suppose. Any
> ideas for that? Ideally, one would simply set a certain flag (at
> runtime), and basic matrix operations would be transparently relayed
> to CUBLAS.. but at the moment, I'm having a hard time guessing
> how much of a hassle this would be to implement.

almost all matrix multiplications in MDP are done using the internal
function mdp.utils.mult, which right now is aliased to mdp.numx.dot [1],
i.e. numpy.dot. You can write an extension that links your own CUBLAS
thing to mdp.utils.mult, taking care of relinking it to the
original value once you disable the extension, i.e. when you exit
the context manager. I am not sure you would get a good speed up in
all cases, though. it probably only makes sense if the matrices are
big enough, so you may want to define a function which dynamically
chooses GPU or CPU depending on the size. also note that numpy.dot
is also able to perform vector-matrix and vector-vector multiplication,
so you need to take care of these cases too if you don't want to get
wrong size exceptions at some random points during calculation.

ciao,
tiziano

[1] mdp/utils/__init__.py line 36.
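The approach described in this reply (swap in a different multiplication function, restore the original afterwards, and use the GPU only for large matrices) might be sketched as follows. Everything here is a stand-in: `utils` mimics a module with a `mult` attribute such as `mdp.utils`, `cpu_mult` plays the role of `numpy.dot`, and `fake_gpu_mult` is a hypothetical placeholder for a CUBLAS call. The sketch uses a plain context manager rather than MDP's extension mechanism, and the threshold value is arbitrary.

```python
from contextlib import contextmanager
from types import SimpleNamespace

calls = []  # records every call routed to the (fake) GPU backend


def cpu_mult(a, b):
    """Plain-Python product of two 2-D lists; stands in for numpy.dot."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def fake_gpu_mult(a, b):
    """Hypothetical placeholder for a CUBLAS-backed product."""
    calls.append((len(a), len(b[0])))
    return cpu_mult(a, b)


def is_matrix(m):
    return isinstance(m, list) and bool(m) and isinstance(m[0], list)


def make_dispatcher(cpu, gpu, min_ops):
    """Return a mult() that uses gpu() only for large matrix-matrix products."""
    def mult(a, b):
        if is_matrix(a) and is_matrix(b):
            ops = len(a) * len(b) * len(b[0])   # multiply-adds in the product
            if ops >= min_ops:
                return gpu(a, b)
        return cpu(a, b)                        # vector inputs and small matrices
    return mult


@contextmanager
def patched_mult(namespace, replacement):
    """Temporarily replace namespace.mult and restore it on exit."""
    original = namespace.mult
    namespace.mult = replacement
    try:
        yield
    finally:
        namespace.mult = original


utils = SimpleNamespace(mult=cpu_mult)          # stand-in for a module like mdp.utils
dispatch = make_dispatcher(cpu_mult, fake_gpu_mult, min_ops=100)
small = [[1, 2], [3, 4]]
big = [[1] * 5 for _ in range(5)]

with patched_mult(utils, dispatch):
    small_result = utils.mult(small, small)     # 8 multiply-adds: stays on the CPU path
    big_result = utils.mult(big, big)           # 125 multiply-adds: goes to the GPU path
restored = utils.mult is cpu_mult
```

The `try`/`finally` in `patched_mult` is what guarantees the relinking mentioned in the reply, even if an exception is raised inside the `with` block. A sensible threshold would have to be found by benchmarking on the actual hardware.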
From: Fabian Schoenfeld <fabian.schoenfeld@in...> - 2011-10-17 09:46:40

Sure can do :)

To get a sense of perspective, I can post a simple timing series of the basic approach (i.e., a slowish CUDA card and an unoptimized Numpy):

"""[ Results for 100.000 x 3320 run ]

Setup data.. done
Python multiplication (float32): 1130.99ms
Python multiplication (float64): 1383.57ms
CUBLAS multiplication (float32): 209.98ms

Python/Numpy float64 vs float32
abs err: 76867372002.3
avg err: 6973.74183502

Python/Numpy float64 vs cublas
abs err: 2553431570.56
avg err: 231.658402032

Python/Numpy float32 vs cublas
abs err: 76605617152.0
avg err: 6949.99429816

Avg result value: 1638577205.58
Numpy (float32) avg error: 0.000426%
CUBLAS (float32) avg error: 0.000014%
"""

The graphics card is a GeForce 9500 GT with 32 CUDA cores and 1 GB of memory. The matrix dimensions are ~100k x 3k and the result is a ~3k x 3k square. Note that the CUDA computation seems to produce a far more correct result with 32-bit floats than Numpy does - I wonder whether ATLAS would make a difference here (or I just screwed up my float comparison ;).

I also tested the same computation on a GTX 260, which works on 192 CUDA cores, and takes about three seconds to complete the same multiplication. So I'm now going for a new GTX 580, and then get the CUBLAS calls integrated with the MDP module. Looking forward to the action :)

One other thing, and thereby going back to MDP: My current plan is to very selectively insert CUDA calls wherever I need them, instead of completely replacing the core matrix multiplication with a CUBLAS call. The main reason for that is: I don't know how I would do this - fix something deep within Numpy, I suppose. Any ideas for that? Ideally, one would simply set a certain flag (at runtime), and basic matrix operations would be transparently relayed to CUBLAS.. but at the moment, I'm having a hard time guessing how much of a hassle this would be to implement.

Cheers,
Fabian

On 10/17/11, Tiziano Zito wrote:
> Even if this is not really MDP related, please keep us posted on
> your benchmarks: I am sure most people here would be interested in
> seeing the results ;)
>
> ciao,
> Tiziano
>
>
> On Mon 17 Oct, 10:14, Fabian Schoenfeld wrote:
> > Judging from your reactions, I will definitely have a go at ATLAS.
> > However, it will still most likely be for benchmarking and
> > comparison, as the matrices I have to deal with get pretty big: At
> > the moment I multiply 100k x 3k matrices together, and via Numpy
> > this not only takes a lot of time (which ATLAS might solve), but
> > also a LOT of memory (several gigabytes more than the pure storage
> > of the data requires, which I don't fully understand, and I'm not
> > sure whether ATLAS would have a significantly smaller memory
> > footprint).
> >
> > The overall procedure requires more than a thousand of these
> > multiplications, and so I'm really looking for the fastest way to
> > do this. As I see it, CUDA should be by far the best option here,
> > as it's perfectly suited to speed up matrix multiplications. Also,
> > with the use of CUBLAS, it's not really that complicated (once
> > CUDA runs, that is), and I'm assuming that since CUBLAS was
> > directly made by the CUDA team (AFAIK), it should be VERY good at
> > what it does - and again, matrix multiplication is simply
> > embarrassingly parallel, so it should really pack a punch.
> >
> > Still, I appreciate your hints, it's always useful to consider
> > different angles. I don't really see an alternative to CUDA/CUBLAS
> > at the moment, however.
> >
> > Cheers, Fabian
> >
> > On 10/14/11, Zbigniew Jędrzejewski-Szmek wrote:
> > > On 10/14/2011 04:22 PM, Fabian Schoenfeld wrote:
> > > >Hi!
> > > >
> > > >Thank you both Tiziano and Zbyszek for your helpful answers and
> > > >loads of useful links! It will probably take me some time to
> > > >dig through all this, but looks like I might use a lot of it :)
> > > >
> > > >At the moment I'm probably working with vanilla Numpy (I didn't
> > > >compile it myself), but since CUBLAS will probably be faster
> > > >than an optimized Numpy anyways, I think I'll skip a custom
> > > >compilation over ATLAS. Still appreciate the hint, however, as
> > > >I didn't know it makes that much of a difference.
> > > Hi, I think you're falling into the premature optimization trap.
> > > Installing a custom ATLAS is quite simple. If you use debian,
> > > then you need to download the source package, run 'fakeroot
> > > debian/rules custom' (it is described in the readme, I don't
> > > remember the exact details), and then install the package. If
> > > you're using the numpy package, it'll automatically start to use
> > > the replacement ATLAS. This gives you a potentially significant
> > > speedup with very little work. Only then should you start
> > > looking at more complicated solutions.
> > >
> > > Best, Zbyszek
> > >
> > > >Also, a note about pycublas: it doesn't work. This was the
> > > >first thing I tried, but with the current version of CUDA, it
> > > >only moans about missing symbols and even when those are
> > > >cleared starts to produce segfaults. It's nice to get a sense
> > > >of what needs to be done in order to use CUBLAS, but the code
> > > >posted does NOT work (anymore) when simply copied and pasted.
> > > >
> > > >Cheers! Fabian
From: Tiziano Zito <tiziano.zito@bc...> - 2011-10-17 08:50:46

Even if this is not really MDP related, please keep us posted on your benchmarks: I am sure most people here would be interested in seeing the results ;)

ciao,
Tiziano

On Mon 17 Oct, 10:14, Fabian Schoenfeld wrote:
> Judging from your reactions, I will definitely have a go at ATLAS.
> However, it will still most likely be for benchmarking and
> comparison, as the matrices I have to deal with get pretty big: At
> the moment I multiply 100k x 3k matrices together, and via Numpy
> this not only takes a lot of time (which ATLAS might solve), but
> also a LOT of memory (several gigabytes more than the pure storage
> of the data requires, which I don't fully understand, and I'm not
> sure whether ATLAS would have a significantly smaller memory
> footprint).
>
> The overall procedure requires more than a thousand of these
> multiplications, and so I'm really looking for the fastest way to
> do this. As I see it, CUDA should be by far the best option here,
> as it's perfectly suited to speed up matrix multiplications. Also,
> with the use of CUBLAS, it's not really that complicated (once
> CUDA runs, that is), and I'm assuming that since CUBLAS was
> made directly by the CUDA team (AFAIK), it should be VERY good at
> what it does - and again, matrix multiplication is simply
> embarrassingly parallel, so it should really pack a punch.
>
> Still, I appreciate your hints; it's always useful to consider
> different angles. I don't really see an alternative to CUDA/CUBLAS
> at the moment, however.
>
> Cheers, Fabian
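For anyone who wants to run the kind of comparison asked for above, a minimal timing sketch could look like the following. It is not from the thread: the function name `time_matmul`, the matrix sizes and the repeat count are placeholders chosen for illustration. It only times `numpy.dot` on the CPU, so it says nothing about CUBLAS by itself, but running it before and after switching the BLAS library gives comparable numbers.

```python
import time

import numpy as np


def time_matmul(m, k, n, dtype=np.float32, repeats=3, seed=0):
    """Return (best_seconds, gflops) for an (m x k) by (k x n) product.

    Hypothetical helper for illustration; sizes and dtype are up to the caller.
    """
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((m, k)).astype(dtype)
    b = rng.standard_normal((k, n)).astype(dtype)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        np.dot(a, b)
        best = min(best, time.perf_counter() - start)
    # A dense product needs roughly 2*m*k*n floating-point operations.
    gflops = 2.0 * m * k * n / best / 1e9
    return best, gflops


if __name__ == "__main__":
    # Prints the BLAS/LAPACK libraries this NumPy build was linked against.
    np.show_config()
    seconds, gflops = time_matmul(2000, 3000, 2000)
    print(f"best of 3: {seconds:.3f} s, about {gflops:.1f} GFLOP/s")
```

Taking the best of a few repeats is a deliberate choice here: it reduces the influence of other load on the machine, at the cost of hiding run-to-run variance.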
From: Fabian Schoenfeld <fabian.schoenfeld@in...> - 2011-10-17 08:14:48

Judging from your reactions, I will definitely have a go at ATLAS. However, it will still most likely be for benchmarking and comparison, as the matrices I have to deal with get pretty big: At the moment I multiply 100k x 3k matrices together, and via Numpy this not only takes a lot of time (which ATLAS might solve), but also a LOT of memory (several gigabytes more than the pure storage of the data requires, which I don't fully understand, and I'm not sure whether ATLAS would have a significantly smaller memory footprint).

The overall procedure requires more than a thousand of these multiplications, and so I'm really looking for the fastest way to do this. As I see it, CUDA should be by far the best option here, as it's perfectly suited to speed up matrix multiplications. Also, with the use of CUBLAS, it's not really that complicated (once CUDA runs, that is), and I'm assuming that since CUBLAS was made directly by the CUDA team (AFAIK), it should be VERY good at what it does - and again, matrix multiplication is simply embarrassingly parallel, so it should really pack a punch.

Still, I appreciate your hints; it's always useful to consider different angles. I don't really see an alternative to CUDA/CUBLAS at the moment, however.

Cheers, Fabian

On 10/14/11, Zbigniew Jędrzejewski-Szmek wrote:
> On 10/14/2011 04:22 PM, Fabian Schoenfeld wrote:
> >Hi!
> >
> >Thank you both Tiziano and Zbyszek for your helpful answers and loads of useful
> >links! It will probably take me some time to dig through all this, but looks
> >like I might use a lot of it :)
> >
> >At the moment I'm probably working with vanilla Numpy (I didn't compile it
> >myself), but since CUBLAS will probably be faster than an optimized Numpy
> >anyways, I think I'll skip a custom compilation over ATLAS. Still appreciate
> >the hint, however, as I didn't know it makes that much of a difference.
> Hi,
> I think you're falling into the premature optimization trap. Installing a
> custom ATLAS is quite simple. If you use debian, then you need to download
> the source package, run 'fakeroot debian/rules custom' (it is described in
> the readme, I don't remember the exact details), and then install the
> package. If you're using the numpy package, it'll automatically start to
> use the replacement ATLAS. This gives you a potentially significant speedup
> with very little work. Only then should you start looking at more
> complicated solutions.
>
> Best,
> Zbyszek
>
> >Also, a note about pycublas: it doesn't work. This was the first thing I tried,
> >but with the current version of CUDA, it only moans about missing symbols and
> >even when those are cleared starts to produce segfaults. It's nice to get a
> >sense of what needs to be done in order to use CUBLAS, but the code posted does
> >NOT work (anymore) when simply copied and pasted.
> >
> >Cheers!
> >Fabian
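As a rough illustration of the memory question raised in this message (again, not something posted in the thread), the sketch below first computes the plain storage cost of a 100k x 3k array and then accumulates a product over blocks of rows. The thread does not say which shapes are actually multiplied, so the form `x.T` times `y` with both inputs having 100k rows is an assumption, and the names `array_gigabytes` and `blocked_gram` are made up for this example. NumPy can create temporary copies of its inputs, for instance when they need a dtype conversion or are not contiguous, and working on row blocks keeps any such copies small.

```python
import numpy as np


def array_gigabytes(rows, cols, dtype=np.float64):
    """Storage needed for a dense rows x cols array, in GB (10**9 bytes)."""
    return rows * cols * np.dtype(dtype).itemsize / 1e9


def blocked_gram(x, y, block_rows=10_000):
    """Compute np.dot(x.T, y) by accumulating over blocks of rows.

    Hypothetical helper: only one block of each input takes part in a
    product at a time, so temporaries are bounded by the block size.
    """
    if x.shape[0] != y.shape[0]:
        raise ValueError("x and y must have the same number of rows")
    out = np.zeros((x.shape[1], y.shape[1]), dtype=np.result_type(x, y))
    for start in range(0, x.shape[0], block_rows):
        stop = start + block_rows
        out += np.dot(x[start:stop].T, y[start:stop])
    return out


if __name__ == "__main__":
    # One 100k x 3k operand: 2.4 GB as float64, 1.2 GB as float32.
    print(array_gigabytes(100_000, 3_000, np.float64))
    print(array_gigabytes(100_000, 3_000, np.float32))
```

With two such operands in float64 the inputs alone already take 4.8 GB, so even a single extra copy of one operand accounts for "several gigabytes" on top of the raw data.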