I have just downloaded Waffles and am trying to run some experiments with the
Manifold Sculpting algorithm. Working through the examples given under
"Examples of dimensionality reduction", I can create and plot the swissroll
data set. PCA reduction works fine, but neither LLE nor Manifold Sculpting
works: after I execute the command, nothing happens. I realize that these
algorithms are more time-consuming, but in similar experiments with a Matlab
implementation of LLE, this data set was reduced within seconds. Is this a
known problem, or am I doing something wrong? I am using a PC running
Windows XP.
When you say "nothing happens", do you mean "it starts crunching and seems to
take forever"? A good way to test for poor scalability is to reduce the number
of points with which the manifold is sampled. If it is fast with 500 points
and slow with 2000 points, then the implementation scales poorly.
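To make that test a bit more concrete, here is a minimal Python sketch (not part of Waffles; `scaling_exponent` is a hypothetical helper, and the timings in the example are made up). It assumes the run time follows roughly t = c * n^p and estimates p from two measurements. An exponent near 1 would suggest near-linear scaling, while something near 3 is what one would expect from a dense eigendecomposition.

```python
import math

def scaling_exponent(n_small, t_small, n_large, t_large):
    """Estimate p in t ~ c * n**p from two (size, seconds) measurements."""
    if n_small <= 0 or n_large <= 0 or n_small == n_large:
        raise ValueError("sizes must be positive and different")
    if t_small <= 0 or t_large <= 0:
        raise ValueError("timings must be positive")
    # Taking logs of t = c * n**p at both sizes cancels the constant c.
    return math.log(t_large / t_small) / math.log(n_large / n_small)

# Hypothetical timings (not measured): 1 s at 500 points, 64 s at 2000 points.
p = scaling_exponent(500, 1.0, 2000, 64.0)
print(f"estimated exponent: {p:.2f}")
```

Two data points only give a crude estimate; timing three or four sizes and checking that the exponent stays roughly constant is more trustworthy.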
This is probably the case with our LLE implementation because we utilize a
dense-matrix eigenvector decomposition, whereas the Matlab implementation uses
a sparse-matrix eigenvector decomposition. (We have implemented a version that
calls out to another library to perform sparse eigenvector decomposition, but
we pulled it out in order to keep the dependency requirements of Waffles
minimal. I'm currently looking for a way to put it back in. This might end up
requiring me to implement a bunch of hairy algorithms--either that, or we stop
worrying about keeping our dependencies minimal.)
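To illustrate why the sparse route can be so much cheaper: iterative eigensolvers only touch the matrix through matrix-vector products, and when each row holds just a few nonzero entries (as with k-nearest-neighbor weights), one product costs on the order of n*k operations instead of n^2. The following is a minimal Python sketch of that idea using plain power iteration. It is not the Waffles code and not what Matlab does; all names are made up for illustration. It also finds the largest-magnitude eigenvalue, whereas LLE needs the smallest ones, which real solvers obtain with more elaborate Lanczos-style methods.

```python
import math
import random

def sparse_matvec(rows, x):
    """Multiply a sparse matrix by x; rows[i] is a dict {column: value}."""
    # Only the stored (nonzero) entries are visited.
    return [sum(value * x[col] for col, value in row.items()) for row in rows]

def power_iteration(rows, iterations=200, seed=0):
    """Approximate the dominant eigenpair of a symmetric sparse matrix."""
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in rows]
    for _ in range(iterations):
        y = sparse_matvec(rows, x)
        norm = math.sqrt(sum(v * v for v in y))
        if norm == 0.0:
            raise ValueError("iterate collapsed to zero; try another seed")
        x = [v / norm for v in y]
    # Rayleigh quotient of the unit-length iterate gives the eigenvalue.
    y = sparse_matvec(rows, x)
    eigenvalue = sum(a * b for a, b in zip(x, y))
    return eigenvalue, x

# Tiny symmetric example: [[2, 1], [1, 2]] has eigenvalues 3 and 1.
rows = [{0: 2.0, 1: 1.0}, {0: 1.0, 1: 2.0}]
value, vector = power_iteration(rows)
print(f"dominant eigenvalue: {value:.6f}")
```

The dict-of-rows layout is chosen only for readability; a real implementation would more likely use a compressed row format.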
Manifold Sculpting is inherently quite slow, but this is due to a large
coefficient, rather than poor scalability. With very large datasets, it
actually scales better than the Matlab implementations of LLE and Isomap.
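The difference between a large coefficient and poor scalability can be sketched with two made-up cost models of the form t = c * n^p. The Python snippet below (a hypothetical helper with hypothetical numbers, not measurements of any of these algorithms) computes the data set size at which the algorithm with the bigger coefficient but gentler growth starts to win.

```python
def crossover_size(coef_a, exp_a, coef_b, exp_b):
    """Return n where coef_a * n**exp_a equals coef_b * n**exp_b."""
    if exp_a == exp_b:
        raise ValueError("exponents must differ for a crossover to exist")
    if coef_a <= 0 or coef_b <= 0:
        raise ValueError("coefficients must be positive")
    # Solve coef_a * n**exp_a == coef_b * n**exp_b for n.
    return (coef_a / coef_b) ** (1.0 / (exp_b - exp_a))

# Hypothetical cost models (made-up numbers):
#   algorithm A: 1000 * n    (big coefficient, gentle growth)
#   algorithm B: 1 * n**2    (small coefficient, steeper growth)
n_star = crossover_size(1000.0, 1, 1.0, 2)
print(f"A becomes cheaper than B beyond roughly n = {n_star:.0f}")
```

Below the crossover the steeper algorithm still looks faster, which is why a single benchmark at one size says little about scalability.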
I repeated the experiments in the tutorial and timed how long each algorithm
took. LLE and Manifold Sculpting took about 4 and 2 minutes, respectively. (This was
on a 3GHz machine with 16GB RAM running Ubuntu 10.10, compiled with g++.
Visual C++ 2008 seems to compile my code to run about 20% slower than g++,
although this is just a vague impression--I've never actually measured it.)
mike@rib:~/tmp$ waffles_generate swissroll 2000 -cutoutstar -seed 0 > sr.arff
mike@rib:~/tmp$ time waffles_transform isomap sr.arff kdtree 14 2 > isomap.arff
real 0m21.073s
user 0m21.013s
sys 0m0.060s
mike@rib:~/tmp$ time waffles_transform lle sr.arff kdtree 14 2 > lle.arff
real 4m4.673s
user 4m4.615s
sys 0m0.064s
mike@rib:~/tmp$ time waffles_transform breadthfirstunfolding sr.arff kdtree 14 2 -reps 20 > bfu.arff
real 0m4.321s
user 0m4.312s
sys 0m0.000s
mike@rib:~/tmp$ time waffles_transform manifoldsculpting sr.arff kdtree 14 2 > ms.arff
real 2m18.115s
user 2m18.121s
sys 0m0.004s
Okay, I couldn't sleep at night knowing I had a poor implementation in my
library, so I burned a couple of days of productivity and implemented the
necessary sparse matrix decompositions. The same test with LLE that used to
take over 4 minutes now runs in less than one-third of a second--much better!
mike@rib:~/tmp$ time waffles_plot scatter lle.arff -spectrum -aspect -out lle.png
Plot saved to lle.png.
real 0m0.296s
user 0m0.280s
sys 0m0.004s
Alas, there's not much I can do to make Manifold Sculpting any faster, since
its slowness comes from the algorithm rather than the implementation. Here's the
link for instructions to get the latest sources:
http://waffles.sourceforge.net/tutorial/subversion.html