Dimensionality reduction not working?

Help
2011-03-25
2012-09-14
  • Nobody/Anonymous

    I have just downloaded Waffles to run some experiments with the Manifold
    Sculpting algorithm. Working through the examples under "Examples of
    dimensionality reduction", I can create and plot the swissroll data set, and
    PCA reduction works fine, but neither LLE nor Manifold Sculpting works: after
    I execute the command, nothing happens. I realize that these algorithms are
    more time-consuming, but in similar experiments with a Matlab implementation
    of LLE, this data set was reduced within seconds. Is this a known problem, or
    am I doing something wrong? I am using a PC running Windows XP.

     
  • Mike Gashler

    Mike Gashler - 2011-03-25

    When you say "nothing happens", do you mean "it starts crunching and seems to
    take forever"? A good way to test for poor scalability is to reduce the number
    of points with which the manifold is sampled. If it is fast with 500 points
    and slow with 2000 points, then the implementation scales poorly.
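
    One way to turn that test into a number is to time the same command at two
    sample sizes and estimate the exponent p in t ~ c * n^p. Here is a minimal
    Python sketch, assuming you have measured the two wall-clock times yourself.
    The helper name scaling_exponent is hypothetical, and the timings in the
    usage lines are made-up placeholders, not measurements from Waffles.

```python
import math

def scaling_exponent(n_small, t_small, n_large, t_large):
    """Estimate p in t ~ c * n**p from two (size, seconds) measurements."""
    if min(n_small, n_large, t_small, t_large) <= 0:
        raise ValueError("sizes and timings must be positive")
    if n_small == n_large:
        raise ValueError("the two sample sizes must differ")
    return math.log(t_large / t_small) / math.log(n_large / n_small)

# Placeholder timings (hypothetical): 2 s at 500 points, 128 s at 2000 points.
p = scaling_exponent(500, 2.0, 2000, 128.0)
print(f"estimated exponent: {p:.2f}")
```

    An exponent near 1 means roughly linear scaling. An exponent near 3 would be
    consistent with a dense eigendecomposition dominating the run time, since
    that step is cubic in the number of points.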

    This is probably the case with our LLE implementation because we utilize a
    dense-matrix eigenvector decomposition, whereas the Matlab implementation uses
    a sparse-matrix eigenvector decomposition. (We have implemented a version that
    calls out to another library to perform sparse eigenvector decomposition, but
    we pulled it out in order to keep the dependency requirements of Waffles
    minimal. I'm currently looking for a way to put it back in. This might end up
    requiring me to implement a bunch of hairy algorithms--either that, or we stop
    worrying about keeping our dependencies minimal.)
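
    To get a feel for why the sparse route pays off: in LLE each point gets
    reconstruction weights only for its k nearest neighbors, so the N x N weight
    matrix has k nonzeros per row. Below is a rough Python sketch of the entry
    counts for the swissroll experiment in this thread (2000 points, 14
    neighbors). The function names are only for illustration, and this counts
    stored entries, not the actual cost of either eigensolver.

```python
def dense_entries(n_points):
    """Entries stored by a dense n x n matrix."""
    return n_points * n_points

def sparse_nonzeros(n_points, n_neighbors):
    """Nonzeros in an LLE weight matrix: one weight per neighbor per point."""
    return n_points * n_neighbors

# Sizes taken from the swissroll commands in this thread.
n, k = 2000, 14
ratio = dense_entries(n) / sparse_nonzeros(n, k)
print(dense_entries(n), sparse_nonzeros(n, k), round(ratio, 1))
```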

    Manifold Sculpting is inherently quite slow, but this is due to a large
    coefficient, rather than poor scalability. With very large datasets, it
    actually scales better than the Matlab implementations of LLE and Isomap.

    I repeated the experiments in the tutorial and timed how long each algorithm
    took. LLE and Manifold Sculpting took about 4 and 2 minutes, respectively.
    (This was on a 3 GHz machine with 16 GB of RAM running Ubuntu 10.10, compiled
    with g++. Visual C++ 2008 seems to compile my code to run about 20% slower
    than g++, although this is just a vague impression--I've never actually
    measured it.)

    mike@rib:~/tmp$ waffles_generate swissroll 2000 -cutoutstar -seed 0 > sr.arff

    mike@rib:~/tmp$ time waffles_transform isomap sr.arff kdtree 14 2 > isomap.arff
    real 0m21.073s
    user 0m21.013s
    sys 0m0.060s

    mike@rib:~/tmp$ time waffles_transform lle sr.arff kdtree 14 2 > lle.arff
    real 4m4.673s
    user 4m4.615s
    sys 0m0.064s

    mike@rib:~/tmp$ time waffles_transform breadthfirstunfolding sr.arff kdtree 14 2 -reps 20 > bfu.arff
    real 0m4.321s
    user 0m4.312s
    sys 0m0.000s

    mike@rib:~/tmp$ time waffles_transform manifoldsculpting sr.arff kdtree 14 2 > ms.arff
    real 2m18.115s
    user 2m18.121s
    sys 0m0.004s
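
    The "real" times from the runs above are easier to compare once they are
    converted to seconds. Here is a small Python sketch that does the conversion
    and ranks the four algorithms. The helper name to_seconds is hypothetical,
    and it assumes the XmY.YYYs format that bash's time prints here.

```python
import re

def to_seconds(stamp):
    """Convert a bash `time` value such as '4m4.673s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time format: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))

# "real" times copied from the runs in this post.
real_times = {
    "isomap": "0m21.073s",
    "lle": "4m4.673s",
    "breadthfirstunfolding": "0m4.321s",
    "manifoldsculpting": "2m18.115s",
}

seconds = {name: to_seconds(t) for name, t in real_times.items()}
fastest = min(seconds.values())
# Print the algorithms from fastest to slowest, with the slowdown factor.
for name, secs in sorted(seconds.items(), key=lambda item: item[1]):
    print(f"{name:>22}: {secs:8.3f} s  ({secs / fastest:5.1f}x the fastest)")
```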

     
  • Mike Gashler

    Mike Gashler - 2011-03-29

    Okay, I couldn't sleep at night knowing I had a poor implementation in my
    library, so I burned a couple of days of productivity and implemented the
    necessary sparse matrix decompositions. The same test with LLE that used to
    take over 4 minutes now runs in less than one-third of a second--much better!

    mike@rib:~/tmp$ time waffles_plot scatter lle.arff -spectrum -aspect -out lle.png
    Plot saved to lle.png.
    real 0m0.296s
    user 0m0.280s
    sys 0m0.004s

    Alas, there's not much I can do to make Manifold Sculpting any faster, since
    its speed is due to the algorithm rather than the implementation. Here's the
    link for instructions to get the latest sources:
    http://waffles.sourceforge.net/tutorial/subversion.html

     
