We are very pleased to announce the release of SenseClusters
version 0.93. This version marks our first steps towards supporting
Latent Semantic Analysis in addition to our native SenseClusters methods.
This version adds support for word clustering (really feature clustering,
since it is not limited to unigrams or single words) based on a
feature by context representation. In other words,
features are clustered based on the contexts in which they occur.
These matrices can optionally be reduced with SVD prior to clustering.
We refer to this as LSA feature clustering.
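To make the feature by context idea concrete, here is a minimal sketch in Python. It is not SenseClusters code: the toy contexts, the function names, and the use of cosine similarity are assumptions made for illustration, and the optional SVD step is left out. Each feature becomes a row whose entries count how often it occurs in each context, so features that occur in the same contexts end up with similar rows.

```python
from math import sqrt

# Toy contexts (hypothetical data, not taken from SenseClusters).
contexts = [
    "the bank approved the loan",
    "the bank raised the interest rate",
    "the river bank was muddy",
    "the loan carried a high interest rate",
]

def feature_by_context(contexts, features):
    """Map each feature to a row of counts, one entry per context."""
    tokenized = [c.split() for c in contexts]
    return {f: [tokens.count(f) for tokens in tokenized] for f in features}

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Unigram features are used here for simplicity; features need not be single words.
features = ["loan", "interest", "rate", "river", "muddy"]
matrix = feature_by_context(contexts, features)

# "interest" and "rate" occur in exactly the same contexts, so their rows match;
# "loan" and "river" never share a context, so their similarity is zero.
print(cosine(matrix["interest"], matrix["rate"]))
print(cosine(matrix["loan"], matrix["river"]))
```

A clustering algorithm would then group the rows of this matrix, either directly or after reducing its dimensionality.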
We believe these feature by context representations are what
characterize LSA and make it different from our native SenseClusters
methods. We supported a form of word clustering prior to this release,
but it is based on a word by word representation. That is, words
are clustered based on the words with which they occur.
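For contrast, a word by word representation can be sketched as follows. This is an illustrative Python example, not the SenseClusters implementation. It assumes that two words "occur with" each other whenever they appear in the same context, which is a simplification chosen for this sketch; the toy contexts and the function name are hypothetical.

```python
from collections import Counter, defaultdict

# Toy contexts (hypothetical data).
contexts = [
    "the river bank was muddy",
    "the bank approved the loan",
]

def word_by_word(contexts):
    """Count, for every word, the other words that share a context with it."""
    cooc = defaultdict(Counter)
    for context in contexts:
        tokens = context.split()
        for i, word in enumerate(tokens):
            for j, other in enumerate(tokens):
                if i != j:  # skip pairing a token with itself
                    cooc[word][other] += 1
    return cooc

cooc = word_by_word(contexts)

# The row for "bank" is indexed by words rather than by contexts.
print(dict(cooc["bank"]))
```

Here each row is indexed by words, whereas in the feature by context case each row is indexed by contexts.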
You can download version 0.93 from SourceForge:
As a preview, in version 0.95 we will have support for doing context
discrimination "the LSA way". The features found in contexts to be
discriminated will be represented by vectors that show which contexts
those features occur in, thus providing a second way of doing order 2
context representation.
At present our native SenseClusters order 2 methodology is based on
replacing the words in the contexts to be clustered with vectors showing
the words with which they occur.
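The two order 2 approaches differ only in where the replacement vectors come from. The Python sketch below illustrates this under stated assumptions: the vectors are made-up toy values, the function names are hypothetical, and combining a context's vectors by averaging is an assumption of this example rather than something stated in this announcement.

```python
def average_vectors(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def order2_context(context, vectors_by_feature):
    """Replace each known feature in the context with its vector, then average."""
    found = [vectors_by_feature[w] for w in context.split() if w in vectors_by_feature]
    return average_vectors(found) if found else None

# Native style: each vector shows the words a word occurs with
# (columns stand for the hypothetical words "bank", "money", "water").
native_vectors = {"loan": [1, 1, 0], "river": [1, 0, 1]}

# LSA style: each vector shows the contexts a feature occurs in
# (columns stand for four hypothetical training contexts).
lsa_vectors = {"loan": [1, 0, 1, 0], "river": [0, 1, 0, 0]}

context = "loan river"
print(order2_context(context, native_vectors))  # [1.0, 0.5, 0.5]
print(order2_context(context, lsa_vectors))     # [0.5, 0.5, 0.5, 0.0]
```

The same context is represented in a word space in the first case and in a context space in the second.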
There are some other significant changes in version 0.93. Among them,
SenseClusters now requires Perl 5.8.5 or later. The most
current version of Perl is now 5.8.8, and 5.8.5 is several years old, so
it is probably time to upgrade anyway if you are running something older.
Also, we have attempted to clarify the installation instructions further.
We will continue to work on that in 0.95, hopefully making SenseClusters
much easier to install. We think the instructions are quite a bit better
now, so please check them out:
Also, remember that you can experiment with SenseClusters using our web interface: