Hi Nohemi,

I think the best sources of results for the AllWords module are two publications from 2009.

For efficiency, the main "cost" is the measure that you use. The most expensive measures (lesk and vector) tend to be the most effective, so while you can speed things up by using other measures (wup or lin, perhaps), you may pay a price in terms of accuracy. I think vector generally runs a bit faster than lesk, so if you haven't tried vector, you might want to.

The best way to speed things up is probably to run on different cores or different machines, since each sentence is handled independently. When I am running larger experiments, I tend to split the work across a few cores or machines.
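
A minimal sketch of that kind of split, written in Python rather than the module's own Perl. It assumes the definitions are already loaded into a list; disambiguate_chunk is a hypothetical placeholder for however you invoke AllWords on a batch of definitions, and the function names and the worker count are illustrative, not part of the module.

```python
from concurrent.futures import ProcessPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split items into at most n_chunks contiguous, near-equal slices."""
    n_chunks = max(1, min(n_chunks, len(items)))
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def disambiguate_chunk(definitions):
    """Hypothetical placeholder: replace the body with a call to AllWords.

    Here it only tokenizes each definition so the sketch runs on its own.
    """
    return [definition.split() for definition in definitions]


def run_parallel(definitions, workers=4):
    """Process the chunks in separate processes; results keep input order."""
    chunks = split_into_chunks(definitions, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        per_chunk = pool.map(disambiguate_chunk, chunks)
        return [result for chunk in per_chunk for result in chunk]
```

Calling run_parallel(definitions, workers=4) from a script's main guard returns one result per definition, in the original order. At the roughly 2.2 seconds per definition reported in the question, 100,000 definitions would take about 61 hours serially, so four evenly loaded workers would bring that to roughly 15 hours.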

I hope this helps! Please let us know if any other questions arise.

Good luck,

On Sat, Oct 19, 2013 at 10:03 PM, Nohemi Fernandez <nf68@cornell.edu> wrote:

I have recently come across your word sense disambiguation module WordNet::SenseRelate::AllWords and I have a few questions regarding its application in my particular situation.

My team and I would like to use your module to resolve word sense ambiguity in the definitions of words. For example, in one definition of the word 'chair' we have: 'a separate seat for one person, typically with a back and four legs'. Looking at the WordNet result for 'person', we find that this word could be referring to one of three senses; we would like to extract the correct sense given the context of the definition.

I wanted to know whether there exists a more recent study of the AllWords module with regard to the F-measure and methods of testing (i.e., the human-created gold standard) or if the 2005 paper continues to be the most relevant measure of accuracy.

Lastly, I am concerned with the efficiency of the module, given that we currently have a set of 100,000 words mapped to definitions. After running a small test suite of 500 word-to-definition mappings, I have concluded that the AllWords algorithm running locally on my machine takes about 2.2 seconds on average to run on a context of 5-20 words. Could you advise me on any options for speeding up the processing?

Thank you,

senserelate-users mailing list

Ted Pedersen