From: Ted P. <tpederse@d.umn.edu> - 2011-11-17 03:55:33
|
Hi Wesley,

You are right - wsd.pl does not handle e.g. terribly well! We ran into this issue with i.e. and inserted a special case into the wsd.pl code, which you can see below... wsd.pl does some very basic text cleaning where punctuation marks are removed, except in those cases where it is important to WordNet to preserve them (like i.e.):

    sub cleanLine {
        my $line = shift;
        chomp($line);
        my @words = split(/ +/, $line);
        foreach my $word (@words) {
            next if ($word eq "i.e." || $word eq "ie." || $word eq "et_al." || $word eq "al.");
            $word =~ s/([A-Z])/\L$1/g;
            if ($word =~ m/_/) {
                $word =~ s/[.|!|?|,|;]+$/ /;
            }
            else {
                $word =~ s/[^$OK_CHARS]/ /g;
            }
        }
        return join (' ', @words);
    }

This should be expanded to include e.g. and perhaps other abbreviations, although I don't think I'll be able to do that really quickly (so you might want to modify it yourself if Perl is familiar - if not, we can try and expedite things a bit... so let us know).

Thanks for pointing this out!

Cordially,
Ted

On Wed, Nov 16, 2011 at 7:52 PM, Wesley May <wj...@gm...> wrote:
> Ah, upon further review the mistake is mine, never mind :D
>
> Though there is another thing (the opposite problem perhaps!). Words
> like "e.g." get split into two words, "e" and "g". Is there a...
> nouncompoundify? :)
>
> Thanks!
>
> On Wed, Nov 16, 2011 at 8:40 PM, Ted Pedersen <tpederse@d.umn.edu> wrote:
>> Hi Wesley,
>>
>> Actually that is what --nocompoundify is *supposed* to be doing -
>> could you send me the command you are running and the output you are
>> getting? Then I can investigate a bit further.
>>
>> Thanks!
>> Ted
>>
>> On Wed, Nov 16, 2011 at 4:43 PM, Wesley May <wj...@gm...> wrote:
>>> Hi Ted,
>>>
>>> Is there a way to disable making WordNet compounds in
>>> SenseRelate-AllWords? For instance, if I have the (stop-word removed)
>>> sentence:
>>>
>>> "tires rattling wheels now roll off another day was valley"
>>>
>>> ...then SenseRelate tries to form the compound "roll_off". I thought
>>> that the --nocompoundify was for this, but I guess I'm wrong because
>>> it doesn't seem to stop that.
>>>
>>> Thanks!
>>> Wesley May
>>>
>>> On Tue, Nov 15, 2011 at 5:50 PM, Wesley May <wj...@gm...> wrote:
>>>> Looks good, thanks very much! I'll let you know if I have any questions :)
>>>>
>>>> Wesley May
>>>>
>>>> On Sun, Nov 13, 2011 at 6:05 PM, Ted Pedersen <tpederse@d.umn.edu> wrote:
>>>>> Hi Wesley,
>>>>>
>>>>> I think in 2007 there were two systems that we'd call unsupervised.
>>>>> One was knowledge based (WordNet::SenseRelate::AllWords) and the other
>>>>> was a clustering approach (SenseClusters). You can find both of those
>>>>> here:
>>>>>
>>>>> http://senserelate.sourceforge.net
>>>>> http://senseclusters.sourceforge.net
>>>>>
>>>>> The 2007 systems used these pretty much out of the box, so it's simply
>>>>> a matter of setting the command line parameters appropriately, which I
>>>>> hope is documented in our system description papers (which you can
>>>>> find on my publications page).
>>>>>
>>>>> But, if you have any questions about any of this, please don't
>>>>> hesitate to let me know.
>>>>>
>>>>> Good luck!
>>>>> Ted
>>>>>
>>>>> On Sun, Nov 13, 2011 at 3:40 PM, Wesley May <we...@cs...> wrote:
>>>>>> Hi Dr. Pedersen,
>>>>>>
>>>>>> I'm a grad student at the University of Toronto, working with Suzanne
>>>>>> Stevenson, and I'm looking for a good unsupervised, general-purpose
>>>>>> WSD algorithm.
>>>>>> Do you happen to have code available for your SemEval 2007 submission?
>>>>>>
>>>>>> Thanks very much!
>>>>>> Wesley May
>>>>>
>>>>> --
>>>>> Ted Pedersen
>>>>> http://www.d.umn.edu/~tpederse

--
Ted Pedersen
http://www.d.umn.edu/~tpederse |
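[Editor's note] The expansion Ted suggests for cleanLine can be prototyped without touching the Perl. Below is a rough Python re-implementation of the cleaning logic described in the message above, with the abbreviation list extended to cover "e.g.". The exact character set (`OK_CHARS`) and the extra entries ("e.g.", "eg.") are assumptions for illustration, not taken from wsd.pl itself.

```python
import re

# Abbreviations to pass through untouched. "i.e.", "ie.", "et_al." and "al."
# come from wsd.pl's cleanLine; "e.g." and "eg." are the proposed additions.
PRESERVE = {"i.e.", "ie.", "e.g.", "eg.", "et_al.", "al."}

# Stand-in for wsd.pl's $OK_CHARS; the exact character set is an assumption.
OK_CHARS = "a-z0-9_'"

def clean_line(line):
    words = line.split()
    for i, word in enumerate(words):
        if word in PRESERVE:
            continue  # keep WordNet-relevant abbreviations intact
        word = word.lower()
        if "_" in word:
            # WordNet compounds: strip only trailing sentence punctuation
            word = re.sub(r"[.!?,;]+$", " ", word)
        else:
            # everything else: blank out any character outside OK_CHARS
            word = re.sub("[^" + OK_CHARS + "]", " ", word)
        words[i] = word
    return " ".join(words)

print(clean_line("Testing e.g. this"))   # -> testing e.g. this
```

With the preserve list in place, "e.g." survives cleaning as a single token instead of being split into "e" and "g".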
From: Ted P. <tpederse@d.umn.edu> - 2011-07-06 16:33:34
|
Hi Wael,

I don't think there's a convenient way to take the output from wsd.pl and put it into similarity.pl. However, the good news is I don't know if you even need to do this. :) wsd.pl has some extensive tracing options, some of which actually return the underlying similarity scores that are being computed!

http://search.cpan.org/dist/WordNet-SenseRelate-AllWords/utils/wsd.pl#OPTIONS

For example, when I use --trace 8 I get all the similarity scores... Here's the text to be disambiguated....

    marimba(14): more testit
    cat mouse horse pig dog fish bear lion zebra tiger

And below is the command I ran and the resulting output... this is of course just one possible form of tracing, but in general I think you'll be able to get the similarity scores you want out of wsd.pl without having to re-process them. If that's not the case do let me know.

Cordially,
Ted

    marimba(15): wsd.pl --context testit --window 3 --type WordNet::Similarity::path --format raw --trace 8
    Current configuration:
        context file  : testit
        format        : raw
        scheme        : normal
        tagged text   : no
        measure       : WordNet::Similarity::path
        window        : 3
        contextScore  : 0
        pairScore     : 0
        measure config: (none)
        glosses       : no
        nocompoundify : no
        usemono       : no
        backoff       : no
        trace         : 8
        forcepos      : no
        stoplist      : (none)
    Loading WordNet... done.
cat#n#2 mouse#n#1 horse#n#1 pig#n#2 dog#n#2 fish#v#1 bear#n#2 lion#n#1 zebra#n#1 tiger#n#2 cat#n#1 mouse#n#1 0.166666666666667 cat#n#1 mouse#n#2 0.0416666666666667 cat#n#1 mouse#n#3 0.0909090909090909 cat#n#1 mouse#n#4 0.0625 cat#n#2 mouse#n#1 0.0833333333333333 cat#n#2 mouse#n#2 0.0588235294117647 cat#n#2 mouse#n#3 0.2 cat#n#2 mouse#n#4 0.0833333333333333 cat#n#3 mouse#n#1 0.0833333333333333 cat#n#3 mouse#n#2 0.0588235294117647 cat#n#3 mouse#n#3 0.2 cat#n#3 mouse#n#4 0.0833333333333333 cat#n#4 mouse#n#1 0.0588235294117647 cat#n#4 mouse#n#2 0.0588235294117647 cat#n#4 mouse#n#3 0.142857142857143 cat#n#4 mouse#n#4 0.0769230769230769 cat#n#5 mouse#n#1 0.0625 cat#n#5 mouse#n#2 0.05 cat#n#5 mouse#n#3 0.0909090909090909 cat#n#5 mouse#n#4 0.166666666666667 cat#n#6 mouse#n#1 0.0588235294117647 cat#n#6 mouse#n#2 0.0476190476190476 cat#n#6 mouse#n#3 0.0833333333333333 cat#n#6 mouse#n#4 0.111111111111111 cat#n#7 mouse#n#1 0.166666666666667 cat#n#7 mouse#n#2 0.0416666666666667 cat#n#7 mouse#n#3 0.0909090909090909 cat#n#7 mouse#n#4 0.0625 cat#n#8 mouse#n#1 0.0434782608695652 cat#n#8 mouse#n#2 0.0526315789473684 cat#n#8 mouse#n#3 0.0666666666666667 cat#n#8 mouse#n#4 0.0526315789473684 cat#v#1 mouse#v#1 0.142857142857143 cat#v#1 mouse#v#2 0.125 cat#v#2 mouse#v#1 0.142857142857143 cat#v#2 mouse#v#2 0.125 mouse#n#1 cat#n#1 0.166666666666667 mouse#n#1 cat#n#2 0.0833333333333333 mouse#n#1 cat#n#3 0.0833333333333333 mouse#n#1 cat#n#4 0.0588235294117647 mouse#n#1 cat#n#5 0.0625 mouse#n#1 cat#n#6 0.0588235294117647 mouse#n#1 cat#n#7 0.166666666666667 mouse#n#1 cat#n#8 0.0434782608695652 mouse#n#1 horse#n#1 0.142857142857143 mouse#n#1 horse#n#2 0.0625 mouse#n#1 horse#n#3 0.05 mouse#n#1 horse#n#4 0.0666666666666667 mouse#n#1 horse#n#5 0.0588235294117647 mouse#n#2 cat#n#2 0.0588235294117647 mouse#n#2 cat#n#3 0.0588235294117647 mouse#n#2 cat#n#4 0.0588235294117647 mouse#n#2 cat#n#5 0.05 mouse#n#2 cat#n#6 0.0476190476190476 mouse#n#2 cat#n#7 0.0416666666666667 mouse#n#2 cat#n#8 
0.0526315789473684 mouse#n#2 horse#n#1 0.04 mouse#n#2 horse#n#2 0.05 mouse#n#2 horse#n#3 0.0625 mouse#n#2 horse#n#4 0.0526315789473684 mouse#n#2 horse#n#5 0.0476190476190476 mouse#n#3 cat#n#2 0.2 mouse#n#3 cat#n#3 0.2 mouse#n#3 cat#n#4 0.142857142857143 mouse#n#3 cat#n#5 0.0909090909090909 mouse#n#3 cat#n#6 0.0833333333333333 mouse#n#3 cat#n#7 0.0909090909090909 mouse#n#3 cat#n#8 0.0666666666666667 mouse#n#3 horse#n#1 0.0833333333333333 mouse#n#3 horse#n#2 0.0909090909090909 mouse#n#3 horse#n#3 0.0833333333333333 mouse#n#3 horse#n#4 0.1 mouse#n#3 horse#n#5 0.0833333333333333 mouse#n#4 cat#n#2 0.0833333333333333 mouse#n#4 cat#n#3 0.0833333333333333 mouse#n#4 cat#n#4 0.0769230769230769 mouse#n#4 cat#n#5 0.166666666666667 mouse#n#4 cat#n#6 0.111111111111111 mouse#n#4 cat#n#7 0.0625 mouse#n#4 cat#n#8 0.0526315789473684 mouse#n#4 horse#n#1 0.0588235294117647 mouse#n#4 horse#n#2 0.125 mouse#n#4 horse#n#3 0.0625 mouse#n#4 horse#n#4 0.111111111111111 mouse#n#4 horse#n#5 0.111111111111111 mouse#v#1 cat#v#2 0.142857142857143 mouse#v#2 cat#v#1 0.125 mouse#v#2 cat#v#2 0.125 horse#n#1 mouse#n#1 0.142857142857143 horse#n#1 mouse#n#2 0.04 horse#n#1 mouse#n#3 0.0833333333333333 horse#n#1 mouse#n#4 0.0588235294117647 horse#n#1 pig#n#1 0.142857142857143 horse#n#1 pig#n#2 0.0666666666666667 horse#n#1 pig#n#3 0.0666666666666667 horse#n#1 pig#n#4 0.0625 horse#n#1 pig#n#5 0.0588235294117647 horse#n#1 pig#n#6 0.0625 horse#n#2 mouse#n#2 0.05 horse#n#2 mouse#n#3 0.0909090909090909 horse#n#2 mouse#n#4 0.125 horse#n#2 pig#n#1 0.0555555555555556 horse#n#2 pig#n#2 0.0714285714285714 horse#n#2 pig#n#3 0.0714285714285714 horse#n#2 pig#n#4 0.0666666666666667 horse#n#2 pig#n#5 0.125 horse#n#2 pig#n#6 0.111111111111111 horse#n#3 mouse#n#2 0.0625 horse#n#3 mouse#n#3 0.0833333333333333 horse#n#3 mouse#n#4 0.0625 horse#n#3 pig#n#1 0.0454545454545455 horse#n#3 pig#n#2 0.0666666666666667 horse#n#3 pig#n#3 0.0666666666666667 horse#n#3 pig#n#4 0.0625 horse#n#3 pig#n#5 0.0625 horse#n#3 pig#n#6 
0.0666666666666667 horse#n#4 mouse#n#2 0.0526315789473684 horse#n#4 mouse#n#3 0.1 horse#n#4 mouse#n#4 0.111111111111111 horse#n#4 pig#n#1 0.0588235294117647 horse#n#4 pig#n#2 0.0769230769230769 horse#n#4 pig#n#3 0.0769230769230769 horse#n#4 pig#n#4 0.0714285714285714 horse#n#4 pig#n#5 0.111111111111111 horse#n#4 pig#n#6 0.125 horse#n#5 mouse#n#2 0.0476190476190476 horse#n#5 mouse#n#3 0.0833333333333333 horse#n#5 mouse#n#4 0.111111111111111 horse#n#5 pig#n#1 0.0526315789473684 horse#n#5 pig#n#2 0.0666666666666667 horse#n#5 pig#n#3 0.0666666666666667 horse#n#5 pig#n#4 0.0625 horse#n#5 pig#n#5 0.111111111111111 horse#n#5 pig#n#6 0.1 horse#v#1 mouse#v#2 0.111111111111111 horse#v#1 pig#v#1 0.142857142857143 horse#v#1 pig#v#2 0.125 horse#v#1 pig#v#3 0.111111111111111 pig#n#1 horse#n#1 0.142857142857143 pig#n#1 horse#n#2 0.0555555555555556 pig#n#1 horse#n#3 0.0454545454545455 pig#n#1 horse#n#4 0.0588235294117647 pig#n#1 horse#n#5 0.0526315789473684 pig#n#1 dog#n#2 0.0666666666666667 pig#n#1 dog#n#3 0.0714285714285714 pig#n#1 dog#n#4 0.0714285714285714 pig#n#1 dog#n#5 0.05 pig#n#1 dog#n#6 0.0555555555555556 pig#n#1 dog#n#7 0.0588235294117647 pig#n#2 horse#n#2 0.0714285714285714 pig#n#2 horse#n#3 0.0666666666666667 pig#n#2 horse#n#4 0.0769230769230769 pig#n#2 horse#n#5 0.0666666666666667 pig#n#2 dog#n#2 0.2 pig#n#2 dog#n#3 0.125 pig#n#2 dog#n#4 0.166666666666667 pig#n#2 dog#n#5 0.0769230769230769 pig#n#2 dog#n#6 0.0714285714285714 pig#n#2 dog#n#7 0.0769230769230769 pig#n#3 horse#n#2 0.0714285714285714 pig#n#3 horse#n#3 0.0666666666666667 pig#n#3 horse#n#4 0.0769230769230769 pig#n#3 horse#n#5 0.0666666666666667 pig#n#3 dog#n#2 0.2 pig#n#3 dog#n#3 0.125 pig#n#3 dog#n#4 0.166666666666667 pig#n#3 dog#n#5 0.0769230769230769 pig#n#3 dog#n#6 0.0714285714285714 pig#n#3 dog#n#7 0.0769230769230769 pig#n#4 horse#n#2 0.0666666666666667 pig#n#4 horse#n#3 0.0625 pig#n#4 horse#n#4 0.0714285714285714 pig#n#4 horse#n#5 0.0625 pig#n#4 dog#n#2 0.1 pig#n#4 dog#n#3 0.111111111111111 pig#n#4 
dog#n#4 0.111111111111111 pig#n#4 dog#n#5 0.0714285714285714 pig#n#4 dog#n#6 0.0666666666666667 pig#n#4 dog#n#7 0.0714285714285714 pig#n#5 horse#n#2 0.125 pig#n#5 horse#n#3 0.0625 pig#n#5 horse#n#4 0.111111111111111 pig#n#5 horse#n#5 0.111111111111111 pig#n#5 dog#n#2 0.0769230769230769 pig#n#5 dog#n#3 0.0833333333333333 pig#n#5 dog#n#4 0.0833333333333333 pig#n#5 dog#n#5 0.0714285714285714 pig#n#5 dog#n#6 0.125 pig#n#5 dog#n#7 0.142857142857143 pig#n#6 horse#n#2 0.111111111111111 pig#n#6 horse#n#3 0.0666666666666667 pig#n#6 horse#n#4 0.125 pig#n#6 horse#n#5 0.1 pig#n#6 dog#n#2 0.0833333333333333 pig#n#6 dog#n#3 0.0909090909090909 pig#n#6 dog#n#4 0.0909090909090909 pig#n#6 dog#n#5 0.0769230769230769 pig#n#6 dog#n#6 0.111111111111111 pig#n#6 dog#n#7 0.125 pig#v#1 horse#v#1 0.142857142857143 dog#n#1 pig#n#1 0.125 dog#n#1 pig#n#2 0.111111111111111 dog#n#1 pig#n#3 0.111111111111111 dog#n#1 pig#n#4 0.1 dog#n#1 pig#n#5 0.0909090909090909 dog#n#1 pig#n#6 0.1 dog#n#1 fish#n#2 0.0833333333333333 dog#n#1 fish#n#3 0.166666666666667 dog#n#1 fish#n#4 0.0909090909090909 dog#n#2 pig#n#1 0.0666666666666667 dog#n#2 pig#n#2 0.2 dog#n#2 pig#n#3 0.2 dog#n#2 pig#n#4 0.1 dog#n#2 pig#n#5 0.0769230769230769 dog#n#2 pig#n#6 0.0833333333333333 dog#n#2 fish#n#2 0.0909090909090909 dog#n#2 fish#n#3 0.166666666666667 dog#n#2 fish#n#4 0.0833333333333333 dog#n#3 pig#n#1 0.0714285714285714 dog#n#3 pig#n#2 0.125 dog#n#3 pig#n#3 0.125 dog#n#3 pig#n#4 0.111111111111111 dog#n#3 pig#n#5 0.0833333333333333 dog#n#3 pig#n#6 0.0909090909090909 dog#n#3 fish#n#2 0.1 dog#n#3 fish#n#3 0.2 dog#n#3 fish#n#4 0.0909090909090909 dog#n#4 pig#n#1 0.0714285714285714 dog#n#4 pig#n#2 0.166666666666667 dog#n#4 pig#n#3 0.166666666666667 dog#n#4 pig#n#4 0.111111111111111 dog#n#4 pig#n#5 0.0833333333333333 dog#n#4 pig#n#6 0.0909090909090909 dog#n#4 fish#n#2 0.1 dog#n#4 fish#n#3 0.2 dog#n#4 fish#n#4 0.0909090909090909 dog#n#5 pig#n#1 0.05 dog#n#5 pig#n#2 0.0769230769230769 dog#n#5 pig#n#3 0.0769230769230769 dog#n#5 pig#n#4 
0.0714285714285714 dog#n#5 pig#n#5 0.0714285714285714 dog#n#5 pig#n#6 0.0769230769230769 dog#n#5 fish#n#2 0.2 dog#n#5 fish#n#3 0.1 dog#n#5 fish#n#4 0.0833333333333333 dog#n#6 pig#n#1 0.0555555555555556 dog#n#6 pig#n#2 0.0714285714285714 dog#n#6 pig#n#3 0.0714285714285714 dog#n#6 pig#n#4 0.0666666666666667 dog#n#6 pig#n#5 0.125 dog#n#6 pig#n#6 0.111111111111111 dog#n#6 fish#n#2 0.0769230769230769 dog#n#6 fish#n#3 0.0909090909090909 dog#n#6 fish#n#4 0.0833333333333333 dog#n#7 pig#n#1 0.0588235294117647 dog#n#7 pig#n#2 0.0769230769230769 dog#n#7 pig#n#3 0.0769230769230769 dog#n#7 pig#n#4 0.0714285714285714 dog#n#7 pig#n#5 0.142857142857143 dog#n#7 pig#n#6 0.125 dog#n#7 fish#n#2 0.0833333333333333 dog#n#7 fish#n#3 0.1 dog#n#7 fish#n#4 0.0909090909090909 dog#v#1 pig#v#1 0.166666666666667 dog#v#1 pig#v#2 0.142857142857143 dog#v#1 pig#v#3 0.125 dog#v#1 fish#v#1 0.166666666666667 dog#v#1 fish#v#2 0.125 fish#n#1 dog#n#1 0.142857142857143 fish#n#1 dog#n#2 0.0909090909090909 fish#n#1 dog#n#3 0.1 fish#n#1 dog#n#4 0.1 fish#n#1 dog#n#5 0.0625 fish#n#1 dog#n#6 0.0714285714285714 fish#n#1 dog#n#7 0.0769230769230769 fish#n#1 bear#n#2 0.1 fish#n#2 dog#n#2 0.0909090909090909 fish#n#2 dog#n#3 0.1 fish#n#2 dog#n#4 0.1 fish#n#2 dog#n#5 0.2 fish#n#2 dog#n#6 0.0769230769230769 fish#n#2 dog#n#7 0.0833333333333333 fish#n#2 bear#n#2 0.1 fish#n#3 dog#n#2 0.166666666666667 fish#n#3 dog#n#3 0.2 fish#n#3 dog#n#4 0.2 fish#n#3 dog#n#5 0.1 fish#n#3 dog#n#6 0.0909090909090909 fish#n#3 dog#n#7 0.1 fish#n#3 bear#n#2 0.2 fish#n#4 dog#n#2 0.0833333333333333 fish#n#4 dog#n#3 0.0909090909090909 fish#n#4 dog#n#4 0.0909090909090909 fish#n#4 dog#n#5 0.0833333333333333 fish#n#4 dog#n#6 0.0833333333333333 fish#n#4 dog#n#7 0.0909090909090909 fish#n#4 bear#n#2 0.0909090909090909 fish#v#1 dog#v#1 0.166666666666667 fish#v#1 bear#v#1 0.2 fish#v#1 bear#v#2 0.166666666666667 fish#v#1 bear#v#3 0.125 fish#v#1 bear#v#4 0.166666666666667 fish#v#1 bear#v#5 0.2 fish#v#1 bear#v#6 0.2 fish#v#1 bear#v#7 0.2 fish#v#1 bear#v#8 
0.166666666666667 fish#v#1 bear#v#9 0.2 fish#v#1 bear#v#10 0.2 fish#v#1 bear#v#11 0.2 fish#v#1 bear#v#12 0.25 fish#v#1 bear#v#13 0.142857142857143 fish#v#2 bear#v#1 0.142857142857143 fish#v#2 bear#v#2 0.125 fish#v#2 bear#v#3 0.1 fish#v#2 bear#v#4 0.125 fish#v#2 bear#v#5 0.142857142857143 fish#v#2 bear#v#6 0.142857142857143 fish#v#2 bear#v#7 0.142857142857143 fish#v#2 bear#v#8 0.125 fish#v#2 bear#v#9 0.142857142857143 fish#v#2 bear#v#10 0.142857142857143 fish#v#2 bear#v#11 0.142857142857143 fish#v#2 bear#v#12 0.166666666666667 fish#v#2 bear#v#13 0.111111111111111 bear#n#1 fish#n#1 0.142857142857143 bear#n#1 fish#n#2 0.0625 bear#n#1 fish#n#3 0.1 bear#n#1 fish#n#4 0.0666666666666667 bear#n#1 lion#n#1 0.2 bear#n#1 lion#n#2 0.0769230769230769 bear#n#1 lion#n#3 0.1 bear#n#1 lion#n#4 0.0666666666666667 bear#n#2 fish#n#1 0.1 bear#n#2 fish#n#2 0.1 bear#n#2 fish#n#3 0.2 bear#n#2 fish#n#4 0.0909090909090909 bear#n#2 lion#n#1 0.0714285714285714 bear#n#2 lion#n#2 0.125 bear#n#2 lion#n#3 0.2 bear#n#2 lion#n#4 0.0909090909090909 bear#v#1 fish#v#1 0.2 bear#v#1 fish#v#2 0.142857142857143 bear#v#2 fish#v#1 0.166666666666667 bear#v#2 fish#v#2 0.125 bear#v#3 fish#v#1 0.125 bear#v#3 fish#v#2 0.1 bear#v#4 fish#v#1 0.166666666666667 bear#v#4 fish#v#2 0.125 bear#v#5 fish#v#1 0.2 bear#v#5 fish#v#2 0.142857142857143 bear#v#6 fish#v#1 0.2 bear#v#6 fish#v#2 0.142857142857143 bear#v#7 fish#v#1 0.2 bear#v#7 fish#v#2 0.142857142857143 bear#v#8 fish#v#1 0.166666666666667 bear#v#8 fish#v#2 0.125 bear#v#9 fish#v#1 0.2 bear#v#9 fish#v#2 0.142857142857143 bear#v#10 fish#v#1 0.2 bear#v#10 fish#v#2 0.142857142857143 bear#v#11 fish#v#1 0.2 bear#v#11 fish#v#2 0.142857142857143 bear#v#12 fish#v#1 0.25 bear#v#12 fish#v#2 0.166666666666667 bear#v#13 fish#v#1 0.142857142857143 bear#v#13 fish#v#2 0.111111111111111 lion#n#1 bear#n#1 0.2 lion#n#1 bear#n#2 0.0714285714285714 lion#n#2 bear#n#1 0.0769230769230769 lion#n#2 bear#n#2 0.125 lion#n#3 bear#n#1 0.1 lion#n#3 bear#n#2 0.2 lion#n#4 bear#n#1 
0.0666666666666667 lion#n#4 bear#n#2 0.0909090909090909 zebra#n#1 lion#n#1 0.111111111111111 zebra#n#1 lion#n#2 0.0666666666666667 zebra#n#1 lion#n#3 0.0833333333333333 zebra#n#1 lion#n#4 0.0588235294117647 zebra#n#1 tiger#n#1 0.0833333333333333 zebra#n#1 tiger#n#2 0.111111111111111 tiger#n#1 zebra#n#1 0.0833333333333333 tiger#n#2 zebra#n#1 0.111111111111111

On Wed, Jul 6, 2011 at 6:02 AM, Wael Gomaa <wae...@gm...> wrote:
> Dear Prof. Ted,
> Are there different formats of the output file of the wsd.pl command? I
> want the suitable format for running the wsd output as an input to the
> similarity.pl command.
> Thanks,
>
> --
> Wael Hassan Gomaa
> Mobile: +2 014 6767 4 66
> PhD student,
> Faculty of Computers and Information,
> Cairo University, Egypt

--
Ted Pedersen
http://www.d.umn.edu/~tpederse |
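[Editor's note] If you do want the trace scores in machine-readable form rather than re-running similarity.pl, the --trace 8 output shown above is essentially a stream of whitespace-separated (sense, sense, score) triples. A small parser along these lines (my own sketch, not part of the package) recovers them:

```python
import re

# word#pos#sense tokens as they appear in --trace 8 output
SENSE = r"[\w.\-]+#[nvar]#\d+"
PAIR = re.compile(r"({0})\s+({0})\s+(\d+\.?\d*)".format(SENSE))

def parse_trace_scores(trace_text):
    """Return a list of (sense1, sense2, score) triples from wsd.pl trace text."""
    return [(a, b, float(s)) for a, b, s in PAIR.findall(trace_text)]

sample = "cat#n#1 mouse#n#1 0.166666666666667 cat#n#1 mouse#n#2 0.0416666666666667"
for s1, s2, score in parse_trace_scores(sample):
    print(s1, s2, score)
```

The leading line of assigned senses (which carries no scores) is skipped naturally, since the pattern only fires when a numeric score follows a sense pair.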
From: Ted P. <tpederse@d.umn.edu> - 2011-05-27 19:18:40
|
Greetings all,

http://talisker.d.umn.edu
http://marimba.d.umn.edu

are both back up and fully operational. Please let us know if you have any questions or concerns.

Enjoy,
Ted

On Fri, May 27, 2011 at 10:22 AM, Ted Pedersen <tpederse@d.umn.edu> wrote:
> Greetings all,
>
> http://talisker.d.umn.edu and all the interfaces provided there are
> back up. This includes WordNet::Similarity,
> WordNet::SenseRelate::AllWords, and SenseClusters.
>
> The other system (marimba) is in the process of coming back up, and
> should be fully available later today.
>
> Enjoy,
> Ted
>
> On Thu, May 26, 2011 at 2:10 PM, Ted Pedersen <tpederse@d.umn.edu> wrote:
>> Greetings all,
>>
>> The web interfaces for WordNet::Similarity,
>> WordNet::SenseRelate::AllWords, and SenseClusters are all down due to
>> a long overdue upgrade. But, at least one of our systems will be back
>> before 5pm Friday May 27, perhaps both.
>>
>> These interfaces will continue to be located at the following URLs
>> once they are back:
>>
>> http://marimba.d.umn.edu
>> http://talisker.d.umn.edu
>>
>> Sorry for the short notice, but hopefully things will be back in a day
>> or two. If you really need something run *now*, let me know and I'll
>> see what I can do to assist.
>>
>> Cordially,
>> Ted
>>
>> --
>> Ted Pedersen
>> http://www.d.umn.edu/~tpederse

--
Ted Pedersen
http://www.d.umn.edu/~tpederse |
From: Ted P. <tpederse@d.umn.edu> - 2011-05-27 15:22:54
|
Greetings all,

http://talisker.d.umn.edu and all the interfaces provided there are back up. This includes WordNet::Similarity, WordNet::SenseRelate::AllWords, and SenseClusters.

The other system (marimba) is in the process of coming back up, and should be fully available later today.

Enjoy,
Ted

On Thu, May 26, 2011 at 2:10 PM, Ted Pedersen <tpederse@d.umn.edu> wrote:
> Greetings all,
>
> The web interfaces for WordNet::Similarity,
> WordNet::SenseRelate::AllWords, and SenseClusters are all down due to
> a long overdue upgrade. But, at least one of our systems will be back
> before 5pm Friday May 27, perhaps both.
>
> These interfaces will continue to be located at the following URLs
> once they are back:
>
> http://marimba.d.umn.edu
> http://talisker.d.umn.edu
>
> Sorry for the short notice, but hopefully things will be back in a day
> or two. If you really need something run *now*, let me know and I'll
> see what I can do to assist.
>
> Cordially,
> Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse |
From: Ted P. <tpederse@d.umn.edu> - 2011-05-26 19:10:43
|
Greetings all,

The web interfaces for WordNet::Similarity, WordNet::SenseRelate::AllWords, and SenseClusters are all down due to a long overdue upgrade. But, at least one of our systems will be back before 5pm Friday May 27, perhaps both.

These interfaces will continue to be located at the following URLs once they are back:

http://marimba.d.umn.edu
http://talisker.d.umn.edu

Sorry for the short notice, but hopefully things will be back in a day or two. If you really need something run *now*, let me know and I'll see what I can do to assist.

Cordially,
Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse |
From: Ted P. <tpederse@d.umn.edu> - 2011-05-12 13:22:25
|
Hi Wael,

I think both WordNet::Similarity and WordNet::SenseRelate could be very useful in measuring the semantic similarity/relatedness between phrases or sentences or questions, although there's nothing in either package that directly supports that. Rather, they can be viewed as providing building blocks for doing that kind of work.

You could imagine, perhaps, doing something like the following as a very simple sort of baseline approach.... Use WordNet::SenseRelate::AllWords to assign senses to all the words in your question Q and also in your possible answer statements A1 - AN. Then, use WordNet::Similarity to measure the pairwise similarity between word senses as found in question Q and each of the answer statements. You might want to normalize for length for each pairwise similarity (by dividing by the number of words or sense-tagged words, for example). The only support WordNet::Similarity would provide would be to get those pairwise measurements - the normalizing and summing is something you'd need to take care of... Then, you'd have values expressing the similarity between Q and each An, and you could see which An results in the most similarity.

I don't mean to suggest this is the right or only way to do this - just an idea of how you could use WordNet::Similarity and WordNet::SenseRelate to handle this... You might also want to look at the following paper, which uses WordNet::Similarity as well as Latent Semantic Analysis and another technique for a similar sort of problem...
Michael Mohler and Rada Mihalcea, Text-to-text Semantic Similarity for Automatic Short Answer Grading, in Proceedings of the European Chapter of the Association for Computational Linguistics (EACL 2009), Athens, Greece, March 2009 http://www.cse.unt.edu/~rada/papers/mohler.eacl09.pdf Finally, remember that WordNet::SenseRelate::AllWords has a web interface you could use for simple tests, although if you already have WordNet::Similarity installed then installing WordNet::SenseRelate::AllWords is usually trivial. The interface is available here... http://marimba.d.umn.edu http://talisker.d.umn.edu Hope this helps. Good luck on what sounds like an interesting project. Cordially, Ted On Wed, May 11, 2011 at 7:23 AM, Wael Gomaa <wae...@gm...> wrote: > Dear Prof.Ted, > thanks for your efforts in WordNet Similarity and Senserelate, > your recommendations were the main reason to achieve my master degree. > i registered my PhD with title "Automatic Arabic Essay Assessment". > Now i have a list of predefined Essay Questions and model Answers of > "History subject" with Arabic Language, i translated them to English using > google translation API as i know that similarity and senserelate support > only English Language. > what i want is: > - Measure the similarity between any conceived question from Teacher and all > the predefined questions to get the most similar question with its model > answer. > - Measure the similarity between the Student Answer and the predefined Model > Answer to calculate a score that represents the semantic similarity between > the two answers. > my questions are: > - What are the packages that i need? WordNet::Similarity Package or > WordNet::Senserelate Packages or both. > - Is the text Length controls in choosing the similarity Algorithm? as the > question length is very short (one sentence) and the answer length is very > long (many paragraphs). 
> Sorry for the long mail,
> Thanks for your cooperation,
> Best Regards,
> --
> Wael Hassan Gomaa
> Mobile: +2 014 6767 4 66
> PhD student,
> Faculty of Computers and Information,
> Cairo University, Egypt
>
> _______________________________________________
> senserelate-users mailing list
> sen...@li...
> https://lists.sourceforge.net/lists/listinfo/senserelate-users

--
Ted Pedersen
http://www.d.umn.edu/~tpederse |
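[Editor's note] The baseline Ted sketches above - sense-tag Q and each answer, sum the pairwise similarities, and normalize by the number of pairs - can be written down in a few lines. The similarity function below is a toy lookup table standing in for a WordNet::Similarity measure; all names and scores are illustrative.

```python
from itertools import product

def text_similarity(sim, q_senses, ans_senses):
    """Average pairwise sense similarity between a question and one answer.

    sim(s1, s2) stands in for a WordNet::Similarity measure; dividing by
    the number of pairs is one way to normalize for length.
    """
    pairs = list(product(q_senses, ans_senses))
    if not pairs:
        return 0.0
    return sum(sim(a, b) for a, b in pairs) / len(pairs)

# Toy similarity table; real scores would come from WordNet::Similarity.
scores = {("cat#n#1", "mouse#n#1"): 0.1667, ("cat#n#1", "dog#n#2"): 0.2}
sim = lambda a, b: scores.get((a, b), 0.0)

q = ["cat#n#1"]
answers = {"A1": ["mouse#n#1"], "A2": ["dog#n#2"]}
best = max(answers, key=lambda name: text_similarity(sim, q, answers[name]))
print(best)   # -> A2
```

Picking the answer with the highest normalized score then implements the "see which An results in the most similarity" step.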
From: Ted P. <tpederse@d.umn.edu> - 2011-05-12 02:01:19
|
Hi Arun,

I'd try the --config option, in order to provide the lesk measure with a stoplist. If you aren't using any configuration file then I don't think you are getting a stoplist for lesk (note this is different than the --stoplist option), and that might help. You can find the configuration options for lesk documented here...

http://search.cpan.org/dist/WordNet-Similarity/lib/WordNet/Similarity/lesk.pm

As a quick primer, you specify a configuration file that can look like this:

    WordNet::Similarity::lesk
    stop::stoplist.txt

An example of a stoplist file can be found here:

http://cpansearch.perl.org/src/TPEDERSE/WordNet-Similarity-2.05/samples/stoplist.txt

So, you would then simply run wsd.pl with the config option (--config), and that would change the lesk measure by adding a stoplist to it (that would remove stop words from the glosses that it matches). Otherwise, I think it might just be worthwhile to experiment a bit - there are quite a few options to wsd.pl that have a fairly significant impact on the algorithm (for better or worse), so you might even discover some combination of options that works even better (in which case we'd be happy to hear about that).

Best of luck, and keep us posted on how things go.

Cordially,
Ted

On Wed, May 11, 2011 at 3:56 PM, Arun N <aru...@gm...> wrote:
> I tested on Senseval 3 with the following options, as described in Varada's
> thesis:
> --format wntagged --type WordNet::Similarity::lesk --window 15 --backoff
> --contextScore=0.0 --pairScore=0.0 --stoplist default-stoplist-raw.txt
> --nocompoundify
> I got an F-measure of 51.6 (used allwords-scorer.pl).
> I have attached my output file.
> How can this be improved to 54?
>
> Arun,
>
> On Wed, May 11, 2011 at 2:22 PM, Ted Pedersen <tpederse@d.umn.edu> wrote:
>> Hi Arun,
>>
>> See comments inline...
>>
>> On Wed, May 11, 2011 at 12:25 PM, Arun N <aru...@gm...> wrote:
>> > Thanks for the reply.
>> > I agree that results improve when the backoff option is used.
>> > On page 186, the table has results for nouns, verbs, adjectives, and
>> > adverbs. There is no column for all-words results.
>>
>> That's correct.
>>
>> > Also, the demo paper's results are for all parts of speech, I guess.
>>
>> Correct - that was a very short paper so we tried to make it as
>> condensed as possible.
>>
>> > Whereas page 174 has backoff results for window 15, and it has all-POS
>> > results.
>>
>> Correct.
>>
>> > My question is, does the table in the demo paper correspond to all-POS
>> > results?
>>
>> Yes.
>>
>> > If yes, then page 174 has the results, I guess, because page 186 does not
>> > have all-POS results.
>>
>> Yes, I think the overall results are presented first, with the more
>> detailed scores later.
>>
>> > Moreover, on page 174, table 143 has results with the backoff option set,
>> > and the result for the lesk algorithm is 50.9.
>>
>> Yes, that's true.
>>
>> I think Varada's thesis specifies completely the options we used, so
>> that's your best starting point. I don't recall what options were used
>> in the NAACL demo paper. I think the main thing that could differ
>> might be the window size, or perhaps the stoplist used by lesk when
>> measuring its overlaps, or the stoplist used by wsd.pl.
>>
>> But, I'm confident that the results in the NAACL paper are as
>> reported, and also with Varada's thesis. There is some variation in
>> the experiments that appears to be important, although I can't
>> reconstruct exactly what that is. I think the best thing might be to
>> try to run the wsd.pl program on the Senseval-3 data with the options
>> described in Varada's thesis and see what that results in. I'd be
>> happy to look at those results and comment further (once we know what
>> happens there).
>>
>> Hope this helps.
>>
>> Good luck,
>> Ted
>>
>> > Arun,
>> >
>> > On Wed, May 11, 2011 at 8:21 AM, Ted Pedersen <tpederse@d.umn.edu>
>> > wrote:
>> >>
>> >> Hi Arun,
>> >>
>> >> See comments inline...
>> >> >> >> On Wed, May 11, 2011 at 12:18 AM, Arun N <aru...@gm...> wrote: >> >> > Hi Guys, >> >> > I need a small clarification. >> >> > >> >> > In the paper >> >> > >> >> > http://www.d.umn.edu/~kolha002/publications/pedersenk09-demo-final.pdf >> >> > the F-measure for Senseval 3 (lesk) with window size 15 is 54 [ P - >> >> > 54 >> >> > : R >> >> > - 53 ] >> >> > >> >> > But I cannot find a similar F-measure value in Varadha's thesis. >> >> > In the thesis >> >> > http://www.d.umn.edu/~kolha002/publications/Kolhatkar-thesis.pdf >> >> > >> >> > page number 173-174 has the results for Window size 15 >> >> > >> >> > All the results for window = 15 and lesk measure is not more than 51 >> >> >> >> If you look on the last page of Varada's thesis (page 186) I think >> >> you'll see the source of the results in the NAACL demo paper - note >> >> that in this case we use the --backoff option, which means default to >> >> sense 1 when we can't establish anything with the SenseRelate >> >> algorithm. In the earlier results you mention (pages 173-174) there is >> >> no such backoff, so you see somewhat lower results. >> >> >> >> > >> >> > Could you tell what options did u set for getting the highest >> >> > F-measure >> >> > 54 >> >> > as reported in the paper ? >> >> >> >> See page 186 of Varada's thesis. >> >> >> >> > >> >> > >> >> > Secondly, >> >> > Agirre et al >> >> > http://www.aclweb.org/anthology/E/E09/E09-1005.pdf >> >> > >> >> > The authors claim that they get better results when Wordnet 1.7 was >> >> > used >> >> > instead of Wordnet 3.0. >> >> > So, did you guys experiment SR-AW with wordnet 1.7 ? >> >> >> >> No. The WordNet group at Princeton doesn't support 1.7 any longer, so >> >> we don't use it. Overall WordNet 3.0 is much improved on earlier >> >> versions of WordNet, so I think it makes sense to use it. 
>> >> >> >> However, remember that the SemCor data is based on version 1.5 of >> >> WordNet, so in some ways it makes sense that an earlier version would >> >> work better (since as the versions progress those mappings back to 1.5 >> >> become more and more noisy). But, I think that tells us more about the >> >> evaluation data than it does WordNet. >> >> >> >> > Also, I would like to know whether the actual key given for senseval >> >> > 2 >> >> > and >> >> > 3 was based on Wordnet 1.7 or Wordnet 3.0 ? >> >> >> >> To be honest I just don't recall. You might need to dig around a bit >> >> for some answers to that - http://senseval.org will be a good starting >> >> point for that. Also remember that WordNet 2.0 was quite popular for >> >> some time, and could have been used (especially for Senseval-2, since >> >> I don't think 3.0 was released at that time). >> >> >> >> > I downloaded Senseval data sets from Rada Mihalcea's website which >> >> > was >> >> > actually suggested by Varadha. >> >> >> >> Great! That's a very useful resource. >> >> ( http://www.cse.unt.edu/~rada/downloads.html#sensevalsemcor ) >> >> >> >> Hope this helps! >> >> >> >> Good luck, >> >> Ted >> >> >> >> > >> >> > Arun, >> >> > >> >> > On Mon, Apr 25, 2011 at 12:16 PM, Arun N <aru...@gm...> wrote: >> >> >> >> >> >> Thanks Varadha. This is what I was searching for. >> >> >> Arun, >> >> >> >> >> >> On Mon, Apr 25, 2011 at 10:47 AM, Ted Pedersen <tpederse@d.umn.edu> >> >> >> wrote: >> >> >>> >> >> >>> Hi Varada, >> >> >>> >> >> >>> Ah......that's the part I was forgetting!!!!!!!!!!!!!!!!!!!!!!!!!!! >> >> >>> :) >> >> >>> Thanks very much for clarifying this. >> >> >>> >> >> >>> Arun, I hope this works out, and please let us know if additional >> >> >>> questions arise. >> >> >>> >> >> >>> Thanks! >> >> >>> Ted >> >> >>> >> >> >>> On Mon, Apr 25, 2011 at 10:23 AM, varada kolhatkar >> >> >>> <var...@gm...> wrote: >> >> >>> > Hi Arun, >> >> >>> > semcor-reformat.pl needs SemCor formatted input. 
For my >> >> >>> > experiments >> >> >>> > I >> >> >>> > used >> >> >>> > Senseval data converted into SemCor format by Rada Mihalcea. >> >> >>> > You can download it from her webpage. >> >> >>> > http://www.cse.unt.edu/~rada/downloads.html >> >> >>> > Search for 'Senseval-3 English all-words converted into SemCor >> >> >>> > format' >> >> >>> > Hope that helps, >> >> >>> > Varada >> >> >>> > >> >> >>> > On Mon, Apr 25, 2011 at 7:39 AM, Ted Pedersen >> >> >>> > <tpederse@d.umn.edu> >> >> >>> > wrote: >> >> >>> >> >> >> >>> >> Thanks for these additional details Arun! We'll investigate >> >> >>> >> further >> >> >>> >> and report back asap, I hope later today (Monday). >> >> >>> >> >> >> >>> >> Cordially, >> >> >>> >> Ted >> >> >>> >> >> >> >>> >> On Sun, Apr 24, 2011 at 10:16 PM, Arun N <aru...@gm...> >> >> >>> >> wrote: >> >> >>> >> > @Ted, >> >> >>> >> > This is the command that I used and the corresponding error >> >> >>> >> > message. >> >> >>> >> > $ semcor-reformat.pl --file english-all-words.xml >> >> >>> >> > Nameless tag: '?xml version="1.0"?' >> >> >>> >> > Nameless tag: '!DOCTYPE corpus SYSTEM "all-words.dtd"' >> >> >>> >> > Use of uninitialized value in subroutine entry at >> >> >>> >> > /usr/local/bin/semcor-reformat.pl line 222, <FH> chunk 1. >> >> >>> >> > Can't use string ("") as a subroutine ref while "strict refs" >> >> >>> >> > in >> >> >>> >> > use >> >> >>> >> > at >> >> >>> >> > /usr/local/bin/semcor-reformat.pl line 222, <FH> chunk 1. >> >> >>> >> > Arun, >> >> >>> >> > On Sun, Apr 24, 2011 at 10:12 PM, Arun N <aru...@gm...> >> >> >>> >> > wrote: >> >> >>> >> >> >> >> >>> >> >> @Varadha >> >> >>> >> >> The results in Varadha's thesis (p. 193) say that SENSEVAL 3 >> >> >>> >> >> was >> >> >>> >> >> given >> >> >>> >> >> in >> >> >>> >> >> wntagged format. >> >> >>> >> >> I just want to know how did you convert that to wntagged >> >> >>> >> >> format >> >> >>> >> >> ? 
>> >> >>> >> >> the .xml file doesnt have POS tags at all as ted mentioned in >> >> >>> >> >> the >> >> >>> >> >> earlier >> >> >>> >> >> mail >> >> >>> >> >> So I guess, I am using a wrong file for SENSEVAL 3, but I am >> >> >>> >> >> sure >> >> >>> >> >> that >> >> >>> >> >> I >> >> >>> >> >> downloaded it from the SENSEVAL 3 site. >> >> >>> >> >> Arun, >> >> >>> >> >> >> >> >>> >> >> On Sun, Apr 24, 2011 at 10:08 PM, Arun N >> >> >>> >> >> <aru...@gm...> >> >> >>> >> >> wrote: >> >> >>> >> >>> >> >> >>> >> >>> One quick clarification, the .xml file that I sent, was the >> >> >>> >> >>> one >> >> >>> >> >>> that >> >> >>> >> >>> Varadha experimented for SENSEVAL 3? >> >> >>> >> >>> or >> >> >>> >> >>> Varadha, can u give me link where you downloaded the data >> >> >>> >> >>> set >> >> >>> >> >>> for >> >> >>> >> >>> evaluating SR-AW on SENSEVAL 3. >> >> >>> >> >>> >> >> >>> >> >>> Arun, >> >> >>> >> >>> On Sun, Apr 24, 2011 at 9:41 PM, Ted Pedersen >> >> >>> >> >>> <tpederse@d.umn.edu> >> >> >>> >> >>> wrote: >> >> >>> >> >>>> >> >> >>> >> >>>> Hi Arun, >> >> >>> >> >>>> >> >> >>> >> >>>> BTW, I might be wrong about not having this functionality >> >> >>> >> >>>> in >> >> >>> >> >>>> SenseRelate::AllWords. Can you send the command that you >> >> >>> >> >>>> try >> >> >>> >> >>>> to >> >> >>> >> >>>> run >> >> >>> >> >>>> and the error that you get? I'll check on a few things in >> >> >>> >> >>>> the >> >> >>> >> >>>> meantime. >> >> >>> >> >>>> >> >> >>> >> >>>> Thanks! 
>> >> >>> >> >>>> Ted >> >> >>> >> >>>> >> >> >>> >> >>>> On Sun, Apr 24, 2011 at 9:33 PM, Ted Pedersen >> >> >>> >> >>>> <tpederse@d.umn.edu> >> >> >>> >> >>>> wrote: >> >> >>> >> >>>> > Hi Arun, >> >> >>> >> >>>> > >> >> >>> >> >>>> > You can format input to WordNet::SenseRelate::AllWords as >> >> >>> >> >>>> > wntagged >> >> >>> >> >>>> > (four part of speech tags, n, v, a, r) >> >> >>> >> >>>> > >> >> >>> >> >>>> > cats#n run#v >> >> >>> >> >>>> > >> >> >>> >> >>>> > or raw (plain text) >> >> >>> >> >>>> > >> >> >>> >> >>>> > cats run >> >> >>> >> >>>> > >> >> >>> >> >>>> > or tagged (penn treebank) >> >> >>> >> >>>> > >> >> >>> >> >>>> > cats/NP run/VB >> >> >>> >> >>>> > >> >> >>> >> >>>> > Based on what I see in the xml file you sent, I think you >> >> >>> >> >>>> > probably >> >> >>> >> >>>> > just want to convert this to a raw text format (where you >> >> >>> >> >>>> > have >> >> >>> >> >>>> > one >> >> >>> >> >>>> > sentence per line, one line per sentence) since there are >> >> >>> >> >>>> > no >> >> >>> >> >>>> > pos >> >> >>> >> >>>> > tags >> >> >>> >> >>>> > (so no point in using wntagged or tagged). >> >> >>> >> >>>> > >> >> >>> >> >>>> > We don't have a converter from SensEval-3 format in >> >> >>> >> >>>> > SenseRelate::AllWords...however, I think I might know of >> >> >>> >> >>>> > one >> >> >>> >> >>>> > I >> >> >>> >> >>>> > can >> >> >>> >> >>>> > refer you to....let me check on that and report back on >> >> >>> >> >>>> > Monday. >> >> >>> >> >>>> > >> >> >>> >> >>>> > Cordially, >> >> >>> >> >>>> > Ted >> >> >>> >> >>>> > >> >> >>> >> >>>> > On Sun, Apr 24, 2011 at 9:21 PM, Arun N >> >> >>> >> >>>> > <aru...@gm...> >> >> >>> >> >>>> > wrote: >> >> >>> >> >>>> >> I am planning to experiment on SENSEVAL 3 all words data >> >> >>> >> >>>> >> set. >> >> >>> >> >>>> >> But, it is in a different format from Semcor. 
>> >> >>> >> >>>> >> When I tried to use extract-semcor.pl on the file, it >> >> >>> >> >>>> >> showed >> >> >>> >> >>>> >> some >> >> >>> >> >>>> >> error. >> >> >>> >> >>>> >> I downloaded the senseval3 all words test data from site >> >> >>> >> >>>> >> http://www.senseval.org/senseval3/data.html >> >> >>> >> >>>> >> I have also attached the file. >> >> >>> >> >>>> >> I just want to know how should I format SENSEVAL 3 all >> >> >>> >> >>>> >> words >> >> >>> >> >>>> >> data >> >> >>> >> >>>> >> and >> >> >>> >> >>>> >> give >> >> >>> >> >>>> >> to wsd.pl ? >> >> >>> >> >>>> >> Arun, >> >> >>> >> >>>> >> >> >> >>> >> >>>> >> On Sun, Apr 24, 2011 at 8:20 PM, Ted Pedersen >> >> >>> >> >>>> >> <tpederse@d.umn.edu> >> >> >>> >> >>>> >> wrote: >> >> >>> >> >>>> >>> >> >> >>> >> >>>> >>> Hi Arun, >> >> >>> >> >>>> >>> >> >> >>> >> >>>> >>> I'm not sure what you mean by senseval...do you mean >> >> >>> >> >>>> >>> the >> >> >>> >> >>>> >>> semcor >> >> >>> >> >>>> >>> format? >> >> >>> >> >>>> >>> Or... ? >> >> >>> >> >>>> >>> >> >> >>> >> >>>> >>> BTW, for wntagged, do you mean text that looks like >> >> >>> >> >>>> >>> this: >> >> >>> >> >>>> >>> >> >> >>> >> >>>> >>> cats#n run#v fast#r >> >> >>> >> >>>> >>> >> >> >>> >> >>>> >>> Just wanted to clarify since there are a few different >> >> >>> >> >>>> >>> formats... >> >> >>> >> >>>> >>> >> >> >>> >> >>>> >>> Thanks! >> >> >>> >> >>>> >>> Ted >> >> >>> >> >>>> >>> >> >> >>> >> >>>> >>> >> >> >>> >> >>>> >>> On Sun, Apr 24, 2011 at 7:41 PM, Arun N >> >> >>> >> >>>> >>> <aru...@gm...> >> >> >>> >> >>>> >>> wrote: >> >> >>> >> >>>> >>> > Thanks for the reply guys. >> >> >>> >> >>>> >>> > Is there any perl script to convert senseval format >> >> >>> >> >>>> >>> > to >> >> >>> >> >>>> >>> > wntagged >> >> >>> >> >>>> >>> > for >> >> >>> >> >>>> >>> > senserelate. 
>> >> >>> >> >>>> >>> > >> >> >>> >> >>>> >>> > >> >> >>> >> >>>> >>> > Arun, >> >> >>> >> >>>> >>> > >> >> >>> >> >>>> >>> > On Sun, Apr 24, 2011 at 6:30 PM, varada kolhatkar >> >> >>> >> >>>> >>> > <var...@gm...> wrote: >> >> >>> >> >>>> >>> >> >> >> >>> >> >>>> >>> >> Yes, semcor-reformat.pl is the script which can be >> >> >>> >> >>>> >>> >> used >> >> >>> >> >>>> >>> >> to >> >> >>> >> >>>> >>> >> generate wsd >> >> >>> >> >>>> >>> >> key file. We also provide scorer2-sort.pl as it's >> >> >>> >> >>>> >>> >> easier >> >> >>> >> >>>> >>> >> to >> >> >>> >> >>>> >>> >> compare >> >> >>> >> >>>> >>> >> sorted >> >> >>> >> >>>> >>> >> lists. >> >> >>> >> >>>> >>> >> extract-semcor-plaintext.pl can be used to extract >> >> >>> >> >>>> >>> >> plain >> >> >>> >> >>>> >>> >> text >> >> >>> >> >>>> >>> >> (text >> >> >>> >> >>>> >>> >> without POS tag info) from semcor. If you want to >> >> >>> >> >>>> >>> >> experiment >> >> >>> >> >>>> >>> >> with >> >> >>> >> >>>> >>> >> the >> >> >>> >> >>>> >>> >> effect >> >> >>> >> >>>> >>> >> of POS tagging on wsd, you can use this script. >> >> >>> >> >>>> >>> >> >> >> >>> >> >>>> >>> >> Varada >> >> >>> >> >>>> >>> >> On Sat, Apr 23, 2011 at 11:13 PM, Ted Pedersen >> >> >>> >> >>>> >>> >> <tpederse@d.umn.edu> >> >> >>> >> >>>> >>> >> wrote: >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> Hi Arun, >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> Try out the following commands to create a key >> >> >>> >> >>>> >>> >>> file...Note >> >> >>> >> >>>> >>> >>> that >> >> >>> >> >>>> >>> >>> I'm >> >> >>> >> >>>> >>> >>> using semcor-sample.txt as the source of the key. >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> This is my key... 
>> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> marengo(22): more semcor-sample.txt >> >> >>> >> >>>> >>> >>> <contextfile concordance=brown> >> >> >>> >> >>>> >>> >>> <context filename=br-e24 paras=yes> >> >> >>> >> >>>> >>> >>> <p pnum=1> >> >> >>> >> >>>> >>> >>> <s snum=1> >> >> >>> >> >>>> >>> >>> <wf cmd=ignore pos=DT>The</wf> >> >> >>> >> >>>> >>> >>> <wf cmd=done pos=JJ lemma=russian wnsn=1 >> >> >>> >> >>>> >>> >>> lexsn=3:01:00::>Russian</wf> >> >> >>> >> >>>> >>> >>> <wf cmd=done pos=NN lemma=gymnast wnsn=1 >> >> >>> >> >>>> >>> >>> lexsn=1:18:00::>gymnasts</wf> >> >> >>> >> >>>> >>> >>> <wf cmd=done pos=IN >> >> >>> >> >>>> >>> >>> ot=idiom>beat_the_tar_out_of</wf> >> >> >>> >> >>>> >>> >>> <wf cmd=ignore pos=DT>the</wf> >> >> >>> >> >>>> >>> >>> </s> >> >> >>> >> >>>> >>> >>> </p> >> >> >>> >> >>>> >>> >>> </context> >> >> >>> >> >>>> >>> >>> </contextfile> >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> marengo(20): semcor-reformat.pl --file >> >> >>> >> >>>> >>> >>> semcor-sample.txt >> >> >>> >> >>>> >>> >>> --key | >> >> >>> >> >>>> >>> >>> scorer2-sort.pl > key.txt >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> marengo(21): cat key.txt >> >> >>> >> >>>> >>> >>> gymnast.n 2 1 >> >> >>> >> >>>> >>> >>> russian.a 1 >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> Assume that these are the answers generated by my >> >> >>> >> >>>> >>> >>> system... >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> marengo(23): more answers.txt >> >> >>> >> >>>> >>> >>> gymnast.n 2 3 >> >> >>> >> >>>> >>> >>> russian.a 1 1 >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> Then I could run the scorer like this... 
>> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> marengo(29): allwords-scorer2.pl --ansfile >> >> >>> >> >>>> >>> >>> answers.txt >> >> >>> >> >>>> >>> >>> --keyfile >> >> >>> >> >>>> >>> >>> key.txt >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> score for "answers.txt" using key "key.txt" : >> >> >>> >> >>>> >>> >>> precision: 0.500 (1 correct of 2 attempted.) >> >> >>> >> >>>> >>> >>> recall: 0.500 (1 correct of 2 in total) >> >> >>> >> >>>> >>> >>> F-measure: 0.500 >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> attempted: 100.00%(2 attempted of 2 in total) >> >> >>> >> >>>> >>> >>> part of speech tag mismatch in attempted >> >> >>> >> >>>> >>> >>> instances: >> >> >>> >> >>>> >>> >>> 0.00% (0 >> >> >>> >> >>>> >>> >>> mismatches of 2 attempted instances) >> >> >>> >> >>>> >>> >>> skipped instances : 0.00% (skipped 0 instances of >> >> >>> >> >>>> >>> >>> total >> >> >>> >> >>>> >>> >>> 2 >> >> >>> >> >>>> >>> >>> instances >> >> >>> >> >>>> >>> >>> because the instance id or the word was not found >> >> >>> >> >>>> >>> >>> in >> >> >>> >> >>>> >>> >>> the >> >> >>> >> >>>> >>> >>> answer >> >> >>> >> >>>> >>> >>> file) >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> Nouns: >> >> >>> >> >>>> >>> >>> Precision : 0.000 (0 correct of 1 nouns >> >> >>> >> >>>> >>> >>> attempted.) >> >> >>> >> >>>> >>> >>> Recall : 0.000 (0 correct of 1 noun instances in >> >> >>> >> >>>> >>> >>> total) >> >> >>> >> >>>> >>> >>> F-measure: 0.000 >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> Verbs: >> >> >>> >> >>>> >>> >>> Precision : 0.000 (0 correct of 0 verbs >> >> >>> >> >>>> >>> >>> attempted.) >> >> >>> >> >>>> >>> >>> Recall : 0.000 (0 correct of 0 verb instances in >> >> >>> >> >>>> >>> >>> total) >> >> >>> >> >>>> >>> >>> F-measure: 0.000 >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> Adjectives: >> >> >>> >> >>>> >>> >>> Precision : 1.000 (1 correct of 1 adjectives >> >> >>> >> >>>> >>> >>> attempted.) 
>> >> >>> >> >>>> >>> >>> Recall : 1.000 (1 correct of 1 adjective instances >> >> >>> >> >>>> >>> >>> in >> >> >>> >> >>>> >>> >>> total) >> >> >>> >> >>>> >>> >>> F-measure: 1.000 >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> Adverbs: >> >> >>> >> >>>> >>> >>> Precision : 0.000 (0 correct of 0 adverbs >> >> >>> >> >>>> >>> >>> attempted.) >> >> >>> >> >>>> >>> >>> Recall : 0.000 (0 correct of 0 adverb instances in >> >> >>> >> >>>> >>> >>> total) >> >> >>> >> >>>> >>> >>> F-measure: 0.000 >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> Confusion Matrix for part of speech tags : >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> Noun Verb Adj >> >> >>> >> >>>> >>> >>> Adv >> >> >>> >> >>>> >>> >>> | Key >> >> >>> >> >>>> >>> >>> Noun 1 0 0 >> >> >>> >> >>>> >>> >>> 0 >> >> >>> >> >>>> >>> >>> | 1 >> >> >>> >> >>>> >>> >>> Verb 0 0 0 >> >> >>> >> >>>> >>> >>> 0 >> >> >>> >> >>>> >>> >>> | 0 >> >> >>> >> >>>> >>> >>> Adj 0 0 1 >> >> >>> >> >>>> >>> >>> 0 >> >> >>> >> >>>> >>> >>> | 1 >> >> >>> >> >>>> >>> >>> Adv 0 0 0 >> >> >>> >> >>>> >>> >>> 0 >> >> >>> >> >>>> >>> >>> | 0 >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> --------------------------------------------------------------------------------|------- >> >> >>> >> >>>> >>> >>> Ans 1 0 1 >> >> >>> >> >>>> >>> >>> 0 >> >> >>> >> >>>> >>> >>> | 2 >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> I hope this is of some help. Please let us know >> >> >>> >> >>>> >>> >>> though >> >> >>> >> >>>> >>> >>> if >> >> >>> >> >>>> >>> >>> there >> >> >>> >> >>>> >>> >>> are >> >> >>> >> >>>> >>> >>> additional issues to resolve! 
>> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> Cordially, >> >> >>> >> >>>> >>> >>> Ted >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> On Sat, Apr 23, 2011 at 7:54 PM, Arun N >> >> >>> >> >>>> >>> >>> <aru...@gm...> >> >> >>> >> >>>> >>> >>> wrote: >> >> >>> >> >>>> >>> >>> > Hi, >> >> >>> >> >>>> >>> >>> > Thanks, for the quick reply. >> >> >>> >> >>>> >>> >>> > >> >> >>> >> >>>> >>> >>> > Actually, I wrote my own scorer. But, I am >> >> >>> >> >>>> >>> >>> > thinking >> >> >>> >> >>>> >>> >>> > of >> >> >>> >> >>>> >>> >>> > using >> >> >>> >> >>>> >>> >>> > the >> >> >>> >> >>>> >>> >>> > scorer >> >> >>> >> >>>> >>> >>> > provided in senserelate package. >> >> >>> >> >>>> >>> >>> > >> >> >>> >> >>>> >>> >>> > btw, It would be great, if you could tell how to >> >> >>> >> >>>> >>> >>> > generate >> >> >>> >> >>>> >>> >>> > the >> >> >>> >> >>>> >>> >>> > key >> >> >>> >> >>>> >>> >>> > file >> >> >>> >> >>>> >>> >>> > for a >> >> >>> >> >>>> >>> >>> > semcor input file. >> >> >>> >> >>>> >>> >>> > There is a perl script >> >> >>> >> >>>> >>> >>> > extract-semcor-plaintext.pl >> >> >>> >> >>>> >>> >>> > --key >> >> >>> >> >>>> >>> >>> > flag, but >> >> >>> >> >>>> >>> >>> > it >> >> >>> >> >>>> >>> >>> > generates a key with just the POS tags but not >> >> >>> >> >>>> >>> >>> > the >> >> >>> >> >>>> >>> >>> > wordnet >> >> >>> >> >>>> >>> >>> > senses. >> >> >>> >> >>>> >>> >>> > >> >> >>> >> >>>> >>> >>> > Do you have any perl code to generate the key >> >> >>> >> >>>> >>> >>> > file(in a >> >> >>> >> >>>> >>> >>> > suitable >> >> >>> >> >>>> >>> >>> > format >> >> >>> >> >>>> >>> >>> > for >> >> >>> >> >>>> >>> >>> > scorer) from a semcor file ? so that it can be >> >> >>> >> >>>> >>> >>> > passed >> >> >>> >> >>>> >>> >>> > to >> >> >>> >> >>>> >>> >>> > the >> >> >>> >> >>>> >>> >>> > allwords-scorer.pl. 
>> >> >>> >> >>>> >>> >>> > >> >> >>> >> >>>> >>> >>> > Arun, >> >> >>> >> >>>> >>> >>> > >> >> >>> >> >>>> >>> >>> > On Sat, Apr 23, 2011 at 3:16 PM, Ted Pedersen >> >> >>> >> >>>> >>> >>> > <tpederse@d.umn.edu> >> >> >>> >> >>>> >>> >>> > wrote: >> >> >>> >> >>>> >>> >>> >> >> >> >>> >> >>>> >>> >>> >> Hi Arun, >> >> >>> >> >>>> >>> >>> >> >> >> >>> >> >>>> >>> >>> >> Nice to hear from you. You may also wish to >> >> >>> >> >>>> >>> >>> >> consult >> >> >>> >> >>>> >>> >>> >> Varada >> >> >>> >> >>>> >>> >>> >> Kolhatkar's >> >> >>> >> >>>> >>> >>> >> MS thesis, which is a more recent use of >> >> >>> >> >>>> >>> >>> >> WordNet::SenseRelate::Allwords. While some >> >> >>> >> >>>> >>> >>> >> differences >> >> >>> >> >>>> >>> >>> >> in >> >> >>> >> >>>> >>> >>> >> results >> >> >>> >> >>>> >>> >>> >> are >> >> >>> >> >>>> >>> >>> >> to be expected as the years go by (due to >> >> >>> >> >>>> >>> >>> >> changes >> >> >>> >> >>>> >>> >>> >> in >> >> >>> >> >>>> >>> >>> >> WordNet >> >> >>> >> >>>> >>> >>> >> for >> >> >>> >> >>>> >>> >>> >> example) they should be fairly minor. >> >> >>> >> >>>> >>> >>> >> >> >> >>> >> >>>> >>> >>> >> An Extended Analysis of a Method of All Words >> >> >>> >> >>>> >>> >>> >> Sense >> >> >>> >> >>>> >>> >>> >> Disambiguation >> >> >>> >> >>>> >>> >>> >> (Kolhatkar) - Master of Science Thesis, >> >> >>> >> >>>> >>> >>> >> Department >> >> >>> >> >>>> >>> >>> >> of >> >> >>> >> >>>> >>> >>> >> Computer >> >> >>> >> >>>> >>> >>> >> Science, University of Minnesota, Duluth, >> >> >>> >> >>>> >>> >>> >> August, >> >> >>> >> >>>> >>> >>> >> 2009. >> >> >>> >> >>>> >>> >>> >> >> >> >>> >> >>>> >>> >>> >> >> >> >>> >> >>>> >>> >>> >> http://www.d.umn.edu/~tpederse/Pubs/varada-thesis.pdf >> >> >>> >> >>>> >>> >>> >> >> >> >>> >> >>>> >>> >>> >> Regarding your results, what were your precision >> >> >>> >> >>>> >>> >>> >> and >> >> >>> >> >>>> >>> >>> >> recall >> >> >>> >> >>>> >>> >>> >> values? 
>> >> >>> >> >>>> >>> >>> >> Did you use the scoring program that comes with >> >> >>> >> >>>> >>> >>> >> WordNet::SenseRelate::AllWords? Also, if you >> >> >>> >> >>>> >>> >>> >> could >> >> >>> >> >>>> >>> >>> >> send >> >> >>> >> >>>> >>> >>> >> the >> >> >>> >> >>>> >>> >>> >> exact >> >> >>> >> >>>> >>> >>> >> command you ran that would help us understand >> >> >>> >> >>>> >>> >>> >> what >> >> >>> >> >>>> >>> >>> >> might >> >> >>> >> >>>> >>> >>> >> be >> >> >>> >> >>>> >>> >>> >> happening. >> >> >>> >> >>>> >>> >>> >> >> >> >>> >> >>>> >>> >>> >> Thanks! >> >> >>> >> >>>> >>> >>> >> Ted >> >> >>> >> >>>> >>> >>> >> >> >> >>> >> >>>> >>> >>> >> On Sat, Apr 23, 2011 at 1:07 PM, Arun N >> >> >>> >> >>>> >>> >>> >> <aru...@gm...> >> >> >>> >> >>>> >>> >>> >> wrote: >> >> >>> >> >>>> >>> >>> >> > Hi Jason and Ted, >> >> >>> >> >>>> >>> >>> >> > I am Arun Nedunchezhian, Graduate Student at >> >> >>> >> >>>> >>> >>> >> > UT >> >> >>> >> >>>> >>> >>> >> > Austin. I >> >> >>> >> >>>> >>> >>> >> > am >> >> >>> >> >>>> >>> >>> >> > working >> >> >>> >> >>>> >>> >>> >> > on a >> >> >>> >> >>>> >>> >>> >> > project which uses >> >> >>> >> >>>> >>> >>> >> > WordNet::SenseRelate::Allwords >> >> >>> >> >>>> >>> >>> >> > package. >> >> >>> >> >>>> >>> >>> >> > I read the results section in your(Jason) MS >> >> >>> >> >>>> >>> >>> >> > thesis. >> >> >>> >> >>>> >>> >>> >> > You >> >> >>> >> >>>> >>> >>> >> > have >> >> >>> >> >>>> >>> >>> >> > mentioned >> >> >>> >> >>>> >>> >>> >> > that >> >> >>> >> >>>> >>> >>> >> > Precision and Recall for Semcor5 (5 documents >> >> >>> >> >>>> >>> >>> >> > from >> >> >>> >> >>>> >>> >>> >> > semcor[br-a01,br-a02,br-k18,br-m02,br-r05]) is >> >> >>> >> >>>> >>> >>> >> > .63 >> >> >>> >> >>>> >>> >>> >> > and >> >> >>> >> >>>> >>> >>> >> > .51. 
>> >> >>> >> >>>> >>> >>> >> > I tried to run SR-AW package over the same >> >> >>> >> >>>> >>> >>> >> > set >> >> >>> >> >>>> >>> >>> >> > of >> >> >>> >> >>>> >>> >>> >> > documents >> >> >>> >> >>>> >>> >>> >> > and I >> >> >>> >> >>>> >>> >>> >> > got >> >> >>> >> >>>> >>> >>> >> > much lesser values for Precision and recall. >> >> >>> >> >>>> >>> >>> >> > Precision = No.of words sense tagged >> >> >>> >> >>>> >>> >>> >> > correctly >> >> >>> >> >>>> >>> >>> >> > / >> >> >>> >> >>>> >>> >>> >> > No.of >> >> >>> >> >>>> >>> >>> >> > words >> >> >>> >> >>>> >>> >>> >> > sense >> >> >>> >> >>>> >>> >>> >> > tagged. >> >> >>> >> >>>> >>> >>> >> > Recall = No.of words sense tagged correctly >> >> >>> >> >>>> >>> >>> >> > / >> >> >>> >> >>>> >>> >>> >> > No.of >> >> >>> >> >>>> >>> >>> >> > words >> >> >>> >> >>>> >>> >>> >> > in >> >> >>> >> >>>> >>> >>> >> > the >> >> >>> >> >>>> >>> >>> >> > documents(tagged as cmd=done). >> >> >>> >> >>>> >>> >>> >> > SR-AW tags word either as <word#pos#senseid> >> >> >>> >> >>>> >>> >>> >> > or >> >> >>> >> >>>> >>> >>> >> > <word#ND>. >> >> >>> >> >>>> >>> >>> >> > No. of words sense tagged = count of >> >> >>> >> >>>> >>> >>> >> > <word#pos#senseid>. >> >> >>> >> >>>> >>> >>> >> > is the above equation correct ? >> >> >>> >> >>>> >>> >>> >> > Is this the way to compute precision and >> >> >>> >> >>>> >>> >>> >> > recall? >> >> >>> >> >>>> >>> >>> >> > What are the tags that you set for SR-AW >> >> >>> >> >>>> >>> >>> >> > execution? >> >> >>> >> >>>> >>> >>> >> > I set the following >> >> >>> >> >>>> >>> >>> >> > Window = 3 >> >> >>> >> >>>> >>> >>> >> > type = WordNet::SenseRelate::lesk >> >> >>> >> >>>> >>> >>> >> > I used a stoplist of articles, prepositions. >> >> >>> >> >>>> >>> >>> >> > >> >> >>> >> >>>> >>> >>> >> > Arun, >> >> >>> >> >>>> >>> >>> >> > >> >> >>> >> >>>> >>> >>> >> > -- >> >> >>> >> >>>> >>> >>> >> > The mind is everything. >> >> >>> >> >>>> >>> >>> >> > What you think you become. 
- Buddha >> >> >>> >> >>>> >>> >>> >> > >> >> >>> >> >>>> >>> >>> >> > >> >> >>> >> >>>> >>> >>> >> >> >> >>> >> >>>> >>> >>> >> >> >> >>> >> >>>> >>> >>> >> >> >> >>> >> >>>> >>> >>> >> -- >> >> >>> >> >>>> >>> >>> >> Ted Pedersen >> >> >>> >> >>>> >>> >>> >> http://www.d.umn.edu/~tpederse >> >> >>> >> >>>> >>> >>> > >> >> >>> >> >>>> >>> >>> > >> >> >>> >> >>>> >>> >>> > >> >> >>> >> >>>> >>> >>> > -- >> >> >>> >> >>>> >>> >>> > The mind is everything. >> >> >>> >> >>>> >>> >>> > What you think you become. - Buddha >> >> >>> >> >>>> >>> >>> > >> >> >>> >> >>>> >>> >>> > >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> -- >> >> >>> >> >>>> >>> >>> Ted Pedersen >> >> >>> >> >>>> >>> >>> http://www.d.umn.edu/~tpederse >> >> >>> >> >>>> >>> >> >> >> >>> >> >>>> >>> > >> >> >>> >> >>>> >>> > >> >> >>> >> >>>> >>> > >> >> >>> >> >>>> >>> > -- >> >> >>> >> >>>> >>> > The mind is everything. >> >> >>> >> >>>> >>> > What you think you become. - Buddha >> >> >>> >> >>>> >>> > >> >> >>> >> >>>> >>> > >> >> >>> >> >>>> >>> >> >> >>> >> >>>> >>> >> >> >>> >> >>>> >>> >> >> >>> >> >>>> >>> -- >> >> >>> >> >>>> >>> Ted Pedersen >> >> >>> >> >>>> >>> http://www.d.umn.edu/~tpederse >> >> >>> >> >>>> >> >> >> >>> >> >>>> >> >> >> >>> >> >>>> >> >> >> >>> >> >>>> >> -- >> >> >>> >> >>>> >> The mind is everything. >> >> >>> >> >>>> >> What you think you become. - Buddha >> >> >>> >> >>>> >> >> >> >>> >> >>>> >> >> >> >>> >> >>>> > >> >> >>> >> >>>> > >> >> >>> >> >>>> > >> >> >>> >> >>>> > -- >> >> >>> >> >>>> > Ted Pedersen >> >> >>> >> >>>> > http://www.d.umn.edu/~tpederse >> >> >>> >> >>>> > >> >> >>> >> >>>> >> >> >>> >> >>>> >> >> >>> >> >>>> >> >> >>> >> >>>> -- >> >> >>> >> >>>> Ted Pedersen >> >> >>> >> >>>> http://www.d.umn.edu/~tpederse >> >> >>> >> >>> >> >> >>> >> >>> >> >> >>> >> >>> >> >> >>> >> >>> -- >> >> >>> >> >>> The mind is everything. >> >> >>> >> >>> What you think you become. 
- Buddha >> >> >>> >> >>> >> >> >>> >> >> >> >> >>> >> >> >> >> >>> >> >> >> >> >>> >> >> -- >> >> >>> >> >> The mind is everything. >> >> >>> >> >> What you think you become. - Buddha >> >> >>> >> >> >> >> >>> >> > >> >> >>> >> > >> >> >>> >> > >> >> >>> >> > -- >> >> >>> >> > The mind is everything. >> >> >>> >> > What you think you become. - Buddha >> >> >>> >> > >> >> >>> >> > >> >> >>> >> >> >> >>> >> >> >> >>> >> >> >> >>> >> -- >> >> >>> >> Ted Pedersen >> >> >>> >> http://www.d.umn.edu/~tpederse >> >> >>> > >> >> >>> > >> >> >>> >> >> >>> >> >> >>> >> >> >>> -- >> >> >>> Ted Pedersen >> >> >>> http://www.d.umn.edu/~tpederse >> >> >> >> >> >> >> >> >> >> >> >> -- >> >> >> The mind is everything. >> >> >> What you think you become. - Buddha >> >> >> >> >> > >> >> > >> >> > >> >> > -- >> >> > The mind is everything. >> >> > What you think you become. - Buddha >> >> > >> >> > >> >> >> >> >> >> >> >> -- >> >> Ted Pedersen >> >> http://www.d.umn.edu/~tpederse >> > >> > >> > >> > -- >> > The mind is everything. >> > What you think you become. - Buddha >> > >> > >> >> >> >> -- >> Ted Pedersen >> http://www.d.umn.edu/~tpederse > > > > -- > The mind is everything. > What you think you become. - Buddha > > -- Ted Pedersen http://www.d.umn.edu/~tpederse |
From: Ted P. <tpederse@d.umn.edu> - 2011-05-11 19:22:43
|
Hi Arun, See comments inline... On Wed, May 11, 2011 at 12:25 PM, Arun N <aru...@gm...> wrote: > Thanks for the reply. > I agree that results improve when backoff option is used. > In page 186, The table has results for noun, verbs, adjectives, and adverbs. > There is no column for all words results. That's correct. > Also, the demo paper's results are for all parts of speech . I guess. Correct - that was a very short paper so we tried to make it as condensed as possible. > whereas the Page 174 has backoff results for window 15 and it has all POS > results. Correct. > My question is, Does the table in demo paper correspond to all POS results ? Yes. > If yes, then Page 174 has the results I guess, because, page 186 doesnot > have all POS results. Yes, I think the overall results are presented first, with the more detailed scores later. > Moreover, In Page 174, table 143 has results with backoff option set and the > results for lesk algorithm is 50.9. Yes, that's true. I think Varada's thesis specifies completely the options we used, so that's your best starting point. I don't recall what options were used in the NAACL demo paper. I think the main thing that could differ might be the window size or perhaps the stoplist used by lesk when measuring it's overlaps, or the stoplist used by wsd.pl. But, I'm confident that the results in the NAACL paper are as reported, and also with Varada's thesis. There is some variation in the experiments that appears to be important, although I can't reconstruct exactly what that is. I think the best thing might be to try to run the wsd.pl program on the Senseval-3 data with the options described in Varada's thesis and see what that results in. I'd be happy to look at those results and comments further (once we know what happens there.) Hope this helps. Good luck, Ted > Arun, > > On Wed, May 11, 2011 at 8:21 AM, Ted Pedersen <tpederse@d.umn.edu> wrote: >> >> Hi Arun, >> >> See comments inline... 
>> >> On Wed, May 11, 2011 at 12:18 AM, Arun N <aru...@gm...> wrote: >> > Hi Guys, >> > I need a small clarification. >> > >> > In the paper >> > http://www.d.umn.edu/~kolha002/publications/pedersenk09-demo-final.pdf >> > the F-measure for Senseval 3 (lesk) with window size 15 is 54 [ P - 54 >> > : R >> > - 53 ] >> > >> > But I cannot find a similar F-measure value in Varadha's thesis. >> > In the thesis >> > http://www.d.umn.edu/~kolha002/publications/Kolhatkar-thesis.pdf >> > >> > page number 173-174 has the results for Window size 15 >> > >> > All the results for window = 15 and lesk measure is not more than 51 >> >> If you look on the last page of Varada's thesis (page 186) I think >> you'll see the source of the results in the NAACL demo paper - note >> that in this case we use the --backoff option, which means default to >> sense 1 when we can't establish anything with the SenseRelate >> algorithm. In the earlier results you mention (pages 173-174) there is >> no such backoff, so you see somewhat lower results. >> >> > >> > Could you tell what options did u set for getting the highest F-measure >> > 54 >> > as reported in the paper ? >> >> See page 186 of Varada's thesis. >> >> > >> > >> > Secondly, >> > Agirre et al >> > http://www.aclweb.org/anthology/E/E09/E09-1005.pdf >> > >> > The authors claim that they get better results when Wordnet 1.7 was >> > used >> > instead of Wordnet 3.0. >> > So, did you guys experiment SR-AW with wordnet 1.7 ? >> >> No. The WordNet group at Princeton doesn't support 1.7 any longer, so >> we don't use it. Overall WordNet 3.0 is much improved on earlier >> versions of WordNet, so I think it makes sense to use it. >> >> However, remember that the SemCor data is based on version 1.5 of >> WordNet, so in some ways it makes sense that an earlier version would >> work better (since as the versions progress those mappings back to 1.5 >> become more and more noisy). 
But, I think that tells us more about the >> evaluation data than it does WordNet. >> >> > Also, I would like to know whether the actual key given for senseval 2 >> > and >> > 3 was based on Wordnet 1.7 or Wordnet 3.0 ? >> >> To be honest I just don't recall. You might need to dig around a bit >> for some answers to that - http://senseval.org will be a good starting >> point for that. Also remember that WordNet 2.0 was quite popular for >> some time, and could have been used (especially for Senseval-2, since >> I don't think 3.0 was released at that time). >> >> > I downloaded Senseval data sets from Rada Mihalcea's website which was >> > actually suggested by Varadha. >> >> Great! That's a very useful resource. >> ( http://www.cse.unt.edu/~rada/downloads.html#sensevalsemcor ) >> >> Hope this helps! >> >> Good luck, >> Ted >> >> > >> > Arun, >> > >> > On Mon, Apr 25, 2011 at 12:16 PM, Arun N <aru...@gm...> wrote: >> >> >> >> Thanks Varadha. This is what I was searching for. >> >> Arun, >> >> >> >> On Mon, Apr 25, 2011 at 10:47 AM, Ted Pedersen <tpederse@d.umn.edu> >> >> wrote: >> >>> >> >>> Hi Varada, >> >>> >> >>> Ah......that's the part I was forgetting!!!!!!!!!!!!!!!!!!!!!!!!!!! :) >> >>> Thanks very much for clarifying this. >> >>> >> >>> Arun, I hope this works out, and please let us know if additional >> >>> questions arise. >> >>> >> >>> Thanks! >> >>> Ted >> >>> >> >>> On Mon, Apr 25, 2011 at 10:23 AM, varada kolhatkar >> >>> <var...@gm...> wrote: >> >>> > Hi Arun, >> >>> > semcor-reformat.pl needs SemCor formatted input. For my experiments >> >>> > I >> >>> > used >> >>> > Senseval data converted into SemCor format by Rada Mihalcea. >> >>> > You can download it from her webpage. 
>> >>> > http://www.cse.unt.edu/~rada/downloads.html >> >>> > Search for 'Senseval-3 English all-words converted into SemCor >> >>> > format' >> >>> > Hope that helps, >> >>> > Varada >> >>> > >> >>> > On Mon, Apr 25, 2011 at 7:39 AM, Ted Pedersen <tpederse@d.umn.edu> >> >>> > wrote: >> >>> >> >> >>> >> Thanks for these additional details Arun! We'll investigate further >> >>> >> and report back asap, I hope later today (Monday). >> >>> >> >> >>> >> Cordially, >> >>> >> Ted >> >>> >> >> >>> >> On Sun, Apr 24, 2011 at 10:16 PM, Arun N <aru...@gm...> >> >>> >> wrote: >> >>> >> > @Ted, >> >>> >> > This is the command that I used and the corresponding error >> >>> >> > message. >> >>> >> > $ semcor-reformat.pl --file english-all-words.xml >> >>> >> > Nameless tag: '?xml version="1.0"?' >> >>> >> > Nameless tag: '!DOCTYPE corpus SYSTEM "all-words.dtd"' >> >>> >> > Use of uninitialized value in subroutine entry at >> >>> >> > /usr/local/bin/semcor-reformat.pl line 222, <FH> chunk 1. >> >>> >> > Can't use string ("") as a subroutine ref while "strict refs" in >> >>> >> > use >> >>> >> > at >> >>> >> > /usr/local/bin/semcor-reformat.pl line 222, <FH> chunk 1. >> >>> >> > Arun, >> >>> >> > On Sun, Apr 24, 2011 at 10:12 PM, Arun N <aru...@gm...> >> >>> >> > wrote: >> >>> >> >> >> >>> >> >> @Varadha >> >>> >> >> The results in Varadha's thesis (p. 193) say that SENSEVAL 3 >> >>> >> >> was >> >>> >> >> given >> >>> >> >> in >> >>> >> >> wntagged format. >> >>> >> >> I just want to know how did you convert that to wntagged format >> >>> >> >> ? >> >>> >> >> the .xml file doesnt have POS tags at all as ted mentioned in >> >>> >> >> the >> >>> >> >> earlier >> >>> >> >> mail >> >>> >> >> So I guess, I am using a wrong file for SENSEVAL 3, but I am >> >>> >> >> sure >> >>> >> >> that >> >>> >> >> I >> >>> >> >> downloaded it from the SENSEVAL 3 site. 
>> >>> >> >> Arun, >> >>> >> >> >> >>> >> >> On Sun, Apr 24, 2011 at 10:08 PM, Arun N <aru...@gm...> >> >>> >> >> wrote: >> >>> >> >>> >> >>> >> >>> One quick clarification, the .xml file that I sent, was the >> >>> >> >>> one >> >>> >> >>> that >> >>> >> >>> Varadha experimented for SENSEVAL 3? >> >>> >> >>> or >> >>> >> >>> Varadha, can u give me link where you downloaded the data set >> >>> >> >>> for >> >>> >> >>> evaluating SR-AW on SENSEVAL 3. >> >>> >> >>> >> >>> >> >>> Arun, >> >>> >> >>> On Sun, Apr 24, 2011 at 9:41 PM, Ted Pedersen >> >>> >> >>> <tpederse@d.umn.edu> >> >>> >> >>> wrote: >> >>> >> >>>> >> >>> >> >>>> Hi Arun, >> >>> >> >>>> >> >>> >> >>>> BTW, I might be wrong about not having this functionality in >> >>> >> >>>> SenseRelate::AllWords. Can you send the command that you try >> >>> >> >>>> to >> >>> >> >>>> run >> >>> >> >>>> and the error that you get? I'll check on a few things in the >> >>> >> >>>> meantime. >> >>> >> >>>> >> >>> >> >>>> Thanks! >> >>> >> >>>> Ted >> >>> >> >>>> >> >>> >> >>>> On Sun, Apr 24, 2011 at 9:33 PM, Ted Pedersen >> >>> >> >>>> <tpederse@d.umn.edu> >> >>> >> >>>> wrote: >> >>> >> >>>> > Hi Arun, >> >>> >> >>>> > >> >>> >> >>>> > You can format input to WordNet::SenseRelate::AllWords as >> >>> >> >>>> > wntagged >> >>> >> >>>> > (four part of speech tags, n, v, a, r) >> >>> >> >>>> > >> >>> >> >>>> > cats#n run#v >> >>> >> >>>> > >> >>> >> >>>> > or raw (plain text) >> >>> >> >>>> > >> >>> >> >>>> > cats run >> >>> >> >>>> > >> >>> >> >>>> > or tagged (penn treebank) >> >>> >> >>>> > >> >>> >> >>>> > cats/NP run/VB >> >>> >> >>>> > >> >>> >> >>>> > Based on what I see in the xml file you sent, I think you >> >>> >> >>>> > probably >> >>> >> >>>> > just want to convert this to a raw text format (where you >> >>> >> >>>> > have >> >>> >> >>>> > one >> >>> >> >>>> > sentence per line, one line per sentence) since there are no >> >>> >> >>>> > pos >> >>> >> >>>> > tags >> >>> >> >>>> > (so no point in using wntagged 
or tagged). >> >>> >> >>>> > >> >>> >> >>>> > We don't have a converter from SensEval-3 format in >> >>> >> >>>> > SenseRelate::AllWords...however, I think I might know of one >> >>> >> >>>> > I >> >>> >> >>>> > can >> >>> >> >>>> > refer you to....let me check on that and report back on >> >>> >> >>>> > Monday. >> >>> >> >>>> > >> >>> >> >>>> > Cordially, >> >>> >> >>>> > Ted >> >>> >> >>>> > >> >>> >> >>>> > On Sun, Apr 24, 2011 at 9:21 PM, Arun N >> >>> >> >>>> > <aru...@gm...> >> >>> >> >>>> > wrote: >> >>> >> >>>> >> I am planning to experiment on SENSEVAL 3 all words data >> >>> >> >>>> >> set. >> >>> >> >>>> >> But, it is in a different format from Semcor. >> >>> >> >>>> >> When I tried to use extract-semcor.pl on the file, it >> >>> >> >>>> >> showed >> >>> >> >>>> >> some >> >>> >> >>>> >> error. >> >>> >> >>>> >> I downloaded the senseval3 all words test data from site >> >>> >> >>>> >> http://www.senseval.org/senseval3/data.html >> >>> >> >>>> >> I have also attached the file. >> >>> >> >>>> >> I just want to know how should I format SENSEVAL 3 all >> >>> >> >>>> >> words >> >>> >> >>>> >> data >> >>> >> >>>> >> and >> >>> >> >>>> >> give >> >>> >> >>>> >> to wsd.pl ? >> >>> >> >>>> >> Arun, >> >>> >> >>>> >> >> >>> >> >>>> >> On Sun, Apr 24, 2011 at 8:20 PM, Ted Pedersen >> >>> >> >>>> >> <tpederse@d.umn.edu> >> >>> >> >>>> >> wrote: >> >>> >> >>>> >>> >> >>> >> >>>> >>> Hi Arun, >> >>> >> >>>> >>> >> >>> >> >>>> >>> I'm not sure what you mean by senseval...do you mean the >> >>> >> >>>> >>> semcor >> >>> >> >>>> >>> format? >> >>> >> >>>> >>> Or... ? >> >>> >> >>>> >>> >> >>> >> >>>> >>> BTW, for wntagged, do you mean text that looks like this: >> >>> >> >>>> >>> >> >>> >> >>>> >>> cats#n run#v fast#r >> >>> >> >>>> >>> >> >>> >> >>>> >>> Just wanted to clarify since there are a few different >> >>> >> >>>> >>> formats... >> >>> >> >>>> >>> >> >>> >> >>>> >>> Thanks! 
>> >>> >> >>>> >>> Ted >> >>> >> >>>> >>> >> >>> >> >>>> >>> >> >>> >> >>>> >>> On Sun, Apr 24, 2011 at 7:41 PM, Arun N >> >>> >> >>>> >>> <aru...@gm...> >> >>> >> >>>> >>> wrote: >> >>> >> >>>> >>> > Thanks for the reply guys. >> >>> >> >>>> >>> > Is there any perl script to convert senseval format to >> >>> >> >>>> >>> > wntagged >> >>> >> >>>> >>> > for >> >>> >> >>>> >>> > senserelate. >> >>> >> >>>> >>> > >> >>> >> >>>> >>> > >> >>> >> >>>> >>> > Arun, >> >>> >> >>>> >>> > >> >>> >> >>>> >>> > On Sun, Apr 24, 2011 at 6:30 PM, varada kolhatkar >> >>> >> >>>> >>> > <var...@gm...> wrote: >> >>> >> >>>> >>> >> >> >>> >> >>>> >>> >> Yes, semcor-reformat.pl is the script which can be used >> >>> >> >>>> >>> >> to >> >>> >> >>>> >>> >> generate wsd >> >>> >> >>>> >>> >> key file. We also provide scorer2-sort.pl as it's >> >>> >> >>>> >>> >> easier >> >>> >> >>>> >>> >> to >> >>> >> >>>> >>> >> compare >> >>> >> >>>> >>> >> sorted >> >>> >> >>>> >>> >> lists. >> >>> >> >>>> >>> >> extract-semcor-plaintext.pl can be used to extract >> >>> >> >>>> >>> >> plain >> >>> >> >>>> >>> >> text >> >>> >> >>>> >>> >> (text >> >>> >> >>>> >>> >> without POS tag info) from semcor. If you want to >> >>> >> >>>> >>> >> experiment >> >>> >> >>>> >>> >> with >> >>> >> >>>> >>> >> the >> >>> >> >>>> >>> >> effect >> >>> >> >>>> >>> >> of POS tagging on wsd, you can use this script. >> >>> >> >>>> >>> >> >> >>> >> >>>> >>> >> Varada >> >>> >> >>>> >>> >> On Sat, Apr 23, 2011 at 11:13 PM, Ted Pedersen >> >>> >> >>>> >>> >> <tpederse@d.umn.edu> >> >>> >> >>>> >>> >> wrote: >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> Hi Arun, >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> Try out the following commands to create a key >> >>> >> >>>> >>> >>> file...Note >> >>> >> >>>> >>> >>> that >> >>> >> >>>> >>> >>> I'm >> >>> >> >>>> >>> >>> using semcor-sample.txt as the source of the key. >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> This is my key... 
>> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> marengo(22): more semcor-sample.txt >> >>> >> >>>> >>> >>> <contextfile concordance=brown> >> >>> >> >>>> >>> >>> <context filename=br-e24 paras=yes> >> >>> >> >>>> >>> >>> <p pnum=1> >> >>> >> >>>> >>> >>> <s snum=1> >> >>> >> >>>> >>> >>> <wf cmd=ignore pos=DT>The</wf> >> >>> >> >>>> >>> >>> <wf cmd=done pos=JJ lemma=russian wnsn=1 >> >>> >> >>>> >>> >>> lexsn=3:01:00::>Russian</wf> >> >>> >> >>>> >>> >>> <wf cmd=done pos=NN lemma=gymnast wnsn=1 >> >>> >> >>>> >>> >>> lexsn=1:18:00::>gymnasts</wf> >> >>> >> >>>> >>> >>> <wf cmd=done pos=IN ot=idiom>beat_the_tar_out_of</wf> >> >>> >> >>>> >>> >>> <wf cmd=ignore pos=DT>the</wf> >> >>> >> >>>> >>> >>> </s> >> >>> >> >>>> >>> >>> </p> >> >>> >> >>>> >>> >>> </context> >> >>> >> >>>> >>> >>> </contextfile> >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> marengo(20): semcor-reformat.pl --file >> >>> >> >>>> >>> >>> semcor-sample.txt >> >>> >> >>>> >>> >>> --key | >> >>> >> >>>> >>> >>> scorer2-sort.pl > key.txt >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> marengo(21): cat key.txt >> >>> >> >>>> >>> >>> gymnast.n 2 1 >> >>> >> >>>> >>> >>> russian.a 1 >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> Assume that these are the answers generated by my >> >>> >> >>>> >>> >>> system... >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> marengo(23): more answers.txt >> >>> >> >>>> >>> >>> gymnast.n 2 3 >> >>> >> >>>> >>> >>> russian.a 1 1 >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> Then I could run the scorer like this... >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> marengo(29): allwords-scorer2.pl --ansfile answers.txt >> >>> >> >>>> >>> >>> --keyfile >> >>> >> >>>> >>> >>> key.txt >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> score for "answers.txt" using key "key.txt" : >> >>> >> >>>> >>> >>> precision: 0.500 (1 correct of 2 attempted.) 
>> >>> >> >>>> >>> >>> recall: 0.500 (1 correct of 2 in total) >> >>> >> >>>> >>> >>> F-measure: 0.500 >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> attempted: 100.00%(2 attempted of 2 in total) >> >>> >> >>>> >>> >>> part of speech tag mismatch in attempted instances: >> >>> >> >>>> >>> >>> 0.00% (0 >> >>> >> >>>> >>> >>> mismatches of 2 attempted instances) >> >>> >> >>>> >>> >>> skipped instances : 0.00% (skipped 0 instances of >> >>> >> >>>> >>> >>> total >> >>> >> >>>> >>> >>> 2 >> >>> >> >>>> >>> >>> instances >> >>> >> >>>> >>> >>> because the instance id or the word was not found in >> >>> >> >>>> >>> >>> the >> >>> >> >>>> >>> >>> answer >> >>> >> >>>> >>> >>> file) >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> Nouns: >> >>> >> >>>> >>> >>> Precision : 0.000 (0 correct of 1 nouns attempted.) >> >>> >> >>>> >>> >>> Recall : 0.000 (0 correct of 1 noun instances in >> >>> >> >>>> >>> >>> total) >> >>> >> >>>> >>> >>> F-measure: 0.000 >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> Verbs: >> >>> >> >>>> >>> >>> Precision : 0.000 (0 correct of 0 verbs attempted.) >> >>> >> >>>> >>> >>> Recall : 0.000 (0 correct of 0 verb instances in >> >>> >> >>>> >>> >>> total) >> >>> >> >>>> >>> >>> F-measure: 0.000 >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> Adjectives: >> >>> >> >>>> >>> >>> Precision : 1.000 (1 correct of 1 adjectives >> >>> >> >>>> >>> >>> attempted.) >> >>> >> >>>> >>> >>> Recall : 1.000 (1 correct of 1 adjective instances in >> >>> >> >>>> >>> >>> total) >> >>> >> >>>> >>> >>> F-measure: 1.000 >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> Adverbs: >> >>> >> >>>> >>> >>> Precision : 0.000 (0 correct of 0 adverbs attempted.) 
>> >>> >> >>>> >>> >>> Recall : 0.000 (0 correct of 0 adverb instances in >> >>> >> >>>> >>> >>> total) >> >>> >> >>>> >>> >>> F-measure: 0.000 >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> Confusion Matrix for part of speech tags : >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> Noun Verb Adj >> >>> >> >>>> >>> >>> Adv >> >>> >> >>>> >>> >>> | Key >> >>> >> >>>> >>> >>> Noun 1 0 0 >> >>> >> >>>> >>> >>> 0 >> >>> >> >>>> >>> >>> | 1 >> >>> >> >>>> >>> >>> Verb 0 0 0 >> >>> >> >>>> >>> >>> 0 >> >>> >> >>>> >>> >>> | 0 >> >>> >> >>>> >>> >>> Adj 0 0 1 >> >>> >> >>>> >>> >>> 0 >> >>> >> >>>> >>> >>> | 1 >> >>> >> >>>> >>> >>> Adv 0 0 0 >> >>> >> >>>> >>> >>> 0 >> >>> >> >>>> >>> >>> | 0 >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> --------------------------------------------------------------------------------|------- >> >>> >> >>>> >>> >>> Ans 1 0 1 >> >>> >> >>>> >>> >>> 0 >> >>> >> >>>> >>> >>> | 2 >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> I hope this is of some help. Please let us know though >> >>> >> >>>> >>> >>> if >> >>> >> >>>> >>> >>> there >> >>> >> >>>> >>> >>> are >> >>> >> >>>> >>> >>> additional issues to resolve! >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> Cordially, >> >>> >> >>>> >>> >>> Ted >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> On Sat, Apr 23, 2011 at 7:54 PM, Arun N >> >>> >> >>>> >>> >>> <aru...@gm...> >> >>> >> >>>> >>> >>> wrote: >> >>> >> >>>> >>> >>> > Hi, >> >>> >> >>>> >>> >>> > Thanks, for the quick reply. >> >>> >> >>>> >>> >>> > >> >>> >> >>>> >>> >>> > Actually, I wrote my own scorer. But, I am thinking >> >>> >> >>>> >>> >>> > of >> >>> >> >>>> >>> >>> > using >> >>> >> >>>> >>> >>> > the >> >>> >> >>>> >>> >>> > scorer >> >>> >> >>>> >>> >>> > provided in senserelate package. 
>> >>> >> >>>> >>> >>> > >> >>> >> >>>> >>> >>> > btw, It would be great, if you could tell how to >> >>> >> >>>> >>> >>> > generate >> >>> >> >>>> >>> >>> > the >> >>> >> >>>> >>> >>> > key >> >>> >> >>>> >>> >>> > file >> >>> >> >>>> >>> >>> > for a >> >>> >> >>>> >>> >>> > semcor input file. >> >>> >> >>>> >>> >>> > There is a perl script extract-semcor-plaintext.pl >> >>> >> >>>> >>> >>> > --key >> >>> >> >>>> >>> >>> > flag, but >> >>> >> >>>> >>> >>> > it >> >>> >> >>>> >>> >>> > generates a key with just the POS tags but not the >> >>> >> >>>> >>> >>> > wordnet >> >>> >> >>>> >>> >>> > senses. >> >>> >> >>>> >>> >>> > >> >>> >> >>>> >>> >>> > Do you have any perl code to generate the key >> >>> >> >>>> >>> >>> > file(in a >> >>> >> >>>> >>> >>> > suitable >> >>> >> >>>> >>> >>> > format >> >>> >> >>>> >>> >>> > for >> >>> >> >>>> >>> >>> > scorer) from a semcor file ? so that it can be >> >>> >> >>>> >>> >>> > passed >> >>> >> >>>> >>> >>> > to >> >>> >> >>>> >>> >>> > the >> >>> >> >>>> >>> >>> > allwords-scorer.pl. >> >>> >> >>>> >>> >>> > >> >>> >> >>>> >>> >>> > Arun, >> >>> >> >>>> >>> >>> > >> >>> >> >>>> >>> >>> > On Sat, Apr 23, 2011 at 3:16 PM, Ted Pedersen >> >>> >> >>>> >>> >>> > <tpederse@d.umn.edu> >> >>> >> >>>> >>> >>> > wrote: >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> >> Hi Arun, >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> >> Nice to hear from you. You may also wish to consult >> >>> >> >>>> >>> >>> >> Varada >> >>> >> >>>> >>> >>> >> Kolhatkar's >> >>> >> >>>> >>> >>> >> MS thesis, which is a more recent use of >> >>> >> >>>> >>> >>> >> WordNet::SenseRelate::Allwords. 
While some >> >>> >> >>>> >>> >>> >> differences >> >>> >> >>>> >>> >>> >> in >> >>> >> >>>> >>> >>> >> results >> >>> >> >>>> >>> >>> >> are >> >>> >> >>>> >>> >>> >> to be expected as the years go by (due to changes >> >>> >> >>>> >>> >>> >> in >> >>> >> >>>> >>> >>> >> WordNet >> >>> >> >>>> >>> >>> >> for >> >>> >> >>>> >>> >>> >> example) they should be fairly minor. >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> >> An Extended Analysis of a Method of All Words Sense >> >>> >> >>>> >>> >>> >> Disambiguation >> >>> >> >>>> >>> >>> >> (Kolhatkar) - Master of Science Thesis, Department >> >>> >> >>>> >>> >>> >> of >> >>> >> >>>> >>> >>> >> Computer >> >>> >> >>>> >>> >>> >> Science, University of Minnesota, Duluth, August, >> >>> >> >>>> >>> >>> >> 2009. >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> >> http://www.d.umn.edu/~tpederse/Pubs/varada-thesis.pdf >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> >> Regarding your results, what were your precision >> >>> >> >>>> >>> >>> >> and >> >>> >> >>>> >>> >>> >> recall >> >>> >> >>>> >>> >>> >> values? >> >>> >> >>>> >>> >>> >> Did you use the scoring program that comes with >> >>> >> >>>> >>> >>> >> WordNet::SenseRelate::AllWords? Also, if you could >> >>> >> >>>> >>> >>> >> send >> >>> >> >>>> >>> >>> >> the >> >>> >> >>>> >>> >>> >> exact >> >>> >> >>>> >>> >>> >> command you ran that would help us understand what >> >>> >> >>>> >>> >>> >> might >> >>> >> >>>> >>> >>> >> be >> >>> >> >>>> >>> >>> >> happening. >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> >> Thanks! >> >>> >> >>>> >>> >>> >> Ted >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> >> On Sat, Apr 23, 2011 at 1:07 PM, Arun N >> >>> >> >>>> >>> >>> >> <aru...@gm...> >> >>> >> >>>> >>> >>> >> wrote: >> >>> >> >>>> >>> >>> >> > Hi Jason and Ted, >> >>> >> >>>> >>> >>> >> > I am Arun Nedunchezhian, Graduate Student at UT >> >>> >> >>>> >>> >>> >> > Austin. 
I >> >>> >> >>>> >>> >>> >> > am >> >>> >> >>>> >>> >>> >> > working >> >>> >> >>>> >>> >>> >> > on a >> >>> >> >>>> >>> >>> >> > project which uses WordNet::SenseRelate::Allwords >> >>> >> >>>> >>> >>> >> > package. >> >>> >> >>>> >>> >>> >> > I read the results section in your(Jason) MS >> >>> >> >>>> >>> >>> >> > thesis. >> >>> >> >>>> >>> >>> >> > You >> >>> >> >>>> >>> >>> >> > have >> >>> >> >>>> >>> >>> >> > mentioned >> >>> >> >>>> >>> >>> >> > that >> >>> >> >>>> >>> >>> >> > Precision and Recall for Semcor5 (5 documents >> >>> >> >>>> >>> >>> >> > from >> >>> >> >>>> >>> >>> >> > semcor[br-a01,br-a02,br-k18,br-m02,br-r05]) is >> >>> >> >>>> >>> >>> >> > .63 >> >>> >> >>>> >>> >>> >> > and >> >>> >> >>>> >>> >>> >> > .51. >> >>> >> >>>> >>> >>> >> > I tried to run SR-AW package over the same set >> >>> >> >>>> >>> >>> >> > of >> >>> >> >>>> >>> >>> >> > documents >> >>> >> >>>> >>> >>> >> > and I >> >>> >> >>>> >>> >>> >> > got >> >>> >> >>>> >>> >>> >> > much lesser values for Precision and recall. >> >>> >> >>>> >>> >>> >> > Precision = No.of words sense tagged correctly >> >>> >> >>>> >>> >>> >> > / >> >>> >> >>>> >>> >>> >> > No.of >> >>> >> >>>> >>> >>> >> > words >> >>> >> >>>> >>> >>> >> > sense >> >>> >> >>>> >>> >>> >> > tagged. >> >>> >> >>>> >>> >>> >> > Recall = No.of words sense tagged correctly / >> >>> >> >>>> >>> >>> >> > No.of >> >>> >> >>>> >>> >>> >> > words >> >>> >> >>>> >>> >>> >> > in >> >>> >> >>>> >>> >>> >> > the >> >>> >> >>>> >>> >>> >> > documents(tagged as cmd=done). >> >>> >> >>>> >>> >>> >> > SR-AW tags word either as <word#pos#senseid> or >> >>> >> >>>> >>> >>> >> > <word#ND>. >> >>> >> >>>> >>> >>> >> > No. of words sense tagged = count of >> >>> >> >>>> >>> >>> >> > <word#pos#senseid>. >> >>> >> >>>> >>> >>> >> > is the above equation correct ? >> >>> >> >>>> >>> >>> >> > Is this the way to compute precision and recall? 
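[Editor's note] The precision and recall definitions Arun gives above can be sketched in a few lines. This is a Python sketch with invented names and data shapes (the package's own allwords-scorer2.pl, in Perl, is the authoritative scorer): attempted instances are the word#pos#senseid outputs, word#ND outputs are skipped, and recall divides by all cmd=done instances.

```python
def score(answers, key):
    """Precision/recall/F-measure per the definitions in this thread.

    answers: instance id -> SR-AW output tag; "word#pos#senseid" counts
             as attempted, "word#ND" (no decision) does not.
    key:     instance id -> gold tag, one per cmd=done word in SemCor.
    The function name and these data shapes are hypothetical.
    """
    attempted = {i: s for i, s in answers.items() if not s.endswith("#ND")}
    correct = sum(1 for i, s in attempted.items() if key.get(i) == s)
    precision = correct / len(attempted) if attempted else 0.0
    recall = correct / len(key) if key else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f

# One of two answers correct, mirroring the scorer demo in this thread:
# score({"1": "gymnast#n#3", "2": "russian#a#1"},
#       {"1": "gymnast#n#1", "2": "russian#a#1"}) -> (0.5, 0.5, 0.5)
```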
>> >>> >> >>>> >>> >>> >> > What are the tags that you set for SR-AW >> >>> >> >>>> >>> >>> >> > execution? >> >>> >> >>>> >>> >>> >> > I set the following >> >>> >> >>>> >>> >>> >> > Window = 3 >> >>> >> >>>> >>> >>> >> > type = WordNet::SenseRelate::lesk >> >>> >> >>>> >>> >>> >> > I used a stoplist of articles, prepositions. >> >>> >> >>>> >>> >>> >> > >> >>> >> >>>> >>> >>> >> > Arun, >> >>> >> >>>> >>> >>> >> > >> >>> >> >>>> >>> >>> >> > -- >> >>> >> >>>> >>> >>> >> > The mind is everything. >> >>> >> >>>> >>> >>> >> > What you think you become. - Buddha >> >>> >> >>>> >>> >>> >> > >> >>> >> >>>> >>> >>> >> > >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> >> >> >>> >> >>>> >>> >>> >> -- >> >>> >> >>>> >>> >>> >> Ted Pedersen >> >>> >> >>>> >>> >>> >> http://www.d.umn.edu/~tpederse >> >>> >> >>>> >>> >>> > >> >>> >> >>>> >>> >>> > >> >>> >> >>>> >>> >>> > >> >>> >> >>>> >>> >>> > -- >> >>> >> >>>> >>> >>> > The mind is everything. >> >>> >> >>>> >>> >>> > What you think you become. - Buddha >> >>> >> >>>> >>> >>> > >> >>> >> >>>> >>> >>> > >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> -- >> >>> >> >>>> >>> >>> Ted Pedersen >> >>> >> >>>> >>> >>> http://www.d.umn.edu/~tpederse >> >>> >> >>>> >>> >> >> >>> >> >>>> >>> > >> >>> >> >>>> >>> > >> >>> >> >>>> >>> > >> >>> >> >>>> >>> > -- >> >>> >> >>>> >>> > The mind is everything. >> >>> >> >>>> >>> > What you think you become. - Buddha >> >>> >> >>>> >>> > >> >>> >> >>>> >>> > >> >>> >> >>>> >>> >> >>> >> >>>> >>> >> >>> >> >>>> >>> >> >>> >> >>>> >>> -- >> >>> >> >>>> >>> Ted Pedersen >> >>> >> >>>> >>> http://www.d.umn.edu/~tpederse >> >>> >> >>>> >> >> >>> >> >>>> >> >> >>> >> >>>> >> >> >>> >> >>>> >> -- >> >>> >> >>>> >> The mind is everything. >> >>> >> >>>> >> What you think you become. 
- Buddha >> >>> >> >>>> >> >> >>> >> >>>> >> >> >>> >> >>>> > >> >>> >> >>>> > >> >>> >> >>>> > >> >>> >> >>>> > -- >> >>> >> >>>> > Ted Pedersen >> >>> >> >>>> > http://www.d.umn.edu/~tpederse >> >>> >> >>>> > >> >>> >> >>>> >> >>> >> >>>> >> >>> >> >>>> >> >>> >> >>>> -- >> >>> >> >>>> Ted Pedersen >> >>> >> >>>> http://www.d.umn.edu/~tpederse >> >>> >> >>> >> >>> >> >>> >> >>> >> >>> >> >>> >> >>> -- >> >>> >> >>> The mind is everything. >> >>> >> >>> What you think you become. - Buddha >> >>> >> >>> >> >>> >> >> >> >>> >> >> >> >>> >> >> >> >>> >> >> -- >> >>> >> >> The mind is everything. >> >>> >> >> What you think you become. - Buddha >> >>> >> >> >> >>> >> > >> >>> >> > >> >>> >> > >> >>> >> > -- >> >>> >> > The mind is everything. >> >>> >> > What you think you become. - Buddha >> >>> >> > >> >>> >> > >> >>> >> >> >>> >> >> >>> >> >> >>> >> -- >> >>> >> Ted Pedersen >> >>> >> http://www.d.umn.edu/~tpederse >> >>> > >> >>> > >> >>> >> >>> >> >>> >> >>> -- >> >>> Ted Pedersen >> >>> http://www.d.umn.edu/~tpederse >> >> >> >> >> >> >> >> -- >> >> The mind is everything. >> >> What you think you become. - Buddha >> >> >> > >> > >> > >> > -- >> > The mind is everything. >> > What you think you become. - Buddha >> > >> > >> >> >> >> -- >> Ted Pedersen >> http://www.d.umn.edu/~tpederse > > > > -- > The mind is everything. > What you think you become. - Buddha > > -- Ted Pedersen http://www.d.umn.edu/~tpederse |
From: Ted P. <tpederse@d.umn.edu> - 2011-05-11 13:21:26
|
Hi Arun, See comments inline... On Wed, May 11, 2011 at 12:18 AM, Arun N <aru...@gm...> wrote: > Hi Guys, > I need a small clarification. > > In the paper > http://www.d.umn.edu/~kolha002/publications/pedersenk09-demo-final.pdf > the F-measure for Senseval 3 (lesk) with window size 15 is 54 [ P - 54 : R > - 53 ] > > But I cannot find a similar F-measure value in Varadha's thesis. > In the thesis > http://www.d.umn.edu/~kolha002/publications/Kolhatkar-thesis.pdf > > page number 173-174 has the results for Window size 15 > > All the results for window = 15 and lesk measure is not more than 51 If you look on the last page of Varada's thesis (page 186) I think you'll see the source of the results in the NAACL demo paper - note that in this case we use the --backoff option, which means default to sense 1 when we can't establish anything with the SenseRelate algorithm. In the earlier results you mention (pages 173-174) there is no such backoff, so you see somewhat lower results. > > Could you tell what options did u set for getting the highest F-measure 54 > as reported in the paper ? See page 186 of Varada's thesis. > > > Secondly, > Agirre et al > http://www.aclweb.org/anthology/E/E09/E09-1005.pdf > > The authors claim that they get better results when Wordnet 1.7 was used > instead of Wordnet 3.0. > So, did you guys experiment SR-AW with wordnet 1.7 ? No. The WordNet group at Princeton doesn't support 1.7 any longer, so we don't use it. Overall WordNet 3.0 is much improved on earlier versions of WordNet, so I think it makes sense to use it. However, remember that the SemCor data is based on version 1.5 of WordNet, so in some ways it makes sense that an earlier version would work better (since as the versions progress those mappings back to 1.5 become more and more noisy). But, I think that tells us more about the evaluation data than it does WordNet. 
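[Editor's note] The effect of --backoff that Ted describes (fall back to WordNet's first sense whenever SenseRelate cannot choose one) can be illustrated with a short sketch. This is not the wsd.pl implementation, which is Perl; the function name and data shape are invented for illustration.

```python
def apply_backoff(assignments):
    """First-sense backoff: any word the SenseRelate algorithm left
    unassigned (sense None) defaults to WordNet sense 1. Every instance
    then receives an answer, which raises recall and is why the backoff
    scores on page 186 exceed the no-backoff scores on pages 173-174."""
    return [(word, pos, 1 if sense is None else sense)
            for (word, pos, sense) in assignments]

# apply_backoff([("cat", "n", 2), ("run", "v", None)])
# -> [("cat", "n", 2), ("run", "v", 1)]
```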
> Also, I would like to know whether the actual key given for senseval 2 and > 3 was based on Wordnet 1.7 or Wordnet 3.0 ? To be honest I just don't recall. You might need to dig around a bit for some answers to that - http://senseval.org will be a good starting point for that. Also remember that WordNet 2.0 was quite popular for some time, and could have been used (especially for Senseval-2, since I don't think 3.0 was released at that time). > I downloaded Senseval data sets from Rada Mihalcea's website which was > actually suggested by Varadha. Great! That's a very useful resource. ( http://www.cse.unt.edu/~rada/downloads.html#sensevalsemcor ) Hope this helps! Good luck, Ted > > Arun, > > On Mon, Apr 25, 2011 at 12:16 PM, Arun N <aru...@gm...> wrote: >> >> Thanks Varadha. This is what I was searching for. >> Arun, >> >> On Mon, Apr 25, 2011 at 10:47 AM, Ted Pedersen <tpederse@d.umn.edu> wrote: >>> >>> Hi Varada, >>> >>> Ah......that's the part I was forgetting!!!!!!!!!!!!!!!!!!!!!!!!!!! :) >>> Thanks very much for clarifying this. >>> >>> Arun, I hope this works out, and please let us know if additional >>> questions arise. >>> >>> Thanks! >>> Ted >>> >>> On Mon, Apr 25, 2011 at 10:23 AM, varada kolhatkar >>> <var...@gm...> wrote: >>> > Hi Arun, >>> > semcor-reformat.pl needs SemCor formatted input. For my experiments I >>> > used >>> > Senseval data converted into SemCor format by Rada Mihalcea. >>> > You can download it from her webpage. >>> > http://www.cse.unt.edu/~rada/downloads.html >>> > Search for 'Senseval-3 English all-words converted into SemCor format' >>> > Hope that helps, >>> > Varada >>> > >>> > On Mon, Apr 25, 2011 at 7:39 AM, Ted Pedersen <tpederse@d.umn.edu> >>> > wrote: >>> >> >>> >> Thanks for these additional details Arun! We'll investigate further >>> >> and report back asap, I hope later today (Monday). 
>>> >> >>> >> Cordially, >>> >> Ted >>> >> >>> >> On Sun, Apr 24, 2011 at 10:16 PM, Arun N <aru...@gm...> wrote: >>> >> > @Ted, >>> >> > This is the command that I used and the corresponding error message. >>> >> > $ semcor-reformat.pl --file english-all-words.xml >>> >> > Nameless tag: '?xml version="1.0"?' >>> >> > Nameless tag: '!DOCTYPE corpus SYSTEM "all-words.dtd"' >>> >> > Use of uninitialized value in subroutine entry at >>> >> > /usr/local/bin/semcor-reformat.pl line 222, <FH> chunk 1. >>> >> > Can't use string ("") as a subroutine ref while "strict refs" in use >>> >> > at >>> >> > /usr/local/bin/semcor-reformat.pl line 222, <FH> chunk 1. >>> >> > Arun, >>> >> > On Sun, Apr 24, 2011 at 10:12 PM, Arun N <aru...@gm...> >>> >> > wrote: >>> >> >> >>> >> >> @Varadha >>> >> >> The results in Varadha's thesis (p. 193) say that SENSEVAL 3 was >>> >> >> given >>> >> >> in >>> >> >> wntagged format. >>> >> >> I just want to know how did you convert that to wntagged format ? >>> >> >> the .xml file doesnt have POS tags at all as ted mentioned in the >>> >> >> earlier >>> >> >> mail >>> >> >> So I guess, I am using a wrong file for SENSEVAL 3, but I am sure >>> >> >> that >>> >> >> I >>> >> >> downloaded it from the SENSEVAL 3 site. >>> >> >> Arun, >>> >> >> >>> >> >> On Sun, Apr 24, 2011 at 10:08 PM, Arun N <aru...@gm...> >>> >> >> wrote: >>> >> >>> >>> >> >>> One quick clarification, the .xml file that I sent, was the one >>> >> >>> that >>> >> >>> Varadha experimented for SENSEVAL 3? >>> >> >>> or >>> >> >>> Varadha, can u give me link where you downloaded the data set for >>> >> >>> evaluating SR-AW on SENSEVAL 3. >>> >> >>> >>> >> >>> Arun, >>> >> >>> On Sun, Apr 24, 2011 at 9:41 PM, Ted Pedersen <tpederse@d.umn.edu> >>> >> >>> wrote: >>> >> >>>> >>> >> >>>> Hi Arun, >>> >> >>>> >>> >> >>>> BTW, I might be wrong about not having this functionality in >>> >> >>>> SenseRelate::AllWords. 
Can you send the command that you try to >>> >> >>>> run >>> >> >>>> and the error that you get? I'll check on a few things in the >>> >> >>>> meantime. >>> >> >>>> >>> >> >>>> Thanks! >>> >> >>>> Ted >>> >> >>>> >>> >> >>>> On Sun, Apr 24, 2011 at 9:33 PM, Ted Pedersen >>> >> >>>> <tpederse@d.umn.edu> >>> >> >>>> wrote: >>> >> >>>> > Hi Arun, >>> >> >>>> > >>> >> >>>> > You can format input to WordNet::SenseRelate::AllWords as >>> >> >>>> > wntagged >>> >> >>>> > (four part of speech tags, n, v, a, r) >>> >> >>>> > >>> >> >>>> > cats#n run#v >>> >> >>>> > >>> >> >>>> > or raw (plain text) >>> >> >>>> > >>> >> >>>> > cats run >>> >> >>>> > >>> >> >>>> > or tagged (penn treebank) >>> >> >>>> > >>> >> >>>> > cats/NP run/VB >>> >> >>>> > >>> >> >>>> > Based on what I see in the xml file you sent, I think you >>> >> >>>> > probably >>> >> >>>> > just want to convert this to a raw text format (where you have >>> >> >>>> > one >>> >> >>>> > sentence per line, one line per sentence) since there are no >>> >> >>>> > pos >>> >> >>>> > tags >>> >> >>>> > (so no point in using wntagged or tagged). >>> >> >>>> > >>> >> >>>> > We don't have a converter from SensEval-3 format in >>> >> >>>> > SenseRelate::AllWords...however, I think I might know of one I >>> >> >>>> > can >>> >> >>>> > refer you to....let me check on that and report back on Monday. >>> >> >>>> > >>> >> >>>> > Cordially, >>> >> >>>> > Ted >>> >> >>>> > >>> >> >>>> > On Sun, Apr 24, 2011 at 9:21 PM, Arun N <aru...@gm...> >>> >> >>>> > wrote: >>> >> >>>> >> I am planning to experiment on SENSEVAL 3 all words data set. >>> >> >>>> >> But, it is in a different format from Semcor. >>> >> >>>> >> When I tried to use extract-semcor.pl on the file, it showed >>> >> >>>> >> some >>> >> >>>> >> error. >>> >> >>>> >> I downloaded the senseval3 all words test data from site >>> >> >>>> >> http://www.senseval.org/senseval3/data.html >>> >> >>>> >> I have also attached the file. 
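[Editor's note] Since the three input formats (raw, tagged, wntagged) keep coming up in this thread, here is a rough sketch of converting Penn-Treebank-tagged text into the wntagged form. The tag mapping is deliberately partial and the function name is invented; treat it as a starting point rather than a converter supported by the package.

```python
# Rough mapping from Penn Treebank tag prefixes to WordNet's four POS
# letters (n, v, a, r). Tags outside these classes are left unmapped.
PENN_TO_WN = {"NN": "n", "VB": "v", "JJ": "a", "RB": "r"}

def tagged_to_wntagged(line):
    """Convert tagged input like "cats/NNS run/VBP" to "cats#n run#v"."""
    out = []
    for token in line.split():
        word, sep, tag = token.rpartition("/")
        wn = PENN_TO_WN.get(tag[:2]) if sep else None
        # Untagged tokens, or tags outside the four WordNet POS classes,
        # fall through as plain words (i.e., the raw format).
        out.append(f"{word}#{wn}" if wn else (word if sep else token))
    return " ".join(out)

# tagged_to_wntagged("cats/NNS run/VBP fast/RB") -> "cats#n run#v fast#r"
```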
>>> >> >>>> >> I just want to know how should I format SENSEVAL 3 all words >>> >> >>>> >> data >>> >> >>>> >> and >>> >> >>>> >> give >>> >> >>>> >> to wsd.pl ? >>> >> >>>> >> Arun, >>> >> >>>> >> >>> >> >>>> >> On Sun, Apr 24, 2011 at 8:20 PM, Ted Pedersen >>> >> >>>> >> <tpederse@d.umn.edu> >>> >> >>>> >> wrote: >>> >> >>>> >>> >>> >> >>>> >>> Hi Arun, >>> >> >>>> >>> >>> >> >>>> >>> I'm not sure what you mean by senseval...do you mean the >>> >> >>>> >>> semcor >>> >> >>>> >>> format? >>> >> >>>> >>> Or... ? >>> >> >>>> >>> >>> >> >>>> >>> BTW, for wntagged, do you mean text that looks like this: >>> >> >>>> >>> >>> >> >>>> >>> cats#n run#v fast#r >>> >> >>>> >>> >>> >> >>>> >>> Just wanted to clarify since there are a few different >>> >> >>>> >>> formats... >>> >> >>>> >>> >>> >> >>>> >>> Thanks! >>> >> >>>> >>> Ted >>> >> >>>> >>> >>> >> >>>> >>> >>> >> >>>> >>> On Sun, Apr 24, 2011 at 7:41 PM, Arun N <aru...@gm...> >>> >> >>>> >>> wrote: >>> >> >>>> >>> > Thanks for the reply guys. >>> >> >>>> >>> > Is there any perl script to convert senseval format to >>> >> >>>> >>> > wntagged >>> >> >>>> >>> > for >>> >> >>>> >>> > senserelate. >>> >> >>>> >>> > >>> >> >>>> >>> > >>> >> >>>> >>> > Arun, >>> >> >>>> >>> > >>> >> >>>> >>> > On Sun, Apr 24, 2011 at 6:30 PM, varada kolhatkar >>> >> >>>> >>> > <var...@gm...> wrote: >>> >> >>>> >>> >> >>> >> >>>> >>> >> Yes, semcor-reformat.pl is the script which can be used to >>> >> >>>> >>> >> generate wsd >>> >> >>>> >>> >> key file. We also provide scorer2-sort.pl as it's easier >>> >> >>>> >>> >> to >>> >> >>>> >>> >> compare >>> >> >>>> >>> >> sorted >>> >> >>>> >>> >> lists. >>> >> >>>> >>> >> extract-semcor-plaintext.pl can be used to extract plain >>> >> >>>> >>> >> text >>> >> >>>> >>> >> (text >>> >> >>>> >>> >> without POS tag info) from semcor. 
If you want to >>> >> >>>> >>> >> experiment >>> >> >>>> >>> >> with >>> >> >>>> >>> >> the >>> >> >>>> >>> >> effect >>> >> >>>> >>> >> of POS tagging on wsd, you can use this script. >>> >> >>>> >>> >> >>> >> >>>> >>> >> Varada >>> >> >>>> >>> >> On Sat, Apr 23, 2011 at 11:13 PM, Ted Pedersen >>> >> >>>> >>> >> <tpederse@d.umn.edu> >>> >> >>>> >>> >> wrote: >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> Hi Arun, >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> Try out the following commands to create a key >>> >> >>>> >>> >>> file...Note >>> >> >>>> >>> >>> that >>> >> >>>> >>> >>> I'm >>> >> >>>> >>> >>> using semcor-sample.txt as the source of the key. >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> This is my key... >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> marengo(22): more semcor-sample.txt >>> >> >>>> >>> >>> <contextfile concordance=brown> >>> >> >>>> >>> >>> <context filename=br-e24 paras=yes> >>> >> >>>> >>> >>> <p pnum=1> >>> >> >>>> >>> >>> <s snum=1> >>> >> >>>> >>> >>> <wf cmd=ignore pos=DT>The</wf> >>> >> >>>> >>> >>> <wf cmd=done pos=JJ lemma=russian wnsn=1 >>> >> >>>> >>> >>> lexsn=3:01:00::>Russian</wf> >>> >> >>>> >>> >>> <wf cmd=done pos=NN lemma=gymnast wnsn=1 >>> >> >>>> >>> >>> lexsn=1:18:00::>gymnasts</wf> >>> >> >>>> >>> >>> <wf cmd=done pos=IN ot=idiom>beat_the_tar_out_of</wf> >>> >> >>>> >>> >>> <wf cmd=ignore pos=DT>the</wf> >>> >> >>>> >>> >>> </s> >>> >> >>>> >>> >>> </p> >>> >> >>>> >>> >>> </context> >>> >> >>>> >>> >>> </contextfile> >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> marengo(20): semcor-reformat.pl --file semcor-sample.txt >>> >> >>>> >>> >>> --key | >>> >> >>>> >>> >>> scorer2-sort.pl > key.txt >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> marengo(21): cat key.txt >>> >> >>>> >>> >>> gymnast.n 2 1 >>> >> >>>> >>> >>> russian.a 1 >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> Assume that these are the answers generated by my >>> >> >>>> >>> >>> system... 
>>> >> >>>> >>> >>> >>> >> >>>> >>> >>> marengo(23): more answers.txt >>> >> >>>> >>> >>> gymnast.n 2 3 >>> >> >>>> >>> >>> russian.a 1 1 >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> Then I could run the scorer like this... >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> marengo(29): allwords-scorer2.pl --ansfile answers.txt >>> >> >>>> >>> >>> --keyfile >>> >> >>>> >>> >>> key.txt >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> score for "answers.txt" using key "key.txt" : >>> >> >>>> >>> >>> precision: 0.500 (1 correct of 2 attempted.) >>> >> >>>> >>> >>> recall: 0.500 (1 correct of 2 in total) >>> >> >>>> >>> >>> F-measure: 0.500 >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> attempted: 100.00%(2 attempted of 2 in total) >>> >> >>>> >>> >>> part of speech tag mismatch in attempted instances: >>> >> >>>> >>> >>> 0.00% (0 >>> >> >>>> >>> >>> mismatches of 2 attempted instances) >>> >> >>>> >>> >>> skipped instances : 0.00% (skipped 0 instances of total >>> >> >>>> >>> >>> 2 >>> >> >>>> >>> >>> instances >>> >> >>>> >>> >>> because the instance id or the word was not found in the >>> >> >>>> >>> >>> answer >>> >> >>>> >>> >>> file) >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> Nouns: >>> >> >>>> >>> >>> Precision : 0.000 (0 correct of 1 nouns attempted.) >>> >> >>>> >>> >>> Recall : 0.000 (0 correct of 1 noun instances in total) >>> >> >>>> >>> >>> F-measure: 0.000 >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> Verbs: >>> >> >>>> >>> >>> Precision : 0.000 (0 correct of 0 verbs attempted.) >>> >> >>>> >>> >>> Recall : 0.000 (0 correct of 0 verb instances in total) >>> >> >>>> >>> >>> F-measure: 0.000 >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> Adjectives: >>> >> >>>> >>> >>> Precision : 1.000 (1 correct of 1 adjectives attempted.) >>> >> >>>> >>> >>> Recall : 1.000 (1 correct of 1 adjective instances in >>> >> >>>> >>> >>> total) >>> >> >>>> >>> >>> F-measure: 1.000 >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> Adverbs: >>> >> >>>> >>> >>> Precision : 0.000 (0 correct of 0 adverbs attempted.) 
>>> >> >>>> >>> >>> Recall : 0.000 (0 correct of 0 adverb instances in >>> >> >>>> >>> >>> total) >>> >> >>>> >>> >>> F-measure: 0.000 >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> Confusion Matrix for part of speech tags : >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> Noun Verb Adj >>> >> >>>> >>> >>> Adv >>> >> >>>> >>> >>> | Key >>> >> >>>> >>> >>> Noun 1 0 0 >>> >> >>>> >>> >>> 0 >>> >> >>>> >>> >>> | 1 >>> >> >>>> >>> >>> Verb 0 0 0 >>> >> >>>> >>> >>> 0 >>> >> >>>> >>> >>> | 0 >>> >> >>>> >>> >>> Adj 0 0 1 >>> >> >>>> >>> >>> 0 >>> >> >>>> >>> >>> | 1 >>> >> >>>> >>> >>> Adv 0 0 0 >>> >> >>>> >>> >>> 0 >>> >> >>>> >>> >>> | 0 >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> --------------------------------------------------------------------------------|------- >>> >> >>>> >>> >>> Ans 1 0 1 >>> >> >>>> >>> >>> 0 >>> >> >>>> >>> >>> | 2 >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> I hope this is of some help. Please let us know though if >>> >> >>>> >>> >>> there >>> >> >>>> >>> >>> are >>> >> >>>> >>> >>> additional issues to resolve! >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> Cordially, >>> >> >>>> >>> >>> Ted >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> On Sat, Apr 23, 2011 at 7:54 PM, Arun N >>> >> >>>> >>> >>> <aru...@gm...> >>> >> >>>> >>> >>> wrote: >>> >> >>>> >>> >>> > Hi, >>> >> >>>> >>> >>> > Thanks, for the quick reply. >>> >> >>>> >>> >>> > >>> >> >>>> >>> >>> > Actually, I wrote my own scorer. But, I am thinking of >>> >> >>>> >>> >>> > using >>> >> >>>> >>> >>> > the >>> >> >>>> >>> >>> > scorer >>> >> >>>> >>> >>> > provided in senserelate package. >>> >> >>>> >>> >>> > >>> >> >>>> >>> >>> > btw, It would be great, if you could tell how to >>> >> >>>> >>> >>> > generate >>> >> >>>> >>> >>> > the >>> >> >>>> >>> >>> > key >>> >> >>>> >>> >>> > file >>> >> >>>> >>> >>> > for a >>> >> >>>> >>> >>> > semcor input file. 
>>> >> >>>> >>> >>> > There is a perl script extract-semcor-plaintext.pl >>> >> >>>> >>> >>> > --key >>> >> >>>> >>> >>> > flag, but >>> >> >>>> >>> >>> > it >>> >> >>>> >>> >>> > generates a key with just the POS tags but not the >>> >> >>>> >>> >>> > wordnet >>> >> >>>> >>> >>> > senses. >>> >> >>>> >>> >>> > >>> >> >>>> >>> >>> > Do you have any perl code to generate the key file(in a >>> >> >>>> >>> >>> > suitable >>> >> >>>> >>> >>> > format >>> >> >>>> >>> >>> > for >>> >> >>>> >>> >>> > scorer) from a semcor file ? so that it can be passed >>> >> >>>> >>> >>> > to >>> >> >>>> >>> >>> > the >>> >> >>>> >>> >>> > allwords-scorer.pl. >>> >> >>>> >>> >>> > >>> >> >>>> >>> >>> > Arun, >>> >> >>>> >>> >>> > >>> >> >>>> >>> >>> > On Sat, Apr 23, 2011 at 3:16 PM, Ted Pedersen >>> >> >>>> >>> >>> > <tpederse@d.umn.edu> >>> >> >>>> >>> >>> > wrote: >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> >> Hi Arun, >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> >> Nice to hear from you. You may also wish to consult >>> >> >>>> >>> >>> >> Varada >>> >> >>>> >>> >>> >> Kolhatkar's >>> >> >>>> >>> >>> >> MS thesis, which is a more recent use of >>> >> >>>> >>> >>> >> WordNet::SenseRelate::Allwords. While some differences >>> >> >>>> >>> >>> >> in >>> >> >>>> >>> >>> >> results >>> >> >>>> >>> >>> >> are >>> >> >>>> >>> >>> >> to be expected as the years go by (due to changes in >>> >> >>>> >>> >>> >> WordNet >>> >> >>>> >>> >>> >> for >>> >> >>>> >>> >>> >> example) they should be fairly minor. >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> >> An Extended Analysis of a Method of All Words Sense >>> >> >>>> >>> >>> >> Disambiguation >>> >> >>>> >>> >>> >> (Kolhatkar) - Master of Science Thesis, Department of >>> >> >>>> >>> >>> >> Computer >>> >> >>>> >>> >>> >> Science, University of Minnesota, Duluth, August, >>> >> >>>> >>> >>> >> 2009. 
>>> >> >>>> >>> >>> >> http://www.d.umn.edu/~tpederse/Pubs/varada-thesis.pdf >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> >> Regarding your results, what were your precision and >>> >> >>>> >>> >>> >> recall >>> >> >>>> >>> >>> >> values? >>> >> >>>> >>> >>> >> Did you use the scoring program that comes with >>> >> >>>> >>> >>> >> WordNet::SenseRelate::AllWords? Also, if you could >>> >> >>>> >>> >>> >> send >>> >> >>>> >>> >>> >> the >>> >> >>>> >>> >>> >> exact >>> >> >>>> >>> >>> >> command you ran that would help us understand what >>> >> >>>> >>> >>> >> might >>> >> >>>> >>> >>> >> be >>> >> >>>> >>> >>> >> happening. >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> >> Thanks! >>> >> >>>> >>> >>> >> Ted >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> >> On Sat, Apr 23, 2011 at 1:07 PM, Arun N >>> >> >>>> >>> >>> >> <aru...@gm...> >>> >> >>>> >>> >>> >> wrote: >>> >> >>>> >>> >>> >> > Hi Jason and Ted, >>> >> >>>> >>> >>> >> > I am Arun Nedunchezhian, Graduate Student at UT >>> >> >>>> >>> >>> >> > Austin. I >>> >> >>>> >>> >>> >> > am >>> >> >>>> >>> >>> >> > working >>> >> >>>> >>> >>> >> > on a >>> >> >>>> >>> >>> >> > project which uses WordNet::SenseRelate::Allwords >>> >> >>>> >>> >>> >> > package. >>> >> >>>> >>> >>> >> > I read the results section in your(Jason) MS thesis. >>> >> >>>> >>> >>> >> > You >>> >> >>>> >>> >>> >> > have >>> >> >>>> >>> >>> >> > mentioned >>> >> >>>> >>> >>> >> > that >>> >> >>>> >>> >>> >> > Precision and Recall for Semcor5 (5 documents from >>> >> >>>> >>> >>> >> > semcor[br-a01,br-a02,br-k18,br-m02,br-r05]) is .63 >>> >> >>>> >>> >>> >> > and >>> >> >>>> >>> >>> >> > .51. >>> >> >>>> >>> >>> >> > I tried to run SR-AW package over the same set of >>> >> >>>> >>> >>> >> > documents >>> >> >>>> >>> >>> >> > and I >>> >> >>>> >>> >>> >> > got >>> >> >>>> >>> >>> >> > much lesser values for Precision and recall. 
>>> >> >>>> >>> >>> >> > Precision = No.of words sense tagged correctly / >>> >> >>>> >>> >>> >> > No.of >>> >> >>>> >>> >>> >> > words >>> >> >>>> >>> >>> >> > sense >>> >> >>>> >>> >>> >> > tagged. >>> >> >>>> >>> >>> >> > Recall = No.of words sense tagged correctly / >>> >> >>>> >>> >>> >> > No.of >>> >> >>>> >>> >>> >> > words >>> >> >>>> >>> >>> >> > in >>> >> >>>> >>> >>> >> > the >>> >> >>>> >>> >>> >> > documents(tagged as cmd=done). >>> >> >>>> >>> >>> >> > SR-AW tags word either as <word#pos#senseid> or >>> >> >>>> >>> >>> >> > <word#ND>. >>> >> >>>> >>> >>> >> > No. of words sense tagged = count of >>> >> >>>> >>> >>> >> > <word#pos#senseid>. >>> >> >>>> >>> >>> >> > is the above equation correct ? >>> >> >>>> >>> >>> >> > Is this the way to compute precision and recall? >>> >> >>>> >>> >>> >> > What are the tags that you set for SR-AW execution? >>> >> >>>> >>> >>> >> > I set the following >>> >> >>>> >>> >>> >> > Window = 3 >>> >> >>>> >>> >>> >> > type = WordNet::SenseRelate::lesk >>> >> >>>> >>> >>> >> > I used a stoplist of articles, prepositions. >>> >> >>>> >>> >>> >> > >>> >> >>>> >>> >>> >> > Arun, >>> >> >>>> >>> >>> >> > >>> >> >>>> >>> >>> >> > -- >>> >> >>>> >>> >>> >> > The mind is everything. >>> >> >>>> >>> >>> >> > What you think you become. - Buddha >>> >> >>>> >>> >>> >> > >>> >> >>>> >>> >>> >> > >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> >> >>> >> >>>> >>> >>> >> -- >>> >> >>>> >>> >>> >> Ted Pedersen >>> >> >>>> >>> >>> >> http://www.d.umn.edu/~tpederse >>> >> >>>> >>> >>> > >>> >> >>>> >>> >>> > >>> >> >>>> >>> >>> > >>> >> >>>> >>> >>> > -- >>> >> >>>> >>> >>> > The mind is everything. >>> >> >>>> >>> >>> > What you think you become. 
- Buddha >>> >> >>>> >>> >>> > >>> >> >>>> >>> >>> > >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> -- >>> >> >>>> >>> >>> Ted Pedersen >>> >> >>>> >>> >>> http://www.d.umn.edu/~tpederse >>> >> >>>> >>> >> >>> >> >>>> >>> > >>> >> >>>> >>> > >>> >> >>>> >>> > >>> >> >>>> >>> > -- >>> >> >>>> >>> > The mind is everything. >>> >> >>>> >>> > What you think you become. - Buddha >>> >> >>>> >>> > >>> >> >>>> >>> > >>> >> >>>> >>> >>> >> >>>> >>> >>> >> >>>> >>> >>> >> >>>> >>> -- >>> >> >>>> >>> Ted Pedersen >>> >> >>>> >>> http://www.d.umn.edu/~tpederse >>> >> >>>> >> >>> >> >>>> >> >>> >> >>>> >> >>> >> >>>> >> -- >>> >> >>>> >> The mind is everything. >>> >> >>>> >> What you think you become. - Buddha >>> >> >>>> >> >>> >> >>>> >> >>> >> >>>> > >>> >> >>>> > >>> >> >>>> > >>> >> >>>> > -- >>> >> >>>> > Ted Pedersen >>> >> >>>> > http://www.d.umn.edu/~tpederse >>> >> >>>> > >>> >> >>>> >>> >> >>>> >>> >> >>>> >>> >> >>>> -- >>> >> >>>> Ted Pedersen >>> >> >>>> http://www.d.umn.edu/~tpederse >>> >> >>> >>> >> >>> >>> >> >>> >>> >> >>> -- >>> >> >>> The mind is everything. >>> >> >>> What you think you become. - Buddha >>> >> >>> >>> >> >> >>> >> >> >>> >> >> >>> >> >> -- >>> >> >> The mind is everything. >>> >> >> What you think you become. - Buddha >>> >> >> >>> >> > >>> >> > >>> >> > >>> >> > -- >>> >> > The mind is everything. >>> >> > What you think you become. - Buddha >>> >> > >>> >> > >>> >> >>> >> >>> >> >>> >> -- >>> >> Ted Pedersen >>> >> http://www.d.umn.edu/~tpederse >>> > >>> > >>> >>> >>> >>> -- >>> Ted Pedersen >>> http://www.d.umn.edu/~tpederse >> >> >> >> -- >> The mind is everything. >> What you think you become. - Buddha >> > > > > -- > The mind is everything. > What you think you become. - Buddha > > -- Ted Pedersen http://www.d.umn.edu/~tpederse |
From: Wael G. <wae...@gm...> - 2011-05-11 12:23:24
|
Dear Prof. Ted, thanks for your efforts on WordNet::Similarity and SenseRelate; your recommendations were the main reason I achieved my master's degree. I have registered my PhD under the title "Automatic Arabic Essay Assessment". I now have a list of predefined essay questions and model answers for a history subject in Arabic. I translated them to English using the Google translation API, since I know that Similarity and SenseRelate support only English. What I want is: - Measure the similarity between any question a teacher comes up with and all the predefined questions, to retrieve the most similar question along with its model answer. - Measure the similarity between the student answer and the predefined model answer to calculate *a score* that represents the semantic similarity between the two answers. My questions are: - Which packages do I need: the WordNet::Similarity package, the WordNet::SenseRelate packages, or both? - Does the text length matter when choosing the similarity algorithm? The questions are very short (one sentence) while the answers are very long (many paragraphs). Sorry for the long mail, and thanks for your cooperation. Best Regards, -- Wael Hassan Gomaa Mobile: +2 014 6767 4 66 PhD student, Faculty of Computers and Information, Cairo University, Egypt |
From: Ted P. <tpederse@d.umn.edu> - 2011-04-02 17:14:25
|
Hi Dian, I've looked into this a bit more, and indeed your analysis is correct. The glosses are converted to lower case while synsets are not. Below is an example using winston_churchill self-similarity, since his entry has both capitalized synset entries and capitalizations in his gloss. I think it's best to regard this as an oversight, and something that we would certainly adjust in the future (so that synset words are lowercased along with glosses, or so that matching is not case sensitive). Right now of course the effect of what we are doing is that in some cases we are missing matches between synsets and glosses (most likely when dealing with proper nouns in synsets, which can certainly occur as we see with Hamlet and Winston_Churchill). Thanks very much for pointing this out; this kind of observation and detailed investigation is really very much appreciated, and quite helpful to us. Thanks! Ted marengo(86): wn winston_churchill -synsn -g Synonyms/Hypernyms (Ordered by Estimated Frequency) of noun winston_churchill 1 sense of winston churchill Sense 1 Churchill, Winston Churchill, Winston S. Churchill, Sir Winston Leonard Spenser Churchill -- (British statesman and leader during World War II; received Nobel prize for literature in 1953 (1874-1965)) INSTANCE OF=> statesman, solon, national leader -- (a man who is a respected leader in national or international affairs) INSTANCE OF=> writer, author -- (writes (books or stories or articles or the like) professionally (for pay)) marengo(87): similarity.pl --type WordNet::Similarity::lesk winston_churchill winston_churchill --trace 32 --config lesk.config Loading WordNet... done. Loading Module... done. 
winston_churchill#n#1 winston_churchill#n#1: Synset 1: winston_churchill#n#1 Synset 2: winston_churchill#n#1 Functions: syns - syns : 4 Overlaps: 1 x "Winston_S._Churchill" 1 x "Churchill" 1 x "Sir_Winston_Leonard_Spenser_Churchill" 1 x "Winston_Churchill" winston_churchill#n#1 winston_churchill#n#1 4 marengo(88): similarity.pl --type WordNet::Similarity::lesk winston_churchill winston_churchill --trace 32 --config lesk1.config Loading WordNet... done. Loading Module... done. winston_churchill#n#1 winston_churchill#n#1: Synset 1: winston_churchill#n#1 Synset 2: winston_churchill#n#1 Functions: glos - glos : 169 Overlaps: 1 x "british statesman leader world war ii received nobel prize literature 1953 1874 1965" Functions: glos - hype glos : 1 Overlaps: 1 x "leader" Functions: hype glos - glos : 1 Overlaps: 1 x "leader" Functions: hype glos - hype glos : 85 Overlaps: 1 x "man be respected leader national international affairs" 1 x "write book story article professionally pay" On Fri, Apr 1, 2011 at 8:51 AM, Dian Paskalis <dia...@ya...> wrote: > Hi Mr.Ted, > > I'm sorry but maybe there is a misunderstanding. I didn't mean about the > capital letters in the query that made the difference. It doesn't matter > whether I use capital letters in the query or not, the results are just the > same. > > Please look at the results below, for those I used query "play hamlet" and > window words of 10. > In WordNet, the synset words for hamlet#n#2 is "Hamlet" <- with capital > letters, while in play#v#4 it has the word "hamlet". Below is the printed > tracing result for the related words. 
> > Synset 1: play#v#4 > Synset 2: hamlet#n#2 > Functions: example - glos : 0.00617283950617284 > Overlaps: 1 x "s" > Functions: glos - hype glos : 0.0416666666666667 > Overlaps: 1 x "play" > Functions: hypo glos - hype glos : 0.00304878048780488 > Overlaps: 1 x "play" > > While the synset words for hamlet#n#1 are "hamlet" <- no capital letter and > "crossroads", while in play#v#4 it has the word "hamlet" <- no capital > letter too. Below is the printed tracing result for the related words. > > Synset 1: play#v#4 > Synset 2: hamlet#n#1 > Functions: example - syns : 0.0185185185185185 > Overlaps: 1 x "hamlet" > > As you can see, it returned "hamlet" because in the processed synset word > and example, they use the same "hamlet" <- no capital letter. > So what I meant by not processed wasn't about the query, but the words taken > from synset words. The program takes the synset words as they are given by > WordNet without turning them into lower case. Thus, it also considers the > capitalization of the letters in two words (from the synset word, gloss, > example, etc) for the overlaps. > > I noticed that when taking words from gloss, example etc. you turn them into > lower case. But when taking words from synset words, you didn't turn them > into lower case, thus leaving the capital letters in the words. > In that case, is there any reason why you did that? > > > With respect, > > Dian Paskalis > Computer Science > School of Electrical and Informatics Engineering > Institute Technology of Bandung > Indonesia > > -- Ted Pedersen http://www.d.umn.edu/~tpederse |
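The fix Ted describes in this exchange, lowercasing synset words the same way glosses are lowercased (or, equivalently, matching case-insensitively), can be sketched as below. This is an illustrative Python sketch; the function name and token lists are hypothetical and not part of the WordNet::Similarity code.

```python
def overlaps(words_a, words_b, case_sensitive=False):
    """Return the set of word overlaps between two token lists,
    optionally ignoring case, as the proposed fix would do."""
    if not case_sensitive:
        words_a = [w.lower() for w in words_a]
        words_b = [w.lower() for w in words_b]
    return set(words_a) & set(words_b)

synset = ["Churchill", "Winston_Churchill"]             # synset words keep their case
gloss = ["winston_churchill", "british", "statesman"]   # gloss words are lowercased

print(overlaps(synset, gloss, case_sensitive=True))  # the missed match: empty set
print(overlaps(synset, gloss))                       # {'winston_churchill'}
```

The same effect explains the hamlet#n#2 behavior Dian reports: "Hamlet" in the synset never matches the lowercased "hamlet" in glosses and examples, while hamlet#n#1 ("hamlet", lowercase) does.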
From: Ted P. <tpederse@d.umn.edu> - 2011-04-01 17:35:34
|
Hi Dian, Thanks for clarifying this! I see what you mean, and I'll look at this in more detail to see if I can remember if there is any reason for doing as you describe. To be honest I can't think of one, so it might simply have been an oversight. Let me check into this a little bit more to answer you more precisely, but that's what I think now. But, thank you again for pointing this out, and providing such a clear example. That really does help!! Cordially, Ted On Fri, Apr 1, 2011 at 8:51 AM, Dian Paskalis <dia...@ya...> wrote: > Hi Mr.Ted, > > I'm sorry but maybe there is a misunderstanding. I didn't mean about the > capital letters in the query that made the difference. It doesn't matter > whether I use capital letters in the query or not, the results are just the > same. > > Please look at the results below, for those I used query "play hamlet" and > window words of 10. > In WordNet, the synset words for hamlet#n#2 is "Hamlet" <- with capital > letters, while in play#v#4 it has the word "hamlet". Below is the printed > tracing result for the related words. > > Synset 1: play#v#4 > Synset 2: hamlet#n#2 > Functions: example - glos : 0.00617283950617284 > Overlaps: 1 x "s" > Functions: glos - hype glos : 0.0416666666666667 > Overlaps: 1 x "play" > Functions: hypo glos - hype glos : 0.00304878048780488 > Overlaps: 1 x "play" > > While the synset words for hamlet#n#1 are "hamlet" <- no capital letter and > "crossroads", while in play#v#4 it has the word "hamlet" <- no capital > letter too. Below is the printed tracing result for the related words. > > Synset 1: play#v#4 > Synset 2: hamlet#n#1 > Functions: example - syns : 0.0185185185185185 > Overlaps: 1 x "hamlet" > > As you can see, it returned "hamlet" because in the processed synset word > and example, they use the same "hamlet" <- no capital letter. > So what I meant by not processed wasn't about the query, but the words taken > from synset words. 
The program takes the synset words as they are given by > WordNet without turning them into lower case. Thus, it also considers the > capitalization of the letters in two words (from the synset word, gloss, > example, etc) for the overlaps. > > I noticed that when taking words from gloss, example etc. you turn them into > lower case. But when taking words from synset words, you didn't turn them > into lower case, thus leaving the capital letters in the words. > In that case, is there any reason why you did that? > > > With respect, > > Dian Paskalis > Computer Science > School of Electrical and Informatics Engineering > Institute Technology of Bandung > Indonesia > > -- Ted Pedersen http://www.d.umn.edu/~tpederse |
From: Ted P. <tpederse@d.umn.edu> - 2011-04-01 12:21:15
|
Hi Dian, See comments inline... On Fri, Apr 1, 2011 at 4:21 AM, Dian Paskalis <dia...@ya...> wrote: > Hi Mr. Ted, > > Thanks for the reply. Now I've got a different result when I used the query > "He played Hamlet as well as Macbeth". When I printed the trace from > Allwords, it seems that when comparing between synset words and other > glosses > you didn't process it (for example: turn the synset words into lower case). > Because of this, when it compared played#v#4 and hamlet#n#2, it didn't > return 'hamlet' as the overlap. I think you are mistaken here - I think what's happening is that no relatedness is found. No relatedness found isn't the same as not finding the word - it just means that no relatedness was measured but the word was included in the algorithm. A good way to see this would be to have input where the word is repeated, then you will find relatedness scores due to self similarity. For example... marengo(13): cat file macbeth MACBETH Macbeth marengo(12): wsd.pl --context file --type WordNet::Similarity::lesk --format raw Current configuration: context file : file format : raw scheme : normal tagged text : no measure : WordNet::Similarity::lesk window : 3 contextScore : 0 pairScore : 0 measure config: (none) glosses : no nocompoundify : no usemono : no backoff : no trace : no forcepos : no stoplist : (none) Loading WordNet... done. macbeth#n#1 macbeth#n#1 macbeth#n#1 This shows that the algorithm is taking macbeth regardless of the form and assigning a sense to it...if that doesn't happen in other contexts, it just means that there is no relatedness found between Macbeth and the surrounding words. In that case you might want to increase the window size or make other adjustments to the parameters. 
> > This is what I got when it compared between played#v#4's example and > hamlet#n#2's synset word: > Example : gielgud played hamlet EEE00004EEE she want to act lady macbeth but > she be too young for the role EEE00004EEE > she played the servant to her husband s master > Synset : Hamlet > The same happened with macbeth#n#1's synset. > I find it rather weird because in the example 'hamlet' refers to hamlet#n#2. > Is there any reason why you didn't process the synset words? I believe they are being processed, and just that relatedness isn't being found. You might want to increase the window size, use a different measure of relatedness, and/or make other adjustments. > > Another question, when using compound words, I noticed that you only > compoundify the words in context and not those in glosses, examples, etc. > Am I mistaken? Why is that? That's true. We compoundify the words in the text since we want to find those in WordNet (and "as well" is not the same as "as_well".) When we are finding overlaps in glosses, "as well" matches as 2 consecutive words (so the score is 4) whereas if we had that as "as_well" it would only be a match of 1 word, with a score of 1. We felt that a compound match is actually more significant than a single word match, so decided not to indicate compounds, knowing that we'd get a higher score when they match. I hope this helps. Please feel free to ask additional questions - since these are of fairly general interest I've moved this onto our senserelate mailing list. Cordially, Ted -- Ted Pedersen http://www.d.umn.edu/~tpederse |
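Ted's rationale for not compoundifying gloss text, that "as well" matching as 2 consecutive words scores 4 while a single compound token "as_well" would score only 1, implies that an n-word consecutive overlap contributes n squared to the lesk score. A minimal sketch of that scoring rule, assuming the squared-length weighting suggested by the message above:

```python
def phrase_score(overlap_lengths):
    """Overlap scoring where an n-word consecutive match contributes n**2,
    so longer phrasal overlaps count for more than repeated single words."""
    return sum(n * n for n in overlap_lengths)

# Uncompounded "as well" matches as a 2-word phrase: score 4.
print(phrase_score([2]))  # -> 4
# Compounded "as_well" would match as one token: score 1.
print(phrase_score([1]))  # -> 1
```

This is why compoundification is applied only to the input context (where WordNet lookup needs compound entries like "as_well"), and not to glosses, where longer literal phrase matches are rewarded.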
From: Ted P. <tpederse@d.umn.edu> - 2011-03-24 17:02:26
|
I think the differences you are seeing are due to differences in how the lesk measure is set up on the web interface. In particular, the lesk measure can be given an option stoplist that is used in figuring out the measures of relatedness it calculates. This is generally a good idea, and something you may wish to consider adding to your command line setup. Note that the stoplist you provided is for the WSD algorithm - the lesk algorithm may also have a stoplist specified via its configuration file, shown below...Note that the lesk stoplist is not in regular expression form, but is just a plain text list of stop words, one word per line. So, I think if you make your lesk measure use a stoplist as shown below, your results should agree with the web interface. I hope this helps, please let us know if additional questions arise. Good luck, Ted marengo(203): wsd.pl --context context.txt --type WordNet::Similarity::lesk --format raw --config lesk.config --stop stop.txt Current configuration: context file : context.txt format : raw scheme : normal tagged text : no measure : WordNet::Similarity::lesk window : 3 contextScore : 0 pairScore : 0 measure config: lesk.config glosses : no nocompoundify : no usemono : no backoff : no trace : no forcepos : no stoplist : stop.txt Loading WordNet... done. 
the#o state_bank#n#1 of#o india#n#1 is#v#1 located#v#1 at#o bank#n#1 of#o river#n#1 marengo(204): cat lesk.config WordNet::Similarity::lesk relation::lesk-relation.dat stop::stoplist.txt marengo(205): cat stoplist.txt a aboard about above across after against all along alongside although amid amidst among amongst an and another anti any anybody anyone anything around as astride at aught bar barring because before behind below beneath beside besides between beyond both but by circa concerning considering despite down during each either enough everybody everyone except excepting excluding few fewer following for from he her hers herself him himself his hisself i idem if ilk in including inside into it its itself like many me mine minus more most myself naught near neither nobody none nor nothing notwithstanding of off on oneself onto opposite or other otherwise our ourself ourselves outside over own past pending per plus regarding round save self several she since so some somebody someone something somewhat such suchlike sundry than that the thee theirs them themselves there they thine this thou though through throughout thyself till to tother toward towards twain under underneath unless unlike until up upon us various versus via vis-a-vis we what whatall whatever whatsoever when whereas wherewith wherewithal which whichever whichsoever while who whoever whom whomever whomso whomsoever whose whosoever with within without worth ye yet yon yonder you you-all yours yourself yourselves On Thu, Mar 24, 2011 at 4:54 AM, ybl...@co... <ybl...@co...> wrote: > Dear Sir / Madam, > > We are still getting problem while disambiguation. When > we use WordNet::SenseRelate::Allwords package in our program, the sense of > words "locate" and "bank" are not disambiguated properly. Below I have > pasted the Input file, stoplist file and output file. On command line we > just run the perl script as >perl wnsraw.pl. 
> > For Input : the state bank of india is located at bank of river > Output is : the#o state_bank#n#1 of#ND india#n#1 be#v#1 locate#v#2 at#o > bank#n#2 of#ND river#n#1 > > For the same input, > Web Inteface output is: the state_bank#n#1 of india#n#1 be#v#1 locate#v#1 at > bank#n#1 of river#n#1 > > perl script:wnsraw.pl > > #!/usr/local/perl -w > use WordNet::SenseRelate::AllWords; > use WordNet::QueryData; > use WordNet::Tools; > my $qd = WordNet::QueryData->new; > defined $qd or die "Construction of WordNet::QueryData failed"; > my $wntools = WordNet::Tools->new($qd); > defined $wntools or die "\nCouldn't construct WordNet::Tools object"; > $file="stoplist.txt"; > $outfile="out.txt"; > my $wsd = WordNet::SenseRelate::AllWords->new (wordnet => $qd, > wntools => $wntools, > stoplist => $file, > outfile => $outfile, > pairScore => 0.0, > contextScore => 0.0, > measure => > 'WordNet::Similarity::lesk'); > my @context = qw/the state bank of india is located at bank of river/; > my @results = $wsd->disambiguate (window => 3, > tagged => 0, > scheme => 'normal', > context => [@context]); > print "@results\n"; > > Stoplist file: stoplist.txt > /\ba\b/ > /\ban\b/ > /\bas\b/ > /\bat\b/ > /\bby\b/ > /\bi\b/ > /\bin\b/ > /\bit\b/ > /\bthe\b/ > /\bhe\b/ > /\bhis\b/ > /\bme\b/ > /\boh\b/ > /\bok\b/ > /\bor\b/ > /\bthou\b/ > /\bus\b/ > /\bwho\b/ > /\bwa\b/ > > Outputfile: out.txt > > Results after disambiguation... > the#o the o > state_bank state_bank n 1 > of of ND > india india n 1 > is be v 1 > located locate v 2 > at#o at o > bank bank n 2 > of of ND > river river n 1 > -- Ted Pedersen http://www.d.umn.edu/~tpederse |
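A point worth pulling out of the exchange above: wsd.pl's --stop file holds regular expressions, one per line (e.g. /\bthe\b/), while the lesk measure's stop:: file is a plain word list. A plain list can be turned into the wsd.pl form mechanically; this is an illustrative Python sketch (the helper name is hypothetical, not part of the distribution):

```python
import re

def to_wsd_stoplist(words):
    """Convert a plain stop word list (lesk stop:: format) into the
    regular-expression form that wsd.pl's --stop file expects."""
    return [r"/\b%s\b/" % re.escape(w) for w in words]

# Produces regex lines like /\ba\b/, /\ban\b/, /\bthe\b/
for line in to_wsd_stoplist(["a", "an", "the"]):
    print(line)
```

Keeping the two files distinct matters: the WSD stoplist removes words from the context being disambiguated, while the lesk stoplist removes words when glosses are compared for overlaps.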
From: Kamal, J. <JKamal@ETS.ORG> - 2010-04-06 21:26:23
|
Hi Ted: Thanks so much for your help. It did resolve the issue. Though I was trying to get the latest of everything, I am not sure how I ended up getting an older version of WordNet::Similarity. Thanks so much for your help again. Best Regards! Jyoti -----Original Message----- From: Ted Pedersen [mailto:dul...@gm...] Sent: Tuesday, April 06, 2010 12:04 PM To: Kamal, Jyoti Cc: sen...@li...; senserelate-users Subject: Re: [Senserelate-users] [Senserelate-developers]WordNet::SenseRelate Hi Jyoti, About the only thing I see that's a little out of the ordinary is that you are using version 2.01 of WordNet::Similarity, while the most current version is 2.05. I would suggest updating to 2.05 and seeing if that doesn't resolve these issues. You can find download links from CPAN and sourceforge for 2.05 here: http://wn-similarity.sourceforge.net/ Good luck, and let us know what happens. If you continue to get errors, go ahead and send the output of make test again. Ted On Tue, Apr 6, 2010 at 10:38 AM, Kamal, Jyoti <JK...@et...> wrote: > Hi Ted: > > Thanks so much for your response. I really didn't expect such a quick > response & I truly appreciate it. It's been a while, I have been struggling > with this. > > Please find my comments below. > > Thanks! > > Jyoti > > -----Original Message----- > From: Ted Pedersen [mailto:dul...@gm...] > Sent: Tuesday, April 06, 2010 11:22 AM > To: Kamal, Jyoti > Cc: sen...@li...; senserelate-users > Subject: Re: [Senserelate-users] > [Senserelate-developers]WordNet::SenseRelate > > Hi Jyoti, > > If you have installed WordNet::Similarity, you should have gotten > > WordNet::Tools as a result of that. But, we can double check that... > > 1) what kind of system are you running on? 
If linux can you send me > > the output of > > uname -a > > Linux etsis134.ets.org 2.6.9-78.ELsmp #1 SMP Wed Jul 9 15:46:26 EDT 2008 > x86_64 x86_64 x86_64 GNU/Linux > > 2) can you send the output of > > similarity.pl --measure WordNet::Similarity::path dog cat > > [jkamal@etsis134 bin]$ similarity.pl --type=WordNet::Similarity::path dog#n > cat#n > > Loading WordNet... done. > > Loading Module... done. > > dog#n#1 cat#n#1 0.2 > > (just to check that similarity is ok) > > 4) can you send the output of > > perl -MWordNet::QueryData -e 'print "$WordNet::QueryData::VERSION\n"' > > perl -MWordNet::Similarity -e 'print "$WordNet::Similarity::VERSION\n"' > > perl -MWordNet::Tools -e 'print "$WordNet::Tools::VERSION\n"' > > perl -MWordNet::SenseRelate::AllWords -e 'print > > "$WordNet::SenseRelate::AllWords::VERSION\n"' > > [jkamal@etsis134 bin]$ perl -MWordNet::QueryData -e 'print > "$WordNet::QueryData::VERSION\n"' > > 1.49 > > [jkamal@etsis134 bin]$ perl -MWordNet::Similarity -e 'print > "$WordNet::Similarity::VERSION\n"' > > 2.01 > > [jkamal@etsis134 bin]$ perl -MWordNet::Tools -e 'print > "$WordNet::Tools::VERSION\n"' > > 2.01 > > [jkamal@etsis134 bin]$ perl -MWordNet::SenseRelate::AllWords -e 'print > "$WordNet::SenseRelate::AllWords::VERSION\n"' > > 0.19 > > (this will show what versions of modules you are using..) > > 5) What version of WordNet are you using? > > [JK] WordNet 3.0 > > on linux > > wn -l > > should show that... > > 6) can you send the complete output from > > WordNet::SenseRelate::AllWords make test? 
> > --------------------------------------------------------------------- > > [jkamal@etsis134 WordNet-SenseRelate-AllWords-0.19]$ make test > > PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" > "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t > > t/Error-Suffixes-tagged-wntagged....ok 2/41# Failed test > (t/Error-Suffixes-tagged-wntagged.t at line 71) > > t/Error-Suffixes-tagged-wntagged....NOK 3# got: 'The#NT' > > # expected: 'The#CL' > > # Failed test (t/Error-Suffixes-tagged-wntagged.t at line 71) > > # got: 'DT#NT' > > # expected: 'star#n#1' > > t/Error-Suffixes-tagged-wntagged....NOK 4# Failed test > (t/Error-Suffixes-tagged-wntagged.t at line 71) > > # got: 'star#NT' > > # expected: 'marry#v#1' > > t/Error-Suffixes-tagged-wntagged....NOK 5# Failed test > (t/Error-Suffixes-tagged-wntagged.t at line 71) > > # got: 'NN#NT' > > # expected: 'qn#CL' > > t/Error-Suffixes-tagged-wntagged....NOK 6# Failed test > (t/Error-Suffixes-tagged-wntagged.t at line 71) > > t/Error-Suffixes-tagged-wntagged....NOK 7# got: 'married#NT' > > # expected: 'astronomer#n#1' > > # Failed test (t/Error-Suffixes-tagged-wntagged.t at line 71) > > t/Error-Suffixes-tagged-wntagged....NOK 8# got: 'VBD#NT' > > # expected: '.#IT' > > t/Error-Suffixes-tagged-wntagged....ok 41/41# Looks like you failed 6 tests > of 41. > > t/Error-Suffixes-tagged-wntagged....dubious > > Test returned status 6 (wstat 1536, 0x600) > > DIED. 
FAILED tests 3-8 > > Failed 6/41 tests, 85.37% okay > > t/WordNet-SenseRelate-AllWords......# WordNet hash : > eOS9lXC6GvMWznF1wkZofDdtbBU > > t/WordNet-SenseRelate-AllWords......ok 1/30# WordNet path : > /home/jkamal/myWordNet3//dict/ > > t/WordNet-SenseRelate-AllWords......ok 3/30# Failed test > (t/WordNet-SenseRelate-AllWords.t at line 96) > > t/WordNet-SenseRelate-AllWords......NOK 4# got: '11' > > # expected: '5' > > # Failed test (t/WordNet-SenseRelate-AllWords.t at line 99) > > t/WordNet-SenseRelate-AllWords......NOK 5# got: 'my#NT' > > # expected: 'my#CL' > > # Failed test (t/WordNet-SenseRelate-AllWords.t at line 99) > > t/WordNet-SenseRelate-AllWords......NOK 6# got: 'PRP#NT' > > # expected: 'cat#n#7' > > # Failed test (t/WordNet-SenseRelate-AllWords.t at line 99) > > t/WordNet-SenseRelate-AllWords......NOK 7# got: 'cat#NT' > > # expected: 'be#v#1' > > # Failed test (t/WordNet-SenseRelate-AllWords.t at line 99) > > t/WordNet-SenseRelate-AllWords......NOK 8# got: 'NN#NT' > > # expected: 'a#CL' > > # Failed test (t/WordNet-SenseRelate-AllWords.t at line 99) > > t/WordNet-SenseRelate-AllWords......NOK 9# got: 'is#NT' > > # expected: 'wise#a#1' > > # Failed test (t/WordNet-SenseRelate-AllWords.t at line 99) > > # got: 'VBZ#NT' > > # expected: 'cat#n#7' > > t/WordNet-SenseRelate-AllWords......NOK 13# Failed test > (t/WordNet-SenseRelate-AllWords.t at line 131) > > t/WordNet-SenseRelate-AllWords......NOK 14# Failed test > (t/WordNet-SenseRelate-AllWords.t at line 136) > > # got: 'my#NT' > > # expected: 'my#CL' > > t/WordNet-SenseRelate-AllWords......NOK 15# Failed test > (t/WordNet-SenseRelate-AllWords.t at line 136) > > # got: 'PRP#NT' > > # expected: 'cat#n#NR' > > t/WordNet-SenseRelate-AllWords......NOK 16# Failed test > (t/WordNet-SenseRelate-AllWords.t at line 136) > > # got: 'cat#NT' > > # expected: 'be#v#1' > > t/WordNet-SenseRelate-AllWords......NOK 17# Failed test > (t/WordNet-SenseRelate-AllWords.t at line 136) > > # got: 'NN#NT' > > # expected: 
'a#CL' > > t/WordNet-SenseRelate-AllWords......NOK 18# Failed test > (t/WordNet-SenseRelate-AllWords.t at line 136) > > # got: 'is#NT' > > # expected: 'wise#a#1' > > t/WordNet-SenseRelate-AllWords......NOK 19# Failed test > (t/WordNet-SenseRelate-AllWords.t at line 136) > > # got: 'VBZ#NT' > > # expected: 'cat#n#7' > > t/WordNet-SenseRelate-AllWords......ok 30/30# Looks like you failed 14 tests > of 30. > > t/WordNet-SenseRelate-AllWords......dubious > > Test returned status 14 (wstat 3584, 0xe00) > > DIED. FAILED tests 4-10, 13-19 > > Failed 14/30 tests, 53.33% okay > > t/wsd...............................ok 2/5Current configuration: > > context file : /tmp/25819.1in > > format : tagged > > scheme : normal > > tagged text : yes > > measure : WordNet::Similarity::lesk > > window : 3 > > contextScore : 0 > > pairScore : 0 > > measure config: (none) > > glosses : no > > nocompoundify : no > > usemono : no > > backoff : no > > trace : no > > forcepos : no > > stoplist : (none) > > Loading WordNet... done. > > Use of uninitialized value in pattern match (m//) at utils/wsd.pl line 228. > > Use of uninitialized value in concatenation (.) or string at utils/wsd.pl > line 229. > > Use of uninitialized value in pattern match (m//) at utils/wsd.pl line 228. > > Use of uninitialized value in concatenation (.) or string at utils/wsd.pl > line 229. > > Use of uninitialized value in pattern match (m//) at utils/wsd.pl line 228. > > Use of uninitialized value in concatenation (.) or string at utils/wsd.pl > line 229. 
> > # Failed test (t/wsd.t at line 59) > > # got: 'parking_tickets#NT are#NT expensive#NT #NT #NT #NT ' > > # expected: 'parking_tickets#n#1 are#v#1 expensive#a#1 ' > > t/wsd...............................ok 4/5Current configuration: > > context file : /tmp/25819.2in > > format : raw > > scheme : normal > > tagged text : no > > measure : WordNet::Similarity::lesk > > window : 3 > > contextScore : 0 > > pairScore : 0 > > measure config: (none) > > glosses : no > > nocompoundify : no > > usemono : no > > backoff : no > > trace : no > > forcepos : no > > stoplist : (none) > > Loading WordNet... done. > > t/wsd...............................ok 5/5# Looks like you failed 1 tests of > 5. > > t/wsd...............................dubious > > Test returned status 1 (wstat 256, 0x100) > > DIED. FAILED test 3 > > Failed 1/5 tests, 80.00% okay > > Failed Test Stat Wstat Total Fail Failed List of > Failed > > ------------------------------------------------------------------------------- > > t/Error-Suffixes-tagged-wntagged. 6 1536 41 6 14.63% 3-8 > > t/WordNet-SenseRelate-AllWords.t 14 3584 30 14 46.67% 4-10 13-19 > > t/wsd.t 1 256 5 1 20.00% 3 > > Failed 3/3 test scripts, 0.00% okay. 21/76 subtests failed, 72.37% okay. > > make: *** [test_dynamic] Error 1 > > -------------------------------------------------------- > > Thanks! > > Ted > > On Tue, Apr 6, 2010 at 9:59 AM, Kamal, Jyoti <JK...@et...> wrote: > >> Hello Everyone: > >> I just joined the senserelate mailing group. I work for a project in ETS > >> & we are doing some experiments with various tools out there for > >> semantic similarity. > >> I was able to get WordNet::Similarity to work but it seems that without > >> senserelate, I cannot do much & the combination of both is what I am > >> looking for. I am having really hard time getting senserelate to install > >> on my linux box. Can someone please help? > >> > >> Here is some description of my error. Please let me know if you need > >> more info. 
> >> ------------------------------------------------------------------------ > >> ------------------------------ > >> > >> Something doesn't seem to be right in extracting the WordNet Tags. It > >> seems that before we get the sense number for each word, we try to tag > >> each word with wordNet Tags & something is missing here. > >> One thing I wanted to bring to attention is that one of the modules > >> needed for SenseRelate is WordNet::Tools but I could not find anywhere > >> of how to install this module. After doing all my study, I came to the > >> conclusion that it comes as a part of WordNet::Similarity & since I > >> already have that up & running, I have Wordnet::Tools working as well. I > >> may be wrong here. > >> > >> After doing the make, When I run "make test", I get various errors > >> like.. > >> > >> [jkamal@etsis134 WordNet-SenseRelate-AllWords-0.19]$ make test > >> PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" > >> "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t > >> t/Error-Suffixes-tagged-wntagged....ok 2/41# Failed test > >> (t/Error-Suffixes-tagged-wntagged.t at line 71) > >> t/Error-Suffixes-tagged-wntagged....NOK 3# got: 'The#NT' > >> # expected: 'The#CL' > >> # Failed test (t/Error-Suffixes-tagged-wntagged.t at line 71) > >> . > >> . > >> . > >> Loading WordNet... done. > >> Use of uninitialized value in pattern match (m//) at utils/wsd.pl line > >> 228. > >> Use of uninitialized value in concatenation (.) or string at > >> utils/wsd.pl line 229. > >> Use of uninitialized value in pattern match (m//) at utils/wsd.pl line > >> 228. > >> Use of uninitialized value in concatenation (.) or string at > >> utils/wsd.pl line 229. > >> Use of uninitialized value in pattern match (m//) at utils/wsd.pl line > >> 228. > >> Use of uninitialized value in concatenation (.) or string at > >> utils/wsd.pl line 229. 
> >> # Failed test (t/wsd.t at line 59) > >> # got: 'parking_tickets#NT are#NT expensive#NT #NT #NT #NT ' > >> # expected: 'parking_tickets#n#1 are#v#1 expensive#a#1 ' > >> > >> ------------------------------------------------------------------------ > >> --------------------------------- > >> Please let me know if you have any question. > >> > >> Thanks! > >> Jyoti > >> > >> > >> -----Original Message----- > >> From: Siddharth Patwardhan [mailto:si...@cs...] > >> Sent: Monday, April 05, 2010 7:15 PM > >> To: Ted Pedersen > >> Cc: Ambikesh jayal; sen...@li...; > >> senserelate-users; sat...@gm... > >> Subject: Re: [Senserelate-users] > >> [Senserelate-developers]WordNet::SenseRelate > >> > >> I remember a little while back Linas Vepsats wrote something that could > >> deal with sensekeys. He's released it on CPAN: > >> > >> http://search.cpan.org/dist/WordNet-SenseKey/ > >> > >> -- Sid. > >> > >> On Mon, 2010-04-05 at 17:34 -0500, Ted Pedersen wrote: > >>> Hi Bano, > >>> > >>> Very impressive memory. :) That had totally slipped my mind, but > >>> indeed it is here (on my very own web page :) > >>> > >>> http://www.d.umn.edu/~tpederse/wordnet.html > >>> > >>> Here's the short description from that page... > >>> > >>> Map from QueryData to WordNet sense-keys > >>> > >>> QueryData identifies WordNet senses using a word#pos#sense format. > >>> WordNet identifies senses using sense-keys (aka mnemonics). This > >>> program creates a mapping between the QueryData format and the WordNet > >>> sense-key format. (This tool is not specific to Senseval-2 data - it > >>> is generally useful if are using QueryData to access WordNet.) > >>> > >>> So, this sounds very much like what Ambikesh may want to use. Thanks > >>> for pointing this out, I absolutely missed this! > >>> > >>> Thanks! 
> >>> Ted > >>> > >>> On Mon, Apr 5, 2010 at 5:26 PM, Satanjeev Banerjee > >> <sat...@gm...> wrote: > >>> > Hi Ted, > >>> > > >>> > I'm pretty rusty with Senserelate, but I vaguely recall having > >> written a > >>> > program (way back when!) that at least created a map between the > >> sensekeys > >>> > and the word#pos#sense format (but maybe we are talking of something > >> else > >>> > here?) I googled around for it, and found this link: > >>> > http://www.d.umn.edu/~tpederse/Code/Readme-qd2wn.txt. Does this > >> program > >>> > still exist? As far as I remember, it depended on the minutiae of > >> the > >>> > various file formats in WordNet, so I wouldn't be surprised if those > >> formats > >>> > have changed now rendering the program useless :-). > >>> > > >>> > -Bano > >>> > > >>> > On Mon, Apr 5, 2010 at 6:11 PM, Ted Pedersen <dul...@gm...> > >> wrote: > >>> >> > >>> >> Hi Ambikesh, > >>> >> > >>> >> See my comments inline... > >>> >> > >>> >> On Mon, Apr 5, 2010 at 4:43 PM, Ambikesh jayal > >> <jay...@ya...> > >>> >> wrote: > >>> >> > > >>> >> > Hi, > >>> >> > The WordNet::SenseRelate returns the value in the format > >> "infer#v#5". To > >>> >> > run my experiments I need to compare it with a value in the > >> format > >>> >> > "infer%2:31:01::". > >>> >> > 1. Is there a function that takes sense key as input and returns > >> the > >>> >> > corresponding sense number? For example inputting "infer%2:31:01" > >> should > >>> >> > return "infer#v#5". > >>> >> > >>> >> I am not sure, but if there is it would be in WordNet::QueryData. > >>> >> > >>> >> http://search.cpan.org/dist/WordNet-QueryData/ > >>> >> > >>> >> While we use WordNet::QueryData, we don't include all of its > >>> >> functionality, so this might be something that they provide but we > >>> >> don't use. There is mailing list devoted to QueryData that might be > >>> >> the best place to ask this - it's a google group named wn-perl > >>> >> (details can be found at the site above). 
> >>> >> > >>> >> > > >>> >> > 2. Can WordNet::SenseRelate be configured to return the results > >> in the > >>> >> > format "infer%2:31:01::" ? > >>> >> > >>> >> No, we only support the wps format (word#part-of-speech#sense, as > >> in > >>> >> dog#n#2). > >>> >> > >>> >> > Also can WordNet::SenseRelate be configured for list of > >> stopwords, > >>> >> > special characters? > >>> >> > >>> >> Yes. See the stoplist option described here > >>> >> > >>> >> > >>> >> > >> http://search.cpan.org/dist/WordNet-SenseRelate-AllWords/lib/WordNet/Sen > >> seRelate/AllWords.pm > >>> >> > >>> >> and here > >>> >> > >>> >> > >> http://search.cpan.org/dist/WordNet-SenseRelate-AllWords/utils/wsd.pl > >>> >> > >>> >> and find a sample stoplist here : > >>> >> > >>> >> > >>> >> > >> http://cpansearch.perl.org/src/TPEDERSE/WordNet-SenseRelate-AllWords-0.1 > >> 9/samples/default-stoplist-raw.txt > >>> >> > >>> >> ;) > >>> >> > >>> >> Good luck, > >>> >> Ted > >>> >> > >>> >> > Thanks, > >>> >> > Regards, > >>> >> > Ambikesh Jayal. > >>> >> > School of IS, Computing & Maths, > >>> >> > Brunel University, > >>> >> > Uxbridge, UB8 3PH, > >>> >> > United Kingdom. > >>> >> > Email: amb...@br... > >>> >> > Webpage: http://people.brunel.ac.uk/~cspgaaj > >>> >> > > >>> >> > >>> >> > >>> >> -- > >>> >> Ted Pedersen > >>> >> http://www.d.umn.edu/~tpederse > >>> >> > >>> >> > >>> >> > >> ------------------------------------------------------------------------ > >> ------ > >>> >> Download Intel® Parallel Studio Eval > >>> >> Try the new software tools for yourself. Speed compiling, find bugs > >>> >> proactively, and fine-tune applications for parallel performance. > >>> >> See why Intel Parallel Studio got high marks during beta. > >>> >> http://p.sf.net/sfu/intel-sw-dev > >>> >> _______________________________________________ > >>> >> senserelate-developers mailing list > >>> >> sen...@li... 
> >>> >> https://lists.sourceforge.net/lists/listinfo/senserelate-developers > >>> > > >>> > > >>> > >>> > >>> > >>> -- > >>> Ted Pedersen > >>> http://www.d.umn.edu/~tpederse > >>> > >>> > >> ------------------------------------------------------------------------ > >> ------ > >>> Download Intel® Parallel Studio Eval > >>> Try the new software tools for yourself. Speed compiling, find bugs > >>> proactively, and fine-tune applications for parallel performance. > >>> See why Intel Parallel Studio got high marks during beta. > >>> http://p.sf.net/sfu/intel-sw-dev > >>> _______________________________________________ > >>> senserelate-developers mailing list > >>> sen...@li... > >>> https://lists.sourceforge.net/lists/listinfo/senserelate-developers > >> > >> > >> ------------------------------------------------------------------------ > >> ------ > >> Download Intel® Parallel Studio Eval > >> Try the new software tools for yourself. Speed compiling, find bugs > >> proactively, and fine-tune applications for parallel performance. > >> See why Intel Parallel Studio got high marks during beta. > >> http://p.sf.net/sfu/intel-sw-dev > >> _______________________________________________ > >> senserelate-users mailing list > >> sen...@li... > >> https://lists.sourceforge.net/lists/listinfo/senserelate-users > >> > >> -------------------------------------------------- > >> This e-mail and any files transmitted with it may contain privileged or >> confidential information. > >> It is solely for use by the individual for whom it is intended, even if >> addressed incorrectly. > >> If you received this e-mail in error, please notify the sender; do not >> disclose, copy, distribute, > >> or take any action in reliance on the contents of this information; and >> delete it from > >> your system. Any other use of this e-mail is prohibited. > >> > >> Thank you for your compliance. 
> >> -------------------------------------------------- > >> > > -- > > Ted Pedersen > > http://www.d.umn.edu/~tpederse > -- Ted Pedersen http://www.d.umn.edu/~tpederse |
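The diagnosis in the exchange above came down to a version gap: WordNet::Similarity 2.01 was installed while 2.05 was the then-current release. A small shell sketch of that comparison (the version numbers come from the thread; `sort -V` is a GNU coreutils version-sort, not something the thread itself prescribes):

```shell
# Compare an installed module version against the release that resolved
# the failing tests in this thread (2.01 installed, 2.05 current).
installed="2.01"   # e.g. from: perl -MWordNet::Similarity -e 'print $WordNet::Similarity::VERSION'
required="2.05"
oldest=$(printf '%s\n' "$installed" "$required" | sort -V | head -n1)
if [ "$oldest" = "$installed" ] && [ "$installed" != "$required" ]; then
  echo "upgrade needed"   # prints: upgrade needed
fi
```

Version-sorting rather than plain string comparison matters once versions like 2.10 enter the picture, where lexicographic order would get the answer wrong.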
From: Ted P. <dul...@gm...> - 2010-04-06 16:08:00
|
Hi Jyoti, To get the wn command to work, I think you need to have /usr/local/WordNet-3.0/bin in your PATH, although in your case it looks like you've put WordNet in a different directory than this default (which is fine) so you'd need to do something comparable (if you wanted wn). That said, wn is not used by WordNet::Similarity or WordNet::SenseRelate, I sometimes just find it an easy tool to work with. The fact that similarity.pl is working tells us that WordNet is installed ok, and your WNHOME value shows us that it's 3.0 Thanks, Ted On Tue, Apr 6, 2010 at 11:00 AM, Kamal, Jyoti <JK...@et...> wrote: > Hi Ted: > > Just wanted to mention that "wn -l" didn't give me any result. I hope this > is not an issue, as I am sure that WordNet is fine since WordNet::Similarity is > working. However, if I do wn -l, this is what I see. > > [jkamal@etsis134 ~]$ wn -l > > -bash: wn: command not found > > However, I do have this line in my .bash_profile & WordNet::Similarity > gives me the desired result. > > export WNHOME=/home/jkamal/myWordNet3/ > > Thanks! > > Jyoti
> >>> >> https://lists.sourceforge.net/lists/listinfo/senserelate-developers > >>> > > >>> -- > >>> Ted Pedersen > >>> http://www.d.umn.edu/~tpederse > >>> _______________________________________________ > >>> senserelate-developers mailing list > >>> sen...@li... > >>> https://lists.sourceforge.net/lists/listinfo/senserelate-developers > >> _______________________________________________ > >> senserelate-users mailing list > >> sen...@li... > >> https://lists.sourceforge.net/lists/listinfo/senserelate-users > >> > >> -------------------------------------------------- > >> This e-mail and any files transmitted with it may contain privileged or >> confidential information. > >> It is solely for use by the individual for whom it is intended, even if >> addressed incorrectly. > >> If you received this e-mail in error, please notify the sender; do not >> disclose, copy, distribute, > >> or take any action in reliance on the contents of this information; and >> delete it from > >> your system. Any other use of this e-mail is prohibited. > >> > >> Thank you for your compliance. 
> >> -------------------------------------------------- > > -- > > Ted Pedersen > > http://www.d.umn.edu/~tpederse -- Ted Pedersen http://www.d.umn.edu/~tpederse |
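The thread above turns on the difference between QueryData's wps format (word#pos#sense, e.g. "infer#v#5") and WordNet sense-keys ("infer%2:31:01::"). The wps side at least is trivial to take apart outside Perl; here is a minimal POSIX-shell sketch using parameter expansion (the string `infer#v#5` is taken from the thread):

```shell
# Split a QueryData-style wps string, word#pos#sense, into its parts.
wps='infer#v#5'
word=${wps%%#*}     # text before the first '#'
rest=${wps#*#}      # text after the first '#'
pos=${rest%%#*}     # text before the next '#'
sense=${rest#*#}    # text after the second '#'
printf '%s %s %s\n' "$word" "$pos" "$sense"   # prints: infer v 5
```

Mapping in the other direction (wps to sense-key) genuinely needs the WordNet index files, which is what the qd2wn program and WordNet-SenseKey discussed above provide.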
From: Ted P. <dul...@gm...> - 2010-04-06 16:04:29
|
Hi Jyoti, About the only thing I see that's a little out of the ordinary is that you are using version 2.01 of WordNet::Similarity, while the most current version is 2.05. I would suggest updating to 2.05 and seeing if that doesn't resolve these issues. You can find download links from CPAN and sourceforge for 2.05 here: http://wn-similarity.sourceforge.net/ Good luck, and let us know what happens. If you continue to get errors, go ahead and send the output of make test again. Ted On Tue, Apr 6, 2010 at 10:38 AM, Kamal, Jyoti <JK...@et...> wrote: > Hi Ted: > > Thanks so much for your response. I really didn't expect such a quick > response & I truly appreciate it. It's been a while, I have been struggling > with this. > > Please find my comments below. > > Thanks! > > Jyoti > > -----Original Message----- > From: Ted Pedersen [mailto:dul...@gm...] > Sent: Tuesday, April 06, 2010 11:22 AM > To: Kamal, Jyoti > Cc: sen...@li...; senserelate-users > Subject: Re: [Senserelate-users] > [Senserelate-developers]WordNet::SenseRelate > > Hi Jyoti, > > If you have installed WordNet::Similarity, you should have gotten > > WordNet::Tools as a result of that. But, we can double check that... > > 1) what kind of system are you running on? If linux can you send me > > the output of > > uname -a > > Linux etsis134.ets.org 2.6.9-78.ELsmp #1 SMP Wed Jul 9 15:46:26 EDT 2008 > x86_64 x86_64 x86_64 GNU/Linux > > 2) can you send the output of > > similarity.pl --measure WordNet::Similarity::path dog cat > > [jkamal@etsis134 bin]$ similarity.pl --type=WordNet::Similarity::path dog#n > cat#n > > Loading WordNet... done. > > Loading Module... done. 
> > dog#n#1 cat#n#1 0.2 > > (just to check that similarity is ok) > > 4) can you send the output of > > perl -MWordNet::QueryData -e 'print "$WordNet::QueryData::VERSION\n"' > > perl -MWordNet::Similarity -e 'print "$WordNet::Similarity::VERSION\n"' > > perl -MWordNet::Tools -e 'print "$WordNet::Tools::VERSION\n"' > > perl -MWordNet::SenseRelate::AllWords -e 'print > > "$WordNet::SenseRelate::AllWords::VERSION\n"' > > [jkamal@etsis134 bin]$ perl -MWordNet::QueryData -e 'print > "$WordNet::QueryData::VERSION\n"' > > 1.49 > > [jkamal@etsis134 bin]$ perl -MWordNet::Similarity -e 'print > "$WordNet::Similarity::VERSION\n"' > > 2.01 > > [jkamal@etsis134 bin]$ perl -MWordNet::Tools -e 'print > "$WordNet::Tools::VERSION\n"' > > 2.01 > > [jkamal@etsis134 bin]$ perl -MWordNet::SenseRelate::AllWords -e 'print > "$WordNet::SenseRelate::AllWords::VERSION\n"' > > 0.19 > > (this will show what versions of modules you are using..) > > 5) What version of WordNet are you using? > > [JK] WordNet 3.0 > > on linux > > wn -l > > should show that... > > 6) can you send the complete output from > > WordNet::SenseRelate::AllWords make test? 
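The version checks above produce dotted version strings (here 2.01 installed versus 2.05 current). Deciding whether an install is out of date is then just a version comparison; a minimal POSIX-shell sketch follows (the `version_lt` helper is a name of my own invention, and relying on `sort -V` is an assumption about the platform, since version sort is a GNU coreutils extension):

```shell
# Return success (0) when version $1 is strictly older than version $2.
# sort -V (GNU version sort) orders dotted version strings correctly.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}

installed=2.01
latest=2.05
if version_lt "$installed" "$latest"; then
  echo "WordNet::Similarity $installed is out of date; consider $latest"
fi
```

An actual upgrade would then go through CPAN or the sourceforge tarball linked in Ted's reply.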
> > --------------------------------------------------------------------- > > [jkamal@etsis134 WordNet-SenseRelate-AllWords-0.19]$ make test > > PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" > "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t > > t/Error-Suffixes-tagged-wntagged....ok 2/41# Failed test > (t/Error-Suffixes-tagged-wntagged.t at line 71) > > t/Error-Suffixes-tagged-wntagged....NOK 3# got: 'The#NT' > > # expected: 'The#CL' > > # Failed test (t/Error-Suffixes-tagged-wntagged.t at line 71) > > # got: 'DT#NT' > > # expected: 'star#n#1' > > t/Error-Suffixes-tagged-wntagged....NOK 4# Failed test > (t/Error-Suffixes-tagged-wntagged.t at line 71) > > # got: 'star#NT' > > # expected: 'marry#v#1' > > t/Error-Suffixes-tagged-wntagged....NOK 5# Failed test > (t/Error-Suffixes-tagged-wntagged.t at line 71) > > # got: 'NN#NT' > > # expected: 'qn#CL' > > t/Error-Suffixes-tagged-wntagged....NOK 6# Failed test > (t/Error-Suffixes-tagged-wntagged.t at line 71) > > t/Error-Suffixes-tagged-wntagged....NOK 7# got: 'married#NT' > > # expected: 'astronomer#n#1' > > # Failed test (t/Error-Suffixes-tagged-wntagged.t at line 71) > > t/Error-Suffixes-tagged-wntagged....NOK 8# got: 'VBD#NT' > > # expected: '.#IT' > > t/Error-Suffixes-tagged-wntagged....ok 41/41# Looks like you failed 6 tests > of 41. > > t/Error-Suffixes-tagged-wntagged....dubious > > Test returned status 6 (wstat 1536, 0x600) > > DIED. 
FAILED tests 3-8 > > Failed 6/41 tests, 85.37% okay > > t/WordNet-SenseRelate-AllWords......# WordNet hash : > eOS9lXC6GvMWznF1wkZofDdtbBU > > t/WordNet-SenseRelate-AllWords......ok 1/30# WordNet path : > /home/jkamal/myWordNet3//dict/ > > t/WordNet-SenseRelate-AllWords......ok 3/30# Failed test > (t/WordNet-SenseRelate-AllWords.t at line 96) > > t/WordNet-SenseRelate-AllWords......NOK 4# got: '11' > > # expected: '5' > > # Failed test (t/WordNet-SenseRelate-AllWords.t at line 99) > > t/WordNet-SenseRelate-AllWords......NOK 5# got: 'my#NT' > > # expected: 'my#CL' > > # Failed test (t/WordNet-SenseRelate-AllWords.t at line 99) > > t/WordNet-SenseRelate-AllWords......NOK 6# got: 'PRP#NT' > > # expected: 'cat#n#7' > > # Failed test (t/WordNet-SenseRelate-AllWords.t at line 99) > > t/WordNet-SenseRelate-AllWords......NOK 7# got: 'cat#NT' > > # expected: 'be#v#1' > > # Failed test (t/WordNet-SenseRelate-AllWords.t at line 99) > > t/WordNet-SenseRelate-AllWords......NOK 8# got: 'NN#NT' > > # expected: 'a#CL' > > # Failed test (t/WordNet-SenseRelate-AllWords.t at line 99) > > t/WordNet-SenseRelate-AllWords......NOK 9# got: 'is#NT' > > # expected: 'wise#a#1' > > # Failed test (t/WordNet-SenseRelate-AllWords.t at line 99) > > # got: 'VBZ#NT' > > # expected: 'cat#n#7' > > t/WordNet-SenseRelate-AllWords......NOK 13# Failed test > (t/WordNet-SenseRelate-AllWords.t at line 131) > > t/WordNet-SenseRelate-AllWords......NOK 14# Failed test > (t/WordNet-SenseRelate-AllWords.t at line 136) > > # got: 'my#NT' > > # expected: 'my#CL' > > t/WordNet-SenseRelate-AllWords......NOK 15# Failed test > (t/WordNet-SenseRelate-AllWords.t at line 136) > > # got: 'PRP#NT' > > # expected: 'cat#n#NR' > > t/WordNet-SenseRelate-AllWords......NOK 16# Failed test > (t/WordNet-SenseRelate-AllWords.t at line 136) > > # got: 'cat#NT' > > # expected: 'be#v#1' > > t/WordNet-SenseRelate-AllWords......NOK 17# Failed test > (t/WordNet-SenseRelate-AllWords.t at line 136) > > # got: 'NN#NT' > > # expected: 
'a#CL' > > t/WordNet-SenseRelate-AllWords......NOK 18# Failed test > (t/WordNet-SenseRelate-AllWords.t at line 136) > > # got: 'is#NT' > > # expected: 'wise#a#1' > > t/WordNet-SenseRelate-AllWords......NOK 19# Failed test > (t/WordNet-SenseRelate-AllWords.t at line 136) > > # got: 'VBZ#NT' > > # expected: 'cat#n#7' > > t/WordNet-SenseRelate-AllWords......ok 30/30# Looks like you failed 14 tests > of 30. > > t/WordNet-SenseRelate-AllWords......dubious > > Test returned status 14 (wstat 3584, 0xe00) > > DIED. FAILED tests 4-10, 13-19 > > Failed 14/30 tests, 53.33% okay > > t/wsd...............................ok 2/5Current configuration: > > context file : /tmp/25819.1in > > format : tagged > > scheme : normal > > tagged text : yes > > measure : WordNet::Similarity::lesk > > window : 3 > > contextScore : 0 > > pairScore : 0 > > measure config: (none) > > glosses : no > > nocompoundify : no > > usemono : no > > backoff : no > > trace : no > > forcepos : no > > stoplist : (none) > > Loading WordNet... done. > > Use of uninitialized value in pattern match (m//) at utils/wsd.pl line 228. > > Use of uninitialized value in concatenation (.) or string at utils/wsd.pl > line 229. > > Use of uninitialized value in pattern match (m//) at utils/wsd.pl line 228. > > Use of uninitialized value in concatenation (.) or string at utils/wsd.pl > line 229. > > Use of uninitialized value in pattern match (m//) at utils/wsd.pl line 228. > > Use of uninitialized value in concatenation (.) or string at utils/wsd.pl > line 229. 
> > # Failed test (t/wsd.t at line 59) > > # got: 'parking_tickets#NT are#NT expensive#NT #NT #NT #NT ' > > # expected: 'parking_tickets#n#1 are#v#1 expensive#a#1 ' > > t/wsd...............................ok 4/5Current configuration: > > context file : /tmp/25819.2in > > format : raw > > scheme : normal > > tagged text : no > > measure : WordNet::Similarity::lesk > > window : 3 > > contextScore : 0 > > pairScore : 0 > > measure config: (none) > > glosses : no > > nocompoundify : no > > usemono : no > > backoff : no > > trace : no > > forcepos : no > > stoplist : (none) > > Loading WordNet... done. > > t/wsd...............................ok 5/5# Looks like you failed 1 tests of > 5. > > t/wsd...............................dubious > > Test returned status 1 (wstat 256, 0x100) > > DIED. FAILED test 3 > > Failed 1/5 tests, 80.00% okay > > Failed Test Stat Wstat Total Fail Failed List of > Failed > > ------------------------------------------------------------------------------- > > t/Error-Suffixes-tagged-wntagged. 6 1536 41 6 14.63% 3-8 > > t/WordNet-SenseRelate-AllWords.t 14 3584 30 14 46.67% 4-10 13-19 > > t/wsd.t 1 256 5 1 20.00% 3 > > Failed 3/3 test scripts, 0.00% okay. 21/76 subtests failed, 72.37% okay. > > make: *** [test_dynamic] Error 1 > > -------------------------------------------------------- > > Thanks! > > Ted > > On Tue, Apr 6, 2010 at 9:59 AM, Kamal, Jyoti <JK...@et...> wrote: > >> Hello Everyone: > >> I just joined the senserelate mailing group. I work for a project in ETS > >> & we are doing some experiments with various tools out there for > >> semantic similarity. > >> I was able to get WordNet::Similarity to work but it seems that without > >> senserelate, I cannot do much & the combination of both is what I am > >> looking for. I am having really hard time getting senserelate to install > >> on my linux box. Can someone please help? > >> > >> Here is some description of my error. Please let me know if you need > >> more info. 
-- Ted Pedersen http://www.d.umn.edu/~tpederse |
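One of the checks Ted asks for above is `wn -l`, and it only works if WordNet's bin directory is on PATH; setting WNHOME alone is enough for the Perl modules (WordNet::QueryData reads it) but the shell never consults it when resolving commands. A minimal sketch of the fix — the install path here is the one quoted in this thread, so it is an assumption you would adjust to your own system:

```shell
# WNHOME is what the Perl side reads; the shell finds the wn binary
# through PATH, so both need to be set. Hypothetical path from the thread:
export WNHOME=/home/jkamal/myWordNet3
export PATH="$PATH:$WNHOME/bin"
# "wn -l" should now resolve instead of failing with "command not found"
```

Putting both exports in .bash_profile makes the change persistent across logins.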
From: Kamal, J. <JKamal@ETS.ORG> - 2010-04-06 16:00:18
|
Hi Ted: Just wanted to mention that "wn -l" didn't give me any result. I hope this is not an issue as I am sure that WordNet is fine as WordNet::Similarity is working. However, if I do wn -l, this is what I see. [jkamal@etsis134 ~]$ wn -l -bash: wn: command not found However, I do have this line in my .bash_profile & WordNet::Similarity gives me the desired result. export WNHOME=/home/jkamal/myWordNet3/ Thanks! Jyoti From: Kamal, Jyoti [mailto:JKamal@ETS.ORG] Sent: Tuesday, April 06, 2010 11:38 AM To: Ted Pedersen Cc: sen...@li...; senserelate-users Subject: Re: [Senserelate-users] [Senserelate-developers]WordNet::SenseRelate Hi Ted: Thanks so much for your response. I really didn't expect such a quick response & I truly appreciate it. It's been a while, I have been struggling with this. Please find my comments below. Thanks! Jyoti -----Original Message----- From: Ted Pedersen [mailto:dul...@gm...] Sent: Tuesday, April 06, 2010 11:22 AM To: Kamal, Jyoti Cc: sen...@li...; senserelate-users Subject: Re: [Senserelate-users] [Senserelate-developers]WordNet::SenseRelate Hi Jyoti, If you have installed WordNet::Similarity, you should have gotten WordNet::Tools as a result of that. But, we can double check that... 1) what kind of system are you running on? If linux can you send me the output of uname -a Linux etsis134.ets.org 2.6.9-78.ELsmp #1 SMP Wed Jul 9 15:46:26 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux 2) can you send the output of similarity.pl --measure WordNet::Similarity::path dog cat [jkamal@etsis134 bin]$ similarity.pl --type=WordNet::Similarity::path dog#n cat#n Loading WordNet... done. Loading Module... done. 
dog#n#1 cat#n#1 0.2 (just to check that similarity is ok) 4) can you send the output of perl -MWordNet::QueryData -e 'print "$WordNet::QueryData::VERSION\n"' perl -MWordNet::Similarity -e 'print "$WordNet::Similarity::VERSION\n"' perl -MWordNet::Tools -e 'print "$WordNet::Tools::VERSION\n"' perl -MWordNet::SenseRelate::AllWords -e 'print "$WordNet::SenseRelate::AllWords::VERSION\n"' [jkamal@etsis134 bin]$ perl -MWordNet::QueryData -e 'print "$WordNet::QueryData::VERSION\n"' 1.49 [jkamal@etsis134 bin]$ perl -MWordNet::Similarity -e 'print "$WordNet::Similarity::VERSION\n"' 2.01 [jkamal@etsis134 bin]$ perl -MWordNet::Tools -e 'print "$WordNet::Tools::VERSION\n"' 2.01 [jkamal@etsis134 bin]$ perl -MWordNet::SenseRelate::AllWords -e 'print "$WordNet::SenseRelate::AllWords::VERSION\n"' 0.19 (this will show what versions of modules you are using..) 5) What version of WordNet are you using? [JK] WordNet 3.0 on linux wn -l should show that... 6) can you send the complete output from WordNet::SenseRelate::AllWords make test? 
From: Kamal, J. <JKamal@ETS.ORG> - 2010-04-06 15:38:17
|
Hi Ted: Thanks so much for your response. I really didn't expect such a quick reply & I truly appreciate it. It's been a while that I have been struggling with this. Please find my comments below. Thanks! Jyoti -----Original Message----- From: Ted Pedersen [mailto:dul...@gm...] Sent: Tuesday, April 06, 2010 11:22 AM To: Kamal, Jyoti Cc: sen...@li...; senserelate-users Subject: Re: [Senserelate-users] [Senserelate-developers]WordNet::SenseRelate Hi Jyoti, If you have installed WordNet::Similarity, you should have gotten WordNet::Tools as a result of that. But, we can double check that... 1) what kind of system are you running on? If linux can you send me the output of uname -a Linux etsis134.ets.org 2.6.9-78.ELsmp #1 SMP Wed Jul 9 15:46:26 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux 2) can you send the output of similarity.pl --measure WordNet::Similarity::path dog cat [jkamal@etsis134 bin]$ similarity.pl --type=WordNet::Similarity::path dog#n cat#n Loading WordNet... done. Loading Module... done. dog#n#1 cat#n#1 0.2 (just to check that similarity is ok) 4) can you send the output of perl -MWordNet::QueryData -e 'print "$WordNet::QueryData::VERSION\n"' perl -MWordNet::Similarity -e 'print "$WordNet::Similarity::VERSION\n"' perl -MWordNet::Tools -e 'print "$WordNet::Tools::VERSION\n"' perl -MWordNet::SenseRelate::AllWords -e 'print "$WordNet::SenseRelate::AllWords::VERSION\n"' [jkamal@etsis134 bin]$ perl -MWordNet::QueryData -e 'print "$WordNet::QueryData::VERSION\n"' 1.49 [jkamal@etsis134 bin]$ perl -MWordNet::Similarity -e 'print "$WordNet::Similarity::VERSION\n"' 2.01 [jkamal@etsis134 bin]$ perl -MWordNet::Tools -e 'print "$WordNet::Tools::VERSION\n"' 2.01 [jkamal@etsis134 bin]$ perl -MWordNet::SenseRelate::AllWords -e 'print "$WordNet::SenseRelate::AllWords::VERSION\n"' 0.19 (this will show what versions of modules you are using..) 5) What version of WordNet are you using? [JK] WordNet 3.0 on linux wn -l should show that... 
6) can you send the complete output from WordNet::SenseRelate::AllWords make test? --------------------------------------------------------------------- [jkamal@etsis134 WordNet-SenseRelate-AllWords-0.19]$ make test PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t t/Error-Suffixes-tagged-wntagged....ok 2/41# Failed test (t/Error-Suffixes-tagged-wntagged.t at line 71) t/Error-Suffixes-tagged-wntagged....NOK 3# got: 'The#NT' # expected: 'The#CL' # Failed test (t/Error-Suffixes-tagged-wntagged.t at line 71) # got: 'DT#NT' # expected: 'star#n#1' t/Error-Suffixes-tagged-wntagged....NOK 4# Failed test (t/Error-Suffixes-tagged-wntagged.t at line 71) # got: 'star#NT' # expected: 'marry#v#1' t/Error-Suffixes-tagged-wntagged....NOK 5# Failed test (t/Error-Suffixes-tagged-wntagged.t at line 71) # got: 'NN#NT' # expected: 'qn#CL' t/Error-Suffixes-tagged-wntagged....NOK 6# Failed test (t/Error-Suffixes-tagged-wntagged.t at line 71) t/Error-Suffixes-tagged-wntagged....NOK 7# got: 'married#NT' # expected: 'astronomer#n#1' # Failed test (t/Error-Suffixes-tagged-wntagged.t at line 71) t/Error-Suffixes-tagged-wntagged....NOK 8# got: 'VBD#NT' # expected: '.#IT' t/Error-Suffixes-tagged-wntagged....ok 41/41# Looks like you failed 6 tests of 41. t/Error-Suffixes-tagged-wntagged....dubious Test returned status 6 (wstat 1536, 0x600) DIED. 
FAILED tests 3-8 Failed 6/41 tests, 85.37% okay t/WordNet-SenseRelate-AllWords......# WordNet hash : eOS9lXC6GvMWznF1wkZofDdtbBU t/WordNet-SenseRelate-AllWords......ok 1/30# WordNet path : /home/jkamal/myWordNet3//dict/ t/WordNet-SenseRelate-AllWords......ok 3/30# Failed test (t/WordNet-SenseRelate-AllWords.t at line 96) t/WordNet-SenseRelate-AllWords......NOK 4# got: '11' # expected: '5' # Failed test (t/WordNet-SenseRelate-AllWords.t at line 99) t/WordNet-SenseRelate-AllWords......NOK 5# got: 'my#NT' # expected: 'my#CL' # Failed test (t/WordNet-SenseRelate-AllWords.t at line 99) t/WordNet-SenseRelate-AllWords......NOK 6# got: 'PRP#NT' # expected: 'cat#n#7' # Failed test (t/WordNet-SenseRelate-AllWords.t at line 99) t/WordNet-SenseRelate-AllWords......NOK 7# got: 'cat#NT' # expected: 'be#v#1' # Failed test (t/WordNet-SenseRelate-AllWords.t at line 99) t/WordNet-SenseRelate-AllWords......NOK 8# got: 'NN#NT' # expected: 'a#CL' # Failed test (t/WordNet-SenseRelate-AllWords.t at line 99) t/WordNet-SenseRelate-AllWords......NOK 9# got: 'is#NT' # expected: 'wise#a#1' # Failed test (t/WordNet-SenseRelate-AllWords.t at line 99) # got: 'VBZ#NT' # expected: 'cat#n#7' t/WordNet-SenseRelate-AllWords......NOK 13# Failed test (t/WordNet-SenseRelate-AllWords.t at line 131) t/WordNet-SenseRelate-AllWords......NOK 14# Failed test (t/WordNet-SenseRelate-AllWords.t at line 136) # got: 'my#NT' # expected: 'my#CL' t/WordNet-SenseRelate-AllWords......NOK 15# Failed test (t/WordNet-SenseRelate-AllWords.t at line 136) # got: 'PRP#NT' # expected: 'cat#n#NR' t/WordNet-SenseRelate-AllWords......NOK 16# Failed test (t/WordNet-SenseRelate-AllWords.t at line 136) # got: 'cat#NT' # expected: 'be#v#1' t/WordNet-SenseRelate-AllWords......NOK 17# Failed test (t/WordNet-SenseRelate-AllWords.t at line 136) # got: 'NN#NT' # expected: 'a#CL' t/WordNet-SenseRelate-AllWords......NOK 18# Failed test (t/WordNet-SenseRelate-AllWords.t at line 136) # got: 'is#NT' # expected: 'wise#a#1' 
t/WordNet-SenseRelate-AllWords......NOK 19# Failed test (t/WordNet-SenseRelate-AllWords.t at line 136) # got: 'VBZ#NT' # expected: 'cat#n#7' t/WordNet-SenseRelate-AllWords......ok 30/30# Looks like you failed 14 tests of 30. t/WordNet-SenseRelate-AllWords......dubious Test returned status 14 (wstat 3584, 0xe00) DIED. FAILED tests 4-10, 13-19 Failed 14/30 tests, 53.33% okay t/wsd...............................ok 2/5Current configuration: context file : /tmp/25819.1in format : tagged scheme : normal tagged text : yes measure : WordNet::Similarity::lesk window : 3 contextScore : 0 pairScore : 0 measure config: (none) glosses : no nocompoundify : no usemono : no backoff : no trace : no forcepos : no stoplist : (none) Loading WordNet... done. Use of uninitialized value in pattern match (m//) at utils/wsd.pl line 228. Use of uninitialized value in concatenation (.) or string at utils/wsd.pl line 229. Use of uninitialized value in pattern match (m//) at utils/wsd.pl line 228. Use of uninitialized value in concatenation (.) or string at utils/wsd.pl line 229. Use of uninitialized value in pattern match (m//) at utils/wsd.pl line 228. Use of uninitialized value in concatenation (.) or string at utils/wsd.pl line 229. # Failed test (t/wsd.t at line 59) # got: 'parking_tickets#NT are#NT expensive#NT #NT #NT #NT ' # expected: 'parking_tickets#n#1 are#v#1 expensive#a#1 ' t/wsd...............................ok 4/5Current configuration: context file : /tmp/25819.2in format : raw scheme : normal tagged text : no measure : WordNet::Similarity::lesk window : 3 contextScore : 0 pairScore : 0 measure config: (none) glosses : no nocompoundify : no usemono : no backoff : no trace : no forcepos : no stoplist : (none) Loading WordNet... done. t/wsd...............................ok 5/5# Looks like you failed 1 tests of 5. t/wsd...............................dubious Test returned status 1 (wstat 256, 0x100) DIED. 
FAILED test 3 Failed 1/5 tests, 80.00% okay Failed Test Stat Wstat Total Fail Failed List of Failed ------------------------------------------------------------------------ ------- t/Error-Suffixes-tagged-wntagged. 6 1536 41 6 14.63% 3-8 t/WordNet-SenseRelate-AllWords.t 14 3584 30 14 46.67% 4-10 13-19 t/wsd.t 1 256 5 1 20.00% 3 Failed 3/3 test scripts, 0.00% okay. 21/76 subtests failed, 72.37% okay. make: *** [test_dynamic] Error 1 -------------------------------------------------------- Thanks! Ted On Tue, Apr 6, 2010 at 9:59 AM, Kamal, Jyoti <JK...@et...> wrote: > Hello Everyone: > I just joined the senserelate mailing group. I work for a project in ETS > & we are doing some experiments with various tools out there for > semantic similarity. > I was able to get WordNet::Similarity to work but it seems that without > senserelate, I cannot do much & the combination of both is what I am > looking for. I am having really hard time getting senserelate to install > on my linux box. Can someone please help? > > Here is some description of my error. Please let me know if you need > more info. > ------------------------------------------------------------------------ > ------------------------------ > > Something doesn't seem to be right in extracting the WordNet Tags. It > seems that before we get the sense number for each word, we try to tag > each word with wordNet Tags & something is missing here. > One thing I wanted to bring to attention is that one of the modules > needed for SenseRelate is WordNet::Tools but I could not find anywhere > of how to install this module. After doing all my study, I came to the > conclusion that it comes as a part of WordNet::Similarity & since I > already have that up & running, I have Wordnet::Tools working as well. I > may be wrong here. > > After doing the make, When I run "make test", I get various errors > like.. 
> > [jkamal@etsis134 WordNet-SenseRelate-AllWords-0.19]$ make test > PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" > "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t > t/Error-Suffixes-tagged-wntagged....ok 2/41# Failed test > (t/Error-Suffixes-tagged-wntagged.t at line 71) > t/Error-Suffixes-tagged-wntagged....NOK 3# got: 'The#NT' > # expected: 'The#CL' > # Failed test (t/Error-Suffixes-tagged-wntagged.t at line 71) > . > . > . > Loading WordNet... done. > Use of uninitialized value in pattern match (m//) at utils/wsd.pl line > 228. > Use of uninitialized value in concatenation (.) or string at > utils/wsd.pl line 229. > Use of uninitialized value in pattern match (m//) at utils/wsd.pl line > 228. > Use of uninitialized value in concatenation (.) or string at > utils/wsd.pl line 229. > Use of uninitialized value in pattern match (m//) at utils/wsd.pl line > 228. > Use of uninitialized value in concatenation (.) or string at > utils/wsd.pl line 229. > # Failed test (t/wsd.t at line 59) > # got: 'parking_tickets#NT are#NT expensive#NT #NT #NT #NT ' > # expected: 'parking_tickets#n#1 are#v#1 expensive#a#1 ' > > ------------------------------------------------------------------------ > --------------------------------- > Please let me know if you have any question. > > Thanks! > Jyoti > > > -----Original Message----- > From: Siddharth Patwardhan [mailto:si...@cs...] > Sent: Monday, April 05, 2010 7:15 PM > To: Ted Pedersen > Cc: Ambikesh jayal; sen...@li...; > senserelate-users; sat...@gm... > Subject: Re: [Senserelate-users] > [Senserelate-developers]WordNet::SenseRelate > > I remember a little while back Linas Vepsats wrote something that could > deal with sensekeys. He's released it on CPAN: > > http://search.cpan.org/dist/WordNet-SenseKey/ > > -- Sid. > > On Mon, 2010-04-05 at 17:34 -0500, Ted Pedersen wrote: >> Hi Bano, >> >> Very impressive memory. 
:) That had totally slipped my mind, but >> indeed it is here (on my very own web page :) >> >> http://www.d.umn.edu/~tpederse/wordnet.html >> >> Here's the short description from that page... >> >> Map from QueryData to WordNet sense-keys >> >> QueryData identifies WordNet senses using a word#pos#sense format. >> WordNet identifies senses using sense-keys (aka mnemonics). This >> program creates a mapping between the QueryData format and the WordNet >> sense-key format. (This tool is not specific to Senseval-2 data - it >> is generally useful if are using QueryData to access WordNet.) >> >> So, this sounds very much like what Ambikesh may want to use. Thanks >> for pointing this out, I absolutely missed this! >> >> Thanks! >> Ted >> >> On Mon, Apr 5, 2010 at 5:26 PM, Satanjeev Banerjee > <sat...@gm...> wrote: >> > Hi Ted, >> > >> > I'm pretty rusty with Senserelate, but I vaguely recall having > written a >> > program (way back when!) that at least created a map between the > sensekeys >> > and the word#pos#sense format (but maybe we are talking of something > else >> > here?) I googled around for it, and found this link: >> > http://www.d.umn.edu/~tpederse/Code/Readme-qd2wn.txt. Does this > program >> > still exist? As far as I remember, it depended on the minutiae of > the >> > various file formats in WordNet, so I wouldn't be surprised if those > formats >> > have changed now rendering the program useless :-). >> > >> > -Bano >> > >> > On Mon, Apr 5, 2010 at 6:11 PM, Ted Pedersen <dul...@gm...> > wrote: >> >> >> >> Hi Ambikesh, >> >> >> >> See my comments inline... >> >> >> >> On Mon, Apr 5, 2010 at 4:43 PM, Ambikesh jayal > <jay...@ya...> >> >> wrote: >> >> > >> >> > Hi, >> >> > The WordNet::SenseRelate returns the value in the format > "infer#v#5". To >> >> > run my experiments I need to compare it with a value in the > format >> >> > "infer%2:31:01::". >> >> > 1. 
Is there a function that takes sense key as input and returns > the >> >> > corresponding sense number? For example inputting "infer%2:31:01" > should >> >> > return "infer#v#5". >> >> >> >> I am not sure, but if there is it would be in WordNet::QueryData. >> >> >> >> http://search.cpan.org/dist/WordNet-QueryData/ >> >> >> >> While we use WordNet::QueryData, we don't include all of its >> >> functionality, so this might be something that they provide but we >> >> don't use. There is mailing list devoted to QueryData that might be >> >> the best place to ask this - it's a google group named wn-perl >> >> (details can be found at the site above). >> >> >> >> > >> >> > 2. Can WordNet::SenseRelate be configured to return the results > in the >> >> > format "infer%2:31:01::" ? >> >> >> >> No, we only support the wps format (word#part-of-speech#sense, as > in >> >> dog#n#2). >> >> >> >> > Also can WordNet::SenseRelate be configured for list of > stopwords, >> >> > special characters? >> >> >> >> Yes. See the stoplist option described here >> >> >> >> >> >> > http://search.cpan.org/dist/WordNet-SenseRelate-AllWords/lib/WordNet/Sen > seRelate/AllWords.pm >> >> >> >> and here >> >> >> >> > http://search.cpan.org/dist/WordNet-SenseRelate-AllWords/utils/wsd.pl >> >> >> >> and find a sample stoplist here : >> >> >> >> >> >> > http://cpansearch.perl.org/src/TPEDERSE/WordNet-SenseRelate-AllWords-0.1 > 9/samples/default-stoplist-raw.txt >> >> >> >> ;) >> >> >> >> Good luck, >> >> Ted >> >> >> >> > Thanks, >> >> > Regards, >> >> > Ambikesh Jayal. >> >> > School of IS, Computing & Maths, >> >> > Brunel University, >> >> > Uxbridge, UB8 3PH, >> >> > United Kingdom. >> >> > Email: amb...@br... 
>> >> > Webpage: http://people.brunel.ac.uk/~cspgaaj >> >> > >> >> >> >> >> >> -- >> >> Ted Pedersen >> >> http://www.d.umn.edu/~tpederse >> >> >> >> >> >> > ------------------------------------------------------------------------ > ------ >> >> Download Intel® Parallel Studio Eval >> >> Try the new software tools for yourself. Speed compiling, find bugs >> >> proactively, and fine-tune applications for parallel performance. >> >> See why Intel Parallel Studio got high marks during beta. >> >> http://p.sf.net/sfu/intel-sw-dev >> >> _______________________________________________ >> >> senserelate-developers mailing list >> >> sen...@li... >> >> https://lists.sourceforge.net/lists/listinfo/senserelate-developers >> > >> > >> >> >> >> -- >> Ted Pedersen >> http://www.d.umn.edu/~tpederse >> >> > ------------------------------------------------------------------------ > ------ >> Download Intel® Parallel Studio Eval >> Try the new software tools for yourself. Speed compiling, find bugs >> proactively, and fine-tune applications for parallel performance. >> See why Intel Parallel Studio got high marks during beta. >> http://p.sf.net/sfu/intel-sw-dev >> _______________________________________________ >> senserelate-developers mailing list >> sen...@li... >> https://lists.sourceforge.net/lists/listinfo/senserelate-developers > > > ------------------------------------------------------------------------ > ------ > Download Intel® Parallel Studio Eval > Try the new software tools for yourself. Speed compiling, find bugs > proactively, and fine-tune applications for parallel performance. > See why Intel Parallel Studio got high marks during beta. > http://p.sf.net/sfu/intel-sw-dev > _______________________________________________ > senserelate-users mailing list > sen...@li... 
> https://lists.sourceforge.net/lists/listinfo/senserelate-users > > -------------------------------------------------- > This e-mail and any files transmitted with it may contain privileged or confidential information. > It is solely for use by the individual for whom it is intended, even if addressed incorrectly. > If you received this e-mail in error, please notify the sender; do not disclose, copy, distribute, > or take any action in reliance on the contents of this information; and delete it from > your system. Any other use of this e-mail is prohibited. > > Thank you for your compliance. > -------------------------------------------------- > > -- Ted Pedersen http://www.d.umn.edu/~tpederse -------------------------------------------------- This e-mail and any files transmitted with it may contain privileged or confidential information. It is solely for use by the individual for whom it is intended, even if addressed incorrectly. If you received this e-mail in error, please notify the sender; do not disclose, copy, distribute, or take any action in reliance on the contents of this information; and delete it from your system. Any other use of this e-mail is prohibited. Thank you for your compliance. -------------------------------------------------- |
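[Editor's note] The sense-key question discussed above (mapping "infer%2:31:01::" to "infer#v#5") can be sketched in Perl. This sketch is not part of SenseRelate or QueryData; `parse_sense_key` and `sense_key_to_wps` are hypothetical helper names. The POS letter is recoverable from the ss_type digit inside the sense key (1=n, 2=v, 3=a, 4=r, 5=adjective satellite), but the sense *number* is not encoded in the key itself; it has to be looked up in WordNet's `index.sense` file, whose whitespace-separated fields are sense_key, synset_offset, sense_number, and tag_cnt.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ss_type digit in a sense key -> QueryData-style POS letter.
my %ss_type_to_pos = (1 => 'n', 2 => 'v', 3 => 'a', 4 => 'r', 5 => 'a');

# Split a sense key such as "infer%2:31:01::" into (lemma, pos letter).
sub parse_sense_key {
    my ($key) = @_;
    my ($lemma, $ss_type) = $key =~ /^([^%]+)%(\d):/
        or die "not a sense key: $key";
    return ($lemma, $ss_type_to_pos{$ss_type});
}

# Given a sense key and a path to WordNet's index.sense, return word#pos#sense.
sub sense_key_to_wps {
    my ($key, $index_sense_path) = @_;
    my ($lemma, $pos) = parse_sense_key($key);
    open my $fh, '<', $index_sense_path
        or die "cannot open $index_sense_path: $!";
    while (my $line = <$fh>) {
        # index.sense line: sense_key synset_offset sense_number tag_cnt
        my ($k, undef, $num) = split ' ', $line;
        return "$lemma#$pos#$num" if $k eq $key;
    }
    die "sense key $key not found in $index_sense_path";
}

my ($lemma, $pos) = parse_sense_key('infer%2:31:01::');
print "$lemma#$pos\n";   # prints "infer#v"
```

Given a WordNet installation, something like `sense_key_to_wps('infer%2:31:01::', "$ENV{WNHOME}/dict/index.sense")` would then return the word#pos#sense string that SenseRelate emits, per the correspondence stated in the thread.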
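[Editor's note] The module-version one-liners exchanged above can be bundled into a single diagnostic script. This is a sketch only, not shipped with any of these packages; `module_version` is a hypothetical helper. It prints the installed version of each module SenseRelate depends on, or flags the module as missing.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return a module's $VERSION, or 'NOT INSTALLED' if it cannot be loaded.
sub module_version {
    my ($mod) = @_;
    (my $file = "$mod.pm") =~ s{::}{/}g;   # WordNet::Tools -> WordNet/Tools.pm
    return 'NOT INSTALLED' unless eval { require $file; 1 };
    no strict 'refs';
    return ${"${mod}::VERSION"} // '(no $VERSION)';
}

# The modules Ted asks about, checked in dependency order.
foreach my $mod (qw(WordNet::QueryData WordNet::Similarity
                    WordNet::Tools WordNet::SenseRelate::AllWords)) {
    printf "%-35s %s\n", $mod, module_version($mod);
}
```

This covers questions 4 of the checklist in one run; the WordNet installation itself would still be checked separately, e.g. with `wn -l` as suggested in the thread.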
From: Ted P. <dul...@gm...> - 2010-04-06 15:22:04
|
Hi Jyoti, If you have installed WordNet::Similarity, you should have gotten WordNet::Tools as a result of that. But, we can double check that... 1) what kind of system are you running on? If linux can you send me the output of uname -a 2) can you send the output of similarity.pl --measure WordNet::Similarity::path dog cat (just to check that similarity is ok) 4) can you send the output of perl -MWordNet::QueryData -e 'print "$WordNet::QueryData::VERSION\n"' perl -MWordNet::Similarity -e 'print "$WordNet::Similarity::VERSION\n"' perl -MWordNet::Tools -e 'print "$WordNet::Tools::VERSION\n"' perl -MWordNet::SenseRelate::AllWords -e 'print "$WordNet::SenseRelate::AllWords::VERSION\n"' (this will show what versions of modules you are using..) 5) What version of WordNet are you using? on linux wn -l should show that... 6) can you send the complete output from WordNet::SenseRelate::AllWords make test? Thanks! Ted On Tue, Apr 6, 2010 at 9:59 AM, Kamal, Jyoti <JK...@et...> wrote: > Hello Everyone: > I just joined the senserelate mailing group. I work for a project in ETS > & we are doing some experiments with various tools out there for > semantic similarity. > I was able to get WordNet::Similarity to work but it seems that without > senserelate, I cannot do much & the combination of both is what I am > looking for. I am having really hard time getting senserelate to install > on my linux box. Can someone please help? > > Here is some description of my error. Please let me know if you need > more info. > ------------------------------------------------------------------------ > ------------------------------ > > Something doesn't seem to be right in extracting the WordNet Tags. It > seems that before we get the sense number for each word, we try to tag > each word with wordNet Tags & something is missing here. 
> One thing I wanted to bring to attention is that one of the modules > needed for SenseRelate is WordNet::Tools but I could not find anywhere > of how to install this module. After doing all my study, I came to the > conclusion that it comes as a part of WordNet::Similarity & since I > already have that up & running, I have Wordnet::Tools working as well. I > may be wrong here. > > After doing the make, When I run "make test", I get various errors > like.. > > [jkamal@etsis134 WordNet-SenseRelate-AllWords-0.19]$ make test > PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" > "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t > t/Error-Suffixes-tagged-wntagged....ok 2/41# Failed test > (t/Error-Suffixes-tagged-wntagged.t at line 71) > t/Error-Suffixes-tagged-wntagged....NOK 3# got: 'The#NT' > # expected: 'The#CL' > # Failed test (t/Error-Suffixes-tagged-wntagged.t at line 71) > . > . > . > Loading WordNet... done. > Use of uninitialized value in pattern match (m//) at utils/wsd.pl line > 228. > Use of uninitialized value in concatenation (.) or string at > utils/wsd.pl line 229. > Use of uninitialized value in pattern match (m//) at utils/wsd.pl line > 228. > Use of uninitialized value in concatenation (.) or string at > utils/wsd.pl line 229. > Use of uninitialized value in pattern match (m//) at utils/wsd.pl line > 228. > Use of uninitialized value in concatenation (.) or string at > utils/wsd.pl line 229. > # Failed test (t/wsd.t at line 59) > # got: 'parking_tickets#NT are#NT expensive#NT #NT #NT #NT ' > # expected: 'parking_tickets#n#1 are#v#1 expensive#a#1 ' > > ------------------------------------------------------------------------ > --------------------------------- > Please let me know if you have any question. > > Thanks! > Jyoti > > > -----Original Message----- > From: Siddharth Patwardhan [mailto:si...@cs...] > Sent: Monday, April 05, 2010 7:15 PM > To: Ted Pedersen > Cc: Ambikesh jayal; sen...@li...; > senserelate-users; sat...@gm... 
> Subject: Re: [Senserelate-users] > [Senserelate-developers]WordNet::SenseRelate > > I remember a little while back Linas Vepsats wrote something that could > deal with sensekeys. He's released it on CPAN: > > http://search.cpan.org/dist/WordNet-SenseKey/ > > -- Sid. > > On Mon, 2010-04-05 at 17:34 -0500, Ted Pedersen wrote: >> Hi Bano, >> >> Very impressive memory. :) That had totally slipped my mind, but >> indeed it is here (on my very own web page :) >> >> http://www.d.umn.edu/~tpederse/wordnet.html >> >> Here's the short description from that page... >> >> Map from QueryData to WordNet sense-keys >> >> QueryData identifies WordNet senses using a word#pos#sense format. >> WordNet identifies senses using sense-keys (aka mnemonics). This >> program creates a mapping between the QueryData format and the WordNet >> sense-key format. (This tool is not specific to Senseval-2 data - it >> is generally useful if are using QueryData to access WordNet.) >> >> So, this sounds very much like what Ambikesh may want to use. Thanks >> for pointing this out, I absolutely missed this! >> >> Thanks! >> Ted >> >> On Mon, Apr 5, 2010 at 5:26 PM, Satanjeev Banerjee > <sat...@gm...> wrote: >> > Hi Ted, >> > >> > I'm pretty rusty with Senserelate, but I vaguely recall having > written a >> > program (way back when!) that at least created a map between the > sensekeys >> > and the word#pos#sense format (but maybe we are talking of something > else >> > here?) I googled around for it, and found this link: >> > http://www.d.umn.edu/~tpederse/Code/Readme-qd2wn.txt. Does this > program >> > still exist? As far as I remember, it depended on the minutiae of > the >> > various file formats in WordNet, so I wouldn't be surprised if those > formats >> > have changed now rendering the program useless :-). >> > >> > -Bano >> > >> > On Mon, Apr 5, 2010 at 6:11 PM, Ted Pedersen <dul...@gm...> > wrote: >> >> >> >> Hi Ambikesh, >> >> >> >> See my comments inline... 
>> >> >> >> On Mon, Apr 5, 2010 at 4:43 PM, Ambikesh jayal > <jay...@ya...> >> >> wrote: >> >> > >> >> > Hi, >> >> > The WordNet::SenseRelate returns the value in the format > "infer#v#5". To >> >> > run my experiments I need to compare it with a value in the > format >> >> > "infer%2:31:01::". >> >> > 1. Is there a function that takes sense key as input and returns > the >> >> > corresponding sense number? For example inputting "infer%2:31:01" > should >> >> > return "infer#v#5". >> >> >> >> I am not sure, but if there is it would be in WordNet::QueryData. >> >> >> >> http://search.cpan.org/dist/WordNet-QueryData/ >> >> >> >> While we use WordNet::QueryData, we don't include all of its >> >> functionality, so this might be something that they provide but we >> >> don't use. There is mailing list devoted to QueryData that might be >> >> the best place to ask this - it's a google group named wn-perl >> >> (details can be found at the site above). >> >> >> >> > >> >> > 2. Can WordNet::SenseRelate be configured to return the results > in the >> >> > format "infer%2:31:01::" ? >> >> >> >> No, we only support the wps format (word#part-of-speech#sense, as > in >> >> dog#n#2). >> >> >> >> > Also can WordNet::SenseRelate be configured for list of > stopwords, >> >> > special characters? >> >> >> >> Yes. See the stoplist option described here >> >> >> >> >> >> > http://search.cpan.org/dist/WordNet-SenseRelate-AllWords/lib/WordNet/Sen > seRelate/AllWords.pm >> >> >> >> and here >> >> >> >> > http://search.cpan.org/dist/WordNet-SenseRelate-AllWords/utils/wsd.pl >> >> >> >> and find a sample stoplist here : >> >> >> >> >> >> > http://cpansearch.perl.org/src/TPEDERSE/WordNet-SenseRelate-AllWords-0.1 > 9/samples/default-stoplist-raw.txt >> >> >> >> ;) >> >> >> >> Good luck, >> >> Ted >> >> >> >> > Thanks, >> >> > Regards, >> >> > Ambikesh Jayal. >> >> > School of IS, Computing & Maths, >> >> > Brunel University, >> >> > Uxbridge, UB8 3PH, >> >> > United Kingdom. 
>> >> > Email: amb...@br... >> >> > Webpage: http://people.brunel.ac.uk/~cspgaaj >> >> > >> >> >> >> >> >> -- >> >> Ted Pedersen >> >> http://www.d.umn.edu/~tpederse >> >> >> >> >> >> > ------------------------------------------------------------------------ > ------ >> >> Download Intel® Parallel Studio Eval >> >> Try the new software tools for yourself. Speed compiling, find bugs >> >> proactively, and fine-tune applications for parallel performance. >> >> See why Intel Parallel Studio got high marks during beta. >> >> http://p.sf.net/sfu/intel-sw-dev >> >> _______________________________________________ >> >> senserelate-developers mailing list >> >> sen...@li... >> >> https://lists.sourceforge.net/lists/listinfo/senserelate-developers >> > >> > >> >> >> >> -- >> Ted Pedersen >> http://www.d.umn.edu/~tpederse >> >> > ------------------------------------------------------------------------ > ------ >> Download Intel® Parallel Studio Eval >> Try the new software tools for yourself. Speed compiling, find bugs >> proactively, and fine-tune applications for parallel performance. >> See why Intel Parallel Studio got high marks during beta. >> http://p.sf.net/sfu/intel-sw-dev >> _______________________________________________ >> senserelate-developers mailing list >> sen...@li... >> https://lists.sourceforge.net/lists/listinfo/senserelate-developers > > > ------------------------------------------------------------------------ > ------ > Download Intel® Parallel Studio Eval > Try the new software tools for yourself. Speed compiling, find bugs > proactively, and fine-tune applications for parallel performance. > See why Intel Parallel Studio got high marks during beta. > http://p.sf.net/sfu/intel-sw-dev > _______________________________________________ > senserelate-users mailing list > sen...@li... 
> https://lists.sourceforge.net/lists/listinfo/senserelate-users > > -- Ted Pedersen http://www.d.umn.edu/~tpederse |
From: Kamal, J. <JKamal@ETS.ORG> - 2010-04-06 14:59:34
|
Hello Everyone:

I just joined the senserelate mailing group. I work for a project at ETS & we are doing some experiments with various tools out there for semantic similarity. I was able to get WordNet::Similarity to work, but it seems that without senserelate I cannot do much, & the combination of both is what I am looking for. I am having a really hard time getting senserelate to install on my Linux box. Can someone please help?

Here is some description of my error. Please let me know if you need more info.
------------------------------------------------------------------------------------------------------

Something doesn't seem to be right in extracting the WordNet tags. It seems that before we get the sense number for each word, we try to tag each word with WordNet tags & something is missing here.

One thing I wanted to bring to attention is that one of the modules needed for SenseRelate is WordNet::Tools, but I could not find anywhere how to install this module. After doing all my study, I came to the conclusion that it comes as a part of WordNet::Similarity, & since I already have that up & running, I have WordNet::Tools working as well. I may be wrong here.

After doing the make, when I run "make test", I get various errors like:

[jkamal@etsis134 WordNet-SenseRelate-AllWords-0.19]$ make test
PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/Error-Suffixes-tagged-wntagged....ok 2/41# Failed test (t/Error-Suffixes-tagged-wntagged.t at line 71)
t/Error-Suffixes-tagged-wntagged....NOK 3# got: 'The#NT'
# expected: 'The#CL'
# Failed test (t/Error-Suffixes-tagged-wntagged.t at line 71)
.
.
.
Loading WordNet... done.
Use of uninitialized value in pattern match (m//) at utils/wsd.pl line 228.
Use of uninitialized value in concatenation (.) or string at utils/wsd.pl line 229.
Use of uninitialized value in pattern match (m//) at utils/wsd.pl line 228.
Use of uninitialized value in concatenation (.)
or string at utils/wsd.pl line 229. Use of uninitialized value in pattern match (m//) at utils/wsd.pl line 228. Use of uninitialized value in concatenation (.) or string at utils/wsd.pl line 229. # Failed test (t/wsd.t at line 59) # got: 'parking_tickets#NT are#NT expensive#NT #NT #NT #NT ' # expected: 'parking_tickets#n#1 are#v#1 expensive#a#1 ' ------------------------------------------------------------------------ --------------------------------- Please let me know if you have any question. Thanks! Jyoti -----Original Message----- From: Siddharth Patwardhan [mailto:si...@cs...] Sent: Monday, April 05, 2010 7:15 PM To: Ted Pedersen Cc: Ambikesh jayal; sen...@li...; senserelate-users; sat...@gm... Subject: Re: [Senserelate-users] [Senserelate-developers]WordNet::SenseRelate I remember a little while back Linas Vepsats wrote something that could deal with sensekeys. He's released it on CPAN: http://search.cpan.org/dist/WordNet-SenseKey/ -- Sid. On Mon, 2010-04-05 at 17:34 -0500, Ted Pedersen wrote: > Hi Bano, > > Very impressive memory. :) That had totally slipped my mind, but > indeed it is here (on my very own web page :) > > http://www.d.umn.edu/~tpederse/wordnet.html > > Here's the short description from that page... > > Map from QueryData to WordNet sense-keys > > QueryData identifies WordNet senses using a word#pos#sense format. > WordNet identifies senses using sense-keys (aka mnemonics). This > program creates a mapping between the QueryData format and the WordNet > sense-key format. (This tool is not specific to Senseval-2 data - it > is generally useful if are using QueryData to access WordNet.) > > So, this sounds very much like what Ambikesh may want to use. Thanks > for pointing this out, I absolutely missed this! > > Thanks! > Ted > > On Mon, Apr 5, 2010 at 5:26 PM, Satanjeev Banerjee <sat...@gm...> wrote: > > Hi Ted, > > > > I'm pretty rusty with Senserelate, but I vaguely recall having written a > > program (way back when!) 
that at least created a map between the sensekeys > > and the word#pos#sense format (but maybe we are talking of something else > > here?) I googled around for it, and found this link: > > http://www.d.umn.edu/~tpederse/Code/Readme-qd2wn.txt. Does this program > > still exist? As far as I remember, it depended on the minutiae of the > > various file formats in WordNet, so I wouldn't be surprised if those formats > > have changed now rendering the program useless :-). > > > > -Bano > > > > On Mon, Apr 5, 2010 at 6:11 PM, Ted Pedersen <dul...@gm...> wrote: > >> > >> Hi Ambikesh, > >> > >> See my comments inline... > >> > >> On Mon, Apr 5, 2010 at 4:43 PM, Ambikesh jayal <jay...@ya...> > >> wrote: > >> > > >> > Hi, > >> > The WordNet::SenseRelate returns the value in the format "infer#v#5". To > >> > run my experiments I need to compare it with a value in the format > >> > "infer%2:31:01::". > >> > 1. Is there a function that takes sense key as input and returns the > >> > corresponding sense number? For example inputting "infer%2:31:01" should > >> > return "infer#v#5". > >> > >> I am not sure, but if there is it would be in WordNet::QueryData. > >> > >> http://search.cpan.org/dist/WordNet-QueryData/ > >> > >> While we use WordNet::QueryData, we don't include all of its > >> functionality, so this might be something that they provide but we > >> don't use. There is mailing list devoted to QueryData that might be > >> the best place to ask this - it's a google group named wn-perl > >> (details can be found at the site above). > >> > >> > > >> > 2. Can WordNet::SenseRelate be configured to return the results in the > >> > format "infer%2:31:01::" ? > >> > >> No, we only support the wps format (word#part-of-speech#sense, as in > >> dog#n#2). > >> > >> > Also can WordNet::SenseRelate be configured for list of stopwords, > >> > special characters? > >> > >> Yes. 
See the stoplist option described here > >> > >> > >> http://search.cpan.org/dist/WordNet-SenseRelate-AllWords/lib/WordNet/Sen seRelate/AllWords.pm > >> > >> and here > >> > >> http://search.cpan.org/dist/WordNet-SenseRelate-AllWords/utils/wsd.pl > >> > >> and find a sample stoplist here : > >> > >> > >> http://cpansearch.perl.org/src/TPEDERSE/WordNet-SenseRelate-AllWords-0.1 9/samples/default-stoplist-raw.txt > >> > >> ;) > >> > >> Good luck, > >> Ted > >> > >> > Thanks, > >> > Regards, > >> > Ambikesh Jayal. > >> > School of IS, Computing & Maths, > >> > Brunel University, > >> > Uxbridge, UB8 3PH, > >> > United Kingdom. > >> > Email: amb...@br... > >> > Webpage: http://people.brunel.ac.uk/~cspgaaj > >> > > >> > >> > >> -- > >> Ted Pedersen > >> http://www.d.umn.edu/~tpederse > >> > >> > >> ------------------------------------------------------------------------ ------ > >> Download Intel® Parallel Studio Eval > >> Try the new software tools for yourself. Speed compiling, find bugs > >> proactively, and fine-tune applications for parallel performance. > >> See why Intel Parallel Studio got high marks during beta. > >> http://p.sf.net/sfu/intel-sw-dev > >> _______________________________________________ > >> senserelate-developers mailing list > >> sen...@li... > >> https://lists.sourceforge.net/lists/listinfo/senserelate-developers > > > > > > > > -- > Ted Pedersen > http://www.d.umn.edu/~tpederse > > ------------------------------------------------------------------------ ------ > Download Intel® Parallel Studio Eval > Try the new software tools for yourself. Speed compiling, find bugs > proactively, and fine-tune applications for parallel performance. > See why Intel Parallel Studio got high marks during beta. > http://p.sf.net/sfu/intel-sw-dev > _______________________________________________ > senserelate-developers mailing list > sen...@li... 
> https://lists.sourceforge.net/lists/listinfo/senserelate-developers |
From: Ted P. <dul...@gm...> - 2010-04-06 01:41:34
|
---------- Forwarded message ----------
From: Benjamin R. Haskell <wo...@be...>
Date: Mon, Apr 5, 2010 at 8:38 PM
Subject: Re: [wn-perl] Re: [Senserelate-developers] WordNet::SenseRelate
To: wn...@go...
Cc: lin...@gm..., Ted Pedersen <dul...@gm...>, si...@cs..., sat...@gm..., sen...@li..., senserelate-users <sen...@li...>

(My cross-posting will fail, since I don't think I'm subscribed to some of the lists -- feel free to forward as desired.)

The web interface is incorrect in the cases you list. Internally, the Perl library we used at the lab for all of our projects was somewhat stupid about the 'adjective'/'satellite' distinction. ('Stupid' in the sense that it conflated 3 and 5 -- I gave it an option called conflate35, which defaulted to true.)

The good news is that the 3/5 distinction is redundant with the :'head-word':'head-sense' trailing portion of a sense key. You can just do:

sub recover_sense35 {
    local $_ = shift;
    /::$/ and s/%3/%5/;
    $_
}

to get the correct senses.

I'm kind of surprised that the web interface is wrong -- I thought I'd corrected that at some point -- but maybe it got reverted when the site moved. (And I'm no longer working at WordNet anyway.)

Best,
Ben

On Tue, 6 Apr 2010, Ambikesh jayal wrote: > Dear Linas, > > Thanks for developing WordNet::SenseKey. It is exactly the program I was looking for. Thanks Siddharth and Ted and for > pointing to SenseKey. > > However the output from WordNet::SenseKey seems to be a bit different from the corresponding value shown by the WordNet web > interface. For example, for the sense number "distinct#a#1", WordNet::SenseKey shows the sense key as > "distinct%5:00:00:different:00" whereas the WordNet web interface shows the sense key as "distinct%3:00:00:different:00". I > apologise if I am missing something here. > > > Following are more examples. 
> > > Output from WordNet::SenseKey.pm > ****distinct#a#1 [distinct%5:00:00:different:00] > ****distinct#a#2 [distinct%3:00:00::] > ****distinct#a#3 [distinct%5:00:00:separate:00] > ****distinct#a#4 [distinct%5:00:00:definite:00] > ****distinct#a#5 [distinct%5:00:00:clear:00] > > Output from WordNet web interface http://wordnetweb.princeton.edu/perl/webwn > distinct#1 (distinct%3:00:00:different:00), > distinct#2 (distinct%3:00:00::) > distinct#3 (distinct%3:00:00:separate:00) > distinct#4 (distinct%3:00:00:definite:00) > distinct#5 (distinct%3:00:00:clear:00), > > Regards, > Ambikesh Jayal. > School of IS, Computing & Maths, > Brunel University, > Uxbridge, UB8 3PH, > United Kingdom. > Email: amb...@br... > > --- On Mon, 5/4/10, Siddharth Patwardhan <si...@cs...> wrote: > > From: Siddharth Patwardhan <si...@cs...> > Subject: Re: [Senserelate-developers] WordNet::SenseRelate > To: "Ted Pedersen" <dul...@gm...> > Cc: sat...@gm..., "Ambikesh jayal" <jay...@ya...>, > sen...@li..., "senserelate-users" <sen...@li...> > Date: Monday, 5 April, 2010, 19:14 > > I remember a little while back Linas Vepsats wrote something that could > deal with sensekeys. He's released it on CPAN: > > http://search.cpan.org/dist/WordNet-SenseKey/ > > -- Sid. > > On Mon, 2010-04-05 at 17:34 -0500, Ted Pedersen wrote: > > Hi Bano, > > > > Very impressive memory. :) That had totally slipped my mind, but > > indeed it is here (on my very own web page :) > > > > http://www.d.umn.edu/~tpederse/wordnet.html > > > > Here's the short description from that page... > > > > Map from QueryData to WordNet sense-keys > > > > QueryData identifies WordNet senses using a word#pos#sense format. > > WordNet identifies senses using sense-keys (aka mnemonics). This > > program creates a mapping between the QueryData format and the WordNet > > sense-key format. (This tool is not specific to Senseval-2 data - it > > is generally useful if are using QueryData to access WordNet.) 
> > > > So, this sounds very much like what Ambikesh may want to use. Thanks > > for pointing this out, I absolutely missed this! > > > > Thanks! > > Ted > > > > On Mon, Apr 5, 2010 at 5:26 PM, Satanjeev Banerjee <sat...@gm...> wrote: > > > Hi Ted, > > > > > > I'm pretty rusty with Senserelate, but I vaguely recall having written a > > > program (way back when!) that at least created a map between the sensekeys > > > and the word#pos#sense format (but maybe we are talking of something else > > > here?) I googled around for it, and found this link: > > > http://www.d.umn.edu/~tpederse/Code/Readme-qd2wn.txt. Does this program > > > still exist? As far as I remember, it depended on the minutiae of the > > > various file formats in WordNet, so I wouldn't be surprised if those formats > > > have changed now rendering the program useless :-). > > > > > > -Bano > > > > > > On Mon, Apr 5, 2010 at 6:11 PM, Ted Pedersen <dul...@gm...> wrote: > > >> > > >> Hi Ambikesh, > > >> > > >> See my comments inline... > > >> > > >> On Mon, Apr 5, 2010 at 4:43 PM, Ambikesh jayal <jay...@ya...> > > >> wrote: > > >> > > > >> > Hi, > > >> > The WordNet::SenseRelate returns the value in the format "infer#v#5". To > > >> > run my experiments I need to compare it with a value in the format > > >> > "infer%2:31:01::". > > >> > 1. Is there a function that takes sense key as input and returns the > > >> > corresponding sense number? For example inputting "infer%2:31:01" should > > >> > return "infer#v#5". > > >> > > >> I am not sure, but if there is it would be in WordNet::QueryData. > > >> > > >> http://search.cpan.org/dist/WordNet-QueryData/ > > >> > > >> While we use WordNet::QueryData, we don't include all of its > > >> functionality, so this might be something that they provide but we > > >> don't use. There is mailing list devoted to QueryData that might be > > >> the best place to ask this - it's a google group named wn-perl > > >> (details can be found at the site above). 
> > >> > > >> > > > >> > 2. Can WordNet::SenseRelate be configured to return the results in the > > >> > format "infer%2:31:01::" ? > > >> > > >> No, we only support the wps format (word#part-of-speech#sense, as in > > >> dog#n#2). > > >> > > >> > Also can WordNet::SenseRelate be configured for list of stopwords, > > >> > special characters? > > >> > > >> Yes. See the stoplist option described here > > >> > > >> > > >> http://search.cpan.org/dist/WordNet-SenseRelate-AllWords/lib/WordNet/SenseRelate/AllWords.pm > > >> > > >> and here > > >> > > >> http://search.cpan.org/dist/WordNet-SenseRelate-AllWords/utils/wsd.pl > > >> > > >> and find a sample stoplist here : > > >> > > >> > > >> > http://cpansearch.perl.org/src/TPEDERSE/WordNet-SenseRelate-AllWords-0.19/samples/default-stoplist-raw.txt > > >> > > >> ;) > > >> > > >> Good luck, > > >> Ted > > >> > > >> > Thanks, > > >> > Regards, > > >> > Ambikesh Jayal. > > >> > School of IS, Computing & Maths, > > >> > Brunel University, > > >> > Uxbridge, UB8 3PH, > > >> > United Kingdom. > > >> > Email: amb...@br... > > >> > Webpage: http://people.brunel.ac.uk/~cspgaaj > > >> > > > >> > > >> > > >> -- > > >> Ted Pedersen > > >> http://www.d.umn.edu/~tpederse > > >> > > >> > > >> ------------------------------------------------------------------------------ > > >> Download Intel® Parallel Studio Eval > > >> Try the new software tools for yourself. Speed compiling, find bugs > > >> proactively, and fine-tune applications for parallel performance. > > >> See why Intel Parallel Studio got high marks during beta. > > >> http://p.sf.net/sfu/intel-sw-dev > > >> _______________________________________________ > > >> senserelate-developers mailing list > > >> sen...@li... 
> > >> https://lists.sourceforge.net/lists/listinfo/senserelate-developers > > > > > > > > > > > > > > -- > > Ted Pedersen > > http://www.d.umn.edu/~tpederse > > > > ------------------------------------------------------------------------------ > > Download Intel® Parallel Studio Eval > > Try the new software tools for yourself. Speed compiling, find bugs > > proactively, and fine-tune applications for parallel performance. > > See why Intel Parallel Studio got high marks during beta. > > http://p.sf.net/sfu/intel-sw-dev > > _______________________________________________ > > senserelate-developers mailing list > > sen...@li... > > https://lists.sourceforge.net/lists/listinfo/senserelate-developers > > > -- > You received this message because you are subscribed to the Google Groups "wn-perl" group. > To post to this group, send email to wn...@go.... > To unsubscribe from this group, send email to wn-...@go.... > For more options, visit this group at http://groups.google.com/group/wn-perl?hl=en. > > -- Ted Pedersen http://www.d.umn.edu/~tpederse |
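[Editorial note: the 3/5 recovery Ben describes in the forwarded message above can be sketched in Python as well. This is a hypothetical translation, not code from the thread. It assumes, per the redundancy Ben points out, that head adjectives leave the head-word/head-sense fields empty (so their keys end in "::") while satellites fill them in; under that assumption, keys with a non-empty head portion are the ones that should carry %5, which matches the "distinct" examples Ambikesh quotes.]

```python
def recover_sense35(key: str) -> str:
    """Restore the adjective (%3) vs. adjective-satellite (%5) ss_type.

    The 3/5 distinction is redundant with the trailing head_word:head_id
    portion of a sense key: head adjectives have empty head fields (the
    key ends in '::'), while satellites name their head adjective there.
    """
    if key.endswith("::"):
        # Empty head fields: this is a head adjective, ss_type 3.
        return key.replace("%5", "%3", 1)
    # Non-empty head fields: this is a satellite, ss_type 5.
    return key.replace("%3", "%5", 1)

recover_sense35("distinct%3:00:00:different:00")  # 'distinct%5:00:00:different:00'
recover_sense35("distinct%3:00:00::")             # unchanged: 'distinct%3:00:00::'
```

Keys for other parts of speech (%1, %2, %4) pass through untouched, since neither replacement matches.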
From: Siddharth P. <si...@cs...> - 2010-04-05 23:14:49
|
I remember a little while back Linas Vepsats wrote something that could deal with sensekeys. He's released it on CPAN: http://search.cpan.org/dist/WordNet-SenseKey/ -- Sid. On Mon, 2010-04-05 at 17:34 -0500, Ted Pedersen wrote: > Hi Bano, > > Very impressive memory. :) That had totally slipped my mind, but > indeed it is here (on my very own web page :) > > http://www.d.umn.edu/~tpederse/wordnet.html > > Here's the short description from that page... > > Map from QueryData to WordNet sense-keys > > QueryData identifies WordNet senses using a word#pos#sense format. > WordNet identifies senses using sense-keys (aka mnemonics). This > program creates a mapping between the QueryData format and the WordNet > sense-key format. (This tool is not specific to Senseval-2 data - it > is generally useful if are using QueryData to access WordNet.) > > So, this sounds very much like what Ambikesh may want to use. Thanks > for pointing this out, I absolutely missed this! > > Thanks! > Ted > > On Mon, Apr 5, 2010 at 5:26 PM, Satanjeev Banerjee <sat...@gm...> wrote: > > Hi Ted, > > > > I'm pretty rusty with Senserelate, but I vaguely recall having written a > > program (way back when!) that at least created a map between the sensekeys > > and the word#pos#sense format (but maybe we are talking of something else > > here?) I googled around for it, and found this link: > > http://www.d.umn.edu/~tpederse/Code/Readme-qd2wn.txt. Does this program > > still exist? As far as I remember, it depended on the minutiae of the > > various file formats in WordNet, so I wouldn't be surprised if those formats > > have changed now rendering the program useless :-). > > > > -Bano > > > > On Mon, Apr 5, 2010 at 6:11 PM, Ted Pedersen <dul...@gm...> wrote: > >> > >> Hi Ambikesh, > >> > >> See my comments inline... > >> > >> On Mon, Apr 5, 2010 at 4:43 PM, Ambikesh jayal <jay...@ya...> > >> wrote: > >> > > >> > Hi, > >> > The WordNet::SenseRelate returns the value in the format "infer#v#5". 
To > >> > run my experiments I need to compare it with a value in the format > >> > "infer%2:31:01::". > >> > 1. Is there a function that takes sense key as input and returns the > >> > corresponding sense number? For example inputting "infer%2:31:01" should > >> > return "infer#v#5". > >> > >> I am not sure, but if there is it would be in WordNet::QueryData. > >> > >> http://search.cpan.org/dist/WordNet-QueryData/ > >> > >> While we use WordNet::QueryData, we don't include all of its > >> functionality, so this might be something that they provide but we > >> don't use. There is mailing list devoted to QueryData that might be > >> the best place to ask this - it's a google group named wn-perl > >> (details can be found at the site above). > >> > >> > > >> > 2. Can WordNet::SenseRelate be configured to return the results in the > >> > format "infer%2:31:01::" ? > >> > >> No, we only support the wps format (word#part-of-speech#sense, as in > >> dog#n#2). > >> > >> > Also can WordNet::SenseRelate be configured for list of stopwords, > >> > special characters? > >> > >> Yes. See the stoplist option described here > >> > >> > >> http://search.cpan.org/dist/WordNet-SenseRelate-AllWords/lib/WordNet/SenseRelate/AllWords.pm > >> > >> and here > >> > >> http://search.cpan.org/dist/WordNet-SenseRelate-AllWords/utils/wsd.pl > >> > >> and find a sample stoplist here : > >> > >> > >> http://cpansearch.perl.org/src/TPEDERSE/WordNet-SenseRelate-AllWords-0.19/samples/default-stoplist-raw.txt > >> > >> ;) > >> > >> Good luck, > >> Ted > >> > >> > Thanks, > >> > Regards, > >> > Ambikesh Jayal. > >> > School of IS, Computing & Maths, > >> > Brunel University, > >> > Uxbridge, UB8 3PH, > >> > United Kingdom. > >> > Email: amb...@br... 
> >> > Webpage: http://people.brunel.ac.uk/~cspgaaj > >> > > >> > >> > >> -- > >> Ted Pedersen > >> http://www.d.umn.edu/~tpederse > >> > >> > >> ------------------------------------------------------------------------------ > >> Download Intel® Parallel Studio Eval > >> Try the new software tools for yourself. Speed compiling, find bugs > >> proactively, and fine-tune applications for parallel performance. > >> See why Intel Parallel Studio got high marks during beta. > >> http://p.sf.net/sfu/intel-sw-dev > >> _______________________________________________ > >> senserelate-developers mailing list > >> sen...@li... > >> https://lists.sourceforge.net/lists/listinfo/senserelate-developers > > > > > > > > -- > Ted Pedersen > http://www.d.umn.edu/~tpederse > > ------------------------------------------------------------------------------ > Download Intel® Parallel Studio Eval > Try the new software tools for yourself. Speed compiling, find bugs > proactively, and fine-tune applications for parallel performance. > See why Intel Parallel Studio got high marks during beta. > http://p.sf.net/sfu/intel-sw-dev > _______________________________________________ > senserelate-developers mailing list > sen...@li... > https://lists.sourceforge.net/lists/listinfo/senserelate-developers |
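[Editorial note: for readers comparing the two identifier schemes in this thread, a sense key such as "infer%2:31:01::" decomposes as lemma%ss_type:lex_filenum:lex_id:head_word:head_id, where ss_type 1-5 stands for noun, verb, adjective, adverb, and adjective satellite. A small illustrative parser, not part of any package discussed here, assuming well-formed keys:]

```python
# ss_type codes used in WordNet sense keys:
# 1 = noun, 2 = verb, 3 = adjective, 4 = adverb, 5 = adjective satellite
SS_TYPE_TO_POS = {"1": "n", "2": "v", "3": "a", "4": "r", "5": "a"}

def parse_sense_key(key: str) -> dict:
    """Split lemma%ss_type:lex_filenum:lex_id:head_word:head_id."""
    lemma, rest = key.split("%", 1)
    ss_type, lex_filenum, lex_id, head_word, head_id = rest.split(":")
    return {
        "lemma": lemma,
        "pos": SS_TYPE_TO_POS[ss_type],
        "ss_type": int(ss_type),
        "lex_filenum": lex_filenum,
        "lex_id": lex_id,
        "head_word": head_word,  # empty for everything but satellites
        "head_id": head_id,
    }

parse_sense_key("infer%2:31:01::")["pos"]  # 'v'
```

Note that the "pos" field here is the QueryData-style letter, so this gives two of the three pieces needed to build a word#pos#sense string; the sense number itself still requires a lookup against WordNet's sense index.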
From: Ted P. <dul...@gm...> - 2010-04-05 22:34:27
|
Hi Bano, Very impressive memory. :) That had totally slipped my mind, but indeed it is here (on my very own web page :) http://www.d.umn.edu/~tpederse/wordnet.html Here's the short description from that page... Map from QueryData to WordNet sense-keys QueryData identifies WordNet senses using a word#pos#sense format. WordNet identifies senses using sense-keys (aka mnemonics). This program creates a mapping between the QueryData format and the WordNet sense-key format. (This tool is not specific to Senseval-2 data - it is generally useful if are using QueryData to access WordNet.) So, this sounds very much like what Ambikesh may want to use. Thanks for pointing this out, I absolutely missed this! Thanks! Ted On Mon, Apr 5, 2010 at 5:26 PM, Satanjeev Banerjee <sat...@gm...> wrote: > Hi Ted, > > I'm pretty rusty with Senserelate, but I vaguely recall having written a > program (way back when!) that at least created a map between the sensekeys > and the word#pos#sense format (but maybe we are talking of something else > here?) I googled around for it, and found this link: > http://www.d.umn.edu/~tpederse/Code/Readme-qd2wn.txt. Does this program > still exist? As far as I remember, it depended on the minutiae of the > various file formats in WordNet, so I wouldn't be surprised if those formats > have changed now rendering the program useless :-). > > -Bano > > On Mon, Apr 5, 2010 at 6:11 PM, Ted Pedersen <dul...@gm...> wrote: >> >> Hi Ambikesh, >> >> See my comments inline... >> >> On Mon, Apr 5, 2010 at 4:43 PM, Ambikesh jayal <jay...@ya...> >> wrote: >> > >> > Hi, >> > The WordNet::SenseRelate returns the value in the format "infer#v#5". To >> > run my experiments I need to compare it with a value in the format >> > "infer%2:31:01::". >> > 1. Is there a function that takes sense key as input and returns the >> > corresponding sense number? For example inputting "infer%2:31:01" should >> > return "infer#v#5". 
>>
>> I am not sure, but if there is, it would be in WordNet::QueryData.
>>
>> http://search.cpan.org/dist/WordNet-QueryData/
>>
>> While we use WordNet::QueryData, we don't include all of its
>> functionality, so this might be something that they provide but we
>> don't use. There is a mailing list devoted to QueryData that might be
>> the best place to ask this - it's a Google group named wn-perl
>> (details can be found at the site above).
>>
>> >
>> > 2. Can WordNet::SenseRelate be configured to return the results in the
>> > format "infer%2:31:01::" ?
>>
>> No, we only support the wps format (word#part-of-speech#sense, as in
>> dog#n#2).
>>
>> > Also can WordNet::SenseRelate be configured for a list of stopwords and
>> > special characters?
>>
>> Yes. See the stoplist option described here
>>
>> http://search.cpan.org/dist/WordNet-SenseRelate-AllWords/lib/WordNet/SenseRelate/AllWords.pm
>>
>> and here
>>
>> http://search.cpan.org/dist/WordNet-SenseRelate-AllWords/utils/wsd.pl
>>
>> and find a sample stoplist here:
>>
>> http://cpansearch.perl.org/src/TPEDERSE/WordNet-SenseRelate-AllWords-0.19/samples/default-stoplist-raw.txt
>>
>> ;)
>>
>> Good luck,
>> Ted
>>
>> > Thanks,
>> > Regards,
>> > Ambikesh Jayal.
>> > School of IS, Computing & Maths,
>> > Brunel University,
>> > Uxbridge, UB8 3PH,
>> > United Kingdom.
>> > Email: amb...@br...
>> > Webpage: http://people.brunel.ac.uk/~cspgaaj
>> >
>>
>> --
>> Ted Pedersen
>> http://www.d.umn.edu/~tpederse
>
>

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
|
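[The thread above contrasts WordNet sense keys such as "infer%2:31:01::" with QueryData's word#pos#sense (wps) format such as "infer#v#5". The structure of a sense key can be sketched with a few lines of code (Python here for brevity; the function name is illustrative and not part of SenseRelate or QueryData):]

```python
# A WordNet sense key has the shape
#   lemma%ss_type:lex_filenum:lex_id:head_word:head_id
# where ss_type identifies the part of speech:
# 1 = noun, 2 = verb, 3 = adjective, 4 = adverb, 5 = adjective satellite.

SS_TYPE_TO_POS = {"1": "n", "2": "v", "3": "a", "4": "r", "5": "s"}

def parse_sense_key(key):
    """Split a WordNet sense key into (lemma, pos)."""
    lemma, rest = key.split("%", 1)
    ss_type = rest.split(":", 1)[0]
    return lemma, SS_TYPE_TO_POS[ss_type]

lemma, pos = parse_sense_key("infer%2:31:01::")  # -> ("infer", "v")
# Note: the sense *number* (the 5 in infer#v#5) is not encoded in the
# key itself; it has to come from WordNet's index.sense file or a
# mapping tool such as the qd2wn program mentioned above.
```

So the key alone gets you the lemma and part of speech, but completing the map to wps format requires a lookup against WordNet's own sense index.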
From: Ted P. <dul...@gm...> - 2010-04-05 22:11:45
|
Hi Ambikesh,

See my comments inline...

On Mon, Apr 5, 2010 at 4:43 PM, Ambikesh jayal <jay...@ya...> wrote:
>
> Hi,
> The WordNet::SenseRelate returns the value in the format "infer#v#5". To
> run my experiments I need to compare it with a value in the format
> "infer%2:31:01::".
> 1. Is there a function that takes a sense key as input and returns the
> corresponding sense number? For example, inputting "infer%2:31:01::"
> should return "infer#v#5".

I am not sure, but if there is, it would be in WordNet::QueryData.

http://search.cpan.org/dist/WordNet-QueryData/

While we use WordNet::QueryData, we don't include all of its
functionality, so this might be something that they provide but we
don't use. There is a mailing list devoted to QueryData that might be
the best place to ask this - it's a Google group named wn-perl
(details can be found at the site above).

>
> 2. Can WordNet::SenseRelate be configured to return the results in the
> format "infer%2:31:01::" ?

No, we only support the wps format (word#part-of-speech#sense, as in
dog#n#2).

> Also can WordNet::SenseRelate be configured for a list of stopwords and
> special characters?

Yes. See the stoplist option described here

http://search.cpan.org/dist/WordNet-SenseRelate-AllWords/lib/WordNet/SenseRelate/AllWords.pm

and here

http://search.cpan.org/dist/WordNet-SenseRelate-AllWords/utils/wsd.pl

and find a sample stoplist here:

http://cpansearch.perl.org/src/TPEDERSE/WordNet-SenseRelate-AllWords-0.19/samples/default-stoplist-raw.txt

;)

Good luck,
Ted

> Thanks,
> Regards,
> Ambikesh Jayal.
> School of IS, Computing & Maths,
> Brunel University,
> Uxbridge, UB8 3PH,
> United Kingdom.
> Email: amb...@br...
> Webpage: http://people.brunel.ac.uk/~cspgaaj
>

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
|
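[The sense-key-to-wps conversion Ambikesh asks about is what the qd2wn mapping discussed earlier in this thread provides. A minimal sketch of that idea, assuming WordNet's index.sense file format ("sense_key synset_offset sense_number tag_cnt", one sense per line); the sample line below is illustrative, with a made-up synset offset, and the function name is not part of any of these tools:]

```python
# Sketch of a qd2wn-style lookup: index.sense lists every sense as
# "sense_key synset_offset sense_number tag_cnt", which is enough to
# map a sense key to QueryData's word#pos#sense (wps) form.

SS_TYPE_TO_POS = {"1": "n", "2": "v", "3": "a", "4": "r", "5": "s"}

def sense_key_to_wps(index_sense_lines, wanted_key):
    """Return word#pos#sense for wanted_key, or None if it is absent."""
    for line in index_sense_lines:
        key, _offset, sense_num, _tag_cnt = line.split()
        if key == wanted_key:
            lemma, rest = key.split("%", 1)
            pos = SS_TYPE_TO_POS[rest.split(":", 1)[0]]
            return "%s#%s#%s" % (lemma, pos, sense_num)
    return None

sample = ["infer%2:31:01:: 00632236 5 0"]  # hypothetical index.sense entry
print(sense_key_to_wps(sample, "infer%2:31:01::"))  # infer#v#5
```

In practice the lines would come from reading index.sense in the WordNet dict directory, and one would build a dictionary for all keys up front rather than scanning per lookup; this is the part that depends on WordNet's file-format minutiae, as noted earlier in the thread.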