From: Marc H. <mha...@gm...> - 2015-03-13 16:40:03
Hi All,

I replied to Ted earlier today when I should have replied to the listserv. So here are my responses to Ted's questions:

Hi Ted,

Thank you for getting back to me so quickly! In my message I was striving for brevity, but as a result I was unclear about what I am trying to accomplish.

I am a psychologist working on automating a coding procedure for written content. Specifically, I am trying to categorize sentences according to whether or not they contain achievement motive imagery. In the past, people have tried to accomplish this with a simple word-search function, setting up rules such as "if the sentence contains the word 'best', mark the sentence as containing achievement imagery." Although such a procedure correctly identifies "he is the world's best scientist" as having achievement imagery, it produces false positives when "best" is used in other senses, such as when it means most likely ("it was his best chance") or when it denotes emotional closeness ("they are best friends").

So I am attempting a 2-step procedure. First, disambiguate each sentence into synsets so that I know the most likely sense in which each word is meant. Next, take each synset that I have determined denotes achievement motive imagery, such as achiever#n#1, best#a#1, and greatness#n#1 (let's call them target synsets), and compute the similarity between each target synset and its closest synset in a given sentence. That way, I can categorize a sentence as having achievement imagery if it either contains any of my target synsets or contains any synsets that are very similar to any of them. I will determine what counts as "very similar" through testing with already coded materials.
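
To make the second step concrete, here is a minimal Python sketch of the matching logic only. It assumes the sentences have already been disambiguated into word#pos#sense labels. The function names, the toy score table, the sense labels in the example sentences, and the 0.5 threshold are all hypothetical placeholders; a real run would plug in an actual WordNet similarity measure and a threshold chosen from the already coded materials.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# A similarity measure takes two synset labels and returns a score.
Similarity = Callable[[str, str], float]

# Hypothetical scores standing in for a real WordNet-based measure.
TOY_SCORES = {
    frozenset(("best#a#1", "excellent#a#1")): 0.9,
    frozenset(("achiever#n#1", "scientist#n#1")): 0.3,
}


def toy_similarity(a: str, b: str) -> float:
    """Placeholder measure: 1.0 for identical labels, table lookup otherwise."""
    if a == b:
        return 1.0
    return TOY_SCORES.get(frozenset((a, b)), 0.0)


def closest_synset(
    target: str, sentence_synsets: Iterable[str], similarity: Similarity
) -> Tuple[Optional[str], float]:
    """Return the sentence synset most similar to the target, with its score."""
    best, best_score = None, float("-inf")
    for synset in sentence_synsets:
        score = similarity(target, synset)
        if score > best_score:
            best, best_score = synset, score
    return best, best_score


def has_achievement_imagery(
    sentence_synsets: Iterable[str],
    targets: Iterable[str],
    similarity: Similarity,
    threshold: float,
) -> bool:
    """True if any target's closest sentence synset reaches the threshold."""
    synsets: List[str] = list(sentence_synsets)
    return any(
        closest_synset(t, synsets, similarity)[1] >= threshold for t in targets
    )


# Target synsets named in this message.
TARGETS = ["achiever#n#1", "best#a#1", "greatness#n#1"]

# Hand-written stand-ins for disambiguated sentences (sense numbers are
# illustrative and not checked against WordNet).
best_scientist = ["world#n#1", "best#a#1", "scientist#n#1"]
best_friends = ["best#a#2", "friend#n#1"]
excellent_student = ["excellent#a#1", "student#n#1"]

for sentence in (best_scientist, best_friends, excellent_student):
    print(sentence, has_achievement_imagery(sentence, TARGETS, toy_similarity, 0.5))
```

An exact match to a target synset scores 1.0 under the placeholder measure, so the "contains a target synset" case and the "contains a very similar synset" case both reduce to the same threshold test.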

I would of course be happy to use WordNet::Similarity to compute the similarity function, particularly because it offers the option of adapted Lesk relatedness, whereas Python's nltk does not. But I don't know that this helps me with the larger problem of disambiguating sentences and then calculating their similarities to words not present in those sentences.

Best,
Marc

Marc Halusic
Graduate Student
University of Missouri-Columbia
Social and Personality Psychology

On Fri, Mar 13, 2015 at 7:12 AM, Ted Pedersen <dul...@gm...> wrote:

> If you want to find the similarity between two synsets like cat#n#1 and
> dog#n#1, have you considered the use of WordNet::Similarity? This is what
> SenseRelate uses under the hood, and is really set up to do these kinds of
> similarity measurements.
>
> http://wn-similarity.sourceforge.net for details...below are some
> examples of how it can be used.
>
> ukko(32): similarity.pl -type WordNet::Similarity::path cat#n#1 dog#n#1
> Loading WordNet... done.
> Loading Module... done.
> cat#n#1 dog#n#1 0.2
> ukko(33): similarity.pl -type WordNet::Similarity::path --file test
> Loading WordNet... done.
> Loading Module... done.
> cat#n#1 dog#n#1
> cat#n#1 dog#n#1 0.2
>
> mouse#n#1 hat#n#2
> mouse#n#1 hat#n#2 0.0454545454545455
>
> ukko(34): cat test
> cat#n#1 dog#n#1
> mouse#n#1 hat#n#2
>
> That said, if I've misunderstood what you'd like to do, please let me know
> and I'll try again!
>
> Good luck,
> Ted
>
>
> On Thu, Mar 12, 2015 at 3:45 PM, Marc Halusic <mha...@gm...> wrote:
>
>> Hi All,
>> I am working on a project that requires that I disambiguate a large
>> number of sentences into collections of synsets that could be recognized by
>> WordNet. The reason that I need to do this is that, for each sentence, I
>> need to compute similarity scores of a variety of target words against
>> their most similar equivalents in a sentence.
>> For example, I might want to
>> compare the target word "dog#n#1" against the sentence "the cats ate the
>> fish" and convert the sentence so I can find that "dog#n#1" is most similar
>> to "cat#n#1", and compute how similar those two synsets are (I have a
>> Python script that can do this as long as the sentences have been
>> disambiguated into WordNet synsets). Because the target words are not in
>> the sentences, and are very numerous (around 300), I don't think that using
>> a trace option is quite right for what I am trying to do. Looking at
>> previous posts, I understand that it is either not easy or not possible to
>> convert SenseRelate output to synsets that could be used in such
>> calculations. I am therefore curious whether it is impossible, or just
>> difficult, and how difficult it would be. I am also curious to know if
>> there are better ways that I could perform these calculations with
>> SenseRelate that perhaps I have not thought of yet.
>>
>> Best,
>>
>> Marc
>>
>> Marc Halusic
>> Graduate Student
>> University of Missouri-Columbia
>> Social and Personality Psychology