
#51 getResourcesByTerms not dealing with free-text tags correctly

open
9
2012-01-24
2011-10-27
No

Dear all,

I have just found a new bug in the platform.

We submitted a new free-text annotation (in this case "multimedia") using the following invocation:
http://insemtives.science.unitn.it/platform-rest/annotation-service/storeTagAnnotation?json=%7B%22name%22%3A%22multimedia%22%2C%22resource%22%3A%7B%22uri%22%3A%22http%3A%2F%2Fcatai.net%2Fblog%2F2011%2F10%2Fholodesk-holografia-controlada-mediante-kinect%2F%22%7D%2C%22owner%22%3A%7B%22name%22%3A%22gtorodelvalle%22%2C%22uri%22%3A%22http%3A%2F%2Ftwitter.com%2Fgtorodelvalle%22%7D%2C%22id%22%3A0%2C%22synset%22%3Anull%2C%22creationDate%22%3A1319739240904%2C%22accuracy%22%3A1%7D

We then searched for content annotated with that tag using the following invocation:
http://insemtives.science.unitn.it/platform-rest/knowledge-service/getResourcesByTerms?json=%7B%22includeTargets%22%3Atrue%2C%22maxSpecificityDistance%22%3A2%2C%22maxGeneralityDistance%22%3A0%2C%22operator%22%3A%22or%22%2C%22senses%22%3A%5B%7B%22term%22%3A%22multimedia%22%2C%22_specializationType%22%3A%22org.insemtives.platform.unitn.commons.model.QueryTerm%22%7D%5D%7D

We get [] as the result instead of the annotated content.

After trying other free-text annotations, we think the problem may be that "multimedia" has a sense in the WordNet ontology: if we use a free-text tag with no sense in the ontology, everything seems to work. To see this, replace "multimedia" in the previous invocations with the text "123".
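
To make it easier to repeat the search with different tags, the query URL can be generated rather than encoded by hand. The following is a minimal sketch, not part of the platform: the helper name `build_query_url` is hypothetical, and the payload fields are copied from the search invocation above.

```python
import json
from urllib.parse import quote

# Endpoint taken from the search invocation in this ticket.
BASE = (
    "http://insemtives.science.unitn.it/platform-rest/"
    "knowledge-service/getResourcesByTerms"
)


def build_query_url(term: str) -> str:
    """Hypothetical helper: build the getResourcesByTerms URL for one tag."""
    payload = {
        "includeTargets": True,
        "maxSpecificityDistance": 2,
        "maxGeneralityDistance": 0,
        "operator": "or",
        "senses": [
            {
                "term": term,
                "_specializationType": (
                    "org.insemtives.platform.unitn.commons.model.QueryTerm"
                ),
            }
        ],
    }
    # Compact separators keep the JSON free of spaces, as in the ticket.
    compact = json.dumps(payload, separators=(",", ":"))
    # safe="" percent-encodes every reserved character, including ":" and ",".
    return BASE + "?json=" + quote(compact, safe="")


if __name__ == "__main__":
    # "multimedia" has a WordNet sense; "123" presumably does not.
    for tag in ("multimedia", "123"):
        print(build_query_url(tag))
```

Opening the two printed URLs side by side should show the difference described above, assuming both tags have been stored as free-text annotations beforehand.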

Thank you very much!

Discussion

  • Juan Pane

    Juan Pane - 2011-10-28
    • assigned_to: juanpane --> pravdin
     
  • Juan Pane

    Juan Pane - 2011-10-28

    Assigning the ticket to Viktor

     
  • Germán Toro del Valle

    Increasing priority to highest value...

     
  • Germán Toro del Valle

    • priority: 5 --> 9
     
  • Viktor Pravdin

    Viktor Pravdin - 2011-10-31

    Fixed, please check.

     
  • Viktor Pravdin

    Viktor Pravdin - 2011-10-31
    • assigned_to: pravdin --> gtorodelvalle
     
  • Germán Toro del Valle

    • assigned_to: gtorodelvalle --> juanpane
     
  • Juan Pane

    Juan Pane - 2011-11-01

    Assigning the ticket to Viktor

     
  • Juan Pane

    Juan Pane - 2011-11-01
    • assigned_to: juanpane --> pravdin
     
  • Viktor Pravdin

    Viktor Pravdin - 2012-01-11

    This issue was discussed by email, so let me summarize it here. The core of the problem is defining how we should treat the case where the synset URI is absent from the request. In general, there are three cases:
    1) The synset URI is present and has some value. In this case the service will return only the terms with the given sense.
    2) The synset URI is present and its value is null. In this case the service will return only the free-text terms.
    3) The synset URI is absent. At the moment the service returns all terms, and with the default scoring parameters the terms with senses get the higher score. The scoring can be tweaked (e.g., by setting conceptTermWeight and termWeight to the same value) so that free-text terms and terms with a sense have the same rank.
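
The three cases above can be sketched as follows. This is only an illustration of the proposed semantics, not the knowledge service's code: the functions `matches` and `score`, the annotation layout (`name` and `synset`, as in the stored annotation), and the default weight values are all assumptions, and the sketch ignores the specificity and generality distances.

```python
# Sentinel used to tell "key absent" apart from "key present with null".
_ABSENT = object()


def matches(query_term: dict, annotation: dict) -> bool:
    """Hypothetical matcher for the three proposed cases."""
    if query_term["term"] != annotation["name"]:
        return False
    synset_uri = query_term.get("synsetUri", _ABSENT)
    if synset_uri is _ABSENT:
        # Case 3: synsetUri absent, so every annotation with this term matches.
        return True
    if synset_uri is None:
        # Case 2: synsetUri is null, so only free-text annotations match.
        return annotation["synset"] is None
    # Case 1: synsetUri has a value, so only that sense matches.
    return annotation["synset"] == synset_uri


def score(annotation: dict,
          concept_term_weight: float = 2.0,
          term_weight: float = 1.0) -> float:
    """Hypothetical scoring: the weights stand in for conceptTermWeight and
    termWeight; the default values here are illustrative, not the platform's."""
    if annotation["synset"] is not None:
        return concept_term_weight
    return term_weight
```

With the illustrative defaults, annotations that carry a sense outrank free-text ones; passing the same value for both weights gives them the same rank, which is the tweak described in case 3.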

    Please let us know if this resolves the issue or if you need any additional action to be taken. As far as I know Germán is not available at the moment, so I am assigning it to Daniel.

     
  • Viktor Pravdin

    Viktor Pravdin - 2012-01-11
    • assigned_to: pravdin --> danielfdez
     
  • Daniel Fernández Casado

    • assigned_to: danielfdez --> pravdin
     
  • Daniel Fernández Casado

    Hi Viktor.

    Last week we wanted to focus on the experiment. Now I am trying to reproduce the three cases that you defined two weeks ago.

    For example:

    1) http://insemtives.science.unitn.it/test/platform-rest/knowledge-service/getResourcesByTerms?json={"includeTargets":true,"maxSpecificityDistance":5,"maxGeneralityDistance":5,"operator":"or","senses":[{"term":"multimedia","synsetUri":"http://www.w3.org/2006/03/wn/wn20/instances/synset-multimedia-noun-1","_specializationType":"org.insemtives.platform.unitn.commons.model.QueryTerm"}]}

    2) http://insemtives.science.unitn.it/test/platform-rest/knowledge-service/getResourcesByTerms?json={"includeTargets":true,"maxSpecificityDistance":5,"maxGeneralityDistance":5,"operator":"or","senses":[{"term":"multimedia","synsetUri":null,"_specializationType":"org.insemtives.platform.unitn.commons.model.QueryTerm"}]}

    3) http://insemtives.science.unitn.it/test/platform-rest/knowledge-service/getResourcesByTerms?json={"includeTargets":true,"maxSpecificityDistance":5,"maxGeneralityDistance":5,"operator":"or","senses":[{"term":"multimedia","_specializationType":"org.insemtives.platform.unitn.commons.model.QueryTerm"}]}

    There are no differences in the results between cases 2 and 3. The expected output in case 3 is the union of the outputs of cases 1 and 2. Is this approach correct? If so, the problem is not solved. :( I await your comments.
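
The three requests differ only in the synsetUri field of the query term, so they can be generated from one template. The sketch below is an assumption-laden illustration: `build_payload` and `is_union` are hypothetical helpers, and `is_union` assumes each result list has been reduced to plain resource URIs.

```python
import json

SYNSET = "http://www.w3.org/2006/03/wn/wn20/instances/synset-multimedia-noun-1"
QUERY_TERM_TYPE = "org.insemtives.platform.unitn.commons.model.QueryTerm"


def build_payload(term: str, **sense_fields) -> dict:
    """Hypothetical helper: query payload with optional extra sense fields."""
    sense = {"term": term, **sense_fields, "_specializationType": QUERY_TERM_TYPE}
    return {
        "includeTargets": True,
        "maxSpecificityDistance": 5,
        "maxGeneralityDistance": 5,
        "operator": "or",
        "senses": [sense],
    }


case1 = build_payload("multimedia", synsetUri=SYNSET)  # synsetUri has a value
case2 = build_payload("multimedia", synsetUri=None)    # serialized as null
case3 = build_payload("multimedia")                    # synsetUri omitted


def is_union(result1, result2, result3) -> bool:
    """True if result3 holds exactly the resources of result1 and result2."""
    return set(result3) == set(result1) | set(result2)


if __name__ == "__main__":
    for payload in (case1, case2, case3):
        print(json.dumps(payload, separators=(",", ":")))
```

Python's `None` serializes to JSON `null`, while leaving the keyword out omits the key entirely, which is the distinction between cases 2 and 3.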

     
  • Viktor Pravdin

    Viktor Pravdin - 2012-01-24

    Sorry, I probably didn't explain it correctly. The three-item list in my previous message is the proposal that I gathered from the long list of emails, and it is not implemented yet. That message was meant to ask whether the proposal was correct and whether any points were missing. If it is the desired behavior, please confirm it and we can start implementing it; if it is not, please correct the proposal.
    Right now the knowledge service treats cases 2 and 3 in the same way.

     
  • Viktor Pravdin

    Viktor Pravdin - 2012-01-24
    • assigned_to: pravdin --> danielfdez
     
  • Daniel Fernández Casado

    • assigned_to: danielfdez --> pravdin
     
  • Daniel Fernández Casado

    I think that the proposal looks good because it gives enough flexibility to choose the behavior that works best for each use case. Also, having the scoring customizable and configurable (through the conceptTermWeight and termWeight parameters) is very positive for the project. The proposal is correct.

     
