Dear all,
I have just found a new bug in the platform.
After submitting a new free-text annotation (in this case "multimedia") using the following invocation:
http://insemtives.science.unitn.it/platform-rest/annotation-service/storeTagAnnotation?json=%7B%22name%22%3A%22multimedia%22%2C%22resource%22%3A%7B%22uri%22%3A%22http%3A%2F%2Fcatai.net%2Fblog%2F2011%2F10%2Fholodesk-holografia-controlada-mediante-kinect%2F%22%7D%2C%22owner%22%3A%7B%22name%22%3A%22gtorodelvalle%22%2C%22uri%22%3A%22http%3A%2F%2Ftwitter.com%2Fgtorodelvalle%22%7D%2C%22id%22%3A0%2C%22synset%22%3Anull%2C%22creationDate%22%3A1319739240904%2C%22accuracy%22%3A1%7D
We try to search for content annotated with that tag using the following invocation:
http://insemtives.science.unitn.it/platform-rest/knowledge-service/getResourcesByTerms?json=%7B%22includeTargets%22%3Atrue%2C%22maxSpecificityDistance%22%3A2%2C%22maxGeneralityDistance%22%3A0%2C%22operator%22%3A%22or%22%2C%22senses%22%3A%5B%7B%22term%22%3A%22multimedia%22%2C%22_specializationType%22%3A%22org.insemtives.platform.unitn.commons.model.QueryTerm%22%7D%5D%7D
we get [] as the result instead of the annotated content.
After trying other free-text annotations, we think the problem may be that "multimedia" has a sense in the WordNet ontology: with a free-text tag that has no sense in the ontology, everything seems to work. To verify, replace "multimedia" with "123" in the previous invocations.
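For reference, the search URL above can be rebuilt from its decoded JSON payload, which makes it easier to swap the term when reproducing the problem. This is only a minimal Python sketch: the helper name `build_query_url` is hypothetical, and the code only constructs the URL without calling the service.

```python
import json
from urllib.parse import quote, unquote

BASE = ("http://insemtives.science.unitn.it/platform-rest/"
        "knowledge-service/getResourcesByTerms")

def build_query_url(term, base=BASE):
    """Build a getResourcesByTerms URL for one free-text term (hypothetical helper)."""
    payload = {
        "includeTargets": True,
        "maxSpecificityDistance": 2,
        "maxGeneralityDistance": 0,
        "operator": "or",
        "senses": [
            {
                "term": term,
                "_specializationType":
                    "org.insemtives.platform.unitn.commons.model.QueryTerm",
            }
        ],
    }
    # Compact separators reproduce the whitespace-free JSON used in the URLs above.
    compact = json.dumps(payload, separators=(",", ":"))
    # safe="" percent-encodes every reserved character, including ":" and ",".
    return f"{base}?json={quote(compact, safe='')}"

failing_url = build_query_url("multimedia")  # term with a WordNet sense
working_url = build_query_url("123")         # term with no sense in the ontology

# Decoding the query string recovers the original payload.
decoded = json.loads(unquote(failing_url.split("?json=", 1)[1]))
```

Only the `term` value differs between the two URLs, so any difference in the results should come from how the service handles that term.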
Thank you very much!
Assigning the ticket to Viktor
Increasing priority to highest value...
Fixed, please check.
Right now the same search: http://insemtives.science.unitn.it/platform-rest/knowledge-service/getResourcesByTerms?json=%7B%22includeTargets%22%3Atrue%2C%22maxSpecificityDistance%22%3A2%2C%22maxGeneralityDistance%22%3A0%2C%22operator%22%3A%22or%22%2C%22senses%22%3A%5B%7B%22term%22%3A%22multimedia%22%2C%22_specializationType%22%3A%22org.insemtives.platform.unitn.commons.model.QueryTerm%22%7D%5D%7D
returns results for resources annotated with the term ("multimedia") both with and without senses. However, with the default parameter values, resources annotated using a sense get much higher scores even though the user did not specify a sense in the query. There is an email thread about this to centralize the discussion.
Assigning the ticket to Viktor
This issue was discussed by email, so let me summarize it here. The core of the problem is defining how to treat the case where the synset URI is absent from the request. In general, there are three cases:
1) The synset URI is present and has a value. In this case the service will return only the terms with the given sense.
2) The synset URI is present and its value is null. In this case the service will return only the free-text terms.
3) The synset URI is absent. At the moment the service returns all terms, and with the default scoring parameters the terms with senses score higher. The scoring can be tweaked (e.g., by setting conceptTermWeight and termWeight to the same value) so that free-text terms and terms with senses rank equally.
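The three cases hinge on the difference between a key that is absent from the JSON and a key that is present with a null value. A minimal Python sketch of that distinction follows. The function `classify_query_term` and its return labels are hypothetical and only illustrate the proposal, not the service's actual code; the `synsetUri` key name is taken from the test queries in this ticket.

```python
import json

def classify_query_term(term):
    """Map a parsed query term to the proposed matching mode (hypothetical labels)."""
    if "synsetUri" not in term:
        return "all"             # case 3: key absent, match every annotation
    if term["synsetUri"] is None:
        return "free_text_only"  # case 2: key present with a null value
    return "sense_only"          # case 1: key present with a synset URI

# json.loads maps JSON null to None, so cases 2 and 3 remain distinguishable.
case1 = json.loads(
    '{"term": "multimedia", "synsetUri": '
    '"http://www.w3.org/2006/03/wn/wn20/instances/synset-multimedia-noun-1"}'
)
case2 = json.loads('{"term": "multimedia", "synsetUri": null}')
case3 = json.loads('{"term": "multimedia"}')

for label, term in (("1", case1), ("2", case2), ("3", case3)):
    print(label, classify_query_term(term))
```

Note that a lookup such as `term.get("synsetUri")` returns `None` for both case 2 and case 3, so the membership test is what keeps the two apart.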
Please let us know if this resolves the issue or if you need any additional action. As far as I know German is not available at the moment, so I am assigning it to Daniel.
Hi Victor.
Last week we wanted to focus on the experiment. Now I am trying to reproduce the three cases you defined two weeks ago.
For example:
1) http://insemtives.science.unitn.it/test/platform-rest/knowledge-service/getResourcesByTerms?json={"includeTargets":true,"maxSpecificityDistance":5,"maxGeneralityDistance":5,"operator":"or","senses":[{"term":"multimedia","synsetUri":"http://www.w3.org/2006/03/wn/wn20/instances/synset-multimedia-noun-1","_specializationType":"org.insemtives.platform.unitn.commons.model.QueryTerm"}]}
2) http://insemtives.science.unitn.it/test/platform-rest/knowledge-service/getResourcesByTerms?json=
{"includeTargets":true,"maxSpecificityDistance":5,"maxGeneralityDistance":5,"operator":"or","senses":[{"term":"multimedia","synsetUri":null,"_specializationType":"org.insemtives.platform.unitn.commons.model.QueryTerm"}]}
3) http://insemtives.science.unitn.it/test/platform-rest/knowledge-service/getResourcesByTerms?json=
{"includeTargets":true,"maxSpecificityDistance":5,"maxGeneralityDistance":5,"operator":"or","senses":[{"term":"multimedia","_specializationType":"org.insemtives.platform.unitn.commons.model.QueryTerm"}]}
There are no differences in the results between cases "2" and "3". The expected output of "3" is the union of "1" and "2". Is this approach correct? If so, the problem is not solved :( I await your comments.
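The expectation that case 3 returns the union of cases 1 and 2 can be checked mechanically once the three responses are available. Here is a minimal Python sketch, assuming each response has already been reduced to a list of resource URIs. The response format is not shown in this ticket, and the function name and the example.org URIs are hypothetical placeholders, not real results.

```python
def is_union(case1_uris, case2_uris, case3_uris):
    """Return True if case 3 contains exactly the resources of cases 1 and 2 combined."""
    return set(case3_uris) == set(case1_uris) | set(case2_uris)

# Placeholder URIs for illustration only.
sense_hits = ["http://example.org/annotated-with-sense"]
free_text_hits = ["http://example.org/annotated-free-text"]

# Expected behavior: case 3 returns both kinds of resources.
print(is_union(sense_hits, free_text_hits, sense_hits + free_text_hits))  # True

# Reported behavior: case 3 returns the same results as case 2.
print(is_union(sense_hits, free_text_hits, free_text_hits))  # False
```

The check compares sets, so ordering and scores are ignored. If case 1 returns nothing, case 3 matching case 2 would still count as a valid union.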
Sorry, I probably didn't explain it correctly. The three-item list in my earlier message is the proposal I gathered from the long email thread, and it is not implemented yet. My message was meant to ask whether the proposal is correct and complete. If it is the desired behavior, please confirm it and we can start implementing it; if not, please correct the proposal.
Right now the knowledge service treats cases 2 and 3 in the same way.
I think the proposal looks good because it gives enough flexibility to choose the behavior that works best for each use case. Also, having the scoring customizable and configurable (the conceptTermWeight and termWeight parameters) is very positive for the project. The proposal is correct.