Hi,
I was wondering if anybody has experience with an anaphora resolution/substitution tool. This function is not included in the OpenNLP API, right? Any hints, tips, or links are welcome.
Thanx
Peter
Hi Peter,
This functionality is currently being integrated into OpenNLP but is not included in the latest released version. The next release will include coreference resolution and will likely be out in April. Hope this helps...Tom
Hi Tom, thank you for your quick response :-)
So do the latest CVS snapshots already include basic functionality?
best regards
Peter
Hi,
The short answer is no. The code out there doesn't include a means of using the existing parser information or named-entity information, or displaying the output of the coreference material, or models for running it. I'm almost done with that integration effort and then I'll retrain models and see how they perform. I did a fair amount of this work recently but it may be a couple of weeks before I get back to it. I can let you know once the coref stuff is ready if it doesn't happen within a day or so of the next release. Hope this helps...Tom
Hi, you didn't promise too much :-)
I am going to download the latest version now and give it a try! Thanks for your good work.
Grtz Peter
Hi Tom,
Finally I found the time to play around a bit with the coref component, and I tried to get the examples described in the README file running.
I ended up with this exception:
Exception in thread "main" java.lang.NumberFormatException: For input string: "NUMBER_OF_VERB_FRAMES"
    at java.lang.NumberFormatException.forInputString(Unknown Source)
    at java.lang.Integer.parseInt(Unknown Source)
    at java.lang.Integer.parseInt(Unknown Source)
    at net.didion.jwnl.data.VerbFrame.initialize(VerbFrame.java:22)
    at opennlp.tools.coref.mention.JWNLDictionary.<init>(JWNLDictionary.java:61)
This is known in the JWNL project as well, but not really resolved. Did you have similar problems, and do you know a workaround? I guess it has something to do with the file_properties.xml file, which is missing from the latest OpenNLP release (if it is really needed at all).
Any help appreciated.
Grtz Peter
Hi,
My guess is that this is a WordNet version compatibility problem. You need to have WordNet version 2.0.
I'm not sure what the file_properties.xml file you are referring to is. I believe the property files needed by JWNL are in its jar. The jwnl.jar that is distributed with OpenNLP is slightly different from the one the JWNL project distributes. I don't remember the differences exactly, but I think loading the properties file as a resource from the jar may have been one of my changes. If you're using theirs, or one built from their source, that may be the problem.
Let me know if one of those two things fixes this problem. If it's the first, I'll catch the exception and report a message that makes sense in the next release.
Hope this helps...Tom
btw: I use WordNet 2.0
Hi Tom,
Finally it worked...somehow. I think it has something to do with the language settings on the machine, because the jar file only contains an _en properties file. But I am not sure about it...
More important: which version of JWNL did you use, and what do you mean by slight differences?
Problem: my application already used JWNL (1.3RC3), but after replacing that jar with the OpenNLP one, the application doesn't work anymore....
Grtz Peter
Hi,
I used the rc3 version as well. I did a diff on the code and the only thing I saw was that I removed the logging dependency on org.apache.logging from net/didion/jwnl/util/MessageLog.java so that I would not have to include that jar in my distribution. Because of this, my jar is built from source. I can't imagine why it wouldn't work with the original jar. Does it work if you just run OpenNLP with the original JWNL jar? (I'd try it but I'm not at home.)
If you are integrating the two code bases together, then it may be that the instance of net.didion.jwnl.dictionary.Dictionary that you are creating for your app is incompatible with the one I create in:
opennlp.tools.coref.mention.JWNLDictionary or vice versa.
Hope this helps...Tom
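When two builds of the same library may be on the classpath, as discussed above, it can help to check which jar a class is actually loaded from at runtime. The following is only a minimal sketch using the standard `ProtectionDomain`/`CodeSource` API; the class name `JarOrigin` and the method `originOf` are made up for illustration and are not part of OpenNLP or JWNL.

```java
import java.security.CodeSource;

final class JarOrigin {
    /**
     * Returns the location (usually a jar or directory URL) a class was loaded from.
     * Classes loaded by the bootstrap loader expose no code source, so a marker is returned.
     */
    static String originOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null ? "(bootstrap/unknown)" : String.valueOf(source.getLocation());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass fully qualified class names as arguments,
        // e.g. net.didion.jwnl.data.VerbFrame (taken from the stack trace in this thread).
        for (String name : args) {
            System.out.println(name + " <- " + originOf(Class.forName(name)));
        }
    }
}
```

Running it with the same classpath as the application, once for a JWNL class and once for an OpenNLP class, should show whether the JWNL jar in use at runtime is the one that was used for compiling.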
OK, here we go:
First of all, my mistake was that I used your distribution of JWNL to compile the OpenNLP lib, but used the original JWNL distribution to run my code. That caused an error...
When I use the same distribution for both, everything works fine.
I already mentioned the
"NUMBER_OF_VERB_FRAMES" error, and I could reproduce it. After extracting the "JWNLResource_en.properties" file from the jar, renaming it to "JWNLResource.properties" and adding it back to the JWNL jar, this error disappeared. I run Win2000 with German language settings, so I think there must be a problem with that, but now everything works fine.
Sample Output:
(TOP (S (NP#2 (NNP Johann)) (VP (VBZ goes) (PP (IN into) (NP (DT the) (NN house)))) (. .)) (S (NP#2 (PRP He)) (VP (VBZ is) (NP (DT a) (JJ brave) (NN man))) (. .)) )
:-)
Thanks for your help, Tom!
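For reading output like the sample above programmatically, here is a small sketch that groups constituents by the number after the `#`. It assumes, based only on this thread, that constituents sharing an index such as `NP#2` refer to the same entity; the class `CorefMentions` and its methods are hypothetical helpers, not part of OpenNLP, and the sketch does not handle tokens that are themselves parentheses.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CorefMentions {
    // A constituent label carrying an index, e.g. "(NP#2 "
    private static final Pattern INDEXED = Pattern.compile("\\(([^\\s()#]+)#(\\d+)\\s");
    // A preterminal such as "(NNP Johann)": a tag followed by exactly one word
    private static final Pattern LEAF = Pattern.compile("\\([^\\s()]+\\s+([^\\s()]+)\\)");

    /** Maps each index to the text of the constituents that carry it, in order of appearance. */
    static Map<Integer, List<String>> mentionsById(String tree) {
        Map<Integer, List<String>> result = new LinkedHashMap<>();
        Matcher m = INDEXED.matcher(tree);
        while (m.find()) {
            int close = matchingClose(tree, m.start());
            String words = leafText(tree.substring(m.start(), close + 1));
            result.computeIfAbsent(Integer.parseInt(m.group(2)), k -> new ArrayList<>()).add(words);
        }
        return result;
    }

    /** Finds the index of the parenthesis that closes the one at position {@code open}. */
    private static int matchingClose(String s, int open) {
        int depth = 0;
        for (int i = open; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')' && --depth == 0) {
                return i;
            }
        }
        throw new IllegalArgumentException("Unbalanced parentheses");
    }

    /** Joins the words of all preterminals inside one constituent. */
    private static String leafText(String constituent) {
        List<String> words = new ArrayList<>();
        Matcher m = LEAF.matcher(constituent);
        while (m.find()) {
            words.add(m.group(1));
        }
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        String output = "(TOP (S (NP#2 (NNP Johann)) (VP (VBZ goes) (PP (IN into) (NP (DT the) (NN house)))) (. .)) "
                + "(S (NP#2 (PRP He)) (VP (VBZ is) (NP (DT a) (JJ brave) (NN man))) (. .)) )";
        System.out.println(mentionsById(output));
    }
}
```

For the sample output in this post, the sketch prints `{2=[Johann, He]}`; for a tree without any `#` indices it returns an empty map.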
Hi,
Glad you got it working. I'll take a look at the file name issue. Perhaps it's a windows/linux issue where jwnl looks for a different file depending on OS. I develop and test under linux so issues of this nature can slip through the cracks. Probably the easiest thing to do would be to just put files with both names in the jar. Thanks for tracking this down...Tom
Hi, I encountered the "NUMBER_OF_VERB_FRAMES" error as well. I run WinXP. I used Peter's method, which is to rename "JWNLResource_en.properties" to "JWNLResource.properties" and then add it to the JWNL distribution again. But it still does not work; I get the same error:
Exception in thread "main" java.lang.NumberFormatException: For input string: "NUMBER_OF_VERB_FRAMES"
    at java.lang.NumberFormatException.forInputString(Unknown Source)
    at java.lang.Integer.parseInt(Unknown Source)
    at java.lang.Integer.parseInt(Unknown Source)
    at net.didion.jwnl.data.VerbFrame.initialize(VerbFrame.java:22)
    at opennlp.tools.coref.mention.JWNLDictionary.<init>(JWNLDictionary.java:61)
I traced the program to the call "VerbFrame.initialize();":
public static void initialize() {
    if (!_initalized) {
        // I think this is the line that throws the exception:
        int framesSize = Integer.parseInt(JWNL.resolveMessage("NUMBER_OF_VERB_FRAMES"));
        _verbFrames = new VerbFrame[framesSize];
        for (int i = 1; i <= framesSize; i++)
            _verbFrames[i - 1] = new VerbFrame(getKeyString(i), i);
        _initalized = true;
    }
}
I don't know how to solve this problem.
Any help appreciated.
Jianlee
Or maybe the JWNLResource_en.properties file is not being loaded by the program...
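The stack trace above shows `Integer.parseInt` receiving the key `NUMBER_OF_VERB_FRAMES` itself, which suggests that the message lookup hands back the key when no properties file is found. The sketch below reproduces that failure mode with the standard `ResourceBundle` API and adds a clearer error message. `resolveOrEcho` is a hypothetical stand-in, not JWNL's actual `resolveMessage`, and the echo-the-key behaviour is an assumption inferred from the trace.

```java
import java.util.Locale;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

final class VerbFrameCountLookup {
    /** Hypothetical stand-in for a message lookup that echoes the key when nothing is found. */
    static String resolveOrEcho(String bundleName, Locale locale, String key) {
        try {
            return ResourceBundle.getBundle(bundleName, locale).getString(key);
        } catch (MissingResourceException e) {
            // Assumption: mirrors the behaviour the stack trace suggests.
            return key;
        }
    }

    /** Parses the resolved value, failing with a message that names the likely cause. */
    static int parseFrameCount(String resolved, String key) {
        if (resolved.equals(key)) {
            throw new IllegalStateException("No value found for " + key
                    + "; is a properties file matching the default locale on the classpath?");
        }
        return Integer.parseInt(resolved.trim());
    }

    public static void main(String[] args) {
        String key = "NUMBER_OF_VERB_FRAMES";
        // "JWNLResource" is the base name of the properties file mentioned in this thread.
        String resolved = resolveOrEcho("JWNLResource", Locale.getDefault(), key);
        System.out.println("Resolved value: " + resolved);
        System.out.println("Frame count: " + parseFrameCount(resolved, key));
    }
}
```

Without a matching properties file on the classpath, this fails with the `IllegalStateException` rather than the `NumberFormatException`, which points more directly at the missing resource.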
Hi,
You are correct that the JWNLResource_en.properties file is not being loaded. This file is loaded via:
java.util.ResourceBundle.getBundle(bundle, _locale);
and the file name appears to be dependent on the Locale of your machine. My guess is that you'll need to change the "en" part of the file name to better match your Locale and re-jar the files. I don't know offhand what your locale name is, but you might try just printing out Locale.getDefault() to see if that tells you something.
You'll probably also want to rename:
"PrincetonResource_en.properties"
Please report back if you get this to work. Thanks...Tom
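To see which file names would match a given machine, the lookup order can be printed with the standard `ResourceBundle.Control` API. This is only a sketch: `BundleCandidates` and `candidateFiles` are names invented here, and it lists the names tried for one locale (most specific first) without performing any actual loading.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;

final class BundleCandidates {
    /** Lists the properties file names ResourceBundle would try for a base name and locale. */
    static List<String> candidateFiles(String baseName, Locale locale) {
        ResourceBundle.Control control =
                ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_PROPERTIES);
        List<String> files = new ArrayList<>();
        for (Locale candidate : control.getCandidateLocales(baseName, locale)) {
            files.add(control.toBundleName(baseName, candidate) + ".properties");
        }
        return files;
    }

    public static void main(String[] args) {
        Locale current = Locale.getDefault();
        System.out.println("Default locale: " + current);
        // Base names taken from the two properties files mentioned in this thread.
        for (String base : new String[] {"JWNLResource", "PrincetonResource"}) {
            System.out.println(base + " -> " + candidateFiles(base, current));
        }
    }
}
```

On a machine with a German locale (de_DE) the list for `JWNLResource` is `JWNLResource_de_DE.properties`, `JWNLResource_de.properties`, `JWNLResource.properties`. The last entry, the bare base name, is a candidate for every locale, while an `_en` file is not a candidate for a German or Chinese locale.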
It's working...
Thanks for your help, Tom.
My OS is set to Chinese, so I renamed both files from "en" to "zh", and then it worked.
But the output is a little weird. I used the example mentioned above by piodiwan, which is
"Johann goes into the house. He is a brave man."
piodiwan parsed the sentence to
(TOP (S (NP (NNP Johann)) (VP (VBZ goes) (PP (IN into) (NP (DT the) (NN house)))) (. .)) (S (NP (PRP He)) (VP (VBZ is) (NP (DT a) (JJ brave) (NN man))) (. .)))
and the coref result he got was
(TOP (S (NP#2 (NNP Johann)) (VP (VBZ goes) (PP (IN into) (NP (DT the) (NN house)))) (. .)) (S (NP#2 (PRP He)) (VP (VBZ is) (NP (DT a) (JJ brave) (NN man))) (. .)) )
but... I can't get the same result. I used his parse as input, but the coref output I got was the same as the input:
(TOP (S (NP (NNP Johann)) (VP (VBZ goes) (PP (IN into) (NP (DT the) (NN house)))) (. .)) (S (NP (PRP He)) (VP (VBZ is) (NP (DT a) (JJ brave) (NN man))) (. .)) )
Then I used the OpenNLP parser and got this parse:
(TOP (S (NP (NNP John)) (VP (VBZ goes) (PP (IN into) (NP (DT the) (NN house)))) (. .)))
(TOP (S (NP (PRP He)) (VP (VBZ is) (NP (DT a) (JJ brave) (NN man))) (. .)))
Then I fed these parsed sentences to the coref program, but again the coref output was the same as the input:
(TOP (S (NP (NNP John)) (VP (VBZ goes) (PP (IN into) (NP (DT the) (NN house)))) (. .)))
(TOP (S (NP (PRP He)) (VP (VBZ is) (NP (DT a) (JJ brave) (NN man))) (. .)))
Why?
Jianlee
Hi,
Sorry I missed this reply when it first came in. The original posts were referring to version 1.2 of opennlp-tools, whereas in 1.3 the named-entity information has been incorporated, and the coreference algorithm expects to see people's names tagged as such. Run the named-entity component as described in the coreference example in the "Running The Tools" section of the README and it should work better. Unfortunately I don't have time to do this myself right now.
Hope this helps...Tom
GUITAR
http://privatewww.essex.ac.uk/~malexa/GuiTAR/
MARS
http://clg.wlv.ac.uk/MARS/index.php
JavaRAP
http://www.comp.nus.edu.sg/~qiul/NLPTools/JavaRAP.html
All are quite tedious to compile or run. Mail me if you need help.
mapb AT stud DOT uni-graz DOT at
Regards
Marco