From: Szabo, P. (LNG-VIE) <pat...@le...> - 2013-02-27 09:23:13
|
Hi,

I'm using Jython to access the summarizer from Classifier4J. It's my first day with Jython, so please be gentle :-)

My problem is that I need to provide my own stop word list, for which I'll have to override the method getStopWords().

API: http://classifier4j.sourceforge.net/subprojects/core/apidocs/net/sf/classifier4J/DefaultStopWordsProvider.html

How do I do that in a way that the summarizer will use my method? I mean, just subclassing DefaultStopWordsProvider in my .py file can't be enough, right?

Help would be much appreciated.

cheers

. . . . . . . . . . . . . . . . . . . . . . . . . .
Ing. Patrick Szabo
Developer
LexisNexis
A-1030 Wien, Marxergasse 25
mailto:Pat...@le...
Tel.: +43 1 53452 1573
Fax.: +43 1 534 52 146
. . . . . . . . . . . . . . . . . . . . . . . . . .
|
From: Alan K. <jyt...@xh...> - 2013-02-27 10:34:57
|
[Patrick]
> I'm using jython to access the summarizer from Classifier4J. It's my first day with jython so plz be gentle :-)
> My problem is that i need to provide my own Stopwordlist for which I'll have to override the method getStopWords().
> API: http://classifier4j.sourceforge.net/subprojects/core/apidocs/net/sf/classifier4J/DefaultStopWordsProvider.html

    # Java package names are case-sensitive: note the capital J
    from net.sf.classifier4J import IStopWordProvider

    class MyStopWordsProvider(IStopWordProvider):

        def __init__(self):
            # This is just for illustration: should use a more efficient
            # data structure, such as a dictionary
            self.my_stop_words = ["abs", "def", "etc"]

        def getStopWords(self):
            return self.my_stop_words

        def isStopWord(self, word):
            # This does an inefficient linear search
            return word in self.my_stop_words

Instantiate it like this:

    my_stop_words_provider = MyStopWordsProvider()

> How do I do that in a way that the summarizer will use my method ?

How are you creating the summarizer?

> I mean just subclassing DefaultStopWordsProvider in my py can't be enough right ?

Should be simple.

Alan.
|
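The comment in Alan's example about using a more efficient data structure can be illustrated with a small sketch. The version below swaps the list for a frozenset (a set rather than the dictionary the comment mentions), so isStopWord becomes an average constant-time lookup. To keep it runnable outside Jython, StopWordProviderStandIn is a hypothetical plain-Python placeholder for the Java interface; under Jython the class would inherit from IStopWordProvider instead. The name SetStopWordsProvider and its constructor argument are also assumptions made for illustration.

```python
class StopWordProviderStandIn(object):
    """Hypothetical placeholder for the Java interface IStopWordProvider."""

    def isStopWord(self, word):
        raise NotImplementedError


class SetStopWordsProvider(StopWordProviderStandIn):
    """Stop word provider backed by a frozenset (assumed name)."""

    def __init__(self, words):
        # A frozenset gives average constant-time membership tests
        # and cannot be modified by accident after construction.
        self._stop_words = frozenset(words)

    def getStopWords(self):
        # Sorted so that callers get a stable, list-like result.
        return sorted(self._stop_words)

    def isStopWord(self, word):
        return word in self._stop_words


provider = SetStopWordsProvider(["abs", "def", "etc"])
```

With this provider, provider.isStopWord("def") is true and provider.isStopWord("jython") is false, matching the behaviour of the list-based version.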
From: Szabo, P. (LNG-VIE) <pat...@le...> - 2013-02-27 11:38:36
|
Thank you, that did help. Originally I subclassed DefaultStopWordsProvider instead of IStopWordProvider, but I'm gonna trust you on this :)

What is still unclear to me is how to tell my summarizer to use those stop words. I'm initializing it like this:

    summariser = SimpleSummariser()

It does not take any arguments.

cheers

From: ala...@gm... [mailto:ala...@gm...] On behalf of Alan Kennedy
Sent: Wednesday, 27 February 2013 11:34
To: Szabo, Patrick (LNG-VIE)
Cc: jyt...@li...
Subject: Re: [Jython-users] overriding a class method
|
From: Alan K. <jyt...@xh...> - 2013-02-28 00:06:34
|
[Patrick]
> Thank you that did help. Originally I subclassed DefaultStopWordsProvider instead of IStopWordProvider but I'm gonna trust you on this :)

Actually, you could have either subclassed DefaultStopWordsProvider or implemented the IStopWordProvider interface: Jython uses the same syntax for both, i.e. inheritance syntax.

    class MyStopWordsProvider(IStopWordProvider):

    class MyStopWordsProvider(DefaultStopWordsProvider):

The difference is that the latter version inherits the implementations (unless overridden) of DefaultStopWordsProvider, whereas the former inherits no implementations, since interfaces are abstract.

[Patrick]
> What is still unclear to me is how to tell my summarizer to use those stopwords.
> I'm initializing it like this:
> summariser = SimpleSummariser()
> It does not take any arguments.

Yes, I'm afraid we can't help you there: that is a deficiency of the Classifier4J API.

I note that one of the constructors of BayesianClassifier takes an IStopWordProvider implementation:

http://classifier4j.sourceforge.net/subprojects/core/apidocs/net/sf/classifier4J/bayesian/BayesianClassifier.html#BayesianClassifier%28net.sf.classifier4J.bayesian.IWordsDataSource,%20net.sf.classifier4J.ITokenizer,%20net.sf.classifier4J.IStopWordProvider%29

But none of the other constructors, of any object, do.

That product seems moribund: it hasn't been updated in over 8 years. If you're going to use it, I think you're going to have to hack the code.

The SimpleSummariser you want to use does not use a StopWordProvider. Perhaps it is called indirectly?

To find out the call chain, you could change the code to throw an exception inside one of the DefaultStopWordsProvider methods. Then you could see what object is calling DefaultStopWordsProvider methods, and recompile the code to use your own implementation. Or single-step through the code in an IDE?

Sorry we can't help further, but Classifier4J does seem a rather limited product.

Alan.
|
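Alan's suggestion of throwing an exception inside a provider method to expose the call chain can be sketched in plain Python. This is only an illustration of the technique, not Classifier4J code: DefaultProviderStandIn and SummariserStandIn are hypothetical stand-ins, and it is an assumption that the real summariser builds its provider internally in a similar way. For Classifier4J itself, the exception would have to be added to the Java source, which would then be recompiled.

```python
import sys
import traceback


class DefaultProviderStandIn(object):
    """Hypothetical stand-in for DefaultStopWordsProvider."""

    def isStopWord(self, word):
        # Deliberately fail so that the traceback reveals who called us.
        raise RuntimeError("isStopWord(%r) was called" % word)


class SummariserStandIn(object):
    """Hypothetical stand-in for a class that builds its own provider."""

    def summarise(self, text):
        provider = DefaultProviderStandIn()  # hard-wired, not injectable
        kept = []
        for word in text.split():
            if not provider.isStopWord(word):
                kept.append(word)
        return " ".join(kept)


def find_call_chain():
    """Return the function names on the stack when the provider is hit."""
    try:
        SummariserStandIn().summarise("the quick brown fox")
    except RuntimeError:
        frames = traceback.extract_tb(sys.exc_info()[2])
        # Index 2 of each traceback entry is the function name.
        return [frame[2] for frame in frames]
    return []


call_chain = find_call_chain()
print(" -> ".join(call_chain))
```

The printed chain reads find_call_chain -> summarise -> isStopWord, which identifies summarise as the place where the default provider is created and therefore where a custom implementation would need to be substituted.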
From: Szabo, P. (LNG-VIE) <pat...@le...> - 2013-02-28 07:44:24
|
Thank you for your detailed answer.

I guess I was blinded by the whole Jython-power and did not notice that the summarizer does not appear to take any arguments. I guess I’ll move on to another lib.

Thanks again!

From: ala...@gm... [mailto:ala...@gm...] On behalf of Alan Kennedy
Sent: Thursday, 28 February 2013 01:05
To: Szabo, Patrick (LNG-VIE)
Cc: jyt...@li...
Subject: Re: [Jython-users] overriding a class method
|