I am using this library to split text into sentences. Is there a way to mark an arbitrary part of the text as a single word? For example, I have the text
'Name of my profile picture is public_image.jpg'
and I would like to mark 'public_image.jpg' as a word, so that the library's classifier treats it like 'profile', 'picture', or any other ordinary word. Is there some way to do that? Thank you in advance.
The sentence detector will only try to split at words that contain some end-of-sentence punctuation. It can also be called (with some Java code) so that it returns character spans. I would replace that period with something like an underscore in the text you send to the sentence detector, and then apply the returned character spans to your original text. Since the replacement is a single character, the offsets still line up with the original.
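A rough sketch of that idea in plain Java, in case it helps. Everything named here is an assumption for illustration: `detectSpans` is only a naive stand-in for the library's sentence detector (it is not the library's API), and the `INNER_PERIOD` pattern is one guess at what counts as a period inside a word. In real use you would pass the masked text to the library's span-returning call instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class MaskedSentenceSplit {

    // Assumed rule: a period directly between two word characters,
    // as in "public_image.jpg", is not a sentence end.
    private static final Pattern INNER_PERIOD =
            Pattern.compile("(?<=\\w)\\.(?=\\w)");

    /** Replaces each word-internal period with '_'; the text length is unchanged. */
    public static String mask(String text) {
        return INNER_PERIOD.matcher(text).replaceAll("_");
    }

    /**
     * Hypothetical stand-in for the real sentence detector.
     * Returns {start, end} offsets, ending a sentence after '.', '!' or '?'.
     */
    public static List<int[]> detectSpans(String text) {
        List<int[]> spans = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '.' || c == '!' || c == '?') {
                spans.add(new int[] {start, i + 1});
                start = i + 1;
                // Skip whitespace so the next sentence starts at its first character.
                while (start < text.length()
                        && Character.isWhitespace(text.charAt(start))) {
                    start++;
                }
                i = start - 1;
            }
        }
        if (start < text.length()) {
            spans.add(new int[] {start, text.length()});
        }
        return spans;
    }

    /** Detects spans on the masked copy, then cuts the ORIGINAL text with them. */
    public static List<String> splitSentences(String original) {
        String masked = mask(original);
        List<String> sentences = new ArrayList<>();
        for (int[] span : detectSpans(masked)) {
            sentences.add(original.substring(span[0], span[1]));
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "Name of my profile picture is public_image.jpg. It is public.";
        for (String sentence : splitSentences(text)) {
            System.out.println(sentence);
        }
    }
}
```

The important part is that the mask swaps one character for one character, so the spans computed on the masked copy are valid offsets into the original string. Note that this particular pattern would also mask a period in something like "end.Start" where a space is missing, so it may need tightening for your data.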
Hope this helps...Tom