S Luz
-
2017-03-05
- Priority: --> 10
Words that are too common produce so many concordances that they might slow the browser down too much. The proposed solution is to sample occurrences of such words randomly from the collection of files, up to a
certain limit of concordances. This is not an issue at the size of the corpora currently handled by modnlp (in fact, it may never become an issue), but the feature may be useful to implement in future.
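
One possible way to do the sampling is reservoir sampling, which keeps a uniform random sample of at most a fixed number of hits while reading the files once, without holding every hit in memory. The sketch below is only an illustration in Python, not modnlp code: the names iter_hits and sample_hits, the (file name, position) hit type and the in-memory token lists are all assumptions made for the example.

```python
import random
from typing import Iterable, Iterator, List, Optional, Tuple

# Hypothetical hit type: (file name, token position). Not modnlp's data model.
Hit = Tuple[str, int]


def iter_hits(files: Iterable[Tuple[str, List[str]]], word: str) -> Iterator[Hit]:
    """Yield (file name, position) for each occurrence of `word` in each tokenised file."""
    for name, tokens in files:
        for pos, token in enumerate(tokens):
            if token == word:
                yield (name, pos)


def sample_hits(hits: Iterable[Hit], limit: int,
                rng: Optional[random.Random] = None) -> List[Hit]:
    """Keep at most `limit` hits, chosen uniformly at random (reservoir sampling)."""
    rng = rng or random.Random()
    reservoir: List[Hit] = []
    for seen, hit in enumerate(hits):
        if seen < limit:
            # Fill the reservoir with the first `limit` hits.
            reservoir.append(hit)
        else:
            # Replace a stored hit with probability limit / (seen + 1).
            j = rng.randint(0, seen)  # both ends inclusive
            if j < limit:
                reservoir[j] = hit
    return reservoir


# Toy corpus standing in for the collection of files.
corpus = [
    ("a.txt", "the cat saw the dog".split()),
    ("b.txt", "the end".split()),
]
sample = sample_hits(iter_hits(corpus, "the"), limit=2)
print(sample)
```

When a word has fewer occurrences than the limit, all of them are returned in their original order, so rare words are unaffected and only very common words are actually sampled.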