This plugin sends the entire page to Recoll for indexing, including irrelevant content such as ads and sidebars, which clutters the search results.
You probably know more about this than I do, but I found these links useful. There are lots of open-source libraries that do this kind of article-text extraction. None is available for JavaScript yet, however.
https://web.archive.org/web/20130627025152/http://tomazkovacic.com/blog/122/evaluating-text-extraction-algorithms
https://web.archive.org/web/20130623005602/http://tomazkovacic.com/blog/14/extracting-article-text-from-html-documents/
https://web.archive.org/web/20130622025806/http://tomazkovacic.com/blog/56/list-of-resources-article-text-extraction-from-html-documents/
https://web.archive.org/web/20130623072055/http://tomazkovacic.com/blog/98/feature-wise-comparison-of-html-article-text-extractors/
Anonymous
Actually, this might make more sense in Recoll itself rather than in the Firefox extension: save the full web page and index a filtered version of it, rather than having the extension filter the page before it is saved and indexed.
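To make the idea a bit more concrete, here is a rough sketch of what such a filtering step could look like on the indexing side. It is not taken from Recoll or from any of the libraries in the links above. It is a simple heuristic I am assuming for illustration: drop the contents of tags like `nav` and `footer`, split the rest into text blocks, and keep only blocks that are long enough and not mostly link text. The names `BlockCollector` and `extract_main_text`, the tag lists, and the thresholds `min_words` and `max_link_density` are all made up for this example.

```python
from html.parser import HTMLParser

# Assumption: content inside these tags is treated as boilerplate and dropped.
SKIP_TAGS = {"script", "style", "nav", "aside", "footer", "header", "form"}
# Assumption: these tags mark the boundary between two text blocks.
BLOCK_TAGS = {"p", "div", "li", "td", "tr", "table", "ul", "ol", "br",
              "article", "section", "h1", "h2", "h3", "h4", "h5", "h6"}


class BlockCollector(HTMLParser):
    """Hypothetical helper: splits a page into (text, link_word_count) blocks."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.blocks = []       # finished blocks
        self._words = []       # words of the block being built
        self._link_words = 0   # how many of those words sit inside <a>
        self._skip_depth = 0   # > 0 while inside a skipped tag
        self._in_link = 0      # > 0 while inside <a>

    def flush(self):
        # Close the current block, if it holds any text.
        if self._words:
            self.blocks.append((" ".join(self._words), self._link_words))
        self._words = []
        self._link_words = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.flush()
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.flush()
        elif tag == "a":
            self._in_link += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in BLOCK_TAGS:
            self.flush()
        elif tag == "a":
            self._in_link = max(0, self._in_link - 1)

    def handle_data(self, data):
        if self._skip_depth:
            return
        words = data.split()
        self._words.extend(words)
        if self._in_link:
            self._link_words += len(words)


def extract_main_text(html, min_words=10, max_link_density=0.5):
    """Keep blocks with enough words and a low share of link text."""
    parser = BlockCollector()
    parser.feed(html)
    parser.close()
    parser.flush()
    kept = []
    for text, link_words in parser.blocks:
        n_words = len(text.split())
        if n_words >= min_words and link_words / n_words <= max_link_density:
            kept.append(text)
    return "\n".join(kept)
```

The full page would still be saved as-is, and only the output of something like `extract_main_text(saved_html)` would be handed to the indexer. A real implementation would probably want one of the better algorithms compared in the links above; this only shows where the filtering would sit.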