#5 HTML Filter cannot handle multi-page articles

open
nobody
News Filter (4)
5
2007-05-14
2007-05-14
No

Several news publications (Times of India, New York Times, etc.) split long articles across multiple pages. HTMLFilter currently does not recognize the second and subsequent pages, so only the first page of a multi-page article gets processed. This affects how the article is filtered and classified.

Need to work out techniques to tackle this problem.
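
One possible starting point, sketched here in Python purely as an illustration: look for a "next page" link on each fetched page and follow it until none is found. This is not HTMLFilter's code or API. The names `NextPageFinder`, `find_next_page`, and `collect_pages`, the link-text pattern, and the `max_pages` cap are all assumptions made for this sketch, and real sites will likely need per-publication rules.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed heuristic: link text starting with "next" or "continued".
NEXT_TEXT = re.compile(r"^(next|continued)\b", re.I)


class NextPageFinder(HTMLParser):
    """Records the first link that looks like it points to the next page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.next_url = None
        self._href = None   # href of the <a> currently being read
        self._text = []     # text collected inside that <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        href = attrs.get("href")
        if tag in ("a", "link") and href:
            # rel="next" is the most explicit signal when a site provides it.
            rel = (attrs.get("rel") or "").lower().split()
            if "next" in rel and self.next_url is None:
                self.next_url = urljoin(self.base_url, href)
            if tag == "a":
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = "".join(self._text).strip()
            if self.next_url is None and NEXT_TEXT.match(text):
                self.next_url = urljoin(self.base_url, self._href)
            self._href = None


def find_next_page(html, base_url):
    """Return the absolute URL of the next page, or None if none is found."""
    finder = NextPageFinder(base_url)
    finder.feed(html)
    finder.close()
    return finder.next_url


def collect_pages(first_url, fetch, max_pages=20):
    """Follow next-page links; fetch is a caller-supplied url -> html function."""
    pages, seen, url = [], set(), first_url
    while url and url not in seen and len(pages) < max_pages:
        seen.add(url)
        html = fetch(url)
        pages.append(html)
        url = find_next_page(html, url)
    return pages
```

The pages returned by `collect_pages` could then be concatenated before filtering and classification. The `seen` set and the `max_pages` cap guard against pagination loops.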
