I recently received a spam message that used a technique
similar to splitting a word with an HTML comment.
However, this particular specimen split the word with
a malformed HTML comment tag that lacked the --
characters, e.g.:
It would probably be a good idea to generalise the HTML
comment stripping code so that it strips all HTML tags
(after extracting the tokens contained therein) before
parsing the remaining text.
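
A minimal sketch of that generalisation in Python might look like
the following. It is not the filter's actual code: the names TAG_RE,
TOKEN_RE and tokenize are hypothetical, the token definition is an
assumption, and the sample input in the usage note is made up for
illustration.

```python
import re

# Treat anything from '<' to the next '>' as a tag. This also catches
# malformed comments that lack the '--' delimiters.
TAG_RE = re.compile(r"<[^>]*>")

# Assumed token definition: runs of letters, digits, apostrophes,
# hyphens and dollar signs. A real filter may use a different one.
TOKEN_RE = re.compile(r"[A-Za-z0-9$'-]+")


def tokenize(html):
    """Return (tag_tokens, text_tokens) for one message body."""
    tag_tokens = []

    def strip_tag(match):
        # Keep the tokens found inside the tag, then drop the tag.
        tag_tokens.extend(TOKEN_RE.findall(match.group(0)))
        # An empty replacement rejoins a word that the tag had split.
        return ""

    text = TAG_RE.sub(strip_tag, html)
    return tag_tokens, TOKEN_RE.findall(text)
```

For example, tokenize('che<!xyz>ap <a href="http://example.com">now</a>')
yields the text tokens 'cheap' and 'now', while 'xyz', 'href' and
'example' end up among the tag tokens. One trade-off of replacing
tags with nothing is that words separated only by a tag such as
<br> are also joined, so a real filter might want to treat block
level tags differently.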