The HTMLParser class misses CSS and JS file references
in LINK and SCRIPT tags. A simple fix is to add the
following lines to the parseAsHTML method:
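    // <link href="...">: external stylesheets and other linked resources
    extractAttributesFromTags("link", "href",
        sourceURL, newURLs, newURLSet, textContent);
    // <script src="...">: external JavaScript files
    extractAttributesFromTags("script", "src",
        sourceURL, newURLs, newURLSet, textContent);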
With this change I am still missing CSS files that are referenced
using the CSS @import notation.
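
One possible way to pick up the @import case is to scan the stylesheet text (the body of a STYLE tag, or a downloaded .css file) with a regular expression. The following is only a rough sketch and not part of the HTMLParser class: the class name CssImportExtractor and the method extractImports are made up for illustration, it does not skip CSS comments, and the URLs it returns would still have to be resolved against sourceURL and added to the URL lists the same way the existing code does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the existing parser.
class CssImportExtractor {

    // Matches the common forms of the rule:
    //   @import "a.css";   @import 'a.css';
    //   @import url("a.css");   @import url(a.css) screen;
    // Group 1 = double-quoted URL, group 2 = single-quoted, group 3 = unquoted.
    private static final Pattern IMPORT_RULE = Pattern.compile(
        "@import\\s*(?:url\\(\\s*)?(?:\"([^\"]*)\"|'([^']*)'|([^\\s\"');]+))",
        Pattern.CASE_INSENSITIVE);

    // Returns the raw (unresolved) URLs of all @import rules found in css.
    static List<String> extractImports(String css) {
        List<String> urls = new ArrayList<>();
        Matcher m = IMPORT_RULE.matcher(css);
        while (m.find()) {
            // Exactly one of the three capture groups is set per match.
            for (int g = 1; g <= 3; g++) {
                if (m.group(g) != null) {
                    urls.add(m.group(g));
                    break;
                }
            }
        }
        return urls;
    }
}
```

For example, calling CssImportExtractor.extractImports on the text of a STYLE block would give the list of stylesheet URLs to queue next to the ones found by the LINK and SCRIPT extraction above.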