[Ebiness-crawler] decent Html parser?
From: Mike D. <md...@3d...> - 2001-05-25 22:47:49
Hi guys,

Well, I've already written a parser for the crawler that simply extracts '<a href="">' tags from HTML, but I was wondering if anyone knows of a good, stable parser that will parse HTML generically? I think that would be far superior to special-casing one tag.

I think the 'expat' library (hosted on SF - http://expat.sourceforge.net/) is the standard for XML parsing, so we can use that for any XML/XHTML we come across.

Mike
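To illustrate what "generic" parsing would buy over matching '<a href="">' by hand, here is a minimal sketch using Python's standard-library `html.parser`. It is not the crawler's actual code: the names `LinkExtractor` and `extract_links` and the `base_url` parameter are hypothetical, chosen only for this example. The parser tokenizes every tag, and the link extraction becomes one small callback on top of it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect href values from <a> tags (hypothetical helper for illustration)."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling us,
        # so <A HREF=...> and <a href=...> are handled the same way.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL, if one was given.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url=""):
    """Return the href targets of all <a> tags in the given HTML text."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

Because the parser already handles upper-case tags, unquoted attribute values, and anchors without an href, the crawler-specific part stays small, and other tags could be handled later by adding more cases to the same callback.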