Robert Bala
-
2012-09-21
- milestone: --> Backlog
Web scraping and data mining rely on assumptions about the parsed HTML output, such as a particular structure with specific tags and attributes.
Thus, there needs to be a way to check for this required structure before the corresponding processor is run.
This can, for example, be done by walking the DOM and checking whether all required nodes and leaves are present.
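
The check described above could look roughly like the following Python sketch. It is not taken from the project: the names `StructureChecker` and `missing_structure` and the rule format (a list of `(tag, {attribute: value})` pairs) are assumptions made for illustration. It also simplifies the DOM walk by streaming over the start tags with the standard-library `html.parser` module instead of building a full tree, and it compares attribute values exactly, so a multi-valued `class` attribute would need extra handling.

```python
from html.parser import HTMLParser


class StructureChecker(HTMLParser):
    """Records which required (tag, attributes) rules occur in the markup."""

    def __init__(self, required):
        super().__init__()
        # Hypothetical rule format: list of (tag, {attribute: value}) pairs.
        self.required = required
        self.found = set()  # indexes of the rules that matched at least once

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        attr_map = dict(attrs)
        for index, (req_tag, req_attrs) in enumerate(self.required):
            if tag == req_tag and all(
                attr_map.get(name) == value for name, value in req_attrs.items()
            ):
                self.found.add(index)


def missing_structure(html, required):
    """Return the rules that no element in the document satisfies."""
    checker = StructureChecker(required)
    checker.feed(html)
    checker.close()
    return [rule for i, rule in enumerate(required) if i not in checker.found]


page = (
    '<html><body><table id="prices">'
    '<tr class="row"><td>1</td></tr>'
    "</table></body></html>"
)
rules = [
    ("table", {"id": "prices"}),
    ("tr", {"class": "row"}),
    ("span", {"class": "total"}),
]
missing = missing_structure(page, rules)
```

In this sketch a processor would only be run when `missing_structure` returns an empty list; otherwise the returned rules name the parts of the expected structure that were not found.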