Dominate is a Python library for creating and manipulating HTML documents
The lxml XML toolkit for Python
Lightweight library for scraping websites with LLMs
A Python package for building the DOM of HTML documents
HTML parser that can be used for screen-scraping applications