pup
Parsing HTML at the command line
...It reads from stdin, prints to stdout, and lets the user filter parts of the page using CSS selectors. Inspired by jq, pup aims to be a fast and flexible way of exploring HTML from the terminal. If you have Go installed on your computer, just run go get. If you're on OS X, use Homebrew to install (no Go required). By default, pup will fill in missing tags and properly indent the page. CSS selectors have a group of specifiers called "pseudo classes" which are pretty cool. pup implements a majority of the relevant ones. When combining selectors, the HTML nodes selected by the previous selector are passed to the next one. ...
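
To make the "combining selectors" idea concrete, here is a minimal sketch in Python of how a space-separated selector chain can be evaluated: each step searches only the descendants of the nodes matched by the step before it. This is not pup's actual implementation (pup is written in Go). The names Node, TreeBuilder, matches, select and query are hypothetical, and the sketch assumes well-nested input and supports only tag, .class and #id selectors, with no pseudo classes.

```python
from html.parser import HTMLParser
import re


class Node:
    """A bare-bones element: tag name, attributes, children, direct text."""

    def __init__(self, tag, attrs=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Builds a tiny element tree. Assumes the input is well nested."""

    VOID = {"br", "hr", "img", "input", "link", "meta"}

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        if tag not in self.VOID:  # void elements never get a closing tag
            self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1 and self.stack[-1].tag == tag:
            self.stack.pop()

    def handle_data(self, data):
        self.stack[-1].text += data


def matches(node, simple):
    """Check one simple selector such as 'td', '.title' or 'td.title#x'."""
    m = re.fullmatch(r"([a-zA-Z][\w-]*)?((?:[.#][\w-]+)*)", simple)
    if not m:
        raise ValueError(f"unsupported selector: {simple!r}")
    tag, rest = m.group(1), m.group(2)
    if tag and node.tag != tag:
        return False
    classes = (node.attrs.get("class") or "").split()
    for kind, name in re.findall(r"([.#])([\w-]+)", rest):
        if kind == "." and name not in classes:
            return False
        if kind == "#" and node.attrs.get("id") != name:
            return False
    return True


def descendants(node):
    """Yield every element below node, depth first."""
    for child in node.children:
        yield child
        yield from descendants(child)


def select(nodes, simple):
    """Return descendants of the given nodes that match one simple selector."""
    out, seen = [], set()
    for n in nodes:
        for d in descendants(n):
            if id(d) not in seen and matches(d, simple):
                seen.add(id(d))
                out.append(d)
    return out


def query(html, selector):
    """Apply each space-separated selector to the previous step's result."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    nodes = [builder.root]
    for simple in selector.split():
        nodes = select(nodes, simple)
    return nodes


PAGE = """
<table>
  <tr><td class="title"><a href="/a">First</a></td></tr>
  <tr><td class="title"><a href="/b">Second</a></td></tr>
  <tr><td class="other"><a href="/c">Third</a></td></tr>
</table>
"""

links = query(PAGE, "table td.title a")
print([a.text for a in links])  # prints ['First', 'Second']
```

Here "table td.title a" first narrows the document to the table, then to the cells with class title inside it, then to the links inside those cells, so the link in the "other" cell is never considered.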