pup
Parsing HTML at the command line
...If you're on OS X, use Homebrew to install (no Go required).

By default pup will fill in missing tags and properly indent the page.

CSS selectors have a group of specifiers called "pseudo classes" which are pretty cool. pup implements a majority of the relevant ones.

When combining selectors, the HTML nodes selected by the previous selector will be passed to the next one.

Non-HTML selectors which affect the output type are implemented as functions which can be provided as a final argument. For example, one such function prints the values of all attributes with a given key from all selected nodes.
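The behaviors described above can be tried with a short script. This is a minimal sketch, assuming pup is on your PATH and reads HTML from standard input. The file name `sample.html` and its markup are made up for illustration, and `attr{href}` and `text{}` are used here as examples of the output functions passed as a final argument.

```shell
#!/bin/sh
# Hypothetical sample page, created only for this illustration.
cat > sample.html <<'EOF'
<html><body>
<ul>
<li><a href="/first">First</a></li>
<li><a href="/second">Second</a></li>
</ul>
</body></html>
EOF

if command -v pup >/dev/null 2>&1; then
  # No selector: pup fills in missing tags and re-indents the page.
  pup < sample.html

  # Combined selectors: nodes matched by `ul` are passed on to `li`, then to `a`.
  pup 'ul li a' < sample.html

  # Pseudo class: keep only list items that are the first child of their parent.
  pup 'li:first-child' < sample.html

  # Output function as the final argument: print the href value of every link.
  pup 'a attr{href}' < sample.html

  # Another output function: print the text inside each link instead of HTML.
  pup 'a text{}' < sample.html
else
  echo "pup is not installed; skipping the examples" >&2
fi
```

Quoting each selector in single quotes keeps the shell from interpreting characters such as `{`, `}`, and `:` before pup sees them.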