docconv
Converts PDF, DOC, DOCX, XML, HTML, RTF, etc. to plain text
...To add image support to the docconv library, first install and build gosseract. You can then add -tags ocr to any go command when building, fetching, or testing docconv to include support for processing images.

Documents can be sent as a multipart POST request; the plain text (body) and meta information are then returned as a JSON object.
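The multipart exchange described above can be sketched as a small Go client. This is a minimal illustration, not the project's own client. The server address http://localhost:8888/convert, the form field name input, and the JSON keys body and meta are all assumptions made for the example, so check them against the server you actually run.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
	"os"
	"path/filepath"
)

// result mirrors the two parts of the JSON reply mentioned in the text.
// The key names "body" and "meta" are assumptions for this sketch.
type result struct {
	Body string                 `json:"body"`
	Meta map[string]interface{} `json:"meta"`
}

// newConvertRequest wraps content in a multipart/form-data POST request.
// The field name is passed in by the caller because it is server specific.
func newConvertRequest(url, field, filename string, content []byte) (*http.Request, error) {
	var buf bytes.Buffer
	w := multipart.NewWriter(&buf)
	part, err := w.CreateFormFile(field, filename)
	if err != nil {
		return nil, err
	}
	if _, err := part.Write(content); err != nil {
		return nil, err
	}
	// Close writes the trailing boundary; it must happen before sending.
	if err := w.Close(); err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, url, &buf)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", w.FormDataContentType())
	return req, nil
}

// decodeResult parses the JSON reply into a result value.
func decodeResult(data []byte) (result, error) {
	var r result
	err := json.Unmarshal(data, &r)
	return r, err
}

func fail(err error) {
	fmt.Fprintln(os.Stderr, err)
	os.Exit(1)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: client <file>")
		return
	}
	content, err := os.ReadFile(os.Args[1])
	if err != nil {
		fail(err)
	}
	// Hypothetical endpoint and field name; adjust to your server.
	req, err := newConvertRequest("http://localhost:8888/convert", "input",
		filepath.Base(os.Args[1]), content)
	if err != nil {
		fail(err)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fail(err)
	}
	defer resp.Body.Close()
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		fail(err)
	}
	r, err := decodeResult(data)
	if err != nil {
		fail(err)
	}
	fmt.Println(r.Body)
}
```

Building the request and decoding the reply are kept in separate functions so each can be checked without a running server. Meta is decoded into a generic map because the text does not say which keys the server returns.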