pdftable Code
Brought to you by: kylecronan
File | Date | Author | Commit |
---|---|---|---|
CHANGELOG | 2009-10-02 | kylecronan | [r1] import code into svn |
LICENSE | 2009-10-02 | kylecronan | [r1] import code into svn |
MANIFEST | 2009-10-02 | kylecronan | [r1] import code into svn |
README | 2009-10-02 | kylecronan | [r1] import code into svn |
pdftable | 2009-10-02 | kylecronan | [r1] import code into svn |
pdftable.py | 2009-10-02 | kylecronan | [r1] import code into svn |
setup.py | 2009-10-02 | kylecronan | [r1] import code into svn |
Python module and command line utility that analyzes XML output from the program pdftohtml in order to extract tables from PDF files. Outputs CSV.

For example:

    pdftohtml -xml -stdout file.pdf | pdftable -f file%d.csv

See also 'pdftable -h' and http://sourceforge.net/projects/pdftable

Author: Kyle Cronan <kyle@pbx.org>
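To give a rough idea of what "analyzing the XML output" of pdftohtml can involve, here is a minimal sketch in Python. It is not pdftable's actual algorithm or API. It assumes the XML consists of `<page>` elements containing `<text>` elements with integer `top` and `left` attributes, treats text boxes that share the same `top` value as one table row, and orders cells by `left`. The names `SAMPLE_XML`, `rows_from_xml`, and `to_csv` are hypothetical and exist only for this illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hand-written sample in the shape assumed for `pdftohtml -xml` output.
SAMPLE_XML = """<pdf2xml>
<page number="1" position="absolute" top="0" left="0" height="1188" width="918">
<text top="100" left="50" width="40" height="12" font="0">Name</text>
<text top="100" left="200" width="30" height="12" font="0">Qty</text>
<text top="120" left="200" width="10" height="12" font="0">3</text>
<text top="120" left="50" width="40" height="12" font="0">Apple</text>
</page>
</pdf2xml>"""


def rows_from_xml(xml_text):
    """Group text boxes into rows by (page, top); order cells by left."""
    root = ET.fromstring(xml_text)
    rows = {}
    for page in root.iter("page"):
        page_no = int(page.get("number", "0"))
        for text in page.iter("text"):
            key = (page_no, int(text.get("top")))
            # itertext() also collects text nested in <b>/<i> children.
            cell = "".join(text.itertext()).strip()
            rows.setdefault(key, []).append((int(text.get("left")), cell))
    # Rows top to bottom within each page, cells left to right in each row.
    return [[cell for _, cell in sorted(cells)]
            for _, cells in sorted(rows.items())]


def to_csv(rows):
    """Serialize a list of rows to a CSV string."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(rows_from_xml(SAMPLE_XML)), end="")
```

Exact matching on `top` is a deliberate simplification. Real PDFs often place cells of the same row a pixel or two apart, so a practical extractor would need some tolerance and a way to detect column boundaries.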