Add Python versions of OCR post-processing scripts
Remove unnecessary test data
Add OCR post-processing scripts