A document clustering system with search & report generation features
...The main clustering task takes a directory of documents as input and outputs an Excel spreadsheet of clusters, each containing documents that are similar to one another.
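The description above does not say how similarity is measured or how clusters are formed, so the following is only a minimal sketch under stated assumptions: TF-IDF term weights, cosine similarity, and a greedy single-pass grouping against each cluster's first document. The names `tokenize`, `tfidf_vectors`, `cosine`, `cluster_documents` and the `threshold` parameter are hypothetical, and the documents are passed in as a dict instead of being read from a directory. Writing the result to Excel is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(counts)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for c in counts.values() for term in c)
    vectors = {}
    for name, c in counts.items():
        vectors[name] = {
            # Smoothed IDF, so a term found in every document keeps a nonzero weight.
            term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, tf in c.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_documents(docs, threshold=0.3):
    """Greedy grouping (an assumption, not the tool's confirmed method).

    Each document joins the first cluster whose seed (first member) is at
    least `threshold` similar to it; otherwise it starts a new cluster.
    """
    vectors = tfidf_vectors(docs)
    clusters = []
    for name in sorted(vectors):
        for cluster in clusters:
            if cosine(vectors[name], vectors[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters
```

Each inner list returned by `cluster_documents` would correspond to one cluster in the spreadsheet. A greedy pass is order-dependent, which is why the names are sorted first; a real system might use a standard clustering algorithm instead.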
The search features take user-supplied search terms and a directory of documents as input. The first outputs an Excel spreadsheet listing every document that contains the search term, together with documents similar to each match. The second lists each sentence containing the search term from the documents found.
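The sentence-level search described above could be sketched as follows. This is an illustration only: the function names `split_sentences` and `find_sentences` are hypothetical, sentences are split with a naive punctuation rule, matching is assumed to be case-insensitive on whole words, and documents are passed in as a dict instead of being read from a directory.

```python
import re


def split_sentences(text):
    """Split text after '.', '!' or '?' followed by whitespace (a naive rule)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_sentences(docs, term):
    """Map each document name to the sentences that contain `term`.

    Matching is case-insensitive and on whole words, which assumes the
    term itself starts and ends with a letter or digit.
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    hits = {}
    for name, text in docs.items():
        matches = [s for s in split_sentences(text) if pattern.search(s)]
        if matches:
            hits[name] = matches
    return hits
```

Documents without a match are omitted from the result, so the keys of the returned dict also give the list of documents containing the term.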
The report generation feature, intended specifically for audit companies, takes an audit report as input and outputs an insight log and a draft management letter with insights pulled from the report. ...
Sync .twb files to Tableau Server (prevent report chaos!)
...To determine what has changed, it compares the last commit time in svn with the last update time on Tableau Server.
It can also run in 'nopublish' mode, which only reports what it would do without taking any action.
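The change check and the dry-run mode can be sketched roughly as below. This is not the tool's actual code: it is written for Python 3 instead of the Python 2.5 the tool targets, the names `plan_sync` and `sync` are hypothetical, and the timestamps are assumed to have already been fetched (from svn and from Tableau Server) into plain dicts. Treating a workbook that is missing from the server as needing publication is also an assumption. The `publish` argument stands in for whatever actually uploads a workbook, for example a wrapper around tabcmd.

```python
from datetime import datetime, timezone


def plan_sync(svn_commit_times, server_update_times):
    """Return the workbooks whose svn commit is newer than the server copy.

    Both arguments map workbook name -> timezone-aware datetime.
    Workbooks absent from the server are included (an assumption).
    """
    to_publish = []
    for name, committed in sorted(svn_commit_times.items()):
        updated = server_update_times.get(name)
        if updated is None or committed > updated:
            to_publish.append(name)
    return to_publish


def sync(svn_commit_times, server_update_times, publish, nopublish=False):
    """Publish out-of-date workbooks, or only report them in nopublish mode."""
    plan = plan_sync(svn_commit_times, server_update_times)
    for name in plan:
        if nopublish:
            # Dry run: describe the action without calling publish().
            print("would publish " + name)
        else:
            publish(name)
    return plan


if __name__ == "__main__":
    def at(hour):
        return datetime(2020, 1, 1, hour, tzinfo=timezone.utc)

    svn = {"sales.twb": at(12), "ops.twb": at(8)}
    server = {"sales.twb": at(10), "ops.twb": at(9)}
    sync(svn, server, publish=print, nopublish=True)
```

Keeping the comparison in `plan_sync` separate from the side effects in `sync` means the same plan drives both the normal and the 'nopublish' run, so the dry run cannot drift from what a real run would do.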
Requires tabcmd, Python 2.5, pysvn, minidom, dateutil, optparse