A university project: document clustering software built for an audit client, with additional search and report generation features. The main clustering task takes a directory of documents as input and outputs an Excel spreadsheet of clusters, where each cluster contains documents that are similar to one another.
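As a rough illustration of the clustering idea, the sketch below groups documents by TF-IDF cosine similarity using a simple greedy threshold rule. This is an assumption made for illustration only: the project's actual algorithm is not described here, and the names `tokenize`, `tfidf_vectors`, `cosine`, `cluster_documents` and the `threshold` value are hypothetical. Documents are passed as in-memory strings; reading files from a directory and writing the Excel spreadsheet are left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict of term -> weight) per document."""
    token_lists = {name: tokenize(text) for name, text in docs.items()}
    doc_freq = Counter()
    for tokens in token_lists.values():
        doc_freq.update(set(tokens))  # count each term once per document
    n_docs = len(docs)
    vectors = {}
    for name, tokens in token_lists.items():
        counts = Counter(tokens)
        vectors[name] = {
            # Smoothed inverse document frequency, so no weight is zero.
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_documents(docs, threshold=0.2):
    """Greedy clustering (hypothetical rule): a document joins the first
    cluster whose first member is at least `threshold` similar to it,
    otherwise it starts a new cluster."""
    vectors = tfidf_vectors(docs)
    clusters = []
    for name in docs:
        for cluster in clusters:
            if cosine(vectors[name], vectors[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Made-up sample documents, for illustration only.
example_docs = {
    "a.txt": "invoice payment invoice supplier payment",
    "b.txt": "supplier invoice payment overdue",
    "c.txt": "employee payroll salary employee",
    "d.txt": "payroll salary bonus employee",
}
clusters = cluster_documents(example_docs)
print(clusters)
```

With these sample documents, the two invoice files share no terms with the two payroll files, so they end up in separate clusters. A greedy rule like this depends on document order and on the chosen threshold, so it is only a starting point.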
The search features take a user-supplied search term and a directory of documents as input. The first feature outputs an Excel spreadsheet listing every document that contains the search term, together with documents similar to each match. The second feature lists each sentence containing the search term from the documents found.
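The sentence-level search can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `split_sentences` and `find_sentences` are hypothetical, the sentence splitter is deliberately naive, and whole-word, case-insensitive matching is an assumption. Documents are again passed as in-memory strings rather than read from a directory.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_sentences(docs, term):
    """Return {document name: [sentences containing the term]}.

    Only documents with at least one matching sentence are included.
    Matching is whole-word and case-insensitive (an assumption).
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    hits = {}
    for name, text in docs.items():
        matches = [s for s in split_sentences(text) if pattern.search(s)]
        if matches:
            hits[name] = matches
    return hits


# Made-up sample documents, for illustration only.
sample_docs = {
    "a.txt": "The invoice was late. Payment is due. Invoice totals differ!",
    "b.txt": "Nothing relevant here.",
}
results = find_sentences(sample_docs, "invoice")
print(results)
```

The keys of the returned dictionary give the list of matching documents (the first search feature), and the values give the matching sentences (the second). Abbreviations such as "e.g." would confuse this splitter, so a real implementation would likely need a more robust sentence tokenizer.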
The report generation feature, intended specifically for audit companies, takes an audit report as input and outputs an insight log and a draft management letter containing insights pulled from the report. This feature can be customised to suit a company's requirements.
The software works with PDF, DOCX, TXT and CSV files. The zip file must be saved in "My Documents".
This is a good program for clustering.