A university project - A document clustering software for an audit client with additional features. The main task of clustering takes documents in a directory as an input and outputs an Excel spreadsheet displaying clusters of documents, with each cluster containing documents that are similar to each other.

The search features take search terms as input by the user and a directory with documents as an input and outputs an Excel spreadsheet displaying all documents containing the search term and gives similar documents to these. The 2nd feature gives each sentence containing the search term from documents found.

The report generation feature specifically for use by audit companies takes an audit report as an input and outputs an insight log and draft management letter with insights pulled from the report. This feature can be customised to suit a company's requirements.

This software works with pdf, docx, txt and csv files and the zip file must be saved in "My Documents".

Project Samples

Project Activity

See All Activity >

License

Apache License V2.0

Follow Data Ninja

Data Ninja Web Site

You Might Also Like
Top-Rated Free CRM Software Icon
Top-Rated Free CRM Software

216,000+ customers in over 135 countries grow their businesses with HubSpot

HubSpot is an AI-powered customer platform with all the software, integrations, and resources you need to connect your marketing, sales, and customer service. HubSpot's connected platform enables you to grow your business faster by focusing on what matters most: your customers.
Rate This Project
Login To Rate This Project

User Ratings

★★★★★
★★★★
★★★
★★
0
1
0
0
0
ease 1 of 5 2 of 5 3 of 5 4 of 5 5 of 5 4 / 5
features 1 of 5 2 of 5 3 of 5 4 of 5 5 of 5 4 / 5
design 1 of 5 2 of 5 3 of 5 4 of 5 5 of 5 3 / 5
support 1 of 5 2 of 5 3 of 5 4 of 5 5 of 5 0 / 5

User Reviews

There are no 4 star reviews.

Additional Project Details

Operating Systems

Windows

User Interface

Tk

Programming Language

Python

Related Categories

Python Business Intelligence Software

Registered

2013-03-13