Document txt classifier with no training

Classify any two TXT documents, no training required - JAVA

Description

This program addresses two common problems with established classification algorithms: over-training, and a shortage of training data for the categories. Rather than assigning documents to predefined categories, it treats each TXT file as a category of its own. This resembles clustering, but it is not strictly a clustering algorithm, because some training is involved. The summarizer from Classifier4J has been adapted to accept two inputs (call them A and B). The summarizer is trained on A to summarize document B, and vice versa. This extracts the relevant structure of both documents, which avoids over-training. The two results are then compared using vector-space analysis to give a degree of belonging of one document to the other, which avoids the shortage of training data. The method can also be used to build user-defined classes by merging the texts of a given category and then calculating the distances between documents, but this step is optional.
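
The vector-space comparison step can be illustrated with a minimal sketch in plain Java. It is not the project's source code and does not use the Classifier4J API: the class name VectorSpaceSketch and its methods are hypothetical, and the sketch assumes the comparison is a cosine similarity over raw term counts. The summarization step (training on A to summarize B, and vice versa) is not reproduced here; in the described method the two inputs would be the extracted summaries rather than the full texts.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Classifier4J
// or of this project's source code.
final class VectorSpaceSketch {

    // Builds a term-frequency vector: lower-cased word -> number of occurrences.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> counts = new HashMap<>();
        // Split on anything that is not a letter or a decimal digit.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine similarity of two term-frequency vectors, in the range [0, 1].
    // Returns 0 when either vector is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            Integer other = b.get(entry.getKey());
            if (other != null) {
                dot += entry.getValue() * (double) other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (normA * normB);
    }

    // Euclidean length of a term-frequency vector.
    private static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0.0;
        for (int count : vector.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Convenience wrapper: similarity of two texts (assumed here to be the
    // summaries of B given A and of A given B).
    static double similarity(String textA, String textB) {
        return cosine(termFrequencies(textA), termFrequencies(textB));
    }
}
```

Calling VectorSpaceSketch.similarity(summaryA, summaryB) gives a value close to 1 for texts that share most of their vocabulary and 0 for texts with no words in common, which is one way to read the "degree of belonging" mentioned above.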

Additional Project Details

Intended Audience

Developers, Education, Testers

Programming Language

Java

Registered

2013-12-18