Unsupervised TXT classifier

Classify any two TXT documents with no prior training required (Java)


This program addresses two of the most common problems with well-known classification algorithms: over-training, and a shortage of data for training categories. Rather than assigning documents to predefined categories, it treats each TXT file as a category in its own right. In a way this resembles clustering, but it is not strictly a clustering algorithm because some training is involved. The summarizer from Classifier4J has been adapted to accept two inputs (let's call them A and B). The summarizer is trained with A to summarize document B, and vice versa. This extracts the relevant structure of both documents (which avoids over-training). The two results are then compared using vector-space analysis to give a degree of belonging of one document to the other (which avoids the shortage of data). The method can also be used to create user-defined classes by merging the texts of certain categories and then calculating the relevant distances between the documents, but this is not necessary.
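
The procedure described above can be sketched in plain Java as follows. This is an illustrative approximation only: it is not the project's code and does not call the Classifier4J API. The class name CrossSummarySketch, its method names and the parameter k (how many of the trainer's most frequent words to keep) are all hypothetical. "Training" is reduced here to collecting the k most frequent words of one document, "summarizing" to keeping the sentences of the other document that contain any of those words, and the vector-space comparison to cosine similarity over raw term counts. No stop-word filtering is done, so on real text very common words would dominate unless they are removed first.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the cross-summarise-then-compare idea; not the project's code.
class CrossSummarySketch {

    // Lower-case word tokens; any run of non-letter, non-digit characters separates words.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^\\p{L}\\p{Nd}]+")) {
            if (!t.isEmpty()) out.add(t);
        }
        return out;
    }

    // Raw term counts: the document as a sparse vector.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String t : tokens(text)) tf.merge(t, 1, Integer::sum);
        return tf;
    }

    // "Training" (assumption): the k most frequent words of the trainer document.
    static Set<String> topWords(String trainer, int k) {
        List<Map.Entry<String, Integer>> entries =
                new ArrayList<>(termFrequencies(trainer).entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        Set<String> top = new HashSet<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) top.add(entries.get(i).getKey());
        return top;
    }

    // "Summarising" (assumption): keep the target's sentences that mention a trainer keyword.
    static String summarise(String trainer, String target, int k) {
        Set<String> keywords = topWords(trainer, k);
        StringBuilder summary = new StringBuilder();
        for (String sentence : target.split("(?<=[.!?])\\s+")) {
            for (String t : tokens(sentence)) {
                if (keywords.contains(t)) {
                    summary.append(sentence).append(' ');
                    break;
                }
            }
        }
        return summary.toString();
    }

    // Cosine similarity between two term-count vectors; 0.0 if either is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Integer other = b.get(e.getKey());
            if (other != null) dot += e.getValue() * other;
        }
        for (int v : b.values()) normB += v * v;
        if (normA == 0 || normB == 0) return 0.0;
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Summarise B using A, summarise A using B, then compare the two summaries.
    static double similarity(String a, String b, int k) {
        String bGivenA = summarise(a, b, k);
        String aGivenB = summarise(b, a, k);
        return cosine(termFrequencies(aGivenB), termFrequencies(bGivenA));
    }

    public static void main(String[] args) {
        String a = "Cats chase mice. Mice eat cheese. Cats sleep all day.";
        String b = "Mice fear cats. Cheese attracts mice. Dogs bark loudly.";
        System.out.println("similarity(A, B) = " + similarity(a, b, 5));
    }
}
```

In this sketch the score lies between 0 and 1: documents that share none of each other's frequent words produce empty summaries and score 0, while identical documents score 1. User-defined classes, as mentioned above, could be approximated by concatenating the texts of a category into one string before calling similarity.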


Additional Project Details

Intended Audience

Education, Developers, Testers

Programming Language

Java
