Unsupervised TXT classifier - Browse Files at SourceForge.net

Name	Modified	Size	Downloads / Week
README.txt	2013-12-19	2.0 kB	0
classifier.rar	2013-12-19	229.7 kB	0
classifier.zip	2013-12-19	249.6 kB	1
Totals: 3 Items		481.3 kB	1

THIS IS HOW TO USE THE PROGRAM. IT IS VERY SIMPLE AND SELF-EXPLANATORY:


import classifier.*; //you must import this

public class usage {
	public static void main(String[] args) {

		System.out.println("Lets test the first half of Alice in Wonderland with the second half of the same novel");
		System.out.println("The similarity is: " 
				+ new classify().classifyDocuments("aliceP1.txt", "aliceP2.txt", 60)); //this is how to use the classifier
		System.out.println("===================================");
		System.out.println("Lets test the first half of Alice with the unrelated text, lets say, The Adventures of Sherlock Holmes");
		System.out.println("The similarity is: " 
				+ new classify().classifyDocuments("aliceP1.txt", "holmes.txt", 60)); 
		System.out.println("===================================");
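		// NOTE: this call passes "aliceP1.txt" again, so it repeats the previous comparison;
		// pass "aliceP2.txt" instead to actually test the second half of Alice.
		System.out.println("Lets test the second half of Alice with the unrelated text, lets say, The Adventures of Sherlock Holmes");
		System.out.println("The similarity is: "
				+ new classify().classifyDocuments("aliceP1.txt", "holmes.txt", 60));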
		System.out.println("Please feel free to write any feedback to olejar.damir@gmail.com");
	}

}

THIS IS THE PROGRAM OUTPUT:
Lets test the first half of Alice in Wonderland with the second half of the same novel
The similarity is: 0.8959410444250319
===================================
Lets test the first half of Alice with the unrelated text, lets say, The Adventures of Sherlock Holmes
The similarity is: 0.37617940779272774
===================================
Lets test the second half of Alice with the unrelated text, lets say, The Adventures of Sherlock Holmes
The similarity is: 0.37617940779272774
Please feel free to write any feedback to olejar.damir@gmail.com


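As you can see, the first and second halves of Alice in Wonderland scored a similarity of about 0.90 (90%), while both comparisons against The Adventures of Sherlock Holmes scored about 0.38 (38%). The two Holmes results are identical because both calls in the example pass aliceP1.txt. A high score for the related pair and a low score for the unrelated pair was the expected outcome.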
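The README does not say how the similarity score is computed. To give a feel for what a score between 0 and 1 for two texts can mean, here is a minimal sketch of one common unsupervised measure, cosine similarity over word counts. It is only an illustration and is NOT the algorithm used by classify.classifyDocuments. The class name SimilaritySketch, its method names, and the short sample strings are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Illustrative only: this is NOT the algorithm used by the classifier package.
final class SimilaritySketch {

    // Count how often each lowercase word occurs in the text.
    static Map<String, Integer> wordCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Euclidean length of a word-count vector.
    static double norm(Map<String, Integer> counts) {
        double sum = 0.0;
        for (int c : counts.values()) {
            sum += (double) c * c;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity of the two word-count vectors.
    // Returns 0.0 when either text contains no words.
    static double cosineSimilarity(String a, String b) {
        Map<String, Integer> countsA = wordCounts(a);
        Map<String, Integer> countsB = wordCounts(b);
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : countsA.entrySet()) {
            Integer other = countsB.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        double normA = norm(countsA);
        double normB = norm(countsB);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (normA * normB);
    }

    public static void main(String[] args) {
        // Short sample strings stand in for the full text files.
        String first = "Alice was beginning to get very tired";
        String second = "Alice was getting very tired of sitting";
        String other = "To Sherlock Holmes she is always the woman";
        System.out.println("related:   " + cosineSimilarity(first, second));
        System.out.println("unrelated: " + cosineSimilarity(first, other));
    }
}
```

With this measure, texts that share much of their vocabulary score closer to 1 and texts with little overlap score closer to 0, which matches the general pattern in the output above. The actual numbers will differ from those produced by the classifier.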

Source: README.txt, updated 2013-12-19

Unsupervised TXT classifier Files

Classify any two TXT documents, no training required - JAVA
