TF-IDF.jar is a Java archive (JAR) that computes the TF-IDF weights of each document
in a document collection (corpus).
The jar can be used to
(a) get all the terms in the corpus
(b) get the document frequency (DF) and inverse document frequency (IDF) of
all the terms in the corpus
(c) get the TF-IDF of each document in the corpus
(d) get each term with its frequency (number of occurrences), term frequency (TF) and TF-IDF in every document
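
The quantities listed in (a) to (d) can be illustrated with a minimal Java sketch. It is not the code shipped in TF-IDF.jar: the class and method names are hypothetical, and the weighting variant is an assumption (TF = occurrences / document length, IDF = ln(N / DF)), since this README does not state which formulas the jar uses.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not the API of TF-IDF.jar.
class TfIdfSketch {

    // (a)/(b) All terms in the corpus with their document frequency (DF):
    // the number of documents that contain the term at least once.
    static Map<String, Integer> documentFrequency(List<List<String>> corpus) {
        Map<String, Integer> df = new TreeMap<>();
        for (List<String> doc : corpus) {
            for (String term : new HashSet<>(doc)) {
                df.merge(term, 1, Integer::sum);
            }
        }
        return df;
    }

    // Assumed IDF variant: natural logarithm of (corpus size / DF).
    static double idf(int corpusSize, int df) {
        return Math.log((double) corpusSize / df);
    }

    // (d) Number of occurrences of each term in one document.
    static Map<String, Integer> counts(List<String> doc) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String term : doc) {
            counts.merge(term, 1, Integer::sum);
        }
        return counts;
    }

    // (c)/(d) TF-IDF of each term in one document.
    // Assumed TF variant: occurrences divided by document length.
    static Map<String, Double> tfIdf(List<String> doc, List<List<String>> corpus) {
        Map<String, Integer> df = documentFrequency(corpus);
        Map<String, Double> weights = new TreeMap<>();
        for (Map.Entry<String, Integer> e : counts(doc).entrySet()) {
            double tf = (double) e.getValue() / doc.size();
            weights.put(e.getKey(), tf * idf(corpus.size(), df.get(e.getKey())));
        }
        return weights;
    }

    public static void main(String[] args) {
        // Toy corpus of two pre-tokenized documents.
        List<List<String>> corpus = List.of(
                List.of("apple", "banana", "apple"),
                List.of("banana", "cherry"));
        System.out.println("DF: " + documentFrequency(corpus));
        for (List<String> doc : corpus) {
            System.out.println(doc + " -> " + tfIdf(doc, corpus));
        }
    }
}
```

With this variant, a term that occurs in every document gets an IDF of 0 and therefore a TF-IDF of 0, which is why smoothed IDF formulas are often used in practice.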
Generator for textual models by applying different techniques
This is a project created and supported by:
Angel Castellanos
Juan Cigarrán Recuero
Ana García Serrano
This project models textual content by applying different techniques:
TF-IDF
KLD
Mutual Information
Chi^2
With this application, users can extract the most representative terminology of a text collection.
The application is Java-based, so it runs on several platforms and operating systems (Windows, Linux, macOS).
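
As an illustration of how one of the listed techniques can rank terminology, the following Java sketch scores each term of a document by its pointwise contribution to the Kullback-Leibler divergence (KLD) between the document and the whole collection: p(t|d) * ln(p(t|d) / p(t|C)). This is only a sketch under assumptions: the class and method names are hypothetical, probabilities are unsmoothed relative frequencies, and the formulation actually used by the project may differ.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the project's actual implementation.
class KldSketch {

    // Scores every term of `doc` against `collection`.
    // Assumption: `doc` is itself part of `collection`, so every term of
    // the document has a non-zero collection probability.
    static Map<String, Double> kldScores(List<String> doc, List<List<String>> collection) {
        // Term counts inside the document.
        Map<String, Integer> docCounts = new HashMap<>();
        for (String term : doc) {
            docCounts.merge(term, 1, Integer::sum);
        }

        // Term counts and total length over the whole collection.
        Map<String, Integer> collCounts = new HashMap<>();
        int collLength = 0;
        for (List<String> d : collection) {
            for (String term : d) {
                collCounts.merge(term, 1, Integer::sum);
            }
            collLength += d.size();
        }

        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, Integer> e : docCounts.entrySet()) {
            double pDoc = (double) e.getValue() / doc.size();
            double pColl = (double) collCounts.get(e.getKey()) / collLength;
            // Pointwise KLD contribution of this term.
            scores.put(e.getKey(), pDoc * Math.log(pDoc / pColl));
        }
        return scores;
    }

    public static void main(String[] args) {
        List<List<String>> collection = List.of(
                List.of("apple", "banana", "apple"),
                List.of("banana", "cherry"));
        System.out.println(kldScores(collection.get(0), collection));
    }
}
```

Terms that are relatively more frequent in the document than in the collection receive positive scores, so sorting by score in descending order gives a candidate list of representative terms.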