Redundancy-Aware Topic Modeling

Copy-paste redundancy, or data duplication, is prevalent in many corpora. This redundancy degrades the quality of text mining, and of topic modeling in particular. This software package implements Red-LDA, a novel variant of Latent Dirichlet Allocation (LDA) topic modeling that takes the inherent redundancy of a corpus into account when modeling its content.
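
To make "copy-paste redundancy" concrete, here is a minimal sketch of one common way to spot near-duplicate documents: Jaccard similarity over word 3-gram shingles. It is not Red-LDA and is not taken from this package; the function names, the sample documents, and the 0.5 threshold are all assumptions chosen for illustration.

```python
def shingles(text, n=3):
    """Return the set of word n-grams (shingles) in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity between two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def redundant_pairs(docs, threshold=0.5, n=3):
    """List (i, j, similarity) for document pairs at or above the threshold."""
    sets = [shingles(d, n) for d in docs]
    pairs = []
    for i in range(len(sets)):
        for j in range(i + 1, len(sets)):
            sim = jaccard(sets[i], sets[j])
            if sim >= threshold:
                pairs.append((i, j, sim))
    return pairs


# Hypothetical corpus: the first two documents are near copies of each other.
docs = [
    "the server was restarted after the nightly backup job failed twice",
    "the server was restarted after the nightly backup job failed once again",
    "a new dashboard was added to track disk usage per team",
]
pairs = redundant_pairs(docs)
for i, j, sim in pairs:
    print(f"documents {i} and {j} overlap with similarity {sim:.2f}")
```

The pairwise loop is quadratic in the number of documents, which is fine for a small illustration; a large corpus would likely need an approximate method such as MinHash instead.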

My site: http://www.cs.bgu.ac.il/~cohenrap/
Lab site: http://www.cs.bgu.ac.il/~nlpproj/

Sister project: http://sourceforge.net/projects/corpusredundanc/

Additional Project Details

Registered

2014-01-05