Redundancy-Aware Topic Modeling

Copy-paste redundancy, or data duplication, is prevalent in many corpora. This redundancy degrades the quality of text mining in general and of topic modeling in particular. This software package implements Red-LDA, a novel variant of Latent Dirichlet Allocation (LDA) topic modeling that takes the inherent redundancy of corpora into account when modeling content.

My site: http://www.cs.bgu.ac.il/~cohenrap/
Lab site: http://www.cs.bgu.ac.il/~nlpproj/

Sister project: http://sourceforge.net/projects/corpusredundanc/

Additional Project Details

Registered: 2014-01-05