This project classifies 20,000 messages into 20 different classes. The dataset can be found here: https://archive.ics.uci.edu/ml/datasets/Twenty+Newsgroups. Four machine learning algorithms were implemented: Naïve Bayes, Logistic Regression, Regularized Logistic Regression, and Support Vector Machine (SVM). Their accuracies on the training and test datasets were compared. Arguably, one of the most important aspects of solving this problem is putting the dataset into the appropriate format. Each of these algorithms expects its own data format; the specific formats and how to reconstruct the entire dataset are illustrated in other sections below. Of all the methods, SVM using LIBSVM [1] produced the highest classification accuracy across the 20 classes. All of the algorithms were implemented in Matlab.
Download the code and Report here.
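To make the data-format point concrete, here is a minimal sketch of how tokenized messages could be turned into the sparse text format that LIBSVM reads, where each line is a label followed by ascending, 1-based `index:value` pairs. It is written in Python for brevity rather than in the project's Matlab, and it is not taken from the project's code: the helper names `build_vocabulary` and `to_libsvm_line`, the toy documents, and the numeric labels are all assumptions made for illustration.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct token to a 1-based feature index (hypothetical helper)."""
    vocab = {}
    for tokens in documents:
        for tok in tokens:
            if tok not in vocab:
                vocab[tok] = len(vocab) + 1  # LIBSVM feature indices start at 1
    return vocab


def to_libsvm_line(label, tokens, vocab):
    """Format one document as '<label> <index>:<count> ...' with ascending indices."""
    counts = Counter(vocab[t] for t in tokens if t in vocab)
    features = " ".join(f"{idx}:{cnt}" for idx, cnt in sorted(counts.items()))
    return f"{label} {features}".rstrip()


# Toy data: two already-tokenized messages with made-up class labels.
docs = [["space", "orbit", "space"], ["hockey", "goal"]]
labels = [15, 11]

vocab = build_vocabulary(docs)
lines = [to_libsvm_line(y, d, vocab) for y, d in zip(labels, docs)]
for line in lines:
    print(line)
```

Only non-zero word counts are written, which keeps the file small for a large vocabulary. Dense-matrix methods such as logistic regression would instead need one column per vocabulary entry, which is one reason the format has to be chosen per algorithm.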

Additional Project Details

Registered

2016-01-21