This classification problem involves classifying 20,000 messages into 20 different classes. The dataset can be found here: https://archive.ics.uci.edu/ml/datasets/Twenty+Newsgroups. Four machine learning algorithms (Naïve Bayes, Logistic Regression, Regularized Logistic Regression, and Support Vector Machine (SVM)) were implemented, and their accuracy on the training and test datasets was compared. Arguably, one of the most important aspects of solving this problem is having the data set in the appropriate format. Each of these algorithms requires its own data format; the specific format and how to reconstruct the entire dataset are illustrated in other sections below. Of all the methods, SVM using LIBSVM [1] produced the highest classification accuracy across the 20 classes. All of the algorithms were implemented in Matlab.
Download the code and Report here.
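
As an illustration of the data-format point above, here is a minimal sketch of turning one tokenized message into a line of LIBSVM's sparse text format (`<label> <index>:<value> ...`, with 1-based indices in ascending order and zero entries omitted). It is written in Python for brevity and is not the project's Matlab code; the function name `to_libsvm_line`, the toy vocabulary, and the use of raw word counts as feature values are all assumptions made for this example.

```python
from collections import Counter

def to_libsvm_line(label, tokens, vocab):
    """Format one document as a LIBSVM sparse line: '<label> <index>:<value> ...'."""
    # Count occurrences of each known word, keyed by its feature index.
    counts = Counter(vocab[t] for t in tokens if t in vocab)
    # LIBSVM expects feature indices in ascending order; zero entries are omitted.
    features = " ".join(f"{idx}:{counts[idx]}" for idx in sorted(counts))
    return f"{label} {features}".rstrip()

# Hypothetical toy vocabulary mapping words to 1-based feature indices.
vocab = {"space": 1, "orbit": 2, "hockey": 3}
line = to_libsvm_line(5, ["orbit", "space", "orbit", "nasa"], vocab)
print(line)  # 5 1:1 2:2
```

Words missing from the vocabulary (here "nasa") are simply skipped, which keeps the feature space fixed between the training and test sets.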

Additional Project Details

Registered

2016-01-21