DSTK - DataScience ToolKit is free, open-source software for statistical analysis, data visualization, text analysis, and predictive analytics. A newer version with a smaller file size is available at: https://sourceforge.net/projects/dstk3/

It is designed to be straightforward, easy to use, and familiar to SPSS users. While JASP offers more statistical features, DSTK aims to be a broader workbench that also covers text analysis and predictive analytics. Of course, you may specify JASP for advanced data editing and RapidMiner for advanced prediction modeling.

DSTK is written in C#, Java, and Python, and interfaces with R, NLTK, and Weka. It can be extended with plugins written as R scripts. We have also created plugins for additional statistical functions and for Big Data analytics on Microsoft Azure HDInsight (Spark Server) through Livy.
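To illustrate what talking to a Spark cluster through Livy can look like, here is a minimal Python sketch that builds (but does not send) a request submitting code to an existing Livy session over its REST API. It is not taken from the DSTK plugins. The function name, the cluster URL, and the `X-Requested-By` value are illustrative assumptions, and the sketch assumes a Livy version that accepts a `kind` field on statements.

```python
import json
import urllib.request


def build_livy_statement_request(base_url, session_id, code, kind="pyspark"):
    """Build (but do not send) a POST that submits code to a Livy session.

    Hypothetical helper for illustration; not part of DSTK.
    """
    url = f"{base_url.rstrip('/')}/sessions/{session_id}/statements"
    payload = json.dumps({"code": code, "kind": kind}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Livy can reject POSTs without this header when CSRF protection is on.
        "X-Requested-By": "dstk-example",
    }
    return urllib.request.Request(url, data=payload, headers=headers, method="POST")


# Hypothetical cluster URL; replace it with a real Livy endpoint before sending.
request = build_livy_statement_request(
    "https://example-cluster.azurehdinsight.net/livy",
    0,
    "sc.parallelize(range(10)).sum()",
)
print(request.get_method(), request.full_url)
```

Sending the request would additionally require credentials for the cluster, for example with `urllib.request.urlopen` and an authentication handler.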

License: R, RStudio, NLTK, SciPy, SKLearn, MatPlotLib, Weka, ... each have their own licenses.

Features

  • Data Scraping (Web Scraping, Video2Text, Image2Text)
  • Data and Text Preprocessing (stemming, stopword removal, ...)
  • Data Exploration and Visualizations (histogram, bar, pie, boxplot, ...)
  • Document Clustering
  • Text Analytics (Text Link Analysis, POS Tagging, Sentiment Analysis, ...)
  • Predictive Analytics (both numerical and text, Naive Bayes, with additional Weka add-ins)
  • Plugins with Big Data features (requires a Microsoft Azure account)
  • Expandable with Plugins using R Scripts
  • Text Explorer/Analytics uses GATE's Gazetteer .lst files and online university sentiment word lists
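
As a rough illustration of two of the features above (text preprocessing with stopwords and stemming, followed by Naive Bayes text prediction), here is a small self-contained Python sketch. It is not DSTK's implementation, which relies on NLTK and Weka. The stopword list, the suffix-stripping `crude_stem` function, the `NaiveBayesText` class, and the training sentences are all simplified assumptions made up for this example.

```python
import math
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "the", "is", "was", "this", "it", "and", "i", "to", "of"}


def crude_stem(word):
    """Strip a few common English suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "ly", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, tokenize, drop stopwords, and stem the remaining tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOPWORDS]


class NaiveBayesText:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(preprocess(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        # Words never seen in training carry no information, so skip them.
        tokens = [t for t in preprocess(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            score = math.log(doc_count / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training data, for illustration only.
train_texts = [
    "I loved this movie, great acting",
    "Wonderful and enjoyable film",
    "Terrible plot, boring and slow",
    "I hated the awful ending",
]
train_labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayesText().fit(train_texts, train_labels)
print(preprocess("The cats jumped"))
print(model.predict("great enjoyable acting"))
```

A production workflow would normally use an established stemmer and stopword list (for example from NLTK) and a tested classifier rather than this hand-rolled version.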

License

GNU General Public License version 3.0 (GPLv3)

Additional Project Details

Operating Systems

Windows

Intended Audience

Engineering, Financial and Insurance Industry, Information Technology, Management, Non-Profit Organizations, Science/Research

User Interface

.NET/Mono

Programming Language

C#, Java, Python

Related Categories

C# Business Software, C# Business Intelligence Software, C# Machine Learning Software, C# Data Analytics Tool, C# Web Scrapers, Python Business Software, Python Business Intelligence Software, Python Machine Learning Software, Python Data Analytics Tool, Python Web Scrapers, Java Business Software, Java Business Intelligence Software, Java Machine Learning Software, Java Data Analytics Tool, Java Web Scrapers

Registered

2017-04-26