DSTK - DataScience ToolKit is free, open-source software for statistical analysis, data visualization, text analysis, and predictive analytics. A newer version with a smaller file size is available at: https://sourceforge.net/projects/dstk3/

It is designed to be straightforward and easy to use, and familiar to SPSS users. While JASP offers more statistical features, DSTK aims to be a broader workbench that also covers text analysis and predictive analytics. You can still turn to JASP for advanced data editing and to RapidMiner for advanced prediction modeling.

DSTK is written in C#, Java, and Python, and interfaces with R, NLTK, and Weka. It can be extended with plugins written as R scripts. We have also created plugins for additional statistical functions and for Big Data analytics on Microsoft Azure HDInsight (Spark server) via Livy.

License: R, RStudio, NLTK, SciPy, scikit-learn, Matplotlib, Weka, ... each have their own license.

Features

  • Data Scraping (Web Scraping, Video2Text, Image2Text)
  • Data and Text Preprocessing (stemming, stop words, ...)
  • Data Exploration and Visualizations (histogram, bar, pie, boxplot, ...)
  • Document Clustering
  • Text Analytics (Text Link Analysis, POS Tagging, Sentiment Analysis, ...)
  • Predictive Analytics (numerical and text data; Naive Bayes, with additional Weka add-ins)
  • Plugins with Big Data features (requires a Microsoft Azure account)
  • Expandable with Plugins using R Scripts
  • Text Explorer/Analytics uses GATE's Gazetteer .lst files and sentiment word lists published online by universities

License

GNU General Public License version 3.0 (GPLv3)

Additional Project Details

Operating Systems

Windows

Intended Audience

Non-Profit Organizations, Information Technology, Financial and Insurance Industry, Science/Research, Management, Engineering

User Interface

.NET/Mono

Programming Language

C#, Python, Java

Related Categories

C# Business Software, C# Business Intelligence Software, C# Machine Learning Software, C# Data Analytics Tool, C# Web Scrapers, Python Business Software, Python Business Intelligence Software, Python Machine Learning Software, Python Data Analytics Tool, Python Web Scrapers, Java Business Software, Java Business Intelligence Software, Java Machine Learning Software, Java Data Analytics Tool, Java Web Scrapers

Registered

2017-04-26