DSTK - DataScience ToolKit is free, open-source software for statistical analysis, data visualization, text analysis, and predictive analytics. A newer version with a smaller file size is available at: https://sourceforge.net/projects/dstk3/

It is designed to be straightforward and easy to use, and familiar to SPSS users. While JASP offers more statistical features, DSTK aims to be a broader workbench that also covers text analysis and predictive analytics. You can still use JASP for advanced data editing and RapidMiner for advanced predictive modeling.

DSTK is written in C#, Java, and Python, and interfaces with R, NLTK, and Weka. It can be extended with plugins written as R scripts. We have also created plugins for additional statistical functions and for Big Data analytics on Microsoft Azure HDInsight (Spark server) via Livy.

License: R, RStudio, NLTK, SciPy, SKLearn, MatPlotLib, Weka, ... each have their own license.

Features

  • Data Scraping (Web Scraping, Video2Text, Image2Text)
  • Data and Text Preprocessing (stemming, stopword removal, ...)
  • Data Exploration and Visualizations (histogram, bar, pie, boxplot, ...)
  • Document Clustering
  • Text Analytics (Text Link Analysis, POS Tagging, Sentiment Analysis, ...)
  • Predictive Analytics (numerical and text data, Naive Bayes, with additional Weka add-ins)
  • Plugins with Big Data features (requires a Microsoft Azure account)
  • Extensible with plugins written as R scripts
  • Text Explorer/Analytics uses GATE's Gazetteer .lst files and online university sentiment word lists
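
To make the text preprocessing and Naive Bayes features above more concrete, here is a minimal, self-contained Python sketch of the general technique. It is not DSTK's own code: DSTK relies on NLTK and Weka for these steps, whereas this sketch uses only the standard library. The stopword list, the `crude_stem` suffix stripper (a stand-in for a real stemmer), the training sentences, and all function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, tiny stopword list for illustration only.
STOPWORDS = {"a", "an", "the", "is", "was", "it", "this", "and", "of", "to", "i"}


def crude_stem(token):
    """Strip a few common English suffixes (a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, tokenize on letters, drop stopwords, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOPWORDS]


def train(docs):
    """Count documents per label and stemmed tokens per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        label_counts[label] += 1
        for tok in preprocess(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    tokens = [t for t in preprocess(text) if t in vocab]  # skip unseen words
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training data, for illustration only.
TRAINING_DOCS = [
    ("I loved this great movie", "pos"),
    ("great acting and a wonderful plot", "pos"),
    ("boring and terrible film", "neg"),
    ("I hated the terrible ending", "neg"),
]

model = train(TRAINING_DOCS)
print(predict(model, "a great wonderful movie"))
```

In practice a real stemmer and a curated stopword list would replace the toy versions here; the sketch only shows how preprocessing feeds into the classifier.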

License

GNU General Public License version 3.0 (GPLv3)

Additional Project Details

Operating Systems

Windows

Intended Audience

Engineering, Financial and Insurance Industry, Information Technology, Management, Non-Profit Organizations, Science/Research

User Interface

.NET/Mono

Programming Language

C#, Java, Python

Registered

2017-04-26