DSTK - DataScience ToolKit is free, open-source software for statistical analysis, data visualization, text analysis, and predictive analytics. A newer version with a smaller file size is available at: https://sourceforge.net/projects/dstk3/

It is designed to be straightforward and easy to use, and familiar to SPSS users. While JASP offers more statistical features, DSTK aims to be a broad workbench that also covers text analysis and predictive analytics. You can still use JASP for advanced data editing and RapidMiner for advanced prediction modeling.

DSTK is written in C#, Java, and Python and interfaces with R, NLTK, and Weka. It can be extended with plugins written as R scripts. We have also created plugins for additional statistical functions and for Big Data analytics on Microsoft Azure HDInsight (Spark Server) through Livy.

Third-party licenses: R, RStudio, NLTK, SciPy, SKLearn, MatPlotLib, Weka, ... each have their own licenses.

Features

  • Data Scraping (Web Scraping, Video2Text, Image2Text)
  • Data and Text Preprocessing (with stemming, stopwords...)
  • Data Exploration and Visualizations (histogram, bar, pie, boxplot, ...)
  • Document Clustering
  • Text Analytics (Text Link Analysis, POS Tagging, Sentiment Analysis, ...)
  • Predictive Analytics (both numerical and text, Naive Bayes, with additional Weka add-ins)
  • Plugins with Big Data features (requires a Microsoft Azure account)
  • Expandable with Plugins using R Scripts
  • Text Explorer/Analytics uses GATE's Gazetteer .lst files and online university sentiment word lists
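
To make the text preprocessing and Naive Bayes items above more concrete, here is a minimal sketch in plain Python. It is not DSTK's implementation and does not use NLTK or Weka: the stopword list, the crude suffix stripper `crude_stem`, the `NaiveBayes` class, and the four training sentences are all hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, tiny stopword list for illustration only.
STOPWORDS = {"a", "an", "the", "is", "was", "this", "it", "and", "i", "of", "to"}


def crude_stem(token):
    """Strip one common English suffix (a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "ly", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, tokenize on letters, drop stopwords, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOPWORDS]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(preprocess(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            total = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for tok in preprocess(doc):
                score += math.log((counts[tok] + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training data, only to exercise the sketch.
train_docs = [
    "I loved this movie",
    "An amazing, touching film",
    "It was boring and slow",
    "I hated the dull plot",
]
train_labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(train_docs, train_labels)
```

Calling `model.predict("a boring, dull film")` runs the same preprocessing on the new text before scoring each class. A real workflow would swap in a proper stemmer and a far larger labeled corpus.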

License

GNU General Public License version 3.0 (GPLv3)

Additional Project Details

Operating Systems

Windows

Intended Audience

Engineering, Financial and Insurance Industry, Information Technology, Management, Non-Profit Organizations, Science/Research

User Interface

.NET/Mono

Programming Language

C#, Java, Python

Related Categories

C# Business Software, C# Business Intelligence Software, C# Machine Learning Software, C# Data Analytics Tool, C# Web Scrapers, Python Business Software, Python Business Intelligence Software, Python Machine Learning Software, Python Data Analytics Tool, Python Web Scrapers, Java Business Software, Java Business Intelligence Software, Java Machine Learning Software, Java Data Analytics Tool, Java Web Scrapers

Registered

2017-04-26