Data Science Tools for Windows

View 30 business solutions

Browse free open source Data Science tools and projects for Windows below. Use the toggles on the left to filter open source Data Science tools by OS, license, language, programming language, and project status.

  • Secure File Transfer for Windows with Cerberus by Redwood Icon
    Secure File Transfer for Windows with Cerberus by Redwood

    Protect and share files over FTP/S, SFTP, HTTPS and SCP with the #1 rated Windows file transfer server.

    Cerberus supports unlimited users and connections on a single IP, with built-in encryption, 2FA, and a browser-based web client — all deployable in under 15 minutes with a 25-day free trial.
    Try for Free
  • Go From AI Idea to AI App Fast Icon
    Go From AI Idea to AI App Fast

    One platform to build, fine-tune, and deploy ML models. No MLOps team required.

    Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.
    Try Free
  • 1
    RStudio

    RStudio

    RStudio is an integrated development environment (IDE) for R

    RStudio is a powerful, full-featured integrated development environment (IDE) tailored primarily for the R programming language but increasingly supportive of other languages like Python and Julia. It brings together console, editor, plotting, workspace, history, and file-management panes into a unified interface, helping data scientists, statisticians, and analysts to work more productively. The IDE is cross-platform: there are desktop versions for Windows, macOS and Linux, as well as a server version for remote or multi-user deployment via a web browser. In addition to code editing and execution, RStudio offers extensive support for reproducible research via R Markdown, notebooks, and integration with version control systems like Git and SVN. Package development is built in, with tooling for building, checking, and testing R packages, plus integration with documentation tools, CRAN submission workflows, and project templates.
    Downloads: 88 This Week
    Last Update:
    See Project
  • 2
    CSAPP-Labs

    CSAPP-Labs

    Solutions and Notes for Labs of Computer Systems

    CSAPP-Labs is a repository that organizes and provides practical lab exercises corresponding to the famous textbook Computer Systems: A Programmer’s Perspective (CS:APP), helping students deepen their understanding of how computer systems work at the machine level. The exercises cover core topics such as data representation, assembly language, processor architecture, cache behavior, memory hierarchy, linking, and concurrency, contextualizing abstract concepts from the book in real code and experiments. Each lab is structured to include test programs, Makefiles, harnesses, and step-by-step instructions that guide students through hands-on interaction with low-level programming and system behavior. By actually building and debugging code that runs close to hardware, learners acquire intuition about performance trade-offs, bit-level manipulation, stack frame layout, and how compilers and OS features influence execution.
    Downloads: 17 This Week
    Last Update:
    See Project
  • 3
    Quadratic

    Quadratic

    Data science spreadsheet with Python & SQL

    Quadratic enables your team to work together on data analysis to deliver better results, faster. You already know how to use a spreadsheet, but you’ve never had this much power before. Quadratic is a Web-based spreadsheet application that runs in the browser and as a native app (via Electron). Our goal is to build a spreadsheet that enables you to pull your data from its source (SaaS, Database, CSV, API, etc) and then work with that data using the most popular data science tools today (Python, Pandas, SQL, JS, Excel Formulas, etc). Quadratic has no environment to configure. The grid runs entirely in the browser with no backend service. This makes our grids completely portable and very easy to share. Quadratic has Python library support built-in. Bring the latest open-source tools directly to your spreadsheet. Quickly write code and see the output in full detail. No more squinting into a tiny terminal to see your data output.
    Downloads: 14 This Week
    Last Update:
    See Project
  • 4
    ggplot2

    ggplot2

    An implementation of the Grammar of Graphics in R

    ggplot2 is a system written in R for declaratively creating graphics. It is based on The Grammar of Graphics, which focuses on following a layered approach to describe and construct visualizations or graphics in a structured manner. With ggplot2 you simply provide the data, tell ggplot2 how to map variables to aesthetics, what graphical primitives to use, and it will take care of the rest. ggplot2 is over 10 years old and is used by hundreds of thousands of people all over the world for plotting. In most cases using ggplot2 starts with supplying a dataset and aesthetic mapping (with aes()); adding on layers (like geom_point() or geom_histogram()), scales (like scale_colour_brewer()), and faceting specifications (like facet_wrap()); and finally, coordinating systems. ggplot2 has a rich ecosystem of community-maintained extensions for those looking for more innovation. ggplot2 is a part of the tidyverse, an ecosystem of R packages designed for data science.
    Downloads: 11 This Week
    Last Update:
    See Project
  • Streamline Azure Security with Palo Alto Networks VM-Series Icon
    Streamline Azure Security with Palo Alto Networks VM-Series

    Centrally manage physical and virtualized firewalls with Panorama

    Improve your security posture and reduce incident response time. Use the VM-Series to natively analyze Azure traffic and dynamically drive policy updates based on workload changes.
    Learn more
  • 5
    Nuclio

    Nuclio

    High-Performance Serverless event and data processing platform

    Nuclio is an open source and managed serverless platform used to minimize development and maintenance overhead and automate the deployment of data-science-based applications. Real-time performance running up to 400,000 function invocations per second. Portable across low laptops, edge, on-prem and multi-cloud deployments. The first serverless platform supporting GPUs for optimized utilization and sharing. Automated deployment to production in a few clicks from Jupyter notebook. Deploy one of the example serverless functions or write your own. The dashboard, when running outside an orchestration platform (e.g. Kubernetes or Swarm), will simply be deployed to the local docker daemon. The Getting Started With Nuclio On Kubernetes guide has a complete step-by-step guide to using Nuclio serverless functions over Kubernetes.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 6
    Data Science Specialization

    Data Science Specialization

    Course materials for the Data Science Specialization on Coursera

    The Data Science Specialization Courses repository is a collection of materials that support the Johns Hopkins University Data Science Specialization on Coursera. It contains the source code and resources used throughout the specialization’s courses, covering a broad range of data science concepts and techniques. The repository is designed as a shared space for code examples, datasets, and instructional materials, helping learners follow along with lectures and assignments. It spans essential topics such as R programming, data cleaning, exploratory data analysis, statistical inference, regression models, machine learning, and practical data science projects. By providing centralized resources, the repo makes it easier for students to practice concepts and replicate examples from the curriculum. It also offers a structured view of how multiple disciplines—programming, statistics, and applied data analysis—come together in a professional workflow.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 7
    XGBoost

    XGBoost

    Scalable and Flexible Gradient Boosting

    XGBoost is an optimized distributed gradient boosting library, designed to be scalable, flexible, portable and highly efficient. It supports regression, classification, ranking and user defined objectives, and runs on all major operating systems and cloud platforms. XGBoost works by implementing machine learning algorithms under the Gradient Boosting framework. It also offers parallel tree boosting (GBDT, GBRT or GBM) that can quickly and accurately solve many data science problems. XGBoost can be used for Python, Java, Scala, R, C++ and more. It can run on a single machine, Hadoop, Spark, Dask, Flink and most other distributed environments, and is capable of solving problems beyond billions of examples.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 8
    DearPyGui

    DearPyGui

    Graphical User Interface Toolkit for Python with minimal dependencies

    Dear PyGui is an easy-to-use, dynamic, GPU-Accelerated, cross-platform graphical user interface toolkit(GUI) for Python. It is “built with” Dear ImGui. Features include traditional GUI elements such as buttons, radio buttons, menus, and various methods to create a functional layout. Additionally, DPG has an incredible assortment of dynamic plots, tables, drawings, debuggers, and multiple resource viewers. DPG is well suited for creating simple user interfaces as well as developing complex and demanding graphical interfaces. DPG offers a solid framework for developing scientific, engineering, gaming, data science and other applications that require fast and interactive interfaces. The Tutorials will provide a great overview and links to each topic in the API Reference for more detailed reading. Complete theme and style control. GPU-based rendering and efficient C/C++ code.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 9
    marimo

    marimo

    A reactive notebook for Python

    marimo is an open-source reactive notebook for Python, reproducible, git-friendly, executable as a script, and shareable as an app. marimo notebooks are reproducible, extremely interactive, designed for collaboration (git-friendly!), deployable as scripts or apps, and fit for modern Pythonista. Run one cell and marimo reacts by automatically running affected cells, eliminating the error-prone chore of managing the notebook state. marimo's reactive UI elements, like data frame GUIs and plots, make working with data feel refreshingly fast, futuristic, and intuitive. Version with git, run as Python scripts, import symbols from a notebook into other notebooks or Python files, and lint or format with your favorite tools. You'll always be able to reproduce your collaborators' results. Notebooks are executed in a deterministic order, with no hidden state, delete a cell and marimo deletes its variables while updating affected cells.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Auth0 B2B Essentials: SSO, MFA, and RBAC Built In Icon
    Auth0 B2B Essentials: SSO, MFA, and RBAC Built In

    Unlimited organizations, 3 enterprise SSO connections, role-based access control, and pro MFA included. Dev and prod tenants out of the box.

    Auth0's B2B Essentials plan gives you everything you need to ship secure multi-tenant apps. Unlimited orgs, enterprise SSO, RBAC, audit log streaming, and higher auth and API limits included. Add on M2M tokens, enterprise MFA, or additional SSO connections as you scale.
    Sign Up Free
  • 10
    Data Science at the Command Line

    Data Science at the Command Line

    Data science at the command line

    Command Line by Jeroen Janssens, published by O’Reilly Media in October 2021. Obtain, scrub, explore, and model data with Unix Power Tools. This repository contains the full text, data, and scripts used in the second edition of the book Data Science at the Command Line by Jeroen Janssens. This thoroughly revised guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You’ll learn how to combine small yet powerful command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, author Jeroen Janssens provides a Docker image packed with over 100 Unix power tools, useful whether you work with Windows, macOS, or Linux. You’ll quickly discover why the command line is an agile, scalable, and extensible technology. Even if you’re comfortable processing data with Python or R, you’ll learn how to greatly improve your data science workflow by leveraging the command line’s power.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    Milvus

    Milvus

    Vector database for scalable similarity search and AI applications

    Milvus is an open-source vector database built to power embedding similarity search and AI applications. Milvus makes unstructured data search more accessible, and provides a consistent user experience regardless of the deployment environment. Milvus 2.0 is a cloud-native vector database with storage and computation separated by design. All components in this refactored version of Milvus are stateless to enhance elasticity and flexibility. Average latency measured in milliseconds on trillion vector datasets. Rich APIs designed for data science workflows. Consistent user experience across laptop, local cluster, and cloud. Embed real-time search and analytics into virtually any application. Milvus’ built-in replication and failover/failback features ensure data and applications can maintain business continuity in the event of a disruption. Component-level scalability makes it possible to scale up and down on demand.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    Rodeo

    Rodeo

    A data science IDE for Python

    A data science IDE for Python. RODEO, that is an open-source python IDE and has been brought up by the folks at yhat, is a development environment that is lightweight, intuitive and yet customizable to its very core and also contains all the features mentioned above that were searched for so long. It is just like your very own personal home base for exploration and interpretation of data that aims at Data Scientists and answers the main question, "Is there anything like RStudio for Python?" Rodeo makes it very easy for its users to explore what is created by them and also alongside allows the users to Inspect, interact, compare data frames, plots and even much more. It is an IDE that has been built especially for data science/Machine Learning in Python and you can also very simply think of it as a light weight alternative to the IPython Notebook.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    TensorFlow.NET

    TensorFlow.NET

    .NET Standard bindings for Google's TensorFlow for developing models

    TensorFlow.NET (TF.NET) provides a .NET Standard binding for TensorFlow. It aims to implement the complete Tensorflow API in C# which allows .NET developers to develop, train and deploy Machine Learning models with the cross-platform .NET Standard framework. TensorFlow.NET has built-in Keras high-level interface and is released as an independent package TensorFlow.Keras. SciSharp STACK's mission is to bring popular data science technology into the .NET world and to provide .NET developers with a powerful Machine Learning tool set without reinventing the wheel. Since the APIs are kept as similar as possible you can immediately adapt any existing TensorFlow code in C# or F# with a zero learning curve. Take a look at a comparison picture and see how comfortably a TensorFlow/Python script translates into a C# program with TensorFlow.NET.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14
    sadsa

    sadsa

    SADSA (Software Application for Data Science and Analytics)

    SADSA (Software Application for Data Science and Analytics) is a Python-based desktop application designed to simplify statistical analysis, machine learning, and data visualization for students, researchers, and data professionals. Built using Python for the GUI, SADSA provides a menu-driven interface for handling datasets, applying transformations, running advanced statistical tests, machine learning algorithms, and generating insightful plots — all without writing code.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 15
    Power Analytics for Excel

    Power Analytics for Excel

    Data Science made available for Excel

    This plugin for Excel comes with statistical methods such as Muliple Regression, Analysis of Variance (ANOVA) and Distribution Tests.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 16
    AI Data Science Team

    AI Data Science Team

    An AI-powered data science team of agents

    AI Data Science Team is a Python library and agent ecosystem designed to accelerate and automate common data science workflows by modeling them as specialized AI “agents” that can be orchestrated to perform tasks like data cleaning, transformation, analysis, visualization, and machine learning. It provides a modular agent framework where each agent focuses on a step in the typical data science pipeline — for example, loading data from CSV/Excel files, cleaning and wrangling messy datasets, engineering predictive features, building models with AutoML, connecting to SQL databases, and producing visual outputs — all driven by natural language or programmatic instructions. The project includes ready-to-use applications that showcase these agents in action, such as an exploratory data analysis copilot that generates reports, a pandas data analyst that combines wrangling and plotting, and SQL database agents that can query business databases and output results directly.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    AWA-Core

    AWA-Core

    Full application for factory, process engineer and Automation..

    IT'S HERE-----FINALLY. AWA-Core 2026 is here with a totally new architecture. The core is now in Client/Server architecture and open to other applications (including yours). New interfaces for the server and client sides. Please, go to our youTube channel to see many tutorials about this new release. Don't waste your time trying things and clicking everywhere. Wait for our tutorials, install AWA-Core on a single PC or in a complete C/S architecture (Servers provided) and run powerfull projects. Check youTube channel @ImprovingFactories for tutorials (they're all coming). Stay tuned. And don't forget. AWA-Core is a cashback software. You can install it and earn money with. More than an Historian... AWA-Core (Another Way of Automation) is a complete suite that allows engineers, PLC programmers and factory designers to create huge projects for retrieving data, creating graphics, automatic scripts, exports and data links.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    AWS SDK for pandas

    AWS SDK for pandas

    Easy integration with Athena, Glue, Redshift, Timestream, Neptune

    aws-sdk-pandas (formerly AWS Data Wrangler) bridges pandas with the AWS analytics stack so DataFrames flow seamlessly to and from cloud services. With a few lines of code, you can read from and write to Amazon S3 in Parquet/CSV/JSON/ORC, register tables in the AWS Glue Data Catalog, and query with Amazon Athena directly into pandas. The library abstracts efficient patterns like partitioning, compression, and vectorized I/O so you get performant data lake operations without hand-rolling boilerplate. It also supports Redshift, OpenSearch, and other services, enabling ETL tasks that blend SQL engines and Python transformations. Operational helpers handle IAM, sessions, and concurrency while exposing knobs for encryption, versioning, and catalog consistency. The result is a productive workflow that keeps your analytics in Python while leveraging AWS-native storage and query engines at scale.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20

    Adele

    Adhoc Data Exploration - Live & Easy

    Adele was developed to simplify the daily work with data. Use it as a swiss knife to fill the gap between your work with spreadsheet application like MS Excel and enterprise servers like SAP ERP. Specialized tools like Rapid Miner, KNIME or similiary stuff should not be replaced. But Adele is designed for business people working with spreadsheet applications to analyse their data. There are many technical concepts in an easier way included. For example realtime OLAP, transformations, charts, analysis tools,... Connectors (e.g. JDBC, SAP ABAP, OData) can be used to pre-analyse the data and extract it without saving the data as text files. A plugin concept for enhancements are available. Enjoy! Its free for commercial use too. Adele runs without installation from USB stick for Windows, Linux and MacOSX. Last added changes: - data science tools (V1, IQR) - export to remote and desktop databases (mysql,sqlite, ms access) - internet features for emails and domains
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Amazon SageMaker Examples

    Amazon SageMaker Examples

    Jupyter notebooks that demonstrate how to build models using SageMaker

    Welcome to Amazon SageMaker. This projects highlights example Jupyter notebooks for a variety of machine learning use cases that you can run in SageMaker. If you’re new to SageMaker we recommend starting with more feature-rich SageMaker Studio. It uses the familiar JupyterLab interface and has seamless integration with a variety of deep learning and data science environments and scalable compute resources for training, inference, and other ML operations. Studio offers teams and companies easy on-boarding for their team members, freeing them up from complex systems admin and security processes. Administrators control data access and resource provisioning for their users. Notebook Instances are another option. They have the familiar Jupyter and JuypterLab interfaces that work well for single users, or small teams where users are also administrators. Advanced users also use SageMaker solely with the AWS CLI and Python scripts using boto3 and/or the SageMaker Python SDK.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Awesome Fraud Detection Research Papers

    Awesome Fraud Detection Research Papers

    A curated list of data mining papers about fraud detection

    A curated list of data mining papers about fraud detection from several conferences.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    ClearML

    ClearML

    Streamline your ML workflow

    ClearML is an open source platform that automates and simplifies developing and managing machine learning solutions for thousands of data science teams all over the world. It is designed as an end-to-end MLOps suite allowing you to focus on developing your ML code & automation, while ClearML ensures your work is reproducible and scalable. The ClearML Python Package for integrating ClearML into your existing scripts by adding just two lines of code, and optionally extending your experiments and other workflows with ClearML powerful and versatile set of classes and methods. The ClearML Server storing experiment, model, and workflow data, and supports the Web UI experiment manager, and ML-Ops automation for reproducibility and tuning. It is available as a hosted service and open source for you to deploy your own ClearML Server. The ClearML Agent for ML-Ops orchestration, experiment and workflow reproducibility, and scalability.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Cookiecutter Data Science

    Cookiecutter Data Science

    Project structure for doing and sharing data science work

    A logical, reasonably standardized, but flexible project structure for doing and sharing data science work. When we think about data analysis, we often think just about the resulting reports, insights, or visualizations. While these end products are generally the main event, it's easy to focus on making the products look nice and ignore the quality of the code that generates them. Because these end products are created programmatically, code quality is still important! And we're not talking about bikeshedding the indentation aesthetics or pedantic formatting standards, ultimately, data science code quality is about correctness and reproducibility. It's no secret that good analyses are often the result of very scattershot and serendipitous explorations. Tentative experiments and rapidly testing approaches that might not work out are all part of the process for getting to the good stuff, and there is no magic bullet to turn data exploration into a simple, linear progression.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25

    DEPRECATED - KVFinder

    Cavity Detection PyMOL plugin

    The KVFinder software, originally published in 2014, is deprecated. We published more recent software: parKVFinder and pyKVFinder. [parKVFinder] A Linux/macOS version is available in this GitHub repository, https://github.com/LBC-LNBio/parKVFinder, while a Windows version is in this GitHub repository, https://github.com/LBC-LNBio/parKVFinder-win. Please read and cite the original paper ParKVFinder: A thread-level parallel approach in biomolecular cavity detection (10.1016/j.softx.2020.100606). [pyKVFinder] pyKVFinder is available in this Python Package Index (PyPI) repository, https://pypi.org/project/pyKVFinder and this GitHub repository, https://github.com/LBC-LNBio/pyKVFinder. Please read and cite the original paper pyKVFinder: an efficient and integrable Python package for biomolecular cavity detection and characterization in data science (10.1186/s12859-021-04519-4).
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • Next
MongoDB Logo MongoDB