all to free download - SourceForge

Showing 18 open source projects for "all to"

View related business solutions

Data Science Linux Clear Filters & Widen Search

Vibes don’t ship, Retool does
Start from a prompt and build production-ready apps on your data—with security, permissions, and compliance built in.

Vibe coding tools create cool demos, but Retool helps you build software your company can actually use. Generate internal apps that connect directly to your data—deployed in your cloud with enterprise security from day one. Build dashboards, admin panels, and workflows with granular permissions already in place. Stop prototyping and ship on a platform that actually passes security review.

Build apps that ship
Atera all-in-one platform IT management software with AI agents
Ideal for internal IT departments or managed service providers (MSPs)

Atera’s AI agents don’t just assist, they act. From detection to resolution, they handle incidents and requests instantly, taking your IT management from automated to autonomous.

Learn More
1

ggplot2

An implementation of the Grammar of Graphics in R

...With ggplot2 you simply provide the data, tell ggplot2 how to map variables to aesthetics, what graphical primitives to use, and it will take care of the rest. ggplot2 is over 10 years old and is used by hundreds of thousands of people all over the world for plotting. In most cases using ggplot2 starts with supplying a dataset and aesthetic mapping (with aes()); adding on layers (like geom_point() or geom_histogram()), scales (like scale_colour_brewer()), and faceting specifications (like facet_wrap()); and finally, coordinating systems. ggplot2 has a rich ecosystem of community-maintained extensions for those looking for more innovation. ...

Downloads: 27 This Week

Last Update: 2025-11-14
See Project
2

Great Expectations

Always know what to expect from your data

...Great Expectations brings the same confidence, integrity, and acceleration to data science and data engineering teams. Expectations are assertions for data. They are the workhorse abstraction in Great Expectations, covering all kinds of common data issues. Expectations are a great start, but it takes more to get to production-ready data validation. Where are Expectations stored? How do they get updated? How do you securely connect to production data systems? How do you notify team members and triage when data validation fails? Great Expectations supports all of these use cases out of the box. ...

Downloads: 4 This Week

Last Update: 6 days ago
See Project
3

XGBoost

Scalable and Flexible Gradient Boosting

XGBoost is an optimized distributed gradient boosting library, designed to be scalable, flexible, portable and highly efficient. It supports regression, classification, ranking and user defined objectives, and runs on all major operating systems and cloud platforms. XGBoost works by implementing machine learning algorithms under the Gradient Boosting framework. It also offers parallel tree boosting (GBDT, GBRT or GBM) that can quickly and accurately solve many data science problems. XGBoost can be used for Python, Java, Scala, R, C++ and more. It can run on a single machine, Hadoop, Spark, Dask, Flink and most other distributed environments, and is capable of solving problems beyond billions of examples.

Downloads: 11 This Week

Last Update: 2026-01-10
See Project
4

Milvus

Vector database for scalable similarity search and AI applications

...Milvus makes unstructured data search more accessible, and provides a consistent user experience regardless of the deployment environment. Milvus 2.0 is a cloud-native vector database with storage and computation separated by design. All components in this refactored version of Milvus are stateless to enhance elasticity and flexibility. Average latency measured in milliseconds on trillion vector datasets. Rich APIs designed for data science workflows. Consistent user experience across laptop, local cluster, and cloud. Embed real-time search and analytics into virtually any application. ...

Downloads: 7 This Week

Last Update: 6 days ago
See Project
Grafana: The open and composable observability platform
Faster answers, predictable costs, and no lock-in built by the team helping to make observability accessible to anyone.

Grafana is the open source analytics & monitoring solution for every database.

Learn More
5

tsfresh

Automatic extraction of relevant features from time series

...It automatically calculates a large number of time series characteristics, the so called features. tsfresh is used to to extract characteristics from time series. Without tsfresh, you would have to calculate all characteristics by hand. With tsfresh this process is automated and all your features can be calculated automatically. Further tsfresh is compatible with pythons pandas and scikit-learn APIs, two important packages for Data Science endeavours in python. The extracted features can be used to describe or cluster time series based on the extracted characteristics. ...

Downloads: 0 This Week

Last Update: 2025-08-30
See Project
6

AI Data Science Team

An AI-powered data science team of agents

...It provides a modular agent framework where each agent focuses on a step in the typical data science pipeline — for example, loading data from CSV/Excel files, cleaning and wrangling messy datasets, engineering predictive features, building models with AutoML, connecting to SQL databases, and producing visual outputs — all driven by natural language or programmatic instructions. The project includes ready-to-use applications that showcase these agents in action, such as an exploratory data analysis copilot that generates reports, a pandas data analyst that combines wrangling and plotting, and SQL database agents that can query business databases and output results directly.

Downloads: 4 This Week

Last Update: 2 days ago
See Project
7

PySyft

Data science on data without acquiring a copy

...This is very limiting to human collaboration and systematically drives the centralization of data, because you cannot work with a bunch of data without first putting it all in one (central) place. The Syft ecosystem seeks to change this system, allowing you to write software which can compute over information you do not own on machines you do not have (total) control over. This not only includes servers in the cloud, but also personal desktops, laptops, mobile phones, websites, and edge devices. Wherever your data wants to live in your ownership, the Syft ecosystem exists to help keep it there while allowing it to be used privately.

Downloads: 0 This Week

Last Update: 2025-02-13
See Project
8

ClearML

Streamline your ML workflow

ClearML is an open source platform that automates and simplifies developing and managing machine learning solutions for thousands of data science teams all over the world. It is designed as an end-to-end MLOps suite allowing you to focus on developing your ML code & automation, while ClearML ensures your work is reproducible and scalable. The ClearML Python Package for integrating ClearML into your existing scripts by adding just two lines of code, and optionally extending your experiments and other workflows with ClearML powerful and versatile set of classes and methods. ...

Downloads: 0 This Week

Last Update: 4 days ago
See Project
9

Cookiecutter Data Science

Project structure for doing and sharing data science work

...It's no secret that good analyses are often the result of very scattershot and serendipitous explorations. Tentative experiments and rapidly testing approaches that might not work out are all part of the process for getting to the good stuff, and there is no magic bullet to turn data exploration into a simple, linear progression.

Downloads: 0 This Week

Last Update: 2025-07-24
See Project
Fully managed relational database service for MySQL, PostgreSQL, and SQL Server
Focus on your application, and leave the database to us

Cloud SQL manages your databases so you don't have to, so your business can run without disruption. It automates all your backups, replication, patches, encryption, and storage capacity increases to give your applications the reliability, scalability, and security they need.

Try for free
10

NannyML

Detecting silent model failure. NannyML estimates performance

NannyML is an open-source python library that allows you to estimate post-deployment model performance (without access to targets), detect data drift, and intelligently link data drift alerts back to changes in model performance. Built for data scientists, NannyML has an easy-to-use interface, and interactive visualizations, is completely model-agnostic, and currently supports all tabular classification use cases. NannyML closes the loop with performance monitoring and post deployment data science, empowering data scientist to quickly understand and automatically detect silent model failure. By using NannyML, data scientists can finally maintain complete visibility and trust in their deployed machine learning models. ...

Downloads: 0 This Week

Last Update: 2025-07-12
See Project
11

NVIDIA Merlin

Library providing end-to-end GPU-accelerated recommender systems

...Merlin includes tools to address common feature engineering, training, and inference challenges. Each stage of the Merlin pipeline is optimized to support hundreds of terabytes of data, which is all accessible through easy-to-use APIs. For more information, see NVIDIA Merlin on the NVIDIA developer website. Transform data (ETL) for preprocessing and engineering features. Accelerate your existing training pipelines in TensorFlow, PyTorch, or FastAI by leveraging optimized, custom-built data loaders. Scale large deep learning recommender models by distributing large embedding tables that exceed available GPU and CPU memory. ...

Downloads: 0 This Week

Last Update: 2024-06-14
See Project
12

DAT Linux

The data science OS

DAT Linux is a Linux distribution for data science. It brings together all your favourite open-source data science tools and apps into a ready-to-run desktop environment. https://datlinux.com It's based on Lubuntu, so it’s easy to install and use. The custom DAT Linux Control Panel provides a centralised one-stop-shop for running and managing dozens of data science programs. DAT Linux is perfect for students, professionals, academics, or anyone interested in data science who doesn’t want to spend endless hours downloading, installing, configuring, and maintaining applications from a range of sources, each with different technical requirements and set-up challenges.

Downloads: 61 This Week

Last Update: 2025-11-07
See Project
13

FlexiList.

FlexiList is a Java data structure that combines the benefits of array

...Benefits Over Arrays and ArrayList ->Efficient Insertion and Deletion: FlexiList can insert or delete nodes at any position in the list in O(1) time, whereas arrays require shifting all elements after the insertion or deletion point. ->Dynamic Size: FlexiList can grow or shrink dynamically as elements are added or removed, whereas arrays have a fixed size. ->Good Memory Locality: FlexiList nodes are stored in a contiguous block of memory, making it more cache-friendly than arrays. ->Faster Insertion and Deletion: FlexiList can insert or delete nodes at any position in the list in O(1) time, whereas ArrayList requires shifting all elements after the insertion or deletion point.

1 Review

Downloads: 0 This Week

Last Update: 2024-06-10
See Project
14

sadsa

SADSA (Software Application for Data Science and Analytics)

...Built using Python for the GUI, SADSA provides a menu-driven interface for handling datasets, applying transformations, running advanced statistical tests, machine learning algorithms, and generating insightful plots — all without writing code.

1 Review

Downloads: 1 This Week

Last Update: 2025-04-17
See Project
15

Orchest

Build data pipelines, the easy way

Code, run and monitor your data pipelines all from your browser! From idea to scheduled pipeline in hours, not days. Interactively build your data science pipelines in our visual pipeline editor. Versioned as a JSON file. Run scripts or Jupyter notebooks as steps in a pipeline. Python, R, Julia, JavaScript, and Bash are supported. Parameterize your pipelines and run them periodically on a cron schedule.

Downloads: 0 This Week

Last Update: 2023-04-03
See Project
16

ML workspace

All-in-one web-based IDE specialized for machine learning

All-in-one web-based development environment for machine learning. The ML workspace is an all-in-one web-based IDE specialized for machine learning and data science. It is simple to deploy and gets you started within minutes to productively built ML solutions on your own machines. This workspace is the ultimate tool for developers preloaded with a variety of popular data science libraries (e.g., Tensorflow, PyTorch, Keras, Sklearn) and dev tools (e.g., Jupyter, VS Code, Tensorboard) perfectly configured, optimized, and integrated. ...

Downloads: 1 This Week

Last Update: 2022-07-12
See Project
17

DeepLearningProject

An in-depth machine learning tutorial

...It is not a 30 minute tutorial that teaches you how to "Train your own neural network" or "Learn deep learning in under 30 minutes". It's a full pipeline which you would need to do if you actually work with machine learning - introducing you to all the parts, and all the implementation decisions and details that need to be made. The dataset is not one of the standard sets like MNIST or CIFAR, you will make you very own dataset. Then you will go through a couple conventional machine learning algorithms, before finally getting to deep learning! In the fall of 2016, I was a Teaching Fellow (Harvard's version of TA) for the graduate class on "Advanced Topics in Data Science (CS209/109)" at Harvard University. ...

Downloads: 0 This Week

Last Update: 2022-08-03
See Project
18

Rodeo

A data science IDE for Python

...RODEO, that is an open-source python IDE and has been brought up by the folks at yhat, is a development environment that is lightweight, intuitive and yet customizable to its very core and also contains all the features mentioned above that were searched for so long. It is just like your very own personal home base for exploration and interpretation of data that aims at Data Scientists and answers the main question, "Is there anything like RStudio for Python?" Rodeo makes it very easy for its users to explore what is created by them and also alongside allows the users to Inspect, interact, compare data frames, plots and even much more. ...

Downloads: 10 This Week

Last Update: 2022-02-09
See Project