Showing 227 open source projects for "ml"

  • 1
    Pfl Research

    Simulation framework for accelerating research

    A fast, modular Python framework released by Apple for privacy-preserving federated learning (PFL) simulation. Integrates with TensorFlow, PyTorch, and classical ML, and offers high-speed distributed simulation (7–72× faster than alternatives).
    Downloads: 0 This Week
  • 2
    Pyper

    Concurrent Python made simple

    Pyper is a Python-native orchestration and scheduling framework designed for modern data workflows, machine learning pipelines, and any task that benefits from a lightweight DAG-based execution engine. Unlike heavier platforms like Airflow, Pyper aims to remain lean, modular, and developer-friendly, embracing Pythonic conventions and minimizing boilerplate. It focuses on local development ergonomics and seamless transition to production environments, making it ideal for small teams and...
    Downloads: 3 This Week
  • 3
    sparkmagic

    Jupyter magics and kernels for working with remote Spark clusters

    ...Ability to capture the output of SQL queries as Pandas dataframes to interact with other Python libraries (e.g. matplotlib). Send local files or dataframes to a remote cluster (e.g. sending a pretrained local ML model straight to the Spark cluster). Authenticate to Livy via Basic Access authentication or via Kerberos.
    Downloads: 4 This Week
  • 4
    AI Hedge Fund

    An AI Hedge Fund Team

    This repository demonstrates how to build a simplified, automated hedge fund strategy powered by AI/ML. It integrates financial data collection, preprocessing, feature engineering, and predictive modeling to simulate decision-making in trading. The code shows workflows for pulling stock or market data, applying machine learning algorithms to forecast trends, and generating buy/sell/hold signals based on the predictions. Its structure is educational: intended more as a proof-of-concept than a ready-to-use financial product, giving learners insight into the mechanics of quantitative finance automation. ...
    Downloads: 3 This Week
  • 5
    Hamilton DAGWorks

    Helps scientists define testable, modular, self-documenting dataflow

    ...Hamilton loads that definition and automatically builds the DAG for you. Hamilton brings modularity and structure to any Python application that moves data: ETL pipelines, ML workflows, LLM applications, RAG systems, and BI dashboards. The Hamilton UI lets you automatically visualize, catalog, and monitor execution.
    Downloads: 3 This Week
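A minimal sketch of how a DAG can be derived from function definitions, assuming the "definition" mentioned in the blurb is a set of plain Python functions whose names are node names and whose parameter names identify upstream nodes. The helpers `build_dag` and `execute` and the sample functions are hypothetical illustrations, not Hamilton's API.

```python
import inspect


def build_dag(functions):
    """Map each node (function name) to the nodes it depends on (parameter names)."""
    return {f.__name__: list(inspect.signature(f).parameters) for f in functions}


def execute(functions, inputs, target):
    """Compute `target` by recursively resolving its dependencies.

    `inputs` supplies values for nodes that no function defines.
    This sketch does no cycle detection.
    """
    by_name = {f.__name__: f for f in functions}
    cache = dict(inputs)

    def resolve(name):
        if name not in cache:
            fn = by_name[name]  # KeyError if the node is neither an input nor a function
            args = {p: resolve(p) for p in inspect.signature(fn).parameters}
            cache[name] = fn(**args)
        return cache[name]

    return resolve(target)


# Hypothetical dataflow: each function declares its inputs through its parameters.
def doubled(raw):
    return raw * 2


def total(doubled, offset):
    return doubled + offset


dag = build_dag([doubled, total])
result = execute([doubled, total], {"raw": 3, "offset": 1}, "total")
```

Here `dag` records that `total` depends on `doubled` and `offset`, and `result` is computed by running `doubled` first. Because dependencies are read from the signatures, each function can be unit-tested on its own.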
  • 6
    ARIS

    Lightweight Markdown-only skills for autonomous ML research

    ARIS is an experimental automation framework that leverages AI coding agents to perform continuous research and development tasks autonomously, even without active user supervision. The system is designed to run iterative cycles of research, coding, testing, and refinement, effectively simulating a “sleep mode” where productive work continues in the background. It integrates with AI tools such as Claude Code to generate solutions, analyze results, and improve outputs over time. The project...
    Downloads: 5 This Week
  • 7
    whylogs

    The open standard for data logging

    whylogs is an open-source library for logging any kind of data. With whylogs, users can generate summaries of their datasets (called whylogs profiles), which they can use to track changes in their dataset, create data constraints to check whether their data looks the way it should, and quickly visualize key summary statistics about their datasets. whylogs profiles are the core of the whylogs library. They capture key statistical properties of data, such as the distribution (far beyond...
    Downloads: 5 This Week
  • 8
    Deepchecks

    Test Suites for validating ML models & data

    Deepchecks is the leading tool for testing and validating your machine learning models and data, and it enables doing so with minimal effort. Deepchecks accompanies you through various validation and testing needs such as verifying your data’s integrity, inspecting its distributions, validating data splits, evaluating your model, and comparing different models. While you’re in the research phase, and want to validate your data, find potential methodological problems, and/or validate...
    Downloads: 7 This Week
  • 9
    plexe

    Build a machine learning model from a prompt

    plexe lets you build machine-learning systems from natural-language prompts, turning plain English goals into working pipelines. You describe what you want—a predictor, a classifier, a forecaster—and the tool plans data ingestion, feature preparation, model training, and evaluation automatically. Under the hood an agent executes the plan step by step, surfacing intermediate results and artifacts so you can inspect or override choices. It aims to be production-minded: models can be exported,...
    Downloads: 4 This Week
  • 10
    Superduper

    Superduper: Integrate AI models and machine learning workflows

    Superduper is a Python-based framework for building end-to-end AI-data workflows and applications on your own data, integrating with major databases. It supports the latest technologies and techniques, including LLMs, vector search, RAG, and multimodality, as well as classical AI and ML paradigms. Developers can use Superduper by building compositional and declarative objects that outsource the details of deployment, orchestration, versioning, and more to the Superduper engine. This allows developers to completely avoid implementing MLOps, ETL pipelines, model deployment, data migration, and synchronization. ...
    Downloads: 4 This Week
  • 11
    EconML

    Python Package for ML-Based Heterogeneous Treatment Effects Estimation

    EconML is a Python package for estimating heterogeneous treatment effects from observational data via machine learning. This package was designed and built as part of the ALICE project at Microsoft Research with the goal of combining state-of-the-art machine learning techniques with econometrics to bring automation to complex causal inference problems. One of the biggest promises of machine learning is to automate decision-making in a multitude of domains. At the core of many data-driven...
    Downloads: 4 This Week
  • 12
    Flower

    Flower: A Friendly Federated Learning Framework

    A unified approach to federated learning, analytics, and evaluation. Federate any workload, any ML framework, and any programming language. Federated learning systems vary wildly from one use case to another. Flower allows for a wide range of different configurations depending on the needs of each individual use case. Flower originated from a research project at the University of Oxford, so it was built with AI research in mind.
    Downloads: 4 This Week
  • 13
    TPOT

    A Python Automated Machine Learning tool that optimizes ML

    Consider TPOT your Data Science Assistant. TPOT is a Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming. TPOT stands for Tree-based Pipeline Optimization Tool.
    Downloads: 4 This Week
  • 14
    Sparrow

    Structured data extraction and instruction calling with ML, LLM

    Sparrow is an open-source platform designed to extract structured information from documents, images, and other unstructured data sources using machine learning and large language models. The system focuses on transforming complex documents such as invoices, receipts, forms, and scanned pages into structured formats like JSON that can be processed by downstream applications. It combines several components, including OCR pipelines, vision-language models, and LLM-based reasoning modules to...
    Downloads: 6 This Week
  • 15
    Lithops

    A multi-cloud framework for big data analytics

    ...It abstracts cloud providers like IBM Cloud, AWS, Azure, and Google Cloud into a unified interface and turns your Python functions into scalable, event-driven workloads. Lithops is ideal for data processing, ML inference, and embarrassingly parallel workloads, giving you the power of FaaS (Function-as-a-Service) without vendor lock-in. It also supports hybrid cloud setups, object storage access, and simple integration with Jupyter notebooks.
    Downloads: 2 This Week
  • 16
    Argilla

    The open-source data curation platform for LLMs

    ...Most annotation tools treat data collection as a one-off activity at the beginning of each project. In real-world projects, data collection is a key activity of the iterative process of ML model development. Once a model goes into production, you want to monitor and analyze its predictions, and collect more data to improve your model over time. Argilla is designed to close this gap, enabling you to iterate as much as you need.
    Downloads: 6 This Week
  • 17
    Weights and Biases

    Tool for visualizing and tracking your machine learning experiments

    ...Track and visualize all the pieces of your machine learning pipeline, from datasets to production models. Quickly identify model regressions. Use W&B to visualize results in real time, all in a central dashboard. Focus on the interesting ML. Spend less time manually tracking results in spreadsheets and text files. Capture dataset versions with W&B Artifacts to identify how changing data affects your resulting models. Reproduce any model, with saved code, hyperparameters, launch commands, input data, and resulting model weights. Set wandb.config once at the beginning of your script to save your hyperparameters, input settings (like dataset name or model type), and any other independent variables for your experiments. ...
    Downloads: 9 This Week
  • 18
    model2Vec

    Fast State-of-the-Art Static Embeddings

    model2vec is an innovative embedding framework that converts large sentence transformer models into compact, high-speed static embedding models while preserving much of their semantic performance. The project focuses on dramatically reducing the computational cost of generating embeddings, achieving significant improvements in speed and model size without requiring large datasets for retraining. By using a distillation-based approach, it can produce lightweight models that run efficiently on...
    Downloads: 3 This Week
  • 19
    Professional Services

    Common solutions and tools developed by Google Cloud

    Professional Services repository is a collection of real-world solutions, tools, and reference implementations developed by Google Cloud’s Professional Services team to address common enterprise challenges. Unlike simple sample repositories, it focuses on production-oriented use cases such as data pipelines, machine learning workflows, infrastructure automation, and security management. The repository contains a wide variety of projects, including tools for validating data migrations,...
    Downloads: 3 This Week
  • 20
    Google Research

    This repository contains code released by Google Research

    Google Research is a massive monorepo that hosts a wide range of research code released by Google Research teams across machine learning, artificial intelligence, robotics, natural language processing, and other advanced domains. Rather than being a single framework, the repository serves as a centralized collection of experimental projects, reference implementations, and reproducible research artifacts. It is intended primarily for researchers and advanced practitioners who want to explore...
    Downloads: 3 This Week
  • 21
    Cube Studio

    Cube Studio open source cloud native one-stop machine learning

    Cube Studio is an open-source, cloud-native end-to-end machine learning and AI platform designed to support the full lifecycle of AI development — from data preparation and interactive notebook coding to distributed training, model tuning, and deployment in production-ready environments. It provides a unified interface where teams can manage data sources, track datasets, and build pipelines using drag-and-drop workflow orchestration, making it accessible for both engineers and data...
    Downloads: 3 This Week
  • 22
    SHAP

    A game theoretic approach to explain the output of ML models

    SHAP (SHapley Additive exPlanations) is a game theoretic approach to explain the output of any machine learning model. It connects optimal credit allocation with local explanations using the classic Shapley values from game theory and their related extensions. While SHAP can explain the output of any machine learning model, we have developed a high-speed exact algorithm for tree ensemble methods. Fast C++ implementations are supported for XGBoost, LightGBM, CatBoost, scikit-learn and pyspark...
    Downloads: 7 This Week
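A minimal sketch of the classic Shapley value that the description refers to, computed exactly by enumerating coalitions. It does not use the SHAP library; the function `shapley_values` and the toy payoff `toy_value` are hypothetical names chosen for illustration, and the brute-force enumeration is exponential in the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a cooperative game.

    players: list of hashable identifiers (e.g. feature names).
    value:   function mapping a frozenset of players to a payoff.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1 drawn from the other players
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Weighted marginal contribution of p when joining coalition s
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def toy_value(coalition):
    """Toy payoff: 'a' is worth 10, 'b' is worth 20, plus 6 when both are present."""
    v = 0.0
    if "a" in coalition:
        v += 10
    if "b" in coalition:
        v += 20
    if "a" in coalition and "b" in coalition:
        v += 6  # interaction term, split evenly between a and b
    return v


phi = shapley_values(["a", "b", "c"], toy_value)
```

In this toy game the interaction bonus is shared equally, `c` receives zero credit, and the values sum to the payoff of the full coalition, which is the additivity property that makes the allocation usable as a local explanation.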
  • 23
    Book2_Beauty-of-Data-Visualization

    Machine Learning, Criticism and Correction

    Book2_Beauty-of-Data-Visualization is an open educational project that teaches the principles and techniques of effective data visualization using Python and modern plotting libraries. The repository focuses on both the technical and aesthetic aspects of visual analytics, helping learners understand how to communicate data clearly and persuasively. It includes practical examples that demonstrate how different chart types reveal patterns, trends, and distributions in real datasets. The...
    Downloads: 0 This Week
  • 24
    dstack

    Open-source tool designed to enhance the efficiency of workloads

    dstack is an open-source tool designed to enhance the efficiency of running ML workloads in any cloud (AWS, GCP, Azure, Lambda, etc). It streamlines development and deployment, reduces cloud costs, and frees users from vendor lock-in.
    Downloads: 0 This Week
  • 25
    ktrain

    ktrain is a Python library that makes deep learning AI more accessible

    ktrain is a Python library that makes deep learning and AI more accessible and easier to apply. ktrain is a lightweight wrapper for the deep learning library TensorFlow Keras (and other libraries) to help build, train, and deploy neural networks and other machine learning models. Inspired by ML framework extensions like fastai and ludwig, ktrain is designed to make deep learning and AI more accessible and easier to apply for both newcomers and experienced practitioners. With only a few lines of code, ktrain allows you to easily and quickly. ktrain purposely pins to a lower version of transformers to include support for older versions of TensorFlow. ...
    Downloads: 6 This Week