Showing 74 open source projects for "reasoning machine learning"

  • 1
    NVIDIA Merlin

    Library providing end-to-end GPU-accelerated recommender systems

    NVIDIA Merlin is an open-source library that accelerates recommender systems on NVIDIA GPUs. The library enables data scientists, machine learning engineers, and researchers to build high-performing recommenders at scale. Merlin includes tools to address common feature engineering, training, and inference challenges. Each stage of the Merlin pipeline is optimized to support hundreds of terabytes of data, which is all accessible through easy-to-use APIs. For more information, see NVIDIA Merlin on the NVIDIA developer website. ...
    Downloads: 0 This Week
  • 2
    Orange Data Mining

    Orange: Interactive data analysis

    Open source machine learning and data visualization. Build data analysis workflows visually, with a large, diverse toolbox. Perform simple data analysis with clever data visualization. Explore statistical distributions, box plots and scatter plots, or dive deeper with decision trees, hierarchical clustering, heatmaps, MDS and linear projections. Even your multidimensional data can become sensible in 2D, especially with clever attribute ranking and selections.
    Downloads: 72 This Week
  • 3
    Dask

    Parallel computing with task scheduling

    ...It integrates with familiar tools like NumPy, Pandas, and scikit-learn while enabling execution across cores or nodes with minimal code changes. Dask excels at handling large datasets that don’t fit into memory and is widely used in data science, machine learning, and big data pipelines.
    Downloads: 3 This Week
  • 4
    tika-python

    Python binding to the Apache Tika™ REST services

    A Python port of the Apache Tika library that makes Tika available using the Tika REST Server. This makes Apache Tika available as a Python library, installable via Setuptools, Pip and Easy Install. To use this library, you need to have Java 7+ installed on your system, as tika-python starts up the Tika REST server in the background. To get this working in a disconnected environment, download a tika server file (both tika-server.jar and tika-server.jar.md5, which can be found here) and set...
    Downloads: 0 This Week
  • 5
    Cleanlab

    The standard data-centric AI package for data quality and ML

    cleanlab helps you clean data and labels by automatically detecting issues in a ML dataset. To facilitate machine learning with messy, real-world data, this data-centric AI package uses your existing models to estimate dataset problems that can be fixed to train even better models. cleanlab cleans your data's labels via state-of-the-art confident learning algorithms, published in this paper and blog. See some of the datasets cleaned with cleanlab at labelerrors.com. ...
    Downloads: 0 This Week
  • 6
    FinRobot

    An Open-Source AI Agent Platform for Financial Analysis using LLMs

    ...It provides developers and quants with structured modules to fetch market data, process time series, generate technical indicators, and construct features appropriate for machine learning models, while also supporting backtesting and evaluation metrics to measure strategy performance. Built with modularity in mind, FinRobot allows users to plug in custom models — from classical algorithms to deep learning architectures — and orchestrate components in pipelines that can run reproducibly across experiments. ...
    Downloads: 1 This Week
  • 7
    Pyper

    Concurrent Python made simple

    Pyper is a Python-native orchestration and scheduling framework designed for modern data workflows, machine learning pipelines, and any task that benefits from a lightweight DAG-based execution engine. Unlike heavier platforms like Airflow, Pyper aims to remain lean, modular, and developer-friendly, embracing Pythonic conventions and minimizing boilerplate. It focuses on local development ergonomics and seamless transition to production environments, making it ideal for small teams and individuals needing a programmable and flexible orchestration solution without the overhead of enterprise systems.
    Downloads: 0 This Week
  • 8
    Fondant

    Production-ready data processing made easy and shareable

    Fondant is a modular, pipeline-based framework designed to simplify the preparation of large-scale datasets for training machine learning models, especially foundation models. It offers an end-to-end system for ingesting raw data, applying transformations, filtering, and formatting outputs—all while remaining scalable and traceable. Fondant is designed with reproducibility in mind and supports containerized steps using Docker, making it easy to share and reuse data processing components. ...
    Downloads: 0 This Week
  • 9
    SDGym

    Benchmarking synthetic data generation methods

    ...Or write your own custom machine learning model. In addition to performance and memory usage, you can also measure synthetic data quality and privacy through a variety of metrics. Install SDGym using pip or conda. We recommend using a virtual environment to avoid conflicts with other software on your device.
    Downloads: 0 This Week
  • 10
    Datumaro

    Dataset Management Framework, a Python library and a CLI tool to build

    ...Datumaro makes it easy to merge datasets, split them into training/validation/test subsets, filter or transform annotations, and validate annotation quality — all while preserving metadata and supporting detailed statistics. It’s especially useful when you’re dealing with heterogeneous data sources or need to prepare complex datasets for machine learning workflows, freeing you from writing custom scripts for every format conversion.
    Downloads: 0 This Week
  • 11
    Covalent workflow

    Pythonic tool for running machine-learning/high performance workflows

    Covalent is a Pythonic workflow tool for computational scientists, AI/ML software engineers, and anyone who needs to run experiments on limited or expensive computing resources including quantum computers, HPC clusters, GPU arrays, and cloud services. Covalent enables a researcher to run computation tasks on an advanced hardware platform – such as a quantum computer or serverless HPC cluster – using a single line of code. Covalent overcomes computational and operational challenges inherent...
    Downloads: 0 This Week
  • 12
    airda

    airda (Air Data Agent)

    airda (Air Data Agent) is a multi-agent system for data analysis. It can understand data development and data analysis requirements, understand data, and generate SQL and Python code for data-oriented queries, data visualization, machine learning and other tasks.
    Downloads: 0 This Week
  • 13
    Grafana

    Leading open-source visualization and observability platform

    Grafana OSS is the leading open-source platform for visualization and observability. It enables teams to query, visualize, alert on, and explore telemetry data from multiple sources in a single interface. With support for 100+ data source plugins—including Prometheus, Loki, Elasticsearch, InfluxDB, SQL/NoSQL databases, and OpenTelemetry—Grafana helps teams correlate metrics, logs, and traces across applications and infrastructure. Users can build interactive dashboards with rich...
    Downloads: 32 This Week
  • 14
    Luigi

    Python module that helps you build complex pipelines of batch jobs

    ...You want to chain many tasks, automate them, and failures will happen. These tasks can be anything, but are typically long-running things like Hadoop jobs, dumping data to/from databases, running machine learning algorithms, or anything else. You can build pretty much any task you want, but Luigi also comes with a toolbox of several common task templates that you can use. It includes support for running Python mapreduce jobs in Hadoop, as well as Hive and Pig jobs. It also comes with file system abstractions for HDFS and local files that ensure all file system operations are atomic.
    Downloads: 0 This Week
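
The atomic file operations mentioned in the Luigi description can be illustrated with a small standard-library sketch. This is not Luigi's API: `atomic_write` is a hypothetical helper that shows the general write-to-a-temporary-file-then-rename pattern, assuming the temporary file and the target live on the same file system.

```python
import os
import tempfile


def atomic_write(path: str, data: str) -> None:
    """Write data so readers see either the old file or the complete new one.

    Hypothetical helper for illustration only; not part of Luigi.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # swaps the finished file into place in one step
    except BaseException:
        # Clean up the partial temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A task that crashes halfway through writing leaves the target untouched, so a downstream task never mistakes a partial file for finished output.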
  • 15
    SageMaker Spark Container

    Docker image used to run data processing workloads

    ...It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing. The SageMaker Spark Container is a Docker image used to run batch data processing workloads on Amazon SageMaker using the Apache Spark framework. The container images in this repository are used to build the pre-built container images that are used when running Spark jobs on Amazon SageMaker using the SageMaker Python SDK. ...
    Downloads: 0 This Week
  • 16
    sadsa

    SADSA (Software Application for Data Science and Analytics)

    SADSA (Software Application for Data Science and Analytics) is a Python-based desktop application designed to simplify statistical analysis, machine learning, and data visualization for students, researchers, and data professionals. Built using Python for the GUI, SADSA provides a menu-driven interface for handling datasets, applying transformations, running advanced statistical tests, machine learning algorithms, and generating insightful plots — all without writing code.
    Downloads: 1 This Week
  • 17
    Datapipe

    Real-time, incremental ETL library for ML with record-level depend

    Datapipe is a real-time, incremental ETL library for Python with record-level dependency tracking. Datapipe is designed to streamline the creation of data processing pipelines. It excels in scenarios where data is continuously changing, requiring pipelines to adapt and process only the modified data efficiently. This library tracks dependencies for each record in the pipeline, ensuring minimal and efficient data processing.
    Downloads: 0 This Week
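
The idea of processing only modified records, as the Datapipe description puts it, can be sketched with the standard library alone. This is not Datapipe's API: `record_hash`, `process_incrementally`, and the `seen_hashes` store are hypothetical names, and the sketch assumes records are JSON-serializable dictionaries keyed by a stable identifier.

```python
import hashlib
import json


def record_hash(record: dict) -> str:
    """Return a stable content hash for one record (hypothetical helper)."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def process_incrementally(records: dict, seen_hashes: dict, transform) -> dict:
    """Apply transform only to records that are new or whose content changed.

    seen_hashes maps record key -> hash from the previous run and is
    updated in place, standing in for a persistent metadata store.
    """
    outputs = {}
    for key, record in records.items():
        digest = record_hash(record)
        if seen_hashes.get(key) == digest:
            continue  # unchanged since the last run, skip the work
        outputs[key] = transform(record)
        seen_hashes[key] = digest
    return outputs
```

On a second run with the same `seen_hashes`, only records whose content differs from the previous run are passed to `transform`.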
  • 18
    SMILI

    Scientific Visualisation Made Easy

    The Simple Medical Imaging Library Interface (SMILI), pronounced 'smilie', is an open-source, lightweight and easy-to-use medical imaging viewer and library for all major operating systems. The main sMILX application has features for viewing n-D images, vector images, DICOMs, anonymizing, shape analysis and models/surfaces with easy drag and drop functions. It also features a number of standard processing algorithms for smoothing, thresholding, masking etc. images and models, both with...
    Downloads: 14 This Week
  • 19
    Uranie

    Uranie is CEA's uncertainty analysis platform, based on ROOT

    Uranie is a sensitivity and uncertainty analysis platform based on the ROOT framework (http://root.cern.ch). It is developed at CEA, the French Atomic Energy Commission (http://www.cea.fr). It provides various tools for data analysis, sampling, statistical modeling, optimisation, sensitivity analysis, uncertainty analysis, running code on high performance computers, etc. Thanks to ROOT, it is easily scriptable in CINT (C++-like syntax) and Python. It is...
    Downloads: 0 This Week
  • 20

    DataPrep

    Python-based data preprocessing tool

    DataPrep v0.2 is a Tkinter-based GUI application/tool designed to assist users in data preprocessing, multicollinearity removal, and feature selection for a wide range of applications in Cheminformatics, Bioinformatics, Data Analysis, Feature Selection, Molecular Modeling, Machine Learning, and Quantitative-structure-property relationship (QSPR) studies. It includes functionality to load, process, and save datasets with support for different preprocessing & multicollinearity removal strategies with customizable parameter setting options.
    Downloads: 0 This Week
  • 21

    Kalshi-Quant-TeleBot

    Kalshi Advanced Quantitative Trading Bot is an enterprise-grade

    ...Built with cutting-edge quantitative algorithms and professional risk management, it provides institutional-quality trading capabilities with user-friendly control. The Kalshi Advanced Quantitative Trading Bot is a professional-grade automated trading system designed specifically for event-based markets on the Kalshi platform. This bot leverages advanced quantitative strategies, machine learning techniques, and real-time data analysis to identify profitable trading opportunities while maintaining robust risk management protocols. Built with a modular architecture, the system combines Python-based trading algorithms with a JavaScript Telegram bot interface for dynamic monitoring and interaction. The bot is designed to operate continuously, making data-driven decisions based on news sentiment analysis, statistical arbitrage opportunities
    Downloads: 12 This Week
  • 22
    Ubix Linux

    The Pocket Datalab

    Ubix stands for Universal Business Intelligence Computing System. Ubix Linux is an open-source, Debian-based Linux distribution geared towards data acquisition, transformation, analysis and presentation. Ubix Linux's purpose is to offer a tiny but versatile datalab. Ubix Linux is easily accessible, resource-efficient and completely portable on a simple USB key. Ubix Linux is a perfect toolset for learning data analysis and artificial intelligence basics on small to medium...
    Downloads: 0 This Week
  • 23
    SageMaker Inference Toolkit

    Serve machine learning models within a Docker container

    Serve machine learning models within a Docker container using Amazon SageMaker. Amazon SageMaker is a fully managed service for data science and machine learning (ML) workflows. You can use Amazon SageMaker to simplify the process of building, training, and deploying ML models. Once you have a trained model, you can include it in a Docker container that runs your inference code.
    Downloads: 0 This Week
  • 24
    OmicSelector

    Feature selection and deep learning modeling for omic biomarker study

    OmicSelector is an environment, Docker-based web application, and R package for biomarker signature selection (feature selection) from high-throughput experiments and others. It was initially developed for miRNA-seq (small RNA, smRNA-seq; hence the name was miRNAselector), RNA-seq and qPCR, but can be applied for every problem where numeric features should be selected to counteract overfitting of the models. Using our tool, you can choose features, like miRNAs, with the most significant...
    Downloads: 0 This Week
  • 25
    PyNanoLab

    Data analysis and visualization with matplotlib

    PyNanoLab contains a variety of tools for data analysis, statistics, curve fitting, and basic machine learning applications. Visualization in PyNanoLab is based on matplotlib. The setup tool is designed to control and set up all the details of the figure through a GUI.
    Downloads: 0 This Week