Showing 3123 open source projects for "data"

  • 1
    Matrix

    Multi-Agent daTa geneRation Infra and eXperimentation framework

    Matrix is a distributed, large-scale engine for multi-agent synthetic data generation and experiments: it provides the infrastructure to run thousands of “agentic” workflows concurrently (e.g. multiple LLMs interacting, reasoning, generating content, data-processing pipelines) by leveraging distributed computing (like Ray + cluster management). The idea is to treat data generation as a “data-to-data” transformation: each input item defines a task, and the runtime orchestrates asynchronous, peer-to-peer agent workflows, avoiding global synchronization bottlenecks. ...
    Downloads: 0 This Week
  • 2
    Browser Use

    Make websites accessible for AI agents

    Browser-Use is a framework that makes websites accessible for AI agents, enabling automated interactions and data extraction from web pages.
    Downloads: 5 This Week
  • 3
    Remarshal

    Convert between CBOR, JSON, MessagePack, TOML, and YAML

    Convert between CBOR, JSON, MessagePack, TOML, and YAML. When installed, provides the command-line command remarshal as well as the short commands {cbor,json,msgpack,toml,yaml}2{cbor,json,msgpack,toml,yaml}. You can perform format conversion, reformatting, and error detection using these commands. CBOR, MessagePack, and YAML with binary fields cannot be converted to JSON or TOML. Binary fields are converted between CBOR, MessagePack, and YAML.
    Downloads: 3 This Week
  • 4
    Fantasy PL MCP

    Fantasy Premier League MCP Server

    Fantasy Premier League MCP Server is a Model Context Protocol (MCP) server that provides access to Fantasy Premier League (FPL) data and tools. It allows interaction with FPL data in MCP-compatible clients, enabling users to manage their fantasy teams effectively.
    Downloads: 1 This Week
  • 5
    Ragas

    Supercharge Your LLM Application Evaluations

    Objective metrics, intelligent test generation, and data-driven insights for LLM apps. Ragas is your ultimate toolkit for evaluating and optimizing Large Language Model (LLM) applications. Say goodbye to time-consuming, subjective assessments and hello to data-driven, efficient evaluation workflows. Don't have a test dataset ready? We also do production-aligned test set generation.
    Downloads: 1 This Week
  • 6
    Cortex Analyzers

    Cortex Analyzers Repository

    Analyzers can be written in any programming language supported by Linux such as Python, Ruby, Perl, etc. Refer to the How to Write and Submit an Analyzer page for details on how to write and submit one.
    Downloads: 1 This Week
  • 7
    openvpn-monitor

    openvpn-monitor is a web-based OpenVPN monitor

    ...It typically runs on the same host as the OpenVPN server, although it does not need to. OpenVPN-monitor is a web-based OpenVPN monitor that shows current connection information, such as users, location, and data transferred.
    Downloads: 1 This Week
  • 8
    AlphaFold 3

    AlphaFold 3 inference pipeline

    ...Users can perform local predictions via Docker containers, integrating AlphaFold 3’s inference process with provided JSON input configurations. The software includes flexible options for running both data preprocessing and GPU-accelerated inference, allowing users to adapt to available computational resources.
    Downloads: 9 This Week
  • 9
    Giskard

    Collaborative & Open-Source Quality Assurance for all AI models

    ...Giskard automatically generates relevant tests based on the vulnerabilities detected by the scan. You can easily customize the tests depending on your use case by defining domain-specific data slicers and transformers as fixtures of your test suites.
    Downloads: 4 This Week
  • 10
    OpenBB Terminal

    Investment research for everyone, anywhere

    ...The MIT Open Source license allows any user to fork the project, either to contribute features to the broader community or to create their own customized terminal version. The terminal allows users to import their own proprietary datasets for use in our econometric menu. In addition, users can export any type of data to any type of format, whether that is raw data in Excel or an image in PNG. This is ideal for finance content creation. Create notebook templates (through papermill) which can be run on different tickers. This level of automation lets you speed up the development of your investment thesis and reduce human error.
    Downloads: 11 This Week
  • 11
    autopep8

    A tool that automatically formats Python code to conform to the PEP 8 style guide

    autopep8 automatically formats Python code to conform to the PEP 8 style guide. It uses the pycodestyle utility to determine what parts of the code need to be formatted. autopep8 is capable of fixing most of the formatting issues that can be reported by pycodestyle. Correct deprecated or non-idiomatic Python code (via lib2to3). Use this for making Python 2.7 code more compatible with Python 3. Put a blank line between a class docstring and its first method declaration. Remove blank lines...
    Downloads: 7 This Week
  • 12
    Recommenders

    Best practices on recommendation systems

    ...Several utilities are provided in reco_utils to support common tasks such as loading datasets in the format expected by different algorithms, evaluating model outputs, and splitting training/test data. Implementations of several state-of-the-art algorithms are included for self-study and customization in your own applications. Please see the setup guide for more details on setting up your machine locally, on a data science virtual machine (DSVM) or on Azure Databricks. Independent or incubating algorithms and utilities are candidates for the contrib folder. ...
    Downloads: 0 This Week
  • 13
    Nautobot

    Network Source of Truth & Network Automation Platform

    Nautobot is an open-source network source of truth and automation platform designed to manage network infrastructure data effectively. Initially built as a fork of NetBox, Nautobot extends its capabilities by offering flexible data modeling, powerful REST and GraphQL APIs, and built-in automation tools. It enables network engineers and operators to store, query, and integrate network infrastructure data with external systems, making it a key component in modern network automation workflows. ...
    Downloads: 1 This Week
  • 14
    python-bibtexparser v2

    Bibtex parser for Python 3

    Welcome to python-bibtexparser, a parser for .bib files with a long history and wide adoption. Bibtexparser is available in two versions: v1 and v2. For new projects, we recommend using v2 which, in the long run, will provide an overall more robust and faster experience. For now, however, note that v2 is an early beta and does not contain all features of v1.
    Downloads: 2 This Week
  • 15
    Bayesian Optimization

    Python implementation of global optimization with Gaussian processes

    This is a constrained global optimization package built upon Bayesian inference and Gaussian processes that attempts to find the maximum value of an unknown function in as few iterations as possible. This technique is particularly suited to optimizing high-cost functions and to situations where the balance between exploration and exploitation is important. More detailed information, other advanced features, and tips on usage/implementation can be found in the examples folder. Follow the basic...
    Downloads: 2 This Week
  • 16
    CUDOS Framework

    Command Line Interface tool for Cloud Intelligence Dashboards

    The AWS Cloud Intelligence Dashboards Framework is a set of open-source tools and templates designed to help organizations deploy and manage advanced data visualization dashboards that offer insights into cost, usage, governance, and operational health across AWS environments. It is part of the AWS Solutions Library and includes CloudFormation templates, CLI commands, and pre-built dashboards that collect, process, and visualize data from AWS billing, cost management, budgets, and usage reports in services such as Amazon QuickSight or other BI tools. ...
    Downloads: 3 This Week
  • 17
    MLRun

    Machine Learning automation and tracking

    MLRun is an open MLOps framework for quickly building and managing continuous ML and generative AI applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates the delivery of production data, ML pipelines, and online applications, significantly reducing engineering efforts, time to production, and computation resources. MLRun breaks the silos between data, ML, software, and DevOps/MLOps teams, enabling collaboration and fast continuous improvements. In MLRun the assets, metadata, and services (data, functions, jobs, artifacts, models, secrets, etc.) are organized into projects. ...
    Downloads: 3 This Week
  • 18
    AI Hedge Fund

    An AI Hedge Fund Team

    This repository demonstrates how to build a simplified, automated hedge fund strategy powered by AI/ML. It integrates financial data collection, preprocessing, feature engineering, and predictive modeling to simulate decision-making in trading. The code shows workflows for pulling stock or market data, applying machine learning algorithms to forecast trends, and generating buy/sell/hold signals based on the predictions. Its structure is educational: intended more as a proof-of-concept than a ready-to-use financial product, giving learners insight into the mechanics of quantitative finance automation. ...
    Downloads: 0 This Week
  • 19
    FL4Health

    Library to facilitate federated learning research

    FL4Health is a Vector Institute toolkit for building modular, clinically-focused FL pipelines. Tailored for healthcare, it supports privacy-preserving FL, heterogeneous data settings, integrated reporting, and clear API design.
    Downloads: 3 This Week
  • 20
    DOLMA

    Data and tools for generating and inspecting OLMo pre-training data

    DOLMA (Data Optimization and Learning for Model Alignment) is a framework designed to manage large-scale datasets for training and fine-tuning language models efficiently.
    Downloads: 1 This Week
  • 21
    OpenBB

    Investment Research for Everyone, Everywhere

    ...Create charts directly from raw data in seconds. Customize your dashboards to build your dream terminal, integrate with your private datasets and bring your own fine-tuned AI copilots.
    Downloads: 0 This Week
  • 22
    Dataproc Templates

    Dataproc templates and pipelines for solving simple in-cloud data tasks

    Dataproc templates are designed to address various in-cloud data tasks, including data import/export/backup/restore and bulk API operations. These templates leverage the power of Google Cloud's Dataproc, supporting both Dataproc Serverless and Dataproc clusters. Google provides this collection of pre-implemented Dataproc templates as a reference and for easy customization.
    Downloads: 1 This Week
  • 23
    doccano

    Open source annotation tool for machine learning practitioners

    doccano is an open-source text annotation tool for humans. It provides annotation features for text classification, sequence labeling and sequence-to-sequence tasks. So, you can create labeled data for sentiment analysis, named entity recognition, text summarization and so on. Just create a project, upload data and start annotating. You can build a dataset in hours.
    Downloads: 3 This Week
  • 24
    cognee

    Deterministic LLM Outputs for AI Applications and AI Agents

    We build for developers who need a reliable, production-ready data layer for AI applications. Cognee implements scalable, modular data pipelines for creating an LLM-enriched data layer using graph and vector stores. Cognee acts as a semantic memory layer, unveiling hidden connections within your data and infusing it with your company's language and principles. This self-optimizing process ensures ultra-relevant, personalized, and contextually aware LLM retrievals. ...
    Downloads: 0 This Week
  • 25
    ClearML

    Streamline your ML workflow

    ...The ClearML Server stores experiment, model, and workflow data, and supports the Web UI experiment manager and ML-Ops automation for reproducibility and tuning. It is available as a hosted service and as open source for you to deploy your own ClearML Server. The ClearML Agent provides ML-Ops orchestration, experiment and workflow reproducibility, and scalability.
    Downloads: 0 This Week