Showing 3135 open source projects for "data"

  • 1
    AgentForge

    Extensible AGI Framework

    AgentForge is a framework for creating and deploying AI agents that can perform autonomous decision-making and task execution. It enables developers to define agent behaviors, train models, and integrate AI-powered automation into various applications.
    Downloads: 0 This Week
  • 2
    deepdoctection

    A Repo For Document AI

    deepdoctection is a Python library and document AI framework that applies deep learning to analyze scanned documents, PDFs, and images and to extract structured data from them. It orchestrates document extraction and document layout analysis tasks using deep learning models. It does not implement models itself; instead, it lets you build pipelines from widely used libraries for object detection, OCR, and selected NLP tasks, and it provides an integrated framework for fine-tuning, evaluating, and running models. ...
    Downloads: 7 This Week
  • 3
    PyBroker

    Algorithmic Trading in Python with Machine Learning

    Are you looking to enhance your trading strategies with the power of Python and machine learning? Then you need to check out PyBroker! This Python framework is designed for developing algorithmic trading strategies, with a focus on strategies that use machine learning. With PyBroker, you can easily create and fine-tune trading rules, build powerful models, and gain valuable insights into your strategy’s performance.
    Downloads: 0 This Week
  • 4
    EconML

    Python Package for ML-Based Heterogeneous Treatment Effects Estimation

    EconML is a Python package for estimating heterogeneous treatment effects from observational data via machine learning. This package was designed and built as part of the ALICE project at Microsoft Research with the goal of combining state-of-the-art machine learning techniques with econometrics to bring automation to complex causal inference problems. One of the biggest promises of machine learning is to automate decision-making in a multitude of domains.
    Downloads: 0 This Week
  • 5
    TPOT

    A Python Automated Machine Learning tool that optimizes ML

    Consider TPOT your Data Science Assistant. TPOT is a Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming. TPOT stands for Tree-based Pipeline Optimization Tool.
    Downloads: 0 This Week
  • 6
    AtomAI

    Deep and Machine Learning for Microscopy

    ...Ultimately, it aims to combine the power and flexibility of the PyTorch deep learning framework and the simplicity and intuitive nature of packages such as scikit-learn, with a focus on scientific data.
    Downloads: 0 This Week
  • 7
    orjson

    Fast, correct Python JSON library supporting dataclasses, datetimes

    orjson is a fast, correct JSON library for Python. It benchmarks as the fastest Python library for JSON and is more correct than the standard json library or other third-party libraries. It serializes dataclass, datetime, numpy, and UUID instances natively. orjson supports CPython 3.8, 3.9, 3.10, 3.11, and 3.12. It distributes amd64/x86_64, aarch64/armv8, arm7, POWER/ppc64le, and s390x wheels for Linux, amd64 and aarch64 wheels for macOS, and amd64 and i686/x86 wheels for Windows. orjson...
    Downloads: 0 This Week
  • 8
    OpenWPM

    A web privacy measurement framework

    OpenWPM is a web privacy measurement framework that makes it easy to collect data for privacy studies on a scale of thousands to millions of websites. OpenWPM is built on top of Firefox, with automation provided by Selenium. It includes several hooks for data collection. Check out the instrumentation section below for more details. OpenWPM is tested on Ubuntu 18.04 via TravisCI and is commonly used via the docker container that this repo builds, which is also based on Ubuntu. ...
    Downloads: 0 This Week
  • 9
    truffleHog

    Searches through git repositories for high entropy strings and secrets

    truffleHog searches through git repositories for high entropy strings and secrets, digging deep into commit history. TruffleHog runs behind the scenes to scan your environment for secrets like private keys and credentials, so you can protect your data before a breach occurs. Secrets can be found anywhere, so TruffleHog scans more than just code repositories, including SaaS and internally hosted software. With support for custom integrations and new integrations added all the time, you can secure your secrets across your entire environment. TruffleHog is developed by a team composed entirely of career security experts. ...
    Downloads: 11 This Week
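The phrase "high entropy strings" in the truffleHog entry can be made concrete with a small sketch. This is an illustration only, not truffleHog's actual scanner: it computes Shannon entropy per whitespace-separated token and flags long tokens above a cutoff. The function names, the 20-character minimum, and the 4.0 bits-per-character threshold are assumptions chosen for the example.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def find_high_entropy_tokens(
    line: str, min_length: int = 20, threshold: float = 4.0
) -> List[str]:
    """Flag long tokens whose per-character entropy exceeds `threshold`.

    Both defaults are hypothetical values for this sketch, not settings
    taken from any real tool.
    """
    return [
        token
        for token in line.split()
        if len(token) >= min_length and shannon_entropy(token) > threshold
    ]


# A random-looking token is flagged; ordinary words and repeats are not.
suspicious = find_high_entropy_tokens("api_key Zq8vT1mX4rLp9KdW2sYb7NcE")
```

Ordinary identifiers and repeated characters score low because few distinct symbols carry most of the probability mass, while randomly generated keys spread it across many symbols.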
  • 10
    FitTrackee

    Self-hosted outdoor activity tracker

    ...It provides an organized environment for logging workouts, recording metrics like sets, reps, durations, and weights, and visualizing progress through charts and summaries that detail trends in strength, endurance, and consistency. Instead of locking users into proprietary ecosystems or paid plans, FitTrackee lets you keep your own data on your server, giving full control over privacy and longevity of your fitness history. The interface is designed to be flexible enough for everyday gym routines, home workouts, and personalized training plans, supporting a variety of exercise types and custom attributes for different training styles.
    Downloads: 1 This Week
  • 11
    AlphaFold 3

    AlphaFold 3 inference pipeline

    ...Users can perform local predictions via Docker containers, integrating AlphaFold 3’s inference process with provided JSON input configurations. The software includes flexible options for running both data preprocessing and GPU-accelerated inference, allowing users to adapt to available computational resources.
    Downloads: 1 This Week
  • 12
    Airtable MCP

    Airtable integration for AI-powered applications

    Airtable MCP is an integration tool that enables AI-powered applications to access and manipulate Airtable databases directly from the IDE using Anthropic's Model Context Protocol (MCP). It allows querying, creating, updating, and deleting records using natural language, facilitating seamless data management.
    Downloads: 0 This Week
  • 13
    MySQL MCP Server

    A Model Context Protocol (MCP) server that enables secure interaction

    The MySQL MCP Server enables secure interaction with MySQL databases, allowing AI assistants to list tables, read data, and execute SQL queries through a controlled interface. It is designed for integration with AI applications like Claude Desktop and should not be run as a standalone Python program.
    Downloads: 0 This Week
  • 14
    talos

    Hyperparameter Optimization for TensorFlow, Keras and PyTorch

    Talos radically changes the ordinary Keras, TensorFlow (tf.keras), and PyTorch workflow by fully automating hyperparameter tuning and model evaluation. Talos fully exposes Keras, TensorFlow (tf.keras), and PyTorch functionality, and there is no new syntax or templates to learn. Talos is made for data scientists and data engineers who want to remain in complete control of their TensorFlow (tf.keras) and PyTorch models, but are tired of mindless parameter hopping and confusing optimization solutions that add complexity instead of reducing it. Within minutes, without learning any new syntax, Talos allows you to configure, perform, and evaluate hyperparameter optimization experiments that yield state-of-the-art results across a wide range of prediction tasks. ...
    Downloads: 1 This Week
  • 15
    CogVideo

    Text- and image-to-video generation: CogVideoX (2024) and CogVideo

    ...The repo includes SAT and Diffusers implementations, turnkey demos, and fine-tuning pipelines (including LoRA) designed to run across a wide range of NVIDIA GPUs, from desktop cards (e.g., RTX 3060) to data-center hardware (A100/H100). Current releases cover CogVideoX-2B, CogVideoX-5B, and the upgraded CogVideoX1.5-5B variants, plus image-to-video (I2V) models, with options for BF16/FP16/FP32—and INT8 quantized inference via TorchAO for memory-constrained setups. The codebase emphasizes practical deployment: prompt-optimization utilities (LLM-assisted long-prompt expansion), Colab notebooks, a Gradio web app, and multiple performance knobs (tiling/slicing, CPU offload, torch.compile, multi-GPU, and FA3 backends via partner projects).
    Downloads: 14 This Week
  • 16
    SaltStack

    Automate the management and configuration of any infrastructure

    Software to automate the management and configuration of any infrastructure or application at scale. The Salt Project is an approach to infrastructure management built on a dynamic communication bus. Salt can be used for data-driven orchestration, remote execution for any infrastructure, configuration management for any app stack, and much more. Running commands on remote systems is the core function of Salt. Salt can execute commands across thousands of systems in seconds. Salt is built around an event infrastructure that can drive reactive provisioning, configuration, and management across all systems in your infrastructure. ...
    Downloads: 9 This Week
  • 17
    cheat.sh

    The only cheat sheet you need

    ...The repository contains the server and client code, instructions to run a local standalone instance (including Python virtualenv setup), and tooling to fetch or maintain the upstream cheat-sheet data; installation documentation explains disk-space needs and dependency setup for offline use. Cheat.sh is intentionally minimal and scriptable, so it fits naturally into shells, CI scripts, editors, and quick lookups without leaving the terminal, while also offering ways to extend or host personal cheat sheets.
    Downloads: 12 This Week
  • 18
    DeepSeek-V3.2-Exp

    An experimental version of DeepSeek model

    DeepSeek-V3.2-Exp is an experimental release of the DeepSeek model family, intended as a stepping stone toward the next generation architecture. The key innovation in this version is DeepSeek Sparse Attention (DSA), a sparse attention mechanism that aims to optimize training and inference efficiency in long-context settings without degrading output quality. According to the authors, they aligned the training setup of V3.2-Exp with V3.1-Terminus so that benchmark results remain largely...
    Downloads: 12 This Week
  • 19
    SpeechRecognition

    Speech recognition module for Python

    Library for performing speech recognition, with support for several engines and APIs, online and offline. It can recognize speech input from the microphone, transcribe an audio file, and save audio data to an audio file. It can also show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details), and listen to a microphone in the background, among other recognizer features. The easiest way to install it is with pip install SpeechRecognition. ...
    Downloads: 10 This Week
  • 20
    DuckDuckGo Android App

    Privacy browser for Android

    DuckDuckGo is an app that gives you utmost privacy when browsing online. It stops you from getting tracked and protects your personal and private information, no matter where the internet may take you. Apart from providing standard browsing functionality, DuckDuckGo blocks all hidden third-party trackers, forces sites to use an encrypted connection where available, and provides a Privacy Grade rating for each website you visit.
    Downloads: 5 This Week
  • 21
    AI Runner

    Offline inference engine for art, real-time voice conversations

    ...It is implemented as a desktop-oriented Python application and emphasizes privacy and self-hosting, allowing users to work with text-to-speech, speech-to-text, text-to-image and multimodal models without sending data to external services. At the core of its LLM stack is a mode-based architecture with specialized “modes” such as Author, Code, Research, QA and General, and a workflow manager that automatically routes user requests to the right agent based on the task. The project has a strong focus on developer ergonomics, with thorough development guidelines, environment configuration using .env variables, and a clear structure for tests, tools and agents.
    Downloads: 13 This Week
  • 22
    Darts

    A Python library for easy manipulation and forecasting of time series

    ...The models can all be used in the same way, using fit() and predict() functions, similar to scikit-learn. The library also makes it easy to backtest models, combine the predictions of several models, and take external data into account. Darts supports both univariate and multivariate time series and models. The ML-based models can be trained on potentially large datasets containing multiple time series, and some of the models offer rich support for probabilistic forecasting. We recommend first setting up a clean Python environment for your project with at least Python 3.7, using your favorite tool (conda, venv, virtualenv with or without virtualenvwrapper).
    Downloads: 0 This Week
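The uniform fit() and predict() convention mentioned in the Darts entry can be sketched without the library itself. The two forecaster classes and the `combine` helper below are hypothetical toy code written for this illustration; they are not Darts classes and do not reflect its actual API beyond the shared method names.

```python
from statistics import mean
from typing import List, Sequence


class NaiveLastForecaster:
    """Toy model (hypothetical): repeats the last observed value."""

    def fit(self, series: Sequence[float]) -> "NaiveLastForecaster":
        self._level = float(series[-1])
        return self

    def predict(self, n: int) -> List[float]:
        return [self._level] * n


class NaiveMeanForecaster:
    """Toy model (hypothetical): repeats the mean of the last `window` values."""

    def __init__(self, window: int = 3) -> None:
        self.window = window

    def fit(self, series: Sequence[float]) -> "NaiveMeanForecaster":
        self._level = float(mean(series[-self.window:]))
        return self

    def predict(self, n: int) -> List[float]:
        return [self._level] * n


def combine(forecasts: Sequence[Sequence[float]]) -> List[float]:
    """Average several forecasts step by step (a simple ensemble)."""
    return [mean(step) for step in zip(*forecasts)]


history = [1.0, 2.0, 3.0, 4.0]
models = [NaiveLastForecaster(), NaiveMeanForecaster(window=2)]

# Every model is driven through the same two calls, whatever it does inside.
forecasts = [model.fit(history).predict(2) for model in models]
ensemble = combine(forecasts)
```

Because each model exposes the same two methods, swapping models or averaging their outputs needs no model-specific code, which is the point of the convention.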
  • 23
    Connexion

    Swagger/OpenAPI First framework for Python on top of Flask

    Connexion is a framework on top of Flask that automagically handles HTTP requests defined using OpenAPI (formerly known as Swagger), supporting both v2.0 and v3.0 of the specification. Connexion allows you to write these specifications, then maps the endpoints to your Python functions. This sets it apart from other tools, which generate the specification from your Python code. You are free to describe your REST API with as much detail as you want and then Connexion guarantees...
    Downloads: 0 This Week
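The spec-first idea in the Connexion entry, where the specification comes first and endpoints are then mapped to Python functions, can be sketched with the standard library alone. This is not Connexion's implementation: the spec fragment, `resolve_handler`, and `build_routes` are hypothetical, and the sketch assumes each operation names its handler as a dotted path in an `operationId` field.

```python
import importlib
from typing import Callable, Dict, Tuple

# Hypothetical, heavily trimmed spec fragment. Standard-library functions
# stand in for real request handlers so the example is self-contained.
SPEC = {
    "paths": {
        "/sqrt": {"get": {"operationId": "math.sqrt"}},
        "/basename": {"get": {"operationId": "os.path.basename"}},
    }
}


def resolve_handler(operation_id: str) -> Callable:
    """Turn a dotted path such as 'math.sqrt' into the function it names."""
    module_name, _, function_name = operation_id.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


def build_routes(spec: dict) -> Dict[Tuple[str, str], Callable]:
    """Map (HTTP method, path) pairs to the handlers the spec points at."""
    routes: Dict[Tuple[str, str], Callable] = {}
    for path, methods in spec["paths"].items():
        for method, operation in methods.items():
            handler = resolve_handler(operation["operationId"])
            routes[(method.upper(), path)] = handler
    return routes


routes = build_routes(SPEC)
result = routes[("GET", "/sqrt")](9.0)
```

The routing table is derived entirely from the spec, so changing an endpoint means editing the specification rather than the code.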
  • 24
    Kaldi

    kaldi-asr/kaldi is the official location of the Kaldi project

    ...Kaldi is designed for researchers who need a highly customizable environment to experiment with new algorithms, as well as for practitioners who want robust, production-ready ASR pipelines. It includes extensive tools for data preparation, feature extraction, acoustic and language modeling, decoding, and evaluation. With its modular design, Kaldi allows users to adapt the system to a wide range of languages and domains. As one of the most influential projects in speech recognition, it has become a foundation for much of the modern work in ASR.
    Downloads: 2 This Week
  • 25
    SQL Explorer

    Easily share data across your company via SQL queries

    SQL Explorer aims to make the flow of data between people fast, simple, and confusion-free. It is a Django-based application that you can add to an existing Django site, or use as a standalone business intelligence tool. Quickly write and share SQL queries in a simple, usable SQL editor, preview the results in the browser, share links, download CSV, JSON, or Excel files (and even expose queries as API endpoints, if desired), and keep the information flowing!
    Downloads: 2 This Week
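The query-then-download flow described in the SQL Explorer entry can be illustrated with a short sketch. It uses Python's built-in sqlite3 and csv modules as a stand-in and is not SQL Explorer's code. The `query_to_csv` helper, the `orders` table, and its rows are all made up for the example.

```python
import csv
import io
import sqlite3


def query_to_csv(conn: sqlite3.Connection, sql: str) -> str:
    """Run `sql` and return the result set as CSV text with a header row."""
    cursor = conn.execute(sql)
    header = [column[0] for column in cursor.description]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(cursor)  # a cursor iterates over its result rows
    return buffer.getvalue()


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "EU", 10.0), (2, "US", 25.5), (3, "EU", 4.5)],
)

csv_text = query_to_csv(
    conn,
    "SELECT region, SUM(total) AS revenue FROM orders "
    "GROUP BY region ORDER BY region",
)
```

The same result set could be serialized to JSON instead by pairing each row with the header names; CSV is shown because it needs no extra structure.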