Showing 3125 open source projects for "data"

  • 1
    pywebview

    Build a GUI for your Python program with JavaScript, HTML, and CSS

    pywebview is a lightweight cross-platform wrapper around a webview component that lets you display HTML content in its own native GUI window. It gives you the power of web technologies in your desktop application while hiding the fact that the GUI is browser-based. You can use pywebview either with a lightweight web framework like Flask or Bottle, or on its own with a two-way bridge between Python and the DOM. pywebview uses a native GUI toolkit to create the web component window: WinForms on Windows, Cocoa on macOS...
    Downloads: 1 This Week
    See Project
  • 2
    Weights and Biases

    Tool for visualizing and tracking your machine learning experiments

    ...Focus on the interesting ML. Spend less time manually tracking results in spreadsheets and text files. Capture dataset versions with W&B Artifacts to identify how changing data affects your resulting models. Reproduce any model, with saved code, hyperparameters, launch commands, input data, and resulting model weights. Set wandb.config once at the beginning of your script to save your hyperparameters, input settings (like dataset name or model type), and any other independent variables for your experiments. ...
    Downloads: 1 This Week
    See Project
  • 3
    SafeClaw

    Chat with it via text and voice

    SafeClaw is an open-source, entirely local alternative to cloud-based AI assistants like OpenClaw, enabling users to build a personal assistant that runs on their own machine without incurring API usage charges or exposing data to third-party services. It emphasizes privacy and predictability by using traditional programming, rule-based intent parsing, and established machine learning tools rather than large language models, so there are no per-token API costs and its behavior is deterministic. The assistant offers features such as voice control using fully local speech-to-text (Whisper) and text-to-speech (Piper) capabilities, news aggregation with extractive summarization, and smart home or Bluetooth device control. ...
    Downloads: 16 This Week
    See Project
  • 4
    ESPnet

    End-to-end speech processing toolkit

    ESPnet is a comprehensive end-to-end speech processing toolkit covering a wide spectrum of tasks, including automatic speech recognition (ASR), text-to-speech (TTS), speech translation (ST), speech enhancement, speaker diarization, and spoken language understanding. It uses PyTorch as its deep learning engine and adopts a Kaldi-style data processing pipeline for features, data formats, and experimental recipes. This combination allows researchers to leverage modern neural architectures while still benefiting from the robust data preparation practices developed in the speech community. ESPnet provides many ready-to-run recipes for popular academic benchmarks, making it straightforward to reproduce published results or to use them as baselines for new research. ...
    Downloads: 0 This Week
    See Project
  • 5
    Markdown package LaTeX

    Package for converting and rendering markdown documents in TeX

    The Markdown package converts CommonMark markup to TeX commands. The functionality is provided both as a Lua module and as plain TeX, LaTeX, and ConTeXt macro packages that can be used to directly typeset TeX documents containing markdown markup. Unlike other converters, the Markdown package does not require any external programs and makes it easy to redefine how each and every markdown element is rendered. Creative abuse of the markdown syntax is encouraged.
    Downloads: 0 This Week
    See Project
  • 6
    mistletoe

    A fast, extensible and spec-compliant Markdown parser in pure Python

    mistletoe is a Markdown parser in pure Python, designed to be fast, spec-compliant and fully customizable. Apart from being the fastest CommonMark-compliant Markdown parser implementation in pure Python, mistletoe also supports easy definitions of custom tokens. Parsing Markdown into an abstract syntax tree also allows us to swap out renderers for different output formats, without touching any of the core components.
    Downloads: 0 This Week
    See Project
  • 7
    Remarshal

    Convert between CBOR, JSON, MessagePack, TOML, and YAML

    Convert between CBOR, JSON, MessagePack, TOML, and YAML. When installed, provides the command-line command remarshal as well as the short commands {cbor,json,msgpack,toml,yaml}2{cbor,json,msgpack,toml,yaml}. You can perform format conversion, reformatting, and error detection using these commands. CBOR, MessagePack, and YAML with binary fields cannot be converted to JSON or TOML. Binary fields are converted between CBOR, MessagePack, and YAML.
    Downloads: 0 This Week
    See Project
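The restriction on binary fields in the Remarshal entry above follows from the target format: JSON has no byte-string type. The sketch below uses only Python's standard `json` and `base64` modules, not Remarshal itself, to show the failure. The Base64 workaround and the helper names `to_json_strict` and `to_json_base64` are hypothetical illustrations, not something the description says Remarshal does.

```python
import base64
import json

# A record with a binary field, as might come from CBOR or MessagePack.
record = {"name": "blob", "payload": b"\x00\x01\xff"}


def to_json_strict(obj):
    """Serialize with the stock encoder; bytes values raise TypeError."""
    return json.dumps(obj)


def to_json_base64(obj):
    """Hypothetical workaround: encode bytes values as Base64 text."""
    def encode(value):
        if isinstance(value, bytes):
            return base64.b64encode(value).decode("ascii")
        raise TypeError(f"cannot serialize {type(value).__name__}")
    return json.dumps(obj, default=encode)


try:
    to_json_strict(record)
    strict_ok = True
except TypeError:
    strict_ok = False  # JSON cannot represent the bytes value directly

encoded = to_json_base64(record)
```

The Base64 route changes the field's type from bytes to text, so a round trip no longer returns the original data unchanged. That loss is one plausible reason a converter might refuse the conversion outright.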
  • 8
    Petastorm

    Petastorm library enables single-machine or distributed training

    ...On top of a Parquet schema, petastorm also stores higher-level schema information that makes multidimensional arrays a native part of a petastorm dataset. Petastorm supports extensible data codecs. These enable a user to use one of the standard data compressions (jpeg, png) or implement their own.
    Downloads: 0 This Week
    See Project
  • 9
    Eigent

    The Open Source Cowork Desktop to Unlock Your Exceptional Productivity

    ...Built on the CAMEL-AI multi-agent framework, Eigent emphasizes productivity, flexibility, and transparent system design. You can run Eigent fully locally for maximum privacy and data control, or choose a cloud-connected experience for quick access. The platform supports a wide range of AI models and integrates powerful tools through the Model Context Protocol (MCP). With human-in-the-loop controls and enterprise-ready features, Eigent balances automation with oversight and security.
    Downloads: 3 This Week
    See Project
  • 10
    LLMs-from-scratch

    Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

    ...The focus is on readability, correctness, and experimentation, making it ideal for students and practitioners transitioning from theory to working systems. By the end, you have a grounded sense of how data pipelines, optimization, and inference interact to produce fluent text.
    Downloads: 3 This Week
    See Project
  • 11
    Mathematics Dataset

    This dataset code generates mathematical question and answer pairs

    ...Version 1.0 includes over 2 million examples per category, with training splits labeled as “easy,” “medium,” and “hard,” supporting curriculum-based learning strategies. The data can be accessed via PyPI or generated locally using provided Python scripts, with outputs formatted for direct use in training or evaluation pipelines.
    Downloads: 3 This Week
    See Project
  • 12
    CogVideo

    Text- and image-to-video generation: CogVideoX (2024) and CogVideo

    ...The repo includes SAT and Diffusers implementations, turnkey demos, and fine-tuning pipelines (including LoRA) designed to run across a wide range of NVIDIA GPUs, from desktop cards (e.g., RTX 3060) to data-center hardware (A100/H100). Current releases cover CogVideoX-2B, CogVideoX-5B, and the upgraded CogVideoX1.5-5B variants, plus image-to-video (I2V) models, with options for BF16/FP16/FP32 and INT8 quantized inference via TorchAO for memory-constrained setups. The codebase emphasizes practical deployment: prompt-optimization utilities (LLM-assisted long-prompt expansion), Colab notebooks, a Gradio web app, and multiple performance knobs (tiling/slicing, CPU offload, torch.compile, multi-GPU, and FA3 backends via partner projects).
    Downloads: 21 This Week
    See Project
  • 13
    h2oGPT

    Private chat with local GPT with document, images, video, etc.

    ...It supports a variety of document types, including PDFs, Word files, images, video frames, and even audio, enabling users to query and analyze their documents or engage in a private chat with AI. The platform is designed to be secure and offline, ensuring that all data remains private and under the user's control. h2oGPT supports several AI models, including oLLaMa and Mixtral, making it a flexible tool for anyone needing advanced document analysis and AI-driven conversation in a secure, local setup.
    Downloads: 3 This Week
    See Project
  • 14
    Inter

    The Inter font family

    Inter is a typeface carefully crafted and designed for computer screens. Inter features a tall x-height to aid the readability of mixed-case and lower-case text. Several OpenType features are provided as well, such as contextual alternates that adjust punctuation depending on the shape of surrounding glyphs, a slashed zero for when you need to disambiguate "0" from "o", tabular numbers, etc. Using Inter is as easy as downloading and installing the font files. There's of course no absolute right or...
    Downloads: 21 This Week
    See Project
  • 15
    DuckDuckGo Android App

    Privacy browser for Android

    DuckDuckGo is an app that gives you utmost privacy when browsing online. It stops you from getting tracked and protects your personal and private information, no matter where the internet may take you. Apart from providing standard browsing functionality, DuckDuckGo blocks all hidden third-party trackers, forces sites to use an encrypted connection where available, and provides a Privacy Grade rating for each website you visit.
    Downloads: 8 This Week
    See Project
  • 16
    Streamline Analyst

    AI agent that streamlines the entire process of data analysis

    Streamline Analyst is an open-source application powered by Large Language Models (LLMs) and designed to streamline data analysis. This Data Analysis Agent automates tasks such as data cleaning and preprocessing, as well as more complex operations like identifying target objects, partitioning test sets, and selecting the best-fit models based on your data. Streamline Analyst also handles results visualization and evaluation.
    Downloads: 0 This Week
    See Project
  • 17
    SpeechRecognition

    Speech recognition module for Python

    A library for performing speech recognition, with support for several engines and APIs, online and offline. It can recognize speech input from the microphone, transcribe an audio file, and save audio data to an audio file. It can also show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details), and listen to a microphone in the background, among other useful recognizer features. The easiest way to install it is with pip install SpeechRecognition. ...
    Downloads: 15 This Week
    See Project
  • 18
    Optopsy

    A nimble options backtesting library for Python

    ...The csv_data() function is a convenience function. Under the hood it uses pandas' read_csv() function to do the import. Other parameters can help with loading the CSV data; consult the code or future documentation to see how to use them. Optopsy is a small, simple library that offloads the heavy work of backtesting option strategies. The API is designed to be simple and easy to fit into your regular pandas data analysis workflow. As such, we just need to call the long_calls() function to have Optopsy generate all combinations of a simple long call strategy for the specified time period and return a DataFrame. ...
    Downloads: 1 This Week
    See Project
  • 19
    Hamilton DAGWorks

    Helps scientists define testable, modular, self-documenting dataflow

    Hamilton is a lightweight Python library for directed acyclic graphs (DAGs) of data transformations. Your DAG is portable; it runs anywhere Python runs, whether it's a script, notebook, Airflow pipeline, FastAPI server, etc. Your DAG is expressive; Hamilton has extensive features to define and modify the execution of a DAG (e.g., data validation, experiment tracking, remote execution). To create a DAG, write regular Python functions that specify their dependencies with their parameters. ...
    Downloads: 0 This Week
    See Project
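The idea in the Hamilton entry above, that plain functions declare their dependencies through their parameter names, can be sketched with the standard library alone. This is not Hamilton's API: the `resolve` helper, the example functions, and the input names are all hypothetical, and the sketch only illustrates the dependency-by-parameter-name convention.

```python
import inspect


# Each function is a node; its parameter names are the nodes it depends on.
def total_spend(online_spend, store_spend):
    return online_spend + store_spend


def avg_spend(total_spend, num_customers):
    return total_spend / num_customers


def resolve(name, functions, inputs, cache=None):
    """Compute node `name`, recursively resolving its parameters first."""
    cache = {} if cache is None else cache
    if name in inputs:          # leaf value supplied by the caller
        return inputs[name]
    if name not in cache:       # compute each node at most once
        func = functions[name]
        params = inspect.signature(func).parameters
        kwargs = {p: resolve(p, functions, inputs, cache) for p in params}
        cache[name] = func(**kwargs)
    return cache[name]


functions = {f.__name__: f for f in (total_spend, avg_spend)}
inputs = {"online_spend": 120.0, "store_spend": 80.0, "num_customers": 4}
result = resolve("avg_spend", functions, inputs)
```

Because the graph is derived from function signatures, each node can be unit-tested as an ordinary function, which is the testability the entry's tagline refers to.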
  • 20
    Python JSONPath Next-Generation

    JSONPath implementation for Python that aims to be standard compliant

    A final implementation of JSONPath for Python that aims to be standard compliant, including arithmetic and binary comparison operators, as defined in the original JSONPath proposal. This package merges both jsonpath-rw and jsonpath-rw-ext and provides several AST API enhancements, such as the ability to update or remove nodes in the tree. This library provides a robust and significantly extended implementation of JSONPath for Python. It is tested with CPython 3.7 and higher. This library...
    Downloads: 0 This Week
    See Project
  • 21
    changedetection.io

    The best free open source website change detection and restock service

    Loved by smart shoppers, data journalists, research engineers, data scientists, security researchers, and more. It ranges from simple monitoring of web pages for changes (such as price watching and restock notifications) to deep inspection such as PDF text support, JSON and XML monitoring, and extensive text triggers. Monitor out-of-stock products and get restock alerts via Discord, Slack, email, and many other platforms when those products are back in stock. ...
    Downloads: 4 This Week
    See Project
  • 22
    simplejson

    simplejson is a simple, fast, extensible JSON encoder/decoder

    simplejson is a simple, fast, complete, correct and extensible JSON <http://json.org> encoder and decoder for Python 3.3+ with legacy support for Python 2.5+. It is pure Python code with no dependencies but includes an optional C extension for a serious speed boost. simplejson is the externally maintained development version of the json library included with Python (since 2.6). This version is tested with the latest Python 3.8 and maintains backward compatibility with Python 3.3+ and the...
    Downloads: 0 This Week
    See Project
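Because simplejson is the externally maintained version of the standard library's json module, the two share the core `dumps`/`loads` interface. A common pattern, sketched here with a made-up `payload`, is to prefer simplejson when it is installed and fall back to the standard library otherwise.

```python
try:
    import simplejson as json  # optional third-party package, if installed
except ImportError:
    import json  # standard library fallback with the same core API

# Hypothetical data used only for illustration.
payload = {"name": "example", "tags": ["a", "b"], "count": 3}

text = json.dumps(payload, sort_keys=True)  # serialize with stable key order
restored = json.loads(text)                 # parse back into Python objects
```

The rest of the program only refers to the name `json`, so it does not need to know which implementation was imported.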
  • 23
    Linkedin Scraper

    A library that scrapes LinkedIn for user data

    Linkedin Scraper is a library that scrapes LinkedIn for user data. Versions 2.0.0 and earlier are called linkedin_user_scraper and can be installed via pip3 install --user linkedin_user_scraper. The reason is that LinkedIn has recently blocked people from viewing certain profiles without having previously signed in. So with scrape=False, the library doesn't automatically scrape the profile, but Chrome will still open the LinkedIn page.
    Downloads: 0 This Week
    See Project
  • 24
    novelWriter

    Open source plain text editor designed for writing novels

    ...The project storage is suitable for version control software and well suited to file synchronisation tools. All text is saved as plain text files with a metadata header. The core project structure is stored in a single project XML file. Other metadata is primarily saved as JSON files.
    Downloads: 2 This Week
    See Project
  • 25
    autopep8

    A tool that automatically formats Python code to conform to the PEP 8 style guide

    autopep8 automatically formats Python code to conform to the PEP 8 style guide. It uses the pycodestyle utility to determine what parts of the code need to be formatted. autopep8 is capable of fixing most of the formatting issues that can be reported by pycodestyle. Correct deprecated or non-idiomatic Python code (via lib2to3). Use this for making Python 2.7 code more compatible with Python 3. Put a blank line between a class docstring and its first method declaration. Remove blank lines...
    Downloads: 1 This Week
    See Project