Showing 184 open source projects for "data"

  • 1
    AugLy

    A data augmentations library for audio, image, text, and video

    ...AugLy is a great library to utilize for augmenting your data in model training, or to evaluate the robustness gaps of your model! We designed AugLy to include many specific data augmentations that users perform in real life on internet platforms like Facebook's -- for example making an image into a meme, overlaying text/emojis on images/videos, reposting a screenshot from social media. While AugLy contains more generic data augmentations as well, it will be particularly useful to you if you're working on a problem like copy detection, hate speech detection, etc.
    Downloads: 0 This Week
  • 2
    Binarytree

    Python library for studying Binary Trees

    Binarytree is a Python library that lets you generate, visualize, inspect and manipulate binary trees. Skip the tedious work of setting up test data, and dive straight into practicing algorithms. Heaps and BSTs (binary search trees) are also supported. Binarytree supports another representation which is more compact but without the indexing properties. Traverse trees using different algorithms.
    Downloads: 0 This Week
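
As a rough illustration of the kind of exercise the Binarytree entry describes, here is a minimal standard-library sketch of a level-order list representation, where the children of index `i` sit at `2*i + 1` and `2*i + 2`. It does not use Binarytree's own API; the names `child_indices`, `get`, `is_min_heap`, and `inorder` are hypothetical helpers chosen for this example.

```python
from typing import List, Optional, Tuple

# Level-order list; None marks a missing node (an assumption of this sketch).
Tree = List[Optional[int]]


def child_indices(i: int) -> Tuple[int, int]:
    """Return the list positions of the left and right children of index i."""
    return 2 * i + 1, 2 * i + 2


def get(tree: Tree, i: int) -> Optional[int]:
    """Return the value at index i, or None if the slot is absent."""
    return tree[i] if 0 <= i < len(tree) else None


def is_min_heap(tree: Tree) -> bool:
    """Check heap order only: every present node is <= its present children.

    Completeness of the tree is not checked here.
    """
    for i, value in enumerate(tree):
        if value is None:
            continue
        for c in child_indices(i):
            child = get(tree, c)
            if child is not None and child < value:
                return False
    return True


def inorder(tree: Tree, i: int = 0) -> List[int]:
    """Return values in left, node, right order starting from index i."""
    if get(tree, i) is None:
        return []
    left, right = child_indices(i)
    return inorder(tree, left) + [tree[i]] + inorder(tree, right)


example: Tree = [1, 3, 2, 7, None, 5]
print(is_min_heap(example))
print(inorder(example))
```

The index arithmetic is what makes this layout convenient for heaps; a more compact layout that skips the `None` placeholders would lose it.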
  • 3
    Interpret-Text

    State-of-the-art explainers for text-based machine learning models

    A library that incorporates state-of-the-art explainers for text-based machine learning models and visualizes the result with a built-in dashboard. Interpret-Text builds on Interpret, an open source Python package for training interpretable models and helping to explain blackbox machine learning systems. We have added extensions to support text models. Interpret-Text incorporates community-developed interpretability techniques for NLP models and a visualization dashboard to view the results....
    Downloads: 0 This Week
  • 4
    TensorFlow Examples

    TensorFlow Tutorial and Examples for Beginners (support TF v1 & v2)

    ...For clarity and educational value, each example is accompanied by explanatory comments or markdown cells to illustrate what the code does and why — a design that makes it especially suitable for self-learners or students following along with real data. Besides raw implementations, the repo often shows best practices using higher-level constructs (e.g. dataset pipelines, estimators, layers) which reflect modern TensorFlow workflows rather than only textbook-style code.
    Downloads: 0 This Week
  • 5
    PyTorchVideo

    A deep learning library for video understanding research

    ...The library includes efficient implementations of state-of-the-art architectures such as SlowFast, X3D, and MViT, optimized for both research prototyping and production inference. It supports video I/O pipelines, data augmentation, distributed training, and mixed precision computation for large-scale experiments. PyTorchVideo also connects seamlessly with other Meta AI tools such as Detectron2 and PyTorch3D for multimodal video analysis. Designed to accelerate research and deployment, it serves as a unified framework for reproducible, high-performance video AI development.
    Downloads: 0 This Week
  • 6
    ReinventCommunity

    Jupyter Notebook tutorials for REINVENT 3.2

    This repository is a collection of useful Jupyter notebooks, code snippets and example JSON files illustrating the use of Reinvent 3.2.
    Downloads: 0 This Week
  • 7
    Arraymancer

    A fast, ergonomic and portable tensor library in Nim

    Arraymancer is a tensor and deep learning library for the Nim programming language, designed for high-performance numerical computations and machine learning applications.
    Downloads: 0 This Week
  • 8
    Big List of Naughty Strings

    List of strings which have a high probability of causing issues

    The Big List of Naughty Strings is a community-maintained catalog of “gotcha” inputs that commonly break software, from unusual Unicode to SQL and script injection payloads. It exists so developers and QA engineers can easily test edge cases that normal test data would miss, such as zero-width characters, right-to-left marks, emojis, foreign alphabets, and long or malformed strings. By throwing these strings at forms, APIs, databases, and UIs, teams can discover encoding bugs, sanitizer gaps, rendering issues, and security oversights early. The list is language-agnostic and repository-friendly, meaning you can consume it from CI pipelines or local scripts with minimal setup. ...
    Downloads: 0 This Week
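
The workflow described in the Big List of Naughty Strings entry, feeding awkward inputs to a piece of code and collecting what breaks, might look roughly like the sketch below. The handful of sample strings is hand-picked for illustration and is not taken from the list itself; `find_failures`, `json_round_trip`, and `ascii_round_trip` are hypothetical names.

```python
import json
from typing import Callable, List, Tuple

# A few illustrative inputs in the spirit of the list (not copied from it).
SAMPLE_STRINGS = [
    "",                           # empty string
    "\u200b",                     # zero-width space
    "\u200fabc",                  # right-to-left mark followed by text
    "\U0001F600",                 # emoji
    "' OR '1'='1",                # SQL injection style payload
    "<script>alert(1)</script>",  # script injection style payload
    "A" * 10_000,                 # long string
]


def find_failures(check: Callable[[str], bool],
                  inputs: List[str]) -> List[Tuple[str, str]]:
    """Run check on every input; collect inputs that raise or return False."""
    failures = []
    for s in inputs:
        try:
            ok = check(s)
        except Exception as exc:  # broad on purpose: any crash is a finding
            failures.append((s, repr(exc)))
        else:
            if not ok:
                failures.append((s, "check returned False"))
    return failures


def json_round_trip(s: str) -> bool:
    """The string should survive JSON encoding and decoding unchanged."""
    return json.loads(json.dumps(s)) == s


def ascii_round_trip(s: str) -> bool:
    """Deliberately fragile: assumes every input is plain ASCII."""
    return s.encode("ascii").decode("ascii") == s


print(len(find_failures(json_round_trip, SAMPLE_STRINGS)))
print(len(find_failures(ascii_round_trip, SAMPLE_STRINGS)))
```

In a CI pipeline the same loop could read the inputs from a file in the repository instead of a hard-coded list.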
  • 9
    earthengine-py-notebooks

    A collection of 360+ Jupyter Python notebook examples

    ...Users can quickly adapt the examples for their own remote sensing, environmental monitoring, or spatial data science projects, and can run the code in environments like Google Colab.
    Downloads: 1 This Week
  • 10
    BMC

    Notes on Scientific Computing for Biomechanics

    This repository is a collection of lecture notes and code on scientific computing and data analysis for Biomechanics and Motor Control.
    Downloads: 0 This Week
  • 11
    CNN for Image Retrieval
    ...The repository provides implementations of CNN-based methods to extract feature representations from images and use them for similarity-based retrieval. It focuses on applying deep learning techniques to improve upon traditional handcrafted descriptors by learning features directly from data. The code includes training and evaluation scripts that can be adapted for custom datasets, making it useful for experimenting with retrieval systems in computer vision. By leveraging CNN architectures, the project showcases how learned embeddings can capture semantic similarity across varied images. This resource serves as both an educational reference and a foundation for further exploration in image retrieval research.
    Downloads: 1 This Week
  • 12
    BeaEngine 5

    BeaEngine disasm project

    BeaEngine is a C library designed to decode instructions from 16-bit, 32-bit and 64-bit Intel architectures. It includes the standard instruction set and the instruction sets from the FPU, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, VMX, CLMUL, AES, MPX, AVX, AVX2, AVX512 (VEX & EVEX prefixes), CET, BMI1, BMI2, SGX, UINTR, KL, TDX and AMX extensions. If you want to analyze malicious code and, more generally, obfuscated code, BeaEngine sends back a complex structure that describes precisely the...
    Downloads: 3 This Week
  • 13
    NLP Architect

    A model library for exploring state-of-the-art deep learning

    ...The library is designed to be a tool for model development: pre-processing data, then building, training, validating, running inference with, saving, or loading a model.
    Downloads: 0 This Week
  • 14
    fastNLP

    fastNLP: A Modularized and Extensible NLP Framework

    fastNLP is a lightweight framework for natural language processing (NLP) whose goal is to quickly implement NLP tasks and build complex models. A unified Tabular data container simplifies data preprocessing. Built-in Loader and Pipe support multiple datasets, eliminating the need for preprocessing code. It offers various convenient NLP tools, such as Embedding loading (including ELMo and BERT) and an intermediate data cache, and provides a variety of neural network components and recurrence models (covering tasks such as Chinese word segmentation, named entity recognition, syntactic analysis, text classification, text matching, metaphor resolution, summarization, etc.). ...
    Downloads: 0 This Week
  • 15
    gradslam

    gradslam is an open source differentiable dense SLAM library

    ...The question of “representation” is central in the context of dense simultaneous localization and mapping (SLAM). Newer learning-based approaches have the potential to leverage data or task performance to directly inform the choice of representation. However, learning representations for SLAM has been an open question, because traditional SLAM systems are not end-to-end differentiable. In this work, we present gradSLAM, a differentiable computational graph take on SLAM. Leveraging the automatic differentiation capabilities of computational graphs, gradSLAM enables the design of SLAM systems that allow for gradient-based learning across each of their components, or the system as a whole.
    Downloads: 0 This Week
  • 16
    Zipline

    Zipline, a Pythonic algorithmic trading library

    ...Zipline is currently used in production as the backtesting and live-trading engine powering Quantopian -- a free, community-centered, hosted platform for building and executing trading strategies. Quantopian also offers a fully managed service for professionals that includes Zipline, Alphalens, Pyfolio, FactSet data, and more. Installing Zipline is slightly more involved than the average Python package. For a development installation (used to develop Zipline itself), create and activate a virtualenv, then run the etc/dev-install script. Please note that Zipline is not a community-led project. Zipline is maintained by the Quantopian engineering team, and we are quite small and often busy.
    Downloads: 0 This Week
  • 17
    ebfformat

    An Efficient Binary data Format

    ...It is also designed to simplify the programming of input/output routines in different programming languages. In a nutshell, an EBF file is a collection of data objects. Each data object is specified by a unique name, and a single file can have multiple data objects. Each data object is preceded by a metadata block, or header, that describes the binary data associated with it. Among other things, this header allows the files to be portable across systems with different endianness.
    Downloads: 4 This Week
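
The idea in the ebfformat entry of a header that makes binary data portable across byte orders can be sketched with Python's `struct` module. The layout below (a marker word, a name length, and a value count, followed by the name and 64-bit floats) is invented for this illustration and is not the actual EBF header; `ENDIAN_TEST`, `pack_object`, and `unpack_object` are hypothetical names.

```python
import struct
from typing import List, Tuple

# Hypothetical marker: reads back as a different number if the byte order is wrong.
ENDIAN_TEST = 0x01020304
HEADER_FIELDS = "III"  # marker, name length in bytes, number of values


def pack_object(name: str, values: List[float], byteorder: str = "<") -> bytes:
    """Serialize one named array of doubles with a small descriptive header."""
    name_bytes = name.encode("utf-8")
    header = struct.pack(byteorder + HEADER_FIELDS,
                         ENDIAN_TEST, len(name_bytes), len(values))
    body = struct.pack(f"{byteorder}{len(values)}d", *values)
    return header + name_bytes + body


def unpack_object(blob: bytes) -> Tuple[str, List[float]]:
    """Read an object back, detecting the writer's byte order from the marker."""
    for byteorder in ("<", ">"):
        (marker,) = struct.unpack_from(byteorder + "I", blob, 0)
        if marker == ENDIAN_TEST:
            break
    else:
        raise ValueError("unrecognized byte-order marker")

    _, name_len, count = struct.unpack_from(byteorder + HEADER_FIELDS, blob, 0)
    offset = struct.calcsize(byteorder + HEADER_FIELDS)
    name = blob[offset:offset + name_len].decode("utf-8")
    offset += name_len
    values = list(struct.unpack_from(f"{byteorder}{count}d", blob, offset))
    return name, values


# The reader recovers the same data whichever byte order the writer used.
for order in ("<", ">"):
    print(unpack_object(pack_object("positions", [1.0, 2.5, -3.0], order)))
```

Because the reader inspects the marker before interpreting anything else, a file written on a big-endian machine can be read on a little-endian one without the caller knowing where it came from.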
  • 18
    MMdnn

    Tools to help users inter-operate among deep learning frameworks

    MMdnn is a set of tools to help users inter-operate among different deep learning frameworks, e.g. through model conversion and visualization. Convert models between Caffe, Keras, MXNet, TensorFlow, CNTK, PyTorch, ONNX and CoreML. MMdnn is a comprehensive and cross-framework tool to convert, visualize and diagnose deep learning (DL) models. The "MM" stands for model management, and "dnn" is the acronym of deep neural network. We implement a universal converter to convert DL models between frameworks,...
    Downloads: 0 This Week
  • 19
    Forecasting Best Practices

    Time Series Forecasting Best Practices & Examples

    ...Rather than creating implementations from scratch, we draw from existing state-of-the-art libraries and build additional utilities around processing and featurizing the data, optimizing and evaluating models, and scaling up to the cloud. The examples and best practices are provided as Python Jupyter notebooks and R markdown files and a library of utility functions.
    Downloads: 0 This Week
  • 20
    Brand new cheatsheets and handouts

    Matplotlib 3.1 cheat sheet

    ...It lays out common use cases (plot types, styling, figure configuration, saving/exporting, subplot layout, etc.) in a concise and organized format — often serving as a “cheat sheet” for rapid look-up. For practitioners working on data-heavy projects, dashboards, or research code where plotting is frequent, it helps speed up development by reducing context-switching and documentation navigation overhead. It is especially useful when you know roughly what you want (e.g. “I need a scatter + histogram marginal plot”) but don’t remember the exact Matplotlib call.
    Downloads: 1 This Week
  • 21
    Tensor2Tensor

    Library of deep learning models and datasets

    Deep Learning (DL) has enabled the rapid advancement of many useful technologies, such as machine translation, speech recognition and object detection. In the research community, one can find code open-sourced by the authors to help in replicating their results and further advancing deep learning. However, most of these DL systems use unique setups that require significant engineering effort and may only work for a specific problem or architecture, making it hard to run new experiments and...
    Downloads: 1 This Week
  • 22
    Alfred-Workflow

    Full-featured library for writing Alfred 3 & 4 workflows

    Alfred-Workflow is a Python helper library for Alfred 2, 3 and 4 workflow authors, developed and hosted on GitHub. Alfred workflows typically take user input, fetch data from the Web or elsewhere, filter them and display results to the user. Alfred-Workflow takes care of a lot of the details for you, allowing you to concentrate your efforts on your workflow’s functionality. Alfred-Workflow supports macOS 10.7+ (Python 2.7). Easily launch background tasks (daemons) to keep your workflow responsive. ...
    Downloads: 0 This Week
  • 23
    Pinject

    A pythonic dependency injection library

    ...Because bindings are just Python functions and classes, refactoring remains straightforward and the DI graph is easy to reason about. Pinject is particularly useful for medium-to-large services where configuration, logging, data clients, and business logic need clean separation without resorting to manual plumbing.
    Downloads: 1 This Week
  • 24
    gditools

    A Python program/library aimed at GD-ROM image files.

    This Python program/library is designed to handle GD-ROM image (GDI) files. It can be used to list files, extract data, generate a sorttxt file, extract the bootstrap (IP.BIN) file and more. This project can be used in standalone mode, in interactive mode or as a library in another Python program (check the 'addons' folder to learn how). For your convenience, you can use the gditools.py GUI program supplied in the Files section (optional). To use this project you must install the Python 2.7.x branch release binaries. ...
    Downloads: 16 This Week
  • 25
    Graph Nets library

    Build Graph Nets in Tensorflow

    Graph Nets, developed by Google DeepMind, is a Python library designed for constructing and training graph neural networks (GNNs) using TensorFlow and Sonnet. It provides a high-level, flexible framework for building neural architectures that operate directly on graph-structured data. A graph network takes graphs as inputs, consisting of edges, nodes, and global attributes, and produces updated graphs with modified feature representations at each level. This library implements the foundational ideas from DeepMind’s paper “Relational Inductive Biases, Deep Learning, and Graph Networks”, offering tools to explore relational reasoning and message-passing neural networks. ...
    Downloads: 2 This Week
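
The graph-in, graph-out pattern described in the Graph Nets entry can be sketched in plain Python. This is not the Graph Nets API: the `Graph` dataclass and `graph_network_step` are hypothetical, features are single floats, and simple sums stand in for the learned update functions a real graph network would use.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Graph:
    nodes: List[float]     # one feature per node
    edges: List[float]     # one feature per edge
    senders: List[int]     # source node index of each edge
    receivers: List[int]   # target node index of each edge
    globals_: float        # a single graph-level feature


def graph_network_step(g: Graph) -> Graph:
    """One message-passing step: update edges, then nodes, then the global."""
    # 1. Edge update: each edge sees its own feature, both endpoints, and the global.
    new_edges = [
        e + g.nodes[s] + g.nodes[r] + g.globals_
        for e, s, r in zip(g.edges, g.senders, g.receivers)
    ]

    # 2. Node update: each node sums its incoming updated edges.
    incoming = [0.0] * len(g.nodes)
    for e, r in zip(new_edges, g.receivers):
        incoming[r] += e
    new_nodes = [n + agg + g.globals_ for n, agg in zip(g.nodes, incoming)]

    # 3. Global update: aggregate over all updated edges and nodes.
    new_globals = g.globals_ + sum(new_edges) + sum(new_nodes)

    # Connectivity is unchanged; only the features are updated.
    return Graph(new_nodes, new_edges, list(g.senders), list(g.receivers), new_globals)


graph = Graph(nodes=[1.0, 2.0], edges=[0.5], senders=[0], receivers=[1], globals_=0.0)
print(graph_network_step(graph))
```

The output has the same structure as the input, so steps of this kind can be chained or stacked.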