Showing 3125 open source projects for "data"

View related business solutions
  • Build AI Apps with Gemini 3 on Vertex AI Icon
    Build AI Apps with Gemini 3 on Vertex AI

    Access Google’s most capable multimodal models. Train, test, and deploy AI with 200+ foundation models on one platform.

    Vertex AI gives developers access to Gemini 3—Google’s most advanced reasoning and coding model—plus 200+ foundation models including Claude, Llama, and Gemma. Build generative AI apps with Vertex AI Studio, customize with fine-tuning, and deploy to production with enterprise-grade MLOps. New customers get $300 in free credits.
    Try Vertex AI Free
  • AI-powered service management for IT and enterprise teams Icon
    AI-powered service management for IT and enterprise teams

    Enterprise-grade ITSM, for every business

    Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity. Maximize operational efficiency with refreshingly simple, AI-powered Freshservice.
    Try it Free
  • 1
    MetaTransformer

    MetaTransformer

    Meta-Transformer for Unified Multimodal Learning

    We're thrilled to present OneLLM, an ensembling Meta-Transformer framework with Multimodal Large Language Models, which performs multimodal joint training, supports more modalities including fMRI, Depth, and Normal Maps, and demonstrates very impressive performances on 25 benchmarks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    ThoughtSource

    ThoughtSource

    A central, open resource for data and tools

    ThoughtSource is a central, open resource and community centered on data and tools for chain-of-thought reasoning in large language models (Wei 2022). Our long-term goal is to enable trustworthy and robust reasoning in advanced AI systems for driving scientific research and medical practice.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    hloc

    hloc

    Visual localization made easy with hloc

    ...This codebase won the indoor/outdoor localization challenges at CVPR 2020 and ECCV 2020, in combination with SuperGlue, our graph neural network for feature matching. We provide step-by-step guides to localize with Aachen, InLoc, and to generate reference poses for your own data using SfM. Just download the datasets and you're reading to go! The notebook pipeline_InLoc.ipynb shows the steps for localizing with InLoc. It's much simpler since a 3D SfM model is not needed. We show in pipeline_SfM.ipynb how to run 3D reconstruction for an unordered set of images. This generates reference poses, and a nice sparse 3D model suitable for localization with the same pipeline as Aachen.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    AI Explainability 360

    AI Explainability 360

    Interpretability and explainability of data and machine learning model

    ...The AI Explainability 360 interactive experience provides a gentle introduction to the concepts and capabilities by walking through an example use case for different consumer personas. The tutorials and example notebooks offer a deeper, data scientist-oriented introduction. The complete API is also available. There is no single approach to explainability that works best. There are many ways to explain: data vs. model, directly interpretable vs. post hoc explanation, local vs. global, etc. It may therefore be confusing to figure out which algorithms are most appropriate for a given use case.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Managed MySQL, PostgreSQL, and SQL Databases on Google Cloud Icon
    Managed MySQL, PostgreSQL, and SQL Databases on Google Cloud

    Get back to your application and leave the database to us. Cloud SQL automatically handles backups, replication, and scaling.

    Cloud SQL is a fully managed relational database for MySQL, PostgreSQL, and SQL Server. We handle patching, backups, replication, encryption, and failover—so you can focus on your app. Migrate from on-prem or other clouds with free Database Migration Service. IDC found customers achieved 246% ROI. New customers get $300 in credits plus a 30-day free trial.
    Try Cloud SQL Free
  • 5
    Lightning Flash

    Lightning Flash

    Flash enables you to easily configure and run complex AI recipes

    Your PyTorch AI Factory, Flash enables you to easily configure and run complex AI recipes for over 15 tasks across 7 data domains. In a nutshell, Flash is the production-grade research framework you always dreamed of but didn't have time to build. All data loading in Flash is performed via a from_* classmethod on a DataModule. Which DataModule to use and which from_* methods are available depends on the task you want to perform. For example, for image segmentation where your data is stored in folders, you would use the from_folders method of the SemanticSegmentationData class. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    scArches

    scArches

    Reference mapping for single-cell genomics

    Single-cell architecture surgery (scArches) is a package for reference-based analysis of single-cell data. scArches allows your single-cell query data to be analyzed by integrating it into a reference atlas. By mapping your data into an integrated reference you can transfer cell-type annotation from reference to query, identify disease states by mapping to healthy atlas, and advanced applications such as imputing missing data modalities or spatial locations.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    CausalNex

    CausalNex

    A Python library that helps data scientists to infer causation

    CausalNex is a Python library that uses Bayesian Networks to combine machine learning and domain expertise for causal reasoning. You can use CausalNex to uncover structural relationships in your data, learn complex distributions, and observe the effect of potential interventions.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    ED Software project contains several programs used (mostly) for processing gas-phase electron diffraction (GED) experimental data.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    MMOCR

    MMOCR

    OpenMMLab Text Detection, Recognition and Understanding Toolbox

    ...The toolbox supports a wide variety of state-of-the-art models for text detection, text recognition and key information extraction. The modular design of MMOCR enables users to define their own optimizers, data preprocessors, and model components such as backbones, necks and heads as well as losses. Please refer to Getting Started for how to construct a customized model. The toolbox provides a comprehensive set of utilities which can help users assess the performance of models. It includes visualizers which allow visualization of images, ground truths as well as predicted bounding boxes, and a validation tool for evaluating checkpoints.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Go from Data Warehouse to Data and AI platform with BigQuery Icon
    Go from Data Warehouse to Data and AI platform with BigQuery

    Build, train, and run ML models with simple SQL. Automate data prep, analysis, and predictions with built-in AI assistance from Gemini.

    BigQuery is more than a data warehouse—it's an autonomous data-to-AI platform. Use familiar SQL to train ML models, run time-series forecasts, and generate AI-powered insights with native Gemini integration. Built-in agents handle data engineering and data science workflows automatically. Get $300 in free credit, query 1 TB, and store 10 GB free monthly.
    Try BigQuery Free
  • 10
    PromethAI

    PromethAI

    Open-source framework that gives you AI Agents

    PromethAI-Backend is a backend framework for AI-driven automation and knowledge extraction. It is designed to integrate with large language models (LLMs) to provide AI-enhanced workflows, including content generation, summarization, and data analysis.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    UartVide

    UartVide

    A flat UI RS232 serial port communication utility.

    Mainly designed for embedded & software engineers, UartVide is a flat-UI and straightforward and lightweight RS232 serial port communication utility that allows you to configure the connection parameters and communicate via the port. UartVide runs on all platforms supported by PySide2 including Windows, Linux. (MyTerm was renamed to UartVide from version 2.4)
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    fastMRI

    fastMRI

    A large open dataset + tools to speed up MRI scans using ML

    ...By enabling reconstruction of high-fidelity MR images from significantly fewer measurements, fastMRI aims to make MRI scanning faster, cheaper, and more accessible in clinical settings. The repository provides an open-source PyTorch framework with data loaders, subsampling utilities, reconstruction models, and evaluation metrics, supporting both research reproducibility and practical experimentation. It includes reference implementations for key MRI reconstruction architectures such as U-Net and Variational Networks (VarNet), along with example scripts for model training and evaluation using the PyTorch Lightning framework. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    learn2learn

    learn2learn

    A PyTorch Library for Meta-learning Research

    ...It provides reusable components and meta-learning algorithms, making it easier to build, train, and evaluate models that can quickly adapt to new tasks with minimal data. Learn2Learn is widely used in research for tasks such as few-shot classification, reinforcement learning, and optimization.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Prime QA

    Prime QA

    State-of-the-art Multilingual Question Answering research

    ...By using PrimeQA, a researcher can replicate the experiments outlined in a paper published in the latest NLP conference while also enjoying the capability to download pre-trained models (from an online repository) and run them on their own custom data. PrimeQA is built on top of the Transformers toolkit and uses datasets and models that are directly downloadable.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    KubiScan

    KubiScan

    A tool to scan Kubernetes cluster for risky permissions

    A tool for scanning Kubernetes cluster for risky permissions in Kubernetes's Role-based access control (RBAC) authorization model. KubiScan helps cluster administrators identify permissions that attackers could potentially exploit to compromise the clusters. This can be especially helpful on large environments where there are lots of permissions that can be challenging to track. KubiScan gathers information about risky roles\clusterroles, rolebindings\clusterrolebindings, users and pods,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Metaseq

    Metaseq

    Repo for external large-scale work

    ...The framework was used internally at Meta to train models like OPT (Open Pre-trained Transformer) and serves as a reference implementation for scaling transformer architectures efficiently across GPUs and nodes. It supports both pretraining and fine-tuning workflows with data pipelines for text, multilingual corpora, and custom tokenization schemes. Metaseq also includes APIs for evaluation, generation, and model serving, enabling seamless transitions from training to inference.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    scraper-with-chatgpt
    It is a powerful data scraping tool that helps you extract information from various online sources. Easily collect data from Google SERP, Maps, Shopify, Zillow, and more. With a user-friendly interface, you can scrape and save data in JSON or Excel formats. Unlock insights from the web effortlessly with scrape-it.cloud API.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    TexGen
    TexGen is a geometric textile modelling software package to be used for obtaining engineering properties of woven textiles and textile composites. Citing TexGen We would be grateful if you could acknowledge use of TexGen where appropriate and suggest using one of the following references: L P Brown and A C Long. "Modelling the geometry of textile reinforcements for composites: TexGen", Chapter 8 in "Composite reinforcements for optimum performance (Second Edition)", ed. P Boisse,...
    Leader badge
    Downloads: 57 This Week
    Last Update:
    See Project
  • 19
    ACORBA

    ACORBA

    Automated approach to measure root tip angles of Arabidopsis thaliana

    Gravitropic response is studied in most of the laboratories working with Arabidopsis thaliana, for example, to detect new phenotypes in mutants. However, manual analysis of images and microscopy data are known to be subjected to human bias. This is particularly the case for manual measurements of root bending as the angle is set subjectively. In this context, it is essential to develop and use automated or semi-automated image analysis to produce faster, reproducible, and unbiased data. In this context, we developped ACORBA (Automatic Calculation Of Root Bending Angles), a fully automated software to measure root bending angle over time.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    lora-svc

    lora-svc

    Singing voice change based on whisper, lora for singing voice clone

    singing voice change based on whisper, and lora for singing voice clone. You will feel the beauty of the code from this project. Uni-SVC main branch is for singing voice clone based on whisper with speaker encoder and speaker adapter. Uni-SVC main target is to develop lora for SVC. With lora, maybe clone a singer just need 10 stence after 10 minutes train. Each singer is a plug-in of the base model.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    langchain-prefect

    langchain-prefect

    Tools for using Langchain with Prefect

    ...We need to know details about how our apps work, even when we want to use tools with convenient abstractions that may obfuscate those details. Prefect is built to help data people build, run, and observe event-driven workflows wherever they want. It provides a framework for creating deployments on a whole slew of runtime environments (from Lambda to Kubernetes), and is cloud agnostic (best supports AWS, GCP, Azure). For this reason, it could be a great fit for observing apps that use LLMs. RecordLLMCalls is a ContextDecorator that can be used to track LLM calls made by Langchain LLMs as Prefect flows. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 22
    iJEPA

    iJEPA

    Official codebase for I-JEPA

    ...The design scales naturally with Vision Transformer backbones and flexible masking strategies, and it trains stably at large batch sizes. i-JEPA’s predictions are made in embedding space, which is computationally efficient and better aligned with downstream discrimination tasks. The repository provides training recipes, data pipelines, and evaluation code that clarify which masking patterns and architectural choices matter most.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Horovod

    Horovod

    Distributed training framework for TensorFlow, Keras, PyTorch, etc.

    ...Horovod can be installed on-premise or run out-of-the-box in cloud platforms, including AWS, Azure, and Databricks. Horovod can additionally run on top of Apache Spark, making it possible to unify data processing and model training into a single pipeline. Once Horovod has been configured, the same infrastructure can be used to train models with any framework, making it easy to switch between TensorFlow, PyTorch, MXNet, and future frameworks as machine learning tech stacks continue to evolve. Start scaling your model training with just a few lines of Python code. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    TensorFlow Ranking

    TensorFlow Ranking

    Learning to rank in TensorFlow

    ...Multi-item (also known as groupwise) scoring functions. LambdaLoss implementation for direct ranking metric optimization. Unbiased Learning-to-Rank from biased feedback data. We envision that this library will provide a convenient open platform for hosting and advancing state-of-the-art ranking models based on deep learning techniques, and thus facilitate both academic research and industrial applications. We provide a demo, with no installation required, to get started on using TF-Ranking. This demo runs on a colaboratory notebook, an interactive Python environment. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Tributary

    Tributary

    Streaming reactive and dataflow graphs in Python

    Tributary is a library for constructing dataflow graphs in Python. Unlike many other DAG libraries in Python (airflow, luigi, prefect, dagster, dask, kedro, etc), tributary is not designed with data/etl pipelines or scheduling in mind. Instead, tributary is more similar to libraries like mdf, loman, pyungo, streamz, or pyfunctional, in that it is designed to be used as the implementation for a data model. One such example is the greeks library, which leverages tributary to build data models for options pricing.
    Downloads: 0 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB