Showing 31534 open source projects for "data"

View related business solutions
  • Custom VMs From 1 to 96 vCPUs With 99.95% Uptime Icon
    Custom VMs From 1 to 96 vCPUs With 99.95% Uptime

    General-purpose, compute-optimized, or GPU/TPU-accelerated. Built to your exact specs.

    Live migration and automatic failover keep workloads online through maintenance. One free e2-micro VM every month.
    Try Free
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 1
    Azure Data Studio

    Azure Data Studio

    A data management tool that enables working with other SQL tools

    Azure Data Studio is a cross-platform database tool for data professionals who use on-premises and cloud data platforms on Windows, macOS, and Linux. Azure Data Studio offers a modern editor experience with IntelliSense, code snippets, source control integration, and an integrated terminal. It's engineered with the data platform user in mind, with the built-in charting of query result sets and customizable dashboards.
    Downloads: 363 This Week
    Last Update:
    See Project
  • 2
    Data Formulator

    Data Formulator

    Create rich visualizations with AI

    To create rich visualizations, data analysts often need to iterate back and forth among data processing and chart specification to achieve their goals. To achieve this, analysts need not only proficiency in data transformation and visualization tools but also efforts to manage the branching history consisting of many different versions of data and charts. Recent LLM-powered AI systems have greatly improved visualization authoring experiences, for example by mitigating manual data transformation barriers via LLMs' code generation ability. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 3
    Polymarket Data

    Polymarket Data

    Polymarket Data Retriever that fetches, processes, and structures data

    Polymarket Data is a comprehensive data engineering pipeline designed to collect, process, and structure trading activity from the Polymarket prediction market ecosystem into analyzable datasets. The system operates as a multi-stage pipeline that integrates data from both off-chain APIs and on-chain event sources, enabling users to reconstruct full trading activity including markets, order events, and executed trades.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 4
    data.table

    data.table

    Extends base R’s data for high-performance data manipulation

    data.table is an R package that extends base R’s data.frame for high-performance data manipulation. It offers concise syntax, blazing speed, and memory-efficient operations. It supports fast file reading/writing, joins, grouping, reshaping, and updates by reference. It is heavily used in large data workflows, big data in R, production pipelines, etc. Extremely efficient grouping/aggregation/summarization; can handle very large datasets (hundreds of millions to billions of rows) in memory (if available). ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Stop Storing Third-Party Tokens in Your Database Icon
    Stop Storing Third-Party Tokens in Your Database

    Auth0 Token Vault handles secure token storage, exchange, and refresh for external providers so you don't have to build it yourself.

    Rolling your own OAuth token storage can be a security liability. Token Vault securely stores access and refresh tokens from federated providers and handles exchange and renewal automatically. Connected accounts, refresh exchange, and privileged worker flows included.
    Try Auth0 for Free
  • 5
    Explorer

    Explorer

    Series (one-dimensional) and dataframes (two-dimensional)

    Explorer brings series (one-dimensional) and data frames (two-dimensional) to Elixir for fast data exploration.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Laravel Data

    Laravel Data

    Powerful data objects for Laravel

    This package enables the creation of rich data objects which can be used in various ways. Using this package you only need to describe your data once.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 7
    Orange Data Mining

    Orange Data Mining

    Orange: Interactive data analysis

    Open source machine learning and data visualization. Build data analysis workflows visually, with a large, diverse toolbox. Perform simple data analysis with clever data visualization. Explore statistical distributions, box plots and scatter plots, or dive deeper with decision trees, hierarchical clustering, heatmaps, MDS and linear projections. Even your multidimensional data can become sensible in 2D, especially with clever attribute ranking and selections. ...
    Downloads: 34 This Week
    Last Update:
    See Project
  • 8
    Profile Data

    Profile Data

    Analyze computation-communication overlap in V3/R1

    profile-data is a repository that publishes profiling traces and metrics from DeepSeek’s training and inference infrastructure (especially during DeepSeek-V3 / R1 experiments). The profiling data targets insights into computation-communication overlap, pipeline scheduling (e.g. DualPipe), and how MoE / EP / parallelism strategies interact in real systems.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    atk4/data

    atk4/data

    Data Access PHP Framework for SQL & high-latency databases

    ATK Data is a data persistence and modeling framework for PHP, developed as part of the Agile Toolkit. It provides a high-level abstraction for working with databases, making it easier to define and manipulate data models with minimal boilerplate code. It supports various SQL and NoSQL databases and integrates seamlessly with Agile UI and other PHP frameworks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Go From AI Idea to AI App Fast Icon
    Go From AI Idea to AI App Fast

    One platform to build, fine-tune, and deploy ML models. No MLOps team required.

    Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.
    Try Free
  • 10
    Form-Data

    Form-Data

    A module to create readable `"multipart/form-data"` streams

    A library to create readable "multipart/form-data" streams. Can be used to submit forms and file uploads to other web applications.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Data-Juicer

    Data-Juicer

    Data processing for and with foundation models

    Data-Juicer is an open-source data processing and augmentation framework designed to enhance the quality and diversity of datasets for machine learning tasks. It includes a modular pipeline for scalable data transformation.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Micronaut Data

    Micronaut Data

    Ahead of Time Data Repositories

    ...The problem is worse when combined with Hibernate which maintains its own meta-model as you end up with duplicate meta-models. Micronaut Data instead moves this model into the compiler. Both GORM and Spring Data use regular expressions and pattern matching in combination with runtime generated proxies to translate a method definition on a Java interface into a query at runtime. No such runtime translation exists in Micronaut Data and this work is carried out by the Micronaut compiler at compilation time.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    MDN data

    MDN data

    This repository contains general data for Web technologies

    This repository contains general data for Web technologies and is maintained by the MDN team at Mozilla.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Dynamic Data

    Dynamic Data

    Reactive collections based on Rx.Net

    ...However, typical applications are much more complicated and may apply a filter, transform the original dto and apply a sort. Even with these simple everyday operations, the complexity of the code is quickly magnified. Dynamic data has been developed to remove the tedious code of dynamically maintaining collections. It has grown to become functionally very rich with at least 60 collection-based operations which amongst other things enable filtering, sorting, grouping, joining different sources, transforms, binding, pagination, data virtualization, expiration, disposal management plus more.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    sq data wrangler

    sq data wrangler

    sq data wrangler

    sq is a command line tool that provides jq-style access to structured data sources: SQL databases, or document formats like CSV or Excel. sq executes jq-like queries, or database-native SQL. It can join across sources: join a CSV file to a Postgres table, or MySQL with Excel. sq outputs to a multitude of formats including JSON, Excel, CSV, HTML, Markdown and XML, and can insert query results directly to a SQL database. sq can also inspect sources to view metadata about the source structure (tables, columns, size). ...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 16
    AWS Data Wrangler

    AWS Data Wrangler

    Pandas on AWS, easy integration with Athena, Glue, Redshift, etc.

    An AWS Professional Service open-source python initiative that extends the power of Pandas library to AWS connecting DataFrames and AWS data-related services. Easy integration with Athena, Glue, Redshift, Timestream, OpenSearch, Neptune, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON, and EXCEL). Built on top of other open-source projects like Pandas, Apache Arrow and Boto3, it offers abstracted functions to execute usual ETL tasks like load/unload data from Data Lakes, Data Warehouses, and Databases. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 17
    License List Data

    License List Data

    Various data formats for the SPDX License List

    Various data formats for the SPDX License List including RDFa, HTML, Text, and JSON.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 18
    Synthetic Data Generator

    Synthetic Data Generator

    SDG is a specialized framework

    ...It also includes a data processing module capable of handling different data types, preprocessing columns, managing missing values, and converting formats automatically before model training.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    browser-compat-data

    browser-compat-data

    This repository contains compatibility data for Web technologies

    The browser-compat-data ("BCD") project contains machine-readable browser (and JavaScript runtime) compatibility data for Web technologies, such as Web APIs, JavaScript features, CSS properties, and more. Our goal is to document accurate compatibility data for Web technologies, so web developers may write cross-browser compatible websites more easily. BCD is used in web apps and software such as MDN Web Docs, CanIUse, Visual Studio Code, WebStorm and more.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    data-table-filters

    data-table-filters

    Faceted filters, sorting & infinite scroll for React data tables

    data-table-filters is a frontend utility designed to simplify the implementation of advanced filtering capabilities in data tables within web applications. It provides a set of reusable components and hooks that allow developers to quickly add filtering, sorting, and search functionality to tabular data. The library is built with modern JavaScript frameworks in mind, ensuring compatibility with React-based applications.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Data Version Control

    Data Version Control

    Git-based data version control for machine learning workflows

    DVC (Data Version Control) is an open source tool designed to bring version control principles to machine learning and data science workflows. It enables developers and data scientists to track datasets, machine learning models, and experiment results in a way that integrates with existing Git repositories. Instead of storing large datasets directly in Git, DVC keeps lightweight metadata in the repository while storing the actual data in external storage systems. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    DATA SCIENCE ROADMAP

    DATA SCIENCE ROADMAP

    Data Science Roadmap from A to Z

    DATA SCIENCE ROADMAP is an educational repository designed to guide learners through the process of becoming proficient in data science and machine learning. The project presents a structured roadmap that outlines the knowledge and skills required for different stages of a data science career. Topics typically include programming with Python, statistics, mathematics, machine learning algorithms, data visualization, and big data technologies. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Data Science Articles from CodeCut

    Data Science Articles from CodeCut

    Collection of useful data science topics along with articles

    The Data-science repository from CodeCutTech is a curated collection of educational content focused on practical tools and workflows used in modern data science projects. Instead of providing a single software package, the repository aggregates articles, tutorials, and examples covering many topics within the data science ecosystem. The materials address areas such as MLOps, data management, project organization, testing practices, visualization techniques, and productivity tools used by data scientists. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Data-Science-Interview-Questions-Answers

    Data-Science-Interview-Questions-Answers

    Curated list of data science interview questions and answers

    Data-Science-Interview-Questions-Answers is a curated educational repository designed to help data science candidates prepare for technical interviews by organizing a large bank of questions and answers in one place. It began as a daily interview question initiative and was later consolidated into GitHub so learners could review the material more easily and revisit it over time.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Data Science Interviews

    Data Science Interviews

    Data science interview questions and answers

    Data Science Interviews is an open-source repository that collects common data science interview questions along with community-provided answers and explanations. The project serves as a preparation resource for students, job seekers, and professionals who want to review the technical knowledge required for data science roles. The repository organizes questions into different categories including theoretical machine learning concepts, technical programming questions, and probability or statistics problems. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next
MongoDB Logo MongoDB