Showing 629 open source projects for "data"

  • 1
    Copulas

    A library to model multivariate data using copulas

    Copulas is a Python library for modeling multivariate distributions and sampling from them using copula functions. Given a table of numerical data, use Copulas to learn the distribution and generate new synthetic data following the same statistical properties. Choose from a variety of univariate distributions and copulas, including Archimedean Copulas, Gaussian Copulas and Vine Copulas. Compare real and synthetic data visually after building your model. Visualizations are available as 1D histograms, 2D scatterplots and 3D scatterplots. ...
    Downloads: 0 This Week
  • 2
    SASM

    Simple cross-platform IDE for NASM, MASM, GAS and FASM languages

    ...In SASM you can easily develop and execute programs written in NASM, MASM, GAS or FASM assembly languages. Enter code in the form and simply run your program. On Windows, SASM can execute programs in a separate window. Enter your input data in the "Input" docking field; the "Output" field shows the result of the program's execution. All messages and compilation errors are shown in the form at the bottom. You can save the source or the already compiled (exe) code of your program to a file and load your programs from a file.
    Downloads: 103 This Week
  • 3
    Llama Cloud Services

    Knowledge Agents and Management in the Cloud

    Llama Cloud Services is a suite of tools designed to facilitate the integration of large language models (LLMs) into applications. It offers components for parsing, extracting, and reporting on complex documents, streamlining the process of preparing data for LLM consumption.
    Downloads: 1 This Week
  • 4
    SageMaker Training Toolkit

    Train machine learning models within Docker containers

    Train machine learning models within a Docker container using Amazon SageMaker. Amazon SageMaker is a fully managed service for data science and machine learning (ML) workflows. You can use Amazon SageMaker to simplify the process of building, training, and deploying ML models. To train a model, you can include your training script and dependencies in a Docker container that runs your training code. A container provides an effectively isolated environment, ensuring a consistent runtime and reliable training process. ...
    Downloads: 2 This Week
  • 5
    GAM

    Command line management for Google Workspace

    ...GAM will also be added to your path so you can run GAM even if you're not in the GAM folder. At the end of the MSI install process, GAM will open a command prompt to allow you to set up a project and authorize GAM for admin management and user data/config access.
    Downloads: 19 This Week
  • 6
    Kedro

    A Python framework for creating reproducible, maintainable code

    Kedro is an open-source Python framework for creating maintainable and modular data science code. It provides the scaffolding to build more complex data and machine-learning pipelines. In addition, there's a focus on spending less time on the tedious "plumbing" required to maintain data science code, which means that you have more time to solve new problems. It also standardises team workflows; the modular structure of Kedro facilitates a higher level of collaboration when teams solve problems together. ...
    Downloads: 0 This Week
  • 7
    Dash

    Build beautiful web-based analytic apps, no JavaScript required

    Dash is a Python framework for building beautiful analytical web applications without any JavaScript. Built on top of Plotly.js, React and Flask, Dash easily achieves what an entire team of designers and engineers normally would. It ties modern UI controls and displays such as dropdown menus, sliders and graphs directly to your analytical Python code, and creates exceptional, interactive analytics apps. Dash apps are very lightweight, requiring only a limited number of lines of Python or...
    Downloads: 1 This Week
  • 8
    Ralph

    Ralph is the CMDB / Asset Management system for the data center

    ...We've chosen the best features of DCIM, Asset Mgmt and CMDB systems to create one easy and well-integrated system. One interface is easier than 3. Keep track of asset purchases and their life cycle. Flexible flow system for the asset life cycle. Data center and back office support. DC visualization built-in. Ralph is a simple yet powerful Asset Management, DCIM and CMDB system for the data center and back office.
    Downloads: 5 This Week
  • 9
    Parsera

    Lightweight library for scraping websites with LLMs

    Scrape data from any website with only a link and column descriptions. Parsera is a tool designed to scrape web content, specifically handling poorly structured or messy websites.
    Downloads: 2 This Week
  • 10
    Superduper

    Superduper: Integrate AI models and machine learning workflows

    ...This allows developers to completely avoid implementing MLOps, ETL pipelines, model deployment, data migration, and synchronization. Using Superduper is simply "CAPE": Connect to your data, apply arbitrary AI to that data, package and reuse the application on arbitrary data, and execute AI-database queries and predictions on the resulting AI outputs and data.
    Downloads: 0 This Week
  • 11
    SENAITE LIMS

    SENAITE Meta Package

    ...Therefore, it nicely reflects the complexity of the LIMS, while providing a modern, intuitive, and friendly UI/UX. Amongst other functionalities, SENAITE comes with highly customizable workflows to drive users through the analytical process, an easy-to-use UI for data registration, automatic import of results, data validation, and transition constraints. SENAITE can be easily integrated with instruments by using off-the-shelf interfaces for data import and export. Custom interfacing is supported too. Import instrument results and avoid human errors in the carrying-over process. Reduce the turnaround time on results report delivery. ...
    Downloads: 3 This Week
  • 12
    dj-stripe

    dj-stripe automatically syncs your Stripe Data to your local database

    dj-stripe is an extensible wrapper around the Stripe API that continuously syncs most Stripe data to your local database as pre-implemented Django models, out of the box. This lets you use the Django ORM in your code to work with the data, making it easier and faster. For example, if you need to interact with a customer subscription, you can use dj-stripe's Subscription model to get the subscription data for that customer as well as the related models' data (if need be, and potentially in one database query). ...
    Downloads: 1 This Week
  • 13
    redis-py

    Redis Python client

    redis-py is the official Python client for interacting with Redis, the in-memory data structure store. It supports all Redis commands and data types, making it easy to build caching, messaging, or real-time analytics features in Python applications. With both synchronous and asyncio support, redis-py is suited for modern Python projects and integrates smoothly into web frameworks, task queues, and backend services.
    Downloads: 2 This Week
  • 14
    django-import-export

    Django application and library for importing and exporting data

    ...Also, the report_skipped option controls whether skipped records appear in the import Result object and, when using the admin, whether skipped records show in the import preview page. Not all data can be easily extracted from an object/model attribute. To turn a complicated data model into a (generally simpler) processed data structure on export, define a dehydrate_<fieldname> method.
    Downloads: 0 This Week
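    The dehydrate_<fieldname> convention described above can be illustrated with a minimal plain-Python sketch. It does not use django-import-export itself: MiniResource, BookResource, and Book are hypothetical stand-ins, and the lookup logic is an assumption about how such a naming hook can work, not the library's implementation.

```python
from dataclasses import dataclass


@dataclass
class Book:
    # Hypothetical model with the author split across two attributes.
    title: str
    author_first: str
    author_last: str


class MiniResource:
    """Hypothetical stand-in for an export resource (not library code)."""

    fields = ("title", "full_author")

    def export_field(self, name, obj):
        # Prefer a dehydrate_<fieldname> hook when the subclass defines one;
        # otherwise fall back to reading the plain attribute.
        hook = getattr(self, f"dehydrate_{name}", None)
        if callable(hook):
            return hook(obj)
        return getattr(obj, name)

    def export_row(self, obj):
        return {name: self.export_field(name, obj) for name in self.fields}


class BookResource(MiniResource):
    def dehydrate_full_author(self, book):
        # Computed value that is not stored as a single attribute.
        return f"{book.author_first} {book.author_last}"


row = BookResource().export_row(Book("Dune", "Frank", "Herbert"))
print(row)
```

    The point of the pattern is that the exported column "full_author" has no matching attribute on the object, so the hook supplies the processed value.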
  • 15
    Graphene

    GraphQL in Python Made Easy

    Graphene is a Python library for building GraphQL APIs quickly and easily, using a code-first approach. Instead of writing GraphQL Schema Definition Language (SDL), Python code is written to describe the data provided by your server. Graphene helps you use GraphQL effortlessly in Python, but what is GraphQL? GraphQL is a data query language developed internally by Facebook as an alternative to REST and ad-hoc webservice architectures. With Graphene you have all the tools you need to implement a GraphQL API in Python, with multiple integrations with different frameworks including Django, SQLAlchemy and Google App Engine.
    Downloads: 1 This Week
  • 16
    PokeAPI

    The Pokémon API

    ...This API will always be publicly available and will never require any extensive setup process to consume. Each time the build script is run, it will iterate over each table in the database, wipe it, and rewrite each row using the data found in data/v2/CSV. The option to build individual portions of the database was removed in order to increase the performance of the build script. There is also a multi-container setup, managed by Docker Compose. This setup allows you to deploy a production-like environment, with separate containers for each service, and is recommended if you simply need to spin up PokéAPI.
    Downloads: 2 This Week
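    The wipe-and-rewrite build step described above can be sketched roughly as follows. This is not PokeAPI's actual build script: the table names, columns, and inline CSV text are hypothetical, and an in-memory SQLite database stands in for the project's real database and for the files under data/v2/CSV.

```python
import csv
import io
import sqlite3

# Hypothetical CSV contents standing in for files under data/v2/CSV.
CSV_SOURCES = {
    "pokemon": "id,name\n1,bulbasaur\n4,charmander\n",
    "types": "id,name\n10,fire\n12,grass\n",
}


def rebuild(conn, sources):
    """Wipe every table, then rewrite its rows from the CSV text."""
    for table, text in sources.items():
        reader = csv.DictReader(io.StringIO(text))
        rows = [(int(r["id"]), r["name"]) for r in reader]
        # Table names come from the trusted dict above, not from user input.
        conn.execute(f"DELETE FROM {table}")
        conn.executemany(f"INSERT INTO {table} (id, name) VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
for table in CSV_SOURCES:
    conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, name TEXT)")

# A stale row that should disappear after the rebuild.
conn.execute("INSERT INTO pokemon (id, name) VALUES (999, 'stale-row')")

rebuild(conn, CSV_SOURCES)
rebuild(conn, CSV_SOURCES)  # running twice leaves the same rows

names = [r[0] for r in conn.execute("SELECT name FROM pokemon ORDER BY id")]
type_count = conn.execute("SELECT COUNT(*) FROM types").fetchone()[0]
print(names, type_count)
```

    Because every table is wiped before it is rewritten, repeated runs leave the database in the same state, which is one reason a full rebuild can be simpler than rebuilding individual portions.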
  • 17
    SageMaker Spark Container

    Docker image used to run data processing workloads

    Apache Spark™ is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.
    Downloads: 0 This Week
  • 18
    Ethereum ETL

    Python scripts for ETL (extract, transform and load) jobs for Ethereum

    Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery. Ethereum ETL lets you convert blockchain data into convenient formats like CSVs and relational databases.
    Downloads: 0 This Week
  • 19
    PyTorch Geometric

    Geometric deep learning extension library for PyTorch

    It consists of various methods for deep learning on graphs and other irregular structures, also known as geometric deep learning, from a variety of published papers. In addition, it consists of an easy-to-use mini-batch loader for many small and single giant graphs, a large number of common benchmark datasets (based on simple interfaces to create your own), and helpful transforms, both for learning on arbitrary graphs as well as on 3D meshes or point clouds. We have outsourced a lot of...
    Downloads: 12 This Week
  • 20
    Deepchecks

    Test Suites for validating ML models & data

    Deepchecks is the leading tool for testing and validating your machine learning models and data, and it enables doing so with minimal effort. Deepchecks accompanies you through various validation and testing needs such as verifying your data’s integrity, inspecting its distributions, validating data splits, evaluating your model and comparing different models. While you’re in the research phase, and want to validate your data, find potential methodological problems, and/or validate your model and evaluate it. ...
    Downloads: 0 This Week
  • 21
    Faker for Python

    Python package that generates fake data for you

    Faker is a Python package that generates fake data for you. Whether you need to bootstrap your database, create good-looking XML documents, fill in your persistence to stress test it, or anonymize data taken from a production service, Faker is for you. Starting from version 4.0.0, Faker dropped support for Python 2, and from version 5.0.0 it only supports Python 3.6 and above. If you still need Python 2 compatibility, please install version 3.0.1 in the meantime, and please consider updating your codebase to support Python 3 so you can enjoy the latest features Faker has to offer. ...
    Downloads: 2 This Week
  • 22
    txtai

    Build AI-powered semantic search applications

    txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications. Traditional search systems use keywords to find data. Semantic search applications have an understanding of natural language and identify results that have the same meaning, not necessarily the same keywords. Backed by state-of-the-art machine learning models, data is transformed into vector representations for search (also known as embeddings).
    Downloads: 2 This Week
  • 23
    NetworkX

    Network analysis in Python

    NetworkX is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. Data structures for graphs, digraphs, and multigraphs. Many standard graph algorithms. Network structure and analysis measures. Generators for classic graphs, random graphs, and synthetic networks. Nodes can be "anything" (e.g., text, images, XML records). Edges can hold arbitrary data (e.g., weights, time-series). Open source 3-clause BSD license. ...
    Downloads: 3 This Week
  • 24
    The Reactive Extensions for Python

    Reactive extensions for Python

    ...Reactive Extensions for Python (RxPY) is a set of libraries for composing asynchronous and event-based programs using observable sequences and pipable query operators in Python. Using Rx, developers represent asynchronous data streams with Observables, query asynchronous data streams using operators, and parameterize concurrency in data/event streams using Schedulers. RxPY is a fairly complete implementation of Rx with more than 120 operators, and over 1300 passing unit-tests. RxPY is mostly a direct port of RxJS, but also borrows a bit from RxNET and RxJava in terms of threading and blocking operators.
    Downloads: 0 This Week
  • 25
    Pydantic-Core

    Core validation logic for pydantic written in Rust

    pydantic-core is the Rust-based core validation logic for Pydantic, a widely used data validation library in Python. It offers significant performance improvements over its predecessor, enabling faster and more efficient data parsing and validation.
    Downloads: 0 This Week