Showing 739 open source projects for "data"

  • 1
    Orange Data Mining

    Orange: Interactive data analysis

    Open source machine learning and data visualization. Build data analysis workflows visually, with a large, diverse toolbox. Perform simple data analysis with clever data visualization. Explore statistical distributions, box plots and scatter plots, or dive deeper with decision trees, hierarchical clustering, heatmaps, MDS and linear projections. Even your multidimensional data can become sensible in 2D, especially with clever attribute ranking and selections. ...
    Downloads: 37 This Week
  • 2
    AWS Data Wrangler

    Pandas on AWS, easy integration with Athena, Glue, Redshift, etc.

    An AWS Professional Services open-source Python initiative that extends the pandas library to AWS, connecting DataFrames with AWS data-related services. Easy integration with Athena, Glue, Redshift, Timestream, OpenSearch, Neptune, QuickSight, Chime, CloudWatch Logs, DynamoDB, EMR, Secrets Manager, PostgreSQL, MySQL, SQL Server and S3 (Parquet, CSV, JSON, and Excel). Built on top of other open-source projects such as pandas, Apache Arrow and Boto3, it offers abstracted functions for common ETL tasks such as loading and unloading data from data lakes, data warehouses, and databases. ...
    Downloads: 0 This Week
  • 3
    Cookiecutter Data Science

    Project structure for doing and sharing data science work

    A logical, reasonably standardized, but flexible project structure for doing and sharing data science work. When we think about data analysis, we often think just about the resulting reports, insights, or visualizations. While these end products are generally the main event, it's easy to focus on making the products look nice and ignore the quality of the code that generates them. Because these end products are created programmatically, code quality is still important! ...
    Downloads: 0 This Week
  • 4
    Blender GIS

    Blender add-ons that bridge Blender and geographic data

    Import the most common GIS data formats into Blender: Shapefile vectors, raster images, GeoTIFF DEMs, and OpenStreetMap XML. BlenderGIS offers many ways to create 3D terrain from geographic data; see the Flowchart for an overview. Display dynamic web maps inside the Blender 3D view, request OpenStreetMap data (buildings, roads, etc.), and get true elevation data from the NASA SRTM mission.
    Downloads: 100 This Week
  • 5
    folium

    Python data, Leaflet.js maps

    folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js library. Manipulate your data in Python, then visualize it on a Leaflet map via folium. folium makes it easy to visualize data that’s been manipulated in Python on an interactive Leaflet map. It supports both binding data to a map for choropleth visualizations and passing rich vector/raster/HTML visualizations as markers on the map. ...
    Downloads: 11 This Week
  • 6
    Astropy

    Repository for the Astropy core package

    ...It is at the core of the Astropy Project, which aims to enable the community to develop a robust ecosystem of affiliated packages covering a broad range of needs for astronomical research, data processing, and data analysis.
    Downloads: 4 This Week
  • 7
    Seeker

    Accurately Locate Smartphones using Social Engineering

    Seeker is an open source project that demonstrates how to obtain precise location information from devices using social engineering and web-based techniques. The tool sets up a phishing page that asks for location permissions, allowing GPS and other device data to be shared if the user consents. It can capture latitude, longitude, accuracy, altitude, direction, and even speed, with results displayed in a terminal. The project supports both manual deployment and tunneling services like Ngrok for external access. While primarily intended as an educational resource on security awareness, it highlights the risks of exposing geolocation data online. ...
    Downloads: 14 This Week
  • 8
    Grafana

    Leading open-source visualization and observability platform

    Grafana OSS is a leading open-source visualization and observability platform that lets you query, visualize, alert on, and explore your data—regardless of where it’s stored. With support for 100+ data source plugins (such as Prometheus, Loki, Elasticsearch, InfluxDB, SQL/NoSQL databases, OTel, and more), you can unify metrics, logs, traces, and other observability signals in one place. Grafana OSS empowers you to build dynamic, reusable dashboards with rich visualizations, template variables, interactive filtering, and cross-panel linking. ...
    Downloads: 31 This Week
  • 9
    Open X-Embodiment

    Unified open dataset enabling cross-embodiment learning for robotics

    ...The repository also provides Colab notebooks for dataset visualization, batching, and model inference, along with pretrained model checkpoints such as RT-1-X, a multitask robotic transformer model trained on this data.
    Downloads: 6 This Week
  • 10
    Segments.ai

    Segments.ai Python SDK

    Multi-sensor labeling platform for robotics and autonomous vehicles. The platform for fast and accurate multi-sensor data annotation. Label in-house or with an external workforce. Intuitive labeling interfaces for images, videos, and 3D point clouds (lidar and RGBD). Obtain segmentation labels, vector labels, and more. Our labeling interfaces are built for fast, precise labeling. Powerful ML assistance lets you label faster and reduce costs. Integrate data labeling into your existing ML pipelines and workflows using our simple yet powerful Python SDK. ...
    Downloads: 4 This Week
  • 11
    PyTorch Geometric

    Geometric deep learning extension library for PyTorch

    It consists of various methods for deep learning on graphs and other irregular structures, also known as geometric deep learning, from a variety of published papers. In addition, it consists of an easy-to-use mini-batch loader for many small and single giant graphs, a large number of common benchmark datasets (based on simple interfaces to create your own), and helpful transforms, both for learning on arbitrary graphs as well as on 3D meshes or point clouds. We have outsourced a lot of...
    Downloads: 12 This Week
  • 12
    Mathematics Dataset

    This dataset code generates mathematical question and answer pairs

    ...Version 1.0 includes over 2 million examples per category, with training splits labeled as “easy,” “medium,” and “hard,” supporting curriculum-based learning strategies. The data can be accessed via PyPI or generated locally using provided Python scripts, with outputs formatted for direct use in training or evaluation pipelines.
    Downloads: 4 This Week
  • 13
    ROMM

    A beautiful, powerful, self-hosted rom manager and player

    ...Romm also supports widgets, customization options, and theme choices so users can tailor the visual experience to their preferences while maintaining performance and responsiveness. Privacy is a highlight, with local indexing and search functions that operate without sending data to external servers unless explicitly permitted.
    Downloads: 4 This Week
  • 14
    PaperQA2

    High-accuracy RAG for answering questions from scientific documents

    PaperQA2 is a package for doing high-accuracy retrieval-augmented generation (RAG) on PDFs or text files, with a focus on the scientific literature. See our 2024 paper for examples of PaperQA2's superhuman performance in scientific tasks like question answering, summarization, and contradiction detection. In this example we take a folder of research paper PDFs, magically get their metadata, including citation counts and a retraction check, then parse and cache PDFs into a...
    Downloads: 3 This Week
  • 15
    seaborn

    Statistical data visualization in Python

    Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics. Seaborn helps you explore and understand your data. Its plotting functions operate on dataframes and arrays containing whole datasets and internally perform the necessary semantic mapping and statistical aggregation to produce informative plots.
    Downloads: 6 This Week
  • 16
    Vedo

    A Python module for scientific analysis of 3D data

    A lightweight and powerful Python module for scientific analysis and visualization of 3D objects. Inspired by the VPython manifesto "3D programming for ordinary mortals", vedo makes it easy to work with 3D point clouds, meshes and volumes in just a few lines of code, even for less experienced programmers. vedo is based on VTK and NumPy, with no other dependencies. Import meshes from VTK format, STL, Wavefront OBJ, 3DS, Dolfin-XML, Neutral, GMSH, OFF, PCD (PointCloud). Export meshes as ASCII...
    Downloads: 0 This Week
  • 17
    MuJoCo

    Multi-Joint dynamics with Contact. A general purpose physics simulator

    ...The engine provides a robust C API optimized for real-time computation, making it suitable for scientific research and advanced simulation environments. MuJoCo’s core architecture is performance-tuned and utilizes preallocated data structures created through an XML-based compiler. The platform includes built-in interactive visualization using OpenGL and a native graphical interface for analyzing and testing simulations. Additionally, it offers extensive utility functions for physics computation, Python bindings for developers, and a Unity plug-in to enable integration with game engines and visualization tools.
    Downloads: 3 This Week
  • 18
    statsmodels

    Statsmodels, statistical modeling and econometrics in Python

    statsmodels is a Python module that provides classes and functions for the estimation of many different statistical models, as well as for conducting statistical tests and statistical data exploration. An extensive list of result statistics is available for each estimator. The results are tested against existing statistical packages to ensure that they are correct. The package is released under the open source Modified BSD (3-clause) license. Generalized linear models with support for all of the one-parameter exponential family distributions. ...
    Downloads: 1 This Week
  • 19
    go1pylib

    go1pylib is a Python library designed to control the Go1 robot

    go1pylib is a Python library designed to control the Go1 robot by Unitree Robotics. It provides an easy-to-use interface for robot movement, state management, collision avoidance, battery monitoring, and MQTT communication. Ideal for research and robotics development.
    Downloads: 0 This Week
  • 20
    Tequila

    A High-Level Abstraction Framework for Quantum Algorithms

    Tequila is an abstraction framework for (variational) quantum algorithms. It operates on abstract data structures allowing the formulation, combination, automatic differentiation and optimization of generalized objectives. Tequila can execute the underlying quantum expectation values on state-of-the-art simulators as well as on real quantum devices.
    Downloads: 0 This Week
  • 21
    PyMC

    Bayesian Modeling and Probabilistic Programming in Python

    ...Built on top of computational tools like Aesara and NumPy, PyMC allows users to define models using intuitive syntax and perform inference using MCMC, variational inference, and other advanced algorithms. It’s widely used in scientific research, data science, and decision modeling.
    Downloads: 0 This Week
  • 22
    The Missing Semester

    The Missing Semester of Your CS Education

    The Missing Semester is a course and repository that teaches the engineering skills often skipped in traditional computer science curricula: command-line fluency, shell scripting, editors, version control, debugging, data wrangling, and automation. It includes lecture notes, exercises, and sample solutions that encourage hands-on practice rather than passive reading. The curriculum demystifies tools like bash, vim, git, and make, showing how to combine them into efficient workflows that scale from homework to production systems. Lessons dig into practical topics such as environment management, job control, shell pipelines, profiling, and reproducibility, with an emphasis on habits that save time and prevent errors. ...
    Downloads: 1 This Week
  • 23
    Physical Symbolic Optimization (Φ-SO)

    Physical Symbolic Optimization

    Physical Symbolic Optimization (Φ-SO) - A symbolic optimization package built for physics. Its symbolic regression module uses deep reinforcement learning to infer analytical physical laws that fit data points, searching in the space of functional forms.
    Downloads: 0 This Week
  • 24
    OpenFermion

    The electronic structure package for quantum computers

    OpenFermion is an open source library for compiling and analyzing quantum algorithms to simulate fermionic systems, including quantum chemistry. Among other functionalities, this version features data structures and tools for obtaining and manipulating representations of fermionic and qubit Hamiltonians. For more information, see our release paper. Currently, OpenFermion is tested on Mac, Windows, and Linux. We recommend using Mac or Linux because the electronic structure plugins are only compatible with these platforms. However, for those who would like to use Windows, or for anyone having other difficulties with installing OpenFermion or its plugins, we have provided a Docker image and usage instructions in the docker folder. ...
    Downloads: 0 This Week
  • 25
    Perceval

    An open source framework for programming photonic quantum computers

    An open-source framework for programming photonic quantum computers. Through a simple object-oriented Python API, Perceval provides tools for composing circuits from linear optical components, defining single-photon sources, manipulating Fock states, running simulations, reproducing published experimental papers and experimenting with a new generation of quantum algorithms. It aims to be a companion tool for developing photonic circuits – for simulating and optimizing their design, modeling...
    Downloads: 0 This Week