Showing 113 open source projects for "data"

  • 1
    The Data Engineering Handbook

    Links to everything you'd ever want to learn about data engineering

    The Data Engineering Handbook is a comprehensive, community-curated repository that aggregates essential learning resources for anyone interested in becoming a professional data engineer. Rather than being a code project itself, it’s a learning handbook that links to books, articles, tutorials, community groups, boot camps, and real-world project examples that collectively form a roadmap to mastering data engineering skills.
    Downloads: 1 This Week
  • 2
    The Grand Complete Data Science Guide

    Data Science Guide With Videos And Materials

    The Grand Complete Data Science Materials is a repository curated by a data-science educator that aggregates a wide range of learning resources — from basic programming and math foundation to advanced topics in machine learning, deep learning, natural language processing, computer vision, and deployment practices — into a structured, centralized collection aimed at learners seeking a comprehensive path to data science mastery.
    Downloads: 0 This Week
  • 3
    Recommenders

    Best practices on recommendation systems

    ...Several utilities are provided in reco_utils to support common tasks such as loading datasets in the format expected by different algorithms, evaluating model outputs, and splitting training/test data. Implementations of several state-of-the-art algorithms are included for self-study and customization in your own applications. Please see the setup guide for more details on setting up your machine locally, on a data science virtual machine (DSVM) or on Azure Databricks. Independent or incubating algorithms and utilities are candidates for the contrib folder. ...
    Downloads: 0 This Week
  • 4
    Interpretable machine learning

    Book about interpretable machine learning

    ...As the programmer of an algorithm, you want to know whether you can trust the learned model. Did it learn generalizable features? Or are there some odd artifacts in the training data which the algorithm picked up? This book gives an overview of techniques that can be used to make black boxes as transparent as possible and to explain decisions. The first chapter introduces algorithms that produce simple, interpretable models, together with instructions on how to interpret the output. The later chapters focus on analyzing complex models and their decisions. ...
    Downloads: 4 This Week
  • 5
    PythonPark

    Python open source project "The Road to Self-Study Programming"

    ...For someone self-teaching Python (or transitioning into coding/data science), the repository presents a one-stop “home base” of content, saving them from hunting scattered tutorials across the internet.
    Downloads: 0 This Week
  • 6
    Scientific Visualization

    An open access book on scientific visualization using Python

    The Scientific Visualization book is a freely available open-access textbook that introduces how to produce effective scientific visualizations using Python, focusing especially on leveraging the popular plotting library Matplotlib (and related tools). It goes beyond simple plotting tutorials and emphasizes design principles: how to choose colors, layout subplots, annotate graphs, and present data in a way that is both accurate and visually compelling. As such, it serves as a guide for researchers, data scientists, and academic authors who need to create publication-quality figures or explanatory graphics, rather than quick exploratory plots. It includes extensive examples that demonstrate best practices — for instance handling multiple subplots, combining line plots with scatter/density overlays, or rendering high-resolution vector graphics for print.
    Downloads: 0 This Week
  • 7
    Python Zero to Hero for DevOps Engineers

    Learn Python from a DevOps Engineer's point of view

    ...The repository is organized into Day-01 through Day-19 folders plus a small sample app, which makes it very easy to follow in sequence like a bootcamp. The curriculum starts with Python installation, environment setup, and writing your first script, then quickly moves into data types, strings, regular expressions, variables, and functions. It places a strong emphasis on DevOps-specific use cases: environment variables, command-line arguments, configuration handling, and automating log analysis or user management tasks are all explicitly woven into the exercises. As you progress, you encounter increasingly rich Python features such as lists (with list comprehensions), dictionaries, sets, operators, and control flow, always tied back to practical automation or infrastructure examples.
    Downloads: 0 This Week
  • 8
    Stanza

    Stanford NLP Python library for many human languages

    ...The toolkit is designed to be parallel among more than 70 languages, using the Universal Dependencies formalism. Stanza is built with highly accurate neural network components that also enable efficient training and evaluation with your own annotated data.
    Downloads: 1 This Week
  • 9
    AI Researcher

    An autonomous AI researcher

    ...Each agent operates with clear roles — such as researcher, analyst, and summarizer — and they communicate through a task-management interface that ensures progress tracking and iterative refinement. The system emphasizes modularity, so teams can swap in new reasoning modules, data retrieval strategies, or domain knowledge bases depending on the research topic. Through self-supervised feedback loops, agents adjust their strategies based on prior outcomes, improving both the quality and relevance of results over time.
    Downloads: 0 This Week
  • 10
    Hello SQL

    Spanish-language course repository that teaches fundamentals of SQL

    ...The materials emphasize real-world query writing, schema design basics, and the mental model behind SELECT, JOIN, GROUP BY, and subqueries. Learners progress from setup and connection to hands-on exercises that build confidence with CRUD operations and data modeling. The repository’s structure favors incremental learning, with clear folders, references, and exercises you can run locally. It targets absolute beginners as well as developers from other stacks who want a clean, project-based path into SQL.
    Downloads: 0 This Week
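
To make the query concepts named in this entry concrete, here is a minimal sketch of JOIN, GROUP BY, and a subquery using Python's built-in sqlite3 module. It is not taken from the Hello SQL course; the `authors` and `books` tables, their rows, and the `run_demo` function are hypothetical and exist only for illustration.

```python
import sqlite3


def run_demo():
    """Build a tiny in-memory database and run two illustrative queries."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Hypothetical schema and sample rows (not from the course materials).
    cur.executescript(
        """
        CREATE TABLE authors (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE books (
            id        INTEGER PRIMARY KEY,
            author_id INTEGER NOT NULL REFERENCES authors(id),
            title     TEXT NOT NULL,
            pages     INTEGER NOT NULL
        );
        INSERT INTO authors (id, name) VALUES (1, 'Ana'), (2, 'Luis');
        INSERT INTO books (author_id, title, pages) VALUES
            (1, 'Intro', 100),
            (1, 'Joins', 300),
            (2, 'Subqueries', 200);
        """
    )

    # JOIN + GROUP BY: number of books and total pages per author.
    per_author = cur.execute(
        """
        SELECT a.name, COUNT(*) AS n_books, SUM(b.pages) AS total_pages
        FROM authors AS a
        JOIN books AS b ON b.author_id = a.id
        GROUP BY a.name
        ORDER BY a.name
        """
    ).fetchall()

    # Subquery: titles of books longer than the average page count.
    long_books = cur.execute(
        """
        SELECT title
        FROM books
        WHERE pages > (SELECT AVG(pages) FROM books)
        ORDER BY title
        """
    ).fetchall()

    conn.close()
    return per_author, long_books


if __name__ == "__main__":
    per_author, long_books = run_demo()
    print(per_author)
    print(long_books)
```

The in-memory database keeps the example self-contained; pointing `sqlite3.connect` at a file path would persist the tables between runs.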
  • 11
    nanoGPT

    The simplest, fastest repository for training/finetuning models

    NanoGPT is a minimalistic yet powerful reimplementation of GPT-style transformers created by Andrej Karpathy for educational and research use. It distills the GPT architecture into a few hundred lines of Python code, making it far easier to understand than large, production-scale implementations. The repo is organized with a training pipeline (dataset preprocessing, model definition, optimizer, training loop) and inference script so you can train a small GPT on text datasets like Shakespeare...
    Downloads: 0 This Week
  • 12
    Hello Python

    Comprehensive tutorial repository aimed at teaching the Python program

    ...It includes over 100 classes and about 44 hours of video instruction, combined with code samples, projects, and a chat community for support. The material covers the fundamentals—variables, data types, loops, functions—as well as intermediate topics like date handling, list comprehensions, file IO, regular expressions, modules, and packages. The course is designed to be accessible: no prior programming experience required, and the resources are freely available. In addition, it is accompanied by a practical coding approach (projects) and is maintained as an open-source repository under Apache-2.0 license. ...
    Downloads: 0 This Week
  • 13
    Think Python 2

    LaTeX source and supporting code for Think Python, 2nd edition

    ...The repository contains clean, well-commented Python scripts that are easy to follow and map directly to chapters of the text, covering topics like variables, control flow, functions, recursion, data structures (lists, dictionaries), classes and objects, file I/O, and algorithmic thinking. It also contains solutions or hints for many exercises so learners can check their work or explore alternative implementations. Because it’s educational, the repository emphasizes readability, clarity, and progressive learning rather than performance tuning or advanced constructs.
    Downloads: 0 This Week
  • 14
    Digital Forensics Guide

    Learn all about Digital Forensics and Computer Forensics

    ...Alongside conceptual explanations, the guide includes practical examples with widely used tools (like Autopsy, Volatility, Sleuth Kit, and network analysis suites), illustrating how investigations proceed from initial data capture to final analysis. The goal is to provide both a learning path and a quick reference for real-world casework, bridging the gap between academic theory and operational practice.
    Downloads: 1 This Week
  • 15
    xrayutilities

    a package with useful scripts for X-ray diffraction physicists

    xrayutilities is a Python package for analyzing X-ray diffraction data. It can assist in performing diffraction experiments and handles common steps in the data analysis. It can read experimental data from several data formats (spec, edf, xrdml, ...) and convert them to reciprocal space for arbitrary goniometer geometries and different detector systems (point, linear, and area detectors); for further processing, the data can be gridded (transformed to a regular grid). ...
    Downloads: 3 This Week
  • 16

    TOMUSS

    TOMUSS: The Online Multi User Simple Spreadsheet

    TOMUSS is an interactive web application (groupware) allowing multiple concurrent users to edit data tables. Its primary goal is the management of students' grades.
    Downloads: 3 This Week
  • 17
    Advanced Trigonometry Calculator

    Precision Trigonometry: Advanced Calculator for Complex Math

    Advanced Trigonometry Calculator is equipped with a user-friendly interface that allows for easy input of problems and instant computation. Professionals such as engineers who need to perform advanced trigonometric calculations in their work will find this tool extremely useful. ATC Online Alpha: https://advantrigoncalc.sourceforge.io/atc/ More info by clicking below: https://advantrigoncalc.sourceforge.io/ Advanced Trigonometry Calculator was only and always only developed by...
    Downloads: 22 This Week
  • 18

    openSkyMatch

    Matches OpenScience Observatories images with astronomical catalogs

    ...It automates the identification and matching of detected celestial objects in locally captured FITS images with entries in large-scale sky catalogs, notably Pan-STARRS1 DR2 (II/389/ps1_dr2). The toolkit supports data preprocessing, coordinate correlation, and catalog-based validation of astronomical detections. All tools are open-source and optimized for reproducibility and transparency in citizen science astronomy.
    Downloads: 0 This Week
  • 19
    Elementary Algorithms

    Book of elementary algorithms and data structures

    This book introduces elementary algorithms and data structures. It includes side-by-side comparisons of purely functional realizations and their imperative counterparts. I started rewriting this book in December 2020. The PDF can be downloaded for preview (EN, 中文). The first edition in Chinese (中文) was published in 2017. I recently switched my focus to the mathematics of programming; the new book is also available (GitHub).
    Downloads: 1 This Week
  • 20
    ACORBA

    Automated approach to measure root tip angles of Arabidopsis thaliana

    Gravitropic response is studied in most laboratories working with Arabidopsis thaliana, for example, to detect new phenotypes in mutants. However, manual analysis of images and microscopy data is known to be subject to human bias. This is particularly the case for manual measurements of root bending, as the angle is set subjectively. It is therefore essential to develop and use automated or semi-automated image analysis to produce faster, reproducible, and unbiased data. In this context, we developed ACORBA (Automatic Calculation Of Root Bending Angles), fully automated software that measures root bending angle over time.
    Downloads: 4 This Week
  • 21
    Br-Gogo is a Brazilian open-source version of the Gogo Board project, developed by CTI, a Brazilian research center.
    Downloads: 0 This Week
  • 22
    DIG

    A library for graph deep learning research

    ...If you are working or plan to work on research in graph deep learning, DIG enables you to develop your own methods within our extensible framework, and compare with current baseline methods using common datasets and evaluation metrics without extra efforts. It includes unified implementations of data interfaces, common algorithms, and evaluation metrics for several advanced tasks. Our goal is to enable researchers to easily implement and benchmark algorithms.
    Downloads: 0 This Week
  • 23
    HomeworkHelper

    Homework Helper: Organize tasks, meet deadlines. Ideal for ADHD

    Homework Helper is a comprehensive and user-friendly application designed to assist students in effectively managing their homework and assignments. It provides a convenient and organized platform to keep track of upcoming tasks, due dates, subjects, and associated details. Developed with a focus on simplicity and usability, Homework Helper aims to support students, including those with ADHD or individuals struggling with task management, in staying organized and achieving academic success....
    Downloads: 0 This Week
  • 24
    The Art of Programming

    A collection of practical tips can be found at the bottom of this page

    The Art of Programming (Second Edition) is a curated collection of programming problems and solutions originally derived from the Microsoft 100 Interview Questions blog series, later refined into a long-running tutorial and ultimately a published book. Created by July, the series began in 2010 and has since evolved into an in-depth exploration of algorithmic thinking, data structures, and coding interview preparation. The repository brings together 42 classic programming problems from the original series, enhanced with detailed explanations, formula derivations, and optimized solutions. In July 2023, work on the second edition was announced, which expands the project with updated content, new problems inspired by recent big-tech interviews, and introductions to modern machine learning techniques such as XGBoost, CNNs, RNNs, and LSTMs. ...
    Downloads: 4 This Week
  • 25
    AllenNLP

    An open-source NLP research library, built on PyTorch

    AllenNLP makes it easy to design and evaluate new deep learning models for nearly any NLP problem, along with the infrastructure to easily run them in the cloud or on your laptop. AllenNLP includes reference implementations of high quality models for both core NLP problems (e.g. semantic role labeling) and NLP applications (e.g. textual entailment). AllenNLP supports loading "plugins" dynamically. A plugin is just a Python package that provides custom registered classes or additional...
    Downloads: 0 This Week