452 projects for "data science" with 2 filters applied:

  • 1
    Computer Science Flash Cards

    Mini website for testing both general CS knowledge and enforce coding

    This repository collects concise flash cards that cover the core ideas of a traditional computer science curriculum with a focus on interview readiness. The cards distill topics like time and space complexity, classic data structures, algorithmic paradigms, operating systems, networking, and databases into short, testable prompts. They are designed for spaced-repetition style study so you can cycle frequently through fundamentals until recall feels automatic.
    Downloads: 1 This Week
  • 2
    Computer Science courses video lectures

    List of Computer Science courses with video lectures

    This repository is a curated list of full-length computer science video lecture series across many universities and MOOC platforms, helping learners assemble their own curriculum. The list spans foundational topics like algorithms, data structures, operating systems, computer networks, machine learning, and more, all delivered via lectures rather than just textual tutorials. The contributor guidelines encourage adding high-quality courses (not just casual tutorials) so the list remains academically oriented. ...
    Downloads: 7 This Week
  • 3
    NYC Taxi Data

    Import public NYC taxi and for-hire vehicle (Uber, Lyft)

    ...It also contains example analyses—spatial and temporal visualizations like maps, time-series plots, and hotspot detection—highlighting insights such as patterns of demand, peak times, and geospatial distributions. The repository is often used as a benchmark dataset and example for teaching, benchmarking, and demonstration purposes in the data science and urban analytics communities.
    Downloads: 1 This Week
  • 4
    Kedro

    A Python framework for creating reproducible, maintainable code

    Kedro is an open-source Python framework for creating maintainable and modular data science code. It provides the scaffolding to build more complex data and machine-learning pipelines. In addition, there's a focus on spending less time on the tedious "plumbing" required to maintain data science code; this means that you have more time to solve new problems. It standardises team workflows; the modular structure of Kedro facilitates a higher level of collaboration when teams solve problems together. ...
    Downloads: 2 This Week
  • 5
    SIT742

    SIT742: Modern Data Science

    SIT742: Modern Data Science.
    Downloads: 2 This Week
  • 6
    RStudio Cheatsheets

    Curated collection of official cheat sheets for data science tools

    The cheatsheets repository from RStudio is a curated collection of official cheat sheets for R, RStudio, the tidyverse, Shiny, and related data science tools. Each cheat sheet is a single (or double) page PDF that condenses important syntax, functions, workflows, and best practices into a visually organized format ideal for quick reference. The repository contains source files (R Markdown or LaTeX) that generate the cheat sheets, version history, and metadata (title, author, description) for each. ...
    Downloads: 2 This Week
  • 7
    dplyr

    dplyr: A grammar of data manipulation

    dplyr is an R package that provides a consistent and intuitive grammar for data manipulation, enabling users to filter, arrange, summarize, and transform data efficiently. Part of the tidyverse ecosystem, dplyr simplifies complex data operations through a clear and readable syntax, whether working with data frames, tibbles, or databases. It is widely used in data science and statistical analysis workflows.
    Downloads: 0 This Week
  • 8
    Apache Spark

    A unified analytics engine for large-scale data processing

    ...With Spark Streaming (microbatches) and Structured Streaming, it delivers low-latency event processing suitable for real-time analytics. The built-in MLlib library provides scalable machine learning algorithms, while GraphX enables graph computations integrated with data pipelines. Spark supports multiple languages—Scala, Java, Python, R—and connects with many storage systems like HDFS, S3, Cassandra, and streaming platforms like Kafka, making it a versatile choice for big data workloads in analytics, ETL, and data science.
    Downloads: 8 This Week
  • 9
    GeoStats.jl

    An extensible framework for geospatial data science

    GeoStats.jl is a Julia framework for geospatial data science and geostatistical modeling. It’s fully implemented in Julia and designed to provide an extensible, high-performance stack that handles spatial domains, interpolation, simulation, learning, and visualization. The package is modular: it breaks out geometry, spatial domains, transforms, variograms, covariance models, and modeling into subpackages (e.g., GeoStatsBase, GeoStatsModels, GeoStatsTransforms).
    Downloads: 3 This Week
  • 10
    Jupyter Docker Stacks

    Ready-to-run Docker images containing Jupyter applications

    Jupyter Docker Stacks provides a curated set of ready-to-run Docker container images that bundle Jupyter applications with popular data science and computing tools, enabling users to quickly start working in a reproducible environment. These stacks support a range of use cases, from lightweight base notebook images to full-featured environments that include scientific computing libraries, machine learning tools, and IDE-like notebook interfaces, all within Docker containers that run consistently across machines. ...
    Downloads: 3 This Week
  • 11
    InterviewGuide

    Repository that collects extensive computer science

    ...The repository contains curated algorithm and data structure explanations, collections of interview questions, high-frequency topics, and practical problem-solving guides to help users brush up on skills that are often tested in technical interviews.
    Downloads: 0 This Week
  • 12
    Kaggle CLI

    The official CLI to interact with Kaggle

    ...Its main value is turning Kaggle’s web-based data science platform into a scriptable developer workflow.
    Downloads: 2 This Week
  • 13
    kagglehub

    Python library to access Kaggle resources

    ...The library is designed to work both inside and outside Kaggle Notebooks, with native behavior that can adapt when it runs in Kaggle’s hosted notebook environment. It is useful for machine learning workflows where data, models, and notebook artifacts need to be pulled into scripts, experiments, or pipelines. kagglehub also supports authentication so users can access private or restricted resources when their account has permission. Its main value is making Kaggle assets easier to consume programmatically in Python-first data science and AI development workflows.
    Downloads: 0 This Week
  • 14
    JupyterLite

    Wasm-powered Jupyter running in the browser

    ...Built using JupyterLab components and powered by WebAssembly technologies, it allows users to run Python and other language kernels directly in the browser through tools like Pyodide or Xeus. This architecture eliminates the need for installation or server infrastructure, making it highly accessible for education, demonstrations, and lightweight data science workflows. JupyterLite supports many core Jupyter features, including notebooks, code consoles, and interactive visualizations, while storing files locally using browser storage mechanisms such as IndexedDB. It is designed to be easily deployable as a static website, enabling developers to host fully functional notebook environments on platforms like GitHub Pages.
    Downloads: 5 This Week
  • 15
    all AI news

    A list of online news & info sources in the AI/ML/Data Science space

    all AI news is a curated repository that aggregates and organizes sources for AI-related news and information. It serves as a centralized collection of feeds, links, and resources that can be used to build news aggregation systems or stay updated on developments in artificial intelligence. The project is designed to be easily extendable, allowing users to add new sources or customize the dataset for their specific needs. It is particularly useful for developers building AI news platforms,...
    Downloads: 0 This Week
  • 16
    xqvm-py

    A Python implementation of the Quip Network's quantum virtual machine

    xq-py is a Python implementation of the Quip Network’s quantum virtual machine, offering a more accessible and flexible alternative to the Rust-based version for experimentation and rapid development. It is designed to provide similar functionality to xq-rs while prioritizing ease of use, readability, and integration with Python-based data science and research tools. The project enables developers to simulate or prototype quantum-inspired computations without needing to work at a lower systems level. It integrates into the broader Quip ecosystem, allowing interoperability with other components such as blockchain nodes and management tools. Python’s dynamic nature makes it suitable for testing new algorithms, educational purposes, and exploratory research. ...
    Downloads: 0 This Week
  • 17
    Numbast

    Build an automated pipeline that converts CUDA APIs into Numba

    ...This approach significantly improves developer productivity by reducing boilerplate code and ensuring consistency between C++ and Python interfaces. Numbast is particularly useful for teams working with custom CUDA libraries or extending existing ones into Python ecosystems for data science and machine learning. It complements tools like Numba, which compile Python code into GPU-executable kernels, by expanding the range of accessible CUDA functionality.
    Downloads: 0 This Week
  • 18
    huihut interview

    A summary of C/C++ technical interview basics

    interview is a curated repository of technical interview questions, solutions, and explanations covering a wide range of topics in computer science and software engineering. It aims to help developers prepare for job interviews by providing sample problems in algorithms, data structures, system design, databases, and programming language intricacies, often with code snippets and discussion. The repo is designed so learners can practice real interview scenarios, compare approaches, and internalize foundational concepts that are frequently tested by tech companies. ...
    Downloads: 0 This Week
  • 19
    FairChem

    FAIR Chemistry's library of machine learning methods for chemistry

    FAIRChem is a unified library for machine learning in chemistry and materials, consolidating data, pretrained models, demos, and application code into a single, versioned toolkit. Version 2 modernizes the stack with a cleaner core package and breaking changes relative to V1, focusing on simpler installs and a stable API surface for production and research. The centerpiece models (e.g., UMA variants) plug directly into the ASE ecosystem via a FAIRChem calculator, so users can run relaxations,...
    Downloads: 0 This Week
  • 20
    Otter-Grader

    A Python and R autograding solution

    Otter Grader is a lightweight, modular open-source autograder developed by the Data Science Education Program at UC Berkeley. It is designed to work with classes at any scale by abstracting away the autograding internals in a way that is compatible with any instructor's assignment distribution and collection pipeline. Otter supports local grading through parallel Docker containers, grading using the autograder platforms of third-party learning management systems (LMSs), the deployment of an Otter-managed grading virtual machine, and a client package that allows students to run public checks on their own machines. ...
    Downloads: 0 This Week
  • 21
    Awesome Network Analysis

    A curated list of awesome network analysis resources

    awesome-network-analysis is a curated list of resources focused on network and graph analysis, including libraries, frameworks, visualization tools, datasets, and academic papers. It covers multiple programming languages and domains like sociology, biology, and computer science. This repository serves as a central reference for researchers, analysts, and developers working with network data.
    Downloads: 0 This Week
  • 22
    MOA - Massive Online Analysis

    Big Data Stream Analytics Framework.

    A framework for learning from a data stream, that is, a continuous supply of examples. It includes classification, regression, clustering, outlier detection, and recommender systems. MOA is related to the WEKA project and is also written in Java, while scaling to adaptive, large-scale machine learning.
    Downloads: 70 This Week
  • 23
    Biosphere3D

    Interactive landscape rendering based on a virtual globe.

    Biosphere3D targets interactive landscape rendering based on a virtual globe. It supports DEM, satellite and aerial images, 3D models (Collada), 3D plant models, and Shapefiles. Biosphere3D was initially developed by Malte Clasen in the landscape visualization group of the Zuse Institute Berlin and is now developed further by Lenné3D GmbH. For more information about the concepts used, see the thesis of Malte Clasen: Towards Interactive Landscape Visualization Doctoral...
    Downloads: 61 This Week
  • 24
    Hibernate

    An object-relational mapping (ORM) library for Java

    The Hibernate projects offer a suite of powerful Java libraries to work with data. Hibernate is best known for Hibernate ORM, which provides relational persistence for Java models and is an implementation of the Jakarta Persistence specification. Hibernate projects no longer consistently release binaries or documentation to SourceForge. For up-to-date information, refer to the Hibernate website: * Hibernate ORM: https://hibernate.org/orm/ * Hibernate Validator:...
    Downloads: 3,446 This Week
  • 25
    applied-ml

    Papers & tech blogs by companies sharing their work on data science

    The applied-ml repository is a rich, curated collection of papers, technical articles, and case-study blog posts about how machine learning (ML) and data-driven systems are applied in real production environments by major companies. Instead of focusing solely on theoretical ML research, this repo highlights industry-scale challenges: data collection, quality, infrastructure, feature stores, model serving, monitoring, scalability, and how ML is embedded in product workflows. It acts as a...
    Downloads: 0 This Week