Showing 115 open source projects for "data science"

  • 1
    Computer Science Flash Cards

    Mini website for testing both general CS knowledge and enforce coding

    This repository collects concise flash cards that cover the core ideas of a traditional computer science curriculum with a focus on interview readiness. The cards distill topics like time and space complexity, classic data structures, algorithmic paradigms, operating systems, networking, and databases into short, testable prompts. They are designed for spaced-repetition style study so you can cycle frequently through fundamentals until recall feels automatic.
    Downloads: 1 This Week
  • 2
    Computer Science courses video lectures

    List of Computer Science courses with video lectures

    This repository is a curated list of full-length computer science video lecture series across many universities and MOOC platforms, helping learners assemble their own curriculum. The list spans foundational topics like algorithms, data structures, operating systems, computer networks, machine learning, and more, all delivered via lectures rather than just textual tutorials. The contributor guidelines encourage adding high-quality courses (not just casual tutorials) so the list remains academically oriented. ...
    Downloads: 7 This Week
  • 3
    NYC Taxi Data

    Import public NYC taxi and for-hire vehicle (Uber, Lyft)

    ...It also contains example analyses—spatial and temporal visualizations like maps, time-series plots, and hotspot detection—highlighting insights such as patterns of demand, peak times, and geospatial distributions. The repository is often used as a benchmark dataset and example for teaching, benchmarking, and demonstration purposes in the data science and urban analytics communities.
    Downloads: 1 This Week
  • 4
    Metaflow

    A framework for real-life data science

    Metaflow is a human-friendly Python library that helps scientists and engineers build and manage real-life data science projects. Metaflow was originally developed at Netflix to boost productivity of data scientists who work on a wide variety of projects from classical statistics to state-of-the-art deep learning.
    Downloads: 3 This Week
  • 5
    Awesome Fraud Detection Research Papers

    A curated list of data mining papers about fraud detection

    A curated list of data mining papers about fraud detection from several conferences.
    Downloads: 0 This Week
  • 6
    XGBoost

    Scalable and Flexible Gradient Boosting

    ...XGBoost implements machine learning algorithms under the Gradient Boosting framework. It provides parallel tree boosting (also known as GBDT, GBRT, or GBM) that can quickly and accurately solve many data science problems. XGBoost can be used from Python, Java, Scala, R, C++, and more. It can run on a single machine, Hadoop, Spark, Dask, Flink, and most other distributed environments, and it can scale to problems with billions of examples and beyond.
    Downloads: 2 This Week
  • 7
    cuDF

    GPU DataFrame Library

    ...The RAPIDS suite of open-source software libraries aims to enable the execution of end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.
    Downloads: 0 This Week
  • 8
    SIT742

    SIT742: Modern Data Science

    SIT742: Modern Data Science.
    Downloads: 2 This Week
  • 9
    Synapse Machine Learning

    Simple and distributed Machine Learning

    ...SynapseML builds on Apache Spark and SparkML to enable new kinds of machine learning, analytics, and model deployment workflows. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with the Open Neural Network Exchange (ONNX), LightGBM, the Cognitive Services, Vowpal Wabbit, and OpenCV. These tools enable powerful and highly scalable predictive and analytical models for a variety of data sources. SynapseML also brings new networking capabilities to the Spark ecosystem. ...
    Downloads: 1 This Week
  • 10
    RStudio Cheatsheets

    Curated collection of official cheat sheets for data science tools

    The cheatsheets repository from RStudio is a curated collection of official cheat sheets for R, RStudio, the tidyverse, Shiny, and related data science tools. Each cheat sheet is a single (or double) page PDF that condenses important syntax, functions, workflows, and best practices into a visually organized format ideal for quick reference. The repository contains source files (R Markdown or LaTeX) that generate the cheat sheets, version history, and metadata (title, author, description) for each. ...
    Downloads: 2 This Week
  • 11
    PSLab Android App

    PSLab Android App

    Repository for the PSLab Android App, used for performing experiments with the Pocket Science Lab (PSLab) open-hardware platform. PSLab is a tiny pocket science lab that provides an array of equipment for doing science and engineering experiments. It can function as an oscilloscope, waveform generator, frequency counter, programmable voltage and current source, and data logger. ...
    Downloads: 0 This Week
  • 12
    dplyr

    dplyr: A grammar of data manipulation

    dplyr is an R package that provides a consistent and intuitive grammar for data manipulation, enabling users to filter, arrange, summarize, and transform data efficiently. Part of the tidyverse ecosystem, dplyr simplifies complex data operations through a clear and readable syntax, whether working with data frames, tibbles, or databases. It is widely used in data science and statistical analysis workflows.
    Downloads: 0 This Week
  • 13
    XState

    State machines and statecharts for the modern web

    JavaScript and TypeScript finite state machines and statecharts for the modern web. Statecharts are a formalism for modeling stateful, reactive systems. This is useful for declaratively describing the behavior of your application, from the individual components to the overall application logic. XState is a library for creating, interpreting, and executing finite state machines and statecharts, as well as managing invocations of those machines as actors. The following fundamental computer...
    Downloads: 1 This Week
  • 14
    Jupyter Docker Stacks

    Ready-to-run Docker images containing Jupyter applications

    Jupyter Docker Stacks provides a curated set of ready-to-run Docker container images that bundle Jupyter applications with popular data science and computing tools, enabling users to quickly start working in a reproducible environment. These stacks support a range of use cases, from lightweight base notebook images to full-featured environments that include scientific computing libraries, machine learning tools, and IDE-like notebook interfaces, all within Docker containers that run consistently across machines. ...
    Downloads: 3 This Week
  • 15
    InterviewGuide

    Repository that collects extensive computer science

    ...The repository contains curated algorithm and data structure explanations, collections of interview questions, high-frequency topics, and practical problem-solving guides to help users brush up on skills that are often tested in technical interviews.
    Downloads: 0 This Week
  • 16
    kagglehub

    Python library to access Kaggle resources

    ...The library is designed to work both inside and outside Kaggle Notebooks, with native behavior that can adapt when it runs in Kaggle’s hosted notebook environment. It is useful for machine learning workflows where data, models, and notebook artifacts need to be pulled into scripts, experiments, or pipelines. kagglehub also supports authentication so users can access private or restricted resources when their account has permission. Its main value is making Kaggle assets easier to consume programmatically in Python-first data science and AI development workflows.
    Downloads: 0 This Week
  • 17
    all AI news

    A list of online news & info sources in the AI/ML/Data Science space

    all AI news is a curated repository that aggregates and organizes sources for AI-related news and information. It serves as a centralized collection of feeds, links, and resources that can be used to build news aggregation systems or stay updated on developments in artificial intelligence. The project is designed to be easily extendable, allowing users to add new sources or customize the dataset for their specific needs. It is particularly useful for developers building AI news platforms,...
    Downloads: 0 This Week
  • 18
    huihut interview

    A summary of C/C++ technical interview basics

    interview is a curated repository of technical interview questions, solutions, and explanations covering a wide range of topics in computer science and software engineering. It aims to help developers prepare for job interviews by providing sample problems in algorithms, data structures, system design, databases, and programming language intricacies, often with code snippets and discussion. The repo is designed so learners can practice real interview scenarios, compare approaches, and internalize foundational concepts that are frequently tested by tech companies. ...
    Downloads: 0 This Week
  • 19
    FairChem

    FAIR Chemistry's library of machine learning methods for chemistry

    FAIRChem is a unified library for machine learning in chemistry and materials, consolidating data, pretrained models, demos, and application code into a single, versioned toolkit. Version 2 modernizes the stack with a cleaner core package and breaking changes relative to V1, focusing on simpler installs and a stable API surface for production and research. The centerpiece models (e.g., UMA variants) plug directly into the ASE ecosystem via a FAIRChem calculator, so users can run relaxations,...
    Downloads: 0 This Week
  • 20
    Otter-Grader

    A Python and R autograding solution

    Otter Grader is a lightweight, modular open-source autograder developed by the Data Science Education Program at UC Berkeley. It is designed to work with classes at any scale by abstracting away the autograding internals in a way that is compatible with any instructor's assignment distribution and collection pipeline. Otter supports local grading through parallel Docker containers, grading using the autograder platforms of third-party learning management systems (LMSs), the deployment of an Otter-managed grading virtual machine, and a client package that allows students to run public checks on their own machines. ...
    Downloads: 0 This Week
  • 21
    Awesome Network Analysis

    A curated list of awesome network analysis resources

    awesome-network-analysis is a curated list of resources focused on network and graph analysis, including libraries, frameworks, visualization tools, datasets, and academic papers. It covers multiple programming languages and domains like sociology, biology, and computer science. This repository serves as a central reference for researchers, analysts, and developers working with network data.
    Downloads: 0 This Week
  • 22
    Patch-clamp data reader

    Library for reading files created by HEKA Pulse and Patch-master.

    Package and libraries (DLLs on Windows) that read files created by the Pulse (ReadPulse.dll) and Patch Master (ReadPMaster.dll) software from HEKA. They can read a trace from a whole group, a series, or a sweep. Service information about pulse protocols, voltage, amplifier state, and so on is also retrieved. A detailed description of the functions can be found in the source code. The files readpmaster.lpr and readpulse.lpr contain full lists of exported functions with their descriptions. Project...
    Downloads: 0 This Week
  • 23
    Octave Forge

    A collection of packages providing extra functionality for GNU Octave

    Octave Forge is a central location for collaborative development of packages for GNU Octave. The Octave Forge packages expand Octave's core functionality by providing field-specific features via Octave's package system. See https://octave.sourceforge.io/packages.php for a list of all available packages. GNU Octave is a high-level interpreted language, primarily intended for numerical computations. It provides capabilities for the numerical solution of linear and nonlinear problems, and...
    Downloads: 805 This Week
  • 24
    applied-ml

    Papers & tech blogs by companies sharing their work on data science

    The applied-ml repository is a rich, curated collection of papers, technical articles, and case-study blog posts about how machine learning (ML) and data-driven systems are applied in real production environments by major companies. Instead of focusing solely on theoretical ML research, this repo highlights industry-scale challenges: data collection, quality, infrastructure, feature stores, model serving, monitoring, scalability, and how ML is embedded in product workflows. It acts as a...
    Downloads: 0 This Week
  • 25
    FastReport Open Source

    Free Open Source Reporting tool for .NET

    Free Open Source Reporting tool for .NET Core/.NET Framework that helps your application generate document-like reports.
    Downloads: 38 This Week