Showing 729 open source projects for "data science"

  • 1
    Data science blogs

    A curated list of data science blogs

    Data Science Blogs is a curated repository that aggregates a wide range of high-quality blogs and resources related to data science, machine learning, and analytics into a single organized collection. It serves as a discovery platform for practitioners, researchers, and learners who want to stay updated with industry trends, techniques, and insights without manually searching for reliable sources.
    Downloads: 0 This Week
  • 2
    Orchest

    Build data pipelines, the easy way

    Code, run and monitor your data pipelines all from your browser! From idea to scheduled pipeline in hours, not days. Interactively build your data science pipelines in our visual pipeline editor. Versioned as a JSON file. Run scripts or Jupyter notebooks as steps in a pipeline. Python, R, Julia, JavaScript, and Bash are supported. Parameterize your pipelines and run them periodically on a cron schedule.
    Downloads: 0 This Week
  • 3
    CARTOframes

    CARTO Python package for data scientists

    A Python package for integrating CARTO maps, analysis, and data services into data science workflows. Python data analysis workflows often rely on the de facto standards pandas and Jupyter notebooks. Integrating CARTO into this workflow saves data scientists time and energy, because they no longer have to export datasets as files or keep multiple copies of the data. Instead, CARTOframes lets them share reproducible analyses while drawing on CARTO services such as hosted dynamic or static maps and Data Observatory augmentation.
    Downloads: 0 This Week
  • 4
    GXSM

    Scanning Probe Microscopy Controller and Data Visualization Software

    GXSM -- Gnome X Scanning Microscopy: A multi-channel image and vector-probe data acquisition and visualization system designed for SPM techniques (STM,AFM..), but also SPA-LEED/LEED/LEEM data analysis. A plug-in interface allows any user add-on data-processing and special hardware and instrument support. Latest: NC-AFM and related explorative methods as SQDM can be configured. High-Speed external PAC-PLL hardware option with digital DSP link. Based on several hardware options it supports...
    Downloads: 2 This Week
  • 5
    SBEVSL is a collaborative project between Dowling and RIT on the development of a Structural Biology Extensible Visualization Scripting Language, so that users can move freely among various molecular graphics tools, such as rasmol, pymol, raster3d, etc.
    Downloads: 0 This Week
  • 6
    Mars Framework

    Mars is a tensor-based unified framework for large-scale data

    Mars is a distributed computing framework designed to scale scientific computing and data science workloads across large clusters while preserving the familiar programming interfaces of common Python libraries. The project provides a tensor-based execution model that extends the capabilities of tools such as NumPy, pandas, and scikit-learn so that large datasets can be processed in parallel without rewriting code for distributed environments.
    Downloads: 0 This Week
  • 7
    VAERity

    Uncovering truth in data

    VAERity is a free, open source tool to graphically explore the VAERS data set. It aims to eventually expand in scope to allow fast querying of arbitrary large datasets. It utilizes vaex and pandas as required to provide a balance of speed and query flexibility.
    Downloads: 0 This Week
  • 8
    UnionML

    Build and deploy machine learning microservices

    ...UnionML is an open-source Python framework built on top of Flyte™, unifying the complex ecosystem of ML tools into a single interface. Combine the tools that you love using a simple, standardized API so you can stop writing so much boilerplate and focus on what matters: the data and the models that learn from them. Fit the rich ecosystem of tools and frameworks into a common protocol for machine learning. Using industry-standard machine learning methods, implement endpoints for fetching data, training models, serving predictions (and much more) to write a complete ML stack in one place. Data science, ML engineering, and MLOps practitioners can all gather around UnionML apps as a way of defining a single source of truth about your ML system’s behavior. ...
    Downloads: 0 This Week
  • 9
    Karate Club

    An API Oriented Open-source Python Framework for Unsupervised Learning

    Karate Club is an unsupervised machine learning extension library for NetworkX. Karate Club consists of state-of-the-art methods to do unsupervised learning on graph-structured data. To put it simply it is a Swiss Army knife for small-scale graph mining research. First, it provides network embedding techniques at the node and graph level. Second, it includes a variety of overlapping and non-overlapping community detection methods. Implemented methods cover a wide range of network science (NetSci, Complenet), data mining (ICDM, CIKM, KDD), artificial intelligence (AAAI, IJCAI) and machine learning (NeurIPS, ICML, ICLR) conferences, workshops, and pieces from prominent journals.
    Downloads: 0 This Week
  • 10
    PyNanoLab

    Data analysis and visualization with matplotlib

    PyNanoLab provides a variety of tools for data analysis, statistics, curve fitting, and basic machine learning applications. Visualization in PyNanoLab is based on matplotlib. The setup tool is designed to control and set up every detail of a figure through a GUI.
    Downloads: 1 This Week
  • 11
    AllenNLP

    An open-source NLP research library, built on PyTorch

    AllenNLP makes it easy to design and evaluate new deep learning models for nearly any NLP problem, along with the infrastructure to easily run them in the cloud or on your laptop. AllenNLP includes reference implementations of high quality models for both core NLP problems (e.g. semantic role labeling) and NLP applications (e.g. textual entailment). AllenNLP supports loading "plugins" dynamically. A plugin is just a Python package that provides custom registered classes or additional...
    Downloads: 0 This Week
  • 12

    dispy

    Distributed and Parallel Computing with/for Python.

    dispy is a generic and comprehensive, yet easy-to-use framework for creating and using compute clusters to execute computations in parallel across multiple processors in a single machine (SMP), or among many machines in a cluster, grid, or cloud. dispy is well suited to the data-parallel (SIMD) paradigm, where a computation (a Python function or standalone program) is evaluated independently with different (large) datasets. dispy supports public, private, and hybrid cloud computing as well as fog and edge computing.
    Downloads: 36 This Week
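
    The data-parallel pattern mentioned in the dispy entry (one computation evaluated independently on several datasets) can be illustrated with a minimal sketch. This is not dispy's API: it uses only the Python standard library's concurrent.futures on a single machine, and the function names and sample datasets are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def summarize(dataset):
    """Compute a small summary for one independent dataset."""
    return {"n": len(dataset), "mean": mean(dataset), "max": max(dataset)}


def run_data_parallel(func, datasets, max_workers=4):
    """Apply func to every dataset independently; results keep input order."""
    # A local thread pool stands in for a cluster here (an assumption made
    # for illustration; a real cluster scheduler would dispatch the jobs
    # to other processors or machines instead).
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, datasets))


# Hypothetical input: three unrelated datasets, each processed on its own.
datasets = [[1, 2, 3], [10, 20, 30, 40], [5]]
results = run_data_parallel(summarize, datasets)
```

    Because no dataset depends on another, the same function could be handed to any scheduler without changing its body, which is the property the data-parallel description relies on.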
  • 13
    The Python Computer Graphics Kit is a collection of Python modules that contain the basic types and functions to be able to create 3D computer graphics images (focusing on Pixar's RenderMan interface).
    Downloads: 0 This Week
  • 14

    SASA Tool

    SWATH-Auto System Analyzer Tool, SASA Tool

    SWATH-Auto System Analyzer Tool, SASA Tool, is a novel SWATH platform for non-targeted metabolomics data analysis with an accurate mass spectral library for metabolite identification using SWATH acquisition mode.
    Downloads: 0 This Week
  • 15
    QuickPlot

    Simple user interface for gnuplot aimed at reflectometry data

    Graphical user interface for gnuplot that creates publication-quality figures very quickly. It supports templates for fast formatting of graphics, different plot styles, insets, and axis and label options. One important feature is storing metadata in PNG and PDF files, which can be used to reload any graph saved with QuickPlot.
    Downloads: 1 This Week
  • 16
    AWS Step Functions Data Science SDK

    For building machine learning (ML) workflows and pipelines on AWS

    ...The quickest way to see how the AWS Step Functions Data Science SDK works is to review the related example notebooks. These notebooks provide code and descriptions for creating and running workflows in AWS Step Functions using the AWS Step Functions Data Science SDK. In Amazon SageMaker, example Jupyter notebooks are available in the example notebooks portion of a notebook instance. To run the AWS Step Functions Data Science SDK example notebooks locally, download the sample notebooks and open them in a working Jupyter instance.
    Downloads: 0 This Week
  • 17

    SysBioTK

    A protein database management toolkit

    SysBioTK is a framework that helps manage information on proteins from several databases, storing the information in a library. Several search functions are also included to filter the proteins in the library. Support for different screenings, such as control groups, is also included. The tool can also extract the GeneOntology of the protein/gene list and perform several statistical tests on the data. This is a renamed and improved version of...
    Downloads: 0 This Week
  • 18
    SciDAVis is a user-friendly data analysis and visualization program primarily aimed at high-quality plotting of scientific data. It strives to combine an intuitive, easy-to-use graphical user interface with powerful features such as Python scriptability.
    Downloads: 1,110 This Week
  • 19
    XISMuS

    X-Ray Imaging Software for Multiple Samples

    ATTENTION: Cumulative update 2.5.0 has been released! The update works for any previous 2.x.x version. If upgrading from version v1.x.x, please download and install v2.0.0 first. IMPORTANT FIXES with respect to the base v2.0.0 version: v2.5.0 introduces the Differential Attenuation and Cube Viewer utilities, and migrates the user database to *.json files; v2.4.3 fixes an issue with the K element in the fit-approx method; v2.4.3 fixes an issue where saving plots with fit-approx or an auto-wizard could...
    Downloads: 2 This Week
  • 20
    MiModD

    Mutation Identification in Model Organism Genomes using Desktop PCs

    MiModD is a software package for genomic variant identification from next-generation sequencing (NGS) data with optimized usage of system resources and a user-friendly interface. For most model organism genomes it lets the user carry out a complete analysis from unaligned genomic NGS read data to an annotated list of variants on a regular Desktop PC within a few hours. Its user-interface is beginner-friendly and designed to encourage geneticists to analyze NGS data themselves without the...
    Downloads: 1 This Week
  • 21

    MITRE Annotation Toolkit

    A toolkit for managing and manipulating text annotations

    The MITRE Annotation Toolkit (MAT) is a suite of tools which can be used for automated and human tagging of annotations. Annotation is a process, used mostly by researchers in natural language processing, of enhancing documents with information about the various phrase types the documents contain. MAT supports both UI interaction and command-line interaction, and provides various levels of control over the overall annotation process. It can be customized for specific tasks (e.g.,...
    Downloads: 0 This Week
  • 22
    ECOLOG

    A database management system for ecological field surveys

    ECOLOG is a specimen-based, cross-platform relational database management system, aimed at the storage, retrieval and preliminary analysis of data on sites, species, and specimens gathered in ecological field surveys and biodiversity inventories. The main goal of ECOLOG is to make the data gathered in ecological field surveys readily accessible, providing lists of species collected in the study area and information on habitat preferences, abundance or rarity of a given species, biometrics,...
    Downloads: 0 This Week
  • 23
    HistogramsApp

    Application that generates KDE-PDP plots from geochronological data

    HistogramsApp is a Python 3.6 application that generates KDE and PDP plots from geochronological data. HistogramsApp lets you interactively set up plot parameters such as the bandwidth and the peak detection sensitivity. To cite the application, please refer to: 1) https://www.tandfonline.com/doi/abs/10.1080/00206814.2021.1954556?journalCode=tigr20 Rodriguez-Corcho, A. F., Rojas-Agramonte, Y., Barrera-Gonzalez, J. A., Marroquin-Gomez, M. P., Bonilla-Correa, S., Izquierdo-Camacho, D.,...
    Downloads: 1 This Week
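
    To make the KDE and PDP terms in the HistogramsApp entry concrete, here is a minimal sketch of how such curves are commonly computed. It is not HistogramsApp's code. It assumes a Gaussian kernel with one fixed bandwidth for the KDE, a PDP that gives each age a Gaussian as wide as its own analytical error, and a plain local-maximum test for peaks. The ages, errors, and function names are made up for illustration.

```python
import math


def gaussian(x, mu, sigma):
    """Normal probability density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def kde(ages, grid, bandwidth):
    """Kernel density estimate: every age uses the same bandwidth."""
    return [sum(gaussian(x, a, bandwidth) for a in ages) / len(ages) for x in grid]


def pdp(ages, errors, grid):
    """Probability density plot: every age uses its own 1-sigma error as width."""
    return [
        sum(gaussian(x, a, e) for a, e in zip(ages, errors)) / len(ages)
        for x in grid
    ]


def find_peaks(grid, density, min_height=0.0):
    """Return grid positions that are local maxima at or above min_height."""
    peaks = []
    for i in range(1, len(grid) - 1):
        rising = density[i] > density[i - 1]
        not_falling_behind = density[i] >= density[i + 1]
        if rising and not_falling_behind and density[i] >= min_height:
            peaks.append(grid[i])
    return peaks


# Made-up ages (Ma) with 1-sigma errors, forming two age populations.
ages = [100.0, 102.0, 98.0, 250.0, 255.0]
errors = [2.0, 3.0, 2.5, 5.0, 4.0]
grid = [float(x) for x in range(0, 401)]  # 1 Ma spacing

kde_curve = kde(ages, grid, bandwidth=10.0)
pdp_curve = pdp(ages, errors, grid)
kde_peaks = find_peaks(grid, kde_curve)
```

    In this sketch, a larger bandwidth smooths the KDE and merges nearby age populations, while raising min_height makes peak detection less sensitive. These are the two kinds of parameter the entry describes as adjustable.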
  • 24
    python-tutorial

    Practical Python tutorials, including Python basics

    python-tutorial is a practical Python learning repository that collects examples for everyday programming tasks. It covers Python fundamentals, advanced language features, object-oriented programming, multithreading, databases, data science, Flask development, web crawling, and utility scripting. The project is intended for beginners learning Python and for working developers who want reference implementations for common scripts. Its examples are tested in a Python 3 environment and organized into topic-based directories. The repository also includes notebook-style learning materials for basic data types and core language concepts. ...
    Downloads: 0 This Week
  • 25

    GetData

    Scientific Database Format

    The GetData library provides an API to interface with Dirfile databases. The Dirfile database format is designed to provide a fast, simple, scalable format for storing and reading binary, synchronously-sampled, time-ordered data.
    Downloads: 7 This Week