Showing 64 open source projects for "virtual-machine"

  • 1
    NVIDIA Merlin

    Library providing end-to-end GPU-accelerated recommender systems

    NVIDIA Merlin is an open-source library that accelerates recommender systems on NVIDIA GPUs. The library enables data scientists, machine learning engineers, and researchers to build high-performing recommenders at scale. Merlin includes tools to address common feature engineering, training, and inference challenges. Each stage of the Merlin pipeline is optimized to support hundreds of terabytes of data, which is all accessible through easy-to-use APIs. For more information, see NVIDIA Merlin on the NVIDIA developer website. ...
    Downloads: 0 This Week
  • 2
    Luigi

    Python module that helps you build complex pipelines of batch jobs

    ...You want to chain many tasks and automate them, and failures will happen. These tasks can be anything, but are typically long-running things like Hadoop jobs, dumping data to/from databases, running machine learning algorithms, or anything else. You can build pretty much any task you want, but Luigi also comes with a toolbox of several common task templates that you can use. It includes support for running Python MapReduce jobs in Hadoop, as well as Hive and Pig jobs. It also comes with file system abstractions for HDFS and local files that ensure all file system operations are atomic.
    Downloads: 0 This Week
  • 3
    SageMaker Spark Container

    Docker image used to run data processing workloads

    ...It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing. The SageMaker Spark Container is a Docker image used to run batch data processing workloads on Amazon SageMaker using the Apache Spark framework. The container images in this repository are used to build the pre-built container images that are used when running Spark jobs on Amazon SageMaker using the SageMaker Python SDK. ...
    Downloads: 0 This Week
  • 4
    Fondant

    Production-ready data processing made easy and shareable

    Fondant is a modular, pipeline-based framework designed to simplify the preparation of large-scale datasets for training machine learning models, especially foundation models. It offers an end-to-end system for ingesting raw data, applying transformations, filtering, and formatting outputs—all while remaining scalable and traceable. Fondant is designed with reproducibility in mind and supports containerized steps using Docker, making it easy to share and reuse data processing components. ...
    Downloads: 0 This Week
  • 5
    airda

    airda (Air Data Agent)

    airda (Air Data Agent) is a multi-agent system for data analysis. It can understand data development and data analysis needs, understand data, and generate SQL and Python code for data-oriented queries, data visualization, machine learning, and other tasks.
    Downloads: 0 This Week
  • 6
    dxf2gcode

    DXF2GCODE: converting 2D dxf drawings to CNC machine compatible G-Code

    DXF2GCODE is a tool for converting 2D (dxf, pdf, ps) drawings to CNC machine compatible G-Code. It supports Windows, Linux, and Mac by using the Python scripting language.
    Downloads: 440 This Week
  • 7
    PySchool

    Installable / Portable Python Distribution for Everyone.

    PySchool is a free and open-source Python distribution intended primarily for students who learn Python and data analysis, but it can also be used by scientists, engineers, and data scientists. It includes more than 150 Python packages (full edition) including numpy, pandas, scipy, sympy, keras, scikit-learn, matplotlib, seaborn, beautifulsoup4...
    Downloads: 1,796 This Week
  • 8
    SMILI

    Scientific Visualisation Made Easy

    The Simple Medical Imaging Library Interface (SMILI), pronounced 'smilie', is an open-source, lightweight, and easy-to-use medical imaging viewer and library for all major operating systems. The main sMILX application has features for viewing n-D images, vector images, and DICOMs, as well as anonymizing, shape analysis, and models/surfaces, with easy drag-and-drop functions. It also features a number of standard processing algorithms for smoothing, thresholding, masking etc. images and models, both with...
    Downloads: 41 This Week
  • 9
    Uranie

    Uranie is CEA's uncertainty analysis platform, based on ROOT

    Uranie is a sensitivity and uncertainty analysis platform based on the ROOT framework (http://root.cern.ch). It is developed at CEA, the French Atomic Energy Commission (http://www.cea.fr). It provides various tools for data analysis, sampling, statistical modeling, optimisation, sensitivity analysis, uncertainty analysis, running code on high-performance computers, etc. Thanks to ROOT, it is easily scriptable in CINT (C++-like syntax) and Python. It is...
    Downloads: 14 This Week
  • 10
    sadsa

    SADSA (Software Application for Data Science and Analytics)

    SADSA (Software Application for Data Science and Analytics) is a Python-based desktop application designed to simplify statistical analysis, machine learning, and data visualization for students, researchers, and data professionals. Built using Python for the GUI, SADSA provides a menu-driven interface for handling datasets, applying transformations, running advanced statistical tests, machine learning algorithms, and generating insightful plots — all without writing code.
    Downloads: 0 This Week
  • 11
    Datapipe

    Real-time, incremental ETL library for ML with record-level depend

    Datapipe is a real-time, incremental ETL library for Python with record-level dependency tracking. Datapipe is designed to streamline the creation of data processing pipelines. It excels in scenarios where data is continuously changing, requiring pipelines to adapt and process only the modified data efficiently. This library tracks dependencies for each record in the pipeline, ensuring minimal and efficient data processing.
    Downloads: 0 This Week
  • 12
    DataPrep

    Python-based data preprocessing tool

    DataPrep v0.2 is a Tkinter-based GUI tool designed to assist users in data preprocessing, multicollinearity removal, and feature selection for a wide range of applications in cheminformatics, bioinformatics, data analysis, feature selection, molecular modeling, machine learning, and quantitative structure-property relationship (QSPR) studies. It includes functionality to load, process, and save datasets, with support for different preprocessing and multicollinearity removal strategies and customizable parameter settings.
    Downloads: 2 This Week
  • 13
    Ubix Linux

    The Pocket Datalab

    Ubix stands for Universal Business Intelligence Computing System. Ubix Linux is an open-source, Debian-based Linux distribution geared towards data acquisition, transformation, analysis, and presentation. Its purpose is to offer a tiny but versatile datalab. Ubix Linux is easily accessible, resource-efficient, and completely portable on a simple USB key. It is a perfect toolset for learning data analysis and artificial intelligence basics on small to medium...
    Downloads: 3 This Week
  • 14
    Mara Pipelines

    A lightweight opinionated ETL framework, halfway between plain scripts

    ...Nodes depend on the completion of upstream nodes. No data dependencies or data flows. No in-app data processing: command line tools as the main tool for interacting with databases and data. Single machine pipeline execution based on Python's multiprocessing. No need for distributed task queues. Easy debugging and output logging. Cost based priority queues: nodes with higher cost (based on recorded run times) are run first.
    Downloads: 0 This Week
  • 15
    SageMaker Inference Toolkit

    Serve machine learning models within a Docker container

    Serve machine learning models within a Docker container using Amazon SageMaker. Amazon SageMaker is a fully managed service for data science and machine learning (ML) workflows. You can use Amazon SageMaker to simplify the process of building, training, and deploying ML models. Once you have a trained model, you can include it in a Docker container that runs your inference code.
    Downloads: 0 This Week
  • 16
    Orchest

    Build data pipelines, the easy way

    ...Each step runs a file in a container. It's that simple! Spin up services whose lifetime spans across the entire pipeline run. Easily define your dependencies to run on any machine. Run any subset of the pipeline directly or periodically.
    Downloads: 0 This Week
  • 17
    PyNanoLab

    Data analysis and visualization with matplotlib

    PyNanoLab contains a variety of tools for data analysis, statistics, curve fitting, and basic machine learning applications. Visualization in PyNanoLab is based on matplotlib. The setup tool is designed to control and set up all the details of the figure with a GUI.
    Downloads: 7 This Week
  • 18
    Padasip

    Python Adaptive Signal Processing

    ...The library is lightweight, well-documented, and ideal for research, prototyping, or teaching purposes. Padasip supports both supervised and unsupervised filtering modes and is built to be modular and extensible, making it easy to integrate into larger machine learning pipelines or control systems.
    Downloads: 0 This Week
  • 19
    AWS Step Functions Data Science SDK

    For building machine learning (ML) workflows and pipelines on AWS

    The AWS Step Functions Data Science SDK is an open-source library that allows data scientists to easily create workflows that process and publish machine learning models using Amazon SageMaker and AWS Step Functions. You can create machine learning workflows in Python that orchestrate AWS infrastructure at scale, without having to provision and integrate the AWS services separately. The best way to quickly review how the AWS Step Functions Data Science SDK works is to review the related example notebooks. ...
    Downloads: 0 This Week
  • 20
    ML workspace

    All-in-one web-based IDE specialized for machine learning

    The ML workspace is an all-in-one web-based IDE specialized for machine learning and data science. It is simple to deploy and gets you started within minutes to productively build ML solutions on your own machines. This workspace is the ultimate tool for developers, preloaded with a variety of popular data science libraries (e.g., TensorFlow, PyTorch, Keras, Sklearn) and dev tools (e.g., Jupyter, VS Code, Tensorboard) perfectly configured, optimized, and integrated. ...
    Downloads: 1 This Week
  • 21
    Data Science Notes

    Curated collection of data science learning materials

    Data Science Notes is a large, curated collection of data science learning materials, with explanations, code snippets, and structured notes across the typical end-to-end workflow. It spans foundational math and statistics through data wrangling, visualization, machine learning, and practical project organization. The content emphasizes hands-on understanding by pairing narrative notes with runnable examples, making it useful for both self-study and classroom settings. Because it aggregates topics in one place, learners can move linearly or jump into specific areas as needed during projects. The notes also highlight common pitfalls and good practices, which helps beginners adopt professional habits early. ...
    Downloads: 0 This Week
  • 22
    OpenFrames

    Real-time interactive 3D graphics API for scientific simulations

    ...A simulation developer can use OpenFrames to specify what they want to visualize, without having to know any details of computer graphics programming. OpenFrames is currently used by three NASA programs: Copernicus (NASA JSC), the General Mission Analysis Tool (GMAT, NASA GSFC), and a Virtual Reality exploration tool (NASA GSFC).
    Downloads: 0 This Week
  • 23
    StellarGraph

    Machine Learning on Graphs

    StellarGraph is a Python library for machine learning on graphs and networks. The StellarGraph library offers state-of-the-art algorithms for graph machine learning, making it easy to discover patterns and answer questions about graph-structured data. It can solve many machine learning tasks. Graph-structured data represent entities as nodes (or vertices) and relationships between them as edges (or links), and can include data associated with either as attributes. ...
    Downloads: 0 This Week
  • 24
    MMdnn

    Tools to help users inter-operate among deep learning frameworks

    MMdnn is a set of tools to help users inter-operate among different deep learning frameworks, e.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, TensorFlow, CNTK, PyTorch, ONNX, and CoreML. MMdnn is a comprehensive and cross-framework tool to convert, visualize, and diagnose deep learning (DL) models. The "MM" stands for model management, and "dnn" is the acronym of deep neural network. We implement a universal converter to convert DL models between frameworks,...
    Downloads: 0 This Week
  • 25
    Forecasting Best Practices

    Time Series Forecasting Best Practices & Examples

    Time series forecasting is one of the most important topics in data science. Almost every business needs to predict the future in order to make better decisions and allocate resources more effectively. This repository provides examples and best practice guidelines for building forecasting solutions. The goal of this repository is to build a comprehensive set of tools and examples that leverage recent advances in forecasting algorithms to build solutions and operationalize them. Rather than...
    Downloads: 0 This Week