Showing 289 open source projects for "data.6bin"

  • 1
    supabase-py

    Python Client for Supabase. Query Postgres from Flask, Django

    Python Client for Supabase. Query Postgres from Flask, Django, FastAPI. Python user authentication, security policies, edge functions, file storage, and realtime data streaming. Good first issue.
    Downloads: 3 This Week
  • 2
    Bytewax

    Python Stream Processing

    ...You can use Bytewax for a variety of workloads, from moving data à la Kafka Connect all the way to advanced online machine learning workloads. Bytewax is not limited to streaming applications but excels anywhere that data can be distributed at the input and output.
    Downloads: 0 This Week
  • 3
    Deepchecks

    Test Suites for validating ML models & data

    Deepchecks is the leading tool for testing and validating your machine learning models and data, and it enables doing so with minimal effort. Deepchecks accompanies you through various validation and testing needs such as verifying your data’s integrity, inspecting its distributions, validating data splits, evaluating your model and comparing between different models. While you’re in the research phase, and want to validate your data, find potential methodological problems, and/or validate your model and evaluate it. ...
    Downloads: 0 This Week
  • 4
    omegaml

    MLOps simplified. From ML Pipeline ⇨ Data Product without the hassle

    omega|ml is the innovative Python-native MLOps platform that provides a scalable development and runtime environment for your Data Products. Works from laptop to cloud.
    Downloads: 0 This Week
  • 5
    segment-geospatial

    A Python package for segmenting geospatial data with the SAM

    The segment-geospatial package draws its inspiration from segment-anything-eo repository authored by Aliaksandr Hancharenka. To facilitate the use of the Segment Anything Model (SAM) for geospatial data, I have developed the segment-anything-py and segment-geospatial Python packages, which are now available on PyPI and conda-forge. My primary objective is to simplify the process of leveraging SAM for geospatial data analysis by enabling users to achieve this with minimal coding effort. I have adapted the source code of segment-geospatial from the segment-anything-eo repository, and credit for its original version goes to Aliaksandr Hancharenka.
    Downloads: 3 This Week
  • 6
    TabPFN

    Foundation Model for Tabular Data

    TabPFN is an open-source machine learning system that introduces a foundation model designed specifically for tabular data analysis. The model is based on transformer architectures and implements a prior-data fitted network that can perform supervised learning tasks such as classification and regression with minimal configuration. Unlike many traditional machine learning workflows that require extensive hyperparameter tuning and training cycles, TabPFN is pre-trained to perform inference directly on tabular datasets. ...
    Downloads: 2 This Week
  • 7
    Datasets

    Hub of ready-to-use datasets for ML models

    ...Datasets naturally frees the user from RAM limitations: all datasets are memory-mapped using an efficient zero-serialization-cost backend (Apache Arrow). Smart caching: never wait for your data to be processed several times.
    Downloads: 5 This Week
  • 8
    Quantitative Trading System

    A comprehensive quantitative trading system with AI-powered analysis

    Quantitative Trading System is a comprehensive quantitative trading platform that integrates artificial intelligence, financial data analysis, and automated strategy execution within a unified software system. The project is designed to provide an end-to-end infrastructure for building and operating algorithmic trading strategies in financial markets. It includes tools for collecting and processing market data from multiple sources, performing statistical and machine learning analysis, and generating trading signals based on quantitative models. ...
    Downloads: 1 This Week
  • 9
    UMAP

    Uniform Manifold Approximation and Projection

    Uniform Manifold Approximation and Projection (UMAP) is a dimension reduction technique that can be used for visualization similarly to t-SNE, but also for general non-linear dimension reduction. It is possible to model the manifold with a fuzzy topological structure. The embedding is found by searching for a low-dimensional projection of the data that has the closest possible equivalent fuzzy topological structure. First of all, UMAP is fast. It can handle large datasets and high-dimensional data without too much difficulty, scaling beyond what most t-SNE packages can manage. This includes very high-dimensional sparse datasets. UMAP has successfully been used directly on data with over a million dimensions. ...
    Downloads: 1 This Week
  • 10
    Open Notebook

    An Open Source implementation of Notebook LM with more flexibility

    Open Notebook is an open-source, privacy-focused alternative to Google’s Notebook LM that gives users full control over their research and AI workflows. Designed to be self-hosted, it ensures complete data sovereignty by keeping your content local or within your own infrastructure. The platform supports 16+ AI providers—including OpenAI, Anthropic, Ollama, Google, and LM Studio—allowing flexible model choice and cost optimization. Open Notebook enables users to organize and analyze multi-modal content such as PDFs, videos, audio files, web pages, and Office documents. ...
    Downloads: 15 This Week
  • 11
    mosaicml composer

    Supercharge Your Model Training

    composer is a deep learning training framework built on PyTorch and designed to make large-scale model training more efficient, scalable, and customizable. At the center of the project is a highly optimized Trainer abstraction that simplifies the management of training loops, parallelization, metrics, logging, and data loading. The framework is intended for modern workloads that may span anything from a single GPU to very large distributed training environments, which makes it suitable for both experimentation and production-scale development. It includes built-in support for distributed training strategies such as Fully Sharded Data Parallelism and standard Distributed Data Parallel execution, helping teams scale models without having to assemble as much infrastructure by hand.
    Downloads: 0 This Week
  • 12
    Mlxtend

    A library of extension and helper modules for Python's data analysis

    Mlxtend (machine learning extensions) is a Python library of useful tools for day-to-day data science tasks.
    Downloads: 0 This Week
  • 13
    fastai

    Deep learning library

    ...It aims to do both things without substantial compromises in ease of use, flexibility, or performance. This is possible thanks to a carefully layered architecture, which expresses common underlying patterns of many deep learning and data processing techniques in terms of decoupled abstractions. These abstractions can be expressed concisely and clearly by leveraging the dynamism of the underlying Python language and the flexibility of the PyTorch library. fastai is organized around two main design goals: to be approachable and rapidly productive, while also being deeply hackable and configurable. ...
    Downloads: 4 This Week
  • 14
    TensorFlow

    TensorFlow is an open source library for machine learning

    Originally developed by Google for internal use, TensorFlow is an open source platform for machine learning. Available across all common operating systems (desktop, server and mobile), TensorFlow provides stable APIs for Python and C as well as APIs that are not guaranteed to be backwards compatible or are 3rd party for a variety of other languages. The platform can be easily deployed on multiple CPUs, GPUs and Google's proprietary chip, the tensor processing unit (TPU). TensorFlow...
    Downloads: 35 This Week
  • 15
    SageMaker Training Toolkit

    Train machine learning models within Docker containers

    Train machine learning models within a Docker container using Amazon SageMaker. Amazon SageMaker is a fully managed service for data science and machine learning (ML) workflows. You can use Amazon SageMaker to simplify the process of building, training, and deploying ML models. To train a model, you can include your training script and dependencies in a Docker container that runs your training code. A container provides an effectively isolated environment, ensuring a consistent runtime and reliable training process. ...
    Downloads: 0 This Week
  • 16
    tslearn

    The machine learning toolkit for time series analysis in Python

    ...The three dimensions correspond to the number of time series, the number of measurements per time series and the number of dimensions respectively (n_ts, max_sz, d). In order to get the data in the right format.
    Downloads: 0 This Week
  • 17
    MNE-Python

    Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python

    MNE-Python is an open-source Python package for exploring, visualizing, and analyzing human neurophysiological data such as MEG, EEG, sEEG, ECoG, and more. It includes modules for data input/output, preprocessing, visualization, source estimation, time-frequency analysis, connectivity analysis, machine learning, statistics, and more.
    Downloads: 0 This Week
  • 18
    Darts

    A Python library for easy manipulation and forecasting of time series

    ...The models can all be used in the same way, using fit() and predict() functions, similar to scikit-learn. The library also makes it easy to backtest models, combine the predictions of several models, and take external data into account. Darts supports both univariate and multivariate time series and models. The ML-based models can be trained on potentially large datasets containing multiple time series, and some of the models offer rich support for probabilistic forecasting. We recommend first setting up a clean Python environment for your project with at least Python 3.7 using your favorite tool (conda, venv, virtualenv with or without virtualenvwrapper).
    Downloads: 2 This Week
  • 19
    Evidently

    Evaluate and monitor ML models from validation to production

    Evidently is an open-source Python library for data scientists and ML engineers. It helps evaluate, test, and monitor ML models from validation to production. It works with tabular, text data and embeddings.
    Downloads: 0 This Week
  • 20
    Ploomber

    The fastest way to build data pipelines

    Ploomber is an open-source framework designed to simplify the development and deployment of data science and machine learning pipelines. It allows developers to transform exploratory data analysis workflows into production-ready pipelines without rewriting large portions of code. The system integrates with common development environments such as Jupyter Notebook, VS Code, and PyCharm, enabling data scientists to continue working with familiar tools while building scalable workflows. ...
    Downloads: 0 This Week
  • 21
    High-Level Training Utilities Pytorch

    High-level training, data augmentation, and utilities for PyTorch

    ...Get it from the releases, or pull the master branch. This package provides a few things. A high-level module for Keras-like training with callbacks, constraints, and regularizers. Comprehensive data augmentation, transforms, sampling, and loading. Utility tensor and variable functions so you don't need numpy as often. Have any feature requests? Submit an issue! I'll make it happen. Specifically, any data augmentation, data loading, or sampling functions. ModuleTrainer. The ModuleTrainer class provides a high-level training interface that abstracts away the training loop while providing callbacks, constraints, initializers, regularizers, and more. ...
    Downloads: 0 This Week
  • 22
    Lightly

    A Python library for self-supervised learning on images

    ...We, at Lightly, are passionate engineers who want to make deep learning more efficient. That's why - together with our community - we want to popularize the use of self-supervised methods to understand and curate raw image data. Our solution can be applied before any data annotation step and the learned representations can be used to visualize and analyze datasets. This allows selecting the best core set of samples for model training through advanced filtering. We provide PyTorch, PyTorch Lightning and PyTorch Lightning distributed examples for each of the models to kickstart your project. ...
    Downloads: 1 This Week
  • 23
    Qlib

    Qlib is an AI-oriented quantitative investment platform

    Qlib is an AI-oriented quantitative investment platform, which aims to realize the potential, empower the research, and create the value of AI technologies in quantitative investment. An increasing number of SOTA Quant research works/papers are released in Qlib. With Qlib, users can easily try their ideas to create better Quant investment strategies. At the module level, Qlib is a platform that consists of...
    Downloads: 3 This Week
  • 24
    Materials Discovery: GNoME

    AI discovers 520,000 stable inorganic crystal structures for research

    Materials Discovery (GNoME) is a large-scale research initiative by Google DeepMind focused on applying graph neural networks to accelerate the discovery of stable inorganic crystal materials. The project centers on Graph Networks for Materials Exploration (GNoME), a message-passing neural network architecture trained on density functional theory (DFT) data to predict material stability and energy formation. Using GNoME, DeepMind identified 381,000 new stable materials, later expanding the dataset to include over 520,000 materials within 1 meV/atom of the convex hull as of August 2024. The repository provides datasets, model definitions, and interactive Colabs for exploring these materials, computing decomposition energies, and visualizing chemical families. ...
    Downloads: 2 This Week
  • 25
    openTSNE

    Extensible, parallel implementations of t-SNE

    openTSNE is a modular Python implementation of t-Distributed Stochastic Neighbor Embedding (t-SNE) [1], a popular dimensionality-reduction algorithm for visualizing high-dimensional data sets. openTSNE incorporates the latest improvements to the t-SNE algorithm, including the ability to add new data points to existing embeddings [2], massive speed improvements [3] [4] [5], enabling t-SNE to scale to millions of data points, and various tricks to improve the global alignment of the resulting visualizations.
    Downloads: 0 This Week