Showing 60 open source projects for "data collection algorithm"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Simple, Secure Domain Registration Icon
    Simple, Secure Domain Registration

    Get your domain at wholesale price. Cloudflare offers simple, secure registration with no markups, plus free DNS, CDN, and SSL integration.

    Register or renew your domain and pay only what we pay. No markups, hidden fees, or surprise add-ons. Choose from over 400 TLDs (.com, .ai, .dev). Every domain is integrated with Cloudflare's industry-leading DNS, CDN, and free SSL to make your site faster and more secure. Simple, secure, at-cost domain registration.
    Sign up for free
  • 1
    DataFrame

    DataFrame

    C++ DataFrame for statistical, Financial, and ML analysis

    This is a C++ analytical library designed for data analysis similar to libraries in Python and R. For example, you would compare this to Pandas, R data.frame, or Polars. You can slice the data in many different ways. You can join, merge, and group-by the data. You can run various statistical, summarization, financial, and ML algorithms on the data. You can add your custom algorithms easily. You can multi-column sort, custom pick, and delete the data. DataFrame also includes a large collection...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 2
    MLJAR Studio

    MLJAR Studio

    Python package for AutoML on Tabular Data with Feature Engineering

    We are working on new way for visual programming. We developed a desktop application called MLJAR Studio. It is a notebook-based development environment with interactive code recipes and a managed Python environment. All running locally on your machine. We are waiting for your feedback. The mljar-supervised is an Automated Machine Learning Python package that works with tabular data. It is designed to save time for a data scientist. It abstracts the common way to preprocess the data, construct...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 3
    NannyML

    NannyML

    Detecting silent model failure. NannyML estimates performance

    ... science, empowering data scientist to quickly understand and automatically detect silent model failure. By using NannyML, data scientists can finally maintain complete visibility and trust in their deployed machine learning models. When the actual outcome of your deployed prediction models is delayed, or even when post-deployment target labels are completely absent, you can use NannyML's CBPE-algorithm to estimate model performance.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 4
    NVIDIA NeMo

    NVIDIA NeMo

    Toolkit for conversational AI

    NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI...
    Downloads: 4 This Week
    Last Update:
    See Project
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 5
    Deepchecks

    Deepchecks

    Test Suites for validating ML models & data

    Deepchecks is the leading tool for testing and for validating your machine learning models and data, and it enables doing so with minimal effort. Deepchecks accompany you through various validation and testing needs such as verifying your data’s integrity, inspecting its distributions, validating data splits, evaluating your model and comparing between different models. While you’re in the research phase, and want to validate your data, find potential methodological problems, and/or validate...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 6
    Interpretable machine learning

    Interpretable machine learning

    Book about interpretable machine learning

    This book is about interpretable machine learning. Machine learning is being built into many products and processes of our daily lives, yet decisions made by machines don't automatically come with an explanation. An explanation increases the trust in the decision and in the machine learning model. As the programmer of an algorithm you want to know whether you can trust the learned model. Did it learn generalizable features? Or are there some odd artifacts in the training data which...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 7
    TorchMetrics

    TorchMetrics

    Machine learning metrics for distributed, scalable PyTorch application

    TorchMetrics is a collection of 80+ PyTorch metrics implementations and an easy-to-use API to create custom metrics. Your data will always be placed on the same device as your metrics. You can log Metric objects directly in Lightning to reduce even more boilerplate. The module-based metrics contain internal metric states (similar to the parameters of the PyTorch module) that automate accumulation and synchronization across devices! Automatic accumulation over multiple batches. Automatic...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 8
    Homemade Machine Learning

    Homemade Machine Learning

    Python examples of popular machine learning algorithms

    homemade-machine-learning is a repository by Oleksii Trekhleb containing Python implementations of classic machine-learning algorithms done “from scratch”, meaning you don’t rely heavily on high-level libraries but instead write the logic yourself to deepen understanding. Each algorithm is accompanied by mathematical explanations, visualizations (often via Jupyter notebooks), and interactive demos so you can tweak parameters, data, and observe outcomes in real time. The purpose is pedagogical...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 9
    The Operator Splitting QP Solver

    The Operator Splitting QP Solver

    The Operator Splitting QP Solver

    OSQP uses a specialized ADMM-based first-order method with custom sparse linear algebra routines that exploit structure in problem data. The algorithm is absolutely division-free after the setup and it requires no assumptions on problem data (the problem only needs to be convex). It just works. OSQP has an easy interface to generate customized embeddable C code with no memory manager required. OSQP supports many interfaces including C/C++, Fortran, Matlab, Python, R, Julia, Rust.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Keep company data safe with Chrome Enterprise Icon
    Keep company data safe with Chrome Enterprise

    Protect your business with AI policies and data loss prevention in the browser

    Make AI work your way with Chrome Enterprise. Block unapproved sites and set custom data controls that align with your company's policies.
    Download Chrome
  • 10
    openTSNE

    openTSNE

    Extensible, parallel implementations of t-SNE

    openTSNE is a modular Python implementation of t-Distributed Stochasitc Neighbor Embedding (t-SNE) [1], a popular dimensionality-reduction algorithm for visualizing high-dimensional data sets. openTSNE incorporates the latest improvements to the t-SNE algorithm, including the ability to add new data points to existing embeddings [2], massive speed improvements [3] [4] [5], enabling t-SNE to scale to millions of data points, and various tricks to improve the global alignment of the resulting...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    Vearch

    Vearch

    A distributed system for embedding-based vector retrieval

    ... with one click. Otherwise, you can easily customize your own image, video, or text feature extraction algorithm plugin. This GIF provides a clear demonstration of the project vearch usage and its internal structure. The use of vearch is mainly divided into three steps. Firstly, create DB and Space, then import your data, and finally, you can search on your own dataset.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    Smile

    Smile

    Statistical machine intelligence and learning engine

    Smile is a fast and comprehensive machine learning engine. With advanced data structures and algorithms, Smile delivers the state-of-art performance. Compared to this third-party benchmark, Smile outperforms R, Python, Spark, H2O, xgboost significantly. Smile is a couple of times faster than the closest competitor. The memory usage is also very efficient. If we can train advanced machine learning models on a PC, why buy a cluster? Write applications quickly in Java, Scala, or any JVM languages...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 13
    fugue

    fugue

    A unified interface for distributed computing

    Fugue is a unified interface for distributed computing that lets users execute Python, Pandas, and SQL code on Spark, Dask, and Ray with minimal rewrites.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 14
    MiniSom

    MiniSom

    MiniSom is a minimalistic implementation of the Self Organizing Maps

    MiniSom is a minimalistic and Numpy-based implementation of the Self Organizing Maps (SOM). SOM is a type of Artificial Neural Network able to convert complex, nonlinear statistical relationships between high-dimensional data items into simple geometric relationships on a low-dimensional display. Minisom is designed to allow researchers to easily build on top of it and to give students the ability to quickly grasp its details. The project initially aimed for a minimalistic implementation...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 15
    HDBSCAN

    HDBSCAN

    A high performance implementation of HDBSCAN clustering

    ..., is intuitive and easy to select. HDBSCAN is ideal for exploratory data analysis; it's a fast and robust algorithm that you can trust to return meaningful clusters (if there are any).
    Downloads: 1 This Week
    Last Update:
    See Project
  • 16
    CUTLASS

    CUTLASS

    CUDA Templates for Linear Algebra Subroutines

    CUTLASS is a collection of CUDA C++ template abstractions for implementing high-performance matrix-multiplication (GEMM) and related computations at all levels and scales within CUDA. It incorporates strategies for hierarchical decomposition and data movement similar to those used to implement cuBLAS and cuDNN. CUTLASS decomposes these "moving parts" into reusable, modular software components abstracted by C++ template classes. These thread-wide, warp-wide, block-wide, and device-wide...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    DeepLabCut

    DeepLabCut

    Implementation of DeepLabCut

    DeepLabCut™ is an efficient method for 2D and 3D markerless pose estimation based on transfer learning with deep neural networks that achieves excellent results (i.e. you can match human labeling accuracy) with minimal training data (typically 50-200 frames). We demonstrate the versatility of this framework by tracking various body parts in multiple species across a broad collection of behaviors. The package is open source, fast, robust, and can be used to compute 3D pose estimates...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 18
    Gen.jl

    Gen.jl

    A general-purpose probabilistic programming system

    ... for writing down generative models, inference models, variational families, and proposal distributions using ordinary code. But it also lets users migrate parts of their model or inference algorithm to specialized modeling languages for which it can generate especially fast code. Users can also hand-code parts of their models that demand better performance. Neural network inference is fast, but can be inaccurate on out-of-distribution data, and requires expensive training.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    DALI

    DALI

    A GPU-accelerated library containing highly optimized building blocks

    The NVIDIA Data Loading Library (DALI) is a library for data loading and pre-processing to accelerate deep learning applications. It provides a collection of highly optimized building blocks for loading and processing image, video and audio data. It can be used as a portable drop-in replacement for built-in data loaders and data iterators in popular deep learning frameworks. Deep learning applications require complex, multi-stage data processing pipelines that include loading, decoding...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Alink

    Alink

    Alink is the Machine Learning algorithm platform based on Flink

    Alink is Alibaba’s scalable machine learning algorithm platform built on Apache Flink, designed for batch and stream data processing. It provides a wide variety of ready-to-use ML algorithms for tasks like classification, regression, clustering, recommendation, and more. Written in Java and Scala, Alink is suitable for enterprise-grade big data applications where performance and scalability are crucial. It supports model training, evaluation, and deployment in real-time environments...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21
    Darts

    Darts

    A python library for easy manipulation and forecasting of time series

    darts is a Python library for easy manipulation and forecasting of time series. It contains a variety of models, from classics such as ARIMA to deep neural networks. The models can all be used in the same way, using fit() and predict() functions, similar to scikit-learn. The library also makes it easy to backtest models, combine the predictions of several models, and take external data into account. Darts supports both univariate and multivariate time series and models. The ML-based models can...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Lightning-Hydra-Template

    Lightning-Hydra-Template

    PyTorch Lightning + Hydra. A very user-friendly template

    Convenient all-in-one technology stack for deep learning prototyping - allows you to rapidly iterate over new models, datasets and tasks on different hardware accelerators like CPUs, multi-GPUs or TPUs. A collection of best practices for efficient workflow and reproducibility. Thoroughly commented - you can use this repo as a reference and educational resource. Not fitted for data engineering - the template configuration setup is not designed for building data processing pipelines that depend...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    OGB

    OGB

    Benchmark datasets, data loaders, and evaluators for graph machine

    The Open Graph Benchmark (OGB) is a collection of realistic, large-scale, and diverse benchmark datasets for machine learning on graphs. OGB datasets are automatically downloaded, processed, and split using the OGB Data Loader. The model performance can be evaluated using the OGB Evaluator in a unified manner. OGB is a community-driven initiative in active development. We expect the benchmark datasets to evolve. OGB provides a diverse set of challenging and realistic benchmark datasets...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    stkpp

    stkpp

    C++ Statistical ToolKit

    STK++ (http://www.stkpp.org) is a versatile, fast, reliable and elegant collection of C++ classes for statistics, clustering, linear algebra, arrays (with an Eigen-like API), regression, dimension reduction, etc. Some functionalities provided by the library are available in the R environment as R functions (http://cran.at.r-project.org/web/packages/rtkore/index.html). At a convenience, we propose the source packages on sourceforge. The library offers a dense set of (mostly) template...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    LRSLibrary

    LRSLibrary

    Low-Rank and Sparse Tools for Background Modeling and Subtraction

    ... learning problems beyond video. Large algorithm collection: > 100 matrix- and tensor-based low-rank + sparse methods. Open-source license, documentation and references included.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • Next