Showing 58 open source projects for "data collection algorithm"

View related business solutions
  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    Build gen AI apps with an all-in-one modern database: MongoDB Atlas

    MongoDB Atlas provides built-in vector search and a flexible document model so developers can build, scale, and run gen AI apps without stitching together multiple databases. From LLM integration to semantic search, Atlas simplifies your AI architecture—and it’s free to get started.
    Start Free
  • Simple, Secure Domain Registration Icon
    Simple, Secure Domain Registration

    Get your domain at wholesale price. Cloudflare offers simple, secure registration with no markups, plus free DNS, CDN, and SSL integration.

    Register or renew your domain and pay only what we pay. No markups, hidden fees, or surprise add-ons. Choose from over 400 TLDs (.com, .ai, .dev). Every domain is integrated with Cloudflare's industry-leading DNS, CDN, and free SSL to make your site faster and more secure. Simple, secure, at-cost domain registration.
    Sign up for free
  • 1
    PaddleX

    PaddleX

    PaddlePaddle End-to-End Development Toolkit

    ..., the validation set and the test set. Therefore, we need to divide the above data. Using the paddlex command, the data set can be randomly divided into 70% training set, 20% validation set and 10% test set. If you use the PaddleX visualization client for model training, the data set division function is integrated in the client, and you do not need to use command division by yourself.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Darts

    Darts

    A python library for easy manipulation and forecasting of time series

    darts is a Python library for easy manipulation and forecasting of time series. It contains a variety of models, from classics such as ARIMA to deep neural networks. The models can all be used in the same way, using fit() and predict() functions, similar to scikit-learn. The library also makes it easy to backtest models, combine the predictions of several models, and take external data into account. Darts supports both univariate and multivariate time series and models. The ML-based models can...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    TextGen

    TextGen

    textgen, Text Generation models

    Implementation of Text Generation models. textgen implements a variety of text generation models, including UDA, GPT2, Seq2Seq, BART, T5, SongNet and other models, out of the box. UDA, non-core word replacement. EDA, simple data augmentation technique: similar words, synonym replacement, random word insertion, deletion, replacement. This project refers to Google's UDA (non-core word replacement) algorithm and EDA algorithm, based on TF-IDF to replace some unimportant words in sentences...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    Lightning-Hydra-Template

    Lightning-Hydra-Template

    PyTorch Lightning + Hydra. A very user-friendly template

    Convenient all-in-one technology stack for deep learning prototyping - allows you to rapidly iterate over new models, datasets and tasks on different hardware accelerators like CPUs, multi-GPUs or TPUs. A collection of best practices for efficient workflow and reproducibility. Thoroughly commented - you can use this repo as a reference and educational resource. Not fitted for data engineering - the template configuration setup is not designed for building data processing pipelines that depend...
    Downloads: 1 This Week
    Last Update:
    See Project
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 5
    AnyTrading

    AnyTrading

    The most simple, flexible, and comprehensive OpenAI Gym trading

    gym-anytrading is an OpenAI Gym-compatible environment designed for developing and testing reinforcement learning algorithms on trading strategies. It simulates trading environments for financial markets, including stocks and forex.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    funNLP

    funNLP

    Resources, corpora, and tools for Chinese natural language processing

    FunNLP is a large, curated collection of resources, corpora, and tools for Chinese natural language processing (NLP). It aggregates datasets, lexicons, wordlists, sentiment dictionaries, knowledge graphs, and pretrained model references, serving as a one-stop resource hub for Chinese NLP practitioners. The repository is organized into categories such as sentiment analysis, text classification, named entity recognition, knowledge graphs, and various lexicons (e.g. sensitive words, emotion...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    ChatGPT Plugins Collection

    ChatGPT Plugins Collection

    An unofficial collection of Plugins for ChatGPT

    ChatGPT-Plugins-Collection is a community-driven repository that gathers examples and resources for building, testing, and experimenting with ChatGPT plugins. The collection provides a variety of plugin implementations that showcase different use cases, helping developers learn how to extend ChatGPT’s functionality. It is designed to serve both as a learning resource for beginners and a reference point for more experienced developers. By centralizing community contributions, the repository...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    OGB

    OGB

    Benchmark datasets, data loaders, and evaluators for graph machine

    The Open Graph Benchmark (OGB) is a collection of realistic, large-scale, and diverse benchmark datasets for machine learning on graphs. OGB datasets are automatically downloaded, processed, and split using the OGB Data Loader. The model performance can be evaluated using the OGB Evaluator in a unified manner. OGB is a community-driven initiative in active development. We expect the benchmark datasets to evolve. OGB provides a diverse set of challenging and realistic benchmark datasets...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    FFCV

    FFCV

    Fast Forward Computer Vision (and other ML workloads!)

    ffcv is a drop-in data loading system that dramatically increases data throughput in model training. From gridding to benchmarking to fast research iteration, there are many reasons to want faster model training. Below we present premade codebases for training on ImageNet and CIFAR, including both (a) extensible codebases and (b) numerous premade training configurations.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Build Securely on AWS with Proven Frameworks Icon
    Build Securely on AWS with Proven Frameworks

    Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

    Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.
    Download Now
  • 10
    GFPGAN

    GFPGAN

    GFPGAN aims at developing Practical Algorithms

    ... a Practical Algorithm for Real-world Face Restoration. It leverages rich and diverse priors encapsulated in a pretrained face GAN (e.g., StyleGAN2) for blind face restoration. Add V1.3 model, which produces more natural restoration results, and better results on very low-quality / high-quality inputs.
    Downloads: 87 This Week
    Last Update:
    See Project
  • 11
    AllenNLP

    AllenNLP

    An open-source NLP research library, built on PyTorch

    AllenNLP makes it easy to design and evaluate new deep learning models for nearly any NLP problem, along with the infrastructure to easily run them in the cloud or on your laptop. AllenNLP includes reference implementations of high quality models for both core NLP problems (e.g. semantic role labeling) and NLP applications (e.g. textual entailment). AllenNLP supports loading "plugins" dynamically. A plugin is just a Python package that provides custom registered classes or additional...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Gym

    Gym

    Toolkit for developing and comparing reinforcement learning algorithms

    Gym by OpenAI is a toolkit for developing and comparing reinforcement learning algorithms. It supports teaching agents, everything from walking to playing games like Pong or Pinball. Open source interface to reinforce learning tasks. The gym library provides an easy-to-use suite of reinforcement learning tasks. Gym provides the environment, you provide the algorithm. You can write your agent using your existing numerical computation library, such as TensorFlow or Theano. It makes no assumptions...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 13
    Auto-PyTorch

    Auto-PyTorch

    Automatic architecture search and hyperparameter optimization

    While early AutoML frameworks focused on optimizing traditional ML pipelines and their hyperparameters, another trend in AutoML is to focus on neural architecture search. To bring the best of these two worlds together, we developed Auto-PyTorch, which jointly and robustly optimizes the network architecture and the training hyperparameters to enable fully automated deep learning (AutoDL). Auto-PyTorch is mainly developed to support tabular data (classification, regression) and time series data...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Photonix Photo Manager

    Photonix Photo Manager

    A modern, web-based photo management server

    A modern, web-based photo management server. Run it on your home server and it will let you find the right photo from your collection on any device. Smart filtering is made possible by object recognition, face recognition, location awareness, color analysis and other ML algorithms. This project is currently in development and not feature complete for a version 1.0 yet. If you don't mind putting up with broken parts or want to help out, run the Docker image and give it a go. I'd love for other...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    ReinventCommunity

    ReinventCommunity

    Jupyter Notebook tutorials for REINVENT 3.2

    This repository is a collection of useful jupyter notebooks, code snippets and example JSON files illustrating the use of Reinvent 3.2.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Objectron

    Objectron

    A dataset of short, object-centric video clips

    The Objectron dataset is a collection of short, object-centric video clips, which are accompanied by AR session metadata that includes camera poses, sparse point-clouds and characterization of the planar surfaces in the surrounding environment. In each video, the camera moves around the object, capturing it from different angles. The data also contain manually annotated 3D bounding boxes for each object, which describe the object’s position, orientation, and dimensions. The dataset consists...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    BerryNet

    BerryNet

    Deep learning gateway on Raspberry Pi and other edge devices

    This project turns edge devices such as Raspberry Pi into an intelligent gateway with deep learning running on it. No internet connection is required, everything is done locally on the edge device itself. Further, multiple edge devices can create a distributed AIoT network. At DT42, we believe that bringing deep learning to edge devices is the trend towards the future. It not only saves costs of data transmission and storage but also makes devices able to respond according to the events shown...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    CC-Net

    CC-Net

    Tools to download and cleanup Common Crawl data

    cc_net provides tools to download, segment, clean, and filter Common Crawl to build large-scale text corpora, including monolingual datasets and the multilingual CC-100 collection introduced in the associated paper. It includes pipelines to fetch snapshots, extract text, de-duplicate, identify language, and apply quality filtering based on heuristics and language models. The outputs are intended for pretraining language models and for creating standardized corpora that can be reproduced...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    MMdnn

    MMdnn

    Tools to help users inter-operate among deep learning frameworks

    ..., which means you can train a model with one framework and deploy it with another. During the model conversion, we generate some code snippets to simplify later retraining or inference. We provide a model collection to help you find some popular models. We provide a model visualizer to display the network architecture more intuitively. We provide some guidelines to help you deploy DL models to another hardware platform.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Forecasting Best Practices

    Forecasting Best Practices

    Time Series Forecasting Best Practices & Examples

    Time series forecasting is one of the most important topics in data science. Almost every business needs to predict the future in order to make better decisions and allocate resources more effectively. This repository provides examples and best practice guidelines for building forecasting solutions. The goal of this repository is to build a comprehensive set of tools and examples that leverage recent advances in forecasting algorithms to build solutions and operationalize them. Rather than...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    RecNN

    RecNN

    Reinforced Recommendation toolkit built around pytorch 1.7

    This is my school project. It focuses on Reinforcement Learning for personalized news recommendation. The main distinction is that it tries to solve online off-policy learning with dynamically generated item embeddings. I want to create a library with SOTA algorithms for reinforcement learning recommendation, providing the level of abstraction you like.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    CCZero (中国象棋Zero)

    CCZero (中国象棋Zero)

    Implement AlphaZero/AlphaGo Zero methods on Chinese chess

    ChineseChess-AlphaZero is a project that implements the AlphaZero algorithm for the game of Chinese Chess (Xiangqi). It adapts DeepMind’s AlphaZero method—combining neural networks and Monte Carlo Tree Search (MCTS)—to learn and play Chinese Chess without prior human data. The system includes self-play, training, and evaluation pipelines tailored to Xiangqi's unique game mechanics.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    lazynlp

    lazynlp

    Library to scrape and clean web pages to create massive datasets

    LazyNLP is a lightweight tool for collecting and curating large-scale text datasets for machine learning and NLP applications with minimal manual effort.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    CTS Surveyor

    CTS Surveyor

    Foot traffic and facial analytics for your business and home

    Surveyor is a software solution that monitors its environment via camera and gathers demographic information about the public in the surrounding area, providing important statistics such as number of people passing by as well as providing facial analytics to classify the pedestrians based on their age and gender. The statistical data is stored in a local database and is made available via RESTful API’s, and easy integration with other applications can be accomplished via a WebSocket interface...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 25
    AI learning

    AI learning

    AiLearning, data analysis plus machine learning practice

    We actively respond to the Research Open Source Initiative (DOCX) . Open source today is not just open source, but datasets, models, tutorials, and experimental records. We are also exploring other categories of open source solutions and protocols. I hope you will understand this initiative, combine this initiative with your own interests, and do what you can. Everyone's tiny contributions, together, are the entire open source ecosystem. We are iBooker, a large open-source community,...
    Downloads: 0 This Week
    Last Update:
    See Project