58 projects for "q learning algorithm" with 2 filters applied:

  • 1
    Deep-Learning-Interview-Book

    Interview guide for machine learning, mathematics, and deep learning

    Deep-Learning-Interview-Book collects structured notes, Q&A, and concept summaries tailored to deep-learning interviews, turning scattered study into a coherent playbook. It spans the core math (linear algebra, probability, optimization) and the practitioner topics candidates actually face, like CNNs, RNNs/Transformers, attention, regularization, and training tricks.
    Downloads: 0 This Week
  • 2
    D4RL

    Collection of reference environments, offline reinforcement learning

    D4RL (Datasets for Deep Data-Driven Reinforcement Learning) is a benchmark suite focused on offline reinforcement learning — i.e., learning policies from fixed datasets rather than via online interaction with the environment. It contains standardized environments, tasks and datasets (observations, actions, rewards, terminals) aimed at enabling reproducible research in offline RL. Researchers can load a dataset for a given task (e.g., maze navigation, manipulation) and apply their algorithm without the need to collect fresh transitions, which accelerates experimentation and comparison. ...
    Downloads: 1 This Week
  • 3
    TensorHouse

    A collection of reference Jupyter notebooks and demo AI/ML applications

    TensorHouse is a collection of reference Jupyter notebooks and demo AI/ML applications for enterprise use cases such as marketing, pricing, and supply chain.
    Downloads: 0 This Week
  • 4
    AI-Job-Notes

    AI algorithm position job search strategy

    AI-Job-Notes is a pragmatic notebook for landing roles in machine learning, computer vision, and related engineering tracks. It assembles study paths, checklists, and interview prep materials, but also covers job-search mechanics—portfolio building, resume patterns, and communication tips. The emphasis is on doing: practicing with project ideas, setting up reproducible experiments, and showcasing results that convey impact. It ties technical study (ML/DL fundamentals) to real hiring signals...
    Downloads: 0 This Week
  • 5
    GenAI Agents

    Implementations for various Generative AI Agent techniques

    GenAI Agents is a large, tutorial-driven repository that teaches you how to design, build, and experiment with generative AI agents. It spans a spectrum from simple conversational bots and basic question-answering agents to complex multi-agent systems that coordinate on research, education, business workflows, and creative tasks. The implementations leverage modern frameworks such as LangChain, LangGraph, AutoGen, PydanticAI, CrewAI, and more, showing how each can be wired into realistic...
    Downloads: 1 This Week
  • 6
    MiniMax-M1

    Open-weight, large-scale hybrid-attention reasoning model

    MiniMax-M1 is presented as the world’s first open-weight, large-scale hybrid-attention reasoning model, designed to push the frontier of long-context, tool-using, and deeply “thinking” language models. It is built on the MiniMax-Text-01 foundation and keeps the same massive parameter budget, but reworks the attention and training setup for better reasoning and test-time compute scaling. Architecturally, it combines Mixture-of-Experts layers with lightning attention, enabling the model to...
    Downloads: 3 This Week
  • 7
    OpenCV

    Open Source Computer Vision Library

    The Open Source Computer Vision Library has more than 2500 algorithms, extensive documentation, and sample code for real-time computer vision. It works on Windows, Linux, Mac OS X, Android, and iOS, and in your browser through JavaScript.
    Languages: C++, Python, Julia, JavaScript
    Homepage: https://opencv.org
    Q&A forum: https://forum.opencv.org/
    Documentation: https://docs.opencv.org
    Source code: https://github.com/opencv
    Please pay special attention to our tutorials!...
    Downloads: 3,177 This Week
  • 8
    GLM-4-32B-0414

    Open Multilingual Multimodal Chat LMs

    GLM-4-32B-0414 is a powerful open-source large language model featuring 32 billion parameters, designed to deliver performance comparable to leading models like OpenAI’s GPT series. It supports multilingual and multimodal chat capabilities with an extensive 32K token context length, making it ideal for dialogue, reasoning, and complex task completion. The model is pre-trained on 15 trillion tokens of high-quality data, including substantial synthetic reasoning datasets, and further enhanced...
    Downloads: 3 This Week
  • 9
    Bandicoot

    fast C++ library for GPU linear algebra & scientific computing

    * Fast GPU linear algebra library (matrix maths) for the C++ language, aiming towards a good balance between speed and ease of use
    * Provides high-level syntax and functionality deliberately similar to Matlab
    * Provides an API that is aiming to be compatible with Armadillo for easy transition between CPU and GPU linear algebra code
    * Useful for algorithm development directly in C++, or quick conversion of research code into production environments
    * Distributed under the permissive Apache 2.0 license, useful for both open-source and proprietary (closed-source) software
    * Can be used for machine learning, pattern recognition, computer vision, signal processing, bioinformatics, statistics, finance, etc
    * Downloads: http://coot.sourceforge.io/download.html
    * Documentation: http://coot.sourceforge.io/docs.html
    * Bug reports: http://coot.sourceforge.io/faq.html
    * Git repo: https://gitlab.com/conradsnicta/bandicoot-code
    Downloads: 5 This Week
  • 10
    AnyTrading

    The most simple, flexible, and comprehensive OpenAI Gym trading

    gym-anytrading is an OpenAI Gym-compatible environment designed for developing and testing reinforcement learning algorithms on trading strategies. It simulates trading environments for financial markets, including stocks and forex.
    Downloads: 0 This Week
  • 11
    PARL

    A high-performance distributed training framework

    PARL is a scalable reinforcement learning framework built on top of PaddlePaddle. It focuses on modularity and ease of use, supporting distributed training and a variety of RL algorithms.
    Downloads: 0 This Week
  • 12
    MTCNN Face Detection Alignment

    Joint Face Detection and Alignment

    MTCNN_face_detection_alignment is an implementation of the “Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks” algorithm. The algorithm uses a cascade of three convolutional networks (P-Net, R-Net, O-Net) to jointly detect faces (bounding boxes) and align facial landmarks in a coarse-to-fine manner, leveraging multi-task learning. Non-maximum suppression and bounding box regression are applied at each stage. The repository includes Caffe / MATLAB code, support scripts, and instructions for dependencies. ...
    Downloads: 0 This Week
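    The non-maximum suppression step mentioned in the MTCNN entry can be illustrated with a minimal greedy sketch in plain Python. This is not the repository's Caffe / MATLAB code: the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box layout, and the 0.5 overlap threshold are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score, keep a box only if it
    does not overlap an already kept box by more than iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: the first two overlap heavily, the third is separate.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # indices of the surviving boxes
```

    Here the second box is dropped because its overlap with the higher-scoring first box exceeds the threshold, while the third box is kept.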
  • 13
    Supervised Reptile

    Code for the paper "On First-Order Meta-Learning Algorithms"

    The supervised-reptile repository contains code associated with the paper “On First-Order Meta-Learning Algorithms”, which introduces Reptile, a meta-learning algorithm for learning model parameter initializations that adapt quickly to new tasks. The implementation here is aimed at supervised few-shot learning settings (e.g. Omniglot, Mini-ImageNet), not reinforcement learning, and includes scripts to run training and evaluation for few-shot classification. ...
    Downloads: 0 This Week
  • 14
    deep-q-learning

    Minimal Deep Q Learning (DQN & DDQN) implementations in Keras

    The deep-q-learning repository authored by keon provides a Python-based implementation of the Deep Q-Learning algorithm — a cornerstone method in reinforcement learning. It implements the core logic needed to train an agent using Q-learning with neural networks (i.e. approximating Q-values via deep nets), setting up environment interaction loops, experience replay, network updates, and policy behavior.
    Downloads: 1 This Week
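    The Q-learning update that the deep-q-learning entry builds on can be sketched in tabular form. This is an illustrative simplification, not the repository's Keras code: a deep Q-network replaces the table with a neural network trained toward the same target. The names `td_target` and `q_update`, the two-action setup, and the values of `alpha` and `gamma` are assumptions.

```python
from collections import defaultdict


def td_target(reward, next_q_values, done, gamma=0.99):
    """Q-learning target: r + gamma * max_a' Q(s', a'), or just r at episode end."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def q_update(q_table, state, action, reward, next_state, done,
             alpha=0.1, gamma=0.99):
    """Move Q(state, action) a step of size alpha toward the TD target."""
    target = td_target(reward, q_table[next_state], done, gamma)
    q_table[state][action] += alpha * (target - q_table[state][action])
    return q_table[state][action]


# Hypothetical setup: two actions per state, all values start at zero.
q_table = defaultdict(lambda: [0.0, 0.0])
q_update(q_table, state=0, action=1, reward=1.0, next_state=1, done=False)
```

    With an experience replay buffer, the same update would be applied to transitions sampled from stored history rather than only to the most recent one.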
  • 15
    RecNN

    Reinforced Recommendation toolkit built around pytorch 1.7

    This is my school project. It focuses on Reinforcement Learning for personalized news recommendation. The main distinction is that it tries to solve online off-policy learning with dynamically generated item embeddings. I want to create a library with SOTA algorithms for reinforcement learning recommendation, providing the level of abstraction you like.
    Downloads: 0 This Week
  • 16
    CCZero (中国象棋Zero)

    Implement AlphaZero/AlphaGo Zero methods on Chinese chess

    ChineseChess-AlphaZero is a project that implements the AlphaZero algorithm for the game of Chinese Chess (Xiangqi). It adapts DeepMind’s AlphaZero method—combining neural networks and Monte Carlo Tree Search (MCTS)—to learn and play Chinese Chess without prior human data. The system includes self-play, training, and evaluation pipelines tailored to Xiangqi's unique game mechanics.
    Downloads: 1 This Week
  • 17

    OpenDino

    Open Source Java platform for Optimization, DoE, and Learning.

    OpenDino is an open source Java platform for optimization, design of experiment and learning. It provides a graphical user interface (GUI) and a platform which simplifies integration of new algorithms as "Modules".
    Implemented Modules
    Evolutionary Algorithms:
    - CMA-ES
    - (1+1)-ES
    - Differential Evolution
    Deterministic optimization algorithm:
    - SIMPLEX
    Learning:
    - a simple Artificial Neural Net
    Optimization problems:
    - test functions
    - interface for executing other programs (solvers)
    - parallel execution of problems
    - distributed execution of problems via socket connection between computers
    Others:
    - data storage
    - data analyser and viewer
    Downloads: 0 This Week
  • 18
    MatlabFunc

    Matlab codes for feature learning

    MatlabFunc is a collection of MATLAB functions developed by the ZJULearning group to support various tasks in computer vision, machine learning, and numerical computation. The repository brings together a wide range of utility scripts, algorithms, and implementations that serve as building blocks for research and development. These functions cover areas such as matrix operations, optimization, data processing, and visualization, making them broadly applicable across different research...
    Downloads: 1 This Week
  • 19
    The Teachingbox uses advanced machine learning techniques to relieve developers from programming hand-crafted, sophisticated behaviors of autonomous agents (such as robots, game players, etc.). In its current state, we have implemented a well-founded reinforcement learning core in Java with many popular use cases, environments, policies, and learners. Obtaining the teachingbox: FOR USERS: If you want to download the latest releases, please visit:...
    Downloads: 0 This Week
  • 20
    An open source optical flow algorithm framework for scientists and engineers alike.
    Downloads: 0 This Week
  • 21

    GA-EoC

    GeneticAlgorithm-based search for Heterogeneous Ensemble Combinations

    In data classification, no particular classifier performs consistently in every case. This is even worse for datasets that are both high-dimensional and class-imbalanced. To overcome the limitations of class-imbalanced data, we split the dataset using random sub-sampling to balance it. Then, we apply the (alpha,beta)-k feature set method to select a better subset of features and combine their outputs to get a consolidated feature set for classifier training. To...
    Downloads: 0 This Week
  • 22

    DE-HEoC

    DE-based Weight Optimisation for Heterogeneous Ensemble

    We propose the use of the Differential Evolution algorithm to adjust the weights of base classifiers in a weighted-voting heterogeneous ensemble of classifiers. The average Matthews Correlation Coefficient (MCC) score, calculated over 10-fold cross-validation, is used as the measure of ensemble quality. The DE/rand/1/bin algorithm is used to maximize the average MCC score calculated using 10-fold cross-validation on the training dataset. The voting weights of base classifiers are...
    Downloads: 0 This Week
  • 23
    ExSTraCS

    Extended Supervised Tracking and Classifying System

    This advanced machine learning algorithm is a Michigan-style learning classifier system (LCS) developed to specialize in classification, prediction, data mining, and knowledge discovery tasks. Michigan-style LCS algorithms constitute a unique class of algorithms that distribute learned patterns over a collaborative population of individually interpretable IF:THEN rules, allowing them to flexibly and effectively describe complex and diverse problem spaces. ...
    Downloads: 0 This Week
  • 24

    JAABA

    The Janelia Automated Animal Behavior Annotator

    ...JAABA uses machine learning techniques to convert these manual labels into behavior detectors that can then be used to automatically classify the behaviors of animals in large data sets with high throughput. JAABA combines an intuitive graphical user interface, a fast and powerful machine learning algorithm, and visualizations of the classifier into an interactive, usable system for creating automatic behavior detectors.
    Downloads: 24 This Week
  • 25

    LightSpMV

    lightweight GPU-based sparse matrix-vector multiplication (SpMV)

    LightSpMV is a novel CUDA-compatible sparse matrix-vector multiplication (SpMV) algorithm using the standard compressed sparse row (CSR) storage format. We have evaluated LightSpMV using various sparse matrices and further compared it to the CSR-based SpMV subprograms in the state-of-the-art CUSP and cuSPARSE. Performance evaluation reveals that on a single Tesla K40c GPU, LightSpMV is superior to both CUSP and cuSPARSE, with a speedup of up to 2.60 and 2.63 over CUSP, and up to 1.93 and...
    Downloads: 0 This Week
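    For readers unfamiliar with the CSR format named in the LightSpMV entry, a sequential reference version of sparse matrix-vector multiplication may help. This is only a sketch of what SpMV computes, not LightSpMV's CUDA kernel; the function name `csr_spmv` and the array names `values`, `col_idx`, and `row_ptr` are assumptions following common CSR conventions.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    values  : nonzero entries, stored row by row
    col_idx : column index of each entry in values
    row_ptr : row i owns entries row_ptr[i] .. row_ptr[i + 1] - 1
    """
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        for k in range(row_ptr[row], row_ptr[row + 1]):
            y[row] += values[k] * x[col_idx[k]]
    return y


# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
y = csr_spmv(values, col_idx, row_ptr, [1.0, 1.0, 1.0])
```

    Each row's dot product is independent of the others, which is the property GPU implementations exploit when they distribute rows across threads.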