Showing 88 open source projects for "python q learning"

  • 1
    Best-of Machine Learning with Python

    A ranked list of awesome machine learning Python libraries

    This curated list contains 900 awesome open-source projects with a total of 3.3M stars grouped into 34 categories. All projects are ranked by a project-quality score, which is calculated from various metrics automatically collected from GitHub and different package managers. If you would like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome!
    Downloads: 1 This Week
  • 2
    Transformer Reinforcement Learning X

    A repo for distributed training of language models with Reinforcement

    trlX is a distributed training framework designed from the ground up to focus on fine-tuning large language models with reinforcement learning using either a provided reward function or a reward-labeled dataset. Training support for Hugging Face models is provided by Accelerate-backed trainers, allowing users to fine-tune causal and T5-based language models of up to 20B parameters, such as facebook/opt-6.7b, EleutherAI/gpt-neox-20b, and google/flan-t5-xxl. For models beyond 20B parameters, trlX...
    Downloads: 0 This Week
  • 3
    TorchRL

    A modular, primitive-first, python-first PyTorch library

    TorchRL is an open-source Reinforcement Learning (RL) library for PyTorch. TorchRL provides PyTorch and python-first, low and high-level abstractions for RL that are intended to be efficient, modular, documented, and properly tested. The code is aimed at supporting research in RL. Most of it is written in Python in a highly modular way, such that researchers can easily swap components, transform them, or write new ones with little effort.
    Downloads: 65 This Week
  • 4
    DeepSeek R1

    Open-source, high-performance AI model with advanced reasoning

    ... integrates large-scale reinforcement learning (RL) without relying on supervised fine-tuning, enabling the model to develop advanced reasoning capabilities. This approach has resulted in performance comparable to leading models like OpenAI's o1, while maintaining cost-efficiency. To further support the research community, DeepSeek has released distilled versions of the model based on architectures such as LLaMA and Qwen.
    Downloads: 46 This Week
  • 5
    DeepSeek-V3

    Powerful AI language model (MoE) optimized for efficiency/performance

    ... supervised fine-tuning and reinforcement learning to fully realize its capabilities. Evaluations indicate that it outperforms other open-source models and rivals leading closed-source models, achieving this with a training duration of 55 days on 2,048 Nvidia H800 GPUs, costing approximately $5.58 million.
    Downloads: 44 This Week
  • 6
    AgentUniverse

    agentUniverse is an LLM multi-agent framework

    AgentUniverse is a multi-agent AI framework that enables coordination between multiple intelligent agents for complex task execution and automation.
    Downloads: 16 This Week
  • 7
    Tensorforce

    A TensorFlow library for applied reinforcement learning

    Tensorforce is an open-source deep reinforcement learning framework built on TensorFlow, emphasizing modularized design and straightforward usability for applied research and practice.
    Downloads: 10 This Week
  • 8
    H2O LLM Studio

    Framework and no-code GUI for fine-tuning LLMs

    Welcome to H2O LLM Studio, a framework and no-code GUI designed for fine-tuning state-of-the-art large language models (LLMs). You can also use H2O LLM Studio with the command line interface (CLI) and specify the configuration file that contains all the experiment parameters. To fine-tune using H2O LLM Studio with the CLI, activate the pipenv environment by running make shell. With H2O LLM Studio, training your large language model is easy and intuitive. First, upload your dataset and then start...
    Downloads: 17 This Week
  • 9
    MedicalGPT

    MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training

    MedicalGPT trains medical GPT models with a ChatGPT-style training pipeline that covers secondary pre-training, supervised fine-tuning, reward modeling, and reinforcement learning.
    Downloads: 12 This Week
  • 10
    Brax

    Massively parallel rigidbody physics simulation

    Brax is a fast and fully differentiable physics engine for large-scale rigid body simulations, built on JAX. It is designed for research in reinforcement learning and robotics, enabling efficient simulations and gradient-based optimization.
    Downloads: 10 This Week
  • 11
    Agent S2

    Agent S: an open agentic framework that uses computers like a human

    Simular's Agent S2 represents a leap forward in the development of computer-use agents, capable of autonomously interacting with a range of devices and interfaces. By integrating specialized AI models, Agent S2 delivers state-of-the-art performance, whether on desktop systems or smartphones. Through modular architecture, it efficiently handles complex tasks, such as navigating UIs, performing low-level actions like text selection, and executing high-level strategies like planning....
    Downloads: 16 This Week
  • 12
    Weights and Biases

    Tool for visualizing and tracking your machine learning experiments

    Use W&B to build better models faster. Track and visualize all the pieces of your machine learning pipeline, from datasets to production models. Quickly identify model regressions. Use W&B to visualize results in real time, all in a central dashboard. Focus on the interesting ML. Spend less time manually tracking results in spreadsheets and text files. Capture dataset versions with W&B Artifacts to identify how changing data affects your resulting models. Reproduce any model, with saved code...
    Downloads: 12 This Week
  • 13
    Ray

    A unified framework for scalable computing

    ... model and reduce training costs by using the latest optimization algorithms. Deploy your machine learning models at scale with Ray Serve, a Python-first and framework agnostic model serving framework. Scale reinforcement learning (RL) with RLlib, a framework-agnostic RL library that ships with 30+ cutting-edge RL algorithms including A3C, DQN, and PPO. Easily build out scalable, distributed systems in Python with simple and composable primitives in Ray Core.
    Downloads: 6 This Week
  • 14
    Habitat-Lab

    A modular high-level library to train embodied AI agents

    ... and instantiating a diverse set of embodied agents, including commercial robots and humanoids, specifying their sensors and capabilities. Providing algorithms for single and multi-agent training (via imitation or reinforcement learning, or no learning at all as in SensePlanAct pipelines), as well as tools to benchmark their performance on the defined tasks using standard metrics.
    Downloads: 8 This Week
  • 15
    AndroidEnv

    RL research on Android devices

    android_env is a reinforcement learning (RL) environment developed by Google DeepMind that enables agents to interact with Android applications directly as a learning environment. It provides a standardized API for training agents to perform tasks on Android apps, supporting tasks ranging from games to productivity apps, making it suitable for research in real-world RL settings.
    Downloads: 6 This Week
  • 16
    AnyTrading

    The most simple, flexible, and comprehensive OpenAI Gym trading

    gym-anytrading is an OpenAI Gym-compatible environment designed for developing and testing reinforcement learning algorithms on trading strategies. It simulates trading environments for financial markets, including stocks and forex.
    Downloads: 7 This Week
  • 17
    PyBoy

    Game Boy emulator written in Python

    It is highly recommended to read the report to get a light introduction to Game Boy emulation, but be aware that the Python implementation has changed a lot. The report is relevant even if you want to contribute to another emulator or create your own. If you are looking to make a bot or AI, you can find all the external components in the PyBoy Documentation. There is also a short example on our Wiki page Scripts, AI and Bots as well as in the examples directory. If more features...
    Downloads: 6 This Week
  • 18
    Multi-Agent Orchestrator

    Flexible and powerful framework for managing multiple AI agents

    Multi-Agent Orchestrator is an AI coordination framework that enables multiple intelligent agents to work together to complete complex, multi-step workflows.
    Downloads: 5 This Week
  • 19
    dm_control

    DeepMind's software stack for physics-based simulation

    DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo physics. The MuJoCo Python bindings support three different OpenGL rendering backends: EGL (headless, hardware-accelerated), GLFW (windowed, hardware-accelerated), and OSMesa (purely software-based). At least one of these three backends must be available in order to render through...
    Downloads: 5 This Week
  • 20
    PaLM + RLHF - Pytorch

    Implementation of RLHF (Reinforcement Learning with Human Feedback)

    PaLM-rlhf-pytorch is a PyTorch implementation of Pathways Language Model (PaLM) with Reinforcement Learning from Human Feedback (RLHF). It is designed for fine-tuning large-scale language models with human preference alignment, similar to OpenAI’s approach for training models like ChatGPT.
    Downloads: 6 This Week
  • 21
    TextWorld

    TextWorld is a sandbox learning environment for the training

    TextWorld is a learning environment designed to train reinforcement learning agents to play text-based games, where actions and observations are entirely in natural language. Developed by Microsoft Research, TextWorld focuses on language understanding, planning, and interaction in complex, narrative-driven environments. It generates games procedurally, enabling scalable testing of agents’ natural language processing and decision-making abilities.
    Downloads: 5 This Week
  • 22
    VectorizedMultiAgentSimulator (VMAS)

    VMAS is a vectorized differentiable simulator

    VectorizedMultiAgentSimulator is a high-performance, vectorized simulator for multi-agent systems, focusing on large-scale agent interactions in shared environments. It is designed for research in multi-agent reinforcement learning, robotics, and autonomous systems where thousands of agents need to be simulated efficiently.
    Downloads: 6 This Week
  • 23
    RWARE

    A multi-agent reinforcement learning environment

    robotic-warehouse is a simulation environment and framework for robotic warehouse automation, enabling research and development of AI and robotic agents to manage warehouse logistics, such as item picking and transport.
    Downloads: 3 This Week
  • 24
    RL Games

    RL implementations

    rl_games is a high-performance reinforcement learning framework optimized for GPU-based training, particularly in environments like robotics and continuous control tasks. It supports advanced algorithms and is built with PyTorch.
    Downloads: 2 This Week
  • 25
    Jittor

    Jittor is a high-performance deep learning framework

    ... learning, etc. The front-end language is Python. Module design and dynamic graph execution are used in the front end, which is the most popular design for deep learning framework interfaces. The back end is implemented in high-performance languages such as CUDA and C++. Jittor's ops are similar to NumPy's. Let's try some operations. We create Var a and b via operation jt.float32, and add them. Printing those variables shows they have the same shape and dtype.
    Downloads: 1 This Week