RWKV-LM is the main research and training repository for the RWKV language model architecture. It presents RWKV as an attention-free, RNN-style model that aims to match transformer-level language modeling performance. The core idea is that the model can be trained in parallel across the sequence, like a GPT-style transformer, and then run at inference time as a recurrence. For long-context use this means lower memory pressure, because the model carries a fixed-size state forward instead of a key-value cache that grows with context length. The repository includes training code, model notes, research material, and references to current RWKV weights. Its main value is as a foundation for experimenting with efficient large language models that combine transformer-like scalability with RNN-like runtime behavior.

Features

  • Attention-free RWKV architecture
  • Parallelizable GPT-style training
  • RNN-style efficient inference
  • Linear-time sequence processing
  • Constant-space inference behavior
  • Training code and model research notes
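
The pairing of parallelizable training with constant-space recurrent inference can be illustrated with a toy example. The sketch below is not the repository's kernel or API: it uses a simplified, single-channel decayed weighted average, and the function names `parallel_form` and `recurrent_form`, the scalar `decay`, and the sample inputs are all assumptions made for illustration. It only shows that the same outputs can be computed either from the full prefix at each position or one token at a time with a state of two numbers.

```python
import math

def parallel_form(keys, values, decay):
    """Compute each output independently from its full prefix (training-style view)."""
    outputs = []
    for t in range(len(keys)):
        num = 0.0
        den = 0.0
        for i in range(t + 1):
            # Older tokens are down-weighted by decay ** (distance to position t).
            weight = (decay ** (t - i)) * math.exp(keys[i])
            num += weight * values[i]
            den += weight
        outputs.append(num / den)
    return outputs

def recurrent_form(keys, values, decay):
    """Compute the same outputs token by token, keeping only two running sums."""
    num = 0.0  # decayed running sum of exp(k) * v
    den = 0.0  # decayed running sum of exp(k)
    outputs = []
    for k, v in zip(keys, values):
        num = decay * num + math.exp(k) * v
        den = decay * den + math.exp(k)
        outputs.append(num / den)
    return outputs

# Hypothetical sample inputs, chosen only to exercise both forms.
keys = [0.1, -0.3, 0.7, 0.2]
values = [1.0, 2.0, -1.0, 0.5]
par = parallel_form(keys, values, decay=0.9)
rec = recurrent_form(keys, values, decay=0.9)

# Both forms agree up to floating-point rounding.
assert all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12) for a, b in zip(par, rec))
```

In the recurrent form the state stays at two floats however long the sequence grows, which is the property the "constant-space inference" bullet refers to. In the first form no position depends on another position's output, so all positions could be evaluated at once. Actual RWKV versions use learned per-channel decay and additional terms that this sketch leaves out.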

License

Apache License V2.0

Additional Project Details

Operating Systems

Linux, Mac, Windows

Programming Language

Python

Related Categories

Python Deep Learning Frameworks

Registered

2 days ago