A theoretical reconstruction of the Claude Mythos architecture
Open image model at the forefront of design
gpt-oss-120b and gpt-oss-20b are two open-weight language models
Block Diffusion for Ultra-Fast Speculative Decoding
Instructions for using the Realtime API on microcontrollers
RoBERTa for Chinese: a pre-trained RoBERTa model for Chinese text
Speculative-decoding accelerator for the 675B Mistral Large 3