DeepSpec is a full-stack codebase for training and evaluating draft models used in speculative decoding. It covers three stages: data preparation, draft model training, and evaluation against target models. Data preparation includes prompt download, target answer regeneration, and target cache construction. Training is driven by configuration files that select the algorithm and the target model setup. The evaluation pipeline measures speculative decoding performance, including acceptance behavior, across benchmark tasks such as math, coding, instruction-following, and chat-style datasets. It is aimed at researchers and engineers studying faster language model inference through speculative decoding.
Features
- Full-stack speculative decoding research codebase
- Data preparation utilities for target model outputs
- Draft model training scripts and configurations
- Evaluation scripts for speculative decoding benchmarks
- Released checkpoints for Eagle3, DFlash, and DSpark variants
- Support for Qwen and Gemma target model experiments
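
To make "acceptance behavior" concrete, here is a minimal sketch of one common way to score a draft model under greedy verification. It does not use DeepSpec's own scripts or APIs, and the function names `accepted_prefix_length` and `mean_acceptance_length` are hypothetical. It assumes that each verification step yields the draft's proposed token IDs and the target model's greedy token IDs for the same positions, and that the target always contributes one extra token per step. DeepSpec's actual metric definitions may differ.

```python
from typing import Sequence, Tuple


def accepted_prefix_length(draft_tokens: Sequence[int],
                           target_tokens: Sequence[int]) -> int:
    """Count the leading draft tokens that match the target's greedy tokens."""
    accepted = 0
    for draft_tok, target_tok in zip(draft_tokens, target_tokens):
        if draft_tok != target_tok:
            break  # the first mismatch rejects this token and everything after it
        accepted += 1
    return accepted


def mean_acceptance_length(
    steps: Sequence[Tuple[Sequence[int], Sequence[int]]]
) -> float:
    """Average tokens emitted per verification step.

    Assumption: each step emits the accepted draft prefix plus one token
    supplied by the target model (a correction or a bonus token).
    """
    if not steps:
        return 0.0
    total = sum(accepted_prefix_length(d, t) + 1 for d, t in steps)
    return total / len(steps)


if __name__ == "__main__":
    # Toy token IDs for illustration only; not real benchmark data.
    toy_steps = [
        ([1, 2, 3], [1, 2, 9, 4]),  # two draft tokens accepted
        ([5, 6], [7, 8, 9]),        # none accepted
        ([4, 4], [4, 4, 4]),        # both accepted
    ]
    print(f"{mean_acceptance_length(toy_steps):.2f}")
```

A higher mean acceptance length means fewer target model forward passes per generated token, which is why this kind of metric is a natural one to compare across draft algorithms and target models.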