t5-base is a pre-trained transformer model from Google’s T5 (Text-To-Text Transfer Transformer) family, which reframes every NLP task in a unified text-to-text format. With 220 million parameters, it handles a wide range of tasks, including translation, summarization, question answering, and classification. Unlike encoder-only models such as BERT, which output class labels or spans, T5 always generates text. It was trained on the C4 dataset alongside a variety of supervised NLP benchmarks, combining an unsupervised denoising objective with supervised objectives. The model covers several languages, including English, French, Romanian, and German. Its flexible encoder-decoder architecture and consistent input/output format simplify model reuse and transfer learning across NLP tasks. As documented in its research paper, t5-base achieves competitive performance on 24 language understanding tasks.
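To make the text-to-text format concrete, here is a minimal sketch using the Hugging Face Transformers API. The task prefix (`translate English to German:`) comes from the original T5 setup; the exact generated string may vary slightly by library version:

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load the pre-trained t5-base checkpoint from the Hugging Face Hub.
tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# T5 selects the task with a plain-text prefix; here, English-to-German translation.
inputs = tokenizer(
    "translate English to German: The house is wonderful.",
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# Typical output: "Das Haus ist wunderbar."
```

Because every task is phrased as text in, text out, swapping tasks only requires changing the prefix (e.g. `summarize:` or `cola sentence:`), not the model or decoding code.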
Features
- Unified text-to-text format for all NLP tasks
- Pretrained on the large-scale C4 dataset
- 220 million parameters in an encoder-decoder architecture
- Supports translation, summarization, QA, classification, and more
- Handles English, French, Romanian, and German
- Trained using both unsupervised and supervised learning
- Available in PyTorch, TensorFlow, and JAX
- Easily accessible via the Hugging Face Transformers library (see the usage sketch below)
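As a usage example, the same checkpoint can also be driven through the high-level `pipeline` API, which bundles tokenization, generation, and decoding. This is a sketch; the input text here is illustrative and the generated summary will vary:

```python
from transformers import pipeline

# text2text-generation wraps T5's seq2seq interface behind a single call.
t5 = pipeline("text2text-generation", model="t5-base")

# Summarization uses the "summarize:" prefix from the original T5 training setup.
text = (
    "summarize: The T5 framework casts every NLP problem as feeding text in and "
    "getting text out, so one model, one loss, and one decoding procedure can be "
    "reused across translation, summarization, question answering, and classification."
)
print(t5(text, max_new_tokens=60)[0]["generated_text"])
```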