BLEURT-20-D12 is a PyTorch implementation of BLEURT, a learned metric that scores how closely a candidate sentence matches a reference sentence in meaning. It serves as an automatic evaluation metric for natural language generation tasks such as summarization and translation. For each reference and candidate pair, the model predicts a single real-valued score, with higher scores indicating greater semantic overlap. Unlike the original BLEURT models, which are implemented in TensorFlow, this version is built on a custom PyTorch transformer library, and that model-specific library must be installed from GitHub before the model can be loaded. Once set up, similarity scores can be computed with only a few lines of code, which makes it easier to evaluate language generation outputs inside PyTorch-based workflows.
Features
- PyTorch-based implementation of BLEURT
- Evaluates sentence-level semantic similarity
- Accepts reference and candidate sentence pairs
- Outputs real-valued similarity scores
- Uses BleurtForSequenceClassification class
- Lightweight custom tokenizer and config classes
- Compatible with Hugging Face Transformers interface
- Suitable for translation and summarization evaluation
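
The following is a minimal sketch of how scoring reference and candidate pairs might look, not a definitive implementation. Several names are assumptions that do not appear in the text above: the import path `bleurt_pytorch` (taken to be the package installed from the lucadiliello/bleurt-pytorch GitHub repository), the `BleurtTokenizer` class name, and the Hub identifier `lucadiliello/BLEURT-20-D12`. The helpers `load_bleurt` and `score_pairs` are hypothetical names introduced only for illustration.

```python
from typing import List, Sequence

# Assumed model identifier; adjust it to wherever the checkpoint actually lives.
MODEL_ID = "lucadiliello/BLEURT-20-D12"


def load_bleurt(model_id: str = MODEL_ID):
    """Load the model and tokenizer (hypothetical helper).

    Assumes the model-specific library is importable as `bleurt_pytorch`.
    The import is done lazily so that `score_pairs` below can be used and
    tested without the package being installed.
    """
    from bleurt_pytorch import BleurtForSequenceClassification, BleurtTokenizer

    model = BleurtForSequenceClassification.from_pretrained(model_id)
    tokenizer = BleurtTokenizer.from_pretrained(model_id)
    model.eval()                 # switch off dropout for deterministic scores
    model.requires_grad_(False)  # inference only, so no gradients are needed
    return model, tokenizer


def score_pairs(
    model,
    tokenizer,
    references: Sequence[str],
    candidates: Sequence[str],
) -> List[float]:
    """Return one similarity score per (reference, candidate) pair."""
    if len(references) != len(candidates):
        raise ValueError("references and candidates must have the same length")

    # The tokenizer encodes each reference together with its candidate,
    # padding every pair in the batch to the longest one.
    inputs = tokenizer(
        list(references),
        list(candidates),
        padding="longest",
        return_tensors="pt",
    )
    # The model emits one logit per pair; flatten to a plain list of floats.
    return model(**inputs).logits.flatten().tolist()
```

With the library installed, calling `model, tokenizer = load_bleurt()` and then `score_pairs(model, tokenizer, ["a bird chirps by the window"], ["a bird sings near the window"])` should return a list with one float, where a higher value indicates a closer match. Keeping the loading step separate from the scoring step means the model is loaded once and reused across batches.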