A new model is added to transformers: ModernBERT Decoder.
It is added on top of the v4.53.2 release, and can be installed from the following tag: `v4.53.2-modernbert-decoder-preview`.
In order to install this version, please install with the following command:
:::bash
pip install git+https://github.com/huggingface/transformers@v4.53.2-modernbert-decoder-preview
If fixes are needed, they will be applied to this release; this installation may therefore be considered stable and improving.
As the tag implies, this is a preview of the ModernBERT Decoder model. It is a tagged version of the `main` branch and does not follow semantic versioning. This model will be included in the next minor release: v4.54.0.
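Because the tag is cut from the `main` branch, the installed package typically reports a development version string rather than the plain PyPI release. A minimal check (the exact version string may vary):
:::py
import transformers

# Installs from the preview tag usually report the in-development version of the
# main branch (e.g. with a ".dev0" suffix) rather than the plain "4.53.2" release.
print(transformers.__version__)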
ModernBERT Decoder
ModernBERT Decoder uses the same architecture as ModernBERT but is trained from scratch with a causal language modeling (CLM) objective, which makes it possible to compare encoders and decoders that share a common architecture. It is the decoder implementation of ModernBERT, designed for autoregressive text generation tasks.
Like the encoder version, ModernBERT Decoder incorporates modern architectural improvements such as rotary positional embeddings to support sequences of up to 8192 tokens, unpadding to avoid wasting compute on padding tokens, GeGLU layers, and alternating attention patterns. However, it uses causal (unidirectional) attention to enable autoregressive generation.
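As a quick sanity check, these settings can be read off a checkpoint's configuration. The sketch below assumes the decoder config mirrors the ModernBERT encoder attribute names (`max_position_embeddings`, `global_attn_every_n_layers`, `local_attention`); `getattr` is used so it degrades gracefully if a name differs.
:::py
from transformers import AutoConfig

config = AutoConfig.from_pretrained("blab-jhu/test-32m-dec")

# Maximum supported sequence length (8192 tokens via rotary positional embeddings).
print("context length:", getattr(config, "max_position_embeddings", None))

# Alternating local/global attention; attribute names assumed from the encoder config.
print("global attention every n layers:", getattr(config, "global_attn_every_n_layers", None))
print("local attention window:", getattr(config, "local_attention", None))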
Usage example
ModernBERT Decoder checkpoints can be found on the Hugging Face Hub.
Using `pipeline`:
:::py
import torch
from transformers import pipeline

generator = pipeline(
    task="text-generation",
    model="blab-jhu/test-32m-dec",
    torch_dtype=torch.float16,
    device=0,
)
generator("The future of artificial intelligence is", max_length=50, num_return_sequences=1)

# For sequence classification
classifier = pipeline(
    task="text-classification",
    model="blab-jhu/test-32m-dec",
    torch_dtype=torch.float16,
    device=0,
)
classifier("This movie is really great!")
Using `AutoModel`:
:::py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("blab-jhu/test-32m-dec")
model = AutoModelForCausalLM.from_pretrained(
    "blab-jhu/test-32m-dec",
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "The future of artificial intelligence is"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_length=50,
        num_return_sequences=1,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )

generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(f"Generated text: {generated_text}")

# For sequence classification
from transformers import AutoModelForSequenceClassification

classifier_model = AutoModelForSequenceClassification.from_pretrained(
    "blab-jhu/test-32m-dec",
    torch_dtype=torch.float16,
    device_map="auto",
    num_labels=2,
)

text = "This movie is really great!"
inputs = tokenizer(text, return_tensors="pt").to("cuda")

with torch.no_grad():
    outputs = classifier_model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(predictions, dim=-1)

print(f"Predicted class: {predicted_class.item()}")
print(f"Prediction probabilities: {predictions}")
Using the `transformers` CLI:
:::bash
echo "The future of artificial intelligence is" | transformers run --task text-generation --model your-username/modernbert-decoder-base --device 0