bert-base-cased is a foundational transformer model pretrained on English text with two objectives: masked language modeling (MLM) and next sentence prediction (NSP). It is case-sensitive, treating "English" and "english" as distinct, which makes it suitable for tasks where casing matters. The model is a bidirectional transformer encoder: self-attention over the full sequence lets each token attend to both its left and right context. It was pretrained on BookCorpus and English Wikipedia, has roughly 109M parameters, and uses a WordPiece tokenizer with a 30,000-token vocabulary to produce rich contextual embeddings.

The model is primarily intended to be fine-tuned on downstream NLP tasks such as sequence classification, token labeling, or question answering. It can also be used out of the box for masked token prediction via Hugging Face's fill-mask pipeline. Although the training data can be characterized as fairly neutral, the model still inherits and reflects societal biases present in the corpus.
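As a minimal sketch of the out-of-the-box fill-mask usage mentioned above (the example sentence is illustrative), the pipeline loads bert-base-cased and returns the top candidates for the `[MASK]` token:

```python
from transformers import pipeline

# Load the fill-mask pipeline with bert-base-cased.
unmasker = pipeline("fill-mask", model="bert-base-cased")

# The tokenizer's mask token is [MASK]; the pipeline returns the
# highest-scoring replacement tokens with their probabilities.
for prediction in unmasker("The capital of France is [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))
```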
Features
- Trained with masked language modeling and next sentence prediction
- Case-sensitive (treats “Apple” and “apple” differently)
- Bidirectional encoder using transformer architecture
- Pretrained on BookCorpus and English Wikipedia
- WordPiece tokenizer with 30,000 vocabulary tokens
- Fine-tunable for classification, NER, QA, and other NLP tasks
- Hugging Face integration via PyTorch, TensorFlow, and JAX
- Outputs contextual embeddings for entire sequences or tokens
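The last two bullets can be combined into a short sketch, assuming the PyTorch backend and an illustrative input sentence: the encoder returns one contextual vector per token plus a pooled vector for the whole sequence.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModel.from_pretrained("bert-base-cased")

# Tokenize an example sentence and run it through the encoder.
inputs = tokenizer("Hello, my dog is cute.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state: one 768-dimensional vector per input token.
token_embeddings = outputs.last_hidden_state   # shape: (1, seq_len, 768)
# pooler_output: a single vector for the whole sequence (from [CLS]).
sequence_embedding = outputs.pooler_output     # shape: (1, 768)
print(token_embeddings.shape, sequence_embedding.shape)
```

For fine-tuning, the same checkpoint is typically loaded through a task-specific head (e.g. `AutoModelForSequenceClassification`) instead of the bare encoder shown here.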