table-transformer-detection is a DETR-based object detection model fine-tuned by Microsoft for detecting tables in document images. Trained on the PubTables1M dataset, it excels at locating tabular structures in unstructured documents such as PDFs. The model uses the "normalize before" variant of DETR, applying layer normalization before the self- and cross-attention layers. With 28.8 million parameters, it performs end-to-end table detection without requiring handcrafted features, making it particularly useful in document understanding pipelines where precise table extraction is critical. The model card was written by the Hugging Face team, while the original authors released the paper and training setup. The model is implemented in PyTorch and stored in the Safetensors format for safe and efficient loading.
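The snippet below is a minimal inference sketch, not taken from the model card: it assumes the `transformers`, `torch`, and `Pillow` packages are installed, and that "page.png" is a document image you supply yourself.

```python
import torch
from PIL import Image
from transformers import DetrImageProcessor, TableTransformerForObjectDetection

# Load a default DETR image processor and the fine-tuned table detection model.
processor = DetrImageProcessor()
model = TableTransformerForObjectDetection.from_pretrained(
    "microsoft/table-transformer-detection"
)

image = Image.open("page.png").convert("RGB")  # hypothetical input image
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Turn raw logits and normalized boxes into thresholded detections
# in absolute image coordinates.
target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
detections = processor.post_process_object_detection(
    outputs, threshold=0.9, target_sizes=target_sizes
)[0]

for score, label, box in zip(
    detections["scores"], detections["labels"], detections["boxes"]
):
    print(f"{model.config.id2label[label.item()]}: {score:.2f} at {box.tolist()}")
```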
Features
- Transformer-based architecture (DETR variant)
- Fine-tuned on the PubTables1M dataset
- Specialized for detecting tables in documents
- Uses "normalize before" layernorm strategy
- 28.8 million parameters
- Implemented in PyTorch
- Safe model storage with Safetensors
- Compatible with Hugging Face Transformers library
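As a quick sanity check of the figures listed above, the checkpoint can be loaded through the Transformers library (which pulls the Safetensors weights) and the parameter count summed directly. This is a small sketch, not part of the official model card.

```python
from transformers import TableTransformerForObjectDetection

# Loading via Transformers fetches the Safetensors checkpoint.
model = TableTransformerForObjectDetection.from_pretrained(
    "microsoft/table-transformer-detection"
)
num_params = sum(p.numel() for p in model.parameters())
print(f"{num_params / 1e6:.1f}M parameters")  # expected to be roughly 28.8M
```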