Docling is an open-source document processing toolkit that prepares documents for generative AI and data workflows. It parses many document formats and converts them into a single structured representation that downstream systems can consume. Its PDF handling covers layout detection, table extraction, and reading-order analysis. Docling runs on commodity hardware and is available both as a Python API and as a command-line tool. Its modular architecture lets developers extend it and plug in specialized models for tasks such as OCR and audio transcription. Together, these capabilities make Docling a preprocessing layer for AI applications that need reliable, structured access to complex document data.

Features

  • Multi-format document parsing
  • Advanced PDF layout understanding
  • Unified structured document representation
  • CLI and Python API interfaces
  • OCR and speech recognition support
  • Integrations with AI frameworks
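
As an illustration of the Python API interface listed above, here is a minimal sketch. It assumes the `docling` package is installed (for example with `pip install docling`) and uses the `DocumentConverter` class from Docling's documented quickstart. The helper name `convert_to_markdown` is hypothetical and introduced only for this example.

```python
from pathlib import Path
from typing import Union


def convert_to_markdown(source: Union[str, Path]) -> str:
    """Convert a single document to Markdown text with Docling.

    `source` is assumed to be a local file path or a URL.
    """
    # Imported inside the function so this module still loads
    # on machines where docling is not installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()          # default conversion settings
    result = converter.convert(str(source))  # parse into the unified document model
    return result.document.export_to_markdown()
```

Calling `convert_to_markdown("report.pdf")` (a hypothetical file name) would return the parsed document as a Markdown string. The first conversion may take a while because Docling typically downloads its models on first use.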

License

MIT License

Additional Project Details

Operating Systems

Linux, Mac, Windows

Programming Language

Python

Related Categories

Python Artificial Intelligence Software

Registered

2026-03-02