Docling is an open-source document processing toolkit built to prepare diverse content types for modern generative AI and data workflows. The project focuses on converting and parsing many document formats into a unified structured representation that downstream systems can easily consume. It supports advanced PDF understanding, including layout detection, table extraction, and reading order analysis, enabling high-fidelity document intelligence pipelines. Docling is designed to run efficiently on commodity hardware and can be used both as a Python API and a command-line tool. Its modular architecture allows developers to extend functionality and integrate specialized models for tasks such as OCR and audio transcription. Overall, Docling serves as a comprehensive preprocessing layer for AI applications that require reliable, structured access to complex document data.

Features

  • Multi-format document parsing
  • Advanced PDF layout understanding
  • Unified structured document representation
  • CLI and Python API interfaces
  • OCR and speech recognition support
  • Integrations with AI frameworks

Project Samples

Project Activity

See All Activity >

License

MIT License

Follow Docling

Docling Web Site

Other Useful Business Software
Go From AI Idea to AI App Fast Icon
Go From AI Idea to AI App Fast

One platform to build, fine-tune, and deploy ML models. No MLOps team required.

Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.
Try Free
Rate This Project
Login To Rate This Project

User Reviews

Be the first to post a review of Docling!

Additional Project Details

Operating Systems

Linux, Mac, Windows

Programming Language

Python

Related Categories

Python Artificial Intelligence Software

Registered

2026-03-02