| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| 2026.03.26 source code.tar.gz | 2026-03-27 | 36.2 MB | |
| 2026.03.26 source code.zip | 2026-03-27 | 36.6 MB | |
| README.md | 2026-03-27 | 4.1 kB | |
| Totals: 3 Items | | 72.7 MB | 1 |

This release focuses on developer experience and parsing stability. A new resource management framework lets application developers bundle and install required data files (such as AI models and OCR dictionaries) with a single line in their CMake configuration. The EML parser now handles complex nested MIME structures and malformed boundaries with deterministic content selection, which makes text extraction from emails more reliable.
One line to find, one line to stay,
The resources follow the SDK’s way.
The email stream, a clearer sight,
With boundaries tracked and sorted right.
A solid base, a robust frame,
DocWire answers every claim.
🐇✨📦📧🛡️
- Features
  - One-Command Resource Management: Introduced a robust CMake framework (`docwire_deploy_resources`, `docwire_install_resources`) that automates data file discovery and deployment. Developers can now bundle all necessary AI models and OCR dictionaries into their own applications with a single line in their build scripts.
  - Relocatable Package Support: Resource paths are now exported in the package configuration, enabling relocatable SDK installations and full support for custom `CMAKE_INSTALL_DATADIR` layouts.
  - Robust Resource Validation: CMake configuration now explicitly fails if required resource paths (like models or dictionaries) are missing, providing clear diagnostic errors early in the build process.
  - Custom Archive Definitions: Integrated an updated `libmagic` database (from upstream Nov 2024) to resolve regressions in ZIP and container detection, ensuring reliable file type identification without external patches.
  - Tiered Valgrind Suppression System: Implemented a granular suppression architecture with tool-specific files and a debug/release split, allowing for precise silencing of third-party false positives during deep thread-safety analysis.
- Improvements
  - EML Parsing Robustness: The email parser now features deterministic selection for `multipart/alternative` parts (prioritizing non-empty HTML content) and utilizes a new `BoundaryTracker` utility to correctly process complex nested structures and malformed boundaries.
  - Unnamed Attachment Support: The EML parser now handles attachments without filenames via `std::optional`, ensuring content is extracted and labeled correctly rather than being skipped.
  - Shared Library Resource Discovery: Resource resolution now employs a prioritized multi-candidate search strategy, favoring the library module path over the executable path. This ensures correct data file discovery when the SDK is used as a plugin in generic hosts like Python.
  - Modernized HTTP Integration: The HTTP module now uses `httplib` in CONFIG mode with native OpenSSL support, simplifying dependency management and improving build stability.
  - Sanitizer Stability: Introduced a fallback mechanism to disable asynchronous DNS resolution under ThreadSanitizer and Helgrind, bypassing known system-level stack-unwinding bugs on modern Linux distributions.
  - Plain Text Attachment Formatting: Refined the `PlainTextWriter` to provide cleaner spacing and clearer labeling for attachments in both PST and EML exports.
- Fixes
  - macOS Installation Pathing: Fixed installation errors on macOS by explicitly specifying `LIBRARY` and `FRAMEWORK` destinations for runtime dependencies.
  - Windows Runtime Dependency Filtering: Improved installation rules to robustly exclude OS-internal API sets and CI-specific system DLLs from the distribution package.
- Build / CI
  - SDK Installation Verification: Introduced a standalone test suite to verify package integrity independently of the build tree, ensuring both Debug and Release installations are fully functional for downstream users.
- Tests
  - Email Parsing Fixtures: Added a comprehensive set of EML test cases covering nested multiparts, folded boundaries, and unclosed inner boundaries to ensure long-term parsing stability.
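
The nested-multipart and unclosed-inner-boundary cases mentioned in these notes can be illustrated with a stack of active boundaries. The sketch below is not DocWire's `BoundaryTracker`; the class name `BoundaryStack`, its interface, and the recovery rule (a delimiter of an outer multipart implicitly ends any inner multiparts that were never closed) are assumptions chosen for illustration.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical illustration only; not the SDK's actual BoundaryTracker.
enum class DelimiterKind { None, PartStart, Close };

struct DelimiterMatch {
    DelimiterKind kind = DelimiterKind::None;
    std::size_t depth = 0;  // nesting depth of the matched boundary (1 = outermost)
};

class BoundaryStack {
public:
    // Enter a multipart entity whose Content-Type header declared this boundary.
    void push(std::string boundary) { stack_.push_back(std::move(boundary)); }

    std::size_t depth() const { return stack_.size(); }

    // Classify one body line. The search runs from the innermost boundary
    // outwards, so a delimiter belonging to an outer multipart also discards
    // any inner multiparts that never saw their closing delimiter.
    DelimiterMatch feed(std::string_view line) {
        // Tolerate CR/LF and trailing whitespace after the delimiter.
        while (!line.empty() && (line.back() == '\r' || line.back() == '\n' ||
                                 line.back() == ' ' || line.back() == '\t')) {
            line.remove_suffix(1);
        }
        if (line.size() < 2 || line.substr(0, 2) != "--") {
            return {};
        }
        const std::string_view rest = line.substr(2);
        for (std::size_t i = stack_.size(); i-- > 0;) {
            const std::string_view b(stack_[i]);
            if (rest == b) {
                stack_.resize(i + 1);  // drop unclosed inner multiparts
                return {DelimiterKind::PartStart, i + 1};
            }
            if (rest.size() == b.size() + 2 && rest.substr(0, b.size()) == b &&
                rest.substr(b.size()) == "--") {
                stack_.resize(i);  // close this multipart and everything inside it
                return {DelimiterKind::Close, i + 1};
            }
        }
        return {};  // ordinary content line
    }

private:
    std::vector<std::string> stack_;
};
```

A parser built this way would call `push` when it reads a multipart header and `feed` for every body line, using the returned depth to decide which part the following lines belong to.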
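
Deterministic selection among `multipart/alternative` parts, as described under Improvements, might look roughly like the following. The notes only state that non-empty HTML content is prioritized; the `AlternativePart` struct, the fallback to `text/plain`, and the whitespace-only definition of "empty" are assumptions made for this sketch.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical part representation; field names are illustrative.
struct AlternativePart {
    std::string content_type;  // lower-case media type without parameters
    std::string body;          // decoded text content
};

// Treats a body as empty when it holds nothing but whitespace (an assumption).
inline bool is_blank(const std::string& s) {
    return std::all_of(s.begin(), s.end(),
                       [](unsigned char c) { return std::isspace(c) != 0; });
}

// Returns the index of the part to extract, or std::nullopt when every
// candidate is blank. Within one media type the first part in document order
// wins, so the same message always yields the same choice.
inline std::optional<std::size_t> select_alternative(
    const std::vector<AlternativePart>& parts) {
    static const char* const preference[] = {"text/html", "text/plain"};
    for (const char* wanted : preference) {
        for (std::size_t i = 0; i < parts.size(); ++i) {
            if (parts[i].content_type == wanted && !is_blank(parts[i].body)) {
                return i;
            }
        }
    }
    return std::nullopt;
}
```

With this ordering, a message whose HTML alternative is blank falls back to its plain-text alternative instead of producing empty output.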
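
The prioritized multi-candidate search from the Shared Library Resource Discovery item can be sketched with `std::filesystem`. The function names and the two-candidate list below are hypothetical, and obtaining the module and executable directories is platform-specific, so this sketch takes them as parameters.

```cpp
#include <filesystem>
#include <optional>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Builds the candidate list in priority order: the directory of the loaded
// library module first, the directory of the host executable second.
// Both directories are supplied by the caller (an assumption of this sketch).
inline std::vector<fs::path> resource_candidates(const fs::path& module_dir,
                                                 const fs::path& executable_dir,
                                                 const fs::path& relative_data_dir) {
    return {module_dir / relative_data_dir, executable_dir / relative_data_dir};
}

// Returns the first candidate that exists as a directory, or std::nullopt.
// The error_code overload is used so that a missing or unreadable path is
// treated as "not found" rather than throwing.
inline std::optional<fs::path> first_existing_dir(
    const std::vector<fs::path>& candidates) {
    for (const fs::path& dir : candidates) {
        std::error_code ec;
        if (fs::is_directory(dir, ec)) {
            return dir;
        }
    }
    return std::nullopt;
}
```

Placing the module-relative path first matters when the host executable is a generic interpreter such as Python, whose own directory is unlikely to contain the SDK's data files.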