0.22.16
Enhancements
- Formula markdown export (`element_to_md`/`elements_to_md`): New keyword-only `formula_markdown_style` (`"auto"`, `"display_math"`, `"plain"`; default `"auto"`). In `"auto"`, display math (`$$ ... $$`) is used only when the text looks like notation (heuristic score) and contains no `$`/`$$` (avoids breaking Markdown and noisy OCR captions). `"display_math"` wraps whenever safe (still falls back to plain if `$` would corrupt fences). `"plain"` emits text only. Optional `normalize_formula` (default `True`) maps common Unicode operators to LaTeX-like tokens; `normalize_formula` stays before the keyword-only options, so callers that pass `encoding`/`no_group_by_page` positionally are unchanged. Unicode `√` is never mapped to `\sqrt{}`. Module constants: `FORMULA_MARKDOWN_AUTO`, `FORMULA_MARKDOWN_DISPLAY_MATH`, `FORMULA_MARKDOWN_PLAIN`.
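
The three styles described above can be pictured with a small standalone sketch. This is not the library's implementation: `formula_to_md_sketch` and `looks_like_notation` are hypothetical names, the scoring rule is a toy stand-in for the real heuristic, and the constants are redefined locally with the values listed in the note.

```python
import re

# Local stand-ins mirroring the documented constant values (assumption:
# the real module constants hold these same strings).
FORMULA_MARKDOWN_AUTO = "auto"
FORMULA_MARKDOWN_DISPLAY_MATH = "display_math"
FORMULA_MARKDOWN_PLAIN = "plain"


def looks_like_notation(text: str) -> bool:
    """Toy heuristic (hypothetical): at least two operator-like characters."""
    symbols = re.findall(r"[=+\-*/^_\\{}<>≤≥∑∫√]", text)
    return len(symbols) >= 2


def formula_to_md_sketch(text: str, style: str = FORMULA_MARKDOWN_AUTO) -> str:
    """Return text as plain Markdown or wrapped in $$ ... $$ display math."""
    if style not in (
        FORMULA_MARKDOWN_AUTO,
        FORMULA_MARKDOWN_DISPLAY_MATH,
        FORMULA_MARKDOWN_PLAIN,
    ):
        raise ValueError(f"unknown style: {style!r}")
    if style == FORMULA_MARKDOWN_PLAIN:
        return text
    # A literal "$" inside the text would collide with the $$ delimiters,
    # so both remaining styles fall back to plain text.
    if "$" in text:
        return text
    if style == FORMULA_MARKDOWN_DISPLAY_MATH or looks_like_notation(text):
        return f"$$ {text} $$"
    return text
```

With this sketch, `formula_to_md_sketch("E = mc^2")` is wrapped in display math, while a caption such as `"Figure 3: results"` or any text containing `$` stays plain.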
0.22.15
Security
- Upgrade vulnerable transitive dependencies.
0.22.14
Enhancements
- Deduplicate PDF rendering: Remove `_render_pdf_pages` and delegate to `unstructured-inference`'s `convert_pdf_to_image` (which already has lazy per-page rendering). Peak memory for `path_only=True` drops from O(n_pages) to O(1 page), a 97% reduction on a 100-page PDF. Bumps inference dep to `>=1.6.2`.
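
The memory claim follows from the general pattern of rendering lazily and keeping only file paths. A minimal sketch of that pattern, assuming a hypothetical `render_page` stand-in rather than the real `convert_pdf_to_image` internals:

```python
from pathlib import Path
from typing import Iterator, List


def render_page(page_number: int) -> bytes:
    """Hypothetical rasterizer stand-in; returns fake image bytes."""
    return bytes([page_number % 256]) * 1024


def iter_pages(n_pages: int) -> Iterator[bytes]:
    """Yield one rendered page at a time instead of building a full list."""
    for page_number in range(1, n_pages + 1):
        yield render_page(page_number)


def render_to_paths(n_pages: int, output_dir: str) -> List[str]:
    """Write each page to disk as it is produced and keep only its path.

    Only one page image is referenced at any moment, so peak memory is
    bounded by a single page rather than by the whole document.
    """
    paths: List[str] = []
    for index, image in enumerate(iter_pages(n_pages), start=1):
        path = Path(output_dir) / f"page_{index}.bin"
        path.write_bytes(image)
        paths.append(str(path))
    return paths
```

The returned list grows with the page count, but it holds short strings rather than image buffers, which is where the savings come from.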
0.22.13
Enhancements
- Speed up `standardize_quotes`: Replace loop-based character replacement with a single `str.translate()` call using a pre-computed translation table. Also fixes a pre-existing bug where left smart quotes were never normalized due to duplicate dictionary keys.
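
A minimal sketch of the translation-table approach, assuming a hypothetical `standardize_quotes_sketch` and an illustrative subset of code points (the library's actual table may cover more characters). Keying the table by integer code points also sidesteps the duplicate-key problem, since look-alike quote characters in a dict literal can silently overwrite one another.

```python
# Illustrative subsets only (assumption): curly and angle quote code points.
DOUBLE_QUOTE_CODE_POINTS = [0x201C, 0x201D, 0x201E, 0x201F, 0x00AB, 0x00BB]
SINGLE_QUOTE_CODE_POINTS = [0x2018, 0x2019, 0x201A, 0x201B, 0x2039, 0x203A]

# Built once at import time; str.translate accepts a dict that maps
# Unicode ordinals to replacement strings.
QUOTE_TABLE = {cp: '"' for cp in DOUBLE_QUOTE_CODE_POINTS}
QUOTE_TABLE.update({cp: "'" for cp in SINGLE_QUOTE_CODE_POINTS})


def standardize_quotes_sketch(text: str) -> str:
    """Replace smart quotes with ASCII quotes in a single pass."""
    return text.translate(QUOTE_TABLE)
```

For example, `standardize_quotes_sketch("\u201chello\u201d")` returns `"hello"` with plain ASCII double quotes, including the left quote that a duplicate-key table would miss.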