
0.22.16

Enhancements

  • Formula markdown export (element_to_md / elements_to_md): New keyword-only formula_markdown_style option ("auto", "display_math", "plain"; default "auto"). In "auto", display math ($$ ... $$) is used only when the text looks like notation (heuristic score) and contains no $ or $$, which avoids breaking Markdown and wrapping noisy OCR captions. "display_math" wraps whenever it is safe to do so, and still falls back to plain text if a $ would corrupt the fences. "plain" emits text only. The optional normalize_formula (default True) maps common Unicode operators to LaTeX-like tokens; it stays before the keyword-only options, so callers that pass encoding / no_group_by_page positionally are unchanged. Unicode is never mapped to \sqrt{}. Module constants: FORMULA_MARKDOWN_AUTO, FORMULA_MARKDOWN_DISPLAY_MATH, FORMULA_MARKDOWN_PLAIN.
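
The style selection described in this entry can be sketched roughly as follows. This is an illustration only, not the library's implementation: `looks_like_notation`, its threshold, the character set it scores, and `formula_to_md` are all hypothetical, and the exact layout of the `$$` fences is an assumption. Only the three style strings and the constant names come from the release note.

```python
import re

# Constant names are taken from the release note; the values mirror the
# documented style strings.
FORMULA_MARKDOWN_AUTO = "auto"
FORMULA_MARKDOWN_DISPLAY_MATH = "display_math"
FORMULA_MARKDOWN_PLAIN = "plain"

# Hypothetical heuristic input: characters that suggest mathematical notation.
_NOTATION_CHARS = re.compile(r"[=+\-*/^_\\{}<>≤≥∑∫√±×÷]")


def looks_like_notation(text: str, threshold: float = 0.1) -> bool:
    """Hypothetical score: share of non-space characters that look like notation."""
    compact = "".join(text.split())
    if not compact:
        return False
    hits = len(_NOTATION_CHARS.findall(compact))
    return hits / len(compact) >= threshold


def formula_to_md(text: str, style: str = FORMULA_MARKDOWN_AUTO) -> str:
    """Hypothetical helper that picks between display math and plain text."""
    if style not in (
        FORMULA_MARKDOWN_AUTO,
        FORMULA_MARKDOWN_DISPLAY_MATH,
        FORMULA_MARKDOWN_PLAIN,
    ):
        raise ValueError(f"unknown formula_markdown_style: {style!r}")
    text = text.strip()
    # A literal "$" inside the text would corrupt the $$ fences, so fall back.
    if style == FORMULA_MARKDOWN_PLAIN or "$" in text:
        return text
    # In "auto", wrap only when the text scores as notation.
    if style == FORMULA_MARKDOWN_AUTO and not looks_like_notation(text):
        return text
    return f"$$\n{text}\n$$"
```

With this sketch, `formula_to_md("E = mc^2")` is wrapped in display math, while a caption such as "Figure 3 shows the results" stays plain in "auto" and is wrapped only under "display_math".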

0.22.15

Security

  • Upgrade vulnerable transitive dependencies.

0.22.14

Enhancements

  • Deduplicate PDF rendering: Remove _render_pdf_pages and delegate to unstructured-inference's convert_pdf_to_image, which already renders lazily, one page at a time. Peak memory for path_only=True drops from O(n_pages) to O(1 page), a 97% reduction on a 100-page PDF. Bumps the unstructured-inference dependency to >=1.6.2.
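
The memory difference between eager and lazy per-page rendering can be illustrated with a small generator sketch. Nothing here calls unstructured-inference: `render_page`, `render_all_pages`, `iter_pages`, and `save_pages_lazily` are hypothetical stand-ins, and the fake page size is arbitrary.

```python
from typing import Iterator, List


def render_page(page_number: int) -> bytes:
    """Stand-in for a real rasterizer: returns a fake 1 MB page image."""
    return bytes(1_000_000)


def render_all_pages(n_pages: int) -> List[bytes]:
    """Eager: every page image is alive at once, so peak memory is O(n_pages)."""
    return [render_page(i) for i in range(n_pages)]


def iter_pages(n_pages: int) -> Iterator[bytes]:
    """Lazy: yields one page at a time, so only one image needs to be alive."""
    for i in range(n_pages):
        yield render_page(i)


def save_pages_lazily(n_pages: int) -> List[str]:
    """Consume pages one by one and keep only their (hypothetical) output paths."""
    paths = []
    for i, image in enumerate(iter_pages(n_pages)):
        # A real pipeline would write `image` to disk here; once the loop moves
        # on, the previous image is no longer referenced and can be freed.
        paths.append(f"page_{i + 1}.png")
    return paths
```

When only paths are needed, the lazy variant never holds more than one rendered page, which is the kind of saving the entry describes for path_only=True.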

0.22.13

Enhancements

  • Speed up standardize_quotes: Replace loop-based character replacement with a single str.translate() call using a pre-computed translation table. Also fixes a pre-existing bug where left smart quotes were never normalized due to duplicate dictionary keys.
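
A minimal sketch of the technique, assuming a much smaller quote set than the library handles: `standardize_quotes_sketch` and `_QUOTE_TABLE` are hypothetical names, and the duplicate-key demonstration shows the general Python behavior rather than the exact original bug.

```python
# Built once at import time. str.maketrans accepts a dict that maps
# single characters to their replacement strings.
_QUOTE_TABLE = str.maketrans(
    {
        "\u201c": '"',  # left double quotation mark
        "\u201d": '"',  # right double quotation mark
        "\u2018": "'",  # left single quotation mark
        "\u2019": "'",  # right single quotation mark
    }
)


def standardize_quotes_sketch(text: str) -> str:
    """Replace smart quotes with ASCII quotes in a single pass over the text."""
    return text.translate(_QUOTE_TABLE)


# Duplicate keys in a dict literal are legal Python: the last value silently
# wins, so an entry that was meant to be distinct can vanish from the mapping.
shadowed = {"a": "first", "a": "second"}
```

For example, `standardize_quotes_sketch("\u201cHello\u201d")` returns `'"Hello"'`, and `shadowed` ends up with a single entry. Keying the table by explicit code point escapes makes accidental duplicates easier to spot in review.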
Source: README.md, updated 2026-04-03