OpenMed v0.6.3 — PII/NER Quality Gate Pack

Release date: 2026-03-19

v0.6.3 hardens the PII/NER extraction pipeline with deterministic guardrails: span-boundary validation, multilingual regression coverage, and label-map consistency checks. These quality gates catch tokenizer bugs, model drift, and label inconsistencies before they reach users.


What's New

Span-Boundary Quality Gates

New runtime validation module (openmed.core.quality_gates) that runs automatically after tokenizer repair and smart merging:

  • validate_entity_spans(entities, text) — checks every entity for:
      • start < end (no inverted or zero-length spans)
      • Bounds within text (start >= 0, end <= len(text))
      • text[start:end] matches stored entity text (catches stale spans after merging)
  • detect_overlapping_entities(entities) — returns overlapping span pairs for informational use
  • Warn-only by design — emits SpanValidationWarning and tags entity.metadata["span_valid"], but never silently drops entities

Integrated into both OutputFormatter.format_predictions() (after _fix_entity_spans) and extract_pii() (after smart merging).
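
The three checks and the warn-only behaviour can be sketched in a few lines. This is a minimal illustration, not the module's actual code: the Entity dataclass, its field names, and the function body are assumptions. Only the names validate_entity_spans, SpanValidationWarning, and the "span_valid" metadata key come from these notes.

```python
import warnings
from dataclasses import dataclass, field


class SpanValidationWarning(UserWarning):
    """Warning category for span problems (name from the notes; definition assumed)."""


@dataclass
class Entity:
    # Hypothetical entity shape, for illustration only.
    text: str
    label: str
    start: int
    end: int
    metadata: dict = field(default_factory=dict)


def validate_entity_spans(entities, text):
    """Tag each entity with metadata["span_valid"]; warn on problems, never drop."""
    for ent in entities:
        problems = []
        if ent.start >= ent.end:
            problems.append("inverted or zero-length span")
        if ent.start < 0 or ent.end > len(text):
            problems.append("span out of bounds")
        elif ent.start < ent.end and text[ent.start:ent.end] != ent.text:
            problems.append("span text does not match stored entity text")
        ent.metadata["span_valid"] = not problems
        for problem in problems:
            warnings.warn(
                f"{ent.label} [{ent.start}:{ent.end}]: {problem}",
                SpanValidationWarning,
            )
    return entities  # same entities, same order; invalid ones are only tagged
```

Because invalid entities are tagged rather than removed, a caller can decide whether to filter on metadata["span_valid"] or to log and continue.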

Multilingual PII Regression Test Suite

31 golden-input regression tests covering all 8 supported languages:

Language       Tests  Entity types validated
English (en)   5      NAME, DATE, PHONE, SSN, merging
French (fr)    4      NAME, DATE, PHONE, NIR
German (de)    4      NAME, DATE, PHONE, merging
Italian (it)   3      NAME, DATE, PHONE
Spanish (es)   4      NAME, DATE, PHONE, confidence
Dutch (nl)     3      NAME, DATE, PHONE
Hindi (hi)     4      NAME, DATE, PHONE, merging
Telugu (te)    4      NAME, DATE, PHONE, confidence

All tests use mocked model output for fast, deterministic execution.
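
A golden-input test with mocked model output might look roughly like the sketch below. Everything here is hypothetical: run_pipeline stands in for the real extraction entry point, and the sample sentence, prediction format, and helper names are invented for illustration. The point is only the pattern: canned predictions go in, a fixed expected result comes out, and no model is loaded.

```python
from unittest.mock import Mock

TEXT = "Patiente : Marie Dupont, admise le 12/03/1985."


def span(label, surface, score):
    """Build a canned prediction whose offsets are derived from TEXT."""
    start = TEXT.index(surface)
    return {"label": label, "start": start, "end": start + len(surface), "score": score}


def run_pipeline(text, model):
    """Hypothetical stand-in for the extraction pipeline under test."""
    return [(p["label"], text[p["start"]:p["end"]]) for p in model(text)]


def test_french_name_and_date():
    # The mock replaces the real model, so the test is fast and deterministic.
    model = Mock(return_value=[
        span("NAME", "Marie Dupont", 0.98),
        span("DATE", "12/03/1985", 0.95),
    ])
    expected = [("NAME", "Marie Dupont"), ("DATE", "12/03/1985")]
    assert run_pipeline(TEXT, model) == expected
    model.assert_called_once_with(TEXT)
```

Pinning the model output this way means a failure points at the post-processing code rather than at model weights or tokenizer versions.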

Label-Map Consistency Tests

32 tests validating configuration invariants:

  • Every domain in defaults.json has at least one label and no case-insensitive duplicates
  • generic fallback domain always exists
  • normalize_label() is idempotent for all known label variants
  • Specificity hierarchy (is_more_specific()) agrees with documented hierarchies
  • All entity_types in OPENMED_MODELS PII entries are recognized by normalize_label()
  • At least one PII model per supported language in the registry
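
Several of these invariants can be expressed as short checks. The sketch below assumes a toy label map and a toy normalize_label(); the real defaults.json contents and the real normalizer are not shown in these notes and may differ. DEFAULTS, ALIASES, and check_label_map are hypothetical names.

```python
# Hypothetical stand-ins: the real defaults.json and normalize_label() may differ.
DEFAULTS = {
    "generic": ["NAME", "DATE", "PHONE"],
    "clinical": ["DISEASE", "DRUG"],
}
ALIASES = {"PER": "NAME", "PERSON": "NAME", "TEL": "PHONE"}


def normalize_label(label):
    """Strip a BIO prefix, upper-case, then map known aliases to canonical labels."""
    cleaned = label.strip().upper()
    if cleaned[:2] in ("B-", "I-"):
        cleaned = cleaned[2:]
    return ALIASES.get(cleaned, cleaned)


def check_label_map(defaults):
    """Return a list of invariant violations; an empty list means consistent."""
    errors = []
    if "generic" not in defaults:
        errors.append("missing generic fallback domain")
    for domain, labels in defaults.items():
        if not labels:
            errors.append(f"{domain}: no labels")
        lowered = [name.lower() for name in labels]
        if len(set(lowered)) != len(lowered):
            errors.append(f"{domain}: case-insensitive duplicate")
    return errors


def is_idempotent(variants):
    """True if normalizing twice gives the same result as normalizing once."""
    return all(normalize_label(normalize_label(v)) == normalize_label(v) for v in variants)
```

Returning a list of violations, rather than raising on the first one, lets a single test run report every broken domain at once.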

Span-Boundary Guard Tests

19 unit tests covering valid entities, inverted/zero-length spans, negative start, out-of-bounds end, text mismatch, overlap detection (adjacent, nested, multiple), and integration with _fix_entity_spans output.
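
The overlap cases named above (adjacent, nested, multiple) can be illustrated with a small sketch. It assumes half-open spans [start, end) that have already passed validation, and it assumes adjacent spans do not count as overlapping; the notes do not state which result the real tests expect. The Span type and the function body are hypothetical, and only the name detect_overlapping_entities comes from the notes.

```python
from typing import NamedTuple


class Span(NamedTuple):
    # Hypothetical minimal span type; half-open interval [start, end).
    label: str
    start: int
    end: int


def detect_overlapping_entities(entities):
    """Return (a, b) pairs whose half-open spans share at least one character."""
    ordered = sorted(entities, key=lambda e: (e.start, e.end))
    pairs = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start >= a.end:
                break  # sorted by start, so no later span can overlap a
            pairs.append((a, b))
    return pairs


adjacent = [Span("NAME", 0, 4), Span("DATE", 4, 8)]
nested = [Span("NAME", 0, 10), Span("FIRST_NAME", 2, 5)]
chain = [Span("A", 0, 5), Span("B", 3, 8), Span("C", 7, 12)]
```

Under these assumptions, adjacent yields no pairs, nested yields one pair, and chain yields two (A with B, B with C), since A ends before C starts.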


Changed

  • Website model count updated from 640+ to 750+

Test Summary

Suite                        Tests  Status
Span-boundary guards         19     All pass
Multilingual PII regression  31     All pass
Label-map consistency        32     All pass
Full suite                   600    All pass

Files Added

  • openmed/core/quality_gates.py — Span validation + overlap detection
  • tests/unit/test_quality_gates.py — Guard unit tests
  • tests/unit/test_pii_multilingual_regression.py — Multilingual regression tests
  • tests/unit/ner/test_label_map_consistency.py — Label-map invariant tests

Files Changed

  • openmed/processing/outputs.py — Integrated span guard after _fix_entity_spans
  • openmed/core/pii.py — Integrated span guard after smart merging
  • openmed/__about__.py — Version 0.6.2 → 0.6.3
  • docs/website/index.html — softwareVersion → 0.6.3
  • CHANGELOG.md — Added v0.6.3 section
  • README.md — Updated version references

What's Changed

Full Changelog: https://github.com/maziyarpanahi/openmed/compare/v0.6.2...v0.6.3
