OpenMed v0.6.3 — PII/NER Quality Gate Pack
Release date: 2026-03-19
v0.6.3 hardens the PII/NER extraction pipeline with deterministic guardrails: span-boundary validation, multilingual regression coverage, and label-map consistency checks. These quality gates catch tokenizer bugs, model drift, and label inconsistencies before they reach users.
What's New
Span-Boundary Quality Gates
New runtime validation module (openmed.core.quality_gates) that runs automatically after tokenizer repair and smart merging:
- `validate_entity_spans(entities, text)` checks every entity for:
  - `start < end` (no inverted or zero-length spans)
  - Bounds within text (`start >= 0`, `end <= len(text)`)
  - `text[start:end]` matches stored entity text (catches stale spans after merging)
- `detect_overlapping_entities(entities)` returns overlapping span pairs for informational use
- Warn-only by design: emits `SpanValidationWarning` and tags `entity.metadata["span_valid"]`, but never silently drops entities
Integrated into both OutputFormatter.format_predictions() (after _fix_entity_spans) and extract_pii() (after smart merging).
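The three span checks can be illustrated with a minimal sketch. This is not the code in `openmed/core/quality_gates.py`: the `Entity` dataclass, the warning message format, and the local `SpanValidationWarning` class are assumptions made so the example runs on its own. Only the function name, the checks, and the `span_valid` metadata key come from the notes above.

```python
import warnings
from dataclasses import dataclass, field


class SpanValidationWarning(UserWarning):
    """Local stand-in named after the warning mentioned in the release notes."""


@dataclass
class Entity:
    # Hypothetical entity shape; the real OpenMed entity class may differ.
    text: str
    label: str
    start: int
    end: int
    metadata: dict = field(default_factory=dict)


def validate_entity_spans(entities, text):
    """Tag each entity with metadata["span_valid"]; warn on problems, drop nothing."""
    for ent in entities:
        problems = []
        if ent.start >= ent.end:
            problems.append("inverted or zero-length span")
        if ent.start < 0 or ent.end > len(text):
            problems.append("span outside text bounds")
        # Only compare text when the offsets themselves are usable.
        if not problems and text[ent.start:ent.end] != ent.text:
            problems.append("stored text does not match text[start:end]")
        ent.metadata["span_valid"] = not problems
        for problem in problems:
            warnings.warn(
                f"{ent.label} [{ent.start}:{ent.end}]: {problem}",
                SpanValidationWarning,
            )
    return entities


note = "Patient John Doe was seen on 2024-01-15."
entities = [
    Entity("John Doe", "NAME", 8, 16),
    Entity("2024-01-15", "DATE", 30, 40),  # stale offsets, shifted by one
]
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    checked = validate_entity_spans(entities, note)
```

Both entities are returned. The second one is kept but tagged `span_valid = False` and reported through a warning, which matches the warn-only behavior described above.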
Multilingual PII Regression Test Suite
31 golden-input regression tests covering all 8 supported languages:
| Language | Tests | Entity types validated |
|---|---|---|
| English (en) | 5 | NAME, DATE, PHONE, SSN, merging |
| French (fr) | 4 | NAME, DATE, PHONE, NIR |
| German (de) | 4 | NAME, DATE, PHONE, merging |
| Italian (it) | 3 | NAME, DATE, PHONE |
| Spanish (es) | 4 | NAME, DATE, PHONE, confidence |
| Dutch (nl) | 3 | NAME, DATE, PHONE |
| Hindi (hi) | 4 | NAME, DATE, PHONE, merging |
| Telugu (te) | 4 | NAME, DATE, PHONE, confidence |
All tests use mocked model output for fast, deterministic execution.
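As a rough illustration of the golden-input pattern with mocked model output, the sketch below replaces the model with a `unittest.mock.Mock` that returns canned predictions. The `extract_entities` function, the prediction dictionary layout, and the 0.5 threshold are hypothetical and are not taken from the OpenMed test suite.

```python
from unittest.mock import Mock


def extract_entities(text, model, threshold=0.5):
    """Hypothetical pipeline step: keep predictions at or above the threshold."""
    return [
        {"label": p["label"], "text": text[p["start"]:p["end"]]}
        for p in model(text)
        if p["score"] >= threshold
    ]


def prediction(text, surface, label, score):
    """Build a canned prediction whose offsets are derived from the text itself."""
    start = text.index(surface)
    return {"label": label, "start": start, "end": start + len(surface), "score": score}


def test_french_name_and_date():
    text = "Marie Dupont est née le 12/03/1985."
    # The mock stands in for the real model, so no weights are loaded.
    model = Mock(return_value=[
        prediction(text, "Marie Dupont", "NAME", 0.98),
        prediction(text, "12/03/1985", "DATE", 0.95),
        prediction(text, "Marie", "PHONE", 0.10),  # low-confidence noise
    ])
    found = extract_entities(text, model)
    assert found == [
        {"label": "NAME", "text": "Marie Dupont"},
        {"label": "DATE", "text": "12/03/1985"},
    ]
    model.assert_called_once_with(text)


test_french_name_and_date()
```

Because the model output is fixed, a failure in a test of this shape points at the post-processing code rather than at model drift.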
Label-Map Consistency Tests
32 tests validating configuration invariants:
- Every domain in `defaults.json` has at least 1 label, no case-insensitive duplicates
- `generic` fallback domain always exists
- `normalize_label()` is idempotent for all known label variants
- Specificity hierarchy (`is_more_specific()`) agrees with documented hierarchies
- All `entity_types` in `OPENMED_MODELS` PII entries are recognized by `normalize_label()`
- At least one PII model per supported language in the registry
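Two of these invariants, idempotent normalization and duplicate-free domains with a `generic` fallback, can be sketched as follows. The alias table, the domain map, and this `normalize_label` body are invented for illustration; the real `defaults.json` contents and the real normalization rules are not shown in these notes.

```python
# Hypothetical alias table and domain map, standing in for defaults.json.
ALIASES = {"PER": "NAME", "PERSON": "NAME", "DOB": "DATE", "TEL": "PHONE"}
DOMAIN_LABELS = {
    "generic": ["NAME", "DATE"],
    "pii": ["NAME", "DATE", "PHONE", "SSN"],
}


def normalize_label(label):
    """Illustrative normalizer: trim, uppercase, strip a BIO prefix, map aliases."""
    key = label.strip().upper()
    if key.startswith(("B-", "I-")):
        key = key[2:]
    return ALIASES.get(key, key)


def check_label_map(domains):
    """Return a list of human-readable invariant violations (empty when clean)."""
    errors = []
    if "generic" not in domains:
        errors.append("missing 'generic' fallback domain")
    for name, labels in domains.items():
        if not labels:
            errors.append(f"domain {name!r} has no labels")
        lowered = [label.lower() for label in labels]
        if len(set(lowered)) != len(lowered):
            errors.append(f"domain {name!r} has case-insensitive duplicates")
    return errors


variants = ["per", "B-PER", " Person ", "NAME", "dob", "I-TEL", "SSN"]
# Idempotence: normalizing an already-normalized label must change nothing.
not_idempotent = [
    v for v in variants
    if normalize_label(normalize_label(v)) != normalize_label(v)
]
clean_errors = check_label_map(DOMAIN_LABELS)
broken_errors = check_label_map({"pii": ["Name", "NAME"]})
```

Here `not_idempotent` and `clean_errors` are both empty, while the deliberately broken map reports a missing `generic` domain and a duplicate label.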
Span-Boundary Guard Tests
19 unit tests covering valid entities, inverted/zero-length spans, negative start, out-of-bounds end, text mismatch, overlap detection (adjacent, nested, multiple), and integration with _fix_entity_spans output.
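The adjacent and nested overlap cases can be made concrete with a small sketch. It assumes half-open `[start, end)` spans, so two spans that merely touch do not count as overlapping. The notes do not state how the real `detect_overlapping_entities` treats adjacency, and `find_overlaps` is a hypothetical helper, not the library function.

```python
from itertools import combinations


def find_overlaps(spans):
    """Return index pairs (i, j), i < j, whose half-open spans intersect."""
    pairs = []
    for (i, (s1, e1)), (j, (s2, e2)) in combinations(enumerate(spans), 2):
        # Two half-open intervals intersect when each starts before the other ends.
        if s1 < e2 and s2 < e1:
            pairs.append((i, j))
    return pairs


spans = [
    (0, 5),    # touches the next span at offset 5
    (5, 9),    # adjacent to the previous span, not overlapping
    (10, 20),  # contains the next span
    (12, 15),  # nested inside the previous span
]
overlaps = find_overlaps(spans)
```

With these inputs `overlaps` is `[(2, 3)]`: the nested pair is reported and the adjacent pair is not. The pairwise comparison is quadratic in the number of entities, which is usually acceptable for the handful of entities found in a single document.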
Changed
- Website model count updated from 640+ to 750+
Test Summary
| Suite | Tests | Status |
|---|---|---|
| Span-boundary guards | 19 | All pass |
| Multilingual PII regression | 31 | All pass |
| Label-map consistency | 32 | All pass |
| Full suite | 600 | All pass |
Files Added
- `openmed/core/quality_gates.py`: Span validation + overlap detection
- `tests/unit/test_quality_gates.py`: Guard unit tests
- `tests/unit/test_pii_multilingual_regression.py`: Multilingual regression tests
- `tests/unit/ner/test_label_map_consistency.py`: Label-map invariant tests
Files Changed
- `openmed/processing/outputs.py`: Integrated span guard after `_fix_entity_spans`
- `openmed/core/pii.py`: Integrated span guard after smart merging
- `openmed/__about__.py`: Version `0.6.2` → `0.6.3`
- `docs/website/index.html`: `softwareVersion` → `0.6.3`
- `CHANGELOG.md`: Added v0.6.3 section
- `README.md`: Updated version references
What's Changed
- v0.6.3: PII/NER quality gate pack by @maziyarpanahi in https://github.com/maziyarpanahi/openmed/pull/31
Full Changelog: https://github.com/maziyarpanahi/openmed/compare/v0.6.2...v0.6.3