
OpenMed v1.3.0 is the privacy and anonymization release.

This release turns PII handling into a more complete cross-platform workflow: Faker-backed obfuscation, deterministic surrogates, a canonical PII label taxonomy, unified Privacy Filter routing across MLX and PyTorch, Nemotron-PII Privacy Filter artifacts, and a new interactive Privacy Filter Studio.

The headline: OpenMed can now detect, mask, remove, hash, date-shift, or realistically replace identifiers with locale-aware surrogates, while using the same extract_pii() / deidentify() API on Apple Silicon, Linux, Windows, and service deployments.

Highlights

  • Added a Faker-backed anonymization engine for method="replace".
  • Added deterministic, locale-aware, and format-preserving surrogate generation.
  • Added a canonical PII label taxonomy that normalizes English, Portuguese, and BIOES-tagged Privacy Filter labels into one stable label set.
  • Added unified Privacy Filter backend routing: MLX on Apple Silicon, PyTorch everywhere else.
  • Added PyTorch support for the OpenAI Privacy Filter family through PrivacyFilterTorchPipeline.
  • Added Nemotron-PII Privacy Filter artifacts for PyTorch and MLX.
  • Added family-aware fallback so Nemotron MLX model names resolve to the Nemotron PyTorch checkpoint on non-MLX hosts.
  • Added shared BIOES/Viterbi decoding and span refinement utilities used by both MLX and PyTorch Privacy Filter paths.
  • Added Portuguese (pt) support to the REST API schemas.
  • Added Privacy Filter Studio, an interactive FastAPI/static web demo for masking and deterministic randomization.
  • Added Python and Swift Privacy Filter classifier-head bias support for Nemotron-PII artifacts.

Why This Release Matters

PII de-identification is only useful when the output is both safe and usable.

Simple masking is sometimes enough, but many clinical, operational, and demo workflows need text that still looks realistic: names that look like names, phone numbers that keep their separators, dates that keep their local ordering, and repeated mentions that resolve to the same fake person.

OpenMed v1.3.0 moves beyond static replacement lists. It gives developers a single privacy API that can:

  • run locally on Apple Silicon through MLX
  • run on CPU or CUDA through Transformers/PyTorch
  • preserve downstream-friendly formats
  • generate locale-appropriate fake identifiers
  • behave deterministically for reproducible tests and demos
  • route OpenAI and Nemotron Privacy Filter checkpoints through the same code

That makes OpenMed more practical for clinical prototypes, privacy demos, evaluation harnesses, and local-first healthcare applications.

Faker-Backed Anonymization

method="replace" now uses openmed.core.anonymizer.Anonymizer.

The anonymizer supports:

  • cached per-locale Faker instances
  • deterministic seeding with hashlib.blake2b
  • label-keyed generator dispatch
  • format-preserving phones, dates, emails, and generic IDs
  • locale overrides such as locale="pt_BR" or locale="en_GB"
  • custom label generators through register_label_generator()
  • custom clinical Faker providers through register_clinical_provider()
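The deterministic-seeding idea can be sketched in plain Python. This is a simplified illustration of the mechanism, not OpenMed's actual implementation; only `hashlib.blake2b` and the (label, original value, seed) keying are taken from the notes above, and the surrogate pool is invented for the example:

```python
import hashlib

# Illustrative surrogate pool; the real engine draws from Faker.
FAKE_NAMES = ["Ana Souza", "Bruno Lima", "Carla Nunes", "Diego Alves"]

def surrogate(label: str, value: str, seed: int) -> str:
    # Hash the (label, original value, seed) triple so the same input
    # always selects the same fake value within a seeded run.
    key = f"{label}|{value}|{seed}".encode("utf-8")
    digest = hashlib.blake2b(key, digest_size=8).digest()
    index = int.from_bytes(digest, "big") % len(FAKE_NAMES)
    return FAKE_NAMES[index]

# Repeated mentions of the same original resolve to the same surrogate,
# and a fixed seed keeps the mapping stable across runs.
assert surrogate("name", "Pedro Almeida", seed=42) == surrogate("name", "Pedro Almeida", seed=42)
```

Because the mapping is a pure function of label, value, and seed, no replacement table has to be stored to get consistency.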

Example:

:::python
from openmed import deidentify

text = "Patient Pedro Almeida, CPF 123.456.789-09, phone +351 912 345 678."

result = deidentify(
    text,
    method="replace",
    lang="pt",
    locale="pt_BR",
    consistent=True,
    seed=42,
)

print(result.deidentified_text)

Deterministic mode means the same (label, original value) pair maps to the same surrogate within a call. Passing seed= makes the output reproducible across runs.

Clinical And National IDs

OpenMed now includes custom Faker providers for clinical and national ID shapes where Faker's built-ins are missing or insufficient:

  • Aadhaar with Verhoeff checksum
  • German Steuer-ID
  • medical record numbers
  • US National Provider Identifier (NPI)
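For context, the Verhoeff checksum that Aadhaar numbers use can be sketched as follows. This is the generic textbook algorithm, not OpenMed's provider code:

```python
# Verhoeff checksum: multiplication (D), permutation (P), and inverse tables.
D = [
    [0,1,2,3,4,5,6,7,8,9],[1,2,3,4,0,6,7,8,9,5],[2,3,4,0,1,7,8,9,5,6],
    [3,4,0,1,2,8,9,5,6,7],[4,0,1,2,3,9,5,6,7,8],[5,9,8,7,6,0,4,3,2,1],
    [6,5,9,8,7,1,0,4,3,2],[7,6,5,9,8,2,1,0,4,3],[8,7,6,5,9,3,2,1,0,4],
    [9,8,7,6,5,4,3,2,1,0],
]
P = [
    [0,1,2,3,4,5,6,7,8,9],[1,5,7,6,2,8,3,0,9,4],[5,8,0,3,7,9,6,1,4,2],
    [8,9,1,6,0,4,3,5,2,7],[9,4,5,3,1,2,6,8,7,0],[4,2,8,6,5,7,3,9,0,1],
    [2,7,9,3,8,0,6,4,1,5],[7,0,4,6,9,1,3,2,5,8],
]
INV = [0,4,3,2,1,5,6,7,8,9]

def verhoeff_check_digit(number: str) -> str:
    """Compute the check digit to append to `number`."""
    c = 0
    for i, ch in enumerate(reversed(number)):
        c = D[c][P[(i + 1) % 8][int(ch)]]
    return str(INV[c])

def verhoeff_validate(number: str) -> bool:
    """A number with its check digit appended reduces to c == 0."""
    c = 0
    for i, ch in enumerate(reversed(number)):
        c = D[c][P[i % 8][int(ch)]]
    return c == 0
```

A surrogate Aadhaar generated this way passes the same checksum validation as a real one, which is what makes the fake value usable downstream.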

It also reuses Faker's locale-specific built-ins where they already validate against OpenMed's checksum logic:

  • pt_BR.cpf and pt_BR.cnpj
  • nl_NL.ssn for BSN
  • fr_FR.ssn for NIR
  • it_IT.ssn for Codice Fiscale
  • es_ES.nie

Canonical Label Taxonomy

openmed.core.labels introduces CANONICAL_LABELS and normalize_label().

This gives downstream code one stable label vocabulary even when models emit different naming schemes:

  • English lowercase labels such as first_name
  • Portuguese uppercase labels such as FIRSTNAME
  • Privacy Filter BIOES labels such as B-NAME, I-EMAIL, or S-PHONE
  • mixed-case or separator variants
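A minimal version of such a normalization layer might look like this. The alias table and canonical names here are illustrative; OpenMed's actual CANONICAL_LABELS vocabulary is larger:

```python
# Fold naming variants onto one canonical lowercase label set (sketch).
ALIASES = {"firstname": "first_name", "phone_number": "phone"}

def normalize_label(raw: str) -> str:
    label = raw.strip()
    # Strip BIOES prefixes such as B-NAME, I-EMAIL, or S-PHONE.
    if len(label) > 2 and label[1] == "-" and label[0] in "BIOES":
        label = label[2:]
    # Lowercase and unify separators (FIRSTNAME, First-Name, first name, ...).
    label = label.lower().replace("-", "_").replace(" ", "_")
    return ALIASES.get(label, label)
```

Everything downstream then branches on one vocabulary instead of per-model tag schemes.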

The anonymizer, replacement mapping, and Privacy Filter routes now use this normalization layer to reduce model-family-specific branching.

Privacy Filter Family

OpenMed v1.3.0 exposes two Privacy Filter checkpoint families through the same public API:

| Variant | PyTorch | MLX full | MLX 8-bit |
| --- | --- | --- | --- |
| OpenAI Privacy Filter | openai/privacy-filter | OpenMed/privacy-filter-mlx | OpenMed/privacy-filter-mlx-8bit |
| Nemotron-PII fine-tune | OpenMed/privacy-filter-nemotron | OpenMed/privacy-filter-nemotron-mlx | OpenMed/privacy-filter-nemotron-mlx-8bit |

Both families share the OpenAI Privacy Filter architecture; the Nemotron-PII artifacts are fine-tuned on the Nemotron PII dataset and run through the same Privacy Filter pipeline.

Use the same API everywhere:

:::python
from openmed import extract_pii, deidentify

text = "Patient Sarah Connor, DOB 03/15/1985, MRN 4471882."

entities = extract_pii(
    text,
    model_name="OpenMed/privacy-filter-nemotron-mlx-8bit",
)

safe = deidentify(
    text,
    model_name="OpenMed/privacy-filter-nemotron-mlx-8bit",
    method="replace",
    consistent=True,
    seed=42,
)

On Apple Silicon with MLX available, MLX artifacts run through PrivacyFilterMLXPipeline. On other hosts, MLX-only model names are automatically substituted with the matching PyTorch checkpoint:

  • OpenMed/privacy-filter-mlx* -> openai/privacy-filter
  • OpenMed/privacy-filter-nemotron-mlx* -> OpenMed/privacy-filter-nemotron

A one-time UserWarning explains the substitution.
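The substitution logic reduces to something like the sketch below. It is a simplified model of the routing described above, with a module-level flag standing in for the one-time warning; the function name and signature are assumptions:

```python
import warnings

# MLX-only name prefixes and their PyTorch fallbacks, per the mapping above.
FALLBACKS = {
    "OpenMed/privacy-filter-nemotron-mlx": "OpenMed/privacy-filter-nemotron",
    "OpenMed/privacy-filter-mlx": "openai/privacy-filter",
}
_warned = False

def resolve_model_name(name: str, mlx_available: bool) -> str:
    global _warned
    if mlx_available:
        return name
    # Check the nemotron family first so its names never match the base prefix.
    for prefix, fallback in FALLBACKS.items():
        if name.startswith(prefix):
            if not _warned:
                warnings.warn(
                    f"MLX unavailable: substituting {fallback} for {name}",
                    UserWarning,
                )
                _warned = True
            return fallback
    return name
```

Callers keep using one model name everywhere; the backend decision happens at load time.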

PyTorch Privacy Filter

openmed.torch.PrivacyFilterTorchPipeline loads the Privacy Filter family via Transformers:

  • auto-selects CUDA when available, otherwise CPU
  • supports compatible fine-tunes
  • emits the same entity dictionary shape as the MLX pipeline
  • uses trust_remote_code=True by default for the OpenAI Privacy Filter family

Install:

:::bash
pip install -U "openmed[hf]"

Run:

:::python
from openmed import extract_pii

result = extract_pii(
    "Alice Smith emailed alice@example.com.",
    model_name="openai/privacy-filter",
)

MLX Privacy Filter Updates

The Python MLX Privacy Filter runtime now shares decoding utilities with the PyTorch path:

  • TokenLabelInfo
  • build_label_info
  • viterbi_decode
  • labels_to_token_spans
  • trim_span_whitespace
  • refine_privacy_filter_span

This keeps BIOES/Viterbi decoding consistent across backends.
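As a rough illustration of what BIOES span construction does, here is a toy decoder. It is unrelated to the shared utilities' actual signatures and skips the Viterbi transition constraints:

```python
def bioes_to_spans(labels):
    """Collect (entity, start, end) token spans from BIOES tags."""
    spans, start, current = [], None, None
    for i, tag in enumerate(labels):
        if tag == "O":
            start = current = None
            continue
        prefix, _, entity = tag.partition("-")
        if prefix == "S":                          # single-token entity
            spans.append((entity, i, i))
            start = current = None
        elif prefix == "B":                        # entity begins
            start, current = i, entity
        elif prefix == "E" and current == entity:  # entity ends
            spans.append((entity, start, i))
            start = current = None
        # "I" simply continues an open span in this sketch.
    return spans
```

The real paths additionally run Viterbi decoding so the model can never emit an ill-formed tag sequence in the first place.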

The MLX model class also now honors classifier_bias / unembedding_bias in artifact configs. This keeps the original OpenAI Privacy Filter bias-less by default while allowing Nemotron-PII artifacts to load their biased classifier head correctly.

Swift And OpenMedKit

OpenMedKit also gained Privacy Filter classifier-head bias support.

The native MLX artifact loader now decodes classifier_bias / unembedding_bias and builds the Privacy Filter head with a learned bias when Nemotron-PII artifacts require it, while preserving the bias-less baseline path.

The OpenMed Scan Demo privacy-filter option now points at OpenMed/privacy-filter-nemotron-mlx-8bit and labels the engine as "OpenAI Nemotron Privacy Filter" throughout the picker, download events, and README.

Privacy Filter Studio

This release adds examples/privacy_filter_studio/, an interactive two-pane web demo for PII de-identification.

It includes:

  • sample clinical and operational notes
  • mask and deterministic randomize modes
  • highlighted detected entities
  • per-entity labels and category colors
  • model/backend status
  • latency and entity counters
  • a first-run download toggle
  • cache-only model loading unless downloads are explicitly allowed

Run:

:::bash
pip install -U "openmed[mlx]"        # or "openmed[hf]" off Apple Silicon
uvicorn examples.privacy_filter_studio.app:app --reload --port 8770

Open:

:::text
http://127.0.0.1:8770

Override the model:

:::bash
OPENMED_STUDIO_MODEL=OpenMed/privacy-filter-nemotron-mlx-8bit \
  uvicorn examples.privacy_filter_studio.app:app --port 8770

Documentation And Examples

New and updated docs/examples:

  • docs/anonymization.md
  • examples/obfuscation_demo.py
  • examples/privacy_filter_unified.py
  • examples/privacy_filter_studio/
  • examples/privacy_filter_book/app.py

The anonymization guide covers deterministic surrogates, locale resolution, format preservation, custom generators, clinical ID providers, and the Privacy Filter routing model.

Breaking Changes

  • faker>=22.0 is now a required core dependency.
  • method="replace" no longer uses the old small static fake-data lists. Downstream tests that asserted exact prior replacement strings should be updated.
  • Privacy Filter routing through extract_pii() skips regex smart-merging by design, because the model already performs Viterbi-constrained BIOES span construction.

Other de-identification methods are unchanged:

  • mask
  • remove
  • hash
  • shift_dates
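For reference, the hash and shift_dates ideas reduce to something like this sketch (illustrative only; function and parameter names are assumed, not OpenMed's API):

```python
import hashlib
from datetime import date, timedelta

def hash_pii(value: str, *, digest_size: int = 8) -> str:
    # Replace an identifier with a stable, irreversible hex digest.
    return hashlib.blake2b(value.encode("utf-8"), digest_size=digest_size).hexdigest()

def shift_date(d: date, *, days: int) -> date:
    # Shift every date by the same per-document offset so intervals
    # between events are preserved while absolute dates are obscured.
    return d + timedelta(days=days)
```

The key property of date shifting is that relative timing survives: two events ten days apart stay ten days apart after the shift.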

Upgrade Notes

Install or upgrade:

:::bash
pip install -U openmed

For PyTorch Privacy Filter support:

:::bash
pip install -U "openmed[hf]"

For Apple Silicon MLX support:

:::bash
pip install -U "openmed[mlx]"

Recommended checks for application upgrades:

  • If you assert exact method="replace" outputs, switch to seeded deterministic expectations or assert that originals are removed.
  • If you use MLX Privacy Filter model names on non-MLX hosts, expect a one-time warning and an automatic PyTorch substitution.
  • If you package Nemotron-PII MLX artifacts, keep classifier_bias or unembedding_bias in the artifact config when the classifier head has bias.
  • If you expose downloads in a UI, use explicit user control like Privacy Filter Studio's download toggle.

Validation

Release-prep validation included:

:::bash
git diff --check
.venv/bin/python -m compileall -q examples/privacy_filter_studio openmed/mlx/models/privacy_filter.py
.venv/bin/python -m pytest tests/unit/mlx/test_privacy_filter_mlx.py tests/unit/test_privacy_filter_routing.py
.venv/bin/python -m pytest tests/unit/core/test_anonymizer.py tests/unit/core/test_labels.py tests/unit/test_pii.py tests/unit/test_privacy_filter_routing.py tests/unit/test_pii_multilingual_regression.py tests/unit/mlx/test_privacy_filter_mlx.py tests/unit/service/test_api.py

Results captured during release prep:

  • Studio and MLX model compile check: passed
  • Studio FastAPI smoke test: passed
  • Privacy Filter routing/MLX subset: 20 passed, 8 skipped
  • Focused privacy/anonymization suite: 471 passed, 1 skipped, 11 warnings

The warnings are pre-existing span-validation warnings from multilingual PII regression fixtures.

Thank You

OpenMed v1.3.0 is about making privacy work feel less like a demo trick and more like an actual developer surface: local when possible, portable when needed, deterministic when tests demand it, and realistic enough for useful clinical workflows.

Thank you to everyone testing the Privacy Filter artifacts, poking at de-identification edge cases, trying the OpenMedKit paths, and helping OpenMed move toward a more practical open-source healthcare AI stack.

What's Changed

Full Changelog: https://github.com/maziyarpanahi/openmed/compare/v1.2.0...v1.3.0

Source: README.md, updated 2026-04-29