Semantic Type Detection v18.8.0
  • ENH: Add Japanese locale support for NAME.LAST_FIRST — detects full names in Japanese Last-First order (姓名), supporting both space-separated (田中 太郎) and concatenated (田中太郎) forms using bloom filters. Header hints: 氏名, 姓名, 名前 (Issue [#165]).
  • ENH: Add approxDistinctCount to TextAnalysisResult using Apache DataSketches HyperLogLog. Opt-in via Feature.APPROX_DISTINCT_COUNT; the count is exact for low-cardinality fields and an estimate with roughly 1% error for high-cardinality fields. Supports merge. Fixes Issue [#92].
  • ENH: Add Feature.COLLECT_SHAPES flag (enabled by default) to allow disabling shape tracking and serialization — reduces JSON payload size and CPU overhead in distributed/Spark workloads. Fixes Issue [#166].
  • INT: Bump libphonenumber to 9.0.30
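
The NAME.LAST_FIRST entry above mentions two input forms, space-separated and concatenated. The following is a minimal sketch of how such a check could work; it is not the library's implementation. The class name, the tiny dictionaries, and the try-every-split-point strategy are all assumptions made for illustration, and a plain HashSet-style Set stands in for the bloom filters the release notes mention.

```java
import java.util.Set;

// Hypothetical sketch: detect Japanese Last-First names (姓名) in both
// space-separated (田中 太郎) and concatenated (田中太郎) forms.
class LastFirstSketch {
    // Illustrative dictionaries only. A Set is used here in place of a
    // bloom filter, so lookups are exact rather than probabilistic.
    static final Set<String> SURNAMES = Set.of("田中", "佐藤", "鈴木", "長谷川");
    static final Set<String> GIVEN_NAMES = Set.of("太郎", "花子", "一郎");

    static boolean isLastFirst(String input) {
        // Treat the ideographic space (U+3000) like an ASCII space.
        String[] parts = input.replace('\u3000', ' ').trim().split(" +");

        if (parts.length == 2) {
            // Space-separated form: surname first, given name second.
            return SURNAMES.contains(parts[0]) && GIVEN_NAMES.contains(parts[1]);
        }
        if (parts.length != 1) {
            return false;
        }

        // Concatenated form: try each split point until a known surname
        // is followed by a known given name.
        String s = parts[0];
        for (int i = 1; i < s.length(); i++) {
            if (SURNAMES.contains(s.substring(0, i))
                    && GIVEN_NAMES.contains(s.substring(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isLastFirst("田中 太郎"));  // space-separated
        System.out.println(isLastFirst("田中太郎"));   // concatenated
        System.out.println(isLastFirst("太郎 田中"));  // First-Last order
    }
}
```

Swapping the sets for bloom filters would reduce memory use at the cost of occasional false positives, which is presumably why a detector working over large name lists would pair them with header hints such as 氏名.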
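
The approxDistinctCount entry above describes a counter that is exact for low-cardinality fields, approximate for high-cardinality fields, and mergeable. The sketch below illustrates that behaviour with a hand-rolled HyperLogLog; it does not use Apache DataSketches and is not the library's code. The class name, the 1024-value exact threshold, and the 4096-register size are assumptions chosen for the example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: exact distinct count while small, HyperLogLog after.
class ApproxDistinctSketch {
    static final int EXACT_LIMIT = 1024;  // assumed promotion threshold
    static final int P = 12;              // 2^12 = 4096 registers
    static final int M = 1 << P;

    private Set<String> exact = new HashSet<>();
    private byte[] registers;             // null while in exact mode

    // FNV-1a over the characters, followed by a splitmix64-style finalizer.
    static long hash64(String s) {
        long h = 0xcbf29ce484222325L;
        for (int i = 0; i < s.length(); i++) {
            h ^= s.charAt(i);
            h *= 0x100000001b3L;
        }
        h ^= h >>> 30; h *= 0xbf58476d1ce4e5b9L;
        h ^= h >>> 27; h *= 0x94d049bb133111ebL;
        h ^= h >>> 31;
        return h;
    }

    void add(String value) {
        if (registers != null) {
            updateRegister(hash64(value));
            return;
        }
        exact.add(value);
        if (exact.size() > EXACT_LIMIT) {
            promote();
        }
    }

    // Switch from the exact set to HyperLogLog registers.
    private void promote() {
        registers = new byte[M];
        for (String s : exact) {
            updateRegister(hash64(s));
        }
        exact = null;
    }

    private void updateRegister(long h) {
        int idx = (int) (h >>> (64 - P));  // top P bits pick the register
        // Rank = position of the first 1-bit in the remaining bits.
        int rank = Math.min(Long.numberOfLeadingZeros(h << P) + 1, 64 - P + 1);
        if (rank > registers[idx]) {
            registers[idx] = (byte) rank;
        }
    }

    // Merging keeps the register-wise maximum, which equals the sketch of the union.
    void merge(ApproxDistinctSketch other) {
        if (other.registers == null) {
            for (String s : other.exact) {
                add(s);
            }
            return;
        }
        if (registers == null) {
            promote();
        }
        for (int i = 0; i < M; i++) {
            registers[i] = (byte) Math.max(registers[i], other.registers[i]);
        }
    }

    long estimate() {
        if (registers == null) {
            return exact.size();
        }
        double sum = 0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2, -r);
            if (r == 0) zeros++;
        }
        double alpha = 0.7213 / (1 + 1.079 / M);
        double raw = alpha * M * M / sum;
        if (raw <= 2.5 * M && zeros > 0) {
            raw = M * Math.log((double) M / zeros);  // small-range correction
        }
        return Math.round(raw);
    }

    public static void main(String[] args) {
        ApproxDistinctSketch sketch = new ApproxDistinctSketch();
        for (int i = 0; i < 100_000; i++) {
            sketch.add("v" + i);
        }
        System.out.println("estimate for 100000 distinct values: " + sketch.estimate());
    }
}
```

With m registers, the standard error of a HyperLogLog estimate is about 1.04/√m, so the 4096 registers assumed here give roughly 1.6%. Because merging is a register-wise maximum, partial results computed on separate partitions can be combined without revisiting the data.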
Source: README.md, updated 2026-05-10