Lingua-Py - Browse /v1.4.0 at SourceForge.net


Name	Modified	Size	Downloads / Week
lingua_language_detector-1.4.0.tar.gz	2024-10-29	93.2 MB	0
lingua_language_detector-1.4.0-py3-none-any.whl	2024-10-29	93.4 MB	0
Lingua 1.4.0 source code.tar.gz	2024-10-29	101.6 MB	0
Lingua 1.4.0 source code.zip	2024-10-29	104.3 MB	0
README.md	2024-10-29	1.0 kB	0
Totals: 5 Items		392.5 MB	0

This release introduces an absolute confidence metric based on the unique and most common ngrams of each supported language. It allows you to build a language detector from a single language only. Such a detector serves as a binary classifier, telling you whether or not some text is written in your selected language. (#235)
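
To make the idea of a single-language binary classifier more concrete, here is a minimal toy sketch in Python. It is not Lingua's implementation and does not use Lingua's API: the functions `char_ngrams`, `build_profile`, `absolute_confidence` and `is_language`, the trigram size, the `top_k` cutoff and the 0.5 threshold are all hypothetical choices made for illustration. The sketch only assumes that a language can be described by a set of its most common character ngrams, and that a text's confidence is the share of its ngrams found in that set.

```python
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> list[str]:
    """Split text into words and return the character n-grams of each word."""
    grams = []
    for word in text.lower().split():
        # Keep letters only, so punctuation and digits do not form n-grams.
        word = "".join(ch for ch in word if ch.isalpha())
        grams.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return grams


def build_profile(corpus: str, top_k: int = 50, n: int = 3) -> set[str]:
    """Hypothetical language profile: the top_k most common n-grams of a corpus."""
    counts = Counter(char_ngrams(corpus, n))
    return {gram for gram, _ in counts.most_common(top_k)}


def absolute_confidence(text: str, profile: set[str], n: int = 3) -> float:
    """Share of the text's n-grams that occur in the profile (0.0 to 1.0)."""
    grams = char_ngrams(text, n)
    if not grams:
        return 0.0
    hits = sum(1 for gram in grams if gram in profile)
    return hits / len(grams)


def is_language(text: str, profile: set[str], threshold: float = 0.5, n: int = 3) -> bool:
    """Binary decision: does the text look like the profiled language?"""
    return absolute_confidence(text, profile, n) >= threshold


# Tiny illustrative corpus; a realistic profile needs far more training text.
english = build_profile(
    "the quick brown fox jumps over the lazy dog and the other dog", top_k=1000
)
print(is_language("the other dog", english))
print(is_language("zzzz qqqq", english))
```

Because the score depends only on the one profiled language, no second language is needed for comparison, which is what makes a single-language detector possible in this sketch.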

The new absolute confidence metric also improves accuracy in low accuracy mode: the mean detection accuracy (averaged over single words, word pairs and sentences) increases from 77% to 80%.

The tokenization of texts written in the Devanagari script was flawed. This has been fixed, which improves detection accuracy for Hindi and Marathi.

Python 3.13 is now officially supported.
Support for Python 3.8 and 3.9 has been dropped; the lowest supported Python version is now 3.10.

Please note: All new features and bug fixes will also be part of the next Rust-based Python extension release 2.1.0.

Source: README.md, updated 2024-10-29
