GROBID - Browse /0.9.0 at SourceForge.net

Name	Modified	Size	Downloads / Week
grobid-service-0.9.0.jar	2026-04-07	3.2 MB	0
grobid-trainer-0.9.0.jar	2026-04-07	16.4 MB	0
grobid-core-0.9.0.jar	2026-04-07	17.1 MB	0
0.9.0 source code.tar.gz	2026-04-07	500.5 MB	0
0.9.0 source code.zip	2026-04-07	508.6 MB	2
README.md	2026-04-07	4.4 kB	0
Totals: 6 Items		1.0 GB	2

What's Changed

Added

Conflict of interest and author contributions statement extraction in header and segmentation models [#1319]
Extract figures, tables and equations from back/annex sections [#1215]
Extract URLs from PDF annotations in fulltext [#1315]
Mark consolidated bibliographical references and header explicitly in TEI output [#1313]
Include middle name and format initials in BibTeX output [#1356]
Fetch ORCID from Crossref when not extracted by Grobid [#1406]
Timeout configuration for consolidation requests (separate glutton and Crossref timeout) [#1340]
Lingua as an alternative for language recognition [#1239]
Blingfire as an alternative sentence segmentation engine [#1378]
Native support for Linux ARM 64 architecture
Multi-architecture Docker builds with ARM64 support (pdfalto and wapiti binaries for Linux ARM 64)
Support for Python environment managers (virtualenv, conda) for DeLFT integration [#1010]
Added version and revision information in the web UI [#1390]
Added health status indicator with periodic updates in the web UI [#1403]
Added more explanation and links to documentation in the web UI [#1391]
More informative /api/health endpoint, failing early when models are partially initialised [#1373]
-modelPath CLI argument for training and eval-mode model loading [#1383], [#1389]
Evaluation script for running end-to-end evaluation from the repository root
Enabled trivy security code scanning [#1295]
Updated Citation.cff and SWID metadata [#1341]
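
The more informative /api/health endpoint listed above lends itself to a readiness check before documents are submitted. The following is a minimal sketch, not GROBID's own client code. It assumes a service listening on http://localhost:8070 and assumes that HTTP 200 means "ready" while any other answer (or no answer) means "not ready yet". The function names fetch_status and wait_until_healthy are hypothetical, and the response body is deliberately not parsed because its format is not described here.

```python
import time
import urllib.error
import urllib.request

# Assumption: a local GROBID service on port 8070; adjust to your deployment.
HEALTH_URL = "http://localhost:8070/api/health"


def fetch_status(url=HEALTH_URL, timeout=5.0):
    """Return the HTTP status code of a GET on url, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. still loading).
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return None


def wait_until_healthy(check=fetch_status, attempts=30, delay=2.0):
    """Poll check() until it returns 200; give up after `attempts` tries.

    Assumption: 200 signals a fully initialised service.
    """
    for attempt in range(attempts):
        if check() == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Passing the status check in as a callable keeps the polling logic independent of the network, so it can be exercised with a stub in place of a running service.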

Changed

Revised and updated the Crossref integration, with better handling of API limits and errors, in collaboration with Crossref team [#1398]
Upgraded to JDK 21 and Gradle 9 [#1321]
Updated TensorFlow to 2.17 with Python 3.10-3.11 support [#1188]
Updated pdfalto to 0.6.0
Updated wapiti to 1.5.1
Updated JEP to 4.2.2 [#1332]
Updated DeLFT to > 0.4.1 in documentation and Dockerfiles [#1400]
Updated JRuby to 9.4.12.1 and pragmatic segmenter [#1293]
Updated Docker base images from deprecated openjdk to eclipse-temurin (21.0.10_7)
Updated Dropwizard to address Trivy vulnerability in Docker image
Updated grobid-lucene-analyzers [#1346]
Updated dependency versions in build.gradle [#1377]
Extensive model retraining: header, segmentation, fulltext, article-light, and article-light-ref models updated across CRF, BidLSTM_CRF_FEATURES, and BidLSTM_ChainCRF_FEATURES architectures
Significant expansion of training data for segmentation, fulltext, header, name, and affiliation-address models
Refactored training framework for clearer extensibility [#1393]
Updated benchmark results [#1392]
Removed obsolete and unused models [#1367]
Enhanced documentation structure and clarity for newcomers [#1310], [#1382]
Return XML by default when no HTTP Accept header is provided [#1405]
CI speed-up [#1374]
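
Because the service now returns XML when no HTTP Accept header is provided, a client that depends on a particular response format may prefer to state it explicitly rather than rely on the default. The sketch below only builds such a request and does not send it. The helper name build_citation_request is hypothetical, and the base URL, the /api/processCitation path and the citations form field are assumptions about a typical local GROBID deployment, to be checked against the service documentation.

```python
import urllib.parse
import urllib.request


def build_citation_request(citation, accept="application/xml",
                           base_url="http://localhost:8070"):
    """Build (but do not send) a POST request with an explicit Accept header.

    Assumptions: the endpoint path and the `citations` form field reflect a
    typical GROBID service setup; verify both for your installation.
    """
    body = urllib.parse.urlencode({"citations": citation}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/processCitation",
        data=body,
        headers={"Accept": accept},
        method="POST",
    )
```

The returned object can be passed to urllib.request.urlopen once a service is running. Keeping construction separate from sending makes it easy to inspect the headers first.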

Fixed

Figures, tables and equations identifier uniqueness and overlapping IDs in body and annex [#1342]
IndexOutOfBoundsException in ORCID search by annotation [#1369]
Missing logic to correctly get conflicts and credits in the output TEI
BibTeX index bug [#1409]
Revision link format in the web UI [#1404]
German wordforms failing to load in the Lexicon [#1362]
Honour instance-level Wapiti params in train() [#1383]
Evaluation script now works from the repository root
Docker build crash caused by dynamic Python environment version fetching [#1348]
Dockerfile for ARM Linux [#1395]
Full Docker image build restored [#1371]
preload_embeddings.py crash when download directory doesn't exist
Security-oriented regex improvements [#1366]
Coveralls build and Gradle deprecations [#1347]
Numerous training data corrections across all models