LangExtract

LangExtract is a Python library developed by Google that uses large language models (LLMs) to extract structured information from unstructured text, such as clinical notes, research papers, or literary works, based on user-defined instructions. It turns free-form text into schema-constrained data while keeping each result traceable to the source material: every extracted entity is mapped to its exact location in the original text, so extractions can be inspected and validated in an automatically generated interactive HTML visualization. LangExtract works with Google Gemini models, OpenAI GPT models, and local LLMs served through Ollama, which lets it fit different deployment environments and compliance requirements. For long documents it combines chunking, multi-pass extraction, and parallel processing to improve recall while keeping the output structure consistent.
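The idea of source grounding can be illustrated with a small standard-library sketch. This is not LangExtract's implementation or API: the `GroundedExtraction` class and the `ground` function are hypothetical names invented here, and the exact-substring matching is a simplifying assumption. It only shows what it means for an extracted value to carry character offsets that point back into the source text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GroundedExtraction:
    """Hypothetical record: an extracted value plus its position in the source."""
    extraction_class: str
    extraction_text: str
    start: Optional[int]  # character offset where the text begins, or None
    end: Optional[int]    # character offset just past the text, or None


def ground(source: str, extraction_class: str, extraction_text: str) -> GroundedExtraction:
    """Locate extraction_text in source by exact substring match (an assumption).

    If the text cannot be found, the offsets stay None so the caller can
    flag the extraction as ungrounded instead of trusting it.
    """
    idx = source.find(extraction_text)
    if idx == -1:
        return GroundedExtraction(extraction_class, extraction_text, None, None)
    return GroundedExtraction(extraction_class, extraction_text, idx, idx + len(extraction_text))


note = "Patient was given 250 mg amoxicillin twice daily."
hit = ground(note, "medication", "amoxicillin")
miss = ground(note, "medication", "penicillin")

# A grounded extraction can always be checked against the source slice.
print(note[hit.start:hit.end])
print(miss.start is None)
```

Because each grounded result stores offsets rather than only a string, a reviewer or a visualization tool can highlight the exact span in the original document.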

Features

Precise source grounding for traceable, verifiable extractions
Schema-based structured outputs guided by few-shot examples
Optimized for long documents via chunking, multi-pass analysis, and parallelism
Interactive visualization in self-contained HTML for review and validation
Compatible with Gemini, OpenAI, and Ollama models (local and cloud)
Domain-agnostic and adaptable to new use cases without fine-tuning
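The long-document feature (chunking plus parallelism) can be sketched in the same spirit. The code below is an illustration under stated assumptions, not LangExtract's algorithm: `chunk_text`, `fake_extract`, `extract_long`, and the `TERMS` vocabulary are all hypothetical, and a fixed-vocabulary regex stands in for the LLM call. It shows overlapping chunks processed in parallel, with chunk-local offsets shifted back to document offsets and duplicates from the overlap removed.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical vocabulary; a real system would call an LLM instead.
TERMS = re.compile(r"aspirin|ibuprofen")


def chunk_text(text, max_chars, overlap):
    """Return (offset, chunk) pairs covering text with overlapping windows."""
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be non-negative and smaller than max_chars")
    step = max_chars - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append((start, text[start:start + max_chars]))
        if start + max_chars >= len(text):
            break  # this window already reaches the end of the text
    return chunks


def fake_extract(chunk):
    """Stand-in for a model call: return (start, end, value) within the chunk."""
    return [(m.start(), m.end(), m.group()) for m in TERMS.finditer(chunk)]


def extract_long(text, max_chars=40, overlap=10, workers=4):
    """Extract from each chunk in parallel and merge into document offsets.

    Assumes overlap >= len(longest term) - 1, so every term lies fully
    inside at least one chunk.
    """
    chunks = chunk_text(text, max_chars, overlap)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_chunk = list(pool.map(lambda c: fake_extract(c[1]), chunks))
    merged = set()  # a set drops duplicates found twice in overlapping regions
    for (offset, _), hits in zip(chunks, per_chunk):
        for start, end, value in hits:
            merged.add((offset + start, offset + end, value))
    return sorted(merged)


document = "aspirin " + "x" * 50 + " ibuprofen " + "y" * 30 + " aspirin"
results = extract_long(document)
for start, end, value in results:
    print(start, end, value)
```

The overlap is the design choice that matters here: without it, a term split across a chunk boundary would be missed, and without the deduplication step the same term could be reported twice.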

License

Apache License V2.0

Additional Project Details

Programming Language

Python

Related Categories

Python Libraries

Registered

2025-10-09
