LLM Scraper is a TypeScript library designed to extract structured data from webpages using large language models. Instead of relying on fragile HTML selectors or manual parsing rules, the tool interprets webpage content with language models and converts it into structured data according to a defined schema. Developers can specify the data structure using tools such as Zod or JSON Schema, enabling the model to extract relevant information directly into typed objects. LLM Scraper integrates browser automation through Playwright, allowing it to load webpages and process their content before sending it to a language model for interpretation. Multiple content processing modes are supported, including raw HTML, cleaned HTML, Markdown, extracted text, screenshots, and custom inputs, making it adaptable to a wide range of scraping scenarios. LLM Scraper also provides streaming output and code generation capabilities that help developers build reusable scraping workflows.

Features

  • Extracts structured data from webpages using large language models
  • Supports multiple LLM providers including GPT, Gemini, Llama, Qwen, and Sonnet
  • Schema-based data extraction using Zod or JSON Schema
  • Built on Playwright for automated webpage loading and interaction
  • Streaming mode for receiving partial structured outputs in real time
  • Code generation for creating reusable Playwright scraping scripts

Project Samples

Project Activity

See All Activity >

License

MIT License

Follow LLM Scraper

LLM Scraper Web Site

Other Useful Business Software
Gemini 3 and 200+ AI Models on One Platform Icon
Gemini 3 and 200+ AI Models on One Platform

Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

Build generative AI apps with Vertex AI. Switch between models without switching platforms.
Start Free
Rate This Project
Login To Rate This Project

User Reviews

Be the first to post a review of LLM Scraper!

Additional Project Details

Programming Language

TypeScript

Related Categories

TypeScript Artificial Intelligence Software

Registered

19 hours ago