Crawlee is a web scraping and browser automation library. It helps you build reliable crawlers. Fast.

Crawlee won't fix broken selectors for you (yet), but it helps you build and maintain your crawlers faster. When a website adds JavaScript rendering, you don't have to rewrite everything, only switch to one of the browser crawlers. When you later find a great API to speed up your crawls, flip the switch back. It keeps your proxies healthy by rotating them smartly with good fingerprints that make your crawlers look human-like. It's not unblockable, but it will save you money in the long run.

Crawlee is built by people who scrape for a living and use it every day to scrape millions of pages. Meet our community on Discord.

We believe websites are best scraped in the language they're written in. Crawlee runs on Node.js and it's built in TypeScript to improve code completion in your IDE, even if you don't use TypeScript yourself.
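
To make the idea of keeping proxies healthy through rotation more concrete, here is a minimal, dependency-free sketch. It is not Crawlee's implementation or API: the `ProxyRotator` class, its method names, and the `maxFailures` threshold are hypothetical and chosen only for illustration. It cycles through proxies round-robin and skips any proxy that has failed too many times in a row.

```typescript
// Hypothetical sketch of health-aware proxy rotation.
// This is NOT Crawlee's API; all names here are illustrative assumptions.
interface ProxyState {
  url: string;
  failures: number; // consecutive failures since the last success
}

class ProxyRotator {
  private readonly proxies: ProxyState[];
  private readonly maxFailures: number;
  private cursor = 0;

  constructor(urls: string[], maxFailures = 3) {
    if (urls.length === 0) {
      throw new Error('At least one proxy URL is required');
    }
    this.proxies = urls.map((url) => ({ url, failures: 0 }));
    this.maxFailures = maxFailures;
  }

  // Return the next proxy in round-robin order that is still considered healthy.
  next(): string {
    for (let i = 0; i < this.proxies.length; i++) {
      const candidate = this.proxies[this.cursor];
      this.cursor = (this.cursor + 1) % this.proxies.length;
      if (candidate.failures < this.maxFailures) {
        return candidate.url;
      }
    }
    throw new Error('All proxies are marked unhealthy');
  }

  // Record a failed request (for example, a block or a timeout).
  markFailure(url: string): void {
    const proxy = this.proxies.find((p) => p.url === url);
    if (proxy) proxy.failures += 1;
  }

  // A successful request resets the failure counter for that proxy.
  markSuccess(url: string): void {
    const proxy = this.proxies.find((p) => p.url === url);
    if (proxy) proxy.failures = 0;
  }
}
```

A caller would ask `next()` for a proxy before each request and report the outcome with `markSuccess` or `markFailure`. A real library would likely also track sessions, cookies, and fingerprints, which this sketch leaves out.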

Features

  • JavaScript & TypeScript
  • HTTP scraping
  • Headless browsers
  • Automatic scaling and proxy management
  • Queue and Storage
  • Helpful utils and configurability
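
As a rough illustration of what a crawl queue does, the sketch below shows a tiny in-memory URL queue that refuses duplicates. It is an assumption-laden toy, not Crawlee's request queue: the `UrlQueue` class and its methods are hypothetical, and the only normalization applied is dropping the `#fragment` part of a URL.

```typescript
// Hypothetical in-memory crawl queue with de-duplication.
// This is NOT Crawlee's queue API; names and behavior are illustrative only.
class UrlQueue {
  private readonly seen = new Set<string>();
  private readonly pending: string[] = [];

  // Drop the fragment so "page#top" and "page" count as the same URL.
  private static key(rawUrl: string): string {
    const hashIndex = rawUrl.indexOf('#');
    return hashIndex === -1 ? rawUrl : rawUrl.slice(0, hashIndex);
  }

  // Enqueue a URL; returns false if it was already seen.
  add(rawUrl: string): boolean {
    const key = UrlQueue.key(rawUrl);
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    this.pending.push(key);
    return true;
  }

  // Take the oldest pending URL, or undefined when the queue is empty.
  next(): string | undefined {
    return this.pending.shift();
  }

  // Number of URLs still waiting to be processed.
  get size(): number {
    return this.pending.length;
  }
}
```

A crawler loop would call `add` for every discovered link and `next` to pick the following page to fetch. Keeping the `seen` set separate from the pending list means a URL is never crawled twice, even after it has left the queue.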

Categories

Web Scrapers

License

Apache License V2.0

Additional Project Details

Programming Language

TypeScript

Related Categories

TypeScript Web Scrapers

Registered

2023-04-12