Name                       Modified    Size
README.md                  2025-05-12  1.7 kB
v0.6.3 source code.tar.gz  2025-05-12  6.0 MB
v0.6.3 source code.zip     2025-05-12  6.1 MB
Totals: 3 items, 12.1 MB

Release 0.6.3 (unreleased)

Features

  • extraction: add RegexExtractionStrategy for pattern-based extraction, with built-in patterns for emails, URLs, phone numbers, and dates, support for custom regexes, and an LLM-assisted pattern generator; also optimize HTML preprocessing via fit_html and enhance network response body capture (9b5ccac)
  • docker-api: introduce job-based polling endpoints (POST /crawl/job and GET /crawl/job/{task_id} for crawls, POST /llm/job and GET /llm/job/{task_id} for LLM tasks), backed by Redis task management with configurable TTL; move schemas to schemas.py and add the demo_docker_polling.py example (94e9959)
  • browser: improve profile management and cleanup: add process cleanup for existing Chromium instances on Windows/Unix, fix profile creation by passing the full browser config, ship detailed browser/CLI docs and an initial profile-creation test, and bump version to 0.6.3 (9499164)
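
The pattern-based extraction named in the first feature can be illustrated with the standard library alone. The following is a minimal sketch of the general idea, not RegexExtractionStrategy itself: the PATTERNS table, the extract function, and the record layout are all hypothetical, and the regexes are deliberately simplified rather than the ones shipped with the library.

```python
import re

# Hypothetical, simplified patterns for illustration only; these are NOT
# the built-in patterns of RegexExtractionStrategy.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "url": re.compile(r"https?://[^\s<>\"']+"),
}


def extract(text, patterns=PATTERNS):
    """Return one record per match, tagged with the label of its pattern."""
    results = []
    for label, pattern in patterns.items():
        for match in pattern.finditer(text):
            results.append(
                {"label": label, "value": match.group(0), "span": match.span()}
            )
    return results


sample = "Contact admin@example.com or see https://example.com/docs for details."
for record in extract(sample):
    print(record)

# A custom regex is just another entry in the mapping.
price_only = {"price": re.compile(r"\$\d+(?:\.\d{2})?")}
print(extract("Cost: $19.99", price_only))
```

Keeping the patterns in a label-to-regex mapping means a custom pattern needs no new code path, which is roughly what "support for custom regexes" suggests; how the library actually combines patterns is not stated in these notes.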

Fixes

  • crawler: remove automatic page closure in take_screenshot and take_screenshot_naive, preventing premature teardown; callers must now close pages explicitly (BREAKING CHANGE) (a3e9ef9)
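
For callers affected by this breaking change, one way to guarantee the page is still closed is a try/finally wrapper. The sketch below uses stand-ins throughout: FakePage, take_screenshot_stub, and capture_and_close are hypothetical names, and the real signatures of take_screenshot and take_screenshot_naive are not given in these notes.

```python
import asyncio


class FakePage:
    """Hypothetical stand-in for a browser page, for illustration only."""

    def __init__(self):
        self.closed = False

    async def screenshot(self):
        if self.closed:
            raise RuntimeError("page is closed")
        return b"fake-png-bytes"

    async def close(self):
        self.closed = True


async def take_screenshot_stub(page):
    # Mirrors the behaviour described above: capture only, leave the page open.
    return await page.screenshot()


async def capture_and_close(page):
    # The caller now owns the page lifetime, so close it even if capture fails.
    try:
        return await take_screenshot_stub(page)
    finally:
        await page.close()


page = FakePage()
data = asyncio.run(capture_and_close(page))
print(len(data), page.closed)
```

Code that takes several screenshots of the same page can skip the wrapper and close the page once at the end, which is the case the fix is meant to enable.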

Documentation

  • format bash scripts in docs/apps/linkdin/README.md so examples copy & paste cleanly (87d4b0f)
  • update the same README with full litellm argument details for correct script usage (bd5a9ac)

Refactoring

  • logger: centralize color codes behind an Enum in async_logger, browser_profiler, content_filter_strategy and related modules for cleaner, type-safe formatting (cd2b490)
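
Centralizing color codes behind an Enum could look roughly like the sketch below. The AnsiColor and colorize names are hypothetical and chosen for illustration; the actual enum and helper names used in async_logger are not shown in these notes. The escape sequences are standard ANSI SGR codes.

```python
from enum import Enum


class AnsiColor(Enum):
    """Hypothetical enum of ANSI escape codes, defined in one place."""

    RED = "\033[31m"
    GREEN = "\033[32m"
    YELLOW = "\033[33m"
    CYAN = "\033[36m"
    RESET = "\033[0m"


def colorize(message, color):
    """Wrap message in the given color, rejecting anything that is not an AnsiColor."""
    if not isinstance(color, AnsiColor):
        raise TypeError(f"expected AnsiColor, got {type(color).__name__}")
    return f"{color.value}{message}{AnsiColor.RESET.value}"


print(colorize("crawl finished", AnsiColor.GREEN))
```

Passing a raw string such as "green" raises TypeError instead of silently printing a malformed escape sequence, which is one reading of the "type-safe" benefit mentioned above.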

Experimental

  • start migration of logging stack to rich (WIP) (b2f3cb0)
Source: README.md, updated 2025-05-12