| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2026-03-20 | 1.3 kB | |
| v2.47.75 source code.tar.gz | 2026-03-20 | 776.1 kB | |
| v2.47.75 source code.zip | 2026-03-20 | 932.5 kB | |
| Totals: 3 Items | | 1.7 MB | 0 |

What's New
- PageData & Crawler trait abstractions for extensible crawl pipelines
- Proxy support for LLM HTTP requests (#378)
- Chrome remote_addr via CDP `Network.responseReceived`
- Remote cache for Chrome responses: dump & fallback support
Performance
- SIMD-accelerated byte scanning (memchr), unrolled FNV hash
- Trie: `Box<str>` keys + manual byte-walk + memchr dot scan
- Bloom filter bitmask addressing + inline early-exit
- Zero-alloc DNS cache hits via `Arc<[SocketAddr]>`
- Skip robots.txt for single-page crawls, TCP keepalive always
- Batch LRU cache eviction to reduce write-lock hold time
- `cache_mem` auto-enables `skip_browser` for Chrome mode
- Remove unnecessary Box allocations and redundant string conversions
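
To illustrate the zero-alloc DNS cache hit item above, here is a minimal sketch of the general technique, not spider's actual implementation. The `DnsCache` type and its `insert`/`get` methods are hypothetical names chosen for this example. The idea is that the resolved addresses are stored once as an `Arc<[SocketAddr]>`, so a cache hit only increments a reference count instead of copying a `Vec`.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;
use std::sync::{Arc, RwLock};

/// Hypothetical DNS cache used only for illustration.
#[derive(Default)]
pub struct DnsCache {
    entries: RwLock<HashMap<String, Arc<[SocketAddr]>>>,
}

impl DnsCache {
    /// Store the resolved addresses for a host.
    /// The shared slice is allocated here, once per insert.
    pub fn insert(&self, host: &str, addrs: Vec<SocketAddr>) {
        let shared: Arc<[SocketAddr]> = addrs.into();
        self.entries
            .write()
            .unwrap()
            .insert(host.to_owned(), shared);
    }

    /// Cache hit: cloning the `Arc` bumps a reference count and
    /// returns a handle to the same slice, with no new heap allocation.
    pub fn get(&self, host: &str) -> Option<Arc<[SocketAddr]>> {
        self.entries.read().unwrap().get(host).cloned()
    }
}
```

Two lookups for the same host return handles to the same underlying slice, which can be checked with `Arc::ptr_eq`. Only a read lock is taken on the hit path, so concurrent lookups do not block each other.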
Fixes
- Fall back to HTTP crawl when Chrome is unavailable (#373)
- Skip redirect-caused `net::ERR_ABORTED` instead of aborting navigation
- Deterministic CDP event listener shutdown via watch channel
- Remote cache fallback + 3s timeout for chrome_remote_cache path
- Add decentralized stubs for `set_url_parsed_direct` and `links_full`
Deps
- strum 0.28, phf 0.13, gemini-rust 1.7, chromey 2, llm_models_spider 0.1
Full Changelog: https://github.com/spider-rs/spider/compare/v2.47.52...v2.47.75