Scrapyd is an application (typically run as a daemon) that listens for requests to run spiders and spawns a process for each one. It runs multiple processes in parallel, allocating them to a fixed number of slots determined by the max_proc and max_proc_per_cpu options, and starts as many processes as those slots allow to handle the load.

Scrapyd can manage multiple projects, and each project can have multiple versions uploaded, but only the latest version is used for launching new spiders. A common (and useful) convention is to use the revision number from the version control tool that tracks your Scrapy project code as the version name, for example r23. Versions are not compared alphabetically but with a smarter algorithm (the same one packaging uses), so r10 compares greater than r9.
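To make the version ordering and the slot limit concrete, here is a minimal sketch. It is not Scrapyd's code: natural_key is a hypothetical helper that uses a simple numeric-aware sort in place of the packaging algorithm, and slot_count assumes that a max_proc of 0 means "derive the limit from the CPU count multiplied by max_proc_per_cpu", which should be checked against the Scrapyd configuration documentation.

```python
import re


def natural_key(version):
    """Split a version string into alternating text and integer chunks.

    Hypothetical helper: a stand-in for the smarter comparison, not the
    algorithm Scrapyd actually uses.
    """
    parts = re.split(r"(\d+)", version)
    return [int(part) if part.isdigit() else part for part in parts]


def latest_version(versions):
    """Return the version that sorts highest under the numeric-aware key."""
    return max(versions, key=natural_key)


def slot_count(max_proc, max_proc_per_cpu, cpus):
    """Number of process slots (assumed rule, see the note above).

    A non-zero max_proc is taken as the limit; otherwise the limit is
    derived from the CPU count.
    """
    if max_proc:
        return max_proc
    return cpus * max_proc_per_cpu


versions = ["r9", "r10", "r2"]
print(sorted(versions))          # plain alphabetical order puts r10 before r2
print(latest_version(versions))  # numeric-aware order picks r10
print(slot_count(max_proc=0, max_proc_per_cpu=4, cpus=2))
```

The key function splits on digit runs so that r10 and r9 are compared as the numbers 10 and 9 rather than character by character.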

Features

  • Scrapyd is a service for running Scrapy spiders
  • It allows you to deploy your Scrapy projects and control their spiders using an HTTP JSON API
  • Documentation available
  • Scrapyd comes with a minimal web interface for monitoring running processes and accessing logs
  • You can use ScrapydWeb to manage your Scrapyd cluster
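
As an illustration of controlling spiders through the HTTP JSON API, the sketch below builds a request for Scrapyd's schedule.json endpoint using only the Python standard library. The address http://localhost:6800, the project name "myproject", and the spider name "myspider" are assumptions for the example; build_schedule_request and send are hypothetical helper names, not part of Scrapyd.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Scrapyd is running locally on its usual port.
SCRAPYD_URL = "http://localhost:6800"


def build_schedule_request(project, spider, base_url=SCRAPYD_URL):
    """Build (but do not send) a POST request that schedules a spider run."""
    form = urllib.parse.urlencode({"project": project, "spider": spider})
    return urllib.request.Request(
        f"{base_url}/schedule.json",
        data=form.encode("utf-8"),
        method="POST",
    )


def send(request, timeout=10):
    """Send the request and decode the JSON body of the response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder names; replace them with a deployed project and spider.
request = build_schedule_request("myproject", "myspider")
print(request.get_method(), request.full_url)
# With a Scrapyd instance running, send(request) would return the parsed reply.
```

Building the request separately from sending it keeps the example runnable without a live Scrapyd instance.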

Categories

Web Scrapers

License

BSD License

Additional Project Details

Programming Language

Python

Related Categories

Python Web Scrapers

Registered

2023-04-10