tumblr-crawler is an open-source Python utility for downloading media content from Tumblr blogs. It provides a script that retrieves photos and videos from specified Tumblr sites and saves them locally for offline access. Users can specify one or more blogs to crawl by editing a configuration file or by passing arguments on the command line. When executed, the script fetches media via the Tumblr API and stores the downloaded files in folders named after each blog. tumblr-crawler skips files that have already been saved, so repeated runs are safe and useful for resuming interrupted downloads or filling in missing files. It also supports optional proxy configuration, which helps when access to Tumblr requires routing requests through a proxy server. With simple dependencies and straightforward configuration, the project offers a practical way to archive media from Tumblr blogs.
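The configuration-file and command-line workflows described above might look like the following sketch. The `sites.txt` file name and the script invocation are assumptions based on the project's described conventions, not a guaranteed interface:

```shell
# Hypothetical configuration file: one blog name per line.
printf 'blog-one\nblog-two\n' > sites.txt
cat sites.txt

# Alternatively, blogs can reportedly be passed on the command line,
# along the lines of (script name is an assumption):
#   python tumblr-crawler.py blog-one,blog-two
```

Downloaded media would then land in per-blog folders such as `blog-one/` and `blog-two/`.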

Features

  • Downloads photos and videos from specified Tumblr blogs
  • Supports multiple blogs through a configuration file or CLI arguments
  • Saves media into folders automatically named after each blog
  • Prevents duplicate downloads when the script is run multiple times
  • Optional proxy configuration for network-restricted environments
  • Multi-threaded downloading to process media efficiently
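The folder layout, duplicate avoidance, and multi-threaded downloading listed above could be sketched as follows. The function name and the injected `fetch` callable are illustrative assumptions, not the project's actual API:

```python
# Sketch of per-blog folders, skip-if-exists downloads, and a thread
# pool; names and structure are assumptions for illustration only.
import os
from concurrent.futures import ThreadPoolExecutor


def download_blog_media(blog, urls, fetch, root=".", workers=4):
    """Save each media URL into a folder named after the blog,
    skipping files that already exist so repeated runs are safe.
    `fetch(url)` is any callable returning the file's bytes."""
    folder = os.path.join(root, blog)
    os.makedirs(folder, exist_ok=True)

    def save(url):
        path = os.path.join(folder, url.rsplit("/", 1)[-1])
        if os.path.exists(path):        # already downloaded: skip
            return None
        with open(path, "wb") as f:
            f.write(fetch(url))
        return path

    # Multi-threaded downloading, mirroring the feature above.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return [p for p in pool.map(save, urls) if p is not None]
```

On a second run the `os.path.exists` check short-circuits every file already on disk, which is what makes re-running the crawler cheap.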


Categories

Web Scrapers



Additional Project Details

Programming Language

Python

Related Categories

Python Web Scrapers
