DouyinCrawler is an open source data collection tool for gathering publicly available information from the Douyin platform. The project demonstrates how to combine a Python-based web crawler with a graphical interface and command line functionality. It collects data from several types of Douyin content, including user profiles, videos, hashtags, and music pages, and supports both automated scraping and batch operations for processing multiple targets efficiently. Integration with the Aria2 download utility enables large-scale downloading of the videos and images associated with collected content. Usage modes include a desktop GUI, a web service interface, and a command line tool, which allows flexible deployment. Beyond data collection, DouyinCrawler supports incremental updates, so users can track and gather newly published content without reprocessing previously collected data.
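The incremental update idea can be illustrated with a minimal Python sketch. This is not DouyinCrawler's actual implementation: the function names, the JSON state file, and the assumption that each collected item is a dict with an "id" key are all hypothetical, chosen only to show how previously collected content can be skipped on later runs.

```python
from __future__ import annotations

import json
from pathlib import Path


def load_seen_ids(state_path: Path) -> set[str]:
    """Return the IDs recorded by earlier runs, or an empty set on the first run."""
    if not state_path.exists():
        return set()
    return set(json.loads(state_path.read_text(encoding="utf-8")))


def select_new_items(items: list[dict], seen_ids: set[str]) -> list[dict]:
    """Keep only items whose 'id' is not yet recorded, preserving input order."""
    new_items = []
    batch_ids: set[str] = set()  # guards against duplicates within one batch
    for item in items:
        item_id = item["id"]  # hypothetical field name
        if item_id in seen_ids or item_id in batch_ids:
            continue
        batch_ids.add(item_id)
        new_items.append(item)
    return new_items


def save_seen_ids(state_path: Path, seen_ids: set[str]) -> None:
    """Persist the recorded IDs so the next run can skip them."""
    state_path.write_text(json.dumps(sorted(seen_ids)), encoding="utf-8")
```

A run under these assumptions would load the state, filter the freshly fetched items with select_new_items, process only the result, and then save the union of old and new IDs.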
Features
- Collect public data from Douyin accounts, videos, hashtags, music, and topics
- Retrieve user information, including follower and following lists
- Incremental crawling to collect only newly published content
- Batch processing using input files containing multiple targets
- Integrated Aria2 support for downloading videos and images
- Multiple operation modes including GUI desktop app, web service, and CLI
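
The batch processing and Aria2 features above can be sketched as follows. This is an illustration under stated assumptions, not the tool's own code: the one-target-per-line input format, the "#" comment convention, and both function names are hypothetical. The generated text follows the aria2 input file format, where each URI line may be followed by indented option lines such as dir and out.

```python
from __future__ import annotations

from pathlib import Path


def read_targets(path: Path) -> list[str]:
    """Read one target per line, skipping blanks, '#' comments, and duplicates."""
    targets: list[str] = []
    seen: set[str] = set()
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # assumed comment convention
            continue
        if line not in seen:
            seen.add(line)
            targets.append(line)
    return targets


def build_aria2_input(downloads: list[tuple[str, str]], out_dir: str) -> str:
    """Build aria2 input-file text from (url, filename) pairs.

    Each URI line is followed by indented per-download options.
    """
    lines: list[str] = []
    for url, filename in downloads:
        lines.append(url)
        lines.append(f"  dir={out_dir}")
        lines.append(f"  out={filename}")
    return "".join(line + "\n" for line in lines)
```

Writing the returned text to a file such as downloads.txt and running aria2c --input-file=downloads.txt would then hand the whole batch to Aria2 in one call.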