weibo-crawler is a Python tool that collects data from Sina Weibo user accounts. It automates gathering posts, profile details, and engagement metrics from one or more target accounts. For each user, it extracts profile attributes such as nickname, follower count, following count, and other account metadata. For each post, it captures the content, publishing time, topics, mentions, likes, reposts, and comments. Beyond text, it can download the original media attached to posts, including images, videos, and Live Photo content. Collected data can be exported to CSV or JSON files or stored in a database for later analysis and research. Incremental crawling is supported, so periodic runs collect only newly published posts, which suits ongoing monitoring and dataset updates.
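To make the export step concrete, here is a minimal sketch of writing post records to JSON and CSV using only the Python standard library. It is not weibo-crawler's own code: the record fields (`id`, `text`, `created_at`, `likes`) and the helper names `export_json` and `export_csv` are assumptions chosen for illustration.

```python
import csv
import json

# Hypothetical post records. The field names are illustrative only and
# are not taken from weibo-crawler's actual output schema.
posts = [
    {"id": "1001", "text": "First post #topic#", "created_at": "2024-01-01 08:00:00", "likes": 12},
    {"id": "1002", "text": "Second post @someone", "created_at": "2024-01-02 09:30:00", "likes": 5},
]


def export_json(records, path):
    """Write records to a JSON file, keeping non-ASCII text readable."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)


def export_csv(records, path):
    """Write records to a CSV file, one row per record."""
    if not records:
        return  # nothing to write
    # utf-8-sig adds a BOM so spreadsheet tools detect the encoding
    # and display Chinese text correctly.
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.DictWriter(f, fieldnames=list(records[0].keys()))
        writer.writeheader()
        writer.writerows(records)
```

Calling `export_json(posts, "posts.json")` and `export_csv(posts, "posts.csv")` would write both files to the current directory.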
Features
- Crawls posts and profile data from one or multiple Sina Weibo users
- Extracts detailed post metadata including text, topics, mentions, and timestamps
- Downloads media such as images, videos, and Live Photo content from posts
- Exports collected data to CSV or JSON files or stores it in databases
- Supports downloading comments and repost information from posts
- Allows incremental crawling to collect newly published posts over time
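The idea behind incremental crawling can be sketched as a checkpoint filter: remember the publish time of the newest post already collected, and on the next run keep only posts published after it. The function name `select_new_posts` and the `created_at` field below are assumptions for illustration, not weibo-crawler's actual implementation or configuration.

```python
from datetime import datetime


def select_new_posts(posts, last_seen):
    """Return (new_posts, checkpoint).

    new_posts holds the posts published strictly after last_seen,
    sorted oldest first. checkpoint is the publish time of the newest
    such post, or last_seen unchanged if there were none.
    """
    new_posts = sorted(
        (p for p in posts if p["created_at"] > last_seen),
        key=lambda p: p["created_at"],
    )
    checkpoint = new_posts[-1]["created_at"] if new_posts else last_seen
    return new_posts, checkpoint


# Hypothetical fetched posts; in practice these would come from a crawl.
fetched = [
    {"id": "3", "created_at": datetime(2024, 1, 3, 12, 0)},
    {"id": "1", "created_at": datetime(2024, 1, 1, 12, 0)},
    {"id": "2", "created_at": datetime(2024, 1, 2, 12, 0)},
]

# Checkpoint saved by the previous run.
last_seen = datetime(2024, 1, 1, 12, 0)
new_posts, last_seen = select_new_posts(fetched, last_seen)
```

Persisting the returned checkpoint between runs (for example in a small file) lets each run skip everything that was already collected.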