Pholcus is a high-concurrency crawler written in pure Go, with support for distributed operation. It offers three operating modes (stand-alone, server, and client) and three interfaces (Web, GUI, and command line). Its rules are simple and flexible, tasks can run concurrently in batches, and output options are rich (mysql, mongodb, kafka, csv, excel, etc.). It also supports horizontal and vertical crawling modes, along with advanced features such as simulated login and task pause and cancellation. Pholcus aims to give users with some Go or JS programming background a heavyweight, full-featured crawler tool in which they only need to concern themselves with customizing rules. This software is intended solely for programming study and academic research; users must abide by the laws and regulations of their location and must not use it for illegal purposes.
Features
- Supports three operating modes: stand-alone, server, and client
- Supports status control, such as pause, resume, and stop
- The number of concurrent goroutines can be controlled
- Supports random pauses during collection to simulate human behavior
- Provides multiple output methods, including mysql, mongodb, kafka, csv, excel, and original file download
- Supports batch output with a configurable batch size
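
As a rough illustration of three of the features above (a bounded number of concurrent goroutines, random pauses, and batched output), here is a minimal, generic Go sketch. It does not use Pholcus's own API: `crawl`, `run`, `maxWorkers`, `batchSize`, and `flush` are hypothetical names chosen for this example, and the URLs are placeholders.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

// crawl is a hypothetical stand-in for fetching and parsing one URL.
// It sleeps for a random 0-49 ms to imitate the irregular pace of a human visitor.
func crawl(url string) string {
	time.Sleep(time.Duration(rand.Intn(50)) * time.Millisecond)
	return "result for " + url
}

// run crawls urls with at most maxWorkers goroutines in flight and hands
// the results to flush in batches of at most batchSize items.
// flush is always called from the goroutine that called run.
func run(urls []string, maxWorkers, batchSize int, flush func([]string)) {
	sem := make(chan struct{}, maxWorkers) // counting semaphore
	results := make(chan string)
	var wg sync.WaitGroup

	// Start workers from a separate goroutine so that the caller is free
	// to drain the results channel below.
	go func() {
		for _, u := range urls {
			sem <- struct{}{} // blocks while maxWorkers crawls are running
			wg.Add(1)
			go func(u string) {
				defer wg.Done()
				defer func() { <-sem }() // release the slot
				results <- crawl(u)
			}(u)
		}
		wg.Wait()
		close(results)
	}()

	// Collect results and flush whenever a batch is full.
	batch := make([]string, 0, batchSize)
	for r := range results {
		batch = append(batch, r)
		if len(batch) == batchSize {
			flush(batch)
			batch = make([]string, 0, batchSize)
		}
	}
	if len(batch) > 0 {
		flush(batch) // final, possibly smaller batch
	}
}

func main() {
	urls := []string{
		"https://example.com/1", "https://example.com/2",
		"https://example.com/3", "https://example.com/4",
		"https://example.com/5", "https://example.com/6",
		"https://example.com/7",
	}
	run(urls, 2, 3, func(batch []string) {
		fmt.Printf("writing batch of %d items\n", len(batch))
	})
}
```

With seven URLs and a batch size of 3, `flush` is called three times, receiving 3, 3, and 1 items. In a real crawler the `flush` callback would write to one of the output targets (for example a database or a csv file) rather than print.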