Showing 10 open source projects for "python crawler"

  • 1
    Crawl4AI

    Open-source LLM Friendly Web Crawler & Scraper

    Crawl4AI is a high-performance, AI-ready web crawler tailored for LLM data ingestion and RAG pipelines. It supports adaptive crawling heuristics (stopping when enough information has been gathered), structured markdown output, and high-speed parallel execution. It is designed to operate at scale, with optional Docker deployment and framework integrations.
    Downloads: 0 This Week
  • 2
    Roach

    The complete web scraping toolkit for PHP

    Roach is a complete web scraping toolkit for PHP. It is a shameless clone heavily inspired by the popular Scrapy package for Python. Roach allows us to define spiders that crawl and scrape web documents. But wait, there’s more. Roach isn’t just a simple crawler, but includes an entire pipeline to clean, persist and otherwise process extracted data as well. It’s your all-in-one resource for web scraping in PHP. Roach doesn’t depend on a specific framework. Instead, you can use the core package on its own or install one of the framework-specific adapters. ...
    Downloads: 0 This Week
  • 3
    Gerapy

    Distributed Crawler Management Framework Based on Scrapy

    Gerapy is a distributed crawler management framework based on Scrapy, Scrapyd, Scrapyd-Client, Scrapyd-API, Django and Vue.js. Anyone who has written crawlers in Python has probably used Scrapy. It is a very powerful crawler framework with high crawling efficiency and good scalability, and it is practically an essential tool for developing crawlers in Python.
    Downloads: 0 This Week
  • 4
    Crawlab

    Distributed web crawler admin platform for spiders management

    Golang-based distributed web crawler management platform, supporting various languages including Python, NodeJS, Go, Java and PHP, and various web crawler frameworks including Scrapy, Puppeteer and Selenium. Use docker-compose to start everything with one command; that way you don't even have to configure a MongoDB database. The frontend app interacts with the master node, which communicates with other components such as MongoDB, SeaweedFS and worker nodes.
    Downloads: 0 This Week
  • 5
    CEF Python

    Python bindings for the Chromium Embedded Framework (CEF)

    CEF Python is an open source project founded by Czarek Tomczak in 2012 to provide Python bindings for the Chromium Embedded Framework (CEF). The Chromium project focuses mainly on Google Chrome application development, while CEF focuses on facilitating embedded browser use cases in third-party applications. Many applications use the CEF control; there are more than 100 million CEF instances installed around the world. There are...
    Downloads: 0 This Week
  • 6
    pyspider

    A powerful Spider(Web Crawler) system in Python

    pyspider is a powerful spider (web crawler) system in Python. Components are connected by a message queue. Every component, including the message queue, runs in its own process or thread and is replaceable. That means that when processing is slow, you can run many instances of the processor to make full use of multiple CPUs, or deploy to multiple machines. This architecture makes pyspider really fast.
    Downloads: 0 This Week
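The queue-connected design described for pyspider can be illustrated with a minimal sketch using only the Python standard library. This is not pyspider's actual API: the names `fetcher`, `processor`, `run_pipeline` and `SENTINEL` are hypothetical, and the "fetch" step fabricates a page body instead of touching the network. It only shows the general idea of independent components exchanging work through queues, with several processor instances running side by side.

```python
import queue
import threading

SENTINEL = None  # hypothetical shutdown marker, not part of pyspider


def fetcher(urls, fetched_q, n_processors):
    """Stand-in fetcher: queues a fake page body for each URL (no network)."""
    for url in urls:
        fetched_q.put((url, f"<html>{url}</html>"))
    # One shutdown marker per processor instance so every worker exits.
    for _ in range(n_processors):
        fetched_q.put(SENTINEL)


def processor(fetched_q, result_q):
    """Stand-in processor: consumes fetched pages and emits (url, body length)."""
    while True:
        item = fetched_q.get()
        if item is SENTINEL:
            break
        url, body = item
        result_q.put((url, len(body)))


def run_pipeline(urls, n_processors=3):
    """Wire one fetcher to several processors through queues; return sorted results."""
    fetched_q = queue.Queue()
    result_q = queue.Queue()

    workers = [
        threading.Thread(target=processor, args=(fetched_q, result_q))
        for _ in range(n_processors)
    ]
    for worker in workers:
        worker.start()

    producer = threading.Thread(target=fetcher, args=(urls, fetched_q, n_processors))
    producer.start()
    producer.join()
    for worker in workers:
        worker.join()

    results = []
    while not result_q.empty():
        results.append(result_q.get())
    return sorted(results)


if __name__ == "__main__":
    print(run_pipeline(["page-a", "page-b", "page-c"]))
```

Threads keep the sketch short; for CPU-bound processing, the same layout could be built on `multiprocessing` queues and processes so that the workers actually spread across multiple CPUs.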
  • 7

    Python Crawler Library

    Python Web Crawler Library

    A simple library for crawling the web. This library lets you create macros for crawling websites and performing simple actions, such as logging in.
    Downloads: 0 This Week
  • 8
    elk is a powerful open-source Python-based command-line web crawler that can recursively search for files and text on websites.
    Downloads: 0 This Week
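Recursive text search of the kind the elk entry describes can be sketched as follows. This is not elk's code or interface: `LinkParser`, `search_site`, the `fetch` callback and the in-memory `PAGES` site are all hypothetical, and the example reads pages from a dictionary rather than over HTTP so that it runs offline. It follows links depth-first, skips URLs it has already seen, and stops at a depth limit.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collects href values from <a> tags and the text content of a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.text.append(data)


def search_site(start_url, needle, fetch, max_depth=2):
    """Return sorted URLs whose text contains `needle`, following links recursively.

    `fetch` is a caller-supplied function mapping a URL to HTML (or None).
    """
    seen = set()
    hits = []

    def visit(url, depth):
        if url in seen or depth > max_depth:
            return
        seen.add(url)
        html = fetch(url)
        if html is None:
            return
        parser = LinkParser()
        parser.feed(html)
        if needle in "".join(parser.text):
            hits.append(url)
        for href in parser.links:
            # Resolve relative links against the page they appear on.
            visit(urljoin(url, href), depth + 1)

    visit(start_url, 0)
    return sorted(hits)


# Hypothetical in-memory "website" standing in for real HTTP responses.
PAGES = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a> home',
    "http://example.test/a": '<p>python crawler</p> <a href="/">back</a>',
    "http://example.test/b": '<p>nothing here</p> <a href="/a">A</a>',
}

if __name__ == "__main__":
    print(search_site("http://example.test/", "crawler", PAGES.get))
```

A real crawler would replace `PAGES.get` with an HTTP client and would also need to respect robots.txt, rate limits and same-site restrictions, none of which this sketch attempts.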
  • 9
    Nomad is a tiny but efficient search engine and web crawler. It works very well for searching within a set of corporate websites on the internet and/or an intranet's HTML documents or knowledge repositories.
    Downloads: 0 This Week
  • 10
    Webhunter is a distributed, multi-threaded web crawler designed for both general indexing and crawling the web for focused content.
    Downloads: 0 This Week