Showing 411 open source projects for "python web crawler"

  • 1
    Spatie Crawler

    An easy to use, powerful crawler implemented in PHP

    Spatie Crawler is a PHP library that allows developers to crawl websites and extract information efficiently. It can be used for web scraping, link checking, or automated testing of web pages. The library is simple to use and supports customizable crawling strategies, including controlling crawl depth and handling redirects. It’s suitable for building crawlers that navigate large or dynamically generated websites.
    Downloads: 0 This Week
  • 2
    Roach

    The complete web scraping toolkit for PHP

    Roach is a complete web scraping toolkit for PHP. It is a shameless clone heavily inspired by the popular Scrapy package for Python. Roach allows us to define spiders that crawl and scrape web documents. But wait, there’s more. Roach isn’t just a simple crawler, but includes an entire pipeline to clean, persist and otherwise process extracted data as well. It’s your all-in-one resource for web scraping in PHP. Roach doesn’t depend on a specific framework. Instead, you can use the core package...
    Downloads: 5 This Week
  • 3
    Crawlab

    Distributed web crawler admin platform for spiders management

    Golang-based distributed web crawler management platform, supporting various languages including Python, NodeJS, Go, Java, PHP and various web crawler frameworks including Scrapy, Puppeteer, Selenium. Use docker-compose for a one-click startup; that way you don't even have to configure a MongoDB database. The frontend app interacts with the master node, which communicates with other components such as MongoDB, SeaweedFS and worker nodes. Master node and worker nodes communicate...
    Downloads: 1 This Week
  • 4
    crwlr

    Library for Rapid (Web) Crawler and Scraper Development

    This library provides a kind of framework and many ready-to-use building blocks, called steps, that you can use to build your own crawlers and scrapers. Before diving into the library, let's have a look at the terms crawling and scraping. For most real-world use cases, those two things go hand in hand, which is why this library helps with and combines both. A (web) crawler is a program that (down)loads documents and follows the links in them to load those documents as well. A crawler could...
    Downloads: 1 This Week
  • 5
    Requests for PHP

    Requests for PHP is a humble HTTP request library

    Requests is an HTTP library written in PHP, for human beings. It is roughly based on the API from the excellent Requests Python library. Requests is ISC Licensed (similar to the new BSD license) and has no dependencies, except for PHP 5.6+. Despite PHP’s use as a language for the web, its tools for sending HTTP requests are severely lacking. cURL has an interesting API, to say the least, and you can’t always rely on it being available. Sockets provide only low-level access and require you...
    Downloads: 1 This Week
  • 6
    CssSelector Component

    Converts CSS selectors to XPath expressions

    ... to an XPath equivalent. This XPath expression can then be used with other functions and classes that use XPath to find elements in a document. Not all CSS selectors can be converted to XPath equivalents. There are several CSS selectors that only make sense in the context of a web-browser. Pseudo-elements (:before, :after, :first-line, :first-letter) are not supported because they select portions of text rather than elements.
    Downloads: 0 This Week
  • 7
    Goutte

    Goutte, a simple PHP Web Scraper

    Goutte is a screen scraping and web crawling library for PHP. Goutte provides a nice API to crawl websites and extract data from the HTML/XML responses. Goutte depends on PHP 7.1+. Add fabpot/goutte as a require dependency in your composer.json file. Create a Goutte Client instance (which extends Symfony\Component\BrowserKit\HttpBrowser). Make requests with the request() method. The method returns a Crawler object (Symfony\Component\DomCrawler\Crawler). To use your own HTTP settings, you may...
    Downloads: 0 This Week
  • 8
    Ascoos Web Extended Studio

    A portable web server suite for 64-bit Windows, for web development.

    The Ascoos Web Extended Studio is a special 64-bit freeware web server suite for all web developers and designers, based on Apache, PHP, MariaDB, MongoDB, FileZilla and others. It lets the user run different versions of PHP and MariaDB. It is structured for easy upgrading: each new version of the Ascoos Web Extended Studio includes the latest versions of the individual programs without removing earlier versions. So, you have the opportunity for experiments...
    Downloads: 4 This Week
  • 9
    s3cmd

    Command line tool for managing Amazon S3 and CloudFront services

    Open-source tool to access Amazon S3 file storage. S3cmd is a free command line tool and client for uploading, retrieving and managing data in Amazon S3 and other cloud storage service providers that use the S3 protocol, such as Google Cloud Storage. Lots of features and options have been added to s3cmd since its very first release in 2008.... we recently counted more than 60 command line options, including multipart uploads, encryption, incremental backup, s3 sync, ACL and Metadata...
    Downloads: 1,335 This Week
  • 10
    Ascoos Web Server

    A web server for all web developers and web designers

    For PHP 5.6 - 8.4.X, see Ascoos Web Extended Studio (AWES): https://sourceforge.net/projects/ascoos-web-extended-studio/ ASCOOS Web Server is a rich package designed as a versatile web server for development purposes. It incorporates third-party components such as PHP, MySQL, pgSQL, MongoDB and FileZilla and stands out through a compact setup and a well-built administrative panel. ASCOOS Web Server allows you to work with multiple versions of PHP and MySQL without having to re...
    Downloads: 2 This Week
  • 11
    Network Security Toolkit (NST)

    A network security analysis and monitoring toolkit Linux distribution.

    ... in the toolkit. An advanced Web User Interface (WUI) is provided for system/network administration, navigation, automation, network monitoring, host geolocation, network analysis and configuration of many network and security applications found within the NST distribution. In the virtual world, NST can be used as a network security analysis, validation and monitoring tool on enterprise virtual servers hosting virtual machines.
    Downloads: 203 This Week
  • 12
    Render Farm Manager, Project Tracker.

    CGRU: Afanasy render farm manager and RULES project tracker.

    CGRU is an open source CG tools pack that includes the Afanasy render farm manager and the RULES project tracker.
    Downloads: 21 This Week
  • 13
    StrongKey FIDO Server (SKFS)

    FIDO® Certified StrongKey FIDO Server (SKFS)

    An open source implementation of the FIDO2 protocol to support passwordless strong authentication using public-key cryptography. Supports registration, authentication (all platforms), and transaction authorization (for native Android apps).
    Downloads: 15 This Week
  • 14
    ahCrawler

    A PHP search engine for your website and web analytics tool. GNU GPL3

    ahCrawler is a toolset for implementing your own search on your website and an analyzer for your web content. It can be used on shared hosting. It consists of:
      * crawler (spider) and indexer
      * search for your website(s)
      * search statistics
      * website analyzer (http header, short titles and keywords, linkchecker, ...)
    You need to install it on your own server, so all crawled data stays in your environment. You never know when an external webspider updated your content. Trigger a rescan...
    Downloads: 2 This Week
  • 15
    LAMP package in Complete Virtual Machine

    A Quick LAMP/WAMP/MAMP/XAMPP Pkg for development, testing & production

    This VM was created for 2 reasons: 1. Very little initial setup work is required to develop / test / deploy a dynamic web application live, within minutes. 2. The system should keep running for years, without requiring updates or suffering breakages. If you are new to virtual machines, then please watch the video below (taken from my other sample PHP project called teamdocs). You may clean up this application from the htdocs home folder link and its corresponding database after logging into mysql. The mysql...
    Downloads: 7 This Week
  • 16
    elFinder
    elFinder is a file manager for the web, similar to the one you use on your computer. Written in JavaScript using jQuery UI, it just works in any modern browser. Its creation was inspired by the simplicity and convenience of the Finder.app program used in Mac OS X.
    Downloads: 3 This Week
  • 17
    MailCleaner

    Anti Spam SMTP Gateway

    MailCleaner is no longer maintained. It will return soon in another form. [antispam] MailCleaner is an anti-spam / anti-virus filter SMTP gateway with user and admin web interfaces, quarantine, multi-domains, multi-templates, multi-languages. Using Bayes, RBLs, Spamassassin, MailScanner, ClamAV. Based on Debian. Enterprise ready. MailCleaner is an anti spam gateway installed between your mail infrastructure and the Internet. It includes a complete GNU/Linux OS and a graphical web interface...
    Downloads: 3 This Week
  • 18
    This is a C library to check the validity of German and Austrian bank account numbers. All test methods currently defined by Deutsche Bundesbank (Dec 2017: 00 to E4) are implemented. Modules for AWK, Perl, PHP, Python, Ruby, C#.net and VB.net are included too. The package also includes an IBAN converter to generate (German) IBANs and BICs from account data. All IBAN rules currently defined by Deutsche Bundesbank are implemented (Dec 2017: 57 rules) and tested against independent solutions.
    Downloads: 3 This Week
  • 19
    CerberusCMS5

    Cerberus Content Management System

    Cerberus Content Management System is a dynamic, secure and infinitely expandable CMS designed after a Unix-like model. It is a custom-written Web Application Framework ( W.A.F. ) with a consistent and custom-written Pre-Hyper-Text-Post-Processor Programming Code Framework ( P.C.F. ). This web application software project's aim is to be the fastest and most secure Web Application Framework, Web Application Programming Code Framework, Text, Voice and Video Communications Platform and Content...
    Downloads: 2 This Week
  • 20
    LAPP package in Complete Virtual Machine

    Quick Apache/PHP/Postgresql Pkg for development, testing & production

    This VM was created for 2 reasons: 1. Very little initial setup work is required to develop / test / deploy a dynamic web application live. 2. The system should keep running for years, without requiring updates or suffering breakages. If you are new to virtual machines, then please watch the video below (taken from my other PHP project called teamdocs; just replace td with lapp wherever mentioned). Upload your PHP files using the free FileZilla FTP client software to this server, called lapp.local...
    Downloads: 1 This Week
  • 21
    ALMA pyconv

    A converter for Python 2 source files

    This script:
      - as a web script, does a limited conversion on Python 2 source files (for example, to run them with Skuplt) so that the source can be written in ALMA Python syntax. It comes with a complete web directory ('converter') and can be used by the page 'converter.php' on a server (but its main purpose is an online demonstration only!)
      - as a console script, does a limited conversion on Python 2 source files so that the source can be written in ALMA Python syntax (but its main...
    Downloads: 0 This Week
  • 22
    kepzeletmuhely

    a writers' guild

    An application of BookStack, an open source wiki engine, and external services (Python scripts) to ensure community-controlled publishing.
    Downloads: 0 This Week
  • 23
    Warehouse Controls

    A versatile building control system using ESP32 type controllers.

    A building control system that combines an Access control database with an easy graphical user interface. From its conception, the system includes an array of controllers and sensors to allow the monitoring of occupancy, access, lighting and HVAC.
    Downloads: 0 This Week
  • 24
    Work'in Memories

    computer activity log & webservice, watch your team computers progress

    Automatic working time tracking software and computer activity log. Enrich your time with the free web server, watch your team computers' progress, and prove your working time. Homework working time viewer; working time proof.
    Downloads: 0 This Week
  • 25
    artikelschreiber

    Frontend and Backend Code for ArtikelSchreiber.com and UNAIQUE.NET

    Frontend and Backend Code for ArtikelSchreiber.com and UNAIQUE.NET. German text generator: your free AI text generator with artificial intelligence. The Software as a Service can be found here: SEO Optimizer: Ghost Writer - Hausarbeiten schreiben mit KI (writing term papers with AI) and KI Text Generator (AI text generator). This product includes software developed by Sebastian Enger, M.Sc. Copyright (c) 2023, Sebastian Enger, M.Sc. All rights reserved. Frontend and Backend Source Code for...
    Downloads: 0 This Week
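
The crwlr entry above describes a crawler as a program that downloads documents and follows the links in them, and the Spatie Crawler entry mentions controlling crawl depth. A minimal sketch of that idea in Python, using only the standard library, might look like the following. It is not the implementation of any project listed here; the names `LinkExtractor`, `fetch_html`, `crawl`, and `max_depth` are assumptions made for illustration.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Default fetcher (hypothetical helper): download a page as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_html, max_depth=1):
    """Breadth-first crawl; returns {url: html} for every page loaded.

    Links found on pages at depth == max_depth are not followed.
    """
    seen = {start_url}
    queue = deque([(start_url, 0)])
    pages = {}
    while queue:
        url, depth = queue.popleft()
        try:
            html = fetch(url)
        except OSError:
            continue  # skip pages that cannot be loaded
        pages[url] = html
        if depth >= max_depth:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            link = urldefrag(urljoin(url, href)).url
            if link.startswith(("http://", "https://")) and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return pages
```

Passing the fetcher as a parameter keeps the traversal logic separate from network access, so the crawl order and depth limit can be exercised against an in-memory dictionary of pages. A real crawler would also need politeness controls (robots.txt, rate limiting), which this sketch leaves out.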
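
The CssSelector Component entry above describes converting CSS selectors to XPath expressions and notes that pseudo-elements are not supported. The sketch below illustrates the general idea for a very small subset (tag names, `#id`, `.class`, and the descendant combinator). It is a hand-written illustration, not the component's algorithm or API; `compound_to_xpath` and `css_to_xpath` are hypothetical names.

```python
import re

# One simple-selector token: optional "#" or "." prefix, then an identifier.
_TOKEN = re.compile(r"([#.]?)([A-Za-z_][\w-]*)")


def compound_to_xpath(compound):
    """Translate one compound selector such as 'li.item#first'."""
    tag = "*"
    predicates = []
    pos = 0
    while pos < len(compound):
        match = _TOKEN.match(compound, pos)
        if match is None:
            # Anything outside the subset (e.g. '::before') is rejected.
            raise ValueError(f"unsupported selector: {compound!r}")
        kind, name = match.groups()
        if kind == "":
            tag = name
        elif kind == "#":
            predicates.append(f"[@id='{name}']")
        else:
            # Match one class name inside a space-separated class attribute.
            predicates.append(
                "[contains(concat(' ', normalize-space(@class), ' '), "
                f"' {name} ')]"
            )
        pos = match.end()
    return tag + "".join(predicates)


def css_to_xpath(selector):
    """Translate a selector made of compounds joined by whitespace."""
    parts = selector.split()
    if not parts:
        raise ValueError("empty selector")
    return "//" + "//".join(compound_to_xpath(part) for part in parts)
```

For example, `css_to_xpath("ul li#main")` yields `//ul//li[@id='main']`, while a pseudo-element such as `p::first-line` raises `ValueError` because it selects part of a text node rather than an element.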
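
Entry 18 above mentions an IBAN converter that generates German IBANs from account data. As a rough illustration of the arithmetic involved, the sketch below applies the generic ISO 13616 mod-97 check to a German bank code (BLZ) and account number. It deliberately ignores the bank-specific IBAN rules that the listed library implements, so it is not a substitute for that library; the function names are hypothetical.

```python
def iban_to_int(iban):
    """Move the first four characters to the end and map letters to numbers.

    Letters become 10..35 (A=10, ..., Z=35), as in the mod-97 check.
    """
    rearranged = iban[4:] + iban[:4]
    return int("".join(str(int(ch, 36)) for ch in rearranged))


def is_valid_iban(iban):
    """Return True if the IBAN passes the mod-97 check (length rules ignored)."""
    compact = iban.replace(" ", "").upper()
    if len(compact) < 5 or not (compact.isascii() and compact.isalnum()):
        return False
    return iban_to_int(compact) % 97 == 1


def german_iban(blz, account):
    """Build a German IBAN from an 8-digit BLZ and an account number.

    Assumption: plain zero-padding of the account number to 10 digits,
    without any bank-specific exception rules.
    """
    bban = f"{int(blz):08d}{int(account):010d}"
    check = 98 - iban_to_int("DE00" + bban) % 97
    return f"DE{check:02d}{bban}"
```

With the commonly published sample data (BLZ 37040044, account 532013000), `german_iban` returns `DE89370400440532013000`, and `is_valid_iban` accepts that value with or without spaces.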