Showing 1049 open source projects for "python web crawler"

  • 1

    pySocketHTTPserver

    HTTP server developed with Python and socket as the only web module.

    # pySocketHTTPserver 1.0 by CHEN Guang (Chin Hikaru) # Uses only one web module, socket, thus allowing the user to see and test every detail of an HTTP server. # Run this script and visit http://127.0.0.1:880/ with a browser and you will see a picture. # Double-click the picture for full screen, # move the mouse cursor to the top of the screen to get the "X" button for exiting full screen. # You can drag the picture with the left mouse button. # You can change to other pictures by rolling the mouse wheel. #...
    Downloads: 0 This Week
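To illustrate what "socket as the only web module" can look like, here is a minimal sketch. It is not taken from pySocketHTTPserver: the helper names `build_response` and `serve_once` and the port 8080 are assumptions chosen for illustration (the project description mentions port 880, which typically needs elevated privileges on Unix-like systems).

```python
import socket


def build_response(body: bytes, content_type: str = "text/plain") -> bytes:
    """Assemble a minimal HTTP/1.1 response by hand (hypothetical helper)."""
    headers = (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return headers.encode("ascii") + body


def serve_once(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Accept one connection, read the raw request, and send a fixed reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind((host, port))
        server.listen(1)
        conn, _addr = server.accept()
        with conn:
            request = conn.recv(4096)  # raw bytes of the HTTP request
            request_line = request.split(b"\r\n", 1)[0]
            conn.sendall(build_response(b"You sent: " + request_line))
```

Calling `serve_once()` and then opening http://127.0.0.1:8080/ in a browser should echo the request line back, which makes each byte of the exchange visible.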
  • 2
    MOFO Linux

    A live Linux environment for computing without censorship barriers.

    MOFO Linux is a USB-pluggable live Linux environment you boot on PC hardware. It gives you the power to unblock any media, at your discretion, clearing the way for you to read, write, watch, listen to, debate, or collaborate anywhere - beyond the reach of Big Brother. In other words, you jump the barrier, find media, and interact with people. MOFO Linux is designed for easy usage on home PCs, laptop computers, or workstations, whether installed in internet cafes anywhere in the world or on...
    Downloads: 92 This Week
  • 3
    EditPlus

    Text editor for Windows with built-in FTP, FTPS and sftp

    EditPlus is a lightweight text editor designed for Windows that caters to programmers, web developers, and anyone working with code or text. It offers powerful features like syntax highlighting, code folding, and a customizable interface, making it an excellent alternative to more complex Integrated Development Environments (IDEs). EditPlus supports a wide range of programming languages, including HTML, CSS, PHP, JavaScript, C++, and more. It also integrates tools for FTP, SFTP, and...
    Downloads: 11 This Week
  • 4
    Buku

    Powerful command-line bookmark manager. Your mini web!

    buku is a powerful bookmark manager written in Python3 and SQLite3. buku fetches the title of a bookmarked web page and stores it along with any additional comments and tags. You can use your favourite editor to compose and update bookmarks. With multiple search options, including regex and a deep scan mode (particularly for URLs), it can find any bookmark instantly. Multiple search results can be opened in the browser at once. Though a terminal utility, it's possible to add bookmarks...
    Downloads: 0 This Week
  • 5
    Links Into Social Media Posts

    Provide a list of links, get back a CSV of social media post drafts

    Instantly mass-produce many social media posts using just your links! Turn your big list of website links into ready-to-use social media post drafts. This program automatically web-scrapes each link and generates a suitable title and 5 hashtags.
    Here’s a sample of results:
    title,url,hashtags
    Skelegant - itch.io,https://skelegant.itch.io,#skelegant #itch #social #media #share
    APHRODITE by Skelegant: A cyberpunk reskin for vanilla...
    Downloads: 0 This Week
  • 6
    ddgr

    DuckDuckGo from the terminal

    ddgr is a cmdline utility to search DuckDuckGo from the terminal. While googler is highly popular among cmdline users, in many forums the need of a similar utility for privacy-aware DuckDuckGo came up. DuckDuckGo Bangs are super-cool too! So here's ddgr for you! Unlike the web interface, you can specify the number of search results you would like to see per page. It's more convenient than skimming through 30-odd search results per page. The default interface is carefully designed to use...
    Downloads: 1 This Week
  • 7
    CerberusCMS5

    Cerberus Content Management System

    Cerberus Content Management System is a dynamic, secure and infinitely expandable CMS designed after a Unix-like model. It is a custom-written Web Application Framework ( W.A.F. ) with a consistent, custom-written Pre-Hyper-Text-Post-Processor Programming Code Framework ( P.C.F. ). This Web Application Software Project's aim is to be the fastest and most secure Web Application Framework, Web Application Programming Code Framework, Text, Voice and Video Communications Platform and Content...
    Downloads: 3 This Week
  • 8
    Endian Firewall Community
    Endian Firewall Community (EFW) is a "turn-key" Linux security distribution that makes your system a full-featured security appliance with Unified Threat Management (UTM) functionalities. The software has been designed for the best usability: very easy to install, use and manage, yet still highly flexible. The feature suite includes stateful packet inspection firewall, application-level proxies for various protocols (HTTP, FTP, POP3, SMTP) with antivirus support, virus and spam-filtering...
    Downloads: 274 This Week
  • 9
    barcraft

    A simple QrCode / barcode generator in python

    A simple QR code / barcode generator that you can also use through its web version: https://secret-guest.github.io/barcraft/ The interface is built with PyQt5, and the MSI installer is made with Inno Setup.
    Downloads: 0 This Week
  • 10
    Shynet

    Modern, privacy-friendly, and detailed web analytics

    Modern, privacy-friendly, and detailed web analytics that works without cookies or JS. There are a lot of web analytics tools. Unfortunately, most of them come with the following caveats: They require handing all of your visitors' info to a third-party company. They use cookies to track visitors across sessions, so you need to have those annoying cookie notices. They collect so much personal data that even the NSA is jealous. They are closed source and/or expensive, often with limited data...
    Downloads: 1 This Week
  • 11
    Easyspider - Distributed Web Crawler

    Easy Spider is a distributed Perl Web Crawler Project from 2006

    Easy Spider is a distributed Perl Web Crawler Project from 2006. It features code for crawling webpages, distributing them to a server, and generating XML files from them. The client side can be any computer (Windows or Linux), and the server stores all data. Websites that use EasySpider crawling for article-writing software: https://www.artikelschreiber.com/en/ https://www.unaique.net/en/ https://www.unaique.com/ https://www.artikelschreiben.com/ https://www.buzzerstar.com/ https://easyperlspider.sourceforge.io/ https://www.sebastianenger.com/ https://www.artikelschreiber.com/opensource/ It is fun to look at code written a few years ago and to see how one has improved. ...
    Downloads: 1 This Week
  • 12
    ACHE Focused Crawler

    ACHE is a web crawler for domain-specific search

    ACHE is a focused web crawler. It collects web pages that satisfy some specific criteria, e.g., pages that belong to a given domain or that contain a user-specified pattern. ACHE differs from generic crawlers in the sense that it uses page classifiers to distinguish between relevant and irrelevant pages in a given domain. A page classifier can be defined as a simple regular expression (e.g., one that matches every page that contains a specific word) or a machine-learning-based classification model. ...
    Downloads: 0 This Week
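The idea of a regular-expression page classifier can be sketched in a few lines of Python. This is only an illustration of the concept described above, not ACHE's own API or configuration format; `make_regex_classifier`, the sample pages, and the example.org URLs are all hypothetical.

```python
import re


def make_regex_classifier(pattern: str):
    """Return a function that marks a page as relevant if the pattern occurs in it."""
    compiled = re.compile(pattern, re.IGNORECASE)

    def is_relevant(page_text: str) -> bool:
        return compiled.search(page_text) is not None

    return is_relevant


# Hypothetical classifier: a page is relevant if it contains the word "crawler".
is_crawler_page = make_regex_classifier(r"\bcrawler\b")

# Hypothetical fetched pages, keyed by URL.
pages = {
    "https://example.org/a": "A focused web Crawler for domain-specific search",
    "https://example.org/b": "A command-line bookmark manager",
}

# Keep only the URLs whose text the classifier accepts.
relevant = [url for url, text in pages.items() if is_crawler_page(text)]
```

A focused crawler would then follow links only from pages in `relevant`; a machine-learning model could replace `is_relevant` behind the same one-argument interface.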
  • 13
    scraper-with-chatgpt
    It is a powerful data scraping tool that helps you extract information from various online sources. Easily collect data from Google SERP, Maps, Shopify, Zillow, and more. With a user-friendly interface, you can scrape and save data in JSON or Excel formats. Unlock insights from the web effortlessly with scrape-it.cloud API.
    Downloads: 1 This Week
  • 14
    YouTube video web scraper 2 [ISA]

    YouTube video web scraper 2 [Improved.Simplified.Alternative]

    'YouTube video web scraper 2' is a desktop application developed using Python 3.11.4 and other add-on libraries. It finds YouTube videos based on a user request and displays them as a table, which can be exported to Excel. Compatible only with Windows.
    Downloads: 0 This Week
  • 15

    dorker-py

    Discover files and hidden paths by performing advanced searches

    Dorking Google - Dorker Py: discover files and hidden paths by performing advanced searches.
    Downloads: 0 This Week
  • 16
    ScrapBot 1.40 64bits

    Task automation software for accessing and manipulating website data.

    ScrapBot is a task automation software that allows you to access, authenticate, extract, and insert data on any website. The software utilizes JavaScript to execute tasks, eliminating the need for server or additional software installations. The system can control the accessed webpage through JavaScript, and the entire navigation can be viewed in the program window. The main.js script runs in a separate frame from the navigation frame but can access all page content without any restrictions.
    Downloads: 0 This Week
  • 17
    Goutte

    Goutte, a simple PHP Web Scraper

    ...The method returns a Crawler object (Symfony\Component\DomCrawler\Crawler). To use your own HTTP settings, you may create and pass an HttpClient instance to Goutte. For example, to add a 60 second request timeout. Read the documentation of the BrowserKit, DomCrawler, and HttpClient Symfony Components for more information about what you can do with Goutte. Goutte is a thin wrapper around the following Symfony Components: BrowserKit, CssSelector, DomCrawler, and HttpClient.
    Downloads: 0 This Week
  • 18
    LXR Cross Referencer
    A general-purpose source code indexer and cross-referencer that provides web-based browsing of source code with links to the definition and usage of any identifier. Supports multiple languages. Up-to-date information is available at http://lxr.sourceforge.net
    Downloads: 4 This Week
  • 19
    Swagbucks Web Search Bot

    This is an automated Swagbucks search tool.

    This lets you automate searches by supplying search terms in any appropriately formatted .txt file; the program reads the file and opens each search link on your computer. You must log in to Swagbucks for this to work. To find new releases, check the version folder for each operating system here. To get archived releases (ill-advised), check out the GitHub:...
    Downloads: 1 This Week
  • 20
    HackTools

    The all-in-one Red Team extension for Web Pentesters

    The all-in-one Red Team browser extension for Web Pentesters. HackTools is a web extension that facilitates your web application penetration tests. It includes cheat sheets as well as the tools used during a test, such as XSS payloads, reverse shells, and much more. With the extension, you no longer need to search for payloads on different websites or in your local storage; most of the tools are accessible in one click. HackTools is accessible either in pop-up mode or in a whole tab in...
    Downloads: 0 This Week
  • 21
    Cinemagoer

    Python package to retrieve and manage data of the IMDb

    Cinemagoer is a Python package useful to retrieve and manage the data of the IMDb movie database about movies, people, characters and companies. Platform-independent, it can retrieve data from both the IMDb's web server and a local copy of the whole db.
    Downloads: 5 This Week
  • 22
    Splinter

    Splinter - Python test framework for web applications

    Splinter is a Python test framework for web applications, providing a simple and consistent API for browser automation and testing.
    Downloads: 3 This Week
  • 23
    Web Spider, Web Crawler, Email Extractor

    Free tool that extracts emails, phones, and custom text from the web using Java regex

    In Files there is WebCrawlerMySQL.jar, which supports a MySQL connection. Please follow this link to get the latest version: https://sourceforge.net/projects/web-spider-web-crawler-extract/
    Free web spider and crawler. Extracts information from the web by parsing millions of pages. Stores data in a Derby or MySQL database, so data is not lost after force-closing the spider.
    - Free web spider, parser, extractor, crawler
    - Extraction of emails, phones, and custom text from the web
    - Export to Excel file
    - Data saved into Derby database
    - Written in Java, cross-platform
    See also the Free Email Sender: https://sourceforge.net/projects/gitst-free-email-ender/
    Please install Microsoft OpenJDK to start the application: https://www.microsoft.com/openjdk
    Downloads: 2 This Week
  • 24
    Scrapyd

    A service daemon to run Scrapy spiders

    Scrapyd can manage multiple projects, and each project can have multiple versions uploaded, but only the latest one will be used for launching new spiders. A common (and useful) convention for the version name is the revision number of the version control tool you’re using to track your Scrapy project code. For example: r23. The versions are not compared alphabetically but using a smarter algorithm (the same one packaging uses), so r10 compares greater than r9, for example. Scrapyd is an...
    Downloads: 0 This Week
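To see why alphabetical comparison is not enough for names like r9 and r10, here is a small sketch of a number-aware sort key. It is an illustrative stand-in only, not the algorithm Scrapyd itself uses; `natural_key` and the sample version list are hypothetical.

```python
import re


def natural_key(version: str):
    """Split a name into text and digit runs so that digit runs compare as integers."""
    parts = re.findall(r"[0-9]+|[^0-9]+", version)
    # Tag each chunk so text and numbers are never compared with each other directly.
    return [(1, int(p)) if "0" <= p[0] <= "9" else (0, p) for p in parts]


versions = ["r9", "r10", "r23", "r2"]

# Plain string comparison puts "r10" before "r9" because "1" < "9".
alphabetical = sorted(versions)

# The number-aware key orders the revisions by their numeric part.
by_revision = sorted(versions, key=natural_key)
latest = max(versions, key=natural_key)
```

With this key, `latest` is the highest revision number rather than the string that happens to sort last.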
  • 25
    Security Log Generator

    Generates logs of typical formats that would often be found in a SOC

    Generates logs of typical formats that would often be found in a SOC. As of 31st January 2023, it supports IDS, Web Access and Endpoint log formats. Can generate a specific number of events in a linear fashion or use a waveform to add 'bumpiness' to your data. The code is modular and extensible, adding additional formats can be done with relative ease.
    Downloads: 0 This Week