Showing 72 open source projects for "python web crawler"

  • 1
    Playwright for Python

    Python version of the Playwright testing and automation library

    Playwright enables reliable end-to-end testing for modern web apps. Single API to automate Chromium, Firefox and WebKit. Capable automation for single page apps that rely on the modern web platform. Use the Playwright API in JavaScript & TypeScript, Python, .NET, and Java. With Playwright, test how your app behaves in Apple Safari with WebKit builds for Windows, Linux and macOS. Test locally and on CI.
    Downloads: 6 This Week
  • 2
    Helium Browser

    Private, fast, and honest web browser

    Helium is a Chromium-based web browser designed to deliver privacy, speed, and simplicity by removing Google’s proprietary services, telemetry, and bloat. It’s built atop ungoogled-chromium, extending its philosophy with additional privacy features, design refinements, and user experience improvements aimed at transparency and control. Helium blocks ads and trackers by default through an integrated, unbiased uBlock Origin extension prepackaged as a native browser component. Its UI and...
    Downloads: 127 This Week
  • 3
    ungoogled-chromium

    A lightweight approach to removing Google web service dependency

    In descending order of significance (i.e. most important objective first): ungoogled-chromium is Google Chromium, sans dependency on Google web services. ungoogled-chromium retains the default Chromium experience as closely as possible. Unlike other Chromium forks that have their own visions of a web browser, ungoogled-chromium is essentially a drop-in replacement for Chromium. ungoogled-chromium features tweaks to enhance privacy, control, and transparency. However, almost all of these...
    Downloads: 31 This Week
  • 4
    miniblink49

    Lighter, faster browser kernel of blink to integrate HTML UI in apps

    ...After turning off the cross-domain switch, you can use various cross-domain functions. Headless mode greatly saves resources and is suitable for web crawlers.
    Downloads: 6 This Week
  • 5
    Mercury Browser

    Privacy-focused web browser fork of Firefox

    Mercury Browser is an optimized, privacy-focused web browser that is a fork of Mozilla Firefox. It incorporates compiler optimizations such as AVX, AES, LTO, and PGO to enhance performance and security. With features derived from projects like LibreWolf, Waterfox, and Ghostery, Mercury disables telemetry and debugging elements by default, ensuring a more private browsing experience. It also includes usability patches that bring back features like the classic top bar and supports unsigned...
    Downloads: 59 This Week
  • 6
    SeleniumBase

    A framework for browser automation and testing with Selenium

    SeleniumBase automatically handles common WebDriver actions such as launching web browsers before tests, saving screenshots during failures, and closing web browsers after tests. SeleniumBase lets you customize test runs from the command line. SeleniumBase uses simple syntax for commands. pytest includes automatic test discovery. If you don't specify a specific file or folder to run, pytest will automatically search through all subdirectories for tests to run. No More Flaky Tests!...
    Downloads: 3 This Week
  • 7
    Scrapy-Redis

    Redis-based components for Scrapy

    You can start multiple spider instances that share a single Redis queue. Best suited for broad multi-domain crawls. Scraped items get pushed into a Redis queue, meaning that you can start as many post-processing processes as needed, all sharing the items queue. Scheduler + Duplication Filter, Item Pipeline, Base Spiders. Default requests serializer is pickle, but it can be changed to any module with loads and dumps functions. Note that pickle is not compatible between Python versions. Version...
    Downloads: 0 This Week
  • 8
    OpenWPM

    A web privacy measurement framework

    OpenWPM is a web privacy measurement framework that makes it easy to collect data for privacy studies on a scale of thousands to millions of websites. OpenWPM is built on top of Firefox, with automation provided by Selenium. It includes several hooks for data collection. Check out the instrumentation section below for more details. OpenWPM is tested on Ubuntu 18.04 via TravisCI and is commonly used via the docker container that this repo builds, which is also based on Ubuntu. Although we...
    Downloads: 0 This Week
  • 9
    Listen 1

    One for all free music in China (Chrome extension)

    ...Select "Load unpacked extension..." and select the folder you just unpacked. Download the Windows zip file and choose the 32-bit or 64-bit version according to your system. The original web player uses a web server developed in Python. It can run directly on a server, or you can use the packaged Windows and Mac versions to run the web server locally. Windows, Mac, Linux desktop: built with the Electron framework, based on the JS library of the Listen 1 Chrome plug-in version.
    Downloads: 4 This Week
  • 10
    Playwright for .NET

    .NET version of the Playwright testing and automation library

    Playwright for .NET is the official language port of Playwright, the library to automate Chromium, Firefox and WebKit with a single API. Playwright is built to enable cross-browser web automation that is ever-green, capable, reliable and fast. Cross-browser. Playwright supports all modern rendering engines including Chromium, WebKit, and Firefox. Cross-platform. Test on Windows, Linux, and macOS, locally or on CI, headless or headed. Cross-language. Use the Playwright API in TypeScript, JavaScript, Python, .NET, Java. ...
    Downloads: 2 This Week
  • 11
    Eric Integrated Development Environment

    Python Development Environment with all batteries included

    Eric is a Python IDE written using PyQt and QScintilla. It provides various features such as any number of open editors, an integrated (remote) debugger, project management facilities, unit test, refactoring and much more.
    Downloads: 234 This Week
  • 12
    RPA for Python

    Python package for doing RPA

    Python package for doing RPA. RPA for Python's simple and powerful API makes robotic process automation fun! You can use it to quickly automate away repetitive time-consuming tasks on websites, desktop applications, or the command line. See sample Python script, the RPA Challenge solution, and RedMart groceries example. To send a Telegram app notification, simply look up @rpapybot to allow receiving messages. To automate Chrome browser invisibly, use headless mode. To run 10X faster instead...
    Downloads: 1 This Week
  • 13
    ZK - Simply Ajax and Mobile
    ZK is an open-source Java framework for building modern web and mobile applications. It enables developers to create rich, interactive UIs using only Java — no JavaScript required. With 200+ Ajax-powered components, event-driven architecture, and support for popular technologies like Spring, Java EE, and JSP/JSF, ZK makes it simple to deliver powerful and user-friendly web applications.
    Downloads: 16 This Week
  • 14

    uweb browser: unlimited power

    Minimal suckless Android web browser with unlimited power

    - AI bot as search engine; append file content as input for complex query. - Powerful: HTML5 enhancement; any URLs to host a website; JavaScript and shell scripting for general processing; and more with Termux. - Customizable: user-defined menus, (new) buttons and gestures for user agents, bookmarklets, URL services, shell commands, internal functionality links and text processing etc. - Convenient: book/dictionary/txt/command line/app can be a search engine. - Tiny: less than 200k -...
    Downloads: 4 This Week
  • 15
    Goutte

    Goutte, a simple PHP Web Scraper

    ...The method returns a Crawler object (Symfony\Component\DomCrawler\Crawler). To use your own HTTP settings, you may create and pass an HttpClient instance to Goutte. For example, to add a 60 second request timeout. Read the documentation of the BrowserKit, DomCrawler, and HttpClient Symfony Components for more information about what you can do with Goutte. Goutte is a thin wrapper around the following Symfony Components: BrowserKit, CssSelector, DomCrawler, and HttpClient.
    Downloads: 0 This Week
  • 16
    HackTools

    The all-in-one Red Team extension for Web Pentesters

    The all-in-one Red Team browser extension for Web Pentesters. HackTools is a web extension that facilitates your web application penetration tests. It includes cheat sheets as well as all the tools used during a test, such as XSS payloads, reverse shells, and much more. With the extension you no longer need to search for payloads on different websites or in your local storage space; most of the tools are accessible in one click. HackTools is accessible either in pop-up mode or in a whole tab in...
    Downloads: 1 This Week
  • 17
    Splinter

    Splinter - Python test framework for web applications

    Splinter is a Python test framework for web applications, providing a simple and consistent API for browser automation and testing.
    Downloads: 0 This Week
  • 18
    Web Spider, Web Crawler, Email Extractor

    Free tool that extracts emails, phone numbers, and custom text from the Web using Java regex

    In Files there is WebCrawlerMySQL.jar, which supports MySQL connections. Please follow this link to get the latest version: https://sourceforge.net/projects/web-spider-web-crawler-extract/ Free Web Spider & Crawler. Extracts information from the Web by parsing millions of pages. Stores data in a Derby or MySQL database, so data is not lost after force-closing the spider. - Free Web Spider, Parser, Extractor, Crawler - Extraction of emails, phones, and custom text from the Web - Export to Excel file - Data saved into Derby database - Written in Java, cross-platform. See also Free Email Sender at this link: https://sourceforge.net/projects/gitst-free-email-ender/ Please install Microsoft OpenJDK to start the application: https://www.microsoft.com/openjdk
    Downloads: 0 This Week
  • 19
    galacteek

    Multi-platform browser for the distributed web

    galacteek is a multi-platform Qt5-based browser and semantic agent for the distributed web. Be sure to install all the gstreamer packages on your system to be able to use the mediaplayer. After opening/mounting the DMG image, hold Control and click on the galacteek icon, and select Open and accept. You probably need to allow the system to install applications from anywhere in the security settings. Docker images are available. They run the full GUI inside a virtual Xorg server (using Xvfb)....
    Downloads: 0 This Week
  • 20

    HomeTabs

    HomeTabs project helps you to organize bookmarks for web browsers

    HomeTabs project helps you to organize bookmarks for web browsers (like a standard browser's home page, but cooler and more comfortable). The design of HomeTabs was inspired by the Mozilla Firefox start page. I think this is the best way to organize bookmarks, but saving browsing history on the homepage is a bad idea. GitHub: https://github.com/grildroid/HomeTabs Discord: https://discord.gg/6ZGDgFjDVm
    Downloads: 0 This Week
  • 21
    googler

    Google from the terminal

    googler is a power tool to Google (web, news, videos and site search) from the command line. It shows the title, URL and abstract for each result, which can be directly opened in a browser from the terminal. Results are fetched in pages (with page navigation). Supports sequential searches in a single googler instance. googler was initially written to cater to headless servers without X. You can integrate it with a text-based browser. However, it has grown into a very handy and flexible...
    Downloads: 0 This Week
  • 22
    googler

    Google Search, Google Site Search, Google News from the terminal

    googler is a power tool to Google (Web & News) and Google Site Search from the command-line. It shows the title, URL and abstract for each result, which can be directly opened in a browser from the terminal. Results are fetched in pages (with page navigation). Supports sequential searches in a single googler instance. googler was initially written to cater to headless servers without X. You can integrate it with a text-based browser. However, it has grown into a very handy and flexible...
    Downloads: 0 This Week
  • 23
    CountBookmarks

    Makes a detailed count of your browser bookmarks by folder

    This simple program performs a detailed count of exported web browser bookmarks by folder. Its output file can be imported into a spreadsheet and sorted to show the relative size of all your bookmark folders.
    Downloads: 0 This Week
  • 24
    htmlarea

    Small, powerful, full featured WYSIWYG editor

    HTMLArea 4 is a browser based WYSIWYG editor that easily replaces the TEXTAREA in your web pages. It is written in JavaScript, and suitable for use in any modern web browser, and any page on your web site. Current version is 4.0-2016-08-29
    Downloads: 7 This Week
  • 25
    Reactive Extensions for JavaScript

    An API for asynchronous programming with observable streams

    ...The Observer pattern, the Iterator pattern, and functional programming. ReactiveX is everywhere, and it's meant for everything. Available for idiomatic Java, Scala, C#, C++, Clojure, JavaScript, Python, Groovy, JRuby, and others. Embrace ReactiveX's asynchronicity, enabling concurrency and implementation independence. Manipulate UI events and API responses, on the Web with RxJS, or on mobile with Rx.NET and RxJava. Avoid intricate stateful programs, using clean input/output functions over observable streams. ReactiveX's operators often reduce what was once an elaborate challenge into a few lines of code. ...
    Downloads: 0 This Week