Showing 70 open source projects for "python web crawler"

  • 1
    Playwright for Python

    Python version of the Playwright testing and automation library

    Playwright enables reliable end-to-end testing for modern web apps. Single API to automate Chromium, Firefox and WebKit. Capable automation for single page apps that rely on the modern web platform. Use the Playwright API in JavaScript & TypeScript, Python, .NET, and Java. With Playwright, test how your app behaves in Apple Safari with WebKit builds for Windows, Linux and macOS. Test locally and on CI. Use device emulation to test your responsive web apps in mobile web browsers. Playwright...
    Downloads: 5 This Week
  • 2
    RPA for Python

    Python package for doing RPA

    Python package for doing RPA. RPA for Python's simple and powerful API makes robotic process automation fun! You can use it to quickly automate away repetitive time-consuming tasks on websites, desktop applications, or the command line. See sample Python script, the RPA Challenge solution, and RedMart groceries example. To send a Telegram app notification, simply look up @rpapybot to allow receiving messages. To automate Chrome browser invisibly, use headless mode. To run 10X faster instead...
    Downloads: 0 This Week
  • 3
    ungoogled-chromium

    A lightweight approach to removing Google web service dependency

    In descending order of significance (i.e. most important objective first): ungoogled-chromium is Google Chromium, sans dependency on Google web services. ungoogled-chromium retains the default Chromium experience as closely as possible. Unlike other Chromium forks that have their own visions of a web browser, ungoogled-chromium is essentially a drop-in replacement for Chromium. ungoogled-chromium features tweaks to enhance privacy, control, and transparency. However, almost all...
    Downloads: 27 This Week
  • 4
    Mercury Browser

    Privacy-focused web browser fork of Firefox

    Mercury Browser is an optimized, privacy-focused web browser that is a fork of Mozilla Firefox. It incorporates compiler optimizations such as AVX, AES, LTO, and PGO to enhance performance and security. With features derived from projects like LibreWolf, Waterfox, and Ghostery, Mercury disables telemetry and debugging elements by default, ensuring a more private browsing experience. It also includes usability patches that bring back features like the classic top bar and supports unsigned...
    Downloads: 27 This Week
  • 5
    miniblink49

    Lighter, faster browser kernel of blink to integrate HTML UI in apps

    ... electron). Customize as you wish and simulate other browser environments. Full HTML5 support, friendly to various front-end libraries and frameworks. After turning off the cross-domain switch, you can use various cross-domain functions. Headless mode greatly saves resources and is suitable for web crawlers.
    Downloads: 11 This Week
  • 6
    Listen 1

    One for all free music in China (Chrome extension)

    .... Download the Windows zip file and choose the 32-bit or 64-bit version to match your system. The original web player uses a web server developed in Python. It can run directly on a server, or you can use the packaged Windows and Mac versions to run the web server locally. The desktop version for Windows, Mac, and Linux uses the Electron framework and is built on the JS library of the Listen 1 Chrome plug-in version.
    Downloads: 5 This Week
  • 7
    SeleniumBase

    A framework for browser automation and testing with Selenium

    SeleniumBase automatically handles common WebDriver actions such as launching web browsers before tests, saving screenshots during failures, and closing web browsers after tests. SeleniumBase lets you customize test runs from the command line. SeleniumBase uses simple syntax for commands. pytest includes automatic test discovery. If you don't specify a specific file or folder to run, pytest will automatically search through all subdirectories for tests to run. No More Flaky Tests! SeleniumBase...
    Downloads: 2 This Week
  • 8
    Playwright for .NET

    .NET version of the Playwright testing and automation library

    ..., JavaScript, Python, .NET, Java. Test Mobile Web. Native mobile emulation of Google Chrome for Android and Mobile Safari. The same rendering engine works on your Desktop and in the Cloud. Auto-wait. Playwright waits for elements to be actionable prior to performing actions. It also has a rich set of introspection events. The combination of the two eliminates the need for artificial timeouts - the primary cause of flaky tests.
    Downloads: 2 This Week
  • 9
    Scrapy-Redis

    Redis-based components for Scrapy

    You can start multiple spider instances that share a single Redis queue. Best suited for broad multi-domain crawls. Scraped items get pushed into a Redis queue, meaning that you can start as many post-processing processes as needed, all sharing the items queue. Scheduler + Duplication Filter, Item Pipeline, Base Spiders. The default requests serializer is pickle, but it can be changed to any module with loads and dumps functions. Note that pickle is not compatible between Python versions. Version 0.3...
    Downloads: 1 This Week
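The Scrapy-Redis description says the requests serializer can be swapped for anything that exposes `loads` and `dumps`. The sketch below shows what such a replacement might look like using JSON from the standard library. `JsonSerializer` and the sample `request` dict are hypothetical names for illustration only; a real Scrapy request may carry values that JSON cannot encode, and how the serializer is registered with Scrapy-Redis is not shown here.

```python
import json


class JsonSerializer:
    """Hypothetical serializer exposing the loads/dumps pair (illustration only)."""

    @staticmethod
    def dumps(obj):
        # Sort keys so that equal dicts always serialize to the same bytes.
        return json.dumps(obj, sort_keys=True).encode("utf-8")

    @staticmethod
    def loads(data):
        # Reverse of dumps: bytes -> str -> Python object.
        return json.loads(data.decode("utf-8"))


# A plain dict stands in for a serialized request (an assumption, not the real format).
request = {"url": "https://example.com/page", "method": "GET", "meta": {"depth": 1}}
payload = JsonSerializer.dumps(request)
restored = JsonSerializer.loads(payload)
```

Unlike pickle, the JSON payload does not depend on the Python version that wrote it, at the cost of supporting only JSON-compatible values.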
  • 10
    OpenWPM

    A web privacy measurement framework

    OpenWPM is a web privacy measurement framework that makes it easy to collect data for privacy studies on a scale of thousands to millions of websites. OpenWPM is built on top of Firefox, with automation provided by Selenium. It includes several hooks for data collection. Check out the instrumentation section below for more details. OpenWPM is tested on Ubuntu 18.04 via TravisCI and is commonly used via the docker container that this repo builds, which is also based on Ubuntu. Although we don't...
    Downloads: 0 This Week
  • 11
    Goutte

    Goutte, a simple PHP Web Scraper

    Goutte is a screen scraping and web crawling library for PHP. Goutte provides a nice API to crawl websites and extract data from the HTML/XML responses. Goutte depends on PHP 7.1+. Add fabpot/goutte as a require dependency in your composer.json file. Create a Goutte Client instance (which extends Symfony\Component\BrowserKit\HttpBrowser). Make requests with the request() method. The method returns a Crawler object (Symfony\Component\DomCrawler\Crawler). To use your own HTTP settings, you may...
    Downloads: 0 This Week
  • 12
    HackTools

    The all-in-one Red Team extension for Web Pentesters

    The all-in-one Red Team browser extension for Web Pentesters. HackTools is a web extension that facilitates your web application penetration tests. It includes cheat sheets as well as all the tools used during a test, such as XSS payloads, reverse shells, and much more. With the extension, you no longer need to search for payloads on different websites or in your local storage space; most of the tools are accessible in one click. HackTools is accessible either in pop-up mode or in a whole tab...
    Downloads: 2 This Week
  • 13
    Eric Integrated Development Environment

    Python Development Environment with all batteries included

    Eric is a Python IDE written using PyQt and QScintilla. It provides various features such as any number of open editors, an integrated (remote) debugger, project management facilities, unit test, refactoring and much more.
    Downloads: 180 This Week
  • 14
    Splinter

    Splinter - Python test framework for web applications

    Splinter is a Python test framework for web applications, providing a simple and consistent API for browser automation and testing.
    Downloads: 0 This Week
  • 15
    ZK - Simply Ajax and Mobile
    Ajax+Mobile Java Web framework. With 200+ Ajax components and an event-driven model, Ajax/RIA apps are as effortless and rich as desktop apps and HTML/XUL pages. Supports JSP/JSF/JavaEE/Spring, Ajax Push and Client-fusion; also Java/Groovy/Python/JavaScript.
    Downloads: 20 This Week
  • 16

    uweb browser: unlimited power

    minimal suckless android web browser with unlimited power

    ...: run fast, even with thousands of user provided css/scripts - Efficient: less touches, one click to reach any number of search engines without repeated input; automate online services. - URL bar command line support ("!" and .js files as commands). - user-defined site-specific JS/CSS/HTML/preprocessing. - Online play/preview/preprocess for downloadable resources. - Multiple type profiles: switch any data including logins/config orthogonally - web automation, crontab (alarm clock)
    Downloads: 8 This Week
  • 17
    Web Spider, Web Crawler, Email Extractor

    Free tool that extracts emails, phones, and custom text from the Web using Java regex

    The Files section contains WebCrawlerMySQL.jar, which supports MySQL connections. Follow this link to get the latest version: https://sourceforge.net/projects/web-spider-web-crawler-extract/ Free Web Spider & Crawler. Extracts information from the Web by parsing millions of pages. Stores data in a Derby or MySQL database, so data is not lost after force-closing the spider. - Free Web Spider, Parser, Extractor, Crawler - Extraction of Emails, Phones and Custom Text from Web - Export...
    Downloads: 0 This Week
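The entry above describes regex-based extraction of emails from page text. The project itself is written in Java and its actual patterns are not given in the listing, so the following is only a minimal Python sketch of the general idea. The pattern `EMAIL_RE` and the helper `extract_emails` are assumptions made for illustration.

```python
import re

# Deliberately simple pattern (an assumption, not the project's own regex):
# it will miss some valid addresses and accept some invalid ones.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text):
    """Return unique email-like strings in order of first appearance."""
    seen = {}
    for match in EMAIL_RE.findall(text):
        # Deduplicate case-insensitively but keep the first spelling found.
        seen.setdefault(match.lower(), match)
    return list(seen.values())


page = "Contact sales@example.com or Support@Example.org. Again: sales@example.com"
emails = extract_emails(page)
```

A crawler would apply a function like this to each fetched page and store the results, which is where the database mentioned in the description would come in.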
  • 18
    galacteek

    Multi-platform browser for the distributed web

    galacteek is a multi-platform Qt5-based browser and semantic agent for the distributed web. Be sure to install all the gstreamer packages on your system to be able to use the mediaplayer. After opening/mounting the DMG image, hold Control and click on the galacteek icon, and select Open and accept. You probably need to allow the system to install applications from anywhere in the security settings. Docker images are available. They run the full GUI inside a virtual Xorg server (using Xvfb...
    Downloads: 1 This Week
  • 19
    googler

    Google from the terminal

    googler is a power tool to Google (web, news, videos and site search) from the command line. It shows the title, URL and abstract for each result, which can be directly opened in a browser from the terminal. Results are fetched in pages (with page navigation). Supports sequential searches in a single googler instance. googler was initially written to cater to headless servers without X. You can integrate it with a text-based browser. However, it has grown into a very handy and flexible utility...
    Downloads: 8 This Week
  • 20

    HomeTabs

    HomeTabs project helps you to organize bookmarks for web browsers

    The HomeTabs project helps you organize bookmarks for web browsers (like a standard browser's home page, but cooler and more comfortable). The design of HomeTabs was inspired by the Mozilla Firefox start page. I think this is the best way to organize bookmarks, but saving browsing history on the home page is a bad idea. GitHub: https://github.com/grildroid/HomeTabs Discord: https://discord.gg/6ZGDgFjDVm
    Downloads: 1 This Week
  • 21
    googler

    Google Search, Google Site Search, Google News from the terminal

    googler is a power tool to Google (Web & News) and Google Site Search from the command-line. It shows the title, URL and abstract for each result, which can be directly opened in a browser from the terminal. Results are fetched in pages (with page navigation). Supports sequential searches in a single googler instance. googler was initially written to cater to headless servers without X. You can integrate it with a text-based browser. However, it has grown into a very handy and flexible...
    Downloads: 1 This Week
  • 22
    CountBookmarks

    Makes a detailed count of your browser bookmarks by folder

    This simple program performs a detailed count of exported web browser bookmarks by folder. Its output file can be imported into a spreadsheet and sorted to show the relative size of all your bookmark folders.
    Downloads: 1 This Week
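Counting exported bookmarks by folder, as CountBookmarks is described as doing, can be sketched with the standard library alone. This is not the project's code. It assumes the common Netscape-style export in which each folder is an `<H3>` followed by a `<DL>`. `BookmarkCounter`, `count_by_folder`, `to_csv`, and `SAMPLE` are hypothetical names.

```python
import csv
import io
from collections import Counter
from html.parser import HTMLParser

SAMPLE = """<!DOCTYPE NETSCAPE-Bookmark-file-1>
<H1>Bookmarks</H1>
<DL><p>
    <DT><A HREF="https://example.com/">Example</A>
    <DT><H3>News</H3>
    <DL><p>
        <DT><A HREF="https://example.org/a">A</A>
        <DT><A HREF="https://example.org/b">B</A>
        <DT><H3>Tech</H3>
        <DL><p>
            <DT><A HREF="https://example.net/">Net</A>
        </DL><p>
    </DL><p>
</DL><p>
"""


class BookmarkCounter(HTMLParser):
    """Count <A> links per folder path in a Netscape-style bookmark export."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()
        self.stack = []      # names of the folders currently open
        self.pending = None  # folder name read from <H3>, waiting for its <DL>
        self.in_h3 = False

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self.in_h3 = True
            self.pending = ""
        elif tag == "dl":
            # Each <DL> opens the folder named by the preceding <H3>, if any.
            self.stack.append(self.pending if self.pending is not None else "(root)")
            self.pending = None
        elif tag == "a":
            self.counts["/".join(self.stack)] += 1

    def handle_data(self, data):
        if self.in_h3:
            self.pending += data

    def handle_endtag(self, tag):
        if tag == "h3":
            self.in_h3 = False
        elif tag == "dl" and self.stack:
            self.stack.pop()


def count_by_folder(html_text):
    parser = BookmarkCounter()
    parser.feed(html_text)
    parser.close()
    return parser.counts


def to_csv(counts):
    """Render the counts as CSV text that a spreadsheet can import."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["folder", "bookmarks"])
    for folder, n in sorted(counts.items()):
        writer.writerow([folder, n])
    return buf.getvalue()


counts = count_by_folder(SAMPLE)
report = to_csv(counts)
```

Each count covers only the bookmarks directly inside a folder, not those in its subfolders. Sorting by the count column in a spreadsheet then shows the relative folder sizes.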
  • 23
    Reactive Extensions for JavaScript

    An API for asynchronous programming with observable streams

    An API for asynchronous programming with observable streams. The Observer pattern done right. ReactiveX is a combination of the best ideas from the Observer pattern, the Iterator pattern, and functional programming. ReactiveX is everywhere, and it's meant for everything. Available for idiomatic Java, Scala, C#, C++, Clojure, JavaScript, Python, Groovy, JRuby, and others. Embrace ReactiveX's asynchronicity, enabling concurrency and implementation independence. Manipulate UI events and API...
    Downloads: 0 This Week
  • 24
    htmlarea

    Small, powerful, full featured WYSIWYG editor

    HTMLArea 4 is a browser-based WYSIWYG editor that easily replaces the TEXTAREA in your web pages. It is written in JavaScript and is suitable for use in any modern web browser and on any page of your website. The current version is 4.0-2016-08-29.
    Downloads: 5 This Week
  • 25
    WebDAVSurfer

    WebDAV client 64-bit works with Plone 5, Apache and more

    GUI WebDAV client for Linux and Windows 10. Includes PROPFIND, PROPPATCH, LOCK, UNLOCK, VERSION-CONTROL, REPORT. HTTP(S) with Basic Authentication and PKI client and server certificates. Works with Plone, Zope, Apache + mod_dav, PyWebDAV, PyDAV, Tamino. Uses 64-bit wxPython. Upload files or from Web. Update properties. Tested with Plone 5.04
    Downloads: 0 This Week