Search Results for "python web crawler" - Page 3

Showing 1911 open source projects for "python web crawler"

View related business solutions
  • Enterprise-grade ITSM, for every business Icon
    Enterprise-grade ITSM, for every business

    Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity.

    Freshservice is an intuitive, AI-powered platform that helps IT, operations, and business teams deliver exceptional service without the usual complexity. Automate repetitive tasks, resolve issues faster, and provide seamless support across the organization. From managing incidents and assets to driving smarter decisions, Freshservice makes it easy to stay efficient and scale with confidence.
    Try it Free
  • Build Securely on Azure with Proven Frameworks Icon
    Build Securely on Azure with Proven Frameworks

    Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

    Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.
    Download Now
  • 1
    Douyin TikTok Download API

    Douyin TikTok Download API

    Douyin TikTok Download API

    Use the official interface to capture Douyin|TikTok data, support API calls, Web portals, and batch analysis. Fast, asynchronous, free, open source, ad-free, long-term maintenance. This project is based on PyWebIO , FastAPI , HTTPX , a fast and asynchronous Douyin / TikTok data crawling tool, and realizes online batch parsing and downloading of watermark-free videos or atlases through the web, data crawling API, and iOS shortcut instructions for watermark-free download and other functions. You...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 2
    Pyxel

    Pyxel

    A retro game engine for Python

    A retro game engine for Python. Thanks to its simple specifications inspired by retro gaming consoles, such as only 16 colors can be displayed and only 4 sounds can be played back at the same time, you can feel free to enjoy making pixel art style games. The motivation for the development of Pyxel is the feedback from users. Please give Pyxel a star on GitHub! Pyxel's specifications and APIs are inspired by PICO-8 and TIC-80. Pyxel is open source and free to use. Let's start making a retro game...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 3
    miniblink49

    miniblink49

    Lighter, faster browser kernel of blink to integrate HTML UI in apps

    ... electron). Customize as you wish, simulate another browser environment. Perfect HTML5 support, friendly to various front-end libraries (support HTML5, and friendly to front framework). After turning off the cross-domain switch, you can use various cross-domain functions (support cross-domain). Headless mode, which greatly saves resources and is suitable for crawlers (headless mode, be suitable for Web Crawler).
    Downloads: 7 This Week
    Last Update:
    See Project
  • 4
    OpenHands

    OpenHands

    Open-source autonomous AI software engineer

    ... in the open on GitHub, under the MIT license. Our agents can do anything a human developer can: they write code, run commands, and use the web. We're partnering with AI safety experts like Invariant Labs to balance innovation with security.
    Downloads: 10 This Week
    Last Update:
    See Project
  • Deliver secure remote access with OpenVPN. Icon
    Deliver secure remote access with OpenVPN.

    Trusted by nearly 20,000 customers worldwide, and all major cloud providers.

    OpenVPN's products provide scalable, secure remote access — giving complete freedom to your employees to work outside the office while securely accessing SaaS, the internet, and company resources.
    Get started — no credit card required.
  • 5
    QR Code generator library

    QR Code generator library

    High-quality QR Code generator library in Java, TypeScript/JavaScript

    ... to TypeScript, Python, Rust, C++, and C. It is open source under the MIT License. For each language, the codebase is roughly 1000 lines of code and has no dependencies other than the respective language’s standard library.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 6
    Whoogle Search

    Whoogle Search

    A self-hosted, ad-free, privacy-respecting metasearch engine

    Get Google search results, but without any ads, javascript, AMP links, cookies, or IP address tracking. Easily deployable in one click as a Docker app, and customizable with a single config file. Quick and simple to implement as a primary search engine replacement on both desktop and mobile. Autocomplete/search suggestions. POST request search and suggestion queries (when possible). View images at full res without site redirect (currently mobile only). Light/Dark/System theme modes (with...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 7
    Glances

    Glances

    An eye on your system

    Glances is an open source, cross-platform monitoring tool that aims to provide a significant amount of monitoring information through a curses or Web-based interface. Depending on the size of the user interface, this information can then dynamically adapt. Glances can work in client/server mode, and is also capable of remote monitoring. All systems statistics can be exported to files or external time/value databases. Glances gets information from your system through various libraries...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 8
    pyTelegramBotAPI

    pyTelegramBotAPI

    Python Telegram bot api.

    TeleBot is the synchronous and asynchronous implementation of Telegram Bot API.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 9
    WebMagic

    WebMagic

    A scalable web crawler framework for Java

    WebMagic is a scalable crawler framework. It covers the whole lifecycle of crawler, downloading, url management, content extraction and persistent. It can simplify the development of a specific crawler. WebMagic is a simple but scalable crawler framework. You can develop a crawler easily based on it. WebMagic has a simple core with high flexibility, a simple API for html extracting. It also provides annotation with POJO to customize a crawler, and no configuration is needed. Some other features...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Picsart Enterprise Background Removal API for Stunning eCommerce Visuals Icon
    Picsart Enterprise Background Removal API for Stunning eCommerce Visuals

    Instantly remove the background from your images in just one click.

    With our Remove Background API tool, you can access the transformative capabilities of automation , which will allow you to turn any photo asset into compelling product imagery. With elevated visuals quality on your digital platforms, you can captivate your audience, and therefore achieve higher engagement and sales.
    Learn More
  • 10
    Klavis AI

    Klavis AI

    Open Source MCP integration for AI applications

    Klavis AI is an open-source platform that simplifies the integration of Model Context Protocols (MCPs) into AI applications. It provides hosted, secure MCP servers with built-in OAuth support, eliminating the need for complex authentication management and client-side code. Klavis AI enables developers to connect AI agents to various tools and services efficiently, facilitating scalable and secure AI deployments.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 11
    Posting

    Posting

    The modern API client that lives in your terminal

    posting is a lightweight command-line tool that lets users schedule and automate Mastodon posts using Markdown files. It reads a simple folder structure of Markdown drafts and posts them at predefined intervals or manually. Designed for content creators and developers, posting helps maintain consistent and organized Mastodon accounts without depending on web UIs or third-party schedulers.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 12
    Wifipumpkin3

    Wifipumpkin3

    Powerful framework for rogue access point attack

    wifipumpkin3 is powerful framework for rogue access point attack, written in Python, that allow and offer to security researchers, red teamers and reverse engineers to mount a wireless network to conduct a man-in-the-middle attack.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 13
    HAXE

    HAXE

    The cross-platform toolkit

    Haxe is an open source high-level strictly-typed programming language with a fast optimizing cross-compiler. Haxe can build cross-platform applications targeting JavaScript, C++, C#, Java, JVM, Python, Lua, PHP, Flash, and allows access to each platform's native capabilities. Haxe has its own VMs (HashLink and NekoVM) but can also run in interpreted mode. Haxe is useful in a wide variety of domains; games, web, mobile, desktop, command-line and cross-platform APIs. Take a look at who is using...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 14
    Locust

    Locust

    Scalable open source load testing tool

    Locust is an open source user load testing tool written in Python. The idea behind Locust is to swarm your web site or other systems with attacks from simulated users during a test, with each user behavior defined by you using Python code. This swarming process is then monitored from a web UI in real-time, and will help identify any bottlenecks in your code before real users can come in. As it is completely event-based, Locust can have thousands or even millions of simultaneous users...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 15
    spaCy models

    spaCy models

    Models for the spaCy Natural Language Processing (NLP) library

    spaCy is designed to help you do real work, to build real products, or gather real insights. The library respects your time, and tries to avoid wasting it. It's easy to install, and its API is simple and productive. spaCy excels at large-scale information extraction tasks. It's written from the ground up in carefully memory-managed Cython. If your application needs to process entire web dumps, spaCy is the library you want to be using. Since its release in 2015, spaCy has become an industry...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 16
    armory

    armory

    3D Engine with Blender Integration

    Armory is an open-source 3D engine focused on portability, minimal footprint and performance. The renderer is fully scriptable with deferred and forward paths supported out of the box. Armory provides a full Blender integration add-on, turning it into a complete game development tool. The result is a unified workflow from start to finish. Powered by Armory engine, ArmorPaint is a stand-alone software designed for physically-based texture painting. Drag & drop your 3D models and start...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 17
    jQuery Terminal

    jQuery Terminal

    JavaScript library for creating web-based terminals

    jQuery Terminal is a JavaScript library for creating command-line interpreters in your applications. You can use this JavaScript Terminal library to create interactive web-based terminal applications on your website. Where commands are defined by you. You can define them on the server or in the browser's JavaScript. It can automatically call JSON-RPC service when the user types a command. Alternatively, you can provide an object with methods; each method will be invoked on the user's command...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 18
    Roxy-WI

    Roxy-WI

    Web interface for managing Haproxy, Nginx, Apache and Keepalived

    For those who need a convenient interface for managing all services in one place. Roxy-WI was created for people who want to have a fault-tolerant infrastructure, but do not want to plunge deep into the details of setting up and creating a cluster based on HAProxy, NGINX, Apache, and Keepalived. Use Roxy-WI to build a high available cluster for a couple of clicks: install HAProxy, NGINX, Apache, Keepalived, and its exporters, and carry out the initial configuration for the services. Collect...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 19
    Crawl4AI

    Crawl4AI

    Open-source LLM Friendly Web Crawler & Scraper

    Crawl4AI is a high-performance, AI‑ready web crawler tailored for LLM data ingestion and RAG pipelines. It supports adaptive crawling heuristics (stopping when enough info is gathered), structured markdown output, and high-speed parallel execution. Designed to operate at scale with optional Docker deployment and framework integrations.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Composio

    Composio

    Composio equip's your AI agents & LLMs

    Empower your AI agents with Composio - a platform for managing and integrating tools with LLMs & AI agents using Function Calling. Equip your agent with high-quality tools & integrations without worrying about authentication, accuracy, and reliability in a single line of code.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 21
    Browser Use

    Browser Use

    Make websites accessible for AI agents

    Browser-Use is a framework that makes websites accessible for AI agents, enabling automated interactions and data extraction from web pages.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 22
    PyScript
    PyScript is a framework that allows users to create rich Python applications in the browser using HTML's interface and the power of Pyodide, MicroPython and WASM, and modern web technologies. PyScript is a meta project that aims to combine multiple open technologies into a framework that allows users to create sophisticated browser applications with Python. It integrates seamlessly with the way the DOM works in the browser and allows users to add Python logic in a way that feels natural both...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 23
    Mail-in-a-Box

    Mail-in-a-Box

    Take back control of your email with this mail server in a box

    ... a good mail server easy, promote decentralization, innovation, and privacy on the web, have automated, auditable, and idempotent configuration, not make a totally unhackable, NSA-proof server, and not make something customizable by power users. Mail-in-a-Box turns a fresh Ubuntu 18.04 LTS 64-bit machine into a working mail server by installing and configuring various components. It is a one-click email appliance. There are no user-configurable setup options. It "just works."
    Downloads: 6 This Week
    Last Update:
    See Project
  • 24
    X-Crawl

    X-Crawl

    Flexible Node.js AI-assisted crawler library

    A high-performance web crawling and scraping framework for Node.js, designed for large-scale data extraction.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 25
    JS Beautifier

    JS Beautifier

    Beautifier for javascript

    ...-beautify script. As with the Python script, the beautified result is sent to stdout unless otherwise configured. You can also use js-beautify as a node library (install locally, the npm default). The beautifier can be added on your page as web library. JS Beautifier is hosted on two CDN services: cdnjs and rawgit. You can beautify javascript using JS Beautifier in your web browser, or on the command-line using node.js or python.
    Downloads: 5 This Week
    Last Update:
    See Project
Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.