666 projects for "python web crawler" with 2 filters applied:

  • 1
    Tabby Web

    An SSH/Telnet/Serial client in your browser

    Tabby Web brings a modern terminal experience to the browser by pairing a web UI with a backend gateway that brokers TCP connections over WebSockets. It aims to deliver an experience similar to the desktop Tabby terminal—sessions, profiles, and rich configuration—while being accessible anywhere through a login. The architecture splits concerns: a Django-based control plane manages users, auth, and configuration, while a gateway service handles network transport so browser clients can reach...
    Downloads: 6 This Week
  • 2
    Scrapy

    A fast, high-level web crawling and web scraping framework

    Scrapy is a fast, open source, high-level framework for crawling websites and extracting structured data from these websites. Portable and written in Python, it can run on Windows, Linux, macOS and BSD. Scrapy is powerful, fast and simple, and also easily extensible. Simply write the rules to extract the data, and add new functionality if you wish without having to touch the core. Scrapy does the rest, and can be used in a number of applications. It can be used for data mining, monitoring...
    Downloads: 20 This Week
  • 3
    dude uncomplicated data extraction

    dude uncomplicated data extraction: A simple framework

    Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, makes it easy to build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax. Dude is currently in Pre-Alpha; expect breaking changes. You can run your scraper from the terminal by supplying URLs, an output filename of your choice, and the paths to your Python scripts to the dude scrape command.
    Downloads: 0 This Week
  • 4
    img2dataset

    Easily turn large sets of image URLs into an image dataset

    Easily turn large sets of image URLs into an image dataset. Can download, resize and package 100M URLs in 20h on one machine. Also supports saving captions for URL+caption datasets. Opt-out directives: websites can send the HTTP headers X-Robots-Tag: noai, X-Robots-Tag: noindex, X-Robots-Tag: noimageai and X-Robots-Tag: noimageindex. By default, img2dataset will ignore images with such headers.
    Downloads: 0 This Week
  • 5
    Gunicorn

    WSGI HTTP Server for UNIX, fast clients and sleepy applications

    Gunicorn 'Green Unicorn' is a Python WSGI HTTP Server for UNIX. It uses a pre-fork worker model. The Gunicorn server is broadly compatible with various web frameworks, simply implemented, light on server resources, and fairly speedy. You can run Gunicorn by using commands or integrate with popular frameworks like Django, Pyramid, or TurboGears. For deploying Gunicorn in production, see Deploying Gunicorn.
    Downloads: 0 This Week
  • 6
    Requests for PHP

    Requests for PHP is a humble HTTP request library

    Requests is an HTTP library written in PHP, for human beings. It is roughly based on the API from the excellent Requests Python library. Requests is ISC Licensed (similar to the new BSD license) and has no dependencies, except for PHP 5.6+. Despite PHP’s use as a language for the web, its tools for sending HTTP requests are severely lacking. cURL has an interesting API, to say the least, and you can’t always rely on it being available. Sockets provide only low-level access and require you to build most of the HTTP response parsing yourself. ...
    Downloads: 2 This Week
  • 7
    Eric Integrated Development Environment

    Python Development Environment with all batteries included

    Eric is a Python IDE written using PyQt and QScintilla. It provides various features such as any number of open editors, an integrated (remote) debugger, project management facilities, unit test, refactoring and much more.
    Downloads: 229 This Week
  • 8
    ZK - Simply Ajax and Mobile
    ZK is an open-source Java framework for building modern web and mobile applications. It enables developers to create rich, interactive UIs using only Java — no JavaScript required. With 200+ Ajax-powered components, event-driven architecture, and support for popular technologies like Spring, Java EE, and JSP/JSF, ZK makes it simple to deliver powerful and user-friendly web applications.
    Downloads: 8 This Week
  • 9

    http-proxy-tunnel

    Create nested tunnels through HTTP proxies

    Http-proxy-tunnel creates TCP tunnels through HTTP proxies that permit the CONNECT method. It differs from other proxy tunnelling programs in that it can tunnel through multiple proxies and can use SSL tunnels. These abilities mean that, in combination with a web server that can proxy (such as Apache), you can serve normal web pages from ports 80 and 443 and connect to the server (using SSH, say) via those ports at the same time. All available documentation can be read online at...
    Downloads: 0 This Week
  • 10
    ciwiki

    Personal or family wiki with low resource requirements.

    Personal lightweight wiki based on DidiWiki. Upgraded to support text and highlight colors, image resizing, and embedded video (YouTube, Dailymotion...). Written in C, it doesn't require a lot of RAM. Works fine on Raspbian (Raspberry Pi). Example of Ciwiki running on Raspberry Pi B+ (700MHz, 512MB): http://inphilly.dyn.dhs.org
    Downloads: 1 This Week
  • 11
    MOFO Linux

    A live Linux environment for computing without censorship barriers.

    MOFO Linux is a USB pluggable live Linux environment you boot on PC hardware. It gives you the power to unblock any media, at your discretion, clearing the way for you to read, write, watch, listen to, debate, or collaborate anywhere - beyond the reach of Big Brother. In other words, you jump the barrier, find media, and interact with people. MOFO Linux is designed for easy usage on home PCs, laptop computers, or workstations, whether installed in internet cafes anywhere in the world or on...
    Downloads: 44 This Week
  • 12
    CerberusCMS5

    Cerberus Content Management System

    Cerberus Content Management System is a dynamic, secure and infinitely expandable CMS designed after a Unix-like model. It is a custom written Web Application Framework (W.A.F.) with a consistent and custom written Pre-Hyper-Text-Post-Processor Programming Code Framework (P.C.F.). This Web Application Software Project's aim is to be the fastest and most secure Web Application Framework, Web Application Programming Code Framework, Text, Voice and Video Communications Platform and Content...
    Downloads: 0 This Week
  • 13
    Easyspider - Distributed Web Crawler

    Easy Spider is a distributed Perl Web Crawler Project from 2006

    Easy Spider is a distributed Perl Web Crawler Project from 2006. It features code for crawling webpages, distributing them to a server and generating XML files from them. The client side can be any computer (Windows or Linux) and the server stores all data. Websites that use EasySpider crawling for article-writing software: https://www.artikelschreiber.com/en/ https://www.unaique.net/en/ https://www.unaique.com/ https://www.artikelschreiben.com/ https://www.buzzerstar.com/ https://easyperlspider.sourceforge.io/ https://www.sebastianenger.com/ https://www.artikelschreiber.com/opensource/ It is fun to look at code from a few years ago and see how one has improved. ...
    Downloads: 0 This Week
  • 14
    LXR Cross Referencer
    A general-purpose source code indexer and cross-referencer that provides web-based browsing of source code with links to the definition and usage of any identifier. Supports multiple languages. Up-to-date information is available at http://lxr.sourceforge.net
    Downloads: 4 This Week
  • 15
    Cinemagoer

    Python package to retrieve and manage data of the IMDb

    Cinemagoer is a Python package useful to retrieve and manage the data of the IMDb movie database about movies, people, characters and companies. Platform-independent, it can retrieve data from both the IMDb's web server and a local copy of the whole db.
    Downloads: 1 This Week
  • 16
    Webware for Python

    The Classic Webware for Python

    Webware for Python is a suite of components for dynamic, server-side web development.
    Downloads: 1 This Week
  • 17
    Zenoss Community Edition

    Zenoss - Intelligent IT Operations Management

    Zenoss provides software-defined IT operations for the world’s largest organizations. We deliver the ultimate level of IT service health with simplicity by providing the most granular and intelligent IT service modeling possible, at any scale, and sharing these unique insights with other IT operations management (ITOM) tools to make them more efficient. Zenoss Community Edition is not a “demo” or trial version of Zenoss Enterprise or Zenoss Cloud! Before You install Zenoss Community...
    Downloads: 9 This Week
  • 18
    Software, information, data sets and documentation for the Web as Corpus community.
    Downloads: 0 This Week
  • 19
    Liferay Portal

    The world's leading open source portal

    Liferay Portal is the world's leading enterprise open source portal framework, offering integrated Web publishing and content management, an enterprise service bus and service-oriented architecture, and compatibility with all major IT infrastructure. Check GitHub for our latest releases: https://github.com/liferay/liferay-portal/releases https://github.com/liferay/liferay-ide/releases
    Downloads: 147 This Week
  • 20
    Zero Install
    Zero Install is a decentralised cross-distribution software installation system. Create one package that works everywhere! With dependency handling and automatic updates, full support for shared libraries, and integration with native package managers.
    Downloads: 3,663 This Week
  • 21
    TimothyDocs

    Timothy is a cloud-based storage system designed to document your work

    Timothy is a cloud-based documentation system. Timothy will document any endeavor because it stores not only the documents created during the project but also information about those files. Like most storage schemes, Timothy creates a hierarchy of categories through which one may browse. Timothy displays information about the document or category as well as its name. This use of metadata explains the structure and content of the project to the user as he browses. Users...
    Downloads: 0 This Week
  • 22

    PHP mini vulnerability suite

    Multiple server/webapp vulnerability scanner

    GitHub: https://github.com/samedog/phpmvs
    Downloads: 0 This Week
  • 23
    SFM2Web reads text and database files encoded with SFMs (Standard Format Markers) and then generates a web site according to flags specified in control files. This is useful for web publication of MDF lexicons, USFM Bible books, texts, phrasebooks, etc.
    Downloads: 0 This Week
  • 24
    OpenSearchServer Search Engine

    An open source search engine with RESTful API and crawlers

    OpenSearchServer is a powerful, enterprise-class search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST/API, Ruby, Rails, Node.js, PHP, Perl), you will be able to quickly and easily integrate advanced full-text search capabilities into your application: full-text with basic semantic, join queries, boolean queries, facet and filter, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on...
    Downloads: 9 This Week
  • 25
    icemac.addressbook

    Multi-user address book application accessible through the web.

    Multi-user address book application accessible through the web. Store, edit, search and export addresses, phone numbers, … using a web browser. Code moved to https://bitbucket.org/icemac/icemac.addressbook For documentation, see https://icemacaddressbook.readthedocs.io/en/latest/ For new releases (after 6.0.2), see https://pypi.org/project/icemac.addressbook/#history
    Downloads: 0 This Week