664 projects for "python web crawler" with 2 filters applied:

  • 1
    Constellio Enterprise Search engine

    Open source Search Engine and Enterprise Search

    Constellio is an enterprise search engine that allows companies to search all their organization's information through a single interface (Web, CRM, ERP, ECM, Mail etc.). Constellio is based on Apache Solr and Google Search Appliance's connector. Constellio has a powerful web crawler.
    Downloads: 0 This Week
  • 2
    UpStage
    WE ARE NO LONGER USING SOURCEFORGE. Please visit http://www.upstage.org.nz for the most up-to-date code (v3 to be released January 2014, beta version available November 2013) and information. UpStage is a web-based venue for cyberformance: artists compile digital media in real time to create live theatrical performance for online audiences.
    Downloads: 0 This Week
  • 3
    Wiko, the wiki compiler, compiles wiki-like files into HTML and LaTeX, combining easy wiki syntax, your preferred non-web text editor, and svn/cvs control to write static websites, scientific articles, or even blogs.
    Downloads: 0 This Week
  • 4

    Ginger RSS Reader

    Web-based RSS Reader

    This is the old page. See https://sourceforge.net/projects/ginger-rss
    Downloads: 0 This Week
  • 5
    Atomschlag

    A lightweight Webkit browser written entirely in Python

    Atomschlag is a project to write a Webkit-based browser using PyGTK and PyWebkitGTK, completely in Python, as a usable, secure, and lightweight replacement for existing browsers in custom appliances. The primary project goals are: - small size; - minimal abilities to track you down based on the client info; - maximal compatibility with proxy-based anonymity layers such as I2P; - URL filtering for blocking ads and user tracking services; - simple and non-overloaded user interface.
    Downloads: 0 This Week
  • 6

    Lograph

    Turns log text into graphs, using Python and JavaScript.

    Graphs logs in the web browser. A fast JavaScript implementation is needed for use on large monitors.
    Downloads: 0 This Week
  • 7
    A Python interface to the gnuplot plotting program.
    Downloads: 10 This Week
  • 8

    LinkChecker

    check links in web documents or full websites

    New Homepage: http://wummel.github.io/linkchecker/ Linkchecker features: - recursive and multithreaded checking and site crawling - output in colored or normal text, HTML, SQL, CSV, XML or a sitemap graph in different formats - HTTP/1.1, HTTPS, FTP, mailto:, news:, nntp:, Telnet and local file links support - restrict link checking with regular expression filters for URLs - proxy support -...
    Downloads: 4 This Week
  • 9
    Yet another web crawler? Yes, but this one uses the full power of regular expressions to accept or reject, examine or ignore, save or refuse pages. You can also use MIME types to do all this. Powerful and flexible.
    Downloads: 0 This Week
  • 10
    Screenshot Paste plugin for Trac

    A Trac plugin to allow pasting screenshots or images with one click

    A Trac plugin that lets you paste screenshots or other images captured or copied to the clipboard directly as attachments to tickets, Wiki pages, etc., without first saving them as image files and then uploading them. Once the plugin is installed in Trac, you can attach a screenshot or any image in the clipboard to a Ticket or Wiki page with one click.
    Downloads: 0 This Week
  • 11

    Python Crawler Library

    Python Web Crawler Library

    A simple library for crawling the web. This library lets you create macros for crawling websites and performing simple actions in them, such as logging in.
    Downloads: 0 This Week
  • 12

    Spondulas

    Spondulas is browser emulator designed to retrieve web pages for hunti

    Spondulas is a browser emulator and parser designed to retrieve web pages for hunting malware. It supports generation of browser user agents, GET/POST requests, and SOCKS5 proxy. It can be used to parse HTML files sent via e-mail. Monitor mode allows a website to be monitored at intervals to discover changes in DNS or content over time. Autolog mode creates an investigation file that documents redirection chains. The retrieved web pages are parsed for links, which are reported to an output file. More...
    Downloads: 0 This Week
  • 13
    Where In the World Have You Been?
    A PHP script with maps of the World, China, Canada, USA, India, Africa and Europe that allows the user to select countries, provinces or states by clicking on them or selecting checkboxes. Selecting an entity turns it a default color that contrasts with the default colors of all bordering countries. Thus a patchwork is made to show the history of countries, states or provinces traveled. Added features allow users to download their maps, to blow them up to posters of any...
    Downloads: 0 This Week
  • 14

    SaWALi Web Application Library

    SaWALi is a website management tool written in Python.

    The SaWALi Web Application Library is a Python application that aims to provide a reasonably complete set of components for operating a multi-purpose website. Taking advantage of the Pylons Framework, SaWALi is fully-customisable and inherently-extensible. All of SaWALi's administrative and public interfaces can be modified to suit a website's userbase— from its document editors and server error pages down to its public-facing pages and site maps. Being a Python module, SaWALi can also...
    Downloads: 0 This Week
  • 15
    The archive-crawler project is building Heritrix: a flexible, extensible, robust, and scalable web crawler capable of fetching, archiving, and analyzing the full diversity and breadth of internet-accessible content.
    Downloads: 7 This Week
  • 16
    Seven-Labs

    Application Development

    This repository serves as our entire project space which contains all of the open-source projects we've worked on. - C/C++ - C#/.NET - PHP - HTML5/CSS3
    Downloads: 0 This Week
  • 17
    Pyjamas is a Python-to-JavaScript compiler, widget set, framework, and toolkit for developing applications that run in web browsers. The developer need not know anything about AJAX: all the AJAX tricks, for all major browsers, are entirely taken care of.
    Downloads: 0 This Week
  • 18
    PRO-Search is a crawler of FTP servers, SMB shares, HTTP, DC++ networks, ... with a powerful web search and navigation interface.
    Downloads: 0 This Week
  • 19
    PBP is a web browser made for testing web applications. Its user interface is a command interpreter with a simple, focused shell-like language which helps both developers and non-developers create robust functional tests with little effort.
    Downloads: 0 This Week
  • 20
    Notice Publisher plugin for Trac

    A Trac plugin to display Notices to any User visiting any page in Trac

    A Trac plugin to display Notices to any User visiting any page in Trac. Take a look at the Web site on Trac-Hacks: http://trac-hacks.org/wiki/NoticePublisherPlugin This is useful for bringing everyone's attention to news that affects all users, like the system going down, a solution to a common problem, and so on. Notices can contain Wiki-formatted syntax, thus allowing for rich content. Notices can have an expiration, expressed in hours, after which they automatically disappear...
    Downloads: 0 This Week
  • 21
    "Swish-e is a fast, flexible, and free open source system for indexing collections of Web pages or other files" (http://swish-e.org/ ) This module provides a Python API for this software.
    Downloads: 0 This Week
  • 22
    Biz is a WSGI-compatible web application framework written in Python. It aims to be a platform for easily developing secure and internationalized web applications.
    Downloads: 0 This Week
  • 23
    GlassBeadGame

    Organizing the Knowledge of Humanity

    OUT OF DATE: See the Singularity project here at sourceforge. This project aims to make the expanse of human knowledge beautifully presentable and the exabytes of data navigable by an average user via the power of a Unified Data Model and 3d visualization layer for the Web. It will invert the top 3 layers of the OSI network model to make a 3-dimensional presentation layer with a peer-to-peer session layer for the Internet. For the curious, there is a simple demo that provides...
    Downloads: 0 This Week
  • 24
    ftpsearch is a web-based indexing search engine for FTP servers that supports regular expressions in queries, monitoring for new files, some fancy stats, and so on.
    Downloads: 0 This Week
  • 25
    A collection of small web applications. For now it contains MyReporting, an application used within a team to report your weekly activity; it has a browsing feature and can email the reports.
    Downloads: 0 This Week