Showing 90 open source projects for "python web crawler"

  • 1
    Efficient web UI for the wget utility, written in Python (Twisted). It works without any web server. The script consists of a single file.
    Downloads: 1 This Week
  • 2
    XUProxy is an extensible multi-protocol proxy based on the Twisted framework. It supports multiple protocol plugins (currently only HTTP), and multiple "filter" plugins for things like logging, caching, and Proxomitron-compatible ad filtering.
    Downloads: 0 This Week
  • 3
    A tool for autonomous and virtual topical data integration using the focused web-harvesting method.
    Downloads: 0 This Week
  • 4
    Scripts and tools for OpenStreetMap (OSM). Sample maps can be found at: <http://www.leretourdelautruche.com/map/index.html>
    Downloads: 0 This Week
  • 5
    The London Datastore (http://data.london.gov.uk) was created by the Greater London Authority (GLA) as an innovation towards freeing London’s data. This SourceForge Project will be used to Open Source our development efforts surrounding data formats
    Downloads: 0 This Week
  • 6
    Redland is a set of object-based, modular and portable C RDF libraries providing RDF APIs for the graph, triple storage (librdf), RDF/XML parsing and serializing (Raptor), SPARQL RDF querying (Rasqal). Language APIs in Perl, PHP, Python, Ruby and others.
    Downloads: 8 This Week
  • 7
    Some tools related to the Music Ontology - including domain-specific Semantic Web crawlers, audio collection management and mapping tools
    Downloads: 0 This Week
  • 8
    PyH
    A powerful Python module that lets you output HTML code from within a Python script efficiently and conveniently. Code your web page like a GUI! Create tags and modify their attributes at any time during your script. http://pyh/googlecod
    Downloads: 0 This Week
  • 9
    pyservices
    Using our library, you can easily deploy and consume services available on the web. PyServices is a Pythonic library that provides a default interface to web services written in many different protocols. Our objective is to describe and implement
    Downloads: 0 This Week
  • 10
    ZK Light has been renamed to ZKuery and moved to http://code.google.com/p/zkuery/. ZK Light is a client-only version of ZK; it supports Java, C, PHP, Python...
    Downloads: 0 This Week
  • 11
    Methanol is a scriptable multi-purpose web crawling system with an extensible configuration system and speed-optimized architectural design. Methabot is the web crawler of Methanol.
    Downloads: 0 This Week
  • 12
    ASI to Smart-M3 SIB synchronization agent
    Downloads: 0 This Week
  • 13
    Dare-Dare stands for Document Ajax Reader Extension. It is a full JavaScript online PDF reader. No Flash, no closed sources!
    Downloads: 0 This Week
  • 14
    Make AsciiDoc part of your literate programming tool set. With eWEB you can weave and tangle literate programs written as AsciiDoc documents, using embedded WEB code snippets.
    Downloads: 0 This Week
  • 15
    Sofa is a CUDA-based reasoner
    Downloads: 0 This Week
  • 16
    The Java Sitemap Parser can parse a website's Sitemap (http://www.sitemaps.org/). This is useful for web crawlers that want to discover URLs from a website that is using the Sitemap Protocol. This project has been incorporated into crawler-commons (https://github.com/crawler-commons/crawler-commons) and is no longer being maintained.
    Downloads: 0 This Week
  • 17
    SnapLogic is an Open Source Data Integration framework that combines the power of state-of-the-art dynamic programming languages with standard Web interfaces to solve today's most pressing problems in data integration.
    Downloads: 1 This Week
  • 18
    This project provides scripts for automatically generating man pages from wiki-based web sources. The scripts download wiki source files from a wiki web server, convert them from wiki to roff format, and then build an archive of man pages.
    Downloads: 0 This Week
  • 19
    Txt2tags converts a text file with minimal markup to HTML, XHTML, SGML, LaTeX, Lout, UNIX Man Page, Wikipedia, Google Code Wiki, DokuWiki, MoinMoin, MagicPoint(mgp), PageMaker. Features: simple, fast, automatic TOC, macros, filters, include, GUI/CLI/
    Downloads: 0 This Week
  • 20
    This project aims to provide an offline version of Wikipedia, accessible from a web browser.
    Downloads: 0 This Week
  • 21
    ZML, the Zeitung Markup Language, is a simple CMS for small newspapers. It was specifically designed to publish a student newspaper in print and on the Web. It uses LaTeX and XHTML. So far, it is documented in German only.
    Downloads: 0 This Week
  • 22
    PyAMF provides Action Message Format (AMF) support for Python that is compatible with the Adobe Flash Player. It includes integration with Python web frameworks like Django, Pylons, Twisted, SQLAlchemy and more. You can download the latest version from h
    Downloads: 0 This Week
  • 23
    XForms Validator is the open source version of the online XForms Validator, available at http://xformsinstitute.com/validator/
    Downloads: 0 This Week
  • 24
    sabnzbd-xmlgui is an Ajax-based frontend built around sabnzbdplus. It also provides an XML-based API that lets other applications connect to sabnzbd easily, while maintaining the existing web-based Ajax GUI.
    Downloads: 0 This Week
  • 25
    This project will provide translation of mathematical content, from TeX to MathML and vice-versa, and to graphics formats, as a web service. TeX, running as a daemon, is used for mathematical typography.
    Downloads: 0 This Week