35 projects for "python web crawler" with 2 filters applied:

  • 1
    Lexbor

    Lexbor is the development of an open source HTML renderer library

    Lexbor is the development of a web browser engine available as a software library; it ships with a free license and has no extra dependencies. For us, speed is an absolute must-have. In our development process, we focus on the fastest parsing techniques for HTML, CSS, and fonts, the fastest data processing methods, and the fastest ways to serve content to end users. Whether you are building a backend that handles millions of HTML documents or a UI-heavy user app, your software’s response rate always...
    Downloads: 0 This Week
  • 2
    EpiDoc: Epigraphic Documents in TEI XML

    XML text markup for ancient documents

    The EpiDoc Collaborative is developing specifications and tools for standards-based, digital publication and interchange of scholarly and educational editions of documentary and literary texts like inscriptions and papyri. The link below will take you to the EpiDoc home page on this site.
    Downloads: 1 This Week
  • 3
    Bots open source edi translator

    Bots is a complete translator for edi: edifact, x12, xml, tradacoms

    Bots is a complete translator for EDI (Electronic Data Interchange). Supported EDI data formats include edifact, x12, tradacoms, and xml. Mail: http://groups.google.com/group/botsmail Web-site: http://bots.sourceforge.net Wiki: http://bots.readthedocs.io Develop: https://github.com/eppye-bots/bots
    Downloads: 40 This Week
  • 4
    htmlarea

    Small, powerful, full featured WYSIWYG editor

    HTMLArea 4 is a browser-based WYSIWYG editor that easily replaces the TEXTAREA in your web pages. It is written in JavaScript and is suitable for use in any modern web browser and on any page of your web site. Current version is 4.0-2016-08-29.
    Downloads: 2 This Week
  • 5
    DoCookBook

    Cookbook Style Document for DocBook Customizations

    This project has been moved to GitHub: https://github.com/tomschr/dbcookbook/ The DoCookBook project aims to create an open source book about DocBook and the DocBook XSL stylesheets written as a cookbook and released under a Creative Commons license.
    Downloads: 0 This Week
  • 6

    acp245

    ACP245 Suite: Brazil Telematics protocol implementation

    This "suite" includes a portable reference implementation and testing tools for ACP245, the automotive telematics protocol defined by the Brazilian government. See: http://www.denatran.gov.br/simrav/simrav.asp
    Downloads: 0 This Week
  • 7

    SPARQL Endpoint interface to Python

    This project has been moved to http://rdflib.github.io/sparqlwrapper/

    A library for querying a SPARQL endpoint from Python
    Downloads: 0 This Week
  • 8
    Framework (scripts, configuration, code) to build free and public services around travel and leisure data. The project makes extensive use of existing data sources such as Geonames and dbPedia, and adds some glue around them (e.g., links).
    Downloads: 0 This Week
  • 9
    Wiko, the wiki compiler, compiles wiki-like files into HTML and LaTeX, combining easy wiki syntax, your preferred non-web text editor, and svn/cvs control to write static websites, scientific articles, or even blogs.
    Downloads: 0 This Week
  • 10
    Aurora Application Server is a new Python Web Application Server and Framework. The main goal of the project is to provide the developer with a complete set of tools to speed up the application development process. See project wiki for more information.
    Downloads: 1 This Week
  • 11
    This is the Open Source RESTful client for the take.io platform.
    Downloads: 0 This Week
  • 12
    Efficient web UI for the wget utility, written in Python (Twisted). It works without any web server. The script consists of a single file.
    Downloads: 1 This Week
  • 13
    XUProxy is an extensible multi-protocol proxy based on the Twisted framework. It supports multiple protocol plugins (currently only HTTP), and multiple "filter" plugins for things like logging, caching, and Proxomitron-compatible ad filtering.
    Downloads: 0 This Week
  • 14
    Scripts and tools for OpenStreetMap (OSM). Sample maps can be found at: <http://www.leretourdelautruche.com/map/index.html>
    Downloads: 0 This Week
  • 15
    The London Datastore (http://data.london.gov.uk) was created by the Greater London Authority (GLA) as an innovation towards freeing London’s data. This SourceForge Project will be used to Open Source our development efforts surrounding data formats
    Downloads: 0 This Week
  • 16
    Redland is a set of object-based, modular and portable C RDF libraries providing RDF APIs for the graph, triple storage (librdf), RDF/XML parsing and serializing (Raptor), SPARQL RDF querying (Rasqal). Language APIs in Perl, PHP, Python, Ruby and others.
    Downloads: 8 This Week
  • 17
    pyservices
    Using our library, you can easily deploy and consume services available on the web. PyServices is a Pythonic library that provides a default interface to web services written in many different protocols. Our objective is to describe and implement
    Downloads: 0 This Week
  • 18
    Dare-Dare stands for Document Ajax Reader Extension. It is an online PDF reader written entirely in JavaScript. No Flash, no closed sources!
    Downloads: 0 This Week
  • 19
    Make AsciiDoc part of your literate programming tool set. With eWEB you can weave and tangle literate programs written as AsciiDoc documents, using embedded WEB code snippets.
    Downloads: 0 This Week
  • 20
    Sofa is a CUDA-based reasoner
    Downloads: 0 This Week
  • 21
    The Java Sitemap Parser can parse a website's Sitemap (http://www.sitemaps.org/). This is useful for web crawlers that want to discover URLs from a website that is using the Sitemap Protocol. This project has been incorporated into crawler-commons (https://github.com/crawler-commons/crawler-commons) and is no longer being maintained.
    Downloads: 0 This Week
  • 22
    Txt2tags converts a text file with minimal markup to HTML, XHTML, SGML, LaTeX, Lout, UNIX Man Page, Wikipedia, Google Code Wiki, DokuWiki, MoinMoin, MagicPoint(mgp), PageMaker. Features: simple, fast, automatic TOC, macros, filters, include, GUI/CLI/
    Downloads: 0 This Week
  • 23
    ZML, the Zeitung Markup Language, is a simple CMS for small newspapers. It was specifically designed to publish a student newspaper in print and on the Web. It uses LaTeX and XHTML. So far, it is documented in German only.
    Downloads: 0 This Week
  • 24
    PyAMF provides Action Message Format (AMF) support for Python that is compatible with the Adobe Flash Player. It includes integration with Python web frameworks like Django, Pylons, Twisted, SQLAlchemy and more. You can download the latest version from h
    Downloads: 0 This Week
  • 25
    XForms Validator is the open source version of the online XForms Validator, available at http://xformsinstitute.com/validator/
    Downloads: 0 This Week