Showing 21 open source projects for "python web crawler"

View related business solutions
  • Grafana: The open and composable observability platform Icon
    Grafana: The open and composable observability platform

    Faster answers, predictable costs, and no lock-in built by the team helping to make observability accessible to anyone.

    Grafana is the open source analytics & monitoring solution for every database.
    Learn More
  • Build Securely on AWS with Proven Frameworks Icon
    Build Securely on AWS with Proven Frameworks

    Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

    Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.
    Download Now
  • 1
    EpiDoc: Epigraphic Documents in TEI XML

    EpiDoc: Epigraphic Documents in TEI XML

    XML text markup for ancient documents

    The EpiDoc Collaborative is developing specifications and tools for standards-based, digital publication and interchange of scholarly and educational editions of documentary and literary texts like inscriptions and papyri. The link below will take you to the EpiDoc home page on this site.
    Leader badge
    Downloads: 8 This Week
    Last Update:
    See Project
  • 2
    PyXB (“pixbee”) is a pure Python package that generates Python source code for classes that correspond to data structures defined by XMLSchema. In concept it is similar to JAXB for Java and CodeSynthesis XSD for C++.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 3
    DoCookBook

    DoCookBook

    Cookbook Style Document for DocBook Customizations

    This project has been moved to GitHub: https://github.com/tomschr/dbcookbook/ The DoCookBook project aims to create an open source book about DocBook and the DocBook XSL stylesheets written as a cookbook and released under a Creative Commons license.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Aurora Application Server is a new Python Web Application Server and Framework. The main goal of the project is to provide the developer with a complete set of tools to speed up the application development process. See project wiki for more information.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Business Automation Software for SMBs Icon
    Business Automation Software for SMBs

    Fed up with not having the time, money and resources to grow your business?

    The only software you need to increase cash flow, optimize resource utilization, and take control of your assets and inventory.
    Learn More
  • 5
    The London Datastore (http://data.london.gov.uk) was created by the Greater London Authority (GLA) as an innovation towards freeing London’s data. This SourceForge Project will be used to Open Source our development efforts surrounding data formats
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Methanol is a scriptable multi-purpose web crawling system with an extensible configuration system and speed-optimized architectural design. Methabot is the web crawler of Methanol.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    The Java Sitemap Parser can parse a website's Sitemap (http://www.sitemaps.org/). This is useful for web crawlers that want to discover URLs from a website that is using the Sitemap Protocol. This project has been incorporated into crawler-commons (https://github.com/crawler-commons/crawler-commons) and is no longer being maintained.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    XForms Validator is the open source version of the online XForms Validator, available at http://xformsinstitute.com/validator/
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    sabnzbd-xmlgui is an Ajax based frontend built around sabnzbdplus. It also provides an xml based API for other applications to easily connect with sabnzbd while at the same time maintaining the existing web based ajax gui.
    Downloads: 0 This Week
    Last Update:
    See Project
  • G-P - Global EOR Solution Icon
    G-P - Global EOR Solution

    Companies searching for an Employer of Record solution to mitigate risk and manage compliance, taxes, benefits, and payroll anywhere in the world

    With G-P's industry-leading Employer of Record (EOR) and Contractor solutions, you can hire, onboard and manage teams in 180+ countries — quickly and compliantly — without setting up entities.
    Learn More
  • 10
    4Suite is a platform for XML processing and knowledge-management, consisting of a library of integrated tools for XML processing, and an XML data repository and server with a rules-based engine.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    BugEye is an XML-based unit test creation framework. Being XML-based, it can be easily translated to almost any language. The current translations are C#, Java, JavaScript, and Visual Basic. Future translations include C++, Python, Perl, and PHP.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Cathnet is developing the infrastructure for the Catholic Semantic Web. Technologies involved include, but are not limited to, XML, RDF, NLP, Zope, Plone and Plone products.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    XSDB XML is to DATA as HTML is to DOCUMENT. Publish and combine data as easily as HTML format and web browsers publish and view documents. Implementations in Python, javascript, java, C#/.NET.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    wxBrowser is an application browser based on the wxWidgets GUI framework. It's similar to a regular old web browser only, instead of reading HTML and displaying content it reads XML and executes presentation logic (wxPython) in a client side application.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Modular, network based system for integrating separate multimedia systems
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Splice is a Python-based content aggregation and publishing platform. It provides all of the features of a common weblog combined with synchronization capabilities, allowing content to be slurped in from external sources, classified, and published.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Python wrappers for the way(s) you think. Mindwrapper is a framework for the rapid development of custom, data-centric, GUI applications with wxPython.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    xml2csv
    XML is a standard to move data around easily and CSV format is the easiest to display huge chunk of data. xml2csv offers, light weight and easy conversion of XML data to CSV formated data.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    BOFeed is a collection of scripts which process a feed of news articles for integrated display within popular open source CRMs.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    PHP + mod_python GUI for athenaCL
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Dictionnaire is an open-source French-English dictionary intended to cover modern phraseology as well as entries that are difficult to translate using traditional dictionaries.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next