Showing 73 open source projects for "xpath"

View related business solutions
  • $300 Free Credits to Build on Google Cloud Icon
    $300 Free Credits to Build on Google Cloud

    New to Google Cloud? Get $300 in credits to explore Compute Engine, BigQuery, Cloud Run, Gemini Enterprise Agent Platform, and more.

    Start your next project with $300 in free Google Cloud credit. Spin up VMs, run containers, query petabytes in BigQuery, or build agents with Gemini Enterprise Agent Platform. Once your credits are used, keep building with 20+ always-free tier products including Compute Engine, Cloud Storage, GKE, and Cloud Run functions. No commitment required—just sign up and start building.
    Claim $300 Free
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 1
    Scrapy

    Scrapy

    A fast, high-level web crawling and web scraping framework

    Scrapy is a fast, open source, high-level framework for crawling websites and extracting structured data from these websites. Portable and written in Python, it can run on Windows, Linux, macOS and BSD. Scrapy is powerful, fast and simple, and also easily extensible. Simply write the rules to extract the data, and add new functionality if you wish without having to touch the core. Scrapy does the rest, and can be used in a number of applications. It can be used for data mining, monitoring...
    Downloads: 23 This Week
    Last Update:
    See Project
  • 2
    gain

    gain

    Asyncio-based Python framework for building fast web crawling spiders

    ...Developers define crawlers using components such as spiders, parsers, and items, allowing them to organize crawling logic and data extraction rules clearly. Gain supports CSS selectors and XPath expressions for parsing page content and extracting specific elements. Gain also allows developers to configure headers, concurrency levels, and proxy settings to control how crawlers interact with target websites. Because it uses asynchronous programming, Gain can handle multiple requests efficiently while minimizing blocking operations.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    lxml

    lxml

    The lxml XML toolkit for Python

    A Python library for efficient XML and HTML processing, known for speed and compatibility. The lxml XML toolkit is a Pythonic binding for the C libraries libxml2 and libxslt. It is unique in that it combines the speed and XML feature completeness of these libraries with the simplicity of a native Python API, mostly compatible but superior to the well-known ElementTree API. The latest release works with all CPython versions from 3.6 to 3.12. See the introduction for more information about the...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 4
    Crawl4AI

    Crawl4AI

    Open-source LLM Friendly Web Crawler & Scraper

    Crawl4AI is a high-performance, AI‑ready web crawler tailored for LLM data ingestion and RAG pipelines. It supports adaptive crawling heuristics (stopping when enough info is gathered), structured markdown output, and high-speed parallel execution. Designed to operate at scale with optional Docker deployment and framework integrations.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Ship Agents Faster Icon
    Ship Agents Faster

    Transform your applications and workflows into powerful agentic systems at global scale.

    Gemini Enterprise Agent Platform lets you rapidly build, scale, govern and optimize production-ready agents grounded in your organization's data. The platform enables developers to build custom or pre-built agents for virtually any use case. New customers get $300 in free credits.
    Get Started Free
  • 5

    contools

    console tools, batch scripts, shell scripts, shell tools, utilities

    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Wapiti

    Wapiti

    Wapiti is a web-application vulnerability scanner

    Wapiti is a vulnerability scanner for web applications. It currently search vulnerabilities like XSS, SQL and XPath injections, file inclusions, command execution, XXE injections, CRLF injections, Server Side Request Forgery, Open Redirects... It use the Python 3 programming language.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 7
    malware-samples

    malware-samples

    A collection of malware samples and relevant dissection information

    This repo is a public collection of malware samples and related dissection/analysis information, maintained by InQuest. It gathers various kinds of malicious artifacts, executables, scripts, macros, obfuscated documents, etc., with metadata (e.g., VirusTotal reports), file carriers, and sample hashes. It’s intended for malware analysts/researchers to help study how malware works, how they are delivered, and how it evolves.
    Downloads: 91 This Week
    Last Update:
    See Project
  • 8
    dude uncomplicated data extraction

    dude uncomplicated data extraction

    dude uncomplicated data extraction: A simple framework

    Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, was to easily build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax. Dude is currently in Pre-Alpha. Please expect breaking changes. You can run your scraper from terminal/shell/command-line by supplying URLs, the output filename of your choice and the paths to your python scripts to dude scrape command.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    RPA for Python

    RPA for Python

    Python package for doing RPA

    Python package for doing RPA. RPA for Python's simple and powerful API makes robotic process automation fun! You can use it to quickly automate away repetitive time-consuming tasks on websites, desktop applications, or the command line. See sample Python script, the RPA Challenge solution, and RedMart groceries example. To send a Telegram app notification, simply look up @rpapybot to allow receiving messages. To automate Chrome browser invisibly, use headless mode. To run 10X faster instead...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Secure File Transfer for Windows with Cerberus by Redwood Icon
    Secure File Transfer for Windows with Cerberus by Redwood

    Protect and share files over FTP/S, SFTP, HTTPS and SCP with the #1 rated Windows file transfer server.

    Cerberus supports unlimited users and connections on a single IP, with built-in encryption, 2FA, and a browser-based web client — all deployable in under 15 minutes with a 25-day free trial.
    Try for Free
  • 10
    EpiDoc: Epigraphic Documents in TEI XML

    EpiDoc: Epigraphic Documents in TEI XML

    XML text markup for ancient documents

    The EpiDoc Collaborative is developing specifications and tools for standards-based, digital publication and interchange of scholarly and educational editions of documentary and literary texts like inscriptions and papyri. The link below will take you to the EpiDoc home page on this site.
    Leader badge
    Downloads: 5 This Week
    Last Update:
    See Project
  • 11
    agglo

    agglo

    Multi facets CASE/AGL for easy project developments

    Agglo is a CASE (Computer Aided Software Engineering) or AGL (Atelier Génie Logiciel), whose aim is to facilitate installation ans use of various tools (either existing opensource tools, or agglo tools), in several facets of project development: - requirements management - unit tests - automated integration tests - toolkit for various languages (cpp, c, python, shell, xsl) other facets are to come ultimately: - coverage tests - planification - integration with hudson - code...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    mlscraper

    mlscraper

    ML-based HTML scraper that learns extraction rules from examples

    mlscraper is a Python library designed to automatically extract structured data from HTML pages without requiring developers to manually write CSS selectors or XPath rules. Instead of defining extraction logic by hand, users provide a few examples of the data they want to retrieve from a webpage. It analyzes those examples within the HTML document and determines patterns or rules that can be used to extract the same type of information from similar pages. Once trained, the generated scraper can process new pages and return the extracted data in structured formats such as dictionaries or lists. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 13
    DocBook to LaTeX Publishing transforms your SGML/XML DocBook documents to DVI, PostScript or PDF by translating them in pure LaTeX as a first process. MathML 2.0 markups are supported too. It started as a clone of DB2LaTeX.
    Leader badge
    Downloads: 35 This Week
    Last Update:
    See Project
  • 14
    DocBook Authoring and Publishing Suite

    DocBook Authoring and Publishing Suite

    DocBook Publishing Made Easy

    The DAPS project moved to https://github.com/openSUSE/daps The SUSE XSL Stylesheets have moved to https://github.com/openSUSE/suse-xsl To join the discussion, under https://github.com/openSUSE/daps/discussions
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Requests-HTML

    Requests-HTML

    Pythonic HTML Parsing for Humans

    ...When using this library you automatically get full JavaScript support! (Using Chromium, thanks to puppeteer) CSS Selectors (a.k.a jQuery-style, thanks to PyQuery). XPath Selectors, for the faint of heart. Mocked user-agent (like a real web browser). Automatic following of redirects. Connection–pooling and cookie persistence. The Requests experience you know and love, with magical parsing abilities, and async support. The rest of the code operates the same way as the synchronous version except that results is a list containing multiple response objects however the same basic processes can be applied as above to extract the data you want.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Downloads: 2 This Week
    Last Update:
    See Project
  • 17
    IdeoType is a book compiler that converts manuscript (XHTML) to book (PDF) on the fly.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18

    TekDS820

    Drive your obsolete Tek oscilloscope remotely with python

    Control a Tek TDS-820 remotely via a GPIB interface using linux-gpib, python, matplotlib and tkinter. The programming manual is used to generate an XML schema which is then used to assist in building extra functionality into the tkinter user interface.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19

    Yang XML viewer

    PyGtk2-based application that shows Yang-models as Treeview

    The application uses 'pyang' for creating a XML file and displaying the Yang-model in a Treeview. Additionally it allows to send Netconf requests to a configured target. For this purpose, it uses the Netconf client from [1] or can use an external console too. [1] https://github.com/tail-f-systems/JNC/blob/master/examples/2-junos/netconf-console
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    DoCookBook

    DoCookBook

    Cookbook Style Document for DocBook Customizations

    This project has been moved to GitHub: https://github.com/tomschr/dbcookbook/ The DoCookBook project aims to create an open source book about DocBook and the DocBook XSL stylesheets written as a cookbook and released under a Creative Commons license.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21

    pgal

    Online gallery factory

    Given images folders, generate a static web gallery. Note: This project has been moved to: https://github.com/viewplatgh/pgal
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    dynamide
    dynamide is a dynamic web application framework for handling the presentation and business layers in a traditional web app. See http://dynamide.com
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    webgeno is a light-weight Content Management System (CMS) for generating websites offline. It is primarily designed for personal websites where the user doesn't have access to the server except to upload files.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    ETICS
    ETICS stands for "eInfrastructure for Testing, Integration and Configuration of Software". It provides software professionals with an "out-of-the-box" build and test system, powered with a build and test product repository.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    A simple and fast way to generate API documentation for Python source code. (text, XML, LaTeX, DocBook, etc) RealDoc does not require modules to be imported, so no hassle with non-running source trees.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • Next
Auth0 Logo