OriSVG aims to help origami diagram designers by providing markers, arrows, strokes, and preformatted diagrams in SVG, along with some Inkscape plugins and scripts.
Cathnet is developing the infrastructure for the Catholic Semantic Web. Technologies involved include, but are not limited to, XML, RDF, NLP, Zope, Plone and Plone products.
XSDB: XML is to DATA as HTML is to DOCUMENT. Publish and combine data as easily as HTML and web browsers publish and display documents. Implementations in Python, JavaScript, Java, and C#/.NET.
A CGI program written in Python that turns HTTP GET and POST queries into SQL queries, runs them, converts the results into XML, and returns the XML as the response to the HTTP query. It also supports returning a single BLOB as the whole response.
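The query-to-SQL-to-XML flow described above can be sketched roughly as follows. This is not the project's code: the function name `query_to_xml`, the `allowed_columns` whitelist, the `<results>`/`<row>` element names, and the use of SQLite are all assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs


def query_to_xml(conn, query_string, table, allowed_columns):
    """Turn an HTTP query string into a SELECT and return the rows as XML.

    Hypothetical helper: `table` is chosen by the server, never by the
    client, and only whitelisted column names may appear in the WHERE clause.
    """
    params = parse_qs(query_string)
    filters = [(col, vals[0]) for col, vals in params.items()
               if col in allowed_columns]

    sql = f"SELECT * FROM {table}"
    if filters:
        # Values are bound as parameters, never interpolated into the SQL.
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col, _ in filters)

    cursor = conn.execute(sql, [value for _, value in filters])
    names = [desc[0] for desc in cursor.description]

    root = ET.Element("results")
    for row in cursor:
        record = ET.SubElement(root, "row")
        for name, value in zip(names, row):
            ET.SubElement(record, name).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")
```

A real CGI script would read the query string from the request environment and write the XML back with a suitable content type; that plumbing is left out here.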
A set of XML specifications for game-related metadata, an API, and a collection of associated tools for managing data. This is to serve as a reusable infrastructure for creating game frontends able to deal with large numbers of games (such as MAME).
wxBrowser is an application browser based on the wxWidgets GUI framework. It's similar to a regular web browser, except that instead of reading HTML and displaying content, it reads XML and executes presentation logic (wxPython) in a client-side application.
A Python program to deduce the DTD from a set of related XML files. It may not be 100% automatic, but it eases the initial step before manual adjustment. The number of documents can be huge, such as 100k.
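One way such a deduction could work is sketched below; it is an illustration, not this project's algorithm. It records which children, attributes, and text each element was seen with, then emits deliberately loose declarations that would still need manual tightening. The function name `infer_dtd` is hypothetical.

```python
import xml.etree.ElementTree as ET


def infer_dtd(documents):
    """Return loose DTD declarations covering every element in `documents`.

    `documents` is an iterable of XML strings. Content models are the most
    permissive form, (a|b)*, because order and cardinality are not tracked.
    """
    children, attributes, has_text = {}, {}, {}
    for text in documents:
        for elem in ET.fromstring(text).iter():
            kids = children.setdefault(elem.tag, [])
            names = attributes.setdefault(elem.tag, [])
            has_text.setdefault(elem.tag, False)
            if elem.text and elem.text.strip():
                has_text[elem.tag] = True
            for child in elem:
                if child.tag not in kids:
                    kids.append(child.tag)
                # Text after a child element also means mixed content.
                if child.tail and child.tail.strip():
                    has_text[elem.tag] = True
            for name in elem.attrib:
                if name not in names:
                    names.append(name)

    lines = []
    for tag, kids in children.items():
        if has_text[tag] and kids:
            model = "(#PCDATA|" + "|".join(kids) + ")*"
        elif has_text[tag]:
            model = "(#PCDATA)"
        elif kids:
            model = "(" + "|".join(kids) + ")*"
        else:
            model = "EMPTY"  # never seen with content; may need widening
        lines.append(f"<!ELEMENT {tag} {model}>")
        for name in attributes[tag]:
            lines.append(f"<!ATTLIST {tag} {name} CDATA #IMPLIED>")
    return "\n".join(lines)
```

For a corpus of 100k files, the same bookkeeping could be fed one file at a time so that only the accumulated tag tables stay in memory.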
SciBook is a framework for XSLT transformation from XHTML to HTML. The transformation can be extended by adding plugins. The standard LaTeX plugin can convert LaTeX code to images.
Image2DocInfo was made to quickly tag digital pictures. A GUI allows you to set attributes for an image and then store them in XML files. Those files follow the Dublin Core naming scheme and are stored in the same directories as the pictures.
Splice is a Python-based content aggregation and publishing platform. It provides all of the features of a common weblog combined with synchronization capabilities, allowing content to be slurped in from external sources, classified, and published.
This editor aims to help users create their own ebooks in the newly released Open Publishing standard defined by the International Digital Publishing Forum. The editor will permit the creation of ebooks in OCF-1.0 format (.epub).
DXIE is a Dynamic XML Instance Editor that provides dynamic GUI editing of XML documents. DXIE reads a Document (XML) and its Schema (XSD) and produces a dynamic UI for editing (search, add, edit, and delete) of the Document.
pyModeliXe is a Python template engine. Designers will appreciate its Dreamweaver support and its XML-compliant templates. Developers will be pleased with its plug-in system, error management, and template cache.
Simple (Markup for) XML, or SMX, is an attempt to reduce the amount of syntactical overhead associated with standard XML. AKA: XML! Now with 50% Higher SNR!!
XML is a standard for moving data around easily, and CSV is the easiest format for displaying huge chunks of data. xml2csv offers lightweight and easy conversion of XML data to CSV-formatted data.
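As a rough illustration of this kind of conversion (not xml2csv's own code), the sketch below assumes a flat document in which each child of the root is one record and each grandchild is one field. The function name `xml_to_csv` is hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


def xml_to_csv(xml_text):
    """Flatten <root><record><field>value</field>...</record>...</root> to CSV."""
    records = list(ET.fromstring(xml_text))

    # Column names are the field tags, in first-seen order across all records.
    columns = []
    for record in records:
        for field in record:
            if field.tag not in columns:
                columns.append(field.tag)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Fields missing from a record are written as empty cells.
        writer.writerow({f.tag: (f.text or "").strip() for f in record})
    return out.getvalue()
```

Nested elements and attributes are ignored in this sketch; handling them would need a rule for mapping paths to column names.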
xml.dom.easydom for Python primarily behaves like any xml.dom.* you already use. Additionally, easydom provides operator overloads that make XML processing code more readable and hence less error-prone. See about 90% of your XML code boosted.
WebPath is an experimental implementation of XPath 2.0 in Python, initially developed during Yahoo! Hack Day. It uses a novel parsing technique called Top Down Operator Precedence. Seeking developers to improve implementation and conformance.
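For readers unfamiliar with Top Down Operator Precedence (also known as Pratt parsing), here is a minimal sketch of the technique applied to integer arithmetic rather than XPath. It is not WebPath's code; the class and function names and the binding-power values are assumptions chosen for the example.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")

# Left binding power of each infix operator; higher binds tighter.
BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20}


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    tokens.append("(end)")
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def expression(self, rbp=0):
        # Parse a prefix form, then absorb infix operators that bind
        # more tightly than the caller's right binding power.
        left = self.nud(self.advance())
        while BINDING_POWER.get(self.peek(), 0) > rbp:
            left = self.led(self.advance(), left)
        return left

    def nud(self, token):
        """Null denotation: the token appears at the start of an expression."""
        if token.isdigit():
            return int(token)
        if token == "-":
            return -self.expression(30)  # unary minus binds tightest
        if token == "(":
            value = self.expression(0)
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        raise SyntaxError(f"unexpected token {token!r}")

    def led(self, op, left):
        """Left denotation: the token appears after a left operand."""
        right = self.expression(BINDING_POWER[op])
        if op == "+":
            return left + right
        if op == "-":
            return left - right
        if op == "*":
            return left * right
        return left / right


def evaluate(text):
    parser = Parser(text)
    value = parser.expression()
    if parser.peek() != "(end)":
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

The same structure extends to a larger grammar by registering more tokens with their own binding powers and nud/led behavior, which is the appeal of the technique for an expression language.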