The Image Markup Tool will be an editing environment for creating TEI P5 XML files in which zones defined on images are linked to XML elements containing transcription and annotation data.
BKPML Manager is an application for backing up, restoring, and migrating data in databases. Its objective is to create hybrid backups that can be restored in any database.
This is a generic XML to RDF converter which uses XSLT transformations to convert any XML document into RDF format.
The transformation uses an XSLT processor like xsltproc. The command line for the Bash shell is:
xsltproc xml2rdf3.xsl document.xml > document.rdf
Reference:
Breitling, F. 2009: A standard transformation from XML to RDF via XSLT, Astronomical Notes, Vol 330 Issue 7, DOI: 10.1002/asna.200811233,
http://onlinelibrary.wiley.com/doi/10.1002/asna.200811233/abstract...
Given an XSD schema and an XSA.xml configuration file, XML Skeleton Annotations (XSA) generates a JSF form UI for creating XML records that comply with the XSD and follow the XML skeleton defined in XSA.xml, while keeping everything under your control.
PHigester is a PHP 5 port of Jakarta Commons Digester. Like its parent, PHigester lets you configure and use an XML -> PHP 5 object mapping, which triggers certain actions called rules whenever a particular pattern of nested XML elements is recognize
Mistral-IdM is a project whose aim is to provide an identity management system, with advanced authentication and authorization abilities, based on standards (SAML, XACML, XKMS), providing a user-friendly administration console.
Mex PREMIS Editor is an editor for the MEX Editor Framework. PREMIS stands for Preservation Metadata: Implementation Strategies. The editor will implement PREMIS v1.1 (version 2.0 is under development).
CpX is an XML-based lightweight C++ development environment. It's a philosophy of simple OO software engineering, with a C++ subset and base classes. It has general-purpose C++ services and tasks to speed up builds. Tired? A Sudoku game is included in CpX as a bonus.
amorph is an any-to-XML-to-any data transformation library. Use amorph to read almost any kind of data format required for further processing (CSV, fixed length, XML, electronic bills, custom formats, ...) within your application.
XMP PHP Toolkit Extension is a PHP module that includes the Adobe XMP Toolkit SDK. The main functions of Adobe XMP are available from PHP as classes and methods. The current release, 2.0, is based on the new Adobe XMP Toolkit SDK 5.1.2.
now here: https://github.com/plastex/plastex
plasTeX is a Python-based LaTeX document processing framework. It gives DOM-like access to a LaTeX document, as well as the ability to generate multiple output formats (e.g. HTML, DocBook, tBook, etc.).
Trial Criteria Online Data Entry (trialCODE): a Java-based user interface that codifies eligibility requirements used to automate the screening of potential subjects for clinical trials. Used for the caMATCH screening engine on the BreastCancerTrials.org site.
This project is to develop web applications and data integration functions to provide information on the collection records, ecology, geographic distribution, and taxonomic concepts of the vascular flora of the region.
Feeds2Mail is an application that periodically reads RSS feeds and sends them by email (SMTP). Highly configurable, it can filter feed posts by title and/or by category. It runs on Linux (Mono) and on MS Windows (Task Scheduler or crontab).
ResXML is an XML application for the presentation-oriented markup of resumes or curricula vitae. The aim is for ResXML to support a set of features which make it a very functional tool for generating resumes for actual use from an XML source.
XPN relies on a non-relational native XML database, where XML documents are stored in compressed form and indices enable fast access to structure and content, thus enabling fast evaluation of XQuery queries.
The SSAF ("Secure Search And Forwards") is a dirt-simple standalone web app for inexpensive and secure information sharing. Any uploaded record may be forwarded to an intended destination, and may also be stashed in a searchable repository.
Read-in XML tags in an array format accessible via "paths". Modify existing XML. Create a proper XML string. All methods are used to programmatically read, modify, and/or create XML.
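As a rough illustration of path-style access to XML (not this project's own API, which the description does not show), the following Python sketch uses the standard library's xml.etree.ElementTree. The helper name flatten and the sample document are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def flatten(element, prefix=""):
    """Map slash-separated paths to element text (hypothetical helper).

    Repeated sibling tags share one path here, so a later sibling
    overwrites an earlier one; a fuller version would add indices.
    """
    path = f"{prefix}/{element.tag}"
    result = {path: (element.text or "").strip()}
    for child in element:
        result.update(flatten(child, path))
    return result


# Read: parse a small sample document and index it by path.
doc = ET.fromstring("<book><title>XML Basics</title><author>Ann</author></book>")
paths = flatten(doc)

# Modify: change an existing element's text.
doc.find("title").text = "XML in Practice"

# Create: serialize the tree back to an XML string.
xml_string = ET.tostring(doc, encoding="unicode")
```

After this runs, paths holds the original text values keyed by path, and xml_string holds the modified document.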
A Qt-based application used to collect geographical information from Twitter accounts and display it using the Google Maps API. It will also include features to follow links and extract similar information from image sites like Flickr and Dailybooth.
Web Registry for sharing keys and values across the web. Apps sharing the same web registry will be able to communicate with one another, share common resources, and constantly update new resources by creating new keys, or new key revisions.
A toy XML-aware (but otherwise generic and extensible) content management system demonstrating how to do sophisticated management of versioned hyperdocuments with a focus on issues of import and export of compound documents (e.g., XInclude-based).
The gateway is an open-source JavaEE application developed by the Vermont Dept of Taxes. It provides a web services framework for accepting Streamlined Sales Tax registrations and returns. It also includes a web interface for submitting transmissions.
The aim of the tool is to validate a particular format of metadata. Specifically, the tool checks three things: 1. Big5 character encoding; 2. whether the file is a well-formed XML document; 3. other specifications for our own purposes.
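The first two checks could be sketched in Python roughly as follows. This is an illustrative sketch using only the standard library, not the tool's actual code; the function name check_metadata and the report layout are assumptions, and the third, project-specific check is left out because the description does not define it.

```python
import xml.etree.ElementTree as ET


def check_metadata(raw: bytes) -> dict:
    """Check Big5 decodability, then XML well-formedness (hypothetical helper)."""
    report = {"big5": True, "well_formed": None}
    try:
        # Check 1: the raw bytes must decode cleanly as Big5.
        text = raw.decode("big5")
    except UnicodeDecodeError:
        report["big5"] = False
        return report  # well-formedness is left unchecked (None)
    try:
        # Check 2: the decoded text must parse as well-formed XML.
        ET.fromstring(text)
        report["well_formed"] = True
    except ET.ParseError:
        report["well_formed"] = False
    return report


good = check_metadata("<record>中文</record>".encode("big5"))
unclosed = check_metadata("<record>中文".encode("big5"))
not_big5 = check_metadata(b"<a>\xff\xff</a>")
```

Returning a small report rather than a single boolean keeps the two checks separately visible, which matches the way the description lists them.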