**CODE MOVED TO GITHUB: https://github.com/bitextor**
Bitextor is an application created to generate translation memories using multilingual websites as a corpus source. It downloads an entire website and applies a set of heuristics (based mainly on HTML tag structure and text block length) to find bitexts.
Linknx is an automation platform providing high-level functionality for EIB/KNX installations. The rules engine allows execution of actions based on complex logical conditions and timers. Its lightweight design allows it to run on embedded Linux.
The project was migrated to GitHub in 2015: https://github.com/linknx/linknx
libxmldiff is a library that provides diff functions for XML files. It is shipped with xmldiff, a simple command-line tool to demonstrate libxml functionalities. See the project xmltreenav for a GUI.
The G(arbage) C(ollected) X(Query) engine is the first streaming XQuery engine that implements active garbage collection, a novel buffer management strategy in which both static and dynamic analysis are exploited.
Open Source Implementations for 3D-Surface Characterisation Algorithms according to ISO 25178 (Geometric Product Specification) in verifiable pseudocode (MATLAB). Implementation of an XML-based file exchange format according to upcoming ISO 25178-72.
pdf2xml is a converter based on the Xpdf library (http://www.foolabs.com/xpdf/home.html). It converts the information contained in a PDF file into XML. You first need to install xpdf and libxml2 (see the documentation).
Hervé Déjean
Xerox Research Centre Europe
http://www.xrce.xerox.com/About-XRCE/People/Herve-Dejean
XmlPL is a C-like language with special syntax for creating and manipulating XML data. If you know Java, C or C++ and XPath, then XmlPL is easy to learn. XML is a native data type and is processed more naturally using XML path expressions and inline XML.
Compiler and interpreter for the Scriptol programming language. Scriptol is object-oriented and the first language to use XML as a data structure in sources. It is easy to learn and safe. Also includes RSS readers in PHP and an RSS feed editor.
XML Processor. A Multi-threaded, Pub/Sub environment for Dynamic programming on an event driven Tickless and Sleeping State Machine with TCP communications, tight flawless memory management, powerful set algebra and a magical database. 100% C++. ezPort.
wsdlpull is an efficient and powerful command-line utility for dynamic inspection and invocation of WSDL web services. It provides a C++ library with a dynamic WSDL invocation API, a WSDL parser, a Schema parser and validator, and an xmlpull parser/serializer.
This project aims to provide a framework around the WSDLPull (http://wsdlpull.sourceforge.net) source code to ease the packaging and delivery processes, so that packages can be delivered for a large number of Linux distributions and Unices.
Xemeiah is a fast, modular and scalable XML framework written in C++, with an efficient DOM and an Oasis-compliant XSLT processor. Xemeiah modules include a persistence layer, a fast Ajax web server, a media player, an ImageMagick frontend, Java bindings, and more.
A command line utility to display statistics about a text file consisting of lines of data. The statistics include counts of line terminator pairs (CR, LF, CR+LF) and line counts. Also shows if there is an unterminated trailing line.
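The counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the utility's actual code: the function name `line_ending_stats` and the keys of the returned dictionary are hypothetical, and a CR immediately followed by LF is assumed to count once as a CR+LF pair.

```python
def line_ending_stats(data: bytes) -> dict:
    """Count CR+LF, lone LF, and lone CR terminators in raw bytes (hypothetical helper)."""
    counts = {"crlf": 0, "lf": 0, "cr": 0}
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        if b == 0x0D:  # CR
            if i + 1 < n and data[i + 1] == 0x0A:
                # CR followed by LF is one CR+LF terminator, not two.
                counts["crlf"] += 1
                i += 2
                continue
            counts["cr"] += 1
        elif b == 0x0A:  # lone LF
            counts["lf"] += 1
        i += 1

    terminated = counts["crlf"] + counts["lf"] + counts["cr"]
    # A non-empty input that does not end in CR or LF has an unterminated last line.
    unterminated = n > 0 and not data.endswith((b"\r", b"\n"))
    counts["unterminated_trailing_line"] = unterminated
    counts["lines"] = terminated + (1 if unterminated else 0)
    return counts
```

Reading the file in binary mode (for example `open(path, "rb").read()`) matters here, because text mode would normalize the terminators before they can be counted.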
SGML2KSS transforms an SGML document into an XML document that contains the content of the SGML document as well as markup information about the SGML instance, such as OmitTag and ShortTag.
PSP RSS Feed Generator is a commandline based PSP RSS Channel file creator intended for beginners' use. It scans a local directory and creates a PSP RSS Channel compatible XML file ready to be hosted on a web site. ***Supports directories with spaces***
xmlconf-lite is a library for reading and writing XML configuration files. It does not depend on any XML SDK, so it is suitable for embedded devices or when no XML SDK is available. It is very light and easy to use.
Simple Plain Xml Parser (spxml) is a stream-oriented XML parser that supports pull-model and DOM-model XML parsing. Resulting DOM trees can be read, modified, and saved.
The Introspector enables programming tools that deal with source code, such as the compiler, to communicate in a standard and neutral manner, reducing the accidental cost of programming. http://github.com/h4ck3rm1k3/
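To illustrate the difference between the two parsing models, here is a rough sketch using Python's standard `xml.dom.pulldom` and `xml.dom.minidom` modules. It does not use or reflect spxml's own API; the sample document, the function names, and the `seen` attribute are made up for the example.

```python
from xml.dom import minidom, pulldom

# Hypothetical sample document used only for this illustration.
DOC = "<list><item id='1'>alpha</item><item id='2'>beta</item></list>"


def pull_item_ids(xml_text: str) -> list:
    """Pull model: the caller iterates over parse events and keeps only what it needs."""
    ids = []
    for event, node in pulldom.parseString(xml_text):
        if event == pulldom.START_ELEMENT and node.tagName == "item":
            ids.append(node.getAttribute("id"))
    return ids


def dom_mark_items(xml_text: str) -> str:
    """DOM model: build the whole tree, modify it in memory, then serialize it again."""
    doc = minidom.parseString(xml_text)
    for node in doc.getElementsByTagName("item"):
        node.setAttribute("seen", "yes")
    return doc.documentElement.toxml()
```

The pull loop never holds more than the current event, which suits large inputs, while the DOM version trades memory for the ability to read, modify, and save the tree.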
Performs external DTD simplification according to previously tagged text. Parameter entities are replaced at every model group and simplified independently.
csvtoxml parses CSV (comma-separated values) data and converts it into XML. It is a command-line console utility that reads stdin and writes stdout, so it can be piped with more, cat, pr, wget, zip, and find -exec for added functionality. A small, fast file-stream parser written in C/C++ for Unix, Windows, and OS X.
A lightweight toolkit for efficient processing of XML data. The tools are analogous to the UNIX command-line text processing tools sort, grep, etc. The infrastructure includes an efficient DFA-based engine for streaming evaluation of XPath expressions.
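A minimal sketch of the same idea in Python, assuming a simple output layout: the element names `rows`, `row`, and `field` and the function names are assumptions made for this example, and csvtoxml's actual output format may differ.

```python
import csv
import io
import sys
import xml.etree.ElementTree as ET


def csv_to_xml(text: str) -> str:
    """Convert CSV text to an XML string with one <row> per record (assumed layout)."""
    root = ET.Element("rows")
    for record in csv.reader(io.StringIO(text)):
        row = ET.SubElement(root, "row")
        for value in record:
            # ElementTree escapes characters such as < and & when serializing.
            ET.SubElement(row, "field").text = value
    return ET.tostring(root, encoding="unicode")


def main() -> None:
    """Filter-style entry point: read CSV from stdin, write XML to stdout."""
    sys.stdout.write(csv_to_xml(sys.stdin.read()) + "\n")
```

Calling `main()` from a script lets the sketch sit in a pipeline the same way, for example `cat data.csv | python csv_to_xml.py`, where the script name is hypothetical.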