html xml free download

Showing 30 open source projects for "html xml"

1

WebHarvest - web data extraction tool

Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.

14 Reviews

Downloads: 1 This Week

Last Update: 2025-10-27
See Project
2

OpenSearchServer Search Engine

An open source search engine with RESTful API and crawlers

OpenSearchServer is a powerful, enterprise-class search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST/API, Ruby, Rails, Node.js, PHP, Perl), you can quickly and easily integrate advanced full-text search capabilities into your application: full-text search with basic semantics, join queries, boolean queries, facets and filters, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on...

31 Reviews

Downloads: 0 This Week

Last Update: 2018-08-26
See Project
3

cpDetector

cpDetector is a proxy for codepage detection of documents. It delegates to multiple instances that try to detect the codepage using different techniques. A command-line executable is shipped that allows sorting documents by codepage.

Downloads: 22 This Week

Last Update: 2018-04-05
See Project
4

SSEP - Site Search Engine PHP-Ajax

A free site search engine script built with PHP and Ajax.

A site search engine script that uses MySQL to store your website's indexed pages and adds search functionality to your website. It is built with PHP and JavaScript, and the search results are loaded via Ajax. The search system combines MySQL full-text search with SQL regexp, and weights words according to their location in the HTML elements, to determine the relevance of the search results. It can be included in any website.

3 Reviews

Downloads: 0 This Week

Last Update: 2017-03-25
See Project
5

eXtensible Text Framework (XTF)

Framework for search and display of heterogeneous document collections.

...Please visit https://github.com/cdlib/xtf for the latest updates. Obsolete Description: The eXtensible Text Framework (XTF) is an architecture that supports searching across collections of heterogeneous textual data (XML, PDF, HTML, text, and more), and the presentation of results and documents in a highly configurable manner. Includes highly customized versions of the proven open-source components Lucene and Saxon.

Downloads: 0 This Week

Last Update: 2019-07-29
See Project
6

HyperSQL

HyperSQL is like a doxygen plus javadoc for SQL, hypermapping SQL views, packages, procedures, and functions to HTML source code listings and showing all code locations where these are used.

Downloads: 0 This Week

Last Update: 2016-09-19
See Project
7

CyberNeko HTML Parser

NekoHTML is a simple HTML scanner and tag balancer that enables application programmers to parse HTML documents and access the information using standard XML interfaces.

17 Reviews

Downloads: 5 This Week

Last Update: 2015-04-17
See Project
8

Wap Auto Index Advance

Auto Index WAP is an advanced download portal (multi-language)

Djamolwap 13v - Advance Auto Index with Web Admin Panel + Multi Language + Themes. New updates: - Multi-language website: 1) English 2) Urdu 3) Gujarati 4) Russian - Users/visitors can manually change the website language - Multi-language plugin on/off - Added function in admin panel - Automatic tag setting for all MP3 files added. Official website: http://ai.djamol.com Demo...

2 Reviews

Downloads: 0 This Week

Last Update: 2014-10-19
See Project
9

regain

Regain is a Java search engine based on Jakarta Lucene. It indexes and searches files in many formats (HTML, XML, doc(x), xls(x), ppt(x), oo, PDF, RTF, mp3, mp4, Java). A TagLibrary eases integrating search results into your JSP-based web page.

13 Reviews

Downloads: 9 This Week

Last Update: 2014-07-30
See Project
10

webStraktor

webStraktor is a programmable World Wide Web data extraction client. Its purpose is to scrape HTML-based content via HTTP and extract relevant information. webStraktor features a scripting language to facilitate the collection, extraction, and storage of information available on the web, including images. The scripting language uses elements of regular expression and XPath syntax. It has a small instruction set and its syntax is easy to master. ...

Downloads: 0 This Week

Last Update: 2014-04-25
See Project
11

TestEl

TestEl is a Java-based learning analyzer for HTML (and possibly other) structured documents. It can be trained to detect structures in such documents and renders hits in XML.

1 Review

Downloads: 0 This Week

Last Update: 2014-06-09
See Project
12

Image Crawler

The image crawler application is used to collect a multitude of images from websites. The images can be viewed as thumbnails or saved to a given folder for further processing. HTML and XML reports are generated.

2 Reviews

Downloads: 2 This Week

Last Update: 2015-08-04
See Project
13

Galateia HTML Extractor

An HTML scraper that uses machine learning frameworks to extract labelled fields from raw HTML. The project also involves the development of a tool to display the semi-structured data generated by the scraper component.

1 Review

Downloads: 0 This Week

Last Update: 2013-05-14
See Project
14

Information Extracter

A utility to extract meta-information (properties/comments) out of various file-types; e.g. HTML, PDF, RTF & various Office documents; OGG/MP3 files and JPEG/PNG/GIF images, which can be presented in various output formats (HTML, XML, LaTeX & plain t

Downloads: 0 This Week

Last Update: 2013-04-08
See Project
15

Browser Search Box library

This library can be used to add your site to the browser search box. It can generate HTML, JavaScript, and XML that pass information to browsers so they can add a site to the list of searches the browser can perform.

Downloads: 0 This Week

Last Update: 2015-11-15
See Project
16

zSearch -- The easy search engine

zSearch is a simple Python-based crawler and search engine. Raw HTML is stored in bzip2 archives, the index is created using PyLucene, and Twisted is used to provide an internal HTTP server. Results are sent back as XML over HTTP.

Downloads: 0 This Week

Last Update: 2016-07-24
See Project
17

RDF AutoPilot

Generates RDF and RDFS ontology documents automatically from HTML pages once given a set of rules.

Downloads: 0 This Week

Last Update: 2016-08-07
See Project
18

bluebery

bluebery is an easy-to-use SQL/PHP-based content manager that provides PHP libraries and methods to use in your site's pages, with which you can easily access and print desired items, or iterate over items, that are stored through the bluebery web UI.

Downloads: 0 This Week

Last Update: 2014-07-06
See Project
19

exom : siteministrator

Compact and fast open source desktop CMS (content management system) for generating documentation and small or medium-sized websites. Features: full-text search, integrated browser, multilingual projects, support for external HTML editors.

Downloads: 0 This Week

Last Update: 2014-04-24
See Project
20

JaWiki

JaWiki is a Java wiki with a file-based database to manage the content. The content is stored in XML files in the file system. An HTML frontend allows users to edit the content via a browser. A standalone server is also included.

Downloads: 0 This Week

Last Update: 2015-08-06
See Project
21

webnavigator

The Navigator project aims to support automated gathering of dynamic information from third-party web sites, using their web interface to post queries and gather replies. Navigator is written in OS-independent Java.

Downloads: 0 This Week

Last Update: 2013-03-21
See Project
22

JLinkCheck

JLinkCheck is an Ant Task written in Java for checking links in websites. It does not just check a single page, but crawls a whole site like a spider, generating a report in XML and (X)HTML. JReptator will be its successor with many more features

Downloads: 0 This Week

Last Update: 2016-04-26
See Project
23

Information Retrieval Toolkit

High-performance software for information retrieval research. Emphasis on semi-structured text retrieval, especially for HTML and XML. The goal is to facilitate information retrieval research by providing an interchangeable toolkit of functions.

1 Review

Downloads: 0 This Week

Last Update: 2013-02-21
See Project
24

Distributed ISBN portal

A distributed search portal of common sources of ISBN numbers, with permanent caching of results. It aims to provide a free, open-source interface for ISBN retrieval using HTML, SQL, or XML, independent of any toolkits or software.

Downloads: 0 This Week

Last Update: 2013-07-14
See Project
25

webExtractor

webExtractor is a Java application for extracting specific content from web-based HTML, XML, CSV, and free-form text. The extracted data can be used for data gathering and mining purposes.

Downloads: 4 This Week

Last Update: 2014-06-26
See Project