Showing 55 open source projects for "java xml"

  • 1
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text-processing technologies to easily extract useful data from arbitrary web pages.
    Downloads: 17 This Week
  • 2
    panFMP
    panFMP is a generic framework that makes harvested XML metadata searchable through Apache Lucene without any additional RDBMS. Fields can be defined by XPath, allowing full-text queries on all types of fields, including numerical ranges. The code was moved to GitHub: https://github.com/pangaea-data-publisher/panfmp
    Downloads: 0 This Week
  • 3
    OpenSearchServer Search Engine

    An open source search engine with a RESTful API and crawlers

    OpenSearchServer is a powerful, enterprise-class search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST/API, Ruby, Rails, Node.js, PHP, Perl), you can quickly and easily integrate advanced full-text search capabilities into your application: full-text search with basic semantics, join queries, boolean queries, facets and filters, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on...
    Downloads: 1 This Week
  • 4
    cpDetector is a proxy for codepage detection of documents. It delegates to multiple instances that try to detect the codepage using different techniques. A command-line executable is included that sorts documents by codepage.
    Downloads: 13 This Week
  • 5

    eXtensible Text Framework (XTF)

    Framework for search and display of heterogeneous document collections.

    NOTICE: This code repository is deprecated. Please visit https://github.com/cdlib/xtf for the latest updates. Obsolete Description: The eXtensible Text Framework (XTF) is an architecture that supports searching across collections of heterogeneous textual data (XML, PDF, HTML, text, and more), and the presentation of results and documents in a highly configurable manner. Includes highly customized versions of the proven open-source components Lucene and Saxon.
    Downloads: 0 This Week
  • 6
    NekoHTML is a simple HTML scanner and tag balancer that enables application programmers to parse HTML documents and access the information using standard XML interfaces.
    Downloads: 1 This Week
  • 7

    ScraperEdit for XBMC

    XML bindings and a GUI for creating and editing XBMC Scrapers

    This program is an editor for creating XBMC Scrapers. It is similar to ScraperEditor, another editor using ScraperXML, which runs under the .NET environment. This program runs under Sun/Oracle's Java Runtime. HELP WANTED! I am looking for someone to help me write documentation, such as a user's manual and online help. Translated language files are also always welcome...
    Downloads: 0 This Week
  • 8
    Regain is a Java search engine based on Jakarta Lucene. It indexes and searches files in many formats (HTML, XML, doc(x), xls(x), ppt(x), oo, PDF, RTF, mp3, mp4, Java). A TagLibrary eases integrating search results into your JSP-based web page.
    Downloads: 21 This Week
  • 9
    IDRA (InDexing and Retrieving Automatically) is a tool for indexing a wide range of text files (TXT, DOC, PDF) and image annotation files (XML), query-based searching, visualizing an index, saving it for reuse, evaluation, etc.
    Downloads: 0 This Week
  • 10

    Infofuze

    Data migration/conversion library based on STX and XSLT transformation

    Infofuze is a Java library and server application that can be used to transform and combine data from various sources into a specific XML or other text output format that can be stored or indexed.
    Downloads: 0 This Week
  • 11
    TestEl is a Java-based learning analyzer for HTML (and possibly other) structured documents. It can be trained to detect structures in such documents and renders hits in XML.
    Downloads: 0 This Week
  • 12
    XML documents To Generated dynamic web application supporting CRUD actions. Credits to Ministry of Culture and Communication, France; UNESCO; Ecole Nationale des Chartes, France; PASS-TECH, France.
    Downloads: 0 This Week
  • 13
    A universal platform for resource discovery and description that shares XML meta-data over existing peer-to-peer (P2P) networks such as Gnutella and JXTA.
    Downloads: 2 This Week
  • 14
    DBPrism is a framework for generating dynamic XML from a database. It provides a high-performance DBGenerator for Cocoon2 and is also a J2EE replacement for Oracle mod_plsql. This project also includes a Restlet-Oracle connector exam. and Lucene Domain In
    Downloads: 0 This Week
  • 15
    NewsRack is a tool/service that attempts to automate news monitoring. Based on user-specified definitions and rules, NewsRack will enable automated downloading, classification, filing, and long-term archiving of news.
    Downloads: 0 This Week
  • 16
    jSEO -- Pluggable SEO (Search Engine Optimization) for dynamic JEE web applications
    Downloads: 0 This Week
  • 17
    nxs crawler is a program for crawling the internet. It generates random IP addresses and attempts to connect to the hosts. If a host answers, the result is saved in an XML file. After that, the crawler disconnects... Additionally you can
    Downloads: 0 This Week
  • 18
    A Semantic Web implementation using a native XML database as backend storage. A Java compiler from SPARQL to XQuery using Jena. There are XQuery scripts for the native XML database Sedna (http://modis.ispras.ru/sedna/).
    Downloads: 0 This Week
  • 19
    The Java-Sitemapper is a Java API for building sitemap files to improve search indexing on Google, Yahoo!, MSN, and Ask.com. This project strives to implement the latest in search technology for use on the Java platform.
    Downloads: 0 This Week
  • 20
    A Java library as a wrapper for the Google Search Appliance's search protocol XML API. The XML API is publicly available at: http://code.google.com/gsa_apis/xml_reference.html The homepage and tutorial for this project is at: http://gsa-japi.sf.net
    Downloads: 0 This Week
  • 21
    JxtASK is a P2P system for searching, downloading and sharing academic content hosted on websites that join the JxtASK community. Joining is simple: site admins must generate (even automatically) an XML catalog which describes the files.
    Downloads: 0 This Week
  • 22
    Google(™) meets the Matrix. Red Piranha combines Lucene (Searching Ability), XML-RDF (ability to learn), Tomcat (for P2P Power) and Spring (Ease of use) to not only let you find anything, anywhere, but to actually understand what you are looking for.
    Downloads: 0 This Week
  • 23
    Open Source Semantic Web Search Engine Software: If two machines anywhere on the web can agree on the same definition of a digital service or digital good, then machine-to-machine transactions can use this lingua franca to transact on the user's behalf.
    Downloads: 0 This Week
  • 24
    OpenMKS is a search & navigational tool for large multimedia collections. With pluggable functionality and a core subsystem supporting the z39.50 ZING Community SRW search & retrieval specification, it can be run either as a Servlet or as a Web Service.
    Downloads: 0 This Week
  • 25
    The Retrieval Component Integrator Project (RECOIN) intends to provide an extensible framework of Java classes to build a meta-search and information retrieval (IR) system based on heterogeneous IR components as part of a modular retrieval process. The so
    Downloads: 0 This Week