Showing 28 open source projects for "html web browser"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Build Securely on AWS with Proven Frameworks Icon
    Build Securely on AWS with Proven Frameworks

    Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

    Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.
    Download Now
  • 1
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well proved XML and text processing techologies in order to easely extract useful data from arbitrary web pages.
    Downloads: 12 This Week
    Last Update:
    See Project
  • 2
    OpenSearchServer Search Engine

    OpenSearchServer Search Engine

    An open source search engine with RESTFul API and crawlers

    OpenSearchServer is a powerful, enterprise-class, search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST/API , Ruby, Rails, Node.js, PHP, Perl) you will be able to integrate quickly and easily advanced full-text search capabilities in your application: Full-text with basic semantic, join queries, boolean queries, facet and filter, document (PDF, Office, etc.) indexation, web scrapping,etc. OpenSearchServer runs on...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 3

    eXtensible Text Framework (XTF)

    Framework for search and display of heterogenous document collections.

    NOTICE: This code repository is deprecated. Please visit https://github.com/cdlib/xtf for the latest updates. Obsolete Description: The eXtensible Text Framework (XTF) is an architecture that supports searching across collections of heterogeneous textual data (XML, PDF, HTML, text, and more), and the presentation of results and documents in a highly configurable manner. Includes highly customized versions of the proven open-source components Lucene and Saxon.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    Geoportal Server
    Geoportal Server is a standards-based, open source product that enables discovery and use of geospatial resources including data and services.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build, govern, and optimize agents and models with Gemini Enterprise Agent Platform.
    Start Free
  • 5
    Regain is a Java search engine based on Jakarta Lucene. It provides indexing and searching files for plenty of formats (HTML,XML,doc(x),xls(x),ppt(x),oo,PDF,RTF,mp3,mp4,Java). A TagLibrary eases integrating search results in your JSP based web page.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 6
    webStraktor is a programmable World Wide Web data extraction client. Its purpose is to scrape HTML based content via the HTTP protocol and extract relevant information. webStraktor features a scripting language to facilitate the collection, the extraction and the storage of information available on the web, including images. The scripting language uses elements of the Regular Expression and xPath syntax.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7

    ssSearchEngine

    keyword search engine for semi-structured data (Tables, lists,...)

    This application implement an approach for doing keyword based search over semi-structured data available in HTML pages.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Web-as-corpus tools in Java. * Simple Crawler (and also integration with Nutch and Heritrix) * HTML cleaner to remove boiler plate code * Language recognition * Corpus builder
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Webstats Solr is an attempt to make Apache Access log easier to Data Mine. By adding a powerful Search Engine (SOLR) as a Backend and using Java Script and HTML and maybe PHP I hope to out date AWStats.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 10
    EasyGIS simplifies GIS data management, sharing, and publishing. REST interfaces (json, html views). Lucene based FTS searches. Thematic maps, business cartography. Integration with external GIS data providers - Google, OSM.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    HttpFinder is web content searching tool. It enables look for text content that matches given regular expression in html pages/scripts etc. All navigation is performed with use of other regexp which describes links to visit.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    JeCARS (Java Extendable Contents And Rights System) is a RESTful webservice which delivers pluggable output formats, e.g. Atom feeds or HTML. Third party applications can be plugged in. A JCR (JSR-170) repository (Jackrabbit) is used for storage.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    WebNews Crawler is a specific web crawler (spider, fetcher) designed to acquire and clean news articles from RSS and HTML pages. It can do a site specific extraction to extract the actual news content only, filtering out the advertising and other cruft.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    JaWiki is Java Wiki with a file based database to manage the Content. The content is stored in XML files in the file system. A html frontend allows to edit the content by the users via an Browser. A standalone server also included.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    The Batino Browser is the next generation rich web browser platform. It is based on Eclipse technology.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    JLinkCheck is an Ant Task written in Java for checking links in websites. It is not just checking one single page, but crawling a whole site like a spider, generating a report in XML and (X)HTML. JReptator will be its succesor with many more features
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    The project Navigator aims at supporting automated gathering of dynamic information from third party web sites, using their web interface to post queries and to gather replies. Navigator is written in OS-independent java language.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    COMMON VULNERABILITIES AND EXPOSURES (CVEŽ) DATABASE BROWSER, CVEBROWSER A web search engine for the CVE dictionary, targeted to be used on a intranet. CVEBrowser uses Java Servlets / JSP and MySQL and its designed to work well on RedHat
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    contentix - open source content management system contentix is a cms and a framework to develop any personalized browser based application. It use xml to store data in media nutral way and xsl to generate output. Check the demowebsite from downloads.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    A hypertext-browser written in Java which filters links (emails, docs or pics for e.g.) out of .html-documents and paints them on screen in hierarchical order. Users get a quick overview of how a website is put together.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Swing-Search tool to effectively search among a list of strings and open a corresponding webpage in a browser. It was originally designed to quickly search all titles of pages that are stored in a Wiki.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    webExtractor is a Java application that is used for extracting specific content from web based HTML, XML, CSV, and free form text. The extracted data can be used for data gathering and mining purposes.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 23
    Arachnid is a Java-based web spider framework. It includes a simple HTML parser object that parses an input stream containing HTML content. Simple Web spiders can be created by sub-classing Arachnid and adding a few lines of code called after each page
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    A servlet that generates html indexes for your music collection, for easy browsing and listening to your remote vorbis files ;) Using XSLT as stylesheet.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    WebSPHINX is a web crawler (robot, spider) Java class library, originally developed by Robert Miller of Carnegie Mellon University. Multithreaded, tollerant HTML parsing, URL filtering and page classification, pattern matching, mirroring, and more.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next
MongoDB Logo MongoDB