PagePing is a webpage content monitor.
It actively keeps track of website content (especially message boards) by repeatedly downloading each page and diffing it against the last cached copy.
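The download-and-diff cycle described above can be sketched roughly as follows. This is only an illustration, not PagePing's actual code: the names `diff_against_cache` and `check_page` are hypothetical, the cache is assumed to be a plain dictionary keyed by URL, and the download step is left to the caller.

```python
import difflib


def diff_against_cache(cached_text, fresh_text):
    """Return unified-diff lines between the cached copy and a fresh download."""
    return list(difflib.unified_diff(
        cached_text.splitlines(),
        fresh_text.splitlines(),
        fromfile="cached",
        tofile="fresh",
        lineterm="",
    ))


def check_page(cache, url, fresh_text):
    """Compare fresh_text with the cached copy for url, then update the cache.

    Returns the diff lines, or an empty list when the page is unchanged
    or is being seen for the first time. (Hypothetical helper.)
    """
    cached = cache.get(url)
    cache[url] = fresh_text  # the fresh copy becomes the new baseline
    if cached is None:
        return []
    return diff_against_cache(cached, fresh_text)
```

A monitor built this way would call `check_page` on a timer with the freshly downloaded text and report any non-empty result.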
Cross-platform full-text indexing, search, and preview. Supports any document that can be converted to plain text. Web-Start ready, 100% Java, uses the Lucene search engine.
A DAML+OIL ontology editor with constraint propagation functionality to ensure that constraints applied to properties and restrictions are correctly propagated through an ontology, and datatype management functionality for manipulating custom datatypes.
TopoWeb is about web topology, the linked structure of the web. Harness the power of link awareness, a method not just for surfing hyperlinks backwards to find content, but also for publishing content.
Bookmark.inc is a PHP class that provides the essential functions to manage a bookmark. It provides a (multilingual) administrator interface, and allows the bookmark display to be personalized through a simple template system.
The Medlane project is an attempt to create a set of tools that will enable librarians to move from the standard MARC (MAchine Readable Cataloging) format to a new library/museum XML format. This move will ensure traditional library/museum data remains
A highlighter for XML documents, written in Java. Uses regular expressions to search a set of DOM nodes, and transparently handles highlighting matches that span multiple elements. Highlight events are passed to a user-defined highlighter for processing.
Xapian is a Search Engine Library, written in C++ with bindings for Perl, Python, PHP, Java, Tcl, C# and Ruby. Xapian allows you to easily add advanced indexing and search facilities to your applications. See www.xapian.org for more information.
The jGroups Package is a Java package providing a few classes to abstract over the structure and contents of Yahoo! Groups message archives. Due to the state the package is currently in, it is only available from its CVS repository.
Arachnid is a Java-based web spider framework. It includes a simple HTML parser object that parses an input stream containing HTML content. Simple Web spiders can be created by sub-classing Arachnid and adding a few lines of code called after each page
Indexes a set of files stored on many kinds of media (directories, CDs, archives such as zips and tarballs, ...) using several indexing methods (file name, date, length, CRC, ...) to detect duplicates and to find and trash old, useless backups.
The Redwood WLMS is an Open Source implementation of a Web Log Mining System, built on Java 2 Enterprise Edition (J2EE) technologies such as EJB, JMS and Servlets.
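As an illustration of duplicate detection by length and CRC, here is a minimal sketch. It is not this project's implementation: the function names are hypothetical, it only walks an ordinary directory tree (no CDs or archives), and a matching size and CRC-32 marks files as likely duplicates only, so a byte-by-byte comparison would be needed before deleting anything.

```python
import os
import zlib
from collections import defaultdict


def crc32_of_file(path, chunk_size=65536):
    """Compute the CRC-32 of a file, reading it in chunks."""
    crc = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
    return crc


def find_duplicates(root):
    """Group files under root by (length, CRC-32); return groups of 2 or more."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            key = (os.path.getsize(path), crc32_of_file(path))
            groups[key].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Each returned group is a sorted list of paths whose size and checksum match.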
A SOAP-based document/file-sharing solution written in Java. It includes a basic web interface, but other clients are possible. You can share and download all common office document formats like MS Word, Excel, OpenOffice and PDF.
BTR Wizard quickly replaces multiple occurrences of text over multiple files. This unique program scans folders for files matching filter criteria, then searches those files for any occurrences of a text string and replaces them all. This is an ideal tool fo
A meta search engine that can be run as a server or as a stand-alone search utility. Will be extended to search The Invisible Web in the near future. Seek something? These dogs will find it!
Catalog module for creating a support website for an FTP server. Search engine for files located on FTP sites or in other collections submitted to this catalog.
It is oriented toward collections of audio books in MP3 format. Uses PHP, Java, and a MySQL or Oracle database.
FTPSearch is a Java-based program that gathers URLs from many FTP sites and stores them in a database to provide a search function. Currently Postgres and MySQL are available for data storage.
Project B is a platform for various Bible programs using Java. It will support desktop applications like the On-Line Bible and Sword; there is also a Servlet interface and some add-in macros for MS Word. Other interfaces are in development.
WebSPHINX is a web crawler (robot, spider) Java class library, originally developed by Robert Miller of Carnegie Mellon University. Multithreaded, tolerant HTML parsing, URL filtering and page classification, pattern matching, mirroring, and more.
Voambolana (pronounced VOO-BOO-LUH-NUH) is an on-line dictionary that converts foreign languages to a native language. Voambolana uses a SAX parser and an XSLT transformer. The tools used include Ant, Xerces, Xalan (XNI) and Apache from the Apache Group.
The BeeGram library is a portable open source search engine toolkit written in C. BeeGram provides a number of building blocks for the construction of powerful general-purpose text-based search tools.