The LEADERS toolkit is a generic toolset for creating an online environment that integrates EAD finding aids and EAC authority records with TEI transcripts and digitised images of archival material. It is suitable for a wide variety of archives.
Provides a robust and efficient implementation of n-gram based classifiers for Java. N-gram algorithms have been shown to be surprisingly good at tasks like guessing the language or encoding of an arbitrary text file, and there are many more applications.
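To illustrate the n-gram language guessing mentioned above, here is a minimal Python sketch of one common approach, rank-order comparison of character trigram profiles. It is not taken from the project itself; the function names, the toy training sentences, and the `top=300` cutoff are all assumptions chosen for illustration.

```python
from collections import Counter


def ngram_profile(text, n=3, top=300):
    """Return the `top` most frequent character n-grams, most frequent first."""
    # Normalise whitespace and pad so word boundaries become part of the n-grams.
    padded = " " + " ".join(text.lower().split()) + " "
    counts = Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
    return [gram for gram, _ in counts.most_common(top)]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams missing from lang_profile get a fixed maximum penalty."""
    ranks = {gram: rank for rank, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    return sum(
        abs(rank - ranks[gram]) if gram in ranks else max_penalty
        for rank, gram in enumerate(doc_profile)
    )


def guess_language(text, profiles):
    """Pick the language whose profile is closest to the text's profile."""
    doc = ngram_profile(text)
    return min(profiles, key=lambda lang: out_of_place(doc, profiles[lang]))


# Toy training data (hypothetical); real profiles would be built from much larger corpora.
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog and then the dog sleeps in the sun",
    "german": "der schnelle braune fuchs springt auf den faulen hund und dann liegt der hund in der sonne",
}
PROFILES = {lang: ngram_profile(sample) for lang, sample in SAMPLES.items()}

if __name__ == "__main__":
    print(guess_language("the dog and the fox", PROFILES))
```

With profiles this small the guess is only reliable for text that closely resembles the training sentences; a practical classifier would train each profile on far more text.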
WebTrack is a PHP-based search engine for your MovieTrack (www.movietrack.net) or AMC (www.antp.be) system. It gives you a way to present your movie list on the web. It has full skin support and is very simple to use. You have to try it to understand it...
SNT is a search engine for SMB and FTP shares, with a crawler that runs on Win32. A web interface is provided for searching files and browsing share contents. A shared film list with user ratings and comments is also provided.
A network asset management system written in PHP and MySQL. Maintains a list of servers that can be cross-referenced by multiple items. Features: locations, manufacturers, vendors (contact names and phone numbers), device log, list of network ports, software manager, file manager.
AVD is a continuation of the swim project. The goal is to create a suitable SQL server from swim's not-installed DB, and to maintain the swim client. AVD will be used as a gBootRoot method.
PHPLinks is an open source project written in PHP for use with MySQL, allowing one to run an extremely efficient Link Farm with full search capabilities. A "simulated" search engine in many ways.
A Perl solution to display a nice directory listing if indexes are turned off on a *NIX-based server. Nominally named index.cgi, it reads the directory via 'ls -l' and parses the output as needed for display, using the default Apache icons or others, if specified.
Bookmark-Manager is an advanced bookmark management utility for Windows supporting importing/exporting and merging of Internet Explorer favorites, Opera hotlists, Mozilla, Netscape, and Firefox bookmarks, XBEL, and HTML lists.
A PHP script that can be used for submitting webpages to search engines.
DLC is an HTTP link checker written in Perl. It can generate HTML output for easy checking of results and can process a link cache file to speed up repeated requests. Initially created as an extension to Public Bookmark Generator (PBM), it can also be used alone.
Aims to develop a Java API (JAR library, with an example web GUI) for content management. Simple but powerful and based on the Apache Lucene project, it is meant to be embedded in projects requiring content management.
This program runs on the XP/2000/NT platform using the Microsoft .NET Framework and the Microsoft SAPI speech/voice engine. Its function is to monitor an unlimited number of files on local or remote filesystems for changes and then speak the content
Check a list of web sites for their Google PageRank score. Can work with 10,000 or more web sites at once: just copy-paste all the domain names and let the program run in the background.
UindexWeb Search Engine is an open source web spider; the main program is written in Delphi 7. Lucene.Net is the default full-text index engine. The latest version can be retrieved from http://www.opencpu.com/.
The Daisy Open Source CMS is an enterprise-grade, Java-based content and information management solution consisting of a standalone repository server (with an HTTP/XML interface) and a Wiki-like, WYSIWYG editing and publishing web application frontend.
XMLTV (http://xmltv.org/) is for grabbing TV listings, primarily from websites. It has a grabber for Danish television that grabs from http://tv.tv2.dk, but here we maintain several others. You can find documentation at http://niels.dybdahl.dk/xmltvdk
Course Crawler is an application that compiles term-definition pairs from multiple web glossaries into a centralized, stable, and searchable location.
This is a Python script that parses your irssi logs and loads them into a MySQL database, which you can then use to search and display your logs on the web. It incrementally updates the database from the logs and is ideally run frequently as a cron job.
arachnode.net is an open source Web crawler for downloading, indexing and storing Internet content including e-mail addresses, files, hyperlinks, images, and Web pages and is written in C# using SQL Server 2008. See http://arachnode.net for the LATEST.
Sgrep (sorted grep) is a much faster alternative to traditional Unix grep when searching large files, because sgrep searches sorted input files using a fast binary search to find matching lines.
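The idea of binary searching sorted input can be sketched in a few lines of Python. This is an illustration only and not sgrep's actual implementation: it works on an in-memory sorted list rather than a file, and it assumes that a "matching" line is one that starts with the search key. The function name `sorted_prefix_search` is hypothetical.

```python
from bisect import bisect_left


def sorted_prefix_search(lines, prefix):
    """Return all lines in the sorted list `lines` that start with `prefix`."""
    # Binary search for the first line that is >= prefix: O(log n) comparisons.
    start = bisect_left(lines, prefix)
    matches = []
    # In sorted input, lines sharing a prefix are contiguous, so stop at the first miss.
    for line in lines[start:]:
        if not line.startswith(prefix):
            break
        matches.append(line)
    return matches


if __name__ == "__main__":
    words = sorted(["cherry", "band", "apple pie", "bandana", "apricot", "banana"])
    print(sorted_prefix_search(words, "ban"))
```

The speed-up over a linear scan comes from skipping straight to the first candidate line; the cost is that the input must already be sorted in the same order the comparison uses.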
A redistribution of a stripped-down version of the Zend Framework for use with the Search Lucene API contributed Drupal module.
A small collection of Python 3000 scripts/modules used to automate searching craigslist.org cities and categories for interesting stuff. These scripts currently use HTML screen scraping, since craigslist currently has no API.
VuFind is a library resource discovery portal designed and developed for libraries by libraries. The goal of VuFind is to enable your users to search and browse through all of your library's resources by replacing the traditional OPAC.
Digital Learning Sciences (DLS) is a mission-centered, not-for-profit organization dedicated to improving learning through the use of digital content and tools.