Search engine and data mining applications and ClueWeb datasets.
The Lemur Project develops search engines, browser toolbars, text analysis tools, and data resources that support research and development of information retrieval and text mining software, including the Indri search engine in C++, the Galago search engine research framework in Java, the RankLib learning to rank library, the ClueWeb09 and ClueWeb12 datasets, and the Sifaka data mining application.
Google Hacks is a compilation of carefully crafted Google searches that expose novel functionality from Google's search and map services.
The stuff here has no documentation and some of it may never be completed. This is my playground; use at your own risk.
Imgur Gallery Downloader
Users can now search Imgur for any phrase and ImgurDL/Loadur will automatically search for matching images. ImgurDL/Loadur will download the images while displaying the progress to the user.
IRToolkit is an attempt to build and develop a generic search engine that integrates state-of-the-art Information Retrieval (IR) models. It also lets you compare the performance (in terms of precision, recall, index size, search response time and so on) of several open source IR applications. If you use the IRToolkit, please cite the following work: https://sites.google.com/site/dinhbaduy/bibtex#Dinh-Phdthesis-2012
CaC is an application to easily download and convert videos from video sites like YouTube, Google Video etc. It's written in Lazarus / FreePascal and available for Linux, Windows and Mac OS X systems.
HTTP Directory Index is a PHP script that acts as a user-friendly graphical interface for indexing web directories.
A simple PHP script that retrieves weather information from wunderground.com quickly and easily. Enter a city and state, then submit, and a wunderground forecast banner image will load on the page. Not affiliated with wunderground.com.
Simple application for downloading pictures from Zerochan.net
Simple Java application for downloading high-quality pictures from Zerochan.net. You can find images by size or by tag. It's simple. And flat. All you need to do is download the .jar file and run it with the Oracle JVM (or any other JVM that supports image decoding).
Analysis and visualization of the social structure of "lastfm.de", based on user data, friends lists, groups, group members and musical neighbours.
This project was started by myself and a few friends a while ago to solve our problems with other, better-known CMSs. The problem was that the others didn't have the functions we required, so we started our own.
The goal of bookman is to implement a network-based service for managing and distributing bookmarks transparently from a central server to any bookman-enabled client software (currently focusing on Mozilla, IE and Opera).
Cheshire3 is a fast Z39.50, SRW, XML search engine, written in Python for extensibility and using C libraries for speed. It is the next generation of the Cheshire system (http://cheshire.berkeley.edu) and is designed around a distributable, object-oriented model.
Cicerone is a multi-platform, multi-server, multi-database, web-based corporate information system like no other. Completely web-driven and accessible through any 4.x web browser, Cicerone allows your company to create and maintain information on the fly.
Coherence is an advanced Content Management System built on top of Zope. Coherence has site, user and file management. Some of the special features are a WYSIWYG page editor with a drag-and-drop interface, version control, workflow and link management.
A combination web and desktop application for cataloging and organizing your books, CDs and DVDs.
Downloads and formats stories from your favorite Web-based, RSS- or Atom-syndicated news sources for display on your iPod. Provides an easy interface for creating and installing adapters for new news sources.
Fire.now is a Firefox plugin that automatically adds your documents to the WhereIsNow latest version discovery service. Every time you upload a document somewhere, Fire.now integrates the WhereIsNow keys into the file and adds its URL to WhereIsNow.
FlexibleShare provides FlexSpaces Alfresco document management, workflow and search in pods with a dashboard-style UI, plus added Flex UI pods (wiki, blog, discussions, calendar and document library pods) for the Alfresco Share back end. Based on FlexibleDashboard, supporting pluggable pod modules for BI, charting, reporting, etc. Available as an AIR version with desktop file drag and drop, an in-browser version, and a mobile (Android and iOS) version. Downloads and source are now only at http://code.google.com/p/flexibleshare/ Developed by Integrated Semantics: http://integratedsemantics.com blog: http://integratedsemantics.org
GImageSpider (GIS) is an image spider with two abilities. It can search the web through image search engines to find images, and it can crawl an arbitrary site within your constraints to find images.
An application used to search various web-based genealogy sites simultaneously and review and analyse the data gathered.
HooDoo is designed to provide most of the same functionality as Google, but available to everyone for their own websites.
The LEADERS toolkit is a generic toolset that enables the creation of an online environment which integrates EAD finding aids and EAC authority records with TEI transcripts and digitised images of archival material suitable to a wide variety of archives.
Written in PHP and designed to maintain a personal database of bookmarks, Linkerdoodle is a simple link organizer.
My Community Portal is an all-in-one internet portal that offers a forum, groups, chat, your own e-mail, a search engine, an internet directory, your own home page, polls, dating services, a buddy list, MP3 and file sharing, and much more.