Search engine and data mining applications and ClueWeb datasets.
The Lemur Project develops search engines, browser toolbars, text analysis tools, and data resources that support research and development of information retrieval and text mining software, including the Indri search engine in C++, the Galago search engine research framework in Java, the RankLib learning to rank library, the ClueWeb09 and ClueWeb12 datasets, and the Sifaka data mining application.
The stuff here has no documentation and some of it may never be completed. This is my playground; use at your own risk.
Imgur Gallery Downloader
Users can now search Imgur for any phrase and ImgurDL/Loadur will automatically search for matching images. ImgurDL/Loadur will download the images while displaying the progress to the user.
IGLU is a Java class library designed to facilitate sharing of code among Artificial Intelligence/Information Retrieval researchers to illustrate how various problems can be solved in Java. It is developed and maintained by the IGLU Research Group.
This program runs on the XP/2000/NT platform using the Microsoft .NET Framework and the Microsoft SAPI speech/voice engine. Its function is to monitor an unlimited number of files on local or remote filesystems for changes and then speak the content
The Rainbow project is an open source initiative to build a comprehensive content management system using Microsoft's ASP.NET and C# technologies. It has ASP.NET 1.1 and ASP.NET 2.0 code bases.
PornSeer, a smart porn detector, precisely locates breasts, vulvas and other pornographic features in images/videos. It generates mosaic patterns over illicit content in images/videos and provides indexes of pornographic content for image/video database
Client/aggregator for free online language services: translators, dictionaries, thesauruses. Translates text using 41 sites, 53 services, and 70 languages, with more than 4000 translation directions.
The LEADERS toolkit is a generic toolset that enables the creation of an online environment which integrates EAD finding aids and EAC authority records with TEI transcripts and digitised images of archival material suitable to a wide variety of archives.
"Gobble" is a GUI-based interface for accessing search results from www.Google.com and allowing the user to download files of a selected type. Multiple advanced functions are included.
ImageCrawler is an application to extract images from websites. A thumbnail view is provided. Based on Spring.NET and the HTML Agility Pack.
CaC is an application to easily download and convert videos from video sites like YouTube, Google Video, etc. It is written in Lazarus/FreePascal and available for Linux, Windows, and Mac OS X systems.
A program designed to browse all chans and automatically save threads that would be of interest to the user. The program searches for user-entered keywords, finds threads containing them, and downloads images from each thread until it 404s.
A simple search engine for LANs. Indexes files in shares over FTP and SMB protocols and provides the ability to search for certain files in this index.
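The index-then-search idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's implementation: it walks a local directory instead of FTP or SMB shares, it indexes file names only, and the names build_index, tokenize, and search are hypothetical.

```python
import os
from collections import defaultdict


def tokenize(name):
    """Split a file name into lowercase alphanumeric tokens."""
    token, tokens = [], []
    for ch in name.lower():
        if ch.isalnum():
            token.append(ch)
        elif token:
            tokens.append("".join(token))
            token = []
    if token:
        tokens.append("".join(token))
    return tokens


def build_index(root):
    """Map each file-name token to the set of paths that contain it.

    Assumption: a local directory tree stands in for a network share.
    """
    index = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            for token in tokenize(name):
                index[token].add(path)
    return index


def search(index, query):
    """Return sorted paths whose file names contain every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return sorted(results)
```

A query such as search(index, "holiday photo") would return only the files whose names contain both words; a real LAN indexer would also need to refresh the index as shares change.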
J-DAWN project is a Job-Directed Automated Web Navigator. It can retrieve network tasks, and schedule and execute them. Part of its power lies in the ability to define tasks using a graphical programming language based on an underlying XML foundation.
Web Textual eXtraction Tools: C++ parallel web crawler, noun phrase identification, multi-lingual part-of-speech tagging, Tarjan's algorithm, Co-RelationShip mappings...
HooDoo is designed to provide most of the same functionality as Google, but available to everyone for their own websites.
A VB web crawler, currently under construction, whose goal is to crawl and index the net, most likely by distributed computing (via network).
Open source application for databasing your music collection(s). iChoons will utilize other open source products such as MySQL, Apache Webserver and PHP as well as Python / wxPython and SQLite. We will also be including tools written in Python for Win3
FileStructureToHTML creates an .html file listing the MP3s, videos, and other files on your selected drive or in a specific directory. The files can be organized in a tree, a listing, or a table. Gain a whole new way of viewing your file list.
RIG is a web-based JPEG image album viewer, especially useful for digital camera albums; provides automatic image resizing, preview & thumbnail caching, user authentication; composed of a PHP web interface and a C++ thumbnail engine.
This project was started by me and a few friends a while ago to solve our problems with other, better-known CMSs. The problem was that the others didn't have the functions we required, so we started our own.
Written in PHP and designed to maintain a personal database of bookmarks, Linkerdoodle is a simple link organizer.
A CD catalogue program, written in ASP and linked to an Access database. Features include add/view/delete records, sort records, and the ability to check CDs in and out of the catalogue. Fully functional from the web using IIS.
The purpose of this project is to build a searchable database out of a directory structure of ini files (for album info) and ID3v1 and ID3v2 tags from MP3s, using PHP, MySQL and Apache.
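As a rough illustration of the tag-reading step, the sketch below parses an ID3v1 tag, which is a fixed 128-byte block at the end of an MP3 file starting with the marker "TAG". It is written in Python for brevity rather than the project's PHP, the function name parse_id3v1 is hypothetical, and ID3v2 and the ini files are not covered.

```python
def parse_id3v1(data):
    """Parse the 128-byte ID3v1 trailer from the raw bytes of an MP3 file.

    Returns a dict of fields, or None when no ID3v1 tag is present.
    """
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":
        return None

    def text(raw):
        # Fields are fixed width and padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "comment": text(tag[97:127]),
        "genre": tag[127],  # numeric genre code
    }
```

The resulting dictionary could then be written to a database row per file; reading only the last 128 bytes of each file is enough for this tag version.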