Framework (scripts, configuration, code) to build free and public services around travel and leisure data. The project makes extensive use of existing data sources such as Geonames and dbPedia, and adds some glue around them (e.g., links).
Document summarization system. Once document content has been added to the system, user queries generate a summary document containing the information available to the system.
Voxound Extension is a daemon application used to provide additional local content access and management functionality to the voxound.com web application.
A small collection of Python 3000 scripts/modules used to automate searching craigslist.org cities and categories for interesting stuff. These scripts currently use HTML screen scraping, since craigslist currently has no API.
Nucular Archiving System for creating full text indices for fielded data. Python API, web, and command line interfaces. Fast. Very light weight. Concurrent read/writes with no possible locking issues. No server process. Proximity. Facets. Funny name.
"Filtered communication" is the source code for a website which facilitates collaborative filtering of information on the internet. Users can create "filters", criteria which are defined in English. Activity mode (http://bayleshanks.com/pamv1): aslee
Cortez aims to create a new news service model for RSS and blogging. Cortez simply offers an environment to create posts, read news through RSS (Atom), and syndicate across multiple blogs.
DiskAt is a disk/media catalogue app supporting multiple categories per item, good search, and features that allow it to be used as a Movie/DVD/etc. database. Written with PHP/Python/SQLite.
This is a Python script that parses your irssi logs and inserts them into a MySQL database, which you can then use to search and display your logs on the web. It incrementally updates the database from the logs and is ideally run frequently as a cron job.
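The incremental update described above could look roughly like the following. This is only a sketch under stated assumptions, not the project's actual code: it uses Python's built-in sqlite3 in place of MySQL so that it runs without a server, it assumes a simplified log line format of `HH:MM <nick> message`, and the names `open_db`, `import_log`, `messages`, and `progress` are hypothetical. The idea is to remember the byte position reached in each log file so that a later run reads only the lines appended since.

```python
import re
import sqlite3

# Assumed, simplified line format: "HH:MM <nick> message".
# An optional leading space or mode prefix (@, +) before the nick is skipped.
LINE_RE = re.compile(r"^(\d{2}:\d{2}) <\s*[@+]?([^>]+)> (.*)$")


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) tables if needed."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS messages "
        "(log TEXT, time TEXT, nick TEXT, text TEXT)"
    )
    db.execute(
        "CREATE TABLE IF NOT EXISTS progress "
        "(log TEXT PRIMARY KEY, byte_pos INTEGER)"
    )
    return db


def import_log(db, log_path):
    """Import lines appended since the previous run; return rows added."""
    row = db.execute(
        "SELECT byte_pos FROM progress WHERE log = ?", (log_path,)
    ).fetchone()
    pos = row[0] if row else 0
    added = 0
    with open(log_path, "rb") as fh:
        fh.seek(pos)
        while True:
            line = fh.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a line still being written
            pos = fh.tell()
            text = line.decode("utf-8", "replace").rstrip("\r\n")
            match = LINE_RE.match(text)
            if match:  # non-message lines (joins, log markers) are skipped
                db.execute(
                    "INSERT INTO messages VALUES (?, ?, ?, ?)",
                    (log_path, *match.groups()),
                )
                added += 1
    # Remember how far we got so the next run starts from here.
    db.execute("INSERT OR REPLACE INTO progress VALUES (?, ?)", (log_path, pos))
    db.commit()
    return added
```

Calling `import_log` repeatedly from a scheduled job is safe in this sketch because a run that finds no new complete lines inserts nothing.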
zSearch is a simple Python-based crawler and search engine. Raw HTML is stored in bzip2 archives, the index is created using PyLucene, and Twisted is used to provide an internal HTTP server. Results are sent back as XML over HTTP.
Each user can run their own threaded search engine and contribute to a global search database, searching only the sites they want. It is built using TurboGears.
Eligante is software for archiving, managing, and browsing (with full-text search) all your communications, whether by email, chat (IRC, ICQ, MSN, ...), or even messaging websites (hi5, orkut, ...).
Cheshire3 is a fast Z39.50, SRW, XML search engine, written in Python for extensibility and using C libraries for speed. It is the next generation of the Cheshire system (http://cheshire.berkeley.edu) and is designed around a distributable, object-oriented model.
Webcomic Archive and News Generator (WANG) is a database-driven PHP application built for both aspiring and existing web comics. Written with a focus on security and speed, the code is built to be easy to use for code novices and experts alike.
Syncato is a Weblog Web Services system built on top of Berkeley DB XML, Webware and Python. It has a number of unique features: XPath access to all content via URLs, XSLT presentation, and an extremely flexible database structure.
A collection of software to implement search engine technology. The overall search technology is built on the individual components of this project. Each component is released under the BSD License and is written in the language most suited to its task.
LAMP eGovernment Database Project offers state and local governments a free, open source, web-enabled system for use in developing public information sites. You can also use this system for government-to-government systems.
Emine is a python script that parses an email file, separates all the email elements, including words and phrases, and populates a database with file offsets for retrieval from the original file.
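The offset-based retrieval described above can be illustrated with a short sketch. It is not Emine's implementation: it indexes only words (not phrases or other email elements), it keeps the index in an in-memory dictionary rather than a database, and the names `index_offsets` and `read_at` are hypothetical. It shows how recording byte offsets lets you jump straight back to the original text in the file.

```python
import re
from collections import defaultdict

# Assumption: a "word" is a run of ASCII letters, digits, or apostrophes.
WORD_RE = re.compile(rb"[A-Za-z0-9']+")


def index_offsets(path):
    """Map each lowercased word to the byte offsets where it occurs."""
    with open(path, "rb") as fh:
        data = fh.read()
    index = defaultdict(list)
    for match in WORD_RE.finditer(data):
        word = match.group().lower().decode("ascii")
        index[word].append(match.start())
    return index


def read_at(path, offset, length):
    """Fetch `length` bytes from the original file starting at `offset`."""
    with open(path, "rb") as fh:
        fh.seek(offset)
        return fh.read(length)
```

Because only offsets are stored, the index stays small and the original file remains the single source of the text; the trade-off is that the index becomes stale if the file is rewritten.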
Open source application for databasing your music collection(s). iChoons will use other open source products such as MySQL, Apache web server, and PHP, as well as Python / wxPython and SQLite. We will also be including tools written in Python for Win3
PySMBSearch is a crawler and search engine for SMB shares. It consists of a crawler script, which creates an index and stores it in an SQL database, and a CGI script that can be used to run queries against the database.
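The crawler-plus-query split described above might be sketched as follows. This is an illustration under assumptions, not PySMBSearch's code: it walks a local directory tree with `os.walk` instead of browsing SMB shares, it uses the built-in sqlite3 module as the SQL database, and the names `crawl`, `search`, and the `files` table are hypothetical.

```python
import os
import sqlite3


def crawl(db, root):
    """Walk a directory tree and record each file's path, name, and size."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS files "
        "(path TEXT PRIMARY KEY, name TEXT, size INTEGER)"
    )
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Re-crawling updates existing rows instead of duplicating them.
            db.execute(
                "INSERT OR REPLACE INTO files VALUES (?, ?, ?)",
                (full, name, os.path.getsize(full)),
            )
    db.commit()


def search(db, term):
    """Return paths whose file name contains `term`, sorted by path."""
    # Note: % and _ inside `term` act as LIKE wildcards in this sketch.
    rows = db.execute(
        "SELECT path FROM files WHERE name LIKE ? ORDER BY path",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]
```

A front end such as a CGI script would only need to call something like `search` against the stored index, so queries never touch the shares themselves.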
Omseek has been renamed to Xapian. Xapian is a Search Engine Library, written in C++ with bindings for Perl, Python, PHP, Java, Tcl, C# and Ruby. It allows you to easily add advanced indexing and search facilities to your applications.