A Python wrapper for the Google web API. Allows you to do Google searches, retrieve pages from the Google cache, and ask Google for spelling suggestions.
pyTube is a Python-based command-line YouTube search tool. You can search for videos and open them in your default web browser. Requires Python 2.5 and gdata.
The Porn Toolkit is a collection of scripts or programs to download free porn videos automatically.
High performance distributed in-memory key/value store
Infinispan is an open source, Java-based data grid platform. ***IMPORTANT*** Starting with Infinispan 5.0.0.FINAL, Infinispan releases are no longer hosted on SourceForge. They can now be found at www.jboss.org/infinispan/downloads
Document summarization system. Once document content has been added to the system, user queries generate a summary document containing the information available to the system.
A web-based search interface tailored to the New Zealand Gazette PDF archive for the NZ library community. A generic Python-based Swish-e search interface.
A powerful, themeable image gallery generator for static HTML pages.
A small collection of Python 3000 scripts/modules used to automate searching craigslist.org cities and categories for interesting stuff. These scripts currently use HTML screen scraping, since craigslist currently has no API.
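As a rough illustration of what HTML screen scraping involves, the sketch below pulls link targets and titles out of a listing page using only the standard library. It is not taken from the project: the `LinkTitleParser` class, the `find_listings` helper, and the sample markup are all hypothetical, and real craigslist pages will be structured differently.

```python
from html.parser import HTMLParser


class LinkTitleParser(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_listings(html, keyword):
    """Return the links whose title contains keyword (case-insensitive)."""
    parser = LinkTitleParser()
    parser.feed(html)
    return [(href, title) for href, title in parser.links
            if keyword.lower() in title.lower()]


# Made-up markup standing in for a downloaded listing page.
SAMPLE_HTML = """
<ul>
  <li><a href="/bik/1.html">Vintage road bike</a></li>
  <li><a href="/fur/2.html">Oak desk</a></li>
  <li><a href="/bik/3.html">Kids bike, like new</a></li>
</ul>
"""

matches = find_listings(SAMPLE_HTML, "bike")
```

Scraping of this kind depends on the page layout staying the same, which is why an official API would be preferable if one existed.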
This is an ***old archive*** of tools developed for facilitating the use of Creative Commons licenses and metadata. --- For the most up to date representation of any of the projects listed here, please see: http://creativecommons.org/project/Developer.
MedusWiki is a Python Wiki engine intended to be used as a personal knowledge management system. It uses Topic Maps (XTM) to store metadata, so meaningful associations can be created between wiki pages. Zope Page Templates (ZPT) are used to produce HTML.
DuckDuckGo from the terminal
ddgr is a command-line utility to search DuckDuckGo from the terminal. While googler is highly popular among command-line users, the need for a similar utility for the privacy-aware DuckDuckGo has come up in many forums. DuckDuckGo Bangs are super-cool too! So here's ddgr for you! Unlike the web interface, you can specify the number of search results you would like to see per page. It's more convenient than skimming through 30-odd search results per page. The default interface is carefully designed to use minimum space without sacrificing readability. ddgr isn't affiliated with DuckDuckGo in any way. Demo: https://asciinema.org/a/151849
Frosttie (FROnt-end SchemaTron Text Internet Engine) takes XHTML pages and processes them with various user-definable filters such as W3C's WAI, Section 508 (US) web usability compliance, ad removal, etc. It can be used with zKnowMan.
HarvestMan is a fully functional, multithreaded web crawler cum offline browser. It is highly customizable and supports more than 55 options for controlling and customizing offline browsing. It is written entirely in the Python programming language.
HyperSQL is like a doxygen plus javadoc for SQL, hypermapping SQL views, packages, procedures, and functions to HTML source code listings and showing all code locations where these are used.
MindRetrieve is a personal search engine. It helps you organize and retrieve web pages you have visited. MindRetrieve is a lightweight, cross-platform, open source application available under the BSD license. It works with all popular web browsers.
ISO - Customized version of dcm4chee 2.17.3 for MySQL.
1. Added JBoss Application Server 4.2.3.GA for JDK 6. 2. Cleaned up Windows and deprecated files. 3. Turned off CONSOLE records - http://forums.dcm4che.org/jiveforums/thread.jspa?messageID=4787
Simple tool, written in Python, to back up LiveJournal entries. Given a username and date range, it downloads all entries in the range and places them in HTML files on the user's hard drive.
Ruya is a Python-based breadth-first, level-, delayed, event-based crawler for crawling English and Japanese websites. It is aimed solely at developers who want crawling functionality and crawl control in their projects through an API.
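To make the terms concrete, here is a minimal sketch of a breadth-first crawl with a level limit, a delay between pages, and a per-page event callback. It does not use Ruya's actual API: `crawl_breadth_first`, its parameters, and the in-memory `SITE` link graph are hypothetical stand-ins, and a real crawler would fetch and parse pages instead of looking links up in a dictionary.

```python
import time
from collections import deque


def crawl_breadth_first(start, get_links, max_level=2, delay=0.0, on_page=None):
    """Visit pages level by level, never going deeper than max_level.

    get_links(url) returns the outgoing links of a page.
    on_page(url, level) is called once for every page visited.
    """
    visited = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        url, level = queue.popleft()
        order.append((url, level))
        if on_page is not None:
            on_page(url, level)      # event hook for the caller
        if level >= max_level:
            continue                 # level limit: do not follow links further
        for link in get_links(url):
            if link not in visited:
                visited.add(link)
                queue.append((link, level + 1))
        if delay:
            time.sleep(delay)        # pause between pages whose links are followed
    return order


# Made-up link graph standing in for a website.
SITE = {
    "/": ["/a", "/b"],
    "/a": ["/c"],
    "/b": ["/c", "/"],
    "/c": ["/d"],
    "/d": [],
}

seen = []
order = crawl_breadth_first(
    "/", lambda url: SITE.get(url, []), max_level=2,
    on_page=lambda url, level: seen.append(url),
)
```

With `max_level=2`, the page `/d` is never visited because it sits three links away from the start page.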
Each user can run their own threaded search engine and contribute to a global search database searching only the sites they want. It is built using Turbogears.
Zope is an open source application server specializing in content management, intranets, and custom web applications. Zope is written in Python and has a large, global community of developers and companies.
A drop-in framework for adding tagging (folksonomy) capabilities to existing applications
Agile Author is a framework for developing networked repositories of digital information such as digital libraries and content management systems.
Ambisearch is a linguistic search engine tool based on Yahoo! that disambiguates search results according to their word senses.
BandStalker is a utility that monitors local websites to inform you when your favorite bands are touring through your city.
BeeSeek is a project to build a free, open-source search engine based on peer-to-peer technology. Code and bug reports are available at https://launchpad.net/beeseek-project