Coherence is an advanced Content Management System built on top of Zope. Coherence has site, user, and file management. Some of the special features are a WYSIWYG page editor with a drag-and-drop interface, version control, workflow, and link management.
This is an ***old archive*** of tools developed to facilitate the use of Creative Commons licenses and metadata. For the most up-to-date representation of any of the projects listed here, please see: http://creativecommons.org/project/Developer.
An application used to search various web-based genealogy sites simultaneously and review and analyse the data gathered.
HttpFinder is a web content searching tool. It looks for text content that matches a given regular expression in HTML pages, scripts, etc. All navigation is driven by a second regular expression that describes which links to visit.
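The two-pattern idea described above (one regular expression for content, another for links to follow) can be sketched in Python. This is only an illustration, not HttpFinder's own code: the function name `regex_crawl`, the in-memory `PAGES` dictionary standing in for real HTTP fetches, and both patterns are assumptions made for the example.

```python
import re
from collections import deque

def regex_crawl(pages, start, content_re, link_re, max_pages=100):
    """Walk `pages` (url -> html text) breadth-first from `start`.

    `content_re` selects the text to report; `link_re` must have one
    capture group that yields the next URL to visit.
    """
    content = re.compile(content_re)
    link = re.compile(link_re)
    seen = {start}
    queue = deque([start])
    hits = []
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:
            continue  # unknown page: nothing to search
        visited += 1
        # Report every content match found on this page.
        for match in content.finditer(html):
            hits.append((url, match.group(0)))
        # Queue only the links that the navigation pattern accepts.
        for match in link.finditer(html):
            target = match.group(1)
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return hits

# Hypothetical site held in memory instead of fetched over HTTP.
PAGES = {
    "/index.html": '<a href="/a.html">A</a> <a href="/skip.txt">S</a> build 1.2.3',
    "/a.html": '<a href="/index.html">home</a> release 2.0.1',
    "/skip.txt": "version 9.9.9",
}

hits = regex_crawl(PAGES, "/index.html", r"\d+\.\d+\.\d+", r'href="([^"]+\.html)"')
```

Because the link pattern only accepts targets ending in `.html`, `/skip.txt` is never visited, so its version string is not reported.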
IGLU is a Java class library designed to facilitate sharing of code among Artificial Intelligence/Information Retrieval researchers to illustrate how various problems can be solved in Java. It is developed and maintained by the IGLU Research Group.
A web crawler which uses regular expressions on text downloaded from a site.
JobClient downloads information from job-seeker sites, filters and sorts it against your skillset, and provides a GUI to browse and apply for jobs. Utilities are included for archiving and screen scraping.
Kobold's file search engine is a CGI script for home pages, programmed in Aptilis, an easy-to-learn scripting language. The main search script and an example HTML file are included, plus a script for indexing your files. No database is required, only an HTTP server.
Lucene has moved to Jakarta. Please visit http://lucene.apache.org/
Mp3 JudeBox Server Interface
My Community Portal is an all-in-one internet portal that offers forums, groups, chat, your own e-mail, a search engine, an internet directory, your own home page, polls, dating services, a buddy list, MP3 and file sharing, and much more.
Not A Blog is a collection of modules based on a common user authentication and sessions module that can separate form from content without sacrificing design control over the content. This is achieved by having a virtual site hierarchy in a MySQL databas
Open-site PHP code.
An efficient PHP P2P distributed computing client and server. It will do web crawling, indexing, searching, file sharing, replication, etc., based on persona, a P2P trust-based algorithm. The project goal is to provide up-to-date, relevant search results better than goog
PHP World Portal is being developed as the framework for JLS Web Development's site. After each module is completed it will be released as open source for the public. The core framework will be released by 1/23/04.
The photo gallery management system is designed to track and categorize images on a web page. It allows searching image metadata, generating galleries by search or from predefined lists, tracking viewing statistics, and creating image exchanges with other sites.
RoadRunn's MP3 Server Search Engine is a web-based search engine for searching a personal MP3 collection and downloading and/or queuing tracks in a streaming server. Please read the Notes for each release for version-specific features.
SearchIRC Deskbar. Provides multiple methods to access SearchIRC from your desktop and/or browser. Deskbar is NO LONGER SUPPORTED. Mozilla search plug-in should still work, however.
(This project is participating in the Zend PHP5 Contest. Project information will be released after the event, Oct 11, 2004.)
Referer spam (also known as log spam or referer bombing)
Required: PHP CLI and PHP cURL. Referer spam (also known as log spam or referer bombing) is a kind of spamdexing (spamming aimed at search engines). The technique involves making repeated web site requests using a fake referer URL that points to the site the spammer wishes to advertise. Sites that publicize their access logs, including referer statistics, will then inadvertently link back to the spammer's site. Search engines index these links as they crawl the access logs. This benefits the spammer because the free link improves the search engine ranking of the spammer's site under the link-counting algorithms that search engines use.
A hypertext browser written in Java which filters links (e.g. emails, docs, or pics) out of HTML documents and paints them on screen in hierarchical order. Users get a quick overview of how a website is put together.
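The link filtering described above can be illustrated with a small Python sketch that groups the links in a page by kind. It is not the project's Java code: the class `LinkCollector`, the function `classify_links`, the category names, and the file extensions treated as documents are all assumptions made for the example.

```python
from collections import defaultdict
from html.parser import HTMLParser

DOC_EXTENSIONS = (".pdf", ".doc")  # assumed set of "document" file types

class LinkCollector(HTMLParser):
    """Collect links from <a> and <img> tags, grouped by kind."""

    def __init__(self):
        super().__init__()
        self.links = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            href = attributes["href"]
            if href.startswith("mailto:"):
                kind = "email"
            elif href.lower().endswith(DOC_EXTENSIONS):
                kind = "doc"
            else:
                kind = "page"
            self.links[kind].append(href)
        elif tag == "img" and attributes.get("src"):
            self.links["picture"].append(attributes["src"])

def classify_links(html):
    """Return a dict mapping each link kind to the links found in `html`."""
    collector = LinkCollector()
    collector.feed(html)
    return dict(collector.links)

SAMPLE = (
    '<a href="mailto:a@example.com">mail</a>'
    '<a href="/manual.pdf">manual</a>'
    '<img src="/logo.png">'
    '<a href="/about.html">about</a>'
)
groups = classify_links(SAMPLE)
```

Applying `classify_links` to each page that a "page" link points to would give the kind of hierarchy the description mentions.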
Zope is an open source application server specializing in content management, intranets, and custom web applications. Zope is written in Python and has a large, global community of developers and companies.
Java Bookmark Manager
webExtractor is a Java application used for extracting specific content from web-based HTML, XML, CSV, and free-form text. The extracted data can be used for data gathering and mining purposes.
webStraktor is a programmable World Wide Web data extraction client. Its purpose is to scrape HTML-based content via the HTTP protocol and extract relevant information. webStraktor features a scripting language to facilitate the collection, extraction, and storage of information available on the web, including images. The scripting language uses elements of regular expression and XPath syntax. It has a small instruction set and its syntax is easy to master. The standard webStraktor output format is XML-based, encoded in ASCII, UTF-8, or ISO-8859-1 (Latin1). webStraktor relies on the Apache HttpClient for retrieving content via the HTTP protocol. It adheres to the Robots Exclusion Protocol and can be configured to operate anonymously by connecting through the predominant types of web proxy servers. webStraktor extends the functionality of web crawlers, spiders, or bots by integrating scraping and crawling capabilities.
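As an illustration of what adhering to the Robots Exclusion Protocol involves, the following Python sketch filters a list of URLs against a robots.txt file before anything is fetched. It uses Python's standard `urllib.robotparser` rather than webStraktor's own Java implementation; the helper `allowed_urls`, the user agent name `example-bot`, and the sample robots.txt content are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real client would download it
# from the target site before crawling.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def allowed_urls(robots_txt, user_agent, urls):
    """Return only the URLs that robots.txt permits `user_agent` to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]

candidates = [
    "http://example.com/index.html",
    "http://example.com/private/data.html",
]
permitted = allowed_urls(ROBOTS_TXT, "example-bot", candidates)
```

A scraper built this way would pass only the `permitted` list on to its download and extraction steps.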