An object-relational mapping (ORM) library for Java
Hibernate is an object/relational mapping (ORM) tool. It is widely used in Java applications and implements the Java Persistence API. Hibernate ORM makes it easier for developers to write applications whose data outlives the application process. As an ORM framework, Hibernate is concerned with data persistence as it applies to relational databases (via JDBC).
Hunspell is a spell checker and morphological analyzer library and program designed for languages with rich morphology and complex compounding or character encoding. Hunspell interfaces: Curses, Ispell-compatible pipe interface, OpenOffice.org UNO module.
CLucene is a C++ port of Lucene, the high-performance, full-featured text search engine written in Java. The project states that CLucene is faster than Lucene because it is written in C++.
Archive your personal history
ResCarta Toolkit offers an open source solution for creating, storing, viewing, and searching digital collections. Applications in the toolkit let users create and edit metadata, convert data to the open-standard ResCarta format, and index and host collections.
A Java implementation of a flexible and extensible web spider engine. Optional modules allow functionality to be added (searching for dead links, testing the performance and scalability of a site, creating a sitemap, etc.).
The ht://Dig system is a complete indexing and searching system for a domain or intranet. This system is not meant to replace the need for powerful internet-wide search systems like Lycos, Infoseek, Google and AltaVista.
DOSE: a distributed platform for semantic elaboration that provides semantic services such as automatic annotation of web resources at the document substructure level, semantic search facilities, semantic annotation storage and retrieval.
Hyper Estraier is a full-text search system. It works much like Google, but is based on a peer-to-peer architecture. With Hyper Estraier, you can build a large-scale search engine from inexpensive computers.
Contineo is a Web-based Document Management System (DMS). Features: folder organization, document versioning, bulk import, import from mailbox. NOTE: this project has been DISCONTINUED in favor of LogicalDOC http://sourceforge.net/projects/logicaldoc
WebSPHINX is a web crawler (robot, spider) Java class library, originally developed by Robert Miller of Carnegie Mellon University. Multithreaded, tolerant HTML parsing, URL filtering and page classification, pattern matching, mirroring, and more.
Provides a robust and efficient Java implementation of n-gram based classifiers. N-gram algorithms have been shown to be surprisingly good at tasks like guessing the language or encoding of an arbitrary text file, and they have many more applications.
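To illustrate the idea behind n-gram based language guessing, here is a minimal sketch. It is not taken from the project above: the class name NGramGuesser, its methods, and the scoring rule (summing the relative frequency of each character trigram) are assumptions chosen for brevity.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration class; not part of any library listed here.
public class NGramGuesser {
    private static final int N = 3; // character trigrams

    // label -> (trigram -> count seen in training text)
    private final Map<String, Map<String, Integer>> profiles = new HashMap<>();
    // label -> total number of trigrams seen in training text
    private final Map<String, Integer> totals = new HashMap<>();

    // Adds the trigrams of the given text to the profile for this label.
    public void train(String label, String text) {
        Map<String, Integer> profile =
                profiles.computeIfAbsent(label, k -> new HashMap<>());
        String s = text.toLowerCase(Locale.ROOT);
        for (int i = 0; i + N <= s.length(); i++) {
            profile.merge(s.substring(i, i + N), 1, Integer::sum);
            totals.merge(label, 1, Integer::sum);
        }
    }

    // Returns the label whose profile best matches the text,
    // or null if nothing has been trained.
    public String guess(String text) {
        String s = text.toLowerCase(Locale.ROOT);
        String best = null;
        double bestScore = -1.0;
        for (Map.Entry<String, Map<String, Integer>> e : profiles.entrySet()) {
            int total = totals.getOrDefault(e.getKey(), 0);
            if (total == 0) {
                continue; // training text was shorter than one trigram
            }
            double score = 0.0;
            for (int i = 0; i + N <= s.length(); i++) {
                score += e.getValue().getOrDefault(s.substring(i, i + N), 0);
            }
            score /= total; // normalize by profile size
            if (score > bestScore) {
                bestScore = score;
                best = e.getKey();
            }
        }
        return best;
    }
}
```

A caller would train one profile per language with sample text and then call guess on unknown input. Real classifiers typically use much larger training corpora and smoothed or rank-based scoring.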
High performance distributed in-memory key/value store
Infinispan is an open source, Java-based data grid platform. ***IMPORTANT*** Starting with Infinispan 5.0.0.FINAL, Infinispan releases are no longer hosted on SourceForge. They can now be found at www.jboss.org/infinispan/downloads
Bibliophile is a loose grouping of independent OS or GPL bibliographic systems and aims at promoting discussion, standards and the development of common utilities.
EasyGIS simplifies GIS data management, sharing, and publishing. REST interfaces (JSON, HTML views). Lucene-based full-text search. Thematic maps, business cartography. Integration with external GIS data providers such as Google and OSM.
webspider provides a mechanism for fetching content from the web. With the extended classes, you can do the following things: 1. grab URLs from a specified base URL 2. analyze the contents of a list of URLs 3. get specific files from the web 4. and more
With Zip2Map, you can find the map for any ZIP code (currently U.S. only). Looking up a ZIP code returns a map of the location with its name and state name. It uses the Google Maps API with PHP, MySQL, and plenty of Ajax to make it a real Web 2.0 application.
Google Sitemaps Toolbox (GSToolbox) is a toolbox designed for webmasters to generate, manage, and view Google Sitemaps files. It is composed of Google Sitemaps Stylesheet (GSStylesheet) and Google Sitemaps Director (GSDirector).
This is a PHP library for accessing OWL files. OWL is a W3C standard for storing semantic information.
A fat-client price-checking tool, similar in spirit to PriceRunner and others, except that it checks prices at the source on demand. It is meant to save you from entering the same search criteria on multiple sites and then tabbing through them to compare.
This project provides cross-forge semantic search for the Qualipso Forge. It integrates the A4 AdvDoc prototype (semantic search GUI and engine) with A3 homogeneous and heterogeneous cross-forge semantic search capabilities. See Qualipso.org for details.
SENTENSA Knowledge Miner is a platform-independent tool for searching any text. SENTENSA uses robust methods of indexing and searching text, drawing on more than 20 years of experience in information retrieval.
SpatiumCube is open source technology for the easy development of Spatial Data Infrastructures (SDIs) and services over them. It includes software, stylesheets, ontologies, and other technology elements for the development of SDIs.
Easy Spider is a distributed Perl Web Crawler Project from 2006
Easy Spider is a distributed Perl Web Crawler Project from 2006. It includes code for crawling web pages, distributing the results to a server, and generating XML files from them. The client side can be any computer (Windows or Linux), and the server stores all data. Websites that use EasySpider and Perl/PHP backends: https://www.artikelschreiber.com/en/ Web crawlers are often one of the first things people write when they start their programming career. It is fun to look at code that is a few years old and see how one has improved. (c) Sebastian Enger 2005-2015
The complete suggestions framework for Java, supporting single- and multi-field suggest, a Java suggest box, client/server with Hessian or JSON-RPC, a GWT Ajax suggest box, and phonetic plugins. Proven high performance for data sets > 1 million.
Classifieds for cars with MooTools, PHP, and MySQL, entirely in Ajax.