
Changes between Initial Version and Version 1 of Overview

Timestamp:
07/30/09 20:52:41 (5 months ago)
Author:
leo_sauermann (IP: 127.0.0.1)
Comment:

Imported from wikispaces

  • Overview

    v1 v1  
     1= Overview = 
     2 
     3== Goals of the Project == 
     4 
     5Aperture is an open source library for crawling and indexing information sources such as file systems, websites and mail boxes. Aperture supports a number of common source types and document formats out-of-the-box and provides easy ways to extend it with custom implementations. 
     6 
     7The Aperture code consists of a number of related but independently usable parts: 
     8 
     9  * Crawling of information sources: file systems, websites, mail boxes 
     10  * MIME type identification 
     11  * Full-text and metadata extraction of various file formats 
     12  * Opening of crawled resources 
     13 
     14For each of these parts, a set of APIs has been developed and a number of implementations are provided. 
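
To make the division into parts concrete, here is a minimal sketch of a crawl, identify, extract pipeline. It is written in Python with the standard library only and is not Aperture's API: the function names `crawl`, `identify`, `extract` and `process` and the metadata keys are hypothetical. Two simplifications are assumed: the MIME type is guessed from the file name alone, and only plain-text files yield full text.

```python
import mimetypes
import os
import tempfile


def crawl(root):
    """Yield the path of every file below root (the crawling part)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)


def identify(path):
    """Guess a MIME type from the file name (the identification part)."""
    mime_type, _encoding = mimetypes.guess_type(path)
    return mime_type or "application/octet-stream"


def extract(path, mime_type):
    """Return (full_text, metadata); full_text is None for non-text files."""
    # Source-specific metadata; the key names are hypothetical.
    metadata = {
        "location": path,
        "mimeType": mime_type,
        "lastModified": os.path.getmtime(path),
    }
    text = None
    if mime_type == "text/plain":
        with open(path, encoding="utf-8", errors="replace") as handle:
            text = handle.read()
    return text, metadata


def process(root):
    """Run the three parts in sequence over everything below root."""
    return [extract(path, identify(path)) for path in crawl(root)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "note.txt"), "w", encoding="utf-8") as handle:
            handle.write("hello")
        for text, metadata in process(root):
            print(metadata["mimeType"], repr(text))
```

Because each step is a separate function, any one of them can be used or replaced on its own, which mirrors the "related but independently usable" structure described above.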
     15 
     16Aperture has a strong focus on semantics. For example, considerable effort goes into making the Extractors extract as much of the metadata contained in the file formats as possible (e.g., titles, authors, comments) and combining it with source-specific metadata (e.g., location, last modification date). All metadata is mapped to properties from the NIE namespace to allow uniform processing of the crawled and extracted information. NIE is an ontology developed within the Nepomuk Project and now maintained by an open source community. See [http://sourceforge.net/projects/oscaf/ the website of the development project on Sourceforge], [http://www.oscaf.org the website of the organization], and [http://www.semanticdesktop.org/ontologies the website of the ontology]. 
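
The idea of mapping format-specific metadata onto one shared vocabulary can be sketched as follows. This is an illustration in Python, not Aperture code: the raw keys on the left of `PROPERTY_MAP` and the function `to_triples` are hypothetical, and the NIE namespace URI and property names are given as assumptions that should be checked against the ontology itself.

```python
# Assumed NIE namespace URI; verify against the published ontology.
NIE = "http://www.semanticdesktop.org/ontologies/2007/01/19/nie#"

# Hypothetical raw metadata keys (left) mapped to assumed NIE properties (right).
PROPERTY_MAP = {
    "Title": NIE + "title",
    "Comments": NIE + "comment",
    "content-type": NIE + "mimeType",
    "mtime": NIE + "lastModified",
    "text": NIE + "plainTextContent",
}


def to_triples(subject, raw):
    """Turn a raw metadata dict into (subject, property, value) triples.

    Keys without a mapping and None values are skipped, so every
    emitted triple uses a property from the shared vocabulary.
    """
    triples = []
    for key, value in raw.items():
        prop = PROPERTY_MAP.get(key)
        if prop is not None and value is not None:
            triples.append((subject, prop, value))
    return triples
```

Once every extractor and crawler emits the same property names, downstream code can process a spreadsheet, a web page and an e-mail in the same way.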
     17 
     18Aperture is responsible for extraction; it doesn't try to tell you what to do with the data. You may want to store it, index it, query it, or simply grab the full text and print it. Code snippets within this wiki show some examples of what can be done with the data and how to make it available to your application with the RDF APIs and query languages such as SPARQL or SeRQL. 
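
As a rough illustration of what querying the extracted data means, the sketch below matches a triple pattern against an in-memory list of triples. It is plain Python and only a stand-in for a real RDF store and query engine; `match` and `titles` are hypothetical helpers, and the NIE namespace URI and the `title` property are assumptions to be checked against the ontology.

```python
# Assumed NIE namespace URI; verify against the published ontology.
NIE = "http://www.semanticdesktop.org/ontologies/2007/01/19/nie#"


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def titles(triples):
    """Map each resource to its title.

    A SPARQL query with the same intent would look roughly like:
      PREFIX nie: <http://www.semanticdesktop.org/ontologies/2007/01/19/nie#>
      SELECT ?doc ?title WHERE { ?doc nie:title ?title }
    """
    return {s: o for (s, _p, o) in match(triples, predicate=NIE + "title")}
```

A real application would hand the triples to an RDF store and run the SPARQL or SeRQL query there; the pattern-matching idea is the same.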
     19 
     20== Aperture Web Demo == 
     21Try out Aperture live: [[br]] 
     22http://www.dfki.uni-kl.de/ApertureWebProject/ is a website where you can try out the extractors. Note that it runs a slightly outdated version of Aperture; to give the newest Aperture a spin, take a look at the [ApertureExampleApplications]. 
     23 
     24== Extensibility and Plugins == 
     25Aperture consists of many parts that can be used as plugins, so it is easy to extend it or to trim it down to the parts you need. [[br]] 
     26The core plugins are contained in the distributions. In addition, there are contributed add-ons and the web project: 
     27  * [wiki:"Aperture Addons"] 
     28  * [wiki:"Aperture Webserver" Aperture Web Server] 
     29 
     30== Aperture Team == 
     31Who is contributing to Aperture? Who are the people behind it? 
     32  * [wiki:Team] 
     33 
     34== Historical Background == 
     35 
     36Aperture started as a cooperation between the [http://www.dfki.de German Research Center for Artificial Intelligence] and the Dutch software company [http://www.aduna-software.com/ Aduna]. 
     37 
     38Both organizations had already produced software sharing certain characteristics, such as targeting desktop search using Semantic Web technology. As a result, they had to solve the same technical problems, such as incremental crawling of a file system, text and metadata extraction, and indexing and querying of metadata. 
     39 
     40This made them realize that by cooperating on these issues they could get better code with less individual effort. Furthermore, this would enable other people to contribute as well. 
     41 
     42In the summer of 2005 they decided to start a joint open source project that would be bootstrapped with the crawling and indexing code already developed in-house and that would serve as a basis for future development. 
     43 
     44 
     45== Licensing == 
     46The Aperture code is published under a permissive BSD-like open source license that allows the use of Aperture in proprietary applications. 
     47 
     48See the [http://aperture.sourceforge.net/license.html License] distributed with the library for more details. 
     49 
     50== Old licensing policy (releases up to 1.2.0) == 
     51Up to release 1.2.0, the Aperture code was published under an open source license (AFL 3.0 for APIs and example code, OSL 3.0 for API implementations) that allows the use of Aperture in proprietary applications. [[br]] 
     52In short, your own code does not have to be open source in order to use Aperture, with one exception: if you create a modified version of an Aperture API implementation (i.e., you base it on a file published under the OSL), then you are required to publish that part under the same license. [[br]] 
     53For example, if you create an improved version of the Excel text and metadata extractor using the Aperture implementation as a starting point, then this derivative code must also be distributed under the OSL. Your own implementations of Aperture APIs are not required to be open sourced, nor is any code using Aperture components. 
     54 
     55See the [http://aperture.sourceforge.net/license.html License] distributed with the library for more details.