[Carrot2-cvs] website/site/architecture index.xml,1.1.1.1,1.2 usergain.xml,1.1.1.1,1.2

Update of /cvsroot/carrot2/website/site/architecture
In directory sc8-pr-cvs1:/tmp/cvs-serv9307/site/architecture

Modified Files:
	index.xml usergain.xml 
Log Message:
site update - downloads especially

Index: index.xml
===================================================================
RCS file: /cvsroot/carrot2/website/site/architecture/index.xml,v
retrieving revision 1.1.1.1
retrieving revision 1.2
diff -C2 -d -r1.1.1.1 -r1.2
*** index.xml	20 Sep 2003 18:14:14 -0000	1.1.1.1
--- index.xml	6 Oct 2003 17:13:31 -0000	1.2
***************
*** 12,73 ****
  	<title lng="pl">Architektura</title>

!             <lang:pl>
!             <frame>
!             Ta strona jeszcze nie została przepisana na język polski. Proszę
!             przełączyć się na <a href="/architecture/index.xml?lang=en">wersję angielską</a>.
!             </frame>
!             </lang:pl>            
! 
!             <lang:en>            
!             <chapter level="1">
!                 <title>Architecture</title>

! <carrot-text/> is based on a concept of separate components, which communicate
! only by passing XML data. The communication protocol is restricted to HTTP,
! POST method specifically. This allows for great flexibility in adding new components,
! as the language of implementation and physical location may remain unknown.
! 
! 
! <illustration src="/gfx/carrot2/figures/components-dataflow.gif" float="right">
! 				 <description>1: A scenario of data flow in <carrot-text/> architecture</description>
! </illustration>
! There are four component types in Carrot2:
! <ul>
! <li>Input - This type of component accepts user query request (wrapped
! in standard XML and passed via HTTP POST), and is in charge of producing some
! document list, which should &quot;match&quot; the query. Upon successful processing,
! the component is required to produce a valid XML result stream.
! </li>
! <li>Filter - This type of component accepts result stream from Input, or
! Filter components, and does some processing on it. At the end of processing,
! it is required to return unchanged input stream, with perhaps intermixed custom
! tags (the result of processing). Such tags may include, for instance, alternate
! relevance ranking of results, grouping of similar documents, or other.
! </li>
! <li>Output - Output component type is in charge of somehow presenting the
! results to the user. The results, which this component produces are not defined (it
! may produce HTML page, display a Swing applet, or write results to disk). Components
! of this type usually interact with Controllers to present processing results to the user.
! </li>
! <li>Controller - A component that binds all the others together to form a processing stream.
! Carrot2 is a Controller component, because it lets the user select input, filter and
! output components and facilitates communication among them. However, other controller
! components are possible, such as command-line processors, or local application (as opposed to
! Web-accessible) controllers.
! </li>
! </ul>
! 
! It should be clearly stated that the scenario of data flow presented in figure 1
! is not optimal (because data is sent back and forth between components and the controller),
! but it was a design-decision to simplify component-side programming.
! 
! 
! A detailed description of architecture, data exchange protocols and other elements of the
! framework is given in the official Developers Manual (see 
! <a href="/developers/index.xml">developers section</a>).
! 
! </chapter>

!             
!             </lang:en>	
  </page>
--- 12,68 ----
  	<title lng="pl">Architektura</title>

!     <lang:pl>
!     <frame>
!     Ta strona tylko w wersji angielskiej, przepraszamy.
!     </frame>
!     </lang:pl>            

!     <chapter level="1">
!         <title>Architecture</title>

! <carrot-text/> is based on a concept of separate components, which communicate
! only by passing XML data. The communication protocol is restricted to HTTP,
! POST method specifically. This allows for great flexibility in adding new components,
! as the language of implementation and physical location may remain unknown.
! 
! 
! <illustration src="/gfx/carrot2/figures/components-dataflow.gif" float="right">
! <description>1: A scenario of data flow in <carrot-text/> architecture</description>
! </illustration>
! There are four component types in Carrot2:
! <ul>
! <li>Input - This type of component accepts user query request (wrapped
! in standard XML and passed via HTTP POST), and is in charge of producing some
! document list, which should &quot;match&quot; the query. Upon successful processing,
! the component is required to produce a valid XML result stream.
! </li>
! <li>Filter - This type of component accepts result stream from Input, or
! Filter components, and does some processing on it. At the end of processing,
! it is required to return unchanged input stream, with perhaps intermixed custom
! tags (the result of processing). Such tags may include, for instance, alternate
! relevance ranking of results, grouping of similar documents, or other.
! </li>
! <li>Output - Output component type is in charge of somehow presenting the
! results to the user. The results, which this component produces are not defined (it
! may produce HTML page, display a Swing applet, or write results to disk). Components
! of this type usually interact with Controllers to present processing results to the user.
! </li>
! <li>Controller - A component that binds all the others together to form a processing stream.
! Carrot2 is a Controller component, because it lets the user select input, filter and
! output components and facilitates communication among them. However, other controller
! components are possible, such as command-line processors, or local application (as opposed to
! Web-accessible) controllers.
! </li>
! </ul>
! 
! It should be clearly stated that the scenario of data flow presented in figure 1
! is not optimal (because data is sent back and forth between components and the controller),
! but it was a design-decision to simplify component-side programming.
! 
! 
! A detailed description of architecture, data exchange protocols and other elements of the
! framework is given in the official Developers Manual (see 
! <a href="/developers/index.xml">developers section</a>).
! 
! </chapter>
 </page>

Index: usergain.xml
===================================================================
RCS file: /cvsroot/carrot2/website/site/architecture/usergain.xml,v
retrieving revision 1.1.1.1
retrieving revision 1.2
diff -C2 -d -r1.1.1.1 -r1.2
*** usergain.xml	20 Sep 2003 18:14:15 -0000	1.1.1.1
--- usergain.xml	6 Oct 2003 17:13:31 -0000	1.2
***************
*** 12,61 ****
  	<title lng="pl">Korzyści</title>

! <lang:pl>
! <frame>
! Ta strona jeszcze nie została przepisana na język polski. Proszę
! przełączyć się na <a href="/architecture/usergain.xml?lang=en">wersję angielską</a>.
! </frame>
! </lang:pl>

! <chapter level="1">
! <title>What can a researcher gain from <carrot-text/>?</title>
! 
! A number of scientific papers concerning search results processing have been published.
! Each and every author had to go through the same tedious (and quite fruitless) tasks
! of building a search engine wrapper, incorporating stemming/light language processing
! algorithms, and finally displaying a result of the algorithm in some form.
! 
! 
! In our opinion this is highly redundant work and most of the components (wrappers, processing,
! displaying) can be reused. Thus, we think <carrot-text/> gives a number of possible 
! shortcuts for those willing to work with search result clustering or any other form
! of textual data processing (we envision that some information retrieval applications could also
! be built easily using the architecture we proposed):
! <ul>
! <li>Reusable components - for a certain application, a standard data exchange
! format can be proposed, which makes components using this format replaceable and reusable.
! We proposed such data exchange format for application in search results clustering -- 
! an XML-based query and search result.
! </li>
! <li>Platform and language independence - data transmission protocol in <carrot-text/>
! is fixed to HTTP POST. Almost any programming language can be adapted to work as a web server
! or in web-server enabled mode (via CGI for instance). This allows for true independence of
! components within the architecture -- they can be physically distributed and running
! on different hardware platforms for instance. Also, the language of implementation does not matter,
! so for research objectives slower, but more elegant languages can be used (Prolog, Lisp, Java?),
! while for production systems, if ever, the code can be rewritten for performance and the component
! reused without any changes to other parts of the system.
! </li>
! <li>Start your experiments fast - because <carrot-text/> comes with a set of template
! component classes and utilities in Java, a researcher can almost immediately set up his or her own
! component within the framework and start working on the specific problem of interest.
! </li>
! <li>OpenSource - <carrot-text/> is an open source initiative. It is available free of charge
! and the code base can be modified to be adjusted to one's needs.
! </li>
! </ul>
! 
! </chapter>

  </page>
--- 12,60 ----
  	<title lng="pl">Korzyści</title>

!     <lang:pl>
!     <frame>
!     Ta strona tylko w wersji angielskiej, przepraszamy.
!     </frame>
!     </lang:pl>            

! <chapter level="1">
! <title>What can a researcher gain from <carrot-text/>?</title>
! 
! A number of scientific papers concerning search results processing have been published.
! Each and every author had to go through the same tedious (and quite fruitless) tasks
! of building a search engine wrapper, incorporating stemming/light language processing
! algorithms, and finally displaying a result of the algorithm in some form.
! 
! 
! In our opinion this is highly redundant work and most of the components (wrappers, processing,
! displaying) can be reused. Thus, we think <carrot-text/> gives a number of possible 
! shortcuts for those willing to work with search result clustering or any other form
! of textual data processing (we envision that some information retrieval applications could also
! be built easily using the architecture we proposed):
! <ul>
! <li>Reusable components - for a certain application, a standard data exchange
! format can be proposed, which makes components using this format replaceable and reusable.
! We proposed such data exchange format for application in search results clustering -- 
! an XML-based query and search result.
! </li>
! <li>Platform and language independence - data transmission protocol in <carrot-text/>
! is fixed to HTTP POST. Almost any programming language can be adapted to work as a web server
! or in web-server enabled mode (via CGI for instance). This allows for true independence of
! components within the architecture -- they can be physically distributed and running
! on different hardware platforms for instance. Also, the language of implementation does not matter,
! so for research objectives slower, but more elegant languages can be used (Prolog, Lisp, Java?),
! while for production systems, if ever, the code can be rewritten for performance and the component
! reused without any changes to other parts of the system.
! </li>
! <li>Start your experiments fast - because <carrot-text/> comes with a set of template
! component classes and utilities in Java, a researcher can almost immediately set up his or her own
! component within the framework and start working on the specific problem of interest.
! </li>
! <li>OpenSource - <carrot-text/> is an open source initiative. It is available free of charge
! and the code base can be modified to be adjusted to one's needs.
! </li>
! </ul>
! 
! </chapter>

  </page>
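
Note on the architecture text in index.xml above: the component contract it describes (an Input produces an XML result stream, a Filter returns that stream unchanged apart from added custom tags, a Controller chains the components) can be sketched roughly as below. This is only an illustration under stated assumptions, not Carrot2 code: the element names (query, searchresult, document, group, member) and all function names are hypothetical, and plain function calls stand in for the HTTP POST exchange between components.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterable


def input_component(query_xml: str) -> str:
    """Input: accept a query wrapped in XML, return an XML result stream.

    The two "documents" are hard-coded stand-ins for real search results.
    """
    query = ET.fromstring(query_xml).text or ""
    root = ET.Element("searchresult")  # hypothetical root element name
    for i, title in enumerate([f"{query} overview", f"{query} tutorial"]):
        doc = ET.SubElement(root, "document", {"id": str(i)})
        ET.SubElement(doc, "title").text = title
    return ET.tostring(root, encoding="unicode")


def filter_component(result_xml: str) -> str:
    """Filter: leave the incoming documents untouched, append a custom tag.

    Here the custom tag is one trivial <group> that references every document.
    """
    root = ET.fromstring(result_xml)
    documents = root.findall("document")
    group = ET.SubElement(root, "group")  # hypothetical custom tag
    for doc in documents:
        ET.SubElement(group, "member", {"refid": doc.get("id", "")})
    return ET.tostring(root, encoding="unicode")


def output_component(result_xml: str) -> str:
    """Output: present the results; here, one title per line of plain text."""
    root = ET.fromstring(result_xml)
    return "\n".join(d.findtext("title", default="") for d in root.findall("document"))


def controller(
    query_xml: str,
    input_c: Callable[[str], str],
    filters: Iterable[Callable[[str], str]],
    output_c: Callable[[str], str],
) -> str:
    """Controller: every hop goes through this function, mirroring the
    back-and-forth data flow the page describes as simple but not optimal."""
    stream = input_c(query_xml)
    for f in filters:
        stream = f(stream)
    return output_c(stream)
```

In a real deployment each of these functions would sit behind its own HTTP endpoint, and the controller would POST the XML stream to each one in turn.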