[Anet-checkins] CVS: Documentation/design/latex anetdesign.tex,1.4,1.5

Update of /cvsroot/anet/Documentation/design/latex
In directory usw-pr-cvs1:/tmp/cvs-serv26461

Modified Files:
	anetdesign.tex 
Log Message:
Added 2 more sections. Need to add the footnotes and references,
though...
Using the fancyhdr package for the headers, looks great!!!

Index: anetdesign.tex
===================================================================
RCS file: /cvsroot/anet/Documentation/design/latex/anetdesign.tex,v
retrieving revision 1.4
retrieving revision 1.5
diff -C2 -d -r1.4 -r1.5
*** anetdesign.tex	2002/01/03 19:02:39	1.4
--- anetdesign.tex	2002/01/05 16:46:35	1.5
***************
*** 26,30 ****
  \usepackage{url} 
  \usepackage{appendix}
! %\usepackage{fancyhdr}
  %\usepackage{pandora}    % use the pandora fonts, they seem more modern
  \usepackage{concrete}
--- 26,30 ----
  \usepackage{url} 
  \usepackage{appendix}
! \usepackage{fancyhdr}
  %\usepackage{pandora}    % use the pandora fonts, they seem more modern
  \usepackage{concrete}
***************
*** 44,48 ****

  % Let Latex write the pages headings.
! \pagestyle{headings}
  % Number all sections, but include only section and subsections in the ToC.
  \setcounter{secnumdepth}{5} \setcounter{tocdepth}{2}
--- 44,49 ----

  % Let Latex write the pages headings.
! %\pagestyle{headings}
! \pagestyle{fancy}
  % Number all sections, but include only section and subsections in the ToC.
  \setcounter{secnumdepth}{5} \setcounter{tocdepth}{2}
***************
*** 742,776 ****
  \end{verbatim}
  \end{fmpage}
! \caption{The Cluster Group Modules DTD} \label{fig:cgdtd}
  \end{figure}

  \section{Cluster Filter Modules} \label{sec:clus}
! Define data filtering for a cluster.

  \section{ANet Core} \label{sec:ac}
! Does the actual data duplication within a cluster.  Contains the
! query, static data and TWDT modules.

  \section{Handshaking Protocol Modules} \label{sec:hpm} 
! Define what and how connections should be made and manage resumed
! connections.

  \section{Packet Protocol Modules} \label{sec:ppm}
- Define the way the data will be streamed between two nodes.

  \section{Connection Protocol Modules} \label{sec:cpm}
- Define how network connections, from a low-level point of view, should
- be made between two nodes.

  \section{Bandwidth Manager Module} \label{sec:bbm}
- Monitors the network connections to maintain bandwidth statistics.

  \section{Document Type Definition (DTD)} \label{sec:dtd}
- This is the DTD file for the ANet configuration files.

- \clearpage
- \part{Low Level Design}
  \clearpage
! \pagenumbering{roman}
  \appendix
  \appendixpage
--- 743,1083 ----
  \end{verbatim}
  \end{fmpage}
! \caption{The Cluster Group Modules DTD} \label{fig:cgmdtd}
  \end{figure}

  \section{Cluster Filter Modules} \label{sec:clus}
! The Cluster Filter Modules are used to filter incoming and outgoing
! data in a cluster.
! 
! \subsection{Design}
! \subsubsection{Role of the Filter Modules}
! The Cluster Filter Modules simply filter data coming in or out of a
! cluster. There could be various reasons why one would want to filter
! some packets, and this kind of module easily allows developers to do
! that.
! 
! The filtering can be based on the data contained in the packets, the
! service number, or based on the information given by the Bandwidth
! Manager.
! 
! Also, a Filter Module can decide to force the disconnection of an ANet
! connection to the network if that connection is not ``following
! the rules'' defined by the Filter Module\footnote{The ``origin or
! destination'' part of the AIP, when inside a Filter Module, is a
! connection ID.}. Thus, the Filter Modules have the very important role
! of defining the security requirements of a cluster; any connection
! that does not follow the security requirements of the Filter Module
! should eventually be disconnected.
! 
! \subsubsection{Input/Output for the Filter Modules}
! As input, the Filter Modules will have a list of the AIPs \xs{sec:cgm} that can
! be filtered and a list of the connection IDs that want to send or
! receive that data.
! 
! As output, you have to produce two lists. The first list is a list of
! indexes of the AIPs you want to delete. For example, if the fifth AIP
! in the input list of AIPs has to be deleted, then you add ``5'' to the
! list of AIP indexes to be deleted\footnote{Thus, you don't have to
! fill a list of the unfiltered AIPs. Anyway, that would be too
! awkward.}. The second list is a list of connection IDs to be deleted.
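
As an illustration of the two output lists, here is a minimal Python sketch of a filter based on the service number. The `Aip` class, the `filter_by_service` name and the `max_violations` threshold are all hypothetical; they are not part of ANet and only mirror the description above.

```python
from dataclasses import dataclass


@dataclass
class Aip:
    """Hypothetical stand-in for an AIP as seen by a Filter Module."""
    service: int        # service number of the packet
    connection_id: int  # origin or destination, as a connection ID
    payload: bytes


def filter_by_service(aips, banned_services, max_violations=2):
    """Return (AIP indexes to delete, connection IDs to delete).

    Indexes are 1-based, so the fifth AIP is reported as 5.
    The max_violations policy is an assumption made for this sketch.
    """
    delete_indexes = []
    violations = {}
    for index, aip in enumerate(aips, start=1):
        if aip.service in banned_services:
            delete_indexes.append(index)
            count = violations.get(aip.connection_id, 0) + 1
            violations[aip.connection_id] = count
    # Only connections that broke the rule often enough are dropped.
    delete_connections = sorted(
        cid for cid, count in violations.items() if count >= max_violations
    )
    return delete_indexes, delete_connections
```

Only the indexes of the packets to drop are returned, so the input list itself never has to be copied.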
! 
! \subsubsection{Other Functions}
! One other function has to be implemented in the Filter Modules, while
! another one is optional, but can be very useful for the daemon.
! 
! The first function, the required one, gives as output its own
! definition of the security requirements. This definition consists of a
! ``definition kind'' and the definition itself. The definition will be
! used by the Handshaking Protocol Modules to tell the other node
! what the security requirements of the cluster are.
! 
! The second function, the optional one, is used to do ``forced''
! filtering. The function will be called by the daemon when it finds out
! that too much data is trying to get out at the same time. Without this
! function, the daemon will delete random packets, which might not be
! very good if some packets can be considered as more important than
! others. The input and output for this function are similar to those of
! the ``main'' function of the Filter Modules, but it additionally
! receives both the total size of the AIP list and the size, in bytes,
! of what needs to be deleted.
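
A possible shape for the ``forced'' filtering function is sketched below in Python. The per-packet `priorities` list is an assumption (the text only says that some packets can be more important than others), and `forced_filter` is a hypothetical name.

```python
def forced_filter(sizes, priorities, bytes_to_delete):
    """Pick the AIPs to drop when too much data is waiting to go out.

    sizes[i] is the size in bytes of the i-th AIP and priorities[i] its
    importance (lower means less important; an assumed convention).
    Returns the 1-based indexes to delete, least important first, until
    at least bytes_to_delete bytes are freed or the list is exhausted.
    """
    # Least important packets first; ties keep their input order.
    order = sorted(range(len(sizes)), key=lambda i: (priorities[i], i))
    freed = 0
    chosen = []
    for i in order:
        if freed >= bytes_to_delete:
            break
        chosen.append(i + 1)
        freed += sizes[i]
    return sorted(chosen)
```

Compared with deleting random packets, this keeps the most important packets as long as possible.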
! 
! \subsection{Implementation Notes}
! The Filter Modules do not have to be very complex. The easiest example
! is filtering by the service number. Actually, it is recommended that
! all Filter Modules implement this kind of filtering.  Since Filter
! Modules can filter based on more than one logic, the security
! requirements are actually a list of pairs, each made of a definition
! kind and the definition itself.  Don't put too much information in the
! security requirements. For example, if you filter using keywords,
! giving a list of hundreds of thousands of ``banned'' words would be a
! bad idea. The daemon will impose a tight limit on the total size of
! all security requirements of a Filter Module.
! 
! \subsection{DTD}
! See the complete DTD for more information.
! \begin{figure}[!h]
! \begin{fmpage}{\textwidth}
! \begin{verbatim}
! 
! <!ELEMENT ClusterFilter EMPTY>
! <!ATTLIST ClusterFilter %moduleName; %args; %security;>
! \end{verbatim}
! \end{fmpage}
! \caption{The Cluster Filter Modules DTD} \label{fig:cfmdtd}
! \end{figure}

  \section{ANet Core} \label{sec:ac}
! The Core Modules perform the actual data distribution within a
! network.
! 
! \subsection{Design}
! \subsubsection{Why Core Modules?}
! We need to distribute data somewhere, so we need the Core Modules,
! right? Actually, if data distribution is so important, how come the
! most important part of the ANet Protocol resides in external modules?
! 
! There is a simple answer: the rules of distribution themselves can be
! changed, and new distribution methods could be added. For example,
! let's say that we just found a way to optimize the distribution of
! Static Data.  Then, the only thing we need to install is a newer
! version of the Static Data Module. No painful version upgrade is
! needed.
! 
! \subsubsection{Input/Output of the Core Modules}
! The ``main'' function of each module is called with both a list of
! packets to distribute and a list of commands that were called by the
! clients. The list of packets is destroyed between each function
! call. Commands will also be deleted, but only if you mark them as ``was
! used''. All ``used'' commands will be deleted, while the other ones will
! be sent in a similar manner to the other Core Modules. Unused commands
! will finally be deleted if they were left unused by all the Core
! Modules.(1)
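
The life cycle of the commands can be sketched as follows in Python. This is only an illustration: each Core Module is modelled as a hypothetical callable that returns `True` when it marks a command as ``was used''.

```python
def dispatch_commands(commands, core_modules):
    """Offer each client command to the Core Modules, in order.

    A command marked as used by a module is deleted right away; the
    others are passed on to the next module. Whatever is returned was
    left unused by every module and can finally be deleted.
    """
    pending = list(commands)
    for module in core_modules:
        pending = [cmd for cmd in pending if not module(cmd)]
    return pending
```

For example, with one module that only uses commands starting with `query` and another that only uses those starting with `twdt`, any other command comes back as unused.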
! 
! The input packets are AIPs. There is one buffer for the whole cluster,
! thus it is up to the Core Module to figure out to which connections
! the packets should be re-transmitted.
! 
! As output, the Core Module will have to produce one buffer per
! connection ID. The output packets must be ANet External Packets (AEP),
! which are totally platform independent. The daemon will provide some
! functions to transform an AIP to an AEP, though the modules might have
! to set some values by themselves(2).
! 
! \subsubsection{Core Modules}
! Here is a high-level description of the Core Modules that will be
! implemented for the first versions of ANet.(3)

+ Query Module
+ 
+ This module will implement queries[1] for ANet. A query packet is sent
+ to a connection only if the same packet was not sent recently and if
+ the ``Time-to-Live'' value of the packet is not 0. Also, the packet must
+ not be re-sent to the connection that originally sent the packet.
+ 
+ To do that, the Query Module will keep a list of the packets recently
+ sent, in the form of a simple checksum(4) and the origin of each
+ packet.
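
The forwarding rule of the Query Module could look like the Python sketch below. It assumes CRC-32 (`zlib.crc32`) as the ``simple checksum'' and never expires old entries; both are simplifications, and the `RecentQueries` class is hypothetical.

```python
import zlib


class RecentQueries:
    """Remembers the checksum and origin of recently sent query packets."""

    def __init__(self):
        # checksum -> connection ID the packet originally came from
        self._origin_by_checksum = {}

    def targets(self, payload, ttl, origin, connections):
        """Return the connection IDs the query packet should be sent to."""
        if ttl == 0:
            return []  # the packet has expired
        checksum = zlib.crc32(payload)
        if checksum in self._origin_by_checksum:
            return []  # the same packet was sent recently
        self._origin_by_checksum[checksum] = origin
        # Never send the packet back to the connection it came from.
        return [cid for cid in connections if cid != origin]
```

A real module would also have to drop old entries from the list, otherwise ``recently'' would mean ``ever''.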
+ 
+ Static Data Module
+ 
+ This module will implement Static Data[2] in ANet. It will mostly use
+ the hard disk to store the data, though it might need a larger cache
+ than the Query Module to keep its indexes and recently used static
+ packets.
+ 
+ TWDT Module
+ 
+ This module will implement Anonymous Two-Way Data Transfers (TWDT[3])
+ in ANet. Its input comes only from the client commands[4] made for the
+ TWDT module.
+ 
+ This module will need to use the Query Module through Inter-Module
+ Communication[5], even though the caching of the queries it needs to
+ produce is handled by the TWDT Module itself.
+ 
+ Managing Connections
+ 
+ As input, the modules will always have a list of the connections and
+ their states. That list is kept in the parsed configuration
+ file[5]. Here are the possible states.
+ 
+ \begin{description}
+ \item[Open] The connection is open, thus it will accept input data and produce output data.
+ \item[On Hold] The connection cannot do any input or output, but will soon either become Open again or be Closed.
+ \item[Closed] The connection is closed forever. It will soon be removed from the connection list.
+ \end{description}
+ 
+ The Core Modules cannot, by themselves, change the state of a
+ connection. Only Cluster Filter Modules[6] and Handshaking Protocol
+ Modules can do so.
+ 
+ \subsection{Implementation Notes}
+ Be careful implementing Core Modules. If they don't work, nothing will
+ happen. The entire ANet daemon depends on the implementation of the
+ Core Modules (hence the name ``Core'').  The Core Modules are not
+ limited to the three basic Core Modules that will be first implemented
+ (Query, Static and TWDT). If you want the distribution to be based on
+ other rules, then you should implement new Core Modules.
+ 
+ \subsection{DTD}
+ See the complete DTD for more information.
+ 
+ \begin{figure}[!h]
+ \begin{fmpage}{\textwidth}
+ \begin{verbatim}
+ <!-- Core Modules. Simply contains a list of CoreModule elements. -->
+ <!ELEMENT CoreModules (CoreModule)+>
+ <!ATTLIST CoreModules %security;>
+ 
+ <!-- Core Module -->
+ <!ELEMENT CoreModule EMPTY>
+ <!ATTLIST CoreModule %moduleName; %args; %security;>
+ \end{verbatim}
+ \end{fmpage}
+ \caption{The Core Modules DTD} \label{fig:cmdtd}
+ \end{figure}
+ 
+ 
+ 
+ % Notes
+ 
+ % (1) The Run-Time wrapper[5] will allow you to create as many new commands as you want. Actually, commands that are unrecognized by the
+ % wrapper will be sent to the Core Modules.
+ 
+ % (2) For example, the "Time-to-Live" exists only in the AEPs (for query AEPs), so it is up to the Query Module to fill the value. Note that the
+ % format of the AEPs will be very similar to the format of the AIP, though without memory alignment and with strict rules for byte ordering.
+ 
+ % (3) The modules were already described in the development introduction. They don't really have a high-level design, and the low-level design
+ % will be covered during development. That's because it's too simple (Query Module) or too complex (Static Data and TWDT Module) to be
+ % worth the time investment. Don't worry, the modules will be thoroughly documented once a stable implementation is done.
+ 
+ % (4) By simple, I mean easy to compute. For example, MD5[7] checksums would take too much CPU, and aren't very good for small data anyway.
+ 
+ 
+ % References
+ 
+ % About the references...
+ 
+ % [1] Benad: "Queries". Local link.
+ % [2] Benad: "Static Data". Local link.
+ % [3] Benad: "Anonymous Two-Way Data Transfers". Local link.
+ % [4] Benad: "Client Connection Modules". Local link.
+ % [5] Benad: "Run-Time Wrapper". Local link.
+ % [6] Benad: "Cluster Filter Modules". Local link.
+ % [7] Network Working Group: "The MD5 Message-Digest
+ % Algorithm". External link.
+ 
  \section{Handshaking Protocol Modules} \label{sec:hpm} 
! The goal of the Handshaking Protocol Modules is to provide and
! maintain network connections to the Core Modules[1].
! 
! \subsection{Design}
! \subsubsection{Role of the Handshaking Protocol Modules}
! The Handshaking Protocol Modules are there to provide an abstraction
! of network connections. From the point of view of the Core Modules,
! connections are identified by an ID, and Input/Output is done through
! buffers that will be given to the Handshaking Protocol Modules. So, it
! is up to the Handshaking Protocol Module to provide the IDs and to
! figure out which actual protocol they point to.

+ Thus, it is up to the Handshaking Module to initiate and close the
+ actual connections. This includes the need to exchange information
+ about the distribution protocol itself, including the information
+ taken from the Filter Modules[2], hence the name \emph{Handshaking
+ Protocol}.
+ 
+ It is important to note that it is not the role of the Handshaking
+ Protocol to define how the connections are identified, both internally
+ and in the configuration file. This is required, as many modules
+ need to ``know'' what the connections are, without being forced to
+ ``know'' how the Handshaking Protocol works.
+ 
+ \subsubsection{Input/Output for the Handshaking Protocol Modules}
+ Basically, the entire Input and Output is in the parsed configuration
+ file[3]. The goal of the Handshaking Protocol is to try to make
+ connections as described in the current configuration, and to update
+ the configuration with the current state of the connections (Open,
+ Closed and On Hold[1]).
+ 
+ The current configuration also contains Memory Tags[3] for both Input
+ and Output buffers for each connection. This might be needed for some
+ Handshaking Protocols, though it is not the goal of the Handshaking
+ Protocol to produce or filter data in those buffers.
+ 
+ \subsubsection{Connections}
+ So, what do the connections consist of? To be able to properly identify
+ and use a connection in ANet, here's what is needed.
+ 
+ \begin{description}
+ \item[Connection ID] The unique ID that identifies this connection. Read
+   only for all modules except the Handshaking Modules.
+ \item[Connection State] The current state of the connection. It can be
+   ``Open'', ``Closed'' or ``On Hold''[1].
+ \item[Input Buffer] A buffer of AEPs that came from the network on that
+   connection.
+ \item[Output Buffer] A buffer of AEPs that will go to the network on
+   that connection.
+ \item[Packet Protocol Name] The name of the Packet Protocol Module for
+   that connection.
+ \item[Connection Protocol Name] The name of the Connection Protocol
+   Module for that connection.
+ \item[Network ID] A unique identifier for the Connection Protocol that
+   identifies the other end of the connection. For example, using
+   TCP/IP[4], this is the IP address and port of the other computer.
+ \item[Subnet] This is the ``network subnet'' of the connection. Usually,
+   this has a value representing ``the internet''. This is useful only
+   when the Connection Protocol can handle routers and ``knows'' the
+   difference between an internet and an intranet connection.
+ \end{description}
+ 
+ All of these values will be inside the current configuration settings.
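
The values listed above can be pictured as one record per connection. The Python sketch below is only an illustration: the field names, the types and the default ``On Hold'' state are assumptions, not the actual layout of the ANet configuration.

```python
from dataclasses import dataclass, field
from enum import Enum


class ConnectionState(Enum):
    OPEN = "Open"
    ON_HOLD = "On Hold"
    CLOSED = "Closed"


@dataclass
class Connection:
    """Hypothetical record holding what identifies a connection."""
    connection_id: int        # unique ID, set by the Handshaking Module
    network_id: str           # e.g. IP address and port when using TCP/IP
    packet_protocol: str      # name of the Packet Protocol Module
    connection_protocol: str  # name of the Connection Protocol Module
    subnet: str = "internet"
    # Assumed default: a connection starts On Hold until it is established.
    state: ConnectionState = ConnectionState.ON_HOLD
    input_buffer: list = field(default_factory=list)   # AEPs from the network
    output_buffer: list = field(default_factory=list)  # AEPs to the network

    def accepts_data(self):
        """Only an Open connection takes input or produces output."""
        return self.state is ConnectionState.OPEN
```

Keeping the state next to the buffers makes it easy for other modules to check a connection before writing to it.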
+ 
+ \subsection{Implementation Notes}
+ Use the ``On Hold'' status value when you are in the process of starting
+ or resuming a connection. Otherwise, the rest of the daemon and the
+ clients will keep sending data to the connection.  You're not forced
+ to send the security rules to establish a connection, though not doing
+ so might result in unexpected disconnections, instead of initially
+ deciding to not connect because the current security rules cannot be
+ followed.  You can write the Handshaking Protocol to ask, before the
+ beginning of the connection, for a list of the Network IDs that are
+ connected to the other side of the connection to do some ``network
+ discovery'', though it is not required. If you do that, you can
+ implement a connection system similar to Gnutella[5] or Freenet[6].
+ 
+ \subsection{DTD}
+ See the complete DTD for more information.
+ 
+ \begin{figure}[!h]
+ \begin{fmpage}{\textwidth}
+ \begin{verbatim}
+ <!-- Handshaking Protocol. Contains a list of initial connections
+ (live connections once opened) -->
+ <!ELEMENT HandshakingProtocol (Connection)*>
+ <!ATTLIST HandshakingProtocol %moduleName; %args; %security; minConnections CDATA "0"
+  maxConnections CDATA "-1">
+ \end{verbatim}
+ \end{fmpage}
+ \caption{The Handshaking Protocol Modules DTD} \label{fig:hpmdtd}
+ \end{figure}
+ 
+ % References
+ 
+ % About the references...
+ 
+ % [1] Benad: "Core Modules". Local link.
+ % [2] Benad: "Cluster Filter Modules". Local link.
+ % [3] Benad: "Run-Time Wrapper". Local link.
+ % [4] University of Southern California: "Transmission Control Protocol". External link.
+ % [5] Semi-Official Gnutella Web Site. External link.
+ % [6] Freenet Web Site. External link.
+ 
  \section{Packet Protocol Modules} \label{sec:ppm}

  \section{Connection Protocol Modules} \label{sec:cpm}

  \section{Bandwidth Manager Module} \label{sec:bbm}

  \section{Document Type Definition (DTD)} \label{sec:dtd}

  \clearpage
! %\part{Low Level Design}
! %\clearpage
! %\pagenumbering{Roman}
  \appendix
  \appendixpage