anet-devel Mailing List for ANet (Page 4)

I uploaded a multi-client IPC system at:
ftp://anet.sourceforge.net/pub/anet/IPC_Examples/Benad/

- Benad

Hi-
I have updated the IPC functions from the FTP site to do better
error-management. As I can't seem to figure out how to upload the
updated versions to the FTP site, you can email me for them.
--Quentin

I uploaded Ajaya's IPC code to the FTP site
(ftp://anet.sourceforge.net/pub/anet/), even though I still don't fully
understand how the semaphores work...

- Benad

You can see some sample IPC code (message transfers) for Linux there:
ftp://anet.sourceforge.net/pub/anet/IPC_Examples/

Those were successfully compiled on Linux and Solaris.

- Benad

Here's a list of the man pages and functions we should stick to for IPC:
ftok
msgctl
msgget
msgrcv
msgsnd
semctl
semget
semop
shmat
shmctl
shmdt
shmget
stat (section 2 on Linux)

Actually, that's because I have some trouble with Linux on my machine and
I'll have to compile my code under Solaris and FreeBSD for now...

- Benad
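
As a rough illustration of the message-queue calls in the list above
(msgget, msgsnd, msgrcv, msgctl), here is a minimal sketch in C. It sends
one message through a private queue and reads it back inside a single
process; two real processes would instead agree on a key, for example
through ftok. The names anet_msg, ANET_MSG_MAX and anet_msg_roundtrip are
made up for this example and are not taken from the code on the FTP site.

```c
#define _XOPEN_SOURCE 700   /* ask for the System V IPC declarations */
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define ANET_MSG_MAX 128    /* hypothetical payload limit */

/* System V messages start with a long type field, followed by the payload. */
struct anet_msg {
    long mtype;
    char mtext[ANET_MSG_MAX];
};

/* Sends text through a fresh private queue and copies what comes back
   into out. Returns 0 on success, -1 on any failure. */
int anet_msg_roundtrip(const char *text, char *out, size_t out_size)
{
    struct anet_msg snd, rcv;
    size_t len;
    ssize_t got;
    int qid;

    assert(text != NULL && out != NULL);
    len = strlen(text) + 1;                 /* include the terminator */
    if (len > ANET_MSG_MAX || len > out_size)
        return -1;

    qid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    if (qid == -1)
        return -1;

    snd.mtype = 1;                          /* the type must be > 0 */
    memcpy(snd.mtext, text, len);
    if (msgsnd(qid, &snd, len, 0) == -1) {
        msgctl(qid, IPC_RMID, NULL);
        return -1;
    }

    got = msgrcv(qid, &rcv, sizeof rcv.mtext, 1, 0);
    msgctl(qid, IPC_RMID, NULL);            /* always remove the queue */
    if (got != (ssize_t)len)
        return -1;

    memcpy(out, rcv.mtext, len);
    return 0;
}
```

The queue is removed with IPC_RMID on every path, because System V queues
outlive the process that created them.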

Check out section 5 of the 'ipc' manpage in Linux ("man 5 ipc").
We should stick to non-Linux specific system calls, as this will make
porting to other UNIX systems easier (System V?).

Here's a tentative roadmap for the coding of version 1 of ANet:

1. Get the Client library to work, even if the daemon does nothing.
2. Make the data transfer and TCP/IP modules in the daemon, establish
   simple connections.
3. ANet network connection & partial network map (you remember, the network
   should be "flat"...)
4. Queries.
5. Static data objects.
6. Two-way data transfer.
7. Final tests, basic (very basic) security tests.

For each "milestone", we should do some testing before moving to the next
milestone.

Is that OK with everyone?

OK. I'm back to my reading on IPC...

- Benad

I posted some *very* preliminary code. Actually only declarations...

ftp://anet.sourceforge.net/pub/anet/

Those header files are at the base of ANet, so you should comment about
their contents as soon as you can.

If you change them, let me know. I'm not using CVS yet.

- Benad

As recommended by Ajaya, I'll start a bit on the code.
Actually, just some basic cross-platform header files and some function
declarations.

Most importantly, I'll start the specific design of the "ANet Client SDK".
Basically, it's just a library that will do the actual process-to-process
communication so that you don't have to care about how that works on a
specific OS, making cross-platform client coding (especially Java clients)
much easier.

And I'm already having a problem! An example: Pascal Strings. The format is
simple:

unsigned char length;
char string[length];

No "NULL" character is needed at the end of the string, though Pascal
strings cannot be longer than 255 bytes.

Now, try to make a C structure out of this. One MAJOR limitation of
structures is that the size of the structure, in bytes, MUST be known at
compile-time. And with ANet, I'm trying to do things like this:

unsigned char nbrServices;
ServiceNbr services[nbrServices];

Obviously, you can do "malloc", then play inside that allocated memory
(whose size is known at run-time) using some clever typecasting:

//i is the number of services, known at run-time
//servicesArray holds the service numbers, also known at run-time
//needs <stdlib.h> for malloc and <string.h> for memcpy
unsigned char *myStruct = malloc(sizeof(unsigned char)+i*sizeof(ServiceNbr));
unsigned char *curPos = myStruct;
int j;
*curPos = (unsigned char)i;
curPos += sizeof(unsigned char);
for (j = 0; j < i; j++)
{
    //memcpy, not a cast: curPos may not be aligned for a ServiceNbr
    memcpy(curPos, &servicesArray[j], sizeof(ServiceNbr));
    curPos += sizeof(ServiceNbr);
}

While, in theory, this code is 100% cross-platform and totally unaffected
by any memory alignment optimizations, it's a real pain in the a** to
code...

The danger here is that "clever" programmers may think they have just found
an easier way to do this, while they are actually breaking everything.
For example, doing this might screw up your data (memory alignment, again):
((ServiceNbr*)curPos)[j] = servicesArray[j];

So, any easier way to do this?

- Benad
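
One possible way to tame the packing problem in the message above is to
hide the byte juggling behind a pair of helper functions, so that client
code never casts into the buffer itself. The sketch below is only an
illustration: ServiceNbr is assumed to be an unsigned short, and
pack_services and unpack_service are hypothetical names, not part of the
posted header files. Values are stored in host byte order, which is only
safe between processes on the same machine.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned short ServiceNbr;   /* assumption: the real type may differ */

/* Builds "count byte + count service numbers" in a freshly allocated buffer.
   Returns NULL if the allocation fails; the caller frees the buffer. */
unsigned char *pack_services(const ServiceNbr *services, unsigned char count,
                             size_t *out_size)
{
    size_t payload = (size_t)count * sizeof(ServiceNbr);
    unsigned char *buf = malloc(1 + payload);

    if (buf == NULL)
        return NULL;
    buf[0] = count;
    if (count > 0)
        memcpy(buf + 1, services, payload);  /* byte copy: no alignment needed */
    *out_size = 1 + payload;
    return buf;
}

/* Reads the j-th service number back without casting buf to ServiceNbr*. */
ServiceNbr unpack_service(const unsigned char *buf, unsigned char j)
{
    ServiceNbr value;

    assert(j < buf[0]);
    memcpy(&value, buf + 1 + (size_t)j * sizeof(ServiceNbr), sizeof value);
    return value;
}
```

Because every access goes through memcpy, the buffer can start at any
address, including the odd offset left by the one-byte count.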

Benoit Nadeau wrote:

> To whoever submitted the docs in SourceForge, thanks!
>
> (I think we're now at 40% for the Software Conception task...)
>
> - Benad
>
> _______________________________________________
> ANet-devel mailing list
> ANe...@li...
> http://lists.sourceforge.net/mailman/listinfo/anet-devel

Hi-
That was me. It seems to work fine for me (don't know what the trouble
mentioned before was).
--Quentin

Hi-
I hadn't realized it, but I had selected Netscape compatibility by accident. I will
upload a new, HTML 4.0 compliant version shortly.
--Quentin
Benoit Nadeau wrote:

> >Hi-
> >I just put up a really scratchy version of what the web site might look
> >like so you can comment on the look and feel.
> >http://anet.sourceforge.net/indexnew.html
>
> That was weird. So, any coder on ANet can ssh1 to upload web pages? Cool!
>
> I see that you're using Freeway. While it's good, it's not fully HTML 3.2
> compliant. Look at:
>
> http://validator.w3.org/check?uri=http%3A%2F%2Fanet.sourceforge.net%2Findexnew.html
>
> ( http://www.w3c.org/ is THE only official reference for html)
>
> What I suggest is that we use WYSIWYG HTML editors to quickly make a
> working web site first, then I (or Chris) will clean up the junk those
> editors make.
>
> One last thing: we'll need to decide which fonts to use for what, just to
> be consistent. I personally don't like that "Western" font for "ANet"...
>
> - Benad

>Hi-
>I just put up a really scratchy version of what the web site might look
>like so you can comment on the look and feel.
>http://anet.sourceforge.net/indexnew.html

That was weird. So, any coder on ANet can ssh1 to upload web pages? Cool!

I see that you're using Freeway. While it's good, it's not fully HTML 3.2
compliant. Look at:

http://validator.w3.org/check?uri=http%3A%2F%2Fanet.sourceforge.net%2Findexnew.html

( http://www.w3c.org/ is THE only official reference for html)

What I suggest is that we use WYSIWYG HTML editors to quickly make a
working web site first, then I (or Chris) will clean up the junk those
editors make.

One last thing: we'll need to decide which fonts to use for what, just to
be consistent. I personally don't like that "Western" font for "ANet"...

- Benad

>    Although I agree that downloading a list of files and browsing it
>locally will help to eliminate extensive file searches, we should address
>other types of content.  If ANet is going to be a backbone for a
>distributed network then there are going to be other types of content
>floating around the network.  Two examples in which people will actively
>need to look for something... (my apologies if these have already been
>addressed)
>
>1.  Finding a user/machine/node for chat on the network
>2.  Finding services, or content by service type
>
>    The second one is important because of the new functionality this
>network type could handle.  Examples:  distributed computing, large-scale
>transfers, and distributed network-based simulations.  I personally already
>have some new ideas for services.

Tell us. You might not have enough time to code them all, and other coders
might be interested later.

>>I don't think streaming content (more than just text)
>>would work properly with the current way ANet is defined...
>
>    Although not necessary now, I would like to see room for this addition
>later.  If you want to know specifics as to why, or some of my ideas how,
>I'll gladly discuss it, but I don't feel it's necessary at this point.  Too
>many other more important things to discuss.

We have to determine now what should be the relative maximum bandwidth
allowed for each type of data transfer (see Part 3). So, we have to
determine right now how streaming data can be handled by the current
mechanism and most importantly how this is going to affect the speed of the
network.

>Second - I'd like to dig into the services on top of ANet thing.  How are
>we working the transition from the ANet to a machine with a usable service
>on it?  We could rework the data passing through ANet to something
>recognizable by existing services on a program to program basis, or we
>could do it by an entire sheet of data and let the service take only the
>data it normally takes.  How though are we going to keep track of what
>services are available?

ANet is a protocol, and thus it doesn't need to care about the content of
the data. If a client has registered with the ANet daemon for a service
number, then all data passing through the ANet node will be duplicated to
the client. The client shouldn't have any control over the passing of data.

> We could design a custom type of search, or we could include service
>information right in the file list.

It's up to each service to have its own protocol (on top of ANet's) to
find which other nodes have the same service running. This is a bit like
TCP/IP, where you can't ask for a list of all open ports, and don't really
need to.

>    Personally, I could really get to like the idea that services could be
>browsed much like files.  I also feel it would help in the situation I
>mentioned above about searching.  I still feel that a mechanism for
>searching for users is going to be necessary though, but trying to find
>every node is a lot less congesting to a network than finding every *.mp3
>on every node of the network.  Also, finding a user is a lot more like a
>community than finding a file.  However, this brings up the issue of
>knowing who is really who, and how secure the data received from a node
>actually is.  Benad mentioned Public Keys, and I think this may be a
>solution.  This is something we'll have to discuss.

...But not right now. Security is something very important, but simply
making ANet work is, I think, more important. So, for now, if you receive
screwed up queries, duplicate static data or disconnected data transfers,
then live with it.

>Third - I checked out that article and it had some pretty good statistics
>in it.  The whole site actually has some good information on it, it was my
>first time there.  Definitely worth checking out, thanks.

This means for the others: READ IT! ;-)

I'm off trying to post the docs in SourceForge and starting to write ANet's
introduction web page.

- Benad

To whoever submitted the docs in SourceForge, thanks!

(I think we're now at 40% for the Software Conception task...)

- Benad

Hi-
I just put up a really scratchy version of what the web site might look
like so you can comment on the look and feel.
http://anet.sourceforge.net/indexnew.html
--Quentin

    Sorry for the long pause in replies.  I'm moving 1000 miles in less than
two weeks, and I have a lot going on (cleaning, packing, etc.).  I have been
keeping up to date however, and have a lot of things that I'd like to
mention.  I'll try to comment as things come to mind.

First - referencing Benad's comments
>My example of "Passive File System" is much better, I think, because it's
>much like FTP: there is NO searching system at all. You just wait for file
>lists, distributed as static data, and once those file lists are on your
>computer, you search them on your computer, not on the network. And I
>personally prefer file browsing to file leeching... (See last chapter in
>Part 1 of the docs)

    Although I agree that downloading a list of files and browsing it
locally will help to eliminate extensive file searches, we should address
other types of content.  If ANet is going to be a backbone for a distributed
network then there are going to be other types of content floating around the
network.  Two examples in which people will actively need to look for
something... (my apologies if these have already been addressed)

1.  Finding a user/machine/node for chat on the network
2.  Finding services, or content by service type

    The second one is important because of the new functionality this 
network-type could handle.  Examples:  distributed computing, large scale 
transfers, and distributed network based simulations.  I personally already 
have some new ideas for services.

>I don't think streaming content (more than just text)
>would work properly with the current way ANet is defined...

    Although not necessary now, I would like to see room for this addition
later.  If you want to know specifics as to why, or some of my ideas how,
I'll gladly discuss it, but I don't feel it's necessary at this point.  Too
many other more important things to discuss.

Second - I'd like to dig into the services on top of ANet thing.  How are we
working the transition from the ANet to a machine with a usable service on
it?  We could rework the data passing through ANet to something recognizable
by existing services on a program to program basis, or we could do it by an
entire sheet of data and let the service take only the data it normally
takes.  How though are we going to keep track of what services are available?
We could design a custom type of search, or we could include service
information right in the file list.
    Personally, I could really get to like the idea that services could be
browsed much like files.  I also feel it would help in the situation I
mentioned above about searching.  I still feel that a mechanism for searching
for users is going to be necessary though, but trying to find every node is a
lot less congesting to a network than finding every *.mp3 on every node of
the network.  Also, finding a user is a lot more like a community than
finding a file.  However, this brings up the issue of knowing who is really
who, and how secure the data received from a node actually is.  Benad
mentioned Public Keys, and I think this may be a solution.  This is something
we'll have to discuss.

Third - I checked out that article and it had some pretty good statistics in
it.  The whole site actually has some good information on it, it was my first
time there.  Definitely worth checking out, thanks.

                    jmitchell

Here's a good report on Gnutella's performance:
http://dss.clip2.com/gnutella.html

Note that making the network "flat" and gateways would really help avoid
this kind of "bandwidth barrier".

BTW, since ANet will allow Anonymous to Anonymous two-way data transfers,
it could be possible to have ANet on top of ANet...

Bye bye, bandwidth... ;-)

- Benad

>Okay, let's look at what we're up against as far as network optimization.
>Some real quick thoughts here about what we have to avoid...
>
>Searches -
>Overly expansive searches without a mechanism to handle them. Ex. *.mp3
>Searches that take too long to play out. Ex. Searching for too many hops
>would create extra pressure on the nodes that are in direct contact
>with the receiver.

My example of "Passive File System" is much better, I think, because it's
much like FTP: there is NO searching system at all. You just wait for file
lists, distributed as static data, and once those file lists are on your
computer, you search them on your computer, not on the network. And I
personally prefer file browsing to file leeching... (See last chapter in
Part 1 of the docs)

No one should be allowed to create a query with too many hops, as this will
be considered as spamming and your node will be disconnected from the
network. There is already a mechanism to "plant" a query in the middle of
the network with gateways (Part 3 in the docs, actually a previous posting
in this mailing list).

>Services -
>Bandwidth hungry services. Ex. Streaming content that is not properly shared.

That's a tough one. I don't think streaming content (more than just text)
would work properly with the current way ANet is defined...

>Abusive use of services. Ex. File Leeching from a single source.

Again partly resolved with passive file browsing.

>Inappropriate service. Ex. Running too much on too little.

I think that could be considered as a form of spamming (even if it isn't,
like a big, fat 100KB/sec streaming video), as it is abusing the
bandwidth of the network. No?

- Benad

    Okay, let's look at what we're up against as far as network optimization. 
 Some real quick thoughts here about what we have to avoid...

Searches - 
Overly expansive searches without a mechanism to handle them.  Ex.  *.mp3
Searches that take too long to play out.  Ex.  Searching for too many hops
would create extra pressure on the nodes that are in direct contact with
the receiver.

Services - 
Bandwidth hungry services.  Ex.  Streaming content that is not properly 
shared.
Abusive use of services.  Ex.  File Leeching from a single source.  
Inappropriate service.  Ex.  Running too much on too little.

    There's more, but these are some quick ones I thought I'd throw out for
discussion.  Let's look for "great" solutions while we're still planning.  My
next post will probably be some of my own ideas about how to conquer these
beasts, but for now I must be off.

                                                    John M.

John Mitchell wrote:
>Let's talk for a moment about the types of data that will be passed by
>ANet. Now floating around will be chunks of data... typically what files
>are available, etc. However this doesn't have to be the only use for this
>data. Service data could be scuttled around in the same manner. For
>instance, instead of browsing a list of files, how about browsing a list
>of what services a person has available. Here's an example: due to the
>need to optimize a distributed network, a person may not be getting a view
>wide enough to find the content they want, but let's say a server within a
>user's range is operating a service to stretch the search to the end of
>its range, or maybe a service to filter out irrelevant data to find very
>specific information (a search that has other aspects written into it than
>those that a normal user GUI would have). I guess these would be similar
>to applets, except more for optimization than anything else.

In my opinion, the way the network works should never, ever change, even if
some service would need that. This is because the way data is distributed
on the network is within the protocol and shouldn't care about what
services are running on top of it. What I'm trying to say is that since the
protocol will "accept" any services, someone could easily spam the network
with his/her "service" that would contain only random data and that would
"need" more network bandwidth.

So, you're 100% right when you say "a service to filter out irrelevant data
to find very specific information", in the sense that a node has the
liberty to filter the data or to increase its range (by increasing the
"hops to live"). But how much of that should be allowed (before your node
is considered a "spamming" node) is very tricky...

>Now let's assume, every user of ANet is running some type of server
>(regardless of whether it's actually serving anything or not). What they
>could actually have is a very powerful backbone for operating other
>services.
>Now if each node can be upgraded as easily as installing a patch or
>recompiling with new modules it will be extremely functional and useful.
>Let's say one user operates a node, but it's primarily for private use (he
>uses it to hold private text conversations) he should be able to upgrade
>his node to act as an extended chat server.
>Another example, let's say a person wants to operate a sort of web server
>with again very private content. A person should be able to do that the
>same way. The important thing to remember is that ANet's foundation now
>will determine how easily a transition can be made to new functionality
>later. There's going to need to be some serious conversation to work out
>details for the whole process, and I'm all about finding new ways.

We'll need to think "in advance" for things like public key encryption so
that it will be easy to add that in the future, for example.

>I think the Core Module is going to have to be left very open for the sake
>of later improvements. There isn't enough room to play with things in the
>TCP/IP module, because that would require finding a new way of doing
>everything for every OS ANet runs off of. I don't know exactly how you see
>the layout for the high priority tasks, but when I read about them I
>assumed that the Direct On Port Data Transfer would be right along side
>the Core Module in program operation.

Direct On Port Data Transfer is just an abstraction layer between the Core
and TCP/IP to make sure that the Core doesn't need to care about the TCP/IP
connections (TCP/IP module), and the connection process and which port to
use (Data Transfer modules). So, Direct On Port Data Transfer only chooses
a port, initiates the connection with the help of the TCP/IP module and
sends

>I do have a question about the GUI Module though, will that module be in
>charge of collecting information for a client application, or is it the
>client application itself? I think the former would definitely push others
>to develop their own interpretations of a client, but the latter is going
>to be necessary in any testing phases... even if it does get recycled down
>the line.

In the end, I think that ANet should be only a daemon, running as a
background process (or whatever they call that in Windows). I recently made
a posting on the mailing list about that. The clients can only specify
which service numbers they're interested in (like port numbers), and insert
data on the network for their own service. But the actual caching, for
static data for example, is handled entirely by the ANet daemon. So, we
won't need to do any GUI for the "core" of ANet (Core, Data Transfer,
TCP/IP), while the clients (GUI and Services) will handle the rest. The
"Core" doesn't need a very sophisticated GUI anyways (for debugging
purposes). Some settings files and a very basic output with stdio (log file
or console) should be enough. I know it's not very user friendly, but this
is pretty much like other UNIX daemons anyways: play with the settings
files, launch the daemon, and don't care about the rest...

(We should move this discussion to the mailing list...)

- Benad
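
The daemon-side bookkeeping described above (clients naming the service
numbers they are interested in, much like port numbers) could be sketched
roughly as follows. Everything here is hypothetical: the structure names,
the fixed table size and the use of an int as a client handle are
assumptions made for illustration, not part of any posted ANet header.

```c
#include <assert.h>
#include <stddef.h>

#define ANET_MAX_REGISTRATIONS 32   /* hypothetical fixed table size */

/* One (service number, client) pair known to the daemon. */
struct anet_registration {
    unsigned short service;
    int client_id;                  /* hypothetical handle for a local client */
};

struct anet_registry {
    struct anet_registration entries[ANET_MAX_REGISTRATIONS];
    size_t count;
};

/* Records that client_id wants data for service. Returns 0, or -1 if full. */
int anet_register(struct anet_registry *r, unsigned short service,
                  int client_id)
{
    assert(r != NULL);
    if (r->count == ANET_MAX_REGISTRATIONS)
        return -1;
    r->entries[r->count].service = service;
    r->entries[r->count].client_id = client_id;
    r->count++;
    return 0;
}

/* Collects up to out_cap clients registered for service; returns how many. */
size_t anet_clients_for(const struct anet_registry *r, unsigned short service,
                        int *out, size_t out_cap)
{
    size_t k, n = 0;

    assert(r != NULL && out != NULL);
    for (k = 0; k < r->count && n < out_cap; k++) {
        if (r->entries[k].service == service)
            out[n++] = r->entries[k].client_id;
    }
    return n;
}
```

When data for a service number arrives at the node, the daemon would look
up the matching clients and hand each one a copy; the clients themselves
never touch the routing.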

Part 3: Network Optimization 
 
 
 
1. Distributed Networking Speed Limitations
The basic idea of distributed networking is to have an unstructured
network. The reason why it has to be unstructured is that if the network is
structured, the network will be easier to attack as a whole. For example,
take Napster. Its structure is simple: a big, fat server in the middle, and
everyone (the "clients") is connected to that server. If the server shuts
down, then the whole network stops working. While many different network
structures exist, trying to restructure the network without having control
over the individual nodes is complicated and would easily allow someone to
break the network apart.

The network has to stay anarchic, for the security and the anonymity of the
network.

So, you have to take extra steps to make sure the network is as
decentralized and as flat as possible. But this will make the time for a
query to be distributed to the whole network very high. Thus, we have to
find different ways to make the network as fast as possible, without
changing its structure.

2. Bandwidth Management
To optimize the transfer between two nodes, we could try to "manage" the
available bandwidth between the two nodes.

Firstly, the total bandwidth used between two nodes must be the same in
both directions. Otherwise, too much data would get in compared to the data
coming out, and the node will be forced to delete some data. While you
can't totally avoid this situation, for example when a node is connected
to two other nodes with different bandwidths, we must avoid deleting data
as much as possible.

2.1 Bandwidth distribution between different nodes
When a node wants to connect to another node, the other node will try to
allocate as much bandwidth as it can. Thus, the other node will answer with
"I have X1 upload and Y1 download", and your node will reply with "X2
download, Y2 upload". The smallest of X1, X2, Y1, and Y2 is chosen as the
maximum amount of data flowing in both directions.

If you connect to two or more nodes, then the amount of bandwidth allocated
to each node is in proportion to the transfer speeds. For example, if node
A is X1 and node B is X2, then node A will have X1/(X1+X2) of your
bandwidth. The values of X1 and X2 are the maximum amounts of data in
both directions that you could handle with A and B if you were connected
only to A or only to B.
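
The two rules above (take the smallest of the four advertised values, then
split your own bandwidth between peers in proportion to their rates) can be
sketched in C as follows. The function names and the choice of unsigned
long for a rate are assumptions made for illustration; the docs do not
define units or types.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the smallest of the four advertised rates: the symmetric limit
   both nodes agree to use in each direction. */
unsigned long anet_negotiate_rate(unsigned long x1, unsigned long y1,
                                  unsigned long x2, unsigned long y2)
{
    unsigned long rate = x1;

    if (y1 < rate) rate = y1;
    if (x2 < rate) rate = x2;
    if (y2 < rate) rate = y2;
    return rate;
}

/* Fraction of the local bandwidth given to peer number which, where
   rates[k] is the rate that could be sustained with peer k alone. */
double anet_peer_share(const unsigned long *rates, size_t n, size_t which)
{
    unsigned long total = 0;
    size_t k;

    assert(rates != NULL && which < n);
    for (k = 0; k < n; k++)
        total += rates[k];
    if (total == 0)
        return 0.0;                 /* no usable peer at all */
    return (double)rates[which] / (double)total;
}
```

With two peers this reduces to the X1/(X1+X2) formula given in the text.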

The bandwidth between two nodes can be re-negotiated at any time. If you
send a different amount of data than what was negotiated, then the
connection will be dropped. If you have less data to send than the
negotiated amount, you must send dummy data to prove that the bandwidth has
not changed. If the internet connection becomes slower and packets of data
start to get lost, then both nodes should try to re-negotiate the
bandwidth to a lower value.

2.2 Bandwidth distribution between two nodes
Basically, the data transferred between two nodes falls into 3 categories:
queries, static data and data transfers. Each category should be limited to
a certain proportion of the maximum bandwidth. For example, let's say
that queries have a weight of 2, static data 4 and data transfers 6. Then,
queries cannot take more than 2/12 of the bandwidth, static 4/12 and
transfers 6/12. But then, let's say that data transfers are not used. Then,
queries could take up to 2/6 of the bandwidth. Let's say that data
transfers only use 2/12 of the bandwidth. Then, queries could take up to
(2/6)*(10/12) of the bandwidth.
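
One way to read the arithmetic above: categories that use less than their
share give the remainder back, and the categories that still want
bandwidth split what is left according to their weights. The sketch below
implements that reading in C. It is an interpretation of the example, not
a specified algorithm, and anet_category_cap and its conventions (a
negative entry in used meaning "wants as much as it can get") are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* weights[k]: negotiated weight of category k.
   used[k]:    fraction of the link that category k currently uses, or a
               negative value if k wants as much as it can get.
   Returns the largest fraction of the link that category which may take. */
double anet_category_cap(const double *weights, const double *used,
                         size_t n, size_t which)
{
    double contending_weight = 0.0;  /* weights still competing for the link */
    double spoken_for = 0.0;         /* fraction taken by light users */
    size_t k;

    assert(weights != NULL && used != NULL && which < n);
    for (k = 0; k < n; k++) {
        if (k == which || used[k] < 0.0)
            contending_weight += weights[k];
        else
            spoken_for += used[k];
    }
    if (contending_weight <= 0.0)
        return 0.0;
    return weights[which] / contending_weight * (1.0 - spoken_for);
}
```

With weights 2, 4 and 6 this gives 2/12 for queries when everything is
busy, 2/6 when data transfers are idle, and (2/6)*(10/12) when data
transfers use only 2/12 of the link, matching the three cases in the text.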

The proportions used between nodes could be negotiated, but both nodes have
to fully accept the values. (Benad: how could this negotiation work?
Could it be abused?)

3. Network Management
The goal is to make the network as "flat" as possible. To do so, a node
should connect to the node with the lowest ping time it can find.

So, at the beginning, you connect to any node on the network that you want
to connect to. Then, you ask that node for the list of IPs that are connected
to it. You then compare the ping time of that node with those of the nodes
that are connected to it, and you move to the node with the lowest ping
time. You repeat that until you have found the smallest ping time. Then, you
should try to connect directly to that node or around it.

This will make sure that the network will be as "flat" as possible, both
for the network and geographically.
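
The walk described in this section is a greedy descent on ping time. A
minimal sketch in C, run over an in-memory table instead of a real network,
might look like this. The anet_node structure, ANET_MAX_PEERS and
anet_find_closest are invented for the example; a real node would measure
pings and ask peers for their IP lists instead of reading an array.

```c
#include <assert.h>
#include <stddef.h>

#define ANET_MAX_PEERS 8            /* hypothetical limit for the sketch */

/* What we know about one node: our ping time to it and who it talks to. */
struct anet_node {
    unsigned ping_ms;
    size_t peer_count;
    size_t peers[ANET_MAX_PEERS];   /* indices into the same node table */
};

/* Starting from start, keeps moving to the neighbour with the lowest ping
   until no neighbour is closer than the current node. */
size_t anet_find_closest(const struct anet_node *nodes, size_t start)
{
    size_t current = start;

    assert(nodes != NULL);
    for (;;) {
        size_t best = current;
        size_t p;

        for (p = 0; p < nodes[current].peer_count; p++) {
            size_t candidate = nodes[current].peers[p];
            if (nodes[candidate].ping_ms < nodes[best].ping_ms)
                best = candidate;
        }
        if (best == current)
            return current;         /* no closer neighbour: stop here */
        current = best;             /* ping strictly drops, so this ends */
    }
}
```

Like any greedy search, this can stop at a node that is only locally best:
a start node whose neighbours are all farther away is returned as-is even
if a closer node exists elsewhere in the network.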

4. Gateways
Usually, queries should have a limited "life" on the network by having a
value "hops to live". It is reduced by 1 each time you copy it to another
node in the network, and when it gets to 0, you shouldn't copy it anymore.
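
The "hops to live" rule above amounts to a few lines of C. This is only a
sketch: the anet_query structure here carries nothing but the counter, and
both names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Only the field needed for the sketch; a real query carries more. */
struct anet_query {
    unsigned char hops_to_live;
};

/* Prepares the copy of in that would be sent to a neighbour.
   Returns 1 and fills out (with one hop fewer) if the query may still be
   copied, or 0 if its hops to live has already reached 0. */
int anet_query_copy(const struct anet_query *in, struct anet_query *out)
{
    assert(in != NULL && out != NULL);
    if (in->hops_to_live == 0)
        return 0;                   /* stop here: do not forward */
    *out = *in;
    out->hops_to_live--;
    return 1;
}
```

A query created with a value of 2 is therefore copied twice along any one
path and then stops.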

Some nodes should allow others to establish a two-way data transfer to give
you one or more queries. Then, those queries should be distributed in the
network as if the node that received them from the transfer were the node
that produced it. That node (the one that received the queries from the
transfer) is called a "Gateway".

Gateways will send very low priority queries to the whole network informing
it where the proxies for their two-way data transfers are, so that they can
receive queries. Thus, you start sending queries around you, then, if that
didn't work, you try again starting in another part of the network. This will
allow nodes to "plant" their queries in distant places in the network
directly, and will promote queries with short "hops to live" values, which
can really speed up the network and make it more efficient.

- Benad

I don't know why, but I can't add a new documentation page in sourceforge. Can anyone else try?

Anyways, I'll post it here.

----------------BEGIN DOCS---------------
Part 3: Network Optimization<BR>
<BR>
<BR>
<BR>
1. Distributed Networking Speed Limitations<P>
The basic idea of distributed networking is to have an unstructured network. The reason why it has to be unstructured is that if the network is structured, the network will be easier to attack as a whole. For example, take Napster. Its structure is simple: a big, fat server in the middle, and everyone (the "clients") is connected to that server. If the server shuts down, then the whole network stops working. While many different network structures exist, trying to restructure the network without having control over the individual nodes is complicated and would easily allow someone to break the network apart.<P>
<P>
The network has to stay anarchic, for the security and the anonymity of the network.<P>
<P>
So, you have to take extra steps to make sure the network is as decentralized and as flat as possible. But this will make the time for a query to be distributed to the whole network very high. Thus, we have to find different ways to make the network as fast as possible, without changing its structure.
<P>
2. Bandwidth Management<P>
To optimize the transfer between two nodes, we could try to "manage" the available bandwidth between the two nodes.
<P>
Firstly, the total bandwidth used between two nodes must be the same in both directions. Otherwise, too much data would get in compared to the data coming out, and the node will be forced to delete some data. While you can't totally avoid this situation, for example when a node is connected to two other nodes with different bandwidths, we must avoid deleting data as much as possible.
<P>
2.1 Bandwidth distribution between different nodes<P>
When a node wants to connect to another node, the other node will try to allocate as much bandwidth as it can. Thus, the other node will answer with "I have X1 upload and Y1 download", and your node will reply with "X2 download, Y2 upload". The smallest of X1, X2, Y1, and Y2 is chosen as the maximum amount of data flowing in both directions.
<P>
If you connect to two or more nodes, then the amount of bandwidth allocated to each node is in proportion to the transfer speeds. For example, if node A is X1 and node B is X2, then node A will have X1/(X1+X2) of your bandwidth. The values of X1 and X2 are the maximum amounts of data in both directions that you could handle with A and B if you were connected only to A or only to B.
<P>
The bandwidth between two nodes can be re-negotiated at any time. If you send a different amount of data than what was negotiated, then the connection will be dropped. If you have less data to send than the negotiated amount, you must send dummy data to prove that the bandwidth has not changed. If the internet connection becomes slower and packets of data start to get lost, then both nodes should try to re-negotiate the bandwidth to a lower value.
<P>
2.2 Bandwidth distribution between two nodes<P>
Basically, the data transferred between two nodes falls into 3 categories: queries, static data and data transfers. Each category should be limited to a certain proportion of the maximum bandwidth. For example, let's say that queries have a weight of 2, static data 4 and data transfers 6. Then, queries cannot take more than 2/12 of the bandwidth, static 4/12 and transfers 6/12. But then, let's say that data transfers are not used. Then, queries could take up to 2/6 of the bandwidth. Let's say that data transfers only use 2/12 of the bandwidth. Then, queries could take up to (2/6)*(10/12) of the bandwidth.
<P>
The proportions used between nodes could be negotiated, but both nodes have to fully accept the values. (Benad: how could this negotiation work? Could it be abused?)
<P>
3. Network Management<P>
The goal is to make the network as "flat" as possible. To do so, a node should connect to the node with the lowest ping time it can find.
<P>
(DOCUMENT NOT FINISHED YET)

--------------END DOCS------------------

- Benad

I'm having a problem here. Having ANet as a library is a great idea, but it
allows more than one instance of the library to execute at the same time on
the same machine. Only one node on the network should be created per
machine, not per service.

So, ANet should be a daemon, right?

There would still be a library, simply to communicate with the daemon. This
is because process-to-process communication is very OS-specific, and having
a library that does the job of sending/receiving the data from the ANet
daemon would really help the programmers.

Here's how it could work:

Service A   <--->  ANet lib.   \
Service B   <--->  ANet lib.   |
Service C   <--->  ANet lib.   |<---> ANet daemon <---> Network
...         <--->  ANet lib.   |
Service Z   <--->  ANet lib.   /

(you liked my ascii art?)
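
The layout in the diagram can be mocked up in a few lines. This is only an in-process simulation, assuming plain Python queues in place of real OS-level IPC; the class names Daemon and ClientLib are invented for the sketch:

```python
import queue
from typing import Dict, Tuple


class Daemon:
    """The single per-machine process that owns the one network node."""

    def __init__(self) -> None:
        # Everything the services send ends up in this one outbound queue.
        self.outbound: "queue.Queue[Tuple[str, bytes]]" = queue.Queue()
        self.inboxes: Dict[str, "queue.Queue[bytes]"] = {}

    def register(self, service: str) -> "ClientLib":
        if service in self.inboxes:
            raise ValueError(f"service {service!r} already registered")
        self.inboxes[service] = queue.Queue()
        return ClientLib(self, service)

    def deliver(self, service: str, data: bytes) -> None:
        """Route data arriving from the network to one local service."""
        self.inboxes[service].put(data)


class ClientLib:
    """What a service would link against; hides how the daemon is reached."""

    def __init__(self, daemon: Daemon, service: str) -> None:
        self._daemon = daemon
        self._service = service

    def send(self, data: bytes) -> None:
        self._daemon.outbound.put((self._service, data))

    def receive(self) -> bytes:
        return self._daemon.inboxes[self._service].get_nowait()


daemon = Daemon()
service_a = daemon.register("A")
service_b = daemon.register("B")
service_a.send(b"hello")
service_b.send(b"world")
daemon.deliver("A", b"reply")
print(service_a.receive())  # b'reply'
```

However many services register, there is still exactly one Daemon object, so only one node per machine.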

I'll start writing parts 3 and 4 today.

- Benad

>Hi,

Hi! ... ? We can skip the formalities now...

>The point is the entry point of the data into the network.

Oops! Completely forgot about that!

When I first started to think about security in ANet, I thought about
protecting the network, not making the entry of data in the network
anonymous. The reason why I really wanted anonymity, in the sense that you
can't trace back the origin of the data once it's in the network, is that
it allows the network to work (hehe!) without any static IP address. Hence
the "philosophy" behind ANet, in the docs.

Protecting the entry point is perfect for the paranoid, but is not the
reason why we should make the network "anonymous". I'm not making this for
Warez anyways. So, if everyone's OK with it, anything related to encryption
should be coded after version 1.

>The originator of
>the message is hidden by the use of the public key encryption. When you
>send something encrypted with the public key only the receiver can decrypt
>it. It is not known who sent the message. Once the last node has received
>the message it is then introduced into the network - so the originator is
>not known.
>
>I'll illustrate what I mean with an example.
>
>There is a chain A -> B -> C -> D.
>
>A wishes to send data to D. A takes the data and encrypts it first with
>D's public key, then C's then B's and sends it to B.
>
>B decrypts the data and sees that the encrypted message is for C and
>therefore forwards it.
>
>C decrypts the data and sees that the message is for D and forwards it.
>
>D decrypts the data and then broadcasts it to the whole network with no
>way of telling the originator of the message.
>
>Unless B, C & D are malicious and all working in concert, no one can tell
>where the message originates from. Dummy packets will have to be sent to
>prevent traffic analysis.

Uh... This idea is great to "plant" a query of some static data in the
middle of the network (as a one way data flow), but can't be used in two
way data flow. The other side of the proxy doesn't know the full path to
the actual destination, so the proxy can't know which public keys to use.

So, when you "plant" your query in the network, network sniffing may
identify you as the origin, but no one can know what the data is, and once
the data is in the network, everyone can see the data, but you can't be
traced back anymore!

Hmmmm... Your idea is great after all!

>In your scheme how is A kept anonymous from B? You do not cover this in
>your documentation. If someone is sniffing the network between A and B
>they will see when a query originates from A and therefore know what A is
>looking for.

There's always a way to "sniff" that some node is producing some data, and
encryption simply stops the third party from knowing what the data is. I
think it's obvious that it's more important to protect the network as a
whole than individual nodes, as it's easier (and cheaper) to attack a
protocol than to attack a specific node. That's why I view encryption as
"optional".

Anyways, between A and B, A sends the data to B the same way as if A
received the data from someone else (IP to IP: they know the IPs of each
other). So, you'd need to compare the output of A on all its connections
with its input to find what A is producing.

BTW, Gnutella is pretty happy without encryption, and I haven't even heard
rumors about individual nodes being attacked. At least non-warez nodes.

>Without public key encryption how will this information be hidden? UDP is
>blocked by most firewalls. If used it would drastically reduce the
>usefulness of the application. So I assume that you would be using TCP
>which means that adjacent nodes would have to know each other's IP
>addresses. With that information you can find out who owns that address
>and then the ISP's logs can be used to find out who you are.

And unless you're an ISP, knowing someone's IP address is pretty useless.
Unless you want to try a DoS (Denial of Service) attack... Even encryption
will not stop the others from knowing your IP address as your position in
the network.

UDP? What's that? ...
RFC 768, right? (http://www.freesoft.org/CIE/RFC/768/index.htm)

.... (reading)

There are ports in UDP. So it doesn't define the behavior of the data. :-(

>You have no way of knowing which nodes are malicious and therefore you
>cannot just simply switch to another node.

I'm currently thinking about some way to "test" the nodes to see if they're
malicious or not.

>I'd recommend that you get a copy of 'Applied Cryptography' by Bruce
>Schneier (ISBN: 0471117099). It contains many different explanations of
>security analysis and techniques. I don't know anyone that hasn't found it
>useful.

I see this book recommended by everyone. Is it that good?

>PS The RSA patent expired on the 20th September this year. OpenSSL
>contains an industrial strength, open source, implementation of the RSA
>algorithm.

WHAT? Great!

>PPS Export restrictions have been drastically reduced. Have a look at:
>http://www.mozilla.org/crypto-faq.html All legal issues can be
>circumvented by using developers in countries without export restrictions.

I'm still not impressed by just 56 bits. But for ANet, it should be enough...

So, keeping encryption for version 2: OK or not?

(wow. that email was huge...)

- Benad

Hi,

The point is the entry point of the data into the network.  The originator of
the message is hidden by the use of the public key encryption.  When you send
something encrypted with the public key only the receiver can decrypt it.  It
is not known who sent the message.  Once the last node has received the message
it is then introduced into the network - so the originator is not known.

I'll illustrate what I mean with an example.

There is a chain A -> B -> C -> D.

A wishes to send data to D.  A takes the data and encrypts it first with D's
public key, then C's then B's and sends it to B.

B decrypts the data and sees that the encrypted message is for C and therefore
forwards it.

C decrypts the data and sees that the message is for D and forwards it.

D decrypts the data and then broadcasts it to the whole network with no way of
telling the originator of the message.

Unless B, C & D are malicious and all working in concert, no one can tell
where the message originates from.  Dummy packets will have to be sent to
prevent traffic analysis.
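
The layering in the A -> B -> C -> D example can be sketched as below. Big caveat: to stay self-contained this uses a toy XOR cipher with one shared key per node, which is NOT public-key encryption and NOT secure; a real version would swap in each node's public key (e.g. RSA). All function names here are made up for the illustration:

```python
import hashlib
from typing import Dict, List, Tuple


def _keystream(key: bytes, n: int) -> bytes:
    """Stretch `key` into n pseudo-random bytes by hashing a counter."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def toy_crypt(key: bytes, data: bytes) -> bytes:
    """Toy stand-in cipher: XOR with a keystream (same call undoes itself)."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def wrap(payload: bytes, path: List[str], keys: Dict[str, bytes]) -> bytes:
    """Encrypt for the last hop first, then for each earlier hop in turn."""
    blob = toy_crypt(keys[path[-1]], b"FINAL|" + payload)
    for hop, next_hop in zip(reversed(path[:-1]), reversed(path[1:])):
        blob = toy_crypt(keys[hop], next_hop.encode() + b"|" + blob)
    return blob


def peel(node: str, blob: bytes, keys: Dict[str, bytes]) -> Tuple[str, bytes]:
    """Remove one layer; return (next hop or 'FINAL', remaining data)."""
    header, _, rest = toy_crypt(keys[node], blob).partition(b"|")
    return header.decode(), rest


keys = {"B": b"key-b", "C": b"key-c", "D": b"key-d"}
blob = wrap(b"some query", ["B", "C", "D"], keys)   # done by A
for node in ["B", "C", "D"]:
    next_hop, blob = peel(node, blob, keys)
    print(node, "->", next_hop)
print(blob)  # b'some query'
```

Each node only learns the next hop, never the full path, which is the property the example relies on.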

In your scheme how is A kept anonymous from B?  You do not cover this in your
documentation.  If someone is sniffing the network between A and B they will
see when a query originates from A and therefore know what A is looking for.
Without public key encryption how will this information be hidden? UDP is 
blocked by most firewalls. If used it would drastically reduce the usefulness
of the application.  So I assume that you would be using TCP which means that
adjacent nodes would have to know each other's IP addresses.  With that
information you can find out who owns that address and then the ISP's logs
can be used to find out who you are.

You have no way of knowing which nodes are malicious and therefore you cannot
just simply switch to another node.

I'd recommend that you get a copy of 'Applied Cryptography' by Bruce Schneier
(ISBN: 0471117099).  It contains many different explanations of security
analysis and techniques.  I don't know anyone that hasn't found it useful.

thanks,

- Dale

PS The RSA patent expired on the 20th September this year.  OpenSSL contains an
industrial strength, open source, implementation of the RSA algorithm.

PPS Export restrictions have been drastically reduced.  Have a look at:
http://www.mozilla.org/crypto-faq.html  All legal issues can be circumvented by
using developers in countries without export restrictions.


>How else can you enter information into the network anonymously?  I was
>thinking of a similar approach using recursively wrapped RSA encryption.  I
>was going to add some features that would make the network harder to spam.

Distribute the data like a query. Remember, there is no trace, or more precisely no way for a node to ask another node: "Where did this come from?". There is no backtrace as in FreeNet. This is the very basic (and brilliant) idea behind the queries in Gnutella, and why ANet is a kind of distributed networking.
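
A small sketch of what "distribute like a query" could look like, assuming Gnutella-style flooding where each message has an ID used only to drop duplicates. The Node class and the ID scheme are assumptions for illustration, not ANet's actual design:

```python
from typing import List, Optional, Set


class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        self.neighbors: List["Node"] = []
        self.seen: Set[str] = set()

    def receive(self, msg_id: str, query: str, sender: Optional["Node"]) -> None:
        """Forward to every neighbor except the one that just sent it.

        The message carries no origin field: `sender` is only the
        adjacent node, which may itself be a mere relay.
        """
        if msg_id in self.seen:
            return  # already handled, drop the duplicate
        self.seen.add(msg_id)
        for peer in self.neighbors:
            if peer is not sender:
                peer.receive(msg_id, query, self)

    def originate(self, msg_id: str, query: str) -> None:
        # The originator treats its own query exactly like a relayed one.
        self.receive(msg_id, query, None)


# Ring of four nodes: A - B - C - D - A
a, b, c, d = (Node(n) for n in "ABCD")
for left, right in [(a, b), (b, c), (c, d), (d, a)]:
    left.neighbors.append(right)
    right.neighbors.append(left)

a.originate("msg-1", "looking for something")
print(all("msg-1" in n.seen for n in (a, b, c, d)))  # True
```

From C's point of view the query arrived from B or D; nothing in the message says whether that neighbor created it or just passed it along.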

BTW, RSA is not free, not open source and not exportable at all (unless you stick with 56 bits). Avoid it like the plague.

>How would you solve the problem?  In the scheme you have proposed (2.7
>Anonymous two-way data flow) you must trust the proxies for anonymity to be
>maintained.  For example the proxy you connect to could be malicious and would
>know who you are communicating with.  Without using encryption how would this
>be avoided?

No. Proxies cannot know either where the data comes from at its origin or where its final destination is. There is no backtracing! In the example (A B C D...E F G H, if I remember correctly), D and E know the exact IP of each other, but they can NEVER know that A and H are the end points in the data flow. They only know the existence of the previous node (C or F) in the chain, and that's it. I think I explained all this in the docs...

Anyways, if the proxies are trying to "screw up" the data, just re-establish the connection with other proxies. And if you don't want the other nodes to peek at your data, encrypt your file before sending it. Isn't that something obvious? That's why I didn't even think about writing this in the docs.

Please, stop reading the docs of FreeNet. ANet is so different that it will confuse you. It seems that you assume there is some kind of backtracing like in FreeNet or IP. Also, you seem to not really understand what makes ANet anonymous: each node behaves the same way with the data, so that there is no way to distinguish the originator from all the other nodes in the network. True, everyone knows the contents of your query, but who cares, no one knows where it comes from! The exact same idea is used with static data. Again, static data is NOT for files. You keep your files on your hard disk, and that's it.

IP addresses of the originator or the destination are never, ever known. IP addresses for the nodes, the proxies and the gateways (I'll talk about this in part 3, later this week) are known, but that still doesn't give any hint as to where the data comes from or is destined.

After all, ANet is totally unlike FreeNet and very similar, in its basic idea, to Gnutella. So, reading docs about gnutella might help you understand what I mean.

- Benad


anet-devel — General development mailing list. Only ANet developers can post.