Egregore Blog

Secure Anonymous P2P Chat

Status: Planning

Brought to you by: nucleardecay

Router Location Code and Eclipse Attack

A recent commit carried the comment "Changed the router location code to use Kademlia Node IDs as search keys to eliminate the dependence on a single key id for routers." I'd like to explain what that means.

The first thing to understand is that Kademlia can be vulnerable to an Eclipse Attack, in which an attacker is able to block every query for a single key in the DHT. What I had originally done was to publish router contact details (IP address and port) under a single key. If an attacker went after this key, the Egregore client would not be able to find any routers to connect to, effectively shutting down the entire network. This was clearly not acceptable.

A couple of ideas were kicked around for what to do about this, and I didn't really like any of them. Then it occurred to me that the software already had a way to find other computers running it. The DHT node already has the infrastructure for this: one DHT node can find others by sending find node messages to known peers. The assumption here is that each DHT node is also a router, and (except for the standalone DHT program used for testing) that should be true. So assuming that a DHT connection can be made, the DHT network can be 'crawled' to find more nodes, and therefore more routers.
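The crawl idea can be sketched in a few lines of Python. This is only an illustration, not the Egregore code: `crawl` and `toy_network` are made-up names, and `find_node` here simply returns the peers a node knows about, whereas a real Kademlia find node request names a target ID and returns the closest contacts to it.

```python
from collections import deque

def crawl(bootstrap, find_node):
    """Breadth-first crawl: ask every known node for the peers it knows."""
    known = set(bootstrap)
    queue = deque(bootstrap)
    while queue:
        node = queue.popleft()
        for peer in find_node(node):
            if peer not in known:
                # A newly discovered node is itself asked for peers later.
                known.add(peer)
                queue.append(peer)
    return known

# Toy network (hypothetical): node ID -> peers it returns for a find node request.
toy_network = {1: [2, 3], 2: [4], 3: [1], 4: [5], 5: []}
found = crawl([1], lambda node: toy_network.get(node, []))
```

Starting from a single known peer, the crawl reaches every node that is connected to it through some chain of peers, which is what lets one working DHT connection turn into a list of candidate routers.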

So the thought was that the software could loop through the list of known Kademlia Node IDs and query them to see if any were routers. The only problem was that if the node IDs themselves were used as the keys under which the router contact details are stored, subsequent searches for routers could only return a limited subset of all possible routers, and that represents a risk. So instead of a router storing its contact info at its own node ID, it salts the node ID with publicly known data. This means that for any node ID, it is possible to calculate a new ID where the router contact info is actually stored. For example, if my client already knows node 1 and wants the router info for node 1, it takes the ID 1, adds the salt data and gets ID 1+Salt. When the key 1+Salt is queried, the Kademlia locating code brings the Kademlia client into contact with more nodes. This means that each attempt to find routers based on node IDs has more node IDs to work from than the one before it.
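Here is a rough Python sketch of the salted lookup. Several things in it are my assumptions for illustration and not the actual Egregore code: the salt value, combining ID and salt by hashing them with SHA-1 (chosen only because it yields a 160-bit value, the same size as a Kademlia ID), and the `lookup` callback, which stands in for a DHT query and returns both the stored router contacts and the node IDs encountered along the way.

```python
import hashlib

# Hypothetical, publicly known salt; the real value is not given here.
ROUTER_SALT = b"egregore-router"

def router_key(node_id: bytes, salt: bytes = ROUTER_SALT) -> bytes:
    """Derive the key where a node's router info is stored (ID 1 -> ID 1+Salt)."""
    return hashlib.sha1(node_id + salt).digest()

def find_routers(known_ids, lookup):
    """Query the salted key of every known node ID.

    lookup(key) is assumed to return (router_contacts, node_ids_seen).
    Node IDs seen during a lookup are fed back in as new search keys.
    """
    routers = set()
    known = set(known_ids)
    pending = list(known)
    while pending:
        node_id = pending.pop()
        contacts, seen = lookup(router_key(node_id))
        routers.update(contacts)
        for other in seen:
            if other not in known:
                known.add(other)
                pending.append(other)
    return routers, known
```

Because anyone can compute the salted key from a node ID, there is one storage key per router instead of one key for all of them, and every lookup widens the set of node IDs available for the next pass.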

This method means that repeated queries for router contact information could eventually return every router on the whole network. It also makes an eclipse attack on this part of the network pretty much impossible, since an attacker would need to run 8 nodes for every non-compromised node on the network.

Cool, huh?

Posted by 2012-02-14