[Dproxy-devel] some thoughts on dproxy

Hi,

I want to check in the new stuff. Matty suggested calling the current
CVS tree 1.0 and the new versions 1.x. Is it OK if I check in the new
stuff as a new top-level package 'dproxy-1.x'?

Here are some thoughts on the future development of dproxy:

Target environment:
-------------------

IMO we should expect dproxy to be used on small networks, say 1-100
hosts. Larger networks would require more complex things such as
database replication, authority splits etc.
With this in mind, we can probably keep the cache in plain text. I would
expect fewer than 300 cache entries on such a network, so the cache file
would be about 30k.

The deny file is a different matter. A friend of mine is very interested
in dproxy because of this feature. He is the admin of a school network
and wants to block some porn sites. At the moment his list has about
1500 sites. Maybe it would be better to move to a dbm, ndbm or gdbm
database for the deny file ...

With such a hash based database, we can find an entry within 2 disk
operations and we need only 3 compare operations (hidden in the dbm
lib).

On the other hand, we would have to access that database for each part
of the name, e.g. if we want to check whether 'a.b.c.d' should be
blocked, we need to check for 'a.b.c.d', 'b.c.d', 'c.d' and 'd'.
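
To make the lookup order concrete, here is a rough sketch in Python
(not dproxy code). The names candidate_names() and is_denied() are made
up for this mail, and a plain set stands in for the dbm database:

```python
def candidate_names(name):
    """Return the name itself and every parent domain, longest first."""
    labels = name.rstrip(".").lower().split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def is_denied(name, deny_db):
    """True if the name or any of its parent domains is in deny_db.

    deny_db is anything that supports the `in` operator; a set is used
    here as a stand-in for the hash-based database.
    """
    return any(candidate in deny_db for candidate in candidate_names(name))


if __name__ == "__main__":
    print(candidate_names("a.b.c.d"))
    print(is_denied("www.bad.example", {"bad.example"}))
```

So a name with n labels costs up to n database lookups, which is the
price for blocking whole domains with a single entry.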

RFC conformance
---------------

dproxy-1.x implements most of the MUST features from RFC 1123, some of
the SHOULD features, and does none of the MUST NOT things. The biggest
potential RFC violation is the maximum UDP packet size: according to
RFC 1035, a UDP message must not exceed 512 bytes. This has not been a
problem so far, though.

Additional RRs
--------------

Another issue is that we do not support all RRs specified in RFC 1035.
This is not a real problem, because most of them are rarely needed,
except MX (mail exchange) records.

I'm not sure how to implement them. If we have a dialup gateway, all
mail from the 'inside' network to the Internet should be relayed through
the gateway, so we could answer any MX query from the inside with an MX
record pointing to the gateway. But what happens on the gateway? The
mailer there needs a different MX record for its routing decisions, but
dproxy cannot (yet?) pass MX records from the outside to a client on
the local machine.

For MX queries on the internal net, we could either send an MX record
pointing to the queried host or one pointing to a mail server (the
gateway again?).

If we need to answer queries from the outside, this is similar to MX
queries from the inside. We could answer with the hostname+domain or
with the address of a mail gateway.

If we need to transfer any additional RRs from the outside to the
inside, we should also cache them. We could extend the current cache
file by adding an RR marker to each record in the cache. The new format
could be something like

  <RR> <Name> <TTL> <some data>\n

There would be a function that returns the TTL and everything from the
first non-blank character after it up to the end of the line. The
calling function would have to decide how to interpret this data.
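
As a rough illustration of such a function, here is a sketch in Python
(not dproxy code). The name parse_cache_line() is invented for this
mail, and I assume the fields are separated by blanks or tabs:

```python
def parse_cache_line(line):
    """Split '<RR> <Name> <TTL> <some data>' into its four parts.

    The data part is returned as one uninterpreted string, starting at
    the first non-blank character after the TTL. The caller decides
    what it means for the given RR type.
    """
    parts = line.strip().split(None, 3)
    if len(parts) != 4:
        raise ValueError("malformed cache line: %r" % line)
    rr, name, ttl, data = parts
    return rr, name, int(ttl), data


if __name__ == "__main__":
    print(parse_cache_line("MX speedy.example.net 0 mail2.example.net\n"))
```

Because only the first three fields are split off, blanks inside the
data (as in a TXT record) survive untouched.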

If we need to store additional RR data for local hosts, we could extend
the format of '/etc/hosts'. We could check the comments that follow a
host entry for additional data, e.g.

	192.168.1.2  speedy.example.net  speedy
	#
	#HINFO AMD-Athlon  Linux
	#MX mail.example.net
	#MX mail2.example.net

	192.168.1.1  sun01.example.net	sun01
	#
	#HINFO SPARC  Linux
	#TXT We have joy, we have fun, we have Linux on our SUN
	#

This would be stored in the cache file as

	A speedy.example.net 0 192.168.1.2
	A speedy 0 192.168.1.2
	HINFO speedy.example.net 0 AMD-Athlon Linux
	MX speedy.example.net 0 mail.example.net
	MX speedy.example.net 0 mail2.example.net
	A sun01.example.net 0 192.168.1.1
	A sun01 0 192.168.1.1
	HINFO sun01.example.net 0 SPARC  Linux
	TXT sun01.example.net 0 We have joy, we have fun, we have Linux on our SUN
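
The conversion from the extended hosts format to cache lines could look
roughly like this Python sketch (not dproxy code). The function name,
the set of recognised RR keywords, the TTL of 0 for local entries and
the rule that a comment belongs to the most recent host entry are all
assumptions of mine:

```python
KNOWN_RRS = {"HINFO", "MX", "TXT"}  # assumed set of comment keywords


def hosts_to_cache_lines(hosts_text):
    """Turn hosts-file text with '#RR data' comments into cache lines."""
    cache_lines = []
    current_host = None  # canonical name of the most recent host entry
    for raw in hosts_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # '#MX mail.example.net' -> ['MX', 'mail.example.net']
            fields = line[1:].split(None, 1)
            if current_host and len(fields) == 2 and fields[0] in KNOWN_RRS:
                cache_lines.append(
                    "%s %s 0 %s" % (fields[0], current_host, fields[1]))
            continue
        # Ordinary host entry: address, canonical name, optional aliases.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        address, names = fields[0], fields[1:]
        current_host = names[0]
        for name in names:
            cache_lines.append("A %s 0 %s" % (name, address))
    return cache_lines


if __name__ == "__main__":
    sample = "192.168.1.2  speedy.example.net  speedy\n#MX mail.example.net\n"
    for entry in hosts_to_cache_lines(sample):
        print(entry)
```

Comments that do not start with one of the keywords are ignored, so
ordinary remarks in the hosts file mostly keep working, although a
remark that happens to start with "MX" would be misread.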

Security
--------

The DNS is not a secure protocol! There is a 'spoof protection'
feature, used by gethostbyname(), that we should implement. That is, if
we query an upstream DNS for the address of a hostname, we should also
query for the hostname of that address. If they do not match, one of the
answers was spoofed - maybe.

The idea is that if the attacker is not directly on the other side of
the wire, he may not be able to spoof both answers. Not much
protection anyway ...
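
The check itself is simple. Here is a rough Python sketch (not dproxy
code); looks_spoofed() and the two lookup callbacks are invented names,
and the callbacks stand for whatever code sends the A and PTR queries
upstream:

```python
def looks_spoofed(hostname, forward_lookup, reverse_lookup):
    """Compare a forward lookup with the matching reverse lookup.

    forward_lookup(hostname) must return an address string and
    reverse_lookup(address) a list of hostnames. Returns True if the
    reverse answer does not contain the name we asked for.
    """
    address = forward_lookup(hostname)
    reverse_names = reverse_lookup(address)
    wanted = hostname.rstrip(".").lower()
    return wanted not in {n.rstrip(".").lower() for n in reverse_names}


if __name__ == "__main__":
    forward = {"speedy.example.net": "192.168.1.2"}
    reverse = {"192.168.1.2": ["speedy.example.net"]}
    print(looks_spoofed("speedy.example.net",
                        forward.__getitem__, reverse.__getitem__))
```

A mismatch does not prove an attack, since many hosts simply have no
matching PTR record, so this can only be a hint.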

There are also the things from RFC 2065 - crypto stuff and such, sounds
interesting ...

non forking server ...
-----------------------

... makes things a lot more complicated. On the other hand, with a
forking server we need to keep track of the queries to prevent forking
multiple children for a single query.
Any ideas?
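
One possible way to do the bookkeeping, as a rough Python sketch (not
dproxy code): keep a table of queries that are already being worked on
and only start a child for a query that is not in it. The class name,
the key made of client, query ID, name and type, and the timeout are
all assumptions of mine:

```python
import time


class PendingQueries:
    """Remember which queries are already being handled."""

    def __init__(self, timeout=30.0):
        self.timeout = timeout  # seconds until an entry is forgotten
        self._started = {}      # key -> time the query was first seen

    def start(self, client, query_id, name, rr_type, now=None):
        """Return True if this query is new and a child should be started."""
        now = time.monotonic() if now is None else now
        self._expire(now)
        key = (client, query_id, name.lower(), rr_type)
        if key in self._started:
            return False  # a retransmission, a child is already busy
        self._started[key] = now
        return True

    def finish(self, client, query_id, name, rr_type):
        """Forget a query once its answer has been sent."""
        self._started.pop((client, query_id, name.lower(), rr_type), None)

    def _expire(self, now):
        stale = [k for k, t in self._started.items()
                 if now - t > self.timeout]
        for key in stale:
            del self._started[key]


if __name__ == "__main__":
    pending = PendingQueries()
    client = ("192.168.1.2", 1024)
    print(pending.start(client, 7, "example.net", "A"))
    print(pending.start(client, 7, "example.net", "A"))
```

The parent would call start() when a query arrives and finish() when
the child reports back. The timeout covers children that die without
answering.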

More?
-----

What do you think MUST, SHOULD, MAY, MAY NOT, SHOULD NOT or MUST NOT
be implemented? (Hey, I read too much in those RFCs.)

Ciao
  Andreas