dproxy-devel Mailing List for dproxy - caching DNS proxy (Page 2)
Brought to you by:
mattpratt
From: Andreas H. <hof...@in...> - 2000-02-11 02:41:43
|
Hi, I just committed some new files for the cache handling. Everything is untested and not yet integrated into the main module. Changes are:

- parsing of cache entries is now in cache_data.[ch]
- cache file stuff is in cache_file.[ch]
- cache2.[ch] is the new cache interface.
- WARNING: the format of the cache entries has changed! Be careful if you try anything.

Anybody out there who wants to implement the linked list stuff? We need this for the next steps in the cache redesign (refresh & memory cache). See my last posting on that.

Another topic: some of the files in the CVS are not necessary, e.g. Makefile.in, Makefile, configure etc., because they can be recreated by autoconf & automake. Having them in the CVS could cause some confusion. I want to remove them, but that means every one of you needs the tools. Please try 'make maintainer-clean', then 'automake --foreign Makefile' and 'autoconf'. This will recreate all files from scratch. Please tell me if this works for you, so I can remove the files from the repository. Ciao Andreas |
From: Andreas H. <hof...@in...> - 2000-02-10 17:58:56
|
Benjamin Close wrote:

> Else we fork off the child to do the dns, since it may timeout. If we have the ip/name the delay shouldn't be that noticeable so I think this would work well.

I just moved the fork out of the main loop up to 'handle_query' to see what happens. I need to change some more little things to move the fork into the handler function. I see no problem here.

> If we find the entry in the disk cache and move it to memory we should probably also remove it/mark it invalid in the disk cache. We may end up with data inconsistency otherwise.

I think it is not necessary to mark entries on disk, because we already store the expiration time on disk. We should ignore expired entries in the disk cache instead.

> > 6. We do not refresh entries in the disk cache.
>
> What happens to expired entries in the disk cache? On next reference we move them to the ll and inc the refcount. However these may be way out of date. I think if we get an entry from the disk cache and the TTL has expired, refresh it and add it to the ll.

If we ignore expired entries, we can rely on the usual mechanisms here:

- If an entry is expired in mem, it is also expired on disk.
- If we need to look up an entry that is neither in memory nor on disk, we will get it from the DNS and write it back into the disk cache.

Remember we still have the disk cache purge thing, so the only time interval in which an inconsistency problem could occur is between the expiration time of an entry and the next disk cache purge.

> > Hardware people sometimes call this a 'write through 1st level cache'.
> > IMO this implementation will work well, if some conditions are met:
> >
> > - The cache is 'large enough', all heavily used entries can be kept in memory.
>
> I think this is already somewhat done. The reference count will force new entries into the ll, so the ll will grow with the demand and shrink as the demand drops (ref count == 0)

Good idea, but we would need an LRU time instead of a refcount. Or we use a combination of refcount and LRU time, like this:

- every time an entry is referenced, we increment the refcount.
- every time the LRU time is < now - cache_expiration_time, we decrement the refcount and set the LRU time to now.
- if the refcount drops to 0, remove the entry.

> > - the TTL of a mem cache entry and a disk cache entry are always the same.
>
> Removing the entry from the disk cache solves this problem. We may just want to mark it invalid (TTL=-1?) and have the purge routine rewrite the file when we go off the net.

No problem here, because the expiration time is stored on disk.

> > - Instead of starting with an empty cache, one could fill up the cache with the last 'mem_cache_size' entries from the disk cache.
>
> Reloading the whole mem cache may be a little pointless. I think just reload the entries with a reference count > x where x is a defined value.

It is maybe useful for that refresh stuff; we can leave this decision open and start with an empty mem cache for now.

> Oh, who wants to write what?

Roadmap suggestion:

--- Job #1 ---

I just added a new data structure for cache entries, including some methods on them (read, write, dup, free). I will check that in tonight. Next I want to rewrite the disk cache stuff. The cache->hostentry conversion needs to be rewritten to use both the mem and the disk cache, and should go somewhere else (xhostentry.c ?). The mem cache and the disk cache should have the same/similar API, something like:

void init()
cache_data_t * XX_find_name(char * type, char * name)
cache_data_t * XX_find_next_name(char * type, char * name)
cache_data_t * XX_find_data(char * type, char * data)
cache_data_t * XX_find_next_data(char * type, char * data)
int XX_add(cache_data_t *)
void purge()

A caller must not expect a returned pointer to be stable for a longer period than until the next call to one of those functions. The functions need some state information (a fd or a list entry).

--- Job #2 ---

We need a LL implementation. As we may want to re-use it (refresh list ...), it should be in a separate module. I would suggest an API like this:

typedef struct {...} list_t;
typedef struct {
    list_t * list;
    ll_entry_t * pNext;
    cache_data_t * data;
} ll_entry_t;

Constructor, an add method and a destructor:

list_t * ll_new();
int ll_add(list_t *, cache_data_t *);
void ll_delete(list_t *);

... an iterator ...

ll_entry_t * ll_first(list_t *);
ll_entry_t * ll_next(ll_entry_t *);

... an element destructor ...

void ll_remove(ll_entry_t *);

... and finally a move method:

void ll_to_front(ll_entry_t *);

We need to remove entries from the middle of the list, so we should maybe implement this as a doubly linked list. Some of the functions are trivial and could be implemented as macros (or inlines, but some compilers can not use inline). The LL implementation does not depend on the cache rewrite stuff, so parallel work is possible.

--- Job #3 ---

When the LL stuff is ready, one can implement the mem cache stuff.

Ciao Andreas

--
Andreas Hofmeister (see http://www.informatik.uni-freiburg.de/~hofmeist)
"I can do it in Quake, why not in real life ?" (Lan Solaris (Illiad)) |
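[Editor's note] The Job #2 API above can be sketched as a small doubly linked list in C. This is only an illustration of the proposed interface, not dproxy code: cache_data_t is stubbed out, and unlike the proposal (where each entry carries a backpointer to its list) ll_remove and ll_to_front here take the list explicitly, which keeps the sketch shorter.

```c
#include <stdlib.h>

/* Stand-in for dproxy's cache_data_t; the real struct is in the cache code. */
typedef struct { int dummy; } cache_data_t;

typedef struct ll_entry {
    struct ll_entry *prev, *next;   /* doubly linked, so removal works mid-list */
    cache_data_t *data;
} ll_entry_t;

typedef struct {
    ll_entry_t *head, *tail;
} list_t;

list_t *ll_new(void) {
    return calloc(1, sizeof(list_t));
}

/* Add at the front: new entries are the most recently used. */
ll_entry_t *ll_add(list_t *l, cache_data_t *d) {
    ll_entry_t *e = calloc(1, sizeof(*e));
    if (!e) return NULL;
    e->data = d;
    e->next = l->head;
    if (l->head) l->head->prev = e;
    l->head = e;
    if (!l->tail) l->tail = e;
    return e;
}

/* Unlink an entry from anywhere in the list. */
static void ll_unlink(list_t *l, ll_entry_t *e) {
    if (e->prev) e->prev->next = e->next; else l->head = e->next;
    if (e->next) e->next->prev = e->prev; else l->tail = e->prev;
    e->prev = e->next = NULL;
}

void ll_remove(list_t *l, ll_entry_t *e) {
    ll_unlink(l, e);
    free(e);
}

/* Move-to-front on a cache hit (step 2 of the mem-cache proposal). */
void ll_to_front(list_t *l, ll_entry_t *e) {
    if (l->head == e) return;
    ll_unlink(l, e);
    e->next = l->head;
    if (l->head) l->head->prev = e;
    l->head = e;
    if (!l->tail) l->tail = e;
}

ll_entry_t *ll_first(list_t *l) { return l->head; }
ll_entry_t *ll_next(ll_entry_t *e) { return e->next; }
```

The prev pointer is what makes ll_remove O(1) from the middle of the list, which the refresh list needs.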
From: Benjamin C. <clo...@lu...> - 2000-02-10 03:15:57
|
On Thu, 10 Feb 2000, Andreas Hofmeister wrote:

> Ok I would suggest the following memory cache implementation with or without refresh

I like the idea!

> 2. Every time we search and find an entry in the mem-cache, we move it to the front of the list.
> 2a. (refresh) if an entry was referenced, we increase a reference count.

We are still going to have the problem of sharing the list between processes. I suggest that the parent searches the ll, then disk, and returns the appropriate entry if found. Else we fork off the child to do the dns, since it may time out. If we have the ip/name the delay shouldn't be that noticeable, so I think this would work well.

> 3. if we can not find the entry in the mem cache, we search it in the disk cache and the dhcp file (in that order), if we find it there, we will add it to the mem cache as the first element. If there are too many entries in the mem cache, we remove the last element from the list.

If we find the entry in the disk cache and move it to memory, we should probably also remove it/mark it invalid in the disk cache. We may end up with data inconsistency otherwise.

> 6. We do not refresh entries in the disk cache.

What happens to expired entries in the disk cache? On next reference we move them to the ll and inc the refcount. However these may be way out of date. I think if we get an entry from the disk cache and the TTL has expired, refresh it and add it to the ll.

> Hardware people sometimes call this a 'write through 1st level cache'.
> IMO this implementation will work well, if some conditions are met:
>
> - The cache is 'large enough', all heavily used entries can be kept in memory.

I think this is already somewhat done. The reference count will force new entries into the ll, so the ll will grow with the demand and shrink as the demand drops (ref count == 0).

> - the TTL of a mem cache entry and a disk cache entry are always the same.

Removing the entry from the disk cache solves this problem. We may just want to mark it invalid (TTL=-1?) and have the purge routine rewrite the file when we go off the net.

> detail alternatives:
>
> - Instead of a reference count, one could use the LRU time
> - Instead of starting with an empty cache, one could fill up the cache with the last 'mem_cache_size' entries from the disk cache.

Reloading the whole mem cache may be a little pointless. I think just reload the entries with a reference count > x, where x is a defined value.

Oh, who wants to write what?

Cheers, Benjamin

PS: sorry, emails to the address quakevr@bandi... will bounce - firewalled |
From: Andreas H. <hof...@in...> - 2000-02-10 02:32:11
|
jeroen wrote:

> Andreas Hofmeister wrote:
> > > Maybe it is an idea to add a keyword search to it??? e.g. *xxx in the deny file would cause all domains with 'xxx' to get rejected?
> <snip>
> ... if you have 'hundreds' of sites you want to block this might be a little more effective for reducing the size of this file.

You are thinking of some sort of regex. Mhhh - the only problem with this is that one must be extremely careful with them - you might remember that AOL drama: they tried to block newsgroups like alt.sex etc., but they also blocked some groups about breast cancer, AIDS and such ...

There are some functions in the libc for regex comparison we could use. The only thing about regexes is that they are expensive, either in terms of memory or speed. To be efficient, a regular expression must be precompiled. We can not re-read and recompile them for every query we get - and of course not for every one of some hundred sites to block.

Maybe we could implement that regex stuff like this: every line that starts with a special marker - say '@' (not part of any domain name) - will be regarded as a regex. On startup, dproxy reads the block file once, checks for regexes and compiles them. In normal operation, dproxy simply ignores every line starting with a '@'.

> (I think a firewall rule might be a better place for this kind of thing.)

Both are different aspects of blocking: a FW rule can block the data transfer to a site, a blocking rule in the name server can hide the existence of a site.

BTW, your refresh things are not yet in dproxy-1.x. Please wait a little before you start to implement this. I am doing some experiments with 'late forks', which make a mem cache possible. This will allow us to do a refresh implementation without the problems I mentioned.

Also note that the semantics of the time field in the cache file have changed! We now get and use the real TTL from the upstream DNS, and 'cache_purge_time' has a very different meaning now (it is simply the default TTL sent to the clients; the param should be renamed). There are two additional parameters that would make sense with this, 'cache_min_ttl' and 'cache_max_ttl', but both are not implemented yet. The first is for sites that send unbelievably short TTLs (e.g. netscape.com says 1h), the second is for a little bit more security.

Ciao Andreas |
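[Editor's note] The '@'-marker scheme Andreas sketches maps directly onto the POSIX regcomp/regexec functions in libc: compile marked lines once at startup, test queries against the compiled set. The helper names and the pattern limit below are invented for illustration; they are not dproxy code.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

#define MAX_REGEX 256   /* illustrative limit on compiled patterns */

/* Compiled patterns, built once at startup. */
static regex_t deny_regex[MAX_REGEX];
static int deny_regex_count = 0;

/* Compile every line starting with '@' from an already-open deny file.
 * Plain lines are skipped here; they go through the normal suffix match. */
int deny_load_regex(FILE *fp) {
    char line[512];
    while (fgets(line, sizeof(line), fp) && deny_regex_count < MAX_REGEX) {
        line[strcspn(line, "\r\n")] = '\0';
        if (line[0] != '@')
            continue;                       /* not a regex line */
        if (regcomp(&deny_regex[deny_regex_count], line + 1,
                    REG_EXTENDED | REG_NOSUB | REG_ICASE) == 0)
            deny_regex_count++;
    }
    return deny_regex_count;
}

/* Check a queried name against all precompiled patterns. */
int deny_match_regex(const char *name) {
    for (int i = 0; i < deny_regex_count; i++)
        if (regexec(&deny_regex[i], name, 0, NULL, 0) == 0)
            return 1;                       /* blocked */
    return 0;
}
```

REG_NOSUB matches the use case: we only need a yes/no answer per query, never the matched substring, and it lets the library skip subexpression bookkeeping.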
From: Andreas H. <hof...@in...> - 2000-02-10 02:31:58
|
qu...@ba... wrote:

> On Wed, 9 Feb 2000, Andreas Hofmeister wrote:
> <SNIP>
> > FIFO is not the best cache strategy one can think of, but it would work.
>
> I suggest a reference count in the dproxy file ie:

Ok, I would suggest the following memory cache implementation, with or without refresh:

1. The mem-cache is a linked list of entries.
2. Every time we search and find an entry in the mem-cache, we move it to the front of the list.
2a. (refresh) if an entry was referenced, we increase a reference count.
3. If we can not find the entry in the mem cache, we search for it in the disk cache and the dhcp file (in that order). If we find it there, we add it to the mem cache as the first element. If there are too many entries in the mem cache, we remove the last element from the list.
4. We do not check for the entry on the DNS server! If we get an entry from the DNS, that entry will only be added to the disk cache. Only if it is referenced again will it get into memory.
5a. (no refresh) if an entry from the cache expires, it is removed.
5b. (refresh) on each purge run on the mem cache, we decrease the reference count. If an entry is expired and the reference count is greater than 0, we put that entry into a 'refresh' list. Names in the refresh list will be refreshed in a separate process and written into the disk cache.
6. We do not refresh entries in the disk cache.
7. After startup, the memory cache is empty.

Hardware people sometimes call this a 'write through 1st level cache'. IMO this implementation will work well if some conditions are met:

- The cache is 'large enough': all heavily used entries can be kept in memory.
- We don't have linear access patterns that are longer than the cache size; this would lead to 'cache thrashing'.
- Refresh with this implementation will work as long as the system (and dproxy) is started and connected at least once in every 'cache_min_time' interval.
- The TTL of a mem cache entry and a disk cache entry are always the same.

Detail alternatives:

- Instead of a reference count, one could use the LRU time.
- Instead of starting with an empty cache, one could fill up the cache with the last 'mem_cache_size' entries from the disk cache.

Ciao Andreas |
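[Editor's note] Steps 2-4 of the proposal describe a lookup cascade: mem cache (with move-to-front), then the slower on-disk tiers, and only then the DNS. A toy sketch of that cascade, with the disk tier stubbed and a tiny fixed-size MRU array standing in for the linked list (all names here are invented for illustration):

```c
#include <stdbool.h>
#include <string.h>

#define MEM_CACHE_SIZE 3   /* illustrative bound */

/* Toy mem-cache: array kept in MRU order (slot 0 = most recent). */
static char mem_cache[MEM_CACHE_SIZE][64];
static int mem_count = 0;

/* Stubbed slower tier; real dproxy would read the cache file and
 * the dhcp file, and finally ask the upstream DNS. */
static bool disk_lookup(const char *name) {
    return strcmp(name, "ondisk.example") == 0;
}

static void mem_insert_front(const char *name) {
    if (mem_count < MEM_CACHE_SIZE) mem_count++;   /* else last entry falls off */
    memmove(mem_cache[1], mem_cache[0], (mem_count - 1) * sizeof(mem_cache[0]));
    strncpy(mem_cache[0], name, sizeof(mem_cache[0]) - 1);
}

/* Step 2: a mem hit moves to the front. Step 3: a miss falls through
 * to disk, and a disk hit is promoted into the mem cache. Step 4: a
 * full miss would go to the DNS, whose answer is written to disk only. */
bool cache_lookup(const char *name) {
    for (int i = 0; i < mem_count; i++) {
        if (strcmp(mem_cache[i], name) == 0) {
            char hit[64];
            strcpy(hit, mem_cache[i]);
            memmove(mem_cache[1], mem_cache[0], i * sizeof(mem_cache[0]));
            strcpy(mem_cache[0], hit);
            return true;                    /* mem hit, moved to front */
        }
    }
    if (disk_lookup(name)) {                /* promote the disk hit */
        mem_insert_front(name);
        return true;
    }
    return false;                           /* would fall through to the DNS */
}
```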
From: <qu...@ba...> - 2000-02-10 00:25:04
|
On Wed, 9 Feb 2000, Andreas Hofmeister wrote:

<SNIP>

> FIFO is not the best cache strategy one can think of, but it would work.

I suggest a reference count in the dproxy file, ie:

name ip timeout refcount anyotherstuff

(Setting the file up this way means we can work with older dproxy cache files - support backward compatibility, I say :)

When each site is accessed we increment the reference count. Thus when we come to purge, we purge the entries using ttl as the primary and refcount as the secondary check. If we just use refcount, then we may have a site just entered (so refcount = 1) and it will be purged straight away - not what we want.

> There is another problem with that 'refresh' thing: Consider you contact a large number of sites one day, your 'CACHE_PURGE_TIME' is set to n days. On the n'th day, you do not connect to the net. If you now connect to the net on the n+1 day, all entries are refreshed in one run. The situation will get even worse if you do not connect to the net for a longer period (say holidays).
> Once all the hosts are looked up in that single run, they will have the same creation time, so they will always be refreshed together ...

How about basing the cache purge time on how long the person was online? Ie in the last/first line of proxy.cache keep an indicator. This indicator could be used for the time, and so ttl entries would be purged as they expire based on the online time. This would however mean that the ttl entry wasn't used how it was initially intended.

> IMO those problems can not be solved with an on disk cache, as long as this has no fixed record structure, but this would mean that the cache wasted much space on disk and was not human readable anymore. We could solve the refresh problems with a memory cache ...

I think it is possible on disk and in human readable format. I do think that dproxy should have a limited memory cache for the most used entries. A purge could write the memory cache to disk, purge it, then reread the new most used entries. If a server relies heavily on the dns for local machine names, why should we have to go to disk each time?

Cheers, Benjamin |
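[Editor's note] Benjamin's purge rule ("ttl as the primary and refcount as the secondary check") could look like this sketch. The record layout follows his proposed cache-file line; the type and function names, and the decay-by-one policy, are invented here for illustration:

```c
#include <stdbool.h>
#include <time.h>

/* One line of the proposed cache file: name ip timeout refcount ... */
typedef struct {
    char name[256];
    char ip[16];
    time_t timeout;    /* absolute expiry time */
    int refcount;
} cache_rec_t;

/* TTL is the primary check: a live entry is never purged.
 * Refcount is secondary: an expired entry survives one purge pass per
 * accumulated reference, so a freshly added entry (refcount 1) is not
 * thrown away straight away - the failure mode Benjamin points out. */
bool should_purge(cache_rec_t *rec, time_t now) {
    if (rec->timeout > now)
        return false;          /* still valid */
    if (rec->refcount > 0) {
        rec->refcount--;       /* expired but recently used: decay instead */
        return false;
    }
    return true;               /* expired and unreferenced */
}
```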
From: jeroen <pe...@ch...> - 2000-02-09 21:19:00
|
Andreas Hofmeister wrote:

> > Maybe it is an idea to add a keyword search to it??? e.g. *xxx in the deny file would cause all domains with 'xxx' to get rejected?
>
> That's already in, dproxy tries to match the strings from 'dproxy.deny' with the end of a hostname, so a query for 'a.b.c' will be blocked by a 'dproxy.deny' line like 'b.c' - btw. this would also block 'x.yb.c' -> all domain names in the deny file should have a dot as first char.

I know that's in, I wrote that part, but if you have 'hundreds' of sites you want to block, this might be a little more effective for reducing the size of this file. (I think a firewall rule might be a better place for this kind of thing.)

> Ciao
> Andreas

Grtz, Jeroen |
From: jeroen <pe...@ch...> - 2000-02-09 21:07:37
|
Andi, I looked into the files from slackware, and I found a file 'slack-version-4.0.0' in /usr/lib/setup. Maybe you can look for the existence of a file named '/usr/lib/setup/slack*' ????? Grtz, Jeroen |
From: Andreas H. <hof...@in...> - 2000-02-09 20:11:16
|
Hi,

jeroen wrote:

> Andi,
>
> I just downloaded the 1.x source tree, looks nice. (although I did see a few goto's in dproxy.c....not a real problem but dangerous.....) I haven't yet looked at it really closely, but I saw a few things about ipv6, is that working already???

Nope, I need to read the RFCs first. Those points where you saw something about ipv6 are some sort of 'fixme' markers for the future.

> I have been thinking about the problems you mentioned about refreshing, but I think they are solved if a config option 'max_entries' is added, this would cause the oldest to be deleted if the cache gets too big, if it is frequently used it will appear at the end of the cache the next time it gets looked up.

FIFO is not the best cache strategy one can think of, but it would work.

There is another problem with that 'refresh' thing: Consider you contact a large number of sites one day, and your 'CACHE_PURGE_TIME' is set to n days. On the n'th day, you do not connect to the net. If you now connect to the net on the n+1'th day, all entries are refreshed in one run. The situation will get even worse if you do not connect to the net for a longer period (say holidays). Once all the hosts are looked up in that single run, they will have the same creation time, so they will always be refreshed together ...

IMO those problems can not be solved with an on-disk cache as long as it has no fixed record structure, but a fixed structure would mean that the cache wasted much space on disk and was not human readable anymore. We could solve the refresh problems with a memory cache ...

> Everybody:
>
> About the memory cache and other things I saw in the last few mails, I agree with Matty that we should be careful not to bloat it too much. I started using dproxy because it was nice and small and didn't need much config (I know.... I started the config thing.......) I think we should leave all the little extras to our 'big brothers' like bind or dent, and concentrate on what dproxy was originally made for: small LANs.

Sure, but some features are useful even for smaller networks, e.g. there is a small network monitor 'tkined' that could use HINFO records, or if you want to have a central mail server for your net, MX records are very useful.

Also, 'small' depends on your point of view: if you have 3-4 hosts, you may think 100 hosts make a big network; if you are admin of a net with some 1000 hosts, 100 hosts are a small network. IMO 100 hosts is a number that fits most of the applications for dproxy - home networks, a department of a university, a workstation pool and small companies.

> I don't think we should change the deny file to dbase format, it would also bloat dproxy.

Do you mean the 'hosts' file, the cache or dproxy.deny?

> Maybe it is an idea to add a keyword search to it??? e.g. *xxx in the deny file would cause all domains with 'xxx' to get rejected?

That's already in: dproxy tries to match the strings from 'dproxy.deny' with the end of a hostname, so a query for 'a.b.c' will be blocked by a 'dproxy.deny' line like 'b.c' - btw. this would also block 'x.yb.c' -> all domain names in the deny file should have a dot as first char.

Ciao Andreas

--
Andreas Hofmeister (see http://www.informatik.uni-freiburg.de/~hofmeist)
"I can do it in Quake, why not in real life ?" (Lan Solaris (Illiad))

-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCS/IT d-- s--:>++: a C++ ULSC+++$ P+++ L+++ E+ W++ N+ o? !w--- !O !M-- V-- PS++ PE-- Y+ PGP- t+ 5? X+ !R !tv b+ DI? D++ G e>+++ h(++) r% y+
-----END GEEK CODE BLOCK----- |
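[Editor's note] The suffix matching Andreas describes - a deny line 'b.c' blocks 'a.b.c', but without a leading dot it also blocks 'x.yb.c' - fits in a few lines of C. The function name is invented for illustration; this is not the actual dproxy routine:

```c
#include <stdbool.h>
#include <string.h>

/* Returns true when 'name' ends with the deny-file pattern.
 * Note the pitfall from the mail: the pattern "b.c" also matches
 * "x.yb.c", which is why deny entries should start with a dot. */
bool deny_suffix_match(const char *name, const char *pattern) {
    size_t nlen = strlen(name), plen = strlen(pattern);
    if (plen > nlen)
        return false;
    return strcmp(name + nlen - plen, pattern) == 0;
}
```

With a leading dot in the pattern, the match can only occur at a label boundary, which removes the false positive.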
From: jeroen <pe...@ch...> - 2000-02-09 18:58:49
|
Andi,

I just downloaded the 1.x source tree, looks nice. (although I did see a few goto's in dproxy.c....not a real problem but dangerous.....) I haven't yet looked at it really closely, but I saw a few things about ipv6, is that working already???

I have been thinking about the problems you mentioned about refreshing, but I think they are solved if a config option 'max_entries' is added. This would cause the oldest entry to be deleted if the cache gets too big; if it is frequently used it will appear at the end of the cache the next time it gets looked up.

Everybody:

About the memory cache and other things I saw in the last few mails, I agree with Matty that we should be careful not to bloat it too much. I started using dproxy because it was nice and small and didn't need much config (I know.... I started the config thing.......) I think we should leave all the little extras to our 'big brothers' like bind or dent, and concentrate on what dproxy was originally made for: small LANs.

I don't think we should change the deny file to dbase format, it would also bloat dproxy. Maybe it is an idea to add a keyword search to it??? e.g. *xxx in the deny file would cause all domains with 'xxx' to get rejected? (just my opinion)

Grtz, Jeroen |
From: jeroen <pe...@ch...> - 2000-02-09 06:50:24
|
Andreas Hofmeister wrote:

> jeroen wrote:
> > Hi again,
> >
> > I just committed the changes I made to cache.c and dproxy.h (I needed the connected and deny functions in cache.c).
>
> Mhhh - at the moment we don't keep track of how often a cache entry is referenced - that means, if you contact a site once, it will be in the cache forever ...

That is a problem I too found out; the cache could grow forever. I have been thinking about adding an entry to the config file, 'max_cache' or something like that, and throwing out the oldest entry (the first one after the hosts) if the file grows too large.

> > Looks like it worked (Yahoo.... I learned to use CVS!!!!!)
> :-)
>
> BTW. what is the path for that "slackware script" you mentioned in a previous mail? If this script is a special feature of the slackware dist, I could use this to detect this dist in the configure script.

It is located at /etc/rc.d/rc.inet2, all daemons are started here, and as far as I know it is specific to slackware....

Grtz, Jeroen |
From: Andreas H. <hof...@in...> - 2000-02-09 06:49:29
|
Hi all, I just checked in the new dproxy 1.x development thread. There are many new features in that version; you can get it from the cvs repository with anonymous cvs. I also started a manual for this version, you will find it in the 'doc' directory as 'manual.sgml'. Because of my ... limited english skills, I need somebody who could fix all those syntactical bugs ... Ciao Andreas |
From: Andreas H. <hof...@in...> - 2000-02-09 06:12:08
|
jeroen wrote:

> Hi again,
>
> I just committed the changes I made to cache.c and dproxy.h (I needed the connected and deny functions in cache.c).

Mhhh - at the moment we don't keep track of how often a cache entry is referenced - that means, if you contact a site once, it will be in the cache forever ...

> Looks like it worked (Yahoo.... I learned to use CVS!!!!!)

:-)

BTW, what is the path for that "slackware script" you mentioned in a previous mail? If this script is a special feature of the slackware dist, I could use it to detect this dist in the configure script.

--
Andreas Hofmeister (see http://www.informatik.uni-freiburg.de/~hofmeist)
"I can do it in Quake, why not in real life ?" (Lan Solaris (Illiad))

-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCS/IT d-- s--:>++: a C++ ULSC+++$ P+++ L+++ E+ W++ N+ o? !w--- !O !M-- V-- PS++ PE-- Y+ PGP- t+ 5? X+ !R !tv b+ DI? D++ G e>+++ h(++) r% y+
-----END GEEK CODE BLOCK----- |
From: Andreas H. <hof...@in...> - 2000-02-09 06:12:05
|
qu...@ba... wrote:

> Hi all,
> has anyone thought about implementing a memory cache for the top 100 or so dns lookups? This would definitely be faster, especially for things like entries from /etc/hosts and deleting timed-out entries.
>
> I think putting the whole thing in memory would be good as you could then set up another thread which would purge the timed-out cache entries and rewrite the file (Less IO = good);

In the new 1.x series, the cache will be purged at a fixed time interval. This will result in a better response time.

With a memory cache there is a problem with the forking design of dproxy - we can pass data in memory to the child, but the child can not pass back the result from an upstream query. We are considering a non-forking design for the whole thing, but this would make things much more complicated. Another way to implement a mem cache would be shared memory, but I never tried that. Or, as a last resort, we could cheat - fork off a child to do the upstream query, write to the disk cache and have the parent read that new entry on the next query for this host.

> However, if the cache grows large this could be a downfall. Comments?

This depends on the size of your network. I run dproxy on my very small net with 3 hosts and one user (me); in this setting, the cache is very small. Most answers from the DNS have a ttl of 1-2 days, so in the usual case your cache will not become very large as long as you (+ your users) don't contact some hundred hosts a day. Well, there is that unusual case - some people run busy web servers and want to resolve some thousand addresses from their logfiles in just one hour ...

> Also, what's happening with getting dproxy working on the gateway machine. I know it means implementing the full dns protocol but I think it may be worth it.

Will be in 1.x (I already have this running on my gateway).

Ciao Andreas |
From: Matty <mat...@ya...> - 2000-02-09 05:16:01
|
qu...@ba... wrote:

> Hi all,
> has anyone thought about implementing a memory cache for the top 100 or so dns lookups? This would definitely be faster, especially for things like entries from /etc/hosts and deleting timed-out entries.
>
> I think putting the whole thing in memory would be good as you could then set up another thread which would purge the timed-out cache entries and rewrite the file (Less IO = good);

Well, the kernel should take care of caching the cache file (confused yet?), so there should not be heaps of IO. Besides, compare a ~15ms disk seek time to a 650ms ping time over the modem.

I avoided a memory based cache for two reasons: 1) if dproxy is unexpectedly killed, it doesn't lose its cache; 2) my gateway does not have much RAM. Similarly, I didn't go with a threads approach as it bloated dproxy, and again I don't have much RAM.

Seems we should go two directions from here:

1) a dproxy-lite version that stays small and doesn't have feature creep. Does the minimum job and that's it.
2) a more featureful version for either larger servers or workstations.

> However, if the cache grows large this could be a downfall. Comments?
>
> Also, what's happening with getting dproxy working on the gateway machine. I know it means implementing the full dns protocol but I think it may be worth it.

Ask Andi about his latest work. On that note, Andi, can you put it in CVS under something like dproxy-new, or dproxy-1.x... Actually I don't know what permissions you have over the CVS, so if you can't do it, give us a yell and I will see if I can fix it.

Matty |
From: Andreas H. <hof...@in...> - 2000-02-09 05:12:31
|
Hi, I want to check in the new stuff. Matty suggested calling the current cvs tree 1.0 and the new versions 1.x. Is it ok if I check in the new stuff as a new top level package 'dproxy-1.x'?

Here are some thoughts for the future development of dproxy.

Target environment:
-------------------

IMO we should expect dproxy to be used on small networks, say 1-100 hosts. Larger networks would require some more complex things such as database replication, authority splits etc. With this in mind, we can maybe keep the cache in plain text. I would expect fewer than 300 cache entries on such a network, so we get about 30k.

The deny file is a different thing. A friend of mine is very interested in dproxy because of this feature. He is admin of a school network and wants to block some porn sites. At the moment, his list of such sites has about 1500 entries. Maybe it would be better to move to a [ng]dbm database for the deny file ... With such a hash based database, we can find an entry within 2 disk operations and we need only 3 compare operations (hidden in the dbm lib). On the other hand, we would have to access that database for each part of the name, e.g. if we want to check whether 'a.b.c.d' should be blocked, we need to check for 'a.b.c.d', 'b.c.d', 'c.d' and 'd'.

RFC conformance
---------------

dproxy-1.x implements most of the MUST features from RFC 1123, some of the SHOULD features and none of the MUST NOTs. The largest possible RFC violation is the maximum UDP packet size: as of RFC 1035, a UDP packet must not exceed 512 bytes. This has not been a problem in the past, anyway.

Additional RRs
--------------

Another problem is that we do not support all RRs specified in RFC 1035. This is no real problem, because most of them are nearly needless - except MX (mail exchange) records. I'm not sure how to implement them.

If we have a dialup gateway, all mail from the 'inside' network to the Internet should be relayed on the gateway, so we could answer any MX query from the inside with an MX record pointing to the gateway. But what happens on the gateway? The mailer there needs a different MX record for its routing decisions, but dproxy can not (yet?) pass MX records from the outside to a client on the local machine. For MX queries on the internal net, we could send an MX record pointing either to the queried host or to a mail server (the gateway again?). If we need to answer queries from the outside, this is similar to MX queries from the inside: we could answer with the hostname+domain or with the address of a mail gateway.

If we need to transfer any additional RRs from the outside to the inside, we also should cache them. We could extend the current cache file if we add an RR marker to each record in the cache. The new format could be something like:

<RR> <Name> <TTL> <some data>\n

There would be a function that returns the ttl and everything from the first non-blank up to the end-of-line. The calling function would have to decide how to interpret this data.

If we need to store additional RR data for local hosts, we could extend the format of '/etc/hosts'. We could check the comments that come after a host entry for additional data, e.g.:

192.168.1.2 speedy.example.net speedy # #HINFO AMD-Athlon Linux #MX mail.example.net #MX mail2.example.net
192.168.1.1 sun01.example.net sun01 # #HINFO SPARC Linux #TXT We have joy, we have fun, we have Linux on our SUN #

This would be stored in the cache file as:

A speedy.example.net 0 192.168.1.2
A speedy 0 192.168.1.2
HINFO speedy.example.net 0 AMD-Athlon Linux
MX speedy.example.net 0 mail.example.net
MX speedy.example.net 0 mail2.example.net
A sun01.example.net 0 192.168.1.1
A sun01 0 192.168.1.1
HINFO sun01.example.net 0 SPARC Linux
TXT sun01.example.net 0 We have joy, we have fun, we have Linux on our SUN

Security
--------

The DNS is not a secure protocol! There is a 'spoof protection' feature used by gethostbyname() that we should implement. That is, if we query an upstream DNS for the address of a hostname, we should also query for the hostname of that address. If they do not match, one of the answers was spoofed - maybe. The idea is that if the attacker is not directly on the other side of the wire, it is possible that he can not spoof both answers. Not much protection anyway ... There are also those things from RFC 2065 - crypto stuff and such, sounds interesting ...

Non-forking server ...
----------------------

... makes things a lot more complicated. On the other hand, we need to keep track of the queries to prevent forking multiple children for a single query. Any ideas?

More?
-----

What do you think MUST, SHOULD, MAY, MAY NOT, SHOULD NOT or MUST NOT be implemented? (hey, I read too much in those RFCs)

Ciao Andreas |
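[Editor's note] The per-label deny lookup Andreas describes for the proposed dbm deny file - check 'a.b.c.d', then 'b.c.d', 'c.d' and 'd' - would walk the queried name one label at a time. The dbm fetch is stubbed out below, since the database format is only a proposal; a real version would call something like gdbm_exists() on the compiled deny database:

```c
#include <stdbool.h>
#include <string.h>

/* Stand-in for a [ng]dbm fetch; pretend only 'c.d' is in the deny db. */
static bool deny_db_has(const char *key) {
    return strcmp(key, "c.d") == 0;
}

/* Check every suffix of the queried name that starts at a label
 * boundary: "a.b.c.d" -> "a.b.c.d", "b.c.d", "c.d", "d". */
bool deny_check(const char *name) {
    const char *p = name;
    for (;;) {
        if (deny_db_has(p))
            return true;       /* some suffix is denied */
        p = strchr(p, '.');
        if (!p)
            return false;      /* no more labels */
        p++;                   /* skip the dot, start at the next label */
    }
}
```

A name with k labels costs at most k database lookups, which is what makes the 2-disk-operation hash lookup attractive compared to scanning a 1500-line text file per label.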
From: <qu...@ba...> - 2000-02-09 00:22:43
|
Hi all, has anyone thought about implementing a memory cache for the top 100 or so dns lookups? This would definitely be faster, especially for things like entries from /etc/hosts and deleting timed-out entries.

I think putting the whole thing in memory would be good, as you could then set up another thread which would purge the timed-out cache entries and rewrite the file (less IO = good). However, if the cache grows large this could be a downfall. Comments?

Also, what's happening with getting dproxy working on the gateway machine? I know it means implementing the full dns protocol, but I think it may be worth it.

Cheers, Benjamin |
From: jeroen <pe...@ch...> - 2000-02-08 22:43:00
|
Hi again, I just committed the changes I made to cache.c and dproxy.h (I needed the connected and deny functions in cache.c). Looks like it worked (Yahoo.... I learned to use CVS!!!!!) Grtz, Jeroen p.s. The last message had a wrong date stamp, I had a little clock skew on my system... |
From: jeroen <pe...@ch...> - 2000-02-08 21:54:11
|
Hi everybody.... Hope this works........ Matty I saw that you haven't made tasks yet, I would like to work on a way to implement refreshing of entries instead of removing them at purge time. Grtz, Jeroen |
From: Andreas H. <hof...@in...> - 2000-02-07 14:40:35
|
Hi, I'm on the list now ... I've posted some bugs to the bugtracker. Matty, could you give me permission to assign bugs to myself? Andi

--
Andreas Hofmeister (see http://www.informatik.uni-freiburg.de/~hofmeist)
"I can do it in Quake, why not in real life ?" (Lan Solaris (Illiad)) |