Re: [Dproxy-devel] dproxy 1.x
From: Andreas H. <hof...@in...> - 2000-02-10 17:58:56
Benjamin Close wrote:

> Else we fork off the child to do the dns, since it may timeout. If we
> have the ip/name the delay shouldn't be that noticeable so I think this
> would work well.

I just moved the fork out of the main loop up to 'handle_query' to see
what happens. I need to change some more little things to move the fork
into the handler function. I see no problem here.

> If we find the entry in the disk cache and move it to memory we should
> probably also remove it/mark it invalid from the disk cache. We may end
> up with data inconsistency otherwise.

I think it is not necessary to mark entries on disk, because we already
store the expiration time on disk. We should ignore expired entries in
the disk cache instead.

> > 6. We do not refresh entries in the disk cache.
>
> What happens to expired entries in the disk cache? On next reference we
> move them to the ll and inc the refcount. However these may be way out
> of date. I think if we get an entry from the disk cache and the TTL has
> expired, refresh them and add to ll.

If we ignore expired entries, we can rely on the usual mechanisms here:

- If an entry is expired in mem, it is also expired on disk.
- If we need to look up an entry that is neither in memory nor on disk,
  we will get it from the DNS and write it back into the disk cache.

Remember we still have the disk cache purge thing, so the only time
interval in which an inconsistency problem could occur is between the
expiration time of an entry and the next disk cache purge.

> > Hardware people sometimes call this a 'write through 1st level cache'.
> > IMO this implementation will work well, if some conditions are met:
> >
> > - The cache is 'large enough', all heavily used entries can be kept in
> >   memory.
>
> I think this is already somewhat done. The reference count will force new
> entries into the ll, so the ll will grow with the demand and shrink as
> the demand drops (ref count == 0).

Good idea, but we would need an LRU time instead of a refcount.
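The "ignore expired entries in the disk cache" rule above could look roughly like this. This is a sketch under assumptions: the record layout and field names are hypothetical, not dproxy's actual on-disk format; the only point is that the expiry time already stored with each record makes an explicit "invalid" marker unnecessary.

```c
#include <time.h>

/* Hypothetical on-disk record layout -- field names are assumptions,
 * not dproxy's real struct. The expiry time is stored with the entry,
 * so readers can simply skip expired records instead of rewriting
 * them as "invalid". */
typedef struct {
    char   name[256];  /* hostname                  */
    char   ip[16];     /* dotted-quad address       */
    time_t expires;    /* absolute expiration time  */
} disk_entry_t;

/* Return 1 if the entry is still usable, 0 if it should be ignored. */
static int disk_entry_valid(const disk_entry_t *e, time_t now)
{
    return e->expires > now;
}
```

A lookup in the disk cache would just call `disk_entry_valid()` on each candidate record and fall through to a real DNS query when nothing valid is found.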
Or we use a combination of refcount and LRU time like this:

- Every time an entry is referenced, we increment the refcount.
- Every time the LRU time is < now - cache_expiration_time, we decrement
  the refcount and set the LRU time to now.
- If the refcount drops to 0, remove the entry.

> > - The TTL of a mem cache entry and a disk cache entry are always the
> >   same.
>
> Removing the entry from the disk cache solves this problem. We may just
> want to mark it invalid (TTL=-1?) and have the purge routine rewrite
> the file when we go off the net.

No problem, because the expiration time is stored on disk.

> > - Instead of starting with an empty cache, one could fill up the
> >   cache with the last 'mem_cache_size' entries from the disk cache.
>
> Reloading the whole mem cache may be a little pointless. I think just
> reload the entries with a reference count > x where x is a defined
> value.

It may be useful for that refresh stuff; we can leave this decision open
and start with an empty mem cache for now.

> Oh, who wants to write what?

Roadmap suggestion:

--- Job #1 ---

I just added a new data structure for cache entries, including some
methods on them (read, write, dup, free). I will check that in tonight.

Next I want to rewrite the disk cache stuff. The cache->hostentry
conversion needs to be rewritten to use both the mem and the disk cache,
and should go somewhere else (xhostentry.c ?).

The mem cache and the disk cache should have the same/similar API,
something like:

  void init()
  cache_data_t * XX_find_name(char * type, char * name)
  cache_data_t * XX_find_next_name(char * type, char * name)
  cache_data_t * XX_find_data(char * type, char * data)
  cache_data_t * XX_find_next_data(char * type, char * data)
  int XX_add(cache_data_t *)
  void purge()

A caller must not expect a returned pointer to remain stable for longer
than until the next call to one of those functions.
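To make the proposed common API concrete, here is a minimal sketch of one instantiation for the mem cache, backed by a small static array. Everything here is an assumption for illustration: the `cache_data_t` layout, `MEM_CACHE_SIZE`, and the `mem_cache_` prefix standing in for the "XX_" placeholder; only `find_name` and `add` are shown.

```c
#include <string.h>

/* Assumed entry layout and table size -- not dproxy's real definitions. */
#define MEM_CACHE_SIZE 64

typedef struct {
    char type[8];     /* record type, e.g. "A" or "PTR" */
    char name[256];
    char data[256];
} cache_data_t;

static cache_data_t mem_cache[MEM_CACHE_SIZE];
static int mem_cache_used;

void mem_cache_init(void)
{
    mem_cache_used = 0;
}

/* Per the contract above: the returned pointer is only valid until
 * the next call into the cache. */
cache_data_t *mem_cache_find_name(char *type, char *name)
{
    int i;
    for (i = 0; i < mem_cache_used; i++)
        if (strcmp(mem_cache[i].type, type) == 0 &&
            strcmp(mem_cache[i].name, name) == 0)
            return &mem_cache[i];
    return NULL;
}

/* Copy the entry into the cache; return 0 on success, -1 when full. */
int mem_cache_add(cache_data_t *entry)
{
    if (mem_cache_used >= MEM_CACHE_SIZE)
        return -1;
    mem_cache[mem_cache_used++] = *entry;
    return 0;
}
```

A disk-cache implementation would offer the same signatures but keep its state in a file descriptor instead of a static array, which is exactly why the "state information" note below matters.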
The functions need some state information (an fd or a list entry).

--- Job #2 ---

We need a LL implementation. As we may want to re-use it (refresh list
...), it should be in a separate module. I would suggest an API like
this:

  typedef struct {...} list_t;

  typedef struct {
      list_t * list;
      ll_entry_t * pNext;
      cache_data_t * data;
  } ll_entry_t;

A constructor, an add method and a destructor:

  list_t * ll_new();
  int ll_add(list_t *, cache_data_t *);
  void ll_delete(list_t *);

... an iterator ...

  ll_entry_t * ll_first(list_t *);
  ll_entry_t * ll_next(ll_entry_t *);

... an element destructor ...

  void ll_remove(ll_entry_t *);

... and finally a move method:

  void ll_to_front(ll_entry_t *);

We need to remove entries from the middle of the list, so we should
maybe implement this as a doubly linked list. Some of the functions are
trivial and could be implemented as macros (or inlines, but some
compilers cannot use inline).

The LL implementation does not depend on the cache rewrite stuff, so
parallel work is possible.

--- Job #3 ---

When the LL stuff is ready, one can implement the mem cache stuff.

Ciao
  Andreas

--
Andreas Hofmeister (see http://www.informatik.uni-freiburg.de/~hofmeist)
"I can do it in Quake, why not in real life ?" (Lan Solaris (Illiad))
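As a footnote on the Job #2 proposal above: the suggested list API could be sketched as below. This is a hedged illustration, not the eventual dproxy module; `cache_data_t` is left as an opaque forward declaration, and a `prev` pointer is added (beyond the proposed struct) precisely because removal from the middle of the list was called out as a requirement.

```c
#include <stdlib.h>

/* Opaque payload type -- the list never looks inside it. */
typedef struct cache_data cache_data_t;

typedef struct ll_entry {
    struct list     *list;   /* owning list, needed by ll_remove/ll_to_front */
    struct ll_entry *next;
    struct ll_entry *prev;   /* double link, so middle removal is O(1) */
    cache_data_t    *data;
} ll_entry_t;

typedef struct list {
    ll_entry_t *head;
} list_t;

list_t *ll_new(void)
{
    return (list_t *)calloc(1, sizeof(list_t));
}

/* Add at the front; returns 0 on success, -1 on allocation failure. */
int ll_add(list_t *l, cache_data_t *data)
{
    ll_entry_t *e = (ll_entry_t *)calloc(1, sizeof(ll_entry_t));
    if (!e)
        return -1;
    e->list = l;
    e->data = data;
    e->next = l->head;
    if (l->head)
        l->head->prev = e;
    l->head = e;
    return 0;
}

ll_entry_t *ll_first(list_t *l)    { return l->head; }
ll_entry_t *ll_next(ll_entry_t *e) { return e->next; }

/* Unlink and free one entry; the cache_data_t itself is not freed. */
void ll_remove(ll_entry_t *e)
{
    if (e->prev)
        e->prev->next = e->next;
    else
        e->list->head = e->next;
    if (e->next)
        e->next->prev = e->prev;
    free(e);
}

/* Move a just-referenced entry to the front (LRU behaviour). */
void ll_to_front(ll_entry_t *e)
{
    list_t *l = e->list;
    if (l->head == e)
        return;
    e->prev->next = e->next;          /* unlink (e is not the head) */
    if (e->next)
        e->next->prev = e->prev;
    e->prev = NULL;                   /* relink at the head */
    e->next = l->head;
    l->head->prev = e;
    l->head = e;
}
```

`ll_first`/`ll_next` and `ll_to_front` are the trivial candidates for macros or inlines mentioned above.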