Re: [uml-devel] recursive COW files

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 454-5900

>Ok, here is my take on recursive COW files and in addation a V3
>cow header layout is included.

I'm happy to see the nested COW issue being addressed here.  I had discussed 
it with Jeff, and he scared me off doing it myself by describing an 
(elegant) recursive driver design that he wanted to see.  I wasn't convinced 
drivers calling drivers calling drivers... was efficient** (I think you've 
coded is an iterative design, btw, not a recursive one like your comments 
indicate).

I was also the person interested in the ISAM format COW for non-sparse 
devices.  Jeff and I discussed that, too, and agreed that a "format" field 
(0 == conventional bitmap index, 1 == ISAM, etc. ) in the COW header would 
suffice for now, and ISAM could be integrated later.  To make that port 
easier, I envisioned a dev->ops like function lookup table that would point 
to the proper routines based on the format of the COW.  It might even 
contain the functions to process the different version headers, like so:

  if((vops = get_COW_version_ops(workspace) == NULL) {
    ERROR("Unknown COW file version");
    return NULL;
  }

  cow_name = vops->get_backing_name(workspace);
  vops->fill_devinfo(&tmp,workspace);
  ...

Just a thought.

>Now this is just a test version, I have a pretty good idea how to
>hook it into the the ubd driver but I wanted to ask for comments
>first, this version has lots of debugging to check for my map
>allocation and which file is being read.

1)
I'm not convinced this is proper:

  #include <sys/param.h> /* MAXPATHLEN */
[...]
  #define PATH_LEN_V1 256
  #define PATH_LEN_V2 MAXPATHLEN

While 256 may have been too short(?), MAXPATHLEN is ambigious, and possibly 
variable, depending on system, leading to more porting headaches.  Pick an 
appropriate length, or define the length as a variable in the header.

2)
  if (memcmp(&workspace.v3.magic, &magic, sizeof(magic)) == 0) {
    if (ntohl(workspace.v3.version) == 3) {
      version = ntohl(workspace.v3.version);
      cow_name = workspace.v3.backing_file;
      tmp.data_length = ntohll(workspace.v3.data_length);
      tmp.data_offset = ntohll(workspace.v3.data_offset);
      tmp.mtime = ntohll(workspace.v3.mtime);
      tmp.cow_offset = ntohll(workspace.v3.cow_offset);
      tmp.cow_length = ntohll(workspace.v3.cow_length);
      tmp.data_offset = ntohll(workspace.v3.data_offset);
      tmp.data_length = ntohll(workspace.v3.data_length);

The above two lines are typos (duplicates), I think?

      tmp.cylinders = ntohl(workspace.v3.cylinders);
      tmp.heads = ntohl(workspace.v3.heads);
      tmp.sectors = ntohl(workspace.v3.sectors);
      tmp.sector_size = ntohl(workspace.v3.sector_size);
    }
  }

3)
  #define COW_SPARSE 1

What is this for?  Future ISAM support?  There is only one instance of this 
define being addressed in the code, which leads me to:

4)
  The comments in the code mostly state the obvious, like:

  /* write out to the data section */
[...]
  tmp = pwrite64(dev->fd, buf, dev->sector_size,
                  dev->data_offset + work);

While extremely odd code segments, like this:

  if ((dev->flags & COW_SPARSE) && (memcmp(buf, zero,dev->sector_size) == 
0)) {
    tmp = read_COW(dev, buf, 1, offset);
    if (tmp == 1) {
      if (memcmp(buf, zero, dev->sector_size) == 0) {
         goto one_done;
      }
      DEBUG("zero %s fd%d buf %p[%d]:%d\n", dev->name,
                      dev->fd, zero, dev->sector_size,
                      dev->data_offset + work);
      tmp = pwrite64(dev->fd, zero, dev->sector_size,
                      dev->data_offset + work);
      if (tmp < dev->sector_size) {
        ERROR("write zero %d:buf[%p]+%xl:%d",
                      dev->fd, zero, dev->data_offset
                      + work, dev->sector_size);
        return result;
      }
      goto one_done;
    }
  }

Are unfathomable to me and go completely unexplained.  Why is this here? 
What special case is this code *supposed* to take care of.  Why are blocks 
of zeros an issue?  How CPU expensive is doing the memcmp on the blocks - 
not only once, but *twice*?  What are the implications of the block *not* 
comparing to a bunch of zeros?

I am glad someone's hacking at this, I would love to see nested COW files.

Steve Schmidtke

** An easy optimization for nested COW files with James' iterative 
implementation, as opposed to Jeff's recursive proposal, would be to create 
a bitmap union of all the nested COW bitmaps.  If a bit for a block is set 
in the union bitmap, then each of the COWs must be searched for that block, 
otherwise (and the most likely case) the block exists in the backing file.  
Another optimization would make the union bitmap an index array, such that a 
block entry in the index array also identifies which COW entry contains that 
block (with an index of 0 defining a backing file block) - no sequential 
searching or bit shifting required, at the cost of 1 byte per block for the 
first 255 nested COW files.

_________________________________________________________________
Add photos to your messages with MSN 8. Get 2 months FREE*.  
http://join.msn.com/?page=features/featuredemail