I have a need for ntfsimage, which is listed as `not started' on the
tools status page, so I decided to write it. I'm posting here to let
you know to save anyone else the effort, to say where I'm at, and to
discuss a couple of design questions.
* Firstly, status: I have a first cut of the imaging utility which
produces a file of some kind, but no restore utility. I haven't
stared hard at the image output, so I don't know whether the output of
the imager is right yet.
* I'm planning to produce two utilities: ntfsimage, which reads an
NTFS volume and spits out an image, and restoreimage, which restores
an image to a partition. I wanted them to be separate programs
because the restoreimage functionality is not at all filesystem
dependent, so it's not clear that ntfstools is the right place for
it. (Although it'll probably be sensible to ship it there until
someone comes up with other uses. ext3image anyone?)
* I did some web browsing to try to find a suitable existing sparse
image format to use. However, such a thing doesn't seem to exist.
This means I had to invent one - which is a bonus as it means I can
invent a nice simple format which is easy to generate, but it also
means that I have to write the restore utility rather than just using
an existing tool. Formats I rejected:
* GNU tar's sparse file extensions - support 33-bit filesizes
and holes only, which is no good for a filesystem image
* partimage's image format - contains compression, CRCs, etc.,
which I think belong at a different layer, and which add
complexity. (Also, I'm not 100% convinced of the quality of some
of the partimage work.)
* Apple have a disk image format UDIF which looks like it might have
been a sensible standard to follow if Apple hadn't done the usual
Apple thing and made it a secret format.
* Ghost and its competitors don't seem to publish format specs
either.
So, please comment on my format, which is described below. What form
of documentation would be best for the image format specification ?
Should I write a section 5 manpage ?
* I'm slightly disturbed by the amount of option-parsing boilerplate
(and thus clone-and-hack) that appears in the utilities in the
linux-ntfs 1.7.1 source tree. For example, the stuff to implement the
force, help, verbose, quiet, and version options, and the code that's
needed when opterr=0. Are there any plans to rationalise this, or
should I just follow the example of the other tools ?
* Please comment on the following option set:
Usage: ntfsimage [options] device
-c --count Print <image-total-bytes> <in-fs-clusters-used>*<cluster-size>.
-f --force Use less caution
-q --quiet Less output
-v --verbose More output
-V --version Print version number
-h --help Print this help
* What copyright formalities are required, if any, for my code to make
it into the main linux-ntfs distribution ?
Thanks,
Ian.
Format description:
at least 64 bytes of header
offset len
0 16 ASCII magic number
"restoreimage" 0x0a 0x00 0x0d 0x1a
16 16 random octets magic number
0xb5 0x0c 0xa9 0xb6 0x15 0x63 0xb5 0x18
0x9a 0x31 0x80 0x33 0xf1 0x14 0xf8 0x3c
32 4 major version number, currently 0
readers must fail if major version not known
36 4 header size in bytes
must be multiple of cluster size (see below)
40 4 minor version number, currently 0
readers must continue if minor version not known
44 4 reserved
48 8 cluster size in bytes
56 8 number of clusters
64 ? remainder of header is reserved
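To make the header layout concrete, here is a minimal sketch in Python of packing and checking the 64 fixed bytes. It is an illustration only, not the ntfsimage code: the function names pack_header and parse_header are made up, the integer fields are assumed to be unsigned, and the default header size (the smallest multiple of the cluster size that holds 64 bytes) is my own choice. Byte order is big endian, as the format description requires.

```python
import struct

# Magic numbers copied from the header table.
ASCII_MAGIC = b"restoreimage\x0a\x00\x0d\x1a"
RANDOM_MAGIC = bytes([0xb5, 0x0c, 0xa9, 0xb6, 0x15, 0x63, 0xb5, 0x18,
                      0x9a, 0x31, 0x80, 0x33, 0xf1, 0x14, 0xf8, 0x3c])

# ">" = big endian, no padding. Fields: two 16-byte magics, major version,
# header size, minor version, reserved (4 bytes each), cluster size,
# number of clusters (8 bytes each). Unsigned types are an assumption.
HEADER_FMT = ">16s16sIIIIQQ"
FIXED_LEN = struct.calcsize(HEADER_FMT)  # 64
KNOWN_MAJOR = 0


def pack_header(cluster_size, num_clusters, header_size=None):
    """Build a v0.0 header (hypothetical helper)."""
    if cluster_size <= 0:
        raise ValueError("cluster size must be positive")
    if header_size is None:
        # Assumption: smallest multiple of cluster_size that is >= 64.
        header_size = -(-FIXED_LEN // cluster_size) * cluster_size
    if header_size < FIXED_LEN or header_size % cluster_size:
        raise ValueError("header size must be >= 64 and a multiple "
                         "of the cluster size")
    fixed = struct.pack(HEADER_FMT, ASCII_MAGIC, RANDOM_MAGIC,
                        0, header_size, 0, 0, cluster_size, num_clusters)
    # Remainder of the header is reserved: all-bits-zero on write.
    return fixed.ljust(header_size, b"\x00")


def parse_header(buf):
    """Check magics and major version, return the fields as a dict."""
    (magic1, magic2, major, header_size, minor, _reserved,
     cluster_size, num_clusters) = struct.unpack_from(HEADER_FMT, buf, 0)
    if magic1 != ASCII_MAGIC or magic2 != RANDOM_MAGIC:
        raise ValueError("not a restoreimage file")
    if major != KNOWN_MAJOR:
        raise ValueError("unknown major version %d" % major)
    # An unknown minor version is not an error; reserved fields are ignored.
    return {"major": major, "minor": minor, "header_size": header_size,
            "cluster_size": cluster_size, "num_clusters": num_clusters}
```

For example, parse_header(pack_header(4096, 1000)) gives back a cluster size of 4096 and a cluster count of 1000, with a header padded to one full cluster.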
zero or more cluster groups where a cluster group is:
one cluster's worth of bitmap specifying whether
each cluster is present (1=present, 0=absent), where:
first cluster usage is ms bit of first byte
the actual cluster data for the specified clusters
everything is in network byte order (big endian)
reserved fields are all-bits-zero on write, ignored on read (in v0.0)
The last cluster group may be partial, if the number of clusters
is not a multiple of the number of bits in a cluster. In this
case the leftover clusters past the end are _not_ marked as
present in the bitmap, nor is any data stored for them.
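
The bitmap rule and the resulting image size can be sketched as follows. Again this is only an illustration in Python with made-up helper names (build_bitmap, is_present, image_size). It assumes that a writer emits one cluster group for every cluster_size*8 clusters of the volume, including groups in which no cluster is present; the description above does not say whether empty groups may be omitted.

```python
def build_bitmap(present_flags, cluster_size):
    """Return one cluster's worth of bitmap for a single cluster group.

    present_flags holds one boolean per cluster in the group. A partial
    last group simply passes fewer flags; the leftover bits stay zero.
    """
    if len(present_flags) > cluster_size * 8:
        raise ValueError("too many clusters for one group")
    bitmap = bytearray(cluster_size)
    for i, present in enumerate(present_flags):
        if present:
            # First cluster is the most significant bit of the first byte.
            bitmap[i // 8] |= 0x80 >> (i % 8)
    return bytes(bitmap)


def is_present(bitmap, i):
    """True if cluster i of this group has data stored in the image."""
    return bool(bitmap[i // 8] & (0x80 >> (i % 8)))


def image_size(header_size, cluster_size, num_clusters, used_clusters):
    """Total image bytes: header, one bitmap per group, then the data.

    Assumes every group is written, even if none of its clusters is used.
    """
    bits_per_group = cluster_size * 8
    groups = -(-num_clusters // bits_per_group)  # ceiling division
    return header_size + (groups + used_clusters) * cluster_size
```

With 4096-byte clusters, a 4096-byte header, 40000 clusters on the volume and 100 of them in use, this gives two groups and 4096 * (1 + 2 + 100) bytes in total.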