#157 archive file format handling (zip, tar, cpio)

open
nobody
None
5
2014-11-29
2007-12-24
No

!!! This is work in progress, not finished yet !!!
(but more or less working)

These patches add archive handling to libspectrum and, of course, to Fuse.

There are two patches:
1. libspectrum patch
- libspectrum.h.in: a new class, LIBSPECTRUM_CLASS_ARCHIVE, and some new ids: LIBSPECTRUM_ID_COMPRESSED_ZIP, ...ARCHIVE_ZIP, ...ARCHIVE_TAR and ...ARCHIVE_CPIO
- the declaration of libspectrum_uncompress_file (Fuse needs it)
- zlib.c: `skip_pkzip_header' and `libspectrum_pkzip_inflate' functions for handling zip archives
- libspectrum.c: some new types in `libspectrum_identify_file_raw' for zip, tar and cpio files
- some new code in `...identify_class' for archives
- new code in `...uncompress_file' to uncompress zip archive members
- a new static function `read_ascii_number' to read ASCII octal and hexadecimal numbers from tar and cpio headers
- a new function `...extract_member' to extract a given member, or the first member, of an archive
- internals.h: new declaration for `...pkzip_inflate'
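
The patch itself is not reproduced in this ticket, so the following is only a minimal sketch of what a helper like `read_ascii_number' might look like. The signature, the handling of padding and the return convention are assumptions made for illustration, not the patch's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical signature: parse a fixed-width ASCII number in base 8 or 16.
   Leading spaces are skipped; parsing stops at the first NUL or space, or
   after `width' bytes. Returns 0 on success, -1 on an invalid digit. */
static int
read_ascii_number( const unsigned char *field, size_t width, int base,
                   unsigned long *value )
{
  size_t i = 0;
  unsigned long v = 0;

  assert( field && value );
  assert( base == 8 || base == 16 );

  /* tar and cpio fields may be padded with spaces */
  while( i < width && field[i] == ' ' ) i++;

  for( ; i < width && field[i] != '\0' && field[i] != ' '; i++ ) {
    unsigned char c = field[i];
    int digit;

    if( c >= '0' && c <= '9' ) digit = c - '0';
    else if( c >= 'a' && c <= 'f' ) digit = c - 'a' + 10;
    else if( c >= 'A' && c <= 'F' ) digit = c - 'A' + 10;
    else return -1;

    if( digit >= base ) return -1;   /* e.g. '8' in an octal field */
    v = v * (unsigned long)base + (unsigned long)digit;
  }

  *value = v;
  return 0;
}
```

tar headers and the old ASCII cpio format store their numbers in octal, while the new ASCII cpio formats use hexadecimal, which is why one helper with a base argument is enough for both.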

2. fuse patch
- utils.h: two new members in the file struct: file type and file class
- fuse.c: removed an `...identify...' call, because the `file' structure now includes the class and type of the opened file
- utils.c: utils_open_file: see above :)
- utils_read_file: some new code to handle automatic decompression and unarchiving of files

If we try to open an archive or a compressed file (through utils_read_file or utils_open_file), Fuse uncompresses it or extracts the first (or the given) member of the archive. If the new 'file' is itself an archive or a compressed file, `utils_read_file' unarchives/uncompresses it too, and so on...
So we can open tar.gz or cpio.gz files... (or zip files :)
e.g.: fuse demo.tar.gz
-> `utils_read_file' reads it
-> and uncompresses it
-> the new 'file' name is demo.tar
-> now it extracts the first (regular file) member, e.g. `demo.mdr', from it

We can also point to a member of an archive:
e.g.: fuse Demo/demo.tar.gz/demos/Tube128K.csw
`utils_read_file' cannot open this path as a file, so it tries to locate the first 'real' file -> Demo/demo.tar.gz
It then opens this file, uncompresses it (demo.tar) and tries to extract the `demos/Tube128K.csw' member from it
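
The path lookup described above can be sketched roughly as follows. This is an illustration only: `split_archive_path', `is_file_fn' and `demo_is_file' are hypothetical names, the separator is assumed to be `/', and the real code would use a check on the actual file system instead of the stand-in predicate.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: returns non-zero if `path' names a real file. */
typedef int (*is_file_fn)( const char *path );

/* Stand-in predicate for the demonstration: pretends that only
   Demo/demo.tar.gz exists on disk. */
static int
demo_is_file( const char *path )
{
  return strcmp( path, "Demo/demo.tar.gz" ) == 0;
}

/* Strip trailing path components until the rest names a real file.
   On success the real file name is copied into `real' and a pointer to
   the member name inside `path' is returned (an empty string if the
   whole path is a file). Returns NULL if no real file is found or if
   `real' is too small. */
static const char *
split_archive_path( const char *path, is_file_fn is_file,
                    char *real, size_t real_size )
{
  size_t len;

  assert( path && is_file && real );

  len = strlen( path );
  if( len >= real_size ) return NULL;
  memcpy( real, path, len + 1 );

  for( ;; ) {
    char *sep;

    if( is_file( real ) ) {
      const char *member = path + strlen( real );
      while( *member == '/' ) member++;      /* skip the separator */
      return member;
    }

    sep = strrchr( real, '/' );
    if( !sep || sep == real ) return NULL;   /* nothing left to strip */
    *sep = '\0';
  }
}
```

With the stand-in predicate, `Demo/demo.tar.gz/demos/Tube128K.csw' splits into the real file `Demo/demo.tar.gz' and the member name `demos/Tube128K.csw'.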

~~~~~~~~~~~~~~~
what works -- what does not

0. general:
If we have nested archives (e.g. a zip file inside a zip file), we cannot point to a member of the inner archive(s), e.g. `a.zip/b.zip/demo.mdr' does not work, only `a.zip/b.zip'.
Of course, we cannot write to an archive or compressed file ;)

1. Zip:
Libspectrum uses the local file headers, not the central directory, so not all zip files work (e.g. those created from standard input or written to stdout).
It does not handle (and cannot identify) encryption, or any compression method other than `store' and `deflate'.
There is a 'hack' in the (libspectrum) code so that `...identify_file_raw' can correctly distinguish between a zip archive and a compressed zip member: `...extract_member' overwrites the 'version needed to extract' field with `MB' in the local header of the extracted file, so `...identify...' can identify it as a pkzip compressed 'file'.
It cannot identify the file types (regular/directory, etc.), so if the first member is e.g. a directory and we do not give a member name, Fuse fails to open it...
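
For readers unfamiliar with the zip layout, here is a rough sketch of how the cases listed above show up in a local file header. It is not taken from the patch (which, as noted, does not detect encryption); `check_local_header' and the `zip_check' values are hypothetical names. The offsets follow the published zip format: flags at offset 6, compression method at offset 8, 30 bytes of fixed header.

```c
#include <assert.h>
#include <stddef.h>

enum zip_check {
  ZIP_OK,          /* stored or deflated, sizes in the local header */
  ZIP_NOT_ZIP,     /* no local file header signature */
  ZIP_ENCRYPTED,   /* general purpose flag bit 0 is set */
  ZIP_STREAMED,    /* flag bit 3: sizes follow the data instead */
  ZIP_BAD_METHOD   /* compression method other than 0 or 8 */
};

/* Read a 16-bit little-endian value */
static unsigned
le16( const unsigned char *p )
{
  return (unsigned)p[0] | ( (unsigned)p[1] << 8 );
}

static enum zip_check
check_local_header( const unsigned char *buf, size_t length )
{
  unsigned flags, method;

  assert( buf );

  if( length < 30 ) return ZIP_NOT_ZIP;
  if( buf[0] != 'P' || buf[1] != 'K' || buf[2] != 3 || buf[3] != 4 )
    return ZIP_NOT_ZIP;

  flags = le16( buf + 6 );
  method = le16( buf + 8 );

  if( flags & 0x0001 ) return ZIP_ENCRYPTED;
  /* Archives written to a pipe set bit 3 and leave the sizes in the
     local header as zero, so a reader relying on them cannot find
     the end of the member */
  if( flags & 0x0008 ) return ZIP_STREAMED;
  if( method != 0 && method != 8 ) return ZIP_BAD_METHOD;

  return ZIP_OK;
}
```

The ZIP_STREAMED case is the likely reason why archives created from standard input or written to stdout do not work when only the local headers are read.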

2. tar:
Libspectrum handles only the old v7 and the GNU ustar formats.
It cannot handle file names longer than 100 characters (the ustar `prefix' field is not used).
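
The old v7 format has no magic string, so one common way to recognise a tar header is its checksum. The sketch below shows that technique under the standard 512-byte header layout (checksum field at offset 148, magic at offset 257); whether the patch identifies tar files this way is an assumption, and `identify_tar_header' is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TAR_BLOCK 512

enum tar_kind { TAR_NONE, TAR_V7, TAR_USTAR };

static enum tar_kind
identify_tar_header( const unsigned char *block )
{
  unsigned long sum = 0, stored = 0;
  size_t i;

  assert( block );

  /* Sum all header bytes, counting the 8 byte checksum field itself
     as if it were filled with spaces */
  for( i = 0; i < TAR_BLOCK; i++ )
    sum += ( i >= 148 && i < 156 ) ? (unsigned long)' ' : block[i];

  /* The stored checksum is an octal ASCII number, possibly padded */
  for( i = 148; i < 156 && block[i] == ' '; i++ )
    ;
  for( ; i < 156 && block[i] >= '0' && block[i] <= '7'; i++ )
    stored = stored * 8 + (unsigned long)( block[i] - '0' );

  if( sum != stored ) return TAR_NONE;

  /* Comparing only the first five bytes accepts both the `ustar'
     followed by NUL and the `ustar' followed by spaces variants */
  if( memcmp( block + 257, "ustar", 5 ) == 0 ) return TAR_USTAR;

  return TAR_V7;
}
```

An all-zero block, which marks the end of a tar archive, fails the checksum test and is reported as TAR_NONE.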

3. cpio:
Libspectrum handles the old binary (bin), the old ASCII (odc), the new ASCII (newc) and the new crc formats. The old binary format fails if an archive `crosses' the endianness boundary :(
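
The four cpio variants can be told apart by their magic numbers. The sketch below is an illustration, not the patch's code; `identify_cpio' and the enum names are hypothetical. It also shows why the binary format is sensitive to byte order: its magic is a 16-bit integer, so an archive written on a machine of the other endianness starts with the two bytes swapped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum cpio_kind {
  CPIO_NONE,
  CPIO_BIN_LE,   /* old binary, written little-endian */
  CPIO_BIN_BE,   /* old binary, written big-endian */
  CPIO_ODC,      /* old ASCII */
  CPIO_NEWC,     /* new ASCII */
  CPIO_CRC       /* new ASCII with checksum */
};

static enum cpio_kind
identify_cpio( const unsigned char *buf, size_t length )
{
  assert( buf );

  /* The ASCII formats start with a six character magic string */
  if( length >= 6 ) {
    if( memcmp( buf, "070707", 6 ) == 0 ) return CPIO_ODC;
    if( memcmp( buf, "070701", 6 ) == 0 ) return CPIO_NEWC;
    if( memcmp( buf, "070702", 6 ) == 0 ) return CPIO_CRC;
  }

  /* The binary magic is the 16-bit value 070707 octal (0x71c7) */
  if( length >= 2 ) {
    if( buf[0] == 0xc7 && buf[1] == 0x71 ) return CPIO_BIN_LE;
    if( buf[0] == 0x71 && buf[1] == 0xc7 ) return CPIO_BIN_BE;
  }

  return CPIO_NONE;
}
```

A reader that wants to accept binary archives from either kind of machine would have to byte-swap every 16-bit header field whenever the detected order differs from the host order.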

We need an `Archive Browser' widget so that users can easily pick a member from an archive.

imho: the `open file' widget(s) should let users type file names (as in the save dialog), not just select them...

Related

Feature Requests: #53
Patches: #342
Patches: #343
Wiki: Fuse 1.2.2 Release Plan

Discussion

  • Gergely Szasz

    Gergely Szasz - 2007-12-24

    Logged In: YES
    user_id=57243
    Originator: YES

    File Added: fuse.arch_01.diff

     
  • Gergely Szasz

    Gergely Szasz - 2007-12-24

    archive/compressed file handling

     
  • Gergely Szasz

    Gergely Szasz - 2007-12-24

    A new patch to libspectrum:
    - libspectrum now uses the `central directory' to determine the 'member size' and the file type (directory or not). The directory identification is very simple, so it may fail with some ZIP archives where this bit (extra data & 0x10) is not compatible with MSDOS/WINDOWS/UNIX... (AMIGA/MORPHOS???)
    File Added: libspectrum.arch_03.diff
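
A sketch of the directory test described in this comment, assuming that the bit in question is the MS-DOS directory attribute (0x10) in the `external file attributes' field of a central directory entry (offset 38 in the 46-byte fixed part). The function name is hypothetical and the patch may read the field differently.

```c
#include <assert.h>
#include <stddef.h>

/* Read a 32-bit little-endian value */
static unsigned long
le32( const unsigned char *p )
{
  return (unsigned long)p[0] | ( (unsigned long)p[1] << 8 ) |
         ( (unsigned long)p[2] << 16 ) | ( (unsigned long)p[3] << 24 );
}

/* Returns 1 for a directory, 0 for anything else, and -1 if `entry'
   is not a central directory file header */
static int
central_entry_is_directory( const unsigned char *entry, size_t length )
{
  assert( entry );

  if( length < 46 ) return -1;
  if( entry[0] != 'P' || entry[1] != 'K' || entry[2] != 1 || entry[3] != 2 )
    return -1;

  /* External file attributes; the low byte holds MS-DOS attribute bits */
  return ( le32( entry + 38 ) & 0x10 ) ? 1 : 0;
}
```

As the comment says, this test is only as reliable as the archiver that wrote the attribute, so archives from systems with other attribute conventions may be misidentified.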

     
  • Gergely Szasz

    Gergely Szasz - 2007-12-26

    A new libspectrum patch with some fixes:

    - `libspectrum_extract_member' declaration in libspectrum.h.in
    - in `libspectrum_identify_file_raw' the ustar tar archive magic is now ustar\0 instead of ustar
    - in `libspectrum_extract_member' some error codes fixed; the 'member not found' error is no longer reported, the function just returns; fixed the tar end-of-archive detection
    File Added: libspectrum.arch_04.diff

     
  • Gergely Szasz

    Gergely Szasz - 2007-12-27

    This libspectrum patch adds a new function:
    `libspectrum_list_members', which extracts the names of all `usable' members from an archive; Fuse can use this list in the archive browser.
    File Added: libspectrum.arch_05.diff

     
  • Gergely Szasz

    Gergely Szasz - 2007-12-27

    archive browser

     
  • Gergely Szasz

    Gergely Szasz - 2007-12-27

    This patch adds a simple archive browser, only for the WIDGET UI(s)... (needs libspectrum.arch_05.diff)
    ..hmm...

    There are some problems:
    Fuse sometimes opens a file 2 or even 3 times (is this a bug???), so if Fuse has to ask something (the chosen member name), it asks 2 or 3 times...

    loading a file from the command line:
    1. `fuse.c:parse_nonoption_args' opens the given files to determine the class and fills the `start_files' structure
    2. `fuse.c:do_start_files' opens the given files to determine the class and loads them with the appropriate loader
    3. the loader opens the file again

    loading a file from the menu: File->Open
    1. `utils.c:utils_open_file' opens the file and loads it with the appropriate loader
    2. some loaders can `load' from a buffer (e.g. tape), others cannot (e.g. mdr), and some do not use `utils_read_file' at all (e.g. the disk.c backend or if2_insert)

    ~~~~~~~~~~~
    what can we do?

    1. somehow cache the chosen member names across functions... so the second and third time Fuse automatically chooses the same ones

    2. a. implement `...read_buffer' everywhere, so utils_open_file can use these functions (like `tape_read_buffer')
    b. in some places Fuse uses `utils_find_auxiliary_file' and reads the opened file with `utils_read_fd' (machine.c:machine_load_rom_bank_internal, tape.c:tape_autoload, specplus3.c: specplus_disk_insert, widget.c:widget_read_font, menu.c:menu_help_keyboard, picture.c:read_screen). If we extend `utils_find_auxiliary_file' with file reading, then it can use `utils_read_file'... (this may be useful e.g. in if2_insert)
    c. in the other places (e.g. ide or dck stuff) we have to do something else: e.g. unpack the identified file to a temporary file which the open routines use later... (read only)
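
Option 1 above could be as small as the following sketch: a fixed-size table that remembers which member was chosen for which archive, so that a second or third open of the same archive does not have to ask again. All names and sizes here are hypothetical and nothing like this is in the attached patches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CACHE_SLOTS 8
#define CACHE_NAME_MAX 256

struct member_choice {
  char archive[ CACHE_NAME_MAX ];   /* empty string marks a free slot */
  char member[ CACHE_NAME_MAX ];
};

static struct member_choice cache[ CACHE_SLOTS ];
static size_t cache_next;           /* next slot to overwrite */

/* Remember the member chosen for `archive'. Returns 0 on success,
   -1 if a name is empty or too long. */
static int
remember_member( const char *archive, const char *member )
{
  struct member_choice *slot = NULL;
  size_t i;

  assert( archive && member );

  if( archive[0] == '\0' || strlen( archive ) >= CACHE_NAME_MAX ||
      strlen( member ) >= CACHE_NAME_MAX )
    return -1;

  /* Update the existing entry for this archive, if there is one */
  for( i = 0; i < CACHE_SLOTS; i++ )
    if( strcmp( cache[i].archive, archive ) == 0 ) { slot = &cache[i]; break; }

  /* Otherwise overwrite the oldest slot */
  if( !slot ) {
    slot = &cache[ cache_next ];
    cache_next = ( cache_next + 1 ) % CACHE_SLOTS;
  }

  strcpy( slot->archive, archive );
  strcpy( slot->member, member );
  return 0;
}

/* Returns the remembered member name, or NULL if none is cached */
static const char *
recall_member( const char *archive )
{
  size_t i;

  assert( archive );
  if( archive[0] == '\0' ) return NULL;

  for( i = 0; i < CACHE_SLOTS; i++ )
    if( strcmp( cache[i].archive, archive ) == 0 ) return cache[i].member;

  return NULL;
}
```

The archive browser would call `recall_member' first and only show the dialog when it returns NULL. A cache like this hides the repeated prompts but not the repeated opens, which option 2 addresses.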

    ...

    File Added: fuse.arch_04.diff

     
  • Gergely Szasz

    Gergely Szasz - 2007-12-27

    This patch fixes a bug in utils_read_file->`skip first FUSE_DIR_SEP_CHR'
    File Added: fuse.arch_05.diff

     
  • Gergely Szasz

    Gergely Szasz - 2007-12-27

    skip first `/' neverending story fixed

     
  • Gergely Szasz

    Gergely Szasz - 2007-12-30

    This patch fixes a realloc bug in `libspectrum_list_members'...
    File Added: libspectrum.arch_06.diff

     
  • Gergely Szasz

    Gergely Szasz - 2007-12-30

    archive browser for GTK+ UI

     
  • Gergely Szasz

    Gergely Szasz - 2007-12-30

    This patch adds an archive browser for the GTK UI too.
    File Added: fuse.arch_06.diff

     
