
#51 mksquashfs segfaulted at the end of the process

Milestone: v1.0 (example)
Status: closed-fixed
Owner: nobody
Labels: None
Priority: 1
Updated: 2017-02-04
Created: 2013-07-17
Creator: Steve Kieu
Private: No

This is 100% reproducible. When I compress a huge folder (the resulting sqs image is about 296G near the end, with the gzip compressor), it always segfaults when (I think) it reaches 100%.

I compiled version 4.2 from source. It dies and does not produce a core file.

It is critical for my project. Please fix it. I am looking into the source code too in the meantime.

Discussion

  • Steve Kieu

    Steve Kieu - 2013-07-17

    Also, I have tried the -no-duplicates option, which does not solve the problem.

     
  • Steve Kieu

    Steve Kieu - 2013-07-17

    I was pretty disappointed that the following fix has not been merged into the 'official' download tarball of squashfs4.2:

    diff --git a/squashfs-tools/mksquashfs.c b/squashfs-tools/mksquashfs.c
    index 48a260d..063c081 100644
    --- a/squashfs-tools/mksquashfs.c
    +++ b/squashfs-tools/mksquashfs.c
    @@ -1920,7 +1920,6 @@ long long generic_write_table(int length, void *buffer, int length2,
     long long write_fragment_table()
     {
     	unsigned int frag_bytes = SQUASHFS_FRAGMENT_BYTES(fragments);
    -	struct squashfs_fragment_entry p[fragments];
     	int i;
     
     	TRACE("write_fragment_table: fragments %d, frag_bytes %d\n", fragments,
    @@ -1929,10 +1928,10 @@ long long write_fragment_table()
     		TRACE("write_fragment_table: fragment %d, start_block 0x%llx, "
     			"size %d\n", i, fragment_table[i].start_block,
     			fragment_table[i].size);
    -		SQUASHFS_SWAP_FRAGMENT_ENTRY(&fragment_table[i], p + i);
    +		SQUASHFS_INSWAP_FRAGMENT_ENTRY(&fragment_table[i]);
     	}
     
    -	return generic_write_table(frag_bytes, p, 0, NULL, noF);
    +	return generic_write_table(frag_bytes, fragment_table, 0, NULL, noF);
     }

    It has been a year now and the team has not merged it in or released a new version.

    This may fix my problem. I am currently running it to see if it is OK. Another day or so <sigh> to see the result.
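
    For context, the removed line declared a variable-length array on the stack, sized by the fragment count. A rough sketch of the arithmetic (the fragment count below is an assumed, illustrative figure, not one measured from this image) shows why a very large filesystem could fault right at the end of the run:

    #include <stdio.h>

    int main(void)
    {
        /* sizeof(struct squashfs_fragment_entry): an 8-byte start_block,
         * a 4-byte size and 4 bytes of padding = 16 bytes */
        const long entry_size = 16;
        /* assumed, plausible fragment count for a ~296G image */
        const long fragments = 2400000;
        /* common default stack limit (ulimit -s) */
        const long stack_limit = 8L << 20;

        long vla_bytes = fragments * entry_size;  /* stack the old VLA needed */
        printf("VLA needs %ld MB of stack; limit is about %ld MB\n",
               vla_bytes >> 20, stack_limit >> 20);
        return 0;
    }

    With the fix, the entries are byte-swapped in place and fragment_table itself is passed to generic_write_table(), so no large on-stack array is needed.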

     
    • Phillip Lougher

      Phillip Lougher - 2013-07-17

      I was pretty disappointed that the following fix
      has not been merged into the 'official' download
      tarball of squashfs4.2. It has been a year now and
      the team has not merged it in or released a new version.

      One person, working in spare time only.

       
      • Steve Kieu

        Steve Kieu - 2013-07-18

        Sorry for the rant :-)

         
  • Phillip Lougher

    Phillip Lougher - 2013-07-17

    This sounds like a bug fixed in the development version. You should download and try that.

     
  • Steve Kieu

    Steve Kieu - 2013-07-18

    Have tried the latest git version. This time mksquashfs did not segfault; everything looked fine until I mounted it. Mount says it cannot allocate memory, and the kernel log says:

    SQUASHFS error: unable to read inode lookup table

    This is kernel 3.9.9 vanilla from kernel.org with squashfs built in. Do I need to rebuild the kernel module?

     
    • Phillip Lougher

      Phillip Lougher - 2013-07-18

      Have tried the latest git version. This time mksquashfs did not segfault

      Hopefully this means the big filesystem related bugs in the squashfs-tools I was getting reported last year have been fixed.

      everything looked fine until I mounted it. Mount says it cannot allocate memory, and the kernel log says:

      SQUASHFS error: unable to read inode lookup table

      Hmm, that's a new bug; I have never had that reported before now!

      The inode lookup table is a two-tier table; the top level points to individually compressed blocks of 8 Kbytes, each of which contains lookup information for 1024 inodes.

      The kernel code tries to allocate and store the top-level table in memory using kmalloc; this has a maximum size of 128 Kbytes.

      The above error implies that the number of inodes in your filesystem makes the top-level table exceed 128 Kbytes; that threshold corresponds to 2^24, or 16,777,216 inodes.

      What does unsquashfs -s <filesystem image> output?

      If it shows that the inode count exceeds the above 16,777,216, and that's the expected number of inodes (a bug in mksquashfs could conceivably be storing a rogue inode value), then I'll send you a patch to the kernel code to use vmalloc rather than kmalloc (this does not have the 128 Kbyte limit).
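
      As a quick sketch of where that figure comes from (constants taken from the description above, not read out of the kernel headers):

      #include <stdio.h>

      int main(void)
      {
          const long kmalloc_max = 128 * 1024; /* traditional kmalloc limit */
          const long entry_size = 8;           /* one 64-bit pointer per 8K block */
          const long inodes_per_block = 1024;  /* lookup entries per 8K block */

          long max_blocks = kmalloc_max / entry_size;      /* 16384 */
          long max_inodes = max_blocks * inodes_per_block; /* 16777216 */
          printf("max inodes before hitting the kmalloc limit: %ld\n", max_inodes);
          return 0;
      }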

       
  • Steve Kieu

    Steve Kieu - 2013-07-18

    I tested the same version with a smaller filesystem, which works fine. Thus I am convinced there is still something wrong when dealing with a big squashfs file (>296G in my case). Not sure what to do next to debug this.

     
  • Steve Kieu

    Steve Kieu - 2013-07-18

    Unfortunately the unmountable squashfs file has been removed due to space constraints.

    The kernel message is below; maybe it helps.

    Jul 19 07:43:19 voipmonitor kernel: [465110.860489] mount: page allocation failure: order:4, mode:0x1040d0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860495] Pid: 24891, comm: mount Tainted: G C 3.9.9-x64std #7
    Jul 19 07:43:19 voipmonitor kernel: [465110.860497] Call Trace:
    Jul 19 07:43:19 voipmonitor kernel: [465110.860505] [<ffffffff810cb891>] ? warn_alloc_failed+0xe1/0x130
    Jul 19 07:43:19 voipmonitor kernel: [465110.860510] [<ffffffff810dd41d>] ? next_online_pgdat+0x1d/0x50
    Jul 19 07:43:19 voipmonitor kernel: [465110.860514] [<ffffffff810ce692>] ? drain_pages+0x32/0x90
    Jul 19 07:43:19 voipmonitor kernel: [465110.860517] [<ffffffff810ceeb7>] ? __alloc_pages_nodemask+0x777/0xa10
    Jul 19 07:43:19 voipmonitor kernel: [465110.860523] [<ffffffff81104095>] ? alloc_pages_current+0xb5/0x180
    Jul 19 07:43:19 voipmonitor kernel: [465110.860527] [<ffffffff811d7720>] ? squashfs_alloc_inode+0x30/0x30
    Jul 19 07:43:19 voipmonitor kernel: [465110.860530] [<ffffffff810cabe5>] ? __get_free_pages+0x5/0x40
    Jul 19 07:43:19 voipmonitor kernel: [465110.860535] [<ffffffff811d5893>] ? squashfs_read_table+0x43/0x130
    Jul 19 07:43:19 voipmonitor kernel: [465110.860538] [<ffffffff811d7720>] ? squashfs_alloc_inode+0x30/0x30
    Jul 19 07:43:19 voipmonitor kernel: [465110.860541] [<ffffffff811d5e4a>] ? squashfs_read_inode_lookup_table+0x3a/0x60
    Jul 19 07:43:19 voipmonitor kernel: [465110.860544] [<ffffffff811d7adc>] ? squashfs_fill_super+0x3bc/0x680
    Jul 19 07:43:19 voipmonitor kernel: [465110.860547] [<ffffffff811d7720>] ? squashfs_alloc_inode+0x30/0x30
    Jul 19 07:43:19 voipmonitor kernel: [465110.860552] [<ffffffff811226ac>] ? mount_bdev+0x1cc/0x210
    Jul 19 07:43:19 voipmonitor kernel: [465110.860555] [<ffffffff81123065>] ? mount_fs+0x45/0x1d0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860560] [<ffffffff8113bcc2>] ? vfs_kern_mount+0x72/0x120
    Jul 19 07:43:19 voipmonitor kernel: [465110.860562] [<ffffffff8113da55>] ? do_mount+0x245/0xa00
    Jul 19 07:43:19 voipmonitor kernel: [465110.860566] [<ffffffff810dcfa4>] ? memdup_user+0x44/0x90
    Jul 19 07:43:19 voipmonitor kernel: [465110.860569] [<ffffffff8113e2a5>] ? sys_mount+0x95/0xf0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860573] [<ffffffff814a87ed>] ? system_call_fastpath+0x1a/0x1f

    Jul 19 07:43:19 voipmonitor kernel: [465110.860575] Mem-Info:
    Jul 19 07:43:19 voipmonitor kernel: [465110.860576] Node 0 DMA per-cpu:
    Jul 19 07:43:19 voipmonitor kernel: [465110.860579] CPU 0: hi: 0, btch: 1 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860581] CPU 1: hi: 0, btch: 1 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860583] CPU 2: hi: 0, btch: 1 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860585] CPU 3: hi: 0, btch: 1 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860587] CPU 4: hi: 0, btch: 1 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860589] CPU 5: hi: 0, btch: 1 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860591] CPU 6: hi: 0, btch: 1 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860593] CPU 7: hi: 0, btch: 1 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860594] Node 0 DMA32 per-cpu:
    Jul 19 07:43:19 voipmonitor kernel: [465110.860597] CPU 0: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860598] CPU 1: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860600] CPU 2: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860602] CPU 3: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860604] CPU 4: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860606] CPU 5: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860608] CPU 6: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860610] CPU 7: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860611] Node 0 Normal per-cpu:
    Jul 19 07:43:19 voipmonitor kernel: [465110.860613] CPU 0: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860615] CPU 1: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860617] CPU 2: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860619] CPU 3: hi: 186, btch: 31 usd: 32
    Jul 19 07:43:19 voipmonitor kernel: [465110.860621] CPU 4: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860623] CPU 5: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860625] CPU 6: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860626] CPU 7: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860631] active_anon:1427265 inactive_anon:247154 isolated_anon:0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860631] active_file:229931 inactive_file:243996 isolated_file:32
    Jul 19 07:43:19 voipmonitor kernel: [465110.860631] unevictable:988 dirty:330299 writeback:14677 unstable:0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860631] free:40496 slab_reclaimable:36875 slab_unreclaimable:27861
    Jul 19 07:43:19 voipmonitor kernel: [465110.860631] mapped:2285 shmem:29 pagetables:11016 bounce:0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860631] free_cma:0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860636] Node 0 DMA free:15884kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15988kB managed:15884kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
    Jul 19 07:43:19 voipmonitor kernel: [465110.860642] lowmem_reserve[]: 0 1997 16075 16075
    Jul 19 07:43:19 voipmonitor kernel: [465110.860646] Node 0 DMA32 free:71684kB min:8388kB low:10484kB high:12580kB active_anon:642948kB inactive_anon:245780kB active_file:268564kB inactive_file:296688kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2069408kB managed:2045256kB mlocked:0kB dirty:426896kB writeback:17344kB mapped:136kB shmem:8kB slab_reclaimable:94632kB slab_unreclaimable:61228kB kernel_stack:840kB pagetables:4888kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
    Jul 19 07:43:19 voipmonitor kernel: [465110.860653] lowmem_reserve[]: 0 0 14078 14078
    Jul 19 07:43:19 voipmonitor kernel: [465110.860656] Node 0 Normal free:74416kB min:59128kB low:73908kB high:88692kB active_anon:5066112kB inactive_anon:742836kB active_file:651160kB inactive_file:679296kB unevictable:3952kB isolated(anon):0kB isolated(file):128kB present:14680064kB managed:14415908kB mlocked:3952kB dirty:894300kB writeback:41364kB mapped:9004kB shmem:108kB slab_reclaimable:52868kB slab_unreclaimable:50216kB kernel_stack:2160kB pagetables:39176kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:32 all_unreclaimable? no
    Jul 19 07:43:19 voipmonitor kernel: [465110.860663] lowmem_reserve[]: 0 0 0 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860666] Node 0 DMA: 3*4kB (U) 2*8kB (U) 1*16kB (U) 1*32kB (U) 3*64kB (U) 2*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15884kB
    Jul 19 07:43:19 voipmonitor kernel: [465110.860679] Node 0 DMA32: 17569*4kB (UEM) 3*8kB (E) 10*16kB (R) 7*32kB (R) 4*64kB (R) 1*128kB (R) 2*256kB (R) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 71580kB
    Jul 19 07:43:19 voipmonitor kernel: [465110.860691] Node 0 Normal: 17799*4kB (UEM) 7*8kB (E) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB (R) = 75348kB
    Jul 19 07:43:19 voipmonitor kernel: [465110.860701] 620568 total pagecache pages
    Jul 19 07:43:19 voipmonitor kernel: [465110.860703] 145836 pages in swap cache
    Jul 19 07:43:19 voipmonitor kernel: [465110.860705] Swap cache stats: add 28760097, delete 28614261, find 7780999/10572117
    Jul 19 07:43:19 voipmonitor kernel: [465110.860714] Free swap = 22568728kB
    Jul 19 07:43:19 voipmonitor kernel: [465110.860716] Total swap = 33525756kB
    Jul 19 07:43:19 voipmonitor kernel: [465110.926155] 4194303 pages RAM
    Jul 19 07:43:19 voipmonitor kernel: [465110.926158] 70965 pages reserved
    Jul 19 07:43:19 voipmonitor kernel: [465110.926159] 781330 pages shared
    Jul 19 07:43:19 voipmonitor kernel: [465110.926160] 2102813 pages non-shared

    At first I started to think it is a memory fragmentation problem, but it does not look like it. (I use frontswap and zcache on that system.)

    There is no harm in trying a new kernel patch to remove that limit; please email the patch to me at msh dot computing at gmail dot com so I can try it. We will have another image after today (it takes about 8 hours, and longer in the daytime as I have to set it to be nicer to the other processes), so it should be available tomorrow.

    Many thanks

     
    • Phillip Lougher

      Phillip Lougher - 2013-07-22

      Jul 19 07:43:19 voipmonitor kernel: [465110.860489] mount: page allocation failure: order:4, mode:0x1040d0

      This shows your system is suffering from fragmentation.

      This is an order 4 allocation (64K) which has failed. This can happen when there's memory pressure.

       
  • Steve Kieu

    Steve Kieu - 2013-07-19

    Just did a quick count for a typical day's folder:

    find 2013-07-18/ -printf "%i\n" | sort -u | wc -l
    6497568

    Looks like it does not reach the 16,777,216 threshold; maybe it is another problem?

    Nevertheless I still want to test/try anything that would make it work

    Thanks,

     
    • Phillip Lougher

      Phillip Lougher - 2013-07-22

      find 2013-07-18/ -printf "%i\n" | sort -u | wc -l
      6497568
      Looks like it does not reach the 16,777,216 threshold; maybe it is another problem?

      This matches the order 4 failure (64K) seen. For that number of inodes Squashfs will need ~ 50 Kbytes (which will need an order 4 page allocation).
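
      A sketch of that arithmetic, assuming a 4K page size:

      #include <stdio.h>

      int main(void)
      {
          const long inodes = 6497568;
          const long page_size = 4096;

          long blocks = (inodes + 1023) / 1024; /* compressed 8K lookup blocks */
          long bytes = blocks * 8;              /* one 64-bit pointer per block */

          int order = 0;                        /* smallest order that fits */
          while ((page_size << order) < bytes)
              order++;

          /* prints: table needs 50768 bytes -> order 4 (65536 bytes) */
          printf("table needs %ld bytes -> order %d (%ld bytes)\n",
                 bytes, order, page_size << order);
          return 0;
      }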

       
  • Steve Kieu

    Steve Kieu - 2013-07-22

    OK. It does not look like the inode lookup table limit in my case.

    If, after compressing with mksquashfs, I do a rm -rf big_folder and then mount -o loop squashfs.img right away, it fails with the error above.

    However, if I wait about 15 minutes and do the mount again, it is OK.

    Or, better yet, do not remove the folder right away: mount the image first, and it will be OK; then remove the folder afterwards.

    It looks like a bug in the kernel memory allocation!

     
    • Phillip Lougher

      Phillip Lougher - 2013-07-22

      Yes, that is to be expected with transient memory pressure; sometimes it will work, sometimes it won't...

      I will attach a patch that converts the kmallocs to vmallocs that should eliminate the failures. It is against 3.9.9 and can be applied via git am, or just patch.

       
  • Phillip Lougher

    Phillip Lougher - 2013-07-22

    Patch to convert kmallocs to vmallocs that should eliminate the failures. It is against 3.9.9 and can be applied via git am, or just patch.

     
  • Steve Kieu

    Steve Kieu - 2013-07-22

    Thanks Phillip. I will try it and let you know if there is any problem.

     
  • Steve Kieu

    Steve Kieu - 2013-07-22

    Is it true that kmalloc no longer has the 128 Kbyte limit in recent versions?

    http://kaiwantech.wordpress.com/2011/08/17/kmalloc-and-vmalloc-linux-kernel-memory-allocation-api-limits/

    If it does, then maybe we do not need the patch at all? I assume that vmalloc can overcome the memory fragmentation problem because it remaps the buffer space into a virtually contiguous range; if so, then there is still a benefit to using it.

     
    • Phillip Lougher

      Phillip Lougher - 2013-07-22

      The question of what is the maximum size of kmalloc isn't really applicable here. The kmallocs in question are failing within the traditional limit of 128 Kbytes.

      The general rule of thumb with kmalloc is to make the allocation as small as possible, because with memory fragmentation there is never any guarantee that anything larger than a page-sized allocation (normally 4K) will not fail. The larger the kmalloc attempted, the greater the likelihood that it will fail. Squashfs only attempts to allocate more than 4K with kmalloc in the isolated circumstances identified here, for that reason. It only does so here because this is old code, written when Squashfs filesystems were much smaller; the general assumption when the code was reviewed a few years back was that systems using "big" Squashfs filesystems would have lots of memory, so the higher-order kmallocs would in practice not cause a problem. Your bug report shows that such an assumption does not always hold true.

      Vmalloc overcomes fragmentation because, unlike kmalloc, it does not return memory which is physically contiguous. The memory returned by vmalloc is mapped into the vmalloc address space so it appears virtually contiguous, but the underlying pages do not need to be physically contiguous. As such, on systems suffering fragmentation there is a definite advantage in using it.
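
      A minimal sketch of what such a conversion looks like in kernel code (illustrative only; read_table_example is a hypothetical helper, not the actual Squashfs function, and error handling is elided):

      #include <linux/slab.h>
      #include <linux/vmalloc.h>

      static void *read_table_example(size_t length)
      {
              /* Before: kmalloc demands physically contiguous pages, so a
               * large table becomes a high-order allocation that can fail
               * under fragmentation:
               *
               *         return kmalloc(length, GFP_KERNEL);  // free with kfree()
               */

              /* After: vmalloc builds the table out of individual pages and
               * maps them so they appear virtually contiguous; it must be
               * paired with vfree() rather than kfree(). */
              return vmalloc(length);
      }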

       
  • Phillip Lougher

    Phillip Lougher - 2014-05-15
    • status: open --> closed-fixed
     
  • Phillip Lougher

    Phillip Lougher - 2014-05-15

    Bug in Mksquashfs fixed in Squashfs tools 4.3 release

     
  • Jaap Versteegh

    Jaap Versteegh - 2017-02-04

    Not sure if it is related, but I get "SQUASHFS error: unable to read inode lookup table" with a 70M inode count, using the 4.3-3-ubuntu2 package on x86_64.

     
