
#51 mksquashfs segfaulted at the end of the process

Milestone: v1.0 (example)
Status: closed-fixed
Owner: nobody
Labels: None
Priority: 1
Updated: 2017-02-04
Created: 2013-07-17
Creator: Steve Kieu
Private: No

This is 100% reproducible. When I compress a huge folder (the resulting sqs image is about 296G near the end, with the gzip compressor), it always segfaults when (I think) it reaches 100%.

I compiled version 4.2 from source. It dies and does not produce a core file.

It is critical for my project. Please fix it. I am looking into the source code too in the meantime.

Discussion

  • Steve Kieu

    Steve Kieu - 2013-07-17

    Also, I have tried the -no-duplicates option, which does not solve the problem.

     
  • Steve Kieu

    Steve Kieu - 2013-07-17

    I was pretty disappointed that the following fix has not been merged into the 'official' download tarball of squashfs4.2:

    diff --git a/squashfs-tools/mksquashfs.c b/squashfs-tools/mksquashfs.c
    index 48a260d..063c081 100644
    --- a/squashfs-tools/mksquashfs.c
    +++ b/squashfs-tools/mksquashfs.c
    @@ -1920,7 +1920,6 @@ long long generic_write_table(int length, void *buffer, int length2,
     long long write_fragment_table()
     {
     	unsigned int frag_bytes = SQUASHFS_FRAGMENT_BYTES(fragments);
    -	struct squashfs_fragment_entry p[fragments];
     	int i;
     
     	TRACE("write_fragment_table: fragments %d, frag_bytes %d\n", fragments,
    @@ -1929,10 +1928,10 @@ long long write_fragment_table()
     		TRACE("write_fragment_table: fragment %d, start_block 0x%llx, "
     			"size %d\n", i, fragment_table[i].start_block,
     			fragment_table[i].size);
    -		SQUASHFS_SWAP_FRAGMENT_ENTRY(&fragment_table[i], p + i);
    +		SQUASHFS_INSWAP_FRAGMENT_ENTRY(&fragment_table[i]);
     	}
     
    -	return generic_write_table(frag_bytes, p, 0, NULL, noF);
    +	return generic_write_table(frag_bytes, fragment_table, 0, NULL, noF);
     }

    It has been a year now and the team has not merged it in or released a new version.

    This may fix my problem. I am currently running it to see if it is OK. Another day or so <sigh> to see the result.
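
    For context, the removed line declared a variable-length array on the stack, sized by the fragment count. A rough sketch of the arithmetic (the fragment count below is an assumed, illustrative figure, not one measured from this image) shows why a very large filesystem could fault right at the end of the run:

    #include <stdio.h>

    int main(void)
    {
        /* sizeof(struct squashfs_fragment_entry): an 8-byte start_block,
         * a 4-byte size and 4 bytes of padding = 16 bytes */
        const long entry_size = 16;
        /* assumed, plausible fragment count for a ~296G image */
        const long fragments = 2400000;
        /* common default stack limit (ulimit -s) */
        const long stack_limit = 8L << 20;

        long vla_bytes = fragments * entry_size;  /* stack the old VLA needed */
        printf("VLA needs %ld MB of stack; limit is about %ld MB\n",
               vla_bytes >> 20, stack_limit >> 20);
        return 0;
    }

    With the fix, the entries are byte-swapped in place and fragment_table itself is passed to generic_write_table(), so no large on-stack array is needed.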

     
    • Phillip Lougher

      Phillip Lougher - 2013-07-17

      I was pretty disappointed that the following fix
      has not been merged into the 'official' download
      tarball of squashfs4.2. It has been a year now and
      the team has not merged it in or released a new version.

      One person, working in spare time only.

       
      • Steve Kieu

        Steve Kieu - 2013-07-18

        Sorry for the rant :-)

         
  • Phillip Lougher

    Phillip Lougher - 2013-07-17

    This sounds like a bug fixed in the development version. You should download and try that.

     
  • Steve Kieu

    Steve Kieu - 2013-07-18

    Have tried the latest git version. This time mksquashfs did not segfault; everything looked fine until I mounted it. Mount says it cannot allocate memory, and the kernel log says:

    SQUASHFS error: unable to read inode lookup table

    This is kernel 3.9.9 vanilla from kernel.org with squashfs built in. Do I need to rebuild the kernel module?

     
    • Phillip Lougher

      Phillip Lougher - 2013-07-18

      Have tried the latest git version. This time mksquashfs did not segfault

      Hopefully this means the big filesystem related bugs in the squashfs-tools I was getting reported last year have been fixed.

      everything looked fine until I mounted it. Mount says it cannot allocate memory, and the kernel log says:

      SQUASHFS error: unable to read inode lookup table

      Hmm, that's a new bug; I have never had that reported before now!

      The inode lookup table is a two-tier table; the top level points to individually compressed blocks of 8 Kbytes, each of which contains lookup information for 1024 inodes.

      The kernel code tries to allocate and store the top-level table in memory using kmalloc; this has a maximum size of 128 Kbytes.

      The above error implies that the number of inodes in your filesystem makes the top-level table exceed 128 Kbytes; that threshold corresponds to 2^24, or 16,777,216 inodes.

      What does unsquashfs -s <filesystem image> output?

      If it shows that the inode count exceeds the above 16,777,216, and that's the expected number of inodes (a bug in mksquashfs could conceivably be storing a rogue inode value), then I'll send you a patch to the kernel code to use vmalloc rather than kmalloc (this does not have the 128 Kbyte limit).
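
      As a quick sketch of where that figure comes from (constants taken from the description above, not read out of the kernel headers):

      #include <stdio.h>

      int main(void)
      {
          const long kmalloc_max = 128 * 1024; /* traditional kmalloc limit */
          const long entry_size = 8;           /* one 64-bit pointer per 8K block */
          const long inodes_per_block = 1024;  /* lookup entries per 8K block */

          long max_blocks = kmalloc_max / entry_size;      /* 16384 */
          long max_inodes = max_blocks * inodes_per_block; /* 16777216 */
          printf("max inodes before hitting the kmalloc limit: %ld\n", max_inodes);
          return 0;
      }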

       
  • Steve Kieu

    Steve Kieu - 2013-07-18

    I tested the same version with a smaller filesystem, which works fine. Thus I am convinced there is still something wrong when dealing with a big squashfs file (>296G in my case). Not sure what to do next to debug this.

     
  • Steve Kieu

    Steve Kieu - 2013-07-18

    Unfortunately the unmountable squashfs file has been removed due to space constraints.

    The kernel message is below; maybe it helps.

    Jul 19 07:43:19 voipmonitor kernel: [465110.860489] mount: page allocation failure: order:4, mode:0x1040d0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860495] Pid: 24891, comm: mount Tainted: G C 3.9.9-x64std #7
    Jul 19 07:43:19 voipmonitor kernel: [465110.860497] Call Trace:
    Jul 19 07:43:19 voipmonitor kernel: [465110.860505] [<ffffffff810cb891>] ? warn_alloc_failed+0xe1/0x130
    Jul 19 07:43:19 voipmonitor kernel: [465110.860510] [<ffffffff810dd41d>] ? next_online_pgdat+0x1d/0x50
    Jul 19 07:43:19 voipmonitor kernel: [465110.860514] [<ffffffff810ce692>] ? drain_pages+0x32/0x90
    Jul 19 07:43:19 voipmonitor kernel: [465110.860517] [<ffffffff810ceeb7>] ? __alloc_pages_nodemask+0x777/0xa10
    Jul 19 07:43:19 voipmonitor kernel: [465110.860523] [<ffffffff81104095>] ? alloc_pages_current+0xb5/0x180
    Jul 19 07:43:19 voipmonitor kernel: [465110.860527] [<ffffffff811d7720>] ? squashfs_alloc_inode+0x30/0x30
    Jul 19 07:43:19 voipmonitor kernel: [465110.860530] [<ffffffff810cabe5>] ? __get_free_pages+0x5/0x40
    Jul 19 07:43:19 voipmonitor kernel: [465110.860535] [<ffffffff811d5893>] ? squashfs_read_table+0x43/0x130
    Jul 19 07:43:19 voipmonitor kernel: [465110.860538] [<ffffffff811d7720>] ? squashfs_alloc_inode+0x30/0x30
    Jul 19 07:43:19 voipmonitor kernel: [465110.860541] [<ffffffff811d5e4a>] ? squashfs_read_inode_lookup_table+0x3a/0x60
    Jul 19 07:43:19 voipmonitor kernel: [465110.860544] [<ffffffff811d7adc>] ? squashfs_fill_super+0x3bc/0x680
    Jul 19 07:43:19 voipmonitor kernel: [465110.860547] [<ffffffff811d7720>] ? squashfs_alloc_inode+0x30/0x30
    Jul 19 07:43:19 voipmonitor kernel: [465110.860552] [<ffffffff811226ac>] ? mount_bdev+0x1cc/0x210
    Jul 19 07:43:19 voipmonitor kernel: [465110.860555] [<ffffffff81123065>] ? mount_fs+0x45/0x1d0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860560] [<ffffffff8113bcc2>] ? vfs_kern_mount+0x72/0x120
    Jul 19 07:43:19 voipmonitor kernel: [465110.860562] [<ffffffff8113da55>] ? do_mount+0x245/0xa00
    Jul 19 07:43:19 voipmonitor kernel: [465110.860566] [<ffffffff810dcfa4>] ? memdup_user+0x44/0x90
    Jul 19 07:43:19 voipmonitor kernel: [465110.860569] [<ffffffff8113e2a5>] ? sys_mount+0x95/0xf0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860573] [<ffffffff814a87ed>] ? system_call_fastpath+0x1a/0x1f

    Jul 19 07:43:19 voipmonitor kernel: [465110.860575] Mem-Info:
    Jul 19 07:43:19 voipmonitor kernel: [465110.860576] Node 0 DMA per-cpu:
    Jul 19 07:43:19 voipmonitor kernel: [465110.860579] CPU 0: hi: 0, btch: 1 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860581] CPU 1: hi: 0, btch: 1 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860583] CPU 2: hi: 0, btch: 1 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860585] CPU 3: hi: 0, btch: 1 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860587] CPU 4: hi: 0, btch: 1 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860589] CPU 5: hi: 0, btch: 1 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860591] CPU 6: hi: 0, btch: 1 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860593] CPU 7: hi: 0, btch: 1 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860594] Node 0 DMA32 per-cpu:
    Jul 19 07:43:19 voipmonitor kernel: [465110.860597] CPU 0: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860598] CPU 1: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860600] CPU 2: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860602] CPU 3: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860604] CPU 4: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860606] CPU 5: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860608] CPU 6: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860610] CPU 7: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860611] Node 0 Normal per-cpu:
    Jul 19 07:43:19 voipmonitor kernel: [465110.860613] CPU 0: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860615] CPU 1: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860617] CPU 2: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860619] CPU 3: hi: 186, btch: 31 usd: 32
    Jul 19 07:43:19 voipmonitor kernel: [465110.860621] CPU 4: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860623] CPU 5: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860625] CPU 6: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860626] CPU 7: hi: 186, btch: 31 usd: 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860631] active_anon:1427265 inactive_anon:247154 isolated_anon:0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860631] active_file:229931 inactive_file:243996 isolated_file:32
    Jul 19 07:43:19 voipmonitor kernel: [465110.860631] unevictable:988 dirty:330299 writeback:14677 unstable:0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860631] free:40496 slab_reclaimable:36875 slab_unreclaimable:27861
    Jul 19 07:43:19 voipmonitor kernel: [465110.860631] mapped:2285 shmem:29 pagetables:11016 bounce:0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860631] free_cma:0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860636] Node 0 DMA free:15884kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15988kB managed:15884kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
    Jul 19 07:43:19 voipmonitor kernel: [465110.860642] lowmem_reserve[]: 0 1997 16075 16075
    Jul 19 07:43:19 voipmonitor kernel: [465110.860646] Node 0 DMA32 free:71684kB min:8388kB low:10484kB high:12580kB active_anon:642948kB inactive_anon:245780kB active_file:268564kB inactive_file:296688kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2069408kB managed:2045256kB mlocked:0kB dirty:426896kB writeback:17344kB mapped:136kB shmem:8kB slab_reclaimable:94632kB slab_unreclaimable:61228kB kernel_stack:840kB pagetables:4888kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
    Jul 19 07:43:19 voipmonitor kernel: [465110.860653] lowmem_reserve[]: 0 0 14078 14078
    Jul 19 07:43:19 voipmonitor kernel: [465110.860656] Node 0 Normal free:74416kB min:59128kB low:73908kB high:88692kB active_anon:5066112kB inactive_anon:742836kB active_file:651160kB inactive_file:679296kB unevictable:3952kB isolated(anon):0kB isolated(file):128kB present:14680064kB managed:14415908kB mlocked:3952kB dirty:894300kB writeback:41364kB mapped:9004kB shmem:108kB slab_reclaimable:52868kB slab_unreclaimable:50216kB kernel_stack:2160kB pagetables:39176kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:32 all_unreclaimable? no
    Jul 19 07:43:19 voipmonitor kernel: [465110.860663] lowmem_reserve[]: 0 0 0 0
    Jul 19 07:43:19 voipmonitor kernel: [465110.860666] Node 0 DMA: 3*4kB (U) 2*8kB (U) 1*16kB (U) 1*32kB (U) 3*64kB (U) 2*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15884kB
    Jul 19 07:43:19 voipmonitor kernel: [465110.860679] Node 0 DMA32: 17569*4kB (UEM) 3*8kB (E) 10*16kB (R) 7*32kB (R) 4*64kB (R) 1*128kB (R) 2*256kB (R) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 71580kB
    Jul 19 07:43:19 voipmonitor kernel: [465110.860691] Node 0 Normal: 17799*4kB (UEM) 7*8kB (E) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB (R) = 75348kB
    Jul 19 07:43:19 voipmonitor kernel: [465110.860701] 620568 total pagecache pages
    Jul 19 07:43:19 voipmonitor kernel: [465110.860703] 145836 pages in swap cache
    Jul 19 07:43:19 voipmonitor kernel: [465110.860705] Swap cache stats: add 28760097, delete 28614261, find 7780999/10572117
    Jul 19 07:43:19 voipmonitor kernel: [465110.860714] Free swap = 22568728kB
    Jul 19 07:43:19 voipmonitor kernel: [465110.860716] Total swap = 33525756kB
    Jul 19 07:43:19 voipmonitor kernel: [465110.926155] 4194303 pages RAM
    Jul 19 07:43:19 voipmonitor kernel: [465110.926158] 70965 pages reserved
    Jul 19 07:43:19 voipmonitor kernel: [465110.926159] 781330 pages shared
    Jul 19 07:43:19 voipmonitor kernel: [465110.926160] 2102813 pages non-shared

    At first I started to think it is a memory fragmentation problem, but it does not look like it. (I use frontswap and zcache on that system.)

    There is no harm in trying a new kernel patch to remove that limit; please email the patch to me at msh dot computing at gmail dot com so I can try it. We will have another image after today (it takes about 8 hours, and longer in the daytime as I have to set it to be nicer to the other processes), so it should be available tomorrow.

    Many thanks

     
    • Phillip Lougher

      Phillip Lougher - 2013-07-22

      Jul 19 07:43:19 voipmonitor kernel: [465110.860489] mount: page allocation failure: order:4, mode:0x1040d0

      This shows your system is suffering from fragmentation.

      This is an order 4 allocation (64K) which has failed. This can happen when there's memory pressure.

       
  • Steve Kieu

    Steve Kieu - 2013-07-19

    Just did a quick count for a typical day's folder:

    find 2013-07-18/ -printf "%i\n" | sort -u | wc -l
    6497568

    Looks like it does not reach the 16,777,216 threshold; maybe it is another problem?

    Nevertheless I still want to test/try anything that would make it work

    Thanks,

     
    • Phillip Lougher

      Phillip Lougher - 2013-07-22

      find 2013-07-18/ -printf "%i\n" | sort -u | wc -l
      6497568
      Looks like it does not reach the 16,777,216 threshold; maybe it is another problem?

      This matches the order 4 failure (64K) seen. For that number of inodes Squashfs will need ~ 50 Kbytes (which will need an order 4 page allocation).
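
      A sketch of that arithmetic, assuming a 4K page size:

      #include <stdio.h>

      int main(void)
      {
          const long inodes = 6497568;
          const long page_size = 4096;

          long blocks = (inodes + 1023) / 1024; /* compressed 8K lookup blocks */
          long bytes = blocks * 8;              /* one 64-bit pointer per block */

          int order = 0;                        /* smallest order that fits */
          while ((page_size << order) < bytes)
              order++;

          /* prints: table needs 50768 bytes -> order 4 (65536 bytes) */
          printf("table needs %ld bytes -> order %d (%ld bytes)\n",
                 bytes, order, page_size << order);
          return 0;
      }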

       
  • Steve Kieu

    Steve Kieu - 2013-07-22

    OK. It does not look like the inode lookup table limit in my case.

    If, after compressing with mksquashfs, I do a rm -rf big_folder and then mount -o loop squashfs.img right away, it fails with the error above.

    However, if I wait about 15 minutes and do the mount again, it is OK.

    Or, better yet, do not remove the folder right away: mount the image first, and it will be OK; then remove the folder afterwards.

    It looks like a bug in the kernel memory allocation!

     
    • Phillip Lougher

      Phillip Lougher - 2013-07-22

      Yes, that is to be expected with transient memory pressure; sometimes it will work, sometimes it won't...

      I will attach a patch that converts the kmallocs to vmallocs that should eliminate the failures. It is against 3.9.9 and can be applied via git am, or just patch.

       
  • Phillip Lougher

    Phillip Lougher - 2013-07-22

    Patch to convert kmallocs to vmallocs that should eliminate the failures. It is against 3.9.9 and can be applied via git am, or just patch.

     
  • Steve Kieu

    Steve Kieu - 2013-07-22

    Thanks Phillip. I will try it and let you know if there is any problem.

     
  • Steve Kieu

    Steve Kieu - 2013-07-22

    Is it true that kmalloc no longer has the 128 Kbyte limit in recent versions?

    http://kaiwantech.wordpress.com/2011/08/17/kmalloc-and-vmalloc-linux-kernel-memory-allocation-api-limits/

    If it does, then maybe we do not need the patch at all? I assume that vmalloc can overcome the memory fragmentation problem because it remaps the buffer space into a virtually contiguous range; if so, then there is still a benefit to using it.

     
    • Phillip Lougher

      Phillip Lougher - 2013-07-22

      The question of what is the maximum size of kmalloc isn't really applicable here. The kmallocs in question are failing within the traditional limit of 128 Kbytes.

      The general rule of thumb with kmalloc is to make the allocation as small as possible, because with memory fragmentation there is never any guarantee that anything larger than a page-sized allocation (normally 4K) will not fail. The larger the kmalloc attempted, the greater the likelihood that it will fail. Squashfs only attempts to allocate more than 4K with kmalloc in the isolated circumstances identified here, for that reason. It only does so here because this is old code, written when Squashfs filesystems were much smaller; the general assumption when the code was reviewed a few years back was that systems using "big" Squashfs filesystems would have lots of memory, so the higher-order kmallocs would in practice not cause a problem. Your bug report shows that such an assumption does not always hold true.

      Vmalloc overcomes fragmentation because, unlike kmalloc, it does not return memory which is physically contiguous. The memory returned by vmalloc is mapped into the vmalloc address space so it appears virtually contiguous, but the underlying pages do not need to be physically contiguous. As such, on systems suffering fragmentation there is a definite advantage in using it.
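
      A minimal sketch of what such a conversion looks like in kernel code (illustrative only; read_table_example is a hypothetical helper, not the actual Squashfs function, and error handling is elided):

      #include <linux/slab.h>
      #include <linux/vmalloc.h>

      static void *read_table_example(size_t length)
      {
              /* Before: kmalloc demands physically contiguous pages, so a
               * large table becomes a high-order allocation that can fail
               * under fragmentation:
               *
               *         return kmalloc(length, GFP_KERNEL);  // free with kfree()
               */

              /* After: vmalloc builds the table out of individual pages and
               * maps them so they appear virtually contiguous; it must be
               * paired with vfree() rather than kfree(). */
              return vmalloc(length);
      }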

       
  • Phillip Lougher

    Phillip Lougher - 2014-05-15
    • status: open --> closed-fixed
     
  • Phillip Lougher

    Phillip Lougher - 2014-05-15

    Bug in Mksquashfs fixed in Squashfs tools 4.3 release

     
  • Jaap Versteegh

    Jaap Versteegh - 2017-02-04

    Not sure if it is related, but I get "SQUASHFS error: unable to read inode lookup table" with a 70M inode count, using the 4.3-3-ubuntu2 package on x86_64.

     
