Thread: [Jfs-discussion] running complete filesystem check with 100.000s of files
From: Per J. <pe...@co...> - 2008-04-29 07:37:26
For the last 12 hours, I've had a full fsck running on a 100Gb filesystem -
probably with a few hundred thousand files. The large majority are less than
100K. It seems to be taking forever and it's finding lots of problems like
these:

Inode F3338155 has references to cross linked blocks.
File system object FF3338155 has corrupt data (39).
Duplicate reference to 1 block(s) beginning at offset 12398768 found in file system object FF3338156.
Inode F3338156 has references to cross linked blocks.
File system object FF3338156 has corrupt data (39).
Duplicate reference to 1 block(s) beginning at offset 12398772 found in file system object FF3338159.
Inode F3338159 has references to cross linked blocks.
File system object FF3338159 has corrupt data (39).
Duplicate reference to 1 block(s) beginning at offset 12398775 found in file system object FF3338160.
Inode F3338160 has references to cross linked blocks.
File system object FF3338160 has corrupt data (39).
Duplicate reference to 1 block(s) beginning at offset 12398618 found in file system object FF3338163.

Is there _any_ way of guesstimating a time of completion?

/Per Jessen, Zürich
From: Per J. <pe...@co...> - 2008-04-29 12:29:58
Per Jessen wrote:
> For the last 12 hours, I've had a full fsck running on a 100Gb
> filesystem - probably with a few hundred thousand files. The large
> majority are less than 100K.

Now 18 hours and counting. I'm somewhat worried about the many messages I'm
seeing:

> Inode F3338155 has references to cross linked blocks.
> File system object FF3338155 has corrupt data (39).
> Duplicate reference to 1 block(s) beginning at offset 12398768 found
> in file system object FF3338156.
> Inode F3338156 has references to cross linked blocks.
> File system object FF3338156 has corrupt data (39).
> Duplicate reference to 1 block(s) beginning at offset 12398772 found
> in file system object FF3338159.

but I'm much more worried about the time it's taking. We're only talking
about 90Gb ...

/Per Jessen, Zürich
From: Dave K. <sh...@li...> - 2008-04-29 12:49:07
On Tue, 2008-04-29 at 14:29 +0200, Per Jessen wrote:
> Per Jessen wrote:
>
> > For the last 12 hours, I've had a full fsck running on a 100Gb
> > filesystem - probably with a few hundred thousand files. The large
> > majority are less than 100K.
>
> Now 18 hours and counting. I'm somewhat worried about the many messages
> I'm seeing:

Ouch. How many? Hundreds? Thousands? You'll likely lose all the files that
are found to have cross-linked blocks.

> > Inode F3338155 has references to cross linked blocks.
> > File system object FF3338155 has corrupt data (39).
> > Duplicate reference to 1 block(s) beginning at offset 12398768 found
> > in file system object FF3338156.
> > Inode F3338156 has references to cross linked blocks.
> > File system object FF3338156 has corrupt data (39).
> > Duplicate reference to 1 block(s) beginning at offset 12398772 found
> > in file system object FF3338159.
>
> but I'm much more worried about the time it's taking. We're only
> talking about 90Gb ...

I really don't have an estimate. Years ago, this processing was even slower,
but I guess it can still be pretty horrible. Fortunately, it only kicks in
rarely. I don't know what could have caused the problem. Cross-linked blocks
are blocks that are claimed by more than one file.

> /Per Jessen, Zürich

Shaggy
-- 
David Kleikamp
IBM Linux Technology Center
From: Per J. <pe...@co...> - 2008-04-29 13:08:38
Dave Kleikamp wrote:
> Ouch. How many? Hundreds? Thousands? You'll likely lose all the
> files that are found to have cross-linked blocks.

By now I would say a thousand easily. The vast majority of the files are old
and/or throw-away, and I should have a backup of those that aren't.

>> but I'm much more worried about the time it's taking. We're only
>> talking about 90Gb ...
>
> I really don't have an estimate. Years ago, this processing was even
> slower, but I guess it can still be pretty horrible. Fortunately, it
> only kicks in rarely. I don't know what could have caused the problem.
> Cross-linked blocks are blocks that are claimed by more than one file.

Fortunately, no customer has complained yet, but someone will if it goes on
for another 12-15 hours. I really do not want to have a 2nd day of this
tomorrow ...

/Per Jessen, Zürich
From: Per J. <pe...@co...> - 2008-04-30 05:49:12
Per Jessen wrote:
> Fortunately, no customer has complained yet, but someone will if it
> goes on for another 12-15 hours. I really do not want to have a 2nd
> day of this tomorrow ...

Well, looks like that was wishful thinking. Now 35 hours and counting.
Recent output is stuff like this:

Duplicate reference to 2 block(s) beginning at offset 13952656 found in file system object DF192093.
Duplicate reference to 13 block(s) beginning at offset 13952674 found in file system object DF192093.
Duplicate reference to 80 block(s) beginning at offset 13952688 found in file system object DF192093.
Duplicate reference to 7 block(s) beginning at offset 13952789 found in file system object DF192093.
Duplicate reference to 6578 block(s) beginning at offset 13952797 found in file system object DF192093.
Duplicate reference to 6579 block(s) beginning at offset 13952796 found in file system object DF192093.

Fortunately, most people will be off work the next 4 days, so in about 12
hours I'll probably start rebuilding/recreating this system. It has got to be
working again by Monday.

Shaggy, any idea what could possibly have caused such a mess?? This is an
old(ish) SMP system, running 2.4.33, jfsutils 1.1.7. I tried upgrading to
1.1.11, but had to back down to 1.1.7 as the new utils refused to do an fsck.
The filesystem is about 140Gb in total, of which 90Gb is in use. It's backed
by a software RAID5. I'm guessing the filesystem probably had some 500,000
files, with up to maybe 40,000 in some directories, but generally less. The
system was generally very busy storing/processing new files (24h/day).

/Per Jessen, Zürich
From: <pg...@jf...> - 2008-04-30 15:01:33
[ ... ]

>> Fortunately, no customer has complained yet,

This and "very busy storing/processing new files (24h/day)" later seem to
describe a fairly critical system with somewhat high availability
requirements.

>> but someone will if it goes on for another 12-15 hours. I
>> really do not want to have a 2nd day of this tomorrow ...

> Well, looks like that was wishful thinking.

Indeed, and if one has availability constraints, relying on 'fsck' being
quick is equally unrealistic.

> Now 35 hours and counting.

The time taken to do a deep check of entangled filesystems can be long. For
an 'ext3' filesystem it was 75 days, and there are other interesting reports
of long 'fsck' times:

  http://www.sabi.co.uk/blog/anno05-4th.html#051009
  http://www.sabi.co.uk/blog/anno05-4th.html#051108
  http://www.sabi.co.uk/blog/0802feb.html#080210

My impression is that JFS has a much better 'fsck' than 'ext3', but I haven't
found (even on this mailing list) many reports of 'fsck' durations for JFS,
and my own filesystems are rather small like yours (a few hundred thousand
files, a few hundred GB of data), and 'fsck' takes a few minutes on undamaged
or mostly OK filesystems.

Anyhow the high bounds on 'fsck' times and space are well-known problems,
especially for multi-TB filesystems, and these are some of the most recent
news items for a couple of other filesystems:

  http://kerneltrap.org/Linux/Improving_fsck_Speeds_in_ext4
  http://oss.sgi.com/archives/xfs/2008-01/msg00187.html

> Recent output is stuff like this: [ ... shared blocks ... ]

> [ ... ] what could possibly have caused such a mess??

A very optimistic sysadm? :-)

> This is an old(ish) SMP system, running 2.4.33, [ ... ]

My impression is that both SMP and JFS in 2.4.33 are not as well tested as in
2.6, as there have been some important bug fixes in the 2.6 series that
probably apply very much to high-load systems, especially for SMP. Using a
kernel that old means accepting whatever issues it has and hoping that they
don't affect your load.

Anyhow, in my experience most events like the above are caused by hardware
issues, more than by old bugs remaining unfixed in the SMP or JFS code of old
kernels. Even a single bit error in RAM or a single block error during IO can
have devastating effects. Never mind firmware or other errors. Consider for
example this interesting report on IO "silent corruption" from a largish
installation with a lot of experience:

  https://indico.desy.de/contributionDisplay.py?contribId=65&sessionId=42&confId=257

and their subsequent update:

  http://indico.fnal.gov/contributionDisplay.py?contribId=44&sessionId=15&confId=805

System integration and qualification is a very difficult and expensive
activity...

> [ ... ] The filesystem is about 140Gb in total of which 90Gb
> is in use. It's backed by a software RAID5.

As you have now discovered, it would have been much quicker to restore it
from backups ('-o nointegrity' would have made it even faster). That's a way
of doing 'fsck' that is often faster than 'fsck', because it relies largely
on straightforward sequential accesses, while 'fsck' relies a lot on random
accesses and somewhat hairy algorithms.

> I'm guessing the filesystem probably had some 500,000 files
> with up to maybe 40,000 in some directories,

That's generally unwise, but the real problem is the overlapping allocations,
because then 'fsck' must check everything against everything.

> but generally less. The system was generally very busy
> storing/processing new files (24h/day).
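For illustration, a minimal sketch of the restore-instead-of-fsck approach
described above; the device name, mount point and backup path are made-up
placeholders, not anything taken from this thread:

--------------------------------------------------------------
# Recreate the filesystem, then restore from backup with the
# journal disabled for speed (hypothetical device and paths):
mkfs.jfs -q /dev/md0
mount -t jfs -o nointegrity /dev/md0 /data
tar -xpf /backup/data.tar -C /data
# Remount with the journal active once the restore is done:
umount /data
mount -t jfs /dev/md0 /data
--------------------------------------------------------------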
From: Per J. <pe...@co...> - 2008-04-30 05:53:31
Per Jessen wrote:
> Fortunately, no customer has complained yet, but someone will if it
> goes on for another 12-15 hours. I really do not want to have a 2nd
> day of this tomorrow ...

Well, looks like that was wishful thinking. Now 34 hours and counting.
Recent output is stuff like this:

Duplicate reference to 2 block(s) beginning at offset 13952656 found in file system object DF192093.
Duplicate reference to 13 block(s) beginning at offset 13952674 found in file system object DF192093.
Duplicate reference to 80 block(s) beginning at offset 13952688 found in file system object DF192093.
Duplicate reference to 7 block(s) beginning at offset 13952789 found in file system object DF192093.
Duplicate reference to 6578 block(s) beginning at offset 13952797 found in file system object DF192093.
Duplicate reference to 6579 block(s) beginning at offset 13952796 found in file system object DF192093.

Fortunately, most people will be off work the next 4 days, so in about 12
hours I'll probably start rebuilding/recreating this system. It has got to be
working again by Monday.

Still - Dave, any idea what could possibly have caused such a mess?? This is
an old(ish) SMP system, running 2.4.33, jfsutils 1.1.7. I tried upgrading to
1.1.11, but had to back down to 1.1.7 as the new utils refused to do an fsck.
The filesystem is about 140Gb in total, of which 90Gb is used. It's backed by
a software RAID5. I'm guessing the filesystem probably had some 500,000
files, with up to maybe 40,000 in some directories. The system was generally
very busy storing new files (24h/day).

/Per Jessen, Zürich
From: Dave K. <sh...@li...> - 2008-04-30 13:28:53
On Wed, 2008-04-30 at 07:53 +0200, Per Jessen wrote:
> Per Jessen wrote:
>
> > Fortunately, no customer has complained yet, but someone will if it
> > goes on for another 12-15 hours. I really do not want to have a 2nd
> > day of this tomorrow ...
>
> Well, looks like that was wishful thinking. Now 34 hours and counting.
>
> Recent output is stuff like this:
>
> Duplicate reference to 2 block(s) beginning at offset 13952656 found in
> file system object DF192093.
> Duplicate reference to 13 block(s) beginning at offset 13952674 found in
> file system object DF192093.
> Duplicate reference to 80 block(s) beginning at offset 13952688 found in
> file system object DF192093.
> Duplicate reference to 7 block(s) beginning at offset 13952789 found in
> file system object DF192093.
> Duplicate reference to 6578 block(s) beginning at offset 13952797 found
> in file system object DF192093.
> Duplicate reference to 6579 block(s) beginning at offset 13952796 found
> in file system object DF192093.
>
> Fortunately, most people will be off work the next 4 days, so in about
> 12 hours I'll probably start rebuilding/recreating this system. It has
> got to be working again by Monday.
>
> Still - Dave, any idea what could possibly have caused such a mess??
> This is an old(ish) SMP system, running 2.4.33, jfsutils 1.1.7.

Wow. That is pretty old. I've pretty much forgotten about the 2.4 kernel.
There have been a lot of bug fixes since then, but I wouldn't know off the
top of my head anything specific that would explain this.

> I tried
> upgrading to 1.1.11, but had to back down to 1.1.7 as the new utils
> refused to do an fsck.

What error did you get? There's no reason 1.1.11 should have failed.

> The filesystem is about 140Gb in total, of which 90Gb is used. It's
> backed by a software RAID5. I'm guessing the filesystem probably had
> some 500,000 files, with up to maybe 40,000 in some directories. The
> system was generally very busy storing new files (24h/day).

Do you have any plans to upgrade to a newer distribution? JFS has gotten a
lot more stable in the 2.6 kernel than it was back in 2.4. I'm pretty
impressed that it's been holding up this long under such a high load.

Shaggy
-- 
David Kleikamp
IBM Linux Technology Center
From: Christian K. <li...@ne...> - 2008-04-30 12:53:30
On Wed, April 30, 2008 07:53, Per Jessen wrote:
> This is an old(ish) SMP system, running 2.4.33, jfsutils 1.1.7. I tried
> upgrading to 1.1.11, but had to back down to 1.1.7 as the new utils
> refused to do an fsck.

Hm, I would've assumed current jfsutils would run no matter what the kernel
version was or how old the system is. What was the error message when
jfs_fsck refused to run?

> The filesystem is about 140Gb in total, of which 90Gb is used. It's
> backed by a software RAID5. I'm guessing the

Hm, did you try to boot off a rescue CD [0] with more current
jfsutils/kernel? Not that I know why this would help, but when doing fsck I
tend to use the latest and greatest fsck tools.

Christian.

[0] http://grml.org/download/ (comes with jfsutils-1.1.11-1 and kernel v2.6.23)
-- 
make bzImage, not war
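For illustration, a minimal sketch of checking from a rescue environment as
suggested above; /dev/md0 is a placeholder device name:

--------------------------------------------------------------
# After booting the rescue CD, run a read-only check first so
# nothing on the disk is changed, then the actual repair:
jfs_fsck -n -v /dev/md0
jfs_fsck -f -v /dev/md0
--------------------------------------------------------------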
From: Per J. <pe...@co...> - 2008-04-30 13:18:19
Christian Kujau wrote:
> On Wed, April 30, 2008 07:53, Per Jessen wrote:
>> This is an old(ish) SMP system, running 2.4.33, jfsutils 1.1.7. I tried
>> upgrading to 1.1.11, but had to back down to 1.1.7 as the new utils
>> refused to do an fsck.
>
> Hm, I would've assumed current jfsutils would run no matter what the
> kernel version was or how old the system is. What was the error message
> when jfs_fsck refused to run?

I'm not sure, I think it complained about the superblock.

>> The filesystem is about 140Gb in total, of which 90Gb is used. It's
>> backed by a software RAID5. I'm guessing the
>
> Hm, did you try to boot off a rescue CD [0] with more current
> jfsutils/kernel? Not that I know why this would help, but when doing
> fsck I tend to use the latest and greatest fsck tools.

Yeah, I did boot an openSUSE 10.2 system, which I think is how I noticed the
problem with jfsutils 1.1.11. Btw, the fsck is still running, but at least it
doesn't seem to have found any errors since early this morning.

/Per
From: Per J. <pe...@co...> - 2008-04-30 14:41:16
Dave Kleikamp wrote:
> There have been a lot of bug fixes since then, but I wouldn't
> know off the top of my head anything specific that would explain this.
>
>> I tried upgrading to 1.1.11, but had to back down to 1.1.7 as the new
>> utils refused to do an fsck.
>
> What error did you get? There's no reason 1.1.11 should have failed.

I'm pretty certain it said something about the superblock, but I didn't take
note. I can probably reproduce it, but is there any point?

> Do you have any plans to upgrade to a newer distribution? JFS has
> gotten a lot more stable in the 2.6 kernel than it was back in 2.4.

Yep, I've been preparing a new system since this morning. Latest 2.6 kernel.
The fsck is still running, but I've been able to copy the key files to the
new system; later I hope to be able to recover as much as possible of the
data.

> I'm pretty impressed that it's been holding up this long under such a
> high load.

Well, looks like it wasn't holding up all that well ...

/Per Jessen, Zürich
From: Dave K. <sh...@li...> - 2008-04-30 14:58:01
On Wed, 2008-04-30 at 16:40 +0200, Per Jessen wrote:
> Dave Kleikamp wrote:
>
> > There have been a lot of bug fixes since then, but I wouldn't
> > know off the top of my head anything specific that would explain this.
> >
> >> I tried upgrading to 1.1.11, but had to back down to 1.1.7 as the new
> >> utils refused to do an fsck.
> >
> > What error did you get? There's no reason 1.1.11 should have failed.
>
> I'm pretty certain it said something about the superblock, but I didn't
> take note. I can probably reproduce it, but is there any point?

Not if you're moving up to a new system. I'm a bit curious though.

> > Do you have any plans to upgrade to a newer distribution? JFS has
> > gotten a lot more stable in the 2.6 kernel than it was back in 2.4.
>
> Yep, I've been preparing a new system since this morning. Latest 2.6
> kernel. The fsck is still running, but I've been able to copy the key
> files to the new system; later I hope to be able to recover as much as
> possible of the data.
>
> > I'm pretty impressed that it's been holding up this long under such a
> > high load.
>
> Well, looks like it wasn't holding up all that well ...

I think you'll have better luck on a modern kernel. I trust you'll let me
know if any new problems show up.

Thanks,
Shaggy
-- 
David Kleikamp
IBM Linux Technology Center
From: Per J. <pe...@co...> - 2008-04-30 15:09:19
Dave Kleikamp wrote:
>> I'm pretty certain it said something about the superblock, but I
>> didn't take note. I can probably reproduce it, but is there any
>> point?
>
> Not if you're moving up to a new system. I'm a bit curious though.

I'll try it once I've got the new system up and running.

/Per Jessen, Zürich
From: Per J. <pe...@co...> - 2008-04-30 15:32:27
Peter Grandi wrote:
> This and "very busy storing/processing new files (24h/day)" later
> seem to describe a fairly critical system with somewhat high
> availability requirements.

Fairly high, yes. It has now been down for almost 48 hours, which is probably
just about as far as I can let it go. We've already promised our customers it
will be back up Friday morning. Tomorrow is a holiday here, very fortunate.

>>> but someone will if it goes on for another 12-15 hours. I
>>> really do not want to have a 2nd day of this tomorrow ...
>
>> Well, looks like that was wishful thinking.
>
> Indeed, and if one has availability constraints, relying on
> 'fsck' being quick is equally unrealistic.

That's an interesting comment - I guess I _have_ been relying on 1) the
system only rarely needing a reboot and 2) a fast fsck when it happens.

Do you have any insights to share wrt availability, large filesystems (up to
1Tb in our case) and millions of files? (apart from "don't do it" :-)

>> Now 35 hours and counting.
>
> The time taken to do a deep check of entangled filesystems can
> be long. For an 'ext3' filesystem it was 75 days, and there are
> other interesting reports of long 'fsck' times:

Uh oh. I guess I'd better move ahead with my new system, and hope to migrate
whatever I can later on.

> but I haven't found (even on this mailing list) many reports of
> 'fsck' durations for JFS, and my own filesystems are rather small
> like yours (a few hundred thousand files, a few hundred GB of
> data), and 'fsck' takes a few minutes on undamaged or mostly OK
> filesystems.

That has been my experience too - right up until 28 April at around 20:00. :-(

/Per Jessen, Zürich
From: Dave K. <sh...@li...> - 2008-04-30 17:27:58
On Wed, 2008-04-30 at 17:32 +0200, Per Jessen wrote:
> Peter Grandi wrote:
>
> > This and "very busy storing/processing new files (24h/day)" later
> > seem to describe a fairly critical system with somewhat high
> > availability requirements.
>
> Fairly high, yes. It has now been down for almost 48 hours, which is
> probably just about as far as I can let it go. We've already promised
> our customers it will be back up Friday morning. Tomorrow is a holiday
> here, very fortunate.
>
> >>> but someone will if it goes on for another 12-15 hours. I
> >>> really do not want to have a 2nd day of this tomorrow ...
> >
> >> Well, looks like that was wishful thinking.
> >
> > Indeed, and if one has availability constraints, relying on
> > 'fsck' being quick is equally unrealistic.
>
> That's an interesting comment - I guess I _have_ been relying on 1) the
> system only rarely needing a reboot and 2) a fast fsck when it happens.

JFS is, of course, designed so that under normal circumstances, fsck only
replays the journal, which is very fast. When something bad happens, and it
has to do the full processing, it isn't necessarily going to be fast.

> Do you have any insights to share wrt availability, large filesystems
> (up to 1Tb in our case) and millions of files? (apart from "don't do
> it" :-)

JFS's fsck time is basically tied to the number of inodes. I don't have
numbers to give you, but a huge, nearly-empty file system won't take too much
time to check, but one with millions of inodes may take a long time. Worst
case (other than a fatal error that fsck can't recover from) is when
cross-linked blocks are detected, and it has to do the pass that is causing
you so much delay. It used to be MUCH worse before jfsutils-1.1.5, if you can
believe it.

> >> Now 35 hours and counting.
> >
> > The time taken to do a deep check of entangled filesystems can
> > be long. For an 'ext3' filesystem it was 75 days, and there are
> > other interesting reports of long 'fsck' times:

I doubt it gets as bad as that, but again, I have no idea how much longer it
will take.

> Uh oh. I guess I'd better move ahead with my new system, and hope to
> migrate whatever I can later on.
>
> > but I haven't found (even on this mailing list) many reports of
> > 'fsck' durations for JFS, and my own filesystems are rather small
> > like yours (a few hundred thousand files, a few hundred GB of
> > data), and 'fsck' takes a few minutes on undamaged or mostly OK
> > filesystems.
>
> That has been my experience too - right up until 28 April at around
> 20:00. :-(
>
> /Per Jessen, Zürich

-- 
David Kleikamp
IBM Linux Technology Center
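For illustration, the distinction described above in command form; /dev/md0
is a placeholder device name:

--------------------------------------------------------------
# Normal case: the default invocation only replays the journal,
# which usually completes in seconds:
jfs_fsck /dev/md0
# Forced full structural check of every inode - the slow path
# that also kicks in when the filesystem is marked dirty:
jfs_fsck -f /dev/md0
--------------------------------------------------------------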
From: <pg...@jf...> - 2008-04-30 21:52:09
[ ... ]

>> This and "very busy storing/processing new files (24h/day)"
>> later seem to describe a fairly critical system with somewhat
>> high availability requirements.

[ ... ]

>> Indeed, and if one has availability constraints, relying on
>> 'fsck' being quick is equally unrealistic.

> That's an interesting comment - I guess I _have_ been relying
> on 1) the system only rarely needing a reboot and 2) a fast
> fsck when it happens.

Plenty of people do that, and then bad news does happen. I was some time ago
at a workshop about large-scale system administration at big national
research labs (CERN and so on) and I asked almost every speaker what they
were doing about filesystem checking times, and some seemed to be unaware of
the issue.

The main driver of the issue is that thanks to RAID of various sorts it is
easy to scale up capacity and read or write accesses, but 'fsck' does not
take advantage of the multiple spindles in RAID because it is serial.

> Do you have any insights to share wrt availability, large
> filesystems (up to 1Tb in our case) and millions of files?
> (apart from "don't do it" :-)

From other things you have written it looks like you use the filesystem as a
structured database. "don't do it" :-) Usually it is better to use a database
manager if you want to store many records, instead of a filesystem.

However filesystems can grow in their own way, without remotely looking like
structured databases - for example a 200TB repository with 100M files. As to
that, in general the only way I can see now to do that is via clusters of
very many smaller filesystems, each of which can be either repaired or
restored from backup pretty quickly, which means 1-4TB and hundreds of
thousands of inodes. Some notes I have written on the subject:

  http://www.sabi.co.uk/blog/0804apr.html#080417
  http://www.sabi.co.uk/blog/0804apr.html#080407

[ ... ]

>> The time taken to do a deep check of entangled filesystems can
>> be long. For an 'ext3' filesystem it was 75 days, and there are
>> other interesting reports of long 'fsck' times:

> Uh oh. I guess I'd better move ahead with my new system, and
> hope to migrate whatever I can later on.

In your case, for up to 1TB the best strategy is probably frequent backups
and then, in case of trouble, a quick restore copying back the whole disk
using 'dd'. With FW800 or eSATA I get around 50MB/s sustained average (better
with O_DIRECT and large block sizes, which I now prefer) when duplicating
modern cheap 500GB drives:

  http://www.sabi.co.uk/blog/0705may.html#070505

Of course if you have a "warm" backup you can just swap in the backup drive
and do an offline 'fsck', if really necessary, on the swapped-out damaged
filesystem. There are several options that are advisable depending on
circumstances.

>> but I haven't found (even on this mailing list) many reports of
>> 'fsck' durations for JFS, and my own filesystems are rather small
>> like yours (a few hundred thousand files, a few hundred GB of
>> data), and 'fsck' takes a few minutes on undamaged or mostly OK
>> filesystems.

> That has been my experience too - right up until 28 April at around
> 20:00. :-(

That's because even 'jfs_fsck -f' is quite quick on clean filesystems; the
problem is deep scans on messed-up filesystems.

Some numbers for clean filesystems:

--------------------------------------------------------------
# sysctl vm/drop_caches=3; time jfs_fsck -f /dev/sda8
vm.drop_caches = 3
jfs_fsck version 1.1.12, 24-Aug-2007
processing started: 4/30/2008 21.52.28
The current device is:  /dev/sda8
Block size in bytes:  4096
Filesystem size in blocks:  61046992
**Phase 0 - Replay Journal Log
**Phase 1 - Check Blocks, Files/Directories, and Directory Entries
**Phase 2 - Count links
**Phase 3 - Duplicate Block Rescan and Directory Connectedness
**Phase 4 - Report Problems
**Phase 5 - Check Connectivity
**Phase 6 - Perform Approved Corrections
**Phase 7 - Rebuild File/Directory Allocation Maps
**Phase 8 - Rebuild Disk Allocation Maps
 244187968 kilobytes total disk space.
     27013 kilobytes in 8242 directories.
 173119730 kilobytes in 118301 user files.
     10896 kilobytes in extended attributes
    128391 kilobytes reserved for system use.
  70955964 kilobytes are available for use.
Filesystem is clean.

real    1m2.159s
user    0m1.530s
sys     0m1.630s
--------------------------------------------------------------

That's on a 2004-class desktop machine and it is doing almost 2,000 inodes/s,
and I get similar results elsewhere. This instead is for a contemporary
chunky server on an 8-drive RAID10:

--------------------------------------------------------------
# sysctl vm/drop_caches=2; time jfs_fsck -f /dev/md0
vm.drop_caches = 2
jfs_fsck version 1.1.12, 24-Aug-2007
processing started: 4/30/2008 21.58.17
The current device is:  /dev/md0
Block size in bytes:  4096
Filesystem size in blocks:  390070208
**Phase 0 - Replay Journal Log
**Phase 1 - Check Blocks, Files/Directories, and Directory Entries
**Phase 2 - Count links
**Phase 3 - Duplicate Block Rescan and Directory Connectedness
**Phase 4 - Report Problems
**Phase 5 - Check Connectivity
**Phase 6 - Perform Approved Corrections
**Phase 7 - Rebuild File/Directory Allocation Maps
**Phase 8 - Rebuild Disk Allocation Maps
 1560280832 kilobytes total disk space.
     108545 kilobytes in 82098 directories.
   17122632 kilobytes in 251457 user files.
        100 kilobytes in extended attributes
     496649 kilobytes reserved for system use.
 1542769996 kilobytes are available for use.
Filesystem is clean.

real    0m55.271s
user    0m2.428s
sys     0m4.127s
--------------------------------------------------------------

That's a large filesystem, mostly empty because I use it for testing, and the
particular test was a lot of very small files, and yet it does 4-5,000
inodes/s. A million inodes? Probably 4-5 minutes. But it is not a deep scan
over a messed-up filesystem.
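For illustration, a rough sketch of the whole-disk 'dd' duplication mentioned
above; the device names and block size are invented for the example, and
iflag=direct/oflag=direct is how GNU dd requests O_DIRECT:

--------------------------------------------------------------
# Duplicate the working disk onto a warm-spare disk using large
# blocks and O_DIRECT, bypassing the page cache (hypothetical
# device names - double-check them before running anything):
dd if=/dev/sdb of=/dev/sdc bs=64M iflag=direct oflag=direct
# Optional sanity check: compare checksums of the two devices.
md5sum /dev/sdb /dev/sdc
--------------------------------------------------------------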
From: Per J. <pe...@co...> - 2008-04-30 18:22:37
Per Jessen wrote:
>> Do you have any plans to upgrade to a newer distribution? JFS has
>> gotten a lot more stable in the 2.6 kernel than it was back in 2.4.
>
> Yep, I've been preparing a new system since this morning. Latest 2.6
> kernel. The fsck is still running, but I've been able to copy the key
> files to the new system; later I hope to be able to recover as much as
> possible of the data.

On the topic of recovering files - the toplevel directory has about 300
subdirectories, each with 5-6 subdirs, one of which has 3 subdirs of its own.
Initially I was able to recover all the files of the toplevel directory, plus
the 300 subdirs and any files in those. I'm now working on the subdirs of the
300 toplevel subdirs. This is where the vast majority of the files are
stored. I'm using rsync to copy each of the 300 toplevel subdirs individually
to keep an eye on what's being copied and what's not. I'm seeing quite a few
errors such as:

rsync: readlink "<subdir>/reports/nov2005.1.email" failed: Permission denied (13)
rsync: readdir("<subdir>/quarantined/summary"): Input/output error (5)

Other subdirs report no errors at all. I see "permission denied" on files,
and the "input/output error" on directories. Could these errors somehow be
construed to be an indication of "how far" the fsck is? Or which subdirs have
been "marked" clean? I'm grasping at straws here, I know.

/Per Jessen, Zürich
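For illustration, a minimal sketch of the per-directory rsync approach
described above; the source path and destination host are invented
placeholders:

--------------------------------------------------------------
# Copy each toplevel subdir individually and keep a per-directory
# error log, so damaged subtrees are easy to identify afterwards:
for d in /mnt/damaged/*/; do
    name=$(basename "$d")
    rsync -a "$d" "newhost:/data/$name/" 2>"/tmp/rsync-$name.err" \
        || echo "$name: rsync reported errors"
done
--------------------------------------------------------------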