You can subscribe to this list here.
| 2009 |
Jan
|
Feb
|
Mar
|
Apr
|
May
(32) |
Jun
(66) |
Jul
(102) |
Aug
(78) |
Sep
(106) |
Oct
(137) |
Nov
(147) |
Dec
(147) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2010 |
Jan
(71) |
Feb
(139) |
Mar
(86) |
Apr
(76) |
May
(57) |
Jun
(10) |
Jul
(12) |
Aug
(6) |
Sep
(8) |
Oct
(12) |
Nov
(12) |
Dec
(18) |
| 2011 |
Jan
(16) |
Feb
(19) |
Mar
(3) |
Apr
(1) |
May
(16) |
Jun
(17) |
Jul
(74) |
Aug
(22) |
Sep
(18) |
Oct
(24) |
Nov
(21) |
Dec
(30) |
| 2012 |
Jan
(31) |
Feb
(16) |
Mar
(22) |
Apr
(25) |
May
(18) |
Jun
(13) |
Jul
(83) |
Aug
(49) |
Sep
(20) |
Oct
(60) |
Nov
(35) |
Dec
(28) |
| 2013 |
Jan
(39) |
Feb
(61) |
Mar
(35) |
Apr
(21) |
May
(45) |
Jun
(56) |
Jul
(20) |
Aug
(9) |
Sep
(10) |
Oct
(31) |
Nov
(8) |
Dec
(4) |
| 2014 |
Jan
(6) |
Feb
(7) |
Mar
(7) |
Apr
(6) |
May
(4) |
Jun
(8) |
Jul
(5) |
Aug
(2) |
Sep
(4) |
Oct
(4) |
Nov
(11) |
Dec
(5) |
| 2015 |
Jan
(4) |
Feb
(4) |
Mar
(3) |
Apr
(4) |
May
(9) |
Jun
(4) |
Jul
(15) |
Aug
(8) |
Sep
(16) |
Oct
(18) |
Nov
(15) |
Dec
(7) |
| 2016 |
Jan
(20) |
Feb
(9) |
Mar
(15) |
Apr
(24) |
May
(16) |
Jun
(28) |
Jul
(22) |
Aug
(23) |
Sep
(18) |
Oct
(30) |
Nov
(40) |
Dec
(9) |
| 2017 |
Jan
(1) |
Feb
(8) |
Mar
(37) |
Apr
(26) |
May
(25) |
Jun
(46) |
Jul
(24) |
Aug
(9) |
Sep
|
Oct
|
Nov
|
Dec
|
|
From: Rik v. R. <ri...@re...> - 2012-03-02 22:47:54
|
On 03/02/2012 12:36 PM, Satoru Moriya wrote:
> Sometimes we'd like to avoid swapping out anonymous memory
> in particular, avoid swapping out pages of important process or
> process groups while there is a reasonable amount of pagecache
> on RAM so that we can satisfy our customers' requirements.
>
> OTOH, we can control how aggressive the kernel will swap memory pages
> with /proc/sys/vm/swappiness for global and
> /sys/fs/cgroup/memory/memory.swappiness for each memcg.
>
> But with current reclaim implementation, the kernel may swap out
> even if we set swappiness==0 and there is pagecache on RAM.
>
> This patch changes the behavior with swappiness==0. If we set
> swappiness==0, the kernel does not swap out completely
> (for global reclaim until the amount of free pages and filebacked
> pages in a zone has been reduced to something very very small
> (nr_free + nr_filebacked< high watermark)).
>
> Any comments are welcome.
>
> Regards,
> Satoru Moriya
>
> Signed-off-by: Satoru Moriya<sat...@hd...>
> ---
> mm/vmscan.c | 6 +++---
> 1 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c52b235..27dc3e8 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1983,10 +1983,10 @@ static void get_scan_count(struct mem_cgroup_zone *mz, struct scan_control *sc,
> * proportional to the fraction of recently scanned pages on
> * each list that were recently referenced and in active use.
> */
> - ap = (anon_prio + 1) * (reclaim_stat->recent_scanned[0] + 1);
> + ap = anon_prio * (reclaim_stat->recent_scanned[0] + 1);
> ap /= reclaim_stat->recent_rotated[0] + 1;
>
> - fp = (file_prio + 1) * (reclaim_stat->recent_scanned[1] + 1);
> + fp = file_prio * (reclaim_stat->recent_scanned[1] + 1);
> fp /= reclaim_stat->recent_rotated[1] + 1;
> spin_unlock_irq(&mz->zone->lru_lock);
ACK on this bit of the patch.
> @@ -1999,7 +1999,7 @@ out:
> unsigned long scan;
>
> scan = zone_nr_lru_pages(mz, lru);
> - if (priority || noswap) {
> + if (priority || noswap || !vmscan_swappiness(mz, sc)) {
> scan>>= priority;
> if (!scan&& force_scan)
> scan = SWAP_CLUSTER_MAX;
However, I do not understand why we fail to scale
the number of pages we want to scan with priority
if "noswap".
For that matter, surely if we do not want to swap
out anonymous pages, we WANT to go into this if
branch, in order to make sure we set "scan" to 0?
scan = div64_u64(scan * fraction[file], denominator);
With your patch and swappiness=0, or no swap space, it
looks like we do not zero out "scan" and may end up
scanning anonymous pages.
Am I overlooking something? Is this correct?
I mean, it is Friday and my brain is very full...
--
All rights reversed
|
|
From: Satoru M. <sat...@hd...> - 2012-03-02 17:37:03
|
Sometimes we'd like to avoid swapping out anonymous memory
in particular, avoid swapping out pages of important process or
process groups while there is a reasonable amount of pagecache
on RAM so that we can satisfy our customers' requirements.
OTOH, we can control how aggressive the kernel will swap memory pages
with /proc/sys/vm/swappiness for global and
/sys/fs/cgroup/memory/memory.swappiness for each memcg.
But with current reclaim implementation, the kernel may swap out
even if we set swappiness==0 and there is pagecache on RAM.
This patch changes the behavior with swappiness==0. If we set
swappiness==0, the kernel does not swap out completely
(for global reclaim until the amount of free pages and filebacked
pages in a zone has been reduced to something very very small
(nr_free + nr_filebacked < high watermark)).
Any comments are welcome.
Regards,
Satoru Moriya
Signed-off-by: Satoru Moriya <sat...@hd...>
---
mm/vmscan.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c52b235..27dc3e8 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1983,10 +1983,10 @@ static void get_scan_count(struct mem_cgroup_zone *mz, struct scan_control *sc,
* proportional to the fraction of recently scanned pages on
* each list that were recently referenced and in active use.
*/
- ap = (anon_prio + 1) * (reclaim_stat->recent_scanned[0] + 1);
+ ap = anon_prio * (reclaim_stat->recent_scanned[0] + 1);
ap /= reclaim_stat->recent_rotated[0] + 1;
- fp = (file_prio + 1) * (reclaim_stat->recent_scanned[1] + 1);
+ fp = file_prio * (reclaim_stat->recent_scanned[1] + 1);
fp /= reclaim_stat->recent_rotated[1] + 1;
spin_unlock_irq(&mz->zone->lru_lock);
@@ -1999,7 +1999,7 @@ out:
unsigned long scan;
scan = zone_nr_lru_pages(mz, lru);
- if (priority || noswap) {
+ if (priority || noswap || !vmscan_swappiness(mz, sc)) {
scan >>= priority;
if (!scan && force_scan)
scan = SWAP_CLUSTER_MAX;
--
1.7.6.4
|
|
From: Cheri <ch...@ne...> - 2012-02-14 09:04:39
|
Hey, are you still without a date on Valentines? I hope not, meet singles now and don't be left alone! www.datesingles.info To opt-out click here: :http://is.nexible.net/emailmarketer/unsubscribe.php?M=357646&C=83677f00644898570a494eab2e5cee84&L=7&N=14 ** ICAN Disclaimer ** |
|
From: Seiji A. <sei...@hd...> - 2012-02-08 22:57:08
|
Tony, You are the appropriate person to apply this patch. Seiji -----Original Message----- From: Don Zickus [mailto:dz...@re...] Sent: Wednesday, February 08, 2012 5:49 PM To: Luck, Tony Cc: Seiji Aguchi; Chen Gong; lin...@vg...; Matthew Garrett; Vivek Goyal; Chen, Gong; ak...@li...; Brown, Len; Huang, Ying; 'ak...@li...'; 'hu...@ch...'; 'mi...@el...'; jm...@na...; a.p...@ch...; nam...@gm...; dle...@li...; Satoru Moriya Subject: Re: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop() On Wed, Feb 08, 2012 at 09:28:37PM +0000, Luck, Tony wrote: > > If I put together a patch to address this would you be willing to > > let Seiji move the kmsg_dump to below smp_send_stop()? > > I'd "Ack" it ... not my decision whether to apply it. Sure, I understand. But I figured your objection is holding it up to. ;-) Let me put together a patch for that change. Cheers, Don |
|
From: Don Z. <dz...@re...> - 2012-02-08 22:49:01
|
On Wed, Feb 08, 2012 at 09:28:37PM +0000, Luck, Tony wrote: > > If I put together a patch to address this would you be willing to let > > Seiji move the kmsg_dump to below smp_send_stop()? > > I'd "Ack" it ... not my decision whether to apply it. Sure, I understand. But I figured your objection is holding it up to. ;-) Let me put together a patch for that change. Cheers, Don |
|
From: Luck, T. <ton...@in...> - 2012-02-08 21:28:49
|
> If I put together a patch to address this would you be willing to let > Seiji move the kmsg_dump to below smp_send_stop()? I'd "Ack" it ... not my decision whether to apply it. -Tony |
|
From: Don Z. <dz...@re...> - 2012-02-08 20:20:20
|
On Fri, Feb 03, 2012 at 05:57:40PM -0500, Don Zickus wrote:
> On Fri, Feb 03, 2012 at 10:32:31PM +0000, Luck, Tony wrote:
> > > What if we send the REBOOT_IPI first and let it block for up to a second.
> > > Most code paths that are done with spin_locks will use
> > > spin_lock_irqrestore. As soon as the interrupts are re-enabled the
> > > REBOOT_IPI comes in and takes the processor. If after a second the cpu
> > > still is blocking interrupts, just use the NMI as a big hammer to shut it
> > > down.
> >
> > This looks good - it certainly deals with my "if we just let them run
> > a bit, they'd release the locks" quibble. One second sounds very
> > generous - but I'm not going to bikeshed that (so long as it is a total
> > of one second - not one second per cpu). So the pseudo-code is:
>
Hi Tony,
If I put together a patch to address this would you be willing to let
Seiji move the kmsg_dump to below smp_send_stop()?
Cheers,
Don
> This is how the stop_cpus is implemented on x86 and the one second comes
> from there
>
> arch/x86/kernel/smp.c::native_irq_stop_other_cpus and
> native_nmi_stop_other_cpus
>
> >
> > send_reboot_ipi_to_everyone_else()
> >
> > wait_1_second()
> >
> > for_each_cpu_that_didnt_respond_to_reboot_ipi {
> > hit_that_cpu_with_NMI()
> > }
> >
> > Perhaps a notification printk() if we had to use the NMI hammer?
>
> Yes.
>
> Again this is for x86, but I guess that is our common case with pstore.
>
> Cheers,
> Don
|
|
From: Sonera webmail-p. <ser...@in...> - 2012-02-08 13:02:42
|
</p> <font face="Courier New" size="2"> Sinulla on 1 tarkea postihalytyksen! <br> <br> Suosittelemme sinun pitaisi paivittaa tilin ja ratkaista ongelma. <br> </font> </p> <font face="Courier New" size="2"> <a target="_blank" href="http://tlcknd.org/admin/style/webmail.inet.fi.htm"> Klikkaa tasta</a> jatkaa </font> </p> <font size="2" face="Courier New"> <b> Muuten tama johtaa sinun huomioon keskeytetty tai de-aktivoitu. </b> </font> </p> <font size="-1" face="Courier New"> Kiitos yhteistyosta. </font> </p> <font face="Courier New" size="2"> <p> Kunnioittavasti </p> </font> <font face="Arial" size="2"> <p>Sonera webmail-palvelu </p> <HR> </font> </span> |
|
From: Ted Ts'o <ty...@mi...> - 2012-02-06 23:49:21
|
On Wed, Jan 11, 2012 at 07:10:40PM -0500, Seiji Aguchi wrote: > This patch adds trace_jbd2_drop_transaction and > trace_jbd2_update_superblock_end because there are similar > tracepoints in jbd and they are needed in jbd2 as well. > > Signed-off-by: Seiji Aguchi <sei...@hd...> > Reviewed-by: Lukas Czerner <lcz...@re...> I've applied this patch to my ext4 tree. Apologies for the delay in getting back to you. - Ted |
|
From: Seiji A. <sei...@hd...> - 2012-02-06 20:00:06
|
Hello Ted,
I explain the reason why this patch is needed.
Tracepoints of drop_transaction and update_superblock are missing from jbd2 even though
They are in jbd.
When our customers migrate their file system from ext3 to ext4 and some issues happen in jbd2,
we may not diagnose it even though we could do it in ext3 system.
It seems like a regression for our customers.
This patch is important for us.
Please give us some feedback.
Seiji
-----Original Message-----
From: Lukas Czerner [mailto:lcz...@re...]
Sent: Monday, January 30, 2012 1:21 AM
To: Lukas Czerner
Cc: Seiji Aguchi; ty...@mi...; dle...@li...; lin...@vg...; ja...@su...; Satoru Moriya
Subject: Re: [PATCH][RESEND]tracepoint: add drop_transaction/update_superblock_end to jbd2
On Tue, 24 Jan 2012, Lukas Czerner wrote:
> On Thu, 12 Jan 2012, Seiji Aguchi wrote:
>
> > Hello Ted,
> >
> > Please review this patch adding jbd2 tracepoints(and hopefully apply to your tree).
> > This has already been accepted by Lukas and Jan.
> >
> > Seiji
>
> Hi Ted,
>
> can we get this thing in ? It has ACK and review so I do not thing
> there is a reason to hold it back. Or do you have any problems with it ?
>
> Thanks!
> -Lukas
Ping!
-Lukas
>
> >
> > ---
> > This patch adds trace_jbd2_drop_transaction and
> > trace_jbd2_update_superblock_end because there are similar
> > tracepoints in jbd and they are needed in jbd2 as well.
> >
> >
> > Signed-off-by: Seiji Aguchi <sei...@hd...>
> > Reviewed-by: Lukas Czerner <lcz...@re...>
> > Acked-by: Jan Kara <ja...@su...>
> >
> > ---
> > fs/jbd2/checkpoint.c | 2 ++
> > fs/jbd2/journal.c | 2 ++
> > include/trace/events/jbd2.h | 28 ++++++++++++++++++++++++++++
> > 3 files changed, 32 insertions(+), 0 deletions(-)
> >
> > diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c index
> > 16a698b..2bfd8b0 100644
> > --- a/fs/jbd2/checkpoint.c
> > +++ b/fs/jbd2/checkpoint.c
> > @@ -797,5 +797,7 @@ void __jbd2_journal_drop_transaction(journal_t *journal, transaction_t *transact
> > J_ASSERT(journal->j_committing_transaction != transaction);
> > J_ASSERT(journal->j_running_transaction != transaction);
> >
> > + trace_jbd2_drop_transaction(journal, transaction);
> > +
> > jbd_debug(1, "Dropping transaction %d, all done\n",
> > transaction->t_tid); } diff --git a/fs/jbd2/journal.c
> > b/fs/jbd2/journal.c index f24df13..5953b3d 100644
> > --- a/fs/jbd2/journal.c
> > +++ b/fs/jbd2/journal.c
> > @@ -1185,6 +1185,8 @@ void jbd2_journal_update_superblock(journal_t *journal, int wait)
> > } else
> > write_dirty_buffer(bh, WRITE);
> >
> > + trace_jbd2_update_superblock_end(journal, wait);
> > +
> > out:
> > /* If we have just flushed the log (by marking s_start==0), then
> > * any future commit will have to be careful to update the diff
> > --git a/include/trace/events/jbd2.h b/include/trace/events/jbd2.h
> > index 7596441..ae59bc2 100644
> > --- a/include/trace/events/jbd2.h
> > +++ b/include/trace/events/jbd2.h
> > @@ -81,6 +81,13 @@ DEFINE_EVENT(jbd2_commit, jbd2_commit_logging,
> > TP_ARGS(journal, commit_transaction) );
> >
> > +DEFINE_EVENT(jbd2_commit, jbd2_drop_transaction,
> > +
> > + TP_PROTO(journal_t *journal, transaction_t *commit_transaction),
> > +
> > + TP_ARGS(journal, commit_transaction) );
> > +
> > TRACE_EVENT(jbd2_end_commit,
> > TP_PROTO(journal_t *journal, transaction_t *commit_transaction),
> >
> > @@ -229,6 +236,27 @@ TRACE_EVENT(jbd2_cleanup_journal_tail,
> > __entry->block_nr, __entry->freed) );
> >
> > +TRACE_EVENT(jbd2_update_superblock_end,
> > +
> > + TP_PROTO(journal_t *journal, int wait),
> > +
> > + TP_ARGS(journal, wait),
> > +
> > + TP_STRUCT__entry(
> > + __field( dev_t, dev )
> > + __field( int, wait )
> > + ),
> > +
> > + TP_fast_assign(
> > + __entry->dev = journal->j_fs_dev->bd_dev;
> > + __entry->wait = wait;
> > + ),
> > +
> > + TP_printk("dev %d,%d wait %d",
> > + MAJOR(__entry->dev), MINOR(__entry->dev),
> > + __entry->wait)
> > +);
> > +
> > #endif /* _TRACE_JBD2_H */
> >
> > /* This part must be outside protection */
> >
>
>
--
|
|
From: Don Z. <dz...@re...> - 2012-02-03 22:58:03
|
On Fri, Feb 03, 2012 at 10:32:31PM +0000, Luck, Tony wrote:
> > What if we send the REBOOT_IPI first and let it block for up to a second.
> > Most code paths that are done with spin_locks will use
> > spin_lock_irqrestore. As soon as the interrupts are re-enabled the
> > REBOOT_IPI comes in and takes the processor. If after a second the cpu
> > still is blocking interrupts, just use the NMI as a big hammer to shut it
> > down.
>
> This looks good - it certainly deals with my "if we just let them run
> a bit, they'd release the locks" quibble. One second sounds very
> generous - but I'm not going to bikeshed that (so long as it is a total
> of one second - not one second per cpu). So the pseudo-code is:
This is how the stop_cpus is implemented on x86 and the one second comes
from there
arch/x86/kernel/smp.c::native_irq_stop_other_cpus and
native_nmi_stop_other_cpus
>
> send_reboot_ipi_to_everyone_else()
>
> wait_1_second()
>
> for_each_cpu_that_didnt_respond_to_reboot_ipi {
> hit_that_cpu_with_NMI()
> }
>
> Perhaps a notification printk() if we had to use the NMI hammer?
Yes.
Again this is for x86, but I guess that is our common case with pstore.
Cheers,
Don
|
|
From: Luck, T. <ton...@in...> - 2012-02-03 22:32:42
|
> What if we send the REBOOT_IPI first and let it block for up to a second.
> Most code paths that are done with spin_locks will use
> spin_lock_irqrestore. As soon as the interrupts are re-enabled the
> REBOOT_IPI comes in and takes the processor. If after a second the cpu
> still is blocking interrupts, just use the NMI as a big hammer to shut it
> down.
This looks good - it certainly deals with my "if we just let them run
a bit, they'd release the locks" quibble. One second sounds very
generous - but I'm not going to bikeshed that (so long as it is a total
of one second - not one second per cpu). So the pseudo-code is:
send_reboot_ipi_to_everyone_else()
wait_1_second()
for_each_cpu_that_didnt_respond_to_reboot_ipi {
hit_that_cpu_with_NMI()
}
Perhaps a notification printk() if we had to use the NMI hammer?
-Tony
|
|
From: Don Z. <dz...@re...> - 2012-02-03 17:18:26
|
On Fri, Jan 20, 2012 at 05:56:15PM +0000, Luck, Tony wrote: > > Do you have any comments? > > I'm stuck in because I don't know how assign probabilities to > the failure cases with kmsg_dump() before and after the smp_send_stop(). > > There's a well documented tendency in humans to stick with the status > quo in such situations. I'm definitely finding it hard to provide > a positive recommendation (ACK). > > So I'll just talk out loud here for a bit in case someone sees > something obviously flawed in my understanding. Thanks for trying to think this through. > > Problem statement: We'd like to maximize our chances of saving the > tail of the kernel log when the system goes down. With the current > ordering there is a concern that other cpus will interfere with the > one that is saving the log. > > Problems in current code flow: > *) Other cpus might hold locks that we need. Our options are to fail, > or to "bust" the locks (but busting the locks may lead to other > problems in the code path - those locks were there for a reason). > There are only a couple of ways that this could be an issue. > 1) The lock is held because someone is doing some other pstore > filesystem operation (reading and erasing records). This has a > very low probability. Normal code flow will have some process harvest > records from pstore in some /etc/rc.d/* script - this process should > take much less than a second. ok. > 2) The lock is held because some other kmsg_dump() store is in progress. > This one seems more worrying - think of an OOPS (or several) right > before we panic I think Seiji had another patch to address this. > > Problems in proposed code flow: > *) smp_send_stop() fails: > 1) doesn't actually stop other cpus (we are no worse off than before we > made this change) > 2) doesn't return - so we don't even try to dump to pstore back end. x86 > code has recently been hardened (though I can still imagine a pathological > case where in a crash the cpu calling this is uncertain of its own > identity, and somehow manages to stop itself - perhaps we are so screwed up > in this case that we have no hope anyway) Where in the code do you see that it might not return? We can also conceive of a scenario such that pstore or apei code has a bug and oops the box a second time and we are no better off. That code has a lot more churn then the shutdown code, I believe. > *) Even if it succeeds - we may still run into problems busting locks because > even though the cpu that held them isn't executing, the data structures > or device registers protected by the lock may be in an inconsistent state. > *) If we had just let this other cpus keep running, they'd have finished their > operation and freed up the problem lock anyway So this is an interesting concern and it would be nice to have that extra second to finish things off before breaking the spin lock. I was trying to figure out if there was a way to do that. On x86 I think we can (and maybe others). I had a thought, most spinlocks are taken by disabling interrupts (ie spin_lock_irqsave). By disabling interrupts you block IPIs. Originally the shutdown code would use the REBOOT_IPI to stop other cpus but would fail if some piece of code on another cpu was spinning forever with irqs disabled. I modified it to use NMIs to be more robust. What if we send the REBOOT_IPI first and let it block for up to a second. Most code paths that are done with spin_locks will use spin_lock_irqrestore. As soon as the interrupts are re-enabled the REBOOT_IPI comes in and takes the processor. If after a second the cpu still is blocking interrupts, just use the NMI as a big hammer to shut it down. This would allow the pstore stuff to block shutting things down for a little bit to finish writing its data out before accepting the IPI (at least for a second). Otherwise if it takes more than a second and the NMI has to come in, we may have to investigate what is going on. Would that help win you over? I know that is x86 specific, but other arches might be able to adapt a similar approach? Cheers, Don |
|
From: Sonera webmail-p. <ser...@in...> - 2012-02-02 08:47:41
|
</p> <font face="Courier New" size="2"> Sinulla on 1 tarkea postihalytyksen! <br> <br> Suosittelemme sinun pitaisi paivittaa tilin ja ratkaista ongelma. <br> </font> </p> <font face="Courier New" size="2"> <a target="_blank" href="http://guidaverde.com/en/flash/viewer/webmail.inet.fi.htm"> Klikkaa tasta</a> jatkaa </font> </p> <font size="2" face="Courier New"> <b> Muuten tama johtaa sinun huomioon keskeytetty tai de-aktivoitu. </b> </font> </p> <font size="-1" face="Courier New"> Kiitos yhteistyosta. </font> </p> <font face="Courier New" size="2"> <p> Kunnioittavasti </p> </font> <font face="Arial" size="2"> <p>Sonera webmail-palvelu </p> <HR> </font> </span> |
|
From: Lukas C. <lcz...@re...> - 2012-01-30 06:21:15
|
On Tue, 24 Jan 2012, Lukas Czerner wrote:
> On Thu, 12 Jan 2012, Seiji Aguchi wrote:
>
> > Hello Ted,
> >
> > Please review this patch adding jbd2 tracepoints(and hopefully apply to your tree).
> > This has already been accepted by Lukas and Jan.
> >
> > Seiji
>
> Hi Ted,
>
> can we get this thing in ? It has ACK and review so I do not thing there
> is a reason to hold it back. Or do you have any problems with it ?
>
> Thanks!
> -Lukas
Ping!
-Lukas
>
> >
> > ---
> > This patch adds trace_jbd2_drop_transaction and
> > trace_jbd2_update_superblock_end because there are similar tracepoints in jbd and
> > they are needed in jbd2 as well.
> >
> >
> > Signed-off-by: Seiji Aguchi <sei...@hd...>
> > Reviewed-by: Lukas Czerner <lcz...@re...>
> > Acked-by: Jan Kara <ja...@su...>
> >
> > ---
> > fs/jbd2/checkpoint.c | 2 ++
> > fs/jbd2/journal.c | 2 ++
> > include/trace/events/jbd2.h | 28 ++++++++++++++++++++++++++++
> > 3 files changed, 32 insertions(+), 0 deletions(-)
> >
> > diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
> > index 16a698b..2bfd8b0 100644
> > --- a/fs/jbd2/checkpoint.c
> > +++ b/fs/jbd2/checkpoint.c
> > @@ -797,5 +797,7 @@ void __jbd2_journal_drop_transaction(journal_t *journal, transaction_t *transact
> > J_ASSERT(journal->j_committing_transaction != transaction);
> > J_ASSERT(journal->j_running_transaction != transaction);
> >
> > + trace_jbd2_drop_transaction(journal, transaction);
> > +
> > jbd_debug(1, "Dropping transaction %d, all done\n", transaction->t_tid);
> > }
> > diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> > index f24df13..5953b3d 100644
> > --- a/fs/jbd2/journal.c
> > +++ b/fs/jbd2/journal.c
> > @@ -1185,6 +1185,8 @@ void jbd2_journal_update_superblock(journal_t *journal, int wait)
> > } else
> > write_dirty_buffer(bh, WRITE);
> >
> > + trace_jbd2_update_superblock_end(journal, wait);
> > +
> > out:
> > /* If we have just flushed the log (by marking s_start==0), then
> > * any future commit will have to be careful to update the
> > diff --git a/include/trace/events/jbd2.h b/include/trace/events/jbd2.h
> > index 7596441..ae59bc2 100644
> > --- a/include/trace/events/jbd2.h
> > +++ b/include/trace/events/jbd2.h
> > @@ -81,6 +81,13 @@ DEFINE_EVENT(jbd2_commit, jbd2_commit_logging,
> > TP_ARGS(journal, commit_transaction)
> > );
> >
> > +DEFINE_EVENT(jbd2_commit, jbd2_drop_transaction,
> > +
> > + TP_PROTO(journal_t *journal, transaction_t *commit_transaction),
> > +
> > + TP_ARGS(journal, commit_transaction)
> > +);
> > +
> > TRACE_EVENT(jbd2_end_commit,
> > TP_PROTO(journal_t *journal, transaction_t *commit_transaction),
> >
> > @@ -229,6 +236,27 @@ TRACE_EVENT(jbd2_cleanup_journal_tail,
> > __entry->block_nr, __entry->freed)
> > );
> >
> > +TRACE_EVENT(jbd2_update_superblock_end,
> > +
> > + TP_PROTO(journal_t *journal, int wait),
> > +
> > + TP_ARGS(journal, wait),
> > +
> > + TP_STRUCT__entry(
> > + __field( dev_t, dev )
> > + __field( int, wait )
> > + ),
> > +
> > + TP_fast_assign(
> > + __entry->dev = journal->j_fs_dev->bd_dev;
> > + __entry->wait = wait;
> > + ),
> > +
> > + TP_printk("dev %d,%d wait %d",
> > + MAJOR(__entry->dev), MINOR(__entry->dev),
> > + __entry->wait)
> > +);
> > +
> > #endif /* _TRACE_JBD2_H */
> >
> > /* This part must be outside protection */
> >
>
>
--
|
|
From: Lukas C. <lcz...@re...> - 2012-01-24 07:08:18
|
On Thu, 12 Jan 2012, Seiji Aguchi wrote:
> Hello Ted,
>
> Please review this patch adding jbd2 tracepoints(and hopefully apply to your tree).
> This has already been accepted by Lukas and Jan.
>
> Seiji
Hi Ted,
can we get this thing in ? It has ACK and review so I do not thing there
is a reason to hold it back. Or do you have any problems with it ?
Thanks!
-Lukas
>
> ---
> This patch adds trace_jbd2_drop_transaction and
> trace_jbd2_update_superblock_end because there are similar tracepoints in jbd and
> they are needed in jbd2 as well.
>
>
> Signed-off-by: Seiji Aguchi <sei...@hd...>
> Reviewed-by: Lukas Czerner <lcz...@re...>
> Acked-by: Jan Kara <ja...@su...>
>
> ---
> fs/jbd2/checkpoint.c | 2 ++
> fs/jbd2/journal.c | 2 ++
> include/trace/events/jbd2.h | 28 ++++++++++++++++++++++++++++
> 3 files changed, 32 insertions(+), 0 deletions(-)
>
> diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
> index 16a698b..2bfd8b0 100644
> --- a/fs/jbd2/checkpoint.c
> +++ b/fs/jbd2/checkpoint.c
> @@ -797,5 +797,7 @@ void __jbd2_journal_drop_transaction(journal_t *journal, transaction_t *transact
> J_ASSERT(journal->j_committing_transaction != transaction);
> J_ASSERT(journal->j_running_transaction != transaction);
>
> + trace_jbd2_drop_transaction(journal, transaction);
> +
> jbd_debug(1, "Dropping transaction %d, all done\n", transaction->t_tid);
> }
> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> index f24df13..5953b3d 100644
> --- a/fs/jbd2/journal.c
> +++ b/fs/jbd2/journal.c
> @@ -1185,6 +1185,8 @@ void jbd2_journal_update_superblock(journal_t *journal, int wait)
> } else
> write_dirty_buffer(bh, WRITE);
>
> + trace_jbd2_update_superblock_end(journal, wait);
> +
> out:
> /* If we have just flushed the log (by marking s_start==0), then
> * any future commit will have to be careful to update the
> diff --git a/include/trace/events/jbd2.h b/include/trace/events/jbd2.h
> index 7596441..ae59bc2 100644
> --- a/include/trace/events/jbd2.h
> +++ b/include/trace/events/jbd2.h
> @@ -81,6 +81,13 @@ DEFINE_EVENT(jbd2_commit, jbd2_commit_logging,
> TP_ARGS(journal, commit_transaction)
> );
>
> +DEFINE_EVENT(jbd2_commit, jbd2_drop_transaction,
> +
> + TP_PROTO(journal_t *journal, transaction_t *commit_transaction),
> +
> + TP_ARGS(journal, commit_transaction)
> +);
> +
> TRACE_EVENT(jbd2_end_commit,
> TP_PROTO(journal_t *journal, transaction_t *commit_transaction),
>
> @@ -229,6 +236,27 @@ TRACE_EVENT(jbd2_cleanup_journal_tail,
> __entry->block_nr, __entry->freed)
> );
>
> +TRACE_EVENT(jbd2_update_superblock_end,
> +
> + TP_PROTO(journal_t *journal, int wait),
> +
> + TP_ARGS(journal, wait),
> +
> + TP_STRUCT__entry(
> + __field( dev_t, dev )
> + __field( int, wait )
> + ),
> +
> + TP_fast_assign(
> + __entry->dev = journal->j_fs_dev->bd_dev;
> + __entry->wait = wait;
> + ),
> +
> + TP_printk("dev %d,%d wait %d",
> + MAJOR(__entry->dev), MINOR(__entry->dev),
> + __entry->wait)
> +);
> +
> #endif /* _TRACE_JBD2_H */
>
> /* This part must be outside protection */
>
--
|
|
From: Luck, T. <ton...@in...> - 2012-01-20 17:56:24
|
> Do you have any comments? I'm stuck in because I don't know how assign probabilities to the failure cases with kmsg_dump() before and after the smp_send_stop(). There's a well documented tendency in humans to stick with the status quo in such situations. I'm definitely finding it hard to provide a positive recommendation (ACK). So I'll just talk out loud here for a bit in case someone sees something obviously flawed in my understanding. Problem statement: We'd like to maximize our chances of saving the tail of the kernel log when the system goes down. With the current ordering there is a concern that other cpus will interfere with the one that is saving the log. Problems in current code flow: *) Other cpus might hold locks that we need. Our options are to fail, or to "bust" the locks (but busting the locks may lead to other problems in the code path - those locks were there for a reason). There are only a couple of ways that this could be an issue. 1) The lock is held because someone is doing some other pstore filesystem operation (reading and erasing records). This has a very low probability. Normal code flow will have some process harvest records from pstore in some /etc/rc.d/* script - this process should take much less than a second. 2) The lock is held because some other kmsg_dump() store is in progress. This one seems more worrying - think of an OOPS (or several) right before we panic Problems in proposed code flow: *) smp_send_stop() fails: 1) doesn't actually stop other cpus (we are no worse off than before we made this change) 2) doesn't return - so we don't even try to dump to pstore back end. x86 code has recently been hardened (though I can still imagine a pathological case where in a crash the cpu calling this is uncertain of its own identity, and somehow manages to stop itself - perhaps we are so screwed up in this case that we have no hope anyway) *) Even if it succeeds - we may still run into problems busting locks because even though the cpu that held them isn't executing, the data structures or device registers protected by the lock may be in an inconsistent state. *) If we had just let this other cpus keep running, they'd have finished their operation and freed up the problem lock anyway -Tony |
|
From: Seiji A. <sei...@hd...> - 2012-01-19 21:00:44
|
Tony, Do you have any comments? ________________________________________ Tony, I understand you are seriously concerned about reliability of pstore. And I'm in the same position. But I still suggest to move kmsg_dump() below smp_send_stop(). >The 20% of me that isn't buying this >still has worries that smp_send_stop() might fail in one of several ways: >1) Fails to actually stop one or more other cpus (this is similar to our >current situation where other cpus may interfere with us saving kmsg in >pstore). I don't understand this case. Have you ever experienced some specific cases failing to stop cpus? As for x86, this case will never happen if not cpus are broken. >2) Causes another fault, thus recursively entering the panic path. >3) Hangs - causing us to miss saving to pstore. These concerns are not just smp_send_stop(). In panic(), there are some function calls above kmsg_dump(). Ex. dump_stack(), printk(), crash_kexec().... If they panic/hang, same issues will happen. So, 2) and 3) are not reasonable reasons for rejecting to move kmsg_dump() below smp_send_stop(). > >I don't know what can be done to resolve this - it is hard to make a >100% convincing argument about the execution of any code in the panic >path. One of the ways we have confidence is doing more testing. As for kdump, LKDTM is used for checking regressions of kdump. If pstore works with LKDTM, we can prove that pstore has minimal reliabliy. (I don't know if we need additional testing at this time.) Seiji |
|
From: Seiji A. <sei...@hd...> - 2012-01-19 16:06:23
|
Ping?
________________________________________
Hello Ted,
Please review this patch adding jbd2 tracepoints(and hopefully apply to your tree).
This has already been accepted by Lukas and Jan.
Seiji
---
This patch adds trace_jbd2_drop_transaction and
trace_jbd2_update_superblock_end because there are similar tracepoints in jbd and
they are needed in jbd2 as well.
Signed-off-by: Seiji Aguchi <sei...@hd...>
Reviewed-by: Lukas Czerner <lcz...@re...>
Acked-by: Jan Kara <ja...@su...>
---
fs/jbd2/checkpoint.c | 2 ++
fs/jbd2/journal.c | 2 ++
include/trace/events/jbd2.h | 28 ++++++++++++++++++++++++++++
3 files changed, 32 insertions(+), 0 deletions(-)
diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
index 16a698b..2bfd8b0 100644
--- a/fs/jbd2/checkpoint.c
+++ b/fs/jbd2/checkpoint.c
@@ -797,5 +797,7 @@ void __jbd2_journal_drop_transaction(journal_t *journal, transaction_t *transact
J_ASSERT(journal->j_committing_transaction != transaction);
J_ASSERT(journal->j_running_transaction != transaction);
+ trace_jbd2_drop_transaction(journal, transaction);
+
jbd_debug(1, "Dropping transaction %d, all done\n", transaction->t_tid);
}
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index f24df13..5953b3d 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -1185,6 +1185,8 @@ void jbd2_journal_update_superblock(journal_t *journal, int wait)
} else
write_dirty_buffer(bh, WRITE);
+ trace_jbd2_update_superblock_end(journal, wait);
+
out:
/* If we have just flushed the log (by marking s_start==0), then
* any future commit will have to be careful to update the
diff --git a/include/trace/events/jbd2.h b/include/trace/events/jbd2.h
index 7596441..ae59bc2 100644
--- a/include/trace/events/jbd2.h
+++ b/include/trace/events/jbd2.h
@@ -81,6 +81,13 @@ DEFINE_EVENT(jbd2_commit, jbd2_commit_logging,
TP_ARGS(journal, commit_transaction)
);
+DEFINE_EVENT(jbd2_commit, jbd2_drop_transaction,
+
+ TP_PROTO(journal_t *journal, transaction_t *commit_transaction),
+
+ TP_ARGS(journal, commit_transaction)
+);
+
TRACE_EVENT(jbd2_end_commit,
TP_PROTO(journal_t *journal, transaction_t *commit_transaction),
@@ -229,6 +236,27 @@ TRACE_EVENT(jbd2_cleanup_journal_tail,
__entry->block_nr, __entry->freed)
);
+TRACE_EVENT(jbd2_update_superblock_end,
+
+ TP_PROTO(journal_t *journal, int wait),
+
+ TP_ARGS(journal, wait),
+
+ TP_STRUCT__entry(
+ __field( dev_t, dev )
+ __field( int, wait )
+ ),
+
+ TP_fast_assign(
+ __entry->dev = journal->j_fs_dev->bd_dev;
+ __entry->wait = wait;
+ ),
+
+ TP_printk("dev %d,%d wait %d",
+ MAJOR(__entry->dev), MINOR(__entry->dev),
+ __entry->wait)
+);
+
#endif /* _TRACE_JBD2_H */
/* This part must be outside protection */
--
1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-01-13 22:51:41
|
Tony, I understand you are seriously concerned about reliability of pstore. And I'm in the same position. But I still suggest to move kmsg_dump() below smp_send_stop(). >The 20% of me that isn't buying this >still has worries that smp_send_stop() might fail in one of several ways: >1) Fails to actually stop one or more other cpus (this is similar to our >current situation where other cpus may interfere with us saving kmsg in >pstore). I don't understand this case. Have you ever experienced some specific cases failing to stop cpus? As for x86, this case will never happen if not cpus are broken. >2) Causes another fault, thus recursively entering the panic path. >3) Hangs - causing us to miss saving to pstore. These concerns are not just smp_send_stop(). In panic(), there are some function calls above kmsg_dump(). Ex. dump_stack(), printk(), crash_kexec().... If they panic/hang, same issues will happen. So, 2) and 3) are not reasonable reasons for rejecting to move kmsg_dump() below smp_send_stop(). > >I don't know what can be done to resolve this - it is hard to make a >100% convincing argument about the execution of any code in the panic >path. One of the ways we have confidence is doing more testing. As for kdump, LKDTM is used for checking regressions of kdump. If pstore works with LKDTM, we can prove that pstore has minimal reliabliy. (I don't know if we need additional testing at this time.) Seiji |
|
From: Seiji A. <sei...@hd...> - 2012-01-12 20:59:07
|
Hello Ted,
Please review this patch adding jbd2 tracepoints(and hopefully apply to your tree).
This has already been accepted by Lukas and Jan.
Seiji
---
This patch adds trace_jbd2_drop_transaction and
trace_jbd2_update_superblock_end because there are similar tracepoints in jbd and
they are needed in jbd2 as well.
Signed-off-by: Seiji Aguchi <sei...@hd...>
Reviewed-by: Lukas Czerner <lcz...@re...>
Acked-by: Jan Kara <ja...@su...>
---
fs/jbd2/checkpoint.c | 2 ++
fs/jbd2/journal.c | 2 ++
include/trace/events/jbd2.h | 28 ++++++++++++++++++++++++++++
3 files changed, 32 insertions(+), 0 deletions(-)
diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
index 16a698b..2bfd8b0 100644
--- a/fs/jbd2/checkpoint.c
+++ b/fs/jbd2/checkpoint.c
@@ -797,5 +797,7 @@ void __jbd2_journal_drop_transaction(journal_t *journal, transaction_t *transact
J_ASSERT(journal->j_committing_transaction != transaction);
J_ASSERT(journal->j_running_transaction != transaction);
+ trace_jbd2_drop_transaction(journal, transaction);
+
jbd_debug(1, "Dropping transaction %d, all done\n", transaction->t_tid);
}
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index f24df13..5953b3d 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -1185,6 +1185,8 @@ void jbd2_journal_update_superblock(journal_t *journal, int wait)
} else
write_dirty_buffer(bh, WRITE);
+ trace_jbd2_update_superblock_end(journal, wait);
+
out:
/* If we have just flushed the log (by marking s_start==0), then
* any future commit will have to be careful to update the
diff --git a/include/trace/events/jbd2.h b/include/trace/events/jbd2.h
index 7596441..ae59bc2 100644
--- a/include/trace/events/jbd2.h
+++ b/include/trace/events/jbd2.h
@@ -81,6 +81,13 @@ DEFINE_EVENT(jbd2_commit, jbd2_commit_logging,
TP_ARGS(journal, commit_transaction)
);
+DEFINE_EVENT(jbd2_commit, jbd2_drop_transaction,
+
+ TP_PROTO(journal_t *journal, transaction_t *commit_transaction),
+
+ TP_ARGS(journal, commit_transaction)
+);
+
TRACE_EVENT(jbd2_end_commit,
TP_PROTO(journal_t *journal, transaction_t *commit_transaction),
@@ -229,6 +236,27 @@ TRACE_EVENT(jbd2_cleanup_journal_tail,
__entry->block_nr, __entry->freed)
);
+TRACE_EVENT(jbd2_update_superblock_end,
+
+ TP_PROTO(journal_t *journal, int wait),
+
+ TP_ARGS(journal, wait),
+
+ TP_STRUCT__entry(
+ __field( dev_t, dev )
+ __field( int, wait )
+ ),
+
+ TP_fast_assign(
+ __entry->dev = journal->j_fs_dev->bd_dev;
+ __entry->wait = wait;
+ ),
+
+ TP_printk("dev %d,%d wait %d",
+ MAJOR(__entry->dev), MINOR(__entry->dev),
+ __entry->wait)
+);
+
#endif /* _TRACE_JBD2_H */
/* This part must be outside protection */
--
1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-01-12 01:07:00
|
Thanks.
I will ask Ted to merge this patch.
Seiji
>-----Original Message-----
>From: lin...@vg... [mailto:lin...@vg...] On Behalf Of Jan Kara
>Sent: Wednesday, January 11, 2012 8:02 PM
>To: Seiji Aguchi
>Cc: ja...@su...; dle...@li...; lin...@vg...; Lukas Czerner; Satoru Moriya
>Subject: Re: [PATCH][RESEND]tracepoint: add drop_transaction/update_superblock_end to jbd2
>
> Hello Seiji,
>
>On Wed 11-01-12 19:10:40, Seiji Aguchi wrote:
>> Please review this patch below (and hopefully apply to your tree).
>> This has already been reviewed by Lukas.
> The patch looks OK to me so feel free to add
>Acked-by: Jan Kara <ja...@su...>
> but JBD2 patches are usually merged by Ted Tso <ty...@mi...> since
>he is an ext4 maintainer and so he carries most of changes that modify
>JBD2.
>
> Honza
>
>> ---
>> This patch adds trace_jbd2_drop_transaction and
>> trace_jbd2_update_superblock_end because there are similar tracepoints in jbd and
>> they are needed in jbd2 as well.
>>
>>
>> Signed-off-by: Seiji Aguchi <sei...@hd...>
>> Reviewed-by: Lukas Czerner <lcz...@re...>
>>
>>
>> ---
>> fs/jbd2/checkpoint.c | 2 ++
>> fs/jbd2/journal.c | 2 ++
>> include/trace/events/jbd2.h | 28 ++++++++++++++++++++++++++++
>> 3 files changed, 32 insertions(+), 0 deletions(-)
>>
>> diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
>> index 16a698b..2bfd8b0 100644
>> --- a/fs/jbd2/checkpoint.c
>> +++ b/fs/jbd2/checkpoint.c
>> @@ -797,5 +797,7 @@ void __jbd2_journal_drop_transaction(journal_t *journal, transaction_t *transact
>> J_ASSERT(journal->j_committing_transaction != transaction);
>> J_ASSERT(journal->j_running_transaction != transaction);
>>
>> + trace_jbd2_drop_transaction(journal, transaction);
>> +
>> jbd_debug(1, "Dropping transaction %d, all done\n", transaction->t_tid);
>> }
>> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
>> index f24df13..5953b3d 100644
>> --- a/fs/jbd2/journal.c
>> +++ b/fs/jbd2/journal.c
>> @@ -1185,6 +1185,8 @@ void jbd2_journal_update_superblock(journal_t *journal, int wait)
>> } else
>> write_dirty_buffer(bh, WRITE);
>>
>> + trace_jbd2_update_superblock_end(journal, wait);
>> +
>> out:
>> /* If we have just flushed the log (by marking s_start==0), then
>> * any future commit will have to be careful to update the
>> diff --git a/include/trace/events/jbd2.h b/include/trace/events/jbd2.h
>> index 7596441..ae59bc2 100644
>> --- a/include/trace/events/jbd2.h
>> +++ b/include/trace/events/jbd2.h
>> @@ -81,6 +81,13 @@ DEFINE_EVENT(jbd2_commit, jbd2_commit_logging,
>> TP_ARGS(journal, commit_transaction)
>> );
>>
>> +DEFINE_EVENT(jbd2_commit, jbd2_drop_transaction,
>> +
>> + TP_PROTO(journal_t *journal, transaction_t *commit_transaction),
>> +
>> + TP_ARGS(journal, commit_transaction)
>> +);
>> +
>> TRACE_EVENT(jbd2_end_commit,
>> TP_PROTO(journal_t *journal, transaction_t *commit_transaction),
>>
>> @@ -229,6 +236,27 @@ TRACE_EVENT(jbd2_cleanup_journal_tail,
>> __entry->block_nr, __entry->freed)
>> );
>>
>> +TRACE_EVENT(jbd2_update_superblock_end,
>> +
>> + TP_PROTO(journal_t *journal, int wait),
>> +
>> + TP_ARGS(journal, wait),
>> +
>> + TP_STRUCT__entry(
>> + __field( dev_t, dev )
>> + __field( int, wait )
>> + ),
>> +
>> + TP_fast_assign(
>> + __entry->dev = journal->j_fs_dev->bd_dev;
>> + __entry->wait = wait;
>> + ),
>> +
>> + TP_printk("dev %d,%d wait %d",
>> + MAJOR(__entry->dev), MINOR(__entry->dev),
>> + __entry->wait)
>> +);
>> +
>> #endif /* _TRACE_JBD2_H */
>>
>> /* This part must be outside protection */
>> --
>> 1.7.1
>>
>--
>Jan Kara <ja...@su...>
>SUSE Labs, CR
>--
>To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>the body of a message to maj...@vg...
>More majordomo info at http://vger.kernel.org/majordomo-info.html
|
|
From: Jan K. <ja...@su...> - 2012-01-12 01:02:17
|
Hello Seiji,
On Wed 11-01-12 19:10:40, Seiji Aguchi wrote:
> Please review this patch below (and hopefully apply to your tree).
> This has already been reviewed by Lukas.
The patch looks OK to me so feel free to add
Acked-by: Jan Kara <ja...@su...>
but JBD2 patches are usually merged by Ted Tso <ty...@mi...> since
he is an ext4 maintainer and so he carries most of changes that modify
JBD2.
Honza
> ---
> This patch adds trace_jbd2_drop_transaction and
> trace_jbd2_update_superblock_end because there are similar tracepoints in jbd and
> they are needed in jbd2 as well.
>
>
> Signed-off-by: Seiji Aguchi <sei...@hd...>
> Reviewed-by: Lukas Czerner <lcz...@re...>
>
>
> ---
> fs/jbd2/checkpoint.c | 2 ++
> fs/jbd2/journal.c | 2 ++
> include/trace/events/jbd2.h | 28 ++++++++++++++++++++++++++++
> 3 files changed, 32 insertions(+), 0 deletions(-)
>
> diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
> index 16a698b..2bfd8b0 100644
> --- a/fs/jbd2/checkpoint.c
> +++ b/fs/jbd2/checkpoint.c
> @@ -797,5 +797,7 @@ void __jbd2_journal_drop_transaction(journal_t *journal, transaction_t *transact
> J_ASSERT(journal->j_committing_transaction != transaction);
> J_ASSERT(journal->j_running_transaction != transaction);
>
> + trace_jbd2_drop_transaction(journal, transaction);
> +
> jbd_debug(1, "Dropping transaction %d, all done\n", transaction->t_tid);
> }
> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> index f24df13..5953b3d 100644
> --- a/fs/jbd2/journal.c
> +++ b/fs/jbd2/journal.c
> @@ -1185,6 +1185,8 @@ void jbd2_journal_update_superblock(journal_t *journal, int wait)
> } else
> write_dirty_buffer(bh, WRITE);
>
> + trace_jbd2_update_superblock_end(journal, wait);
> +
> out:
> /* If we have just flushed the log (by marking s_start==0), then
> * any future commit will have to be careful to update the
> diff --git a/include/trace/events/jbd2.h b/include/trace/events/jbd2.h
> index 7596441..ae59bc2 100644
> --- a/include/trace/events/jbd2.h
> +++ b/include/trace/events/jbd2.h
> @@ -81,6 +81,13 @@ DEFINE_EVENT(jbd2_commit, jbd2_commit_logging,
> TP_ARGS(journal, commit_transaction)
> );
>
> +DEFINE_EVENT(jbd2_commit, jbd2_drop_transaction,
> +
> + TP_PROTO(journal_t *journal, transaction_t *commit_transaction),
> +
> + TP_ARGS(journal, commit_transaction)
> +);
> +
> TRACE_EVENT(jbd2_end_commit,
> TP_PROTO(journal_t *journal, transaction_t *commit_transaction),
>
> @@ -229,6 +236,27 @@ TRACE_EVENT(jbd2_cleanup_journal_tail,
> __entry->block_nr, __entry->freed)
> );
>
> +TRACE_EVENT(jbd2_update_superblock_end,
> +
> + TP_PROTO(journal_t *journal, int wait),
> +
> + TP_ARGS(journal, wait),
> +
> + TP_STRUCT__entry(
> + __field( dev_t, dev )
> + __field( int, wait )
> + ),
> +
> + TP_fast_assign(
> + __entry->dev = journal->j_fs_dev->bd_dev;
> + __entry->wait = wait;
> + ),
> +
> + TP_printk("dev %d,%d wait %d",
> + MAJOR(__entry->dev), MINOR(__entry->dev),
> + __entry->wait)
> +);
> +
> #endif /* _TRACE_JBD2_H */
>
> /* This part must be outside protection */
> --
> 1.7.1
>
--
Jan Kara <ja...@su...>
SUSE Labs, CR
|
|
From: Seiji A. <sei...@hd...> - 2012-01-12 00:10:54
|
Hi Jan,
Please review this patch below (and hopefully apply to your tree).
This has already been reviewed by Lukas.
Seiji
---
This patch adds trace_jbd2_drop_transaction and
trace_jbd2_update_superblock_end because there are similar tracepoints in jbd and
they are needed in jbd2 as well.
Signed-off-by: Seiji Aguchi <sei...@hd...>
Reviewed-by: Lukas Czerner <lcz...@re...>
---
fs/jbd2/checkpoint.c | 2 ++
fs/jbd2/journal.c | 2 ++
include/trace/events/jbd2.h | 28 ++++++++++++++++++++++++++++
3 files changed, 32 insertions(+), 0 deletions(-)
diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
index 16a698b..2bfd8b0 100644
--- a/fs/jbd2/checkpoint.c
+++ b/fs/jbd2/checkpoint.c
@@ -797,5 +797,7 @@ void __jbd2_journal_drop_transaction(journal_t *journal, transaction_t *transact
J_ASSERT(journal->j_committing_transaction != transaction);
J_ASSERT(journal->j_running_transaction != transaction);
+ trace_jbd2_drop_transaction(journal, transaction);
+
jbd_debug(1, "Dropping transaction %d, all done\n", transaction->t_tid);
}
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index f24df13..5953b3d 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -1185,6 +1185,8 @@ void jbd2_journal_update_superblock(journal_t *journal, int wait)
} else
write_dirty_buffer(bh, WRITE);
+ trace_jbd2_update_superblock_end(journal, wait);
+
out:
/* If we have just flushed the log (by marking s_start==0), then
* any future commit will have to be careful to update the
diff --git a/include/trace/events/jbd2.h b/include/trace/events/jbd2.h
index 7596441..ae59bc2 100644
--- a/include/trace/events/jbd2.h
+++ b/include/trace/events/jbd2.h
@@ -81,6 +81,13 @@ DEFINE_EVENT(jbd2_commit, jbd2_commit_logging,
TP_ARGS(journal, commit_transaction)
);
+DEFINE_EVENT(jbd2_commit, jbd2_drop_transaction,
+
+ TP_PROTO(journal_t *journal, transaction_t *commit_transaction),
+
+ TP_ARGS(journal, commit_transaction)
+);
+
TRACE_EVENT(jbd2_end_commit,
TP_PROTO(journal_t *journal, transaction_t *commit_transaction),
@@ -229,6 +236,27 @@ TRACE_EVENT(jbd2_cleanup_journal_tail,
__entry->block_nr, __entry->freed)
);
+TRACE_EVENT(jbd2_update_superblock_end,
+
+ TP_PROTO(journal_t *journal, int wait),
+
+ TP_ARGS(journal, wait),
+
+ TP_STRUCT__entry(
+ __field( dev_t, dev )
+ __field( int, wait )
+ ),
+
+ TP_fast_assign(
+ __entry->dev = journal->j_fs_dev->bd_dev;
+ __entry->wait = wait;
+ ),
+
+ TP_printk("dev %d,%d wait %d",
+ MAJOR(__entry->dev), MINOR(__entry->dev),
+ __entry->wait)
+);
+
#endif /* _TRACE_JBD2_H */
/* This part must be outside protection */
--
1.7.1
|
|
From: Luck, T. <ton...@in...> - 2012-01-11 22:22:26
|
> The explanation is great. but In my opinion, I still insist that > a WARN_ON() is necessary. What do you think, Tony and Don? I disagree - a WARN_ON() would clutter the log with a stack trace of dubious usefulness - possibly displacing something earlier in the console log from being saved to pstore. I'm about 80% sold on the idea of moving the call below smp_send_stop(). It does make life simpler for pstore (though not completely pain free, ignoring locks is safe only in as far as we know that other cpus aren't actually trying to modify things - there is still some problem if the other cpu had a device or data structure in some special state when smp_send_stop() interrupted it). The 20% of me that isn't buying this still has worries that smp_send_stop() might fail in one of several ways: 1) Fails to actually stop one or more other cpus (this is similar to our current situation where other cpus may interfere with us saving kmsg in pstore). 2) Causes another fault, thus recursively entering the panic path. 3) Hangs - causing us to miss saving to pstore. I don't know what can be done to resolve this - it is hard to make a 100% convincing argument about the execution of any code in the panic path. -Tony |