|
From: Seiji A. <sei...@hd...> - 2011-10-17 14:11:11
|
Hi,
Thank you for your comment.
>I have a stupid question: since you have serialized the process procedure via
>smp_send_stop, why still using spin_lock_xxx? Maybe preempt_disable/enable is
>enough?
I added spin_lock_init() in the panic path so that the code can be shared with the other
triggers (oops, reboot, emergency_restart), which still need the spinlock.
Are you suggesting something like the following?
<snip>
if (!panic)
    spin_lock_irqsave(&psinfo->buf_lock, flags);
.
.
if (!panic)
    spin_unlock_irqrestore(&psinfo->buf_lock, flags);
<snip>
I am not attached to the current patch,
so I will resend one along the lines above if you prefer.
As for preempt_disable/enable, we do not need to call them in the panic path because
preempt_disable() is already called at the beginning of panic().
<snip>
NORET_TYPE void panic(const char * fmt, ...)
{
    static char buf[1024];
    va_list args;
    long i, i_next = 0;
    int state = 0;

    /*
     * It's possible to come here directly from a panic-assertion and
     * not have preempt disabled. Some functions called from here want
     * preempt to be disabled. No point enabling it later though...
     */
    preempt_disable();
<snip>
Seiji
|
|
From: Chen G. <gon...@li...> - 2011-10-17 06:21:50
|
On 2011/10/15 4:53, Seiji Aguchi wrote:
> Hi,
>
> As Don mentioned in following thread, it would be nice for pstore/kmsg_dump to serialize
> panic path and have one cpu running because they can log messages reliably.
>
> https://lkml.org/lkml/2011/10/13/427
>
> For realizing this idea, we have to move kmsg_dump below smp_send_stop() and bust some locks
> of kmsg_dump/pstore in panic path.
>
> This patch does followings.
>
> - moving kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop.
> - busting logbuf_lock of kmsg_dump() in panic path for avoiding deadlock.
> - busting psinfo->buf_lock of pstore_dump() in panic path for avoiding deadlock.
>
> Any comments are welcome.
>
Hi, Seiji
I have a stupid question: since you have serialized the process procedure via
smp_send_stop, why still using spin_lock_xxx? Maybe preempt_disable/enable is
enough?
|
From: Seiji A. <sei...@hd...> - 2011-10-14 20:53:36
|
Hi,
As Don mentioned in following thread, it would be nice for pstore/kmsg_dump to serialize
panic path and have one cpu running because they can log messages reliably.
https://lkml.org/lkml/2011/10/13/427
For realizing this idea, we have to move kmsg_dump below smp_send_stop() and bust some locks
of kmsg_dump/pstore in panic path.
This patch does followings.
- moving kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop.
- busting logbuf_lock of kmsg_dump() in panic path for avoiding deadlock.
- busting psinfo->buf_lock of pstore_dump() in panic path for avoiding deadlock.
Any comments are welcome.
Signed-off-by: Seiji Aguchi <sei...@hd...>
---
 fs/pstore/platform.c |   22 ++++++++++------------
 kernel/panic.c       |    4 ++--
 kernel/printk.c      |    7 +++++++
 3 files changed, 19 insertions(+), 14 deletions(-)
diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index 2bd620f..e73d940 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -90,19 +90,21 @@ static void pstore_dump(struct kmsg_dumper *dumper,
     int hsize, ret;
     unsigned int part = 1;
     unsigned long flags = 0;
-    int is_locked = 0;
 
     if (reason < ARRAY_SIZE(reason_str))
         why = reason_str[reason];
     else
         why = "Unknown";
 
-    if (in_nmi()) {
-        is_locked = spin_trylock(&psinfo->buf_lock);
-        if (!is_locked)
-            pr_err("pstore dump routine blocked in NMI, may corrupt error record\n");
-    } else
-        spin_lock_irqsave(&psinfo->buf_lock, flags);
+    /*
+     * pstore_dump() is called after smp_send_stop() in panic path.
+     * So, spin_lock should be bust for avoiding deadlock.
+     */
+    if (reason == KMSG_DUMP_PANIC)
+        spin_lock_init(&psinfo->buf_lock);
+
+    spin_lock_irqsave(&psinfo->buf_lock, flags);
+
     oopscount++;
     while (total < kmsg_bytes) {
         dst = psinfo->buf;
@@ -131,11 +133,7 @@ static void pstore_dump(struct kmsg_dumper *dumper,
         total += l1_cpy + l2_cpy;
         part++;
     }
-    if (in_nmi()) {
-        if (is_locked)
-            spin_unlock(&psinfo->buf_lock);
-    } else
-        spin_unlock_irqrestore(&psinfo->buf_lock, flags);
+    spin_unlock_irqrestore(&psinfo->buf_lock, flags);
 }
 
 static struct kmsg_dumper pstore_dumper = {
diff --git a/kernel/panic.c b/kernel/panic.c
index d7bb697..41bf6ad 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -88,8 +88,6 @@ NORET_TYPE void panic(const char * fmt, ...)
      */
     crash_kexec(NULL);
 
-    kmsg_dump(KMSG_DUMP_PANIC);
-
     /*
      * Note smp_send_stop is the usual smp shutdown function, which
      * unfortunately means it may not be hardened to work in a panic
@@ -97,6 +95,8 @@ NORET_TYPE void panic(const char * fmt, ...)
      */
     smp_send_stop();
 
+    kmsg_dump(KMSG_DUMP_PANIC);
+
     atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
 
     bust_spinlocks(0);
diff --git a/kernel/printk.c b/kernel/printk.c
index 1455a0d..e1e57db 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -1732,6 +1732,13 @@ void kmsg_dump(enum kmsg_dump_reason reason)
     unsigned long l1, l2;
     unsigned long flags;
 
+    /*
+     * kmsg_dump() is called after smp_send_stop() in panic path.
+     * So, spin_lock should be bust for avoiding deadlock.
+     */
+    if (reason == KMSG_DUMP_PANIC)
+        raw_spin_lock_init(&logbuf_lock);
+
     /* Theoretically, the log could move on after we do this, but
        there's not a lot we can do about that. The new messages
        will overwrite the start of what we dump. */
--
1.7.1
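Editorial sketch (not part of the patch): the lock-busting step described above can be illustrated in userspace C. The toy lock below uses a C11 atomic_flag in place of the kernel spinlock. The names sim_spinlock, sim_dump() and SIM_DUMP_* are hypothetical and exist only for this illustration, and the code assumes every other CPU has already been stopped, which is what moving the dump below smp_send_stop() is meant to ensure.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Simulated spinlock; stands in for logbuf_lock or psinfo->buf_lock. */
typedef struct { atomic_flag held; } sim_spinlock;

static void sim_lock_init(sim_spinlock *l) { atomic_flag_clear(&l->held); }
static bool sim_trylock(sim_spinlock *l)   { return !atomic_flag_test_and_set(&l->held); }
static void sim_unlock(sim_spinlock *l)    { atomic_flag_clear(&l->held); }

enum sim_reason { SIM_DUMP_OOPS, SIM_DUMP_PANIC };

/*
 * Hypothetical dumper. On the panic path it re-initializes ("busts") the
 * lock first, on the assumption that a previous owner has been stopped and
 * will never release it. Returns true if the dump ran.
 */
static bool sim_dump(sim_spinlock *l, enum sim_reason reason)
{
    if (reason == SIM_DUMP_PANIC)
        sim_lock_init(l);

    if (!sim_trylock(l))
        return false;   /* a blocking spin_lock_irqsave() would spin forever here */

    /* ... copy the log buffer to persistent storage ... */
    sim_unlock(l);
    return true;
}
```

With the lock left held by a "stopped" owner, sim_dump(&lock, SIM_DUMP_OOPS) cannot proceed, while sim_dump(&lock, SIM_DUMP_PANIC) resets the lock and completes. Re-initializing is only safe when nothing else can still be inside the critical section.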
|
From: Satoru M. <sat...@hd...> - 2011-09-29 16:41:27
|
Hi,
This is a tunable watermark systemtap script v2.
This implements the features that confine reclaimed page to pagecache.
For more details, please see README in tarball.
Regards,
Satoru
|
From: Luck, T. <ton...@in...> - 2011-09-28 22:30:12
|
> That should be up to the backend, no? ERST has two modes, only one which
> has a state machine. The other is NVRAM which can probably handle
> simultaneous writes. And I believe the EFI back-end can handle that as
> well. That is why I was suggesting that the back-end return a failure.
ERST tries to provide a lot of flexibility to the platform on how to
make persistent space available. If the platform has directly addressable
NVRAM - then ERST can point to it, so the "save" operation degenerates into
a simple write to the next available block. But this isn't required. The
ERST buffer may be in normal memory, and the actions in the state machine
may trigger an SMI to make the BIOS copy it away to some safe place, or
the actions could ping a doorbell on a management controller which could
initiate a DMA transfer to pick up the buffer from memory.
-Tony
|
From: Seiji A. <sei...@hd...> - 2011-09-28 18:55:31
|
Hi Don,
>[reads the kernel/panic.c code] oh, I see this already exists, you would
>just move the smp_send_stop() command up a couple lines of code.
>
>[Side note] perhaps we should change the behaviour of smp_send_stop to use
>NMI and create a blacklist of machines to use the IRQ line instead. I
>assume the list of broken machines is small as Red Hat has been kdumping
>kernels since 2.6.18 with little evidence that machines were failing
>because NMI wasn't working properly.
OK. I will develop a patch in accordance with your comment above.
In addition to that, I have to improve mtdoops/ramoops because
they contain some blocking code.
Seiji
|
From: Don Z. <dz...@re...> - 2011-09-28 14:09:48
|
On Tue, Sep 27, 2011 at 03:46:08PM -0400, Seiji Aguchi wrote:
> Hi,
>
> >Yes we care - saving panic data is most likely the single most important
> >thing that pstore does. I just have severe doubts that it will actually
> >save anything useful if we just blindly continue if we can't get the lock.
>
> I agree with Tony. We may not get useful information if pstore just blindly continues
> while other cpus are running.
>
> >Is this patch based on a real-life case of a system deadlocking? I'd
> >like to know if we are just talking around the theoretical case that
> >the lock may be held at panic time - or something that has actually been
> >seen in real life.
>
> This patch is _not_ based on real-life case. I would like to avoid potential deadlock.
>
> If Don disagrees to my "return" code, I have another idea which moves pstore_dump() behind smp_send_stop().
> smp_send_stop() stops other cpus by sending IPI.
> So pstore can continue reliably and get useful information by just busting spinlock.
Yeah, Vivek had a similar idea to have the common panic path mimic what
they do with kdump, stop all the cpus except for the crashing one, to
serialize the crashing path. This would allow us to more easily bust
spinlocks without worrying about what the other cpus are doing.
The kdump solution involves using NMI whereas smp_send_stop (on x86)
avoids it because of past issues and instead uses the IRQ line. This
won't work if pstore_dump uses spin_trylock_irqsave() because the IRQ
line will be disabled and never get the smp_send_stop() message (unless I
am reading the code wrong).
[reads the kernel/panic.c code] oh, I see this already exists, you would
just move the smp_send_stop() command up a couple lines of code.
[Side note] perhaps we should change the behaviour of smp_send_stop to use
NMI and create a blacklist of machines to use the IRQ line instead. I
assume the list of broken machines is small as Red Hat has been kdumping
kernels since 2.6.18 with little evidence that machines were failing
because NMI wasn't working properly.
Cheers,
Don
|
From: Don Z. <dz...@re...> - 2011-09-28 13:57:41
|
On Tue, Sep 27, 2011 at 12:02:38PM -0700, Luck, Tony wrote:
> > Ok. Do we care? I assumed the panic data would be more
> > relevant/interesting than whatever pstore was doing before (like loading
> > previous log files).
>
> Yes we care - saving panic data is most likely the single most important
> thing that pstore does. I just have severe doubts that it will actually
> save anything useful if we just blindly continue if we can't get the lock.
Well, I was trying to imply that any pre-panic info is uninteresting. It
is the panic/NMI stuff that should be top priority, worthy of busting the
spin lock.
>
> What actually happens next will be dependent on the back-end. For
> the state machine in ERST, one possible outcome is a hang. For many
> people a hang is considered worse than a panic.
That should be up to the backend, no? ERST has two modes, only one which
has a state machine. The other is NVRAM which can probably handle
simultaneous writes. And I believe the EFI back-end can handle that as
well. That is why I was suggesting that the back-end return a failure.
>
> > I assumed we are just overwriting the buffer with the current data, so
> > unless the other cpu is chugging along while this cpu is in panic, the new
> > data shouldn't get corrupted, no?
>
> I really have no idea what *will* happen. Lots of things are possible, only
> some of them are desirable.
My concern here is that if someone is just toying with pstore,
writing/reading data or even just poking at it to see what is going on with
the system, they may accidentally block real system errors or panics from
properly logging. That doesn't seem right.
Cheers,
Don
|
From: Seiji A. <sei...@hd...> - 2011-09-27 19:46:33
|
Hi,
>Yes we care - saving panic data is most likely the single most important
>thing that pstore does. I just have severe doubts that it will actually
>save anything useful if we just blindly continue if we can't get the lock.
I agree with Tony. We may not get useful information if pstore just blindly continues
while other cpus are running.
>Is this patch based on a real-life case of a system deadlocking? I'd
>like to know if we are just talking around the theoretical case that
>the lock may be held at panic time - or something that has actually been
>seen in real life.
This patch is _not_ based on real-life case. I would like to avoid potential deadlock.
If Don disagrees to my "return" code, I have another idea which moves pstore_dump() behind smp_send_stop().
smp_send_stop() stops other cpus by sending IPI.
So pstore can continue reliably and get useful information by just busting spinlock.
It depends on each backend driver whether it actually accesses to NVRAM/storage.
Idea
====
Panic()
  |- smp_send_stop() (Send IPI to other cpus)
  |- bust spin_lock(&psinfo->buf_lock)
  |- call pstore_dump()
Seiji
>-----Original Message-----
>From: Luck, Tony [mailto:ton...@in...]
>Sent: Tuesday, September 27, 2011 3:03 PM
>To: Don Zickus
>Cc: Seiji Aguchi; lin...@vg...; Vivek Goyal; Matthew Garrett; Chen, Gong; Andrew Morton;
>dle...@li...; Satoru Moriya
>Subject: RE: [RFC][PATCH -next] pstore: replace spin_lock with spin_trylock_irqsave in panic path
>
>> Ok. Do we care? I assumed the panic data would be more
>> relevant/interesting than whatever pstore was doing before (like loading
>> previous log files).
>
>Yes we care - saving panic data is most likely the single most important
>thing that pstore does. I just have severe doubts that it will actually
>save anything useful if we just blindly continue if we can't get the lock.
>
>What actually happens next will be dependent on the back-end. For
>the state machine in ERST, one possible outcome is a hang. For many
>people a hang is considered worse than a panic.
>
>> I assumed we are just overwriting the buffer with the current data, so
>> unless the other cpu is chugging along while this cpu is in panic, the new
>> data shouldn't get corrupted, no?
>
>I really have no idea what *will* happen. Lots of things are possible, only
>some of them are desirable.
>
>Is this patch based on a real-life case of a system deadlocking? I'd
>like to know if we are just talking around the theoretical case that
>the lock may be held at panic time - or something that has actually been
>seen in real life.
>
>-Tony
|
From: Luck, T. <ton...@in...> - 2011-09-27 19:02:48
|
> Ok. Do we care? I assumed the panic data would be more
> relevant/interesting than whatever pstore was doing before (like loading
> previous log files).
Yes we care - saving panic data is most likely the single most important
thing that pstore does. I just have severe doubts that it will actually
save anything useful if we just blindly continue if we can't get the lock.
What actually happens next will be dependent on the back-end. For
the state machine in ERST, one possible outcome is a hang. For many
people a hang is considered worse than a panic.
> I assumed we are just overwriting the buffer with the current data, so
> unless the other cpu is chugging along while this cpu is in panic, the new
> data shouldn't get corrupted, no?
I really have no idea what *will* happen. Lots of things are possible, only
some of them are desirable.
Is this patch based on a real-life case of a system deadlocking? I'd
like to know if we are just talking around the theoretical case that
the lock may be held at panic time - or something that has actually been
seen in real life.
-Tony
|
From: Don Z. <dz...@re...> - 2011-09-27 17:59:36
|
On Tue, Sep 27, 2011 at 10:46:32AM -0700, Luck, Tony wrote:
> > Personally, I am not sure we want to abort here at the pstore layer, it
> > should probably be aborted lower. There isn't any reason why we can't
> > continue from a pstore perspective (we can just bust the spinlock).
>
> But do we really have much chance at getting a real dump in this case?
> The pstore buf_lock is protecting the memory that the backend uses to
> save the data. If we can't get the lock, then we are going to conflict
> using that buffer with whoever does have the lock. So we will probably
> mess up whatever data they were trying to save, as well as not managing
Ok. Do we care? I assumed the panic data would be more
relevant/interesting than whatever pstore was doing before (like loading
previous log files).
> to save our panic data. So this isn't just a back-end issue, it is
I assumed we are just overwriting the buffer with the current data, so
unless the other cpu is chugging along while this cpu is in panic, the new
data shouldn't get corrupted, no?
Cheers,
Don
> fundamental to the pstore layer (since it depends on this back end buffer).
>
> This is a tough call - but I'm leaning a bit towards taking this patch.
>
> I agree with your suggestion that we need a better comment by the "return"
> (and also in the change log) saying why we are not saving the panic dmesg.
>
> -Tony
|
From: Luck, T. <ton...@in...> - 2011-09-27 17:46:52
|
> Personally, I am not sure we want to abort here at the pstore layer, it
> should probably be aborted lower. There isn't any reason why we can't
> continue from a pstore perspective (we can just bust the spinlock).
But do we really have much chance at getting a real dump in this case?
The pstore buf_lock is protecting the memory that the backend uses to
save the data. If we can't get the lock, then we are going to conflict
using that buffer with whoever does have the lock. So we will probably
mess up whatever data they were trying to save, as well as not managing
to save our panic data. So this isn't just a back-end issue, it is
fundamental to the pstore layer (since it depends on this back end buffer).
This is a tough call - but I'm leaning a bit towards taking this patch.
I agree with your suggestion that we need a better comment by the "return"
(and also in the change log) saying why we are not saving the panic dmesg.
-Tony
|
From: Don Z. <dz...@re...> - 2011-09-27 17:34:26
|
On Tue, Sep 27, 2011 at 01:14:59PM -0400, Seiji Aguchi wrote:
> Hi,
>
> [Problem]
> Currently, pstore takes spin_trylock(&psinfo->buf_lock) in NMI context.
> And it takes spin_lock(&psinfo->buf_lock) in other cases.
>
> If there are some bugs in pstore and kernel panics, spin_lock(&psinfo->buf_lock) causes deadlock
> and panic_notifier_chain will not work.
Ok, so I missed your 'return' first time through and originally had a
bunch of comments. So I would suggest adding a comment explaining why we
are returning in that failure.
Personally, I am not sure we want to abort here at the pstore layer, it
should probably be aborted lower. There isn't any reason why we can't
continue from a pstore perspective (we can just bust the spinlock).
From an ERST perspective, the state machine might be screwed up, hence
aborting in that layer could make sense. But I don't think I agree with
the 'return' statement.
So I am opposed to it for now.
Cheers,
Don
>
> [Patch Description]
> For solving this problem, this patch replaces spin_lock with spin_trylock_irqsave in panic path.
>
> Dead lock in panic path will not happen by applying this patch.
>
> Signed-off-by: Seiji Aguchi <sei...@hd...>
>
> ---
> fs/pstore/platform.c | 17 ++++++++---------
> 1 files changed, 8 insertions(+), 9 deletions(-)
>
> diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c index 0472924..9882892 100644
> --- a/fs/pstore/platform.c
> +++ b/fs/pstore/platform.c
> @@ -97,12 +97,15 @@ static void pstore_dump(struct kmsg_dumper *dumper,
> else
> why = "Unknown";
>
> - if (in_nmi()) {
> - is_locked = spin_trylock(&psinfo->buf_lock);
> - if (!is_locked)
> - pr_err("pstore dump routine blocked in NMI, may corrupt error record\n");
> + if (reason == KMSG_DUMP_PANIC) {
> + is_locked = spin_trylock_irqsave(&psinfo->buf_lock, flags);
> + if (!is_locked) {
> + pr_err("pstore dump routine skipped in panic path\n");
> + return;
> + }
> } else
> spin_lock_irqsave(&psinfo->buf_lock, flags);
> +
> oopscount++;
> while (total < kmsg_bytes) {
> dst = psinfo->buf;
> @@ -131,11 +134,7 @@ static void pstore_dump(struct kmsg_dumper *dumper,
> total += l1_cpy + l2_cpy;
> part++;
> }
> - if (in_nmi()) {
> - if (is_locked)
> - spin_unlock(&psinfo->buf_lock);
> - } else
> - spin_unlock_irqrestore(&psinfo->buf_lock, flags);
> + spin_unlock_irqrestore(&psinfo->buf_lock, flags);
> }
>
> static struct kmsg_dumper pstore_dumper = {
> --
> 1.7.1
>
|
|
From: Seiji A. <sei...@hd...> - 2011-09-27 17:15:17
|
Hi,
[Problem]
Currently, pstore takes spin_trylock(&psinfo->buf_lock) in NMI context.
And it takes spin_lock(&psinfo->buf_lock) in other cases.
If there are some bugs in pstore and kernel panics, spin_lock(&psinfo->buf_lock) causes deadlock
and panic_notifier_chain will not work.
[Patch Description]
For solving this problem, this patch replaces spin_lock with spin_trylock_irqsave in panic path.
Dead lock in panic path will not happen by applying this patch.
Signed-off-by: Seiji Aguchi <sei...@hd...>
---
fs/pstore/platform.c | 17 ++++++++---------
1 files changed, 8 insertions(+), 9 deletions(-)
diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c index 0472924..9882892 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -97,12 +97,15 @@ static void pstore_dump(struct kmsg_dumper *dumper,
else
why = "Unknown";
- if (in_nmi()) {
- is_locked = spin_trylock(&psinfo->buf_lock);
- if (!is_locked)
- pr_err("pstore dump routine blocked in NMI, may corrupt error record\n");
+ if (reason == KMSG_DUMP_PANIC) {
+ is_locked = spin_trylock_irqsave(&psinfo->buf_lock, flags);
+ if (!is_locked) {
+ pr_err("pstore dump routine skipped in panic path\n");
+ return;
+ }
} else
spin_lock_irqsave(&psinfo->buf_lock, flags);
+
oopscount++;
while (total < kmsg_bytes) {
dst = psinfo->buf;
@@ -131,11 +134,7 @@ static void pstore_dump(struct kmsg_dumper *dumper,
total += l1_cpy + l2_cpy;
part++;
}
- if (in_nmi()) {
- if (is_locked)
- spin_unlock(&psinfo->buf_lock);
- } else
- spin_unlock_irqrestore(&psinfo->buf_lock, flags);
+ spin_unlock_irqrestore(&psinfo->buf_lock, flags);
}
static struct kmsg_dumper pstore_dumper = {
--
1.7.1
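Editorial sketch (not part of the patch): the locking rule in this patch, try once on the panic path and skip the dump if the lock is busy, can be modelled in userspace C. The toy lock is a C11 atomic_flag standing in for psinfo->buf_lock. The names pd_spinlock and pd_dump() are hypothetical, and the counter only stands in for the real copy to the backend.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy spinlock; stands in for psinfo->buf_lock. */
typedef struct { atomic_flag held; } pd_spinlock;

static bool pd_trylock(pd_spinlock *l) { return !atomic_flag_test_and_set(&l->held); }
static void pd_lock(pd_spinlock *l)    { while (!pd_trylock(l)) { /* spin */ } }
static void pd_unlock(pd_spinlock *l)  { atomic_flag_clear(&l->held); }

/*
 * Hypothetical dump routine following the patch's locking rule.
 * Returns 0 if the dump ran, -1 if it was skipped on the panic path.
 */
static int pd_dump(pd_spinlock *l, bool panic, int *dumps)
{
    if (panic) {
        if (!pd_trylock(l))
            return -1;      /* lock held elsewhere: skip instead of deadlocking */
    } else {
        pd_lock(l);         /* normal path may wait for the owner */
    }

    (*dumps)++;             /* stands in for copying the log to the backend */
    pd_unlock(l);
    return 0;
}
```

The trade-off debated in the replies is visible here: returning -1 avoids the deadlock but also means the panic record is never written while another context holds the lock.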
|
|
From: Nao N. <nao...@hi...> - 2011-09-02 12:52:17
|
Hi Valdis,
(2011/08/31 5:02), Val...@vt... wrote:
> On Sat, 27 Aug 2011 21:54:29 +0900, Nao Nishijima said:
>
>> A kernel device names (e.g. sda) is not useful information because it
>> doesn't always point the same disk at each boot-up time.
>
> If this is important to you, can't you use a udev rule, similar to what most
> distros already stick in 70-persistent-net.rules and 70-persistent-cd.rules?
>
> (Yes, this *does* involve finding a UUID or label or something on the disk
> that you can identify as "same entity as last time".
As you said, a udev rule can identify a disk, at the cost of an extra lookup.
Introducing an "alias", however, reduces both that cost and the risk of
miscommunication, and makes the disk easy to identify.
In addition, the kernel log and command output currently do not match the
device name that a user actually uses (e.g. by-id, by-uuid). I would like to
resolve those mismatches with an "alias".
In other words, I'd like to introduce aliases to unify the device names used
for control and for logging.
Of course, I will also modify commands that take a device name so that they
use persistent device names.
Best regards,
--
Nao NISHIJIMA
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., YOKOHAMA Research Laboratory
Email: nao...@hi...
|
From: Satoru M. <sat...@hd...> - 2011-08-31 16:35:23
|
Hi,
This program can be used for testing tunable watermark effect.
It supports RHEL6.1 (tested on RHEL6.1).
For more details, please see README in tarball.
Regards,
Satoru
|
From: Bernd S. <ber...@fa...> - 2011-08-30 21:53:55
|
On 08/28/2011 08:07 PM, Kay Sievers wrote:
> On Sat, Aug 27, 2011 at 15:20, Tejun Heo<tj...@ke...> wrote:
>> On Sat, Aug 27, 2011 at 09:54:29PM +0900, Nao Nishijima wrote:
>>>> Hmm... I don't follow. Why wouldn't it be able to? All the
>>>> informations are in the log. It is messy but it's there. If you want
>>>
>>> In many cases, the script is able to convert the name. However there is
>>> the special case that the logs do not exist in memory and disk due to
>>> the crash except console.
>>
>> Sorry but I still don't get it. If you can extract the log, the
>> information is there and w/ remote logging (be it via serial or
>> network), the information always has to be there. Are you talking
>> about the case where somehow only the video console somehow succeeds
>> to print out oops? I don't think that's a common case as serial (also
>> on IPMI) tends to be pretty robust, often more robust than video
>> output, and even when such case occurs, the only thing you want is
>> mapping kernel device name to more recognizable information, which
>> isn't difficult at all. If you wanna do it in a really simple manner,
>> just save udisks --dump output after boot and each hotplug event and
>> write a simple script to search it.
>>
>>>> more structured information, u{dev|disks} already maintain device
>>>> library - what maps to what, connected how with what attributes and
>>>> so on. Sending them off to the log machine as device hotplug events
>>>> occur and consulting it when post-processing log message would work
>>>> fine. All you need is just some python scripting. I don't really see
>>>> much point in messing with device names directly. The only thing is
>>>> that the raw log would be prettier. I don't think that is useful
>>>> enough to justify changing kernel device names.
>>>
>>> A kernel device names (e.g. sda) is not useful information because it
>>> doesn't always point the same disk at each boot-up time.
>>
>> Eh? What difference does that make? Just make the target machines
>> send up-to-date disk config info to the log server.
>>
>>> An alias is just an option and provides the ability to give all
>>> kernel devices a "preferred name".
>>>
>>> By default, dev_printk's will show a kernel device name. They show an
>>> alias only when the user assigns a "preferred name" to an alias.
>>> Even if the persistent device name is used, the device names in logs are
>>> different from the name that the users are using. So, an alias helps the
>>> user identify the disk.
>>
>> Yes, I do understand what it's doing and can see there can be cases it
>> can be somewhat useful but I still think it's too adhoc an approach
>> which doesn't really justify itself. It just does too little to solve
>> the actual problem and even that 'little' part isn't very trivial - it
>> adds whole lot of policy decisions to make and I'm pretty sure it will
>> cause good amount of havoc w/ all the system tools which currently
>> don't expect block device names to change to some admin determined
>> free format string on the fly.
>
> My take on this in short again:
> The very same thing that needs to store the 'pretty name' in the
> kernel here, can instead just log that name along with the kernel name
> to /dev/kmsg. It ends up in the kernel log buffer and is the marker to
> safely match all later log entries.
>
> This can be done today, even on many years old distros, with a single
> udev rule and a tiny program. It needs no kernel or tool changes and
> gives almost all the benefits of the 'pretty name' infrastructure.
Could you please explain exactly how? Simply replace sd{X} by the
preferred name in /dev/kmsg? How do you make sure you do not replace
something that is not supposed to be replaced? I think unless you plan
to modify the existing kernel device messages you cannot, as sd{X} is too
general.
And how does /dev/kmsg solve the serial console problem?
However, I think a real technical issue with any udev device name rules
remains - it often does not work well with multipath devices. ALUA
partly solves that problem, but only if there is a direct connection to
each of the ALUA controllers. But if there is a switch in between, there
are often several devices with the same ALUA score. And the situation is
even worse with multipath hardware that does not know ALUA at all.
So for many multipath devices it probably would not make sense to set
alias names. But especially on those system you often would like to have
suitable alias names, as it gets a bit chaotic without (I had to deal
with 30 or even 60 devices x 8 paths in the past...).
Cheers,
Bernd
|
|
From: <Val...@vt...> - 2011-08-30 20:07:14
|
On Sat, 27 Aug 2011 21:54:29 +0900, Nao Nishijima said:
> A kernel device names (e.g. sda) is not useful information because it
> doesn't always point the same disk at each boot-up time.
If this is important to you, can't you use a udev rule, similar to what
most distros already stick in 70-persistent-net.rules and
70-persistent-cd.rules? (Yes, this *does* involve finding a UUID or label
or something on the disk that you can identify as "same entity as last
time".
|
|
From: Kay S. <kay...@vr...> - 2011-08-28 20:02:18
|
On Sat, Aug 27, 2011 at 15:20, Tejun Heo <tj...@ke...> wrote:
> On Sat, Aug 27, 2011 at 09:54:29PM +0900, Nao Nishijima wrote:
>> > Hmm... I don't follow. Why wouldn't it be able to? All the
>> > informations are in the log. It is messy but it's there. If you want
>>
>> In many cases, the script is able to convert the name. However there is
>> the special case that the logs do not exist in memory and disk due to
>> the crash except console.
>
> Sorry but I still don't get it. If you can extract the log, the
> information is there and w/ remote logging (be it via serial or
> network), the information always has to be there. Are you talking
> about the case where somehow only the video console somehow succeeds
> to print out oops? I don't think that's a common case as serial (also
> on IPMI) tends to be pretty robust, often more robust than video
> output, and even when such case occurs, the only thing you want is
> mapping kernel device name to more recognizable information, which
> isn't difficult at all. If you wanna do it in a really simple manner,
> just save udisks --dump output after boot and each hotplug event and
> write a simple script to search it.
>
>> > more structured information, u{dev|disks} already maintain device
>> > library - what maps to what, connected how with what attributes and
>> > so on. Sending them off to the log machine as device hotplug events
>> > occur and consulting it when post-processing log message would work
>> > fine. All you need is just some python scripting. I don't really see
>> > much point in messing with device names directly. The only thing is
>> > that the raw log would be prettier. I don't think that is useful
>> > enough to justify changing kernel device names.
>>
>> A kernel device names (e.g. sda) is not useful information because it
>> doesn't always point the same disk at each boot-up time.
>
> Eh? What difference does that make? Just make the target machines
> send up-to-date disk config info to the log server.
>
>> An alias is just an option and provides the ability to give all
>> kernel devices a "preferred name".
>>
>> By default, dev_printk's will show a kernel device name. They show an
>> alias only when the user assigns a "preferred name" to an alias.
>> Even if the persistent device name is used, the device names in logs are
>> different from the name that the users are using. So, an alias helps the
>> user identify the disk.
>
> Yes, I do understand what it's doing and can see there can be cases it
> can be somewhat useful but I still think it's too adhoc an approach
> which doesn't really justify itself. It just does too little to solve
> the actual problem and even that 'little' part isn't very trivial - it
> adds whole lot of policy decisions to make and I'm pretty sure it will
> cause good amount of havoc w/ all the system tools which currently
> don't expect block device names to change to some admin determined
> free format string on the fly.
My take on this in short again:
The very same thing that needs to store the 'pretty name' in the
kernel here, can instead just log that name along with the kernel name
to /dev/kmsg. It ends up in the kernel log buffer and is the marker to
safely match all later log entries.
This can be done today, even on many years old distros, with a single
udev rule and a tiny program. It needs no kernel or tool changes and
gives almost all the benefits of the 'pretty name' infrastructure.
Kay
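
A minimal sketch of the approach described above, not code from the thread. It assumes a hypothetical helper installed as /usr/local/bin/log-disk-alias and a hypothetical udev rule that calls it with the kernel name and an admin-chosen alias; the marker text "disk-alias:" is likewise made up for illustration. Writing to /dev/kmsg normally requires root.

```python
#!/usr/bin/env python3
# Hypothetical helper, meant to be run from a udev rule such as:
#   ACTION=="add", SUBSYSTEM=="block", ENV{DEVTYPE}=="disk", \
#     RUN+="/usr/local/bin/log-disk-alias %k my_alias"
# The rule, the install path, and the alias value are assumptions.
import sys


def format_marker(kernel_name, alias, priority=6):
    """Build one kernel-log record tying a kernel name to an alias.

    The "<N>" prefix is the syslog priority accepted by /dev/kmsg
    (6 = informational).
    """
    return f"<{priority}>disk-alias: {kernel_name} is '{alias}'\n"


def log_marker(kernel_name, alias, path="/dev/kmsg"):
    """Write the marker record to the kernel log buffer (or to `path`)."""
    line = format_marker(kernel_name, alias)
    # Unbuffered binary mode so the record is passed in a single write().
    with open(path, "wb", buffering=0) as kmsg:
        kmsg.write(line.encode("utf-8"))
    return line


if __name__ == "__main__" and len(sys.argv) == 3:
    # argv[1] is the kernel name (e.g. "sdb"), argv[2] the alias.
    log_marker(sys.argv[1], sys.argv[2])
```

Every later message that mentions the same kernel name can then be matched against the most recent marker for that name in the log, including output captured from a serial console.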
|
|
From: Tejun H. <tj...@ke...> - 2011-08-27 13:20:54
|
Hello, Nao.
On Sat, Aug 27, 2011 at 09:54:29PM +0900, Nao Nishijima wrote:
> > Hmm... I don't follow. Why wouldn't it be able to? All the
> > informations are in the log. It is messy but it's there. If you want
>
> In many cases, the script is able to convert the name. However there is
> the special case that the logs do not exist in memory and disk due to
> the crash except console.
Sorry but I still don't get it. If you can extract the log, the
information is there and w/ remote logging (be it via serial or
network), the information always has to be there. Are you talking
about the case where somehow only the video console somehow succeeds
to print out oops? I don't think that's a common case as serial (also
on IPMI) tends to be pretty robust, often more robust than video
output, and even when such case occurs, the only thing you want is
mapping kernel device name to more recognizable information, which
isn't difficult at all. If you wanna do it in a really simple manner,
just save udisks --dump output after boot and each hotplug event and
write a simple script to search it.
> > more structured information, u{dev|disks} already maintain device
> > library - what maps to what, connected how with what attributes and
> > so on. Sending them off to the log machine as device hotplug events
> > occur and consulting it when post-processing log message would work
> > fine. All you need is just some python scripting. I don't really see
> > much point in messing with device names directly. The only thing is
> > that the raw log would be prettier. I don't think that is useful
> > enough to justify changing kernel device names.
>
> A kernel device names (e.g. sda) is not useful information because it
> doesn't always point the same disk at each boot-up time.
Eh? What difference does that make? Just make the target machines
send up-to-date disk config info to the log server.
> An alias is just an option and provides the ability to give all
> kernel devices a "preferred name".
>
> By default, dev_printk's will show a kernel device name. They show an
> alias only when the user assigns a "preferred name" to an alias.
> Even if the persistent device name is used, the device names in logs are
> different from the name that the users are using. So, an alias helps the
> user identify the disk.
Yes, I do understand what it's doing and can see there can be cases it
can be somewhat useful but I still think it's too adhoc an approach
which doesn't really justify itself. It just does too little to solve
the actual problem and even that 'little' part isn't very trivial - it
adds whole lot of policy decisions to make and I'm pretty sure it will
cause good amount of havoc w/ all the system tools which currently
don't expect block device names to change to some admin determined
free format string on the fly.
Thanks.
--
tejun
|
|
From: Nao N. <nao...@hi...> - 2011-08-27 12:54:41
|
Hello,
(2011/08/27 19:26), Tejun Heo wrote:
> (cc'ing Kay)
> Hello,
>
> On Sat, Aug 27, 2011 at 07:15:25PM +0900, Nao Nishijima wrote:
>> Our concern is the failure analysis. For example, when the disk failure
>> happened, we need to identify the disk from kernel log.
>>
>> Kernel messages are output to serial console when kernel crashes.
>> It's so hard to convert a device name to the alias. Thus the script
>> can't always convert the name.
>
> Hmm... I don't follow. Why wouldn't it be able to? All the
> informations are in the log. It is messy but it's there. If you want
In many cases, the script is able to convert the name. However, there is
the special case where, because of the crash, the logs survive neither in
memory nor on disk and only the console output remains.
> more structured information, u{dev|disks} already maintain device
> library - what maps to what, connected how with what attributes and
> so on. Sending them off to the log machine as device hotplug events
> occur and consulting it when post-processing log message would work
> fine. All you need is just some python scripting. I don't really see
> much point in messing with device names directly. The only thing is
> that the raw log would be prettier. I don't think that is useful
> enough to justify changing kernel device names.
A kernel device name (e.g. sda) is not useful information because it
doesn't always point to the same disk at each boot-up.
An alias is just an option and provides the ability to give all
kernel devices a "preferred name".
By default, dev_printk's will show a kernel device name. They show an
alias only when the user assigns a "preferred name" to an alias.
Even if the persistent device name is used, the device names in logs are
different from the name that the users are using. So, an alias helps the
user identify the disk.
Best regards,
--
Nao NISHIJIMA
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., YOKOHAMA Research Laboratory
Email: nao...@hi...
|
|
From: Tejun H. <tj...@ke...> - 2011-08-27 10:27:06
|
(cc'ing Kay)
Hello,
On Sat, Aug 27, 2011 at 07:15:25PM +0900, Nao Nishijima wrote:
> Our concern is the failure analysis. For example, when the disk failure
> happened, we need to identify the disk from kernel log.
>
> Kernel messages are output to serial console when kernel crashes.
> It's so hard to convert a device name to the alias. Thus the script
> can't always convert the name.
Hmm... I don't follow. Why wouldn't it be able to? All the
information is in the log. It is messy but it's there. If you want
more structured information, u{dev|disks} already maintain device
library - what maps to what, connected how with what attributes and
so on. Sending them off to the log machine as device hotplug events
occur and consulting it when post-processing log message would work
fine. All you need is just some python scripting. I don't really see
much point in messing with device names directly. The only thing is
that the raw log would be prettier. I don't think that is useful
enough to justify changing kernel device names.
Thanks.
--
tejun
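
The post-processing described above could look roughly like the sketch below. The alias table and the `name(alias)` output format are assumptions for illustration; in practice the table would be built from whatever disk configuration the target machine sends to the log server. Matching on word boundaries keeps the script from rewriting longer tokens that merely contain a device name.

```python
import re


def annotate(line, aliases):
    """Append the alias after each whole-word kernel device name in `line`.

    `aliases` maps kernel names to admin-chosen names, e.g. {"sda": "db_disk"}.
    """
    if not aliases:
        return line
    # Longest names first so "sdaa" is tried before "sda".
    names = sorted(aliases, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    return pattern.sub(lambda m: f"{m.group(1)}({aliases[m.group(1)]})", line)


# Hypothetical mapping and log line, for illustration only.
example_aliases = {"sda": "db_disk"}
print(annotate("sd 0:0:0:0: [sda] Unhandled error code", example_aliases))
# prints: sd 0:0:0:0: [sda(db_disk)] Unhandled error code
```

Partition names such as sda1 are left untouched by this pattern, so they would need their own entries in the table if they should be annotated too.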
|
|
From: Nao N. <nao...@hi...> - 2011-08-27 10:15:38
|
Hi Tejun,
Thank you for your comments.
(2011/08/25 19:16), Tejun Heo wrote:
> Hello,
>
> On Thu, Aug 25, 2011 at 06:03:59PM +0900, Nao Nishijima wrote:
>> This patch series provide an "alias" of the disk into kernel messages.
>>
>> A raw device name of a disk does not always point a same disk at each boot-up
>> time. Therefore, users have to use persistent device names, which udev creates
>> to always access the same disk. However, kernel messages still display the raw
>> device names.
>>
>> My proposal is that users can use and see persistent device names which were
>> assigned by themselves because users expect same name to point same disk
>> anytime.
>>
>> Why need to modify kernel messages?
>> - We can see mapping of device names and persistent device names in udev log.
>> If those logs output to syslog, we can search persistent device name from
>> device name, but it can cause a large amount of syslog output.
>>
>> - If we can use the persistent device names and can always see the same name on
>> the kernel log, we don't need to pay additional cost for searching and picking
>> a correct pair of device name and persistent device name from udev log.
>>
>> - Kernel messages are output to serial console when kernel crashes, it's so hard
>> to convert a device name to the alias.
>
> Just some general comments. This may already be a horse which is
> beaten to death but anyways...
>
> I'm not really convinced this is something we need. What we're
> missing is structured error reporting which can be understood,
> processed, presented and reacted by programs implementing system
> management policies. Such facility would be useful in general but for
> block devices I think it's a must that we're sorely missing.
Yes. I agree that we need structured error reporting. This facility will
be discussed by Kay Sievers at Kernel Summit.
However, I think that even if it is applied in the kernel, an "alias" is
still necessary because users want to use and see a friendly name, which
can be simple, short and of their own choosing, instead of persistent
device names
(e.g. /dev/disk/by-uuid/35d4def7-9098-448c-8cc3-b0cb74c8670b).
> Free format kernel message is a very undiscoverable way of
> communicating these information. For developing and debugging, it's
> fine. It's easy, flexible and you don't really have to think too much
> about what should be presented how. For anything else, it's basically
> horrible.
>
> I don't really see what the point of this feature is. For developing
> and debugging, pretty names might be nice but almost completely
> unnecessary. For anything else, this falls way too short and can be
> easily replaced by some smart scripting from userland. After all,
> matching different device names is the least of the worries when
> trying to use kernel log for general management and post-processing w/
> good amount of heuristics is necessary to be useful anyway.
Our concern is the failure analysis. For example, when the disk failure
happened, we need to identify the disk from kernel log.
Kernel messages are output to serial console when kernel crashes.
It's so hard to convert a device name to the alias. Thus the script
can't always convert the name.
Also, the alias is useful for users because they can use and see the same
name everywhere (e.g. commands, kernel log).
Best regards,
--
Nao NISHIJIMA
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., YOKOHAMA Research Laboratory
Email: nao...@hi...
|