You can subscribe to this list here.
| 2009 |
Jan
|
Feb
|
Mar
|
Apr
|
May
(32) |
Jun
(66) |
Jul
(102) |
Aug
(78) |
Sep
(106) |
Oct
(137) |
Nov
(147) |
Dec
(147) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2010 |
Jan
(71) |
Feb
(139) |
Mar
(86) |
Apr
(76) |
May
(57) |
Jun
(10) |
Jul
(12) |
Aug
(6) |
Sep
(8) |
Oct
(12) |
Nov
(12) |
Dec
(18) |
| 2011 |
Jan
(16) |
Feb
(19) |
Mar
(3) |
Apr
(1) |
May
(16) |
Jun
(17) |
Jul
(74) |
Aug
(22) |
Sep
(18) |
Oct
(24) |
Nov
(21) |
Dec
(30) |
| 2012 |
Jan
(31) |
Feb
(16) |
Mar
(22) |
Apr
(25) |
May
(18) |
Jun
(13) |
Jul
(83) |
Aug
(49) |
Sep
(20) |
Oct
(60) |
Nov
(35) |
Dec
(28) |
| 2013 |
Jan
(39) |
Feb
(61) |
Mar
(35) |
Apr
(21) |
May
(45) |
Jun
(56) |
Jul
(20) |
Aug
(9) |
Sep
(10) |
Oct
(31) |
Nov
(8) |
Dec
(4) |
| 2014 |
Jan
(6) |
Feb
(7) |
Mar
(7) |
Apr
(6) |
May
(4) |
Jun
(8) |
Jul
(5) |
Aug
(2) |
Sep
(4) |
Oct
(4) |
Nov
(11) |
Dec
(5) |
| 2015 |
Jan
(4) |
Feb
(4) |
Mar
(3) |
Apr
(4) |
May
(9) |
Jun
(4) |
Jul
(15) |
Aug
(8) |
Sep
(16) |
Oct
(18) |
Nov
(15) |
Dec
(7) |
| 2016 |
Jan
(20) |
Feb
(9) |
Mar
(15) |
Apr
(24) |
May
(16) |
Jun
(28) |
Jul
(22) |
Aug
(23) |
Sep
(18) |
Oct
(30) |
Nov
(40) |
Dec
(9) |
| 2017 |
Jan
(1) |
Feb
(8) |
Mar
(37) |
Apr
(26) |
May
(25) |
Jun
(46) |
Jul
(24) |
Aug
(9) |
Sep
|
Oct
|
Nov
|
Dec
|
|
From: Seiji A. <sei...@hd...> - 2012-12-07 23:43:16
|
> Can all these things really happen (did you run into this problem on a real system?). Or is this just a theoretical problem. Ugly (but > practical) hacks might be OK to solve real problems. It is a theoretical problem right now. But it is a timing issue and there is a possibility to happen actually. > But do we really want them to fix problems that actually never happen? If we find a problem (even if it is theoretical), we can't say "It actually never happen.". I have some reasons to submit this patch before reproducing actually. 1) It is too late if we fix a problem after it actually happened in case where we apply Linux, including pstore, to mission critical systems, because the failure of those systems has a great impact on a whole society. Customers in this area ask us to fix a problem as soon as possible. On the other hand, this kind of timing issue is hard to reproduce. So, our support service engineers often work all night to reproduce it. It is a nightmare for us. If we can fix it with a small patch in adance, it is really helpful for us. 2) In the long term, I plan to add a kmsg_dump to a kexec path because kdump may fail in the real world. In that case, we need another troubleshooting material like pstore to detect a root cause of failure. Actually, someone blamed for a reliability of kdump in LinuxCON Europe. http://events.linuxfoundation.org/images/stories/pdf/lceu2012_holzheu.pdf To convince a kexec maintainer to add a kmsg_dump, I need to prove that there is no problem in pstore code causing a failure of kdump. Seiji |
|
From: Luck, T. <ton...@in...> - 2012-12-07 22:17:17
|
> This patch skips taking a psinfo->buf_lock when just one cpu is online > because stopped cpus turn to offline via smp_send_stop() > in some architectures like x86, powerpc or arm64. That seems an impressive list of preconditions. So for this to help we need to have taken all but one cpu offline, then be in some code that is holding the pstore lock and get hit by an NMI which causes us to recurse into the pstore code. Can all these things really happen (did you run into this problem on a real system?). Or is this just a theoretical problem. Ugly (but practical) hacks might be OK to solve real problems. But do we really want them to fix problems that actually never happen? -Tony |
|
From: Seiji A. <sei...@hd...> - 2012-12-07 21:41:26
|
[Issue]
If one cpu ,which is taking a psinfo->buf_lock,
receive NMI from a panicked cpu via smp_send_stop(),
the panicked cpu hangs up in pstore_dump() called by kmsg_dump(KMSG_DUMP_PANIC)
because the psinfo->buf_lock is taken again in it.
To avoid the deadlock, an easy solution is moving kmsg_dump above
smp_send_stop() in panic path.
But, it is not safe to kick pstore while multiple cpus are running in panic case,
because they may touch corrupted data/variables and unnecessary failures may happen.
In that case, we can't guarantee that a panicked cpu can log messages reliably
because it may have harmful effects due to the failures.
[Solution]
This patch skips taking a psinfo->buf_lock when just one cpu is online
because stopped cpus turn to offline via smp_send_stop()
in some architectures like x86, powerpc or arm64.
It may be a hack but solves my concern deadlocking in x86 architecture.
Signed-off-by: Seiji Aguchi <sei...@hd...>
---
fs/pstore/platform.c | 14 +++++++++++---
1 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index 947fbe0..ca4d2ab 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -107,7 +107,7 @@ static void pstore_dump(struct kmsg_dumper *dumper,
unsigned long total = 0;
const char *why;
u64 id;
- unsigned int part = 1;
+ unsigned int part = 1, cpu_num = num_online_cpus();
unsigned long flags = 0;
int is_locked = 0;
int ret;
@@ -118,8 +118,14 @@ static void pstore_dump(struct kmsg_dumper *dumper,
is_locked = spin_trylock(&psinfo->buf_lock);
if (!is_locked)
pr_err("pstore dump routine blocked in NMI, may corrupt error record\n");
- } else
+ } else if (cpu_num > 1) {
+ /*
+ * Take a spin lock only when multiple cpus are online.
+ */
spin_lock_irqsave(&psinfo->buf_lock, flags);
+ } else
+ local_irq_save(flags);
+
oopscount++;
while (total < kmsg_bytes) {
char *dst;
@@ -146,8 +152,10 @@ static void pstore_dump(struct kmsg_dumper *dumper,
if (in_nmi()) {
if (is_locked)
spin_unlock(&psinfo->buf_lock);
- } else
+ } else if (cpu_num > 1) {
spin_unlock_irqrestore(&psinfo->buf_lock, flags);
+ } else
+ local_irq_restore(flags);
}
static struct kmsg_dumper pstore_dumper = {
-- 1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-12-01 01:05:27
|
> What I actually meant was: can "this" CPU avoid stopping other CPUs so early? If we stop the other CPUs when this CPU is ready to > stop itself then there will never be such deadlocks. Let me explain my opinion. When we focus on the deadlock only, the code will be simple by moving smp_send_stop() at the end of panic(). But, panic situation is not normal. I don't think that keeping running multiple cpus is safe, because they may touch corrupted data/variables and unnecessary panic/BUG() may happen. IMO, cpus should be stopped "as early as" possible when panic happens. And then panic() has to take minimal steps with a single cpu. - output messages - kicking troubleshooting features like kdump/kmsg_dump Seiji |
|
From: Andrew M. <ak...@li...> - 2012-11-30 23:21:57
|
On Fri, 30 Nov 2012 22:59:13 +0000 Seiji Aguchi <sei...@hd...> wrote: > > > > Let's step back a bit. Please identify with great specificity the code sites which are stopping other CPUs before taking locks which > > those other CPUs might have been holding. > > > > Then let's see what we can do to fix up the callers, instead of trying to tidy up after they have made this mess. > > OK. > I will update my patch without adding complexity. > The logic will be as follows, if I understand your comment correctly. > > - take console related locks (logbuf_lock, console_sem) > - stop other cpus > - release those locks That would be one way around the problem. It's still a bit messy, because we'll have to take more and more locks in the future, as we identify them. What I actually meant was: can "this" CPU avoid stopping other CPUs so early? If we stop the other CPUs when this CPU is ready to stop itself then there will never be such deadlocks. |
|
From: Seiji A. <sei...@hd...> - 2012-11-30 23:00:13
|
Thank you for giving me the comment. > - Makes the logic in this area even more twisty and complex, when > what we need to do is to simplify it > > - Reinitialises in-use locks > > - Gives the boolean variable "yes" three states, but didn't rename > that variable to something appropriate. I understand "yes" is odd. I just wanted to know if an idea reinitializing locks is acceptable. But now I understand I have to take another approach. > > - Passes yes==2 into s390's unsuspecting bust_spinlocks() implementation. > Sorry. I missed the code. > > Let's step back a bit. Please identify with great specificity the code sites which are stopping other CPUs before taking locks which > those other CPUs might have been holding. > > Then let's see what we can do to fix up the callers, instead of trying to tidy up after they have made this mess. OK. I will update my patch without adding complexity. The logic will be as follows, if I understand your comment correctly. - take console related locks (logbuf_lock, console_sem) - stop other cpus - release those locks Seiji |
|
From: Andrew M. <ak...@li...> - 2012-11-30 22:30:32
|
On Fri, 30 Nov 2012 17:11:07 +0000 Seiji Aguchi <sei...@hd...> wrote: > If one cpu ,which is taking a logbuf_lock or console_sem, > receive IPI/NMI from a panicked cpu via smp_send_stop(), > the panicked cpu hangs up in subsequent kmsg_dump()/printk() > because logbuf_lock and console_sem are taken in the function calls. > > This causes a console blank and users can't see panic messages. > > [Solution] > > this patch introduces a logic initializing logbuf_lock and console_sem > just after smp_send_stop() to avoid dead locks above. That is one nasty looking patch :( - Makes the logic in this area even more twisty and complex, when what we need to do is to simplify it - Reinitialises in-use locks - Gives the boolean variable "yes" three states, but didn't rename that variable to something appropriate. - Passes yes==2 into s390's unsuspecting bust_spinlocks() implementation. Let's step back a bit. Please identify with great specificity the code sites which are stopping other CPUs before taking locks which those other CPUs might have been holding. Then let's see what we can do to fix up the callers, instead of trying to tidy up after they have made this mess. |
|
From: Seiji A. <sei...@hd...> - 2012-11-30 17:12:24
|
[Issue]
If one cpu ,which is taking a logbuf_lock or console_sem,
receive IPI/NMI from a panicked cpu via smp_send_stop(),
the panicked cpu hangs up in subsequent kmsg_dump()/printk()
because logbuf_lock and console_sem are taken in the function calls.
This causes a console blank and users can't see panic messages.
[Solution]
this patch introduces a logic initializing logbuf_lock and console_sem
just after smp_send_stop() to avoid dead locks above.
Signed-off-by: Seiji Aguchi <sei...@hd...>
---
include/linux/printk.h | 5 +++++
kernel/panic.c | 1 +
kernel/printk.c | 17 +++++++++++++++++
lib/bust_spinlocks.c | 21 ++++++++++++++++++---
4 files changed, 41 insertions(+), 3 deletions(-)
diff --git a/include/linux/printk.h b/include/linux/printk.h
index 9afc01e..ffd3e55 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -139,6 +139,7 @@ extern int kptr_restrict;
void log_buf_kexec_setup(void);
void __init setup_log_buf(int early);
+void zap_locks_force(void);
#else
static inline __printf(1, 0)
int vprintk(const char *s, va_list args)
@@ -172,6 +173,10 @@ static inline void log_buf_kexec_setup(void)
static inline void setup_log_buf(int early)
{
}
+
+static inline void zap_locks_force(void)
+{
+}
#endif
extern void dump_stack(void) __cold;
diff --git a/kernel/panic.c b/kernel/panic.c
index e1b2822..c27e58e 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -122,6 +122,7 @@ void panic(const char *fmt, ...)
* situation.
*/
smp_send_stop();
+ bust_spinlocks(2);
kmsg_dump(KMSG_DUMP_PANIC);
diff --git a/kernel/printk.c b/kernel/printk.c
index 2d607f4..a6fa638 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -1278,6 +1278,23 @@ static void call_console_drivers(int level, const char *text, size_t len)
}
/*
+ * Zap console related locks in panic situation.
+ * It should be called by bust_spinlocks().
+ */
+void zap_locks_force(void)
+{
+ debug_locks_off();
+
+ /* If a crash is occurring, make sure we can't deadlock */
+ if (raw_spin_is_locked(&logbuf_lock))
+ raw_spin_lock_init(&logbuf_lock);
+
+ /* And make sure that we print immediately */
+ if (is_console_locked())
+ sema_init(&console_sem, 1);
+}
+
+/*
* Zap console related locks when oopsing. Only zap at most once
* every 10 seconds, to leave time for slow consoles to print a
* full oops.
diff --git a/lib/bust_spinlocks.c b/lib/bust_spinlocks.c
index 9681d54..ffa1c64 100644
--- a/lib/bust_spinlocks.c
+++ b/lib/bust_spinlocks.c
@@ -13,19 +13,34 @@
#include <linux/wait.h>
#include <linux/vt_kern.h>
#include <linux/console.h>
+#include <linux/printk.h>
+/*
+ * yes = 0: make sure that messages are printed to console immediately.
+ * yes = 1: avoid a recursive lock on a single cpu in a panic case.
+ * yes = 2: avoid a deadlock after stopping cpus by smp_send_stop()
+ * in a panic case.
+ */
void __attribute__((weak)) bust_spinlocks(int yes)
{
- if (yes) {
- ++oops_in_progress;
- } else {
+ switch (yes) {
+ case 0:
#ifdef CONFIG_VT
unblank_screen();
#endif
console_unblank();
if (--oops_in_progress == 0)
wake_up_klogd();
+ break;
+ case 1:
+ ++oops_in_progress;
+ break;
+ case 2:
+ zap_locks_force();
+ break;
+ default:
+ break;
}
}
--
1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-11-14 20:29:30
|
[Issue]
a format of variable name has been updated to type, id, count and ctime
to support holding multiple logs.
Format of current variable name
dump-type0-1-2-12345678
type:0
id:1
count:2
ctime:12345678
On the other hand, if an old variable name before being updated
remains, users can't erase it via /dev/pstore.
Format of old variable name
dump-type0-1-12345678
type:0
id:1
ctime:12345678
[Solution]
This patch add a format check for the old variable name in a erase callback to make it erasable.
Signed-off-by: Seiji Aguchi <sei...@hd...>
Acked-by: Mike Waychison <mi...@go...>
---
drivers/firmware/efivars.c | 20 ++++++++++++++++++--
1 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index bf7a6f9..6e51c1e 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -773,6 +773,8 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id, int count,
{
char name[DUMP_NAME_LEN];
efi_char16_t efi_name[DUMP_NAME_LEN];
+ char name_old[DUMP_NAME_LEN];
+ efi_char16_t efi_name_old[DUMP_NAME_LEN];
efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
struct efivars *efivars = psi->data;
struct efivar_entry *entry, *found = NULL;
@@ -796,8 +798,22 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id, int count,
if (efi_guidcmp(entry->var.VendorGuid, vendor))
continue;
if (utf16_strncmp(entry->var.VariableName, efi_name,
- utf16_strlen(efi_name)))
- continue;
+ utf16_strlen(efi_name))) {
+ /*
+ * Check if an old format,
+ * which doesn't support holding
+ * multiple logs, remains.
+ */
+ sprintf(name_old, "dump-type%u-%u-%lu", type,
+ (unsigned int)id, time.tv_sec);
+
+ for (i = 0; i < DUMP_NAME_LEN; i++)
+ efi_name_old[i] = name_old[i];
+
+ if (utf16_strncmp(entry->var.VariableName, efi_name_old,
+ utf16_strlen(efi_name_old)))
+ continue;
+ }
/* found */
found = entry;
--
1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-11-14 20:29:00
|
[Issue]
a format of variable name has been updated to type, id, count and ctime
to support holding multiple logs.
Format of current variable name
dump-type0-1-2-12345678
type:0
id:1
count:2
ctime:12345678
On the other hand, if an old variable name before being updated
remains, users can't read it via /dev/pstore.
Format of old variable name
dump-type0-1-12345678
type:0
id:1
ctime:12345678
[Solution]
This patch add a format check for the old variable name in a read callback
to make it readable.
Signed-off-by: Seiji Aguchi <sei...@hd...>
Acked-by: Mike Waychison <mi...@go...>
---
drivers/firmware/efivars.c | 38 ++++++++++++++++++++++++++++----------
1 files changed, 28 insertions(+), 10 deletions(-)
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index 7ad3aae..bf7a6f9 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -681,17 +681,35 @@ static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
*count = cnt;
timespec->tv_sec = time;
timespec->tv_nsec = 0;
- get_var_data_locked(efivars, &efivars->walk_entry->var);
- size = efivars->walk_entry->var.DataSize;
- *buf = kmalloc(size, GFP_KERNEL);
- if (*buf == NULL)
- return -ENOMEM;
- memcpy(*buf, efivars->walk_entry->var.Data,
- size);
- efivars->walk_entry = list_entry(efivars->walk_entry->list.next,
- struct efivar_entry, list);
- return size;
+ } else if (sscanf(name, "dump-type%u-%u-%lu",
+ type, &part, &time) == 3) {
+ /*
+ * Check if an old format,
+ * which doesn't support holding
+ * multiple logs, remains.
+ */
+ *id = part;
+ *count = 0;
+ timespec->tv_sec = time;
+ timespec->tv_nsec = 0;
+ } else {
+ efivars->walk_entry = list_entry(
+ efivars->walk_entry->list.next,
+ struct efivar_entry, list);
+ continue;
}
+
+ get_var_data_locked(efivars, &efivars->walk_entry->var);
+ size = efivars->walk_entry->var.DataSize;
+ *buf = kmalloc(size, GFP_KERNEL);
+ if (*buf == NULL)
+ return -ENOMEM;
+ memcpy(*buf, efivars->walk_entry->var.Data,
+ size);
+ efivars->walk_entry = list_entry(
+ efivars->walk_entry->list.next,
+ struct efivar_entry, list);
+ return size;
}
efivars->walk_entry = list_entry(efivars->walk_entry->list.next,
struct efivar_entry, list);
--
1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-11-14 20:28:28
|
[Issue]
Currently, a variable name, which identifies each entry, consists of type, id and ctime.
But if multiple events happens in a short time, a second/third event may fail to log because
efi_pstore can't distinguish each event with current variable name.
[Solution]
A reasonable way to identify all events precisely is introducing a sequence counter to
the variable name.
The sequence counter has already supported in a pstore layer with "oopscount".
So, this patch adds it to a variable name.
Also, it is passed to read/erase callbacks of platform drivers in accordance with
the modification of the variable name.
<before applying this patch>
a variable name of first event: dump-type0-1-12345678
a variable name of second event: dump-type0-1-12345678
type:0
id:1
ctime:12345678
If multiple events happen in a short time, efi_pstore can't distinguish them because
variable names are same among them.
<after applying this patch>
it can be distinguishable by adding a sequence counter as follows.
a variable name of first event: dump-type0-1-1-12345678
a variable name of Second event: dump-type0-1-2-12345678
type:0
id:1
sequence counter: 1(first event), 2(second event)
ctime:12345678
In case of a write callback executed in pstore_console_write(), "0" is added to
an argument of the write callback because it just logs all kernel messages and
doesn't need to care about multiple events.
Signed-off-by: Seiji Aguchi <sei...@hd...>
Acked-by: Rafael J. Wysocki <raf...@in...>
Acked-by: Mike Waychison <mi...@go...>
---
drivers/acpi/apei/erst.c | 12 ++++++------
drivers/firmware/efivars.c | 23 ++++++++++++++---------
fs/pstore/inode.c | 8 +++++---
fs/pstore/internal.h | 2 +-
fs/pstore/platform.c | 13 +++++++------
fs/pstore/ram.c | 7 +++----
include/linux/pstore.h | 8 +++++---
7 files changed, 41 insertions(+), 32 deletions(-)
diff --git a/drivers/acpi/apei/erst.c b/drivers/acpi/apei/erst.c
index 0bd6ae4..6d894bf 100644
--- a/drivers/acpi/apei/erst.c
+++ b/drivers/acpi/apei/erst.c
@@ -931,13 +931,13 @@ static int erst_check_table(struct acpi_table_erst *erst_tab)
static int erst_open_pstore(struct pstore_info *psi);
static int erst_close_pstore(struct pstore_info *psi);
-static ssize_t erst_reader(u64 *id, enum pstore_type_id *type,
+static ssize_t erst_reader(u64 *id, enum pstore_type_id *type, int *count,
struct timespec *time, char **buf,
struct pstore_info *psi);
static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
- u64 *id, unsigned int part,
+ u64 *id, unsigned int part, int count,
size_t size, struct pstore_info *psi);
-static int erst_clearer(enum pstore_type_id type, u64 id,
+static int erst_clearer(enum pstore_type_id type, u64 id, int count,
struct timespec time, struct pstore_info *psi);
static struct pstore_info erst_info = {
@@ -987,7 +987,7 @@ static int erst_close_pstore(struct pstore_info *psi)
return 0;
}
-static ssize_t erst_reader(u64 *id, enum pstore_type_id *type,
+static ssize_t erst_reader(u64 *id, enum pstore_type_id *type, int *count,
struct timespec *time, char **buf,
struct pstore_info *psi)
{
@@ -1055,7 +1055,7 @@ out:
}
static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
- u64 *id, unsigned int part,
+ u64 *id, unsigned int part, int count,
size_t size, struct pstore_info *psi)
{
struct cper_pstore_record *rcd = (struct cper_pstore_record *)
@@ -1101,7 +1101,7 @@ static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
return ret;
}
-static int erst_clearer(enum pstore_type_id type, u64 id,
+static int erst_clearer(enum pstore_type_id type, u64 id, int count,
struct timespec time, struct pstore_info *psi)
{
return erst_clear(id);
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index 3803621..7ad3aae 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -658,13 +658,14 @@ static int efi_pstore_close(struct pstore_info *psi)
}
static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
- struct timespec *timespec,
+ int *count, struct timespec *timespec,
char **buf, struct pstore_info *psi)
{
efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
struct efivars *efivars = psi->data;
char name[DUMP_NAME_LEN];
int i;
+ int cnt;
unsigned int part, size;
unsigned long time;
@@ -674,8 +675,10 @@ static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
for (i = 0; i < DUMP_NAME_LEN; i++) {
name[i] = efivars->walk_entry->var.VariableName[i];
}
- if (sscanf(name, "dump-type%u-%u-%lu", type, &part, &time) == 3) {
+ if (sscanf(name, "dump-type%u-%u-%d-%lu",
+ type, &part, &cnt, &time) == 4) {
*id = part;
+ *count = cnt;
timespec->tv_sec = time;
timespec->tv_nsec = 0;
get_var_data_locked(efivars, &efivars->walk_entry->var);
@@ -698,7 +701,8 @@ static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
static int efi_pstore_write(enum pstore_type_id type,
enum kmsg_dump_reason reason, u64 *id,
- unsigned int part, size_t size, struct pstore_info *psi)
+ unsigned int part, int count, size_t size,
+ struct pstore_info *psi)
{
char name[DUMP_NAME_LEN];
efi_char16_t efi_name[DUMP_NAME_LEN];
@@ -725,7 +729,7 @@ static int efi_pstore_write(enum pstore_type_id type,
return -ENOSPC;
}
- sprintf(name, "dump-type%u-%u-%lu", type, part,
+ sprintf(name, "dump-type%u-%u-%d-%lu", type, part, count,
get_seconds());
for (i = 0; i < DUMP_NAME_LEN; i++)
@@ -746,7 +750,7 @@ static int efi_pstore_write(enum pstore_type_id type,
return ret;
};
-static int efi_pstore_erase(enum pstore_type_id type, u64 id,
+static int efi_pstore_erase(enum pstore_type_id type, u64 id, int count,
struct timespec time, struct pstore_info *psi)
{
char name[DUMP_NAME_LEN];
@@ -756,7 +760,7 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id,
struct efivar_entry *entry, *found = NULL;
int i;
- sprintf(name, "dump-type%u-%u-%lu", type, (unsigned int)id,
+ sprintf(name, "dump-type%u-%u-%d-%lu", type, (unsigned int)id, count,
time.tv_sec);
spin_lock(&efivars->lock);
@@ -807,7 +811,7 @@ static int efi_pstore_close(struct pstore_info *psi)
return 0;
}
-static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
+static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type, int *count,
struct timespec *timespec,
char **buf, struct pstore_info *psi)
{
@@ -816,12 +820,13 @@ static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
static int efi_pstore_write(enum pstore_type_id type,
enum kmsg_dump_reason reason, u64 *id,
- unsigned int part, size_t size, struct pstore_info *psi)
+ unsigned int part, int count, size_t size,
+ struct pstore_info *psi)
{
return 0;
}
-static int efi_pstore_erase(enum pstore_type_id type, u64 id,
+static int efi_pstore_erase(enum pstore_type_id type, u64 id, int count,
struct timespec time, struct pstore_info *psi)
{
return 0;
diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 4300af6..ed1d8c7 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -49,6 +49,7 @@ struct pstore_private {
struct pstore_info *psi;
enum pstore_type_id type;
u64 id;
+ int count;
ssize_t size;
char data[];
};
@@ -175,8 +176,8 @@ static int pstore_unlink(struct inode *dir, struct dentry *dentry)
struct pstore_private *p = dentry->d_inode->i_private;
if (p->psi->erase)
- p->psi->erase(p->type, p->id, dentry->d_inode->i_ctime,
- p->psi);
+ p->psi->erase(p->type, p->id, p->count,
+ dentry->d_inode->i_ctime, p->psi);
return simple_unlink(dir, dentry);
}
@@ -271,7 +272,7 @@ int pstore_is_mounted(void)
* Load it up with "size" bytes of data from "buf".
* Set the mtime & ctime to the date that this record was originally stored.
*/
-int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id,
+int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id, int count,
char *data, size_t size, struct timespec time,
struct pstore_info *psi)
{
@@ -307,6 +308,7 @@ int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id,
goto fail_alloc;
private->type = type;
private->id = id;
+ private->count = count;
private->psi = psi;
switch (type) {
diff --git a/fs/pstore/internal.h b/fs/pstore/internal.h
index 4847f58..937d820 100644
--- a/fs/pstore/internal.h
+++ b/fs/pstore/internal.h
@@ -50,7 +50,7 @@ extern struct pstore_info *psinfo;
extern void pstore_set_kmsg_bytes(int);
extern void pstore_get_records(int);
extern int pstore_mkfile(enum pstore_type_id, char *psname, u64 id,
- char *data, size_t size,
+ int count, char *data, size_t size,
struct timespec time, struct pstore_info *psi);
extern int pstore_is_mounted(void);
diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index a40da07..e518f43 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -136,7 +136,7 @@ static void pstore_dump(struct kmsg_dumper *dumper,
break;
ret = psinfo->write(PSTORE_TYPE_DMESG, reason, &id, part,
- hsize + len, psinfo);
+ oopscount, hsize + len, psinfo);
if (ret == 0 && reason == KMSG_DUMP_OOPS && pstore_is_mounted())
pstore_new_entry = 1;
@@ -172,7 +172,7 @@ static void pstore_console_write(struct console *con, const char *s, unsigned c)
spin_lock_irqsave(&psinfo->buf_lock, flags);
}
memcpy(psinfo->buf, s, c);
- psinfo->write(PSTORE_TYPE_CONSOLE, 0, NULL, 0, c, psinfo);
+ psinfo->write(PSTORE_TYPE_CONSOLE, 0, NULL, 0, 0, c, psinfo);
spin_unlock_irqrestore(&psinfo->buf_lock, flags);
s += c;
c = e - s;
@@ -196,7 +196,7 @@ static void pstore_register_console(void) {}
static int pstore_write_compat(enum pstore_type_id type,
enum kmsg_dump_reason reason,
- u64 *id, unsigned int part,
+ u64 *id, unsigned int part, int count,
size_t size, struct pstore_info *psi)
{
return psi->write_buf(type, reason, id, part, psinfo->buf, size, psi);
@@ -266,6 +266,7 @@ void pstore_get_records(int quiet)
char *buf = NULL;
ssize_t size;
u64 id;
+ int count;
enum pstore_type_id type;
struct timespec time;
int failed = 0, rc;
@@ -277,9 +278,9 @@ void pstore_get_records(int quiet)
if (psi->open && psi->open(psi))
goto out;
- while ((size = psi->read(&id, &type, &time, &buf, psi)) > 0) {
- rc = pstore_mkfile(type, psi->name, id, buf, (size_t)size,
- time, psi);
+ while ((size = psi->read(&id, &type, &count, &time, &buf, psi)) > 0) {
+ rc = pstore_mkfile(type, psi->name, id, count, buf,
+ (size_t)size, time, psi);
kfree(buf);
buf = NULL;
if (rc && (rc != -EEXIST || !quiet))
diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
index 749693f..2bfa36e 100644
--- a/fs/pstore/ram.c
+++ b/fs/pstore/ram.c
@@ -132,9 +132,8 @@ ramoops_get_next_prz(struct persistent_ram_zone *przs[], uint *c, uint max,
}
static ssize_t ramoops_pstore_read(u64 *id, enum pstore_type_id *type,
- struct timespec *time,
- char **buf,
- struct pstore_info *psi)
+ int *count, struct timespec *time,
+ char **buf, struct pstore_info *psi)
{
ssize_t size;
struct ramoops_context *cxt = psi->data;
@@ -236,7 +235,7 @@ static int notrace ramoops_pstore_write_buf(enum pstore_type_id type,
return 0;
}
-static int ramoops_pstore_erase(enum pstore_type_id type, u64 id,
+static int ramoops_pstore_erase(enum pstore_type_id type, u64 id, int count,
struct timespec time, struct pstore_info *psi)
{
struct ramoops_context *cxt = psi->data;
diff --git a/include/linux/pstore.h b/include/linux/pstore.h
index f6e9336..1788909 100644
--- a/include/linux/pstore.h
+++ b/include/linux/pstore.h
@@ -50,17 +50,19 @@ struct pstore_info {
int (*open)(struct pstore_info *psi);
int (*close)(struct pstore_info *psi);
ssize_t (*read)(u64 *id, enum pstore_type_id *type,
- struct timespec *time, char **buf,
+ int *count, struct timespec *time, char **buf,
struct pstore_info *psi);
int (*write)(enum pstore_type_id type,
enum kmsg_dump_reason reason, u64 *id,
- unsigned int part, size_t size, struct pstore_info *psi);
+ unsigned int part, int count, size_t size,
+ struct pstore_info *psi);
int (*write_buf)(enum pstore_type_id type,
enum kmsg_dump_reason reason, u64 *id,
unsigned int part, const char *buf, size_t size,
struct pstore_info *psi);
int (*erase)(enum pstore_type_id type, u64 id,
- struct timespec time, struct pstore_info *psi);
+ int count, struct timespec time,
+ struct pstore_info *psi);
void *data;
};
--
1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-11-14 20:27:36
|
[Issue]
Currently, a variable name, which is used to identify each log entry, consists of type,
id and ctime. But an erase callback does not use ctime.
If efi_pstore supported just one log, type and id were enough.
However, in case of supporting multiple logs, it doesn't work because
it can't distinguish each entry without ctime at erasing time.
<Example>
As you can see below, efi_pstore can't differentiate first event from second one without ctime.
a variable name of first event: dump-type0-1-12345678
a variable name of second event: dump-type0-1-23456789
type:0
id:1
ctime:12345678, 23456789
[Solution]
This patch adds ctime to an argument of an erase callback.
It works across reboots because ctime of pstore means the date that the record was originally stored.
To do this, efi_pstore saves the ctime to variable name at writing time and passes it to pstore
at reading time.
Signed-off-by: Seiji Aguchi <sei...@hd...>
Acked-by: Mike Waychison <mi...@go...>
---
drivers/acpi/apei/erst.c | 4 ++--
drivers/firmware/efivars.c | 17 ++++++++---------
fs/pstore/inode.c | 3 ++-
fs/pstore/ram.c | 2 +-
include/linux/pstore.h | 2 +-
5 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/drivers/acpi/apei/erst.c b/drivers/acpi/apei/erst.c
index e4d9d24..0bd6ae4 100644
--- a/drivers/acpi/apei/erst.c
+++ b/drivers/acpi/apei/erst.c
@@ -938,7 +938,7 @@ static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
u64 *id, unsigned int part,
size_t size, struct pstore_info *psi);
static int erst_clearer(enum pstore_type_id type, u64 id,
- struct pstore_info *psi);
+ struct timespec time, struct pstore_info *psi);
static struct pstore_info erst_info = {
.owner = THIS_MODULE,
@@ -1102,7 +1102,7 @@ static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
}
static int erst_clearer(enum pstore_type_id type, u64 id,
- struct pstore_info *psi)
+ struct timespec time, struct pstore_info *psi)
{
return erst_clear(id);
}
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index fbe9202..3803621 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -747,24 +747,25 @@ static int efi_pstore_write(enum pstore_type_id type,
};
static int efi_pstore_erase(enum pstore_type_id type, u64 id,
- struct pstore_info *psi)
+ struct timespec time, struct pstore_info *psi)
{
- char stub_name[DUMP_NAME_LEN];
+ char name[DUMP_NAME_LEN];
efi_char16_t efi_name[DUMP_NAME_LEN];
efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
struct efivars *efivars = psi->data;
struct efivar_entry *entry, *found = NULL;
int i;
- sprintf(stub_name, "dump-type%u-%u-", type, (unsigned int)id);
+ sprintf(name, "dump-type%u-%u-%lu", type, (unsigned int)id,
+ time.tv_sec);
spin_lock(&efivars->lock);
for (i = 0; i < DUMP_NAME_LEN; i++)
- efi_name[i] = stub_name[i];
+ efi_name[i] = name[i];
/*
- * Clean up any entries with the same name
+ * Clean up an entry with the same name
*/
list_for_each_entry(entry, &efivars->list, list) {
@@ -775,9 +776,6 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id,
if (utf16_strncmp(entry->var.VariableName, efi_name,
utf16_strlen(efi_name)))
continue;
- /* Needs to be a prefix */
- if (entry->var.VariableName[utf16_strlen(efi_name)] == 0)
- continue;
/* found */
found = entry;
@@ -785,6 +783,7 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id,
&entry->var.VendorGuid,
PSTORE_EFI_ATTRIBUTES,
0, NULL);
+ break;
}
if (found)
@@ -823,7 +822,7 @@ static int efi_pstore_write(enum pstore_type_id type,
}
static int efi_pstore_erase(enum pstore_type_id type, u64 id,
- struct pstore_info *psi)
+ struct timespec time, struct pstore_info *psi)
{
return 0;
}
diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 4ab572e..4300af6 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -175,7 +175,8 @@ static int pstore_unlink(struct inode *dir, struct dentry *dentry)
struct pstore_private *p = dentry->d_inode->i_private;
if (p->psi->erase)
- p->psi->erase(p->type, p->id, p->psi);
+ p->psi->erase(p->type, p->id, dentry->d_inode->i_ctime,
+ p->psi);
return simple_unlink(dir, dentry);
}
diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
index 1a4f6da..749693f 100644
--- a/fs/pstore/ram.c
+++ b/fs/pstore/ram.c
@@ -237,7 +237,7 @@ static int notrace ramoops_pstore_write_buf(enum pstore_type_id type,
}
static int ramoops_pstore_erase(enum pstore_type_id type, u64 id,
- struct pstore_info *psi)
+ struct timespec time, struct pstore_info *psi)
{
struct ramoops_context *cxt = psi->data;
struct persistent_ram_zone *prz;
diff --git a/include/linux/pstore.h b/include/linux/pstore.h
index ee3034a..f6e9336 100644
--- a/include/linux/pstore.h
+++ b/include/linux/pstore.h
@@ -60,7 +60,7 @@ struct pstore_info {
unsigned int part, const char *buf, size_t size,
struct pstore_info *psi);
int (*erase)(enum pstore_type_id type, u64 id,
- struct pstore_info *psi);
+ struct timespec time, struct pstore_info *psi);
void *data;
};
--
1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-11-14 20:26:58
|
[Issue]
Currently, efi_pstore driver simply overwrites existing panic messages in NVRAM.
So, in the following scenario, we will lose 1st panic messages.
1. kernel panics.
2. efi_pstore is kicked and writes panic messages to NVRAM.
3. system reboots.
4. kernel panics again before a user checks the 1st panic messages in NVRAM.
[Solution]
A reasonable solution to fix the issue is just holding multiple logs without erasing
existing entries.
This patch removes a logic erasing existing entries in a write callback
because the logic is not needed in the write callback to support holding multiple logs.
Signed-off-by: Seiji Aguchi <sei...@hd...>
Acked-by: Mike Waychison <mi...@go...>
---
drivers/firmware/efivars.c | 39 ++-------------------------------------
1 files changed, 2 insertions(+), 37 deletions(-)
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index bee14cc..fbe9202 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -701,18 +701,13 @@ static int efi_pstore_write(enum pstore_type_id type,
unsigned int part, size_t size, struct pstore_info *psi)
{
char name[DUMP_NAME_LEN];
- char stub_name[DUMP_NAME_LEN];
efi_char16_t efi_name[DUMP_NAME_LEN];
efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
struct efivars *efivars = psi->data;
- struct efivar_entry *entry, *found = NULL;
int i, ret = 0;
u64 storage_space, remaining_space, max_variable_size;
efi_status_t status = EFI_NOT_FOUND;
- sprintf(stub_name, "dump-type%u-%u-", type, part);
- sprintf(name, "%s%lu", stub_name, get_seconds());
-
spin_lock(&efivars->lock);
/*
@@ -730,35 +725,8 @@ static int efi_pstore_write(enum pstore_type_id type,
return -ENOSPC;
}
- for (i = 0; i < DUMP_NAME_LEN; i++)
- efi_name[i] = stub_name[i];
-
- /*
- * Clean up any entries with the same name
- */
-
- list_for_each_entry(entry, &efivars->list, list) {
- get_var_data_locked(efivars, &entry->var);
-
- if (efi_guidcmp(entry->var.VendorGuid, vendor))
- continue;
- if (utf16_strncmp(entry->var.VariableName, efi_name,
- utf16_strlen(efi_name)))
- continue;
- /* Needs to be a prefix */
- if (entry->var.VariableName[utf16_strlen(efi_name)] == 0)
- continue;
-
- /* found */
- found = entry;
- efivars->ops->set_variable(entry->var.VariableName,
- &entry->var.VendorGuid,
- PSTORE_EFI_ATTRIBUTES,
- 0, NULL);
- }
-
- if (found)
- list_del(&found->list);
+ sprintf(name, "dump-type%u-%u-%lu", type, part,
+ get_seconds());
for (i = 0; i < DUMP_NAME_LEN; i++)
efi_name[i] = name[i];
@@ -768,9 +736,6 @@ static int efi_pstore_write(enum pstore_type_id type,
spin_unlock(&efivars->lock);
- if (found)
- efivar_unregister(found);
-
if (size)
ret = efivar_create_sysfs_entry(efivars,
utf16_strsize(efi_name,
--
1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-11-14 20:26:30
|
[Issue]
Currently, efi_pstore driver simply overwrites existing panic messages in NVRAM.
So, in the following scenario, we will lose 1st panic messages.
1. kernel panics.
2. efi_pstore is kicked and writes panic messages to NVRAM.
3. system reboots.
4. kernel panics again before a user checks the 1st panic messages in NVRAM.
[Solution]
A reasonable solution to fix the issue is just holding multiple logs without erasing
existing entries.
This patch freshly adds a logic erasing existing entries, which shared with a write callback,
to an erase callback.
To support holding multiple logs, the write callback doesn't need to erase any entries and
it will be removed in a subsequent patch.
Signed-off-by: Seiji Aguchi <sei...@hd...>
Acked-by: Mike Waychison <mi...@go...>
---
drivers/firmware/efivars.c | 46 +++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 45 insertions(+), 1 deletions(-)
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index 37ac21a..bee14cc 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -784,7 +784,51 @@ static int efi_pstore_write(enum pstore_type_id type,
static int efi_pstore_erase(enum pstore_type_id type, u64 id,
struct pstore_info *psi)
{
- efi_pstore_write(type, 0, &id, (unsigned int)id, 0, psi);
+ char stub_name[DUMP_NAME_LEN];
+ efi_char16_t efi_name[DUMP_NAME_LEN];
+ efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
+ struct efivars *efivars = psi->data;
+ struct efivar_entry *entry, *found = NULL;
+ int i;
+
+ sprintf(stub_name, "dump-type%u-%u-", type, (unsigned int)id);
+
+ spin_lock(&efivars->lock);
+
+ for (i = 0; i < DUMP_NAME_LEN; i++)
+ efi_name[i] = stub_name[i];
+
+ /*
+ * Clean up any entries with the same name
+ */
+
+ list_for_each_entry(entry, &efivars->list, list) {
+ get_var_data_locked(efivars, &entry->var);
+
+ if (efi_guidcmp(entry->var.VendorGuid, vendor))
+ continue;
+ if (utf16_strncmp(entry->var.VariableName, efi_name,
+ utf16_strlen(efi_name)))
+ continue;
+ /* Needs to be a prefix */
+ if (entry->var.VariableName[utf16_strlen(efi_name)] == 0)
+ continue;
+
+ /* found */
+ found = entry;
+ efivars->ops->set_variable(entry->var.VariableName,
+ &entry->var.VendorGuid,
+ PSTORE_EFI_ATTRIBUTES,
+ 0, NULL);
+ }
+
+ if (found)
+ list_del(&found->list);
+
+ spin_unlock(&efivars->lock);
+
+ if (found)
+ efivar_unregister(found);
return 0;
}
--
1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-11-14 20:25:47
|
[Issue] As discussed in a thread below, Running out of space in EFI isn't a well-tested scenario. And we wouldn't expect all firmware to handle it gracefully. http://marc.info/?l=linux-kernel&m=134305325801789&w=2 On the other hand, current efi_pstore doesn't check a remaining space of storage at writing time. Therefore, efi_pstore may not work if it tries to write a large amount of data. [Patch Description] To avoid handling the situation above, this patch checks if there is a space enough to log with QueryVariableInfo() before writing data. Signed-off-by: Seiji Aguchi <sei...@hd...> Acked-by: Mike Waychison <mi...@go...> --- drivers/firmware/efivars.c | 18 ++++++++++++++++++ include/linux/efi.h | 1 + 2 files changed, 19 insertions(+), 0 deletions(-) diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c index d10c987..37ac21a 100644 --- a/drivers/firmware/efivars.c +++ b/drivers/firmware/efivars.c @@ -707,12 +707,29 @@ static int efi_pstore_write(enum pstore_type_id type, struct efivars *efivars = psi->data; struct efivar_entry *entry, *found = NULL; int i, ret = 0; + u64 storage_space, remaining_space, max_variable_size; + efi_status_t status = EFI_NOT_FOUND; sprintf(stub_name, "dump-type%u-%u-", type, part); sprintf(name, "%s%lu", stub_name, get_seconds()); spin_lock(&efivars->lock); + /* + * Check if there is a space enough to log. + * size: a size of logging data + * DUMP_NAME_LEN * 2: a maximum size of variable name + */ + status = efivars->ops->query_variable_info(PSTORE_EFI_ATTRIBUTES, + &storage_space, + &remaining_space, + &max_variable_size); + if (status || remaining_space < size + DUMP_NAME_LEN * 2) { + spin_unlock(&efivars->lock); + *id = part; + return -ENOSPC; + } + for (i = 0; i < DUMP_NAME_LEN; i++) efi_name[i] = stub_name[i]; @@ -1237,6 +1254,7 @@ efivars_init(void) ops.get_variable = efi.get_variable; ops.set_variable = efi.set_variable; ops.get_next_variable = efi.get_next_variable; + ops.query_variable_info = efi.query_variable_info; error = register_efivars(&__efivars, &ops, efi_kobj); if (error) goto err_put; diff --git a/include/linux/efi.h b/include/linux/efi.h index 8670eb1..c47ec36 100644 --- a/include/linux/efi.h +++ b/include/linux/efi.h @@ -643,6 +643,7 @@ struct efivar_operations { efi_get_variable_t *get_variable; efi_get_next_variable_t *get_next_variable; efi_set_variable_t *set_variable; + efi_query_variable_info_t *query_variable_info; }; struct efivars { -- 1.7.1 |
|
From: Seiji A. <sei...@hd...> - 2012-11-14 20:24:52
|
v5 -> v6
- Add ctime to an argument of efi_pstore_erase to build successfully
in case where CONFIG_PTORE=n is specified. (Patch 4/7)
- Add count to an argument of efi_pstore_erase and efi_pstore_write to
build successfully in case where CONFIG_PTORE=n is specified (Patch 5/7)
- They are removed in a patchset v5 by mistake even though they were fixed
in a patchset v4.
By applying a patchset v6, kernel can be built with both CONFIG_PSTORE=y and
CONFIG_PSTORE=n.
Also, it can be built with both CONFIG_PSTORE_CONSOLE=y and
CONFIG_PSTORE_CONSOLE=n.
v4 -> v5
- Rebase to 3.7-rc5
- Add count to an argument of a write callback executed in pstore_console_write()
to build successfully in case where CONSIG_PSTORE_CONSOLE=y is specified. (Patch 5/7)
v3 -> v4
- Rebase to 3.7-rc3
- Add ctime to an argument of efi_pstore_erase to build successfully
in case where CONFIG_PTORE=n is specified. (Patch 4/7)
- Add count to an argument of efi_pstore_erase and efi_pstore_write to
build successfully in case where CONFIG_PTORE=n is specified (Patch 5/7)
- Fix a white space damage (Patch 6/7)
- Modify a logic checking a 3-variable format of a variable name (Patch 7/7)
v2 -> v3
- Create patches 6/7 and 7/7 to work with an existing format of variable name
v1 -> v2
- Separate into 5 patches in accordance with Mike's comment
- Erase an extra line of comment in patch 1/5
[Issue]
Currently, efi_pstore driver simply overwrites existing panic messages in NVRAM.
So, in the following scenario, we will lose 1st panic messages.
1. kernel panics.
2. efi_pstore is kicked and writes panic messages to NVRAM.
3. system reboots.
4. kernel panics again before a user checks the 1st panic messages in NVRAM.
[Solution]
Solutions of this problem has been discussed among Tony, Matthew, Don, Mike and me.
http://marc.info/?l=linux-kernel&m=134273270704586&w=2
And there are two possible solutions right now.
- First one is introducing some policy overwriting existing logs.
- Second one is simply holding multiple log without overwriting any entries.
We haven't decided the overwriting policy which is reasonable to all users yet.
But I believe we agree that just holding multiple logs is a reasonable way.
We may need further discussions to find the possibility of introducing overwriting
policy, especially getting critical messages in multiple oops case.
But I would like to begin with a simple and reasonable way to everyone.
So, this patch takes an approach just holding multiple logs.
Seiji Aguchi (7):
efi_pstore: Check remaining space with QueryVariableInfo() before
writing data
efi_pstore: Add a logic erasing entries to an erase callback
efi_pstore: Remove a logic erasing entries from a write callback to
hold multiple logs
efi_pstore: Add ctime to argument of erase callback
efi_pstore: Add a sequence counter to a variable name
efi_pstore: Add a format check for an existing variable name at
reading time
efi_pstore: Add a format check for an existing variable name at
erasing time
drivers/acpi/apei/erst.c | 16 ++--
drivers/firmware/efivars.c | 163 ++++++++++++++++++++++++++++++-------------
fs/pstore/inode.c | 7 ++-
fs/pstore/internal.h | 2 +-
fs/pstore/platform.c | 13 ++--
fs/pstore/ram.c | 9 +--
include/linux/efi.h | 1 +
include/linux/pstore.h | 6 +-
8 files changed, 144 insertions(+), 73 deletions(-)
|
|
From: Luck, T. <ton...@in...> - 2012-11-13 18:43:19
|
v4 -> v5
- Rebase to 3.7-rc5
- Add count to an argument of a write callback executed in pstore_console_write()
to build successfully in case where CONSIG_PSTORE_CONSOLE=y is specified. (Patch 5/7)
Applied. It's in my
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git next
tree now so should pick up into linux-next by tomorrow or the day after.
-Tony
|
|
From: Seiji A. <sei...@hd...> - 2012-11-13 16:39:49
|
[Issue]
a format of variable name has been updated to type, id, count and ctime
to support holding multiple logs.
Format of current variable name
dump-type0-1-2-12345678
type:0
id:1
count:2
ctime:12345678
On the other hand, if an old variable name before being updated
remains, users can't erase it via /dev/pstore.
Format of old variable name
dump-type0-1-12345678
type:0
id:1
ctime:12345678
[Solution]
This patch add a format check for the old variable name in a erase callback to make it erasable.
Signed-off-by: Seiji Aguchi <sei...@hd...>
Acked-by: Mike Waychison <mi...@go...>
---
drivers/firmware/efivars.c | 20 ++++++++++++++++++--
1 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index 60d2905..904d0c7 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -773,6 +773,8 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id, int count,
{
char name[DUMP_NAME_LEN];
efi_char16_t efi_name[DUMP_NAME_LEN];
+ char name_old[DUMP_NAME_LEN];
+ efi_char16_t efi_name_old[DUMP_NAME_LEN];
efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
struct efivars *efivars = psi->data;
struct efivar_entry *entry, *found = NULL;
@@ -796,8 +798,22 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id, int count,
if (efi_guidcmp(entry->var.VendorGuid, vendor))
continue;
if (utf16_strncmp(entry->var.VariableName, efi_name,
- utf16_strlen(efi_name)))
- continue;
+ utf16_strlen(efi_name))) {
+ /*
+ * Check if an old format,
+ * which doesn't support holding
+ * multiple logs, remains.
+ */
+ sprintf(name_old, "dump-type%u-%u-%lu", type,
+ (unsigned int)id, time.tv_sec);
+
+ for (i = 0; i < DUMP_NAME_LEN; i++)
+ efi_name_old[i] = name_old[i];
+
+ if (utf16_strncmp(entry->var.VariableName, efi_name_old,
+ utf16_strlen(efi_name_old)))
+ continue;
+ }
/* found */
found = entry;
--
1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-11-13 16:39:16
|
[Issue]
a format of variable name has been updated to type, id, count and ctime
to support holding multiple logs.
Format of current variable name
dump-type0-1-2-12345678
type:0
id:1
count:2
ctime:12345678
On the other hand, if an old variable name before being updated
remains, users can't read it via /dev/pstore.
Format of old variable name
dump-type0-1-12345678
type:0
id:1
ctime:12345678
[Solution]
This patch add a format check for the old variable name in a read callback
to make it readable.
Signed-off-by: Seiji Aguchi <sei...@hd...>
Acked-by: Mike Waychison <mi...@go...>
---
drivers/firmware/efivars.c | 38 ++++++++++++++++++++++++++++----------
1 files changed, 28 insertions(+), 10 deletions(-)
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index dc69802..60d2905 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -681,17 +681,35 @@ static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
*count = cnt;
timespec->tv_sec = time;
timespec->tv_nsec = 0;
- get_var_data_locked(efivars, &efivars->walk_entry->var);
- size = efivars->walk_entry->var.DataSize;
- *buf = kmalloc(size, GFP_KERNEL);
- if (*buf == NULL)
- return -ENOMEM;
- memcpy(*buf, efivars->walk_entry->var.Data,
- size);
- efivars->walk_entry = list_entry(efivars->walk_entry->list.next,
- struct efivar_entry, list);
- return size;
+ } else if (sscanf(name, "dump-type%u-%u-%lu",
+ type, &part, &time) == 3) {
+ /*
+ * Check if an old format,
+ * which doesn't support holding
+ * multiple logs, remains.
+ */
+ *id = part;
+ *count = 0;
+ timespec->tv_sec = time;
+ timespec->tv_nsec = 0;
+ } else {
+ efivars->walk_entry = list_entry(
+ efivars->walk_entry->list.next,
+ struct efivar_entry, list);
+ continue;
}
+
+ get_var_data_locked(efivars, &efivars->walk_entry->var);
+ size = efivars->walk_entry->var.DataSize;
+ *buf = kmalloc(size, GFP_KERNEL);
+ if (*buf == NULL)
+ return -ENOMEM;
+ memcpy(*buf, efivars->walk_entry->var.Data,
+ size);
+ efivars->walk_entry = list_entry(
+ efivars->walk_entry->list.next,
+ struct efivar_entry, list);
+ return size;
}
efivars->walk_entry = list_entry(efivars->walk_entry->list.next,
struct efivar_entry, list);
--
1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-11-13 16:38:51
|
[Issue]
Currently, a variable name, which identifies each entry, consists of type, id and ctime.
But if multiple events happens in a short time, a second/third event may fail to log because
efi_pstore can't distinguish each event with current variable name.
[Solution]
A reasonable way to identify all events precisely is introducing a sequence counter to
the variable name.
The sequence counter has already supported in a pstore layer with "oopscount".
So, this patch adds it to a variable name.
Also, it is passed to read/erase callbacks of platform drivers in accordance with
the modification of the variable name.
<before applying this patch>
a variable name of first event: dump-type0-1-12345678
a variable name of second event: dump-type0-1-12345678
type:0
id:1
ctime:12345678
If multiple events happen in a short time, efi_pstore can't distinguish them because
variable names are same among them.
<after applying this patch>
it can be distinguishable by adding a sequence counter as follows.
a variable name of first event: dump-type0-1-1-12345678
a variable name of Second event: dump-type0-1-2-12345678
type:0
id:1
sequence counter: 1(first event), 2(second event)
ctime:12345678
In case of a write callback executed in pstore_console_write(), "0" is added to
an argument of the write callback because it just logs all kernel messages and
doesn't need to care about multiple events.
Signed-off-by: Seiji Aguchi <sei...@hd...>
Acked-by: Rafael J. Wysocki <raf...@in...>
Acked-by: Mike Waychison <mi...@go...>
---
drivers/acpi/apei/erst.c | 12 ++++++------
drivers/firmware/efivars.c | 18 +++++++++++-------
fs/pstore/inode.c | 8 +++++---
fs/pstore/internal.h | 2 +-
fs/pstore/platform.c | 13 +++++++------
fs/pstore/ram.c | 7 +++----
include/linux/pstore.h | 8 +++++---
7 files changed, 38 insertions(+), 30 deletions(-)
diff --git a/drivers/acpi/apei/erst.c b/drivers/acpi/apei/erst.c
index 0bd6ae4..6d894bf 100644
--- a/drivers/acpi/apei/erst.c
+++ b/drivers/acpi/apei/erst.c
@@ -931,13 +931,13 @@ static int erst_check_table(struct acpi_table_erst *erst_tab)
static int erst_open_pstore(struct pstore_info *psi);
static int erst_close_pstore(struct pstore_info *psi);
-static ssize_t erst_reader(u64 *id, enum pstore_type_id *type,
+static ssize_t erst_reader(u64 *id, enum pstore_type_id *type, int *count,
struct timespec *time, char **buf,
struct pstore_info *psi);
static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
- u64 *id, unsigned int part,
+ u64 *id, unsigned int part, int count,
size_t size, struct pstore_info *psi);
-static int erst_clearer(enum pstore_type_id type, u64 id,
+static int erst_clearer(enum pstore_type_id type, u64 id, int count,
struct timespec time, struct pstore_info *psi);
static struct pstore_info erst_info = {
@@ -987,7 +987,7 @@ static int erst_close_pstore(struct pstore_info *psi)
return 0;
}
-static ssize_t erst_reader(u64 *id, enum pstore_type_id *type,
+static ssize_t erst_reader(u64 *id, enum pstore_type_id *type, int *count,
struct timespec *time, char **buf,
struct pstore_info *psi)
{
@@ -1055,7 +1055,7 @@ out:
}
static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
- u64 *id, unsigned int part,
+ u64 *id, unsigned int part, int count,
size_t size, struct pstore_info *psi)
{
struct cper_pstore_record *rcd = (struct cper_pstore_record *)
@@ -1101,7 +1101,7 @@ static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
return ret;
}
-static int erst_clearer(enum pstore_type_id type, u64 id,
+static int erst_clearer(enum pstore_type_id type, u64 id, int count,
struct timespec time, struct pstore_info *psi)
{
return erst_clear(id);
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index 6cbeea7..dc69802 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -658,13 +658,14 @@ static int efi_pstore_close(struct pstore_info *psi)
}
static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
- struct timespec *timespec,
+ int *count, struct timespec *timespec,
char **buf, struct pstore_info *psi)
{
efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
struct efivars *efivars = psi->data;
char name[DUMP_NAME_LEN];
int i;
+ int cnt;
unsigned int part, size;
unsigned long time;
@@ -674,8 +675,10 @@ static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
for (i = 0; i < DUMP_NAME_LEN; i++) {
name[i] = efivars->walk_entry->var.VariableName[i];
}
- if (sscanf(name, "dump-type%u-%u-%lu", type, &part, &time) == 3) {
+ if (sscanf(name, "dump-type%u-%u-%d-%lu",
+ type, &part, &cnt, &time) == 4) {
*id = part;
+ *count = cnt;
timespec->tv_sec = time;
timespec->tv_nsec = 0;
get_var_data_locked(efivars, &efivars->walk_entry->var);
@@ -698,7 +701,8 @@ static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
static int efi_pstore_write(enum pstore_type_id type,
enum kmsg_dump_reason reason, u64 *id,
- unsigned int part, size_t size, struct pstore_info *psi)
+ unsigned int part, int count, size_t size,
+ struct pstore_info *psi)
{
char name[DUMP_NAME_LEN];
efi_char16_t efi_name[DUMP_NAME_LEN];
@@ -725,7 +729,7 @@ static int efi_pstore_write(enum pstore_type_id type,
return -ENOSPC;
}
- sprintf(name, "dump-type%u-%u-%lu", type, part,
+ sprintf(name, "dump-type%u-%u-%d-%lu", type, part, count,
get_seconds());
for (i = 0; i < DUMP_NAME_LEN; i++)
@@ -746,7 +750,7 @@ static int efi_pstore_write(enum pstore_type_id type,
return ret;
};
-static int efi_pstore_erase(enum pstore_type_id type, u64 id,
+static int efi_pstore_erase(enum pstore_type_id type, u64 id, int count,
struct timespec time, struct pstore_info *psi)
{
char name[DUMP_NAME_LEN];
@@ -756,7 +760,7 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id,
struct efivar_entry *entry, *found = NULL;
int i;
- sprintf(name, "dump-type%u-%u-%lu", type, (unsigned int)id,
+ sprintf(name, "dump-type%u-%u-%d-%lu", type, (unsigned int)id, count,
time.tv_sec);
spin_lock(&efivars->lock);
@@ -807,7 +811,7 @@ static int efi_pstore_close(struct pstore_info *psi)
return 0;
}
-static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
+static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type, int *count,
struct timespec *timespec,
char **buf, struct pstore_info *psi)
{
diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 4300af6..ed1d8c7 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -49,6 +49,7 @@ struct pstore_private {
struct pstore_info *psi;
enum pstore_type_id type;
u64 id;
+ int count;
ssize_t size;
char data[];
};
@@ -175,8 +176,8 @@ static int pstore_unlink(struct inode *dir, struct dentry *dentry)
struct pstore_private *p = dentry->d_inode->i_private;
if (p->psi->erase)
- p->psi->erase(p->type, p->id, dentry->d_inode->i_ctime,
- p->psi);
+ p->psi->erase(p->type, p->id, p->count,
+ dentry->d_inode->i_ctime, p->psi);
return simple_unlink(dir, dentry);
}
@@ -271,7 +272,7 @@ int pstore_is_mounted(void)
* Load it up with "size" bytes of data from "buf".
* Set the mtime & ctime to the date that this record was originally stored.
*/
-int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id,
+int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id, int count,
char *data, size_t size, struct timespec time,
struct pstore_info *psi)
{
@@ -307,6 +308,7 @@ int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id,
goto fail_alloc;
private->type = type;
private->id = id;
+ private->count = count;
private->psi = psi;
switch (type) {
diff --git a/fs/pstore/internal.h b/fs/pstore/internal.h
index 4847f58..937d820 100644
--- a/fs/pstore/internal.h
+++ b/fs/pstore/internal.h
@@ -50,7 +50,7 @@ extern struct pstore_info *psinfo;
extern void pstore_set_kmsg_bytes(int);
extern void pstore_get_records(int);
extern int pstore_mkfile(enum pstore_type_id, char *psname, u64 id,
- char *data, size_t size,
+ int count, char *data, size_t size,
struct timespec time, struct pstore_info *psi);
extern int pstore_is_mounted(void);
diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index a40da07..e518f43 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -136,7 +136,7 @@ static void pstore_dump(struct kmsg_dumper *dumper,
break;
ret = psinfo->write(PSTORE_TYPE_DMESG, reason, &id, part,
- hsize + len, psinfo);
+ oopscount, hsize + len, psinfo);
if (ret == 0 && reason == KMSG_DUMP_OOPS && pstore_is_mounted())
pstore_new_entry = 1;
@@ -172,7 +172,7 @@ static void pstore_console_write(struct console *con, const char *s, unsigned c)
spin_lock_irqsave(&psinfo->buf_lock, flags);
}
memcpy(psinfo->buf, s, c);
- psinfo->write(PSTORE_TYPE_CONSOLE, 0, NULL, 0, c, psinfo);
+ psinfo->write(PSTORE_TYPE_CONSOLE, 0, NULL, 0, 0, c, psinfo);
spin_unlock_irqrestore(&psinfo->buf_lock, flags);
s += c;
c = e - s;
@@ -196,7 +196,7 @@ static void pstore_register_console(void) {}
static int pstore_write_compat(enum pstore_type_id type,
enum kmsg_dump_reason reason,
- u64 *id, unsigned int part,
+ u64 *id, unsigned int part, int count,
size_t size, struct pstore_info *psi)
{
return psi->write_buf(type, reason, id, part, psinfo->buf, size, psi);
@@ -266,6 +266,7 @@ void pstore_get_records(int quiet)
char *buf = NULL;
ssize_t size;
u64 id;
+ int count;
enum pstore_type_id type;
struct timespec time;
int failed = 0, rc;
@@ -277,9 +278,9 @@ void pstore_get_records(int quiet)
if (psi->open && psi->open(psi))
goto out;
- while ((size = psi->read(&id, &type, &time, &buf, psi)) > 0) {
- rc = pstore_mkfile(type, psi->name, id, buf, (size_t)size,
- time, psi);
+ while ((size = psi->read(&id, &type, &count, &time, &buf, psi)) > 0) {
+ rc = pstore_mkfile(type, psi->name, id, count, buf,
+ (size_t)size, time, psi);
kfree(buf);
buf = NULL;
if (rc && (rc != -EEXIST || !quiet))
diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
index 749693f..2bfa36e 100644
--- a/fs/pstore/ram.c
+++ b/fs/pstore/ram.c
@@ -132,9 +132,8 @@ ramoops_get_next_prz(struct persistent_ram_zone *przs[], uint *c, uint max,
}
static ssize_t ramoops_pstore_read(u64 *id, enum pstore_type_id *type,
- struct timespec *time,
- char **buf,
- struct pstore_info *psi)
+ int *count, struct timespec *time,
+ char **buf, struct pstore_info *psi)
{
ssize_t size;
struct ramoops_context *cxt = psi->data;
@@ -236,7 +235,7 @@ static int notrace ramoops_pstore_write_buf(enum pstore_type_id type,
return 0;
}
-static int ramoops_pstore_erase(enum pstore_type_id type, u64 id,
+static int ramoops_pstore_erase(enum pstore_type_id type, u64 id, int count,
struct timespec time, struct pstore_info *psi)
{
struct ramoops_context *cxt = psi->data;
diff --git a/include/linux/pstore.h b/include/linux/pstore.h
index f6e9336..1788909 100644
--- a/include/linux/pstore.h
+++ b/include/linux/pstore.h
@@ -50,17 +50,19 @@ struct pstore_info {
int (*open)(struct pstore_info *psi);
int (*close)(struct pstore_info *psi);
ssize_t (*read)(u64 *id, enum pstore_type_id *type,
- struct timespec *time, char **buf,
+ int *count, struct timespec *time, char **buf,
struct pstore_info *psi);
int (*write)(enum pstore_type_id type,
enum kmsg_dump_reason reason, u64 *id,
- unsigned int part, size_t size, struct pstore_info *psi);
+ unsigned int part, int count, size_t size,
+ struct pstore_info *psi);
int (*write_buf)(enum pstore_type_id type,
enum kmsg_dump_reason reason, u64 *id,
unsigned int part, const char *buf, size_t size,
struct pstore_info *psi);
int (*erase)(enum pstore_type_id type, u64 id,
- struct timespec time, struct pstore_info *psi);
+ int count, struct timespec time,
+ struct pstore_info *psi);
void *data;
};
--
1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-11-13 16:38:06
|
[Issue]
Currently, a variable name, which is used to identify each log entry, consists of type,
id and ctime. But an erase callback does not use ctime.
If efi_pstore supported just one log, type and id were enough.
However, in case of supporting multiple logs, it doesn't work because
it can't distinguish each entry without ctime at erasing time.
<Example>
As you can see below, efi_pstore can't differentiate first event from second one without ctime.
a variable name of first event: dump-type0-1-12345678
a variable name of second event: dump-type0-1-23456789
type:0
id:1
ctime:12345678, 23456789
[Solution]
This patch adds ctime to an argument of an erase callback.
It works across reboots because ctime of pstore means the date that the record was originally stored.
To do this, efi_pstore saves the ctime to variable name at writing time and passes it to pstore
at reading time.
Signed-off-by: Seiji Aguchi <sei...@hd...>
Acked-by: Mike Waychison <mi...@go...>
---
drivers/acpi/apei/erst.c | 4 ++--
drivers/firmware/efivars.c | 15 +++++++--------
fs/pstore/inode.c | 3 ++-
fs/pstore/ram.c | 2 +-
include/linux/pstore.h | 2 +-
5 files changed, 13 insertions(+), 13 deletions(-)
diff --git a/drivers/acpi/apei/erst.c b/drivers/acpi/apei/erst.c
index e4d9d24..0bd6ae4 100644
--- a/drivers/acpi/apei/erst.c
+++ b/drivers/acpi/apei/erst.c
@@ -938,7 +938,7 @@ static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
u64 *id, unsigned int part,
size_t size, struct pstore_info *psi);
static int erst_clearer(enum pstore_type_id type, u64 id,
- struct pstore_info *psi);
+ struct timespec time, struct pstore_info *psi);
static struct pstore_info erst_info = {
.owner = THIS_MODULE,
@@ -1102,7 +1102,7 @@ static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
}
static int erst_clearer(enum pstore_type_id type, u64 id,
- struct pstore_info *psi)
+ struct timespec time, struct pstore_info *psi)
{
return erst_clear(id);
}
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index fbe9202..6cbeea7 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -747,24 +747,25 @@ static int efi_pstore_write(enum pstore_type_id type,
};
static int efi_pstore_erase(enum pstore_type_id type, u64 id,
- struct pstore_info *psi)
+ struct timespec time, struct pstore_info *psi)
{
- char stub_name[DUMP_NAME_LEN];
+ char name[DUMP_NAME_LEN];
efi_char16_t efi_name[DUMP_NAME_LEN];
efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
struct efivars *efivars = psi->data;
struct efivar_entry *entry, *found = NULL;
int i;
- sprintf(stub_name, "dump-type%u-%u-", type, (unsigned int)id);
+ sprintf(name, "dump-type%u-%u-%lu", type, (unsigned int)id,
+ time.tv_sec);
spin_lock(&efivars->lock);
for (i = 0; i < DUMP_NAME_LEN; i++)
- efi_name[i] = stub_name[i];
+ efi_name[i] = name[i];
/*
- * Clean up any entries with the same name
+ * Clean up an entry with the same name
*/
list_for_each_entry(entry, &efivars->list, list) {
@@ -775,9 +776,6 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id,
if (utf16_strncmp(entry->var.VariableName, efi_name,
utf16_strlen(efi_name)))
continue;
- /* Needs to be a prefix */
- if (entry->var.VariableName[utf16_strlen(efi_name)] == 0)
- continue;
/* found */
found = entry;
@@ -785,6 +783,7 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id,
&entry->var.VendorGuid,
PSTORE_EFI_ATTRIBUTES,
0, NULL);
+ break;
}
if (found)
diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 4ab572e..4300af6 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -175,7 +175,8 @@ static int pstore_unlink(struct inode *dir, struct dentry *dentry)
struct pstore_private *p = dentry->d_inode->i_private;
if (p->psi->erase)
- p->psi->erase(p->type, p->id, p->psi);
+ p->psi->erase(p->type, p->id, dentry->d_inode->i_ctime,
+ p->psi);
return simple_unlink(dir, dentry);
}
diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
index 1a4f6da..749693f 100644
--- a/fs/pstore/ram.c
+++ b/fs/pstore/ram.c
@@ -237,7 +237,7 @@ static int notrace ramoops_pstore_write_buf(enum pstore_type_id type,
}
static int ramoops_pstore_erase(enum pstore_type_id type, u64 id,
- struct pstore_info *psi)
+ struct timespec time, struct pstore_info *psi)
{
struct ramoops_context *cxt = psi->data;
struct persistent_ram_zone *prz;
diff --git a/include/linux/pstore.h b/include/linux/pstore.h
index ee3034a..f6e9336 100644
--- a/include/linux/pstore.h
+++ b/include/linux/pstore.h
@@ -60,7 +60,7 @@ struct pstore_info {
unsigned int part, const char *buf, size_t size,
struct pstore_info *psi);
int (*erase)(enum pstore_type_id type, u64 id,
- struct pstore_info *psi);
+ struct timespec time, struct pstore_info *psi);
void *data;
};
--
1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-11-13 16:37:37
|
[Issue]
Currently, efi_pstore driver simply overwrites existing panic messages in NVRAM.
So, in the following scenario, we will lose 1st panic messages.
1. kernel panics.
2. efi_pstore is kicked and writes panic messages to NVRAM.
3. system reboots.
4. kernel panics again before a user checks the 1st panic messages in NVRAM.
[Solution]
A reasonable solution to fix the issue is just holding multiple logs without erasing
existing entries.
This patch removes a logic erasing existing entries in a write callback
because the logic is not needed in the write callback to support holding multiple logs.
Signed-off-by: Seiji Aguchi <sei...@hd...>
Acked-by: Mike Waychison <mi...@go...>
---
drivers/firmware/efivars.c | 39 ++-------------------------------------
1 files changed, 2 insertions(+), 37 deletions(-)
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index bee14cc..fbe9202 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -701,18 +701,13 @@ static int efi_pstore_write(enum pstore_type_id type,
unsigned int part, size_t size, struct pstore_info *psi)
{
char name[DUMP_NAME_LEN];
- char stub_name[DUMP_NAME_LEN];
efi_char16_t efi_name[DUMP_NAME_LEN];
efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
struct efivars *efivars = psi->data;
- struct efivar_entry *entry, *found = NULL;
int i, ret = 0;
u64 storage_space, remaining_space, max_variable_size;
efi_status_t status = EFI_NOT_FOUND;
- sprintf(stub_name, "dump-type%u-%u-", type, part);
- sprintf(name, "%s%lu", stub_name, get_seconds());
-
spin_lock(&efivars->lock);
/*
@@ -730,35 +725,8 @@ static int efi_pstore_write(enum pstore_type_id type,
return -ENOSPC;
}
- for (i = 0; i < DUMP_NAME_LEN; i++)
- efi_name[i] = stub_name[i];
-
- /*
- * Clean up any entries with the same name
- */
-
- list_for_each_entry(entry, &efivars->list, list) {
- get_var_data_locked(efivars, &entry->var);
-
- if (efi_guidcmp(entry->var.VendorGuid, vendor))
- continue;
- if (utf16_strncmp(entry->var.VariableName, efi_name,
- utf16_strlen(efi_name)))
- continue;
- /* Needs to be a prefix */
- if (entry->var.VariableName[utf16_strlen(efi_name)] == 0)
- continue;
-
- /* found */
- found = entry;
- efivars->ops->set_variable(entry->var.VariableName,
- &entry->var.VendorGuid,
- PSTORE_EFI_ATTRIBUTES,
- 0, NULL);
- }
-
- if (found)
- list_del(&found->list);
+ sprintf(name, "dump-type%u-%u-%lu", type, part,
+ get_seconds());
for (i = 0; i < DUMP_NAME_LEN; i++)
efi_name[i] = name[i];
@@ -768,9 +736,6 @@ static int efi_pstore_write(enum pstore_type_id type,
spin_unlock(&efivars->lock);
- if (found)
- efivar_unregister(found);
-
if (size)
ret = efivar_create_sysfs_entry(efivars,
utf16_strsize(efi_name,
--
1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-11-13 16:37:12
|
[Issue] As discussed in a thread below, Running out of space in EFI isn't a well-tested scenario. And we wouldn't expect all firmware to handle it gracefully. http://marc.info/?l=linux-kernel&m=134305325801789&w=2 On the other hand, current efi_pstore doesn't check a remaining space of storage at writing time. Therefore, efi_pstore may not work if it tries to write a large amount of data. [Patch Description] To avoid handling the situation above, this patch checks if there is a space enough to log with QueryVariableInfo() before writing data. Signed-off-by: Seiji Aguchi <sei...@hd...> Acked-by: Mike Waychison <mi...@go...> --- drivers/firmware/efivars.c | 18 ++++++++++++++++++ include/linux/efi.h | 1 + 2 files changed, 19 insertions(+), 0 deletions(-) diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c index d10c987..37ac21a 100644 --- a/drivers/firmware/efivars.c +++ b/drivers/firmware/efivars.c @@ -707,12 +707,29 @@ static int efi_pstore_write(enum pstore_type_id type, struct efivars *efivars = psi->data; struct efivar_entry *entry, *found = NULL; int i, ret = 0; + u64 storage_space, remaining_space, max_variable_size; + efi_status_t status = EFI_NOT_FOUND; sprintf(stub_name, "dump-type%u-%u-", type, part); sprintf(name, "%s%lu", stub_name, get_seconds()); spin_lock(&efivars->lock); + /* + * Check if there is a space enough to log. + * size: a size of logging data + * DUMP_NAME_LEN * 2: a maximum size of variable name + */ + status = efivars->ops->query_variable_info(PSTORE_EFI_ATTRIBUTES, + &storage_space, + &remaining_space, + &max_variable_size); + if (status || remaining_space < size + DUMP_NAME_LEN * 2) { + spin_unlock(&efivars->lock); + *id = part; + return -ENOSPC; + } + for (i = 0; i < DUMP_NAME_LEN; i++) efi_name[i] = stub_name[i]; @@ -1237,6 +1254,7 @@ efivars_init(void) ops.get_variable = efi.get_variable; ops.set_variable = efi.set_variable; ops.get_next_variable = efi.get_next_variable; + ops.query_variable_info = efi.query_variable_info; error = register_efivars(&__efivars, &ops, efi_kobj); if (error) goto err_put; diff --git a/include/linux/efi.h b/include/linux/efi.h index 8670eb1..c47ec36 100644 --- a/include/linux/efi.h +++ b/include/linux/efi.h @@ -643,6 +643,7 @@ struct efivar_operations { efi_get_variable_t *get_variable; efi_get_next_variable_t *get_next_variable; efi_set_variable_t *set_variable; + efi_query_variable_info_t *query_variable_info; }; struct efivars { -- 1.7.1 |
|
From: Seiji A. <sei...@hd...> - 2012-11-13 16:37:01
|
[Issue]
Currently, efi_pstore driver simply overwrites existing panic messages in NVRAM.
So, in the following scenario, we will lose 1st panic messages.
1. kernel panics.
2. efi_pstore is kicked and writes panic messages to NVRAM.
3. system reboots.
4. kernel panics again before a user checks the 1st panic messages in NVRAM.
[Solution]
A reasonable solution to fix the issue is just holding multiple logs without erasing
existing entries.
This patch freshly adds a logic erasing existing entries, which shared with a write callback,
to an erase callback.
To support holding multiple logs, the write callback doesn't need to erase any entries and
it will be removed in a subsequent patch.
Signed-off-by: Seiji Aguchi <sei...@hd...>
Acked-by: Mike Waychison <mi...@go...>
---
drivers/firmware/efivars.c | 46 +++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 45 insertions(+), 1 deletions(-)
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index 37ac21a..bee14cc 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -784,7 +784,51 @@ static int efi_pstore_write(enum pstore_type_id type,
static int efi_pstore_erase(enum pstore_type_id type, u64 id,
struct pstore_info *psi)
{
- efi_pstore_write(type, 0, &id, (unsigned int)id, 0, psi);
+ char stub_name[DUMP_NAME_LEN];
+ efi_char16_t efi_name[DUMP_NAME_LEN];
+ efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
+ struct efivars *efivars = psi->data;
+ struct efivar_entry *entry, *found = NULL;
+ int i;
+
+ sprintf(stub_name, "dump-type%u-%u-", type, (unsigned int)id);
+
+ spin_lock(&efivars->lock);
+
+ for (i = 0; i < DUMP_NAME_LEN; i++)
+ efi_name[i] = stub_name[i];
+
+ /*
+ * Clean up any entries with the same name
+ */
+
+ list_for_each_entry(entry, &efivars->list, list) {
+ get_var_data_locked(efivars, &entry->var);
+
+ if (efi_guidcmp(entry->var.VendorGuid, vendor))
+ continue;
+ if (utf16_strncmp(entry->var.VariableName, efi_name,
+ utf16_strlen(efi_name)))
+ continue;
+ /* Needs to be a prefix */
+ if (entry->var.VariableName[utf16_strlen(efi_name)] == 0)
+ continue;
+
+ /* found */
+ found = entry;
+ efivars->ops->set_variable(entry->var.VariableName,
+ &entry->var.VendorGuid,
+ PSTORE_EFI_ATTRIBUTES,
+ 0, NULL);
+ }
+
+ if (found)
+ list_del(&found->list);
+
+ spin_unlock(&efivars->lock);
+
+ if (found)
+ efivar_unregister(found);
return 0;
}
--
1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-11-13 16:36:13
|
v4 -> v5
- Rebase to 3.7-rc5
- Add count to an argument of a write callback executed in pstore_console_write()
to build successfully in case where CONSIG_PSTORE_CONSOLE=y is specified. (Patch 5/7)
v3 -> v4
- Rebase to 3.7-rc3
- Add ctime to an argument of efi_pstore_erase to build successfully
in case where CONFIG_PTORE=n is specified. (Patch 4/7)
- Add count to an argument of efi_pstore_erase and efi_pstore_write to
build successfully in case where CONFIG_PTORE=n is specified (Patch 5/7)
- Fix a white space damage (Patch 6/7)
- Modify a logic checking a 3-variable format of a variable name (Patch 7/7)
v2 -> v3
- Create patches 6/7 and 7/7 to work with an existing format of variable name
v1 -> v2
- Separate into 5 patches in accordance with Mike's comment
- Erase an extra line of comment in patch 1/5
[Issue]
Currently, efi_pstore driver simply overwrites existing panic messages in NVRAM.
So, in the following scenario, we will lose 1st panic messages.
1. kernel panics.
2. efi_pstore is kicked and writes panic messages to NVRAM.
3. system reboots.
4. kernel panics again before a user checks the 1st panic messages in NVRAM.
[Solution]
Solutions of this problem has been discussed among Tony, Matthew, Don, Mike and me.
http://marc.info/?l=linux-kernel&m=134273270704586&w=2
And there are two possible solutions right now.
- First one is introducing some policy overwriting existing logs.
- Second one is simply holding multiple log without overwriting any entries.
We haven't decided the overwriting policy which is reasonable to all users yet.
But I believe we agree that just holding multiple logs is a reasonable way.
We may need further discussions to find the possibility of introducing overwriting
policy, especially getting critical messages in multiple oops case.
But I would like to begin with a simple and reasonable way to everyone.
So, this patch takes an approach just holding multiple logs.
Seiji Aguchi (7):
efi_pstore: Check remaining space with QueryVariableInfo() before
writing data
efi_pstore: Add a logic erasing entries to an erase callback
efi_pstore: Remove a logic erasing entries from a write callback to
hold multiple logs
efi_pstore: Add ctime to argument of erase callback
efi_pstore: Add a sequence counter to a variable name
efi_pstore: Add a format check for an existing variable name at
reading time
efi_pstore: Add a format check for an existing variable name at erasing time
drivers/acpi/apei/erst.c | 16 ++--
drivers/firmware/efivars.c | 156 +++++++++++++++++++++++++++++++-------------
fs/pstore/inode.c | 7 ++-
fs/pstore/internal.h | 2 +-
fs/pstore/platform.c | 13 ++--
fs/pstore/ram.c | 9 +--
include/linux/efi.h | 1 +
include/linux/pstore.h | 6 +-
8 files changed, 140 insertions(+), 70 deletions(-)
|