You can subscribe to this list here.
| 2009 |
Jan
|
Feb
|
Mar
|
Apr
|
May
(32) |
Jun
(66) |
Jul
(102) |
Aug
(78) |
Sep
(106) |
Oct
(137) |
Nov
(147) |
Dec
(147) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2010 |
Jan
(71) |
Feb
(139) |
Mar
(86) |
Apr
(76) |
May
(57) |
Jun
(10) |
Jul
(12) |
Aug
(6) |
Sep
(8) |
Oct
(12) |
Nov
(12) |
Dec
(18) |
| 2011 |
Jan
(16) |
Feb
(19) |
Mar
(3) |
Apr
(1) |
May
(16) |
Jun
(17) |
Jul
(74) |
Aug
(22) |
Sep
(18) |
Oct
(24) |
Nov
(21) |
Dec
(30) |
| 2012 |
Jan
(31) |
Feb
(16) |
Mar
(22) |
Apr
(25) |
May
(18) |
Jun
(13) |
Jul
(83) |
Aug
(49) |
Sep
(20) |
Oct
(60) |
Nov
(35) |
Dec
(28) |
| 2013 |
Jan
(39) |
Feb
(61) |
Mar
(35) |
Apr
(21) |
May
(45) |
Jun
(56) |
Jul
(20) |
Aug
(9) |
Sep
(10) |
Oct
(31) |
Nov
(8) |
Dec
(4) |
| 2014 |
Jan
(6) |
Feb
(7) |
Mar
(7) |
Apr
(6) |
May
(4) |
Jun
(8) |
Jul
(5) |
Aug
(2) |
Sep
(4) |
Oct
(4) |
Nov
(11) |
Dec
(5) |
| 2015 |
Jan
(4) |
Feb
(4) |
Mar
(3) |
Apr
(4) |
May
(9) |
Jun
(4) |
Jul
(15) |
Aug
(8) |
Sep
(16) |
Oct
(18) |
Nov
(15) |
Dec
(7) |
| 2016 |
Jan
(20) |
Feb
(9) |
Mar
(15) |
Apr
(24) |
May
(16) |
Jun
(28) |
Jul
(22) |
Aug
(23) |
Sep
(18) |
Oct
(30) |
Nov
(40) |
Dec
(9) |
| 2017 |
Jan
(1) |
Feb
(8) |
Mar
(37) |
Apr
(26) |
May
(25) |
Jun
(46) |
Jul
(24) |
Aug
(9) |
Sep
|
Oct
|
Nov
|
Dec
|
|
From: Seiji A. <sei...@hd...> - 2012-10-26 21:43:55
|
[Issue]
A format of variable name has been updated to type, id, count and ctime
to support holding multiple logs.
Format of current variable name
dump-type0-1-2-12345678
type:0
id:1
count:2
ctime:12345678
On the other hand, if an old variable name before being updated
remains, users can't erase it via /dev/pstore.
Format of old variable name
dump-type0-1-12345678
type:0
id:1
ctime:12345678
[Solution]
This patch adds a format check for the old variable name in a erase callback to make it erasable.
Signed-off-by: Seiji Aguchi <sei...@hd...>
---
drivers/firmware/efivars.c | 14 ++++++++++++--
1 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index dd228d5..b1cd028 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -777,6 +777,8 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id, int count,
struct efivars *efivars = psi->data;
struct efivar_entry *entry, *found = NULL;
int i;
+ unsigned int type_old, part_old;
+ unsigned long time_old;
sprintf(name, "dump-type%u-%u-%d-%lu", type, (unsigned int)id, count,
time.tv_sec);
@@ -796,8 +798,16 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id, int count,
if (efi_guidcmp(entry->var.VendorGuid, vendor))
continue;
if (utf16_strncmp(entry->var.VariableName, efi_name,
- utf16_strlen(efi_name)))
- continue;
+ utf16_strlen(efi_name))) {
+ /*
+ * Check if an old format,
+ * which doesn't support holding
+ * multiple logs, remains.
+ */
+ if (sscanf(name, "dump-type%u-%u-%lu",
+ &type_old, &part_old, &time_old) != 3)
+ continue;
+ }
/* found */
found = entry;
-- 1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-10-26 21:43:24
|
[Issue]
A format of variable name has been updated to type, id, count and ctime
to support holding multiple logs.
Format of current variable name
dump-type0-1-2-12345678
type:0
id:1
count:2
ctime:12345678
On the other hand, if an old variable name before being updated
remains, users can't read it via /dev/pstore.
Format of old variable name
dump-type0-1-12345678
type:0
id:1
ctime:12345678
[Solution]
This patch adds a format check for the old variable name in a read callback
to make it readable.
Signed-off-by: Seiji Aguchi <sei...@hd...>
---
drivers/firmware/efivars.c | 38 ++++++++++++++++++++++++++++----------
1 files changed, 28 insertions(+), 10 deletions(-)
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index dc69802..dd228d5 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -681,17 +681,35 @@ static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
*count = cnt;
timespec->tv_sec = time;
timespec->tv_nsec = 0;
- get_var_data_locked(efivars, &efivars->walk_entry->var);
- size = efivars->walk_entry->var.DataSize;
- *buf = kmalloc(size, GFP_KERNEL);
- if (*buf == NULL)
- return -ENOMEM;
- memcpy(*buf, efivars->walk_entry->var.Data,
- size);
- efivars->walk_entry = list_entry(efivars->walk_entry->list.next,
- struct efivar_entry, list);
- return size;
+ } else if (sscanf(name, "dump-type%u-%u-%lu",
+ type, &part, &time) == 3) {
+ /*
+ * Check if an old format,
+ * which doesn't support holding
+ * multiple logs, remains.
+ */
+ *id = part;
+ *count = 0;
+ timespec->tv_sec = time;
+ timespec->tv_nsec = 0;
+ } else {
+ efivars->walk_entry = list_entry(
+ efivars->walk_entry->list.next,
+ struct efivar_entry, list);
+ continue;
}
+
+ get_var_data_locked(efivars, &efivars->walk_entry->var);
+ size = efivars->walk_entry->var.DataSize;
+ *buf = kmalloc(size, GFP_KERNEL);
+ if (*buf == NULL)
+ return -ENOMEM;
+ memcpy(*buf, efivars->walk_entry->var.Data,
+ size);
+ efivars->walk_entry = list_entry(
+ efivars->walk_entry->list.next,
+ struct efivar_entry, list);
+ return size;
}
efivars->walk_entry = list_entry(efivars->walk_entry->list.next,
struct efivar_entry, list);
-- 1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-10-26 21:42:53
|
[Issue]
Currently, a variable name, which identifies each entry, consists of type, id and ctime.
But if multiple events happens in a short time, a second/third event may fail to log because
efi_pstore can't distinguish each event with current variable name.
[Solution]
A reasonable way to identify all events precisely is introducing a sequence counter to
the variable name.
The sequence counter has already supported in a pstore layer with "oopscount".
So, this patch adds it to a variable name.
Also, it is passed to read/erase callbacks of platform drivers in accordance with
the modification of the variable name.
<before applying this patch>
a variable name of first event: dump-type0-1-12345678
a variable name of second event: dump-type0-1-12345678
type:0
id:1
ctime:12345678
If multiple events happen in a short time, efi_pstore can't distinguish them because
variable names are same among them.
<after applying this patch>
it can be distinguishable by adding a sequence counter as follows.
a variable name of first event: dump-type0-1-1-12345678
a variable name of Second event: dump-type0-1-2-12345678
type:0
id:1
sequence counter: 1(first event), 2(second event)
ctime:12345678
Signed-off-by: Seiji Aguchi <sei...@hd...>
---
drivers/acpi/apei/erst.c | 12 ++++++------
drivers/firmware/efivars.c | 18 +++++++++++-------
fs/pstore/inode.c | 8 +++++---
fs/pstore/internal.h | 2 +-
fs/pstore/platform.c | 11 ++++++-----
fs/pstore/ram.c | 7 +++----
include/linux/pstore.h | 8 +++++---
7 files changed, 37 insertions(+), 29 deletions(-)
diff --git a/drivers/acpi/apei/erst.c b/drivers/acpi/apei/erst.c
index 0bd6ae4..6d894bf 100644
--- a/drivers/acpi/apei/erst.c
+++ b/drivers/acpi/apei/erst.c
@@ -931,13 +931,13 @@ static int erst_check_table(struct acpi_table_erst *erst_tab)
static int erst_open_pstore(struct pstore_info *psi);
static int erst_close_pstore(struct pstore_info *psi);
-static ssize_t erst_reader(u64 *id, enum pstore_type_id *type,
+static ssize_t erst_reader(u64 *id, enum pstore_type_id *type, int *count,
struct timespec *time, char **buf,
struct pstore_info *psi);
static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
- u64 *id, unsigned int part,
+ u64 *id, unsigned int part, int count,
size_t size, struct pstore_info *psi);
-static int erst_clearer(enum pstore_type_id type, u64 id,
+static int erst_clearer(enum pstore_type_id type, u64 id, int count,
struct timespec time, struct pstore_info *psi);
static struct pstore_info erst_info = {
@@ -987,7 +987,7 @@ static int erst_close_pstore(struct pstore_info *psi)
return 0;
}
-static ssize_t erst_reader(u64 *id, enum pstore_type_id *type,
+static ssize_t erst_reader(u64 *id, enum pstore_type_id *type, int *count,
struct timespec *time, char **buf,
struct pstore_info *psi)
{
@@ -1055,7 +1055,7 @@ out:
}
static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
- u64 *id, unsigned int part,
+ u64 *id, unsigned int part, int count,
size_t size, struct pstore_info *psi)
{
struct cper_pstore_record *rcd = (struct cper_pstore_record *)
@@ -1101,7 +1101,7 @@ static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
return ret;
}
-static int erst_clearer(enum pstore_type_id type, u64 id,
+static int erst_clearer(enum pstore_type_id type, u64 id, int count,
struct timespec time, struct pstore_info *psi)
{
return erst_clear(id);
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index 6cbeea7..dc69802 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -658,13 +658,14 @@ static int efi_pstore_close(struct pstore_info *psi)
}
static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
- struct timespec *timespec,
+ int *count, struct timespec *timespec,
char **buf, struct pstore_info *psi)
{
efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
struct efivars *efivars = psi->data;
char name[DUMP_NAME_LEN];
int i;
+ int cnt;
unsigned int part, size;
unsigned long time;
@@ -674,8 +675,10 @@ static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
for (i = 0; i < DUMP_NAME_LEN; i++) {
name[i] = efivars->walk_entry->var.VariableName[i];
}
- if (sscanf(name, "dump-type%u-%u-%lu", type, &part, &time) == 3) {
+ if (sscanf(name, "dump-type%u-%u-%d-%lu",
+ type, &part, &cnt, &time) == 4) {
*id = part;
+ *count = cnt;
timespec->tv_sec = time;
timespec->tv_nsec = 0;
get_var_data_locked(efivars, &efivars->walk_entry->var);
@@ -698,7 +701,8 @@ static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
static int efi_pstore_write(enum pstore_type_id type,
enum kmsg_dump_reason reason, u64 *id,
- unsigned int part, size_t size, struct pstore_info *psi)
+ unsigned int part, int count, size_t size,
+ struct pstore_info *psi)
{
char name[DUMP_NAME_LEN];
efi_char16_t efi_name[DUMP_NAME_LEN];
@@ -725,7 +729,7 @@ static int efi_pstore_write(enum pstore_type_id type,
return -ENOSPC;
}
- sprintf(name, "dump-type%u-%u-%lu", type, part,
+ sprintf(name, "dump-type%u-%u-%d-%lu", type, part, count,
get_seconds());
for (i = 0; i < DUMP_NAME_LEN; i++)
@@ -746,7 +750,7 @@ static int efi_pstore_write(enum pstore_type_id type,
return ret;
};
-static int efi_pstore_erase(enum pstore_type_id type, u64 id,
+static int efi_pstore_erase(enum pstore_type_id type, u64 id, int count,
struct timespec time, struct pstore_info *psi)
{
char name[DUMP_NAME_LEN];
@@ -756,7 +760,7 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id,
struct efivar_entry *entry, *found = NULL;
int i;
- sprintf(name, "dump-type%u-%u-%lu", type, (unsigned int)id,
+ sprintf(name, "dump-type%u-%u-%d-%lu", type, (unsigned int)id, count,
time.tv_sec);
spin_lock(&efivars->lock);
@@ -807,7 +811,7 @@ static int efi_pstore_close(struct pstore_info *psi)
return 0;
}
-static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
+static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type, int *count,
struct timespec *timespec,
char **buf, struct pstore_info *psi)
{
diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 4300af6..ed1d8c7 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -49,6 +49,7 @@ struct pstore_private {
struct pstore_info *psi;
enum pstore_type_id type;
u64 id;
+ int count;
ssize_t size;
char data[];
};
@@ -175,8 +176,8 @@ static int pstore_unlink(struct inode *dir, struct dentry *dentry)
struct pstore_private *p = dentry->d_inode->i_private;
if (p->psi->erase)
- p->psi->erase(p->type, p->id, dentry->d_inode->i_ctime,
- p->psi);
+ p->psi->erase(p->type, p->id, p->count,
+ dentry->d_inode->i_ctime, p->psi);
return simple_unlink(dir, dentry);
}
@@ -271,7 +272,7 @@ int pstore_is_mounted(void)
* Load it up with "size" bytes of data from "buf".
* Set the mtime & ctime to the date that this record was originally stored.
*/
-int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id,
+int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id, int count,
char *data, size_t size, struct timespec time,
struct pstore_info *psi)
{
@@ -307,6 +308,7 @@ int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id,
goto fail_alloc;
private->type = type;
private->id = id;
+ private->count = count;
private->psi = psi;
switch (type) {
diff --git a/fs/pstore/internal.h b/fs/pstore/internal.h
index 4847f58..937d820 100644
--- a/fs/pstore/internal.h
+++ b/fs/pstore/internal.h
@@ -50,7 +50,7 @@ extern struct pstore_info *psinfo;
extern void pstore_set_kmsg_bytes(int);
extern void pstore_get_records(int);
extern int pstore_mkfile(enum pstore_type_id, char *psname, u64 id,
- char *data, size_t size,
+ int count, char *data, size_t size,
struct timespec time, struct pstore_info *psi);
extern int pstore_is_mounted(void);
diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index a40da07..f911bbd 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -136,7 +136,7 @@ static void pstore_dump(struct kmsg_dumper *dumper,
break;
ret = psinfo->write(PSTORE_TYPE_DMESG, reason, &id, part,
- hsize + len, psinfo);
+ oopscount, hsize + len, psinfo);
if (ret == 0 && reason == KMSG_DUMP_OOPS && pstore_is_mounted())
pstore_new_entry = 1;
@@ -196,7 +196,7 @@ static void pstore_register_console(void) {}
static int pstore_write_compat(enum pstore_type_id type,
enum kmsg_dump_reason reason,
- u64 *id, unsigned int part,
+ u64 *id, unsigned int part, int count,
size_t size, struct pstore_info *psi)
{
return psi->write_buf(type, reason, id, part, psinfo->buf, size, psi);
@@ -266,6 +266,7 @@ void pstore_get_records(int quiet)
char *buf = NULL;
ssize_t size;
u64 id;
+ int count;
enum pstore_type_id type;
struct timespec time;
int failed = 0, rc;
@@ -277,9 +278,9 @@ void pstore_get_records(int quiet)
if (psi->open && psi->open(psi))
goto out;
- while ((size = psi->read(&id, &type, &time, &buf, psi)) > 0) {
- rc = pstore_mkfile(type, psi->name, id, buf, (size_t)size,
- time, psi);
+ while ((size = psi->read(&id, &type, &count, &time, &buf, psi)) > 0) {
+ rc = pstore_mkfile(type, psi->name, id, count, buf,
+ (size_t)size, time, psi);
kfree(buf);
buf = NULL;
if (rc && (rc != -EEXIST || !quiet))
diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
index 749693f..2bfa36e 100644
--- a/fs/pstore/ram.c
+++ b/fs/pstore/ram.c
@@ -132,9 +132,8 @@ ramoops_get_next_prz(struct persistent_ram_zone *przs[], uint *c, uint max,
}
static ssize_t ramoops_pstore_read(u64 *id, enum pstore_type_id *type,
- struct timespec *time,
- char **buf,
- struct pstore_info *psi)
+ int *count, struct timespec *time,
+ char **buf, struct pstore_info *psi)
{
ssize_t size;
struct ramoops_context *cxt = psi->data;
@@ -236,7 +235,7 @@ static int notrace ramoops_pstore_write_buf(enum pstore_type_id type,
return 0;
}
-static int ramoops_pstore_erase(enum pstore_type_id type, u64 id,
+static int ramoops_pstore_erase(enum pstore_type_id type, u64 id, int count,
struct timespec time, struct pstore_info *psi)
{
struct ramoops_context *cxt = psi->data;
diff --git a/include/linux/pstore.h b/include/linux/pstore.h
index f6e9336..1788909 100644
--- a/include/linux/pstore.h
+++ b/include/linux/pstore.h
@@ -50,17 +50,19 @@ struct pstore_info {
int (*open)(struct pstore_info *psi);
int (*close)(struct pstore_info *psi);
ssize_t (*read)(u64 *id, enum pstore_type_id *type,
- struct timespec *time, char **buf,
+ int *count, struct timespec *time, char **buf,
struct pstore_info *psi);
int (*write)(enum pstore_type_id type,
enum kmsg_dump_reason reason, u64 *id,
- unsigned int part, size_t size, struct pstore_info *psi);
+ unsigned int part, int count, size_t size,
+ struct pstore_info *psi);
int (*write_buf)(enum pstore_type_id type,
enum kmsg_dump_reason reason, u64 *id,
unsigned int part, const char *buf, size_t size,
struct pstore_info *psi);
int (*erase)(enum pstore_type_id type, u64 id,
- struct timespec time, struct pstore_info *psi);
+ int count, struct timespec time,
+ struct pstore_info *psi);
void *data;
};
-- 1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-10-26 21:41:51
|
[Issue]
Currently, a variable name, which is used to identify each log entry, consists of type,
id and ctime. But an erase callback does not use ctime.
If efi_pstore supported just one log, type and id were enough.
However, in case of supporting multiple logs, it doesn't work because
it can't distinguish each entry without ctime at erasing time.
<Example>
As you can see below, efi_pstore can't differentiate first event from second one without ctime.
a variable name of first event: dump-type0-1-12345678
a variable name of second event: dump-type0-1-23456789
type:0
id:1
ctime:12345678, 23456789
[Solution]
This patch adds ctime to an argument of an erase callback.
It works across reboots because ctime of pstore means the date that the record was originally stored.
To do this, efi_pstore saves the ctime to variable name at writing time and passes it to pstore
at reading time.
Signed-off-by: Seiji Aguchi <sei...@hd...>
---
drivers/acpi/apei/erst.c | 4 ++--
drivers/firmware/efivars.c | 15 +++++++--------
fs/pstore/inode.c | 3 ++-
fs/pstore/ram.c | 2 +-
include/linux/pstore.h | 2 +-
5 files changed, 13 insertions(+), 13 deletions(-)
diff --git a/drivers/acpi/apei/erst.c b/drivers/acpi/apei/erst.c
index e4d9d24..0bd6ae4 100644
--- a/drivers/acpi/apei/erst.c
+++ b/drivers/acpi/apei/erst.c
@@ -938,7 +938,7 @@ static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
u64 *id, unsigned int part,
size_t size, struct pstore_info *psi);
static int erst_clearer(enum pstore_type_id type, u64 id,
- struct pstore_info *psi);
+ struct timespec time, struct pstore_info *psi);
static struct pstore_info erst_info = {
.owner = THIS_MODULE,
@@ -1102,7 +1102,7 @@ static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
}
static int erst_clearer(enum pstore_type_id type, u64 id,
- struct pstore_info *psi)
+ struct timespec time, struct pstore_info *psi)
{
return erst_clear(id);
}
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index fbe9202..6cbeea7 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -747,24 +747,25 @@ static int efi_pstore_write(enum pstore_type_id type,
};
static int efi_pstore_erase(enum pstore_type_id type, u64 id,
- struct pstore_info *psi)
+ struct timespec time, struct pstore_info *psi)
{
- char stub_name[DUMP_NAME_LEN];
+ char name[DUMP_NAME_LEN];
efi_char16_t efi_name[DUMP_NAME_LEN];
efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
struct efivars *efivars = psi->data;
struct efivar_entry *entry, *found = NULL;
int i;
- sprintf(stub_name, "dump-type%u-%u-", type, (unsigned int)id);
+ sprintf(name, "dump-type%u-%u-%lu", type, (unsigned int)id,
+ time.tv_sec);
spin_lock(&efivars->lock);
for (i = 0; i < DUMP_NAME_LEN; i++)
- efi_name[i] = stub_name[i];
+ efi_name[i] = name[i];
/*
- * Clean up any entries with the same name
+ * Clean up an entry with the same name
*/
list_for_each_entry(entry, &efivars->list, list) {
@@ -775,9 +776,6 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id,
if (utf16_strncmp(entry->var.VariableName, efi_name,
utf16_strlen(efi_name)))
continue;
- /* Needs to be a prefix */
- if (entry->var.VariableName[utf16_strlen(efi_name)] == 0)
- continue;
/* found */
found = entry;
@@ -785,6 +783,7 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id,
&entry->var.VendorGuid,
PSTORE_EFI_ATTRIBUTES,
0, NULL);
+ break;
}
if (found)
diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 4ab572e..4300af6 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -175,7 +175,8 @@ static int pstore_unlink(struct inode *dir, struct dentry *dentry)
struct pstore_private *p = dentry->d_inode->i_private;
if (p->psi->erase)
- p->psi->erase(p->type, p->id, p->psi);
+ p->psi->erase(p->type, p->id, dentry->d_inode->i_ctime,
+ p->psi);
return simple_unlink(dir, dentry);
}
diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
index 1a4f6da..749693f 100644
--- a/fs/pstore/ram.c
+++ b/fs/pstore/ram.c
@@ -237,7 +237,7 @@ static int notrace ramoops_pstore_write_buf(enum pstore_type_id type,
}
static int ramoops_pstore_erase(enum pstore_type_id type, u64 id,
- struct pstore_info *psi)
+ struct timespec time, struct pstore_info *psi)
{
struct ramoops_context *cxt = psi->data;
struct persistent_ram_zone *prz;
diff --git a/include/linux/pstore.h b/include/linux/pstore.h
index ee3034a..f6e9336 100644
--- a/include/linux/pstore.h
+++ b/include/linux/pstore.h
@@ -60,7 +60,7 @@ struct pstore_info {
unsigned int part, const char *buf, size_t size,
struct pstore_info *psi);
int (*erase)(enum pstore_type_id type, u64 id,
- struct pstore_info *psi);
+ struct timespec time, struct pstore_info *psi);
void *data;
};
-- 1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-10-26 21:41:17
|
[Issue]
Currently, efi_pstore driver simply overwrites existing panic messages in NVRAM.
So, in the following scenario, we will lose 1st panic messages.
1. kernel panics.
2. efi_pstore is kicked and writes panic messages to NVRAM.
3. system reboots.
4. kernel panics again before a user checks the 1st panic messages in NVRAM.
[Solution]
A reasonable solution to fix the issue is just holding multiple logs without erasing
existing entries.
This patch removes a logic erasing existing entries in a write callback
because the logic is not needed in the write callback to support holding multiple logs.
Signed-off-by: Seiji Aguchi <sei...@hd...>
---
drivers/firmware/efivars.c | 39 ++-------------------------------------
1 files changed, 2 insertions(+), 37 deletions(-)
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index bee14cc..fbe9202 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -701,18 +701,13 @@ static int efi_pstore_write(enum pstore_type_id type,
unsigned int part, size_t size, struct pstore_info *psi)
{
char name[DUMP_NAME_LEN];
- char stub_name[DUMP_NAME_LEN];
efi_char16_t efi_name[DUMP_NAME_LEN];
efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
struct efivars *efivars = psi->data;
- struct efivar_entry *entry, *found = NULL;
int i, ret = 0;
u64 storage_space, remaining_space, max_variable_size;
efi_status_t status = EFI_NOT_FOUND;
- sprintf(stub_name, "dump-type%u-%u-", type, part);
- sprintf(name, "%s%lu", stub_name, get_seconds());
-
spin_lock(&efivars->lock);
/*
@@ -730,35 +725,8 @@ static int efi_pstore_write(enum pstore_type_id type,
return -ENOSPC;
}
- for (i = 0; i < DUMP_NAME_LEN; i++)
- efi_name[i] = stub_name[i];
-
- /*
- * Clean up any entries with the same name
- */
-
- list_for_each_entry(entry, &efivars->list, list) {
- get_var_data_locked(efivars, &entry->var);
-
- if (efi_guidcmp(entry->var.VendorGuid, vendor))
- continue;
- if (utf16_strncmp(entry->var.VariableName, efi_name,
- utf16_strlen(efi_name)))
- continue;
- /* Needs to be a prefix */
- if (entry->var.VariableName[utf16_strlen(efi_name)] == 0)
- continue;
-
- /* found */
- found = entry;
- efivars->ops->set_variable(entry->var.VariableName,
- &entry->var.VendorGuid,
- PSTORE_EFI_ATTRIBUTES,
- 0, NULL);
- }
-
- if (found)
- list_del(&found->list);
+ sprintf(name, "dump-type%u-%u-%lu", type, part,
+ get_seconds());
for (i = 0; i < DUMP_NAME_LEN; i++)
efi_name[i] = name[i];
@@ -768,9 +736,6 @@ static int efi_pstore_write(enum pstore_type_id type,
spin_unlock(&efivars->lock);
- if (found)
- efivar_unregister(found);
-
if (size)
ret = efivar_create_sysfs_entry(efivars,
utf16_strsize(efi_name,
-- 1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-10-26 21:40:47
|
[Issue]
Currently, efi_pstore driver simply overwrites existing panic messages in NVRAM.
So, in the following scenario, we will lose 1st panic messages.
1. kernel panics.
2. efi_pstore is kicked and writes panic messages to NVRAM.
3. system reboots.
4. kernel panics again before a user checks the 1st panic messages in NVRAM.
[Solution]
A reasonable solution to fix the issue is just holding multiple logs without erasing
existing entries.
This patch freshly adds a logic erasing existing entries, which shared with a write callback,
to an erase callback.
To support holding multiple logs, the write callback doesn't need to erase any entries and
it will be removed in a subsequent patch.
Signed-off-by: Seiji Aguchi <sei...@hd...>
---
drivers/firmware/efivars.c | 46 +++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 45 insertions(+), 1 deletions(-)
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index 37ac21a..bee14cc 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -784,7 +784,51 @@ static int efi_pstore_write(enum pstore_type_id type,
static int efi_pstore_erase(enum pstore_type_id type, u64 id,
struct pstore_info *psi)
{
- efi_pstore_write(type, 0, &id, (unsigned int)id, 0, psi);
+ char stub_name[DUMP_NAME_LEN];
+ efi_char16_t efi_name[DUMP_NAME_LEN];
+ efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
+ struct efivars *efivars = psi->data;
+ struct efivar_entry *entry, *found = NULL;
+ int i;
+
+ sprintf(stub_name, "dump-type%u-%u-", type, (unsigned int)id);
+
+ spin_lock(&efivars->lock);
+
+ for (i = 0; i < DUMP_NAME_LEN; i++)
+ efi_name[i] = stub_name[i];
+
+ /*
+ * Clean up any entries with the same name
+ */
+
+ list_for_each_entry(entry, &efivars->list, list) {
+ get_var_data_locked(efivars, &entry->var);
+
+ if (efi_guidcmp(entry->var.VendorGuid, vendor))
+ continue;
+ if (utf16_strncmp(entry->var.VariableName, efi_name,
+ utf16_strlen(efi_name)))
+ continue;
+ /* Needs to be a prefix */
+ if (entry->var.VariableName[utf16_strlen(efi_name)] == 0)
+ continue;
+
+ /* found */
+ found = entry;
+ efivars->ops->set_variable(entry->var.VariableName,
+ &entry->var.VendorGuid,
+ PSTORE_EFI_ATTRIBUTES,
+ 0, NULL);
+ }
+
+ if (found)
+ list_del(&found->list);
+
+ spin_unlock(&efivars->lock);
+
+ if (found)
+ efivar_unregister(found);
return 0;
}
-- 1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-10-26 21:40:13
|
[Issue] As discussed in a thread below, Running out of space in EFI isn't a well-tested scenario. And we wouldn't expect all firmware to handle it gracefully. http://marc.info/?l=linux-kernel&m=134305325801789&w=2 On the other hand, current efi_pstore doesn't check a remaining space of storage at writing time. Therefore, efi_pstore may not work if it tries to write a large amount of data. [Patch Description] To avoid handling the situation above, this patch checks if there is a space enough to log with QueryVariableInfo() before writing data. Signed-off-by: Seiji Aguchi <sei...@hd...> Acked-by: Mike Waychison <mi...@go...> --- drivers/firmware/efivars.c | 18 ++++++++++++++++++ include/linux/efi.h | 1 + 2 files changed, 19 insertions(+), 0 deletions(-) diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c index d10c987..37ac21a 100644 --- a/drivers/firmware/efivars.c +++ b/drivers/firmware/efivars.c @@ -707,12 +707,29 @@ static int efi_pstore_write(enum pstore_type_id type, struct efivars *efivars = psi->data; struct efivar_entry *entry, *found = NULL; int i, ret = 0; + u64 storage_space, remaining_space, max_variable_size; + efi_status_t status = EFI_NOT_FOUND; sprintf(stub_name, "dump-type%u-%u-", type, part); sprintf(name, "%s%lu", stub_name, get_seconds()); spin_lock(&efivars->lock); + /* + * Check if there is a space enough to log. + * size: a size of logging data + * DUMP_NAME_LEN * 2: a maximum size of variable name + */ + status = efivars->ops->query_variable_info(PSTORE_EFI_ATTRIBUTES, + &storage_space, + &remaining_space, + &max_variable_size); + if (status || remaining_space < size + DUMP_NAME_LEN * 2) { + spin_unlock(&efivars->lock); + *id = part; + return -ENOSPC; + } + for (i = 0; i < DUMP_NAME_LEN; i++) efi_name[i] = stub_name[i]; @@ -1237,6 +1254,7 @@ efivars_init(void) ops.get_variable = efi.get_variable; ops.set_variable = efi.set_variable; ops.get_next_variable = efi.get_next_variable; + ops.query_variable_info = efi.query_variable_info; error = register_efivars(&__efivars, &ops, efi_kobj); if (error) goto err_put; diff --git a/include/linux/efi.h b/include/linux/efi.h index 8670eb1..c47ec36 100644 --- a/include/linux/efi.h +++ b/include/linux/efi.h @@ -643,6 +643,7 @@ struct efivar_operations { efi_get_variable_t *get_variable; efi_get_next_variable_t *get_next_variable; efi_set_variable_t *set_variable; + efi_query_variable_info_t *query_variable_info; }; struct efivars { -- 1.7.1 |
|
From: Seiji A. <sei...@hd...> - 2012-10-26 21:39:33
|
Changelog
v2 -> v3
- Create patches 6/7 and 7/7 to work with an existing format of variable name
v1 -> v2
- Separate into 5 patches in accordance with Mike's comment
- Erase an extra line of comment in patch 1/5
[Issue]
Currently, efi_pstore driver simply overwrites existing panic messages in NVRAM.
So, in the following scenario, we will lose 1st panic messages.
1. kernel panics.
2. efi_pstore is kicked and writes panic messages to NVRAM.
3. system reboots.
4. kernel panics again before a user checks the 1st panic messages in NVRAM.
[Solution]
Solutions of this problem has been discussed among Tony, Matthew, Don, Mike and me.
http://marc.info/?l=linux-kernel&m=134273270704586&w=2
And there are two possible solutions right now.
- First one is introducing some policy overwriting existing logs.
- Second one is simply holding multiple log without overwriting any entries.
We haven't decided the overwriting policy which is reasonable to all users yet.
But I believe we agree that just holding multiple logs is a reasonable way.
We may need further discussions to find the possibility of introducing overwriting
policy, especially getting critical messages in multiple oops case.
But I would like to begin with a simple and reasonable way to everyone.
So, this patch takes an approach just holding multiple logs.
[Patch Description]
(1/7) efi_pstore: Check remaining space with QueryVariableInfo() before writing data
(2/7) efi_pstore: Add a logic erasing entries to an erase callback
(3/7) efi_pstore: Remove a logic erasing entries from a write callback to hold multiple logs
(4/7) efi_pstore: Add ctime to argument of erase callback
(5/7) efi_pstore: Change a format of a variable name by adding a sequence counter
(6/7) efi_pstore: Add a format check for an existing variable name at reading time
(7/7) efi_pstore: Add a format check for an existing variable name at erasing time
Detailed explanations are written in each patch.
drivers/acpi/apei/erst.c | 16 +++---
drivers/firmware/efivars.c | 150 ++++++++++++++++++++++++++++++-------------
fs/pstore/inode.c | 7 ++-
fs/pstore/internal.h | 2 +-
fs/pstore/platform.c | 11 ++--
fs/pstore/ram.c | 9 +--
include/linux/efi.h | 1 +
include/linux/pstore.h | 6 +-
8 files changed, 133 insertions(+), 69 deletions(-)
|
|
From: Seiji A. <sei...@hd...> - 2012-10-25 22:19:08
|
Peter, I made a patch below making a time penalty zero when not being traced in accordance with your comment. But I'm not sure if it is reasonable to non-tracepoint users. I agree that a tracepoint should be designed to minimize its impact for non-tracepoint users. But duplicating a function call by using macro makes a code readablity worse. It has a big impact for all users although a time penalty is zero. As compared the time penalty to the code readability, I think it is reasonable for all users to just add a tracepoint to an existing function call because ,as Steven said, a tracepoint is just nops when not being traced. Seiji > -----Original Message----- > From: Seiji Aguchi > Sent: Thursday, October 18, 2012 2:41 PM > To: 'H. Peter Anvin' > Cc: 'Thomas Gleixner (tg...@li...)'; 'lin...@vg...'; ''mi...@el...' (mi...@el...)'; 'x8...@ke...'; 'dle- > de...@li...'; Satoru Moriya; 'Borislav Petkov'; ro...@go... > Subject: RE: [RFC][PATCH v5]trace,x86: add x86 irq vector tracepoints > > Any comment? > > Seiji > > > -----Original Message----- > > From: Seiji Aguchi > > Sent: Thursday, October 11, 2012 1:25 PM > > To: 'H. Peter Anvin'; 'Steven Rostedt' > > Cc: 'Thomas Gleixner (tg...@li...)'; > > 'lin...@vg...'; ''mi...@el...' (mi...@el...)'; 'x8...@ke...'; 'dle- de...@li...'; Satoru > Moriya; 'Borislav Petkov' > > Subject: [RFC][PATCH v5]trace,x86: add x86 irq vector tracepoints > > > > Change log > > > > v4 -> v5 > > - Rebased to 3.6.0 > > > > - Introduce a logic switching IDT at enabling/disabling TP time > > so that a time penalty makes a zero when tracepoints are disabled. > > This IDT is created only when CONFIG_TRACEPOINTS is enabled. > > > > - Remove arch_irq_vector_entry/exit and add followings again > > so that we can add each tracepoint in a generic way. > > - error_apic_vector > > - thermal_apic_vector > > - threshold_apic_vector > > - spurious_apic_vector > > - x86_platform_ipi_vector > > > > - Drop nmi tracepoints to begin with apic interrupts and discuss a logic switching > > IDT first. > > > > - Move irq_vectors.h in the directory of arch/x86/include/asm/trace because > > I'm not sure if a logic switching IDT is sharable with other architectures. > > > > v3 -> v4 > > - Add a latency measurement of each tracepoint > > - Rebased to 3.6-rc6 > > > > v2 -> v3 > > - Remove an invalidate_tlb_vector event because it was replaced by a call function vector > > in a following commit. > > > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit; > > h=52aec3308db85f4e9f5c8b9f5dc4fbd0138c6fa4 > > > > v1 -> v2 > > - Modify variable name from irq to vector. > > - Merge arch-specific tracepoints below to an arch_irq_vector_entry/exit. > > - error_apic_vector > > - thermal_apic_vector > > - threshold_apic_vector > > - spurious_apic_vector > > - x86_platform_ipi_vector > > > > [Purpose of this patch] > > > > As Vaibhav explained in the thread below, tracepoints for irq vectors are useful. > > > > http://www.spinics.net/lists/mm-commits/msg85707.html > > > > <snip> > > The current interrupt traces from irq_handler_entry and > > irq_handler_exit provide when an interrupt is handled. They provide good data about when the system has switched to kernel > space and how it affects the currently running processes. > > > > There are some IRQ vectors which trigger the system into kernel space, > > which are not handled in generic IRQ handlers. Tracing such events gives us the information about IRQ interaction with other system > events. > > > > The trace also tells where the system is spending its time. We want > > to know which cores are handling interrupts and how they are affecting > > other processes in the system. Also, the trace provides information about when the cores are idle and which interrupts are > changing that state. > > <snip> > > > > On the other hand, my usecase is tracing just local timer event and getting a value of instruction pointer. > > > > I suggested to add an argument local timer event to get instruction pointer before. > > But there is another way to get it with external module like systemtap. > > So, I don't need to add any argument to irq vector tracepoints now. > > > > [Patch Description] > > > > Vaibhav's patch shared a trace point ,irq_vector_entry/irq_vector_exit, in all events. > > But there is an above use case to trace specific irq_vector rather than tracing all events. > > In this case, we are concerned about overhead due to unwanted events. > > > > This patch adds following tracepoints instead of introducing irq_vector_entry/exit. > > so that we can enable them independently. > > - local_timer_vector > > - reschedule_vector > > - call_function_vector > > - call_function_single_vector > > - irq_work_entry_vector > > - error_apic_vector > > - thermal_apic_vector > > - threshold_apic_vector > > - spurious_apic_vector > > - x86_platform_ipi_vector > > > > Also, it introduces a logic switching IDT at enabling/disabling time > > so that a time penalty makes a complete zero when tracepoints are disabled. Detailed explanations are as follows. > > - Create new irq handlers inserted tracepoints by using macros. > > - Create a new IDT, trace_idt_table, at boot time by duplicating original IDT, idt table, and > > registering the new handers for tracpoints. > > - Switch IDT to new one at enabling TP time. > > - Restore to an original IDT at disabling TP time. > > The new IDT is created only when CONFIG_TRACEPOINTS is enabled to avoid being used for other purposes. > > > > Signed-off-by: Seiji Aguchi <sei...@hd...> > > --- > > arch/x86/include/asm/desc.h | 27 +++++ > > arch/x86/include/asm/entry_arch.h | 32 +++++ > > arch/x86/include/asm/hw_irq.h | 14 +++ > > arch/x86/include/asm/trace/irq_vectors.h | 153 ++++++++++++++++++++++++ > > arch/x86/kernel/Makefile | 1 + > > arch/x86/kernel/apic/apic.c | 186 +++++++++++++++++------------- > > arch/x86/kernel/cpu/mcheck/therm_throt.c | 26 +++-- > > arch/x86/kernel/cpu/mcheck/threshold.c | 27 +++-- > > arch/x86/kernel/entry_64.S | 33 ++++++ > > arch/x86/kernel/head_64.S | 6 + > > arch/x86/kernel/irq.c | 44 ++++--- > > arch/x86/kernel/irq_work.c | 22 +++- > > arch/x86/kernel/irqinit.c | 2 + > > arch/x86/kernel/smp.c | 68 ++++++++---- > > arch/x86/kernel/tracepoint.c | 102 ++++++++++++++++ > > 15 files changed, 600 insertions(+), 143 deletions(-) create mode > > 100644 arch/x86/include/asm/trace/irq_vectors.h > > create mode 100644 arch/x86/kernel/tracepoint.c > > > > diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h > > index 8bf1c06..52becf4 100644 > > --- a/arch/x86/include/asm/desc.h > > +++ b/arch/x86/include/asm/desc.h > > @@ -345,6 +345,33 @@ static inline void set_intr_gate(unsigned int n, void *addr) > > _set_gate(n, GATE_INTERRUPT, addr, 0, 0, __KERNEL_CS); } > > > > +#ifdef CONFIG_TRACEPOINTS > > +extern gate_desc trace_idt_table[]; > > +extern void trace_idt_table_init(void); static inline void > > +_trace_set_gate(int gate, unsigned type, void *addr, > > + unsigned dpl, unsigned ist, unsigned seg) { > > + gate_desc s; > > + > > + pack_gate(&s, type, (unsigned long)addr, dpl, ist, seg); > > + /* > > + * does not need to be atomic because it is only done once at > > + * setup time > > + */ > > + write_idt_entry(trace_idt_table, gate, &s); } > > + > > +static inline void trace_set_intr_gate(unsigned int n, void *addr) { > > + BUG_ON((unsigned)n > 0xFF); > > + _trace_set_gate(n, GATE_INTERRUPT, addr, 0, 0, __KERNEL_CS); } #else > > +static inline void trace_idt_table_init(void) { } #endif > > + > > extern int first_system_vector; > > /* used_vectors is BITMAP for irq is not managed by percpu vector_irq > > */ extern unsigned long used_vectors[]; diff --git > > a/arch/x86/include/asm/entry_arch.h > > b/arch/x86/include/asm/entry_arch.h > > index 40afa00..8ef3900 100644 > > --- a/arch/x86/include/asm/entry_arch.h > > +++ b/arch/x86/include/asm/entry_arch.h > > @@ -45,3 +45,35 @@ > > BUILD_INTERRUPT(threshold_interrupt,THRESHOLD_APIC_VECTOR) > > #endif > > > > #endif > > + > > +#ifdef CONFIG_TRACEPOINTS > > +#ifdef CONFIG_SMP > > +BUILD_INTERRUPT(trace_reschedule_interrupt, RESCHEDULE_VECTOR) > > +BUILD_INTERRUPT(trace_call_function_interrupt, CALL_FUNCTION_VECTOR) > > +BUILD_INTERRUPT(trace_call_function_single_interrupt, > > + CALL_FUNCTION_SINGLE_VECTOR) > > +#endif > > + > > +BUILD_INTERRUPT(trace_x86_platform_ipi, X86_PLATFORM_IPI_VECTOR) > > + > > +#ifdef CONFIG_X86_LOCAL_APIC > > + > > +BUILD_INTERRUPT(trace_apic_timer_interrupt, LOCAL_TIMER_VECTOR) > > +BUILD_INTERRUPT(trace_error_interrupt, ERROR_APIC_VECTOR) > > +BUILD_INTERRUPT(trace_spurious_interrupt, SPURIOUS_APIC_VECTOR) > > + > > +#ifdef CONFIG_IRQ_WORK > > +BUILD_INTERRUPT(trace_irq_work_interrupt, IRQ_WORK_VECTOR) #endif > > + > > +#ifdef CONFIG_X86_THERMAL_VECTOR > > +BUILD_INTERRUPT(trace_thermal_interrupt, THERMAL_APIC_VECTOR) #endif > > + > > +#ifdef CONFIG_X86_MCE_THRESHOLD > > +BUILD_INTERRUPT(trace_threshold_interrupt, THRESHOLD_APIC_VECTOR) > > +#endif > > + > > +#endif > > + > > +#endif /* CONFIG_TRACEPOINTS */ > > diff --git a/arch/x86/include/asm/hw_irq.h > > b/arch/x86/include/asm/hw_irq.h index eb92a6e..4472a78 100644 > > --- a/arch/x86/include/asm/hw_irq.h > > +++ b/arch/x86/include/asm/hw_irq.h > > @@ -76,6 +76,20 @@ extern void threshold_interrupt(void); extern void > > call_function_interrupt(void); extern void > > call_function_single_interrupt(void); > > > > +#ifdef CONFIG_TRACEPOINTS > > +/* Interrupt handlers registered during init_IRQ */ extern void > > +trace_apic_timer_interrupt(void); > > +extern void trace_x86_platform_ipi(void); extern void > > +trace_error_interrupt(void); extern void > > +trace_irq_work_interrupt(void); extern void > > +trace_spurious_interrupt(void); extern void > > +trace_thermal_interrupt(void); extern void > > +trace_reschedule_interrupt(void); > > +extern void trace_threshold_interrupt(void); extern void > > +trace_call_function_interrupt(void); > > +extern void trace_call_function_single_interrupt(void); > > +#endif /* CONFIG_TRACEPOINTS */ > > + > > /* IOAPIC */ > > #define IO_APIC_IRQ(x) (((x) >= NR_IRQS_LEGACY) || ((1<<(x)) & > > io_apic_irqs)) extern unsigned long io_apic_irqs; diff --git > > a/arch/x86/include/asm/trace/irq_vectors.h > > b/arch/x86/include/asm/trace/irq_vectors.h > > new file mode 100644 > > index 0000000..47858f1 > > --- /dev/null > > +++ b/arch/x86/include/asm/trace/irq_vectors.h > > @@ -0,0 +1,153 @@ > > +#undef TRACE_SYSTEM > > +#define TRACE_SYSTEM irq_vectors > > + > > +#if !defined(_TRACE_IRQ_VECTORS_H) || > > +defined(TRACE_HEADER_MULTI_READ) #define _TRACE_IRQ_VECTORS_H > > + > > +#include <linux/tracepoint.h> > > + > > +extern void trace_irq_vector_regfunc(void); extern void > > +trace_irq_vector_unregfunc(void); > > + > > +#define DECLARE_IRQ_VECTOR_EVENT(name) \ > > +TRACE_EVENT_FN(name, \ > > + TP_PROTO(int vector), \ > > + \ > > + TP_ARGS(vector), \ > > + \ > > + TP_STRUCT__entry( \ > > + __field( int, vector ) \ > > + ), \ > > + \ > > + TP_fast_assign( \ > > + __entry->vector = vector; \ > > + ), \ > > + \ > > + TP_printk("vector=%d", __entry->vector), \ > > + trace_irq_vector_regfunc, trace_irq_vector_unregfunc \ > > +); > > + > > +/* > > + * local_timer_entry - called before enterring a local timer > > +interrupt > > + * vector handler > > + */ > > +DECLARE_IRQ_VECTOR_EVENT(local_timer_entry) > > + > > +/* > > + * local_timer_exit - called immediately after the interrupt vector > > + * handler returns > > + */ > > +DECLARE_IRQ_VECTOR_EVENT(local_timer_exit) > > + > > +/* > > + * reschedule_entry - called before enterring a reschedule vector > > +handler */ > > +DECLARE_IRQ_VECTOR_EVENT(reschedule_entry) > > + > > +/* > > + * reschedule_exit - called immediately after the interrupt vector > > + * handler returns > > + */ > > +DECLARE_IRQ_VECTOR_EVENT(reschedule_exit) > > + > > +/* > > + * spurious_apic_entry - called before enterring a spurious apic > > +vector handler */ > > +DECLARE_IRQ_VECTOR_EVENT(spurious_apic_entry) > > + > > +/* > > + * spurious_apic_exit - called immediately after the interrupt vector > > + * handler returns > > + */ > > +DECLARE_IRQ_VECTOR_EVENT(spurious_apic_exit) > > + > > +/* > > + * error_apic_entry - called before enterring an error apic vector > > +handler */ > > +DECLARE_IRQ_VECTOR_EVENT(error_apic_entry) > > + > > +/* > > + * error_apic_exit - called immediately after the interrupt vector > > + * handler returns > > + */ > > +DECLARE_IRQ_VECTOR_EVENT(error_apic_exit) > > + > > +/* > > + * x86_platform_ipi_entry - called before enterring a x86 platform > > +ipi interrupt > > + * vector handler > > + */ > > +DECLARE_IRQ_VECTOR_EVENT(x86_platform_ipi_entry) > > + > > +/* > > + * x86_platform_ipi_exit - called immediately after the interrupt > > +vector > > + * handler returns > > + */ > > +DECLARE_IRQ_VECTOR_EVENT(x86_platform_ipi_exit) > > + > > +/* > > + * irq_work_entry - called before enterring a irq work interrupt > > + * vector handler > > + */ > > +DECLARE_IRQ_VECTOR_EVENT(irq_work_entry) > > + > > +/* > > + * irq_work_exit - called immediately after the interrupt vector > > + * handler returns > > + */ > > +DECLARE_IRQ_VECTOR_EVENT(irq_work_exit) > > + > > +/* > > + * call_function_entry - called before enterring a call function > > +interrupt > > + * vector handler > > + */ > > +DECLARE_IRQ_VECTOR_EVENT(call_function_entry) > > + > > +/* > > + * call_function_exit - called immediately after the interrupt vector > > + * handler returns > > + */ > > +DECLARE_IRQ_VECTOR_EVENT(call_function_exit) > > + > > +/* > > + * call_function_single_entry - called before enterring a call > > +function > > + * single interrupt vector handler > > + */ > > +DECLARE_IRQ_VECTOR_EVENT(call_function_single_entry) > > + > > +/* > > + * call_function_single_exit - called immediately after the interrupt > > +vector > > + * handler returns > > + */ > > +DECLARE_IRQ_VECTOR_EVENT(call_function_single_exit) > > + > > +/* > > + * threshold_apic_entry - called before enterring a threshold apic > > +interrupt > > + * vector handler > > + */ > > +DECLARE_IRQ_VECTOR_EVENT(threshold_apic_entry) > > + > > +/* > > + * threshold_apic_exit - called immediately after the interrupt > > +vector > > + * handler returns > > + */ > > +DECLARE_IRQ_VECTOR_EVENT(threshold_apic_exit) > > + > > +/* > > + * thermal_apic_entry - called before enterring a thermal apic > > +interrupt > > + * vector handler > > + */ > > +DECLARE_IRQ_VECTOR_EVENT(thermal_apic_entry) > > + > > +/* > > + * thrmal_apic_exit - called immediately after the interrupt vector > > + * handler returns > > + */ > > +DECLARE_IRQ_VECTOR_EVENT(thermal_apic_exit) > > + > > +#undef TRACE_INCLUDE_PATH > > +#define TRACE_INCLUDE_PATH ../../arch/x86/include/asm/trace #define > > +TRACE_INCLUDE_FILE irq_vectors #endif /* _TRACE_IRQ_VECTORS_H */ > > + > > +/* This part must be outside protection */ #include > > +<trace/define_trace.h> > > diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index > > 91ce48f..fe4635d 100644 > > --- a/arch/x86/kernel/Makefile > > +++ b/arch/x86/kernel/Makefile > > @@ -100,6 +100,7 @@ obj-$(CONFIG_OF) += devicetree.o > > obj-$(CONFIG_UPROBES) += uprobes.o > > > > obj-$(CONFIG_PERF_EVENTS) += perf_regs.o > > +obj-$(CONFIG_TRACEPOINTS) += tracepoint.o > > > > ### > > # 64 bit specific files > > diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c > > index b17416e..abbee29 100644 > > --- a/arch/x86/kernel/apic/apic.c > > +++ b/arch/x86/kernel/apic/apic.c > > @@ -55,6 +55,9 @@ > > #include <asm/tsc.h> > > #include <asm/hypervisor.h> > > > > +#define CREATE_TRACE_POINTS > > +#include <asm/trace/irq_vectors.h> > > + > > unsigned int num_processors; > > > > unsigned disabled_cpus __cpuinitdata; @@ -879,27 +882,34 @@ static > > void local_apic_timer_interrupt(void) > > * [ if a single-CPU system runs an SMP kernel then we call the local > > * interrupt as well. Thus we cannot inline the local irq ... ] > > */ > > -void __irq_entry smp_apic_timer_interrupt(struct pt_regs *regs) -{ > > - struct pt_regs *old_regs = set_irq_regs(regs); > > - > > - /* > > - * NOTE! We'd better ACK the irq immediately, > > - * because timer handling can be slow. > > - */ > > - ack_APIC_irq(); > > - /* > > - * update_process_times() expects us to have done irq_enter(). > > - * Besides, if we don't timer interrupts ignore the global > > - * interrupt lock, which is the WrongThing (tm) to do. > > - */ > > - irq_enter(); > > - exit_idle(); > > - local_apic_timer_interrupt(); > > - irq_exit(); > > - > > - set_irq_regs(old_regs); > > -} > > +#define SMP_APIC_TIMER_INTERRUPT(trace, trace_enter, trace_exit) \ > > +void __irq_entry smp_##trace##apic_timer_interrupt(struct pt_regs *regs)\ > > +{ \ > > + struct pt_regs *old_regs = set_irq_regs(regs); \ > > + \ > > + /* \ > > + * NOTE! We'd better ACK the irq immediately, \ > > + * because timer handling can be slow. \ > > + */ \ > > + ack_APIC_irq(); \ > > + /* \ > > + * update_process_times() expects us to have done irq_enter(). \ > > + * Besides, if we don't timer interrupts ignore the global \ > > + * interrupt lock, which is the WrongThing (tm) to do. \ > > + */ \ > > + irq_enter(); \ > > + exit_idle(); \ > > + trace_enter; \ > > + local_apic_timer_interrupt(); \ > > + trace_exit; \ > > + irq_exit(); \ > > + \ > > + set_irq_regs(old_regs); \ > > +} > > + > > +SMP_APIC_TIMER_INTERRUPT(,,) > > +SMP_APIC_TIMER_INTERRUPT(trace_, trace_local_timer_entry(LOCAL_TIMER_VECTOR), > > + trace_local_timer_exit(LOCAL_TIMER_VECTOR)) > > > > int setup_profiling_timer(unsigned int multiplier) { @@ -1875,71 > > +1885,91 @@ int __init APIC_init_uniprocessor(void) > > /* > > * This interrupt should _never_ happen with our APIC/SMP architecture > > */ > > -void smp_spurious_interrupt(struct pt_regs *regs) -{ > > - u32 v; > > - > > - irq_enter(); > > - exit_idle(); > > - /* > > - * Check if this really is a spurious interrupt and ACK it > > - * if it is a vectored one. Just in case... > > - * Spurious interrupts should not be ACKed. > > - */ > > - v = apic_read(APIC_ISR + ((SPURIOUS_APIC_VECTOR & ~0x1f) >> 1)); > > - if (v & (1 << (SPURIOUS_APIC_VECTOR & 0x1f))) > > - ack_APIC_irq(); > > - > > - inc_irq_stat(irq_spurious_count); > > - > > - /* see sw-dev-man vol 3, chapter 7.4.13.5 */ > > - pr_info("spurious APIC interrupt on CPU#%d, " > > - "should never happen.\n", smp_processor_id()); > > - irq_exit(); > > -} > > +#define SMP_SPURIOUS_INTERRUPT(trace, trace_enter, trace_exit) \ > > +void smp_##trace##spurious_interrupt(struct pt_regs *regs) \ > > +{ \ > > + u32 v; \ > > + \ > > + irq_enter(); \ > > + exit_idle(); \ > > + trace_enter; \ > > + /* \ > > + * Check if this really is a spurious interrupt and ACK it \ > > + * if it is a vectored one. Just in case... \ > > + * Spurious interrupts should not be ACKed. \ > > + */ \ > > + v = apic_read(APIC_ISR + ((SPURIOUS_APIC_VECTOR & ~0x1f) >> 1));\ > > + if (v & (1 << (SPURIOUS_APIC_VECTOR & 0x1f))) \ > > + ack_APIC_irq(); \ > > + \ > > + inc_irq_stat(irq_spurious_count); \ > > + \ > > + /* see sw-dev-man vol 3, chapter 7.4.13.5 */ \ > > + pr_info("spurious APIC interrupt on CPU#%d, " \ > > + "should never happen.\n", smp_processor_id()); \ > > + trace_exit; \ > > + irq_exit(); \ > > +} > > + > > +SMP_SPURIOUS_INTERRUPT(,,) > > +SMP_SPURIOUS_INTERRUPT(trace_, trace_spurious_apic_entry(SPURIOUS_APIC_VECTOR), > > + trace_spurious_apic_exit(SPURIOUS_APIC_VECTOR)) > > > > /* > > * This interrupt should never happen with our APIC/SMP architecture > > */ > > -void smp_error_interrupt(struct pt_regs *regs) -{ > > - u32 v0, v1; > > - u32 i = 0; > > - static const char * const error_interrupt_reason[] = { > > - "Send CS error", /* APIC Error Bit 0 */ > > - "Receive CS error", /* APIC Error Bit 1 */ > > - "Send accept error", /* APIC Error Bit 2 */ > > - "Receive accept error", /* APIC Error Bit 3 */ > > - "Redirectable IPI", /* APIC Error Bit 4 */ > > - "Send illegal vector", /* APIC Error Bit 5 */ > > - "Received illegal vector", /* APIC Error Bit 6 */ > > - "Illegal register address", /* APIC Error Bit 7 */ > > - }; > > - > > - irq_enter(); > > - exit_idle(); > > - /* First tickle the hardware, only then report what went on. -- REW */ > > - v0 = apic_read(APIC_ESR); > > - apic_write(APIC_ESR, 0); > > - v1 = apic_read(APIC_ESR); > > - ack_APIC_irq(); > > - atomic_inc(&irq_err_count); > > - > > - apic_printk(APIC_DEBUG, KERN_DEBUG "APIC error on CPU%d: %02x(%02x)", > > - smp_processor_id(), v0 , v1); > > - > > - v1 = v1 & 0xff; > > - while (v1) { > > - if (v1 & 0x1) > > - apic_printk(APIC_DEBUG, KERN_CONT " : %s", error_interrupt_reason[i]); > > - i++; > > - v1 >>= 1; > > - } > > - > > - apic_printk(APIC_DEBUG, KERN_CONT "\n"); > > - > > - irq_exit(); > > -} > > +#define SMP_ERROR_INTERRUPT(trace, trace_enter, trace_exit) \ > > +void smp_##trace##error_interrupt(struct pt_regs *regs) \ > > +{ \ > > + u32 v0, v1; \ > > + u32 i = 0; \ > > + static const char * const error_interrupt_reason[] = { \ > > + "Send CS error", /* APIC Error Bit 0 */ \ > > + "Receive CS error", /* APIC Error Bit 1 */ \ > > + "Send accept error", /* APIC Error Bit 2 */ \ > > + "Receive accept error", /* APIC Error Bit 3 */ \ > > + "Redirectable IPI", /* APIC Error Bit 4 */ \ > > + "Send illegal vector", /* APIC Error Bit 5 */ \ > > + "Received illegal vector", /* APIC Error Bit 6 */ \ > > + "Illegal register address", /* APIC Error Bit 7 */ \ > > + }; \ > > + \ > > + irq_enter(); \ > > + exit_idle(); \ > > + trace_enter; \ > > + /* \ > > + * First tickle the hardware, only then report what went on. \ > > + * -- REW \ > > + */ \ > > + v0 = apic_read(APIC_ESR); \ > > + apic_write(APIC_ESR, 0); \ > > + v1 = apic_read(APIC_ESR); \ > > + ack_APIC_irq(); \ > > + atomic_inc(&irq_err_count); \ > > + \ > > + apic_printk(APIC_DEBUG, \ > > + KERN_DEBUG "APIC error on CPU%d: %02x(%02x)", \ > > + smp_processor_id(), v0 , v1); \ > > + \ > > + v1 = v1 & 0xff; \ > > + while (v1) { \ > > + if (v1 & 0x1) \ > > + apic_printk(APIC_DEBUG, KERN_CONT " : %s", \ > > + error_interrupt_reason[i]); \ > > + i++; \ > > + v1 >>= 1; \ > > + } \ > > + \ > > + apic_printk(APIC_DEBUG, KERN_CONT "\n"); \ > > + \ > > + trace_exit; \ > > + irq_exit(); \ > > +} > > + > > + > > +SMP_ERROR_INTERRUPT(,,) > > +SMP_ERROR_INTERRUPT(trace_, trace_error_apic_entry(ERROR_APIC_VECTOR), > > + trace_error_apic_exit(ERROR_APIC_VECTOR)) > > > > /** > > * connect_bsp_APIC - attach the APIC to the interrupt system diff > > --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c > > b/arch/x86/kernel/cpu/mcheck/therm_throt.c > > index 47a1870..a1c86ab 100644 > > --- a/arch/x86/kernel/cpu/mcheck/therm_throt.c > > +++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c > > @@ -23,6 +23,7 @@ > > #include <linux/init.h> > > #include <linux/smp.h> > > #include <linux/cpu.h> > > +#include <asm/trace/irq_vectors.h> > > > > #include <asm/processor.h> > > #include <asm/apic.h> > > @@ -378,17 +379,24 @@ static void unexpected_thermal_interrupt(void) > > > > static void (*smp_thermal_vector)(void) = > > unexpected_thermal_interrupt; > > > > -asmlinkage void smp_thermal_interrupt(struct pt_regs *regs) -{ > > - irq_enter(); > > - exit_idle(); > > - inc_irq_stat(irq_thermal_count); > > - smp_thermal_vector(); > > - irq_exit(); > > - /* Ack only at the end to avoid potential reentry */ > > - ack_APIC_irq(); > > +#define SMP_THERMAL_INTERRUPT(trace, trace_enter, trace_exit) \ > > +asmlinkage void smp_##trace##thermal_interrupt(struct pt_regs *regs) \ > > +{ \ > > + irq_enter(); \ > > + exit_idle(); \ > > + trace_enter; \ > > + inc_irq_stat(irq_thermal_count); \ > > + smp_thermal_vector(); \ > > + trace_exit; \ > > + irq_exit(); \ > > + /* Ack only at the end to avoid potential reentry */ \ > > + ack_APIC_irq(); \ > > } > > > > +SMP_THERMAL_INTERRUPT(,,) > > +SMP_THERMAL_INTERRUPT(trace_, trace_thermal_apic_entry(THERMAL_APIC_VECTOR), > > + trace_thermal_apic_exit(THERMAL_APIC_VECTOR)) > > + > > /* Thermal monitoring depends on APIC, ACPI and clock modulation */ > > static int intel_thermal_supported(struct cpuinfo_x86 *c) { diff --git > > a/arch/x86/kernel/cpu/mcheck/threshold.c > > b/arch/x86/kernel/cpu/mcheck/threshold.c > > index aa578ca..b7a95c5 100644 > > --- a/arch/x86/kernel/cpu/mcheck/threshold.c > > +++ b/arch/x86/kernel/cpu/mcheck/threshold.c > > @@ -4,6 +4,7 @@ > > #include <linux/interrupt.h> > > #include <linux/kernel.h> > > > > +#include <asm/trace/irq_vectors.h> > > #include <asm/irq_vectors.h> > > #include <asm/apic.h> > > #include <asm/idle.h> > > @@ -17,13 +18,21 @@ static void default_threshold_interrupt(void) > > > > void (*mce_threshold_vector)(void) = default_threshold_interrupt; > > > > -asmlinkage void smp_threshold_interrupt(void) -{ > > - irq_enter(); > > - exit_idle(); > > - inc_irq_stat(irq_threshold_count); > > - mce_threshold_vector(); > > - irq_exit(); > > - /* Ack only at the end to avoid potential reentry */ > > - ack_APIC_irq(); > > +#define SMP_THRESHOLD_INTERRUPT(trace, trace_enter, trace_exit) \ > > +asmlinkage void smp_##trace##threshold_interrupt(void) \ > > +{ \ > > + irq_enter(); \ > > + exit_idle(); \ > > + trace_enter; \ > > + inc_irq_stat(irq_threshold_count); \ > > + mce_threshold_vector(); \ > > + trace_exit; \ > > + irq_exit(); \ > > + /* Ack only at the end to avoid potential reentry */ \ > > + ack_APIC_irq(); \ > > } > > + > > +SMP_THRESHOLD_INTERRUPT(,,) > > +SMP_THRESHOLD_INTERRUPT(trace_, > > + trace_threshold_apic_entry(THRESHOLD_APIC_VECTOR), > > + trace_threshold_apic_exit(THRESHOLD_APIC_VECTOR)) > > diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S > > index cdc790c..20faa26 100644 > > --- a/arch/x86/kernel/entry_64.S > > +++ b/arch/x86/kernel/entry_64.S > > @@ -1187,6 +1187,39 @@ apicinterrupt IRQ_WORK_VECTOR \ > > irq_work_interrupt smp_irq_work_interrupt #endif > > > > +#ifdef CONFIG_TRACEPOINTS > > + > > +apicinterrupt LOCAL_TIMER_VECTOR \ > > + trace_apic_timer_interrupt smp_trace_apic_timer_interrupt > > +apicinterrupt X86_PLATFORM_IPI_VECTOR \ > > + trace_x86_platform_ipi smp_trace_x86_platform_ipi > > + > > +apicinterrupt THRESHOLD_APIC_VECTOR \ > > + trace_threshold_interrupt smp_trace_threshold_interrupt > > +apicinterrupt THERMAL_APIC_VECTOR \ > > + trace_thermal_interrupt smp_trace_thermal_interrupt > > + > > +#ifdef CONFIG_SMP > > +apicinterrupt CALL_FUNCTION_SINGLE_VECTOR \ > > + trace_call_function_single_interrupt \ > > + smp_trace_call_function_single_interrupt > > +apicinterrupt CALL_FUNCTION_VECTOR \ > > + trace_call_function_interrupt smp_trace_call_function_interrupt > > +apicinterrupt RESCHEDULE_VECTOR \ > > + trace_reschedule_interrupt smp_trace_reschedule_interrupt #endif > > + > > +apicinterrupt ERROR_APIC_VECTOR \ > > + trace_error_interrupt smp_trace_error_interrupt apicinterrupt > > +SPURIOUS_APIC_VECTOR \ > > + trace_spurious_interrupt smp_trace_spurious_interrupt > > + > > +#ifdef CONFIG_IRQ_WORK > > +apicinterrupt IRQ_WORK_VECTOR \ > > + trace_irq_work_interrupt smp_trace_irq_work_interrupt #endif #endif > > +/* CONFIG_TRACEPOINTS */ > > + > > /* > > * Exception entry points. > > */ > > diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S > > index 94bf9cc..cc32708 100644 > > --- a/arch/x86/kernel/head_64.S > > +++ b/arch/x86/kernel/head_64.S > > @@ -455,6 +455,12 @@ ENTRY(idt_table) > > ENTRY(nmi_idt_table) > > .skip IDT_ENTRIES * 16 > > > > +#ifdef CONFIG_TRACEPOINTS > > + .align L1_CACHE_BYTES > > +ENTRY(trace_idt_table) > > + .skip IDT_ENTRIES * 16 > > +#endif > > + > > __PAGE_ALIGNED_BSS > > .align PAGE_SIZE > > ENTRY(empty_zero_page) > > diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index > > e4595f1..9fd70ad 100644 > > --- a/arch/x86/kernel/irq.c > > +++ b/arch/x86/kernel/irq.c > > @@ -18,6 +18,8 @@ > > #include <asm/mce.h> > > #include <asm/hw_irq.h> > > > > +#include <asm/trace/irq_vectors.h> > > + > > atomic_t irq_err_count; > > > > /* Function pointer for generic interrupt vector handling */ @@ > > -208,26 +210,32 @@ unsigned int __irq_entry do_IRQ(struct pt_regs > > *regs) > > /* > > * Handler for X86_PLATFORM_IPI_VECTOR. > > */ > > -void smp_x86_platform_ipi(struct pt_regs *regs) -{ > > - struct pt_regs *old_regs = set_irq_regs(regs); > > - > > - ack_APIC_irq(); > > - > > - irq_enter(); > > - > > - exit_idle(); > > - > > - inc_irq_stat(x86_platform_ipis); > > - > > - if (x86_platform_ipi_callback) > > - x86_platform_ipi_callback(); > > - > > - irq_exit(); > > - > > - set_irq_regs(old_regs); > > +#define SMP_X86_PLATFORM_IPI(trace, trace_enter, trace_exit) \ > > +void smp_##trace##x86_platform_ipi(struct pt_regs *regs) \ > > +{ \ > > + struct pt_regs *old_regs = set_irq_regs(regs); \ > > + \ > > + ack_APIC_irq(); \ > > + \ > > + irq_enter(); \ > > + \ > > + exit_idle(); \ > > + trace_enter; \ > > + inc_irq_stat(x86_platform_ipis); \ > > + \ > > + if (x86_platform_ipi_callback) \ > > + x86_platform_ipi_callback(); \ > > + trace_exit; \ > > + irq_exit(); \ > > + \ > > + set_irq_regs(old_regs); \ > > } > > > > +SMP_X86_PLATFORM_IPI(,,) > > +SMP_X86_PLATFORM_IPI(trace_, > > + trace_x86_platform_ipi_entry(X86_PLATFORM_IPI_VECTOR), > > + trace_x86_platform_ipi_exit(X86_PLATFORM_IPI_VECTOR)) > > + > > EXPORT_SYMBOL_GPL(vector_used_by_percpu_irq); > > > > #ifdef CONFIG_HOTPLUG_CPU > > diff --git a/arch/x86/kernel/irq_work.c b/arch/x86/kernel/irq_work.c > > index ca8f703..a669b94 100644 > > --- a/arch/x86/kernel/irq_work.c > > +++ b/arch/x86/kernel/irq_work.c > > @@ -8,16 +8,24 @@ > > #include <linux/irq_work.h> > > #include <linux/hardirq.h> > > #include <asm/apic.h> > > +#include <asm/trace/irq_vectors.h> > > > > -void smp_irq_work_interrupt(struct pt_regs *regs) -{ > > - irq_enter(); > > - ack_APIC_irq(); > > - inc_irq_stat(apic_irq_work_irqs); > > - irq_work_run(); > > - irq_exit(); > > +#define SMP_IRQ_WORK_INTERRUPT(trace, trace_enter, trace_exit) \ > > +void smp_##trace##irq_work_interrupt(struct pt_regs *regs) \ > > +{ \ > > + irq_enter(); \ > > + ack_APIC_irq(); \ > > + trace_enter; \ > > + inc_irq_stat(apic_irq_work_irqs); \ > > + irq_work_run(); \ > > + trace_exit; \ > > + irq_exit(); \ > > } > > > > +SMP_IRQ_WORK_INTERRUPT(,,) > > +SMP_IRQ_WORK_INTERRUPT(trace_, trace_irq_work_entry(IRQ_WORK_VECTOR), > > + trace_irq_work_exit(IRQ_WORK_VECTOR)) > > + > > void arch_irq_work_raise(void) > > { > > #ifdef CONFIG_X86_LOCAL_APIC > > diff --git a/arch/x86/kernel/irqinit.c b/arch/x86/kernel/irqinit.c > > index 6e03b0d..cf76128 100644 > > --- a/arch/x86/kernel/irqinit.c > > +++ b/arch/x86/kernel/irqinit.c > > @@ -251,4 +251,6 @@ void __init native_init_IRQ(void) > > > > irq_ctx_init(smp_processor_id()); > > #endif > > + > > + trace_idt_table_init(); > > } > > diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c index > > 48d2b7d..d8e1a2c 100644 > > --- a/arch/x86/kernel/smp.c > > +++ b/arch/x86/kernel/smp.c > > @@ -23,6 +23,7 @@ > > #include <linux/interrupt.h> > > #include <linux/cpu.h> > > #include <linux/gfp.h> > > +#include <asm/trace/irq_vectors.h> > > > > #include <asm/mtrr.h> > > #include <asm/tlbflush.h> > > @@ -249,34 +250,57 @@ finish: > > /* > > * Reschedule call back. > > */ > > -void smp_reschedule_interrupt(struct pt_regs *regs) -{ > > - ack_APIC_irq(); > > - inc_irq_stat(irq_resched_count); > > - scheduler_ipi(); > > - /* > > - * KVM uses this interrupt to force a cpu out of guest mode > > - */ > > +#define SMP_RESCHEDULE_INTERRUPT(trace, trace_enter, trace_exit) \ > > +void smp_##trace##reschedule_interrupt(struct pt_regs *regs) \ > > +{ \ > > + ack_APIC_irq(); \ > > + trace_enter; \ > > + inc_irq_stat(irq_resched_count); \ > > + scheduler_ipi(); \ > > + trace_exit; \ > > + /* \ > > + * KVM uses this interrupt to force a cpu out of guest mode \ > > + */ \ > > } > > > > -void smp_call_function_interrupt(struct pt_regs *regs) -{ > > - ack_APIC_irq(); > > - irq_enter(); > > - generic_smp_call_function_interrupt(); > > - inc_irq_stat(irq_call_count); > > - irq_exit(); > > +SMP_RESCHEDULE_INTERRUPT(,,) > > +SMP_RESCHEDULE_INTERRUPT(trace_, trace_reschedule_entry(RESCHEDULE_VECTOR), > > + trace_reschedule_exit(RESCHEDULE_VECTOR)) > > + > > +#define SMP_CALL_FUNCTION_INTERRUPT(trace, trace_enter, trace_exit) \ > > +void smp_##trace##call_function_interrupt(struct pt_regs *regs) \ > > +{ \ > > + ack_APIC_irq(); \ > > + irq_enter(); \ > > + trace_enter; \ > > + generic_smp_call_function_interrupt(); \ > > + inc_irq_stat(irq_call_count); \ > > + trace_exit; \ > > + irq_exit(); \ > > } > > > > -void smp_call_function_single_interrupt(struct pt_regs *regs) -{ > > - ack_APIC_irq(); > > - irq_enter(); > > - generic_smp_call_function_single_interrupt(); > > - inc_irq_stat(irq_call_count); > > - irq_exit(); > > +SMP_CALL_FUNCTION_INTERRUPT(,,) > > +SMP_CALL_FUNCTION_INTERRUPT(trace_, > > + trace_call_function_entry(CALL_FUNCTION_VECTOR), > > + trace_call_function_exit(CALL_FUNCTION_VECTOR)) > > + > > +#define SMP_CALL_FUNCTION_SINGLE_INTERRUPT(trace, trace_enter, trace_exit)\ > > +void smp_##trace##call_function_single_interrupt(struct pt_regs *regs) \ > > +{ \ > > + ack_APIC_irq(); \ > > + irq_enter(); \ > > + trace_enter; \ > > + generic_smp_call_function_single_interrupt(); \ > > + inc_irq_stat(irq_call_count); \ > > + trace_exit; \ > > + irq_exit(); \ > > } > > > > +SMP_CALL_FUNCTION_SINGLE_INTERRUPT(,,) > > +SMP_CALL_FUNCTION_SINGLE_INTERRUPT(trace_, > > + trace_call_function_single_entry(CALL_FUNCTION_SINGLE_VECTOR), > > + trace_call_function_single_exit(CALL_FUNCTION_SINGLE_VECTOR)) > > + > > static int __init nonmi_ipi_setup(char *str) { > > smp_no_nmi_ipi = true; > > diff --git a/arch/x86/kernel/tracepoint.c > > b/arch/x86/kernel/tracepoint.c new file mode 100644 index > > 0000000..d7c96ba > > --- /dev/null > > +++ b/arch/x86/kernel/tracepoint.c > > @@ -0,0 +1,102 @@ > > +/* > > + * Code for supporting irq vector tracepoints. > > + * > > + * Copyright (C) 2012 Seiji Aguchi <sei...@hd...> > > + * > > + */ > > +#include <asm/hw_irq.h> > > +#include <asm/desc.h> > > + > > +static struct desc_ptr trace_idt_descr = { NR_VECTORS * 16 - 1, > > + (unsigned long) trace_idt_table }; > > + > > +#ifndef CONFIG_X86_64 > > +gate_desc trace_idt_table[NR_VECTORS] __page_aligned_data > > + = { { { { 0, 0 } } }, }; > > +#endif > > + > > +void __init trace_idt_table_init(void) { > > + memcpy(&trace_idt_table, &idt_table, IDT_ENTRIES * 16); > > + /* > > + * The reschedule interrupt is a CPU-to-CPU reschedule-helper > > + * IPI, driven by wakeup. > > + */ > > + trace_set_intr_gate(RESCHEDULE_VECTOR, trace_reschedule_interrupt); > > + > > + /* IPI for generic function call */ > > + trace_set_intr_gate(CALL_FUNCTION_VECTOR, > > + trace_call_function_interrupt); > > + > > + /* IPI for generic single function call */ > > + trace_set_intr_gate(CALL_FUNCTION_SINGLE_VECTOR, > > + trace_call_function_single_interrupt); > > + > > +#ifdef CONFIG_X86_THERMAL_VECTOR > > + trace_set_intr_gate(THERMAL_APIC_VECTOR, trace_thermal_interrupt); > > +#endif #ifdef CONFIG_X86_MCE_THRESHOLD > > + trace_set_intr_gate(THRESHOLD_APIC_VECTOR, > > +trace_threshold_interrupt); #endif > > + > > +#if defined(CONFIG_X86_64) || defined(CONFIG_X86_LOCAL_APIC) > > + /* self generated IPI for local APIC timer */ > > + trace_set_intr_gate(LOCAL_TIMER_VECTOR, trace_apic_timer_interrupt); > > + > > + /* IPI for X86 platform specific use */ > > + trace_set_intr_gate(X86_PLATFORM_IPI_VECTOR, > > +trace_x86_platform_ipi); > > + > > + /* IPI vectors for APIC spurious and error interrupts */ > > + trace_set_intr_gate(SPURIOUS_APIC_VECTOR, trace_spurious_interrupt); > > + trace_set_intr_gate(ERROR_APIC_VECTOR, trace_error_interrupt); > > + > > + /* IRQ work interrupts: */ > > +# ifdef CONFIG_IRQ_WORK > > + trace_set_intr_gate(IRQ_WORK_VECTOR, trace_irq_work_interrupt); # > > +endif # endif } > > + > > +static struct desc_ptr orig_idt_descr[NR_CPUS]; static int > > +trace_irq_vector_refcount; > > + > > +static void switch_trace_idt(void *arg) { > > + store_idt(&orig_idt_descr[smp_processor_id()]); > > + load_idt(&trace_idt_descr); > > + > > + return; > > +} > > + > > +static void restore_original_idt(void *arg) { > > + if (orig_idt_descr[smp_processor_id()].address) { > > + load_idt(&orig_idt_descr[smp_processor_id()]); > > + memset(&orig_idt_descr[smp_processor_id()], 0, > > + sizeof(struct desc_ptr)); > > + } > > + > > + return; > > +} > > + > > +void trace_irq_vector_regfunc(void) > > +{ > > + if (!trace_irq_vector_refcount) { > > + smp_call_function(switch_trace_idt, NULL, 0); > > + local_irq_disable(); > > + switch_trace_idt(NULL); > > + local_irq_enable(); > > + } > > + trace_irq_vector_refcount++; > > +} > > + > > +void trace_irq_vector_unregfunc(void) { > > + trace_irq_vector_refcount--; > > + if (!trace_irq_vector_refcount) { > > + smp_call_function(restore_original_idt, NULL, 0); > > + local_irq_disable(); > > + restore_original_idt(NULL); > > + local_irq_enable(); > > + } > > +} > > -- 1.7.1 |
|
From: Seiji A. <sei...@hd...> - 2012-10-25 21:38:00
|
> > On the other hand, patch 5/5 changes the format by adding sequence counter. > > But efi_pstore_read is modied to work correctly in it. > > > > dump-type0-1-1-1351113059-cfc8fc79-be2e-4ddc-97f0-9f98bfe298a0 > > > > Variable Name: dump-type0-1-1-1351113059 > > GUID: cfc8fc79-be2e-4ddc-97f0-9f98bfe298a0 > > > > If I need to elaborate more, please feel free to ask me:) I'm not sure > > if I understand your question completely. > > In this case, I think efi_pstore_read in patch 5/5 should probably fall-back to trying to do a sscanf(...) == 3 if the sscanf(...) == 4 test fails > so that dumps for older dumps from previous kernels are still visible to users, no? They can perhaps get a default count of 0? > efi_pstore_erase would have to be updated to understand this as well. OK. I understand your concern. I will update my patch to work with an old format too. In addition to efi_pstore_read(), efi_pstore_erase should be modified. Anyway, I will post a patch v3 later. Thank you for reviewing. Seiji |
|
From: Mike W. <mi...@go...> - 2012-10-25 21:20:25
|
On Thu, Oct 25, 2012 at 1:51 PM, Seiji Aguchi <sei...@hd...> wrote: >> So doesn't this break erasing of existing dumps in pstore-efivars? >> Unless efi_pstore_read still understands the older variable name formats, users will be stranded with dumps consuming space in >> efivars that aren't exported via pstore anymore. > > Patch 2/5, 3/5 and 4/5 don't change an existing format of a variable name and don't break anything. > The existing format consists of type, id and ctime. And it is not moditied by these patches. > Here is a variable name (and guid) of efi_pstore in curret linus-tree. > > ls /sys/firmware/efi/vars/ > dump-type0-1-1351113059-cfc8fc79-be2e-4ddc-97f0-9f98bfe298a0 > dump-type0-2-1351113059-cfc8fc79-be2e-4ddc-97f0-9f98bfe298a0 > dump-type0-3-1351113059-cfc8fc79-be2e-4ddc-97f0-9f98bfe298a0 > > Variable Name: dump-type0-1-1351113059 > GUID: cfc8fc79-be2e-4ddc-97f0-9f98bfe298a0 Okay, this alleviates most of my concerns for ctime. > > On the other hand, patch 5/5 changes the format by adding sequence counter. > But efi_pstore_read is modied to work correctly in it. > > dump-type0-1-1-1351113059-cfc8fc79-be2e-4ddc-97f0-9f98bfe298a0 > > Variable Name: dump-type0-1-1-1351113059 > GUID: cfc8fc79-be2e-4ddc-97f0-9f98bfe298a0 > > If I need to elaborate more, please feel free to ask me:) > I'm not sure if I understand your question completely. In this case, I think efi_pstore_read in patch 5/5 should probably fall-back to trying to do a sscanf(...) == 3 if the sscanf(...) == 4 test fails so that dumps for older dumps from previous kernels are still visible to users, no? They can perhaps get a default count of 0? efi_pstore_erase would have to be updated to understand this as well. |
|
From: Seiji A. <sei...@hd...> - 2012-10-25 20:51:18
|
> So doesn't this break erasing of existing dumps in pstore-efivars? > Unless efi_pstore_read still understands the older variable name formats, users will be stranded with dumps consuming space in > efivars that aren't exported via pstore anymore. Patch 2/5, 3/5 and 4/5 don't change an existing format of a variable name and don't break anything. The existing format consists of type, id and ctime. And it is not moditied by these patches. Here is a variable name (and guid) of efi_pstore in curret linus-tree. ls /sys/firmware/efi/vars/ dump-type0-1-1351113059-cfc8fc79-be2e-4ddc-97f0-9f98bfe298a0 dump-type0-2-1351113059-cfc8fc79-be2e-4ddc-97f0-9f98bfe298a0 dump-type0-3-1351113059-cfc8fc79-be2e-4ddc-97f0-9f98bfe298a0 Variable Name: dump-type0-1-1351113059 GUID: cfc8fc79-be2e-4ddc-97f0-9f98bfe298a0 On the other hand, patch 5/5 changes the format by adding sequence counter. But efi_pstore_read is modied to work correctly in it. dump-type0-1-1-1351113059-cfc8fc79-be2e-4ddc-97f0-9f98bfe298a0 Variable Name: dump-type0-1-1-1351113059 GUID: cfc8fc79-be2e-4ddc-97f0-9f98bfe298a0 If I need to elaborate more, please feel free to ask me:) I'm not sure if I understand your question completely. Seiji |
|
From: Mike W. <mi...@go...> - 2012-10-25 20:22:16
|
On Thu, Oct 25, 2012 at 1:03 PM, Seiji Aguchi <sei...@hd...> wrote: >> > - list_del(&found->list); >> > + sprintf(name, "dump-type%u-%u-%lu", type, part, >> > + get_seconds()); >> >> Actually, nothing seems to be ensuring that the value used here ends up being the same as the ctime :\ Consider what happens if the >> get_seconds() call here happens at second N, and the i_ctime is collected at time N+1. The object wouldn't be able to be deleted after >> patch 4/5 is applied. How can we better ensure that the i_ctime is the same value as serialized into the efi var name here? > > i_ctime of pstore file system is different from normal file systems. > As I said in patch v1, i_ctime of pstore means the date that the record was "originally" stored. > > I_ctime is always set to N. > Here is the scenario. > > 1. writing time > - get_seconds() call here happens at second N > - "N" is stored to a variable name > > 2. System reboots... > 3. System boots up... > > 4. reading time > - pstore calls a read callback, efi_pstore_read() > - efi_pstore_read extracts ctime, "N", from the virable name Ah, I accidentally glossed over this bit. Thanks for correcting me :) > - efi_pstore passes "N" to pstore file system > - pstore create a file, /dev/pstore/dmesg* and set "N" to i_ctime > > Pstore file system doesn't set i_ctime by itself but set ctime passed by platform drivers. > So doesn't this break erasing of existing dumps in pstore-efivars? Unless efi_pstore_read still understands the older variable name formats, users will be stranded with dumps consuming space in efivars that aren't exported via pstore anymore. |
|
From: Seiji A. <sei...@hd...> - 2012-10-25 20:04:01
|
> > - list_del(&found->list);
> > + sprintf(name, "dump-type%u-%u-%lu", type, part,
> > + get_seconds());
>
> Actually, nothing seems to be ensuring that the value used here ends up being the same as the ctime :\ Consider what happens if the
> get_seconds() call here happens at second N, and the i_ctime is collected at time N+1. The object wouldn't be able to be deleted after
> patch 4/5 is applied. How can we better ensure that the i_ctime is the same value as serialized into the efi var name here?
i_ctime of pstore file system is different from normal file systems.
As I said in patch v1, i_ctime of pstore means the date that the record was "originally" stored.
I_ctime is always set to N.
Here is the scenario.
1. writing time
- get_seconds() call here happens at second N
- "N" is stored to a variable name
2. System reboots...
3. System boots up...
4. reading time
- pstore calls a read callback, efi_pstore_read()
- efi_pstore_read extracts ctime, "N", from the virable name
- efi_pstore passes "N" to pstore file system
- pstore create a file, /dev/pstore/dmesg* and set "N" to i_ctime
Pstore file system doesn't set i_ctime by itself but set ctime passed by platform drivers.
Seiji
|
|
From: Mike W. <mi...@go...> - 2012-10-25 19:18:32
|
On Wed, Oct 24, 2012 at 6:54 PM, Seiji Aguchi <sei...@hd...> wrote:
> [Issue]
>
> Currently, efi_pstore driver simply overwrites existing panic messages in NVRAM.
> So, in the following scenario, we will lose 1st panic messages.
>
> 1. kernel panics.
> 2. efi_pstore is kicked and writes panic messages to NVRAM.
> 3. system reboots.
> 4. kernel panics again before a user checks the 1st panic messages in NVRAM.
>
> [Solution]
>
> A reasonable solution to fix the issue is just holding multiple logs without erasing
> existing entries.
> This patch removes a logic erasing existing entries in a write callback
> because the logic is not needed in the write callback to support holding multiple logs.
>
> Signed-off-by: Seiji Aguchi <sei...@hd...>
> ---
> drivers/firmware/efivars.c | 39 ++-------------------------------------
> 1 files changed, 2 insertions(+), 37 deletions(-)
>
> diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
> index bee14cc..fbe9202 100644
> --- a/drivers/firmware/efivars.c
> +++ b/drivers/firmware/efivars.c
> @@ -701,18 +701,13 @@ static int efi_pstore_write(enum pstore_type_id type,
> unsigned int part, size_t size, struct pstore_info *psi)
> {
> char name[DUMP_NAME_LEN];
> - char stub_name[DUMP_NAME_LEN];
> efi_char16_t efi_name[DUMP_NAME_LEN];
> efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
> struct efivars *efivars = psi->data;
> - struct efivar_entry *entry, *found = NULL;
> int i, ret = 0;
> u64 storage_space, remaining_space, max_variable_size;
> efi_status_t status = EFI_NOT_FOUND;
>
> - sprintf(stub_name, "dump-type%u-%u-", type, part);
> - sprintf(name, "%s%lu", stub_name, get_seconds());
> -
> spin_lock(&efivars->lock);
>
> /*
> @@ -730,35 +725,8 @@ static int efi_pstore_write(enum pstore_type_id type,
> return -ENOSPC;
> }
>
> - for (i = 0; i < DUMP_NAME_LEN; i++)
> - efi_name[i] = stub_name[i];
> -
> - /*
> - * Clean up any entries with the same name
> - */
> -
> - list_for_each_entry(entry, &efivars->list, list) {
> - get_var_data_locked(efivars, &entry->var);
> -
> - if (efi_guidcmp(entry->var.VendorGuid, vendor))
> - continue;
> - if (utf16_strncmp(entry->var.VariableName, efi_name,
> - utf16_strlen(efi_name)))
> - continue;
> - /* Needs to be a prefix */
> - if (entry->var.VariableName[utf16_strlen(efi_name)] == 0)
> - continue;
> -
> - /* found */
> - found = entry;
> - efivars->ops->set_variable(entry->var.VariableName,
> - &entry->var.VendorGuid,
> - PSTORE_EFI_ATTRIBUTES,
> - 0, NULL);
> - }
> -
> - if (found)
> - list_del(&found->list);
> + sprintf(name, "dump-type%u-%u-%lu", type, part,
> + get_seconds());
Actually, nothing seems to be ensuring that the value used here ends
up being the same as the ctime :\ Consider what happens if the
get_seconds() call here happens at second N, and the i_ctime is
collected at time N+1. The object wouldn't be able to be deleted after
patch 4/5 is applied. How can we better ensure that the i_ctime is
the same value as serialized into the efi var name here?
>
> for (i = 0; i < DUMP_NAME_LEN; i++)
> efi_name[i] = name[i];
> @@ -768,9 +736,6 @@ static int efi_pstore_write(enum pstore_type_id type,
>
> spin_unlock(&efivars->lock);
>
> - if (found)
> - efivar_unregister(found);
> -
> if (size)
> ret = efivar_create_sysfs_entry(efivars,
> utf16_strsize(efi_name,
> -- 1.7.1
|
|
From: Mike W. <mi...@go...> - 2012-10-25 19:04:28
|
Acked-by: Mike Waychison <mi...@go...>
On Wed, Oct 24, 2012 at 6:54 PM, Seiji Aguchi <sei...@hd...> wrote:
> [Issue]
>
> Currently, efi_pstore driver simply overwrites existing panic messages in NVRAM.
> So, in the following scenario, we will lose 1st panic messages.
>
> 1. kernel panics.
> 2. efi_pstore is kicked and writes panic messages to NVRAM.
> 3. system reboots.
> 4. kernel panics again before a user checks the 1st panic messages in NVRAM.
>
> [Solution]
>
> A reasonable solution to fix the issue is just holding multiple logs without erasing
> existing entries.
> This patch removes a logic erasing existing entries in a write callback
> because the logic is not needed in the write callback to support holding multiple logs.
>
> Signed-off-by: Seiji Aguchi <sei...@hd...>
> ---
> drivers/firmware/efivars.c | 39 ++-------------------------------------
> 1 files changed, 2 insertions(+), 37 deletions(-)
>
> diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
> index bee14cc..fbe9202 100644
> --- a/drivers/firmware/efivars.c
> +++ b/drivers/firmware/efivars.c
> @@ -701,18 +701,13 @@ static int efi_pstore_write(enum pstore_type_id type,
> unsigned int part, size_t size, struct pstore_info *psi)
> {
> char name[DUMP_NAME_LEN];
> - char stub_name[DUMP_NAME_LEN];
> efi_char16_t efi_name[DUMP_NAME_LEN];
> efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
> struct efivars *efivars = psi->data;
> - struct efivar_entry *entry, *found = NULL;
> int i, ret = 0;
> u64 storage_space, remaining_space, max_variable_size;
> efi_status_t status = EFI_NOT_FOUND;
>
> - sprintf(stub_name, "dump-type%u-%u-", type, part);
> - sprintf(name, "%s%lu", stub_name, get_seconds());
> -
> spin_lock(&efivars->lock);
>
> /*
> @@ -730,35 +725,8 @@ static int efi_pstore_write(enum pstore_type_id type,
> return -ENOSPC;
> }
>
> - for (i = 0; i < DUMP_NAME_LEN; i++)
> - efi_name[i] = stub_name[i];
> -
> - /*
> - * Clean up any entries with the same name
> - */
> -
> - list_for_each_entry(entry, &efivars->list, list) {
> - get_var_data_locked(efivars, &entry->var);
> -
> - if (efi_guidcmp(entry->var.VendorGuid, vendor))
> - continue;
> - if (utf16_strncmp(entry->var.VariableName, efi_name,
> - utf16_strlen(efi_name)))
> - continue;
> - /* Needs to be a prefix */
> - if (entry->var.VariableName[utf16_strlen(efi_name)] == 0)
> - continue;
> -
> - /* found */
> - found = entry;
> - efivars->ops->set_variable(entry->var.VariableName,
> - &entry->var.VendorGuid,
> - PSTORE_EFI_ATTRIBUTES,
> - 0, NULL);
> - }
> -
> - if (found)
> - list_del(&found->list);
> + sprintf(name, "dump-type%u-%u-%lu", type, part,
> + get_seconds());
Could you please also update Documentation/ABI/testing/pstore ? It
seems it was never updated with the "dump*" file prefix.
>
> for (i = 0; i < DUMP_NAME_LEN; i++)
> efi_name[i] = name[i];
> @@ -768,9 +736,6 @@ static int efi_pstore_write(enum pstore_type_id type,
>
> spin_unlock(&efivars->lock);
>
> - if (found)
> - efivar_unregister(found);
> -
> if (size)
> ret = efivar_create_sysfs_entry(efivars,
> utf16_strsize(efi_name,
> -- 1.7.1
|
|
From: Mike W. <mi...@go...> - 2012-10-25 18:56:40
|
Acked-by: Mike Waychison <mi...@go...> On Wed, Oct 24, 2012 at 6:51 PM, Seiji Aguchi <sei...@hd...> wrote: > [Issue] > > As discussed in a thread below, Running out of space in EFI isn't a well-tested scenario. > And we wouldn't expect all firmware to handle it gracefully. > http://marc.info/?l=linux-kernel&m=134305325801789&w=2 > > On the other hand, current efi_pstore doesn't check a remaining space of storage at writing time. > Therefore, efi_pstore may not work if it tries to write a large amount of data. > > [Patch Description] > > To avoid handling the situation above, this patch checks if there is a space enough to log with > QueryVariableInfo() before writing data. > > Signed-off-by: Seiji Aguchi <sei...@hd...> > --- > drivers/firmware/efivars.c | 18 ++++++++++++++++++ > include/linux/efi.h | 1 + > 2 files changed, 19 insertions(+), 0 deletions(-) > > diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c > index d10c987..37ac21a 100644 > --- a/drivers/firmware/efivars.c > +++ b/drivers/firmware/efivars.c > @@ -707,12 +707,29 @@ static int efi_pstore_write(enum pstore_type_id type, > struct efivars *efivars = psi->data; > struct efivar_entry *entry, *found = NULL; > int i, ret = 0; > + u64 storage_space, remaining_space, max_variable_size; > + efi_status_t status = EFI_NOT_FOUND; > > sprintf(stub_name, "dump-type%u-%u-", type, part); > sprintf(name, "%s%lu", stub_name, get_seconds()); > > spin_lock(&efivars->lock); > > + /* > + * Check if there is a space enough to log. > + * size: a size of logging data > + * DUMP_NAME_LEN * 2: a maximum size of variable name > + */ > + status = efivars->ops->query_variable_info(PSTORE_EFI_ATTRIBUTES, > + &storage_space, > + &remaining_space, > + &max_variable_size); > + if (status || remaining_space < size + DUMP_NAME_LEN * 2) { > + spin_unlock(&efivars->lock); > + *id = part; > + return -ENOSPC; > + } > + > for (i = 0; i < DUMP_NAME_LEN; i++) > efi_name[i] = stub_name[i]; > > @@ -1237,6 +1254,7 @@ efivars_init(void) > ops.get_variable = efi.get_variable; > ops.set_variable = efi.set_variable; > ops.get_next_variable = efi.get_next_variable; > + ops.query_variable_info = efi.query_variable_info; > error = register_efivars(&__efivars, &ops, efi_kobj); > if (error) > goto err_put; > diff --git a/include/linux/efi.h b/include/linux/efi.h > index 8670eb1..c47ec36 100644 > --- a/include/linux/efi.h > +++ b/include/linux/efi.h > @@ -643,6 +643,7 @@ struct efivar_operations { > efi_get_variable_t *get_variable; > efi_get_next_variable_t *get_next_variable; > efi_set_variable_t *set_variable; > + efi_query_variable_info_t *query_variable_info; > }; > > struct efivars { > -- 1.7.1 |
|
From: Seiji A. <sei...@hd...> - 2012-10-25 02:00:12
|
Resending a patch by changing a subject from "PATCH 2/5" to "PATCH v2 2/5".
[Issue]
Currently, efi_pstore driver simply overwrites existing panic messages in NVRAM.
So, in the following scenario, we will lose 1st panic messages.
1. kernel panics.
2. efi_pstore is kicked and writes panic messages to NVRAM.
3. system reboots.
4. kernel panics again before a user checks the 1st panic messages in NVRAM.
[Solution]
A reasonable solution to fix the issue is just holding multiple logs without erasing
existing entries.
This patch freshly adds a logic erasing existing entries, which shared with a write callback,
to an erase callback.
To support holding multiple logs, the write callback doesn't need to erase any entries and
it will be removed in a subsequent patch.
Signed-off-by: Seiji Aguchi <sei...@hd...>
---
drivers/firmware/efivars.c | 46 +++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 45 insertions(+), 1 deletions(-)
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index 37ac21a..bee14cc 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -784,7 +784,51 @@ static int efi_pstore_write(enum pstore_type_id type,
static int efi_pstore_erase(enum pstore_type_id type, u64 id,
struct pstore_info *psi)
{
- efi_pstore_write(type, 0, &id, (unsigned int)id, 0, psi);
+ char stub_name[DUMP_NAME_LEN];
+ efi_char16_t efi_name[DUMP_NAME_LEN];
+ efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
+ struct efivars *efivars = psi->data;
+ struct efivar_entry *entry, *found = NULL;
+ int i;
+
+ sprintf(stub_name, "dump-type%u-%u-", type, (unsigned int)id);
+
+ spin_lock(&efivars->lock);
+
+ for (i = 0; i < DUMP_NAME_LEN; i++)
+ efi_name[i] = stub_name[i];
+
+ /*
+ * Clean up any entries with the same name
+ */
+
+ list_for_each_entry(entry, &efivars->list, list) {
+ get_var_data_locked(efivars, &entry->var);
+
+ if (efi_guidcmp(entry->var.VendorGuid, vendor))
+ continue;
+ if (utf16_strncmp(entry->var.VariableName, efi_name,
+ utf16_strlen(efi_name)))
+ continue;
+ /* Needs to be a prefix */
+ if (entry->var.VariableName[utf16_strlen(efi_name)] == 0)
+ continue;
+
+ /* found */
+ found = entry;
+ efivars->ops->set_variable(entry->var.VariableName,
+ &entry->var.VendorGuid,
+ PSTORE_EFI_ATTRIBUTES,
+ 0, NULL);
+ }
+
+ if (found)
+ list_del(&found->list);
+
+ spin_unlock(&efivars->lock);
+
+ if (found)
+ efivar_unregister(found);
return 0;
}
-- 1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-10-25 01:55:25
|
[Issue]
Currently, a variable name, which identifies each entry, consists of type, id and ctime.
But if multiple events happens in a short time, a second/third event may fail to log because
efi_pstore can't distinguish each event with current variable name.
[Solution]
A reasonable way to identify all events precisely is introducing a sequence counter to
the variable name.
The sequence counter has already supported in a pstore layer with "oopscount".
So, this patch adds it to a variable name.
Also, it is passed to read/erase callbacks of platform drivers in accordance with
the modification of the variable name.
<before applying this patch>
a variable name of first event: dump-type0-1-12345678
a variable name of second event: dump-type0-1-12345678
type:0
id:1
ctime:12345678
If multiple events happen in a short time, efi_pstore can't distinguish them because
variable names are same among them.
<after applying this patch>
it can be distinguishable by adding a sequence counter as follows.
a variable name of first event: dump-type0-1-1-12345678
a variable name of Second event: dump-type0-1-2-12345678
type:0
id:1
sequence counter: 1(first event), 2(second event)
ctime:12345678
Signed-off-by: Seiji Aguchi <sei...@hd...>
---
drivers/acpi/apei/erst.c | 12 ++++++------
drivers/firmware/efivars.c | 18 +++++++++++-------
fs/pstore/inode.c | 8 +++++---
fs/pstore/internal.h | 2 +-
fs/pstore/platform.c | 11 ++++++-----
fs/pstore/ram.c | 7 +++----
include/linux/pstore.h | 8 +++++---
7 files changed, 37 insertions(+), 29 deletions(-)
diff --git a/drivers/acpi/apei/erst.c b/drivers/acpi/apei/erst.c
index 0bd6ae4..6d894bf 100644
--- a/drivers/acpi/apei/erst.c
+++ b/drivers/acpi/apei/erst.c
@@ -931,13 +931,13 @@ static int erst_check_table(struct acpi_table_erst *erst_tab)
static int erst_open_pstore(struct pstore_info *psi);
static int erst_close_pstore(struct pstore_info *psi);
-static ssize_t erst_reader(u64 *id, enum pstore_type_id *type,
+static ssize_t erst_reader(u64 *id, enum pstore_type_id *type, int *count,
struct timespec *time, char **buf,
struct pstore_info *psi);
static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
- u64 *id, unsigned int part,
+ u64 *id, unsigned int part, int count,
size_t size, struct pstore_info *psi);
-static int erst_clearer(enum pstore_type_id type, u64 id,
+static int erst_clearer(enum pstore_type_id type, u64 id, int count,
struct timespec time, struct pstore_info *psi);
static struct pstore_info erst_info = {
@@ -987,7 +987,7 @@ static int erst_close_pstore(struct pstore_info *psi)
return 0;
}
-static ssize_t erst_reader(u64 *id, enum pstore_type_id *type,
+static ssize_t erst_reader(u64 *id, enum pstore_type_id *type, int *count,
struct timespec *time, char **buf,
struct pstore_info *psi)
{
@@ -1055,7 +1055,7 @@ out:
}
static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
- u64 *id, unsigned int part,
+ u64 *id, unsigned int part, int count,
size_t size, struct pstore_info *psi)
{
struct cper_pstore_record *rcd = (struct cper_pstore_record *)
@@ -1101,7 +1101,7 @@ static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
return ret;
}
-static int erst_clearer(enum pstore_type_id type, u64 id,
+static int erst_clearer(enum pstore_type_id type, u64 id, int count,
struct timespec time, struct pstore_info *psi)
{
return erst_clear(id);
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index 6cbeea7..dc69802 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -658,13 +658,14 @@ static int efi_pstore_close(struct pstore_info *psi)
}
static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
- struct timespec *timespec,
+ int *count, struct timespec *timespec,
char **buf, struct pstore_info *psi)
{
efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
struct efivars *efivars = psi->data;
char name[DUMP_NAME_LEN];
int i;
+ int cnt;
unsigned int part, size;
unsigned long time;
@@ -674,8 +675,10 @@ static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
for (i = 0; i < DUMP_NAME_LEN; i++) {
name[i] = efivars->walk_entry->var.VariableName[i];
}
- if (sscanf(name, "dump-type%u-%u-%lu", type, &part, &time) == 3) {
+ if (sscanf(name, "dump-type%u-%u-%d-%lu",
+ type, &part, &cnt, &time) == 4) {
*id = part;
+ *count = cnt;
timespec->tv_sec = time;
timespec->tv_nsec = 0;
get_var_data_locked(efivars, &efivars->walk_entry->var);
@@ -698,7 +701,8 @@ static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
static int efi_pstore_write(enum pstore_type_id type,
enum kmsg_dump_reason reason, u64 *id,
- unsigned int part, size_t size, struct pstore_info *psi)
+ unsigned int part, int count, size_t size,
+ struct pstore_info *psi)
{
char name[DUMP_NAME_LEN];
efi_char16_t efi_name[DUMP_NAME_LEN];
@@ -725,7 +729,7 @@ static int efi_pstore_write(enum pstore_type_id type,
return -ENOSPC;
}
- sprintf(name, "dump-type%u-%u-%lu", type, part,
+ sprintf(name, "dump-type%u-%u-%d-%lu", type, part, count,
get_seconds());
for (i = 0; i < DUMP_NAME_LEN; i++)
@@ -746,7 +750,7 @@ static int efi_pstore_write(enum pstore_type_id type,
return ret;
};
-static int efi_pstore_erase(enum pstore_type_id type, u64 id,
+static int efi_pstore_erase(enum pstore_type_id type, u64 id, int count,
struct timespec time, struct pstore_info *psi)
{
char name[DUMP_NAME_LEN];
@@ -756,7 +760,7 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id,
struct efivar_entry *entry, *found = NULL;
int i;
- sprintf(name, "dump-type%u-%u-%lu", type, (unsigned int)id,
+ sprintf(name, "dump-type%u-%u-%d-%lu", type, (unsigned int)id, count,
time.tv_sec);
spin_lock(&efivars->lock);
@@ -807,7 +811,7 @@ static int efi_pstore_close(struct pstore_info *psi)
return 0;
}
-static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
+static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type, int *count,
struct timespec *timespec,
char **buf, struct pstore_info *psi)
{
diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 4300af6..ed1d8c7 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -49,6 +49,7 @@ struct pstore_private {
struct pstore_info *psi;
enum pstore_type_id type;
u64 id;
+ int count;
ssize_t size;
char data[];
};
@@ -175,8 +176,8 @@ static int pstore_unlink(struct inode *dir, struct dentry *dentry)
struct pstore_private *p = dentry->d_inode->i_private;
if (p->psi->erase)
- p->psi->erase(p->type, p->id, dentry->d_inode->i_ctime,
- p->psi);
+ p->psi->erase(p->type, p->id, p->count,
+ dentry->d_inode->i_ctime, p->psi);
return simple_unlink(dir, dentry);
}
@@ -271,7 +272,7 @@ int pstore_is_mounted(void)
* Load it up with "size" bytes of data from "buf".
* Set the mtime & ctime to the date that this record was originally stored.
*/
-int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id,
+int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id, int count,
char *data, size_t size, struct timespec time,
struct pstore_info *psi)
{
@@ -307,6 +308,7 @@ int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id,
goto fail_alloc;
private->type = type;
private->id = id;
+ private->count = count;
private->psi = psi;
switch (type) {
diff --git a/fs/pstore/internal.h b/fs/pstore/internal.h
index 4847f58..937d820 100644
--- a/fs/pstore/internal.h
+++ b/fs/pstore/internal.h
@@ -50,7 +50,7 @@ extern struct pstore_info *psinfo;
extern void pstore_set_kmsg_bytes(int);
extern void pstore_get_records(int);
extern int pstore_mkfile(enum pstore_type_id, char *psname, u64 id,
- char *data, size_t size,
+ int count, char *data, size_t size,
struct timespec time, struct pstore_info *psi);
extern int pstore_is_mounted(void);
diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index a40da07..f911bbd 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -136,7 +136,7 @@ static void pstore_dump(struct kmsg_dumper *dumper,
break;
ret = psinfo->write(PSTORE_TYPE_DMESG, reason, &id, part,
- hsize + len, psinfo);
+ oopscount, hsize + len, psinfo);
if (ret == 0 && reason == KMSG_DUMP_OOPS && pstore_is_mounted())
pstore_new_entry = 1;
@@ -196,7 +196,7 @@ static void pstore_register_console(void) {}
static int pstore_write_compat(enum pstore_type_id type,
enum kmsg_dump_reason reason,
- u64 *id, unsigned int part,
+ u64 *id, unsigned int part, int count,
size_t size, struct pstore_info *psi)
{
return psi->write_buf(type, reason, id, part, psinfo->buf, size, psi);
@@ -266,6 +266,7 @@ void pstore_get_records(int quiet)
char *buf = NULL;
ssize_t size;
u64 id;
+ int count;
enum pstore_type_id type;
struct timespec time;
int failed = 0, rc;
@@ -277,9 +278,9 @@ void pstore_get_records(int quiet)
if (psi->open && psi->open(psi))
goto out;
- while ((size = psi->read(&id, &type, &time, &buf, psi)) > 0) {
- rc = pstore_mkfile(type, psi->name, id, buf, (size_t)size,
- time, psi);
+ while ((size = psi->read(&id, &type, &count, &time, &buf, psi)) > 0) {
+ rc = pstore_mkfile(type, psi->name, id, count, buf,
+ (size_t)size, time, psi);
kfree(buf);
buf = NULL;
if (rc && (rc != -EEXIST || !quiet))
diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
index 749693f..2bfa36e 100644
--- a/fs/pstore/ram.c
+++ b/fs/pstore/ram.c
@@ -132,9 +132,8 @@ ramoops_get_next_prz(struct persistent_ram_zone *przs[], uint *c, uint max,
}
static ssize_t ramoops_pstore_read(u64 *id, enum pstore_type_id *type,
- struct timespec *time,
- char **buf,
- struct pstore_info *psi)
+ int *count, struct timespec *time,
+ char **buf, struct pstore_info *psi)
{
ssize_t size;
struct ramoops_context *cxt = psi->data;
@@ -236,7 +235,7 @@ static int notrace ramoops_pstore_write_buf(enum pstore_type_id type,
return 0;
}
-static int ramoops_pstore_erase(enum pstore_type_id type, u64 id,
+static int ramoops_pstore_erase(enum pstore_type_id type, u64 id, int count,
struct timespec time, struct pstore_info *psi)
{
struct ramoops_context *cxt = psi->data;
diff --git a/include/linux/pstore.h b/include/linux/pstore.h
index f6e9336..1788909 100644
--- a/include/linux/pstore.h
+++ b/include/linux/pstore.h
@@ -50,17 +50,19 @@ struct pstore_info {
int (*open)(struct pstore_info *psi);
int (*close)(struct pstore_info *psi);
ssize_t (*read)(u64 *id, enum pstore_type_id *type,
- struct timespec *time, char **buf,
+ int *count, struct timespec *time, char **buf,
struct pstore_info *psi);
int (*write)(enum pstore_type_id type,
enum kmsg_dump_reason reason, u64 *id,
- unsigned int part, size_t size, struct pstore_info *psi);
+ unsigned int part, int count, size_t size,
+ struct pstore_info *psi);
int (*write_buf)(enum pstore_type_id type,
enum kmsg_dump_reason reason, u64 *id,
unsigned int part, const char *buf, size_t size,
struct pstore_info *psi);
int (*erase)(enum pstore_type_id type, u64 id,
- struct timespec time, struct pstore_info *psi);
+ int count, struct timespec time,
+ struct pstore_info *psi);
void *data;
};
-- 1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-10-25 01:54:41
|
[Issue]
Currently, a variable name, which is used to identify each log entry, consists of type,
id and ctime. But an erase callback does not use ctime.
If efi_pstore supported just one log, type and id were enough.
However, in case of supporting multiple logs, it doesn't work because
it can't distinguish each entry without ctime at erasing time.
<Example>
As you can see below, efi_pstore can't differentiate first event from second one without ctime.
a variable name of first event: dump-type0-1-12345678
a variable name of second event: dump-type0-1-23456789
type:0
id:1
ctime:12345678, 23456789
[Solution]
This patch adds ctime to an argument of an erase callback.
It works across reboots because ctime of pstore means the date that the record was originally stored.
To do this, efi_pstore saves the ctime to variable name at writing time and passes it to pstore
at reading time.
Signed-off-by: Seiji Aguchi <sei...@hd...>
---
drivers/acpi/apei/erst.c | 4 ++--
drivers/firmware/efivars.c | 15 +++++++--------
fs/pstore/inode.c | 3 ++-
fs/pstore/ram.c | 2 +-
include/linux/pstore.h | 2 +-
5 files changed, 13 insertions(+), 13 deletions(-)
diff --git a/drivers/acpi/apei/erst.c b/drivers/acpi/apei/erst.c
index e4d9d24..0bd6ae4 100644
--- a/drivers/acpi/apei/erst.c
+++ b/drivers/acpi/apei/erst.c
@@ -938,7 +938,7 @@ static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
u64 *id, unsigned int part,
size_t size, struct pstore_info *psi);
static int erst_clearer(enum pstore_type_id type, u64 id,
- struct pstore_info *psi);
+ struct timespec time, struct pstore_info *psi);
static struct pstore_info erst_info = {
.owner = THIS_MODULE,
@@ -1102,7 +1102,7 @@ static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
}
static int erst_clearer(enum pstore_type_id type, u64 id,
- struct pstore_info *psi)
+ struct timespec time, struct pstore_info *psi)
{
return erst_clear(id);
}
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index fbe9202..6cbeea7 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -747,24 +747,25 @@ static int efi_pstore_write(enum pstore_type_id type,
};
static int efi_pstore_erase(enum pstore_type_id type, u64 id,
- struct pstore_info *psi)
+ struct timespec time, struct pstore_info *psi)
{
- char stub_name[DUMP_NAME_LEN];
+ char name[DUMP_NAME_LEN];
efi_char16_t efi_name[DUMP_NAME_LEN];
efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
struct efivars *efivars = psi->data;
struct efivar_entry *entry, *found = NULL;
int i;
- sprintf(stub_name, "dump-type%u-%u-", type, (unsigned int)id);
+ sprintf(name, "dump-type%u-%u-%lu", type, (unsigned int)id,
+ time.tv_sec);
spin_lock(&efivars->lock);
for (i = 0; i < DUMP_NAME_LEN; i++)
- efi_name[i] = stub_name[i];
+ efi_name[i] = name[i];
/*
- * Clean up any entries with the same name
+ * Clean up an entry with the same name
*/
list_for_each_entry(entry, &efivars->list, list) {
@@ -775,9 +776,6 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id,
if (utf16_strncmp(entry->var.VariableName, efi_name,
utf16_strlen(efi_name)))
continue;
- /* Needs to be a prefix */
- if (entry->var.VariableName[utf16_strlen(efi_name)] == 0)
- continue;
/* found */
found = entry;
@@ -785,6 +783,7 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id,
&entry->var.VendorGuid,
PSTORE_EFI_ATTRIBUTES,
0, NULL);
+ break;
}
if (found)
diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 4ab572e..4300af6 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -175,7 +175,8 @@ static int pstore_unlink(struct inode *dir, struct dentry *dentry)
struct pstore_private *p = dentry->d_inode->i_private;
if (p->psi->erase)
- p->psi->erase(p->type, p->id, p->psi);
+ p->psi->erase(p->type, p->id, dentry->d_inode->i_ctime,
+ p->psi);
return simple_unlink(dir, dentry);
}
diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
index 1a4f6da..749693f 100644
--- a/fs/pstore/ram.c
+++ b/fs/pstore/ram.c
@@ -237,7 +237,7 @@ static int notrace ramoops_pstore_write_buf(enum pstore_type_id type,
}
static int ramoops_pstore_erase(enum pstore_type_id type, u64 id,
- struct pstore_info *psi)
+ struct timespec time, struct pstore_info *psi)
{
struct ramoops_context *cxt = psi->data;
struct persistent_ram_zone *prz;
diff --git a/include/linux/pstore.h b/include/linux/pstore.h
index ee3034a..f6e9336 100644
--- a/include/linux/pstore.h
+++ b/include/linux/pstore.h
@@ -60,7 +60,7 @@ struct pstore_info {
unsigned int part, const char *buf, size_t size,
struct pstore_info *psi);
int (*erase)(enum pstore_type_id type, u64 id,
- struct pstore_info *psi);
+ struct timespec time, struct pstore_info *psi);
void *data;
};
-- 1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-10-25 01:54:10
|
[Issue]
Currently, efi_pstore driver simply overwrites existing panic messages in NVRAM.
So, in the following scenario, we will lose 1st panic messages.
1. kernel panics.
2. efi_pstore is kicked and writes panic messages to NVRAM.
3. system reboots.
4. kernel panics again before a user checks the 1st panic messages in NVRAM.
[Solution]
A reasonable solution to fix the issue is just holding multiple logs without erasing
existing entries.
This patch removes a logic erasing existing entries in a write callback
because the logic is not needed in the write callback to support holding multiple logs.
Signed-off-by: Seiji Aguchi <sei...@hd...>
---
drivers/firmware/efivars.c | 39 ++-------------------------------------
1 files changed, 2 insertions(+), 37 deletions(-)
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index bee14cc..fbe9202 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -701,18 +701,13 @@ static int efi_pstore_write(enum pstore_type_id type,
unsigned int part, size_t size, struct pstore_info *psi)
{
char name[DUMP_NAME_LEN];
- char stub_name[DUMP_NAME_LEN];
efi_char16_t efi_name[DUMP_NAME_LEN];
efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
struct efivars *efivars = psi->data;
- struct efivar_entry *entry, *found = NULL;
int i, ret = 0;
u64 storage_space, remaining_space, max_variable_size;
efi_status_t status = EFI_NOT_FOUND;
- sprintf(stub_name, "dump-type%u-%u-", type, part);
- sprintf(name, "%s%lu", stub_name, get_seconds());
-
spin_lock(&efivars->lock);
/*
@@ -730,35 +725,8 @@ static int efi_pstore_write(enum pstore_type_id type,
return -ENOSPC;
}
- for (i = 0; i < DUMP_NAME_LEN; i++)
- efi_name[i] = stub_name[i];
-
- /*
- * Clean up any entries with the same name
- */
-
- list_for_each_entry(entry, &efivars->list, list) {
- get_var_data_locked(efivars, &entry->var);
-
- if (efi_guidcmp(entry->var.VendorGuid, vendor))
- continue;
- if (utf16_strncmp(entry->var.VariableName, efi_name,
- utf16_strlen(efi_name)))
- continue;
- /* Needs to be a prefix */
- if (entry->var.VariableName[utf16_strlen(efi_name)] == 0)
- continue;
-
- /* found */
- found = entry;
- efivars->ops->set_variable(entry->var.VariableName,
- &entry->var.VendorGuid,
- PSTORE_EFI_ATTRIBUTES,
- 0, NULL);
- }
-
- if (found)
- list_del(&found->list);
+ sprintf(name, "dump-type%u-%u-%lu", type, part,
+ get_seconds());
for (i = 0; i < DUMP_NAME_LEN; i++)
efi_name[i] = name[i];
@@ -768,9 +736,6 @@ static int efi_pstore_write(enum pstore_type_id type,
spin_unlock(&efivars->lock);
- if (found)
- efivar_unregister(found);
-
if (size)
ret = efivar_create_sysfs_entry(efivars,
utf16_strsize(efi_name,
-- 1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-10-25 01:53:33
|
[Issue]
Currently, efi_pstore driver simply overwrites existing panic messages in NVRAM.
So, in the following scenario, we will lose 1st panic messages.
1. kernel panics.
2. efi_pstore is kicked and writes panic messages to NVRAM.
3. system reboots.
4. kernel panics again before a user checks the 1st panic messages in NVRAM.
[Solution]
A reasonable solution to fix the issue is just holding multiple logs without erasing
existing entries.
This patch freshly adds a logic erasing existing entries, which shared with a write callback,
to an erase callback.
To support holding multiple logs, the write callback doesn't need to erase any entries and
it will be removed in a subsequent patch.
Signed-off-by: Seiji Aguchi <sei...@hd...>
---
drivers/firmware/efivars.c | 46 +++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 45 insertions(+), 1 deletions(-)
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index 37ac21a..bee14cc 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -784,7 +784,51 @@ static int efi_pstore_write(enum pstore_type_id type,
static int efi_pstore_erase(enum pstore_type_id type, u64 id,
struct pstore_info *psi)
{
- efi_pstore_write(type, 0, &id, (unsigned int)id, 0, psi);
+ char stub_name[DUMP_NAME_LEN];
+ efi_char16_t efi_name[DUMP_NAME_LEN];
+ efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
+ struct efivars *efivars = psi->data;
+ struct efivar_entry *entry, *found = NULL;
+ int i;
+
+ sprintf(stub_name, "dump-type%u-%u-", type, (unsigned int)id);
+
+ spin_lock(&efivars->lock);
+
+ for (i = 0; i < DUMP_NAME_LEN; i++)
+ efi_name[i] = stub_name[i];
+
+ /*
+ * Clean up any entries with the same name
+ */
+
+ list_for_each_entry(entry, &efivars->list, list) {
+ get_var_data_locked(efivars, &entry->var);
+
+ if (efi_guidcmp(entry->var.VendorGuid, vendor))
+ continue;
+ if (utf16_strncmp(entry->var.VariableName, efi_name,
+ utf16_strlen(efi_name)))
+ continue;
+ /* Needs to be a prefix */
+ if (entry->var.VariableName[utf16_strlen(efi_name)] == 0)
+ continue;
+
+ /* found */
+ found = entry;
+ efivars->ops->set_variable(entry->var.VariableName,
+ &entry->var.VendorGuid,
+ PSTORE_EFI_ATTRIBUTES,
+ 0, NULL);
+ }
+
+ if (found)
+ list_del(&found->list);
+
+ spin_unlock(&efivars->lock);
+
+ if (found)
+ efivar_unregister(found);
return 0;
}
-- 1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2012-10-25 01:51:37
|
[Issue] As discussed in a thread below, Running out of space in EFI isn't a well-tested scenario. And we wouldn't expect all firmware to handle it gracefully. http://marc.info/?l=linux-kernel&m=134305325801789&w=2 On the other hand, current efi_pstore doesn't check a remaining space of storage at writing time. Therefore, efi_pstore may not work if it tries to write a large amount of data. [Patch Description] To avoid handling the situation above, this patch checks if there is a space enough to log with QueryVariableInfo() before writing data. Signed-off-by: Seiji Aguchi <sei...@hd...> --- drivers/firmware/efivars.c | 18 ++++++++++++++++++ include/linux/efi.h | 1 + 2 files changed, 19 insertions(+), 0 deletions(-) diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c index d10c987..37ac21a 100644 --- a/drivers/firmware/efivars.c +++ b/drivers/firmware/efivars.c @@ -707,12 +707,29 @@ static int efi_pstore_write(enum pstore_type_id type, struct efivars *efivars = psi->data; struct efivar_entry *entry, *found = NULL; int i, ret = 0; + u64 storage_space, remaining_space, max_variable_size; + efi_status_t status = EFI_NOT_FOUND; sprintf(stub_name, "dump-type%u-%u-", type, part); sprintf(name, "%s%lu", stub_name, get_seconds()); spin_lock(&efivars->lock); + /* + * Check if there is a space enough to log. + * size: a size of logging data + * DUMP_NAME_LEN * 2: a maximum size of variable name + */ + status = efivars->ops->query_variable_info(PSTORE_EFI_ATTRIBUTES, + &storage_space, + &remaining_space, + &max_variable_size); + if (status || remaining_space < size + DUMP_NAME_LEN * 2) { + spin_unlock(&efivars->lock); + *id = part; + return -ENOSPC; + } + for (i = 0; i < DUMP_NAME_LEN; i++) efi_name[i] = stub_name[i]; @@ -1237,6 +1254,7 @@ efivars_init(void) ops.get_variable = efi.get_variable; ops.set_variable = efi.set_variable; ops.get_next_variable = efi.get_next_variable; + ops.query_variable_info = efi.query_variable_info; error = register_efivars(&__efivars, &ops, efi_kobj); if (error) goto err_put; diff --git a/include/linux/efi.h b/include/linux/efi.h index 8670eb1..c47ec36 100644 --- a/include/linux/efi.h +++ b/include/linux/efi.h @@ -643,6 +643,7 @@ struct efivar_operations { efi_get_variable_t *get_variable; efi_get_next_variable_t *get_next_variable; efi_set_variable_t *set_variable; + efi_query_variable_info_t *query_variable_info; }; struct efivars { -- 1.7.1 |
|
From: Seiji A. <sei...@hd...> - 2012-10-25 01:49:47
|
Changelog
v1 -> v2
- Separate into 5 patches in accordance with Mike's comment
- Erase an extra line of comment in patch 1/5
[Issue]
Currently, efi_pstore driver simply overwrites existing panic messages in NVRAM.
So, in the following scenario, we will lose 1st panic messages.
1. kernel panics.
2. efi_pstore is kicked and writes panic messages to NVRAM.
3. system reboots.
4. kernel panics again before a user checks the 1st panic messages in NVRAM.
[Solution]
Solutions of this problem has been discussed among Tony, Matthew, Don, Mike and me.
http://marc.info/?l=linux-kernel&m=134273270704586&w=2
And there are two possible solutions right now.
- First one is introducing some policy overwriting existing logs.
- Second one is simply holding multiple log without overwriting any entries.
We haven't decided the overwriting policy which is reasonable to all users yet.
But I believe we agree that just holding multiple logs is a reasonable way.
We may need further discussions to find the possibility of introducing overwriting
policy, especially getting critical messages in multiple oops case.
But I would like to begin with a simple and reasonable way to everyone.
So, this patch takes an approach just holding multiple logs.
[Patch Description]
(1/5) efi_pstore: Check remaining space with QueryVariableInfo() before writing data
(2/5) efi_pstore: Add a logic erasing entries to an erase callback
(3/5) efi_pstore: Remove a logic erasing entries from a write callback to hold multiple logs
(4/5) efi_pstore: Add ctime to argument of erase callback
(5/5) efi_pstore: Add a sequence counter to a variable name
Detailed explanations are written in each patch.
drivers/acpi/apei/erst.c | 16 ++++----
drivers/firmware/efivars.c | 100 ++++++++++++++++++++++++++++---------------
fs/pstore/inode.c | 7 ++-
fs/pstore/internal.h | 2 +-
fs/pstore/platform.c | 11 +++--
fs/pstore/ram.c | 9 ++--
include/linux/efi.h | 1 +
include/linux/pstore.h | 6 ++-
8 files changed, 94 insertions(+), 58 deletions(-)
|
|
From: Seiji A. <sei...@hd...> - 2012-10-19 14:26:51
|
Mike,
Thank you for reviewing my patch.
Here is my comment.
> > - Write callback
> > - Check if there are enough spaces to write logs with QueryVariableInfo()
> > to avoid handling out of space situation. (It is suggested by
> > Matthew Garrett.)
> >
>
> I would prefer to see the exposing query_variable_info callback as a separate patch. The check in the patch looks good.
>
OK. I will make a separate patch.
> > - Remove a logic erasing existing entries.
> >
> > - Erase callback
> > - Freshly create a logic rather than sharing it with a write callback because
> > erasing entries doesn't need in a write callback.
> >
>
> This sort of change is a lot easier to review if you copy and paste the routine in a patch separate from this one.
>
I will separate it as well.
> > + /*
> > + * Check if there is a space enough to log.
> > + * size: a size of logging data
> > + * DUMP_NAME_LEN * 2: a maximum size of variable name
> > + *
>
> extra line
>
Will fix.
> > +
> > + efivars->ops->set_variable(efi_name, &vendor, PSTORE_EFI_ATTRIBUTES,
> > + size, psi->buf);
> > +
> > + spin_unlock(&efivars->lock);
> > +
> > + if (size)
> > + ret = efivar_create_sysfs_entry(efivars,
> > + utf16_strsize(efi_name,
> > + DUMP_NAME_LEN * 2),
> > + efi_name, &vendor);
>
> What is happening here? Why is size checked for non-zero?
>
I just copied an original code sharing a logic of write callback and erase callback..
But, if size is zero, we don't need to create a sysfs file because a set_variable service erases the entry.
It is defined in EFI specification.
So, I think this code is right.
> > +++ b/fs/pstore/inode.c
> > @@ -175,7 +175,8 @@ static int pstore_unlink(struct inode *dir, struct dentry *dentry)
> > struct pstore_private *p = dentry->d_inode->i_private;
> >
> > if (p->psi->erase)
> > - p->psi->erase(p->type, p->id, p->psi);
> > + p->psi->erase(p->type, p->id, dentry->d_inode->i_ctime,
> > + p->psi);
>
> This doesn't look right. What guarantees that the i_ctime means anything across reboots?
>
Ctime is persistent across reboots because ctime of pstore means the date that the recode was
originally stored. (See below.)
To do this, efi_pstore saves the ctime to variable name at writing time and passes it to pstore at reading time.
fs/pstore/inode.c:
<snip>
/*
* Make a regular file in the root directory of our file system.
* Load it up with "size" bytes of data from "buf".
* Set the mtime & ctime to the date that this record was originally stored.
*/
int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id, int count,
char *data, size_t size, struct timespec time,
struct pstore_info *psi) {
struct dentry *root = pstore_sb->s_root;
struct dentry *dentry;
struct inode *inode;
int rc = 0;
char name[PSTORE_NAMELEN];
<snip>
Seiji
|