You can subscribe to this list here.
| 2009 |
Jan
|
Feb
|
Mar
|
Apr
|
May
(32) |
Jun
(66) |
Jul
(102) |
Aug
(78) |
Sep
(106) |
Oct
(137) |
Nov
(147) |
Dec
(147) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2010 |
Jan
(71) |
Feb
(139) |
Mar
(86) |
Apr
(76) |
May
(57) |
Jun
(10) |
Jul
(12) |
Aug
(6) |
Sep
(8) |
Oct
(12) |
Nov
(12) |
Dec
(18) |
| 2011 |
Jan
(16) |
Feb
(19) |
Mar
(3) |
Apr
(1) |
May
(16) |
Jun
(17) |
Jul
(74) |
Aug
(22) |
Sep
(18) |
Oct
(24) |
Nov
(21) |
Dec
(30) |
| 2012 |
Jan
(31) |
Feb
(16) |
Mar
(22) |
Apr
(25) |
May
(18) |
Jun
(13) |
Jul
(83) |
Aug
(49) |
Sep
(20) |
Oct
(60) |
Nov
(35) |
Dec
(28) |
| 2013 |
Jan
(39) |
Feb
(61) |
Mar
(35) |
Apr
(21) |
May
(45) |
Jun
(56) |
Jul
(20) |
Aug
(9) |
Sep
(10) |
Oct
(31) |
Nov
(8) |
Dec
(4) |
| 2014 |
Jan
(6) |
Feb
(7) |
Mar
(7) |
Apr
(6) |
May
(4) |
Jun
(8) |
Jul
(5) |
Aug
(2) |
Sep
(4) |
Oct
(4) |
Nov
(11) |
Dec
(5) |
| 2015 |
Jan
(4) |
Feb
(4) |
Mar
(3) |
Apr
(4) |
May
(9) |
Jun
(4) |
Jul
(15) |
Aug
(8) |
Sep
(16) |
Oct
(18) |
Nov
(15) |
Dec
(7) |
| 2016 |
Jan
(20) |
Feb
(9) |
Mar
(15) |
Apr
(24) |
May
(16) |
Jun
(28) |
Jul
(22) |
Aug
(23) |
Sep
(18) |
Oct
(30) |
Nov
(40) |
Dec
(9) |
| 2017 |
Jan
(1) |
Feb
(8) |
Mar
(37) |
Apr
(26) |
May
(25) |
Jun
(46) |
Jul
(24) |
Aug
(9) |
Sep
|
Oct
|
Nov
|
Dec
|
|
From: Matt F. <mat...@in...> - 2013-02-08 23:16:32
|
On Fri, 2013-02-08 at 22:31 +0000, Seiji Aguchi wrote: > Matt, > > Can you please take a look at this patchset which removes > create_sysfs_entries() from efi_pstore_write()? > > It has been updated in accordance with Mike's comment and fixes an > actual bug. > http://comments.gmane.org/gmane.linux.kernel.efi/406 OK, I'll find some time to review these. |
|
From: Seiji A. <sei...@hd...> - 2013-02-08 22:31:37
|
Matt, Can you please take a look at this patchset which removes create_sysfs_entries() from efi_pstore_write()? It has been updated in accordance with Mike's comment and fixes an actual bug. http://comments.gmane.org/gmane.linux.kernel.efi/406 Seiji > -----Original Message----- > From: Seiji Aguchi > Sent: Wednesday, January 23, 2013 7:41 PM > To: lin...@vg...; lin...@vg...; Luck, Tony (ton...@in...); mi...@go...; > mat...@in...; cbo...@gm...; dz...@re... > Cc: dle...@li...; Satoru Moriya > Subject: [PATCH v5 -next 0/2] make efivars/efi_pstore interrupt-safe > > Changelogv4 -> v5 > - Rebase from a linus tree to a linux-next tree to avoid getting > a conflict when this patchset is merged to a linux-next tree. > - Merge previous patches 2/3 and 3/3 into 2/2 because they fix > a same problem. > - Modify to fit a latest upstream kernel as follows. > - Change spinlock operations of efivarfs which has been > introduced recently.(Patch 1/2) > - Remove delete_all_stale_sysfs_entries() from update_sysfs_entries() > because a currnet efi_pstore doesn't erase existing entries > in a write callback and sysfs entries don't become stale. > (Patch 2/2) > > v3 -> v4 > - Patch 2/3 > Move cancel_work_sync() above an efi_enabled test in efivars_exit(). > > v2 -> v3 > - Patch 1/3 > Replace spin_lock_irqsave/spin_unlock_irqrestore with spin_lock_irq/spin_unlock_irq in efivars_unregister(), > efivar_create(), efivar_store_raw() and efivar_delete() which are called in a process context. > > - Patch 2/3 > Change a name of delete_sysfs_entry() to delete_all_stale_sysfs_entries(). > Also, don't release an efivar->lock while searching efivar->list in delete_all_stale_sysfs_entries(). > > - Patch 3/3 > Remove a logic in efi_pstore_erase() which freshly created in patch v2. > > v1 -> v2 > - Patch 1/3 > Add spin_lock_irq/spin_unlock_irq to open/close callbacks of efi_pstore > instead of moving spin_locks to a read callback. > > - Patch 2/3 > Replace a periodical timer with schedule_work(). > > - Patch 3/3 > freshly create to kick a workqueue in oops case only. > > [Problem] > There are following problems related to an interrupt context in efivars including efivarfs and efi_pstore. > > (1)There is a scenario which efi_pstore fails to log messages > in a panic case. > > - CPUA holds an efi_var->lock in either efivarfs parts > or efi_pstore with interrupt enabled. > - CPUB panics and sends IPI to CPUA in smp_send_stop(). > - CPUA stops with holding the lock. > - CPUB kicks efi_pstore_write() via kmsg_dump(KSMG_DUMP_PANIC) > but it returns without logging messages. > > (2)Also, efi_pstore creates sysfs entries, which enable users to access to > NVRAM, in a write callback. > If a kernel panic happens in an interrupt contexts, pstore may fail > because it could sleep due to dynamic memory allocations during creating > sysfs entries. > An actual failure due to the create_sysfs_entry() has been reported. > http://comments.gmane.org/gmane.linux.kernel.efi/406 > > To resolve problems above, this patchset makes efivars/efi_pstore interrupt-safe. > > [Patch Description] > Please see detailed explanations in each patch. > > Seiji Aguchi (2): > efivars: Disable external interrupt while holding efivars->lock > - This patch fixes a problem (1). > > efi_pstore: Introducing workqueue updating sysfs entries > - This patch fixes a problem (2). > > drivers/firmware/efivars.c | 171 ++++++++++++++++++++++++++++++++------------ > include/linux/efi.h | 3 +- > 2 files changed, 126 insertions(+), 48 deletions(-) |
|
From: Stefan H. <ste...@gm...> - 2013-02-05 12:41:13
|
On Mon, Feb 04, 2013 at 03:58:29PM +0000, Seiji Aguchi wrote: > > Also remember that management tools can pass a pipe as stderr when starting QEMU, and > > then they can add their preferred timestamping/log formatting outside of QEMU. > > To get an exact timestamp, qemu should supply it to the messages. > Also, there are some customers who use qemu in a small environment. > In that case, they don't use the management like vdsm. Fair enough, I think an option doesn't hurt anyone. Stefan |
|
From: Seiji A. <sei...@hd...> - 2013-02-04 22:50:54
|
[Issue]
Currently, irq vector handlers for tracing are just
copied non-trace handlers by simply inserting tracepoints.
It is difficult to manage the codes.
[Solution]
This patch shares common codes between non-trace and trace handlers
as follows to make them manageable and readable.
Non-trace irq handler:
smp_irq_handler()
{
entering_irq(); /* pre-processing of this handler */
__smp_irq_handler(); /*
* common logic between non-trace and trace handlers
* in a vector.
*/
exiting_irq(); /* post-processing of this handler */
}
Trace irq_handler:
smp_trace_irq_handler()
{
entering_irq(); /* pre-processing of this handler */
trace_irq_entry(); /* tracepoint for irq entry */
__smp_irq_handler(); /*
* common logic between non-trace and trace handlers
* in a vector.
*/
trace_irq_exit(); /* tracepoint for irq exit */
exiting_irq(); /* post-processing of this handler */
}
Signed-off-by: Seiji Aguchi <sei...@hd...>
---
arch/x86/include/asm/apic.h | 27 ++++++++
arch/x86/kernel/apic/apic.c | 103 ++++++++----------------------
arch/x86/kernel/cpu/mcheck/therm_throt.c | 24 +++----
arch/x86/kernel/cpu/mcheck/threshold.c | 24 +++----
arch/x86/kernel/irq.c | 34 +++-------
arch/x86/kernel/irq_work.c | 22 ++++--
arch/x86/kernel/smp.c | 54 ++++++++++------
7 files changed, 137 insertions(+), 151 deletions(-)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 3388034..f8119b5 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -12,6 +12,7 @@
#include <asm/fixmap.h>
#include <asm/mpspec.h>
#include <asm/msr.h>
+#include <asm/idle.h>
#define ARCH_APICTIMER_STOPS_ON_C3 1
@@ -687,5 +688,31 @@ extern int default_check_phys_apicid_present(int phys_apicid);
#endif
#endif /* CONFIG_X86_LOCAL_APIC */
+extern void irq_enter(void);
+extern void irq_exit(void);
+
+static inline void entering_irq(void)
+{
+ irq_enter();
+ exit_idle();
+}
+
+static inline void entering_ack_irq(void)
+{
+ ack_APIC_irq();
+ entering_irq();
+}
+
+static inline void exiting_irq(void)
+{
+ irq_exit();
+}
+
+static inline void exiting_ack_irq(void)
+{
+ irq_exit();
+ /* Ack only at the end to avoid potential reentry */
+ ack_APIC_irq();
+}
#endif /* _ASM_X86_APIC_H */
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 89f3f4d..c146cbc 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -922,17 +922,14 @@ void __irq_entry smp_apic_timer_interrupt(struct pt_regs *regs)
/*
* NOTE! We'd better ACK the irq immediately,
* because timer handling can be slow.
- */
- ack_APIC_irq();
- /*
+ *
* update_process_times() expects us to have done irq_enter().
* Besides, if we don't timer interrupts ignore the global
* interrupt lock, which is the WrongThing (tm) to do.
*/
- irq_enter();
- exit_idle();
+ entering_ack_irq();
local_apic_timer_interrupt();
- irq_exit();
+ exiting_irq();
set_irq_regs(old_regs);
}
@@ -944,19 +941,16 @@ void __irq_entry smp_trace_apic_timer_interrupt(struct pt_regs *regs)
/*
* NOTE! We'd better ACK the irq immediately,
* because timer handling can be slow.
- */
- ack_APIC_irq();
- /*
+ *
* update_process_times() expects us to have done irq_enter().
* Besides, if we don't timer interrupts ignore the global
* interrupt lock, which is the WrongThing (tm) to do.
*/
- irq_enter();
- exit_idle();
+ entering_ack_irq();
trace_local_timer_entry(LOCAL_TIMER_VECTOR);
local_apic_timer_interrupt();
trace_local_timer_exit(LOCAL_TIMER_VECTOR);
- irq_exit();
+ exiting_irq();
set_irq_regs(old_regs);
}
@@ -1935,12 +1929,10 @@ int __init APIC_init_uniprocessor(void)
/*
* This interrupt should _never_ happen with our APIC/SMP architecture
*/
-void smp_spurious_interrupt(struct pt_regs *regs)
+static inline void __smp_spurious_interrupt(void)
{
u32 v;
- irq_enter();
- exit_idle();
/*
* Check if this really is a spurious interrupt and ACK it
* if it is a vectored one. Just in case...
@@ -1955,38 +1947,28 @@ void smp_spurious_interrupt(struct pt_regs *regs)
/* see sw-dev-man vol 3, chapter 7.4.13.5 */
pr_info("spurious APIC interrupt on CPU#%d, "
"should never happen.\n", smp_processor_id());
- irq_exit();
}
-void smp_trace_spurious_interrupt(struct pt_regs *regs)
+void smp_spurious_interrupt(struct pt_regs *regs)
{
- u32 v;
+ entering_irq();
+ __smp_spurious_interrupt();
+ exiting_irq();
+}
- irq_enter();
- exit_idle();
+void smp_trace_spurious_interrupt(struct pt_regs *regs)
+{
+ entering_irq();
trace_spurious_apic_entry(SPURIOUS_APIC_VECTOR);
- /*
- * Check if this really is a spurious interrupt and ACK it
- * if it is a vectored one. Just in case...
- * Spurious interrupts should not be ACKed.
- */
- v = apic_read(APIC_ISR + ((SPURIOUS_APIC_VECTOR & ~0x1f) >> 1));
- if (v & (1 << (SPURIOUS_APIC_VECTOR & 0x1f)))
- ack_APIC_irq();
-
- inc_irq_stat(irq_spurious_count);
-
- /* see sw-dev-man vol 3, chapter 7.4.13.5 */
- pr_info("spurious APIC interrupt on CPU#%d, "
- "should never happen.\n", smp_processor_id());
+ __smp_spurious_interrupt();
trace_spurious_apic_exit(SPURIOUS_APIC_VECTOR);
- irq_exit();
+ exiting_irq();
}
/*
* This interrupt should never happen with our APIC/SMP architecture
*/
-void smp_error_interrupt(struct pt_regs *regs)
+static inline void __smp_error_interrupt(struct pt_regs *regs)
{
u32 v0, v1;
u32 i = 0;
@@ -2001,8 +1983,6 @@ void smp_error_interrupt(struct pt_regs *regs)
"Illegal register address", /* APIC Error Bit 7 */
};
- irq_enter();
- exit_idle();
/* First tickle the hardware, only then report what went on. -- REW */
v0 = apic_read(APIC_ESR);
apic_write(APIC_ESR, 0);
@@ -2023,49 +2003,22 @@ void smp_error_interrupt(struct pt_regs *regs)
apic_printk(APIC_DEBUG, KERN_CONT "\n");
- irq_exit();
}
-void smp_trace_error_interrupt(struct pt_regs *regs)
+void smp_error_interrupt(struct pt_regs *regs)
{
- u32 v0, v1;
- u32 i = 0;
- static const char * const error_interrupt_reason[] = {
- "Send CS error", /* APIC Error Bit 0 */
- "Receive CS error", /* APIC Error Bit 1 */
- "Send accept error", /* APIC Error Bit 2 */
- "Receive accept error", /* APIC Error Bit 3 */
- "Redirectable IPI", /* APIC Error Bit 4 */
- "Send illegal vector", /* APIC Error Bit 5 */
- "Received illegal vector", /* APIC Error Bit 6 */
- "Illegal register address", /* APIC Error Bit 7 */
- };
+ entering_irq();
+ __smp_error_interrupt(regs);
+ exiting_irq();
+}
- irq_enter();
- exit_idle();
+void smp_trace_error_interrupt(struct pt_regs *regs)
+{
+ entering_irq();
trace_error_apic_entry(ERROR_APIC_VECTOR);
- /* First tickle the hardware, only then report what went on. -- REW */
- v0 = apic_read(APIC_ESR);
- apic_write(APIC_ESR, 0);
- v1 = apic_read(APIC_ESR);
- ack_APIC_irq();
- atomic_inc(&irq_err_count);
-
- apic_printk(APIC_DEBUG, KERN_DEBUG "APIC error on CPU%d: %02x(%02x)",
- smp_processor_id(), v0 , v1);
-
- v1 = v1 & 0xff;
- while (v1) {
- if (v1 & 0x1)
- apic_printk(APIC_DEBUG, KERN_CONT " : %s", error_interrupt_reason[i]);
- i++;
- v1 >>= 1;
- }
-
- apic_printk(APIC_DEBUG, KERN_CONT "\n");
-
+ __smp_error_interrupt(regs);
trace_error_apic_exit(ERROR_APIC_VECTOR);
- irq_exit();
+ exiting_irq();
}
/**
diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
index e7aa7fc..2f3a799 100644
--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -379,28 +379,26 @@ static void unexpected_thermal_interrupt(void)
static void (*smp_thermal_vector)(void) = unexpected_thermal_interrupt;
-asmlinkage void smp_thermal_interrupt(struct pt_regs *regs)
+static inline void __smp_thermal_interrupt(void)
{
- irq_enter();
- exit_idle();
inc_irq_stat(irq_thermal_count);
smp_thermal_vector();
- irq_exit();
- /* Ack only at the end to avoid potential reentry */
- ack_APIC_irq();
+}
+
+asmlinkage void smp_thermal_interrupt(struct pt_regs *regs)
+{
+ entering_irq();
+ __smp_thermal_interrupt();
+ exiting_ack_irq();
}
asmlinkage void smp_trace_thermal_interrupt(struct pt_regs *regs)
{
- irq_enter();
- exit_idle();
+ entering_irq();
trace_thermal_apic_entry(THERMAL_APIC_VECTOR);
- inc_irq_stat(irq_thermal_count);
- smp_thermal_vector();
+ __smp_thermal_interrupt();
trace_thermal_apic_exit(THERMAL_APIC_VECTOR);
- irq_exit();
- /* Ack only at the end to avoid potential reentry */
- ack_APIC_irq();
+ exiting_ack_irq();
}
/* Thermal monitoring depends on APIC, ACPI and clock modulation */
diff --git a/arch/x86/kernel/cpu/mcheck/threshold.c b/arch/x86/kernel/cpu/mcheck/threshold.c
index 0cbef99..fe6b1c8 100644
--- a/arch/x86/kernel/cpu/mcheck/threshold.c
+++ b/arch/x86/kernel/cpu/mcheck/threshold.c
@@ -18,26 +18,24 @@ static void default_threshold_interrupt(void)
void (*mce_threshold_vector)(void) = default_threshold_interrupt;
-asmlinkage void smp_threshold_interrupt(void)
+static inline void __smp_threshold_interrupt(void)
{
- irq_enter();
- exit_idle();
inc_irq_stat(irq_threshold_count);
mce_threshold_vector();
- irq_exit();
- /* Ack only at the end to avoid potential reentry */
- ack_APIC_irq();
+}
+
+asmlinkage void smp_threshold_interrupt(void)
+{
+ entering_irq();
+ __smp_threshold_interrupt();
+ exiting_ack_irq();
}
asmlinkage void smp_trace_threshold_interrupt(void)
{
- irq_enter();
- exit_idle();
+ entering_irq();
trace_threshold_apic_entry(THRESHOLD_APIC_VECTOR);
- inc_irq_stat(irq_threshold_count);
- mce_threshold_vector();
+ __smp_threshold_interrupt();
trace_threshold_apic_exit(THRESHOLD_APIC_VECTOR);
- irq_exit();
- /* Ack only at the end to avoid potential reentry */
- ack_APIC_irq();
+ exiting_ack_irq();
}
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 216bec1..ae836cd 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -209,23 +209,21 @@ unsigned int __irq_entry do_IRQ(struct pt_regs *regs)
/*
* Handler for X86_PLATFORM_IPI_VECTOR.
*/
-void smp_x86_platform_ipi(struct pt_regs *regs)
+void __smp_x86_platform_ipi(void)
{
- struct pt_regs *old_regs = set_irq_regs(regs);
-
- ack_APIC_irq();
-
- irq_enter();
-
- exit_idle();
-
inc_irq_stat(x86_platform_ipis);
if (x86_platform_ipi_callback)
x86_platform_ipi_callback();
+}
- irq_exit();
+void smp_x86_platform_ipi(struct pt_regs *regs)
+{
+ struct pt_regs *old_regs = set_irq_regs(regs);
+ entering_ack_irq();
+ __smp_x86_platform_ipi();
+ exiting_irq();
set_irq_regs(old_regs);
}
@@ -233,21 +231,11 @@ void smp_trace_x86_platform_ipi(struct pt_regs *regs)
{
struct pt_regs *old_regs = set_irq_regs(regs);
- ack_APIC_irq();
-
- irq_enter();
-
- exit_idle();
-
+ entering_ack_irq();
trace_x86_platform_ipi_entry(X86_PLATFORM_IPI_VECTOR);
- inc_irq_stat(x86_platform_ipis);
-
- if (x86_platform_ipi_callback)
- x86_platform_ipi_callback();
-
+ __smp_x86_platform_ipi();
trace_x86_platform_ipi_exit(X86_PLATFORM_IPI_VECTOR);
- irq_exit();
-
+ exiting_irq();
set_irq_regs(old_regs);
}
diff --git a/arch/x86/kernel/irq_work.c b/arch/x86/kernel/irq_work.c
index 09e6262..636a55e 100644
--- a/arch/x86/kernel/irq_work.c
+++ b/arch/x86/kernel/irq_work.c
@@ -10,24 +10,32 @@
#include <asm/apic.h>
#include <asm/trace/irq_vectors.h>
-void smp_irq_work_interrupt(struct pt_regs *regs)
+static inline void irq_work_entering_irq(void)
{
irq_enter();
ack_APIC_irq();
+}
+
+static inline void __smp_irq_work_interrupt(void)
+{
inc_irq_stat(apic_irq_work_irqs);
irq_work_run();
- irq_exit();
+}
+
+void smp_irq_work_interrupt(struct pt_regs *regs)
+{
+ irq_work_entering_irq();
+ __smp_irq_work_interrupt();
+ exiting_irq();
}
void smp_trace_irq_work_interrupt(struct pt_regs *regs)
{
- irq_enter();
- ack_APIC_irq();
+ irq_work_entering_irq();
trace_irq_work_entry(IRQ_WORK_VECTOR);
- inc_irq_stat(apic_irq_work_irqs);
- irq_work_run();
+ __smp_irq_work_interrupt();
trace_irq_work_exit(IRQ_WORK_VECTOR);
- irq_exit();
+ exiting_irq();
}
void arch_irq_work_raise(void)
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index aad58af..f4fe0b8 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -250,11 +250,16 @@ finish:
/*
* Reschedule call back.
*/
-void smp_reschedule_interrupt(struct pt_regs *regs)
+static inline void __smp_reschedule_interrupt(void)
{
- ack_APIC_irq();
inc_irq_stat(irq_resched_count);
scheduler_ipi();
+}
+
+void smp_reschedule_interrupt(struct pt_regs *regs)
+{
+ ack_APIC_irq();
+ __smp_reschedule_interrupt();
/*
* KVM uses this interrupt to force a cpu out of guest mode
*/
@@ -264,52 +269,61 @@ void smp_trace_reschedule_interrupt(struct pt_regs *regs)
{
ack_APIC_irq();
trace_reschedule_entry(RESCHEDULE_VECTOR);
- inc_irq_stat(irq_resched_count);
- scheduler_ipi();
+ __smp_reschedule_interrupt();
trace_reschedule_exit(RESCHEDULE_VECTOR);
/*
* KVM uses this interrupt to force a cpu out of guest mode
*/
}
-void smp_call_function_interrupt(struct pt_regs *regs)
+static inline void call_function_entering_irq(void)
{
ack_APIC_irq();
irq_enter();
+}
+
+static inline void __smp_call_function_interrupt(void)
+{
generic_smp_call_function_interrupt();
inc_irq_stat(irq_call_count);
- irq_exit();
+}
+
+void smp_call_function_interrupt(struct pt_regs *regs)
+{
+ call_function_entering_irq();
+ __smp_call_function_interrupt();
+ exiting_irq();
}
void smp_trace_call_function_interrupt(struct pt_regs *regs)
{
- ack_APIC_irq();
- irq_enter();
+ call_function_entering_irq();
trace_call_function_entry(CALL_FUNCTION_VECTOR);
- generic_smp_call_function_interrupt();
- inc_irq_stat(irq_call_count);
+ __smp_call_function_interrupt();
trace_call_function_exit(CALL_FUNCTION_VECTOR);
- irq_exit();
+ exiting_irq();
}
-void smp_call_function_single_interrupt(struct pt_regs *regs)
+static inline void __smp_call_function_single_interrupt(void)
{
- ack_APIC_irq();
- irq_enter();
generic_smp_call_function_single_interrupt();
inc_irq_stat(irq_call_count);
- irq_exit();
+}
+
+void smp_call_function_single_interrupt(struct pt_regs *regs)
+{
+ call_function_entering_irq();
+ __smp_call_function_single_interrupt();
+ exiting_irq();
}
void smp_trace_call_function_single_interrupt(struct pt_regs *regs)
{
- ack_APIC_irq();
- irq_enter();
+ call_function_entering_irq();
trace_call_function_single_entry(CALL_FUNCTION_SINGLE_VECTOR);
- generic_smp_call_function_single_interrupt();
- inc_irq_stat(irq_call_count);
+ __smp_call_function_single_interrupt();
trace_call_function_single_exit(CALL_FUNCTION_SINGLE_VECTOR);
- irq_exit();
+ exiting_irq();
}
static int __init nonmi_ipi_setup(char *str)
-- 1.7.1
|
|
From: Seiji A. <sei...@hd...> - 2013-02-04 22:50:38
|
[Purpose of this patch] As Vaibhav explained in the thread below, tracepoints for irq vectors are useful. http://www.spinics.net/lists/mm-commits/msg85707.html <snip> The current interrupt traces from irq_handler_entry and irq_handler_exit provide when an interrupt is handled. They provide good data about when the system has switched to kernel space and how it affects the currently running processes. There are some IRQ vectors which trigger the system into kernel space, which are not handled in generic IRQ handlers. Tracing such events gives us the information about IRQ interaction with other system events. The trace also tells where the system is spending its time. We want to know which cores are handling interrupts and how they are affecting other processes in the system. Also, the trace provides information about when the cores are idle and which interrupts are changing that state. <snip> On the other hand, my usecase is tracing just local timer event and getting a value of instruction pointer. I suggested to add an argument local timer event to get instruction pointer before. But there is another way to get it with external module like systemtap. So, I don't need to add any argument to irq vector tracepoints now. [Patch Description] Vaibhav's patch shared a trace point ,irq_vector_entry/irq_vector_exit, in all events. But there is an above use case to trace specific irq_vector rather than tracing all events. In this case, we are concerned about overhead due to unwanted events. This patch adds following tracepoints instead of introducing irq_vector_entry/exit. so that we can enable them independently. - local_timer_vector - reschedule_vector - call_function_vector - call_function_single_vector - irq_work_entry_vector - error_apic_vector - thermal_apic_vector - threshold_apic_vector - spurious_apic_vector - x86_platform_ipi_vector Also, it introduces a logic switching IDT at enabling/disabling time so that a time penalty makes a zero when tracepoints are disabled. Detailed explanations are as follows. - Duplicate new irq handlers inserted tracepoints. - Create a new IDT, trace_idt_table, at boot time by adding a logic to _set_gate(). It is just a copy of original idt table. - Registering the new handers for tracpoints to the new IDT by introducing macros to alloc_intr_gate() called at regstering time of irq_vector handlers. - Switch IDT to new one at enabling TP time. - Restore to an original IDT at disabling TP time. The new IDT is created only when CONFIG_TRACING is enabled to avoid being used for other purposes. Signed-off-by: Seiji Aguchi <sei...@hd...> --- arch/x86/include/asm/desc.h | 33 ++++++- arch/x86/include/asm/entry_arch.h | 5 +- arch/x86/include/asm/hw_irq.h | 16 +++ arch/x86/include/asm/trace/irq_vectors.h | 159 ++++++++++++++++++++++++++++++ arch/x86/kernel/Makefile | 1 + arch/x86/kernel/apic/apic.c | 94 ++++++++++++++++++ arch/x86/kernel/cpu/mcheck/therm_throt.c | 14 +++ arch/x86/kernel/cpu/mcheck/threshold.c | 14 +++ arch/x86/kernel/entry_32.S | 12 ++- arch/x86/kernel/entry_64.S | 27 ++++- arch/x86/kernel/head_64.S | 6 + arch/x86/kernel/irq.c | 23 +++++ arch/x86/kernel/irq_work.c | 12 +++ arch/x86/kernel/smp.c | 35 +++++++ arch/x86/kernel/tracepoint.c | 65 ++++++++++++ include/xen/events.h | 3 + 16 files changed, 509 insertions(+), 10 deletions(-) create mode 100644 arch/x86/include/asm/trace/irq_vectors.h create mode 100644 arch/x86/kernel/tracepoint.c diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h index 8bf1c06..f2a381b 100644 --- a/arch/x86/include/asm/desc.h +++ b/arch/x86/include/asm/desc.h @@ -320,6 +320,17 @@ static inline void set_nmi_gate(int gate, void *addr) } #endif +#ifdef CONFIG_TRACING +extern gate_desc trace_idt_table[]; +static inline void trace_set_intr_gate(unsigned int gate, void *addr) +{ + gate_desc s; + + pack_gate(&s, GATE_INTERRUPT, (unsigned long)addr, 0, 0, __KERNEL_CS); + write_idt_entry(trace_idt_table, gate, &s); +} +#endif + static inline void _set_gate(int gate, unsigned type, void *addr, unsigned dpl, unsigned ist, unsigned seg) { @@ -331,6 +342,9 @@ static inline void _set_gate(int gate, unsigned type, void *addr, * setup time */ write_idt_entry(idt_table, gate, &s); +#ifdef CONFIG_TRACING + write_idt_entry(trace_idt_table, gate, &s); +#endif } /* @@ -360,12 +374,27 @@ static inline void alloc_system_vector(int vector) } } -static inline void alloc_intr_gate(unsigned int n, void *addr) +#ifdef CONFIG_TRACING +static inline void __trace_alloc_intr_gate(unsigned int n, void *addr) +{ + trace_set_intr_gate(n, addr); +} +#else +#define __trace_alloc_intr_gate(n, addr) +#endif + +static inline void __alloc_intr_gate(unsigned int n, void *addr) { - alloc_system_vector(n); set_intr_gate(n, addr); } +#define alloc_intr_gate(n, addr) \ + do { \ + alloc_system_vector(n); \ + __alloc_intr_gate(n, addr); \ + __trace_alloc_intr_gate(n, trace_##addr); \ + } while (0) + /* * This routine sets up an interrupt gate at directory privilege level 3. */ diff --git a/arch/x86/include/asm/entry_arch.h b/arch/x86/include/asm/entry_arch.h index 40afa00..0bb99d8 100644 --- a/arch/x86/include/asm/entry_arch.h +++ b/arch/x86/include/asm/entry_arch.h @@ -13,8 +13,9 @@ BUILD_INTERRUPT(reschedule_interrupt,RESCHEDULE_VECTOR) BUILD_INTERRUPT(call_function_interrupt,CALL_FUNCTION_VECTOR) BUILD_INTERRUPT(call_function_single_interrupt,CALL_FUNCTION_SINGLE_VECTOR) -BUILD_INTERRUPT(irq_move_cleanup_interrupt,IRQ_MOVE_CLEANUP_VECTOR) -BUILD_INTERRUPT(reboot_interrupt,REBOOT_VECTOR) +BUILD_INTERRUPT3(irq_move_cleanup_interrupt, IRQ_MOVE_CLEANUP_VECTOR, + smp_irq_move_cleanup_interrupt) +BUILD_INTERRUPT3(reboot_interrupt, REBOOT_VECTOR, smp_reboot_interrupt) #endif BUILD_INTERRUPT(x86_platform_ipi, X86_PLATFORM_IPI_VECTOR) diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h index eb92a6e..2e297d8 100644 --- a/arch/x86/include/asm/hw_irq.h +++ b/arch/x86/include/asm/hw_irq.h @@ -76,6 +76,22 @@ extern void threshold_interrupt(void); extern void call_function_interrupt(void); extern void call_function_single_interrupt(void); +#ifdef CONFIG_TRACING +/* Interrupt handlers registered during init_IRQ */ +extern void trace_apic_timer_interrupt(void); +extern void trace_x86_platform_ipi(void); +extern void trace_error_interrupt(void); +extern void trace_irq_work_interrupt(void); +extern void trace_spurious_interrupt(void); +extern void trace_thermal_interrupt(void); +extern void trace_reschedule_interrupt(void); +extern void trace_threshold_interrupt(void); +extern void trace_call_function_interrupt(void); +extern void trace_call_function_single_interrupt(void); +#define trace_irq_move_cleanup_interrupt irq_move_cleanup_interrupt +#define trace_reboot_interrupt reboot_interrupt +#endif /* CONFIG_TRACING */ + /* IOAPIC */ #define IO_APIC_IRQ(x) (((x) >= NR_IRQS_LEGACY) || ((1<<(x)) & io_apic_irqs)) extern unsigned long io_apic_irqs; diff --git a/arch/x86/include/asm/trace/irq_vectors.h b/arch/x86/include/asm/trace/irq_vectors.h new file mode 100644 index 0000000..b4f1c53 --- /dev/null +++ b/arch/x86/include/asm/trace/irq_vectors.h @@ -0,0 +1,159 @@ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM irq_vectors + +#if !defined(_TRACE_IRQ_VECTORS_H) || defined(TRACE_HEADER_MULTI_READ) +#define _TRACE_IRQ_VECTORS_H + +#include <linux/tracepoint.h> + +extern void trace_irq_vector_regfunc(void); +extern void trace_irq_vector_unregfunc(void); + +DECLARE_EVENT_CLASS(x86_irq_vector, + + TP_PROTO(int vector), + + TP_ARGS(vector), + + TP_STRUCT__entry( + __field( int, vector ) + ), + + TP_fast_assign( + __entry->vector = vector; + ), + + TP_printk("vector=%d", __entry->vector) ); + +#define DEFINE_IRQ_VECTOR_EVENT(name) \ +DEFINE_EVENT_FN(x86_irq_vector, name, \ + TP_PROTO(int vector), \ + TP_ARGS(vector), \ + trace_irq_vector_regfunc, \ + trace_irq_vector_unregfunc); + +/* + * local_timer_entry - called before entering a local timer interrupt + * vector handler + */ +DEFINE_IRQ_VECTOR_EVENT(local_timer_entry); + +/* + * local_timer_exit - called immediately after the interrupt vector + * handler returns + */ +DEFINE_IRQ_VECTOR_EVENT(local_timer_exit); + +/* + * reschedule_entry - called before entering a reschedule vector handler + */ +DEFINE_IRQ_VECTOR_EVENT(reschedule_entry); + +/* + * reschedule_exit - called immediately after the interrupt vector + * handler returns + */ +DEFINE_IRQ_VECTOR_EVENT(reschedule_exit); + +/* + * spurious_apic_entry - called before entering a spurious apic vector handler + */ +DEFINE_IRQ_VECTOR_EVENT(spurious_apic_entry); + +/* + * spurious_apic_exit - called immediately after the interrupt vector + * handler returns + */ +DEFINE_IRQ_VECTOR_EVENT(spurious_apic_exit); + +/* + * error_apic_entry - called before entering an error apic vector handler + */ +DEFINE_IRQ_VECTOR_EVENT(error_apic_entry); + +/* + * error_apic_exit - called immediately after the interrupt vector + * handler returns + */ +DEFINE_IRQ_VECTOR_EVENT(error_apic_exit); + +/* + * x86_platform_ipi_entry - called before entering a x86 platform ipi interrupt + * vector handler + */ +DEFINE_IRQ_VECTOR_EVENT(x86_platform_ipi_entry); + +/* + * x86_platform_ipi_exit - called immediately after the interrupt vector + * handler returns + */ +DEFINE_IRQ_VECTOR_EVENT(x86_platform_ipi_exit); + +/* + * irq_work_entry - called before entering a irq work interrupt + * vector handler + */ +DEFINE_IRQ_VECTOR_EVENT(irq_work_entry); + +/* + * irq_work_exit - called immediately after the interrupt vector + * handler returns + */ +DEFINE_IRQ_VECTOR_EVENT(irq_work_exit); + +/* + * call_function_entry - called before entering a call function interrupt + * vector handler + */ +DEFINE_IRQ_VECTOR_EVENT(call_function_entry); + +/* + * call_function_exit - called immediately after the interrupt vector + * handler returns + */ +DEFINE_IRQ_VECTOR_EVENT(call_function_exit); + +/* + * call_function_single_entry - called before entering a call function + * single interrupt vector handler + */ +DEFINE_IRQ_VECTOR_EVENT(call_function_single_entry); + +/* + * call_function_single_exit - called immediately after the interrupt vector + * handler returns + */ +DEFINE_IRQ_VECTOR_EVENT(call_function_single_exit); + +/* + * threshold_apic_entry - called before entering a threshold apic interrupt + * vector handler + */ +DEFINE_IRQ_VECTOR_EVENT(threshold_apic_entry); + +/* + * threshold_apic_exit - called immediately after the interrupt vector + * handler returns + */ +DEFINE_IRQ_VECTOR_EVENT(threshold_apic_exit); + +/* + * thermal_apic_entry - called before entering a thermal apic interrupt + * vector handler + */ +DEFINE_IRQ_VECTOR_EVENT(thermal_apic_entry); + +/* + * thrmal_apic_exit - called immediately after the interrupt vector + * handler returns + */ +DEFINE_IRQ_VECTOR_EVENT(thermal_apic_exit); + +#undef TRACE_INCLUDE_PATH +#define TRACE_INCLUDE_PATH ../../arch/x86/include/asm/trace +#define TRACE_INCLUDE_FILE irq_vectors +#endif /* _TRACE_IRQ_VECTORS_H */ + +/* This part must be outside protection */ +#include <trace/define_trace.h> + diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index 34e923a..24e2080 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -100,6 +100,7 @@ obj-$(CONFIG_OF) += devicetree.o obj-$(CONFIG_UPROBES) += uprobes.o obj-$(CONFIG_PERF_EVENTS) += perf_regs.o +obj-$(CONFIG_TRACING) += tracepoint.o ### # 64 bit specific files diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index b994cc8..89f3f4d 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -55,6 +55,9 @@ #include <asm/tsc.h> #include <asm/hypervisor.h> +#define CREATE_TRACE_POINTS +#include <asm/trace/irq_vectors.h> + unsigned int num_processors; unsigned disabled_cpus __cpuinitdata; @@ -934,6 +937,30 @@ void __irq_entry smp_apic_timer_interrupt(struct pt_regs *regs) set_irq_regs(old_regs); } +void __irq_entry smp_trace_apic_timer_interrupt(struct pt_regs *regs) +{ + struct pt_regs *old_regs = set_irq_regs(regs); + + /* + * NOTE! We'd better ACK the irq immediately, + * because timer handling can be slow. + */ + ack_APIC_irq(); + /* + * update_process_times() expects us to have done irq_enter(). + * Besides, if we don't timer interrupts ignore the global + * interrupt lock, which is the WrongThing (tm) to do. + */ + irq_enter(); + exit_idle(); + trace_local_timer_entry(LOCAL_TIMER_VECTOR); + local_apic_timer_interrupt(); + trace_local_timer_exit(LOCAL_TIMER_VECTOR); + irq_exit(); + + set_irq_regs(old_regs); +} + int setup_profiling_timer(unsigned int multiplier) { return -EINVAL; @@ -1931,6 +1958,31 @@ void smp_spurious_interrupt(struct pt_regs *regs) irq_exit(); } +void smp_trace_spurious_interrupt(struct pt_regs *regs) +{ + u32 v; + + irq_enter(); + exit_idle(); + trace_spurious_apic_entry(SPURIOUS_APIC_VECTOR); + /* + * Check if this really is a spurious interrupt and ACK it + * if it is a vectored one. Just in case... + * Spurious interrupts should not be ACKed. + */ + v = apic_read(APIC_ISR + ((SPURIOUS_APIC_VECTOR & ~0x1f) >> 1)); + if (v & (1 << (SPURIOUS_APIC_VECTOR & 0x1f))) + ack_APIC_irq(); + + inc_irq_stat(irq_spurious_count); + + /* see sw-dev-man vol 3, chapter 7.4.13.5 */ + pr_info("spurious APIC interrupt on CPU#%d, " + "should never happen.\n", smp_processor_id()); + trace_spurious_apic_exit(SPURIOUS_APIC_VECTOR); + irq_exit(); +} + /* * This interrupt should never happen with our APIC/SMP architecture */ @@ -1974,6 +2026,48 @@ void smp_error_interrupt(struct pt_regs *regs) irq_exit(); } +void smp_trace_error_interrupt(struct pt_regs *regs) +{ + u32 v0, v1; + u32 i = 0; + static const char * const error_interrupt_reason[] = { + "Send CS error", /* APIC Error Bit 0 */ + "Receive CS error", /* APIC Error Bit 1 */ + "Send accept error", /* APIC Error Bit 2 */ + "Receive accept error", /* APIC Error Bit 3 */ + "Redirectable IPI", /* APIC Error Bit 4 */ + "Send illegal vector", /* APIC Error Bit 5 */ + "Received illegal vector", /* APIC Error Bit 6 */ + "Illegal register address", /* APIC Error Bit 7 */ + }; + + irq_enter(); + exit_idle(); + trace_error_apic_entry(ERROR_APIC_VECTOR); + /* First tickle the hardware, only then report what went on. -- REW */ + v0 = apic_read(APIC_ESR); + apic_write(APIC_ESR, 0); + v1 = apic_read(APIC_ESR); + ack_APIC_irq(); + atomic_inc(&irq_err_count); + + apic_printk(APIC_DEBUG, KERN_DEBUG "APIC error on CPU%d: %02x(%02x)", + smp_processor_id(), v0 , v1); + + v1 = v1 & 0xff; + while (v1) { + if (v1 & 0x1) + apic_printk(APIC_DEBUG, KERN_CONT " : %s", error_interrupt_reason[i]); + i++; + v1 >>= 1; + } + + apic_printk(APIC_DEBUG, KERN_CONT "\n"); + + trace_error_apic_exit(ERROR_APIC_VECTOR); + irq_exit(); +} + /** * connect_bsp_APIC - attach the APIC to the interrupt system */ diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c index 47a1870..e7aa7fc 100644 --- a/arch/x86/kernel/cpu/mcheck/therm_throt.c +++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c @@ -29,6 +29,7 @@ #include <asm/idle.h> #include <asm/mce.h> #include <asm/msr.h> +#include <asm/trace/irq_vectors.h> /* How long to wait between reporting thermal events */ #define CHECK_INTERVAL (300 * HZ) @@ -389,6 +390,19 @@ asmlinkage void smp_thermal_interrupt(struct pt_regs *regs) ack_APIC_irq(); } +asmlinkage void smp_trace_thermal_interrupt(struct pt_regs *regs) +{ + irq_enter(); + exit_idle(); + trace_thermal_apic_entry(THERMAL_APIC_VECTOR); + inc_irq_stat(irq_thermal_count); + smp_thermal_vector(); + trace_thermal_apic_exit(THERMAL_APIC_VECTOR); + irq_exit(); + /* Ack only at the end to avoid potential reentry */ + ack_APIC_irq(); +} + /* Thermal monitoring depends on APIC, ACPI and clock modulation */ static int intel_thermal_supported(struct cpuinfo_x86 *c) { diff --git a/arch/x86/kernel/cpu/mcheck/threshold.c b/arch/x86/kernel/cpu/mcheck/threshold.c index aa578ca..0cbef99 100644 --- a/arch/x86/kernel/cpu/mcheck/threshold.c +++ b/arch/x86/kernel/cpu/mcheck/threshold.c @@ -8,6 +8,7 @@ #include <asm/apic.h> #include <asm/idle.h> #include <asm/mce.h> +#include <asm/trace/irq_vectors.h> static void default_threshold_interrupt(void) { @@ -27,3 +28,16 @@ asmlinkage void smp_threshold_interrupt(void) /* Ack only at the end to avoid potential reentry */ ack_APIC_irq(); } + +asmlinkage void smp_trace_threshold_interrupt(void) +{ + irq_enter(); + exit_idle(); + trace_threshold_apic_entry(THRESHOLD_APIC_VECTOR); + inc_irq_stat(irq_threshold_count); + mce_threshold_vector(); + trace_threshold_apic_exit(THRESHOLD_APIC_VECTOR); + irq_exit(); + /* Ack only at the end to avoid potential reentry */ + ack_APIC_irq(); +} diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S index 6ed91d9..729e75b 100644 --- a/arch/x86/kernel/entry_32.S +++ b/arch/x86/kernel/entry_32.S @@ -846,7 +846,17 @@ ENTRY(name) \ CFI_ENDPROC; \ ENDPROC(name) -#define BUILD_INTERRUPT(name, nr) BUILD_INTERRUPT3(name, nr, smp_##name) + +#ifdef CONFIG_TRACING +#define TRACE_BUILD_INTERRUPT(name, nr) \ + BUILD_INTERRUPT3(trace_##name, nr, smp_trace_##name) +#else +#define TRACE_BUILD_INTERRUPT(name, nr) +#endif + +#define BUILD_INTERRUPT(name, nr) \ + BUILD_INTERRUPT3(name, nr, smp_##name); \ + TRACE_BUILD_INTERRUPT(name, nr) /* The include is where all of the SMP etc. interrupts come from */ #include <asm/entry_arch.h> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S index cb3c591..aa6afff 100644 --- a/arch/x86/kernel/entry_64.S +++ b/arch/x86/kernel/entry_64.S @@ -1146,7 +1146,7 @@ END(common_interrupt) /* * APIC interrupts. */ -.macro apicinterrupt num sym do_sym +.macro apicinterrupt3 num sym do_sym ENTRY(\sym) INTR_FRAME ASM_CLAC @@ -1158,15 +1158,32 @@ ENTRY(\sym) END(\sym) .endm +#ifdef CONFIG_TRACING +#define trace(sym) trace_##sym +#define smp_trace(sym) smp_trace_##sym + +.macro trace_apicinterrupt num sym +apicinterrupt3 \num trace(\sym) smp_trace(\sym) +.endm +#else +.macro trace_apicinterrupt num sym do_sym +.endm +#endif + +.macro apicinterrupt num sym do_sym +apicinterrupt3 \num \sym \do_sym +trace_apicinterrupt \num \sym +.endm + #ifdef CONFIG_SMP -apicinterrupt IRQ_MOVE_CLEANUP_VECTOR \ +apicinterrupt3 IRQ_MOVE_CLEANUP_VECTOR \ irq_move_cleanup_interrupt smp_irq_move_cleanup_interrupt -apicinterrupt REBOOT_VECTOR \ +apicinterrupt3 REBOOT_VECTOR \ reboot_interrupt smp_reboot_interrupt #endif #ifdef CONFIG_X86_UV -apicinterrupt UV_BAU_MESSAGE \ +apicinterrupt3 UV_BAU_MESSAGE \ uv_bau_message_intr1 uv_bau_message_interrupt #endif apicinterrupt LOCAL_TIMER_VECTOR \ @@ -1454,7 +1471,7 @@ ENTRY(xen_failsafe_callback) CFI_ENDPROC END(xen_failsafe_callback) -apicinterrupt XEN_HVM_EVTCHN_CALLBACK \ +apicinterrupt3 XEN_HVM_EVTCHN_CALLBACK \ xen_hvm_callback_vector xen_evtchn_do_upcall #endif /* CONFIG_XEN */ diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S index 980053c..054213b 100644 --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -471,6 +471,12 @@ ENTRY(idt_table) ENTRY(nmi_idt_table) .skip IDT_ENTRIES * 16 +#ifdef CONFIG_TRACING + .align L1_CACHE_BYTES +ENTRY(trace_idt_table) + .skip IDT_ENTRIES * 16 +#endif + __PAGE_ALIGNED_BSS .align PAGE_SIZE ENTRY(empty_zero_page) diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index e4595f1..216bec1 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -17,6 +17,7 @@ #include <asm/idle.h> #include <asm/mce.h> #include <asm/hw_irq.h> +#include <asm/trace/irq_vectors.h> atomic_t irq_err_count; @@ -228,6 +229,28 @@ void smp_x86_platform_ipi(struct pt_regs *regs) set_irq_regs(old_regs); } +void smp_trace_x86_platform_ipi(struct pt_regs *regs) +{ + struct pt_regs *old_regs = set_irq_regs(regs); + + ack_APIC_irq(); + + irq_enter(); + + exit_idle(); + + trace_x86_platform_ipi_entry(X86_PLATFORM_IPI_VECTOR); + inc_irq_stat(x86_platform_ipis); + + if (x86_platform_ipi_callback) + x86_platform_ipi_callback(); + + trace_x86_platform_ipi_exit(X86_PLATFORM_IPI_VECTOR); + irq_exit(); + + set_irq_regs(old_regs); +} + EXPORT_SYMBOL_GPL(vector_used_by_percpu_irq); #ifdef CONFIG_HOTPLUG_CPU diff --git a/arch/x86/kernel/irq_work.c b/arch/x86/kernel/irq_work.c index ca8f703..09e6262 100644 --- a/arch/x86/kernel/irq_work.c +++ b/arch/x86/kernel/irq_work.c @@ -8,6 +8,7 @@ #include <linux/irq_work.h> #include <linux/hardirq.h> #include <asm/apic.h> +#include <asm/trace/irq_vectors.h> void smp_irq_work_interrupt(struct pt_regs *regs) { @@ -18,6 +19,17 @@ void smp_irq_work_interrupt(struct pt_regs *regs) irq_exit(); } +void smp_trace_irq_work_interrupt(struct pt_regs *regs) +{ + irq_enter(); + ack_APIC_irq(); + trace_irq_work_entry(IRQ_WORK_VECTOR); + inc_irq_stat(apic_irq_work_irqs); + irq_work_run(); + trace_irq_work_exit(IRQ_WORK_VECTOR); + irq_exit(); +} + void arch_irq_work_raise(void) { #ifdef CONFIG_X86_LOCAL_APIC diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c index 48d2b7d..aad58af 100644 --- a/arch/x86/kernel/smp.c +++ b/arch/x86/kernel/smp.c @@ -30,6 +30,7 @@ #include <asm/proto.h> #include <asm/apic.h> #include <asm/nmi.h> +#include <asm/trace/irq_vectors.h> /* * Some notes on x86 processor bugs affecting SMP operation: * @@ -259,6 +260,18 @@ void smp_reschedule_interrupt(struct pt_regs *regs) */ } +void smp_trace_reschedule_interrupt(struct pt_regs *regs) +{ + ack_APIC_irq(); + trace_reschedule_entry(RESCHEDULE_VECTOR); + inc_irq_stat(irq_resched_count); + scheduler_ipi(); + trace_reschedule_exit(RESCHEDULE_VECTOR); + /* + * KVM uses this interrupt to force a cpu out of guest mode + */ +} + void smp_call_function_interrupt(struct pt_regs *regs) { ack_APIC_irq(); @@ -268,6 +281,17 @@ void smp_call_function_interrupt(struct pt_regs *regs) irq_exit(); } +void smp_trace_call_function_interrupt(struct pt_regs *regs) +{ + ack_APIC_irq(); + irq_enter(); + trace_call_function_entry(CALL_FUNCTION_VECTOR); + generic_smp_call_function_interrupt(); + inc_irq_stat(irq_call_count); + trace_call_function_exit(CALL_FUNCTION_VECTOR); + irq_exit(); +} + void smp_call_function_single_interrupt(struct pt_regs *regs) { ack_APIC_irq(); @@ -277,6 +301,17 @@ void smp_call_function_single_interrupt(struct pt_regs *regs) irq_exit(); } +void smp_trace_call_function_single_interrupt(struct pt_regs *regs) +{ + ack_APIC_irq(); + irq_enter(); + trace_call_function_single_entry(CALL_FUNCTION_SINGLE_VECTOR); + generic_smp_call_function_single_interrupt(); + inc_irq_stat(irq_call_count); + trace_call_function_single_exit(CALL_FUNCTION_SINGLE_VECTOR); + irq_exit(); +} + static int __init nonmi_ipi_setup(char *str) { smp_no_nmi_ipi = true; diff --git a/arch/x86/kernel/tracepoint.c b/arch/x86/kernel/tracepoint.c new file mode 100644 index 0000000..2858bf3 --- /dev/null +++ b/arch/x86/kernel/tracepoint.c @@ -0,0 +1,65 @@ +/* + * Code for supporting irq vector tracepoints. + * + * Copyright (C) 2013 Seiji Aguchi <sei...@hd...> + * + */ +#include <asm/hw_irq.h> +#include <asm/desc.h> + +static struct desc_ptr trace_idt_descr = { NR_VECTORS * 16 - 1, + (unsigned long) trace_idt_table }; + +#ifndef CONFIG_X86_64 +gate_desc trace_idt_table[NR_VECTORS] __page_aligned_data + = { { { { 0, 0 } } }, }; +#endif + +static DEFINE_PER_CPU(struct desc_ptr, orig_idt_descr); +static int trace_irq_vector_refcount; +static DEFINE_MUTEX(irq_vector_mutex); + +static void switch_to_trace_idt(void *arg) +{ + store_idt(this_cpu_ptr(&orig_idt_descr)); + load_idt(&trace_idt_descr); + + return; +} + +static void restore_original_idt(void *arg) +{ + if (this_cpu_read(orig_idt_descr.address)) { + load_idt(this_cpu_ptr(&orig_idt_descr)); + this_cpu_write(orig_idt_descr.address, 0); + } + + return; +} + +void trace_irq_vector_regfunc(void) +{ + mutex_lock(&irq_vector_mutex); + if (!trace_irq_vector_refcount) { + smp_call_function(switch_to_trace_idt, NULL, 0); + local_irq_disable(); + switch_to_trace_idt(NULL); + local_irq_enable(); + } + trace_irq_vector_refcount++; + mutex_unlock(&irq_vector_mutex); +} + +void trace_irq_vector_unregfunc(void) +{ + mutex_lock(&irq_vector_mutex); + trace_irq_vector_refcount--; + if (!trace_irq_vector_refcount) { + smp_call_function(restore_original_idt, NULL, 0); + local_irq_disable(); + restore_original_idt(NULL); + local_irq_enable(); + } + mutex_unlock(&irq_vector_mutex); +} + diff --git a/include/xen/events.h b/include/xen/events.h index c6bfe01..9216d07 100644 --- a/include/xen/events.h +++ b/include/xen/events.h @@ -76,6 +76,9 @@ unsigned irq_from_evtchn(unsigned int evtchn); /* Xen HVM evtchn vector callback */ void xen_hvm_callback_vector(void); +#ifdef CONFIG_TRACING +#define trace_xen_hvm_callback_vector xen_hvm_callback_vector +#endif extern int xen_have_vector_callback; int xen_set_callback_via(uint64_t via); void xen_evtchn_do_upcall(struct pt_regs *regs); -- 1.7.1 |
|
From: Seiji A. <sei...@hd...> - 2013-02-04 22:50:02
|
From: Steven Rostedt <ro...@go...> Each TRACE_EVENT() adds several helper functions. If two or more trace events share the same structure and print format, they can also share most of these helper functions and save a lot of space from duplicate code. This is why the DECLARE_EVENT_CLASS() and DEFINE_EVENT() were created. Some events require a trigger to be called at registering and unregistering of the event and to do so they use TRACE_EVENT_FN(). If multiple events require a trigger, they currently have no choice but to use TRACE_EVENT_FN() as there's no DEFINE_EVENT_FN() available. This unfortunately causes a lot of wasted duplicate code created. By adding a DEFINE_EVENT_FN(), these events can still use a DECLARE_EVENT_CLASS() and then define their own triggers. Signed-off-by: Steven Rostedt <ro...@go...> Signed-off-by: Seiji Aguchi <sei...@hd...> --- include/linux/tracepoint.h | 2 ++ include/trace/define_trace.h | 5 +++++ include/trace/ftrace.h | 4 ++++ 3 files changed, 11 insertions(+), 0 deletions(-) diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h index 2f322c3..9bf59e5 100644 --- a/include/linux/tracepoint.h +++ b/include/linux/tracepoint.h @@ -378,6 +378,8 @@ static inline void tracepoint_synchronize_unregister(void) #define DECLARE_EVENT_CLASS(name, proto, args, tstruct, assign, print) #define DEFINE_EVENT(template, name, proto, args) \ DECLARE_TRACE(name, PARAMS(proto), PARAMS(args)) +#define DEFINE_EVENT_FN(template, name, proto, args, reg, unreg)\ + DECLARE_TRACE(name, PARAMS(proto), PARAMS(args)) #define DEFINE_EVENT_PRINT(template, name, proto, args, print) \ DECLARE_TRACE(name, PARAMS(proto), PARAMS(args)) #define DEFINE_EVENT_CONDITION(template, name, proto, \ diff --git a/include/trace/define_trace.h b/include/trace/define_trace.h index 1905ca8..02e1003 100644 --- a/include/trace/define_trace.h +++ b/include/trace/define_trace.h @@ -44,6 +44,10 @@ #define DEFINE_EVENT(template, name, proto, args) \ DEFINE_TRACE(name) +#undef DEFINE_EVENT_FN +#define DEFINE_EVENT_FN(template, name, proto, args, reg, unreg) \ + DEFINE_TRACE_FN(name, reg, unreg) + #undef DEFINE_EVENT_PRINT #define DEFINE_EVENT_PRINT(template, name, proto, args, print) \ DEFINE_TRACE(name) @@ -91,6 +95,7 @@ #undef TRACE_EVENT_CONDITION #undef DECLARE_EVENT_CLASS #undef DEFINE_EVENT +#undef DEFINE_EVENT_FN #undef DEFINE_EVENT_PRINT #undef DEFINE_EVENT_CONDITION #undef TRACE_HEADER_MULTI_READ diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h index 40dc5e8..7bab676 100644 --- a/include/trace/ftrace.h +++ b/include/trace/ftrace.h @@ -71,6 +71,10 @@ static struct ftrace_event_call __used \ __attribute__((__aligned__(4))) event_##name +#undef DEFINE_EVENT_FN +#define DEFINE_EVENT_FN(template, name, proto, args, reg, unreg) \ + DEFINE_EVENT(template, name, PARAMS(proto), PARAMS(args)) + #undef DEFINE_EVENT_PRINT #define DEFINE_EVENT_PRINT(template, name, proto, args, print) \ DEFINE_EVENT(template, name, PARAMS(proto), PARAMS(args)) -- 1.7.1 |
|
From: Seiji A. <sei...@hd...> - 2013-02-04 22:49:20
|
Change log
v8 -> v9
- Rebase to 3.8-rc6
- Add Steven's email address at the top of the message and
move my signed-off-by below Steven's one because it is
originally created by Steven. (patch 1/3)
- Introduce a irq_vector_mutex to avoid a race at registering/unregistering
time. (patch 2/3)
- Use a per_cpu data to orig_idt_descr because IDT descritor is needed to each cpu
and the appropriate data type is per_cpu data. It is suggested by Steven.
(patch 2/3)
v7 -> v8
- Rebase to 3.8-rc4
- Add a patch 1 introducing DEFINE_EVENT_FN() macro.
- Rename original patches 1 and 2 to 2 and 3.
- Change a definition of tracepoint to use DEFINE_EVENT_FN(). (patch 2)
- Change alloc_intr_gate() to use do{}while(0) to avoid a warning
of checkpatch.pl. (patch 2)
- Move entering_irq()/exiting_irq() to arch/x86/include/asm/apic.h (patch 3)
v6 -> v7
- Divide into two patches to make a code review easier.
Summery of each patch is as follows.
- Patch 1/2
- Add an irq_vector tracing infrastructure.
- Create idt_table for tracing. It is refactored to avoid duplicating
existing logic.
- Duplicate new irq handlers inserted tracepoints.
- Patch 2/2
- Share a common logic among irq handlers to make them
manageable and readable.
v5 -> v6
- Rebased to 3.7
v4 -> v5
- Rebased to 3.6.0
- Introduce a logic switching IDT at enabling/disabling TP time
so that a time penalty makes a zero when tracepoints are disabled.
This IDT is created only when CONFIG_TRACEPOINTS is enabled.
- Remove arch_irq_vector_entry/exit and add followings again
so that we can add each tracepoint in a generic way.
- error_apic_vector
- thermal_apic_vector
- threshold_apic_vector
- spurious_apic_vector
- x86_platform_ipi_vector
- Drop nmi tracepoints to begin with apic interrupts and discuss a logic switching
IDT first.
- Move irq_vectors.h in the directory of arch/x86/include/asm/trace because
I'm not sure if a logic switching IDT is sharable with other architectures.
v3 -> v4
- Add a latency measurement of each tracepoint
- Rebased to 3.6-rc6
v2 -> v3
- Remove an invalidate_tlb_vector event because it was replaced by a call function vector
in a following commit.
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=52aec3308db85f4e9f5c8b9f5dc4fbd0138c6fa4
v1 -> v2
- Modify variable name from irq to vector.
- Merge arch-specific tracepoints below to an arch_irq_vector_entry/exit.
- error_apic_vector
- thermal_apic_vector
- threshold_apic_vector
- spurious_apic_vector
- x86_platform_ipi_vector
[Purpose of this patch]
As Vaibhav explained in the thread below, tracepoints for irq vectors
are useful.
http://www.spinics.net/lists/mm-commits/msg85707.html
<snip>
The current interrupt traces from irq_handler_entry and irq_handler_exit
provide when an interrupt is handled. They provide good data about when
the system has switched to kernel space and how it affects the currently
running processes.
There are some IRQ vectors which trigger the system into kernel space,
which are not handled in generic IRQ handlers. Tracing such events gives
us the information about IRQ interaction with other system events.
The trace also tells where the system is spending its time. We want to
know which cores are handling interrupts and how they are affecting other
processes in the system. Also, the trace provides information about when
the cores are idle and which interrupts are changing that state.
<snip>
On the other hand, my usecase is tracing just local timer event and
getting a value of instruction pointer.
I suggested to add an argument local timer event to get instruction pointer before.
But there is another way to get it with external module like systemtap.
So, I don't need to add any argument to irq vector tracepoints now.
[Patch Description]
Vaibhav's patch shared a trace point ,irq_vector_entry/irq_vector_exit, in all events.
But there is an above use case to trace specific irq_vector rather than tracing all events.
In this case, we are concerned about overhead due to unwanted events.
This patch adds following tracepoints instead of introducing irq_vector_entry/exit.
so that we can enable them independently.
- local_timer_vector
- reschedule_vector
- call_function_vector
- call_function_single_vector
- irq_work_entry_vector
- error_apic_vector
- thermal_apic_vector
- threshold_apic_vector
- spurious_apic_vector
- x86_platform_ipi_vector
Please see descriptions in each patch.
Seiji Aguchi (2):
trace,x86: add x86 irq vector tracepoints
trace,x86: code-sharing between non-trace and trace irq handlers
Steven Rostedt (1):
tracing: Add DEFINE_EVENT_FN() macro
arch/x86/include/asm/apic.h | 27 +++++
arch/x86/include/asm/desc.h | 33 ++++++-
arch/x86/include/asm/entry_arch.h | 5 +-
arch/x86/include/asm/hw_irq.h | 16 +++
arch/x86/include/asm/trace/irq_vectors.h | 159 ++++++++++++++++++++++++++++++
arch/x86/kernel/Makefile | 1 +
arch/x86/kernel/apic/apic.c | 71 +++++++++++---
arch/x86/kernel/cpu/mcheck/therm_throt.c | 24 ++++-
arch/x86/kernel/cpu/mcheck/threshold.c | 24 ++++-
arch/x86/kernel/entry_32.S | 12 ++-
arch/x86/kernel/entry_64.S | 27 ++++-
arch/x86/kernel/head_64.S | 6 +
arch/x86/kernel/irq.c | 31 ++++--
arch/x86/kernel/irq_work.c | 24 ++++-
arch/x86/kernel/smp.c | 65 +++++++++++--
arch/x86/kernel/tracepoint.c | 65 ++++++++++++
include/linux/tracepoint.h | 2 +
include/trace/define_trace.h | 5 +
include/trace/ftrace.h | 4 +
include/xen/events.h | 3 +
20 files changed, 550 insertions(+), 54 deletions(-)
create mode 100644 arch/x86/include/asm/trace/irq_vectors.h
create mode 100644 arch/x86/kernel/tracepoint.c
|
|
From: Seiji A. <sei...@hd...> - 2013-02-04 16:09:48
|
> > [TODO] > > Other functions below are used to output error messages in qemu. > > - qerror_report() is called in many source codes. > > Not a problem, it ends up in qerror_print(), which calls error_report(). Thanks. > > > - fprintf() is called in vl.c. > > All over the place, not just vl.c. Also monitor_printf(). I will take a look at the code in monitor_printf() and fprintf() in other places. > > On the other hand, qerror_report() and fprintf () seems to be called > > in signal handers. > > fprintf() in a signal handler? Where? Sorry, I misunderstood the code. In signal handers, qemu just set a value like shutdown_signal and shutdown_pid or so. Seiji |
|
From: Seiji A. <sei...@hd...> - 2013-02-04 15:58:41
|
> > Do we really want to add timestamps to error messages unconditionally? > > I don't doubt it's useful in your scenario, but most of the time it's > > just annoying clutter. > > Agreed, I think it should be an option. OK. I will add the timestamp as an option. > Also remember that management tools can pass a pipe as stderr when starting QEMU, and > then they can add their preferred timestamping/log formatting outside of QEMU. To get an exact timestamp, qemu should supply it to the messages. Also, there are some customers who use qemu in a small environment. In that case, they don't use the management like vdsm. Seiji |
|
From: Markus A. <ar...@re...> - 2013-02-04 11:08:26
|
Stefan Hajnoczi <ste...@gm...> writes: > On Sat, Feb 02, 2013 at 08:31:45AM +0100, Markus Armbruster wrote: >> Seiji Aguchi <sei...@hd...> writes: >> >> > [Issue] >> > When we offer a customer support service and a problem happens >> > in a customer's system, we try to understand the problem by >> > comparing what the customer reports with message logs of the >> > customer's system. >> > >> > In this case, we often need to know when the problem happens. >> > >> > But, currently, there is no timestamp in qemu's error messages. >> > Therefore, we may not be able to understand the problem based on >> > error messages. >> > >> > [Solution] >> > This patch adds a timestamp to qemu's error message logged by >> > error_report(). >> > >> > A logic calculating a time is copied from libvirt, src/util/virtime.c. >> >> Do we really want to add timestamps to error messages unconditionally? >> I don't doubt it's useful in your scenario, but most of the time it's >> just annoying clutter. > > Agreed, I think it should be an option. Also remember that management > tools can pass a pipe as stderr when starting QEMU, and then they can > add their preferred timestamping/log formatting outside of QEMU. That's actually how I'd solve the problem. |
|
From: Stefan H. <ste...@gm...> - 2013-02-04 09:18:41
|
On Sat, Feb 02, 2013 at 08:31:45AM +0100, Markus Armbruster wrote: > Seiji Aguchi <sei...@hd...> writes: > > > [Issue] > > When we offer a customer support service and a problem happens > > in a customer's system, we try to understand the problem by > > comparing what the customer reports with message logs of the > > customer's system. > > > > In this case, we often need to know when the problem happens. > > > > But, currently, there is no timestamp in qemu's error messages. > > Therefore, we may not be able to understand the problem based on > > error messages. > > > > [Solution] > > This patch adds a timestamp to qemu's error message logged by > > error_report(). > > > > A logic calculating a time is copied from libvirt, src/util/virtime.c. > > Do we really want to add timestamps to error messages unconditionally? > I don't doubt it's useful in your scenario, but most of the time it's > just annoying clutter. Agreed, I think it should be an option. Also remember that management tools can pass a pipe as stderr when starting QEMU, and then they can add their preferred timestamping/log formatting outside of QEMU. Stefan |
|
From: Markus A. <ar...@re...> - 2013-02-02 07:31:56
|
Seiji Aguchi <sei...@hd...> writes: > [Issue] > When we offer a customer support service and a problem happens > in a customer's system, we try to understand the problem by > comparing what the customer reports with message logs of the > customer's system. > > In this case, we often need to know when the problem happens. > > But, currently, there is no timestamp in qemu's error messages. > Therefore, we may not be able to understand the problem based on > error messages. > > [Solution] > This patch adds a timestamp to qemu's error message logged by > error_report(). > > A logic calculating a time is copied from libvirt, src/util/virtime.c. Do we really want to add timestamps to error messages unconditionally? I don't doubt it's useful in your scenario, but most of the time it's just annoying clutter. > [TODO] > Other functions below are used to output error messages in qemu. > - qerror_report() is called in many source codes. Not a problem, it ends up in qerror_print(), which calls error_report(). > - fprintf() is called in vl.c. All over the place, not just vl.c. Also monitor_printf(). > A timestamp should be added to these messages as well. [...] |
|
From: Markus A. <ar...@re...> - 2013-02-02 07:23:54
|
Seiji Aguchi <sei...@hd...> writes: >> > Are gmtime() + strftime() unsuitable for some reason? >> >> They are not async-signal safe, so they are not usable in between a >> fork() and exec*(). Libvirt avoids them because it DOES log >> information including timestamps in between fork/exec (and prior to >> writing libvirt's virtime.c, we DID hit cases where libvirt would >> deadlock a child process due to the non-safe use of a more naive >> timestamp generator), but I don't know if qemu suffers from the same >> restriction of when it has anything worth logging. > > In my understanding, gmtime() + strftime() is usable in error_report() > because it seems to be called > For just a option check. error_report() is not bound to a particular task. It simply reports either to the current monitor or to stderr. It's certainly not usuable in a signal handler, or similar signal-unsafe context. >>[TODO] >>Other functions below are used to output error messages in qemu. > >- qerror_report() is called in many source codes. > >- fprintf() is called in vl.c. > > On the other hand, qerror_report() and fprintf () seems to be called > in signal handers. fprintf() in a signal handler? Where? > So, in those cases, some signal-safe functions should be used. |
|
From: Seiji A. <sei...@hd...> - 2013-02-01 22:41:13
|
> > Are gmtime() + strftime() unsuitable for some reason? > > They are not async-signal safe, so they are not usable in between a > fork() and exec*(). Libvirt avoids them because it DOES log information including timestamps in between fork/exec (and prior to > writing libvirt's virtime.c, we DID hit cases where libvirt would deadlock a child process due to the non-safe use of a more naive > timestamp generator), but I don't know if qemu suffers from the same restriction of when it has anything worth logging. In my understanding, gmtime() + strftime() is usable in error_report() because it seems to be called For just a option check. >[TODO] >Other functions below are used to output error messages in qemu. >- qerror_report() is called in many source codes. >- fprintf() is called in vl.c. On the other hand, qerror_report() and fprintf () seems to be called in signal handers. So, in those cases, some signal-safe functions should be used. Seiji |
|
From: Seiji A. <sei...@hd...> - 2013-02-01 22:28:18
|
> > Also, don't forget to add the mutex protection that I suggested in another email. Will do. Thanks, Seiji |
|
From: Steven R. <ro...@go...> - 2013-02-01 22:15:52
|
On Fri, 2013-02-01 at 21:06 +0000, Seiji Aguchi wrote:
> > > +#include <asm/hw_irq.h>
> > > +#include <asm/desc.h>
> > > +
> > > +static struct desc_ptr trace_idt_descr = { NR_VECTORS * 16 - 1,
> > > + (unsigned long) trace_idt_table };
> > > +
> > > +#ifndef CONFIG_X86_64
> > > +gate_desc trace_idt_table[NR_VECTORS] __page_aligned_data
> > > + = { { { { 0, 0 } } }, };
> > > +#endif
> > > +
> > > +static struct desc_ptr orig_idt_descr[NR_CPUS];
> >
> > BTW, this should be:
> >
> > static DEFINE_PER_CPU(struct desc_ptr, orig_idt_descr);
> >
> > And look at the patch I made about the NMI/DEBUG trap handling to see how to use it.
>
> Thanks.
> I'm looking at your patch right now and will update my patch.
Also, don't forget to add the mutex protection that I suggested in
another email.
Thanks,
-- Steve
|
|
From: Eric B. <eb...@re...> - 2013-02-01 21:35:38
|
On 02/01/2013 11:28 AM, Laszlo Ersek wrote: > Hello Seiji-san, > > On 02/01/13 15:53, Seiji Aguchi wrote: > >> A logic calculating a time is copied from libvirt, src/util/virtime.c. > >> +static void error_print_timestamp(void) > > Are gmtime() + strftime() unsuitable for some reason? They are not async-signal safe, so they are not usable in between a fork() and exec*(). Libvirt avoids them because it DOES log information including timestamps in between fork/exec (and prior to writing libvirt's virtime.c, we DID hit cases where libvirt would deadlock a child process due to the non-safe use of a more naive timestamp generator), but I don't know if qemu suffers from the same restriction of when it has anything worth logging. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org |
|
From: Seiji A. <sei...@hd...> - 2013-02-01 21:06:51
|
> > +#include <asm/hw_irq.h>
> > +#include <asm/desc.h>
> > +
> > +static struct desc_ptr trace_idt_descr = { NR_VECTORS * 16 - 1,
> > + (unsigned long) trace_idt_table };
> > +
> > +#ifndef CONFIG_X86_64
> > +gate_desc trace_idt_table[NR_VECTORS] __page_aligned_data
> > + = { { { { 0, 0 } } }, };
> > +#endif
> > +
> > +static struct desc_ptr orig_idt_descr[NR_CPUS];
>
> BTW, this should be:
>
> static DEFINE_PER_CPU(struct desc_ptr, orig_idt_descr);
>
> And look at the patch I made about the NMI/DEBUG trap handling to see how to use it.
Thanks.
I'm looking at your patch right now and will update my patch.
Seiji
|
|
From: Steven R. <ro...@go...> - 2013-02-01 21:03:42
|
On Mon, 2013-01-21 at 22:14 +0000, Seiji Aguchi wrote:
> --- /dev/null
> +++ b/arch/x86/kernel/tracepoint.c
> @@ -0,0 +1,61 @@
> +/*
> + * Code for supporting irq vector tracepoints.
> + *
> + * Copyright (C) 2013 Seiji Aguchi <sei...@hd...>
> + *
> + */
> +#include <asm/hw_irq.h>
> +#include <asm/desc.h>
> +
> +static struct desc_ptr trace_idt_descr = { NR_VECTORS * 16 - 1,
> + (unsigned long) trace_idt_table };
> +
> +#ifndef CONFIG_X86_64
> +gate_desc trace_idt_table[NR_VECTORS] __page_aligned_data
> + = { { { { 0, 0 } } }, };
> +#endif
> +
> +static struct desc_ptr orig_idt_descr[NR_CPUS];
BTW, this should be:
static DEFINE_PER_CPU(struct desc_ptr, orig_idt_descr);
And look at the patch I made about the NMI/DEBUG trap handling to see
how to use it.
Thanks,
-- Steve
> +static int trace_irq_vector_refcount;
> +
> +static void switch_to_trace_idt(void *arg)
> +{
> + store_idt(&orig_idt_descr[smp_processor_id()]);
> + load_idt(&trace_idt_descr);
> +
> + return;
> +}
> +
> +static void restore_original_idt(void *arg)
> +{
> + if (orig_idt_descr[smp_processor_id()].address) {
> + load_idt(&orig_idt_descr[smp_processor_id()]);
> + memset(&orig_idt_descr[smp_processor_id()], 0,
> + sizeof(struct desc_ptr));
> + }
> +
> + return;
> +}
> +
> +void trace_irq_vector_regfunc(void)
> +{
> + if (!trace_irq_vector_refcount) {
> + smp_call_function(switch_to_trace_idt, NULL, 0);
> + local_irq_disable();
> + switch_to_trace_idt(NULL);
> + local_irq_enable();
> + }
> + trace_irq_vector_refcount++;
> +}
> +
> +void trace_irq_vector_unregfunc(void)
> +{
> + trace_irq_vector_refcount--;
> + if (!trace_irq_vector_refcount) {
> + smp_call_function(restore_original_idt, NULL, 0);
> + local_irq_disable();
> + restore_original_idt(NULL);
> + local_irq_enable();
> + }
> +}
> +
> diff --git a/include/xen/events.h b/include/xen/events.h
> index c6bfe01..9216d07 100644
> --- a/include/xen/events.h
> +++ b/include/xen/events.h
> @@ -76,6 +76,9 @@ unsigned irq_from_evtchn(unsigned int evtchn);
>
> /* Xen HVM evtchn vector callback */
> void xen_hvm_callback_vector(void);
> +#ifdef CONFIG_TRACING
> +#define trace_xen_hvm_callback_vector xen_hvm_callback_vector
> +#endif
> extern int xen_have_vector_callback;
> int xen_set_callback_via(uint64_t via);
> void xen_evtchn_do_upcall(struct pt_regs *regs);
|
|
From: Seiji A. <sei...@hd...> - 2013-02-01 20:25:40
|
Hi Laszlo, Thank you for reviewing my patch! I will update my patch in accordance with your comment, using gmtime() + strftime() and gettimeofday(). Seiji |
|
From: Laszlo E. <le...@re...> - 2013-02-01 18:26:15
|
Hello Seiji-san, On 02/01/13 15:53, Seiji Aguchi wrote: > A logic calculating a time is copied from libvirt, src/util/virtime.c. > +static void error_print_timestamp(void) Are gmtime() + strftime() unsuitable for some reason? Also, since the timestamp is ultimately printed with millisecond resolution, clock_gettime(CLOCK_REALTIME) could be replaced with the more portable gettimeofday(). Thanks, Laszlo |
|
From: Seiji A. <sei...@hd...> - 2013-02-01 14:53:18
|
[Issue]
When we offer a customer support service and a problem happens
in a customer's system, we try to understand the problem by
comparing what the customer reports with message logs of the
customer's system.
In this case, we often need to know when the problem happens.
But, currently, there is no timestamp in qemu's error messages.
Therefore, we may not be able to understand the problem based on
error messages.
[Solution]
This patch adds a timestamp to qemu's error message logged by
error_report().
A logic calculating a time is copied from libvirt, src/util/virtime.c.
[TODO]
Other functions below are used to output error messages in qemu.
- qerror_report() is called in many source codes.
- fprintf() is called in vl.c.
A timestamp should be added to these messages as well.
Signed-off-by: Seiji Aguchi <sei...@hd...>
---
util/qemu-error.c | 90 +++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 90 insertions(+), 0 deletions(-)
diff --git a/util/qemu-error.c b/util/qemu-error.c
index 08a36f4..44d0fc1 100644
--- a/util/qemu-error.c
+++ b/util/qemu-error.c
@@ -196,6 +196,95 @@ void error_print_loc(void)
}
}
+
+#define SECS_PER_HOUR (60 * 60)
+#define SECS_PER_DAY (SECS_PER_HOUR * 24)
+#define DIV(a, b) ((a) / (b) - ((a) % (b) < 0))
+#define LEAPS_THRU_END_OF(y) (DIV(y, 4) - DIV(y, 100) + DIV(y, 400))
+
+const unsigned short int __mon_yday[2][13] = {
+ /* Normal years. */
+ { 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365 },
+ /* Leap years. */
+ { 0, 31, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335, 366 }
+};
+
+#define is_leap_year(y) \
+ ((y) % 4 == 0 && ((y) % 100 != 0 || (y) % 400 == 0))
+
+static void error_print_timestamp(void)
+{
+ struct timespec ts;
+ unsigned long long now;
+
+ struct tm fields;
+ long int days, rem, y;
+ const unsigned short int *ip;
+ unsigned long long whenSecs;
+ unsigned int offset = 0; /* We hardcoded GMT */
+
+ if (clock_gettime(CLOCK_REALTIME, &ts) < 0) {
+ return;
+ }
+
+ now = (ts.tv_sec * 1000ull) + (ts.tv_nsec / (1000ull * 1000ull));
+ /* This code is taken from GLibC under terms of LGPLv2+ */
+
+ whenSecs = now / 1000ull;
+
+ days = whenSecs / SECS_PER_DAY;
+ rem = whenSecs % SECS_PER_DAY;
+ rem += offset;
+ while (rem < 0) {
+ rem += SECS_PER_DAY;
+ --days;
+ }
+ while (rem >= SECS_PER_DAY) {
+ rem -= SECS_PER_DAY;
+ ++days;
+ }
+ fields.tm_hour = rem / SECS_PER_HOUR;
+ rem %= SECS_PER_HOUR;
+ fields.tm_min = rem / 60;
+ fields.tm_sec = rem % 60;
+ /* January 1, 1970 was a Thursday. */
+ fields.tm_wday = (4 + days) % 7;
+ if (fields.tm_wday < 0) {
+ fields.tm_wday += 7;
+ }
+ y = 1970;
+
+ while (days < 0 || days >= (is_leap_year(y) ? 366 : 365)) {
+ /* Guess a corrected year, assuming 365 days per year. */
+ long int yg = y + days / 365 - (days % 365 < 0);
+
+ /* Adjust DAYS and Y to match the guessed year. */
+ days -= ((yg - y) * 365
+ + LEAPS_THRU_END_OF(yg - 1)
+ - LEAPS_THRU_END_OF(y - 1));
+ y = yg;
+ }
+ fields.tm_year = y - 1900;
+
+ fields.tm_yday = days;
+ ip = __mon_yday[is_leap_year(y)];
+ for (y = 11; days < (long int) ip[y]; --y) {
+ continue;
+ }
+
+ days -= ip[y];
+ fields.tm_mon = y;
+ fields.tm_mday = days + 1;
+
+ error_printf(
+ "%4d-%02d-%02d %02d:%02d:%02d.%03lld+0000 ",
+ fields.tm_year + 1900, fields.tm_mon + 1, fields.tm_mday,
+ fields.tm_hour, fields.tm_min, fields.tm_sec,
+ now % 1000);
+
+ return;
+}
+
/*
* Print an error message to current monitor if we have one, else to stderr.
* Format arguments like sprintf(). The result should not contain
@@ -207,6 +296,7 @@ void error_report(const char *fmt, ...)
{
va_list ap;
+ error_print_timestamp();
error_print_loc();
va_start(ap, fmt);
error_vprintf(fmt, ap);
-- 1.7.1
|
|
From: Steven R. <ro...@go...> - 2013-02-01 02:34:08
|
On Mon, 2013-01-21 at 22:14 +0000, Seiji Aguchi wrote:
> +void trace_irq_vector_regfunc(void)
> +{
> + if (!trace_irq_vector_refcount) {
> + smp_call_function(switch_to_trace_idt, NULL, 0);
> + local_irq_disable();
> + switch_to_trace_idt(NULL);
> + local_irq_enable();
> + }
> + trace_irq_vector_refcount++;
> +}
> +
> +void trace_irq_vector_unregfunc(void)
> +{
> + trace_irq_vector_refcount--;
> + if (!trace_irq_vector_refcount) {
> + smp_call_function(restore_original_idt, NULL, 0);
> + local_irq_disable();
> + restore_original_idt(NULL);
> + local_irq_enable();
> + }
> +}
> +
What protection do these functions have? I mean, the reg functions of a
TRACE_EVENT() can be initiated by perf and ftrace at he same time, as
well as LTTng (if someone adds it). The tracepoint itself may have
protection, but all the tracepoints you created use the same reg
functions, and they are not protected from each other.
You need to add a mutex around these, like:
static DEFINE_MUTEX(irq_reg_mutex);
void trace_irq_vector_regfunc(void)
{
mutex_lock(&irq_reg_mutex);
if (!trace_irq_vector_refcount) {
smp_call_function(switch_to_trace_idt, NULL, 0);
local_irq_disable();
switch_to_trace_idt(NULL);
local_irq_enable();
}
trace_irq_vector_refcount++;
mutex_unlock(&irq_reg_mutex);
}
and the same for the trace_irq_vector_unregfunc().
-- Steve
|
|
From: Seiji A. <sei...@hd...> - 2013-01-31 23:47:50
|
Steven, > As I wrote this patch, you need to add at the top of the message body: > > From: Steven Rostedt <ro...@go...> > > And your Signed-off-by goes after mine. That is, the author of the patch, always goes first. Thanks. I will fix it. Seiji > -----Original Message----- > From: Steven Rostedt [mailto:ro...@go...] > Sent: Thursday, January 31, 2013 6:22 PM > To: Seiji Aguchi > Cc: x8...@ke...; H. Peter Anvin (hp...@zy...); lin...@vg...; Thomas Gleixner (tg...@li...); > 'mi...@el...' (mi...@el...); Borislav Petkov (bp...@al...); Satoru Moriya; dle...@li...; linux- > ed...@vg...; Luck, Tony (ton...@in...) > Subject: Re: [RFC][PATCH v8 1/3] tracing: Add DEFINE_EVENT_FN() macro > > Hi Seiji, > > As I wrote this patch, you need to add at the top of the message body: > > From: Steven Rostedt <ro...@go...> > > And your Signed-off-by goes after mine. That is, the author of the patch, always goes first. > > -- Steve > > > On Mon, 2013-01-21 at 22:13 +0000, Seiji Aguchi wrote: > > Each TRACE_EVENT() adds several helper functions. If two or more trace > > events share the same structure and print format, they can also share > > most of these helper functions and save a lot of space from duplicate > > code. This is why the > > DECLARE_EVENT_CLASS() and DEFINE_EVENT() were created. > > > > Some events require a trigger to be called at registering and > > unregistering of the event and to do so they use TRACE_EVENT_FN(). > > > > If multiple events require a trigger, they currently have no choice > > but to use > > TRACE_EVENT_FN() as there's no DEFINE_EVENT_FN() available. This > > unfortunately causes a lot of wasted duplicate code created. > > > > By adding a DEFINE_EVENT_FN(), these events can still use a > > DECLARE_EVENT_CLASS() and then define their own triggers. > > > > Signed-off-by: Seiji Aguchi <sei...@hd...> > > Signed-off-by: Steven Rostedt <ro...@go...> > > --- > > include/linux/tracepoint.h | 2 ++ > > include/trace/define_trace.h | 5 +++++ > > include/trace/ftrace.h | 4 ++++ > > 3 files changed, 11 insertions(+), 0 deletions(-) > > > > diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h > > index 2f322c3..9bf59e5 100644 > > --- a/include/linux/tracepoint.h > > +++ b/include/linux/tracepoint.h > > @@ -378,6 +378,8 @@ static inline void > > tracepoint_synchronize_unregister(void) > > #define DECLARE_EVENT_CLASS(name, proto, args, tstruct, assign, print) > > #define DEFINE_EVENT(template, name, proto, args) \ > > DECLARE_TRACE(name, PARAMS(proto), PARAMS(args)) > > +#define DEFINE_EVENT_FN(template, name, proto, args, reg, unreg)\ > > + DECLARE_TRACE(name, PARAMS(proto), PARAMS(args)) > > #define DEFINE_EVENT_PRINT(template, name, proto, args, print) \ > > DECLARE_TRACE(name, PARAMS(proto), PARAMS(args)) > > #define DEFINE_EVENT_CONDITION(template, name, proto, \ > > diff --git a/include/trace/define_trace.h > > b/include/trace/define_trace.h index 1905ca8..02e1003 100644 > > --- a/include/trace/define_trace.h > > +++ b/include/trace/define_trace.h > > @@ -44,6 +44,10 @@ > > #define DEFINE_EVENT(template, name, proto, args) \ > > DEFINE_TRACE(name) > > > > +#undef DEFINE_EVENT_FN > > +#define DEFINE_EVENT_FN(template, name, proto, args, reg, unreg) \ > > + DEFINE_TRACE_FN(name, reg, unreg) > > + > > #undef DEFINE_EVENT_PRINT > > #define DEFINE_EVENT_PRINT(template, name, proto, args, print) \ > > DEFINE_TRACE(name) > > @@ -91,6 +95,7 @@ > > #undef TRACE_EVENT_CONDITION > > #undef DECLARE_EVENT_CLASS > > #undef DEFINE_EVENT > > +#undef DEFINE_EVENT_FN > > #undef DEFINE_EVENT_PRINT > > #undef DEFINE_EVENT_CONDITION > > #undef TRACE_HEADER_MULTI_READ > > diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h index > > 40dc5e8..7bab676 100644 > > --- a/include/trace/ftrace.h > > +++ b/include/trace/ftrace.h > > @@ -71,6 +71,10 @@ > > static struct ftrace_event_call __used \ > > __attribute__((__aligned__(4))) event_##name > > > > +#undef DEFINE_EVENT_FN > > +#define DEFINE_EVENT_FN(template, name, proto, args, reg, unreg) \ > > + DEFINE_EVENT(template, name, PARAMS(proto), PARAMS(args)) > > + > > #undef DEFINE_EVENT_PRINT > > #define DEFINE_EVENT_PRINT(template, name, proto, args, print) \ > > DEFINE_EVENT(template, name, PARAMS(proto), PARAMS(args)) > |
|
From: Steven R. <ro...@go...> - 2013-01-31 23:22:37
|
Hi Seiji, As I wrote this patch, you need to add at the top of the message body: From: Steven Rostedt <ro...@go...> And your Signed-off-by goes after mine. That is, the author of the patch, always goes first. -- Steve On Mon, 2013-01-21 at 22:13 +0000, Seiji Aguchi wrote: > Each TRACE_EVENT() adds several helper functions. If two or more trace events > share the same structure and print format, they can also share most of these > helper functions and save a lot of space from duplicate code. This is why the > DECLARE_EVENT_CLASS() and DEFINE_EVENT() were created. > > Some events require a trigger to be called at registering and unregistering of > the event and to do so they use TRACE_EVENT_FN(). > > If multiple events require a trigger, they currently have no choice but to use > TRACE_EVENT_FN() as there's no DEFINE_EVENT_FN() available. This unfortunately > causes a lot of wasted duplicate code created. > > By adding a DEFINE_EVENT_FN(), these events can still use a > DECLARE_EVENT_CLASS() and then define their own triggers. > > Signed-off-by: Seiji Aguchi <sei...@hd...> > Signed-off-by: Steven Rostedt <ro...@go...> > --- > include/linux/tracepoint.h | 2 ++ > include/trace/define_trace.h | 5 +++++ > include/trace/ftrace.h | 4 ++++ > 3 files changed, 11 insertions(+), 0 deletions(-) > > diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h > index 2f322c3..9bf59e5 100644 > --- a/include/linux/tracepoint.h > +++ b/include/linux/tracepoint.h > @@ -378,6 +378,8 @@ static inline void tracepoint_synchronize_unregister(void) > #define DECLARE_EVENT_CLASS(name, proto, args, tstruct, assign, print) > #define DEFINE_EVENT(template, name, proto, args) \ > DECLARE_TRACE(name, PARAMS(proto), PARAMS(args)) > +#define DEFINE_EVENT_FN(template, name, proto, args, reg, unreg)\ > + DECLARE_TRACE(name, PARAMS(proto), PARAMS(args)) > #define DEFINE_EVENT_PRINT(template, name, proto, args, print) \ > DECLARE_TRACE(name, PARAMS(proto), PARAMS(args)) > #define DEFINE_EVENT_CONDITION(template, name, proto, \ > diff --git a/include/trace/define_trace.h b/include/trace/define_trace.h > index 1905ca8..02e1003 100644 > --- a/include/trace/define_trace.h > +++ b/include/trace/define_trace.h > @@ -44,6 +44,10 @@ > #define DEFINE_EVENT(template, name, proto, args) \ > DEFINE_TRACE(name) > > +#undef DEFINE_EVENT_FN > +#define DEFINE_EVENT_FN(template, name, proto, args, reg, unreg) \ > + DEFINE_TRACE_FN(name, reg, unreg) > + > #undef DEFINE_EVENT_PRINT > #define DEFINE_EVENT_PRINT(template, name, proto, args, print) \ > DEFINE_TRACE(name) > @@ -91,6 +95,7 @@ > #undef TRACE_EVENT_CONDITION > #undef DECLARE_EVENT_CLASS > #undef DEFINE_EVENT > +#undef DEFINE_EVENT_FN > #undef DEFINE_EVENT_PRINT > #undef DEFINE_EVENT_CONDITION > #undef TRACE_HEADER_MULTI_READ > diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h > index 40dc5e8..7bab676 100644 > --- a/include/trace/ftrace.h > +++ b/include/trace/ftrace.h > @@ -71,6 +71,10 @@ > static struct ftrace_event_call __used \ > __attribute__((__aligned__(4))) event_##name > > +#undef DEFINE_EVENT_FN > +#define DEFINE_EVENT_FN(template, name, proto, args, reg, unreg) \ > + DEFINE_EVENT(template, name, PARAMS(proto), PARAMS(args)) > + > #undef DEFINE_EVENT_PRINT > #define DEFINE_EVENT_PRINT(template, name, proto, args, print) \ > DEFINE_EVENT(template, name, PARAMS(proto), PARAMS(args)) |