From: Anton V. <ant...@li...> - 2012-04-23 07:08:05
Hi all,

This is another resend of several task->mm fixes, the bugs I found
during LMK code audit. Architectures were traversing the tasklist in
an unsafe manner, plus there are a few cases of unsafe access to
task->mm in general.

There were no objections on the previous resend, and the final words
were somewhere along the "the patches are fine" line.

In v3:

- Dropped a controversial 'Make find_lock_task_mm() sparse-aware' patch;
- Reworded arm and sh commit messages, per Oleg Nesterov's suggestions;
- Added an optimization trick in clear_tasks_mm_cpumask(): take only
  the rcu read lock, no need for the whole tasklist_lock. Suggested
  by Peter Zijlstra.

In v2:

- Introduced a small helper in cpu.c: most arches duplicate the same
  [buggy] code snippet, so it's better to fix it and move the logic
  into a common function.

Thanks,

-- 
Anton Vorontsov
Email: cbo...@gm...
From: Anton V. <ant...@li...> - 2012-03-24 10:27:27
Hi all,

This is a reincarnation of the task->mm fixes. Several architectures
were traversing the tasklist in an unsafe manner, plus there are a few
cases of unsafe access to task->mm.

In v2 I decided to introduce a small helper in cpu.c: most arches
duplicate the same [buggy] code snippet, so it's better to fix it and
move the logic into a common function.

Thanks!

-- 
Anton Vorontsov
Email: cbo...@gm...
From: Anton V. <ant...@li...> - 2012-03-24 10:29:10
Many architectures clear tasks' mm_cpumask like this:

        read_lock(&tasklist_lock);
        for_each_process(p) {
                if (p->mm)
                        cpumask_clear_cpu(cpu, mm_cpumask(p->mm));
        }
        read_unlock(&tasklist_lock);

The code above has several problems, such as:

1. Working with task->mm w/o getting mm or grabbing the task lock is
   dangerous as ->mm might disappear (exit_mm() assigns NULL under
   task_lock(), so tasklist lock is not enough).

2. Checking for process->mm is not enough because the process' main
   thread may exit or detach its mm via use_mm(), but other threads
   may still have a valid mm.

This patch implements a small helper function that does things
correctly, i.e.:

1. We take the task's lock while we handle its mm (we can't use
   get_task_mm()/mmput() pair as mmput() might sleep);

2. To catch the exited main thread case, we use find_lock_task_mm(),
   which walks up all threads and returns an appropriate task (with
   task lock held).

Signed-off-by: Anton Vorontsov <ant...@li...>
---
 include/linux/cpu.h |    1 +
 kernel/cpu.c        |   18 ++++++++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 1f65875..941e865 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -171,6 +171,7 @@ extern void put_online_cpus(void);
 #define hotcpu_notifier(fn, pri)        cpu_notifier(fn, pri)
 #define register_hotcpu_notifier(nb)    register_cpu_notifier(nb)
 #define unregister_hotcpu_notifier(nb)  unregister_cpu_notifier(nb)
+void clear_tasks_mm_cpumask(int cpu);
 int cpu_down(unsigned int cpu);

 #ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 2060c6e..5255936 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -10,6 +10,7 @@
 #include <linux/sched.h>
 #include <linux/unistd.h>
 #include <linux/cpu.h>
+#include <linux/oom.h>
 #include <linux/export.h>
 #include <linux/kthread.h>
 #include <linux/stop_machine.h>
@@ -171,6 +172,23 @@ void __ref unregister_cpu_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(unregister_cpu_notifier);

+void clear_tasks_mm_cpumask(int cpu)
+{
+        struct task_struct *p;
+
+        read_lock(&tasklist_lock);
+        for_each_process(p) {
+                struct task_struct *t;
+
+                t = find_lock_task_mm(p);
+                if (!t)
+                        continue;
+                cpumask_clear_cpu(cpu, mm_cpumask(t->mm));
+                task_unlock(t);
+        }
+        read_unlock(&tasklist_lock);
+}
+
 static inline void check_for_tasks(int cpu)
 {
         struct task_struct *p;
--
1.7.9.2
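The danger in point 1 comes from the exit path. A simplified sketch of
exit_mm() (not the exact kernel code -- bookkeeping and mmap_sem
handling are elided -- but faithful to the locking):

        static void exit_mm(struct task_struct *tsk)
        {
                struct mm_struct *mm = tsk->mm;

                /* ... */
                task_lock(tsk);
                tsk->mm = NULL;         /* only task_lock() serializes this */
                task_unlock(tsk);
                /* ... */
                mmput(mm);              /* the mm may now be freed entirely */
        }

So a bare "if (p->mm)" test under tasklist_lock can race with the NULL
assignment, or end up dereferencing an mm that is about to be freed --
which is why find_lock_task_mm() returns the task with task_lock() held.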
From: Peter Z. <a.p...@ch...> - 2012-03-24 12:45:06
On Sat, 2012-03-24 at 14:27 +0400, Anton Vorontsov wrote:
> +void clear_tasks_mm_cpumask(int cpu)
> +{
> +        struct task_struct *p;
> +
> +        read_lock(&tasklist_lock);
> +        for_each_process(p) {
> +                struct task_struct *t;
> +
> +                t = find_lock_task_mm(p);
> +                if (!t)
> +                        continue;
> +                cpumask_clear_cpu(cpu, mm_cpumask(t->mm));
> +                task_unlock(t);
> +        }
> +        read_unlock(&tasklist_lock);
> +}

Why bother with the tasklist_lock at all anymore? AFAICT you could use
rcu_read_lock() here. This all is called after the cpu is taken down
and marked offline, so it's not like new tasks will ever get this cpu
set in their mm mask.
From: Anton V. <ant...@li...> - 2012-03-24 16:44:31
Many architectures clear tasks' mm_cpumask like this:

        read_lock(&tasklist_lock);
        for_each_process(p) {
                if (p->mm)
                        cpumask_clear_cpu(cpu, mm_cpumask(p->mm));
        }
        read_unlock(&tasklist_lock);

The code above has several problems, such as:

1. Working with task->mm w/o getting mm or grabbing the task lock is
   dangerous as ->mm might disappear (exit_mm() assigns NULL under
   task_lock(), so tasklist lock is not enough).

2. Checking for process->mm is not enough because the process' main
   thread may exit or detach its mm via use_mm(), but other threads
   may still have a valid mm.

This patch implements a small helper function that does things
correctly, i.e.:

1. We take the task's lock while we handle its mm (we can't use
   get_task_mm()/mmput() pair as mmput() might sleep);

2. To catch the exited main thread case, we use find_lock_task_mm(),
   which walks up all threads and returns an appropriate task (with
   task lock held).

Also, per Peter Zijlstra's idea, we no longer grab tasklist_lock in
the new helper; instead we take the rcu read lock. We can do this
because the function is called after the cpu is taken down and marked
offline, so no new tasks will get this cpu set in their mm mask.

Signed-off-by: Anton Vorontsov <ant...@li...>
---

On Sat, Mar 24, 2012 at 01:43:41PM +0100, Peter Zijlstra wrote:
> On Sat, 2012-03-24 at 14:27 +0400, Anton Vorontsov wrote:
> > +        read_unlock(&tasklist_lock);
> > +}
>
> Why bother with the tasklist_lock at all anymore? AFAICT you could
> use rcu_read_lock() here. This all is called after the cpu is taken
> down and marked offline, so it's not like new tasks will ever get
> this cpu set in their mm mask.

I guess you're right. Well, this would make the code a bit fragile,
but a comment might help. How about this version?

 include/linux/cpu.h |    1 +
 kernel/cpu.c        |   26 ++++++++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 1f65875..941e865 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -171,6 +171,7 @@ extern void put_online_cpus(void);
 #define hotcpu_notifier(fn, pri)        cpu_notifier(fn, pri)
 #define register_hotcpu_notifier(nb)    register_cpu_notifier(nb)
 #define unregister_hotcpu_notifier(nb)  unregister_cpu_notifier(nb)
+void clear_tasks_mm_cpumask(int cpu);
 int cpu_down(unsigned int cpu);

 #ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 2060c6e..ecdf499 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -10,6 +10,8 @@
 #include <linux/sched.h>
 #include <linux/unistd.h>
 #include <linux/cpu.h>
+#include <linux/oom.h>
+#include <linux/rcupdate.h>
 #include <linux/export.h>
 #include <linux/kthread.h>
 #include <linux/stop_machine.h>
@@ -171,6 +173,30 @@ void __ref unregister_cpu_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(unregister_cpu_notifier);

+void clear_tasks_mm_cpumask(int cpu)
+{
+        struct task_struct *p;
+
+        /*
+         * This function is called after the cpu is taken down and marked
+         * offline, so its not like new tasks will ever get this cpu set in
+         * their mm mask. -- Peter Zijlstra
+         * Thus, we may use rcu_read_lock() here, instead of grabbing
+         * full-fledged tasklist_lock.
+         */
+        rcu_read_lock();
+        for_each_process(p) {
+                struct task_struct *t;
+
+                t = find_lock_task_mm(p);
+                if (!t)
+                        continue;
+                cpumask_clear_cpu(cpu, mm_cpumask(t->mm));
+                task_unlock(t);
+        }
+        rcu_read_unlock();
+}
+
 static inline void check_for_tasks(int cpu)
 {
         struct task_struct *p;
--
1.7.9.2
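As to why the rcu read lock is sufficient for the walk itself:
for_each_process() iterates the RCU-protected task list. Roughly, from
include/linux/sched.h of that era (quoted from memory -- double-check
against your tree):

        #define next_task(p) \
                list_entry_rcu((p)->tasks.next, struct task_struct, tasks)

        #define for_each_process(p) \
                for (p = &init_task ; (p = next_task(p)) != &init_task ; )

rcu_read_lock() keeps the list linkage stable under the walker, while
the per-task task_lock() taken inside find_lock_task_mm() protects the
->mm access itself.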
From: Oleg N. <ol...@re...> - 2012-03-25 17:51:27
On 03/24, Anton Vorontsov wrote:
>
> Many architectures clear tasks' mm_cpumask like this:
>
>         read_lock(&tasklist_lock);
>         for_each_process(p) {
>                 if (p->mm)
>                         cpumask_clear_cpu(cpu, mm_cpumask(p->mm));
>         }
>         read_unlock(&tasklist_lock);

Namely arm, powerpc, and sh.

> The code above has several problems, such as:
>
> 1. Working with task->mm w/o getting mm or grabbing the task lock is
>    dangerous as ->mm might disappear (exit_mm() assigns NULL under
>    task_lock(), so tasklist lock is not enough).

This is not actually true for arm and sh, afaics. They do not even
need tasklist or rcu lock for for_each_process(). __cpu_disable() is
called by __stop_machine(), we know that nobody can preempt us and
other CPUs can do nothing.

> 2. Checking for process->mm is not enough because the process' main
>    thread may exit or detach its mm via use_mm(), but other threads
>    may still have a valid mm.

Yes,

> Also, per Peter Zijlstra's idea, we no longer grab tasklist_lock in
> the new helper; instead we take the rcu read lock. We can do this
> because the function is called after the cpu is taken down and marked
> offline, so no new tasks will get this cpu set in their mm mask.

And only powerpc needs rcu_read_lock() and task_lock().

OTOH, I do not understand why powerpc does this on CPU_DEAD...
And probably CPU_UP_CANCELED doesn't need to clear mm_cpumask().

That said, personally I think these patches are fine, the common
helper makes sense.

Oleg.
From: Benjamin H. <be...@ke...> - 2012-03-28 00:03:14
On Sun, 2012-03-25 at 19:42 +0200, Oleg Nesterov wrote:
> > Also, per Peter Zijlstra's idea, we no longer grab tasklist_lock in
> > the new helper; instead we take the rcu read lock. We can do this
> > because the function is called after the cpu is taken down and
> > marked offline, so no new tasks will get this cpu set in their mm
> > mask.
>
> And only powerpc needs rcu_read_lock() and task_lock().
>
> OTOH, I do not understand why powerpc does this on CPU_DEAD...
> And probably CPU_UP_CANCELED doesn't need to clear mm_cpumask().
>
> That said, personally I think these patches are fine, the common
> helper makes sense.

Not strictly speaking a problem with this patch, but I was
wondering... Do we know for sure that the mmu context has been fully
flushed out before the unplug?

idle_task_exit() will do a context switch, but in our case that may
not be enough. Once the CPU is offline, tlb flushes won't hit it any
more, so it can get out of sync (in some cases the offlining process
is just some kind of deep sleep loop that doesn't involve a TLB state
loss).

Should we add a flush_tlb_mm() of all those bits in that loop? That
would be a tad expensive... We don't have a flush_tlb_all() as a
generic kind of accessor, but we could add something like that as a
requirement for ppc_md.cpu_die?

Cheers,
Ben.
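For concreteness, the suggestion would amount to something like this
inside the clear_tasks_mm_cpumask() loop -- purely a hypothetical
sketch, not a submitted patch; as noted it would be expensive, and
whether the flush can still reach the already-offlined cpu is exactly
the open question:

        t = find_lock_task_mm(p);
        if (!t)
                continue;
        flush_tlb_mm(t->mm);    /* purge the dying cpu's stale entries */
        cpumask_clear_cpu(cpu, mm_cpumask(t->mm));
        task_unlock(t);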
From: Peter Z. <a.p...@ch...> - 2012-03-26 08:00:05
On Sun, 2012-03-25 at 19:42 +0200, Oleg Nesterov wrote:
> __cpu_disable() is called by __stop_machine(), we know that nobody
> can preempt us and other CPUs can do nothing.

It would be very good to not rely on that though; I would love to get
rid of the stop_machine usage in cpu hotplug some day.
From: Oleg N. <ol...@re...> - 2012-03-26 17:13:38
On 03/26, Peter Zijlstra wrote:
>
> On Sun, 2012-03-25 at 19:42 +0200, Oleg Nesterov wrote:
> > __cpu_disable() is called by __stop_machine(), we know that nobody
> > can preempt us and other CPUs can do nothing.
>
> It would be very good to not rely on that though;

Yes, yes, perhaps I wasn't clear, but I think the patches are fine.

> I would love to get rid of the stop_machine usage in cpu hotplug
> some day.

Interesting... Why? I mean, why do you dislike stop_machine() in
_cpu_down()? Just curious.

Oleg.
From: Peter Z. <a.p...@ch...> - 2012-03-26 18:01:26
On Mon, 2012-03-26 at 19:04 +0200, Oleg Nesterov wrote:
> Interesting... Why? I mean, why do you dislike stop_machine() in
> _cpu_down()? Just curious.

It disturbs all cpus; the -rt people don't like that their FIFO tasks
don't get to run, the trading people don't like their RDMA poll loops
to be interrupted, etc.

Now arguably, one should simply not do hotplug crap while such things
are running, and mostly that's a perfectly fine constraint. But it
doesn't help that people view cpu hotplug as a power savings or
resource provisioning 'feature' and there are userspace daemons that
plug on-demand.

But my ultimate goal is to completely remove synchronization that is
actively machine wide, since we all know that as long as such stuff
exists people will want to use it. Now I don't know if we'll ever
fully get there -- see the BKL saga -- but it's worth trying, I think.

The module unload and esp. the text_poke usage of stop_machine are
much worse offenders, since both those are relatively common and much
harder to avoid.
From: Anton V. <ant...@li...> - 2012-03-24 10:29:20
Current CPU hotplug code has some task->mm handling issues:

1. Working with task->mm w/o getting mm or grabbing the task lock is
   dangerous as ->mm might disappear (exit_mm() assigns NULL under
   task_lock(), so tasklist lock is not enough). We can't use
   get_task_mm()/mmput() pair as mmput() might sleep, so we must take
   the task lock while handling its mm.

2. Checking for process->mm is not enough because the process' main
   thread may exit or detach its mm via use_mm(), but other threads
   may still have a valid mm.

   To fix this we would need to use find_lock_task_mm(), which would
   walk up all threads and return an appropriate task (with task lock
   held).

clear_tasks_mm_cpumask() has all the issues fixed, so let's use it.

Suggested-by: Oleg Nesterov <ol...@re...>
Signed-off-by: Anton Vorontsov <ant...@li...>
---
 arch/arm/kernel/smp.c |    8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index cdeb727..880b93a 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -136,7 +136,6 @@ static void percpu_timer_stop(void);
 int __cpu_disable(void)
 {
         unsigned int cpu = smp_processor_id();
-        struct task_struct *p;
         int ret;

         ret = platform_cpu_disable(cpu);
@@ -166,12 +165,7 @@ int __cpu_disable(void)
         flush_cache_all();
         local_flush_tlb_all();

-        read_lock(&tasklist_lock);
-        for_each_process(p) {
-                if (p->mm)
-                        cpumask_clear_cpu(cpu, mm_cpumask(p->mm));
-        }
-        read_unlock(&tasklist_lock);
+        clear_tasks_mm_cpumask(cpu);

         return 0;
 }
--
1.7.9.2
From: Anton V. <ant...@li...> - 2012-03-24 10:29:43
Current CPU hotplug code has some task->mm handling issues:

1. Working with task->mm w/o getting mm or grabbing the task lock is
   dangerous as ->mm might disappear (exit_mm() assigns NULL under
   task_lock(), so tasklist lock is not enough). We can't use
   get_task_mm()/mmput() pair as mmput() might sleep, so we must take
   the task lock while handling its mm.

2. Checking for process->mm is not enough because the process' main
   thread may exit or detach its mm via use_mm(), but other threads
   may still have a valid mm.

   To fix this we would need to use find_lock_task_mm(), which would
   walk up all threads and return an appropriate task (with task lock
   held).

clear_tasks_mm_cpumask() has all the issues fixed, so let's use it.

Suggested-by: Oleg Nesterov <ol...@re...>
Signed-off-by: Anton Vorontsov <ant...@li...>
---
 arch/powerpc/mm/mmu_context_nohash.c |   11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/mm/mmu_context_nohash.c b/arch/powerpc/mm/mmu_context_nohash.c
index 5b63bd3..e779642 100644
--- a/arch/powerpc/mm/mmu_context_nohash.c
+++ b/arch/powerpc/mm/mmu_context_nohash.c
@@ -333,9 +333,7 @@ static int __cpuinit mmu_context_cpu_notify(struct notifier_block *self,
                                             unsigned long action, void *hcpu)
 {
         unsigned int cpu = (unsigned int)(long)hcpu;
-#ifdef CONFIG_HOTPLUG_CPU
-        struct task_struct *p;
-#endif
+
         /* We don't touch CPU 0 map, it's allocated at aboot and kept
          * around forever
          */
@@ -358,12 +356,7 @@ static int __cpuinit mmu_context_cpu_notify(struct notifier_block *self,
                 stale_map[cpu] = NULL;

                 /* We also clear the cpu_vm_mask bits of CPUs going away */
-                read_lock(&tasklist_lock);
-                for_each_process(p) {
-                        if (p->mm)
-                                cpumask_clear_cpu(cpu, mm_cpumask(p->mm));
-                }
-                read_unlock(&tasklist_lock);
+                clear_tasks_mm_cpumask(cpu);
                 break;
 #endif /* CONFIG_HOTPLUG_CPU */
         }
--
1.7.9.2
From: Anton V. <ant...@li...> - 2012-03-24 10:30:26
Current CPU hotplug code has some task->mm handling issues:

1. Working with task->mm w/o getting mm or grabbing the task lock is
   dangerous as ->mm might disappear (exit_mm() assigns NULL under
   task_lock(), so tasklist lock is not enough). We can't use
   get_task_mm()/mmput() pair as mmput() might sleep, so we must take
   the task lock while handling its mm.

2. Checking for process->mm is not enough because the process' main
   thread may exit or detach its mm via use_mm(), but other threads
   may still have a valid mm.

   To fix this we would need to use find_lock_task_mm(), which would
   walk up all threads and return an appropriate task (with task lock
   held).

clear_tasks_mm_cpumask() has all the issues fixed, so let's use it.

Suggested-by: Oleg Nesterov <ol...@re...>
Signed-off-by: Anton Vorontsov <ant...@li...>
---
 arch/sh/kernel/smp.c |    7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/sh/kernel/smp.c b/arch/sh/kernel/smp.c
index f624174..2470c83 100644
--- a/arch/sh/kernel/smp.c
+++ b/arch/sh/kernel/smp.c
@@ -123,7 +123,6 @@ void native_play_dead(void)
 int __cpu_disable(void)
 {
         unsigned int cpu = smp_processor_id();
-        struct task_struct *p;
         int ret;

         ret = mp_ops->cpu_disable(cpu);
@@ -153,11 +152,7 @@ int __cpu_disable(void)
         flush_cache_all();
         local_flush_tlb_all();

-        read_lock(&tasklist_lock);
-        for_each_process(p)
-                if (p->mm)
-                        cpumask_clear_cpu(cpu, mm_cpumask(p->mm));
-        read_unlock(&tasklist_lock);
+        clear_tasks_mm_cpumask(cpu);

         return 0;
 }
--
1.7.9.2
From: Anton V. <ant...@li...> - 2012-03-24 10:30:57
The patch fixes two problems:

1. Working with task->mm w/o getting mm or grabbing the task lock is
   dangerous as ->mm might disappear (exit_mm() assigns NULL under
   task_lock(), so tasklist lock is not enough). We can't use
   get_task_mm()/mmput() pair as mmput() might sleep, so we have to
   take the task lock while handling its mm.

2. Checking for process->mm is not enough because the process' main
   thread may exit or detach its mm via use_mm(), but other threads
   may still have a valid mm.

   To catch this we use find_lock_task_mm(), which walks up all
   threads and returns an appropriate task (with task lock held).

Suggested-by: Oleg Nesterov <ol...@re...>
Signed-off-by: Anton Vorontsov <ant...@li...>
---
 arch/blackfin/kernel/trace.c |   26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/blackfin/kernel/trace.c b/arch/blackfin/kernel/trace.c
index 050db44..5102cdf 100644
--- a/arch/blackfin/kernel/trace.c
+++ b/arch/blackfin/kernel/trace.c
@@ -10,6 +10,8 @@
 #include <linux/hardirq.h>
 #include <linux/thread_info.h>
 #include <linux/mm.h>
+#include <linux/oom.h>
+#include <linux/sched.h>
 #include <linux/uaccess.h>
 #include <linux/module.h>
 #include <linux/kallsyms.h>
@@ -27,7 +29,6 @@ void decode_address(char *buf, unsigned long address)
         struct task_struct *p;
         struct mm_struct *mm;
         unsigned long flags, offset;
-        unsigned char in_atomic = (bfin_read_IPEND() & 0x10) || in_atomic();
         struct rb_node *n;

 #ifdef CONFIG_KALLSYMS
@@ -113,15 +114,15 @@ void decode_address(char *buf, unsigned long address)
          */
         write_lock_irqsave(&tasklist_lock, flags);
         for_each_process(p) {
-                mm = (in_atomic ? p->mm : get_task_mm(p));
-                if (!mm)
-                        continue;
+                struct task_struct *t;

-                if (!down_read_trylock(&mm->mmap_sem)) {
-                        if (!in_atomic)
-                                mmput(mm);
+                t = find_lock_task_mm(p);
+                if (!t)
                         continue;
-                }
+
+                mm = t->mm;
+                if (!down_read_trylock(&mm->mmap_sem))
+                        goto __continue;

                 for (n = rb_first(&mm->mm_rb); n; n = rb_next(n)) {
                         struct vm_area_struct *vma;
@@ -130,7 +131,7 @@ void decode_address(char *buf, unsigned long address)

                         if (address >= vma->vm_start && address < vma->vm_end) {
                                 char _tmpbuf[256];
-                                char *name = p->comm;
+                                char *name = t->comm;
                                 struct file *file = vma->vm_file;

                                 if (file) {
@@ -163,8 +164,7 @@ void decode_address(char *buf, unsigned long address)
                                         name, vma->vm_start, vma->vm_end);

                                 up_read(&mm->mmap_sem);
-                                if (!in_atomic)
-                                        mmput(mm);
+                                task_unlock(t);

                                 if (buf[0] == '\0')
                                         sprintf(buf, "[ %s ] dynamic memory", name);
@@ -174,8 +174,8 @@ void decode_address(char *buf, unsigned long address)
                 }

                 up_read(&mm->mmap_sem);
-                if (!in_atomic)
-                        mmput(mm);
+__continue:
+                task_unlock(t);
         }

         /*
--
1.7.9.2
From: Anton V. <ant...@li...> - 2012-03-24 10:31:23
Oleg Nesterov found an interesting deadlock possibility:

> sysrq_showregs_othercpus() does smp_call_function(showacpu), and
> showacpu() does show_stack()->decode_address(). Now suppose that the
> IPI interrupts the task holding read_lock(tasklist).

To fix this, blackfin should not grab the write_ variant of the
tasklist lock; the read_ one is enough.

Suggested-by: Oleg Nesterov <ol...@re...>
Signed-off-by: Anton Vorontsov <ant...@li...>
---
 arch/blackfin/kernel/trace.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/blackfin/kernel/trace.c b/arch/blackfin/kernel/trace.c
index 5102cdf..9cc9302 100644
--- a/arch/blackfin/kernel/trace.c
+++ b/arch/blackfin/kernel/trace.c
@@ -28,7 +28,7 @@ void decode_address(char *buf, unsigned long address)
 {
         struct task_struct *p;
         struct mm_struct *mm;
-        unsigned long flags, offset;
+        unsigned long offset;
         struct rb_node *n;

 #ifdef CONFIG_KALLSYMS
@@ -112,7 +112,7 @@ void decode_address(char *buf, unsigned long address)
          * mappings of all our processes and see if we can't be a whee
          * bit more specific
          */
-        write_lock_irqsave(&tasklist_lock, flags);
+        read_lock(&tasklist_lock);
         for_each_process(p) {
                 struct task_struct *t;

@@ -185,7 +185,7 @@ __continue:
                 sprintf(buf, "/* kernel dynamic memory */");

 done:
-        write_unlock_irqrestore(&tasklist_lock, flags);
+        read_unlock(&tasklist_lock);
 }

 #define EXPAND_LEN ((1 << CONFIG_DEBUG_BFIN_HWTRACE_EXPAND_LEN) * 256 - 1)
--
1.7.9.2
From: Anton V. <ant...@li...> - 2012-03-24 10:31:47
Traversing the tasks requires holding tasklist_lock, otherwise it
is unsafe.

p.s. However, I'm not sure that calling os_kill_ptraced_process()
in the atomic context is correct. It seems to work, but please
take a closer look.

Signed-off-by: Anton Vorontsov <ant...@li...>
---
 arch/um/kernel/reboot.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/um/kernel/reboot.c b/arch/um/kernel/reboot.c
index 4d93dff..66d754c 100644
--- a/arch/um/kernel/reboot.c
+++ b/arch/um/kernel/reboot.c
@@ -4,6 +4,7 @@
  */

 #include "linux/sched.h"
+#include "linux/spinlock.h"
 #include "linux/slab.h"
 #include "kern_util.h"
 #include "os.h"
@@ -22,6 +23,7 @@ static void kill_off_processes(void)
                 struct task_struct *p;
                 int pid;

+                read_lock(&tasklist_lock);
                 for_each_process(p) {
                         if (p->mm == NULL)
                                 continue;

                         pid = p->mm->context.id.u.pid;
                         os_kill_ptraced_process(pid, 1);
                 }
+                read_unlock(&tasklist_lock);
         }
 }
--
1.7.9.2
From: Richard W. <ri...@no...> - 2012-03-24 11:12:53
Am 24.03.2012 11:30, schrieb Anton Vorontsov:
> Traversing the tasks requires holding tasklist_lock, otherwise it
> is unsafe.
>
> p.s. However, I'm not sure that calling os_kill_ptraced_process()
> in the atomic context is correct. It seems to work, but please
> take a closer look.

os_kill_ptraced_process() calls a host function. From UML's point of
view nothing sleeps, so this is fine.

Acked-by: Richard Weinberger <ri...@no...>

Thanks,
//richard
From: Peter Z. <a.p...@ch...> - 2012-03-24 12:48:52
On Sat, 2012-03-24 at 14:30 +0400, Anton Vorontsov wrote:
> Traversing the tasks requires holding tasklist_lock, otherwise it
> is unsafe.

No it doesn't; it requires either tasklist_lock or rcu_read_lock().
From: Anton V. <ant...@li...> - 2012-03-24 16:44:40
On Sat, Mar 24, 2012 at 01:48:23PM +0100, Peter Zijlstra wrote:
> On Sat, 2012-03-24 at 14:30 +0400, Anton Vorontsov wrote:
> > Traversing the tasks requires holding tasklist_lock, otherwise it
> > is unsafe.
>
> No it doesn't; it requires either tasklist_lock or rcu_read_lock().

Well, currently the code does neither. :-)

-- 
Anton Vorontsov
Email: cbo...@gm...
From: Anton V. <ant...@li...> - 2012-03-24 10:32:02
Checking for task->mm is dangerous as ->mm might disappear (exit_mm()
assigns NULL under task_lock(), so tasklist lock is not enough).

We can't use get_task_mm()/mmput() pair as mmput() might sleep, so
let's take the task lock while we care about its mm.

Note that we should also use find_lock_task_mm() to check all
process' threads for a valid mm, but for uml we'll do it in a
separate patch.

Signed-off-by: Anton Vorontsov <ant...@li...>
---
 arch/um/kernel/reboot.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/um/kernel/reboot.c b/arch/um/kernel/reboot.c
index 66d754c..1411f4e 100644
--- a/arch/um/kernel/reboot.c
+++ b/arch/um/kernel/reboot.c
@@ -25,10 +25,13 @@ static void kill_off_processes(void)

                 read_lock(&tasklist_lock);
                 for_each_process(p) {
-                        if (p->mm == NULL)
+                        task_lock(p);
+                        if (!p->mm) {
+                                task_unlock(p);
                                 continue;
-
+                        }
                         pid = p->mm->context.id.u.pid;
+                        task_unlock(p);
                         os_kill_ptraced_process(pid, 1);
                 }
                 read_unlock(&tasklist_lock);
--
1.7.9.2
From: Richard W. <ri...@no...> - 2012-03-24 11:12:40
Am 24.03.2012 11:30, schrieb Anton Vorontsov:
> Checking for task->mm is dangerous as ->mm might disappear (exit_mm()
> assigns NULL under task_lock(), so tasklist lock is not enough).
>
> We can't use get_task_mm()/mmput() pair as mmput() might sleep, so
> let's take the task lock while we care about its mm.
>
> Note that we should also use find_lock_task_mm() to check all
> process' threads for a valid mm, but for uml we'll do it in a
> separate patch.
>
> Signed-off-by: Anton Vorontsov <ant...@li...>

Acked-by: Richard Weinberger <ri...@no...>

Thanks,
//richard
From: Anton V. <ant...@li...> - 2012-03-24 10:32:23
kill_off_processes() might miss a valid process, because checking for
process->mm is not enough: the process' main thread may exit or detach
its mm via use_mm(), but other threads may still have a valid mm.

To catch this we use find_lock_task_mm(), which walks up all threads
and returns an appropriate task (with task lock held).

Suggested-by: Oleg Nesterov <ol...@re...>
Signed-off-by: Anton Vorontsov <ant...@li...>
---
 arch/um/kernel/reboot.c |   13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/um/kernel/reboot.c b/arch/um/kernel/reboot.c
index 1411f4e..3d15243 100644
--- a/arch/um/kernel/reboot.c
+++ b/arch/um/kernel/reboot.c
@@ -6,6 +6,7 @@
 #include "linux/sched.h"
 #include "linux/spinlock.h"
 #include "linux/slab.h"
+#include "linux/oom.h"
 #include "kern_util.h"
 #include "os.h"
 #include "skas.h"
@@ -25,13 +26,13 @@ static void kill_off_processes(void)

                 read_lock(&tasklist_lock);
                 for_each_process(p) {
-                        task_lock(p);
-                        if (!p->mm) {
-                                task_unlock(p);
+                        struct task_struct *t;
+
+                        t = find_lock_task_mm(p);
+                        if (!t)
                                 continue;
-                        }
-                        pid = p->mm->context.id.u.pid;
-                        task_unlock(p);
+                        pid = t->mm->context.id.u.pid;
+                        task_unlock(t);
                         os_kill_ptraced_process(pid, 1);
                 }
                 read_unlock(&tasklist_lock);
--
1.7.9.2
From: Richard W. <ri...@no...> - 2012-03-24 11:12:40
Am 24.03.2012 11:31, schrieb Anton Vorontsov:
> kill_off_processes() might miss a valid process, because checking for
> process->mm is not enough: the process' main thread may exit or
> detach its mm via use_mm(), but other threads may still have a valid
> mm.
>
> To catch this we use find_lock_task_mm(), which walks up all threads
> and returns an appropriate task (with task lock held).
>
> Suggested-by: Oleg Nesterov <ol...@re...>
> Signed-off-by: Anton Vorontsov <ant...@li...>

Acked-by: Richard Weinberger <ri...@no...>

Thanks,
//richard
From: Anton V. <ant...@li...> - 2012-03-24 10:32:41
This is needed so that callers would not get 'context imbalance'
warnings from the sparse tool.

As a side effect, this patch fixes the following sparse warnings:

  CHECK   mm/oom_kill.c
mm/oom_kill.c:201:28: warning: context imbalance in 'oom_badness' - unexpected unlock
include/linux/rcupdate.h:249:30: warning: context imbalance in 'dump_tasks' - unexpected unlock
mm/oom_kill.c:453:9: warning: context imbalance in 'oom_kill_task' - unexpected unlock
  CHECK   mm/memcontrol.c
...
mm/memcontrol.c:1130:17: warning: context imbalance in 'task_in_mem_cgroup' - unexpected unlock

p.s. I know Peter Zijlstra detests the __cond_lock() stuff, but until
we have anything better in sparse, let's use it. This particular patch
helped me to detect one bug that I myself made during the task->mm
fixup series. So, it is useful.

Signed-off-by: Anton Vorontsov <ant...@li...>
Acked-by: KOSAKI Motohiro <kos...@jp...>
---
 include/linux/oom.h |   12 +++++++++++-
 mm/oom_kill.c       |    2 +-
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/include/linux/oom.h b/include/linux/oom.h
index 552fba9..26cf628 100644
--- a/include/linux/oom.h
+++ b/include/linux/oom.h
@@ -21,6 +21,7 @@

 #ifdef __KERNEL__

+#include <linux/compiler.h>
 #include <linux/sched.h>
 #include <linux/types.h>
 #include <linux/nodemask.h>
@@ -65,7 +66,16 @@ static inline void oom_killer_enable(void)
         oom_killer_disabled = false;
 }

-extern struct task_struct *find_lock_task_mm(struct task_struct *p);
+extern struct task_struct *__find_lock_task_mm(struct task_struct *p);
+
+static inline struct task_struct *find_lock_task_mm(struct task_struct *p)
+{
+        struct task_struct *ret;
+
+        ret = __find_lock_task_mm(p);
+        (void)__cond_lock(&ret->alloc_lock, ret);
+        return ret;
+}

 /* sysctls */
 extern int sysctl_oom_dump_tasks;
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 2958fd8..0ebb383 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -136,7 +136,7 @@ static bool has_intersects_mems_allowed(struct task_struct *tsk,
  * pointer. Return p, or any of its subthreads with a valid ->mm, with
  * task_lock() held.
  */
-struct task_struct *find_lock_task_mm(struct task_struct *p)
+struct task_struct *__find_lock_task_mm(struct task_struct *p)
 {
         struct task_struct *t = p;

--
1.7.9.2
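To illustrate what the wrapper buys at call sites, consider a caller
of the usual shape (a sketch; do_something_with() is a hypothetical
placeholder):

        struct task_struct *t = find_lock_task_mm(p);

        if (!t)
                return;                 /* nothing locked, nothing to release */
        /* sparse now knows t->alloc_lock is held here */
        do_something_with(t->mm);
        task_unlock(t);                 /* balanced unlock -- no warning */

Previously sparse could not see that the lock is taken iff the return
value is non-NULL, hence the 'context imbalance' warnings;
__cond_lock() encodes exactly that conditional acquisition.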
From: Peter Z. <a.p...@ch...> - 2012-03-24 12:54:10
On Sat, 2012-03-24 at 14:31 +0400, Anton Vorontsov wrote:
> This is needed so that callers would not get 'context imbalance'
> warnings from the sparse tool.
>
> As a side effect, this patch fixes the following sparse warnings:
>
>   CHECK   mm/oom_kill.c
> mm/oom_kill.c:201:28: warning: context imbalance in 'oom_badness' - unexpected unlock
> include/linux/rcupdate.h:249:30: warning: context imbalance in 'dump_tasks' - unexpected unlock
> mm/oom_kill.c:453:9: warning: context imbalance in 'oom_kill_task' - unexpected unlock
>   CHECK   mm/memcontrol.c
> ...
> mm/memcontrol.c:1130:17: warning: context imbalance in 'task_in_mem_cgroup' - unexpected unlock
>
> p.s. I know Peter Zijlstra detests the __cond_lock() stuff, but until
> we have anything better in sparse, let's use it. This particular patch
> helped me to detect one bug that I myself made during the task->mm
> fixup series. So, it is useful.

Yeah, so

Nacked-by: Peter Zijlstra <a.p...@ch...>

Also, why didn't lockdep catch it? Fix sparse already instead of
smearing ugly all over.