Thread: steved patches applied to 2.5.68-bk4

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 454-5900

I have gone through an applied the patches that Steve Dickson generated 
to the 2.5.68-bk4 kernel.  Attached is the patch file. I have verified 
that kernel compiles and oprofile runs.

-Will

On Fri, Apr 25, 2003 at 05:04:44PM -0400, Will Cohen wrote:

> I have gone through an applied the patches that Steve Dickson generated 
> to the 2.5.68-bk4 kernel.  Attached is the patch file. I have verified 
> that kernel compiles and oprofile runs.

As I mentioned to Steve, I'm waiting for individual emails per problem. 

Also the patches attached still have problems I pointed out in private
mail to Steve. This is just one reason why I want stuff to be split out
instead of all in one, and accompanied with changelogs.

Imagine you're submitting the changes to Linus :)

> +	if (unlikely(in_interrupt())) {

In  particular stuff like this doesn't even begin to make sense AFAICS

regards
john



The task_lock() has to be held when look in task->mm

diff -u linux-2.4.20/drivers/oprofile/buffer_sync.c 
linux-2.4.20/drivers/oprofile/buffer_sync.c
reallinu0m54.429sdrivers/oprofile/buffer_sync.c 2003-04-17 
09:44:11.000000000 -0400
+++ linux-2.4.20/userers0m26.686s/buffer_sync.c 2003-04-28 
08:27:19.000000000 -0400
@@ -26,6 +26,9 @@                sys    0m6.836s
 #include <linux/profile.h>
 #include <linux/module.h>
 #include <linux/fs.h>
+#include <linux/spinlock.h>
+
+#include <asm/hardirq.h>

 #include "oprofile_stats.h"
 #include "event_buffer.h"
@@ -318,13 +321,15 @@
  */
 static struct mm_struct * take_task_mm(struct task_struct * task)
 {
-   struct mm_struct * mm = task->mm;
-
+   struct mm_struct *mm = NULL;
+
    /* if task->mm !NULL, mm_count must be at least 1. It cannot
     * drop to 0 without the task exiting, which will have to sleep
     * on buffer_sem first. So we do not need to mark mm_count
     * ourselves.
     */
+   task_lock(task);
+    mm = task->mm;
    if (mm) {
        /* More ugliness. If a task took its mmap
         * sem then came to sleep on buffer_sem we
@@ -335,8 +340,9 @@
            /* FIXME: this underestimates samples lost */
            atomic_inc(&oprofile_stats.sample_lost_mmap_sem);
            mm = NULL;
-       }
+       }
    }
+   task_unlock(task);

    return mm;
 }


John Levon wrote:

>On Fri, Apr 25, 2003 at 05:04:44PM -0400, Will Cohen wrote:
>
>  
>
>>I have gone through an applied the patches that Steve Dickson generated 
>>to the 2.5.68-bk4 kernel.  Attached is the patch file. I have verified 
>>that kernel compiles and oprofile runs.
>>    
>>
>
>As I mentioned to Steve, I'm waiting for individual emails per problem. 
>
>Also the patches attached still have problems I pointed out in private
>mail to Steve. This is just one reason why I want stuff to be split out
>instead of all in one, and accompanied with changelogs.
>
>Imagine you're submitting the changes to Linus :)
>
>  
>
>>+	if (unlikely(in_interrupt())) {
>>    
>>
>
>In  particular stuff like this doesn't even begin to make sense AFAICS
>
>regards
>john
>  
>




On Mon, Apr 28, 2003 at 08:58:43AM -0400, Steve Dickson wrote:

> The task_lock() has to be held when look in task->mm

Please make an effort to send out usable patches. Not only has your
mailer wrapped long lines, but you seem to have cut and pasted the
patch, killing all the whitespace and lleaving this :

> reallinu0m54.429sdrivers/oprofile/buffer_sync.c 2003-04-17 

> +#include <asm/hardirq.h>

Not needed.

> +   task_lock(task);

We only need to hold this to get the actual mm value, no longer.

I've done something similar in my tree.

regards
john



John Levon wrote:

>>+   task_lock(task);
>>    
>>
>We only need to hold this to get the actual mm value, no longer.
>
>I've done something similar in my tree.
>  
>
Good... End story...




Holding the buffer_sem lock and setting is_setup is really not protection
so I'm assuming this a typo...

diff -u linux-2.4.20/drivers/oprofile/oprof.c 
linux-2.4.20/drivers/oprofile/oprof.c

--- linux-2.4.20/drivers/oprofile/oprof.c   2003-04-17 
09:44:11.000000000 -0400
+++ linux-2.4.20/drivers/oprofile/oprof.c   2003-04-28 
08:25:07.000000000 -0400
@@ -112,9 +112,9 @@
        oprofile_ops->shutdown();
-    /* down() is also necessary to synchronise all pending events
 -    * before freeing */
-   down(&buffer_sem);
+   down(&start_sem);
    is_setup = 0;
-   up(&buffer_sem);
+   up(&start_sem);
    free_event_buffer();
    free_cpu_buffers();
 }


John Levon wrote:

>On Fri, Apr 25, 2003 at 05:04:44PM -0400, Will Cohen wrote:
>
>  
>
>>I have gone through an applied the patches that Steve Dickson generated 
>>to the 2.5.68-bk4 kernel.  Attached is the patch file. I have verified 
>>that kernel compiles and oprofile runs.
>>    
>>
>
>As I mentioned to Steve, I'm waiting for individual emails per problem. 
>
>  
>




On Mon, Apr 28, 2003 at 09:04:56AM -0400, Steve Dickson wrote:

> Holding the buffer_sem lock and setting is_setup is really not protection
> so I'm assuming this a typo...

We've already gone over this, please read your mail archives

john



John Levon wrote:

>>Holding the buffer_sem lock and setting is_setup is really not protection
>>so I'm assuming this a typo...
>>    
>>
>
>We've already gone over this, please read your mail archives
>
I did... I still contend holding the buffer_sem lock in that code
does do anything especially in respect to protecting the access of
event_buffer (what that lock seems to be used for). It just seems
strange that every else you access is_setup you hold the start_sem
lock except for this one place where you hold the completely
different lock... Maybe a comment or two (in the code) would help...

SteveD.





On Mon, Apr 28, 2003 at 12:13:11PM -0400, Steve Dickson wrote:

> I did... I still contend holding the buffer_sem lock in that code
> does do anything especially in respect to protecting the access of

And you're correct in this statement, as I said in mail.

What is *not* correct is *replacing* the buffer sem with start sem.

The start sem is *also* needed.

> lock except for this one place where you hold the completely
> different lock... Maybe a comment or two (in the code) would help...

    112         /* down() is also necessary to synchronise all pending events
    113          * before freeing */

I've added start_sem taking in my tree.

regards
john



John Levon wrote:

>As I mentioned to Steve, I'm waiting for individual emails per problem. 
>
>Also the patches attached still have problems I pointed out in private
>mail to Steve. This is just one reason why I want stuff to be split out
>instead of all in one, and accompanied with changelogs.
>

three things here...

1) Added  O_NDELAY process. It seemed like a good idea.
2) You can not hold the buffer_sem lock during the copy_to_user
      because it could cause a deadlock in the do_ummap() code
      during a page fault
3) You have to check buffer_ready while holding the lock
      to ensure you catch two racing threads.



On Mon, Apr 28, 2003 at 09:12:58AM -0400, Steve Dickson wrote:

> 1) Added  O_NDELAY process. It seemed like a good idea.

Good ideas generally have reasons ... and as your "XXX" comments, it
doesn't even work properly. So what is the point ?

> 2) You can not hold the buffer_sem lock during the copy_to_user
>      because it could cause a deadlock in the do_ummap() code
>      during a page fault

As I mentioned I don't want to vmalloc() very large regions of memory on
every read() just to avoid a theoretical race. At the very least a
solution like this must be accompanied by benchmarks proving it's not a
problem.

Other possibilities to fix the theoretical bug are to schedule the
buffer sync for later processing instead of calling sync_buffers()
directly, or calling the callback outside of the mmap_sem. The first
will lose samples, the second will cause extra overhead as we can't see
if the unmap was of an executable section.

Other ideas are welcome, all of them have drawbacks. But right now I
would be tempted to go the moving out of the mmap_sem route, and add
a check if the region is executable inside the oprofile code itself. A
simple check (find_vma_prev on the address and then see if that is
executable) will probably be good enough.

> 3) You have to check buffer_ready while holding the lock
>      to ensure you catch two racing threads.

It's impossible to work out which bit you mean since you combined three
changes in one patch. But anyway, I don't see this race, please give a
diagram showing the claimed race.

john



John Levon wrote:

>>2) You can not hold the buffer_sem lock during the copy_to_user
>>     because it could cause a deadlock in the do_ummap() code
>>     during a page fault
>>    
>>
>
>As I mentioned I don't want to vmalloc() very large regions of memory on
>every read() just to avoid a theoretical race.
>
I disagree that its a "theoretical race"... but if you have
better way to handle this race great!! 

>>3) You have to check buffer_ready while holding the lock
>>     to ensure you catch two racing threads.
>>    
>>
>
>It's impossible to work out which bit you mean since you combined three
>changes in one patch. But anyway, I don't see this race, please give a
>diagram showing the claimed race.
>
Two threads are waiting in wait_event_interruptible(), both are woken
up only one get the buffer_sem the other one waits for it.. When the
second thread finally get buffer_sem, buffer_ready my longer be one.
This code remove that uncertainty.

SteveD.





On Mon, Apr 28, 2003 at 12:22:58PM -0400, Steve Dickson wrote:

> I disagree that its a "theoretical race"... but if you have
> better way to handle this race great!! 

Well, it depends on your definition of "theoretical". There is no code
extant that opens the oprofile device that is threaded; therefore it
will not happen in real life now.

> Two threads are waiting in wait_event_interruptible(), both are woken
> up only one get the buffer_sem the other one waits for it.. When the
> second thread finally get buffer_sem, buffer_ready my longer be one.
> This code remove that uncertainty.

There are never two threads. And it's impossible to move buffer_ready()
into the buffer_sem, it's an obvious deadlock because you'd have to move
the wait_event_interruptible() in there.

What can be done is double check buffer_ready() is set after waking up
from the wait_event. It won't trigger in any existing code but ...

regards,
john




Thread: steved patches applied to 2.5.68-bk4

oprofile-list