From: Jeff D. <jd...@ad...> - 2006-05-26 03:56:24
|
We are seeing double ptrace notifications of system call returns on recent x86_64 kernels. This breaks UML and at least one other app. The patch below appears to fix the problem. The bug is caused by both syscall_trace and int_very_careful both calling syscall_trace_leave, and the system call tracing path going through int_very_careful. I would have liked to get rid of one or the other call to syscall_trace_leave. However, the syscall_trace path looks like it can exit to userspace without going through int_very_careful, and int_very_careful does things other than system call tracing. So, instead, I took _TIF_SYSCALL_TRACE and _TIF_SYSCALL_AUDIT out of the flags test on the grounds that they had already been checked in syscall_trace. There is possibly a preemption and call to schedule between syscall_trace and int_very_careful, so if it can be attached at that point, then the first return will be missed. However, I think that ptrace attachment requires a stopped child, not just one that has been preempted. I don't see signal delivery between syscall_trace and int_very_careful, so I don't see that there can be a ptrace attach followed by int_very_careful missing the first return. This is an RFC - if it turns out to be actually correct, some comments need fixing before this goes anywhere. UML works with this applied, and it doesn't seem to break singlestepping, either on normal instructions or across system calls, which looks like the next most vulnerable thing. Jeff Index: linux-2.6.16.x86_64/arch/x86_64/kernel/entry.S =================================================================== --- linux-2.6.16.x86_64.orig/arch/x86_64/kernel/entry.S +++ linux-2.6.16.x86_64/arch/x86_64/kernel/entry.S @@ -345,7 +345,7 @@ int_very_careful: sti SAVE_REST /* Check for syscall exit trace */ - testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SINGLESTEP),%edx + testl $(_TIF_SINGLESTEP),%edx jz int_signal pushq %rdi CFI_ADJUST_CFA_OFFSET 8 @@ -353,7 +353,7 @@ int_very_careful: call syscall_trace_leave popq %rdi CFI_ADJUST_CFA_OFFSET -8 - andl $~(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SINGLESTEP),%edi + andl $~(_TIF_SINGLESTEP),%edi cli jmp int_restore_rest |
From: Andi K. <ak...@su...> - 2006-05-26 10:36:50
|
On Friday 26 May 2006 05:24, Jeff Dike wrote: > We are seeing double ptrace notifications of system call returns on recent > x86_64 kernels. This breaks UML and at least one other app. I believe this patch is the correct fix. Can you confirm it works for you? -Andi Don't do syscall exit tracing twice int_ret_from_syscall already does syscall exit tracing, so no need to do it again in the caller. This caused problems for UML and some other special programs doing syscall interception. Signed-off-by: Andi Kleen <ak...@su...> Index: linux/arch/x86_64/kernel/entry.S =================================================================== --- linux.orig/arch/x86_64/kernel/entry.S +++ linux/arch/x86_64/kernel/entry.S @@ -282,12 +282,7 @@ tracesys: ja 1f movq %r10,%rcx /* fixup for C */ call *sys_call_table(,%rax,8) - movq %rax,RAX-ARGOFFSET(%rsp) -1: SAVE_REST - movq %rsp,%rdi - call syscall_trace_leave - RESTORE_TOP_OF_STACK %rbx - RESTORE_REST +1: movq %rax,RAX-ARGOFFSET(%rsp) /* Use IRET because user could have changed frame */ jmp int_ret_from_sys_call CFI_ENDPROC |
From: Jeff D. <jd...@ad...> - 2006-05-26 14:13:28
|
On Fri, May 26, 2006 at 12:36:26PM +0200, Andi Kleen wrote: > I believe this patch is the correct fix. Can you confirm it works for you? Looks good, thanks. Jeff |
From: Blaisorblade <bla...@ya...> - 2006-06-02 13:47:15
|
On Friday 26 May 2006 16:13, Jeff Dike wrote: > On Fri, May 26, 2006 at 12:36:26PM +0200, Andi Kleen wrote: > > I believe this patch is the correct fix. Can you confirm it works for > > you? > > Looks good, thanks. > > Jeff Sorry for the question, but has this been sent to -stable (since it's a -stable regression, it should be)? To 2.6.17 -git? And have you tested it (somebody should have, but it's not sure)? Sorry for not helping myself, I'll be back at work ASAP. -- Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!". Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894) http://www.user-mode-linux.org/~blaisorblade Chiacchiera con i tuoi amici in tempo reale! http://it.yahoo.com/mail_it/foot/*http://it.messenger.yahoo.com |
From: Jeff D. <jd...@ad...> - 2006-06-02 15:13:22
|
On Thu, Jun 01, 2006 at 09:07:33PM +0200, Blaisorblade wrote: > Sorry for the question, but has this been sent to -stable (since it's a > -stable regression, it should be)? To 2.6.17 -git? It's in current git. I'm having a hard time telling when the bug was introduced. The git web interface seems to be telling me that the double notification was around since last year, which I don't believe, since I've run much more recent x86_64 kernels. If the bug existed before 2.6.16, then it's fine -stable fodder. > And have you tested it (somebody should have, but it's not sure)? Yes, it's been continuously and happily running UML since Andi sent me his patch. Jeff |
From: Steven J. <py...@li...> - 2006-06-02 15:38:47
|
On Fri, 2 Jun 2006, Jeff Dike wrote: > On Thu, Jun 01, 2006 at 09:07:33PM +0200, Blaisorblade wrote: >> Sorry for the question, but has this been sent to -stable (since it's a >> -stable regression, it should be)? To 2.6.17 -git? > > It's in current git. > > I'm having a hard time telling when the bug was introduced. The git web > interface seems to be telling me that the double notification was around > since last year, which I don't believe, since I've run much more > recent x86_64 kernels. If the bug existed before 2.6.16, then it's > fine -stable fodder. I know that the bug is NOT present in 2.6.15.7. > >> And have you tested it (somebody should have, but it's not sure)? > > Yes, it's been continuously and happily running UML since Andi sent me > his patch. > > Jeff > |
From: Blaisorblade <bla...@ya...> - 2006-06-02 18:17:27
Attachments:
x86_64-fix-ret_from_fork
|
On Friday 02 June 2006 17:13, Jeff Dike wrote: > On Thu, Jun 01, 2006 at 09:07:33PM +0200, Blaisorblade wrote: > > Sorry for the question, but has this been sent to -stable (since it's a > > -stable regression, it should be)? To 2.6.17 -git? > > It's in current git. The patch is likely ok for -stable, but below I express some doubts for inclusion in kernel tree - it probably isn't sufficient and maybe it's not the better fix. > I'm having a hard time telling when the bug was introduced. The git web > interface seems to be telling me that the double notification was around > since last year, Likely you'll looking into the wrong place - int_ret_from_syscall (if this is the name) wasn't used till recently, ret_from_syscall was used. And it never does syscall exit tracing - it expects the caller to have switched away to the slow path. There's a path in which it calls int_with_check, but it does "movl $_TIF_NEED_RESCHED,%edi", so that int_with_check only tests _TIF_NEED_RESCHED and no other flag. However, there's another potential problem in current code, it seems. ret_from_fork. It can jump to rff_trace and then again go to int_ret_from_sys_call, if _TIF_IA32 is set. The check for tracing/auditing should be moved to when we decide to jump to ret_from_sys_call. I've tried doing that with the attached untested patch, but it's likely I've got something wrong - in particular the following pass of the patch wasn't easy to get right (this change is buried in the middle of other things). + testl $3,CS(%rsp) # from kernel_thread? RESTORE_REST - testl $3,CS-ARGOFFSET(%rsp) # from kernel_thread? However, I have a bigger doubt and a proposal for a simpler change. Do we need at all to test for syscall tracing in int_ret_from_sys_call? And why is there a difference with ret_from_sys_call? Instead of removing the test from tracesys, could we remove it from int_ret_from_sys_call (and no, calling int_with_check with a custom %edi doesn't work)? Can syscall tracing be needed on exit and not on entry? I've the suspect that either: a) if some other process kicks in, we'll make him trace from next syscall start onwards (and int_ret_from_sys_call is buggy) - this makes much more sense and works almost always. b) or if we want tracer to see syscall exit, it only works in int_ret_from_sys_call (rare cases). I've thought to how a process can kick in and answers were: *) schedule() to a process which does a ptrace attach *) the syscall is ptrace(PTRACE_TRACEME) in both cases the ptracer must do PTRACE_SYSCALL, and in both cases we need behaviour a). For singlestep on syscall exit I'm totally unsure. But I've not the time to look more this code - I didn't know it and it's very intricate. > which I don't believe, since I've run much more > recent x86_64 kernels. If the bug existed before 2.6.16, then it's > fine -stable fodder. Can you handle sending it to st...@ke...? First introduced in 2.6.16.5 as first diagnosed by Chuck Ebbert: commit 6b12095a4a0e1f21bbf83f95f13299ca99d758fe tree 5d2a3d96f7b99a3a225c0f7a110c6631848524b0 parent 59b2832a31ae2f3279bb5b16ae9b1c4e38e40dea author Andi Kleen <ak...@su...> Wed, 12 Apr 2006 08:19:29 +0200 committer Greg Kroah-Hartman <gr...@su...> Wed, 12 Apr 2006 13:06:54 -0700 [PATCH] x86_64: When user could have changed RIP always force IRET (CVE-2006-0744) Intel EM64T CPUs handle uncanonical return addresses differently from AMD CPUs. The exception is reported in the SYSRET, not the next instruction. Thgis leads to the kernel exception handler running on the user stack with the wrong GS because the kernel didn't expect exceptions on this instruction. This version of the patch has the teething problems that plagued an earlier version fixed. This is CVE-2006-0744 Thanks to Ernie Petrides and Asit B. Mallick for analysis and initial patches. Signed-off-by: Andi Kleen <ak...@su...> Signed-off-by: Greg Kroah-Hartman <gr...@su...> -- Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!". Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894) http://www.user-mode-linux.org/~blaisorblade |