From: Young K. <you...@gm...> - 2006-02-09 22:09:10
Hi,

I have a question about system call implementation in tt mode.

It seems that an invoked system call is executed in the tracee's context: the tracer sends SIGUSR2 to the tracee, and the tracee executes the system call in usr2_handler. However, the 'current' macro, which is used to find the current task structure, would only work on the kernel-mode stack (which I assume is the tracer's stack), because the current task structure is at the bottom of the kernel stack.

Then how can the system calls be executed in the tracee's context?

Thank you!
-Young

On 1/28/05, Blaisorblade <bla...@ya...> wrote:
> On Wednesday 26 January 2005 15:33, Alex LIU wrote:
> > Hi, Blaisorblade:
> >
> > I have studied the TT mode of the UML source code (2.6.7) for some time, but I
> > still can't work out the system call function flow in TT mode. I have read
> > some documents and comments on that, but all of them are very rough...
> >
> > Is there any more detailed document about the system call function flow in
> > UML TT mode? (preferably down to the function level)
>
> I don't know if the slides on the main site could be of help for you (I don't
> think so, but you might try).
>
> However, I've decided to post a thorough description of the flow to the list
> and to you... To be correct, I've studied the source while writing this
> mail... I had a rough idea of what happens, I just didn't dig enough to
> discover all the details because I didn't need them yet.
>
> First study man 2 ptrace, especially the part about PTRACE_SYSCALL.
>
> The core mechanism is that tracer() ptrace()s the child
> (around line 235 of arch/um/kernel/tt/tracer.c; sorry, but the references are
> from around 2.6.9, they should not be too difficult to locate):
>
> while(1){ //this is executed for the whole lifetime of the child.
>         CATCH_EINTR(pid = waitpid(-1, &status, WUNTRACED));
>         ...
>         else if(WIFSTOPPED(status)){
>                 ...
>                 sig = WSTOPSIG(status);
>                 ...
>                 switch(sig){
>
> When the tracee executes a syscall, as explained in the ptrace docs,
> waitpid() will report that the child was stopped by a SIGTRAP, so we get
> here:
>
>                 case SIGTRAP: //this has changed in recent kernels to
>                 case (SIGTRAP + 0x80):
>
> do_syscall is called, and then this is done:
>
>                         sig = SIGUSR2;
>                         tracing = 0;
>
> Afterwards, the new tracing value is saved inside "task", which is a struct
> task_struct:
>
>                 set_tracing(task, tracing);
>
> The value of sig set above is then used as follows:
>
>                 //cont_type is normally set to PTRACE_SYSCALL, but since now
>                 //tracing == 0, it will be PTRACE_CONT.
>                 if(ptrace(cont_type, pid, 0, sig) != 0){
>                 }
>
> This uses ptrace() to make sure that the child sees a SIGUSR2 signal when
> resuming. (It's not done through kill(); see sig = SIGUSR2 and the ptrace()
> call near the end that uses it.) Since the signal is sent this way, it will be
> received and handled by the child thread. We are resuming with
> PTRACE_CONT because we are going to execute the UML code which will handle
> the syscall, so we don't want syscalls to be intercepted.
>
> Then the SIGUSR2 signal handler is invoked (it's sig_handler_common_tt, which
> calls sig_info[SIGUSR2]->handler, i.e. usr2_handler). It will call
> syscall_handler_tt, which executes the syscall (with tracing turned
> off) and saves the actual result (through SC_SET_SYSCALL_RETURN(sc, result),
> which manipulates the saved registers, specifically the value which will be
> stored back in EAX).
>
> Finally, to switch back to user mode, set_user_mode(NULL) is called on the
> return path of sig_handler_common_tt(); if it sees that
> tracing is 0 (what it reads is the value set by set_tracing()), it sends a
> SIGUSR1 signal:
>
> int set_user_mode(void *t)
> {
>         struct task_struct *task;
>
>         task = t ? t : current;
>         if(task->thread.mode.tt.tracing)
>                 return(1);
>         task->thread.request.op = OP_TRACE_ON;
>         os_usr1_process(os_getpid()); /*this is a wrapper for the kill() to
>                                         send the signal.*/
>         return(0);
> }
>
> This signal is handled by tracer(): the child (which gets the
> signal) is ptraced, so the ptracer can examine each signal and decide what to
> do. (Above, for SIGUSR2, we said there was an exception, but that happened
> because the signal was sent using ptrace().)
>
> Here is the piece of code:
>
>                 switch(sig){
>                 case SIGUSR1:
>                         sig = 0; // so the child won't see the signal.
>                         op = do_proc_op(task, proc_id);
>                         switch(op){
>                         case OP_TRACE_ON:
>                                 arch_leave_kernel(task, pid);
>                                 tracing = 1;
>                                 break;
>
> As you see, tracing is switched back to 1, so at the end of this iteration we
> will resume the child with PTRACE_SYSCALL in user mode, and it will see the
> syscall return value. I hope I didn't miss anything.
>
> > Thanks a lot!
> >
> > Alex
>
> --
> Paolo Giarrusso, aka Blaisorblade
> Linux registered user n. 292729
> http://www.user-mode-linux.org/~blaisorblade
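
For reference, the tracer-side decisions described in the quoted message can be modeled as a small state machine. This is only a sketch under stated assumptions, not the UML code: the names tracer_step, struct tracer_state, struct resume and the CONT_* constants are made up for illustration, the SIGUSR1 request is assumed to always be OP_TRACE_ON, and no real ptrace() call is made. It only shows which signal is injected and how the child would be resumed after each stop.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ptrace request passed when resuming;
 * the real tracer() uses PTRACE_SYSCALL or PTRACE_CONT. */
enum cont_type { CONT_SYSCALL, CONT_PLAIN };

/* Hypothetical model of the per-child state kept by the tracer. */
struct tracer_state {
    int tracing; /* 1: child runs user code and its syscalls are intercepted */
};

/* What the tracer hands to ptrace() when resuming the child. */
struct resume {
    enum cont_type cont; /* how the child is resumed */
    int sig;             /* signal injected on resume, 0 for none */
};

/* Decide how to resume the child after waitpid() reported a stop with
 * signal `sig`. Only the two cases from the message are modeled:
 * SIGTRAP (syscall entry) and SIGUSR1 (assumed to carry OP_TRACE_ON). */
static struct resume tracer_step(struct tracer_state *st, int sig)
{
    struct resume r;

    assert(st != NULL);

    switch (sig) {
    case SIGTRAP:
        /* Syscall intercepted: inject SIGUSR2 so the child enters its
         * handler, and stop intercepting syscalls meanwhile. */
        sig = SIGUSR2;
        st->tracing = 0;
        break;
    case SIGUSR1:
        /* Child asks to return to user mode: swallow the signal and
         * turn syscall interception back on. */
        sig = 0;
        st->tracing = 1;
        break;
    default:
        /* In this model, any other signal is passed through unchanged. */
        break;
    }

    r.cont = st->tracing ? CONT_SYSCALL : CONT_PLAIN;
    r.sig = sig;
    return r;
}
```

Calling tracer_step() with SIGTRAP on a state where tracing is 1 yields a plain continue with SIGUSR2 injected; a following call with SIGUSR1 yields a syscall-tracing continue with no signal delivered, which matches the round trip in the message.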