From: Rich F. <da...@li...> - 2016-02-29 21:23:50
I've been trying to use strace on a NOMMU system (sh2) and have been running into an issue where the syscall return value (read from r0) and args 5/6 (r0, r1) are bogus, making the output much less useful than it otherwise would be. The problem seems to be that the tracer gets out of sync with the child's STOP parity and confuses syscall entry with syscall exit, probably because exec_or_die does not stop itself before exec to sync with the parent. Even apart from the bug I'm experiencing, this seems problematic in that early syscalls in the child can be lost (I've actually hit that problem too).

The attached (very hackish at the moment) patch makes it work for me by eliminating the need to define NOMMU_SYSTEM to 1 and using clone() with CLONE_VM and a new stack for the child, instead of vfork. I see some potential issues that need to be addressed before this could be made into a proper solution, though:

1. I'm not sure whether all NOMMU systems strace supports have clone. If they do, I think vfork could be dropped completely and this used instead.

2. The feature checks which were bypassed on NOMMU, and which I hard-bypassed, could also be changed to use clone.

3. My approach is not directly applicable to daemonized tracer mode without some significant refactoring; either the entire remainder of init() and main() would need to be moved into a function that runs in the child, or a longjmp back to the parent's call frame would need to be performed, with some synchronization to make it safe.

An alternate approach would be to keep vfork but have strace re-exec itself with a special argument that makes it STOP itself and then exec the target program. For what it's worth, this could be significantly slower on NOMMU, especially if not using a binary format that's capable of sharing text.

Any thoughts on whether changes like this would be acceptable upstream?

Rich
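
For illustration only (this is not the attached patch): a minimal sketch of the clone()-based startup described above, assuming Linux with glibc and a stack that grows downward. The names spawn_stopped_tracee, child_fn and CHILD_STACK_SIZE are hypothetical and do not come from strace. Because CLONE_VM is used without CLONE_VFORK, the parent keeps running, so the child can stop itself before exec and the tracer sees a well-defined initial stop.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical size; a real tracer would pick this more carefully. */
#define CHILD_STACK_SIZE (64 * 1024)

/* Runs in the child, on its own stack, sharing the parent's memory. */
static int child_fn(void *arg)
{
    char **argv = arg;

    if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
        _exit(127);

    /* Stop before exec so the tracer can sync up and no early
     * syscalls are missed. The raw getpid syscall is used here to
     * avoid relying on any libc-side pid caching in a CLONE_VM child. */
    kill((pid_t)syscall(SYS_getpid), SIGSTOP);

    execvp(argv[0], argv);
    _exit(127); /* exec failed */
}

/* Returns the child's pid once it is stopped and traced, or -1. */
pid_t spawn_stopped_tracee(char **argv)
{
    /* The stack must stay valid until the child has exec'd or exited,
     * so this sketch deliberately does not free it on success. */
    char *stack = malloc(CHILD_STACK_SIZE);
    if (!stack)
        return -1;

    /* CLONE_VM shares the address space (required on NOMMU);
     * SIGCHLD as the exit signal lets plain waitpid() see the child. */
    pid_t pid = clone(child_fn, stack + CHILD_STACK_SIZE,
                      CLONE_VM | SIGCHLD, argv);
    if (pid < 0) {
        free(stack);
        return -1;
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status)) {
        /* Child died before stopping (e.g. PTRACE_TRACEME failed). */
        return -1;
    }
    return pid;
}
```

After spawn_stopped_tracee() returns, the tracer would typically set its ptrace options and resume the child with PTRACE_SYSCALL, so that the first stop it then observes is a syscall entry. Error handling and stack cleanup are intentionally minimal here.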