The page fault loop on the same address also happens in my UML. In both my UML and the mainline kernel (I am using 3.7.1), the addresses that trigger the page fault in the child thread are covered by existing vm areas. I traced the function calls with gdb and noticed that mmap_region() is never called while the child task runs. My guess is that the child task does not use a large enough memory space for UML to install a new mapping for it.
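To make "covered by a vm area" concrete, here is a minimal userspace sketch of the check, not the kernel code itself. The struct and the names model_find_vma() and addr_is_covered() are made up for illustration; only the field names vm_start, vm_end and vm_next follow the kernel's vm_area_struct. It models the fact that find_vma() returns the first area whose vm_end lies above the address, so the caller still has to compare against vm_start.

```c
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct vm_area_struct. */
struct vma {
    unsigned long vm_start;  /* first address inside the area */
    unsigned long vm_end;    /* first address past the end of the area */
    struct vma *vm_next;     /* next area, list sorted by address */
};

/* Hypothetical model of find_vma(): first area with addr < vm_end, or NULL. */
static struct vma *model_find_vma(struct vma *head, unsigned long addr)
{
    for (struct vma *v = head; v != NULL; v = v->vm_next)
        if (addr < v->vm_end)
            return v;
    return NULL;
}

/* Returns 1 only if some area actually contains addr, 0 otherwise. */
static int addr_is_covered(struct vma *head, unsigned long addr)
{
    struct vma *v = model_find_vma(head, addr);
    return v != NULL && v->vm_start <= addr;
}
```

With two areas [0x1000, 0x2000) and [0x3000, 0x4000), the address 0x2000 falls in the gap: the lookup returns the second area, but the vm_start comparison reports it as not covered.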
The major change I made to my kernel is to modify the vm area pointers of certain child tasks so that they share the vm area structures of their parent task. The parent task's vm areas are therefore shared among some of its child tasks, as long as VM_DONTCOPY is not set on them.
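The sharing rule can be sketched in userspace as follows. This is only a model of the filter, not my actual patch: struct model_vma, MODEL_VM_DONTCOPY, vma_is_shareable() and count_shareable() are hypothetical names, and the flag value is assumed to match the kernel's VM_DONTCOPY definition.

```c
#include <stddef.h>

/* Assumed to match the kernel's VM_DONTCOPY bit; illustrative only. */
#define MODEL_VM_DONTCOPY 0x00020000UL

/* Simplified stand-in for a parent's vm area. */
struct model_vma {
    unsigned long vm_start;
    unsigned long vm_end;
    unsigned long vm_flags;
    struct model_vma *vm_next;
};

/* An area is handed to the child only if VM_DONTCOPY is clear. */
static int vma_is_shareable(const struct model_vma *v)
{
    return (v->vm_flags & MODEL_VM_DONTCOPY) == 0;
}

/* Counts how many of the parent's areas the child would end up sharing. */
static size_t count_shareable(const struct model_vma *head)
{
    size_t n = 0;
    for (const struct model_vma *v = head; v != NULL; v = v->vm_next)
        if (vma_is_shareable(v))
            n++;
    return n;
}
```

In this model an area with VM_DONTCOPY set is simply skipped, so the child has no area covering those addresses.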