From: Terry H. <ter...@gm...> - 2013-04-06 19:23:57
Hello guys, is there any available resource that explains how user-mode-linux maps the pages of a task in UML to the host kernel?

In my UML, I modified a task's page table when forking it. I then ran into a situation where the page fault happens over and over again for the same address in the forked task. Using gdb, I found that when the page fault happens for the first time, the kernel calls do_wp_page() to fault in the page and marks the page present. This should prevent the next page fault for the same address from happening again. I checked the PTEs in UML and they are marked as present. So is it possible that the page is not being allocated properly on the host kernel, so that the page fault keeps happening for the same address even though UML thinks the page is present?

Any suggestions? Thanks!
From: richard -r. w. <ric...@gm...> - 2013-04-07 16:52:29
On Sat, Apr 6, 2013 at 9:23 PM, Terry Hsu <ter...@gm...> wrote:
> Is there any available resource that explains how user-mode-linux maps the
> pages of a task in UML to the host kernel?

The code...? ;)
UML receives a SIGSEGV on the host side if a page is not mapped.
The SIGSEGV handler then installs the mapping using mmap().

> In my UML, I modified a task's page table when forking it. Then I ran into a
> situation where the page fault happens over and over again for the same
> address in the forked task. I use gdb debugger and find out that when the
> page fault happens for the first time, the kernel calls do_wp_page() to
> fault in the page and marks the page present. This should prevent the next
> page fault for the same address from happening again. I checked the PTE in
> UML, they are marked as present so is it possible that the page is not being
> allocated properly on the host kernel so that the page fault keeps happening
> for the same address even though UML thinks the page is present.
>
> Any suggestions?

If the same fault happens over and over, UML (on the host side) seems unable to fix the fault. Check the return values of mmap()....

Thanks,
//richard
From: Peter B. <pb...@pt...> - 2013-04-07 18:30:54
Here's one more example, still the same setup, but this time crashing at the same place as the original bug report (BUG: failure at block/blk-core.c:2978/blk_flush_plug_list()!). See below for output.

BTW my host setup is Linux Mint 14:

Linux ufo 3.5.0-17-generic #28-Ubuntu SMP Tue Oct 9 19:31:23 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

bash $ ./linux ubd0=Fedora18-AMD64-root_fs rw mem=4096M con0=fd:0,fd:1
Core dump limits :
        soft - 0
        hard - NONE
Checking that ptrace can change system call numbers...OK
Checking syscall emulation patch for ptrace...OK
Checking advanced syscall emulation patch for ptrace...OK
Checking for tmpfs mount on /dev/shm...nothing mounted on /dev/shm
Checking PROT_EXEC mmap in /tmp/...OK
Checking for the skas3 patch in the host:
  - /proc/mm...not found: No such file or directory
  - PTRACE_FAULTINFO...not found
  - PTRACE_LDT...not found
UML running in SKAS0 mode
Adding 26181632 bytes to physical memory to account for exec-shield gap
Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 3.9.0-rc5 (pbutler@ufo) (gcc version 4.7.2 (Ubuntu/Linaro 4.7.2-2ubuntu1) ) #1 Sat Apr 6 13:15:06 EDT 2013
Built 1 zonelists in Zone order, mobility grouping on. Total pages: 1040544
Kernel command line: ubd0=Fedora18-AMD64-root_fs rw mem=4096M con0=fd:0,fd:1 root=98:0
PID hash table entries: 4096 (order: 3, 32768 bytes)
Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
Memory: 4068740k available
NR_IRQS:15
Calibrating delay loop... 3800.26 BogoMIPS (lpj=19001344)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 256
Initializing cgroup subsys cpuacct
Initializing cgroup subsys devices
Initializing cgroup subsys freezer
Initializing cgroup subsys blkio
Checking that host ptys support output SIGIO...Yes
Checking that host ptys support SIGIO on close...No, enabling workaround
devtmpfs: initialized
Using 2.6 host AIO
NET: Registered protocol family 16
bio: create slab <bio-0> at 0
Switching to clocksource itimer
NET: Registered protocol family 2
TCP established hash table entries: 32768 (order: 7, 524288 bytes)
TCP bind hash table entries: 32768 (order: 6, 262144 bytes)
TCP: Hash tables configured (established 32768 bind 32768)
TCP: reno registered
UDP hash table entries: 2048 (order: 4, 65536 bytes)
UDP-Lite hash table entries: 2048 (order: 4, 65536 bytes)
NET: Registered protocol family 1
mconsole (version 2) initialized on /home/pbutler/.uml/ovuM3w/mconsole
Checking host MADV_REMOVE support...OK
VFS: Disk quotas dquot_6.5.2
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
msgmni has been set to 7946
io scheduler noop registered
io scheduler deadline registered (default)
TCP: cubic registered
NET: Registered protocol family 17
Initialized stdio console driver
Console initialized on /dev/tty0
console [tty0] enabled
Initializing software serial port version 1
console [mc-1] enabled
ubda: unknown partition table
EXT4-fs (ubda): couldn't mount as ext3 due to feature incompatibilities
EXT4-fs (ubda): couldn't mount as ext2 due to feature incompatibilities
EXT4-fs (ubda): warning: maximal mount count reached, running e2fsck is recommended
EXT4-fs (ubda): mounted filesystem with ordered data mode. Opts: (null)
VFS: Mounted root (ext4 filesystem) on device 98:0.
devtmpfs: mounted
systemd[1]: systemd 197 running in system mode. (+PAM +LIBWRAP +AUDIT +SELINUX +IMA +SYSVINIT +LIBCRYPTSETUP +GCRYPT +ACL +XZ)
Welcome to Fedora 18 (Spherical Cow)!
systemd[1]: Failed to insert module 'autofs4'
systemd[1]: No hostname configured.
systemd[1]: Set hostname to <localhost>.
systemd[1]: Failed to enable kbrequest handling: Inappropriate ioctl for device
systemd[1]: Cannot add dependency job for unit display-manager.service, ignoring: Unit display-manager.service failed to load: No such file or directory. See system logs and 'systemctl status display-manager.service' for details.
systemd[1]: Started Replay Read-Ahead Data.
systemd[1]: Starting Collect Read-Ahead Data...
         Starting Collect Read-Ahead Data...
systemd[1]: Starting Forward Password Requests to Wall Directory Watch.
systemd[1]: Started Forward Password Requests to Wall Directory Watch.
systemd[1]: Starting Remote File Systems.
[  OK  ] Reached target Remote File Systems.
systemd-readahead[228]: Failed to create fanotify object: Function not implemented
systemd[1]: Reached target Remote File Systems.
systemd[1]: Starting Syslog Socket.
[  OK  ] Listening on Syslog Socket.
systemd[1]: Listening on Syslog Socket.
systemd[1]: Starting /dev/initctl Compatibility Named Pipe.
[  OK  ] Listening on /dev/initctl Compatibility Named Pipe.
systemd[1]: Listening on /dev/initctl Compatibility Named Pipe.
systemd[1]: Starting Delayed Shutdown Socket.
[  OK  ] Listening on Delayed Shutdown Socket.
systemd[1]: Listening on Delayed Shutdown Socket.
systemd[1]: Starting Encrypted Volumes.
[  OK  ] Reached target Encrypted Volumes.
systemd[1]: Reached target Encrypted Volumes.
systemd[1]: Starting Arbitrary Executable File Formats File System Automount Point.
systemd[1]: Failed to open /dev/autofs: No such file or directory
systemd[1]: Failed to initialize automounter: No such file or directory
[FAILED] Failed to set up automount Arbitrary Executable File...utomount Point.
See 'systemctl status proc-sys-fs-binfmt_misc.automount' for details.
systemd[1]: Failed to set up automount Arbitrary Executable File Formats File System Automount Point.
systemd[1]: Unit proc-sys-fs-binfmt_misc.automount entered failed state
systemd[1]: Starting LVM2 metadata daemon socket.
[  OK  ] Listening on LVM2 metadata daemon socket.
systemd[1]: Listening on LVM2 metadata daemon socket.
systemd[1]: Starting Device-mapper event daemon FIFOs.
[  OK  ] Listening on Device-mapper event daemon FIFOs.
systemd[1]: Listening on Device-mapper event daemon FIFOs.
systemd[1]: Starting Swap.
[  OK  ] Reached target Swap.
systemd[1]: Reached target Swap.
systemd[1]: Starting udev Kernel Socket.
[  OK  ] Listening on udev Kernel Socket.
systemd[1]: Listening on udev Kernel Socket.
systemd[1]: Starting udev Control Socket.
[  OK  ] Listening on udev Control Socket.
systemd[1]: Listening on udev Control Socket.
systemd[1]: Starting Journal Socket.
[  OK  ] Listening on Journal Socket.
systemd[1]: Listening on Journal Socket.
systemd[1]: Starting Syslog.
[  OK  ] Reached target Syslog.
systemd[1]: Reached target Syslog.
systemd[1]: Mounting Temporary Directory...
         Mounting Temporary Directory...
systemd[1]: tmp.mount: Directory /tmp to mount over is not empty, mounting anyway.
systemd[1]: Started Import network configuration from initramfs.
systemd[1]: Starting Configure read-only root support...
         Starting Configure read-only root support...
systemd[1]: Mounted Huge Pages File System.
systemd[1]: Starting Journal Service...
         Starting Journal Service...
[  OK  ] Started Journal Service.
systemd[1]: Started Journal Service.
systemd[1]: Mounted Debug File System.
systemd[1]: Mounting POSIX Message Queue File System...
         Mounting POSIX Message Queue File System...
systemd[1]: Starting udev Kernel Device Manager...
         Starting udev Kernel Device Manager...
systemd[1]: Starting udev Coldplug all Devices...
         Starting udev Coldplug all Devices...
systemd[1]: systemd-readahead-collect.service: main process exited, code=exited, status=1/FAILURE
[  OK  ] Started Collect Read-Ahead Data.
systemd[1]: Started Collect Read-Ahead Data.
[  OK  ] Mounted Temporary Directory.
systemd[1]: Mounted Temporary Directory.
systemd[1]: Started Load legacy module configuration.
systemd[1]: Started Load Kernel Modules.
systemd[1]: Mounted Configuration File System.
systemd[1]: Mounted FUSE Control File System.
systemd[1]: Started File System Check on Root Device.
systemd[1]: Starting Remount Root and Kernel File Systems...
         Starting Remount Root and Kernel File Systems...
systemd[1]: Started Set Up Additional Binary Formats.
systemd[1]: Starting Apply Kernel Variables...
         Starting Apply Kernel Variables...
systemd[1]: Starting Setup Virtual Console...
         Starting Setup Virtual Console...
[  OK  ] Mounted POSIX Message Queue File System.
[  OK  ] Started Remount Root and Kernel File Systems.
[  OK  ] Reached target Local File Systems (Pre).
         Starting Load Random Seed...
[  OK  ] Started Apply Kernel Variables.
[FAILED] Failed to start Setup Virtual Console.
See 'systemctl status systemd-vconsole-setup.service' for details.
[  OK  ] Started Load Random Seed.
[  OK  ] Started Configure read-only root support.
[  OK  ] Started udev Kernel Device Manager.
systemd-udevd[238]: starting version 197
[  OK  ] Started udev Coldplug all Devices.
         Starting udev Wait for Complete Device Initialization...
         Starting Show Plymouth Boot Screen...
BUG: failure at block/blk-core.c:2978/blk_flush_plug_list()!
Kernel panic - not syncing: BUG!
Call Trace:
160477d70:  [<6024be78>] panic+0x145/0x2a7
160477da8:  [<6024bd33>] panic+0x0/0x2a7
160477de8:  [<6024bfda>] printk+0x0/0xa0
160477e60:  [<600182c0>] _init+0x7e0/0x8b0
160477e80:  [<6018c15d>] blk_flush_plug_list+0x191/0x252
160477ec0:  [<60046970>] sigsuspend+0x0/0x9e
160477ed0:  [<600182c0>] _init+0x7e0/0x8b0
160477ef0:  [<602503c0>] schedule+0x6a/0x78
160477f00:  [<6004579c>] set_current_blocked+0x17/0x19
160477f10:  [<600469cc>] sigsuspend+0x5c/0x9e
160477f30:  [<6001e6da>] winch_thread+0x204/0x242
160477fd0:  [<6001e4d6>] winch_thread+0x0/0x242
Modules linked in:
Pid: 1615311232, comm: Not tainted 3.9.0-rc5
RIP: 12f0:[<0000000160476e50>]
RSP: 0000000000000000  EFLAGS: 00000000
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 000000016047b3a8
RDX: 000000016047b3a8 RSI: 000000016047b3b8 RDI: 000000016047b3b8
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 000000016047b380 R11: 000000016047b380 R12: 0000000000000000
R13: 0000000c00000000 R14: 000000000c731a70 R15: 0000000051618cac
Call Trace:
160477cc8:  [<6006b2d6>] __module_text_address+0x14/0x5a
160477ce0:  [<6001c48f>] panic_exit+0x3a/0x58
160477cf0:  [<6004eba2>] __kernel_text_address+0x30/0x5c
160477d10:  [<60055b34>] notifier_call_chain+0x32/0x5c
160477d38:  [<600182c0>] _init+0x7e0/0x8b0
160477d50:  [<60055b6e>] __atomic_notifier_call_chain+0x10/0x12
160477d60:  [<60055b86>] atomic_notifier_call_chain+0x16/0x18
160477d70:  [<6024beab>] panic+0x178/0x2a7
160477da8:  [<6024bd33>] panic+0x0/0x2a7
160477de8:  [<6024bfda>] printk+0x0/0xa0
160477e60:  [<600182c0>] _init+0x7e0/0x8b0
160477e80:  [<6018c15d>] blk_flush_plug_list+0x191/0x252
160477ec0:  [<60046970>] sigsuspend+0x0/0x9e
160477ed0:  [<600182c0>] _init+0x7e0/0x8b0
160477ef0:  [<602503c0>] schedule+0x6a/0x78
160477f00:  [<6004579c>] set_current_blocked+0x17/0x19
160477f10:  [<600469cc>] sigsuspend+0x5c/0x9e
160477f30:  [<6001e6da>] winch_thread+0x204/0x242
160477fd0:  [<6001e4d6>] winch_thread+0x0/0x242
systemd-journald[233]: Received SIGUSR1
From: richard -r. w. <ric...@gm...> - 2013-04-07 21:55:30
Attachments:
no_winch.diff
On Sun, Apr 7, 2013 at 8:30 PM, Peter Butler <pb...@pt...> wrote:
> Here's one more example, still the same setup, but this time crashing at
> the same place as the original bug report. (BUG: failure at
> block/blk-core.c:2978/blk_flush_plug_list()!) See below for output.
>
> BTW my host setup is Linux Mint 14:
>
> Linux ufo 3.5.0-17-generic #28-Ubuntu SMP Tue Oct 9 19:31:23 UTC 2012
> x86_64 x86_64 x86_64 GNU/Linux
>
> bash $ ./linux ubd0=Fedora18-AMD64-root_fs rw mem=4096M con0=fd:0,fd:1

Please don't post into unrelated threads.

Anyway, all your crashes share one thing: before the crash, UML did a sigsuspend(). It only does so in the SIGWINCH irq path. To verify my theory, please apply the attached patch. It disables the feature whereby you can resize a UML window on the host side and UML will change the terminal size in the guest.

Thanks,
//richard
From: Terry H. <ter...@gm...> - 2013-04-11 04:16:35
Hi Richard, thanks for replying. I did go back to the code and tried to understand what exactly is going on in UML, but still no luck.

The faulted address is covered by one of the vm areas of the task, so it passed the vma sanity check at the beginning of handle_page_fault(). I printed out the PTEs of the task and noticed one strange thing: when the fault happens for the first time, the PTE does not exist; the PTE is present when the second fault happens for the same address (but it is still a page fault); and on the third page fault (same address), the PTE does not exist anymore.

So in my case, the faulted address does not require a new vma to be installed.

Also, I've looked into copy_mm() to see how pages are copied from a parent task to its child. I do not understand the purpose of the special mapping installed by UML. It seems that every new task with a new mm_struct will have one special mapping at the head of its vma list.

Thanks.

On Sun, Apr 7, 2013 at 12:52 PM, richard -rw- weinberger <ric...@gm...> wrote:
> On Sat, Apr 6, 2013 at 9:23 PM, Terry Hsu <ter...@gm...> wrote:
> > Is there any available resource that explains how user-mode-linux maps the
> > pages of a task in UML to the host kernel?
>
> The code...? ;)
> UML receives a SIGSEGV on the host side if a page is not mapped.
> The SIGSEGV handler then installs the mapping using mmap().
>
> > In my UML, I modified a task's page table when forking it. Then I ran into a
> > situation where the page fault happens over and over again for the same
> > address in the forked task. [...]
> >
> > Any suggestions?
>
> If the same fault happens over and over UML (on the host side) seems
> unable to fix the fault.
> Check the return values of mmap()....
>
> Thanks,
> //richard
From: richard -r. w. <ric...@gm...> - 2013-04-11 13:05:06
On Thu, Apr 11, 2013 at 6:15 AM, Terry Hsu <ter...@gm...> wrote:
> Hi Richard, thanks for replying. I did go back to see the code and try to
> understand what exactly is going on in UML, but still no luck.
>
> The faulted address is covered by one of the vm areas of the task, so it
> passed the vma sanity check at the beginning of handle_page_fault(). [...]
>
> So in my case, the faulted address does not require a new vma to be
> installed.

But this is a feature added by you?
We are not talking about a mainline kernel, right?

> Also I've looked into copy_mm() to see how pages are copied from parent task
> to its child. I do not understand the purpose of the special mapping
> installed by UML. It seems that every new task with a new mm_struct will
> have one special mapping at the head of its vma list.

The special mapping (the SKAS stub) is needed to install new mappings from the host side of UML. Currently the stub pages have a vma; this will go away such that they have only a PTE.

--
Thanks,
//richard
From: Terry H. <ter...@gm...> - 2013-04-11 20:15:03
The page fault loop for the same address happens in my UML. But for both my UML and the mainline (I am using 3.7.1) kernel, the addresses that trigger the page fault (in the child thread) are covered by certain vm areas. I used gdb to trace the function calls and noticed that mmap_region() is never called during the execution of the child task. I am guessing it's because the child task does not use a large enough memory space to have the UML-installed mapping for it.

The major change I made to my kernel is to modify the vm area pointers of certain child tasks to share the vm area structures of their parent task. So the parent task's vm areas are shared (as long as VM_DONTCOPY is not set) among some of its child tasks.

On Thu, Apr 11, 2013 at 9:04 AM, richard -rw- weinberger <ric...@gm...> wrote:
> On Thu, Apr 11, 2013 at 6:15 AM, Terry Hsu <ter...@gm...> wrote:
> > [...]
> >
> > So in my case, the faulted address does not require a new vma to be
> > installed.
>
> But this is a feature added by you?
> We are not talking about a mainline kernel, right?
>
> > Also I've looked into copy_mm() to see how pages are copied from parent task
> > to its child. I do not understand the purpose of the special mapping
> > installed by UML. [...]
>
> The special mapping (the SKAS stub) is needed to install new mappings
> from the host side of UML.
> Currently the stub pages have a vma; this will go away such that they
> have only a PTE.
>
> --
> Thanks,
> //richard
From: richard -r. w. <ric...@gm...> - 2013-04-11 21:19:08
On Thu, Apr 11, 2013 at 10:14 PM, Terry Hsu <ter...@gm...> wrote:
> The page fault loop for the same address happens in my UML. But for both my
> UML and the mainline (I am using 3.7.1) kernel, the addresses that trigger
> the page fault (in the child thread) are covered by certain vm areas. I use
> gdb to trace the function call and notice that mmap_region() is never called
> during the execution of the child task. I am guessing it's because the child
> task does not use large enough memory space to have the UML installed
> mapping for it.

Okay, let's try to figure out what happens here.

The UML _guest_ process has some vmas installed; upon access, the host kernel finds out that there is no memory mapping installed on the _host_ side of UML and sends SIGSEGV to the process. UML's host part catches the SIGSEGV and tries to fix it. Usually it does so by mmap()'ing the faulting page into the UML guest process. This is where the SKAS stub magic happens. It writes the to-be-fixed address into STUB_DATA and sets EIP/RIP to STUB_CODE such that the process itself calls mmap(). After the stub has finished, it traps itself and the UML emulation continues.

Now we need to figure out:
a) What address is faulting and why?
b) What does the UML _host_ side do to fix it? i.e. what are the mmap() parameters?
c) Does this mmap() fail?

To me it looks like UML is unable to fix the fault and therefore it faults over and over again.

--
Thanks,
//richard
From: Terry H. <ter...@gm...> - 2013-04-11 23:00:58
In the unmodified kernel, I did not see the kernel call mmap (which in turn calls mmap_region()) to install the mapping for the faulting page in the child task. The child task does not have the UML-invoked mmap to install the mapping, so I could not examine the parameters passed to mmap, nor its return value.

Thanks for the explanation of the special mapping. After reading your comment I went to Jeff Dike's website to find out more about SKAS: http://user-mode-linux.sourceforge.net/old/skas.html

handle_pte_fault() calls __do_fault(), which in turn invokes filemap_fault() through vma->vm_ops->fault(vma, &vmf). How do I find out exactly what the missed address is for? I am posting the log I printed out below. This is from the unmodified kernel, so the page is faulted in correctly without calling mmap for the forked child task.

Note: this is the correct version of the page fault in the unmodified kernel.

[segv_handler] Caller is userspace+0x25d/0x44c, pid 598 a.out
[segv] Caller is segv_handler+0xb1/0xbb, pid 598 a.out
[handle_page_fault] Caller is segv+0xfa/0x324, pid 598 a.out
[handle_page_fault] fault address: 0x400e9cc8
[handle_page_fault] page walk for 0x400e9cc8
[handle_page_fault] pte does not exist!
[handle_page_fault] before handle_page_fault
[print_mm_rss_stat] mm->rss_stat for mm id: 673
[print_mm_rss_stat] mm->rss_stat.count[0] = 0
[print_mm_rss_stat] mm->rss_stat.count[1] = 27
[print_mm_rss_stat] mm->rss_stat.count[2] = 0
[find_vma] Caller is handle_page_fault+0x1ca/0x957, pid 598 a.out
[handle_mm_fault] Caller is handle_page_fault+0x50d/0x957, pid 598 a.out
[handle_mm_fault] pgd: 295944192
[handle_mm_fault] pud: 295944192
[handle_mm_fault] pmd: 294746112
[handle_mm_fault] pte: 295581512
[handle_pte_fault] calling do_linear_fault
[__do_fault] __do_fault for 0x400e9cc8
[__do_fault] line 3292 of file mm/memory.c, pid 598
[filemap_fault] line 1604 of file mm/filemap.c, pid 598
[filemap_fault] line 1622 of file mm/filemap.c, pid 598
[filemap_fault] line 1654 of file mm/filemap.c, pid 598
[filemap_fault] line 1680 of file mm/filemap.c, pid 598
[__do_fault] line 3312 of file mm/memory.c, pid 598
[__do_fault] line 3367 of file mm/memory.c, pid 598
[__do_fault] line 3395 of file mm/memory.c, pid 598
[__do_fault] line 3408 of file mm/memory.c, pid 598
[__do_fault] line 3425 of file mm/memory.c, pid 598
[__do_fault] line 3458 of file mm/memory.c, pid 598
[__do_fault] __do_fault for 0x400e9cc8 returning 512
[handle_page_fault] line 205 of file arch/um/kernel/trap.c, pid 598
[handle_page_fault] mm->mm_id: 673
[flush_tlb_page] Caller is handle_page_fault+0x7f5/0x957, pid 598 a.out
[flush_tlb_page] mm->mm_id: 673
[handle_page_fault] page walk for 0x400e9cc8
[handle_page_fault] pte for 0x400e9cc8: 0x119e3748
[handle_page_fault] after handle_page_fault
[print_mm_rss_stat] mm->rss_stat for mm id: 673
[print_mm_rss_stat] mm->rss_stat.count[0] = 1
[print_mm_rss_stat] mm->rss_stat.count[1] = 27
[print_mm_rss_stat] mm->rss_stat.count[2] = 0

On Thu, Apr 11, 2013 at 5:19 PM, richard -rw- weinberger <ric...@gm...> wrote:
> On Thu, Apr 11, 2013 at 10:14 PM, Terry Hsu <ter...@gm...> wrote:
> > [...]
>
> Okay, let's try to figure out what happens here.
> The UML _guest_ process has some vmas installed; upon access the host
> kernel finds out that there is no memory mapping installed in the _host_
> side of UML and sends SIGSEGV to the process. UML's host part catches the
> SIGSEGV and tries to fix it. Usually it does so by mmap()'ing the faulting
> page into the UML guest process.
> This is where the SKAS stub magic happens. It writes the to-be-fixed
> address into STUB_DATA and sets EIP/RIP to STUB_CODE such that the process
> itself calls mmap(). After the stub has finished it traps itself and the
> UML emulation continues.
>
> Now we need to figure out: a) What address is faulting and why? b) What
> does the UML _host_ side do to fix it? i.e. What are the mmap() parameters?
> c) Does this mmap() fail?
>
> To me it looks like UML is unable to fix the fault and therefore it
> faults over and over again.
>
> --
> Thanks,
> //richard
From: Terry H. <ter...@gm...> - 2013-04-12 05:15:41
Okay, so I looked into the faultinfo structure and was able to obtain the faulting address, error code, and trap number(?). From my understanding, the error code is the bottom 3 bits of the exception code, but I sometimes see error code "20" and do not know what it means.

I am now looking at how the special mapping works with the host kernel. I think this might lead me to the solution of my problem. It sounds like the special mapping is not installed correctly, so UML was not able to fix the fault.

On Thu, Apr 11, 2013 at 7:00 PM, Terry Hsu <ter...@gm...> wrote:
> [...]
From: Terry H. <ter...@gm...> - 2013-04-12 19:59:43
Do you know which functions are used by UML to write the to be fixed address into SKAS stub? In the handle_page_fault(), the stub mapping is never referenced. I print out vm area (in find_vma()) if the address cover by the stub mapping is referenced, and it prints nothing there. I want to know when/where the UML writes the to be fixed address into SKAS stub so I can fix the problem accordingly. I think my UML is using the wrong SKAS stub to fixed the fault... Thanks! On Fri, Apr 12, 2013 at 1:14 AM, Terry Hsu <ter...@gm...> wrote: > okay so I looked into the faultinfo structure and was able to obtain the > faulting address, error code, and trap number(?). From my understanding the > error code is the bottom 3 bits of the exception code. But I see error code > "20" sometimes and do not what it means. > I am now looking at how the special mapping works with the host kernel. I > think this might lead me to the solution of my problem. It sounds like the > special mapping is not installed correctly so that the UML was not able to > fix the fault. > > > > > On Thu, Apr 11, 2013 at 7:00 PM, Terry Hsu <ter...@gm...> wrote: > >> In the unmodified kernel, I did not see the kernel call mmap (which in >> turn calls mmap_region) to install the mapping for the faulting page in >> child task. The child task does not have the UML invoked mmap to install >> mapping. So I could not examine the parameters passed to mmap neither the >> return value of it. >> >> Thanks for the explanation of the special mapping. After reading your >> comment I went to Jeff Dike's website to find out more about skas: >> http://user-mode-linux.sourceforge.net/old/skas.html >> >> The handle_pte_fault() calls __do_fault(), which in turn invokes >> filemap_fault() through >> vma->vm_ops->fault(vma, &vmf). How do I find out exactly what the miss >> address is for? I am posting the log I print out here. This is the >> unmodified kernel version. 
So the page is faulted in correctly without >> calling mmap for the forked child task. >> >> *Note: this is the correct version of page fault in the unmodified >> kernel.* >> [segv_handler] Caller is userspace+0x25d/0x44c, pid 598 a.out >> [segv] Caller is segv_handler+0xb1/0xbb, pid 598 a.out >> [handle_page_fault] Caller is segv+0xfa/0x324, pid 598 a.out >> [handle_page_fault] fault address: 0x400e9cc8 >> [handle_page_fault] page walk for 0x400e9cc8 >> [handle_page_fault] pte does not exist! >> [handle_page_fault] before handle_page_fault >> [print_mm_rss_stat] mm->rss_stat for mm id: 673 >> [print_mm_rss_stat] mm->rss_stat.count[0] = 0 >> [print_mm_rss_stat] mm->rss_stat.count[1] = 27 >> [print_mm_rss_stat] mm->rss_stat.count[2] = 0 >> [find_vma] Caller is handle_page_fault+0x1ca/0x957, pid 598 a.out >> [handle_mm_fault] Caller is handle_page_fault+0x50d/0x957, pid 598 a.out >> [handle_mm_fault] pgd: 295944192 >> [handle_mm_fault] pud: 295944192 >> [handle_mm_fault] pmd: 294746112 >> [*handle_mm_fault*] pte: 295581512 >> [*handle_pte_fault*] calling do_linear_fault >> [*__do_fault*] __do_fault for 0x400e9cc8 >> [__do_fault] line 3292 of file mm/memory.c, pid 598 >> [*filemap_fault*] line 1604 of file mm/filemap.c, pid 598 >> [filemap_fault] line 1622 of file mm/filemap.c, pid 598 >> [filemap_fault] line 1654 of file mm/filemap.c, pid 598 >> [filemap_fault] line 1680 of file mm/filemap.c, pid 598 >> [__do_fault] line 3312 of file mm/memory.c, pid 598 >> [__do_fault] line 3367 of file mm/memory.c, pid 598 >> [__do_fault] line 3395 of file mm/memory.c, pid 598 >> [__do_fault] line 3408 of file mm/memory.c, pid 598 >> [__do_fault] line 3425 of file mm/memory.c, pid 598 >> [__do_fault] line 3458 of file mm/memory.c, pid 598 >> [__do_fault] __do_fault for 0x400e9cc8 returning 512 >> [handle_page_fault] line 205 of file arch/um/kernel/trap.c, pid 598 >> [handle_page_fault] mm->mm_id: 673 >> [flush_tlb_page] Caller is handle_page_fault+0x7f5/0x957, pid 598 a.out >> 
[flush_tlb_page] mm->mm_id: 673 >> [handle_page_fault] page walk for 0x400e9cc8 >> [handle_page_fault] pte for 0x400e9cc8: 0x119e3748 >> [handle_page_fault] after handle_page_fault >> [print_mm_rss_stat] mm->rss_stat for mm id: 673 >> [print_mm_rss_stat] mm->rss_stat.count[0] = 1 >> [print_mm_rss_stat] mm->rss_stat.count[1] = 27 >> [print_mm_rss_stat] mm->rss_stat.count[2] = 0 >> >> [snip: richard's reply of Thu, Apr 11, quoted in full above] >> > |
From: Terry H. <ter...@gm...> - 2013-04-13 03:00:19
|
On Fri, Apr 12, 2013 at 1:14 AM, Terry Hsu <ter...@gm...> wrote: > okay so I looked into the faultinfo structure and was able to obtain the > faulting address, error code, and trap number(?). From my understanding the > error code is the bottom 3 bits of the exception code. But I see error code > "20" sometimes and do not what it means. > According to p.6-55 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3: System Programming Guide<http://download.intel.com/design/processor/manuals/253668.pdf>, the lower 5 bits of the page-fault error code are the Present, Read/Write, User/supervisor, RSVD, and Instruction/Data bits respectively. So error code 20 (binary 10100) means the fault was caused by an instruction fetch from a non-present page in user mode. I found the reason why the fault cannot be fixed by UML. It is probably because UML puts the faultinfo into the wrong stub: since I changed the vm area pointers of the child process, when the fault happens UML incorrectly finds the parent process's stub pages and puts the faultinfo there. Therefore, when the child process tries to access its own SKAS stub to fix the fault, it cannot find the correct instruction pointers, and the fault happens endlessly. Why does every process that runs in UML need its own stub for page fault handling? It seems to me they could have shared the SIGSEGV signal handler and the function that invokes mmap, munmap, and mprotect. That way, only two pages would be needed for all the processes. I am not sure if I understand the whole thing correctly. Please correct me if it's not right. Thanks! I am now looking at how the special mapping works with the host kernel. I > think this might lead me to the solution of my problem. It sounds like the > special mapping is not installed correctly so that the UML was not able to > fix the fault. 
> > [snip: earlier messages and debug log, quoted in full above] |
From: richard -r. w. <ric...@gm...> - 2013-04-13 09:22:59
|
On Sat, Apr 13, 2013 at 4:59 AM, Terry Hsu <ter...@gm...> wrote: > > On Fri, Apr 12, 2013 at 1:14 AM, Terry Hsu <ter...@gm...> wrote: >> >> okay so I looked into the faultinfo structure and was able to obtain the >> faulting address, error code, and trap number(?). From my understanding the >> error code is the bottom 3 bits of the exception code. But I see error code >> "20" sometimes and do not what it means. > > > According to p.6-55 in Intel® 64 and IA-32 Architectures Software > Developer’s Manual, Volume 3: System Programming Guide, the lower 5 bits are > Present, Read/Write, User/supervisor, RSVD, and Instruction/Data bit > respectively. So error code 20 means the fault is caused by an instruction > read to a non-present page in user mode. > > I found the the reason why the fault cannot be fixed by UML. It is probably > because UML puts the faultinfo in the wrong stub, since I changed the vm > area pointers of the child process, when the fault happens, UML incorrectly > finds its parent process's stub pages and puts the faultinfo in it. > Therefore when the child process tries to access its own skas stub and fix > the fault, it still cannot find the correct instruction pointers hence the > fault happens endlessly. Can you share an example with us that triggers the issue? > Why does every process that runs in UML need its own stub for page fault > handling? It seems to me they could've shared the SIGSEGV signal handler and > the function that invokes mmap, munmap, mprotect. In this way only two pages > are needed for all the processes. > > I am not sure if I understand the whole thing correctly. Please correct me > if it's not right. We need a stub per process because the stub installs a mapping into the process (on the host side). As mmap() always operates on current, we need a way to make the process call mmap() itself. The stub does this. -- Thanks, //richard |