On Fri, Apr 12, 2013 at 1:14 AM, Terry Hsu <terry.shoes@gmail.com> wrote:
Okay, so I looked into the faultinfo structure and was able to obtain the faulting address, error code, and trap number(?). From my understanding the error code is the bottom 3 bits of the exception code. But I sometimes see error code "20" and did not know what it means.

According to p. 6-55 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3: System Programming Guide, the lowest five bits (bit 0 through bit 4) are the Present, Read/Write, User/Supervisor, RSVD, and Instruction/Data bits, respectively. So error code 20 (binary 10100) means the fault was caused by an instruction fetch from a non-present page in user mode.
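To double-check that reading, here is a minimal C sketch that decodes the five low bits of an x86 page-fault error code. The macro names, `struct pf_info`, `pf_decode()`, and `pf_print()` are made up for illustration and are not the kernel's own identifiers. It also assumes the "20" in my log is decimal; if it were printed as hex, the result would be different.

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative names for the x86 page-fault error code bits (bits 0..4). */
#define PF_BIT_PRESENT (1UL << 0) /* 0: page not present, 1: protection violation */
#define PF_BIT_WRITE   (1UL << 1) /* 0: read access,      1: write access         */
#define PF_BIT_USER    (1UL << 2) /* 0: supervisor mode,  1: user mode            */
#define PF_BIT_RSVD    (1UL << 3) /* 1: reserved bit set in a paging structure    */
#define PF_BIT_INSTR   (1UL << 4) /* 1: fault on an instruction fetch             */

struct pf_info {
    bool present; /* protection violation on a present page */
    bool write;
    bool user;
    bool rsvd;
    bool ifetch;
};

/* Split an error code into its individual flag bits. */
struct pf_info pf_decode(unsigned long err)
{
    struct pf_info info;

    info.present = (err & PF_BIT_PRESENT) != 0;
    info.write   = (err & PF_BIT_WRITE) != 0;
    info.user    = (err & PF_BIT_USER) != 0;
    info.rsvd    = (err & PF_BIT_RSVD) != 0;
    info.ifetch  = (err & PF_BIT_INSTR) != 0;
    return info;
}

/* Print a one-line human-readable summary of an error code. */
void pf_print(unsigned long err)
{
    struct pf_info info = pf_decode(err);

    printf("error code %lu: %s page, %s, %s mode%s%s\n", err,
           info.present ? "present (protection violation)" : "non-present",
           info.write ? "write" : "read",
           info.user ? "user" : "supervisor",
           info.rsvd ? ", reserved bit set" : "",
           info.ifetch ? ", instruction fetch" : "");
}
```

Calling `pf_print(20)` should report a non-present page, a read access, user mode, and an instruction fetch, which matches the interpretation above.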

I think I found the reason why the fault cannot be fixed by UML. It is probably because UML puts the faultinfo in the wrong stub. Since I changed the vm area pointers of the child process, when the fault happens UML incorrectly finds the parent process's stub pages and puts the faultinfo there. Therefore, when the child process tries to access its own skas stub to fix the fault, it cannot find the correct instruction pointers, and the fault happens endlessly.

Why does every process that runs in UML need its own stub for page fault handling? It seems to me they could've shared the SIGSEGV signal handler and the function that invokes mmap, munmap, mprotect. In this way only two pages are needed for all the processes.

I am not sure if I understand the whole thing correctly. Please correct me if it's not right. 


I am now looking at how the special mapping works with the host kernel. I think this might lead me to the solution of my problem. It sounds like the special mapping is not installed correctly, so UML was not able to fix the fault.

On Thu, Apr 11, 2013 at 7:00 PM, Terry Hsu <terry.shoes@gmail.com> wrote:
In the unmodified kernel, I did not see the kernel call mmap (which in turn calls mmap_region) to install the mapping for the faulting page in the child task. The child task never has a UML-invoked mmap install a mapping for it, so I could examine neither the parameters passed to mmap nor its return value.

Thanks for the explanation of the special mapping. After reading your comment I went to Jeff Dike's website to find out more about skas: http://user-mode-linux.sourceforge.net/old/skas.html

handle_pte_fault() calls __do_fault(), which in turn invokes filemap_fault() through vma->vm_ops->fault(vma, &vmf). How do I find out exactly what the faulting (miss) address is for? I am posting the log I printed out below. This is the unmodified kernel version, so the page is faulted in correctly without calling mmap for the forked child task.

Note: this is the correct version of page fault in the unmodified kernel.
[segv_handler] Caller is userspace+0x25d/0x44c, pid 598 a.out
[segv] Caller is segv_handler+0xb1/0xbb, pid 598 a.out
[handle_page_fault] Caller is segv+0xfa/0x324, pid 598 a.out
[handle_page_fault] fault address: 0x400e9cc8
[handle_page_fault] page walk for 0x400e9cc8
[handle_page_fault] pte does not exist!
[handle_page_fault] before handle_page_fault
[print_mm_rss_stat] mm->rss_stat for mm id: 673
[print_mm_rss_stat] mm->rss_stat.count[0] = 0 
[print_mm_rss_stat] mm->rss_stat.count[1] = 27 
[print_mm_rss_stat] mm->rss_stat.count[2] = 0 
[find_vma] Caller is handle_page_fault+0x1ca/0x957, pid 598 a.out
[handle_mm_fault] Caller is handle_page_fault+0x50d/0x957, pid 598 a.out
[handle_mm_fault] pgd: 295944192
[handle_mm_fault] pud: 295944192
[handle_mm_fault] pmd: 294746112
[handle_mm_fault] pte: 295581512
[handle_pte_fault] calling do_linear_fault
[__do_fault] __do_fault for 0x400e9cc8
[__do_fault] line 3292 of file mm/memory.c, pid 598
[filemap_fault] line 1604 of file mm/filemap.c, pid 598
[filemap_fault] line 1622 of file mm/filemap.c, pid 598
[filemap_fault] line 1654 of file mm/filemap.c, pid 598
[filemap_fault] line 1680 of file mm/filemap.c, pid 598
[__do_fault] line 3312 of file mm/memory.c, pid 598
[__do_fault] line 3367 of file mm/memory.c, pid 598
[__do_fault] line 3395 of file mm/memory.c, pid 598
[__do_fault] line 3408 of file mm/memory.c, pid 598
[__do_fault] line 3425 of file mm/memory.c, pid 598
[__do_fault] line 3458 of file mm/memory.c, pid 598
[__do_fault] __do_fault for 0x400e9cc8 returning 512
[handle_page_fault] line 205 of file arch/um/kernel/trap.c, pid 598
[handle_page_fault] mm->mm_id: 673
[flush_tlb_page] Caller is handle_page_fault+0x7f5/0x957, pid 598 a.out
[flush_tlb_page] mm->mm_id: 673
[handle_page_fault] page walk for 0x400e9cc8
[handle_page_fault] pte for 0x400e9cc8: 0x119e3748
[handle_page_fault] after handle_page_fault
[print_mm_rss_stat] mm->rss_stat for mm id: 673
[print_mm_rss_stat] mm->rss_stat.count[0] = 1 
[print_mm_rss_stat] mm->rss_stat.count[1] = 27 
[print_mm_rss_stat] mm->rss_stat.count[2] = 0 

On Thu, Apr 11, 2013 at 5:19 PM, richard -rw- weinberger <richard.weinberger@gmail.com> wrote:
On Thu, Apr 11, 2013 at 10:14 PM, Terry Hsu <terry.shoes@gmail.com> wrote:
> The page fault loop for the same address happens in my UML. But for both my
> UML and the mainline (I am using 3.7.1) kernel, the addresses that trigger
> the page fault (in the child thread) are covered by certain vm areas. I use
> gdb to trace the function call and notice that mmap_region() is never called
> during the execution of the child task. I am guessing it's because the child
> task does not use large enough memory space to have the UML installed
> mapping for it.

Okay, let's try to figure out what happens here.
The UML _guest_ process has some vmas installed. Upon access, the host
kernel finds out that there is no memory mapping installed on the _host_
side of UML and sends SIGSEGV to the process. UML's host part catches the
SIGSEGV and tries to fix it.
Usually it does so by mmap()'ing the faulting page into the UML guest process.
This is where the SKAS stub magic happens. It writes the to-be-fixed
address into STUB_DATA and sets EIP/RIP to STUB_CODE such that the
process itself calls mmap().
After the stub has finished it traps itself and the UML emulation continues.
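
As a rough userspace analogy only (this is not the SKAS stub code, and all names in it are hypothetical), the sketch below shows the general "catch SIGSEGV, mmap() the missing page, retry" pattern within a single Linux process. In UML the tracing kernel process arranges for the guest to run the stub; here the signal handler does the mmap() directly. It assumes a single-threaded Linux program and relies on mmap() being usable from a signal handler, which works on Linux but is not guaranteed by POSIX.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile sig_atomic_t faults_fixed; /* number of faults repaired */
static uintptr_t page_size;                /* cached before any fault   */

/* SIGSEGV handler: map an anonymous page over the faulting address. */
static void segv_fixup(int sig, siginfo_t *info, void *ucontext)
{
    uintptr_t page = (uintptr_t)info->si_addr & ~(page_size - 1);
    void *p;

    (void)sig;
    (void)ucontext;
    p = mmap((void *)page, page_size, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED)
        _exit(1); /* an unfixed fault would re-trigger on return, forever */
    faults_fixed++;
}

/* Returns the number of faults fixed (expected: 1), or -1 on error. */
int demo_fault_fixup(void)
{
    struct sigaction sa = {0}, old;
    volatile char *p;
    void *addr;
    int ok;

    page_size = (uintptr_t)sysconf(_SC_PAGESIZE);
    sa.sa_sigaction = segv_fixup;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    /* Pick a valid address, then unmap it so the next access faults. */
    addr = mmap(NULL, page_size, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
        return -1;
    munmap(addr, page_size);

    faults_fixed = 0;
    p = addr;
    p[0] = 42;           /* faults once; the store is retried after fixup */
    ok = (p[0] == 42);

    munmap(addr, page_size);
    sigaction(SIGSEGV, &old, NULL);
    return ok ? (int)faults_fixed : -1;
}
```

If the mmap() in the handler did not actually make the address accessible, the faulting instruction would be restarted and fault again immediately, which is the same kind of endless loop described in this thread.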

Now we need to figure out a) What address is faulting and why? b) What
does the UML _host_ side code do to fix it? i.e. What are the mmap()
parameters? c) Does this mmap() fail?

To me it looks like UML is unable to fix the fault and therefore it
faults over and over again.