fix tmpfs BUG and AOP_WRITEPAGE_ACTIVATE
authorHugh Dickins <hugh@veritas.com>
Mon, 29 Oct 2007 21:37:20 +0000 (14:37 -0700)
committerLinus Torvalds <torvalds@woody.linux-foundation.org>
Tue, 30 Oct 2007 15:06:55 +0000 (08:06 -0700)
It's possible to provoke unionfs (not yet in mainline, though in mm and
some distros) to hit shmem_writepage's BUG_ON(page_mapped(page)).  I expect
it's possible to provoke the 2.6.23 ecryptfs in the same way (but the
2.6.24 ecryptfs no longer calls lower level's ->writepage).

This came to light with the recent find that AOP_WRITEPAGE_ACTIVATE could
leak from tmpfs via write_cache_pages and unionfs to userspace.  There's
already a fix (e423003028183df54f039dfda8b58c49e78c89d7 - writeback: don't
propagate AOP_WRITEPAGE_ACTIVATE) in the tree for that, and it's okay so
far as it goes; but insufficient because it doesn't address the underlying
issue, that shmem_writepage expects to be called only by vmscan (relying on
backing_dev_info capabilities to prevent the normal writeback path from
ever approaching it).

That's an increasingly fragile assumption, and ramdisk_writepage (the other
source of AOP_WRITEPAGE_ACTIVATEs) is already careful to check
wbc->for_reclaim before returning it.  Make the same check in
shmem_writepage, thereby sidestepping the page_mapped BUG also.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Erez Zadok <ezk@cs.sunysb.edu>
Cc: <stable@kernel.org>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/shmem.c

index 404e53b..253d205 100644 (file)
@@ -915,6 +915,21 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
        struct inode *inode;
 
        BUG_ON(!PageLocked(page));
+       /*
+        * shmem_backing_dev_info's capabilities prevent regular writeback or
+        * sync from ever calling shmem_writepage; but a stacking filesystem
+        * may use the ->writepage of its underlying filesystem, in which case
+        * we want to do nothing when that underlying filesystem is tmpfs
+        * (writing out to swap is useful as a response to memory pressure, but
+        * of no use to stabilize the data) - just redirty the page, unlock it
+        * and claim success in this case.  AOP_WRITEPAGE_ACTIVATE, and the
+        * page_mapped check below, must be avoided unless we're in reclaim.
+        */
+       if (!wbc->for_reclaim) {
+               set_page_dirty(page);
+               unlock_page(page);
+               return 0;
+       }
        BUG_ON(page_mapped(page));
 
        mapping = page->mapping;