mm: write_cache_pages writepage error fix
author Nick Piggin <npiggin@suse.de>
Tue, 6 Jan 2009 22:39:06 +0000 (14:39 -0800)
committer Linus Torvalds <torvalds@linux-foundation.org>
Tue, 6 Jan 2009 23:58:59 +0000 (15:58 -0800)
In write_cache_pages, if ret signals a real error but some pages are still
left in the pagevec, done is set to 1, yet the remaining pages continue to
be processed and ret is overwritten in the process.

It could easily be overwritten with success, and thus success will be
returned even if there is an error.  Thus the caller is told all writes
succeeded, whereas in reality some did not.

Fix this by bailing immediately if there is an error, and retaining the
first error code.
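
The difference can be illustrated with a small user-space sketch. This is
not the kernel code: fake_writepage, write_batch_old and write_batch_new
are hypothetical names made up for illustration, and the pagevec is
modelled as a plain range of page indices where page 1 fails with -EIO.

```c
#include <errno.h>

/* Hypothetical stand-in for the writepage callback: only page 1 fails. */
static int fake_writepage(int page)
{
	return page == 1 ? -EIO : 0;
}

/*
 * Old behaviour (simplified): an error sets done, which only stops the
 * outer loop, so the rest of the batch is still written and ret is
 * overwritten by the result of each later page.
 */
static int write_batch_old(int nr_pages)
{
	int ret = 0, done = 0, i;

	for (i = 0; i < nr_pages; i++) {
		ret = fake_writepage(i);
		if (ret)
			done = 1;	/* does not leave this loop */
	}
	(void)done;
	return ret;
}

/*
 * Fixed behaviour (simplified): bail out of the batch on the first
 * error, so the first error code is the one that is returned.
 */
static int write_batch_new(int nr_pages)
{
	int ret = 0, done = 0, i;

	for (i = 0; i < nr_pages; i++) {
		ret = fake_writepage(i);
		if (ret) {
			done = 1;
			break;
		}
	}
	(void)done;
	return ret;
}
```

With a batch of four pages, write_batch_old(4) returns 0 because pages 2
and 3 succeed after the failure, while write_batch_new(4) returns -EIO.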

This is a data integrity bug.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/page-writeback.c

index 01b9cb8..2e847cd 100644
@@ -944,12 +944,26 @@ retry:
                        }
 
                        ret = (*writepage)(page, wbc, data);
-
-                       if (unlikely(ret == AOP_WRITEPAGE_ACTIVATE)) {
-                               unlock_page(page);
-                               ret = 0;
-                       }
-                       if (ret || (--nr_to_write <= 0))
+                       if (unlikely(ret)) {
+                               if (ret == AOP_WRITEPAGE_ACTIVATE) {
+                                       unlock_page(page);
+                                       ret = 0;
+                               } else {
+                                       /*
+                                        * done_index is set past this page,
+                                        * so media errors will not choke
+                                        * background writeout for the entire
+                                        * file. This has consequences for
+                                        * range_cyclic semantics (ie. it may
+                                        * not be suitable for data integrity
+                                        * writeout).
+                                        */
+                                       done = 1;
+                                       break;
+                               }
+                       }
+
+                       if (--nr_to_write <= 0)
                                done = 1;
                        if (wbc->nonblocking && bdi_write_congested(bdi)) {
                                wbc->encountered_congestion = 1;