author NeilBrown <neilb@suse.com> 2018-11-01 14:04:41 +1100
committer NeilBrown <neilb@suse.com> 2018-11-01 14:06:04 +1100
commit 014cb60e0001e88cb0cd96809d1a946f5e3c04f3
tree 265f9ec1a9f55c012e6a86dfd9023fe1d8740a3f
parent 8c6c11a7a447e3ff9c46123ee73b471320bb4a40
md/raid5: fix data corruption of replacements after originals
dropped (git-fixes).
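
The six lines this patch adds to analyse_stripe() can be read as a small decision rule: a failed slot now also qualifies for recovery when the original device is gone but a non-faulty replacement is present. The following is a minimal user-space sketch of that rule only, not the kernel code. The names toy_rdev, toy_slot, usable(), do_recovery_before(), do_recovery_after() and check_cases() are hypothetical stand-ins invented for illustration, and a NULL original pointer is assumed to model a dropped device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct md_rdev: only the Faulty bit is modelled. */
struct toy_rdev {
	bool faulty;
};

/* Hypothetical stand-in for one entry of conf->disks[]. */
struct toy_slot {
	const struct toy_rdev *rdev;        /* original device, NULL once dropped (assumption) */
	const struct toy_rdev *replacement; /* replacement spare, may be NULL */
};

/* Mirrors the test "rdev && !test_bit(Faulty, &rdev->flags)". */
static bool usable(const struct toy_rdev *d)
{
	return d != NULL && !d->faulty;
}

/* Rule before the patch: only a present, non-faulty original requests recovery. */
static bool do_recovery_before(const struct toy_slot *slot)
{
	return usable(slot->rdev);
}

/* Rule after the patch: if the original is absent, fall back to the replacement. */
static bool do_recovery_after(const struct toy_slot *slot)
{
	if (usable(slot->rdev))
		return true;
	if (slot->rdev == NULL)
		return usable(slot->replacement);
	return false;
}

/* Walks through the case the commit message describes, plus one unchanged case. */
static void check_cases(void)
{
	struct toy_rdev spare = { false };
	struct toy_rdev original = { false };
	struct toy_slot dropped = { NULL, &spare };
	struct toy_slot healthy = { &original, &spare };

	/* Original dropped mid-replacement: only the patched rule requests recovery. */
	assert(!do_recovery_before(&dropped));
	assert(do_recovery_after(&dropped));

	/* Original still present: both rules agree. */
	assert(do_recovery_before(&healthy));
	assert(do_recovery_after(&healthy));
}
```

Calling check_cases() exercises the scenario from the commit message: with the original dropped and a healthy spare in place, only the patched rule sets the recovery flag, which is what lets the stripe be rebuilt from the remaining devices instead of being written out with stale data.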
-rw-r--r-- patches.fixes/md-raid5-fix-data-corruption-of-replacements-after-o.patch 75
-rw-r--r-- series.conf 1
2 files changed, 76 insertions, 0 deletions
diff --git a/patches.fixes/md-raid5-fix-data-corruption-of-replacements-after-o.patch b/patches.fixes/md-raid5-fix-data-corruption-of-replacements-after-o.patch
new file mode 100644
index 0000000000..956bb99e51
--- /dev/null
+++ b/patches.fixes/md-raid5-fix-data-corruption-of-replacements-after-o.patch
@@ -0,0 +1,75 @@
+From: BingJing Chang <bingjingc@synology.com>
+Date: Wed, 1 Aug 2018 17:08:36 +0800
+Subject: [PATCH] md/raid5: fix data corruption of replacements after originals
+ dropped
+Git-commit: d63e2fc804c46e50eee825c5d3a7228e07048b47
+Patch-mainline: v4.19
+References: git-fixes
+
+During raid5 replacement, stripes can be marked with the R5_NeedReplace
+flag. Data can then be read from the device being replaced and written
+to the replacing spare without reading all the other devices. (This is
+'replace' mode, s.replacing = 1.) If the device being replaced is
+dropped, the replacement is interrupted and resumed in pure recovery
+mode. However, stripes that already existed before the interruption can
+no longer read from the dropped device. This triggers many WARN_ON
+messages and results in data corruption, because those stripes write
+bad data to the replacement device and still update the progress.
+
+# Erase disks (1MB + 2GB)
+dd if=/dev/zero of=/dev/sda bs=1MB count=2049
+dd if=/dev/zero of=/dev/sdb bs=1MB count=2049
+dd if=/dev/zero of=/dev/sdc bs=1MB count=2049
+dd if=/dev/zero of=/dev/sdd bs=1MB count=2049
+mdadm -C /dev/md0 -amd -R -l5 -n3 -x0 /dev/sd[abc] -z 2097152
+# Ensure array stores non-zero data
+dd if=/root/data_4GB.iso of=/dev/md0 bs=1MB
+# Start replacement
+mdadm /dev/md0 -a /dev/sdd
+mdadm /dev/md0 --replace /dev/sda
+
+Then hot-unplug /dev/sda during recovery and wait for recovery to finish.
+echo check > /sys/block/md0/md/sync_action
+cat /sys/block/md0/md/mismatch_cnt # it will be greater than 0.
+
+Soon after /dev/sda is unplugged, many WARN_ON messages appear and the
+replacement recovery is interrupted. Once the recovery finishes, the
+array contains corrupted data.
+
+This is simply an unhandled case of replacement. In commit
+<f94c0b6658c7> (md/raid5: fix interaction of 'replace' and 'recovery'.),
+a NeedReplace device that is not UPTODATE is treated as an error, but
+the commit only prints a WARN_ON and still marks these corrupted stripes
+with R5_WantReplace (meaning they are ready for writes).
+
+To fix this case, we can leverage the 'sync and replace' mode mentioned
+in commit <9a3e1101b827> (md/raid5: detect and handle replacements during
+recovery.) by adding logic that detects these stripes and uses 'sync and
+replace' mode for them.
+
+Reported-by: Alex Chen <alexchen@synology.com>
+Reviewed-by: Alex Wu <alexwu@synology.com>
+Reviewed-by: Chung-Chiang Cheng <cccheng@synology.com>
+Signed-off-by: BingJing Chang <bingjingc@synology.com>
+Signed-off-by: Shaohua Li <shli@fb.com>
+Acked-by: NeilBrown <neilb@suse.com>
+
+---
+ drivers/md/raid5.c | 6 ++++++
+ 1 file changed, 6 insertions(+)
+
+--- a/drivers/md/raid5.c
++++ b/drivers/md/raid5.c
+@@ -4516,6 +4516,12 @@ static void analyse_stripe(struct stripe
+ 			s->failed++;
+ 			if (rdev && !test_bit(Faulty, &rdev->flags))
+ 				do_recovery = 1;
++			else if (!rdev) {
++				rdev = rcu_dereference(
++				    conf->disks[i].replacement);
++				if (rdev && !test_bit(Faulty, &rdev->flags))
++					do_recovery = 1;
++			}
+ 		}
+
+ 		if (test_bit(R5_InJournal, &dev->flags))
diff --git a/series.conf b/series.conf
index 41b791eeea..a8180e8784 100644
--- a/series.conf
+++ b/series.conf
@@ -17337,6 +17337,7 @@
 	patches.suse/0001-md-cluster-clear-another-node-s-suspend_area-after-t.patch
 	patches.suse/0002-md-cluster-show-array-s-status-more-accurate.patch
 	patches.suse/0003-md-cluster-don-t-send-msg-if-array-is-closing.patch
+	patches.fixes/md-raid5-fix-data-corruption-of-replacements-after-o.patch
 	patches.arch/0001-x86-init-fix-build-with-CONFIG_SWAP-n.patch
 	patches.drivers/spi-cadence-Change-usleep_range-to-udelay-for-atomic.patch
 	patches.drivers/spi-davinci-fix-a-NULL-pointer-dereference.patch