author Filipe Manana <fdmanana@suse.com> 2019-08-16 16:50:02 +0100
committer Filipe Manana <fdmanana@suse.com> 2019-08-16 16:50:02 +0100
commit fc04aaf77625e365937c728afc3db574dcd8668f
tree d7b0f9cfcdb478c470cc9481cb64ef7e5a473340
parent 7426909ca011cf92ee359d1b1a192a677c94e67b
Btrfs: fix race leading to fs corruption after transaction abort
(bsc#1145937).
-rw-r--r-- patches.suse/btrfs-fix-race-leading-to-fs-corruption-after-transa.patch 144
-rw-r--r-- series.conf 1
2 files changed, 145 insertions, 0 deletions
diff --git a/patches.suse/btrfs-fix-race-leading-to-fs-corruption-after-transa.patch b/patches.suse/btrfs-fix-race-leading-to-fs-corruption-after-transa.patch
new file mode 100644
index 0000000000..dd70a59ff8
--- /dev/null
+++ b/patches.suse/btrfs-fix-race-leading-to-fs-corruption-after-transa.patch
@@ -0,0 +1,144 @@
+From: Filipe Manana <fdmanana@suse.com>
+Date: Thu, 25 Jul 2019 11:27:04 +0100
+Git-commit: cb2d3daddbfb6318d170e79aac1f7d5e4d49f0d7
+Patch-mainline: 5.3-rc3
+Subject: [PATCH] Btrfs: fix race leading to fs corruption after transaction
+ abort
+References: bsc#1145937
+
+When one transaction is finishing its commit, it is possible for another
+transaction to start and enter its initial commit phase as well. If the
+first ends up getting aborted, we have a small time window where the second
+transaction commit does not notice that the previous transaction aborted
+and ends up committing, writing a superblock that points to btrees that
+reference extent buffers (nodes and leafs) that were not persisted to disk.
+The consequence is that after mounting the filesystem again, we will be
+unable to load some btree nodes/leafs, because the content on disk is
+either garbage (or just zeroes) or corresponds to the old content of a
+previously COWed or deleted node/leaf, resulting in the well known error
+messages "parent transid verify failed on ...".
+The following sequence diagram illustrates how this can happen.
+
+          CPU 1                                       CPU 2
+
+  <at transaction N>
+
+  btrfs_commit_transaction()
+    (...)
+    --> sets transaction state to
+        TRANS_STATE_UNBLOCKED
+    --> sets fs_info->running_transaction
+        to NULL
+
+                                              (...)
+                                              btrfs_start_transaction()
+                                                start_transaction()
+                                                  wait_current_trans()
+                                                    --> returns immediately
+                                                        because
+                                                        fs_info->running_transaction
+                                                        is NULL
+                                                  join_transaction()
+                                                    --> creates transaction N + 1
+                                                    --> sets
+                                                        fs_info->running_transaction
+                                                        to transaction N + 1
+                                                    --> adds transaction N + 1 to
+                                                        the fs_info->trans_list list
+                                                    --> returns transaction handle
+                                                        pointing to the new
+                                                        transaction N + 1
+                                              (...)
+
+                                              btrfs_sync_file()
+                                                btrfs_start_transaction()
+                                                  --> returns handle to
+                                                      transaction N + 1
+                                                (...)
+
+    btrfs_write_and_wait_transaction()
+      --> writeback of some extent
+          buffer fails, returns an
+          error
+    btrfs_handle_fs_error()
+      --> sets BTRFS_FS_STATE_ERROR in
+          fs_info->fs_state
+    --> jumps to label "scrub_continue"
+    cleanup_transaction()
+      btrfs_abort_transaction(N)
+        --> sets BTRFS_FS_STATE_TRANS_ABORTED
+            flag in fs_info->fs_state
+        --> sets aborted field in the
+            transaction and transaction
+            handle structures, for
+            transaction N only
+      --> removes transaction from the
+          list fs_info->trans_list
+                                              btrfs_commit_transaction(N + 1)
+                                                --> transaction N + 1 was not
+                                                    aborted, so it proceeds
+                                                (...)
+                                                --> sets the transaction's state
+                                                    to TRANS_STATE_COMMIT_START
+                                                --> does not find the previous
+                                                    transaction (N) in the
+                                                    fs_info->trans_list, so it
+                                                    doesn't know that transaction
+                                                    was aborted, and the commit
+                                                    of transaction N + 1 proceeds
+                                                (...)
+                                                --> sets transaction N + 1 state
+                                                    to TRANS_STATE_UNBLOCKED
+                                                btrfs_write_and_wait_transaction()
+                                                  --> succeeds writing all extent
+                                                      buffers created in the
+                                                      transaction N + 1
+                                                write_all_supers()
+                                                  --> succeeds
+                                                  --> we now have a superblock on
+                                                      disk that points to trees
+                                                      that refer to at least one
+                                                      extent buffer that was
+                                                      never persisted
+
+So fix this by updating the transaction commit path: if, after setting the
+transaction state to TRANS_STATE_COMMIT_START, we do not find any previous
+transaction in fs_info->trans_list, check whether the flag
+BTRFS_FS_STATE_TRANS_ABORTED is set in fs_info->fs_state. If it is set, just
+fail the transaction commit with -EROFS, as we do in other places. The exact
+error code for the previous transaction abort was already logged and reported.
+
+Fixes: 49b25e0540904b ("btrfs: enhance transaction abort infrastructure")
+CC: stable@vger.kernel.org # 4.4+
+Reviewed-by: Josef Bacik <josef@toxicpanda.com>
+Signed-off-by: Filipe Manana <fdmanana@suse.com>
+Reviewed-by: David Sterba <dsterba@suse.com>
+Signed-off-by: David Sterba <dsterba@suse.com>
+---
+ fs/btrfs/transaction.c | 10 ++++++++++
+ 1 file changed, 10 insertions(+)
+
+diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
+index 5ce9180030e6..0d0f5b4b819f 100644
+--- a/fs/btrfs/transaction.c
++++ b/fs/btrfs/transaction.c
+@@ -2064,6 +2064,16 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
+ 		}
+ 	} else {
+ 		spin_unlock(&fs_info->trans_lock);
++		/*
++		 * The previous transaction was aborted and was already removed
++		 * from the list of transactions at fs_info->trans_list. So we
++		 * abort to prevent writing a new superblock that reflects a
++		 * corrupt state (pointing to trees with unwritten nodes/leafs).
++		 */
++		if (test_bit(BTRFS_FS_STATE_TRANS_ABORTED, &fs_info->fs_state)) {
++			ret = -EROFS;
++			goto cleanup_transaction;
++		}
+ 	}
+
+ 	extwriter_counter_dec(cur_trans, trans->type);
+--
+2.16.4
+
diff --git a/series.conf b/series.conf
index e8975e358e..a7428b7a58 100644
--- a/series.conf
+++ b/series.conf
@@ -23339,6 +23339,7 @@
patches.drivers/ALSA-pcm-fix-lost-wakeup-event-scenarios-in-snd_pcm_.patch
patches.drivers/ALSA-usb-audio-Fix-gpf-in-snd_usb_pipe_sanity_check.patch
patches.drivers/ACPI-PM-Fix-regression-in-acpi_device_set_power.patch
+ patches.suse/btrfs-fix-race-leading-to-fs-corruption-after-transa.patch
patches.drivers/IB-mlx5-Fix-MR-registration-flow-to-use-UMR-properly.patch
patches.drivers/libata-zpodd-Fix-small-read-overflow-in-zpodd_get_me.patch
patches.drivers/ata-libahci-do-not-complain-in-case-of-deferred-prob.patch
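
Illustrative note (not part of the commit): the race and the added check can be replayed in a small user-space model. This is only a sketch under stated assumptions. Every model_* name below is hypothetical and none of it is kernel code. The transaction list is reduced to a single "previous transaction still listed" pointer, and the two CPUs are replayed sequentially in the order given by the sequence diagram in the commit message.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the BTRFS_FS_STATE_TRANS_ABORTED bit. */
#define MODEL_FS_STATE_TRANS_ABORTED (1u << 0)

/* Hypothetical, heavily simplified transaction. */
struct model_trans {
    unsigned long transid;
    int aborted;                      /* 0, or a negative errno once aborted */
};

/* Hypothetical, heavily simplified fs_info. */
struct model_fs_info {
    unsigned int fs_state;            /* global state flags */
    struct model_trans *prev_on_list; /* previous transaction, NULL once removed */
};

/* CPU 1 in the diagram: abort transaction t, then drop it from the list. */
static void model_abort_and_cleanup(struct model_fs_info *fs,
                                    struct model_trans *t, int err)
{
    fs->fs_state |= MODEL_FS_STATE_TRANS_ABORTED; /* global flag */
    t->aborted = err;                             /* marks this transaction only */
    if (fs->prev_on_list == t)
        fs->prev_on_list = NULL;                  /* no longer visible to others */
}

/*
 * CPU 2 in the diagram: the decision taken at commit start.
 * with_fix selects whether the check added by the patch is modelled.
 */
static int model_commit_start(const struct model_fs_info *fs,
                              const struct model_trans *cur, bool with_fix)
{
    if (cur->aborted)
        return cur->aborted;

    if (fs->prev_on_list != NULL) {
        /* A previous transaction is still listed: its abort is visible. */
        if (fs->prev_on_list->aborted)
            return fs->prev_on_list->aborted;
    } else if (with_fix && (fs->fs_state & MODEL_FS_STATE_TRANS_ABORTED)) {
        /* Previous transaction already removed: consult the global flag. */
        return -EROFS;
    }
    return 0; /* the commit proceeds and a superblock would be written */
}

/* Replays the interleaving: N is aborted and removed, then N + 1 commits. */
static int model_replay_race(bool with_fix)
{
    struct model_trans n = { .transid = 1, .aborted = 0 };
    struct model_trans n1 = { .transid = 2, .aborted = 0 };
    struct model_fs_info fs = { .fs_state = 0, .prev_on_list = &n };

    model_abort_and_cleanup(&fs, &n, -EIO);
    return model_commit_start(&fs, &n1, with_fix);
}
```

In this model, model_replay_race(false) returns 0, which corresponds to the commit of transaction N + 1 going ahead unnoticed, while model_replay_race(true) returns -EROFS, the error the patch uses to fail the commit.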