author Denis Kirjanov <dkirjanov@suse.com> 2019-10-14 11:47:37 +0300
committer Denis Kirjanov <dkirjanov@suse.com> 2019-10-14 11:47:37 +0300
commit af5659c9d557b9a3982ccb17c1307dec44ccd425
tree f9236531de8e6dc3e66f15efefa2385e978d5638
parent 2818c69f4d0b57bfd34f9b4c73f7e2d8b325c346
parent 5faace8e2db41a0f2d74bf523d5cbb1cd1c9c106
Merge 'users/wqu/SLE12-SP5/for-next' into SLE12-SP5
Pull btrfs fixes from Qu Wenruo
-rw-r--r-- patches.suse/0001-btrfs-qgroup-Fix-the-wrong-target-io_tree-when-freei.patch 84
-rw-r--r-- patches.suse/0001-btrfs-relocation-fix-use-after-free-on-dead-relocati.patch 212
-rw-r--r-- patches.suse/0002-btrfs-qgroup-Fix-reserved-data-space-leak-if-we-have.patch 90
-rw-r--r-- series.conf 3
4 files changed, 389 insertions, 0 deletions
diff --git a/patches.suse/0001-btrfs-qgroup-Fix-the-wrong-target-io_tree-when-freei.patch b/patches.suse/0001-btrfs-qgroup-Fix-the-wrong-target-io_tree-when-freei.patch
new file mode 100644
index 0000000000..f0973886fb
--- /dev/null
+++ b/patches.suse/0001-btrfs-qgroup-Fix-the-wrong-target-io_tree-when-freei.patch
@@ -0,0 +1,84 @@
+From bab32fc069ce8829c416e8737c119f62a57970f9 Mon Sep 17 00:00:00 2001
+From: Qu Wenruo <wqu@suse.com>
+Date: Mon, 16 Sep 2019 20:02:38 +0800
+Patch-mainline: v5.4-rc1
+Git-commit: bab32fc069ce8829c416e8737c119f62a57970f9
+References: bsc#1152974
+Subject: [PATCH 1/2] btrfs: qgroup: Fix the wrong target io_tree when freeing
+ reserved data space
+
+[BUG]
+With qgroup enabled, if an error happens after we have reserved
+delalloc space, the error handling path can leak qgroup data space,
+as in the following case:
+
+From btrfs_truncate_block() in inode.c:
+
+ ret = btrfs_delalloc_reserve_space(inode, &data_reserved,
+ block_start, blocksize);
+ if (ret)
+ goto out;
+
+ again:
+ page = find_or_create_page(mapping, index, mask);
+ if (!page) {
+ btrfs_delalloc_release_space(inode, data_reserved,
+ block_start, blocksize, true);
+ btrfs_delalloc_release_extents(BTRFS_I(inode), blocksize, true);
+ ret = -ENOMEM;
+ goto out;
+ }
+
+[CAUSE]
+In the above case, btrfs_delalloc_reserve_space() will call
+btrfs_qgroup_reserve_data() and mark the io_tree range with
+EXTENT_QGROUP_RESERVED flag.
+
+In the error handling path, we have the following call stack:
+btrfs_delalloc_release_space()
+|- btrfs_free_reserved_data_space()
+   |- btrfs_qgroup_free_data()
+      |- __btrfs_qgroup_release_data(reserved=@reserved, free=1)
+         |- qgroup_free_reserved_data(reserved=@reserved)
+            |- clear_record_extent_bits();
+            |- freed += changeset.bytes_changed;
+
+However, due to a completion bug, qgroup_free_reserved_data() clears the
+EXTENT_QGROUP_RESERVED flag in BTRFS_I(inode)->io_failure_tree rather
+than in the correct BTRFS_I(inode)->io_tree.
+Since io_failure_tree is never marked with that flag,
+btrfs_qgroup_free_data() does not free any reserved data space at all,
+causing a leak.
+
+This type of error handling can only be triggered by errors outside of
+the qgroup code, so an EDQUOT error from qgroup can't trigger it.
+
+[FIX]
+Fix the wrong target io_tree.
+
+Reported-by: Josef Bacik <josef@toxicpanda.com>
+Fixes: bc42bda22345 ("btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges")
+CC: stable@vger.kernel.org # 4.14+
+Reviewed-by: Nikolay Borisov <nborisov@suse.com>
+Signed-off-by: Qu Wenruo <wqu@suse.com>
+Signed-off-by: David Sterba <dsterba@suse.com>
+---
+ fs/btrfs/qgroup.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
+index 52701c1be109..4ab85555a947 100644
+--- a/fs/btrfs/qgroup.c
++++ b/fs/btrfs/qgroup.c
+@@ -3486,7 +3486,7 @@ static int qgroup_free_reserved_data(struct inode *inode,
+ * EXTENT_QGROUP_RESERVED, we won't double free.
+ * So not need to rush.
+ */
+- ret = clear_record_extent_bits(&BTRFS_I(inode)->io_failure_tree,
++ ret = clear_record_extent_bits(&BTRFS_I(inode)->io_tree,
+ free_start, free_start + free_len - 1,
+ EXTENT_QGROUP_RESERVED, &changeset);
+ if (ret < 0)
+--
+2.23.0
+
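As an illustration of the [CAUSE] section of the patch above, the following user-space sketch models the two per-inode trees as plain flag arrays. It is not kernel code: `toy_tree`, `toy_inode`, `toy_reserve()` and `toy_free()` are hypothetical names, and the 4 KiB block size and 16-block range are arbitrary assumptions. It only shows why clearing the flag in a tree that never carried it reports zero changed bytes, so nothing is returned to the reservation counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BLOCKS 16            /* assumption: tracked range, in blocks */
#define TOY_BLOCK_SIZE 4096u     /* assumption: bytes per block */
#define TOY_QGROUP_RESERVED 0x1u /* stands in for EXTENT_QGROUP_RESERVED */

/* Hypothetical stand-in for an extent io tree: one flag word per block. */
struct toy_tree {
	unsigned int flags[TOY_BLOCKS];
};

struct toy_inode {
	struct toy_tree io_tree;         /* carries the reservation flag */
	struct toy_tree io_failure_tree; /* never carries it */
	uint64_t qgroup_reserved;        /* bytes currently reserved */
};

/* Set @bits on [start, start + nr) and return the bytes that changed. */
static uint64_t toy_set_bits(struct toy_tree *tree, size_t start, size_t nr,
			     unsigned int bits)
{
	uint64_t changed = 0;

	assert(start + nr <= TOY_BLOCKS);
	for (size_t i = start; i < start + nr; i++) {
		if ((tree->flags[i] & bits) != bits) {
			tree->flags[i] |= bits;
			changed += TOY_BLOCK_SIZE;
		}
	}
	return changed;
}

/* Clear @bits on [start, start + nr) and return the bytes that changed. */
static uint64_t toy_clear_bits(struct toy_tree *tree, size_t start, size_t nr,
			       unsigned int bits)
{
	uint64_t changed = 0;

	assert(start + nr <= TOY_BLOCKS);
	for (size_t i = start; i < start + nr; i++) {
		if (tree->flags[i] & bits) {
			tree->flags[i] &= ~bits;
			changed += TOY_BLOCK_SIZE;
		}
	}
	return changed;
}

/* Reserve: mark the range in io_tree and account the newly marked bytes. */
static void toy_reserve(struct toy_inode *inode, size_t start, size_t nr)
{
	inode->qgroup_reserved += toy_set_bits(&inode->io_tree, start, nr,
					       TOY_QGROUP_RESERVED);
}

/* Free: only bytes whose flag was actually cleared in @tree are given back. */
static uint64_t toy_free(struct toy_inode *inode, struct toy_tree *tree,
			 size_t start, size_t nr)
{
	uint64_t freed = toy_clear_bits(tree, start, nr, TOY_QGROUP_RESERVED);

	inode->qgroup_reserved -= freed;
	return freed;
}
```

Passing `&inode->io_failure_tree` to `toy_free()` mirrors the pre-fix behaviour (zero bytes freed), while passing `&inode->io_tree` mirrors the one-line fix.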
diff --git a/patches.suse/0001-btrfs-relocation-fix-use-after-free-on-dead-relocati.patch b/patches.suse/0001-btrfs-relocation-fix-use-after-free-on-dead-relocati.patch
new file mode 100644
index 0000000000..24474dfcdb
--- /dev/null
+++ b/patches.suse/0001-btrfs-relocation-fix-use-after-free-on-dead-relocati.patch
@@ -0,0 +1,212 @@
+From 1fac4a54374f7ef385938f3c6cf7649c0fe4f6cd Mon Sep 17 00:00:00 2001
+From: Qu Wenruo <wqu@suse.com>
+Date: Mon, 23 Sep 2019 14:56:14 +0800
+Patch-mainline: v5.4-rc1
+Git-commit: 1fac4a54374f7ef385938f3c6cf7649c0fe4f6cd
+References: bsc#1152972
+Subject: [PATCH] btrfs: relocation: fix use-after-free on dead relocation
+ roots
+
+[BUG]
+One user reported a reproducible KASAN report about use-after-free:
+
+ BTRFS info (device sdi1): balance: start -dvrange=1256811659264..1256811659265
+ BTRFS info (device sdi1): relocating block group 1256811659264 flags data|raid0
+ ==================================================================
+ BUG: KASAN: use-after-free in btrfs_init_reloc_root+0x2cd/0x340 [btrfs]
+ Write of size 8 at addr ffff88856f671710 by task kworker/u24:10/261579
+
+ CPU: 2 PID: 261579 Comm: kworker/u24:10 Tainted: P OE 5.2.11-arch1-1-kasan #4
+ Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X99 Extreme4, BIOS P3.80 04/06/2018
+ Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
+ Call Trace:
+ dump_stack+0x7b/0xba
+ print_address_description+0x6c/0x22e
+ ? btrfs_init_reloc_root+0x2cd/0x340 [btrfs]
+ __kasan_report.cold+0x1b/0x3b
+ ? btrfs_init_reloc_root+0x2cd/0x340 [btrfs]
+ kasan_report+0x12/0x17
+ __asan_report_store8_noabort+0x17/0x20
+ btrfs_init_reloc_root+0x2cd/0x340 [btrfs]
+ record_root_in_trans+0x2a0/0x370 [btrfs]
+ btrfs_record_root_in_trans+0xf4/0x140 [btrfs]
+ start_transaction+0x1ab/0xe90 [btrfs]
+ btrfs_join_transaction+0x1d/0x20 [btrfs]
+ btrfs_finish_ordered_io+0x7bf/0x18a0 [btrfs]
+ ? lock_repin_lock+0x400/0x400
+ ? __kmem_cache_shutdown.cold+0x140/0x1ad
+ ? btrfs_unlink_subvol+0x9b0/0x9b0 [btrfs]
+ finish_ordered_fn+0x15/0x20 [btrfs]
+ normal_work_helper+0x1bd/0xca0 [btrfs]
+ ? process_one_work+0x819/0x1720
+ ? kasan_check_read+0x11/0x20
+ btrfs_endio_write_helper+0x12/0x20 [btrfs]
+ process_one_work+0x8c9/0x1720
+ ? pwq_dec_nr_in_flight+0x2f0/0x2f0
+ ? worker_thread+0x1d9/0x1030
+ worker_thread+0x98/0x1030
+ kthread+0x2bb/0x3b0
+ ? process_one_work+0x1720/0x1720
+ ? kthread_park+0x120/0x120
+ ret_from_fork+0x35/0x40
+
+ Allocated by task 369692:
+ __kasan_kmalloc.part.0+0x44/0xc0
+ __kasan_kmalloc.constprop.0+0xba/0xc0
+ kasan_kmalloc+0x9/0x10
+ kmem_cache_alloc_trace+0x138/0x260
+ btrfs_read_tree_root+0x92/0x360 [btrfs]
+ btrfs_read_fs_root+0x10/0xb0 [btrfs]
+ create_reloc_root+0x47d/0xa10 [btrfs]
+ btrfs_init_reloc_root+0x1e2/0x340 [btrfs]
+ record_root_in_trans+0x2a0/0x370 [btrfs]
+ btrfs_record_root_in_trans+0xf4/0x140 [btrfs]
+ start_transaction+0x1ab/0xe90 [btrfs]
+ btrfs_start_transaction+0x1e/0x20 [btrfs]
+ __btrfs_prealloc_file_range+0x1c2/0xa00 [btrfs]
+ btrfs_prealloc_file_range+0x13/0x20 [btrfs]
+ prealloc_file_extent_cluster+0x29f/0x570 [btrfs]
+ relocate_file_extent_cluster+0x193/0xc30 [btrfs]
+ relocate_data_extent+0x1f8/0x490 [btrfs]
+ relocate_block_group+0x600/0x1060 [btrfs]
+ btrfs_relocate_block_group+0x3a0/0xa00 [btrfs]
+ btrfs_relocate_chunk+0x9e/0x180 [btrfs]
+ btrfs_balance+0x14e4/0x2fc0 [btrfs]
+ btrfs_ioctl_balance+0x47f/0x640 [btrfs]
+ btrfs_ioctl+0x119d/0x8380 [btrfs]
+ do_vfs_ioctl+0x9f5/0x1060
+ ksys_ioctl+0x67/0x90
+ __x64_sys_ioctl+0x73/0xb0
+ do_syscall_64+0xa5/0x370
+ entry_SYSCALL_64_after_hwframe+0x44/0xa9
+
+ Freed by task 369692:
+ __kasan_slab_free+0x14f/0x210
+ kasan_slab_free+0xe/0x10
+ kfree+0xd8/0x270
+ btrfs_drop_snapshot+0x154c/0x1eb0 [btrfs]
+ clean_dirty_subvols+0x227/0x340 [btrfs]
+ relocate_block_group+0x972/0x1060 [btrfs]
+ btrfs_relocate_block_group+0x3a0/0xa00 [btrfs]
+ btrfs_relocate_chunk+0x9e/0x180 [btrfs]
+ btrfs_balance+0x14e4/0x2fc0 [btrfs]
+ btrfs_ioctl_balance+0x47f/0x640 [btrfs]
+ btrfs_ioctl+0x119d/0x8380 [btrfs]
+ do_vfs_ioctl+0x9f5/0x1060
+ ksys_ioctl+0x67/0x90
+ __x64_sys_ioctl+0x73/0xb0
+ do_syscall_64+0xa5/0x370
+ entry_SYSCALL_64_after_hwframe+0x44/0xa9
+
+ The buggy address belongs to the object at ffff88856f671100
+ which belongs to the cache kmalloc-4k of size 4096
+ The buggy address is located 1552 bytes inside of
+ 4096-byte region [ffff88856f671100, ffff88856f672100)
+ The buggy address belongs to the page:
+ page:ffffea0015bd9c00 refcount:1 mapcount:0 mapping:ffff88864400e600 index:0x0 compound_mapcount: 0
+ flags: 0x2ffff0000010200(slab|head)
+ raw: 02ffff0000010200 dead000000000100 dead000000000200 ffff88864400e600
+ raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
+ page dumped because: kasan: bad access detected
+
+ Memory state around the buggy address:
+ ffff88856f671600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
+ ffff88856f671680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
+ >ffff88856f671700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
+ ^
+ ffff88856f671780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
+ ffff88856f671800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
+ ==================================================================
+ BTRFS info (device sdi1): 1 enospc errors during balance
+ BTRFS info (device sdi1): balance: ended with status: -28
+
+[CAUSE]
+The problem happens when finish_ordered_io() gets called with balance
+still running, while the reloc root of that subvolume is already dead.
+(The tree swap is already done, but the tree is not yet deleted because
+qgroup may still use it.)
+
+That means root->reloc_root still exists, but that reloc_root may be
+under btrfs_drop_snapshot(), so we must not access it.
+
+The following race could cause the use-after-free problem:
+
+                CPU1              |                CPU2
+--------------------------------------------------------------------------
+                                  | relocate_block_group()
+                                  | |- unset_reloc_control(rc)
+                                  | |- btrfs_commit_transaction()
+btrfs_finish_ordered_io()         | |- clean_dirty_subvols()
+|- btrfs_join_transaction()       |    |
+   |- record_root_in_trans()      |    |
+      |- btrfs_init_reloc_root()  |    |
+         |- if (root->reloc_root) |    |
+         |                        |    |- root->reloc_root = NULL
+         |                        |    |- btrfs_drop_snapshot(reloc_root);
+         |- reloc_root->last_trans|
+                 = trans->transid |
+            ^^^^^^^^^^^^^^^^^^^^^^
+            Use after free
+
+[FIX]
+Fix it by the following modifications:
+
+- Test if the root has a dead reloc tree before accessing root->reloc_root
+  If the root has BTRFS_ROOT_DEAD_RELOC_TREE set, then we don't need to
+  create or update root->reloc_root
+
+- Clear the BTRFS_ROOT_DEAD_RELOC_TREE flag only after we have fully
+  dropped the reloc tree
+  This cooperates with the above modification: as long as
+  BTRFS_ROOT_DEAD_RELOC_TREE is still set, we won't try to re-create
+  the reloc tree at record_root_in_trans().
+
+Reported-by: Cebtenzzre <cebtenzzre@gmail.com>
+Fixes: d2311e698578 ("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots")
+CC: stable@vger.kernel.org # 5.1+
+Reviewed-by: Josef Bacik <josef@toxicpanda.com>
+Reviewed-by: Filipe Manana <fdmanana@suse.com>
+Signed-off-by: Qu Wenruo <wqu@suse.com>
+Reviewed-by: David Sterba <dsterba@suse.com>
+Signed-off-by: David Sterba <dsterba@suse.com>
+---
+ fs/btrfs/relocation.c | 9 ++++++++-
+ 1 file changed, 8 insertions(+), 1 deletion(-)
+
+diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
+index 2f0e25afa486..00504657b602 100644
+--- a/fs/btrfs/relocation.c
++++ b/fs/btrfs/relocation.c
+@@ -1435,6 +1435,13 @@ int btrfs_init_reloc_root(struct btrfs_trans_handle *trans,
+ int clear_rsv = 0;
+ int ret;
+
++ /*
++ * The subvolume has reloc tree but the swap is finished, no need to
++ * create/update the dead reloc tree
++ */
++ if (test_bit(BTRFS_ROOT_DEAD_RELOC_TREE, &root->state))
++ return 0;
++
+ if (root->reloc_root) {
+ reloc_root = root->reloc_root;
+ reloc_root->last_trans = trans->transid;
+@@ -2187,7 +2194,6 @@ static int clean_dirty_subvols(struct reloc_control *rc)
+ /* Merged subvolume, cleanup its reloc root */
+ struct btrfs_root *reloc_root = root->reloc_root;
+
+- clear_bit(BTRFS_ROOT_DEAD_RELOC_TREE, &root->state);
+ list_del_init(&root->reloc_dirty_list);
+ root->reloc_root = NULL;
+ if (reloc_root) {
+@@ -2196,6 +2202,7 @@ static int clean_dirty_subvols(struct reloc_control *rc)
+ if (ret2 < 0 && !ret)
+ ret = ret2;
+ }
++ clear_bit(BTRFS_ROOT_DEAD_RELOC_TREE, &root->state);
+ btrfs_put_fs_root(root);
+ } else {
+ /* Orphan reloc tree, just clean it up */
+--
+2.23.0
+
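The ordering described in the [FIX] section of the patch above can be sketched as a single-threaded user-space model. This is only an illustration under stated assumptions: `toy_root`, `toy_reloc`, `toy_init_reloc_root()` and `toy_clean_dirty_subvol()` are hypothetical names, a boolean replaces the BTRFS_ROOT_DEAD_RELOC_TREE state bit, and a `dropped` marker stands in for btrfs_drop_snapshot() so the model never frees memory.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a reloc root. */
struct toy_reloc {
	uint64_t last_trans;
	bool dropped; /* set where the real code would drop the tree */
};

/* Hypothetical stand-in for a subvolume root. */
struct toy_root {
	bool dead_reloc_tree; /* models BTRFS_ROOT_DEAD_RELOC_TREE */
	struct toy_reloc *reloc_root;
};

/*
 * Models the guard added to btrfs_init_reloc_root(): while the dead flag
 * is set, the reloc root is neither created nor updated.
 * Returns true only if the reloc root was touched.
 */
static bool toy_init_reloc_root(struct toy_root *root, uint64_t transid)
{
	if (root->dead_reloc_tree)
		return false;

	if (root->reloc_root) {
		root->reloc_root->last_trans = transid;
		return true;
	}
	return false;
}

/*
 * Models the reordered cleanup: detach the reloc root, drop it, and only
 * then clear the dead flag, so the guard above holds for the whole drop.
 */
static void toy_clean_dirty_subvol(struct toy_root *root)
{
	struct toy_reloc *reloc = root->reloc_root;

	root->reloc_root = NULL;
	if (reloc)
		reloc->dropped = true;
	root->dead_reloc_tree = false; /* cleared last */
}
```

In this model, a caller that runs while `dead_reloc_tree` is set leaves `last_trans` untouched, which corresponds to the early `return 0` added by the patch.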
diff --git a/patches.suse/0002-btrfs-qgroup-Fix-reserved-data-space-leak-if-we-have.patch b/patches.suse/0002-btrfs-qgroup-Fix-reserved-data-space-leak-if-we-have.patch
new file mode 100644
index 0000000000..cba19a4188
--- /dev/null
+++ b/patches.suse/0002-btrfs-qgroup-Fix-reserved-data-space-leak-if-we-have.patch
@@ -0,0 +1,90 @@
+From d4e204948fe3e0dc8e1fbf3f8f3290c9c2823be3 Mon Sep 17 00:00:00 2001
+From: Qu Wenruo <wqu@suse.com>
+Date: Mon, 16 Sep 2019 20:02:39 +0800
+Patch-mainline: v5.4-rc1
+Git-commit: d4e204948fe3e0dc8e1fbf3f8f3290c9c2823be3
+References: bsc#1152975
+Subject: [PATCH 2/2] btrfs: qgroup: Fix reserved data space leak if we have
+ multiple reserve calls
+
+[BUG]
+The following script can cause btrfs qgroup data space leak:
+
+ mkfs.btrfs -f $dev
+ mount $dev -o nospace_cache $mnt
+
+ btrfs subv create $mnt/subv
+ btrfs quota en $mnt
+ btrfs quota rescan -w $mnt
+ btrfs qgroup limit 128m $mnt/subv
+
+ for (( i = 0; i < 3; i++)); do
+     # Create 3 64M holes for later fallocate to fail
+     truncate -s 192m $mnt/subv/file
+     xfs_io -c "pwrite 64m 4k" $mnt/subv/file > /dev/null
+     xfs_io -c "pwrite 128m 4k" $mnt/subv/file > /dev/null
+     sync
+
+     # it's supposed to fail, and each failure will leak at least 64M
+     # data space
+     xfs_io -f -c "falloc 0 192m" $mnt/subv/file &> /dev/null
+     rm $mnt/subv/file
+     sync
+ done
+
+ # Shouldn't fail after we removed the file
+ xfs_io -f -c "falloc 0 64m" $mnt/subv/file
+
+[CAUSE]
+The btrfs qgroup data reserve code allows multiple reservations to happen
+on a single extent_changeset:
+E.g.:
+ btrfs_qgroup_reserve_data(inode, &data_reserved, 0, SZ_1M);
+ btrfs_qgroup_reserve_data(inode, &data_reserved, SZ_1M, SZ_2M);
+ btrfs_qgroup_reserve_data(inode, &data_reserved, 0, SZ_4M);
+
+The btrfs qgroup code has its own internal tracking to make sure we don't
+double-reserve in the above example.
+
+The only pattern utilizing this feature is in the main while loop of
+btrfs_fallocate() function.
+
+However btrfs_qgroup_reserve_data()'s error handling has a bug in that
+on error it clears all ranges in the io_tree with EXTENT_QGROUP_RESERVED
+flag but doesn't free previously reserved bytes.
+
+This bug has a twofold effect:
+- Clearing EXTENT_QGROUP_RESERVED ranges
+  This is the correct behavior, but it prevents
+  btrfs_qgroup_check_reserved_leak() from catching the leak, as the
+  detector is purely EXTENT_QGROUP_RESERVED flag based.
+
+- Leaking the previously reserved data bytes.
+
+The bug manifests when N calls to btrfs_qgroup_reserve_data are made and
+the last one fails, leaking space reserved in the previous ones.
+
+[FIX]
+Also free previously reserved data bytes when btrfs_qgroup_reserve_data
+fails.
+
+Fixes: 524725537023 ("btrfs: qgroup: Introduce btrfs_qgroup_reserve_data function")
+CC: stable@vger.kernel.org # 4.4+
+Signed-off-by: Qu Wenruo <wqu@suse.com>
+Signed-off-by: David Sterba <dsterba@suse.com>
+---
+ fs/btrfs/qgroup.c | 3 +++
+ 1 file changed, 3 insertions(+)
+
+--- a/fs/btrfs/qgroup.c
++++ b/fs/btrfs/qgroup.c
+@@ -3376,6 +3376,9 @@ cleanup:
+ clear_extent_bit(&BTRFS_I(inode)->io_tree, unode->val,
+ unode->aux, EXTENT_QGROUP_RESERVED, 0, 0, NULL,
+ GFP_NOFS);
++ /* Also free data bytes of already reserved one */
++ btrfs_qgroup_free_refroot(root->fs_info, root->root_key.objectid,
++ orig_reserved, BTRFS_QGROUP_RSV_DATA);
+ extent_changeset_release(reserved);
+ return ret;
+ }
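The [CAUSE] and [FIX] sections of the patch above can be illustrated with a small user-space model of several reserve calls sharing one changeset. Everything here is an assumption made for illustration: `toy_qgroup`, `toy_changeset` and `toy_reserve_data()` are hypothetical names, the changeset and the flagged ranges are merged into one array, the limit check is simplified, and the `free_orig_on_error` switch exists only to compare the behaviour before and after the fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLOCKS 64        /* assumption: tracked range, in blocks */
#define TOY_BLOCK_SIZE 4096u /* assumption: bytes per block */

/* Hypothetical stand-in for the qgroup accounting of one subvolume. */
struct toy_qgroup {
	uint64_t limit;    /* maximum bytes that may be reserved */
	uint64_t reserved; /* bytes currently reserved */
};

/* Hypothetical stand-in for an extent_changeset plus its flagged ranges. */
struct toy_changeset {
	unsigned char marked[TOY_BLOCKS];
	uint64_t bytes_changed;
};

/*
 * Reserve [start, start + nr) blocks on a changeset that may already hold
 * earlier reservations. Returns 0 on success and -1 when the limit would
 * be exceeded. On error all marked ranges are cleared; the bytes reserved
 * by earlier calls are given back only if @free_orig_on_error is set,
 * which models the three lines added by the patch.
 */
static int toy_reserve_data(struct toy_qgroup *qg, struct toy_changeset *cs,
			    size_t start, size_t nr, int free_orig_on_error)
{
	uint64_t orig_reserved = cs->bytes_changed;
	uint64_t to_reserve = 0;

	assert(start + nr <= TOY_BLOCKS);
	for (size_t i = start; i < start + nr; i++) {
		if (!cs->marked[i]) {
			cs->marked[i] = 1;
			to_reserve += TOY_BLOCK_SIZE;
		}
	}
	cs->bytes_changed += to_reserve;

	if (qg->reserved + to_reserve > qg->limit) {
		/* Cleanup: forget every range recorded so far. */
		memset(cs->marked, 0, sizeof(cs->marked));
		if (free_orig_on_error)
			qg->reserved -= orig_reserved;
		cs->bytes_changed = 0;
		return -1;
	}

	qg->reserved += to_reserve;
	return 0;
}
```

With `free_orig_on_error` set to 0, a failing call leaves the bytes from earlier successful calls in `qg->reserved` even though no range is marked any more, which is the leak the leak detector cannot see.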
diff --git a/series.conf b/series.conf
index 3823f0cf43..c6ae6f12d0 100644
--- a/series.conf
+++ b/series.conf
@@ -49856,6 +49856,9 @@
patches.suse/scsi-lpfc-Fix-reset-recovery-paths-that-are-not-reco.patch
patches.suse/scsi-scsi_dh_rdac-zero-cdb-in-send_mode_select.patch
patches.suse/libnvdimm-altmap-track-namespace-boundaries-in-altmap.patch
+ patches.suse/0001-btrfs-relocation-fix-use-after-free-on-dead-relocati.patch
+ patches.suse/0001-btrfs-qgroup-Fix-the-wrong-target-io_tree-when-freei.patch
+ patches.suse/0002-btrfs-qgroup-Fix-reserved-data-space-leak-if-we-have.patch
# dhowells/linux-fs keys-uefi
patches.suse/0001-KEYS-Allow-unrestricted-boot-time-addition-of-keys-t.patch