Home Home > GIT Browse > openSUSE-15.1
summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorNikolay Borisov <nborisov@suse.com>2019-08-15 09:57:52 +0300
committerNikolay Borisov <nborisov@suse.com>2019-08-15 09:58:02 +0300
commit2e71e21ad9950ece89b926bc8da57d41d4286416 (patch)
treefe2d2554a06f1630f02d55b7010107c19968266f
parentdf62a3195e81d1d15223c2c1058c440f7db4cebc (diff)
xfs: include an allocfree res for inobt modifications
(bsc#1145235).
-rw-r--r--patches.suse/xfs-include-an-allocfree-res-for-inobt-modifications.patch214
-rw-r--r--series.conf1
2 files changed, 215 insertions, 0 deletions
diff --git a/patches.suse/xfs-include-an-allocfree-res-for-inobt-modifications.patch b/patches.suse/xfs-include-an-allocfree-res-for-inobt-modifications.patch
new file mode 100644
index 0000000000..9475c7b1b6
--- /dev/null
+++ b/patches.suse/xfs-include-an-allocfree-res-for-inobt-modifications.patch
@@ -0,0 +1,214 @@
+From: Brian Foster <bfoster@redhat.com>
+Date: Mon, 8 Jan 2018 10:41:37 -0800
+Subject: xfs: include an allocfree res for inobt modifications
+Git-commit: f03c78f39710995d2766236f229295d91b8de9dd
+Patch-mainline: v4.16-rc1
+References: bsc#1145235
+
+Analysis of recent reports of log reservation overruns and code
+inspection has uncovered that the reservations associated with inode
+operations may not cover the worst case scenarios. In particular,
+many cases only include one allocfree res. for a particular
+operation even though said operations may also entail AGFL fixups
+and inode btree block allocations in addition to the actual inode
+chunk allocation. This can easily turn into two or three block
+allocations (or frees) per operation.
+
+In theory, the only way to define the worst case reservation is to
+include an allocfree res for each individual allocation in a
+transaction. Since that is impractical (we can perform multiple agfl
+fixups per tx and not every allocation results in a full tree
+operation), we need to find a reasonable compromise that addresses
+the deficiency in practice without blowing out the size of the
+transactions.
+
+Since the inode btrees are not filled by the AGFL, record insertion
+and removal can directly result in block allocations and frees
+depending on the shape of the tree. These allocations and frees
+occur in the same transaction context as the inobt update itself,
+but are separate from the allocation/free that might be required for
+an inode chunk. Therefore, it makes sense to assume that an [f]inobt
+insert/remove can directly result in one or more block allocations
+on behalf of the tree.
+
+Refactor the inode transaction reservations to include one allocfree
+res. per inode btree modification to cover allocations required by
+the tree itself. This separates the reservation required to allocate
+the inode chunk from the reservation required for inobt record
+insertion/removal. Apply the same logic to the finobt. This results
+in killing off the finobt modify condition because we no longer
+assume that the broader transaction reservation will cover finobt
+block allocations and finobt shape changes can occur in either of
+the inobt allocation or modify situations.
+
+Suggested-by: Dave Chinner <david@fromorbit.com>
+Signed-off-by: Brian Foster <bfoster@redhat.com>
+Reviewed-by: Dave Chinner <dchinner@redhat.com>
+Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
+Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
+Acked-by: Nikolay Borisov <nborisov@suse.com>
+---
+ fs/xfs/libxfs/xfs_trans_resv.c | 84 +++++++++++++++++++++---------------------
+ 1 file changed, 43 insertions(+), 41 deletions(-)
+
+diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c
+index 037a1295d289..19f3a226a357 100644
+--- a/fs/xfs/libxfs/xfs_trans_resv.c
++++ b/fs/xfs/libxfs/xfs_trans_resv.c
+@@ -132,44 +132,43 @@ xfs_calc_inode_res(
+ }
+
+ /*
+- * The free inode btree is a conditional feature and the log reservation
+- * requirements differ slightly from that of the traditional inode allocation
+- * btree. The finobt tracks records for inode chunks with at least one free
+- * inode. A record can be removed from the tree for an inode allocation
+- * or free and thus the finobt reservation is unconditional across:
++ * Inode btree record insertion/removal modifies the inode btree and free space
++ * btrees (since the inobt does not use the agfl). This requires the following
++ * reservation:
+ *
+- * - inode allocation
+- * - inode free
+- * - inode chunk allocation
++ * the inode btree: max depth * blocksize
++ * the allocation btrees: 2 trees * (max depth - 1) * block size
+ *
+- * The 'modify' param indicates to include the record modification scenario. The
+- * 'alloc' param indicates to include the reservation for free space btree
+- * modifications on behalf of finobt modifications. This is required only for
+- * transactions that do not already account for free space btree modifications.
++ * The caller must account for SB and AG header modifications, etc.
++ */
++STATIC uint
++xfs_calc_inobt_res(
++ struct xfs_mount *mp)
++{
++ return xfs_calc_buf_res(mp->m_in_maxlevels, XFS_FSB_TO_B(mp, 1)) +
++ xfs_calc_buf_res(xfs_allocfree_log_count(mp, 1),
++ XFS_FSB_TO_B(mp, 1));
++}
++
++/*
++ * The free inode btree is a conditional feature. The behavior differs slightly
++ * from that of the traditional inode btree in that the finobt tracks records
++ * for inode chunks with at least one free inode. A record can be removed from
++ * the tree during individual inode allocation. Therefore the finobt
++ * reservation is unconditional for both the inode chunk allocation and
++ * individual inode allocation (modify) cases.
+ *
+- * the free inode btree: max depth * block size
+- * the allocation btrees: 2 trees * (max depth - 1) * block size
+- * the free inode btree entry: block size
++ * Behavior aside, the reservation for finobt modification is equivalent to the
++ * traditional inobt: cover a full finobt shape change plus block allocation.
+ */
+ STATIC uint
+ xfs_calc_finobt_res(
+- struct xfs_mount *mp,
+- int alloc,
+- int modify)
++ struct xfs_mount *mp)
+ {
+- uint res;
+-
+ if (!xfs_sb_version_hasfinobt(&mp->m_sb))
+ return 0;
+
+- res = xfs_calc_buf_res(mp->m_in_maxlevels, XFS_FSB_TO_B(mp, 1));
+- if (alloc)
+- res += xfs_calc_buf_res(xfs_allocfree_log_count(mp, 1),
+- XFS_FSB_TO_B(mp, 1));
+- if (modify)
+- res += (uint)XFS_FSB_TO_B(mp, 1);
+-
+- return res;
++ return xfs_calc_inobt_res(mp);
+ }
+
+ /*
+@@ -373,7 +372,7 @@ xfs_calc_create_resv_modify(
+ xfs_calc_buf_res(1, mp->m_sb.sb_sectsize) +
+ (uint)XFS_FSB_TO_B(mp, 1) +
+ xfs_calc_buf_res(XFS_DIROP_LOG_COUNT(mp), XFS_FSB_TO_B(mp, 1)) +
+- xfs_calc_finobt_res(mp, 1, 1);
++ xfs_calc_finobt_res(mp);
+ }
+
+ /*
+@@ -381,8 +380,8 @@ xfs_calc_create_resv_modify(
+ * the agi and agf of the ag getting the new inodes: 2 * sectorsize
+ * the superblock for the nlink flag: sector size
+ * the inode blocks allocated: mp->m_ialloc_blks * blocksize
+- * the inode btree: max depth * blocksize
+ * the allocation btrees: 2 trees * (max depth - 1) * block size
++ * the inode btree (record insertion)
+ */
+ STATIC uint
+ xfs_calc_create_resv_alloc(
+@@ -391,9 +390,9 @@ xfs_calc_create_resv_alloc(
+ return xfs_calc_buf_res(2, mp->m_sb.sb_sectsize) +
+ mp->m_sb.sb_sectsize +
+ xfs_calc_buf_res(mp->m_ialloc_blks, XFS_FSB_TO_B(mp, 1)) +
+- xfs_calc_buf_res(mp->m_in_maxlevels, XFS_FSB_TO_B(mp, 1)) +
+ xfs_calc_buf_res(xfs_allocfree_log_count(mp, 1),
+- XFS_FSB_TO_B(mp, 1));
++ XFS_FSB_TO_B(mp, 1)) +
++ xfs_calc_inobt_res(mp);
+ }
+
+ STATIC uint
+@@ -409,8 +408,8 @@ __xfs_calc_create_reservation(
+ * For icreate we can allocate some inodes giving:
+ * the agi and agf of the ag getting the new inodes: 2 * sectorsize
+ * the superblock for the nlink flag: sector size
+- * the inode btree: max depth * blocksize
+ * the allocation btrees: 2 trees * (max depth - 1) * block size
++ * the inobt (record insertion)
+ * the finobt (record insertion)
+ */
+ STATIC uint
+@@ -419,10 +418,10 @@ xfs_calc_icreate_resv_alloc(
+ {
+ return xfs_calc_buf_res(2, mp->m_sb.sb_sectsize) +
+ mp->m_sb.sb_sectsize +
+- xfs_calc_buf_res(mp->m_in_maxlevels, XFS_FSB_TO_B(mp, 1)) +
+ xfs_calc_buf_res(xfs_allocfree_log_count(mp, 1),
+ XFS_FSB_TO_B(mp, 1)) +
+- xfs_calc_finobt_res(mp, 0, 0);
++ xfs_calc_inobt_res(mp) +
++ xfs_calc_finobt_res(mp);
+ }
+
+ STATIC uint
+@@ -487,9 +486,14 @@ xfs_calc_symlink_reservation(
+ * the super block free inode counter, AGF and AGFL: sector size
+ * the on disk inode (agi unlinked list removal)
+ * the inode chunk is marked stale (headers only)
+- * the inode btree: max depth * blocksize
+- * the allocation btrees: 2 trees * (max depth - 1) * block size
++ * the inode btree
+ * the finobt (record insertion, removal or modification)
++ *
++ * Note that the allocfree res. for the inode chunk itself is not included
++ * because the extent free occurs after a transaction roll. We could take the
++ * maximum of the pre/post roll operations, but the pre-roll reservation already
++ * includes at least one allocfree res. for the inobt and is thus guaranteed to
++ * be larger.
+ */
+ STATIC uint
+ xfs_calc_ifree_reservation(
+@@ -500,10 +504,8 @@ xfs_calc_ifree_reservation(
+ xfs_calc_buf_res(3, mp->m_sb.sb_sectsize) +
+ xfs_calc_iunlink_remove_reservation(mp) +
+ xfs_calc_buf_res(mp->m_ialloc_blks, 0) +
+- xfs_calc_buf_res(mp->m_in_maxlevels, XFS_FSB_TO_B(mp, 1)) +
+- xfs_calc_buf_res(xfs_allocfree_log_count(mp, 1),
+- XFS_FSB_TO_B(mp, 1)) +
+- xfs_calc_finobt_res(mp, 0, 1);
++ xfs_calc_inobt_res(mp) +
++ xfs_calc_finobt_res(mp);
+ }
+
+ /*
+
diff --git a/series.conf b/series.conf
index 521aa67aca..7b80adb3d6 100644
--- a/series.conf
+++ b/series.conf
@@ -11983,6 +11983,7 @@
patches.suse/xfs-include-inobt-buffers-in-ifree-tx-log-reservation.patch
patches.suse/xfs-fix-up-agi-unlinked-list-reservations.patch
patches.suse/xfs-truncate-transaction-does-not-modify-the-inobt.patch
+ patches.suse/xfs-include-an-allocfree-res-for-inobt-modifications.patch
patches.fixes/0004-iomap-report-collisions-between-directio-and-buffere.patch
patches.fixes/xfs-call-xfs_qm_dqattach-before-performing-reflink-o.patch
patches.fixes/xfs-preserve-i_rdev-when-recycling-a-reclaimable-inode.patch