author: Takashi Iwai <tiwai@suse.de> 2018-01-12 15:54:05 +0100
committer: Takashi Iwai <tiwai@suse.de> 2018-01-12 15:54:05 +0100
commit: b13f43af2631272d3d449a83d5ffd6a676ad819d
tree: 7eb5195a4ecba9086d2e5230a5e104733a1ef24e
parent: c092517e006fe398a587fda134dd28aaa9024ebd
parent: 529f10d2911ae73a249a107e6c4d407d0ba9c75b
Merge branch 'users/hare/SLE15/for-next' into SLE15
-rw-r--r-- patches.suse/nvme_fc-correct-hang-in-nvme_ns_remove.patch 43
-rw-r--r-- patches.suse/nvme_fc-fix-rogue-admin-cmds-stalling-teardown.patch 50
-rw-r--r-- series.conf 2
3 files changed, 95 insertions, 0 deletions
diff --git a/patches.suse/nvme_fc-correct-hang-in-nvme_ns_remove.patch b/patches.suse/nvme_fc-correct-hang-in-nvme_ns_remove.patch
new file mode 100644
index 0000000000..8640b7b771
--- /dev/null
+++ b/patches.suse/nvme_fc-correct-hang-in-nvme_ns_remove.patch
@@ -0,0 +1,43 @@
+From: James Smart <jsmart2021@gmail.com>
+Date: Thu, 11 Jan 2018 15:21:38 -0800
+Subject: [PATCH] nvme_fc: correct hang in nvme_ns_remove()
+Patch-Mainline: submitted linux-nvme 2018/01/11
+References: bsc#1075811
+Reviewed-by: Hannes Reinecke <hare@suse.com>
+
+When connectivity is lost to a device, the association is terminated
+and the blk-mq queues are quiesced/stopped. When connectivity is
+re-established, they are resumed.
+
+If connectivity is lost for a sufficient amount of time that the
+controller is then deleted, the delete path starts tearing down queues,
+and eventually calls nvme_ns_remove(). It appears that pending
+commands may cause blk_cleanup_queue() to never complete and the
+teardown stalls.
+
+Correct by starting the ns queues after transitioning to a DELETING
+state, allowing pending commands to be flushed with io failures. Thus
+the delete path is clear when reached.
+
+Signed-off-by: James Smart <james.smart@broadcom.com>
+---
+ drivers/nvme/host/fc.c | 3 +++
+ 1 file changed, 3 insertions(+)
+
+diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
+index 8f9ddd0..aa916a4 100644
+--- a/drivers/nvme/host/fc.c
++++ b/drivers/nvme/host/fc.c
+@@ -2938,6 +2938,9 @@ static inline blk_status_t nvme_fc_is_ready(struct nvme_fc_queue *queue,
+ * waiting for io to terminate
+ */
+ nvme_fc_delete_association(ctrl);
++
++ /* resume the io queues so that things will fast fail */
++ nvme_start_queues(nctrl);
+ }
+
+ static void
+--
+1.8.5.6
+
diff --git a/patches.suse/nvme_fc-fix-rogue-admin-cmds-stalling-teardown.patch b/patches.suse/nvme_fc-fix-rogue-admin-cmds-stalling-teardown.patch
new file mode 100644
index 0000000000..837def59a0
--- /dev/null
+++ b/patches.suse/nvme_fc-fix-rogue-admin-cmds-stalling-teardown.patch
@@ -0,0 +1,50 @@
+From: James Smart <jsmart2021@gmail.com>
+Date: Thu, 11 Jan 2018 14:29:22 -0800
+Subject: [PATCH] nvme_fc: fix rogue admin cmds stalling teardown
+Patch-Mainline: submitted linux-nvme 2018/01/11
+References: bsc#1075811
+
+When connectivity is lost to a device, the association is terminated
+and the blk-mq queues are quiesced/stopped. When connectivity is
+re-established, they are resumed.
+
+If an admin command is received while connectivity is lost, the ioctl
+queues the command on the admin_q and the command stalls (the thread
+issuing the ioctl hangs/waits). If the connectivity is lost long
+enough such that the controller is then deleted, the delete code
+makes its calls to initiate the delete, which then expects the core
+layer to call the transport when all references are removed and the
+controller can be freed. Unfortunately, nothing in this path dequeued
+the admin command, so a reference sits outstanding and things stop,
+hanging the delete indefinitely.
+
+Correct by unquiescing the admin queue in the delete association. This
+means any admin command (which should only be from an ioctl) issued
+after connectivity is lost will detect the controller is in a
+reconnecting state and will (fast) fail the command. Thus, a pending
+reference can no longer be created. Once connectivity is re-established,
+a new ioctl/admin command would see proper device state and function again.
+
+Signed-off-by: James Smart <james.smart@broadcom.com>
+Reviewed-by: Hannes Reinecke <hare@suse.com>
+---
+ drivers/nvme/host/fc.c | 3 +++
+ 1 file changed, 3 insertions(+)
+
+diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
+index 794e66e..8f9ddd0 100644
+--- a/drivers/nvme/host/fc.c
++++ b/drivers/nvme/host/fc.c
+@@ -2921,6 +2921,9 @@ static inline blk_status_t nvme_fc_is_ready(struct nvme_fc_queue *queue,
+ __nvme_fc_delete_hw_queue(ctrl, &ctrl->queues[0], 0);
+ nvme_fc_free_queue(&ctrl->queues[0]);
+
++ /* re-enable the admin_q so anything new can fast fail */
++ blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
++
+ nvme_fc_ctlr_inactive_on_rport(ctrl);
+ }
+
+--
+1.8.5.6
+
diff --git a/series.conf b/series.conf
index ac8b2738c8..6a22c08e66 100644
--- a/series.conf
+++ b/series.conf
@@ -6514,6 +6514,8 @@
patches.fixes/dax-Pass-detailed-error-code-from-dax_iomap_fault.patch
patches.fixes/ext4-Fix-ENOSPC-handling-in-DAX-page-fault-handler.patch
patches.drivers/ibmvnic-Fix-pending-MAC-address-changes.patch
+ patches.suse/nvme_fc-fix-rogue-admin-cmds-stalling-teardown.patch
+ patches.suse/nvme_fc-correct-hang-in-nvme_ns_remove.patch
########################################################
# end of sorted patches
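
Both patches rely on the same idea: a command parked on a quiesced queue holds a reference that the delete path waits on, so the queue has to be restarted for the command to fail fast and release it. The following is a minimal userspace sketch of that reasoning, not kernel code. Every name in it (toy_ctrl, toy_queue, toy_submit, toy_unquiesce, toy_teardown) is hypothetical and only models the behaviour described in the commit messages; the real fixes are the blk_mq_unquiesce_queue() and nvme_start_queues() calls shown in the diffs above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: all names here are hypothetical, not kernel APIs. */
enum toy_state { TOY_LIVE, TOY_RECONNECTING, TOY_DELETING };

struct toy_queue {
	bool quiesced;	/* true: submitted commands are parked, not dispatched */
	int pending;	/* commands parked while the queue was quiesced */
};

struct toy_ctrl {
	enum toy_state state;
	int refs;	/* outstanding command references; delete needs 0 */
	int completed;	/* commands that completed successfully */
	int failed;	/* commands that were failed fast */
	struct toy_queue admin_q;
};

/* Dispatch one command: succeed if live, otherwise fail fast.
 * Either way the command finishes and drops its reference. */
static void toy_dispatch(struct toy_ctrl *c)
{
	assert(c->refs > 0);
	if (c->state == TOY_LIVE)
		c->completed++;
	else
		c->failed++;
	c->refs--;
}

/* Submit one command: it takes a reference, then is either parked
 * (queue quiesced) or dispatched immediately. */
static void toy_submit(struct toy_ctrl *c)
{
	c->refs++;
	if (c->admin_q.quiesced) {
		c->admin_q.pending++;
		return;
	}
	toy_dispatch(c);
}

/* Restart the queue and flush everything that was parked on it. */
static void toy_unquiesce(struct toy_ctrl *c)
{
	c->admin_q.quiesced = false;
	while (c->admin_q.pending > 0) {
		c->admin_q.pending--;
		toy_dispatch(c);
	}
}

/* Replay the scenario from the commit message. Returns true when the
 * delete path could finish, i.e. no reference is left outstanding. */
static bool toy_teardown(bool apply_fix)
{
	struct toy_ctrl c = { .state = TOY_LIVE };

	c.state = TOY_RECONNECTING;	/* connectivity lost */
	c.admin_q.quiesced = true;
	toy_submit(&c);			/* ioctl arrives while the link is down */

	c.state = TOY_DELETING;		/* link stayed down too long */
	if (apply_fix)
		toy_unquiesce(&c);	/* the step the patches add */

	return c.refs == 0;
}
```

In this model, toy_teardown(false) leaves one reference outstanding, which corresponds to the stalled delete, while toy_teardown(true) flushes the parked command with a failure and lets the delete finish.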