From nobody Thu Dec 30 22:41:25 2021
X-Original-To: scsi@mlmmj.nyi.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
	by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id B42B8192B449
	for <scsi@mlmmj.nyi.freebsd.org>; Thu, 30 Dec 2021 22:41:26 +0000 (UTC)
	(envelope-from jhb@FreeBSD.org)
Received: from smtp.freebsd.org (smtp.freebsd.org [96.47.72.83])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256
	 client-signature RSA-PSS (4096 bits) client-digest SHA256)
	(Client CN "smtp.freebsd.org", Issuer "R3" (verified OK))
	by mx1.freebsd.org (Postfix) with ESMTPS id 4JQ3Cp47Ffz4dFD;
	Thu, 30 Dec 2021 22:41:26 +0000 (UTC)
	(envelope-from jhb@FreeBSD.org)
Received: from [10.0.1.4] (ralph.baldwin.cx [66.234.199.215])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(Client did not present a certificate)
	(Authenticated sender: jhb)
	by smtp.freebsd.org (Postfix) with ESMTPSA id DD24033D0;
	Thu, 30 Dec 2021 22:41:25 +0000 (UTC)
	(envelope-from jhb@FreeBSD.org)
Message-ID: <68ad0ffd-79fb-973d-e5ba-92deee352d44@FreeBSD.org>
Date: Thu, 30 Dec 2021 14:41:25 -0800
List-Id: SCSI subsystem <freebsd-scsi.freebsd.org>
List-Archive: https://lists.freebsd.org/archives/freebsd-scsi
List-Help: <mailto:scsi+help@freebsd.org>
List-Post: <mailto:scsi@freebsd.org>
List-Subscribe: <mailto:scsi+subscribe@freebsd.org>
List-Unsubscribe: <mailto:scsi+unsubscribe@freebsd.org>
Sender: owner-freebsd-scsi@freebsd.org
X-BeenThere: freebsd-scsi@freebsd.org
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:91.0)
 Gecko/20100101 Thunderbird/91.4.1
Content-Language: en-US
To: Alexander Motin <mav@FreeBSD.org>
Cc: =?UTF-8?Q?Edward_Tomasz_Napiera=c5=82a?= <trasz@freebsd.org>,
 scsi@FreeBSD.org, Ken Merry <ken@freebsd.org>
References: <fd383f6f-5a19-e2bb-5383-e559271eb3cd@FreeBSD.org>
 <b6c090ac-6cb0-6173-422d-9aef0b37b8ee@FreeBSD.org>
 <42e175d9-1693-29e2-0b5b-3fa513aa2a2d@FreeBSD.org>
 <7f1a1a65-199d-858a-792c-42871d1df13e@FreeBSD.org>
From: John Baldwin <jhb@FreeBSD.org>
Subject: Re: iSCSI target: Handling in-flight requests during ctld shutdown
In-Reply-To: <7f1a1a65-199d-858a-792c-42871d1df13e@FreeBSD.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org;
	s=dkim; t=1640904086;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
	 content-transfer-encoding:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references;
	bh=IfCU3ZbZ01/s9vaHnf9FFF1cF6H4SgoPMSckUhwPO8w=;
	b=ntDPjnGx57ysgkLQgY38NQW/s3YF4ZKsFGhxYpP6AV8RatkTCQ7lGAxtaj6CQ8PZIkGtjn
	PV826W6KKqJ5t5a1bvQDgFMRde6sTYelDXbUDcxHy3tYF232fel0rXFwZ4U1KYhRffzJsq
	RV1BnnmR80uMkwx1qENGzISXwAVKQbeBlNqQrLbkcPziK65CmK2+8DSGLl5Nu5EInMwTFX
	td99LLU/2jJ+g0CIhqY9kcTt1svQnOTy404T8aaEUk5HAnjeRBq5Gp8OTc+z9jT3b6CYYx
	5HnBFB9hZnJ/O5Odv0KL4GJDbB6cH64WtxLYq3pzO/f9lgIUDOVKTKmNNj1yYg==
ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1640904086; a=rsa-sha256; cv=none;
	b=EHi8oXnw13ACFW4igSuitetteEGeUlbTmGtHW3sEdIsNpAIn2PrRMAt2zcOlIfahubFUxw
	XI1bIvL80GxkempW3PN/Sw/TmDa+KzV04viK0Zd2V8xQO9R1TnFChDJAGIPGBgQYZDBf+Y
	ODJIBm7XNSUEOgKRGz4iLAewSmFIeKehzhHWwGgHVFxuUiKf929eOYdS1zt6nkNAFPS+1/
	/ngAgnKf27n6ca//0Tt1bowJkWW9K6qKaogjIU05tjwPTUp3789WUYzZG45tFME9NtDU0z
	nXIPKm47vrfvm8P/ROIKNFVD6WJnrNQG3ddhzWf84DnhfL8JxdIkcTRK1d2x7w==
ARC-Authentication-Results: i=1;
	mx1.freebsd.org;
	none
X-ThisMailContainsUnwantedMimeParts: N

On 12/30/21 12:10 PM, Alexander Motin wrote:
> On 30.12.2021 14:29, John Baldwin wrote:
>> On 12/29/21 1:57 PM, Alexander Motin wrote:
>>> The HARDWARE ERROR is obviously not expected by the initiator.  It
>>> should better not be leaked after we decided to kill the connection.
>>> Initiator may retry it and still work happily after reconnect, but
>>> cleaner would be to not rely on that.  cfiscsi_session_terminate_tasks()
>>> aborts all running commands by CTL_TASK_I_T_NEXUS_RESET, that make them
>>> not return statuses to initiator, but I suppose this is the other side
>>> of the race now.
>>
>> Hmm, I wonder if we should be setting CTL_FLAG_ABORT instead of setting the
>> port_status when aborting an I/O?  The comment in ctl_frontend_iscsi.c
>> claims
>> the backends check the port_status, but I don't see any checks for
>> port_status
>> at all in backends.
> 
> The checks were in all backends, until in 2c7dc6bae9fd I've moved them
> to ctl_datamove_done_process().  Backends don't have any special
> knowledge now to process specific data move error codes.  It was
> complete code duplication between backends and I've tried to organize
> that a bit.  Unfortunately we do not have formalized data move error
> statuses in CTL to do more.  It would be much more interesting (and
> complicated ;)) if iSCSI (for example) could report data transfer error
> without killing connection, but that area in general is not very
> specified in SCSI, since usually each transport has own means of recovery.

Hmmm, but this commit seems to have actually broken this case.  That is,
now the possibly uninitialized data is written to the backing store:

         /*
-        * We set status at this point for read commands, and write
-        * commands with errors.
+        * We set status at this point for read and compare commands.
          */
-       if (io->io_hdr.flags & CTL_FLAG_ABORT) {
-               ;
-       } else if ((io->io_hdr.port_status != 0) &&
-           ((io->io_hdr.status & CTL_STATUS_MASK) == CTL_STATUS_NONE ||
-            (io->io_hdr.status & CTL_STATUS_MASK) == CTL_SUCCESS)) {
-               ctl_set_internal_failure(&io->scsiio, /*sks_valid*/ 1,
-                   /*retry_count*/ io->io_hdr.port_status);
-       } else if (io->scsiio.kern_data_resid != 0 &&
-           (io->io_hdr.flags & CTL_FLAG_DATA_MASK) == CTL_FLAG_DATA_OUT &&
-           ((io->io_hdr.status & CTL_STATUS_MASK) == CTL_STATUS_NONE ||
-            (io->io_hdr.status & CTL_STATUS_MASK) == CTL_SUCCESS)) {
-               ctl_set_invalid_field_ciu(&io->scsiio);
-       } else if ((io->io_hdr.port_status == 0) &&
-           ((io->io_hdr.status & CTL_STATUS_MASK) == CTL_STATUS_NONE)) {
+       if ((io->io_hdr.flags & CTL_FLAG_ABORT) == 0 &&
+           (io->io_hdr.status & CTL_STATUS_MASK) == CTL_STATUS_NONE) {
                 lbalen = ARGS(beio->io);
                 if (lbalen->flags & CTL_LLF_READ) {
                         ctl_set_success(&io->scsiio);

As ctl_datamove_done_process() will call ctl_set_internal_failure(), but
ctl_datamove_done() will still call be_move_done which runs the code above.
The code above doesn't check port_status anywhere, so it will still happily
write the stale data to the backend unless CTL_FLAG_ABORT is set.  Do the
backends need to be checking port_status == 0 again in addition to
CTL_FLAG_ABORT then?

>>    I do see checks for CTL_FLAG_ABORT, and the handler
>> for
>> the CTL_TASK_I_T_NEXUS_RESET does set CTL_FLAG_ABORT on pending requests.
>>
>> For the tasks in sciscsi_session_terminate_tasks(), those should already
>> have
>> CTL_FLAG_ABORT set anyway, but it wouldn't hurt if it were set again by
>> cfiscsi_data_wait_abort().  For the the cfiscsi_task_management_done
>> case I'm
>> less certain, but I suspect there too that returning an internal error
>> status
>> back to the initiator is not expected and that it would be better to
>> just set
>> CTL_FLAG_ABORT and drop any response?
> 
> cfiscsi_data_wait_abort() does only one specific thing -- it aborts
> waiting for data and completes the data move operation, allowing backend
> to continue (or at least free resources, etc).  It is only
> cfiscsi_session_terminate_tasks() knows that it should abort data moves
> while it aborts tasks, since it knows that the connection is dead.
> cfiscsi_task_management_done() is another special case, since iSCSI
> explicitly mentions that there will be no further exchanges related to
> that task, so the data move should explicitly return.  But in general
> case I think it would be nice to not combine those two facts of aborting
> task and aborting data transfer, since the last may not necessary imply
> the first, as the first does not imply the last.
> 
> It is tempting to just set the flag, but I suspect it may cause more
> problems (I am worrying about HA peer update).  I'd prefer the race
> inside the transport to be handled inside the transport.  May be we
> could just add another flag to be set under the session lock inside
> cfiscsi_session_terminate_tasks() after it aborted all the tasks, but
> before it start to abort transfers itself and use it inside
> cfiscsi_datamove_out() instead of cs_terminating.

I think though that even that flag wouldn't quite help as you would still
have the problem that the internal_error response would still be sent on
the wire if we don't abort the entire task vs aborting just the move.

It may be that all I really need to do is fix the one case in
cfiscsi_datamove_out() to set CTL_FLAG_ABORT rather than changing
cfiscsi_data_move_abort()?

-- 
John Baldwin