From nobody Thu Dec 30 20:10:16 2021
X-Original-To: scsi@mlmmj.nyi.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
	by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 0FCDA192EBB7
	for <scsi@mlmmj.nyi.freebsd.org>; Thu, 30 Dec 2021 20:10:19 +0000 (UTC)
	(envelope-from mavbsd@gmail.com)
Received: from mail-qv1-xf2b.google.com (mail-qv1-xf2b.google.com [IPv6:2607:f8b0:4864:20::f2b])
	(using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits)
	 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256
	 client-signature RSA-PSS (2048 bits) client-digest SHA256)
	(Client CN "smtp.gmail.com", Issuer "GTS CA 1D4" (verified OK))
	by mx1.freebsd.org (Postfix) with ESMTPS id 4JPzsQ4xnvz3m2k;
	Thu, 30 Dec 2021 20:10:18 +0000 (UTC)
	(envelope-from mavbsd@gmail.com)
Received: by mail-qv1-xf2b.google.com with SMTP id g15so23028833qvi.6;
        Thu, 30 Dec 2021 12:10:18 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20210112;
        h=sender:message-id:date:mime-version:user-agent:subject
         :content-language:to:cc:references:from:in-reply-to
         :content-transfer-encoding;
        bh=8SYTJlKsG0ignnYT7itBfdZU+xoCRKW/dZ9Am06jg58=;
        b=dnzm2KxTxqQvHq8R1sLHUMN78D0bldhBwY6h8J1w773p08wR2aIihWw8NO0Z08caAQ
         5afJMrml+QKUyj5wjh3oBAb16fvg8QLIW63FGvsFYqBYNBJiiYNafpR3U33VvXWWqwF8
         UeQZf+u1kkPPzFsJ6soL9MQtoYTL6l2Wdo3i77hhR9JOxeXVy7pbeGqckML+tfEnxRVl
         K+DpHi96NijUvj//3S1knms2kk6BuEvKzCEEZeqo0F9WObgEFi5mR3NEYHDMNsuIAIDY
         V1RVKOPmyPJF/Nu8lMVzSCb8AbC2yXxjo2s8sSiFYaT7R3O9whzcy7HfrxKLV5mfi9zA
         hyeg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20210112;
        h=x-gm-message-state:sender:message-id:date:mime-version:user-agent
         :subject:content-language:to:cc:references:from:in-reply-to
         :content-transfer-encoding;
        bh=8SYTJlKsG0ignnYT7itBfdZU+xoCRKW/dZ9Am06jg58=;
        b=Sr50CpvA/ZeCVA8LzgUdcbkd7JrMaBmaxTiqvlfVGVnsufKT2uZA+V9lrRlnLs0mMf
         KAbhFvZHOhnhrCn7SlJeUCF9SGSvqjHmvnuJIsuomHxwe7vW0rFkUJGI+l11T4UqsW5M
         Obf261KBHYQcjnXZOcIPGDf9ZTKhQnKHopheFWGwDigbj0Q/vXDIcmI2wPaWbHGxivSC
         Pklsj1YWdqY/ySM1glI0SucTklnauuy/wsWPxhi2wLRn5k5/yqpYXRogyPwVtpAAQi6J
         SjiSI1ktl+iPZGEWtbg/BGqMJtcWXknPGs5r4GhC/OHj+J+/yP5ICx/kQlYVLKt3ghvO
         Qiwg==
X-Gm-Message-State: AOAM531tmew+eWptrhtoXnXiofUql0s9HEGwGeyMG1EdKatVPHFhog+t
	W5bbseYQJYhbjj6oXx8V9EIOTovxNBM=
X-Google-Smtp-Source: ABdhPJwdcuiP7ZtHIHnEBRWVQkC8Z+lHLB6QRaSwao47Ltm9/ndmigVg2G6SaXg2oSPIEtwezPw08g==
X-Received: by 2002:a05:6214:c6d:: with SMTP id t13mr13845155qvj.76.1640895017915;
        Thu, 30 Dec 2021 12:10:17 -0800 (PST)
Received: from [10.231.1.66] ([38.32.73.2])
        by smtp.gmail.com with ESMTPSA id v9sm19196003qkf.90.2021.12.30.12.10.17
        (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
        Thu, 30 Dec 2021 12:10:17 -0800 (PST)
Message-ID: <7f1a1a65-199d-858a-792c-42871d1df13e@FreeBSD.org>
Date: Thu, 30 Dec 2021 15:10:16 -0500
List-Id: SCSI subsystem <freebsd-scsi.freebsd.org>
List-Archive: https://lists.freebsd.org/archives/freebsd-scsi
List-Help: <mailto:scsi+help@freebsd.org>
List-Post: <mailto:scsi@freebsd.org>
List-Subscribe: <mailto:scsi+subscribe@freebsd.org>
List-Unsubscribe: <mailto:scsi+unsubscribe@freebsd.org>
Sender: owner-freebsd-scsi@freebsd.org
X-BeenThere: freebsd-scsi@freebsd.org
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:91.0) Gecko/20100101
 Thunderbird/91.4.1
Subject: Re: iSCSI target: Handling in-flight requests during ctld shutdown
Content-Language: en-US
To: John Baldwin <jhb@FreeBSD.org>
Cc: =?UTF-8?Q?Edward_Tomasz_Napiera=c5=82a?= <trasz@freebsd.org>,
 scsi@FreeBSD.org, Ken Merry <ken@freebsd.org>
References: <fd383f6f-5a19-e2bb-5383-e559271eb3cd@FreeBSD.org>
 <b6c090ac-6cb0-6173-422d-9aef0b37b8ee@FreeBSD.org>
 <42e175d9-1693-29e2-0b5b-3fa513aa2a2d@FreeBSD.org>
From: Alexander Motin <mav@FreeBSD.org>
In-Reply-To: <42e175d9-1693-29e2-0b5b-3fa513aa2a2d@FreeBSD.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Rspamd-Queue-Id: 4JPzsQ4xnvz3m2k
X-Spamd-Bar: ----
Authentication-Results: mx1.freebsd.org;
	none
X-Spamd-Result: default: False [-4.00 / 15.00];
	 REPLY(-4.00)[]
X-ThisMailContainsUnwantedMimeParts: N

On 30.12.2021 14:29, John Baldwin wrote:
> On 12/29/21 1:57 PM, Alexander Motin wrote:
>> The HARDWARE ERROR is obviously not expected by the initiator.  It
>> should better not be leaked after we decided to kill the connection.
>> Initiator may retry it and still work happily after reconnect, but
>> cleaner would be to not rely on that.  cfiscsi_session_terminate_tasks()
>> aborts all running commands by CTL_TASK_I_T_NEXUS_RESET, that make them
>> not return statuses to initiator, but I suppose this is the other side
>> of the race now.
> 
> Hmm, I wonder if we should be setting CTL_FLAG_ABORT instead of setting the
> port_status when aborting an I/O?  The comment in ctl_frontend_iscsi.c 
> claims
> the backends check the port_status, but I don't see any checks for 
> port_status
> at all in backends.

The checks were in all backends, until in 2c7dc6bae9fd I've moved them 
to ctl_datamove_done_process().  Backends don't have any special 
knowledge now to process specific data move error codes.  It was 
complete code duplication between backends and I've tried to organize 
that a bit.  Unfortunately we do not have formalized data move error 
statuses in CTL to do more.  It would be much more interesting (and 
complicated ;)) if iSCSI (for example) could report data transfer error 
without killing connection, but that area in general is not very 
specified in SCSI, since usually each transport has own means of recovery.

>  I do see checks for CTL_FLAG_ABORT, and the handler 
> for
> the CTL_TASK_I_T_NEXUS_RESET does set CTL_FLAG_ABORT on pending requests.
> 
> For the tasks in sciscsi_session_terminate_tasks(), those should already 
> have
> CTL_FLAG_ABORT set anyway, but it wouldn't hurt if it were set again by
> cfiscsi_data_wait_abort().  For the the cfiscsi_task_management_done 
> case I'm
> less certain, but I suspect there too that returning an internal error 
> status
> back to the initiator is not expected and that it would be better to 
> just set
> CTL_FLAG_ABORT and drop any response?

cfiscsi_data_wait_abort() does only one specific thing -- it aborts 
waiting for data and completes the data move operation, allowing backend 
to continue (or at least free resources, etc).  It is only 
cfiscsi_session_terminate_tasks() knows that it should abort data moves 
while it aborts tasks, since it knows that the connection is dead. 
cfiscsi_task_management_done() is another special case, since iSCSI 
explicitly mentions that there will be no further exchanges related to 
that task, so the data move should explicitly return.  But in general 
case I think it would be nice to not combine those two facts of aborting 
task and aborting data transfer, since the last may not necessary imply 
the first, as the first does not imply the last.

It is tempting to just set the flag, but I suspect it may cause more 
problems (I am worrying about HA peer update).  I'd prefer the race 
inside the transport to be handled inside the transport.  May be we 
could just add another flag to be set under the session lock inside 
cfiscsi_session_terminate_tasks() after it aborted all the tasks, but 
before it start to abort transfers itself and use it inside 
cfiscsi_datamove_out() instead of cs_terminating.

-- 
Alexander Motin