From nobody Thu Dec 30 20:10:16 2021 X-Original-To: scsi@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 0FCDA192EBB7 for ; Thu, 30 Dec 2021 20:10:19 +0000 (UTC) (envelope-from mavbsd@gmail.com) Received: from mail-qv1-xf2b.google.com (mail-qv1-xf2b.google.com [IPv6:2607:f8b0:4864:20::f2b]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1D4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4JPzsQ4xnvz3m2k; Thu, 30 Dec 2021 20:10:18 +0000 (UTC) (envelope-from mavbsd@gmail.com) Received: by mail-qv1-xf2b.google.com with SMTP id g15so23028833qvi.6; Thu, 30 Dec 2021 12:10:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=sender:message-id:date:mime-version:user-agent:subject :content-language:to:cc:references:from:in-reply-to :content-transfer-encoding; bh=8SYTJlKsG0ignnYT7itBfdZU+xoCRKW/dZ9Am06jg58=; b=dnzm2KxTxqQvHq8R1sLHUMN78D0bldhBwY6h8J1w773p08wR2aIihWw8NO0Z08caAQ 5afJMrml+QKUyj5wjh3oBAb16fvg8QLIW63FGvsFYqBYNBJiiYNafpR3U33VvXWWqwF8 UeQZf+u1kkPPzFsJ6soL9MQtoYTL6l2Wdo3i77hhR9JOxeXVy7pbeGqckML+tfEnxRVl K+DpHi96NijUvj//3S1knms2kk6BuEvKzCEEZeqo0F9WObgEFi5mR3NEYHDMNsuIAIDY V1RVKOPmyPJF/Nu8lMVzSCb8AbC2yXxjo2s8sSiFYaT7R3O9whzcy7HfrxKLV5mfi9zA hyeg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:sender:message-id:date:mime-version:user-agent :subject:content-language:to:cc:references:from:in-reply-to :content-transfer-encoding; bh=8SYTJlKsG0ignnYT7itBfdZU+xoCRKW/dZ9Am06jg58=; b=Sr50CpvA/ZeCVA8LzgUdcbkd7JrMaBmaxTiqvlfVGVnsufKT2uZA+V9lrRlnLs0mMf KAbhFvZHOhnhrCn7SlJeUCF9SGSvqjHmvnuJIsuomHxwe7vW0rFkUJGI+l11T4UqsW5M Obf261KBHYQcjnXZOcIPGDf9ZTKhQnKHopheFWGwDigbj0Q/vXDIcmI2wPaWbHGxivSC Pklsj1YWdqY/ySM1glI0SucTklnauuy/wsWPxhi2wLRn5k5/yqpYXRogyPwVtpAAQi6J SjiSI1ktl+iPZGEWtbg/BGqMJtcWXknPGs5r4GhC/OHj+J+/yP5ICx/kQlYVLKt3ghvO Qiwg== X-Gm-Message-State: AOAM531tmew+eWptrhtoXnXiofUql0s9HEGwGeyMG1EdKatVPHFhog+t W5bbseYQJYhbjj6oXx8V9EIOTovxNBM= X-Google-Smtp-Source: ABdhPJwdcuiP7ZtHIHnEBRWVQkC8Z+lHLB6QRaSwao47Ltm9/ndmigVg2G6SaXg2oSPIEtwezPw08g== X-Received: by 2002:a05:6214:c6d:: with SMTP id t13mr13845155qvj.76.1640895017915; Thu, 30 Dec 2021 12:10:17 -0800 (PST) Received: from [10.231.1.66] ([38.32.73.2]) by smtp.gmail.com with ESMTPSA id v9sm19196003qkf.90.2021.12.30.12.10.17 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 30 Dec 2021 12:10:17 -0800 (PST) Message-ID: <7f1a1a65-199d-858a-792c-42871d1df13e@FreeBSD.org> Date: Thu, 30 Dec 2021 15:10:16 -0500 List-Id: SCSI subsystem List-Archive: https://lists.freebsd.org/archives/freebsd-scsi List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-scsi@freebsd.org X-BeenThere: freebsd-scsi@freebsd.org MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:91.0) Gecko/20100101 Thunderbird/91.4.1 Subject: Re: iSCSI target: Handling in-flight requests during ctld shutdown Content-Language: en-US To: John Baldwin Cc: =?UTF-8?Q?Edward_Tomasz_Napiera=c5=82a?= , scsi@FreeBSD.org, Ken Merry References: <42e175d9-1693-29e2-0b5b-3fa513aa2a2d@FreeBSD.org> From: Alexander Motin In-Reply-To: <42e175d9-1693-29e2-0b5b-3fa513aa2a2d@FreeBSD.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Rspamd-Queue-Id: 4JPzsQ4xnvz3m2k X-Spamd-Bar: ---- Authentication-Results: mx1.freebsd.org; none X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[] X-ThisMailContainsUnwantedMimeParts: N On 30.12.2021 14:29, John Baldwin wrote: > On 12/29/21 1:57 PM, Alexander Motin wrote: >> The HARDWARE ERROR is obviously not expected by the initiator.  It >> should better not be leaked after we decided to kill the connection. >> Initiator may retry it and still work happily after reconnect, but >> cleaner would be to not rely on that.  cfiscsi_session_terminate_tasks() >> aborts all running commands by CTL_TASK_I_T_NEXUS_RESET, that make them >> not return statuses to initiator, but I suppose this is the other side >> of the race now. > > Hmm, I wonder if we should be setting CTL_FLAG_ABORT instead of setting the > port_status when aborting an I/O?  The comment in ctl_frontend_iscsi.c > claims > the backends check the port_status, but I don't see any checks for > port_status > at all in backends. The checks were in all backends, until in 2c7dc6bae9fd I've moved them to ctl_datamove_done_process(). Backends don't have any special knowledge now to process specific data move error codes. It was complete code duplication between backends and I've tried to organize that a bit. Unfortunately we do not have formalized data move error statuses in CTL to do more. It would be much more interesting (and complicated ;)) if iSCSI (for example) could report data transfer error without killing connection, but that area in general is not very specified in SCSI, since usually each transport has own means of recovery. >  I do see checks for CTL_FLAG_ABORT, and the handler > for > the CTL_TASK_I_T_NEXUS_RESET does set CTL_FLAG_ABORT on pending requests. > > For the tasks in sciscsi_session_terminate_tasks(), those should already > have > CTL_FLAG_ABORT set anyway, but it wouldn't hurt if it were set again by > cfiscsi_data_wait_abort().  For the the cfiscsi_task_management_done > case I'm > less certain, but I suspect there too that returning an internal error > status > back to the initiator is not expected and that it would be better to > just set > CTL_FLAG_ABORT and drop any response? cfiscsi_data_wait_abort() does only one specific thing -- it aborts waiting for data and completes the data move operation, allowing backend to continue (or at least free resources, etc). It is only cfiscsi_session_terminate_tasks() knows that it should abort data moves while it aborts tasks, since it knows that the connection is dead. cfiscsi_task_management_done() is another special case, since iSCSI explicitly mentions that there will be no further exchanges related to that task, so the data move should explicitly return. But in general case I think it would be nice to not combine those two facts of aborting task and aborting data transfer, since the last may not necessary imply the first, as the first does not imply the last. It is tempting to just set the flag, but I suspect it may cause more problems (I am worrying about HA peer update). I'd prefer the race inside the transport to be handled inside the transport. May be we could just add another flag to be set under the session lock inside cfiscsi_session_terminate_tasks() after it aborted all the tasks, but before it start to abort transfers itself and use it inside cfiscsi_datamove_out() instead of cs_terminating. -- Alexander Motin