git: 7dbcef9536b4 - stable/12 - Upgrade ENA to v2.4.1
Date: Mon, 11 Oct 2021 14:32:43 UTC
The branch stable/12 has been updated by mw:

URL: https://cgit.FreeBSD.org/src/commit/?id=7dbcef9536b410426e8b391e721e5800f5d503b5

commit 7dbcef9536b410426e8b391e721e5800f5d503b5
Author:     Marcin Wojtas <mw@FreeBSD.org>
AuthorDate: 2021-07-23 22:31:32 +0000
Commit:     Marcin Wojtas <mw@FreeBSD.org>
CommitDate: 2021-10-11 14:32:14 +0000

    Upgrade ENA to v2.4.1

    Approved by: re

    ena: Remove redundant declaration of ena_log_level.

    GCC6 raises a -Wredundant-decl error due to duplicate declarations in
    ena_fbsd_log.h and ena_plat.h.

    Submitted by: jhb
    Sponsored by: Chelsio Communications

    (cherry picked from commit 8843787aa1bdbd10de6ba47a04489179ec2d2d3c)

    ena: Avoid unnecessary mbuf collapses for LLQ condition

    In case of Low-latency Queue, one small enough descriptor can be pushed
    directly to the ENA hw, thus saving one fragment. Check for this
    condition before performing collapse.

    Submitted by: Artur Rojek <ar@semihalf.com>
    Obtained from: Semihalf
    MFC after: 2 weeks
    Sponsored by: Amazon, Inc.

    (cherry picked from commit c81f8c26115a64b9a97ecdb2a64e824dd839ee73)

    ena: Trigger reset on ena_com_prepare_tx failure

    All ena_com_prepare_tx errors other than ENA_COM_NO_MEM are fatal and
    require device reset.

    Submitted by: Artur Rojek <ar@semihalf.com>
    Obtained from: Semihalf
    MFC after: 2 weeks
    Sponsored by: Amazon, Inc.

    (cherry picked from commit 36130d2979d695dd439bc607feb00dcdb9a1937b)

    ena: Prevent reset after device destruction

    Check for ENA_FLAG_TRIGGER_RESET inside a locked context in order to
    avoid potential race conditions with ena_destroy_device. This aligns the
    reset task logic with the Linux driver.

    Submitted by: Artur Rojek <ar@semihalf.com>
    Obtained from: Semihalf
    MFC after: 2 weeks
    Sponsored by: Amazon, Inc.

    (cherry picked from commit 433ab9b6987b42b3e5b25b8b5dc7e5178c7ef9bb)

    ena: Add extra log messages

    Stay aligned with the Linux driver by adding the following logs:
    * inform the user about retrying queue creation
    * warn on non-empty ena_tx_buffer.mbuf prior to ena_tx_map_mbuf

    Submitted by: Artur Rojek <ar@semihalf.com>
    Obtained from: Semihalf
    MFC after: 2 weeks
    Sponsored by: Amazon, Inc.

    (cherry picked from commit 77160654a162b5faa8ad7a02e18d2bef2589f868)

    ena: Add locking assertions

    ENA silently assumed that ena_up, ena_down and ena_start_xmit routines
    should be called within locked context.

    Driver's logic heavily assumes on concurrent access to those routines,
    so for safety and better documentation about this assumption, the
    locking assertions were added to the above functions.

    The assertion was added only for the main steps (skipping the helper
    functions) which can be called from multiple places including the kernel
    and the driver itself.

    Submitted by: Artur Rojek <ar@semihalf.com>
    Obtained from: Semihalf
    MFC after: 2 weeks
    Sponsored by: Amazon, Inc.

    (cherry picked from commit cb98c439d66c303353a9f4abbbe9ddb51559c638)

    ena: Move RSS logic into its own source files

    Delegate RSS related functionality into separate .c/.h files in
    preparation for the full RSS support.

    While at it, reorder functions and remove prototypes for ones with
    internal linkage.

    Submitted by: Artur Rojek <ar@semihalf.com>
    Obtained from: Semihalf
    MFC after: 2 weeks
    Sponsored by: Amazon, Inc.

    (cherry picked from commit 986e7b9227668caf9620f207e3c1d708c87b634d)

    ena: Disable meta descriptor caching for netmap

    If LLQ is being used, `ena_tx_ctx.meta_valid` must stay enabled.
    This fixes netmap support on latest generation ENA HW and aligns it with
    the core driver behavior.

    As netmap doesn't support any csum offloads, the
    `adapter->disable_meta_caching` value can be simply passed to the HW.

    Submitted by: Artur Rojek <ar@semihalf.com>
    Obtained from: Semihalf
    MFC after: 2 weeks
    Sponsored by: Amazon, Inc.

    (cherry picked from commit a831466830de6ab55fc03170290b313157196e81)

    ena: Share ena_global_lock between driver instances

    In order to use `ena_global_lock` in sysctl context, it must be kept
    outside the driver instance's software context, as sysctls can be called
    before attach and after detach, leading to lock use before sx_init and
    after sx_destroy otherwise.

    Solve this issue by turning `ena_global_lock` into a file scope
    variable, shared between all instances of the driver and associated
    sysctl context, and in turn initialized/destroyed in dedicated
    SYSINIT/SYSUNINIT functions.

    As a side effect, this change also fixes existing race in the reset
    routine, when simultaneously accessing sysctl exposed properties.

    Submitted by: Artur Rojek <ar@semihalf.com>
    Obtained from: Semihalf
    MFC after: 2 weeks
    Sponsored by: Amazon, Inc.

    (cherry picked from commit 07aff471c0de2de9a1dc5c7749c46b525bdd0201)

    ena: Add missing statistics

    Provide the following sysctl statistics in order to stay aligned with
    the Linux driver:
    * rx_ring.csum_good
    * tx_ring.unmask_interrupt_num

    Also rename the 'bad_csum' statistic name to 'csum_bad' for alignment.

    Submitted by: Artur Rojek <ar@semihalf.com>
    Obtained from: Semihalf
    MFC after: 2 weeks
    Sponsored by: Amazon, Inc.

    (cherry picked from commit 223c8cb12e951c63807300a0cbdc4a1569520b4b)

    ena: Implement full RSS reconfiguration

    Bind RX/TX queues and MSI-X vectors to matching CPUs based on the RSS
    bucket entries.

    Introduce sysctls for the following RSS functionality:
    - rss.indir_table: indirection table mapping
    - rss.indir_table_size: indirection table size
    - rss.key: RSS hash key (if Toeplitz used)

    Said sysctls are only available when compiled without `option RSS`, as
    kernel-side RSS support currently doesn't offer RSS reconfiguration.

    Migrate the hash algorithm from CRC32 to Toeplitz and change the initial
    hash value to 0x0 in order to match the standard Toeplitz
    implementation.

    Provide helpers for hash key inversion required for HW operations.

    Submitted by: Artur Rojek <ar@semihalf.com>
    Obtained from: Semihalf
    MFC after: 2 weeks
    Sponsored by: Amazon, Inc.

    (cherry picked from commit 6d1ef2abd330fac4057f092abbbdc28a568b4327)

    ena: fix building in-kernel driver

    When building ENA as compiled into the kernel, the driver would fail to
    build. Resolve the problem by introducing the following changes:

    1. Add missing `ena_rss.c` entry in `sys/conf/files`.
    2. Prevent SYSCTL_ADD_INT from throwing an assert due to an extra
       CTLTYPE_INT flag.

    Fixes: 986e7b92276 ("ena: Move RSS logic into its own source files")
    Fixes: 6d1ef2abd33 ("ena: Implement full RSS reconfiguration")

    Submitted by: Artur Rojek <ar@semihalf.com>
    Obtained from: Semihalf
    Sponsored by: Amazon, Inc.
    MFC after: 1 week

    (cherry picked from commit a3f0d18237bdcf272461d3b4b682de384c572144)

    ena: Update driver version to v2.4.1

    Some of the changes in this release:

    * Hardware RSS hash key reconfiguration and indirection table
      reconfiguration support.
    * Full kernel RSS support.
    * Extra statistic counters.
    * Netmap support for ENAv3.
    * Locking assertions.
    * Extra log messages.
    * Reset handling fixes.

    Submitted by: Michal Krawczyk <mk@semihalf.com>
    Obtained from: Semihalf
    MFC after: 2 weeks
    Sponsored by: Amazon, Inc.

    (cherry picked from commit 42c7760be3ea420668f625f2064ae347aa7e818e)
---
 share/man/man4/ena.4           |  51 +++++++
 sys/conf/files                 |   2 +
 sys/contrib/ena-com/ena_plat.h |   2 -
 sys/dev/ena/ena.c              | 302 +++++++++++++------------------------
 sys/dev/ena/ena.h              |  34 +++--
 sys/dev/ena/ena_datapath.c     |  32 +++-
 sys/dev/ena/ena_netmap.c       |   7 +-
 sys/dev/ena/ena_rss.c          | 300 +++++++++++++++++++++++++++++++++++++
 sys/dev/ena/ena_rss.h          |  73 +++++++++
 sys/dev/ena/ena_sysctl.c       | 329 +++++++++++++++++++++++++++++++++++++++--
 sys/modules/ena/Makefile       |   2 +-
 11 files changed, 900 insertions(+), 234 deletions(-)

diff --git a/share/man/man4/ena.4 b/share/man/man4/ena.4
index bc1927608b04..71bdf4babca6 100644
--- a/share/man/man4/ena.4
+++ b/share/man/man4/ena.4
@@ -269,6 +269,57 @@ command should be used:
 .Bd -literal -offset indent
 sysctl dev.ena.1.eni_metrics.sample_interval=10
 .Ed
+.It Va dev.ena.X.rss.indir_table_size
+RSS indirection table size.
+The default is 128.
+Returns the number of entries in the RSS indirection table.
+.Pp
+Example:
+To read the RSS indirection table size, the following command should be used:
+.Bd -literal -offset indent
+sysctl dev.ena.0.rss.indir_table_size
+.Ed
+.It Va dev.ena.X.rss.indir_table
+RSS indirection table mapping.
+The default is x:y key-pairs of indir_table_size length.
+Updates selected indices of the RSS indirection table.
+.Pp
+The entry string consists of one or more x:y keypairs, where x stands for
+the table index and y for its new value. Table indices that don't need to be
+updated can be omitted from the string and will retain their existing values.
+.Pp
+If an index is entered more than once, the last value is used.
+.Pp
+Example:
+To update two selected indices in the RSS indirection table, e.g. setting index
+0 to queue 5 and then index 5 to queue 0, the following command should be used:
+.Bd -literal -offset indent
+sysctl dev.ena.0.rss.indir_table="0:5 5:0"
+.Ed
+.It Va dev.ena.X.rss.key
+RSS hash key.
+The default is 40 bytes long randomly generated hash key.
+Controls the RSS Toeplitz hash algorithm key value.
+.Pp
+Only available when driver compiled without the kernel side RSS support.
+.Pp
+Example:
+To change the RSS hash key value to
+.Pp
+0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
+.br
+0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
+.br
+0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
+.br
+0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
+.br
+0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa
+.Pp
+the following command should be used:
+.Bd -literal -offset indent
+sysctl dev.ena.0.rss.key=6d5a56da255b0ec24167253d43a38fb0d0ca2bcbae7b30b477cb2da38030f20c6a42b73bbeac01fa
+.Ed
 .El
 .Sh DIAGNOSTICS
 .Ss Device initialization phase
diff --git a/sys/conf/files b/sys/conf/files
index ec54a06e84f1..34e1fffc4165 100644
--- a/sys/conf/files
+++ b/sys/conf/files
@@ -1714,6 +1714,8 @@ dev/ena/ena_datapath.c optional ena \
 	compile-with "${NORMAL_C} -I$S/contrib"
 dev/ena/ena_netmap.c		optional ena \
 	compile-with "${NORMAL_C} -I$S/contrib"
+dev/ena/ena_rss.c		optional ena \
+	compile-with "${NORMAL_C} -I$S/contrib"
 dev/ena/ena_sysctl.c		optional ena \
 	compile-with "${NORMAL_C} -I$S/contrib"
 contrib/ena-com/ena_com.c	optional ena
diff --git a/sys/contrib/ena-com/ena_plat.h b/sys/contrib/ena-com/ena_plat.h
index b31821248398..274f795950c0 100644
--- a/sys/contrib/ena-com/ena_plat.h
+++ b/sys/contrib/ena-com/ena_plat.h
@@ -98,8 +98,6 @@ extern struct ena_bus_space ebs;
 #define DEFAULT_ALLOC_ALIGNMENT 8
 #define ENA_CDESC_RING_SIZE_ALIGNMENT (1 << 12) /* 4K */
 
-extern int ena_log_level;
-
 #define container_of(ptr, type, member) \
 	({ \
 		const __typeof(((type *)0)->member) *__p = (ptr); \
diff --git a/sys/dev/ena/ena.c b/sys/dev/ena/ena.c
index 6425ddddee9a..5b5a7f6593f4 100644
--- a/sys/dev/ena/ena.c
+++ b/sys/dev/ena/ena.c
@@ -63,9 +63,6 @@ __FBSDID("$FreeBSD$");
 #include <net/if_media.h>
 #include <net/if_types.h>
 #include <net/if_vlan_var.h>
-#ifdef RSS
-#include
<net/rss_config.h> -#endif #include <netinet/in_systm.h> #include <netinet/in.h> @@ -84,6 +81,7 @@ __FBSDID("$FreeBSD$"); #include "ena_datapath.h" #include "ena.h" #include "ena_sysctl.h" +#include "ena_rss.h" #ifdef DEV_NETMAP #include "ena_netmap.h" @@ -143,7 +141,6 @@ static void ena_free_io_irq(struct ena_adapter *); static void ena_free_irqs(struct ena_adapter*); static void ena_disable_msix(struct ena_adapter *); static void ena_unmask_all_io_irqs(struct ena_adapter *); -static int ena_rss_configure(struct ena_adapter *); static int ena_up_complete(struct ena_adapter *); static uint64_t ena_get_counter(if_t, ift_counter); static int ena_media_change(if_t); @@ -161,8 +158,6 @@ static int ena_set_queues_placement_policy(device_t, struct ena_com_dev *, static uint32_t ena_calc_max_io_queue_num(device_t, struct ena_com_dev *, struct ena_com_dev_get_features_ctx *); static int ena_calc_io_queue_size(struct ena_calc_queue_size_ctx *); -static int ena_rss_init_default(struct ena_adapter *); -static void ena_rss_init_default_deferred(void *); static void ena_config_host_info(struct ena_com_dev *, device_t); static int ena_attach(device_t); static int ena_detach(device_t); @@ -186,6 +181,8 @@ static ena_vendor_info_t ena_vendor_info_array[] = { { 0, 0, 0 } }; +struct sx ena_global_lock; + /* * Contains pointers to event handlers, e.g. link state chage. */ @@ -265,27 +262,6 @@ fail_tag: return (error); } -/* - * This function should generate unique key for the whole driver. - * If the key was already genereated in the previous call (for example - * for another adapter), then it should be returned instead. 
- */ -void -ena_rss_key_fill(void *key, size_t size) -{ - static bool key_generated; - static uint8_t default_key[ENA_HASH_KEY_SIZE]; - - KASSERT(size <= ENA_HASH_KEY_SIZE, ("Requested more bytes than ENA RSS key can hold")); - - if (!key_generated) { - arc4random_buf(default_key, ENA_HASH_KEY_SIZE); - key_generated = true; - } - - memcpy(key, default_key, size); -} - static void ena_free_pci_resources(struct ena_adapter *adapter) { @@ -625,8 +601,10 @@ static int ena_setup_tx_resources(struct ena_adapter *adapter, int qid) { device_t pdev = adapter->pdev; + char thread_name[MAXCOMLEN + 1]; struct ena_que *que = &adapter->que[qid]; struct ena_ring *tx_ring = que->tx_ring; + cpuset_t *cpu_mask = NULL; int size, i, err; #ifdef DEV_NETMAP bus_dmamap_t *map; @@ -710,8 +688,16 @@ ena_setup_tx_resources(struct ena_adapter *adapter, int qid) tx_ring->running = true; - taskqueue_start_threads(&tx_ring->enqueue_tq, 1, PI_NET, - "%s txeq %d", device_get_nameunit(adapter->pdev), que->cpu); +#ifdef RSS + cpu_mask = &que->cpu_mask; + snprintf(thread_name, sizeof(thread_name), "%s txeq %d", + device_get_nameunit(adapter->pdev), que->cpu); +#else + snprintf(thread_name, sizeof(thread_name), "%s txeq %d", + device_get_nameunit(adapter->pdev), que->id); +#endif + taskqueue_start_threads_cpuset(&tx_ring->enqueue_tq, 1, PI_NET, + cpu_mask, "%s", thread_name); return (0); @@ -1153,8 +1139,6 @@ ena_update_buf_ring_size(struct ena_adapter *adapter, int rc = 0; bool dev_was_up; - ENA_LOCK_LOCK(adapter); - old_buf_ring_size = adapter->buf_ring_size; adapter->buf_ring_size = new_buf_ring_size; @@ -1189,8 +1173,6 @@ ena_update_buf_ring_size(struct ena_adapter *adapter, } } - ENA_LOCK_UNLOCK(adapter); - return (rc); } @@ -1202,8 +1184,6 @@ ena_update_queue_size(struct ena_adapter *adapter, uint32_t new_tx_size, int rc = 0; bool dev_was_up; - ENA_LOCK_LOCK(adapter); - old_tx_size = adapter->requested_tx_ring_size; old_rx_size = adapter->requested_rx_ring_size; adapter->requested_tx_ring_size 
= new_tx_size; @@ -1244,8 +1224,6 @@ ena_update_queue_size(struct ena_adapter *adapter, uint32_t new_tx_size, } } - ENA_LOCK_UNLOCK(adapter); - return (rc); } @@ -1268,8 +1246,6 @@ ena_update_io_queue_nb(struct ena_adapter *adapter, uint32_t new_num) int rc = 0; bool dev_was_up; - ENA_LOCK_LOCK(adapter); - dev_was_up = ENA_FLAG_ISSET(ENA_FLAG_DEV_UP, adapter); old_num = adapter->num_io_queues; ena_down(adapter); @@ -1299,8 +1275,6 @@ ena_update_io_queue_nb(struct ena_adapter *adapter, uint32_t new_num) } } - ENA_LOCK_UNLOCK(adapter); - return (rc); } @@ -1459,6 +1433,7 @@ ena_create_io_queues(struct ena_adapter *adapter) struct ena_que *queue; uint16_t ena_qid; uint32_t msix_vector; + cpuset_t *cpu_mask = NULL; int rc, i; /* Create TX queues */ @@ -1525,7 +1500,11 @@ ena_create_io_queues(struct ena_adapter *adapter) queue->cleanup_tq = taskqueue_create_fast("ena cleanup", M_WAITOK, taskqueue_thread_enqueue, &queue->cleanup_tq); - taskqueue_start_threads(&queue->cleanup_tq, 1, PI_NET, +#ifdef RSS + cpu_mask = &queue->cpu_mask; +#endif + taskqueue_start_threads_cpuset(&queue->cleanup_tq, 1, PI_NET, + cpu_mask, "%s queue %d cleanup", device_get_nameunit(adapter->pdev), i); } @@ -1664,7 +1643,10 @@ ena_setup_mgmnt_intr(struct ena_adapter *adapter) static int ena_setup_io_intr(struct ena_adapter *adapter) { - static int last_bind_cpu = -1; +#ifdef RSS + int num_buckets = rss_getnumbuckets(); + static int last_bind = 0; +#endif int irq_idx; if (adapter->msix_entries == NULL) @@ -1682,15 +1664,12 @@ ena_setup_io_intr(struct ena_adapter *adapter) ena_log(adapter->pdev, DBG, "ena_setup_io_intr vector: %d\n", adapter->msix_entries[irq_idx].vector); - /* - * We want to bind rings to the corresponding cpu - * using something similar to the RSS round-robin technique. 
- */ - if (unlikely(last_bind_cpu < 0)) - last_bind_cpu = CPU_FIRST(); +#ifdef RSS adapter->que[i].cpu = adapter->irq_tbl[irq_idx].cpu = - last_bind_cpu; - last_bind_cpu = CPU_NEXT(last_bind_cpu); + rss_getcpu(last_bind); + last_bind = (last_bind + 1) % num_buckets; + CPU_SETOF(adapter->que[i].cpu, &adapter->que[i].cpu_mask); +#endif } return (0); @@ -1782,6 +1761,19 @@ ena_request_io_irq(struct ena_adapter *adapter) goto err; } irq->requested = true; + +#ifdef RSS + rc = bus_bind_intr(adapter->pdev, irq->res, irq->cpu); + if (unlikely(rc != 0)) { + ena_log(pdev, ERR, "failed to bind " + "interrupt handler for irq %ju to cpu %d: %d\n", + rman_get_start(irq->res), irq->cpu, rc); + goto err; + } + + ena_log(pdev, INFO, "queue %d - cpu %d\n", + i - ENA_IO_IRQ_FIRST_IDX, irq->cpu); +#endif } return (rc); @@ -1910,6 +1902,7 @@ ena_unmask_all_io_irqs(struct ena_adapter *adapter) { struct ena_com_io_cq* io_cq; struct ena_eth_io_intr_reg intr_reg; + struct ena_ring *tx_ring; uint16_t ena_qid; int i; @@ -1918,47 +1911,12 @@ ena_unmask_all_io_irqs(struct ena_adapter *adapter) ena_qid = ENA_IO_TXQ_IDX(i); io_cq = &adapter->ena_dev->io_cq_queues[ena_qid]; ena_com_update_intr_reg(&intr_reg, 0, 0, true); + tx_ring = &adapter->tx_ring[i]; + counter_u64_add(tx_ring->tx_stats.unmask_interrupt_num, 1); ena_com_unmask_intr(io_cq, &intr_reg); } } -/* Configure the Rx forwarding */ -static int -ena_rss_configure(struct ena_adapter *adapter) -{ - struct ena_com_dev *ena_dev = adapter->ena_dev; - int rc; - - /* In case the RSS table was destroyed */ - if (!ena_dev->rss.tbl_log_size) { - rc = ena_rss_init_default(adapter); - if (unlikely((rc != 0) && (rc != EOPNOTSUPP))) { - ena_log(adapter->pdev, ERR, - "WARNING: RSS was not properly re-initialized," - " it will affect bandwidth\n"); - ENA_FLAG_CLEAR_ATOMIC(ENA_FLAG_RSS_ACTIVE, adapter); - return (rc); - } - } - - /* Set indirect table */ - rc = ena_com_indirect_table_set(ena_dev); - if (unlikely((rc != 0) && (rc != EOPNOTSUPP))) - 
return (rc); - - /* Configure hash function (if supported) */ - rc = ena_com_set_hash_function(ena_dev); - if (unlikely((rc != 0) && (rc != EOPNOTSUPP))) - return (rc); - - /* Configure hash inputs (if supported) */ - rc = ena_com_set_hash_ctrl(ena_dev); - if (unlikely((rc != 0) && (rc != EOPNOTSUPP))) - return (rc); - - return (0); -} - static int ena_up_complete(struct ena_adapter *adapter) { @@ -2079,6 +2037,10 @@ err_setup_tx: return (rc); } + ena_log(pdev, INFO, + "Retrying queue creation with sizes TX=%d, RX=%d\n", + new_tx_ring_size, new_rx_ring_size); + set_io_rings_size(adapter, new_tx_ring_size, new_rx_ring_size); } } @@ -2088,6 +2050,8 @@ ena_up(struct ena_adapter *adapter) { int rc = 0; + ENA_LOCK_ASSERT(); + if (unlikely(device_is_attached(adapter->pdev) == 0)) { ena_log(adapter->pdev, ERR, "device is not attached!\n"); return (ENXIO); @@ -2205,13 +2169,13 @@ ena_media_status(if_t ifp, struct ifmediareq *ifmr) struct ena_adapter *adapter = if_getsoftc(ifp); ena_log(adapter->pdev, DBG, "Media status update\n"); - ENA_LOCK_LOCK(adapter); + ENA_LOCK_LOCK(); ifmr->ifm_status = IFM_AVALID; ifmr->ifm_active = IFM_ETHER; if (!ENA_FLAG_ISSET(ENA_FLAG_LINK_UP, adapter)) { - ENA_LOCK_UNLOCK(adapter); + ENA_LOCK_UNLOCK(); ena_log(adapter->pdev, INFO, "Link is down\n"); return; } @@ -2219,7 +2183,7 @@ ena_media_status(if_t ifp, struct ifmediareq *ifmr) ifmr->ifm_status |= IFM_ACTIVE; ifmr->ifm_active |= IFM_UNKNOWN | IFM_FDX; - ENA_LOCK_UNLOCK(adapter); + ENA_LOCK_UNLOCK(); } static void @@ -2228,9 +2192,9 @@ ena_init(void *arg) struct ena_adapter *adapter = (struct ena_adapter *)arg; if (!ENA_FLAG_ISSET(ENA_FLAG_DEV_UP, adapter)) { - ENA_LOCK_LOCK(adapter); + ENA_LOCK_LOCK(); ena_up(adapter); - ENA_LOCK_UNLOCK(adapter); + ENA_LOCK_UNLOCK(); } } @@ -2252,13 +2216,13 @@ ena_ioctl(if_t ifp, u_long command, caddr_t data) case SIOCSIFMTU: if (ifp->if_mtu == ifr->ifr_mtu) break; - ENA_LOCK_LOCK(adapter); + ENA_LOCK_LOCK(); ena_down(adapter); ena_change_mtu(ifp, 
ifr->ifr_mtu); rc = ena_up(adapter); - ENA_LOCK_UNLOCK(adapter); + ENA_LOCK_UNLOCK(); break; case SIOCSIFFLAGS: @@ -2270,15 +2234,15 @@ ena_ioctl(if_t ifp, u_long command, caddr_t data) "ioctl promisc/allmulti\n"); } } else { - ENA_LOCK_LOCK(adapter); + ENA_LOCK_LOCK(); rc = ena_up(adapter); - ENA_LOCK_UNLOCK(adapter); + ENA_LOCK_UNLOCK(); } } else { if ((if_getdrvflags(ifp) & IFF_DRV_RUNNING) != 0) { - ENA_LOCK_LOCK(adapter); + ENA_LOCK_LOCK(); ena_down(adapter); - ENA_LOCK_UNLOCK(adapter); + ENA_LOCK_UNLOCK(); } } break; @@ -2303,10 +2267,10 @@ ena_ioctl(if_t ifp, u_long command, caddr_t data) if ((reinit != 0) && ((if_getdrvflags(ifp) & IFF_DRV_RUNNING) != 0)) { - ENA_LOCK_LOCK(adapter); + ENA_LOCK_LOCK(); ena_down(adapter); rc = ena_up(adapter); - ENA_LOCK_UNLOCK(adapter); + ENA_LOCK_UNLOCK(); } } @@ -2460,6 +2424,8 @@ ena_down(struct ena_adapter *adapter) { int rc; + ENA_LOCK_ASSERT(); + if (!ENA_FLAG_ISSET(ENA_FLAG_DEV_UP, adapter)) return; @@ -2525,6 +2491,10 @@ ena_calc_max_io_queue_num(device_t pdev, struct ena_com_dev *ena_dev, /* 1 IRQ for for mgmnt and 1 IRQ for each TX/RX pair */ max_num_io_queues = min_t(uint32_t, max_num_io_queues, pci_msix_count(pdev) - 1); +#ifdef RSS + max_num_io_queues = min_t(uint32_t, max_num_io_queues, + rss_getnumbuckets()); +#endif return (max_num_io_queues); } @@ -2726,90 +2696,6 @@ ena_calc_io_queue_size(struct ena_calc_queue_size_ctx *ctx) return (0); } -static int -ena_rss_init_default(struct ena_adapter *adapter) -{ - struct ena_com_dev *ena_dev = adapter->ena_dev; - device_t dev = adapter->pdev; - int qid, rc, i; - - rc = ena_com_rss_init(ena_dev, ENA_RX_RSS_TABLE_LOG_SIZE); - if (unlikely(rc != 0)) { - ena_log(dev, ERR, "Cannot init indirect table\n"); - return (rc); - } - - for (i = 0; i < ENA_RX_RSS_TABLE_SIZE; i++) { - qid = i % adapter->num_io_queues; - rc = ena_com_indirect_table_fill_entry(ena_dev, i, - ENA_IO_RXQ_IDX(qid)); - if (unlikely((rc != 0) && (rc != EOPNOTSUPP))) { - ena_log(dev, ERR, "Cannot fill 
indirect table\n"); - goto err_rss_destroy; - } - } - -#ifdef RSS - uint8_t rss_algo = rss_gethashalgo(); - if (rss_algo == RSS_HASH_TOEPLITZ) { - uint8_t hash_key[RSS_KEYSIZE]; - - rss_getkey(hash_key); - rc = ena_com_fill_hash_function(ena_dev, ENA_ADMIN_TOEPLITZ, - hash_key, RSS_KEYSIZE, 0xFFFFFFFF); - } else -#endif - rc = ena_com_fill_hash_function(ena_dev, ENA_ADMIN_CRC32, NULL, - ENA_HASH_KEY_SIZE, 0xFFFFFFFF); - if (unlikely((rc != 0) && (rc != EOPNOTSUPP))) { - ena_log(dev, ERR, "Cannot fill hash function\n"); - goto err_rss_destroy; - } - - rc = ena_com_set_default_hash_ctrl(ena_dev); - if (unlikely((rc != 0) && (rc != EOPNOTSUPP))) { - ena_log(dev, ERR, "Cannot fill hash control\n"); - goto err_rss_destroy; - } - - return (0); - -err_rss_destroy: - ena_com_rss_destroy(ena_dev); - return (rc); -} - -static void -ena_rss_init_default_deferred(void *arg) -{ - struct ena_adapter *adapter; - devclass_t dc; - int max; - int rc; - - dc = devclass_find("ena"); - if (unlikely(dc == NULL)) { - ena_log_raw(ERR, "SYSINIT: %s: No devclass ena\n", __func__); - return; - } - - max = devclass_get_maxunit(dc); - while (max-- >= 0) { - adapter = devclass_get_softc(dc, max); - if (adapter != NULL) { - rc = ena_rss_init_default(adapter); - ENA_FLAG_SET_ATOMIC(ENA_FLAG_RSS_ACTIVE, adapter); - if (unlikely(rc != 0)) { - ena_log(adapter->pdev, WARN, - "WARNING: RSS was not properly initialized," - " it will affect bandwidth\n"); - ENA_FLAG_CLEAR_ATOMIC(ENA_FLAG_RSS_ACTIVE, adapter); - } - } - } -} -SYSINIT(ena_rss_init, SI_SUB_KICK_SCHEDULER, SI_ORDER_SECOND, ena_rss_init_default_deferred, NULL); - static void ena_config_host_info(struct ena_com_dev *ena_dev, device_t dev) { @@ -2842,7 +2728,8 @@ ena_config_host_info(struct ena_com_dev *ena_dev, device_t dev) (DRV_MODULE_VER_SUBMINOR << ENA_ADMIN_HOST_INFO_SUB_MINOR_SHIFT); host_info->num_cpus = mp_ncpus; host_info->driver_supported_features = - ENA_ADMIN_HOST_INFO_RX_OFFSET_MASK; + ENA_ADMIN_HOST_INFO_RX_OFFSET_MASK | + 
ENA_ADMIN_HOST_INFO_RSS_CONFIGURABLE_FUNCTION_KEY_MASK; rc = ena_com_set_host_attributes(ena_dev); if (unlikely(rc != 0)) { @@ -3543,16 +3430,12 @@ ena_reset_task(void *arg, int pending) { struct ena_adapter *adapter = (struct ena_adapter *)arg; - if (unlikely(!ENA_FLAG_ISSET(ENA_FLAG_TRIGGER_RESET, adapter))) { - ena_log(adapter->pdev, WARN, - "device reset scheduled but trigger_reset is off\n"); - return; + ENA_LOCK_LOCK(); + if (likely(ENA_FLAG_ISSET(ENA_FLAG_TRIGGER_RESET, adapter))) { + ena_destroy_device(adapter, false); + ena_restore_device(adapter); } - - ENA_LOCK_LOCK(adapter); - ena_destroy_device(adapter, false); - ena_restore_device(adapter); - ENA_LOCK_UNLOCK(adapter); + ENA_LOCK_UNLOCK(); } /** @@ -3581,8 +3464,6 @@ ena_attach(device_t pdev) adapter = device_get_softc(pdev); adapter->pdev = pdev; - ENA_LOCK_INIT(adapter); - /* * Set up the timer service - driver is responsible for avoiding * concurrency, as the callout won't be using any locking inside. @@ -3824,19 +3705,19 @@ ena_detach(device_t pdev) ether_ifdetach(adapter->ifp); /* Stop timer service */ - ENA_LOCK_LOCK(adapter); + ENA_LOCK_LOCK(); callout_drain(&adapter->timer_service); - ENA_LOCK_UNLOCK(adapter); + ENA_LOCK_UNLOCK(); /* Release reset task */ while (taskqueue_cancel(adapter->reset_tq, &adapter->reset_task, NULL)) taskqueue_drain(adapter->reset_tq, &adapter->reset_task); taskqueue_free(adapter->reset_tq); - ENA_LOCK_LOCK(adapter); + ENA_LOCK_LOCK(); ena_down(adapter); ena_destroy_device(adapter, true); - ENA_LOCK_UNLOCK(adapter); + ENA_LOCK_UNLOCK(); /* Restore unregistered sysctl queue nodes. 
*/ ena_sysctl_update_queue_node_nb(adapter, adapter->num_io_queues, @@ -3865,13 +3746,14 @@ ena_detach(device_t pdev) ena_free_pci_resources(adapter); + if (adapter->rss_indir != NULL) + free(adapter->rss_indir, M_DEVBUF); + if (likely(ENA_FLAG_ISSET(ENA_FLAG_RSS_ACTIVE, adapter))) ena_com_rss_destroy(ena_dev); ena_com_delete_host_info(ena_dev); - ENA_LOCK_DESTROY(adapter); - if_free(adapter->ifp); free(ena_dev->bus, M_DEVBUF); @@ -3937,6 +3819,20 @@ static void ena_notification(void *adapter_data, } } +static void +ena_lock_init(void *arg) +{ + ENA_LOCK_INIT(); +} +SYSINIT(ena_lock_init, SI_SUB_LOCK, SI_ORDER_FIRST, ena_lock_init, NULL); + +static void +ena_lock_uninit(void *arg) +{ + ENA_LOCK_DESTROY(); +} +SYSUNINIT(ena_lock_uninit, SI_SUB_LOCK, SI_ORDER_FIRST, ena_lock_uninit, NULL); + /** * This handler will called for unknown event group or unimplemented handlers **/ diff --git a/sys/dev/ena/ena.h b/sys/dev/ena/ena.h index ff18b335ca1a..b580bca1159c 100644 --- a/sys/dev/ena/ena.h +++ b/sys/dev/ena/ena.h @@ -34,14 +34,14 @@ #ifndef ENA_H #define ENA_H -#include <sys/types.h> +#include "opt_rss.h" #include "ena-com/ena_com.h" #include "ena-com/ena_eth_com.h" #define DRV_MODULE_VER_MAJOR 2 #define DRV_MODULE_VER_MINOR 4 -#define DRV_MODULE_VER_SUBMINOR 0 +#define DRV_MODULE_VER_SUBMINOR 1 #define DRV_MODULE_NAME "ena" @@ -123,6 +123,8 @@ #define ENA_IO_TXQ_IDX(q) (2 * (q)) #define ENA_IO_RXQ_IDX(q) (2 * (q) + 1) +#define ENA_IO_TXQ_IDX_TO_COMBINED_IDX(q) ((q) / 2) +#define ENA_IO_RXQ_IDX_TO_COMBINED_IDX(q) (((q) - 1) / 2) #define ENA_MGMNT_IRQ_IDX 0 #define ENA_IO_IRQ_FIRST_IDX 1 @@ -201,7 +203,9 @@ struct ena_irq { void *cookie; unsigned int vector; bool requested; +#ifdef RSS int cpu; +#endif char name[ENA_IRQNAME_SIZE]; }; @@ -214,7 +218,10 @@ struct ena_que { struct taskqueue *cleanup_tq; uint32_t id; +#ifdef RSS int cpu; + cpuset_t cpu_mask; +#endif struct sysctl_oid *oid; }; @@ -281,19 +288,21 @@ struct ena_stats_tx { *** 1123 LINES SKIPPED ***
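The index macros added in the ena.h hunk above can be tried outside the kernel. What follows is a minimal userland sketch, not part of the commit: the four macros are copied from the diff, while `ena_check_idx_roundtrip()` is a hypothetical helper written only for illustration. It assumes the interleaved layout the macros imply, with TX queues at even indices and RX queues at odd ones.

```c
#include <assert.h>

/* Copied from the ena.h hunk in this commit. */
#define ENA_IO_TXQ_IDX(q)			(2 * (q))
#define ENA_IO_RXQ_IDX(q)			(2 * (q) + 1)
#define ENA_IO_TXQ_IDX_TO_COMBINED_IDX(q)	((q) / 2)
#define ENA_IO_RXQ_IDX_TO_COMBINED_IDX(q)	(((q) - 1) / 2)

/*
 * Hypothetical helper (not in the driver): assert that mapping a
 * combined queue number to its TX or RX index and back returns the
 * same number. Returns how many queues were checked.
 */
static unsigned int
ena_check_idx_roundtrip(unsigned int nqueues)
{
	unsigned int q;

	for (q = 0; q < nqueues; q++) {
		/* TX indices are even: 0, 2, 4, ... */
		assert(ENA_IO_TXQ_IDX_TO_COMBINED_IDX(ENA_IO_TXQ_IDX(q)) == q);
		/* RX indices are odd: 1, 3, 5, ... */
		assert(ENA_IO_RXQ_IDX_TO_COMBINED_IDX(ENA_IO_RXQ_IDX(q)) == q);
	}

	return (q);
}
```

Calling `ena_check_idx_roundtrip(4)` from a small test program walks queues 0 through 3 and returns 4 if no assertion fires. Each reverse macro is only meaningful for an index of its own kind: passing the TX index 6 to `ENA_IO_RXQ_IDX_TO_COMBINED_IDX` yields 2 rather than 3.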