Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

xprtrdma: Scale receive batch size with credit window

The fixed RPCRDMA_MAX_RECV_BATCH of 7 results in frequent
small ib_post_recv batches during high-rate workloads. With
a 128-slot credit window, receives are reposted every 7th
completion, each batch incurring atomic serialization and a
doorbell write.

Replace the fixed batch constant with a per-endpoint value
scaled to 25% of the negotiated credit window. For a typical
128-credit connection this raises the batch from 7 to 32,
reducing doorbell frequency by roughly 4x and amortizing the
per-batch atomic and MMIO costs over a larger group of
receive WRs.
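The arithmetic above can be checked with a small standalone model. This is only a sketch, not kernel code: `recv_batch`, `recv_queue_depth`, and `count_reposts` are hypothetical helpers, the repost loop is a simplified reading of the `re_receive_count > needed` check in verbs.c, and `backward_wrs` is a parameter because the value of RPCRDMA_BACKWARD_WRS does not appear in this diff.

```c
#include <assert.h>

/* Hypothetical helper: mirrors `ep->re_recv_batch = ep->re_max_requests >> 2`,
 * i.e. 25% of the credit window, rounded down. */
static unsigned int recv_batch(unsigned int max_requests)
{
	return max_requests >> 2;
}

/* Hypothetical helper: mirrors the max_recv_wr sizing in the frwr_ops.c hunk.
 * backward_wrs stands in for RPCRDMA_BACKWARD_WRS (value assumed by caller). */
static unsigned int recv_queue_depth(unsigned int max_requests,
				     unsigned int backward_wrs)
{
	unsigned int depth = max_requests;

	depth += backward_wrs;
	depth += recv_batch(max_requests);
	depth += 1;	/* for ib_drain_rq */
	return depth;
}

/* Simplified model (an assumption, not the kernel's code path): each
 * completion consumes one posted receive; when the posted count is no
 * longer greater than the credit window, one batch post tops it up to
 * credits + batch. Returns the number of batch posts (doorbells). */
static unsigned int count_reposts(unsigned int credits, unsigned int batch,
				  unsigned int completions)
{
	unsigned int posted = 0, reposts = 0, i;

	assert(batch > 0);
	for (i = 0; i < completions; i++) {
		if (posted <= credits) {
			posted = credits + batch;
			reposts++;
		}
		posted--;
	}
	return reposts;
}
```

Under this model, 896 completions on a 128-credit connection need 128 batch posts with a batch of 7 and 28 with a batch of 32, which is in line with the "roughly 4x" figure above.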

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

Authored by Chuck Lever and committed by Trond Myklebust
93b4791a 7a079ab5

+4 -2
net/sunrpc/xprtrdma/frwr_ops.c (+2 -1)
@@ -244,9 +244,10 @@
 	}
 	ep->re_attr.cap.max_send_wr += RPCRDMA_BACKWARD_WRS;
 	ep->re_attr.cap.max_send_wr += 1; /* for ib_drain_sq */
+	ep->re_recv_batch = ep->re_max_requests >> 2;
 	ep->re_attr.cap.max_recv_wr = ep->re_max_requests;
 	ep->re_attr.cap.max_recv_wr += RPCRDMA_BACKWARD_WRS;
-	ep->re_attr.cap.max_recv_wr += RPCRDMA_MAX_RECV_BATCH;
+	ep->re_attr.cap.max_recv_wr += ep->re_recv_batch;
 	ep->re_attr.cap.max_recv_wr += 1; /* for ib_drain_rq */
 
 	ep->re_max_rdma_segs =
net/sunrpc/xprtrdma/verbs.c (+1 -1)
@@ -1374,7 +1374,7 @@
 	if (likely(ep->re_receive_count > needed))
 		goto out;
 	needed -= ep->re_receive_count;
-	needed += RPCRDMA_MAX_RECV_BATCH;
+	needed += ep->re_recv_batch;
 
 	if (atomic_inc_return(&ep->re_receiving) > 1)
 		goto out_dec;
net/sunrpc/xprtrdma/xprt_rdma.h (+1)
@@ -96,6 +96,7 @@
 	struct rpcrdma_notification re_rn;
 	int re_receive_count;
 	unsigned int re_max_requests; /* depends on device */
+	unsigned int re_recv_batch;
 	unsigned int re_inline_send; /* negotiated */
 	unsigned int re_inline_recv; /* negotiated */
 