Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Revert "vhost/net: Defer TX queue re-enable until after sendmsg"

This reverts commit 8c2e6b26ffe243be1e78f5a4bfb1a857d6e6f6d6. That commit
tries to defer re-enabling notifications by moving the logic out of the
loop, to after the vhost_tx_batch() call, for the case where nothing new
is spotted. This has side effects, because the moved logic is also reached
from several other error conditions that exit the loop.

One example is the IOTLB: when there's an IOTLB miss, get_tx_bufs()
might return -EAGAIN and exit the loop. The code after the loop then sees
that there are still available buffers, so it queues the tx work again,
and keeps doing so until userspace feeds the IOTLB entry correctly. This
slows down tx processing and triggers the TX watchdog in the guest, as
reported in https://lkml.org/lkml/2025/9/10/1596.
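
The requeue behaviour described above can be illustrated with a small
userspace toy model. This is only a sketch under stated assumptions, not
the vhost code: struct toy_vq, toy_get_tx_bufs(), toy_pass_deferred(),
toy_pass_inloop() and toy_passes_during_miss() are all hypothetical names,
"requeue if buffers are still available" is a simplification of what the
post-loop check does, and the model is single-threaded. It counts how many
handler passes run while an IOTLB miss stays unresolved.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical toy state; not the kernel's vhost structures. */
struct toy_vq {
	int avail;        /* buffers the guest has made available */
	bool iotlb_ready; /* false models an unresolved IOTLB miss */
};

/* Returns 1 if a buffer was consumed, 0 if nothing is new,
 * -EAGAIN if buffers exist but cannot be translated yet.
 */
static int toy_get_tx_bufs(struct toy_vq *vq)
{
	if (vq->avail == 0)
		return 0;
	if (!vq->iotlb_ready)
		return -EAGAIN;
	vq->avail--;
	return 1;
}

/* Models the reverted layout: every loop exit shares one post-loop
 * check. Returns true if the handler requeues itself.
 */
static bool toy_pass_deferred(struct toy_vq *vq)
{
	for (;;) {
		int r = toy_get_tx_bufs(vq);

		if (r <= 0) /* error or nothing new: both leave the loop */
			break;
	}
	/* Shared exit path: buffers still pending means requeue. */
	return vq->avail > 0;
}

/* Models the restored layout: the re-check lives only in the
 * "nothing new" branch, so an error exit never requeues.
 */
static bool toy_pass_inloop(struct toy_vq *vq)
{
	for (;;) {
		int r = toy_get_tx_bufs(vq);

		if (r < 0)
			return false; /* error: leave and wait to be woken */
		if (r == 0) {
			/* In this single-threaded toy nothing can arrive
			 * between the two checks, so this never continues.
			 */
			if (vq->avail > 0)
				continue;
			return false;
		}
	}
}

/* Runs passes for as long as the handler keeps requeueing itself,
 * capped at max_passes, while the IOTLB miss is never resolved.
 */
static int toy_passes_during_miss(bool (*pass)(struct toy_vq *),
				  int max_passes)
{
	struct toy_vq vq = { .avail = 4, .iotlb_ready = false };
	bool queued = true; /* the initial kick */
	int passes = 0;

	assert(max_passes >= 0);
	while (queued && passes < max_passes) {
		queued = pass(&vq);
		passes++;
	}
	return passes;
}
```

In this model, toy_passes_during_miss(toy_pass_deferred, n) runs all n
passes because each pass requeues itself without making progress, while
toy_passes_during_miss(toy_pass_inloop, n) stops after a single pass.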

To fix this, revert the change. A follow-up patch will bring the
performance back in a safe way.

Reported-by: Jon Kohler <jon@nutanix.com>
Cc: stable@vger.kernel.org
Fixes: 8c2e6b26ffe2 ("vhost/net: Defer TX queue re-enable until after sendmsg")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20250917063045.2042-2-jasowang@redhat.com>

+9 -21
drivers/vhost/net.c
@@ -765,11 +765,11 @@
 	int err;
 	int sent_pkts = 0;
 	bool sock_can_batch = (sock->sk->sk_sndbuf == INT_MAX);
-	bool busyloop_intr;
 	bool in_order = vhost_has_feature(vq, VIRTIO_F_IN_ORDER);
 
 	do {
-		busyloop_intr = false;
+		bool busyloop_intr = false;
+
 		if (nvq->done_idx == VHOST_NET_BATCH)
 			vhost_tx_batch(net, nvq, sock, &msg);
 
@@ -780,10 +780,13 @@
 			break;
 		/* Nothing new? Wait for eventfd to tell us they refilled. */
 		if (head == vq->num) {
-			/* Kicks are disabled at this point, break loop and
-			 * process any remaining batched packets. Queue will
-			 * be re-enabled afterwards.
-			 */
+			if (unlikely(busyloop_intr)) {
+				vhost_poll_queue(&vq->poll);
+			} else if (unlikely(vhost_enable_notify(&net->dev,
+								vq))) {
+				vhost_disable_notify(&net->dev, vq);
+				continue;
+			}
 			break;
 		}
 
@@ -842,22 +839,7 @@
 		++nvq->done_idx;
 	} while (likely(!vhost_exceeds_weight(vq, ++sent_pkts, total_len)));
 
-	/* Kicks are still disabled, dispatch any remaining batched msgs. */
 	vhost_tx_batch(net, nvq, sock, &msg);
-
-	if (unlikely(busyloop_intr))
-		/* If interrupted while doing busy polling, requeue the
-		 * handler to be fair handle_rx as well as other tasks
-		 * waiting on cpu.
-		 */
-		vhost_poll_queue(&vq->poll);
-	else
-		/* All of our work has been completed; however, before
-		 * leaving the TX handler, do one last check for work,
-		 * and requeue handler if necessary. If there is no work,
-		 * queue will be reenabled.
-		 */
-		vhost_net_busy_poll_try_queue(net, vq);
 }
 
 static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)