Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

x86/msi: Make irq_retrigger() functional for posted MSI

Luigi reported that retriggering a posted MSI interrupt does not work
correctly.

The reason is that the retrigger happens at the vector domain by sending an
IPI to the actual vector on the target CPU. That works correctly exactly
once because the posted MSI interrupt chip does not issue an EOI as that's
only required for the posted MSI notification vector itself.

As a consequence the vector becomes stale in the ISR, which not only
affects this vector but also any lower priority vector in the affected
APIC because the ISR bit is not cleared.
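
The effect of a stale ISR bit can be illustrated with a small user-space toy model. This is only a sketch and not kernel code: the toy_* names are hypothetical, and the model assumes the simplified rule that a vector is delivered only when its priority class (vector >> 4) is higher than that of the highest in-service vector, ignoring TPR and IRR entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the local APIC in-service register (ISR). Hypothetical, not kernel code. */
struct toy_apic {
	uint64_t isr[4];	/* one bit per vector, 256 vectors in total */
};

/* Return the highest vector currently marked in service, or -1 if none. */
static int toy_highest_isr(const struct toy_apic *a)
{
	for (int v = 255; v >= 0; v--) {
		if (a->isr[v / 64] & (1ULL << (v % 64)))
			return v;
	}
	return -1;
}

/*
 * Try to deliver a vector. Assumed simplification: delivery succeeds only
 * if the priority class of the vector is higher than the class of the
 * highest in-service vector. On success the ISR bit is set.
 */
static bool toy_deliver(struct toy_apic *a, int vector)
{
	int in_service = toy_highest_isr(a);

	if (in_service >= 0 && (vector >> 4) <= (in_service >> 4))
		return false;	/* blocked by the in-service vector */

	a->isr[vector / 64] |= 1ULL << (vector % 64);
	return true;
}

/* EOI clears the highest in-service bit, like a non-specific APIC EOI. */
static void toy_eoi(struct toy_apic *a)
{
	int v = toy_highest_isr(a);

	if (v >= 0)
		a->isr[v / 64] &= ~(1ULL << (v % 64));
}
```

In this model, delivering vector 0x60 and returning without toy_eoi() leaves 0x60 in service, so a later toy_deliver() of 0x41 fails until some code path finally issues the missing EOI.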

Luigi proposed to set the vector in the remap PIR bitmap and raise the
posted MSI notification vector. That works, but that still does not cure a
related problem:

  If there is ever a stray interrupt on such a vector, then the related
  APIC ISR bit becomes stale due to the lack of EOI as described above.
  Unlikely to happen, but if it happens it's not debuggable at all.

So instead of playing games with the PIR, this can actually be solved
for both cases by:

  1) Keeping track of the posted interrupt vector handler state

  2) Implementing a posted MSI specific irq_ack() callback which checks that
     state. If the posted vector handler is inactive it issues an EOI,
     otherwise it delegates that to the posted handler.

This is correct versus affinity changes and concurrent events on the posted
vector as the actual handler invocation is serialized through the interrupt
descriptor lock.
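
The two steps can be sketched as a single-CPU user-space model. This is an illustration under stated assumptions, not the kernel implementation: the model_* names are hypothetical, a plain static bool stands in for the per-CPU posted_msi_handler_active flag, a counter stands in for apic_eoi(), and the irq_move_irq() call and all locking are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for per-CPU state and the APIC; not kernel APIs. */
static bool handler_active;	/* models posted_msi_handler_active */
static int eoi_count;		/* number of modeled apic_eoi() calls */

static void model_apic_eoi(void)
{
	eoi_count++;
}

/*
 * Models the irq_ack() callback: issue an EOI only if the device handler
 * was not invoked from the posted MSI notification handler.
 */
static void model_ack_posted_msi(void)
{
	if (!handler_active)
		model_apic_eoi();
}

/*
 * Models the posted MSI notification handler: every pending device
 * interrupt is acked without EOI, then one EOI covers the notification
 * vector itself.
 */
static void model_posted_notification(int pending)
{
	handler_active = true;
	for (int i = 0; i < pending; i++)
		model_ack_posted_msi();	/* flag is set, so no EOI here */
	model_apic_eoi();		/* single EOI for the notification vector */
	handler_active = false;
}

/*
 * Models a retrigger IPI or stray interrupt arriving directly on the
 * device vector, outside of the notification handler.
 */
static void model_direct_vector(void)
{
	model_ack_posted_msi();		/* flag is clear, so the EOI is issued */
}
```

In this model, model_posted_notification(3) produces exactly one EOI, while model_direct_vector() produces its own EOI, which is the behaviour the commit message describes for the retrigger and stray interrupt cases.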

Fixes: ed1e48ea4370 ("iommu/vt-d: Enable posted mode for device MSIs")
Reported-by: Luigi Rizzo <lrizzo@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Luigi Rizzo <lrizzo@google.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251125214631.044440658@linutronix.de
Closes: https://lore.kernel.org/lkml/20251124104836.3685533-1-lrizzo@google.com

+34 -4
+7
arch/x86/include/asm/irq_remapping.h
···
 }

 #endif /* CONFIG_IRQ_REMAP */
+
+#ifdef CONFIG_X86_POSTED_MSI
+void intel_ack_posted_msi_irq(struct irq_data *irqd);
+#else
+#define intel_ack_posted_msi_irq NULL
+#endif
+
 #endif /* __X86_IRQ_REMAPPING_H */
+23
arch/x86/kernel/irq.c
···

 /* Posted Interrupt Descriptors for coalesced MSIs to be posted */
 DEFINE_PER_CPU_ALIGNED(struct pi_desc, posted_msi_pi_desc);
+static DEFINE_PER_CPU_CACHE_HOT(bool, posted_msi_handler_active);

 void intel_posted_msi_init(void)
 {
···
 	apic_id = this_cpu_read(x86_cpu_to_apicid);
 	destination = x2apic_enabled() ? apic_id : apic_id << 8;
 	this_cpu_write(posted_msi_pi_desc.ndst, destination);
+}
+
+void intel_ack_posted_msi_irq(struct irq_data *irqd)
+{
+	irq_move_irq(irqd);
+
+	/*
+	 * Handle the rare case that irq_retrigger() raised the actual
+	 * assigned vector on the target CPU, which means that it was not
+	 * invoked via the posted MSI handler below. In that case APIC EOI
+	 * is required as otherwise the ISR entry becomes stale and lower
+	 * priority interrupts are never going to be delivered after that.
+	 *
+	 * If the posted handler invoked the device interrupt handler then
+	 * the EOI would be premature because it would acknowledge the
+	 * posted vector.
+	 */
+	if (unlikely(!__this_cpu_read(posted_msi_handler_active)))
+		apic_eoi();
 }

 static __always_inline bool handle_pending_pir(unsigned long *pir, struct pt_regs *regs)
···

 	pid = this_cpu_ptr(&posted_msi_pi_desc);

+	/* Mark the handler active for intel_ack_posted_msi_irq() */
+	__this_cpu_write(posted_msi_handler_active, true);
 	inc_irq_stat(posted_msi_notification_count);
 	irq_enter();

···

 	apic_eoi();
 	irq_exit();
+	__this_cpu_write(posted_msi_handler_active, false);
 	set_irq_regs(old_regs);
 }
 #endif /* X86_POSTED_MSI */
+4 -4
drivers/iommu/intel/irq_remapping.c
···
  *	irq_enter();
  *		handle_edge_irq()
  *			irq_chip_ack_parent()
- *				irq_move_irq(); // No EOI
+ *				intel_ack_posted_msi_irq(); // No EOI
  *			handle_irq_event()
  *				driver_handler()
  *		handle_edge_irq()
  *			irq_chip_ack_parent()
- *				irq_move_irq(); // No EOI
+ *				intel_ack_posted_msi_irq(); // No EOI
  *			handle_irq_event()
  *				driver_handler()
  *		handle_edge_irq()
  *			irq_chip_ack_parent()
- *				irq_move_irq(); // No EOI
+ *				intel_ack_posted_msi_irq(); // No EOI
  *			handle_irq_event()
  *				driver_handler()
  *	apic_eoi()
···
  */
 static struct irq_chip intel_ir_chip_post_msi = {
 	.name = "INTEL-IR-POST",
-	.irq_ack = irq_move_irq,
+	.irq_ack = intel_ack_posted_msi_irq,
 	.irq_set_affinity = intel_ir_set_affinity,
 	.irq_compose_msi_msg = intel_ir_compose_msi_msg,
 	.irq_set_vcpu_affinity = intel_ir_set_vcpu_affinity,