Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
clocksource/drivers/hyper-v: Rework clocksource and sched clock setup

Current code assigns either the Hyper-V TSC page or MSR-based ref counter
as the sched clock. This may be sub-optimal in two cases. First, if there
is hardware support to ensure consistent TSC frequency across live
migrations and Hyper-V is using that support, the raw TSC is a faster
source of time than the Hyper-V TSC page. Second, the MSR-based ref
counter is relatively slow because reads require a trap to the hypervisor.
As such, it should never be used as the sched clock. The native sched
clock based on the raw TSC or jiffies is much better.

Rework the sched clock setup so it is set to the TSC page only if
Hyper-V indicates that the TSC may have inconsistent frequency across
live migrations. Also, remove the code that sets the sched clock to
the MSR-based ref counter. In the cases where it is not set, the sched
clock will then be the native sched clock.
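
The selection rule described above can be sketched as a small user-space model. The flag bits, the enum, and the function name below are hypothetical stand-ins chosen for illustration; the kernel's real `HV_ACCESS_TSC_INVARIANT` and `HV_MSR_REFERENCE_TSC_AVAILABLE` definitions live in the Hyper-V headers and are not reproduced here. This is a sketch of the decision only, not the driver code:

```c
#include <stdint.h>

/* Hypothetical flag bits for illustration; NOT the kernel's real values. */
#define DEMO_REFERENCE_TSC_AVAILABLE (1u << 0)
#define DEMO_TSC_INVARIANT           (1u << 1)

enum demo_sched_clock {
	DEMO_SCHED_CLOCK_NATIVE,   /* native sched clock (raw TSC or jiffies) */
	DEMO_SCHED_CLOCK_TSC_PAGE, /* Hyper-V TSC page */
};

/*
 * Model of the reworked policy: the TSC page becomes the sched clock only
 * when it exists and the TSC is not reported as invariant. The MSR-based
 * ref counter is never a candidate.
 */
static enum demo_sched_clock demo_pick_sched_clock(uint32_t features)
{
	/* No TSC page: nothing to switch to, keep the native sched clock. */
	if (!(features & DEMO_REFERENCE_TSC_AVAILABLE))
		return DEMO_SCHED_CLOCK_NATIVE;

	/* Invariant TSC: the raw TSC is faster than reading the TSC page. */
	if (features & DEMO_TSC_INVARIANT)
		return DEMO_SCHED_CLOCK_NATIVE;

	/* TSC frequency may change across live migration: use the TSC page. */
	return DEMO_SCHED_CLOCK_TSC_PAGE;
}
```

Only one of the four flag combinations (TSC page available, TSC not invariant) selects the TSC page; every other combination falls back to the native sched clock.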

As part of the rework, always enable both the TSC page clocksource and
the MSR-based ref counter clocksource. Set the ratings so the TSC page
clocksource is preferred. While the MSR-based ref counter clocksource
is unlikely to ever be the default, having it available for manual
selection is convenient for development purposes.
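
To illustrate how the ratings express that preference, here is a minimal sketch that picks the highest-rated available clocksource. The struct and function names are hypothetical, and the model ignores everything the real clocksource core does beyond comparing ratings. The MSR ratings (495, or 245 with an invariant TSC) and the lowered TSC page rating (250) come from the patch; the default TSC page rating of 500 is not visible in the diff and is an assumption here:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a registered clocksource. */
struct demo_clocksource {
	const char *name;
	int rating;     /* higher is preferred */
	int available;  /* nonzero if the clocksource was registered */
};

/* Return the highest-rated available entry, or NULL if none is available. */
static const struct demo_clocksource *
demo_pick_clocksource(const struct demo_clocksource *cs, size_t n)
{
	const struct demo_clocksource *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!cs[i].available)
			continue;
		if (best == NULL || cs[i].rating > best->rating)
			best = &cs[i];
	}
	return best;
}

/*
 * Ratings after the patch. The MSR clocksource always sits 5 below the
 * TSC page, so the TSC page wins whenever both are registered.
 * The default TSC page rating of 500 is assumed, not shown in the diff.
 */
static void demo_hyperv_ratings(int tsc_invariant, int *tsc_page, int *msr)
{
	*tsc_page = tsc_invariant ? 250 : 500;
	*msr      = tsc_invariant ? 245 : 495;
}
```

In this model, any other clocksource rated between 250 and 495 would lose to both Hyper-V clocksources in the default case and beat both when the TSC is invariant, which matches the intent stated in the driver comment that the generic TSC is used in that case.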

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/1687201360-16003-1-git-send-email-mikelley@microsoft.com

Authored by Michael Kelley and committed by Daniel Lezcano
e5313f1c 038d454a

+23 -31
drivers/clocksource/hyperv_timer.c
···
 	return read_hv_clock_msr();
 }

-static u64 notrace read_hv_sched_clock_msr(void)
-{
-	return (read_hv_clock_msr() - hv_sched_clock_offset) *
-		(NSEC_PER_SEC / HV_CLOCK_HZ);
-}
-
 static struct clocksource hyperv_cs_msr = {
 	.name	= "hyperv_clocksource_msr",
-	.rating	= 500,
+	.rating	= 495,
 	.read	= read_hv_clock_msr_cs,
 	.mask	= CLOCKSOURCE_MASK(64),
 	.flags	= CLOCK_SOURCE_IS_CONTINUOUS,
···
 static __always_inline void hv_setup_sched_clock(void *sched_clock) {}
 #endif /* CONFIG_GENERIC_SCHED_CLOCK */

-static bool __init hv_init_tsc_clocksource(void)
+static void __init hv_init_tsc_clocksource(void)
 {
 	union hv_reference_tsc_msr tsc_msr;

···
 	 * Hyper-V Reference TSC rating, causing the generic TSC to be used.
 	 * TSC_INVARIANT is not offered on ARM64, so the Hyper-V Reference
 	 * TSC will be preferred over the virtualized ARM64 arch counter.
-	 * While the Hyper-V MSR clocksource won't be used since the
-	 * Reference TSC clocksource is present, change its rating as
-	 * well for consistency.
 	 */
 	if (ms_hyperv.features & HV_ACCESS_TSC_INVARIANT) {
 		hyperv_cs_tsc.rating = 250;
-		hyperv_cs_msr.rating = 250;
+		hyperv_cs_msr.rating = 245;
 	}

 	if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
-		return false;
+		return;

 	hv_read_reference_counter = read_hv_clock_tsc;

···

 	clocksource_register_hz(&hyperv_cs_tsc, NSEC_PER_SEC/100);

-	hv_sched_clock_offset = hv_read_reference_counter();
-	hv_setup_sched_clock(read_hv_sched_clock_tsc);
-
-	return true;
+	/*
+	 * If TSC is invariant, then let it stay as the sched clock since it
+	 * will be faster than reading the TSC page. But if not invariant, use
+	 * the TSC page so that live migrations across hosts with different
+	 * frequencies is handled correctly.
+	 */
+	if (!(ms_hyperv.features & HV_ACCESS_TSC_INVARIANT)) {
+		hv_sched_clock_offset = hv_read_reference_counter();
+		hv_setup_sched_clock(read_hv_sched_clock_tsc);
+	}
 }

 void __init hv_init_clocksource(void)
 {
 	/*
-	 * Try to set up the TSC page clocksource. If it succeeds, we're
-	 * done. Otherwise, set up the MSR clocksource. At least one of
-	 * these will always be available except on very old versions of
-	 * Hyper-V on x86. In that case we won't have a Hyper-V
+	 * Try to set up the TSC page clocksource, then the MSR clocksource.
+	 * At least one of these will always be available except on very old
+	 * versions of Hyper-V on x86. In that case we won't have a Hyper-V
 	 * clocksource, but Linux will still run with a clocksource based
 	 * on the emulated PIT or LAPIC timer.
+	 *
+	 * Never use the MSR clocksource as sched clock. It's too slow.
+	 * Better to use the native sched clock as the fallback.
 	 */
-	if (hv_init_tsc_clocksource())
-		return;
+	hv_init_tsc_clocksource();

-	if (!(ms_hyperv.features & HV_MSR_TIME_REF_COUNT_AVAILABLE))
-		return;
-
-	hv_read_reference_counter = read_hv_clock_msr;
-	clocksource_register_hz(&hyperv_cs_msr, NSEC_PER_SEC/100);
-
-	hv_sched_clock_offset = hv_read_reference_counter();
-	hv_setup_sched_clock(read_hv_sched_clock_msr);
+	if (ms_hyperv.features & HV_MSR_TIME_REF_COUNT_AVAILABLE)
+		clocksource_register_hz(&hyperv_cs_msr, NSEC_PER_SEC/100);
 }

 void __init hv_remap_tsc_clocksource(void)
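
As a worked example of the arithmetic in the removed `read_hv_sched_clock_msr()`, the sketch below converts a reference counter delta to nanoseconds. It assumes `HV_CLOCK_HZ` equals `NSEC_PER_SEC / 100` (10 MHz, so one tick is 100 ns), which is consistent with the `clocksource_register_hz(..., NSEC_PER_SEC/100)` calls in the diff but is not shown there directly. The `DEMO_` macros and the function name are hypothetical:

```c
#include <stdint.h>

#define DEMO_NSEC_PER_SEC 1000000000ULL
/* Assumed: the reference counter ticks at 10 MHz, i.e. once per 100 ns. */
#define DEMO_HV_CLOCK_HZ  (DEMO_NSEC_PER_SEC / 100)

/*
 * Same shape as the removed helper: subtract the offset captured at
 * setup time, then scale ticks to nanoseconds (a factor of 100 here).
 */
static uint64_t demo_ref_counter_to_ns(uint64_t counter, uint64_t offset)
{
	return (counter - offset) * (DEMO_NSEC_PER_SEC / DEMO_HV_CLOCK_HZ);
}
```

Under that assumption, a delta of 10,000,000 ticks corresponds to exactly one second. The conversion itself is cheap; per the commit message, the cost the patch removes is the trap to the hypervisor on every MSR read.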