Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

hrtimer: Force clock_was_set() handling for the HIGHRES=n, NOHZ=y case

When CONFIG_HIGH_RES_TIMERS is disabled but NOHZ is enabled,
clock_was_set() does nothing apart from the timerfd notification. With
HIGHRES=n the kernel relies on the periodic tick to update the clock
offsets. When NOHZ is enabled and active, however, CPUs in a deep idle
sleep have no periodic tick, so the expiry of timers affected by
clock_was_set() can be delayed arbitrarily, until those CPUs are brought
out of idle again.

Make the clock_was_set() logic unconditionally available so that idle CPUs
are kicked out of idle to handle the update.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210713135158.288697903@linutronix.de
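
The gap described above can be illustrated with a small userspace model. This is only a sketch, not kernel code: the function names `remote_update_before` and `remote_update_after` and their boolean parameters are hypothetical stand-ins for the build option, the runtime high resolution state, and `tick_nohz_active`.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model: does a remote CPU refresh its clock offsets when
 * clock_was_set() runs? Neither function exists in the kernel.
 */

/* Before the change: the update ran only with HIGHRES=y and hres active. */
bool remote_update_before(bool highres_built, bool hres_active)
{
	/* High resolution mode cannot be active unless it was built in. */
	assert(highres_built || !hres_active);
	return highres_built && hres_active;
}

/* After the change: either active hres mode or active NOHZ is enough. */
bool remote_update_after(bool hres_active, bool nohz_active)
{
	return hres_active || nohz_active;
}
```

For the HIGHRES=n, NOHZ=y case, `remote_update_before(false, false)` is false while `remote_update_after(false, true)` is true, which is the case the commit subject names.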

+59 -28
kernel/time/hrtimer.c
···
 	return hrtimer_hres_enabled;
 }

-/*
- * Retrigger next event is called after clock was set
- *
- * Called with interrupts disabled via on_each_cpu()
- */
-static void retrigger_next_event(void *arg)
-{
-	struct hrtimer_cpu_base *base = this_cpu_ptr(&hrtimer_bases);
-
-	if (!__hrtimer_hres_active(base))
-		return;
-
-	raw_spin_lock(&base->lock);
-	hrtimer_update_base(base);
-	hrtimer_force_reprogram(base, 0);
-	raw_spin_unlock(&base->lock);
-}
+static void retrigger_next_event(void *arg);

 /*
  * Switch to high resolution mode
···

 static inline int hrtimer_is_hres_enabled(void) { return 0; }
 static inline void hrtimer_switch_to_hres(void) { }
-static inline void retrigger_next_event(void *arg) { }

 #endif /* CONFIG_HIGH_RES_TIMERS */
+/*
+ * Retrigger next event is called after clock was set with interrupts
+ * disabled through an SMP function call or directly from low level
+ * resume code.
+ *
+ * This is only invoked when:
+ *	- CONFIG_HIGH_RES_TIMERS is enabled.
+ *	- CONFIG_NOHZ_COMMON is enabled
+ *
+ * For the other cases this function is empty and because the call sites
+ * are optimized out it vanishes as well, i.e. no need for lots of
+ * #ifdeffery.
+ */
+static void retrigger_next_event(void *arg)
+{
+	struct hrtimer_cpu_base *base = this_cpu_ptr(&hrtimer_bases);
+
+	/*
+	 * When high resolution mode or nohz is active, then the offsets of
+	 * CLOCK_REALTIME/TAI/BOOTTIME have to be updated. Otherwise the
+	 * next tick will take care of that.
+	 *
+	 * If high resolution mode is active then the next expiring timer
+	 * must be reevaluated and the clock event device reprogrammed if
+	 * necessary.
+	 *
+	 * In the NOHZ case the update of the offset and the reevaluation
+	 * of the next expiring timer is enough. The return from the SMP
+	 * function call will take care of the reprogramming in case the
+	 * CPU was in a NOHZ idle sleep.
+	 */
+	if (!__hrtimer_hres_active(base) && !tick_nohz_active)
+		return;
+
+	raw_spin_lock(&base->lock);
+	hrtimer_update_base(base);
+	if (__hrtimer_hres_active(base))
+		hrtimer_force_reprogram(base, 0);
+	else
+		hrtimer_update_next_event(base);
+	raw_spin_unlock(&base->lock);
+}

 /*
  * When a timer is enqueued and expires earlier than the already enqueued
···
 }

 /*
- * Clock realtime was set
+ * Clock was set. This might affect CLOCK_REALTIME, CLOCK_TAI and
+ * CLOCK_BOOTTIME (for late sleep time injection).
  *
- * Change the offset of the realtime clock vs. the monotonic
- * clock.
- *
- * We might have to reprogram the high resolution timer interrupt. On
- * SMP we call the architecture specific code to retrigger _all_ high
- * resolution timer interrupts. On UP we just disable interrupts and
- * call the high resolution interrupt code.
+ * This requires to update the offsets for these clocks
+ * vs. CLOCK_MONOTONIC. When high resolution timers are enabled, then this
+ * also requires to eventually reprogram the per CPU clock event devices
+ * when the change moves an affected timer ahead of the first expiring
+ * timer on that CPU. Obviously remote per CPU clock event devices cannot
+ * be reprogrammed. The other reason why an IPI has to be sent is when the
+ * system is in !HIGH_RES and NOHZ mode. The NOHZ mode updates the offsets
+ * in the tick, which obviously might be stopped, so this has to bring out
+ * the remote CPU which might sleep in idle to get this sorted.
  */
 void clock_was_set(void)
 {
-#ifdef CONFIG_HIGH_RES_TIMERS
+	if (!hrtimer_hres_active() && !tick_nohz_active)
+		goto out_timerfd;
+
 	/* Retrigger the CPU local events everywhere */
 	on_each_cpu(retrigger_next_event, NULL, 1);
-#endif
+
+out_timerfd:
 	timerfd_clock_was_set();
 }

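
The three-way decision in the new retrigger_next_event() can be sketched as a standalone userspace function. This is a hedged illustration only: the enum, its constants, and `pick_retrigger_action` are hypothetical names, and the two booleans stand in for `__hrtimer_hres_active(base)` and `tick_nohz_active`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the paths through retrigger_next_event(). */
enum retrigger_action {
	RETRIGGER_NOTHING,           /* early return, the next tick handles it */
	RETRIGGER_FORCE_REPROGRAM,   /* update offsets, reprogram clock event device */
	RETRIGGER_UPDATE_NEXT_EVENT, /* update offsets, reevaluate next expiring timer */
};

/* Mirrors the order of checks in the patched function (illustrative only). */
enum retrigger_action pick_retrigger_action(bool hres_active, bool nohz_active)
{
	if (!hres_active && !nohz_active)
		return RETRIGGER_NOTHING;

	/* In both remaining cases the base offsets are updated first. */
	if (hres_active)
		return RETRIGGER_FORCE_REPROGRAM;

	return RETRIGGER_UPDATE_NEXT_EVENT;
}
```

Note that high resolution mode takes precedence: when both flags are true the model picks the reprogram path, matching the `if`/`else` order in the diff.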