Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 AVX512 status update from Ingo Molnar:
"This adds a new ABI that the main scheduler probably doesn't want to
deal with but HPC job schedulers might want to use: the
AVX512_elapsed_ms field in the new /proc/<pid>/arch_status task status
file, which allows the user-space job scheduler to cluster such tasks,
to avoid turbo frequency drops"

* 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Documentation/filesystems/proc.txt: Add arch_status file
  x86/process: Add AVX-512 usage elapsed time to /proc/pid/arch_status
  proc: Add /proc/<pid>/arch_status

+107
+40
Documentation/filesystems/proc.txt
···
   3.9   /proc/<pid>/map_files - Information about memory mapped files
   3.10  /proc/<pid>/timerslack_ns - Task timerslack value
   3.11  /proc/<pid>/patch_state - Livepatch patch operation state
+  3.12  /proc/<pid>/arch_status - Task architecture specific information

   4     Configuring procfs
   4.1   Mount options
···
 patched. If the patch is being disabled, then the task hasn't been
 unpatched yet.

+3.12 /proc/<pid>/arch_status - task architecture specific status
+-------------------------------------------------------------------
+When CONFIG_PROC_PID_ARCH_STATUS is enabled, this file displays the
+architecture specific status of the task.
+
+Example
+-------
+ $ cat /proc/6753/arch_status
+ AVX512_elapsed_ms:      8
+
+Description
+-----------
+
+x86 specific entries:
+---------------------
+ AVX512_elapsed_ms:
+ ------------------
+  If AVX512 is supported on the machine, this entry shows the milliseconds
+  elapsed since the last time AVX512 usage was recorded. The recording
+  happens on a best effort basis when a task is scheduled out. This means
+  that the value depends on two factors:
+
+    1) The time which the task spent on the CPU without being scheduled
+       out. With CPU isolation and a single runnable task this can take
+       several seconds.
+
+    2) The time since the task was scheduled out last. Depending on the
+       reason for being scheduled out (time slice exhausted, syscall ...)
+       this can be arbitrary long time.
+
+  As a consequence the value cannot be considered precise and authoritative
+  information. The application which uses this information has to be aware
+  of the overall scenario on the system in order to determine whether a
+  task is a real AVX512 user or not. Precise information can be obtained
+  with performance counters.
+
+  A special value of '-1' indicates that no AVX512 usage was recorded, thus
+  the task is unlikely an AVX512 user, but depends on the workload and the
+  scheduling scenario, it also could be a false negative mentioned above.

 ------------------------------------------------------------------------------
 Configuring procfs
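For illustration only, a user-space consumer of the file documented in the hunk above might look like the following sketch. Both function names are hypothetical and not part of this merge or of any library; the sketch assumes the single-line `AVX512_elapsed_ms:` format shown in the example and treats a missing entry (for instance on a CPU without AVX512, or a kernel without CONFIG_PROC_PID_ARCH_STATUS) as an error.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the kernel tree): scan the text of an
 * arch_status file for the AVX512_elapsed_ms entry.
 * Returns 0 and stores the value on success, -1 if the entry is absent
 * or malformed.
 */
int parse_avx512_elapsed_ms(const char *text, long long *ms)
{
	static const char key[] = "AVX512_elapsed_ms:";
	const char *p;

	assert(text != NULL && ms != NULL);
	p = strstr(text, key);
	if (!p)
		return -1;
	/* %lld skips the whitespace after the colon and accepts "-1". */
	return sscanf(p + sizeof(key) - 1, "%lld", ms) == 1 ? 0 : -1;
}

/* Hypothetical helper: read /proc/<pid>/arch_status and parse it. */
int read_avx512_elapsed_ms(int pid, long long *ms)
{
	char path[64], buf[256];
	size_t n;
	FILE *f;

	snprintf(path, sizeof(path), "/proc/%d/arch_status", pid);
	f = fopen(path, "r");
	if (!f)
		return -1;	/* no such task, or the file is not provided */
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return parse_avx512_elapsed_ms(buf, ms);
}
```

A caller would treat a stored value of -1 as "no AVX512 usage recorded" and, as the documentation text warns, would treat any other value as a best-effort hint rather than a precise measurement.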
+1
arch/x86/Kconfig
···
 	select USER_STACKTRACE_SUPPORT
 	select VIRT_TO_BUS
 	select X86_FEATURE_NAMES		if PROC_FS
+	select PROC_PID_ARCH_STATUS		if PROC_FS

 config INSTRUCTION_DECODER
 	def_bool y
+47
arch/x86/kernel/fpu/xstate.c
···
 #include <linux/cpu.h>
 #include <linux/mman.h>
 #include <linux/pkeys.h>
+#include <linux/seq_file.h>
+#include <linux/proc_fs.h>

 #include <asm/fpu/api.h>
 #include <asm/fpu/internal.h>
···

 	return 0;
 }
+
+#ifdef CONFIG_PROC_PID_ARCH_STATUS
+/*
+ * Report the amount of time elapsed in millisecond since last AVX512
+ * use in the task.
+ */
+static void avx512_status(struct seq_file *m, struct task_struct *task)
+{
+	unsigned long timestamp = READ_ONCE(task->thread.fpu.avx512_timestamp);
+	long delta;
+
+	if (!timestamp) {
+		/*
+		 * Report -1 if no AVX512 usage
+		 */
+		delta = -1;
+	} else {
+		delta = (long)(jiffies - timestamp);
+		/*
+		 * Cap to LONG_MAX if time difference > LONG_MAX
+		 */
+		if (delta < 0)
+			delta = LONG_MAX;
+		delta = jiffies_to_msecs(delta);
+	}
+
+	seq_put_decimal_ll(m, "AVX512_elapsed_ms:\t", delta);
+	seq_putc(m, '\n');
+}
+
+/*
+ * Report architecture specific information
+ */
+int proc_pid_arch_status(struct seq_file *m, struct pid_namespace *ns,
+			struct pid *pid, struct task_struct *task)
+{
+	/*
+	 * Report AVX512 state if the processor and build option supported.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_AVX512F))
+		avx512_status(m, task);
+
+	return 0;
+}
+#endif /* CONFIG_PROC_PID_ARCH_STATUS */
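The delta computation in avx512_status() above can be modelled in user space, which may help when reasoning about counter wraparound. This is a sketch under stated assumptions, not the kernel code: `avx512_elapsed_ms_model` is a hypothetical name, `now` and `timestamp` stand in for jiffies and the task's avx512_timestamp, `hz` stands in for the kernel's HZ and is assumed to divide 1000, and the final saturation step is a choice of this sketch rather than the behaviour of jiffies_to_msecs().

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical user-space model of the delta logic in avx512_status().
 * Returns -1 when no usage was recorded (timestamp == 0), otherwise the
 * elapsed time in milliseconds.
 */
long avx512_elapsed_ms_model(unsigned long now, unsigned long timestamp,
			     unsigned int hz)
{
	long ms_per_tick;
	long delta;

	assert(hz > 0 && 1000 % hz == 0);	/* simplifying assumption */
	if (!timestamp)
		return -1;	/* no AVX512 usage recorded */

	/* Unsigned subtraction stays correct when the counter wraps. */
	delta = (long)(now - timestamp);
	if (delta < 0)
		delta = LONG_MAX;	/* difference exceeds LONG_MAX */

	/* Saturate instead of overflowing; a choice made for this sketch. */
	ms_per_tick = 1000 / hz;
	if (delta > LONG_MAX / ms_per_tick)
		return LONG_MAX;
	return delta * ms_per_tick;
}
```

With `hz` set to 250, two ticks of difference yield 8 ms, whether or not the tick counter wrapped between the two samples.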
+4
fs/proc/Kconfig
···

 	  Say Y if you are running any user-space software which takes benefit from
 	  this interface. For example, rkt is such a piece of software.
+
+config PROC_PID_ARCH_STATUS
+	def_bool n
+	depends on PROC_FS
+6
fs/proc/base.c
···
 #ifdef CONFIG_STACKLEAK_METRICS
 	ONE("stack_depth", S_IRUGO, proc_stack_depth),
 #endif
+#ifdef CONFIG_PROC_PID_ARCH_STATUS
+	ONE("arch_status", S_IRUGO, proc_pid_arch_status),
+#endif
 };

 static int proc_tgid_base_readdir(struct file *file, struct dir_context *ctx)
···
 #endif
 #ifdef CONFIG_LIVEPATCH
 	ONE("patch_state", S_IRUSR, proc_pid_patch_state),
+#endif
+#ifdef CONFIG_PROC_PID_ARCH_STATUS
+	ONE("arch_status", S_IRUGO, proc_pid_arch_status),
 #endif
 };
+9
include/linux/proc_fs.h
···
 				    void *data);
 extern struct pid *tgid_pidfd_to_pid(const struct file *file);

+#ifdef CONFIG_PROC_PID_ARCH_STATUS
+/*
+ * The architecture which selects CONFIG_PROC_PID_ARCH_STATUS must
+ * provide proc_pid_arch_status() definition.
+ */
+int proc_pid_arch_status(struct seq_file *m, struct pid_namespace *ns,
+			struct pid *pid, struct task_struct *task);
+#endif /* CONFIG_PROC_PID_ARCH_STATUS */
+
 #else /* CONFIG_PROC_FS */

 static inline void proc_root_init(void)