Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: declare VMA flags by bit

Patch series "initial work on making VMA flags a bitmap", v3.

We are in the rather silly situation that we are running out of VMA flags
as they are currently limited to a system word in size.

This leads to absurd situations where we limit features to 64-bit
architectures only because we simply do not have the ability to add a flag
for 32-bit ones.

This is very constraining and leads to hacks or, in the worst case, simply
an inability to implement features we want for entirely arbitrary reasons.

This also of course gives us something of a Y2K type situation in mm where
we might eventually exhaust all of the VMA flags even on 64-bit systems.

This series lays the groundwork for getting away from this limitation by
establishing VMA flags as a bitmap whose size we can increase in future
beyond 64 bits if required.
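To make the idea concrete, here is a minimal user-space sketch of a flag set stored as a bitmap and addressed by bit number. It is not the kernel's implementation: the `demo_` names and the 128-bit width are assumptions chosen purely for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical width; anything above 64 makes the point. */
#define DEMO_NUM_FLAG_BITS	128
#define DEMO_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_FLAG_LONGS \
	((DEMO_NUM_FLAG_BITS + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* Callers never touch the words directly, only the helpers below. */
typedef struct {
	unsigned long bits[DEMO_FLAG_LONGS];
} demo_flags_t;

static inline void demo_flag_set(demo_flags_t *flags, unsigned int bit)
{
	flags->bits[bit / DEMO_BITS_PER_LONG] |= 1UL << (bit % DEMO_BITS_PER_LONG);
}

static inline void demo_flag_clear(demo_flags_t *flags, unsigned int bit)
{
	flags->bits[bit / DEMO_BITS_PER_LONG] &= ~(1UL << (bit % DEMO_BITS_PER_LONG));
}

static inline bool demo_flag_test(const demo_flags_t *flags, unsigned int bit)
{
	return (flags->bits[bit / DEMO_BITS_PER_LONG] >>
		(bit % DEMO_BITS_PER_LONG)) & 1UL;
}
```

Because every access goes through a bit number, growing `DEMO_NUM_FLAG_BITS` does not change any caller, which is the property the series is working towards.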

This is necessarily a highly iterative process given the extensive use of
VMA flags throughout the kernel, so we start by performing basic steps.

Firstly, we declare VMA flags by bit number rather than by value,
retaining the VM_xxx fields but in terms of these newly introduced
VMA_xxx_BIT fields.

While we are here, we use sparse annotations to ensure that, when dealing
with VMA bit number parameters, we cannot be passed values which are not
declared as such - providing some useful type safety.

We then introduce an opaque VMA flag type, much like the opaque mm_struct
flag type introduced in commit bb6525f2f8c4 ("mm: add bitmap mm->flags
field"), which we establish in union with vma->vm_flags (but still set at
system word size meaning there is no functional or data type size change).
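The union arrangement can be pictured roughly as follows. This is a sketch under assumptions, not the kernel's definition: `demo_vm_flags_t`, `demo_vma_flags_t` and `struct demo_vma` are hypothetical names, and the bitmap is deliberately kept at one word.

```c
/* Legacy view: one system word of flag values. */
typedef unsigned long demo_vm_flags_t;

/* Opaque-style bitmap view, currently exactly one word long. */
typedef struct {
	unsigned long bits[1];
} demo_vma_flags_t;

struct demo_vma {
	union {
		demo_vm_flags_t vm_flags;	/* existing users keep working */
		demo_vma_flags_t flags;		/* new bitmap helpers use this */
	};
};

/* The overlay must not change the size of the containing structure. */
_Static_assert(sizeof(demo_vma_flags_t) == sizeof(demo_vm_flags_t),
	       "bitmap view must match the legacy word");
```

Both views alias the same storage, so code can be converted to the bitmap helpers piecemeal.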

We update the vm_flags_xxx() helpers to use this new bitmap, introducing
sensible helpers to do so.

This series lays the foundation for further work to expand the use of
bitmap VMA flags and eventually eliminate these arbitrary restrictions.


This patch (of 4):

In order to lay the groundwork for VMA flags being a bitmap rather than a
system word in size, we need to be able to consistently refer to VMA flags
by bit number rather than value.

Take this opportunity to do so in an enum, which is additionally
useful for tooling to extract metadata from.

This additionally makes it very clear which bits are being used for what
at a glance.

We use the VMA_ prefix for the bit values as it is logical to do so since
these reference VMAs. We consistently suffix with _BIT to make it clear
what the values refer to.

We declare bit values even when the flags that use them would not be
enabled by config options as this is simply clearer and clearly defines
what bit numbers are used for what, at no additional cost.

We declare a sparse-bitwise type vma_flag_t which ensures that users can't
pass around invalid VMA flags by accident and prepares for future work
towards VMA flags being a bitmap where we want to ensure bit values are
type safe.
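As a rough illustration of how a sparse-checked bit type behaves, the sketch below mimics the kernel's `__bitwise` and `__force` annotations, which expand to sparse attributes only when `__CHECKER__` is defined and to nothing under a normal compiler. The `demo_` names are stand-ins invented for this example.

```c
#include <stdbool.h>

#ifdef __CHECKER__
#define demo_bitwise	__attribute__((bitwise))
#define demo_force	__attribute__((force))
#else
#define demo_bitwise
#define demo_force
#endif

/* A plain int at runtime; a distinct, non-mixable type under sparse. */
typedef int demo_bitwise demo_vma_flag_t;

#define DEMO_VMA_READ_BIT	((demo_force demo_vma_flag_t)0)
#define DEMO_VMA_WRITE_BIT	((demo_force demo_vma_flag_t)1)

/* Only the helper casts back to int, so callers cannot pass a raw number
 * without sparse flagging it. */
static inline bool demo_flag_is_set(unsigned long flags, demo_vma_flag_t bit)
{
	return (flags & (1UL << (demo_force int)bit)) != 0;
}
```

With sparse, a call such as `demo_flag_is_set(flags, 1)` would be reported, while a regular compiler accepts it silently; the checking costs nothing at runtime.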

To make life easier, we declare some macro helpers - DECLARE_VMA_BIT()
allows us to avoid duplication in the enum bit number declarations (and
maintaining the sparse __bitwise attribute), and INIT_VM_FLAG() is used to
assist with declaration of flags.
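A cut-down, user-space rendering of the two helpers shows how the pieces fit together. It keeps the macro names from the patch but drops the sparse casts, and `DEMO_BIT()` is a stand-in for the kernel's `BIT()`; only three flags are declared.

```c
#define DEMO_BIT(nr)	(1UL << (nr))

#define DECLARE_VMA_BIT(name, bitnum) \
	VMA_ ## name ## _BIT = (bitnum)
#define DECLARE_VMA_BIT_ALIAS(name, aliased) \
	VMA_ ## name ## _BIT = VMA_ ## aliased ## _BIT
enum {
	DECLARE_VMA_BIT(READ, 0),
	DECLARE_VMA_BIT(WRITE, 1),
	DECLARE_VMA_BIT(GROWSDOWN, 8),
	/* An alias reuses an existing bit number under a second name. */
	DECLARE_VMA_BIT_ALIAS(STACK, GROWSDOWN),
};
#undef DECLARE_VMA_BIT
#undef DECLARE_VMA_BIT_ALIAS

/* Each flag value is derived from its bit number. */
#define INIT_VM_FLAG(name)	DEMO_BIT(VMA_ ## name ## _BIT)
#define VM_READ		INIT_VM_FLAG(READ)
#define VM_WRITE	INIT_VM_FLAG(WRITE)
#define VM_STACK	INIT_VM_FLAG(STACK)
```

Here `VM_READ` expands to `DEMO_BIT(VMA_READ_BIT)`, so the bit number lives in exactly one place.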

Unfortunately we can't declare both in the enum, as we run into issues with
logic in the kernel requiring that flags are preprocessor definitions, and
additionally we cannot have a macro which declares another macro, so we
must define each flag macro directly.
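The constraint is easy to reproduce outside the kernel. The sketch below uses hypothetical `DEMO_` names: an enum constant is invisible to `#ifdef`, whereas a flag macro can be probed and given a fallback, in the style of `#ifndef VM_GROWSUP`. The C preprocessor also never treats the result of a macro expansion as a directive, which is why one macro cannot emit the `#define` for another.

```c
enum { DEMO_VMA_SHADOW_STACK_BIT = 37 };

/* The enum constant is not a macro, so the preprocessor cannot see it. */
#ifdef DEMO_VMA_SHADOW_STACK_BIT
#define DEMO_ENUM_VISIBLE	1
#else
#define DEMO_ENUM_VISIBLE	0
#endif

/* The flag itself is a macro, written out by hand. */
#define DEMO_VM_SHADOW_STACK	(1ULL << DEMO_VMA_SHADOW_STACK_BIT)

/* Fallback pattern that only works because flags are macros. */
#ifndef DEMO_VM_GROWSUP
#define DEMO_VM_GROWSUP		0ULL
#endif
```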

Additionally, update the VMA userland testing vma_internal.h header to
include these changes.

We also have to fix the parameters to the vma_flag_*_atomic() functions
since VMA_MAYBE_GUARD_BIT is now of type vma_flag_t and sparse will
complain otherwise.

We have to update some rather silly if-deffery found in fs/proc/task_mmu.c
which would otherwise break.
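The likely reason for the breakage is that a flag now expands to an expression containing a cast and an enum constant, neither of which is valid inside `#if`. The sketch below, with hypothetical `DEMO_` names and an assumed value of 4 standing in for `CONFIG_ARCH_PKEY_BITS`, shows the replacement pattern of testing the configuration value instead of the flag.

```c
/* Assumed stand-in for CONFIG_ARCH_PKEY_BITS. */
#define DEMO_ARCH_PKEY_BITS	4

enum { DEMO_VMA_PKEY_BIT3_BIT = 35 };
#define DEMO_VM_PKEY_BIT3	(1ULL << (int)DEMO_VMA_PKEY_BIT3_BIT)

/*
 * "#if DEMO_VM_PKEY_BIT3" would not compile: inside #if the cast is not
 * understood and every non-macro identifier is replaced by 0.
 */
#if DEMO_ARCH_PKEY_BITS > 3
#define DEMO_HAS_PKEY_BIT3	1
#else
#define DEMO_HAS_PKEY_BIT3	0
#endif
```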

Finally, we update the rust binding helper as now it cannot auto-detect
the flags at all.
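The helper pattern can be sketched in plain C. The assumption here is that a bindings generator can read a typed constant object even when it cannot evaluate the macro behind it; the `DEMO_` names are hypothetical and mirror the `RUST_CONST_HELPER_` convention only loosely.

```c
typedef unsigned long demo_vm_flags_t;

enum {
	DEMO_VMA_READ_BIT	= 0,
	DEMO_VMA_MERGEABLE_BIT	= 31,
};

/* Flags are now macro expressions rather than plain literals. */
#define DEMO_INIT_VM_FLAG(name)	(1UL << (int)DEMO_VMA_ ## name ## _BIT)
#define DEMO_VM_READ		DEMO_INIT_VM_FLAG(READ)
#define DEMO_VM_MERGEABLE	DEMO_INIT_VM_FLAG(MERGEABLE)

/* The C compiler folds each macro into a typed constant object. */
const demo_vm_flags_t DEMO_CONST_HELPER_VM_READ = DEMO_VM_READ;
const demo_vm_flags_t DEMO_CONST_HELPER_VM_MERGEABLE = DEMO_VM_MERGEABLE;
```

One such constant is needed per exported flag, which matches the one-line-per-flag shape of the helper header.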

Link: https://lkml.kernel.org/r/cover.1764064556.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/3a35e5a0bcfa00e84af24cbafc0653e74deda64a.1764064556.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Acked-by: Alice Ryhl <aliceryhl@google.com> [rust]
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Chris Li <chrisl@kernel.org>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Gregory Price <gourry@gourry.net>
Cc: "Huang, Ying" <ying.huang@linux.alibaba.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mathew Brost <matthew.brost@intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Trevor Gross <tmgross@umich.edu>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Wei Xu <weixugc@google.com>
Cc: xu xin <xu.xin16@zte.com.cn>
Cc: Yuanchu Xie <yuanchu@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

authored by Lorenzo Stoakes and committed by Andrew Morton
2b6a3f06 8f4338b1

+534 -227
+2 -2
fs/proc/task_mmu.c
···
 	[ilog2(VM_PKEY_BIT0)] = "",
 	[ilog2(VM_PKEY_BIT1)] = "",
 	[ilog2(VM_PKEY_BIT2)] = "",
-#if VM_PKEY_BIT3
+#if CONFIG_ARCH_PKEY_BITS > 3
 	[ilog2(VM_PKEY_BIT3)] = "",
 #endif
-#if VM_PKEY_BIT4
+#if CONFIG_ARCH_PKEY_BITS > 4
 	[ilog2(VM_PKEY_BIT4)] = "",
 #endif
 #endif /* CONFIG_ARCH_HAS_PKEYS */
+220 -177
include/linux/mm.h
···
 extern unsigned int kobjsize(const void *objp);
 #endif

-#define VM_MAYBE_GUARD_BIT 11
-
 /*
  * vm_flags in vm_area_struct, see mm_types.h.
  * When changing, update also include/trace/events/mmflags.h
  */
+
 #define VM_NONE		0x00000000

-#define VM_READ		0x00000001	/* currently active flags */
-#define VM_WRITE	0x00000002
-#define VM_EXEC		0x00000004
-#define VM_SHARED	0x00000008
-
-/* mprotect() hardcodes VM_MAYREAD >> 4 == VM_READ, and so for r/w/x bits. */
-#define VM_MAYREAD	0x00000010	/* limits for mprotect() etc */
-#define VM_MAYWRITE	0x00000020
-#define VM_MAYEXEC	0x00000040
-#define VM_MAYSHARE	0x00000080
-
-#define VM_GROWSDOWN	0x00000100	/* general info on the segment */
-#ifdef CONFIG_MMU
-#define VM_UFFD_MISSING	0x00000200	/* missing pages tracking */
-#else /* CONFIG_MMU */
-#define VM_MAYOVERLAY	0x00000200	/* nommu: R/O MAP_PRIVATE mapping that might overlay a file mapping */
-#define VM_UFFD_MISSING	0
-#endif /* CONFIG_MMU */
-#define VM_PFNMAP	0x00000400	/* Page-ranges managed without "struct page", just pure PFN */
-#define VM_MAYBE_GUARD	BIT(VM_MAYBE_GUARD_BIT)	/* The VMA maybe contains guard regions. */
-#define VM_UFFD_WP	0x00001000	/* wrprotect pages tracking */
-
-#define VM_LOCKED	0x00002000
-#define VM_IO		0x00004000	/* Memory mapped I/O or similar */
-
-					/* Used by sys_madvise() */
-#define VM_SEQ_READ	0x00008000	/* App will access data sequentially */
-#define VM_RAND_READ	0x00010000	/* App will not benefit from clustered reads */
-
-#define VM_DONTCOPY	0x00020000	/* Do not copy this vma on fork */
-#define VM_DONTEXPAND	0x00040000	/* Cannot expand with mremap() */
-#define VM_LOCKONFAULT	0x00080000	/* Lock the pages covered when they are faulted in */
-#define VM_ACCOUNT	0x00100000	/* Is a VM accounted object */
-#define VM_NORESERVE	0x00200000	/* should the VM suppress accounting */
-#define VM_HUGETLB	0x00400000	/* Huge TLB Page VM */
-#define VM_SYNC		0x00800000	/* Synchronous page faults */
-#define VM_ARCH_1	0x01000000	/* Architecture-specific flag */
-#define VM_WIPEONFORK	0x02000000	/* Wipe VMA contents in child. */
-#define VM_DONTDUMP	0x04000000	/* Do not include in the core dump */
-
-#ifdef CONFIG_MEM_SOFT_DIRTY
-# define VM_SOFTDIRTY	0x08000000	/* Not soft dirty clean area */
-#else
-# define VM_SOFTDIRTY	0
-#endif
-
-#define VM_MIXEDMAP	0x10000000	/* Can contain "struct page" and pure PFN pages */
-#define VM_HUGEPAGE	0x20000000	/* MADV_HUGEPAGE marked this vma */
-#define VM_NOHUGEPAGE	0x40000000	/* MADV_NOHUGEPAGE marked this vma */
-#define VM_MERGEABLE	BIT(31)		/* KSM may merge identical pages */
-
-#ifdef CONFIG_ARCH_USES_HIGH_VMA_FLAGS
-#define VM_HIGH_ARCH_BIT_0	32	/* bit only usable on 64-bit architectures */
-#define VM_HIGH_ARCH_BIT_1	33	/* bit only usable on 64-bit architectures */
-#define VM_HIGH_ARCH_BIT_2	34	/* bit only usable on 64-bit architectures */
-#define VM_HIGH_ARCH_BIT_3	35	/* bit only usable on 64-bit architectures */
-#define VM_HIGH_ARCH_BIT_4	36	/* bit only usable on 64-bit architectures */
-#define VM_HIGH_ARCH_BIT_5	37	/* bit only usable on 64-bit architectures */
-#define VM_HIGH_ARCH_BIT_6	38	/* bit only usable on 64-bit architectures */
-#define VM_HIGH_ARCH_0	BIT(VM_HIGH_ARCH_BIT_0)
-#define VM_HIGH_ARCH_1	BIT(VM_HIGH_ARCH_BIT_1)
-#define VM_HIGH_ARCH_2	BIT(VM_HIGH_ARCH_BIT_2)
-#define VM_HIGH_ARCH_3	BIT(VM_HIGH_ARCH_BIT_3)
-#define VM_HIGH_ARCH_4	BIT(VM_HIGH_ARCH_BIT_4)
-#define VM_HIGH_ARCH_5	BIT(VM_HIGH_ARCH_BIT_5)
-#define VM_HIGH_ARCH_6	BIT(VM_HIGH_ARCH_BIT_6)
-#endif /* CONFIG_ARCH_USES_HIGH_VMA_FLAGS */
-
-#ifdef CONFIG_ARCH_HAS_PKEYS
-# define VM_PKEY_SHIFT	VM_HIGH_ARCH_BIT_0
-# define VM_PKEY_BIT0	VM_HIGH_ARCH_0
-# define VM_PKEY_BIT1	VM_HIGH_ARCH_1
-# define VM_PKEY_BIT2	VM_HIGH_ARCH_2
-#if CONFIG_ARCH_PKEY_BITS > 3
-# define VM_PKEY_BIT3	VM_HIGH_ARCH_3
-#else
-# define VM_PKEY_BIT3	0
-#endif
-#if CONFIG_ARCH_PKEY_BITS > 4
-# define VM_PKEY_BIT4	VM_HIGH_ARCH_4
-#else
-# define VM_PKEY_BIT4	0
-#endif
-#endif /* CONFIG_ARCH_HAS_PKEYS */
-
-#ifdef CONFIG_X86_USER_SHADOW_STACK
-/*
- * VM_SHADOW_STACK should not be set with VM_SHARED because of lack of
- * support core mm.
+/**
+ * typedef vma_flag_t - specifies an individual VMA flag by bit number.
  *
- * These VMAs will get a single end guard page. This helps userspace protect
- * itself from attacks. A single page is enough for current shadow stack archs
- * (x86). See the comments near alloc_shstk() in arch/x86/kernel/shstk.c
- * for more details on the guard size.
+ * This value is made type safe by sparse to avoid passing invalid flag values
+ * around.
  */
-# define VM_SHADOW_STACK	VM_HIGH_ARCH_5
-#endif
+typedef int __bitwise vma_flag_t;

-#if defined(CONFIG_ARM64_GCS)
-/*
- * arm64's Guarded Control Stack implements similar functionality and
- * has similar constraints to shadow stacks.
- */
-# define VM_SHADOW_STACK	VM_HIGH_ARCH_6
+#define DECLARE_VMA_BIT(name, bitnum) \
+	VMA_ ## name ## _BIT = ((__force vma_flag_t)bitnum)
+#define DECLARE_VMA_BIT_ALIAS(name, aliased) \
+	VMA_ ## name ## _BIT = (VMA_ ## aliased ## _BIT)
+enum {
+	DECLARE_VMA_BIT(READ, 0),
+	DECLARE_VMA_BIT(WRITE, 1),
+	DECLARE_VMA_BIT(EXEC, 2),
+	DECLARE_VMA_BIT(SHARED, 3),
+	/* mprotect() hardcodes VM_MAYREAD >> 4 == VM_READ, and so for r/w/x bits. */
+	DECLARE_VMA_BIT(MAYREAD, 4),	/* limits for mprotect() etc. */
+	DECLARE_VMA_BIT(MAYWRITE, 5),
+	DECLARE_VMA_BIT(MAYEXEC, 6),
+	DECLARE_VMA_BIT(MAYSHARE, 7),
+	DECLARE_VMA_BIT(GROWSDOWN, 8),	/* general info on the segment */
+#ifdef CONFIG_MMU
+	DECLARE_VMA_BIT(UFFD_MISSING, 9),/* missing pages tracking */
+#else
+	/* nommu: R/O MAP_PRIVATE mapping that might overlay a file mapping */
+	DECLARE_VMA_BIT(MAYOVERLAY, 9),
+#endif /* CONFIG_MMU */
+	/* Page-ranges managed without "struct page", just pure PFN */
+	DECLARE_VMA_BIT(PFNMAP, 10),
+	DECLARE_VMA_BIT(MAYBE_GUARD, 11),
+	DECLARE_VMA_BIT(UFFD_WP, 12),	/* wrprotect pages tracking */
+	DECLARE_VMA_BIT(LOCKED, 13),
+	DECLARE_VMA_BIT(IO, 14),	/* Memory mapped I/O or similar */
+	DECLARE_VMA_BIT(SEQ_READ, 15),	/* App will access data sequentially */
+	DECLARE_VMA_BIT(RAND_READ, 16),	/* App will not benefit from clustered reads */
+	DECLARE_VMA_BIT(DONTCOPY, 17),	/* Do not copy this vma on fork */
+	DECLARE_VMA_BIT(DONTEXPAND, 18),/* Cannot expand with mremap() */
+	DECLARE_VMA_BIT(LOCKONFAULT, 19),/* Lock pages covered when faulted in */
+	DECLARE_VMA_BIT(ACCOUNT, 20),	/* Is a VM accounted object */
+	DECLARE_VMA_BIT(NORESERVE, 21),	/* should the VM suppress accounting */
+	DECLARE_VMA_BIT(HUGETLB, 22),	/* Huge TLB Page VM */
+	DECLARE_VMA_BIT(SYNC, 23),	/* Synchronous page faults */
+	DECLARE_VMA_BIT(ARCH_1, 24),	/* Architecture-specific flag */
+	DECLARE_VMA_BIT(WIPEONFORK, 25),/* Wipe VMA contents in child. */
+	DECLARE_VMA_BIT(DONTDUMP, 26),	/* Do not include in the core dump */
+	DECLARE_VMA_BIT(SOFTDIRTY, 27),	/* NOT soft dirty clean area */
+	DECLARE_VMA_BIT(MIXEDMAP, 28),	/* Can contain struct page and pure PFN pages */
+	DECLARE_VMA_BIT(HUGEPAGE, 29),	/* MADV_HUGEPAGE marked this vma */
+	DECLARE_VMA_BIT(NOHUGEPAGE, 30),/* MADV_NOHUGEPAGE marked this vma */
+	DECLARE_VMA_BIT(MERGEABLE, 31),	/* KSM may merge identical pages */
+	/* These bits are reused, we define specific uses below. */
+	DECLARE_VMA_BIT(HIGH_ARCH_0, 32),
+	DECLARE_VMA_BIT(HIGH_ARCH_1, 33),
+	DECLARE_VMA_BIT(HIGH_ARCH_2, 34),
+	DECLARE_VMA_BIT(HIGH_ARCH_3, 35),
+	DECLARE_VMA_BIT(HIGH_ARCH_4, 36),
+	DECLARE_VMA_BIT(HIGH_ARCH_5, 37),
+	DECLARE_VMA_BIT(HIGH_ARCH_6, 38),
+	/*
+	 * This flag is used to connect VFIO to arch specific KVM code. It
+	 * indicates that the memory under this VMA is safe for use with any
+	 * non-cachable memory type inside KVM. Some VFIO devices, on some
+	 * platforms, are thought to be unsafe and can cause machine crashes
+	 * if KVM does not lock down the memory type.
+	 */
+	DECLARE_VMA_BIT(ALLOW_ANY_UNCACHED, 39),
+#ifdef CONFIG_PPC32
+	DECLARE_VMA_BIT_ALIAS(DROPPABLE, ARCH_1),
+#else
+	DECLARE_VMA_BIT(DROPPABLE, 40),
 #endif
-
-#ifndef VM_SHADOW_STACK
-# define VM_SHADOW_STACK	VM_NONE
+	DECLARE_VMA_BIT(UFFD_MINOR, 41),
+	DECLARE_VMA_BIT(SEALED, 42),
+	/* Flags that reuse flags above. */
+	DECLARE_VMA_BIT_ALIAS(PKEY_BIT0, HIGH_ARCH_0),
+	DECLARE_VMA_BIT_ALIAS(PKEY_BIT1, HIGH_ARCH_1),
+	DECLARE_VMA_BIT_ALIAS(PKEY_BIT2, HIGH_ARCH_2),
+	DECLARE_VMA_BIT_ALIAS(PKEY_BIT3, HIGH_ARCH_3),
+	DECLARE_VMA_BIT_ALIAS(PKEY_BIT4, HIGH_ARCH_4),
+#if defined(CONFIG_X86_USER_SHADOW_STACK)
+	/*
+	 * VM_SHADOW_STACK should not be set with VM_SHARED because of lack of
+	 * support core mm.
+	 *
+	 * These VMAs will get a single end guard page. This helps userspace
+	 * protect itself from attacks. A single page is enough for current
+	 * shadow stack archs (x86). See the comments near alloc_shstk() in
+	 * arch/x86/kernel/shstk.c for more details on the guard size.
+	 */
+	DECLARE_VMA_BIT_ALIAS(SHADOW_STACK, HIGH_ARCH_5),
+#elif defined(CONFIG_ARM64_GCS)
+	/*
+	 * arm64's Guarded Control Stack implements similar functionality and
+	 * has similar constraints to shadow stacks.
+	 */
+	DECLARE_VMA_BIT_ALIAS(SHADOW_STACK, HIGH_ARCH_6),
 #endif
+	DECLARE_VMA_BIT_ALIAS(SAO, ARCH_1),		/* Strong Access Ordering (powerpc) */
+	DECLARE_VMA_BIT_ALIAS(GROWSUP, ARCH_1),		/* parisc */
+	DECLARE_VMA_BIT_ALIAS(SPARC_ADI, ARCH_1),	/* sparc64 */
+	DECLARE_VMA_BIT_ALIAS(ARM64_BTI, ARCH_1),	/* arm64 */
+	DECLARE_VMA_BIT_ALIAS(ARCH_CLEAR, ARCH_1),	/* sparc64, arm64 */
+	DECLARE_VMA_BIT_ALIAS(MAPPED_COPY, ARCH_1),	/* !CONFIG_MMU */
+	DECLARE_VMA_BIT_ALIAS(MTE, HIGH_ARCH_4),	/* arm64 */
+	DECLARE_VMA_BIT_ALIAS(MTE_ALLOWED, HIGH_ARCH_5),/* arm64 */
+#ifdef CONFIG_STACK_GROWSUP
+	DECLARE_VMA_BIT_ALIAS(STACK, GROWSUP),
+	DECLARE_VMA_BIT_ALIAS(STACK_EARLY, GROWSDOWN),
+#else
+	DECLARE_VMA_BIT_ALIAS(STACK, GROWSDOWN),
+#endif
+};
+#undef DECLARE_VMA_BIT
+#undef DECLARE_VMA_BIT_ALIAS

+#define INIT_VM_FLAG(name) BIT((__force int) VMA_ ## name ## _BIT)
+#define VM_READ		INIT_VM_FLAG(READ)
+#define VM_WRITE	INIT_VM_FLAG(WRITE)
+#define VM_EXEC		INIT_VM_FLAG(EXEC)
+#define VM_SHARED	INIT_VM_FLAG(SHARED)
+#define VM_MAYREAD	INIT_VM_FLAG(MAYREAD)
+#define VM_MAYWRITE	INIT_VM_FLAG(MAYWRITE)
+#define VM_MAYEXEC	INIT_VM_FLAG(MAYEXEC)
+#define VM_MAYSHARE	INIT_VM_FLAG(MAYSHARE)
+#define VM_GROWSDOWN	INIT_VM_FLAG(GROWSDOWN)
+#ifdef CONFIG_MMU
+#define VM_UFFD_MISSING	INIT_VM_FLAG(UFFD_MISSING)
+#else
+#define VM_UFFD_MISSING	VM_NONE
+#define VM_MAYOVERLAY	INIT_VM_FLAG(MAYOVERLAY)
+#endif
+#define VM_PFNMAP	INIT_VM_FLAG(PFNMAP)
+#define VM_MAYBE_GUARD	INIT_VM_FLAG(MAYBE_GUARD)
+#define VM_UFFD_WP	INIT_VM_FLAG(UFFD_WP)
+#define VM_LOCKED	INIT_VM_FLAG(LOCKED)
+#define VM_IO		INIT_VM_FLAG(IO)
+#define VM_SEQ_READ	INIT_VM_FLAG(SEQ_READ)
+#define VM_RAND_READ	INIT_VM_FLAG(RAND_READ)
+#define VM_DONTCOPY	INIT_VM_FLAG(DONTCOPY)
+#define VM_DONTEXPAND	INIT_VM_FLAG(DONTEXPAND)
+#define VM_LOCKONFAULT	INIT_VM_FLAG(LOCKONFAULT)
+#define VM_ACCOUNT	INIT_VM_FLAG(ACCOUNT)
+#define VM_NORESERVE	INIT_VM_FLAG(NORESERVE)
+#define VM_HUGETLB	INIT_VM_FLAG(HUGETLB)
+#define VM_SYNC		INIT_VM_FLAG(SYNC)
+#define VM_ARCH_1	INIT_VM_FLAG(ARCH_1)
+#define VM_WIPEONFORK	INIT_VM_FLAG(WIPEONFORK)
+#define VM_DONTDUMP	INIT_VM_FLAG(DONTDUMP)
+#ifdef CONFIG_MEM_SOFT_DIRTY
+#define VM_SOFTDIRTY	INIT_VM_FLAG(SOFTDIRTY)
+#else
+#define VM_SOFTDIRTY	VM_NONE
+#endif
+#define VM_MIXEDMAP	INIT_VM_FLAG(MIXEDMAP)
+#define VM_HUGEPAGE	INIT_VM_FLAG(HUGEPAGE)
+#define VM_NOHUGEPAGE	INIT_VM_FLAG(NOHUGEPAGE)
+#define VM_MERGEABLE	INIT_VM_FLAG(MERGEABLE)
+#define VM_STACK	INIT_VM_FLAG(STACK)
+#ifdef CONFIG_STACK_GROWS_UP
+#define VM_STACK_EARLY	INIT_VM_FLAG(STACK_EARLY)
+#else
+#define VM_STACK_EARLY	VM_NONE
+#endif
+#ifdef CONFIG_ARCH_HAS_PKEYS
+#define VM_PKEY_SHIFT ((__force int)VMA_HIGH_ARCH_0_BIT)
+/* Despite the naming, these are FLAGS not bits. */
+#define VM_PKEY_BIT0 INIT_VM_FLAG(PKEY_BIT0)
+#define VM_PKEY_BIT1 INIT_VM_FLAG(PKEY_BIT1)
+#define VM_PKEY_BIT2 INIT_VM_FLAG(PKEY_BIT2)
+#if CONFIG_ARCH_PKEY_BITS > 3
+#define VM_PKEY_BIT3 INIT_VM_FLAG(PKEY_BIT3)
+#else
+#define VM_PKEY_BIT3 VM_NONE
+#endif /* CONFIG_ARCH_PKEY_BITS > 3 */
+#if CONFIG_ARCH_PKEY_BITS > 4
+#define VM_PKEY_BIT4 INIT_VM_FLAG(PKEY_BIT4)
+#else
+#define VM_PKEY_BIT4 VM_NONE
+#endif /* CONFIG_ARCH_PKEY_BITS > 4 */
+#endif /* CONFIG_ARCH_HAS_PKEYS */
+#if defined(CONFIG_X86_USER_SHADOW_STACK) || defined(CONFIG_ARM64_GCS)
+#define VM_SHADOW_STACK	INIT_VM_FLAG(SHADOW_STACK)
+#else
+#define VM_SHADOW_STACK	VM_NONE
+#endif
 #if defined(CONFIG_PPC64)
-# define VM_SAO		VM_ARCH_1	/* Strong Access Ordering (powerpc) */
+#define VM_SAO		INIT_VM_FLAG(SAO)
 #elif defined(CONFIG_PARISC)
-# define VM_GROWSUP	VM_ARCH_1
+#define VM_GROWSUP	INIT_VM_FLAG(GROWSUP)
 #elif defined(CONFIG_SPARC64)
-# define VM_SPARC_ADI	VM_ARCH_1	/* Uses ADI tag for access control */
-# define VM_ARCH_CLEAR	VM_SPARC_ADI
+#define VM_SPARC_ADI	INIT_VM_FLAG(SPARC_ADI)
+#define VM_ARCH_CLEAR	INIT_VM_FLAG(ARCH_CLEAR)
 #elif defined(CONFIG_ARM64)
-# define VM_ARM64_BTI	VM_ARCH_1	/* BTI guarded page, a.k.a. GP bit */
-# define VM_ARCH_CLEAR	VM_ARM64_BTI
+#define VM_ARM64_BTI	INIT_VM_FLAG(ARM64_BTI)
+#define VM_ARCH_CLEAR	INIT_VM_FLAG(ARCH_CLEAR)
 #elif !defined(CONFIG_MMU)
-# define VM_MAPPED_COPY	VM_ARCH_1	/* T if mapped copy of data (nommu mmap) */
+#define VM_MAPPED_COPY	INIT_VM_FLAG(MAPPED_COPY)
 #endif
-
-#if defined(CONFIG_ARM64_MTE)
-# define VM_MTE		VM_HIGH_ARCH_4	/* Use Tagged memory for access control */
-# define VM_MTE_ALLOWED	VM_HIGH_ARCH_5	/* Tagged memory permitted */
-#else
-# define VM_MTE		VM_NONE
-# define VM_MTE_ALLOWED	VM_NONE
-#endif
-
 #ifndef VM_GROWSUP
-# define VM_GROWSUP	VM_NONE
+#define VM_GROWSUP	VM_NONE
 #endif
-
-#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_MINOR
-# define VM_UFFD_MINOR_BIT	41
-# define VM_UFFD_MINOR		BIT(VM_UFFD_MINOR_BIT)	/* UFFD minor faults */
-#else /* !CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */
-# define VM_UFFD_MINOR		VM_NONE
-#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */
-
-/*
- * This flag is used to connect VFIO to arch specific KVM code. It
- * indicates that the memory under this VMA is safe for use with any
- * non-cachable memory type inside KVM. Some VFIO devices, on some
- * platforms, are thought to be unsafe and can cause machine crashes
- * if KVM does not lock down the memory type.
- */
-#ifdef CONFIG_64BIT
-#define VM_ALLOW_ANY_UNCACHED_BIT	39
-#define VM_ALLOW_ANY_UNCACHED		BIT(VM_ALLOW_ANY_UNCACHED_BIT)
+#ifdef CONFIG_ARM64_MTE
+#define VM_MTE		INIT_VM_FLAG(MTE)
+#define VM_MTE_ALLOWED	INIT_VM_FLAG(MTE_ALLOWED)
 #else
-#define VM_ALLOW_ANY_UNCACHED		VM_NONE
+#define VM_MTE		VM_NONE
+#define VM_MTE_ALLOWED	VM_NONE
 #endif
-
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_MINOR
+#define VM_UFFD_MINOR	INIT_VM_FLAG(UFFD_MINOR)
+#else
+#define VM_UFFD_MINOR	VM_NONE
+#endif
 #ifdef CONFIG_64BIT
-#define VM_DROPPABLE_BIT	40
-#define VM_DROPPABLE		BIT(VM_DROPPABLE_BIT)
-#elif defined(CONFIG_PPC32)
-#define VM_DROPPABLE		VM_ARCH_1
+#define VM_ALLOW_ANY_UNCACHED	INIT_VM_FLAG(ALLOW_ANY_UNCACHED)
+#define VM_SEALED		INIT_VM_FLAG(SEALED)
+#else
+#define VM_ALLOW_ANY_UNCACHED	VM_NONE
+#define VM_SEALED		VM_NONE
+#endif
+#if defined(CONFIG_64BIT) || defined(CONFIG_PPC32)
+#define VM_DROPPABLE		INIT_VM_FLAG(DROPPABLE)
 #else
 #define VM_DROPPABLE		VM_NONE
-#endif
-
-#ifdef CONFIG_64BIT
-#define VM_SEALED_BIT	42
-#define VM_SEALED	BIT(VM_SEALED_BIT)
-#else
-#define VM_SEALED	VM_NONE
 #endif

 /* Bits set in the VMA until the stack is in its final location */
···

 #define VM_STARTGAP_FLAGS (VM_GROWSDOWN | VM_SHADOW_STACK)

-#ifdef CONFIG_STACK_GROWSUP
-#define VM_STACK	VM_GROWSUP
-#define VM_STACK_EARLY	VM_GROWSDOWN
+#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS
+#define VM_SEALED_SYSMAP	VM_SEALED
 #else
-#define VM_STACK	VM_GROWSDOWN
-#define VM_STACK_EARLY	0
+#define VM_SEALED_SYSMAP	VM_NONE
 #endif

 #define VM_STACK_FLAGS	(VM_STACK | VM_STACK_DEFAULT_FLAGS | VM_ACCOUNT)

 /* VMA basic access permission flags */
 #define VM_ACCESS_FLAGS (VM_READ | VM_WRITE | VM_EXEC)
-

 /*
  * Special vmas that are non-mergable, non-mlock()able.
···

 /* Arch-specific flags to clear when updating VM flags on protection change */
 #ifndef VM_ARCH_CLEAR
-# define VM_ARCH_CLEAR	VM_NONE
+#define VM_ARCH_CLEAR	VM_NONE
 #endif
 #define VM_FLAGS_CLEAR	(ARCH_VM_PKEY_FLAGS | VM_ARCH_CLEAR)

···
 }

 static inline bool __vma_flag_atomic_valid(struct vm_area_struct *vma,
-				       int bit)
+				       vma_flag_t bit)
 {
-	const vm_flags_t mask = BIT(bit);
+	const vm_flags_t mask = BIT((__force int)bit);

 	/* Only specific flags are permitted */
 	if (WARN_ON_ONCE(!(mask & VM_ATOMIC_SET_ALLOWED)))
···
  * Set VMA flag atomically. Requires only VMA/mmap read lock. Only specific
  * valid flags are allowed to do this.
  */
-static inline void vma_flag_set_atomic(struct vm_area_struct *vma, int bit)
+static inline void vma_flag_set_atomic(struct vm_area_struct *vma,
+				       vma_flag_t bit)
 {
 	/* mmap read lock/VMA read lock must be held. */
 	if (!rwsem_is_locked(&vma->vm_mm->mmap_lock))
 		vma_assert_locked(vma);

 	if (__vma_flag_atomic_valid(vma, bit))
-		set_bit(bit, &ACCESS_PRIVATE(vma, __vm_flags));
+		set_bit((__force int)bit, &ACCESS_PRIVATE(vma, __vm_flags));
 }

 /*
···
  * This is necessarily racey, so callers must ensure that serialisation is
  * achieved through some other means, or that races are permissible.
  */
-static inline bool vma_flag_test_atomic(struct vm_area_struct *vma, int bit)
+static inline bool vma_flag_test_atomic(struct vm_area_struct *vma,
+					vma_flag_t bit)
 {
 	if (__vma_flag_atomic_valid(vma, bit))
-		return test_bit(bit, &vma->vm_flags);
+		return test_bit((__force int)bit, &vma->vm_flags);

 	return false;
 }
···
 int arch_get_shadow_stack_status(struct task_struct *t, unsigned long __user *status);
 int arch_set_shadow_stack_status(struct task_struct *t, unsigned long status);
 int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
-
-
-/*
- * mseal of userspace process's system mappings.
- */
-#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS
-#define VM_SEALED_SYSMAP	VM_SEALED
-#else
-#define VM_SEALED_SYSMAP	VM_NONE
-#endif

 /*
  * DMA mapping IDs for page_pool
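The shape of the typed atomic helpers can be approximated in user space with C11 atomics. This is a sketch under assumptions rather than the kernel code: the `demo_` names are hypothetical, the allowed mask is assumed to contain only the MAYBE_GUARD bit, and the locking assertions and `WARN_ON_ONCE()` are left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef int demo_vma_flag_t;	/* would carry __bitwise under sparse */

#define DEMO_VMA_MAYBE_GUARD_BIT	((demo_vma_flag_t)11)
/* Assumption: only this one flag may be set atomically. */
#define DEMO_ATOMIC_SET_ALLOWED		(1UL << 11)

static inline bool demo_flag_atomic_valid(demo_vma_flag_t bit)
{
	const unsigned long mask = 1UL << (int)bit;

	return (mask & DEMO_ATOMIC_SET_ALLOWED) != 0;
}

/* Silently ignores bits outside the allowed mask. */
static inline void demo_flag_set_atomic(atomic_ulong *flags, demo_vma_flag_t bit)
{
	if (demo_flag_atomic_valid(bit))
		atomic_fetch_or(flags, 1UL << (int)bit);
}

static inline bool demo_flag_test_atomic(atomic_ulong *flags, demo_vma_flag_t bit)
{
	if (demo_flag_atomic_valid(bit))
		return (atomic_load(flags) & (1UL << (int)bit)) != 0;

	return false;
}
```

Callers pass the bit constant rather than the flag value, for example `demo_flag_set_atomic(&flags, DEMO_VMA_MAYBE_GUARD_BIT)`.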
+1 -1
mm/khugepaged.c
···
 	 * obtained on guard region installation after the flag is set, so this
 	 * check being performed under this lock excludes races.
 	 */
-	if (vma_flag_test_atomic(vma, VM_MAYBE_GUARD_BIT))
+	if (vma_flag_test_atomic(vma, VMA_MAYBE_GUARD_BIT))
 		return false;

 	return true;
+1 -1
mm/madvise.c
···
 		 * acquire an mmap/VMA write lock to read it. All remaining readers may
 		 * or may not see the flag set, but we don't care.
 		 */
-		vma_flag_set_atomic(vma, VM_MAYBE_GUARD_BIT);
+		vma_flag_set_atomic(vma, VMA_MAYBE_GUARD_BIT);

 		/*
 		 * If anonymous and we are establishing page tables the VMA ought to
+25
rust/bindgen_parameters
···
 # recognized, block generation of the non-helper constants.
 --blocklist-item ARCH_SLAB_MINALIGN
 --blocklist-item ARCH_KMALLOC_MINALIGN
+--blocklist-item VM_MERGEABLE
+--blocklist-item VM_READ
+--blocklist-item VM_WRITE
+--blocklist-item VM_EXEC
+--blocklist-item VM_SHARED
+--blocklist-item VM_MAYREAD
+--blocklist-item VM_MAYWRITE
+--blocklist-item VM_MAYEXEC
+--blocklist-item VM_MAYEXEC
+--blocklist-item VM_PFNMAP
+--blocklist-item VM_IO
+--blocklist-item VM_DONTCOPY
+--blocklist-item VM_DONTEXPAND
+--blocklist-item VM_LOCKONFAULT
+--blocklist-item VM_ACCOUNT
+--blocklist-item VM_NORESERVE
+--blocklist-item VM_HUGETLB
+--blocklist-item VM_SYNC
+--blocklist-item VM_ARCH_1
+--blocklist-item VM_WIPEONFORK
+--blocklist-item VM_DONTDUMP
+--blocklist-item VM_SOFTDIRTY
+--blocklist-item VM_MIXEDMAP
+--blocklist-item VM_HUGEPAGE
+--blocklist-item VM_NOHUGEPAGE

 # Structs should implement `Zeroable` when all of their fields do.
 --with-derive-custom-struct .*=MaybeZeroable
+25
rust/bindings/bindings_helper.h
···

 const gfp_t RUST_CONST_HELPER_XA_FLAGS_ALLOC = XA_FLAGS_ALLOC;
 const gfp_t RUST_CONST_HELPER_XA_FLAGS_ALLOC1 = XA_FLAGS_ALLOC1;
+
 const vm_flags_t RUST_CONST_HELPER_VM_MERGEABLE = VM_MERGEABLE;
+const vm_flags_t RUST_CONST_HELPER_VM_READ = VM_READ;
+const vm_flags_t RUST_CONST_HELPER_VM_WRITE = VM_WRITE;
+const vm_flags_t RUST_CONST_HELPER_VM_EXEC = VM_EXEC;
+const vm_flags_t RUST_CONST_HELPER_VM_SHARED = VM_SHARED;
+const vm_flags_t RUST_CONST_HELPER_VM_MAYREAD = VM_MAYREAD;
+const vm_flags_t RUST_CONST_HELPER_VM_MAYWRITE = VM_MAYWRITE;
+const vm_flags_t RUST_CONST_HELPER_VM_MAYEXEC = VM_MAYEXEC;
+const vm_flags_t RUST_CONST_HELPER_VM_MAYSHARE = VM_MAYEXEC;
+const vm_flags_t RUST_CONST_HELPER_VM_PFNMAP = VM_PFNMAP;
+const vm_flags_t RUST_CONST_HELPER_VM_IO = VM_IO;
+const vm_flags_t RUST_CONST_HELPER_VM_DONTCOPY = VM_DONTCOPY;
+const vm_flags_t RUST_CONST_HELPER_VM_DONTEXPAND = VM_DONTEXPAND;
+const vm_flags_t RUST_CONST_HELPER_VM_LOCKONFAULT = VM_LOCKONFAULT;
+const vm_flags_t RUST_CONST_HELPER_VM_ACCOUNT = VM_ACCOUNT;
+const vm_flags_t RUST_CONST_HELPER_VM_NORESERVE = VM_NORESERVE;
+const vm_flags_t RUST_CONST_HELPER_VM_HUGETLB = VM_HUGETLB;
+const vm_flags_t RUST_CONST_HELPER_VM_SYNC = VM_SYNC;
+const vm_flags_t RUST_CONST_HELPER_VM_ARCH_1 = VM_ARCH_1;
+const vm_flags_t RUST_CONST_HELPER_VM_WIPEONFORK = VM_WIPEONFORK;
+const vm_flags_t RUST_CONST_HELPER_VM_DONTDUMP = VM_DONTDUMP;
+const vm_flags_t RUST_CONST_HELPER_VM_SOFTDIRTY = VM_SOFTDIRTY;
+const vm_flags_t RUST_CONST_HELPER_VM_MIXEDMAP = VM_MIXEDMAP;
+const vm_flags_t RUST_CONST_HELPER_VM_HUGEPAGE = VM_HUGEPAGE;
+const vm_flags_t RUST_CONST_HELPER_VM_NOHUGEPAGE = VM_NOHUGEPAGE;

 #if IS_ENABLED(CONFIG_ANDROID_BINDER_IPC_RUST)
 #include "../../drivers/android/binder/rust_binder.h"
+260 -46
tools/testing/vma/vma_internal.h
···

 #define MMF_HAS_MDWE	28

+/*
+ * vm_flags in vm_area_struct, see mm_types.h.
+ * When changing, update also include/trace/events/mmflags.h
+ */
+
 #define VM_NONE		0x00000000
-#define VM_READ		0x00000001
-#define VM_WRITE	0x00000002
-#define VM_EXEC		0x00000004
-#define VM_SHARED	0x00000008
-#define VM_MAYREAD	0x00000010
-#define VM_MAYWRITE	0x00000020
-#define VM_MAYEXEC	0x00000040
-#define VM_GROWSDOWN	0x00000100
-#define VM_PFNMAP	0x00000400
-#define VM_MAYBE_GUARD	0x00000800
-#define VM_LOCKED	0x00002000
-#define VM_IO		0x00004000
-#define VM_SEQ_READ	0x00008000	/* App will access data sequentially */
-#define VM_RAND_READ	0x00010000	/* App will not benefit from clustered reads */
-#define VM_DONTEXPAND	0x00040000
-#define VM_LOCKONFAULT	0x00080000
-#define VM_ACCOUNT	0x00100000
-#define VM_NORESERVE	0x00200000
-#define VM_MIXEDMAP	0x10000000
-#define VM_STACK	VM_GROWSDOWN
-#define VM_SHADOW_STACK	VM_NONE
-#define VM_SOFTDIRTY	0
-#define VM_ARCH_1	0x01000000	/* Architecture-specific flag */
-#define VM_GROWSUP	VM_NONE

-#define VM_ACCESS_FLAGS (VM_READ | VM_WRITE | VM_EXEC)
-#define VM_SPECIAL (VM_IO | VM_DONTEXPAND | VM_PFNMAP | VM_MIXEDMAP)
+/**
+ * typedef vma_flag_t - specifies an individual VMA flag by bit number.
+ *
+ * This value is made type safe by sparse to avoid passing invalid flag values
+ * around.
+ */
+typedef int __bitwise vma_flag_t;

-#ifdef CONFIG_STACK_GROWSUP
-#define VM_STACK	VM_GROWSUP
-#define VM_STACK_EARLY	VM_GROWSDOWN
+#define DECLARE_VMA_BIT(name, bitnum) \
+	VMA_ ## name ## _BIT = ((__force vma_flag_t)bitnum)
+#define DECLARE_VMA_BIT_ALIAS(name, aliased) \
+	VMA_ ## name ## _BIT = VMA_ ## aliased ## _BIT
+enum {
+	DECLARE_VMA_BIT(READ, 0),
+	DECLARE_VMA_BIT(WRITE, 1),
+	DECLARE_VMA_BIT(EXEC, 2),
+	DECLARE_VMA_BIT(SHARED, 3),
+	/* mprotect() hardcodes VM_MAYREAD >> 4 == VM_READ, and so for r/w/x bits. */
+	DECLARE_VMA_BIT(MAYREAD, 4),	/* limits for mprotect() etc. */
+	DECLARE_VMA_BIT(MAYWRITE, 5),
+	DECLARE_VMA_BIT(MAYEXEC, 6),
+	DECLARE_VMA_BIT(MAYSHARE, 7),
+	DECLARE_VMA_BIT(GROWSDOWN, 8),	/* general info on the segment */
+#ifdef CONFIG_MMU
+	DECLARE_VMA_BIT(UFFD_MISSING, 9),/* missing pages tracking */
 #else
-#define VM_STACK	VM_GROWSDOWN
-#define VM_STACK_EARLY	0
+	/* nommu: R/O MAP_PRIVATE mapping that might overlay a file mapping */
+	DECLARE_VMA_BIT(MAYOVERLAY, 9),
+#endif /* CONFIG_MMU */
+	/* Page-ranges managed without "struct page", just pure PFN */
+	DECLARE_VMA_BIT(PFNMAP, 10),
+	DECLARE_VMA_BIT(MAYBE_GUARD, 11),
+	DECLARE_VMA_BIT(UFFD_WP, 12),	/* wrprotect pages tracking */
+	DECLARE_VMA_BIT(LOCKED, 13),
+	DECLARE_VMA_BIT(IO, 14),	/* Memory mapped I/O or similar */
+	DECLARE_VMA_BIT(SEQ_READ, 15),	/* App will access data sequentially */
+	DECLARE_VMA_BIT(RAND_READ, 16),	/* App will not benefit from clustered reads */
+	DECLARE_VMA_BIT(DONTCOPY, 17),	/* Do not copy this vma on fork */
+	DECLARE_VMA_BIT(DONTEXPAND, 18),/* Cannot expand with mremap() */
+	DECLARE_VMA_BIT(LOCKONFAULT, 19),/* Lock pages covered when faulted in */
+	DECLARE_VMA_BIT(ACCOUNT, 20),	/* Is a VM accounted object */
+	DECLARE_VMA_BIT(NORESERVE, 21),	/* should the VM suppress accounting */
+	DECLARE_VMA_BIT(HUGETLB,
22), /* Huge TLB Page VM */ 99 + DECLARE_VMA_BIT(SYNC, 23), /* Synchronous page faults */ 100 + DECLARE_VMA_BIT(ARCH_1, 24), /* Architecture-specific flag */ 101 + DECLARE_VMA_BIT(WIPEONFORK, 25),/* Wipe VMA contents in child. */ 102 + DECLARE_VMA_BIT(DONTDUMP, 26), /* Do not include in the core dump */ 103 + DECLARE_VMA_BIT(SOFTDIRTY, 27), /* NOT soft dirty clean area */ 104 + DECLARE_VMA_BIT(MIXEDMAP, 28), /* Can contain struct page and pure PFN pages */ 105 + DECLARE_VMA_BIT(HUGEPAGE, 29), /* MADV_HUGEPAGE marked this vma */ 106 + DECLARE_VMA_BIT(NOHUGEPAGE, 30),/* MADV_NOHUGEPAGE marked this vma */ 107 + DECLARE_VMA_BIT(MERGEABLE, 31), /* KSM may merge identical pages */ 108 + /* These bits are reused, we define specific uses below. */ 109 + DECLARE_VMA_BIT(HIGH_ARCH_0, 32), 110 + DECLARE_VMA_BIT(HIGH_ARCH_1, 33), 111 + DECLARE_VMA_BIT(HIGH_ARCH_2, 34), 112 + DECLARE_VMA_BIT(HIGH_ARCH_3, 35), 113 + DECLARE_VMA_BIT(HIGH_ARCH_4, 36), 114 + DECLARE_VMA_BIT(HIGH_ARCH_5, 37), 115 + DECLARE_VMA_BIT(HIGH_ARCH_6, 38), 116 + /* 117 + * This flag is used to connect VFIO to arch specific KVM code. It 118 + * indicates that the memory under this VMA is safe for use with any 119 + * non-cachable memory type inside KVM. Some VFIO devices, on some 120 + * platforms, are thought to be unsafe and can cause machine crashes 121 + * if KVM does not lock down the memory type. 122 + */ 123 + DECLARE_VMA_BIT(ALLOW_ANY_UNCACHED, 39), 124 + #ifdef CONFIG_PPC32 125 + DECLARE_VMA_BIT_ALIAS(DROPPABLE, ARCH_1), 126 + #else 127 + DECLARE_VMA_BIT(DROPPABLE, 40), 84 128 #endif 129 + DECLARE_VMA_BIT(UFFD_MINOR, 41), 130 + DECLARE_VMA_BIT(SEALED, 42), 131 + /* Flags that reuse flags above. 
*/ 132 + DECLARE_VMA_BIT_ALIAS(PKEY_BIT0, HIGH_ARCH_0), 133 + DECLARE_VMA_BIT_ALIAS(PKEY_BIT1, HIGH_ARCH_1), 134 + DECLARE_VMA_BIT_ALIAS(PKEY_BIT2, HIGH_ARCH_2), 135 + DECLARE_VMA_BIT_ALIAS(PKEY_BIT3, HIGH_ARCH_3), 136 + DECLARE_VMA_BIT_ALIAS(PKEY_BIT4, HIGH_ARCH_4), 137 + #if defined(CONFIG_X86_USER_SHADOW_STACK) 138 + /* 139 + * VM_SHADOW_STACK should not be set with VM_SHARED because of lack of 140 + * support core mm. 141 + * 142 + * These VMAs will get a single end guard page. This helps userspace 143 + * protect itself from attacks. A single page is enough for current 144 + * shadow stack archs (x86). See the comments near alloc_shstk() in 145 + * arch/x86/kernel/shstk.c for more details on the guard size. 146 + */ 147 + DECLARE_VMA_BIT_ALIAS(SHADOW_STACK, HIGH_ARCH_5), 148 + #elif defined(CONFIG_ARM64_GCS) 149 + /* 150 + * arm64's Guarded Control Stack implements similar functionality and 151 + * has similar constraints to shadow stacks. 152 + */ 153 + DECLARE_VMA_BIT_ALIAS(SHADOW_STACK, HIGH_ARCH_6), 154 + #endif 155 + DECLARE_VMA_BIT_ALIAS(SAO, ARCH_1), /* Strong Access Ordering (powerpc) */ 156 + DECLARE_VMA_BIT_ALIAS(GROWSUP, ARCH_1), /* parisc */ 157 + DECLARE_VMA_BIT_ALIAS(SPARC_ADI, ARCH_1), /* sparc64 */ 158 + DECLARE_VMA_BIT_ALIAS(ARM64_BTI, ARCH_1), /* arm64 */ 159 + DECLARE_VMA_BIT_ALIAS(ARCH_CLEAR, ARCH_1), /* sparc64, arm64 */ 160 + DECLARE_VMA_BIT_ALIAS(MAPPED_COPY, ARCH_1), /* !CONFIG_MMU */ 161 + DECLARE_VMA_BIT_ALIAS(MTE, HIGH_ARCH_4), /* arm64 */ 162 + DECLARE_VMA_BIT_ALIAS(MTE_ALLOWED, HIGH_ARCH_5),/* arm64 */ 163 + #ifdef CONFIG_STACK_GROWSUP 164 + DECLARE_VMA_BIT_ALIAS(STACK, GROWSUP), 165 + DECLARE_VMA_BIT_ALIAS(STACK_EARLY, GROWSDOWN), 166 + #else 167 + DECLARE_VMA_BIT_ALIAS(STACK, GROWSDOWN), 168 + #endif 169 + }; 170 + 171 + #define INIT_VM_FLAG(name) BIT((__force int) VMA_ ## name ## _BIT) 172 + #define VM_READ INIT_VM_FLAG(READ) 173 + #define VM_WRITE INIT_VM_FLAG(WRITE) 174 + #define VM_EXEC INIT_VM_FLAG(EXEC) 175 + #define 
VM_SHARED INIT_VM_FLAG(SHARED) 176 + #define VM_MAYREAD INIT_VM_FLAG(MAYREAD) 177 + #define VM_MAYWRITE INIT_VM_FLAG(MAYWRITE) 178 + #define VM_MAYEXEC INIT_VM_FLAG(MAYEXEC) 179 + #define VM_MAYSHARE INIT_VM_FLAG(MAYSHARE) 180 + #define VM_GROWSDOWN INIT_VM_FLAG(GROWSDOWN) 181 + #ifdef CONFIG_MMU 182 + #define VM_UFFD_MISSING INIT_VM_FLAG(UFFD_MISSING) 183 + #else 184 + #define VM_UFFD_MISSING VM_NONE 185 + #define VM_MAYOVERLAY INIT_VM_FLAG(MAYOVERLAY) 186 + #endif 187 + #define VM_PFNMAP INIT_VM_FLAG(PFNMAP) 188 + #define VM_MAYBE_GUARD INIT_VM_FLAG(MAYBE_GUARD) 189 + #define VM_UFFD_WP INIT_VM_FLAG(UFFD_WP) 190 + #define VM_LOCKED INIT_VM_FLAG(LOCKED) 191 + #define VM_IO INIT_VM_FLAG(IO) 192 + #define VM_SEQ_READ INIT_VM_FLAG(SEQ_READ) 193 + #define VM_RAND_READ INIT_VM_FLAG(RAND_READ) 194 + #define VM_DONTCOPY INIT_VM_FLAG(DONTCOPY) 195 + #define VM_DONTEXPAND INIT_VM_FLAG(DONTEXPAND) 196 + #define VM_LOCKONFAULT INIT_VM_FLAG(LOCKONFAULT) 197 + #define VM_ACCOUNT INIT_VM_FLAG(ACCOUNT) 198 + #define VM_NORESERVE INIT_VM_FLAG(NORESERVE) 199 + #define VM_HUGETLB INIT_VM_FLAG(HUGETLB) 200 + #define VM_SYNC INIT_VM_FLAG(SYNC) 201 + #define VM_ARCH_1 INIT_VM_FLAG(ARCH_1) 202 + #define VM_WIPEONFORK INIT_VM_FLAG(WIPEONFORK) 203 + #define VM_DONTDUMP INIT_VM_FLAG(DONTDUMP) 204 + #ifdef CONFIG_MEM_SOFT_DIRTY 205 + #define VM_SOFTDIRTY INIT_VM_FLAG(SOFTDIRTY) 206 + #else 207 + #define VM_SOFTDIRTY VM_NONE 208 + #endif 209 + #define VM_MIXEDMAP INIT_VM_FLAG(MIXEDMAP) 210 + #define VM_HUGEPAGE INIT_VM_FLAG(HUGEPAGE) 211 + #define VM_NOHUGEPAGE INIT_VM_FLAG(NOHUGEPAGE) 212 + #define VM_MERGEABLE INIT_VM_FLAG(MERGEABLE) 213 + #define VM_STACK INIT_VM_FLAG(STACK) 214 + #ifdef CONFIG_STACK_GROWS_UP 215 + #define VM_STACK_EARLY INIT_VM_FLAG(STACK_EARLY) 216 + #else 217 + #define VM_STACK_EARLY VM_NONE 218 + #endif 219 + #ifdef CONFIG_ARCH_HAS_PKEYS 220 + #define VM_PKEY_SHIFT ((__force int)VMA_HIGH_ARCH_0_BIT) 221 + /* Despite the naming, these are FLAGS not bits. 
*/ 222 + #define VM_PKEY_BIT0 INIT_VM_FLAG(PKEY_BIT0) 223 + #define VM_PKEY_BIT1 INIT_VM_FLAG(PKEY_BIT1) 224 + #define VM_PKEY_BIT2 INIT_VM_FLAG(PKEY_BIT2) 225 + #if CONFIG_ARCH_PKEY_BITS > 3 226 + #define VM_PKEY_BIT3 INIT_VM_FLAG(PKEY_BIT3) 227 + #else 228 + #define VM_PKEY_BIT3 VM_NONE 229 + #endif /* CONFIG_ARCH_PKEY_BITS > 3 */ 230 + #if CONFIG_ARCH_PKEY_BITS > 4 231 + #define VM_PKEY_BIT4 INIT_VM_FLAG(PKEY_BIT4) 232 + #else 233 + #define VM_PKEY_BIT4 VM_NONE 234 + #endif /* CONFIG_ARCH_PKEY_BITS > 4 */ 235 + #endif /* CONFIG_ARCH_HAS_PKEYS */ 236 + #if defined(CONFIG_X86_USER_SHADOW_STACK) || defined(CONFIG_ARM64_GCS) 237 + #define VM_SHADOW_STACK INIT_VM_FLAG(SHADOW_STACK) 238 + #else 239 + #define VM_SHADOW_STACK VM_NONE 240 + #endif 241 + #if defined(CONFIG_PPC64) 242 + #define VM_SAO INIT_VM_FLAG(SAO) 243 + #elif defined(CONFIG_PARISC) 244 + #define VM_GROWSUP INIT_VM_FLAG(GROWSUP) 245 + #elif defined(CONFIG_SPARC64) 246 + #define VM_SPARC_ADI INIT_VM_FLAG(SPARC_ADI) 247 + #define VM_ARCH_CLEAR INIT_VM_FLAG(ARCH_CLEAR) 248 + #elif defined(CONFIG_ARM64) 249 + #define VM_ARM64_BTI INIT_VM_FLAG(ARM64_BTI) 250 + #define VM_ARCH_CLEAR INIT_VM_FLAG(ARCH_CLEAR) 251 + #elif !defined(CONFIG_MMU) 252 + #define VM_MAPPED_COPY INIT_VM_FLAG(MAPPED_COPY) 253 + #endif 254 + #ifndef VM_GROWSUP 255 + #define VM_GROWSUP VM_NONE 256 + #endif 257 + #ifdef CONFIG_ARM64_MTE 258 + #define VM_MTE INIT_VM_FLAG(MTE) 259 + #define VM_MTE_ALLOWED INIT_VM_FLAG(MTE_ALLOWED) 260 + #else 261 + #define VM_MTE VM_NONE 262 + #define VM_MTE_ALLOWED VM_NONE 263 + #endif 264 + #ifdef CONFIG_HAVE_ARCH_USERFAULTFD_MINOR 265 + #define VM_UFFD_MINOR INIT_VM_FLAG(UFFD_MINOR) 266 + #else 267 + #define VM_UFFD_MINOR VM_NONE 268 + #endif 269 + #ifdef CONFIG_64BIT 270 + #define VM_ALLOW_ANY_UNCACHED INIT_VM_FLAG(ALLOW_ANY_UNCACHED) 271 + #define VM_SEALED INIT_VM_FLAG(SEALED) 272 + #else 273 + #define VM_ALLOW_ANY_UNCACHED VM_NONE 274 + #define VM_SEALED VM_NONE 275 + #endif 276 + #if 
defined(CONFIG_64BIT) || defined(CONFIG_PPC32) 277 + #define VM_DROPPABLE INIT_VM_FLAG(DROPPABLE) 278 + #else 279 + #define VM_DROPPABLE VM_NONE 280 + #endif 281 + 282 + /* Bits set in the VMA until the stack is in its final location */ 283 + #define VM_STACK_INCOMPLETE_SETUP (VM_RAND_READ | VM_SEQ_READ | VM_STACK_EARLY) 284 + 285 + #define TASK_EXEC ((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0) 286 + 287 + /* Common data flag combinations */ 288 + #define VM_DATA_FLAGS_TSK_EXEC (VM_READ | VM_WRITE | TASK_EXEC | \ 289 + VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC) 290 + #define VM_DATA_FLAGS_NON_EXEC (VM_READ | VM_WRITE | VM_MAYREAD | \ 291 + VM_MAYWRITE | VM_MAYEXEC) 292 + #define VM_DATA_FLAGS_EXEC (VM_READ | VM_WRITE | VM_EXEC | \ 293 + VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC) 294 + 295 + #ifndef VM_DATA_DEFAULT_FLAGS /* arch can override this */ 296 + #define VM_DATA_DEFAULT_FLAGS VM_DATA_FLAGS_EXEC 297 + #endif 298 + 299 + #ifndef VM_STACK_DEFAULT_FLAGS /* arch can override this */ 300 + #define VM_STACK_DEFAULT_FLAGS VM_DATA_DEFAULT_FLAGS 301 + #endif 302 + 303 + #define VM_STARTGAP_FLAGS (VM_GROWSDOWN | VM_SHADOW_STACK) 304 + 305 + #define VM_STACK_FLAGS (VM_STACK | VM_STACK_DEFAULT_FLAGS | VM_ACCOUNT) 306 + 307 + /* VMA basic access permission flags */ 308 + #define VM_ACCESS_FLAGS (VM_READ | VM_WRITE | VM_EXEC) 309 + 310 + /* 311 + * Special vmas that are non-mergable, non-mlock()able. 
312 + */ 313 + #define VM_SPECIAL (VM_IO | VM_DONTEXPAND | VM_PFNMAP | VM_MIXEDMAP) 85 314 86 315 #define DEFAULT_MAP_WINDOW ((1UL << 47) - PAGE_SIZE) 87 316 #define TASK_SIZE_LOW DEFAULT_MAP_WINDOW ··· 326 97 #define VM_DATA_FLAGS_TSK_EXEC (VM_READ | VM_WRITE | TASK_EXEC | \ 327 98 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC) 328 99 329 - #define VM_DATA_DEFAULT_FLAGS VM_DATA_FLAGS_TSK_EXEC 330 - 331 - #define VM_STARTGAP_FLAGS (VM_GROWSDOWN | VM_SHADOW_STACK) 332 - 333 - #define VM_STACK_DEFAULT_FLAGS VM_DATA_DEFAULT_FLAGS 334 - #define VM_STACK_FLAGS (VM_STACK | VM_STACK_DEFAULT_FLAGS | VM_ACCOUNT) 335 - #define VM_STACK_INCOMPLETE_SETUP (VM_RAND_READ | VM_SEQ_READ | VM_STACK_EARLY) 336 - 337 100 #define RLIMIT_STACK 3 /* max stack size */ 338 101 #define RLIMIT_MEMLOCK 8 /* max locked-in-memory address space */ 339 102 340 103 #define CAP_IPC_LOCK 14 341 - 342 - #ifdef CONFIG_64BIT 343 - #define VM_SEALED_BIT 42 344 - #define VM_SEALED BIT(VM_SEALED_BIT) 345 - #else 346 - #define VM_SEALED VM_NONE 347 - #endif 348 104 349 105 /* 350 106 * Flags which should be 'sticky' on merge - that is, flags which, when one VMA