Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux
1
fork

Configure Feed

Select the types of activity you want to include in your feed.

mm: document warning in move_normal_pmd() and make it warn only once

Naresh Kamboju reported that the LTP tests can cause warnings on i386
going back all the way to v5.0, and bisected it to commit 2c91bd4a4e2e
("mm: speed up mremap by 20x on large regions").

The warning in move_normal_pmd() is actually mostly correct, but we have
a very unusual special case at process creation time, when we may move
the stack down with an overlapping mode (kind of like a "memmove()"
except using the page tables).

And when you have just the right condition of "move a large initial
stack by the right alignment in the end, but with the early part of the
move being only page-aligned", we'll be in a situation where we're
trying to move a normal PMD entry on top of an already existing - but
now empty - PMD entry.

The warning is still worth having, in case it ever triggers other cases,
and perhaps as a reminder that we could do the stack move case more
efficiently (although it's clearly rare enough that it probably doesn't
matter).

But make it do WARN_ON_ONCE(), so that you can't flood the logs with it.

And add a *big* comment above it to explain and remind us what's going
on, because it took some figuring out to see how this could trigger.
Kudos to Joel Fernandes for debugging this.

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Debugged-and-acked-by: Joel Fernandes <joel@joelfernandes.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

+21 -2
+21 -2
mm/mremap.c
··· 206 206 207 207 /* 208 208 * The destination pmd shouldn't be established, free_pgtables() 209 - * should have release it. 209 + * should have released it. 210 + * 211 + * However, there's a case during execve() where we use mremap 212 + * to move the initial stack, and in that case the target area 213 + * may overlap the source area (always moving down). 214 + * 215 + * If everything is PMD-aligned, that works fine, as moving 216 + * each pmd down will clear the source pmd. But if we first 217 + * have a few 4kB-only pages that get moved down, and then 218 + * hit the "now the rest is PMD-aligned, let's do everything 219 + * one pmd at a time", we will still have the old (now empty 220 + * of any 4kB pages, but still there) PMD in the page table 221 + * tree. 222 + * 223 + * Warn on it once - because we really should try to figure 224 + * out how to do this better - but then say "I won't move 225 + * this pmd". 226 + * 227 + * One alternative might be to just unmap the target pmd at 228 + * this point, and verify that it really is empty. We'll see. 210 229 */ 211 - if (WARN_ON(!pmd_none(*new_pmd))) 230 + if (WARN_ON_ONCE(!pmd_none(*new_pmd))) 212 231 return false; 213 232 214 233 /*