Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux
1
fork

Configure Feed

Select the types of activity you want to include in your feed.

Fix possible splice() mmap_sem deadlock

Nick Piggin points out that splice isn't being good about the mmap
semaphore: while two readers can nest inside each others, it does leave
a possible deadlock if a writer (ie a new mmap()) comes in during that
nesting.

Original "just move the locking" patch by Nick, replaced by one by me
based on an optimistic pagefault_disable(). And then Jens tested and
updated that patch.

Reported-by: Nick Piggin <npiggin@suse.de>
Tested-by: Jens Axboe <jens.axboe@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

+34 -12
+34 -12
fs/splice.c
··· 1224 1224 } 1225 1225 1226 1226 /* 1227 + * Do a copy-from-user while holding the mmap_semaphore for reading, in a 1228 + * manner safe from deadlocking with simultaneous mmap() (grabbing mmap_sem 1229 + * for writing) and page faulting on the user memory pointed to by src. 1230 + * This assumes that we will very rarely hit the partial != 0 path, or this 1231 + * will not be a win. 1232 + */ 1233 + static int copy_from_user_mmap_sem(void *dst, const void __user *src, size_t n) 1234 + { 1235 + int partial; 1236 + 1237 + pagefault_disable(); 1238 + partial = __copy_from_user_inatomic(dst, src, n); 1239 + pagefault_enable(); 1240 + 1241 + /* 1242 + * Didn't copy everything, drop the mmap_sem and do a faulting copy 1243 + */ 1244 + if (unlikely(partial)) { 1245 + up_read(&current->mm->mmap_sem); 1246 + partial = copy_from_user(dst, src, n); 1247 + down_read(&current->mm->mmap_sem); 1248 + } 1249 + 1250 + return partial; 1251 + } 1252 + 1253 + /* 1227 1254 * Map an iov into an array of pages and offset/length tupples. With the 1228 1255 * partial_page structure, we can map several non-contiguous ranges into 1229 1256 * our ones pages[] map instead of splitting that operation into pieces. ··· 1263 1236 { 1264 1237 int buffers = 0, error = 0; 1265 1238 1266 - /* 1267 - * It's ok to take the mmap_sem for reading, even 1268 - * across a "get_user()". 1269 - */ 1270 1239 down_read(&current->mm->mmap_sem); 1271 1240 1272 1241 while (nr_vecs) { 1273 1242 unsigned long off, npages; 1243 + struct iovec entry; 1274 1244 void __user *base; 1275 1245 size_t len; 1276 1246 int i; 1277 1247 1278 - /* 1279 - * Get user address base and length for this iovec. 1280 - */ 1281 - error = get_user(base, &iov->iov_base); 1282 - if (unlikely(error)) 1248 + error = -EFAULT; 1249 + if (copy_from_user_mmap_sem(&entry, iov, sizeof(entry))) 1283 1250 break; 1284 - error = get_user(len, &iov->iov_len); 1285 - if (unlikely(error)) 1286 - break; 1251 + 1252 + base = entry.iov_base; 1253 + len = entry.iov_len; 1287 1254 1288 1255 /* 1289 1256 * Sanity check this iovec. 0 read succeeds. 1290 1257 */ 1258 + error = 0; 1291 1259 if (unlikely(!len)) 1292 1260 break; 1293 1261 error = -EFAULT;