Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

blk-mq: convert to serialize updating nr_requests with update_nr_hwq_lock

request_queue->nr_requests can be changed by:

a) switching the elevator while updating nr_hw_queues
b) switching the elevator through the elevator sysfs attribute
c) writing the nr_requests queue sysfs attribute

Current lock order is:

1) update_nr_hwq_lock, case a,b
2) freeze_queue
3) elevator_lock, case a,b,c

Updating nr_requests is already serialized by elevator_lock. However,
in case c we have to allocate new sched_tags if nr_requests grows, and
doing this with elevator_lock held and the queue frozen risks a
deadlock.

Hence use update_nr_hwq_lock instead, which makes it possible to
allocate memory if tags grow while still preventing nr_requests from
being changed concurrently.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Authored by Yu Kuai and committed by Jens Axboe
626ff4f8 b46d4c44

+20 -5
block/blk-sysfs.c
···
 	int ret, err;
 	unsigned int memflags;
 	struct request_queue *q = disk->queue;
+	struct blk_mq_tag_set *set = q->tag_set;
 
 	ret = queue_var_store(&nr, page, count);
 	if (ret < 0)
 		return ret;
 
-	memflags = blk_mq_freeze_queue(q);
-	mutex_lock(&q->elevator_lock);
+	/*
+	 * Serialize updating nr_requests with concurrent queue_requests_store()
+	 * and switching elevator.
+	 */
+	down_write(&set->update_nr_hwq_lock);
 
 	if (nr == q->nr_requests)
 		goto unlock;
···
 	if (nr < BLKDEV_MIN_RQ)
 		nr = BLKDEV_MIN_RQ;
 
-	if (nr <= q->tag_set->reserved_tags ||
+	/*
+	 * Switching elevator is protected by update_nr_hwq_lock:
+	 *  - read lock is held from elevator sysfs attribute;
+	 *  - write lock is held from updating nr_hw_queues;
+	 * Hence it's safe to access q->elevator here with write lock held.
+	 */
+	if (nr <= set->reserved_tags ||
 	    (q->elevator && nr > MAX_SCHED_RQ) ||
-	    (!q->elevator && nr > q->tag_set->queue_depth)) {
+	    (!q->elevator && nr > set->queue_depth)) {
 		ret = -EINVAL;
 		goto unlock;
 	}
+
+	memflags = blk_mq_freeze_queue(q);
+	mutex_lock(&q->elevator_lock);
 
 	err = blk_mq_update_nr_requests(disk->queue, nr);
 	if (err)
 		ret = err;
 
-unlock:
 	mutex_unlock(&q->elevator_lock);
 	blk_mq_unfreeze_queue(q, memflags);
+
+unlock:
+	up_write(&set->update_nr_hwq_lock);
 	return ret;
 }