workqueue: Remove WORK_OFFQ_CANCELING

cancel[_delayed]_work_sync() guarantees that it can shut down
self-requeueing work items. To achieve that, it grabs the
WORK_STRUCT_PENDING bit and keeps it set while flushing the currently
executing instance. As the PENDING bit is set, all queueing attempts,
including the self-requeueing ones, fail. Once the currently executing
instance is flushed, the work item should be idle as long as no one else is
actively queueing it.
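
To make the role of the PENDING bit concrete, here is a minimal userspace
sketch using C11 atomics. It is not the kernel implementation: struct
toy_work, TOY_PENDING and the toy_*() helpers are hypothetical names invented
for illustration, and only the "whoever sets the bit owns the item" rule is
modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_PENDING 1UL	/* stands in for the WORK_STRUCT_PENDING bit */

/* Hypothetical userspace stand-in for struct work_struct. */
struct toy_work {
	atomic_ulong data;	/* flag word; only bit 0 is modeled */
	unsigned int nr_queued;	/* number of successful queueings */
};

static void toy_work_init(struct toy_work *w)
{
	atomic_init(&w->data, 0);
	w->nr_queued = 0;
}

/* Atomically set PENDING; true only for the caller that flipped it 0 -> 1. */
static bool toy_claim_pending(struct toy_work *w)
{
	return !(atomic_fetch_or(&w->data, TOY_PENDING) & TOY_PENDING);
}

/* Queueing path: a caller that fails to claim PENDING simply gives up. */
static bool toy_queue_work(struct toy_work *w)
{
	if (!toy_claim_pending(w))
		return false;
	w->nr_queued++;
	return true;
}

/* Release PENDING, as a canceler would once its flush has finished. */
static void toy_release_pending(struct toy_work *w)
{
	assert(atomic_load(&w->data) & TOY_PENDING);
	atomic_fetch_and(&w->data, ~TOY_PENDING);
}
```

While a canceler holds the bit via toy_claim_pending(), every
toy_queue_work() call fails, including one issued by the work function
itself. Queueing succeeds again only after toy_release_pending().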

This means that the cancel_work_sync path may hold the PENDING bit set while
flushing the target work item. This isn't a problem for the queueing path -
it can just fail, which is the desired effect. It doesn't affect flush. It
doesn't matter to cancel_work either, as it can just report that the work
item has been successfully canceled. However, if there's another
cancel_work_sync attempt on the work item, it can't simply fail or report
success, as that would breach the guarantee it is supposed to provide.
cancel_work_sync has to wait for and grab that PENDING bit and go through
the motions.

WORK_OFFQ_CANCELING and wq_cancel_waitq are what implement this
cancel_work_sync to cancel_work_sync wait mechanism. When a work item is
being canceled, WORK_OFFQ_CANCELING is also set on it and other
cancel_work_sync attempts wait on the bit to be cleared using the wait
queue.

While this works, it's an isolated wart which doesn't jibe with the rest of
the flush and cancel mechanisms and forces enable_work() and disable_work()
to require a sleepable context, which hampers their usability.

Now that a work item can be disabled, we can use that to block queueing
while cancel_work_sync is in progress. Instead of holding the PENDING bit,
it can temporarily disable the work item, flush and then re-enable it. That
achieves the same end result of blocking queueing while canceling and thus
still allows self-requeueing work items to be canceled.
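
The disable, flush, re-enable sequence can be sketched as a single-threaded
toy model. Everything below (struct sim_work and the sim_*() helpers) is
hypothetical and only mirrors the ordering described above. The real code
packs the disable depth into work->data and deals with concurrency, which
this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model of a work item; not kernel code. */
struct sim_work {
	bool pending;			/* queued and waiting to run */
	bool running;			/* an instance is executing right now */
	unsigned int disable_depth;	/* queueing is refused while > 0 */
};

/* Queueing fails if the item is already pending or currently disabled. */
static bool sim_queue_work(struct sim_work *w)
{
	if (w->pending || w->disable_depth)
		return false;
	w->pending = true;
	return true;
}

/* Take the item off the queue and bump the disable depth. */
static bool sim_cancel_and_disable(struct sim_work *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	w->disable_depth++;
	return was_pending;
}

/* Wait for the running instance, which tries to requeue itself on exit. */
static void sim_flush(struct sim_work *w)
{
	if (w->running) {
		(void)sim_queue_work(w);	/* self-requeue attempt, refused */
		w->running = false;
	}
}

/* Drop one disable level; true once the item is queueable again. */
static bool sim_enable_work(struct sim_work *w)
{
	assert(w->disable_depth > 0);
	return --w->disable_depth == 0;
}

/* Cancel, flush, then undo the temporary disable. */
static bool sim_cancel_work_sync(struct sim_work *w)
{
	bool was_pending = sim_cancel_and_disable(w);

	sim_flush(w);
	sim_enable_work(w);
	return was_pending;
}
```

Because the disable depth is raised before the flush, the self-requeue
attempt made by the running instance is refused, so the item is idle when
sim_cancel_work_sync() returns and can be queued normally afterwards.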

- WORK_OFFQ_CANCELING and the surrounding mechanisms are removed.

- work_grab_pending() is now simpler, no longer has to wait for a blocking
operation and thus can be called from any context.

- With work_grab_pending() simplified, no need to use try_to_grab_pending()
directly. All users are converted to use work_grab_pending().

- __cancel_work_sync() is updated to call __cancel_work() with
WORK_CANCEL_DISABLE to cancel and plug racing queueing attempts. It then
flushes and re-enables the work item if necessary.

- These changes allow disable_work() and enable_work() to be called from any
context.

v2: Lai pointed out that mod_delayed_work_on() needs to check the disable
count before queueing the delayed work item. Added
clear_pending_if_disabled() call.
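
The v2 fix can be illustrated with a small model of the modify path. The
names below (struct sim_dwork and the sim_*() helpers) are hypothetical and
the model is single-threaded. It only shows why the disable count has to be
checked after PENDING is grabbed and before the timer is armed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a delayed work item; not the kernel's struct. */
struct sim_dwork {
	bool pending;
	unsigned int disable_depth;
	unsigned long expires;	/* timer expiry, meaningful only while pending */
};

/* Caller owns PENDING. If the item is disabled, give PENDING back. */
static bool sim_clear_pending_if_disabled(struct sim_dwork *d)
{
	assert(d->pending);
	if (!d->disable_depth)
		return false;
	d->pending = false;
	return true;
}

/* Grab PENDING, then (re)arm the timer unless the item is disabled. */
static bool sim_mod_delayed_work(struct sim_dwork *d, unsigned long now,
				 unsigned long delay)
{
	bool was_pending = d->pending;

	d->pending = true;	/* models grabbing PENDING */
	if (!sim_clear_pending_if_disabled(d))
		d->expires = now + delay;
	return was_pending;
}
```

Without the disabled check, a modify request on a disabled item would arm
the timer and queue the item despite the non-zero disable count.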

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>

Tejun Heo 2 years ago f09b10b6 86898fa6

+20 -124

2 changed files

+1 -3

include/linux/workqueue.h

···
 	 *
 	 * MSB
 	 * [    pool ID    ] [ disable depth ] [ OFFQ flags ] [ STRUCT flags ]
-	 *                        16 bits          1 bit        4 or 5 bits
+	 *                        16 bits          0 bits       4 or 5 bits
 	 */
 	WORK_OFFQ_FLAG_SHIFT	= WORK_STRUCT_FLAG_BITS,
-	WORK_OFFQ_CANCELING_BIT = WORK_OFFQ_FLAG_SHIFT,
 	WORK_OFFQ_FLAG_END,
 	WORK_OFFQ_FLAG_BITS	= WORK_OFFQ_FLAG_END - WORK_OFFQ_FLAG_SHIFT,
 
···
 };
 
 /* Convenience constants - of type 'unsigned long', not 'enum'! */
-#define WORK_OFFQ_CANCELING	(1ul << WORK_OFFQ_CANCELING_BIT)
 #define WORK_OFFQ_FLAG_MASK	(((1ul << WORK_OFFQ_FLAG_BITS) - 1) << WORK_OFFQ_FLAG_SHIFT)
 #define WORK_OFFQ_DISABLE_MASK	(((1ul << WORK_OFFQ_DISABLE_BITS) - 1) << WORK_OFFQ_DISABLE_SHIFT)
 #define WORK_OFFQ_POOL_NONE	((1ul << WORK_OFFQ_POOL_BITS) - 1)

+19 -121

kernel/workqueue.c

···
 static struct workqueue_attrs *ordered_wq_attrs[NR_STD_WORKER_POOLS];
 
 /*
- * Used to synchronize multiple cancel_sync attempts on the same work item. See
- * work_grab_pending() and __cancel_work_sync().
- */
-static DECLARE_WAIT_QUEUE_HEAD(wq_cancel_waitq);
-
-/*
  * I: kthread_worker to release pwq's. pwq release needs to be bounced to a
  * process context while holding a pool lock. Bounce to a dedicated kthread
  * worker to avoid A-A deadlocks.
···
  * corresponding to a work. Pool is available once the work has been
  * queued anywhere after initialization until it is sync canceled. pwq is
  * available only while the work item is queued.
- *
- * %WORK_OFFQ_CANCELING is used to mark a work item which is being
- * canceled. While being canceled, a work item may have its PENDING set
- * but stay off timer and worklist for arbitrarily long and nobody should
- * try to steal the PENDING bit.
  */
 static inline void set_work_data(struct work_struct *work, unsigned long data)
 {
···
 {
 	return ((unsigned long)offqd->disable << WORK_OFFQ_DISABLE_SHIFT) |
 		((unsigned long)offqd->flags);
-}
-
-static bool work_is_canceling(struct work_struct *work)
-{
-	unsigned long data = atomic_long_read(&work->data);
-
-	return !(data & WORK_STRUCT_PWQ) && (data & WORK_OFFQ_CANCELING);
 }
 
 /*
···
  *  1		if @work was pending and we successfully stole PENDING
  *  0		if @work was idle and we claimed PENDING
  *  -EAGAIN	if PENDING couldn't be grabbed at the moment, safe to busy-retry
- *  -ENOENT	if someone else is canceling @work, this state may persist
- *		for arbitrarily long
  *  ========	================================================================
  *
  * Note:
···
 fail:
 	rcu_read_unlock();
 	local_irq_restore(*irq_flags);
-	if (work_is_canceling(work))
-		return -ENOENT;
-	cpu_relax();
 	return -EAGAIN;
-}
-
-struct cwt_wait {
-	wait_queue_entry_t	wait;
-	struct work_struct	*work;
-};
-
-static int cwt_wakefn(wait_queue_entry_t *wait, unsigned mode, int sync, void *key)
-{
-	struct cwt_wait *cwait = container_of(wait, struct cwt_wait, wait);
-
-	if (cwait->work != key)
-		return 0;
-	return autoremove_wake_function(wait, mode, sync, key);
 }
 
 /**
···
  * Grab PENDING bit of @work. @work can be in any stable state - idle, on timer
  * or on worklist.
  *
- * Must be called in process context. IRQ is disabled on return with IRQ state
+ * Can be called from any context. IRQ is disabled on return with IRQ state
  * stored in *@irq_flags. The caller is responsible for re-enabling it using
  * local_irq_restore().
  *
···
 static bool work_grab_pending(struct work_struct *work, u32 cflags,
 			      unsigned long *irq_flags)
 {
-	struct cwt_wait cwait;
 	int ret;
 
-	might_sleep();
-repeat:
-	ret = try_to_grab_pending(work, cflags, irq_flags);
-	if (likely(ret >= 0))
-		return ret;
-	if (ret != -ENOENT)
-		goto repeat;
-
-	/*
-	 * Someone is already canceling. Wait for it to finish. flush_work()
-	 * doesn't work for PREEMPT_NONE because we may get woken up between
-	 * @work's completion and the other canceling task resuming and clearing
-	 * CANCELING - flush_work() will return false immediately as @work is no
-	 * longer busy, try_to_grab_pending() will return -ENOENT as @work is
-	 * still being canceled and the other canceling task won't be able to
-	 * clear CANCELING as we're hogging the CPU.
-	 *
-	 * Let's wait for completion using a waitqueue. As this may lead to the
-	 * thundering herd problem, use a custom wake function which matches
-	 * @work along with exclusive wait and wakeup.
-	 */
-	init_wait(&cwait.wait);
-	cwait.wait.func = cwt_wakefn;
-	cwait.work = work;
-
-	prepare_to_wait_exclusive(&wq_cancel_waitq, &cwait.wait,
-				  TASK_UNINTERRUPTIBLE);
-	if (work_is_canceling(work))
-		schedule();
-	finish_wait(&wq_cancel_waitq, &cwait.wait);
-
-	goto repeat;
+	while (true) {
+		ret = try_to_grab_pending(work, cflags, irq_flags);
+		if (ret >= 0)
+			return ret;
+		cpu_relax();
+	}
 }
 
 /**
···
 			 struct delayed_work *dwork, unsigned long delay)
 {
 	unsigned long irq_flags;
-	int ret;
+	bool ret;
 
-	do {
-		ret = try_to_grab_pending(&dwork->work, WORK_CANCEL_DELAYED,
-					  &irq_flags);
-	} while (unlikely(ret == -EAGAIN));
+	ret = work_grab_pending(&dwork->work, WORK_CANCEL_DELAYED, &irq_flags);
 
-	if (likely(ret >= 0)) {
+	if (!clear_pending_if_disabled(&dwork->work))
 		__queue_delayed_work(cpu, wq, dwork, delay);
-		local_irq_restore(irq_flags);
-	}
 
-	/* -ENOENT from try_to_grab_pending() becomes %true */
+	local_irq_restore(irq_flags);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(mod_delayed_work_on);
···
 	unsigned long irq_flags;
 	int ret;
 
-	if (cflags & WORK_CANCEL_DISABLE) {
-		ret = work_grab_pending(work, cflags, &irq_flags);
-	} else {
-		do {
-			ret = try_to_grab_pending(work, cflags, &irq_flags);
-		} while (unlikely(ret == -EAGAIN));
-
-		if (unlikely(ret < 0))
-			return false;
-	}
+	ret = work_grab_pending(work, cflags, &irq_flags);
 
 	work_offqd_unpack(&offqd, *work_data_bits(work));
 
···
 
 static bool __cancel_work_sync(struct work_struct *work, u32 cflags)
 {
-	struct work_offq_data offqd;
-	unsigned long irq_flags;
 	bool ret;
 
-	/* claim @work and tell other tasks trying to grab @work to back off */
-	ret = work_grab_pending(work, cflags, &irq_flags);
-
-	work_offqd_unpack(&offqd, *work_data_bits(work));
-
-	if (cflags & WORK_CANCEL_DISABLE)
-		work_offqd_disable(&offqd);
-
-	offqd.flags |= WORK_OFFQ_CANCELING;
-	set_work_pool_and_keep_pending(work, offqd.pool_id,
-				       work_offqd_pack_flags(&offqd));
-	local_irq_restore(irq_flags);
+	ret = __cancel_work(work, cflags | WORK_CANCEL_DISABLE);
 
 	/*
 	 * Skip __flush_work() during early boot when we know that @work isn't
···
 	if (wq_online)
 		__flush_work(work, true);
 
-	work_offqd_unpack(&offqd, *work_data_bits(work));
-
-	/*
-	 * smp_mb() at the end of set_work_pool_and_clear_pending() is paired
-	 * with prepare_to_wait() above so that either waitqueue_active() is
-	 * visible here or !work_is_canceling() is visible there.
-	 */
-	offqd.flags &= ~WORK_OFFQ_CANCELING;
-	set_work_pool_and_clear_pending(work, WORK_OFFQ_POOL_NONE,
-					work_offqd_pack_flags(&offqd));
-
-	if (waitqueue_active(&wq_cancel_waitq))
-		__wake_up(&wq_cancel_waitq, TASK_NORMAL, 1, work);
+	if (!(cflags & WORK_CANCEL_DISABLE))
+		enable_work(work);
 
 	return ret;
 }
···
  * will fail and return %false. The maximum supported disable depth is 2 to the
  * power of %WORK_OFFQ_DISABLE_BITS, currently 65536.
  *
- * Must be called from a sleepable context. Returns %true if @work was pending,
- * %false otherwise.
+ * Can be called from any context. Returns %true if @work was pending, %false
+ * otherwise.
  */
 bool disable_work(struct work_struct *work)
 {
···
  * Undo disable_work[_sync]() by decrementing @work's disable count. @work can
  * only be queued if its disable count is 0.
  *
- * Must be called from a sleepable context. Returns %true if the disable count
- * reached 0. Otherwise, %false.
+ * Can be called from any context. Returns %true if the disable count reached 0.
+ * Otherwise, %false.
  */
 bool enable_work(struct work_struct *work)
 {
