Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'dm-3.11-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm

Pull device-mapper changes from Alasdair G Kergon:
"Add a device-mapper target called dm-switch to provide a multipath
framework for storage arrays that dynamically reconfigure their
preferred paths for different device regions.

Fix a bug in the verity target that prevented its use with some
specific sizes of devices.

Improve some locking mechanisms in the device-mapper core and bufio.

Add Mike Snitzer as a device-mapper maintainer.

A few more clean-ups and fixes"

* tag 'dm-3.11-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm:
dm: add switch target
dm: update maintainers
dm: optimize reorder structure
dm: optimize use SRCU and RCU
dm bufio: submit writes outside lock
dm cache: fix arm link errors with inline
dm verity: use __ffs and __fls
dm flakey: correct ctr alloc failure mesg
dm verity: remove pointless comparison
dm: use __GFP_HIGHMEM in __vmalloc
dm verity: fix inability to use a few specific devices sizes
dm ioctl: set noio flag to avoid __vmalloc deadlock
dm mpath: fix ioctl deadlock when no paths

+951 -185
+126
Documentation/device-mapper/switch.txt
···
+dm-switch
+=========
+
+The device-mapper switch target creates a device that supports an
+arbitrary mapping of fixed-size regions of I/O across a fixed set of
+paths. The path used for any specific region can be switched
+dynamically by sending the target a message.
+
+It maps I/O to underlying block devices efficiently when there is a large
+number of fixed-sized address regions but there is no simple pattern
+that would allow for a compact representation of the mapping such as
+dm-stripe.
+
+Background
+----------
+
+Dell EqualLogic and some other iSCSI storage arrays use a distributed
+frameless architecture. In this architecture, the storage group
+consists of a number of distinct storage arrays ("members") each having
+independent controllers, disk storage and network adapters. When a LUN
+is created it is spread across multiple members. The details of the
+spreading are hidden from initiators connected to this storage system.
+The storage group exposes a single target discovery portal, no matter
+how many members are being used. When iSCSI sessions are created, each
+session is connected to an eth port on a single member. Data to a LUN
+can be sent on any iSCSI session, and if the blocks being accessed are
+stored on another member the I/O will be forwarded as required. This
+forwarding is invisible to the initiator. The storage layout is also
+dynamic, and the blocks stored on disk may be moved from member to
+member as needed to balance the load.
+
+This architecture simplifies the management and configuration of both
+the storage group and initiators. In a multipathing configuration, it
+is possible to set up multiple iSCSI sessions to use multiple network
+interfaces on both the host and target to take advantage of the
+increased network bandwidth. An initiator could use a simple round
+robin algorithm to send I/O across all paths and let the storage array
+members forward it as necessary, but there is a performance advantage to
+sending data directly to the correct member.
+
+A device-mapper table already lets you map different regions of a
+device onto different targets. However in this architecture the LUN is
+spread with an address region size on the order of 10s of MBs, which
+means the resulting table could have more than a million entries and
+consume far too much memory.
+
+Using this device-mapper switch target we can now build a two-layer
+device hierarchy:
+
+    Upper Tier – Determine which array member the I/O should be sent to.
+    Lower Tier – Load balance amongst paths to a particular member.
+
+The lower tier consists of a single dm multipath device for each member.
+Each of these multipath devices contains the set of paths directly to
+the array member in one priority group, and leverages existing path
+selectors to load balance amongst these paths. We also build a
+non-preferred priority group containing paths to other array members for
+failover reasons.
+
+The upper tier consists of a single dm-switch device. This device uses
+a bitmap to look up the location of the I/O and choose the appropriate
+lower tier device to route the I/O. By using a bitmap we are able to
+use 4 bits for each address range in a 16 member group (which is very
+large for us). This is a much denser representation than the dm table
+b-tree can achieve.
+
+Construction Parameters
+=======================
+
+    <num_paths> <region_size> <num_optional_args> [<optional_args>...]
+    [<dev_path> <offset>]+
+
+<num_paths>
+    The number of paths across which to distribute the I/O.
+
+<region_size>
+    The number of 512-byte sectors in a region. Each region can be redirected
+    to any of the available paths.
+
+<num_optional_args>
+    The number of optional arguments. Currently, no optional arguments
+    are supported and so this must be zero.
+
+<dev_path>
+    The block device that represents a specific path to the device.
+
+<offset>
+    The offset of the start of data on the specific <dev_path> (in units
+    of 512-byte sectors). This number is added to the sector number when
+    forwarding the request to the specific path. Typically it is zero.
+
+Messages
+========
+
+set_region_mappings <index>:<path_nr> [<index>]:<path_nr> [<index>]:<path_nr>...
+
+Modify the region table by specifying which regions are redirected to
+which paths.
+
+<index>
+    The region number (region size was specified in constructor parameters).
+    If index is omitted, the next region (previous index + 1) is used.
+    Expressed in hexadecimal (WITHOUT any prefix like 0x).
+
+<path_nr>
+    The path number in the range 0 ... (<num_paths> - 1).
+    Expressed in hexadecimal (WITHOUT any prefix like 0x).
+
+Status
+======
+
+No status line is reported.
+
+Example
+=======
+
+Assume that you have volumes vg1/switch0 vg1/switch1 vg1/switch2 with
+the same size.
+
+Create a switch device with 64kB region size:
+    dmsetup create switch --table "0 `blockdev --getsize /dev/vg1/switch0`
+	switch 3 128 0 /dev/vg1/switch0 0 /dev/vg1/switch1 0 /dev/vg1/switch2 0"
+
+Set mappings for the first 7 entries to point to devices switch0, switch1,
+switch2, switch0, switch1, switch2, switch1:
+    dmsetup message switch 0 set_region_mappings 0:0 :1 :2 :0 :1 :2 :1
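The packed region table and the set_region_mappings message format described in switch.txt can be modelled in user space. The following is a minimal sketch, not the kernel implementation: the type `struct region_table` and the functions `region_table_init`, `region_table_set`, `region_table_get` and `region_table_apply` are hypothetical names introduced here for illustration. It assumes that an omitted index in the first argument means region 0 (the caller seeds `next` with 0), and it parses numbers with `strtoul`, which is more permissive than the documented format because it tolerates a `0x` prefix.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define SLOT_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical user-space model of a packed region table (not kernel code). */
struct region_table {
	unsigned long *slots;		/* packed path numbers */
	unsigned long nr_regions;	/* number of regions in the device */
	unsigned nr_paths;		/* valid path numbers are 0..nr_paths-1 */
	unsigned entry_bits;		/* bits used by one region entry */
	unsigned entries_per_slot;	/* entries packed into one slot */
};

/* Store a path number in the entry belonging to one region. */
static void region_table_set(struct region_table *t, unsigned long region,
			     unsigned path)
{
	unsigned long idx = region / t->entries_per_slot;
	unsigned shift = (unsigned)(region % t->entries_per_slot) * t->entry_bits;
	unsigned long mask = ((1UL << t->entry_bits) - 1) << shift;

	assert(region < t->nr_regions && path < t->nr_paths);
	t->slots[idx] = (t->slots[idx] & ~mask) | ((unsigned long)path << shift);
}

/* Read back the path number stored for one region. */
static unsigned region_table_get(const struct region_table *t,
				 unsigned long region)
{
	unsigned long idx = region / t->entries_per_slot;
	unsigned shift = (unsigned)(region % t->entries_per_slot) * t->entry_bits;

	assert(region < t->nr_regions);
	return (unsigned)((t->slots[idx] >> shift) & ((1UL << t->entry_bits) - 1));
}

/*
 * Size the entries so that every path number fits (4 bits for 16 paths),
 * allocate the slots and fill them with a round-robin pattern.
 * Returns 0 on success, -1 on bad arguments or allocation failure.
 */
static int region_table_init(struct region_table *t, unsigned long nr_regions,
			     unsigned nr_paths)
{
	unsigned long nr_slots, i;

	if (!nr_regions || !nr_paths)
		return -1;

	t->entry_bits = 1;
	while (t->entry_bits < SLOT_BITS && (1UL << t->entry_bits) < nr_paths)
		t->entry_bits++;
	if (t->entry_bits >= SLOT_BITS)
		return -1;

	t->entries_per_slot = (unsigned)(SLOT_BITS / t->entry_bits);
	t->nr_regions = nr_regions;
	t->nr_paths = nr_paths;

	nr_slots = nr_regions / t->entries_per_slot +
		   (nr_regions % t->entries_per_slot != 0);
	t->slots = calloc(nr_slots, sizeof(*t->slots));
	if (!t->slots)
		return -1;

	for (i = 0; i < nr_regions; i++)
		region_table_set(t, i, (unsigned)(i % nr_paths));
	return 0;
}

/*
 * Apply one "<index>:<path_nr>" or ":<path_nr>" argument (hexadecimal).
 * *next carries "previous index + 1" between calls.
 * Returns 0 on success, -1 on a malformed or out-of-range argument.
 */
static int region_table_apply(struct region_table *t, const char *arg,
			      unsigned long *next)
{
	unsigned long region = *next, path;
	char *end;

	if (*arg != ':') {
		region = strtoul(arg, &end, 16);
		if (end == arg || *end != ':')
			return -1;
		arg = end;
	}
	path = strtoul(arg + 1, &end, 16);
	if (end == arg + 1 || *end != '\0')
		return -1;
	if (region >= t->nr_regions || path >= t->nr_paths)
		return -1;

	region_table_set(t, region, (unsigned)path);
	*next = region + 1;
	return 0;
}
```

Feeding the arguments from the documentation example (`0:0 :1 :2 :0 :1 :2 :1`) to `region_table_apply` one at a time, with `next` starting at 0, updates regions 0 through 6 and leaves later regions on their initial round-robin paths.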
+2
MAINTAINERS
···

 DEVICE-MAPPER (LVM)
 M:	Alasdair Kergon <agk@redhat.com>
+M:	Mike Snitzer <snitzer@redhat.com>
 M:	dm-devel@redhat.com
 L:	dm-devel@redhat.com
 W:	http://sources.redhat.com/dm
···
 F:	drivers/md/persistent-data/
 F:	include/linux/device-mapper.h
 F:	include/linux/dm-*.h
+F:	include/uapi/linux/dm-*.h

 DIOLAN U2C-12 I2C DRIVER
 M:	Guenter Roeck <linux@roeck-us.net>
+14
drivers/md/Kconfig
···

 	  If unsure, say N.

+config DM_SWITCH
+	tristate "Switch target support (EXPERIMENTAL)"
+	depends on BLK_DEV_DM
+	---help---
+	  This device-mapper target creates a device that supports an arbitrary
+	  mapping of fixed-size regions of I/O across a fixed set of paths.
+	  The path used for any specific region can be switched dynamically
+	  by sending the target a message.
+
+	  To compile this code as a module, choose M here: the module will
+	  be called dm-switch.
+
+	  If unsure, say N.
+
 endif # MD
+1
drivers/md/Makefile
···
 obj-$(CONFIG_DM_MULTIPATH)	+= dm-multipath.o dm-round-robin.o
 obj-$(CONFIG_DM_MULTIPATH_QL)	+= dm-queue-length.o
 obj-$(CONFIG_DM_MULTIPATH_ST)	+= dm-service-time.o
+obj-$(CONFIG_DM_SWITCH)		+= dm-switch.o
 obj-$(CONFIG_DM_SNAPSHOT)	+= dm-snapshot.o
 obj-$(CONFIG_DM_PERSISTENT_DATA)	+= persistent-data/
 obj-$(CONFIG_DM_MIRROR)		+= dm-mirror.o dm-log.o dm-region-hash.o
+59 -16
drivers/md/dm-bufio.c
···
 	unsigned long state;
 	unsigned long last_accessed;
 	struct dm_bufio_client *c;
+	struct list_head write_list;
 	struct bio bio;
 	struct bio_vec bio_vec[DM_BUFIO_INLINE_VECS];
 };
···
 	if (gfp_mask & __GFP_NORETRY)
 		noio_flag = memalloc_noio_save();

-	ptr = __vmalloc(c->block_size, gfp_mask, PAGE_KERNEL);
+	ptr = __vmalloc(c->block_size, gfp_mask | __GFP_HIGHMEM, PAGE_KERNEL);

 	if (gfp_mask & __GFP_NORETRY)
 		memalloc_noio_restore(noio_flag);
···
  * - Submit our write and don't wait on it. We set B_WRITING indicating
  *   that there is a write in progress.
  */
-static void __write_dirty_buffer(struct dm_buffer *b)
+static void __write_dirty_buffer(struct dm_buffer *b,
+				 struct list_head *write_list)
 {
 	if (!test_bit(B_DIRTY, &b->state))
 		return;
···
 	wait_on_bit_lock(&b->state, B_WRITING,
 			 do_io_schedule, TASK_UNINTERRUPTIBLE);

-	submit_io(b, WRITE, b->block, write_endio);
+	if (!write_list)
+		submit_io(b, WRITE, b->block, write_endio);
+	else
+		list_add_tail(&b->write_list, write_list);
+}
+
+static void __flush_write_list(struct list_head *write_list)
+{
+	struct blk_plug plug;
+	blk_start_plug(&plug);
+	while (!list_empty(write_list)) {
+		struct dm_buffer *b =
+			list_entry(write_list->next, struct dm_buffer, write_list);
+		list_del(&b->write_list);
+		submit_io(b, WRITE, b->block, write_endio);
+		dm_bufio_cond_resched();
+	}
+	blk_finish_plug(&plug);
 }

 /*
···
 		return;

 	wait_on_bit(&b->state, B_READING, do_io_schedule, TASK_UNINTERRUPTIBLE);
-	__write_dirty_buffer(b);
+	__write_dirty_buffer(b, NULL);
 	wait_on_bit(&b->state, B_WRITING, do_io_schedule, TASK_UNINTERRUPTIBLE);
 }

···
 	wake_up(&c->free_buffer_wait);
 }

-static void __write_dirty_buffers_async(struct dm_bufio_client *c, int no_wait)
+static void __write_dirty_buffers_async(struct dm_bufio_client *c, int no_wait,
+					struct list_head *write_list)
 {
 	struct dm_buffer *b, *tmp;

···
 		if (no_wait && test_bit(B_WRITING, &b->state))
 			return;

-		__write_dirty_buffer(b);
+		__write_dirty_buffer(b, write_list);
 		dm_bufio_cond_resched();
 	}
 }
···
  * If we are over threshold_buffers, start freeing buffers.
  * If we're over "limit_buffers", block until we get under the limit.
  */
-static void __check_watermark(struct dm_bufio_client *c)
+static void __check_watermark(struct dm_bufio_client *c,
+			      struct list_head *write_list)
 {
 	unsigned long threshold_buffers, limit_buffers;

···
 	}

 	if (c->n_buffers[LIST_DIRTY] > threshold_buffers)
-		__write_dirty_buffers_async(c, 1);
+		__write_dirty_buffers_async(c, 1, write_list);
 }

 /*
···
  *--------------------------------------------------------------*/

 static struct dm_buffer *__bufio_new(struct dm_bufio_client *c, sector_t block,
-				     enum new_flag nf, int *need_submit)
+				     enum new_flag nf, int *need_submit,
+				     struct list_head *write_list)
 {
 	struct dm_buffer *b, *new_b = NULL;

···
 		goto found_buffer;
 	}

-	__check_watermark(c);
+	__check_watermark(c, write_list);

 	b = new_b;
 	b->hold_count = 1;
···
 	int need_submit;
 	struct dm_buffer *b;

+	LIST_HEAD(write_list);
+
 	dm_bufio_lock(c);
-	b = __bufio_new(c, block, nf, &need_submit);
+	b = __bufio_new(c, block, nf, &need_submit, &write_list);
 	dm_bufio_unlock(c);
+
+	__flush_write_list(&write_list);

 	if (!b)
 		return b;
···
 {
 	struct blk_plug plug;

+	LIST_HEAD(write_list);
+
 	BUG_ON(dm_bufio_in_request());

 	blk_start_plug(&plug);
···
 	for (; n_blocks--; block++) {
 		int need_submit;
 		struct dm_buffer *b;
-		b = __bufio_new(c, block, NF_PREFETCH, &need_submit);
+		b = __bufio_new(c, block, NF_PREFETCH, &need_submit,
+				&write_list);
+		if (unlikely(!list_empty(&write_list))) {
+			dm_bufio_unlock(c);
+			blk_finish_plug(&plug);
+			__flush_write_list(&write_list);
+			blk_start_plug(&plug);
+			dm_bufio_lock(c);
+		}
 		if (unlikely(b != NULL)) {
 			dm_bufio_unlock(c);

···
 				goto flush_plug;
 			dm_bufio_lock(c);
 		}
-
 	}

 	dm_bufio_unlock(c);
···

 void dm_bufio_write_dirty_buffers_async(struct dm_bufio_client *c)
 {
+	LIST_HEAD(write_list);
+
 	BUG_ON(dm_bufio_in_request());

 	dm_bufio_lock(c);
-	__write_dirty_buffers_async(c, 0);
+	__write_dirty_buffers_async(c, 0, &write_list);
 	dm_bufio_unlock(c);
+	__flush_write_list(&write_list);
 }
 EXPORT_SYMBOL_GPL(dm_bufio_write_dirty_buffers_async);

···
 	unsigned long buffers_processed = 0;
 	struct dm_buffer *b, *tmp;

+	LIST_HEAD(write_list);
+
 	dm_bufio_lock(c);
-	__write_dirty_buffers_async(c, 0);
+	__write_dirty_buffers_async(c, 0, &write_list);
+	dm_bufio_unlock(c);
+	__flush_write_list(&write_list);
+	dm_bufio_lock(c);

 again:
 	list_for_each_entry_safe_reverse(b, tmp, &c->lru[LIST_DIRTY], lru_list) {
···
 	BUG_ON(!b->hold_count);
 	BUG_ON(test_bit(B_READING, &b->state));

-	__write_dirty_buffer(b);
+	__write_dirty_buffer(b, NULL);
 	if (b->hold_count == 1) {
 		wait_on_bit(&b->state, B_WRITING,
 			    do_io_schedule, TASK_UNINTERRUPTIBLE);
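The dm-bufio change above follows a common pattern: while the lock is held, dirty buffers are only queued on a caller-private list, and the slow submission happens after the lock is dropped. The sketch below shows that pattern in user space with a pthread mutex. It is an illustration under stated assumptions, not the kernel code: `struct buf`, `struct cache`, `submit_write`, `collect_dirty`, `flush_write_list` and `write_dirty_async` are hypothetical names, and a singly linked list stands in for the kernel's `list_head`.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical buffer and cache types, for illustration only. */
struct buf {
	int dirty;		/* buffer needs writing */
	int submitted;		/* counts calls to submit_write() */
	struct buf *next;	/* all buffers owned by the cache */
	struct buf *next_write;	/* link in a caller-private write list */
};

struct cache {
	pthread_mutex_t lock;
	struct buf *all;
};

/* Stand-in for slow I/O submission; it should not run under c->lock. */
static void submit_write(struct buf *b)
{
	b->submitted++;
}

/*
 * Caller holds c->lock. Dirty buffers are appended to *head in order
 * instead of being submitted right away.
 */
static void collect_dirty(struct cache *c, struct buf **head)
{
	struct buf **tail = head;
	struct buf *b;

	assert(head && *head == NULL);
	for (b = c->all; b; b = b->next) {
		if (!b->dirty)
			continue;
		b->dirty = 0;
		b->next_write = NULL;
		*tail = b;
		tail = &b->next_write;
	}
}

/* Caller does not hold c->lock. Returns the number of writes submitted. */
static unsigned flush_write_list(struct buf *head)
{
	unsigned n = 0;

	while (head) {
		struct buf *b = head;

		head = b->next_write;
		b->next_write = NULL;
		submit_write(b);
		n++;
	}
	return n;
}

/* Queue under the lock, submit outside it. */
static unsigned write_dirty_async(struct cache *c)
{
	struct buf *write_list = NULL;

	pthread_mutex_lock(&c->lock);
	collect_dirty(c, &write_list);
	pthread_mutex_unlock(&c->lock);

	return flush_write_list(write_list);
}
```

Because the write list lives on the caller's stack and is only touched by that caller after the unlock, other threads can take the cache lock while the submissions are in progress.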
+4
drivers/md/dm-cache-target.c
···
 	return cache->sectors_per_block_shift >= 0;
 }

+/* gcc on ARM generates spurious references to __udivdi3 and __umoddi3 */
+#if defined(CONFIG_ARM) && __GNUC__ == 4 && __GNUC_MINOR__ <= 6
+__always_inline
+#endif
 static dm_block_t block_div(dm_block_t b, uint32_t n)
 {
 	do_div(b, n);
+1 -1
drivers/md/dm-flakey.c
···

 	fc = kzalloc(sizeof(*fc), GFP_KERNEL);
 	if (!fc) {
-		ti->error = "Cannot allocate linear context";
+		ti->error = "Cannot allocate context";
 		return -ENOMEM;
 	}
 	fc->start_time = jiffies;
+86 -41
drivers/md/dm-ioctl.c
···
 	struct dm_table *new_map;
 };

+/*
+ * A dummy definition to make RCU happy.
+ * struct dm_table should never be dereferenced in this file.
+ */
+struct dm_table {
+	int undefined__;
+};
+
 struct vers_iter {
 	size_t param_size;
 	struct dm_target_versions *vers, *old_vers;
···
 	return -EBUSY;
 }

-static void __hash_remove(struct hash_cell *hc)
+static struct dm_table *__hash_remove(struct hash_cell *hc)
 {
 	struct dm_table *table;
+	int srcu_idx;

 	/* remove from the dev hash */
 	list_del(&hc->uuid_list);
···
 	dm_set_mdptr(hc->md, NULL);
 	mutex_unlock(&dm_hash_cells_mutex);

-	table = dm_get_live_table(hc->md);
-	if (table) {
+	table = dm_get_live_table(hc->md, &srcu_idx);
+	if (table)
 		dm_table_event(table);
-		dm_table_put(table);
-	}
+	dm_put_live_table(hc->md, srcu_idx);

+	table = NULL;
 	if (hc->new_map)
-		dm_table_destroy(hc->new_map);
+		table = hc->new_map;
 	dm_put(hc->md);
 	free_cell(hc);
+
+	return table;
 }

 static void dm_hash_remove_all(int keep_open_devices)
···
 	int i, dev_skipped;
 	struct hash_cell *hc;
 	struct mapped_device *md;
+	struct dm_table *t;

 retry:
 	dev_skipped = 0;
···
 				continue;
 			}

-			__hash_remove(hc);
+			t = __hash_remove(hc);

 			up_write(&_hash_lock);

+			if (t) {
+				dm_sync_table(md);
+				dm_table_destroy(t);
+			}
 			dm_put(md);
 			if (likely(keep_open_devices))
 				dm_destroy(md);
···
 	struct dm_table *table;
 	struct mapped_device *md;
 	unsigned change_uuid = (param->flags & DM_UUID_FLAG) ? 1 : 0;
+	int srcu_idx;

 	/*
 	 * duplicate new.
···
 	/*
 	 * Wake up any dm event waiters.
 	 */
-	table = dm_get_live_table(hc->md);
-	if (table) {
+	table = dm_get_live_table(hc->md, &srcu_idx);
+	if (table)
 		dm_table_event(table);
-		dm_table_put(table);
-	}
+	dm_put_live_table(hc->md, srcu_idx);

 	if (!dm_kobject_uevent(hc->md, KOBJ_CHANGE, param->event_nr))
 		param->flags |= DM_UEVENT_GENERATED_FLAG;
···
  * _hash_lock without first calling dm_table_put, because dm_table_destroy
  * waits for this dm_table_put and could be called under this lock.
  */
-static struct dm_table *dm_get_inactive_table(struct mapped_device *md)
+static struct dm_table *dm_get_inactive_table(struct mapped_device *md, int *srcu_idx)
 {
 	struct hash_cell *hc;
 	struct dm_table *table = NULL;
+
+	/* increment rcu count, we don't care about the table pointer */
+	dm_get_live_table(md, srcu_idx);

 	down_read(&_hash_lock);
 	hc = dm_get_mdptr(md);
···
 	}

 	table = hc->new_map;
-	if (table)
-		dm_table_get(table);

 out:
 	up_read(&_hash_lock);
···
 }

 static struct dm_table *dm_get_live_or_inactive_table(struct mapped_device *md,
-						      struct dm_ioctl *param)
+						      struct dm_ioctl *param,
+						      int *srcu_idx)
 {
 	return (param->flags & DM_QUERY_INACTIVE_TABLE_FLAG) ?
-		dm_get_inactive_table(md) : dm_get_live_table(md);
+		dm_get_inactive_table(md, srcu_idx) : dm_get_live_table(md, srcu_idx);
 }

 /*
···
 {
 	struct gendisk *disk = dm_disk(md);
 	struct dm_table *table;
+	int srcu_idx;

 	param->flags &= ~(DM_SUSPEND_FLAG | DM_READONLY_FLAG |
 			  DM_ACTIVE_PRESENT_FLAG);
···
 	param->event_nr = dm_get_event_nr(md);
 	param->target_count = 0;

-	table = dm_get_live_table(md);
+	table = dm_get_live_table(md, &srcu_idx);
 	if (table) {
 		if (!(param->flags & DM_QUERY_INACTIVE_TABLE_FLAG)) {
 			if (get_disk_ro(disk))
 				param->flags |= DM_READONLY_FLAG;
 			param->target_count = dm_table_get_num_targets(table);
 		}
-		dm_table_put(table);

 		param->flags |= DM_ACTIVE_PRESENT_FLAG;
 	}
+	dm_put_live_table(md, srcu_idx);

 	if (param->flags & DM_QUERY_INACTIVE_TABLE_FLAG) {
-		table = dm_get_inactive_table(md);
+		int srcu_idx;
+		table = dm_get_inactive_table(md, &srcu_idx);
 		if (table) {
 			if (!(dm_table_get_mode(table) & FMODE_WRITE))
 				param->flags |= DM_READONLY_FLAG;
 			param->target_count = dm_table_get_num_targets(table);
-			dm_table_put(table);
 		}
+		dm_put_live_table(md, srcu_idx);
 	}
 }

···
 	struct hash_cell *hc;
 	struct mapped_device *md;
 	int r;
+	struct dm_table *t;

 	down_write(&_hash_lock);
 	hc = __find_device_hash_cell(param);
···
 		return r;
 	}

-	__hash_remove(hc);
+	t = __hash_remove(hc);
 	up_write(&_hash_lock);
+
+	if (t) {
+		dm_sync_table(md);
+		dm_table_destroy(t);
+	}

 	if (!dm_kobject_uevent(md, KOBJ_REMOVE, param->event_nr))
 		param->flags |= DM_UEVENT_GENERATED_FLAG;
···

 		old_map = dm_swap_table(md, new_map);
 		if (IS_ERR(old_map)) {
+			dm_sync_table(md);
 			dm_table_destroy(new_map);
 			dm_put(md);
 			return PTR_ERR(old_map);
···
 			param->flags |= DM_UEVENT_GENERATED_FLAG;
 	}

+	/*
+	 * Since dm_swap_table synchronizes RCU, nobody should be in
+	 * read-side critical section already.
+	 */
 	if (old_map)
 		dm_table_destroy(old_map);

···
 	int r = 0;
 	struct mapped_device *md;
 	struct dm_table *table;
+	int srcu_idx;

 	md = find_device(param);
 	if (!md)
···
 	 */
 	__dev_status(md, param);

-	table = dm_get_live_or_inactive_table(md, param);
-	if (table) {
+	table = dm_get_live_or_inactive_table(md, param, &srcu_idx);
+	if (table)
 		retrieve_status(table, param, param_size);
-		dm_table_put(table);
-	}
+	dm_put_live_table(md, srcu_idx);

 out:
 	dm_put(md);
···
 {
 	int r;
 	struct hash_cell *hc;
-	struct dm_table *t;
+	struct dm_table *t, *old_map = NULL;
 	struct mapped_device *md;
 	struct target_type *immutable_target_type;

···
 	hc = dm_get_mdptr(md);
 	if (!hc || hc->md != md) {
 		DMWARN("device has been removed from the dev hash table.");
-		dm_table_destroy(t);
 		up_write(&_hash_lock);
+		dm_table_destroy(t);
 		r = -ENXIO;
 		goto out;
 	}

 	if (hc->new_map)
-		dm_table_destroy(hc->new_map);
+		old_map = hc->new_map;
 	hc->new_map = t;
 	up_write(&_hash_lock);

···
 	__dev_status(md, param);

 out:
+	if (old_map) {
+		dm_sync_table(md);
+		dm_table_destroy(old_map);
+	}
+
 	dm_put(md);

 	return r;
···
 {
 	struct hash_cell *hc;
 	struct mapped_device *md;
+	struct dm_table *old_map = NULL;

 	down_write(&_hash_lock);

···
 	}

 	if (hc->new_map) {
-		dm_table_destroy(hc->new_map);
+		old_map = hc->new_map;
 		hc->new_map = NULL;
 	}

···
 	__dev_status(hc->md, param);
 	md = hc->md;
 	up_write(&_hash_lock);
+	if (old_map) {
+		dm_sync_table(md);
+		dm_table_destroy(old_map);
+	}
 	dm_put(md);

 	return 0;
···
 {
 	struct mapped_device *md;
 	struct dm_table *table;
+	int srcu_idx;

 	md = find_device(param);
 	if (!md)
···

 	__dev_status(md, param);

-	table = dm_get_live_or_inactive_table(md, param);
-	if (table) {
+	table = dm_get_live_or_inactive_table(md, param, &srcu_idx);
+	if (table)
 		retrieve_deps(table, param, param_size);
-		dm_table_put(table);
-	}
+	dm_put_live_table(md, srcu_idx);

 	dm_put(md);

···
 {
 	struct mapped_device *md;
 	struct dm_table *table;
+	int srcu_idx;

 	md = find_device(param);
 	if (!md)
···

 	__dev_status(md, param);

-	table = dm_get_live_or_inactive_table(md, param);
-	if (table) {
+	table = dm_get_live_or_inactive_table(md, param, &srcu_idx);
+	if (table)
 		retrieve_status(table, param, param_size);
-		dm_table_put(table);
-	}
+	dm_put_live_table(md, srcu_idx);

 	dm_put(md);

···
 	struct dm_target_msg *tmsg = (void *) param + param->data_start;
 	size_t maxlen;
 	char *result = get_result_buffer(param, param_size, &maxlen);
+	int srcu_idx;

 	md = find_device(param);
 	if (!md)
···
 	if (r <= 1)
 		goto out_argv;

-	table = dm_get_live_table(md);
+	table = dm_get_live_table(md, &srcu_idx);
 	if (!table)
-		goto out_argv;
+		goto out_table;

 	if (dm_deleting_md(md)) {
 		r = -ENXIO;
···
 	}

 out_table:
-	dm_table_put(table);
+	dm_put_live_table(md, srcu_idx);
 out_argv:
 	kfree(argv);
 out:
···
 	}

 	if (!dmi) {
-		dmi = __vmalloc(param_kernel->data_size, GFP_NOIO | __GFP_REPEAT | __GFP_HIGH, PAGE_KERNEL);
+		unsigned noio_flag;
+		noio_flag = memalloc_noio_save();
+		dmi = __vmalloc(param_kernel->data_size, GFP_NOIO | __GFP_REPEAT | __GFP_HIGH | __GFP_HIGHMEM, PAGE_KERNEL);
+		memalloc_noio_restore(noio_flag);
 		if (dmi)
 			*param_flags |= DM_PARAMS_VMALLOC;
 	}
+2 -6
drivers/md/dm-mpath.c
···
 	unsigned long flags;
 	int r;

-again:
 	bdev = NULL;
 	mode = 0;
 	r = 0;
···
 	}

 	if ((pgpath && m->queue_io) || (!pgpath && m->queue_if_no_path))
-		r = -EAGAIN;
+		r = -ENOTCONN;
 	else if (!bdev)
 		r = -EIO;

···
 	if (!r && ti->len != i_size_read(bdev->bd_inode) >> SECTOR_SHIFT)
 		r = scsi_verify_blk_ioctl(NULL, cmd);

-	if (r == -EAGAIN && !fatal_signal_pending(current)) {
+	if (r == -ENOTCONN && !fatal_signal_pending(current))
 		queue_work(kmultipathd, &m->process_queued_ios);
-		msleep(10);
-		goto again;
-	}

 	return r ? : __blkdev_driver_ioctl(bdev, mode, cmd, arg);
 }
+538
drivers/md/dm-switch.c
···
+/*
+ * Copyright (C) 2010-2012 by Dell Inc. All rights reserved.
+ * Copyright (C) 2011-2013 Red Hat, Inc.
+ *
+ * This file is released under the GPL.
+ *
+ * dm-switch is a device-mapper target that maps IO to underlying block
+ * devices efficiently when there are a large number of fixed-sized
+ * address regions but there is no simple pattern to allow for a compact
+ * mapping representation such as dm-stripe.
+ */
+
+#include <linux/device-mapper.h>
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/vmalloc.h>
+
+#define DM_MSG_PREFIX "switch"
+
+/*
+ * One region_table_slot_t holds <region_entries_per_slot> region table
+ * entries each of which is <region_table_entry_bits> in size.
+ */
+typedef unsigned long region_table_slot_t;
+
+/*
+ * A device with the offset to its start sector.
+ */
+struct switch_path {
+	struct dm_dev *dmdev;
+	sector_t start;
+};
+
+/*
+ * Context block for a dm switch device.
+ */
+struct switch_ctx {
+	struct dm_target *ti;
+
+	unsigned nr_paths;		/* Number of paths in path_list. */
+
+	unsigned region_size;		/* Region size in 512-byte sectors */
+	unsigned long nr_regions;	/* Number of regions making up the device */
+	signed char region_size_bits;	/* log2 of region_size or -1 */
+
+	unsigned char region_table_entry_bits;	/* Number of bits in one region table entry */
+	unsigned char region_entries_per_slot;	/* Number of entries in one region table slot */
+	signed char region_entries_per_slot_bits;	/* log2 of region_entries_per_slot or -1 */
+
+	region_table_slot_t *region_table;	/* Region table */
+
+	/*
+	 * Array of dm devices to switch between.
+	 */
+	struct switch_path path_list[0];
+};
+
+static struct switch_ctx *alloc_switch_ctx(struct dm_target *ti, unsigned nr_paths,
+					   unsigned region_size)
+{
+	struct switch_ctx *sctx;
+
+	sctx = kzalloc(sizeof(struct switch_ctx) + nr_paths * sizeof(struct switch_path),
+		       GFP_KERNEL);
+	if (!sctx)
+		return NULL;
+
+	sctx->ti = ti;
+	sctx->region_size = region_size;
+
+	ti->private = sctx;
+
+	return sctx;
+}
+
+static int alloc_region_table(struct dm_target *ti, unsigned nr_paths)
+{
+	struct switch_ctx *sctx = ti->private;
+	sector_t nr_regions = ti->len;
+	sector_t nr_slots;
+
+	if (!(sctx->region_size & (sctx->region_size - 1)))
+		sctx->region_size_bits = __ffs(sctx->region_size);
+	else
+		sctx->region_size_bits = -1;
+
+	sctx->region_table_entry_bits = 1;
+	while (sctx->region_table_entry_bits < sizeof(region_table_slot_t) * 8 &&
+	       (region_table_slot_t)1 << sctx->region_table_entry_bits < nr_paths)
+		sctx->region_table_entry_bits++;
+
+	sctx->region_entries_per_slot = (sizeof(region_table_slot_t) * 8) / sctx->region_table_entry_bits;
+	if (!(sctx->region_entries_per_slot & (sctx->region_entries_per_slot - 1)))
+		sctx->region_entries_per_slot_bits = __ffs(sctx->region_entries_per_slot);
+	else
+		sctx->region_entries_per_slot_bits = -1;
+
+	if (sector_div(nr_regions, sctx->region_size))
+		nr_regions++;
+
+	sctx->nr_regions = nr_regions;
+	if (sctx->nr_regions != nr_regions || sctx->nr_regions >= ULONG_MAX) {
+		ti->error = "Region table too large";
+		return -EINVAL;
+	}
+
+	nr_slots = nr_regions;
+	if (sector_div(nr_slots, sctx->region_entries_per_slot))
+		nr_slots++;
+
+	if (nr_slots > ULONG_MAX / sizeof(region_table_slot_t)) {
+		ti->error = "Region table too large";
+		return -EINVAL;
+	}
+
+	sctx->region_table = vmalloc(nr_slots * sizeof(region_table_slot_t));
+	if (!sctx->region_table) {
+		ti->error = "Cannot allocate region table";
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static void switch_get_position(struct switch_ctx *sctx, unsigned long region_nr,
+				unsigned long *region_index, unsigned *bit)
+{
+	if (sctx->region_entries_per_slot_bits >= 0) {
+		*region_index = region_nr >> sctx->region_entries_per_slot_bits;
+		*bit = region_nr & (sctx->region_entries_per_slot - 1);
+	} else {
+		*region_index = region_nr / sctx->region_entries_per_slot;
+		*bit = region_nr % sctx->region_entries_per_slot;
+	}
+
+	*bit *= sctx->region_table_entry_bits;
+}
+
+/*
+ * Find which path to use at given offset.
+ */
+static unsigned switch_get_path_nr(struct switch_ctx *sctx, sector_t offset)
+{
+	unsigned long region_index;
+	unsigned bit, path_nr;
+	sector_t p;
+
+	p = offset;
+	if (sctx->region_size_bits >= 0)
+		p >>= sctx->region_size_bits;
+	else
+		sector_div(p, sctx->region_size);
+
+	switch_get_position(sctx, p, &region_index, &bit);
+	path_nr = (ACCESS_ONCE(sctx->region_table[region_index]) >> bit) &
+		  ((1 << sctx->region_table_entry_bits) - 1);
+
+	/* This can only happen if the processor uses non-atomic stores. */
+	if (unlikely(path_nr >= sctx->nr_paths))
+		path_nr = 0;
+
+	return path_nr;
+}
+
+static void switch_region_table_write(struct switch_ctx *sctx, unsigned long region_nr,
+				      unsigned value)
+{
+	unsigned long region_index;
+	unsigned bit;
+	region_table_slot_t pte;
+
+	switch_get_position(sctx, region_nr, &region_index, &bit);
+
+	pte = sctx->region_table[region_index];
+	pte &= ~((((region_table_slot_t)1 << sctx->region_table_entry_bits) - 1) << bit);
+	pte |= (region_table_slot_t)value << bit;
+	sctx->region_table[region_index] = pte;
+}
+
+/*
+ * Fill the region table with an initial round robin pattern.
+ */
+static void initialise_region_table(struct switch_ctx *sctx)
+{
+	unsigned path_nr = 0;
+	unsigned long region_nr;
+
+	for (region_nr = 0; region_nr < sctx->nr_regions; region_nr++) {
+		switch_region_table_write(sctx, region_nr, path_nr);
+		if (++path_nr >= sctx->nr_paths)
+			path_nr = 0;
+	}
+}
+
+static int parse_path(struct dm_arg_set *as, struct dm_target *ti)
+{
+	struct switch_ctx *sctx = ti->private;
+	unsigned long long start;
+	int r;
+
+	r = dm_get_device(ti, dm_shift_arg(as), dm_table_get_mode(ti->table),
+			  &sctx->path_list[sctx->nr_paths].dmdev);
+	if (r) {
+		ti->error = "Device lookup failed";
+		return r;
+	}
+
+	if (kstrtoull(dm_shift_arg(as), 10, &start) || start != (sector_t)start) {
+		ti->error = "Invalid device starting offset";
+		dm_put_device(ti, sctx->path_list[sctx->nr_paths].dmdev);
+		return -EINVAL;
+	}
+
+	sctx->path_list[sctx->nr_paths].start = start;
+
+	sctx->nr_paths++;
+
+	return 0;
+}
+
+/*
+ * Destructor: Don't free the dm_target, just the ti->private data (if any).
+ */
+static void switch_dtr(struct dm_target *ti)
+{
+	struct switch_ctx *sctx = ti->private;
+
+	while (sctx->nr_paths--)
+		dm_put_device(ti, sctx->path_list[sctx->nr_paths].dmdev);
+
+	vfree(sctx->region_table);
+	kfree(sctx);
+}
+
+/*
+ * Constructor arguments:
+ *   <num_paths> <region_size> <num_optional_args> [<optional_args>...]
+ *   [<dev_path> <offset>]+
+ *
+ * Optional args are to allow for future extension: currently this
+ * parameter must be 0.
+ */
+static int switch_ctr(struct dm_target *ti, unsigned argc, char **argv)
+{
+	static struct dm_arg _args[] = {
+		{1, (KMALLOC_MAX_SIZE - sizeof(struct switch_ctx)) / sizeof(struct switch_path), "Invalid number of paths"},
+		{1, UINT_MAX, "Invalid region size"},
+		{0, 0, "Invalid number of optional args"},
+	};
+
+	struct switch_ctx *sctx;
+	struct dm_arg_set as;
+	unsigned nr_paths, region_size, nr_optional_args;
+	int r;
+
+	as.argc = argc;
+	as.argv = argv;
+
+	r = dm_read_arg(_args, &as, &nr_paths, &ti->error);
+	if (r)
+		return -EINVAL;
+
+	r = dm_read_arg(_args + 1, &as, &region_size, &ti->error);
+	if (r)
+		return r;
+
+	r = dm_read_arg_group(_args + 2, &as, &nr_optional_args, &ti->error);
+	if (r)
+		return r;
+	/* parse optional arguments here, if we add any */
+
+	if (as.argc != nr_paths * 2) {
+		ti->error = "Incorrect number of path arguments";
+		return -EINVAL;
+	}
+
+	sctx = alloc_switch_ctx(ti, nr_paths, region_size);
+	if (!sctx) {
+		ti->error = "Cannot allocate redirection context";
+		return -ENOMEM;
+	}
+
+	r = dm_set_target_max_io_len(ti, region_size);
+	if (r)
+		goto error;
+
+	while (as.argc) {
+		r = parse_path(&as, ti);
+		if (r)
+			goto error;
+	}
+
+	r = alloc_region_table(ti, nr_paths);
+	if (r)
+ goto error; 297 + 298 + initialise_region_table(sctx); 299 + 300 + /* For UNMAP, sending the request down any path is sufficient */ 301 + ti->num_discard_bios = 1; 302 + 303 + return 0; 304 + 305 + error: 306 + switch_dtr(ti); 307 + 308 + return r; 309 + } 310 + 311 + static int switch_map(struct dm_target *ti, struct bio *bio) 312 + { 313 + struct switch_ctx *sctx = ti->private; 314 + sector_t offset = dm_target_offset(ti, bio->bi_sector); 315 + unsigned path_nr = switch_get_path_nr(sctx, offset); 316 + 317 + bio->bi_bdev = sctx->path_list[path_nr].dmdev->bdev; 318 + bio->bi_sector = sctx->path_list[path_nr].start + offset; 319 + 320 + return DM_MAPIO_REMAPPED; 321 + } 322 + 323 + /* 324 + * We need to parse hex numbers in the message as quickly as possible. 325 + * 326 + * This table-based hex parser improves performance. 327 + * It improves a time to load 1000000 entries compared to the condition-based 328 + * parser. 329 + * table-based parser condition-based parser 330 + * PA-RISC 0.29s 0.31s 331 + * Opteron 0.0495s 0.0498s 332 + */ 333 + static const unsigned char hex_table[256] = { 334 + 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 335 + 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 336 + 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 337 + 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 255, 255, 255, 255, 255, 255, 338 + 255, 10, 11, 12, 13, 14, 15, 255, 255, 255, 255, 255, 255, 255, 255, 255, 339 + 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 340 + 255, 10, 11, 12, 13, 14, 15, 255, 255, 255, 255, 255, 255, 255, 255, 255, 341 + 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 342 + 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 343 + 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 344 + 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 
255, 255, 255, 255, 255, 345 + 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 346 + 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 347 + 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 348 + 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 349 + 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255 350 + }; 351 + 352 + static __always_inline unsigned long parse_hex(const char **string) 353 + { 354 + unsigned char d; 355 + unsigned long r = 0; 356 + 357 + while ((d = hex_table[(unsigned char)**string]) < 16) { 358 + r = (r << 4) | d; 359 + (*string)++; 360 + } 361 + 362 + return r; 363 + } 364 + 365 + static int process_set_region_mappings(struct switch_ctx *sctx, 366 + unsigned argc, char **argv) 367 + { 368 + unsigned i; 369 + unsigned long region_index = 0; 370 + 371 + for (i = 1; i < argc; i++) { 372 + unsigned long path_nr; 373 + const char *string = argv[i]; 374 + 375 + if (*string == ':') 376 + region_index++; 377 + else { 378 + region_index = parse_hex(&string); 379 + if (unlikely(*string != ':')) { 380 + DMWARN("invalid set_region_mappings argument: '%s'", argv[i]); 381 + return -EINVAL; 382 + } 383 + } 384 + 385 + string++; 386 + if (unlikely(!*string)) { 387 + DMWARN("invalid set_region_mappings argument: '%s'", argv[i]); 388 + return -EINVAL; 389 + } 390 + 391 + path_nr = parse_hex(&string); 392 + if (unlikely(*string)) { 393 + DMWARN("invalid set_region_mappings argument: '%s'", argv[i]); 394 + return -EINVAL; 395 + } 396 + if (unlikely(region_index >= sctx->nr_regions)) { 397 + DMWARN("invalid set_region_mappings region number: %lu >= %lu", region_index, sctx->nr_regions); 398 + return -EINVAL; 399 + } 400 + if (unlikely(path_nr >= sctx->nr_paths)) { 401 + DMWARN("invalid set_region_mappings device: %lu >= %u", path_nr, sctx->nr_paths); 402 + return -EINVAL; 403 + } 404 + 405 + 
switch_region_table_write(sctx, region_index, path_nr); 406 + } 407 + 408 + return 0; 409 + } 410 + 411 + /* 412 + * Messages are processed one-at-a-time. 413 + * 414 + * Only set_region_mappings is supported. 415 + */ 416 + static int switch_message(struct dm_target *ti, unsigned argc, char **argv) 417 + { 418 + static DEFINE_MUTEX(message_mutex); 419 + 420 + struct switch_ctx *sctx = ti->private; 421 + int r = -EINVAL; 422 + 423 + mutex_lock(&message_mutex); 424 + 425 + if (!strcasecmp(argv[0], "set_region_mappings")) 426 + r = process_set_region_mappings(sctx, argc, argv); 427 + else 428 + DMWARN("Unrecognised message received."); 429 + 430 + mutex_unlock(&message_mutex); 431 + 432 + return r; 433 + } 434 + 435 + static void switch_status(struct dm_target *ti, status_type_t type, 436 + unsigned status_flags, char *result, unsigned maxlen) 437 + { 438 + struct switch_ctx *sctx = ti->private; 439 + unsigned sz = 0; 440 + int path_nr; 441 + 442 + switch (type) { 443 + case STATUSTYPE_INFO: 444 + result[0] = '\0'; 445 + break; 446 + 447 + case STATUSTYPE_TABLE: 448 + DMEMIT("%u %u 0", sctx->nr_paths, sctx->region_size); 449 + for (path_nr = 0; path_nr < sctx->nr_paths; path_nr++) 450 + DMEMIT(" %s %llu", sctx->path_list[path_nr].dmdev->name, 451 + (unsigned long long)sctx->path_list[path_nr].start); 452 + break; 453 + } 454 + } 455 + 456 + /* 457 + * Switch ioctl: 458 + * 459 + * Passthrough all ioctls to the path for sector 0 460 + */ 461 + static int switch_ioctl(struct dm_target *ti, unsigned cmd, 462 + unsigned long arg) 463 + { 464 + struct switch_ctx *sctx = ti->private; 465 + struct block_device *bdev; 466 + fmode_t mode; 467 + unsigned path_nr; 468 + int r = 0; 469 + 470 + path_nr = switch_get_path_nr(sctx, 0); 471 + 472 + bdev = sctx->path_list[path_nr].dmdev->bdev; 473 + mode = sctx->path_list[path_nr].dmdev->mode; 474 + 475 + /* 476 + * Only pass ioctls through if the device sizes match exactly. 
477 + */ 478 + if (ti->len + sctx->path_list[path_nr].start != i_size_read(bdev->bd_inode) >> SECTOR_SHIFT) 479 + r = scsi_verify_blk_ioctl(NULL, cmd); 480 + 481 + return r ? : __blkdev_driver_ioctl(bdev, mode, cmd, arg); 482 + } 483 + 484 + static int switch_iterate_devices(struct dm_target *ti, 485 + iterate_devices_callout_fn fn, void *data) 486 + { 487 + struct switch_ctx *sctx = ti->private; 488 + int path_nr; 489 + int r; 490 + 491 + for (path_nr = 0; path_nr < sctx->nr_paths; path_nr++) { 492 + r = fn(ti, sctx->path_list[path_nr].dmdev, 493 + sctx->path_list[path_nr].start, ti->len, data); 494 + if (r) 495 + return r; 496 + } 497 + 498 + return 0; 499 + } 500 + 501 + static struct target_type switch_target = { 502 + .name = "switch", 503 + .version = {1, 0, 0}, 504 + .module = THIS_MODULE, 505 + .ctr = switch_ctr, 506 + .dtr = switch_dtr, 507 + .map = switch_map, 508 + .message = switch_message, 509 + .status = switch_status, 510 + .ioctl = switch_ioctl, 511 + .iterate_devices = switch_iterate_devices, 512 + }; 513 + 514 + static int __init dm_switch_init(void) 515 + { 516 + int r; 517 + 518 + r = dm_register_target(&switch_target); 519 + if (r < 0) 520 + DMERR("dm_register_target() failed %d", r); 521 + 522 + return r; 523 + } 524 + 525 + static void __exit dm_switch_exit(void) 526 + { 527 + dm_unregister_target(&switch_target); 528 + } 529 + 530 + module_init(dm_switch_init); 531 + module_exit(dm_switch_exit); 532 + 533 + MODULE_DESCRIPTION(DM_NAME " dynamic path switching target"); 534 + MODULE_AUTHOR("Kevin D. O'Kelley <Kevin_OKelley@dell.com>"); 535 + MODULE_AUTHOR("Narendran Ganapathy <Narendran_Ganapathy@dell.com>"); 536 + MODULE_AUTHOR("Jim Ramsay <Jim_Ramsay@dell.com>"); 537 + MODULE_AUTHOR("Mikulas Patocka <mpatocka@redhat.com>"); 538 + MODULE_LICENSE("GPL");
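The packed region table handled by switch_get_position(), switch_get_path_nr() and switch_region_table_write() can be tried out in user space. What follows is a minimal sketch of the same bit-packing idea, not the driver code: the names demo_table, demo_position, demo_write and demo_read are hypothetical, slot_t is assumed to stand in for region_table_slot_t, and only the division/modulo branch is shown (the driver also has a shift/mask fast path for power-of-two sizes).

```c
#include <assert.h>
#include <limits.h>

typedef unsigned long slot_t;	/* assumed stand-in for region_table_slot_t */

struct demo_table {
	slot_t *slots;			/* packed path numbers */
	unsigned entry_bits;		/* bits used per path number */
	unsigned entries_per_slot;	/* entries that fit in one slot */
	unsigned nr_paths;		/* valid path numbers are 0..nr_paths-1 */
};

/* Split a region number into a slot index and a bit offset in that slot. */
static void demo_position(const struct demo_table *t, unsigned long region_nr,
			  unsigned long *index, unsigned *bit)
{
	*index = region_nr / t->entries_per_slot;
	*bit = (unsigned)(region_nr % t->entries_per_slot) * t->entry_bits;
}

/* Read-modify-write of one packed entry; neighbours are left untouched. */
static void demo_write(struct demo_table *t, unsigned long region_nr,
		       unsigned path_nr)
{
	unsigned long index;
	unsigned bit;
	slot_t mask = ((slot_t)1 << t->entry_bits) - 1;

	demo_position(t, region_nr, &index, &bit);
	t->slots[index] = (t->slots[index] & ~(mask << bit)) |
			  ((slot_t)path_nr << bit);
}

/* Look up the path for a region; out-of-range values fall back to path 0. */
static unsigned demo_read(const struct demo_table *t, unsigned long region_nr)
{
	unsigned long index;
	unsigned bit, path_nr;
	slot_t mask = ((slot_t)1 << t->entry_bits) - 1;

	demo_position(t, region_nr, &index, &bit);
	path_nr = (unsigned)((t->slots[index] >> bit) & mask);

	return path_nr < t->nr_paths ? path_nr : 0;
}
```

With three paths, two bits per entry are enough, so a 64-bit slot holds 32 regions. The fallback to path 0 in demo_read mirrors the defensive check in switch_get_path_nr(), which the driver comment attributes to processors with non-atomic stores.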
-35
drivers/md/dm-table.c
··· 26 26
 #define KEYS_PER_NODE (NODE_SIZE / sizeof(sector_t))
 #define CHILDREN_PER_NODE (KEYS_PER_NODE + 1)
 
-/*
- * The table has always exactly one reference from either mapped_device->map
- * or hash_cell->new_map. This reference is not counted in table->holders.
- * A pair of dm_create_table/dm_destroy_table functions is used for table
- * creation/destruction.
- *
- * Temporary references from the other code increase table->holders. A pair
- * of dm_table_get/dm_table_put functions is used to manipulate it.
- *
- * When the table is about to be destroyed, we wait for table->holders to
- * drop to zero.
- */
-
 struct dm_table {
 	struct mapped_device *md;
-	atomic_t holders;
 	unsigned type;
 
 	/* btree table */
··· 194 208
 
 	INIT_LIST_HEAD(&t->devices);
 	INIT_LIST_HEAD(&t->target_callbacks);
-	atomic_set(&t->holders, 0);
 
 	if (!num_targets)
 		num_targets = KEYS_PER_NODE;
··· 231 246
 	if (!t)
 		return;
 
-	while (atomic_read(&t->holders))
-		msleep(1);
-	smp_mb();
-
 	/* free the indexes */
 	if (t->depth >= 2)
 		vfree(t->index[t->depth - 2]);
··· 254 273
 
 	kfree(t);
 }
-
-void dm_table_get(struct dm_table *t)
-{
-	atomic_inc(&t->holders);
-}
-EXPORT_SYMBOL(dm_table_get);
-
-void dm_table_put(struct dm_table *t)
-{
-	if (!t)
-		return;
-
-	smp_mb__before_atomic_dec();
-	atomic_dec(&t->holders);
-}
-EXPORT_SYMBOL(dm_table_put);
 
 /*
  * Checks to see if we need to extend highs or targets.
+8 -9
drivers/md/dm-verity.c
··· 451 451
 				goto no_prefetch_cluster;
 
 			if (unlikely(cluster & (cluster - 1)))
-				cluster = 1 << (fls(cluster) - 1);
+				cluster = 1 << __fls(cluster);
 
 			hash_block_start &= ~(sector_t)(cluster - 1);
 			hash_block_end |= cluster - 1;
··· 695 695
 		goto bad;
 	}
 
-	if (sscanf(argv[0], "%d%c", &num, &dummy) != 1 ||
-	    num < 0 || num > 1) {
+	if (sscanf(argv[0], "%u%c", &num, &dummy) != 1 ||
+	    num > 1) {
 		ti->error = "Invalid version";
 		r = -EINVAL;
 		goto bad;
··· 723 723
 		r = -EINVAL;
 		goto bad;
 	}
-	v->data_dev_block_bits = ffs(num) - 1;
+	v->data_dev_block_bits = __ffs(num);
 
 	if (sscanf(argv[4], "%u%c", &num, &dummy) != 1 ||
 	    !num || (num & (num - 1)) ||
··· 733 733
 		r = -EINVAL;
 		goto bad;
 	}
-	v->hash_dev_block_bits = ffs(num) - 1;
+	v->hash_dev_block_bits = __ffs(num);
 
 	if (sscanf(argv[5], "%llu%c", &num_ll, &dummy) != 1 ||
 	    (sector_t)(num_ll << (v->data_dev_block_bits - SECTOR_SHIFT))
··· 812 812
 	}
 
 	v->hash_per_block_bits =
-		fls((1 << v->hash_dev_block_bits) / v->digest_size) - 1;
+		__fls((1 << v->hash_dev_block_bits) / v->digest_size);
 
 	v->levels = 0;
 	if (v->data_blocks)
··· 831 831
 	for (i = v->levels - 1; i >= 0; i--) {
 		sector_t s;
 		v->hash_level_block[i] = hash_position;
-		s = verity_position_at_level(v, v->data_blocks, i);
-		s = (s >> v->hash_per_block_bits) +
-			!!(s & ((1 << v->hash_per_block_bits) - 1));
+		s = (v->data_blocks + ((sector_t)1 << ((i + 1) * v->hash_per_block_bits)) - 1)
+					>> ((i + 1) * v->hash_per_block_bits);
 		if (hash_position + s < hash_position) {
 			ti->error = "Hash device offset overflow";
 			r = -E2BIG;
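The dm-verity hunks above swap `ffs(x) - 1` for `__ffs(x)` and `fls(x) - 1` for `__fls(x)`, and compute the block count of each hash level with a single rounded-up shift. In the kernel, ffs()/fls() return a 1-based bit position, while __ffs()/__fls() return a 0-based one and are undefined for 0. The user-space sketch below illustrates both ideas under stated assumptions: it relies on the GCC/Clang builtins __builtin_ctzl and __builtin_clzl rather than the kernel helpers, and the demo_* names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* 0-based index of the lowest set bit; x must be non-zero (like __ffs). */
static unsigned demo_lowest_bit(unsigned long x)
{
	return (unsigned)__builtin_ctzl(x);
}

/* 0-based index of the highest set bit; x must be non-zero (like __fls). */
static unsigned demo_highest_bit(unsigned long x)
{
	return (unsigned)(sizeof(x) * CHAR_BIT - 1) -
	       (unsigned)__builtin_clzl(x);
}

/*
 * Blocks needed at hash level `level` when each hash block covers
 * 2^bits entries of the level below:
 * ceil(data_blocks / 2^((level + 1) * bits)).
 */
static unsigned long long demo_level_blocks(unsigned long long data_blocks,
					    unsigned level, unsigned bits)
{
	unsigned shift = (level + 1) * bits;

	return (data_blocks + (1ULL << shift) - 1) >> shift;
}
```

For a 4096-byte block, demo_lowest_bit(4096) gives the block-size shift of 12. With 128 hashes per block (bits = 7), 1000 data blocks need 8 blocks at level 0 and 1 block at level 1.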
+105 -72
drivers/md/dm.c
··· 117 117
 #define DMF_MERGE_IS_OPTIONAL 6
 
 /*
+ * A dummy definition to make RCU happy.
+ * struct dm_table should never be dereferenced in this file.
+ */
+struct dm_table {
+	int undefined__;
+};
+
+/*
  * Work processed by per-device workqueue.
  */
 struct mapped_device {
-	struct rw_semaphore io_lock;
+	struct srcu_struct io_barrier;
 	struct mutex suspend_lock;
-	rwlock_t map_lock;
 	atomic_t holders;
 	atomic_t open_count;
+
+	/*
+	 * The current mapping.
+	 * Use dm_get_live_table{_fast} or take suspend_lock for
+	 * dereference.
+	 */
+	struct dm_table *map;
 
 	unsigned long flags;
 
··· 167 153
 	 * Processing queue (flush)
 	 */
 	struct workqueue_struct *wq;
-
-	/*
-	 * The current mapping.
-	 */
-	struct dm_table *map;
 
 	/*
 	 * io objects are allocated from here.
··· 395 386
 			unsigned int cmd, unsigned long arg)
 {
 	struct mapped_device *md = bdev->bd_disk->private_data;
-	struct dm_table *map = dm_get_live_table(md);
+	int srcu_idx;
+	struct dm_table *map;
 	struct dm_target *tgt;
 	int r = -ENOTTY;
+
+retry:
+	map = dm_get_live_table(md, &srcu_idx);
 
 	if (!map || !dm_table_get_size(map))
 		goto out;
··· 421 408
 		r = tgt->type->ioctl(tgt, cmd, arg);
 
 out:
-	dm_table_put(map);
+	dm_put_live_table(md, srcu_idx);
+
+	if (r == -ENOTCONN) {
+		msleep(10);
+		goto retry;
+	}
 
 	return r;
 }
··· 520 502
 /*
  * Everyone (including functions in this file), should use this
  * function to access the md->map field, and make sure they call
- * dm_table_put() when finished.
+ * dm_put_live_table() when finished.
  */
-struct dm_table *dm_get_live_table(struct mapped_device *md)
+struct dm_table *dm_get_live_table(struct mapped_device *md, int *srcu_idx) __acquires(md->io_barrier)
 {
-	struct dm_table *t;
-	unsigned long flags;
+	*srcu_idx = srcu_read_lock(&md->io_barrier);
 
-	read_lock_irqsave(&md->map_lock, flags);
-	t = md->map;
-	if (t)
-		dm_table_get(t);
-	read_unlock_irqrestore(&md->map_lock, flags);
+	return srcu_dereference(md->map, &md->io_barrier);
+}
 
-	return t;
+void dm_put_live_table(struct mapped_device *md, int srcu_idx) __releases(md->io_barrier)
+{
+	srcu_read_unlock(&md->io_barrier, srcu_idx);
+}
+
+void dm_sync_table(struct mapped_device *md)
+{
+	synchronize_srcu(&md->io_barrier);
+	synchronize_rcu_expedited();
+}
+
+/*
+ * A fast alternative to dm_get_live_table/dm_put_live_table.
+ * The caller must not block between these two functions.
+ */
+static struct dm_table *dm_get_live_table_fast(struct mapped_device *md) __acquires(RCU)
+{
+	rcu_read_lock();
+	return rcu_dereference(md->map);
+}
+
+static void dm_put_live_table_fast(struct mapped_device *md) __releases(RCU)
+{
+	rcu_read_unlock();
 }
 
 /*
··· 1386 1349
 /*
  * Entry point to split a bio into clones and submit them to the targets.
  */
-static void __split_and_process_bio(struct mapped_device *md, struct bio *bio)
+static void __split_and_process_bio(struct mapped_device *md,
+				    struct dm_table *map, struct bio *bio)
 {
 	struct clone_info ci;
 	int error = 0;
 
-	ci.map = dm_get_live_table(md);
-	if (unlikely(!ci.map)) {
+	if (unlikely(!map)) {
 		bio_io_error(bio);
 		return;
 	}
 
+	ci.map = map;
 	ci.md = md;
 	ci.io = alloc_io(md);
 	ci.io->error = 0;
··· 1424 1386
 
 	/* drop the extra reference count */
 	dec_pending(ci.io, error);
-	dm_table_put(ci.map);
 }
 /*-----------------------------------------------------------------
  * CRUD END
··· 1434 1397
 			 struct bio_vec *biovec)
 {
 	struct mapped_device *md = q->queuedata;
-	struct dm_table *map = dm_get_live_table(md);
+	struct dm_table *map = dm_get_live_table_fast(md);
 	struct dm_target *ti;
 	sector_t max_sectors;
 	int max_size = 0;
··· 1444 1407
 
 	ti = dm_table_find_target(map, bvm->bi_sector);
 	if (!dm_target_is_valid(ti))
-		goto out_table;
+		goto out;
 
 	/*
 	 * Find maximum amount of I/O that won't need splitting
··· 1473 1436
 
 		max_size = 0;
 
-out_table:
-	dm_table_put(map);
-
 out:
+	dm_put_live_table_fast(md);
 	/*
 	 * Always allow an entire first page
 	 */
··· 1493 1458
 	int rw = bio_data_dir(bio);
 	struct mapped_device *md = q->queuedata;
 	int cpu;
+	int srcu_idx;
+	struct dm_table *map;
 
-	down_read(&md->io_lock);
+	map = dm_get_live_table(md, &srcu_idx);
 
 	cpu = part_stat_lock();
 	part_stat_inc(cpu, &dm_disk(md)->part0, ios[rw]);
··· 1505 1468
 
 	/* if we're suspended, we have to queue this io for later */
 	if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags))) {
-		up_read(&md->io_lock);
+		dm_put_live_table(md, srcu_idx);
 
 		if (bio_rw(bio) != READA)
 			queue_io(md, bio);
··· 1514 1477
 		return;
 	}
 
-	__split_and_process_bio(md, bio);
-	up_read(&md->io_lock);
+	__split_and_process_bio(md, map, bio);
+	dm_put_live_table(md, srcu_idx);
 	return;
 }
 
··· 1701 1664
 static void dm_request_fn(struct request_queue *q)
 {
 	struct mapped_device *md = q->queuedata;
-	struct dm_table *map = dm_get_live_table(md);
+	int srcu_idx;
+	struct dm_table *map = dm_get_live_table(md, &srcu_idx);
 	struct dm_target *ti;
 	struct request *rq, *clone;
 	sector_t pos;
··· 1757 1719
 delay_and_out:
 	blk_delay_queue(q, HZ / 10);
 out:
-	dm_table_put(map);
+	dm_put_live_table(md, srcu_idx);
 }
 
 int dm_underlying_device_busy(struct request_queue *q)
··· 1770 1732
 {
 	int r;
 	struct mapped_device *md = q->queuedata;
-	struct dm_table *map = dm_get_live_table(md);
+	struct dm_table *map = dm_get_live_table_fast(md);
 
 	if (!map || test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags))
 		r = 1;
 	else
 		r = dm_table_any_busy_target(map);
 
-	dm_table_put(map);
+	dm_put_live_table_fast(md);
 
 	return r;
 }
··· 1789 1751
 	struct dm_table *map;
 
 	if (!test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags)) {
-		map = dm_get_live_table(md);
+		map = dm_get_live_table_fast(md);
 		if (map) {
 			/*
 			 * Request-based dm cares about only own queue for
··· 1800 1762
 				    bdi_bits;
 			else
 				r = dm_table_any_congested(map, bdi_bits);
-
-			dm_table_put(map);
 		}
+		dm_put_live_table_fast(md);
 	}
 
 	return r;
··· 1906 1869
 	if (r < 0)
 		goto bad_minor;
 
+	r = init_srcu_struct(&md->io_barrier);
+	if (r < 0)
+		goto bad_io_barrier;
+
 	md->type = DM_TYPE_NONE;
-	init_rwsem(&md->io_lock);
 	mutex_init(&md->suspend_lock);
 	mutex_init(&md->type_lock);
 	spin_lock_init(&md->deferred_lock);
-	rwlock_init(&md->map_lock);
 	atomic_set(&md->holders, 1);
 	atomic_set(&md->open_count, 0);
 	atomic_set(&md->event_nr, 0);
··· 1976 1937
 bad_disk:
 	blk_cleanup_queue(md->queue);
 bad_queue:
+	cleanup_srcu_struct(&md->io_barrier);
+bad_io_barrier:
 	free_minor(minor);
 bad_minor:
 	module_put(THIS_MODULE);
··· 2001 1960
 		bioset_free(md->bs);
 	blk_integrity_unregister(md->disk);
 	del_gendisk(md->disk);
+	cleanup_srcu_struct(&md->io_barrier);
 	free_minor(minor);
 
 	spin_lock(&_minor_lock);
··· 2144 2102
 	struct dm_table *old_map;
 	struct request_queue *q = md->queue;
 	sector_t size;
-	unsigned long flags;
 	int merge_is_optional;
 
 	size = dm_table_get_size(t);
··· 2172 2131
 
 	merge_is_optional = dm_table_merge_is_optional(t);
 
-	write_lock_irqsave(&md->map_lock, flags);
 	old_map = md->map;
-	md->map = t;
+	rcu_assign_pointer(md->map, t);
 	md->immutable_target_type = dm_table_get_immutable_target_type(t);
 
 	dm_table_set_restrictions(t, q, limits);
··· 2181 2141
 		set_bit(DMF_MERGE_IS_OPTIONAL, &md->flags);
 	else
 		clear_bit(DMF_MERGE_IS_OPTIONAL, &md->flags);
-	write_unlock_irqrestore(&md->map_lock, flags);
+	dm_sync_table(md);
 
 	return old_map;
 }
··· 2192 2152
 static struct dm_table *__unbind(struct mapped_device *md)
 {
 	struct dm_table *map = md->map;
-	unsigned long flags;
 
 	if (!map)
 		return NULL;
 
 	dm_table_event_callback(map, NULL, NULL);
-	write_lock_irqsave(&md->map_lock, flags);
-	md->map = NULL;
-	write_unlock_irqrestore(&md->map_lock, flags);
+	rcu_assign_pointer(md->map, NULL);
+	dm_sync_table(md);
 
 	return map;
 }
··· 2350 2312
 static void __dm_destroy(struct mapped_device *md, bool wait)
 {
 	struct dm_table *map;
+	int srcu_idx;
 
 	might_sleep();
 
 	spin_lock(&_minor_lock);
-	map = dm_get_live_table(md);
+	map = dm_get_live_table(md, &srcu_idx);
 	idr_replace(&_minor_idr, MINOR_ALLOCED, MINOR(disk_devt(dm_disk(md))));
 	set_bit(DMF_FREEING, &md->flags);
 	spin_unlock(&_minor_lock);
··· 2364 2325
 		dm_table_presuspend_targets(map);
 		dm_table_postsuspend_targets(map);
 	}
+
+	/* dm_put_live_table must be before msleep, otherwise deadlock is possible */
+	dm_put_live_table(md, srcu_idx);
 
 	/*
 	 * Rare, but there may be I/O requests still going to complete,
··· 2382 2340
 		       dm_device_name(md), atomic_read(&md->holders));
 
 	dm_sysfs_exit(md);
-	dm_table_put(map);
 	dm_table_destroy(__unbind(md));
 	free_dev(md);
 }
··· 2438 2397
 	struct mapped_device *md = container_of(work, struct mapped_device,
 						work);
 	struct bio *c;
+	int srcu_idx;
+	struct dm_table *map;
 
-	down_read(&md->io_lock);
+	map = dm_get_live_table(md, &srcu_idx);
 
 	while (!test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags)) {
 		spin_lock_irq(&md->deferred_lock);
··· 2451 2408
 		if (!c)
 			break;
 
-		up_read(&md->io_lock);
-
 		if (dm_request_based(md))
 			generic_make_request(c);
 		else
-			__split_and_process_bio(md, c);
-
-		down_read(&md->io_lock);
+			__split_and_process_bio(md, map, c);
 	}
 
-	up_read(&md->io_lock);
+	dm_put_live_table(md, srcu_idx);
 }
 
 static void dm_queue_flush(struct mapped_device *md)
··· 2489 2450
 	 * reappear.
 	 */
 	if (dm_table_has_no_data_devices(table)) {
-		live_map = dm_get_live_table(md);
+		live_map = dm_get_live_table_fast(md);
 		if (live_map)
 			limits = md->queue->limits;
-		dm_table_put(live_map);
+		dm_put_live_table_fast(md);
 	}
 
 	if (!live_map) {
··· 2572 2533
 		goto out_unlock;
 	}
 
-	map = dm_get_live_table(md);
+	map = md->map;
 
 	/*
 	 * DMF_NOFLUSH_SUSPENDING must be set before presuspend.
··· 2593 2554
 	if (!noflush && do_lockfs) {
 		r = lock_fs(md);
 		if (r)
-			goto out;
+			goto out_unlock;
 	}
 
 	/*
··· 2608 2569
 	 * (dm_wq_work), we set BMF_BLOCK_IO_FOR_SUSPEND and call
 	 * flush_workqueue(md->wq).
 	 */
-	down_write(&md->io_lock);
 	set_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags);
-	up_write(&md->io_lock);
+	synchronize_srcu(&md->io_barrier);
 
 	/*
 	 * Stop md->queue before flushing md->wq in case request-based
··· 2627 2589
 	 */
 	r = dm_wait_for_completion(md, TASK_INTERRUPTIBLE);
 
-	down_write(&md->io_lock);
 	if (noflush)
 		clear_bit(DMF_NOFLUSH_SUSPENDING, &md->flags);
-	up_write(&md->io_lock);
+	synchronize_srcu(&md->io_barrier);
 
 	/* were we interrupted ? */
 	if (r < 0) {
··· 2639 2602
 			start_queue(md->queue);
 
 		unlock_fs(md);
-		goto out; /* pushback list is already flushed, so skip flush */
+		goto out_unlock; /* pushback list is already flushed, so skip flush */
 	}
 
 	/*
··· 2651 2614
 	set_bit(DMF_SUSPENDED, &md->flags);
 
 	dm_table_postsuspend_targets(map);
-
-out:
-	dm_table_put(map);
 
 out_unlock:
 	mutex_unlock(&md->suspend_lock);
··· 2666 2632
 	if (!dm_suspended_md(md))
 		goto out;
 
-	map = dm_get_live_table(md);
+	map = md->map;
 	if (!map || !dm_table_get_size(map))
 		goto out;
 
··· 2690 2656
 
 	r = 0;
 out:
-	dm_table_put(map);
 	mutex_unlock(&md->suspend_lock);
 
 	return r;
+3 -3
include/linux/device-mapper.h
··· 446 446
 /*
  * Table reference counting.
  */
-struct dm_table *dm_get_live_table(struct mapped_device *md);
-void dm_table_get(struct dm_table *t);
-void dm_table_put(struct dm_table *t);
+struct dm_table *dm_get_live_table(struct mapped_device *md, int *srcu_idx);
+void dm_put_live_table(struct mapped_device *md, int srcu_idx);
+void dm_sync_table(struct mapped_device *md);
 
 /*
  * Queries
+2 -2
include/uapi/linux/dm-ioctl.h
··· 267 267
 #define DM_DEV_SET_GEOMETRY	_IOWR(DM_IOCTL, DM_DEV_SET_GEOMETRY_CMD, struct dm_ioctl)
 
 #define DM_VERSION_MAJOR	4
-#define DM_VERSION_MINOR	24
+#define DM_VERSION_MINOR	25
 #define DM_VERSION_PATCHLEVEL	0
-#define DM_VERSION_EXTRA	"-ioctl (2013-01-15)"
+#define DM_VERSION_EXTRA	"-ioctl (2013-06-26)"
 
 /* Status bits */
 #define DM_READONLY_FLAG	(1 << 0) /* In/Out */