Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

ext34: ensure do_split leaves enough free space in both blocks

The do_split() function for htree dir blocks is intended to split a leaf
block to make room for a new entry. It sorts the entries in the original
block by hash value, then moves the last half of the entries to the new
block - without accounting for how much space this actually moves. (IOW,
it moves half of the entry *count* not half of the entry *space*). If by
chance we have both large & small entries, and we move only the smallest
entries, and we have a large new entry to insert, we may not have created
enough space for it.
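
To make the failure concrete, here is a minimal sketch in C. It is not kernel code: the helper name free_in_old_after_count_split and the entry sizes used with it are hypothetical, chosen only to illustrate a full 1024-byte block that is split by entry count.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not in the kernel): sizes[] holds the rec_len of
 * each entry in hash order. Splitting by count keeps the first count/2
 * entries in the old block; return the bytes left free there.
 */
static unsigned free_in_old_after_count_split(const unsigned *sizes,
					      size_t count, unsigned blocksize)
{
	size_t split = count / 2;	/* old behaviour: half the entry count */
	unsigned kept = 0;

	for (size_t i = 0; i < split; i++)
		kept += sizes[i];

	assert(kept <= blocksize);	/* entries came from one block */
	return blocksize - kept;
}
```

With sizes {256, 256, 256, 192, 16, 16, 16, 16} in hash order, the count-based split keeps the four large entries in the old block and frees only 64 bytes there, so a new 264-byte entry whose hash falls in the lower half would not fit.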

The patch below stores each record size when calculating the dx_map, and
then walks the hash-sorted dx_map, calculating how many entries must be
moved to more evenly split the existing entries between the old block and
the new block, guaranteeing enough space for the new entry.
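
The walk over the hash-sorted map can be sketched as a standalone C function. This is an illustration under assumptions, not the kernel routine: pick_split is a hypothetical name, the map is reduced to a plain array of record sizes, and the loop index is deliberately a signed int so that the i >= 0 test can end the loop.

```c
#include <assert.h>

/*
 * Hypothetical sketch of the size-wise split. sizes[] is in hash order.
 * Walk from the highest hash downwards, moving entries to the new block
 * until moving the next one would put more than half of it past the
 * midpoint of the block. Returns the map index at which to split.
 */
static int pick_split(const unsigned short *sizes, int count,
		      unsigned blocksize)
{
	unsigned size = 0;	/* bytes moved so far */
	int move = 0;		/* entries moved so far */

	assert(count > 0);
	for (int i = count - 1; i >= 0; i--) {
		if (size + sizes[i] / 2 > blocksize / 2)
			break;
		size += sizes[i];
		move++;
	}
	return count - move;
}
```

For sizes {256, 256, 256, 192, 16, 16, 16, 16} and a 1024-byte block this returns 2 rather than count/2 = 4, leaving 512 bytes of entries in each block.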

The dx_map "offs" member is reduced to u16 so that the overall map size
does not change. The map is temporarily stored at the end of the new block,
and if it grew too large it could be overwritten. With offs and size both
u16, each map entry stays the same size as before.
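
The size argument can be checked with a small C sketch. The struct names dx_map_entry_old and dx_map_entry_new and the helpers are hypothetical, and uint32_t/uint16_t stand in for the kernel's u32/u16 types; the point is only that two u16 members occupy the space of the former u32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout before the patch (hypothetical name). */
struct dx_map_entry_old {
	uint32_t hash;
	uint32_t offs;
};

/* Layout after the patch (hypothetical name): offs shrinks, size is added. */
struct dx_map_entry_new {
	uint32_t hash;
	uint16_t offs;
	uint16_t size;
};

/* The map must not grow, since it lives at the end of the new block. */
static_assert(sizeof(struct dx_map_entry_new) ==
	      sizeof(struct dx_map_entry_old),
	      "map entry size must not change");

/* Bytes needed for a map of count entries under each layout. */
static size_t map_bytes_old(size_t count)
{
	return count * sizeof(struct dx_map_entry_old);
}

static size_t map_bytes_new(size_t count)
{
	return count * sizeof(struct dx_map_entry_new);
}
```

If the new member had been added without shrinking offs, each entry would grow and the static assertion would fail at compile time.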

Also add a few comments to the functions involved.

This fixes the testcase reported by hooanon05@yahoo.co.jp on the
linux-ext4 list, "ext3 dir_index causes an error".

Thanks to Andreas Dilger for discussing the problem & solution with me.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Tested-by: Junjiro Okajima <hooanon05@yahoo.co.jp>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: <linux-ext4@vger.kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Eric Sandeen and committed by Linus Torvalds
ef2b02d3 e4260197

+70 -8
+35 -4
fs/ext3/namei.c
···
 struct dx_map_entry
 {
 	u32 hash;
-	u32 offs;
+	u16 offs;
+	u16 size;
 };
 
 #ifdef CONFIG_EXT3_INDEX
···
  * Directory block splitting, compacting
  */
 
+/*
+ * Create map of hash values, offsets, and sizes, stored at end of block.
+ * Returns number of entries mapped.
+ */
 static int dx_make_map (struct ext3_dir_entry_2 *de, int size,
 			struct dx_hash_info *hinfo, struct dx_map_entry *map_tail)
 {
···
 			ext3fs_dirhash(de->name, de->name_len, &h);
 			map_tail--;
 			map_tail->hash = h.hash;
-			map_tail->offs = (u32) ((char *) de - base);
+			map_tail->offs = (u16) ((char *) de - base);
+			map_tail->size = le16_to_cpu(de->rec_len);
 			count++;
 			cond_resched();
 		}
···
 	return count;
 }
 
+/* Sort map by hash value */
 static void dx_sort_map (struct dx_map_entry *map, unsigned count)
 {
 	struct dx_map_entry *p, *q, *top = map + count - 1;
···
 }
 
 #ifdef CONFIG_EXT3_INDEX
+/*
+ * Move count entries from end of map between two memory locations.
+ * Returns pointer to last entry moved.
+ */
 static struct ext3_dir_entry_2 *
 dx_move_dirents(char *from, char *to, struct dx_map_entry *map, int count)
 {
···
 	return (struct ext3_dir_entry_2 *) (to - rec_len);
 }
 
+/*
+ * Compact each dir entry in the range to the minimal rec_len.
+ * Returns pointer to last entry in range.
+ */
 static struct ext3_dir_entry_2* dx_pack_dirents(char *base, int size)
 {
 	struct ext3_dir_entry_2 *next, *to, *prev, *de = (struct ext3_dir_entry_2 *) base;
···
 	return prev;
 }
 
+/*
+ * Split a full leaf block to make room for a new dir entry.
+ * Allocate a new block, and move entries so that they are approx. equally full.
+ * Returns pointer to de in block into which the new entry will be inserted.
+ */
 static struct ext3_dir_entry_2 *do_split(handle_t *handle, struct inode *dir,
 			struct buffer_head **bh,struct dx_frame *frame,
 			struct dx_hash_info *hinfo, int *error)
···
 	u32 hash2;
 	struct dx_map_entry *map;
 	char *data1 = (*bh)->b_data, *data2;
-	unsigned split;
+	unsigned split, move, size, i;
 	struct ext3_dir_entry_2 *de = NULL, *de2;
 	int err = 0;
 
···
 	count = dx_make_map ((struct ext3_dir_entry_2 *) data1,
 			     blocksize, hinfo, map);
 	map -= count;
-	split = count/2; // need to adjust to actual middle
 	dx_sort_map (map, count);
+	/* Split the existing block in the middle, size-wise */
+	size = 0;
+	move = 0;
+	for (i = count-1; i >= 0; i--) {
+		/* is more than half of this entry in 2nd half of the block? */
+		if (size + map[i].size/2 > blocksize/2)
+			break;
+		size += map[i].size;
+		move++;
+	}
+	/* map index at which we will split */
+	split = count - move;
 	hash2 = map[split].hash;
 	continued = hash2 == map[split - 1].hash;
 	dxtrace(printk("Split block %i at %x, %i/%i\n",
+35 -4
fs/ext4/namei.c
···
 struct dx_map_entry
 {
 	u32 hash;
-	u32 offs;
+	u16 offs;
+	u16 size;
 };
 
 #ifdef CONFIG_EXT4_INDEX
···
  * Directory block splitting, compacting
  */
 
+/*
+ * Create map of hash values, offsets, and sizes, stored at end of block.
+ * Returns number of entries mapped.
+ */
 static int dx_make_map (struct ext4_dir_entry_2 *de, int size,
 			struct dx_hash_info *hinfo, struct dx_map_entry *map_tail)
 {
···
 			ext4fs_dirhash(de->name, de->name_len, &h);
 			map_tail--;
 			map_tail->hash = h.hash;
-			map_tail->offs = (u32) ((char *) de - base);
+			map_tail->offs = (u16) ((char *) de - base);
+			map_tail->size = le16_to_cpu(de->rec_len);
 			count++;
 			cond_resched();
 		}
···
 	return count;
 }
 
+/* Sort map by hash value */
 static void dx_sort_map (struct dx_map_entry *map, unsigned count)
 {
 	struct dx_map_entry *p, *q, *top = map + count - 1;
···
 }
 
 #ifdef CONFIG_EXT4_INDEX
+/*
+ * Move count entries from end of map between two memory locations.
+ * Returns pointer to last entry moved.
+ */
 static struct ext4_dir_entry_2 *
 dx_move_dirents(char *from, char *to, struct dx_map_entry *map, int count)
 {
···
 	return (struct ext4_dir_entry_2 *) (to - rec_len);
 }
 
+/*
+ * Compact each dir entry in the range to the minimal rec_len.
+ * Returns pointer to last entry in range.
+ */
 static struct ext4_dir_entry_2* dx_pack_dirents(char *base, int size)
 {
 	struct ext4_dir_entry_2 *next, *to, *prev, *de = (struct ext4_dir_entry_2 *) base;
···
 	return prev;
 }
 
+/*
+ * Split a full leaf block to make room for a new dir entry.
+ * Allocate a new block, and move entries so that they are approx. equally full.
+ * Returns pointer to de in block into which the new entry will be inserted.
+ */
 static struct ext4_dir_entry_2 *do_split(handle_t *handle, struct inode *dir,
 			struct buffer_head **bh,struct dx_frame *frame,
 			struct dx_hash_info *hinfo, int *error)
···
 	u32 hash2;
 	struct dx_map_entry *map;
 	char *data1 = (*bh)->b_data, *data2;
-	unsigned split;
+	unsigned split, move, size, i;
 	struct ext4_dir_entry_2 *de = NULL, *de2;
 	int err = 0;
 
···
 	count = dx_make_map ((struct ext4_dir_entry_2 *) data1,
 			     blocksize, hinfo, map);
 	map -= count;
-	split = count/2; // need to adjust to actual middle
 	dx_sort_map (map, count);
+	/* Split the existing block in the middle, size-wise */
+	size = 0;
+	move = 0;
+	for (i = count-1; i >= 0; i--) {
+		/* is more than half of this entry in 2nd half of the block? */
+		if (size + map[i].size/2 > blocksize/2)
+			break;
+		size += map[i].size;
+		move++;
+	}
+	/* map index at which we will split */
+	split = count - move;
 	hash2 = map[split].hash;
 	continued = hash2 == map[split - 1].hash;
 	dxtrace(printk("Split block %i at %x, %i/%i\n",