Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux
1
fork

Configure Feed

Select the types of activity you want to include in your feed.

btrfs: fix double free of qgroup record after failure to add delayed ref head

In the previous code it was possible to incur into a double kfree()
scenario when calling add_delayed_ref_head(). This could happen if the
record was reported to already exist in the
btrfs_qgroup_trace_extent_nolock() call, but then there was an error
later on add_delayed_ref_head(). In this case, since
add_delayed_ref_head() returned an error, the caller went to free the
record. Since add_delayed_ref_head() couldn't set this kfree'd pointer
to NULL, then kfree() would have acted on a non-NULL 'record' object
which was pointing to memory already freed by the callee.

The problem comes from the fact that the responsibility to kfree the
object is on both the caller and the callee at the same time. Hence, the
fix for this is to shift the ownership of the 'qrecord' object out of
the add_delayed_ref_head(). That is, we will never attempt to kfree()
the given object inside of this function, and will expect the caller to
act on the 'qrecord' object on its own. The only exception where the
'qrecord' object cannot be kfree'd is if it was inserted into the
tracing logic, for which we already have the 'qrecord_inserted_ret'
boolean to account for this. Hence, the caller has to kfree the object
only if add_delayed_ref_head() reports not to have inserted it on the
tracing logic.

As a side-effect of the above, we must guarantee that
'qrecord_inserted_ret' is properly initialized at the start of the
function, not at the end, and then set when an actual insert
happens. This way we avoid 'qrecord_inserted_ret' having an invalid
value on an early exit.

The documentation from the add_delayed_ref_head() has also been updated
to reflect on the exact ownership of the 'qrecord' object.

Fixes: 6ef8fbce0104 ("btrfs: fix missing error handling when adding delayed ref with qgroups enabled")
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Miquel Sabaté Solà <mssola@mssola.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

authored by

Miquel Sabaté Solà and committed by
David Sterba
725e4629 2215e6b4

+33 -10
+33 -10
fs/btrfs/delayed-ref.c
··· 798 798 } 799 799 800 800 /* 801 - * helper function to actually insert a head node into the rbtree. 802 - * this does all the dirty work in terms of maintaining the correct 803 - * overall modification count. 801 + * Helper function to actually insert a head node into the xarray. This does all 802 + * the dirty work in terms of maintaining the correct overall modification 803 + * count. 804 + * 805 + * The caller is responsible for calling kfree() on @qrecord. More specifically, 806 + * if this function reports that it did not insert it as noted in 807 + * @qrecord_inserted_ret, then it's safe to call kfree() on it. 804 808 * 805 809 * Returns an error pointer in case of an error. 806 810 */ ··· 818 814 struct btrfs_delayed_ref_head *existing; 819 815 struct btrfs_delayed_ref_root *delayed_refs; 820 816 const unsigned long index = (head_ref->bytenr >> fs_info->sectorsize_bits); 821 - bool qrecord_inserted = false; 817 + 818 + /* 819 + * If 'qrecord_inserted_ret' is provided, then the first thing we need 820 + * to do is to initialize it to false just in case we have an exit 821 + * before trying to insert the record. 822 + */ 823 + if (qrecord_inserted_ret) 824 + *qrecord_inserted_ret = false; 822 825 823 826 delayed_refs = &trans->transaction->delayed_refs; 824 827 lockdep_assert_held(&delayed_refs->lock); ··· 844 833 845 834 /* Record qgroup extent info if provided */ 846 835 if (qrecord) { 836 + /* 837 + * Setting 'qrecord' but not 'qrecord_inserted_ret' will likely 838 + * result in a memory leakage. 839 + */ 840 + ASSERT(qrecord_inserted_ret != NULL); 841 + 847 842 int ret; 848 843 849 844 ret = btrfs_qgroup_trace_extent_nolock(fs_info, delayed_refs, qrecord, ··· 857 840 if (ret) { 858 841 /* Clean up if insertion fails or item exists. */ 859 842 xa_release(&delayed_refs->dirty_extents, index); 860 - /* Caller responsible for freeing qrecord on error. */ 861 843 if (ret < 0) 862 844 return ERR_PTR(ret); 863 - kfree(qrecord); 864 - } else { 865 - qrecord_inserted = true; 845 + } else if (qrecord_inserted_ret) { 846 + *qrecord_inserted_ret = true; 866 847 } 867 848 } 868 849 ··· 903 888 delayed_refs->num_heads++; 904 889 delayed_refs->num_heads_ready++; 905 890 } 906 - if (qrecord_inserted_ret) 907 - *qrecord_inserted_ret = qrecord_inserted; 908 891 909 892 return head_ref; 910 893 } ··· 1062 1049 xa_release(&delayed_refs->head_refs, index); 1063 1050 spin_unlock(&delayed_refs->lock); 1064 1051 ret = PTR_ERR(new_head_ref); 1052 + 1053 + /* 1054 + * It's only safe to call kfree() on 'qrecord' if 1055 + * add_delayed_ref_head() has _not_ inserted it for 1056 + * tracing. Otherwise we need to handle this here. 1057 + */ 1058 + if (!qrecord_reserved || qrecord_inserted) 1059 + goto free_head_ref; 1065 1060 goto free_record; 1066 1061 } 1067 1062 head_ref = new_head_ref; ··· 1092 1071 1093 1072 if (qrecord_inserted) 1094 1073 return btrfs_qgroup_trace_extent_post(trans, record, generic_ref->bytenr); 1074 + 1075 + kfree(record); 1095 1076 return 0; 1096 1077 1097 1078 free_record: