Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

NFSD: Encode COMPOUND operation status on page boundaries

J. David reports an odd corruption of a READDIR reply sent to a
FreeBSD client.

xdr_reserve_space() has to do a special trick when the @nbytes value
requests more space than there is in the current page of the XDR
buffer.

In that case, xdr_reserve_space() returns a pointer to the start of
the next page, and then the next call to xdr_reserve_space() invokes
__xdr_commit_encode() to copy enough of the data item back into the
previous page to make that data item contiguous across the page
boundary.
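
As a rough illustration of the split described above, the sketch below computes how many bytes of a reservation fit in the current page and how many spill into the next one. The helper name split_reservation(), the struct, and the 4096-byte page size used in the usage note are assumptions made for this example; they are not taken from the kernel sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: how one reservation is divided between pages. */
struct split {
	size_t in_current_page;	/* bytes that still fit in the current page */
	size_t in_next_page;	/* bytes that land at the start of the next page */
};

/*
 * Illustrative helper, not a kernel function: split a reservation of
 * @nbytes when @used bytes of a @page_size page are already consumed.
 */
static struct split split_reservation(size_t page_size, size_t used,
				      size_t nbytes)
{
	struct split s;
	size_t room;

	assert(used <= page_size);
	room = page_size - used;
	if (nbytes <= room) {
		/* The whole item fits; no copy-back is needed later. */
		s.in_current_page = nbytes;
		s.in_next_page = 0;
	} else {
		/* The item straddles the boundary. */
		s.in_current_page = room;
		s.in_next_page = nbytes - room;
	}
	return s;
}
```

With an assumed 4096-byte page that has 4092 bytes in use, an 8-byte reservation splits into 4 bytes in the current page and 4 bytes in the next, which is the situation the rest of this description deals with.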

But we need to be careful in the case where buffer space is reserved
early for a data item whose value will be inserted into the buffer
later.

One such caller, nfsd4_encode_operation(), reserves 8 bytes in the
encoding buffer for each COMPOUND operation. However, a READDIR
result can sometimes encode file names so that there are only 4
bytes left at the end of the current XDR buffer page (though plenty
of pages are left to handle the remaining encoding tasks).

If a COMPOUND operation follows the READDIR result (say, a GETATTR),
then nfsd4_encode_operation() will reserve 8 bytes for the op number
(9) and the op status (usually NFS4_OK). In this weird case,
xdr_reserve_space() returns a pointer to byte zero of the next buffer
page, as it assumes the data item will be copied back into place (in
the previous page) on the next call to xdr_reserve_space().

nfsd4_encode_operation() writes the op num into the buffer, then
saves the next 4-byte location for the op's status code. The next
xdr_reserve_space() call is part of GETATTR encoding, so the op num
gets copied back into the previous page, but the saved location for
the op status continues to point to the wrong spot in the current
XDR buffer page because __xdr_commit_encode() moved that data item.

After GETATTR encoding is complete, nfsd4_encode_operation() writes
the op status over the first XDR data item in the GETATTR result.
The NFS4_OK status code (0) makes it look like there are zero items
in the GETATTR's attribute bitmask.
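
The sequence above can be reproduced in a small user-space model. This is only a sketch: struct toy_stream and every toy_*() and encode_*() name are hypothetical, the pages hold four 32-bit words instead of a real page, and the reserve/commit logic is a simplified imitation of the behavior described in this message, not the kernel's implementation. It contrasts saving a pointer to the status slot with saving its offset in the buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_WORDS 4	/* toy page: four 32-bit XDR words */
#define NPAGES 3

struct toy_stream {
	uint32_t pages[NPAGES][PAGE_WORDS];
	int page, pos;		/* write cursor: page index, word within page */
	int len;		/* words encoded so far (logical offset) */
	uint32_t *tail;		/* unused tail of the previous page */
	int tail_words;		/* words still to be copied back into it */
};

static void toy_init(struct toy_stream *s)
{
	memset(s->pages, 0xAA, sizeof(s->pages));	/* poison unwritten words */
	s->page = s->pos = s->len = s->tail_words = 0;
	s->tail = NULL;
}

/* Make the last straddling item contiguous across the page boundary. */
static void toy_commit(struct toy_stream *s)
{
	uint32_t *page = s->pages[s->page];

	if (!s->tail_words)
		return;
	memcpy(s->tail, page, s->tail_words * sizeof(*page));
	memmove(page, page + s->tail_words, s->pos * sizeof(*page));
	s->tail_words = 0;
}

static uint32_t *toy_reserve(struct toy_stream *s, int nwords)
{
	uint32_t *p;
	int room;

	assert(nwords > 0 && nwords <= PAGE_WORDS);
	toy_commit(s);
	if (s->pos == PAGE_WORDS) {	/* current page is exactly full */
		s->page++;
		s->pos = 0;
	}
	assert(s->page < NPAGES);
	room = PAGE_WORDS - s->pos;
	s->len += nwords;
	if (nwords <= room) {
		p = &s->pages[s->page][s->pos];
		s->pos += nwords;
		return p;
	}
	/* Straddle: hand out the next page and fix things up on commit. */
	assert(s->page + 1 < NPAGES);
	s->tail = &s->pages[s->page][s->pos];
	s->tail_words = room;
	s->page++;
	s->pos = nwords - room;
	return s->pages[s->page];
}

/* Address of the word at logical offset @off, after any pending commit. */
static uint32_t *toy_word(struct toy_stream *s, int off)
{
	toy_commit(s);
	return &s->pages[off / PAGE_WORDS][off % PAGE_WORDS];
}

/* Old approach: reserve two words at once and keep a pointer to the status. */
static void encode_with_saved_pointer(struct toy_stream *s)
{
	uint32_t *p;

	toy_init(s);
	toy_reserve(s, 3);		/* earlier reply data: one word left */
	p = toy_reserve(s, 2);		/* op number + status straddle the page */
	*p++ = 9;			/* op number; p is the saved status slot */
	*toy_reserve(s, 1) = 2;		/* first result item; triggers the commit */
	*p = 0;				/* status lands on the result item */
}

/* New approach: remember the status slot by offset and write through it. */
static void encode_with_saved_offset(struct toy_stream *s)
{
	int status_off;

	toy_init(s);
	toy_reserve(s, 3);
	*toy_reserve(s, 1) = 9;		/* op number */
	status_off = s->len;
	toy_reserve(s, 1);		/* one-word slot never straddles */
	*toy_reserve(s, 1) = 2;		/* first result item */
	*toy_word(s, status_off) = 0;	/* status goes where it belongs */
}
```

In this model, the saved-pointer variant leaves the status word unwritten and overwrites the first result item with 0, while the saved-offset variant leaves both words intact. A single-word reservation can never straddle a page here because the toy page size is a whole number of words.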

The patch description of commit 2825a7f90753 ("nfsd4: allow encoding
across page boundaries") [2014] remarks that NFSD "can't handle a
new operation starting close to the end of a page." This bug appears
to be one reason for that remark.

Reported-by: J David <j.david.lists@gmail.com>
Closes: https://lore.kernel.org/linux-nfs/3998d739-c042-46b4-8166-dbd6c5f0e804@oracle.com/T/#t
Tested-by: Rick Macklem <rmacklem@uoguelph.ca>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

+10 -10
fs/nfsd/nfs4xdr.c
···
 	struct nfs4_stateowner *so = resp->cstate.replay_owner;
 	struct svc_rqst *rqstp = resp->rqstp;
 	const struct nfsd4_operation *opdesc = op->opdesc;
-	int post_err_offset;
+	unsigned int op_status_offset;
 	nfsd4_enc encoder;
-	__be32 *p;
 
-	p = xdr_reserve_space(xdr, 8);
-	if (!p)
+	if (xdr_stream_encode_u32(xdr, op->opnum) != XDR_UNIT)
 		goto release;
-	*p++ = cpu_to_be32(op->opnum);
-	post_err_offset = xdr->buf->len;
+	op_status_offset = xdr->buf->len;
+	if (!xdr_reserve_space(xdr, XDR_UNIT))
+		goto release;
 
 	if (op->opnum == OP_ILLEGAL)
 		goto status;
···
 		 * bug if we had to do this on a non-idempotent op:
 		 */
 		warn_on_nonidempotent_op(op);
-		xdr_truncate_encode(xdr, post_err_offset);
+		xdr_truncate_encode(xdr, op_status_offset + XDR_UNIT);
 	}
 	if (so) {
-		int len = xdr->buf->len - post_err_offset;
+		int len = xdr->buf->len - (op_status_offset + XDR_UNIT);
 
 		so->so_replay.rp_status = op->status;
 		so->so_replay.rp_buflen = len;
-		read_bytes_from_xdr_buf(xdr->buf, post_err_offset,
+		read_bytes_from_xdr_buf(xdr->buf, op_status_offset + XDR_UNIT,
 						so->so_replay.rp_buf, len);
 	}
 status:
 	op->status = nfsd4_map_status(op->status,
 				      resp->cstate.minorversion);
-	*p = op->status;
+	write_bytes_to_xdr_buf(xdr->buf, op_status_offset,
+			       &op->status, XDR_UNIT);
 release:
 	if (opdesc && opdesc->op_release)
 		opdesc->op_release(&op->u);