Linux kernel ============ The Linux kernel is the core of any Linux operating system. It manages hardware, system resources, and provides the fundamental services for all other software. Quick Start ----------- * Report a bug: See Documentation/admin-guide/reporting-issues.rst * Get the latest kernel: https://kernel.org * Build the kernel: See Documentation/admin-guide/quickly-build-trimmed-linux.rst * Join the community: https://lore.kernel.org/ Essential Documentation ----------------------- All users should be familiar with: * Building requirements: Documentation/process/changes.rst * Code of Conduct: Documentation/process/code-of-conduct.rst * License: See COPYING Documentation can be built with make htmldocs or viewed online at: https://www.kernel.org/doc/html/latest/ Who Are You? ============ Find your role below: * New Kernel Developer - Getting started with kernel development * Academic Researcher - Studying kernel internals and architecture * Security Expert - Hardening and vulnerability analysis * Backport/Maintenance Engineer - Maintaining stable kernels * System Administrator - Configuring and troubleshooting * Maintainer - Leading subsystems and reviewing patches * Hardware Vendor - Writing drivers for new hardware * Distribution Maintainer - Packaging kernels for distros * AI Coding Assistant - LLMs and AI-powered development tools For Specific Users ================== New Kernel Developer -------------------- Welcome! Start your kernel development journey here: * Getting Started: Documentation/process/development-process.rst * Your First Patch: Documentation/process/submitting-patches.rst * Coding Style: Documentation/process/coding-style.rst * Build System: Documentation/kbuild/index.rst * Development Tools: Documentation/dev-tools/index.rst * Kernel Hacking Guide: Documentation/kernel-hacking/hacking.rst * Core APIs: Documentation/core-api/index.rst Academic Researcher ------------------- Explore the kernel's architecture and internals: * Researcher Guidelines: Documentation/process/researcher-guidelines.rst * Memory Management: Documentation/mm/index.rst * Scheduler: Documentation/scheduler/index.rst * Networking Stack: Documentation/networking/index.rst * Filesystems: Documentation/filesystems/index.rst * RCU (Read-Copy Update): Documentation/RCU/index.rst * Locking Primitives: Documentation/locking/index.rst * Power Management: Documentation/power/index.rst Security Expert --------------- Security documentation and hardening guides: * Security Documentation: Documentation/security/index.rst * LSM Development: Documentation/security/lsm-development.rst * Self Protection: Documentation/security/self-protection.rst * Reporting Vulnerabilities: Documentation/process/security-bugs.rst * CVE Procedures: Documentation/process/cve.rst * Embargoed Hardware Issues: Documentation/process/embargoed-hardware-issues.rst * Security Features: Documentation/userspace-api/seccomp_filter.rst Backport/Maintenance Engineer ----------------------------- Maintain and stabilize kernel versions: * Stable Kernel Rules: Documentation/process/stable-kernel-rules.rst * Backporting Guide: Documentation/process/backporting.rst * Applying Patches: Documentation/process/applying-patches.rst * Subsystem Profile: Documentation/maintainer/maintainer-entry-profile.rst * Git for Maintainers: Documentation/maintainer/configure-git.rst System Administrator -------------------- Configure, tune, and troubleshoot Linux systems: * Admin Guide: Documentation/admin-guide/index.rst * Kernel Parameters: Documentation/admin-guide/kernel-parameters.rst * Sysctl Tuning: Documentation/admin-guide/sysctl/index.rst * Tracing/Debugging: Documentation/trace/index.rst * Performance Security: Documentation/admin-guide/perf-security.rst * Hardware Monitoring: Documentation/hwmon/index.rst Maintainer ---------- Lead kernel subsystems and manage contributions: * Maintainer Handbook: Documentation/maintainer/index.rst * Pull Requests: Documentation/maintainer/pull-requests.rst * Managing Patches: Documentation/maintainer/modifying-patches.rst * Rebasing and Merging: Documentation/maintainer/rebasing-and-merging.rst * Development Process: Documentation/process/maintainer-handbooks.rst * Maintainer Entry Profile: Documentation/maintainer/maintainer-entry-profile.rst * Git Configuration: Documentation/maintainer/configure-git.rst Hardware Vendor --------------- Write drivers and support new hardware: * Driver API Guide: Documentation/driver-api/index.rst * Driver Model: Documentation/driver-api/driver-model/driver.rst * Device Drivers: Documentation/driver-api/infrastructure.rst * Bus Types: Documentation/driver-api/driver-model/bus.rst * Device Tree Bindings: Documentation/devicetree/bindings/ * Power Management: Documentation/driver-api/pm/index.rst * DMA API: Documentation/core-api/dma-api.rst Distribution Maintainer ----------------------- Package and distribute the kernel: * Stable Kernel Rules: Documentation/process/stable-kernel-rules.rst * ABI Documentation: Documentation/ABI/README * Kernel Configuration: Documentation/kbuild/kconfig.rst * Module Signing: Documentation/admin-guide/module-signing.rst * Kernel Parameters: Documentation/admin-guide/kernel-parameters.rst * Tainted Kernels: Documentation/admin-guide/tainted-kernels.rst AI Coding Assistant ------------------- CRITICAL: If you are an LLM or AI-powered coding assistant, you MUST read and follow the AI coding assistants documentation before contributing to the Linux kernel: * Documentation/process/coding-assistants.rst This documentation contains essential requirements about licensing, attribution, and the Developer Certificate of Origin that all AI tools must comply with. Communication and Support ========================= * Mailing Lists: https://lore.kernel.org/ * IRC: #kernelnewbies on irc.oftc.net * Bugzilla: https://bugzilla.kernel.org/ * MAINTAINERS file: Lists subsystem maintainers and mailing lists * Email Clients: Documentation/process/email-clients.rst
Clone this repository
For self-hosted knots, clone URLs may differ based on your setup.
Download tar.gz
Mark ublk_filter_unused_tags() as noinline since it is only called from
the unlikely(needs_filter) branch. Extract the error-handling block from
__ublk_batch_dispatch() into a new noinline ublk_batch_dispatch_fail()
function to keep the hot path compact and icache-friendly. This also
makes __ublk_batch_dispatch() more readable by separating the error
recovery logic from the normal dispatch flow.
Before: __ublk_batch_dispatch is ~1419 bytes
After: __ublk_batch_dispatch is ~1090 bytes (-329 bytes, -23%)
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://patch.msgid.link/20260318014112.3125432-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull MD changes from Yu Kuia:
"Bug Fixes:
- md: suppress spurious superblock update error message for dm-raid
(Chen Cheng)
- md/raid1: fix the comparing region of interval tree (Xiao Ni)
- md/raid10: fix deadlock with check operation and nowait requests
(Josh Hunt)
- md/raid5: skip 2-failure compute when other disk is R5_LOCKED
(FengWei Shih)
- md/md-llbitmap: raise barrier before state machine transition
(Yu Kuai)
- md/md-llbitmap: skip reading rdevs that are not in_sync (Yu Kuai)
Improvements:
- md/raid5: set chunk_sectors to enable full stripe I/O splitting
(Yu Kuai)
Cleanups:
- md: remove unused mddev argument from export_rdev (Chen Cheng)
- md/raid5: remove stale md_raid5_kick_device() declaration
(Chen Cheng)
- md/raid5: move handle_stripe() comment to correct location
(Chen Cheng)"
* tag 'md-7.1-20260323' of git://git.kernel.org/pub/scm/linux/kernel/git/mdraid/linux:
md: remove unused mddev argument from export_rdev
md/raid5: move handle_stripe() comment to correct location
md/raid5: remove stale md_raid5_kick_device() declaration
md/raid1: fix the comparing region of interval tree
md/raid5: skip 2-failure compute when other disk is R5_LOCKED
md/md-llbitmap: raise barrier before state machine transition
md/md-llbitmap: skip reading rdevs that are not in_sync
md/raid5: set chunk_sectors to enable full stripe I/O splitting
md/raid10: fix deadlock with check operation and nowait requests
md: suppress spurious superblock update error message for dm-raid
In preparation for removing the strlcat API[1], replace the char *pp_buf
with a struct seq_buf, which tracks the current write position and
remaining space internally. This allows for:
- Direct use of seq_buf_printf() in place of snprintf()+strlcat()
pairs, eliminating local tmp buffers throughout.
- Adjacent strlcat() calls that build strings piece-by-piece
(e.g., strlcat("["); strlcat(name); strlcat("]")) to be collapsed
into single seq_buf_printf() calls.
- Simpler call sites: seq_buf_puts() takes only the buffer and string,
with no need to pass PAGE_SIZE at every call.
The backing buffer allocation is unchanged (__get_free_page), and the
output path uses seq_buf_str() to NUL-terminate before passing to
printk().
Link: https://github.com/KSPP/linux/issues/370 [1]
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Josh Law <objecting@objecting.org>
Signed-off-by: Kees Cook <kees@kernel.org>
Reviewed-by: Josh Law <objecting@objecting.org>
Link: https://patch.msgid.link/20260321004840.work.670-kees@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The mddev argument in export_rdev() is never used. Remove it to
simplify callers.
Signed-off-by: Chen Cheng <chencheng@fnnas.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Link: https://lore.kernel.org/linux-raid/20260304111417.20777-1-chencheng@fnnas.com/
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Implement the SCSI-specific io_uring command handler for BSG using
struct bsg_uring_cmd.
The handler builds a SCSI request from the io_uring command, maps user
buffers (including fixed buffers), and completes asynchronously via a
request end_io callback and task_work. Completion returns a 32-bit
status and packed residual/sense information via CQE res and res2, and
supports IO_URING_F_NONBLOCK.
Signed-off-by: Yang Xiuwei <yangxiuwei@kylinos.cn>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20260317072226.2598233-4-yangxiuwei@kylinos.cn
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Move the handle_stripe() documentation comment from above
analyse_stripe() to directly above handle_stripe() where it belongs.
Signed-off-by: Chen Cheng <chencheng@fnnas.com>
Reviewed-by: Yu Kuai <yukuai@fnnas.com>
Link: https://lore.kernel.org/linux-raid/20260304111001.15767-1-chencheng@fnnas.com/
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Add an io_uring command handler to the generic BSG layer. The new
.uring_cmd file operation validates io_uring features and delegates
handling to a per-queue bsg_uring_cmd_fn callback.
Extend bsg_register_queue() so transport drivers can register both
sg_io and io_uring command handlers.
Signed-off-by: Yang Xiuwei <yangxiuwei@kylinos.cn>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20260317072226.2598233-3-yangxiuwei@kylinos.cn
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Remove the unused md_raid5_kick_device() declaration from raid5.h -
no definition exists for this function.
Signed-off-by: Chen Cheng <chencheng@fnnas.com>
Reviewed-by: Yu Kuai <yukuai@fnnas.com>
Link: https://lore.kernel.org/linux-raid/20260304110919.15071-1-chencheng@fnnas.com/
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Add the bsg_uring_cmd structure to the BSG UAPI header to support
io_uring-based SCSI passthrough operations via IORING_OP_URING_CMD.
Signed-off-by: Yang Xiuwei <yangxiuwei@kylinos.cn>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20260317072226.2598233-2-yangxiuwei@kylinos.cn
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Interval tree uses [start, end] as a region which stores in the tree.
In raid1, it uses the wrong end value. For example:
bio(A,B) is too big and needs to be split to bio1(A,C-1), bio2(C,B).
The region of bio1 is [A,C] and the region of bio2 is [C,B]. So bio1 and
bio2 overlap which is not right.
Fix this problem by using right end value of the region.
Fixes: d0d2d8ba0494 ("md/raid1: introduce wait_for_serialization")
Signed-off-by: Xiao Ni <xni@redhat.com>
Link: https://lore.kernel.org/linux-raid/20260305011839.5118-2-xni@redhat.com/
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
The function bio_add_page() returns the number of bytes added to the
bio, and if that failed it should return 0.
However there is a special quirk, if a caller is passing a page with
length 0, that function will always return 0 but with different results:
- The page is added to the bio
If there is enough bvec slot or the folio can be merged with the last
bvec.
The return value 0 is just the length passed in, which is also 0.
- The page is not added to the bio
If the page is not mergeable with the last bvec, or there is no bvec
slot available.
The return value 0 means page is not added into the bio.
Unfortunately the caller is not able to distinguish the above two cases,
and will treat the 0 return value as page addition failure.
In that case, this can lead to the double releasing of the last page:
- By the bio cleanup
Which normally goes through every page of the bio, including the last
page which is added into the bio.
- By the caller
Which believes the page is not added into the bio, thus would manually
release the page.
I do not think anyone should call bio_add_folio()/bio_add_page() with zero
length, but idiots like me can still show up.
So add an extra WARN_ON_ONCE() check for zero length and rejects it
early to avoid double freeing.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Link: https://patch.msgid.link/bc2223c080f38d0b63f968f605c918181c840f40.1773734749.git.wqu@suse.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When skip_copy is enabled on a doubly-degraded RAID6, a device that is
being written to will be in R5_LOCKED state with R5_UPTODATE cleared.
If a new read triggers fetch_block() while the write is still in
flight, the 2-failure compute path may select this locked device as a
compute target because it is not R5_UPTODATE.
Because skip_copy makes the device page point directly to the bio page,
reconstructing data into it might be risky. Also, since the compute
marks the device R5_UPTODATE, it triggers WARN_ON in ops_run_io()
which checks that R5_SkipCopy and R5_UPTODATE are not both set.
This can be reproduced by running small-range concurrent read/write on
a doubly-degraded RAID6 with skip_copy enabled, for example:
mdadm -C /dev/md0 -l6 -n6 -R -f /dev/loop[0-3] missing missing
echo 1 > /sys/block/md0/md/skip_copy
fio --filename=/dev/md0 --rw=randrw --bs=4k --numjobs=8 \
--iodepth=32 --size=4M --runtime=30 --time_based --direct=1
Fix by checking R5_LOCKED before proceeding with the compute. The
compute will be retried once the lock is cleared on IO completion.
Signed-off-by: FengWei Shih <dannyshih@synology.com>
Reviewed-by: Yu Kuai <yukuai@fnnas.com>
Link: https://lore.kernel.org/linux-raid/20260319053351.3676794-1-dannyshih@synology.com/
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
The blk_mq_hw_ctx_sysfs_entry structures are never modified,
mark them as const.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Link: https://patch.msgid.link/20260316-b4-sysfs-const-attr-block-v1-4-a35d73b986b0@weissschuh.net
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Move the barrier raise operation before calling llbitmap_state_machine()
in both llbitmap_start_write() and llbitmap_start_discard(). This
ensures the barrier is in place before any state transitions occur,
preventing potential race conditions where the state machine could
complete before the barrier is properly raised.
Cc: stable@vger.kernel.org
Fixes: 5ab829f1971d ("md/md-llbitmap: introduce new lockless bitmap")
Link: https://lore.kernel.org/linux-raid/20260223024038.3084853-3-yukuai@fnnas.com
Signed-off-by: Yu Kuai <yukuai@fnnas.com>
The blk_crypto_attrs structures are never modified, mark them as const.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: John Garry <john.g.garry@oracle.com>>
Link: https://patch.msgid.link/20260316-b4-sysfs-const-attr-block-v1-3-a35d73b986b0@weissschuh.net
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When reading bitmap pages from member disks, the code iterates through
all rdevs and attempts to read from the first available one. However,
it only checks for raid_disk assignment and Faulty flag, missing the
In_sync flag check.
This can cause bitmap data to be read from spare disks that are still
being rebuilt and don't have valid bitmap information yet. Reading
stale or uninitialized bitmap data from such disks can lead to
incorrect dirty bit tracking, potentially causing data corruption
during recovery or normal operation.
Add the In_sync flag check to ensure bitmap pages are only read from
fully synchronized member disks that have valid bitmap data.
Cc: stable@vger.kernel.org
Fixes: 5ab829f1971d ("md/md-llbitmap: introduce new lockless bitmap")
Link: https://lore.kernel.org/linux-raid/20260223024038.3084853-2-yukuai@fnnas.com
Signed-off-by: Yu Kuai <yukuai@fnnas.com>