Linux kernel ============ The Linux kernel is the core of any Linux operating system. It manages hardware, system resources, and provides the fundamental services for all other software. Quick Start ----------- * Report a bug: See Documentation/admin-guide/reporting-issues.rst * Get the latest kernel: https://kernel.org * Build the kernel: See Documentation/admin-guide/quickly-build-trimmed-linux.rst * Join the community: https://lore.kernel.org/ Essential Documentation ----------------------- All users should be familiar with: * Building requirements: Documentation/process/changes.rst * Code of Conduct: Documentation/process/code-of-conduct.rst * License: See COPYING Documentation can be built with make htmldocs or viewed online at: https://www.kernel.org/doc/html/latest/ Who Are You? ============ Find your role below: * New Kernel Developer - Getting started with kernel development * Academic Researcher - Studying kernel internals and architecture * Security Expert - Hardening and vulnerability analysis * Backport/Maintenance Engineer - Maintaining stable kernels * System Administrator - Configuring and troubleshooting * Maintainer - Leading subsystems and reviewing patches * Hardware Vendor - Writing drivers for new hardware * Distribution Maintainer - Packaging kernels for distros * AI Coding Assistant - LLMs and AI-powered development tools For Specific Users ================== New Kernel Developer -------------------- Welcome! Start your kernel development journey here: * Getting Started: Documentation/process/development-process.rst * Your First Patch: Documentation/process/submitting-patches.rst * Coding Style: Documentation/process/coding-style.rst * Build System: Documentation/kbuild/index.rst * Development Tools: Documentation/dev-tools/index.rst * Kernel Hacking Guide: Documentation/kernel-hacking/hacking.rst * Core APIs: Documentation/core-api/index.rst Academic Researcher ------------------- Explore the kernel's architecture and internals: * Researcher Guidelines: Documentation/process/researcher-guidelines.rst * Memory Management: Documentation/mm/index.rst * Scheduler: Documentation/scheduler/index.rst * Networking Stack: Documentation/networking/index.rst * Filesystems: Documentation/filesystems/index.rst * RCU (Read-Copy Update): Documentation/RCU/index.rst * Locking Primitives: Documentation/locking/index.rst * Power Management: Documentation/power/index.rst Security Expert --------------- Security documentation and hardening guides: * Security Documentation: Documentation/security/index.rst * LSM Development: Documentation/security/lsm-development.rst * Self Protection: Documentation/security/self-protection.rst * Reporting Vulnerabilities: Documentation/process/security-bugs.rst * CVE Procedures: Documentation/process/cve.rst * Embargoed Hardware Issues: Documentation/process/embargoed-hardware-issues.rst * Security Features: Documentation/userspace-api/seccomp_filter.rst Backport/Maintenance Engineer ----------------------------- Maintain and stabilize kernel versions: * Stable Kernel Rules: Documentation/process/stable-kernel-rules.rst * Backporting Guide: Documentation/process/backporting.rst * Applying Patches: Documentation/process/applying-patches.rst * Subsystem Profile: Documentation/maintainer/maintainer-entry-profile.rst * Git for Maintainers: Documentation/maintainer/configure-git.rst System Administrator -------------------- Configure, tune, and troubleshoot Linux systems: * Admin Guide: Documentation/admin-guide/index.rst * Kernel Parameters: Documentation/admin-guide/kernel-parameters.rst * Sysctl Tuning: Documentation/admin-guide/sysctl/index.rst * Tracing/Debugging: Documentation/trace/index.rst * Performance Security: Documentation/admin-guide/perf-security.rst * Hardware Monitoring: Documentation/hwmon/index.rst Maintainer ---------- Lead kernel subsystems and manage contributions: * Maintainer Handbook: Documentation/maintainer/index.rst * Pull Requests: Documentation/maintainer/pull-requests.rst * Managing Patches: Documentation/maintainer/modifying-patches.rst * Rebasing and Merging: Documentation/maintainer/rebasing-and-merging.rst * Development Process: Documentation/process/maintainer-handbooks.rst * Maintainer Entry Profile: Documentation/maintainer/maintainer-entry-profile.rst * Git Configuration: Documentation/maintainer/configure-git.rst Hardware Vendor --------------- Write drivers and support new hardware: * Driver API Guide: Documentation/driver-api/index.rst * Driver Model: Documentation/driver-api/driver-model/driver.rst * Device Drivers: Documentation/driver-api/infrastructure.rst * Bus Types: Documentation/driver-api/driver-model/bus.rst * Device Tree Bindings: Documentation/devicetree/bindings/ * Power Management: Documentation/driver-api/pm/index.rst * DMA API: Documentation/core-api/dma-api.rst Distribution Maintainer ----------------------- Package and distribute the kernel: * Stable Kernel Rules: Documentation/process/stable-kernel-rules.rst * ABI Documentation: Documentation/ABI/README * Kernel Configuration: Documentation/kbuild/kconfig.rst * Module Signing: Documentation/admin-guide/module-signing.rst * Kernel Parameters: Documentation/admin-guide/kernel-parameters.rst * Tainted Kernels: Documentation/admin-guide/tainted-kernels.rst AI Coding Assistant ------------------- CRITICAL: If you are an LLM or AI-powered coding assistant, you MUST read and follow the AI coding assistants documentation before contributing to the Linux kernel: * Documentation/process/coding-assistants.rst This documentation contains essential requirements about licensing, attribution, and the Developer Certificate of Origin that all AI tools must comply with. Communication and Support ========================= * Mailing Lists: https://lore.kernel.org/ * IRC: #kernelnewbies on irc.oftc.net * Bugzilla: https://bugzilla.kernel.org/ * MAINTAINERS file: Lists subsystem maintainers and mailing lists * Email Clients: Documentation/process/email-clients.rst
Clone this repository
For self-hosted knots, clone URLs may differ based on your setup.
Download tar.gz
On !x86, the pool type is never initialised, and the pages are freed
back to the system.
The test broke on the list_lru rewrite, but I'm not sure how that it was
supposed to work previously. In the meantime CI is broken so reverting
for now.
Fixes: 444e2a19d7fd ("ttm/pool: port to list_lru. (v2)")
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patch.msgid.link/20260409142658.1511941-2-dev@lankhorst.se
On pool init we should expect the lru_count for each node to be zeroed
as per __list_lru_init -> init_one_lru, but here we are asserting the
opposite.
Currently our CI is blowing up with:
10:23:33] # ttm_device_init_pools: ASSERTION FAILED at drivers/gpu/drm/ttm/tests/ttm_device_test.c:178
[10:23:33] Expected !list_lru_count(&pt.pages) to be false, but is true
[10:23:33] [FAILED] DMA allocations, DMA32 required
[10:23:33] [PASSED] No DMA allocations, DMA32 required
[10:23:33] # ttm_device_init_pools: ASSERTION FAILED at drivers/gpu/drm/ttm/tests/ttm_device_test.c:178
[10:23:33] Expected !list_lru_count(&pt.pages) to be false, but is true
Fixes: 444e2a19d7fd ("ttm/pool: port to list_lru. (v2)")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ryszard Knop <ryszard.knop@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patch.msgid.link/20260409121512.81298-3-matthew.auld@intel.com
DRM Rust changes for v7.1-rc1 (2nd)
Nova (Core):
- Don't create intermediate (mutable) references to the whole command
queue buffer, which is potential undefined behavior.
- Add missing padding to the falcon firmware DMA buffer to prevent DMA
transfers going out of range of the DMA buffer.
- Actually set the default values in the bitfield Default
implementation.
- Use u32::from_le_bytes() instead of manual bit shifts to parse the
PCI ROM header.
- Fix a missing colon in the SEC2 boot debug message.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: "Danilo Krummrich" <dakr@kernel.org>
Link: https://patch.msgid.link/DHN5GMSIBKO2.2AYOLXDU4X19S@kernel.org
This gets the memory sizes from the nodes and stores the limit
as 50% of those. I think eventually we should drop the limits
once we have memcg aware shrinking, but this should be more NUMA
friendly, and I think seems like what people would prefer to
happen on NUMA aware systems.
Cc: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The SEC2 mailbox debug output formats MBOX1 without a colon separator,
producing "MBOX10xdead" instead of "MBOX1: 0xdead". The GSP debug
message a few lines above uses the correct format.
Fixes: 5949d419c193 ("gpu: nova-core: gsp: Boot GSP")
Signed-off-by: David Carlier <devnexen@gmail.com>
Link: https://patch.msgid.link/20260331103744.605683-1-devnexen@gmail.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
This enable NUMA awareness for the shrinker on the
ttm pools.
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Clippy fires two clippy::precedence warnings on the manual
byte-shifting expression:
warning: operator precedence can trip the unwary
--> drivers/gpu/nova-core/vbios.rs:511:17
|
511 | / u32::from(data[29]) << 24
512 | | | u32::from(data[28]) << 16
513 | | | u32::from(data[27]) << 8
| |______________________________________________^
Clear the warnings by replacing manual byte-shifting with
u32::from_le_bytes(). Using from_le_bytes() is also a tiny code
improvement, because it uses less code and is clearer about the intent.
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Link: https://patch.msgid.link/20260404212831.78971-2-jhubbard@nvidia.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
The list_lru will now handle numa for us, so no need to keep
separate pool types for it. Just consolidate into the global ones.
This adds a debugfs change to avoid dumping non-existant orders due
to this change.
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The current implementation does not actually set the default values for
the fields in the bitfield.
Fixes: 3fa145bef533 ("gpu: nova-core: register: generate correct `Default` implementation")
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
Link: https://patch.msgid.link/20260401-fix-bitfield-v2-1-2fa68c98114a@nvidia.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
This is an initial port of the TTM pools for
write combined and uncached pages to use the list_lru.
This makes the pool's more NUMA aware and avoids
needing separate NUMA pools (later commit enables this).
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Commit a88831502c8f ("gpu: nova-core: falcon: use dma::Coherent")
dropped the nova-local `DmaObject` device memory type for the
kernel-global `Coherent` one.
This switch had a side-effect: `DmaObject` always aligned the requested
size to `PAGE_SIZE`, and also reported that adjusted size when queried.
`Coherent`, on the other hand, does page-align allocation sizes but only
allows CPU access on the exact size provided by the caller.
This change runs into a limitation of falcon DMA copies, namely that DMA
accesses are done on blocks of exactly 256 bytes. If the provided data
does not have a length that is a multiple of 256, `dma_wr` returns
an error.
It was expected that all firmwares would present the proper adjusted
size, but this is not the case at least on my GA107:
NovaCore 0000:08:00.0: DMA transfer goes beyond range of DMA object
NovaCore 0000:08:00.0: Failed to load FWSEC firmware: EINVAL
NovaCore 0000:08:00.0: probe with driver NovaCore failed with error -22
Fix this by padding the `Coherent`'s size to `MEM_BLOCK_ALIGNMENT` (i.e.
256) when allocating it and filling it with zeroes, before copying the
firmware on top of it.
Fixes: a88831502c8f ("gpu: nova-core: falcon: use dma::Coherent")
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20260405-falcon-dma-roundup-v2-1-4af5b2ff9c16@nvidia.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
This uses the newly introduced per-node gpu tracking stats,
to track GPU memory allocated via TTM and reclaimable memory in
the TTM page pools.
These stats will be useful later for system information and
later when mem cgroups are integrated.
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
`driver_read_area` and `driver_write_area` are internal methods that
return slices containing the area of the command queue buffer that the
driver has exclusive read or write access, respectively.
While their returned value is correct and safe to use, internally they
temporarily create a reference to the whole command-buffer slice,
including GSP-owned regions. These regions can change without notice,
and thus creating a slice to them, even if never accessed, is undefined
behavior.
Fix this by making these methods create slices to valid regions only.
Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
Reported-by: Danilo Krummrich <dakr@kernel.org>
Closes: https://lore.kernel.org/all/DH47AVPEKN06.3BERUSJIB4M1R@kernel.org/
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260404-cmdq-ub-fix-v5-1-53d21f4752f5@nvidia.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
While discussing memcg intergration with gpu memory allocations,
it was pointed out that there was no numa/system counters for
GPU memory allocations.
With more integrated memory GPU server systems turning up, and
more requirements for memory tracking it seems we should start
closing the gap.
Add two counters to track GPU per-node system memory allocations.
The first is currently allocated to GPU objects, and the second
is for memory that is stored in GPU page pools that can be reclaimed,
by the shrinker.
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Zi Yan <ziy@nvidia.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Factor out a chunk of complexity into a new subroutine. This is an
incremental step in adding ELF32 support to the existing ELF64 section
support, for handling GPU firmware.
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260326013902.588242-9-jhubbard@nvidia.com
[acourbot: use fuller prefix in commit message.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Thomas Zimmermann needs 2f42c1a61616 ("drm/ast: dp501: Fix
initialization of SCU2C") for drm-misc-next.
Conflicts:
- drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c
Just between e927b36ae18b ("drm/amd/display: Fix NULL pointer
dereference in dcn401_init_hw()") and it's cherry-pick that confused
git.
- drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
Deleted in 6b0a6116286e ("drm/amd/pm: Unify version check in SMUv11")
but some cherry-picks confused git. Same for v12/v14.
Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>