Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Merge tag 'devdax-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull device-dax updates from Dan Williams:
"New device-dax infrastructure to allow persistent memory and other
"reserved" / performance differentiated memories, to be assigned to
the core-mm as "System RAM".

Some users want to use persistent memory as additional volatile
memory. They are willing to cope with potential performance
differences, for example between DRAM and 3D XPoint, and want to use
typical Linux memory management APIs rather than a userspace memory
allocator layered over an mmap() of a dax file. The administration
model is to decide how much Persistent Memory (pmem) to use as System
RAM, create a device-dax-mode namespace of that size, and then assign
it to the core-mm. The rationale for device-dax is that it is a
generic memory-mapping driver that can be layered over any "special
purpose" memory, not just pmem. On subsequent boots udev rules can be
used to restore the memory assignment.

One implication of using pmem as RAM is that mlock() no longer keeps
data off persistent media. For this reason it is recommended to enable
NVDIMM Security (previously merged for 5.0) to encrypt pmem contents
at rest. We considered making this recommendation an actively enforced
requirement, but in the end decided to leave it as a distribution /
administrator policy to allow for emulation and test environments that
lack security capable NVDIMMs.

Summary:

- Replace the /sys/class/dax device model with /sys/bus/dax, and
include a compat driver so distributions can opt-in to the new ABI.

- Allow for an alternative driver for the device-dax address-range

- Introduce the 'kmem' driver to hotplug / assign a device-dax
address-range to the core-mm.

- Arrange for the device-dax target-node to be onlined so that the
newly added memory range can be uniquely referenced by NUMA APIs"
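
The administration step described above (detach a device-dax instance from the default driver and assign it to the core-mm through 'kmem') can be sketched from userspace roughly as follows. This is only an illustration under stated assumptions: the sysfs directory is inferred from the "dax" bus name and the driver names 'device_dax' and 'kmem' from the Kconfig help text and Makefile in this merge, and the helper names (dax_driver_attr_path, dax_driver_attr_write, dax_assign_to_kmem) are hypothetical, not part of any kernel or library API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed sysfs location of dax bus drivers (inferred from the bus name). */
#define DAX_DRIVERS_DIR "/sys/bus/dax/drivers"

/*
 * Build "<DAX_DRIVERS_DIR>/<driver>/<attr>" into out.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int dax_driver_attr_path(char *out, size_t len, const char *driver,
			 const char *attr)
{
	int n;

	assert(out && driver && attr);
	n = snprintf(out, len, "%s/%s/%s", DAX_DRIVERS_DIR, driver, attr);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write a device name such as "dax0.0" to a driver attribute file. */
int dax_driver_attr_write(const char *driver, const char *attr,
			  const char *dev_name)
{
	char path[256];
	size_t len = strlen(dev_name);
	FILE *f;
	int rc = 0;

	if (dax_driver_attr_path(path, sizeof(path), driver, attr))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fwrite(dev_name, 1, len, f) != len)
		rc = -1;
	if (fclose(f) == EOF)
		rc = -1;
	return rc;
}

/*
 * Hand a device-dax instance to the core-mm: unbind it from the default
 * device_dax driver, then add its name to kmem's id list via new_id,
 * which (per "Auto-bind device after successful new_id") also binds it.
 */
int dax_assign_to_kmem(const char *dev_name)
{
	if (dax_driver_attr_write("device_dax", "unbind", dev_name))
		return -1;
	return dax_driver_attr_write("kmem", "new_id", dev_name);
}
```

A caller would use something like dax_assign_to_kmem("dax0.0") with root privileges. Since the assignment does not persist across reboots, the same two writes are what a udev rule would have to repeat on each boot.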

NOTE! I'm not entirely happy with the whole "PMEM as RAM" model because
we currently have special - and very annoying - rules in the kernel about
accessing PMEM only with the "MC safe" accessors, because machine checks
inside the regular repeat string copy functions can be fatal in some
(not described) circumstances.

And apparently the PMEM modules can cause that a lot more than regular
RAM. The argument is that this happens because PMEM doesn't necessarily
get scrubbed at boot like RAM does, but that is planned to be added for
the user space tooling.

Quoting Dan from another email:
"The exposure can be reduced in the volatile-RAM case by scanning for
and clearing errors before it is onlined as RAM. The userspace tooling
for that can be in place before v5.1-final. There's also runtime
notifications of errors via acpi_nfit_uc_error_notify() from
background scrubbers on the DIMM devices. With that mechanism the
kernel could proactively clear newly discovered poison in the volatile
case, but that would be additional development more suitable for v5.2.

I understand the concern, and the need to highlight this issue by
tapping the brakes on feature development, but I don't see PMEM as RAM
making the situation worse when the exposure is also there via DAX in
the PMEM case. Volatile-RAM is arguably a safer use case since it's
possible to repair pages where the persistent case needs active
application coordination"

* tag 'devdax-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
device-dax: "Hotplug" persistent memory for use like normal RAM
mm/resource: Let walk_system_ram_range() search child resources
mm/memory-hotplug: Allow memory resources to be children
mm/resource: Move HMM pr_debug() deeper into resource code
mm/resource: Return real error codes from walk failures
device-dax: Add a 'modalias' attribute to DAX 'bus' devices
device-dax: Add a 'target_node' attribute
device-dax: Auto-bind device after successful new_id
acpi/nfit, device-dax: Identify differentiated memory with a unique numa-node
device-dax: Add /sys/class/dax backwards compatibility
device-dax: Add support for a dax override driver
device-dax: Move resource pinning+mapping into the common driver
device-dax: Introduce bus + driver model
device-dax: Start defining a dax bus model
device-dax: Remove multi-resource infrastructure
device-dax: Kill dax_region base
device-dax: Kill dax_region ida

+1112 -538
+22
Documentation/ABI/obsolete/sysfs-class-dax
···
+What:		/sys/class/dax/
+Date:		May, 2016
+KernelVersion:	v4.7
+Contact:	linux-nvdimm@lists.01.org
+Description:	Device DAX is the device-centric analogue of Filesystem
+		DAX (CONFIG_FS_DAX). It allows memory ranges to be
+		allocated and mapped without need of an intervening file
+		system. Device DAX is strict, precise and predictable.
+		Specifically this interface:
+
+		1/ Guarantees fault granularity with respect to a given
+		page size (pte, pmd, or pud) set at configuration time.
+
+		2/ Enforces deterministic behavior by being strict about
+		what fault scenarios are supported.
+
+		The /sys/class/dax/ interface enumerates all the
+		device-dax instances in the system. The ABI is
+		deprecated and will be removed after 2020. It is
+		replaced with the DAX bus interface /sys/bus/dax/ where
+		device-dax instances can be found under
+		/sys/bus/dax/devices/
+1
arch/powerpc/platforms/pseries/papr_scm.c
···
 	memset(&ndr_desc, 0, sizeof(ndr_desc));
 	ndr_desc.attr_groups = region_attr_groups;
 	ndr_desc.numa_node = dev_to_node(&p->pdev->dev);
+	ndr_desc.target_node = ndr_desc.numa_node;
 	ndr_desc.res = &p->res;
 	ndr_desc.of_node = p->dn;
 	ndr_desc.provider_data = p;
+6 -2
drivers/acpi/nfit/core.c
···
 	ndr_desc->res = &res;
 	ndr_desc->provider_data = nfit_spa;
 	ndr_desc->attr_groups = acpi_nfit_region_attribute_groups;
-	if (spa->flags & ACPI_NFIT_PROXIMITY_VALID)
+	if (spa->flags & ACPI_NFIT_PROXIMITY_VALID) {
 		ndr_desc->numa_node = acpi_map_pxm_to_online_node(
 				spa->proximity_domain);
-	else
+		ndr_desc->target_node = acpi_map_pxm_to_node(
+				spa->proximity_domain);
+	} else {
 		ndr_desc->numa_node = NUMA_NO_NODE;
+		ndr_desc->target_node = NUMA_NO_NODE;
+	}
 
 	/*
 	 * Persistence domain bits are hierarchical, if
+1
drivers/acpi/numa.c
···
 
 	return node;
 }
+EXPORT_SYMBOL(acpi_map_pxm_to_node);
 
 /**
  * acpi_map_pxm_to_online_node - Map proximity ID to online node
+1
drivers/base/memory.c
···
 {
 	return MIN_MEMORY_BLOCK_SIZE;
 }
+EXPORT_SYMBOL_GPL(memory_block_size_bytes);
 
 static unsigned long get_memory_block_size(void)
 {
+27 -1
drivers/dax/Kconfig
···
 config DEV_DAX_PMEM
 	tristate "PMEM DAX: direct access to persistent memory"
 	depends on LIBNVDIMM && NVDIMM_DAX && DEV_DAX
+	depends on m # until we can kill DEV_DAX_PMEM_COMPAT
 	default DEV_DAX
 	help
 	  Support raw access to persistent memory. Note that this
 	  driver consumes memory ranges allocated and exported by the
 	  libnvdimm sub-system.
 
-	  Say Y if unsure
+	  Say M if unsure
+
+config DEV_DAX_KMEM
+	tristate "KMEM DAX: volatile-use of persistent memory"
+	default DEV_DAX
+	depends on DEV_DAX
+	depends on MEMORY_HOTPLUG # for add_memory() and friends
+	help
+	  Support access to persistent memory as if it were RAM. This
+	  allows easier use of persistent memory by unmodified
+	  applications.
+
+	  To use this feature, a DAX device must be unbound from the
+	  device_dax driver (PMEM DAX) and bound to this kmem driver
+	  on each boot.
+
+	  Say N if unsure.
+
+config DEV_DAX_PMEM_COMPAT
+	tristate "PMEM DAX: support the deprecated /sys/class/dax interface"
+	depends on DEV_DAX_PMEM
+	default DEV_DAX_PMEM
+	help
+	  Older versions of the libdaxctl library expect to find all
+	  device-dax instances under /sys/class/dax. If libdaxctl in
+	  your distribution is older than v58 say M, otherwise say N.
 
 endif
+4 -2
drivers/dax/Makefile
···
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_DAX) += dax.o
 obj-$(CONFIG_DEV_DAX) += device_dax.o
-obj-$(CONFIG_DEV_DAX_PMEM) += dax_pmem.o
+obj-$(CONFIG_DEV_DAX_KMEM) += kmem.o
 
 dax-y := super.o
-dax_pmem-y := pmem.o
+dax-y += bus.o
 device_dax-y := device.o
+
+obj-y += pmem/
+503
drivers/dax/bus.c
···
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2017-2018 Intel Corporation. All rights reserved. */
+#include <linux/memremap.h>
+#include <linux/device.h>
+#include <linux/mutex.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+#include <linux/dax.h>
+#include "dax-private.h"
+#include "bus.h"
+
+static struct class *dax_class;
+
+static DEFINE_MUTEX(dax_bus_lock);
+
+#define DAX_NAME_LEN 30
+struct dax_id {
+	struct list_head list;
+	char dev_name[DAX_NAME_LEN];
+};
+
+static int dax_bus_uevent(struct device *dev, struct kobj_uevent_env *env)
+{
+	/*
+	 * We only ever expect to handle device-dax instances, i.e. the
+	 * @type argument to MODULE_ALIAS_DAX_DEVICE() is always zero
+	 */
+	return add_uevent_var(env, "MODALIAS=" DAX_DEVICE_MODALIAS_FMT, 0);
+}
+
+static struct dax_device_driver *to_dax_drv(struct device_driver *drv)
+{
+	return container_of(drv, struct dax_device_driver, drv);
+}
+
+static struct dax_id *__dax_match_id(struct dax_device_driver *dax_drv,
+		const char *dev_name)
+{
+	struct dax_id *dax_id;
+
+	lockdep_assert_held(&dax_bus_lock);
+
+	list_for_each_entry(dax_id, &dax_drv->ids, list)
+		if (sysfs_streq(dax_id->dev_name, dev_name))
+			return dax_id;
+	return NULL;
+}
+
+static int dax_match_id(struct dax_device_driver *dax_drv, struct device *dev)
+{
+	int match;
+
+	mutex_lock(&dax_bus_lock);
+	match = !!__dax_match_id(dax_drv, dev_name(dev));
+	mutex_unlock(&dax_bus_lock);
+
+	return match;
+}
+
+enum id_action {
+	ID_REMOVE,
+	ID_ADD,
+};
+
+static ssize_t do_id_store(struct device_driver *drv, const char *buf,
+		size_t count, enum id_action action)
+{
+	struct dax_device_driver *dax_drv = to_dax_drv(drv);
+	unsigned int region_id, id;
+	char devname[DAX_NAME_LEN];
+	struct dax_id *dax_id;
+	ssize_t rc = count;
+	int fields;
+
+	fields = sscanf(buf, "dax%d.%d", &region_id, &id);
+	if (fields != 2)
+		return -EINVAL;
+	sprintf(devname, "dax%d.%d", region_id, id);
+	if (!sysfs_streq(buf, devname))
+		return -EINVAL;
+
+	mutex_lock(&dax_bus_lock);
+	dax_id = __dax_match_id(dax_drv, buf);
+	if (!dax_id) {
+		if (action == ID_ADD) {
+			dax_id = kzalloc(sizeof(*dax_id), GFP_KERNEL);
+			if (dax_id) {
+				strncpy(dax_id->dev_name, buf, DAX_NAME_LEN);
+				list_add(&dax_id->list, &dax_drv->ids);
+			} else
+				rc = -ENOMEM;
+		} else
+			/* nothing to remove */;
+	} else if (action == ID_REMOVE) {
+		list_del(&dax_id->list);
+		kfree(dax_id);
+	} else
+		/* dax_id already added */;
+	mutex_unlock(&dax_bus_lock);
+
+	if (rc < 0)
+		return rc;
+	if (action == ID_ADD)
+		rc = driver_attach(drv);
+	if (rc)
+		return rc;
+	return count;
+}
+
+static ssize_t new_id_store(struct device_driver *drv, const char *buf,
+		size_t count)
+{
+	return do_id_store(drv, buf, count, ID_ADD);
+}
+static DRIVER_ATTR_WO(new_id);
+
+static ssize_t remove_id_store(struct device_driver *drv, const char *buf,
+		size_t count)
+{
+	return do_id_store(drv, buf, count, ID_REMOVE);
+}
+static DRIVER_ATTR_WO(remove_id);
+
+static struct attribute *dax_drv_attrs[] = {
+	&driver_attr_new_id.attr,
+	&driver_attr_remove_id.attr,
+	NULL,
+};
+ATTRIBUTE_GROUPS(dax_drv);
+
+static int dax_bus_match(struct device *dev, struct device_driver *drv);
+
+static struct bus_type dax_bus_type = {
+	.name = "dax",
+	.uevent = dax_bus_uevent,
+	.match = dax_bus_match,
+	.drv_groups = dax_drv_groups,
+};
+
+static int dax_bus_match(struct device *dev, struct device_driver *drv)
+{
+	struct dax_device_driver *dax_drv = to_dax_drv(drv);
+
+	/*
+	 * All but the 'device-dax' driver, which has 'match_always'
+	 * set, requires an exact id match.
+	 */
+	if (dax_drv->match_always)
+		return 1;
+
+	return dax_match_id(dax_drv, dev);
+}
+
+/*
+ * Rely on the fact that drvdata is set before the attributes are
+ * registered, and that the attributes are unregistered before drvdata
+ * is cleared to assume that drvdata is always valid.
+ */
+static ssize_t id_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct dax_region *dax_region = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%d\n", dax_region->id);
+}
+static DEVICE_ATTR_RO(id);
+
+static ssize_t region_size_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct dax_region *dax_region = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%llu\n", (unsigned long long)
+			resource_size(&dax_region->res));
+}
+static struct device_attribute dev_attr_region_size = __ATTR(size, 0444,
+		region_size_show, NULL);
+
+static ssize_t align_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct dax_region *dax_region = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%u\n", dax_region->align);
+}
+static DEVICE_ATTR_RO(align);
+
+static struct attribute *dax_region_attributes[] = {
+	&dev_attr_region_size.attr,
+	&dev_attr_align.attr,
+	&dev_attr_id.attr,
+	NULL,
+};
+
+static const struct attribute_group dax_region_attribute_group = {
+	.name = "dax_region",
+	.attrs = dax_region_attributes,
+};
+
+static const struct attribute_group *dax_region_attribute_groups[] = {
+	&dax_region_attribute_group,
+	NULL,
+};
+
+static void dax_region_free(struct kref *kref)
+{
+	struct dax_region *dax_region;
+
+	dax_region = container_of(kref, struct dax_region, kref);
+	kfree(dax_region);
+}
+
+void dax_region_put(struct dax_region *dax_region)
+{
+	kref_put(&dax_region->kref, dax_region_free);
+}
+EXPORT_SYMBOL_GPL(dax_region_put);
+
+static void dax_region_unregister(void *region)
+{
+	struct dax_region *dax_region = region;
+
+	sysfs_remove_groups(&dax_region->dev->kobj,
+			dax_region_attribute_groups);
+	dax_region_put(dax_region);
+}
+
+struct dax_region *alloc_dax_region(struct device *parent, int region_id,
+		struct resource *res, int target_node, unsigned int align,
+		unsigned long pfn_flags)
+{
+	struct dax_region *dax_region;
+
+	/*
+	 * The DAX core assumes that it can store its private data in
+	 * parent->driver_data. This WARN is a reminder / safeguard for
+	 * developers of device-dax drivers.
+	 */
+	if (dev_get_drvdata(parent)) {
+		dev_WARN(parent, "dax core failed to setup private data\n");
+		return NULL;
+	}
+
+	if (!IS_ALIGNED(res->start, align)
+			|| !IS_ALIGNED(resource_size(res), align))
+		return NULL;
+
+	dax_region = kzalloc(sizeof(*dax_region), GFP_KERNEL);
+	if (!dax_region)
+		return NULL;
+
+	dev_set_drvdata(parent, dax_region);
+	memcpy(&dax_region->res, res, sizeof(*res));
+	dax_region->pfn_flags = pfn_flags;
+	kref_init(&dax_region->kref);
+	dax_region->id = region_id;
+	dax_region->align = align;
+	dax_region->dev = parent;
+	dax_region->target_node = target_node;
+	if (sysfs_create_groups(&parent->kobj, dax_region_attribute_groups)) {
+		kfree(dax_region);
+		return NULL;
+	}
+
+	kref_get(&dax_region->kref);
+	if (devm_add_action_or_reset(parent, dax_region_unregister, dax_region))
+		return NULL;
+	return dax_region;
+}
+EXPORT_SYMBOL_GPL(alloc_dax_region);
+
+static ssize_t size_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct dev_dax *dev_dax = to_dev_dax(dev);
+	unsigned long long size = resource_size(&dev_dax->region->res);
+
+	return sprintf(buf, "%llu\n", size);
+}
+static DEVICE_ATTR_RO(size);
+
+static int dev_dax_target_node(struct dev_dax *dev_dax)
+{
+	struct dax_region *dax_region = dev_dax->region;
+
+	return dax_region->target_node;
+}
+
+static ssize_t target_node_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct dev_dax *dev_dax = to_dev_dax(dev);
+
+	return sprintf(buf, "%d\n", dev_dax_target_node(dev_dax));
+}
+static DEVICE_ATTR_RO(target_node);
+
+static ssize_t modalias_show(struct device *dev, struct device_attribute *attr,
+		char *buf)
+{
+	/*
+	 * We only ever expect to handle device-dax instances, i.e. the
+	 * @type argument to MODULE_ALIAS_DAX_DEVICE() is always zero
+	 */
+	return sprintf(buf, DAX_DEVICE_MODALIAS_FMT "\n", 0);
+}
+static DEVICE_ATTR_RO(modalias);
+
+static umode_t dev_dax_visible(struct kobject *kobj, struct attribute *a, int n)
+{
+	struct device *dev = container_of(kobj, struct device, kobj);
+	struct dev_dax *dev_dax = to_dev_dax(dev);
+
+	if (a == &dev_attr_target_node.attr && dev_dax_target_node(dev_dax) < 0)
+		return 0;
+	return a->mode;
+}
+
+static struct attribute *dev_dax_attributes[] = {
+	&dev_attr_modalias.attr,
+	&dev_attr_size.attr,
+	&dev_attr_target_node.attr,
+	NULL,
+};
+
+static const struct attribute_group dev_dax_attribute_group = {
+	.attrs = dev_dax_attributes,
+	.is_visible = dev_dax_visible,
+};
+
+static const struct attribute_group *dax_attribute_groups[] = {
+	&dev_dax_attribute_group,
+	NULL,
+};
+
+void kill_dev_dax(struct dev_dax *dev_dax)
+{
+	struct dax_device *dax_dev = dev_dax->dax_dev;
+	struct inode *inode = dax_inode(dax_dev);
+
+	kill_dax(dax_dev);
+	unmap_mapping_range(inode->i_mapping, 0, 0, 1);
+}
+EXPORT_SYMBOL_GPL(kill_dev_dax);
+
+static void dev_dax_release(struct device *dev)
+{
+	struct dev_dax *dev_dax = to_dev_dax(dev);
+	struct dax_region *dax_region = dev_dax->region;
+	struct dax_device *dax_dev = dev_dax->dax_dev;
+
+	dax_region_put(dax_region);
+	put_dax(dax_dev);
+	kfree(dev_dax);
+}
+
+static void unregister_dev_dax(void *dev)
+{
+	struct dev_dax *dev_dax = to_dev_dax(dev);
+
+	dev_dbg(dev, "%s\n", __func__);
+
+	kill_dev_dax(dev_dax);
+	device_del(dev);
+	put_device(dev);
+}
+
+struct dev_dax *__devm_create_dev_dax(struct dax_region *dax_region, int id,
+		struct dev_pagemap *pgmap, enum dev_dax_subsys subsys)
+{
+	struct device *parent = dax_region->dev;
+	struct dax_device *dax_dev;
+	struct dev_dax *dev_dax;
+	struct inode *inode;
+	struct device *dev;
+	int rc = -ENOMEM;
+
+	if (id < 0)
+		return ERR_PTR(-EINVAL);
+
+	dev_dax = kzalloc(sizeof(*dev_dax), GFP_KERNEL);
+	if (!dev_dax)
+		return ERR_PTR(-ENOMEM);
+
+	memcpy(&dev_dax->pgmap, pgmap, sizeof(*pgmap));
+
+	/*
+	 * No 'host' or dax_operations since there is no access to this
+	 * device outside of mmap of the resulting character device.
+	 */
+	dax_dev = alloc_dax(dev_dax, NULL, NULL);
+	if (!dax_dev)
+		goto err;
+
+	/* a device_dax instance is dead while the driver is not attached */
+	kill_dax(dax_dev);
+
+	/* from here on we're committed to teardown via dax_dev_release() */
+	dev = &dev_dax->dev;
+	device_initialize(dev);
+
+	dev_dax->dax_dev = dax_dev;
+	dev_dax->region = dax_region;
+	dev_dax->target_node = dax_region->target_node;
+	kref_get(&dax_region->kref);
+
+	inode = dax_inode(dax_dev);
+	dev->devt = inode->i_rdev;
+	if (subsys == DEV_DAX_BUS)
+		dev->bus = &dax_bus_type;
+	else
+		dev->class = dax_class;
+	dev->parent = parent;
+	dev->groups = dax_attribute_groups;
+	dev->release = dev_dax_release;
+	dev_set_name(dev, "dax%d.%d", dax_region->id, id);
+
+	rc = device_add(dev);
+	if (rc) {
+		kill_dev_dax(dev_dax);
+		put_device(dev);
+		return ERR_PTR(rc);
+	}
+
+	rc = devm_add_action_or_reset(dax_region->dev, unregister_dev_dax, dev);
+	if (rc)
+		return ERR_PTR(rc);
+
+	return dev_dax;
+
+ err:
+	kfree(dev_dax);
+
+	return ERR_PTR(rc);
+}
+EXPORT_SYMBOL_GPL(__devm_create_dev_dax);
+
+static int match_always_count;
+
+int __dax_driver_register(struct dax_device_driver *dax_drv,
+		struct module *module, const char *mod_name)
+{
+	struct device_driver *drv = &dax_drv->drv;
+	int rc = 0;
+
+	INIT_LIST_HEAD(&dax_drv->ids);
+	drv->owner = module;
+	drv->name = mod_name;
+	drv->mod_name = mod_name;
+	drv->bus = &dax_bus_type;
+
+	/* there can only be one default driver */
+	mutex_lock(&dax_bus_lock);
+	match_always_count += dax_drv->match_always;
+	if (match_always_count > 1) {
+		match_always_count--;
+		WARN_ON(1);
+		rc = -EINVAL;
+	}
+	mutex_unlock(&dax_bus_lock);
+	if (rc)
+		return rc;
+	return driver_register(drv);
+}
+EXPORT_SYMBOL_GPL(__dax_driver_register);
+
+void dax_driver_unregister(struct dax_device_driver *dax_drv)
+{
+	struct device_driver *drv = &dax_drv->drv;
+	struct dax_id *dax_id, *_id;
+
+	mutex_lock(&dax_bus_lock);
+	match_always_count -= dax_drv->match_always;
+	list_for_each_entry_safe(dax_id, _id, &dax_drv->ids, list) {
+		list_del(&dax_id->list);
+		kfree(dax_id);
+	}
+	mutex_unlock(&dax_bus_lock);
+	driver_unregister(drv);
+}
+EXPORT_SYMBOL_GPL(dax_driver_unregister);
+
+int __init dax_bus_init(void)
+{
+	int rc;
+
+	if (IS_ENABLED(CONFIG_DEV_DAX_PMEM_COMPAT)) {
+		dax_class = class_create(THIS_MODULE, "dax");
+		if (IS_ERR(dax_class))
+			return PTR_ERR(dax_class);
+	}
+
+	rc = bus_register(&dax_bus_type);
+	if (rc)
+		class_destroy(dax_class);
+	return rc;
+}
+
+void __exit dax_bus_exit(void)
+{
+	bus_unregister(&dax_bus_type);
+	class_destroy(dax_class);
+}
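
The name check at the top of do_id_store() (parse "dax<region>.<id>", print it back in canonical form, and require the input to match) may be easier to follow as a userspace sketch. The following is an illustration only, not the kernel code: dax_name_is_canonical() and streq_ignoring_newline() are hypothetical names, the latter is a stand-in assumed to behave like the kernel's sysfs_streq(), and %u is used instead of the diff's %d to match the unsigned variables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define DAX_NAME_LEN 30

/* Compare two strings, treating one trailing newline as end of string. */
bool streq_ignoring_newline(const char *a, const char *b)
{
	while (*a && *a == *b) {
		a++;
		b++;
	}
	if (*a == *b)
		return true;
	if (!*a && *b == '\n' && !b[1])
		return true;
	if (!*b && *a == '\n' && !a[1])
		return true;
	return false;
}

/*
 * Accept only names of the exact form "dax<region>.<id>": parse the two
 * numbers, regenerate the canonical name, and compare it with the input.
 * The round trip rejects trailing junk and non-canonical digits
 * (for example "dax00.1"), which sscanf() alone would let through.
 */
bool dax_name_is_canonical(const char *buf)
{
	unsigned int region_id, id;
	char devname[DAX_NAME_LEN];

	assert(buf != NULL);
	if (sscanf(buf, "dax%u.%u", &region_id, &id) != 2)
		return false;
	snprintf(devname, sizeof(devname), "dax%u.%u", region_id, id);
	return streq_ignoring_newline(buf, devname);
}
```

The trailing-newline tolerance matters because a plain `echo dax0.0 > new_id` writes "dax0.0\n" to the attribute.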
+61
drivers/dax/bus.h
···
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2016 - 2018 Intel Corporation. All rights reserved. */
+#ifndef __DAX_BUS_H__
+#define __DAX_BUS_H__
+#include <linux/device.h>
+
+struct dev_dax;
+struct resource;
+struct dax_device;
+struct dax_region;
+void dax_region_put(struct dax_region *dax_region);
+struct dax_region *alloc_dax_region(struct device *parent, int region_id,
+		struct resource *res, int target_node, unsigned int align,
+		unsigned long flags);
+
+enum dev_dax_subsys {
+	DEV_DAX_BUS,
+	DEV_DAX_CLASS,
+};
+
+struct dev_dax *__devm_create_dev_dax(struct dax_region *dax_region, int id,
+		struct dev_pagemap *pgmap, enum dev_dax_subsys subsys);
+
+static inline struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region,
+		int id, struct dev_pagemap *pgmap)
+{
+	return __devm_create_dev_dax(dax_region, id, pgmap, DEV_DAX_BUS);
+}
+
+/* to be deleted when DEV_DAX_CLASS is removed */
+struct dev_dax *__dax_pmem_probe(struct device *dev, enum dev_dax_subsys subsys);
+
+struct dax_device_driver {
+	struct device_driver drv;
+	struct list_head ids;
+	int match_always;
+};
+
+int __dax_driver_register(struct dax_device_driver *dax_drv,
+		struct module *module, const char *mod_name);
+#define dax_driver_register(driver) \
+	__dax_driver_register(driver, THIS_MODULE, KBUILD_MODNAME)
+void dax_driver_unregister(struct dax_device_driver *dax_drv);
+void kill_dev_dax(struct dev_dax *dev_dax);
+
+#if IS_ENABLED(CONFIG_DEV_DAX_PMEM_COMPAT)
+int dev_dax_probe(struct device *dev);
+#endif
+
+/*
+ * While run_dax() is potentially a generic operation that could be
+ * defined in include/linux/dax.h we don't want to grow any users
+ * outside of drivers/dax/
+ */
+void run_dax(struct dax_device *dax_dev);
+
+#define MODULE_ALIAS_DAX_DEVICE(type) \
+	MODULE_ALIAS("dax:t" __stringify(type) "*")
+#define DAX_DEVICE_MODALIAS_FMT "dax:t%d"
+
+#endif /* __DAX_BUS_H__ */
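
The two modalias macros at the end of bus.h are meant to pair up: a device advertises "dax:t<type>" and a module declares the alias pattern "dax:t<type>*". The following userspace sketch shows that pairing, assuming module aliases are matched as shell-style globs (POSIX fnmatch() is used here as a stand-in for whatever the module-loading tools do). DAX_ALIAS_PATTERN and dax_alias_matches() are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stdio.h>

/* Same format string as in bus.h. */
#define DAX_DEVICE_MODALIAS_FMT "dax:t%d"

/* Userspace stand-ins for __stringify() and MODULE_ALIAS_DAX_DEVICE(). */
#define STRINGIFY_1(x) #x
#define STRINGIFY(x) STRINGIFY_1(x)
#define DAX_ALIAS_PATTERN(type) "dax:t" STRINGIFY(type) "*"

/*
 * Format the modalias a device of the given type would report and test it
 * against a module alias pattern using glob matching.
 */
bool dax_alias_matches(const char *pattern, int type)
{
	char modalias[32];

	assert(pattern != NULL);
	snprintf(modalias, sizeof(modalias), DAX_DEVICE_MODALIAS_FMT, type);
	return fnmatch(pattern, modalias, 0) == 0;
}
```

Since the bus code in this merge always reports type 0, only a module declaring the type-0 alias would be selected for autoloading under this assumption.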
+24 -10
drivers/dax/dax-private.h
···
 #include <linux/device.h>
 #include <linux/cdev.h>
 
+/* private routines between core files */
+struct dax_device;
+struct dax_device *inode_dax(struct inode *inode);
+struct inode *dax_inode(struct dax_device *dax_dev);
+int dax_bus_init(void);
+void dax_bus_exit(void);
+
 /**
  * struct dax_region - mapping infrastructure for dax devices
  * @id: kernel-wide unique region for a memory range
- * @base: linear address corresponding to @res
+ * @target_node: effective numa node if this memory range is onlined
  * @kref: to pin while other agents have a need to do lookups
  * @dev: parent device backing this region
  * @align: allocation and mapping alignment for child dax devices
···
  */
 struct dax_region {
 	int id;
-	struct ida ida;
-	void *base;
+	int target_node;
 	struct kref kref;
 	struct device *dev;
 	unsigned int align;
···
 };
 
 /**
- * struct dev_dax - instance data for a subdivision of a dax region
+ * struct dev_dax - instance data for a subdivision of a dax region, and
+ * data while the device is activated in the driver.
  * @region - parent region
  * @dax_dev - core dax functionality
+ * @target_node: effective numa node if dev_dax memory range is onlined
  * @dev - device core
- * @id - child id in the region
- * @num_resources - number of physical address extents in this device
- * @res - array of physical address ranges
+ * @pgmap - pgmap for memmap setup / lifetime (driver owned)
+ * @ref: pgmap reference count (driver owned)
+ * @cmp: @ref final put completion (driver owned)
  */
 struct dev_dax {
 	struct dax_region *region;
 	struct dax_device *dax_dev;
+	int target_node;
 	struct device dev;
-	int id;
-	int num_resources;
-	struct resource res[0];
+	struct dev_pagemap pgmap;
+	struct percpu_ref ref;
+	struct completion cmp;
 };
+
+static inline struct dev_dax *to_dev_dax(struct device *dev)
+{
+	return container_of(dev, struct dev_dax, dev);
+}
 #endif
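
The to_dev_dax() helper added above relies on the container_of() idiom: given a pointer to the embedded 'struct device', step back by the member's offset to recover the enclosing 'struct dev_dax'. A minimal userspace sketch follows. The struct layouts are cut-down stand-ins rather than the kernel definitions, and this container_of() omits the kernel macro's type checking.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of(): subtract the member offset from the pointer. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Cut-down stand-ins for the kernel structures. */
struct device {
	const char *name;
};

struct dev_dax {
	int target_node;
	struct device dev;	/* embedded, not a pointer */
};

/* Recover the enclosing dev_dax from a pointer to its embedded device. */
struct dev_dax *to_dev_dax(struct device *dev)
{
	assert(dev != NULL);
	return container_of(dev, struct dev_dax, dev);
}
```

This is why the device core can hand sysfs callbacks a bare 'struct device *' and the attribute handlers in bus.c can still reach fields such as the region and target node.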
-18
drivers/dax/dax.h
···
-/*
- * Copyright(c) 2016 - 2017 Intel Corporation. All rights reserved.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of version 2 of the GNU General Public License as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- */
-#ifndef __DAX_H__
-#define __DAX_H__
-struct dax_device;
-struct dax_device *inode_dax(struct inode *inode);
-struct inode *dax_inode(struct dax_device *dax_dev);
-#endif /* __DAX_H__ */
-25
drivers/dax/device-dax.h
···
-/*
- * Copyright(c) 2016 Intel Corporation. All rights reserved.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of version 2 of the GNU General Public License as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- */
-#ifndef __DEVICE_DAX_H__
-#define __DEVICE_DAX_H__
-struct device;
-struct dev_dax;
-struct resource;
-struct dax_region;
-void dax_region_put(struct dax_region *dax_region);
-struct dax_region *alloc_dax_region(struct device *parent,
-		int region_id, struct resource *res, unsigned int align,
-		void *addr, unsigned long flags);
-struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region,
-		int id, struct resource *res, int count);
-#endif /* __DEVICE_DAX_H__ */
+87 -280
drivers/dax/device.c
···
-/*
- * Copyright(c) 2016 - 2017 Intel Corporation. All rights reserved.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of version 2 of the GNU General Public License as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- */
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2016-2018 Intel Corporation. All rights reserved. */
+#include <linux/memremap.h>
 #include <linux/pagemap.h>
 #include <linux/module.h>
 #include <linux/device.h>
···
 #include <linux/mm.h>
 #include <linux/mman.h>
 #include "dax-private.h"
-#include "dax.h"
+#include "bus.h"
 
-static struct class *dax_class;
-
-/*
- * Rely on the fact that drvdata is set before the attributes are
- * registered, and that the attributes are unregistered before drvdata
- * is cleared to assume that drvdata is always valid.
- */
-static ssize_t id_show(struct device *dev,
-		struct device_attribute *attr, char *buf)
+static struct dev_dax *ref_to_dev_dax(struct percpu_ref *ref)
 {
-	struct dax_region *dax_region = dev_get_drvdata(dev);
-
-	return sprintf(buf, "%d\n", dax_region->id);
-}
-static DEVICE_ATTR_RO(id);
-
-static ssize_t region_size_show(struct device *dev,
-		struct device_attribute *attr, char *buf)
-{
-	struct dax_region *dax_region = dev_get_drvdata(dev);
-
-	return sprintf(buf, "%llu\n", (unsigned long long)
-			resource_size(&dax_region->res));
-}
-static struct device_attribute dev_attr_region_size = __ATTR(size, 0444,
-		region_size_show, NULL);
-
-static ssize_t align_show(struct device *dev,
-		struct device_attribute *attr, char *buf)
-{
-	struct dax_region *dax_region = dev_get_drvdata(dev);
-
-	return sprintf(buf, "%u\n", dax_region->align);
-}
-static DEVICE_ATTR_RO(align);
-
-static struct attribute *dax_region_attributes[] = {
-	&dev_attr_region_size.attr,
-	&dev_attr_align.attr,
-	&dev_attr_id.attr,
-	NULL,
-};
-
-static const struct attribute_group dax_region_attribute_group = {
-	.name = "dax_region",
-	.attrs = dax_region_attributes,
-};
-
-static const struct attribute_group *dax_region_attribute_groups[] = {
-	&dax_region_attribute_group,
-	NULL,
-};
-
-static void dax_region_free(struct kref *kref)
-{
-	struct dax_region *dax_region;
-
-	dax_region = container_of(kref, struct dax_region, kref);
-	kfree(dax_region);
+	return container_of(ref, struct dev_dax, ref);
 }
 
-void dax_region_put(struct dax_region *dax_region)
+static void dev_dax_percpu_release(struct percpu_ref *ref)
 {
-	kref_put(&dax_region->kref, dax_region_free);
-}
-EXPORT_SYMBOL_GPL(dax_region_put);
+	struct dev_dax *dev_dax = ref_to_dev_dax(ref);
 
-static void dax_region_unregister(void *region)
-{
-	struct dax_region *dax_region = region;
-
-	sysfs_remove_groups(&dax_region->dev->kobj,
-			dax_region_attribute_groups);
-	dax_region_put(dax_region);
+	dev_dbg(&dev_dax->dev, "%s\n", __func__);
+	complete(&dev_dax->cmp);
 }
 
-struct dax_region *alloc_dax_region(struct device *parent, int region_id,
-		struct resource *res, unsigned int align, void *addr,
-		unsigned long pfn_flags)
+static void dev_dax_percpu_exit(void *data)
 {
-	struct dax_region *dax_region;
+	struct percpu_ref *ref = data;
+	struct dev_dax *dev_dax = ref_to_dev_dax(ref);
 
-	/*
-	 * The DAX core assumes that it can store its private data in
-	 * parent->driver_data. This WARN is a reminder / safeguard for
-	 * developers of device-dax drivers.
-	 */
-	if (dev_get_drvdata(parent)) {
-		dev_WARN(parent, "dax core failed to setup private data\n");
-		return NULL;
-	}
-
-	if (!IS_ALIGNED(res->start, align)
-			|| !IS_ALIGNED(resource_size(res), align))
-		return NULL;
-
-	dax_region = kzalloc(sizeof(*dax_region), GFP_KERNEL);
-	if (!dax_region)
-		return NULL;
-
-	dev_set_drvdata(parent, dax_region);
-	memcpy(&dax_region->res, res, sizeof(*res));
-	dax_region->pfn_flags = pfn_flags;
-	kref_init(&dax_region->kref);
-	dax_region->id = region_id;
-	ida_init(&dax_region->ida);
-	dax_region->align = align;
-	dax_region->dev = parent;
-	dax_region->base = addr;
-	if (sysfs_create_groups(&parent->kobj, dax_region_attribute_groups)) {
-		kfree(dax_region);
-		return NULL;
-	}
-
-	kref_get(&dax_region->kref);
-	if (devm_add_action_or_reset(parent, dax_region_unregister, dax_region))
-		return NULL;
-	return dax_region;
-}
-EXPORT_SYMBOL_GPL(alloc_dax_region);
-
-static struct dev_dax *to_dev_dax(struct device *dev)
-{
-	return container_of(dev, struct dev_dax, dev);
+	dev_dbg(&dev_dax->dev, "%s\n", __func__);
+	wait_for_completion(&dev_dax->cmp);
+	percpu_ref_exit(ref);
 }
 
-static ssize_t size_show(struct device *dev,
-		struct device_attribute *attr, char *buf)
+static void dev_dax_percpu_kill(struct percpu_ref *data)
 {
-	struct dev_dax *dev_dax = to_dev_dax(dev);
-	unsigned long long size = 0;
-	int i;
+	struct percpu_ref *ref = data;
+	struct dev_dax *dev_dax = ref_to_dev_dax(ref);
 
-	for (i = 0; i < dev_dax->num_resources; i++)
-		size += resource_size(&dev_dax->res[i]);
-
-	return sprintf(buf, "%llu\n", size);
+	dev_dbg(&dev_dax->dev, "%s\n", __func__);
+	percpu_ref_kill(ref);
 }
-static DEVICE_ATTR_RO(size);
-
-static struct attribute *dev_dax_attributes[] = {
-	&dev_attr_size.attr,
-	NULL,
-};
-
-static const struct attribute_group dev_dax_attribute_group = {
-	.attrs = dev_dax_attributes,
-};
-
-static const struct attribute_group *dax_attribute_groups[] = {
-	&dev_dax_attribute_group,
-	NULL,
-};
 
 static int check_vma(struct dev_dax *dev_dax, struct vm_area_struct *vma,
 		const char *func)
···
 __weak phys_addr_t dax_pgoff_to_phys(struct dev_dax *dev_dax, pgoff_t pgoff,
 		unsigned long size)
 {
-	struct resource *res;
-	/* gcc-4.6.3-nolibc for i386 complains that this is uninitialized */
-	phys_addr_t uninitialized_var(phys);
-	int i;
+	struct resource *res = &dev_dax->region->res;
+	phys_addr_t phys;
 
-	for (i = 0; i < dev_dax->num_resources; i++) {
-		res = &dev_dax->res[i];
-		phys = pgoff * PAGE_SIZE + res->start;
-		if (phys >= res->start && phys <= res->end)
-			break;
-		pgoff -= PHYS_PFN(resource_size(res));
-	}
-
-	if (i < dev_dax->num_resources) {
-		res = &dev_dax->res[i];
+	phys = pgoff * PAGE_SIZE + res->start;
+	if (phys >=
res->start && phys <= res->end) { 113 234 if (phys + size - 1 <= res->end) 114 235 return phys; 115 236 } ··· 435 576 .mmap_supported_flags = MAP_SYNC, 436 577 }; 437 578 438 - static void dev_dax_release(struct device *dev) 579 + static void dev_dax_cdev_del(void *cdev) 439 580 { 440 - struct dev_dax *dev_dax = to_dev_dax(dev); 441 - struct dax_region *dax_region = dev_dax->region; 442 - struct dax_device *dax_dev = dev_dax->dax_dev; 443 - 444 - if (dev_dax->id >= 0) 445 - ida_simple_remove(&dax_region->ida, dev_dax->id); 446 - dax_region_put(dax_region); 447 - put_dax(dax_dev); 448 - kfree(dev_dax); 581 + cdev_del(cdev); 449 582 } 450 583 451 - static void kill_dev_dax(struct dev_dax *dev_dax) 584 + static void dev_dax_kill(void *dev_dax) 452 585 { 453 - struct dax_device *dax_dev = dev_dax->dax_dev; 454 - struct inode *inode = dax_inode(dax_dev); 455 - 456 - kill_dax(dax_dev); 457 - unmap_mapping_range(inode->i_mapping, 0, 0, 1); 458 - } 459 - 460 - static void unregister_dev_dax(void *dev) 461 - { 462 - struct dev_dax *dev_dax = to_dev_dax(dev); 463 - struct dax_device *dax_dev = dev_dax->dax_dev; 464 - struct inode *inode = dax_inode(dax_dev); 465 - struct cdev *cdev = inode->i_cdev; 466 - 467 - dev_dbg(dev, "trace\n"); 468 - 469 586 kill_dev_dax(dev_dax); 470 - cdev_device_del(cdev, dev); 471 - put_device(dev); 472 587 } 473 588 474 - struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region, 475 - int id, struct resource *res, int count) 589 + int dev_dax_probe(struct device *dev) 476 590 { 477 - struct device *parent = dax_region->dev; 478 - struct dax_device *dax_dev; 479 - struct dev_dax *dev_dax; 591 + struct dev_dax *dev_dax = to_dev_dax(dev); 592 + struct dax_device *dax_dev = dev_dax->dax_dev; 593 + struct resource *res = &dev_dax->region->res; 480 594 struct inode *inode; 481 - struct device *dev; 482 595 struct cdev *cdev; 483 - int rc, i; 596 + void *addr; 597 + int rc; 484 598 485 - if (!count) 486 - return ERR_PTR(-EINVAL); 487 - 488 - 
dev_dax = kzalloc(struct_size(dev_dax, res, count), GFP_KERNEL); 489 - if (!dev_dax) 490 - return ERR_PTR(-ENOMEM); 491 - 492 - for (i = 0; i < count; i++) { 493 - if (!IS_ALIGNED(res[i].start, dax_region->align) 494 - || !IS_ALIGNED(resource_size(&res[i]), 495 - dax_region->align)) { 496 - rc = -EINVAL; 497 - break; 498 - } 499 - dev_dax->res[i].start = res[i].start; 500 - dev_dax->res[i].end = res[i].end; 599 + /* 1:1 map region resource range to device-dax instance range */ 600 + if (!devm_request_mem_region(dev, res->start, resource_size(res), 601 + dev_name(dev))) { 602 + dev_warn(dev, "could not reserve region %pR\n", res); 603 + return -EBUSY; 501 604 } 502 605 503 - if (i < count) 504 - goto err_id; 606 + init_completion(&dev_dax->cmp); 607 + rc = percpu_ref_init(&dev_dax->ref, dev_dax_percpu_release, 0, 608 + GFP_KERNEL); 609 + if (rc) 610 + return rc; 505 611 506 - if (id < 0) { 507 - id = ida_simple_get(&dax_region->ida, 0, 0, GFP_KERNEL); 508 - dev_dax->id = id; 509 - if (id < 0) { 510 - rc = id; 511 - goto err_id; 512 - } 513 - } else { 514 - /* region provider owns @id lifetime */ 515 - dev_dax->id = -1; 612 + rc = devm_add_action_or_reset(dev, dev_dax_percpu_exit, &dev_dax->ref); 613 + if (rc) 614 + return rc; 615 + 616 + dev_dax->pgmap.ref = &dev_dax->ref; 617 + dev_dax->pgmap.kill = dev_dax_percpu_kill; 618 + addr = devm_memremap_pages(dev, &dev_dax->pgmap); 619 + if (IS_ERR(addr)) { 620 + devm_remove_action(dev, dev_dax_percpu_exit, &dev_dax->ref); 621 + percpu_ref_exit(&dev_dax->ref); 622 + return PTR_ERR(addr); 516 623 } 517 - 518 - /* 519 - * No 'host' or dax_operations since there is no access to this 520 - * device outside of mmap of the resulting character device. 
521 - */ 522 - dax_dev = alloc_dax(dev_dax, NULL, NULL); 523 - if (!dax_dev) { 524 - rc = -ENOMEM; 525 - goto err_dax; 526 - } 527 - 528 - /* from here on we're committed to teardown via dax_dev_release() */ 529 - dev = &dev_dax->dev; 530 - device_initialize(dev); 531 624 532 625 inode = dax_inode(dax_dev); 533 626 cdev = inode->i_cdev; 534 627 cdev_init(cdev, &dax_fops); 535 - cdev->owner = parent->driver->owner; 536 - 537 - dev_dax->num_resources = count; 538 - dev_dax->dax_dev = dax_dev; 539 - dev_dax->region = dax_region; 540 - kref_get(&dax_region->kref); 541 - 542 - dev->devt = inode->i_rdev; 543 - dev->class = dax_class; 544 - dev->parent = parent; 545 - dev->groups = dax_attribute_groups; 546 - dev->release = dev_dax_release; 547 - dev_set_name(dev, "dax%d.%d", dax_region->id, id); 548 - 549 - rc = cdev_device_add(cdev, dev); 550 - if (rc) { 551 - kill_dev_dax(dev_dax); 552 - put_device(dev); 553 - return ERR_PTR(rc); 554 - } 555 - 556 - rc = devm_add_action_or_reset(dax_region->dev, unregister_dev_dax, dev); 628 + if (dev->class) { 629 + /* for the CONFIG_DEV_DAX_PMEM_COMPAT case */ 630 + cdev->owner = dev->parent->driver->owner; 631 + } else 632 + cdev->owner = dev->driver->owner; 633 + cdev_set_parent(cdev, &dev->kobj); 634 + rc = cdev_add(cdev, dev->devt, 1); 557 635 if (rc) 558 - return ERR_PTR(rc); 636 + return rc; 559 637 560 - return dev_dax; 638 + rc = devm_add_action_or_reset(dev, dev_dax_cdev_del, cdev); 639 + if (rc) 640 + return rc; 561 641 562 - err_dax: 563 - if (dev_dax->id >= 0) 564 - ida_simple_remove(&dax_region->ida, dev_dax->id); 565 - err_id: 566 - kfree(dev_dax); 567 - 568 - return ERR_PTR(rc); 642 + run_dax(dax_dev); 643 + return devm_add_action_or_reset(dev, dev_dax_kill, dev_dax); 569 644 } 570 - EXPORT_SYMBOL_GPL(devm_create_dev_dax); 645 + EXPORT_SYMBOL_GPL(dev_dax_probe); 646 + 647 + static int dev_dax_remove(struct device *dev) 648 + { 649 + /* all probe actions are unwound by devm */ 650 + return 0; 651 + } 652 + 653 + static 
struct dax_device_driver device_dax_driver = { 654 + .drv = { 655 + .probe = dev_dax_probe, 656 + .remove = dev_dax_remove, 657 + }, 658 + .match_always = 1, 659 + }; 571 660 572 661 static int __init dax_init(void) 573 662 { 574 - dax_class = class_create(THIS_MODULE, "dax"); 575 - return PTR_ERR_OR_ZERO(dax_class); 663 + return dax_driver_register(&device_dax_driver); 576 664 } 577 665 578 666 static void __exit dax_exit(void) 579 667 { 580 - class_destroy(dax_class); 668 + dax_driver_unregister(&device_dax_driver); 581 669 } 582 670 583 671 MODULE_AUTHOR("Intel Corporation"); 584 672 MODULE_LICENSE("GPL v2"); 585 - subsys_initcall(dax_init); 673 + module_init(dax_init); 586 674 module_exit(dax_exit); 675 + MODULE_ALIAS_DAX_DEVICE(0);
+108
drivers/dax/kmem.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright(c) 2016-2019 Intel Corporation. All rights reserved. */ 3 + #include <linux/memremap.h> 4 + #include <linux/pagemap.h> 5 + #include <linux/memory.h> 6 + #include <linux/module.h> 7 + #include <linux/device.h> 8 + #include <linux/pfn_t.h> 9 + #include <linux/slab.h> 10 + #include <linux/dax.h> 11 + #include <linux/fs.h> 12 + #include <linux/mm.h> 13 + #include <linux/mman.h> 14 + #include "dax-private.h" 15 + #include "bus.h" 16 + 17 + int dev_dax_kmem_probe(struct device *dev) 18 + { 19 + struct dev_dax *dev_dax = to_dev_dax(dev); 20 + struct resource *res = &dev_dax->region->res; 21 + resource_size_t kmem_start; 22 + resource_size_t kmem_size; 23 + resource_size_t kmem_end; 24 + struct resource *new_res; 25 + int numa_node; 26 + int rc; 27 + 28 + /* 29 + * Ensure good NUMA information for the persistent memory. 30 + * Without this check, there is a risk that slow memory 31 + * could be mixed in a node with faster memory, causing 32 + * unavoidable performance issues. 33 + */ 34 + numa_node = dev_dax->target_node; 35 + if (numa_node < 0) { 36 + dev_warn(dev, "rejecting DAX region %pR with invalid node: %d\n", 37 + res, numa_node); 38 + return -EINVAL; 39 + } 40 + 41 + /* Hotplug starting at the beginning of the next block: */ 42 + kmem_start = ALIGN(res->start, memory_block_size_bytes()); 43 + 44 + kmem_size = resource_size(res); 45 + /* Adjust the size down to compensate for moving up kmem_start: */ 46 + kmem_size -= kmem_start - res->start; 47 + /* Align the size down to cover only complete blocks: */ 48 + kmem_size &= ~(memory_block_size_bytes() - 1); 49 + kmem_end = kmem_start + kmem_size; 50 + 51 + /* Region is permanently reserved. Hot-remove not yet implemented. 
*/ 52 + new_res = request_mem_region(kmem_start, kmem_size, dev_name(dev)); 53 + if (!new_res) { 54 + dev_warn(dev, "could not reserve region [%pa-%pa]\n", 55 + &kmem_start, &kmem_end); 56 + return -EBUSY; 57 + } 58 + 59 + /* 60 + * Set flags appropriate for System RAM. Leave ..._BUSY clear 61 + * so that add_memory() can add a child resource. Do not 62 + * inherit flags from the parent since it may set new flags 63 + * unknown to us that will break add_memory() below. 64 + */ 65 + new_res->flags = IORESOURCE_SYSTEM_RAM; 66 + new_res->name = dev_name(dev); 67 + 68 + rc = add_memory(numa_node, new_res->start, resource_size(new_res)); 69 + if (rc) 70 + return rc; 71 + 72 + return 0; 73 + } 74 + 75 + static int dev_dax_kmem_remove(struct device *dev) 76 + { 77 + /* 78 + * Purposely leak the request_mem_region() for the device-dax 79 + * range and return '0' to ->remove() attempts. The removal of 80 + * the device from the driver always succeeds, but the region 81 + * is permanently pinned as reserved by the unreleased 82 + * request_mem_region(). 83 + */ 84 + return 0; 85 + } 86 + 87 + static struct dax_device_driver device_dax_kmem_driver = { 88 + .drv = { 89 + .probe = dev_dax_kmem_probe, 90 + .remove = dev_dax_kmem_remove, 91 + }, 92 + }; 93 + 94 + static int __init dax_kmem_init(void) 95 + { 96 + return dax_driver_register(&device_dax_kmem_driver); 97 + } 98 + 99 + static void __exit dax_kmem_exit(void) 100 + { 101 + dax_driver_unregister(&device_dax_kmem_driver); 102 + } 103 + 104 + MODULE_AUTHOR("Intel Corporation"); 105 + MODULE_LICENSE("GPL v2"); 106 + module_init(dax_kmem_init); 107 + module_exit(dax_kmem_exit); 108 + MODULE_ALIAS_DAX_DEVICE(0);
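The start/size arithmetic in dev_dax_kmem_probe() above can be tried outside the kernel. The following is a minimal userspace sketch, not the driver's code: `struct range` and `trim_to_blocks()` are hypothetical names, the block size is passed in as a parameter instead of being read from memory_block_size_bytes(), and the guard for a range smaller than the alignment gap is an addition that the probe routine as shown does not have.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the start/size pair of a struct resource. */
struct range {
	uint64_t start;
	uint64_t size;
};

/*
 * Trim [start, start + size) so that it covers only whole memory blocks.
 * 'block' must be a power of two. Overflow of start + block is not handled.
 */
static struct range trim_to_blocks(uint64_t start, uint64_t size, uint64_t block)
{
	struct range r = { 0, 0 };
	uint64_t lost;

	assert(block != 0 && (block & (block - 1)) == 0);

	/* Round the start up to the next block boundary, like ALIGN(). */
	r.start = (start + block - 1) & ~(block - 1);

	/* Bytes given up at the front by moving the start up. */
	lost = r.start - start;
	if (lost >= size)
		return r;	/* nothing left: size stays 0 (guard added here) */

	/* Round the remaining size down to whole blocks. */
	r.size = (size - lost) & ~(block - 1);
	return r;
}
```

With a 128 MiB block, a 1 GiB range starting 64 MiB past a block boundary loses 64 MiB at the front and another 64 MiB at the tail, leaving seven whole blocks.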
-153
drivers/dax/pmem.c
··· 1 - /* 2 - * Copyright(c) 2016 Intel Corporation. All rights reserved. 3 - * 4 - * This program is free software; you can redistribute it and/or modify 5 - * it under the terms of version 2 of the GNU General Public License as 6 - * published by the Free Software Foundation. 7 - * 8 - * This program is distributed in the hope that it will be useful, but 9 - * WITHOUT ANY WARRANTY; without even the implied warranty of 10 - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU 11 - * General Public License for more details. 12 - */ 13 - #include <linux/percpu-refcount.h> 14 - #include <linux/memremap.h> 15 - #include <linux/module.h> 16 - #include <linux/pfn_t.h> 17 - #include "../nvdimm/pfn.h" 18 - #include "../nvdimm/nd.h" 19 - #include "device-dax.h" 20 - 21 - struct dax_pmem { 22 - struct device *dev; 23 - struct percpu_ref ref; 24 - struct dev_pagemap pgmap; 25 - struct completion cmp; 26 - }; 27 - 28 - static struct dax_pmem *to_dax_pmem(struct percpu_ref *ref) 29 - { 30 - return container_of(ref, struct dax_pmem, ref); 31 - } 32 - 33 - static void dax_pmem_percpu_release(struct percpu_ref *ref) 34 - { 35 - struct dax_pmem *dax_pmem = to_dax_pmem(ref); 36 - 37 - dev_dbg(dax_pmem->dev, "trace\n"); 38 - complete(&dax_pmem->cmp); 39 - } 40 - 41 - static void dax_pmem_percpu_exit(void *data) 42 - { 43 - struct percpu_ref *ref = data; 44 - struct dax_pmem *dax_pmem = to_dax_pmem(ref); 45 - 46 - dev_dbg(dax_pmem->dev, "trace\n"); 47 - wait_for_completion(&dax_pmem->cmp); 48 - percpu_ref_exit(ref); 49 - } 50 - 51 - static void dax_pmem_percpu_kill(struct percpu_ref *ref) 52 - { 53 - struct dax_pmem *dax_pmem = to_dax_pmem(ref); 54 - 55 - dev_dbg(dax_pmem->dev, "trace\n"); 56 - percpu_ref_kill(ref); 57 - } 58 - 59 - static int dax_pmem_probe(struct device *dev) 60 - { 61 - void *addr; 62 - struct resource res; 63 - int rc, id, region_id; 64 - struct nd_pfn_sb *pfn_sb; 65 - struct dev_dax *dev_dax; 66 - struct dax_pmem *dax_pmem; 67 - struct 
nd_namespace_io *nsio; 68 - struct dax_region *dax_region; 69 - struct nd_namespace_common *ndns; 70 - struct nd_dax *nd_dax = to_nd_dax(dev); 71 - struct nd_pfn *nd_pfn = &nd_dax->nd_pfn; 72 - 73 - ndns = nvdimm_namespace_common_probe(dev); 74 - if (IS_ERR(ndns)) 75 - return PTR_ERR(ndns); 76 - nsio = to_nd_namespace_io(&ndns->dev); 77 - 78 - dax_pmem = devm_kzalloc(dev, sizeof(*dax_pmem), GFP_KERNEL); 79 - if (!dax_pmem) 80 - return -ENOMEM; 81 - 82 - /* parse the 'pfn' info block via ->rw_bytes */ 83 - rc = devm_nsio_enable(dev, nsio); 84 - if (rc) 85 - return rc; 86 - rc = nvdimm_setup_pfn(nd_pfn, &dax_pmem->pgmap); 87 - if (rc) 88 - return rc; 89 - devm_nsio_disable(dev, nsio); 90 - 91 - pfn_sb = nd_pfn->pfn_sb; 92 - 93 - if (!devm_request_mem_region(dev, nsio->res.start, 94 - resource_size(&nsio->res), 95 - dev_name(&ndns->dev))) { 96 - dev_warn(dev, "could not reserve region %pR\n", &nsio->res); 97 - return -EBUSY; 98 - } 99 - 100 - dax_pmem->dev = dev; 101 - init_completion(&dax_pmem->cmp); 102 - rc = percpu_ref_init(&dax_pmem->ref, dax_pmem_percpu_release, 0, 103 - GFP_KERNEL); 104 - if (rc) 105 - return rc; 106 - 107 - rc = devm_add_action(dev, dax_pmem_percpu_exit, &dax_pmem->ref); 108 - if (rc) { 109 - percpu_ref_exit(&dax_pmem->ref); 110 - return rc; 111 - } 112 - 113 - dax_pmem->pgmap.ref = &dax_pmem->ref; 114 - dax_pmem->pgmap.kill = dax_pmem_percpu_kill; 115 - addr = devm_memremap_pages(dev, &dax_pmem->pgmap); 116 - if (IS_ERR(addr)) 117 - return PTR_ERR(addr); 118 - 119 - /* adjust the dax_region resource to the start of data */ 120 - memcpy(&res, &dax_pmem->pgmap.res, sizeof(res)); 121 - res.start += le64_to_cpu(pfn_sb->dataoff); 122 - 123 - rc = sscanf(dev_name(&ndns->dev), "namespace%d.%d", &region_id, &id); 124 - if (rc != 2) 125 - return -EINVAL; 126 - 127 - dax_region = alloc_dax_region(dev, region_id, &res, 128 - le32_to_cpu(pfn_sb->align), addr, PFN_DEV|PFN_MAP); 129 - if (!dax_region) 130 - return -ENOMEM; 131 - 132 - /* TODO: support for 
subdividing a dax region... */ 133 - dev_dax = devm_create_dev_dax(dax_region, id, &res, 1); 134 - 135 - /* child dev_dax instances now own the lifetime of the dax_region */ 136 - dax_region_put(dax_region); 137 - 138 - return PTR_ERR_OR_ZERO(dev_dax); 139 - } 140 - 141 - static struct nd_device_driver dax_pmem_driver = { 142 - .probe = dax_pmem_probe, 143 - .drv = { 144 - .name = "dax_pmem", 145 - }, 146 - .type = ND_DRIVER_DAX_PMEM, 147 - }; 148 - 149 - module_nd_driver(dax_pmem_driver); 150 - 151 - MODULE_LICENSE("GPL v2"); 152 - MODULE_AUTHOR("Intel Corporation"); 153 - MODULE_ALIAS_ND_DEVICE(ND_DEVICE_DAX_PMEM);
+7
drivers/dax/pmem/Makefile
···
1 + obj-$(CONFIG_DEV_DAX_PMEM) += dax_pmem.o
2 + obj-$(CONFIG_DEV_DAX_PMEM) += dax_pmem_core.o
3 + obj-$(CONFIG_DEV_DAX_PMEM_COMPAT) += dax_pmem_compat.o
4 +
5 + dax_pmem-y := pmem.o
6 + dax_pmem_core-y := core.o
7 + dax_pmem_compat-y := compat.o
+73
drivers/dax/pmem/compat.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright(c) 2016 - 2018 Intel Corporation. All rights reserved. */ 3 + #include <linux/percpu-refcount.h> 4 + #include <linux/memremap.h> 5 + #include <linux/module.h> 6 + #include <linux/pfn_t.h> 7 + #include <linux/nd.h> 8 + #include "../bus.h" 9 + 10 + /* we need the private definitions to implement compat suport */ 11 + #include "../dax-private.h" 12 + 13 + static int dax_pmem_compat_probe(struct device *dev) 14 + { 15 + struct dev_dax *dev_dax = __dax_pmem_probe(dev, DEV_DAX_CLASS); 16 + int rc; 17 + 18 + if (IS_ERR(dev_dax)) 19 + return PTR_ERR(dev_dax); 20 + 21 + if (!devres_open_group(&dev_dax->dev, dev_dax, GFP_KERNEL)) 22 + return -ENOMEM; 23 + 24 + device_lock(&dev_dax->dev); 25 + rc = dev_dax_probe(&dev_dax->dev); 26 + device_unlock(&dev_dax->dev); 27 + 28 + devres_close_group(&dev_dax->dev, dev_dax); 29 + if (rc) 30 + devres_release_group(&dev_dax->dev, dev_dax); 31 + 32 + return rc; 33 + } 34 + 35 + static int dax_pmem_compat_release(struct device *dev, void *data) 36 + { 37 + device_lock(dev); 38 + devres_release_group(dev, to_dev_dax(dev)); 39 + device_unlock(dev); 40 + 41 + return 0; 42 + } 43 + 44 + static int dax_pmem_compat_remove(struct device *dev) 45 + { 46 + device_for_each_child(dev, NULL, dax_pmem_compat_release); 47 + return 0; 48 + } 49 + 50 + static struct nd_device_driver dax_pmem_compat_driver = { 51 + .probe = dax_pmem_compat_probe, 52 + .remove = dax_pmem_compat_remove, 53 + .drv = { 54 + .name = "dax_pmem_compat", 55 + }, 56 + .type = ND_DRIVER_DAX_PMEM, 57 + }; 58 + 59 + static int __init dax_pmem_compat_init(void) 60 + { 61 + return nd_driver_register(&dax_pmem_compat_driver); 62 + } 63 + module_init(dax_pmem_compat_init); 64 + 65 + static void __exit dax_pmem_compat_exit(void) 66 + { 67 + driver_unregister(&dax_pmem_compat_driver.drv); 68 + } 69 + module_exit(dax_pmem_compat_exit); 70 + 71 + MODULE_LICENSE("GPL v2"); 72 + MODULE_AUTHOR("Intel Corporation"); 73 + 
MODULE_ALIAS_ND_DEVICE(ND_DEVICE_DAX_PMEM);
+71
drivers/dax/pmem/core.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright(c) 2016 - 2018 Intel Corporation. All rights reserved. */ 3 + #include <linux/memremap.h> 4 + #include <linux/module.h> 5 + #include <linux/pfn_t.h> 6 + #include "../../nvdimm/pfn.h" 7 + #include "../../nvdimm/nd.h" 8 + #include "../bus.h" 9 + 10 + struct dev_dax *__dax_pmem_probe(struct device *dev, enum dev_dax_subsys subsys) 11 + { 12 + struct resource res; 13 + int rc, id, region_id; 14 + resource_size_t offset; 15 + struct nd_pfn_sb *pfn_sb; 16 + struct dev_dax *dev_dax; 17 + struct nd_namespace_io *nsio; 18 + struct dax_region *dax_region; 19 + struct dev_pagemap pgmap = { 0 }; 20 + struct nd_namespace_common *ndns; 21 + struct nd_dax *nd_dax = to_nd_dax(dev); 22 + struct nd_pfn *nd_pfn = &nd_dax->nd_pfn; 23 + struct nd_region *nd_region = to_nd_region(dev->parent); 24 + 25 + ndns = nvdimm_namespace_common_probe(dev); 26 + if (IS_ERR(ndns)) 27 + return ERR_CAST(ndns); 28 + nsio = to_nd_namespace_io(&ndns->dev); 29 + 30 + /* parse the 'pfn' info block via ->rw_bytes */ 31 + rc = devm_nsio_enable(dev, nsio); 32 + if (rc) 33 + return ERR_PTR(rc); 34 + rc = nvdimm_setup_pfn(nd_pfn, &pgmap); 35 + if (rc) 36 + return ERR_PTR(rc); 37 + devm_nsio_disable(dev, nsio); 38 + 39 + /* reserve the metadata area, device-dax will reserve the data */ 40 + pfn_sb = nd_pfn->pfn_sb; 41 + offset = le64_to_cpu(pfn_sb->dataoff); 42 + if (!devm_request_mem_region(dev, nsio->res.start, offset, 43 + dev_name(&ndns->dev))) { 44 + dev_warn(dev, "could not reserve metadata\n"); 45 + return ERR_PTR(-EBUSY); 46 + } 47 + 48 + rc = sscanf(dev_name(&ndns->dev), "namespace%d.%d", &region_id, &id); 49 + if (rc != 2) 50 + return ERR_PTR(-EINVAL); 51 + 52 + /* adjust the dax_region resource to the start of data */ 53 + memcpy(&res, &pgmap.res, sizeof(res)); 54 + res.start += offset; 55 + dax_region = alloc_dax_region(dev, region_id, &res, 56 + nd_region->target_node, le32_to_cpu(pfn_sb->align), 57 + PFN_DEV|PFN_MAP); 58 + if 
(!dax_region) 59 + return ERR_PTR(-ENOMEM); 60 + 61 + dev_dax = __devm_create_dev_dax(dax_region, id, &pgmap, subsys); 62 + 63 + /* child dev_dax instances now own the lifetime of the dax_region */ 64 + dax_region_put(dax_region); 65 + 66 + return dev_dax; 67 + } 68 + EXPORT_SYMBOL_GPL(__dax_pmem_probe); 69 + 70 + MODULE_LICENSE("GPL v2"); 71 + MODULE_AUTHOR("Intel Corporation");
+40
drivers/dax/pmem/pmem.c
···
1 + // SPDX-License-Identifier: GPL-2.0
2 + /* Copyright(c) 2016 - 2018 Intel Corporation. All rights reserved. */
3 + #include <linux/percpu-refcount.h>
4 + #include <linux/memremap.h>
5 + #include <linux/module.h>
6 + #include <linux/pfn_t.h>
7 + #include <linux/nd.h>
8 + #include "../bus.h"
9 +
10 + static int dax_pmem_probe(struct device *dev)
11 + {
12 + return PTR_ERR_OR_ZERO(__dax_pmem_probe(dev, DEV_DAX_BUS));
13 + }
14 +
15 + static struct nd_device_driver dax_pmem_driver = {
16 + .probe = dax_pmem_probe,
17 + .drv = {
18 + .name = "dax_pmem",
19 + },
20 + .type = ND_DRIVER_DAX_PMEM,
21 + };
22 +
23 + static int __init dax_pmem_init(void)
24 + {
25 + return nd_driver_register(&dax_pmem_driver);
26 + }
27 + module_init(dax_pmem_init);
28 +
29 + static void __exit dax_pmem_exit(void)
30 + {
31 + driver_unregister(&dax_pmem_driver.drv);
32 + }
33 + module_exit(dax_pmem_exit);
34 +
35 + MODULE_LICENSE("GPL v2");
36 + MODULE_AUTHOR("Intel Corporation");
37 + #if !IS_ENABLED(CONFIG_DEV_DAX_PMEM_COMPAT)
38 + /* For compat builds, don't load this module by default */
39 + MODULE_ALIAS_ND_DEVICE(ND_DEVICE_DAX_PMEM);
40 + #endif
+29 -12
drivers/dax/super.c
··· 22 22 #include <linux/uio.h> 23 23 #include <linux/dax.h> 24 24 #include <linux/fs.h> 25 + #include "dax-private.h" 25 26 26 27 static dev_t dax_devt; 27 28 DEFINE_STATIC_SRCU(dax_srcu); ··· 384 383 spin_lock(&dax_host_lock); 385 384 hlist_del_init(&dax_dev->list); 386 385 spin_unlock(&dax_host_lock); 387 - 388 - dax_dev->private = NULL; 389 386 } 390 387 EXPORT_SYMBOL_GPL(kill_dax); 388 + 389 + void run_dax(struct dax_device *dax_dev) 390 + { 391 + set_bit(DAXDEV_ALIVE, &dax_dev->flags); 392 + } 393 + EXPORT_SYMBOL_GPL(run_dax); 391 394 392 395 static struct inode *dax_alloc_inode(struct super_block *sb) 393 396 { ··· 607 602 608 603 void *dax_get_private(struct dax_device *dax_dev) 609 604 { 605 + if (!test_bit(DAXDEV_ALIVE, &dax_dev->flags)) 606 + return NULL; 610 607 return dax_dev->private; 611 608 } 612 609 EXPORT_SYMBOL_GPL(dax_get_private); ··· 622 615 inode_init_once(inode); 623 616 } 624 617 625 - static int __dax_fs_init(void) 618 + static int dax_fs_init(void) 626 619 { 627 620 int rc; 628 621 ··· 654 647 return rc; 655 648 } 656 649 657 - static void __dax_fs_exit(void) 650 + static void dax_fs_exit(void) 658 651 { 659 652 kern_unmount(dax_mnt); 660 653 unregister_filesystem(&dax_fs_type); 661 654 kmem_cache_destroy(dax_cache); 662 655 } 663 656 664 - static int __init dax_fs_init(void) 657 + static int __init dax_core_init(void) 665 658 { 666 659 int rc; 667 660 668 - rc = __dax_fs_init(); 661 + rc = dax_fs_init(); 669 662 if (rc) 670 663 return rc; 671 664 672 665 rc = alloc_chrdev_region(&dax_devt, 0, MINORMASK+1, "dax"); 673 666 if (rc) 674 - __dax_fs_exit(); 675 - return rc; 667 + goto err_chrdev; 668 + 669 + rc = dax_bus_init(); 670 + if (rc) 671 + goto err_bus; 672 + return 0; 673 + 674 + err_bus: 675 + unregister_chrdev_region(dax_devt, MINORMASK+1); 676 + err_chrdev: 677 + dax_fs_exit(); 678 + return 0; 676 679 } 677 680 678 - static void __exit dax_fs_exit(void) 681 + static void __exit dax_core_exit(void) 679 682 { 680 683 
unregister_chrdev_region(dax_devt, MINORMASK+1); 681 684 ida_destroy(&dax_minor_ida); 682 - __dax_fs_exit(); 685 + dax_fs_exit(); 683 686 } 684 687 685 688 MODULE_AUTHOR("Intel Corporation"); 686 689 MODULE_LICENSE("GPL v2"); 687 - subsys_initcall(dax_fs_init); 688 - module_exit(dax_fs_exit); 690 + subsys_initcall(dax_core_init); 691 + module_exit(dax_core_exit);
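The dax_core_init() hunk above unwinds its setup steps with goto labels, and as shown the error labels end in `return 0`. The diff does not say whether that is intended. A minimal userspace sketch of the same unwind order that instead hands the failing step's error code back to the caller might look like the following; every `step_*` helper and the `core_init()`/`core_exit()` names are hypothetical stand-ins, not kernel functions.

```c
#include <assert.h>

/* Hypothetical state flags standing in for the real subsystems. */
static int fs_up;
static int chrdev_up;

static int step_fs_init(void) { fs_up = 1; return 0; }
static void step_fs_exit(void) { fs_up = 0; }

/* 'rc' lets a caller force this step to fail with a chosen error code. */
static int step_chrdev_alloc(int rc)
{
	if (rc)
		return rc;
	chrdev_up = 1;
	return 0;
}
static void step_chrdev_free(void) { chrdev_up = 0; }

static int step_bus_init(int rc) { return rc; }

static int core_init(int chrdev_rc, int bus_rc)
{
	int rc;

	assert(!fs_up && !chrdev_up);

	rc = step_fs_init();
	if (rc)
		return rc;

	rc = step_chrdev_alloc(chrdev_rc);
	if (rc)
		goto err_chrdev;

	rc = step_bus_init(bus_rc);
	if (rc)
		goto err_bus;
	return 0;

err_bus:
	step_chrdev_free();
err_chrdev:
	step_fs_exit();
	return rc;	/* propagate the failure instead of returning 0 */
}

/* Tear down in the reverse order of setup. */
static void core_exit(void)
{
	step_chrdev_free();
	step_fs_exit();
}
```

Each label undoes only the steps that had already succeeded, so a failure at any point leaves both flags cleared and the caller sees the original error code.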
+1
drivers/nvdimm/e820.c
···
47 47 ndr_desc.res = res;
48 48 ndr_desc.attr_groups = e820_pmem_region_attribute_groups;
49 49 ndr_desc.numa_node = e820_range_to_nid(res->start);
50 + ndr_desc.target_node = ndr_desc.numa_node;
50 51 set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
51 52 if (!nvdimm_pmem_region_create(nvdimm_bus, &ndr_desc))
52 53 return -ENXIO;
+1 -1
drivers/nvdimm/nd.h
···
153 153 u16 ndr_mappings;
154 154 u64 ndr_size;
155 155 u64 ndr_start;
156 - int id, num_lanes, ro, numa_node;
156 + int id, num_lanes, ro, numa_node, target_node;
157 157 void *provider_data;
158 158 struct kernfs_node *bb_state;
159 159 struct badblocks bb;
+1
drivers/nvdimm/of_pmem.c
···
68 68 memset(&ndr_desc, 0, sizeof(ndr_desc));
69 69 ndr_desc.attr_groups = region_attr_groups;
70 70 ndr_desc.numa_node = dev_to_node(&pdev->dev);
71 + ndr_desc.target_node = ndr_desc.numa_node;
71 72 ndr_desc.res = &pdev->resource[i];
72 73 ndr_desc.of_node = np;
73 74 set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
+1
drivers/nvdimm/region_devs.c
···
1072 1072 nd_region->flags = ndr_desc->flags;
1073 1073 nd_region->ro = ro;
1074 1074 nd_region->numa_node = ndr_desc->numa_node;
1075 + nd_region->target_node = ndr_desc->target_node;
1075 1076 ida_init(&nd_region->ns_ida);
1076 1077 ida_init(&nd_region->btt_ida);
1077 1078 ida_init(&nd_region->pfn_ida);
+5
include/linux/acpi.h
···
400 400
401 401 #ifdef CONFIG_ACPI_NUMA
402 402 int acpi_map_pxm_to_online_node(int pxm);
403 + int acpi_map_pxm_to_node(int pxm);
403 404 int acpi_get_node(acpi_handle handle);
404 405 #else
405 406 static inline int acpi_map_pxm_to_online_node(int pxm)
407 + {
408 + return 0;
409 + }
410 + static inline int acpi_map_pxm_to_node(int pxm)
406 411 {
407 412 return 0;
408 413 }
+1
include/linux/libnvdimm.h
···
130 130 void *provider_data;
131 131 int num_lanes;
132 132 int numa_node;
133 + int target_node;
133 134 unsigned long flags;
134 135 struct device_node *of_node;
135 136 };
+15 -3
kernel/resource.c
··· 382 382 int (*func)(struct resource *, void *)) 383 383 { 384 384 struct resource res; 385 - int ret = -1; 385 + int ret = -EINVAL; 386 386 387 387 while (start < end && 388 388 !find_next_iomem_res(start, end, flags, desc, first_lvl, &res)) { ··· 452 452 * This function calls the @func callback against all memory ranges of type 453 453 * System RAM which are marked as IORESOURCE_SYSTEM_RAM and IORESOUCE_BUSY. 454 454 * It is to be used only for System RAM. 455 + * 456 + * This will find System RAM ranges that are children of top-level resources 457 + * in addition to top-level System RAM resources. 455 458 */ 456 459 int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages, 457 460 void *arg, int (*func)(unsigned long, unsigned long, void *)) ··· 463 460 unsigned long flags; 464 461 struct resource res; 465 462 unsigned long pfn, end_pfn; 466 - int ret = -1; 463 + int ret = -EINVAL; 467 464 468 465 start = (u64) start_pfn << PAGE_SHIFT; 469 466 end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1; 470 467 flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; 471 468 while (start < end && 472 469 !find_next_iomem_res(start, end, flags, IORES_DESC_NONE, 473 - true, &res)) { 470 + false, &res)) { 474 471 pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT; 475 472 end_pfn = (res.end + 1) >> PAGE_SHIFT; 476 473 if (end_pfn > pfn) ··· 1131 1128 conflict = __request_resource(parent, res); 1132 1129 if (!conflict) 1133 1130 break; 1131 + /* 1132 + * mm/hmm.c reserves physical addresses which then 1133 + * become unavailable to other users. Conflicts are 1134 + * not expected. Warn to aid debugging if encountered. 1135 + */ 1136 + if (conflict->desc == IORES_DESC_DEVICE_PRIVATE_MEMORY) { 1137 + pr_warn("Unaddressable device %s %pR conflicts with %pR", 1138 + conflict->name, conflict, res); 1139 + } 1134 1140 if (conflict != parent) { 1135 1141 if (!(conflict->flags & IORESOURCE_BUSY)) { 1136 1142 parent = conflict;
+13 -17
mm/memory_hotplug.c
··· 101 101 /* add this memory to iomem resource */ 102 102 static struct resource *register_memory_resource(u64 start, u64 size) 103 103 { 104 - struct resource *res, *conflict; 104 + struct resource *res; 105 + unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; 106 + char *resource_name = "System RAM"; 105 107 106 108 if (start + size > max_mem_size) 107 109 return ERR_PTR(-E2BIG); 108 110 109 - res = kzalloc(sizeof(struct resource), GFP_KERNEL); 110 - if (!res) 111 - return ERR_PTR(-ENOMEM); 111 + /* 112 + * Request ownership of the new memory range. This might be 113 + * a child of an existing resource that was present but 114 + * not marked as busy. 115 + */ 116 + res = __request_region(&iomem_resource, start, size, 117 + resource_name, flags); 112 118 113 - res->name = "System RAM"; 114 - res->start = start; 115 - res->end = start + size - 1; 116 - res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; 117 - conflict = request_resource_conflict(&iomem_resource, res); 118 - if (conflict) { 119 - if (conflict->desc == IORES_DESC_DEVICE_PRIVATE_MEMORY) { 120 - pr_debug("Device unaddressable memory block " 121 - "memory hotplug at %#010llx !\n", 122 - (unsigned long long)start); 123 - } 124 - pr_debug("System RAM resource %pR cannot be added\n", res); 125 - kfree(res); 119 + if (!res) { 120 + pr_debug("Unable to reserve System RAM region: %016llx->%016llx\n", 121 + start, start + size); 126 122 return ERR_PTR(-EEXIST); 127 123 } 128 124 return res;
+6 -1
tools/testing/nvdimm/Kbuild
··· 35 35 endif 36 36 obj-$(CONFIG_DEV_DAX) += device_dax.o 37 37 obj-$(CONFIG_DEV_DAX_PMEM) += dax_pmem.o 38 + obj-$(CONFIG_DEV_DAX_PMEM) += dax_pmem_core.o 39 + obj-$(CONFIG_DEV_DAX_PMEM_COMPAT) += dax_pmem_compat.o 38 40 39 41 nfit-y := $(ACPI_SRC)/core.o 40 42 nfit-y += $(ACPI_SRC)/intel.o ··· 59 57 nd_e820-y += config_check.o 60 58 61 59 dax-y := $(DAX_SRC)/super.o 60 + dax-y += $(DAX_SRC)/bus.o 62 61 dax-y += config_check.o 63 62 64 63 device_dax-y := $(DAX_SRC)/device.o ··· 67 64 device_dax-y += device_dax_test.o 68 65 device_dax-y += config_check.o 69 66 70 - dax_pmem-y := $(DAX_SRC)/pmem.o 67 + dax_pmem-y := $(DAX_SRC)/pmem/pmem.o 68 + dax_pmem_core-y := $(DAX_SRC)/pmem/core.o 69 + dax_pmem_compat-y := $(DAX_SRC)/pmem/compat.o 71 70 dax_pmem-y += config_check.o 72 71 73 72 libnvdimm-y := $(NVDIMM_SRC)/core.o
+3 -13
tools/testing/nvdimm/dax-dev.c
··· 17 17 phys_addr_t dax_pgoff_to_phys(struct dev_dax *dev_dax, pgoff_t pgoff, 18 18 unsigned long size) 19 19 { 20 - struct resource *res; 20 + struct resource *res = &dev_dax->region->res; 21 21 phys_addr_t addr; 22 - int i; 23 22 24 - for (i = 0; i < dev_dax->num_resources; i++) { 25 - res = &dev_dax->res[i]; 26 - addr = pgoff * PAGE_SIZE + res->start; 27 - if (addr >= res->start && addr <= res->end) 28 - break; 29 - pgoff -= PHYS_PFN(resource_size(res)); 30 - } 31 - 32 - if (i < dev_dax->num_resources) { 33 - res = &dev_dax->res[i]; 23 + addr = pgoff * PAGE_SIZE + res->start; 24 + if (addr >= res->start && addr <= res->end) { 34 25 if (addr + size - 1 <= res->end) { 35 26 if (get_nfit_res(addr)) { 36 27 struct page *page; ··· 35 44 return addr; 36 45 } 37 46 } 38 - 39 47 return -1; 40 48 }
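After this change dax_pgoff_to_phys() checks a page offset against a single resource range instead of looping over several. The range check can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel function: `struct span`, `pgoff_to_phys()` and `SKETCH_PAGE_SIZE` are hypothetical names, the page size is fixed at 4 KiB, and `UINT64_MAX` stands in for the -1 the kernel version returns on a miss.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for struct resource; 'end' is inclusive. */
struct span {
	uint64_t start;
	uint64_t end;
};

/*
 * Translate a page offset into a physical address inside 'res'.
 * Returns UINT64_MAX when the 'size'-byte mapping would not fit.
 */
static uint64_t pgoff_to_phys(const struct span *res, uint64_t pgoff,
			      uint64_t size)
{
	uint64_t phys = pgoff * SKETCH_PAGE_SIZE + res->start;

	assert(size != 0);

	/* The first byte must lie inside the range ... */
	if (phys >= res->start && phys <= res->end) {
		/* ... and so must the last byte of the mapping. */
		if (phys + size - 1 <= res->end)
			return phys;
	}
	return UINT64_MAX;
}
```

For a 2 MiB range, the last 4 KiB page (offset 511) translates successfully, while a 2 MiB mapping requested at that same offset is rejected because its final byte falls past the end of the range.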