Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'dp83tg720-reduce-link-recovery'

Oleksij Rempel says:

====================
dp83tg720: Reduce link recovery

This patch series improves the link recovery behavior of the TI
DP83TG720 PHY driver.

Previously, we introduced randomized reset delay logic to avoid reset
collisions in multi-PHY setups. While this approach was functional, it
had notable drawbacks: unpredictable behavior, longer and more variable
link recovery times, and overall higher complexity in link handling.

With this new approach, we replace the randomized delay with
deterministic, role-specific delays in the PHY reset logic. This enables
us to:
- Remove the redundant empirical 600 ms delay in read_status()
- Drop the random polling interval logic
- Introduce a clean, adaptive polling strategy with consistent
  behavior and improved responsiveness

As a result, the PHY is now able to recover link reliably in under
1000 ms.
====================

Link: https://patch.msgid.link/20250612104157.2262058-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
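
The adaptive polling strategy described in the cover letter can be sketched
in user-space C as follows. This is only an illustration under stated
assumptions, not the driver code: it works in milliseconds rather than
jiffies, and struct poll_state and next_poll_ms() are hypothetical names
introduced here. The interval values are the ones defined in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Interval values taken from the patch, all in milliseconds. */
#define POLL_ACTIVE_LINK_MS	421
#define POLL_NO_LINK_MS		149
#define FAST_POLL_DURATION_MS	6000
#define POLL_SLOW_MS		1117

/* Hypothetical stand-in for the driver's private state. */
struct poll_state {
	bool down_seen;		/* true once a link-down time was recorded */
	uint64_t link_down_ms;	/* time at which the link was first seen down */
};

/*
 * Return the next polling interval in milliseconds.
 *
 * Link up:            forget any recorded link-down time, poll slowly.
 * Link recently down: poll fast for FAST_POLL_DURATION_MS.
 * Link down for long: fall back to slow polling.
 */
static unsigned int next_poll_ms(struct poll_state *st, bool link_up,
				 uint64_t now_ms)
{
	if (link_up) {
		st->down_seen = false;
		return POLL_ACTIVE_LINK_MS;
	}

	/* Record the moment the link was first observed down. */
	if (!st->down_seen) {
		st->down_seen = true;
		st->link_down_ms = now_ms;
	}

	if (now_ms - st->link_down_ms < FAST_POLL_DURATION_MS)
		return POLL_NO_LINK_MS;

	return POLL_SLOW_MS;
}
```

The sketch uses an explicit down_seen flag, whereas the driver treats a
last_link_down_jiffies value of 0 as "not recorded". Calling next_poll_ms()
repeatedly with link_up == false returns the fast interval until 6000 ms
have passed since the first such call, and the slow interval afterwards.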

+124 -57
drivers/net/phy/dp83tg720.c
···
 #include "open_alliance_helpers.h"
 
 /*
+ * DP83TG720 PHY Limitations and Workarounds
+ *
+ * The DP83TG720 1000BASE-T1 PHY has several limitations that require
+ * software-side mitigations. These workarounds are implemented throughout
+ * this driver. This section documents the known issues and their corresponding
+ * mitigation strategies.
+ *
+ * 1. Unreliable Link Detection and Synchronized Reset Deadlock
+ * ------------------------------------------------------------
+ * After a link loss or during link establishment, the DP83TG720 PHY may fail
+ * to detect or report link status correctly. As of June 2025, no public
+ * errata sheet for the DP83TG720 PHY documents this behavior.
+ * The "DP83TC81x, DP83TG72x Software Implementation Guide" application note
+ * (SNLA404, available at https://www.ti.com/lit/an/snla404/snla404.pdf)
+ * recommends performing a soft restart if polling for a link fails to establish
+ * a connection after 100ms. This procedure is adopted as the workaround for the
+ * observed link detection issue.
+ *
+ * However, in point-to-point setups where both link partners use the same
+ * driver (e.g. Linux on both sides), a synchronized reset pattern may emerge.
+ * This leads to a deadlock, where both PHYs reset at the same time and
+ * continuously miss each other during auto-negotiation.
+ *
+ * To address this, the reset procedure includes two components:
+ *
+ * - A **fixed minimum delay of 1ms** after a hardware reset. The datasheet
+ *   "DP83TG720S-Q1 1000BASE-T1 Automotive Ethernet PHY with SGMII and RGMII"
+ *   specifies this as the "Post reset stabilization-time prior to MDC preamble
+ *   for register access" (T6.2), ensuring the PHY is ready for MDIO
+ *   operations.
+ *
+ * - An **additional asymmetric delay**, empirically chosen based on
+ *   master/slave role. This reduces the risk of synchronized resets on both
+ *   link partners. Values are selected to avoid periodic overlap and ensure
+ *   the link is re-established within a few cycles.
+ *
+ * The functions that implement this logic are:
+ * - dp83tg720_soft_reset()
+ * - dp83tg720_get_next_update_time()
+ *
+ * 2. Polling-Based Link Detection and IRQ Support
+ * -----------------------------------------------
+ * Due to the PHY-specific limitation described in section 1, link-up events
+ * cannot be reliably detected via interrupts on the DP83TG720. Therefore,
+ * polling is required to detect transitions from link-down to link-up.
+ *
+ * While link-down events *can* be detected via IRQs on this PHY, this driver
+ * currently does **not** implement interrupt support. As a result, all link
+ * state changes must be detected using polling.
+ *
+ * Polling behavior:
+ * - When the link is up: slow polling (e.g. 1s).
+ * - When the link just went down: fast polling for a short time.
+ * - When the link stays down: fallback to slow polling.
+ *
+ * This design balances responsiveness and CPU usage. It sacrifices fast link-up
+ * times in cases where the link is expected to remain down for extended periods,
+ * assuming that such systems do not require immediate reactivity.
+ */
+
+/*
  * DP83TG720S_POLL_ACTIVE_LINK - Polling interval in milliseconds when the link
  *                               is active.
- * DP83TG720S_POLL_NO_LINK_MIN - Minimum polling interval in milliseconds when
- *                               the link is down.
- * DP83TG720S_POLL_NO_LINK_MAX - Maximum polling interval in milliseconds when
- *                               the link is down.
+ * DP83TG720S_POLL_NO_LINK - Polling interval in milliseconds when the
+ *                           link is down.
+ * DP83TG720S_FAST_POLL_DURATION_MS - Timeout in milliseconds for no-link
+ *                                    polling after which polling interval is
+ *                                    increased.
+ * DP83TG720S_POLL_SLOW - Slow polling interval when there is no
+ *                        link for a prolongued period.
+ * DP83TG720S_RESET_DELAY_MS_MASTER - Delay after a reset before attempting
+ *                                    to establish a link again for master phy.
+ * DP83TG720S_RESET_DELAY_MS_SLAVE - Delay after a reset before attempting
+ *                                   to establish a link again for slave phy.
  *
  * These values are not documented or officially recommended by the vendor but
  * were determined through empirical testing. They achieve a good balance in
  * minimizing the number of reset retries while ensuring reliable link recovery
  * within a reasonable timeframe.
  */
-#define DP83TG720S_POLL_ACTIVE_LINK		1000
-#define DP83TG720S_POLL_NO_LINK_MIN		100
-#define DP83TG720S_POLL_NO_LINK_MAX		1000
+#define DP83TG720S_POLL_ACTIVE_LINK		421
+#define DP83TG720S_POLL_NO_LINK			149
+#define DP83TG720S_FAST_POLL_DURATION_MS	6000
+#define DP83TG720S_POLL_SLOW			1117
+#define DP83TG720S_RESET_DELAY_MS_MASTER	97
+#define DP83TG720S_RESET_DELAY_MS_SLAVE		149
 
 #define DP83TG720S_PHY_ID			0x2000a284
 
···
 
 struct dp83tg720_priv {
 	struct dp83tg720_stats stats;
+	unsigned long last_link_down_jiffies;
 };
 
 /**
···
 	priv->stats.rx_err_pkt_cnt += ret;
 
 	return 0;
+}
+
+static int dp83tg720_soft_reset(struct phy_device *phydev)
+{
+	int ret;
+
+	ret = phy_write(phydev, DP83TG720S_PHY_RESET, DP83TG720S_HW_RESET);
+	if (ret)
+		return ret;
+
+	/* Include mandatory MDC-access delay (1ms) + extra asymmetric delay to
+	 * avoid synchronized reset deadlock. See section 1 in the top-of-file
+	 * comment block.
+	 */
+	if (phydev->master_slave_state == MASTER_SLAVE_STATE_SLAVE)
+		msleep(DP83TG720S_RESET_DELAY_MS_SLAVE);
+	else
+		msleep(DP83TG720S_RESET_DELAY_MS_MASTER);
+
+	return ret;
 }
 
 static void dp83tg720_get_link_stats(struct phy_device *phydev,
···
 		/* According to the "DP83TC81x, DP83TG72x Software
 		 * Implementation Guide", the PHY needs to be reset after a
 		 * link loss or if no link is created after at least 100ms.
-		 *
-		 * Currently we are polling with the PHY_STATE_TIME (1000ms)
-		 * interval, which is still enough for not automotive use cases.
 		 */
 		ret = phy_init_hw(phydev);
 		if (ret)
 			return ret;
-
-		/* Sleep 600ms for PHY stabilization post-reset.
-		 * Empirically chosen value (not documented).
-		 * Helps reduce reset bounces with link partners having similar
-		 * issues.
-		 */
-		msleep(600);
 
 		/* After HW reset we need to restore master/slave configuration.
 		 * genphy_c45_pma_baset1_read_master_slave() call will be done
···
 {
 	int ret;
 
-	/* Software Restart is not enough to recover from a link failure.
-	 * Using Hardware Reset instead.
-	 */
-	ret = phy_write(phydev, DP83TG720S_PHY_RESET, DP83TG720S_HW_RESET);
+	/* Reset the PHY to recover from a link failure */
+	ret = dp83tg720_soft_reset(phydev);
 	if (ret)
 		return ret;
-
-	/* Wait until MDC can be used again.
-	 * The wait value of one 1ms is documented in "DP83TG720S-Q1 1000BASE-T1
-	 * Automotive Ethernet PHY with SGMII and RGMII" datasheet.
-	 */
-	usleep_range(1000, 2000);
 
 	if (phy_interface_is_rgmii(phydev)) {
 		ret = dp83tg720_config_rgmii_delay(phydev);
···
 }
 
 /**
- * dp83tg720_get_next_update_time - Determine the next update time for PHY
- *                                  state
+ * dp83tg720_get_next_update_time - Return next polling interval for PHY state
  * @phydev: Pointer to the phy_device structure
  *
- * This function addresses a limitation of the DP83TG720 PHY, which cannot
- * reliably detect or report a stable link state. To recover from such
- * scenarios, the PHY must be periodically reset when the link is down. However,
- * if the link partner also runs Linux with the same driver, synchronized reset
- * intervals can lead to a deadlock where the link never establishes due to
- * simultaneous resets on both sides.
+ * Implements adaptive polling interval logic depending on link state and
+ * downtime duration. See the "2. Polling-Based Link Detection and IRQ Support"
+ * section at the top of this file for details.
  *
- * To avoid this, the function implements randomized polling intervals when the
- * link is down. It ensures that reset intervals are desynchronized by
- * introducing a random delay between a configured minimum and maximum range.
- * When the link is up, a fixed polling interval is used to minimize overhead.
- *
- * This mechanism guarantees that the link will reestablish within 10 seconds
- * in the worst-case scenario.
- *
- * Return: Time (in jiffies) until the next update event for the PHY state
- * machine.
+ * Return: Time (in jiffies) until the next poll
  */
 static unsigned int dp83tg720_get_next_update_time(struct phy_device *phydev)
 {
+	struct dp83tg720_priv *priv = phydev->priv;
 	unsigned int next_time_jiffies;
 
 	if (phydev->link) {
-		/* When the link is up, use a fixed 1000ms interval
-		 * (in jiffies)
-		 */
+		priv->last_link_down_jiffies = 0;
+
+		/* When the link is up, use a slower interval (in jiffies) */
 		next_time_jiffies =
 			msecs_to_jiffies(DP83TG720S_POLL_ACTIVE_LINK);
 	} else {
-		unsigned int min_jiffies, max_jiffies, rand_jiffies;
+		unsigned long now = jiffies;
 
-		/* When the link is down, randomize interval between min/max
-		 * (in jiffies)
-		 */
-		min_jiffies = msecs_to_jiffies(DP83TG720S_POLL_NO_LINK_MIN);
-		max_jiffies = msecs_to_jiffies(DP83TG720S_POLL_NO_LINK_MAX);
+		if (!priv->last_link_down_jiffies)
+			priv->last_link_down_jiffies = now;
 
-		rand_jiffies = min_jiffies +
-			get_random_u32_below(max_jiffies - min_jiffies + 1);
-		next_time_jiffies = rand_jiffies;
+		if (time_before(now, priv->last_link_down_jiffies +
+		    msecs_to_jiffies(DP83TG720S_FAST_POLL_DURATION_MS))) {
+			/* Link recently went down: fast polling */
+			next_time_jiffies =
+				msecs_to_jiffies(DP83TG720S_POLL_NO_LINK);
+		} else {
+			/* Link has been down for a while: slow polling */
+			next_time_jiffies =
+				msecs_to_jiffies(DP83TG720S_POLL_SLOW);
+		}
 	}
 
 	/* Ensure the polling time is at least one jiffy */
···
 
 	.flags		= PHY_POLL_CABLE_TEST,
 	.probe		= dp83tg720_probe,
+	.soft_reset	= dp83tg720_soft_reset,
 	.config_aneg	= dp83tg720_config_aneg,
 	.read_status	= dp83tg720_read_status,
 	.get_features	= genphy_c45_pma_read_ext_abilities,
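
The fast-poll window check in dp83tg720_get_next_update_time() relies on
time_before() rather than a plain "<" comparison, because the jiffies counter
wraps around. The following user-space sketch models that comparison. It is
an illustration only: model_time_before() and in_fast_poll_window() are
hypothetical names, and the model omits the type checking that the kernel
macro performs. The unsigned-to-signed cast assumes the usual two's-complement
behavior, which the kernel also relies on.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Model of the kernel's time_before(a, b): true if counter value a lies
 * before b. Comparing the sign of the difference stays correct when the
 * counter wraps between the two samples, as long as the two values are
 * less than half the counter range apart.
 */
static bool model_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/*
 * True while "now" is still inside the window that started at down_since
 * and lasts "window" ticks. The deadline may wrap past ULONG_MAX; the
 * unsigned addition and the signed comparison handle that together.
 */
static bool in_fast_poll_window(unsigned long now, unsigned long down_since,
				unsigned long window)
{
	return model_time_before(now, down_since + window);
}
```

With down_since close to ULONG_MAX, the sum down_since + window wraps to a
small number, so a naive "now < down_since + window" test would report that
the window has already ended. The signed-difference form keeps the window
open until the intended number of ticks has elapsed.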