Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

net: pcs: lynx: accept in-band autoneg for 2500base-x

Testing in two circumstances:

1. back to back optical SFP+ connection between two LS1028A-QDS ports
with the SCH-26908 riser card
2. T1042 with on-board AQR115 PHY using "OCSGMII", as per
https://lore.kernel.org/lkml/aIuEvaSCIQdJWcZx@FUE-ALEWI-WINX/

strongly suggests that enabling in-band auto-negotiation is actually
possible when the lane baud rate is 3.125 Gbps.
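
For orientation, the 3.125 Gbps lane baud rate and the 2500 Mbps data rate
of 2500base-x are related by the 8b/10b line coding overhead. The sketch
below is only an illustration of that arithmetic; the function name is
hypothetical and is not part of the driver.

```c
#include <stdint.h>

/*
 * 8b/10b line coding carries 8 payload bits in every 10 bits on the wire.
 * Hypothetical helper: lane baud rate in Mbaud -> payload rate in Mbps.
 */
static inline uint32_t payload_mbps_8b10b(uint32_t lane_mbaud)
{
	return (uint32_t)(((uint64_t)lane_mbaud * 8) / 10);
}
```

With this helper, a 3125 Mbaud lane yields 2500 Mbps, and the 1250 Mbaud
lane used by 1000base-x yields 1000 Mbps.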

It was previously thought that this would not be the case, because it
had only been tested on 2500base-x links with on-board Aquantia PHYs,
where MII_LPA is always reported as zero, and this was assumed to be
because of the PCS.

Test case #1 above shows it is not, and the configured MII_ADVERTISE on
system A ends up in the MII_LPA on system B, when in 2500base-x mode
(IF_MODE=0).
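
To make the MII_ADVERTISE to MII_LPA observation concrete, here is a
minimal sketch of decoding a Clause 37 base page as it would be read back
from MII_LPA. The bit values follow the ADVERTISE_1000X* and LPA_LPACK
definitions in include/uapi/linux/mii.h; the C37_* names, the struct and
the function are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Clause 37 base page bits (same values as ADVERTISE_1000X* in mii.h). */
#define C37_FULL_DUPLEX	0x0020
#define C37_HALF_DUPLEX	0x0040
#define C37_PAUSE	0x0080
#define C37_ASYM_PAUSE	0x0100
#define C37_ACK		0x4000

/* Hypothetical container for the decoded abilities. */
struct c37_abilities {
	bool full_duplex;
	bool half_duplex;
	bool sym_pause;
	bool asym_pause;
	bool ack;
};

/* Decode a 16-bit word as read from MII_LPA in 1000base-x/2500base-x mode. */
static inline struct c37_abilities c37_decode(uint16_t word)
{
	struct c37_abilities a = {
		.full_duplex = !!(word & C37_FULL_DUPLEX),
		.half_duplex = !!(word & C37_HALF_DUPLEX),
		.sym_pause   = !!(word & C37_PAUSE),
		.asym_pause  = !!(word & C37_ASYM_PAUSE),
		.ack         = !!(word & C37_ACK),
	};

	return a;
}
```

If system A advertises full duplex with symmetric pause (0x00a0), system B
would be expected to read that word in MII_LPA, possibly with the
acknowledge bit set, and decode it as above. An MII_LPA of zero decodes to
no abilities at all, which is what was seen with the Aquantia PHYs.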

Test case #2, which uses "SGMII" auto-negotiation (IF_MODE=3) for the
3.125 Gbps lane, is actually a misconfiguration, but it is what led to
the discovery.

There is actually an old bug in the Lynx PCS driver: it expects all
register values to contain their default out-of-reset values, as if the
PCS were initialized by the Reset Configuration Word (RCW) settings.
There are two cases in which this is problematic:
- if the bootloader (or previous kexec-enabled Linux) wrote a different
IF_MODE value
- if dynamically changing the SerDes protocol from 1000base-x to
2500base-x, e.g. by replacing the optical SFP module.
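
The failure mode above can be sketched with a simulated IF_MODE register.
This is not the driver code: the enum, the helper names and the
register-as-a-plain-integer model are hypothetical, and the two bit names
are assumed to match the driver's IF_MODE_SGMII_EN and
IF_MODE_USE_SGMII_AN definitions (together they give the IF_MODE=3 value
mentioned earlier). The point is that an explicit read-modify-write makes
the result independent of whatever the bootloader or a previous SerDes
protocol left behind.

```c
#include <stdint.h>

/* Assumed to mirror the low IF_MODE bits used by the Lynx PCS driver. */
#define IF_MODE_SGMII_EN	(1u << 0)
#define IF_MODE_USE_SGMII_AN	(1u << 1)

/* Hypothetical stand-in for phy_interface_t, for illustration only. */
enum sketch_interface {
	SKETCH_SGMII,
	SKETCH_1000BASEX,
	SKETCH_2500BASEX,
};

/* Read-modify-write: only the bits in @mask are replaced by @set. */
static inline uint16_t reg_modify(uint16_t old, uint16_t mask, uint16_t set)
{
	return (uint16_t)((old & ~mask) | (set & mask));
}

/*
 * Compute the new IF_MODE value from the current (possibly stale) one.
 * 1000base-x and 2500base-x both use Clause 37, i.e. both bits cleared.
 */
static inline uint16_t sketch_config_if_mode(uint16_t current,
					     enum sketch_interface iface,
					     int inband_enabled)
{
	uint16_t if_mode = 0;

	if (iface == SKETCH_SGMII) {
		if_mode = IF_MODE_SGMII_EN;
		if (inband_enabled)
			if_mode |= IF_MODE_USE_SGMII_AN;
	}

	return reg_modify(current, IF_MODE_SGMII_EN | IF_MODE_USE_SGMII_AN,
			  if_mode);
}
```

Starting from a stale IF_MODE of 3, configuring 2500base-x yields 0, while
bits outside the mask (for example the speed and duplex fields) are left
untouched.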

Specifically in test case #2, an accidental alignment between the
bootloader configuring the PCS to expect SGMII in-band code words, and
the AQR115 PHY actually transmitting SGMII in-band code words when
operating in the "OCSGMII" system interface protocol, led to the PCS
transmitting replicated symbols at 3.125 Gbps baud rate. This could only
have happened if the PCS saw and reacted to the SGMII code words in the
first place.

Since test #2 is invalid from a protocol perspective (there seems to be
no standard way of negotiating the data rate of 2500 Mbps with SGMII,
and the lower data rates should remain 10/100/1000), in-band auto-negotiation
for 2500base-x effectively means Clause 37 (i.e. IF_MODE=0).

Make 2500base-x be treated like 1000base-x in this regard, by removing
all prior limitations and calling lynx_pcs_config_giga().

This adds a new feature (LINK_INBAND_ENABLE for 2500base-x) and at the
same time fixes the Lynx PCS driver's long-standing problem that the
registers (specifically IF_MODE, but others could be misconfigured as
well) are not written by the driver to the known valid values for
2500base-x.

Co-developed-by: Alexander Wilhelm <alexander.wilhelm@westermo.com>
Signed-off-by: Alexander Wilhelm <alexander.wilhelm@westermo.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20251125103507.749654-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Authored by Vladimir Oltean and committed by Jakub Kicinski
56435627 e3b8cbf4

+5 -72
drivers/net/pcs/pcs-lynx.c
···
 {
 	switch (interface) {
 	case PHY_INTERFACE_MODE_1000BASEX:
+	case PHY_INTERFACE_MODE_2500BASEX:
 	case PHY_INTERFACE_MODE_SGMII:
 	case PHY_INTERFACE_MODE_QSGMII:
 		return LINK_INBAND_DISABLE | LINK_INBAND_ENABLE;
 
 	case PHY_INTERFACE_MODE_10GBASER:
-	case PHY_INTERFACE_MODE_2500BASEX:
 		return LINK_INBAND_DISABLE;
 
 	case PHY_INTERFACE_MODE_USXGMII:
···
 	phylink_decode_usxgmii_word(state, lpa);
 }
 
-static void lynx_pcs_get_state_2500basex(struct mdio_device *pcs,
-					 struct phylink_link_state *state)
-{
-	int bmsr;
-
-	bmsr = mdiodev_read(pcs, MII_BMSR);
-	if (bmsr < 0) {
-		state->link = false;
-		return;
-	}
-
-	state->link = !!(bmsr & BMSR_LSTATUS);
-	state->an_complete = !!(bmsr & BMSR_ANEGCOMPLETE);
-	if (!state->link)
-		return;
-
-	state->speed = SPEED_2500;
-	state->pause |= MLO_PAUSE_TX | MLO_PAUSE_RX;
-	state->duplex = DUPLEX_FULL;
-}
-
 static void lynx_pcs_get_state(struct phylink_pcs *pcs, unsigned int neg_mode,
 			       struct phylink_link_state *state)
 {
···
 
 	switch (state->interface) {
 	case PHY_INTERFACE_MODE_1000BASEX:
+	case PHY_INTERFACE_MODE_2500BASEX:
 	case PHY_INTERFACE_MODE_SGMII:
 	case PHY_INTERFACE_MODE_QSGMII:
 		phylink_mii_c22_pcs_get_state(lynx->mdio, neg_mode, state);
-		break;
-	case PHY_INTERFACE_MODE_2500BASEX:
-		lynx_pcs_get_state_2500basex(lynx->mdio, state);
 		break;
 	case PHY_INTERFACE_MODE_USXGMII:
 	case PHY_INTERFACE_MODE_10G_QXGMII:
···
 		mdiodev_write(pcs, LINK_TIMER_HI, link_timer >> 16);
 	}
 
-	if (interface == PHY_INTERFACE_MODE_1000BASEX) {
+	if (interface == PHY_INTERFACE_MODE_1000BASEX ||
+	    interface == PHY_INTERFACE_MODE_2500BASEX) {
 		if_mode = 0;
 	} else {
 		/* SGMII and QSGMII */
···
 	case PHY_INTERFACE_MODE_1000BASEX:
 	case PHY_INTERFACE_MODE_SGMII:
 	case PHY_INTERFACE_MODE_QSGMII:
+	case PHY_INTERFACE_MODE_2500BASEX:
 		return lynx_pcs_config_giga(lynx->mdio, ifmode, advertising,
 					    neg_mode);
-	case PHY_INTERFACE_MODE_2500BASEX:
-		if (neg_mode == PHYLINK_PCS_NEG_INBAND_ENABLED) {
-			dev_err(&lynx->mdio->dev,
-				"AN not supported on 3.125GHz SerDes lane\n");
-			return -EOPNOTSUPP;
-		}
-		break;
 	case PHY_INTERFACE_MODE_USXGMII:
 	case PHY_INTERFACE_MODE_10G_QXGMII:
 		return lynx_pcs_config_usxgmii(lynx->mdio, ifmode, advertising,
···
 		       if_mode);
 }
 
-/* 2500Base-X is SerDes protocol 7 on Felix and 6 on ENETC. It is a SerDes lane
- * clocked at 3.125 GHz which encodes symbols with 8b/10b and does not have
- * auto-negotiation of any link parameters. Electrically it is compatible with
- * a single lane of XAUI.
- * The hardware reference manual wants to call this mode SGMII, but it isn't
- * really, since the fundamental features of SGMII:
- * - Downgrading the link speed by duplicating symbols
- * - Auto-negotiation
- * are not there.
- * The speed is configured at 1000 in the IF_MODE because the clock frequency
- * is actually given by a PLL configured in the Reset Configuration Word (RCW).
- * Since there is no difference between fixed speed SGMII w/o AN and 802.3z w/o
- * AN, we call this PHY interface type 2500Base-X. In case a PHY negotiates a
- * lower link speed on line side, the system-side interface remains fixed at
- * 2500 Mbps and we do rate adaptation through pause frames.
- */
-static void lynx_pcs_link_up_2500basex(struct mdio_device *pcs,
-				       unsigned int neg_mode,
-				       int speed, int duplex)
-{
-	u16 if_mode = 0;
-
-	if (neg_mode == PHYLINK_PCS_NEG_INBAND_ENABLED) {
-		dev_err(&pcs->dev, "AN not supported for 2500BaseX\n");
-		return;
-	}
-
-	if (duplex == DUPLEX_HALF)
-		if_mode |= IF_MODE_HALF_DUPLEX;
-	if_mode |= IF_MODE_SPEED(SGMII_SPEED_2500);
-
-	mdiodev_modify(pcs, IF_MODE,
-		       IF_MODE_HALF_DUPLEX | IF_MODE_SPEED_MSK,
-		       if_mode);
-}
-
 static void lynx_pcs_link_up(struct phylink_pcs *pcs, unsigned int neg_mode,
 			     phy_interface_t interface,
 			     int speed, int duplex)
···
 	case PHY_INTERFACE_MODE_SGMII:
 	case PHY_INTERFACE_MODE_QSGMII:
 		lynx_pcs_link_up_sgmii(lynx->mdio, neg_mode, speed, duplex);
-		break;
-	case PHY_INTERFACE_MODE_2500BASEX:
-		lynx_pcs_link_up_2500basex(lynx->mdio, neg_mode, speed, duplex);
 		break;
 	case PHY_INTERFACE_MODE_USXGMII:
 	case PHY_INTERFACE_MODE_10G_QXGMII: