
Local ARS HLD #1958

Merged: lihuay merged 14 commits into sonic-net:master from Marvell-switching:local_ARS on Feb 19, 2026

@VladimirKuk commented Apr 8, 2025

This document provides the high-level design for the local Adaptive Routing and Switching (ARS) feature.

PRs:

| Repository | PR Title | State | Context |
|---|---|---|---|
| sonic-buildimage | Local ARS (Adaptive Routing and Switching) | PR Status | PR Checks |
| sonic-swss | Local ARS (Adaptive Routing and Switching) | PR Status | PR Checks |
| sonic-swss-common | Local ARS (Adaptive Routing and Switching) | PR Status | PR Checks |
| sonic-mgmt | Local ARS (Adaptive Routing and Switching test plan) | PR Status | PR Checks |
| sonic-mgmt | Local ARS (Adaptive Routing and Switching test plan) | PR Status | PR Checks |

Signed-off-by: Vladimir Kuk <vkuk@marvell.com>
Fixed phases contents
Fixed spelling
Added user configured port scaling factor
Updated init sequence
Limit idle time value

Signed-off-by: Vladimir Kuk <vkuk@marvell.com>

@Yubin-Li

Why does this schema use a prefix as the key? Does it mean that only some prefixes can be used with AR ECMP?

@VladimirKuk
Contributor Author

> Why does this schema use a prefix as the key? Does it mean that only some prefixes can be used with AR ECMP?

Yes, this supports the use case of L3 traffic sent to an NHG.
Also, there is currently no other way to reference or identify an NHG via CONFIG_DB.

@Yubin-Li

> > Why does this schema use a prefix as the key? Does it mean that only some prefixes can be used with AR ECMP?
>
> Yes, this supports the use case of L3 traffic sent to an NHG. Also, there is currently no other way to reference or identify an NHG via CONFIG_DB.

If so, how should we handle 100 prefixes that all point to the same ECMP group when we want to use AR for that group? It is very hard to make dynamic routes point to an AR ECMP group.

Maybe we could define an AR ECMP as a port list plus a trip threshold.
For example, define [port1,2,3,4,5] as AR ECMP ports with trip threshold = 4.
Then, whenever an ECMP group is created with [1,2,3,4], [2,3,4,5], or [1,2,3,4,5], it can be created with AR.
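
The proposal above can be sketched as a simple membership test. This is only an illustration: the function name `qualifies_for_ars` is hypothetical, and it assumes that "trip threshold" means the minimum number of members an ECMP group needs, all of them drawn from the configured AR port list.

```python
def qualifies_for_ars(ecmp_members, ars_ports, threshold):
    """Return True if every member is an AR port and the group is large enough.

    Assumption: "trip threshold" is read as a minimum group size.
    """
    members = set(ecmp_members)
    return members <= set(ars_ports) and len(members) >= threshold


ARS_PORTS = {1, 2, 3, 4, 5}  # ports declared as AR ECMP ports in the example
THRESHOLD = 4                # trip threshold from the example

for group in ([1, 2, 3, 4], [2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3], [1, 2, 3, 6]):
    print(group, qualifies_for_ars(group, ARS_PORTS, THRESHOLD))
```

Under this reading, the three groups named in the comment qualify, while a group that is too small or that contains a port outside the list does not.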

@VladimirKuk
Contributor Author

> If so, how should we handle 100 prefixes that all point to the same ECMP group when we want to use AR for that group? It is very hard to make dynamic routes point to an AR ECMP group.
>
> Maybe we could define an AR ECMP as a port list plus a trip threshold. For example, define [port1,2,3,4,5] as AR ECMP ports with trip threshold = 4. Then, whenever an ECMP group is created with [1,2,3,4], [2,3,4,5], or [1,2,3,4,5], it can be created with AR.

There is already an ARS-enabled interface table, but it could include all interfaces, so I don't think it should be used for this. However, creating another table of ARS-enabled interfaces feels very wasteful.

The problematic scenario is this: suppose you have two NHGs, NHG1 (NH1, NH2, NH3) and NHG2 (NH1, NH2, NH4).
NH3 and NH4 are unreliable and keep going up and down.
If you rely on ECMP updates to determine whether a group is an ARS NHG, you could mistake NHG1 for NHG2, since their contents will be the same (NH1, NH2) while NH3 and NH4 are down.
A prefix, while not ideal, also specifies the path you want to enable ARS for.

Also, the use case was to enable static configuration. To enable dynamic NHG identification, the ARS/NHG IDs should probably be derived from or imported by some protocol (e.g. BGP).
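
The NHG1/NHG2 scenario can be reproduced in a few lines. This is a minimal sketch, not code from the HLD: `active_members` is a hypothetical helper that models what an ECMP update would expose, namely only the next hops that are currently up.

```python
def active_members(nhg, up):
    """Members of an NHG that are currently reachable."""
    return frozenset(nh for nh in nhg if nh in up)


NHG1 = ("NH1", "NH2", "NH3")
NHG2 = ("NH1", "NH2", "NH4")

all_up = {"NH1", "NH2", "NH3", "NH4"}
flapped = {"NH1", "NH2"}  # NH3 and NH4 are both down

# With every next hop up, the two groups are distinguishable.
print(active_members(NHG1, all_up) == active_members(NHG2, all_up))    # False
# With NH3 and NH4 down, both groups collapse to {NH1, NH2}.
print(active_members(NHG1, flapped) == active_members(NHG2, flapped))  # True
```

Once the unreliable members drop out, member-based identification can no longer tell the two groups apart, which is the argument for keying on something stable such as the prefix.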

Comment thread doc/ARS/Local_ARS_HLD.md Outdated
"max_flows" : "512",
"primary_path_threshold" : "100",
"alternative_path_cost": "250",
"alternative_path_members": {"1.1.1.1", "2.2.2.2"}

How do you know in advance which NHs can be used when the NHG is created from BGP by Route OA or FRR/Zebra?

Contributor Author

This should be known to the user.
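
Since the alternative next hops in this thread are supplied by the user, a small sanity check on the entry may be useful before it is written to the database. The sketch below is illustrative only: `check_ars_entry` is a hypothetical helper, the field names are taken from the snippet under review, and representing `alternative_path_members` as a list as well as the validation rules themselves are assumptions.

```python
import ipaddress


def check_ars_entry(entry):
    """Return a list of problems found in an entry; an empty list means none."""
    problems = []
    for field in ("max_flows", "primary_path_threshold", "alternative_path_cost"):
        value = entry.get(field)
        if value is None or not str(value).isdigit():
            problems.append(f"{field}: expected a non-negative integer, got {value!r}")
    for member in entry.get("alternative_path_members", []):
        try:
            ipaddress.ip_address(member)
        except ValueError:
            problems.append(f"alternative_path_members: {member!r} is not an IP address")
    return problems


entry = {
    "max_flows": "512",
    "primary_path_threshold": "100",
    "alternative_path_cost": "250",
    # Assumption: members are written as a list of next-hop IP addresses.
    "alternative_path_members": ["1.1.1.1", "2.2.2.2"],
}
print(check_ars_entry(entry))
```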

@rajendrat

> > Why does this schema use a prefix as the key? Does it mean that only some prefixes can be used with AR ECMP?
>
> Yes, this supports the use case of L3 traffic sent to an NHG. Also, there is currently no other way to reference or identify an NHG via CONFIG_DB.

@VladimirKuk: Is there a way to enable ARS for all NH groups, or can it be enabled only for groups matching the "vrf/ip-prefix"?

Separated NHG identification and ARS object
Added NHG matching by ARS-enabled members
Support NH creation from RouteOrch and NhgOrch

Signed-off-by: Vladimir Kuk <vkuk@marvell.com>
Comment thread doc/ARS/Local_ARS_HLD.md
description "ARS-enabled interface name";
}

leaf scaling_factor {

The definition of "scaling_factor" differs from SAI_PORT_ATTR_ARS_PORT_LOAD_SCALING_FACTOR, which creates ambiguity:

scaling_factor: 10000
SAI_PORT_ATTR_ARS_PORT_LOAD_SCALING_FACTOR: 40 # for 400G

Why not just set SAI_PORT_ATTR_ARS_PORT_LOAD_SCALING_FACTOR to the value of scaling_factor?

scaling_factor: 40
SAI_PORT_ATTR_ARS_PORT_LOAD_SCALING_FACTOR: 40 # for 400G

Previous changes handled cases where the speed was in MB. The current updates address review comments by using the SAI attribute instead.
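
The two variants discussed in this thread can be compared numerically. This sketch rests on an assumption that the thread does not state outright: that the earlier `scaling_factor` of 10000 acted as a divisor of the port speed expressed in Mb/s, which would yield 40 for a 400G port. The helper names are hypothetical.

```python
def sai_factor_from_divisor(speed_mbps, divisor):
    """Earlier variant (assumed): derive the SAI value by dividing the port speed."""
    return speed_mbps // divisor


def sai_factor_direct(scaling_factor):
    """Suggested variant: pass the configured value through unchanged."""
    return scaling_factor


SPEED_400G_MBPS = 400_000

print(sai_factor_from_divisor(SPEED_400G_MBPS, 10_000))  # 40
print(sai_factor_direct(40))                             # 40
```

Both variants arrive at the same SAI value for a 400G port, but only the direct form lets the configured number be read as the SAI attribute without knowing the port speed.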

Signed-off-by: Ashok Kumar P <apannerselva@marvell.com>

@lihuay lihuay merged commit 795c61b into sonic-net:master Feb 19, 2026
1 check passed
@github-project-automation github-project-automation Bot moved this from In Progress to Done in SONiC 202605 Release Feb 19, 2026
@rck-innovium rck-innovium moved this from Done to In Progress in SONiC 202605 Release Mar 10, 2026
yxieca pushed a commit to sonic-net/sonic-mgmt that referenced this pull request Mar 20, 2026
New test plan for ARS HLD sonic-net/SONiC#1958


Signed-off-by: apannerselva <apannerselva@marvell.com>
vrajeshe pushed a commit to vrajeshe/sonic-mgmt that referenced this pull request Mar 23, 2026
New test plan for ARS HLD sonic-net/SONiC#1958

Signed-off-by: apannerselva <apannerselva@marvell.com>
Signed-off-by: Venkata Gouri Rajesh Etla <vrajeshe@cisco.com>
eddieruan-alibaba pushed a commit that referenced this pull request Mar 27, 2026
* initial HLD version

* review changes

Signed-off-by: Vladimir Kuk <vkuk@marvell.com>

* Indentation fix

Signed-off-by: Vladimir Kuk <vkuk@marvell.com>

* Indentation fix

Signed-off-by: Vladimir Kuk <vkuk@marvell.com>

* changes after initial review

* update fonts

* integrated ARS_OBJECT in NHG and LAG, use notification mechanism

* removed ARS_OBJECT

* Update to latest changes

Signed-off-by: Vladimir Kuk <vkuk@marvell.com>

* Review changes

Signed-off-by: Vladimir Kuk <vkuk@marvell.com>

* Addressing comments

Fixed phases contents
Fixed spelling
Added user configured port scaling factor
Updated init sequence
Limit idle time value

Signed-off-by: Vladimir Kuk <vkuk@marvell.com>

* Addressing comments

Fixed phases contents
Fixed spelling
Added user configured port scaling factor
Updated init sequence
Limit idle time value

Signed-off-by: Vladimir Kuk <vkuk@marvell.com>

* Addressing community comments

Separated NHG identification and ARS object
Added NHG matching by ARS-enabled members
Support NH creation from RouteOrch and NhgOrch

Signed-off-by: Vladimir Kuk <vkuk@marvell.com>

* Fixing Review comments

Signed-off-by: Ashok Kumar P <apannerselva@marvell.com>

---------

Signed-off-by: Vladimir Kuk <vkuk@marvell.com>
Signed-off-by: Ashok Kumar P <apannerselva@marvell.com>
Signed-off-by: Eddie Ruan <eddie.ruan@alibaba-inc.com>
selldinesh pushed a commit to selldinesh/sonic-mgmt that referenced this pull request Apr 1, 2026
New test plan for ARS HLD sonic-net/SONiC#1958

Signed-off-by: apannerselva <apannerselva@marvell.com>
Signed-off-by: selldinesh <dinesh.sellappan@keysight.com>
albertovillarreal-keys pushed a commit to albertovillarreal-keys/sonic-mgmt that referenced this pull request Apr 3, 2026
New test plan for ARS HLD sonic-net/SONiC#1958


Signed-off-by: apannerselva <apannerselva@marvell.com>
rraghav-cisco pushed a commit to rraghav-cisco/sonic-mgmt that referenced this pull request Apr 20, 2026
New test plan for ARS HLD sonic-net/SONiC#1958

Signed-off-by: apannerselva <apannerselva@marvell.com>
Signed-off-by: Raghavendran Ramanathan <rraghav@cisco.com>
apannerselva pushed a commit to Marvell-switching/sonic-buildimage that referenced this pull request May 6, 2026
Added support for local ARS (Adaptive Routing and Switching).

HLD: sonic-net/SONiC#1958

Signed-off-by: VladimirKuk <31180446+VladimirKuk@users.noreply.github.com>
apannerselva pushed a commit to Marvell-switching/sonic-swss-common that referenced this pull request May 25, 2026
Added support for local ARS (Adaptive Routing and Switching).

HLD: sonic-net/SONiC#1958

Signed-off-by: VladimirKuk <31180446+VladimirKuk@users.noreply.github.com>
apannerselva added a commit to apannerselva/sonic-mgmt that referenced this pull request May 26, 2026
Description of PR
New test plan for Adaptive Routing and Switching HLD [sonic-net/SONiC#1958]

Summary:
Fixes # (issue)

Type of change
 Bug fix
 Testbed and Framework(new/improvement)
 New Test case
 Skipped for non-supported platforms
 Test case improvement
Back port request
 202311
 202405
 202411
 202505
 202511
 202512
 202605
Approach
What is the motivation for this PR?
New test plan for ARS HLD [sonic-net/SONiC#1958]

How did you do it?
Added ARS test plan covering adaptive routing and switching functionality.

How did you verify/test it?
Ran it on the device
Results -
ecmp/ars/test_ars.py::test_ars_modes[per-packet-global] PASSED [ 10%]
ecmp/ars/test_ars.py::test_ars_modes[per-packet-interface] PASSED [ 20%]
ecmp/ars/test_ars.py::test_ars_modes[per-packet-nexthop] PASSED [ 30%]
ecmp/ars/test_ars.py::test_ars_modes[per-flowlet-global] PASSED [ 40%]
ecmp/ars/test_ars.py::test_ars_modes[per-flowlet-interface] PASSED [ 50%]
ecmp/ars/test_ars.py::test_ars_modes[per-flowlet-nexthop] PASSED [ 60%]
ecmp/ars/test_ars.py::test_ars_acl_action PASSED [ 70%]
ecmp/ars/test_ars.py::test_ars_nonars_interface[interface] PASSED [ 80%]
ecmp/ars/test_ars.py::test_ars_nonars_interface[nexthop] PASSED [ 90%]
ecmp/ars/test_ars.py::test_ars_stress PASSED [100%]
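
The six `test_ars_modes` IDs above follow a mode-by-scope pattern. The sketch below shows one way such IDs could be generated; it is not taken from the test plan itself. The names `MODES`, `SCOPES`, and `mode_ids` are hypothetical, and the resulting list could be passed to `pytest.mark.parametrize` through its `ids` argument.

```python
from itertools import product

MODES = ["per-packet", "per-flowlet"]
SCOPES = ["global", "interface", "nexthop"]


def mode_ids():
    """Build one test ID per (mode, scope) combination, mode varying slowest."""
    return [f"{mode}-{scope}" for mode, scope in product(MODES, SCOPES)]


for case in mode_ids():
    print(f"ecmp/ars/test_ars.py::test_ars_modes[{case}]")
```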

Any platform specific information?
The test is intended for all platforms that support ARS; currently the script supports only marvell-teralynx.

Supported testbed topology if it's a new test case?
T0 Topology

Signed-off-by: Ashok Kumar P <apannerselva@marvell.com>
apannerselva pushed a commit to Marvell-switching/sonic-swss-common that referenced this pull request Jun 1, 2026
Added support for local ARS (Adaptive Routing and Switching).

HLD: sonic-net/SONiC#1958

Signed-off-by: VladimirKuk <31180446+VladimirKuk@users.noreply.github.com>