Zebra process crashes intermittently during 'config reload' on the DUT line cards #36

@sanjair-git

Describe the bug

On a T2 chassis line card, running 'sudo config reload -y' intermittently crashes the 'zebra' process and generates a core file. The crash occurs roughly once in 30 attempts.

We started seeing the issue with the following sonic-buildimage-msft commit:
Azure/sonic-buildimage-msft@6f19e12

The following logs appear in the bgp docker when the crash happens:

2023-07-09 13:59:40,064 INFO exited: zebra (terminated by SIGSEGV (core dumped); not expected)
2023-07-11 19:39:22,156 INFO exited: zebra (terminated by SIGSEGV (core dumped); not expected)
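
To tally how often this happens across a long test run, the log lines above can be counted with a small helper. This is only a sketch: `count_segv` is a hypothetical helper name, and the log file location is an assumption that has to be adjusted to wherever the process-manager log lives in the bgp docker.

```shell
#!/usr/bin/env bash
# Count "exited: <process> (terminated by SIGSEGV ..." lines in a log file.
# Usage: count_segv <log-file> [process-name]   (process-name defaults to zebra)
count_segv() {
    local log_file="$1" proc="${2:-zebra}"
    # -F matches the pattern as a fixed string, -c prints the number of matching lines.
    # grep exits with status 1 when the count is 0, so "|| true" keeps the helper's status 0.
    grep -c -F -- "INFO exited: ${proc} (terminated by SIGSEGV" "$log_file" || true
}
```

Example, with a hypothetical log path: `count_segv /path/to/supervisord.log zebra`.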

Crash logs:

[Image: screenshot of the crash logs]

The zebra core file and the FRR logs are attached for reference:
zebra.1689104360.44.0.core.gz
frr.zip
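
For anyone triaging the attached core, a text backtrace can be pulled out with gdb in batch mode. The sketch below makes several assumptions: `core_backtrace` is a hypothetical helper, the zebra binary is assumed to be at /usr/lib/frr/zebra (pass the real path as the second argument if it differs), and the binary and debug symbols must match the build that produced the core for the output to be meaningful.

```shell
#!/usr/bin/env bash
# Print a full backtrace of all threads from a gzip-compressed core file.
# Usage: core_backtrace <core.gz> [binary]
# Set GDB=<path> to use a different gdb executable.
core_backtrace() {
    local core_gz="$1"
    local binary="${2:-/usr/lib/frr/zebra}"   # assumed default location of zebra
    local gdb_bin="${GDB:-gdb}"
    local core rc
    core="$(mktemp)" || return 1
    # gdb needs an uncompressed core, so decompress into a temporary file first.
    gunzip -c "$core_gz" > "$core" || { rm -f "$core"; return 1; }
    "$gdb_bin" -batch -ex 'thread apply all bt full' "$binary" "$core"
    rc=$?
    rm -f "$core"
    return "$rc"
}
```

Example: `core_backtrace zebra.1689104360.44.0.core.gz > zebra_bt.txt`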

Actual Behaviour:

  • The zebra process in the bgp docker crashes.
  • A core file is generated.

We have already raised an issue under sonic-buildimage for this crash; please take a look at it:
sonic-net/sonic-buildimage#15803

To Reproduce
Steps to reproduce the behavior:
On any T2 chassis line card, run 'sudo config reload -y' repeatedly.
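
Since the crash only shows up about once in 30 attempts, the repetition is easier to script. The following is a minimal sketch, not a validated test procedure: `repro_loop` is a hypothetical helper, and the crash check and settle time passed to it in the example (a core file matching /var/core/*zebra*, 300 seconds) are assumptions that need adapting to the setup.

```shell
#!/usr/bin/env bash
# Run a reload command repeatedly until a check command reports a crash.
# Usage: repro_loop <max-attempts> <reload-cmd> <crash-check-cmd> [settle-seconds]
# Returns 1 as soon as the check command succeeds (crash seen), 0 otherwise.
repro_loop() {
    local max="$1" reload_cmd="$2" check_cmd="$3" settle="${4:-0}"
    local i
    for ((i = 1; i <= max; i++)); do
        echo "attempt ${i}/${max}"
        # A failing reload is reported but does not stop the loop.
        bash -c "$reload_cmd" || echo "reload command returned non-zero on attempt ${i}" >&2
        # Give the services time to come back up before checking.
        sleep "$settle"
        if bash -c "$check_cmd"; then
            echo "crash detected on attempt ${i}"
            return 1
        fi
    done
    echo "no crash in ${max} attempts"
    return 0
}
```

Example, after moving any old core files out of the way so they do not trigger the check: `repro_loop 60 'sudo config reload -y' 'ls /var/core/*zebra* >/dev/null 2>&1' 300`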

Expected behavior

  • 'sudo config reload' on DUT line cards should not cause any issue. The line cards should come up with all BGP neighbors established and without any crashes or core files.

Versions

admin@ixre-egl-board1:~$ show version

SONiC Software Version: SONiC.HEAD.489499-msft-2205-ndk-d963ac161
SONiC OS Version: 11
Distribution: Debian 11.7
Kernel: 5.10.0-18-2-amd64
Build commit: d963ac161
Build date: Fri Jul  7 18:18:51 UTC 2023
Built by: gitlab-runner@sonic-bld2

Platform: x86_64-nokia_ixr7250e_36x400g-r0
HwSKU: Nokia-IXR7250E-36x100G
ASIC: broadcom
ASIC Count: 2
Serial Number: EAG2-04-210
Model Number: N/A
Hardware Revision: 56
Uptime: 15:45:52 up 1 day, 12:15,  3 users,  load average: 1.56, 1.54, 1.59
Date: Wed 12 Jul 2023 15:45:52
