Contents
About emergency response and recovery
  Definition
  Scenarios
  Principles
Emergency response and recovery workflow
  Notify a fault
  Collect fault information
  Preliminarily locate the fault by using a tool
  Continue with troubleshooting based on the scenario
  Seek help
  View the troubleshooting result
  Record emergency maintenance information
Preliminarily locate the fault by using a tool
  Underlay network check
  Loop detection
  Radar detection
  AC interface traffic statistics
  Device capacity management
  Controller configuration auditing
Network overlay scenario
  Network topology
  Traffic model
  Troubleshooting and recovery procedures for common issues
    Large-scale production service failure
    Large-scale production service failure on one leaf
    Failure of some production services
    Failure of east-west Layer 2 production services across leaf devices
    Failure of the gateway, north-south, and cross-VPN Layer 3 production services
Network overlay + device incorporation scenario
  Network topology
  Traffic model
  Troubleshooting and recovery procedures for common issues
    Large-scale failure of production services
    Failure of some production services
    Context is deleted by mistake or lost
Network overlay + PBR service chain scenario
  Network topology
  Traffic model
  Troubleshooting and recovery procedures for common issues
    Failure of a multi-hop service chain
    Failure of a single-hop service chain
Network overlay + multiple service egresses scenario
  Network topology
  Traffic model
  Troubleshooting and recovery procedures for common issues
    Failure to access external network 1 (SNAT)
    Failure to access external network 2 (without SNAT)
    Failure to access external network 3 (without FW egress)
Network overlay multi-fabric scenario
  Network topology