H3C S6890 Series Troubleshooting Manual

H3C S6890 Switch Series
Troubleshooting Guide
Document version: 6W100-20190725
Copyright © 2019 New H3C Technologies Co., Ltd. All rights reserved.
No part of this manual may be reproduced or transmitted in any form or by any means without prior written consent of New
H3C Technologies Co., Ltd.
Except for the trademarks of New H3C Technologies Co., Ltd., any trademarks that may be mentioned in this document are
the property of their respective owners.
The information in this document is subject to change without notice.
Contents
Introduction
  General guidelines
  Collecting log and operating information
    Collecting common log messages
    Collecting diagnostic log messages
    Collecting operating statistics
  Contacting technical support
Removing deployment errors
Troubleshooting hardware
  Switch reboot failure
    Symptom
    Troubleshooting flowchart
    Solution
  Operating power module failure
    Symptom
    Solution
  Newly installed power module failure
    Symptom
    Solution
  Fan tray failure
    Symptom
    Solution
  Related commands
Troubleshooting system management
  High CPU utilization
    Symptom
    Troubleshooting flowchart
    Solution
  High memory utilization
    Symptom
    Troubleshooting flowchart
    Solution
  Temperature alarms
    Symptom
    Troubleshooting flowchart
    Solution
  Related commands
Troubleshooting ports
  10-Gigabit SFP+ fiber port fails to come up
    Symptom
    Troubleshooting flowchart
    Solution
  100-GE QSFP28 fiber port fails to come up
    Symptom
    Troubleshooting flowchart
    Solution
  Non-H3C transceiver module error message
    Symptom
    Troubleshooting flowchart
    Solution
  Transceiver module does not support digital diagnosis
    Symptom
    Troubleshooting flowchart
    Solution
  Error frames (for example, CRC errors) on a port
    Symptom
    Troubleshooting flowchart
    Solution
  Failure to receive packets
    Symptom
    Troubleshooting flowchart
    Solution
  Failure to send packets
    Symptom
    Troubleshooting flowchart
    Solution
  Incorrect port information
    Symptom
    Troubleshooting flowchart
    Solution
  A port fails to come up
    Symptom
    Troubleshooting flowchart
    Solution
  A port flaps
    Symptom
    Troubleshooting flowchart
    Solution
  CRC error packets on a port
    Symptom
    Troubleshooting flowchart
    Solution
  Related commands
Troubleshooting IRF
  IRF fabric setup failure
    Symptom
    Troubleshooting flowchart
    Solution
  IRF split
    Symptom
    Troubleshooting flowchart
    Solution
  BFD MAD failure
    Symptom
    Troubleshooting flowchart
    Solution
  LACP MAD failure
    Symptom
    Troubleshooting flowchart
    Solution
  Related commands
Troubleshooting QoS and ACL
  ACL application failure for unsupported ACL rules or insufficient resources
    Symptom
    Troubleshooting flowchart
    Solution
  ACL application failure without an error message
    Symptom
    Troubleshooting flowchart
    Solution
  Related commands
Introduction
This document provides information about troubleshooting common software and hardware issues
with the S6890 Switch Series.
This document is not restricted to specific software or hardware versions.
General guidelines
IMPORTANT:
To prevent an issue from causing loss of configuration, save the configuration each time you finish
configuring a feature. For configuration recovery, regularly back up the configuration to a remote
server.
When you troubleshoot S6890 switches, follow these general guidelines:
• To help identify the cause of the issue, collect system and configuration information, including:
  ○ Symptom, time of failure, and configuration.
  ○ Network topology information, including the network diagram, port connections, and points of failure.
  ○ Log messages and diagnostic information. For more information about collecting this information, see "Collecting log and operating information."
  ○ Physical evidence of failure:
    − Photos of the hardware.
    − Status of the LEDs.
  ○ Steps you have taken, such as reconfiguration, cable swapping, and reboot.
  ○ Output from the commands executed during the troubleshooting process.
• To ensure safety, wear an ESD wrist strap when you replace or maintain a hardware component.
Collecting log and operating information
IMPORTANT:
By default, the information center is enabled. If the feature is disabled, you must use the
info-center enable command to enable the feature for collecting log messages.
Table 1 shows the types of files that the system uses to store operating log and status information.
You can export these files by using FTP, TFTP, or USB.
In an IRF system, these files are stored on the master device. Multiple MPUs will have log files if
master/subordinate switchovers have occurred. You must collect log files from all these devices. To
more easily locate log information, use a consistent rule to categorize and name files. For example,
save log files to a separate folder for each member device, and include their slot numbers in the
folder names.
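The naming rule suggested above can be sketched in Python for the workstation that receives the exported files. This is only an illustration: the function names, the `collected` root folder, and the `slotN_logs` pattern are assumptions made for this example and are not defined by the switch software.

```python
from pathlib import Path
from typing import Dict, Iterable


def member_log_dir(root: Path, slot: int) -> Path:
    """Return the local folder for one member device, for example <root>/slot2_logs.

    The slotN_logs pattern is an assumed convention, not an H3C requirement.
    """
    if slot < 1:
        raise ValueError("slot numbers are assumed to start at 1")
    return root / f"slot{slot}_logs"


def plan_log_dirs(root: Path, slots: Iterable[int]) -> Dict[int, Path]:
    """Map each member slot number to its own folder, ignoring duplicate slots."""
    return {slot: member_log_dir(root, slot) for slot in sorted(set(slots))}


if __name__ == "__main__":
    # Create one folder per member device before transferring the files.
    for slot, folder in plan_log_dirs(Path("collected"), [1, 2]).items():
        folder.mkdir(parents=True, exist_ok=True)
        print(slot, folder)
```

Keeping the slot number in the folder name makes it possible to tell which member device produced a file even though every member uses the same file names.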
Table 1 Log and operating information

Category: Common log
File name format: logfileX.log
Content: Command execution and operational log messages.

Category: Diagnostic log
File name format: diagfileX.log
Content: Diagnostic log messages about device operation, including the following items:
• Parameter settings in effect when an error occurs.
• Information about a device startup error.
• Handshaking information between member devices when a communication error occurs.

Category: Operating statistics
File name format: file-basename.gz
Content: Current operating statistics for feature modules, including the following items:
• Device status.
• CPU status.
• Memory status.
• Configuration status.
• Software entries.
• Hardware entries.

Collecting common log messages
1. Save common log messages from the log buffer to a log file:
By default, the log file is saved in the logfile directory of the flash memory on each member
device.
<Sysname> logfile save
The contents in the log file buffer have been saved to the file
flash:/logfile/logfile.log
2. Identify the log file on each member device:
# Display the log file on the master device.
<Sysname> dir flash:/logfile/
Directory of flash:/logfile
0 -rw- 21863 Jul 11 2013 16:00:37 logfile.log
1048576 KB total (38812 KB free)
# Display the log file on each subordinate device:
<Sysname> dir slot2#flash:/logfile/
Directory of flash:/logfile
0 -rw- 21863 Jul 11 2013 16:00:37 logfile.log
1048576 KB total (38812 KB free)
3. Transfer the files to the desired destination by using FTP, TFTP, or USB. (Details not shown.)
Collecting diagnostic log messages
1. Save diagnostic log messages from the diagnostic log file buffer to a diagnostic log file:
By default, the diagnostic log file is saved in the diagfile directory of the flash memory on each
member device.
<Sysname> diagnostic-logfile save
The contents in the diagnostic log file buffer have been saved to the file
flash:/diagfile/diagfile.log
2. Identify the log file on each member device:
# Display the log file on the master device.
<Sysname> dir flash:/diagfile/
Directory of flash:/diagfile
0 -rw- 161321 Jul 11 2013 16:16:00 diagfile.log
1048576 KB total (38812 KB free)
# Display the log file on each subordinate device:
<Sysname> dir slot2#flash:/diagfile/
Directory of flash:/diagfile
0 -rw- 161321 Jul 11 2013 16:16:00 diagfile.log
1048576 KB total (38812 KB free)
3. Transfer the files to the desired destination by using FTP, TFTP, or USB. (Details not shown.)
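When several member devices are involved, it can help to check the captured dir output offline to confirm that each member actually holds a log file before you transfer it. The following Python sketch parses listings in the form shown in the examples above. The regular expression is an assumption derived only from those sample lines, and the function name is hypothetical; output from other software versions might need a different pattern.

```python
import re
from typing import List, Tuple

# Matches file entries such as:
#   0 -rw- 21863 Jul 11 2013 16:00:37 logfile.log
# Directory entries and the "KB total" summary line do not match and are skipped.
FILE_ENTRY = re.compile(
    r"^\s*\d+\s+[d-][rwx-]{3}\s+(\d+)\s+"
    r"\w{3}\s+\d{1,2}\s+\d{4}\s+\d{2}:\d{2}:\d{2}\s+(\S+)\s*$"
)


def parse_dir_listing(text: str) -> List[Tuple[str, int]]:
    """Return (file name, size in bytes) for each file entry in captured dir output."""
    files = []
    for line in text.splitlines():
        entry = FILE_ENTRY.match(line)
        if entry:
            files.append((entry.group(2), int(entry.group(1))))
    return files
```

Comparing the sizes reported on the device with the sizes of the transferred copies is a simple way to spot an incomplete transfer.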
Collecting operating statistics
You can collect operating statistics by saving the statistics to a file or displaying the statistics on the
screen.
When you collect operating statistics, follow these guidelines:
• Log in to the device through a network or management port instead of the console port, if possible. Network and management ports are faster than the console port.
• Do not execute commands while operating statistics are being collected.
• As a best practice, save operating statistics to a file to retain the information.
To collect operating statistics:
1. Disable pausing between screens of output if you want to display operating statistics on the
screen. Skip this step if you are saving statistics to a file.
<Sysname> screen-length disable
2. Collect operating statistics for multiple feature modules.
<Sysname> display diagnostic-information
Save or display diagnostic information (Y=save, N=display)? [Y/N] :
3. At the prompt, choose to save or display operating statistics:
# To save operating statistics, enter y at the prompt and then specify the destination file path.
Save or display diagnostic information (Y=save, N=display)? [Y/N] : Y
Please input the file name(*.tar.gz)[flash:/diag.tar.gz] :
Diagnostic information is outputting to flash:/diag.tar.gz.
Please wait...
Save successfully.
<Sysname> dir flash:/
Directory of flash:
…
6 -rw- 898180 Jun 26 2013 09:23:51 diag.tar.gz
1021808 KB total (259072 KB free)
# To display operating statistics on the monitor terminal, enter n at the prompt. The output from
this command varies by software version.
Save or display diagnostic information (Y=save, N=display)? [Y/N] :n
===============================================
===============display clock===============
00:08:02.487 UTC Sat 07/06/2019
=================================================
===============display version===============
H3C Comware Software, Version 7.1.070, Release 2712
Copyright (c) 2004-2018 New H3C Technologies Co., Ltd. All rights reserved.
H3C S6890-54HF uptime is 0 weeks, 0 days, 0 hours, 8 minutes
Last reboot reason : User reboot
Boot image: flash:/S6890-CMW710-BOOT-R2712.bin
Boot image version: 7.1.070P2214, Release 2712
Compiled Jul 18 2018 14:00:00
System image: flash:/S6890-CMW710-SYSTEM-R2712.bin
System image version: 7.1.070, Release 2712
Compiled Jul 18 2018 14:00:00
……
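Displayed diagnostic information is long, so a captured session log is easier to read once it is split by command. The following Python sketch groups the text under the banner lines shown in the sample output (for example, the line that contains "display clock" between runs of equal signs). The banner pattern is an assumption based only on that sample, the function name is hypothetical, and the source notes that the output varies by software version.

```python
import re
from typing import Dict, List, Optional

# A banner looks like: ===============display clock===============
BANNER = re.compile(r"^=+\s*([^=\s][^=]*?)\s*=+\s*$")
# A separator is a line made only of equal signs.
SEPARATOR = re.compile(r"^=+\s*$")


def split_sections(text: str) -> Dict[str, str]:
    """Group captured output by the command named in each banner line."""
    sections: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in text.splitlines():
        if SEPARATOR.match(line):
            continue  # drop pure separator lines
        banner = BANNER.match(line)
        if banner:
            current = banner.group(1)
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}
```

With the sample output above, the result would contain one entry for display clock and one for display version, each holding the lines printed under that banner.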
Contacting technical support
If you cannot resolve an issue after using the troubleshooting procedures in this document, contact
H3C Support. When you contact an authorized H3C support representative, be prepared to provide
the following information:
• Information described in "General guidelines."
• Product serial numbers.
This information will help the support engineer assist you as quickly as possible.
The following is the contact information for H3C Support:
• Telephone number: 400-810-0504.
• E-mail: servi[email protected]m.
Removing deployment errors
Use the deployment checklist in Table 2 to eliminate issues that might be introduced at the
deployment stage. Select items that are suitable for your site.
Table 2 Deployment checklist

Environment and device hardware status

Question: Is the sensor temperature between the low-temperature and high-temperature warning thresholds?
Command or method: display environment
□ OK  □ Not OK  □ Not related
Remarks: Make sure the temperature of each sensor is between the low-temperature and high-temperature warning thresholds.

Question: Are the fan trays operating correctly?
Command or method: display fan
□ OK  □ Not OK  □ Not related
Remarks: Make sure the fan trays are operating correctly.

Question: Are sufficient power modules installed and are they operating correctly?
Command or method: display power
□ OK  □ Not OK  □ Not related
Remarks: Make sure the following conditions are met:
• You have installed sufficient power modules to provide power redundancy.
• The power modules are operating correctly. The display power command shows that their state is Normal.

Question: Are the LEDs all displaying correct statuses?
Command or method: Visually check the status of LEDs on each device.
□ OK  □ Not OK  □ Not related
Remarks: The LED shows the status of the device:
• Steady green: The switch is operating correctly.
• Steady red: The switch has failed to pass POST or has problems such as fan failure.
• Off: The switch is powered off or has failed to start up.

CPU and memory usage

Question: Does the CPU usage change rate exceed 10%? Does the sustained CPU usage exceed 60%?
Command or method: display cpu-usage
□ OK  □ Not OK  □ Not related
Remarks: Execute the display cpu-usage command repeatedly. If the CPU sustains a usage level of over 60% or has a change rate higher than 10%, execute the debugging ip packet command to view the packets delivered to the CPU for analysis.

Question: Does the memory usage exceed 60%?
Command or method: display memory
□ OK  □ Not OK  □ Not related
Remarks: If memory usage exceeds 60%, execute the display memory command to identify the module that is using the most memory.

Ports

Question: Is half duplex used in port negotiation?
Command or method: display interface brief
□ OK  □ Not OK  □ Not related
Remarks: If the duplex mode of a port is half, verify that the peer port uses the same duplex mode.

Question: Is flow control unnecessarily enabled on ports?
Command or method: Verify the port settings.
□ OK  □ Not OK  □ Not related
Remarks: Disable flow control on the ports.

Question: Are large numbers of error packets generated continuously in the outbound or inbound direction of the port?
Command or method: display interface
□ OK  □ Not OK  □ Not related
Remarks: If an error counter displays a non-zero value and the value is increasing, check for the following errors:
• Link and optical-electrical converter errors.
• Port setting inconsistencies with the peer port.

Question: Does the port change between an up and down state frequently?
Command or method: display logbuffer
□ OK  □ Not OK  □ Not related
Remarks: If the port state flaps, check for the following errors:
• Link and optical-electrical converter errors.
• Optical power threshold crossing events if the port is a fiber port.
• Port setting inconsistencies with the peer port.

Fiber ports

Question: Do the ports at the two ends use the same port settings?
Command or method: display current-configuration interface
□ OK  □ Not OK  □ Not related
Remarks: When you connect an H3C device to a device from another vendor, set the same port rate and duplex mode settings at the two ends as a best practice.

Question: Are CRC errors present on any fiber port? Is the number of CRC errors increasing?
Command or method: display interface
□ OK  □ Not OK  □ Not related
Remarks: If CRC errors persist, replace the transceiver module or pigtail fiber, or clean the transceiver module connector.

Trunk port configuration

Question: Do the peer trunk ports use the same PVID?
Command or method: display current-configuration interface
□ OK  □ Not OK  □ Not related
Remarks: Make sure the same PVID is configured on the trunk ports between two devices.

Question: Are the peer ports assigned to the same VLANs?
Command or method: display current-configuration interface
□ OK  □ Not OK  □ Not related
Remarks: Make sure the trunk ports between two devices are assigned to the same VLANs. For example, if you assign a trunk port to all VLANs, also assign its peer port to all VLANs.

Question: Are the peer ports set to the same link type?
Command or method: display current-configuration interface
□ OK  □ Not OK  □ Not related
Remarks: Make sure the ports between two devices use the same link type.

Question: Is a loop present in VLAN 1?
Command or method: loopback-detection global enable vlan 1
□ OK  □ Not OK  □ Not related
Remarks: Remove ports from VLAN 1 as needed.

Spanning tree feature

Question: Is the timeout factor correctly set?
Command or method: display current-configuration
□ OK  □ Not OK  □ Not related
Remarks: As a best practice, set a timeout factor in the range of 5 to 7 on a stable network to avoid unnecessary recalculations.

Question: Are ports connected to end-user devices configured as edge ports?
Command or method: display current-configuration interface
□ OK  □ Not OK  □ Not related
Remarks: Verify that the output from the display current-configuration interface command contains the "stp edged-port enable" string for ports connected to end-user devices. As a best practice, configure ports connected to end-user devices (PCs, for example) as edge ports, or disable the spanning tree feature on the ports.

Question: Is the spanning tree feature disabled on ports connected to devices that do not support spanning tree protocols?
Command or method: display current-configuration interface
□ OK  □ Not OK  □ Not related
Remarks: Disable the spanning tree feature on ports connected to devices that do not support spanning tree protocols. Make sure the output from the display current-configuration interface command contains the "undo stp enable" string for these ports.

Question: Is the device running MSTP, STP, or RSTP, and working with a Cisco PVST+ device?
Command or method: display stp
□ OK  □ Not OK  □ Not related
Remarks: As a best practice to avoid interoperability issues, set up a Layer 3 connection to the Cisco device.

Question: Do the topologies of MSTIs meet the design? Are there as few overlapping paths as possible among MSTIs?
Command or method: display current-configuration interface
□ OK  □ Not OK  □ Not related
Remarks: If the topologies deviate from the design, reassign ports to VLANs and revise the VLAN and instance mappings. For optimal load balancing, plan VLANs and VLAN-to-instance mappings to minimize overlapping paths among different MSTIs.

Question: Is a TC attack causing frequent STP status changes on any ports?
Command or method: display stp tc, display stp history
□ OK  □ Not OK  □ Not related
Remarks: Examine the following items in the command output for TC attacks:
• Incoming and outgoing TC/TCN BPDU statistics.
• Historical port role calculation information.
There is a risk of TC attack if frequent STP status changes occur on a stable network. Make sure you have configured the following settings:
• Configure ports connected to end-user devices as edge ports, and enable BPDU guard. Alternatively, disable the spanning tree feature on the ports.
• Disable the spanning tree feature on ports connected to devices that do not support spanning tree protocols.
• Do not disable TC-BPDU guard.

VRRP

Question: Is the handshake interval correctly set? Are the handshake intervals of the two ends the same?
Command or method: display vrrp
□ OK  □ Not OK  □ Not related
Remarks: Change the handshake interval to 3 seconds if the number of VRRP groups is less than five. If five or more VRRP groups exist, assign three or five VRRP groups into one group, and configure the handshake interval as 3 seconds, 5 seconds, and 7 seconds for each group.

ARP

Question: Are there ARP conflicts?
Command or method: display logbuffer
□ OK  □ Not OK  □ Not related
Remarks: If the log contains ARP conflict records, verify that the hosts in conflict are legitimate, and remove the conflicts.

OSPF

Question: Is the router ID of the device unique on the network?
Command or method: display ospf peer
□ OK  □ Not OK  □ Not related
Remarks: Change the router ID if it is not unique on the network. To restart route learning after you remove the router ID conflict, you must execute the reset ospf process command.

Question: Are there a lot of errors in the output from the display ospf statistics error command?
Command or method: display ospf statistics error
□ OK  □ Not OK  □ Not related
Remarks: If a large number of OSPF errors has occurred and the number continues to increase, collect the error information for further analysis.

Question: Is there severe route flapping?
Command or method: display ip routing-table statistics
□ OK  □ Not OK  □ Not related
Remarks: Examine the statistics for added and deleted routes during the system uptime. If route flapping occurs, locate the flapping route and the source device to analyze the cause. You can use the display ospf lsdb command multiple times to view the age of routes and locate the flapping route.

Question: Is the OSPF status stable?
Command or method: display ospf peer
□ OK  □ Not OK  □ Not related
Remarks: View the up time of the OSPF neighbor.

Routes

Question: Is the default route correct? Are there any routing loops?
Command or method: tracert, debug ip packet
□ OK  □ Not OK  □ Not related
Remarks: Use the tracert command to trace the path to a nonexistent network (1.1.1.1, for example) to check for routing loops. If a routing loop exists, check the configuration of the involved devices for errors. Adjust the route to remove the loop.
Use the debug ip packet command to check for packets with TTL 0 or 1. If TTL exceeded packets are received, check for network route errors.

CPU security

Question: Are there packet attacks on the CPU?
Command or method: debug rxtx softcar show
□ OK  □ Not OK  □ Not related
Remarks: Execute the debug rxtx softcar show command in probe view to view packet rate limit information for cards. The CPU is under attack if the number of packets of a type keeps increasing unusually.

Records in the local log buffer
Does the local log buffer contain exception records?
• In standalone mode: local logbuffer slot slot-number display
• In IRF mode: local logbuffer chassis chassis-number slot slot-number display
□OK □Not OK □Not related
Execute the local logbuffer display command in probe view. If the local log buffer contains exception records, contact H3C Support to troubleshoot the exceptions.
Use the following commands in probe view to clear the history records after the exceptions are removed:
• In standalone mode: local logbuffer slot slot-number clear
• In IRF mode: local logbuffer chassis chassis-number slot slot-number clear
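Several checks in the table above depend on whether a counter keeps increasing between repeated executions of the same command (for example, OSPF error counts or per-type packet counts in probe view). The following Python sketch illustrates that comparison only. It assumes the counters have already been copied into dictionaries by hand or with your own tooling. The function name growing_counters and the sample counter names are hypothetical and are not H3C output.

```python
from typing import Dict, List


def growing_counters(samples: List[Dict[str, int]]) -> List[str]:
    """Return names of counters that increased in every consecutive sample.

    `samples` holds one dictionary per command execution, oldest first.
    A counter that is missing from any sample is skipped.
    """
    if len(samples) < 2:
        return []
    # Keep only the counters that appear in every sample.
    names = set(samples[0])
    for sample in samples[1:]:
        names &= set(sample)
    growing = []
    for name in sorted(names):
        values = [sample[name] for sample in samples]
        if all(later > earlier for earlier, later in zip(values, values[1:])):
            growing.append(name)
    return growing


# Hypothetical counters recorded from three executions of the same command.
snapshots = [
    {"bad_checksum": 10, "hello_mismatch": 4},
    {"bad_checksum": 25, "hello_mismatch": 4},
    {"bad_checksum": 61, "hello_mismatch": 4},
]
print(growing_counters(snapshots))  # ['bad_checksum']
```

A counter that is large but stable is not reported, which matches the guidance to act only when a number continues to increase.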
Troubleshooting hardware
This section provides troubleshooting information for common hardware issues.
NOTE:
This section describes how to troubleshoot switch reboot failure, power module failure, and fan tray failure. To troubleshoot transceiver modules, ports, and temperature alarms, see "Troubleshooting ports" and "Troubleshooting system management."
Switch reboot failure
Symptom
The switch fails to reboot.
Troubleshooting flowchart
Figure 1 Troubleshooting switch reboot failure
[Flowchart. Decisions: System software image correct? Memory runs correctly? Error continues to be reported? Resolved? Actions: Reload the system software image; Replace the switch; Contact the support.]
Solution
To resolve the issue:
1. Verify that the system software image on the switch is correct.
a. Log in to the switch through the console port and restart the switch. If the system reports
that a CRC error occurs or that no system software image is available during the BootWare
loading process, reload the system software image.
b. Verify that the system software image in the flash memory is the same size as the one on
the server. If no system software image is available in the flash memory, or if the image size
is different from the one on the server, reload the system software image. Then set the
reloaded system software image to the current system software image.
The system software image in the flash memory is automatically set to the current system
software image during the BootWare loading process.
2. Verify that the memory is running correctly.
Reboot the switch, and immediately press CTRL+T to examine the memory. If a memory fault is
detected, replace the switch.
3. Verify that no error is reported during the BootWare loading process.
If the memory is running correctly but there are still errors reported during the BootWare loading
process, replace the switch.
4. If the issue persists, contact H3C Support.
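The order of the checks above can be summarized in a short Python sketch. It is an illustration of the decision sequence only, not an H3C tool. The function name next_reboot_action and its three Boolean parameters are hypothetical, and the results of the checks are assumed to have been gathered manually at the console.

```python
def next_reboot_action(image_ok: bool, memory_ok: bool, bootware_errors: bool) -> str:
    """Map the results of the three checks to the action the procedure prescribes."""
    if not image_ok:
        # Step 1: CRC error, missing image, or size mismatch.
        return "reload the system software image"
    if not memory_ok:
        # Step 2: the memory examination reported a fault.
        return "replace the switch"
    if bootware_errors:
        # Step 3: memory is fine but BootWare still reports errors.
        return "replace the switch"
    # Step 4: nothing above explains the failure.
    return "contact H3C Support"


print(next_reboot_action(image_ok=True, memory_ok=True, bootware_errors=True))
```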
Operating power module failure
Symptom
An operating power module fails.
Solution
To resolve the issue:
1. Identify the operating state of the power module.
○ Execute the display power command to view the operating state of the power module.
<Sysname> display power
Input Power:132W
PowerID State InPower(W) Current(A) Voltage(V) OutPower(W) Type
1 Absent -- -- -- -- ---
2 Normal -- -- -- -- PSR300-A
○ Execute the display alarm command to view alarm information about the power module.
<Sysname> display alarm
Slot CPU Level Info
1 0 INFO Chassis 1 power 1 is absent.
If the power module is in Absent state, go to step 2. If the power module is in Fault state, go to
step 3.
2. Verify that the power module is installed securely.
Remove and reinstall the power module to ensure that the power module is installed securely.
Then execute the display power command to verify that the power module has changed to
Normal state. If the power module remains in Absent state, replace the power module.
3. Verify that the power module is operating correctly.
a. Verify that the power cord is connected to the power module securely.
<Sysname> display power
Input Power:132W
PowerID State InPower(W) Current(A) Voltage(V) OutPower(W) Type
1 Absent -- -- -- -- ---
2 Normal -- -- -- -- PSR300-A
If the voltage and current of the power module are 0 and the power module state is Fault,
the power cord is disconnected. Connect the power cord securely to the power module.
Then execute the display power command to verify that the power module has changed
to Normal state.
b. Determine whether the power module is overheating. If dust accumulation on the power
module is causing the high temperature, remove the dust. Then remove and reinstall
the power module. Execute the display power command to verify that the power module
has changed to Normal state.
c. Install the power module into an empty power module slot. Then execute the display
power command to verify that the power module has changed to Normal state in the new
slot. If the power module remains in Fault state, replace the power module.
4. If the issue persists, contact H3C Support.
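If you capture the display power output as text, the state of each power module can be read from it programmatically. The Python sketch below assumes the column layout shown in the sample output above (power ID first, state second). The layout might differ between software versions, and the names power_states and NEXT_STEP are hypothetical.

```python
from typing import Dict

# Sample output copied from the procedure above.
SAMPLE = """\
Input Power:132W
PowerID State InPower(W) Current(A) Voltage(V) OutPower(W) Type
1 Absent -- -- -- -- ---
2 Normal -- -- -- -- PSR300-A
"""

# Branching rule from step 1 of the procedure.
NEXT_STEP = {"Absent": "go to step 2", "Fault": "go to step 3"}


def power_states(output: str) -> Dict[int, str]:
    """Return {power ID: state} for every data row in the captured output."""
    states = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows start with a numeric power ID; other lines are headers.
        if len(fields) >= 2 and fields[0].isdigit():
            states[int(fields[0])] = fields[1]
    return states


for power_id, state in power_states(SAMPLE).items():
    print(power_id, state, NEXT_STEP.get(state, "no action"))
```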
Newly installed power module failure
Symptom
A newly installed power module fails.
Solution
To resolve the issue:
1. Identify the operating state of the power module.
○ Execute the display power command to view the operating state of the power module.
<Sysname> display power
Input Power:132W
PowerID State InPower(W) Current(A) Voltage(V) OutPower(W) Type
1 Absent -- -- -- -- ---
2 Normal -- -- -- -- PSR300-A
○ Execute the display alarm command to view alarm information about the power module.
<Sysname> display alarm
Slot CPU Level Info
1 0 INFO Chassis 1 power 1 is absent.
If the power module is in Absent state, go to step 2. If the power module is in Fault state, go to
step 3.
2. Verify that the power module is installed securely.
a. Remove and reinstall the power module to make sure the power module is installed
securely. Then execute the display power command to verify that the power module has
changed to Normal state.
b. Remove the power module and install it into an empty power module slot. Then execute the
display power command to verify that the power module has changed to Normal state
in the new slot. If the power module remains in Absent state, go to step 4.
3. Verify that the power module is operating correctly.
a. Verify that the power module is connected to the power source correctly. If it is not, connect
it to the power source correctly. Then execute the display power command to verify that
the power module has changed to Normal state.
b. Remove the power module and install it into an empty power module slot. Then execute the
display power command to verify that the power module has changed to Normal state in
the new slot. If the power module remains in Fault state, go to step 4.
4. If the issue persists, contact H3C Support.
Fan tray failure
Symptom
An operating fan tray or a newly installed fan tray fails.
Solution
To resolve the issue:
1. Identify the operating state of the fan tray.
○ Execute the display fan command to view the operating state of the fan tray.
<Sysname> display fan
Fan-tray 1:
Status : Normal
Fan Type : LSWM1FANSA
Fan number: 2
Fan mode : Auto
Airflow Direction: Port-to-power
Fan Speed(rpm)
--- ----------
1 10692
2 9105
Fan-tray 2:
Status : Normal
Fan Type : LSWM1FANSA
Fan number: 2
Fan mode : Auto
Airflow Direction: Port-to-power
Fan Speed(rpm)
--- ----------
1 10702
2 9133
Fan-tray 3:
Status : Normal
Fan Type : LSWM1FANSA
Fan number: 2
Fan mode : Auto
Airflow Direction: Port-to-power
Fan Speed(rpm)
--- ----------
1 10692
2 9162
Fan-tray 4:
Status : Normal
Fan Type : LSWM1FANSA
Fan number: 2
Fan mode : Auto
Airflow Direction: Port-to-power
Fan Speed(rpm)
--- ----------
1 10731
2 9183
Fan-tray 5:
Status : Normal
Fan Type : LSWM1FANSA
Fan number: 2
Fan mode : Auto
Airflow Direction: Port-to-power
Fan Speed(rpm)
--- ----------
1 10672
2 9183
○ Execute the display alarm command to view alarm information about the fan tray.
<Sysname> display alarm
Slot CPU Level Info
1 0 INFO Chassis 1 power 1 is absent.
If the fan tray is in Absent state, go to step 2. If the fan tray is in Fault state, go to step 3.
2. Verify that the fan tray is installed securely.
Remove and reinstall the fan tray to ensure that the fan tray is installed securely. Then execute
the display fan command to verify that the fan tray has changed to Normal state. If the fan
tray remains in Absent state, replace the fan tray.
3. Verify that the fan tray is operating correctly.
a. Identify whether the fan tray is faulty.
− Execute the display environment command to view temperature information.
If the temperature continues to rise, place your hand at the air outlet to check whether air
is being expelled. If no air is being expelled, the fan tray is faulty.
− Execute the display fan command to view the fan speed information.
If the fan speed is less than 500 rpm, the fan tray is faulty.
b. If the fan tray is faulty, remove and reinstall the fan tray to make sure the fan tray is installed
securely. Then execute the display fan command to verify that the fan tray has changed
to Normal state.
c. If the fan tray remains in Fault state, replace the fan tray.
You must make sure the switch operating temperature is below 60°C (140°F) while you
replace the fan tray. If a new fan tray is not readily available, power off the switch to avoid
damage caused by high temperature.
4. If the issue persists, contact H3C Support.
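The fan speed check in step 3 can also be applied to captured display fan output. The Python sketch below assumes the layout shown in the sample output above (a "Fan-tray N:" header followed by rows of fan number and speed). The function name slow_fans is hypothetical, and the 320 rpm reading and Fault status in the sample are made-up values for illustration.

```python
import re
from typing import List, Tuple

MIN_RPM = 500  # Threshold stated in step 3 of the procedure.

# Shortened capture; the Fault status and 320 rpm reading are hypothetical.
SAMPLE = """\
Fan-tray 1:
Status : Normal
Fan Speed(rpm)
--- ----------
1 10692
2 9105
Fan-tray 2:
Status : Fault
Fan Speed(rpm)
--- ----------
1 10702
2 320
"""


def slow_fans(output: str, min_rpm: int = MIN_RPM) -> List[Tuple[int, int, int]]:
    """Return (tray, fan, rpm) for every fan running below min_rpm."""
    slow = []
    tray = None
    for line in output.splitlines():
        header = re.match(r"\s*Fan-tray (\d+):", line)
        if header:
            tray = int(header.group(1))
            continue
        # Speed rows contain exactly two integers: fan number and rpm.
        row = re.match(r"\s*(\d+)\s+(\d+)\s*$", line)
        if row and tray is not None:
            fan, rpm = int(row.group(1)), int(row.group(2))
            if rpm < min_rpm:
                slow.append((tray, fan, rpm))
    return slow


print(slow_fans(SAMPLE))  # [(2, 2, 320)]
```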
Related commands
This section lists the commands that you might use for troubleshooting the hardware.
Command               Description
display alarm         Displays alarm information.
display environment   Displays temperature information.
display fan           Displays the operating states of the fan tray.
display power         Displays power module information.
Troubleshooting system management
This section provides troubleshooting information for common system management issues.
High CPU utilization
Symptom
The sustained CPU utilization on a device is noticeably higher than the CPU utilization on other
devices.
Troubleshooting flowchart
Figure 2 Troubleshooting high CPU utilization
[Flowchart: High CPU utilization; Identify the job that has a high CPU utilization; Display the stack of the job; Contact the support.]
Solution
To resolve the issue:
1. Identify the job that has a high CPU utilization. For example:
<Sysname> system-view
[Sysname] probe
[Sysname-probe] display process cpu slot 1
CPU utilization in 5 secs: 6.0%; 1 min: 5.6%; 5 mins: 5.7%
JID 5Sec 1Min 5Min Name
1 0.0% 0.0% 0.0% scmd
2 0.0% 0.0% 0.0% [kthreadd]
3 0.0% 0.0% 0.0% [migration/0]
4 0.0% 0.0% 0.0% [ksoftirqd/0]
5 0.0% 0.0% 0.0% [watchdog/0]
6 0.0% 0.0% 0.0% [migration/1]
7 0.0% 0.0% 0.0% [ksoftirqd/1]
8 0.0% 0.0% 0.0% [watchdog/1]
9 0.0% 0.0% 0.0% [migration/2]
10 0.0% 0.0% 0.0% [ksoftirqd/2]
11 0.0% 0.0% 0.0% [watchdog/2]
12 0.0% 0.0% 0.0% [migration/3]
13 0.0% 0.0% 0.0% [ksoftirqd/3]
14 0.0% 0.0% 0.0% [watchdog/3]
15 0.0% 0.0% 0.0% [migration/4]
16 0.0% 0.0% 0.0% [ksoftirqd/4]
17 0.0% 0.0% 0.0% [watchdog/4]
18 0.0% 0.0% 0.0% [migration/5]
19 0.0% 0.0% 0.0% [ksoftirqd/5]
20 0.0% 0.0% 0.0% [watchdog/5]
21 0.0% 0.0% 0.0% [migration/6]
The output shows the average CPU usage values of jobs for the last 5 seconds, 1 minute, and
5 minutes. Typically, the average CPU usage of a job is less than 5%.
2. Display the job's stack. In this example, the job ID is 284.
[Sysname-probe] follow job 284 slot 1
Attaching to process 284 ([OPTK])
Iteration 1 of 5
------------------------------
Kernel stack:
[<ffffffff804ad9f0>] schedule+0x710/0x1050
[<ffffffff804ae5d8>] schedule_timeout+0x98/0xe0
[<ffffffff803187d0>] kepoll_wait+0x2d0/0x450
[<ffffffffc71b29d4>] DWARE_OPTMOD_TaskEntry+0xa4/0xd0 [system]
[<ffffffffc72e1894>] thread_boot+0x74/0x90 [system]
[<ffffffff80266470>] kthread+0x140/0x150
[<ffffffff8021d910>] kernel_thread_helper+0x10/0x20
Iteration 2 of 5
------------------------------
Kernel stack:
[<ffffffff804ad9f0>] schedule+0x710/0x1050
[<ffffffff804ae5d8>] schedule_timeout+0x98/0xe0
[<ffffffff803187d0>] kepoll_wait+0x2d0/0x450
[<ffffffffc71b29d4>] DWARE_OPTMOD_TaskEntry+0xa4/0xd0 [system]
[<ffffffffc72e1894>] thread_boot+0x74/0x90 [system]
[<ffffffff80266470>] kthread+0x140/0x150
[<ffffffff8021d910>] kernel_thread_helper+0x10/0x20
Iteration 3 of 5
------------------------------
Kernel stack:
[<ffffffff804ad9f0>] schedule+0x710/0x1050
[<ffffffff804ae5d8>] schedule_timeout+0x98/0xe0
[<ffffffff803187d0>] kepoll_wait+0x2d0/0x450
[<ffffffffc71b29d4>] DWARE_OPTMOD_TaskEntry+0xa4/0xd0 [system]
[<ffffffffc72e1894>] thread_boot+0x74/0x90 [system]
[<ffffffff80266470>] kthread+0x140/0x150
[<ffffffff8021d910>] kernel_thread_helper+0x10/0x20
Iteration 4 of 5
------------------------------
Kernel stack:
[<ffffffff804ad9f0>] schedule+0x710/0x1050
[<ffffffff804ae5d8>] schedule_timeout+0x98/0xe0
[<ffffffff803187d0>] kepoll_wait+0x2d0/0x450
[<ffffffffc71b29d4>] DWARE_OPTMOD_TaskEntry+0xa4/0xd0 [system]
[<ffffffffc72e1894>] thread_boot+0x74/0x90 [system]
[<ffffffff80266470>] kthread+0x140/0x150
[<ffffffff8021d910>] kernel_thread_helper+0x10/0x20
Iteration 5 of 5
------------------------------
Kernel stack:
[<ffffffff804ad9f0>] schedule+0x710/0x1050
[<ffffffff804ae5d8>] schedule_timeout+0x98/0xe0
[<ffffffff803187d0>] kepoll_wait+0x2d0/0x450
[<ffffffffc71b29d4>] DWARE_OPTMOD_TaskEntry+0xa4/0xd0 [system]
[<ffffffffc72e1894>] thread_boot+0x74/0x90 [system]
[<ffffffff80266470>] kthread+0x140/0x150
[<ffffffff8021d910>] kernel_thread_helper+0x10/0x20
3. Save the information displayed in the previous steps and use the display
diagnostic-information command to collect diagnostic information.
4. Contact H3C Support.
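When the display process cpu output is long, a script can pick out the jobs that exceed the typical 5% figure mentioned in step 1. The Python sketch below assumes the row layout shown in the sample output above (JID, three percentages, job name). The function name busy_jobs is hypothetical, and the percentages on the [OPTK] row are made-up values for illustration.

```python
import re
from typing import List, Tuple

# One data row: JID, 5Sec, 1Min, 5Min, Name.
ROW = re.compile(r"^\s*(\d+)\s+([\d.]+)%\s+([\d.]+)%\s+([\d.]+)%\s+(\S.*?)\s*$")

# Shortened capture; the percentages on the last row are hypothetical.
SAMPLE = """\
JID 5Sec 1Min 5Min Name
1 0.0% 0.0% 0.0% scmd
2 0.0% 0.0% 0.0% [kthreadd]
284 12.5% 11.9% 12.1% [OPTK]
"""


def busy_jobs(output: str, threshold: float = 5.0) -> List[Tuple[int, str, float]]:
    """Return (JID, name, 5-minute average) for jobs above the threshold."""
    busy = []
    for line in output.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # Header or summary line.
        jid = int(match.group(1))
        five_sec, one_min, five_min = (float(match.group(i)) for i in (2, 3, 4))
        # Report the job if any of the three averages exceeds the threshold.
        if max(five_sec, one_min, five_min) > threshold:
            busy.append((jid, match.group(5), five_min))
    return busy


print(busy_jobs(SAMPLE))  # [(284, '[OPTK]', 12.1)]
```

The job IDs returned by such a filter are the ones to pass to the follow job command in step 2.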
High memory utilization
Symptom
The display memory command shows that the memory utilization of the device is higher than 60%
during a period of time (typically 30 minutes).
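The symptom describes a sustained condition, not a single high reading. The Python sketch below shows one way to test for it, assuming you have recorded memory utilization readings from repeated display memory executions as (minute, percentage) pairs. The function name sustained_high and the sample readings are hypothetical.

```python
from typing import List, Tuple


def sustained_high(
    samples: List[Tuple[float, float]],
    threshold: float = 60.0,
    window_minutes: float = 30.0,
) -> bool:
    """Return True if utilization stayed above threshold for window_minutes.

    `samples` holds (minutes since first reading, utilization in percent),
    sorted by time. Any reading at or below the threshold resets the run.
    """
    run_start = None
    for minute, used in samples:
        if used > threshold:
            if run_start is None:
                run_start = minute
            if minute - run_start >= window_minutes:
                return True
        else:
            run_start = None
    return False


# Hypothetical readings taken every 5 minutes.
steady = [(t, 72.0) for t in range(0, 35, 5)]
dipped = [(t, 55.0 if t == 15 else 72.0) for t in range(0, 35, 5)]
print(sustained_high(steady))  # True
print(sustained_high(dipped))  # False
```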