Cisco UCS C Series Troubleshooting Tips

Type
Troubleshooting Tips

Cisco UCS C Series is a high-performance server platform designed for demanding workloads. It offers a range of features and capabilities that make it ideal for a variety of applications, including virtualization, high-performance computing, and data analytics. Powered by Intel Xeon processors, the UCS C Series provides exceptional processing power and memory capacity. With its advanced I/O capabilities, including support for NVMe and PCIe 4.0, the UCS C Series can handle even the most demanding storage and networking requirements.

UCS C−Series Servers Troubleshooting Tips
Document ID: 111508
Contents
Introduction
Prerequisites
Requirements
Components Used
Network Diagram
Conventions
Background Information
C−Series Troubleshooting Tips
Obtaining Showtech Support to TAC
Display of System Event Log Events
Display of Sensor Readings
Display of CIMC Log
Run Debug Firmware Utility (CLI)
Run Diagnostics (CLI)
Common Troubleshooting Scenarios − Power−On Related
Common Troubleshooting Scenarios − Host Does not Boot
Common Troubleshooting Scenarios − BMC
Verify
Troubleshoot
Related Information
Introduction
The Cisco Integrated Management Controller (CIMC) is the management service for the UCS C−Series
server. CIMC runs within the server.
You can use a web-based GUI or an SSH-based CLI to access, configure, administer, and monitor the server.
Almost all tasks can be performed in either interface, and the results of tasks performed in one interface are
automatically displayed in the other.
This document provides some CIMC troubleshooting tips and common troubleshooting scenarios for UCS
C−Series servers.
Prerequisites
Requirements
Cisco recommends that you:
- Have a working knowledge of Cisco UCS C-Series rack server hardware and software administration.
- Be familiar with the Cisco Integrated Management Controller for UCS C-Series servers.
- Understand the impact and implications of the different commands described in this document.
- Be familiar with the UCS components and topology.
Components Used
The information in this document is based on Cisco UCS C−Series Servers.
The information in this document was created from the devices in a specific lab environment. All of the
devices used in this document started with a default configuration. If your network is live, make sure that you
understand the potential impact of any command.
Network Diagram
There is currently no specific network diagram available.
Conventions
Refer to the Cisco Technical Tips Conventions for more information on document conventions.
Background Information
There is currently no specific background information available.
C−Series Troubleshooting Tips
Common troubleshooting tips on C−Series servers are provided in this section.
Obtaining Showtech Support to TAC
Perform this task when requested by the Cisco Technical Assistance Center (TAC). This utility creates a
summary report containing configuration information, logs, and diagnostic data that will help TAC in
troubleshooting and resolving a technical issue.
The technical support data can be exported from either the GUI or the CLI. Both methods upload a techsupport
file to a TFTP server for offline analysis.
Complete these steps in order to obtain the showtech via the GUI:
1. In the Navigation pane, click the Admin tab.
2. From the Admin tab, click Utilities.
3. In the Actions area of the Utilities pane, click Export Technical Support Data.
4. In the Export Technical Support Data dialog box, complete these fields:
   - TFTP Server IP Address field: The IP address of the TFTP server on which the support data file
     should be stored.
   - Path and Filename field: The file name in which the support data should be stored on the server.
     When you enter this name, include the relative path for the file from the top of the TFTP tree to
     the desired location.
5. Click Export.
To generate the showtech via the CLI, enter the tech-support scope, set the TFTP server address and the
destination path, commit the changes, and start the export. Then provide the generated report file to Cisco TAC.
SanDiego# scope cimc
SanDiego /cimc # scope
  firmware
  log
  network
  tech-support
SanDiego /cimc # scope tech-support
SanDiego /cimc/tech-support # set tftp-ip 192.168.1.1
SanDiego /cimc/tech-support *# set path \techsupport\showtech
SanDiego /cimc/tech-support *# commit
SanDiego /cimc/tech-support *# start
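The CLI sequence above can be sketched as a small helper that assembles the commands in order, which may be convenient when the same export is repeated on several servers. This is only an illustrative sketch: the function name build_techsupport_commands is hypothetical, the address check uses Python's standard ipaddress module, and nothing here connects to a CIMC. It only produces the text to type or paste.

```python
import ipaddress
from typing import List


def build_techsupport_commands(tftp_ip: str, path: str) -> List[str]:
    """Return the CIMC CLI commands for a tech-support export, in order.

    The command names mirror the session shown in this document; the helper
    itself is a hypothetical convenience, not part of any Cisco tool.
    """
    # ip_address() raises ValueError for anything that is not a valid address.
    ipaddress.ip_address(tftp_ip)
    if not path.strip():
        raise ValueError("path must not be empty")
    return [
        "scope cimc",
        "scope tech-support",
        f"set tftp-ip {tftp_ip}",
        f"set path {path}",
        "commit",
        "start",
    ]


if __name__ == "__main__":
    # Values taken from the example session in this document.
    for command in build_techsupport_commands("192.168.1.1", r"\techsupport\showtech"):
        print(command)
```

Validating the address before the commands are generated catches a mistyped TFTP server early, before the export is started.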
These are explanations of some of the key fields within the showtech:
- var/: Contains detailed logs and the status of all monitored services. It also contains service
  information files, such as the configuration of SOL and IPMI sensor alarms.
- var/log: Contains the rolling volatile log messages.
- obfl/: Contains the rolling non-volatile log messages.
- met/: Non-volatile configuration and SEL.
- tmp/: The show techsupport text files, along with BIOS techsupport text files.
- Text files in tmp: These contain all process, network, system, mezzanine, and BIOS state information.
- mctool: Gets basic information on the state of the CIMC to UCS management API.
- network: Current network configuration and socket information.
- obfl: Live obfl.
- messages: Live /var/log/messages file.
- alarms: Which sensors are in alarm.
- sensors: Current sensor readings from IPMI.
- power: The current power state of the x86.
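When you browse an extracted techsupport file, it can help to label each member with the area it belongs to. The sketch below maps a relative path to one of the directory descriptions listed above. The matching rule (longest prefix wins) and the example file names are assumptions made for illustration; the document does not specify the archive layout beyond these directory names.

```python
from typing import Dict

# Descriptions paraphrased from the list of showtech areas in this document.
SHOWTECH_AREAS: Dict[str, str] = {
    "var/log/": "rolling volatile log messages",
    "var/": "detailed logs and status of all monitored services",
    "obfl/": "rolling non-volatile log messages",
    "met/": "non-volatile configuration and SEL",
    "tmp/": "show techsupport and BIOS techsupport text files",
}


def describe_member(path: str) -> str:
    """Return the description of the showtech area that contains `path`."""
    normalized = path
    if normalized.startswith("./"):
        normalized = normalized[2:]
    normalized = normalized.lstrip("/")
    # Try longer prefixes first so that var/log/ is preferred over var/.
    for prefix in sorted(SHOWTECH_AREAS, key=len, reverse=True):
        if normalized.startswith(prefix):
            return SHOWTECH_AREAS[prefix]
    return "not described in this document"


if __name__ == "__main__":
    # The file names below are hypothetical examples.
    for member in ("var/log/messages", "./obfl/obfl-log", "tmp/sensors.txt"):
        print(member, "->", describe_member(member))
```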
Display of System Event Log Events
Complete these steps in order to display the System Event Log (SEL) events:
1. In the Navigation pane, click the Server tab.
2. From the Server tab, click System Event Log.
3. Review the information displayed for each system event in the log.
4. (Optional) From the Entries Per Page drop-down list, select the number of system events to display
   on each page.
5. (Optional) Click <Newer and Older> to move backward and forward through the pages of system
   events, or click <<Newest to move to the top of the list. By default, the newest system events are
   displayed at the top of the list.
Display of Sensor Readings
Complete these steps in order to display the sensor readings:
1. In the Navigation pane, click the Server tab.
2. From the Server tab, click Sensors.
3. View various sensors by clicking the desired sensor.
Display of CIMC Log
Complete these steps in order to display the CIMC log:
1. In the Navigation pane, click the Admin tab.
2. From the Admin tab, click CIMC Log.
3. From the Entries Per Page drop-down list, select the number of CIMC events to display on each page.
Run Debug Firmware Utility (CLI)
You can run the Debug Firmware Utility from the CLI. It provides access to a read-only shell in which you
can view realtime CIMC debug status.
CIMC Debug Utilities
!--- enter debug shell
SanDiego /cimc # connect
  debug-shell
  diags
  host
  shell
SanDiego /cimc # connect debug-shell
  <CR>
SanDiego /cimc # connect debug-shell
BMC Debug Firmware Utility Shell
[ help ]#
!--- available debug options
[ help ]# ?
__________________________________________
Debug Firmware Utility
__________________________________________
Command List
__________________________________________
alarms
cores
exit
help [COMMAND]
images
mctools
memory
messages
network
obfl
post
power
sensors
sel
fru
tasks
top
update
users
version
__________________________________________
Notes:
"enter Key" will execute last command
"COMMAND ?" will execute help for that command
__________________________________________
[ help ]#
!--- view active alarms in realtime
[ help ]# alarms
StatusLedControl: Setting LED to AMBER
- Sensor[176] in ALARM Level[2]
[ alarms ]#
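If the alarms output is captured to a text file, the sensor numbers and alarm levels can be pulled out with a regular expression. The following is a minimal sketch that assumes every alarm line has the same shape as the one shown above. The function name parse_alarms is hypothetical, and the meaning of each level number is not defined in this document, so the values are returned as plain integers.

```python
import re
from typing import List, Tuple

# Matches lines such as "- Sensor[176] in ALARM Level[2]".
ALARM_RE = re.compile(r"Sensor\[(\d+)\] in ALARM Level\[(\d+)\]")


def parse_alarms(output: str) -> List[Tuple[int, int]]:
    """Return (sensor_number, alarm_level) pairs found in `output`."""
    return [(int(sensor), int(level)) for sensor, level in ALARM_RE.findall(output)]


if __name__ == "__main__":
    captured = (
        "StatusLedControl: Setting LED to AMBER\n"
        "- Sensor[176] in ALARM Level[2]\n"
    )
    print(parse_alarms(captured))
```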
!--- view all sensors in realtime
[ alarms ]# sensors
P3V_BAT_SCALED | 3.023 | Volts | ok | 2.706 | 2.798 | na | na | 3.089 | na
P12V_SCALED | 12.036 | Volts | ok | 11.269 | 11.623 | na | na | 12.331 | 12.685
P5V_SCALED | 5.037 | Volts | ok | 4.675 | 4.844 | na | na | 5.157 | 5.278
P3V3_SCALED | 3.302 | Volts | ok | 3.097 | 3.192 | na | na | 3.381 | 3.492
P5V_STBY_SCALED | 4.989 | Volts | ok | 4.675 | 4.844 | na | na | 5.157 | 5.278
VR_CPU1_IOUT | 10.680 | Amps | ok | na | na | na | 152.680 | 164.040 | 175.400
VR_CPU2_IOUT | 12.100 | Amps | ok | na | na | na | 152.680 | 164.040 | 175.400
PV_VCCP_CPU1 | 0.862 | Volts | ok | 0.706 | 0.725 | na | na | 1.392 | 1.431
PV_VCCP_CPU2 | 0.862 | Volts | ok | 0.706 | 0.725 | na | na | 1.392 | 1.431
P1V5_DDR3_CPU1 | 1.499 | Volts | ok | 1.411 | 1.450 | na | na | 1.548 | 1.588
P1V5_DDR3_CPU2 | 1.499 | Volts | ok | 1.411 | 1.450 | na | na | 1.548 | 1.588
P1V1_IOH | 1.088 | Volts | ok | 1.029 | 1.068 | na | na | 1.137 | 1.166
P1V8_AUX | 1.784 | Volts | ok | 1.695 | 1.744 | na | na | 1.852 | 1.911
IOH_THERMALERT_N | 0x0 | discrete | 0x0180| na | na | na | na | na | na
IOH_THERMTRIP_N | 0x0 | discrete | 0x0180| na | na | na | na | na | na
P2_THERMTRIP_N | 0x0 | discrete | 0x0180| na | na | na | na | na | na
P1_THERMTRIP_N | 0x0 | discrete | 0x0180| na | na | na | na | na | na
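The sensor listing above has no column headings. The layout resembles the pipe-separated output of ipmitool's sensor command, in which the columns are name, reading, unit, status, and then six thresholds (lower non-recoverable, lower critical, lower non-critical, upper non-critical, upper critical, upper non-recoverable). Assuming the same column order applies here, which this document does not confirm, a captured listing can be checked for readings outside the critical thresholds with a sketch like this:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SensorRow:
    name: str
    reading: Optional[float]  # None for discrete sensors such as "0x0"
    unit: str
    status: str
    lower_critical: Optional[float]
    upper_critical: Optional[float]


def _to_float(field: str) -> Optional[float]:
    """Convert a numeric field; 'na' and hex values such as '0x0' become None."""
    try:
        return float(field)
    except ValueError:
        return None


def parse_sensor_line(line: str) -> SensorRow:
    """Parse one pipe-separated sensor row (column order is an assumption)."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != 10:
        raise ValueError(f"expected 10 columns, got {len(fields)}")
    return SensorRow(
        name=fields[0],
        reading=_to_float(fields[1]),
        unit=fields[2],
        status=fields[3],
        lower_critical=_to_float(fields[5]),  # assumed: sixth column
        upper_critical=_to_float(fields[8]),  # assumed: ninth column
    )


def outside_critical(row: SensorRow) -> bool:
    """True when a numeric reading is beyond either critical threshold."""
    if row.reading is None:
        return False
    too_low = row.lower_critical is not None and row.reading < row.lower_critical
    too_high = row.upper_critical is not None and row.reading > row.upper_critical
    return too_low or too_high


if __name__ == "__main__":
    line = "P12V_SCALED | 12.036 | Volts | ok | 11.269 | 11.623 | na | na | 12.331 | 12.685"
    row = parse_sensor_line(line)
    print(row.name, row.reading, outside_critical(row))
```

Discrete sensors and thresholds shown as na are treated as not comparable, so they are never reported as out of range.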
!--- view power status in realtime
[ help ]# power
OP:[ status ]
Power-State: [ on ]
VDD-Power-Good: [ active ]
Power-On-Fail: [ inactive ]
Power-Ctrl-Lock: [ unlocked ]
OP-CCODE:[ Success ]
[ power ]#
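The power output consists of lines in the form name: [ value ], which makes it easy to turn a captured copy into a dictionary for scripted checks. The sketch below assumes that line format. The function name parse_power_status is hypothetical. Because text copied from a PDF can contain the Unicode minus sign (U+2212) in place of a hyphen, the sketch normalizes that character first.

```python
import re
from typing import Dict

# Matches lines such as "Power-State: [ on ]" or "OP:[ status ]".
FIELD_RE = re.compile(r"^\s*([\w-]+):\s*\[\s*(.*?)\s*\]\s*$")


def parse_power_status(output: str) -> Dict[str, str]:
    """Return a mapping of field name to value from captured power output."""
    status: Dict[str, str] = {}
    # Replace U+2212 (minus sign) with an ASCII hyphen before matching.
    for line in output.replace("\u2212", "-").splitlines():
        match = FIELD_RE.match(line)
        if match:
            status[match.group(1)] = match.group(2)
    return status


if __name__ == "__main__":
    captured = """OP:[ status ]
Power-State: [ on ]
VDD-Power-Good: [ active ]
Power-On-Fail: [ inactive ]
Power-Ctrl-Lock: [ unlocked ]
OP-CCODE:[ Success ]
[ power ]#"""
    status = parse_power_status(captured)
    print(status["Power-State"], status["Power-On-Fail"])
```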
!--- view network status in realtime
[ power ]# network
eth1 Link encap:Ethernet HWaddr 02:44:67:84:09:1C
inet addr:172.25.183.109 Bcast:172.25.183.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:42862 errors:0 dropped:0 overruns:0 frame:0
TX packets:26968 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3786646 (3.6 MiB) TX bytes:12311980 (11.7 MiB)
Interrupt:1
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.255.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:8137 errors:0 dropped:0 overruns:0 frame:0
TX packets:8137 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:877033 (856.4 KiB) TX bytes:877033 (856.4 KiB)
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:3490 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:8195 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:2068 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:23 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:443 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:8195 127.0.0.1:2360 ESTABLISHED
tcp 0 0 127.0.0.1:8195 127.0.0.1:2361 ESTABLISHED
tcp 0 0 127.0.0.1:8195 127.0.0.1:2353 ESTABLISHED
tcp 0 0 127.0.0.1:2363 127.0.0.1:8195 ESTABLISHED
tcp 0 0 127.0.0.1:2360 127.0.0.1:8195 ESTABLISHED
tcp 0 0 127.0.0.1:2361 127.0.0.1:8195 ESTABLISHED
tcp 0 0 127.0.0.1:2367 127.0.0.1:8195 ESTABLISHED
tcp 0 0 127.0.0.1:8195 127.0.0.1:2354 ESTABLISHED
tcp 0 0 127.0.0.1:2354 127.0.0.1:8195 ESTABLISHED
tcp 0 0 127.0.0.1:2355 127.0.0.1:8195 ESTABLISHED
tcp 0 0 127.0.0.1:2353 127.0.0.1:8195 ESTABLISHED
tcp 0 0 127.0.0.1:2358 127.0.0.1:8195 ESTABLISHED
tcp 0 0 127.0.0.1:2359 127.0.0.1:8195 ESTABLISHED
tcp 0 0 127.0.0.1:2356 127.0.0.1:8195 ESTABLISHED
tcp 0 0 127.0.0.1:2357 127.0.0.1:8195 ESTABLISHED
tcp 0 0 127.0.0.1:8195 127.0.0.1:2363 ESTABLISHED
tcp 0 0 127.0.0.1:8195 127.0.0.1:2355 ESTABLISHED
tcp 0 4412 172.25.183.109:22 10.61.100.118:2632 ESTABLISHED
tcp 0 0 127.0.0.1:8195 127.0.0.1:2356 ESTABLISHED
tcp 0 0 127.0.0.1:8195 127.0.0.1:2357 ESTABLISHED
tcp 0 0 127.0.0.1:8195 127.0.0.1:2358 ESTABLISHED
tcp 0 0 127.0.0.1:8195 127.0.0.1:2367 ESTABLISHED
tcp 0 0 127.0.0.1:8195 127.0.0.1:2359 ESTABLISHED
netstat: no support for 'AF INET6 (tcp)' on this system
udp 0 0 127.0.0.1:9473 0.0.0.0:*
udp 0 0 0.0.0.0:623 0.0.0.0:*
netstat: no support for 'AF INET6 (udp)' on this system
netstat: no support for 'AF INET6 (raw)' on this system
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags Type State I-Node Path
unix 2 [ ACC ] STREAM LISTENING 3330 /tmp/rpSocketCB25226
unix 2 [ ACC ] STREAM LISTENING 2112 /var/split_stream_RW
unix 2 [ ACC ] STREAM LISTENING 2114 /var/split_stream_RO
unix 2 [ ACC ] STREAM LISTENING 4437 /tmp/rpSocketSMCB536870913
unix 2 [ ACC ] STREAM LISTENING 2903 /tmp/rpSocket35003
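The socket table in the network output shows which services the CIMC is listening on. As a rough sketch, the listening TCP sockets can be extracted from a captured copy of that output as follows. It assumes the six-column netstat layout shown above, and the function name listening_tcp_sockets is hypothetical.

```python
from typing import List, Tuple


def listening_tcp_sockets(netstat_output: str) -> List[Tuple[str, int]]:
    """Return sorted (local_address, port) pairs for TCP sockets in LISTEN state."""
    sockets = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State
        if len(fields) == 6 and fields[0] == "tcp" and fields[5] == "LISTEN":
            address, _, port = fields[3].rpartition(":")
            sockets.add((address, int(port)))
    return sorted(sockets)


if __name__ == "__main__":
    captured = """tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:8195 0.0.0.0:* LISTEN
tcp 0 4412 172.25.183.109:22 10.61.100.118:2632 ESTABLISHED
udp 0 0 0.0.0.0:623 0.0.0.0:*"""
    for address, port in listening_tcp_sockets(captured):
        print(address, port)
```

Sockets bound to 127.0.0.1 are reachable only from within the CIMC itself, so the entries bound to 0.0.0.0 are the ones that matter when you check which management services are exposed on the network.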
Run Diagnostics (CLI)
You can run Diagnostics (CLI) to diagnose possible issues.
Note: Diagnostics, while designed to be safe to a running server, should not be run with a load you cannot
afford to lose. Ensure that critical server applications are offline before you run diagnostics.
Diagnostics (CLI)
To view realtime CIMC Diagnostics status.
CIMC Diags Shell
!--- Enter diagnostics mode.
SanDiego# connect
  debug-shell
  diags
  host
  shell
SanDiego# connect diags
  <CR>
SanDiego# connect diags
Diagnostics while designed to be safe to a running server should not
be run with a load you cannot afford to lose. Please ensure critical
server applications are off−line before running diagnostics.
Continue?[y|N]y
max_slots = 1
Registering hwmon commands
dsh_cmd_add: Adding hwmon 0x32658
Registering board commands
dsh_cmd_add: Adding sprom 0x129a8
dsh_cmd_add: Adding i2c 0x12a14
dsh_cmd_add: Adding spd 0x12db8
dsh_cmd_add: Adding tmp75 0x12c88
dsh_cmd_add: Adding clkbuf 0x132e4
dsh_cmd_add: Adding timhub 0x13368
dsh_cmd_add: Adding help 0x10778
dsh_cmd_add: Adding show 0x108f4
dsh_cmd_add: Adding run 0x108a0
dsh_cmd_add: Adding skip 0x10b2c
dsh_cmd_add: Adding unskip 0x10b64
dsh_cmd_add: Adding results 0x10b9c
dsh_cmd_add: Adding sp 0x10ce8
dsh_cmd_add: Adding reg 0x10db0
dsh_cmd_add: Adding version 0x10de8
dsh_cmd_add: Adding err 0x10e24
DIAG >
DIAG > q
SanDiego#
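The startup banner of the diagnostics shell lists each command as it is registered, which gives a quick inventory of what is available at the DIAG > prompt. The sketch below collects those names from a captured banner. It assumes that every registration line follows the dsh_cmd_add: Adding NAME 0xADDRESS pattern shown above. The hexadecimal value is not explained in this document, so it is ignored here. The function name registered_diag_commands is hypothetical.

```python
import re
from typing import List

# Matches lines such as "dsh_cmd_add: Adding hwmon 0x32658".
REGISTRATION_RE = re.compile(r"^dsh_cmd_add: Adding (\S+) 0x[0-9a-fA-F]+\s*$")


def registered_diag_commands(banner: str) -> List[str]:
    """Return the command names registered in a diagnostics shell banner."""
    names = []
    for line in banner.splitlines():
        match = REGISTRATION_RE.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


if __name__ == "__main__":
    captured = """Registering hwmon commands
dsh_cmd_add: Adding hwmon 0x32658
Registering board commands
dsh_cmd_add: Adding sprom 0x129a8
dsh_cmd_add: Adding i2c 0x12a14
DIAG >"""
    print(registered_diag_commands(captured))
```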
Common Troubleshooting Scenarios − Power−On Related
No Standby Power to UCS C250 M1 Extended-Memory Rack-Mount Server
1. Check that the AC power cord is ok.
2. Check for a failure in the Power Supply Unit.
Server Host does not power up
1. Check the front I/O board connection.
2. Check the Power Sequencer fault LEDs.
3. Check for a Power Supply unit failure (PS Failure LED blinking).
Server powers on with no video
1. Check that the front I/O dongle is properly seated.
2. Check the front I/O cable connection to the motherboard.
3. Check for a memory subsystem failure.
BMC does not boot
1. Check for a failure in the standby power rails.
2. Check for a corrupt BMC BIOS.
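For use in a runbook or a support script, the power-on scenarios above can be kept as a simple lookup table from symptom to checks. The sketch below restates only the items listed in this section. The names POWER_ON_CHECKS and checks_for, and the keyword-matching approach, are illustrative assumptions.

```python
from typing import Dict, List

# Symptoms and checks restated from the power-on scenarios in this section.
POWER_ON_CHECKS: Dict[str, List[str]] = {
    "No standby power": [
        "Check that the AC power cord is ok",
        "Failure in Power Supply Unit",
    ],
    "Server host does not power up": [
        "Check front I/O board connection",
        "Check Power Sequencer fault LEDs",
        "Power Supply unit failure (PS Failure LED blinking)",
    ],
    "Server powers on with no video": [
        "Check that the front I/O dongle is properly seated",
        "Check the front I/O cable connection to motherboard",
        "Memory subsystem failure",
    ],
    "BMC does not boot": [
        "Failure in standby power rails",
        "Corrupt BMC BIOS",
    ],
}


def checks_for(keyword: str) -> List[str]:
    """Return the checks for every symptom whose text contains `keyword`."""
    wanted = keyword.lower()
    return [
        check
        for symptom, checks in POWER_ON_CHECKS.items()
        if wanted in symptom.lower()
        for check in checks
    ]


if __name__ == "__main__":
    for check in checks_for("video"):
        print(check)
```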
Common Troubleshooting Scenarios − Host Does not Boot
Check for:
1. Verify front I/O dongle is seated correctly.
2. Check Front I/O cable connection.
3. Reseat/Replace DIMM(s).
4. Verify BIOS is not corrupt.
5. Verify host power rails are good.
6. Check CPU sockets for bent pins.
7. Verify Powerok signals are ok.
8. Verify Resets are good.
Common Troubleshooting Scenarios − BMC
BMC booted. Look for the Blade health LED to come on, which indicates that the BMC has started.
1. Check that the standby power rails are ok.
2. Check that the BMC BIOS is not corrupt.
3. Check that the BMC clock is ok.
4. Check that standby power is ok and resets are valid.
BMC Ethernet cannot communicate
Check the flex cable connections to the motherboard and rear I/O.
Verify
Refer to the sections above for verification.
Troubleshoot
There is currently no specific troubleshooting information available for this configuration.
Related Information
Technical Support & Documentation − Cisco Systems
Updated: Jan 08, 2010 Document ID: 111508