www.nvidia.com
NVIDIA DGX-1 DU-08033-001 _v13.1|iii
4.1.2.Viewing System Information......................................................................29
4.1.3.Submitting BMC Log Files.........................................................................29
4.1.4.Determining Total Power Consumption......................................................... 29
4.1.5.Accessing the DGX-1 Console.................................................................... 30
4.1.6.Powering Off / Power Cycling the System Remotely.........................................30
4.1.6.1.From the DGX-1 Console Window..........................................................30
4.1.6.2.From the BMC UI............................................................................. 30
4.2.Configuring a Static IP Address for the BMC........................................................31
4.2.1.Configuring a BMC Static IP Address Using ipmitool..........................................31
4.2.2.Configuring a BMC Static IP Address Using the System BIOS................................ 32
4.2.3.Configuring a BMC Static IP Address Using the BMC Dashboard............................ 36
4.3.Configuring Static IP Addresses for the Network Ports............................................37
4.4.Obtaining MAC Addresses.............................................................................. 38
Chapter5.Maintaining and Servicing the NVIDIA DGX-1............................................... 42
5.1.Problem Resolution and Customer Care............................................................. 42
5.2.Restoring the DGX-1 Software Image................................................................ 42
5.2.1.Obtaining the DGX-1 Software ISO Image and Checksum File.............................. 43
5.2.2.Re-Imaging the System Remotely............................................................... 43
5.2.3.Creating a Bootable Installation Medium...................................................... 46
5.2.3.1.Creating a Bootable USB Flash Drive by Using the dd Command......................46
5.2.3.2.Creating a Bootable USB Flash Drive by Using Akeo Rufus............................. 47
5.2.4.Re-Imaging the System From a USB Flash Drive.............................................. 49
5.2.5.Retaining the RAID Partition While Installing the OS.........................................49
5.3.Updating the System BIOS............................................................................. 50
5.4.Updating the BMC....................................................................................... 53
5.5.Replacing the System and Components..............................................................55
5.5.1.Replacing the System............................................................................. 56
5.5.2.Replacing an SSD...................................................................................56
5.5.3.Recreating the Virtual Drives.................................................................... 57
5.5.3.1.Access the BIOS Setup Utility.............................................................. 57
5.5.3.2.Clear the Drive Group Configuration...................................................... 60
5.5.3.3.Recreate the OS Virtual Drive.............................................................. 64
5.5.3.4.Recreate the RAID0 Virtual Drive.......................................................... 72
5.5.4.Recreating the RAID 0 Array..................................................................... 84
5.5.5.Replacing the Power Supplies....................................................................85
5.5.6.Replacing the Fan Module........................................................................ 86
5.5.7.Replacing the DIMMs...............................................................................86
5.5.8.Replacing the InfiniBand Cards.................................................................. 91
5.5.9.Setting Up the InfiniBand Cards.................................................................95
Chapter6.Installing Software on Air-Gapped NVIDIA DGX-1 Systems............................... 99
6.1.Installing NVIDIA DGX-1 Software.....................................................................99
6.1.1.Re-Imaging the System............................................................................99
6.1.2.Creating a Local Mirror of the NVIDIA and Canonical Repositories....................... 100