Broadcom Confidential NetXtreme-E-UG600
NetXtreme-E User Guide Tuning Guide for AMD EPYC 7002 Series on Linux
Table of Contents
1 Introduction .... 5
2 AMD EPYC 7002 Series Microarchitecture .... 5
2.1 Core Cache Dies (CCD) and Core Cache Complexes (CCX) .... 5
2.2 NUMA and NUMA Per Socket (NPS) .... 6
2.2.1 NPS=1 .... 6
2.2.2 NPS=2 .... 6
2.2.3 NPS=4 .... 7
2.3 Memory Optimizations .... 7
2.3.1 Platforms Specifically Designed for AMD EPYC 7002 .... 7
3 BIOS Tuning .... 8
3.1 NPS (NUMA Per Socket) .... 8
3.2 X2APIC .... 9
3.3 Determinism Control and Determinism Slider .... 10
3.4 APBDIS .... 11
3.5 Preferred I/O and Enhanced Preferred I/O .... 12
3.6 PCIe Ten Bit Tag .... 13
3.7 Memory Clock Speed .... 14
3.8 L3 LLC (Last Level Cache) as NUMA .... 15
3.9 Socket/Inter-Chip Global Memory Interconnect (xGMI) .... 16
4 TCP Performance Tuning .... 17
4.1 BIOS Tuning .... 17
4.2 NIC Tuning .... 17
4.2.1 NUMA: Local vs. Non Local .... 17
4.2.2 Configuring Queues .... 18
4.2.3 Configure IRQ and Application Affinity .... 19
4.2.4 TX and RX Flow Steering .... 19
4.2.5 TX/RX Queue Size .... 20
4.2.6 Interrupt Moderation .... 20
4.2.7 GRO (Generic Receive Offload) .... 20
4.2.8 TX-NoCache-Copy .... 21
4.2.9 Relaxed Ordering .... 21
4.2.10 PCIe MRRS (Maximum Read Request Size) .... 22
4.3 OS Tuning (Linux) .... 22
4.3.1 IOMMU .... 22
4.3.2 Performance Governor .... 23
4.3.3 TCP Memory Configuration .... 23
4.3.4 nohz=off .... 23
4.3.5 TCP Example with the BCM957508-P2100G .... 24