HomeLab Stage LXVI: New Monster Cluster

My entire HomeLab has been running fine for several years. Why change or modify it? The answer is very simple: my customers are doing it all the time. I want to be able to test the newest technology, and I always want to be one step ahead of my customers from an experience perspective.

I was able to acquire two Dell R740xd servers from one of my friends 🙂

Hardware Overview

Each server is equipped with dual Intel Xeon Platinum 8168 CPUs with 24 cores each at 2.7 GHz. A lot of power, especially in combination with the 1.5 TB of memory!

1.5 TB DDR4 2666MHz memory per server
Extremely Powerful Machine

Storage (Controllers & Disks)

I swapped the Dell PERC H730P controller for a Dell HBA330 (to fully support vSAN!).

I have fitted each server with 3 x Intel Optane DC4800DC cards for the vSAN cache. I wanted three disk groups in each host so that I can use the nested fault domains feature in my new 2-node vSAN cluster. Each disk group has one 375 GB Intel Optane caching device and 2 x 3.84 TB SAS SSDs.
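To make the disk group layout concrete, here is a minimal Python sketch of the raw capacity arithmetic. The class and function names are made up for illustration; the only inputs are the numbers from the paragraph above. It assumes the usual vSAN OSA behaviour that cache devices do not count toward capacity, and it ignores filesystem and slack-space overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskGroup:
    """One vSAN OSA disk group (hypothetical helper, not a VMware API)."""
    cache_gb: float          # cache device size; not counted as capacity
    capacity_disks: int      # number of capacity SSDs in the group
    capacity_disk_tb: float  # size of each capacity SSD in TB

    @property
    def raw_capacity_tb(self) -> float:
        # Only the capacity tier contributes to raw capacity.
        return self.capacity_disks * self.capacity_disk_tb


def host_raw_capacity_tb(groups) -> float:
    """Sum the raw capacity of all disk groups in one host."""
    return sum(g.raw_capacity_tb for g in groups)


# Values from the text: 1 x 375 GB Optane + 2 x 3.84 TB SSD per disk group,
# three disk groups per host, two hosts.
group = DiskGroup(cache_gb=375, capacity_disks=2, capacity_disk_tb=3.84)
per_host_tb = host_raw_capacity_tb([group] * 3)
cluster_tb = per_host_tb * 2

print(f"{per_host_tb:.2f} TB raw per host, {cluster_tb:.2f} TB raw in the cluster")
```

With all three disk groups in place this works out to roughly 23 TB raw per host and 46 TB raw across both nodes.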

The server itself boots from a 200GB SSD. Each server has a 1.6TB SSD for vSAN Direct K8S stuff and 2 x 7.6 TB SSDs for the Dell Data Domain VMs.

7.6 TB SSDs for the Dell DataDomain (DDVE) VMs

Network Connections

The onboard network card (4 x copper) has also been replaced with the 4 x 10GbE SFP+ version.

The vSAN network and the vMotion network both use a dual-port 100GbE NIC (Mellanox ConnectX-5).

4 x 10GbE for “normal traffic”, 2 x 100GbE for vMotion / vSAN and 2 x 25GbE (DPU) planned for NSX

GPUs

But what about GPUs? Yes, of course! Each server is configured with an Nvidia RTX8000: 48 GB of frame buffer, fully supported for vGPU and Nvidia AI Enterprise.

I am using the active version of the card, so I am able to disable the fan management inside the server for that slot. That helped a lot in making the machine quiet.

Dell iDRAC Cooling Configuration
Custom PCIe Cooling Config (Slot 1 LFM is disabled, because of the active version of the RTX8000 GPU)
vSphere 7.0.3 with Nvidia vGPU

vGPU Config for the Nvidia RTX8000 (Active)

Shared Direct Configuration to support Nvidia vGPU

vGPU profiles and hardware overview

48GB of Frame Buffer available
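As a rough illustration of what the 48 GB frame buffer means for vGPU sizing, the following Python sketch computes how many vGPUs of a given profile size fit on one card. It assumes that all vGPUs on the card use the same profile size and that a profile simply reserves its full frame buffer. The profile sizes in the loop are illustrative values only, so check the Nvidia vGPU documentation for the profiles actually offered for the RTX8000.

```python
FRAME_BUFFER_GB = 48  # frame buffer per RTX8000, as stated in the text


def vgpus_per_card(profile_gb: int, frame_buffer_gb: int = FRAME_BUFFER_GB) -> int:
    """Number of same-sized vGPUs that fit into one card's frame buffer."""
    if profile_gb <= 0:
        raise ValueError("profile size must be positive")
    return frame_buffer_gb // profile_gb


# Illustrative profile sizes in GB (assumption, not taken from the post).
for size_gb in (4, 8, 12, 24, 48):
    print(f"{size_gb:>2} GB profile -> {vgpus_per_card(size_gb)} vGPU(s) per card")
```

With one card per server, the cluster-wide number is simply twice the per-card value.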

vSAN Configuration

Both servers form a 2-node vSAN cluster with a 100GbE network connection.

vSAN Disk Management

vSAN disk groups for “normal OSA vSAN” and the vSAN Direct configuration. (I am still waiting for 4 additional 3.84 TB SSDs to set up the third disk group.)

OSA Diskgroups & vSAN Direct disks
vSAN Direct Configuration

vSAN Fault Domains (for the 2 Node config)

2 x Witness Appliances (for faster recovery in case of a failure; only “Change witness host” is needed)

vSAN Policies

Different vSAN Policies for the cluster
Striping 2 as the new default

Once my 4 additional SSDs have arrived, the third disk group will be created so that I can use the new vSAN nested fault domains feature. Double redundancy (data factor 4x) within a 2-node cluster!
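The 4x data factor can be sketched in a few lines of Python. This assumes RAID-1 mirroring at both levels, which is what the 4x figure implies: one mirror across the two hosts and a second mirror across disk groups inside each host. The function name is hypothetical, and the calculation ignores slack space, witness components and other overhead.

```python
def usable_capacity_tb(raw_tb: float, host_copies: int = 2, in_host_copies: int = 2) -> float:
    """Usable capacity when every object is stored host_copies * in_host_copies times."""
    data_factor = host_copies * in_host_copies  # 2 x 2 = 4 in this setup
    return raw_tb / data_factor


# Raw capacity from the text: 2 hosts x 3 disk groups x 2 SSDs x 3.84 TB.
raw_tb = 2 * 3 * 2 * 3.84
usable_tb = usable_capacity_tb(raw_tb)

print(f"raw: {raw_tb:.2f} TB, usable at 4x: {usable_tb:.2f} TB")
```

So the price of surviving a host failure plus a disk group failure in the remaining host is that only about a quarter of the raw capacity holds unique data.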

DPU

DPU support, the brand-new feature of vSphere 8.0, is also planned for my environment. I am currently trying to get 2 x Nvidia BlueField-2 NICs (DPUs) for my HomeLab. Officially only the Dell R750 servers are supported, but you know me... I will get it working inside my machines!

Nvidia Networking BlueField-2 DPU

There are several different hardware options available from Nvidia, but only very few of them will work as VMware DPUs: only P-Core models, only 32 GB DDR4 models, etc.

vDS Configuration for my setup:

The extra DPU vDS is prepared and the DPUs are ordered. I will keep you guys updated when everything is up and running.

Current Rack View

Front View
Rear View

Dell Features

Dell has replaced OMIVV with the new OMEVV integration. It is based on OpenManage Enterprise and operates as a plugin. New licenses are required on your servers to use OMEVV as well as the Power Manager plugin within OpenManage: you need the Advanced+ license! I will create a blog post about OMEVV and Power Manager in the future.

OpenManage Enterprise with Advanced+ license is needed for OMEVV

Stay tuned for the next episodes of my HomeLab journey….

#HomeLabKing