Friday, December 01, 2017

vSphere Switch Independent Teaming or LACP?

I have answered this question lot of times during the last couple of years, thus I have finally decided to write a blog post on this topic. Unfortunately, the answer always depends on specific factors (requirements and constraints) for the particular environment so do not expect the short answer. Instead of the simple answer, I will do the comparison of LBT and LACP.

I assume you (my reader) is familiar with LACP but do you know what LBT is? If not, here is the short explanation.
VMware LBT (load based teaming) is advanced switch independent teaming available on VMware DVS which pin each VM vNIC to particular physical uplink in roud robin fasion but if the network traffic of particular physical NIC is higher then 75% of total bandwidth over 30 seconds it will initiate rebalancing across available physical uplinks (physical NICs of ESXi host) to avoid network congestion on particular uplink. 
If you are not familiar with basic VMware vSphere networking read my previous blog post "Back to the basics - VMware vSphere networking" before continuing.

What we are doing is the comparison of switch independent teaming and LACP. LACP is the capability of VMware Distributed Virtual Switch (VDS), therefore, I would assume you are on vSphere Enterprise Plus license and having VDS. When you have VDS then I would have another assumption, that you are already considering LBT as it is the best choice for switch independent teaming algorithms available on VDS.

LBT versus LACP comparison

Option 1: Switch Independent Teaming (LBT - Load Based Teaming)
Option 2: LACP

LBT advantages
  • Fully independent on upstream physical switches
  • Simple configuration
  • Beacon probing can be used.  Note, that beacon probing requires at least 3 physical NICs.
LBT disadvantages
  • Single VM cannot handle higher traffic then is the bandwidth of single physical NIC.
LACP advantages
  • One of the main LACP advantages is continuous heartbeat between two sides of the link (ESXi physical NIC port and switch port). VMware's LACP is sending LACPDUs every 30 seconds but it can be reconfigured to fast mode when LACPDUs are exchanged every 1 second. This improves failover in case of link failure and also helps when link status (up/down) do not work well.
  • Single VM can, in theory, handle higher traffic then single physical NIC because of load-balancing algorithm. 
LACP disadvantages
  • ESXi Network Dump Collector does not work if the Management vmkernel port has been configured to use EtherChannel/LACP
  • VMware vSphere beacon probing cannot be used

So which option is better? Well, it depends.

When you do not have direct or indirect control of physical network infrastructure then switch independent teaming is generally much simpler and safer solution, therefore LBT is a better choice. 

In case, you trust your network vendor LACP implementation and you have some control or trust your physical switch configuration LACP is better choice becaouse of LACPDU heart beating and multiple load-balancing hash algorithms which can, in theory, improve network bandwidth for single VM network traffic. Another advantage is that LACP works better with multi-chassis LAG (MLAG) technologies like Cisco vPC, Dell Force10 VLT, Arista MLAG, etc. If this is   

Other FAQ related to LBT and LACP comparison

Q: VMware's LACP is sending LACPDUs every 30 seconds. Is there any way how to configure LACPDU frequency to 1 second?

A: Yes.

You can use command "esxcli network vswitch dvs vmware lacp timeout set". It allows set advanced timeout settings for LACP

set ... Set long/short timeout for vmnics in one LACP LAG
Cmd options:
-l|--lag-id= The ID of LAG to be configured. (required)
-n|--nic-name= The nic name. If it is set, then only this vmnic in the lag will be configured.
-t|--timeout Set long or short timeout: 1 for short timeout and 0 for long timeout. (required)
-s|--vds= The name of VDS. (required)

Relevant blog post on this topic "VMware vSphere DVS LACP timers".

Q: Does ESXi has a possibility to display LACP settings of established LACP session in particular ESXi host? Something like "show lacp" on Cisco switch?

A: Yes. You can use command "esxcli network vswitch dvs vmware lacp status get". It should be equivalent to "show lacp" on Cisco physical switch

Q: How VMware vSwitch Beacon Probing works?
A: Read following blog posts
Q: What is Beacon Probing interval?
A: 1 second

Q: Is ESXi beacon probing send beacons to every VLAN?
A: Yes, but only to VLANs (portgroups) where at least one VM is connected. It does not make sense to test failure on VLANs where nothing is connected.

No comments: