# Why Every Hypervisor Needs Three Votes for High Availability
I posted about Proxmox requiring three nodes for High Availability and got a reply explaining the split-brain problem. It is worth writing up because this is one of the most misunderstood topics in virtualization, especially for anyone running two-server clusters and calling them “HA.”
The short version: every hypervisor requires three votes to do proper HA. The difference is how each platform provides that third vote.
## The Split-Brain Problem
In a two-node cluster, if the network link between nodes drops, each node sees the other as failed. Both attempt to take ownership of the same VMs. Two hosts writing to the same virtual disks simultaneously causes filesystem corruption and data loss.
The solution is quorum – a majority vote. With three voters, a network partition always produces a 2-vs-1 split. The side with two votes keeps running. The side with one vote shuts down its workloads. With only two voters, a partition produces a 1-vs-1 tie, and no safe decision can be made.
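The majority rule is simple enough to state in a few lines of code. This is an illustrative sketch, not any vendor's implementation:

```python
def has_quorum(votes_held: int, total_votes: int) -> bool:
    """Majority rule: strictly more than half of all votes."""
    return votes_held > total_votes // 2

# Three voters: a partition always splits 2 vs 1.
assert has_quorum(2, 3) is True    # majority side keeps running
assert has_quorum(1, 3) is False   # minority side shuts down

# Two voters: a partition splits 1 vs 1, and neither side can win.
assert has_quorum(1, 2) is False
```

With two voters there is no input for which both safety (never two winners) and availability (always one winner) hold, which is exactly why the third vote is non-negotiable.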
This is not a vendor design choice. It is a constraint from distributed consensus theory (Paxos, Raft, Byzantine Generals). Every HA implementation must solve it.
The key insight that trips people up: three votes, not three servers. Every major platform provides a lightweight “witness” mechanism that casts the third vote without running VMs.
## VMware vSphere / ESXi
vSphere HA uses two monitoring channels: network heartbeats between hosts and datastore heartbeats (I/O written to shared storage). When network communication fails, the primary host checks datastore heartbeats to determine if the other host is actually down or just unreachable.
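The two-channel reasoning can be sketched as a small decision function. This is a simplification for illustration; the function and state names are mine, not VMware's API:

```python
def classify_peer(network_heartbeat: bool, datastore_heartbeat: bool) -> str:
    """Rough sketch of vSphere HA's two-channel host-state decision."""
    if network_heartbeat:
        return "alive"                 # normal operation, nothing to do
    if datastore_heartbeat:
        # Host still writes to shared storage: it is up but unreachable.
        # Restarting its VMs elsewhere would cause split-brain.
        return "network-partitioned"
    # Silent on both channels: treat as failed, restart VMs elsewhere.
    return "failed"

assert classify_peer(False, True) == "network-partitioned"
assert classify_peer(False, False) == "failed"
```

The datastore heartbeat is what lets vSphere distinguish "dead" from "merely unreachable" without a named witness node.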
vCenter Server acts as an external arbitrator in conflict resolution, functionally serving as a third voter.
For vSAN two-node clusters, VMware requires an explicit vSAN Witness Appliance – an ESXi OVA deployed on a separate host outside the cluster. It stores witness components for quorum voting on vSAN objects but holds no actual VM data. A single witness appliance can serve multiple two-node vSAN clusters.
The third voter exists in vSphere, but it is abstracted into vCenter and datastore heartbeats rather than exposed as a named quorum resource.
## Microsoft Hyper-V (WSFC)
Hyper-V HA runs on Windows Server Failover Clustering, which has the most explicit quorum model of any hypervisor platform. Each node gets one vote. The cluster needs more than half the total votes to operate: floor(TotalVotes / 2) + 1.
Microsoft’s documentation is clear: for a two-node cluster, a witness is essential. Without one, losing a single node kills the cluster.
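The arithmetic makes the point. A sketch of the vote math (illustrative, not WSFC code):

```python
def cluster_survives(votes_online: int, total_votes: int) -> bool:
    # WSFC needs more than half of all votes: floor(total / 2) + 1.
    return votes_online >= total_votes // 2 + 1

# Without a witness: 2 total votes, the survivor holds 1 -> cluster down.
assert cluster_survives(1, 2) is False

# With any witness (disk, file share, or cloud): 3 total votes,
# survivor + witness together hold 2 -> cluster stays up.
assert cluster_survives(2, 3) is True
```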
WSFC offers three witness types:
- Disk Witness – A small shared LUN on the SAN (512 MB minimum). The most common choice for two-node Hyper-V clusters with existing SAN infrastructure because the shared storage is already there.
- File Share Witness – An SMB share on a separate server, NAS, or even a router with USB storage. Uses under 100 bytes of space.
- Cloud Witness – An Azure Blob Storage endpoint. Introduced in Server 2016. Ideal for stretched clusters across datacenters where a third physical site is unavailable.
This is why two-server Hyper-V clusters with a SAN “just work” for HA – the SAN provides the disk witness LUN as the third voter. The third vote is there; it is just a small LUN that most admins configure during cluster setup without thinking of it as a separate quorum participant.
## Proxmox VE
Proxmox uses Corosync for cluster communication and quorum. Same majority rule: floor(n/2) + 1. In a two-node cluster, both nodes must be online for quorum. If one fails, the survivor holds 1 of 2 votes, which is not a majority. The HA manager refuses to act, and /etc/pve goes read-only.
Proxmox’s equivalent of a witness is the QDevice (Corosync External Vote Support). It has two components:
- corosync-qnetd – Runs on an external Linux host (not a cluster member). This is the arbitrator.
- corosync-qdevice – Runs on each Proxmox node. Communicates with qnetd over TCP 5403.
During a partition, both nodes contact qnetd. It uses the ffsplit algorithm to assign the tiebreaker vote to one side. That side gets quorum (2 of 3 votes) and keeps running. The other side fences.
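The arbitration can be sketched roughly as follows. This is an illustrative simplification of ffsplit, assuming the documented behavior of breaking an exact 50/50 tie in favor of the partition containing the lowest node ID:

```python
def qnetd_vote(partition_a: set, partition_b: set) -> set:
    """Return the partition that receives qnetd's tiebreaker vote (sketch)."""
    if len(partition_a) != len(partition_b):
        # Unequal split: back the side that already has the node majority.
        return max(partition_a, partition_b, key=len)
    # Exact 50/50 tie: deterministically favor the lowest node ID.
    return min(partition_a, partition_b, key=min)

# Two-node cluster, node IDs 1 and 2, link between them drops:
winner = qnetd_vote({1}, {2})
assert winner == {1}   # node 1's side gets 2 of 3 votes; node 2's side fences
```

The important property is that the decision is made by an arbiter outside both partitions, so both sides always agree on who won.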
The qnetd host has minimal requirements: any Linux box with the corosync-qnetd package and network access to both cluster nodes. For a homelab, a Raspberry Pi works. For an enterprise customer, a dedicated 1U server (Dell R260, HPE DL20 Gen11) running Debian 12 is the appropriate choice.
Unlike Hyper-V, adding a SAN to Proxmox does not automatically provide a third voter. Proxmox handles quorum entirely at the network layer through Corosync. Shared storage provides the data path for VM disks and enables live migration, but it does not participate in quorum voting. A QDevice or third node is still required.
Setup is straightforward:
```shell
# On the external host
apt install corosync-qnetd

# On each Proxmox node
apt install corosync-qdevice

# From any cluster node
pvecm qdevice setup <qdevice-ip>

# Verify
pvecm status
```
The QDevice should appear under Membership Information with 1 vote, and the cluster should show 3 expected votes with a quorum of 2.
## XenServer / XCP-ng
XenServer uses network heartbeats (UDP 694) and storage heartbeats written to a dedicated heartbeat SR (shared iSCSI/NFS/FC LUN, 356 MB minimum). When HA is enabled, hosts that lose quorum self-fence immediately – hard restart, all VMs stopped.
For even-numbered clusters that split exactly in half, XenServer uses a deterministic tiebreaker: the partition containing the node with the lowest cluster ID keeps running, and the other half fences. This is a heuristic, not a true independent third voter.
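The contrast with a QDevice is worth making concrete: here each partition decides locally, with no independent arbiter to consult. A sketch of that self-fencing decision (illustrative only):

```python
def partition_survives(my_partition: set, all_nodes: set) -> bool:
    """Each partition runs this locally after losing contact with the rest."""
    others = all_nodes - my_partition
    if len(my_partition) != len(others):
        return len(my_partition) > len(others)   # ordinary majority rule
    # Exact 50/50 split: survive only if we hold the lowest node ID.
    return min(all_nodes) in my_partition

# Four-host pool {1,2,3,4} splits into {1,2} and {3,4}:
assert partition_survives({1, 2}, {1, 2, 3, 4}) is True    # keeps running
assert partition_survives({3, 4}, {1, 2, 3, 4}) is False   # self-fences
```

Because both halves compute the same deterministic answer, only one side survives, but nothing outside the pool casts a vote, which is why this is a heuristic rather than a true third voter.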
XenServer does not offer a lightweight witness equivalent to Hyper-V’s file share witness or Proxmox’s QDevice. The recommended approach for production HA is three or more hosts in the pool.
## The Bottom Line
| Platform | Third Voter Mechanism | Two-Node HA | Visibility |
|---|---|---|---|
| VMware vSphere | vCenter + datastore heartbeats; vSAN Witness Appliance | Supported (with limitations) | Implicit |
| Hyper-V / WSFC | Disk, File Share, or Cloud Witness | Fully supported | Explicit |
| Proxmox VE | QDevice (corosync-qnetd) | Fully supported | Explicit |
| XenServer / XCP-ng | Heartbeat SR + lowest-ID tiebreaker | Limited; 3+ hosts recommended | Implicit |
Every platform solves the same problem. The difference is whether the third voter is visible to the administrator (Hyper-V and Proxmox) or abstracted into the infrastructure (vSphere and XenServer).
If someone tells you their two-server cluster is “HA,” ask where the third vote is. If they cannot answer, it is not HA – it is a cluster that will lock up or corrupt data the moment a network partition or node failure occurs.
A detailed version of this document with full hardware specifications for a recommended Proxmox two-server + SAN + QDevice configuration is available as a Word document. Contact me if you would like a copy.