Both the Xeon and the i7 have VT-x with extended page tables, plus VT-d. The Xeon can also use ECC RAM (it needs a motherboard that supports ECC and the ECC RAM itself), and it has vPro, which adds some security features plus remote/local monitoring and workstation access. The Xeon also has an additional OS Guard capability. As for physical differences, the 5820K only supports 28 PCIe lanes and up to 64GB of RAM. The Xeon supports 40 PCIe lanes and up to 768GB of RAM, though the motherboard it's paired with may limit that. As far as I know, most 2011-v3 boards support either 64GB or 128GB of RAM (there are both). The Xeon is also a tad faster. It costs around $200 more than the 5820K, but for a CPU under $600 it offers quite a bit.
Obviously vCPUs don't require a full 1:1 ratio with physical cores, and you might be able to run 40 VMs on a quad core depending on how heavily they're used and at what times. You'd be limited to 32GB of RAM on the mainstream desktop platform, though. The E5-1650 v3 is about the least expensive 6-core Xeon, aside from a couple of low-power versions running at 2GHz and below.
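As a rough illustration of that sizing arithmetic, here's a minimal Python sketch. The one-vCPU-per-VM figure and the 4GB host reserve are assumptions I picked for the example, not numbers from any vendor guidance, and the function names are made up for illustration.

```python
def vcpu_ratio(vm_count, vcpus_per_vm, physical_cores):
    """vCPUs scheduled per physical core (the overcommit ratio)."""
    return vm_count * vcpus_per_vm / physical_cores


def ram_per_vm_gb(platform_max_gb, vm_count, host_reserve_gb=4):
    """Average RAM left per VM after setting some aside for the host.

    host_reserve_gb is an assumed placeholder, not a recommended value.
    """
    return (platform_max_gb - host_reserve_gb) / vm_count


if __name__ == "__main__":
    # 40 single-vCPU VMs (assumed) on a quad core vs. a six core
    for cores in (4, 6):
        print(f"{cores} cores: {vcpu_ratio(40, 1, cores):.1f} vCPUs per core")

    # RAM ceilings: 32GB on the mainstream platform, 64GB or 128GB on 2011-v3 boards
    for max_gb in (32, 64, 128):
        print(f"{max_gb}GB max: {ram_per_vm_gb(max_gb, 40):.2f}GB per VM")
```

Whether a given ratio is workable still depends on how busy the VMs are and when, so treat the output as a starting point rather than a verdict.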
I think it was the older K-series chips that had issues with VT-d, not the Haswells.
Even the similarly priced i7 5930K maxes out at 64GB of RAM, so if you want the option to add more in the future on a board that supports 128GB, the Xeon may be the better choice for that alone. I'm sure you'd also want the additional security options on a full-time 24/7 machine. Considering what you're looking to do, it's really not that much more expensive than a 4790K i7 quad on the RAM-limited mainstream 1150 platform. I think the 6 core would be a bit more flexible in case it comes under heavier use. HT would probably help some, though many don't count hyper-threaded logical cores as cores when sizing for VMware.
"Hyper-Threading is an Intel-specific technology that lets a single core process two separate instructions in parallel (called pipelines). Neat, right? One problem: the pipelines run in lockstep. If the instruction in pipeline one finishes before the thread in pipeline two, pipeline one sits and does nothing. But, that second pipeline shows up as another core. So the question is, do you count it? As far as I know, the official response is: No, Hyper-Threading should not be counted toward physical cores when considering hypervisor processing capabilities."
http://www.altaro.com/hyper-v/hyper-v-virtual-cpus-explained/
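To put numbers on the "do you count it?" question, here's a quick Python sketch comparing the two ways of counting on a 6-core/12-thread chip. The 24 vCPU total is just a number I picked for illustration, and `overcommit` is a made-up helper, not anything from a hypervisor's tooling.

```python
def overcommit(total_vcpus, physical_cores, threads_per_core=1):
    """vCPUs per schedulable CPU.

    threads_per_core=1 counts physical cores only (the conservative view);
    threads_per_core=2 counts each Hyper-Threading thread as a core.
    """
    return total_vcpus / (physical_cores * threads_per_core)


TOTAL_VCPUS = 24  # hypothetical workload, assumed for the example

conservative = overcommit(TOTAL_VCPUS, physical_cores=6)
optimistic = overcommit(TOTAL_VCPUS, physical_cores=6, threads_per_core=2)

print(f"physical cores only: {conservative:.1f} vCPUs per core")
print(f"counting HT threads: {optimistic:.1f} vCPUs per core")
```

Planning against the physical-core figure leaves whatever HT actually delivers as headroom instead of capacity you're relying on.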