Overview of iSCSI MPIO Recommendations & Best Practice on Windows Server

System Requirements:

  • Windows Server 2008 Storage Server
  • Windows Server 2008 R2 Storage Server
  • Windows Server 2012, 2012 R2, 2016

The Problem:

I needed to outline some of the general thinking on how a practitioner should logically and physically understand MPIO; however, most of the discourse on the subject skips a fair number of the obvious questions that people starting out with the technology may be asking (or trying to answer). I therefore present some thoughts on understanding MPIO optimisation and best practice for iSCSI.

The information presented in this document is intended for those who are new to the concept of iSCSI and MPIO and is not intended to be product specific.

More Info

Multi-Path Input/Output or MPIO is a server technology that usually sits on the storage side of load balancing, failover and aggregation technologies. If you are getting into SAS, iSCSI or Enterprise RAID solutions, where it is most commonly encountered, then this may (or may not) help you with understanding what MPIO is and why it (possibly) isn’t what you think it is!

The document is written from the perspective of an iSCSI user where it can be conceptually a little harder for new users to understand the best way to approach MPIO.

Logically understanding what MPIO is all about

So you have 2x1Gbps ports in a MPIO team, that means you’ve got a 2Gbps link right? Wrong. That isn’t what is going on with MPIO.

MPIO (and in fact pretty much the majority of balancing and aggregation technologies) doesn’t double the speed, but it does roughly double the bandwidth available to the system. Confused? Think of it like this:

You own a car. The car has a top speed of 70mph and not one mph more. You get on a one way, single track road in a country where there are no speed limits. You are now happily driving along at 70mph. Some bright spark at your local council decides that you should be able to drive at 140mph, so they cut down the trees on one side of the road and add a second one-way carriage way, going in the same direction as the first.

Can your car now drive at 140mph because of the new lane? No. The public official is wrong. Your engine can only offer you 70mph. The extra lane doesn’t help you, but it does help the guy in the car next to you also driving at 70mph arrive at the other end of the road at the same time. It also means that when you encounter a tractor ambling along in your lane, you have somewhere else to go without slowing down.

This is fundamentally what MPIO is doing. So why isn’t it a 2Gbps link? Basically, because networking is a serial communications medium, and by adding a second lane and calling it a faster way to get data to the end of it, you get into the different world of parallel communications. Under parallel communications you have to split (fragment) information into smaller pieces and push it down each of the wires to the destination. This in turn implies the need for more complicated buffer/caching designs to store information as part of a strategy that can cope with each section of the data arriving at a different time, arriving all at once, arriving in a different order than intended or, of course, not arriving at all (a problem known as clock skew).

To fix this, you need to introduce overhead: either synchronise delivery to make it reliable (thus slowing it down and reducing error tolerance), or add mechanisms designed to deconstruct, sequence, wait for or re-request missing or corrupt data sections and track timing – all things that you really don’t want in an iSCSI or SAS environment where response time (latency) is king. Consequently, there is a diminishing return on how much of this parallel working you can derive a benefit from in any system, including an MPIO system. iSCSI MPIO, if correctly configured, will offer something at around the boundary between worthwhile and not bothering in the first place. It is important to understand that it will not be a 100% increase in performance, nor will it likely be a 50% increase, but more realistically something around the 30-40% mark.

Performance is only one of the intended design considerations for MPIO, and even then it is not the primary consideration. The primary consideration is fault tolerance and reliability.

In a correctly designed iSCSI system, independent NICs are connected to more than one switch, and usually to more than one controller on the storage side and more than one server on the host side. If one of these fails, in a correctly implemented system, your production service probably won’t even notice. You can even be so bold as to perform live switch re-wiring on iSCSI systems without impacting the client services involved – although it should be stressed that this is for bragging rights and in practice should not be attempted.

To summarise, MPIO allows you to get twice (+) as much data down to the end of the link, but you cannot get it there any faster. In general, if you can avoid using fragmented streams, you will reap the maximum benefit. The obvious approach here is that each “lane” should be using unrelated data: instead of carving up a single video file and pushing little bits of it down each lane one bit at a time (MPIO can do this), one lane is used for the video and the second lane is used for literally anything and everything else. This is a simplification of what MPIO generally does, however in practice it offers a good way to get your head around it.

Techniques

So how does MPIO carve up the traffic?

There are, broadly speaking, 4 different paradigms for carving up MPIO traffic (a sketch of how these map onto Windows load balance policies follows the list):

Failover/Redundant – In this mode, one link is active while the other is passive, i.e. up but not doing anything. If the first link fails, the second path takes over and all existing traffic streams continue to receive the same bandwidth (% of the total available pie) on the same terms as before. This would give us a completely separate road that can only be used in emergencies. It may not be as fast or robust, or it may be identically spec’d and just as capable. A failover design may or may not return traffic to the first channel once it becomes available again.
Round Robin – This mode alternates traffic between channel 1 and channel 2, then goes back to channel 1, channel 2 and so on. Both links are active and both receive traffic, with a slight skew as the data is de-queued at the sender. This gives us the 2 lane analogy used above, with each 70mph car getting to the end at roughly the same time.
Least Queue Depth – This puts the traffic onto the channel that has the least amount on it (or, more accurately, about to go onto it). If one channel is busier than the other (e.g. the large video file), then it will put other traffic down the second channel, allowing the video to transfer without needing to slow down to allow new traffic to join, which would delay its delivery. There are many different algorithms for how this is achieved, including varieties that use hashing to offer clients consistent paths based upon Layer 2 or Layer 3 addresses.
Path Weighting – Weighted path and least blocking methods assess the state/capabilities of the channels. This is more useful if there are lots of hops between source and destination, multiple routes to a destination, or different channels with different capabilities. For example, if you have iSCSI running through a routed network, there could be multiple ways for it to get there. One route may go through 5 routers and another 18 routers. Generally, the 5 router path might be preferable, provided the lower hop route genuinely gets the data there faster. Equally, the weighting could be based upon the speed of the path through to the recipient or, finally, if channel 1 is 10Gbps and channel 2 is only 1Gbps, then you might prefer the 10Gbps path to be used with a higher preference. Usually, a lower weighting number means a higher preference. This would be the equivalent of a 70mph road with a backup road with a max speed of 50mph. You know that it will get you to the destination, but you can guarantee that if you have to use it, it will take longer.
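
On Windows, the in-box DSM (MSDSM) exposes these paradigms as selectable load balance policies. The following is a minimal sketch only, assuming Windows Server 2012 or later with the in-box MSDSM; vendor DSMs such as EqualLogic HIT manage the policy themselves and will override or ignore these defaults.

# Minimal sketch, assuming the Windows in-box MSDSM (vendor DSMs apply their own policies).
# Show the current MSDSM-wide default load balance policy.
Get-MSDSMGlobalDefaultLoadBalancePolicy

# Set the default policy: FOO = Fail Over Only (failover/redundant), RR = Round Robin,
# LQD = Least Queue Depth. Weighted paths need per-path weights assigning (for example
# via mpclaim), so they are not covered by this global default.
Set-MSDSMGlobalDefaultLoadBalancePolicy -Policy LQD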

So, more lanes equal more stuff then?

Sounds simple doesn’t it? Just keep throwing lanes into the road and then everyone gets to travel smoothly at 70mph.

In principle, it is a nice idea, but in practice it doesn’t actually work in most iSCSI implementations.

For starters, server grade network cards (which you should be using for MPIO, not client adapters) are expensive and server backplanes can only accept a finite number of them. Server NICs also consume power, and power costs money! Keep that in mind if you do decide to throw extra ports at an iSCSI solution.

The reality is that if you have an MPIO solution that will allow you to experiment with more than 2 NIC adapters in an MPIO group, you will likely see the performance gain rapidly tail off. In turn it will actually wind up presenting you with steadily worsening performance, not the increase that you are expecting.

Attempting to MPIO iSCSI traffic across 4x 1Gbps NICs actually offers worse read and write speeds for a virtual machine than 2x 1Gbps under a Hyper-V environment (see tests E and F below). The system starts to waste so much time trying to break apart and put back together each lane’s worth of traffic that it just doesn’t help the hypervisor.

Where a 4 NIC configuration is beneficial is actually in providing you with a “RAID 6” MPIO solution. Here you can have 2 active and 2 passive adapters – remember, in an idealised scenario they could be 2x 10GbE and 2x 1GbE with a hard-coded preference for the 10GbE and a method of failing traffic back to the 10GbE. Just be aware that you can only use the 10GbE set OR the 1GbE set at the same time, not one port from each. The exception to this rule is for hashing based channel assignment, as these offer more paths to “permanently” assign data into, without the overhead of path swapping or de-fragmentation of traffic.

Some DSM’s (effectively a OEM specific MPIO driver under Windows, such as Dell Host Integration Tools [HIT] or NetApp Host Utilities) logically limit a MPIO to two active NIC’s if the storage controller is only exposing 2 usable NICs back to the HIT instance. Dell EqualLogic Host Integration Tools (the EqualLogic DSM) will grab the first two paths it finds and shutdown any others into a passive state, no matter how hard you try to start them up.

What should an MPIO network “look” like?

Ultimately this is down to what you want to get out of the MPIO solution and within the bounds of what your hardware vendor will support.

There are effectively three schools of thought here (I won’t comment on which is right because as you’ll see, it isn’t that simple)

MPIO is about Meshing

If you see MPIO as a mesh, then 2 NICs in a server connecting to 2 NICs in a storage appliance equals a mesh where each server NIC has a path to each storage NIC. This is more aligned with how you probably already think about Ethernet networks.

MPIO is about Pathing

If you see MPIO in this model, it is simply about more than one line being drawn between two different end points, with no line crossing or adding any complexity, complication or confusion. This is more aligned with how you likely currently think about SAS, Fibre Channel and hard drive wiring.

MPIO is about Redundancy

This is the purest of the three views. It sees the complexity and overheads associated with MPIO as being a problem – there will always be some sort of increase in latency and a drop in some aspect of performance when trying to squeeze more bandwidth out of MPIO. This view attempts to keep the design simple, run everything at unimpeded wire speed, but maintain the failover functionality afforded by MPIO.

The three schools of thought are outlined in the diagram below.

Why not Meshing?

When you start out with MPIO, you may be tempted towards implementing option 1. After all, your Server NICs (circles) are likely connected to a switch, as is your storage array (squares). The switch allows you to design to this topology and, if you allow the MPIO system to have knowledge of all possible permutations of connectivity, the system will be highly redundant, making it very robust.

Yes and no! Yes, it is very robust, but at this point in your implementation, how do you know which path traffic is taking? How do you know that it is optimised? What is stopping Server NIC1 and Server NIC2 from both talking to Storage NIC1 at the same time? If they do that, then they have to share 1Gbps of bandwidth between them while Storage NIC2 is left idle. Suddenly all of your services will have intermittent bursts of speed and infuriating drops in performance. The more server NICs that you add, the faster the decrease in performance will be. With 4 Server NICs, there is nothing to stop the MPIO load balancer from intermittently pushing the data from all 4 Server NICs towards a single Storage NIC.

In a Round Robin setup with a full Mesh design (as shown in #1), the system will likely order the RR rotation in the order that you gave it access to the paths. Given the following IP addresses:

Server: 192.168.0.1, 192.168.0.2
Storage: 192.168.0.11, 192.168.0.12

The RR table could look like this:

  1. 192.168.0.1 -> 192.168.0.11
  2. 192.168.0.2 -> 192.168.0.11
  3. 192.168.0.1 -> 192.168.0.12
  4. 192.168.0.2 -> 192.168.0.12

Or it could look like this:

  1. 192.168.0.1 -> 192.168.0.11
  2. 192.168.0.1 -> 192.168.0.12
  3. 192.168.0.2 -> 192.168.0.11
  4. 192.168.0.2 -> 192.168.0.12

In both examples you either have two different sets of traffic being sent from the same Server NIC concurrently or received by the same Storage NIC concurrently. This is going to undermine performance, not improve it (this is outlined in Mbps terms in the tests shown later in this document).

In a failure situation, the performance issue is exacerbated:

  • If a link in #3 fails, then nothing changes in performance or bandwidth.
  • If a link in #2 fails, then the total bandwidth available to the system halves and all services contend for the first link.
  • If a link in #1 fails, then as with #2, all services suffer with contended bandwidth; however, the system also has the overhead of MPIO to further reduce performance.

What benefit is there to MPIO operating in scenario #1? In this failed state, should one of the Storage NICs also fail, the system will continue to operate. In #2, if the working Storage NIC fails, the entire system will fail despite the fact that the Storage NIC on the second path is actually working. It is up to you and your design as to whether you think the performance hit that you will experience is worth this extra safeguard. In a highly secure, mission critical or safety system, it may be worth the extra overhead.

There are however some middleware layers that can manage this for you. Dell Host Integration Tools (HIT) does, for example, attempt to undertake some management of these types of situations, optimising the mesh by putting the links that will cause overhead into a failover-only state while maintaining the optimal number of active mesh links. In my experience though, the HIT solution is not able to manage this perfectly. It does not take redundant NIC controllers into consideration. For example, if you have 2 physical dual port NICs in your server with the intention of one port from each NIC making up the active “pair”, Dell HIT is not able to detect, or be programmed to ensure, that the active paths are prioritised so that the correct controller is used. In my experience, it will tend to bunch them together onto the same physical NIC controller, leaving the second controller idle.

Fixing this problem requires an additional layer of complex, expensive and usually proprietary middleware logic, further impacting performance and increasing cost. Therefore, industry best practice is to avoid thinking of iSCSI MPIO as being a Full or even a Partial Mesh, but instead think of it as offering independent channels akin to those shown in #2. It is for this reason that virtually all iSCSI MPIO vendors insist that each Server -> Storage NIC pair exist on its own logical IP subnet as this completely negates the possibility of interweaving the MPIO paths while also ensuring that any subnet-local issue (such as a broadcast or unicast storm) is only likely to take down one of the subnets, not both.
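
To illustrate the point, this is roughly what a 1:1, one-subnet-per-path layout looks like when driven with the Microsoft iSCSI initiator cmdlets. It is a sketch only: the interface names, IP addresses and the 192.168.10.x/192.168.20.x subnets are placeholder assumptions, and a vendor DSM may well do this work for you.

# Sketch only: interface names, addresses and subnets below are placeholders.
# Server NIC "iSCSI1" pairs with storage port 192.168.10.100 on its own subnet;
# Server NIC "iSCSI2" pairs with storage port 192.168.20.100 on a second subnet.
New-NetIPAddress -InterfaceAlias "iSCSI1" -IPAddress 192.168.10.10 -PrefixLength 24
New-NetIPAddress -InterfaceAlias "iSCSI2" -IPAddress 192.168.20.10 -PrefixLength 24

# Register each storage portal against the matching initiator address...
New-IscsiTargetPortal -TargetPortalAddress 192.168.10.100 -InitiatorPortalAddress 192.168.10.10
New-IscsiTargetPortal -TargetPortalAddress 192.168.20.100 -InitiatorPortalAddress 192.168.20.10

# ...then connect one session per path, leaving MPIO to manage the pair.
Get-IscsiTarget | Connect-IscsiTarget -IsMultipathEnabled $true -IsPersistent $true `
    -TargetPortalAddress 192.168.10.100 -InitiatorPortalAddress 192.168.10.10
Get-IscsiTarget | Connect-IscsiTarget -IsMultipathEnabled $true -IsPersistent $true `
    -TargetPortalAddress 192.168.20.100 -InitiatorPortalAddress 192.168.20.10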

iSCSI as part of a Virtual Network Adapter, Converged Fabric LBFO Team

Since the release of Windows Server 2012, Microsoft has allowed the idea of using iSCSI through Converged Fabric* Load Balancing Failover (LBFO) teams to be hinted at — as long as the iSCSI NICs are virtual and they connect through a Hyper-V VM Switch which itself backs onto a Windows Server LBFO team. Even the venerable Aidan Finn has hinted at it. I have, however, never seen a discussion of it being attempted online, nor have I ever seen it benchmarked.

To be clear about what we are talking about when I say a Virtualised, Converged Fabric, LBFO Team (a PowerShell sketch of the build follows this list):

  1. 4x 1Gbps Ethernet physical adapters
  2. Grouped into a Windows Server 2016 LBFO Team, appearing to Windows as a single logical network adapter called “ConvergedNIC”
  3. “ConvergedNIC” is connected to an External Virtual Switch called “ConvergedSwitch”
  4. A Virtual Machine Network Adapter is created on the Hypervisor’s Parent Partition (ManagementOS) and this is assigned to the correct VLAN, given an IP address and hooked up to the iSCSI Target
  5. 4 physical NICs, no MPIO, 1 logical NIC
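
For reference, a build along those lines might be scripted roughly as follows. This is a sketch only: the physical adapter names, VLAN ID, bandwidth weight and IP address are placeholder assumptions, while the “ConvergedNIC” and “ConvergedSwitch” names come from the list above.

# Sketch only: physical NIC names, VLAN ID, weight and IP address are placeholders.
New-NetLbfoTeam -Name "ConvergedNIC" -TeamMembers "NIC1","NIC2","NIC3","NIC4" `
    -TeamingMode SwitchIndependent -LoadBalancingAlgorithm Dynamic

# Weight-based QoS mode so that iSCSI can be guaranteed a minimum share of the team.
New-VMSwitch -Name "ConvergedSwitch" -NetAdapterName "ConvergedNIC" `
    -MinimumBandwidthMode Weight -AllowManagementOS $false

# Virtual NIC in the parent partition (ManagementOS) for iSCSI, tagged and weighted.
Add-VMNetworkAdapter -ManagementOS -Name "iSCSI" -SwitchName "ConvergedSwitch"
Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName "iSCSI" -Access -VlanId 100
Set-VMNetworkAdapter -ManagementOS -Name "iSCSI" -MinimumBandwidthWeight 40

# Address the vNIC and hook it up to the iSCSI target as normal.
New-NetIPAddress -InterfaceAlias "vEthernet (iSCSI)" -IPAddress 192.168.10.10 -PrefixLength 24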

So, does it work?

Yes! It does work and it appears to be stable and even usable, but with some sacrifice in performance (keep reading for the benchmark numbers under “test A” below). I have, however, had test VMs running under this design for nearly a year without any perceivable issues in either VM or hypervisor stability.

* If you are not familiar with the Concept of a Converged Fabric: A Converged Fabric is a data centre architecture model in which the concept of 1 NIC = 1 Network/Subnet/VLAN/Traffic Type is abandoned. Instead, NICs are usually pooled together into Teams with multiple traffic types, Networks, Subnets and VLANs being allowed to use any of the available bandwidth within the team. Quality of Service (QoS) algorithms are used to ensure that priority traffic types are defined (such as iSCSI in this example), ensuring that the iSCSI system is never starved for bandwidth by someone performing a large file transfer across the team. A Converged Fabric architecture is considered to be more efficient, lower cost and offer better failover reliability than traditional methods in which entire 1GbE or 10GbE NICs could be left idle, waiting for traffic that while high bandwidth, may be infrequent. A Converged Fabric architecture allows other users/systems to benefit from the available bandwidth when not needed by its primary application. It can also offer the primary application additional bandwidth in some situations.

If you have an 8 NIC Hypervisor setup with 2 physical iSCSI NICs, 2 physical production network NICs, 1 physical heartbeat NIC, 1 physical live migration NIC, 1 management network NIC and 1 out of band management NIC, then you are paying to power NICs 4-8 while deriving very little benefit from them due to how infrequently they are used. If this sounds familiar to you, then you should consider migrating to a Converged Fabric design.

Quantifying Best Practice

So far, this article has discussed MPIO, meshing, pathing and redundancy as well as a quick detour into using converged fabric LBFO for iSCSI connections. So let’s look at some numbers that underpin these approaches.

Tests were undertaken using the following hardware configuration:

  • Dell EqualLogic PS4110x running firmware 9.1.1 R436216, with 2 active 1GbE NICs on a single controller
  • Dell PowerEdge P630 with 8x1GbE adapters (4x Broadcom NetXtreme and 4x Intel I350 adapters) with 9K Jumbo Frames correctly enabled
  • Windows Server 2016
  • Switching on Cat6a cabling via 2x Cisco Catalyst 2960-48’s
  • The 64K block, GPT formatted, 3TB target LUN was setup as a CSV and the nodes were in a Cluster with a second identical node idling as a second cluster member (CSV-FS has a natural performance hit compared to NTFS)

7 tests were performed as outlined in the following table

Test | Description | Physical Paths (Active) | Physical Paths (Passive) | Active NICs (Intel) | Active NICs (Broadcom) | LBFO Team | Dell HIT | MPIO Mode
A | 4 NICs in LBFO Team, no MPIO | 4 | 0 | 0 | 4 | Y | N | n/a
B | 4 NICs, fully meshed, RR | 8 | 0 | 2 | 2 | N | N | Round Robin
C | 2 NICs, no mesh (point to point) | 2 | 2 | 2 | 0 | N | N | Round Robin
D | 1 NIC only (control test) | 1 | 1 | 1 | 0 | N | N | n/a
E | 4 NICs, fully meshed, LQD | 8 | 0 | 2 | 2 | N | N | Least Queue Depth
F | 4 NICs, partial mesh, RR | 4 | 0 | 2 | 2 | N | N | Round Robin
G | 2 NICs, no mesh (point to point) with EqualLogic Host Integration Tools | 1 | 1 | 2 | 0 | N | Y | Least Queue Depth

If you are more visual, the following diagram summarises the above in a graphical format

The Results

The following table summarises the read/write performance of each test on the Sequential 4MB test as reported by “Anvil’s Storage Utilities”, version 1.1.0, build 1st January 2014. All tests were performed on the same Windows 10 Enterprise VM without rebooting in between each test and without performing any other activities on the VM disk.

The results below are ordered by test, from the test offering the best performance to the test offering the worst performance, using the Read MB/s column as the sort index.

Sequential 4MB (Read)

Test | Response (ms) | MB read | IOPS | MB/s | Control Deviance (%)
C | 30.4791 | 1052 | 32.81 | 131.24 | 32.17
F | 39.801 | 804 | 25.13 | 100.50 | 1.21
D | 40.2814 | 796 | 24.83 | 99.30 | 0
A | 51.3782 | 624 | 19.46 | 77.85 | -21.60
G | 60.7197 | 528 | 16.47 | 65.88 | -33.66
E | 273.9667 | 120 | 3.65 | 14.60 | -85.30
B | 404.65 | 80 | 2.47 | 9.89 | -90.04

Sequential 4MB (Write)

Test | Response (ms) | MB written | IOPS | MB/s | Control Deviance (%)
C | 21.7266 | 1024 | 46.03 | 184.11 | 70.25
F | 468.9896 | 772 | 2.13 | 8.53 | -92.11
D | 36.9883 | 1024 | 27.04 | 108.14 | 0
A | 89.5977 | 1024 | 11.16 | 44.64 | -58.72
G | 23.8047 | 1024 | 42.01 | 168.03 | 55.38
E | 1010.7556 | 360 | 0.99 | 3.96 | -96.34
B | 964.766 | 376 | 1.04 | 4.15 | -96.16

Response (ms) = Lower is better
MB read/written = Higher is better
IOPS = Higher is better

Control Deviance (%) = the positive or negative impact in MB/s performance compared to the single NIC, no MPIO control test (test D).

Test A | Converged Fabric LBFO

The Microsoft dream of virtualising everything does hold up – at least in not being completely terrible. Sitting in the middle of the table, using a fully converged fabric, virtualised setup across 4 NICs resulted in a 22% reduction in read speed compared to a single NIC and a 59% reduction in write speed.

There may be some improvements to be made by creating multiple virtual iSCSI interfaces connected to the virtual switch; however, these were not tried. Based upon the current view of the technology, while it works and offers a data centre design simplification, that simplification is not worth the performance sacrifice.

Test B | Round Robin, Full Mesh

This test proves that viewing an iSCSI setup as a full mesh and throwing NICs at the proverbial problem is going to do nothing to help you. Your iSCSI should be configured in a 1:1 “path” setup between initiator and target. Any additional NICs should be put into “Round Robin with Subset”, i.e. made to be passive fail-over adapters. The full mesh resulted in a 90% and 96% reduction in read and write performance respectively!

Test C | Round Robin, 1:1 Paths

This test shows how you are supposed to use iSCSI. Two non-crossing paths allow for a full bandwidth connection down each path between the initiator and the target. This configuration provided an increase in performance over a single adapter and was the only test that provided improvements to both read and write metrics.

Test D | Control

This was the baseline control test for this experiment. 1 NIC talking to 1 controller port. Nothing complicated here.

Test E | Least Queue Depth, Full Mesh

This test repeated Test B, but changed the MPIO model from RR to LQD to see if it made any difference. Read performance was slightly better than under RR, but was still 85% worse than the control test.

Test F | Partial Active Mesh

This test looked to see whether having a partial active mesh made any difference. There was a very small 1% increase in read performance from this, but a significant write penalty. In practice, you cannot push/pull 2Gbps to/from a 1Gbps source, so the design is not conducive to improved speed under a synthetic load.

Test G | Least Queue Depth, 1:1 Paths

Test G was a genuine surprise. I was expecting to see Dell EqualLogic Host Integration Tools (HIT) version 4.9 offer an increase in performance, not a decrease. However, repeating the test yielded the same results. In my experience, this has not usually been the case, with VMs feeling more responsive with HIT installed compared to without. Experience suggests to me that something else was at play here, perhaps the HIT version being poorly optimised for Windows Server 2016, or the Dell stack getting grumbly about the use of a retail Intel I350-T4 adapter instead of a Dell one. Dell HIT forces the use of 1:1 pathing no matter what you try, setting all other adapters into passive mode, and it used LQD as the MPIO algorithm. Evidently this resulted in an increase in write performance but a reduction in read performance, with reads not as high as without HIT installed.

Although not shown in the results above, HIT did help improve performance in some of the Anvil Tests. The long queue depth tests resulted in higher IOPS figures for both read and write values by a small margin. None of the other tests yielded such an improvement.

Conclusion

As you can see from these results, there is only one way that you should be conceptually thinking about your iSCSI environment – 1:1, point to point paths. Anything over and above this should be set to being passive/failover/offline in order not to impact performance.

General Subnet Recommendations

Subnet recommendations go hand in hand with this, but you should note that they are generally made by the storage vendor — and you should follow their advice. I have encapsulated the general recommendations/requirements of a number of providers in the table below. The subnet count column is in essence a statement that for each NIC on the storage device, there should be a dedicated subnet (and ideally broadcast domain/VLAN) back to the iSCSI server.

Vendor | Subnet Count | Source
Dell (Non-EqualLogic) | 2 | View
Dell EMC | 2 | View
Dell EqualLogic | 1 | View
Microsoft | 2 | View
NetApp | ? | I couldn’t find any guidance from an official source; there is community evidence of both being used by end-users
NetGear | 2 | View
QNAP | 2 | View
Synology | 2 | View

As you can see, with the exception of Dell EqualLogic which provides a middleware solution known as the Host Integration Tools (HIT) to cope with this, most vendors are quite specific on the use of a “single path” logical topology for server/storage connectivity — aka one subnet per storage appliance NIC.

General Advice

I will end this piece with some general advice and tips for working with MPIO. It isn’t exhaustive, but they are some quick observations from experience of using the technology for many years. Some of them are obvious, some of them might help you avoid a head scratcher.

  1. If you are using an enterprise iSCSI solution, follow the vendor’s advice, forget anything you read on the Internet. Everyone is a know-it-all on the internet and there are plenty of “I’m a Linux user so I know best” screaming matches about how EqualLogic are wrong about the recommendations for EqualLogic’s own hardware. I’m pretty sure that EqualLogic… uh, tested their stuff before writing their user manual.
  2. If you are using an enterprise solution and the vendor offers a DSM (MPIO driver), use it. Dell HIT is noticeably faster than the generic Microsoft DSM for Windows Server, but naturally only works with Dell SAN hardware. Also ensure that you keep your DSMs up to date.
  3. Follow your vendor’s guidelines with respect to subnets. If in doubt, drop them an email. You’ll usually find them quite accommodating.
  4. Unless your vendor has expressly told you to, do not MPIO back from the storage system – i.e. don’t team, MPIO or load balance on the storage side. Do it all on the server initiating the request.
  5. Stick to two port/1:1 path MPIO designs. If you need more, create multiple pairs and have each on different networks going to different storage systems, so that the driver knows explicitly where to send traffic while maintaining isolation.
  6. If you want to think about your MPIO as a meshing design, it has to be meshed for redundancy, not active links (unless your system needs to keep living, breathing human beings alive and do so at all costs).
  7. With iSCSI and SAN MPIO, try and avoid network hops (routers).
  8. All ports in a group must be the same type, speed and duplex.
  9. Disable port negotiation and manually set the speed on the client and switch; this will make failover/failback processes faster for your redundant paths.
  10. Use VLANs as much as possible (try and avoid overlaying broadcast domains across a shared Layer 2 topology).
  11. Use Jumbo Frames as much as possible unless the iSCSI subnet involves client traffic.
    Hint: Your iSCSI subnet should not involve client traffic!
  12. Ensure that your NIC drivers and firmware are kept up-to-date
  13. Disable all Windows NIC service bindings apart from vanilla IPv4 on your iSCSI networks, for example Client for Microsoft Networks, QoS Packet Scheduler and File and Printer Sharing for Microsoft Networks. If you aren’t using it, disable IPv6 too on the iSCSI interfaces to prevent IPv6 node-chatter (a scripted example is sketched after this list).
  14. In the driver config for your server grade NIC (because you are using server grade NIC’s, right?) max out the send and receive buffer sizes on the iSCSI port. If the server NIC has iSCSI features that are relevant (such as iSCSI offloading), enable them.
  15. When you are building a Windows Server, script the MPIO install, enable MPIO during the script and set the default policy as part of the build process – then patch and REBOOT the system before you even start configuration (see the sketch after this list). If I had a £1 for every time I’d had to rescue someone from not doing that and then not REBOOTING…
  16. If you are using a SOHO/SME general purpose commodity NAS then, if (and only if) you have a UPS, disable Journaling and/or Sync Writes on your iSCSI partitions/devices. There is a performance benefit, but remember that if you are hosting SMB shares on a commodity appliance you actually do want Journaling running on those volumes.
  17. Keep your NAS/SAN firmware up to date.
  18. Keep your storage system and iSCSI block, cluster and sector sizes optimised for the workload. Generally this means bigger is better for virtualisation storage and video: 256/64K, 128/64K or 64/64K depending on what your solution can offer.
  19. Keep volumes under 80% of capacity as much as possible.
  20. Use UPS’s: Remember, iSCSI and SAS are hard drive/storage protocols. They are designed to get data onto permanent storage medium just like RAID controllers. RAID controllers have backup batteries because you do not want to lose what is in process in the RAID controller cache when the power goes out. Similarly, you need to think of your iSCSI and External SAS sub-systems much the same as you would a RAID sub-system.
  21. If you have a robust UPS solution, enable write caching and write behind/write back cache features on your storage systems and iSCSI mounted services to gain extra performance benefits. Be mindful that there is risk in this if your power and shutdown solution isn’t bullet proof.
  22. Test it! Build a test VM and yank a cable out a few times. You’ll be glad you sacrificed a Windows install or two to ensure it is right when you actually pull an iSCSI cable out of a running server… Believe me I know what a relief that is.
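
As referenced in points 11, 13 and 15 above, several of these items script neatly into a build process. The following is a rough sketch only: the adapter names are placeholder assumptions, the jumbo frame registry value varies by driver, and the MSDSM commands assume the in-box Microsoft DSM rather than a vendor DSM.

# Sketch only: "iSCSI1"/"iSCSI2" are placeholder adapter names.
$iscsiNics = "iSCSI1", "iSCSI2"

foreach ($nic in $iscsiNics) {
    # Point 13: strip bindings down to plain IPv4 on the iSCSI interfaces.
    Disable-NetAdapterBinding -Name $nic -ComponentID ms_msclient, ms_server, ms_pacer, ms_tcpip6

    # Point 11: jumbo frames, only if the whole iSCSI path supports them (value varies by driver).
    Set-NetAdapterAdvancedProperty -Name $nic -RegistryKeyword "*JumboPacket" -RegistryValue 9014
}

# Point 15, phase 1: install the MPIO feature as part of the build...
Install-WindowsFeature -Name Multipath-IO

# ...then patch and REBOOT. Phase 2, after the reboot: claim iSCSI devices with the
# in-box MSDSM and set the default load balance policy before configuring any LUNs.
Enable-MSDSMAutomaticClaim -BusType iSCSI
Set-MSDSMGlobalDefaultLoadBalancePolicy -Policy RR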

Error 0x80070005 when attempting to Perform a Shared Nothing migration between Hyper-V hosts or move a Hyper-V VM between CSV’s in the same or separate Clusters

System Requirements:

  • Windows Server 2012 R2
  • Windows Server 2016

The Problem:

Hyper-V 2012 R2 has a lot of new features that are worthy of note, and one of the most appealing features for Virtualisation Administrators is shared nothing migration between hosts via SMB. If you are in an environment that doesn’t have shared storage, it’s useful enough in itself, because for VM purposes it may have just validated your decision not to get shared storage in the first place. Yet less well documented is the feature’s value for setups where you do have shared storage, as you can use shared nothing migration as a mechanism to live migrate VMs between clusters that are backed onto shared storage – or more specifically between “Cluster Shared Volumes” (CSV).

The picture on the back of the box of the smiling, happy systems administrator performing a shared nothing migration makes it look so easy, right? This is however an all too common occurrence:

0x80070005 Error

'General access denied error'('0x80070005')

 

There was an error during move operation.

Virtual machine migration operation failed at migration source.

Failed to create folder.

 


Virtual machine migration operation for ‘<VM Name>’ failed at migration source ‘<Source Hypervisor name>’. (Virtual machine ID <VM-SID>)

Migration did not succeed. Failed to create folder ‘<RPC path>…\Virtual Hard Disks’: ‘General access denied error'(0x80070005’).

If you look at the specified destination path (e.g. c:\ClusterStorage\Volume1\test) after receiving this error, you will find that it has created the test folder and it will have created a ‘Planned Virtual Machines’ folder beneath it which will in turn contain a folder named with the VM’s VM-SID (the Virtual Machines unique security ID) and a .xml file named with the same VM-SID.

The migration will however not progress any further.

If you attempt to perform the same operation in PowerShell you will receive the PowerShell version of the same error:

VERBOSE: Move-VM will move the virtual machine "<VM Name>" to host "<Destination Server>"
Move-VM : Virtual machine migration operation for '<VM Name>' failed at migration source '<Source Server>'. (Virtual machine ID <VM-SID>)
Migration did not succeed. Failed to create folder
'\\<Destination Server>\<Source Server>.762091686$\{e166ba26-8a4a-4029-ac34-c2466451e439}\<VM Name>\Virtual Hard Disks': 'General access denied error'('0x80070005').
You do not have permission to perform the operation. Contact your administrator if you believe you should have permission to perform this operation.
At line:1 char:69
+ $vm = Get-VM -Name 'test' -ComputerName "<Source Server>" | Move-VM -Des ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : PermissionDenied: (Microsoft.Hyper...VMMigrationTask:VMMigrationTask) [Move-VM], VirtualizationOperationFailedException + FullyQualifiedErrorId : AccessDenied,Microsoft.HyperV.PowerShell.Commands.MoveVMCommand

Please Note: This document does not specifically address 0x80070005 for Hyper-V Replication Troubleshooting, which is a slightly different (yet related) issue.

More Info

Understanding the topology involved in my setup also reveals my reason for needing to get this working – this is important, as your setup and reasons may differ slightly. What I was attempting to do was migrate between two multi-node Windows Hyper-V Server 2012 R2 clusters while being able to initiate the migration from a third device, a Windows 8.1 management console.

Much of the discussion surrounding 0x80070005 suggests that you simply need to deal with the fact that you need to log onto the source workstation and initiate a push of the VM from the source server to the destination server using CredSSP. This is fine if you have a general purpose commodity server that happens to have Hyper-V on it. In the real world if you have a Hyper-V Cluster, you should not be running it in GUI mode, you should be using Server Core – and if you are using Windows Hyper-V Server to begin with, you don’t even have the option of a GUI.

So we can eliminate the use of the GUI tools or the simplicity of “just RDP into the server” immediately from this discussion. People answering as such are running in very simple Hyper-V setups and in environments with simple, very liberal security policies.

You can of course use PowerShell to perform a CredSSP migration on a Server Core installation and, as a matter of good practice, the ability to transfer VMs using CredSSP should be confirmed as working before you start out with Kerberos. To do that, log onto the Source Server and execute the following command in a PowerShell session:

Get-VM -Name '<VM Name To Move>' | Move-VM -DestinationHost "<Destination Server>" -DestinationStoragePath "C:\ClusterStorage\Volume1\<VM Name to Move>" -Verbose

If that doesn’t work, I recommend that you troubleshoot this issue before you look to go any further on the 0x80070005 issue.

Additionally, make sure that you have performed the basic troubleshooting steps and that you are simplifying the problem as much as possible before starting. The following provides an overview of such steps, in no particular order:

  • Log in as a Domain Admin to perform this test (if possible). After you have that working, migrate down to delegated users and troubleshoot any issues that they are experiencing
  • Only try to ‘shared nothing’ migrate a VM that is turned off (create a new VM, attached a default sized dynamically expanding disk, don’t add any networks and leave it off as this means that you will only have 4MB of data to test move). Once you can migrate a VM that is off, attempt to migrate a running VM with a Live Migration.
  • Only test migrate between the Source Cluster storage (CSV) owner node and the Destination Cluster storage owner node
  • If possible, make the owner of the source and destination cluster core resources the same node that owns the CSV
  • Remember that you must use Hyper-V Manager after you have de-clustered the VM from within Failover Cluster Manager before you can perform a shared nothing migration – the fact that your VM has anything to do with a cluster is an aside for Hyper-V. Treat this process as a Hypervisor to Hypervisor move that happens to be on a CSV and forget about the cluster.
  • On the ‘Choose a new location for virtual machine’ page of the migration wizard, remember that you must enter a file system path (e.g. C:\ClusterStorage\volume 1\test) and not a UNC path (e.g. \\server\c$\ClusterStorage\volume 1\test). The migration is going to take place using RPC and not SMB. Thus do not use a UNC path.
    'Choose a new location for virtual machine' wizard page
  • Ensure that you can migrate the VM using CredSSP as discussed at the beginning of this section
  • Ensure that your Domain Controllers are running Windows Server 2008 or higher (or at least your logon server); Windows Server 2003 Domain Controllers are known to have issues here (possibly due to lack of AES support). Your domain/forest functional levels can reportedly be Windows Server 2003 if required. I have only tested with Windows Server 2008 domain functional and Windows Server 2008 forest functional levels
  • If you are attempting to move between servers in a domain trust, you must ensure that the domain trust supports AES
  • Keep your initial testing paths simple and avoid overly complicated NTFS structures. For example, target the destination to be a local sub folder of c:\ and not a junction (such as ClusterStorage\Volume #) or a non-drive letter NTFS Mount Point (i.e. an iSCSI share or drive mount point exposed as a sub-folder to a higher file system). See the links below for more on this.
    View: Snapshot – General access denied error (0x80070005)
    View: Migrating a Virtual Machine problem
    Note: The iCACLS command listed in the second link does not use the principle of least permission. The command to enact the principle of least permission would be as follows:

    icacls F:\hvtest /grant "NT VIRTUAL MACHINE\Virtual Machines":(OI)(CI)(R,RD,RA,REA,WD,AD) /T

    Finally, keep in mind that for delegation purposes, permissions must be valid for the user account that you are using to perform the move as well as the SYSTEM account.

  • Initially, forget about testing the migration into the cluster CSV itself. Instead, create a new folder on the root of the C Drive of the destination server and migrate into this. There are a few suggestions online that you need to put a couple of folder depths between the root of the drive and the VM itself, so try something like C:\VM Store\Test\
  • If you are following my advice, you will be testing with a 4MB VM called ‘test’ so there won’t be any issue with storage space and the use of the C Drive for testing
  • Use PowerShell for testing, otherwise you will go insane from having to repeatedly re-enter information in the Move VM wizard. The general gist of the command is:
    Get-VM -Name '<VM Name To Move>' -ComputerName "<Source Server>" | Move-VM -DestinationHost "<Destination Server>" -DestinationStoragePath "C:\ClusterStorage\Volume1\<VM Name to Move>" -Verbose

    With the 0x80070005 error, you should find that it will get to 2% and then error after a few seconds.

  • Ensure that you have enabled Kerberos authenticated Live Migrations in the properties for the Hypervisor in Hyper-V Manager
    Hypervisor Properties
    Note: You can perform this action in PowerShell using

    Enable-VMMigration -ComputerName <Server Hostname>
    Set-VMHost -ComputerName <Server Hostname> -VirtualMachineMigrationAuthenticationType Kerberos
  • Ensure that your Hypervisor’s and the Windows 8.1 management VM are up to date (at the same patch level) and are joined to the same domain
  • Ensure that all parties in the process have properly registered DNS records in AD DNS
  • Check your Windows Firewall rules – for testing purposes just turn them off if you can (remember to turn them back on afterwards!)
  • Check your ASA/Hardware Firewall rules for the same
  • Keep an eye on the Hyper-V event logs for any additional information. The log of consequence is found in Event Viewer under:
    Applications and Services Logs > Microsoft > Windows > Hyper-V-VMMS > Admin
    If you are experiencing the same problem that I was, you will see three events on the Source Server’s log (20414, 20770 and 21024). The 20770 error is the one being reflected by PowerShell or the Hyper-V Management console. Shortly thereafter, the Destination Server will log a 13003 event informing you that the virtual machine from the Source Server (with the same VM-SID) was deleted, indicating that the Destination Server performed a clean-up of the initial migration process.
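
For convenience, the tiny switched-off test VM and the event log check described above can both be driven from PowerShell. This is a sketch only: the VM name, storage path and server names are placeholders.

# Sketch only: names, paths and servers are placeholders.
# A minimal, switched-off VM with a small dynamically expanding disk and no network.
New-VM -Name "test" -ComputerName "<Source Server>" -MemoryStartupBytes 512MB `
    -NewVHDPath "C:\ClusterStorage\Volume1\test\test.vhdx" -NewVHDSizeBytes 40GB

# Tail the Hyper-V VMMS Admin log on the source server while testing migrations.
Get-WinEvent -ComputerName "<Source Server>" -LogName "Microsoft-Windows-Hyper-V-VMMS-Admin" -MaxEvents 20 |
    Format-Table TimeCreated, Id, LevelDisplayName, Message -AutoSize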

Permissions

There is a lot of discussion about permissions and 0x80070005 errors. Let us look at the salient points:

VERBOSE: Move-VM will move the virtual machine "<VM Name>" to host "<Destination Server>"
Move-VM : Virtual machine migration operation for '<VM Name>' failed at migration source '<Source Server>'. (Virtual machine ID <VM-SID>)
Migration did not succeed. Failed to create folder
'\\<Destination Server>\<Source Server>.762091686$\{e166ba26-8a4a-4029-ac34-c2466451e439}\<VM Name>\Virtual Hard Disks': 'General access denied error'('0x80070005').
You do not have permission to perform the operation. Contact your administrator if you believe you should have permission to perform this operation.
At line:1 char:69
+ $vm = Get-VM -Name 'test' -ComputerName "<Source Server>" | Move-VM -Des ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : PermissionDenied: (Microsoft.Hyper...VMMigrationTask:VMMigrationTask) [Move-VM], VirtualizationOperationFailedException + FullyQualifiedErrorId : AccessDenied,Microsoft.HyperV.PowerShell.Commands.MoveVMCommand
  1. The Migration failed at the Source Server
  2. The Source Server failed the migration because it could not ‘create a folder’
  3. We know that the folder in question is the Source Server being unable to create a ‘<VM Name>\Virtual Hard Disks‘ folder
  4. We know that the Source Server was able to create a ‘<VM Name>\Planned Virtual Machines’ folder because we can see it in the file system if we use the GUI Wizard to perform the migration.
    Note: The PowerShell version cleans up after itself!
  5. You have told the Hypervisor to use Kerberos to perform the migration

What does this tell us? It tells us that YOU, the administrator, are being told that you cannot create the folder. You are using Kerberos to perform the migration, not CredSSP, so the entire process is being run end-to-end using YOUR credentials. The Management Workstation is logging onto the Source Server as YOU. The Management Workstation is telling the Source Server to initiate the move and, in turn, the Source Server is delegating your authentication session to the Destination Server and telling it to receive instructions from the Source Server using your credentials. At this point it has nothing to do with ‘NT Virtual Machine’ or VM-SID permissions; those come into play after the migration of the core parts of the VM, during initialisation of the VM on the Destination Server. We are not there yet.

So the first thing to check is that your account is authorised to perform the move. If you are a Domain Admin, you should be OK; however, you should ensure that the Domain Admins security group is a member of the local Administrators group on all participating machines – source server, destination server and management workstation.

If you do not want the user account to have full local admin rights you can add them to the “Hyper-V Administrators” group on each server. To add an account to a local group on Server Core or Windows Hyper-V Server:

net localgroup "Hyper-V Administrators" /add domain\user
net localgroup "Administrators" /add domain\user

Constrained Delegation

When viewing the Delegation tab on the computer account in Active Directory Users & Computers (ADUC), ensure that (a quick PowerShell check is sketched after this list):

  1. You are using “Trust this computer for delegation to specified services only” (it doesn’t appear to work if you use the “any service” option)
  2. You have selected “Use Kerberos only”
  3. You tick the ‘Expanded’ checkbox to view the full list of entries
  4. That (once Expanded) there are two entries for each type (types being CIFS and Microsoft Virtual System Migration Service), one entry will have the NetBIOS Name and the other will have the FQDN i.e. there are 4 entries for each delegated host, two with NetBIOS Names and two with FQDN entries.
  5. When you create the Kerberos Constrained Delegation, you need to ensure that the “Service Name” column is blank. If there is something listed in the Service Name column, your delegation is not going to work properly.
  6. You need to have the same number of “CIFS” entries for each host as you do for “Microsoft Virtual System Migration Service”
  7. It is not necessary to add the Management Workstation to the Constrained Delegation
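
If you prefer to check this from PowerShell rather than clicking through ADUC, something along the following lines will dump the delegation list for each node. It is a sketch: the host names are placeholders and it assumes the ActiveDirectory RSAT module is available.

# Sketch: host names are placeholders; requires the ActiveDirectory RSAT module.
Import-Module ActiveDirectory

foreach ($node in "HVNode01", "HVNode02") {
    Write-Host "== $node =="
    # Each delegated host should appear four times: cifs and Microsoft Virtual System
    # Migration Service, each in NetBIOS and FQDN form.
    Get-ADComputer -Identity $node -Properties "msDS-AllowedToDelegateTo" |
        Select-Object -ExpandProperty "msDS-AllowedToDelegateTo"
}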

When you issue the Move-VM command in PowerShell, try substituting the -ComputerName and -DestinationHost values with the four combinations of the NetBIOS Name and FQDN.

Get-VM -Name '<VM Name To Move>' -ComputerName "<Source Server>" | Move-VM -DestinationHost "<Destination Server>" -DestinationStoragePath "C:\ClusterStorage\Volume1\<VM Name to Move>" -Verbose

For example, if you have Server1 and Server2 and your domain is domain.local, the combinations to test are:

Source | Destination
Server1 | Server2
Server1.domain.local | Server2
Server1 | Server2.domain.local
Server1.domain.local | Server2.domain.local

If you find that one of these works while the others do not, you have an error in the constrained delegation setup for DNS or NetBIOS aliasing. Carefully recreate the delegation.

After you have set up the delegation, go into an LDAP browser, ADSI Edit or the Attribute Editor in ADUC. For each delegated server, find the servicePrincipalName property and look at the value list. You should have two of each of the following entries (one with the NetBIOS Name and the other with the FQDN).

  • Hyper-V Replica Service/
  • Microsoft Virtual System Migration Service/
  • RestrictedKrbHost/

If you do not see these, you have a Delegation Error and/or an issue in creating SPN records. Either delete and try to recreate them by recreating the delegation or carefully add them by hand.
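
A quick way to eyeball the registered SPNs without an LDAP browser is setspn, run from any elevated prompt on a domain-joined machine (the host name below is a placeholder):

setspn -L <Hypervisor hostname>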

DNS

Bindings. I know that you checked them, but check them again. Trust me. On Server Core where you have very little contact with the actual server console this is very easy to overlook.

Constrained delegation may work with both NetBIOS and DNS, however Kerberos does not care for NetBIOS. If your DNS doesn’t work, you aren’t going to get a successful ticket session creation that you will need in order to pass credentials forward as part of the Constrained Delegation setup.

Check the following using shorthand and FQDN lookups, i.e. nslookup server1.domain.local and just nslookup server1. Are they both going where you expect? Crucially, which server NIC is the DNS query going out of and, once the reply comes back, which NIC is being used to attempt to contact the host? (A PowerShell sketch for these checks appears after the list below.)

  1. The management console can query all domain controllers in DNS
  2. The management console can query all Hypervisors in DNS
  3. The hypervisors can all query the management console in DNS
  4. The hypervisors can all query all domain controllers in DNS
  5. The hypervisors can all query each other in DNS
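
A rough PowerShell sketch for working through these checks from each machine; the host names are placeholders.

# Sketch only: host names are placeholders. Run from the management console and from each node.
foreach ($name in "server1", "server1.domain.local") {
    # Does the name resolve, and to the address you expect?
    Resolve-DnsName -Name $name

    # Which local interface does the traffic actually leave by?
    Test-NetConnection -ComputerName $name |
        Select-Object ComputerName, RemoteAddress, InterfaceAlias, PingSucceeded
}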

This also requires you to check your default gateway settings.

This is important in the following scenario. Most of you will not encounter this because of the scale of your operations; however, the fact is that at Enterprise level I did encounter this problem, hence why I am able to write about it.

  1. Let’s assume that you follow best practice and have separate public, management, cluster, iSCSI and heartbeat networks.
  2. Your management network is data centre local, on a private network with minimal routing, and is designated for management of servers, IPC traffic, un-routed VMs etc. in a secure fashion
  3. Local DNS is available on the management network but does not expose Internet Resolution
  4. Your public VM address ranges come from the public network and are not exposed via NAT/PAT, i.e. routing and firewalls
  5. Your domain controllers exist on a public routed network subnet that is separate from the public VM address ranges used for VM’s
  6. You followed best practice and set your management network to be the first adapter in the binding order on the hypervisors
  7. You will now receive 0x80070005 when you attempt to replicate, live migrate or offline migrate a VM between cluster nodes using Kerberos Constrained Delegation

The problem is the adapter binding order caused by the use of local DNS on a network that offers no connectivity to the domain controllers. When the KDC attempts to generate a Kerberos ticket for the constrained delegation, the lookups for the domain controllers will be performed using the DNS servers on the management network and will mistakenly attempt to connect to the domain controllers via the management network. This is simply going to time out – causing the wait during migration. Once it times out, Windows DNS doesn’t defer to the next set of DNS servers or attempt to get to the DC’s on a different NIC. It simply gives up.

The resulting, very helpful, error code that Hyper-V offers back is Access Denied while seemingly attempting to create files in the file system – the Hypervisor will log that it was unable to create the ‘Virtual Hard Disks’ folder on the destination Hypervisor. What it should actually say here is that it could not properly initialise the end to end Kerberos Constrained Delegation ticket session due to a timeout. It of course doesn’t do that.

In this situation the fixes are one of:

  1. Add an interface on the domain controllers on the management LAN
  2. Add a network interface which can connect to the domain controllers in a higher adapter binding order position in the Hypervisor binding order
  3. Remove the DNS servers from the management networks TCP/IP properties, thus forcing Windows Server to use the first available DNS server configuration on a lower ordinal adapter
  4. Allow routing from the management LAN to the domain controllers. Alias, stub or secondary zone the domain controllers in the management networks DNS and hope you remember to keep them up to date when you make changes to Domain Controller DNS records

Assuming that your constrained delegations are correct, it will start working as soon as the DNS updates have propagated.

The Fix

Ultimately the problem that I had was in the setup of the Constrained Delegation and, in another case as discussed above, the DNS binding order. For the Constrained Delegation issue, I only had NetBIOS values for the ‘Microsoft Virtual System Migration Service’ entries while I only had FQDN values for the CIFS entries, which in turn meant that the associated SPN records were missing.

I was originally using a script by Robin CM for this purpose; it appears that it is this script which isn’t quite ticking all of the boxes.

View: Robin CM’s IT Blog – PowerShell: Kerberos Constrained Delegation for Hyper-V Live Migration

 

In my environment, the following represents a corrected version of the script.

The script assumes that you have placed all of your Hypervisors in a dedicated OU. The script will obtain a list of all servers in the OU and automatically create the constrained delegation, complete with both pairs of NetBIOS Name and FQDN records.

In addition, the script also now ensures that the system is not adding a constrained delegation back to itself into the AD database.

You must be a domain admin or have permissions to write to msDS-AllowedToDelegateTo objects in AD in order to run this script.

# Requires the ActiveDirectory module (RSAT); on PowerShell 3.0+ it will normally auto-load.
Import-Module ActiveDirectory

$OU = [ADSI]"LDAP://OU=Hypervisor's,OU=Servers,DC=ad,DC=domain,DC=co,DC=uk"
$DNSSuffix = "ad.domain.co.uk"
$Computers = @{} # Hash table

foreach ($child in $OU.PSBase.Children){
    # add each computer in the OU to the hash table
    if ($child.ObjectCategory -like '*computer*'){
        $Computers.Add($child.Name.Value, $child.distinguishedName.Value)
    }
}

# Process each AD computer object in the OU in turn
foreach ($ADObjectName in $Computers.Keys){
    Write-Host $ADObjectName
    Write-Host "Enable VM Live Migration"
    Enable-VMMigration -ComputerName $ADObjectName
    Write-Host "Set VM migration authentication to Kerberos"
    Set-VMHost -ComputerName $ADObjectName -VirtualMachineMigrationAuthenticationType Kerberos
    Write-Host "Processing KCD for AD object"
    # Add delegation to the current AD computer object for each other computer in the OU
    foreach ($ComputerName in $Computers.Keys){
        #Write-Host $ComputerName.toUpper() $ADObjectName.toUpper()
        if ($ComputerName.toUpper() -ne $ADObjectName.toUpper()) {
            Write-Host (" Processing "+$ComputerName+", added ") -NoNewline
            # CIFS entries, FQDN and NetBIOS
            $ServiceString = "cifs/"+$ComputerName+"."+$DNSSuffix
            Set-ADObject -Identity $Computers.$ADObjectName -Add @{"msDS-AllowedToDelegateTo" = $ServiceString}
            $ServiceString = "cifs/"+$ComputerName
            Set-ADObject -Identity $Computers.$ADObjectName -Add @{"msDS-AllowedToDelegateTo" = $ServiceString}
            Write-Host ("cifs") -NoNewline
            # Microsoft Virtual System Migration Service entries, NetBIOS and FQDN
            $ServiceString = "Microsoft Virtual System Migration Service/"+$ComputerName
            Set-ADObject -Identity $Computers.$ADObjectName -Add @{"msDS-AllowedToDelegateTo" = $ServiceString}
            $ServiceString = "Microsoft Virtual System Migration Service/"+$ComputerName+"."+$DNSSuffix
            Set-ADObject -Identity $Computers.$ADObjectName -Add @{"msDS-AllowedToDelegateTo" = $ServiceString}
            Write-Host (", Microsoft Virtual System Migration Service")
        }
    }
}

Once you have run it, give the system a few minutes so that AD can distribute the update to all DCs and for the Kerberos sessions on the respective nodes to refresh.

Update for Windows Server 2016

So I decided to reinstall a node with Hyper-V Server 2016 and have a play with it in amongst the Hyper-V Server 2012 R2 nodes.

The experience did not go swimmingly well. Here is a quick overview of some issues I encountered (or created myself) to keep in mind when troubleshooting this:

  1. The Hyper-V Server Win32 installer performs an in-place upgrade as if it were a clean install. Remember that this means that you will need to delete the AD computer account object and DNS records and then re-join the system to the domain in the correct OU.
  2. Once you have done this, you will need to re-create the Kerberos Constrained Delegation records for all Hyper-V nodes.
  3. I was experiencing a problem where I could use Kerberos to Live Migrate or offline migrate to the Hyper-V 2016 host, however I could not migrate back unless I logged onto the 2016 node and used CredSSP to move it back again. Looking at the Windows Server 2008 R2 domain controller security logs, Kerberos authentication was failing. In the end the fix was to add a delegation for the CIFS and ‘Microsoft Virtual System Migration Service’ classes on the computer account object: TO ITSELF. Yes, if you have computer accounts HVNode01, HVNode02 and HVNode03, the delegation tab for HVNode01 must include CIFS and MVSM entries in both DNS and NetBIOS nomenclature not only for HVNode02 and HVNode03 but ALSO for HVNode01 (itself). Once I did this, I could magically migrate the VMs back again (a scripted version of this fix is sketched after this list).
  4. If you are using Jumbo Frames, remember to perform a test using the following command. If it doesn’t work, fix this before doing anything else
    ping <ipAddress> -l 8500 -f
  5. I made a silly mistake in a late night PowerShell command entry when setting up the networking on the 2016 box: I entered
    add-vmnetworkadapter -managementos -Name Management

    when I actually meant to enter

    add-vmnetworkadapter -managementos -Name Management -SwitchName VS_Managmement

    This hooked up a new virtual network adapter on the Hypervisor called ‘Management’ to each and every Virtual Switch on the Hypervisor, so I wound up with three NICs called Management, all on different networks. They went off and got their own IP addresses from DHCP, registered themselves in DNS and created chaos in the adapter binding order. Naturally, the one on the unrouted Management network wound up at the top of the binding order and things got a little upset!

  6. The very first randomly selected, non-production critical VM that I attempted to migrate was the node’s local console VM. This VM was not designed to move from the node and did not have CPU compatibility mode enabled, which caused additional failures.
  7. The second randomly selected, non-production critical VM that I attempted to migrate gave no hex error code or message whatsoever, either through the UI or the event log, just throwing Event ID 24024 and stating that the migration failed and the error message could not be found. To cut a long story short, in the end I (correctly) assumed it was the VM itself at fault and decided to Export / Import it in order to lazily cycle the file system permissions. It turns out that when I attempted to re-import the VM (as a restore), the import wizard notified me that it was expecting to find a snapshot file but that the snapshot itself was unavailable (this VM had no snapshot in the UI and no snapshot file in the export snapshots folder). The wizard asked me if it could clear the snapshot remnant and imported the VM. Once it was imported again, it could live migrate and offline migrate properly. It had nothing to do with the 2016 node. Note: Remember to check on the source Hypervisor for remnants of the original exported VM, which may be left in place on the file system.
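
For point 3 above, if you would rather script the self-referencing delegation than add it by hand on the Delegation tab, the following is a minimal sketch in the same style as the script earlier in this article. It assumes the RSAT ActiveDirectory module is available; the node name and DNS suffix are placeholders for your own environment.

# Sketch: add CIFS and Microsoft Virtual System Migration Service delegation entries
# to a node's computer account pointing back at ITSELF, in both NetBIOS and FQDN form.
# $NodeName and $DNSSuffix are placeholders; requires the RSAT ActiveDirectory module.
$NodeName  = "HVNode01"
$DNSSuffix = "ad.domain.co.uk"
$Identity  = (Get-ADComputer $NodeName).DistinguishedName

foreach ($Service in @("cifs", "Microsoft Virtual System Migration Service")) {
    Set-ADObject -Identity $Identity -Add @{"msDS-AllowedToDelegateTo" = ($Service + "/" + $NodeName)}
    Set-ADObject -Identity $Identity -Add @{"msDS-AllowedToDelegateTo" = ($Service + "/" + $NodeName + "." + $DNSSuffix)}
}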

With the above issues resolved, everything is working correctly between the Hyper-V Server 2012 R2 nodes and the test Hyper-V Server 2016 node.

Script to perform update and synchronisation of DHCP Mac Filter List for Microsoft DHCP Server releases prior to 2008 R2

System Requirements:

  • Windows Server 2008 (R1)
  • Microsoft DHCP Server
  • Microsoft Mac Level Filter extension

The Problem:

If you are using a Microsoft DHCP Server release prior to Windows Server 2008 R2, Mac Address filtering (either allow or deny based) is not included as part of the main console. Microsoft made the feature available as an extension DLL for Microsoft DHCP during Windows Server 2008 (R1)’s early production run.

If you have installed this extension, filtering is restricted to a single server, with no replication options available to peer servers through clustering. This article offers a simple script that can be used to suspend and update a peer server’s Mac Filter list in a master/slave relationship.

The Fix

The script assumes that you have enabled file sharing through your firewall between the servers and that the MAC address filter configuration file is located at c:\windows\system32\dhcp\MACList.txt.

The format of MACList.txt is

#MACList.txt
MAC_ACTION={DENY}
#List of MAC Addresses to deny
001BB04EB711 - # -mypc6 - 192.168.1.28
002BBB831711 - # -mypc7 - 192.168.1.96

Batch script:

@echo off
set TARGET=<hostname/IP of slave server/peer>
cls
echo.
echo Opening Notepad
echo.
echo Make changes to the file, save and exit. The changes will be
echo replicated to %TARGET% automatically.
echo.
echo Note: This will interrupt DHCP Services for a few seconds on both servers.
echo.
c:\Windows\system32\notepad.exe "C:\Windows\System32\dhcp\MACList.txt"

:: Stopping DHCP Server Service on Local System
echo.
echo Applying Changes to Local DHCP Service
net.exe stop DHCPServer
echo.

echo Stopping DHCP Server Service on %TARGET%
:: Stop DHCP Server on Target
sc.exe \\%TARGET% stop "DHCPServer"

:: Brief pause while the remote service stops
ping 127.0.0.1 > nul
echo.

echo Copying MAC List to %TARGET%
copy /y "C:\Windows\System32\dhcp\MACList.txt" "\\%TARGET%\C$\Windows\System32\dhcp\MACList.txt"

:: Starting DHCP Server Service on Local System
net.exe start DHCPServer

:: Starting DHCP Server Service on %TARGET%
sc.exe \\%TARGET% start "DHCPServer"

echo Operation completed. Please periodically check to ensure sync is stable.

In short, the script:

  1. Offers you a notepad session to make any needed changes
  2. When Notepad closes, it will restart the local server’s DHCP service (thus applying the changes locally)
  3. Shut down the peer server’s DHCP service
  4. Copy the updated MAC filter list
  5. Restart the peer server’s DHCP service
  6. Terminate

To add a level of safety, the following script can be run periodically to ensure that the DHCP service is in fact running

@echo off
sc interrogate DHCPServer 2>NUL | find /I /N "4 RUNNING">NUL
if "%ERRORLEVEL%"=="0" (
  echo DHCP Service is running
) else (
  echo DHCP Service is not running!!!!!
  net start DHCPServer
)

Windows Server 2012 R2 DHCP Failover services do not synchronise MAC Address Filters between partner servers

System Requirements:

  • Windows Server 2012 R2
  • Windows DHCP Server

The Problem:

If you are making use of the new failover features integrated into the Windows DHCP Server service in Windows Server 2012, should you also be using either positive or negative MAC address filtering for client leases, you may have already noticed (or may be surprised to discover) that the DHCP Filters table is not replicated as part of the failover configuration.

Barred clients will continue to be addressed by the peer server, and after replication the Filters section of the DHCP console on the secondary server will remain frustratingly empty.

More Info

The Microsoft DHCP Server team have this to say on the subject

“DHCP failover does not provide for replication of server level/wide configuration. Allow/Deny MAC filter is a server level/wide configuration. The reason this is not provided for is because a DHCP serve can participate in more than one failover relationships with different partner DHCP servers. In such scenarios, replicating server level configuration can lead to undesirable resultant server level/wide configuration. If your server has a single failover relationship or a allow/deny MAC address filter list that applies to all servers, you can setup a regular sync between the DHCP servers by writing a simple PowerShell script and integrating it with Windows Task Scheduler so that it runs on a periodic basis.”
teamdhcp (Technet, 2013)

In other words, you may be using filtering to dictate onto which DHCP server your clients will be forced to land, in which case you would not want the filter list to be replicated. For users who are simply using filtering as a way to reduce the risk of rogue hosts, the opposite behaviour is what you want.

Additionally, as replication is per-scope under the current model, your filter list would have to be duplicated across n scopes in order to work in a multi-scope environment. To that end, I can see Microsoft’s point; however, a configurable option for how the system administrator wants server-level settings replicated is a missed opportunity that would have made the Redmond DHCP team veritable princes.

The Fix

So, yes, Microsoft’s reply suggests that we can use PowerShell to do this (albeit less than ideal) and yes, we can.

This simplified script is designed to be run on the secondary (slave) server. It does not perform a synchronisation; instead it clears its own filter database and then copies in the filter database of the partner (master) server.

# PowerShell DHCP Filter Replication Script 1.0.0
# (c) C:Amie 2014
# http://www.c-amie.co.uk/

$MasterServerHostname = "MyServerHostname"

# Get the LOCAL filters from localhost
$lfilters = Get-DhcpServerv4Filter

# Get the REMOTE filters from $MasterServerHostname
$rfilters = Invoke-Command -ComputerName $MasterServerHostname { Get-DhcpServerv4Filter }

# Delete the local Filter Set
ForEach ($filter in $lfilters) {
    Remove-DhcpServerv4Filter -MacAddress $filter.MacAddress
}

# Import the new Filter Set
ForEach ($filter in $rfilters) {
    Write-Host $filter.List
    Write-Host $filter.MacAddress
    Write-Host $filter.Description
    Add-DhcpServerv4Filter -List $filter.List -MacAddress $filter.MacAddress -Description $filter.Description
}

You will need to set up remote PowerShell access (WinRM), for example by adding the partner to WSMan:\localhost\Client\TrustedHosts with Set-Item, in order to allow remote connectivity between your server peers.
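
As a rough guide, the WinRM plumbing can be set up along the following lines; note that if both servers are domain members and Kerberos authentication is in use, the TrustedHosts entry may not be strictly necessary. "MyServerHostname" is the same partner/master server placeholder used in the script above.

# Rough sketch of the one-time WinRM setup.
Enable-PSRemoting -Force                                                                      # run on the master (the server being connected to)
Set-Item WSMan:\localhost\Client\TrustedHosts -Value "MyServerHostname" -Concatenate -Force   # run on the secondary (the server running the sync)
Get-Item WSMan:\localhost\Client\TrustedHosts                                                 # confirm the entry was added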

You must also remember that as the above does not perform a synchronisation, if you make filter changes on a secondary (slave) server, they will be lost at the next execution.

Finally, schedule the script to run as required through Task Scheduler, using the following as the Program/script entry in the Basic Task setup:

powershell -file "x:\path\Script.ps1"
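
Alternatively, if you prefer to register the task from PowerShell rather than through the Basic Task wizard, a sketch along these lines will do it on Server 2012 R2. The path, time and task name are placeholders, and whatever account the task runs under must have DHCP administration rights on both servers (for SYSTEM, that means granting the secondary server's computer account rights on the master).

# Sketch: register the sync script as a daily scheduled task using the ScheduledTasks module.
# The path, time and task name are placeholders.
$Action  = New-ScheduledTaskAction -Execute "powershell.exe" -Argument '-File "x:\path\Script.ps1"'
$Trigger = New-ScheduledTaskTrigger -Daily -At "03:00"
Register-ScheduledTask -TaskName "DHCP Filter Sync" -Action $Action -Trigger $Trigger -User "SYSTEM" -RunLevel Highest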

Hardening Steps

While this works, it is simplistic at best and presents a couple of problems.

Firstly, you should be mindful that the script creates a very small window of opportunity in which the server will potentially be in a filterless state. During this time, it may be able to respond to a DORA request from an unwanted node. If this is the case, you should disable any scopes on the server, perform the filter sync and then re-enable the scope. If you wanted to do that, the necessary script modification would be:

# PowerShell DHCP Filter Replication Script 1.0.1
# (c) C:Amie 2014
# http://www.c-amie.co.uk/

$MasterServerHostname = "MyServerHostname"
$ScopeId = "192.168.1.0"

# Stop the DHCP Scope on the LOCAL server from leasing
Set-DhcpServerv4Scope -ScopeId $ScopeId -State "Inactive"

# Get the LOCAL filters from localhost
$lfilters = Get-DhcpServerv4Filter

# Get the REMOTE filters from $MasterServerHostname
$rfilters = Invoke-Command -ComputerName $MasterServerHostname { Get-DhcpServerv4Filter }

# Delete the local Filter Set
ForEach ($filter in $lfilters) {
    Remove-DhcpServerv4Filter -MacAddress $filter.MacAddress
}

# Import the new Filter Set
ForEach ($filter in $rfilters) {
    Write-Host $filter.List
    Write-Host $filter.MacAddress
    Write-Host $filter.Description
    Add-DhcpServerv4Filter -List $filter.List -MacAddress $filter.MacAddress -Description $filter.Description
}

# Start the DHCP Scope on the LOCAL server
Set-DhcpServerv4Scope -ScopeId $ScopeId -State "Active"

The second problem is that the script itself executes in a linear fashion. It assumes that the partner server is always going to be online and available for processing. The problem here is that if the partner server is offline or isn’t able to service the sync request, the script will have deleted the local filter list before the process of connecting to the peer server fails, at which point the server will be devoid of any filtering whatsoever.

This can be hardened by introducing two sanity checks, as follows:

# PowerShell DHCP Filter Replication Script 1.0.2
# (c) C:Amie 2014
# http://www.c-amie.co.uk/

$MasterServerHostname = "MyServerHostname"
$ScopeId = "192.168.1.0"

if (Test-Connection $MasterServerHostname -Quiet) {

    # Get the LOCAL filters from localhost
    $lfilters = Get-DhcpServerv4Filter

    # Get the REMOTE filters from $MasterServerHostname
    $rfilters = Invoke-Command -ComputerName $MasterServerHostname { Get-DhcpServerv4Filter }

    if ($rfilters) {

        # Stop the DHCP Scope on the LOCAL server from leasing
        Set-DhcpServerv4Scope -ScopeId $ScopeId -State "Inactive"

        # Delete the local Filter Set
        ForEach ($filter in $lfilters) {
            Remove-DhcpServerv4Filter -MacAddress $filter.MacAddress
        }

        # Import the new Filter Set
        ForEach ($filter in $rfilters) {
            Write-Host $filter.List
            Write-Host $filter.MacAddress
            Write-Host $filter.Description
            Add-DhcpServerv4Filter -List $filter.List -MacAddress $filter.MacAddress -Description $filter.Description
        }

    }

    # Start the DHCP Scope on the LOCAL server
    Set-DhcpServerv4Scope -ScopeId $ScopeId -State "Active"

}

In this version we perform a ping test on the master server (ICMP must be permitted by the firewall for this to succeed) and we check whether the $rfilters variable is populated before deleting the contents of the local filters database. Finally, regardless of what happens, the resultant state of the server scope should remain active.
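
As a final, optional spot check after a sync run, you can compare the two filter lists directly; no output from the comparison means they match. "MyServerHostname" is the same master server placeholder as above, and the sketch assumes both servers currently hold at least one filter entry.

# Optional spot check: compare the local and remote MAC filter lists after a sync run.
$localFilters  = Get-DhcpServerv4Filter
$remoteFilters = Invoke-Command -ComputerName "MyServerHostname" { Get-DhcpServerv4Filter }
Compare-Object ($localFilters.MacAddress) ($remoteFilters.MacAddress)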