Booting Windows Hyper-V Server from USB: Lessons From Practice

System Requirements

  • Windows Hyper-V Server 2008 R2
  • Windows Hyper-V Server 2012
  • Windows Hyper-V Server 2012 R2

The Problem

I recently wanted to explore the viability of pulling a Server RAID controller into a workstation. A few choice pieces of electrical tape to cover PCIe pins later and the card worked as intended… until it melted down a few minutes later.

The inevitable failure got me thinking. Like most enterprise hardware, the PERC 5/i and 6/i do not support processor power management. The onboard processor runs at 100% speed, 100% of the time. As a result the heat that it generated easily overwhelmed the modest airflow of a desktop. The thermals went well past 80 degrees C before it tripped out.

Most of the servers that we are running in one particular production stack were using the same controllers. Despite this, none of them were actually being used as RAID controllers. They were set as HBA/JBOD devices with a single drive attached – i.e. no disk redundancy. The reason why we have a production setup with such a bad design? These servers are clustered hypervisors. It doesn’t much matter if they burn out. There are 20 more to take their place and all actual client data is held within a fully redundant, complex storage network. An admin simply needs to replace the broken part, rebuild the OS and throw it back into the pool.

Cost Rationalisation

Was changing the design of these servers feasible? Each 10,000 RPM 70GB hard drive was at best using 20GB of data – and less than 15GB in most cases. Each of those drives is consuming 15-25w of power, making noise and never sleeping. At the same time each controller is consuming 6-18w of power and, again, never sleeping. Both are adding to the heat being thrown down through the backplane and out into the hot aisle. All pretty much needlessly.

Based upon my domestic energy tariff, the potential per-server electricity cost saving stands to be between £3.29 and £4.38 per month – that is £39.48 to £52.56 per year. This does not include any residual savings in air conditioning costs. While it doesn’t seem a lot, on a cluster of 20 servers that’s between £789.60 and £1051.20 per year. At that level the potential savings start to add up.

As an IT designer, it also gives me a budgetary value that I can rationalise any savings against. If we split the difference over 12 months between the upper and lower estimate we get a £46.02 average. If it costs more than that – particularly for old server hardware – it isn’t worth doing: so £46.02 became the ‘per-machine budget’ for my experiment.
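The budget arithmetic above can be sketched in a few lines (the 3.29 and 4.38 monthly figures are the estimates from my tariff; everything else is derived from them):

```python
# Derive the yearly, cluster-wide and per-machine budget figures
# from the estimated monthly per-server saving (in GBP).
def savings(monthly_low: float, monthly_high: float, nodes: int = 20):
    yearly_low = monthly_low * 12            # 39.48 per server, per year
    yearly_high = monthly_high * 12          # 52.56 per server, per year
    cluster_low = yearly_low * nodes         # 789.60 across the cluster
    cluster_high = yearly_high * nodes       # 1051.20 across the cluster
    budget = (yearly_low + yearly_high) / 2  # 46.02 per-machine budget
    return yearly_low, yearly_high, cluster_low, cluster_high, budget

print(savings(3.29, 4.38))
```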

Options to Consider

With that said, and on the understanding that there is no RAID redundancy involved in the setup I am (re)designing, there were four options to explore:

  1. Pull the RAID controller and attempt to utilise the DVD drive’s SATA connector with an SSD. This would solve the heat issue, solve the noise issue and reduce power consumption (to ~4w). It would also be faster than the 10,000 RPM rotational drive. The downside is that getting hold of affordable SSDs (as of writing) isn’t yet an option. Not to mention that various adapters and extra cabling would be required to get the SSD mounted properly (at extra cost). Modifying new cable runs into 1u servers can often be a challenge (it’s bad enough in 3u). The server BMC also complicates matters: under Dell, OpenManage will notice that you aren’t using a Dell-approved drive and this will quickly hit your environmental reporting data. Approximate cost ~£70+ per server. Well over budget.
  2. Pull the RAID controller and mount an SSD/mSATA/M.2 on a PCIe adapter (potentially even in the RAID controller’s slot). This solves the cabling problem and has the added advantage of clearing both drive slots. It also means that I can control the bus specification, potentially getting a boost from a SATA III or NVMe controller. Of course this is more expensive, although it is easier to get hold of smaller mSATA SSDs than it is 2.5″ ones. Cost per server ~£125+. Again, over budget.
  3. Look at SATA DOM or booting from Compact Flash/SD card. SATA DOM isn’t an option for the PowerEdge 1950 and a NAND flash solution would require modification of the chassis. The headache of managing boot support would also be an issue, rendering this option unrealistic.
  4. Pull the RAID controller and disk and boot the entire enclosure from USB. This solves pretty much all problems but does add one in that these servers do not have an internal USB port. The active OS drive would therefore need to be insecurely exposed and accessible within the rack. Think malicious intent through to “I need a memory stick… ah, no one will notice if I use that one”. The cost of an average consumer USB 3.0 16GB USB Flash Drive (UFD) is about £7 – and it just so happened that operations had boxes of new ones lying around for the ‘pilfering’ fully authorised, fully funded project.

I decided to experiment with option 4 and started to investigate how to boot Hyper-V from USB.

How to

Running Hyper-V Server from a UFD is a supported mechanism (as long as you use supported hardware types and not a consumer off-the-shelf UFD like I am).

The main Microsoft article on this topic was written for Hyper-V Server 2008 R2; however, a set of liner notes with hardware recommendations is also available for 2012/R2.

View: Run Hyper-V Server from a USB Flash Drive

View: Deploying Microsoft Hyper-V Server 2008 R2 on USB Flash Drive

 

So far, so good. The basic premise is that you use disk virtualisation and the Windows 7/8 boot loader to bootstrap the operating system. Hyper-V Server is installed into a VHD; the boot loader then mounts the VHD and loads Windows as if it were any other virtual machine. The performance will suffer, but for Windows Server Core, this really doesn’t matter.
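To make the premise concrete, the BCD wiring for a native-boot VHD looks roughly like the following (the entry GUID and the VHD filename are illustrative; bcdedit /copy prints the real GUID to substitute):

```
rem Clone an existing boot entry and retarget it at the VHD on the UFD.
bcdedit /copy {default} /d "Hyper-V Server (VHD boot)"

rem Substitute the {guid} that the /copy command printed.
bcdedit /set {guid} device vhd=[locate]\hyperv.vhd
bcdedit /set {guid} osdevice vhd=[locate]\hyperv.vhd
bcdedit /set {guid} detecthal on
```

The `[locate]` token asks the boot loader to find the volume containing the VHD itself, which is convenient when the UFD’s drive letter is not fixed.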

Microsoft states that USB 2.0 or higher must be used and that (for OEM redistributors) the UFD must not report itself as being ejectable.

Microsoft recommends the following drives:

  • Kingston DataTraveler Ultimate
  • Super Talent Express RC8
  • Western Digital My Passport Enterprise

The closest that I could find were 16GB Kingston DataTraveler G4’s. Based upon UserBenchmark data, these offer 45% lower overall performance than the DataTraveler Ultimate G3, with the Ultimate’s write speeds some 131% higher. Similarly, USB3Speed reports that the G4 read/write is 102.86/31.48 MB/second on a USB 3.0 bus vs. 174.76/38.46 for the Ultimate G3. So there is a decisive bottleneck being introduced as a result of using a cheaper UFD model.

The Microsoft article recommends the use of 16GB UFD’s rather than 8GB ones to allow for the installation of future updates, so I grabbed 4x 16GB DataTraveler G4 sticks and proceeded to prepare them to support the boot process.

View: UserBenchmark: DataTraveler Ultimate G3 vs. DataTraveler G4

View: USB3Speed

Check your Server for Suitability

The Microsoft article states that USB 2.0 is supported for Hyper-V Server USB Booting. I confirmed through empirical experimentation that the PowerEdge 1950 does support USB 2.0 and that its firmware supported booting from USB in a reliable, consistent way.

What I mean here is that you don’t want to have to go into an F12 boot menu every time you restart the server because the BIOS/UEFI will not automatically attempt to boot from the USB port. You should – as a matter of course – update your server firmware, including (but not limited to) the BIOS/UEFI, as a way to mitigate any potentially solvable issues in this regard. Do remember, however, that in a clustered environment you should normalise your hardware and firmware setup on all participating nodes before you set out to create the cluster.

In testing, the PowerEdge 1950 demonstrated that it could boot properly from the UFD without intervention. With another tick in the box, the idea was looking increasingly viable.

The USB Stick & VHD(X) Creation Process

I am not going to repeat the instructions for creating the bootable USB stick; they are clear enough on the Microsoft website. It is a shame that Microsoft closed the TechNet Code Library, meaning that you can no longer get access to the automated tool.

What I will add is that as I was installing Windows Hyper-V Server 2012 R2, I decided to attempt to convert the VHD to the newer VHDX format. The advantages here are nominal; better crash recovery and support for 4K drives are the main headlines. Regardless, I wanted to start with the latest rather than using VHD as prescribed in the 2008 R2 creation guide.

It didn’t work. The boot loader seemed unable to read the VHDX file. Running the VHDX back through Hyper-V’s disk editor and into a VHD did however work. After some testing, I discovered that the issue was in the VHD migration process: to use VHDX you must update the BCD using the Windows 8.1 version of BCDEdit and start with the Windows 8.1 boot loader. Repeating the process from scratch in a native VHDX did result in a bootable OS.

I had initially started testing Hyper-V Server on an 8GB UFD. During the process, and having obtained a 16GB drive, I decided to expand the size of the VHD from 7 to 14GB. This was a mistake. The VHD itself expands fine; however, Windows will not allow you to resize the VHD’s primary partition to fill the newly available space via the GUI or DiskPart. So unless you have access to partition management tools that can work with a mounted VHD(X), you will need to ensure that the size of the VHD is correct when you create it.
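Since the partition cannot be grown afterwards, it pays to get the geometry right at creation time. A diskpart script along these lines (the file name, drive letter and the 14336MB size are all illustrative) creates a correctly-sized fixed VHD ready to have the OS image applied to it:

```
rem Run with: diskpart /s create-vhd.txt
rem Sizes are in MB; 14336 MB = 14 GB.
create vdisk file=D:\hyperv.vhd maximum=14336 type=fixed
select vdisk file=D:\hyperv.vhd
attach vdisk
create partition primary
format fs=ntfs quick label=HyperV
assign letter=V
```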

The file copy of the 14GB VHD file from the management computer onto the UFD (with write cache enabled) was excruciating. Managing around 12.8MB/s from a USB 2.0 port, it was getting far less than the benchmarked speed of 31MB/s.

Windows file copy showing 12.8MB/s

Finally, I created a Tools folder on the root of each UFD and copied the Windows 8.1 x64 versions of:

  • ImageX.exe
  • bcdedit.exe
  • bcdboot.exe
  • bootsect.exe

I also copied the x86 version of a Microsoft utility called dskcache.exe into here. dskcache can be used to enable/disable write caching and buffer flushing on connected hard drives. You could directly inject these into the VHD if you wanted to; however, if left on the UFD, they are serviceable.

Also note that this is your best opportunity to inject drivers into the VHD should you have any special hardware requirements.

The Results

USB 2.0

Despite the Microsoft article stating that USB 2.0 is supported, it became obvious within about 20 seconds of the boot process that something was not right. The time that it took to boot was agonising. Given the poor sustained file write speed shown above, this shouldn’t be overly surprising.

It took well over 60 seconds for the boot loader itself to start, let alone bootstrap the VHD and load the rest of the operating system. The initial boot time was about 25 minutes – although the system does have to go through OOBE and perform the driver and HAL customisation processes during the initial boot, so it isn’t very fair to be overly critical at this stage.

The next point of suffering was encountered at the lock screen. On pressing Ctrl + Alt + Del, a 15 second delay elapsed before the screen refreshed and offered the log-in text fields. After resetting the password and logging on, the blue Hyper-V Server configuration sconfig script took around 90 seconds to load. In short, the system was painfully unresponsive.

I had expected it to be sluggish – but I was not expecting it to be quite this bad.

Windows had loaded the UFD’s VHD file with the write cache enabled but buffer flushing (‘advanced features’) disabled. I thus used dskcache.exe to enable both settings.

dskcache +p +w

… and rebooted.

The boot time was around 4 minutes, the Ctrl + Alt + Del screen was still sluggish as was the login process – but it was certainly faster. Having completed the first Windows Update run, boot times to a password entry screen had reduced to a far more respectable 1 minute and 17 seconds. The sluggishness (while still there) had again reduced to 10-15 seconds from log-in to sconfig.

So what is the problem? There are certainly a lot of variables here:

  • The DataTraveler G4 does not offer the performance it is supposed to
  • The bus is USB 2.0
  • There is an artificial abstraction layer being imposed by the disk virtualisation process in and out of the VHD
  • Behind the scenes, Windows still likely thinks that this device is removable and is reacting accordingly
  • While the VHD upload process was a linear one that consisted of a single large file, the operating system will be making thousands of random seeks and random small writes. Random I/O and Linear I/O always offer different statistics – the latter being more synthetic than real world usage will otherwise offer.

8GB

As I mentioned previously, my original test with Hyper-V Server 2012 was on an 8GB UFD with a 7GB primary partition. After install, Hyper-V Server 2012 R2 consumes 2.98GB with no page file. By the time Windows Update had scanned, downloaded and attempted to install updates – including the 870MB Windows Server 2012 R2 Update 1 (KB2919255) – there was only 154MB of free disk space available. It was unable to complete the installation as a result.

Having ascertained that I could not resize the partition post-creation, I recreated the VHDX once again, from scratch, onto one of the 16GB sticks.

16GB

Installing Hyper-V Server 2012 R2 into a 14GB VHD on a 16GB stick left plenty of available disk space. By the time that Windows Update had downloaded and subsequently attempted to install all available Windows Server 2012 R2 updates, there was 4.52 GB free.

At this point the hypervisor itself had still not been configured, nor had required support tools such as security software, Dell OpenManage or Dell EqualLogic Host Integration Tools been installed.

Therefore, as with the advice offered on the Microsoft article, do not attempt to run Windows Hyper-V Server 2012 R2 from anything smaller than a 16GB memory stick. If you do, you are going to encounter longevity and maintenance problems with your deployments. In practice you should not consider using anything smaller than 32GB. I can see a time within the next couple of years when the 16GB installation will (as with the 8GB installation) be too large to continue to self-update.

This is significant and should be something that you factor in during design: if Failover Cluster Manager spots a mismatched DSM driver version (i.e. an out-of-sync Windows Update state between cluster nodes in the case of the Microsoft driver), validation will fail and Microsoft will not offer support for your setup. Being unable to install updates is therefore not a situation that you want for your clustered Hyper-V Server environments.

Windows Update

As a side note, it is worth pointing out that I ran Windows Update on the live UFD in the server while it was booted. One of the advantages of the UFD approach is that it is easy to keep a box of pre-configured UFD’s in a drawer that can be grabbed as a fast way to stand up a new server or recover a server when its existing UFD has failed. Windows Update maintenance of these UFD’s is made far easier if you use DISM to offline-service the VHD’s and apply Windows updates to the image before you even start to use the memory stick.
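As a sketch of that offline servicing workflow (the mount letter, paths and the update package name are all illustrative), attaching the VHD and applying a package looks something like this:

```
rem Attach the VHD (e.g. via diskpart) so that it is visible as V:\ first.
rem Then apply an update package to the offline image inside the VHD.
dism /Image:V:\ /Add-Package /PackagePath:C:\updates\update.msu

rem Optionally reclaim superseded component store space (2012 R2 images).
dism /Image:V:\ /Cleanup-Image /StartComponentCleanup
```

The update package must of course match the OS version inside the VHD.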

You can periodically update the box of UFD’s to the latest patch revision, meaning that should you ever need to use one, you will have a far more up-to-date fresh install of the OS to hand. Something that is significantly faster than performing on-line servicing.

Improving Performance

There were three choices at this point: abandon the project, focus on the UFD and buy the higher-spec drive (£30 vs £7), or focus on the controller. Looking at prices, the controller was the cheaper option to explore.

USB 2.0 is an old technology. Its maximum theoretical bit rate is 480Mbps (that’s Megabits per second, not MegaBytes), which equates to 60MB/s (MegaBytes per second). If we compare this with USB 3.0, whose maximum theoretical bit rate is 5Gbps (Gigabits), or 625MB/s raw – around 500MB/s once 8b/10b line encoding is accounted for – we can see a very clear route to better performance. In practice, USB 3.0 isn’t going to get anywhere near that figure, however a quick trip to eBay revealed controller pricing of between £6 and £35, making it something that was easier to swallow inside my £46.02 budget.
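The bus arithmetic is easy to sanity-check. This sketch just does the unit conversion (8 bits per byte), plus the roughly 20% that USB 3.0’s 8b/10b line encoding takes off the raw signalling rate before any protocol overhead:

```python
# Convert a theoretical bus signalling rate in megabits per second
# into megabytes per second (8 bits per byte).
def mbit_to_mbyte(mbit: float) -> float:
    return mbit / 8

usb2 = mbit_to_mbyte(480)        # USB 2.0: 60.0 MB/s theoretical
usb3_raw = mbit_to_mbyte(5000)   # USB 3.0: 625.0 MB/s raw signalling
usb3_net = usb3_raw * 0.8        # ~500 MB/s after 8b/10b encoding

print(usb2, usb3_raw, usb3_net)
```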

After researching chipset options, I narrowed it down to three chip options: the Etron EJ198, which seems to have the fastest benchmark figures; the Renesas (formerly NEC) D720202, the newer version of the D720201, coming in a close second; and finally the cheap and cheerful VIA Labs (VLI) 805-06 1501.

After further research, I found a lot of reports of compatibility issues with the Etron which, coupled with its higher price, meant I abandoned it. So I picked up a £6.45 VLI dual-port card and a dual-port Renesas card for £12.66, simply as a means to have two different chips to test with.

Total spend on project: £25.91. Still well within budget.

Before starting, I had a working theory that introducing the USB PCIe controller was going to break the BIOS’s ability to boot from the USB port. Despite extensive research, I was unable to find any controller cards online that stated the presence of an Option ROM to explicitly offer boot support. So ultimately I may have spent £25.91 for nothing, particularly as USB 3.0 may not be able to add anything to the already I/O-constrained cheaper 16GB UFD; but at this point there was still £20.11 left in the budget, which was available to use if chasing USB 3.0 was a red herring. Consequently I was able to pick up a Kingston DataTraveler Ultimate G3 for £16.99 from eBay to allow for a thorough exploration of both avenues.

Total spend on project: £42.90. Still £3.20 left in the budget for a cup of tea!

Kingston DataTraveler Ultimate G3

The first thing to note was that it is a far larger memory stick and as such is a lot more obvious and significantly more intrusive sitting on the rear I/O plane of the server. You would definitely want to internally mount this larger UFD simply to protect it from damage caused during routine maintenance and cable management activities.

I elected not to re-create the experiment from scratch with a full 30/31GB VHDX, so instead I copied the existing 14GB VHDX from the existing UFD. Over a USB 2.0 bus, a UFD-to-UFD copy resulted in an 18.1MB/s transfer speed – an immediate 5.3MB/s improvement. Repeating the file transfer from the hard drive onto the UFD increased this further to 24.6MB/s – an improvement of 11.8MB/s and nearly a doubling of the write speed onto the UFD.

Windows file copy showing 24.6MB/s

Testing the connection on the server’s USB 2.0 bus, the performance difference was immediate. While still occasionally lagging, and significantly slower than even a 7,200 RPM rotational hard drive, its responsiveness was now at a point where I concluded that performance was acceptable – even on the USB 2.0 bus.

Rolling the VHDX back to an older, un-patched version of the image and having the server self-update was a better experience, with the update process lasting a few hours rather than all day.

I did however start to experience some operational problems with the higher-specification drive. For example, while I had no problems with the cheaper drive (eventually) completing tasks, the DataTraveler Ultimate G3 could not complete some DISM servicing activities, citing “Error: 1726 The remote procedure call failed”. This could be indicative of the start of a drive failure or some form of corruption in the VHD.

USB 3.0

The cheaper £6.45 USB 3.0 controller arrived first and I threw it into a PCIe 1x slot on the test system. I then retested the file copy on both the Ultimate G3 and the DataTraveler G4 to see if there was any improvement in performance.

Windows file copy showing 16.3MB/s

The DataTraveler G4 copied up at around the 16.3 MB/s mark. This is a 3.5 MB/s improvement over the 12.8 MB/s off of the USB 2.0 controller, but nothing compared to the 24.6 MB/s of the DataTraveler Ultimate G3 on the USB 2.0 controller.

So what about the performance of the DataTraveler Ultimate G3 on the USB 3.0 bus? The result was quite phenomenal in comparison.

Windows file copy showing 88.2MB/s

88.2 MB/s, some 63.6 MB/s faster than the same drive on the USB 2.0 bus – some 705.6 Megabits per-second. Not bad for a £6.45 VIA Labs chip from eBay!

As anticipated, however, the lack of an Option ROM was the downfall of the experiment. The BIOS was unable to ‘see’ the USB 3.0 controller as an add-in device during POST and thus was unable to boot from it.

I attempted to create a dual-USB boot solution where the VHDX file lived on a memory stick attached to the USB 3.0 bus, with a second memory stick containing the boot loader on the motherboard’s USB 2.0 port. Sadly, no amount of tinkering could get the system to link one to the other: 'No bootable device -- insert boot disk and press any key'.

The second, more expensive Renesas USB 3.0 controller arrived around a week later. Just as with the cheaper VIA Labs controller, there was no possibility of getting it to boot directly either.

Writing onto the cheaper DataTraveler G4 using the Renesas card actually managed a throughput of 19.3 MB/s. Repeating the test with the DataTraveler Ultimate G3 yielded a write speed of 93.0 MB/s, again showing an improvement over the VIA – albeit not a particularly significant one given that it was double the price.

In summary the write speeds for performing the large file transfer of the VHD onto the memory stick are shown below.

Write Speed: MegaBytes per second (MB/s) – higher is better

Memory Stick               USB 2.0   USB 3.0 (VIA)   USB 3.0 (Renesas)
DataTraveler G4              12.8        16.3             19.3
DataTraveler Ultimate G3     24.6        88.2             93.0

From a subjective point of view, the use of the DataTraveler Ultimate G3 on the USB 2.0 bus was “acceptable” – acceptable given what the system needed to do. Thus the random read/write bottleneck can be concluded as being in the memory stick and not the controller itself.

Update 08/04/2019: The VIA controller only lasted around 6 months before it started causing system instability (blue screens). Shortly there-after it died. The Renesas controller is still going strong!

 

Conclusion

So, having spent £42.90 on the experiment, what conclusions can be drawn?

Many of you have probably been shouting the obvious here: that the best way to reduce costs would be to obtain more efficient servers and consolidate the old ones into fewer appliances. This is true because newer servers:

  • Have more efficient, less power hungry, higher capacity components
  • Emit less heat
  • Have more efficient power supplies
  • Have newer, better fans
  • Can consolidate more virtual servers

In the real world most of us don’t work for Google or Microsoft and we cannot get management to agree to write blank cheques. Neither can most start-ups, home lab builders, ‘hand-me-down’ dev-test environments or backup environments. The short of it is that if you want to save some money, reduce heat and in turn reduce noise (always useful in a home environment), a £40 to £50 saving a year can go a long way. So spending £42.90 wasn’t unreasonable.

USB 2.0 is ‘good enough’, especially for testing environments. There are clear performance advantages with USB 3.0; however, you are going to need USB 3.0-enabled boot support to make practical use of this technology, and even then you should consider other solutions such as a small SSD or SATA DOM first. If you are in a position to add bootable USB 3.0 to your system, though, it is a very viable option.

The biggest headline from this process has been that not all UFD’s are created equal. The wide and varied margin between different models from the same company was surprising – especially with both devices claiming USB 3.0 feature sets. The benchmark statistics are so stark as to prove that there is virtually no point in having USB 3.0 if you are going to use a low-end UFD.

For Hyper-V Server, with the correct investment in your UFD, you can make USB 2.0 suffice for your needs – as long as you realise that it will not be as fast as a rotational drive. If you do not reboot your environment very often, it might just be good enough for your requirements.

For me personally, I will be migrating the testing cluster over to VHDX/UFD-booting hypervisors. There is a cost-saving rationale that helps me to keep the testing devices running. On a more personal level, for home, I have created UFD devices for a couple of desktop machines that are in my lab and these have been set up as off-line nodes in my cluster. The value here is that they can become hypervisors for a short time without interfering with the OS or drives. Even more importantly, I do not have to worry about multi-booting. With these UFD’s I plan on simplifying the maintenance process of the main environment so that I no longer need to have downtime on my setup.

So why would you want to consider creating a UFD boot setup for your hypervisors? There are some advantages, just as there are clearly some disadvantages.

Advantages

  • There is potentially a financial saving to be made as a result of power consumption reduction. This is especially true for large clusters and whole racks of servers using shared storage
  • It is a very easy way to make a low-cost, reportable environmental sustainability push. This is particularly true if you are not yet able to dispose of your legacy hardware
  • It works well with Microsoft’s push towards the use of SMB 3.0 for low-cost Hyper-V shared storage setups for SMB’s
  • If you accept RAID as being unnecessary in a clustered environment, then in the event of a UFD failure you can simply keep a box of pre-configured UFD’s in a drawer, allowing you to get the hypervisor up and running again and back into the cluster very quickly. Offline servicing can also be used to very easily keep the off-line UFD devices patched
  • Heat reduction was my main driver. By removing the hot RAID/SAS JBOD controllers there is a thermal saving. There is also potentially an area of additional cost saving in environmental cooling
  • It is extremely cheap to implement. Not specifying your new Hyper-V Server purchase with hard drives will more than pay for the cost and time of setting up the environment. Most new servers will have an internal USB port within the chassis and you can use this to your advantage for security. The UFD approach is cheaper than similar SSD/mSATA alternatives
  • Removing hard drives cuts down on power, use, heat and noise. This is less important for Enterprise but for a small business or an average home/home lab user this might be a very important driver
  • The convenience of a UFD makes this a very good option to keep in mind for emergency planning/disaster recovery. You can throw a pre-configured UFD into any server or even desktop and have it running a serviceable hypervisor in minutes, all without impacting the original server’s drives. Simply remove the UFD and reboot, and it goes back to doing whatever it was doing previously. This is potentially very useful for an SMB with limited resources that needs to service a running hypervisor without downtime. If you can temporarily promote a separate machine to be a hypervisor by plugging in a UFD and rebooting, you can creatively increase your organisational uptime
  • It is becoming difficult to purchase small (32/64GB) SSDs while it remains easy to obtain smaller UFD’s. This saves money as you will not need to buy a 128GB SSD to support a 20GB requirement
  • You can use either the VHD or the newer VHDX format. VHDX offers better failure safeguards, 4k sector support and is the only real choice for UEFI setups

Disadvantages

  • There is no support for disk redundancy in the setup described in this article. If you require an OS underpinned by a mirror, then this is not something to consider and you should look at SSDs
  • Most Enterprise scale deployments will make use of scripted, rapidly provisioned PXE deployments of Hyper-V Server. The use of VHDX means that you will be unable to use these technologies
  • The UFD to VHD(x) abstraction process introduced by disk virtualisation adds a performance penalty
  • As has been demonstrated, UFD’s are slower than rotational drives and are considerably slower than SSDs
  • The longevity of the UFD being used for this purpose is unknown. In the absence of reliable MTBF figures, most Enterprise users probably wouldn’t (and shouldn’t) consider it
  • Integration with server management tools such as OpenManage may be a problem for your OEM. This in turn may have an impact on support and warranty options.

 

In summary: For the average Enterprise user on primary production kit it may not be something that you want to consider. In some use cases, such as for backup, testing or disaster recovery environments there are clear advantages. Especially if you are prepared to be creative!

Memory Leak in SvcHost.exe on Microsoft.XmlHttp (IXMLHTTPRequest) .Send() when called from CScript.exe or WScript.exe

System Requirements:

  • Windows Server 2008 R2

The Problem:

Svchost.exe, that black box amongst many other black boxes. If you ever happened to be in the business of watching what your scripts are getting up to on a Sunday morning and you are using Microsoft.XmlHttp, then you might be in for a surprise.

Every 2 hours, a batch process on a group of servers fires off a script that in turn iteratively runs a second VBS script some 200-300 times. The script calls a web service and performs a push/pull of instructions. Within a few days of the Patch Tuesday reboot, you start noticing that memory use is going up, and up, and up.

You’ve done all of your deallocations, right? “set xmlHttp = nothing”? Yep, but despite that, memory use continues to grow. The culprit: svchost.exe. It grows until it’s into the page file and then grows a little bit more. Every run of the script puts between 4 and 100KB onto the memory footprint. At the end of the month, the servers are groaning because of memory starvation and your SAN arrays are not happy because of all of the paging.

True story.

More Info

I have been able to reproduce this on 3 separate and wholly independent Server 2008 R2 systems (read: different clients, enterprise/retail licensing, server hardware and install images) as well as on related servers (read: from the same image on the same or similar hardware). I have attempted to reproduce it on Windows Server 2012 R2 and was not successful; Server 2012 R2 does not appear to be impacted by the issue. Running the iterator loop below for 10 minutes yields no increase in the memory use curve on the operating system, just the constant cycle of assign, release, assign, release that you would expect to see.

After a lot of diagnostics and a lot of me initially assuming that the problem was the web service (many, many wasted hours… although I did find a few bugs in the service code itself…) I managed to narrow it down to Microsoft.XmlHttp. More specifically, it’s in the way that CScript or WScript interfaces with Microsoft.XmlHttp at initialisation.

As you probably know, svchost itself is just a service wrapper. Inspection of the wrapper reveals a number of services running inside the wrapper. In this case the specific services are:

  • COM+ Event System
  • Windows Font Cache Service
  • Network List Service
  • Network Store Interface Service
  • Secure Socket Tunneling Protocol Service
  • WinHTTP Web Proxy Auto-Discovery

There are two things here that could be interesting, COM+ Event System and WinHTTP Web Proxy. Microsoft.XMLHTTP itself relies upon the WinHTTP stack for operation, but we are also using a COM interface to call it from VBScript.

While we cannot shut down the COM+ Event System and expect the operating system to survive for long, we can stop the WinHTTP Web Proxy Auto-Discovery Service. Did stopping it release the memory consumed by the leak? No. So on the balance of probabilities, it’s coming from COM+.
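For reference, the elimination test can be run from an elevated command prompt. WinHttpAutoProxySvc is the short service name for the WinHTTP Web Proxy Auto-Discovery Service on Server 2008 R2:

```
rem Stop the service, then re-check the svchost.exe working set
rem in Task Manager or Process Explorer
net stop WinHttpAutoProxySvc

rem Restart it once you have finished observing
net start WinHttpAutoProxySvc
```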

The problem with that is the need to reboot the server to safely clear the memory leak, hence why Patch Tuesday has been the true saviour in keeping a gradual performance bottleneck from becoming a full-scale meltdown. So what is going on?

I stripped off all of the web service and customisation parts and went back to vanilla Microsoft implementation examples. We cannot get much simpler than this.

Option Explicit
Dim xml
set xml = CreateObject("Microsoft.XmlHttp")
xml.open "POST", "http://127.0.0.1", false
xml.send "he=llo"
set xml = nothing

Save it as a .vbs file and run it via CScript. Run it a lot – in a BAT file loop:

:start
cscript.exe testfile.vbs
goto start

Watch the svchost.exe processes until you spot the instance with the rising working set (or private working set). Now you know which one to focus on.
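If you are not sure which svchost.exe instance hosts which services, the built-in tasklist command will map service names to PIDs for you:

```
rem List every svchost.exe instance together with the services it hosts
tasklist /svc /fi "imagename eq svchost.exe"
```

Match the PID of the growing instance against the PID column in Task Manager or Process Explorer.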

It’s a memory leak. Hold on – we’ve created the instance of Microsoft.XmlHttp (which is actually an instance of IXMLHTTPRequest), done something and told CScript to deallocate it (set xml = nothing). Why is it leaking memory?

The third parameter on .Open() is bAsync – is it an asynchronous request? It’s false above, meaning that the request is synchronous. You would expect an asynchronous request to be more likely to leak than a synchronous one, however changing it to true makes no difference – it continues to leak either way.

So where is the leak being triggered? By a process of line elimination we can show that the memory is committed into the svchost wrapper during xml.send(). Run it without .Send() as below and there is no growth in the svchost process memory footprint, no matter how many times you run it:

Option Explicit
Dim xml
set xml = CreateObject("Microsoft.XmlHttp")
xml.open "POST", "http://127.0.0.1", false
' COMMENTED OUT      xml.send "he=llo"
set xml = nothing

In the MSDN documentation for the .Send() method, it states:

“If the input type is a BSTR, the response is always encoded as UTF-8. The caller must set a Content-Type header with the appropriate content type and include a charset parameter.”

So far we haven’t done that, and we are sending a VBString – which is ultimately a BSTR in C++ – so add the necessary setRequestHeader beneath the .Open() method call, in case the leak stems from not following the documentation:

Option Explicit
Dim xml
set xml = CreateObject("Microsoft.XmlHttp")
xml.open "POST", "http://127.0.0.1", false
xml.setRequestHeader "Content-Type", "application/x-www-form-urlencoded; Charset=UTF-8"
xml.send "he=llo"
set xml = nothing

It isn’t. There is no change; it still results in an increase in process memory after cscript.exe has shut down.

We have confirmed that there is a memory leak, where it sits and what is triggering it. Given the extremely simple nature of the sample code above – and its match to the documented samples – we can also be confident that it is being implemented correctly.

So the next step is to try to prove that there is an issue in the COM implementation between CreateObject and set nothing. This is achieved by running the allocate/deallocate (set/set nothing) in a loop, as shown below:

Option Explicit
Dim i
Dim xml

wscript.echo TypeName(xml)              ' This returns "empty" on this test

for i = 0 to 999
    set xml = CreateObject("Microsoft.XmlHttp")
    xml.open "POST", "http://127.0.0.1", false
    xml.setRequestHeader "Content-Type", "application/x-www-form-urlencoded; Charset=UTF-8"
    xml.send "he=llo"
    wscript.echo xml.responsexml.xml    ' This returns nothing on this test
    wscript.echo xml.statusText         ' This returns "OK" on this test
    set xml = nothing
next

wscript.echo TypeName(xml)              ' This returns "nothing" on this test

At this point you would expect to see a large increase in the svchost.exe memory footprint.

It does not happen.

1000 iterations and instantiations of IXMLHTTPRequest later, there is no runaway increase in the memory footprint of svchost.exe. It simply increments once, i.e. the additional memory consumption is no worse than running the script with only one call to CreateObject/set nothing, despite the fact that .send() has been called 1000 times.

What does that mean? Well, it would seem to suggest that the fault isn’t actually in IXMLHTTPRequest (Microsoft.XMLHTTP), but in VBScript itself. Speculatively, I would suggest that VBScript is registering event callbacks with the COM+ Event System on the first call to .Send() which are not being cleaned up by the garbage collector when “set nothing” is called in the code. So either there is a bug in VBScript, or there is a bug in the COM+ event registration interface through which IXMLHTTPRequest registers its own actions.

Most people aren’t going to notice this problem; they are more likely to iterate instances of Microsoft.XmlHttp inside VBScript than to repeatedly iterate across it externally. It just so happens that I need to fire it externally to the script processor via the command shell. The chances are that if you are reading this, so do you.

The Fix

As of writing, I have not found a direct way to force VBScript to release the memory from svchost, short of rebooting (or migrating to Windows Server 2012). Calling Microsoft.XmlHttp from WScript or CScript seems to be the problem, and the fact that the web service scripts use an external iterator to repeatedly spawn new instances of CScript exacerbates the situation. Simply put, the transaction load is the catalyst for spotting the leak. In most cases growth would be very subtle, as it would be were the iteration internal to the cscript.exe script instance.

While not necessarily ideal, if you are in the position of being able to change provider, you can substitute Microsoft.XMLHTTP for MSXML2.ServerXMLHTTP, which provides most of the same functionality via a different underlying HTTP stack. This provider does not exhibit the memory growth issue seen in its client counterpart; however, its use requires MSXML 3 or 6 and you lose some functionality.
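A minimal sketch of the substitution – the ProgID changes and, for a simple POST such as the test case above, nothing else needs to. This assumes MSXML 6 is installed; MSXML2.ServerXMLHTTP.6.0 is the version-specific ProgID:

```
Option Explicit
Dim xml
' MSXML2.ServerXMLHTTP.6.0 rather than Microsoft.XmlHttp
set xml = CreateObject("MSXML2.ServerXMLHTTP.6.0")
xml.open "POST", "http://127.0.0.1", false
xml.setRequestHeader "Content-Type", "application/x-www-form-urlencoded; Charset=UTF-8"
xml.send "he=llo"
set xml = nothing
```

Note that ServerXMLHTTP does not pick up Internet Explorer proxy settings, so if your endpoint sits behind a proxy you will need to configure that separately.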

The fact that I could not reproduce the issue under Windows Server 2012 R2 suggests that the culprit has been fixed – either intentionally or inadvertently. By default, Microsoft.XMLHTTP is a COM Class ID reference to msxml3.dll. Under Windows Server 2008 R2 the file version is SP11 at 8.110.7601.18334; under 2012 R2 the file version is simply 8.110.9600.16483. Yet oddly, with all systems fully patched, vbscript.dll under Windows Server 2008 R2 is version 5.8.9600.17041 (KB2929437) while its counterpart under Server 2012 R2 is 5.8.9600.17031.

What I can tell you is that these systems have been running this recursion script every 2 hours since the beginning of 2012, and the issue has only been observed in recent months, so I suspect that Microsoft have a regression bug on their hands. Until it is fixed, however, I have a load of (thankfully firewalled, private network) web services that have a DoS vulnerability. So do you.

Performing WSUS 3.0 database maintenance (database re-indexing) via WsusDBMaintenance under Windows Server 2008

System Requirements:

  • Windows Server Update Services 3.0
  • Windows Server 2008, 2008 R2

The Problem:

Microsoft recommend that you perform monthly maintenance on your WSUS database to keep it in good order, and publish instructions for undertaking this activity.

This document simply seeks to clarify the process when using Windows Internal Database (instead of SQL Server) under Windows Server 2008/R2.

The Fix

The following summarises the steps involved in the process.

  1. Install the SQL Server Native Client for your processor architecture on the WSUS server
  2. Install the SQLCmd utility for your processor architecture on the WSUS server
  3. Copy & paste the T-SQL code from Re-index the WSUS 3.0 Database into a Notepad file and save it on c:\ as WsusDBMaintenance.sql (or download here)

If you wish to automate or schedule the task:

  1. Create a .cmd file on your desktop
  2. Enter the following into it:
"c:\Program Files\Microsoft SQL Server\90\Tools\Binn\sqlcmd.exe" -I -i"c:\WsusDbMaintenance.sql" -S "np:\\.\pipe\MSSQL$MICROSOFT##SSEE\sql\query"
  3. Note that the version of SQLCmd that you install will need to be reflected in the version number in the path (90 above). 90 = SQL Server 2005, 100 = SQL Server 2008 and so on.
  4. Right-click the .cmd file and select “Run as Administrator” to launch it through an elevated command prompt
  5. If you set up a scheduled task for this, remember to tick the “Run with highest privileges” option
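As a sketch, the scheduled task can also be created from the command line. The task name, schedule and file path below are examples; /rl HIGHEST is the command-line equivalent of the “Run with highest privileges” checkbox:

```
rem Run the maintenance script at 02:00 on the 1st of every month as SYSTEM
schtasks /create /tn "WSUS DB Maintenance" /tr "c:\WsusDbMaintenance.cmd" /sc monthly /d 1 /st 02:00 /ru SYSTEM /rl HIGHEST
```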

If you just want to run it once:

  1. Open an elevated command prompt
  2. Enter the command:
"c:\Program Files\Microsoft SQL Server\90\Tools\Binn\sqlcmd.exe" -I -i"c:\WsusDbMaintenance.sql" -S "np:\\.\pipe\MSSQL$MICROSOFT##SSEE\sql\query"
  3. Note that the version of SQLCmd that you install will need to be reflected in the version number in the path (90 above). 90 = SQL Server 2005, 100 = SQL Server 2008 and so on.

PEAR for PHP Error “No releases available for package “xxx” install failed” after running ‘pear install xxx’ on Windows Server 2008

System Requirements:

  • Windows Vista
  • Windows 7
  • Windows Server 2008, 2008 R2

The Problem:

You know that something a bit odd is going on when one of a batch of servers starts throwing errors that the others sailed past. In this case, configuring PEAR for a new PHP install with Mail, Mail_Mime and Net_SMTP (pear.php.net/mail, pear.php.net/mail_mime and pear.php.net/net_smtp) should be fairly standard. The other servers took the install, and even this server took Mail and Mail_Mime, but it would not accept Net_SMTP, returning:

C:\Program Files (x86)\PHP>pear install net_smtp
No releases available for package “pear.php.net/net_smtp”
install failed

After leaving it overnight (in case it was just downtime at the package repository), I rolled up my sleeves. The fix turned out to be fairly simple.

The Fix

If you are experiencing the same problem this server was having, running the following

pear remote-list

will result in

SECURITY ERROR: Will not write to C:\Users\<user[8.3]>\AppData\Local\Temp\pear\cache\e9b88593398eb79a9aa91024351d646arest.cacheid as it is symlinked to C:\Users\<user>\AppData\Local\Temp\pear\cache\e9b88593398eb79a9aa91024351d646arest.cacheid – Possible symlink attack

If you get something akin to the above, simply browse to:

C:\Users\<user>\AppData\Local\Temp\

and delete the pear folder
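From an elevated command prompt, the equivalent is the following (%TEMP% resolves to the per-user AppData\Local\Temp path shown above; PEAR will recreate the folder on its next run):

```
rem Remove the PEAR cache folder that triggers the symlink check
rmdir /s /q "%TEMP%\pear"
```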