Asus P7P55D Evo (P55 Express chipset) and an i7-860 with 32GB DDR3 12800/1600MHz RAM

System Requirements:

  • Asus P7P55D Evo
  • 32GB DDR3

The Problem:

The i7-860 and the P55 Express chipset are both listed as only supporting 16GB RAM. The product page for the P7P55D Evo lists it as only supporting 16GB RAM across 4 DIMM slots. I wanted to try it with 32GB to see if it would work.

View: P7P55D Evo Product Page
View: Intel Core i7-860 Specifications

More Info

I found some reasonably priced Crucial Ballistix Sport DIMMs with the following specification:

  • DDR3 PC3-12800
  • 9-9-9-24
  • Unbuffered
  • NON-ECC
  • DDR3-1600
  • 1.5V
  • 1024Meg x 64

The set that I picked up was a BLS4CP8G3D1609DS1S00BEU, a 32GB set of 4x8GB for £113. It is (as of writing) a discontinued kit that you cannot seem to pick up from crucial.com, but it is in essence two BLS2CP8G3D1609DS1S00CEU 16GB kits.

The Crucial RAM checker lists it as being incompatible with the P7P55D Evo, although the only difference between this and the compatible equivalent is that the last entry on the compatible DIMM's specification shows a value of 512Meg x 64 rather than 1024Meg x 64, making them 4GB DIMMs – in fact, these were exactly what I removed from the system to perform the test.

The results?

  • The system went through POST fine
  • The system went through the comprehensive POST Memory Test fine
  • The BIOS could see all 32GB
  • The system booted to Windows Server 2012 R2
  • Windows could identify all 32GB RAM as present and usable
  • The RAM passed the Microsoft Memory tester
  • The RAM accepted memory reservations made by Hyper-V for 30GB RAM across several different VMs (i.e. it could address and allocate past 16GB)
  • The system has been running for 4 days without incident running Windows Server 2012 R2

I call that a result: it would seem that the P55 Express chipset and the i7-860 are capable of making use of 8GB DIMMs.
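For anyone repeating the experiment, a quick way to confirm that the OS can actually see every DIMM and the full capacity is from an elevated command prompt. These are standard Windows tools and nothing specific to this board – a minimal check, not a full memory test:

rem List each DIMM that Windows can see (Capacity is reported in bytes)
wmic memorychip get BankLabel,Capacity,Speed

rem Confirm the total usable memory
systeminfo | find "Total Physical Memory"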

NB: Please keep in mind that the home editions of Windows are memory capped; if you want to address more than 16GB of RAM in Windows 7 or higher, you need the Professional or Enterprise versions of Windows, or a Server edition of Windows. Remember also that it must be an x64 (64-bit) edition and cannot be an x86 (32-bit) one.

Buy: BLS4CP8G3D1609DS1S00BEU 32GB Kit
Buy: BLS2CP8G3D1609DS1S00CEU 16GB Kit

View: BLS2CP8G3D1609DS1S00CEU on Crucial.com

Error 0x80070490 or 0x00000490 when attempting to connect to a Printer queue on a Windows Print Server

System Requirements:

  • Windows Vista, 7, 8, 8.1, 10
  • Windows Server 2008, 2008 R2, 2012, 2012 R2

The Problem:

I was having some problems automating the connection to a printer queue from a set of managed Windows 7 systems to a foreign, Windows Server-managed SafeCom printer queue. The device in question was a generic follow-me printing queue for Xerox WorkCentre 7655 devices (not that it is especially relevant).

On opening the SMB share to the print server and connecting to the printer queue, the system would go off for 60 seconds before coming back with error 0x00000490 and no description.

Exploring this error under Event Viewer > Applications and Services > Microsoft > Windows > PrintService > Admin reveals:

Installing printer driver Xerox GPD PS V3.2.303.16.0 failed, error code 0x490, HRESULT 0x80070490. See the event user data for context information.

The only additional information of any substance available in the user data was either

Parse Inf
ProcessDriverDependencies failed

or

PerformInfInstallActions
ParseInf failed

More Info

I also tried the following recommendations from general troubleshooting/elsewhere:

  1. Attempting to manually install the driver didn’t help
  2. Using pnputil -d to delete the driver oemXX.inf didn’t help (i.e. clearing the driver out of C:\Windows\System32\DriverStore\FileRepository)
  3. Using pnputil -a to manually add the desired driver didn’t help (the pnputil steps are sketched after this list)
  4. Using the Print Management MMC snap-in to flush the driver out (including renaming any referenced DLLs under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Print\Environments\Windows x64\Print Processors\winprint to xxx.old and restarting the print server to allow the deletion of stuck print drivers) didn’t help
  5. Obtaining older and new drivers from Xerox didn’t help
  6. The Microsoft Printer Troubleshooter tool
    View: Fixing Printer Problems
  7. Enabling the Operational log under Event Viewer > Applications and Services > Microsoft > Windows > PrintService didn’t add anything significant to the troubleshooting process, just:
    ParseInfAndCommitFileQueue
    PerformInfInstallActions failedProcessDriverDependencies
    FindLatestCoreDrivers failed
  8. Checking setupapi.app.log and setupapi.dev.log for errors under C:\Windows\inf did not show any errors; everything was reporting ‘success’
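For reference, the driver store clean-out attempted in steps 2 and 3 was along these lines. The oemXX.inf number and the .inf path are placeholders; pnputil -e tells you which oem number maps to the Xerox package:

rem Enumerate third-party packages in the driver store
pnputil -e

rem Remove the stale Xerox package reported by the enumeration
pnputil -d oemXX.inf

rem Manually stage the desired driver version
pnputil -a C:\Drivers\Xerox\driver.inf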

The Fix

Downloading and installing the older version of the print driver produced an identical error in Event Viewer to the one shown for the most up-to-date driver version. However, comparing the source driver files to the destination files that appeared after repository injection in C:\Windows\System32\DriverStore\FileRepository revealed some slight differences in file dates: the print server was sending slightly modified versions of the driver compared to the vanilla Xerox source of the same version number (2015 file dates for a 2013 Xerox driver package).

Some sleuthing through monitoring tools ultimately revealed the cause of the issue, and with it the fix. Windows was downloading the driver package from the target foreign print server (with its modified files) and injecting it into the repository correctly. Immediately afterwards, however, it was going off to our internal, public driver repository (an SMB share on a build server), finding additional copies of a compatible x64 Xerox driver, noticing that they were newer, and attempting to use the newer driver instead.

Without the driver customisations (presumably part of the SafeCom suite configuration for follow-me printing) the print server was immediately rejecting the connection.

So the lesson from this experience is that even if you are explicitly telling Windows to use a specific driver version, if it can find a newer version in a driver search path, it will attempt to pick that up and use it instead. Remove any media with drivers (UFD, CD, DVD, Floppy) and check/modify your driver search paths for conflicting drivers, as listed in:

HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\DevicePath
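As a rough illustration, you can review that search path list from an elevated command prompt; the stock value is %SystemRoot%\inf, with any additional (semicolon-separated) paths appended after it. If you need to trim a conflicting path out, back the value up first and edit it in regedit:

rem Show the current driver search path list
reg query "HKLM\Software\Microsoft\Windows\CurrentVersion" /v DevicePath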

Booting Windows Hyper-V Server from USB: Lessons From Practice

System Requirements

  • Windows Hyper-V Server 2008 R2
  • Windows Hyper-V Server 2012
  • Windows Hyper-V Server 2012 R2

The Problem

I recently wanted to explore the viability of pulling a Server RAID controller into a workstation. A few choice pieces of electrical tape to cover PCIe pins later and the card worked as intended… until it melted down a few minutes later.

The inevitable failure got me thinking. Like most enterprise hardware, the PERC 5/i and 6/i do not support processor power management. The onboard processor runs at 100% speed, 100% of the time. As a result the heat that it generated easily overwhelmed the modest airflow of a desktop. The thermals went well past 80 degrees C before it tripped out.

Most of the servers that we are running in one particular production stack were using the same controllers. Despite this, none of them were actually being used as RAID controllers. They were set as HBA/JBOD devices with a single drive attached – i.e. no disk redundancy. The reason why we have a production setup with such a bad design? These servers are clustered hypervisors. It doesn’t much matter if they burn out. There are 20 more to take their place and all actual client data is held within a fully redundant, complex storage network. An admin simply needs to replace the broken part, rebuild the OS and throw it back into the pool.

Cost Rationalisation

Was changing the design of these servers feasible? Each 10,000 RPM 70GB hard drive was holding at best 20GB of data – and less than 15GB in most cases. Each of those drives consumes 15-25w of power, makes noise and never sleeps. At the same time each controller consumes 6-18w of power and, again, never sleeps. Both add to the heat being thrown down through the backplane and out into the hot aisle. All of it pretty much needlessly.

Based upon my domestic energy tariff, the potential per-server electricity cost saving stands to be between £3.29 and £4.38 per month, or £39.48 and £52.56 per year. This does not include any residual savings in air conditioning costs. While it doesn’t seem a lot, on a cluster of 20 servers that’s between £789.60 and £1051.20 per year. At that level the potential savings start to add up.

As an IT designer, it also gives me a budgetary value that I can rationalise any savings against. If we split the difference over 12 months between the upper and lower estimate we get a £46.02 average. If it costs more than that – particularly for old server hardware – it isn’t worth doing: so £46.02 became the ‘per-machine budget’ for my experiment.

Options to Consider

With that said, and on the understanding that there is no RAID redundancy involved in the setup I am (re)designing, there were four options to explore:

  1. Pull the RAID controller and attempt to utilise the DVD drive SATA connector with an SSD. This would solve the heat issue, solve the noise issue and reduce power consumption (to ~4w). It would also be faster than the 10,000 RPM rotational drive. The downside is that getting hold of affordable SSDs (as of writing) isn’t yet an option, not to mention that various adapters and extra cabling would be required to get the SSD mounted properly (at extra cost). Modifying new cable runs into 1u servers can often be a challenge (it’s bad enough in 3u). The server BMC also complicates matters: under Dell, OpenManage will notice that you aren’t using a Dell approved drive and this will quickly hit your environmental reporting data. Approximate cost ~£70+ per server. Well over budget.
  2. Pull the RAID controller and mount an SSD/mSATA/m.2 into a PCIe slot (even potentially the RAID controller’s slot) on a PCIe adapter. This solves the cabling problem and has the added advantage of clearing both drive slots. It also means that I can control the bus specification, potentially getting a boost from a SATA III or NVMe controller. Of course this is more expensive, although it is easier to get hold of smaller mSATA SSDs than 2.5″ ones. Cost per-server ~£125+. Again, over budget.
  3. Look at SATA DOM or booting from Compact Flash/SD Card. SATA DOM isn’t an option for the PowerEdge 1950 and a NAND flash solution would require modification of the chassis. The headache of managing boot support would also be an issue, rendering this unrealistic.
  4. Pull the RAID controller and disk, and boot the entire enclosure from USB. This solves pretty much all problems but adds one: these servers do not have an internal USB port. The active OS drive would therefore need to be insecurely exposed and accessible within the rack. Think malicious intent through to “I need a memory stick… ah, no one will notice if I use that one”. The cost of an average consumer USB 3.0 16GB USB Flash Drive (UFD) is about £7 – and it just so happened that operations have boxes of new ones lying around for the pilfering – sorry, for the fully authorised, fully funded project.

I decided to experiment with option 4 and started to investigate how to boot Hyper-V from USB.

How to

Running Hyper-V Server from a UFD is a supported mechanism (as long as you use supported hardware types and not a consumer off-the-shelf UFD like I am).

The main Microsoft article on this topic was written for Hyper-V Server 2008 R2, however a set of liner notes with hardware recommendations are also available for 2012/R2.

View: Run Hyper-V Server from a USB Flash Drive

View: Deploying Microsoft Hyper-V Server 2008 R2 on USB Flash Drive

 

So far, so good. The basic premise is that you use disk virtualisation and the Windows 7/8 boot loader to bootstrap the operating system. Hyper-V Server is installed into a VHD; at boot time the boot loader mounts the VHD and loads Windows from it as if it were any other virtual machine. The performance will suffer, but for Windows Server Core this really doesn’t matter.
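For context, this is roughly the shape of the BCD plumbing involved once the Microsoft instructions have been followed – a sketch only, where {guid} stands for whatever identifier bcdedit returns from the copy and hyperv.vhd is a placeholder file name:

rem Clone an existing loader entry, then point it at the VHD on the UFD
bcdedit /copy {default} /d "Hyper-V Server (VHD boot)"
bcdedit /set {guid} device vhd=[locate]\hyperv.vhd
bcdedit /set {guid} osdevice vhd=[locate]\hyperv.vhd
bcdedit /set {guid} detecthal on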

Microsoft states that USB 2.0 or higher must be used and that (for OEM redistributors) the UFD must not report itself as being ejectable.

Microsoft recommends the following drives:

  • Kingston DataTraveler Ultimate
  • Super Talent Express RC8
  • Western Digital My Passport Enterprise

The closest that I could find were 16GB Kingston DataTraveler G4s. Based upon UserBenchmark data, these offer 45% lower overall performance than the DataTraveler Ultimate G3 and far lower write speeds (UserBenchmark puts the Ultimate G3’s writes some 131% ahead). Similarly, USB3Speed reports that the G4 reads/writes at 102.86/31.48 MB/second on a USB 3.0 bus vs. 174.76/38.46 for the Ultimate G3. So there is a decisive bottleneck being introduced as a result of using a cheaper UFD model.

The Microsoft article recommends the use of 16GB UFDs rather than 8GB ones to allow for the installation of future updates, so I grabbed 4x 16GB DataTraveler G4 sticks and proceeded to prepare them to support the boot process.

View: UserBenchmark: DataTraveler Ultimate G3 vs. DataTraveler G4

View: USB3Speed

Check your Server for Suitability

The Microsoft article states that USB 2.0 is supported for Hyper-V Server USB Booting. I confirmed through empirical experimentation that the PowerEdge 1950 does support USB 2.0 and that its firmware supported booting from USB in a reliable, consistent way.

What I mean here is that you don’t want to have to go into an F12 boot menu every time you restart the server because the BIOS/UEFI will not automatically attempt to boot from the USB port. You should – as a matter of course – update your server firmware, including (but not limited to) the BIOS/UEFI, as a way to mitigate any potentially solvable issues in this regard. Do remember, however, that in a clustered environment you should normalise your hardware and firmware setup on all participating nodes before you set out to create the cluster.

In testing, the PowerEdge 1950 demonstrated that it could boot properly from the UFD without intervention. Thus, with another tick in the box, the idea was looking increasingly viable.

The USB Stick & VHD(X) Creation Process

I am not going to repeat the instructions for creating the bootable USB stick; they are clear enough on the Microsoft website. It is a shame that Microsoft closed the TechNet Code Library, meaning that you can no longer get access to the automated tool.

What I will add is that, as I was installing Windows Hyper-V Server 2012 R2, I decided to attempt to convert the VHD to the newer VHDX format. The advantages here are nominal; better crash recovery and support for 4K drives are the main headlines. Regardless, I wanted to start with the latest rather than using the VHD as prescribed in the 2008 R2 creation guide.

It didn’t work. The boot loader seemed unable to read the VHDX file. Running the VHDX back through Hyper-V’s disk editor and into a VHD did, however, work. After some testing, I discovered that the issue was the VHD migration process. To use VHDX you must update the BCD using the Windows 8.1 version of BCDEdit and start with the Windows 8.1 boot loader. Repeating the process from scratch with a native VHDX did result in a bootable OS.

I had initially started testing Hyper-V Server on an 8GB UFD. During the process, having obtained a 16GB drive, I decided to expand the size of the VHD from 7 to 14GB. This was a mistake. The VHD expands fine, however Windows will not allow you to resize the VHD’s primary partition to fill the newly available space via the GUI or DiskPart. So unless you have access to partition management tools that can work with a mounted VHD(X), you will need to ensure that the size of the VHD is correct when you create it.
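In other words, get the size right up front. As a sketch, creating a fixed ~14GB VHD ready for a 16GB stick with DiskPart looks something like this (the file path, volume label, drive letter, WIM path and image index are all placeholders):

diskpart
create vdisk file=C:\Temp\hyperv.vhd maximum=14336 type=fixed
select vdisk file=C:\Temp\hyperv.vhd
attach vdisk
create partition primary
format fs=ntfs quick label=HyperV
assign letter=V
exit

rem Apply the Hyper-V Server image into the mounted VHD
imagex /apply D:\sources\install.wim 1 V:\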

Copying the 14GB VHD file from the management computer onto the UFD (with write cache enabled) was excruciating. At around 12.8MB/s from a USB 2.0 port, it was achieving far less than the benchmarked speed of 31MB/s.

Windows file copy showing 12.8MB/s

Finally, I created a Tools folder on the root of each UFD and copied the Windows 8.1 x64 versions of:

  • ImageX.exe
  • bcdedit.exe
  • bcdboot.exe
  • bootsect.exe

I also copied the x86 version of a Microsoft utility called dskcache.exe into here. dskcache can be used to enable/disable write caching and buffer flushing on connected hard drives. You could directly inject these into the VHD if you wanted to, however if left on the UFD they are serviceable.

Also note that this is your best opportunity to inject drivers into the VHD should you have any special hardware requirements.
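A hedged example of how that driver injection can be done with DISM while the VHD is still attached and mounted as V: (the driver folder is a placeholder for wherever your extracted drivers live):

rem Inject every driver found under the folder into the offline image inside the VHD
dism /Image:V:\ /Add-Driver /Driver:C:\Drivers\PowerEdge /Recurse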

The Results

USB 2.0

Despite the Microsoft article stating that USB 2.0 is supported, it became obvious within about 20 seconds of the boot process that something was not right. The time that it took to boot was agonising. Given the poor sustained file write speed shown above, this shouldn’t be overly surprising.

It took well over 60 seconds for the boot loader itself to start booting, let alone bootstrap the VHD and load the rest of the operating system. The initial boot time was about 25 minutes – although the system does have to go through OOBE and perform the driver and HAL customisation processes during the initial boot, so it isn’t very fair to be overly critical at this stage.

The next point of suffering was encountered at the lock screen. On pressing Ctrl + Alt + Del, a 15 second delay elapsed before the screen refreshed and offered the log-in text fields. After resetting the password and logging on, the blue Hyper-V Server configuration sconfig script took around 90 seconds to load. In short, the system was painfully unresponsive.

I had expected it to be sluggish – but I was not expecting it to be quite this bad.

Windows had loaded the UFD’s VHD file with the write cache enabled but buffer flushing (‘advanced features’) disabled. I thus used dskcache.exe to enable both settings.

dskcache +p +w

… and rebooted.

The boot time was around 4 minutes; the Ctrl + Alt + Del screen was still sluggish, as was the login process – but it was certainly faster. Having completed the first Windows Update run, boot times to a password entry screen had reduced to a far more respectable 1 minute and 17 seconds. The sluggishness (while still there) had again reduced, to 10-15 seconds from log-in to sconfig.

So what is the problem? There are certainly a lot of variables here:

  • The DataTraveler G4 does not offer the performance it is supposed to
  • The bus is USB 2.0
  • There is an artificial abstraction layer being imposed by the disk virtualisation process in and out of the VHD
  • Behind the scenes, Windows still likely thinks that this device is removable and is reacting accordingly
  • While the VHD upload process was a linear one that consisted of a single large file, the operating system will be making thousands of random seeks and random small writes. Random I/O and Linear I/O always offer different statistics – the latter being more synthetic than real world usage will otherwise offer.

8GB

As I mentioned previously, my original test with Hyper-V Server 2012 was on an 8GB UFD with a 7GB primary partition. After install, Hyper-V Server 2012 R2 consumes 2.98GB with no page file. By the time Windows Update had scanned, downloaded and attempted to install updates – including the 870MB Windows Server 2012 R2 Update 1 (KB2919255) – there was only 154MB of free disk space available. It was unable to complete the installation as a result.

Having ascertained that I could not resize the partition post-creation, I recreated the VHDX once again from scratch onto one of the 16GB sticks.

16GB

Installing Hyper-V Server 2012 R2 into a 14GB VHD on a 16GB stick left plenty of available disk space. By the time Windows Update had downloaded and subsequently attempted to install all available Windows Server 2012 R2 updates, there was 4.52GB free.

At this point the hypervisor itself had still not been configured, nor had required support tools been installed, such as security software, Dell OpenManage or the Dell EqualLogic Host Integration Tools.

Therefore, as with the advice offered on the Microsoft article, do not attempt to run Windows Hyper-V Server 2012 R2 from anything smaller than a 16GB memory stick. If you do, you are going to encounter longevity and maintenance problems with your deployments. In practice you should not consider using anything smaller than 32GB. I can see a time within the next couple of years when the 16GB installation will (as with the 8GB installation) be too large to continue to self-update.

This is significant and should be something that you factor in during design, as if Failover Cluster Manager spots a mismatched DSM driver version (i.e. an out of sync Windows Update state between cluster nodes in the case of the Microsoft driver), the validation will fail and Microsoft will not offer support for your setup. Being unable to install updates is therefore not a situation that you want for your clustered Hyper-V Server environments.

Windows Update

As a side note, it is worth pointing out that I ran Windows Update on the live UFD in the server while it was booted. One of the advantages of the UFD approach is that it is easy to keep a box of pre-configured UFDs in a drawer that can be grabbed as a fast way to stand up a new server or recover a server when its existing UFD has failed. Windows Update maintenance of these UFDs is made far easier if you use DISM to offline-service the VHDs and apply Windows Updates to the image before you even start to use the memory stick.

You can periodically update the box of UFDs to the latest patch revision, meaning that should you ever need to use one, you will have a far more up-to-date fresh install of the OS to hand – something that is significantly faster than performing online servicing.
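As a sketch of that offline servicing loop – assuming the spare UFD's VHD has been attached via DiskPart and mounted as V:, and that the update .msu/.cab files have already been downloaded to C:\Updates (both paths are placeholders):

rem Apply every update package in the folder to the offline image
dism /Image:V:\ /Add-Package /PackagePath:C:\Updates

rem Confirm what is now present in the image before detaching the VHD
dism /Image:V:\ /Get-Packages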

Improving Performance

There were three choices at this point: abandon the project, focus on the UFD and buy the higher-spec drive (£30 vs £7), or focus on the controller. Looking at prices, the controller was the cheaper option to explore.

USB 2.0 is an old technology. Its maximum theoretical bit rate is 480Mbps (that’s megabits per second, not megabytes), which equates to 60MB/s (megabytes per second). If we compare this with USB 3.0, whose maximum theoretical bit rate is 5Gbps (gigabits) or roughly 625MB/s, we can see a very clear route to better performance. In practice USB 3.0 isn’t going to get anywhere near 625MB/s, however a quick trip to eBay revealed controller pricing of between £6 and £35, making it something that was easier to swallow inside my £46.02 budget.

After researching chipset options, I narrowed it down to three chips: the Etron EJ198, which seems to have the fastest benchmark figures; the Renesas (formerly NEC) D720202, the newer version of the D720201, coming in a close second; and finally the cheap and cheerful VIA Labs (VLI) 805-06 1501.

After further research, I found a lot of reports of compatibility issues with the Etron which, coupled with its higher price, meant I abandoned it. So I picked up a £6.45 VLI dual port card and a dual port Renesas card for £12.66, simply as a means of having two different chips to test with.

Total spend on project: £25.91. Still well within budget.

Before starting I had a working theory that introducing the USB PCIe controller was going to break the BIOS’s ability to boot from the USB port. Despite extensive research, I was unable to find any controller cards online that stated the presence of an Option ROM to explicitly offer boot support. So ultimately I may have spent £25.91 for nothing, particularly as USB 3.0 may not be able to add anything to the already I/O constrained cheaper 16GB UFD; but at this point there was still £20.11 left in the budget which was available to use if chasing USB 3.0 turned out to be a red herring. Consequently I was able to pick up a Kingston DataTraveler Ultimate G3 for £16.99 from eBay to allow for a thorough exploration of both avenues.

Total spend on project: £42.90. Still £3.20 left in the budget for a cup of tea!

Kingston DataTraveler Ultimate G3

The first thing to note was that it is a far larger memory stick and as such is a lot more obvious and significantly more intrusive sitting on the rear I/O plane of the server. You would definitely want to internally mount this larger UFD simply to protect it from damage caused during routine maintenance and cable management activities.

I elected not to re-create the experiment from scratch with a full 30/31GB VHDX, so instead I copied the existing 14GB VHDX from the existing UFD. Over a USB 2.0 bus, a UFD to UFD copy resulted in an 18.1MB/s transfer speed – an immediate 5.3MB/s improvement. Repeating the file transfer from the hard drive onto the UFD increased this further to 24.6MB/s – an improvement of 11.8MB/s and nearly a doubling of the write speed onto the UFD.

Windows file copy showing 24.6MB/s

Testing the connection on the server’s USB 2.0 bus, the performance difference was immediate. While still occasionally lagging, and significantly slower than even a 7,200 RPM rotational hard drive, its responsiveness was now at a point where I concluded that performance was acceptable – even on the USB 2.0 bus.

Rolling the VHDX back to an older, un-patched version of the image and having the server self-update was a better experience, with the update process lasting a few hours rather than all day.

I did however start to experience some operational problems with the higher specification drive. For example, while I had no problems with the cheaper drive (eventually) completing tasks, the DataTraveler Ultimate G3 could not complete some DISM servicing activities, citing “Error: 1726 The remote procedure call failed”. This could be illustrative of the start of a drive failure or some form of corruption in the VHD.

USB 3.0

The cheaper £6.45 USB 3.0 controller arrived first and I threw it into a PCIe 1x slot on the test system. I then retested the file copy on both the Ultimate G3 and the DataTraveler G4 to see if there was any improvement in performance.

Windows file copy showing 16.3MB/s

The DataTraveler G4 copied up at around the 16.3 MB/s mark. This is a 3.5 MB/s improvement over the 12.8 MB/s off of the USB 2.0 controller, but nothing compared to the 24.6 MB/s of the DataTraveler Ultimate G3 on the USB 2.0 controller.

So what about the performance of the DataTraveler Ultimate G3 on the USB 3.0 bus? The result was quite phenomenal in comparison.

Windows file copy showing 88.2MB/s

88.2 MB/s, some 63.6 MB/s faster than the same drive on the USB 2.0 bus – some 705.6 Megabits per-second. Not bad for a £6.45 VIA Labs chip from eBay!

As anticipated, however, the lack of an Option ROM was the downfall of the experiment. The BIOS was unable to ‘see’ the USB 3.0 controller as an add-in device during POST and thus was unable to boot from it.

I attempted to create a dual USB boot solution where the VHDX file lived on a memory stick attached to the USB 3.0 bus, while a second memory stick containing the boot loader sat on the motherboard’s USB 2.0 port. Sadly, no amount of tinkering could get the system to link one to the other: 'No bootable device -- insert boot disk and press any key'.

The second, more expensive Renesas USB 3.0 controller arrived around a week later. Just as with the cheaper VIA Labs controller, there was no possibility of getting it to boot directly either.

Writing onto the cheaper DataTraveler G4 using the Renesas controller actually managed a throughput of 19.3 MB/s. Repeating the test with the DataTraveler Ultimate G3 yielded a write speed of 93.0 MB/s, again showing an improvement over the VIA – albeit not a particularly significant one given that it was double the price.

In summary the write speeds for performing the large file transfer of the VHD onto the memory stick are shown below.

Write Speed: MegaBytes per second (MB/s) – higher is better

Memory Stick               | USB 2.0 | USB 3.0 (VIA) | USB 3.0 (Renesas)
DataTraveler G4            | 12.8    | 16.3          | 19.3
DataTraveler Ultimate G3   | 24.6    | 88.2          | 93.0

From a subjective point of view, the use of the DataTraveler Ultimate G3 on the USB 2.0 bus was “acceptable” – acceptable given what the system needed to do. Thus the random read/write bottleneck can be concluded as being in the memory stick and not the controller itself.

Update 08/04/2019: The VIA controller only lasted around 6 months before it started causing system instability (blue screens). Shortly thereafter it died. The Renesas controller is still going strong!

 

Conclusion

So, having spent £42 on the experiment, what conclusions can be drawn?

Many of you have probably been shouting the obvious here: that the best way to reduce costs would be to obtain more efficient servers and consolidate the old ones into fewer appliances. This is true because newer servers:

  • Have more efficient, less power hungry, higher capacity components
  • Emit less heat
  • Have more efficient power supplies
  • Have newer, better fans
  • Can consolidate more virtual servers

In the real world most of us don’t work for Google or Microsoft and we cannot get management to agree to write blank cheques. Neither can most start-ups, home lab builders, ‘hand-me-down’ dev-test environments or backup environments. The short of it is that if you want to save some money, reduce heat and in turn reduce noise (always useful in a home environment), a £40 – £50 saving a year can go a long way. So spending £42 wasn’t unreasonable.

USB 2.0 is ‘good enough’, especially for testing environments. There are clear performance advantages with USB 3.0, however you are going to need USB 3.0 enabled boot support to make practical use of this technology. Even if you have that, you should consider other solutions such as a small SSD or SATA DOM before considering USB 3.0. If you are in a position to add bootable USB 3.0 to your system, however, it is a very viable option.

The biggest headline from this process has been that not all UFDs are created equal. The wide and varied margin between different models from the same company was surprising – especially with both devices claiming USB 3.0 feature sets. The benchmark statistics are so stark as to prove that there is virtually no point in having USB 3.0 if you are going to use a low-end UFD.

For Hyper-V Server, with the correct investment in your UFD, you can make USB 2.0 suffice for your needs, as long as you realise that it will not be as fast as a rotational drive. Despite this, if you do not reboot your environment very often, it might just be good enough for your requirements.

For me personally, I will be getting the testing cluster migrated over to VHDX/UFD booting hypervisors. There is a cost saving rationale that helps me to keep the testing devices running. On a more personal level, for home, I have created UFD devices for a couple of desktop machines that are in my lab and these have been set up as offline nodes in my cluster. The value here is that they can become hypervisors for a short time without interfering with the OS or drives. Even more importantly, I do not have to worry about multi-booting. With these UFDs I plan on simplifying the maintenance process of the main environment so that I no longer need to have downtime on my setup.

So why would you want to consider creating a UFD boot setup for your hypervisors? There are some advantages, just as there are clearly some disadvantages.

Advantages

  • There is potentially a financial saving to be made as a result of power consumption reduction. This is especially true for large clusters and whole racks of servers using shared storage
  • It is a very easy way to make a low-cost, reportable environmental sustainability push. This is particularly true if you are not yet able to dispose of your legacy hardware
  • It works well with Microsoft’s push towards the use of SMB 3.0 for low-cost Hyper-V shared storage setups for SMB’s
  • If you accept RAID as being unnecessary in a clustered environment, then in the event of a UFD failure you can easily keep a box of pre-configured UFDs in a drawer, allowing you to get the hypervisor up and running again and back into the cluster very quickly. Offline servicing can also be used to very easily keep the offline UFD devices patched
  • Heat reduction was my main driver. By removing the hot RAID/SAS JBOD controllers there is a thermal saving. There is also potentially an area of additional cost saving in environmental cooling
  • It is extremely cheap to implement. Not specifying your new Hyper-V Server purchase with hard drives will more than pay for the cost and time of setting up the environment. Most new servers will have an internal USB port within the chassis and you can use this to your advantage for security. The UFD approach is cheaper than similar SSD/mSATA alternatives
  • Removing hard drives cuts down on power, use, heat and noise. This is less important for Enterprise but for a small business or an average home/home lab user this might be a very important driver
  • The convenience of a UFD makes this a very good option to keep in mind for emergency planning/disaster recovery. You can throw a pre-configured UFD into any server or even desktop and have it running a serviceable hypervisor in minutes, all without impacting the original server’s drives. Simply remove the UFD and reboot and it goes back to doing whatever it was doing previously. This is potentially very useful for an SMB with limited resources that needs to service a running hypervisor without downtime. If you can temporarily promote a separate machine to be a hypervisor by plugging in a UFD and rebooting, you can creatively increase your organisational uptime
  • It is becoming difficult to purchase small (32/64GB) SSD drives while it remains easy to obtain smaller UFDs. This saves money as you will not need to buy a 128GB SSD to support a 20GB requirement
  • You can use either the VHD or the newer VHDX format. VHDX offers better failure safeguards, 4K sector support and is the only real choice for UEFI setups

Disadvantages

  • There is no support for disk redundancy in the setup described in this article. If you require OSes underpinned by a mirror, then this is not something to consider and you should look at SSDs
  • Most Enterprise scale deployments will make use of scripted, rapidly provisioned PXE deployments of Hyper-V Server. The use of VHDX means that you will be unable to use these technologies
  • The UFD to VHD(x) abstraction process introduced by disk virtualisation adds a performance penalty
  • As has been demonstrated, UFDs are slower than rotational drives and are considerably slower than SSDs
  • The longevity of a UFD being used for this purpose is unknown. In the absence of reliable MTBF figures, most Enterprise users probably wouldn’t (and shouldn’t) consider it
  • Integration with server management tools such as OpenManage may be a problem for your OEM. This in turn may have an impact on support and warranty options.

 

In summary: For the average Enterprise user on primary production kit it may not be something that you want to consider. In some use cases, such as for backup, testing or disaster recovery environments there are clear advantages. Especially if you are prepared to be creative!

IPMI: Lessons from Practice

System Requirements:

  • IPMI / BMC equipped hardware
  • Sideband or Dedicated IPMI interfaces

The Problem:

I have recently been saved (and hindered) by having to pull the less-often-used IPMI tool from the proverbial administrator’s utility belt. In doing so I found a couple of design issues that you might like to consider if you are starting out with this technology.

More Info

These are a couple of design gotchas to keep in mind.

IPMI Physical & Logical Network Design

If you want to implement IPMI, what should you and should you not be doing? In looking around and thinking about it, this seems to be a poorly documented topic, so I wanted to write up some thoughts on the subject.

Physical: Sideband or Dedicated Port?

Sadly this isn’t always a choice, as your hardware will likely support one or the other. Some Dell hardware is equipped with the option to use either the dedicated port or a logical MAC address presented through one of the motherboard integrated NICs (the so-called LOM, or LAN on Motherboard, interfaces – sometimes referred to as sideband IPMI interfaces).

With a sideband interface, the physical NIC is allocated a second Ethernet MAC address in software through which the IPMI controller listens for traffic destined for it. By definition there has to be a penalty for doing this, even if it is extremely minor, as the NIC must check which MAC address the inbound packet is destined for and redirect it accordingly. Most of the time traffic will be destined for the production network rather than the IPMI virtual interface, and thus the processing overhead is being performed needlessly.

In contrast, the use of the sideband interface removes one cable from your rack and one used switch port – or, to scale that up in the case of a 50u rack, 50 cables requiring 50 switch ports, which would otherwise need an extra 2+ switches to receive all of the extra Ethernet cabling. It is for this reason that sideband ports are a viable option. Some IPMI implementations also allow you to change the physical LOM port in software, while others are fixed.

Both therefore have their uses, although in general terms I prefer the use of dedicated management channels, as this means there is less to potentially go wrong with production systems and fewer accidental cable pulls of the IPMI sideband LOM port, since you can colour code your cabling accordingly to represent the fact that “blue equals IPMI/ILO/iDRAC etc”.

You do not always have the luxury of choice however. Servers such as the Dell PowerEdge 1950 that do not have an iDRAC card have no dedicated port option and the software LOM is fixed on LOM Port #1. Consequently I posit the following advice:

  1. If you need to NIC Team (e.g. LACP) across the motherboard NIC + any other expansion NIC, do not include the IPMI sideband port’s VLAN in the team, otherwise you may find that IPMI traffic never finds its way up the correct bit of copper!
  2. If you need to team the LOM (and only 2+ ports on the LOM with no expansion cards included) check to see if this is supported by the server vendor. If there is only a single LOM port without support for teaming on your hardware you will encounter the same problem as outlined above; IPMI commands may simply get lost as they never make it to the correct port *.
  3. If you are not going to be using NIC Teaming, set the IPMI sideband port to be the port that services your management network, not your iSCSI or production networks. This allows you to reconfigure the NIC without impacting production services, make the IPMI NIC a member of the management network directly, make it a member of an alias management network for IPMI traffic or assign it as part of a VLAN that splits IPMI and non-IPMI management traffic.
  4. Unless you have a rack/switch density issue or do not have the budget for dedicated switches to support IPMI, I recommend using the dedicated port option on your hardware if available — let’s be honest, a dedicated IPMI switch, even here in 2015, can safely be a 10/100 Fast Ethernet switch; you do not need 10Gb hardware to issue the occasional 2 byte PSU reset command. This is especially true when you realise that Serial Over LAN (SOL) itself only operates at 9600bps up to 115200bps to provide you with serial console access via the IPMI system.

* I believe that Dell 9th Generation + servers all support NIC Teaming on the LOM, thus 8th Generation and lower will have issues with NIC Teaming. The solution implemented by Dell to support this is that all LOM NICs listen for IPMI traffic while only the named IPMI port sends IPMI response and Event Trap data. Dell firmware refers to this as “Shared Mode” while “Failover Mode” simply adds the option to dynamically change the IPMI Tx channel to a different live port in the event that NIC 1 fails.

Source: Dell OpenManage Baseboard Management Controller Utilities User’s Guide

Logical: Isolation/No-Isolation

If you actually look into IPMI as a technology, it is fair I think to say that it is not the most secure, particularly if you don’t spend time setting up manual encryption keys on each of your servers. So what can/should you do with traffic?

Firstly it depends on how complicated you want it to be/need it to be and secondly it depends on whether you have managed switches or not. If you do NOT have managed switches, then the best that you can muster is one of the following three options:

  1. Assign address allocations to the IPMI interfaces inside your production LAN (Hint: Do Not Do This)
  2. If you are not willing to have dedicated switches or sacrifice a physical server NIC port as a dedicated IPMI port: Assign address allocations to the IPMI interfaces on the servers and access them as an overlay network using IP Alias Addresses (see below for more details)
  3. If you are willing to have dedicated switches or sacrifice a physical server NIC port as a dedicated IPMI port: Isolate your IPMI traffic in hardware

If you do have managed switches then in addition to the above (not recommended) three options you also have the following two options:

Add the IPMI interface to your existing single management network. You have one, right? You know, the network that you perform Windows Updates off of, perform administrative file transfers across, have hooked up to change, configuration, asset and remote management tools and through which you perform remote administration of your servers from the one and only client ethernet port that connects directly to your management workstation. Yep, that management network! The advantage of this is that you already have the VLAN, you likely already have DHCP services on it (it is your call on whether you want to hunt around in the DHCP allocation list to find the specific allocation when you need to rescue a server in an emergency) and it is already reasonably secure.

The disadvantages of using the existing VLAN are that you may want to use encryption on the management network and there may be packet sniffers involved as part of the day to day management of the network. IPMI is not going to fit in nicely with your enterprise policy and encryption software e.g. IPSec. The inherent insecurity of IPMI (particularly before version 2.0) means that exposing the ability to power off systems to anything sniffing the management network may not be ideal given the damage that could be caused. Thus, good design suggests that your IPMI, ILO and DRAC traffic should be further isolated into their own VLAN, separate from the main production network (that should go without saying) and separate again from the main management network.

Which way you choose to go is a design-time decision based upon the needs of your own environment. Personally, if given a choice, I prefer to cable the management and IPMI/ILO/iDRAC networks using different colours on dedicated ports to mitigate against disconnection accidents, and in turn isolate them into their own VLAN with a minimum number of intersection points – usually just a management VM or workstation.

The footprint for something to go wrong is significantly reduced under this model as is the need for you to meticulously configure security and keep firmware patched (it isn’t a reason not to do it mind!). As an abstraction you could also think of it in terms of a firmware management VLAN (IPMI) and a software management VLAN (Windows/Linux management tools) or a day to day management VLAN and your ‘get out of jail free’ VLAN.

Accessing the server’s IPMI interface using an Alias Address

When you are starting out with IPMI, or for more simplistic configurations using a sideband port (where the IPMI port is shared with an active, production network connection), you may want to connect to other IPMI devices directly from one of the servers that is itself an IPMI host. Alternatively, you may have a single NIC management PC and want to connect to the same isolated IPMI network without losing access to your main network.

You can avoid needing to sacrifice one of your network ports as a dedicated IPMI management port by making use of an Alias IP Address. The main advantage of this is that you do not have to make use of VLANs (which are not supported properly anyway under IPMI 1.5 or older) and thus do not require managed switch hardware.

An Alias address is simply the process of offering one or more additional IP addresses to an already addressed network adapter which do not necessarily have to be part of the same network subnet.

IP Address Alias Screenshot

As you can see in the above screenshot, the servicing 128 address is complemented with a 192 address. This means that the network card can actively participate in both networks, although the latter is limited to its own broadcast domain as it doesn’t have a gateway configured.

You do not need to alias any IPMI adapter to have the server participate as a serviceable IPMI client; the firmware does this once configured, and that configuration is completely transparent to the operating system. However, if you need the server itself (or a management PC) to be able to connect to the IPMI overlay network without resorting to additional NICs, an alias address is a good way to get started without resorting to VLANs.

To create an Alias

In Unix like operating systems you would issue

ifconfig en1 inet 10.0.1.1/32 alias

In Windows you can either use the Advanced TCP/IP Settings screen shown above or you can use the following to achieve the same result

netsh interface ip add address “Local Area Connection” 10.0.1.1 255.255.255.255

Note: The alias address that you set on the adapter should not be the same as the one on the IPMI virtual firmware interface. It must be independent and unique on the IPMI subnet.

There is, however, a problem with this approach if you are attempting to access IPMI from an IPMI-enabled server: the server will not be able to see itself.

In testing, ping, ipmitool and IPMI Viewer are unable to view the “localhost” server’s interface when using this type of configuration. This of course isn’t really too much of a problem, because your management server is working – as demonstrated by the fact that you are administering IPMI using it. However, if you attempt to communicate with its own interface or keep a list of all hosts in IPMI Viewer, the host will simply appear as offline.
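For what it is worth, the kind of query being issued from the aliased management host looks like this with ipmitool (the address and credentials are placeholders); every node responds normally except the host you are running it from:

ipmitool -I lanplus -H 10.0.1.20 -U root -P password chassis power status
ipmitool -I lanplus -H 10.0.1.20 -U root -P password sel list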

Using a Virtual Machine as an IPMI Management Console

The issue outlined above may seem trivial; however, the reason that I have covered it is to highlight the issue that I directly encountered with the seemingly sensible idea of having the management server as a virtual server hosted on a large node cluster. From a safety standpoint, a multi-node cluster is a fairly safe place to keep your management tools while allowing easy sharing amongst admin team members without having to route vast numbers of VLANs into desktop machines (although I always keep one standby physical server bound to all management networks and excluded from public production and client networks for such emergencies).

As an expansion on the issue outlined above there is an obvious pathway flaw in the logical network. If the physical server cannot access itself, then the hypervisor cannot access itself. Thus, when a management server node is running on the hypervisor stack, whichever node the management server is running on at the time will appear offline.

If the management server VM moves to a different hypervisor, then the offline hypervisor will change once ARP cache expiration time limitations have been taken into account — although the new host will appear offline immediately, the now-available one will not – in testing – appear until ARP has fully flushed.

Again, in a real world scenario this isn’t much of an issue: if you need to IPMI into a host, most of the time it is because that host is down. Thus, if the host is down then the Virtual Management server is not going to be running on it any more, it will have migrated to a new host and thus you will be able to see the downed server. Of course stranger things have happened!

The point of this article is to highlight what caused me a little head scratching and some incorrect troubleshooting, whereby I assumed that the IPMI controller on a hypervisor had failed because it could not be raised from a management VM that (you guessed it) was running, it turned out, on the very same hypervisor. Once I had moved the VM and expired the ARP cache, the offline server changed to the new host node and the previously offline server popped back to life.

Notes on VLAN use

If you plan to use a VLAN to isolate the traffic onto your IPMI interface, obviously you need to ensure that your IPMI BMC is capable of filtering and in turn tagging VLAN traffic against its interface.
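Where the BMC does support it, the VLAN tag is normally applied through the IPMI LAN channel configuration – for example with ipmitool, where channel 1 and VLAN 100 are placeholders for your own values, and the second command simply verifies the channel settings afterwards:

ipmitool lan set 1 vlan id 100
ipmitool lan print 1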

One thing to keep in mind is that Intel (particularly) have dropped a lot of driver update support for anything older than the I340 series with the release of Windows Server 2012 R2 (something that I learnt the hard way when I assumed that Windows Server 2012 ‘R1’ support would automatically equate to R2 support).

In practice this means that the Windows driver itself will not expose the Intel Device Manager extensions, and thus you might find it difficult to tag the non-IPMI side of the shared interface onto the correct non-IPMI VLAN.

Remember that you will need to set the switch port to trunk mode and set the allowed VLAN IDs for the port to both your production VLAN and your IPMI VLAN – or more still if you are using the port as part of a virtualised converged fabric. Just remember that the ports down which you will be able to send the IPMI VLAN are limited to one, or a sub-set, of all available NIC ports (depending upon the specification of your BMC).

Finally, you should be aware that NIC ‘link-up’ convergence can be noticeably slower when using VLAN trunk ports and tag isolation. This isn’t limited to just IPMI interfaces, but for the purposes of troubleshooting you should double the time that you allow to check that the port is in fact not working. For example, if your convergence time on a non-VLAN tagged access port is 8-10 pings you should wait for 16-20 pings while troubleshooting to allow for a VLAN to come up and register with both Windows and/or the IPMI BMC.

In real world situations, this means that Windows can be past the kernel initialisation process on a Dell PowerEdge before the IPMI interface is responding to traffic following a warm boot. The point of this: the use of VLAN tagging can lull you into a sense of insecurity, in that you think that things are not working when all that is actually required is patience.