Redesigning the Hardware for the Virtual TV Streaming Server

This article discusses a hardware design change to the Virtual TV Streaming Server discussed in Creating a Virtual TV Streaming Server.

If you are not familiar with the previous setup. The design revolved around an array of TV tuners connected to a 7-port USB 3.0 hub. In turn, this connected to a USB 3.0 controller which was passed through Discrete Device Assignment (DDA) through to a Windows 10 Virtual Machine. This run DVBLogic TV Mosaic, the IP TV streaming software.

 

Virtual TV Streaming Server Meltdown

The solution has run extremely well. There have been no crashes from TV Mosaic, the VM or the Hypervisor. Until last week.

The system missed last Saturdays recording schedules and on Sunday afternoon, wouldn’t initiate playback. On inspection of the VM, one of the Tuners was showing as “unknown” on the TV Mosaic console. The others were all fine. Once this phantom tuner was removed from the console, everything started working again.

Initially thinking that it was related to a coincidental BIOS update on the server, it turned out that the tuner was simply dead. I RMA’d it with DVBLogic – who didn’t challenge my diagnostic or offer any resistance – but I did have to ship it Internationally at my own expense.

A week later, I came to use the system again and, once more, it was dead. A trip to the attic later and the was dead. A multimeter confirmed that the power supply had died, and I begun an RMA process with StarTech this morning.

 

Analysis

If the power supply on the StarTech was defective, it could potentially have caused the fault with the TV Butler tuner. Although this is speculative and unprovable. My main suspicion is that the problems were caused by heat. The attic roof space is uninsulated, and the UK is in the summer period. With temperature in the attic space certainly to have ranged into the 40c’s.

Unlike with the physical TV server that this setup replaced – which had fans. This setup doesn’t. PCIe TV Tuners are intrinsically designed to withstand higher thermal variances than USB ones. The StarTech and TV Butler products are quite simply basic consumer devices. It is possible that this factor led to both of their demises.

There was a power outage mid-week last week, and the StarTech itself was not sitting on the server UPS – but it was on a surge protector. It is my belief that this did not contribute to the issue.

 

Hardware Redesign

The brief for the redesign is simple

  1. Remove essential electrical components from the attic
  2. Minimise space use
  3. Minimise electrical consumption (as everything will now be powered through the UPS)
  4. Do not clutter up the backplane of the server with dongles

 

Power

To accommodate #1, #2 and #3 the USB Hub is going to be eliminated from the design. The TV Tuners will now connect directly to the DDA USB controller. In order to do this, the dual port controller will need to be replaced.

After deliberating on whether to get an externally powered or bus powered 4-port controller, I chose a , bus powered card. A risk, given my previous experience here. The DG-PCIE-04B reviewed better than a similarly priced externally powered one. The decider was that it uses a NEC chipset and not a RealTek/SiS (i.e. cheap) chip. Finally, the fact that each of the ports had its own voltage management and fuse circuit is a valuable quality safeguard.

 

Patch Panel TV

To satisfy design brief #4, the USB TV Tuners will need to be mounted away from the server. To achieve this, I am going to mount the Tuners in the patch panel.

Using a set of keystone jacks. A USB lead will run between the USB controller and the Patch panel; simply mounting to the TV Tuners held in the patch panel.
TNP USB 3.0 Keystone Jack Image

The patch panel happens to be near the ceiling, directly above the TV aerial distributor for the house. Using 4m coaxial cable, the aerial feed can route through the existing ceiling cable run and clip neatly into the TV Tuners.

The Amazon order consisted of

  • 1x
  • 1x Pack of 5
  • 4x Rankie USB 3.0 Type A Male to Male Data Cable, 3m (Server – Patch Panel)
  • 3x Ex-Pro White Coax F Plug Type – to – Male M Coax plug Connection Cable Lead – 4m (Aerial distributor – TV Tuners)

 

Installation

The installation was extremely simple.

  1. Replace the existing 2 port USB controller with the 4 port one
  2. Clip the USB 3.0 keystones into the patch panel
  3. Run cables between the USB controller and the front profile (base) of the USB keystones
  4. Passing the USB controller through to Hyper-V
    1. Shutdown the Virtual Streaming TV Server VM
    2. Get the Device Instance Path from the Details tab > Device instance path section in Device Manager e.g.
      PCI\VEN_1912&DEV_0014&SUBSYS_00000000&REV_03\4&1B96500D&0&0010
    3. Use PowerShell to dismount the USB Controller from the Hypervisor and attach it to the VM
$vmName = 'TvServer'
$pnpdevs = Get-PnpDevice -PresentOnly | Where-Object {$_.InstanceId -eq 'PCI\VEN_1912&DEV_0014&SUBSYS_00000000&REV_03\4&1B96500D&0&0010'}
$instanceId = $pnpdev.InstanceId
$locationPath = ($pnpdevs[0] | get-pnpdeviceproperty DEVPKEY_Device_LocationPaths).data[0]
Write-Host "    Instance ID: $instanceId"
Write-Host "    Location Path: $locationpath"

# Disable the Device on the Host Hypervisor
Disable-PnpDevice -InstanceId $instanceId -Confirm:$false

# Wait for the dismount to complete
Start-Sleep -s 15

# Dismount the Device from the Host Hypervisor
Dismount-VmHostAssignableDevice -locationpath $locationPath -Force

# Attach the PCIe Device to the Virtual Machine
Add-VMAssignableDevice -LocationPath $locationpath -VMName $vmName

# Note: You may need to reboot the Hypervisor hosts at this point.
# If the VM's device manager informs you that it can see the controller, but is  unable to initialise
# the controllers USB Root Hub. A reboot should fix it.
  1. Clip the DVBLogic TV Butler TV Tuners into the patch panel USB keystone jacks using the inside (top) port on the keystones
  2. Start the TV Server VM
Photograph of USB Tuners mounted in patch panel
The patch panel now has three USB ports – the left-most TV Butler is missing as the RMA replacement has not yet arrived.

Photograph of USB Tuners mounted in patch panel Photograph of USB Tuners mounted in patch panel

The Virtualised Windows 10 Streaming TV Server came back online and there hasn’t been any instability caused by the bus-powered USB controller. The TV Butler’s are warm to the touch, have plenty of air-flow and the ambient temperature can be monitored via existing sensors in the room.

The completed assembly in the Patch Panel

With any luck, I will not need to revisit this project for quite some time!

Scanning and repairing drive 9% complete – the curse of chkdsk

This article discusses an issue of a computer getting stuck at boot with the message “Scanning and repairing drive 9% complete” with chkdsk hanging at 9%.

The hypervisor was 12 months over-due for a BIOS update. Updating the UEFI should be simple enough, however SuperMicro have a nasty habit of clearing the CMOS during BIOS updates. Why most other OEM’s are able to transfer settings and SuperMicro insists on not is one of only a few gripes that I have ever had with the firm. Yet it is a persistent one that I’ve had with them going back to 1998.

The Fault

After the successful update, I reset the BIOS to the previous values as best I could recall. Unfortunately I also enabled the firmware watchdog timer.

SuperMicro’s firmware level watchdog timer does not operate as you might expect. It requires a daemon or service to be present within the running operating system that polls the watchdog interrupt periodically. If the interrupt isn’t polled, the firmware forces a soft reboot. Supermicro do not provide a driver to do this for Windows, although their IPMI implementation can do so.

After 5 minutes from the POST the hypervisor performed an ungraceful, uninitaited reset. Following the first occurrence I assumed it was completing Windows Update. Subsequent to the second, I was looking for a problem and after the third (and a carefully placed stopwatch) I had a suspicion that I must have turned on the UEFI watchdog.

I was correct and, after disabling it, the issue was resolved.

This particular hypervisor has SSD block storage for VMs internally and large block storage for backup via an external USB 3.1 enclosure – a lot of it. Without giving it any thought, I told the system to

chkdsk <mountPoint> /F

Note that this does not include the /R switch to perform a 5 step surface scan. I told chkdsk not to dismount the volume, but to bundle all of the scans together during the required reboot to scan the C:. Doing it this way meant that I could walk away from the system. In theory this would mean that when chkdsk finished, it would rejoin the Hyper-V cluster on its own and become available to receive workloads.

… and restarted.

 

Scanning and repairing drive 9% complete

chkdsk skipped the SSD storage as it is all configured as ReFS. Under ReFS, disk checking is not required as it performs journaling activities in the background to preserve data integrity. Unfortunately, the external backup enclosure volume was NTFS. It would be scanned – and it was also quite full.

The system rebooted, and sitting at the intermedia chkdsk stage of the NT boot process. It zipped through the SSD NTFS boot volume in a few seconds, before hitting the external enclosure. Within around 5 minutes it had arrived at the magic “9% complete” threshold.

1 hour, 2 hours, 4 hours… 8 hours. That turned into 24 hours later and the message was still the same.

Windows Boot Scanning and repairing drive (F:): 9% complete

Scanning and repairing drive (F:): 9% complete.

Crashing the chkdsk

The insanity of waiting over 24 hours had to come to an end and I used IPMI to forcefully shutdown the server.

After a minute or two, we powered back on. To be met with a black screen of death from Windows after the POST.

The c:\pagefile.sys was corrupt and unreadable. Perform a system recovery or press enter to load the boot menu. On pressing enter, the single option to boot Windows Server 2019 was present, and, after a few moments. Windows self-deleted the corrupt pagefile.sys, recreated it and booted -to much relief.

I then ran

chkdsk c: /f

and rebooted, which completed within a few seconds and marked the volume as clean, with no reported anomalies.

The Windows System Event Log contained no errors (in fact as you might expect, no data) for the 24 hour period that the server had been ‘down’. The were no ‘after the event’ errors added to the System log or any of the Hardware or Disk logs either. for all intents and purposes, the system reported as fine.

 

Trying chkdsk for a second time

I decided to brave running chkdsk on the external enclosure again. Initially in read-only mode

chkdsk F:

Note the absence of the /F switch here.

It zipped through the process in a few seconds stating

Windows has scanned the file system and found no problems.
No further action is required.

Next I ran a full 3-phase scan

chkdsk F: /F

Again, it passed the scan in a few seconds without reporting any errors. So much for the last 24 hours!

 

Analysis

The corruption in the page file indicates that Windows was doing something. The disk array was certainly very active, with disk activity visible (via LED), acoustically and via data from the power monitor on the server all confirming that “something” was happening. Forcibly shutting down the system killed the page file during a write. Had been a 5-step chkdsk F: /f /r scan I could understand the length of time that it was taking.

With chkdsk /f /r – assuming a 512 byte hard drive – the system has to test 1,953,125,000 sectors for each terabyte of disk space. Depending on the drive speed, CPU speed and RAM involved it isn’t uncommon to hear of systems taking 5 hours per-terabyte to scan. This scan was not a 5-step scan, just a 3-step. A live Windows environment could scan the disk correctly in a few seconds.

Resources were not an issue in this system. Being a hypervisor, it had 128GB of RAM and was running with 2018 manufactured processors.

My suspicion is that the problem exists because of a bad interaction between the boot level USB driver and the USB enclosure. The assumption is that Windows fell into either a race condition or a deadlocked loop. During this fault, chkdsk was genuinely scanning the disk and diagnostic data was being tested in virtual memory (i.e. in the page file) but it was never able to successfully exit.

The lesson that I will take away from this experience is that unless it to avoid using a boot cycle chkdsk to perform a scan on a USB disk enclosure.

WorkFolders Folder shows Sync Error even though its contents are fully synchronised

WorkFolders allows you to perform policy based file HTTPS synchronisation between corporate servers and BYOD devices or teleworker devices. This article discusses a workaround to a problem where an anonymous sync error appears on a directory despite all of its contents synchronising successfully.

Outline of the Problem

Assume the following directory structure

C:\Users\CompanyUser\WorkFolders\Documents\WorkFolders Test

The following files/folders are present within WorkFolders Test:

WorkFolders Test\Problem Folder
WorkFolders Test\Problem Folder\File in Problem Folder.docx
WorkFolders Test\This file is OK.txt

Windows Explorer will display a Green circle with a tick for file/folder object that is synchronised and a Red circle with a cross for a faulted file/folder. After synchronising, the sync results will display as follows:

WorkFolders Test\Problem Folder [CROSS]
WorkFolders Test\Problem Folder\File in Problem Folder.docx [TICK]
WorkFolders Test\This file is OK.txt [TICK]

No errors are displayed in the Control Panel WorkFolders applet. There are no related errors in the clients WorkFolders Management/Operational Event Viewer logs. No relevant errors are present in the file servers SyncShare Operational/Reporting logs.

WorkFolders Error Screenshot - outer folder
The parent folder shows that its sub-folder has a sync error
WorkFolders Error Screenshot - inner folder
The contents of the error’d folder are however correctly synchronised.

Analysis

Although WorkFolders is indicating that the issue is being caused by the “Problem Folder” directory. The issue is being caused by the “File in Problem Folder.docx”.

The following symptoms will be true:

  1. Renaming “Problem Folder” will not fix the issue
  2. Altering the filename of “File in Problem Folder” will not fix the issue
  3. Changing the “File in Problem Folder.docx” file extension (e.g. to .txt) will not fix the issue
  4. Opening and saving the “File in Problem Folder.docx” will not solve the issue
  5. Moving “File in Problem Folder.docx” out of “Problem Folder” will clear the sync error, but the error will immediately migrate to the new location
  6. Rebooting the client computer will not help
  7. Restarting the server will not help
  8. The file does not have any connected temp files or lock files associated with it in the client file system

 

There is nothing wrong with the file itself. It is not corrupt, pay-loaded with a virus or violating any policy. It is my (unproven) belief that the record for the file in the WorkFolders synchronisation database is corrupt. Performing any of the above steps will not alter the record in the WorkFolders client database, thus the problem cannot be ameliorated.

 

Fixing the problem

One you have identified the problem file(s). You can use one of the methods below to correct the error.

Save the file as a completely new file

  1. Open the file in its associated editor (e.g. Microsoft Word for docx files)
  2. File > Save as…
  3. Save the file in its original location, but with a different file name (do not overwrite the original)
  4. Delete the original file
  5. Allow WorkFolders to re-sync
  6. Rename the new file as required

This approach is easy for an end-user to perform, but can be very time consuming if you are troubleshooting a large number of such issues. It requires you to know which file is causing the problem in the first place.

 

Compression

  1. Compress the file using Windows Compressed folders (Right click > Send to… > Compressed (zipped) file)
  2. Delete the original file
  3. Wait for the folder to re-sync and clear the error
  4. Extract the original file from the zip back into the desired location

This method will create a new record in the WorkFolders synchronisation database and the error will not reappear. You can use the technique to fix an entire folder structure without having to first identify the problem file. It is also easy for an end-user to perform.

 

Move the file

  1. Move the file outside of the WorkFolders monitored file system. For example, move the file to C:\ or into the Recycle Bin
  2. Allow the original folder to re-sync and clear the error
  3. Return the original file to its original location and allow it to re-sync

Again, this method will create a new synchronisation record. In a managed environment this may be harder for an end-user to perform due to permissions. It is however easier for an administrator to perform as you can cut/paste the entire file structure out and then back into the WorkFolders sync root.

If you use this method, remember to move it to a location within the same drive letter. If you do, the move will preserve permissions, file dates and will not physically copy the underlying data to the new location (just update the MFT).

Create a Slipstreamed Hyper-V Server 2019 installation image with working Remote Desktop for Administration

If you have been following the saga of the non-working Hyper-V Server 2019 release from November. You may be aware that the most prominent issue – that of Remote Desktop Services for Administration not working – has now been resolved in the February 2019 patch release cycle.

This article outlines how to create updated media for Hyper-V Server 2019 using the original installation medium and patch it into a working state.

Note from the author

Please note that if you intend to use Hyper-V Server in a production environment, you should wait for Microsoft to re-issue the office ISO. Once it is released, it will be made available in the Microsoft Server Evaluation Centre.

View: Microsoft Evaluation Centre

Pre-requisites

You will need access to a Windows 10, Windows Server 2016 or Windows Server 2019 system in order to update the installer.

Obtain and install the Windows ADK 1809 (or later) selecting the Deployment Tools option (providing you with an updated version of DISM)
Download: Windows Assessment & Deployment Kit (ADK)

Retrieve the original Hyper-V Server 2019 ISO
Download: Hyper-V Server 2019 (1809)

Download following updates from the Microsoft Update Catalogue
Note: This is correct as of early March 2019. It is suggested that you apply newer cumulative and servicing updates as they are released in the future.

  1. KB4470788
  2. KB4482887
  3. KB4483452

View: Microsoft Update Catalogue

[Optional] If you wish to apply any language regionalisation (e.g. EN-GB), source the CAB file(s) for the language features that you require. For example:
Microsoft-Windows-Server-LanguagePack-Package~31bf3856ad364e35~amd64~en-GB~10.0.17763.1.cab

Updating the Installation Image

To update the installation image:

  1. Create a folder on C:\ called ‘Mount’
  2. Add a second folder on C:\ called ‘hvs’
  3. In the hvs folder, create a subfolder called ‘Updates’
  4. Extract the entire contents of the ISO from the Hyper-V Server 2019 ISO into C:\hvs
  5. Place the three MSU files from the Microsoft Update Catalogue into the C:\hvs\Updates folder
  6. [Optionally] Place the CAB file for the language pack into the C:\hvs folder and for convenience rename it ‘lp.cab’
  7. Open an elevated Command Prompt
  8. Issue:
    cd /d "C:\Program Files (x86)\Windows Kits\10\Assessment and Deployment Kit\Deployment Tools\amd64"
    To navigate into the working folder for the updated version of DISM.exe
  9. Issue:
    dism.exe /mount-image /ImageFile:"C:\hvs\Sources\install.wim" /Index:1 /MountDir:"C:\Mount"
    To unpack the installation image into the C:\Mount folder
    Note: Do not navigate into this folder with CMD, PowerShell or Windows Explorer. If you leave a handle open against this folder when you try to re-pack the install.wim, it will fail.
  10. Once the mounting is complete, patch the installation by issuing:
    dism.exe /Image:"C:\Mount" /Add-Package /PackagePath:"C:\hvs\Updates"
  11. [Optional] Apply the language pack by issuing (change en-GB to your language as applicable):
    dism.exe /Image:"C:\Mount" /ScratchDir:"C:\Windows\Temp" /Add-Package /PackagePath:"C:\hvs\lp.cab"
    dism.exe /Image:"C:\Mount" /Set-SKUIntlDefaults:en-GB
    If you intend to use ImageX, DISM or WDS to deploy this image, you can skip the following command. If you intend to create a new bootable ISO or UFD, issue:
    dism.exe /image:"C:\Mount" /gen-langini /distribution:"C:\hvs"
    This will create a new Lang.ini file which must be included in the ISO/UFD media (but is not required for other deployment methods)
  12. Dismount and re-package the install.wim file by issuing:
    dism.exe /unmount-image /MountDir:"C:\Mount" /Commit
  13. Once DISM has processed the installation image, the new Install.wim file can be found at:
    C:\hvs\Sources\install.wim
  14. At this point you will have a working installation image which you can use to create a new ISO, UFD or install via WDS. You should delete the Updates folder and [optional] lp.cab from C:\hvs before creating a new ISO or bootable UFD.

If it goes wrong at any point, issue the following command to abort the process and go back and try again:
dism.exe /unmount-image /MountDir:"C:\Mount" /Discard

Delete the C:\Mount and C:\hvs folders once you have finished creating you new deployment media.

Final Word

If you follow the above, you will have not only a fixed RDP experience, but also a current patched version of Hyper-V Server. Eliminating a little time spent waiting for Windows Update to run.

If you are going to enable RDP for Administration. As ever, do not forget to enable the firewall rule in PowerShell. SConfig.cmd does not do this for you!

Enable-NetFirewallRule -DisplayGroup "Remote Desktop"