How to compact VHDX files in the most efficient way

Have you ever run the Hyper-V “Edit Disk…” tool’s ‘compact’ action, only to discover that your VHD or VHDX disk file didn’t shrink? This article discusses how to optimise your chances of success when compacting virtual hard drives on Windows Server.

 

Why Compact VHDX files?

Compacting virtual disk files is something most Hyper-V administrators seldom do. In most cases the space savings are nominal and are outweighed by the effort involved. A well-designed Hyper-V deployment accounts for future storage growth and loading during the capacity planning phase, with scale-out design considerations made at the same stage.

Compaction is not a fix for poor design.

If you are unfamiliar with what compaction does at the technical level, Altaro has a good introductory guide.

View: Altaro “Why You Should Be Compacting Your Hyper-V Virtual Disks”

In practice, compaction can help in two scenarios:

  1. Improving the speed and disk space use for non-VSS block-based backups
  2. Reducing the amount of data transferred during live migration

 

Why doesn’t Edit Disk… work?

It is common for the Hyper-V Manager Compact tool to not achieve any reduction in the size of the VHDX file, putting many administrators off from spending time running it.

Two problems exist with the GUI compaction wizard:

  1. The VHDX file is not in a pre-optimised state prior to compaction, reducing the success rate
  2. For performance reasons, execution through the wizard does not use the full range of compaction tools available to the system. These can (currently) only be accessed via PowerShell.

So how do you properly shrink a virtual disk?

 

Optimise your VM

Inevitably there will be some data inside the VM that does not need to be retained. Removing it prior to compaction increases the amount of space that you save.

Optimising Windows

Under Windows, running the Disk Clean-up tool is a good place to start. Additional places to clear out include:

  1. C:\Windows\Temp
  2. C:\Windows\SoftwareDistribution\Download
    The SoftwareDistribution\Download folder will usually save over 1GB of space as it contains the installation sources for Windows Updates. Microsoft Update will re-download the files again if it needs them on the next update scan.
  3. C:\Users\<user>\AppData\Local\Temp
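
As a rough illustration, the folder purge above could be scripted from an elevated PowerShell session inside the VM. The sketch below is an assumption about how one might do it, not part of the original procedure: Clear-FolderContent is a hypothetical helper name, and it silently skips any file that is locked or in use.

```powershell
# Sketch only: empties each folder but leaves the folder itself in place.
# Clear-FolderContent is a hypothetical helper, not a built-in cmdlet.
function Clear-FolderContent {
    param(
        [Parameter(Mandatory)]
        [string[]]$Path
    )

    foreach ($folder in $Path) {
        # Ignore paths that do not exist on this machine.
        if (Test-Path -LiteralPath $folder -PathType Container) {
            # -Force includes hidden items; locked files are skipped quietly.
            Get-ChildItem -LiteralPath $folder -Force |
                Remove-Item -Recurse -Force -ErrorAction SilentlyContinue
        }
    }
}
```

A call might then look like: Clear-FolderContent -Path "$env:SystemRoot\Temp", "$env:SystemRoot\SoftwareDistribution\Download", "$env:LOCALAPPDATA\Temp". Note that $env:LOCALAPPDATA only covers the user running the script, and stopping the Windows Update service (wuauserv) beforehand is likely to reduce the number of locked files in the Download folder.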

You can run defrag from within the VM. However, if you are in a position to compact the virtual disk offline, it is more time efficient to defrag the VHDX while it is offline.

Optimising Linux

Optimisation should also be performed on Linux systems. Unlike Windows, most Linux distributions clear their own temp space (/tmp) on restart. You can also free space left behind by the update process using your package manager. The example below frees space using apt:

sudo apt-get update
sudo apt-get autoremove
sudo apt-get autoclean
sudo apt-get clean

Under native Linux file systems, there is normally no need to run a defragmentation; ext4 limits fragmentation automatically as it allocates space. This does not, however, release free space in a “physical” fashion on the hard drive. Unless this “physical” space is released (zeroed out), Hyper-V will be unable to compact the disk.

Fortunately, it is possible to force Linux to zero out unused disk space in the VM using the following commands:

su
cd /
cat /dev/zero > zero.dat ; sync ; sleep 1 ; sync ; rm -f zero.dat

These commands create a file on the volume selected by the “cd /” line and write zeros into it until the system literally runs out of disk space. The zero-filled file is then deleted. If you are compacting a VHDX that does not hold the root partition, change the “cd /” line to point at the correct mount point, e.g. “cd /mnt/myDisk”.

Note: You will completely fill the volume for a few seconds. This will impact other write activities occurring on the disk and so can be considered to be a dangerous process.

 

Shrinking an Online VHDX

It is possible to perform an online compaction of a virtual disk. Hyper-V itself performs light-touch background optimisation automatically. If you cannot shut down a VM and can only optimise an online VHDX, then you must:

  1. Delete temp files
  2. Internally defragment the disk from within the VM itself
  3. Use Disk Management or diskpart to shrink the size of the mounted partition currently in use on the VHDX
    diskpart
    list vol
    sel vol #
    shrink
  4. Perform the compact using Hyper-V Manager or PowerShell
    Optimize-VHD <path to VHDX> -Mode Full
  5. Reverse the reduction in the size of the partition using Disk Management or diskpart
    diskpart
    list vol
    sel vol #
    extend

 

Shrinking an Offline VHDX

The most efficient method to free space from a VHDX file is to perform the compact while it is offline. The disadvantage of an offline compaction is that the VM must be shut down prior to the operation. The length of the outage is directly proportional to the amount of work required to complete the optimisation process. You can reduce it through online defragmentation prior to starting; however, this takes significant additional administrative effort.

The following steps outline the process that I use to greatest effect. I use X:\ as the example drive letter in this scenario and the use of PowerShell is expected.

  1. Get the initial VHDX size
    $sizeBefore = (Get-Item <path to VHDX file>).length
  2. {optionally} Defrag the VM while online
  3. Shutdown the VM
  4. Mount the VHDX in read/write mode (not read only mode)
    Mount-VHD <path to VHDX file>
  5. Purge Temp files and the contents of X:\Windows\SoftwareDistribution\Download
  6. Defragment the VHDX while mounted to the management system
    1. If the VHDX is stored on SSD drives
      defrag x: /x
      defrag x: /k /l
      defrag x: /x
      defrag x: /k
    2. If the VHDX is stored on Hard Drives (/x /k /l are retained for Trim capable SANs)
      defrag x: /d
      defrag x: /x
      defrag x: /k /l
      defrag x: /x
      defrag x: /k /l

      Note: Defrag /d can be particularly time consuming
  7. Dismount the VHDX
    Dismount-VHD <path to VHDX file>
  8. Perform a full optimisation of the VHDX file
    Optimize-VHD <path to VHDX file> -Mode Full
  9. Get the new size of the VHDX
    $sizeAfter = (Get-Item <path to VHDX file>).length
  10. Start the VM
  11. Find out how much disk space you saved (if any)
    Write-Host "Total disk space saved: $(($sizeBefore - $sizeAfter) / 1MB) MB" -ForegroundColor Yellow
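
The steps above lend themselves to automation. What follows is a minimal sketch of how they might be chained together, not the script I actually run. It assumes an elevated session on the Hyper-V host with the Hyper-V and Storage modules available. Get-DefragPassList and Invoke-OfflineCompact are hypothetical helper names, the parameters are placeholders, and step 5 (purging temp files on the mounted volume) is left out for brevity.

```powershell
# Sketch only. Helper names and parameters are illustrative assumptions.

function Get-DefragPassList {
    # Returns the defrag switch sequences from step 6, in order.
    param([ValidateSet('SSD', 'HDD')][string]$MediaType = 'SSD')

    if ($MediaType -eq 'SSD') {
        return @('/x', '/k /l', '/x', '/k')
    }
    return @('/d', '/x', '/k /l', '/x', '/k /l')
}

function Invoke-OfflineCompact {
    param(
        [Parameter(Mandatory)][string]$VMName,
        [Parameter(Mandatory)][string]$VhdxPath,
        [ValidateSet('SSD', 'HDD')][string]$MediaType = 'SSD'
    )
    $ErrorActionPreference = 'Stop'

    $sizeBefore = (Get-Item -LiteralPath $VhdxPath).Length

    # Shut down the guest before touching its disk.
    Stop-VM -Name $VMName

    # Mount read/write and collect the drive letters Windows assigned.
    $letters = Mount-VHD -Path $VhdxPath -Passthru |
        Get-Disk | Get-Partition |
        Where-Object DriveLetter |
        ForEach-Object DriveLetter

    try {
        foreach ($letter in $letters) {
            foreach ($pass in Get-DefragPassList -MediaType $MediaType) {
                # '/k /l' is split so defrag.exe receives two arguments.
                & defrag.exe "${letter}:" ($pass -split ' ')
            }
        }
    }
    finally {
        # Always dismount, even if a defrag pass fails.
        Dismount-VHD -Path $VhdxPath
    }

    Optimize-VHD -Path $VhdxPath -Mode Full
    $sizeAfter = (Get-Item -LiteralPath $VhdxPath).Length

    Start-VM -Name $VMName

    $savedMB = [math]::Round(($sizeBefore - $sizeAfter) / 1MB, 1)
    Write-Host "Total disk space saved: $savedMB MB" -ForegroundColor Yellow
}
```

Keeping the pass sequences in their own function means the SSD and hard drive variants can be adjusted without touching the mount and compact logic. Treat the whole thing as a starting point and test it against a copy of a VHDX before trusting it with a production disk.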

 

You will note that I repeat the defrag /x and defrag /k /l steps. In my experiments, the repetition appears to allow a small amount of additional space to be freed in some situations, as shown in the table below.

 

Results

Operation                      Size (bytes)      Efficiency  Purgeable slabs
Size at start                  73,991,716,864    100%        11
/k Slab consolidation          69,965,185,024    100%        11
/l Retrim                      69,965,185,024    100%        11
/x Free space consolidation    69,965,185,024    100%        9
/x /l /k (repeat)              69,898,076,160    100%        9

The table shows the number of purgeable slabs falling from 11 to 9 over the first set of passes, after which the repeated pass frees a further 67,108,864 bytes (64MB) of space: the two 32MB slabs.
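
The figures in the table can be checked with a few lines of PowerShell arithmetic. This is only a worked calculation using the byte counts above; the variable names are my own.

```powershell
# Sizes in bytes, copied from the results table.
$sizeAtStart      = 73991716864
$sizeAfterFirst   = 69965185024   # after the first /k, /l and /x passes
$sizeAfterRepeat  = 69898076160   # after the repeated /x /l /k pass

# Space freed by each stage, in MB (1MB = 1,048,576 bytes).
$firstPassMB  = ($sizeAtStart - $sizeAfterFirst) / 1MB       # 3840
$repeatPassMB = ($sizeAfterFirst - $sizeAfterRepeat) / 1MB   # 64

# The purgeable slab count fell from 11 to 9, so two slabs were released.
$slabsPurged = 11 - 9
$slabMB      = $repeatPassMB / $slabsPurged                  # 32

Write-Output "First passes freed: $firstPassMB MB"
Write-Output "Repeat pass freed:  $repeatPassMB MB ($slabsPurged slabs of $slabMB MB)"
```

In other words, almost all of the saving (3,840MB) comes from the first passes, and the repeat adds a further 64MB.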

 

Why not run Defrag /d on an SSD?

Defrag /d, aka “traditional defrag”, physically moves the file data to the end of its home partition, reconstructs the file in a contiguous series of blocks and moves the file back to the start of the disk. This process is unnecessary on an SSD: there is virtually no likelihood that the data is stored contiguously on the SSD’s NAND flash, and there is no performance benefit to the file being stored contiguously. While you can perform defrag /d on an SSD, in reality you are needlessly shortening its cell-write life, so the step should be skipped.

 

Conclusion

It is unfortunate that the process of compacting a VHDX file is not a seamless one. To realise the highest returns, it is necessary to shut down the VM, which may not be practical in many scenarios. Equally, the amount of time required to perform the offline compact scales with the used size of the VHDX, the number of files and the number of tasks performed as part of the maintenance.

Done right, and with the help of script automation, it can be a valuable task, especially before planned VM moves. I regularly save over 130GB in total when draining a hypervisor for maintenance in my home lab, which is around 25-30 minutes less file copy time over 1Gbps Ethernet. That is a worthwhile saving, as it only takes 20 seconds to execute the automation script that does the work for me.