Story of PVS write cache and performance

Disk performance of the PVS write cache in different scenarios and with different write cache locations.

I attended the Dutch Citrix User Group event at Dell Nederland in Amsterdam on 6-3-2015. It was my first DuCUG event as a visitor. Thumbs up for the organization, the positive atmosphere and the superb guest speakers. Jim Moyle was one of the guest speakers at DuCUG. He talked about disk performance and PVS, and inspired me to test the disk performance of the PVS write cache.

I hope this article gives you a better understanding of the write cache locations, the different choices, and what they do to disk performance.

I used Jim Moyle’s article as the basis for testing the disk performance of a PVS XA image. The tests were run with IOmeter.

I created a template with IOmeter. It represents a VDI workload (4k blocks, 20% sequential and 80% random, 80% write and 20% read). The template is found here (attachment1). If you want to run the test yourself, copy the text, paste it into Notepad and save it as <name>.icf. Or build it yourself and learn IOmeter 🙂
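To make the workload mix concrete, here is a minimal Python sketch of what such an access specification means. It is not IOmeter and not the .icf template; the names AccessSpec and generate_ops are hypothetical and only illustrate the 4k / 80% random / 80% write mix described above.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessSpec:
    """Hypothetical description of the VDI-style workload mix."""
    block_size: int = 4096         # 4k blocks
    write_fraction: float = 0.8    # 80% write, 20% read
    random_fraction: float = 0.8   # 80% random, 20% sequential


def generate_ops(spec, count, disk_size, seed=0):
    """Return a list of (operation, byte_offset) tuples following the mix in spec."""
    rng = random.Random(seed)      # seeded so repeated runs give the same pattern
    blocks = disk_size // spec.block_size
    block = 0
    ops = []
    for _ in range(count):
        if rng.random() < spec.random_fraction:
            block = rng.randrange(blocks)      # random access: jump anywhere
        else:
            block = (block + 1) % blocks       # sequential access: next block
        op = "write" if rng.random() < spec.write_fraction else "read"
        ops.append((op, block * spec.block_size))
    return ops


# Example: 10,000 operations against a 1GB test file
ops = generate_ops(AccessSpec(), 10_000, 1024 ** 3)
write_share = sum(1 for op, _ in ops if op == "write") / len(ops)
print(f"write share: {write_share:.1%}")
```

The sketch only generates the pattern; IOmeter is what actually issues the I/O and measures IOPS and latency.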


The test environment

Hardware: HP DL360P G8

2 x Xeon E5-2690 v2, 192GB RAM, 2 x 200GB 6G SATA ME SSD, 2 x 10Gb NIC


Hypervisor: ESXi/VMware vSphere 5.5. VM: Server 2012 R2, PVS server 7.6, 2 vCPU and 32GB memory.

Target device (test machine for IOmeter)

Hypervisor: XenServer 6.2, 6 XA VMs per hypervisor. VM: Golden image with all software installed for a production environment at my current customer. 6 vCPU, 30GB memory, 30GB D drive (SSD). SBC, Server 2012 R2, PVS target device 7.6, XA 7.6, RES WM.

The environment is virtually idle (no load). It’s not in production yet.


The test results

In the graphs I look at the total IO per second (IOPS) and the disk latencies. Together these two are a good measure of disk performance.

Test: 1GB workload against a VM with write cache location: RAM (4GB of RAM) with overflow to disk. The 1GB workload fits in the 4GB of RAM. The result is impressive IOPS and a very low latency! The lower the latency, the better.



Test: 10GB workload against a VM with write cache location: RAM (4GB of RAM) with overflow to disk. The 10GB workload won't fit in the 4GB of RAM, so approximately 3.6GB is in RAM and the other 6.4GB is on disk. Write cache disk performance is roughly halved compared with the 1GB workload. Still pretty impressive.
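The split between RAM and disk can be estimated with a few lines of Python. This is a rough sketch: the function name and the usable_fraction parameter are my own assumptions, derived from the numbers in this test (about 3.6GB of the 4GB cache held workload data). It is not a PVS setting or a documented PVS formula.

```python
def split_write_cache(workload_gb, ram_cache_gb, usable_fraction=0.9):
    """Estimate how much of a workload stays in RAM and how much overflows to disk.

    usable_fraction is an assumption taken from this test (3.6GB of 4GB),
    not a value documented by Citrix.
    """
    in_ram = min(workload_gb, ram_cache_gb * usable_fraction)
    on_disk = workload_gb - in_ram
    return in_ram, on_disk


for workload_gb in (1, 10):
    in_ram, on_disk = split_write_cache(workload_gb, ram_cache_gb=4)
    print(f"{workload_gb}GB workload: {in_ram:.1f}GB in RAM, {on_disk:.1f}GB on disk")
```

With a 64MB RAM cache the same estimate puts practically the whole 10GB workload on disk, which is why that configuration behaves like a disk-only cache with extra overhead.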



Test: 10GB workload directly to the SSD disk. This tests disk performance without the PVS penalty. A latency of < 1 ms and 28412 IOPS are good.



Test: 10GB workload against a VM with write cache location: cache on device hard drive. IOPS and disk latencies are good, but compared with the 10GB workload directly to the SSD disk there is a 50% performance penalty.


Test: 10GB workload against a VM with write cache location: RAM (64MB of RAM) with overflow to disk. The 10GB workload won't fit in the 64MB of RAM. RAM in this scenario is only used as a buffer to disk. Poor performance: only 4923 IOPS and latencies > 3 ms! Disk performance is horrible compared with the write cache in RAM or on the local device hard drive. Imagine that you have to share the 4923 IOPS with 6 other XenApp servers on one XenServer host.




You shouldn’t use memory merely as a small buffer (64MB) for the SSD disk with the write cache option RAM with overflow to disk. With the write cache option cache on device hard drive you get 293% better IOPS and latency. If you can keep all the write cache in memory and only use the disk for overflow (as a last resort, instead of a BSoD with pure write cache in memory), the total IOPS is 1293% better and latency is 1250% better!
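For anyone who wants to redo this kind of comparison with their own IOmeter results, here is a small Python sketch. The two IOPS figures come from the tests above. The cache on device hard drive value is an assumption: the text only states a 50% penalty against direct SSD, so it is estimated rather than measured. The function names are hypothetical.

```python
def ratio(new, baseline):
    """How many times the baseline the new value is (2.0 means twice as much)."""
    return new / baseline


def percent_better(new, baseline):
    """Relative improvement over the baseline in percent (100.0 means twice as much)."""
    return (new - baseline) / baseline * 100


direct_ssd_iops = 28412          # stated in the text
ram_64mb_overflow_iops = 4923    # stated in the text
# Assumption: estimated from the "50% performance penalty" mentioned above,
# not a measured value from the graphs.
cache_on_hdd_iops = direct_ssd_iops * 0.5

r = ratio(cache_on_hdd_iops, ram_64mb_overflow_iops)
p = percent_better(cache_on_hdd_iops, ram_64mb_overflow_iops)
print(f"cache on device hard drive vs 64MB RAM buffer: {r:.2f}x, {p:.0f}% better")
```

Note the difference between the two ways of expressing a result: a ratio of 2.93 is "293% of" the baseline, which corresponds to 193% better. When comparing your own numbers, pick one convention and stick to it.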

My write cache location choices:

  1. RAM with overflow to disk (no or little overflow).
  2. Cache on device hard drive.

So invest in memory for impressive IOPS, or use cache on device hard drive with a local SSD! Your end users will be happy and get < 30 second logon times and superb application response (you will only get this result if there are no other bottlenecks and the environment is optimized by a Citrix consultant 😉 ). So no more visits to the coffee corner…

Still, invest in SSDs and not in spinning drives so everything runs smoothly. The pagefile, logs, AV patterns, crash dumps and other persistent data need IOPS and low latency too 🙂 .

Many thanks to Elger van der Avoird for reviewing this article.