Re: [Patch v5 0/3] Introduce a driver to support host accelerated access to Microsoft Azure Blob for Azure VM

From: Bart Van Assche
Date: Thu Aug 05 2021 - 13:09:19 EST


On 8/5/21 12:00 AM, longli@xxxxxxxxxxxxxxxxx wrote:
From: Long Li <longli@xxxxxxxxxxxxx>

Azure Blob storage [1] is Microsoft's object storage solution for the
cloud. Users or client applications can access objects in Blob storage via
HTTP, from anywhere in the world. Objects in Blob storage are accessible
via the Azure Storage REST API, Azure PowerShell, Azure CLI, or an Azure
Storage client library. The Blob storage interface is not designed to be a
POSIX-compliant interface.

Problem: When a client accesses Blob storage via HTTP, it must go through
the Blob storage boundary of Azure and get to the storage server through
multiple servers. This is also true for an Azure VM.

Solution: For an Azure VM, the Blob storage access can be accelerated by
having Azure host execute the Blob storage requests to the backend storage
server directly.

This driver implements a VSC (Virtual Service Client) for accelerating Blob
storage access for an Azure VM by communicating with a VSP (Virtual Service
Provider) on the Azure host. Instead of using HTTP to access the Blob
storage, an Azure VM passes the Blob storage request to the VSP on the
Azure host. The Azure host uses its native network to perform Blob storage
requests to the backend server directly.

This driver doesn’t implement Blob storage APIs. It acts as a fast channel
to pass user-mode Blob storage requests to the Azure host. The user-mode
program using this driver implements Blob storage APIs and packages the
Blob storage request as structured data to VSC. The request data is modeled
as three user-provided buffers (request, response and data buffers) that
are patterned on the HTTP model used by existing Azure Blob clients. The
VSC passes those buffers to VSP for Blob storage requests.
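
To make the three-buffer model above concrete, here is a minimal user-space sketch in C. The struct name, field names and helper are hypothetical and chosen only for illustration; they are not the ioctl structure defined by this patch series. The sketch only shows how a request, a response and an optional data buffer could be described by address and length before being handed to a driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical descriptor (NOT the driver's actual UAPI): each of the
 * three user buffers is described by its user-space address and length.
 */
struct blob_request_sketch {
	uint64_t request_buf;   /* address of the request buffer */
	uint64_t response_buf;  /* address of the response buffer */
	uint64_t data_buf;      /* address of the data buffer, 0 if unused */
	uint32_t request_len;
	uint32_t response_len;
	uint32_t data_len;
};

/*
 * Fill the descriptor. Returns 0 on success, -1 if the request or
 * response buffer is missing, or if a data length is given without a
 * data buffer. The data buffer itself is optional in this sketch.
 */
static int blob_request_fill(struct blob_request_sketch *r,
			     void *req, uint32_t req_len,
			     void *resp, uint32_t resp_len,
			     void *data, uint32_t data_len)
{
	assert(r != NULL);

	if (!req || req_len == 0 || !resp || resp_len == 0)
		return -1;
	if (!data && data_len != 0)
		return -1;

	memset(r, 0, sizeof(*r));
	r->request_buf = (uint64_t)(uintptr_t)req;
	r->response_buf = (uint64_t)(uintptr_t)resp;
	r->data_buf = (uint64_t)(uintptr_t)data;
	r->request_len = req_len;
	r->response_len = resp_len;
	r->data_len = data_len;
	return 0;
}
```

A caller would fill such a descriptor and pass it to the driver through whatever user-space interface the driver exposes; that step is left out here because it depends on the actual interface definition.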

The driver optimizes Blob storage access for an Azure VM in two ways:

1. The Blob storage requests are performed by the Azure host to the Azure
Blob backend storage server directly.

2. It allows the Azure host to use transport technologies (e.g. RDMA)
available to the Azure host but not available to the VM, to reach Azure
Blob backend servers.

Test results using this driver for an Azure VM:
100 Blob clients running on an Azure VM, each reading 100GB Block Blobs.
(10 TB total read data)
With REST API over HTTP: 94.4 mins
Using this driver: 72.5 mins
Performance (measured in throughput) gain: 30%.
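
For readers checking the 30% figure: with the same amount of data read in both runs, throughput is inversely proportional to elapsed time, so the gain is 94.4 / 72.5 - 1, or about 30.2%. A small C sketch of that arithmetic follows; the function name is made up for illustration.

```c
#include <assert.h>

/*
 * Throughput gain in percent when the same amount of data is moved in
 * improved_minutes instead of baseline_minutes. Since throughput is
 * data / time, the ratio of throughputs is baseline / improved.
 */
static double throughput_gain_percent(double baseline_minutes,
				      double improved_minutes)
{
	assert(baseline_minutes > 0.0 && improved_minutes > 0.0);
	return (baseline_minutes / improved_minutes - 1.0) * 100.0;
}
```

Calling throughput_gain_percent(94.4, 72.5) gives roughly 30.2, which matches the rounded figure quoted above. Note that the reduction in elapsed time is a different number: (94.4 - 72.5) / 94.4 is about 23%.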

[1] https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction

Is the ioctl interface the only user space interface provided by this kernel driver? If so, why has this code been implemented as a kernel driver instead of e.g. a user space library that uses vfio to interact with a PCIe device? As an example, QEMU supports many different virtio device types.

Thanks,

Bart.