RE: [Patch v5 0/3] Introduce a driver to support host accelerated access to Microsoft Azure Blob for Azure VM

From: Long Li
Date: Thu Sep 30 2021 - 18:25:23 EST


> > > On Sat, Aug 07, 2021 at 06:29:06PM +0000, Long Li wrote:
> > > > > > I still think this "model" is totally broken and wrong overall.
> > > > > > Again, you are creating a custom "block" layer with a character
> > > > > > device, forcing all userspace programs to use a custom library
> > > > > > (where is it at?) just to get their data.
> > > >
> > > > > The Azure Blob library (with source code) is available in the
> > > > > following languages:
> > > > > Java:
> > > > > https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/storage/azure-storage-blob
> > > > > JavaScript:
> > > > > https://github.com/Azure/azure-sdk-for-js/tree/main/sdk/storage/storage-blob
> > > > > Python:
> > > > > https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/storage/azure-storage-blob
> > > > > Go:
> > > > > https://github.com/Azure/azure-storage-blob-go
> > > > > .NET:
> > > > > https://github.com/Azure/azure-sdk-for-net/tree/main/sdk/storage/Azure.Storage.Blobs
> > > > > PHP:
> > > > > https://github.com/Azure/azure-storage-php/tree/master/azure-storage-blob
> > > > > Ruby:
> > > > > https://github.com/azure/azure-storage-ruby/tree/master/blob
> > > > > C++:
> > > > > https://github.com/Azure/azure-sdk-for-cpp/tree/main/sdk/storage#azure-storage-client-library-for-c
> > >
> > > And why wasn't this linked to in the changelog here?
> > >
> > > > In looking at the C code above, where is the interaction with this
> > > > Linux driver? I can't seem to find it...
>
> Greg,
>
> I apologize for the delay. I have attached the Java transport library (a tgz
> file) to this email. The file is released for review under "The MIT License
> (MIT)".
>
> The transport library implements the functions needed for reading from a
> Block Blob using this driver. The function that transports I/O is
> Java_com_azure_storage_fastpath_driver_FastpathDriver_read(), defined
> in "./src/fastpath/jni/fpjar_endpoint.cpp".
>
> In particular, requestParams is in JSON format (REST) and is passed in by a
> Blob application that uses the Blob API to read from a Block Blob.
>
> For an example of how a Blob application uses the transport library, please
> see Blob support for Hadoop ABFS:
> https://github.com/apache/hadoop/pull/3309/commits/be7d12662e23a13e6cf10cf1fa5e7eb109738e7d
>
> In ABFS, the entry point for using Blob I/O is AbfsRestOperation
> executeRead() in
> hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsInputStream.java,
> lines 553 to 564. This function eventually calls into
> executeFastpathRead() in
> hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java.
>
> ReadRequestParameters is the data that is passed as requestParams
> (described above) to the transport library. In this Blob application use
> case, ReadRequestParameters has eTag and sessionInfo (sessionToken). Both
> are defined in this commit, and both are treated as strings passed in JSON
> format to the I/O-issuing function
> Java_com_azure_storage_fastpath_driver_FastpathDriver_read() in the
> transport library that uses this driver.
>
> Thanks,
> Long

Hello Greg,

I have shared the source code of the Blob client that uses this driver, along with the reason why the Azure Blob driver is not implemented through POSIX on top of a file system and the block layer.

Blob APIs are specified in this doc:
https://docs.microsoft.com/en-us/rest/api/storageservices/blob-service-rest-api

The semantics of reading data from a Blob are specified in this doc:
https://docs.microsoft.com/en-us/rest/api/storageservices/get-blob

The source code I shared demonstrates how a Blob is read into Hadoop through ABFS. In general, a Blob client can use any of the optional request headers specified in the API that suit its specific application. The Azure Blob service is not designed to be POSIX compliant. I hope this answers your question about why this driver is not implemented at the file system or block layer.
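
As a rough illustration of the point about optional request headers, the sketch below assembles the URL and headers of a ranged Get Blob request. It is not taken from the driver or the transport library. The function name build_get_blob_request, the account, container, and blob names, and the x-ms-version value are placeholders chosen for the example; authentication is left out, and nothing is sent over the network.

```python
from urllib.parse import quote


def build_get_blob_request(account, container, blob, offset, length, etag=None):
    """Return (url, headers) for a ranged Get Blob request.

    Hypothetical helper for illustration only: it builds the request
    pieces and does not contact the service or add an Authorization header.
    """
    # Blob endpoint layout: https://<account>.blob.core.windows.net/<container>/<blob>
    url = "https://{}.blob.core.windows.net/{}/{}".format(
        account, quote(container), quote(blob)
    )
    headers = {
        # Example service version; a real client picks the version it targets.
        "x-ms-version": "2020-10-02",
        # Byte range is inclusive on both ends.
        "x-ms-range": "bytes={}-{}".format(offset, offset + length - 1),
    }
    if etag is not None:
        # Optional conditional header: read only if the blob still has this ETag.
        headers["If-Match"] = etag
    return url, headers


if __name__ == "__main__":
    url, headers = build_get_blob_request(
        "myaccount", "mycontainer", "dir/myblob", offset=0, length=4096,
        etag='"0x8D9"',
    )
    print(url)
    for name, value in headers.items():
        print(name + ": " + value)
```

A conditional header such as If-Match has no counterpart in a plain read() or pread() call, which is one way to see that Blob reads carry per-request semantics beyond an offset and a length.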

Do you have more comments on this driver?

Thanks,
Long