What is the difference between blob and file storage?
The Azure Storage service is a useful cloud service that enables users to store massive amounts of data. However, there are different styles of this storage service that apply to different situations.
Azure Blob Storage is an object store used for storing vast amounts of unstructured data, while Azure File Storage is a fully managed distributed file system based on the SMB protocol that looks like a typical hard drive once mounted.
Because Azure Blob Storage is an object store and Azure File Storage is a distributed file system, there are several key differences between them. One useful commonality, though, is that both are fully managed by Microsoft Azure, which relieves users of that operational headache.
Both Azure Blob Storage and Azure File Storage offer several storage redundancy options. These are Locally Redundant Storage (LRS), Zone Redundant Storage (ZRS), Geo-Redundant Storage (GRS), and Geo-Zone-Redundant Storage (GZRS). Blob Storage additionally offers read-access versions of the last two, Read-Access Geo-Redundant Storage (RA-GRS) and Read-Access Geo-Zone-Redundant Storage (RA-GZRS); Azure File Storage does not support these read-access variants.
The first type, Locally Redundant Storage (LRS), replicates all data written to the storage account three times within a single data center in the primary region. This gives 99.999999999% (eleven nines) durability for the stored objects or files in a given year. Put another way, if one million files or objects were stored, you would expect to lose or corrupt one of them over a 100,000-year period. However, if that data center goes down or is destroyed, you would lose access to all of the replicas.
The next type, Zone Redundant Storage (ZRS), replicates the data across three different availability zones in the same primary region. These zones are set up so that each has independent networking, power, and cooling, and they are separated by enough distance that a local event affecting one availability zone should not affect the others. This setup adds an extra nine of durability, for twelve nines in total.
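The durability figures above can be checked with a little arithmetic. The following is a minimal sketch, not an official Azure calculation: it assumes that "N nines" of annual durability means each object has a 10^-N chance of being lost in a year, and the function name years_per_expected_loss is made up for illustration.

```python
from fractions import Fraction


def years_per_expected_loss(num_objects: int, nines: int) -> Fraction:
    """Years until one lost object is expected, given N nines of annual durability.

    Assumption: N nines means an annual loss probability of 10**-N per object.
    """
    if num_objects <= 0 or nines <= 0:
        raise ValueError("num_objects and nines must be positive")
    annual_loss_probability = Fraction(1, 10 ** nines)
    expected_losses_per_year = num_objects * annual_loss_probability
    return 1 / expected_losses_per_year


# One million objects at eleven nines (LRS) and at twelve nines (ZRS).
print(years_per_expected_loss(1_000_000, 11))
print(years_per_expected_loss(1_000_000, 12))
```

Exact fractions are used instead of floats because subtracting 0.99999999999 from 1 in floating point introduces rounding noise that is larger than the effect being measured.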
Geo-Redundant Storage (GRS) starts out the same as LRS, but it also asynchronously copies the object or file data to a secondary region hundreds of miles away, where three more copies are kept. With this setup, if the entire primary region is destroyed or becomes inaccessible, the data can still be served from the secondary region after a failover. With GRS, the three copies in each region are stored in a single location, as with LRS. Geo-Zone-Redundant Storage (GZRS) is the same as GRS, except that the three copies in the primary region are spread across three different availability zones, as with ZRS; the secondary region still keeps its three copies in a single location.
Read-Access Geo-Redundant Storage (RA-GRS) and Read-Access Geo-Zone-Redundant Storage (RA-GZRS) are identical to their GRS and GZRS counterparts, except that the secondary copy of the data is exposed through a read-only endpoint. This way, users of the storage account can choose which region to read from depending on where they are located. Reading from a physically closer location is usually faster, although the secondary may briefly lag behind the primary because replication to it is asynchronous.
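The redundancy options differ along three axes: whether the primary copies span availability zones, whether data is copied to a secondary region, and whether that secondary copy is readable. The sketch below models this as a small lookup table. The Redundancy class and choose_redundancy function are hypothetical helpers for illustration only, and GZRS is modeled here as zone-redundant in the primary region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Redundancy:
    name: str
    zone_redundant_primary: bool  # primary copies spread over availability zones
    geo_replicated: bool          # asynchronously copied to a secondary region
    readable_secondary: bool      # secondary copy has a read-only endpoint


OPTIONS = [
    Redundancy("LRS", False, False, False),
    Redundancy("ZRS", True, False, False),
    Redundancy("GRS", False, True, False),
    Redundancy("GZRS", True, True, False),
    Redundancy("RA-GRS", False, True, True),
    Redundancy("RA-GZRS", True, True, True),
]


def choose_redundancy(zone_redundant: bool, geo_replicated: bool,
                      readable_secondary: bool = False) -> str:
    """Return the option name that matches the requested properties."""
    if readable_secondary and not geo_replicated:
        raise ValueError("a readable secondary requires geo-replication")
    wanted = (zone_redundant, geo_replicated, readable_secondary)
    for option in OPTIONS:
        actual = (option.zone_redundant_primary, option.geo_replicated,
                  option.readable_secondary)
        if actual == wanted:
            return option.name
    raise ValueError("no matching redundancy option")


print(choose_redundancy(zone_redundant=True, geo_replicated=True,
                        readable_secondary=True))
```

A table like this makes the trade-off explicit: each additional property buys protection against a larger class of failure, usually at a higher price.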
As mentioned previously, Azure Blob Storage is an object store. An object store uses a flat namespace for storing its objects. Even though the naming convention used by most users makes the objects appear to be stored in a directory-like structure, this isn't the case for an object store. You can think of an object store as being very much like a key-value store, where the key is the name of the object and the value is the object's data. One difference from a plain key-value store, though, is that metadata can also be associated with each object.
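The flat namespace can be illustrated with an ordinary dictionary. This is a minimal in-memory sketch, not Azure's implementation: the list_blobs function is hypothetical, and it only shows how a name prefix plus a delimiter can make flat keys look like folders.

```python
def list_blobs(store: dict[str, bytes], prefix: str = "", delimiter: str = "/"):
    """List the flat keys under a prefix, grouping deeper names into virtual directories."""
    blobs = []
    virtual_dirs = set()
    for name in sorted(store):
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter acts as a "folder" name.
            virtual_dirs.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            blobs.append(name)
    return blobs, sorted(virtual_dirs)


# The store is flat: every key is a full object name, and no directories exist.
store = {
    "readme.txt": b"hello",
    "logs/2024/app.log": b"app",
    "logs/2024/db.log": b"db",
    "logs/summary.txt": b"summary",
}

print(list_blobs(store))
print(list_blobs(store, prefix="logs/"))
```

Note that "logs/" never exists as an entry of its own. It appears only because some keys happen to start with it.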
Azure Blob Storage is organized into three layers. First is the storage account itself, which can hold more than just blob storage. Within the account are containers, which group blobs together. The final layer is the blobs themselves, which are the objects stored in a container.
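The account, container, and blob layers show up directly in a blob's address. The sketch below builds such an address using the public Azure endpoint pattern for blob storage; the blob_url helper and the account and container names are made up for illustration, and sovereign clouds or custom domains would use a different host.

```python
from urllib.parse import quote


def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build the HTTPS address of a blob from the three layers of the hierarchy."""
    # quote() keeps "/" intact, so directory-like blob names stay readable.
    return f"https://{account}.blob.core.windows.net/{container}/{quote(blob_name)}"


print(blob_url("exampleaccount", "logs", "2024/app.log"))
```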
The actual objects, or blobs, come in three types: Block Blobs, Append Blobs, and Page Blobs. A Block Blob can store any type of binary or text data. These blobs were historically limited to about 4.75 TiB each, and newer service versions raise the limit to roughly 190.7 TiB. A block blob is made up of blocks that are managed by the Azure Storage system, hence the name. The Block Blob is the type used in most typical cases.
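The block blob size limit follows from how blocks are counted. The sketch below assumes the documented Azure limits of 50,000 blocks per blob and a maximum block size of 100 MiB on older service versions or 4,000 MiB on newer ones; these figures do not come from this article. The helper names are hypothetical, and plan_blocks only shows how a payload could be divided, not how the service or SDK actually uploads it.

```python
MIB = 1024 ** 2
TIB = 1024 ** 4
MAX_BLOCKS_PER_BLOB = 50_000  # assumption: documented Azure limit


def max_block_blob_size_tib(max_block_size_mib: int) -> float:
    """Largest block blob, in TiB, for a given maximum block size in MiB."""
    return MAX_BLOCKS_PER_BLOB * max_block_size_mib * MIB / TIB


def plan_blocks(total_size: int, block_size: int) -> list[tuple[int, int]]:
    """Split a payload of total_size bytes into (offset, length) blocks."""
    if total_size < 0 or block_size <= 0:
        raise ValueError("total_size must be >= 0 and block_size must be > 0")
    blocks = [(offset, min(block_size, total_size - offset))
              for offset in range(0, total_size, block_size)]
    if len(blocks) > MAX_BLOCKS_PER_BLOB:
        raise ValueError("payload needs more blocks than a blob may contain")
    return blocks


print(round(max_block_blob_size_tib(100), 2))   # older block size limit
print(round(max_block_blob_size_tib(4000), 1))  # newer block size limit
print(plan_blocks(10, 4))
```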
An Append Blob is very similar to a Block Blob and is also made up of blocks, but it is designed so that new data is always added to the end. This is very useful for log-like data that is written as it is received and read back sequentially rather than by seeking.
The final type of object in an Azure Blob Storage account is the Page Blob. Page Blobs are designed for random access patterns to the stored data. Most often, virtual hard drive data is stored as a Page Blob because of how an operating system uses it. Any VHD file stored in Azure Blob Storage should be stored in this format. Another difference with this format is that it allows an object's data to be up to 8 TiB in size.
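Random access in a page blob works on fixed-size pages. The sketch below assumes the documented page size of 512 bytes, with writes aligned to page boundaries; that detail is not stated in this article. The aligned_page_range helper is hypothetical and only shows how an arbitrary byte range expands to whole pages.

```python
PAGE_SIZE = 512  # assumption: page blobs are addressed in 512-byte pages


def aligned_page_range(offset: int, length: int) -> tuple[int, int]:
    """Expand a byte range to page boundaries, returning (start, length)."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    start = (offset // PAGE_SIZE) * PAGE_SIZE               # round down
    end = -(-(offset + length) // PAGE_SIZE) * PAGE_SIZE    # round up
    return start, end - start


# A 100-byte write at offset 1000 touches two whole pages.
print(aligned_page_range(1000, 100))
```

This is why the format suits virtual disks: an operating system reads and writes sectors at arbitrary positions, and each sector maps cleanly onto pages.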
Azure File Storage is another type of storage available in an Azure Storage account. It will feel very familiar to most users, as it works much like the regular file system found on most computers. The main difference is that it is a remote, fully managed file system whose shares can grow far beyond the capacity of a typical local disk.
Azure File Storage exposes shares to its users, which can be mounted by Azure VMs or by your local laptop or desktop. Several machines can mount the same share at the same time, which makes sharing data between devices very easy. Another benefit is that it works across operating systems: Azure File Storage shares can be mounted from Windows, Linux, or macOS.
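Each operating system addresses the same share slightly differently. The sketch below assumes the public Azure Files host name pattern and the usual SMB path styles on each platform; the share_paths helper and the account and share names are made up for illustration. Actually mounting the share also requires credentials and outbound access on the SMB port, which are not shown here.

```python
def share_paths(account: str, share: str) -> dict[str, str]:
    """Return the SMB address of a file share as written on each platform."""
    host = f"{account}.file.core.windows.net"
    return {
        "windows_unc": f"\\\\{host}\\{share}",  # \\host\share
        "linux_cifs": f"//{host}/{share}",
        "macos_smb": f"smb://{host}/{share}",
    }


for platform, path in share_paths("exampleaccount", "team").items():
    print(platform, path)
```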
Azure File Storage comes in two tiers: Standard and Premium. The main difference between the two is that Premium runs on high-performance solid-state drives (SSDs), which makes reading and writing data much faster and offers more IOPS than is possible with the Standard option.
For Windows Server users, Azure File Storage also offers a syncing mechanism, Azure File Sync, for an extra fee. It caches frequently used data directly on the Windows Server for faster access, while keeping the files in sync with the redundant cloud-based copies of the data.
Another benefit that applies to both Azure Blob Storage and Azure File Storage is encryption at rest. The data is encrypted with industry-standard AES-256, using either a Microsoft-managed key or a customer-managed key, which keeps it safe and secure. Because this encryption is enabled by default, users of Blob or File storage don't even need to think about encrypting the data they store. It is handled automatically by the Azure Storage systems.
Azure Storage also helps protect against accidental deletion through its soft delete feature. In Azure Blob Storage, soft delete can protect individual blobs as well as whole containers; in Azure File Storage, it protects the file share. With this enabled, a deleted item is kept in a soft-deleted state for the configured retention period instead of being permanently removed. This allows the owner of the data to restore it within that window if the deletion was made in error or the data is needed for some new task. This can give users some extra peace of mind that their data is safe.
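The soft delete behavior can be modeled in a few lines. This is a minimal in-memory sketch of the idea, not the Azure implementation or SDK: the SoftDeleteStore class and its method names are hypothetical, and time is passed in explicitly so that the retention window is easy to follow.

```python
from datetime import datetime, timedelta


class SoftDeleteStore:
    """Toy store where deleted items stay recoverable for a retention period."""

    def __init__(self, retention: timedelta):
        self.retention = retention
        self.live: dict[str, bytes] = {}
        self.deleted: dict[str, tuple[bytes, datetime]] = {}

    def put(self, name: str, data: bytes) -> None:
        self.live[name] = data

    def delete(self, name: str, now: datetime) -> None:
        # Move the item aside and record when it becomes unrecoverable.
        self.deleted[name] = (self.live.pop(name), now + self.retention)

    def purge_expired(self, now: datetime) -> None:
        expired = [n for n, (_, expires) in self.deleted.items() if now >= expires]
        for name in expired:
            del self.deleted[name]

    def restore(self, name: str, now: datetime) -> bool:
        """Bring an item back if it is still inside the retention window."""
        self.purge_expired(now)
        if name not in self.deleted:
            return False
        data, _ = self.deleted.pop(name)
        self.live[name] = data
        return True


store = SoftDeleteStore(retention=timedelta(days=7))
start = datetime(2024, 1, 1)
store.put("report.csv", b"data")
store.delete("report.csv", now=start)
print(store.restore("report.csv", now=start + timedelta(days=3)))
```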
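Given that both Azure Blob Storage and Azure File Storage offer so much flexibility, either is a good choice when looking for a cloud-based storage solution. Blob Storage fits best when you need to store large amounts of unstructured objects, while File Storage fits best when your machines expect a mounted, shared file system.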