
An architecture for enforcing RBAC in a cloud storage system

Author: Parth Parikh

First Published: 17/02/23

‘The words are in the elven-tongue of the West of Middle-earth in the Elder Days,’ answered Gandalf. ‘But they do not say anything of importance to us. They say only: The Doors of Durin, Lord of Moria. Speak, friend, and enter. And underneath small and faint is written: I, Narvi, made them. Celebrimbor of Hollin drew these signs.’
‘What does it mean by speak, friend, and enter?’ asked Merry.
‘That is plain enough,’ said Gimli. ‘If you are a friend, speak the password, and the doors will open, and you can enter.’
- Lord of the Rings, A Journey in the Dark

In this blog post, we will explore a 2016 paper by Garrison et al., published at the IEEE Symposium on Security and Privacy, that presents an architecture for enforcing access control policies in a cloud storage system. If you are interested in security and privacy topics, be sure to check out my recent blog post on KHyperLogLog, a data structure from Google that estimates privacy risks in large databases.

Introduction

Access control is defined by the National Institute of Standards and Technology (NIST) as the process of permitting or restricting access to applications at a granular level, such as per-user, per-group, and per-resource.

When we read this definition, one of the first words that come to mind is passwords. The history of passwords is diverse and certainly rich. In older times, they were commonly known as watchwords. In his 1906 book on the Apostles' Creed, Rev. A. G. Mortimer mentions the following regarding the history of watchwords:

Figure 1: A reference to watchwords in Rev. A. G. Mortimer's 1906 book.

The Compatible Time-Sharing System (CTSS) was the first computer system to use password-based login. Professor Robert Fano, who was involved in CTSS development, spoke about the use of passwords at a 1967 IEEE conference:

There are a number of system features that deserve special mention. The private files of each individual are protected by a personal password, which is requested by the system at the beginning of each session at a terminal. The teletypewriter's printer is disabled while the password is given, so that no record of it can be seen by other people. The password is checked by the system against the user's name and problem number, and further access is denied if any mismatch is detected.

Together with this mechanism for insuring privacy of individual files, the system includes facilities for making one's work easily available to other people. A "permit" command is available for authorizing the use of any particular file on the part of other people. Permission can be granted to any number of specifically named individuals, or to everybody, or even to everybody but Joe. The authorization can be limited to using and copying a file, or it can include the right to modify it or even delete it. Once a person has been authorized to use a file owned by somebody else, he can "link" to it by means of an appropriate command and thereafter use it as if it were his own. The links established by each user are recorded in his file directory, together with the records of his own files. These facilities for making one's work available to other people are extensively used. At present, the average number of recorded links per user of the Project MAC system is 27, and the average number of private files per user is 34.

Additionally, Tom Van Vleck, a CTSS system programmer, shared a fascinating story about the use of passwords in CTSS:

The one time XEC * was used in a good way was the day in 1966 that a certain Computation Center administrator's mistake caused the CTSS password file and the message of the day to be swapped. Everybody's password was displayed in clear to each user that logged in. (The editor in those days created a temporary file with a fixed name.) This was before (and was the origin of) the idea of one-way encrypting passwords. Bill Mathews of Project TIP noticed the passwords coming out, and quickly entered the debugger and crashed the machine by entering an XEC * instruction. Naturally this happened at 5 PM on a Friday, and I had to spend several unplanned hours changing people's passwords. (The problem is described and analyzed in Corby's Turing Award Lecture.)

For some context, XEC * would cause the CPU to “sit there taking I-cycles, uninterruptible, until an operator manually reset the CPU.”

Access Control Models: RBAC and ABAC

Access control mechanisms have evolved significantly since the 1960s. Today, Role-Based Access Control (RBAC) is one of the more common access control models. For instance, a multinational corporation (MNC) with a storage system for employee records can tie access privileges to job responsibilities to ensure that only authorized employees have access to the records. In this scenario, an HR manager would have read and write access to all employee records, while a software manager would only have access to the records of employees who report to her. Such roles are then assigned to individual users, with new employees receiving the access privileges attached to their roles.

In this blog post, we will primarily discuss RBAC models, specifically RBAC₀, whose state can be formally described as follows:

U is a set of users
R is a set of roles
P is a set of permissions
PA is the permission assignment relation, a subset of R x P
UR is the user assignment relation, a subset of U x R
auth determines if a given user u with a role r can use permission p. This is true when (u, r) is in set UR and (r, p) is in set PA.
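As a minimal sketch of the definition above, the RBAC₀ state can be modeled with plain Python sets. The user names, role names, and permission strings are hypothetical examples, not values from the paper. Here auth asks whether some role r links the user to the permission.

```python
from typing import Set, Tuple

def auth(u: str, p: str,
         UR: Set[Tuple[str, str]],
         PA: Set[Tuple[str, str]]) -> bool:
    """True if there is a role r with (u, r) in UR and (r, p) in PA."""
    return any((r, p) in PA for (user, r) in UR if user == u)

# Hypothetical state for the employee-records example.
U = {"alice", "bob"}
R = {"hr_manager", "software_manager"}
P = {"read:records", "write:records"}
UR = {("alice", "hr_manager"), ("bob", "software_manager")}
PA = {
    ("hr_manager", "read:records"),
    ("hr_manager", "write:records"),
    ("software_manager", "read:records"),
}

print(auth("alice", "write:records", UR, PA))  # True
print(auth("bob", "write:records", UR, PA))    # False
```

Keeping UR and PA as sets of pairs mirrors the formal description directly, which makes it easy to see that adding or removing a single pair changes exactly one assignment.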

This formal model will be useful when we discuss the construction. See Figure 2 for a visual representation of RBAC.

Figure 2: A visual representation of RBAC in an MNC.

There are other access control models, such as Attribute-Based Access Control (ABAC). ABAC offers finer-grained control than RBAC. To get a better idea, imagine that the aforementioned MNC has a department working on government projects. Files for such projects likely require a minimum security clearance level (an example of an attribute). Two employees in the organization may have the same role (e.g., SDE-2) but different security clearance levels. If we use ABAC with security clearance level as an attribute, an employee who attempts to access such a file without the required clearance will be denied access, even though her role matches.
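To make the contrast concrete, here is a small sketch comparing a role-only check with a check that also looks at a clearance attribute. The class names, the numeric clearance scale, and the example employees are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Employee:
    name: str
    role: str
    clearance: int  # hypothetical numeric clearance level

@dataclass(frozen=True)
class ProjectFile:
    name: str
    allowed_roles: frozenset
    min_clearance: int

def rbac_allows(e: Employee, f: ProjectFile) -> bool:
    # RBAC-style check: only the role matters.
    return e.role in f.allowed_roles

def abac_allows(e: Employee, f: ProjectFile) -> bool:
    # ABAC-style check: the role and the clearance attribute both matter.
    return e.role in f.allowed_roles and e.clearance >= f.min_clearance

asha = Employee("Asha", "SDE-2", clearance=3)
ben = Employee("Ben", "SDE-2", clearance=1)
spec = ProjectFile("gov-spec.pdf", frozenset({"SDE-2"}), min_clearance=3)

print(rbac_allows(asha, spec), rbac_allows(ben, spec))  # True True
print(abac_allows(asha, spec), abac_allows(ben, spec))  # True False
```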

Identity-based encryption and Identity-based signature

Before we jump into the architecture, let's discuss some preliminaries, such as identity-based encryption (IBE) and identity-based signature (IBS) schemes, which we will be using during the construction.

Identity-based encryption is a type of public-key encryption system that provides confidentiality by encrypting messages. It allows the use of user identities, such as email addresses, phone numbers, IP addresses, or other unique identifiers, as public keys. This is different from traditional public-key encryption, where a user's public key is typically a large pseudo-random number. IBE can be particularly useful in scenarios where key management is challenging.

In this blog post, we won't dive into the different types of IBE systems, but we will take a brief look at the four algorithms used by these systems. For our purposes, their implementation can be thought of as an abstraction.

The four algorithms are MSKGen, KeyGen, Enc, and Dec:

MSKGen takes a security parameter n and generates public parameters pp and a master key msk.
KeyGen takes the master key msk and the ID and generates a private key k_ID for the ID. Here, ID is the public key and can be something like a user's email address.
Enc encrypts the message m using the public key ID and public parameters pp.
Dec decrypts the ciphertext using the private key k_ID.
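The four algorithms can be sketched as an interface. The stand-in below is deliberately insecure and is not a real IBE scheme: it hides the master secret inside pp so that Enc can be written with only pp and an ID, which a real scheme achieves with very different mathematics. The function names, the toy_secret field, and the hash-based keystream are all assumptions. It only shows who calls what, and with which arguments.

```python
import hashlib
import hmac
import os

def _keystream(key: bytes, n: int) -> bytes:
    # Hash-counter keystream; a toy cipher, not for real use.
    blocks = (hashlib.sha256(key + i.to_bytes(4, "big")).digest()
              for i in range(n // 32 + 1))
    return b"".join(blocks)[:n]

def msk_gen(n: int = 32):
    """MSKGen: return (pp, msk). INSECURE toy: pp leaks the master secret."""
    msk = os.urandom(n)
    pp = {"toy_secret": msk}
    return pp, msk

def key_gen(msk: bytes, identity: str) -> bytes:
    """KeyGen: derive the private key k_ID for an identity."""
    return hmac.new(msk, identity.encode(), hashlib.sha256).digest()

def enc(pp: dict, identity: str, m: bytes) -> bytes:
    """Enc: encrypt m for an identity using only pp and the ID."""
    k = key_gen(pp["toy_secret"], identity)
    nonce = os.urandom(16)
    stream = _keystream(k + nonce, len(m))
    return nonce + bytes(a ^ b for a, b in zip(m, stream))

def dec(k_id: bytes, c: bytes) -> bytes:
    """Dec: decrypt a ciphertext with the private key k_ID."""
    nonce, body = c[:16], c[16:]
    stream = _keystream(k_id + nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))

pp, msk = msk_gen()
k_alice = key_gen(msk, "alice@example.com")       # issued by the admin
c = enc(pp, "alice@example.com", b"meet at the Doors of Durin")
print(dec(k_alice, c))
```

Note that the sender never needs anything from Alice beyond her email address, which is the property that makes IBE attractive when distributing public keys is hard.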

A visual representation of IBE is provided in Figure 3.

Figure 3: A visual representation of IBE. Here, Admin acts as a private key generator (PKG).

Now, let's take a look at Identity-Based Signature schemes. IBE and IBS are related in the sense that they both use a user's identity (such as an email address) as a public key. However, the difference lies in their functionality. IBS schemes are used to sign messages to prove their authenticity.

In essence, a user's identity is used to generate a signature that is attached to the message to prove that the message was created by the user and has not been altered since then. The receiver of the message can then verify its authenticity by using the user's identity and signature. The private keys are generated using a trusted third party, known as the Private Key Generator (PKG). A visual representation of IBS is shown in Figure 4.

Figure 4: A visual representation of IBS. Here, Admin acts as a private key generator (PKG). Furthermore, there is a level of indirection on the receiver's side as Alice sends the message to the Minimal Reference Monitor for authenticity verification. This is done to stay consistent with the construction part that we will be discussing next.

As before, we won't be delving into the implementation, but it is important to familiarize ourselves with the following algorithms: MSKGen, KeyGen, Sign, and Ver.

MSKGen takes a security parameter n and generates public parameters pp and a master key msk.
KeyGen takes the master key msk and the ID and generates a private signing key s_ID for this ID.
Sign takes the message M and the private key s_ID to generate a signature sig.
Ver takes an ID, the message M, and a signature sig, and checks whether sig is a valid signature on M for that ID.
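As with IBE, the shape of these four algorithms can be sketched with an insecure stand-in. This is not a real IBS scheme: the toy verifier recomputes the signer's key from a secret hidden in pp, whereas a real verifier needs only public information. All names here, including toy_secret, are assumptions for illustration.

```python
import hashlib
import hmac
import os

def msk_gen(n: int = 32):
    """MSKGen: return (pp, msk). INSECURE toy: pp leaks the master secret."""
    msk = os.urandom(n)
    return {"toy_secret": msk}, msk

def key_gen(msk: bytes, identity: str) -> bytes:
    """KeyGen: derive the private signing key s_ID for an identity."""
    return hmac.new(msk, identity.encode(), hashlib.sha256).digest()

def sign(s_id: bytes, message: bytes) -> bytes:
    """Sign: produce a signature sig on the message with s_ID."""
    return hmac.new(s_id, message, hashlib.sha256).digest()

def ver(pp: dict, identity: str, message: bytes, sig: bytes) -> bool:
    """Ver: check that sig is valid on the message for the given ID."""
    expected = sign(key_gen(pp["toy_secret"], identity), message)
    return hmac.compare_digest(expected, sig)

pp, msk = msk_gen()
s_alice = key_gen(msk, "alice@example.com")   # issued by the admin (PKG)
sig = sign(s_alice, b"updated sales.txt")

print(ver(pp, "alice@example.com", b"updated sales.txt", sig))   # True
print(ver(pp, "alice@example.com", b"tampered sales.txt", sig))  # False
print(ver(pp, "bob@example.com", b"updated sales.txt", sig))     # False
```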

Enforcing RBAC₀ in a cloud storage system

Architecture

In this blog post, we will consider a cloud storage system with three entities: an administrator, a user, and a cloud storage service. Figure 5 provides a visualization of this system.

Figure 5: A visual representation of the cloud storage system. This figure is from the paper.

The administrator is mainly responsible for managing cryptographic keys, such as creating and revoking them. For example, as we saw in the IBE and IBS sections, the administrator would act as a trusted third party and be responsible for using their master key msk to generate a private signing key for a user. Interestingly, in this architecture, any user can download files from the cloud. However, since all files are encrypted, only the user with the appropriate role-based key can decrypt, read, and modify their contents.

As can be seen, the files are hosted in the cloud, along with the access control policy data that protects them. The read access permissions are enforced on the user's side, whereas the write permissions are enforced on the cloud side. Before authorizing writes, a minimal reference monitor validates the user's signature. The architecture we will be examining makes the following assumptions about the storage system:

the cloud provider is not trusted to view the contents of the stored files,
it is trusted to keep the files available, and
it is trusted to ensure that only authorized users can update their respective files.

With all these preliminaries in mind, let's move on to a naive construction for enforcing RBAC₀ on a cloud storage system.

Naive Construction

When a new user, such as Alice, joins an organization, she must complete an initial registration process with the administrator. The administrator will use their master keys (msk) to generate two private keys: k_u and s_u, using KeyGen for the IBE and IBS schemes, respectively. These private keys will be sent to Alice.

Similarly, when a new role, such as Sales Ninja, is created in the organization, the administrator will generate similar private keys: k_r and s_r, using KeyGen. For each user, such as Alice, assigned to this role, the administrator creates and uploads a tuple of the form

<RK, Alice’s ID, role, Enc(k_r, s_r), IBS signature of the administrator>

Here, RK is a special value indicating that this is a role key tuple. Enc(k_r, s_r) gives Alice cryptographically protected access to k_r and s_r: the administrator encrypts the pair under Alice's identity, which serves as her IBE public key, using the public parameters pp. Because only Alice holds the matching private key k_u, only she can decrypt the ciphertext and obtain k_r and s_r.

For each file that is to be shared with a particular role, the administrator creates and uploads a tuple of the form

<F, role, <file name, access privileges>, Enc(file), IBS signature of the administrator>

Again, F is a special value indicating that this is a file tuple, and Enc(file) is an IBE-encrypted file with a role as the ID. The access privileges can be read or read-write and the file can be decrypted using the private key k_r corresponding to that role.

After the administrator completes the preparation, a user can start reading and writing their authorized files. Let's say, Alice, who is a Sales Ninja, wants to read an authorized file called sales.txt. To do this, she needs to:

Download an RK tuple for the Sales Ninja role and an F tuple for sales.txt.
Validate the IBS signatures on both tuples.
Decrypt k_r from the RK tuple using her private IBE key k_u.
Decrypt sales.txt using k_r and read the file.
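The four read steps above can be sketched end to end. The crypto here is an insecure stand-in (HMAC-derived keys and a hash keystream that touches the master secret directly), chosen only so the example runs with the standard library. The tuple layout, the cloud dictionary, and all function names are assumptions.

```python
import hashlib
import hmac
import os

MSK = os.urandom(32)  # toy master secret held by the administrator

def key_gen(scheme: str, identity: str) -> bytes:
    return hmac.new(MSK, f"{scheme}:{identity}".encode(),
                    hashlib.sha256).digest()

def _xor(key: bytes, data: bytes) -> bytes:
    stream = b"".join(hashlib.sha256(key + i.to_bytes(4, "big")).digest()
                      for i in range(len(data) // 32 + 1))
    return bytes(a ^ b for a, b in zip(data, stream))

def ibe_enc(identity: str, m: bytes) -> bytes:   # toy: uses MSK directly
    return _xor(key_gen("ibe", identity), m)

def ibe_dec(k_id: bytes, c: bytes) -> bytes:
    return _xor(k_id, c)

def ibs_sign(s_id: bytes, fields: tuple) -> bytes:
    return hmac.new(s_id, repr(fields).encode(), hashlib.sha256).digest()

def ibs_ver(identity: str, fields: tuple, sig: bytes) -> bool:
    return hmac.compare_digest(ibs_sign(key_gen("ibs", identity), fields), sig)

# Administrator setup: user key, role keys, and the two signed tuples.
k_u = key_gen("ibe", "alice")
k_r, s_r = key_gen("ibe", "Sales Ninja"), key_gen("ibs", "Sales Ninja")
s_admin = key_gen("ibs", "admin")
rk_fields = ("RK", "alice", "Sales Ninja", ibe_enc("alice", k_r + s_r))
f_fields = ("F", "Sales Ninja", ("sales.txt", "read"),
            ibe_enc("Sales Ninja", b"Q1 pipeline"))
cloud = {"rk": rk_fields + (ibs_sign(s_admin, rk_fields),),
         "f": f_fields + (ibs_sign(s_admin, f_fields),)}

def read_file(k_u: bytes, cloud: dict) -> bytes:
    rk, f = cloud["rk"], cloud["f"]               # 1. download both tuples
    for t in (rk, f):                             # 2. validate signatures
        if not ibs_ver("admin", t[:-1], t[-1]):
            raise ValueError("invalid administrator signature")
    role_key = ibe_dec(k_u, rk[3])[:32]           # 3. recover k_r with k_u
    return ibe_dec(role_key, f[3])                # 4. decrypt the file

print(read_file(k_u, cloud))
```

All of the work happens on Alice's machine, which matches the point that read permissions are enforced on the user's side.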

To write to a file, Alice first edits the contents locally and then creates a new F tuple:

<F, Sales Ninja, <sales.txt, write>, Enc(updated sales.txt), IBS signature of Alice>

She then sends this tuple to the cloud. If the minimal reference monitor can verify Alice's signature, the file sales.txt will be replaced with the updated one.
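The write path can be sketched as a tiny reference monitor that replaces a stored file only when the signature on the incoming F tuple verifies. The signature scheme is an insecure HMAC-based stand-in for IBS, and the class name, the tuple layout, and the placeholder ciphertext are assumptions.

```python
import hashlib
import hmac
import os

MSK = os.urandom(32)  # toy master secret; a real verifier would not hold it

def key_gen(identity: str) -> bytes:
    return hmac.new(MSK, identity.encode(), hashlib.sha256).digest()

def sign(s_id: bytes, fields: tuple) -> bytes:
    return hmac.new(s_id, repr(fields).encode(), hashlib.sha256).digest()

def ver(identity: str, fields: tuple, sig: bytes) -> bool:
    return hmac.compare_digest(sign(key_gen(identity), fields), sig)

class MinimalReferenceMonitor:
    """Cloud-side check that runs before any stored file is replaced."""

    def __init__(self) -> None:
        self.files = {}

    def write(self, signer_id: str, f_tuple: tuple) -> bool:
        fields, sig = f_tuple[:-1], f_tuple[-1]
        if not ver(signer_id, fields, sig):
            return False                  # reject: signature does not verify
        file_name = fields[2][0]
        self.files[file_name] = f_tuple   # accept: replace the stored tuple
        return True

s_alice = key_gen("alice")
fields = ("F", "Sales Ninja", ("sales.txt", "write"),
          b"<ciphertext of updated sales.txt>")
monitor = MinimalReferenceMonitor()

accepted = monitor.write("alice", fields + (sign(s_alice, fields),))
forged = monitor.write("alice", fields + (b"\x00" * 32,))
print(accepted, forged)  # True False
```

The monitor never decrypts anything, which fits the assumption that the cloud provider is not trusted to view file contents.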

Issues with the Construction and Suggested Improvements

The above construction has significant issues. For instance, IBE is much slower than symmetric-key encryption, so encrypting large files with it hurts performance. The F tuple also duplicates work, because the entire file must be encrypted once for each role that can access it.

To address these issues, it's recommended to split an F tuple into two tuples, F and FK, as follows:

The FK tuple is: <FK, role, <file name, access privileges>, Enc(k), IBS signature of the administrator>. Here, k is a symmetric key generated by the administrator, and Enc(k) encrypts this key using the IBE scheme with the role as ID.
The F tuple is: <F, file name, Enc(file), IBS signature using the role key of the last authorized updater>. Here, Enc(file) uses faster symmetric-key cryptography with k as the symmetric key, rather than the IBE scheme.

This technique greatly reduces duplication, as a single F tuple can be created for a file, with multiple FK tuples, one for each role. Only one symmetric key per file needs to be generated, and the file itself is encrypted with cheaper symmetric-key cryptography.
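Here is a rough sketch of the F/FK split: the file is encrypted once under a fresh symmetric key k, and only k is encrypted per role. Both ciphers are insecure stand-ins built from a hash keystream, signatures are omitted for brevity, and the function names and tuple layout are assumptions.

```python
import hashlib
import hmac
import os

MSK = os.urandom(32)  # toy master secret held by the administrator

def ibe_key_gen(identity: str) -> bytes:
    return hmac.new(MSK, identity.encode(), hashlib.sha256).digest()

def _xor(key: bytes, data: bytes) -> bytes:
    stream = b"".join(hashlib.sha256(key + i.to_bytes(4, "big")).digest()
                      for i in range(len(data) // 32 + 1))
    return bytes(a ^ b for a, b in zip(data, stream))

def ibe_enc(identity: str, m: bytes) -> bytes:   # toy stand-in for IBE Enc
    return _xor(ibe_key_gen(identity), m)

def ibe_dec(k_id: bytes, c: bytes) -> bytes:     # toy stand-in for IBE Dec
    return _xor(k_id, c)

def sym_enc(k: bytes, m: bytes) -> bytes:        # toy symmetric cipher
    return _xor(k, m)

def sym_dec(k: bytes, c: bytes) -> bytes:
    return _xor(k, c)

def share_file(name: str, contents: bytes, role_privs: dict):
    """Build one F tuple and one FK tuple per role."""
    k = os.urandom(32)                           # one symmetric key per file
    f = ("F", name, sym_enc(k, contents))        # file encrypted exactly once
    fks = [("FK", role, (name, priv), ibe_enc(role, k))
           for role, priv in role_privs.items()]
    return f, fks

def read_as(role: str, k_r: bytes, f: tuple, fks: list) -> bytes:
    fk = next(t for t in fks if t[1] == role)
    k = ibe_dec(k_r, fk[3])                      # small IBE decryption
    return sym_dec(k, f[2])                      # bulk symmetric decryption

contents = b"Q1 pipeline: 42 deals"
f, fks = share_file("sales.txt", contents,
                    {"Sales Ninja": "read-write", "Auditor": "read"})
print(len(fks))
print(read_as("Auditor", ibe_key_gen("Auditor"), f, fks))
```

Adding a third role now costs one more 32-byte IBE encryption rather than another pass over the whole file.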

Another issue with the construction is the lack of provision for revoking access to a file. Ideally, we should be able to do this on a per-role and per-user level.

One way to address this issue is by versioning the tuples. To support per-role revocation, we create a version number v corresponding to the symmetric key k and use it to version the F and FK tuples. To remove permission for a role, we create a new version number, generate a new symmetric key for it, and re-encrypt the file with that key. This modifies not only the F tuple but also the FK tuples for all the roles whose access to the file has not been revoked, since each of them must carry the new key.

Conclusion

That's all for now! The paper provides an in-depth analysis of the efficiency of this construction, and it also includes algorithms for various operations, such as adding and deleting users, as well as revoking access and permissions. If you're interested in learning more, I highly recommend checking out the paper for further details.
