‘The words are in the elven-tongue of the West of Middle-earth in the Elder Days,’ answered Gandalf. ‘But they do not say anything of importance to us. They say only: The Doors of Durin, Lord of Moria. Speak, friend, and enter. And underneath small and faint is written: I, Narvi, made them. Celebrimbor of Hollin drew these signs.’
‘What does it mean by speak, friend, and enter?’ asked Merry.
‘That is plain enough,’ said Gimli. ‘If you are a friend, speak the password, and the doors will open, and you can enter.’
- Lord of the Rings, A Journey in the Dark
In this blog post, we will explore a 2016 paper by Garrison et al., published at the IEEE Symposium on Security and Privacy, that presents an architecture for enforcing access control policies in a cloud storage system. If you are interested in Security and Privacy topics, be sure to check out my recent blog post on KHyperLogLog, a data structure from Google that estimates privacy risks in large databases.
Access control is defined by the National Institute of Standards and Technology (NIST) as the process of permitting or restricting access to applications at a granular level, such as per-user, per-group, and per-resource.
When we read this definition, one of the first words that come to mind is passwords. The history of passwords is diverse and certainly rich. In older times, they were commonly known as watchwords. In his 1906 book on the Apostles' Creed, Rev. A. G. Mortimer mentions the following regarding the history of watchwords:
The Compatible Time-Sharing System (CTSS) was the first computer system to use password-based login. Professor Robert Fano, who was involved in CTSS development, spoke about the use of passwords at a 1967 IEEE conference:
There are a number of system features that deserve special mention. The private files of each individual are protected by a personal password, which is requested by the system at the beginning of each session at a terminal. The teletypewriter's printer is disabled while the password is given, so that no record of it can be seen by other people. The password is checked by the system against the user's name and problem number, and further access is denied if any mismatch is detected.
Together with this mechanism for insuring privacy of individual files, the system includes facilities for making one's work easily available to other people. A "permit" command is available for authorizing the use of any particular file on the part of other people. Permission can be granted to any number of specifically named individuals, or to everybody, or even to everybody but Joe. The authorization can be limited to using and copying a file, or it can include the right to modify it or even delete it. Once a person has been authorized to use a file owned by somebody else, he can "link" to it by means of an appropriate command and thereafter use it as if it were his own. The links established by each user are recorded in his file directory, together with the records of his own files. These facilities for making one's work available to other people are extensively used. At present, the average number of recorded links per user of the Project MAC system is 27, and the average number of private files per user is 34.
Additionally, Tom Van Vleck, a CTSS system programmer, shared a fascinating story about the use of passwords in CTSS:
The one time XEC * was used in a good way was the day in 1966 that a certain Computation Center administrator's mistake caused the CTSS password file and the message of the day to be swapped. Everybody's password was displayed in clear to each user that logged in. (The editor in those days created a temporary file with a fixed name.) This was before (and was the origin of) the idea of one-way encrypting passwords. Bill Mathews of Project TIP noticed the passwords coming out, and quickly entered the debugger and crashed the machine by entering an XEC * instruction. Naturally this happened at 5 PM on a Friday, and I had to spend several unplanned hours changing people's passwords. (The problem is described and analyzed in Corby's Turing Award Lecture.)
For some context, XEC * would cause the CPU to “sit there taking I-cycles, uninterruptible, until an operator manually reset the CPU.”
Access control mechanisms have significantly evolved since the 1960s. Today, Role-Based Access Control (RBAC) is one of the more common access control models. For instance, a multinational corporation (MNC) with a storage system for employee records can tie access privileges to job responsibilities to ensure only authorized employees have access to the records. In this scenario, an HR manager would have read and write access to all employee records, while a software manager would only have access to the records of employees who report to her. Such roles are then assigned to individual users, with new employees receiving the access privileges attached to their roles.
In this blog, we will primarily discuss RBAC models, specifically RBAC0, whose state can be formally described as follows:
- U is a set of users.
- R is a set of roles.
- P is a set of permissions.
- PA is the permission assignment relation, a subset of R x P.
- UR is the user assignment relation, a subset of U x R.
- auth determines if a given user u with a role r can use permission p. This is true when (u, r) is in set UR and (r, p) is in set PA.
This system will be useful in our discussion of construction. See Figure 2 for a visual representation of RBAC.
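The auth check described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the sample users, roles, and permissions are made up, and the helper can_use (which asks whether any of a user's roles grants a permission) is a hypothetical addition.

```python
# Hypothetical example state; every user, role, and permission name is made up.
UR = {("alice", "sales_ninja"), ("bob", "hr_manager")}
PA = {
    ("sales_ninja", ("sales.txt", "read")),
    ("hr_manager", ("employees.db", "read-write")),
}


def auth(u, r, p, UR, PA):
    """True when (u, r) is in UR and (r, p) is in PA."""
    return (u, r) in UR and (r, p) in PA


def can_use(u, p, UR, PA):
    """Hypothetical helper: True when at least one of u's roles grants p."""
    return any(auth(u, r, p, UR, PA) for (user, r) in UR if user == u)
```

Representing UR and PA as plain sets of pairs keeps the code a direct transcription of the definitions: each membership test corresponds to one clause of auth.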
There are other access control models out there, such as Attribute-Based Access Control (ABAC). ABAC is known to allow finer-grained policies than RBAC. To get a better idea, imagine that the aforementioned MNC has a department working on government projects. Such projects likely require a security clearance level (an example of an attribute). Two employees in the organization may have the same role (e.g., SDE-2) but different security clearance levels. If we use ABAC with security clearance level as an attribute, a user who attempts to access a project file without the required clearance will be denied access, regardless of role.
Before we jump into the architecture, let's discuss some preliminaries, namely identity-based encryption (IBE) and identity-based signature (IBS) schemes, which we will be using during the construction.
Identity-Based Encryption is a type of public-key encryption system that provides confidentiality by encrypting messages. It allows the use of user identities, such as email addresses, phone numbers, IP addresses, or other unique identifiers, as public keys. This is different from traditional public-key encryption, where a user's public key is typically a large pseudo-random number. IBEs can be particularly useful in scenarios where key management is challenging.
In this blog post, we won't dive into the different types of IBE systems, but we will take a brief look at the four algorithms used by these systems. For our purposes, their implementation can be thought of as an abstraction.
The four algorithms are MSKGen, KeyGen, Enc, and Dec:
- MSKGen takes a security parameter n and generates public parameters pp and a master key msk.
- KeyGen takes the master key msk and the ID and generates a private key kID for the ID. Here, ID is the public key and can be something like a user's email address.
- Enc encrypts the message m using the public key ID and public parameters pp.
- Dec decrypts the ciphertext using the private key kID.
A visual representation of IBE is provided in Figure 3.
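To make the call pattern of the four algorithms concrete, here is a deliberately insecure Python stand-in. It is not an IBE scheme: real schemes rely on mathematics that the standard library does not provide, so this toy hides the master key inside pp purely so that enc can run with only pp and an identity. All function names and the toy keystream are assumptions made for illustration.

```python
import hashlib
import hmac
import os


def _keystream(key: bytes, n: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (illustration only)."""
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def msk_gen(n: int = 32):
    """MSKGen: returns (pp, msk). INSECURE: this toy pp leaks msk."""
    msk = os.urandom(n)
    pp = {"_toy_secret": msk}  # a real pp never reveals the master key
    return pp, msk


def key_gen(msk: bytes, identity: str) -> bytes:
    """KeyGen: derives the private key kID for an identity."""
    return hmac.new(msk, identity.encode(), hashlib.sha256).digest()


def enc(pp: dict, identity: str, m: bytes) -> bytes:
    """Enc: the caller needs only pp and the recipient's identity."""
    k_id = key_gen(pp["_toy_secret"], identity)  # toy shortcut, see lead-in
    nonce = os.urandom(16)
    stream = _keystream(k_id + nonce, len(m))
    return nonce + bytes(a ^ b for a, b in zip(m, stream))


def dec(k_id: bytes, c: bytes) -> bytes:
    """Dec: recovers m from the ciphertext using the private key kID."""
    nonce, body = c[:16], c[16:]
    stream = _keystream(k_id + nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))
```

The point to take away is the shape of the interface: the sender never asks the recipient for a key, and only the holder of msk can issue kID.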
Now, let's take a look at Identity-Based Signature schemes. IBE and IBS are related in the sense that they both use a user's identity (such as an email address) as a public key. However, the difference lies in their functionality. IBS schemes are used to sign messages to prove their authenticity.
In essence, the private signing key tied to a user's identity is used to generate a signature that is attached to the message to prove that the message was created by the user and has not been altered since then. The receiver of the message can then verify its authenticity using only the user's identity and the signature. The private keys are generated by a trusted third party, known as the Private Key Generator (PKG). A visual representation of IBS is shown in Figure 4.
As before, we won't be delving into the implementation, but it is important to familiarize ourselves with the following algorithms: MSKGen, KeyGen, Sign, and Ver.
- MSKGen takes a security parameter n and generates public parameters pp and a master key msk.
- KeyGen takes the master key msk and the ID and generates a private signing key sID for this ID.
- Sign takes the message M and the private key sID to generate a signature sig.
- Ver takes the message M and signature sig to help verify whether sig is valid for a given ID.
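The same kind of toy can illustrate the IBS call pattern. Again, this is an insecure stand-in and not a real signature scheme: it uses a keyed MAC, and ver only works because this toy pp leaks the master key. The function names are assumptions chosen to mirror the four algorithms above.

```python
import hashlib
import hmac
import os


def msk_gen(n: int = 32):
    """MSKGen: returns (pp, msk). INSECURE: this toy pp leaks msk."""
    msk = os.urandom(n)
    pp = {"_toy_secret": msk}  # a real pp never reveals the master key
    return pp, msk


def key_gen(msk: bytes, identity: str) -> bytes:
    """KeyGen: derives the private signing key sID for an identity."""
    return hmac.new(msk, b"sign|" + identity.encode(), hashlib.sha256).digest()


def sign(s_id: bytes, message: bytes) -> bytes:
    """Sign: produces sig over the message with the signing key sID."""
    return hmac.new(s_id, message, hashlib.sha256).digest()


def ver(pp: dict, identity: str, message: bytes, sig: bytes) -> bool:
    """Ver: checks sig against the claimed identity and the message."""
    s_id = key_gen(pp["_toy_secret"], identity)  # toy shortcut, see lead-in
    return hmac.compare_digest(sign(s_id, message), sig)
```

In a real scheme the verifier needs nothing secret; what carries over from this sketch is that the inputs to verification are the identity, the message, and the signature.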
In this blog post, we will consider a cloud storage system with three entities: an administrator, a user, and a cloud storage service. Figure 5 provides a visualization of this system.
The administrator is mainly responsible for managing cryptographic keys, such as creating and revoking them. For example, as we saw in the IBE and IBS sections, the administrator would act as a trusted third party and be responsible for using their master key msk to generate a private signing key for a user.
Interestingly, in this architecture, any user can download files from
the cloud. However, since all files are encrypted, only the user with
the appropriate role-based key can decrypt, read, and modify their
contents.
As can be seen, the files are hosted in the cloud, along with the access control policy data that protects them. The read access permissions are enforced on the user's side, whereas the write permissions are enforced on the cloud side. Before authorizing writes, a minimal reference monitor validates the user's signature. The architecture we will be examining makes the following assumptions about the storage system:
With all these preliminaries in mind, let's move on to a naive construction for enforcing RBAC0 on a cloud storage system.
When a new user, such as Alice, joins an organization, she must complete an initial registration process with the administrator. The administrator will use their master keys (msk) to generate two private keys: ku and su, using KeyGen for the IBE and IBS schemes, respectively. These private keys will be sent to Alice.
Similarly, when a new role, such as Sales Ninja, is created in the organization, the administrator will generate similar private keys: kr and sr, using KeyGen. For each user, such as Alice, assigned to this role, the administrator creates and uploads a tuple of the form
<RK, Alice’s ID, role, Enc(kr, sr), IBS signature of the administrator>
Here, RK is a special value indicating that this is a role key tuple. Enc(kr, sr) provides Alice with cryptographically protected access to kr and sr. The administrator does this by encrypting the message (kr and sr) with the IBE scheme, using Alice's identity as the public key. Since only Alice holds the matching private key ku, she alone can decrypt the tuple and obtain kr and sr.
For each file that is to be shared with a particular role, the administrator creates and uploads a tuple of the form
<F, role, <file name, access privileges>, Enc(file), IBS signature of the administrator>
Again, F is a special value indicating that this is a file tuple, and Enc(file) is an IBE-encrypted file with a role as the ID. The access privileges can be read or read-write, and the file can be decrypted using the private key kr corresponding to that role.
After the administrator completes the preparation, a user can start reading and writing their authorized files. Let's say Alice, who is a Sales Ninja, wants to read an authorized file called sales.txt. To do this, she needs to:
1. Download the RK tuple for the Sales Ninja role and an F tuple for sales.txt.
2. Decrypt kr from the RK tuple using her private IBE key ku.
3. Decrypt sales.txt using kr and read the file.
To write to a file, Alice can create a new F tuple once she's done changing the contents locally:
<F, Sales Ninja, <sales.txt, write>, Enc(updated sales.txt), IBS signature of Alice>
She can then send it to the cloud. If the minimal reference monitor can verify Alice's signature, the file sales.txt will be replaced with the updated one.
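The write path can be sketched as a small reference monitor that accepts a new F tuple only when the signature verifies. This is a hedged illustration, not the paper's algorithm: the FTuple field layout, the payload encoding, and the handle_write name are assumptions, and toy_sign/toy_verify are a MAC-based stand-in for IBS Sign and Ver over a made-up key table.

```python
import hashlib
import hmac
from typing import Callable, Dict, NamedTuple


class FTuple(NamedTuple):
    role: str
    file_name: str
    privilege: str
    ciphertext: bytes
    signer: str
    signature: bytes


Verifier = Callable[[str, bytes, bytes], bool]


def payload(role: str, file_name: str, privilege: str, ciphertext: bytes) -> bytes:
    """Bytes covered by the signature (this field layout is an assumption)."""
    return "|".join([role, file_name, privilege]).encode() + b"|" + ciphertext


def handle_write(store: Dict[str, FTuple], new: FTuple, verify: Verifier) -> bool:
    """Replace the stored F tuple only if the writer's signature checks out."""
    signed = payload(new.role, new.file_name, new.privilege, new.ciphertext)
    if not verify(new.signer, signed, new.signature):
        return False  # reject: the stored file stays untouched
    store[new.file_name] = new
    return True


# Stand-in for IBS Sign/Ver: a keyed MAC over a demo key table (NOT a signature).
_DEMO_KEYS = {"alice": b"alice-demo-key"}


def toy_sign(user: str, data: bytes) -> bytes:
    return hmac.new(_DEMO_KEYS[user], data, hashlib.sha256).digest()


def toy_verify(user: str, data: bytes, sig: bytes) -> bool:
    return user in _DEMO_KEYS and hmac.compare_digest(toy_sign(user, data), sig)
```

Passing the verifier in as a callable keeps the monitor itself minimal: it holds no keys and makes a single accept-or-reject decision per upload.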
The above construction has significant issues. For instance, encrypting large amounts of data using IBE can impact performance, and the F tuple incurs duplicate work as the entire file needs to be encrypted multiple times, once for each role.
To address these issues, it's recommended to split an F tuple into two tuples, F and FK, as follows:
- The FK tuple is:
<FK, role, <file name, access privileges>, Enc(k), IBS signature of the administrator>
Here, k is a symmetric key generated by the administrator, and Enc(k) encrypts this key using the IBE scheme with the role as ID.
- The F tuple is:
<F, file name, Enc(file), IBS signature using the role key of the last authorized updater>
Here, Enc(file) uses faster symmetric-key cryptography with k as the symmetric key, rather than the IBE scheme.
This technique greatly reduces duplication, as a single F tuple can be created for a file, with multiple FK tuples, one for each role. Only one symmetric key per file needs to be generated, and more cost-efficient symmetric-key cryptography is used for encrypting files.
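The split can be sketched as follows: the file is encrypted once under a fresh symmetric key k, and only k is wrapped once per role. This is a simplified illustration with signatures omitted. The share_file name and tuple layouts are assumptions, sym_enc is a toy XOR stream rather than a production cipher, and toy_ibe_enc is a hypothetical placeholder that encrypts nothing.

```python
import hashlib
import os
from typing import Callable, Dict


def sym_enc(k: bytes, data: bytes) -> bytes:
    """Toy XOR stream cipher (self-inverse); stands in for a real symmetric cipher."""
    stream, i = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(k + i.to_bytes(8, "big")).digest()
        i += 1
    return bytes(a ^ b for a, b in zip(data, stream))  # zip trims the extra stream


def toy_ibe_enc(identity: str, message: bytes):
    """Hypothetical stand-in for IBE Enc: tags the message, encrypts nothing."""
    return (identity, message)


def share_file(name: str, data: bytes, role_privs: Dict[str, str], ibe_enc: Callable):
    """Build one F tuple and one FK tuple per role (signatures omitted)."""
    k = os.urandom(32)                        # one symmetric key per file
    f_tuple = ("F", name, sym_enc(k, data))   # the file is encrypted exactly once
    fk_tuples = [
        ("FK", role, (name, priv), ibe_enc(role, k))  # only k is wrapped per role
        for role, priv in role_privs.items()
    ]
    return f_tuple, fk_tuples
```

Under this layout, adding a role costs one more small IBE encryption of k, while the size of the file no longer affects the per-role work.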
Another issue with the construction is the lack of provision for revoking access to a file. Ideally, we should be able to do this on a per-role and per-user level.
One way to address this issue is by versioning the tuples. To achieve per-role functionality, we create a version number v corresponding to the symmetric key k and use it to version the F and FK tuples. To remove permission for a role, we create a new version number, use it to generate a new symmetric key, and re-encrypt the file with it. This would not only modify the F tuple but also the FK tuples for all the roles whose access to the file has not been revoked.
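The bookkeeping behind per-role revocation can be sketched like this. It models only which tuples change, with no actual encryption or signing; FileState, revoke_role, and tuples_to_upload are hypothetical names, and the tuple layouts are assumptions for illustration.

```python
import os
from typing import Dict, NamedTuple


class FileState(NamedTuple):
    name: str
    version: int          # v, tied to the current symmetric key
    key: bytes            # k, the current symmetric key
    roles: Dict[str, str] # role -> access privilege


def revoke_role(state: FileState, role: str) -> FileState:
    """Drop one role, bump the version, and pick a fresh symmetric key."""
    remaining = {r: p for r, p in state.roles.items() if r != role}
    return FileState(state.name, state.version + 1, os.urandom(32), remaining)


def tuples_to_upload(state: FileState):
    """List the tuples that must be rebuilt: one F plus one FK per remaining role."""
    f = ("F", state.name, state.version)
    fks = [
        ("FK", r, (state.name, p), state.version)
        for r, p in sorted(state.roles.items())
    ]
    return [f] + fks
```

The revoked role may still hold the old key, but every tuple carrying the new version is encrypted under a key that role never receives.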
That's all for now! The paper provides an in-depth analysis of the efficiency of this construction, and it also includes algorithms for various operations, such as adding and deleting users, as well as revoking access and permissions. If you're interested in learning more, I highly recommend checking out the paper for further details.