My undergraduate thesis on a capability based security system for a data-centric operating system.

this what i sent out

suri312006 5036fd07 96db0e6a

+127 -96
+20 -20
1-introduction.typ
···
6 6
7 7 // talk about the standard unix abstractions
8 8
9 - In mainstream operating systems, security policy is enforced at runtime by a
10 - omnicient and all powerful kernel.
9 + In mainstream operating systems, a
10 + omnicient and all-powerful kernel enforces security policy at runtime.
11 11 // what am i trying to say here.
12 12 It acts as the bodyguard, holding all i/o and data protected unless the
13 13 requesting party has the authorization to access some resource. This tight
14 - coupling of security policy and access mechanisms works well since any accesses
15 - must be done through the kernel, so why not perform security checks right
16 - along-side an access. However
14 + coupling of security policy and access mechanisms works well since any access
15 + must be done through the kernel, so why not perform security checks
16 + alongside accesses? However,
17 17 the enforcement of security policy starts getting complicated when we try
18 - to seperate the access mechanisms from the kernel. This notion arises
18 + to separate the access mechanisms from the kernel. This problem arises
19 19 in a certain class of operating systems.
20 20
21 - == Data Centric Operating Systems
21 + == Data-Centric Operating Systems
22 22
23 - Data centric operating systems are defined by two principles @twizzler:
23 + Data-centric operating systems are defined by two principles @twizzler:
24 24
25 - + Provide direct, kernel-free, access to data.
25 + + They provide direct, kernel-free, access to data.
26 26
27 - + A notion of pointers that are tied to the data they represent.
27 + + They have a notion of pointers that are tied to the data they represent.
28 28
29 29 Mainstream operating systems fail to classify as data-centric operating
30 30 systems, as they rely on the kernel for all data access, and use virtualized
31 31 pointers per process to represent underlying data. The benefit of this "class"
32 32 of operating systems comes from the low overhead for data manipulation, due to the lack
33 - of kernel involvement. However our previous security model fails to operate
34 - here as, by defenition, the kernel cannot be infront of accesses to data. So,
33 + of kernel involvement. However, the mainstream security model fails to operate
34 + here as, by definition, the kernel cannot be in front of access to data. So,
35 35 something new must be investigated.
36 36
37 37 == Capability Based Security Systems
···
41 41 //
42 42 // how they are different from earlier thingies
43 43
44 - Capability based security systems have a rich history in research, and offer
45 - an alternative approach to security, in opposition to the ACL's of prevalent OS's @linux_security.
46 - Boiled down, a capability is a token of authority, holding at mininum some
44 + Capability-based security systems have a rich history in research, and offer
45 + an alternative approach to security, in opposition to the Access Control Lists of prevalent OS's @linux_security.
46 + Boiled down, a capability is a token of authority, holding at minimum some
47 47 permissions and a unique identifier to which "thing" those permissions apply
48 - to @cap-book. This simple approach of having a "token", allows for a large seperation
48 + to @cap-book. This simple approach of having a "token", allows for a separation
49 49 of the kernel's involvement in the creation and management of security policy.
50 - In a well designed system, as we see in @twizsec and described later, allows for
50 + In a well-designed system, as we see in @twizsec and described later, this allows
51 51 users to completely create and manage security policy while the kernel is left to enforce
52 - it. This paradigm allows for the kernel-free access of data, while also guaranteeing
52 + it. This paradigm permits kernel-free access of data, while also guaranteeing
53 53 security.
54 54
55 55
···
59 59 In this thesis, I detail the fundamentals of security in the Twizzler
60 60 operating system, and discuss how I implement and refine some of the high
61 61 level ideas described in Twizzler @twizzler and an early draft of a Twizzler security
62 - paper @twizsec. Additionally we evaluate these systems inside kernel and user space,
63 - with comparsions to micro-benchmarks done with an older version of twizzler.
62 + paper @twizsec. Additionally, we evaluate these systems inside kernel and user space, using
63 + Alice/Bob scenarios and microbenchmarks.
64 64 Code can be found in this
65 65 #link("https://github.com/twizzler-operating-system/twizzler/issues/268")[Github
66 66 tracking issue].
+18 -18
2-keypair.typ
···
4 4 #mol-chapter("Key Pairs")
5 5
6 6 // what are keypair objects ?
7 - Key pairs in Twizzler are the representation of the cryptographic signing
8 - schemes used to create a signed capability, discussed in 3.1. We design
9 - the keypair objects to be agnostic towards what cryptographic schemes are
10 - underneath, allowing for the underlying algorithm to be changed @twizzler. The
11 - keys themselves are stored inside of objects, allowing for persistent or
12 - volatile storage depending on object specification, and allows for keys
13 - themselves to be treated as any other object and have security policy applied
14 - to them. This allows for powerful primitives and rich expressiveness for
15 - describing secruity policy, while also being flexible enough to make basic
16 - policy easy.
7 + Key pairs in Twizzler are representation of the cryptographic signing
8 + schemes used to create a signed capability, as discussed in 3.1. We design
9 + the keypair objects to be agnostic towards the underlying scheme to allow for
10 + multiple schemes, as described in @twizzler. This also helps with backwards
11 + compatibilty when adding new, more secure schemes, in the future. The keys
12 + are stored inside of objects, allowing for persistent or volatile
13 + storage depending on object specification, and allows for keys themselves to
14 + be treated as any other object and have security policy applied to them. This
15 + allows for powerful primitives and rich expressiveness for describing secruity
16 + policy, while also being intuitive enough to construct basic policy easily.
17 17
18 18
19 19 == Abstraction
···
29 29 for an outdated signing scheme to be used in support of older programs /
30 30 files. An existing drawback for backward compatibility is the maximum size
31 31 of the buffer we store the key in. Currently we set the maximum size as 256
32 - bytes, meaning if a future cryptographic signing scheme was to be found with
33 - a private key size larger than 256 bytes, we would have to drop backwards
34 - compatibility. Sure this can be prevented by setting the maximum size to
35 - something larger, but that a tradeoff between possible cryptographic schemes
32 + bytes, meaning if a future cryptographic signing scheme was to be created with
33 + a key size larger than 256 bytes, we would have to drop backwards
34 + compatibility. Sure this can be prevented now by setting the maximum size to
35 + something larger, but thats a tradeoff between possible cryptographic schemes
36 36 vs the real on-disk cost of larger buffers.
37 37
38 38 == Compartmentalization
···
47 47 Suppose for instance we have Alice on Twizzler, and all users on twizzler have
48 48 a "user-root" keypair that allows for them to create an arbitrary number of
49 49 objects. Also suppose that access to this user-root keypair is protected by
50 - some login program, where only alice can log in. This now means that Alice
51 - now can create new keypair objects from her user-root keypair. Since all
52 - her new keypairs originate from her original user-root keypair, only she can
50 + some login program, where only alice can log in. This means that Alice
51 + can create new keypair objects from her user-root keypair. Since all
52 + *her* new keypairs originate from *her* original user-root keypair, only *she* can
53 53 access the keys required to create new signatures allowing permissions into
54 - her objects. It forms an elegant solution for capability creation without
54 + *her* objects. It forms an elegant solution for key management without
55 55 the involvement of the kernel.
56 56
57 57
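The Abstraction hunk in 2-keypair.typ describes keys as living in a fixed buffer (currently capped at 256 bytes) tagged with the scheme that produced them. A minimal sketch of that layout follows, to make the backward-compatibility tradeoff concrete. The names `MAX_KEY_SIZE`, `SigningScheme`, `KeyBuf` and `KeyError`, and the two scheme variants, are assumptions for illustration and are not taken from the Twizzler source.

```rust
/// Maximum key length in bytes; the thesis text gives 256 as the current cap.
const MAX_KEY_SIZE: usize = 256;

/// Hypothetical tag naming the scheme that produced a key.
/// The variants are placeholders, not the schemes Twizzler actually ships.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SigningScheme {
    Ed25519,
    EcdsaP256,
}

/// Error returned when a key does not fit in the fixed buffer.
#[derive(Debug, PartialEq, Eq)]
enum KeyError {
    TooLarge { got: usize, max: usize },
}

/// Fixed-size, scheme-agnostic key storage (illustrative only).
#[derive(Debug, Clone)]
struct KeyBuf {
    scheme: SigningScheme,
    len: usize,
    bytes: [u8; MAX_KEY_SIZE],
}

impl KeyBuf {
    /// Copies `key` into the fixed buffer, rejecting keys larger than the cap.
    fn new(scheme: SigningScheme, key: &[u8]) -> Result<Self, KeyError> {
        if key.len() > MAX_KEY_SIZE {
            return Err(KeyError::TooLarge { got: key.len(), max: MAX_KEY_SIZE });
        }
        let mut bytes = [0u8; MAX_KEY_SIZE];
        bytes[..key.len()].copy_from_slice(key);
        Ok(Self { scheme, len: key.len(), bytes })
    }

    /// Returns only the bytes that belong to the key, without the padding.
    fn as_bytes(&self) -> &[u8] {
        &self.bytes[..self.len]
    }
}
```

Under this sketch, a scheme whose keys exceed `MAX_KEY_SIZE` fails at construction time, which is the point where the text says backward compatibility would have to be dropped. Every stored key occupies the full buffer regardless of its real length, which is the on-disk cost mentioned in the hunk.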
+28 -17
3-cap.typ
···
6 6 // define a capability
7 7 Capabilities are the atomic unit of security in Twizzler, acting as tokens of
8 8 protections granted to a process, allowing it to access some object in the ways
9 - it describes. A Capability is built up of the following fields.
9 + it describes. Colloquially a capability is defined as permissions and
10 + a unique object to which those permissions apply, but in Twizzler we add
11 + the signature component to allow the kernel to validate that the security policy was created by an authorized party.
12 +
13 + Thus, a Capability is represented as follows:
10 14
11 15
12 16 ```rust
···
17 21 flags: CapFlags, // Cryptographic configuration for capability validation.
18 22 gates: Gates, // Additional constraints on when this capability can be used.
19 23 revocation: Revoc, // Specifies when this capability is invalid, i.e. expiration.
20 - sig: Signature, // The signature inside the capability.
24 + sig: Signature, // The signature.
21 25 }
22 26 ```
23 27
24 28 //
25 29 == Signature
26 - The signature inside is what determines the validity of this capability. The
30 + The signature is what determines the validity of the capability. The
27 31 only possible signer of some capability is who ever has permissions to the
28 - signing key object, or the kernel. In this way, if the signer decides to
29 - make the signing key private to them, no other entity can administer this
30 - signature for this capability. The signature is built up of a array with
32 + signing key object, or the kernel itself. The signature is built up of a array with
31 33 a maximum length and a enum representing what type of cryptographic scheme
32 34 was used to create it; quite similar to the keys mentioned previously.
33 - The message being signed to form the signature is the bytes of each of the
34 - fields inside the capability being hashed. There is support for multiple
35 - hashing algorithms as described in 3.1.
36 -
35 + The fields of the capability are serialized and hashed to form the message that gets signed,
36 + and then stored in the signature field. Currently we support Blake3 and
37 + Sha256 as hashing algorithms.
37 38
38 39 // what do i want to talk about regarding signatures?
39 40
40 41 == Gates
42 + Gates act as a limited entry point into objects. If a capability has a non-trivial gate,
43 + which is made up of an offset field, and a length field, the kernel will read that and ensure
44 + that any memory accesses into that object are within the gate bounds. The original Twizzler
45 + paper @twizzler describes gates as a way to perform IPC, and calls between distinct programs,
46 + but in the context of this thesis it is sufficient to think of them as a region of allowed
47 + memory access.
41 48
42 49 == Flags
43 - Currently flags in capabilities are used to specify which hashing algorithm to use in order
44 - to form a message to be signed. We allow for multiple algorithms to be used in order to
45 - allow for backwards capability when newer, more efficient hashing algorithms are created.
46 -
47 - There is also plenty of space left in the bitmap, allowing for future work to develop more
48 - expressive ways of using capabilities, such as planned future work to implement information
49 - flow control into the twizzler security system.
50 + Currently, flags in capabilities are used to specify which hashing algorithm to use to form a message to be signed. We allow for multiple algorithms to be used to
51 + allow for backward capability when newer, more efficient hashing algorithms are created.
50 52
53 + The flags inside a capability is a bitmask providing information about distinct feautures
54 + of that capability. Currently we only use them to mark what hashing algorithm was used to
55 + form the message for the signature, but there's plenty of bits left to use.
56 + We hope for future work to develop more expressive ways of using capabilities, i.e. Decentralized Information Flow Control, as specified in
57 + 6.1.
51 58
52 59
53 60 #load-bib(read("refs.bib"))
61 +
62 +
63 +
64 +
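The Gates and Flags hunks in 3-cap.typ describe two small mechanisms: a gate is an offset plus a length that bounds memory accesses into an object, and the flags are a bitmask that currently records which hashing algorithm (Blake3 or Sha256) formed the signed message. The sketch below illustrates both. The type names, the `permits` and `hash_algo` helpers, and the bit positions are assumptions for illustration, not Twizzler's actual definitions.

```rust
/// A gate as described in the text: an offset and a length into an object.
#[derive(Debug, Clone, Copy)]
struct Gate {
    offset: u64,
    length: u64,
}

impl Gate {
    /// True if the access range [access_offset, access_offset + access_len)
    /// lies entirely inside the gate. Ranges that overflow u64 are rejected.
    fn permits(&self, access_offset: u64, access_len: u64) -> bool {
        let gate_end = match self.offset.checked_add(self.length) {
            Some(end) => end,
            None => return false,
        };
        let access_end = match access_offset.checked_add(access_len) {
            Some(end) => end,
            None => return false,
        };
        access_offset >= self.offset && access_end <= gate_end
    }
}

// Hypothetical bit assignments; the real layout may differ.
const FLAG_HASH_BLAKE3: u16 = 1 << 0;
const FLAG_HASH_SHA256: u16 = 1 << 1;

#[derive(Debug, PartialEq, Eq)]
enum HashAlgo {
    Blake3,
    Sha256,
}

/// Decodes the hashing algorithm from a flags bitmask.
/// Returns None when zero or both algorithm bits are set.
fn hash_algo(flags: u16) -> Option<HashAlgo> {
    let blake3 = (flags & FLAG_HASH_BLAKE3) != 0;
    let sha256 = (flags & FLAG_HASH_SHA256) != 0;
    match (blake3, sha256) {
        (true, false) => Some(HashAlgo::Blake3),
        (false, true) => Some(HashAlgo::Sha256),
        _ => None,
    }
}
```

Treating an ambiguous bitmask as invalid is a design choice of this sketch; the thesis text does not say how such a case is handled.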
+36 -22
4-secctx.typ
···
2 2
3 3 #mol-chapter("Security Contexts")
4 4
5 - Security Contexts are objects that store capabilites, which processes can attach onto, inherting the
6 - permissions granted by the capabilities that reside inside.
7 -
8 - == Enforcement
9 -
10 - The enforcement of security policy in Twizzler happens on page fault when trying to access
11 - a new object @twizzler. Then the kernel inspects the security contexts attached to
12 - the accessing proccess, looking up what capabilities those contexts hold and if they are applicable
13 - to the object being accessed. The original twizzler paper @twizzler, and the following security paper
14 - go into more detail about the philosophy behind why enforcement works this way, such as the
15 - performance benefits of letting programs access objects directly without kernel involvement, etc.
16 -
17 -
18 -
5 + Security Contexts are objects that processes attach to in-order to inherit the
6 + permissions inside the context. The contexts store capabilities, allowing for userspace
7 + programs to add capabilities to contexts, and kernel space to efficiently search
8 + through them to determie whether a process has the permissions to perform a memory access.
19 9
20 10 == Base
21 11
22 12 Since security contexts can be interacted with by the kernel and userspace, there needs to
23 - be a consistent defenition that both parties can adhere to, which we define. Objects in twizzler
13 + be a consistent definition that both parties can adhere to, which we define. Objects in Twizzler
24 14 have a notion of a `Base` which defines an arbitrary block of data at the "bottom" of an object
25 15 that is represented as a type in rust. We define the `Base` for a security context as follows:
26 16
···
35 25
36 26 === Map
37 27 The map holds positions to Capabilities relevant to some target object, which
38 - the relevant security context implementations for kernel and userspaces to
28 + the relevant security context implementations for kernel and userspace to
39 29 parse security context objects. Implicitly, the kernel uses
40 30 this map for lookup while the user interacts with this map to indicate the insertion, removal, or modification of
41 31 a capability.
···
43 33 === Masks
44 34 Masks act as a restraint on the permissions this context can provide for some targeted object.
45 35 This allows for more expressive security policy, such as being able to quickly restrict
46 - permissions for an object, without having to remove a capability, and recreating one with the
36 + permissions for an object, without having to remove a capability and recreating one with the
47 37 dersired restricted permissions.
48 38
49 39 The global mask is quite similar to the masks mentioned above, except that it operates on
···
51 41 // what now
52 42 //
53 43 === Flags
54 -
55 -
56 -
44 + Flags is a bitmap allowing for a Security Context to have different properties. Currently, there
45 + is only one value, UNDETACHABLE, marking the security context as a jail of sorts, as
46 + once a process attaches to it, it won't be able to detach. This acts as a way to
47 + limit the transfer of information if a thread attaches to a sensitive object. Once a thread
48 + attaches to such a context, it is forced to end its execution with the objects that context grants
49 + permission to. We also plan to utilize these flags in future works, as described in 6.1.
57 50
58 51
52 + == Enforcement
59 53
60 - // on disk storage for security contexts for efficient lookup
54 + All enforcement happens inside the kernel, which has a seperate view into Security Contexts
55 + than userspace. The kernel keeps track of all security contexts that threads in Twizzler
56 + attach to, instantiating a cache inside each one. Additionally, a thread can attach
57 + to multiple security contexts, but can only utilize the permissions granted by one unless
58 + they switch @twizzler. To manage these threads, the kernel assigns a Security Context Manager,
59 + which holds onto security context references that a thread has.
61 60
61 + The enforcement of security policy in Twizzler happens on page fault when trying to access
62 + a new object @twizzler. Upon fault, the kernel inspects the target object and identifies the
63 + default permissnons of that object. Then the kernel checks if the currently active
64 + security context for the accessing thread has either cached or capabilities that provide
65 + permissions. If default permissions + the active context permissions arent enough to
66 + permit the access, the kernel then checks each of the inactive contexts to see if they
67 + have any relevant permissions. If there exists such permissions, then the kernel will
68 + switch the active context of that process to the previously inactive context where the permission
69 + was found. If it fails all of these, then the kernel terminates the process, citing inadequate
70 + permissions.
62 71
63 - // what else is special about security contexts?
72 + Since the security context can have a mask per object, while also having a global_mask to
73 + the protections it can grant, the kernel also takes this into account while determining if
74 + a process has the permissions for access.
64 75
76 + The original Twizzler paper @twizzler, and the following security paper
77 + go into more detail about the philosophy behind why enforcement works this way, such as the
78 + performance benefits of letting programs access objects directly without kernel involvement, etc.
65 79
66 80 #load-bib(read("refs.bib"))
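The new Enforcement section in 4-secctx.typ walks through a decision procedure on page fault: take the object's default permissions, combine them with what the active context grants, fall back to the inactive contexts, switch to one that suffices, and otherwise deny. The sketch below models that flow in plain Rust. It is a simplification under stated assumptions: capabilities are treated as already verified and flattened into a per-object permission bitmap, per-object and global masks are applied by intersection, and default permissions are also combined with inactive contexts. None of the names (`SecCtx`, `check_access`, `Decision`, the permission constants) come from the Twizzler kernel.

```rust
use std::collections::HashMap;

// Illustrative permission bits.
const READ: u8 = 0b001;
const WRITE: u8 = 0b010;
const EXEC: u8 = 0b100;

/// Hypothetical object identifier type.
type ObjId = u128;

/// Simplified security context: verified grants plus masks.
struct SecCtx {
    /// Permissions granted per object by capabilities (assumed verified).
    grants: HashMap<ObjId, u8>,
    /// Optional per-object restriction.
    masks: HashMap<ObjId, u8>,
    /// Restriction applied to every object in this context.
    global_mask: u8,
}

impl SecCtx {
    /// Effective permissions for `obj`: grants intersected with both masks.
    /// (Intersection is an assumption of this sketch.)
    fn perms_for(&self, obj: ObjId) -> u8 {
        let granted = self.grants.get(&obj).copied().unwrap_or(0);
        let mask = self.masks.get(&obj).copied().unwrap_or(u8::MAX);
        granted & mask & self.global_mask
    }
}

#[derive(Debug, PartialEq, Eq)]
enum Decision {
    /// Access permitted; `active` is the context index in use afterwards.
    Allowed { active: usize },
    /// No context provides the permissions; the text says the process is terminated.
    Denied,
}

/// Models the page-fault check: the active context first, then the inactive ones.
fn check_access(
    contexts: &[SecCtx],
    active: usize,
    obj: ObjId,
    default_perms: u8,
    wanted: u8,
) -> Decision {
    let enough = |ctx: &SecCtx| ((default_perms | ctx.perms_for(obj)) & wanted) == wanted;

    if contexts.get(active).map_or(false, |ctx| enough(ctx)) {
        return Decision::Allowed { active };
    }
    for (index, ctx) in contexts.iter().enumerate() {
        if index != active && enough(ctx) {
            // Corresponds to the kernel switching the active context.
            return Decision::Allowed { active: index };
        }
    }
    Decision::Denied
}
```

The sketch omits the per-context cache and signature verification that the text mentions; a fuller model would verify a capability on first use and store the result so later faults skip the cryptographic work.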
+15 -16
5-results.typ
···
19 19 == Validation
20 20
21 21 The first test is a basic scenario as a check to make sure the system is behaving as intended, and
22 - a more expressive test to demonstrate the flexibility of the model. Eventually I intend to work with
22 + a more expressive test to demonstrate the flexibility of the model. Eventually, I intend to work with
23 23 my advisor and peers to form a proof of correctness for the security model, as well
24 24 as empirical testing to demonstrate its rigidity.
25 25
···
30 30
31 31
32 32 == Micro Benchmarks
33 - Additionally we have microbenchmarks of core security operations in Twizzler. All
34 - benchmarks were ran with a Ryzen 5 2600, with Twizzler virtualized in QEMU. Unfortunately
33 + Additionally, we have microbenchmarks of core security operations in Twizzler. All
34 + benchmarks were run with a Ryzen 5 2600, with Twizzler virtualized in QEMU. Unfortunately
35 35 I ran out of time to perform benchmarks on bare metal, but they should be the same, if
36 36 not more, performant.
37 37
38 38 === Kernel
39 39
40 - Theres a couple things we benchmark inside the kernel, including core cryptographic
40 + There a couple of things we benchmark inside the kernel, including core cryptographic
41 41 operations like signature generation and verification, as well as the total time it takes
42 42 to verify a capability.
43 43 #figure(
···
84 84 )
85 85
86 86 We see that signatures are vastly more expensive than hashing, on an order
87 - of $10^3$, meaning that your choice of hashing algorithm doesnt affection the
87 + of $10^3$, meaning that your choice of hashing algorithm doesn't affect the
88 88 total time taken for the verification of a capability. It's also important to
89 - note that this cost of verifying a capability for access is done on the first
90 - pagefault, then the kernel uses caching to store the granted permissions and
91 - provieds those on subsequent page faults into that object. In the future I hope
92 - to measure the difference between a cached and uncached verification. Secondly
93 - we only measure verification inside kernel space; as disscussed in section 3,
89 + note that this cost of verifying a capability for access is done on the first-page fault, then the kernel uses caching to store the granted permissions and
90 + provides those on subsequent page faults into that object. In the future, I hope
91 + to measure the difference between a cached and uncached verification. Secondly,
92 + we only measure verification inside kernel space; as discussed in section 3,
94 93 capability creation only takes place in user space.
95 94
96 95 === UserSpace
97 96
98 - In userspace we benchmark keypair and capability creation, as these operations are core to
99 - creating security policy.
97 + In userspace, we benchmark keypair and capability creation, as these operations are core to
98 + creating a security policy.
100 99
101 100
102 101 #figure(
···
130 129 )
131 130
132 131 Almost all the time spent in creating a capability is the cryptographic operations used
133 - to form its signature, which is why its in the same ballpark as signature creation we saw earlier.
132 + to form its signature, which is why it's in the same ballpark as the signature creation we saw earlier.
134 133
135 - The high varince in Keypair objects and Security contexts creation happens from the
136 - unpredictable time it takes for the kernel to create an object on disk. The reason keypair's
137 - are almost 2x more expensive since it creates two seperate objects, one for the signing key,
134 + The high standard deviation in Keypair objects and Security context creation happens from the
135 + unpredictable time it takes for the kernel to create an object on disk. The reason keypairs
136 + are almost 2x more expensive since they create two separate objects, one for the signing key,
138 137 and one for the verifying key.
thesis.pdf

This is a binary file and will not be displayed.

+10 -3
thesis.typ
···
24 24 )
25 25
26 26 #mol-abstract[
27 - whatevea
28 - lowkey not even sure what to write
29 - ]
27 + Traditional operating systems permit data access through the kernel, applying
28 + security policy as a part of that pipeline. The Twizzler operating system
29 + flips that relationship on its head, focusing on an approach where data
30 + access is a first-class citizen, getting rid of the kernel as a middleman. With
31 + this data-centric approach, it requires us to rethink how security policy
32 + interacts with users and the kernel. In this thesis, I present the design and
33 + implementation of core security primitives in Twizzler. Then I evaluate the
34 + security model with a basic and advanced scenario, as well as microbenchmarks
35 + of core security operations. Lastly, I discuss future work built off this
36 + thesis, such as the incorporation of Decentralized Information Flow Control.]
30 37
31 38
32 39