My undergraduate thesis on a capability-based security system for a data-centric operating system.
+261 -120
+20 -20
1-introduction.typ
··· 6 6 7 7 // talk about the standard unix abstractions 8 8 9 - In mainstream operating systems, security policy is enforced at runtime by a 10 - omnicient and all powerful kernel. 9 + In mainstream operating systems, an 10 + omniscient and all-powerful kernel enforces security policy at runtime. 11 11 // what am i trying to say here. 12 12 It acts as the bodyguard, holding all I/O and data protected unless the 13 13 requesting party has the authorization to access some resource. This tight 14 - coupling of security policy and access mechanisms works well since any accesses 15 - must be done through the kernel, so why not perform security checks right 16 - along-side an access. However 14 + coupling of security policy and access mechanisms works well since any access 15 + must be done through the kernel, so why not perform security checks 16 + alongside accesses? However, 17 17 the enforcement of security policy starts getting complicated when we try 18 - to seperate the access mechanisms from the kernel. This notion arises 18 + to separate the access mechanisms from the kernel. This problem arises 19 19 in a certain class of operating systems. 20 20 21 - == Data Centric Operating Systems 21 + == Data-Centric Operating Systems 22 22 23 - Data centric operating systems are defined by two principles @twizzler: 23 + Data-centric operating systems are defined by two principles @twizzler: 24 24 25 - + Provide direct, kernel-free, access to data. 25 + + They provide direct, kernel-free access to data. 26 26 27 - + A notion of pointers that are tied to the data they represent. 27 + + They have a notion of pointers that are tied to the data they represent. 28 28 29 29 Mainstream operating systems fail to classify as data-centric operating 30 30 systems, as they rely on the kernel for all data access, and use virtualized 31 31 pointers per process to represent underlying data.
The benefit of this "class" 32 32 of operating systems comes from the low overhead for data manipulation, due to the lack 33 - of kernel involvement. However our previous security model fails to operate 34 - here as, by defenition, the kernel cannot be infront of accesses to data. So, 33 + of kernel involvement. However, the mainstream security model fails to operate 34 + here as, by definition, the kernel cannot be in front of accesses to data. So, 35 35 something new must be investigated. 36 36 37 37 == Capability Based Security Systems ··· 41 41 // 42 42 // how they are different from earlier thingies 43 43 44 - Capability based security systems have a rich history in research, and offer 45 - an alternative approach to security, in opposition to the ACL's of prevalent OS's @linux_security. 46 - Boiled down, a capability is a token of authority, holding at mininum some 44 + Capability-based security systems have a rich history in research, and offer 45 + an alternative approach to security, in opposition to the Access Control Lists of prevalent OSes @linux_security. 46 + Boiled down, a capability is a token of authority, holding at minimum some 47 47 permissions and a unique identifier to which "thing" those permissions apply 48 - to @cap-book. This simple approach of having a "token", allows for a large seperation 48 + to @cap-book. This simple approach of having a "token" allows for a separation 49 49 of the kernel's involvement in the creation and management of security policy. 50 - In a well designed system, as we see in @twizsec and described later, allows for 50 + In a well-designed system, as we see in @twizsec and described later, this allows 51 51 users to completely create and manage security policy while the kernel is left to enforce 52 - it. This paradigm allows for the kernel-free access of data, while also guaranteeing 52 + it. This paradigm permits kernel-free access to data, while also guaranteeing 53 53 security.
54 54 55 55 ··· 59 59 In this thesis, I detail the fundamentals of security in the Twizzler 60 60 operating system, and discuss how I implement and refine some of the high 61 61 level ideas described in Twizzler @twizzler and an early draft of a Twizzler security 62 - paper @twizsec. Additionally we evaluate these systems inside kernel and user space, 63 - with comparsions to micro-benchmarks done with an older version of twizzler. 62 + paper @twizsec. Additionally, we evaluate these systems inside kernel and user space, using 63 + Alice/Bob scenarios and microbenchmarks. 64 64 Code can be found in this 65 65 #link("https://github.com/twizzler-operating-system/twizzler/issues/268")[GitHub 66 66 tracking issue].
+35 -50
2-keypair.typ
··· 4 4 #mol-chapter("Key Pairs") 5 5 6 6 // what are keypair objects ? 7 - Key pairs in Twizzler are the representation of the cryptographic signing schemes 8 - used to create a signed capability, discussed in 3.1. 9 - We design the keypair objects to be agnostic towards what cryptographic 10 - schemes are underneath, allowing for the underlying algorithm to be changed 11 - @twizzler. The keys themselves are stored inside of objects, allowing for 12 - persistent or volatile storage depending on object specification, and 13 - allows for keys themselves to be treated as any other object and have security 14 - policy applied to them. This allows for powerful primitives and rich 15 - expressiveness for describing secruity policy, while also being flexible enough 16 - to make basic policy easy. 7 + Key pairs in Twizzler are the representation of the cryptographic signing 8 + schemes used to create a signed capability, as discussed in 3.1. We design 9 + the keypair objects to be agnostic towards the underlying scheme to allow for 10 + multiple schemes, as described in @twizzler. This also helps with backward 11 + compatibility when adding new, more secure schemes in the future. The keys 12 + are stored inside objects, allowing for persistent or volatile 13 + storage depending on object specification, and allowing keys themselves to 14 + be treated as any other object and have security policy applied to them. This 15 + allows for powerful primitives and rich expressiveness for describing security 16 + policy, while also being intuitive enough to construct basic policy easily. 17 17 18 - Suppose for instance we have Alice on Twizzler, and all users on twizzler 19 - have a "user-root" keypair that allows for them to create an arbitrary number of 20 - objects. Also suppose that access to this user-root keypair is protected by some 21 - login program, where only alice can log in.
This now means that Alice now 22 - can create new keypairs, protected by her user-root keypair. Since all her 23 - new keypairs originate from her original user-root keypair, only she can access 24 - the keys required to create new signatures of hers. It forms an elegant solution for 25 - capability creation without the involvement of the kernel. 26 - 27 - 28 - 29 - 30 - // how are they represented in twizzler ? 31 18 32 19 == Abstraction 33 20 34 21 The `SigningKey` struct is a fixed-length byte array with a length field 35 22 and an enum specifying what algorithm that key should be interpreted as. 36 - Currently we use the Elliptic Curve Digital Signature Algorithm (ECDSA) 37 - @ecdsa to sign capabilities and verify them, but the simplistic dat 38 - arepresentation allows for any arbitrary alogrithm to be used as long as 39 - the key can be represented as bytes. 23 + Currently, we use the Elliptic Curve Digital Signature Algorithm (ECDSA) @ecdsa 24 + to sign capabilities and verify them, but the simplistic data representation 25 + allows for any arbitrary algorithm to be used as long as the key can be 26 + represented as bytes.
28 + Additionally, this specification allows for backward compatibility, allowing 29 + for an outdated signing scheme to be used in support of older programs and 30 + files. An existing drawback of backward compatibility is the maximum size 31 + of the buffer we store the key in. Currently, we set the maximum size to 256 32 + bytes, meaning if a future cryptographic signing scheme was to be created with 33 + a key size larger than 256 bytes, we would have to drop backward 34 + compatibility. This can be prevented now by setting the maximum size to 35 + something larger, but that's a tradeoff between supporting possible cryptographic schemes 36 + and the real on-disk cost of larger buffers. 49 37 50 38 == Compartmentalization 51 39 // how they can be used to sign multiple objects (compartmentalization) 52 40 53 41 To create an object in Twizzler, you specify the ID of a verifying key 54 42 object so the kernel knows which key to use to verify any 55 - capabilities permitting access to the object. You can also specify 56 - default protections for an object or create a capability with the signing 57 - key and any desired permissions. 58 - 59 - The neat thing about this design is that you can use a single keypair in-order to use 60 - any arbitrary amount of objects. An example could be a colletion of objects holding files for a class, and grouping all of them 61 - under the same key. In short, having this flexibility allows for a significant debloating 62 - of the filesystem, comparted to creating a new keypair for every single object. 43 + capabilities permitting access to the object. Since keys are represented as objects 44 + in Twizzler, security policy applies to them as well, creating clean 45 + solutions for key management. 63 46 64 - In planned future work , as we talk more about in 65 - we can investiage the This results in the possibility of finegrained 66 - access control to semantic groupings of objects. 47 + Suppose, for instance, we have Alice on Twizzler, and all users on Twizzler have 48 + a "user-root" keypair that allows for them to create an arbitrary number of 49 + objects. Also suppose that access to this user-root keypair is protected by 50 + some login program, where only Alice can log in. This means that Alice 51 + can create new keypair objects from her user-root keypair. Since all 52 + *her* new keypairs originate from *her* original user-root keypair, only *she* can 53 + access the keys required to create new signatures allowing permissions into 54 + *her* objects. It forms an elegant solution for key management without 55 + the involvement of the kernel. 68 56
67 - // what the fuck am i trying to say 47 + Suppose for instance we have Alice on Twizzler, and all users on twizzler have 48 + a "user-root" keypair that allows for them to create an arbitrary number of 49 + objects. Also suppose that access to this user-root keypair is protected by 50 + some login program, where only alice can log in. This means that Alice 51 + can create new keypair objects from her user-root keypair. Since all 52 + *her* new keypairs originate from *her* original user-root keypair, only *she* can 53 + access the keys required to create new signatures allowing permissions into 54 + *her* objects. It forms an elegant solution for key management without 55 + the involvement of the kernel. 68 56 69 - // all it does is make creation easier, since you only need one pair, it doesnt 70 - // restrict capabilities or whatever. It's just a benefit since we dont have to worry 71 - // about managing a keypair for every single object 72 57 73 58 #load-bib(read("refs.bib"))
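To make the fixed-length key layout described in this chapter concrete, here is a minimal, hypothetical Rust sketch. The names (`SigningKey`, `SigningScheme`, `MAX_KEY_LEN`) and the exact field layout are illustrative assumptions for this review, not Twizzler's actual definitions; it only demonstrates the 256-byte buffer limit and the algorithm tag discussed above.

```rust
// Hypothetical sketch of a fixed-length key buffer with an algorithm tag;
// names and layout are illustrative, not Twizzler's real types.
const MAX_KEY_LEN: usize = 256; // the 256-byte cap discussed above

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum SigningScheme {
    Ecdsa, // the scheme currently used for capability signatures
}

struct SigningKey {
    bytes: [u8; MAX_KEY_LEN], // raw key material, zero-padded
    len: usize,               // how many bytes are actually in use
    scheme: SigningScheme,    // how to interpret the bytes
}

impl SigningKey {
    /// Fails when the key material exceeds the fixed buffer, mirroring the
    /// backward-compatibility limit the chapter describes.
    fn new(material: &[u8], scheme: SigningScheme) -> Option<Self> {
        if material.len() > MAX_KEY_LEN {
            return None;
        }
        let mut bytes = [0u8; MAX_KEY_LEN];
        bytes[..material.len()].copy_from_slice(material);
        Some(SigningKey { bytes, len: material.len(), scheme })
    }
}

fn main() {
    // A 32-byte key (typical for ECDSA over a 256-bit curve) fits easily.
    let k = SigningKey::new(&[7u8; 32], SigningScheme::Ecdsa).expect("fits in buffer");
    assert_eq!(k.len, 32);
    assert_eq!(k.bytes[0], 7);
    assert_eq!(k.scheme, SigningScheme::Ecdsa);
    // A future scheme with keys over 256 bytes would be rejected.
    assert!(SigningKey::new(&[0u8; 300], SigningScheme::Ecdsa).is_none());
}
```

The `Option` return makes the tradeoff in the chapter visible: growing `MAX_KEY_LEN` admits larger future keys at a real on-disk cost per key object.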
+28 -17
3-cap.typ
··· 6 6 // define a capability 7 7 Capabilities are the atomic unit of security in Twizzler, acting as tokens of 8 8 protections granted to a process, allowing it to access some object in the ways 9 - it describes. A Capability is built up of the following fields. 9 + it describes. Colloquially, a capability is defined as a set of permissions and 10 + a unique object to which those permissions apply, but in Twizzler we add 11 + the signature component to allow the kernel to validate that the security policy was created by an authorized party. 12 + 13 + Thus, a Capability is represented as follows: 10 14 11 15 12 16 ```rust ··· 17 21 flags: CapFlags, // Cryptographic configuration for capability validation. 18 22 gates: Gates, // Additional constraints on when this capability can be used. 19 23 revocation: Revoc, // Specifies when this capability is invalid, i.e. expiration. 20 - sig: Signature, // The signature inside the capability. 24 + sig: Signature, // The signature. 21 25 } 22 26 ``` 23 27 24 28 // 25 29 == Signature 26 - The signature inside is what determines the validity of this capability. The 30 + The signature is what determines the validity of the capability. The 27 31 only possible signer of some capability is whoever has permissions to the 28 - signing key object, or the kernel. In this way, if the signer decides to 29 - make the signing key private to them, no other entity can administer this 30 - signature for this capability. The signature is built up of a array with 32 + signing key object, or the kernel itself. The signature is built up of an array with 31 33 a maximum length and an enum representing what type of cryptographic scheme 32 34 was used to create it; quite similar to the keys mentioned previously. 33 - The message being signed to form the signature is the bytes of each of the 34 - fields inside the capability being hashed. There is support for multiple 35 - hashing algorithms as described in 3.1.
36 - 35 + The fields of the capability are serialized and hashed to form the message that gets signed, 36 + and the resulting signature is then stored in the signature field. Currently we support Blake3 and 37 + Sha256 as hashing algorithms. 37 38 38 39 // what do i want to talk about regarding signatures? 39 40 40 41 == Gates 42 + Gates act as a limited entry point into objects. If a capability has a non-trivial gate, 43 + which is made up of an offset field and a length field, the kernel will read it and ensure 44 + that any memory accesses into that object are within the gate bounds. The original Twizzler 45 + paper @twizzler describes gates as a way to perform IPC and calls between distinct programs, 46 + but in the context of this thesis it is sufficient to think of them as a region of allowed 47 + memory access. 41 48 42 49 == Flags 43 - Currently flags in capabilities are used to specify which hashing algorithm to use in order 44 - to form a message to be signed. We allow for multiple algorithms to be used in order to 45 - allow for backwards capability when newer, more efficient hashing algorithms are created. 46 - 47 - There is also plenty of space left in the bitmap, allowing for future work to develop more 48 - expressive ways of using capabilities, such as planned future work to implement information 49 - flow control into the twizzler security system. 50 + Currently, flags in capabilities are used to specify which hashing algorithm to use to form the message to be signed. We allow multiple algorithms to preserve 51 + backward compatibility when newer, more efficient hashing algorithms are created. 52 53 + The flags inside a capability are a bitmask providing information about distinct features 54 + of that capability. Currently, we only use them to mark which hashing algorithm was used to 55 + form the message for the signature, but there are plenty of bits left to use. 56 + We hope for future work to develop more expressive ways of using capabilities, such as
Decentralized Information Flow Control, as specified in 57 + 6.1. 51 58 52 59 53 60 #load-bib(read("refs.bib")) 61 + 62 + 63 + 64 +
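The serialize-then-hash step and the gate bounds check in this chapter can be sketched in a few lines of Rust. This is illustrative only: the field types are simplified, `std`'s `DefaultHasher` stands in for Blake3/Sha256 (it is not cryptographic), and the exact bounds rule is an assumption; Twizzler's real structures and checks differ.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Simplified stand-in for the Capability struct shown in the chapter.
#[derive(Clone, Copy, Hash)]
struct Capability {
    target: u128,     // object ID the permissions apply to
    accessor: u128,   // security context this capability lives in
    perms: u8,        // permission bits
    gate_offset: u64, // start of the allowed access window
    gate_len: u64,    // length of the allowed access window
}

// Serialize-and-hash the fields to form the message that gets signed.
// DefaultHasher is a non-cryptographic placeholder for Blake3/Sha256.
fn capability_message(cap: &Capability) -> u64 {
    let mut h = DefaultHasher::new();
    cap.hash(&mut h);
    h.finish()
}

// Assumed gate rule: the access must fall entirely inside the gate window.
fn gate_allows(cap: &Capability, offset: u64, len: u64) -> bool {
    offset >= cap.gate_offset && offset + len <= cap.gate_offset + cap.gate_len
}

fn main() {
    let cap = Capability { target: 1, accessor: 2, perms: 0b101, gate_offset: 0, gate_len: 4096 };
    let tampered = Capability { perms: 0b111, ..cap };
    // Changing any field changes the signed message, binding the signature to the fields.
    assert_ne!(capability_message(&cap), capability_message(&tampered));
    // Accesses must land inside the gate's window.
    assert!(gate_allows(&cap, 0, 4096));
    assert!(!gate_allows(&cap, 4000, 200));
}
```

Because the signature covers every field, tampering with the permissions or gate of a signed capability invalidates it, which is exactly why the kernel can enforce policy it did not create.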
+36 -22
4-secctx.typ
··· 2 2 3 3 #mol-chapter("Security Contexts") 4 4 5 - Security Contexts are objects that store capabilites, which processes can attach onto, inherting the 6 - permissions granted by the capabilities that reside inside. 7 - 8 - == Enforcement 9 - 10 - The enforcement of security policy in Twizzler happens on page fault when trying to access 11 - a new object @twizzler. Then the kernel inspects the security contexts attached to 12 - the accessing proccess, looking up what capabilities those contexts hold and if they are applicable 13 - to the object being accessed. The original twizzler paper @twizzler, and the following security paper 14 - go into more detail about the philosophy behind why enforcement works this way, such as the 15 - performance benefits of letting programs access objects directly without kernel involvement, etc. 16 - 17 - 18 - 5 + Security Contexts are objects that processes attach to in order to inherit the 6 + permissions inside the context. The contexts store capabilities, allowing for userspace 7 + programs to add capabilities to contexts, and kernel space to efficiently search 8 + through them to determine whether a process has the permissions to perform a memory access. 19 9 20 10 == Base 21 11 22 12 Since security contexts can be interacted with by the kernel and userspace, there needs to 23 - be a consistent defenition that both parties can adhere to, which we define. Objects in twizzler 13 + be a consistent definition that both parties can adhere to, which we define below. Objects in Twizzler 24 14 have a notion of a `Base` which defines an arbitrary block of data at the "bottom" of an object 25 15 that is represented as a type in Rust.
We define the `Base` for a security context as follows: 26 16 ··· 35 25 36 26 === Map 37 27 The map holds positions of Capabilities relevant to some target object, which 38 - the relevant security context implementations for kernel and userspaces to 28 + the relevant security context implementations for kernel and userspace use to 39 29 parse security context objects. Implicitly, the kernel uses 40 30 this map for lookup while the user interacts with this map to indicate the insertion, removal, or modification of 41 31 a capability. ··· 43 33 === Masks 44 34 Masks act as a restraint on the permissions this context can provide for some targeted object. 45 35 This allows for more expressive security policy, such as being able to quickly restrict 46 - permissions for an object, without having to remove a capability, and recreating one with the 36 + permissions for an object, without having to remove a capability and recreate one with the 47 37 desired restricted permissions. 48 38 49 39 The global mask is quite similar to the masks mentioned above, except that it operates on ··· 51 41 // what now 52 42 // 53 43 === Flags 54 - 55 - 56 - 44 + Flags are a bitmap allowing a Security Context to have different properties. Currently, there 45 + is only one value, UNDETACHABLE, marking the security context as a jail of sorts, as 46 + once a process attaches to it, it won't be able to detach. This acts as a way to 47 + limit the transfer of information if a thread attaches to a sensitive object. Once a thread 48 + attaches to such a context, it is forced to end its execution with the objects that context grants 49 + permission to. We also plan to utilize these flags in future work, as described in 6.1. 57 50 58 51 52 + == Enforcement 59 53 60 - // on disk storage for security contexts for efficient lookup 54 + All enforcement happens inside the kernel, which has a different view into Security Contexts 55 + than userspace does.
The kernel keeps track of all security contexts that threads in Twizzler 56 + attach to, instantiating a cache inside each one. Additionally, a thread can attach 57 + to multiple security contexts, but can only utilize the permissions granted by one unless 58 + it switches @twizzler. To manage these threads, the kernel assigns a Security Context Manager, 59 + which holds onto the security context references that a thread has. 61 60 61 + The enforcement of security policy in Twizzler happens on page fault when trying to access 62 + a new object @twizzler. Upon fault, the kernel inspects the target object and identifies the 63 + default permissions of that object. Then the kernel checks if the currently active 64 + security context for the accessing thread has either cached permissions or capabilities that provide 65 + them. If the default permissions plus the active context permissions aren't enough to 66 + permit the access, the kernel then checks each of the inactive contexts to see if they 67 + have any relevant permissions. If such permissions exist, then the kernel will 68 + switch the active context of that process to the previously inactive context where the permission 69 + was found. If all of these fail, then the kernel terminates the process, citing inadequate 70 + permissions. 62 71 63 - // what else is special about security contexts? 72 + Since the security context can have a mask per object, as well as a global_mask on 73 + the protections it can grant, the kernel also takes these into account while determining if 74 + a process has the permissions for access. 64 75 76 + The original Twizzler paper @twizzler, and the following security paper 77 + go into more detail about the philosophy behind why enforcement works this way, such as the 78 + performance benefits of letting programs access objects directly without kernel involvement, etc. 65 79 66 80 #load-bib(read("refs.bib"))
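The masking rule this chapter describes composes naturally as bitwise operations. The sketch below is a hypothetical model of that composition, not Twizzler's implementation: the permission-bit layout, function names, and the exact order of combining defaults with context grants are assumptions made for illustration.

```rust
// Illustrative permission bits; Twizzler's actual layout may differ.
const READ: u8 = 0b001;
const WRITE: u8 = 0b010;
const EXEC: u8 = 0b100;

/// Permissions a context grants for one object: the capability's bits,
/// restricted by that object's mask and then by the context-wide global mask.
fn effective_perms(cap_perms: u8, object_mask: u8, global_mask: u8) -> u8 {
    cap_perms & object_mask & global_mask
}

/// Assumed page-fault check: the object's default permissions are combined
/// with whatever the (masked) active context grants, and the access succeeds
/// only if every requested bit is covered.
fn access_allowed(default_perms: u8, cap_perms: u8, object_mask: u8, global_mask: u8, requested: u8) -> bool {
    let granted = default_perms | effective_perms(cap_perms, object_mask, global_mask);
    granted & requested == requested
}

fn main() {
    // A capability grants read+write, but the object's mask revokes write:
    // reads still succeed, writes are denied, and no capability had to be
    // removed and re-signed.
    assert!(access_allowed(0, READ | WRITE, READ, READ | WRITE | EXEC, READ));
    assert!(!access_allowed(0, READ | WRITE, READ, READ | WRITE | EXEC, WRITE));
}
```

This is what makes masks cheap revocation: flipping a mask bit takes effect on the next fault, while the signed capability itself stays untouched.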
+110 -5
5-results.typ
··· 1 1 #import "template.typ": * 2 + 3 + #import "@preview/unify:0.7.1" 2 4 #mol-chapter("Results") 3 5 4 6 // benchmarking ··· 13 15 // 14 16 // take measurements without security checks too so you can see the security overhead 15 17 // 16 - All testing was done in QEMU, with a Ryzen 5 2600 processor. 17 18 19 + == Validation 18 20 19 - == Validation 21 + The first test is a basic scenario that checks the system is behaving as intended, and 22 + the second is a more expressive test to demonstrate the flexibility of the model. Eventually, I intend to work with 23 + my advisor and peers to form a proof of correctness for the security model, as well 24 + as empirical testing to demonstrate its robustness. 20 25 21 26 === Basic 27 + TBA 28 + === Expressive 29 + TBA 22 30 23 31 24 - === Expressive 32 + == Micro Benchmarks 33 + Additionally, we have microbenchmarks of core security operations in Twizzler. All 34 + benchmarks were run with a Ryzen 5 2600, with Twizzler virtualized in QEMU. Unfortunately, 35 + I ran out of time to perform benchmarks on bare metal, but they should be equally, if 36 + not more, performant. 25 37 38 + === Kernel 26 39 27 - == Micro Benchmarks 40 + There are a couple of things we benchmark inside the kernel, including core cryptographic 41 + operations like signature generation and verification, as well as the total time it takes 42 + to verify a capability.
43 + #figure( 44 + table( 45 + columns: (auto, auto), 46 + inset: 10pt, 47 + align: center, 48 + table.header( 49 + [Benchmark], [Time] 50 + ), 51 + [ 52 + Hashing (Sha256) 53 + ], 54 + [ 55 + 267.86 ns $plus.minus 163$ ns 56 + ], 57 + [ 58 + Hashing (Blake3) 59 + ], 60 + [ 61 + 125.99 ns $plus.minus 117$ ns 62 + ], 63 + [ 64 + Signature Generation (ECDSA) 65 + ], 66 + [ 67 + 199.90 $\u{00B5}s plus.minus 9.45 \u{00B5}s$ 68 + ], 28 69 29 - === Kernel 70 + [ 71 + Signature Verification (ECDSA) 72 + ], 73 + [ 74 + 342.20 $\u{00B5}s plus.minus 6.28 \u{00B5}s$ 75 + ], 76 + [ 77 + Capability Verification (ECDSA, Blake3) 78 + ], 79 + [ 80 + 343.59 $\u{00B5}s plus.minus 5.32 \u{00B5}s$ 81 + ] 82 + ), 83 + caption: [Collection of Kernel Benchmarking Results] 84 + ) 30 85 86 + We see that signatures are vastly more expensive than hashing, by a factor 87 + of roughly $10^3$, meaning that the choice of hashing algorithm doesn't affect the 88 + total time taken for the verification of a capability. It's also important to 89 + note that the cost of verifying a capability for access is paid on the first page fault; the kernel then caches the granted permissions and 90 + provides those on subsequent page faults into that object. In the future, I hope 91 + to measure the difference between a cached and uncached verification. Secondly, 92 + we only measure verification inside kernel space; as discussed in section 3, 93 + capability creation only takes place in user space. 31 94 32 95 === UserSpace 96 + 97 + In userspace, we benchmark keypair and capability creation, as these operations are core to 98 + creating a security policy.
99 + 100 + 101 + #figure( 102 + table( 103 + columns: (auto, auto), 104 + inset: 10pt, 105 + align: center, 106 + table.header( 107 + [Benchmark], [Time] 108 + ), 109 + [ 110 + Capability Creation 111 + ], 112 + [ 113 + 347.97 $\u{00B5}s plus.minus 5.78 \u{00B5}s$ 114 + ], 115 + [ 116 + Keypair Objects Creation 117 + ], 118 + [ 119 + 651.69 $\u{00B5}s plus.minus 187.90 \u{00B5}s$ 120 + ], 121 + [ 122 + Security Context Creation 123 + ], 124 + [ 125 + 282.10 $\u{00B5}s plus.minus 119.90 \u{00B5}s$ 126 + ], 127 + ), 128 + caption: [Collection of UserSpace Benchmarking Results] 129 + ) 130 + 131 + Almost all the time spent creating a capability goes to the cryptographic operations used 132 + to form its signature, which is why it's in the same ballpark as the signature generation we saw earlier. 133 + 134 + The high standard deviation in keypair object and security context creation stems from the 135 + unpredictable time it takes for the kernel to create an object on disk. Keypairs 136 + are almost 2x more expensive because they create two separate objects, one for the signing key 137 + and one for the verifying key.
+20 -1
6-conclusion.typ
··· 2 2 3 3 #mol-chapter("Conclusion") 4 4 5 + In short, we provide a general overview of the critical 6 + components of the security system in Twizzler, along with 7 + implementation details and design decisions. The evaluation programs show how 8 + security policy can be expressed and verify that the kernel enforces it as 9 + programmed. Lastly, we go over microbenchmarks to show and explain the cost of these operations. 10 + 11 + 5 12 == Future Works 6 13 7 14 In the future, I hope to take the primitives created during my thesis and apply them towards 8 - the implementation of Decentralized Information Flow Control, as described in 15 + the implementation of Decentralized Information Flow Control, as described in @flume, into 16 + the Twizzler security model. Additionally, I would love to see how the current security model 17 + evolves once we start adding distributed computing support to Twizzler, as described in 18 + the original paper @twizzler. 19 + 20 + 21 + == Acknowledgements 22 + 23 + I couldn't have done the work for this thesis and for Twizzler if it weren't for the 24 + support I've received from my advisor Owen Arden and my technical mentor Daniel Bittman! I 25 + owe both of you so much, not just for the class credit but also for how much I've learned in 26 + this endeavor. Thanks guys! 27 + 9 28 10 29 11 30 #load-bib(read("refs.bib"))
+2 -2
template.typ
··· 98 98 align(alignment.left, [ 99 99 #set par(first-line-indent: 0em) 100 100 101 - *Date of the public defence:* 101 + // *Date of the public defence:* 102 102 103 - _#defence-date _ 103 + // _#defence-date _ 104 104 105 105 #colbreak() 106 106
thesis.pdf

This is a binary file and will not be displayed.

+10 -3
thesis.typ
··· 24 24 ) 25 25 26 26 #mol-abstract[ 27 - whatevea 28 - lowkey not even sure what to write 29 - ] 27 + Traditional operating systems permit data access through the kernel, applying 28 + security policy as a part of that pipeline. The Twizzler operating system 29 + flips that relationship on its head, focusing on an approach where data 30 + access is a first-class citizen, getting rid of the kernel as a middleman. 31 + This data-centric approach requires us to rethink how security policy 32 + interacts with users and the kernel. In this thesis, I present the design and 33 + implementation of core security primitives in Twizzler. Then I evaluate the 34 + security model with a basic and an advanced scenario, as well as microbenchmarks 35 + of core security operations. Lastly, I discuss future work built off this 36 + thesis, such as the incorporation of Decentralized Information Flow Control.] 30 37 31 38 32 39