> The expectation of software existing as opaque files creates a huge amount of work for the OS in verifying the exact behaviour of the software as it runs
That’s not really how the software/OS relationship works. By default, OSes run software in a very unprivileged CPU security context, where it can’t really do anything beyond operate on the memory the OS allocated to it: its CPU instructions can’t talk directly to other hardware, or read/write memory beyond the bounds the OS specified.
To do literally anything else, like open a file, write to a file, talk to a network interface, or talk directly to other hardware connected to the CPU, the software needs to use a syscall and effectively ask the OS to perform the operation on its behalf. Every single dangerous operation you can perform is guarded by the OS. The OS doesn’t validate anything; it simply decides either to perform the operations requested, or not to perform them.
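To make that concrete, here is a minimal Haskell sketch (the file path is arbitrary, chosen only for illustration): even a plain library call to write a file is really a request to the kernel, and the kernel is free to refuse it.

    import Control.Exception (IOException, try)
    import System.IO (IOMode (WriteMode), hPutStrLn, withFile)

    main :: IO ()
    main = do
      -- withFile/hPutStrLn boil down to openat/write/close syscalls; the OS
      -- either performs them on our behalf or returns an error.
      result <- try (withFile "/etc/demo-output" WriteMode (\h -> hPutStrLn h "hello"))
                  :: IO (Either IOException ())
      case result of
        Left err -> putStrLn ("The OS refused the request: " ++ show err)
        Right () -> putStrLn "The OS performed the write on our behalf."

Run as a regular user, the write into /etc will normally fail, and that failure is nothing more than the kernel declining to perform the requested operation.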
The only reason software expects so much unfettered access to a system is that OSes have historically not limited what software was allowed to do; they simply performed whatever operation was requested of them without question. It’s only in the last 20 years or so that security became a concern, and it occurred to us that allowing software to just do whatever it wants is probably a bad idea. But the genie is out of the bottle, and it’s proving hard to put it back in. So the retroactive work to limit what software can do by default, without simply breaking everything, is slow and difficult.
If you ever want to understand why you can do some things on macOS but not iOS, this is a pretty good place to start (if we ignore Apple’s brand and commercial reasons for a moment). iOS only ever supported tightly sandboxed apps that had to play nice with a draconian permissions system, whereas macOS is a more standard OS that was originally developed in the age before security was really a concern. So it’s easy to enforce tight sandboxing on iOS, where software has always had to deal with it, but hard on macOS, where for the majority of macOS’s history, sandboxing simply didn’t exist.
> rather than a source-based approach in which malware is never allowed to touch the processor.
That’s a nice idea, but doesn’t really hold up to scrutiny. You only need to look at the ever increasing number of supply chain attacks to realise that simply being open source does little to ensure malware doesn’t make it onto your machine. And that’s before we get into issues like Heartbleed, where the OSS software on your machine may contain bugs or errors that allow remote parties to gain access to privileged data or credentials, via OSS software that isn’t malware; it just has bugs in it.
At the end of the day, it doesn’t really matter what magical security boundaries you develop, someone will find a way around it. Which is why defence in depth is such an important principle. Simply relying on the idea that Open Source means no malware is foolish. You need proper OS defences regardless.
I know all that. The process you describe is the OS verifying the behaviour of software as it runs.
>That’s a nice idea, but doesn’t really hold up to scrutiny
It does actually. The kind of attack you describe only works because software exists in non-source-based systems. If you were about to run a Haskell function, and you were told that I had modified its body in some way, there is no way I could have inserted malware into it, because the function cannot reach out of its own context and mess with the operating system. Software is composed of functions, which naturally run in a fully containerised state and only become dangerous when the OS adds an additional layer of computer-wide state.
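A small sketch of that claim in Haskell (function names invented for illustration): the type of a pure function already rules out touching the OS.

    -- A pure function: its type promises the result depends only on its inputs.
    clamp :: Int -> Int -> Int -> Int
    clamp lo hi x = max lo (min hi x)

    -- Something like the following is rejected by the compiler, because
    -- readFile returns an `IO String`, not a `String`, and IO cannot be
    -- smuggled out of a pure function:
    --
    -- clampEvil :: Int -> Int -> Int -> Int
    -- clampEvil lo hi x = if readFile "/etc/passwd" == "" then lo else x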
> Software is composed of functions, which naturally run in a fully containerised state
That’s simply not true. It’s a convenient fiction that modern programming languages provide, but they only provide it by virtue of not providing APIs to access external state, and even that isn’t true: most programming languages do let you access state beyond a function’s stack, in the simplest form by allowing your function to interact with data stored on the heap.
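For example, a short Haskell sketch of that point (names invented): once the language hands a function a reference to shared heap state, the function is no longer a sealed box, even though nothing on its own stack changed.

    import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

    -- This "function" reads and mutates state that lives outside its own
    -- stack frame and outlives the call.
    bumpCounter :: IORef Int -> IO Int
    bumpCounter ref = do
      modifyIORef' ref (+ 1)
      readIORef ref

    main :: IO ()
    main = do
      counter <- newIORef 0
      _ <- bumpCounter counter
      n <- bumpCounter counter
      print n   -- prints 2: both calls touched the same shared heap cell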
There’s absolutely nothing about software that inherently results in “functions, which naturally run in a fully containerised state”. Functions are just a useful abstraction, in the same way an OS’s APIs are also just a useful abstraction. There’s absolutely nothing that makes function scopes inherently more secure than OS APIs. Arguably function scopes are vastly less secure than OS APIs; there’s a good reason why stack buffer overflows are one of the most common forms of remote code execution exploits.
That is kind of my point. To claim that functions are somehow more secure than containers, or than just standard process isolation (which is basically all containers are), is just silly.
Functions are no more inherently secure than containers. Both abstractions are contained until something exposes the external environment to them, and both of them can be configured to simply not do that.
The claim that functions are more secure is quite reasonable, and it stems from the fact that they are simple. You or I could write a contained function, or even a piece of software that verifies a function is contained by scanning its source code; we could do this in perhaps a few weeks of spare time. Compare and contrast with the effort that goes into other containerisation techniques: Docker would take a lot more than a few weeks of spare time to write. It is orders of magnitude more complex.
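To be clear about what such a checker might amount to, here is a toy Haskell sketch of the idea, operating on a tiny invented expression type rather than real source code; a real tool would have to parse the language and handle far more cases.

    -- A miniature "is this function contained?" check over a toy AST.
    data Expr
      = Lit Int
      | Var String
      | Call String [Expr]

    -- Names we treat as pure, contained operations (an invented whitelist).
    allowed :: [String]
    allowed = ["add", "mul", "negate"]

    -- An expression is contained if every call it makes is on the whitelist.
    isContained :: Expr -> Bool
    isContained (Lit _)     = True
    isContained (Var _)     = True
    isContained (Call f as) = f `elem` allowed && all isContained as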
> Compare and contrast with the effort that goes into other containerisation techniques: Docker would take a lot more than a few weeks of spare time to write. It is orders of magnitude more complex.
That’s because you’re comparing a butter knife to a chainsaw. You can quite easily run an entire process in a manner that’s just as limited as a simple function by using something like seccomp[1], which prevents the process from doing anything except exiting, or reading/writing file handles that have already been opened and passed to it.
No need to spend a weekend or whatever writing something to “verify a function is contained”, or anything like that. Just give your process to the OS and tell the OS to contain it.
Heck, if you’re on a system with systemd, all you need is a handful of lines of config, and you’re done. No need to worry about how good your static analysis is (spoiler alert, it’s never good enough), or creating a new language, or reading anyone’s code.
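For concreteness, a hedged sketch of what that handful of lines might look like in a unit file; the directives are real systemd hardening options (SystemCallFilter is implemented with seccomp under the hood), while the service name and binary path are invented:

    # /etc/systemd/system/sandboxed-worker.service (hypothetical unit)
    [Service]
    ExecStart=/usr/local/bin/sandboxed-worker
    DynamicUser=yes
    NoNewPrivileges=yes
    ProtectSystem=strict
    ProtectHome=yes
    PrivateTmp=yes
    PrivateDevices=yes
    PrivateNetwork=yes
    CapabilityBoundingSet=
    SystemCallFilter=@system-service

systemd applies these restrictions before the service’s code runs, so the program itself needs no cooperation or modification.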
You use containerisation frameworks when you want your software to be able to interact with the wider world, but want to carefully analyse and limit how it does so. It’s total overkill if you just want to execute code that performs computations without interacting with other systems.
> You can quite easily run an entire process in a manner that’s just as limited as a simple function by using something like seccomp
Yet most software will not work when you do this, because most software expects to be able to interact implicitly with the wider system. By using functions as the fundamental unit of containment, you force software to explicitly declare anything it depends on as an argument to the function. For instance, if it wishes to listen on a port, it would need to receive some object representing access to that port as an input, rather than receiving one large implicit object representing every function of the operating system, which then has to be retroactively hacked down to only the required functionality. You get full Docker-like containerisation on all processes, done as simply as if you had used seccomp. It is the best of both worlds.
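A sketch of what that shape looks like in Haskell, using the network package (the port number and function names are invented for illustration): the handler only ever receives the already-bound listening socket as an argument, rather than ambient access to the whole network stack.

    import Network.Socket
    import qualified Network.Socket.ByteString as SB

    -- The handler's only route to the network is the socket it is handed;
    -- its signature declares exactly what it depends on.
    echoOnce :: Socket -> IO ()
    echoOnce listener = do
      (conn, _peer) <- accept listener
      msg <- SB.recv conn 4096
      SB.sendAll conn msg
      close conn

    main :: IO ()
    main = do
      sock <- socket AF_INET Stream defaultProtocol
      bind sock (SockAddrInet 9000 0)   -- 0 here means INADDR_ANY
      listen sock 1
      echoOnce sock

Note that the IO type still permits other effects here; the point being illustrated is the shape of the dependency (an explicit argument rather than ambient access), and a stricter effect system could narrow IO further.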
> spoiler alert, it’s never good enough
This is provably false. Haskell is an instance of static analysis good enough to verify that functions don't have side effects (and additionally that they don't use mutable state). Safe Rust also has more practical methods, though it doesn't fully implement this. Verifying these things is trivial, and requires only three conditions: global variables are immutable, immutable objects cannot be converted to mutable ones, and the mutable state of the operating system is only exposed through explicit mutable objects representing it.
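As a rough Haskell illustration of those three conditions (identifiers invented): top-level bindings are immutable values, mutability must be introduced explicitly, and OS state is only reachable through explicit values such as a Handle inside IO.

    import System.IO (Handle, hGetLine, hPutStrLn)

    -- Condition 1: a "global" is an immutable value, not an assignable slot.
    defaultLimit :: Int
    defaultLimit = 100

    -- Condition 3: operating-system state is only reachable via explicit
    -- values (here, Handles), and only from IO.
    copyLine :: Handle -> Handle -> IO ()
    copyLine input output = hGetLine input >>= hPutStrLn output

    -- Condition 2 shows up as a type error: a pure function like this one
    -- cannot call copyLine or otherwise acquire mutable state, and the
    -- compiler checks that at every call site.
    double :: Int -> Int
    double x = x * 2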