Minutes 21 Sep 2023
Host: Paul Albertella
Participants: Pete Brink, Igor Stoppa, Dana Vede, Daniel Krippner, Elana Copperman, Sebastian Hetze, Gabriele Paoloni
Agenda:
- Documenting "Quantitative and probabilistic methods summary"
- Models describing Linux role in safety-critical systems
- Kernel memory corruption topic for the ELISA Workshop
Actions:
- Sebastian to have a go at drafting the ‘proven in use’ position that we discussed
- Igor to give a more definitive answer next week regarding the proposed ‘kernel memory corruption’ topic discussion at the ELISA workshop in Munich.
Motivation behind safety analysis: if we don’t understand the ways in which a component might fail or otherwise cause an unsafe outcome in a safety-related system, then we can’t define measures to deal with these ‘failure modes’ (e.g. by applying additional verification or validation processes, or by designing the rest of the system to mitigate the potential outcome(s)).
Paul: Working on an abstract control structure model for Linux:
- https://github.com/elisa-tech/wg-osep/pull/19
- Intended to provide a consistent way to model issues that we know about, so that a wider audience (e.g. safety analysts, system developers and Linux contributors) can understand their implications, and to identify new problems that we have not yet considered.
Igor: How can using STPA help us to address a problem like kernel memory corruption?
- Paul: The goal is to be able to clearly articulate what impact problems like this can have for a given safety use case, or for all safety use cases.
Igor outlined the memory integrity problem:
- (on ARM64) the Linux kernel maps practically all of physical memory into kernel space; the kernel space does not provide any isolation between the components belonging to it
- kernel components individually developed to different levels of safety are not isolated, therefore any component is exposed to interference from any other, including those developed to the lowest safety level (or none at all)
- userspace processes rely on memory which is backed by physical pages, mapped in kernel space and exposed to the most unsafe components
Consequences:
- it is not true that userspace processes are shielded from kernel interference
- it is not true that creating a user-mode device driver isolates it from kernel interference
Igor: In more direct words: literally anything anywhere in a process is exposed to interference from the kernel: code, constants, data, stack, heap, mapping.
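To make this interference path concrete, here is a minimal illustrative sketch (not code discussed in the meeting; the helper name and scenario are hypothetical). On arm64, as on x86-64, the kernel keeps a linear (“direct”) map of practically all physical memory, so any kernel component can reach any physical page through a kernel virtual address, including pages that back a safety-critical userspace process:

```c
/*
 * Hypothetical sketch: a stray write inside some low-integrity kernel
 * component. It never touches a user-space virtual address; it reaches
 * the victim page through the kernel's linear map.
 */
#include <linux/mm.h>

static void stray_write(struct page *victim_page, unsigned int offset)
{
	/* page_address() returns the page's kernel (linear-map) address. */
	u8 *kaddr = page_address(victim_page);

	/*
	 * If victim_page happens to back a safety-critical process, its
	 * code, data, stack or heap is silently modified here.
	 */
	kaddr[offset] ^= 0xff;
}
```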
Paul: We could argue that the design of the Linux kernel is such that this risk will always be present, but measures / design features have been added to help manage these risks
- Pete: But is that not just ‘papering over’ fundamental design flaws?
- Elana: Pete, your statement is true for just about any complex modern software system, Linux-based or not. An external safety qualified HW watchdog is therefore a very common element in safety architecture.
- Gab: What about x86’s SMEP/SMAP and ARM’s PXN/PAN? If enabled, don’t they prevent kernel code from messing up user-mode code and data?
- Igor: Those operate on other mappings, AFAIK
- Elana: Those are HW features which need to be configured (and not simply enabled) and proven that they satisfy specific safety goals (e.g., protection of user mode code/data).
- Igor: They are security functions, not safety, and they are applied to the process mappings, not the kernel mappings
- Elana: Security functions may be used to satisfy safety goals. But not obvious and not immediate.
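As a rough illustration of the distinction being drawn here (a sketch under assumptions, not code from the discussion): SMAP/PAN trap kernel accesses made through user-space virtual addresses, and the uaccess helpers lift that protection only for the sanctioned copy paths, but an access to the same physical page through the kernel’s own mapping is outside their scope entirely:

```c
/* Illustrative only: function and parameter names are hypothetical. */
#include <linux/uaccess.h>
#include <linux/mm.h>

static int demo(void __user *uptr, struct page *same_page, u8 val)
{
	u8 *alias;

	/*
	 * A direct dereference of the user pointer would fault with
	 * SMAP/PAN enabled, e.g.:  *(u8 __force *)uptr = val;
	 */

	/* The sanctioned path: uaccess helpers briefly lift the protection. */
	if (copy_to_user(uptr, &val, sizeof(val)))
		return -EFAULT;

	/* The same physical page via the kernel linear map: not covered. */
	alias = page_address(same_page);
	alias[0] = val;
	return 0;
}
```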
Igor: Even if we could protect the ‘working memory’ that a userspace safety process uses from accidental corruption by the kernel, the metadata that relates to this memory (and to the runtime properties of the userspace process) is necessarily still vulnerable to interference by the kernel.
- Paul: There is a difference between a failure that e.g. causes the kernel or a userspace process to crash and one that ‘invisibly’ corrupts the data that a safety-critical process uses to make decisions.
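This distinction is why silent corruption is often handled by detection rather than prevention. As a purely illustrative sketch (the structure, field names and CRC helper below are assumptions, not something proposed in the meeting), a safety-critical userspace process can keep a checksum alongside its decision data and verify it before each use, turning ‘invisible’ corruption into a detectable fault; it does not, of course, prevent the interference:

```c
/* Userspace sketch: detect (not prevent) corruption of decision data. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct safety_data {
	double speed_limit;      /* hypothetical decision inputs */
	double brake_threshold;
	uint32_t crc;            /* checksum over the fields above */
};

/* Minimal bit-by-bit CRC-32 (reflected, polynomial 0xEDB88320). */
static uint32_t crc32_calc(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t crc = 0xffffffffu;

	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1u));
	}
	return ~crc;
}

/* Recompute the checksum after every legitimate update... */
static void safety_data_seal(struct safety_data *d)
{
	d->crc = crc32_calc(d, offsetof(struct safety_data, crc));
}

/* ...and verify it immediately before the data is used for a decision. */
static bool safety_data_valid(const struct safety_data *d)
{
	return d->crc == crc32_calc(d, offsetof(struct safety_data, crc));
}
```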
Igor: My proposal is an address space isolation approach that can operate within the kernel: only allow certain parts of the kernel to write to a protected region, and potentially detect when an unauthorised process reads it.
- Paul: And we could perhaps use this to isolate kernel metadata such that only the subsystem responsible for managing it can write to it.
- Igor: Perhaps, but let’s focus on describing the problems first, and not try to jump straight to solutions. Features such as SELinux and cgroups are often mentioned as solutions, but they may be contributing to the problem!
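For later discussion, a very rough conceptual sketch of the kind of mechanism being hinted at above (an assumption on our part, not Igor’s actual proposal): one way to approximate “only the owning subsystem may write this kernel metadata” is to keep the backing pages read-only in the kernel page tables and lift the protection only inside the owner’s update helper. A real mechanism would also have to deal with SMP, the linear-map alias of the same pages, and performance:

```c
/* Conceptual sketch only; names are hypothetical, error handling omitted. */
#include <linux/mm.h>
#include <linux/set_memory.h>

struct protected_meta {
	unsigned long state;     /* hypothetical subsystem metadata */
};

/*
 * Page-aligned so the protection applies cleanly; a real implementation
 * would place such data in a dedicated section rather than share a page.
 */
static struct protected_meta meta __aligned(PAGE_SIZE);

/* Owner-only update path: make the page writable, write, re-protect. */
static void meta_update(unsigned long new_state)
{
	unsigned long addr = (unsigned long)&meta;

	set_memory_rw(addr, 1);  /* one page */
	meta.state = new_state;
	set_memory_ro(addr, 1);
}
```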