Reliability in MINIX 3
One of the main goals of MINIX 3 is reliability. Below we discuss
some of the more important principles that enhance MINIX 3's reliability.
These principles also enhance security, since most security flaws stem from
attackers exploiting bugs in the code. Some of the ideas discussed are in the
current release, but a few are scheduled for the next release. As this
is a research project, we often make changes as we think of new ways to
improve reliability.
Reduce kernel size
Monolithic operating systems (e.g., Windows, Linux, BSD)
have millions of lines of kernel code.
There is no way so much code can ever be made correct.
In contrast, MINIX 3 has about 4000 lines of executable
kernel code. We believe this code can eventually be made fairly close to
bug free.
Cage the bugs
In monolithic operating systems, device drivers reside in the kernel.
This means that when a new peripheral is installed, unknown, untrusted
code is inserted in the kernel. A single bad line of code in
a driver can bring down the system. This design is fundamentally
flawed. In MINIX 3, each device driver is a separate user-mode process.
Drivers cannot execute privileged instructions, change the page
tables, perform I/O, or write to absolute memory addresses. They have to make kernel calls
for these services, and the kernel checks that each call is authorized.
Limit drivers' memory access
In monolithic operating systems, a driver can write to any word of memory
and thus accidentally trash user programs. In MINIX 3, when a user process expects
data from, for example, the file system, it builds a descriptor specifying
which process may access its buffer and at which addresses. It then passes an index to this
descriptor to the file system, which may pass it on to a driver. The file
system or driver then asks the kernel to write via the descriptor, making
it impossible for them to write to addresses outside the buffer.
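The descriptor mechanism can be sketched roughly as follows. This is an illustration of the idea only, not the actual MINIX 3 code: the names grant, grant_create and safe_copy_to, the table size, and the return codes are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor; field names are illustrative, not MINIX 3's. */
struct grant {
    int    grantee;   /* process allowed to use this descriptor */
    char  *base;      /* start of the user's buffer */
    size_t len;       /* size of the buffer in bytes */
    int    writable;  /* nonzero if the grantee may write */
};

#define NR_GRANTS 8   /* assumed table size */
static struct grant grant_table[NR_GRANTS];

/* User side: register a buffer and get back an index to hand out. */
int grant_create(int grantee, char *base, size_t len, int writable)
{
    assert(base != NULL);
    for (int i = 0; i < NR_GRANTS; i++) {
        if (grant_table[i].base == NULL) {
            grant_table[i] = (struct grant){ grantee, base, len, writable };
            return i;
        }
    }
    return -1;  /* table full */
}

/* Kernel side: copy n bytes to the given offset inside the granted buffer.
 * Returns 0 on success, -1 if the request is rejected. */
int safe_copy_to(int caller, int grant_id, size_t offset,
                 const void *src, size_t n)
{
    if (grant_id < 0 || grant_id >= NR_GRANTS)
        return -1;
    const struct grant *g = &grant_table[grant_id];
    if (g->base == NULL || g->grantee != caller || !g->writable)
        return -1;
    /* Bounds check written so that offset + n cannot wrap around. */
    if (offset > g->len || n > g->len - offset)
        return -1;
    memcpy(g->base + offset, src, n);
    return 0;
}
```

The driver never sees a raw address, only the index, so the bounds check in the kernel is the single place where the buffer limits are enforced.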
Survive bad pointers
Dereferencing a bad pointer within a driver will crash the driver process,
but will have no effect on the system as a whole.
The reincarnation server will restart the crashed driver automatically.
For some drivers (e.g., disk and network) recovery is transparent to user
processes. For others (e.g., audio and printer), the user may notice.
In monolithic
systems, dereferencing a bad pointer in a (kernel) driver normally leads to a
system crash.
Tame infinite loops
If a driver gets into an infinite loop, the scheduler will
gradually lower its priority until it becomes the idle process.
Eventually the reincarnation server will see that it is not responding
to status requests, so it will kill and restart the looping driver.
In a monolithic system, a looping driver hangs the system.
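The priority decay described above might look something like the sketch below. The queue count, the structure, and the function names are assumptions for illustration; the real scheduler is more involved.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SCHED_QUEUES 16                 /* assumed number of priority levels */
#define IDLE_Q (NR_SCHED_QUEUES - 1)       /* lowest priority, where idle runs */

/* Hypothetical per-process scheduling state; larger number = lower priority. */
struct proc {
    int priority;       /* current queue */
    int base_priority;  /* queue the process started in */
};

/* Called when a process uses up its whole quantum without blocking.
 * A looping driver does this every time, so it sinks one level per call. */
void quantum_expired(struct proc *p)
{
    assert(p != NULL);
    if (p->priority < IDLE_Q)
        p->priority++;
}

/* Called when a process blocks voluntarily before its quantum ends.
 * Well-behaved processes drift back toward their base priority. */
void blocked_early(struct proc *p)
{
    assert(p != NULL);
    if (p->priority > p->base_priority)
        p->priority--;
}
```

A driver stuck in a loop never blocks, so it keeps sinking until it competes only with the idle process and can no longer starve anything else.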
Limit damage from buffer overruns
MINIX 3 uses fixed-length messages for internal communication, which
eliminates certain buffer overruns and buffer management problems.
Also, many exploits work by overrunning a buffer so that the return address
saved on the stack is overwritten; when the function returns, the program jumps
into the overrun buffer and executes the attacker's code. In MINIX 3,
this attack does not work because instruction and data space are
split and only code in (read-only) instruction space can be executed.
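To see why fixed-length messages help, consider the following sketch. The payload size, the field names, and the helper functions are assumptions for the example and do not reproduce the real MINIX 3 message layout.

```c
#include <assert.h>
#include <string.h>

#define MSG_PAYLOAD 56   /* assumed payload size in bytes */

/* Hypothetical fixed-length message: every message has exactly this size. */
struct message {
    int source;                          /* sender */
    int type;                            /* request or reply code */
    unsigned char payload[MSG_PAYLOAD];  /* fixed-size body */
};

/* Copying a message never depends on a length supplied by the sender:
 * the size is a compile-time constant. */
void msg_copy(struct message *dst, const struct message *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof *dst);
}

/* Fill the payload, refusing anything that does not fit. */
int msg_set_payload(struct message *m, const void *data, size_t n)
{
    if (n > sizeof m->payload)
        return -1;
    memset(m->payload, 0, sizeof m->payload);
    memcpy(m->payload, data, n);
    return 0;
}
```

Because the receiver always reserves room for one complete message, there is no length field for a buggy or malicious sender to get wrong.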
Restrict access to kernel functions
Device drivers obtain kernel services (such as copying data to users' address spaces)
by making kernel calls. The MINIX 3 kernel has a bit map for each driver
specifying which calls it is authorized to make. In monolithic systems
every driver can call every kernel function, authorized or not.
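A per-driver bit map of this kind could be checked along the following lines. The call names and the structure are invented for the example and are not the actual MINIX 3 definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical kernel call numbers, for illustration only. */
enum kcall { KC_SAFECOPY, KC_DEVIO, KC_IRQCTL, KC_FORK, NR_KCALLS };

/* Hypothetical privilege record kept by the kernel for each driver. */
struct priv {
    uint32_t call_mask;  /* bit n set = kernel call n is allowed */
};

/* Grant one kernel call to a driver (done when the driver is started). */
void allow_call(struct priv *p, enum kcall c)
{
    assert(c < NR_KCALLS);
    p->call_mask |= (uint32_t)1 << c;
}

/* Checked by the kernel on every kernel call; returns 1 if allowed. */
int may_call(const struct priv *p, int call_nr)
{
    if (call_nr < 0 || call_nr >= NR_KCALLS)
        return 0;
    return (int)((p->call_mask >> call_nr) & 1u);
}
```

A disk driver might be given KC_SAFECOPY and KC_DEVIO but not KC_FORK, so even a compromised disk driver cannot create processes.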
Restrict access to I/O ports
The kernel also maintains a table telling
which I/O ports each driver may access.
As a result, a driver can only touch its
own I/O ports. In monolithic systems, a buggy
driver can access I/O ports belonging to another device.
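The port check can be pictured as a lookup in a small table of ranges. The sketch below is an assumption about how such a table might be organized, with hypothetical names; it is not taken from the MINIX 3 source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive range of I/O ports. */
struct io_range {
    uint16_t first;
    uint16_t last;
};

/* Hypothetical per-driver entry in the kernel's port table. */
struct driver_ports {
    const struct io_range *ranges;
    size_t count;
};

/* Consulted by the kernel before doing port I/O on a driver's behalf.
 * Returns 1 if the port falls inside one of the driver's ranges. */
int port_allowed(const struct driver_ports *d, uint16_t port)
{
    assert(d != NULL);
    for (size_t i = 0; i < d->count; i++) {
        if (port >= d->ranges[i].first && port <= d->ranges[i].last)
            return 1;
    }
    return 0;
}
```

For example, a disk driver registered with the ranges 0x1F0-0x1F7 and 0x3F6 (example values) would be refused if it asked the kernel to touch port 0x3F8.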
Restrict communication with OS components
Not every driver and server needs to communicate with every other
driver and server. Accordingly, a per-process bit map determines
which destinations each process may send to.
Reincarnate dead or sick drivers
A special process, called the reincarnation server,
periodically pings each device driver. If the driver dies or fails to respond correctly to pings,
the reincarnation server automatically replaces it by a fresh copy.
The detection
and replacement of nonfunctioning drivers is automatic, without any user action required.
This feature does not work for disk drivers at present, but in the next release the
system will be able to recover even disk drivers, which will be shadowed in RAM.
Driver recovery does not affect running processes.
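The ping-and-replace cycle can be sketched as below. The threshold of three missed pings, the structure, and the function names are assumptions for the example; the real reincarnation server also has to reload the driver binary and inform the processes that depend on it.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MISSED 3   /* assumed number of unanswered pings tolerated */

/* Hypothetical bookkeeping kept per driver. */
struct driver_state {
    int alive;     /* 0 once the driver process has exited */
    int missed;    /* consecutive pings without a reply */
    int restarts;  /* how many times the driver has been replaced */
};

/* Called when a driver answers a ping. */
void on_ping_reply(struct driver_state *d)
{
    assert(d != NULL);
    d->missed = 0;
}

/* Called once per ping period. Returns 1 if the driver was replaced. */
int rs_check(struct driver_state *d)
{
    assert(d != NULL);
    if (d->alive && d->missed < MAX_MISSED) {
        d->missed++;   /* send a new ping; a reply resets the count */
        return 0;
    }
    /* Dead or unresponsive: start a fresh copy with clean state. */
    d->alive = 1;
    d->missed = 0;
    d->restarts++;
    return 1;
}
```

A driver that keeps replying is never touched, while one that has died or stopped answering is replaced on the next check without any user action.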
Integrate interrupts and messages
When an interrupt occurs, it is converted at a low level to a
notification sent to the appropriate driver.
If the driver is waiting for a message, it gets the interrupt immediately;
otherwise it gets the notification the next time it does a RECEIVE to
get a message. This scheme eliminates nested interrupts and makes
driver programming easier.
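One way to picture this scheme is a pending bit per interrupt line that is either delivered at once or held until the next RECEIVE. The sketch below rests on that assumption; the structure and function names are hypothetical and simplified.

```c
#include <assert.h>
#include <stdint.h>

#define NR_IRQS 16   /* assumed number of interrupt lines */

/* Hypothetical driver state as seen by the kernel. */
struct driver {
    int      receiving;  /* nonzero while blocked in RECEIVE */
    uint32_t pending;    /* one bit per notification not yet delivered */
    int      last_irq;   /* most recently delivered IRQ, -1 if none */
};

/* Low-level interrupt path: turn the IRQ into a notification. */
void notify_irq(struct driver *d, int irq)
{
    assert(irq >= 0 && irq < NR_IRQS);
    if (d->receiving) {
        d->last_irq = irq;   /* driver is waiting: deliver immediately */
        d->receiving = 0;
    } else {
        d->pending |= (uint32_t)1 << irq;   /* hold until next RECEIVE */
    }
}

/* RECEIVE: hand over a pending notification if there is one.
 * Returns the IRQ number, or -1 if the driver now blocks. */
int do_receive(struct driver *d)
{
    for (int irq = 0; irq < NR_IRQS; irq++) {
        uint32_t bit = (uint32_t)1 << irq;
        if (d->pending & bit) {
            d->pending &= ~bit;
            d->last_irq = irq;
            return irq;
        }
    }
    d->receiving = 1;
    return -1;
}
```

The driver code only ever runs between RECEIVE calls, so it is never interrupted in the middle of handling an earlier interrupt.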