Microkernel Design

A rough draft of a (not really) microkernel

Philosophy and Design

No single strict guideline is adhered to. The guidelines below deviate in places from the usual design principles of a microkernel, namely policy, isolation and minimalism.

Policy: The system should only enforce a minimal set of policies on the userspace.
Isolation: The kernel design should not intrude on the function of programs. Furthermore, clients should not easily influence the behaviour of servers. It should be difficult for any single program to crash the entire system.
Minimalism: The interface methods between the kernel and userspace should not be heavily overloaded, yet only a minimal set of primitives should be exposed. If something can be removed from the kernel, it should be, but only if none of the previous points are violated.

We can consider this design a weak microkernel variant, or even a hybrid design.

To reduce the number of context switches while still complying (mostly) with the above guidelines, the kernel is responsible for these 3 tasks:

Process/thread management and scheduling
Physical and virtual memory management
Interprocess communication

Some consideration may be given in the future regarding:

Userspace thread scheduling
Userspace memory management

Missing core kernel elements that still need to be decided upon:

Userspace pager and page fault management
Exception and interrupt delivery
Process and thread exception and crash delivery
Detailed process and thread control system calls

We currently have 20 system calls defined.

Kernel Objects

Kernel objects are allocated in kernel space. Because only 64-bit targets are supported, there is no practical limit on the available address space.

References are per-process and an object can have multiple references. When a reference is freed, the object's refcount is decreased by 1. When the count reaches 0, the object is freed by the GC.

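// Sample reference implementation
#include <stdint.h>

typedef struct kobj kobj;    // opaque kernel object, defined elsewhere

struct kref {
    kobj*    object;         // kernel object this reference points to
    uint32_t ability;        // rights granted through this reference
    uint32_t proc_owner;     // process that owns this reference
};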

Objects are exposed to the userspace through references. References can be transferred to another process, duplicated and re-instantiated with different rights.

The above table lists the 3 logical operations that can be performed on references.
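
The reference model described in this section can be sketched roughly as follows. This is only an illustration under several assumptions: the `kobj` layout, the `KREF_RIGHT_*` bits, and the helpers `kref_create`, `kref_dup` and `kref_release` are hypothetical names, the GC is reduced to a `freed` flag, and we assume that duplication can only narrow rights, never widen them.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rights bits; the real encoding of `ability` is not specified. */
enum {
    KREF_RIGHT_READ      = 1u << 0,
    KREF_RIGHT_WRITE     = 1u << 1,
    KREF_RIGHT_DUPLICATE = 1u << 2,
};

/* Assumed object header: a refcount plus a flag standing in for the GC. */
typedef struct kobj {
    uint32_t refcount;
    bool     freed;
} kobj;

typedef struct kref {
    kobj*    object;
    uint32_t ability;
    uint32_t proc_owner;
} kref;

/* Create the first reference to an object on behalf of process `proc`. */
kref kref_create(kobj* obj, uint32_t ability, uint32_t proc) {
    obj->refcount = 1;
    obj->freed = false;
    return (kref){ obj, ability, proc };
}

/* Duplicate `src` with the rights in `ability`. Fails if `src` may not be
 * duplicated or if `ability` asks for rights that `src` does not hold. */
bool kref_dup(const kref* src, uint32_t ability, kref* out) {
    if (!(src->ability & KREF_RIGHT_DUPLICATE)) return false;
    if (ability & ~src->ability) return false;
    src->object->refcount++;
    *out = (kref){ src->object, ability, src->proc_owner };
    return true;
}

/* Drop a reference; the object is marked freed when the count reaches 0. */
void kref_release(kref* ref) {
    if (ref->object == NULL) return;
    if (--ref->object->refcount == 0) ref->object->freed = true;
    ref->object = NULL;
}
```

In this sketch the object outlives any single reference: it is only marked freed once the last reference has been released.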

Asynchronous IPC

IPC endpoints are not associated with threads; instead, they are described by channel endpoint objects. Endpoints are transferable, and are either duplicable or readable and writable, but not both.

Endpoint objects belong to at most one thread and only support asynchronous IPC primitives. Channels may transfer both references and data from one endpoint to another.

In-transit messages are buffered in a kernel queue. If the queue fills up, the kernel refuses to queue any more messages and returns an error. In-transit messages are kept alive until both endpoints are destroyed. Unpaired endpoints can only be read from.

The above table lists 3 logical operations that can be performed on the IPC endpoint pairs.
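
The buffering behaviour described above (a bounded kernel queue that rejects new messages when full) might look roughly like the following. The names `chan_queue`, `chan_send`, `chan_recv`, the capacity constants and the error codes are all assumptions made for this sketch, and reference transfer is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHAN_QUEUE_CAP 4    /* assumed queue depth */
#define CHAN_MSG_MAX   32   /* assumed maximum payload size in bytes */

/* Hypothetical error codes. */
enum { CHAN_OK = 0, CHAN_ERR_FULL = -1, CHAN_ERR_EMPTY = -2, CHAN_ERR_TOO_BIG = -3 };

typedef struct chan_msg {
    size_t  len;
    uint8_t data[CHAN_MSG_MAX];
} chan_msg;

/* Fixed-size ring buffer holding in-transit messages. */
typedef struct chan_queue {
    chan_msg slots[CHAN_QUEUE_CAP];
    size_t   head;    /* index of the oldest message */
    size_t   count;   /* number of queued messages */
} chan_queue;

/* Queue a copy of the message; never blocks, returns an error when full. */
int chan_send(chan_queue* q, const void* data, size_t len) {
    if (len > CHAN_MSG_MAX) return CHAN_ERR_TOO_BIG;
    if (q->count == CHAN_QUEUE_CAP) return CHAN_ERR_FULL;
    chan_msg* m = &q->slots[(q->head + q->count) % CHAN_QUEUE_CAP];
    m->len = len;
    memcpy(m->data, data, len);
    q->count++;
    return CHAN_OK;
}

/* Dequeue the oldest message into `out`; never blocks. */
int chan_recv(chan_queue* q, chan_msg* out) {
    if (q->count == 0) return CHAN_ERR_EMPTY;
    *out = q->slots[q->head];
    q->head = (q->head + 1) % CHAN_QUEUE_CAP;
    q->count--;
    return CHAN_OK;
}
```

Because neither call blocks, a sender that receives the full-queue error has to retry later or wait for a notification from the receiver.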

Notifications

Endpoint pairs can be signaled and waited upon through notifications. Unlike L4, a thread can wait on multiple endpoints. This is done by marking endpoints as "waiting" and placing the thread in a WAIT state. When any paired endpoint signals, we check if the endpoint partner is waiting and unblock accordingly.

Deadlocks can easily form through dependency cycles. Dependency-based schedulers must check for cycles; otherwise, the entire system freezes.
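
One common way to perform such a check is a depth-first search over a wait-for graph. The sketch below is a generic version of that technique, not the scheduler's actual code: `wait_graph`, `MAX_THREADS` and `has_wait_cycle` are hypothetical names, and the graph is a small fixed-size adjacency matrix. It is conservative, since it reports any cycle even though a thread waiting on several endpoints may be unblocked by just one of them.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_THREADS 8   /* assumed small fixed bound for the sketch */

/* waits_for[a][b] is true when thread a is blocked waiting on thread b. */
typedef struct wait_graph {
    bool waits_for[MAX_THREADS][MAX_THREADS];
} wait_graph;

enum { UNVISITED = 0, IN_PROGRESS, DONE };

/* Depth-first visit; returns true if a cycle is reachable from thread t. */
static bool visit(const wait_graph* g, size_t t, int state[MAX_THREADS]) {
    state[t] = IN_PROGRESS;
    for (size_t n = 0; n < MAX_THREADS; n++) {
        if (!g->waits_for[t][n]) continue;
        if (state[n] == IN_PROGRESS) return true;   /* back edge: cycle */
        if (state[n] == UNVISITED && visit(g, n, state)) return true;
    }
    state[t] = DONE;
    return false;
}

/* Returns true if any dependency cycle exists in the graph. */
bool has_wait_cycle(const wait_graph* g) {
    int state[MAX_THREADS] = {0};
    for (size_t t = 0; t < MAX_THREADS; t++) {
        if (state[t] == UNVISITED && visit(g, t, state)) return true;
    }
    return false;
}
```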

Internally, notifications are represented by a single bit that can only be asserted. The kernel will de-assert the bit upon successful delivery. If there are no listeners, the notification will be dropped.

The above table outlines the 2 primitives used for notifications.
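
The waiting and signalling behaviour described in this section can be illustrated with the sketch below. All names (`thread`, `endpoint`, `ep_wait`, `ep_signal`, `ep_take`, `MAX_WAIT`) are hypothetical, locking is omitted, and delivery is modelled as the woken thread collecting the bit with `ep_take`.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_WAIT 4   /* assumed limit on endpoints a thread can wait on */

typedef enum { THREAD_RUNNABLE = 0, THREAD_WAIT } thread_state;

struct endpoint;

typedef struct thread {
    thread_state     state;
    struct endpoint* waits[MAX_WAIT];   /* endpoints currently waited on */
    size_t           nwaits;
} thread;

typedef struct endpoint {
    struct endpoint* peer;      /* paired endpoint, NULL if unpaired */
    thread*          waiter;    /* thread waiting on this endpoint */
    bool             waiting;   /* marked "waiting" by ep_wait */
    bool             pending;   /* the single notification bit */
} endpoint;

/* Mark each endpoint as waiting and place the thread in the WAIT state. */
bool ep_wait(thread* t, endpoint** eps, size_t n) {
    if (n == 0 || n > MAX_WAIT) return false;
    for (size_t i = 0; i < n; i++) {
        eps[i]->waiter = t;
        eps[i]->waiting = true;
        t->waits[i] = eps[i];
    }
    t->nwaits = n;
    t->state = THREAD_WAIT;
    return true;
}

/* Signal the partner of `ep`. Returns false if the notification is dropped
 * because nobody is listening on the partner endpoint. */
bool ep_signal(endpoint* ep) {
    endpoint* peer = ep->peer;
    if (peer == NULL || !peer->waiting) return false;
    peer->pending = true;                        /* assert the bit */
    thread* t = peer->waiter;
    for (size_t i = 0; i < t->nwaits; i++) t->waits[i]->waiting = false;
    t->nwaits = 0;
    t->state = THREAD_RUNNABLE;                  /* unblock the waiter */
    return true;
}

/* Deliver the notification: report the bit and de-assert it. */
bool ep_take(endpoint* ep) {
    bool was_set = ep->pending;
    ep->pending = false;
    return was_set;
}
```

Since the bit can only be asserted, several signals that arrive before delivery collapse into a single notification.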

Memory Management

Instead of working with single page mappings, we introduce virtual memory regions (VMRs) into which memory from a pool of physical pages, the VMO, can be mapped.

The above table outlines 5 memory primitives that operate on VMRs and VMOs.

These operations only work on VMOs registered with a custom userspace pager thread. The kernel redirects operations performed on VMOs that have a pager to the endpoint of the respective pager thread.

The above table outlines 1 memory primitive that can be used to create a userspace pager.
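
As a rough illustration of the VMR and VMO model introduced at the start of this section, the sketch below maps a range of pages from a VMO into a VMR and translates a virtual address. Everything here is an assumption made for illustration: the struct layouts, the 4096-byte page size, the fixed-size arrays, and the names `vmr_init`, `vmr_map` and `vmr_translate` are not the actual primitives.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u        /* assumed page size */
#define VMO_MAX_PAGES    16           /* assumed bounds for the sketch */
#define VMR_MAX_PAGES    16
#define NO_FRAME         UINT64_MAX   /* marks an unmapped VMR page */

/* A VMO: a pool of physical page frame numbers. */
typedef struct vmo {
    uint64_t frames[VMO_MAX_PAGES];
    size_t   npages;
} vmo;

/* A VMR: a contiguous virtual range with one mapping slot per page. */
typedef struct vmr {
    uint64_t base;
    size_t   npages;
    uint64_t mapped[VMR_MAX_PAGES];
} vmr;

bool vmr_init(vmr* r, uint64_t base, size_t npages) {
    if (npages > VMR_MAX_PAGES) return false;
    r->base = base;
    r->npages = npages;
    for (size_t i = 0; i < VMR_MAX_PAGES; i++) r->mapped[i] = NO_FRAME;
    return true;
}

/* Map `count` pages of `o`, starting at page `o_page`, at page `r_page` of `r`.
 * Fails on out-of-range requests or if any target page is already mapped. */
bool vmr_map(vmr* r, size_t r_page, const vmo* o, size_t o_page, size_t count) {
    if (r_page > r->npages || count > r->npages - r_page) return false;
    if (o_page > o->npages || count > o->npages - o_page) return false;
    for (size_t i = 0; i < count; i++) {
        if (r->mapped[r_page + i] != NO_FRAME) return false;
    }
    for (size_t i = 0; i < count; i++) {
        r->mapped[r_page + i] = o->frames[o_page + i];
    }
    return true;
}

/* Translate a virtual address inside the VMR to a physical address. */
bool vmr_translate(const vmr* r, uint64_t vaddr, uint64_t* paddr) {
    if (vaddr < r->base) return false;
    uint64_t page = (vaddr - r->base) / SKETCH_PAGE_SIZE;
    if (page >= r->npages || r->mapped[page] == NO_FRAME) return false;
    *paddr = r->mapped[page] * SKETCH_PAGE_SIZE + (vaddr - r->base) % SKETCH_PAGE_SIZE;
    return true;
}
```

Working with regions rather than single pages means one mapping call can cover a whole range, and unmapped pages inside a region are the natural place for a pager to step in.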

Processes and Threads

A thread is the single unit of execution while processes can own multiple threads. Threads of the same process share one address space while different processes have different address spaces.

The control structures for thread and process syscalls are used below.

The above table outlines 5 primitives used for process and thread control.

