Inter-Process Communication (IPC)

Without communication, a thread could only use resources moved onto its own stack. A peripheral could only ever be used by a single thread for the entire application runtime, and no event could trigger an action in another thread. Thus, to create a useful application, inter-process communication is essential.

IPC in computer systems is used to transfer information beyond the virtual address space of a process [19, pp. 3]. While threads on microcontrollers share the same address space and direct access to another thread's stack would be possible, IPC is required where memory protection isolates memory regions. Furthermore, common terminology known from computer systems can be applied to embedded systems.

ipc-overview
Figure: Overview of available IPC mechanisms.

Depending on the size of the information, different forms of communication are used, e.g., large measurement sample arrays need a different form of IPC than simply waiting for an event flag to be set. The figure shows an overview of the different IPC mechanisms in Bern RTOS.

Critical Sections (internal)

The simplest solution for synchronized access to a resource is a critical section. Before a protected resource is accessed, all interrupts are disabled. Without interrupts, no thread switch, and no preemption in general, can occur, which synchronizes access to the resource. As soon as the resource is no longer in use, interrupts can be enabled again.

The advantage of this approach is that it can be applied to any microcontroller. However, disabling interrupts increases the latency of service routines that might not even affect the protected resource. In real-time applications (e.g., motor control), interrupt latency must be kept low. Some CPU architectures allow for interrupt masking instead of disabling all interrupts [24, pp. 265], so that time-critical interrupts can still be executed.
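
For illustration, the following sketch shows such a critical section on an Arm Cortex-M device. It assumes the cortex-m crate and is not part of the Bern RTOS API.

use core::cell::RefCell;
use cortex_m::interrupt::{self, Mutex};

// A resource shared between threads and ISRs, protected by a critical section.
static COUNTER: Mutex<RefCell<u32>> = Mutex::new(RefCell::new(0));

fn increment() {
    // Interrupts are disabled for the duration of the closure and restored
    // afterwards, so neither an ISR nor a context switch can interleave.
    interrupt::free(|cs| {
        *COUNTER.borrow(cs).borrow_mut() += 1;
    });
}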

Bern RTOS discourages the use of critical sections; they are not available to users. Generally, synchronization using atomic operations is preferred. Short critical sections are only used within the scheduler, where the hardware's atomic operations do not suffice.
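
For comparison, the preferred atomic approach requires no critical section at all. A minimal sketch using core atomics, assuming the target supports atomic read-modify-write instructions:

use core::sync::atomic::{AtomicU32, Ordering};

// The same counter, synchronized by a single atomic instruction instead of
// disabling interrupts; interrupt latency is unaffected.
static COUNTER: AtomicU32 = AtomicU32::new(0);

fn increment() {
    COUNTER.fetch_add(1, Ordering::Relaxed);
}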

In the current kernel implementation, all interrupts are disabled during a critical section. This will be changed to interrupt masking when interrupt handling is added to the kernel.

Event System (internal)

All forms of IPC transport data between threads and allow threads to wait for an event. While the data transport differs for every form of IPC, suspending threads until data is ready is common to all of them. Bern RTOS therefore uses a common event system for all IPC.

A thread can request to be suspended until an IPC object is updated. On an update event, the affected threads must be woken up.

event-list
Figure: Event list.

As illustrated in the figure, all events are stored in a single list. An event is created for every form of IPC registered in the kernel. An event consists of the following:

Identifier (ID) Used to access an event through system calls. In contrast to passing a pointer to an event through a system call, looking up an ID is computationally more expensive, but it ensures that a thread cannot crash the kernel by passing an arbitrary memory address.

Settings Contain additional information, e.g., whether to use priority inheritance (to counteract priority inversion) and whether all pending threads should be woken up or just the first one.

Pending A list of threads pending on the event. The list is sorted by priority, as the highest priority thread will be granted access to the IPC first.

The event system is used within the kernel and the user cannot access it directly.
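
To make the structure concrete, the following hypothetical sketch models an event list entry; all names are illustrative and do not reflect the actual kernel types.

/// Hypothetical stand-in for the kernel's internal thread reference.
struct ThreadHandle {
    priority: u8,
}

/// Illustrative settings, mirroring the description above.
struct EventSettings {
    priority_inheritance: bool, // counteract priority inversion
    wake_all: bool,             // wake all pending threads or only the first
}

/// One entry of the kernel's event list.
struct Event {
    id: u32,                    // passed through system calls instead of a raw pointer
    settings: EventSettings,
    pending: Vec<ThreadHandle>, // kept sorted, highest priority first
}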

Mutual Exclusion (Mutex)

Printing to a serial interface on a microcontroller from two threads without synchronization leads to interleaved strings at the output. The same happens if an array of measurement samples is written and read from different threads. Consistent data can only be guaranteed if at most one thread accesses the data at a time. Mutual exclusion (mutex) is a synchronization primitive that restricts access to a resource to a single thread.

Design

Similar to the Mutex type from the Rust standard library, a mutex in Bern RTOS encapsulates a resource, tightly coupling synchronization and resource.

mutex-memory
Figure: Mutex data structure.

A mutex has the data structure shown in the figure. It has an event ID from the event system, a lock, and data with a generic type and size. A mutex must be placed in a shared memory region for multiple threads to have access.

When a thread wants to access a mutex, it first tries to set the lock using an atomic operation. If the lock can be set, the data can be accessed without ever calling the kernel. When the lock is already set, the thread can be suspended until the lock is released. An event is triggered whenever a lock is released.
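
The following sketch illustrates this fast path, assuming an AtomicBool as the lock; the type and field names are illustrative, and the actual kernel implementation differs.

use core::cell::UnsafeCell;
use core::sync::atomic::{AtomicBool, Ordering};

/// Illustrative mutex layout, mirroring the figure: event ID, lock, and data.
struct RawMutex<T> {
    event_id: u32,
    locked: AtomicBool,
    data: UnsafeCell<T>,
}

impl<T> RawMutex<T> {
    /// Fast path: returns true if the lock was taken without calling the kernel.
    fn try_lock(&self) -> bool {
        self.locked
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
    }

    /// Releasing the lock is a single atomic store; a system call then
    /// triggers the event so that pending threads are woken up.
    fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }
}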

Usage

Using a mutex in Bern RTOS is almost equivalent to a mutex from the Rust standard library on a computer.

#[entry]
fn main() -> ! {
    /*..*/
    PROC.init(move |c| {
        let shared_data = Arc::new(Mutex::new(42_u32)); // (1)
        let shared_data_gen = shared_data.clone();
        Thread::new(c)
            .stack(Stack::try_new_in(c, 2048).unwrap())
            .spawn(move || {
                let mut counter = 0;
                loop {
                    match shared_data_gen.lock(u32::MAX) { // (2)
                        Ok(mut value) => { *value = counter; } // (3)
                        Err(_) => {}
                    }
                    counter += 1;
                    sleep(500);
                } // (4)
            });
        /*..*/
    }).unwrap();
    /*..*/
}

Listing: Creating and accessing a mutex.

At (1) in the listing, a mutex encapsulating an integer is created. The mutex only synchronizes access to a resource; it must be placed somewhere all participating threads have access. Using Arc, the mutex and the contained resource are allocated in process memory. Alternatively, a mutex could be allocated statically; the kernel currently does not support this, in order to avoid lazy initialization.

When lock() is called at (2), the thread tries to set the lock and is suspended if the mutex is already locked. The lock either returns the value encapsulated in the mutex (3) or an error if, for example, the lock request times out. The mutex is automatically unlocked when the value goes out of scope (4).

\notebox{ Timeouts are not implemented. Hence, the timeout value is ignored and lock() will block the thread until the resource is free. Alternatively, try_lock() will never block. }

Semaphore

An end switch on a machine might be wired to a hardware interrupt on a microcontroller. Hitting the end switch triggers an ISR, which then must somehow communicate to the thread in charge that the machine should be stopped. A counting semaphore can be used to synchronize an interrupt event with a thread, or to synchronize multiple threads.

Design

A counting semaphore is almost identical to a mutex, but it does not hold any data. Instead of a lock, a semaphore has a given number of permits it can issue. If no permits are available, a thread can be suspended until at least one permit becomes available. The semaphore also uses atomic operations to count permits.
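
A sketch of the permit counting with atomics, again with illustrative names rather than the actual kernel types:

use core::sync::atomic::{AtomicUsize, Ordering};

struct RawSemaphore {
    permits: AtomicUsize,
}

impl RawSemaphore {
    /// Fast path: try to take one permit without blocking. If this fails, the
    /// thread would ask the kernel to suspend it on the semaphore's event.
    fn try_acquire(&self) -> bool {
        self.permits
            .fetch_update(Ordering::Acquire, Ordering::Relaxed, |p| p.checked_sub(1))
            .is_ok()
    }

    /// Adding a permit triggers the event system to wake up pending threads.
    fn add_permit(&self) {
        self.permits.fetch_add(1, Ordering::Release);
    }
}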

Usage

The semaphore API is based on tokio::sync::Semaphore [57] and is similar to a mutex.

#[entry]
fn main() -> ! {
    /*..*/
    PROC.init(move |c| {
        let signal = Arc::new(Semaphore::new(0)); // (1)
        let signal_sender = signal.clone();

        Thread::new(c)
            .stack(Stack::try_new_in(c, 2048).unwrap())
            .spawn(move || {
                loop {
                    match signal.acquire(u32::MAX) { // (2)
                        Ok(permit) => {
                            permit.forget(); // (3)
                            do_something();
                        },
                        Err(_) => {}
                    }
                }
            });

        InterruptHandler::new(c)
            .stack(InterruptStack::Kernel)
            .connect_interrupt(stm32f4xx_hal::interrupt::EXTI15_10 as u16)
            .handler(move |_c| {
                button.clear_interrupt_pending_bit();
                signal_sender.add_permits(1); // (4)
            });
    }).unwrap();
    /*..*/
}

Listing: Creating and accessing a semaphore.

A semaphore is created with zero permits at (1) in the listing. Like a mutex, the semaphore must be placed in process memory. acquire() requests a permit in a blocking manner (2). As the initial semaphore count is zero, the thread will be suspended until a permit is added to the semaphore. Once the thread can continue, it executes a function and, in this case, forgets (3) the permit, meaning that the permit will not be returned to the semaphore and the overall number of available permits is reduced by one.

At (4), an ISR triggered by an external hardware interrupt adds one permit to the semaphore and triggers the event system, waking up the thread.

Channel (Message Queue)

A mutex synchronizes access to a shared resource. For example, an ISR might write data received from a UART interface into a shared buffer, which a thread then polls and processes. Two problems arise from this approach. First, the ISR cannot write to the buffer while a thread holds the lock. Second, the thread has to access the buffer continuously to check for new data. As this use case is event based and we want to run the ISR and the thread asynchronously, a message queue is a better choice.

Design

A message queue in Rust is called a channel. It consists of a data queue and management objects for the sender and receiver. In the current implementation, the data queue can store a fixed number of elements of the same type. The type and size have to be set at compile time. The data queue, and hence every channel, is first in, first out (FIFO).
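
As an illustration, the following sketch models such a fixed-size FIFO with element type and capacity as compile-time parameters, mirroring a ConstQueue<u32, 16>; synchronization is omitted, and the names are illustrative.

/// Sketch of a fixed-size FIFO; type and capacity are fixed at compile time.
struct FixedQueue<T: Copy, const N: usize> {
    buffer: [Option<T>; N],
    read: usize,  // index of the oldest element
    write: usize, // index of the next free slot
    len: usize,
}

impl<T: Copy, const N: usize> FixedQueue<T, N> {
    fn new() -> Self {
        FixedQueue { buffer: [None; N], read: 0, write: 0, len: 0 }
    }

    /// Copy an element into the queue; fails if the queue is full.
    fn push(&mut self, value: T) -> Result<(), T> {
        if self.len == N {
            return Err(value);
        }
        self.buffer[self.write] = Some(value);
        self.write = (self.write + 1) % N;
        self.len += 1;
        Ok(())
    }

    /// Copy the oldest element out of the queue, if any.
    fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        let value = self.buffer[self.read].take();
        self.read = (self.read + 1) % N;
        self.len -= 1;
        value
    }
}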

Generally, a channel copies an element into the data queue. In the case of a large object, an element can instead move the ownership of a heap-allocated object. A few variants of the message queue exist for different use cases. One case is communication between threads in one process, as depicted in the figure.

queue-threads
Figure: Channel between threads within one process.

The data queue is allocated at process initialization and stored on the heap. A thread sends an element by copying it into the queue. The receiving thread copies elements asynchronously out of the queue. Hence, every element is copied twice. Copying is less efficient than passing references but simplifies the queue implementation: the data queue does not need to track how long an element is in use if it is simply copied to the consumer.

There is also a slight variation depending on whether there is a single producer or multiple. A single producer single consumer (SPSC) queue is much more straightforward and thus faster than a multi producer single consumer (MPSC) queue.

Threads have access to the entire process memory. Thus, the producer can check itself whether there is a free slot to copy data into. The kernel only has to get involved when the producer or the consumer wants to wait for a queue event.
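
The following sketch illustrates the producer's check, assuming a power-of-two capacity and monotonically increasing atomic indices; it is illustrative and not the kernel's code. Because each index is advanced by exactly one side, plain atomic loads and stores suffice and no compare-exchange loop is needed.

use core::sync::atomic::{AtomicUsize, Ordering};

/// Indices of an SPSC queue with capacity N (assumed to be a power of two).
struct SpscIndices<const N: usize> {
    write: AtomicUsize, // advanced only by the single producer
    read: AtomicUsize,  // advanced only by the single consumer
}

impl<const N: usize> SpscIndices<N> {
    /// Producer side: find a free slot without involving the kernel.
    fn try_claim_slot(&self) -> Option<usize> {
        let write = self.write.load(Ordering::Relaxed);
        let read = self.read.load(Ordering::Acquire);
        if write.wrapping_sub(read) == N {
            None // full: only now would the producer ask the kernel to wait
        } else {
            Some(write % N)
        }
    }

    /// Publish the element copied into the claimed slot.
    fn publish(&self) {
        let write = self.write.load(Ordering::Relaxed);
        // A single release store makes the element visible to the consumer.
        self.write.store(write.wrapping_add(1), Ordering::Release);
    }
}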

Another variant of the channel is used to communicate across process boundaries. The figure shows a thread sending a message from one process to another.

queue-proc
Figure: Channel between threads of different processes.

In this case, the data queue is allocated statically in the consumer's process memory. The producer thread makes a system call to request an element to be copied into the receiving queue. The receiving thread can then access the data queue directly.

Usage

Message queues follow the terminology used by std::sync::mpsc::Channel from the Rust standard library.

#[entry]
fn main() -> ! {
    /*..*/
    PROC.init(move |c| {
        #[link_section=".process.my_process"]
        static CHANNEL: Channel<ConstQueue<u32, 16>> = sync::spsc::channel(); // (1)
        let (tx, rx) = CHANNEL.split(); // (2)

        Thread::new(c)
            .stack(Stack::try_new_in(c, 2048).unwrap())
            .spawn(move || {
                let mut counter = 0;
                loop {
                    tx.send(counter).unwrap(); // (3)
                    counter += 1;
                    sleep(500);
                }
            });

        Thread::new(c)
            .stack(Stack::try_new_in(c, 2048).unwrap())
            .spawn(move || {
                loop {
                    match rx.recv() { // (4)
                        Ok(v) => process(*v), // (5)
                        Err(_) => {}
                    }
                    sleep(10);
                }
            });
    }).unwrap();
    /*..*/
}

Listing: Creating and accessing a channel within one process.

In this example, a channel is created at (1) in the listing. The channel contains a queue that can take 16 elements of type u32. Like all static data, it must be placed into process memory by specifying the link section. The channel is then split into a sender and a receiver (2).

The sender is moved to one thread, which in this case periodically adds an element to the queue (3). The send() method returns an error if the queue is full.

Another thread is on the receiving end of the channel (4). If there is an element in the queue, the software processes the received element (5).

Sending data between threads of different processes uses a slightly different syntax.

static PROC_A: &Process = bern_kernel::new_process!(process_a, 8192);
static PROC_B: &Process = bern_kernel::new_process!(process_b, 4096);
#[entry]
fn main() -> ! {
    /*..*/
    #[link_section=".process.process_b"]
    static IPC_CHANNEL: ConstQueue<u32, 16> = ConstQueue::new(); // (1)

    let channel = sync::ipc::spsc::channel();
    let (ipc_tx, ipc_rx) = channel.split(&IPC_CHANNEL).unwrap(); // (2)

    PROC_A.init(move |c| {
        Thread::new(c)
            .stack(Stack::try_new_in(c, 2048).unwrap())
            .spawn(move || {
                let mut counter = 0;
                loop {
                    ipc_tx.send(counter).unwrap(); // (3)
                    counter += 1;
                    sleep(500);
                }
            });
    }).unwrap();

    PROC_B.init(move |c| {
        Thread::new(c)
            .stack(Stack::try_new_in(c, 2048).unwrap())
            .spawn(move || {
                loop {
                    match ipc_rx.recv() { // (4)
                        Ok(v) => process(*v),
                        Err(_) => {}
                    }
                    sleep(10);
                }
            });
    }).unwrap();
    /*..*/
}

Listing: Creating and accessing a channel from one process to another.

At (1) in the listing, the queue behind the channel is allocated statically inside the receiving process memory. The queue is linked to the channel when the channel is split (2). In contrast to channels within one process, this is done outside of a process context.

The sender and receiver are then moved to the respective processes and threads. Sending (3) and receiving (4) elements through the channel is equivalent to channels inside one process.

\notebox{ Currently, Bern RTOS does not support awaiting queue events. }