Sel4 unikernel vs multi-kernel vs SMP on genode/sculpt

Hi,

I have been spending some time to investigate why sculpt-os+sel4 was rather slow on ARM IMX8MP, and aside from finding some stability issues that I am still working on, I also found the default config for seL4 is actually mono-core, (non-SMP).

kinda filling silly for not checking that earlier TBH ^^;

Reviewing a few threads and papers, I can understand the SMP version isn’t on a path to verification, so choosing unikernel over SMP makes some sense from a security standpoint, yet, the multi-kernel configuration could be an option. It would require a bit more tooling I am sure, but I was wondering, has genode team had a look at the multi-kernel configuration of seL4 for integration?

Right now, mono-thread payloads are quite a bit faster on sel4 compared to hw, but obviously, hw beats sel4 on SMP (since SMP is not activated ^^; )

any thougths?

thanks!

The short answer is: no.

I would be careful with too general assumptions about the runtime performance in between two different underlying kernels in Genode.
When looking at different workloads of our nightly tests, I can see diverse results regarding time needed to succeed in between both mentioned kernels. There are plenty reasons not necessarily directly related to the kernel itself that are responsible therefore. For instance, the initial costs being spent by a client to establish a session are generic, but the actual costs differ on the kernels (what needs to be paid for kernel-resource management). Therefore, it happens that the initial session establishment can be very costly on one kernel, where the client has to try this several times (each time delivering more ram/caps) in a loop. If you have a scenario centered on a lot of dynamic regarding session opening/closing, you get quite different result than maybe I/O heavy loads using shared memory and signaling, and so on. In our regular tests I can see for instance a slightly faster success of the “nic_router_stress” test (artificial network test with no real I/O) on sel4, but regarding the “bomb” test (creating and destroying tons of components) it performance collapses in constrast to base-hw. But even here I would be cautious to assume too much from this observation. One would need to first analyze what is actually bottleneck in which situation. Sometimes it might be simply a configuration issue.

True. That being said, my observations are based on a few benchmarks I created to avoid that bias, I am looking at the best way to share those results in a meaningful fashion (but I digress). You mention you have regular tests, could you point me at what are those? I would be interested to learn from them and maybe add that to my benchmarks as well :slight_smile:

Either way, I need to study a bit more the base-sel4 layer (and base-hw layer for that matter) to see if the multi-kernel integration would be possible or even make sense then. That will be a low priority for this project though, so it might come rather late ^^;

Thanks for the feedback

What I want to add: for x86 we enabled SMP support, but not for any seL4 ARM platform, simply there was no need to do so.

got it, I might look into that as well, but since the SMP version is not verfiable according to seL4 community, I might just activate and fix or go for the MultiKernel directly (still trying to understand the limitations).

The list of tests we run nightly can be found here:

The actual results (logs) are not publicly available, but are part of our internal test-suite.
One of the potentially interesting run-scripts is for instance depot_autopilot.run, which itself contains a bunch of independent ~90 tests that are not dependent on additional peripherals.

1 Like