I’ve been testing Genode on two systems with different processors (a 4th-gen Core i7 and a Celeron). While I expect the Celeron to be slower, the performance gap with Genode is far larger than anticipated, especially compared to Linux on the same Celeron hardware.
I suspect that the microkernel architecture, with its IPC and task-switching overhead, is causing these bottlenecks. I’ve already experimented with the priority attribute of components, CPU core affinity, and explicit processing unit allocation as per the documentation, but observed no improvement. I’m seeking further guidance on tools or methods to identify and reduce these bottlenecks.
It seems you raise the same concerns already mentioned in "Performance Issues with Sculpt OS and Falkon on Celeron". Without knowing your exact scenario, it is hard to tell where your bottlenecks actually lie, but I would not point at the underlying system architecture prematurely.
One big difference that will hit you is that pixel pushing is currently done only on the CPU in Genode. A faster CPU therefore helps directly, while a slow CPU degrades the interactive experience: the time spent compositing and blitting pixels is no longer available for other tasks. In the same vein, a high-resolution display makes the effect more apparent¹.
One experiment you could try is to configure your Linux appliance so that the GPU is not used and graphics output is not accelerated, which levels the playing field when drawing comparisons.
Now, if you want to investigate where the time is spent on Genode, I suggest taking a look at the top component, which periodically prints the execution time of each component to the LOG.
¹) Every now and then I test Sculpt on a Shuttle DS57U (Celeron 3205U) that has 2 cores clocked at 1.5 GHz. Driving a smaller 1080p display works well enough, although one core is heavily utilized when nit_fader or a full-screen component like a game is active. Using my normal display with 2560x2880 pushes the system over the edge: nit_fader exhibits noticeable artifacts, and since the system has only 2 cores, spreading the load across more cores is unfortunately not possible.
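
To put rough numbers on the resolution point, here is a back-of-the-envelope sketch in Python. It assumes 32-bit pixels, a 60 Hz refresh rate, and a single full-screen pass per frame; none of these values are measured on Genode, and real compositing may touch each pixel more than once, so treat the result as a lower bound on the CPU's copy workload rather than a benchmark.

```python
# Back-of-the-envelope estimate only. The pixel format, refresh rate, and
# the single-pass default are illustrative assumptions, not Genode figures.
BYTES_PER_PIXEL = 4   # assumed 32-bit pixel format
REFRESH_HZ = 60       # assumed display refresh rate

def bytes_per_second(width, height, passes=1):
    """Bytes the CPU must write per second to redraw the whole screen."""
    return width * height * BYTES_PER_PIXEL * REFRESH_HZ * passes

full_hd = bytes_per_second(1920, 1080)   # the 1080p display
large = bytes_per_second(2560, 2880)     # the 2560x2880 display

print(f"1920x1080: {full_hd / 1e6:.0f} MB/s")
print(f"2560x2880: {large / 1e6:.0f} MB/s")
print(f"ratio: {large / full_hd:.2f}x")
```

Under these assumptions the larger display needs roughly 3.6 times the memory traffic of the 1080p one for every full-screen pass, which fits with a 1.5 GHz dual-core coping with one but not the other.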