FreeRTOS Tickless Low Power Mode Using ARM Semihosting Syscalls in QEMU
Posted Fri, 17 Mar 2023 10:45:21 +0100 | Operating Systems | Virtual Machines | MIST
I recently implemented an extension to the set of ARM semihosting system calls that allows any ARM semihosting-compatible guest running FreeRTOS with tickless low power mode enabled to command the QEMU host to yield or halt for a given maximum number of ticks, reducing power consumption and CPU usage. This is accomplished without noticeably affecting performance since QEMU wakes up as soon as it receives I/O over the character frontend (or stdin). Some more background information is provided in the original issue on GitLab.
Syscall specification
ARM semihosting calls allow a guest running on a host or under a debugger to leverage the resources of that host or debugger, including reading files, obtaining the current time, and even executing arbitrary shell commands. Unfortunately, there seemed to be no syscall for requesting that the host halt or enter a low power mode. As such, the first step I took was to define that semihosting syscall.
SYS_HALT (0xFE)
Halts the host for at most the given number of ticks. Ultimately, the host decides how long
to halt for. Use SYS_TICKFREQ to determine the tick frequency.
Entry
On entry, R1 points to a two-word data block that contains the maximum number of ticks to halt for:
word 1
is the least significant word and is at the low address.
word 2
is the most significant word and is at the high address.
Return
On exit:
On success, R1 points to a doubleword that contains the number of ticks that the host halted for, and R0 contains 0.
On failure, R0 contains -1.
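To make the layout concrete, a guest would set up the call roughly like this (an illustrative fragment only; the actual wrapper I ended up using is shown later in the hal-clone section):
/* Illustrative only: R0 selects the syscall, R1 points to the two-word block. */
uint32_t block[2];
block[0] = (uint32_t) (max_ticks & 0xFFFFFFFF); /* least significant word, low address  */
block[1] = (uint32_t) (max_ticks >> 32);        /* most significant word, high address  */
/* Set R0 = 0xFE (SYS_HALT), R1 = (uint32_t) block, then trigger the semihosting trap. */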
To enable semihosting in QEMU, the flags -semihosting -semihosting-config target=native must be supplied. The latter flag is needed because QEMU otherwise uses GDB syscalls (which I guess means that a GDB session must be active), and you end up with a segmentation fault.
Modding QEMU
Since I don't know much about how either QEMU or FreeRTOS works, I had to do some manual deep-diving into the code using grep -R, which is my favorite way of learning how some software works. Thanks to this, I discovered the ARM-compatible semihosting syscall handler in semihosting/arm-compat-semi.c, where I was able to find out that QEMU replies to the SYS_TICKFREQ syscall with a tick frequency of 1 GHz, i.e. a tick period of 1 nanosecond.
case TARGET_SYS_TICKFREQ:
    /* qemu always uses nsec */
    common_semi_set_ret(cs, 1000000000);
    break;
Likewise, I found in FreeRTOSConfig.h that the FreeRTOS distro for the iOBC uses a tick rate of 1 kHz, i.e. a tick period of 1 millisecond, which means QEMU has more than enough resolution to halt for however many ticks FreeRTOS requests.
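For concreteness, that means every FreeRTOS tick corresponds to one million QEMU ticks; a quick sketch of the conversion (the port later derives this at runtime rather than hard-coding it):
/* 1 GHz QEMU tick rate / 1 kHz FreeRTOS tick rate = 1 000 000 QEMU ticks per FreeRTOS tick */
#define QEMU_TICK_HZ              1000000000UL
#define RTOS_TICK_HZ              1000UL
#define QEMU_TICKS_PER_RTOS_TICK  (QEMU_TICK_HZ / RTOS_TICK_HZ)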
Implementing SYS_HALT…
Then, I added support for this syscall to the QEMU handler, making use of pthread_cond_timedwait wrapped in two calls to clock_gettime, where the abstime argument is calculated from the number of ticks supplied in R1 and the output of the first clock_gettime call. To allow QEMU to wake up earlier than abstime, a mutex arm_semihosting_sys_halt_mutex and a condition variable arm_semihosting_sys_halt_cond were added, where arm_semihosting_sys_halt_cond can be signalled externally when QEMU needs to wake up, for example to handle I/O.
/* Lock mutex */
ret = pthread_mutex_lock(&arm_semihosting_sys_halt_mutex);
if (ret != 0) {
    goto do_fault;
}

/* Get the current time in nanoseconds, and add the requested number of maximum
   ticks to sleep for to determine the latest time at which we need to wake up. */
clock_gettime(CLOCK_REALTIME, &ts);
if (max_ticks > 0) {
    ts_after.tv_sec = ts.tv_sec + (ts.tv_nsec + max_ticks) / 1000000000;
    ts_after.tv_nsec = (ts.tv_nsec + max_ticks) % 1000000000;
    /* Go into low power mode and halt */
    ret = pthread_cond_timedwait(&arm_semihosting_sys_halt_cond,
                                 &arm_semihosting_sys_halt_mutex,
                                 &ts_after);
} else {
    /* Insignificant number of max ticks, skip halting */
    ret = 0;
}

/* Get the actual time we spent halting, and unlock mutex */
clock_gettime(CLOCK_REALTIME, &ts_after);
(void) pthread_mutex_unlock(&arm_semihosting_sys_halt_mutex);
if (ret != 0 && ret != ETIMEDOUT) {
    goto do_fault;
}

elapsed = (ts_after.tv_sec * 1000000000 + ts_after.tv_nsec) -
          (ts.tv_sec * 1000000000 + ts.tv_nsec);
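After this, the elapsed tick count still has to be written back to the doubleword that the guest's R1 points to, and R0 set to 0, per the spec above. A rough sketch of that step (the exact guest-memory helper used in the patch may differ; args is assumed to hold the guest address taken from R1):
/* Sketch only: report the elapsed ticks back to the guest and signal success. */
uint64_t result = (uint64_t) elapsed;
cpu_memory_rw_debug(cs, args, &result, sizeof(result), true);
common_semi_set_ret(cs, 0);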
…without neglecting I/O
Now we need to actually signal the condition when FreeRTOS has I/O to handle. The iOBC uses a DBGU peripheral for stdout, stdin, and stderr, and a grep -R later I found that at91-dbgu.c contained the implementation of the DBGU. It declares two callback functions, dbgu_uart_can_receive and dbgu_uart_receive, where the former returns the number of characters from stdin that the DBGU peripheral can receive, and the latter handles incoming data over stdin. After some quick testing it turned out that QEMU frequently calls dbgu_uart_can_receive even when there is no input, making it an unsuitable place to signal the condition variable. As such, the call to pthread_cond_signal was placed in dbgu_uart_receive, at the point where the received character is successfully written to the DBGU RHR register.
extern pthread_cond_t arm_semihosting_sys_halt_cond;
extern pthread_mutex_t arm_semihosting_sys_halt_mutex;

static void semihosting_sys_halt_cond_signal(void)
{
    int err = pthread_cond_signal(&arm_semihosting_sys_halt_cond);
    if (err != 0) {
        error_report("at91.dbgu: failed to signal halt semihosting condition: %d", err);
    }
    (void) pthread_mutex_unlock(&arm_semihosting_sys_halt_mutex);
}

static void dbgu_uart_receive(void *opaque, const uint8_t *buf, int size)
{
    /* ... */
    // SPEC: When a complete character is received, it is transferred to the DBGU_RHR
    // and the RXRDY status bit in DBGU_SR (Status Register) is set.
    s->reg_rhr = buf[0];
    s->reg_sr |= SR_RXRDY;
    /* ... */
    // Notify the CPU thread that it needs to wake up and handle I/O
    semihosting_sys_halt_cond_signal();
}
Test program
To test this without involving FreeRTOS yet, I implemented quick support for SYS_HALT in the ARM semihosting implementation used in MIST's hal-clone stubs of the iOBC hardware abstraction layer, exposing it through the semihosting_halt function.
#ifdef USE_NONSTANDARD_HALT_SEMIHOSTING
#define SEMIHOSTING_SYS_HALT 0xFE
#endif

static int _semihosting_call(int r0, int r1)
{
    register int reg0 asm("r0");
    register int reg1 asm("r1");
    reg0 = r0;
    reg1 = r1;
    asm("svc 0x00123456");
    return reg0;
}

#ifdef USE_NONSTANDARD_HALT_SEMIHOSTING
int64_t semihosting_halt(int64_t max_ticks)
{
    int err;
    int ret[2];
    ret[0] = (int) (max_ticks & 0xFFFFFFFF);
    ret[1] = (int) (max_ticks >> 32);
    err = _semihosting_call(SEMIHOSTING_SYS_HALT, (int) ret);
    if (err != 0)
        return -1;
    return (((int64_t) ret[1]) << 32) | ret[0];
}
#endif
Then I wrote a quick test program that calls semihosting_halt with a max_ticks argument of 2 billion (2 seconds) and prints the number of ticks it halted for, and ran it in QEMU just to test the behavior.
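The test program itself was roughly along these lines (a minimal sketch, not the exact code; it assumes the hal-clone semihosting_halt shown above and a working printf over the DBGU):
#include <stdio.h>
#include <stdint.h>

/* From the hal-clone semihosting stubs shown above. */
extern int64_t semihosting_halt(int64_t max_ticks);

int main(void)
{
    /* Ask the host to halt for at most 2 billion QEMU ticks (2 seconds). */
    int64_t halted = semihosting_halt(2000000000LL);
    printf("halted for %lld ticks\n", (long long) halted);
    return 0;
}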
It ended up not working very well (as code tends not to do the first time you run it). Although the program delayed for 2 seconds as expected, it was not responsive to input on stdin at all. Also, it did not print exactly 2000000000, but something slightly larger like 2000057987.
Troubleshooting with GDB
The cause of the latter problem was quite obvious: there are internal overheads in QEMU when the SYS_HALT syscall is executed, which seemingly end up taking a couple of tens of microseconds. To fix this, I subtracted an offset of 1,000,000 ticks (1 millisecond) from the number of ticks that QEMU receives from the ARM guest.
/* Subtract 1 ms from the halting ticks to account for overheads */
max_ticks -= 1000000;
if (max_ticks > 0) {
    ts_after.tv_sec = ts.tv_sec + (ts.tv_nsec + max_ticks) / 1000000000;
    ts_after.tv_nsec = (ts.tv_nsec + max_ticks) % 1000000000;
    /* Go into low power mode and halt */
    ret = pthread_cond_timedwait(&arm_semihosting_sys_halt_cond,
                                 &arm_semihosting_sys_halt_mutex,
                                 &ts_after);
} else {
    /* Insignificant number of max ticks, skip halting */
    ret = 0;
}
This is definitely not a perfect solution, but I figured it would solve the problem for now. As for the first problem, I assumed that there must be some other thread in QEMU that calls the dbgu_uart_can_receive and dbgu_uart_receive callbacks, so I decided to start up GDB and send an interrupt signal using CTRL+C right as QEMU suspended, to see what every QEMU thread was doing. I did this simply by prefixing my QEMU command with gdb --args. Then I checked the backtrace of all QEMU threads using thread apply all bt.
Thread 1 "qemu-system-arm" received signal SIGINT, Interrupt.
0x00007ffff607f9a0 in ?? () from /usr/lib/libc.so.6
(gdb) thread apply all bt
Thread 3 (Thread 0x7ffff0e6e6c0 (LWP 40073) "qemu-system-arm"):
#0 0x00007ffff607f766 in () at /usr/lib/libc.so.6
#1 0x00007ffff6082294 in pthread_cond_timedwait () at /usr/lib/libc.so.6
#2 0x0000555555d768cc in do_common_semihosting (cs=0x555556cd6440) at ../semihosting/arm-compat-semi.c:814
#3 0x0000555555c31745 in tcg_handle_semihosting (cs=0x555556cd6440) at ../target/arm/helper.c:10991
#4 arm_cpu_do_interrupt (cs=0x555556cd6440) at ../target/arm/helper.c:11038
#5 0x0000555555dc221e in cpu_handle_exception (ret=<synthetic pointer>, cpu=0x555556cd6440) at ../accel/tcg/cpu-exec.c:727
#6 cpu_exec_loop (cpu=cpu@entry=0x555556cd6440, sc=sc@entry=0x7ffff0e6d580) at ../accel/tcg/cpu-exec.c:944
#7 0x0000555555dc2a1d in cpu_exec_setjmp (cpu=cpu@entry=0x555556cd6440, sc=sc@entry=0x7ffff0e6d580) at ../accel/tcg/cpu-exec.c:1037
#8 0x0000555555dc2fe5 in cpu_exec (cpu=cpu@entry=0x555556cd6440) at ../accel/tcg/cpu-exec.c:1063
#9 0x0000555555ddcb3f in tcg_cpus_exec (cpu=cpu@entry=0x555556cd6440) at ../accel/tcg/tcg-accel-ops.c:81
#10 0x0000555555ddcc8f in mttcg_cpu_thread_fn (arg=arg@entry=0x555556cd6440) at ../accel/tcg/tcg-accel-ops-mttcg.c:95
#11 0x0000555555f59fe8 in qemu_thread_start (args=0x555556e3a530) at ../util/qemu-thread-posix.c:512
#12 0x00007ffff6082bb5 in () at /usr/lib/libc.so.6
#13 0x00007ffff6104d90 in () at /usr/lib/libc.so.6
Thread 2 (Thread 0x7ffff17f16c0 (LWP 40072) "qemu-system-arm"):
#0 0x00007ffff60fd0dd in syscall () at /usr/lib/libc.so.6
#1 0x0000555555f5b17a in qemu_futex_wait (val=<optimized out>, f=<optimized out>) at /home/william/MIST/qemu/include/qemu/futex.h:29
#2 qemu_event_wait (ev=ev@entry=0x5555568c6fa8 <rcu_call_ready_event>) at ../util/qemu-thread-posix.c:435
#3 0x0000555555f63a92 in call_rcu_thread (opaque=opaque@entry=0x0) at ../util/rcu.c:261
#4 0x0000555555f59fe8 in qemu_thread_start (args=0x5555568eb050) at ../util/qemu-thread-posix.c:512
#5 0x00007ffff6082bb5 in () at /usr/lib/libc.so.6
#6 0x00007ffff6104d90 in () at /usr/lib/libc.so.6
Thread 1 (Thread 0x7ffff17f7f40 (LWP 40069) "qemu-system-arm"):
#0 0x00007ffff607f9a0 in () at /usr/lib/libc.so.6
#1 0x00007ffff6085ea2 in pthread_mutex_lock () at /usr/lib/libc.so.6
#2 0x0000555555f5a3c3 in qemu_mutex_lock_impl (mutex=0x5555568a5f80 <qemu_global_mutex>, file=0x555556196267 "../util/main-loop.c", line=315) at ../util/qemu-thread-posix.c:94
#3 0x0000555555afe836 in qemu_mutex_lock_iothread_impl (file=file@entry=0x555556196267 "../util/main-loop.c", line=line@entry=315) at ../softmmu/cpus.c:504
#4 0x0000555555f6ca66 in os_host_main_loop_wait (timeout=913020) at ../util/main-loop.c:315
#5 main_loop_wait (nonblocking=nonblocking@entry=0) at ../util/main-loop.c:603
#6 0x0000555555b04c47 in qemu_main_loop () at ../softmmu/runstate.c:731
#7 0x000055555588cef6 in qemu_default_main () at ../softmmu/main.c:37
#8 0x00007ffff6020790 in () at /usr/lib/libc.so.6
#9 0x00007ffff602084a in __libc_start_main () at /usr/lib/libc.so.6
#10 0x000055555588ce15 in _start ()
So there are just three threads. Thread 3 seems to be the thread that emulates the ARM CPU, and it has halted at the call to pthread_cond_timedwait as expected. Thread 2 presumably has something to do with read-copy update (RCU), which does not seem relevant to the problem. But Thread 1, which says something about iothread, seems very interesting. It has called pthread_mutex_lock and is currently waiting for something. I decided to take a look at os_host_main_loop_wait in main-loop.c, which looks like this.
static int os_host_main_loop_wait(int64_t timeout)
{
    GMainContext *context = g_main_context_default();
    int ret;

    g_main_context_acquire(context);

    glib_pollfds_fill(&timeout);

    qemu_mutex_unlock_iothread();
    replay_mutex_unlock();

    ret = qemu_poll_ns((GPollFD *)gpollfds->data, gpollfds->len, timeout);

    replay_mutex_lock();
    qemu_mutex_lock_iothread(); /* blocked here */

    glib_pollfds_poll();

    g_main_context_release(context);

    return ret;
}
The I/O thread is blocked at line 315, which is the call to qemu_mutex_lock_iothread. Then why don't we just call qemu_mutex_unlock_iothread in the CPU thread before halting? It seemed like a naive idea, but since it didn't require much effort I decided to try it. I changed the QEMU semihosting handler to unblock the I/O thread before halting with pthread_cond_timedwait, and to block it again after waking up.
/* Unlock the I/O thread such that it can wake us up if there is I/O */
qemu_mutex_unlock_iothread();

clock_gettime(CLOCK_REALTIME, &ts);

/* Subtract 1 ms from the halting ticks to account for overheads */
max_ticks -= 1000000;
if (max_ticks > 0) {
    ts_after.tv_sec = ts.tv_sec + (ts.tv_nsec + max_ticks) / 1000000000;
    ts_after.tv_nsec = (ts.tv_nsec + max_ticks) % 1000000000;
    /* Wait for the I/O thread to wake us up */
    ret = pthread_cond_timedwait(&arm_semihosting_sys_halt_cond,
                                 &arm_semihosting_sys_halt_mutex,
                                 &ts_after);
} else {
    /* Insignificant number of max ticks, skip halting */
    ret = 0;
}

/* Get the actual time we spent halting, and lock the I/O thread again */
clock_gettime(CLOCK_REALTIME, &ts_after);
qemu_mutex_lock_iothread();
As luck would have it, this was enough to get the test program to behave the way I wanted. It halted for 2 seconds and printed a number of ticks slightly smaller than 2 billion, unless I gave it input using the keyboard, in which case it terminated earlier and printed a number roughly equal to the number of ticks that had passed since the program started. Great!
Modding FreeRTOS
Now that the SYS_HALT syscall seemed to be correctly implemented on QEMU's side, it was time to implement support for it in the ancient version of FreeRTOS provided by ISISpace. In FreeRTOS, there is an idle task, prvIdleTask, that runs only when all tasks with a higher priority are suspended or blocked. In essence, the way tickless low power mode works is that when the idle task runs, it calculates the shortest time until a task will be ready, and calls a port-specific function that is responsible for entering low power mode for that amount of time. Because the low power mode suspends the interrupt service routine that increments the FreeRTOS tick counter, this function needs to adjust the tick counter based on the time it spent in low power mode, using an external source of time that is not suspended, by calling the vTaskStepTick function. See the official article for more details.
Implementing the AT91SAM9G20 QEMU Port
I first updated the configuration in FreeRTOSConfig.h to enable tickless low power mode.
/* Enable low power tickless idle mode with an arbitrary
minimum idle time of 5 ticks */
#define configUSE_TICKLESS_IDLE 1
#define configEXPECTED_IDLE_TIME_BEFORE_SLEEP 5
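The kernel invokes portSUPPRESS_TICKS_AND_SLEEP() from the idle task when this option is enabled, so the port also needs to map that macro to the application-defined sleep function, along these lines (a sketch; exactly where the define lives in the port is an implementation detail):
/* Map the kernel's tickless idle hook to our sleep function. */
#define portSUPPRESS_TICKS_AND_SLEEP( xExpectedIdleTime ) vApplicationSleep( xExpectedIdleTime )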
To ensure correctness, I invoke the SYS_TICKFREQ semihosting syscall when the FreeRTOS scheduler starts in port.c to determine how to convert between FreeRTOS and host ticks.
#include <hal-clone/semihosting.h>

/* The ratio of QEMU tick frequency to FreeRTOS tick frequency. */
static portTickType tickQemuToRtos;

portBASE_TYPE xPortStartScheduler( void )
{
    /* Obtain how many QEMU ticks there are for every FreeRTOS tick.
       QEMU must have a higher tick frequency, or it will halt for too long. */
    tickQemuToRtos = (portTickType) (semihosting_tickfreq() / configTICK_RATE_HZ);
    configASSERT( tickQemuToRtos );

    /* ... */
}
Then I implemented the vApplicationSleep callback function, as suggested in the official article.
/* Define the function that is called by portSUPPRESS_TICKS_AND_SLEEP(). */
void vApplicationSleep( portTickType xExpectedIdleTime )
{
    int64_t ulLowPowerTime;
    portTickType ulLowPowerTimeTicks;
    eSleepModeStatus eSleepStatus;

    /* TODO Stop the timer that is generating the tick interrupt. */

    /* Enter a critical section that will not affect interrupts bringing the MCU
       out of sleep mode. */
    vPortEnterCritical();

    /* Ensure it is still ok to enter the sleep mode. */
    eSleepStatus = eTaskConfirmSleepModeStatus();

    if( eSleepStatus == eAbortSleep )
    {
        /* A task has been moved out of the Blocked state since this macro was
           executed, or a context switch is being held pending. Do not enter a
           sleep state. Restart the tick and exit the critical section. */
    }
    else if( eSleepStatus == eNoTasksWaitingTimeout )
    {
        /* It is not necessary to configure an interrupt to bring the
           microcontroller out of its low power state at a fixed time in the
           future. */
        for (;;) {
            (void) semihosting_halt(0x7FFFFFFFFFFFFFFFLL);
        }
    }
    else
    {
        /* Enter the low power state for the expected idle time. QEMU will
           wake us up at around that time, or earlier if there is I/O, and
           inform us of how long we were asleep. */
        ulLowPowerTime = semihosting_halt(((int64_t) xExpectedIdleTime) * tickQemuToRtos);
        configASSERT( ulLowPowerTime >= 0 );

        /* FIXME Truncate the number of ticks slept if they were larger than expected;
           this should be handled by QEMU and not here. */
        ulLowPowerTimeTicks = (portTickType) (ulLowPowerTime / tickQemuToRtos);
        if (ulLowPowerTimeTicks > xExpectedIdleTime) {
            ulLowPowerTimeTicks = xExpectedIdleTime;
        }

        /* Note that the scheduler is suspended before
           portSUPPRESS_TICKS_AND_SLEEP() is called, and resumed when
           portSUPPRESS_TICKS_AND_SLEEP() returns. Therefore no other tasks will
           execute until this function completes. */

        /* Correct the kernel's tick count to account for the time the
           microcontroller spent in its low power state. */
        vTaskStepTick( ulLowPowerTimeTicks );
    }

    /* Exit the critical section - it might be possible to do this immediately
       after the prvSleep() calls. */
    vPortExitCritical();

    /* TODO Restart the timer that is generating the tick interrupt. */
}
Since I am not all that familiar with the low-level, hardware-specific details of the iOBC board, I assume that calling the vPortEnterCritical and vPortExitCritical functions declared in port.c is enough to disable and re-enable all interrupts. Interrupts should not be able to trigger while QEMU is halting anyway, but then again I am no QEMU expert. I also was not able to figure out how to disable the timer that generates the tick interrupts, only that prvSetupTimerInterrupt seems to enable it.
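For what it's worth, if one did want to stop and restart the tick timer around the halt, the AT91SAM9G20's periodic interval timer is controlled through its mode register, so something along these lines should be roughly what is needed (a sketch only, assuming the Atmel register names used elsewhere in the port, and untested on my side):
/* Sketch: pause the PIT and its interrupt before halting... */
AT91C_BASE_PITC->PITC_PIMR &= ~(AT91C_PITC_PITEN | AT91C_PITC_PITIEN);

/* ...and re-enable both after waking up. */
AT91C_BASE_PITC->PITC_PIMR |= (AT91C_PITC_PITEN | AT91C_PITC_PITIEN);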
I use the provided xExpectedIdleTime argument, convert it into QEMU ticks, and invoke the SYS_HALT syscall. Then I truncate the number of returned ticks such that, after being converted back to FreeRTOS ticks, they are not greater than xExpectedIdleTime. The reason for this is that vTaskStepTick actually asserts this, making FreeRTOS enter a failed-sanity-check loop if the condition is violated.
This does tie into the issue with the offset I subtract from the max_ticks argument in QEMU, as the overhead in the SYS_HALT syscall will vary based on how powerful the hardware that runs QEMU is. Perhaps this offset is something that could be learned by QEMU through trial and error, by first assuming a large offset and then shrinking it if possible. In the end I made it future work. As the host may halt for a longer amount of time than FreeRTOS thinks, the ARM guest may run a little slower than usual. However, it doesn't seem to affect correctness at least. I guess that is because, from the ARM guest's perspective, the call to SYS_HALT is instantaneous.
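As a rough illustration of what such a self-tuning offset could look like (purely hypothetical, not part of the patch), QEMU could measure how much each halt overshoots the request and nudge the offset accordingly:
/* Hypothetical sketch of a self-tuning overhead offset (not in the patch).
   Both arguments are in QEMU ticks (nanoseconds). */
static int64_t halt_overhead_offset = 1000000; /* start pessimistic: 1 ms */

static void update_halt_overhead_offset(int64_t requested, int64_t elapsed)
{
    /* How much longer did we actually halt than the guest asked for? */
    int64_t overshoot = elapsed - requested;

    /* Exponentially smoothed adjustment, clamped to stay non-negative. */
    halt_overhead_offset += overshoot / 8;
    if (halt_overhead_offset < 0) {
        halt_overhead_offset = 0;
    }
}
The halt path would then subtract halt_overhead_offset instead of the hard-coded 1000000.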
Testing with the On-board Software
Finally, it was time to test using the MIST On-board Software (OBCSW) running on top of FreeRTOS with tickless low power mode enabled. Unfortunately, it ran super slowly, delaying for much longer than you would expect. On the other hand, feeding it input over stdin actually made the thing run much faster, meaning it did indeed detect the I/O and wake up from its halting state. This suggested that there was a bug with the xExpectedIdleTime argument given to vApplicationSleep, and I was able to confirm using GDB that the QEMU SYS_HALT handler very often received max_ticks arguments corresponding to around 0.97 seconds, when you would expect the delays to be much shorter.
Troubleshooting with GDB
I had to figure out why FreeRTOS thought it was OK to halt for so long. This time I decided to leverage the GDB support in QEMU to debug the OBCSW itself. I added the -s -S flags to QEMU to make it wait for a GDB connection, and then in a separate terminal ran the command arm-none-eabi-gdb -ex 'target remote localhost:1234' -ex 'symbol-file obcsw.elf'.
I found this .gdbinit file intended for debugging FreeRTOS, but I couldn't really get it to work properly, maybe because it was written for a somewhat more modern version of FreeRTOS. On the positive side, I was able to reverse engineer it to learn how to obtain information about the scheduler state manually. For example, to check the name of the currently running task, use p pxCurrentTCB->pcTaskName. To check how many tasks are in a list, such as one of the pxReadyTasksLists or the pxDelayedTaskList, one can use p $(LIST).uxNumberOfItems, and to examine the name of the last task in a list, one can use p ((tskTCB *)($(LIST).xListEnd.pxPrevious->pvOwner))->pcTaskName.
It is worth noting that pxReadyTasksLists is an array of lists, where the index indicates the priority of the tasks that are stored there.
So now I was able to examine the list of ready tasks for each priority level and the list of suspended tasks. By looking at the FreeRTOS kernel, I was able to deduce that the xExpectedIdleTime argument is calculated by the prvGetExpectedIdleTime function, and that it uses the global xNextTaskUnblockTime variable to do this. The xNextTaskUnblockTime variable in turn is conditionally set when a task yields, in the prvAddCurrentTaskToDelayedList function. After setting breakpoints in various places, like at the start of prvGetExpectedIdleTime, vApplicationSleep, and prvIdleTask, I eventually set a breakpoint at the start of the eTaskConfirmSleepModeStatus function.
This function is called by vApplicationSleep to confirm whether it is really OK to go into low power mode, and it is supposed to abort sleep if a task was marked as ready while the scheduler was suspended, or if there is a pending yield. When I examined the tasks in the ready lists, there was a ready task with priority 0, which is unsurprisingly the currently running idle task. What is more noteworthy though is that there was also a ready task with priority 4. This turned out to be the OBCSW initialization task, which was exactly the task that ran so slowly.
Breakpoint 1, eTaskConfirmSleepModeStatus () at ../modules/hal-clone/src/freertos/tasks.c:2223
2223 ../modules/hal-clone/src/freertos/tasks.c: No such file or directory.
(gdb) p pxCurrentTCB->pcTaskName
$1 = "IDLE", '\000' <repeats 27 times>
(gdb) p pxReadyTasksLists
$2 = {{uxNumberOfItems = 1, pxIndex = 0x2026b9f4, xListEnd = {xItemValue = 4294967295, pxNext = 0x2026b9f4, pxPrevious = 0x2026b9f4}}, {uxNumberOfItems = 0,
pxIndex = 0x201a64a8, xListEnd = {xItemValue = 4294967295, pxNext = 0x201a64a8, pxPrevious = 0x201a64a8}}, {uxNumberOfItems = 0, pxIndex = 0x201a64bc, xListEnd = {
xItemValue = 4294967295, pxNext = 0x201a64bc, pxPrevious = 0x201a64bc}}, {uxNumberOfItems = 0, pxIndex = 0x201a64d0, xListEnd = {xItemValue = 4294967295,
pxNext = 0x201a64d0, pxPrevious = 0x201a64d0}}, {uxNumberOfItems = 1, pxIndex = 0x201a64e4, xListEnd = {xItemValue = 4294967295, pxNext = 0x202598bc,
pxPrevious = 0x202598bc}}}
(gdb) p pxReadyTasksLists[4].uxNumberOfItems
$3 = 1
(gdb) p ((tskTCB *)(pxReadyTasksLists[4].xListEnd.pxPrevious->pvOwner))->pcTaskName
$4 = "mist_initializationt_0\000\000\000\000\000\000\000\000\000"
And yet, the idle task was running and eTaskConfirmSleepModeStatus reported that it was OK to sleep. Because eTaskConfirmSleepModeStatus doesn't check the ready task lists, only the "pending" ready task list, this went undetected.
(gdb) p xPendingReadyList.uxNumberOfItems
$5 = 0
I didn't want to believe that this was a bug in FreeRTOS, but I decided anyway to add some extra code to eTaskConfirmSleepModeStatus that simply aborts sleep, to trigger the idle task to yield, if any of the lists of ready tasks with a priority of 1 or greater is non-empty.
eSleepModeStatus eTaskConfirmSleepModeStatus( void )
{
    eSleepModeStatus eReturn = eStandardSleep;

    /* Begin FreeRTOS hack... */
    unsigned portBASE_TYPE i = uxTopReadyPriority;
    while( listLIST_IS_EMPTY( &( pxReadyTasksLists[ i ] ) ) )
    {
        configASSERT( uxTopReadyPriority );
        --i;
    }
    if (i > 0)
    {
        /* There is a task with a priority greater than 0 that is ready. */
        eReturn = eAbortSleep;
    }
    /* End FreeRTOS hack */
    else if( listCURRENT_LIST_LENGTH( &xPendingReadyList ) != 0 )
    {
        /* ... */
    }
This actually was enough to fix the problem. The OBCSW ran essentially flawlessly after that, and htop reported that the CPU usage had gone down from 100% of a CPU thread to just around 10%. And that is even though the OBCSW is mostly a polling-based system that must semi-frequently check all subsystems for new events over I2C.
Preemptive vs cooperative scheduling
One big disclaimer here is that the OBCSW FreeRTOS distro uses cooperative scheduling, not preemptive scheduling. This has the effect that, at least in the FreeRTOS port we use, if a task gets placed in the ready task lists by the tick interrupt service routine, there is no yield.
void vPortTickISR( void )
{
    volatile unsigned long ulDummy;

    /* Increment the tick count - which may wake some tasks but as the
       preemptive scheduler is not being used any woken task is not given
       processor time no matter what its priority. */
    if( xTaskIncrementTick() != pdFALSE )
    {
        vTaskSwitchContext();
    }

    /* Clear the PIT interrupt. */
    ulDummy = AT91C_BASE_PITC->PITC_PIVR;

    /* To remove compiler warning. */
    ( void ) ulDummy;

    /* The AIC is cleared in the asm wrapper, outside of this function. */
}
It is likely that if we used preemptive scheduling, this problem wouldn't have happened, because the initialization task would have been scheduled to run after the yield. But still, this could be a FreeRTOS bug for cooperative scheduling. Messing with the FreeRTOS kernel was the only solution I could think of, because attempting to fix it in vPortTickISR would mean that we are no longer obeying cooperative scheduling.
Conclusion
And there you have it: how to support tickless low power mode for any ARM semihosting-compatible board run by QEMU, even if the board does not actually have a low power halting state! You can now unit test or manually test your firmware in QEMU for hours or days and worry less about your electricity bills. Some things that could be improved include optimizing the value of the max_ticks offset in QEMU, and updating assertion failures and other purposeful infinite loops in FreeRTOS to call semihosting_halt, to also reduce CPU usage in case of bugs.
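As a hypothetical example of the latter, configASSERT could be redefined so that assertion failures halt the host instead of busy-spinning (a sketch, not something that is currently in the code):
/* Hypothetical: make assertion failures halt the host instead of spinning. */
#define configASSERT( x )                                     \
    if( ( x ) == 0 )                                          \
    {                                                         \
        taskDISABLE_INTERRUPTS();                             \
        for( ;; )                                             \
        {                                                     \
            (void) semihosting_halt( 0x7FFFFFFFFFFFFFFFLL );  \
        }                                                     \
    }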
Some final links in case you want to use it in your project.