Sunday, February 28, 2021

Raspberry Pi Zero W Project Part 3 - Software Interrupt & Handler example

For ARM architecture, the instruction for software interrupt is SWI or SVC (ARMv7).

Using the instruction is very simple:

SVC #imm

When an ARM processor executes the instruction, it switches to Supervisor (SVC) mode and jumps to the SWI/SVC handler entry in the exception vector table, whose location is specified by the Vector Base Address Register (VBAR).

As in the previous parts, please get the source from the repo https://github.com/champyen/rpiz_bare_metal.git

and check out the commit 637086c2:

$ git checkout 637086c2

The SWI/SVC instruction is not difficult to use, but it takes experience to apply it well. To use the SWI/SVC instruction for system calls, we need to:

  • design a system call table (SWI number, system call pairs)
  • implement the SWI/SVC handler, which gets the system call number
  • implement the system calls for the corresponding numbers

How to get the SWI (or system call) number is described in ARM's "SWI Handler's" document. The SWI number is encoded in the lowest 24 bits of the SWI/SVC instruction. On entry to the handler, LR points to the instruction after the SWI/SVC, so we can fetch the instruction itself from LR-4 and clear its most significant 8 bits with a BIC instruction right after entering the SWI handler:

ldr     r0, [lr,#-4]
bic     r0, r0, #0xff000000
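
The same decoding can be written in C, which may help when checking the bit arithmetic on a host machine. This is only a minimal sketch and not code from the repo: svc_number and svc_encode are hypothetical helper names, and it assumes the 32-bit ARM-state encoding (not Thumb), where the condition code and opcode occupy the top 8 bits.

```c
#include <stdint.h>

/* Hypothetical helper: extract the 24-bit immediate from a 32-bit
 * ARM-state SWI/SVC instruction word (same effect as the BIC above). */
static inline uint32_t svc_number(uint32_t instr)
{
    return instr & 0x00FFFFFFu;
}

/* Hypothetical helper: build "SVC #imm" with the AL (always) condition.
 * Bits 31-28 = 0xE (AL), bits 27-24 = 0xF (SWI/SVC opcode). */
static inline uint32_t svc_encode(uint32_t imm)
{
    return 0xEF000000u | (imm & 0x00FFFFFFu);
}
```

For example, svc_number(svc_encode(5)) evaluates to 5.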

After getting the SWI number, we can pass it to system_handler as an argument in R0. The system_handler is very similar to an ISR, but we don't need to check hardware status to find out which device interrupted the CPU. We just need to handle the SWI or system call according to the SWI number. Of course, just as an IRQ has interrupt latency, on most CPUs the SWI/SVC instruction costs more cycles than a normal instruction.
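
A C-level handler that dispatches on the SWI number might look roughly like the sketch below. The name system_handler comes from the text, but its signature, the system call numbers (SYS_NOP, SYS_ADD, SYS_GET_TICKS) and the fake_ticks variable are assumptions made for illustration; the repo's actual implementation may differ.

```c
#include <stdint.h>

/* Hypothetical system call numbers; the repo may use different ones. */
enum { SYS_NOP = 0, SYS_ADD = 1, SYS_GET_TICKS = 2 };

static uint32_t fake_ticks = 42; /* stand-in for a real tick counter */

/* Sketch: the first argument carries the SWI number (R0); the two further
 * arguments are assumed to be passed along by the assembly stub. */
int32_t system_handler(uint32_t swi_num, int32_t a, int32_t b)
{
    switch (swi_num) {
    case SYS_NOP:       return 0;
    case SYS_ADD:       return a + b;
    case SYS_GET_TICKS: return (int32_t)fake_ticks;
    default:            return -1; /* unknown system call */
    }
}
```

Unlike an ISR, nothing here polls hardware: the number alone selects the service.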

For application processors, the meaning of mode switching and SWI handling is easy to understand. In fact, Cortex-R and Cortex-M processors also have the SWI/SVC instruction, although many embedded / RTOS developers consider it useless or trivial on such processors.

In some embedded systems or RTOSes, mutually exclusive applications can be loaded on demand. This can be done with a well-designed linker script (e.g.: different LAs with same). For such systems, the SWI/SVC instruction is useful for maintaining an API layer between applications and the kernel. An intuitive / naive alternative is to set up and maintain a table of function pointers on the kernel side (of course we still need to define an API index for each function, just like an SWI number). For development this has a drawback: the table is hard to maintain, adjust, or expand. With the SWI/SVC instruction, no table forwarding is needed. Besides, it gives the flexibility to group system calls and reserve numbers for future needs, whereas the table-based method would have to reserve a huge table.
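
One possible way to group system calls and reserve numbers is to split the 24-bit SWI number into fields. The sketch below is purely illustrative: the 8-bit group / 16-bit index split, the macro names, the group names and the stub functions are all assumptions, not something taken from the repo.

```c
#include <stdint.h>

/* Assumed layout: bits 23-16 select a group, bits 15-0 select a call
 * within that group. Unused groups cost nothing until they are needed. */
#define SWI_GROUP(n)   (((n) >> 16) & 0xFFu)
#define SWI_INDEX(n)   ((n) & 0xFFFFu)
#define SWI_MAKE(g, i) ((((uint32_t)(g) & 0xFFu) << 16) | ((uint32_t)(i) & 0xFFFFu))

enum { GROUP_CORE = 0x00, GROUP_IO = 0x01 /* 0x02-0xFF reserved */ };

/* Hypothetical per-group handlers returning dummy values. */
static int32_t core_call(uint32_t idx) { return idx == 0 ? 100 : -1; }
static int32_t io_call(uint32_t idx)   { return idx <= 1 ? 200 + (int32_t)idx : -1; }

int32_t dispatch_grouped(uint32_t swi_num)
{
    switch (SWI_GROUP(swi_num)) {
    case GROUP_CORE: return core_call(SWI_INDEX(swi_num));
    case GROUP_IO:   return io_call(SWI_INDEX(swi_num));
    default:         return -1; /* reserved group */
    }
}
```

Adding a new group later only needs a new case; no large table has to be allocated in advance.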


Saturday, February 6, 2021

Raspberry Pi Zero W Project Part 2 - Interrupt Service Routine (ISR) example

After implementing a naive function and printf, let's add an ISR.

Please check out the commit: 91abc0d3

$ git checkout 91abc0d3

There are many changes from the previous commit:

head.S

It has become more complicated. As we know, its content starts at address 0x8000.

start:
    ldr     pc, reset_target        /* 0x00 mode: svc */
    ldr     pc, undefined_target    /* 0x04 mode: undefined */
    ldr     pc, swi_target          /* 0x08 mode: svc */
    ldr     pc, prefetch_target     /* 0x0c mode: abort */
    ldr     pc, abort_target        /* 0x10 mode: abort */
    ldr     pc, unused_target       /* 0x14 unused */
    ldr     pc, irq_target          /* 0x18 mode: irq */
    ldr     pc, fiq_target          /* 0x1c mode: fiq */

reset_target:           .word   reset_entry

undefined_target:       .word   undefined_entry
swi_target:             .word   syscall_entry
prefetch_target:        .word   prefetch_entry
abort_target:           .word   abort_entry
unused_target:          .word   unused_entry
irq_target:             .word   irq_entry
fiq_target:             .word   fiq_entry

After the binary is loaded, execution jumps to a routine named "reset_entry". Before we trace reset_entry, note that the 8 "ldr pc, XXXXXX" instructions form the so-called Exception Vector Table, which is used to handle system exceptions. (FIQ is a special IRQ mode with one advantage: because it is the last entry, its implementation can start right at that location, so the jump is unnecessary and one jump delay is saved.) Each exception has a corresponding privileged mode. Besides, each mode has its own dedicated LR and SP registers, which means the OS / firmware implementation must take care of arranging stack space for each mode.
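
To make the table layout concrete, the entry offsets can be written down as constants. This is a small sketch for illustration only; the enum and the vector_address helper are hypothetical names, and the offsets simply mirror the comments in head.S.

```c
#include <stdint.h>

/* Offsets of the eight entries, matching the comments in head.S. */
enum vector_offset {
    VEC_RESET = 0x00, VEC_UNDEFINED = 0x04, VEC_SWI = 0x08,
    VEC_PREFETCH_ABORT = 0x0C, VEC_DATA_ABORT = 0x10, VEC_UNUSED = 0x14,
    VEC_IRQ = 0x18, VEC_FIQ = 0x1C
};

/* Hypothetical helper: address the CPU fetches from for a given
 * exception, given the base address of the vector table. */
static inline uint32_t vector_address(uint32_t base, enum vector_offset off)
{
    return base + (uint32_t)off;
}
```

With the table at 0x8000, an IRQ is fetched from 0x8018 and an SWI/SVC from 0x8008.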


In fact, the major work of reset_entry here is setting up the stack for each mode:

reset_entry:
    /* set VBAR to 0x8000 */
    mov r0, #0x8000
    mcr p15, 0, r0, c12, c0, 0


    /* (PSR_FIQ_MODE|PSR_FIQ_DIS|PSR_IRQ_DIS) */
    mov r0,#0xD1
    msr cpsr_c,r0
    ldr sp, stack_fiq_top

... other 4 modes ...

    /* (PSR_SVC_MODE|PSR_FIQ_DIS|PSR_IRQ_DIS) */
    mov r0,#0xD3
    msr cpsr_c,r0
    ldr sp, stack_svc_top

    cpsie i
    bl  bare_metal_start

In addition to the stack assignment and the jump to bare_metal_start, there are two key points here:

  1. set up the Vector Base Address Register (VBAR) - on ARMv6 cores with the Security Extensions, such as the ARM1176JZF-S, the exception vectors can be placed somewhere other than 0x00000000 and 0xFFFF0000. This is achieved by setting VBAR; please refer to "3.2.43 c12, Secure or Non-secure Vector Base Address Register" in the ARM1176JZF-S TRM.
  2. enable interrupts - the cpsie instruction
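
The magic numbers 0xD1 and 0xD3 written to cpsr_c are just the mode field OR-ed with the two interrupt mask bits. The sketch below shows the arithmetic. PSR_FIQ_MODE, PSR_SVC_MODE, PSR_FIQ_DIS and PSR_IRQ_DIS are the names used in the head.S comments; the remaining macro names and the cpsr_c_masked helper are assumed by analogy.

```c
#include <stdint.h>

/* CPSR mode field (bits 4-0) for the ARM privileged modes. */
#define PSR_FIQ_MODE 0x11u
#define PSR_IRQ_MODE 0x12u
#define PSR_SVC_MODE 0x13u
#define PSR_ABT_MODE 0x17u
#define PSR_UND_MODE 0x1Bu
#define PSR_SYS_MODE 0x1Fu

/* CPSR interrupt mask bits: setting a bit disables that interrupt. */
#define PSR_FIQ_DIS  0x40u  /* F bit */
#define PSR_IRQ_DIS  0x80u  /* I bit */

/* Hypothetical helper: control-byte value for entering a mode with both
 * IRQ and FIQ masked, as reset_entry does before setting each SP. */
static inline uint32_t cpsr_c_masked(uint32_t mode)
{
    return mode | PSR_FIQ_DIS | PSR_IRQ_DIS;
}
```

The values for the modes elided in the listing follow the same pattern, for example 0xD2 for IRQ mode.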

Next we have to trace irq_entry:

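irq_entry:
    stmfd   sp!, {r0-r12, lr}   /* save r0-r12 and lr_irq on the IRQ stack */
    add     lr, pc, #4          /* note: the bl below overwrites lr anyway */
    bl      isr_entry           /* call the C handler */
    ldmfd   sp!, {r0-r12, lr}   /* restore r0-r12 and lr_irq */

    subs    pc, lr, #4          /* return; CPSR is restored from SPSR_irq */

For an ISR, it is no surprise that all (non-dedicated) registers are saved and restored. The more interesting points are the LR handling and the instruction used to leave IRQ mode. The 'add lr, pc, #4' is meant to set up the return address for the call, but note that 'bl isr_entry' itself writes the address of the following instruction (the ldmfd) into LR, so that is where isr_entry returns to. The return address of the interrupted code is not lost, because lr_irq was pushed by the stmfd and is popped again by the ldmfd. On IRQ entry, LR holds the address of the next instruction to execute plus 4, which is why the handler leaves with 'subs pc, lr, #4'; the S suffix also restores CPSR from SPSR. For leaving each mode, please refer to "2.12.2 Exception entry and exit summary" of the ARM1176JZF-S TRM.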

isr.c

There are 3 functions in the source file: timer_enable, timer_check and isr_entry. Here we use the "System Timer" in the BCM2835; please refer to Chap 7 and Chap 12 of the "BCM2835 Peripheral specification". Note that the IRQ number of the System Timer is not listed in the document; for that, please refer to the link to "errata and some additional information" on the page.

timer_enable enables System Timer 1 or 3 by index, and timer_check clears the IRQ state and programs the next timeout interrupt. Therefore isr_entry just checks the status and calls timer_check to clear the IRQ.
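
The "clear and re-arm" step can be sketched as follows. This is not the repo's timer_check: the function name timer_check_sketch, the TICK_INTERVAL value and the idea of passing the register block as a parameter (so the logic can be tried against a plain struct on a host) are all assumptions. The register layout follows Chap 12 of the peripheral specification; on the BCM2835 the block is expected at physical address 0x20003000.

```c
#include <stdint.h>

/* BCM2835 System Timer register layout (Chap 12 of the peripheral spec). */
struct sys_timer {
    volatile uint32_t cs;    /* 0x00 control/status; write 1 to clear a match bit */
    volatile uint32_t clo;   /* 0x04 free-running counter, low 32 bits */
    volatile uint32_t chi;   /* 0x08 free-running counter, high 32 bits */
    volatile uint32_t c[4];  /* 0x0C-0x18 compare registers C0..C3 */
};

/* Assumed tick interval in counter ticks (about 1 s if the counter
 * runs at 1 MHz, as is commonly the case on the Raspberry Pi). */
#define TICK_INTERVAL 1000000u

/* Hypothetical variant of timer_check: schedule the next match for
 * channel idx, then acknowledge the pending one. */
void timer_check_sketch(struct sys_timer *t, unsigned idx)
{
    t->c[idx] = t->clo + TICK_INTERVAL; /* next timeout */
    t->cs = 1u << idx;                  /* write 1 to clear the match bit */
}
```

On real hardware the write to CS clears the match flag; against a plain struct it simply stores the mask, which is enough to check the arithmetic.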

bare_metal.c

To demonstrate the IRQ alongside the main thread's progress, a busy loop with a counter is added. The loop prints a number whenever the specified condition is met. The timer is enabled before the loop, so you can see the timer tick in the ISR while the main thread keeps counting.

void bare_metal_start(void)
{
    int base = 0;
    asm volatile (
        "mov %0, sp\n\t" : "=r" (base)
    );

    printf("\n\n%s:%x: Hello World! %s %s %d\n\n", __func__, base, __DATE__, __TIME__, __LINE__);
    printf("enter busy loop\n");

    timer_enable(1);

    volatile int i = 0;
    while(1){
        if((i++ & 0x00FFFFFF) == 0)
            printf("%d\n", i);
    }

}

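Using Function Multi-Versioning (FMV) on ARM platforms - with the Android NDK as an example

Function Multi-Versioning (FMV) Over the course of CPU development, the x86 platform has gradually gained various instruction set extensions in response to different application demands. In addition, possibly because of market segmentation, the number and kinds of extensions supported vary from product to product. On Linux, this CPU information can be obtained through...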