ENG | Cross compiling assembly for RISC-V

Learn how to convert x86 assembly to RISC-V (RV64), cross-compile for the Milk V Duo S, and deploy and debug it.

This article explores the process of converting a simple x86 assembly program to the RISC-V architecture and cross-compiling it, specifically for use on the Milk V Duo S single-board computer. We’ll walk through the conversion process, the necessary tools, the deployment steps, and debugging with GDB (the GNU Debugger).

Code Example: Rewriting a Hello World Program in RISC-V Assembly

First, I asked Claude.AI to rewrite this 25-year-old program for 32-bit Linux (I’ve lost the original MS-DOS version).

Btw, "potvora" roughly means a monster, a beast, or an annoying person or creature in Czech.

Netwide Assembler version for x86 Linux.

; Build:
;   nasm -f elf -o potvora.o potvora.asm
;   ld potvora.o -o potvora
;   strip potvora
bits 32
global _start
section .text
_start:
        mov     al,4            ; assumes eax=0; 4 is syscall write
        mov     ebx,1           ; file descriptor
        mov     dword ecx, .potvora
        mov     edx,79          ; string length
        int     80h             ; write(1,".....",79);
        xor     ebx,ebx         ; exit code
        mov     eax,1           ; syscall exit
        int     80h             ; exit(0);
.potvora:
        db 1Bh,'[32;1m\|/ __\/ \|/',0Ah
        db ' @~/  ',1Bh,'[0;1mOo',1Bh,'[32m\~@',0Ah
        db '/_(  \__/)_\',0Ah
        db '   \___',1Bh,'[31mU',1Bh,'[32m/',1Bh,'[m',0Ah

RISC-V Assembly version

.global _start

.section .text
_start:
    # Write syscall
    li a7, 64           # syscall number for write
    li a0, 1            # file descriptor (1 = stdout)
    la a1, potvora      # address of string to print
    li a2, 79           # length of string
    ecall               # make syscall

    # Exit syscall
    li a7, 93           # syscall number for exit
    li a0, 0            # exit status
    ecall               # make syscall

.section .data
potvora:
    .ascii "\033[32;1m\\|/ __\\/ \\|/\n"
    .ascii " @~/  \033[0;1mOo\033[32m\\~@\n"
    .ascii "/_(  \\__/)_\\\n"
    .ascii "   \\___\033[31mU\033[32m/\033[m\n"

(Note: Instructions li and la stand for load immediate value and load address, respectively.)
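The length 79 loaded into a2 is hard-coded, so it is worth checking against the actual data. A minimal host-side sketch, assuming a POSIX shell whose printf interprets the same octal and backslash escapes as the .ascii strings above (the variable name len is only for illustration):

```shell
# Count the bytes of the four .ascii strings from the RISC-V listing.
# printf interprets \033 (ESC), \\ (backslash) and \n the same way the assembler does.
len=$(printf '\033[32;1m\\|/ __\\/ \\|/\n @~/  \033[0;1mOo\033[32m\\~@\n/_(  \\__/)_\\\n   \\___\033[31mU\033[32m/\033[m\n' | wc -c)
len=$((len))   # strip the padding that some wc implementations add
echo "$len"    # prints 79
```

If the strings are ever edited, the printed number is the value that belongs in li a2.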

It may or may not work. To find out, we need to cross-compile it: build it on one machine (the x86 Fedora host) for a different one (the RISC-V board).

Installing and Using RISC-V Binutils on Fedora

On Fedora, install RISC-V binutils (assembler, linker, etc.) as root:

dnf search riscv                            # Search for packages containing riscv
dnf in binutils-riscv64-linux-gnu.x86_64    # Install binutils for riscv
rpm -ql binutils-riscv64-linux-gnu          # List files in RPM

Compiling and Linking RISC-V Assembly Code

Now, as a user:

riscv64-linux-gnu-as potvora_riscv.s -o potvora_riscv.o # Assembler to object file
riscv64-linux-gnu-ld potvora_riscv.o -o potvora_riscv   # Link a single object file to executable
ls -al ~/potvora_riscv # ==> -rwxr-xr-x. 1 pavel pavel 1464 Jun 22 22:56 /home/pavel/potvora_riscv
file ~/potvora_riscv   # ==> potvora_riscv: ELF 64-bit LSB executable, UCB RISC-V, double-float ABI, version 1 (SYSV), statically linked, not stripped
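If the file utility is not at hand, the target architecture can also be read straight from the ELF header. The following is a rough sketch, not part of the original workflow: the helper name elf_machine is hypothetical, and it assumes a little-endian (LSB) ELF file, where the 16-bit e_machine field sits at byte offset 18.

```shell
# elf_machine FILE: print the e_machine field of an ELF header as a decimal number.
# Assumes a little-endian (LSB) ELF file; e_machine is 2 bytes at offset 18.
elf_machine() {
    # od prints the two bytes as unsigned decimals, e.g. " 243   0"
    set -- $(od -An -tu1 -j18 -N2 "$1")
    echo $(( $1 + $2 * 256 ))
}

# Usage (EM_RISCV is 243):
#   elf_machine potvora_riscv
```

A result of 243 means the linker produced a RISC-V binary; 62, for comparison, would be x86-64.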

Transferring the Executable to Milk V Duo S

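Sadly, the Milk V Duo S runs a lightweight SSH server (Dropbear) without SFTP server functionality, and recent OpenSSH versions of scp use the SFTP protocol by default, so a plain scp from the host fails. (The scp -O option, which forces the legacy SCP protocol, may be worth a try.)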

scp potvora_riscv [email protected]:/root/potvora_riscv
[email protected]'s password: 
sh: /usr/libexec/sftp-server: not found
scp: Connection closed

Hard way

Oops. Let’s try something simpler, like using netcat, ncat, or nc to redirect pipes to and from a network socket.

Source computer

  • Read potvora_riscv and pipe it to netcat listening on port 12345
cat potvora_riscv | nc -l 12345

Target computer

  • Connect to host 192.168.0.143 to “download” the file content and redirect it to the file ‘potvora’
  • Check the file permissions
  • Fix the file permissions
  • Run the program
[root@milkv-duo]~# netcat 192.168.0.143 12345 > potvora
[root@milkv-duo]~# ls -la potvora 
-rw-r--r-- 1 root root 1464 Jun 22 21:06 potvora
[root@milkv-duo]~# chmod +x potvora 
[root@milkv-duo]~# ./potvora 
\|/ __\/ \|/
 @~/  Oo\~@
/_(  \__/)_\
   \___U/
[root@milkv-duo]~#    
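Netcat does no integrity checking, and the matching size (1464 bytes) is only a hint that the transfer was complete. A small sketch for comparing checksums on both sides, assuming cksum is available on the host and on the board (BusyBox usually ships it, but that is an assumption here); the helper name file_sum is hypothetical:

```shell
# file_sum FILE: print "CRC SIZE" for FILE. Reading from stdin keeps the file
# name out of the output, so the lines are comparable even if the names differ.
file_sum() {
    cksum < "$1"
}

# On the source computer:  file_sum potvora_riscv
# On the target computer:  file_sum potvora
# Both lines should be identical; the second field is the size in bytes.
```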

Easier way

OK, without overthinking it: scp started on the Duo S towards the Linux host works. When the Duo S is connected to a USB port, just find the local IP address that the DHCP server running on the Duo S assigned to the host, e.g.:

ip addr # or ip a
⋮
4: enp0s20f0u2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 1000
    link/ether a2:09:a7:7a:6b:6c brd ff:ff:ff:ff:ff:ff
    inet 192.168.42.73/24 brd 192.168.42.255 scope global dynamic noprefixroute enp0s20f0u2
       valid_lft 2322sec preferred_lft 2322sec
⋮      

Connect to the Duo S and pull the file from there:

[pavel@marten -=- /home/pavel]$ ssh [email protected]
[email protected]'s password:
[root@milkv-duo]~# scp [email protected]:/home/pavel/dev-riscv/potvora .
[email protected]'s password:
potvora                                                                             100% 1464     1.5KB/s   00:00
[root@milkv-duo]~#

Debugging with GDB on RISC-V

First, it’s important to add debug symbols. Luckily, both gdb and strace are available on the Duo S.

#!/usr/bin/env dash
riscv64-linux-gnu-as --gstabs nanosleep.s -o nanosleep.o
riscv64-linux-gnu-ld nanosleep.o -o nanosleep

Then transfer both the assembly source and the binary to the board (gdb needs the source file to show listings):

scp [email protected]:/home/pavel/dev-riscv/nanosleep.s .
scp [email protected]:/home/pavel/dev-riscv/nanosleep .

And start the debugger:

gdb nanosleep

Useful commands are:

  • list: show the source around the current position
  • list 13: show the source around line 13
  • break 13 or b 13: set a breakpoint at line 13
  • break _start: set a breakpoint at the label _start
  • run: start the program; it runs until the first breakpoint
  • continue: continue until the next breakpoint
  • step or s: step to the next source line, entering called functions
  • next or n: go to the next source line, stepping over called functions
  • info reg a7 or i r a7: print register a7
  • x <address>: examine memory at address
  • x/6b 0x10138: examine memory at address 0x10138 and print six bytes (see docs)
  • x/8dw <addr>: print 8 words in decimal
  • disassemble _start: show the disassembly (useful when debug info is not present)
  • print whatever: print the variable (or struct) named whatever
  • ptype whatever: print the type (or struct definition) of whatever
  • <ENTER>: repeat the last command
⋮
(gdb) break _start
Breakpoint 1 at 0x100e8: file nanosleep.s, line 23.
(gdb) run
Starting program: /root/nanosleep

Breakpoint 1, _start () at nanosleep.s:23
23              call write_hello
(gdb) s
write_hello () at nanosleep.s:29
29              li a7, WRITE    # write
(gdb) n
30              li a0, STDOUT   # fd
(gdb) n
31              la a1, msg      # char*
(gdb) n
32              li a2, 6        # strlen
(gdb) info reg a1
a1             0x10138  65848
(gdb) n
33              ecall
(gdb) n
Hello
34              ret
(gdb) x/6b 0x10138
0x10138:        0x48    0x65    0x6c    0x6c    0x6f    0x0a
(gdb) x/6c 0x10138
0x10138:        72 'H'  101 'e' 108 'l' 108 'l' 111 'o' 10 '\n'
(gdb) x/8i 0x03020000
   0x3020000:   Cannot access memory at address 0x3020000
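The interactive session above can also be replayed non-interactively, which is handy after every rebuild. A minimal sketch, assuming gdb's -batch and -x options; the file name nanosleep.gdb is made up for this example, and the address 0x10138 is copied from the session above, so it may differ for another build:

```shell
# Write the commands from the session above into a gdb command file.
cat > nanosleep.gdb <<'EOF'
break _start
run
step
next
info reg a7
x/6c 0x10138
continue
EOF

# Replay it on the board (commented out so the snippet is safe to paste anywhere):
# gdb -batch -x nanosleep.gdb ./nanosleep
```

With -batch, gdb exits after the last command instead of waiting at the prompt.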

Rewriting blink.sh to assembly

Out of curiosity, I tried the new Claude.AI Sonnet 3.5 (a large language model) and asked it to rewrite /mnt/system/blink.sh in assembly, using a table of strings and functions. It did. I don’t have a clue where it got the information about which registers to use for kernel calls on the RV64 architecture. However, it got the stat syscall completely wrong (the syscall number), and both the condition testing and the handling of its return value were wrong too.

First, the original program, which is executed by init and seems to be restarted automatically when killed:

#!/bin/sh
LED_PIN=509

LED_GPIO=/sys/class/gpio/gpio${LED_PIN}

if test -d ${LED_GPIO}; then
    echo "PIN ${LED_PIN} already exported"
else
    echo ${LED_PIN} > /sys/class/gpio/export
fi

echo out > ${LED_GPIO}/direction

while true; do
    echo 0 > ${LED_GPIO}/value
    sleep 0.5
    echo 1 > ${LED_GPIO}/value
    sleep 0.5
done

Assembly code

The following code works, but contains errors. They are left in on purpose as a debugging example. The final code is in the links section.

.section .rodata
gpio_export:    .asciz "/sys/class/gpio/export"
gpio_dir:       .asciz "/sys/class/gpio/gpio509/direction"
gpio_val:       .asciz "/sys/class/gpio/gpio509/value"
gpio_path:      .asciz "/sys/class/gpio/gpio509"
pin_str:        .asciz "509\n"
dir_out:        .asciz "out"
val_on:         .asciz "1"
val_off:        .asciz "0"
# Struct timespec for nanosleep
sleep_time:
    .dword 0          # tv_sec
    .dword 500000000  # tv_nsec (500ms)

.section .data
stat_buffer:    .space 160    # Allocate 160 bytes in the data section for struct stat


.section .text
.global _start

_start:
    # Check if GPIO is already exported
    call check_dir_exists

    li t0, 0
    beq a0, t0, gpio_exported # if 0, already exported (branch equal)

    li t0, -2
    bne a0, t0, handle_error # if not -2 error (branch notequal)

    # GPIO not exported, so export it
    la a0, gpio_export
    la a1, pin_str
    li a2, 4          # length of "509\n"
    call write_to_file
    call do_sleep

gpio_exported:
    # Set direction to out
    la a0, gpio_dir
    la a1, dir_out
    li a2, 3          # length of "out"
    call write_to_file

loop:
    # Turn LED off
    la a0, gpio_val
    la a1, val_off
    li a2, 1          # length of "0"
    call write_to_file

    # Sleep for 500ms
    call do_sleep

    # Turn LED on
    la a0, gpio_val
    la a1, val_on
    li a2, 1          # length of "1"
    call write_to_file

    # Sleep for 500ms
    call do_sleep

    j loop

# Function to check if directory exists
check_dir_exists:
    la a0, gpio_path     # Load address of the pathname
    la a1, stat_buffer   # Load address of the stat buffer
    li a2, -100          # AT_FDCWD for current working directory
    li a3, 0             # flags (0 for no special options)
    li a7, 79            # newfstatat syscall number
    ecall
    # The return value is already in a0
    ret

# Function to write to a file
write_to_file:
    mv t0, a0         # Save filename
    mv t1, a1         # Save content
    mv t2, a2         # Save length

    # Open file
    li a7, 56         # openat syscall
    li a0, -100       # AT_FDCWD (current working directory)
    mv a1, t0         # filename
    li a2, 1          # O_WRONLY
    li a3, 0644       # mode
    ecall

    # Write to file
    li a7, 64         # write syscall
    mv a1, t1         # content
    mv a2, t2         # length
    ecall

    # Close file
    li a7, 57         # close syscall
    ecall
    ret

# Function to sleep
do_sleep:
    li a7, 101        # nanosleep syscall
    la a0, sleep_time
    li a1, 0          # NULL for remaining time
    ecall
    ret

# Exit program
exit:
    li a0, 0          # status code 0
handle_error:
    li a7, 93         # exit syscall
    ecall

# https://github.com/hrw/syscalls-table/blob/master/system_calls/tables/riscv64.py    
# https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf (page 110 conditions)

Running at boot time

Adding this to /etc/init.d/S99user can be a bit tricky, because an error in this file can prevent the system from booting correctly. Note that the . before the file name means the file is sourced into the script rather than executed as a separate program (no new interpreter is started; with the trailing & it runs in a background subshell). Also, blink.sh cannot simply be killed, because it is restarted immediately.

So I replaced this:

        if [ -f $SYSTEMPATH/blink.sh ]; then
                . $SYSTEMPATH/blink.sh &
        fi

with this (note that I changed the test from "file exists", -f, to "file is executable", -x):

        if [ -x $SYSTEMPATH/blink.sh ]; then
                . $SYSTEMPATH/blink.sh &
        fi
        if [ -x /root/bin/blink ]; then
                /root/bin/blink &
        fi

and removed the executable flag from the script:

chmod -x /mnt/system/blink.sh
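Because an error in the init script is costly, the effect of the -x test can be tried out first on a throwaway file. A small sketch, with a temporary file standing in for blink.sh (the variable names are only illustrative):

```shell
# Demonstrate the [ -x ... ] guard on a temporary stand-in for blink.sh.
script=$(mktemp)
printf '#!/bin/sh\necho blink\n' > "$script"

chmod +x "$script"
if [ -x "$script" ]; then before=runs; else before=skipped; fi

chmod -x "$script"
if [ -x "$script" ]; then after=runs; else after=skipped; fi

echo "with +x: $before, after chmod -x: $after"
rm -f "$script"
```

This prints "with +x: runs, after chmod -x: skipped", which is why removing the executable flag is enough to disable the original script without deleting it.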

Now reboot. After killing our blink process, we can run it by hand and inspect it:

Inspecting and debugging system calls with strace

[root@milkv-duo]~/dev# strace ./blink
execve("./blink", ["./blink"], 0x3fff98abe0 /* 18 vars */) = 0
newfstatat(66123, "", 0xffffffffffffff9c, 0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/sys/class/gpio/export", O_WRONLY) = 3
write(3, "509\n", 4)                    = -1 EBUSY (Resource busy)
close(-16)                              = -1 EBADF (Bad file descriptor)
nanosleep({tv_sec=0, tv_nsec=500000000}, NULL) = 0
openat(AT_FDCWD, "/sys/class/gpio/gpio509/direction", O_WRONLY) = 4
write(4, "out", 3)                      = 3
close(3)                                = 0
openat(AT_FDCWD, "/sys/class/gpio/gpio509/value", O_WRONLY) = 3
write(3, "0", 1)                        = 1
close(1)                                = 0
nanosleep({tv_sec=0, tv_nsec=500000000}, NULL) = 0
openat(AT_FDCWD, "/sys/class/gpio/gpio509/value", O_WRONLY) = 1
write(1, "1", 1)                        = 1
close(1)                                = 0
nanosleep({tv_sec=0, tv_nsec=500000000}, NULL) = 0
openat(AT_FDCWD, "/sys/class/gpio/gpio509/value", O_WRONLY) = 1
write(1, "0", 1)                        = 1
close(1)                                = 0
nanosleep({tv_sec=0, tv_nsec=500000000}, ^CNULL) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
strace: Process 346 detached

Here is the suspicious call and the relevant piece of code:

newfstatat(66123, "", 0xffffffffffffff9c, 0) = -1 ENOENT (No such file or directory)
check_dir_exists:
    la a0, gpio_path     # Load address of the pathname
    la a1, stat_buffer   # Load address of the stat buffer
    li a2, -100          # AT_FDCWD for current working directory
    li a3, 0             # flags (0 for no special options)
    li a7, 79            # newfstatat syscall number

The relevant piece of fs/stat.c from the Linux kernel:

SYSCALL_DEFINE4(newfstatat, int, dfd, const char __user *, filename,
        struct stat __user *, statbuf, int, flag)

Hmm, we have the arguments in the wrong order. Let’s fix it:

check_dir_exists:
    li a0, -100          # AT_FDCWD for current working directory
    la a1, gpio_path     # Load address of the pathname
    la a2, stat_buffer   # Load address of the stat buffer
    li a3, 0             # flags (0 for no special options)
    li a7, 79            # newfstatat syscall number

Run with the GPIO already exported:

[root@milkv-duo]~/dev# strace ./blink
execve("./blink", ["./blink"], 0x3fffb58be0 /* 18 vars */) = 0
newfstatat(AT_FDCWD, "/sys/class/gpio/gpio509", {st_mode=S_IFDIR|0755, st_size=0, ...}, 0) = 0
openat(AT_FDCWD, "/sys/class/gpio/gpio509/direction", O_WRONLY) = 3
write(3, "out", 3)                      = 3
close(3)                                = 0
openat(AT_FDCWD, "/sys/class/gpio/gpio509/value", O_WRONLY) = 3
write(3, "0", 1)                        = 1
close(1)                                = 0
nanosleep({tv_sec=0, tv_nsec=500000000}, NULL) = 0
openat(AT_FDCWD, "/sys/class/gpio/gpio509/value", O_WRONLY) = 1
write(1, "1", 1)                        = 1
close(1)                                = 0
nanosleep({tv_sec=0, tv_nsec=500000000}, ^CNULL) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
strace: Process 360 detached

Flawless (well, not really, if you look really close()). And without the GPIO exported:

[root@milkv-duo]~/dev# echo 509 > /sys/class/gpio/unexport
[root@milkv-duo]~/dev# strace ./blink
execve("./blink", ["./blink"], 0x3fffecebe0 /* 18 vars */) = 0
newfstatat(AT_FDCWD, "/sys/class/gpio/gpio509", 0x11280, 0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/sys/class/gpio/export", O_WRONLY) = 3
write(3, "509\n", 4)                    = 4
close(4)                                = -1 EBADF (Bad file descriptor)
nanosleep({tv_sec=0, tv_nsec=500000000}, NULL) = 0
openat(AT_FDCWD, "/sys/class/gpio/gpio509/direction", O_WRONLY) = 4
write(4, "out", 3)                      = 3
close(3)                                = 0
openat(AT_FDCWD, "/sys/class/gpio/gpio509/value", O_WRONLY) = 3
write(3, "0", 1)                        = 1
close(1)                                = 0
nanosleep({tv_sec=0, tv_nsec=500000000}, NULL) = 0
openat(AT_FDCWD, "/sys/class/gpio/gpio509/value", O_WRONLY) = 1
write(1, "1", 1)                        = 1
close(1)                                = 0
nanosleep({tv_sec=0, tv_nsec=500000000}, ^CNULL) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
strace: Process 364 detached

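Okay. We have a bad file descriptor, because write returns the number of bytes written (4) in register a0, overwriting the file descriptor we had there, so close receives 4 instead of 3. Weirdly, strace shows errors as -1, but the error codes are correct (ENOENT, EBADF). This is just strace's presentation: the raw value the kernel returns in a0 is the negated error number, which is why the code compares against -2 for ENOENT. The fix is to keep the descriptor somewhere safe, for example in a spare temporary register, and move it back to a0 before close. After fixing it: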

[root@milkv-duo]~/dev# echo 509 > /sys/class/gpio/unexport
[root@milkv-duo]~/dev# strace ./blink
execve("./blink", ["./blink"], 0x3fffc12be0 /* 18 vars */) = 0
newfstatat(AT_FDCWD, "/sys/class/gpio/gpio509", 0x11288, 0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/sys/class/gpio/export", O_WRONLY) = 3
write(3, "509\n", 4)                    = 4
close(3)                                = 0
nanosleep({tv_sec=0, tv_nsec=500000000}, NULL) = 0
openat(AT_FDCWD, "/sys/class/gpio/gpio509/direction", O_WRONLY) = 3
write(3, "out", 3)                      = 3
close(3)                                = 0
openat(AT_FDCWD, "/sys/class/gpio/gpio509/value", O_WRONLY) = 3
write(3, "0", 1)                        = 1
close(3)                                = 0
nanosleep({tv_sec=0, tv_nsec=500000000}, ^CNULL) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
strace: Process 372 detached
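The relation between the raw value in a0 and what strace prints can be sketched with a tiny lookup. This is only an illustration: the function name decode_ret is hypothetical, and it covers just the three Linux error numbers that appear in the traces above (ENOENT = 2, EBADF = 9, EBUSY = 16).

```shell
# decode_ret VALUE: show a raw syscall return value the way strace prints it.
# Only the three error numbers seen in the traces above are named.
decode_ret() {
    case "$1" in
        -2)  echo "-1 ENOENT" ;;
        -9)  echo "-1 EBADF" ;;
        -16) echo "-1 EBUSY" ;;
        -*)  echo "-1 (errno $(( -($1) )))" ;;
        *)   echo "$1" ;;
    esac
}

decode_ret -2    # newfstatat on a missing directory  -> -1 ENOENT
decode_ret -16   # failed write to the export file     -> -1 EBUSY
decode_ret 4     # successful write of 4 bytes         -> 4
```

This also explains the close(-16) in the very first trace: the failed write left -16 (EBUSY) in a0, and that value was passed on to close as if it were a file descriptor.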

I assume this can be done more efficiently via gpiod somehow, but its tools are not installed on the Duo S. It is probably also possible to do it from the RTOS core.

Source files

Some Linux kernel stuff

GPIO

Videos

This post is licensed under CC BY 4.0 by the author.