UART Loader

FPGA Computer UART Loader

This is a follow-up to the FPGA computer post.

I have developed a UART Loader for the FPGA Computer so that programs can be sent to it from a PC. It is based on the UART module I wrote in Verilog for the FPGA Computer. This module both sends and receives bytes at 115200 baud, with 8 data bits, 1 start bit, 1 stop bit, and no parity. The serial port of the FPGA Computer is connected to a TTL serial-to-USB dongle, which is plugged into a USB port on the PC.
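With 1 start bit, 8 data bits and 1 stop bit, each byte occupies 10 bit times on the wire, so 115200 baud gives at most 11520 bytes per second. The short Python sketch below does that arithmetic. The function name and parameters are made up for illustration, and it counts only raw wire time, ignoring any gaps between bytes.

```python
def uart_transfer_seconds(num_bytes, baud=115200, data_bits=8,
                          start_bits=1, stop_bits=1, parity_bits=0):
    """Estimate the raw wire time for num_bytes at the given UART settings."""
    # Every byte is wrapped in a frame: start + data + parity + stop bits.
    bits_per_frame = start_bits + data_bits + parity_bits + stop_bits
    return num_bytes * bits_per_frame / baud


# 11520 bytes at 115200 baud, 8N1, take one second of wire time.
one_second = uart_transfer_seconds(11520)

# The largest program a 16-bit size can describe (65535 bytes).
largest = uart_transfer_seconds(65535)
```

By this estimate, even a 65535-byte program needs about 5.7 seconds of wire time, which is far less than a multi-minute FPGA build.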

When I initially created the FPGA Computer, I could store just one program in it, by hardcoding it in the RAM. Here is the part of the RAM.v Verilog module that loads the program into the RAM:

// Declare the RAM variable
reg [N-1:0] ram[32767:0];

initial
begin
  $readmemh("program.hex", ram);
end

The problem with this approach is that it is very slow: the program has to be embedded into the computer at build time, and each build can take several minutes. That is why I devised the Loader. It is hardcoded in the RAM module, and when the computer powers on, execution starts at address 0x0000, where I have placed a jump instruction that goes to the Loader:

; ########################################################
; RESET CODE (4 bytes max)
; ########################################################
#addr 0x0000
j start

When started, the Loader sends an initialisation sequence of bytes to the PC via UART:

; send raspbootin boot char sequence
mov r0, 77 ; "M" character
call uart_send
mov r0, 13 ; \r character (carriage return)
call uart_send
mov r0, 10 ; \n character (line feed)
call uart_send
mov r0, 3
call uart_send
mov r0, 3
call uart_send
mov r0, 3
call uart_send

This sequence is inherited from the original Raspbootin protocol, for which I have made a Java implementation. This version is similar, but I have added a checksum at the end (more about this below).
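On the PC side, the first job is to recognise those six bytes (77, 13, 10, 3, 3, 3) in the incoming stream. The following Python sketch shows one way to do it. It is not the actual PC loader; `wait_for_boot_sequence` and `max_bytes` are hypothetical names, and `stream` is assumed to be any object with a `read(n)` method, such as an opened serial port.

```python
import io

# "M", carriage return, line feed, then three bytes with value 3.
BOOT_SEQUENCE = bytes([77, 13, 10, 3, 3, 3])


def wait_for_boot_sequence(stream, max_bytes=4096):
    """Return True once BOOT_SEQUENCE is seen, False if the stream ends first."""
    window = b""
    for _ in range(max_bytes):
        chunk = stream.read(1)
        if not chunk:
            return False  # stream closed or timed out
        # Keep only the last len(BOOT_SEQUENCE) bytes and compare.
        window = (window + chunk)[-len(BOOT_SEQUENCE):]
        if window == BOOT_SEQUENCE:
            return True
    return False


# Usage with an in-memory stream standing in for the serial port:
found = wait_for_boot_sequence(io.BytesIO(b"noise" + BOOT_SEQUENCE))
```

Scanning with a sliding window rather than reading exactly six bytes lets the PC tolerate stray bytes that arrive before the board resets.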

The Loader then receives the size of the program (the number of bytes to expect), low byte first:

first_byte:
in r1, [64] ; get the char from the uart
st [size], r1 ; store the lowest byte to the size variable
inc [state] ; next state -> 1 (second byte)
j skip ; return from interrupt
second_byte:
in r1, [64] ; get the char from the uart (8 upper bits)
ld r2, [size] ; get the lower 8 bits (received earlier)
shl r1, 8 ; shift the received byte 8 bits
or r1, r2 ; put together lower and upper 8 bits
st [size], r1 ; store the calculated size
inc [state] ; next state 
j skip ; return from interrupt
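The two handlers above expect the low byte of the size first and the high byte second. A minimal Python sketch of the matching PC-side encoding follows. The helper names are made up, and it covers only the two size bytes visible in the excerpt; whether the PC sends further size bytes, as the original Raspbootin protocol does, is not shown here.

```python
def encode_size(size):
    """Encode a program size as two bytes, low byte first."""
    if not 0 <= size <= 0xFFFF:
        raise ValueError("size must fit in 16 bits")
    return bytes([size & 0xFF, size >> 8])


def decode_size(low, high):
    """Mirror of the Loader's shl/or: combine the low and high bytes."""
    return (high << 8) | low


encoded = encode_size(0x1234)        # low byte 0x34, then high byte 0x12
decoded = decode_size(encoded[0], encoded[1])
```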

After that, the Loader echoes the received size back to the PC, so the PC can confirm that the size arrived correctly:

; this is 16-bit cpu, so we don't load code bigger than 65535 bytes
; echo the received size back to the PC, low byte first
ld r0, [size]
and r0, 255
call uart_send
ld r0, [size]
shr r0, 8
call uart_send
inc [state] ; next state ->  (code arrives)

After that, all incoming bytes are stored in memory, starting at address 0x400:

in r1, [64] ; get the byte from the uart into r1

mov r2, r1
ld r0, [sum_all]
add r0, r2
st [sum_all], r0 ; primitive checksum - sum of all bytes
; at this moment, r1 holds the received byte
ld r2, [current_addr]
st.b [r2], r1 ; store the received byte into the memory
inc r2 ; move to the next location in memory
st [current_addr], r2   ; save the incremented value of the address

ld r2, [current_size]   ; increment the byte counter
inc r2
st [current_size], r2
cmp r2, [size] ; did we receive all?
jz all_arrived
j skip
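The checksum accumulated in `sum_all` is just the sum of all received bytes. A Python sketch of the same calculation on the PC side is below. It assumes the sum wraps at 16 bits, because it is kept in a 16-bit register; the function name is hypothetical.

```python
def loader_checksum(payload):
    """Sum of all payload bytes, assumed to wrap at 16 bits like the CPU."""
    return sum(payload) & 0xFFFF


small = loader_checksum(b"\x01\x02\x03")       # 1 + 2 + 3
wrapped = loader_checksum(bytes([255]) * 300)  # 76500 wraps past 65535

# A plain sum does not notice reordered bytes:
same = loader_checksum(b"\x01\x02") == loader_checksum(b"\x02\x01")
```

This is why the checksum is called primitive: it catches most dropped or corrupted bytes, but not bytes that arrive in the wrong order.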

When all bytes have been received, the Loader sends back the primitive checksum, so the PC can check that everything arrived intact:

all_arrived:
; send the sum of all bytes
ld r0, [sum_all]
and r0, 255
call uart_send
ld r0, [sum_all]
shr r0, 8
call uart_send

mov r0, 1 ; signal to the main program -> loader has received all
st [loaded], r0

Meanwhile, the main program polls the loaded flag and, once it is set, jumps to address 0x400:

not_loaded:
ld r0, [loaded]
cmp r0, 1
jz 0x400
nop
j not_loaded

For the PC side, I have modified the Raspbootin loader, originally used for Raspberry Pi bare-metal programming; it is also available on GitHub.
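To tie the steps together, here is a rough Python sketch of the whole exchange as seen from the PC. It is not the modified Raspbootin loader mentioned above. `FakeLoader` and `send_program` are made-up names, the fake board only imitates the behaviour described in this post, and the sketch assumes a two-byte size and a checksum that wraps at 16 bits.

```python
BOOT_SEQUENCE = bytes([77, 13, 10, 3, 3, 3])


class FakeLoader:
    """In-memory stand-in for the board, imitating the steps in this post."""

    def __init__(self):
        self._out = bytearray(BOOT_SEQUENCE)  # bytes waiting to be read by the PC
        self._size_bytes = bytearray()
        self._size = None
        self.memory = bytearray()             # stands in for RAM from 0x400

    def read(self, n):
        data = bytes(self._out[:n])
        del self._out[:n]
        return data

    def write(self, data):
        for b in data:
            if self._size is None:
                self._size_bytes.append(b)
                if len(self._size_bytes) == 2:
                    self._size = self._size_bytes[0] | (self._size_bytes[1] << 8)
                    self._out += self._size_bytes      # echo the size back
            elif len(self.memory) < self._size:
                self.memory.append(b)
                if len(self.memory) == self._size:
                    total = sum(self.memory) & 0xFFFF  # primitive checksum
                    self._out += bytes([total & 0xFF, total >> 8])


def send_program(port, payload):
    """Run the exchange; return True if the reported checksum matches."""
    size = len(payload)
    if not 0 < size <= 0xFFFF:
        raise ValueError("payload must be 1..65535 bytes")
    if port.read(len(BOOT_SEQUENCE)) != BOOT_SEQUENCE:
        raise IOError("boot sequence not seen")
    port.write(bytes([size & 0xFF, size >> 8]))        # low byte first
    echo = port.read(2)
    if len(echo) != 2 or (echo[0] | (echo[1] << 8)) != size:
        raise IOError("size echo mismatch")
    port.write(payload)
    reply = port.read(2)
    if len(reply) != 2:
        raise IOError("checksum not received")
    return (reply[0] | (reply[1] << 8)) == (sum(payload) & 0xFFFF)


board = FakeLoader()
ok = send_program(board, b"\x01\x02\x03")
```

With a real board, `port` would be an opened serial port object offering the same `read` and `write` calls, configured for 115200 baud, 8 data bits, no parity and 1 stop bit.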

Conclusion

When I tried Raspberry Pi bare-metal programming, I immediately ran into the problem of transferring programs from the PC to the RPI. Usually there is no network (it is a bare-metal platform with almost no I/O libraries), and the only other way is to transfer programs on micro SD cards (the card dance). You would cross-compile the program on the PC, save it to the SD card, eject it, put it in the RPI, and reset the RPI. And then again, and again...

That was the motivation for programmers to develop some kind of loader for the RPI. One of those loaders is Raspbootin. It is fairly simple. I reused it for exactly the same purpose: to load programs onto my FPGA Computer from the PC. The only problematic part of this development was debugging the Loader. That could only be done on the FPGA, with those couple-of-minutes builds. Once I survived that, I was able to cross-assemble programs on my PC and send them to the board via the Loader.

