Adding byte-oriented instructions

This is a follow-up to my previous post about the FPGA Computer.

When I initially committed the FPGA Computer, the CPU had a 16-bit address bus and a 16-bit data bus. All the instructions were word-oriented, working with 16 bits. Even the memory was word-oriented, holding 64KWords rather than 64KB. At first, that looked promising: double the amount of RAM compared to the usual 8-bit platforms (64KW compared to 64KB).

However, having only word-oriented instructions made byte-oriented programs complicated. For example, the UART is byte-oriented, so the UART loader receives bytes, not words. That causes a problem when the loader has to receive the code from the UART:

in r1, [64] ; get the byte from the uart into r1

ld r2, [flip]
cmp r2, 0
jz do_flip       ; we have received the even byte
; at this moment, r1 holds the received byte
neg [flip] ; we have received the odd byte - time to complete the word out of those two bytes (even and odd)
ld r0, [current_byte] ; get the even byte from the memory (stored earlier)
shl r0, 8 ; shift it 8 bits to the left
or r0, r1 ; complete the word
ld r2, [current_addr] ; r2 holds the current pointer in memory to store the completed word
st [r2], r0 ; store the completed word into the memory
inc r2 ; move to the next location in memory
st [current_addr], r2  ; save the incremented value of the current address
ld r2, [current_size]  ; increment the byte counter
inc r2
st [current_size], r2
cmp r2, [size] ; did we receive all?
jz all_arrived
j skip

do_flip:
neg [flip]
st [current_byte], r1 ; we need to receive two bytes to form the word, so we are saving this byte before receiving the other
ld r2, [current_size] 
inc r2 ; increment the byte counter
st [current_size], r2

cmp r2, [size] ; did we receive all?
jz all_arrived_even

j skip ; return and wait for the next byte

all_arrived_even:
; at this moment, r1 holds the received byte
shl r1, 8 ; the upper byte is for the odd bytes
ld r2, [current_addr] ; r2 holds the current pointer in memory to store the received byte
st [r2], r1 ; store the incomplete word into the memory
all_arrived:

As you can see, the problem is with the word-oriented instructions and memory locations. Whenever a byte arrives, it must be saved, then combined with the next byte to arrive, and only then can the combination be stored in memory as a 16-bit value.
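To make the bookkeeping easier to see, here is a minimal Python sketch of what the listing above has to do. It is not code from the project: the function name `pack_bytes_into_words` is hypothetical, and the sketch only mirrors the byte order used in the listing (the first byte of each pair goes into the upper half of the word, and a trailing unpaired byte is shifted left by 8).

```python
def pack_bytes_into_words(data):
    """Combine a stream of bytes into 16-bit words, two bytes per word.

    Hypothetical helper mirroring the word-oriented loader above:
    the first (even) byte of each pair becomes the upper byte.
    """
    words = []
    pending = None  # plays the role of [current_byte] plus the [flip] flag
    for b in data:
        if pending is None:
            pending = b                       # even byte: remember it and wait
        else:
            words.append((pending << 8) | b)  # odd byte: complete the word
            pending = None
    if pending is not None:
        words.append(pending << 8)            # odd byte count: store an incomplete word
    return words


words = pack_bytes_into_words(b"\x12\x34\x56")
```

Three received bytes produce two words here; the second one is only half filled, which is exactly the special case that the all_arrived_even label handles.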

That was the reason for the redesign. I have introduced the ".b" suffix. If an instruction has the ".b" suffix, it is byte-oriented. This also changed the addressing. The data bus is still 16 bits wide, and all the memory operations are 16-bit, but the address range now covers 64KB instead of 64KW. That way, all the addresses in the assembler are byte addresses, not word addresses.

This means that an instruction without the ".b" suffix works with a word-oriented memory location, accessing the whole word at the given address. In that case, the address must be aligned to 16 bits (that is, it must be even).

For example, this instruction is word-oriented:

ld r0, [1000]

It loads the 16-bit content at address 1000 (two bytes: one from address 1000 and the other from address 1001) and stores that 16-bit value in the r0 register. The address must be even.

If the instruction has the ".b" suffix, then it is byte-oriented. The address in a byte-oriented instruction can be either even or odd. This instruction is byte-oriented:

ld.b r0, [1001]

It loads the 8-bit value (one byte) from the address 1001 into the r0 register.

If a 16-bit word is stored in memory, it is stored as big endian, with the upper byte at the even address and the lower byte at the odd address. For example, the number 0x1234 stored at address 1000 looks like this:

address   content
1000      0x12
1001      0x34
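The addressing rules described so far can be modelled in a few lines of Python. This is only a sketch under stated assumptions, not the actual CPU behavior: the class and method names are hypothetical, and raising an error on an odd word address is my assumption (the text only says the address must be even, not what the hardware does otherwise).

```python
class Memory:
    """Byte-addressed 64KB memory holding big-endian 16-bit words (a model only)."""

    def __init__(self, size=65536):
        self.cells = bytearray(size)

    def store_byte(self, addr, value):
        # models st.b: any address, even or odd
        self.cells[addr] = value & 0xFF

    def load_byte(self, addr):
        # models ld.b: returns the single byte at addr
        return self.cells[addr]

    def _check_aligned(self, addr):
        # assumption: a misaligned word access is treated as an error
        if addr % 2 != 0:
            raise ValueError("word address must be even: %d" % addr)

    def store_word(self, addr, value):
        # models st: upper byte at the even address, lower byte at the odd one
        self._check_aligned(addr)
        self.cells[addr] = (value >> 8) & 0xFF
        self.cells[addr + 1] = value & 0xFF

    def load_word(self, addr):
        # models ld: reassemble the word from its two bytes
        self._check_aligned(addr)
        return (self.cells[addr] << 8) | self.cells[addr + 1]


mem = Memory()
mem.store_word(1000, 0x1234)
upper = mem.load_byte(1000)  # byte at the even address
lower = mem.load_byte(1001)  # byte at the odd address
```

After the word store, the byte at address 1000 is 0x12 and the byte at 1001 is 0x34, matching the layout shown above.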

Now let's look at the same UART loader code, having byte-oriented instructions:

in r1, [64] ; get the byte from the uart into r1

; at this moment, r1 holds the received byte
; r2 holds the current pointer in memory to store the received byte
ld r2, [current_addr]
st.b [r2], r1 ; store the received byte into the memory
inc r2 ; move to the next location in memory
st [current_addr], r2 ; save the incremented value of the current address
ld r2, [current_size] ; increment the byte counter
inc r2
st [current_size], r2
cmp r2, [size] ; did we receive all?
jz all_arrived
j skip

all_arrived:

As you can see, the code is shorter and easier to understand.

The same idea can be applied to strings. Now that we have the byte-oriented instructions, dealing with byte-oriented strings is easy. This code prints the hello string on the screen:

VIDEO_0 = 2400 ; beginning of the text frame buffer
mov r2, 0 ; r2 is the index into the video memory
mov r1, hello ; r1 holds the address of the "HELLO WORLD" string
again:
; load r0 with the content of the memory location to which r1 points
ld.b r0, [r1]          
cmp r0, 0 ; if the current character is 0 (string terminator),
jz end ; go out of this loop 
st.b [r2 + VIDEO_0], r0 ; store the character at the VIDEO_0 + r2 
inc r1 ; move to the next character
add r2, 2 ; move to the next location in the video memory
j again ; continue with the loop
end:
halt
hello:
#str "HELLO WORLD\0"
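For readers who do not want to trace the assembly, the same loop can be sketched in Python. The names `print_string` and `memory` are hypothetical, and the sketch takes VIDEO_0 = 2400 and the step of 2 bytes per character directly from the listing; what the skipped byte of each screen cell is used for is not stated here, so the sketch leaves it untouched.

```python
VIDEO_0 = 2400  # beginning of the text frame buffer, as written in the listing


def print_string(memory, text, base=VIDEO_0):
    """Copy a zero-terminated byte string into every second byte from base."""
    index = 0                         # plays the role of r2
    for ch in text:                   # ch plays the role of r0 after ld.b
        if ch == 0:                   # string terminator: leave the loop
            break
        memory[base + index] = ch     # like st.b [r2 + VIDEO_0], r0
        index += 2                    # each screen cell is two bytes apart
    return index


memory = bytearray(65536)             # a 64KB byte-addressed memory model
written = print_string(memory, b"HELLO WORLD\0")
```

The byte-wise load and store are what keep this loop simple: each character is read and written on its own, with no need to unpack it from a 16-bit word first.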

Conclusion

This change in the design of the CPU led to much better assembler code. I haven't lost the word-oriented instructions, but I have gained a whole bunch of byte-oriented instructions. I did lose 64KB of memory (64KW was 128KB), but my FPGA didn't have 128KB of SRAM anyway.

Even if we try to make the whole code word-oriented, we cannot avoid 8-bit strings and protocols. That is why I have done this refactoring.

Here are the GitHub links:
- FPGAComputer
- FPGA Custom Assembler
- FPGA UART Loader (Raspbootin-like)
- FPGA Emulator