FPGA Computer Assembler
This is the second follow-up to my initial text about the FPGA Computer.
I use a fork of the customasm project for my FPGA-based CPU. It is on GitHub:
https://github.com/milanvidakovic/FPGAcustomasm
            
        
The address bus is 16 bits wide, addressing 65536 locations. The data bus is also 16 bits wide, but all addresses are 8-bit aligned, that is, each address refers to a single byte.
The examples below include a Hello World program and a UART echo demo. The echo demo waits for a character to arrive via serial UART (115200 baud, one start bit, one stop bit, no parity), prints that character on the screen, and finally echoes it back to the UART.
This 16-bit CPU has 8 general-purpose registers (r0 – r7), pc (program counter), sp (stack pointer), ir (instruction register), and h (higher word when multiplying, or remainder when dividing). Each register is 16 bits wide.
There are eleven groups of instructions:
| Group number | Group name | Group members | Group description |
| 0 | NOP/MOV/IN/OUT/PUSH/POP/RET/IRET/HALT/SWAP | nop; mov reg, xx; mov reg, reg; in reg, [xx]; out [xx], reg; push reg; push xx; pop reg; ret; iret; swap; halt | The most general group. Deals with putting values into registers, exchanging values between registers, I/O operations, stack operations, returning from subroutines, and register content swapping. NOP and HALT are also in this group. |
| 1 | JUMP | j xx; jc xx; jnc xx; jz xx; jnz xx; jo xx; jno xx; jp xx; jnp xx; jg xx; jge xx; js xx; jse xx | Jump to the given location. |
| 2 | CALL | call xx; callc xx; callnc xx; callz xx; callnz xx; callo xx; callno xx; callp xx; callnp xx; callg xx; callge xx; calls xx; callse xx | Calling a subroutine. Puts the return address on the stack before jumping to the subroutine. The subroutine needs to execute RET to return. |
| 3 | LOAD/STORE | ld reg, [xx]; ld reg, [reg]; ld reg, [reg + xx]; ld.b reg, [xx]; ld.b reg, [reg]; ld.b reg, [reg + xx]; st [xx], reg; st [reg], reg; st [reg + xx], reg; st.b [xx], reg; st.b [reg], reg; st.b [reg + xx], reg | Load from memory into a register (destination: register; source: memory address given by a number, by a register, or by register + number). Store a register into memory (destination: memory location given by a number, by a register, or by register + number). |
| 4 | ADD/SUB | add reg, reg; add reg, xx; add reg, [reg]; add reg, [xx]; add reg, [reg + xx]; add.b reg, [reg]; add.b reg, [xx]; add.b reg, [reg + xx]; sub reg, reg; sub reg, xx; sub reg, [reg]; sub reg, [xx]; sub reg, [reg + xx]; sub.b reg, [reg]; sub.b reg, [xx]; sub.b reg, [reg + xx] | Add and sub group. |
| 5 | AND/OR | and reg, reg; and reg, xx; and reg, [reg]; and reg, [xx]; and reg, [reg + xx]; and.b reg, [reg]; and.b reg, [xx]; and.b reg, [reg + xx]; or reg, reg; or reg, xx; or reg, [reg]; or reg, [xx]; or reg, [reg + xx]; or.b reg, [reg]; or.b reg, [xx]; or.b reg, [reg + xx] | And / or group. |
| 6 | XOR | xor reg, reg; xor reg, xx; xor reg, [reg]; xor reg, [xx]; xor reg, [reg + xx]; xor.b reg, [reg]; xor.b reg, [xx]; xor.b reg, [reg + xx] | Xor group. |
| 7 | SHL/SHR | shl reg, reg; shl reg, xx; shl reg, [reg]; shl reg, [xx]; shl reg, [reg + xx]; shl.b reg, [reg]; shl.b reg, [xx]; shl.b reg, [reg + xx]; shr reg, reg; shr reg, xx; shr reg, [reg]; shr reg, [xx]; shr reg, [reg + xx]; shr.b reg, [reg]; shr.b reg, [xx]; shr.b reg, [reg + xx] | Shift group. |
| 8 | MUL/DIV | mul reg, reg; mul reg, xx; mul reg, [reg]; mul reg, [xx]; mul reg, [reg + xx]; mul.b reg, [reg]; mul.b reg, [xx]; mul.b reg, [reg + xx]; div reg, reg; div reg, xx; div reg, [reg]; div reg, [xx]; div reg, [reg + xx]; div.b reg, [reg]; div.b reg, [xx]; div.b reg, [reg + xx] | Multiply / divide group. |
| 9 | INC/DEC | inc reg; inc [reg]; inc [xx]; inc [reg + xx]; inc.b [reg]; inc.b [xx]; inc.b [reg + xx]; dec reg; dec [reg]; dec [xx]; dec [reg + xx]; dec.b [reg]; dec.b [xx]; dec.b [reg + xx] | Increment and decrement group. |
| 10 | CMP/NEG | cmp reg, reg; cmp reg, xx; cmp reg, [reg]; cmp reg, [xx]; cmp reg, [reg + xx]; cmp.b reg, [reg]; cmp.b reg, [xx]; cmp.b reg, [reg + xx]; neg reg; neg [reg]; neg [xx]; neg [reg + xx]; neg.b [reg]; neg.b [xx]; neg.b [reg + xx] | Compare / negate group. |
All instructions are two or four bytes long. Since the data bus is 16 bits wide, a complete instruction is fetched in either one or two memory reads. This means that, since SRAM is used, a complete instruction is fetched, decoded, and executed in three or more clock cycles.
All instructions share a similar format:
| from | to | what | group |
| bbbb (0-7: r0-r7, 8: sp, 9: h) | bbbb (0-7: r0-r7, 8: sp, 9: h) | 0000 (0 => mov regx, regy) | 0000 |
In the first byte, the lower four bits designate the destination register (to), while the upper four bits identify the source register (from). In the second byte, the lower four bits identify the instruction group (group) and the upper four bits give the type of the instruction within that group (what).
For example, the mov r2, r1 instruction is encoded as:
binary: 0001 0010 0000 0000
hex: 12 00
The source is r1 (0001), the destination is r2 (0010), the group is 0 (0000) and the type is mov regx, regy (0000).
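The second example is the mov r1, 0x0f instruction:
binary: 0000 0001 0010 0000, 0000 0000 0000 1111
hex: 01 20, 00 0f
Here the destination is r1 (0001), the type field is 0010, the group is 0 (0000), and the constant 0x0f follows in the next two bytes, which makes this a four-byte instruction.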
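The two encodings above can be reproduced with a short Python sketch. It is only an illustration of the bit layout, not the assembler's actual code: the function names are made up, and the type value 2 for mov reg, xx is simply read off the second example.

```python
def encode_header(src: int, dst: int, what: int, group: int) -> bytes:
    """Pack the four 4-bit fields into the two-byte instruction header."""
    for field in (src, dst, what, group):
        if not 0 <= field <= 0xF:
            raise ValueError("each field must fit in four bits")
    first = (src << 4) | dst      # upper nibble: from, lower nibble: to
    second = (what << 4) | group  # upper nibble: what, lower nibble: group
    return bytes([first, second])


def encode_with_constant(src: int, dst: int, what: int, group: int,
                         constant: int) -> bytes:
    """Header followed by a 16-bit constant, high byte first (as in 00 0f)."""
    return encode_header(src, dst, what, group) + constant.to_bytes(2, "big")


# mov r2, r1: from = r1, to = r2, what = 0, group = 0
mov_r2_r1 = encode_header(src=1, dst=2, what=0, group=0)
# mov r1, 0x0f: to = r1, what = 2 (taken from the example), group = 0
mov_r1_0f = encode_with_constant(src=0, dst=1, what=2, group=0, constant=0x0F)

print(mov_r2_r1.hex(" "))  # 12 00
print(mov_r1_0f.hex(" "))  # 01 20 00 0f
```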
The Load instructions load a value from memory into a register. The Store instructions store the value of a register into the given memory location. The memory location is given as a number (ld r1, [0x0a] loads the content of the 0x0a location into the r1 register), as the value of a register (ld r1, [r2] loads the content of the memory location to which r2 points), or as the sum of a register and a number (ld r1, [r2 + 0x0f]).
ld r1, [0x0a] loads two bytes from the 0x0a location. The address (0x0a) must be even if we work with 16-bit values.
If we want to load a byte from a location, we need to use the ".b" suffix:
ld.b r1, [0x0a]
The code above will load a byte from the 0x0a location into the r1 register.
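The difference between ld and ld.b can be modelled in a few lines of Python. This is a rough sketch under two assumptions that the text does not state: the byte at the lower address is taken as the high byte of a 16-bit value, and an odd address for a 16-bit load is treated as an error.

```python
memory = bytearray(65536)  # one byte per address, 16-bit address space
memory[0x0A] = 0x12
memory[0x0B] = 0x34


def load_byte(address: int) -> int:
    """Model of ld.b: read the single byte at the address."""
    return memory[address]


def load_word(address: int) -> int:
    """Model of ld: read two bytes starting at an even address."""
    if address % 2 != 0:
        # Assumption: the model rejects odd addresses for 16-bit loads.
        raise ValueError("16-bit loads need an even address")
    # Assumption: the byte at the lower address is the high byte.
    return (memory[address] << 8) | memory[address + 1]


print(hex(load_word(0x0A)))  # 0x1234
print(hex(load_byte(0x0A)))  # 0x12
```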
Hello World example
Let's look at the Hello World example:
; this program will print HELLO WORLD
#addr 0x400
VIDEO_0 = 2400 ; beginning of the text frame buffer
mov r2, 0 ; r2 is the index into the video memory
mov r1, hello ; r1 holds the address of the "HELLO WORLD" string
again:
ld.b r0, [r1] ; load r0 with the content of the memory location to which r1 points (current character)
cmp r0, 0 ; if the current character is 0 (string terminator),
jz end ; go out of this loop
st [r2 + VIDEO_0], r0 ; store the character at the VIDEO_0 + r2
inc r1 ; move to the next character
add r2, 2 ; move to the next location in the video memory
j again ; continue with the loop
end:
halt
hello:
#str "HELLO WORLD!\0"
First we define the constant VIDEO_0 with the value of 2400. This is the address of the text-based frame buffer; it points to the first character in the video memory.
Then we set r2 to 0 and r1 to the address of the hello string. Note that the mov instruction is used either to move a number into a register (for example, mov r2, 0) or to move the value of the source register to the destination register (for example, mov r1, r2).
Next, we enter the loop, which starts at the again label. In the loop we load the byte at the current address (initially the first character of the hello string) and compare it with zero to check for the end of the string. If it is not zero, we store it at the current location in the video memory, advance r1 by one character and r2 by two bytes, and jump back to again.
When all the characters are printed on the screen, the CPU halts (halt instruction).
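To make the loop easier to follow, here is the same logic traced in Python over a flat byte array. It is a sketch, not an emulator: the address 0x0500 for the hello label is made up, and the way the 16-bit store is split into two bytes (high byte first) is an assumption.

```python
VIDEO_0 = 2400                      # start of the text frame buffer
memory = bytearray(65536)

HELLO = 0x0500                      # hypothetical address of the string
text = b"HELLO WORLD!\0"
memory[HELLO:HELLO + len(text)] = text

r1 = HELLO                          # address of the current character
r2 = 0                              # offset into the video memory
while True:
    r0 = memory[r1]                 # ld.b r0, [r1]
    if r0 == 0:                     # cmp r0, 0 / jz end
        break
    # st [r2 + VIDEO_0], r0 stores a 16-bit word; writing the high byte
    # first is an assumption of this sketch.
    memory[VIDEO_0 + r2] = r0 >> 8
    memory[VIDEO_0 + r2 + 1] = r0 & 0xFF
    r1 += 1                         # inc r1
    r2 += 2                         # add r2, 2

# Every second byte of the frame buffer now holds one character.
shown = bytes(memory[VIDEO_0 + 1:VIDEO_0 + r2:2])
print(shown.decode("ascii"))        # HELLO WORLD!
```

The stride of two in the video memory matches the add r2, 2 step: each character occupies one 16-bit cell.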
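Interrupts
The UART echo demo is interrupt-driven. Its main program sets up the stack pointer and the cursor, installs the UART interrupt handler, and then halts: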
#addr 0x400
; ########################################################
; REAL START OF THE PROGRAM
; ########################################################
mov sp, 1000
mov r0, 14
st [cursor], r0
; set the IRQ handler for UART to our own IRQ handler
mov r0, 1
mov r1, 16
st [r1], r0
mov r0, irq_triggered
mov r1, 18
st [r1], r0
halt
The code above sets the interrupt handling routine (irq_triggered) for the UART. This is IRQ1, and its handling routine is at address 16 (0x0010). Whenever the serial UART subsystem receives a byte, the CPU jumps to address 0x0010. At that address we have placed a JUMP instruction (j irq_triggered): the word at 0x0010 holds 0x0001 (the JUMP instruction opcode), and the word at 0x0012 holds the address of the irq_triggered routine (mov r0, irq_triggered followed by st [r1], r0).
That way, we have prepared the UART interrupt routine, and the main program halts. The rest of the program is in the interrupt routine. Let's look at the interrupt routine:
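The handler installation amounts to writing two 16-bit words into low memory. The Python sketch below mimics those two stores and then decodes the result with the instruction format described earlier. The handler address 0x0420 is made up, and storing the high byte first is an assumption, although it is consistent with the opcode 0x0001 decoding as group 1 (JUMP).

```python
memory = bytearray(65536)
IRQ_TRIGGERED = 0x0420   # hypothetical address of the irq_triggered label


def store_word(address: int, value: int) -> None:
    """Model of st with a 16-bit value; high byte first is an assumption."""
    memory[address] = (value >> 8) & 0xFF
    memory[address + 1] = value & 0xFF


store_word(16, 0x0001)          # mov r0, 1 / mov r1, 16 / st [r1], r0
store_word(18, IRQ_TRIGGERED)   # mov r0, irq_triggered / mov r1, 18 / st [r1], r0

# Decode what now sits at the UART interrupt address.
group = memory[17] & 0x0F                 # lower nibble of the second byte
what = memory[17] >> 4                    # upper nibble of the second byte
target = (memory[18] << 8) | memory[19]   # jump target
print(group, what, hex(target))           # 1 0 0x420
```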
; ##################################################################
; Subroutine which is called whenever some byte arrives at the UART
; ##################################################################
irq_triggered:
push r0
push r1
push r2
push r5
push r6
in r1, [64] ; r1 now holds the received byte from the UART (address 64 decimal)
ld r6, [cursor]
st [r6 + VIDEO_0], r1 ; store the UART character at the VIDEO_0 + r6
add r6, 2 ; move to the next location in the video memory
st [cursor], r6
loop2:
in r5, [65] ; tx busy in r5
cmp r5, 0
jz not_busy ; if not busy, send back the received character
j loop2
not_busy:
out [66], r1 ; send the received character to the UART
skip:
pop r6
pop r5
pop r2
pop r1
pop r0
iret
When the interrupt happens, the irq_triggered routine first pushes some registers on the stack, obtains the received byte from the UART (in r1, [64]), prints it on the screen, and then sends that character back through the UART (out [66], r1). Before sending, it waits in loop2 until the transmitter is free: if the UART is busy sending a character, in r5, [65] sets r5 to 1; otherwise, r5 is 0. Finally, the routine pops the registers from the stack and returns (iret instruction).
The difference between iret and ret is that ret pops the return address from the stack and jumps to it (return from a called subroutine), while iret pops the return address, pops the flags, and then jumps to the obtained address. The interrupt routine might change the flags, so they are saved before the interrupt routine is invoked and restored during the iret execution.
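The contrast between ret and iret can be shown with a list used as a stack. This is a toy model: the flag values and the return address are made up, and the push order on interrupt entry (flags first, return address last) is assumed so that it mirrors the pop order described above.

```python
def ret(stack: list, flags: int):
    """ret: pop the return address; the flags stay as they are."""
    pc = stack.pop()
    return pc, flags


def iret(stack: list, flags: int):
    """iret: pop the return address, then pop the saved flags."""
    pc = stack.pop()
    flags = stack.pop()
    return pc, flags


handler_flags = 0b0100              # flags as the routine left them

# Return from an interrupt: the saved flags (0b0001) are restored.
print(iret([0b0001, 0x0410], handler_flags))   # (1040, 1)

# Return from a subroutine: only the address is popped.
print(ret([0x0410], handler_flags))            # (1040, 4)
```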
All the examples are stored in the FPGAcustomasm project on GitHub:
https://github.com/milanvidakovic/FPGAcustomasm/tree/master/examples/FPGA/raspbootin