Get-Set window position in Java Swing

I frequently need to save a window's position and size when an application shuts down, so that they can be restored at the next startup. Most applications do this so that they reopen exactly where you last closed them.

Java Swing has a standard API for reading and setting the window (frame) position and size. You simply call JFrame.getLocation() or JFrame.getSize(), and the corresponding JFrame.setLocation(...) and JFrame.setSize(...).

This code works correctly on a single-monitor configuration. However, if you have multiple monitors with different scaling, the whole API breaks down and you get inconsistent results. Restoring the saved position and size leaves the frame in the wrong location and with the wrong size.

Read more…

DMA on FPGA computer

This is a followup of my original post.

So far, my FPGA computer didn't have DMA (Direct Memory Access). There were two places in my computer where DMA would fit: the SD card interface and the Ethernet interface. I have decided to implement a simple DMA controller, first for the SD card interface and later for the Ethernet adapter.

Let's look at the current SD card implementation. I have written about it here. In short, the SD card adapter is connected to my internal SPI module, so all communication with the SD card goes via SPI. In particular, when reading a sector from the SD card, my driver sends 0xFF to the SPI and then reads a byte from it, and it repeats this for all 512 bytes of the sector. The SPI read function was implemented using interrupts: when a byte arrives from the SD card, the SPI controller triggers an interrupt and the CPU has to handle that single byte. This can be quite slow, because the CPU transfers all 512 bytes one by one, with the SPI interrupt firing 512 times.

The initial code for receiving count bytes from the SD card without DMA is quite simple. As previously stated, it receives count bytes from the SPI port of the SD card and stores them in the buffer. It looks like this:

for (uint16_t i = 0; i < count; i++) {
        dst[i] = spiRec();
}

The spiRec() function sends 0xFF to the SPI port, and then waits for the byte to arrive from the SPI. It looks like this:

uint8_t spiRec(void) {
        send_spi(SPI0, 0xFF);
        return read_spi(SPI0);
}
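
To make the cost of this byte-by-byte approach concrete, here is a small host-side sketch that simulates the loop above. The fake sector buffer, the transfer counter, and the simplified send_spi()/read_spi() signatures (the SPI0 argument is dropped) are assumptions made for illustration; only the shape of the loop and of spiRec() comes from the driver.

```c
#include <stdint.h>
#include <stddef.h>

#define SECTOR_SIZE 512

/* Hypothetical stand-in for the SD card: a sector that is shifted out
   one byte per SPI transfer. */
static uint8_t  fake_sector[SECTOR_SIZE];
static size_t   fake_pos;
static uint8_t  spi_rx;           /* last byte received by the SPI module */
static unsigned transfer_count;   /* number of single-byte transfers so far */

/* Simulated SPI write: clocking a byte out also clocks one byte in. */
static void send_spi(uint8_t value) {
    (void)value;                  /* the card ignores the 0xFF filler byte */
    spi_rx = fake_sector[fake_pos++ % SECTOR_SIZE];
    transfer_count++;
}

static uint8_t read_spi(void) {
    return spi_rx;
}

static uint8_t spiRec(void) {
    send_spi(0xFF);
    return read_spi();
}

/* Same loop as in the driver: one SPI round trip per received byte. */
static void read_bytes(uint8_t *dst, uint16_t count) {
    for (uint16_t i = 0; i < count; i++) {
        dst[i] = spiRec();
    }
}
```

Reading one sector with read_bytes() performs 512 separate transfers. On the real hardware each of them raises an interrupt, and that is the overhead a DMA controller is meant to remove.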

Read more…

Floating-point implementation in GCC for my FPGA computer

This is a followup of my original post.

My FPGA computer supports floating-point instructions, but the GCC port doesn't. If we try to compile the following code:

float d = 0.5;
float e = 0.2;
float f;

f = d - e;

The compiler will generate the following assembly code:

# small.c:10:   float d = 0.5;
mov.w   r0, 1056964608  # tmp43,
st.w    [r13 + (-12)], r0   # d, tmp43
# small.c:11:   float e = 0.2;
mov.w   r0, 1045220557  # tmp44,
st.w    [r13 + (-16)], r0   # e, tmp44
# small.c:14:   f = d - e;
ld.w    r0, [r13 + (-16)]   # tmp45, e
st.w    [sp + (4)], r0  #, tmp45
ld.w    r0, [r13 + (-12)]   # tmp46, d
st.w    [sp], r0    #, tmp46
call    __subsf3        #
mov.w   r1, r0  # tmp47,
mov.w   r0, r1  # tmp48, tmp47
st.w    [r13 + (-20)], r0   # f, tmp48
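
The two large integer constants in this listing are the IEEE 754 single-precision bit patterns of 0.5 and 0.2, and __subsf3 is the software routine for single-precision subtraction that GCC calls when it does not emit a hardware instruction. A minimal sketch for checking the constants on a PC, assuming that float is a 32-bit IEEE 754 value there (float_bits is a hypothetical helper, not part of the port):

```c
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit pattern of a float (assumes IEEE 754 binary32). */
static uint32_t float_bits(float value) {
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);   /* well-defined way to reinterpret */
    return bits;
}
```

float_bits(0.5f) gives 1056964608 (0x3F000000) and float_bits(0.2f) gives 1045220557 (0x3E4CCCCD), which are exactly the operands of the two mov.w instructions.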

Read more…

Networking with the FPGA computer

This is a followup of my original post.

As I mentioned in the SPI-related post, I have added the SPI interface to my FPGA computer. Not one, but two: one for the SD card, and the other one for the Ethernet card. Today I am going to talk about the Ethernet.

First of all, I have used the ENC28J60 module, which I also use for Ethernet connectivity on my Raspberry Pi Zero and Arduino/ESP32. This is a rather simple module, which uses SPI as an interface to the host computer. Since I have already used this module with the Arduino and ESP32, I have decided to reuse the corresponding Arduino library for this module and to adjust it to work with my FPGA computer.

The library I used for the Arduino is: https://github.com/njh/EtherCard

This library is written in C++. Since I haven't finished porting GCC to my FPGA, I don't have support for C++. This means that I had to convert the code from C++ to pure C. When I finished that, the only thing left to do was to replace the Arduino-based SPI code with my FPGA SPI code. For example, one of the original functions was:

static void writeOp (byte op, byte address, byte data) {
    enableChip();
    SpiPtr->beginTransaction(SPISettings(spiClk, MSBFIRST, SPI_MODE0));
    SpiPtr->transfer(op | (address & ADDR_MASK));
    SpiPtr->transfer(data);
    SpiPtr->endTransaction();
    disableChip();
}

My code is:

void enc28j60WriteOp(uint8_t op, uint8_t address, uint8_t data)
{
        chipSelectLowE();
        // issue write command
        spiSendE(op | (address & ADDR_MASK));
        // write data
        spiSendE(data);
        chipSelectHighE();
}
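
To show what actually goes over the wire, here is a host-side sketch of the same function with the SPI calls replaced by test doubles that record the bytes. The recording buffer and the bodies of chipSelectLowE(), chipSelectHighE() and spiSendE() are assumptions for illustration only. The two constants follow the ENC28J60 SPI instruction set, where the top three bits of the first byte select the operation and the low five bits carry the register address.

```c
#include <stdint.h>

#define ENC28J60_WRITE_CTRL_REG 0x40   /* "write control register" opcode */
#define ADDR_MASK               0x1F   /* low five bits: register address */

/* Hypothetical test doubles that record the SPI traffic. */
static uint8_t sent[8];
static int     sent_len;
static int     chip_selected;

static void chipSelectLowE(void)  { chip_selected = 1; }
static void chipSelectHighE(void) { chip_selected = 0; }

static void spiSendE(uint8_t b) {
    if (chip_selected && sent_len < 8) {
        sent[sent_len++] = b;
    }
}

static void enc28j60WriteOp(uint8_t op, uint8_t address, uint8_t data) {
    chipSelectLowE();
    spiSendE(op | (address & ADDR_MASK));   /* command byte */
    spiSendE(data);                         /* payload byte */
    chipSelectHighE();
}
```

Calling enc28j60WriteOp(ENC28J60_WRITE_CTRL_REG, 0x1F, 0x04) records two bytes: 0x5F (the opcode 0x40 combined with register address 0x1F, which is ECON1) followed by the data byte 0x04.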

Read more…

Adding PS/2 mouse to my FPGA computer

This is a followup of my original post.

So far I had only a PS/2 keyboard on my FPGA computer. The time has come to add the mouse, too. Without investigating how a PS/2 mouse works, I first tried to plug the mouse into my PS/2 keyboard connector and watch what would come from it. It didn't work. The keyboard worked, but the mouse didn't. I expected that the mouse would send bytes as I moved or clicked, but it didn't. After a brief investigation, I found out that the PS/2 mouse needs an initialization in order to start sending bytes to the computer.

PS/2 is actually a bidirectional interface. Both computer and mouse/keyboard can send bytes to the other. Well, the initialization sequence for the mouse actually means that the computer needs to send one byte to the mouse. Unfortunately, that is not so simple. In order to send a command to the mouse, host (computer) needs to set both data and clock lines low for a given period of time, then to release both lines, and then to start setting bits of the command in synchronization with the clock that has just started to arrive from the mouse.
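
The byte that the host clocks out travels in an 11-bit frame: a start bit (0), eight data bits with the least significant bit first, an odd parity bit, and a stop bit (1). The sketch below only builds that frame in software to illustrate the layout. It is not taken from any PS/2 controller, and ps2_host_frame() and odd_parity() are hypothetical helpers. The value 0xF4 used in the note afterwards is the standard PS/2 "enable data reporting" command, which makes a mouse start sending movement packets.

```c
#include <stdint.h>

/* Odd parity: the parity bit makes the total number of 1 bits
   (data plus parity) odd. */
static uint8_t odd_parity(uint8_t data) {
    uint8_t ones = 0;
    for (int i = 0; i < 8; i++) {
        ones += (data >> i) & 1u;
    }
    return (ones % 2 == 0) ? 1 : 0;
}

/* Build the 11-bit host-to-device frame. Bit 0 of the result is sent
   first: start (0), data bits 0..7, parity, stop (1). */
static uint16_t ps2_host_frame(uint8_t data) {
    uint16_t frame = 0;                        /* bit 0: start bit = 0 */
    frame |= (uint16_t)data << 1;              /* bits 1..8: data, LSB first */
    frame |= (uint16_t)odd_parity(data) << 9;  /* bit 9: odd parity */
    frame |= (uint16_t)1 << 10;                /* bit 10: stop bit = 1 */
    return frame;
}
```

For the command 0xF4 the data byte already contains five 1 bits, so the parity bit is 0 and ps2_host_frame(0xF4) returns 0x5E8.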

Fortunately, there is a module that already does all these steps, and can be found here.

I have replaced my original PS2 module with this one and now I have two ports in my computer:

// ####################################
// PS/2 keyboard instance
// ####################################
wire [7:0] ps2_data;
wire ps2_received;
reg [7:0] ps2_data_r;
PS2_Controller #(.INITIALIZE_MOUSE(0)) PS2 (
    // Inputs
    .CLOCK_50           (CLOCK_50),
    .reset              (~KEY[0]),
    // Bidirectionals
    .PS2_CLK            (gpio0[33]),
    .PS2_DAT            (gpio0[31]),
    // Outputs
    .received_data      (ps2_data),
    .received_data_en   (ps2_received)
); 
// ####################################
// PS/2 mouse instance
// ####################################
wire [7:0] ps2_data_mouse;
wire ps2_received_mouse;
reg [7:0] ps2_data_r_mouse;
PS2_Controller PS2_mouse (
    // Inputs
    .CLOCK_50           (CLOCK_50),
    .reset              (~KEY[0]),
    // Bidirectionals
    .PS2_CLK            (gpio0[2]),
    .PS2_DAT            (gpio0[4]),
    // Outputs
    .received_data      (ps2_data_mouse),
    .received_data_en   (ps2_received_mouse)
); 

Read more…

To BLIT or not to BLIT

This is a followup of my original post.

I have recently implemented the BLIT instruction for my FPGA computer. It is the simplest version of BLIT: copy the given number of bytes from the source memory location to the destination memory location. The syntax is like this:

mov.w r1, 1024  # destination address is in r1
mov.w r2, 9024  # source address is in r2
mov.w r3, 8000  # number of bytes is in r3
blit            # copy bytes

Registers r1, r2 and r3 are hardcoded. Later I might make it more flexible.

Results are quite impressive. When I copy 32KB using memcpy (not using BLIT), it takes approximately 100 milliseconds. When I use the BLIT instruction, it takes one millisecond!

How is BLIT implemented? Here is the Verilog code:

4'b1000: begin
    // BLIT (r1, r2, r3) - r1 - dst; r2 - src; r3 - count
    case (mc_count)
        0: begin
            addr <= regs[2] >> 1;
            regs[2] <= regs[2] + 2;
            regs[3] <= regs[3] - 2;
            mc_count <= 1;
            next_state <= EXECUTE;
            state <= READ_DATA;
        end
        1: begin
            addr <= regs[1] >> 1;
            data_to_write <= data_r;
            regs[1] <= regs[1] + 2;
            next_state <= EXECUTE;
            state <= WRITE_DATA;
            if (regs[3] <= 0) begin
                mc_count <= 2;
            end
            else 
                mc_count <= 0;
        end
        2: begin
            state <= CHECK_IRQ;
            pc <= pc + 2;
        end
    endcase
end

In the code above we see that the CPU starts a memory read at the address pointed to by the r2 register in the first mc_count step. Then it obtains the word (two bytes) from memory and writes it to the address pointed to by the r1 register. Both r1 and r2 are incremented by two and the r3 register is decremented by two; when it reaches zero, the instruction finishes.
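
The same sequence of steps can be written as a small software model in C. This is only a sketch of the behaviour, not something that runs on the FPGA: blit_model() is a hypothetical name, memory is modelled as an array of 16-bit words, and the count is treated as a signed value, which is an assumption about how the comparison with zero behaves.

```c
#include <stdint.h>

/* Software model of the BLIT microcode. mem is word-addressed (16-bit
   words), while dst, src and count are in bytes, as in r1, r2 and r3.
   Assumes even addresses and an even count. */
static void blit_model(uint16_t *mem, uint32_t dst, uint32_t src, int32_t count) {
    do {
        uint16_t word = mem[src >> 1];   /* step 0: read the word at r2 */
        src   += 2;
        count -= 2;
        mem[dst >> 1] = word;            /* step 1: write the word at r1 */
        dst   += 2;
    } while (count > 0);                 /* step 2 is entered once r3 <= 0 */
}
```

As in the microcode, the check happens after the first word has been moved, so at least one word is always copied.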

Conclusion

The BLIT instruction does not execute in parallel with the CPU. It blocks the CPU while executing. Even with this constraint, it is approximately a hundred times faster than copying bytes across the memory using the memcpy function. Therefore, it is worth using.

SPI interface on my FPGA computer

This is a followup of my original post.


The SPI interface is something of a standard when it comes to connecting various peripherals to a computer (or, at least, to a microcontroller). There is also the I2C interface, but I will focus on SPI in this post.

SPI stands for Serial Peripheral Interface. It is organized as master-slave communication. If we presume that our FPGA computer is the master, then the peripheral will be the slave.

It usually has four important pins:
1. MISO (Master In Slave Out) - a wire which is used to transport data from slave to the master device,
2. MOSI (Master Out Slave In) - a wire which is used to transport data from master to the slave device,
3. SCK - clock (all the data transport is synchronized using this clock line), and
4. SS (Slave Select) - when active, the slave is selected (sometimes it is called CS - chip select). With this wire, it is possible to connect several peripherals to the same three mentioned wires (MISO, MOSI and SCK) and to have separate SS wires to each peripheral.
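
How MISO, MOSI and SCK work together can be sketched in a few lines of C. The function below simulates one 8-bit exchange between the shift register of the master and the shift register of the slave. It is an illustration only: spi_exchange() is a hypothetical name, the most significant bit is assumed to go first (the common default), and the SS line is left out.

```c
#include <stdint.h>

/* Simulate one 8-bit SPI exchange, MSB first. On every clock pulse the
   master puts its top bit on MOSI, the slave puts its top bit on MISO,
   and each side shifts the received bit into the bottom of its register. */
static void spi_exchange(uint8_t *master_reg, uint8_t *slave_reg) {
    for (int clk = 0; clk < 8; clk++) {
        uint8_t mosi = (*master_reg >> 7) & 1u;   /* master out, slave in */
        uint8_t miso = (*slave_reg  >> 7) & 1u;   /* master in, slave out */
        *master_reg = (uint8_t)((*master_reg << 1) | miso);
        *slave_reg  = (uint8_t)((*slave_reg  << 1) | mosi);
    }
}
```

After eight clock pulses the two registers have swapped contents. This is also why a master that only wants to read a byte still has to send one, typically a filler value such as 0xFF.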

Why did I choose to use SPI on my computer? First of all, SD cards have SPI built in. This means that every SD card is actually an SPI slave device. Next, I use the ENC28J60 Ethernet module for Ethernet connectivity on my Arduino/ESP32/Raspberry Pi Zero devices. That module has an SPI interface, too.

Read more…

TinyBasic made for my FPGA platform

This is a followup of my original post.


In my previous post, I described how I modified the GCC cross-compiler, made originally for the moxie platform, to generate assembly code for my FPGA platform. I have used my new cross-compiler to make a port of TinyBasic for my platform. I have downloaded the TinyBasic C code and modified it to be a bit more programmer-friendly. That port can be found here:

https://github.com/milanvidakovic/FPGABasic

Besides the standard BASIC commands, I had the freedom to invent my own commands and to play with them. First of all, I have created a MODE command which is used to set the video card mode:
0 - text mode
1 - graphics mode of 640x480x2 colors, and
2 - graphics mode of 320x240x8 colors.

Besides MODE command, I now have the following graphics commands:
- PLOT x, y, color
- LINE x1, y1, x2, y2, color
- CIRCLE x, y, r
- DRAW x, y, "TEXT"

I have also added two key-related functions: KEY() and ISKEY(). Both return the virtual key code of a pressed key, but KEY() is blocking (it waits until a key is pressed), while ISKEY() returns immediately with the virtual code of the last key pressed.

I have also played with the file system on my "hard disk". I have created the following commands:
- DIR - lists the content of the "hard disk" root folder,
- LOAD PROGRAM.BAS - loads a BASIC program into the computer memory,
- SAVE PROGRAM.BAS - saves a BASIC program on the "hard disk",
- EXEC PROGRAM.BIN - loads and executes a binary executable,
- SYS ADDRESS - executes a machine program loaded at the given address.

The BASIC now boots from the SD card and can be used immediately. Here is the video of the computer booting from the SD card into the BASIC:


Modifying GCC to work with my FPGA computer

This is a followup of my original post.

In this text I will talk about modifying the GCC compiler to work with my FPGA platform. I wanted to make a cross-compiler that would be able to compile C programs for my FPGA platform. This post describes how far I have got. Currently, the modified GCC compiler produces assembly code for my platform as the result of compiling a C program.

By a cross-compiler, I mean a compiler that compiles C code on my PC but produces an executable for my FPGA computer. GCC already supports many targets, and all you have to do is choose the appropriate target when building GCC. That way, you build a cross-compiler that produces executables for the target platform. However, if you have created a new platform, you first need to add support for it to GCC and then build a cross-compiler for it. That step is very complicated. Adding your own platform to GCC is an almost impossible job if you have not done something like that before (I haven't).

Read more…

Cache implemented on my FPGA computer

Introduction

This is a followup of my original post.

My FPGA computer uses SDRAM as its main memory. It has static RAM too, but most of it is used as dual-port RAM for the VGA video subsystem. The SDRAM is a 32MB memory with a 16-bit data bus, and it usually takes about six clock cycles for a read and the same number of cycles for a write. The clock is 100MHz. Knowing all of this, it was about time to do some performance measurement:

I have made a simple program that counts from 1 to 10 000 000. If that program is loaded in SDRAM, it takes about 15 seconds to finish. However, if I load it in static RAM, it takes about 6 seconds to finish. So, there was an obvious motivation to try to implement the cache controller. You can look at the Verilog code here:
https://github.com/milanvidakovic/FPGAComputer32/blob/master/cpu.v

Implementation

I haven't used all of the static RAM in my FPGA computer, so I was able to make about 8KB of L1 cache. Here are the details:
- I have 4096 cache lines, each having two bytes. That is 8KB of cache.
- For each cache line, I have added a 12-bit TAG, used for the direct mapping of the cache line. That consumes an additional 5632 bytes of static RAM.
- I have implemented a write-through policy, since I didn't have enough resources to make a write-back policy. I will try to make write-back, but it requires a complete rework of the cache controller, so, perhaps later...
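
The list above can be sketched as a software model of a direct-mapped, write-through cache. This is a simplified illustration in C, not a translation of the Verilog: the function names, the valid flags, the small stand-in array for the SDRAM, and the choice not to allocate a line on a write miss are all assumptions. The address split follows from the numbers above: the low 12 bits of a word address select one of the 4096 lines, and the bits above them form the tag.

```c
#include <stdint.h>
#include <stdbool.h>

#define CACHE_LINES 4096u            /* 4096 lines of one 16-bit word each */
#define INDEX_BITS  12u
#define MEM_WORDS   (1u << 16)       /* small stand-in for the SDRAM */

static uint16_t sdram[MEM_WORDS];
static uint16_t line_data[CACHE_LINES];
static uint16_t line_tag[CACHE_LINES];
static bool     line_valid[CACHE_LINES];   /* assumed; not described above */

/* Read one word; *hit reports whether the cache already held it. */
static uint16_t cache_read(uint32_t word_addr, bool *hit) {
    uint32_t index = word_addr & (CACHE_LINES - 1u);    /* low 12 bits */
    uint16_t tag   = (uint16_t)(word_addr >> INDEX_BITS);
    *hit = line_valid[index] && line_tag[index] == tag;
    if (!*hit) {                                        /* miss: fill the line */
        line_data[index]  = sdram[word_addr % MEM_WORDS];
        line_tag[index]   = tag;
        line_valid[index] = true;
    }
    return line_data[index];
}

/* Write-through: memory is always updated; a cached copy is kept in sync. */
static void cache_write(uint32_t word_addr, uint16_t value) {
    uint32_t index = word_addr & (CACHE_LINES - 1u);
    uint16_t tag   = (uint16_t)(word_addr >> INDEX_BITS);
    sdram[word_addr % MEM_WORDS] = value;
    if (line_valid[index] && line_tag[index] == tag) {
        line_data[index] = value;
    }
}
```

Two word addresses that differ by a multiple of 4096 share a cache line, so reading one evicts the other. Because every write goes straight to memory, an evicted line never has to be written back, which is what keeps the write-through controller simple.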

Read more…