How to: many-accelerator SoC - Tutorial

This guide illustrates how to generate and test a many-accelerator SoC. The steps that are the same as for previous tutorials will not be covered in detail again.

insert image here: tutorial overview

Target FPGA board

For this tutorial we target the Xilinx VCU118 board, which is based on the Virtex UltraScale+ FPGA, but all the steps described here are identical for every supported FPGA target.

# Move to the Xilinx VCU118 design folder
cd <esp>/socs/xilinx-vcu118-xcvu9p

RTL generation: accelerators

The ESP accelerator tile can host accelerators coming from different design flows: SystemC with Stratus HLS, C with Vivado HLS, Chisel, RTL, and Keras/TensorFlow or PyTorch with hls4ml. Each flow has a different level of integration and support in ESP. The many-accelerator SoC designed in this tutorial contains accelerators from two design flows, both based on high-level synthesis (HLS): SystemC accelerators synthesized with Stratus HLS and C accelerators synthesized with Vivado HLS. This tutorial uses predesigned accelerators available in the ESP release in the accelerators/ folder.

  • Stratus HLS accelerators:
    • dummy: sample accelerator that performs a simple element-wise operation
    • sort: a combination of bitonic and merge sort of input arrays
    • spmv: sparse matrix-vector multiplication
    • visionchip: pipeline of computer vision kernels for the night-vision domain
  • Vivado HLS accelerators:
    • adder: sample accelerator that performs a simple element-wise operation

None of the steps in this tutorial depends on the accelerator design flow (Stratus HLS vs. Vivado HLS). After HLS, the flow is agnostic to how each accelerator was designed.

The first step is to generate the RTL for all these accelerators by running HLS on their source code: Stratus HLS for the accelerators in <esp>/accelerators/stratus_hls and Vivado HLS for those in <esp>/accelerators/vivado_hls.

make dummy-hls
make sort-hls
make spmv-hls
make visionchip-hls
make adder-hls
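The five commands above follow one pattern, so they can be wrapped in a loop. The following is a minimal sketch, not part of the ESP release: the `DRY_RUN` switch and the `run_hls` function are hypothetical helpers introduced here for illustration. By default it only prints the commands; set `DRY_RUN=0` inside the design folder to actually launch HLS.

```shell
#!/bin/sh
# Accelerator names taken from the list in this tutorial.
ACCELERATORS="dummy sort spmv visionchip adder"

# Hypothetical switch: DRY_RUN=1 (the default) prints each command
# instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run_hls() {
    for acc in $ACCELERATORS; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "make ${acc}-hls"
        else
            # Stop at the first accelerator whose HLS run fails.
            make "${acc}-hls" || { echo "HLS failed for ${acc}" >&2; return 1; }
        fi
    done
}

run_hls
```

Running the accelerators one at a time keeps a failure easy to attribute to a single HLS project.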

SoC configuration

Open the ESP configuration GUI.

make esp-xconfig

In the GUI select a 4x4 NoC configuration. Keep the default processor choice, Ariane, and leave the caches disabled. The accelerators synthesized in the previous step appear in the dropdown menu of each tile. For this tutorial we place 12 accelerator tiles as shown in the figure below: two tiles for each of the four Stratus HLS accelerators and four tiles for the Vivado HLS accelerator. The exact location of each tile does not affect functionality: any permutation of the tile placement in the figure below would yield an SoC with the same functionality.

insert figure: ESP GUI 4x4 with 12 accelerators. VCU118
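The tile budget of this configuration can be checked with a little arithmetic. The sketch below only restates the numbers given in the text; the variable names are made up for illustration. The tiles left over after placing the accelerators are the ones available for the non-accelerator tiles, such as the processor tile.

```shell
#!/bin/sh
NOC_ROWS=4
NOC_COLS=4
STRATUS_TYPES=4        # dummy, sort, spmv, visionchip
TILES_PER_STRATUS=2    # two tiles per Stratus HLS accelerator
VIVADO_TILES=4         # four tiles for the adder

total_tiles=$((NOC_ROWS * NOC_COLS))
acc_tiles=$((STRATUS_TYPES * TILES_PER_STRATUS + VIVADO_TILES))
other_tiles=$((total_tiles - acc_tiles))

# Prints: total=16 accelerator=12 other=4
echo "total=${total_tiles} accelerator=${acc_tiles} other=${other_tiles}"
```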

Open the GRLIB configuration GUI and set the debug link Ethernet IP appropriately, as described in the single-core SoC tutorial.

make grlib-xconfig

RTL simulation

By default the RTL simulation target make sim (or make ncsim) cross-compiles the systest.c application in the design folder and runs it as the baremetal application of the RTL simulation. In this case we instead want to compile the specific baremetal unit test of each accelerator. The source code of the baremetal unit tests for the accelerators is in soft/<processor-name>/drivers/<accelerator-name>/barec. For each type of accelerator, run the following commands to launch the RTL simulation.

make <accelerator-name>-barec
TEST_PROGRAM=barec/<accelerator-name>.exe make soft
make sim # for Modelsim
> run -a

If the SoC contains multiple tiles with the same accelerator, as in this tutorial, the baremetal programs automatically test all of them one by one.
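Since the same three commands are repeated for every accelerator type, it can help to print them as a checklist with the accelerator name already substituted. This is a sketch only: `sim_steps` is a hypothetical helper, and it prints the commands rather than running them because the simulator session is interactive (`run -a` is typed at the simulator prompt).

```shell
#!/bin/sh
# Print the RTL simulation commands from this tutorial for one accelerator.
sim_steps() {
    acc="$1"
    echo "make ${acc}-barec"
    echo "TEST_PROGRAM=barec/${acc}.exe make soft"
    echo "make sim"
}

# One checklist section per accelerator type used in this tutorial.
for acc in dummy sort spmv visionchip adder; do
    echo "# ${acc}"
    sim_steps "$acc"
done
```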

add GIF of bare-metal result of one of the accelerators

FPGA prototyping

Generate the FPGA bitstream.

make vivado-syn

Compile the baremetal test for each accelerator as described earlier.

make <accelerator-name>-barec

Compile Linux. This step compiles and includes in the Linux image the accelerators’ device drivers (source code at soft/<processor-name>/drivers/<accelerator-name>/linux) and their unit test applications (source code at soft/<processor-name>/drivers/<accelerator-name>/app).

make linux

By using the appropriate host name and hw_server port as described in the single-core SoC tutorial, program the FPGA.

FPGA_HOST=localhost XIL_HW_SERVER_PORT=3121 make fpga-program

Connect a serial communication program to the ESP UART interface as described in the single-core SoC tutorial.

For each type of accelerator on the SoC run the unit test baremetal program on FPGA.

TEST_PROGRAM=barec/<accelerator-name>.exe make soft
make fpga-run
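To run the baremetal test of every accelerator type in sequence, the two commands above can be looped. The following is a minimal sketch under stated assumptions: `step`, `fpga_baremetal_tests`, and the `DRY_RUN` switch are hypothetical helpers, the FPGA is assumed to be already programmed, and any environment variables the targets need are assumed to be set in the calling shell. By default the script only prints the commands.

```shell
#!/bin/sh
ACCELERATORS="dummy sort spmv visionchip adder"
DRY_RUN="${DRY_RUN:-1}"

# In dry-run mode print the command, otherwise execute it.
step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

fpga_baremetal_tests() {
    for acc in $ACCELERATORS; do
        # env passes TEST_PROGRAM to make, as in the command shown above.
        step env "TEST_PROGRAM=barec/${acc}.exe" make soft || return 1
        step make fpga-run || return 1
    done
}

fpga_baremetal_tests
```

Keep the serial terminal open while the loop runs so the output of each test can be inspected before the next one starts.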

add GIF of make fpga-run and bare-metal program running

Boot Linux on FPGA.

make fpga-run-linux

add GIF of make fpga-run-linux and Linux booting

After the boot, log in with the username root. Then, in the Linux terminal, launch the unit test applications for all the accelerators.


add GIF of Linux app running