How to: many-accelerator SoC - Tutorial
This guide illustrates how to generate and test a many-accelerator SoC. The steps that are the same as for previous tutorials will not be covered in detail again.
insert image here: tutorial overview
Target FPGA board
This tutorial targets the Xilinx VCU118 board, which is based on the Virtex UltraScale+ FPGA, but all the steps described here are identical for every FPGA target.
# Move to the Xilinx VCU118 design folder
cd <esp>/socs/xilinx-vcu118-xcvu9p
RTL generation: accelerators
The ESP accelerator tile can host accelerators coming from different design
flows: SystemC with Stratus HLS, C with Vivado HLS, Chisel, RTL, and Keras,
TensorFlow, and PyTorch with hls4ml. Each flow has a different level of
integration and support in ESP. The many-accelerator SoC designed in this
tutorial contains accelerators coming from two design flows, both based on
high-level synthesis (HLS): SystemC accelerators synthesized with Stratus HLS
and C accelerators synthesized with Vivado HLS. This tutorial uses predesigned
accelerators available in the ESP release in the
- Stratus HLS accelerators:
  - dummy: sample accelerator that performs a simple element-wise operation
  - sort: a combination of bitonic and merge sort of input arrays
  - spmv: sparse matrix-vector multiplication
  - visionchip: pipeline of computer vision kernels for the night-vision domain
- Vivado HLS accelerators:
  - adder: sample accelerator that performs a simple element-wise operation
None of the steps in this tutorial differ based on the accelerator design flow (Stratus HLS vs. Vivado HLS). The rest of the flow is agnostic to how each accelerator was designed.
The first step is to generate the RTL for all these accelerators by running HLS
on their source code with Stratus HLS for the accelerators in
<esp>/accelerators/stratus_hls and with Vivado HLS for those in
make dummy-hls
make sort-hls
make spmv-hls
make visionchip-hls
make adder-hls
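The five HLS invocations above can also be scripted. The following is a minimal sketch, assuming it is run from the design folder. The `RUN` variable and the `run_hls` helper are hypothetical names introduced here for illustration and are not part of ESP. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Accelerator names taken from the list in this tutorial.
ACCELERATORS="dummy sort spmv visionchip adder"

# RUN is a hypothetical dry-run switch (not an ESP variable): it defaults
# to "echo", so the commands are only printed. Invoke the script with
# RUN= (empty) to actually execute them.
RUN=${RUN-echo}

# Run (or print) the HLS target of every accelerator, one after another.
run_hls() {
    for acc in $ACCELERATORS; do
        # Stop at the first HLS run that fails.
        $RUN make "${acc}-hls" || return 1
    done
}

run_hls
```

Printing first makes it easy to check the target names before starting the HLS runs.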
Open the ESP configuration GUI.
In the GUI, select a 4x4 NoC configuration. Keep the default processor choice, Ariane, and leave the caches disabled. The accelerators synthesized in the previous step appear in the dropdown menu of each tile. For this tutorial we place 12 accelerator tiles as shown in the figure below: two tiles for each of the Stratus HLS accelerators and four tiles for the Vivado HLS accelerator. The exact location of each tile does not affect functionality: any permutation of the tile placement in the figure below yields an SoC with the same functionality.
insert figure: ESP GUI 4x4 with 12 accelerators. VCU118
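The tile count of this configuration can be checked with a few lines of shell arithmetic. This is only a sketch of the numbers stated above; the variable names are chosen here for illustration.

```shell
#!/bin/sh
# Numbers taken from the text: a 4x4 NoC, two tiles for each of the four
# Stratus HLS accelerators, and four tiles for the Vivado HLS accelerator.
NOC_ROWS=4
NOC_COLS=4
STRATUS_TYPES=4        # dummy, sort, spmv, visionchip
TILES_PER_STRATUS=2
VIVADO_TILES=4         # adder

total_tiles=$((NOC_ROWS * NOC_COLS))
acc_tiles=$((STRATUS_TYPES * TILES_PER_STRATUS + VIVADO_TILES))
other_tiles=$((total_tiles - acc_tiles))

echo "total tiles:       $total_tiles"
echo "accelerator tiles: $acc_tiles"
echo "remaining tiles:   $other_tiles"
```

The script reports 16 tiles in total, 12 accelerator tiles, and 4 remaining tiles, which are left for the processor and the other non-accelerator tiles.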
Open the GRLIB configuration GUI and set the debug link Ethernet IP address appropriately, as described in the single-core SoC tutorial.
By default, the RTL simulation target make sim (or make ncsim) cross-compiles
the systest.c application in the design folder, and that is the baremetal
application executed by the RTL simulation. In this case, we instead want to
compile the specific baremetal unit test of each accelerator. The source code
of the baremetal unit tests for the accelerators is in
soft/<processor-name>/drivers/<accelerator-name>/barec. For each type of
accelerator, run the following commands to launch the RTL simulation.
make <accelerator-name>-barec
TEST_PROGRAM=barec/<accelerator-name>.exe make soft
make sim
# for Modelsim
> run -a
If the SoC contains multiple tiles with the same accelerator, as in this tutorial, the baremetal program automatically tests all of them one by one.
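The per-accelerator simulation commands can be wrapped in a small helper that takes the accelerator name as an argument. This is a sketch under stated assumptions: `sim_barec` and `RUN` are hypothetical names introduced here, and `env TEST_PROGRAM=... make soft` is used as an equivalent spelling of the `TEST_PROGRAM=... make soft` form shown above so that the variable shows up in the dry-run output.

```shell
#!/bin/sh
# RUN is a hypothetical dry-run switch (not an ESP variable): it defaults
# to "echo" so that the commands are only printed. Use RUN= to execute.
RUN=${RUN-echo}

# Build the baremetal unit test of one accelerator and start the RTL
# simulation. The simulator itself still has to be driven by hand
# (for example with "run -a" in Modelsim).
sim_barec() {
    acc=$1
    if [ -z "$acc" ]; then
        echo "usage: sim_barec <accelerator-name>" >&2
        return 2
    fi
    $RUN make "${acc}-barec" &&
    $RUN env TEST_PROGRAM="barec/${acc}.exe" make soft &&
    $RUN make sim
}

sim_barec sort
```

Chaining the commands with `&&` ensures the simulation is not started if compiling the test fails.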
add GIF of bare-metal result of one of the accelerators
Generate the FPGA bitstream.
Compile the baremetal test for each accelerator as described earlier.
Compile Linux. This step compiles and includes in the Linux image the
accelerators’ device drivers (source code at
soft/<processor-name>/drivers/<accelerator-name>/linux) and their unit test
applications (source code at
Program the FPGA, using the appropriate host name and hw_server port as described in the single-core SoC tutorial.
FPGA_HOST=localhost XIL_HW_SERVER_PORT=3121 make fpga-program
Connect a serial communication program to the ESP UART interface as described in the single-core SoC tutorial.
For each type of accelerator on the SoC, run its baremetal unit test program on the FPGA.
TEST_PROGRAM=barec/<accelerator-name>.exe make soft
make fpga-run
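Running the two commands above for every accelerator type can be scripted as a loop. The sketch below makes several assumptions that the tutorial does not state: that `make fpga-run` returns once the program has been launched, that the baremetal tests have already been compiled, and that any environment variables needed to reach the board are already exported. `fpga_run_all` and `RUN` are hypothetical names, not part of ESP.

```shell
#!/bin/sh
# Accelerator names taken from the list in this tutorial.
ACCELERATORS="dummy sort spmv visionchip adder"

# RUN is a hypothetical dry-run switch (not an ESP variable): it defaults
# to "echo" so that the commands are only printed. Use RUN= to execute.
RUN=${RUN-echo}

# For each accelerator type, select its baremetal test and run it on FPGA.
fpga_run_all() {
    for acc in $ACCELERATORS; do
        # "env VAR=value make ..." is equivalent to "VAR=value make ...".
        $RUN env TEST_PROGRAM="barec/${acc}.exe" make soft || return 1
        $RUN make fpga-run || return 1
    done
}

fpga_run_all
```

Watching the UART output between iterations is still the way to confirm that each test passed.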
add GIF of make fpga-run and bare-metal program running
Boot Linux on FPGA.
add GIF of make fpga-run-linux and Linux booting
After the boot, log in with the username root. Then, in the Linux terminal, launch all the unit test applications for the accelerators.
add GIF of Linux app running