User Guide
Overview
This section describes how to integrate, configure, and use the CNN Hardware Accelerator in FPGA or SoC designs.
It assumes you already have the RTL source files and a basic familiarity with SystemVerilog simulation (ModelSim/QuestaSim) and synthesis (Vivado).
Integration in RTL Design
Configuration Options & Register Map
All configuration parameters are defined in cnn_defs.svh.
| Parameter | Default | Description |
|---|---|---|
| DATA_WIDTH | 8 | Data bit width |
| IFMAP_SIZE | 256 | Input feature map (image) dimension |
| KERNEL_SIZE | 3 | Convolution kernel size |
| STRIDE | 1 | Convolution stride |
| PADDING | 1 | Zero-padding size |
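The parameters above interact through the standard convolution output-size relation. The sketch below evaluates that relation at elaboration time. The names `cnn_size_pkg`, `conv_out_size`, and `conv_size_check` are hypothetical and not part of the accelerator's sources. It also assumes the convolution stage follows the usual formula; any pooling stage would reduce the final `cnn_ofmap` size further.

```systemverilog
// Hypothetical helper package: NOT part of the accelerator RTL.
package cnn_size_pkg;
  // Standard convolution output-size formula; integer division truncates.
  function automatic int conv_out_size(int ifmap, int kernel, int stride, int padding);
    return (ifmap + 2 * padding - kernel) / stride + 1;
  endfunction
endpackage

module conv_size_check;
  import cnn_size_pkg::*;

  // Default values from the parameter table.
  localparam int IFMAP_SIZE  = 256;
  localparam int KERNEL_SIZE = 3;
  localparam int STRIDE      = 1;
  localparam int PADDING     = 1;

  // Evaluated at elaboration time because the function is constant.
  localparam int CONV_OUT_SIZE =
      conv_out_size(IFMAP_SIZE, KERNEL_SIZE, STRIDE, PADDING);

  initial begin
    $display("CONV_OUT_SIZE = %0d", CONV_OUT_SIZE); // 256 with the defaults
    $finish;
  end
endmodule
```

With the defaults, a 3x3 kernel with stride 1 and padding 1 preserves the 256-pixel dimension, which is why those values are a common starting point.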
Instantiation Example
To include the CNN accelerator in your top-level design, instantiate it as shown below:
cnn_accelerator #(
.DATA_WIDTH(8),
.IFMAP_SIZE(256),
.KERNEL_SIZE(3),
.STRIDE(1),
.PADDING(1)
) u_cnn_accelerator (
.clk (clk),
.reset (reset),
.en (start),
.done (done),
.cnn_ifmap (input_image),
.weights (kernel_data),
.cnn_ofmap (output_result)
);
Signal Overview
| Signal | Direction | Description |
|---|---|---|
| clk | Input | System clock |
| reset | Input | Active-high asynchronous reset |
| en | Input | Start signal for CNN operation |
| done | Output | Indicates completion of processing |
| cnn_ifmap | Input | Input image pixel stream |
| weights | Input | Convolution filter/kernel weights |
| cnn_ofmap | Output | Final output feature map |
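The signal table does not say whether `done` is a single-cycle pulse or a held level. If it turns out to be a pulse, surrounding logic may want to capture it. The following is a minimal sketch of such a capture register, using the same active-high asynchronous reset style as the accelerator. The module name `done_latch` and the `result_valid` output are hypothetical and not part of the accelerator.

```systemverilog
// Hypothetical host-side helper: NOT part of the accelerator RTL.
// Captures done so the result can be treated as valid until the next start.
module done_latch (
  input  logic clk,
  input  logic reset,        // active-high, asynchronous
  input  logic start,        // the same pulse that drives en
  input  logic done,         // from cnn_accelerator
  output logic result_valid  // high from done until the next start
);
  always_ff @(posedge clk or posedge reset) begin
    if (reset)
      result_valid <= 1'b0;
    else if (start)
      result_valid <= 1'b0;  // a new run invalidates the previous result
    else if (done)
      result_valid <= 1'b1;  // hold until the next start or reset
  end
endmodule
```

If `done` is in fact held high until the next `en`, this register is unnecessary and `done` can be used directly.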
Typical Run Flow
The CNN accelerator processes data in a pipelined fashion.
A minimal operational sequence looks like this:
- Load input image and weights into registers or memory.
- Assert en for one clock cycle to begin processing.
- Wait for the done signal to indicate completion.
- Read the output from cnn_ofmap.
- Reset the module (optional) before the next inference.
Example Testbench Flow:
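initial begin
  // Assumes clk is generated elsewhere in the testbench.
  en = 0;
  reset = 1;            // hold the active-high reset
  #10 reset = 0;
  @(posedge clk);
  en <= 1;              // pulse en for exactly one clock cycle
  @(posedge clk);
  en <= 0;
  @(posedge done);      // wait for completion
  $display("CNN Processing Completed!");
  $finish;
end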
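The testbench flow above can hang forever if `done` never rises. The sketch below adds a clock generator and a timeout watchdog around the same handshake. So that it runs on its own, it uses a stand-in module, `dut_stub`, that only mimics the `en`/`done` handshake. The stub, the `handshake_tb` name, the 20-cycle latency, and the 1000-cycle timeout are all assumptions for illustration; in a real testbench the stub would be replaced by `cnn_accelerator` with its data ports connected.

```systemverilog
`timescale 1ns/1ps

// Stand-in for the accelerator (hypothetical): raises done for one cycle
// roughly 20 clocks after en. Replace with cnn_accelerator in practice.
module dut_stub (
  input  logic clk,
  input  logic reset,
  input  logic en,
  output logic done
);
  int unsigned count;
  logic        busy;

  always_ff @(posedge clk or posedge reset) begin
    if (reset) begin
      busy  <= 1'b0;
      done  <= 1'b0;
      count <= 0;
    end else begin
      done <= 1'b0;                       // default: done is a one-cycle pulse
      if (en && !busy) begin
        busy  <= 1'b1;
        count <= 0;
      end else if (busy) begin
        count <= count + 1;
        if (count == 19) begin
          busy <= 1'b0;
          done <= 1'b1;
        end
      end
    end
  end
endmodule

module handshake_tb;
  logic clk = 0;
  logic reset, en, done;

  always #5 clk = ~clk;                   // 10 ns period (100 MHz)

  dut_stub u_dut (.clk, .reset, .en, .done);

  initial begin
    en    = 0;
    reset = 1;
    repeat (2) @(posedge clk);            // hold reset across two clock edges
    reset = 0;

    @(posedge clk); en <= 1;              // one-cycle start pulse
    @(posedge clk); en <= 0;

    fork
      begin                               // normal completion path
        @(posedge done);
        $display("CNN Processing Completed!");
      end
      begin                               // watchdog path
        repeat (1000) @(posedge clk);
        $fatal(1, "Timeout waiting for done");
      end
    join_any
    disable fork;                         // stop whichever branch is still waiting
    $finish;
  end
endmodule
```

The timeout should be sized to the expected processing latency, which depends on IFMAP_SIZE and the kernel configuration.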
Simulation Guide
This section explains how to run functional simulations of the CNN accelerator using ModelSim or QuestaSim.
Running the Simulation
From the project root directory:
make sim
This will:
1. Compile all RTL and testbench files.
2. Launch the simulator with the CNN testbench (cnn_tb.sv).
3. Run the simulation and generate waveforms and logs.
Simulation outputs are stored under:
test/output/
Supported Simulators
| Simulator | Description |
|---|---|
| ModelSim | Industry-standard HDL simulator for debugging and analysis |
| QuestaSim | Enhanced commercial version with advanced verification features |
To manually invoke the simulator:
vsim -do run.do cnn_tb
or using the GUI:
vsim work.cnn_tb
add wave *
run -all
Example Simulation Output
Expected console output (testbench messages):
[INFO] CNN Accelerator Simulation Started
[INFO] Convolution + ReLU completed
[INFO] Max Pooling completed
[PASS] Simulation completed successfully
Waveform file: waves.vcd
Output images (if enabled): ofmap.png
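A VCD file such as waves.vcd is typically produced with the standard `$dumpfile` and `$dumpvars` system tasks. The following self-contained sketch shows the general pattern. The module name `wave_dump_example` is hypothetical, and the actual cnn_tb.sv or run.do may set up waveform dumping differently.

```systemverilog
`timescale 1ns/1ps

// Illustrative only: shows the usual VCD dumping pattern in a testbench.
module wave_dump_example;
  logic clk = 0;

  always #5 clk = ~clk;                // 10 ns period clock to give the dump some activity

  initial begin
    $dumpfile("waves.vcd");            // output file name used in this guide
    $dumpvars(0, wave_dump_example);   // 0 = this scope and all scopes below it
    #100 $finish;                      // stop after 100 ns
  end
endmodule
```

Dumping every level of the hierarchy can produce large files for a 256-pixel input, so restricting `$dumpvars` to the scopes of interest is often worthwhile.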
FPGA Implementation
Synthesis Instructions (Vivado)
- Open Xilinx Vivado.
- Create a new project and add all RTL files from /rtl/.
- Set the top module to cnn_accelerator.
- Configure global parameters as needed (cnn_defs.svh).
- Run Synthesis → Implementation → Bitstream Generation.
Recommended synthesis constraints:
- Clock period: 10 ns (100 MHz)
- Reset: Active-high asynchronous
Example Resource Utilization (Xilinx Artix-7)
| Resource | Usage | Percentage |
|---|---|---|
| LUTs | 9,823 | 28% |
| FFs | 4,211 | 11% |
| DSPs | 54 | 72% |
| BRAM | 12 | 20% |
(Actual utilization varies with IFMAP size and kernel configuration)
Summary
The CNN Accelerator provides a fully synthesizable, parameterized hardware design for real-time image inference.
This guide covered:
- RTL instantiation and port mapping
- Simulation with ModelSim/QuestaSim
- Synthesis and FPGA implementation flow
- Image conversion scripts for testbench I/O
Use this as your main operational reference for running and verifying the accelerator.