CNN Accelerator – API Reference

SystemVerilog Implementation
Developed by: Abdullah Nadeem & Talha Ayyaz


Top-Level Module: cnn_accelerator

Description:
Integrates all sub-modules into the complete CNN accelerator and sequences data through convolution, ReLU activation, max pooling, flattening, global average pooling (GAP), and the output logic.

Ports:

| Port | Type | Description |
|---|---|---|
| clk | input logic | System clock |
| reset | input logic | Active-high reset |
| en | input logic | Enable signal |
| cnn_ifmap | input logic [DATA_WIDTH-1:0][0:IFMAP_SIZE-1][0:IFMAP_SIZE-1] | Input feature map (unsigned) |
| weights | input logic signed [DATA_WIDTH-1:0][0:KERNEL_SIZE-1][0:KERNEL_SIZE-1] | Convolution kernel weights |
| cnn_ofmap | output logic [DATA_WIDTH-1:0][0:POOL_OFMAP_SIZE-1][0:POOL_OFMAP_SIZE-1] | Final output feature map |
| done | output logic | Completion flag |

Parameters:

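| Parameter | Default | Description |
|---|---|---|
| DATA_WIDTH | 8 | Bit width of feature-map values and weights |
| IFMAP_SIZE | 256 | Input feature map height and width (square map) |
| KERNEL_SIZE | 3 | Convolution kernel height and width (square kernel) |
| STRIDE | 1 | Convolution/pooling stride |
| PADDING | 1 | Convolution padding |
| NUM_CLASSES | 10 | Number of output classes |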
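
The port declarations refer to derived sizes (CONV_OFMAP_SIZE, POOL_OFMAP_SIZE) that the parameter table does not define. The sketch below shows one plausible derivation. It assumes the standard convolution output-size formula and a pooling stage that halves each dimension. The package name `cnn_sizes_pkg` and the demo module are hypothetical and not part of the project.

```systemverilog
// Hypothetical helper package. The derived sizes are ASSUMED to follow the
// standard formulas; check them against the project's own definitions.
package cnn_sizes_pkg;
  localparam int DATA_WIDTH  = 8;
  localparam int IFMAP_SIZE  = 256;
  localparam int KERNEL_SIZE = 3;
  localparam int STRIDE      = 1;
  localparam int PADDING     = 1;

  // Assumed: out = (in - kernel + 2*padding) / stride + 1
  localparam int CONV_OFMAP_SIZE =
      (IFMAP_SIZE - KERNEL_SIZE + 2*PADDING) / STRIDE + 1;

  // Assumed: 2x2 pooling with stride 2 halves each dimension
  localparam int POOL_OFMAP_SIZE = CONV_OFMAP_SIZE / 2;
endpackage

module cnn_sizes_demo;
  import cnn_sizes_pkg::*;

  initial begin
    $display("CONV_OFMAP_SIZE = %0d", CONV_OFMAP_SIZE); // 256 with the defaults
    $display("POOL_OFMAP_SIZE = %0d", POOL_OFMAP_SIZE); // 128 with the defaults
  end
endmodule
```

With the default PADDING of 1 and a 3×3 kernel, the convolution preserves the input size under this formula, so only the pooling stage shrinks the map.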

Sub-Modules

conv

Performs 2D convolution with ReLU activation.

Ports:

| Port | Type | Description |
|---|---|---|
| clk | input logic | System clock |
| reset | input logic | Reset |
| en | input logic | Enable signal |
| conv_ifmap | input logic [DATA_WIDTH-1:0][0:IFMAP_SIZE-1][0:IFMAP_SIZE-1] | Input feature map |
| weights | input logic signed [DATA_WIDTH-1:0][0:KERNEL_SIZE-1][0:KERNEL_SIZE-1] | Kernel weights |
| conv_ofmap | output logic [DATA_WIDTH-1:0][0:CONV_OFMAP_SIZE-1][0:CONV_OFMAP_SIZE-1] | Convolved output |
| conv_done | output logic | Completion flag |
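
As a behavioral illustration of what "2D convolution with ReLU" means here, the following sketch computes one output pixel at a time from a zero-padded window. It is not the project's RTL. It assumes zero padding, uses unpacked arrays and a tiny 4×4 map for readability, and keeps the result as a full-width `int` because the reference does not state how the output is narrowed to DATA_WIDTH. The names `conv_relu_sketch` and `conv_relu_at` are hypothetical.

```systemverilog
// Behavioral sketch only: zero padding and the int-wide result are assumptions.
module conv_relu_sketch;
  localparam int DW = 8;                      // DATA_WIDTH
  localparam int N  = 4;                      // IFMAP_SIZE (small for the demo)
  localparam int K  = 3;                      // KERNEL_SIZE
  localparam int S  = 1;                      // STRIDE
  localparam int P  = 1;                      // PADDING
  localparam int O  = (N - K + 2*P) / S + 1;  // assumed output size

  logic        [DW-1:0] ifmap   [N][N];       // unsigned pixels
  logic signed [DW-1:0] weights [K][K];       // signed kernel

  // One output pixel: windowed sum over in-bounds taps, then ReLU.
  function automatic int conv_relu_at(int r, int c);
    int acc = 0;
    for (int i = 0; i < K; i++)
      for (int j = 0; j < K; j++) begin
        int y = r*S + i - P;
        int x = c*S + j - P;
        if (y >= 0 && y < N && x >= 0 && x < N) begin
          int f = ifmap[y][x];    // unsigned source: zero-extended
          int w = weights[i][j];  // signed source: sign-extended
          acc += f * w;
        end
      end
    return (acc < 0) ? 0 : acc;   // ReLU clamps negative sums to zero
  endfunction

  initial begin
    foreach (ifmap[r, c]) ifmap[r][c] = 8'd1;

    // Identity kernel (center tap = 1): output equals input.
    foreach (weights[i, j])
      weights[i][j] = (i == K/2 && j == K/2) ? 8'sd1 : 8'sd0;
    assert (conv_relu_at(0, 0) == 1) else $error("identity kernel failed");

    // All-negative kernel: the sum is -9, so ReLU returns 0.
    foreach (weights[i, j]) weights[i][j] = -8'sd1;
    assert (conv_relu_at(1, 1) == 0) else $error("ReLU clamp failed");
  end
endmodule
```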

mac

Core multiply-accumulate unit.

Ports:

| Port | Type | Description |
|---|---|---|
| feature | input logic [DATA_WIDTH-1:0][0:KERNEL_SIZE-1][0:KERNEL_SIZE-1] | Input feature window |
| kernel | input logic signed [DATA_WIDTH-1:0][0:KERNEL_SIZE-1][0:KERNEL_SIZE-1] | Kernel values |
| result | output logic signed [MAC_RESULT_WIDTH-1:0] | Accumulated product result |
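
Because the feature window is unsigned and the kernel is signed, the multiply needs care with sign extension. The sketch below is one possible combinational formulation, not the project's implementation. It uses unpacked array ports instead of the packed layout listed above. It also assumes MAC_RESULT_WIDTH is the product width plus the growth from summing KERNEL_SIZE² terms, since the reference does not define it. The module names are hypothetical.

```systemverilog
// Sketch of a combinational multiply-accumulate over a K x K window.
module mac_sketch #(
  parameter int DATA_WIDTH  = 8,
  parameter int KERNEL_SIZE = 3,
  // ASSUMED width: full product plus growth from adding K*K terms
  parameter int MAC_RESULT_WIDTH =
      2*DATA_WIDTH + $clog2(KERNEL_SIZE*KERNEL_SIZE)
) (
  input  logic        [DATA_WIDTH-1:0]       feature [KERNEL_SIZE][KERNEL_SIZE],
  input  logic signed [DATA_WIDTH-1:0]       kernel  [KERNEL_SIZE][KERNEL_SIZE],
  output logic signed [MAC_RESULT_WIDTH-1:0] result
);
  always_comb begin
    result = '0;
    for (int i = 0; i < KERNEL_SIZE; i++)
      for (int j = 0; j < KERNEL_SIZE; j++)
        // Prepend a zero bit before casting to signed; otherwise pixel
        // values >= 2**(DATA_WIDTH-1) would be read as negative.
        result += $signed({1'b0, feature[i][j]}) * kernel[i][j];
  end
endmodule

module mac_sketch_tb;
  logic        [7:0]  feature [3][3];
  logic signed [7:0]  kernel  [3][3];
  logic signed [19:0] result;   // matches the default MAC_RESULT_WIDTH of 20

  mac_sketch dut (.feature, .kernel, .result);

  initial begin
    foreach (feature[i, j]) feature[i][j] = 8'd200;  // above 127 on purpose
    foreach (kernel[i, j])  kernel[i][j]  = -8'sd1;
    #1;
    // 9 taps * 200 * (-1) = -1800
    assert (result == -1800) else $error("got %0d", result);
  end
endmodule
```

With the defaults, the extreme sums are 9 × 255 × 127 = 291465 and 9 × 255 × (−128) = −293760, both of which fit in a 20-bit signed result.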

maxpool

Performs 2×2 max pooling with stride 2.

Ports:

| Port | Type | Description |
|---|---|---|
| clk | input logic | System clock |
| reset | input logic | Reset |
| en | input logic | Enable signal |
| ifmap | input logic [DATA_WIDTH-1:0][0:CONV_OFMAP_SIZE-1][0:CONV_OFMAP_SIZE-1] | Input feature map |
| ofmap | output logic [DATA_WIDTH-1:0][0:(CONV_OFMAP_SIZE/2)-1][0:(CONV_OFMAP_SIZE/2)-1] | Pooled output |
| done_pool | output logic | Completion flag |
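
The window indexing for 2×2 pooling with stride 2 can be sketched as follows: output pixel (r, c) is the maximum of input rows 2r and 2r+1 and columns 2c and 2c+1. This is a combinational illustration on a 4×4 map with unpacked arrays. The documented module is clocked and reports completion on done_pool, which the sketch does not model. The names `maxpool_sketch` and `max2` are hypothetical, and the comparison is assumed to be unsigned.

```systemverilog
// Combinational sketch of 2x2 max pooling with stride 2.
module maxpool_sketch;
  localparam int DW = 8;      // DATA_WIDTH
  localparam int N  = 4;      // CONV_OFMAP_SIZE (small for the demo)
  localparam int M  = N / 2;  // pooled size

  logic [DW-1:0] ifmap [N][N];
  logic [DW-1:0] ofmap [M][M];

  // Unsigned two-input maximum
  function automatic logic [DW-1:0] max2(logic [DW-1:0] a, logic [DW-1:0] b);
    return (a > b) ? a : b;
  endfunction

  always_comb
    for (int r = 0; r < M; r++)
      for (int c = 0; c < M; c++)
        ofmap[r][c] = max2(max2(ifmap[2*r][2*c],   ifmap[2*r][2*c+1]),
                           max2(ifmap[2*r+1][2*c], ifmap[2*r+1][2*c+1]));

  initial begin
    // Fill with 0..15 in row-major order
    foreach (ifmap[r, c]) ifmap[r][c] = DW'(r*N + c);
    #1;
    // Each 2x2 block's maximum is its bottom-right element
    assert (ofmap[0][0] == 5  && ofmap[0][1] == 7 &&
            ofmap[1][0] == 13 && ofmap[1][1] == 15)
      else $error("unexpected pooling result");
  end
endmodule
```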

flatten

Converts 2D pooled feature maps into a 1D vector.

Ports:

| Port | Type | Description |
|---|---|---|
| flatten_in | input logic [DATA_WIDTH-1:0][0:POOL_OFMAP_SIZE-1][0:POOL_OFMAP_SIZE-1] | Input feature map |
| flatten_out | output logic [DATA_WIDTH-1:0][0:POOL_PIXEL_COUNT-1] | Flattened 1D vector |
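
The reference does not state the element order of the flattened vector or how POOL_PIXEL_COUNT is defined. The sketch below assumes row-major order, so element (r, c) lands at index r × POOL_OFMAP_SIZE + c, and assumes POOL_PIXEL_COUNT = POOL_OFMAP_SIZE². It uses unpacked arrays and a 2×2 map for readability, and the module name is hypothetical.

```systemverilog
// Sketch of 2D-to-1D flattening. Row-major order is an ASSUMPTION.
module flatten_sketch;
  localparam int DW = 8;                     // DATA_WIDTH
  localparam int N  = 2;                     // POOL_OFMAP_SIZE (small for the demo)
  localparam int POOL_PIXEL_COUNT = N * N;   // assumed definition

  logic [DW-1:0] flatten_in  [N][N];
  logic [DW-1:0] flatten_out [POOL_PIXEL_COUNT];

  always_comb
    for (int r = 0; r < N; r++)
      for (int c = 0; c < N; c++)
        flatten_out[r*N + c] = flatten_in[r][c];

  initial begin
    flatten_in = '{'{8'd10, 8'd11}, '{8'd20, 8'd21}};
    #1;
    // Row 0 comes first, then row 1
    assert (flatten_out[0] == 10 && flatten_out[1] == 11 &&
            flatten_out[2] == 20 && flatten_out[3] == 21)
      else $error("unexpected flatten order");
  end
endmodule
```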

gap (Global Average Pooling)

Computes the average value of each channel's feature map.

Ports:

| Port | Type | Description |
|---|---|---|
| clk | input logic | System clock |
| rst | input logic | Reset |
| start | input logic | Start signal |
| input_fm | input logic [CHANNELS-1:0][POOL_OFMAP_SIZE-1:0][POOL_OFMAP_SIZE-1:0] | Input feature maps |
| output_fm | output logic [DATA_WIDTH-1:0][CHANNELS-1:0] | Channel-wise average output |
| done | output logic | Completion flag |
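
A per-channel average can be sketched as a sum over the channel's pixels followed by a division by the pixel count. The version below is combinational and assumes DATA_WIDTH-bit unsigned elements and truncating integer division, neither of which the reference confirms. It also ignores the start/done handshake of the documented module. `gap_sketch`, `channel_avg`, and `SUM_W` are hypothetical names.

```systemverilog
// Sketch of global average pooling. Element width and rounding are ASSUMED.
module gap_sketch;
  localparam int DW       = 8;                  // DATA_WIDTH
  localparam int CHANNELS = 2;                  // small for the demo
  localparam int N        = 2;                  // POOL_OFMAP_SIZE (small for the demo)
  localparam int SUM_W    = DW + $clog2(N*N);   // wide enough for the full sum

  logic [DW-1:0] input_fm  [CHANNELS][N][N];
  logic [DW-1:0] output_fm [CHANNELS];

  function automatic logic [DW-1:0] channel_avg(int ch);
    logic [SUM_W-1:0] sum = '0;
    for (int r = 0; r < N; r++)
      for (int c = 0; c < N; c++)
        sum += input_fm[ch][r][c];
    return DW'(sum / (N*N));   // integer division truncates toward zero
  endfunction

  always_comb
    for (int ch = 0; ch < CHANNELS; ch++)
      output_fm[ch] = channel_avg(ch);

  initial begin
    input_fm[0] = '{'{8'd1,   8'd2},   '{8'd3,   8'd4}};    // sum 10, average 2
    input_fm[1] = '{'{8'd255, 8'd255}, '{8'd255, 8'd255}};  // average 255
    #1;
    assert (output_fm[0] == 2 && output_fm[1] == 255)
      else $error("unexpected averages");
  end
endmodule
```

When the pixel count is a power of two, as it is here, the division reduces to a right shift in hardware.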

argmax

Determines the index of the maximum value for classification.

Ports:

| Port | Type | Description |
|---|---|---|
| data_in | input logic [DATA_WIDTH-1:0][CHANNELS-1:0] | Input vector of channel outputs |
| max_index | output logic [INDEX_WIDTH-1:0] | Index of the max value |
| max_value | output logic [DATA_WIDTH-1:0] | Maximum value |
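
A linear-scan argmax can be sketched as below. The sketch assumes unsigned comparison, assumes INDEX_WIDTH = $clog2(CHANNELS), and resolves ties in favor of the lowest index. None of these are stated in the reference. It uses an unpacked input array for readability, and the module names are hypothetical.

```systemverilog
// Sketch of a combinational argmax over CHANNELS unsigned values.
module argmax_sketch #(
  parameter int DATA_WIDTH  = 8,
  parameter int CHANNELS    = 4,                  // demo value
  parameter int INDEX_WIDTH = $clog2(CHANNELS)    // ASSUMED definition
) (
  input  logic [DATA_WIDTH-1:0]  data_in [CHANNELS],
  output logic [INDEX_WIDTH-1:0] max_index,
  output logic [DATA_WIDTH-1:0]  max_value
);
  always_comb begin
    max_index = '0;
    max_value = data_in[0];
    for (int i = 1; i < CHANNELS; i++)
      // Strict '>' keeps the earliest index when values tie
      if (data_in[i] > max_value) begin
        max_value = data_in[i];
        max_index = INDEX_WIDTH'(i);
      end
  end
endmodule

module argmax_sketch_tb;
  logic [7:0] data_in [4];
  logic [1:0] max_index;
  logic [7:0] max_value;

  argmax_sketch #(.CHANNELS(4)) dut (.data_in, .max_index, .max_value);

  initial begin
    data_in = '{8'd3, 8'd9, 8'd9, 8'd1};   // tie between indices 1 and 2
    #1;
    assert (max_index == 1 && max_value == 9)
      else $error("got index %0d, value %0d", max_index, max_value);
  end
endmodule
```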

comparator

Selects the maximum value among 4 inputs (used in pooling).

Ports:

| Port | Type | Description |
|---|---|---|
| input1 | input logic [DATA_WIDTH-1:0] | Input 1 |
| input2 | input logic [DATA_WIDTH-1:0] | Input 2 |
| input3 | input logic [DATA_WIDTH-1:0] | Input 3 |
| input4 | input logic [DATA_WIDTH-1:0] | Input 4 |
| max_val | output logic [DATA_WIDTH-1:0] | Maximum of the 4 inputs |
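
One common way to build a 4-input maximum is a two-level tree of 2-input comparisons. The sketch below follows the port names in the table and assumes unsigned comparison. The internal structure and the module names are illustrative and may differ from the project's comparator.

```systemverilog
// Sketch of a 4-input unsigned maximum built as a two-level tree.
module comparator_sketch #(
  parameter int DATA_WIDTH = 8
) (
  input  logic [DATA_WIDTH-1:0] input1, input2, input3, input4,
  output logic [DATA_WIDTH-1:0] max_val
);
  logic [DATA_WIDTH-1:0] max12, max34;

  always_comb begin
    max12   = (input1 > input2) ? input1 : input2;   // first level
    max34   = (input3 > input4) ? input3 : input4;
    max_val = (max12 > max34) ? max12 : max34;       // second level
  end
endmodule

module comparator_sketch_tb;
  logic [7:0] input1, input2, input3, input4, max_val;

  comparator_sketch dut (.*);

  initial begin
    input1 = 8'd12; input2 = 8'd200; input3 = 8'd7; input4 = 8'd199;
    #1;
    assert (max_val == 200) else $error("got %0d", max_val);
  end
endmodule
```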