6.2. FPGA AI Suite Architecture File Breakdown
This chapter explores customization parameters within an architecture (.arch) file, demonstrates where and how the optimization happens, and hints at the more advanced optimization techniques that are introduced in Optimizing Your FPGA AI Suite IP. A predefined architecture file (AGX7_Generic.arch) is used here as an example.
family : 'AGX7'
k_vector : 32
c_vector : 16
arch_precision : FP13AGX
stream_buffer_depth : 63488
output_channels_max : 16384
enable_eltwise_mult : true
filter_size_width_max : 28
filter_size_height_max : 28

pe_array {
  num_interleaved_features : 5
  num_interleaved_filters : 1
  exit_fifo_depth : 1024
}

filter_scratchpad {
  filter_depth : 512
  bias_scale_depth : 512
}

dma {
  csr_addr_width : 11
  csr_data_bytes : 4
  ddr_addr_width : 32
  ddr_burst_width : 4
  ddr_data_bytes : 64
  ddr_read_id_width : 2
}

activation {
  generic_aux_parameters {
    k_vector : 16
  }
  enable_clamp : true
  enable_leaky_relu : false
  enable_sigmoid : true
  enable_prelu : true
}
The pool block enables the hardened pooling module. If this block is not included, pooling is executed on the host.
pool {
  generic_aux_parameters {
    k_vector : 4
  }
  max_window_height : 13
  max_window_width : 13
  max_stride_vertical : 4
  max_stride_horizontal : 4
}
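When you compare or tune parameters, it can help to load these blocks into a script. The following is a minimal sketch of a reader for the `key : value` and `name { ... }` syntax visible in the excerpts above. The grammar is inferred from these excerpts only, and `parse_arch` is a hypothetical helper, not part of the FPGA AI Suite tooling. It performs no validation against the supported parameter ranges.

```python
import re

# Tokens: a brace, a colon, a single-quoted string, or a bare word/number.
_TOKEN = re.compile(r"[{}:]|'[^']*'|[^\s{}:]+")


def _scalar(tok):
    """Convert a value token to str, bool, or int (falls back to str)."""
    if tok.startswith("'"):
        return tok[1:-1]
    if tok in ("true", "false"):
        return tok == "true"
    try:
        return int(tok)
    except ValueError:
        return tok  # for example: FP13AGX


def parse_arch(text):
    """Parse an .arch-style fragment into nested dicts.

    A key that repeats inside one block (such as several
    input_connection entries) is collected into a list.
    """
    tokens = _TOKEN.findall(text)
    pos = 0

    def block():
        nonlocal pos
        out = {}
        while pos < len(tokens) and tokens[pos] != "}":
            key, sep = tokens[pos], tokens[pos + 1]
            if sep == ":":
                value = _scalar(tokens[pos + 2])
                pos += 3
            elif sep == "{":
                pos += 2
                value = block()
                pos += 1  # skip the closing brace
            else:
                raise ValueError(f"unexpected token after {key!r}: {sep!r}")
            if key in out:
                if not isinstance(out[key], list):
                    out[key] = [out[key]]
                out[key].append(value)
            else:
                out[key] = value
        return out

    return block()


sample = """
pool {
  generic_aux_parameters { k_vector : 4 }
  max_window_height : 13
  max_window_width : 13
  max_stride_vertical : 4
  max_stride_horizontal : 4
}
"""
arch = parse_arch(sample)
print(arch["pool"]["max_window_height"])
```

Keeping repeated keys as lists matters for blocks where one parameter name appears several times with different values.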
- xbar_in_port
  Defines a connection that receives the output feature from the PE array.
- xbar_ports
  Defines the connections of several auxiliary modules: activation, hardened pooling, and hardened softmax.
  The output feature from the PE array is sent to these auxiliary modules for further processing when needed. The activation and softmax modules connect to the xbar_in_port input connection, but the pool module connects to the activation connection. These connections mean that the output feature can go to activation or softmax directly, but it must pass through activation to reach pooling.
- xbar_out_port
  Allows the output feature to be sent to input_feeder, and from there to the PE array for the next convolution, or to output_writer to write out the result. Because xbar_in_port is one of its input connections, the output feature can bypass all modules connected to the crossbar and go out directly. Similarly, the results from activation, pool, and softmax can be sent out.
xbar {
  xbar_k_vector : 16
  max_input_interfaces : 5
  max_output_interfaces : 5
  xbar_ports {
    xbar_aux_port {
      name : 'activation'
      input_connection : 'xbar_in_port'
    }
    xbar_aux_port {
      name : 'pool'
      input_connection : 'activation'
    }
    xbar_aux_port {
      name : 'softmax'
      input_connection : 'xbar_in_port'
    }
  }
  xbar_in_port {
    external_connection : 'pe_array'
  }
  xbar_out_port {
    external_connection : 'input_feeder'
    external_connection : 'output_writer'
    input_connection : 'xbar_in_port'
    input_connection : 'pool'
    input_connection : 'activation'
    input_connection : 'softmax'
  }
}
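The routing rules described for the crossbar can be checked with a small script. The sketch below models the connectivity as stated in the port descriptions (activation and softmax fed by xbar_in_port, pool fed only by activation) and lists every path an output feature can take from xbar_in_port to xbar_out_port. The `INPUTS` table and the `routes` function are illustrative names, not part of the FPGA AI Suite, and the model assumes a single pass through the crossbar.

```python
# Each port maps to the ports it can receive data from (its
# input_connection entries), following the descriptions above.
INPUTS = {
    "activation": ["xbar_in_port"],
    "pool": ["activation"],
    "softmax": ["xbar_in_port"],
    "xbar_out_port": ["xbar_in_port", "pool", "activation", "softmax"],
}


def routes(dst="xbar_out_port", src="xbar_in_port"):
    """Return every path from src to dst as a list of port names."""
    if dst == src:
        return [[src]]
    found = []
    for upstream in INPUTS.get(dst, []):
        # Extend each path that reaches the upstream port by one hop.
        for path in routes(upstream, src):
            found.append(path + [dst])
    return found


for path in routes():
    print(" -> ".join(path))
# xbar_in_port -> xbar_out_port
# xbar_in_port -> activation -> pool -> xbar_out_port
# xbar_in_port -> activation -> xbar_out_port
# xbar_in_port -> softmax -> xbar_out_port
```

The listing shows the bypass route and confirms that no route reaches pool without first passing through activation.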
The configuration network is connected to all other modules because it decodes the instructions passed from the compiled model to the FPGA device and orchestrates inference by controlling those modules. As mentioned in FPGA AI Suite IP Datapath Component Organization, the configuration network provides little configurability in the architecture file.
Section Content
FPGA AI Suite IP Supported Layers and Hyperparameter Ranges
Architecture Description File Parameters