FPGA AI Suite Handbook

ID 863373
Date 11/21/2025
Public

10.3.4. CSR Map and Descriptor Queue

The CSR interface uses a 32-bit data path in which all accesses are aligned to 32 bits; however, the address is a byte address. The size of the CSR address space is 2048 bytes (11-bit addressable). The regions within the CSR address space are listed in the table that follows.
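As a minimal sketch of the addressing rule above, the following Python helper checks that a byte address is 32-bit aligned and falls inside the 2048-byte CSR space. The function and constant names are hypothetical and are not part of the FPGA AI Suite runtime.

```python
CSR_SPACE_BYTES = 2048   # size of the CSR address space (11 address bits)
CSR_ACCESS_BYTES = 4     # every access is one aligned 32-bit word


def is_valid_csr_address(byte_addr: int) -> bool:
    """Return True if byte_addr is an aligned byte address inside the CSR space."""
    in_range = 0 <= byte_addr < CSR_SPACE_BYTES
    aligned = byte_addr % CSR_ACCESS_BYTES == 0
    return in_range and aligned
```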

Register and Bit Attribute Definitions

The following notation describes the CSR registers.

Table 43.  Register and Bit Attribute Definitions

Attribute

Expansion

Description

RW

Read/Write

This bit can be read or written by software.

RO

Read Only

The bit is set by hardware only. Software can only read this bit. Writes have no effect.

RW1C

Read/Write 1 to Clear

Software can read or clear this bit. Software must write 1 to clear this bit. Writing zero to an RW1C bit has no effect.

A multibit RW1C field can exist. In that case, all bits in the field are cleared if a 1 is written to any of the bits.

RsvdZ

Reserved and zero

Reserved for future RW1C implementations.

When you write to a register with RsvdZ bits, only write zeros to these bits.
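The write behavior of these attributes can be modeled in a few lines. The following Python sketch is illustrative only: the function names are hypothetical, and the RW1C model assumes single-bit fields (a multibit RW1C field would clear all of its bits when any one of them is written with 1).

```python
MASK32 = 0xFFFFFFFF


def write_rw(current: int, written: int) -> int:
    """RW: the written value replaces the stored value."""
    return written & MASK32


def write_ro(current: int, written: int) -> int:
    """RO: writes have no effect, so the stored value is unchanged."""
    return current


def write_rw1c(current: int, written: int) -> int:
    """RW1C (single-bit fields): a written 1 clears the bit, a written 0 leaves it."""
    return current & ~written & MASK32
```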

Discovery ROM

The discovery ROM stores metadata. The metadata includes a hash for the architecture that the IP corresponds to and the FPGA AI Suite version that was used to create the IP.

The host runtime can use this information to determine whether the incoming inference job can be run on the IP instances. For example, if the architectures do not match each other, then inference is not possible.

The layout of the discovery ROM is as follows:

Table 44.  Discovery ROM Layout

Base Byte Address

Length (in bytes)

Feature

0x000

16

Hash of the Architecture Description File (.arch)

0x010

32

Human-readable FPGA AI Suite version string
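The following Python sketch shows one way that host software might split a raw copy of the discovery ROM into its two fields and compare the architecture hash. It assumes that the ROM contents have already been read into a bytes object and that the version string is ASCII padded with NUL bytes; neither assumption is confirmed by this section. The function names are hypothetical.

```python
ARCH_HASH_OFFSET, ARCH_HASH_LEN = 0x000, 16
VERSION_OFFSET, VERSION_LEN = 0x010, 32


def parse_discovery_rom(rom: bytes) -> tuple[bytes, str]:
    """Split raw discovery ROM bytes into (architecture hash, version string)."""
    if len(rom) < VERSION_OFFSET + VERSION_LEN:
        raise ValueError("discovery ROM image is shorter than 48 bytes")
    arch_hash = rom[ARCH_HASH_OFFSET:ARCH_HASH_OFFSET + ARCH_HASH_LEN]
    raw_version = rom[VERSION_OFFSET:VERSION_OFFSET + VERSION_LEN]
    # Assumption: the string is ASCII and padded with NUL bytes.
    version = raw_version.split(b"\x00", 1)[0].decode("ascii", errors="replace")
    return arch_hash, version


def can_run(rom: bytes, expected_arch_hash: bytes) -> bool:
    """Inference is possible only when the architecture hashes match."""
    arch_hash, _ = parse_discovery_rom(rom)
    return arch_hash == expected_arch_hash
```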

Interrupt Control

The interrupt control feature registers are as follows:

Table 45.  Interrupt Control Feature Registers

Register

Offset

Attribute

Description

ICR

0x000

RW1C

DMA Interrupt control register

IMR

0x004

RW

DMA Interrupt mask register

The DMA optionally generates level-sensitive interrupt signals in response to various events.

The hardware sets the corresponding bit within the ICR register whenever such an event occurs.

An interrupt is generated upon a 0-to-1 transition of a bit within ICR only if the corresponding bit in the IMR is set to one. A 0-to-1 transition of a bit within the IMR also generates an interrupt if the corresponding bit within the ICR is set to 1.

Table 46.  Interrupt Control Register (ICR) Fields

Field

Bit

Description

Reserved

31:2

RsvdZ (Reserved; software must write 0)

Inference_complete

1

Indicates that an inference request has completed

Error

0

Indicates that an error condition has been triggered

Table 47.  Interrupt Mask Register (IMR) Fields

Field

Bit

Description

Reserved

31:2

RsvdZ (Reserved; software must write 0)

Inference_complete_mask

1

Set to one to enable interrupt generation on inference completion

Error_mask

0

Set to one to enable interrupt generation on error condition
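The relationship between ICR and IMR can be sketched as follows. This Python model reflects one reading of the rules above, in which the interrupt is asserted whenever an ICR bit and its IMR bit are both 1. The helper names are hypothetical. The acknowledge value keeps the reserved bits at zero, as RsvdZ fields require.

```python
ICR_ERROR = 1 << 0               # Error field, bit 0
ICR_INFERENCE_COMPLETE = 1 << 1  # Inference_complete field, bit 1


def interrupt_asserted(icr: int, imr: int) -> bool:
    """An event contributes to the interrupt only while its mask bit is 1."""
    return (icr & imr & (ICR_ERROR | ICR_INFERENCE_COMPLETE)) != 0


def pending_events(icr: int, imr: int) -> list[str]:
    """Names of the unmasked events currently flagged in ICR."""
    active = icr & imr
    events = []
    if active & ICR_ERROR:
        events.append("error")
    if active & ICR_INFERENCE_COMPLETE:
        events.append("inference_complete")
    return events


def ack_value(icr: int) -> int:
    """Value to write back to ICR: 1s clear the flagged events, reserved bits stay 0."""
    return icr & (ICR_ERROR | ICR_INFERENCE_COMPLETE)
```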

DMA Descriptor Queue

The DMA contains a single descriptor FIFO for enqueuing inference requests. A descriptor can require multiple register writes, and it is added to the queue when software writes to the desc_input_output_base_addr register.

The desc_cfg_filter_base_addr and desc_cfg_num_words registers retain their values after a descriptor is enqueued.

If you already enqueued a DMA descriptor and want to enqueue another descriptor with the same values for the desc_cfg_filter_base_addr and desc_cfg_num_words registers, then you only need to write to the desc_input_output_base_addr register.

If you want to change the desc_cfg_filter_base_addr and desc_cfg_num_words registers for the next descriptor, then you must set new values before writing to the desc_input_output_base_addr register.
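The write ordering described above can be sketched in Python as follows. The sketch is illustrative only: write32 is a caller-supplied function that performs one 32-bit CSR write, queue_base is the byte address of the descriptor queue region, and ddr_word_bytes is the DDR word size. All three are hypothetical parameters, as is the function name. The register offsets are the ones listed in Table 48.

```python
# Offsets from Table 48, relative to the start of the descriptor queue region.
DESC_CFG_FILTER_BASE_ADDR = 0x000
DESC_CFG_NUM_WORDS = 0x004
DESC_INPUT_OUTPUT_BASE_ADDR = 0x008


def enqueue_descriptor(write32, queue_base, io_base_addr, ddr_word_bytes,
                       cfg_base_addr=None, cfg_num_words=None):
    """Enqueue one inference request through write32(byte_addr, value).

    Leave cfg_base_addr and cfg_num_words as None to reuse the values
    that the hardware still holds from the previous descriptor.
    """
    for addr in (io_base_addr, cfg_base_addr):
        if addr is not None and addr % ddr_word_bytes != 0:
            raise ValueError("base addresses must be DDR-word aligned")
    if cfg_base_addr is not None:
        write32(queue_base + DESC_CFG_FILTER_BASE_ADDR, cfg_base_addr)
    if cfg_num_words is not None:
        # Assumption: the register stores the config buffer length minus 2.
        write32(queue_base + DESC_CFG_NUM_WORDS, cfg_num_words - 2)
    # Written last, because this write is what enqueues the descriptor.
    write32(queue_base + DESC_INPUT_OUTPUT_BASE_ADDR, io_base_addr)
```

Passing only io_base_addr on later calls issues a single register write per descriptor, which matches the case where the configuration and filter buffers do not change between requests.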

Table 48.  DMA Descriptor Queue Registers

Register

Offset

Attribute

Description

desc_cfg_filter_base_addr

0x000

RW

Base address pointer for the configuration buffer and for the filter buffer.

The filters are located at desc_cfg_filter_base_addr + desc_cfg_num_words, which is encoded in the address provided to the filter reader as configuration data.

Must be aligned to a multiple of the DDR word size.

desc_cfg_num_words - 2

0x004

RW

Length of the configuration buffer minus 2, in config words. Each config word is 64 bits: 32 bits of instruction and 32 bits of data.

desc_input_output_base_addr

0x008

RW

Base address pointer for the input feature data and output inference results (written to an offset from the base address).

Must be aligned to a multiple of the DDR word size.

Writing to this register enqueues a descriptor into the internal DMA descriptor queue.

desc_diagnostics

0x00C

RO

This register is useful for debugging. Production software should not need to read from this register.

Bit 0: Asserts if the descriptor queue overflows. This is a sticky bit that clears only after reset.

Bit 1: Descriptor queue is full or almost full.

Bit 2: Asserts if the inference limit for an unlicensed IP is reached. When asserted, inference requests are rejected.

All other bits are reserved.
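For debugging, the desc_diagnostics value can be decoded into named flags. The following Python sketch uses the bit positions listed above; the function name and dictionary keys are hypothetical.

```python
def decode_desc_diagnostics(value: int) -> dict[str, bool]:
    """Decode the documented bits of desc_diagnostics; reserved bits are ignored."""
    return {
        "queue_overflow": bool(value & (1 << 0)),             # sticky until reset
        "queue_full_or_almost_full": bool(value & (1 << 1)),
        "unlicensed_limit_reached": bool(value & (1 << 2)),
    }
```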

DMA Control Registers

Table 49.  DMA Control Registers

Register

Offset

Attribute

Description

Intermediate_ddr_base_address

0x000

RW

Base address for the DDR intermediate data. This address is shared across all graphs and needs to be set only once, at startup. Must be aligned to a multiple of the DDR word size.

Inference_completion_count

0x004

RO

Number of inference request completions by the FPGA AI Suite IP.

IP_reset

0x008

RW

Write any non-zero value to this address to trigger a reset of the FPGA AI Suite IP.

The value is automatically cleared upon reset.

Reading from this register always returns 0.

Activate_streaming

0x00C

RW

When streaming is enabled in the architecture, writing "1" to this register makes the FPGA AI Suite IP begin queuing descriptors and start listening for streaming inputs.

Writing "0" stops queuing descriptors and turns off the input streaming interface.

Performance Registers

Hardware counters are provided to measure the number of clock cycles during which the IP is active. A job is considered active after the first word of its descriptor is read from the descriptor queue. A job is considered finished just before the done interrupt is raised and the completion count is updated.

The IP and supporting host form an elastic pipeline in which multiple jobs can be in flight. The counters let software derive both the overall latency (for example, the length of time required to process 100 jobs) and the average latency of each of those jobs. The hardware accumulates only the total latency summed over all jobs; given the total number of jobs, software can compute the average.

The counters are 64 bits wide to guard against overflow. Reads of the lower and upper 32 bits of a counter are not synchronized, so software should not read the counters while the IP is active.

Table 50.  Performance Registers

Register

Offset

Attribute

Description

Total clocks active

(lower 32 bits)

0x000

RO

On each clock cycle, if any IP job is active, increment the counter by 1.

Total clocks active

(upper 32 bits)

0x004

RO

Same as above.

Total clocks for all jobs

(lower 32 bits)

0x008

RO

On each clock cycle, if there are N IP jobs active, increment the counter by N.

Total clocks for all jobs

(upper 32 bits)

0x00C

RO

Same as above.
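The following Python sketch shows how software might combine the 32-bit halves of a counter and derive latency figures from the two counters. It is illustrative only: the function names are hypothetical, jobs_completed is assumed to come from the host's own bookkeeping or from Inference_completion_count, and clock_hz (the IP clock frequency) is an assumed input that this section does not define. The counters should be read only while the IP is idle.

```python
def combine64(lower: int, upper: int) -> int:
    """Join the lower and upper 32-bit register values into one 64-bit count."""
    return ((upper & 0xFFFFFFFF) << 32) | (lower & 0xFFFFFFFF)


def latency_stats(total_active_clocks: int, total_job_clocks: int,
                  jobs_completed: int, clock_hz: float):
    """Return (overall seconds, average seconds per job, average jobs in flight)."""
    if jobs_completed <= 0 or total_active_clocks <= 0:
        raise ValueError("no completed work to report on")
    overall_s = total_active_clocks / clock_hz
    # "Total clocks for all jobs" is the sum of every job's latency in cycles.
    avg_job_s = total_job_clocks / jobs_completed / clock_hz
    # While the IP is active, this many jobs are in flight on average.
    avg_in_flight = total_job_clocks / total_active_clocks
    return overall_s, avg_job_s, avg_in_flight
```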

Debug Network Registers

The debug network has the following registers available from the CSR:

Table 51.  Debug Network Registers

Register

Offset

Attribute

Description

DLA_DMA_CSR_OFFSET_DEBUG_NETWORK_ADDR

0x000

RO

Address that the debug network uses to issue a read request.

DLA_DMA_CSR_OFFSET_DEBUG_NETWORK_VALID

0x004

RO

Indicates that a read response has been received from the debug network.

DLA_DMA_CSR_OFFSET_DEBUG_NETWORK_DATA

0x008

RO

Data from debug network.

DMA License Register

Table 52.  DMA License Register

Register

Offset

Attribute

Description

license_flag

0x000

RO

Indicates whether the IP is licensed:
  • 0: unlicensed
  • 1: licensed

DMA Transaction Counters

Hardware counters are provided to measure the number of data words accessed by the DMA from the external DDR memory.

The counter values are separated into input feature reads, input weights and biases reads, and output feature writes. The width of each memory word in bytes matches the dma/ddr_data_bytes value in the architecture description file.

Table 53.  DMA Transaction Counter Registers

Register

Offset

Attribute

Description

Total number of input feature words read by the FPGA AI Suite IP

(lower 32 bits)

0x000

RO

This counter is incremented by 1 for every input feature word transferred from the external memory to the IP DMA on the AXI4 read bus.

Total number of input feature words read by the FPGA AI Suite IP

(upper 32 bits)

0x004

RO

Same as above.

Total number of input filter and biases words read by the FPGA AI Suite IP

(lower 32 bits)

0x008

RO

This counter is incremented by 1 for every filter-bias word transferred from the external memory to the IP DMA on the AXI4 read bus.

Total number of input filter and biases words read by the FPGA AI Suite IP

(upper 32 bits)

0x00C

RO

Same as above.

Total number of output feature words written by the FPGA AI Suite IP

(lower 32 bits)

0x010

RO

This counter is incremented by 1 for every feature word written to the external memory by the IP DMA on the AXI4 write bus.

Total number of output feature words written by the FPGA AI Suite IP

(upper 32 bits)

0x014

RO

Same as above.
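Because each counter reports memory words rather than bytes, software must scale the counts by the DDR word width to obtain traffic in bytes. The following Python sketch shows the arithmetic. The function name is hypothetical, and ddr_data_bytes is assumed to be the dma/ddr_data_bytes value taken from the architecture description file.

```python
def dma_traffic_bytes(feature_read_words: int, filter_read_words: int,
                      feature_write_words: int, ddr_data_bytes: int) -> dict[str, int]:
    """Convert the three 64-bit word counts into byte counts."""
    return {
        "input_feature_read": feature_read_words * ddr_data_bytes,
        "filter_bias_read": filter_read_words * ddr_data_bytes,
        "output_feature_write": feature_write_words * ddr_data_bytes,
    }
```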

Model Update Registers

When the FPGA AI Suite IP is configured as DDR-Free, its model can be updated via writes to the following registers:
Register Offset Attribute Description
MODEL_UPDATE_WORD_0 0x000 W 32-bit chunk of a scratchpad or configuration word, index 0 (least significant)
MODEL_UPDATE_WORD_1 0x004 W Same as above, index 1
MODEL_UPDATE_WORD_2 0x008 W Same as above, index 2
MODEL_UPDATE_WORD_3 0x00C W Same as above, index 3
MODEL_UPDATE_WORD_4 0x010 W Same as above, index 4
MODEL_UPDATE_WORD_5 0x014 W Same as above, index 5
MODEL_UPDATE_WORD_6 0x018 W Same as above, index 6
MODEL_UPDATE_WORD_7 0x01C W Same as above, index 7
MODEL_UPDATE_WORD_8 0x020 W Same as above, index 8
MODEL_UPDATE_WORD_9 0x024 W Same as above, index 9
MODEL_UPDATE_WORD_10 0x028 W Same as above, index 10
MODEL_UPDATE_WORD_11 0x02C W Same as above, index 11
MODEL_UPDATE_WORD_12 0x030 W Same as above, index 12
MODEL_UPDATE_WORD_13 0x034 W Same as above, index 13
MODEL_UPDATE_WORD_14 0x038 W Same as above, index 14
MODEL_UPDATE_WORD_15 0x03C W Same as above, index 15
MODEL_UPDATE_WORD_16 0x040 W Same as above, index 16
MODEL_UPDATE_WORD_17 0x044 W Same as above, index 17
MODEL_UPDATE_WORD_18 0x048 W Same as above, index 18
MODEL_UPDATE_WORD_19 0x04C W Same as above, index 19
MODEL_UPDATE_WORD_20 0x050 W Same as above, index 20
MODEL_UPDATE_WORD_21 0x054 W Same as above, index 21
MODEL_UPDATE_WORD_22 0x058 W Same as above, index 22
MODEL_UPDATE_WORD_23 0x05C W Same as above, index 23
MODEL_UPDATE_WORD_24 0x060 W Same as above, index 24
MODEL_UPDATE_WORD_25 0x064 W Same as above, index 25
MODEL_UPDATE_WORD_26 0x068 W Same as above, index 26
MODEL_UPDATE_WORD_27 0x06C W Same as above, index 27
MODEL_UPDATE_WORD_28 0x070 W Same as above, index 28
MODEL_UPDATE_WORD_29 0x074 W Same as above, index 29
MODEL_UPDATE_WORD_30 0x078 W Same as above, index 30
MODEL_UPDATE_WORD_31 0x07C W Same as above, index 31
MODEL_UPDATE_CONTROL 0x080 W Type of word and target address of the update

To see how to use these registers to update the DDR-free model on the FPGA device, refer to Updating Hostless DDR-Free MIF Files Through the CSR.
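The following Python sketch illustrates only the chunking implied by the register list: a wide scratchpad or configuration word is split into 32-bit chunks, with index 0 holding the least significant bits. The function names are hypothetical. The sketch does not cover the MODEL_UPDATE_CONTROL encoding or the required write sequence, which are described in the referenced topic.

```python
NUM_UPDATE_WORDS = 32  # MODEL_UPDATE_WORD_0 through MODEL_UPDATE_WORD_31


def split_update_word(value: int) -> list[int]:
    """Split a value of up to 1024 bits into 32-bit chunks, least significant first."""
    if not 0 <= value < (1 << (32 * NUM_UPDATE_WORDS)):
        raise ValueError("value does not fit in 32 chunks of 32 bits")
    return [(value >> (32 * i)) & 0xFFFFFFFF for i in range(NUM_UPDATE_WORDS)]


def update_word_offset(index: int) -> int:
    """Byte offset of MODEL_UPDATE_WORD_<index> within the model update region."""
    if not 0 <= index < NUM_UPDATE_WORDS:
        raise ValueError("index must be in the range 0 to 31")
    return 4 * index
```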