Intel Acceleration Stack for Intel® Xeon® CPU with FPGAs Core Cache Interface (CCI-P) Reference Manual
ID: 683193
Date: 11/04/2019
Public
1.3.9. Multi-Cache Line Memory Requests
To achieve the highest link efficiency, pack memory requests into large transfers by using multi-cache line (multi-CL) requests. Multi-CL memory requests have the following characteristics:
- The highest memory bandwidth is achieved with a data payload of 4 CLs.
- A memory write request should always begin with the lowest address. SOP=1 in the c1_ReqMemHdr marks the first CL. All subsequent headers in the multi-CL request must drive incrementing values in Address[1:0]; Address[41:2] is treated as don't care.
- An N-CL memory write request takes N cycles on Channel 1. Idle cycles in the middle of a multi-CL request are legal, but one request cannot be interleaved with another: it is illegal to start a new request before the entire data payload of a multi-CL write request has been sent.
- The FIU guarantees that a multi-CL request issued on VA completes on a single VC.
- The memory request address must be naturally aligned. A 2-CL request should start on a 2-CL boundary, so its CL address must be divisible by 2, that is, address[0] = 1'b0. A 4-CL request should be aligned on a 4-CL boundary, so its CL address must be divisible by 4, that is, address[1:0] = 2'b00.
- A multi-CL burst must transmit all of its words before any other request is issued. As a result, the following special memory write requests cannot be interleaved within a single multi-CL burst:
  - Write Fences
  - Interrupts
  - Byte enable writes
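The alignment and packing rules in the list above can be illustrated with a small software model. This is only a sketch, not part of the CCI-P interface: the function names `is_naturally_aligned` and `split_into_requests` are hypothetical, addresses are expressed in units of cache lines, and the set of supported request lengths (1, 2, and 4 CLs) is an assumption based on the sizes named in this section.

```python
from typing import List, Tuple

# Assumption: request lengths named in this section, largest first.
SUPPORTED_LENGTHS = (4, 2, 1)


def is_naturally_aligned(cl_addr: int, num_cl: int) -> bool:
    """Return True if a num_cl request may start at CL address cl_addr."""
    if num_cl not in SUPPORTED_LENGTHS:
        raise ValueError(f"unsupported request length: {num_cl}")
    # 2 CLs -> address[0] == 0, 4 CLs -> address[1:0] == 0.
    return cl_addr % num_cl == 0


def split_into_requests(start_cl: int, total_cl: int) -> List[Tuple[int, int]]:
    """Greedily cover [start_cl, start_cl + total_cl) with aligned requests.

    Returns a list of (cl_addr, num_cl) pairs, preferring 4-CL requests
    wherever alignment and the remaining length allow.
    """
    requests = []
    addr, remaining = start_cl, total_cl
    while remaining > 0:
        for n in SUPPORTED_LENGTHS:
            # A 1-CL request always fits, so this loop always makes progress.
            if n <= remaining and addr % n == 0:
                requests.append((addr, n))
                addr += n
                remaining -= n
                break
    return requests
```

For example, a 10-CL buffer that starts at CL address 3 is covered by one single-CL request, two 4-CL requests, and a final single-CL request. Starting the buffer on a 4-CL boundary would avoid the short requests at the front.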
The figure below is an example of a multi-CL Memory Write Request.
Figure 12. Multi-CL Memory Request
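The Channel 1 write rules described in this section (SOP on the first CL, incrementing Address[1:0], idle cycles allowed, no interleaving) can also be modeled in software. The sketch below is illustrative only: `WriteBeat`, `write_beats`, and `stream_is_legal` are hypothetical names, and the `num_cl` field is a modeling convenience rather than the actual header encoding.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class WriteBeat:
    sop: bool      # True only on the first CL of a request
    addr_low: int  # value driven on Address[1:0]
    num_cl: int    # request length (hypothetical model field)


def write_beats(cl_addr: int, num_cl: int) -> List[WriteBeat]:
    """Build the header sequence for one naturally aligned write request."""
    if num_cl not in (1, 2, 4) or cl_addr % num_cl != 0:
        raise ValueError("request must be 1, 2, or 4 CLs and naturally aligned")
    return [
        WriteBeat(sop=(i == 0), addr_low=(cl_addr + i) & 0x3, num_cl=num_cl)
        for i in range(num_cl)
    ]


def stream_is_legal(cycles: Iterable[Optional[WriteBeat]]) -> bool:
    """Check a cycle-by-cycle stream; None represents an idle cycle."""
    remaining = 0  # beats still owed by the currently open request
    next_low = 0   # expected Address[1:0] of the next beat
    for beat in cycles:
        if beat is None:
            continue  # idle cycles are legal, even mid-request
        if remaining == 0:
            if not beat.sop:
                return False  # a request must open with SOP=1
            remaining = beat.num_cl - 1
        else:
            if beat.sop or beat.addr_low != next_low:
                return False  # interleaved or out-of-sequence beat
            remaining -= 1
        next_low = (beat.addr_low + 1) & 0x3
    return remaining == 0  # every opened request must be completed
```

A stream that inserts an idle cycle inside a 4-CL write passes this check, while a stream that starts a second request before the fourth CL of the first has been sent does not.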
The figure below is an example of Memory Write Response cycles. For an unpacked response, the individual CLs can return out of order.
Figure 13. Multi-CL Memory Write Responses
The figure below is an example of a Memory Read Response cycle. The read response can be reordered within itself; that is, there is no guaranteed ordering between the individual CLs of a multi-CL read. All CLs within a multi-CL response have the same mdata and the same vc_used. Individual CLs of a multi-CL read are identified using the cl_num field.
Figure 14. Multi-CL Memory Read Responses
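Because the CLs of a multi-CL read can return in any order, the requester has to place each CL by its cl_num. The sketch below shows one way to model that bookkeeping in software. The `ReadReassembler` class and its methods are hypothetical, and it assumes that cl_num is a zero-based index of the CL within its request and that each outstanding request uses a unique mdata value.

```python
from typing import Dict, List, Optional


class ReadReassembler:
    """Collects out-of-order CLs of multi-CL reads, keyed by mdata."""

    def __init__(self) -> None:
        self._expected: Dict[int, int] = {}           # mdata -> CL count
        self._lines: Dict[int, Dict[int, bytes]] = {}  # mdata -> {cl_num: data}

    def expect(self, mdata: int, num_cl: int) -> None:
        """Record that a read of num_cl cache lines was issued with mdata."""
        self._expected[mdata] = num_cl
        self._lines[mdata] = {}

    def on_response(self, mdata: int, cl_num: int,
                    data: bytes) -> Optional[List[bytes]]:
        """Store one response CL.

        Returns the payload in address order once every CL of the request
        has arrived, otherwise None.
        """
        lines = self._lines[mdata]
        lines[cl_num] = data  # assumption: cl_num is a zero-based index
        if len(lines) < self._expected[mdata]:
            return None
        del self._expected[mdata]
        del self._lines[mdata]
        return [lines[i] for i in range(len(lines))]
```

With a 4-CL read registered under mdata 7, responses arriving with cl_num 2, 0, 3, and 1 yield no result for the first three calls, and the fourth call returns the four cache lines in address order.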