Hard Processor System Technical Reference Manual: Agilex™ 5 SoCs
4.3.9.2.1. Calculating Page Table Walks
We can estimate the average number of page table walk memory accesses based on whether each level of the walk cache hits or misses.
4 KB Pages:
The table below shows the number of page table walk memory accesses expected in the following situation:
- Both stages of translation are enabled
- The granule size is 4 KB in both stages
- Each translation is 4 KB in size in both stages
Each row and column gives the highest level of hit for the corresponding translation stage.
ptw | S1 Miss | S1L0 hit | S1L1 hit | S1L2 hit | S1L3 hit | S1 Bypass |
---|---|---|---|---|---|---|
S2 miss | 24 | 15 | 10 | 5 | 0 | 4 |
S2L0 hit | 19 | 12 | 8 | 4 | 0 | 3 |
S2L1 hit | 14 | 9 | 6 | 3 | 0 | 2 |
S2L2 hit | 9 | 6 | 4 | 2 | 0 | 1 |
S2L3 hit | 4 | 3 | 2 | 1 | 0 | 0 |
S2 Bypass | 4 | 3 | 2 | 1 | 0 | 0 |
2 MB Pages:
Generally, the number of page table walk memory accesses decreases with larger page sizes.
The following table shows the number of page table walk memory accesses with 2 MB pages at both stages and a 4 KB granule.
ptw | S1 Miss | S1L0 hit | S1L1 hit | S1L2 hit | S1 Bypass |
---|---|---|---|---|---|
S2 miss | 15 | 8 | 4 | 0 | 3 |
S2L0 hit | 11 | 6 | 3 | 0 | 2 |
S2L1 hit | 7 | 4 | 2 | 0 | 1 |
S2L2 hit | 3 | 2 | 1 | 0 | 0 |
S2 Bypass | 3 | 2 | 1 | 0 | 0 |
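The 2 MB entries above are consistent with a simple counting model, sketched below in Python. The model is an inference from the tabulated values, not a formula stated in this manual: n is the number of Stage 1 levels still to be read, m is the number of Stage 2 levels still to be read, each remaining Stage 1 read costs m + 1 accesses, and a full Stage 1 miss adds a further m accesses. The function and parameter names are hypothetical.

```python
from typing import Optional, Union

Outcome = Union[str, int, None]  # "bypass", None (miss), or highest hit level


def remaining_levels(hit_level: Optional[int], levels: int) -> int:
    """Levels still to be read: all of them on a miss, else those below the hit."""
    if hit_level is None:
        return levels
    return levels - 1 - hit_level


def walk_accesses(s1: Outcome, s2: Outcome, levels: int) -> int:
    """Estimated memory accesses for one translation (assumed model, see lead-in).

    levels is 3 for 2 MB pages (L0..L2) and 4 for 4 KB pages (L0..L3),
    both with a 4 KB granule.
    """
    m = 0 if s2 == "bypass" else remaining_levels(s2, levels)
    if s1 == "bypass":
        return m  # only the Stage 2 walk remains
    n = remaining_levels(s1, levels)
    total = n * (m + 1)  # each Stage 1 read plus its Stage 2 lookups
    if s1 is None:
        total += m  # extra Stage 2 walk seen only in the S1 Miss column
    return total


if __name__ == "__main__":
    outcomes = [None, 0, 1, 2, "bypass"]
    for s2 in outcomes:
        print([walk_accesses(s1, s2, levels=3) for s1 in outcomes])
```

Run with levels=3, the loop prints the five rows of the 2 MB table in order. With levels=4 and both stages missing, the same model gives 24, which matches the top-left entry of the 4 KB table.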
Intel Agilex 5 Application:
Current verification covers only the Stage 1 hierarchy, with the level L0 cache set up and S2 bypassed.
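With S2 bypassed, only the S2 Bypass row of the 4 KB table above applies. As a rough illustration, the expected number of memory accesses per walk can be computed as a weighted average over that row. The sketch below assumes 4 KB pages, and the outcome probabilities are hypothetical placeholders, not measured values.

```python
# S2 Bypass row of the 4 KB table, keyed by Stage 1 walk cache outcome.
S2_BYPASS_4KB = {
    "S1 miss": 4,
    "S1L0 hit": 3,
    "S1L1 hit": 2,
    "S1L2 hit": 1,
    "S1L3 hit": 0,
}


def expected_accesses(probabilities: dict) -> float:
    """Weighted average of memory accesses over Stage 1 outcomes."""
    if abs(sum(probabilities.values()) - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * S2_BYPASS_4KB[outcome] for outcome, p in probabilities.items())


if __name__ == "__main__":
    # Hypothetical mix: 75% of walks hit the L0 walk cache, 25% miss entirely.
    mix = {"S1 miss": 0.25, "S1L0 hit": 0.75}
    print(expected_accesses(mix))  # 0.25 * 4 + 0.75 * 3 = 3.25
```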
Example of address translation with mandatory cache miss:
- TBU clock frequency: 400 MHz
- TCU clock frequency: 600 MHz
- TBU cache miss and request for address translation to TCU: 15 ns (approximately)
- TCU cache miss and memory read request: 35 ns (approximately)
- One page/configuration table walk, memory read completion to TCU: 68 ns (approximately)
- Total of 4 page/configuration table walks
- Address translation details from TCU to TBU: 290 ns (approximately)
- TBU transaction to destination: 10 ns
- Total of 420 ns transaction latency for translation cache miss
Example of address translation with cache hit:
- TBU cache hit and sending transaction to destination: 2.5 ns
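The two examples above can be combined into a rough average-latency estimate. The sketch below is illustrative only: the hit rate is a hypothetical input, and the constants are the approximate figures quoted above. It also converts latencies to TBU clock cycles at 400 MHz, and sums the five listed miss-path figures once each. The source does not state how the four table walks are apportioned between the 68 ns and 290 ns figures, so that sum is only a sanity check against the stated total.

```python
TBU_CLOCK_MHZ = 400   # from the example above
HIT_NS = 2.5          # TBU cache hit latency
MISS_NS = 420.0       # stated total latency for a translation cache miss

# Listed miss-path figures in ns, each taken once (assumption, see lead-in).
LISTED_MISS_STEPS_NS = [15, 35, 68, 290, 10]


def tbu_cycles(latency_ns: float) -> float:
    """Convert a latency in nanoseconds to TBU clock cycles."""
    return latency_ns * TBU_CLOCK_MHZ / 1000.0


def average_latency_ns(hit_rate: float) -> float:
    """Expected latency for a hypothetical TBU cache hit rate in [0, 1]."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * HIT_NS + (1.0 - hit_rate) * MISS_NS


if __name__ == "__main__":
    print(sum(LISTED_MISS_STEPS_NS))   # 418, close to the stated 420 ns
    print(tbu_cycles(HIT_NS))          # 1.0 TBU cycle for a hit
    print(tbu_cycles(MISS_NS))         # 168.0 TBU cycles for a miss
    print(average_latency_ns(0.9))     # about 44.25 ns at a 90% hit rate
```

Under these figures a miss costs 168 times as much as a hit, so even a small miss rate dominates the average.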