Nios II Classic Processor Reference Guide
1. Introduction
This handbook describes the Classic processor from a high-level conceptual description to the low-level details of implementation. The chapters in this handbook describe the Nios II processor architecture, the programming model, and the instruction set.
The Nios II Gen2 processor is available only in the 14.1 release and above.
We have ended development of new Classic processor features with the 14.0 release. New features are implemented only in the Nios II Gen2 processor core. Although the Classic processor remains supported, we recommend that you use the Nios II Gen2 core for future designs.
This handbook assumes you have a basic familiarity with embedded processor concepts. You do not need to be familiar with any specific FPGA technology or with FPGA development tools. This handbook limits discussion of hardware implementation details of the processor system. The processors are designed for FPGA devices, and so this handbook does describe some FPGA implementation concepts. Your familiarity with FPGA technology provides a deeper understanding of the engineering trade-offs related to the design and implementation of the processor.
This chapter introduces the Nios II embedded processor family and describes the similarities and differences between the Nios II processor and traditional embedded processors.
1.1. Processor System Basics
- Full 32-bit instruction set, data path, and address space
- 32 general-purpose registers
- Optional shadow register sets
- 32 interrupt sources
- External interrupt controller interface for more interrupt sources
- Single-instruction 32 × 32 multiply and divide producing a 32-bit result
- Dedicated instructions for computing 64-bit and 128-bit products of multiplication
- Optional floating-point instructions for single-precision floating-point operations
- Single-instruction barrel shifter
- Access to a variety of on-chip peripherals, and interfaces to off-chip memories and peripherals
- Hardware-assisted debug module enabling processor start, stop, step, and trace under control of the Nios II software development tools
- Optional memory management unit (MMU) to support operating systems that require MMUs
- Optional memory protection unit (MPU)
- Software development environment based on the GNU C/C++ tool chain and the Nios II Software Build Tools (SBT) for Eclipse
- Integration with the Signal Tap* Embedded Logic Analyzer, enabling real-time analysis of instructions and data along with other signals in the FPGA design
- Instruction set architecture (ISA) compatible across all processor systems
- Performance up to 250 DMIPS
- Optional error correcting code (ECC) support for a subset of processor internal RAM blocks
A processor system is equivalent to a microcontroller or “computer on a chip” that includes a processor and a combination of peripherals and memory on a single chip. A processor system consists of a processor core, a set of on-chip peripherals, on-chip memory, and interfaces to off-chip memory, all implemented on a single FPGA device. Like a microcontroller family, all processor systems use a consistent instruction set and programming model.
1.2. Getting Started with the Nios II Processor
The Nios II EDS includes the following two closely-related software development tool flows:
- The Nios II SBT
- The Nios II SBT for Eclipse
Both tool flows are based on the GNU C/C++ compiler. The Nios II SBT for Eclipse™ provides a familiar and established environment for software development. Using the Nios II SBT for Eclipse, you can immediately begin developing and simulating Nios II software applications.
The Nios II SBT also provides a command line interface.
Using the hardware reference designs included in a development kit, you can prototype an application running on a board before building a custom hardware platform.
If the prototype system adequately meets design requirements using a provided reference design, you can copy the reference design and use it without modification in the final hardware platform. Otherwise, you can customize the processor system until it meets cost or performance requirements.
1.3. Customizing Processor Designs
Because the pins and logic resources in FPGA devices are programmable, many customizations are possible:
- You can rearrange the pins on the chip to simplify the board design. For example, you can move address and data pins for external SDRAM memory to any side of the chip to shorten board traces.
- You can use extra pins and logic resources on the chip for functions unrelated to the processor. Extra resources can provide a few extra gates and registers as glue logic for the board design, or they can implement entire systems. For example, if a processor system consumes only 5% of a large FPGA, the rest of the chip’s resources remain available to implement other functions.
- You can use extra pins and logic on the chip to implement additional peripherals for the processor system. A library of peripherals that connect easily to processor systems is available.
1.4. Configurable Soft Processor Core Concepts
1.4.1. Configurable Soft Processor Core
You are not required to create a new processor configuration for every new design. Ready-made system designs are available that you can use as is. If these designs meet your system requirements, there is no need to configure the design further. In addition, you can use the Nios II instruction set simulator to begin writing and debugging Nios II applications before the final hardware configuration is determined.
1.4.2. Flexible Peripheral Set and Address Map
Software constructs are provided to access memory and peripherals generically, independently of address location. Therefore, the flexible peripheral set and address map do not affect application developers.
There are two broad classes of peripherals: standard peripherals and custom peripherals.
1.4.2.1. Standard Peripherals
1.4.2.2. Custom Components
You can also create custom components and integrate them in processor systems. For performance-critical systems that spend most CPU cycles executing a specific section of code, it is a common technique to create a custom peripheral that implements the same function in hardware.
This approach offers a double performance benefit:
- Hardware implementation is faster than software.
- The processor is free to perform other functions in parallel while the custom peripheral operates on data.
1.4.2.3. Custom Instructions
The custom logic is integrated into the processor’s arithmetic logic unit (ALU). Similar to native Nios II instructions, custom instruction logic can take values from up to two source registers and optionally write back a result to a destination register.
Because the processor is implemented on reprogrammable FPGAs, software and hardware engineers can work together to iteratively optimize the hardware and test the results of software running on hardware.
From the software perspective, custom instructions appear as machine-generated assembly macros or C functions, so programmers do not need to understand assembly language to use custom instructions.
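As a rough illustration of how a two-operand custom instruction can look to a C programmer, the sketch below wraps a hypothetical saturating-add instruction in an ordinary function. The instruction index `CI_SAT_ADD_N`, the function names, and the choice of saturating addition are all assumptions made for this example. On a Nios II target the wrapper uses the GCC built-in `__builtin_custom_inii`; on any other host it falls back to a software model so the code can be compiled and tested without hardware.

```c
#include <stdint.h>

/* Hypothetical index of the custom instruction in the processor system. */
#define CI_SAT_ADD_N 0

/* Software model of what the custom logic is assumed to compute:
 * a signed 32-bit addition that clamps instead of wrapping. */
int32_t sat_add_model(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}

/* Two source operands in, one result out, like a native instruction. */
int32_t sat_add(int32_t a, int32_t b)
{
#if defined(__nios2__)
    /* GCC built-in for a custom instruction taking two ints, returning int. */
    return (int32_t)__builtin_custom_inii(CI_SAT_ADD_N, (int)a, (int)b);
#else
    return sat_add_model(a, b);
#endif
}
```

Callers use `sat_add(x, y)` like any other function, which is the sense in which programmers do not need to write assembly to use the instruction.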
1.4.3. Automated System Generation
After system generation, you can download the design onto a board, and debug software executing on the board. To the software developer, the processor architecture of the design is set. Software development proceeds in the same manner as for traditional, nonconfigurable processors.
1.5. Intel FPGA IP Evaluation Mode
- Simulate the behavior of a processor within your system.
- Verify the functionality of your design, as well as evaluate its size and speed quickly and easily.
- Generate time-limited device programming files for designs that include processors.
- Program a device and verify your design in hardware.
You only need to purchase a license for the processor when you are completely satisfied with its functionality and performance, and want to take your design to production.
1.6. Introduction Revision History
Document Version | Changes |
---|---|
2019.10.17 | Removed sections: Installing Windows* Subsystem for Linux* (WSL) on Windows* and placed them in the Software Developer Handbook. |
2019.07.01 | Added sections: Installing Windows* Subsystem for Linux* (WSL) on Windows*. |
2019.04.30 | Added section: Installing Eclipse IDE into Nios II EDS. |
2018.04.18 | |
2016.10.28 | Maintenance release. |
2015.04.02 | Initial release |
Date | Version | Changes |
---|---|---|
June 2016 | 2016.06.17 | Updated introduction. |
April 2015 | 2015.04.02 | Maintenance release. |
February 2014 | 13.1.0 | |
May 2011 | 11.0.0 | Added references to new system integration tool. |
December 2010 | 10.1.0 | Maintenance release. |
July 2010 | 10.0.0 | Maintenance release. |
November 2009 | 9.1.0 | |
March 2009 | 9.0.0 | Maintenance release. |
November 2008 | 8.1.0 | Maintenance release. |
May 2008 | 8.0.0 | Added MMU and MPU to bullet list of features. |
October 2007 | 7.2.0 | Added OpenCore Plus section. |
May 2007 | 7.1.0 | |
March 2007 | 7.0.0 | Maintenance release. |
November 2006 | 6.1.0 | Maintenance release. |
May 2006 | 6.0.0 | |
October 2005 | 5.1.0 | Maintenance release. |
May 2005 | 5.0.0 | Maintenance release. |
September 2004 | 1.1 | Maintenance release. |
May 2004 | 1.0 | Initial release. |
2. Processor Architecture
The Nios II architecture describes an instruction set architecture (ISA). The ISA in turn necessitates a set of functional units that implement the instructions. A processor core is a hardware design that implements the Nios II instruction set and supports the functional units described in this document. The processor core does not include peripherals or the connection logic to the outside world. It includes only the circuits required to implement the Nios II architecture.
The Nios II architecture defines the following functional units:
- Register file
- Arithmetic logic unit (ALU)
- Interface to custom instruction logic
- Exception controller
- Internal or external interrupt controller
- Instruction bus
- Data bus
- Memory management unit (MMU)
- Memory protection unit (MPU)
- Instruction and data cache memories
- Tightly-coupled memory interfaces for instructions and data
- JTAG debug module
2.1. Processor Implementation
The functional units of the Nios II architecture form the foundation for the Nios II instruction set. However, this does not indicate that any unit is implemented in hardware. The Nios II architecture describes an instruction set, not a particular hardware implementation. A functional unit can be implemented in hardware, emulated in software, or omitted entirely.
A Nios II implementation is a set of design choices embodied by a particular processor core. All implementations support the instruction set defined in the Instruction Set Reference chapter.
Each implementation achieves specific objectives, such as smaller core size or higher performance. This flexibility allows the Nios II architecture to adapt to different target applications.
Implementation variables generally fit one of three trade-off patterns: more or less of a feature; inclusion or exclusion of a feature; hardware implementation or software emulation of a feature. An example of each trade-off follows:
- More or less of a feature—For example, to fine-tune performance, you can increase or decrease the amount of instruction cache memory. A larger cache increases execution speed of large programs, while a smaller cache conserves on-chip memory resources.
- Inclusion or exclusion of a feature—For example, to reduce cost, you can choose to omit the JTAG debug module. This decision conserves on-chip logic and memory resources, but it eliminates the ability to use a software debugger to debug applications.
- Hardware implementation or software emulation—For example, in control applications that rarely perform complex arithmetic, you can choose for the division instruction to be emulated in software. Removing the divide hardware conserves on-chip resources but increases the execution time of division operations.
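To make the third trade-off more concrete, the following sketch shows the kind of routine a software emulation of unsigned division might use: a restoring (shift-and-subtract) divider. The function name `emulated_divu` is hypothetical, and real toolchains supply their own emulation routines; the point is only that the loop produces one quotient bit per iteration, which is why emulated division takes far more cycles than a hardware divider.

```c
#include <stdint.h>

/* Hypothetical helper: restoring (shift-and-subtract) unsigned division.
 * The caller must ensure divisor != 0. */
uint32_t emulated_divu(uint32_t dividend, uint32_t divisor)
{
    uint32_t quotient = 0;
    uint64_t remainder = 0;   /* 64-bit so the shift below cannot overflow */

    for (int bit = 31; bit >= 0; --bit) {
        /* Bring down the next dividend bit, most significant first. */
        remainder = (remainder << 1) | ((dividend >> bit) & 1u);
        if (remainder >= divisor) {
            remainder -= divisor;
            quotient |= (uint32_t)1 << bit;
        }
    }
    return quotient;
}
```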
For information about which cores support which features, refer to the Core Implementation Details chapter of the Processor Reference Handbook.
For complete details about user-selectable parameters for the processor, refer to the Instantiating the Processor chapter of the Processor Reference Handbook.
2.2. Register File
The processor can optionally have one or more shadow register sets. A shadow register set is a complete set of Nios II general-purpose registers. When shadow register sets are implemented, the CRS field of the status register indicates which register set is currently in use. An instruction access to a general-purpose register uses whichever register set is active.
A typical use of shadow register sets is to accelerate context switching. When shadow register sets are implemented, the processor has two special instructions, rdprs and wrprs, for moving data between register sets. Shadow register sets are typically manipulated by an operating system kernel, and are transparent to application code. A processor can have up to 63 shadow register sets.
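As a small sketch of how software might read the active register set, the helper below extracts the CRS field from a copy of the status register. The field position used here (six bits starting at bit 10) is an assumption for this example and should be checked against the status register layout in the Programming Model chapter; a six-bit field is at least consistent with one normal set plus up to 63 shadow sets. The function and macro names are hypothetical.

```c
#include <stdint.h>

/* Assumed layout: CRS is a 6-bit field starting at bit 10 of status. */
#define STATUS_CRS_SHIFT 10u
#define STATUS_CRS_MASK  0x3Fu

/* Return the index of the register set currently in use
 * (0 = normal set, 1..63 = shadow sets). */
uint32_t status_current_register_set(uint32_t status)
{
    return (status >> STATUS_CRS_SHIFT) & STATUS_CRS_MASK;
}
```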
The Nios II architecture allows for the future addition of floating-point registers.
For details about shadow register set implementation and usage, refer to “Registers” and “Exception Processing” in the Programming Model chapter of the Processor Reference Handbook.
For details about the rdprs and wrprs instructions, refer to the Instruction Set Reference chapter of the Processor Reference Handbook.
2.3. Arithmetic Logic Unit
Category | Details |
---|---|
Arithmetic | The ALU supports addition, subtraction, multiplication, and division on signed and unsigned operands. |
Relational | The ALU supports the equal, not-equal, greater-than-or-equal, and less-than relational operations (==, !=, >=, <) on signed and unsigned operands. |
Logical | The ALU supports AND, OR, NOR, and XOR logical operations. |
Shift and Rotate | The ALU supports shift and rotate operations, and can shift/rotate data by 0 to 31 bit positions per instruction. The ALU supports arithmetic shift right and logical shift right/left. The ALU supports rotate left/right. |
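The relational row lists only four comparisons. The remaining two, greater-than and less-than-or-equal, can be derived by swapping the operands of the listed ones, as the sketch below shows. It also models a rotate left over the 0 to 31 position range. This is a host-side C model with hypothetical function names, not a description of the hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Two of the relational primitives named in the table (signed forms). */
static bool alu_ge(int32_t a, int32_t b) { return a >= b; }
static bool alu_lt(int32_t a, int32_t b) { return a < b; }

/* Derived comparisons: swap the operands of a primitive. */
bool alu_gt(int32_t a, int32_t b) { return alu_lt(b, a); }  /* a > b  <=> b < a  */
bool alu_le(int32_t a, int32_t b) { return alu_ge(b, a); }  /* a <= b <=> b >= a */

/* Rotate left by 0..31 positions; this model keeps only the low five
 * bits of the rotate amount. */
uint32_t alu_rol(uint32_t value, uint32_t n)
{
    n &= 31u;
    if (n == 0) return value;          /* avoid an undefined 32-bit shift */
    return (value << n) | (value >> (32u - n));
}
```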
2.3.1. Unimplemented Instructions
The processor generates an exception whenever it issues an unimplemented instruction, so your exception handler can call a routine that emulates the operation in software. Unimplemented instructions do not affect the programmer’s view of the processor.
For a list of potential unimplemented instructions, refer to the Programming Model chapter of the Processor Reference Handbook.
2.3.2. Custom Instructions
Refer to "Custom Instruction Tab" in the Instantiating the Processor chapter of the Processor Reference Handbook for additional information.
2.4. Reset and Debug Signals
Signal Name | Type | Purpose |
---|---|---|
reset | Reset | This is a global hardware reset signal that forces the processor core to reset immediately. |
cpu_resetrequest | Reset | This is an optional, local reset signal that causes the processor to reset without affecting other components in the system. The processor finishes executing any instructions in the pipeline, and then enters the reset state. This process can take several clock cycles, so be sure to continue asserting the cpu_resetrequest signal until the processor core asserts a cpu_resettaken signal. The processor core asserts a cpu_resettaken signal for 1 cycle when the reset is complete and then periodically if cpu_resetrequest remains asserted. The processor remains in the reset state for as long as cpu_resetrequest is asserted. While the processor is in the reset state, it periodically reads from the reset address. It discards the result of the read, and remains in the reset state. The processor does not respond to cpu_resetrequest when the processor is under the control of the JTAG debug module, that is, when the processor is paused. The processor responds to the cpu_resetrequest signal if the signal is asserted when the JTAG debug module relinquishes control, both momentarily during each single step as well as when you resume execution. |
debug_reset_request | Reset | This reset output signal appears when the JTAG debug module is enabled. This signal is triggered by the JTAG debugger or the nios2-download -r command. This signal must be connected to the reset input signal of the processor, which allows the JTAG debugger to reset the processor. This signal can be connected to the reset input signal of other components when needed. |
debugreq | Debug | This is an optional signal that temporarily suspends the processor for debugging purposes. When you assert the signal, the processor pauses in the same manner as when a breakpoint is encountered, transfers execution to the routine located at the break address, and asserts a debugack signal. Asserting the debugreq signal when the processor is already paused has no effect. |
reset_req | Reset | This optional signal prevents memory corruption by performing a reset handshake before the processor resets. |
For more information about adding reset signals and debug signals to the processor, refer to "Advanced Features Tab" and "JTAG Debug Module Tab", respectively, in the Instantiating the Processor chapter.
2.5. Exception and Interrupt Controllers
2.5.1. Exception Controller
Exception addresses are specified with the Processor parameter editor.
All exceptions are precise. Precise means that the processor has completed execution of all instructions preceding the faulting instruction and not started execution of instructions following the faulting instruction. Precise exceptions allow the processor to resume program execution once the exception handler clears the exception.
2.5.2. EIC Interface
The processor connects to an EIC through the EIC interface. When an EIC is present, the internal interrupt controller is not implemented; interrupts are connected to the EIC instead.
The EIC selects among active interrupts and presents one interrupt to the processor, with interrupt handler address and register set selection information. The interrupt selection algorithm is specific to the EIC implementation, and is typically based on interrupt priorities. The processor does not depend on any specific interrupt prioritization scheme in the EIC.
For every external interrupt, the EIC presents an interrupt level. The processor uses the interrupt level in determining when to service the interrupt.
Any external interrupt can be configured as an NMI. NMIs are not masked by the status.PIE bit, and have no interrupt level.
An EIC can be software-configurable.
For a typical example of an EIC, refer to the Vectored Interrupt Controller chapter in the Embedded Peripherals IP User Guide.
For details about EIC usage, refer to “Exception Processing” in the Programming Model chapter of the Processor Reference Handbook.
2.5.3. Internal Interrupt Controller
Your software can enable and disable any interrupt source individually through the ienable control register, which contains an interrupt-enable bit for each of the IRQ inputs. Software can enable and disable interrupts globally using the PIE bit of the status control register. A hardware interrupt is generated if and only if all of the following conditions are true:
- The PIE bit of the status register is 1
- An interrupt-request input, irq<n>, is asserted
- The corresponding bit n of the ienable register is 1
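The three conditions above can be expressed as a short predicate. The sketch below is a host-side model with hypothetical names: `irq` and `ienable` each hold one bit per interrupt source, and `status_pie` stands for the PIE bit of the status register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when interrupts are globally enabled and at least one
 * asserted irq<n> input has bit n of ienable set. */
bool hardware_interrupt_generated(bool status_pie, uint32_t irq, uint32_t ienable)
{
    uint32_t pending = irq & ienable;   /* sources both asserted and enabled */
    return status_pie && pending != 0;
}
```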
The interrupt vector custom instruction is less efficient than using the EIC interface with the vectored interrupt controller component, and is therefore deprecated. Using the EIC interface is recommended instead.
2.6. Memory and I/O Organization
The flexible nature of the Nios II memory and I/O organization is the most notable difference between processor systems and traditional microcontrollers. Because processor systems are configurable, the memories and peripherals vary from system to system. As a result, the memory and I/O organization varies from system to system.
A Nios II core uses one or more of the following to provide memory and I/O access:
- Instruction master port—An Avalon® Memory-Mapped (Avalon-MM) master port that connects to instruction memory via system interconnect fabric
- Instruction cache—Fast cache memory internal to the Nios II core
- Data master port—An Avalon-MM master port that connects to data memory and peripherals via system interconnect fabric
- Data cache—Fast cache memory internal to the Nios II core
- Tightly-coupled instruction or data memory port—Interface to fast on-chip memory outside the Nios II core
The Nios II architecture handles the hardware details for the programmer, so programmers can develop Nios II applications without specific knowledge of the hardware implementation.
For details that affect programming issues, refer to the Programming Model chapter of the Processor Reference Handbook.
2.6.1. Instruction and Data Buses
The Nios II architecture supports separate instruction and data buses, classifying it as a Harvard architecture. Both the instruction and data buses are implemented as Avalon-MM master ports that adhere to the Avalon-MM interface specification. The data master port connects to both memory and peripheral components, while the instruction master port connects only to memory components.
2.6.1.1. Memory and Peripheral Access
The Nios II architecture does not specify anything about the existence of memory and peripherals; the quantity, type, and connection of memory and peripherals are system-dependent. Typically, processor systems contain a mix of fast on-chip memory and slower off-chip memory. Peripherals typically reside on-chip, although interfaces to off-chip peripherals also exist.
2.6.1.2. Instruction Master Port
The instruction master port is a pipelined Avalon-MM master port. Support for pipelined Avalon-MM transfers minimizes the impact of synchronous memory with pipeline latency and increases the overall fMAX of the system. The instruction master port can issue successive read requests before data has returned from prior requests. The processor can prefetch sequential instructions and perform branch prediction to keep the instruction pipe as active as possible.
The instruction master port always retrieves 32 bits of data. The instruction master port relies on dynamic bus-sizing logic contained in the system interconnect fabric. By virtue of dynamic bus sizing, every instruction fetch returns a full instruction word, regardless of the width of the target memory. Consequently, programs do not need to be aware of the widths of memory in the processor system.
The Nios II architecture supports on-chip cache memory for improving average instruction fetch performance when accessing slower memory. Refer to the "Cache Memory" section of this chapter for details.
The Nios II architecture supports tightly-coupled memory, which provides guaranteed low-latency access to on-chip memory. Refer to the "Tightly-Coupled Memory" section of this chapter for details.
2.6.1.3. Data Master Port
- Read data from memory or a peripheral when the processor executes a load instruction
- Write data to memory or a peripheral when the processor executes a store instruction
Byte-enable signals on the master port specify which of the four byte-lane(s) to write during store operations. When the Nios II core is configured with a data cache line size greater than four bytes, the data master port supports pipelined Avalon-MM transfers. When the data cache line size is only four bytes, any memory pipeline latency is perceived by the data master port as wait states. Load and store operations can complete in a single clock cycle when the data master port is connected to zero-wait-state memory.
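As an illustration of how byte enables relate to a store, the sketch below computes a 4-bit mask for a byte, halfword, or word store at a given byte address. It is a simplified model: the function name is hypothetical, bit i of the result is assumed to select byte lane i, and naturally aligned accesses are assumed.

```c
#include <stdint.h>

/* Hypothetical model: byte-enable mask for a store of `size` bytes
 * (1, 2, or 4) at byte address `addr`. Returns 0 for a misaligned or
 * unsupported access. */
uint8_t store_byte_enable(uint32_t addr, uint32_t size)
{
    uint32_t lane = addr & 3u;              /* byte offset within the word */

    if (size != 1 && size != 2 && size != 4) return 0;
    if (lane % size != 0) return 0;         /* assume natural alignment */

    uint32_t mask = (1u << size) - 1u;      /* 0x1, 0x3, or 0xF */
    return (uint8_t)(mask << lane);
}
```

For example, a halfword store to an address ending in 2 enables only the upper two lanes, so the lower two bytes of the target word are left untouched.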
The Nios II architecture supports on-chip cache memory for improving average data transfer performance when accessing slower memory. Refer to the "Cache Memory" section of this chapter for details.
The Nios II architecture supports tightly-coupled memory, which provides guaranteed low-latency access to on-chip memory. Refer to the "Tightly-Coupled Memory" section of this chapter for details.
2.6.1.4. Shared Memory for Instructions and Data
The data and instruction master ports never cause a gridlock condition in which one port starves the other. For highest performance, assign the data master port higher arbitration priority on any memory that is shared by both instruction and data master ports.
2.6.2. Cache Memory
The instruction and data caches are enabled perpetually at run-time, but methods are provided for software to bypass the data cache so that peripheral accesses do not return cached data. Cache management and cache coherency are handled by software. The Nios II instruction set provides instructions for cache management.
2.6.2.1. Configurable Cache Memory Options
A processor core might include one, both, or neither of the cache memories. Furthermore, for cores that provide data as well as instruction cache, the sizes of the cache memories are user-configurable. The inclusion of cache memory does not affect the functionality of programs, but it does affect the speed at which the processor fetches instructions and reads/writes data.
2.6.2.2. Effective Use of Cache Memory
- Regular memory is located off-chip, and access time is long compared to on-chip memory
- The largest, performance-critical instruction loop is smaller than the instruction cache
- The largest block of performance-critical data is smaller than the data cache
Optimal cache configuration is application specific, although you can make decisions that are effective across a range of applications. For example, if a processor system includes only fast, on-chip memory (i.e., it never accesses slow, off-chip memory), an instruction or data cache is unlikely to offer any performance gain. As another example, if the critical loop of a program is 2 KB, but the size of the instruction cache is 1 KB, an instruction cache does not improve execution speed. In fact, an instruction cache may degrade performance in this situation.
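The 2 KB loop versus 1 KB cache example can be checked with a small simulation. The sketch below assumes a direct-mapped cache with 32-byte lines and a loop of straight-line code that starts at address 0; those parameters and the function name are assumptions for illustration, not properties stated in this handbook.

```c
#include <stdint.h>

#define LINE_BYTES 32u      /* assumed cache line size */
#define MAX_LINES  1024u    /* largest cache this model supports, in lines */

/* Count line misses when a loop of `loop_bytes` runs `iterations` times
 * through a direct-mapped cache of `cache_bytes`. cache_bytes must be a
 * nonzero multiple of LINE_BYTES, at most MAX_LINES * LINE_BYTES. */
unsigned count_icache_misses(uint32_t cache_bytes, uint32_t loop_bytes,
                             unsigned iterations)
{
    uint32_t tags[MAX_LINES];
    uint8_t  valid[MAX_LINES] = {0};
    uint32_t num_lines = cache_bytes / LINE_BYTES;
    unsigned misses = 0;

    for (unsigned it = 0; it < iterations; ++it) {
        for (uint32_t addr = 0; addr < loop_bytes; addr += LINE_BYTES) {
            uint32_t line  = addr / LINE_BYTES;
            uint32_t index = line % num_lines;   /* direct-mapped slot */
            uint32_t tag   = line / num_lines;
            if (!valid[index] || tags[index] != tag) {
                ++misses;                        /* line fetched from memory */
                valid[index] = 1;
                tags[index]  = tag;
            }
        }
    }
    return misses;
}
```

Under these assumptions, a 2 KB loop in a 1 KB cache misses on every line in every pass, because each line is evicted before it is reused, while a 512-byte loop misses only during its first pass.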
If an application always requires certain data or sections of code to be located in cache memory for performance reasons, the tightly-coupled memory feature might provide a more appropriate solution. Refer to the "Tightly-Coupled Memory" section for details.
2.6.2.3. Cache Bypass Methods
The Nios II architecture provides the following methods for bypassing the data cache:
- I/O load and store instructions
- Bit-31 cache bypass
- Peripheral Region
2.6.2.3.1. I/O Load and Store Instructions Method
The load and store I/O instructions such as ldwio and stwio bypass the data cache and force an Avalon-MM data transfer to a specified address.
2.6.2.3.2. The Bit-31 Cache Bypass Method
The bit-31 cache bypass method on the data master port uses bit 31 of the address as a tag that indicates whether the processor should transfer data to/from cache, or bypass it. This is a convenience for software, which might need to cache certain addresses and bypass others. Software can pass addresses as parameters between functions, without having to specify any further information about whether the addressed data is cached or not.
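A minimal sketch of the address arithmetic follows. It assumes that a set bit 31 means "bypass the cache" and that clearing it yields the cached view of the same location; the polarity and the helper names are assumptions for this example, so confirm them against the core's documentation before relying on them.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT31_BYPASS 0x80000000u

/* Hypothetical helpers modelling the bit-31 tag on a data address. */
uint32_t uncached_alias(uint32_t addr) { return addr | BIT31_BYPASS; }
uint32_t cached_alias(uint32_t addr)   { return addr & ~BIT31_BYPASS; }
bool     bypasses_cache(uint32_t addr) { return (addr & BIT31_BYPASS) != 0; }
```

Because the tag travels inside the address itself, a function that receives `uncached_alias(buffer)` needs no extra parameter to know that its accesses should skip the cache.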
To determine which cores implement which cache bypass methods, refer to the Nios II Core Implementation Details chapter of the Processor Reference Handbook.
2.6.2.3.3. Peripheral Region
Some Nios II cores optionally support a peripheral region mechanism to indicate cacheability. This mechanism allows you to specify, at generation time, a region of address space that is treated as non-cacheable. The size of the peripheral region is any integer power of 2 bytes, from a minimum of 4096 bytes up to a maximum of 2 GB, and the region must be located at a base address aligned to its size. The peripheral region is available as long as an MMU is not present.
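The size and alignment rules for the peripheral region can be written as a short check. The sketch below is a host-side model with hypothetical function names; it validates a candidate region and tests whether an address falls inside it.

```c
#include <stdbool.h>
#include <stdint.h>

#define PERIPH_REGION_MIN 4096u          /* 4096 bytes */
#define PERIPH_REGION_MAX 0x80000000u    /* 2 GB */

/* True when size is a power of 2 within the allowed range and base is
 * aligned to size. */
bool peripheral_region_is_valid(uint32_t base, uint32_t size)
{
    bool power_of_two = size != 0 && (size & (size - 1u)) == 0;
    if (!power_of_two) return false;
    if (size < PERIPH_REGION_MIN || size > PERIPH_REGION_MAX) return false;
    return (base & (size - 1u)) == 0;
}

/* True when addr lies inside a valid region; alignment lets the test be
 * a single mask-and-compare. */
bool in_peripheral_region(uint32_t addr, uint32_t base, uint32_t size)
{
    return (addr & ~(size - 1u)) == base;
}
```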
2.6.3. Tightly-Coupled Memory
- Performance similar to cache memory
- Software can guarantee that performance-critical code or data is located in tightly-coupled memory
- No real-time caching overhead, such as loading, invalidating, or flushing memory
Physically, a tightly-coupled memory port is a separate master port on the processor core, similar to the instruction or data master port. A Nios II core can have zero, one, or multiple tightly-coupled memories. The Nios II architecture supports tightly-coupled memory for both instruction and data access. Each tightly-coupled memory port connects directly to exactly one memory with guaranteed low, fixed latency. The memory is external to the Nios II core and is located on chip.
2.6.3.1. Accessing Tightly-Coupled Memory
2.6.3.2. Effective Use of Tightly-Coupled Memory
A system can use tightly-coupled memory to achieve maximum performance for accessing a specific section of code or data. For example, interrupt-intensive applications can place exception handler code into a tightly-coupled memory to minimize interrupt latency. Similarly, compute-intensive digital signal processing (DSP) applications can place data buffers into tightly-coupled memory for the fastest possible data access.
If the application’s memory requirements are small enough to fit entirely on chip, it is possible to use tightly-coupled memory exclusively for code and data. Larger applications must selectively choose what to include in tightly-coupled memory to maximize the cost-performance trade-off.
2.6.4. Address Map
There are three addresses that are part of the processor and deserve special mention:
- Reset address
- Exception address
- Break handler address
Programmers access memories and peripherals by using macros and drivers. Therefore, the flexible address map does not affect application developers.
2.6.5. Memory Management Unit
- Virtual to physical address mapping
- Memory protection
- 32-bit virtual and physical addresses, mapping a 4-GB virtual address space into as much as 4 GB of physical memory
- 4-KB page and frame size
- Low 512 MB of physical address space available for direct access
- Hardware translation lookaside buffers (TLBs), accelerating address translation
- Separate TLBs for instruction and data accesses
- Read, write, and execute permissions controlled per page
- Default caching behavior controlled per page
- TLBs acting as n-way set-associative caches for software page tables
- TLB sizes and associativities configurable in the Processor parameter editor
- Format of page tables (or equivalent data structures) determined by system software
- Replacement policy for TLB entries determined by system software
- Write policy for TLB entries determined by system software
For more information about the MMU implementation, refer to the Programming Model chapter of the Processor Reference Handbook.
You can optionally include the MMU when you instantiate the processor in your hardware system. When present, the MMU is always enabled, and the data and instruction caches are virtually-indexed, physically-tagged caches. Several parameters are available, allowing you to optimize the MMU for your system needs.
For complete details about user-selectable parameters for the Nios II MMU, refer to the Instantiating the Processor chapter of the Processor Reference Handbook.
2.6.6. Memory Protection Unit
- Memory protection
- Up to 32 instruction regions and 32 data regions
- Variable instruction and data region sizes
- Amount of region memory defined by size or upper address limit
- Read and write access permissions for data regions
- Execute access permissions for instruction regions
- Overlapping regions
For more information about the MPU implementation, refer to the Programming Model chapter of the Processor Reference Handbook.
You can optionally include the MPU when you instantiate the processor in your hardware system. When present, the MPU is always enabled. Several parameters are available, allowing you to optimize the MPU for your system needs.
For complete details about user-selectable parameters for the Nios II MPU, refer to the Instantiating the Processor chapter of the Processor Reference Handbook.
2.7. JTAG Debug Module
- Downloading programs to memory
- Starting and stopping execution
- Setting breakpoints and watchpoints
- Analyzing registers and memory
- Collecting real-time execution trace data
The debug module connects to the JTAG circuitry in an FPGA. External debugging probes can then access the processor via the standard JTAG interface on the FPGA. On the processor side, the debug module connects to signals inside the processor core. The debug module has nonmaskable control over the processor, and does not require a software stub linked into the application under test. All system resources visible to the processor in supervisor mode are available to the debug module. For trace data collection, the debug module stores trace data in memory either on-chip or in the debug probe.
The debug module gains control of the processor either by asserting a hardware break signal, or by writing a break instruction into program memory to be executed. In both cases, the processor transfers execution to the routine located at the break address. The break address is specified with the Processor parameter editor in .
Soft processor cores such as the Nios II processor offer unique debug capabilities beyond the features of traditional, fixed processors. The soft nature of the processor allows you to debug a system in development using a full-featured debug core, and later remove the debug features to conserve logic resources. For the release version of a product, the JTAG debug module functionality can be reduced, or removed altogether.
The following sections describe the capabilities of the Nios II JTAG debug module hardware. The usage of all hardware features is dependent on host software, such as the Nios II Software Build Tools for Eclipse, which manages the connection to the target processor and controls the debug process.
2.7.1. JTAG Target Connection
2.7.2. Download and Execute Software
2.7.3. Software Breakpoints
2.7.4. Hardware Breakpoints
Hardware breakpoints are implemented using the JTAG debug module’s hardware trigger feature.
2.7.5. Hardware Triggers
Hardware trigger conditions are based on either the instruction or data bus. Trigger conditions on the same bus can be logically ANDed, enabling the JTAG debug module to trigger, for example, only on write cycles to a specific address.
Condition | Bus | Description |
---|---|---|
Specific address | Data, Instruction | Trigger when the bus accesses a specific address. |
Specific data value | Data | Trigger when a specific data value appears on the bus. |
Read cycle | Data | Trigger on a read bus cycle. |
Write cycle | Data | Trigger on a write bus cycle. |
Armed | Data, Instruction | Trigger only after an armed trigger event. Refer to the Armed Triggers section. |
Range | Data | Trigger on a range of address values, data values, or both. Refer to the Triggering on Ranges of Values section. |
When a trigger condition occurs during processor execution, the JTAG debug module triggers an action, such as halting execution, or starting trace capture. The table below lists the trigger actions supported by the Nios II JTAG debug module.
Action | Description |
---|---|
Break | Halt execution and transfer control to the JTAG debug module. |
External trigger | Assert a trigger signal output. This trigger output can be used, for example, to trigger an external logic analyzer. |
Trace on | Turn on trace collection. |
Trace off | Turn off trace collection. |
Trace sample | Store one sample of the bus to trace buffer. |
Arm | Enable an armed trigger. |
2.7.5.1. Armed Triggers
The JTAG debug module provides a two-level trigger capability, called armed triggers. Armed triggers enable the JTAG debug module to trigger on event B, only after event A. In this example, event A causes a trigger action that enables the trigger for event B.
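The two-level behavior can be pictured as a small state machine. The sketch below is a software model for illustration only, not the debug module's implementation; the type and function names are hypothetical, and the choices to stay armed after firing and to ignore an event B that arrives in the same cycle as the arming event A are assumptions.

```c
#include <stdbool.h>

/* Hypothetical model of a two-level (armed) trigger. */
typedef struct {
    bool armed;  /* set once event A has been observed */
} armed_trigger_t;

/* Feed one observed bus cycle into the model.
 * Returns true when the event B trigger fires. */
static bool armed_trigger_step(armed_trigger_t *t, bool event_a, bool event_b)
{
    if (t->armed && event_b)
        return true;       /* B fires only after A has armed it */
    if (event_a)
        t->armed = true;   /* A's trigger action is "Arm" */
    return false;
}
```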
2.7.5.2. Triggering on Ranges of Values
The JTAG debug module can trigger on ranges of data or address values on the data bus. This mechanism uses two hardware triggers together to create a trigger condition that activates on any value within a specified range.
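Conceptually, a range trigger combines two comparisons. The following sketch models that idea in software; the names are hypothetical, and treating both bounds as inclusive is an assumption rather than documented hardware behavior.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical range condition built from two comparisons. */
typedef struct {
    uint32_t lower;  /* assumed inclusive lower bound */
    uint32_t upper;  /* assumed inclusive upper bound */
} range_trigger_t;

/* Returns true when value lies inside the range. */
static bool range_trigger_match(const range_trigger_t *r, uint32_t value)
{
    bool at_or_above_lower = value >= r->lower;  /* first comparison */
    bool at_or_below_upper = value <= r->upper;  /* second comparison */
    return at_or_above_lower && at_or_below_upper;
}
```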
2.7.6. Trace Capture
- Capture execution trace (instruction bus cycles).
- Capture data trace (data bus cycles).
- For each data bus cycle, capture address, data, or both.
- Start and stop capturing trace in real time, based on triggers.
- Manually start and stop trace under host control.
- Optionally stop capturing trace when trace buffer is full, leaving the processor executing.
- Store trace data in on-chip memory buffer in the JTAG debug module. (This memory is accessible only through the JTAG connection.)
- Store trace data to larger buffers in an off-chip debug probe.
Certain trace features require additional licensing or debug tools from third-party debug providers. For example, an on-chip trace buffer is a standard feature of the processor, but using an off-chip trace buffer requires additional debug software and hardware provided by Imagination Technologies™, LLC or Lauterbach GmbH.
2.7.6.1. Execution vs. Data Trace
The JTAG debug module can filter the data bus trace in real time to capture the following:
- Load addresses only
- Store addresses only
- Both load and store addresses
- Load data only
- Load address and data
- Store address and data
- Address and data for both loads and stores
- Single sample of the data bus upon trigger event
2.7.6.2. Trace Frames
To keep pace with the processor executing in real time, execution trace is optimized to store only selected addresses, such as branches, calls, traps, and interrupts. From these addresses, host-side debug software can later reconstruct an exact instruction-by-instruction execution trace. Furthermore, execution trace data is stored in a compressed format, such that one frame represents more than one instruction. As a result of these optimizations, the actual start and stop points for trace collection during execution might vary slightly from the user-specified start and stop points.
Data trace stores 100% of requested loads and stores to the trace buffer in real time. When storing to the trace buffer, data trace frames have lower priority than execution trace frames. Therefore, while data frames are always stored in chronological order, execution and data trace are not guaranteed to be exactly synchronized with each other.
2.8. Processor Architecture Revision History
Document Version | Changes |
---|---|
2019.05.20 | Reset and Debug Signals: Added definition for the debug_reset_request signal. |
2019.04.30 | Maintenance release |
2018.04.18 |
|
2016.10.28 | Maintenance release |
2015.04.02 | Initial release |
Date | Version | Changes |
---|---|---|
April 2015 | 2015.04.02 | Maintenance release. |
February 2014 | 13.1.0 |
|
May 2011 | 11.0.0 |
|
December 2010 | 10.1.0 | Added reference to tightly-coupled memory tutorial. |
July 2010 | 10.0.0 | Maintenance release. |
November 2009 | 9.1.0 |
|
March 2009 | 9.0.0 | Maintenance release. |
November 2008 | 8.1.0 |
|
May 2008 | 8.0.0 | Added MMU and MPU sections. |
October 2007 | 7.2.0 | Maintenance release. |
May 2007 | 7.1.0 |
|
March 2007 | 7.0.0 | Maintenance release. |
November 2006 | 6.1.0 | Described interrupt vector custom instruction. |
May 2006 | 6.0.0 |
|
October 2005 | 5.1.0 | Maintenance release. |
May 2005 | 5.0.0 | Added tightly-coupled memory. |
December 2004 | 1.2 | Added new control register ctl5. |
September 2004 | 1.1 | Updates for Nios II 1.01 release. |
May 2004 | 1.0 | Initial release. |
3. Programming Model
This chapter describes the Nios® II programming model, covering processor features at the assembly language level. Fully understanding the contents of this chapter requires prior knowledge of computer architecture, operating systems, virtual memory and memory management, software processes and process management, exception handling, and instruction sets. This chapter assumes you have a detailed understanding of these concepts and focuses on how these concepts are specifically implemented in the processor. Where possible, this chapter uses industry-standard terminology.
3.1. Operating Modes
- Supervisor mode
- User mode
The following sections define the modes, their relationship to your system software and application code, and their relationship to the Nios II MMU and Nios II MPU.
3.1.1. Supervisor Mode
Operating systems and other system software run in supervisor mode. In systems with an MMU, application code runs in user mode, and the operating system, running in supervisor mode, controls the application’s access to memory and peripherals. In systems with an MPU, your system software controls the mode in which your application code runs. In systems without an MMU or MPU, all application and system code runs in supervisor mode.
Code that needs direct access to and control of the processor runs in supervisor mode. For example, the processor enters supervisor mode whenever a processor exception (including processor reset or break) occurs. Software debugging tools also use supervisor mode to implement features such as breakpoints and watchpoints.
3.1.2. User Mode
The operating system determines which memory addresses are accessible to user mode applications. Attempts by user mode applications to access memory locations without user access enabled are not permitted and cause an exception. Code running in user mode uses system calls to make requests to the operating system to perform I/O operations, manage memory, and access other system functionality in the supervisor memory.
The Nios II MMU statically divides the 32-bit virtual address space into user and supervisor partitions. Refer to the Address Space and Memory Partitions section for more information about the MMU memory partitions. The MMU provides operating systems access permissions on a per-page basis. Refer to the Virtual Addressing section for more information about MMU pages.
The Nios II MPU supervisor and user memory divisions are determined by the operating system or runtime environment. The MPU provides user access permissions on a region basis. Refer to Memory Regions for more information about MPU regions.
3.2. Memory Management Unit
3.2.1. Recommended Usage
Many systems have simpler requirements where minimal system software or a small-footprint operating system (such as the FPGA® hardware abstraction library (HAL) or a third party real-time operating system) is sufficient. Such software is unlikely to function correctly in a hardware system with an MMU-based processor. Do not include an MMU in your system unless your operating system requires it.
If your system needs memory protection, but not virtual memory management, refer to the Memory Protection Unit section.
3.2.2. Memory Management
- Virtual addressing—Mapping a virtual memory space into a physical memory space
- Memory protection—Allowing access only to certain memory under certain conditions
3.2.2.1. Virtual Addressing
The MMU contains a hardware translation lookaside buffer (TLB). The operating system is responsible for creating and maintaining a page table (or equivalent data structures) in memory. The hardware TLB acts as a software managed cache for the page table. The MMU does not perform any operations on the page table, such as hardware table walks. Therefore the operating system is free to implement its page table in any appropriate manner.
A 32-bit virtual address consists of a 20-bit virtual page number (VPN) and a 12-bit page offset.
Bits | Field |
---|---|
31–12 | Virtual Page Number |
11–0 | Page Offset |
As input, the TLB takes a VPN plus a process identifier (to guarantee uniqueness). As output, the TLB provides the corresponding physical frame number (PFN).
Distinct processes can use the same virtual address space. The process identifier, concatenated with the virtual address, distinguishes identical virtual addresses in separate processes. To determine the physical address, the Nios II MMU translates a VPN to a PFN and then concatenates the PFN with the page offset. The bits in the page offset are not translated.
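The split and recombination described above amount to a shift and a mask. The sketch below illustrates the arithmetic for the 20-bit VPN and 12-bit page offset; the helper names are hypothetical, and the PFN is supplied by the caller because the TLB lookup itself is not modeled here.

```c
#include <stdint.h>

#define PAGE_OFFSET_BITS 12u
#define PAGE_OFFSET_MASK ((1u << PAGE_OFFSET_BITS) - 1u)  /* 0xFFF */

/* Upper 20 bits of the virtual address. */
static uint32_t vaddr_to_vpn(uint32_t vaddr)
{
    return vaddr >> PAGE_OFFSET_BITS;
}

/* Lower 12 bits of the virtual address; these are not translated. */
static uint32_t vaddr_to_offset(uint32_t vaddr)
{
    return vaddr & PAGE_OFFSET_MASK;
}

/* Concatenate a PFN with the untranslated page offset. */
static uint32_t make_paddr(uint32_t pfn, uint32_t vaddr)
{
    return (pfn << PAGE_OFFSET_BITS) | vaddr_to_offset(vaddr);
}
```

For instance, virtual address 0x12345678 has VPN 0x12345 and page offset 0x678; if that page maps to PFN 0x00ABC, the physical address is 0x00ABC678.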
3.2.2.2. Memory Protection
The MMU maintains read, write, and execute permissions for each page. The TLB provides the permission information when translating a VPN. The operating system can control whether or not each process is allowed to read data from, write data to, or execute instructions on each particular page. The MMU also controls whether accesses to each data page are cacheable or uncacheable by default.
Whenever an instruction attempts to access a page that either has no TLB mapping, or lacks the appropriate permissions, the MMU generates an exception. The processor’s precise exceptions enable the system software to update the TLB, and then re-execute the instruction if desired.
3.2.3. Address Space and Memory Partitions
3.2.3.1. Virtual Memory Address Space
Partition | Virtual Address Range | Used By | Memory Access | User Mode Access | Default Data Cacheability |
---|---|---|---|---|---|
I/O | 0xE0000000–0xFFFFFFFF | Operating system | Bypasses TLB | No | Disabled |
Kernel | 0xC0000000–0xDFFFFFFF | Operating system | Bypasses TLB | No | Enabled |
Kernel MMU | 0x80000000–0xBFFFFFFF | Operating system | Uses TLB | No | Set by TLB |
User | 0x00000000–0x7FFFFFFF | User processes | Uses TLB | Set by TLB | Set by TLB |
Each partition has a specific size, purpose, and relationship to the TLB:
- The 512-MB I/O partition provides access to peripherals.
- The 512-MB kernel partition provides space for the operating system kernel.
- The 1-GB kernel MMU partition is used by the TLB miss handler and kernel processes.
- The 2-GB user partition is used by application processes.
I/O and kernel partitions bypass the TLB. The kernel MMU and user partitions use the TLB. If all software runs in the kernel partition, the MMU is effectively disabled.
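The partition boundaries in the table can be expressed as a simple address classifier. This is an illustrative sketch only; the enum and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the four virtual memory partitions. */
typedef enum {
    PART_USER,        /* 0x00000000-0x7FFFFFFF, 2 GB  */
    PART_KERNEL_MMU,  /* 0x80000000-0xBFFFFFFF, 1 GB  */
    PART_KERNEL,      /* 0xC0000000-0xDFFFFFFF, 512 MB */
    PART_IO           /* 0xE0000000-0xFFFFFFFF, 512 MB */
} partition_t;

/* Classify a virtual address by the ranges in the partition table. */
static partition_t vaddr_partition(uint32_t vaddr)
{
    if (vaddr < 0x80000000u) return PART_USER;
    if (vaddr < 0xC0000000u) return PART_KERNEL_MMU;
    if (vaddr < 0xE0000000u) return PART_KERNEL;
    return PART_IO;
}

/* The user and kernel MMU partitions use the TLB; the others bypass it. */
static bool partition_uses_tlb(partition_t p)
{
    return p == PART_USER || p == PART_KERNEL_MMU;
}
```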
3.2.3.2. Physical Memory Address Space
High physical memory can only be accessed through the TLB. Any physical address in low memory (29 bits or fewer) can be accessed through the TLB or by bypassing the TLB. When bypassing the TLB, a 29-bit physical address is computed by clearing the top three bits of the 32-bit virtual address.
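Clearing the top three bits is a single mask operation. A minimal sketch follows; the function name is hypothetical.

```c
#include <stdint.h>

/* Physical address for an access that bypasses the TLB:
 * clear the top three bits of the 32-bit virtual address,
 * leaving a 29-bit physical address. */
static uint32_t bypass_tlb_paddr(uint32_t vaddr)
{
    return vaddr & 0x1FFFFFFFu;
}
```

Under this computation, virtual addresses 0xC0001000 (kernel partition) and 0xE0001000 (I/O partition) both resolve to physical address 0x00001000.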
3.2.3.3. Data Cacheability
Non-I/O load and store instructions use the default data cacheability property. I/O load and store instructions are always noncacheable, so they ignore the default data cacheability property.
3.2.4. TLB Organization
The TLB is organized as an n-way set-associative cache. The software specifies the way (set) when loading a new entry.
- Cyclone III®, Stratix III®, Stratix IV: 256 entries, requiring one M9K RAM
For more information, refer to the Instantiating the Processor chapter of the Processor Reference Handbook.
The operating system software is responsible for guaranteeing that multiple TLB entries do not map the same virtual address. The hardware behavior is undefined when multiple entries map the same virtual address.
Each TLB entry consists of a tag and data portion. This is analogous to the tag and data portion of instruction and data caches.
Refer to the Nios II Core Implementation Details chapter of the Processor Reference Handbook for information about instruction and data caches.
The tag portion of a TLB entry contains information used when matching a virtual address to a TLB entry.
Field Name | Description |
---|---|
VPN | VPN is the virtual page number field. This field is compared with the top 20 bits of the virtual address. |
PID | PID is the process identifier field. This field is compared with the value of the current process identifier stored in the tlbmisc control register, effectively extending the virtual address. The field size is configurable in the Nios II Processor parameter editor, and can be between 8 and 14 bits. |
G | G is the global flag. When G = 1, the PID is ignored in the TLB lookup. |
The TLB data portion determines how to translate a matching virtual address to a physical address.
Field Name | Description |
---|---|
PFN | PFN is the physical frame number field. This field specifies the upper bits of the physical address. The size of this field depends on the range of physical addresses present in the system. The maximum size is 20 bits. |
C | C is the cacheable flag. Determines the default data cacheability of a page. Can be overridden for data accesses using I/O load and store family of Nios II instructions. |
R | R is the readable flag. Allows load instructions to read a page. |
W | W is the writable flag. Allows store instructions to write a page. |
X | X is the executable flag. Allows instruction fetches from a page. |
3.2.5. TLB Lookups
TLB Lookup Algorithm for Instruction Fetches
    if (VPN match && (G == 1 || PID match))
        if (X == 1)
            PADDR = concatenate(PFN, VADDR[11:0])
        else
            take TLB permission violation exception
    else
        if (EH bit of status register == 1)
            take double TLB miss exception
        else
            take fast TLB miss exception
TLB Lookup Algorithm for Data Access Operations
    if (VPN match && (G == 1 || PID match))
        if ((load && R == 1) || (store && W == 1) || flushda)
            PADDR = concatenate(PFN, VADDR[11:0])
        else
            take TLB permission violation exception
    else
        if (EH bit of status register == 1)
            take double TLB miss exception
        else
            take fast TLB miss exception
Refer to “Instruction-Related Exceptions” for information about TLB exceptions.
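The two lookup algorithms can be combined into one software model. The sketch below checks a single TLB entry and is intended only to illustrate the decision flow; the types, names, and single-entry simplification are assumptions, and a real TLB searches the ways of the selected set in hardware.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { ACC_FETCH, ACC_LOAD, ACC_STORE, ACC_FLUSHDA } access_t;

typedef enum {
    TLB_HIT,
    TLB_PERMISSION_VIOLATION,
    TLB_FAST_MISS,
    TLB_DOUBLE_MISS
} tlb_result_t;

/* Hypothetical software view of one TLB entry (tag plus data). */
typedef struct {
    uint32_t vpn;   /* tag: virtual page number (top 20 bits) */
    uint32_t pid;   /* tag: process identifier */
    bool     g;     /* tag: global flag, PID ignored when set */
    uint32_t pfn;   /* data: physical frame number */
    bool     r, w, x;  /* data: read, write, execute permissions */
} tlb_entry_t;

/* Model of the lookup for one entry. status_eh is the EH bit of the
 * status register. On TLB_HIT, *paddr receives the physical address. */
static tlb_result_t tlb_lookup(const tlb_entry_t *e, uint32_t vaddr,
                               uint32_t cur_pid, bool status_eh,
                               access_t acc, uint32_t *paddr)
{
    bool tag_match = (e->vpn == (vaddr >> 12)) &&
                     (e->g || e->pid == cur_pid);
    if (!tag_match)
        return status_eh ? TLB_DOUBLE_MISS : TLB_FAST_MISS;

    bool allowed = (acc == ACC_FETCH && e->x) ||
                   (acc == ACC_LOAD  && e->r) ||
                   (acc == ACC_STORE && e->w) ||
                   (acc == ACC_FLUSHDA);  /* flushda needs no permission */
    if (!allowed)
        return TLB_PERMISSION_VIOLATION;

    *paddr = (e->pfn << 12) | (vaddr & 0xFFFu);  /* concatenate(PFN, VADDR[11:0]) */
    return TLB_HIT;
}
```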
3.3. Memory Protection Unit
When present and enabled, the MPU monitors all Nios II instruction fetches and data memory accesses to protect against errant software execution. The MPU is a hardware facility that system software uses to define memory regions and their associated access permissions. The MPU triggers an exception if software attempts to access a memory region in violation of its permissions, allowing you to intervene and handle the exception as appropriate. The precise exception effectively prevents the illegal access to memory.
The MPU extends the processor to support user mode and supervisor mode. Typically, system software runs in supervisor mode and end-user applications run in user mode, although all software can run in supervisor mode if desired. System software defines which MPU regions belong to supervisor mode and which belong to user mode.
The MPU protects user applications. Because the MPU generates an exception after the access rather than blocking the access in hardware, system software that services interrupts must have access permission for any region whose access might otherwise cause a violation.
3.3.1. Memory Regions
- Base address
- Region type
- Region index
- Region size or upper address limit
- Access permissions
- Default cacheability (data regions only)
3.3.1.1. Base Address
3.3.1.2. Region Type
3.3.1.3. Region Index
3.3.1.4. Region Size or Upper Address Limit
A generation-time option controls whether the amount of memory in the region is defined by size or upper address limit. The size is an integer power of two bytes. The limit is the highest address of the region plus one. The minimum supported region size is 64 bytes but can be configured for larger minimum sizes to save logic resources. The maximum supported region size equals the Nios II address space (a function of the address ranges of slaves connected to the Nios II masters). Any access outside of the Nios II address space is considered not to match any region and triggers an MPU region violation exception.
When regions are defined by size, the size is encoded as a binary mask to facilitate the following MPU region address range matching:
(address & region_mask) == region_base_address
When regions are defined by limit, the limit is encoded as an unsigned integer to facilitate the following MPU region address range matching:
(address >= region_base) && (address < region_limit)
The region limit uses a less-than instead of a less-than-or-equal-to comparison because less-than provides a more efficient implementation. The limit is one bit larger than the address so that the full address range may be included in a range. Defining the region by limit results in slower and larger address range match logic than defining by size but allows finer granularity in region sizes.
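The two matching expressions can be exercised directly in C. The sketch below is illustrative only: the function names are hypothetical, the mask is derived from a power-of-two size as an assumption about how software would compute it (the register-level MASK encoding is not modeled), and the limit is held in a 64-bit integer to represent a value one bit larger than a 32-bit address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask for a power-of-two region size, e.g. size 0x1000 -> 0xFFFFF000. */
static uint32_t region_mask_from_size(uint32_t size)
{
    return ~(size - 1u);
}

/* Region defined by size: (address & region_mask) == region_base_address */
static bool region_match_by_size(uint32_t address, uint32_t region_base,
                                 uint32_t region_mask)
{
    return (address & region_mask) == region_base;
}

/* Region defined by limit: (address >= region_base) && (address < region_limit).
 * The limit is the highest address plus one, so it can equal 2^32. */
static bool region_match_by_limit(uint32_t address, uint32_t region_base,
                                  uint64_t region_limit)
{
    return address >= region_base && (uint64_t)address < region_limit;
}
```

With a limit of 0x100000000, a region starting at 0xFFFF0000 includes address 0xFFFFFFFF, which is why the limit needs the extra bit.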
3.3.1.5. Access Permissions
3.3.1.6. Default Cacheability
The default cacheability specifies whether normal load and store instructions access the data cache or bypass the data cache. The default cacheability is only present for data regions. You can override the default cacheability by using the ldwio or stwio instructions. The bit-31 cache bypass and Peripheral Region features are available when the MMU is not present, so they remain available when the MPU is present.
Refer to the Cache Memory section for more information on cache bypass and Peripheral Region.
3.3.2. Overlapping Regions
If regions overlap so that a particular access matches more than one region, the region with the highest priority (lowest index) determines the access permissions and default cacheability.
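Because the lowest index has the highest priority, resolving an overlap is equivalent to returning the first matching region when scanning in index order. The sketch below models this for regions defined by size; the structure and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a size-defined MPU region. */
typedef struct {
    uint32_t base;  /* region base address */
    uint32_t mask;  /* binary mask derived from the region size */
} mpu_region_t;

/* Returns the index of the highest-priority (lowest-index) region that
 * matches address, or -1 if no region matches. */
static int mpu_find_region(const mpu_region_t *regions, size_t count,
                           uint32_t address)
{
    for (size_t i = 0; i < count; i++) {
        if ((address & regions[i].mask) == regions[i].base)
            return (int)i;  /* first match wins: lowest index */
    }
    return -1;
}
```

For example, if region 0 covers 4 KB at 0x1000 and region 1 covers 64 KB at 0x0000, address 0x1234 matches both regions and region 0 determines the permissions.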
3.3.3. Enabling the MPU
3.4. Registers
3.4.1. General-Purpose Registers
The Nios II architecture provides thirty-two 32-bit general-purpose registers, r0 through r31. Some registers have names recognized by the assembler. For example, the zero register (r0) always returns the value zero, and writing to zero has no effect. The ra register (r31) holds the return address used by procedure calls and is implicitly accessed by the call, callr and ret instructions. C and C++ compilers use a common procedure-call convention, assigning specific meaning to registers r1 through r23 and r26 through r28.
Register | Name | Function | Register | Name | Function |
---|---|---|---|---|---|
r0 | zero | 0x00000000 | r16 | | Callee-saved register |
r1 | at | Assembler temporary | r17 | | Callee-saved register |
r2 | | Return value | r18 | | Callee-saved register |
r3 | | Return value | r19 | | Callee-saved register |
r4 | | Register arguments | r20 | | Callee-saved register |
r5 | | Register arguments | r21 | | Callee-saved register |
r6 | | Register arguments | r22 | | Callee-saved register |
r7 | | Register arguments | r23 | | Callee-saved register |
r8 | | Caller-saved register | r24 | et | Exception temporary |
r9 | | Caller-saved register | r25 | bt | Breakpoint temporary |
r10 | | Caller-saved register | r26 | gp | Global pointer |
r11 | | Caller-saved register | r27 | sp | Stack pointer |
r12 | | Caller-saved register | r28 | fp | Frame pointer |
r13 | | Caller-saved register | r29 | ea | Exception return address |
r14 | | Caller-saved register | r30 | sstatus | Status register |
r15 | | Caller-saved register | r31 | ra | Return address |
For more information, refer to the Application Binary Interface chapter of the Processor Reference Handbook.
3.4.2. Control Registers
Control registers report the status and change the behavior of the processor. Control registers are accessed differently than the general-purpose registers. The special instructions rdctl and wrctl provide the only means to read and write to the control registers and are only available in supervisor mode.
The architecture supports up to 32 control registers. All non-reserved control registers have names recognized by the assembler.
Register | Name | Register Contents |
---|---|---|
0 | status | Refer to The status Register |
1 | estatus | Refer to The estatus Register |
2 | bstatus | Refer to The bstatus Register |
3 | ienable | Internal interrupt-enable bits. Available only when the external interrupt controller interface is not present. Otherwise reserved. |
4 | ipending | Pending internal interrupt bits. Available only when the external interrupt controller interface is not present. Otherwise reserved. |
5 | cpuid | Unique processor identifier |
6 | Reserved | Reserved |
7 | exception | Refer to The exception Register |
8 | pteaddr | Refer to The pteaddr Register. Available only when the MMU is present. Otherwise reserved. |
9 | tlbacc | Refer to The tlbacc Register. Available only when the MMU is present. Otherwise reserved. |
10 | tlbmisc | Refer to The tlbmisc Register. Available only when the MMU is present. Otherwise reserved. |
11 | eccinj | Refer to The eccinj Register. Available only when ECC is present. |
12 | badaddr | Refer to The badaddr Register |
13 | config | Refer to The config Register. Available only when the MPU or ECC is present. Otherwise reserved. |
14 | mpubase | Refer to The mpubase Register. Available only when the MPU is present. Otherwise reserved. |
15 | mpuacc | Refer to The mpuacc Register for MASK variations table. Available only when the MPU is present. Otherwise reserved. |
16–31 | Reserved | Reserved |
The following sections describe the non-reserved control registers.
3.4.2.1. The status Register
Bit Fields | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 |
Reserved | RSIE | NMI | PRS | ||||||||||||
15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
CRS | IL | IH | EH | U | PIE |
Bit | Description | Access | Reset | Available |
---|---|---|---|---|
RSIE | RSIE is the register set interrupt-enable bit. When set to 1, this bit allows the processor to service external interrupts requesting the register set that is currently in use. When set to 0, this bit disallows servicing of such interrupts. | Read/Write | 1 | EIC interface and shadow register sets only1 |
NMI | NMI is the nonmaskable interrupt mode bit. The processor sets NMI to 1 when it takes a nonmaskable interrupt. | Read | 0 | EIC interface only4 |
PRS | PRS is the previous register set field. The processor copies the CRS field to the PRS field upon one of the following events: | Read/Write | 0 | Shadow register sets only4 |
CRS | CRS is the current register set field. CRS indicates which register set is currently in use. Register set 0 is the normal register set, while register sets 1 and higher are shadow register sets. The processor sets CRS to zero on any noninterrupt exception. The number of significant bits in the CRS and PRS fields depends on the number of shadow register sets implemented in the Nios II core. Unused high-order bits are always read as 0, and must be written as 0. | Read2 | 0 | Shadow register sets only4 |
IL | IL is the interrupt level field. The IL field controls what level of external maskable interrupts can be serviced. The processor services a maskable interrupt only if its requested interrupt level is greater than IL. | Read/Write | 0 | EIC interface only4 |
IH | IH is the interrupt handler mode bit. The processor sets IH to one when it takes an external interrupt. | Read/Write | 0 | EIC interface only4 |
EH 3 | EH is the exception handler mode bit. The processor sets EH to one when an exception occurs (including breaks). Software clears EH to zero when ready to handle exceptions again. EH is used by the MMU to determine whether a TLB miss exception is a fast TLB miss or a double TLB miss. In systems without an MMU, EH is always zero. | Read/Write | 0 | MMU or ECC only4 |
U 3 | U is the user mode bit. When U = 1, the processor operates in user mode. When U = 0, the processor operates in supervisor mode. In systems without an MMU, U is always zero. | Read/Write | 0 | MMU or MPU only4 |
PIE | PIE is the processor interrupt-enable bit. When PIE = 0, internal and maskable external interrupts are ignored. When PIE = 1, internal and maskable external interrupts can be taken, depending on the status of the interrupt controller. Noninterrupt exceptions are unaffected by PIE. | Read/Write | 0 | Always |
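The field descriptions above can be exercised with a small host-side decoder. This is a minimal sketch, not vendor code: the bit positions are assumptions read off the bit-field diagram (PIE at bit 0 up to RSIE at bit 23, with 6-bit IL, CRS, and PRS fields), and the names `status_fields`, `get_field`, `decode_status`, and `il_permits` are hypothetical. On hardware, the raw value would come from an `rdctl` of the status register.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions, taken from the status bit-field diagram. */
enum {
    STATUS_PIE_SHIFT  = 0,   /* 1 bit  */
    STATUS_U_SHIFT    = 1,   /* 1 bit  */
    STATUS_EH_SHIFT   = 2,   /* 1 bit  */
    STATUS_IH_SHIFT   = 3,   /* 1 bit  */
    STATUS_IL_SHIFT   = 4,   /* 6 bits */
    STATUS_CRS_SHIFT  = 10,  /* 6 bits */
    STATUS_PRS_SHIFT  = 16,  /* 6 bits */
    STATUS_NMI_SHIFT  = 22,  /* 1 bit  */
    STATUS_RSIE_SHIFT = 23   /* 1 bit  */
};

typedef struct {
    unsigned rsie, nmi, prs, crs, il, ih, eh, u, pie;
} status_fields;

/* Extract `width` bits starting at bit `shift`. */
static uint32_t get_field(uint32_t reg, unsigned shift, unsigned width)
{
    assert(width >= 1 && width < 32 && shift + width <= 32);
    return (reg >> shift) & ((1u << width) - 1u);
}

/* Split a raw status value into its named fields. */
static status_fields decode_status(uint32_t status)
{
    status_fields f;
    f.rsie = get_field(status, STATUS_RSIE_SHIFT, 1);
    f.nmi  = get_field(status, STATUS_NMI_SHIFT, 1);
    f.prs  = get_field(status, STATUS_PRS_SHIFT, 6);
    f.crs  = get_field(status, STATUS_CRS_SHIFT, 6);
    f.il   = get_field(status, STATUS_IL_SHIFT, 6);
    f.ih   = get_field(status, STATUS_IH_SHIFT, 1);
    f.eh   = get_field(status, STATUS_EH_SHIFT, 1);
    f.u    = get_field(status, STATUS_U_SHIFT, 1);
    f.pie  = get_field(status, STATUS_PIE_SHIFT, 1);
    return f;
}

/* IL comparison only: a maskable interrupt passes this check when its
 * requested level is greater than status.IL. Other conditions (PIE, RSIE)
 * are not modeled here. */
static int il_permits(uint32_t status, unsigned requested_level)
{
    return requested_level > get_field(status, STATUS_IL_SHIFT, 6);
}
```

For example, `decode_status(0x00C10423)` yields RSIE = 1, NMI = 1, PRS = 1, CRS = 1, IL = 2, U = 1, and PIE = 1 under the assumed layout.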
3.4.2.2. The estatus Register
Bit Fields | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 |
Reserved | RSIE | NMI | PRS | ||||||||||||
15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
CRS | IL | IH | EH | U | PIE |
All fields in the estatus register have read/write access. All fields reset to 0.
When the processor takes an interrupt, if status.EH is zero (that is, the MMU is in nonexception mode), the processor copies the contents of the status register to estatus.
For details about the sstatus register, refer to The sstatus Register section.
The exception handler can examine estatus to determine the pre-exception status of the processor. When returning from an exception, the eret instruction restores the pre-exception value of status. The instruction restores the pre-exception value by copying either estatus or sstatus back to status, depending on the value of status.CRS.
Refer to the Exception Processing section for more information.
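The save and restore behavior described above can be modeled in a few lines. This is a simplified sketch under stated assumptions: it models only the register copies, it assumes that status.CRS = 0 (the normal register set) selects estatus on return and any other value selects sstatus, and it assumes EH at bit 2 and CRS at bits 15..10. The type `cpu_model` and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STATUS_EH       (1u << 2)      /* assumed position of status.EH  */
#define STATUS_CRS_MASK (0x3Fu << 10)  /* assumed position of status.CRS */

typedef struct {
    uint32_t status;
    uint32_t estatus;
    uint32_t sstatus;  /* sstatus of the register set selected by CRS */
} cpu_model;

/* Exception entry: status is copied to estatus only when status.EH is 0. */
static void enter_exception(cpu_model *cpu)
{
    assert(cpu != NULL);
    if ((cpu->status & STATUS_EH) == 0)
        cpu->estatus = cpu->status;
}

/* Return from exception: restore status from estatus or sstatus.
 * Assumption: CRS == 0 selects estatus, any other CRS selects sstatus. */
static void exception_return(cpu_model *cpu)
{
    assert(cpu != NULL);
    if ((cpu->status & STATUS_CRS_MASK) == 0)
        cpu->status = cpu->estatus;
    else
        cpu->status = cpu->sstatus;
}
```

The model deliberately leaves out everything else that happens on exception entry, such as changes to PIE, EH, or the program counter.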
3.4.2.3. The bstatus Register
Bit Fields | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 |
Reserved | RSIE | NMI | PRS | ||||||||||||
15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
CRS | IL | IH | EH | U | PIE |
All fields in the bstatus register have read/write access. All fields reset to 0.
The Status Control Register Field Description table describes the details of the fields defined in the bstatus register.
When a break occurs, the value of the status register is copied into bstatus. Using bstatus, the debugger can restore the status register to the value prior to the break. The bret instruction causes the processor to copy bstatus back to status. Refer to the Processing a Break section for more information.
3.4.2.4. The ienable Register
3.4.2.5. The ipending Register
The value of the ipending register indicates the value of the enabled interrupt signals driven into the processor. A value of one in bit n means that the corresponding irq n input is asserted and enabled in the ienable register. Writing a value to the ipending register has no effect.
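The relationship between the irq inputs, ienable, and ipending is a bitwise AND. The following is a minimal host-side sketch of that relationship; the function names are hypothetical, and on hardware the value is read with `rdctl` rather than computed.

```c
#include <assert.h>
#include <stdint.h>

/* Bit n of the result is 1 only when irq n is asserted AND enabled. */
static uint32_t ipending_value(uint32_t irq_inputs, uint32_t ienable)
{
    return irq_inputs & ienable;
}

/* Test whether interrupt n is pending in a given ipending value. */
static int irq_is_pending(uint32_t ipending, unsigned n)
{
    assert(n < 32);
    return (int)((ipending >> n) & 1u);
}
```

With irq 1 and irq 3 asserted (`0xA`) and irq 1 and irq 2 enabled (`0x6`), only bit 1 is set in the result.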
3.4.2.6. The cpuid Register
3.4.2.7. The exception Register
When the extra exception information option is enabled, the Nios II processor provides information useful to system software for exception processing in the exception and badaddr registers when an exception occurs. When your system contains an MMU or MPU, the extra exception information is always enabled. When no MMU or MPU is present, the Processor parameter editor gives you the option to have the processor provide the extra exception information.
For information about controlling the extra exception information option, refer to the Instantiating the Processor chapter of this document.
Bit Fields | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ECCFTL | 31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | |
Reserved | ||||||||||||||||
16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
Reserved | Cause | Rsvd |
Field | Description | Access | Reset | Available |
---|---|---|---|---|
ECCFTL | The processor writes to ECCFTL when it detects a potentially fatal ECC error. When ECCFTL = 1, the processor detects an ECC register file error. When ECCFTL = 0, another ECC exception occurred. | Read | 0 | Only with ECC |
CAUSE | CAUSE is written by the processor when certain exceptions occur. CAUSE contains a code for the highest-priority exception occurring at the time. The Cause column in the Nios II Exceptions (In Decreasing Priority Order) table lists the CAUSE field value for each exception. CAUSE is not written on a break or an external interrupt. | Read | 0 | Only with extra exception information |
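An exception handler typically extracts the CAUSE code before dispatching. The sketch below assumes, from the bit-field diagram, that CAUSE occupies bits 6..2 and ECCFTL is bit 31; the function names are hypothetical. The cause codes themselves are listed in the exceptions table and are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

#define EXCEPTION_CAUSE_SHIFT 2u          /* assumed: CAUSE in bits 6..2 */
#define EXCEPTION_CAUSE_MASK  0x1Fu       /* 5-bit field                 */
#define EXCEPTION_ECCFTL      (1u << 31)  /* assumed: ECCFTL in bit 31   */

/* Return the CAUSE code held in a raw exception register value. */
static unsigned exception_cause(uint32_t exception_reg)
{
    return (exception_reg >> EXCEPTION_CAUSE_SHIFT) & EXCEPTION_CAUSE_MASK;
}

/* Return 1 when ECCFTL reports an ECC register file error. */
static int exception_eccftl(uint32_t exception_reg)
{
    return (exception_reg & EXCEPTION_ECCFTL) != 0;
}

/* Sanity check: a cause code always fits in 5 bits. */
static void check_cause_range(uint32_t exception_reg)
{
    assert(exception_cause(exception_reg) <= 31u);
}
```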
3.4.2.8. The pteaddr Register
Bit Fields | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 |
PTBASE | |||||||||||||||
15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
VPN | Rsvd |
Field | Description | Access | Reset | Available |
---|---|---|---|---|
PTBASE | PTBASE is the base virtual address of the page table. | Read/Write | 0 | Only with MMU |
VPN | VPN is the virtual page number. VPN can be set by both hardware and software. | Read/Write | 0 | Only with MMU |
Software writes to the PTBASE field when switching processes. Hardware never writes to the PTBASE field.
Software writes to the VPN field when writing a TLB entry. Hardware writes to the VPN field on a fast TLB miss exception, a TLB permission violation exception, or on a TLB read operation. The VPN field is not written on any exceptions taken when an exception is already active, that is, when status.EH is already one.
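The two pteaddr fields can be packed and unpacked as follows. This sketch assumes, from the bit-field diagram, a 10-bit PTBASE in bits 31..22 and a 20-bit VPN in bits 21..2, and it assumes 4 KB pages so that the VPN is the virtual address shifted right by 12. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PTEADDR_VPN_SHIFT    2u   /* assumed: VPN in bits 21..2     */
#define PTEADDR_VPN_BITS     20u
#define PTEADDR_PTBASE_SHIFT 22u  /* assumed: PTBASE in bits 31..22 */
#define PTEADDR_PTBASE_BITS  10u

/* Virtual page number of a virtual address, assuming 4 KB pages. */
static uint32_t vpn_from_vaddr(uint32_t vaddr)
{
    return vaddr >> 12;
}

/* Build a pteaddr value from its PTBASE and VPN field values. */
static uint32_t make_pteaddr(uint32_t ptbase, uint32_t vpn)
{
    assert(ptbase < (1u << PTEADDR_PTBASE_BITS));
    assert(vpn < (1u << PTEADDR_VPN_BITS));
    return (ptbase << PTEADDR_PTBASE_SHIFT) | (vpn << PTEADDR_VPN_SHIFT);
}

/* Extract the VPN field, for example after a fast TLB miss. */
static uint32_t pteaddr_vpn(uint32_t pteaddr)
{
    return (pteaddr >> PTEADDR_VPN_SHIFT) & ((1u << PTEADDR_VPN_BITS) - 1u);
}
```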
3.4.2.9. The tlbacc Register
Bit Fields | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 |
IG | C | R | W | X | G | PFN | |||||||||
15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
PFN |
Issuing a wrctl instruction to the tlbacc register writes the tlbacc register with the specified value. If tlbmisc.WE = 1, the wrctl instruction also initiates a TLB write operation, which writes a TLB entry. The TLB entry written is specified by the line portion of pteaddr.VPN and the tlbmisc.WAY field. The value written is specified by the value written into tlbacc along with the values of pteaddr.VPN and tlbmisc.PID. A TLB write operation also increments tlbmisc.WAY, allowing software to quickly modify TLB entries.
Issuing a rdctl instruction to the tlbacc register returns the value of the tlbacc register. The tlbacc register is written by hardware when software triggers a TLB read operation (that is, when wrctl sets tlbmisc.RD to one).
Field | Description | Access | Reset | Available |
---|---|---|---|---|
IG | IG is ignored by hardware and available to hold operating system specific information. Read as zero but can be written as nonzero. | Read/Write | 0 | Only with MMU |
C | C is the data cacheable flag. When C = 0, data accesses are uncacheable. When C = 1, data accesses are cacheable. | Read/Write | 0 | Only with MMU |
R | R is the readable flag. When R = 0, load instructions are not allowed to access memory. When R = 1, load instructions are allowed to access memory. | Read/Write | 0 | Only with MMU |
W | W is the writable flag. When W = 0, store instructions are not allowed to access memory. When W = 1, store instructions are allowed to access memory. | Read/Write | 0 | Only with MMU |
X | X is the executable flag. When X = 0, instructions are not allowed to execute. When X = 1, instructions are allowed to execute. | Read/Write | 0 | Only with MMU |
G | G is the global flag. When G = 0, tlbmisc.PID is included in the TLB lookup. When G = 1, tlbmisc.PID is ignored and only the virtual page number is used in the TLB lookup. | Read/Write | 0 | Only with MMU |
PFN | PFN is the physical frame number field. All unused upper bits must be zero. | Read/Write | 0 | Only with MMU |
The tlbacc register format is the recommended format for entries in the operating system page table. The IG bits are ignored by the hardware on wrctl to tlbacc and read back as zero on rdctl from tlbacc. The operating system can use the IG bits to hold operating system specific information without having to clear these bits to zero on a TLB write operation.
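Because tlbacc doubles as the recommended page table entry format, an operating system usually has helpers to build and pick apart such entries. The following is a minimal sketch, assuming the layout suggested by the bit-field diagram: IG in bits 31..25, C, R, W, X, and G in bits 24..20, and a 20-bit PFN in bits 19..0. The macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed positions, read off the tlbacc bit-field diagram. */
#define TLBACC_PFN_MASK 0x000FFFFFu  /* bits 19..0  */
#define TLBACC_G        (1u << 20)
#define TLBACC_X        (1u << 21)
#define TLBACC_W        (1u << 22)
#define TLBACC_R        (1u << 23)
#define TLBACC_C        (1u << 24)
#define TLBACC_IG_MASK  0xFE000000u  /* bits 31..25 */

/* Build a tlbacc value (or page table entry) from a PFN and flag bits. */
static uint32_t make_tlbacc(uint32_t pfn, uint32_t flags)
{
    assert((pfn & ~TLBACC_PFN_MASK) == 0);
    assert((flags & (TLBACC_PFN_MASK | TLBACC_IG_MASK)) == 0);
    return pfn | flags;
}

/* Physical frame number stored in a tlbacc value. */
static uint32_t tlbacc_pfn(uint32_t tlbacc)
{
    return tlbacc & TLBACC_PFN_MASK;
}

/* Model of rdctl from tlbacc: the IG bits read back as zero. */
static uint32_t tlbacc_readback(uint32_t written)
{
    return written & ~TLBACC_IG_MASK;
}
```

A readable, writable, cacheable mapping of frame `0x12345` is `make_tlbacc(0x12345, TLBACC_C | TLBACC_R | TLBACC_W)`.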
3.4.2.10. The tlbmisc Register
Bit Fields | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 |
Reserved | EE | WAY | RD | WE | PID | ||||||||||
15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
PID | DBL | BAD | PERM | D |
Field | Description | Access | Reset | Available |
---|---|---|---|---|
EE | If this field is 1, a software-triggered ECC error (1-, 2-, or 3-bit error) occurred because software initiated a TLB read operation. Only set this field to 1 if CONFIG.ECCEN is 1. | Read/Write | 0 | Only with MMU and ECC |
WAY | The WAY field controls the mapping from the VPN to a particular TLB entry. This field size is variable. Unused upper bits must be written as zero. | Read/Write | 0 | Only with MMU |
RD | RD is the read flag. Setting RD to one triggers a TLB read operation. | Write | 0 | Only with MMU |
WE | WE is the TLB write enable flag. When WE = 1, a write to tlbacc writes through to a TLB entry. | Read/Write | 0 | Only with MMU |
PID | PID is the process identifier field. This field size is variable. Unused upper bits must be written as zero. | Read/Write | 0 | Only with MMU |
DBL | DBL is the double TLB miss exception flag. | Read | 0 | Only with MMU |
BAD | BAD is the bad virtual address exception flag. | Read | 0 | Only with MMU |
PERM | PERM is the TLB permission violation exception flag. | Read | 0 | Only with MMU |
D | D is the data access exception flag. When D = 1, the exception is a data access exception. When D = 0, the exception is an instruction access exception. | Read | 0 | Only with MMU |
For the DBL, BAD, and PERM fields, you can also use exception.CAUSE to identify the corresponding exceptions.
The following sections provide more information about the tlbmisc fields.
3.4.2.10.1. The RD Flag
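Setting the RD flag to one triggers a TLB read operation, which loads the following values from the selected TLB entry:
- The tag portion of pteaddr.VPN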
- tlbmisc.PID
- The tlbacc register
The TLB entry to be read is specified by the following values:
- the line portion of pteaddr.VPN
- tlbmisc.WAY
When system software changes the fields that specify the TLB entry, there is no immediate effect on pteaddr.VPN, tlbmisc.PID, or the tlbacc register. The registers retain their previous values until the next TLB read operation is initiated. For example, when the operating system sets pteaddr.VPN to a new value, the contents of tlbacc continues to reflect the previous TLB entry. tlbacc does not contain the new TLB entry until after an explicit TLB read.
3.4.2.10.2. The WE Flag
Hardware sets the WE flag to one on a TLB permission violation exception, and on a TLB miss exception when status.EH = 0. When a TLB write operation writes the tlbacc register, the write operation also writes to a TLB entry when WE = 1.
3.4.2.10.3. The WAY Field
3.4.2.10.4. The PID Field
tlbmisc.PID contains the PID field from a TLB tag. The operating system must set the PID field when switching processes, and before each TLB write operation.
The MMU sets tlbmisc.PID on a TLB read operation. When the software triggers a TLB read, by setting tlbmisc.RD to one with the wrctl instruction, the PID value read from the TLB has priority over the value written by the wrctl instruction.
The size of the PID field is configured at system generation, and can be from 8 to 14 bits. If system software defines a process identifier smaller than the PID field, unused upper bits must be written as zero.
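Before a TLB write, system software composes a tlbmisc value carrying the PID, the WAY, and the WE flag. The sketch below shows one way to do that on the host. It assumes, from the bit-field diagram, PID starting at bit 4, WE at bit 18, RD at bit 19, and a WAY field of up to 4 bits starting at bit 20; `pid_bits` stands for the PID width chosen at system generation. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TLBMISC_PID_SHIFT 4u          /* assumed: PID starts at bit 4  */
#define TLBMISC_WE        (1u << 18)  /* assumed: WE in bit 18         */
#define TLBMISC_RD        (1u << 19)  /* assumed: RD in bit 19         */
#define TLBMISC_WAY_SHIFT 20u         /* assumed: WAY starts at bit 20 */

/* Compose a tlbmisc value that enables TLB writes for (pid, way).
 * Unused upper PID bits stay zero because pid must fit in pid_bits. */
static uint32_t tlbmisc_for_write(uint32_t pid, unsigned way, unsigned pid_bits)
{
    assert(pid_bits >= 8 && pid_bits <= 14);
    assert(pid < (1u << pid_bits));
    assert(way < 16);
    return TLBMISC_WE
         | ((uint32_t)way << TLBMISC_WAY_SHIFT)
         | (pid << TLBMISC_PID_SHIFT);
}

/* Extract the PID from a tlbmisc value, for example after a TLB read. */
static uint32_t tlbmisc_pid(uint32_t tlbmisc, unsigned pid_bits)
{
    assert(pid_bits >= 8 && pid_bits <= 14);
    return (tlbmisc >> TLBMISC_PID_SHIFT) & ((1u << pid_bits) - 1u);
}
```

The resulting value would be written with `wrctl` to tlbmisc before the `wrctl` to tlbacc that performs the actual TLB write.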
3.4.2.10.5. The DBL Flag
The DBL flag indicates whether the most recent exception is a double TLB miss condition. When a general exception occurs, the MMU sets DBL to one if a double TLB miss is detected, and clears DBL to zero otherwise.
3.4.2.10.6. The BAD Flag
The following exceptions set the BAD flag to one:
- Supervisor-only instruction address
- Supervisor-only data address
- Misaligned data address
- Misaligned destination address
Refer to the Nios II Exceptions (In Decreasing Priority Order) table in the "Exception Overview" section for more information on these exceptions.
3.4.2.10.7. The PERM Flag
3.4.2.10.8. The D Flag
The following exceptions set the D flag to one:
- Fast TLB miss (data)
- Double TLB miss (data)
- TLB permission violation (read or write)
- Misaligned data address
- Supervisor-only data address
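The four exception flags can be decoded together when a handler classifies an MMU exception. This is a minimal sketch that assumes, from the bit-field diagram, D in bit 0, PERM in bit 1, BAD in bit 2, and DBL in bit 3. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TLBMISC_D    (1u << 0)  /* assumed bit positions from the diagram */
#define TLBMISC_PERM (1u << 1)
#define TLBMISC_BAD  (1u << 2)
#define TLBMISC_DBL  (1u << 3)

typedef struct {
    int dbl;   /* double TLB miss                   */
    int bad;   /* bad virtual address               */
    int perm;  /* TLB permission violation          */
    int data;  /* 1 = data access, 0 = instruction  */
} tlb_exception_flags;

/* Decode the read-only exception flags from a raw tlbmisc value. */
static tlb_exception_flags decode_tlbmisc_flags(uint32_t tlbmisc)
{
    tlb_exception_flags f;
    f.dbl  = (tlbmisc & TLBMISC_DBL)  != 0;
    f.bad  = (tlbmisc & TLBMISC_BAD)  != 0;
    f.perm = (tlbmisc & TLBMISC_PERM) != 0;
    f.data = (tlbmisc & TLBMISC_D)    != 0;
    return f;
}

/* Convenience check: at least one of DBL, BAD, or PERM is set. */
static int tlbmisc_any_fault(uint32_t tlbmisc)
{
    tlb_exception_flags f = decode_tlbmisc_flags(tlbmisc);
    assert(f.dbl == 0 || f.dbl == 1);
    return f.dbl || f.bad || f.perm;
}
```

Under these assumptions, a tlbmisc value with low bits `0x9` describes a double TLB miss on a data access.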
3.4.2.11. The badaddr Register
When the extra exception information option is enabled, the processor provides information useful to system software for exception processing in the exception and badaddr registers when an exception occurs. When your system contains an MMU or MPU, the extra exception information is always enabled. When no MMU or MPU is present, the Processor parameter editor gives you the option to have the processor provide the extra exception information.
For information about controlling the extra exception information option, refer to the Instantiating the Processor chapter of this document.
When the option for extra exception information is enabled and a processor exception occurs, the badaddr register contains the byte instruction or data address associated with certain exceptions at the time the exception occurred. The Exceptions Table lists which exceptions write the badaddr register along with the value written.
Bit Fields | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 |
BADDR | |||||||||||||||
15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
BADDR |
Field | Description | Access | Reset | Available |
---|---|---|---|---|
BADDR | BADDR contains the byte instruction address or data address associated with an exception when certain exceptions occur. The Address column of the Exceptions Table lists which exceptions write the BADDR field. | Read | 0 | Only with extra exception information |
The BADDR field allows up to a 32-bit instruction address or data address. If an MMU or MPU is present, the BADDR field is 32 bits because MMU and MPU instruction and data addresses are always full 32-bit values. When an MMU is present, the BADDR field contains the virtual address.
If there is no MMU or MPU and the Nios II address space is less than 32 bits, unused high-order bits are written and read as zero. If there is no MMU, bit 31 of a data address (used to bypass the data cache) is always zero in the BADDR field.
3.4.2.12. The config Register
The config register configures Nios II runtime behaviors that do not need to be preserved during exception processing (in contrast to the information in the status register).
Bit Fields | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 |
Reserved | |||||||||||||||
15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
Reserved | ECCEXE | ECCEN | ANI | PE |
Field | Description | Access | Reset | Available |
---|---|---|---|---|
ANI | ANI is the automatic nested interrupt mode bit. If ANI is set to zero, the processor clears status.PIE on each interrupt, disabling fast nested interrupts. If ANI is set to one, the processor keeps status.PIE set to one at the time of an interrupt, enabling fast nested interrupts. If the EIC interface and shadow register sets are not implemented in the Nios II core, ANI always reads as zero, disabling fast nested interrupts. | Read/Write | 0 | Only with the EIC interface and shadow register sets |
ECCEXE | ECCEXE is the ECC error exception enable bit. When ECCEXE = 1, the processor generates ECC error exceptions. | Read/Write | 0 | Only with ECC |
ECCEN | ECCEN is the ECC enable bit. When ECCEN = 0, the processor ignores all ECC errors. When ECCEN = 1, the processor recovers all recoverable ECC errors. | Read/Write | 0 | Only with ECC |
PE | PE is the memory protection enable bit. When PE = 1, the MPU is enabled. When PE = 0, the MPU is disabled. In systems without an MPU, PE is always zero. | Read/Write | 0 | Only with MPU |
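Because config holds several independent enable bits, software normally updates it with a read-modify-write. The sketch below shows the bit manipulation only, assuming from the bit-field diagram that PE is bit 0, ANI is bit 1, ECCEN is bit 2, and ECCEXE is bit 3. The function names are hypothetical, and the surrounding `rdctl`/`wrctl` access is left out.

```c
#include <assert.h>
#include <stdint.h>

#define CONFIG_PE     (1u << 0)  /* assumed bit positions from the diagram */
#define CONFIG_ANI    (1u << 1)
#define CONFIG_ECCEN  (1u << 2)
#define CONFIG_ECCEXE (1u << 3)

/* Enable ECC recovery and ECC error exceptions, keeping other bits. */
static uint32_t config_enable_ecc(uint32_t config)
{
    return config | CONFIG_ECCEN | CONFIG_ECCEXE;
}

/* Turn the MPU on or off without disturbing the other bits. */
static uint32_t config_set_mpu(uint32_t config, int enable)
{
    assert(enable == 0 || enable == 1);
    return enable ? (config | CONFIG_PE) : (config & ~CONFIG_PE);
}
```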
3.4.2.13. The mpubase Register
The mpubase register works in conjunction with the mpuacc register to set and retrieve MPU region information and is only available in systems with an MPU.
Field | Description | Access | Reset | Available |
---|---|---|---|---|
BASE | BASE is the base memory address of the region identified by the INDEX and D fields. | Read/Write | 0 | Only with MPU |
INDEX | INDEX is the region index number. | Read/Write | 0 | Only with MPU |
D | D is the region access bit. When D = 1, INDEX refers to a data region. When D = 0, INDEX refers to an instruction region. | Read/Write | 0 | Only with MPU |
The BASE field specifies the base address of an MPU region. The 24-bit BASE field corresponds to bits 8 through 31 of the base address, making the base address always a multiple of 256 bytes. If the minimum region size (set at generation time) is larger than 256 bytes, unused low-order bits of the BASE field must be written as zero and are read as zero. For example, if the minimum region size is 1024 bytes, the two least-significant bits of the BASE field (bits 8 through 9 of the mpubase register) must be zero. Similarly, if the Nios II address space is less than 31 bits, unused high-order bits must also be written as zero and are read as zero.
The INDEX and D fields specify the region information to access when an MPU region read or write operation is performed. The D field specifies whether the region is a data region or an instruction region. The INDEX field specifies which of the 32 data or instruction regions to access. If there are fewer than 32 instruction or 32 data regions, unused high-order bits must be written as zero and are read as zero.
Refer to the MPU Region Read and Write Operations section for more information on MPU region read and write operations.
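The alignment rule for BASE can be checked before a region is programmed. This is a hedged sketch for the 24-bit BASE variant described above (base address bits 8 through 31); the function names are hypothetical and `min_region_size` stands for the generation-time minimum region size.

```c
#include <assert.h>
#include <stdint.h>

static int is_pow2(uint32_t x)
{
    return x != 0 && (x & (x - 1u)) == 0;
}

/* A base address is usable only if it is a multiple of the minimum
 * region size, so that the unused low-order BASE bits are zero. */
static int mpu_base_ok(uint32_t base_addr, uint32_t min_region_size)
{
    assert(is_pow2(min_region_size) && min_region_size >= 256u);
    return (base_addr & (min_region_size - 1u)) == 0;
}

/* BASE field value: bits 8 through 31 of the base address. */
static uint32_t mpu_base_field(uint32_t base_addr)
{
    return base_addr >> 8;
}
```

With a 1024-byte minimum region size, `0x4400` is accepted and its BASE value `0x44` has its two least-significant bits clear, while `0x4100` is rejected.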
3.4.2.14. The mpuacc Register
The mpuacc register works in conjunction with the mpubase register to set and retrieve MPU region information and is only available in systems with an MPU. The mpuacc register holds attributes that define an MPU region: either values that software is about to write or values that a region read operation has retrieved. It holds only a portion of the attributes that define an MPU region. The remaining portion of the region definition is held by the BASE field of the mpubase register.
A generation-time option controls whether the mpuacc register contains a MASK or LIMIT field.
Bit Fields | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 |
MASK8 | |||||||||||||||
15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
MASK8 | C | PERM | RD | WR |
Bit Fields | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 |
LIMIT8 | |||||||||||||||
15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
LIMIT8 | C | PERM | RD | WR |
Field | Description | Access | Reset | Available |
---|---|---|---|---|
MASK | MASK specifies the size of the region. | Read/Write | 0 | Only with MPU |
LIMIT | LIMIT specifies the upper address limit of the region. | Read/Write | 0 | Only with MPU |
C | C is the data cacheable flag. C only applies to MPU data regions and determines the default cacheability of a data region. When C = 0, the data region is uncacheable. When C = 1, the data region is cacheable. | Read/Write | 0 | Only with MPU |
PERM | PERM specifies the access permissions for the region. | Read/Write | 0 | Only with MPU |
RD | RD is the read region flag. When RD = 1, wrctl instructions to the mpuacc register perform a read operation. | Write | 0 | Only with MPU |
WR | WR is the write region flag. When WR = 1, wrctl instructions to the mpuacc register perform a write operation. | Write | 0 | Only with MPU |
The MASK and LIMIT fields are mutually exclusive. Refer to Table 35 and Table 36.
Bit Fields | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 |
MASK[n-1:p]8 | |||||||||||||||
15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
MASK[n-1:p]8 | 0 | MT | PERM | RD | WR |
Bit Fields | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 |
LIMIT[n:p]8 | |||||||||||||||
15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
LIMIT[n:p]8 | 0 | MT | PERM | RD | WR |
Field | Description | Access | Reset | Available |
---|---|---|---|---|
MASK | MASK specifies the size of the region. | Read/Write | 0 | Only with MPU |
LIMIT | LIMIT specifies the upper address limit of the region. | Read/Write | 0 | Only with MPU |
MT | (MT) Memory Type: | Read/Write | 0 | Only with MPU |
PERM | PERM specifies the access permissions for the region. | Read/Write | 0 | Only with MPU |
RD | RD is the read region flag. When RD = 1, wrctl instructions to the mpuacc register perform a read operation. | Write | 0 | Only with MPU |
WR | WR is the write region flag. When WR = 1, wrctl instructions to the mpuacc register perform a write operation. | Write | 0 | Only with MPU |
The MASK and LIMIT fields are mutually exclusive. Refer to Table 38 and Table 39.
3.4.2.14.1. The MASK Field
When the amount of memory reserved for a region is defined by size, the MASK field specifies the size of the memory region. The MASK field is the same number of bits as the BASE field of the mpubase register.
The MASK Region Size Encodings table lists the MASK field encodings for all possible region sizes in a full 31-bit byte address space.
MASK Encoding | Region Size |
---|---|
0x1FFFFFF | 64 bytes |
0x1FFFFFE | 128 bytes |
0x1FFFFFC | 256 bytes |
0x1FFFFF8 | 512 bytes |
0x1FFFFF0 | 1 KB |
0x1FFFFE0 | 2 KB |
0x1FFFFC0 | 4 KB |
0x1FFFF80 | 8 KB |
0x1FFFF00 | 16 KB |
0x1FFFE00 | 32 KB |
0x1FFFC00 | 64 KB |
0x1FFF800 | 128 KB |
0x1FFF000 | 256 KB |
0x1FFE000 | 512 KB |
0x1FFC000 | 1 MB |
0x1FF8000 | 2 MB |
0x1FF0000 | 4 MB |
0x1FE0000 | 8 MB |
0x1FC0000 | 16 MB |
0x1F80000 | 32 MB |
0x1F00000 | 64 MB |
0x1E00000 | 128 MB |
0x1C00000 | 256 MB |
0x1800000 | 512 MB |
0x1000000 | 1 GB |
0x0000000 | 2 GB |
MASK Encoding | Region Size |
---|---|
0xFFFFFF | 256 bytes |
0xFFFFFE | 512 bytes |
0xFFFFFC | 1 KB |
0xFFFFF8 | 2 KB |
0xFFFFF0 | 4 KB |
0xFFFFE0 | 8 KB |
0xFFFFC0 | 16 KB |
0xFFFF80 | 32 KB |
0xFFFF00 | 64 KB |
0xFFFE00 | 128 KB |
0xFFFC00 | 256 KB |
0xFFF800 | 512 KB |
0xFFF000 | 1 MB |
0xFFE000 | 2 MB |
0xFFC000 | 4 MB |
0xFF8000 | 8 MB |
0xFF0000 | 16 MB |
0xFE0000 | 32 MB |
0xFC0000 | 64 MB |
0xF80000 | 128 MB |
0xF00000 | 256 MB |
0xE00000 | 512 MB |
0xC00000 | 1 GB |
0x800000 | 2 GB |
0x000000 | 4 GB |
The MASK field contains the following value, where region_size is in bytes:
MASK = 0x1FFFFFF << log2(region_size >> 6)
The MASK field contains the following value, where region_size is in bytes:
MASK = 0xFFFFFF << log2(region_size >> 8)
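Both formulas can be checked against the encoding tables with a short host-side program. The sketch below assumes that the shifted value is truncated to the field width (25 bits for the 64-byte variant, 24 bits for the 256-byte variant), which is what the tables imply. The function names are hypothetical. For a power-of-two size, log2(region_size >> 6) equals log2(region_size) - 6, which is the form used here.

```c
#include <assert.h>
#include <stdint.h>

/* log2 of a power of two. */
static unsigned log2_pow2(uint64_t x)
{
    assert(x != 0 && (x & (x - 1)) == 0);
    unsigned n = 0;
    while (x > 1) {
        x >>= 1;
        n++;
    }
    return n;
}

/* 25-bit MASK, 64-byte minimum region:
 * MASK = 0x1FFFFFF << log2(region_size >> 6), truncated to 25 bits. */
static uint32_t mpu_mask_64b(uint64_t region_size)
{
    assert(region_size >= 64 && region_size <= (UINT64_C(1) << 31));
    unsigned shift = log2_pow2(region_size) - 6;
    return (uint32_t)((UINT64_C(0x1FFFFFF) << shift) & UINT64_C(0x1FFFFFF));
}

/* 24-bit MASK, 256-byte minimum region:
 * MASK = 0xFFFFFF << log2(region_size >> 8), truncated to 24 bits. */
static uint32_t mpu_mask_256b(uint64_t region_size)
{
    assert(region_size >= 256 && region_size <= (UINT64_C(1) << 32));
    unsigned shift = log2_pow2(region_size) - 8;
    return (uint32_t)((UINT64_C(0xFFFFFF) << shift) & UINT64_C(0xFFFFFF));
}
```

For a 1 MB region the two functions return `0x1FFC000` and `0xFFF000`, matching the corresponding table rows.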
3.4.2.14.2. The LIMIT Field
When the amount of memory reserved for a region is defined by an upper address limit, the LIMIT field specifies the upper address of the memory region plus one. For example, to achieve a memory range for byte addresses 0x4000 to 0x4fff with a 256 byte minimum region size, the BASE field of the mpubase register is set to 0x40 (0x4000 >> 8) and the LIMIT field is set to 0x50 (0x5000 >> 8). Because the LIMIT field is one more bit than the number of bits of the BASE field of the mpubase register, bit 31 of the mpuacc register is available to the LIMIT field.
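The worked example above can be reproduced in code. This is a minimal sketch assuming the 256-byte minimum region size used in the example; the function names are hypothetical. The limit is computed in 64 bits because a region ending at `0xFFFFFFFF` needs the extra bit that LIMIT has over BASE.

```c
#include <assert.h>
#include <stdint.h>

/* BASE field for a region starting at first_byte (256-byte granularity). */
static uint32_t mpu_base_field(uint32_t first_byte)
{
    assert((first_byte & 0xFFu) == 0);
    return first_byte >> 8;
}

/* LIMIT field for a region whose last byte is last_byte:
 * (upper address + 1) >> 8. */
static uint32_t mpu_limit_field(uint32_t last_byte)
{
    uint64_t end = (uint64_t)last_byte + 1u;
    assert((end & 0xFFu) == 0);
    return (uint32_t)(end >> 8);
}
```

For the range `0x4000` to `0x4fff`, the functions return `0x40` and `0x50`.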
3.4.2.14.3. The C Flag
The C flag determines the default data cacheability of an MPU region. The C flag only applies to data regions. For instruction regions, the C bit must be written with 0 and is always read as 0.
When data cacheability is enabled on a data region, a data access to that region can be cached, if a data cache is present in the system. You can override the default cacheability and force an address to noncacheable with an ldwio or stwio instruction.
3.4.2.14.4. The MT Flag
The MT flag determines the default memory type of an MPU data region. The MT flag only applies to data regions. For instruction regions, the MT bit must be written with 0 and is always read as 0.
When data cacheability is enabled on a data region, a data access to that region can be cached, if a data cache is present in the system. You can override the default cacheability and force an address to noncacheable with an ldwio or stwio instruction. The encoding of the MT field is set up to be backward-compatible with the Classic core MPU, where bit 5 of MPUACC contains the cacheable bit (0 = non-cacheable, 1 = cacheable) and bit 6 is zero.
3.4.2.14.5. The PERM Field
Value | Supervisor Permissions | User Permissions |
---|---|---|
0 | None | None |
1 | Execute | None |
2 | Execute | Execute |
Value | Supervisor Permissions | User Permissions |
---|---|---|
0 | None | None |
1 | Read | None |
2 | Read | Read |
4 | Read/Write | None |
5 | Read/Write | Read |
6 | Read/Write | Read/Write |
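The PERM encodings are sparse, so a decoder should reject values that the tables do not list. The sketch below covers the second table only, on the assumption that the table with Read and Read/Write permissions applies to data regions. The enum and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ACCESS_NONE, ACCESS_READ, ACCESS_READ_WRITE } mpu_access;

/* Decode a data-region PERM value into supervisor and user permissions.
 * Returns 1 on success, 0 if the encoding is not listed in the table. */
static int decode_data_perm(unsigned perm, mpu_access *supervisor, mpu_access *user)
{
    assert(supervisor != NULL && user != NULL);
    switch (perm) {
    case 0: *supervisor = ACCESS_NONE;       *user = ACCESS_NONE;       return 1;
    case 1: *supervisor = ACCESS_READ;       *user = ACCESS_NONE;       return 1;
    case 2: *supervisor = ACCESS_READ;       *user = ACCESS_READ;       return 1;
    case 4: *supervisor = ACCESS_READ_WRITE; *user = ACCESS_NONE;       return 1;
    case 5: *supervisor = ACCESS_READ_WRITE; *user = ACCESS_READ;       return 1;
    case 6: *supervisor = ACCESS_READ_WRITE; *user = ACCESS_READ_WRITE; return 1;
    default: return 0;  /* 3, 7, and larger values are not in the table */
    }
}
```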
3.4.2.14.6. The RD Flag
3.4.2.14.7. The WR Flag
3.4.2.14.8. The eccinj Register
The eccinj register injects 1-bit and 2-bit errors into the processor’s internal RAM blocks that support ECC. Injecting errors allows software to test the ECC error exception handling code. Errors are injected in the data bits, not the parity bits. The eccinj register is only available when ECC is present.
Bit Fields | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 |
Reserved | |||||||||||||||
15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
Reserved | TLB | Reserved | ICDAT | ICTAG | RF |
Software writes 0x1 to a RAM's inject field to inject a 1-bit ECC error, or 0x2 to inject a 2-bit ECC error. Hardware sets the value of the inject field to 0x0 after the error injection has occurred.
Field | Description | Access | Reset | Available |
---|---|---|---|---|
RF | Inject an ECC error in the register file’s RAM. | Read/Write | 0 | Only with ECC |
ICTAG | Inject an ECC error in the instruction cache Tag RAM. | Read/Write | 0 | Only with ECC |
ICDAT | Inject an ECC error in the instruction cache data RAM. | Read/Write | 0 | Only with ECC |
TLB | Inject an ECC error in the MMU TLB RAM. Errors are injected in the tag portion of the VPN field. | Read/Write | 0 | Only with ECC |
Refer to “Working with ECC” for more information about when errors are injected.
Bit Fields | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 |
Reserved | DC WB | DTCM 3 | DTCM 2 | ||||||||||||
15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
DTCM 1 | DTCM 0 | TLB | DC DAT | DC TAG | ICDAT | ICTAG | RF |
Software writes 0x1 to a RAM's inject field to inject a 1-bit ECC error, or 0x2 to inject a 2-bit ECC error. Hardware sets the value of the inject field to 0x0 after the error injection has occurred.
Field | Description | Access | Reset | Available |
---|---|---|---|---|
RF | Inject an ECC error in the register file’s RAM. | Read/Write | 0 | Only with ECC |
ICTAG | Inject an ECC error in the instruction cache Tag RAM. | Read/Write | 0 | Only with ECC |
ICDAT | Inject an ECC error in the instruction cache data RAM. | Read/Write | 0 | Only with ECC |
DCTAG | Inject an ECC error in the data cache tag RAM. | Read/Write | 0 | |
DCDAT | Inject an ECC error in the data cache data RAM. Injection occurs on the next store instruction that writes the data cache or on the next line fill. | Read/Write | 0 | |
TLB | Inject an ECC error in the MMU TLB RAM. Errors are injected in the tag portion of the VPN field. | Read/Write | 0 | Only with ECC |
DTCM0 | Inject an ECC error in DTCM0. Injection occurs on the next store instruction that writes this DTCM. | Read/Write | 0 | |
DTCM1 | Inject an ECC error in DTCM1. Injection occurs on the next store instruction that writes this DTCM. | Read/Write | 0 | |
DTCM2 | Inject an ECC error in DTCM2. Injection occurs on the next store instruction that writes this DTCM. | Read/Write | 0 | |
DTCM3 | Inject an ECC error in DTCM3. Injection occurs on the next store instruction that writes this DTCM. | Read/Write | 0 | |
DC WB | Inject an ECC error in the data cache victim line buffer RAM. Injection occurs on the first word written into the victim buffer RAM when a dirty line is being written back. | Read/Write | 0 | |
Refer to “Working with ECC” for more information about when errors are injected.
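Test code that exercises an ECC exception handler needs to build an injection request and later confirm that hardware has consumed it. The sketch below is illustrative only. It assumes each inject field is 2 bits wide (the text allows the values 0x1 and 0x2) and that RF, ICTAG, and ICDAT start at bits 0, 2, and 4, as the bit-field diagrams suggest. The names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed start bits of three 2-bit inject fields. */
enum {
    ECCINJ_RF_SHIFT    = 0,
    ECCINJ_ICTAG_SHIFT = 2,
    ECCINJ_ICDAT_SHIFT = 4
};

/* Build the value to write to eccinj: 0x1 requests a 1-bit error and
 * 0x2 requests a 2-bit error in the field at field_shift. */
static uint32_t eccinj_request(unsigned field_shift, unsigned error_bits)
{
    assert(field_shift <= 30 && (field_shift % 2) == 0);
    assert(error_bits == 1 || error_bits == 2);
    return (uint32_t)error_bits << field_shift;
}

/* Hardware resets a field to 0x0 once the injection has occurred. */
static int eccinj_done(uint32_t eccinj, unsigned field_shift)
{
    assert(field_shift <= 30);
    return ((eccinj >> field_shift) & 0x3u) == 0;
}
```

A test would write `eccinj_request(ECCINJ_ICTAG_SHIFT, 2)` with `wrctl`, trigger an access to the instruction cache tag RAM, and then poll the register until `eccinj_done` reports that the field has returned to zero.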
3.4.3. Shadow Register Sets
When shadow register sets are implemented, status.CRS indicates the register set currently in use. A Nios II core can have up to 63 shadow register sets. If n is the configured number of shadow register sets, the shadow register sets are numbered from 1 to n. Register set 0 is the normal register set.
A shadow register set behaves precisely the same as the normal register set. The register set currently in use can only be determined by examining status.CRS.
Shadow register sets are typically used in conjunction with the EIC interface. This combination can substantially reduce interrupt latency.
For details of EIC interface usage, refer to the Exception Processing section.
System software can read from and write to any shadow register set by setting status.PRS and using the rdprs and wrprs instructions.
For details of the rdprs and wrprs instructions, refer to the Instruction Set Reference chapter of the Nios II Processor Reference Handbook.
3.4.3.1. The sstatus Register
The sstatus register is physically stored in general-purpose register r30 in each shadow register set. The normal register set does not have an sstatus register, but each shadow register set has a separate sstatus register.
Bits | 31 | 30–24 | 23 | 22 | 21–16 | 15–10 | 9–4 | 3 | 2 | 1 | 0 |
---|---|---|---|---|---|---|---|---|---|---|---|
Field | SRS | Reserved | RSIE | NMI | PRS | CRS | IL | IH | EH | U | PIE |
Bit | Description | Access | Reset | Available |
---|---|---|---|---|
SRS 9 | SRS is the switched register set bit. The processor sets SRS to 1 when an external interrupt occurs, if the interrupt required the processor to switch to a different register set. | Read/Write | Undefined | EIC interface and shadow register sets only |
RSIE | RSIE is the register set interrupt-enable bit. When set to 1, this bit allows the processor to service external interrupts requesting the register set that is currently in use. When set to 0, this bit disallows servicing of such interrupts. | Read/Write | Undefined | 10 |
NMI | NMI is the nonmaskable interrupt mode bit. The processor sets NMI to 1 when it takes a nonmaskable interrupt. | Read/Write | Undefined | 10 |
PRS | 10 | Read/Write | Undefined | 10 |
CRS | 10 | Read/Write | Undefined | 10 |
IL | 10 | Read/Write | Undefined | 10 |
IH | 10 | Read/Write | Undefined | 10 |
EH | 10 | Read/Write | Undefined | 10 |
U | 10 | Read/Write | Undefined | 10 |
PIE | 10 | Read/Write | Undefined | 10 |
The sstatus register is present in the Nios II core if both the EIC interface and shadow register sets are implemented. There is one copy of sstatus for each shadow register set.
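As an illustration of the field layout, the following C sketch unpacks a 32-bit sstatus value into its fields. The bit positions are assumptions taken from the bit-field table above (SRS at bit 31, RSIE at 23, NMI at 22, PRS at 21..16, CRS at 15..10, IL at 9..4, IH, EH, U, and PIE at bits 3..0); the `sstatus_fields` type and `decode_sstatus` function are hypothetical helpers, not part of any vendor API.

```c
#include <stdint.h>

/* Hypothetical container for the decoded sstatus fields. */
typedef struct {
    unsigned srs, rsie, nmi, prs, crs, il, ih, eh, u, pie;
} sstatus_fields;

/* Unpack an sstatus value using the assumed bit positions. */
static sstatus_fields decode_sstatus(uint32_t v)
{
    sstatus_fields f;
    f.srs  = (v >> 31) & 0x1u;   /* switched register set       */
    f.rsie = (v >> 23) & 0x1u;   /* register set interrupt-enable */
    f.nmi  = (v >> 22) & 0x1u;   /* nonmaskable interrupt mode  */
    f.prs  = (v >> 16) & 0x3Fu;  /* previous register set       */
    f.crs  = (v >> 10) & 0x3Fu;  /* current register set        */
    f.il   = (v >> 4)  & 0x3Fu;  /* interrupt level             */
    f.ih   = (v >> 3)  & 0x1u;
    f.eh   = (v >> 2)  & 0x1u;
    f.u    = (v >> 1)  & 0x1u;
    f.pie  = v & 0x1u;
    return f;
}
```

A 6-bit CRS field is consistent with the limit of 63 shadow register sets plus the normal register set.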
When the processor takes an interrupt, if a shadow register set is requested (RRS ≠ 0) and the MMU is not in exception handler mode (status.EH = 0), the processor copies status to sstatus.
For details about RRS, refer to "Requested Register Set".
For details about status.EH, refer to the Processor Status After Taking Exceptions Table.
3.4.3.1.1. Changing Register Sets
- If the processor is currently running in the normal register set, insert the new register set number in estatus.CRS, and execute eret.
- If the processor is currently running in a shadow register set, insert the new register set number in sstatus.CRS, and execute eret.
Before executing eret to change the register set, system software must set individual external interrupt masks correctly to ensure that registers in the shadow register set cannot be corrupted. If an interrupt is assigned to the register set, system software must ensure that one of the following conditions is true:
- The ISR is written to preserve register contents.
- The individual interrupt is disabled. The method for disabling an individual external interrupt is specific to the EIC implementation.
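The value that software places in estatus or sstatus before executing eret can be computed as in the following C sketch. It assumes that CRS occupies bits 15..10 of the saved status value; `with_crs` is a hypothetical helper. The actual register write and the eret instruction require processor-specific code that is not shown here.

```c
#include <stdint.h>

/* Return a copy of a saved status value (estatus or sstatus) with the
 * CRS field replaced by the requested register set number.
 * Assumes CRS is the 6-bit field at bits 15..10. */
static uint32_t with_crs(uint32_t saved_status, unsigned register_set)
{
    const uint32_t crs_mask = 0x3Fu << 10;
    return (saved_status & ~crs_mask) |
           (((uint32_t)register_set & 0x3Fu) << 10);
}
```

For example, `with_crs(saved, 3)` yields the value to write back so that eret resumes execution in register set 3 with all other status fields preserved.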
3.4.3.1.2. Stacks and Shadow Register Sets
Depending on system requirements, the system software can create a dedicated stack for each register set, or share a stack among several register sets. If a stack is shared, the system software must copy the stack pointer each time the register set changes. Use the rdprs instruction to copy the stack register between the current register set and another register set.
3.4.3.2. Initialization with Shadow Register Sets
- After the gp register is initialized in the normal register set, copy it to all shadow register sets, to ensure that all code can correctly address the small data sections.
- Copy the zero register from the normal register set to all shadow register sets, using the wrprs instruction.
3.5. Working with the MPU
3.5.1. MPU Region Read and Write Operations
MPU region read operations retrieve the current values for the attributes of a region. Each MPU region read operation consists of the following actions:
- Execute a wrctl instruction to the mpubase register with the mpubase.INDEX and mpubase.D fields set to identify the MPU region.
- Execute a wrctl instruction to the mpuacc register with the mpuacc.RD field set to one and the mpuacc.WR field cleared to zero. This action loads the mpubase and mpuacc register values.
- Execute a rdctl instruction to the mpubase register to read the loaded mpubase register value.
- Execute a rdctl instruction to the mpuacc register to read the loaded mpuacc register value.
The MPU region read operation retrieves mpubase.BASE, mpuacc.MASK or mpuacc.LIMIT, mpuacc.MT, and mpuacc.PERM values for the MPU region.
MPU region write operations set new values for the attributes of a region. Each MPU region write operation consists of the following actions:
- Execute a wrctl instruction to the mpubase register with the mpubase.INDEX and mpubase.D fields set to identify the MPU region.
- Execute a wrctl instruction to the mpuacc register with the mpuacc.WR field set to one and the mpuacc.RD field cleared to zero.
The MPU region write operation sets the values for mpubase.BASE, mpuacc.MASK or mpuacc.LIMIT, mpuacc.MT, and mpuacc.PERM as the new attributes for the MPU region.
Normally, a wrctl instruction flushes the pipeline to guarantee that any side effects of writing control registers take effect immediately after the wrctl instruction completes execution. However, wrctl instructions to the mpubase and mpuacc control registers do not automatically flush the pipeline. Instead, system software is responsible for flushing the pipeline as needed (either by using a flushp instruction or a wrctl instruction to a register that does flush the pipeline). Because a context switch typically requires reprogramming the MPU regions for the new thread, flushing the pipeline on each wrctl instruction would create unnecessary overhead.
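The read and write sequences can be sketched against a host-side C model of the two control registers. Everything here is an assumption made for illustration: the `mpu_model` type, the `wrctl_*` functions that stand in for real wrctl instructions, and the bit positions chosen for mpubase.D, mpubase.INDEX, mpuacc.WR, and mpuacc.RD. On real hardware, a pipeline flush (for example, flushp) would follow the final write.

```c
#include <stdint.h>

/* Assumed field positions; check the mpubase and mpuacc register
 * descriptions for your core before reusing these values. */
#define MPUBASE_D         (1u << 0)
#define MPUBASE_INDEX(i)  (((uint32_t)(i) & 0x1Fu) << 1)
#define MPUACC_WR         (1u << 0)
#define MPUACC_RD         (1u << 1)

enum { MPU_REGIONS = 32 };              /* covers the 5-bit INDEX above */

typedef struct {
    uint32_t mpubase, mpuacc;           /* software-visible registers */
    uint32_t base[2][MPU_REGIONS];      /* stored values, [D][INDEX]  */
    uint32_t acc[2][MPU_REGIONS];
} mpu_model;

static void wrctl_mpubase(mpu_model *m, uint32_t v) { m->mpubase = v; }

static void wrctl_mpuacc(mpu_model *m, uint32_t v)
{
    unsigned d = m->mpubase & 1u;
    unsigned i = (m->mpubase >> 1) & 0x1Fu;
    if (v & MPUACC_WR) {                /* region write */
        m->base[d][i] = m->mpubase;
        m->acc[d][i]  = v & ~(MPUACC_WR | MPUACC_RD);
    } else if (v & MPUACC_RD) {         /* region read: load both registers */
        m->mpubase = m->base[d][i];
        m->mpuacc  = m->acc[d][i];
    }
}

/* Write sequence: identify the region and base, then commit with WR = 1. */
static void mpu_write_region(mpu_model *m, unsigned index, int is_data,
                             uint32_t base, uint32_t attrs)
{
    wrctl_mpubase(m, base | MPUBASE_INDEX(index) | (is_data ? MPUBASE_D : 0u));
    wrctl_mpuacc(m, (attrs & ~(MPUACC_WR | MPUACC_RD)) | MPUACC_WR);
}

/* Read sequence: identify the region, request a load with RD = 1,
 * then read back both registers (rdctl on real hardware). */
static void mpu_read_region(mpu_model *m, unsigned index, int is_data,
                            uint32_t *base_out, uint32_t *acc_out)
{
    wrctl_mpubase(m, MPUBASE_INDEX(index) | (is_data ? MPUBASE_D : 0u));
    wrctl_mpuacc(m, MPUACC_RD);
    *base_out = m->mpubase;
    *acc_out  = m->mpuacc;
}
```

In this model, a region that has never been written reads back as zero; real hardware behavior for such regions is not modeled.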
3.5.2. MPU Initialization
The MPU is disabled on system reset. Before enabling the MPU, we recommend initializing all MPU regions. Enable the desired instruction and data regions by writing each region’s attributes to the mpubase and mpuacc registers as described in the "MPU Region Read and Write Operations" section of this chapter. You must also disable unused regions. To disable a region when using region size, clear mpuacc.MASK to zero. To disable a region when using limit, set mpubase.BASE to a nonzero value and clear mpuacc.LIMIT to zero.
To perform a context switch, use a wrctl to write a zero to the PE field of the config register to disable the MPU, define all MPU regions from the new thread’s data structure, and then use another wrctl to write a one to config.PE to enable the MPU.
Define each region using the pair of wrctl instructions described in the "MPU Region Read and Write Operations" section of this chapter. Repeat this dual wrctl instruction sequence until all desired regions are defined.
3.5.3. Debugger Access
3.6. Working with ECC
3.6.1. Enabling ECC
The processor executes the INITI instruction on each cache line, which initializes the instruction cache RAM. The RAM does not require special initialization because any detected ECC errors are ignored if the line is invalid; the line is invalid after INITI instructions initialize the tag RAM.
Processor instructions that write to every register (except register 0) initialize the register file RAM blocks. If shadow register sets are present, this step is performed for all registers in each shadow register set using the WRPRS instruction.
Processor instructions that write every TLB RAM location initialize the MMU TLB RAM. This RAM does not require special initialization.
3.6.1.1. Disabling ECC
Disable ECC in software by writing 0 to CONFIG.ECCEN. Software can re-enable ECC without reinitializing the ECC-protected RAMs because the ECC parity bits are written to the RAM blocks even if ECC is disabled.
3.6.2. Handling ECC Errors
Typically, software can recover from an unrecoverable MMU TLB ECC error (2-bit error) because the TLB is a software-managed cache of the operating system page tables stored in main memory (for example, SDRAM). Software can invalidate the TLB entry, return to the instruction that took the ECC error exception, and execute the TLB miss handler code to load a TLB entry from the page tables.
In general, software cannot recover from a register file ECC error (2-bit error) because the correct value of the register is not known. If the exception handler reads a register that has a 2-bit ECC error associated with it, another ECC error occurs and an exception handler loop can result.
Exception handler loops occur when an ECC error exception occurs in the exception handler before it is ready to handle nested exceptions. To minimize the occurrence of exception handler loops, locate the ECC error exception handler code in normal cacheable memory, ensure that all data accesses are to non-cacheable memory, and minimize register reading.
The ECC error signals (ecc_event_bus) provide the EEH signal for external logic to detect a possible exception handler loop and reset the processor.
3.6.3. Injecting ECC Errors
This section describes the code sequence for injecting ECC errors for each ECC-protected RAM, assuming the ECC is enabled and interrupts are disabled for the duration of the code sequence.
3.6.3.1. Instruction Cache Tag RAM
- Ensure all code up to the JMP instruction is in the same instruction cache line or is located in an ITCM.
- Use a FLUSHI instruction to flush an instruction cache line other than the line containing the executing code.
- Use a FLUSHP instruction to flush the pipeline.
- Use a WRCTL instruction to set ECCINJ.ICTAG to INJS or INJD. This setting causes an ECC error to occur on the start of the next line fill.
- Use a JMP instruction to jump to an instruction address in the flushed line.
- The ECC error is injected when writing the tag RAM at the start of the line fill.
- Use a RDCTL instruction to ensure that the value of ECCINJ.ICTAG is NOINJ.
- The ECC error triggers after the target of the JMP instruction.
3.6.3.2. Instruction Cache Data RAM
- Ensure all code up to the JMP instruction is in the same instruction cache line or is located in an ITCM.
- Use a FLUSHI instruction to flush an instruction cache line other than the line containing the executing code.
- Use a FLUSHP instruction to flush the pipeline.
- Use a WRCTL instruction to set ECCINJ.ICDAT to INJS or INJD. This setting causes an ECC error to occur on the start of the next line fill.
- Use a JMP instruction to jump to an instruction address in the flushed line.
- The ECC error is injected when writing the data RAM at the start of the line fill.
- Use a RDCTL instruction to ensure that the value of ECCINJ.ICDAT is NOINJ.
- Execute the target of the JMP instruction twice (first to inject the ECC error and second to be triggered by it).
3.6.3.3. ITCMs
Software running on the processor cannot directly inject an ECC error in an ITCM because the processor only writes ITCMs when correcting ECC errors. To inject an ECC error in an ITCM, the TCM RAM must also be connected to a DTCM master. The provided DTCM error injection mechanism (that is, the ECCINJ register) is used to inject an error in the TCM RAM as follows:
- Use a WRCTL instruction to set ECCINJ so that it injects ECC errors in the DTCM connected to the ITCM.
- Use a STW instruction to write the DTCM.
- Use a RDCTL instruction to ensure the value of the ECCINJ field written by the WRCTL is NOINJ.
- Use a JMP instruction to jump to an instruction address in the ITCM.
- The ECC error should be triggered on the target of the JMP instruction.
3.6.3.4. Register File RAM Blocks
- Use a WRCTL instruction to set ECCINJ.RF to INJS or INJD (as desired).
- Execute any instruction that writes any register except R0.
- Use a RDCTL instruction to ensure that the value of ECCINJ.RF is NOINJ.
- Use an instruction to read the desired register from rA such as OR rd, r0, rx where rx is the register written in the previous step. This action triggers the ECC error.
- Use an instruction to read the desired register from rB such as OR rd, rx, r0 where rx is the register written in the previous step.
3.6.3.5. Data Cache Tag RAM
- Use a LOAD instruction from a data address to get the line in the cache. The line should be clean.
- Use a WRCTL instruction to set ECCINJ.DCTAG to INJS or INJD.
- Use a STORE instruction to a data address mapped to that line. The STORE instruction should hit in the data cache and write the tag RAM to set the dirty bit.
- The ECC error is injected when the tag RAM is written.
- Use a RDCTL instruction to ensure the value of ECCINJ.DCTAG is NOINJ. Before the RDCTL, use a FLUSHP instruction to avoid the RAW hazard on ECCINJ.
- Do another LOAD or STORE instruction to the same line.
- The ECC error should be triggered on this second LOAD/STORE instruction.
3.6.3.6. Data Cache Data RAM (Clean Line)
- Use a FLUSHDA instruction to ensure the line isn’t in the data cache.
- Use a LOAD instruction to load a clean data cache line.
- Use a WRCTL instruction to set ECCINJ.DCDAT field to INJS or INJD.
- Use a LOAD instruction to an address in the data cache line to inject the error.
- Use a RDCTL instruction to ensure the value of the ECCINJ field written by the WRCTL is NOINJ. Before the RDCTL, use a FLUSHP instruction to avoid the RAW hazard on ECCINJ.
- Use a LOAD instruction from the same address.
- The ECC error should be triggered on the LOAD instruction.
3.6.3.7. Data Cache Data RAM (Dirty Line)
- Use a LOAD instruction to load a data cache line.
- Use a WRCTL instruction to set ECCINJ.DCDAT field to INJS or INJD (as desired).
- Use a STORE instruction to an address in the data cache line.
- Use a RDCTL instruction to ensure the value of the ECCINJ field written by the WRCTL is NOINJ. Before the RDCTL, use a FLUSHP instruction to avoid the RAW hazard on ECCINJ.
- Either use a LOAD instruction from the same address or trigger a writeback of the dirty line (for example, with a FLUSHDA instruction).
- The ECC error should be triggered on the LOAD instruction unless it is only detected during the writeback of a dirty line. In the writeback of a dirty line case, the ECC error is triggered an undefined number of instructions later.
3.6.3.8. Data Cache Victim Line Buffer RAM
- Use a LOAD instruction to load a data cache line.
- Use a WRCTL instruction to set ECCINJ.DCWB field to INJS or INJD (as desired).
- Use a STORE instruction to an address in the data cache line.
- Use a RDCTL instruction to ensure the value of the ECCINJ field written by the WRCTL is NOINJ. Before the RDCTL, use a FLUSHP instruction to avoid the RAW hazard on ECCINJ.
- Either use a LOAD instruction from the same address or trigger a writeback of the dirty line (for example, with a FLUSHDA instruction).
- The ECC error should be triggered on the LOAD instruction unless it is only detected during the writeback of a dirty line. In the writeback of a dirty line case, the ECC error is triggered an undefined number of instructions later.
3.6.3.9. DTCMs
- Use a WRCTL instruction to set the ECCINJ.DTCM field to INJS or INJD for the desired DTCM.
- Use a STW instruction to write an address in the DTCM.
- Use a RDCTL instruction to ensure the value of the field written by the WRCTL to ECCINJ is NOINJ.
- Use a LOAD instruction from the same address in the DTCM.
- The ECC error should be triggered on the LOAD instruction.
3.6.3.10. MMU TLB RAM
- Use a WRCTL instruction to set ECCINJ.TLB to INJS or INJD.
- Use a WRCTL instruction to write a TLB entry. The ECC error is injected at this time and any associated uTLB entry can be flushed.
- Use a RDCTL instruction to ensure the value of ECCINJ.TLB is NOINJ.
- Perform an instruction/data access to cause the hardware to read the TLB entry (copied into uTLB) and the ECC decoder should detect the ECC error at this time. Alternatively, initiate a software read of the TLB (by writing TLBMISC.RD to 1).
- If a software read was initiated, the TLBMISC.EE field should be set to 1 on any instruction after the WRCTL that triggered the software read.
- If a hardware read was initiated, the ECC error should be triggered on the first instruction after the hardware read.
3.7. Exception Processing
All Nios II exceptions are precise. Precise exceptions enable the system software to re-execute the instruction, if desired, after handling the exception.
3.7.1. Terminology
- Exception—a transfer of control away from a program’s normal flow of execution, caused by an event, either internal or external to the processor, which requires immediate attention.
- Interrupt—an exception caused by an explicit request signal from an external device; also: hardware interrupt.
- Interrupt controller—hardware that interfaces the processor to interrupt request signals from external devices.
- Internal interrupt controller—the nonvectored interrupt controller that is integral to the processor. The internal interrupt controller is available in all revisions of the processor.
- Vectored interrupt controller (VIC)—a vendor-provided external interrupt controller.
- Exception (interrupt) latency—The time elapsed between the event that causes the exception (assertion of an interrupt request) and the execution of the first instruction at the handler address.
- Exception (interrupt) response time—The time elapsed between the event that causes the exception (assertion of an interrupt request) and the execution of non-overhead exception code, that is, specific to the exception type (device).
- Global interrupts—All maskable exceptions on the processor, including internal interrupts and maskable external interrupts, but not including nonmaskable interrupts.
- Worst-case latency—The value of the exception (interrupt) latency, assuming the maximum disabled time or maximum masked time, and assuming that the exception (interrupt) occurs at the beginning of the masked/disabled time.
- Maximum disabled time—The maximum amount of continuous time that the system spends with maskable interrupts disabled.
- Maximum masked time—The maximum amount of continuous time that the system spends with a single interrupt masked.
- Shadow register set—a complete alternate set of Nios II general-purpose registers, which can be used to maintain a separate runtime context for an ISR.
3.7.2. Exception Overview
Each of the Nios II exceptions falls into one of the following categories:
- Reset exception—Occurs when the processor is reset. Control is transferred to the reset address you specify in the processor IP core setup parameters.
- Break exception—Occurs when the JTAG debug module requests control. Control is transferred to the break address you specify in the processor IP core setup parameters.
- Interrupt exception—Occurs when a peripheral device signals a condition requiring service.
- Instruction-related exception—Occurs when any of several internal conditions occurs, as detailed in the Exceptions Table. Control is transferred to the exception address you specify in the processor IP core setup parameters.
The following table columns specify information for the exceptions:
- Exception—Gives the name of the exception.
- Type—Specifies the exception type.
- Available—Specifies when support for that exception is present.
- Cause—Specifies the value of the CAUSE field of the exception register, for exceptions that write the exception.CAUSE field.
- Address—Specifies the instruction or data address associated with the exception.
- Vector—Specifies which exception vector address the processor passes control to when the exception occurs.
Exception | Type | Available | Cause | Address | Vector |
---|---|---|---|---|---|
Reset | Reset | Always | 0 | | Reset |
Hardware break | Break | Always | — | | Break |
Processor-only reset request | Reset | Always | 1 | | Reset |
Internal interrupt | Interrupt | Internal interrupt controller | 2 | ea–4 12 | General exception |
External nonmaskable interrupt | Interrupt | External interrupt controller interface | — | ea–4 12 | Requested handler address 13 |
External maskable interrupt | Interrupt | External interrupt controller interface | 2 | ea–4 12 | Requested handler address 13 |
ECC TLB error (instruction) | Instruction-related | MMU and ECC | 18 | ea–4 12 | General exception |
Supervisor-only instruction address 11 | Instruction-related | MMU | 9 | ea–4 12 | General exception |
Fast TLB miss (instruction)11 | Instruction-related | MMU | 12 | pteaddr.VPN, ea–4 12 | Fast TLB Miss exception |
Double TLB miss (instruction) 11 | Instruction-related | MMU | 12 | pteaddr.VPN, ea–4 12 | General exception |
TLB permission violation (execute) 11 | Instruction-related | MMU | 13 | pteaddr.VPN, ea–4 12 | General exception |
ECC register file error | Instruction-related | ECC | 20 | ea–4 12 | General exception |
MPU region violation (instruction) 11 | Instruction-related | MPU | 16 | ea–4 12 | General exception |
Supervisor-only instruction | Instruction-related | MMU or MPU | 10 | ea–4 12 | General exception |
Trap instruction | Instruction-related | Always | 3 | ea–4 12 | General exception |
Illegal instruction | Instruction-related | Illegal instruction detection on, MMU, or MPU | 5 | ea–4 12 | General exception |
Unimplemented instruction | Instruction-related | Always | 4 | ea–4 12 | General exception |
Break instruction | Instruction-related | Always | — | ba–4 12 | Break |
Supervisor-only data address | Instruction-related | MMU | 11 | badaddr (data address) | General exception |
Misaligned data address | Instruction-related | Illegal memory access detection on, MMU, or MPU | 6 | badaddr (data address) | General exception |
Misaligned destination address | Instruction-related | Illegal memory access detection on, MMU, or MPU | 7 | badaddr (destination address) | General exception |
ECC TLB error (data) | Instruction-related | MMU and ECC | 18 | badaddr (data address) | General exception |
Division error | Instruction-related | Division error detection on | 8 | ea–4 12 | General exception |
Fast TLB miss (data) | Instruction-related | MMU | 12 | pteaddr.VPN, badaddr (data address) | Fast TLB Miss exception |
Double TLB miss (data) | Instruction-related | MMU | 12 | pteaddr.VPN, badaddr (data address) | General exception |
TLB permission violation (read) | Instruction-related | MMU | 14 | pteaddr.VPN, badaddr (data address) | General exception |
TLB permission violation (write) | Instruction-related | MMU | 15 | pteaddr.VPN, badaddr (data address) | General exception |
MPU region violation (data) | Instruction-related | MPU | 17 | badaddr (data address) | General exception |
Exception | Type | Available | Cause | Address | Vector |
---|---|---|---|---|---|
Reset | Reset | Always | 0 | | Reset |
Hardware break | Break | Always | — | | Break |
Processor-only reset request | Reset | Always | 1 | | Reset |
ECC Data Cache Writeback Error | Instruction-related | ECC and data cache | 22 | | General exception |
Internal interrupt | Interrupt | Internal interrupt controller | 2 | ea–4 12 | General exception |
External nonmaskable interrupt | Interrupt | External interrupt controller interface | — | ea–4 12 | Requested handler address 13 |
External maskable interrupt | Interrupt | External interrupt controller interface | 2 | ea–4 12 | Requested handler address 13 |
ECC TLB error (instruction) | Instruction-related | MMU and ECC | 18 | ea–4 12 | General exception |
Supervisor-only instruction address 11 | Instruction-related | MMU | 9 | ea–4 12 | General exception |
Fast TLB miss (instruction)11 | Instruction-related | MMU | 12 | pteaddr.VPN, ea–4 12 | Fast TLB Miss exception |
Double TLB miss (instruction) 11 | Instruction-related | MMU | 12 | pteaddr.VPN, ea–4 12 | General exception |
TLB permission violation (execute) 11 | Instruction-related | MMU | 13 | pteaddr.VPN, ea–4 12 | General exception |
ECC register file error | Instruction-related | ECC | 20 | ea–4 12 | General exception |
MPU region violation (instruction) 11 | Instruction-related | MPU | 16 | ea–4 12 | General exception |
Bus Instruction Fetch Error | Instruction-related | M Core | 23 | ea–4 12 | General exception |
ECC Fetch Error (instruction fetch) | Instruction-related | ECC and ITCM | 19 | ea–4 12 | General exception |
ECC Register File Error | Instruction-related | ECC | 20 | ea–4 12 | General exception |
Supervisor-only instruction | Instruction-related | MMU or MPU | 10 | ea–4 12 | General exception |
Trap instruction | Instruction-related | Always | 3 | ea–4 12 | General exception |
Illegal instruction | Instruction-related | Illegal instruction detection on, MMU, or MPU | 5 | ea–4 12 | General exception |
Unimplemented instruction | Instruction-related | Always | 4 | ea–4 12 | General exception |
Break instruction | Instruction-related | Always | — | ba–4 12 | Break |
Supervisor-only data address | Instruction-related | MMU | 11 | badaddr (data address) | General exception |
Misaligned data address | Instruction-related | Illegal memory access detection on, MMU, or MPU | 6 | badaddr (data address) | General exception |
Misaligned destination address | Instruction-related | Illegal memory access detection on, MMU, or MPU | 7 | badaddr (destination address) | General exception |
ECC TLB error (data) | Instruction-related | MMU and ECC | 18 | badaddr (data address) | General exception |
Division error | Instruction-related | Division error detection on | 8 | ea–4 12 | General exception |
Fast TLB miss (data) | Instruction-related | MMU | 12 | pteaddr.VPN, badaddr (data address) | Fast TLB Miss exception |
Double TLB miss (data) | Instruction-related | MMU | 12 | pteaddr.VPN, badaddr (data address) | General exception |
TLB permission violation (read) | Instruction-related | MMU | 14 | pteaddr.VPN, badaddr (data address) | General exception |
TLB permission violation (write) | Instruction-related | MMU | 15 | pteaddr.VPN, badaddr (data address) | General exception |
MPU region violation (data) | Instruction-related | MPU | 17 | badaddr (data address) | General exception |
Bus Data Region Violation | Instruction-related | M core | 24 | badaddr (data address) | General exception |
ECC Data Error | Instruction-related | ECC and (data cache OR DTCM) | 21 | badaddr (data address) | General exception |
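An exception handler typically dispatches on the cause code. The following C sketch maps the Cause values listed in the tables above to short names for logging or debugging. The function name `exception_cause_name` is hypothetical, and extracting the CAUSE field from the exception register is assumed to have been done by the caller.

```c
#include <stddef.h>

/* Map a CAUSE value from the exception tables to a descriptive name.
 * Returns NULL for values that the tables do not list. */
static const char *exception_cause_name(unsigned cause)
{
    static const char *const names[] = {
        [0]  = "Reset",
        [1]  = "Processor-only reset request",
        [2]  = "Interrupt (internal or external maskable)",
        [3]  = "Trap instruction",
        [4]  = "Unimplemented instruction",
        [5]  = "Illegal instruction",
        [6]  = "Misaligned data address",
        [7]  = "Misaligned destination address",
        [8]  = "Division error",
        [9]  = "Supervisor-only instruction address",
        [10] = "Supervisor-only instruction",
        [11] = "Supervisor-only data address",
        [12] = "TLB miss (fast or double)",
        [13] = "TLB permission violation (execute)",
        [14] = "TLB permission violation (read)",
        [15] = "TLB permission violation (write)",
        [16] = "MPU region violation (instruction)",
        [17] = "MPU region violation (data)",
        [18] = "ECC TLB error",
        [19] = "ECC fetch error (instruction fetch)",
        [20] = "ECC register file error",
        [21] = "ECC data error",
        [22] = "ECC data cache writeback error",
        [23] = "Bus instruction fetch error",
        [24] = "Bus data region violation",
    };
    if (cause >= sizeof names / sizeof names[0])
        return NULL;
    return names[cause];
}
```

Note that cause 12 alone does not distinguish fast from double TLB misses or instruction from data accesses; the vector taken and the address registers in the tables carry that information.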
3.7.3. Exception Latency
Exception latency specifies how quickly the system can respond to an exception. Exception latency depends on the type of exception, the software and hardware configuration, and the processor state.
3.7.3.1. Interrupt Latency
The interrupt controller can mask individual interrupts. Each interrupt can have a different maximum masked time. The worst-case interrupt latency for interrupt i is determined by that interrupt’s maximum masked time, or by the maximum disabled time, whichever is greater.
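The "whichever is greater" rule can be written as a one-line helper. This is a sketch only: the function name and the use of clock cycles as the unit are assumptions, and the result is the blocking time that bounds the worst case, not a complete latency figure for any particular core.

```c
#include <stdint.h>

/* Return the blocking time that determines the worst-case latency for
 * interrupt i: the larger of its maximum masked time and the system's
 * maximum disabled time. Units (here, cycles) are the caller's choice. */
static uint32_t worst_case_blocked_cycles(uint32_t max_masked_i,
                                          uint32_t max_disabled)
{
    return (max_masked_i > max_disabled) ? max_masked_i : max_disabled;
}
```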
3.7.4. Reset Exceptions
- Sets status.RSIE to 1, and clears all other fields of the status register.
- Invalidates the instruction cache line associated with the reset vector.
- Begins executing the reset handler, located at the reset vector.
Clearing the status.PIE field disables maskable interrupts. If the MMU or MPU is present, clearing the status.U field forces the processor into supervisor mode.
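The reset value of status and its consequences can be expressed in a few lines of C. This sketch assumes that RSIE is bit 23, U is bit 1, and PIE is bit 0 of the status register; the macro and function names are hypothetical.

```c
#include <stdint.h>

/* Assumed status register bit positions. */
#define STATUS_PIE   (1u << 0)
#define STATUS_U     (1u << 1)
#define STATUS_RSIE  (1u << 23)

/* Reset sets RSIE to 1 and clears every other field. */
static uint32_t status_after_reset(void)
{
    return STATUS_RSIE;
}

/* U = 0 means supervisor mode (relevant when an MMU or MPU is present). */
static int in_supervisor_mode(uint32_t status)
{
    return (status & STATUS_U) == 0;
}

/* PIE = 0 means maskable interrupts are disabled. */
static int maskable_interrupts_enabled(uint32_t status)
{
    return (status & STATUS_PIE) != 0;
}
```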
Invalidating the reset cache line guarantees that instruction fetches for reset code come from uncached memory.
Aside from the instruction cache line associated with the reset vector, the contents of the cache memories are indeterminate after reset. To ensure cache coherency after reset, the reset handler located at the reset vector must immediately initialize the instruction cache. Next, either the reset handler or a subsequent routine should proceed to initialize the data cache.
The reset state is undefined for all other system components, including but not limited to:
- General-purpose registers, except for zero (r0) in the normal register set, which is permanently zero.
- Control registers, except for status. status.RSIE is reset to 1, and the remaining fields are reset to 0.
- Instruction and data memory.
- Cache memory, except for the instruction cache line associated with the reset vector.
- Peripherals. Refer to the appropriate peripheral data sheet or specification for reset conditions.
- Custom instruction logic
- Nios II C-to-hardware (C2H) acceleration compiler logic.
For the reset conditions of custom instruction logic, refer to the Nios II Custom Instruction User Guide.
3.7.5. Break Exceptions
Break processing is the means by which software debugging tools implement debug and diagnostic features, such as breakpoints and watchpoints. Break processing is a type of exception processing, but the break mechanism is independent from general exception processing. A break can occur during exception processing, enabling debug tools to debug exception handlers.
The processor enters the break processing state under either of the following conditions:
- The processor executes the break instruction. This is often referred to as a software break.
- The JTAG debug module asserts a hardware break.
3.7.5.1. Processing a Break
- Stores the contents of the status register to bstatus.
- Clears status.PIE to zero, disabling maskable interrupts.
- Writes the address of the instruction following the break to the ba register (r30) in the normal register set.
- Clears status.U to zero, forcing the processor into supervisor mode, when the system contains an MMU or MPU.
- Sets status.EH to one, indicating the processor is handling an exception, when the system contains an MMU.
- Copies status.CRS to status.PRS and then sets status.CRS to 0.
- Transfers execution to the break handler, stored at the break vector specified in the Processor parameter editor.
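The register updates listed above can be traced with a host-side C model. This is a sketch under stated assumptions: the `cpu_model` type and `take_break` function are hypothetical, and the status field positions (PIE bit 0, U bit 1, EH bit 2, CRS bits 15..10, PRS bits 21..16) are assumed. The transfer of control to the break vector is not modeled.

```c
#include <stdint.h>

#define STATUS_PIE       (1u << 0)
#define STATUS_U         (1u << 1)
#define STATUS_EH        (1u << 2)
#define STATUS_CRS_SHIFT 10
#define STATUS_PRS_SHIFT 16
#define STATUS_SET_MASK  0x3Fu

/* Hypothetical model of the registers that break processing touches. */
typedef struct {
    uint32_t status, bstatus, ba;
} cpu_model;

static void take_break(cpu_model *c, uint32_t next_instr_addr,
                       int has_mmu, int has_mpu)
{
    uint32_t crs;

    c->bstatus = c->status;            /* save status               */
    c->status &= ~STATUS_PIE;          /* disable maskable interrupts */
    c->ba = next_instr_addr;           /* return address in ba (r30) */
    if (has_mmu || has_mpu)
        c->status &= ~STATUS_U;        /* force supervisor mode     */
    if (has_mmu)
        c->status |= STATUS_EH;        /* mark exception handling   */

    /* Copy CRS to PRS, then select the normal register set (CRS = 0). */
    crs = (c->status >> STATUS_CRS_SHIFT) & STATUS_SET_MASK;
    c->status &= ~((STATUS_SET_MASK << STATUS_PRS_SHIFT) |
                   (STATUS_SET_MASK << STATUS_CRS_SHIFT));
    c->status |= crs << STATUS_PRS_SHIFT;
}
```

For instance, a break taken in register set 3 with U = 1 and PIE = 1 on a system with an MMU leaves the pre-break status in bstatus and a new status with PRS = 3, CRS = 0, EH = 1, U = 0, and PIE = 0.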
3.7.5.2. Understanding Register Usage
The bstatus control register and general-purpose registers bt (r25) and ba (r30) in the normal register set are reserved for debugging. Code is not prevented from writing to these registers, but debug code might overwrite the values. The break handler can use bt (r25) to help save additional registers.
3.7.5.3. Returning From a Break
After processing a break, the break handler releases control of the processor by executing a bret instruction. The bret instruction restores status by copying the contents of bstatus and returns program execution to the address in the ba register (r30) in the normal register set. Aside from bt and ba, all registers are guaranteed to be returned to their pre-break state after returning from the break handler.
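The effect of bret can be modeled in the same spirit. The `cpu_model` type and `bret_model` function below are hypothetical and cover only the two documented effects: restoring status from bstatus and resuming at the address held in ba.

```c
#include <stdint.h>

/* Hypothetical model of the state that bret reads and writes. */
typedef struct {
    uint32_t status, bstatus, ba, pc;
} cpu_model;

/* Restore status from bstatus and resume execution at the address in ba. */
static void bret_model(cpu_model *c)
{
    c->status = c->bstatus;
    c->pc = c->ba;
}
```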
3.7.6. Interrupt Exceptions
- The external interrupt controller interface
- The internal interrupt controller
3.7.6.1. External Interrupt Controller Interface
The processor does not depend on any particular implementation of an EIC. The degree of EIC configurability, and EIC configuration methods, are implementation-specific. This section discusses the EIC interface, and general features of EICs. For usage details, refer to the documentation for the specific EIC in your system.
When an IRQ is asserted, the EIC presents the following information to the processor:
- The requested handler address (RHA)—Refer to the Requested Handler Address section of this chapter
- The requested interrupt level (RIL)—Refer to the Requested Interrupt Level section of this chapter
- The requested register set (RRS)—Refer to the Requested Register Set section of this chapter
- Requested nonmaskable interrupt (RNMI) mode—Refer to the Requested NMI Mode section of this chapter
The processor EIC interface connects to a single EIC, but an EIC can support a daisy-chained configuration. In a daisy-chained configuration, multiple EICs can monitor and prioritize interrupts. The EIC directly connected to the processor presents the processor with the highest-priority interrupt from all EICs in the daisy chain.
An EIC component can support an arbitrary level of daisy-chaining, potentially allowing the processor to handle an arbitrary number of prioritized interrupts.
For a typical EIC implementation, refer to the Vectored Interrupt Controller chapter in the Embedded Peripherals IP User Guide.
3.7.6.1.1. Requested Handler Address
The RHA for each interrupt is typically software-configurable. The method for specifying the RHA is dependent on the specific EIC implementation.
If the processor is implemented with an MMU, the processor treats handler addresses as virtual addresses.
3.7.6.1.2. Requested Interrupt Level
The RIL is ignored for nonmaskable interrupts.
3.7.6.1.3. Requested Register Set
The method of assigning register sets to interrupts depends on the specific EIC implementation. Register set assignments can be software-configurable.
Multiple interrupts can be configured to share a register set. In this case, the interrupt handlers must be written so as to avoid register corruption. For example, one of the following conditions must be true:
- The interrupts cannot pre-empt one another. For example, all interrupts are at the same level.
- Registers are saved in software. For example, each interrupt handler saves its own registers on entry, and restores them on exit.
Typically, the processor is configured so that when it takes an interrupt, other interrupts in the same register set are disabled. If interrupt preemption within a register set is desired, the interrupt handler can re-enable interrupts in its register set.
By default, the processor disables maskable interrupts when it takes an interrupt request. To enable nested interrupts, system software or the ISR itself must re-enable interrupts after the interrupt is taken.
3.7.6.1.4. Requested NMI Mode
status.IL and RIL are ignored for nonmaskable interrupts.
3.7.6.1.5. Shadow Register Sets
For the best interrupt performance, assign a dedicated register set to each of the most time-critical interrupts. Less-critical interrupts can share register sets, provided the ISRs are protected from register corruption as noted in the Requested Register Set section of this chapter.
The method for mapping interrupts to register sets is specific to the particular EIC implementation.
3.7.6.2. Internal Interrupt Controller
- The PIE bit of the status control register is one.
- An interrupt-request input, irqn, is asserted.
- The corresponding bit n of the ienable control register is one.
Upon hardware interrupt, the processor clears the PIE bit to zero, disabling further interrupts, and performs the other steps outlined in the "Exception Processing Flow" section of this chapter.
The value of the ipending control register shows which interrupt requests (IRQ) are pending. By peripheral design, an IRQ bit is guaranteed to remain asserted until the processor explicitly responds to the peripheral.
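The three conditions above can be modeled in a few lines of C. This is a minimal sketch, not the hardware implementation: the `iic_state` structure and function names are hypothetical, and the model assumes that ipending reports the bitwise AND of the asserted irq inputs and ienable.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of the internal interrupt controller state. */
typedef struct {
    bool     pie;      /* status.PIE */
    uint32_t irq;      /* one bit per irqn input, 1 = asserted */
    uint32_t ienable;  /* ienable control register */
} iic_state;

/* Assumption: a request is pending when it is both asserted and enabled. */
static uint32_t iic_ipending(const iic_state *s)
{
    return s->irq & s->ienable;
}

/* True when PIE is one and at least one asserted irqn is enabled. */
static bool iic_takes_interrupt(const iic_state *s)
{
    return s->pie && iic_ipending(s) != 0;
}
```

With irq0 and irq2 asserted and only bit 2 of ienable set, the model reports irq2 as the single pending request.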
3.7.7. Instruction-Related Exceptions
The processor generates the following instruction-related exceptions:
- Trap instruction
- Break instruction
- Unimplemented instruction
- Illegal instruction
- Supervisor-only instruction
- Supervisor-only instruction address
- Supervisor-only data address
- Misaligned data address
- Misaligned destination address
- Division error
- Fast TLB miss
- Double TLB miss
- TLB permission violation
- MPU region violation
3.7.7.1. Trap Instruction
3.7.7.2. Break Instruction
3.7.7.3. Unimplemented Instruction
For more information, refer to the "Potential Unimplemented Instructions" section of this chapter.
3.7.7.4. Illegal Instruction
Illegal instructions are instructions with an undefined opcode or opcode-extension field. The processor can check for illegal instructions and generate an exception when an illegal instruction is encountered. When your system contains an MMU or MPU, illegal instruction checking is always on. When no MMU or MPU is present, you have the option to have the processor check for illegal instructions.
For information about controlling this option, refer to the Instantiating the Processor chapter of the Processor Reference Handbook.
When the processor issues an instruction with an undefined opcode or opcode-extension field, and illegal instruction exception checking is turned on, an illegal instruction exception is generated.
Refer to the OP Encodings and OPX Encodings for R-Type Instructions tables in the Instruction Set Reference chapter of the Processor Reference Handbook to see the unused opcodes and opcode extensions.
Refer to the Nios II Core Implementation Details chapter of the Processor Reference Handbook for information about each specific Nios II core.
3.7.7.5. Supervisor-Only Instruction
This exception is implemented only in processors configured to use supervisor mode and user mode. Refer to the "Operating Modes" section of this chapter for more information.
3.7.7.6. Supervisor-Only Instruction Address
This exception is implemented only in processors that include the MMU.
3.7.7.7. Supervisor-Only Data Address
This exception is implemented only in processors that include the MMU.
3.7.7.8. Misaligned Data Address
For information about controlling this option, refer to the Instantiating the Processor chapter of the Processor Reference Handbook.
A data address is considered misaligned if the byte address is not a multiple of the data width of the load or store instruction (four bytes for word, two bytes for half-word). Byte load and store instructions are always aligned, so they never take a misaligned address exception.
3.7.7.9. Misaligned Destination Address
For information about controlling this option, refer to the Instantiating the Processor chapter of the Processor Reference Handbook.
A destination address is considered misaligned if the target byte address of the instruction is not a multiple of four.
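The two alignment rules (data address and destination address) reduce to simple modulo checks. The following C sketch illustrates them; the function names are hypothetical and the code models the rule only, not the processor's detection logic.

```c
#include <stdbool.h>
#include <stdint.h>

/* width_bytes: 1 for byte, 2 for half-word, 4 for word accesses. */
static bool is_misaligned_data(uint32_t byte_addr, uint32_t width_bytes)
{
    return (byte_addr % width_bytes) != 0;
}

/* Instruction targets must be a multiple of four bytes. */
static bool is_misaligned_destination(uint32_t target_addr)
{
    return (target_addr % 4u) != 0;
}
```

Because any address is a multiple of one, a byte access can never be reported as misaligned by this check.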
3.7.7.10. Division Error
The division error exception detects divide instructions that produce a quotient that cannot be represented. The two cases are division by zero and a signed division of the largest negative number, -2147483648 (0x80000000), by -1 (0xffffffff). Division error detection is available only if divide instructions are supported by hardware.
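The two error cases can be expressed as a predicate that software could evaluate before dividing. This is an illustrative sketch with hypothetical function names; it mirrors the conditions stated above and is not taken from any library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed divide: error on a zero divisor or on INT32_MIN / -1. */
static bool div_raises_error(int32_t numerator, int32_t denominator)
{
    if (denominator == 0) {
        return true;
    }
    return numerator == INT32_MIN && denominator == -1;
}

/* Unsigned divide: only a zero divisor is an error. */
static bool divu_raises_error(uint32_t denominator)
{
    return denominator == 0;
}
```

The signed overflow case exists because +2147483648 does not fit in a 32-bit two's-complement result.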
3.7.7.11. Fast TLB Miss
There are two kinds of fast TLB miss exceptions:
- Fast TLB miss (instruction)—Any instruction fetch can cause this exception.
- Fast TLB miss (data)—Load, store, initda, and flushda instructions can cause this exception.
The fast TLB miss exception handler can inspect the tlbmisc.D field to determine which kind of fast TLB miss exception occurred.
3.7.7.12. Double TLB Miss
There are two kinds of double TLB miss exceptions:
- Double TLB miss (instruction)—Any instruction fetch can cause this exception.
- Double TLB miss (data)—Load, store, initda, and flushda instructions can cause this exception.
The general exception handler can inspect either the exception.CAUSE or tlbmisc.D field to determine which kind of double TLB miss exception occurred.
3.7.7.13. TLB Permission Violation
There are three kinds of TLB permission violation exceptions:
- TLB permission violation (execute)—Any instruction fetch can cause this exception.
- TLB permission violation (read)—Any load instruction can cause this exception.
- TLB permission violation (write)—Any store instruction can cause this exception.
The general exception handler can inspect the exception.CAUSE field to determine which permissions were violated.
3.7.7.14. MPU Region Violation
- An instruction fetch or data address matched a region but the permissions for that region did not allow the action to complete.
- An instruction fetch or data address did not match any region.
The general exception handler reads the MPU region attributes to determine if the address did not match any region or which permissions were violated.
There are two kinds of MPU region violation exceptions:
- MPU region violation (instruction)—Any instruction fetch can cause this exception.
- MPU region violation (data)—Load, store, initda, and flushda instructions can cause this exception.
The general exception handler can inspect the exception.CAUSE field to determine which kind of MPU region violation exception occurred.
3.7.8. Other Exceptions
3.7.9. Exception Processing Flow
3.7.9.1. Processing General Exceptions
The fast TLB miss exception handler handles only the fast TLB miss exception. It is built for speed so that TLB misses are processed quickly. The fast TLB miss exception handler address is specified in the Processor parameter editor, where it is called the fast TLB miss exception vector.
3.7.9.2. Exception Flow with the EIC Interface
- RHA—The requested handler address for the interrupt handler assigned to the requested interrupt.
- RRS—The requested register set to be used when the interrupt handler executes. If shadow register sets are not implemented, RRS must always be 0.
- RIL—The requested interrupt level specifies the priority of the interrupt.
- RNMI—The requested NMI flag specifies whether to treat the interrupt as nonmaskable.
For further information about the RHA, RRS, RIL and RNMI, refer to “The Nios II/f Core” in the Nios II Core Implementation Details chapter of the Nios II Processor Reference Handbook.
When the EIC interface presents an interrupt to the processor, the processor uses several criteria, as follows, to determine whether to take the interrupt:
- Nonmaskable interrupts—The processor takes any NMI as long as it is not processing a previous NMI.
- Maskable interrupts—The processor takes a maskable interrupt if maskable interrupts are enabled, and if the requested interrupt level is higher than that of the interrupt currently being processed (if any). However, if shadow register sets are implemented, the processor takes the interrupt only if the interrupt requests a register set different from the current register set, or if the register set interrupt enable flag (status.RSIE) is set.
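Table 53. Conditions Required to Take External Interrupt
RNMI | status.NMI | status.PIE | RIL vs. status.IL | Processor Has Shadow Register Sets | RRS vs. status.CRS | status.RSIE | Take Interrupt? |
---|---|---|---|---|---|---|---|
1 | 0 | | | | | | Yes |
1 | 1 | | | | | | No |
0 | | 0 | | | | | No |
0 | | 1 | RIL <= status.IL | | | | No |
0 | | 1 | RIL > status.IL | Yes | RRS == status.CRS | 0 | No 14 |
0 | | 1 | RIL > status.IL | Yes | RRS == status.CRS | 1 | Yes |
0 | | 1 | RIL > status.IL | Yes | RRS != status.CRS | | Yes |
0 | | 1 | RIL > status.IL | No | | | Yes |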
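The decision criteria for external interrupts can be summarized as a single predicate. The sketch below is a software model under stated assumptions: the `cpu_status` and `eic_request` structures and the function name are hypothetical, and the fields are held as separate members rather than packed into the real register layout.

```c
#include <stdbool.h>

/* Hypothetical unpacked view of the relevant status fields. */
typedef struct {
    bool     nmi;   /* status.NMI  */
    bool     pie;   /* status.PIE  */
    bool     rsie;  /* status.RSIE */
    unsigned il;    /* status.IL   */
    unsigned crs;   /* status.CRS  */
} cpu_status;

/* Hypothetical view of what the EIC presents. */
typedef struct {
    bool     rnmi;
    unsigned ril;
    unsigned rrs;
} eic_request;

static bool takes_external_interrupt(const cpu_status *st,
                                     const eic_request *rq,
                                     bool has_shadow_sets)
{
    if (rq->rnmi) {
        return !st->nmi;            /* NMI: taken unless already in an NMI */
    }
    if (!st->pie || rq->ril <= st->il) {
        return false;               /* masked, or level not high enough */
    }
    if (!has_shadow_sets) {
        return true;
    }
    /* Same register set is allowed only when status.RSIE is set. */
    return rq->rrs != st->crs || st->rsie;
}
```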
The processor supports fast nested interrupts with shadow register sets, as described in the "Shadow Register Set" section of this chapter.
Keeping status.PIE set allows higher-level interrupts to be taken immediately, without requiring the interrupt handler to set status.PIE to 1.
The processor disables maskable interrupts when taking an exception, just as it does without shadow register sets. An individual interrupt handler can re-enable interrupts by setting status.PIE to 1, if desired.
3.7.9.3. Exception Flow with the Internal Interrupt Controller
Interrupts can be re-enabled by writing one to the PIE bit, thereby allowing the current ISR to be interrupted. Typically, the exception routine adjusts ienable so that IRQs of equal or lower priority are disabled before re-enabling interrupts.
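As an illustration of the ienable adjustment, the following C sketch computes the mask an ISR might write before setting PIE again. It assumes a software convention in which a lower IRQ number means a higher priority; the internal interrupt controller itself does not impose this ordering, and the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Returns the subset of ienable that keeps only IRQs of strictly higher
 * priority than current_irq (0..31) enabled.
 * Assumption: lower IRQ number = higher priority.
 */
static uint32_t ienable_for_nesting(uint32_t ienable, unsigned current_irq)
{
    uint32_t higher = (current_irq == 0) ? 0u : ((1u << current_irq) - 1u);
    return ienable & higher;
}
```

An ISR following this pattern would save the original ienable value, write the reduced mask, set PIE, and restore ienable before returning.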
Refer to "Handling Nested Exceptions" for more information.
3.7.9.4. Exceptions and Processor Status
Processor Status Register or Field | System Status Before Taking Exception | |||||||
---|---|---|---|---|---|---|---|---|
External Interrupt Asserted 15 | Internal Interrupt Asserted or Noninterrupt Exception | |||||||
status.EH==1 30 | status.EH==0 | status.EH==1 | status.EH==0 | |||||
TLB Miss 32 | No TLB Miss | |||||||
RRS==0 31 | RRS!=0 | RRS==0 | RRS!=0 | TLB Permission Violation 32 | No TLB Permission Violation | |||
pteaddr.VPN 16 | No change | VPN 17 | No change | |||||
status.PRS 31 | No change | status.CRS 31 33 | No change | |||||
pc | RHA | General exception vector 18 | Fast TLB exception vector 19 | General exception vector 31 | ||||
sstatus 20 34 | No change | status 33 21 | No change | |||||
estatus 34 | No change | status 33 | No change | status 33 | ||||
ea | No change | return address 22 | No change | return address | ||||
tlbmisc.D 30 | No change | 23 | ||||||
tlbmisc.DBL 30 | No change | 24 | ||||||
tlbmisc.PERM 30 | No change | 25 | ||||||
tlbmisc.BAD 30 | No change | 26 | ||||||
status.PIE | No change | 0 27 | ||||||
status.EH 30 | No change | 1 28 | ||||||
status.IH 36 | 1 | No change | ||||||
status.NMI 36 | RNMI | No change | ||||||
status.IL 36 | RIL | No change | ||||||
status.RSIE 31 36 | 0 | No change | ||||||
status.CRS 31 | RRS | No change | ||||||
status.U 30 | 0 29 |
3.7.10. Determining the Cause of Interrupt and Instruction-Related Exceptions
3.7.10.1. With Extra Exception Information
When you have included the extra exception information in your Nios II system, the CAUSE field of the exception register contains a code for the highest-priority exception occurring at the time and the BADDR field of the badaddr register contains the byte instruction address or data address for certain exceptions.
For more information, refer to the Exceptions table in the Exception Overview section.
To determine the cause of an exception, simply read the cause of the exception from exception.CAUSE and then transfer control to the appropriate exception routine.
3.7.10.2. Without Extra Exception Information
When the extra exception information is not available, use the sequence in the example below to determine the cause of an exception.
Determining Exception Cause Without Extra Exception Information
    /* With an internal interrupt controller, check for interrupt exceptions. */
    /* With an external interrupt controller, ipending is always 0, and this */
    /* check can be omitted. */
    if (estatus.PIE == 1 and ipending != 0) {
        handle interrupt
    } else {
        /* Decode exception from instruction */
        /* Note: Because the exception register is included with the MMU and */
        /* MPU, you never need to determine MMU or MPU exceptions by decoding */
        decode instruction at $ea-4
        if (instruction is trap)
            handle trap exception
        else if (instruction is load or store)
            handle misaligned data address exception
        else if (instruction is branch, bret, callr, eret, jmp, or ret)
            handle misaligned destination address exception
        else if (instruction is unimplemented)
            handle unimplemented instruction exception
        else if (instruction is illegal)
            handle illegal instruction exception
        else if (instruction is divide) {
            if (denominator == 0)
                handle division error exception
            else if (instruction is signed divide and
                     numerator == 0x80000000 and
                     denominator == 0xffffffff)
                handle division error exception
        } else {
            /* Not any known exception */
            handle unknown exception (if the internal interrupt controller
            is implemented, this could be a spurious interrupt)
        }
    }
3.7.10.3. /f Exception Processing
The CAUSE field of the exception register contains a code for the highest-priority exception occurring at the time. The BADDR field of the badaddr register contains the byte instruction address or data address for certain exceptions.
For more information, refer to the Exceptions table in the Exception Overview section.
To determine the cause of an exception, simply read the cause of the exception from exception.CAUSE and then transfer control to the appropriate exception routine.
3.7.10.4. /e Exception Processing
Determining Exception Cause for /e Exception Processing
    /* With an internal interrupt controller, check for interrupt exceptions. */
    /* With an external interrupt controller, ipending is always 0, and this */
    /* check can be omitted. */
    if (estatus.PIE == 1 and ipending != 0) {
        handle interrupt
    } else {
        /* Decode exception from instruction */
        /* Note: Because the exception register is included with the MMU and */
        /* MPU, you never need to determine MMU or MPU exceptions by decoding */
        decode instruction at $ea-4
        if (instruction is trap)
            handle trap exception
        else if (instruction is load or store)
            handle misaligned data address exception
        else if (instruction is branch, bret, callr, eret, jmp, or ret)
            handle misaligned destination address exception
        else if (instruction is unimplemented)
            handle unimplemented instruction exception
        else if (instruction is illegal)
            handle illegal instruction exception
        else if (instruction is divide) {
            if (denominator == 0)
                handle division error exception
            else if (instruction is signed divide and
                     numerator == 0x80000000 and
                     denominator == 0xffffffff)
                handle division error exception
        } else {
            /* Not any known exception */
            handle unknown exception (if the internal interrupt controller
            is implemented, this could be a spurious interrupt)
        }
    }
3.7.11. Handling Nested Exceptions
- An exception handler enables maskable interrupts
- An EIC is present, and an NMI occurs
- An EIC is present, and the processor is configured to keep maskable interrupts enabled when taking an interrupt
- An exception handler triggers an instruction-related exception
For details about when the processor takes exceptions, refer to the "Exception Processing Flow" section of this chapter.
For details about unimplemented instructions, refer to the Processor Architecture chapter of the Processor Reference Handbook.
For details about MMU and MPU exceptions, refer to the Instruction-Related Exceptions section of this chapter.
A system can be designed to eliminate the possibility of nested exceptions. However, if nested exceptions are possible, the exception handlers must be carefully written to prevent each handler from corrupting the context in which a pre-empted handler runs.
If an exception handler issues a trap instruction, an optional instruction, or an instruction which could generate an MMU or MPU exception, it must save and restore the contents of the estatus and ea registers.
3.7.11.1. Nested Exceptions with the Internal Interrupt Controller
3.7.11.2. Nested Exceptions with an External Interrupt Controller
When individual external interrupts have dedicated shadow register sets, the processor supports fast interrupt handling with no overhead for saving register contents. To take full advantage of fast interrupt handling, system software must set up certain conditions. With the following conditions satisfied, ISRs need not save and restore register contents on entry and exit:
- Automatic nested interrupts are enabled.
- Each interrupt is assigned to a dedicated shadow register set.
- All interrupts with the same RIL are assigned to dedicated shadow register sets.
- Multiple interrupts with different RILs can be assigned to a single shadow register set. However, with multiple register sets, you must not allow the RILs assigned to one shadow register set to overlap the RILs assigned to another register set.
The following tables demonstrate the validity of register set assignments when preemption within a register set is enabled.
RIL | Register Set 1 | Register Set 2 |
---|---|---|
1 | IRQ0 | |
2 | IRQ1 | |
3 | IRQ2 | |
4 | IRQ3 | |
5 | IRQ4 | |
6 | IRQ5 | |
7 | IRQ6 |
RIL | Register Set 1 | Register Set 2 |
---|---|---|
1 | IRQ0 | |
2 | IRQ1 | |
3 | IRQ3 | |
4 | IRQ2 | |
5 | IRQ4 | |
6 | IRQ5 | |
7 | IRQ6 |
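The non-overlap rule for RILs can be checked mechanically. The sketch below takes one reading of that rule, as an assumption: an assignment is rejected when a RIL belonging to one register set lies strictly between two RILs belonging to another register set. The `irq_assignment` type and the function name are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    unsigned ril;      /* requested interrupt level of this IRQ */
    unsigned reg_set;  /* shadow register set assigned to this IRQ */
} irq_assignment;

/*
 * Returns false if any register set's RILs are interleaved with the RILs
 * of a different register set (assumed meaning of "overlap").
 */
static bool ril_ranges_disjoint(const irq_assignment *a, size_t n)
{
    for (size_t lo = 0; lo < n; lo++) {
        for (size_t hi = 0; hi < n; hi++) {
            if (a[lo].reg_set != a[hi].reg_set) {
                continue;
            }
            for (size_t k = 0; k < n; k++) {
                if (a[k].reg_set != a[lo].reg_set &&
                    a[lo].ril < a[k].ril && a[k].ril < a[hi].ril) {
                    return false;
                }
            }
        }
    }
    return true;
}
```

The cubic loop is adequate for the small number of interrupt sources in a typical system and keeps the check easy to audit.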
Multiple interrupts can share a register set, with some loss of performance. There are two techniques for sharing register sets:
- Set status.RSIE to 0. When an ISR is running in a given register set, the processor does not take any maskable interrupt assigned to the same register set. Such interrupts must wait for the running ISR to complete, regardless of their interrupt level.
- Ensure that each ISR saves and restores registers on entry and exit, and set status.RSIE to 1 after registers are saved. When an ISR is running in a given register set, the processor takes an interrupt in the same register set if it has a higher interrupt level.
The processor disables interrupts when taking a maskable interrupt (nonmaskable interrupts always disable maskable interrupts). Individual ISRs can re-enable nested interrupts by setting status.PIE to 1, as described in the "Nested Exceptions with the Internal Interrupt Controller" section of this chapter.
3.7.12. Handling Nonmaskable Interrupts
NMIs leave intact the processor state associated with maskable interrupts and other exceptions, as well as normal, nonexception processing, when each NMI is assigned to a dedicated shadow register set. Therefore, NMIs can be handled transparently.
3.7.13. Masking and Disabling Exceptions
3.7.13.1. Disabling Maskable Interrupts
3.7.13.2. Masking Interrupts with an External Interrupt Controller
The status.IL field controls what level of external maskable interrupts can be serviced. The processor services a maskable interrupt only if its requested interrupt level is greater than status.IL.
An ISR can make run-time adjustments to interrupt nesting by manipulating status.IL. For example, if an ISR is running at level 5, to temporarily allow pre-emption by another level 5 interrupt, it can set status.IL to 4.
To enable all external interrupts, set status.IL to 0. To disable all external interrupts, set status.IL to 63.
3.7.13.3. Masking Interrupts with the Internal Interrupt Controller
Refer to the "Exception Processing" section of this chapter for more information.
An ISR can adjust ienable so that IRQs of equal or lower priority are disabled. Refer to the "Handling Nested Exceptions" section of this chapter for more information.
3.7.13.4. Returning From Interrupt and Instruction-Related Exceptions
You must ensure that when an exception handler modifies registers, they are restored when it returns. This can be taken care of in either of the following ways:
- In the case of ISRs, if the EIC interface and shadow register sets are implemented, and the ISR has a dedicated register set, no software action is required. The processor returns to the previous register set when it executes eret, which restores the register contents. For details, refer to the "Nested Exceptions with an External Interrupt Controller" section of this chapter.
- In the case of noninterrupt exceptions, for ISRs in a system with the internal interrupt controller, and for ISRs without a dedicated shadow register set, the exception handler must save registers on entry and restore them on exit. Saving the register contents on the stack is a typical, re-entrant implementation.
When executing the eret instruction, the processor performs the following tasks:
- Restores the previous contents of status as follows:
- If status.CRS is 0, copies estatus to status
- If status.CRS is nonzero, copies sstatus to status
- Transfers program execution to the address in the ea register (r29) in the register set specified by the original value of status.CRS.
3.7.13.4.1. Return Address Considerations
When returning from instruction-related exceptions, execution must resume from the instruction following the instruction where the exception occurred. Therefore, ea contains the correct return address.
On the other hand, hardware interrupt exceptions must resume execution from the interrupted instruction itself. In this case, the exception handler must subtract 4 from ea to point to the interrupted instruction.
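The return address rule can be captured in a one-line helper. This is an illustrative C model with a hypothetical function name; a real handler performs the adjustment on the ea register itself before executing eret.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns the address at which execution should resume.
 * Hardware interrupts re-execute the interrupted instruction (ea - 4);
 * instruction-related exceptions resume at ea unchanged.
 */
static uint32_t exception_return_address(uint32_t ea, bool is_hardware_interrupt)
{
    return is_hardware_interrupt ? ea - 4u : ea;
}
```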
3.8. Memory and Peripheral Access
Nios II addresses are 32 bits, allowing access up to a 4-gigabyte address space. Nios II core implementations without MMUs restrict addresses to 31 bits or fewer. The MMU supports the full 32-bit physical address.
For details, refer to the Nios II Core Implementation Details chapter of the Nios II Processor Reference Handbook.
Peripherals, data memory, and program memory are mapped into the same address space. The locations of memory and peripherals within the address space are determined at system generation time. Reading or writing to an address that does not map to a memory or peripheral produces an undefined result.
The processor’s data bus is 32 bits wide. Instructions are available to read and write byte, half-word (16-bit), or word (32-bit) data.
The Nios II architecture uses little-endian byte ordering. For data wider than 8 bits stored in memory, the more-significant bits are located in higher addresses.
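To make the byte ordering concrete, the following C sketch stores and loads a 32-bit word in little-endian order using explicit byte operations, so it behaves the same on any host. The function names are hypothetical.

```c
#include <stdint.h>

/* Store a word so that the least-significant byte is at the lowest address. */
static void store_word_le(uint8_t *mem, uint32_t value)
{
    mem[0] = (uint8_t)(value & 0xffu);
    mem[1] = (uint8_t)((value >> 8) & 0xffu);
    mem[2] = (uint8_t)((value >> 16) & 0xffu);
    mem[3] = (uint8_t)((value >> 24) & 0xffu);
}

/* Reassemble a word from four bytes stored in little-endian order. */
static uint32_t load_word_le(const uint8_t *mem)
{
    return (uint32_t)mem[0] |
           ((uint32_t)mem[1] << 8) |
           ((uint32_t)mem[2] << 16) |
           ((uint32_t)mem[3] << 24);
}
```

Storing 0x11223344 places 0x44 at the lowest address and 0x11 at the highest.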
The Nios II architecture supports register and immediate addressing.
3.8.1. Cache Memory
The Nios II architecture and instruction set accommodate the presence of data cache and instruction cache memories. Cache management is implemented in software by using cache management instructions. Instructions are provided to initialize the cache, flush the caches whenever necessary, and to bypass the data cache to properly access memory-mapped peripherals.
The Nios II architecture provides the following mechanisms to bypass the cache:
- When no MMU is present, bit 31 of the address is reserved for bit-31 cache bypass. With bit-31 cache bypass, the address space of processor cores is 2 GB, and the high bit of the address controls the caching of data memory accesses.
- When the MMU is present, cacheability is controlled by the MMU, and bit 31 functions as a normal address bit. For details, refer to the Address Space and Memory Partitions section and the TLB Organization section of this chapter.
- Cache bypass instructions, such as ldwio and stwio.
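For cores without an MMU that implement bit-31 cache bypass, the uncached alias of an address differs from the cached one only in the high bit. The C sketch below shows the address arithmetic; the macro and function names are hypothetical, and the code does not apply when an MMU is present.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT31_BYPASS 0x80000000u  /* high address bit selects cache bypass */

/* Address that reaches the same location while bypassing the data cache. */
static uint32_t uncached_alias(uint32_t addr)
{
    return addr | BIT31_BYPASS;
}

/* Address that reaches the same location through the data cache. */
static uint32_t cached_alias(uint32_t addr)
{
    return addr & ~BIT31_BYPASS;
}

static bool bypasses_cache(uint32_t addr)
{
    return (addr & BIT31_BYPASS) != 0;
}
```

Because the high bit is consumed by the bypass selection, the remaining 31 bits give the 2 GB address space mentioned above.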
Refer to the Nios II Core Implementation Details chapter of the Processor Reference Handbook for details of which processor cores implement bit-31 cache bypass.
Refer to the Instruction Set Reference chapter of the Nios II Processor Reference Handbook for details of the cache bypass instructions.
Code written for a processor core with cache memory behaves correctly on a processor core without cache memory. The reverse is not true. If it is necessary for a program to work properly on multiple processor core implementations, the program must behave as if the instruction and data caches exist. In systems without cache memory, the cache management instructions perform no operation, and their effects are benign.
For a complete discussion of cache management, refer to the Cache and Tightly Coupled Memory chapter of the Nios II Software Developer’s Handbook.
Some consideration is necessary to ensure cache coherency after processor reset. Refer to the "Reset Exceptions" section of this chapter for more information.
For information about the cache architecture and the memory hierarchy, refer to the Processor Architecture chapter of the Processor Reference Handbook.
3.8.1.1. Virtual Address Aliasing
For example, in a 64-KB direct-mapped cache with a 16-byte line, bits 15:4 are used to select the line. Assume that virtual address 0x1000 is mapped to physical address 0xF000 and virtual address 0x2000 is also mapped to physical address 0xF000. This is an illegal virtual address alias because accesses to virtual address 0x1000 use line 0x100 and accesses to virtual address 0x2000 use line 0x200, even though they map to the same physical address. This results in two copies of the same physical address in the cache. With an n-byte direct-mapped cache, there could be n/4096 copies of the same physical address in the cache if illegal virtual address aliases are not prevented. The bits of the virtual address that are used to select the line and are translated bits (bits 12 and up) are known as the color of the address. An operating system avoids illegal virtual address aliases by ensuring that if multiple virtual addresses map to the same physical address, the virtual addresses have the same color. Note, however, that the color of the virtual addresses does not need to be the same as the color of the physical address, because the cache tag contains all the bits of the PFN.
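The color calculation for the 64-KB example can be sketched in C as follows. The constants match the example (64-KB direct-mapped cache, 16-byte lines, translated bits starting at bit 12); the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CACHE_SIZE 0x10000u  /* 64 KB, direct-mapped */
#define LINE_SIZE  16u       /* bytes per cache line */
#define PAGE_SIZE  4096u     /* bits 12 and up are translated */

/* Line index: address bits 15:4 for this cache geometry. */
static uint32_t cache_line_index(uint32_t vaddr)
{
    return (vaddr % CACHE_SIZE) / LINE_SIZE;
}

/* Color: the line-select bits that are also translated (bits 15:12 here). */
static uint32_t cache_color(uint32_t vaddr)
{
    return (vaddr % CACHE_SIZE) / PAGE_SIZE;
}

/* Two virtual aliases of one physical page are safe only with equal colors. */
static bool is_legal_alias(uint32_t vaddr_a, uint32_t vaddr_b)
{
    return cache_color(vaddr_a) == cache_color(vaddr_b);
}
```

Virtual addresses 0x1000 and 0x2000 have colors 1 and 2, so this check rejects them as aliases, while 0x1000 and 0x11000 share color 1 and are accepted. The number of distinct colors is CACHE_SIZE / PAGE_SIZE, which is 16 for this geometry.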
3.9. Instruction Set Categories
3.9.1. Data Transfer Instructions
Instruction | Description |
---|---|
ldw stw | The ldw and stw instructions load and store 32-bit data words from/to memory. The effective address is the sum of a register's contents and a signed immediate value contained in the instruction. Memory transfers can be cached or buffered to improve program performance. This caching and buffering might cause memory cycles to occur out of order, and caching might suppress some cycles entirely. Data transfers for I/O peripherals should use ldwio and stwio. |
ldwio stwio | The ldwio and stwio instructions load and store 32-bit data words from/to peripherals without caching and buffering. Access cycles for ldwio and stwio instructions are guaranteed to occur in instruction order and are never suppressed. |
Instruction | Description |
---|---|
ldb ldbu stb ldh ldhu sth | ldb, ldbu, ldh and ldhu load a byte or half-word from memory to a register. ldb and ldh sign-extend the value to 32 bits, and ldbu and ldhu zero-extend the value to 32 bits. stb and sth store byte and half-word values, respectively. Memory accesses can be cached or buffered to improve performance. To transfer data to I/O peripherals, use the io versions of the instructions, described in the following table cell. |
ldbio ldbuio stbio ldhio ldhuio sthio | These operations load/store byte and half-word data from/to peripherals without caching or buffering. |
3.9.2. Arithmetic and Logical Instructions
Instruction | Description |
---|---|
and or xor nor | These are the standard 32-bit logical operations. These operations take two register values and combine them bit-wise to form a result for a third register. |
andi ori xori | These operations are immediate versions of the and, or, and xor instructions. The 16-bit immediate value is zero-extended to 32 bits, and then combined with a register value to form the result. |
andhi orhi xorhi | In these versions of and, or, and xor, the 16-bit immediate value is shifted logically left by 16 bits to form a 32-bit operand. Zeroes are shifted in from the right. |
add sub mul div divu | These are the standard 32-bit arithmetic operations. These operations take two registers as input and store the result in a third register. |
addi subi muli | These instructions are immediate versions of the add, sub, and mul instructions. The instruction word includes a 16-bit signed value. |
mulxss mulxuu | These instructions provide access to the upper 32 bits of a 32x32 multiplication operation. Choose the appropriate instruction depending on whether the operands should be treated as signed or unsigned values. It is not necessary to precede these instructions with a mul. |
mulxsu | This instruction is used in computing a 128-bit result of a 64x64 signed multiplication. |
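The relationship between mul and the mulx instructions can be modeled in C with 64-bit intermediate products. The functions below are hypothetical software models of the described results, not compiler intrinsics.

```c
#include <stdint.h>

/* mul: lower 32 bits of the product (same for signed and unsigned). */
static uint32_t model_mul(uint32_t a, uint32_t b)
{
    return a * b;
}

/* mulxuu: upper 32 bits of the unsigned 32x32 product. */
static uint32_t model_mulxuu(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * (uint64_t)b) >> 32);
}

/* mulxss: upper 32 bits of the signed 32x32 product. */
static uint32_t model_mulxss(int32_t a, int32_t b)
{
    uint64_t product = (uint64_t)((int64_t)a * (int64_t)b);
    return (uint32_t)(product >> 32);
}
```

A full 64-bit product is the pair formed by the mulx result (upper half) and the mul result (lower half), which is why the two instructions can be issued independently.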
3.9.3. Move Instructions
Instruction | Description |
---|---|
mov movhi movi movui movia | mov copies the value of one register to another register. movi moves a 16-bit signed immediate value to a register, and sign-extends the value to 32 bits. movui and movhi move a 16-bit immediate value into the lower or upper 16-bits of a register, inserting zeros in the remaining bit positions. Use movia to load a register with an address. |
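The effect of each immediate move on the destination register can be modeled in C. The functions are hypothetical models of the described results. The last helper shows one way a 32-bit constant could be assembled from two 16-bit halves by combining the movhi result with a zero-extended lower half, as an ori would; it is not a statement of how movia is actually expanded.

```c
#include <stdint.h>

/* movi: sign-extend a 16-bit immediate to 32 bits. */
static uint32_t model_movi(int16_t imm)
{
    return (uint32_t)(int32_t)imm;
}

/* movui: place the immediate in the lower 16 bits, zeros above. */
static uint32_t model_movui(uint16_t imm)
{
    return (uint32_t)imm;
}

/* movhi: place the immediate in the upper 16 bits, zeros below. */
static uint32_t model_movhi(uint16_t imm)
{
    return (uint32_t)imm << 16;
}

/* Illustrative two-step constant load: upper half, then OR in the lower half. */
static uint32_t model_load_constant(uint16_t hi, uint16_t lo)
{
    return model_movhi(hi) | model_movui(lo);
}
```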
3.9.4. Comparison Instructions
Instruction | Description |
---|---|
cmpeq | == |
cmpne | != |
cmpge | signed >= |
cmpgeu | unsigned >= |
cmpgt | signed > |
cmpgtu | unsigned > |
cmple | signed <= |
cmpleu | unsigned <= |
cmplt | signed < |
cmpltu | unsigned < |
cmpeqi cmpnei cmpgei cmpgeui cmpgti cmpgtui cmplei cmpleui cmplti cmpltui | These instructions are immediate versions of the comparison operations. They compare the value of a register and a 16-bit immediate value. Signed operations sign-extend the immediate value to 32 bits. Unsigned operations fill the upper bits with zero. |
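The difference between sign extension and zero extension of the immediate is easiest to see with the same bit patterns fed to a signed and an unsigned comparison. The C functions below are hypothetical models of cmplti and cmpltui as described in the table; each returns 1 when the comparison is true and 0 otherwise.

```c
#include <stdint.h>

/* cmplti: signed <, immediate sign-extended to 32 bits. */
static uint32_t model_cmplti(int32_t reg, int16_t imm)
{
    return reg < (int32_t)imm ? 1u : 0u;
}

/* cmpltui: unsigned <, immediate zero-extended to 32 bits. */
static uint32_t model_cmpltui(uint32_t reg, uint16_t imm)
{
    return reg < (uint32_t)imm ? 1u : 0u;
}
```

With a register holding 0xfffffffb and an immediate of 0xffff, the signed form compares -5 with -1 and yields 1, while the unsigned form compares 4294967291 with 65535 and yields 0.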
3.9.5. Shift and Rotate Instructions
Instruction | Description |
---|---|
rol ror roli | The rol and roli instructions provide left bit-rotation. roli uses an immediate value to specify the number of bits to rotate. The ror instruction provides right bit-rotation. There is no immediate version of ror, because roli can be used to implement the equivalent operation. |
sll slli sra srl srai srli | These shift instructions implement the << and >> operators of the C programming language. The sll, slli, srl, srli instructions provide left and right logical bit-shifting operations, inserting zeros. The sra and srai instructions provide arithmetic right bit-shifting, duplicating the sign bit in the most significant bit. slli, srli and srai use an immediate value to specify the number of bits to shift. |
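
The remark that roli can stand in for a right rotate by an immediate follows from the identity "rotate right by n equals rotate left by 32 - n". The C sketch below shows that identity alongside the logical and arithmetic right shifts. The function names are hypothetical, and masking the count to five bits is an assumption of this model.

```c
#include <assert.h>
#include <stdint.h>

/* rol / roli: rotate left; this model uses only the low 5 bits of the count. */
static uint32_t model_rol(uint32_t x, unsigned n)
{
    n &= 31u;
    return n ? (x << n) | (x >> (32u - n)) : x;
}

/* ror: rotate right. */
static uint32_t model_ror(uint32_t x, unsigned n)
{
    n &= 31u;
    return n ? (x >> n) | (x << (32u - n)) : x;
}

/* Right rotation by an immediate, expressed as a left rotation by 32 - n. */
static uint32_t ror_imm_via_roli(uint32_t x, unsigned n)
{
    return model_rol(x, (32u - (n & 31u)) & 31u);
}

/* srl / srli: logical right shift, zeros enter from the left. */
static uint32_t model_srl(uint32_t x, unsigned n)
{
    return x >> (n & 31u);
}

/* sra / srai: arithmetic right shift, the sign bit is duplicated. */
static uint32_t model_sra(uint32_t x, unsigned n)
{
    uint32_t result;
    n &= 31u;
    result = x >> n;
    if ((x & 0x80000000u) && n) {
        result |= ~(0xFFFFFFFFu >> n);   /* fill vacated bits with ones */
    }
    return result;
}

/* A few worked values for the models above. */
static void shift_model_examples(void)
{
    assert(model_ror(0x80000001u, 1u) == 0xC0000000u);
    assert(ror_imm_via_roli(0x80000001u, 1u) == 0xC0000000u);
    assert(model_srl(0x80000000u, 4u) == 0x08000000u);
    assert(model_sra(0x80000000u, 4u) == 0xF8000000u);
}
```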
3.9.6. Program Control Instructions
Instruction | Description |
---|---|
call | This instruction calls a subroutine using an immediate value as the subroutine's absolute address, and stores the return address in register ra. |
callr | This instruction calls a subroutine at the absolute address contained in a register, and stores the return address in register ra. This instruction serves the role of dereferencing a C function pointer. |
ret | The ret instruction is used to return from subroutines called by call or callr. ret loads and executes the instruction specified by the address in register ra. |
jmp | The jmp instruction jumps to an absolute address contained in a register. jmp is used to implement switch statements of the C programming language. |
jmpi | The jmpi instruction jumps to an absolute address using an immediate value to determine the absolute address. |
br | This instruction branches relative to the current instruction. A signed immediate value gives the offset of the next instruction to execute. |
The conditional branch instructions compare register values directly, and branch if the expression is true. The conditional branches support the following equality and relational comparisons of the C programming language:
- == and !=
- < and <= (signed and unsigned)
- > and >= (signed and unsigned)
The conditional branch instructions do not have delay slots.
Instruction | Description |
---|---|
bge bgeu bgt bgtu ble bleu blt bltu beq bne | These instructions provide relative branches that compare two register values and branch if the expression is true. Refer to the "Comparison Instructions" section of this chapter for a description of the relational operations implemented. |
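
The C constructs mentioned in this section look like the following. This is an illustrative sketch with hypothetical names. Whether a given compiler actually emits callr for the indirect call, or a jump table with jmp for the switch, depends on the compiler and its optimization settings, so treat the comments as typical outcomes rather than guarantees.

```c
#include <assert.h>

typedef int (*binary_op)(int, int);

static int op_add(int a, int b) { return a + b; }
static int op_sub(int a, int b) { return a - b; }

/* Indirect call through a function pointer: the case callr is meant for. */
static int apply(binary_op op, int a, int b)
{
    assert(op != 0);
    return op(a, b);
}

/*
 * A dense switch statement. A compiler may lower this to a jump table and an
 * indirect jmp, or to a chain of conditional branches such as beq and blt.
 */
static int dispatch(int selector, int a, int b)
{
    switch (selector) {
    case 0:  return apply(op_add, a, b);
    case 1:  return apply(op_sub, a, b);
    case 2:  return a * b;
    case 3:  return a == b;
    default: return 0;
    }
}
```

For example, dispatch(0, 2, 3) returns 5 by way of the function pointer, while dispatch(2, 2, 3) returns 6 directly from the switch arm.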
3.9.7. Other Control Instructions
Instruction | Description |
---|---|
trap eret | The trap and eret instructions generate and return from exceptions. These instructions are similar to the call/ret pair, but are used for exceptions. trap saves the status register in the estatus register, saves the return address in the ea register, and then transfers execution to the general exception handler. eret returns from exception processing by restoring status from estatus, and executing the instruction specified by the address in ea. |
break bret | The break and bret instructions generate and return from breaks. break and bret are used exclusively by software debugging tools. Programmers never use these instructions in application code. |
rdctl wrctl | These instructions read and write control registers, such as the status register. The value is read from or stored to a general-purpose register. |
flushd flushda flushi initd initda initi | These instructions are used to manage the data and instruction cache memories. |
flushp | This instruction flushes all prefetched instructions from the pipeline. This is necessary before jumping to recently modified instruction memory. |
sync | This instruction ensures that all previously issued operations have completed before allowing execution of subsequent load and store operations. |
rdprs wrprs | These instructions copy a general-purpose register value between the current register set and another register set. wrprs can set r0 to 0 in a shadow register set. System software must use wrprs to initialize r0 to 0 in each shadow register set before using that register set. |
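
The register traffic of the trap and eret pair can be pictured with a small host-side model. The cpu_model structure and function names below are hypothetical. The model captures only the save and restore steps described in the table; it ignores any other changes the hardware makes to status on exception entry, and it assumes the return address is the next 4-byte instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal, hypothetical view of the state touched by trap and eret. */
typedef struct {
    uint32_t pc;       /* address of the instruction being executed */
    uint32_t status;   /* status control register                   */
    uint32_t estatus;  /* saved copy of status                      */
    uint32_t ea;       /* exception return address register         */
} cpu_model;

/* trap: save status and the return address, then enter the handler. */
static void model_trap(cpu_model *cpu, uint32_t handler_addr)
{
    cpu->estatus = cpu->status;
    cpu->ea = cpu->pc + 4u;   /* assumption: next 32-bit instruction */
    cpu->pc = handler_addr;
}

/* eret: restore status from estatus and resume at the address in ea. */
static void model_eret(cpu_model *cpu)
{
    cpu->status = cpu->estatus;
    cpu->pc = cpu->ea;
}

/* Round trip: the handler may change status, and eret undoes that. */
static void trap_model_example(void)
{
    cpu_model cpu = { 0x100u, 0x1u, 0u, 0u };

    model_trap(&cpu, 0x20u);
    assert(cpu.estatus == 0x1u && cpu.ea == 0x104u && cpu.pc == 0x20u);

    cpu.status = 0u;          /* handler modifies status while running */
    model_eret(&cpu);
    assert(cpu.status == 0x1u && cpu.pc == 0x104u);
}
```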
3.9.8. Custom Instructions
For more information, refer to the “Custom Instructions” section of the Processor Architecture chapter of the Processor Reference Handbook.
For further information, refer to the Nios II Custom Instruction User Guide.
Machine-generated C functions and assembly language macros provide access to custom instructions, and hide implementation details from the user. Therefore, most software developers never use the custom assembly language instruction directly.
3.9.9. No-Operation Instruction
The Nios II assembler provides a no-operation instruction, nop.
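3.9.10. Potential Unimplemented Instructions
Depending on the processor core and its configuration, the following instructions might not be implemented in hardware, and can therefore generate an unimplemented instruction exception: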
- mul
- muli
- mulxss
- mulxsu
- mulxuu
- div
- divu
- initda
All other instructions are guaranteed not to generate an unimplemented instruction exception.
An exception routine must exercise caution if it uses these instructions, because they could generate another exception before the previous exception is properly handled.
Refer to the "Unimplemented Instruction" section of this chapter for more information regarding unimplemented instruction processing.
3.10. Programming Model Revision History
Date | Version | Changes |
---|---|---|
October 2016 | 2016.10.28 | Removed extra exception information option from chapter. |
April 2015 | 2015.04.02 |
|
February 2014 | 13.1.0 |
|
May 2011 | 11.0.0 | Added references to new system integration tool. |
December 2010 | 10.1.0 | Maintenance release. |
July 2010 | 10.0.0 | Maintenance release. |
November 2009 | 9.1.0 |
|
March 2009 | 9.0.0 | Maintenance release. |
November 2008 | 8.1.0 | Maintenance release. |
May 2008 | 8.0.0 | Added text to describe the MMU, MPU, and advanced exceptions. |
October 2007 | 7.2.0 |
|
May 2007 | 7.1.0 |
|
March 2007 | 7.0.0 | Maintenance release. |
November 2006 | 6.1.0 | Maintenance release. |
May 2006 | 6.0.0 | Maintenance release. |
October 2005 | 5.1.0 | Maintenance release. |
May 2005 | 5.0.0 | Maintenance release. |
September 2004 | 1.1 |
|
May 2004 | 1.0 | Initial release. |
Document Version | Changes |
---|---|
2019.12.20 |
|
2019.04.30 | Maintenance release |
2018.04.18 |
|
2016.10.28 | Removed extra exception information option. |
2015.04.02 | Initial release |
4. Instantiating the Nios II Processor
This chapter describes the Nios® II Processor parameter editor in Qsys. The Nios II Processor parameter editor allows you to specify the processor features for a particular Nios II hardware system. This chapter covers the features of the Nios II processor that you can configure with the Nios II Processor parameter editor; it is not a user guide for creating complete Nios II processor systems.
To get started designing custom Nios II systems, refer to the Nios II Hardware Development Tutorial.
Development kits for Altera devices, available on the All Development Kits page of the Altera website, also provide ready-made hardware design examples that demonstrate different configurations of the Nios II processor.
4.1. Core Nios II Tab
The Core Nios II tab presents the main settings for configuring the Nios II processor.
Name | Description |
---|---|
Select a Nios II Core | |
Nios II Core | Refer to the "Core Selection" section. |
Hardware Arithmetic Operation | |
Hardware multiplication type | Refer to the "Multiply and Divide Settings" section. |
Hardware divide | |
Reset Vector | |
Reset vector memory | Refer to the "Reset Vector" section. |
Reset vector offset | |
Reset vector | |
Exception Vector | |
Exception vector memory | Refer to the "General Exception Vector" section. |
Exception vector offset | |
Exception vector | |
MMU and MPU | |
Include MMU | Refer to the "Memory Management Unit Settings" section. |
Fast TLB Miss Exception vector memory | |
Fast TLB Miss Exception vector offset | |
Fast TLB Miss Exception vector | |
Include MPU | Refer to the "Memory Protection Unit Settings" section. |
The following sections describe the configuration settings available.
4.1.1. Core Selection
The main purpose of the Core Nios II tab is to select the processor core. The core you select on this tab affects other options available on this and other tabs.
Altera offers the following Nios II cores:
- Nios II/f—The Nios II/f fast core is designed for fast performance. As a result, this core presents the most configuration options, allowing you to fine-tune the processor for performance.
- Nios II/s—The Nios II/s standard core is designed for small size while maintaining performance.
- Nios II/e—The Nios II/e economy core is designed to achieve the smallest possible core size. As a result, this core has a limited feature set, and many settings are not available when the Nios II/e core is selected.
The Core Nios II tab displays a selector guide table that lists the basic properties of each core.
For implementation information about each core, refer to the Nios II Core Implementation Details chapter of the Nios II Processor Reference Handbook.
4.1.2. Multiply and Divide Settings
The Nios II/s and Nios II/f cores offer hardware multiply and divide options. You can choose the best option to balance embedded multiplier usage, logic element (LE) usage, and performance.
The Hardware multiplication type parameter for each core provides the following list:
- DSP Block—Include DSP block multipliers in the arithmetic logic unit (ALU). This option is only selectable when targeting devices that have DSP block multipliers.
- Embedded Multipliers—Include embedded multipliers in the ALU. This option is only present when targeting FPGA devices that have embedded multipliers.
- Logic Elements—Include LE-based multipliers in the ALU. This option achieves high multiply performance without consuming embedded multiplier resources, but with reduced fMAX.
- None—This option conserves logic resources by eliminating multiply hardware. Multiply operations are implemented in software.
Turning on Hardware divide includes LE-based divide hardware in the ALU. The Hardware divide option achieves much greater performance than software emulation of divide operations.
For information about the performance effects of the hardware multiply and divide options, refer to the Nios II Core Implementation Details chapter of the Nios II Processor Reference Handbook.
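To give a feel for what software multiply and divide cost when the hardware options are turned off, the C sketch below shows two classic bit-serial algorithms: shift-and-add multiplication and restoring division. These are generic textbook routines with hypothetical names. They are not the emulation code shipped with any Nios II toolchain, which may use different algorithms.

```c
#include <assert.h>
#include <stdint.h>

/* Shift-and-add multiply: returns the lower 32 bits of a * b. */
static uint32_t soft_mul32(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    while (b != 0u) {
        if (b & 1u) {
            result += a;      /* add the shifted multiplicand for each set bit */
        }
        a <<= 1;
        b >>= 1;
    }
    return result;
}

/* Restoring division: returns the unsigned quotient n / d (d must be nonzero). */
static uint32_t soft_divu32(uint32_t n, uint32_t d)
{
    uint64_t remainder = 0;
    uint32_t quotient = 0;
    int bit;

    assert(d != 0u);
    for (bit = 31; bit >= 0; bit--) {
        /* Bring down the next dividend bit, most significant first. */
        remainder = (remainder << 1) | ((n >> bit) & 1u);
        if (remainder >= d) {
            remainder -= d;
            quotient |= (uint32_t)1u << bit;
        }
    }
    return quotient;
}
```

The multiply loop runs once per significant bit of the multiplier and the divide loop always runs 32 times, which gives a sense of why hardware support is so much faster for these operations.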
4.1.3. Reset Vector
Parameters in this section select the memory module where the reset code (boot loader) resides, and the location of the reset vector (reset address). The reset vector cannot be configured until your system memory components are in place.
The Reset vector memory list, which includes all memory modules mastered by the Nios II processor, selects the reset vector memory module. In a typical system, select a nonvolatile memory module for the reset code.
Reset vector offset specifies the location of the reset vector relative to the memory module’s base address. Qsys calculates the physical address of the reset vector when you modify the memory module, the offset, or the memory module’s base address. In Qsys, Reset vector displays the read-only, calculated address. The address is always a physical address, even when an MMU is present.
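The calculated address is simply the memory module's base address plus the offset. A minimal C sketch of that arithmetic follows. The memory_module structure, its span field, and the example addresses are hypothetical and used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a memory module in the system. */
typedef struct {
    uint32_t base;   /* base physical address of the module */
    uint32_t span;   /* size of the module in bytes         */
} memory_module;

/* Physical vector address = module base address + vector offset. */
static uint32_t vector_address(const memory_module *mem, uint32_t offset)
{
    assert(offset < mem->span);   /* the vector must lie inside the module */
    return mem->base + offset;
}
```

For instance, a module based at 0x08000000 with a vector offset of 0x20 yields a vector address of 0x08000020. Changing either the base address or the offset changes the result, which is why the tool recalculates the displayed value.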
For information about reset exceptions, refer to the Programming Model chapter of the Nios II Processor Reference Handbook.
4.1.4. General Exception Vector
Parameters in this section select the memory module where the general exception vector (exception address) resides, and the location of the general exception vector. The general exception vector cannot be configured until your system memory components are in place.
The Exception vector memory list, which includes all memory modules mastered by the Nios II processor, selects the exception vector memory module. In a typical system, select a low-latency memory module for the exception code.
Exception vector offset specifies the location of the exception vector relative to the memory module’s base address. Qsys calculates the physical address of the exception vector when you modify the memory module, the offset, or the memory module’s base address. In Qsys, Exception vector displays the read-only, calculated address. The address is always a physical address, even when an MMU is present.
For information about exceptions, refer to the Programming Model chapter of the Nios II Processor Reference Handbook.
4.1.5. Memory Management Unit Settings
4.1.5.1. Fast TLB Miss Exception Vector
The fast TLB miss exception vector is a special exception vector used exclusively by the MMU to handle TLB miss exceptions. Parameters in this section select the memory module where the fast TLB miss exception vector (exception address) resides, and the location of the fast TLB miss exception vector. The fast TLB miss exception vector cannot be configured until your system memory components are in place.
The Fast TLB Miss Exception vector memory list, which includes all memory modules mastered by the Nios II processor, selects the exception vector memory module. In a typical system, select a low-latency memory module for the exception code.
Fast TLB Miss Exception vector offset specifies the location of the exception vector relative to the memory module’s base address. Qsys calculates the physical address of the exception vector when you modify the memory module, the offset, or the memory module’s base address. In Qsys, Fast TLB Miss Exception vector displays the read-only, calculated address. The address is always a physical address, even when an MMU is present.
For information about the Nios II MMU, refer to the Programming Model chapter of the Nios II Processor Reference Handbook.
To function correctly with the MMU, the base physical address of all exception vectors (reset, general exception, break, and fast TLB miss) must point to low physical memory so that hardware can correctly map their virtual addresses into the kernel partition. This restriction is enforced by the Nios II Processor parameter editor.
4.1.6. Memory Protection Unit Settings
For information about the Nios II MPU, refer to the Programming Model chapter of the Nios II Processor Reference Handbook.
4.2. Caches and Memory Interfaces Tab
Name | Description |
---|---|
Instruction Master | |
Instruction cache | Refer to the "Instruction Master Settings" section. |
Burst transfers | |
Number of tightly coupled instruction master port(s) | |
Data Master | |
Omit data master port | Refer to the "Data Master Settings" section. |
Data cache | |
Data cache line size | |
Burst transfers | |
Data cache victim buffer implementation | |
Number of tightly coupled data master port(s) | |
The following sections describe the configuration settings available.
4.2.1. Instruction Master Settings
The Instruction Master parameters provide the following options for the Nios II/f and Nios II/s cores:
- Instruction cache—Specifies the size of the instruction cache. Valid sizes are from 512 bytes to 64 KBytes, or None.
Choosing None disables the instruction cache, which also removes the Avalon-MM instruction master port from the Nios II processor. In this case, you must include a tightly-coupled instruction memory.
- Burst transfers—The Nios II processor can fill its instruction cache lines using burst transfers. Usually you enable bursts on the processor's instruction master when instructions are stored in DRAM, and disable bursts when instructions are stored in SRAM.
Bursting to DRAM typically improves memory bandwidth, but might consume additional FPGA resources. Be aware that when bursts are enabled, accesses to slaves might go through additional hardware (called burst adapters) which might decrease your fMAX.
When the Nios II processor transfers execution to the first word of a cache line, the processor fills the line by executing a sequence of word transfers that have ascending addresses, such as 0, 4, 8, 12, 16, 20, 24, 28.
However, when the Nios II processor transfers execution to an instruction that is not the first word of a cache line, the processor fetches the required (or “critical”) instruction first, and then fills the rest of the cache line. The addresses of a burst increase until the last word of the cache line is filled, and then continue with the first word of the cache line. For example, with a 32-byte cache line, transferring control to address 8 results in a burst with the following address sequence: 8, 12, 16, 20, 24, 28, 0, 4.
- Data cache victim buffer implementation—Specifies whether to use RAM or registers. The data cache victim buffer temporarily holds a dirty cache line while the data is written back to external memory.
- Number of tightly coupled instruction master port(s) (Include tightly coupled instruction master port(s))—Specifies one to four tightly-coupled instruction master ports for the Nios II processor. In Qsys, select the number from the Number of tightly coupled instruction master port(s) list. Tightly-coupled memory ports appear on the connection panel of the Nios II processor on the Qsys System Contents tab. You must connect each port to exactly one memory component in the system.
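The critical-word-first fill order described for instruction cache bursts can be reproduced with a few lines of C. The helper below is a hypothetical host-side sketch. It assumes 4-byte instruction words and a power-of-two line size, and it only generates the address sequence; it does not model the bus.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Writes the word addresses of an instruction cache line fill into out[],
 * starting at the critical word and wrapping around within the line.
 * Returns the number of words in the line. out[] must hold line_bytes / 4
 * entries, and line_bytes must be a power of two no smaller than 4.
 */
static size_t icache_fill_sequence(uint32_t addr, uint32_t line_bytes,
                                   uint32_t *out)
{
    uint32_t line_base;
    uint32_t first_offset;
    size_t words;
    size_t i;

    assert(line_bytes >= 4u && (line_bytes & (line_bytes - 1u)) == 0u);

    line_base = addr & ~(line_bytes - 1u);          /* start of the line   */
    first_offset = addr & (line_bytes - 1u) & ~3u;  /* critical word first */
    words = line_bytes / 4u;

    for (i = 0; i < words; i++) {
        /* Ascend from the critical word, wrapping at the end of the line. */
        out[i] = line_base + ((first_offset + 4u * (uint32_t)i) & (line_bytes - 1u));
    }
    return words;
}
```

With a 32-byte line, a start address of 8 produces 8, 12, 16, 20, 24, 28, 0, 4, and a start address of 0 produces the plain ascending sequence 0 through 28.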
4.2.2. Data Master Settings
- Omit data master port—Removes the Avalon-MM data master port from the Nios II processor. The port is only successfully removed when Data cache is set to None and Number of tightly coupled data master port(s) is greater than zero.
- Data cache—Specifies the size of the data cache. Valid sizes are from 512 bytes to 64 KBytes, or None. Depending on the value specified for Data cache, the following options are available:
- Data cache line size—Valid sizes are 4 bytes, 16 bytes, or 32 bytes.
- Burst transfers—The Nios II processor can fill its data cache lines using burst transfers. Usually you enable bursts on the processor's data bus when processor data is stored in DRAM, and disable bursts when processor data is stored in SRAM.
Bursting to DRAM typically improves memory bandwidth but might consume additional FPGA resources. Be aware that when bursts are enabled, accesses to slaves might go through additional hardware (called burst adapters) which might decrease your fMAX.
Bursting is only enabled for data cache line sizes greater than 4 bytes. The burst length is 4 for a 16 byte line size and 8 for a 32 byte line size. Data cache bursts are always aligned on the cache line boundary. For example, with a 32-byte Nios II data cache line, a cache miss to the address 8 results in a burst with the following address sequence: 0, 4, 8, 12, 16, 20, 24 and 28.
- Number of tightly coupled data master port(s) (Include tightly coupled data master port(s))—Specifies one to four tightly-coupled data master ports for the Nios II processor. In Qsys, select the number from the Number of tightly coupled data master port(s) list. Tightly-coupled memory ports appear on the connection panel of the Nios II processor on the Qsys System Contents tab. You must connect each port to exactly one memory component in the system.
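Unlike instruction cache fills, data cache bursts always start at the line boundary. The C sketch below computes the burst length and the aligned address sequence for the three valid line sizes. The function names are hypothetical, and the code is a host-side illustration of the rules stated above rather than a model of the bus.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Burst length in 32-bit words for a given data cache line size.
 * A result of 0 means the line is too small for bursting to be enabled.
 */
static size_t dcache_burst_length(uint32_t line_bytes)
{
    assert(line_bytes == 4u || line_bytes == 16u || line_bytes == 32u);
    return (line_bytes > 4u) ? (size_t)(line_bytes / 4u) : 0u;
}

/*
 * Writes the burst address sequence for a miss at miss_addr into out[].
 * The burst is aligned to the cache line boundary regardless of which
 * word missed. Returns the number of addresses written.
 */
static size_t dcache_fill_sequence(uint32_t miss_addr, uint32_t line_bytes,
                                   uint32_t *out)
{
    size_t words = dcache_burst_length(line_bytes);
    uint32_t line_base = miss_addr & ~(line_bytes - 1u);
    size_t i;

    for (i = 0; i < words; i++) {
        out[i] = line_base + 4u * (uint32_t)i;
    }
    return words;
}
```

A miss at address 8 with a 32-byte line therefore yields 0, 4, 8, 12, 16, 20, 24, 28, in contrast to the wrapped order used for instruction fetches.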
4.3. Advanced Features Tab
Name | Description |
---|---|
General | |
Interrupt controller | Refer to the "Interrupt Controller Interfaces" section. |
Number of shadow register sets | Refer to the "Shadow Register Sets" section. |
Include cpu_resetrequest and cpu_resettaken signals | Refer to the "Reset Signals" section. |
Assign cpuid control register value manually | Refer to the "Control Registers" section. |