4.3.5.2. Floating Point Operations
The table below provides a detailed summary of the FPU operations.
Category | Operation | Cycles² | Result | Subnormal | Rounding³ | GCC Inference |
---|---|---|---|---|---|---|
Arithmetic | FDIV.S | 14 | a ÷ b | Flush-to-0 | RNE | a / b |
Arithmetic | FSUB.S | 1 | a - b | Flush-to-0 | RNE | a - b |
Arithmetic | FADD.S | 1 | a + b | Flush-to-0 | RNE | a + b |
Arithmetic | FMUL.S | 2 | a × b | Flush-to-0 | RNE | a * b |
Arithmetic | FSQRT.S | 12 | √a | Flush-to-0 | Faithful⁴ | sqrt(a) |
Arithmetic | FMIN.S | 2 | (a < b) ? a : b | Supported | RNE | fminf() |
Arithmetic | FMAX.S | 2 | (a < b) ? b : a | Supported | RNE | fmaxf() |
Fused Arithmetic⁵ | FMADD.S | 3 | (a × b) + c | Flush-to-0 | RNE | (a * b) + c |
Fused Arithmetic | FMSUB.S | 3 | (a × b) - c | Flush-to-0 | RNE | (a * b) - c |
Fused Arithmetic | FNMSUB.S | 3 | -(a × b) + c | Flush-to-0 | RNE | -(a * b) + c |
Fused Arithmetic | FNMADD.S | 3 | -(a × b) - c | Flush-to-0 | RNE | -(a * b) - c |
Conversion | FCVT.S.W / FCVT.S.WU | 3 | int_to_float(a) | Supported | None | Casting |
Conversion | FCVT.W.S / FCVT.WU.S | 3 | float_to_int(a) | Supported | Round towards Zero | Casting |
Conversion | FCVT.W.S / FCVT.WU.S | 3 | float_to_int(a) | Supported | Round to Nearest, ties to Max Magnitude | roundf(a) |
Compare | FLT.S | 1 | (a < b) ? 1 : 0 | Flush-to-0 | RNE | a < b |
Compare | FLE.S | 1 | (a ≤ b) ? 1 : 0 | Flush-to-0 | RNE | a <= b |
Compare | FEQ.S | 1 | (a == b) ? 1 : 0 | Flush-to-0 | RNE | a == b |
Sign Injection | FSGNJN.S (FNEG.S) | 1 | -a | Supported | RNE | -a |
Sign Injection | FSGNJX.S (FABS.S) | 1 | \|a\| | Supported | RNE | fabsf(a) |
Classification | FCLASS.S | 2 | Refer to topic Floating Point Classification. | Supported | None | fpclassify(a) |
Note: a, b, and c are single-precision floating-point values. The Nios® V processor fused arithmetic operations have a rounding stage between the multiplication and the addition, so a result can differ from that of a fused multiply-add that rounds only once.
The following list describes the headers in the table above:
- Operation: Provides the name of the floating-point operation. The names match the names of the corresponding RISC-V floating-point instructions.
- Cycles: Specifies the number of cycles it takes to execute the instruction.
- Result: Describes the computation performed by the operation.
- Subnormal: Describes how the operation treats subnormal inputs and subnormal outputs. Subnormals are nonzero numbers with a magnitude less than approximately 1.17549435082e-38 (2^-126, the smallest normal single-precision value).
- Rounding: Describes how the FPU rounds the result.
- GCC Inference: Shows the C code from which GCC infers the instruction operation.
When you optimize the Floating Point Unit (FPU), pay particular attention to the FSQRT (floating-point square root) and FDIV (floating-point division) operations. Because of their complexity and long execution times (12 and 14 cycles, respectively, in the table above), they can limit the maximum frequency (Fmax) of the FPU. Use the following guidelines to decide whether to enable or disable FSQRT and FDIV.
Action | Guidelines |
---|---|
Enable | Enable the FSQRT and FDIV: |
Disable | Disable the FSQRT and FDIV: |
In summary, enabling or disabling FSQRT and FDIV in the FPU is a trade-off between two goals: a higher maximum frequency (Fmax) with lower logic usage, or better performance for floating-point operations. Base the decision on the specific requirements and constraints of your application.
2 Preliminary results.
3 Round-to-Nearest, ties to Even (RNE).
4 Faithful rounding has a maximum error of 1 Unit of Least Precision (ULP) as compared to the 0.5 ULP in RNE. Faithful rounding is employed to save area and reduce the latency of FSQRT.S.
5 The GCC toolchain infers fused arithmetic when the optimization level is -O3 or higher.