AN 846: Intel Stratix 10 Forward Error Correction
Introduction
Forward error correction (FEC) is a powerful method of correcting errors that can occur on a serial link. Although very useful, it can be costly in both area and power when implemented in soft logic. For this reason, E-Tile and H-Tile devices provide hardened FEC blocks to address many important applications, such as:
 10 Gigabit Ethernet (GbE) (H-Tile)
 25GbE (E-Tile)
 100GbE (E-Tile)
 24.3 Gbps Common Public Radio Interface (CPRI) (E-Tile)
 128 Gigabit Fibre Channel (GFC) (E-Tile)
Tile  Hardened FEC Code 

H-Tile  Fire Code (NRZ) 
E-Tile  Reed-Solomon (RS) Code (NRZ), Reed-Solomon (RS) Code (PAM4) 



Necessity of Error Correction
Transmitting data introduces many challenges. Among these challenges is the noise in a communication channel, which can result in errors in the transmission of bits. There are many types of noise.
Noise Type  Errors 

Random or shot  Uncorrelated errors 
Crosstalk  Correlated and uncorrelated errors 
Return loss  Mostly correlated errors 
Insertion loss  Uncorrelated errors 
Decision Feedback Equalizer (DFE) error propagation  Burst errors 
Solutions to Transmission Errors
Parity and Cyclic Redundancy Check
Error Correction Code (ECC) Block Coding
 Hamming code
 Low density parity check codes (LDPC)
 Convolutional codes
 Viterbi
 Various FEC codes
Error Correction Codes
In binary codes, the encoder and decoder operate on a bit basis. The 10GBASE-KR Fire Code FEC is an example of a binary code. In nonbinary codes, the encoder and decoder operate on a byte or symbol basis. Symbols may be any number of bits. Nonbinary codes use Galois (finite) field arithmetic; the Reed-Solomon code is an example of a nonbinary code.
Cyclic block codes are defined by a generator polynomial g(x). Encoding consists of adding a set of parity bits or symbols onto the data to create a code word, also called a codeword or a block. The parity is the remainder from the polynomial division of the data bits by g(x), which is easily implemented using a linear feedback shift register (LFSR). For error detection and correction, the receiver calculates the syndrome of the received code word. The syndrome is the difference between the locally generated and received parity. If the syndrome is zero, the code word is taken to be correct. If the syndrome is nonzero, the syndrome can be used to determine the most likely error.
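The parity and syndrome calculation described above can be sketched on a toy code. This is a minimal illustration, assuming the small (7, 4) cyclic code with g(x) = x^3 + x + 1 (chosen only because it fits on a page; it is not one of the codes used in the hardened FEC blocks) and assuming systematic encoding, where the data is shifted up by the parity width before the division. The function names are hypothetical.

```python
def gf2_remainder(value: int, gen: int) -> int:
    """Remainder of value(x) / gen(x) over GF(2); bit i holds the x^i coefficient."""
    gen_degree = gen.bit_length() - 1
    while value.bit_length() > gen_degree:
        # Cancel the current leading term, as an LFSR does one shift at a time.
        value ^= gen << (value.bit_length() - 1 - gen_degree)
    return value


def encode(data: int, gen: int) -> int:
    """Systematic encoding: data bits followed by the parity remainder."""
    parity_bits = gen.bit_length() - 1
    shifted = data << parity_bits
    return shifted | gf2_remainder(shifted, gen)


def correct_single_error(word: int, gen: int, length: int) -> int:
    """Use the syndrome to locate and flip a single corrupted bit."""
    syn = gf2_remainder(word, gen)
    if syn == 0:
        return word  # zero syndrome: word accepted as correct
    for position in range(length):
        if gf2_remainder(1 << position, gen) == syn:
            return word ^ (1 << position)
    raise ValueError("error pattern is not a single-bit error")


G = 0b1011                       # g(x) = x^3 + x + 1 (toy example, assumption)
codeword = encode(0b1101, G)     # 4 data bits plus 3 parity bits
received = codeword ^ (1 << 4)   # flip one bit in transit
repaired = correct_single_error(received, G, 7)
print(bin(codeword), bin(received), bin(repaired))
```

A valid code word leaves a zero remainder, so the syndrome here is simply the remainder of the received word. Real decoders replace the search loop with dedicated logic.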
What is FEC?
The receiver analyzes the check bit information to locate and correct errors. This correction allows systems to operate at higher bit error rates (BER).
While FEC provides a performance increase, it also introduces increased power consumption, increased latency, and an increased number of gates.
The extra data added to the real data allows the receiver to recover the real data when part of the transmission is corrupted.
FEC Definitions
 Reed-Solomon
 Bose-Chaudhuri-Hocquenghem (BCH)
 Concatenated codes
The type you select depends on:
 The overhead your design permits
 Burst handling capability
 Gain versus complexity (number of gates, memory, power, and so on)
 Latency considerations
There are bit error and burst limits to each code. FEC complexity increases nonlinearly as you approach the Shannon limit. The Shannon limit, sometimes called Shannon's theorem, establishes that for any given degree of noise contamination of a communication channel, it is possible to communicate discrete data (digital information) nearly error-free up to a computable maximum rate through the channel.
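For reference, the computable maximum rate is commonly written in the Shannon-Hartley form shown below. This assumes an additive white Gaussian noise channel, which is an idealization of a real backplane link; the symbols C (capacity in bits per second), B (bandwidth in Hz), and S/N (linear signal-to-noise ratio) are introduced here for illustration.

```latex
C = B \log_2\!\left(1 + \frac{S}{N}\right)
```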
FEC allows detection and correction of X bits or symbols in a block. There are limits to its correcting capability.
Code Type  Parameter Description 

Binary  n = block length; k = message length 
Nonbinary (RS-FEC, for example)  n = block length; k = message length; t = correctable symbols = (n − k)/2; m = symbol size 
FEC Selection
This section also includes an Ethernet benchmarking example to help you select the best FEC mode for the 100GbE application in this application note.
Key Considerations when Choosing a FEC
The primary considerations when choosing a FEC include:
 Hardware complexity
 Coding gain
 Latency
 Power
Coding Gain
Generally, the performance of a transmission line is characterized by the BER, where BER is the ratio of bits that have errors with respect to the total number of bits received over a transmission line. Additionally, the performance of a data transmission code is characterized as a function of the average energy per data bit (Eb) to noise power spectral density (N0) of the waveform. Eb can be expressed as the signal power (S) times the bit time (Tb). N0 can be expressed as the noise power (N) divided by the bandwidth. Therefore, Eb/N0 is equal to the SNR multiplied by (bandwidth/bit rate).
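Written out, the relationship follows directly from the definitions above. The symbols B (bandwidth) and R_b (bit rate, the reciprocal of the bit time) are names introduced here for the derivation.

```latex
\begin{aligned}
E_b &= S\,T_b, \qquad N_0 = \frac{N}{B}, \qquad R_b = \frac{1}{T_b} \\
\frac{E_b}{N_0} &= \frac{S\,T_b}{N/B} = \frac{S}{N}\cdot\frac{B}{R_b} \\
\left(\frac{E_b}{N_0}\right)_{\mathrm{dB}} &= \mathrm{SNR}_{\mathrm{dB}} + 10\log_{10}\frac{B}{R_b}
\end{aligned}
```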
The effectiveness of a FEC code is determined by the reduction in the Eb/N0 needed to ensure the specific BER. Coding gain is the reduction in the required Eb/N0 at the same BER for an uncoded versus a coded system. For example, an uncoded communication system operates at a BER of 10^{−5} at an Eb/N0 of 10 dB. Adding a strong FEC code to this communication system could reduce the ratio of Eb/N0.
Net coding gain (NCG) accounts for the bandwidth expansion needed for the FEC code, which is associated with increased noise at the receiver. Coding gain alone does not account for this. Bandwidth expansion means that the data rate must increase by a certain percentage to transmit both the real data and the extra data (FEC).
The lower the latency, the better it is from the application’s perspective. However, a small latency limits the block size of the FEC code, which in turn limits the performance of the code, and can also impact the decoder complexity.
The higher the clocking rate (the more redundancy you add, for example), the more coding gain you can achieve.
The larger the block size, the higher the coding gain, but also the higher the processing latency.
More parallelism reduces processing latency, but increases hardware complexity.
FEC in Intel Stratix 10 H-Tile Devices
Fire Code (802.3ap, 10GBASE-KR)
The code encodes 2080 bits of payload (or information symbols) and adds 32 bits of overhead (or parity symbols). The code is systematic—meaning that the information symbols are not disturbed in the encoder, and the parity symbols are added separately to the end of each block.
The (2112,2080) code is constructed by shortening the cyclic code (42987, 42955). The shortened cyclic code (2112,2080) is guaranteed to correct an error burst of up to 11 bits per block. It is a systematic code that is well suited for correction of the burst errors typical in a backplane channel resulting from error propagation in the receive equalizer.
FEC Block Format
Each FEC block ends with 32 bits of overhead (parity check bits). Transmission is from left to right within each row, and from top to bottom between rows. The payload bits carry the information symbols from the PCS layer.
T_{0}  64-bit payload Word 0  T_{1}  64-bit payload Word 1  T_{2}  64-bit payload Word 2  T_{3}  64-bit payload Word 3 
T_{4}  64-bit payload Word 4  T_{5}  64-bit payload Word 5  T_{6}  64-bit payload Word 6  T_{7}  64-bit payload Word 7 
T_{8}  64-bit payload Word 8  T_{9}  64-bit payload Word 9  T_{10}  64-bit payload Word 10  T_{11}  64-bit payload Word 11 
T_{12}  64-bit payload Word 12  T_{13}  64-bit payload Word 13  T_{14}  64-bit payload Word 14  T_{15}  64-bit payload Word 15 
T_{16}  64-bit payload Word 16  T_{17}  64-bit payload Word 17  T_{18}  64-bit payload Word 18  T_{19}  64-bit payload Word 19 
T_{20}  64-bit payload Word 20  T_{21}  64-bit payload Word 21  T_{22}  64-bit payload Word 22  T_{23}  64-bit payload Word 23 
T_{24}  64-bit payload Word 24  T_{25}  64-bit payload Word 25  T_{26}  64-bit payload Word 26  T_{27}  64-bit payload Word 27 
T_{28}  64-bit payload Word 28  T_{29}  64-bit payload Word 29  T_{30}  64-bit payload Word 30  T_{31}  64-bit payload Word 31 
32 parity bits 
Total FEC block length = (32 × 65) + 32 = 2112 bits.
FEC Block Composition
The FEC sublayer does not increase the line rate. Instead, it compresses the sync bits from the 64B/66B encoded data provided by the PCS to accommodate the addition of 32 parity check bits for every block of 2080 bits.
The BASE-R 64B/66B PCS maps 64 bits of scrambled payload and 2 bits of unscrambled synchronization header into 66-bit encoded blocks. The 2-bit synchronization header allows the PCS synchronization process to establish the 64B/66B block boundaries. The synchronization header is 01 for data blocks and 10 for control blocks. The synchronization header is the only position in the PCS block that always contains a transition, and this feature of the code establishes the 64B/66B block boundaries.
The FEC sublayer compresses the 2 bits of the synchronization header to one transcode bit. The transcode bit carries the state of the BASE-R synchronization bits for the associated payload. This is achieved by eliminating the first bit in the 64B/66B block, which is also the first synchronization bit, and preserving the second bit. The value of the second bit defines the value of the removed first bit uniquely, because it is always an inversion of the first bit. The transcode bits are further scrambled (as explained in IEEE 802.3ap Clause 74.7.4.2) to ensure DC balance.
The 32 sequential 64B/66B blocks are transcoded in this fashion, and then 32 bits of FEC parity are computed for them. The 32 transcoded words and the 32 FEC parity bits comprise a FEC block, which occupies the same 2112 bits as the original 32 66-bit blocks. The error detection property of the FEC cyclic code establishes block synchronization at FEC block boundaries at the receiver. If decoding passes successfully, the FEC decoder produces 32 65-bit words, the first decoded bit of each word being the transcode bit. Then, the inversion of the transcode bit constructs the first synchronization bit in the 64B/66B code, and the value of the second synchronization bit is equal to the transcode bit.
FEC Sublayer for BASE-R PHYs
L-Tile/H-Tile Implementation
The KR FEC blocks in the Enhanced PCS are designed in accordance with the 10GBASE-KR FEC and 40GBASE-KR FEC clauses of the IEEE 802.3 specification. The KR FEC implements the FEC as a sublayer between the PCS and PMA sublayers.
The FEC sublayer is optional and you can bypass it. When used, it provides additional margin to allow for variations in manufacturing and environmental conditions. FEC can achieve the following objectives:
 Support a forward error correction mechanism for the 10GBASE-R/KR and 40GBASE-R/KR protocols
 Support full duplex mode of the Ethernet MAC
 Support the PCS, PMA, and Physical Medium Dependent (PMD) sublayers defined for the 10GBASE-R/KR and 40GBASE-R/KR protocols
KR FEC improves the BER performance of the system.
Transcode Encoder
The transcode bit is generated from the 66 bits output by the 64B/66B encoder, which consist of a 2-bit synchronization header (S0 and S1) and a 64-bit payload (D0, D1, …, D63). To ensure a DC-balanced pattern, the transcode bit is generated by performing an XOR function on the second synchronization bit S1 and payload bit D8. The transcode bit becomes the LSB of the 65-bit pattern output of the transcode encoder.
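The transcoding step and its inverse can be sketched as follows. This is a minimal illustration, not the hardened implementation: it assumes each block is a Python list of bits in transmission order (index 0 first), places the transcode bit at index 0 of the 65-bit word, and uses hypothetical function names. The exact bit ordering in hardware is not modeled.

```python
from typing import List


def transcode_66_to_65(block66: List[int]) -> List[int]:
    """Compress the 2-bit sync header (S0, S1) into one transcode bit."""
    if len(block66) != 66:
        raise ValueError("expected a 66-bit block")
    s0, s1 = block66[0], block66[1]
    if s0 == s1:
        raise ValueError("invalid sync header: S0 and S1 must differ")
    d8 = block66[2 + 8]                 # payload bit D8
    return [s1 ^ d8] + block66[2:]      # drop S0, scramble S1 with D8


def transcode_65_to_66(word65: List[int]) -> List[int]:
    """Rebuild the sync header from the transcode bit."""
    if len(word65) != 65:
        raise ValueError("expected a 65-bit word")
    d8 = word65[1 + 8]
    s1 = word65[0] ^ d8                 # undo the XOR with D8
    return [1 - s1, s1] + word65[1:]    # S0 is always the inverse of S1


data_block = [0, 1] + [1, 0] * 32       # sync header 01 = data block
control_block = [1, 0] + [0, 1] * 32    # sync header 10 = control block
for block in (data_block, control_block):
    assert transcode_65_to_66(transcode_66_to_65(block)) == block

# 32 transcoded words plus 32 parity bits fill the space of 32 66-bit blocks.
print(32 * 65 + 32, 32 * 66)
```

Because the payload is untouched, the receiver can always read D8 and recover S1, so the compression loses no information.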
KR FEC Encoder
The code is a shortened cyclic code (2112, 2080). For each block of 2080 message bits, the encoder generates another 32 parity checks to form a total of 2112 bits. The generator polynomial is:
g(x) = x^{32} + x^{23} + x^{21} + x^{11} + x^{2} +1
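As a rough sketch, the 32 parity bits can be computed by dividing the message, shifted up by 32 positions, by g(x). The code below assumes systematic encoding with Python integers as bit vectors (bit i is the coefficient of x^i). It omits the scrambling and bit-ordering details of IEEE 802.3 Clause 74, so its output is not expected to match the hardened encoder bit for bit. Function names are hypothetical.

```python
import random

# g(x) = x^32 + x^23 + x^21 + x^11 + x^2 + 1
GEN = (1 << 32) | (1 << 23) | (1 << 21) | (1 << 11) | (1 << 2) | 1
PARITY_BITS = 32
MESSAGE_BITS = 2080


def gf2_remainder(value: int, gen: int = GEN) -> int:
    """Remainder of value(x) / gen(x) over GF(2)."""
    gen_degree = gen.bit_length() - 1
    while value.bit_length() > gen_degree:
        value ^= gen << (value.bit_length() - 1 - gen_degree)
    return value


def encode(message: int) -> int:
    """Append 32 parity bits to a 2080-bit message (systematic form)."""
    if message.bit_length() > MESSAGE_BITS:
        raise ValueError("message longer than 2080 bits")
    shifted = message << PARITY_BITS
    return shifted | gf2_remainder(shifted)


message = random.Random(846).getrandbits(MESSAGE_BITS)  # arbitrary test data
codeword = encode(message)

# An 11-bit error burst somewhere inside the block.
burst = 0b10110011101 << 500
corrupted = codeword ^ burst

print("clean syndrome is zero:", gf2_remainder(codeword) == 0)
print("burst detected:", gf2_remainder(corrupted) != 0)
```

The sketch only shows detection through a nonzero syndrome. Locating and correcting the burst requires additional decoder logic that is not shown here.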
KR FEC Scrambler
KR FEC TX Gearbox
The KR FEC TX gearbox aligns with the FEC block. Because the encoder output (also the scrambler output) has its unique word size pattern, the gearbox is specially designed to handle that pattern.
KR FEC RX Gearbox
Transcode Decoder
FEC in Intel Stratix 10 E-Tile Devices
Types of RS-FEC
RS-FEC Parameter  Parameter Name  NRZ PHY  PAM4 PHY 

FEC encoding  —  RS (528, 514, t=7, m=10)  RS (544, 514, t=15, m=10) 
Total symbols  n  528  544 
Message symbols  k  514  514 
Parity symbols  n − k  14  30 
Bits per symbol  m  10  10 
Correctable symbols  t  7  15 
Coding gain (DFE)  —  4.9 dB @ 1E-15  5.4 dB @ 1E-15 
Coding gain (random)  —  5.3 dB @ 1E-12  6.5 dB @ 1E-12 
RS (528, 514, t = 7, m = 10)
If 1 bit or all the m bits of a symbol are corrupt, this accounts for one symbol error. Because a burst of adjacent bit errors falls within a small number of symbols, symbol-based correction is well suited to burst errors.
RS-FEC can correct any seven single-bit errors.
RS (528, 514) can correct up to seven symbols. If all bits are error bits, for all seven symbols, then the total number of correctable bits is 70.
RS (544, 514, t = 15, m = 10)
If 1 bit or all the m bits of a symbol are corrupt, this accounts for one symbol error. Because a burst of adjacent bit errors falls within a small number of symbols, symbol-based correction is well suited to burst errors.
RS-FEC can correct any 15 single-bit errors.
RS (544, 514) can correct up to 15 symbols. If all bits are error bits, for all 15 symbols, then the total number of correctable bits is 150.
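The figures quoted for both codes follow from n, k, and m, as the short sketch below shows. The class and property names are hypothetical. The `guaranteed_burst_bits` value is a derived figure rather than one stated in this document: it assumes the burst is contiguous in codeword symbol order, and it ignores how symbols are distributed across physical lanes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RSCode:
    n: int  # total symbols per codeword
    k: int  # message symbols per codeword
    m: int  # bits per symbol

    @property
    def t(self) -> int:
        """Correctable symbols per codeword: (n - k) / 2."""
        return (self.n - self.k) // 2

    @property
    def codeword_bits(self) -> int:
        return self.n * self.m

    @property
    def max_correctable_bits(self) -> int:
        """Best case: every bit of t symbols is in error."""
        return self.t * self.m

    @property
    def guaranteed_burst_bits(self) -> int:
        """Longest contiguous burst that always fits within t symbols,
        whatever its alignment to symbol boundaries (derived, assumption)."""
        return (self.t - 1) * self.m + 1


RS528 = RSCode(n=528, k=514, m=10)
RS544 = RSCode(n=544, k=514, m=10)
for code in (RS528, RS544):
    print(code, code.t, code.codeword_bits,
          code.max_correctable_bits, code.guaranteed_burst_bits)
```

The gap between the best case and the single-bit case (70 bits versus seven bits for RS (528, 514)) is why the position of errors matters as much as their count.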
Supported RS-FEC Modes in E-Tile Devices
Client Type  FEC Code  Number of Physical Lanes  Marker Size (bits)  Synchronization Type 

100GbE  RS528  4  1285  AM 
100GbE with KP-FEC  RS544  2  1285  AM 
128GFC  RS528  4  514  AM 
25GbE  RS528  1  257  CWM 
32GFC  RS528  1  —  SnT 
Legend:
 RS528 = RS(528, 514)
 RS544 = RS(544, 514)
 AM = Alignment Markers
 CWM = Codeword Marker
 SnT = Scramble-and-Test
The RS-FEC core supports the following standards:
 100GbE: IEEE 802.3 Clause 91
 100GbE with KP-FEC: IEEE 802.3 Clause 91
 128GFC: Fibre Channel Framing and Signaling - 4 (FC-FS-4) Clause 5.6
 25GbE: IEEE 802.3 Clause 108
 32GFC: Fibre Channel Framing and Signaling - 4 (FC-FS-4) Clause 5.4
100GbE with KP-FEC uses two physical PAM4-coded lanes, also called the 100 Gigabit Attachment Unit Interface (CAUI-2). It uses RS(544, 514). The two physical lanes are supported by bit-multiplexing the RS-FEC core's four PMA lanes pairwise outside of the RS-FEC core. The remaining defined clients use the RS(528, 514) FEC.
In the CPRI standard, the CPRI FEC refers to 32GFC. CPRI is like 32GFC except for the line rate, which is 24.3 Gbps.
100GBASE-KR4
100GBASE-KR4 uses a nonbinary code (528, 514, 7, 10). 100GBASE-KR4 features:
 514 data symbols per codeword
 528 data plus parity symbols per codeword
 Codeword size = 10 * 528 = 5280 bits
 Correcting capability up to seven symbols within a codeword
 5 to 5.5 dB gain
 NRZ modulation
 25.78125 Gbps bit rate
 BER of 10^{-12} or better (after FEC correction)
100GBASE-KR4 Mapping (IEEE 802.3bj Clause 91)
 RS (528, 514) FEC
 Four lanes running at 25.78125 Gbps
 No rate expansion
 Data is striped per symbol across four lanes
100GBASE-KP4
100GBASE-KP4 uses a nonbinary code (544, 514, 15, 10). 100GBASE-KP4 features:
 514 data symbols per codeword
 544 data plus parity symbols per codeword
 Codeword size = 10 * 544 = 5440 bits
 Correcting capability up to 15 symbols within a codeword
 6 to 6.5 dB gain
 PAM4 modulation
 26.5625 Gbps bit rate
 BER of 10^{-12} or better (after FEC correction)
100GBASE-KP4 Mapping (IEEE 802.3bj Clause 91)
 RS (544, 514, 15, 10) FEC
 26.5625 Gbps
 3.03% rate expansion
 Data is striped per symbol across four lanes
RS(544, 514) requires additional room to accommodate 5440 bits instead of 5280 bits. After transcoding, it must additionally make room for approximately 3% more bits of overhead. The precise overhead is 1/33, because 544/528 = 34/33; new rate = old rate * 34/33. This results in overspeed for PAM4. For example:
 Payload data rate = 50 Gbps
 Encoding it to 66b encoding: 50*66/64 = 51.5625 Gbps
 Adding FEC expansion: 51.5625*(34/33) = 53.125 Gbps
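The same arithmetic can be checked with exact fractions. This is a small sketch with a hypothetical helper name; the 66/64 and 34/33 factors are the ones given above.

```python
from fractions import Fraction

ENCODING_66B = Fraction(66, 64)       # 64B/66B line coding
RS544_EXPANSION = Fraction(544, 528)  # reduces to 34/33


def line_rates(payload_gbps):
    """Return (rate after 64B/66B, rate after RS(544, 514) expansion) in Gbps."""
    encoded = Fraction(payload_gbps) * ENCODING_66B
    return encoded, encoded * RS544_EXPANSION


encoded, with_fec = line_rates(50)
print(float(encoded), float(with_fec))

# Per-lane view for a 25 Gbps payload.
lane_encoded, lane_with_fec = line_rates(25)
print(float(lane_encoded), float(lane_with_fec))
```

For a 25 Gbps payload the two results are 25.78125 Gbps and 26.5625 Gbps, which match the per-lane bit rates listed for 100GBASE-KR4 and 100GBASE-KP4.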
Intel^{®} Stratix^{®} 10 TX devices do not support the 100GBASE-KP4 physical medium dependent (PMD).
FEC Decoders
Decoder Type  Description 

Hard decision FEC  Makes exact decisions of 1s or 0s. Good gain versus complexity. Broadly used in most applications. 10GBASE-KR, KR4, and KP4 are all examples of hard decision FECs. Used in Intel^{®} Stratix^{®} 10 TX devices. 
Soft decision FEC  Makes decisions based on probabilities of a 1 or 0. Provides higher gain and allows you to get closer to the Shannon limit. Complex design used in higher end optical transport networking (OTN) systems, specifically coherent systems. Normally used in cellular communications using the Viterbi algorithm. 
Specifications
The IEEE 802.3ap specification defines an insertion loss and return loss of 25 dB at 5.15625 GHz. The 10^{-12} BER requirement is a system specification that is met with or without FEC.
The IEEE 802.3bj specification specifies 100GBASE-KR4 for 100 Gbps operation using NRZ over four differential pairs where the insertion loss does not exceed 35 dB at 12.9 GHz. 100GBASE-KR4 uses:
 The PCS defined in Clause 82
 The RS-FEC defined in Clause 91
 The PMA defined in Clause 83
 The PMD defined in Clause 93
IEEE 802.3bj also specifies 100GBASE-KP4 for 100 Gbps operation using PAM4 over four differential pairs where the insertion loss does not exceed 33 dB at 7 GHz. 100GBASE-KP4 uses:
 The PCS defined in Clause 82
 The RS-FEC defined in Clause 91
 The PMA and PMD defined in Clause 94
The CEI 56G long reach (LR) specification discusses multiple FECs, but the standard is KP4 FEC with PAM4. The 10^{-15} BER requirement is a system specification met with FEC.
Functions Within the RS-FEC Sublayer
Lane Block Synchronization
The lane block synchronization function uses the synchronization headers to obtain lock to the 66-bit blocks in each bit stream and outputs 66-bit blocks.
Alignment Lock and Deskew
Alignment marker lock identifies the PCS lane number received on a particular lane of the service interface. After alignment marker lock is achieved on all 20 lanes, all inter-lane skew is removed. The RS-FEC transmit function supports a maximum skew of 49 ns between PCS lanes, and a maximum skew variation of 400 ps.
Lane Reorder
The RS-FEC transmit function orders the PCS lanes according to the PCS lane number.
Alignment Marker Removal
64B/66B to 256B/257B Transcoder
If all four incoming blocks are data blocks:
 Remove the 2-bit headers of all four 66-bit data blocks.
 Add a header bit of 1 in front of the four 64-bit data payloads.
If there is at least one control block among the four incoming blocks:
 Remove the 2-bit headers of all four incoming 66-bit blocks.
 Add a header bit of 0 in front of the four payloads of the four blocks.
RS-FEC deletes the second 4-bit nibble in the block type field (BTF) of the first control block in a transcoded block. RS-FEC retains the first 4-bit nibble in the BTF of the first control block (indicating the type).
 Add the 4-bit header x1, x2, x3, or x4 following the overall header bit 0, where:
 x1 = Data block
 x2, x3, and x4 = Control block
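The bit accounting for both cases can be sketched as follows. This is an illustration under several assumptions that are not stated above: each block is a (sync header, 64-bit payload list) pair, the BTF occupies the first eight payload bits in list order, and each bit of the 4-bit header is 1 for a data block and 0 for a control block. The function name is hypothetical, and any further processing of the header bits is not modeled.

```python
from typing import List, Tuple

Block = Tuple[str, List[int]]  # (sync header "01" or "10", 64 payload bits)


def transcode_256b_257b(blocks: List[Block]) -> List[int]:
    """Combine four 66-bit blocks into one 257-bit transcoded block."""
    if len(blocks) != 4 or any(len(p) != 64 for _, p in blocks):
        raise ValueError("expected four blocks with 64-bit payloads")
    is_data = [sync == "01" for sync, _ in blocks]

    if all(is_data):
        out = [1]                                    # overall header bit
        for _, payload in blocks:
            out += payload                           # 4 x 64 payload bits
        return out

    out = [0] + [1 if d else 0 for d in is_data]     # header bit + 4-bit header
    first_control = is_data.index(False)
    for index, (_, payload) in enumerate(blocks):
        if index == first_control:
            out += payload[:4] + payload[8:]         # drop second BTF nibble
        else:
            out += payload
    return out


data = ("01", [0] * 64)
control = ("10", [1] * 64)
all_data = transcode_256b_257b([data, data, data, data])
mixed = transcode_256b_257b([data, control, control, control])
print(len(all_data), all_data[0])
print(len(mixed), mixed[:5])
```

Both cases produce exactly 257 bits: the four bits removed from the first control block make room for the 4-bit header.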
FEC Implementation Using the E-Tile Channel Placement Tool
The E-Tile Channel Placement Tool allows you to swiftly plan protocol placements in the product prior to reading comprehensive documentation and implementing designs in the Intel^{®} Quartus^{®} Prime software.
The Excel-based E-Tile Channel Placement Tool, supplemented with Instruction, Legend, Revision and Protocols tabs, is self-sustaining, and available for download.
 100GbE EHIP_CORE (25G * 4) MAC + PCS with RS (528, 514)
 100GbE EHIP_CORE (50G * 2) MAC + PCS with RS (544, 514)
FEC in Practical Application
Datacenter Applications Scenario
Consider a typical datacenter topology.
The interconnection between the spine switches and the leaf switches is a 10G/40G/100G backplane.
25GbE is a proposed standard for Ethernet connectivity in a datacenter application space, and takes advantage of the technology defined for 100GbE as four 25 Gbps lanes running on four fibers or copper pairs.
Hardware Results
Test Design
FEC performance in the Intel^{®} Stratix^{®} 10 device was measured using a 25GbE design running RS (528, 514) FEC.
Test Setup
The test configuration included:
 An Intel^{®} Stratix^{®} 10 TX signal integrity development kit board using the E-Tile device
 FCI backplane (Megtron 6 material)
 Variable ISI box
The FCI backplane is connected to the E-Tile device on one lane, starting with 28 dB loss (error free even without FEC). Attenuation is increased on only one channel using the variable ISI box. This provides fine control over the insertion loss.
Insertion Loss Plots
FEC Statistics Tool
Hardware Data
IL1 = Insertion loss point where corrected code words increase.
IL2 = Insertion loss point where uncorrected code words increase.
Total IL (dB) ^{1}  Number of Corrected Bits  Number of Corrected Symbols  Number of Corrected Codewords  Number of Uncorrected Codewords  Pre-FEC BER ^{2}  Estimated Post-FEC BER ^{2} 

38.7  0  0  0  0  0  0 
44.3  2  2  2  0  6.48E-14  0 
44.4  36  36  36  0  1.17E-12  0 
44.8  91  91  91  0  2.95E-12  0 
45.2  116  116  116  0  3.76E-12  0 
45.8  418  418  418  0  1.35E-11  0 
46.2  1131  1130  1130  0  3.66E-11  0 
46.6  2469  2469  2469  0  7.99E-11  0 
47  17984  17978  17978  0  5.83E-10  0 
47.4  220808  220580  220519  0  7.13E-09  0 
47.8  901459  899306  898544  0  2.91E-08  0 
48.2  2567073  2557116  2551742  0  8.31E-08  0 
48.6  6665926  6628124  6593734  0  2.15E-07  0 
49  31511961  31252439  30527903  0  1.02E-06  0 
49.2  113176637  111898208  103314714  0  3.66E-06  0 
49.4  194993850  192151109  167763244  1  6.32E-06  2.72E-13 
49.6  457728720  448886002  331518303  374  1.48E-05  1.02E-10 
49.8  928669603  904545799  510475435  50083  3.00E-05  1.36E-08 
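One way to sanity-check the table is to treat the pre-FEC BER as the number of corrected bits divided by the number of received bits, and to back out the implied bit count for a few rows. That relationship is an assumption made for this sketch, not a formula given in this document, and the BER column is read with negative exponents. The function name is hypothetical.

```python
# (total IL in dB, corrected bits, pre-FEC BER) taken from selected table rows
ROWS = [
    (44.3, 2, 6.48e-14),
    (46.2, 1131, 3.66e-11),
    (47.8, 901459, 2.91e-08),
    (49.8, 928669603, 3.00e-05),
]


def implied_received_bits(corrected_bits: int, pre_fec_ber: float) -> float:
    """Bits received if pre-FEC BER = corrected bits / received bits (assumption)."""
    return corrected_bits / pre_fec_ber


implied = [implied_received_bits(bits, ber) for _, bits, ber in ROWS]
mean_bits = sum(implied) / len(implied)
for (loss, _, _), bits in zip(ROWS, implied):
    print(f"{loss} dB: {bits:.3e} bits ({bits / mean_bits - 1:+.2%} vs mean)")
```

If the assumption holds, the rows should imply roughly the same number of received bits, which would indicate that each measurement point covered a similar amount of traffic.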
Comparison to the Specification
Specification (802.3bj)  Hardware Measurement 

4.9 to 5.3 dB  5.1 dB ^{3} 
Note the following:
 Post FEC BER is an estimate from uncorrectable code words.
 Received bits at the PRBS are normalized to account for PRBS payload + MAC padding (preamble, start codeword delimiter, and so on).
 Total IL = SI development kit loss + backplane loss + cable loss + variable ISI box loss.
 Total IL is a first order loss calculated by summing all the individual losses.
These hardware results demonstrate that the Intel FEC solution complies with the specification, making it a compelling solution for your Ethernet, CPRI, or Fibre Channel designs.
References
For more information about forward error correction, refer to the following resources:
 J. Schrum, YouTube video : "Error Detection and Correction 3: Forward Error Correction," 2016. [Online].
 A. Davis, EE Times : "Design HowTo Forward Error Correction," 1998. [Online].
 N. R. Wagner, "The Laws of Cryptography: The Hamming Code for Error Correction," [Online].
 Optical Transport Network (OTN) Tutorial. ITU. [Online].
Document Revision History for AN 846: Intel Stratix 10 Forward Error Correction
Document Version  Changes 

2018.07.02  Initial release. 