Intel's 45nm CMOS Technology
Managing Process Variation in Intel's 45nm CMOS Technology
CRITICAL SOURCES OF VARIATION IN THE 45NM GENERATION
45nm technology is subject to a number of variation effects that are well documented in the literature [9–63]. Examples include highly random effects (random dopant fluctuation (RDF) [9–17], and line-edge roughness (LER) and line-width roughness (LWR) [18–21]), variations in the gate dielectric (oxide thickness variations [22–26], fixed charge [27], and defects and traps [28–34]), patterning proximity effects (classical, and those associated with OPC [35]), variation associated with polish (shallow trench isolation (STI) [36, 40], gate [37–38], and interconnect [39, 42–44]), variation associated with strain (wafer-level biaxial [46–49, 57], high-stress capping layers [50–52], and embedded silicon-germanium (SiGe) [53–56]), and variation associated with implants and anneals (tool-based [58], pocket implants [59–60], rapid thermal anneal (RTA) [61], and poly grains [62–63]).
Random Dopant Fluctuation (RDF)
MOS threshold voltage variation due to random fluctuations in the number and location of dopant atoms is an increasingly significant effect in sub-micron CMOS technologies (see Figure 1 and [9–17]). As the number of dopant atoms in the channel decreases with scaled dimensions, the relative impact of fluctuations in that number increases. Figure 2 illustrates the decreasing average number of dopant atoms in the channel as a function of the technology node. Note the change from the 1µm technology node (with many thousands of dopant atoms in the channel) to the 32nm technology node (with fewer than 100 atoms in the channel).
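A rough way to see why a smaller dopant count means larger relative variation is to treat the count in the channel as Poisson distributed, so that its relative spread is 1/sqrt(N). This is a simplifying assumption made here for illustration, and the two counts below are hypothetical round numbers, not values read from Figure 2.

```python
import math

def relative_dopant_spread(mean_count: float) -> float:
    """Relative standard deviation (sigma_N / N) of a Poisson-distributed count."""
    if mean_count <= 0:
        raise ValueError("mean_count must be positive")
    return 1.0 / math.sqrt(mean_count)

# Hypothetical mean dopant counts, chosen only to contrast an old and a new node.
for label, count in [("older node", 5000), ("scaled node", 80)]:
    spread = relative_dopant_spread(count)
    print(f"{label:>11}: N = {count:5d}, sigma_N / N = {spread:.1%}")
```

Under this assumption the relative spread grows from about 1.4% at 5000 atoms to about 11% at 80 atoms.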

Figure 1: Random dopant fluctuations (RDF) are an important effect in sub-micron CMOS technologies
RDF is assumed to be the major contributor to device mismatch of identical adjacent devices and is frequently represented by Stolk's formulation (Equation 1)
(1)
illustrating that matching improves with decreases in channel doping (N) and gate oxide thickness (Tox), and it degrades when device area decreases [12].
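The dependence described in this sentence can be summarized as a proportionality. This is a sketch of the scaling only: the prefactor of physical constants is omitted, the fourth-root dependence on doping is the form usually quoted for Stolk's model, and W and L (effective channel width and length) are symbols introduced here to stand for the device area.

```latex
\sigma_{V_T} \;\propto\; T_{ox}\,\frac{N^{1/4}}{\sqrt{W L}}
```

Under this scaling, halving the device area at fixed N and Tox raises σVT by a factor of √2, or about 1.41.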

Figure 2: Average number of dopant atoms in the channel as a function of technology node
While Equation (1) assumes that the only contribution to random variation between two adjacent matched devices is random dopant fluctuation, in practice additional effects are known to contribute to the measured variation [14]. Identifying the magnitude and root causes of these additional effects is important in facilitating the development of mitigation techniques. Many groups have attempted to estimate the size of these additional effects by comparing measured data to simulation [15–16]. As an example, we reported the results of such a study [17], in which we compared simulation results to 65nm silicon data and showed that simulated RDF is ~65% of the total NMOS σVT. Similar results were obtained when we compared 45nm simulation results to data: the simulated RDF is ~60% of the total PMOS σVT.
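To get a feel for how large the remaining contributions are, one can assume that RDF and the other sources are statistically independent, so that their variances add. That assumption is made here for illustration; the article does not state how the residual was computed.

```python
import math

def residual_fraction(rdf_fraction: float) -> float:
    """Fraction of total sigma_VT left for non-RDF sources, assuming
    independent contributions that add in quadrature."""
    if not 0.0 <= rdf_fraction <= 1.0:
        raise ValueError("rdf_fraction must lie in [0, 1]")
    return math.sqrt(1.0 - rdf_fraction ** 2)

# RDF fractions quoted in the text for the two comparisons.
for label, frac in [("65nm NMOS", 0.65), ("45nm PMOS", 0.60)]:
    other = residual_fraction(frac)
    print(f"{label}: RDF = {frac:.0%} of sigma_VT, other sources = {other:.0%}")
```

Under the independence assumption the other sources come to roughly 76% and 80% of the total σVT, which is comparable in size to the RDF term itself.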
Line-edge and Line-width Roughness (LER and LWR)
While random fluctuations in patterned lines occur in both the front-end and the back-end of the process, the primary concern with LER/LWR is variations in poly-gate patterning (see Figures 3 and 4). For poly-gate patterning, LER and LWR are associated with increases in the sub-threshold current [18, 19] as well as degradation in the threshold voltage (VT) characteristics [20, 21].

Figure 3: LER/LWR definitions [19]
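The relationship between the two quantities can be sketched numerically. The example below assumes the common 3-sigma conventions (LER as three standard deviations of one edge position, LWR as three standard deviations of the local line width), which may differ in detail from the definitions in Figure 3. It also assumes the two edges are uncorrelated and have no correlation along the line, which real resist edges do not satisfy exactly. All numeric values are hypothetical.

```python
import random
import statistics

def simulate_line(n_points, sigma_edge_nm, nominal_cd_nm, seed=0):
    """Return left and right edge positions (nm) sampled along a line."""
    rng = random.Random(seed)
    left = [rng.gauss(0.0, sigma_edge_nm) for _ in range(n_points)]
    right = [nominal_cd_nm + rng.gauss(0.0, sigma_edge_nm) for _ in range(n_points)]
    return left, right

def ler_3sigma(edge):
    """Line-edge roughness: 3 sigma of a single edge position."""
    return 3.0 * statistics.pstdev(edge)

def lwr_3sigma(left, right):
    """Line-width roughness: 3 sigma of the local width (right - left)."""
    widths = [r - l for l, r in zip(left, right)]
    return 3.0 * statistics.pstdev(widths)

left, right = simulate_line(20000, sigma_edge_nm=2.0, nominal_cd_nm=80.0)
ler = ler_3sigma(left)
lwr = lwr_3sigma(left, right)
print(f"LER = {ler:.2f} nm, LWR = {lwr:.2f} nm, LWR/LER = {lwr / ler:.2f}")
```

With uncorrelated edges the ratio LWR/LER comes out close to √2, because the variances of the two edges add in the width.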
Diaz et al. [18] quantified the impact of LER on transistor performance by comparing devices from a 130nm technology (80nm nominal gate lengths and 17Å oxide) that were patterned with either a 193nm binary solution (9.3nm LER) or a 248nm alternating phase shift mask (APSM) solution (6.5nm LER). Reducing LER from 9.3nm to 6.5nm translated into a measured improvement of 1.5X for a nominal device and 2X for the subnominal 70nm device.
In a similar experiment, Kim et al. [19] evaluated the impact of LER and LWR on device performance using a set of 80nm-node single nMOS transistors from low-power SRAM devices fabricated with various combinations of gate length, gate width, LWR, and LER. The amount of LER and LWR was controlled by applying different resist materials, defocus, and overetch time. Their experimental data showed that LER effects began to appear when the gate length was less than 85nm. They observed a four-orders-of-magnitude increase in the standard deviation of the subthreshold current for the smallest gate lengths in the study.
Fukutome et al. [20] were able to use scanning tunneling microscopy (STM) to directly assess the impact of LER on the carrier profiles of source-drain extensions in sub-micron MOSFETs. They observed that the roughness of extension edges induced by gate LER depended on the implanted dose, halos (pockets), and various co-implantations. They showed an improvement of 4nm in VT roll-off with a decrease in the average LER, and they confirmed that co-implants induced a degradation of 5mV in the standard deviation of VT.
Asenov et al. [21] studied the combined effect of LER and random discrete dopants on current fluctuations. They demonstrated that the two sources of fluctuation act in a statistically independent manner when both are included in the simulations. They also showed that the LER-induced current fluctuations have a much stronger channel length dependence and that, as devices are scaled to shorter dimensions, LER is expected to supplant RDF as the dominant variation source.

Figure 4: LER/LWR of poly gates has been modeled by a number of researchers [18–21]
Variations in the Gate Dielectric
The high-k metal-gate (HiK+MG) devices used in the 45nm generation are subject to a number of variation effects in the gate dielectric [22–34]. These include variations in oxide thickness, fixed charge, and interface traps. These physical changes in the dielectric result in parametric variations in drive current, gate tunneling current, or threshold voltage.
Oxide Thickness
Asenov et al. [22] have studied the intrinsic threshold voltage fluctuations introduced by atomic-scale roughness of the gate interfaces in deep submicrometer MOSFETs through carefully designed simulation experiments. Their simulations show that intrinsic threshold voltage fluctuations induced by local oxide thickness variations become comparable to the fluctuations introduced by RDF for conventional MOS devices with dimensions of 30nm and below.
Koh et al. [26] have evaluated gate-tunneling leakage current both experimentally and theoretically for MOSFETs with 1.2nm- to 2.8nm-thick conventional SiO2 gate oxides. They showed that the statistical distribution of gate-tunnel leakage current causes significant fluctuations in VT when the gate oxide tunnel resistance becomes comparable to the gate poly-Si resistance. They set the scaling limit (when using a low-resistance silicide gate with a conventional gate oxide) at a gate oxide thickness of 0.8nm.
Fixed Charge
The presence of fixed charge in the high-k layer can affect the mobility and the threshold voltage. As a consequence, variation in the fixed charge may affect the uniformity of the threshold voltages on devices. Kaushik et al. [27] have studied this effect and estimated the fixed charge in high-k dielectric films based on a slant-etched SiO2 layer that allows a thickness series on a single wafer.
Defects and Traps
Electron mobility degradation and VT instability due to fast transient charging (FTC) in electron traps are continuing concerns in high-k dielectrics.
Lucovsky [28] has extensively investigated defects in HfO2 gate dielectrics by combining spectroscopic measurements with electrical detection of defect states. Two types of defects have been proposed: those associated with grain boundaries in the nanocrystalline HfO2 and those associated with different charge states of the O-atom vacancy. Other researchers have reached similar conclusions [29–31].
Wen et al. [32] have investigated the effects of FTC by studying the impact of metal gate electrodes on mobility degradation. Their studies suggest that the increase of FTC in HfSix may be attributed to higher density of the O vacancies in the high-k dielectric caused by the HfSix-induced O scavenging process.
Optimization of HfO2 processing such as N incorporation [33] or use of HfSiOxNy [34] has also been shown to reduce the charge-trapping effects.
Patterning Proximity Effects
The general lithography expression for the minimum resolvable critical dimension (CD), assuming equal lines and spaces, is given in Equation (2),
$$CD = k_1 \frac{\lambda}{NA} \qquad (2)$$
where λ is the exposure wavelength, NA is the numerical aperture of the projection optics, and k1 is a measure of lithographic aggressiveness (smaller is more aggressive) that includes illumination conditions, resist materials/chemistry, OPC and other resolution enhancement techniques (RETs). The continued decrease in k1 for more advanced technologies is illustrated in Figure 5.
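As a worked illustration, take Equation (2) to be the usual Rayleigh form CD = k1 · λ / NA, with λ the exposure wavelength and NA the numerical aperture, and solve it for k1. The wavelength, aperture, and CD values below are hypothetical round numbers picked for the example, not Intel process parameters.

```python
def k1_factor(cd_nm: float, wavelength_nm: float, numerical_aperture: float) -> float:
    """Solve CD = k1 * wavelength / NA for k1."""
    if wavelength_nm <= 0 or numerical_aperture <= 0:
        raise ValueError("wavelength and NA must be positive")
    return cd_nm * numerical_aperture / wavelength_nm

# Hypothetical scanner settings: 193 nm exposure, NA = 0.93.
wavelength_nm = 193.0
numerical_aperture = 0.93
for cd_nm in (100.0, 80.0, 65.0):
    k1 = k1_factor(cd_nm, wavelength_nm, numerical_aperture)
    print(f"CD = {cd_nm:5.1f} nm -> k1 = {k1:.2f}")
```

Shrinking the CD at fixed wavelength and aperture lowers k1 in direct proportion, which is the trend that motivates the resolution enhancement techniques discussed here.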

Figure 5: Generational trend in k1
A variety of techniques can be applied to layers with low k1 to improve the lithographic patterning and reduce the variation. One of the most powerful of these is OPC [35].
OPC pre-distorts the mask data following specific algorithms in order to achieve a desired pattern on the wafer. OPC is based on a highly phenomenological process model that incorporates lumped optics, resist, wafer stack, and mask effects. This model generates a mask-to-wafer optical transfer function, and an OPC algorithm is written to invert the transfer function. An OPC recipe is developed using an iterative algorithm that modifies the starting database until the resulting mask pattern yields the desired pattern on the wafer. An example of the power of OPC is shown in Figure 6, which compares patterning with and without OPC applied.
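The idea of iteratively inverting a mask-to-wafer transfer function can be sketched in one dimension. The example below is a toy, not a production OPC flow: `toy_printed_cd` is a hypothetical stand-in for a calibrated process model, and a single scalar line width replaces the two-dimensional edge fragments a real tool would adjust.

```python
def toy_printed_cd(mask_cd_nm: float) -> float:
    """Hypothetical process model: the printed line is narrower than drawn."""
    return 0.85 * mask_cd_nm - 4.0

def correct_mask_cd(target_cd_nm, model, iterations=25, gain=1.0):
    """Adjust the mask dimension until the modeled wafer CD matches the target."""
    mask_cd = target_cd_nm  # start from the drawn (uncorrected) dimension
    for _ in range(iterations):
        error = target_cd_nm - model(mask_cd)  # shortfall on the wafer
        mask_cd += gain * error                # pre-distort the mask to compensate
    return mask_cd

target = 45.0
mask = correct_mask_cd(target, toy_printed_cd)
print(f"uncorrected wafer CD: {toy_printed_cd(target):.2f} nm")
print(f"corrected mask CD:    {mask:.2f} nm")
print(f"corrected wafer CD:   {toy_printed_cd(mask):.2f} nm")
```

The loop converges here because the toy model's slope is close to 1, so each pass removes most of the remaining error. A model with a very different slope would need a smaller gain.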

Figure 6: OPC pre-distorts the mask data in order to achieve a desired pattern on the wafer
Polish
Chemical mechanical polish (CMP) is a critical process step in advanced semiconductor technologies. In the front end, CMP has been used for polishing STI [36], and more recently for polishing gate-in, gate-last, metal gate processes [37]. In the back-end, CMP is used for polishing dielectrics in a conventional process and metals in a damascene process [37].
In a traditional STI process [36], shallow trenches are etched into silicon using a nitride hard mask followed by oxide deposition to fill the trenches. A CMP step removes the excess oxide on top of the nitride and partially polishes the nitride layer. The remaining nitride is stripped to expose the active regions where subsequent processing forms the transistors. Subsequent process steps (poly patterning, spacer, silicide formation, etc.) are sensitive to variations in the height of the oxide "steps" between the edge of the STI and diffusion produced by CMP variation.
In a high-k-first, gate-last process [37], SiO2 growth is replaced by high-k gate dielectric formation. After interlayer dielectric (ILD) deposition, a poly polish step exposes the poly gates, and a gate trench is formed by removal of the dummy poly. Workfunction and conduction metals are deposited in the gate trench and then planarized using a metal polish step. The gate-fill step is sensitive to variable gate heights produced by the poly-gate CMP variation, and subsequent process steps are sensitive to both height and recess variation from the combination of poly-gate and metal-gate CMP variation [38].
In the back-end [39], the traditional subtractive process uses a metal etch to pattern and remove titanium and aluminum. The subtractive metal process is followed by ILD deposition, a CMP planarization step, and a tungsten via fabrication step. In the subtractive process, the sensitivity is to ILD variation produced by the dielectric CMP planarization. Damascene copper reverses the process by etching troughs and vias into an insulator, depositing a copper diffusion barrier and copper into the troughs, and using CMP to remove excess copper and barrier material. An ILD is added after the Cu CMP. In the damascene process, the sensitivity is to Cu and ILD variation produced by the metal CMP planarization [39].
One commonly applied method for reducing variation in any CMP process is the addition of dummy features. Tian et al. [40] review some of the historical approaches to dummy-feature placement and modeling and present a time-dependent relation between post-CMP topography and layout pattern density for CMP in STI.
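The link between pattern density and dummy features can be illustrated with a very small grid model. This is a sketch under strong assumptions: the layout is a hypothetical 0/1 grid, density is evaluated in fixed non-overlapping windows, and dummy cells are dropped into any empty cell without the spacing and keep-out rules a real fill flow would enforce. It is not the model of Tian et al. [40].

```python
import math

def window_density(layout, row0, col0, size):
    """Fraction of non-empty cells in a size x size window."""
    cells = [layout[r][c]
             for r in range(row0, row0 + size)
             for c in range(col0, col0 + size)]
    return sum(1 for cell in cells if cell != 0) / len(cells)

def add_dummy_fill(layout, size, target_density):
    """Return a copy of layout with dummy cells (marked 2) added so that
    every window reaches at least target_density.
    Assumes the grid dimensions are multiples of size."""
    filled = [row[:] for row in layout]
    cells_per_window = size * size
    for row0 in range(0, len(filled), size):
        for col0 in range(0, len(filled[0]), size):
            occupied = round(window_density(filled, row0, col0, size) * cells_per_window)
            needed = math.ceil(target_density * cells_per_window) - occupied
            for r in range(row0, row0 + size):
                for c in range(col0, col0 + size):
                    if needed > 0 and filled[r][c] == 0:
                        filled[r][c] = 2  # 2 marks a dummy feature
                        needed -= 1
    return filled

# Hypothetical 4x4 layout: 1 = real feature, 0 = empty.
layout = [[1, 1, 0, 0],
          [1, 0, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 1, 1]]
filled = add_dummy_fill(layout, size=2, target_density=0.5)
for row in filled:
    print(row)
```

Evening out the window densities in this way is what reduces the density-driven thickness differences left by the polish step.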
In the back end, much recent literature has been devoted to the topic of modeling interconnect variation produced by CMP. For example, Yu et al. [41] characterize the smoothing and planarization effects of ILD polishing by a polynomial equation with a small number of fitted parameters. Choi et al. [42] combine a set of scripts and commercial tools to incorporate Cu-interconnect CMP effects in a full-chip static timing analysis. Soumyanath et al. [43] present a nonintrusive time-domain technique to characterize interconnect performance on a 0.25µm, 1.8V process. The technique is based on simple time-delay measurements from a repetitive waveform. Finally, Mehrotra et al. [44] analyze interconnect timing performance in a high-speed microprocessor by using timing analysis in conjunction with a post-extraction net adjustment.
Strain
Prior to the 130nm process generation, classic "Dennard" transistor scaling [45] was sufficient to support the 0.7X delay reduction per generation required by Moore's Law. For the 90nm generation and beyond, additional enhancements have been required. Primary among these enhancements is the use of strain.
During the 1980s, researchers began to explore channel strain approaches for transistor enhancement where thin Si layers were grown on relaxed SiGe substrates such that the thin Si layer would take the larger lattice constant of the SiGe and create biaxial tensile stress in the channel [46–49].
In the early 2000s, a new class of transistor strain approaches was developed that used process features external to the transistor (rather than strain in the channel itself as with the biaxial approaches) to strain the transistor. Among these approaches were high-stress capping layers [50–52] and the use of embedded SiGe in the PMOS source-drain regions [53–56].
Process strain creates a number of new variation challenges, both random and systematic. Researchers are beginning to focus both theoretically and experimentally on quantifying the magnitude of strain-induced variation. For example, Tsang et al. [57] developed an analytical model to predict threshold variation as a function of Ge fraction, layer thickness, channel length, and doping profile. This model was verified with simulations and experimental data for n- and p-MOSFETs in both single- and dual-channel architectures.
Implant and Anneal
In addition to the fundamental variation mechanism of random dopant fluctuation (discussed earlier), there are also a number of variation sources associated with the physical implant and anneal processes.
The implant tool conditions are a significant source of transistor variation. Al-Bayati et al. [58] have studied the device sensitivity of ultra-shallow junction processes to tool-related implant and annealing parameters. In their work, NMOS and PMOS devices were studied to quantify variation as a function of the accuracy of dose, purity of dose, spike anneal peak temperature, and the ramp-up and cool-down rates.
The architecture of the pocket (halo) and extension (tip) implants is also critical for variation management. Tanaka et al. [59–60] have investigated the statistical VT distribution for a variety of pocket (halo) implantation conditions through both experimental measurements and device simulation. They showed that the increase in VT asymmetry caused by the pocket profile increases the total fluctuation of VT by more than 15%.
The advent of advanced RTA processes has introduced new variation sources. Ahsan et al. [61] investigated the impact of RTA on process variation and noted that most of the observed variation can be accounted for by lamp-annealing-driven variations in Rext and VT. They also showed that the variation correlates with the calculated reflectivity for the lamp RTA spectrum and depends on the local, mm-scale pattern density.
An additional variation mechanism related to implant technology arises from the poly-crystalline nature of conventional gates. Enhanced diffusion, variations in dopant activation, and implant channeling along grain boundaries can all cause increased variation. Fukutome et al. [62] have investigated the effect of randomly oriented and rotated poly-Si gate grains on lateral carrier profiles of extension regions in sub-50nm MOSFETs by direct observations and electrical measurements. By optimizing the grain boundary they were able to demonstrate a 26% reduction in threshold voltage variation. Brown et al. [63] carried out a coherent 3-D statistical simulation study of the impact of poly-Si granularity on the variability in CMOS transistors and concluded that for realistically scaled bulk MOSFETs, the poly-Si and random-dopant-induced variations compete at 35nm and 25nm channel lengths. They further concluded that if LER does not scale as projected by the International Technology Roadmap for Semiconductors (ITRS), fluctuation due to poly-grain boundaries becomes the dominant source of variability for channel lengths below ~25nm.

Figure 7: Cell topology enhancements for mismatch improvement
