Over the past decade, thermal design for cooling microprocessor packages has
become increasingly challenging, as silicon technology has continued to scale
in accordance with Moore’s Law. Figure 1 shows the 2004 update of the
International Technology Roadmap for Semiconductors (ITRS).

Figure 1: ITRS roadmap(s) and CPU historical data for high-performance computers.
The roadmap projects that Thermal Design Power (TDP) rises roughly linearly
until about 2009-2010 and remains approximately constant afterwards. However,
these data do not show whether new cooling technologies are needed for future
packages. Due to die shrinkage and other complexities of microprocessor
design, local power densities may increase, leading to highly non-uniform heat
generation and localized hotspots. Figure 2, taken from Watwe and Viswanath
[1], shows a typical power map from a chip. The cell area in Figure 2 is
1×1 mm. The package cooling solution must keep the junction temperature of
the processor (die temperature) within the 90-110ºC range, especially at the
hotspots, in order to ensure device performance and reliability.

Figure 2: Example of non-uniform power distribution
The majority of Original Equipment Manufacturers (OEMs) within the
microelectronics industry would like to further extend the application of
air-cooling technologies. However, it was already shown in [2] that the current
air cooling technologies present diminishing returns. Therefore, it is
strategically important for the microelectronics industry to establish the
research and development focus for future non-air-cooling technologies. For a
better understanding of the cooling capability for different thermal solutions
used in CPU cooling, we use the concept of Density Factor (DF) proposed for the
package performance by Torresola et al. [3]. This metric can be used to
quantify the impact of non-uniform die heating on thermal management for a
specific package. The advantage of using this metric is its ability to provide
a better comparison of the impact of different power maps and die sizes on a
specific package-based thermal management technology. Figure 3 shows the
location of the temperature measurement for junction and case (lidded-type
packages).

Figure 3: Lidded package and heat sink
Based on Figure 3 temperatures, the junction-to-case DF for a package is
defined as:
$$\mathrm{DF} = \frac{\Psi_{jc}}{R_{package}} \qquad (2)$$
where Ψjc (ºC/W) is the thermal resistance from junction to case and
Rpackage (ºC-cm²/W) is the thermal resistance normalized by die area when the
die is uniformly powered. Rpackage is given by:
$$R_{package} = R_{si} + R_{TIM1} + R_{spreader} \qquad (3)$$
where Rpackage, Rsi, RTIM1, and Rspreader are the thermal impedance of
package, silicon, first-level Thermal Interface Material (TIM) and heat
spreader, respectively.
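As a rough illustration of how the density factor behaves, the sketch below assumes DF is the ratio of Ψjc to Rpackage and that Rpackage is the series sum of its three components. All numerical values (the component resistances, the 1 cm² die area, and the hotspot Ψjc) are hypothetical and chosen only for illustration. Under these assumptions a uniformly powered die gives DF = 1/(die area), and any hotspot that raises the peak junction temperature pushes DF above that value.

```python
def package_resistance(r_si, r_tim1, r_spreader):
    """Area-normalized package resistance (C*cm^2/W) as a series sum."""
    return r_si + r_tim1 + r_spreader


def density_factor(psi_jc, r_package):
    """Junction-to-case density factor in 1/cm^2 (psi_jc in C/W)."""
    if r_package <= 0:
        raise ValueError("r_package must be positive")
    return psi_jc / r_package


# Hypothetical inputs, not taken from the paper.
R_PACKAGE = package_resistance(r_si=0.05, r_tim1=0.10, r_spreader=0.15)
DIE_AREA_CM2 = 1.0

# Uniform heating: psi_jc = R_package / die area, so DF = 1 / die area.
psi_uniform = R_PACKAGE / DIE_AREA_CM2
df_uniform = density_factor(psi_uniform, R_PACKAGE)

# Non-uniform heating: a hotspot raises the peak junction temperature,
# and therefore psi_jc, for the same total power (hypothetical value).
psi_hotspot = 0.45  # C/W
df_hotspot = density_factor(psi_hotspot, R_PACKAGE)

print(f"R_package    = {R_PACKAGE:.2f} C*cm^2/W")
print(f"DF (uniform) = {df_uniform:.2f} 1/cm^2")
print(f"DF (hotspot) = {df_hotspot:.2f} 1/cm^2")
```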
Another important metric used for the cooling technology comparisons is the
sink-to-ambient resistance:
$$\Psi_{sink} = \frac{T_s - T_a}{P} \qquad (4)$$
where Ts is the maximum sink temperature, Ta is the ambient temperature, and P
is the CPU power dissipation. The total thermal resistance is given by:
$$\Psi_{tot} = \mathrm{DF} \cdot R_{package} + \Psi_{TIM2} + \Psi_{sink} \qquad (5)$$
where ΨTIM2 is the thermal resistance of second-level TIM. The junction
temperature (Tj) is given by:
$$T_j = T_a + P \left( \mathrm{DF} \cdot R_{package} + \Psi_{TIM2} + \Psi_{sink} \right) \qquad (6)$$
Note that P and DF depend on the electrical design of microprocessors. To stay
on the course of Moore’s Law, they are expected to increase for
next-generation microprocessors. Equation 6 contains all the relevant terms in
guiding thermal technology development. Assuming that Tj for next-generation
microprocessors has to be the same as current technology, then the thermal
resistance of various components has to decrease. ΨTIM2 is typically a small
portion of the total thermal resistance. Therefore, thermal designers are left
with two choices: improve Rpackage and Ψsink. Since Rpackage is multiplied by
DF, which is expected to be greater than 1, a reduction in Rpackage leads to a
greater reduction in Ψtot. Because of this, the industry is putting a great
amount of effort into reducing Rpackage. Ψsink is also important; however, the
choices are limited here because the heat has to be dumped into air and is
therefore limited by the low thermal conductivity of air. Consequently, to
reduce Ψsink, the only choice is to increase the volume of the heat sink after
exhausting all the other optimization techniques such as heat pipe heat sinks
and an increase in the airflow rate. Increasing the airflow rate results in
higher levels of noise. The volume of the heat sink is subject to space
constraints and can only be increased by using a remote heat exchanger. (In
this paper, a remote heat exchanger means that the heat sink is not directly
attached to the top of the package, but is installed somewhere else in the
chassis.) Because of all these factors, it is clear that if both P and DF
increase in the future, thermal technology development needs to focus on a)
reducing Rpackage and b) the use of remote heat exchangers.
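The budget argument above can be sketched numerically. The example assumes the junction temperature has the form Tj = Ta + P(DF·Rpackage + ΨTIM2 + Ψsink), which is consistent with the statement that Rpackage is multiplied by DF. Every number below (ambient temperature, power, DF, and the resistances) is a hypothetical placeholder, and the comparison simply applies the same numerical reduction to Rpackage and to Ψsink.

```python
def junction_temperature(t_ambient, power, df, r_package, psi_tim2, psi_sink):
    """Tj (C) from ambient temperature, power, and the resistance stack."""
    psi_total = df * r_package + psi_tim2 + psi_sink
    return t_ambient + power * psi_total


# Hypothetical baseline, for illustration only.
BASE = dict(t_ambient=35.0, power=100.0, df=1.5,
            r_package=0.30, psi_tim2=0.03, psi_sink=0.20)

tj_base = junction_temperature(**BASE)

# Apply the same numerical reduction (0.05) to each term in turn.
tj_better_package = junction_temperature(
    **{**BASE, "r_package": BASE["r_package"] - 0.05})
tj_better_sink = junction_temperature(
    **{**BASE, "psi_sink": BASE["psi_sink"] - 0.05})

gain_package = tj_base - tj_better_package
gain_sink = tj_base - tj_better_sink

print(f"Baseline Tj:           {tj_base:.1f} C")
print(f"Gain from R_package:   {gain_package:.1f} C")
print(f"Gain from psi_sink:    {gain_sink:.1f} C")
```

Because the Rpackage term is scaled by DF, its reduction is worth DF times as much as the same reduction in Ψsink whenever DF exceeds 1.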
In this paper, we focus, among various key and promising strategies, on
reducing Rpackage.
Rpackage consists of three components: Rsi, RTIM1, and
Rspreader. Rsi cannot be changed because the thermal conductivity of silicon
is fixed. RTIM1 can be reduced by optimizing the TIM through the use of micro
and nano particles. We focus on polymer-based TIMs in our discussion.
Rspreader can also be reduced, for example
by using a microchannel liquid cooler. Liquid cooling
technology will also enable the use of a remote heat exchanger and can possibly
increase its efficiency. Since thermal design is based on the maximum Tj on the
die, another strategy that could be followed is to locally cool the die at the
hotspots by using Thin Film Thermoelectric Cooler (TFTEC) made of thin film
super-lattice or nanocomposites. Ψjc in Equation (2) is given by:
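$$\Psi_{jc} = \frac{T_j - T_c}{P} \qquad (7)$$
where Tj is the maximum junction temperature and Tc is the case temperature
(see Figure 3).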
By locally cooling the hotspots using TFTEC, the net effect is a decrease in
Ψjc leading to a decrease in effective DF, as seen from Equation 2. The use of
TFTEC will, however, lead to an increased burden on the other cooling
components, due to the electrical power that is put into the TEC to achieve the
desired cooling effect. Since TFTEC is used only to cool a few localized
hotspots, not the whole chip, the increase in the total power to be dissipated
by the other components is expected to be low. Figure 4 conceptually shows the
idea of the use of various nano and micro technologies for cooling the next
generation of microprocessors.
We discuss these three technologies in detail in this paper. Our primary focus
will be on the technological merits of each technology and also on the
fundamental and practical challenges that must be met to enable these
technologies. The rest of this paper is divided into three sections:
particle-laden thermal interface materials, microchannel cooling, and TFTEC. In
the final section, we discuss the challenges that must be solved to enable
these technologies.

Figure 4: Schematic of futuristic thermal solutions
Nano and Micro Particle-laden TIM
Figure 4 shows the schematic of a particle-laden TIM (P-L TIM). Particles are
added to enhance the thermal conductivity of the TIMs. Current commercial TIMs
utilize micron-sized particles. Due to the advent of nanotechnology, particles
of virtually any size can be made. The question that needs to be answered is
whether nano-sized particles will lead to any benefits. The thermal resistance
of a P-L TIM can be written as:
$$R_{TIM} = \frac{\mathrm{BLT}}{k_{TIM}} + R_{c1} + R_{c2} \qquad (8)$$
where Rc1 and Rc2 represent the contact resistances of the TIM with the two
bounding surfaces. RTIM depends on these contact resistances and on both kTIM
and Bond Line Thickness of TIMs (BLT). Both BLT and kTIM are dependent on the
particle volume fraction and size.
Prasher et al. [4,5] showed that kTIM can be accurately captured by the
Bruggeman Asymmetric Model (BAM), which is given by the following equation:
$$\frac{k_{TIM}}{k_m} = \frac{1}{(1-f)^{3(1-\alpha)/(1+2\alpha)}} \qquad (9)$$
where f is the volume fraction of the fillers, km is the thermal conductivity
of the matrix and α = 2Rbkm/d,
where Rb is the interface resistance between the fillers and the matrix.
Equation 9 assumes that thermal conductivity of the fillers (kp) is much higher
than km. Figure 5 shows the comparison of Equation 9 with data on various TIMs.
Some of the other data in Figure 5 are from references [6] and [7].

Figure 5: Thermal conductivity of TIMs with respect to volume fraction of fillers
The importance of Rb between the fillers and the matrix is shown in Figure 5.
Equation 9 shows that at α = 1, kTIM = km, and for α > 1, kTIM < km in
spite of the fact that kp >> km. α can be large either for a smaller
particle size or a large value of Rb. For nanoparticles, α can be very large,
unless a substantial reduction in Rb is achieved. Rb at the interface between
the particle and the matrix could arise due to two factors: 1) phonon acoustic
mismatch (inherent property of interface of dissimilar material), and 2)
incomplete wetting of the interface by the polymer. Rb due to phonon acoustic
mismatch is of the order of 10⁻⁸ K·m²/W at room temperature [8]. This means
that α = 0.0004 due to phonon acoustic mismatch for d = 10 μm and km = 0.2
W/m-K. Therefore, for micron-sized particles phonon acoustic mismatch can be
ignored in comparison to incomplete wetting; for nanoparticles, however,
phonon acoustic mismatch could become important. Putnam et al. [9] measured Rb
between polymer and alumina in the range 2.5×10⁻⁸ to 5×10⁻⁸ °C-m²/W. This
means that the critical particle size (α = 1) below which the thermal
conductivity of the nanocomposite is less than the conductivity of the matrix
varies between 10 nm and 20 nm. This is why carbon nanotube-based composites
have not been able to achieve high k. Therefore, using only nanoparticles in
the TIM can lead to a decrease in kTIM as compared to micron-sized particles.
However, a mixture of micro and nano sized particles can possibly enhance the
conductivity by providing a percolating chain between the larger particles.
Nano-sized particles can also possibly reduce the BLT as compared to
micro-sized particles.
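The numbers quoted above can be checked with a short sketch. It assumes the high-conductivity-filler limit of the Bruggeman asymmetric model, kTIM/km = (1 − f)^(−3(1 − α)/(1 + 2α)), together with α = 2Rbkm/d as defined in the text. The matrix conductivity of 0.2 W/m-K is the value used in the text's own example; the 50% filler fraction and the 5 nm particle size are hypothetical.

```python
def alpha(r_b, k_m, d):
    """Dimensionless interface parameter: alpha = 2 * Rb * km / d (SI units)."""
    return 2.0 * r_b * k_m / d


def k_tim_ratio(f, a):
    """kTIM / km for filler volume fraction f, assuming kp >> km."""
    if not 0.0 <= f < 1.0:
        raise ValueError("volume fraction must satisfy 0 <= f < 1")
    exponent = 3.0 * (1.0 - a) / (1.0 + 2.0 * a)
    return (1.0 - f) ** (-exponent)


def critical_size(r_b, k_m):
    """Particle size d at which alpha = 1, i.e. kTIM = km."""
    return 2.0 * r_b * k_m


K_M = 0.2  # W/m-K, matrix conductivity used in the text's example

# Phonon acoustic mismatch only, 10 micron particles.
a_micro = alpha(r_b=1e-8, k_m=K_M, d=10e-6)

# Critical sizes for the measured Rb range of Putnam et al.
d_crit_low = critical_size(2.5e-8, K_M)
d_crit_high = critical_size(5e-8, K_M)

# Hypothetical 50% loading: micron-sized versus 5 nm particles.
ratio_micro = k_tim_ratio(0.5, a_micro)
ratio_nano = k_tim_ratio(0.5, alpha(r_b=5e-8, k_m=K_M, d=5e-9))

print(f"alpha (10 um particles): {a_micro:.4f}")
print(f"critical size: {d_crit_low * 1e9:.0f} to {d_crit_high * 1e9:.0f} nm")
print(f"kTIM/km at f = 0.5: micro {ratio_micro:.2f}, nano {ratio_nano:.2f}")
```

With these assumptions the 5 nm case has α > 1, so the predicted composite conductivity falls below that of the matrix even though the filler itself is highly conductive.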
Prasher [10] recently developed a model of BLT, which is given as
[Equation 10: BLT model; expression not recoverable from the source]
where τy is the yield stress of the polymer, d the diameter of the particles,
P the applied pressure, and r the radius of the substrate. Figure 6 shows the
comparison between Equation 10 and experimental data collected on various TIMs
including greases and Phase Change Materials (PCM) containing a different
volume fraction of particles.

Figure 6: Comparisons of scaling bulk model (Eq. 10) with experimental data for the
phase change material
Typically, kTIM is used as the metric to compare various TIMs. However, thermal
resistance at a given pressure is the better metric for comparing one TIM
formulation with another, because a higher kTIM does not necessarily translate
into a lower TIM resistance. Since both τy and k of TIMs depend on the volume
fraction of the fillers, a minimum in thermal resistance can be achieved with
respect to f. Therefore, future P-L TIMs should be designed around this
minimum.
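A minimal sketch of why conductivity alone can mislead, assuming the TIM resistance takes the series form BLT/kTIM + Rc1 + Rc2. The two formulations below are hypothetical: formulation B has more filler, hence a higher conductivity, but also a higher yield stress and therefore a thicker bond line at the same pressure. Contact resistances are set to zero for simplicity.

```python
def tim_resistance(blt_m, k_tim, r_c1=0.0, r_c2=0.0):
    """Area-normalized TIM resistance in K*m^2/W (BLT in m, k in W/m-K)."""
    if k_tim <= 0:
        raise ValueError("k_tim must be positive")
    return blt_m / k_tim + r_c1 + r_c2


# Hypothetical formulations, not measured data.
tim_a = {"k_tim": 3.0, "blt_m": 25e-6}   # lower filler loading
tim_b = {"k_tim": 4.0, "blt_m": 50e-6}   # higher filler loading

r_a = tim_resistance(tim_a["blt_m"], tim_a["k_tim"])
r_b = tim_resistance(tim_b["blt_m"], tim_b["k_tim"])

print(f"TIM A: k = {tim_a['k_tim']} W/m-K, R = {r_a * 1e6:.2f} mm^2*K/W")
print(f"TIM B: k = {tim_b['k_tim']} W/m-K, R = {r_b * 1e6:.2f} mm^2*K/W")
```

In this made-up case the formulation with the higher conductivity ends up with the higher resistance, which is the situation the resistance-based metric is meant to catch.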
Microchannel (MC)
Ever since Tuckerman and Pease [11] introduced the concept of microchannels,
there have been numerous experimental and theoretical studies performed in the
area of microchannels for heat transfer applications. Recently, the
semiconductor industry has started to seriously consider microchannel cooling
with liquid as the coolant [12, 13] due to the increase in total power
generated by the microprocessor and also due to the presence of multiple
hotspots [1]. A comprehensive review of microchannels using both single- and
two-phase cooling is provided by Sobhan and Garimella [14]. Table 1 gives an
overview of the difference between single-phase and two-phase cooling.
Table 1: Comparisons between single-phase and two-phase
microchannel cooling technologies
| | Single Phase | Two Phase |
| Flow rate | High (100-200 ml/min) | Low (10-30 ml/min) |
| Pressure drop | High (0.5-2 atm) | High (1-2 atm) |
| Thermal resistance | <0.1 °C-cm²/W possible | Less than single phase possible |
| Technological understanding | High | Low |
| Pump size | Small pumps possible | Smaller than single phase possible |
| Modeling capability | Existing | To be developed |
Before discussing the details of microchannel efforts at Intel Corporation,
we briefly describe the Test Vehicle (TV) used to characterize the performance
of the microchannels. Figure 7 shows the schematic of the TV. Figure 8 shows the
schematic of the heaters and the location of 20 integrated temperature sensors.
The TV also has a small hotspot heater. This TV was used to assess thermal
performance of single-phase and two-phase microchannels under uniform heating
and hotspot heating conditions.

Figure 7: Schematic of the microchannel test device
Figure 9 shows the Scanning Electron Microscopy (SEM) pictures of the
different microchannels considered in the experiment. Table 2 shows the
dimensions of the various microchannels. For the two-phase case, the experiment
was performed only on Microchannel 1.

Figure 8: Layout of the temperature sensors and the
hotspot heater on thermal test vehicle

Figure 9: SEM photographs of cross-sections of three different microchannels
Table 2: Geometric details of various microchannels
| | Microchannel 1 | Microchannel 2 | Microchannel 3 |
| No. of channels | 25 ea | 66 ea | 100 ea |
| Channel width (w) | 300 μm | 65 μm | 61 μm |
| Channel height (H) | 180 μm | 295 μm | 272 μm |
| Channel length (L) | 13 mm | 15 mm | 15 mm |
| Fin thickness (t) | 104 μm | 88 μm | 39 μm |
| Silicon thickness (H+B) | 350 μm | 550 μm |
| Inlet/outlet hole size (g) | 1 mm |
| In/out plenum size (R) | 4 mm |
| Flow width (D) | 10 mm |
| Cold plate width (D’) | 16 mm |
| Cold plate length (L’) | 25 mm | 27 mm |
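The Table 2 geometry can be sanity-checked with a short sketch. It uses the standard hydraulic diameter of a rectangular duct, Dh = 2wH/(w + H), and estimates the width occupied by the channel array as the channel count times the pitch (w + t). Counting exactly one fin per channel is a simplifying assumption; the result should land close to the 10 mm flow width listed in the table.

```python
# Geometry from Table 2: (channel count, width, height, fin thickness), in um.
CHANNELS = {
    "Microchannel 1": (25, 300.0, 180.0, 104.0),
    "Microchannel 2": (66, 65.0, 295.0, 88.0),
    "Microchannel 3": (100, 61.0, 272.0, 39.0),
}
FLOW_WIDTH_MM = 10.0  # flow width (D) from Table 2


def hydraulic_diameter_um(width_um, height_um):
    """Hydraulic diameter of a rectangular channel: 4 * area / perimeter."""
    return 2.0 * width_um * height_um / (width_um + height_um)


def occupied_width_mm(count, width_um, fin_um):
    """Approximate array width, assuming one fin per channel."""
    return count * (width_um + fin_um) / 1000.0


results = {}
for name, (count, w, h, t) in CHANNELS.items():
    d_h = hydraulic_diameter_um(w, h)
    width = occupied_width_mm(count, w, t)
    results[name] = (d_h, width)
    print(f"{name}: Dh = {d_h:.1f} um, array width = {width:.2f} mm "
          f"(flow width {FLOW_WIDTH_MM:.0f} mm)")
```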
One of the biggest fundamental challenges of two-phase cooling is the large
temperature and pressure oscillations. Figure 10 shows data on the impact
of hotspots on the temperature oscillations in a water-cooled two-phase
microchannel. The flow rate in all the tests was kept constant at 2.5 ml/min.
Figure 10 clearly shows that fluctuations in the wall temperature increase with
increasing power to the hotspot. For the 0.6 W hotspot heating condition, the
worst-case fluctuations are on the order of 30 °C, whereas for the 0.4 W
hotspot heating condition, they are on the order of 20 °C. The worst-case
fluctuation for the uniform heating condition is on the
order of 15 °C. This figure shows that temperature fluctuations depend on the
power being dissipated from localized hotspots.

Figure 10: Fluctuation in the maximum temperature of the die under uniform heating and
hotspot heating conditions
These temperature fluctuations must be controlled before two-phase
microchannels can be considered a serious technology. Poor flow distribution in two-phase
microchannels might lead to less flow in the regions of hotspots, leading to
localized dry out on the hotspot which will result in a large and rapid
increase in the temperature of the hotspot. Temperature and pressure
fluctuations and poor flow distribution are the main fundamental challenges for
two-phase microchannels.
Table 3 shows the parameters used in the testing of single-phase microchannels
with water.
Table 3: Test conditions used for testing microchannel samples
| | Microchannel 1 | Microchannel 2 | Microchannel 3 |
| Fluid | Water |
| Flow rate (ml/min) | 159 | 110 | 98 |
| Main heater (W) | 70 | 70 | 70 |
| Hotspot heater (W) | 0, 0.5, 1, and 2 | 0, 0.5, and 1 | 0, 0.5, 1, and 2 |
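For a feel of the operating point in Table 3, the sketch below estimates the bulk coolant temperature rise from a simple energy balance, ΔT = P/(ρ·Q·cp). It is a rough estimate under stated assumptions: only the 70 W main heater is counted, all heat is assumed to enter the coolant, and the water properties (density 997 kg/m³, specific heat 4180 J/kg-K, roughly room-temperature values) are assumed rather than taken from the paper.

```python
WATER_DENSITY = 997.0   # kg/m^3, assumed (water near room temperature)
WATER_CP = 4180.0       # J/(kg*K), assumed
MAIN_HEATER_W = 70.0    # main heater power from Table 3

# Flow rates from Table 3, in ml/min.
FLOW_ML_PER_MIN = {
    "Microchannel 1": 159.0,
    "Microchannel 2": 110.0,
    "Microchannel 3": 98.0,
}


def coolant_temperature_rise(power_w, flow_ml_per_min,
                             density=WATER_DENSITY, cp=WATER_CP):
    """Inlet-to-outlet bulk temperature rise (K) from an energy balance."""
    mass_flow = density * flow_ml_per_min * 1e-6 / 60.0  # kg/s
    return power_w / (mass_flow * cp)


rise = {name: coolant_temperature_rise(MAIN_HEATER_W, q)
        for name, q in FLOW_ML_PER_MIN.items()}

for name, delta_t in rise.items():
    print(f"{name}: coolant temperature rise = {delta_t:.1f} K")
```

Under these assumptions the rise is on the order of 6 to 10 K, and it grows as the flow rate drops, which is one reason single-phase operation sits at the high flow rates listed in Table 1.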
Table 4 shows the comparisons between the experimental data and CFD modeling
using Icepak*. It can be seen that the CFD model matches very well with the experimental data for
both pressure drop and thermal resistance in uniform and non-uniform heating
conditions. This shows that the existing commercial CFD tools can be used to
design single-phase microchannels. Table 4 shows that a single-phase
microchannel is capable of cooling very high heat flux hotspots (highest
considered in this study is 1250 W/cm2) and is capable of achieving very small
thermal resistance.
The main challenge for single-phase microchannel cooling is that pure water
cannot be used as the coolant: it freezes at 0 °C, whereas the electronics
cooling industry requires coolants to remain liquid down to around -40 °C.
Table 5 compares water with a widely used antifreeze, a 50%-50% mixture of
Propylene Glycol (PG) and water, at the same thermal performance, which
means that both flow and convection resistance are the same. The microchannel
dimension for this study is 50 μm (D) × 300 μm (D) for water.
Table 4.1: Experimental and CFD results of Microchannel 1
Table 4.2: Experimental and CFD results of Microchannel 2
Table 4.3: Experimental and CFD results of Microchannel 3
Table 5 shows that the pressure drop of conventional antifreeze is very large.
This is due to low thermal conductivity and the high viscosity of the
antifreeze. The high pressure drop will lead to high forces on the bearings of
the pumps to be used to pump the liquids. Therefore, for single-phase
microchannel cooling, alternate antifreeze coolants are needed that have large
thermal conductivity and low viscosity.
Table 5: Comparison of pressure drops between PG 50% and water for the same
thermal resistance
| Liquid | Flow rate (ml/min) | Pressure drop (kPa) |
| Water | 200 | 80 |
| PG 50% | 220 | 900 |
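The Table 5 values can be translated into ideal hydraulic pumping power, which is the volumetric flow rate multiplied by the pressure drop. This is a lower bound on what a real pump would draw, since pump and motor efficiency are not included; the function name below is only illustrative.

```python
def hydraulic_power_w(flow_ml_per_min, pressure_drop_kpa):
    """Ideal pumping power (W) = volumetric flow (m^3/s) * pressure drop (Pa)."""
    flow_m3_per_s = flow_ml_per_min * 1e-6 / 60.0
    return flow_m3_per_s * pressure_drop_kpa * 1e3


# Values from Table 5.
power_water = hydraulic_power_w(flow_ml_per_min=200.0, pressure_drop_kpa=80.0)
power_pg50 = hydraulic_power_w(flow_ml_per_min=220.0, pressure_drop_kpa=900.0)

print(f"Water:  {power_water:.2f} W")
print(f"PG 50%: {power_pg50:.2f} W")
print(f"Ratio:  {power_pg50 / power_water:.1f}x")
```

For the same thermal performance, the antifreeze mixture needs roughly twelve times the ideal pumping power of water, on top of the higher bearing loads discussed in the text.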
Another big challenge for microchannel cooling technology is that the coolant
will also be used as the lubricant for pump bearings, because the pump has to
be hermetically sealed. From a lubrication point of view, a liquid with high
viscosity is preferable, whereas from a pressure drop point of view, a liquid
with low viscosity is desired. These are opposing requirements. Figure 11 shows
the thermal performance of the package-based microchannel cold plate as a
function of the pressure drop through the microchannels. It can be seen that,
in order to reduce the thermal resistance of the microchannels, a large
pressure drop will result. In turn, this large pressure drop across the device
will generate significantly large forces on the bearings, thus increasing the
wear and possibly reducing the lifetime of the pumps. In addition, the small
physical size of the pump shaft may impose significant additional challenges on
the bearing design. Finally, because the device must be completely sealed and
maintenance-free, the coolant must also serve as the lubricant, and good
coolant and good lubricant properties usually conflict in a single fluid.
Given these limitations, sleeve bearings may be the most advantageous for
future pumping devices used in package cooling.

Figure 11: Thermal resistance vs. pressure drop for fluids with different viscosity
Figure 12 shows the schematic of a sleeve bearing. It can be seen that the
sleeve bearing relies on maintaining a continuous film between the shaft and
the housing. In simplified terms, the fundamental requirement for two surfaces
to be lubricated is that the operating thickness of the lubricant film between
the surfaces must be greater than the roughness of the surfaces. Based on
the Sommerfeld number [15], the minimum film thickness can be found as a function
of rotational speed, radial loading, shaft diameter, length of the shaft in
bearing, and the eccentricity of the shaft. Due to the pressure drop across
microchannels, the radial forces could have high values, but they are usually
less than 10N. A dimensionless parameter (Λ) is then used to determine the
regime of lubrication:
$$\Lambda = \frac{h_{min}}{\sqrt{\sigma_{shaft}^2 + \sigma_{housing}^2}} \qquad (11)$$
where σshaft and σhousing are the root mean square roughness of the shaft and
the housing surfaces, respectively, and hmin is the minimum film thickness of
the lubricant. Hydrodynamic lubrication is typically considered to occur
when Λ > 5. This parameter is plotted in Figure 13 as a function of
RPM and radial loading (shaft OD = 3 mm; fluid viscosity of 1.5 cP; surface
finish assumed better than that of mirror surfaces).
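The lubrication check can be sketched as follows, assuming the film parameter takes the usual composite-roughness form Λ = hmin/√(σshaft² + σhousing²) and using the Λ > 5 threshold quoted in the text. The film thickness and roughness values are hypothetical and are not read from Figure 13.

```python
import math


def film_parameter(h_min, sigma_shaft, sigma_housing):
    """Ratio of minimum film thickness to composite RMS roughness."""
    composite = math.sqrt(sigma_shaft ** 2 + sigma_housing ** 2)
    if composite <= 0:
        raise ValueError("at least one roughness value must be positive")
    return h_min / composite


def is_hydrodynamic(lam, threshold=5.0):
    """True when the film parameter exceeds the hydrodynamic threshold."""
    return lam > threshold


# Hypothetical values in micrometres, for illustration only.
lam_thin = film_parameter(h_min=0.10, sigma_shaft=0.05, sigma_housing=0.05)
lam_thick = film_parameter(h_min=0.50, sigma_shaft=0.05, sigma_housing=0.05)

for label, lam in (("thin film", lam_thin), ("thick film", lam_thick)):
    regime = "hydrodynamic" if is_hydrodynamic(lam) else "not fully hydrodynamic"
    print(f"{label}: Lambda = {lam:.2f} ({regime})")
```

A low-viscosity coolant tends to produce the thin-film case, which is why using the coolant as the lubricant makes the Λ > 5 condition hard to meet.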

Figure 12: Simplified schematic of Sleeve (Journal) bearing

Figure 13: Lubrication regime for pumps using the coolant as lubricant
It can be seen that Λ is significantly smaller than 5 and therefore the
hydrodynamic lubrication film cannot be maintained under all conditions. This
may be a major issue for any cooling device to be used in future package
cooling. Although cooling solutions using pumps provide good package cooling,
the thermal solution providers should not overlook the bearing life to ensure
overall package reliability.