Modeling CMP for copper dual damascene interconnects
06/01/2000
overview
A theoretical model has been developed for chemical mechanical planarization of copper damascene based on the multistep, multislurry process platform. The equations for the dependence of step height, copper dishing, and dielectric erosion on the process parameters and device geometries have been mathematically derived, and pad conformality has been introduced into the model. The predictions correlate well with experimental observations, including adjustment of slurry selectivity to optimize copper planarity.
Lin Yang, Integrated Device Technology Inc., Hillsboro, Oregon
In the past several years, a worldwide effort has been made to develop copper damascene processes for making smaller, faster, and more sophisticated circuits [1-3]. The copper damascene process begins by opening trenches or canals in SiO2. Copper is then deposited, filling the trenches and covering the surface of the rest of the chip using electroplating (EP) technology. Chemical mechanical planarization (CMP) is used to polish away the excess copper, leaving only the inlaid interconnect lines. In dual damascene processing, holes or vias are also opened and filled with copper for interlevel connections. A major technical concern is the pronounced pattern dependence of process performance and the impact of the polish rate differences between copper, barrier, and insulator materials.
Several research efforts have been reported on modeling the CMP of interlevel dielectrics (ILD), shallow trench isolation (STI), and tungsten [4-7]. However, the CMP processes for copper damascene interconnects exhibit unique characteristics and more complicated polishing mechanisms than conventional ILD, STI, and tungsten CMP. The transition from tungsten or ILD CMP to copper CMP is not simply a change in materials.
Figure 1. Schematic illustration of the copper trench profile and parameters for the copper planarization model.
In copper damascene CMP, the starting point of the process is a topographical copper surface with various step heights, which include trench step height and overplated feature step height. Ideally, the CMP process should clear the excess copper and barrier materials from the ILD surface without losing the copper in the trenches. However, in practice, when the CMP process approaches the desired end point, the copper, barrier, and oxide features will co-exist on the surface with various area ratios. The distinct polish rates of these materials commonly result in considerable losses of copper and ILD in the damascene structures by the time the entire wafer is cleared.
A comprehensive model for CMP of copper dual damascene is useful in addressing these challenges, and such a model is presented here.
Model description
Historically, CMP has relied on single-slurry processes in ILD and tungsten applications, but none of the single-slurry processes developed so far for copper CMP has produced data convincing enough to secure an industry commitment.
On the other hand, multistep, multislurry CMP is becoming attractive because of the process flexibility, which facilitates process optimization regardless of equipment design rationale. To provide a comprehensive understanding of copper CMP processes, this modeling work is based on the multistep, multislurry process platform.
The whole CMP process is divided into three steps with three slurries. The first step is a fast copper polish, which is designed to efficiently remove the substantial portion of the excess copper and achieve a planarized surface. The second step is used to clear the remaining copper across the entire wafer with low pressure in order to minimize the losses of copper and ILD in the damascene structures. The third step is employed to clear the diffusion barrier with minimal impacts on the interconnect lines.
Copper planarization model
During copper damascene CMP, the planarization efficiency is determined by the removal rate difference between the high areas and the low areas. Figure 1 shows a trench step height model for copper CMP. According to the Preston equation [8], the CMP removal rate is directly proportional to both polish velocity and pressure. A planarization equation can be derived assuming both the upper and lower features on the chip experience the same velocity. The removal rates of low area and high area are given by
$$R_l = K_{Cu}\,\frac{E\,H_l}{H}\,V \qquad (1)$$
$$R_h = K_{Cu}\,\frac{E\,H_h}{H}\,V \qquad (2)$$
where Rl is the removal rate of the low area; Rh is the removal rate of the high area; KCu is the Preston coefficient for copper; E is the Young's elastic modulus of the pad; Hl is the compression distance of the pad on the low region under the given pressure; Hh is the compression distance of the pad on the high region under the given pressure; H is the pad thickness; and V is the linear velocity of the pad relative to the wafer. Assuming the pad is an elastic material, i.e., the pad expands instantaneously as it moves from a high area to a low area, the rate of step height (hS) reduction may be written as
$$\frac{dh_S}{dt} = -(R_h - R_l) = -K_{Cu}\,\frac{E\,V}{H}\,h_S \qquad (3)$$
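As a minimal numerical sketch of the Preston relations described above, the following assumes the pad acts as a linear spring, so that the local pressure is E times the compression distance divided by the pad thickness H, and that for an elastic pad the compression on the low area is smaller than on the high area by exactly the step height. All numerical values are illustrative assumptions, not the article's parameter table, and the function names are hypothetical.

```python
def removal_rate(k_preston, youngs_modulus, pad_thickness, compression, velocity):
    """Preston rate R = K * P * V, with local pressure P = E * compression / H."""
    pressure = youngs_modulus * compression / pad_thickness
    return k_preston * pressure * velocity


def step_height_rate(k_preston, youngs_modulus, pad_thickness, step_height, velocity):
    """Elastic pad: H_h - H_l = h_S, so dh_S/dt = -(R_h - R_l) = -K*E*V*h_S/H."""
    return -k_preston * youngs_modulus * velocity * step_height / pad_thickness


# Illustrative values only (assumptions, not taken from the article).
K_CU = 8.3e-13         # Preston coefficient for copper, 1/Pa
E_PAD = 1.0e8          # Young's modulus of the pad, Pa
H_PAD = 1.3e-3         # pad thickness, m
V = 0.5                # pad-to-wafer linear velocity, m/s

H_HIGH = 260e-9        # pad compression on the high area, m
STEP = 100e-9          # step height h_S, m
H_LOW = H_HIGH - STEP  # elastic pad: compression on the low area, m

r_high = removal_rate(K_CU, E_PAD, H_PAD, H_HIGH, V)
r_low = removal_rate(K_CU, E_PAD, H_PAD, H_LOW, V)
rate = step_height_rate(K_CU, E_PAD, H_PAD, STEP, V)

TO_NM_PER_MIN = 1e9 * 60.0
print(f"R_h     = {r_high * TO_NM_PER_MIN:.0f} nm/min")
print(f"R_l     = {r_low * TO_NM_PER_MIN:.0f} nm/min")
print(f"dh_S/dt = {rate * TO_NM_PER_MIN:.0f} nm/min")
```

Because the rate of step height reduction is proportional to the step height itself, this elastic-pad case decays exponentially in time with a time constant of H/(K E V).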
If the pad bending or conformation is taken into account, the rate of step height reduction can be described as
$$\frac{dh_S}{dt} = -K_{Cu}\,\frac{E\,V}{H}\,(h_S - \Delta H) \qquad (4)$$
where ΔH is the change in the pad compression distance due to pad conformation. ΔH is given by
$$\Delta H = H_0' - H_0 \qquad (5)$$
where H0 is the compression distance of a pad without conformation under the given pressure, and H0' is the effective compression distance of the pad taking into account pad conformation. ΔH can be positive, zero, or negative, depending on the trench width and pad properties.
The interactions between the polish pad and device features have a profound impact on CMP performance. The dependence of the CMP process on the feature size has been explained with a pad viscoelastic (time-dependent) deformation model [9]. However, the modeling results based on pad viscoelasticity show large discrepancies with experimental data, so the feature dependence of pad conformation is described by a different mathematical model here. With w defined as the trench width, several important feature widths can be defined, and the pad behavior may be classified into the following four categories:
- The pad has no contact with the lower feature (H0' = 0) when w<w0. (w0 is referred to as the effective minimum width.) At extremely small trench widths, the 3-D network of the pad can provide sufficient mechanical integrity for it to behave like a completely rigid material across the trench. w0 = 0.01 µm is used for all the modeling predictions in this article.
- The pad is partially compressible (H0' <H0) when w0<w<wl. (wl is referred to as the minimum width for free compression.) At some widened trench width, the pad may possibly contact the low feature. For a given trench width, the presence of the contact between the pad and the low area also depends on the depth of the trench.
- The pad is fully compressible and partially conformal (H0<H0' <H0 + hS) when wl<w<wm. (wm is referred to as the minimum width for free conformation.)
- The pad is fully compressible and completely conformal (H0' = H0 + hS) when w>wm.
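The four categories above amount to a lookup on trench width, which can be sketched as follows. The value of w0 is the effective minimum width quoted in the text, taken here in micrometres; the values used for wl and wm are purely hypothetical placeholders, since the article does not quote them. The text uses strict inequalities, so assigning the boundary widths to the wider regime is a choice made for this sketch.

```python
def pad_regime(w, w0, wl, wm):
    """Classify pad behaviour over a trench of width w (all widths in the same unit)."""
    if not (0 < w0 < wl < wm):
        raise ValueError("expected 0 < w0 < wl < wm")
    if w < w0:
        return "no contact with the low feature"
    if w < wl:
        return "partially compressible"
    if w < wm:
        return "fully compressible, partially conformal"
    return "fully compressible, completely conformal"


W0_UM = 0.01    # effective minimum width from the text, micrometres
WL_UM = 5.0     # hypothetical minimum width for free compression
WM_UM = 500.0   # hypothetical minimum width for free conformation

for width_um in (0.005, 1.0, 100.0, 2000.0):
    print(f"w = {width_um:8.3f} um: {pad_regime(width_um, W0_UM, WL_UM, WM_UM)}")
```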
A generic rule is that a pad becomes more conformal as the copper trench width increases, and thus results in less efficient planarization. The dependence of pad conformation on the feature size may be derived, assuming that the relative change of compression distance is directly proportional to the relative change of the feature size
$$\frac{dH_0'}{H_0} = x\,\frac{dw}{w} \qquad (6)$$
where w is the trench width, and x is a constant defined as the conformity of the pad, which is a pad material characteristic related to the mechanical stability of the pad. Under constant pressure P, the compression distance H0 is given by
$$H_0 = \frac{P\,H}{E} \qquad (7)$$
Solving Eqn. 6 with the boundary condition w = w0, H0' = 0 gives
$$H_0' = x\,\frac{P\,H}{E}\,\ln\frac{w}{w_0} \qquad (8)$$
When w0<w<wm, ΔH can be calculated by substituting Eqns. 7 and 8 in Eqn. 5
$$\Delta H = \frac{P\,H}{E}\left(x\,\ln\frac{w}{w_0} - 1\right) \qquad (9)$$
Thus, Eqn. 4 can be solved with the appropriate boundary conditions to give the copper planarization model:
$$h_S = \Delta H + (h_{S0} - \Delta H)\,\exp\!\left(-\frac{K_{Cu}\,E\,V}{H}\,t\right) \qquad (10)$$
where hS0 is the initial step height. When the trench width w becomes very large (>wm), ΔH will be equal to hS0, and Eqn. 10 becomes hS = hS0, which means that the polish rates of the high area and the low area are equal. The planarization efficiency is zero and the step height remains constant. On the other hand, though the smallest features in the device are usually larger than 0.1 µm in current copper CMP applications, a theoretical calculation can be made for the trench width w<w0 (0.01 µm). In this case, the pad has no contact with the low features, and the removal rate of the lower feature is zero. Thus, the step height reduction becomes directly proportional to the high feature removal rate and polish time. For w0<w<wm, step height predictions can be made using Eqn. 10 with the parameter values listed in the table.
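The behaviour described here can be explored numerically. The sketch below assumes a specific closed form that is consistent with the text but is an inference rather than a quotation: ΔH = (PH/E)(x ln(w/w0) - 1), and a step height that relaxes exponentially from hS0 toward ΔH with time constant H/(K E V). Capping ΔH at hS0 for very wide features and clipping the step height at zero are additions made for this sketch. All parameter values are illustrative assumptions, not the article's table.

```python
import math


def delta_h(w, *, w0, x, pressure, pad_thickness, youngs_modulus):
    """Assumed form: dH = (P*H/E) * (x*ln(w/w0) - 1), for w > w0."""
    if w <= w0:
        raise ValueError("this sketch only covers w > w0")
    h0 = pressure * pad_thickness / youngs_modulus
    return h0 * (x * math.log(w / w0) - 1.0)


def step_height(t, hs0, w, *, w0, x, pressure, pad_thickness,
                youngs_modulus, k_cu, velocity):
    """Step height after polishing for t seconds, starting from hs0 (metres)."""
    dh = delta_h(w, w0=w0, x=x, pressure=pressure,
                 pad_thickness=pad_thickness, youngs_modulus=youngs_modulus)
    dh = min(dh, hs0)  # fully conformal pad: dH = hS0, so no planarization
    tau = pad_thickness / (k_cu * youngs_modulus * velocity)
    hs = dh + (hs0 - dh) * math.exp(-t / tau)
    return max(hs, 0.0)  # a step cannot fall below zero


# Illustrative values only (assumptions): SI units throughout.
PARAMS = dict(w0=0.01e-6, x=0.15, pressure=2.0e4, pad_thickness=1.3e-3,
              youngs_modulus=1.0e8, k_cu=8.3e-13, velocity=0.5)

for width_um in (1.0, 100.0, 2000.0):
    h = step_height(60.0, 500e-9, width_um * 1e-6, **PARAMS)
    print(f"w = {width_um:7.1f} um -> h_S(60 s) = {h * 1e9:6.1f} nm")
```

With these assumed values, narrow trenches planarize almost completely within a minute, while wide features retain a residual step because the pad conforms to them.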
Copper-dishing model
Two CMP phenomena, copper dishing and ILD erosion, lead to device feature deviations from the idealized damascene structures. Copper dishing is defined as the difference in elevation between the insulator region and the metal line, as depicted in Fig. 2. ILD erosion is a loss of the dielectric (usually SiO2) within the arrays. Field ILD loss is a global thinning of the ILD layer because the polish rate of ILD is nonzero during the overpolish step. Since the polish rate is usually much higher in the copper trenches than in the surrounding barrier region, copper dishing occurs as soon as the barrier layer over the insulator is exposed. Most copper and ILD losses result from the overpolish step. Unfortunately, overpolish is a necessary step in copper dual damascene fabrication, because the residual copper does not clear evenly across the entire wafer. Here we present a copper-dishing model based on overpolish, pad compressibility, feature conformity, and removal rate selectivity. The terms for the copper-dishing model are defined schematically in Fig. 3. According to the Preston equation, the removal rates of tantalum and copper may be expressed as
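$$R_{Ta} = K_{Ta}\,\frac{E\,H_{Ta}}{H}\,V \qquad (11)$$
$$R_{Cu} = K_{Cu}\,\frac{E\,H_{Cu}}{H}\,V \qquad (12)$$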
where KTa is the Preston coefficient for tantalum (the diffusion barrier), KCu is the Preston coefficient for copper; HTa is the compression distance of the pad on the diffusion barrier region; and HCu is the compression distance of the pad on the copper region. Therefore, the copper-dishing rate is given by
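$$\frac{dh_D}{dt} = R_{Cu} - R_{Ta} = \frac{E\,V}{H}\left(K_{Cu}\,H_{Cu} - K_{Ta}\,H_{Ta}\right) \qquad (13)$$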
The removal rate selectivity (S) of copper to tantalum for the given slurry is defined as
$$S = \frac{K_{Cu}}{K_{Ta}} \qquad (14)$$
Therefore, Eqn. 13 can be solved with the appropriate boundary conditions to give copper dishing
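$$h_D = H_0\left(1 - \frac{1}{S}\right)\left[1 - \exp\!\left(-\frac{K_{Cu}\,E\,V}{H}\,t\right)\right] \qquad (15)$$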
where H0 is the pad compression distance under the given pressure. If the pad conformation factor is taken into account, H0 should be replaced with H0', which is given in Eqn. 8.
Thus the complete copper-dishing model is
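$$h_D = x\,\frac{P\,H}{E}\,\ln\frac{w}{w_0}\left(1 - \frac{1}{S}\right)\left[1 - \exp\!\left(-\frac{K_{Cu}\,E\,V}{H}\,t\right)\right] \qquad (16)$$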
where hD is the copper dishing; S is the removal rate selectivity of copper to barrier material; P is the polish pressure; H is the pad thickness; E is the Young's elastic modulus of the pad; x is the pad conformity; w is the copper trench width; w0 is the effective minimum trench width; KCu is the Preston coefficient of copper polishing; V is the linear velocity of the pad relative to the wafer; and t is the elapsed time after the copper end point. These features of the model can be used to optimize the copper CMP processes.
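A short sketch shows how the listed parameters interact. It assumes a closed form inferred from the symbol list and the surrounding derivation, not quoted from the article: dishing equals the effective pad compression x(PH/E)ln(w/w0), scaled by (1 - 1/S) and by a saturating factor 1 - exp(-K E V t/H). The parameter values are illustrative assumptions, not the article's table.

```python
import math


def copper_dishing(t, w, selectivity, *, w0, x, pressure, pad_thickness,
                   youngs_modulus, k_cu, velocity):
    """Copper dishing (metres) after t seconds of overpolish, for w > w0."""
    if w <= w0:
        raise ValueError("this sketch only covers w > w0")
    if selectivity <= 0:
        raise ValueError("selectivity must be positive")
    # Effective pad compression including conformation (assumed logarithmic in w).
    h0_eff = x * (pressure * pad_thickness / youngs_modulus) * math.log(w / w0)
    tau = pad_thickness / (k_cu * youngs_modulus * velocity)
    return h0_eff * (1.0 - 1.0 / selectivity) * (1.0 - math.exp(-t / tau))


# Illustrative values only (assumptions): SI units throughout.
PARAMS = dict(w0=0.01e-6, x=0.15, pressure=2.0e4, pad_thickness=1.3e-3,
              youngs_modulus=1.0e8, k_cu=8.3e-13, velocity=0.5)

for seconds in (30.0, 60.0, 180.0):
    d = copper_dishing(seconds, 100e-6, 20.0, **PARAMS)
    print(f"t = {seconds:5.0f} s, w = 100 um, S = 20 -> h_D = {d * 1e9:6.1f} nm")
```

In this form, dishing grows with overpolish time toward a ceiling set by the pad compression, trench width, and selectivity, and vanishes when the copper and barrier rates are equal (S = 1).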
Overpolish
Figure 2. Schematic illustration of copper dishing (hD), ILD erosion (hE), and field ILD loss (hF).
Due to the removal rate variation across the wafer, some areas or features will clear sooner than others, and will thus inevitably suffer from some extent of overpolish by the time that all of the undesired copper is removed. Copper dishing as a function of overpolish time has been calculated using Eqn. 16 and the parameter values listed in the table. Figure 4 shows the model prediction and experimental observations of the effect of the overpolish on copper dishing. Continuing to polish past the end point will increase the level of dishing. The limiting factors in overpolish control include the nonuniformity of the electroplating and CMP processes, as well as the topography of the overplated ultrafine damascene arrays. In order to minimize overpolish, much effort is needed to achieve precise control over within-wafer nonuniformity of the etching, deposition, and CMP processes, as well as in improving planarization efficiency. In practice, the requisite overpolish time can be greatly shortened if the CMP removal rate profile matches the EP deposition rate profile.
Slurry selectivity
Figure 3. Schematic illustration of the copper damascene structure and parameters for the copper-dishing model.
The choice of copper CMP slurry is determined not only by its polishing performance on copper, but also by its performance on barrier and ILD materials. Tantalum and tantalum nitride are being actively tested as diffusion barrier materials for copper. Tantalum is a hard, chemically inert metal that is difficult to polish; consequently, severe dishing can take place while polishing a surface on which copper and tantalum features coexist. One of the most important properties of copper CMP slurry is its removal rate selectivity between copper, barrier, and ILD materials. The copper-dishing model predicts that the dishing decreases as the slurry selectivity decreases, and dishing is zero if slurry with 1:1:1 selectivity is employed. Recently, a slurry with 1:1:1 (Cu:Ta:SiO2) selectivity has been tested for copper damascene CMP processes [10]. However, it must be noted that decreasing the selectivity affects process performance in two conflicting ways. Adjusting the selectivity to 1:1:1 can effectively reduce the local topography, but a relatively high SiO2 removal rate results in more field oxide loss and erosion of narrow oxide spacings. Therefore, the 1:1:1 selectivity slurry is not yet an ideal solution to the current issues in copper CMP. In practice, an acceptable trade-off between process and device parameters needs to be achieved.
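The two conflicting effects of lowering the selectivity can be put side by side in a small sketch. It assumes that the selectivity enters the dishing only through a factor (1 - 1/S), with S the copper-to-tantalum rate ratio, and that field oxide loss is simply the oxide removal rate multiplied by the overpolish time. The removal rates and the overpolish time are hypothetical, and the same overpolish time is assumed for both slurries for simplicity.

```python
def compare_slurry(rate_cu, rate_ta, rate_ox, overpolish_s):
    """Rates in nm/min. Returns (dishing factor 1 - 1/S, field oxide loss in nm)."""
    selectivity = rate_cu / rate_ta
    dishing_factor = 1.0 - 1.0 / selectivity
    field_loss_nm = rate_ox * overpolish_s / 60.0
    return dishing_factor, field_loss_nm


# Hypothetical Cu, Ta, SiO2 removal rates in nm/min.
SLURRIES = {
    "selective (100:5:1)": (400.0, 20.0, 4.0),
    "1:1:1": (100.0, 100.0, 100.0),
}
OVERPOLISH_S = 60.0  # hypothetical overpolish time, seconds

for name, (cu, ta, ox) in SLURRIES.items():
    factor, loss = compare_slurry(cu, ta, ox, OVERPOLISH_S)
    print(f"{name:20s} dishing factor = {factor:4.2f}, field oxide loss = {loss:5.1f} nm")
```

Under these assumptions the 1:1:1 slurry removes the selectivity-driven dishing entirely but thins the field oxide far more than the selective slurry does over the same overpolish.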
Pad compressibility and conformity
Figure 4. The model's prediction of copper dishing of a 100 µm-wide trench as a function of overpolish time. Experimental results are shown for comparison.
The polishing pad is an important but poorly understood component of the CMP process. For example, many CMP studies conclude that hard pads yield better planarity. However, for lack of proper ways to quantify the pad characteristics, it is usually not clear what is meant by a "hard" pad. In this article, the polish pad is modeled by two parameters: compressibility and conformity. The compressibility is a measure of the pad deformation under the load and thus, it is associated with the Young's elastic modulus. The conformity is a measurement of the pad mechanical flexibility, i.e., the extent of the pad bending over topographical features. In this article, a hard pad is defined as a pad that has a high Young's modulus and low conformity. The model predicts that both low compressibility and low conformity are desirable for reducing copper dishing. According to Eqn. 16, copper dishing increases as the feature size increases due to pad bowing. The wider the metal line or the more compliant the pad, the more the pad can deform to remove materials within the dish.
Figure 5. The model's prediction of linewidth dependence of copper dishing and experimental data (overpolish time = 180 sec).
Figure 5 shows the model prediction of copper dishing as a function of linewidth for the second step process. The model reveals that copper dishing increases with increasing linewidth as a logarithmic function. The calculated results agree well with the experimental data. It should be mentioned that a dual-layer pad system has often been used in ILD and tungsten CMP applications. A hard top pad is used for good planarity, while a soft bottom pad helps the top pad contact uniformly across the wafer. To control copper dishing, however, the model suggests that the soft bottom pad should be eliminated for planarization if an acceptable nonuniformity can be obtained.
ILD erosion model
ILD erosion is defined as the difference in height between the dielectric within a trench array and the dielectric outside the array. ILD erosion usually occurs in the regions where the area of the metal lines is large compared to the insulator area, which causes the ILD/metal combination to erode. Modeling of copper dishing can be extended to ILD erosion. Assuming copper is fully recessed below the ILD surface during the overpolish step, the dependence of the ILD removal rate on the pattern density may be expressed as
[Equation 17]
where Rpatt is the removal rate of the patterned area under the given pressure, Runpatt is the removal rate of the unpatterned area, r is the pattern density defined as the ratio of copper linewidth to the pitch, and h is the pattern area coefficient (0<h<1), defined as the ratio of effective patterned area to measured patterned area. The rate of ILD erosion may be written as
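$$\frac{dh_E}{dt} = R_{patt} - R_{unpatt} = K_{OX}\,\frac{E\,V}{H}\left(H_{patt} - H_{unpatt}\right) \qquad (18)$$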
where hE is ILD erosion, KOX is the Preston coefficient for silicon dioxide, Hpatt is the pad compression distance on the pattern region, and Hunpatt is the pad compression distance on the unpatterned region. Eqn. 18 can be solved with the appropriate boundary conditions. Taking into account the pad conformation factor, the ILD erosion model is expressed as
[Equation 19]
where w represents the width of the pattern array, and other symbols have their usual meanings as in preceding equations. ILD erosion is sensitive to many consumable, equipment, and device parameters according to Eqn. 19. Among the device parameters are two geometrical contributions to the erosion of a patterned area: 1. ILD erosion increases as pattern density (r) increases; and 2. ILD erosion increases as the dimensions of the patterned area (linked to h and w) increase.
Although both h and w are used to describe the impact of pattern area dimensions on ILD erosion, they have distinct physical implications. The pattern area coefficient h is introduced to link the effective local pressure to the contact area between the pad and the wafer surface, while the parameter w is used to correlate the effective local pressure with the pad bending. For pattern array size w = 3000 µm, ILD erosion has been calculated using a value of 0.9 for h and the other parameter values listed in the table. Overpolish time is also critical to ILD erosion, as it is to copper dishing. Figure 6 shows the model prediction of ILD erosion as a function of pattern density during the second step polishing. The modeling results agree well with the experimental data.
Integration of electroplating and CMP
Traditional electroplating chemistry needs to be tailored with organic additives to meet submicron fabrication requirements. Among the process issues in electroplating, the most difficult problem is the fill of ultrafine features. New electroplating chemistry has recently been demonstrated to achieve excellent filling properties.
Figure 7. The model's prediction of bump height reduction, with experimental observations. The initial bump height is 5000Å, and the bump size is 2000 µm.
A negative impact of the new chemistry, however, is that copper continues to grow at an accelerated rate even after the filling is complete, thus resulting in "bumps" over any ultrafine arrays. Since these bumps require longer overpolish of the wafer, copper dishing and erosion are aggravated. Therefore, control of electroplating bumps becomes a particularly important factor for planarization of copper damascene. Though the equations were derived for trench-step heights, the same equations apply to bump-step heights. For a bump size of 2000 µm, the reduction of the bump height was calculated using the copper planarization model; the calculated results are shown in Figure 7. The model prediction correlates fairly well with the experimental observations. The model reveals that bump size and bump height are important parameters for integration between electroplating and CMP processes.
When copper polishing approaches the end point of copper removal, the thickness of copper bump residues that necessitate the overpolish step is determined by both the bump height and bump size. The copper planarization model can be used to predict the overpolish required for clearing bumps of arbitrary sizes. Figure 8 shows how much overpolish is needed for 5000Å-high bumps of various sizes. As indicated in the plot, the planarization of small bump features (<10 µm) is highly effective, but the planarization efficiency decreases as the bump size increases due to pad conformation.
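The dependence of the required polish time on bump size can be sketched by inverting an exponential step-height law. The form assumed here is an inference consistent with the text rather than a quotation: the bump height relaxes from hS0 toward a residual ΔH = (PH/E)(x ln(w/w0) - 1) with time constant H/(K E V), so a target height at or below ΔH is never reached by planarization alone. All parameter values are illustrative assumptions.

```python
import math


def residual_step(w, *, w0, x, pressure, pad_thickness, youngs_modulus):
    """Assumed residual dH = (P*H/E) * (x*ln(w/w0) - 1), for w > w0."""
    if w <= w0:
        raise ValueError("this sketch only covers w > w0")
    h0 = pressure * pad_thickness / youngs_modulus
    return h0 * (x * math.log(w / w0) - 1.0)


def time_to_reach(target, hs0, w, *, w0, x, pressure, pad_thickness,
                  youngs_modulus, k_cu, velocity):
    """Polish time (s) for a bump of size w to shrink from hs0 to target (metres)."""
    if target >= hs0:
        return 0.0
    dh = residual_step(w, w0=w0, x=x, pressure=pressure,
                       pad_thickness=pad_thickness, youngs_modulus=youngs_modulus)
    if target <= dh:
        return math.inf  # the pad conforms too much to reach this height
    tau = pad_thickness / (k_cu * youngs_modulus * velocity)
    return tau * math.log((hs0 - dh) / (target - dh))


# Illustrative values only (assumptions): SI units throughout.
PARAMS = dict(w0=0.01e-6, x=0.15, pressure=2.0e4, pad_thickness=1.3e-3,
              youngs_modulus=1.0e8, k_cu=8.3e-13, velocity=0.5)

BUMP_HEIGHT = 500e-9  # 5000 angstroms, as in the text
TARGET = 200e-9       # 2000 angstroms

for size_um in (10.0, 100.0, 2000.0):
    t = time_to_reach(TARGET, BUMP_HEIGHT, size_um * 1e-6, **PARAMS)
    print(f"bump size = {size_um:7.1f} um -> time to 2000 A = {t:.1f} s")
```

With these assumed values, small bumps reach the target quickly, while the largest bump never does, which mirrors the observation that large bumps tend to set the overpolish requirement.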
The influence of the bump height on the overpolish exhibits two distinct phases. The CMP process can usually planarize the bump height effectively down to a value less than 2000Å during the initial stage of copper polishing, and then the planarization efficiency drops considerably due to the pad deformation. The modeling results indicate that it is important to control the step height of large bumps, because it is usually the limiting factor in the overpolish step.
Conclusion
The mechanical parameters and chemical nature of CMP processes exhibit complex effects on the planarity of copper damascene structures. Process modeling becomes particularly important when deciding how to tailor the process to balance the trade-offs between CMP influences on device characteristics. A theoretical model has been developed for CMP of copper dual damascene based on the multistep, multislurry process platform. The degree of planarization of copper damascene can be predicted using the model with respect to both process parameters and device feature geometries. The model provides insights and guidance in optimizing the CMP process and deriving design rules for copper damascene fabrication.
References
1. R.L. Jackson et al., "Processing and Integration of Copper Interconnects," Solid State Technology, Vol. 41, No. 3, pp. 49-59, 1998.
2. X.W. Lin, D. Pramanik, "Future Interconnect Technologies and Copper Metallization," Solid State Technology, Vol. 41, No. 10, pp. 63-79, 1998.
3. P. Singer, "Making the Move to Dual Damascene Processing," Semiconductor International, Vol. 20, No. 9, pp. 79-82, 1997.
4. J. Grillaert, M. Meuris, N. Heylen, et al., "Modeling Step Height Reduction and Local Removal Rates Based on Pad-Substrate Interactions," Proc. CMP-MIC Conf., pp. 79-86, 1998.
5. T.H. Smith, S.J. Fang, D.S. Boning, G.B. Shinn, J.A. Stefani, "A CMP Model Combining Density and Time Dependencies," Proc. CMP-MIC Conf., pp. 97-104, 1999.
6. J. Grillaert, M. Meuris, E. Vrancken, et al., "The Use of a Semi-empirical CMP Model for the Optimization of the STI Module," Proc. CMP-MIC Conf., pp. 105-112, 1999.
7. D.Z. Chen, B.S. Lee, "Pattern Planarization Model of Chemical Mechanical Polishing," J. Electrochem. Soc., Vol. 146, pp. 744-748, 1999.
8. F.W. Preston, "The Theory and Design of Plate-Glass Polishing Machines," J. Soc. Glass Tech., Vol. 11, pp. 214-256, 1927.
9. J.M. Steigerwald, S.P. Murarka, R.J. Gutmann, Chemical Mechanical Planarization of Microelectronic Materials, New York, John Wiley & Sons, pp. 68-72, 1997.
10. D. Mahulikar, A. Pasqualoni, "Development of Ta Slurry for 1:1:1 Cu:Ta:SiO2 Selectivity," Proc. CMP-MIC Conf., pp. 221-226, 1999.
Lin Yang received his BS and MS degrees in materials science and engineering from East China University of Science and Technology, Shanghai, and his PhD in chemistry from the University of Arizona. His experience includes CMP of microelectronic materials, fabrication of integrated optical waveguide devices, development of chemical sensors, and process integration of 0.18 µm feature sizes. Yang works on submicron copper interconnect technology at Integrated Device Technology Inc., 3131 N.E. Brookwood Pkwy., Hillsboro, OR 97124; ph 503/681-6391, fax 503/693-3595.