Fixing hidden problems with thermal budget
10/01/1997
E.G. Seebauer, R. Ditchfield, Department of Chemical Engineering, University of Illinois, Urbana
Kinetic effects in rapid thermal processing (RTP) are often assessed using the concept of thermal budget. Although thermal budget can be defined in several ways, the basic premise asserts that budget minimization should minimize dopant diffusion and interface degradation. The present work highlights serious shortcomings with this principle. Experiments show that budget minimization, according to some definitions, can actually worsen diffusion problems rather than mitigate them. We present a straightforward framework for improving the results through comparison of activation energies of the desired and undesired phenomena. This framework provides strong kinetic arguments for continued development of rapid isothermal processing and small batch fast ramp methods.
Over the past few years, RTP and its variant, rapid isothermal processing (RIP), have appeared in a wide range of processing steps, including defect annealing, chemical vapor deposition (CVD), nitridation, silicidation, and oxidation. Despite widespread application, kinetic modeling often remains quite qualitative. Quantitative analysis of RTP frequently employs the concept of "thermal budget," commonly defined as the area under a time-temperature (t-T) or time-diffusivity (t-D) curve. Development efforts focus on minimizing the thermal budget in order to reduce unwanted solid-phase diffusion or interface degradation. However, no rigorous kinetic demonstration exists to validate this approach.
This article highlights shortcomings of the thermal budget concept, which arise principally through a failure to account for selectivity in rate phenomena. Minimizing the budget actually suggests inferior heating programs in some cases. Accounting for selectivity, we suggest a framework that relies on comparing the activation energies for the desired and undesired physical phenomena. This framework permits rapid qualitative comparison of the kinetic consequences of various heating programs. While these ideas apply to many physical phenomena that can occur in RTP, they do not attempt to balance kinetic effects against considerations of wafer warpage, heating uniformity, particle control, cost-of-ownership, and the like.
Evolution of the thermal budget concept
The notion of thermal budget apparently originated in conjunction with modeling dopant diffusion in pn junction fabrication [1, 2]. The term referred to the inverse functional relationship between the isothermal processing temperature and the time required to establish a particular junction depth. As ion implantation began to replace thermal diffusion, annealing of implantation damage also required consideration [2, 3]. Some notion of competing physical rate phenomena came into play, and RTP was developed empirically as a method to heal damage. As RTP no longer represented isothermal operation, however, nonisothermal behavior during heating and cooling required attention. In published work, thermal budget has come to represent the product of time t and temperature T [4]. More precisely, in nonisothermal operation the budget denotes the area under a t-T curve.
In some quarters, the solid-phase diffusivity D of some element (a dopant, for example) replaces T. The temperature then appears implicitly, as D obeys:
$$D = D_o \exp\left(-\frac{E_{diff}}{kT}\right) \qquad (1)$$
where Do and Ediff represent the pre-exponential factor and activation energy for diffusion, respectively, and k is Boltzmann's constant. This use of D instead of T appears virtually nowhere in published literature.
In practice, the use of thermal budget today is often neither as precise nor as quantitative as these definitions suggest. Virtually no quantitative use of thermal budget for practical calculations has been published. Instead, we see the virtually axiomatic statement that reducing thermal budget should improve process results by reducing effects such as interface degradation and dopant diffusion. Some workers appear to believe that RTP reduces thermal budget primarily by decreasing processing time. Possible countervailing effects of increased average temperature are rarely considered, and the general approach to process synthesis seems to involve minimizing t and T separately. In particular, increasing T is often viewed as universally detrimental.
Regardless of definition, the notion of thermal budget carries considerable intuitive appeal in an industry driven relentlessly by cost considerations. The analogy between fiscal budget and thermal budget is readily apparent. In fiscal accounting, increasing the rate of spending means that bad things will happen unless the project time decreases. Similarly, it is commonly assumed that increasing transformation rates on the wafer (via increased T) means that bad things will happen unless the processing time decreases. Furthermore, just as smaller fiscal budgets make people smile, smaller thermal budgets are believed to do the same.
In a very loose sense, this analogy contains a kernel of kinetic truth, as the increasing success of RTP over conventional furnace processing amply attests. Closer inspection reveals a subtle but serious flaw, however. In fiscal matters, one needs to consider only a single rate: that of spending. In processing matters, one must always consider at least two rates: that of the desired phenomenon (like silicidation) and that of the undesired phenomenon (like dopant diffusion). The relative balance between these rates is termed "selectivity." Raising the temperature makes everything go faster, but usually increases one rate more than the other. Hence, the selectivity usually changes with T. Kinetic optimization seeks a temperature program that tilts the selectivity toward the desired phenomenon and away from the undesired.
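The temperature dependence of selectivity can be illustrated with a minimal Python sketch. It assumes that both rates follow simple Arrhenius forms; the function names, the unit pre-exponential factors, and the activation energies of 1.5 and 3.5 eV are illustrative assumptions, not values taken from the experiments.

```python
import math

K_B_EV = 8.617e-5  # Boltzmann constant in eV/K


def arrhenius_rate(prefactor, e_act_ev, temp_c):
    """Arrhenius rate at a temperature given in degrees C."""
    temp_k = temp_c + 273.15
    return prefactor * math.exp(-e_act_ev / (K_B_EV * temp_k))


def selectivity(e_desired_ev, e_undesired_ev, temp_c,
                pre_desired=1.0, pre_undesired=1.0):
    """Ratio of desired to undesired rate.

    The prefactors default to 1 (an assumption), so only the trend
    with temperature is meaningful, not the absolute value.
    """
    return (arrhenius_rate(pre_desired, e_desired_ev, temp_c)
            / arrhenius_rate(pre_undesired, e_undesired_ev, temp_c))


# Hypothetical case: desired phenomenon 1.5 eV, undesired 3.5 eV.
for temp_c in (730.0, 775.0, 820.0):
    print(temp_c, selectivity(1.5, 3.5, temp_c))
```

With the undesired phenomenon having the higher activation energy, the ratio falls as T rises; swapping the two energies reverses the trend.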
Some standard applications of thermal budget fail because selectivity is not taken into account. A thermal budget analysis cannot distinguish the kinetic difference between using short process times/high temperatures and long process times/low temperatures. Many heating programs can lead to the same minimum area, but with different kinetic results (Fig. 1). These ideas are laid out in somewhat more whimsical fashion elsewhere [5].
The following section shows how thermal budget considerations actually suggest incorrect heating programs in some cases.
Experiment
Experiments considered CVD of epitaxial Si on doped Si, with concomitant dopant diffusion into the growing layer as the undesired phenomenon. In all cases, the deposition rate exceeded the diffusion rate sufficiently to avoid perturbing the diffusion profiles from those expected for step decay into a semi-infinite medium. Experiments were conducted with boron as dopant, for which the diffusion activation energy (Ediff) lies above the deposition activation energy (Edep). A more complete exposition of the experimental design can be found elsewhere [6], together with details of experiments with copper as the dopant, where Ediff < Edep.
In one set of experiments, the heating and cooling rates remained constant while the soak time varied. In another set, the total process time and cooling rate remained constant while the ramp rate varied. In both sets, the soak temperature varied correspondingly to maintain constant deposition thickness.
The reaction chamber has been described previously in connection with CVD of titanium disilicide [7]. Silicon substrates were suspended in the chamber center from a thin wire connected to a microbalance. Real-time mass measurements, provided by the balance, permitted termination of growth at a specified film thickness. A focused 1000 W xenon arc lamp heated the substrate. The high heating rates (~100°C/sec) permitted the reactor to operate as an RTCVD system. An optical pyrometer, calibrated in separate experiments against a chromel-alumel thermocouple, monitored surface temperature. Silane was the source gas for Si CVD, and all dopant profiles were measured by secondary ion mass spectroscopy (SIMS).
Figure 2. Time-temperature profiles with varying soak t and T. Thermal budgets appear as integrals of t-T with To = 27°C.
Figure 2 shows t-T profiles for experiments in which the soak time and temperature varied, with heating and cooling rates remaining constant. Total grown thickness remained constant at 575 ± 15 Å over a temperature range from 730–820°C. Initial ramping took place at a rate (β) of 100°C/sec, with exponential cooling (by blocking the lamp) at an initial rate of 60°C/sec. The table shows the thermal budgets for these experiments, defined as the area under plots of both t-T and t-D. Integrations began at several choices of starting temperature, To, and ended once T returned to To. While these definitions are rather arbitrary, minimization did not depend on them: all variation in budget occurred in response to changes during the soak.
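The two budget definitions can be compared numerically with a short Python sketch. It is only an illustration under stated assumptions: the profile is idealized as a linear ramp, an isothermal soak, and exponential cooling toward To that is cut off within 1°C of To; the t-T integrand is taken as T in °C; the pre-exponential factor is set to 1 so that the t-D budget is in arbitrary units; and the activation energies (1.6 eV for deposition, 3.5 eV for diffusion) and the 60 sec reference soak are hypothetical values chosen for the example.

```python
import math

K_B_EV = 8.617e-5  # Boltzmann constant in eV/K


def build_profile(soak_temp_c, soak_time_s, ramp_rate=100.0, cool_rate0=60.0,
                  base_temp_c=27.0, dt_s=0.05, cutoff_c=1.0):
    """Return (times, temps) for ramp, soak, and exponential cooling."""
    ramp_time = (soak_temp_c - base_temp_c) / ramp_rate
    cool_start = ramp_time + soak_time_s
    # Time constant chosen so the initial cooling rate equals cool_rate0.
    tau = (soak_temp_c - base_temp_c) / cool_rate0
    times, temps = [], []
    i = 0
    while True:
        t = i * dt_s
        if t < ramp_time:
            temp = base_temp_c + ramp_rate * t
        elif t < cool_start:
            temp = soak_temp_c
        else:
            temp = base_temp_c + (soak_temp_c - base_temp_c) * math.exp(
                -(t - cool_start) / tau)
        times.append(t)
        temps.append(temp)
        if t >= cool_start and temp - base_temp_c <= cutoff_c:
            break
        i += 1
    return times, temps


def trapezoid(xs, ys):
    """Trapezoidal-rule integral of ys over xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))


def budgets(times, temps_c, d0, e_diff_ev):
    """Return (t-T budget, t-D budget) for a sampled profile."""
    diffusivity = [d0 * math.exp(-e_diff_ev / (K_B_EV * (temp + 273.15)))
                   for temp in temps_c]
    return trapezoid(times, temps_c), trapezoid(times, diffusivity)


# Two soaks giving the same deposited thickness for an assumed Edep of 1.6 eV.
E_DEP_EV, E_DIFF_EV = 1.6, 3.5
t_hot = 60.0
t_cool = t_hot * math.exp(
    E_DEP_EV / K_B_EV * (1.0 / (730.0 + 273.15) - 1.0 / (820.0 + 273.15)))

tT_hot, tD_hot = budgets(*build_profile(820.0, t_hot), d0=1.0, e_diff_ev=E_DIFF_EV)
tT_cool, tD_cool = budgets(*build_profile(730.0, t_cool), d0=1.0, e_diff_ev=E_DIFF_EV)
print("t-T budget, hot vs cool soak:", tT_hot, tT_cool)
print("t-D budget, hot vs cool soak:", tD_hot, tD_cool)
```

Under these assumptions the hot, short soak has the smaller t-T budget but the larger t-D budget, which is the qualitative disagreement between the two definitions discussed in the text.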
Figure 3. Boron SIMS concentration profiles for Fig. 2. Counterintuitively, minimum thermal budget (defined as t-T) yields the broadest concentration profile.
For t-T, the budgets decreased monotonically as the process time shortened and the soak temperature increased. Yet, experimentally, lower temperatures led to narrower concentration profiles (Fig. 3). The narrowest profile corresponds to the largest t-T thermal budget, in direct contradiction of the principle of budget minimization. The t-D definition gave the correct progression, however (see table).
Figure 4. Time-temperature profiles with varying soak t and heating rate β. Thermal budgets appear as integrals of t-T with To = 27°C. Growth time was fixed at 420 sec.
Figure 4 shows t-T profiles for experiments in which the process time remained constant while the heating ramp rate varied from 6 to 100°C/sec. Deposition thickness remained constant at 705 ± 15 Å. The table shows the corresponding thermal budgets defined as before. The highest soak temperatures yielded the minimum budget for t-T, although the dependence of budget on soak temperature was not monotonic. Faster ramps (corresponding to lower temperatures) led to narrower concentration profiles (Fig. 5). Thermal budget minimization using t-T again failed to predict the conditions for the narrowest profile, but t-D again gave the correct progression.
Discussion
A t-T definition for thermal budget fails to perform adequately when the activation energy Ed for the desired phenomenon falls below the corresponding energy Eu for the undesired phenomenon. However, a t-D definition works well. To see why, consider the expression for random-walk diffusion in three dimensions:
$$x^2 = 6Dt \qquad (2)$$
where x denotes the root-mean-square displacement. Minimizing the t-D product perforce minimizes x. Still, while budget minimization using t-D is a necessary condition for minimizing diffusion, it is not sufficient because Eqn. 2 ignores the kinetics of the desired phenomenon. Selectivity remains unaccounted for.
A graphical approach will help to emphasize this point, as well as give a more formal framework for kinetic analysis of RTP. For simplicity, we focus only on the time and temperature associated with the soaking period, ignoring for the moment transformations during heating and cooling. We seek to construct a window in t-T space that encompasses satisfactory operation of a hypothetical process (deposition, silicidation, etc.) and to identify the optimum point of operation within that window from a kinetic point of view.
Figure 5. Boron SIMS concentration profiles for Fig. 4. Counterintuitively, minimum thermal budget (defined as t-T) yields the broadest concentration profile.
Figure 6 is a schematic example of this approach. Construction of a suitable window begins with the observation that certain maxima and minima usually exist for T and t independent of kinetics. For example, tmax may be limited by throughput considerations, giving a vertical (T-independent) line. Equipment specifications may set tmin via maximum heating/cooling rates. In such cases, tmin will increase with T. Tmax may be limited by wafer strain or other forms of damage, while thermodynamic considerations (dopant activation, for example) may set Tmin. If Mother Nature is kind, these four lines will define a nondegenerate area as shown in Fig. 6. If she is kinder, curves representing the process kinetics will fall within this area.
Two kinetic curves must be drawn to characterize the system fully. One is defined by the desired process, whose rate, rd, we assume obeys the form:
$$r_d = k_{od} \exp\left(-\frac{E_d}{kT}\right) \qquad (3)$$
Here, kod, a constant, contains the pre-exponential factor, reactant pressures (in deposition, for example), and any other T-independent factors. Reaching the desired deposition thickness, degree of oxidation, etc., requires integration of rd over time. Since for the moment we are considering only the soak period at constant T, Eqn. 3 integrates to the form:
$$\int_0^t r_d \, dt' = K_{od} \exp\left(-\frac{E_d}{kT}\right) \qquad (4)$$
where Kod = kod·t. Equation 4 defines a design specification for the desired phenomenon: in Fig. 6 the process must fall on the corresponding curve. The other kinetic curve in Fig. 6 follows from the kinetics of diffusion; for simplicity, we employ Eqn. 2 rewritten as:
$$D_o t \exp\left(-\frac{E_{diff}}{kT}\right) = \frac{x^2}{6} \qquad (5)$$
Equation 5 contains the t-D thermal budget, which optimally corresponds to the maximum diffusive displacement the process can tolerate. Reducing the thermal budget shifts the curve to the left. Figure 6 also includes a curve of constant t-T, corresponding to the other definition of thermal budget.
Since the time-temperature history of a given wafer affects all physical phenomena taking place on it, the diffusion curve (Eqn. 5) must intersect that defined by the desired kinetics (Eqn. 4). Operating away from this intersection implies either that the desired transformation was incomplete, or that diffusive displacement differed from the design specification.
For a given thermal budget, T and t cannot take just any value along the diffusion curve. Instead, T and t must also fall on the curve for the desired phenomenon. In our opinion, this extra requirement invalidates the intuitive picture suggested by the word "budget." The analogy to fiscal budget leaves out an aspect of the kinetic analysis so crucial that we believe this usage should be dropped entirely.
Figure 6. Schematic t-T diagram showing how to optimize kinetically a process in which the desired phenomenon has a lower activation energy than that for diffusion. Tmin, Tmax, tmin, and tmax are set by nonkinetic considerations. The process t and T must fall within the lighter area on the kinetic curve for the desired phenomenon (defined by Eqn. 4). Unwanted diffusion obeys the curve of constant Dt defined by Eqn. 5. Here, x² is smallest at Tmin. For comparison, a curve of constant Tt also appears. The same reasoning applies, but use of Tt incurs additional problems described in the text.
Besides solving the problem shown in Fig. 1, Fig. 6 also highlights several other important points. First, the intersection of the kinetic curves must fall within the domain defined by Tmin, Tmax, tmin, and tmax. This requirement sets an easily visualized lower limit on x². Second, in Fig. 6, the minimum value of x² occurs at Tmin. This fact follows from the relative slopes of the kinetic curves under the condition (Ed < Eu) used to construct Fig. 6. The boron experiments presented earlier, for which Ediff > Edep, confirm this result. Figure 7 shows the situation where Ed > Eu. In this case, x² finds its minimum at Tmax.
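The slope argument can be made explicit with a short derivation. It is a sketch only, restricted to the isothermal soak, and it introduces one symbol that does not appear in the original: C, the fixed extent (thickness, for example) that the desired phenomenon must reach. Requiring the integrated rate of the desired phenomenon to equal C fixes the soak time at each temperature, and substituting that time into the random-walk expression gives the diffusive displacement along the design curve:

```latex
\begin{aligned}
k_{od}\, t \exp\left(-\frac{E_d}{kT}\right) = C
  \quad &\Rightarrow \quad
  t = \frac{C}{k_{od}} \exp\left(\frac{E_d}{kT}\right), \\
x^2 = 6 D_o\, t \exp\left(-\frac{E_{diff}}{kT}\right)
  \quad &\Rightarrow \quad
  x^2 = \frac{6 D_o C}{k_{od}} \exp\left(-\frac{E_{diff} - E_d}{kT}\right), \\
\frac{d \ln x^2}{dT} &= \frac{E_{diff} - E_d}{kT^2}.
\end{aligned}
```

The sign of the last line carries the result: when Ediff exceeds Ed, x² grows with T and is smallest at the lowest admissible temperature, and when Ediff falls below Ed, x² shrinks with T and is smallest at the highest admissible temperature.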
Qualitative rules for heating program design
These latter aspects of Figs. 6 and 7 may be crystallized into simple, easily explained heuristics that can be generalized to include the heating and cooling ramps, not just the soak to which we have limited ourselves so far.
At its core, the question of selectivity is embodied in the respective temperature dependencies of the desired and undesired phenomena. Temperature dependence in turn arises from the activation energy. Higher activation energies imply stronger rate variations with temperature, so that phenomena with higher activation energies tend to occur preferentially at high temperatures. This simple fact suggests two crucial rules for designing heating programs.
Figure 7. Diagram analogous to Fig. 6, with a diffusing species having Ediff less than that for the desired phenomenon. Now, x² is smallest at Tmax.
First, when the undesired phenomenon has the higher activation energy, the rule is: always keep T as low as possible throughout the process. While Tmin may be determined by nonkinetic considerations such as thermodynamics (as in Figs. 6 and 7), the lowest possible temperature may also be set by tmax. In Figs. 6 and 7, this situation would occur if the "desired" kinetic curve were shifted to the right far enough to intersect tmax before hitting Tmin. This rule has a corollary: ramp-up and cool-down should always occur as quickly as possible. Longer periods in ramping and cooling require higher soak temperatures, which degrade selectivity against the undesired phenomenon. In reality, of course, ramping and cooling can never match the kinetic ideal of perfect step functions, but this rule provides a clear design goal.
Second, when the desired operation has the higher activation energy, the rule is: always soak at the highest possible temperature. While Tmax may be determined by nonkinetic considerations such as wafer warpage (as in Figs. 6 and 7), the highest possible temperature may also be set by tmin. In Figs. 6 and 7, this situation would occur if the "desired" kinetic curve were shifted to the left far enough to intersect tmin before hitting Tmax. This rule has the same corollary as Rule 1: ramping and cooling should take place as quickly as possible. Lingering at lower temperatures gives the undesired phenomenon unnecessary advantage, and increases processing time as well. Again, step-function heating and cooling represent the kinetic ideals.
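The two rules can be sketched as a small Python routine that scans the allowed temperature window for an isothermal soak. This is a hedged illustration, not the authors' procedure: the function names, the reference point (60 sec at 820°C), the time limits, and the activation energies are hypothetical, ramps are ignored, and both phenomena are assumed to follow simple Arrhenius behavior.

```python
import math

K_B_EV = 8.617e-5  # Boltzmann constant in eV/K


def inv_temp_shift(temp_c, ref_temp_c):
    """1/T - 1/Tref with both temperatures converted to kelvin."""
    return 1.0 / (temp_c + 273.15) - 1.0 / (ref_temp_c + 273.15)


def soak_time(temp_c, e_desired_ev, ref_temp_c, ref_time_s):
    """Soak time at temp_c that matches the extent reached at the reference."""
    return ref_time_s * math.exp(
        e_desired_ev / K_B_EV * inv_temp_shift(temp_c, ref_temp_c))


def relative_x2(temp_c, e_desired_ev, e_undesired_ev, ref_temp_c):
    """Squared diffusion length at fixed extent, relative to the reference."""
    return math.exp(-(e_undesired_ev - e_desired_ev) / K_B_EV
                    * inv_temp_shift(temp_c, ref_temp_c))


def choose_soak(e_desired_ev, e_undesired_ev, temp_min_c, temp_max_c,
                time_min_s, time_max_s, ref_temp_c, ref_time_s, steps=200):
    """Return (T, t) with the least diffusion inside the window, or None."""
    best = None
    for i in range(steps + 1):
        temp_c = temp_min_c + (temp_max_c - temp_min_c) * i / steps
        time_s = soak_time(temp_c, e_desired_ev, ref_temp_c, ref_time_s)
        if not (time_min_s <= time_s <= time_max_s):
            continue  # outside the nonkinetic time limits
        x2 = relative_x2(temp_c, e_desired_ev, e_undesired_ev, ref_temp_c)
        if best is None or x2 < best[0]:
            best = (x2, temp_c, time_s)
    return None if best is None else (best[1], best[2])


# Rule 1: undesired phenomenon has the higher activation energy.
print(choose_soak(1.6, 3.5, 730.0, 820.0, 10.0, 420.0, 820.0, 60.0))
# Rule 2: desired phenomenon has the higher activation energy.
print(choose_soak(3.5, 1.6, 730.0, 820.0, 10.0, 420.0, 820.0, 60.0))
# Rule 1 with a tighter tmax, which now sets the lowest usable temperature.
print(choose_soak(1.6, 3.5, 730.0, 820.0, 10.0, 100.0, 820.0, 60.0))
```

In the first call the scan settles on the lowest temperature in the window, in the second on the highest, and in the third on an intermediate temperature where the soak time just fits under the assumed tmax.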
Conventional thermal budget minimization can suggest heating programs that are kinetically suboptimal in some cases. How is it, then, that this concept has found such widespread use? The idea of thermal budget gained currency in connection with the development of RTP as an alternative to furnace processing, particularly for healing implantation damage. Both rules presented here suggest that fast ramping and cool-down represent kinetic optima; RTP clearly defeats conventional furnace processing in this respect. Furthermore, defect dissolution in Si after ion implantation has an activation energy of 5.2 eV [8], much higher than the 3.5 eV typical of dopant diffusion [9]. Where Eu falls below Ed, conventional thermal budget minimization by any definition performs satisfactorily [6].
Implications for equipment development
Recently, small batch fast ramp (SBFR) furnaces have been developed with ramp rates approaching those of RTP systems [10, 11]. Such behavior more closely approximates the kinetic ideal. Also, shortened cycle times improve throughput, a feature greatly augmented by the ability to process several wafers simultaneously instead of just one, as in RTP. This advantage becomes pronounced for steps that inherently require long cycle times to complete. SBFR methods also reduce the problems of temperature uniformity and repeatability that plague many RTP systems.
The results and principles laid out here suggest that SBFR methods should prove particularly suitable for process steps with Eu > Ed. In these cases, Rule 1 dictates use of the longest possible cycle times (with fast ramping and cooling) at the lowest possible temperatures. With dopant diffusion as the undesired phenomenon, many desired processes fall into this category. Activation energies for dopant diffusion fall near 3.5 eV in Si [9], and between 3.0 and 5.0 eV in SiO2 [12]. These numbers are much greater than those typical of CVD (0.5–2.2 eV [13, 14]), oxidation (0.5–1.6 eV [15, 16]), nitridation (0.4–0.6 eV [17, 18]), and some forms of silicidation (1.0–2.1 eV [19, 20]).
Such situations would also benefit from the use of RIP, with its emphasis on rapid ramping to the soak temperature.
Conclusion
This work highlights problems with the concept of thermal budget. Experiments demonstrate that when defined as t-T, thermal budget minimization can actually suggest suboptimal heating programs. A definition based on t-D avoids this problem, but neglects other crucial aspects of kinetic optimization. Arguments based on selectivity have inspired two simple rules for heating program design, based only on comparison of activation energies. This work presents kinetic ideals only, and has not tried to balance these against other practical processing considerations. However, the kinetic ideals lend strong support to continued development of rapid isothermal processing and small batch fast ramp furnace methods.
Acknowledgments
This work was supported in part by Semiconductor Research Corp. and by the National Science Foundation under Grant No. CTS-95-06419. The authors acknowledge support from the Department of Energy through the Materials Research Laboratory (MRL) at the University of Illinois under Contract No. DEFG02-91ER45439. SIMS was performed at the MRL's Center for Microanalysis of Materials.
References
1. C.M. Osburn, A. Reisman, Journal of Supercomputing, 1, 149, 1987.
2. R.B. Fair, Proceedings of the IEEE, 78, 1687, 1990.
3. R. Singh, J. Appl. Phys., 63, R59, 1988.
4. R.P.S. Thakur, R. Singh, Appl. Phys. Lett., 64, 327, 1994.
5. R. Ditchfield, E.G. Seebauer, Rapid Thermal and Integrated Processing VI (MRS Vol. 470), in press.
6. R. Ditchfield, E.G. Seebauer, J. Electrochem. Soc., 144, 1842, 1997.
7. M.A. Mendicino, E.G. Seebauer, J. Cryst. Growth, 134, 377, 1993 (and references therein).
8. C.L. Claeys, G.J. Declerck, R.J. van Overstraeten, Appl. Phys. Lett., 35, 797, 1979.
9. M. Schulz, Landolt-Börnstein: Semiconductors: Impurities and Defects in Group IV Elements and III-V Compounds, III/22b, (Springer: Berlin, 1989) 230.
10. K.G. Reid, A.R. Sitaram, Mat. Res. Soc. Symp. Proc., 387, 201, 1995.
11. T. Speranza, et al., Mat. Res. Soc. Symp. Proc., 429, 309, 1996.
12. M. Ghezzo, D.M. Brown, J. Electrochem. Soc., 120, 146, 1973.
13. C.W. Pearce, "Epitaxy," in VLSI Technology, S.M. Sze (ed.), McGraw-Hill: New York, p. 51, 1984.
14. T.J. Mountziaris, S. Kalyanasundaram, N.K. Ingle, J. Cryst. Growth, 131, 283, 1992.
15. R. Gomez-San Paman, et al., Appl. Surf. Sci., 70/71, 479, 1993.
16. H. Fukuda, M. Yasuda, T. Iwabuchi, Jpn. J. Appl. Phys., 31, 3436, 1992.
17. J. Moon, I. Ito, A. Hiraki, Thin Solid Films, 229, 93, 1993.
18. D.K. Shih, A.B. Joshi, D.L. Kwong, J. Appl. Phys., 68, 5851, 1990.
19. J.Y. Cheng, L.J. Chen, J. Appl. Phys., 69, 2161, 1991.
20. C.A. Clevenger, et al., J. Appl. Phys., 79, 4978, 1992.
Edmund G. Seebauer received his PhD degree in chemical engineering from the University of Minnesota in 1986. He is a professor of chemical engineering at the University of Illinois, and holds a 1988 NSF Presidential Young Investigator Award, a 1994 Alfred P. Sloan fellowship in chemistry, and a 1995 Inventor Recognition Award from Semiconductor Research Corp. University of Illinois, Department of Chemical Engineering, 207 Roger Adams Lab, Box C-3, 600 S. Mathews Avenue, Urbana, IL 61801; ph 217/333-4402, fax 217/333-5052, e-mail [email protected].
Roderick Ditchfield is a PhD student in chemical engineering at the University of Illinois at Urbana-Champaign. He received his BS degree in chemical engineering with distinction from Cornell University in 1993. He can be reached at ph: 217/244-5375, fax 217/244-8068, e-mail [email protected].