


Thin film PVD and strategies for optimized UHV-XHV pumping


07/01/2001







Lawrence T. Lamont Jr., Cougar Labs Inc., Milpitas, California

Overview
The vacuum performance of any processing tool is critically determined by a number of factors, including the design of the processing chambers, the selection, configuration and operating techniques for pumping and, of course, the process itself. The very nature of the processing environment conspires to shield us from vital knowledge of what is going on. All too often, the "horsepower race" of relatively meaningless conventional vacuum system specifications has inhibited progress.


Figure 1. The basic 1970s batch PVD system.

Cryopumps and turbopumps are the dominant UHV pumps for modern PVD processing tools operating in the high vacuum and ultrahigh vacuum regimes. Although these pumps operate in a fundamentally different manner, either will do the job when properly operated, assuming, of course, that the system is reasonably designed. In a presumed effort to combine the best features of each, an argument for the use of cryopumps and turbopumps in parallel combination is occasionally made. Financial considerations present a compelling argument to the contrary, however. It can be difficult to justify the approximate doubling of the cost of system pumping merely on grounds of doubling pumping speed, since a larger pump is generally less expensive than two separate pumps.

Despite evident differences, cryopumps and turbopumps share a troublesome similarity; namely, the throughput and operating pressure of process gas required by many processes often exceed the maximum continuous ratings for the pump. For the cryopump, this limit is manifested by excessive heat load to the refrigerator and shortened periods between regeneration. For the turbopump, the problem is the slowing of the rotor and overheating of the drive motor.

Each of these cases illustrates the truism "throughput is power." Indeed, one modern vacuum text [1] gives outgassing rate, dimensionally a throughput per unit area, in units of W/m². This pumping limitation gave rise to so-called "throttle valves" in the 1970s with the proliferation of rf planar diode sputtering systems.

To see how this works, let us consider an example: Suppose that we have a turbopump with a nominal pumping speed of 1000 liters/sec, such as would commonly be selected for application on a sputtering system. If, for the sake of argument, we accept the notion that we should not operate the pump at a pressure in excess of that for which the pumping speed begins to decrease with increasing pressure, we typically find that, for such a pump, the maximum pressure at the throat should not exceed about 1 × 10⁻³ torr. Since Q = SP, the throughput under such conditions is 1 torr-liter/sec.

Now, suppose we want to sputter at a pressure of 7 × 10⁻³ torr in the process chamber. What should the conductance of the "throttle valve" be so that we get the maximum benefit from our pump? In steady state conditions (i.e., gas is neither "piling up" nor being depleted anywhere within the system) the throughput is conserved. Thus, we have

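$$Q = S_{pump}\,P_{pump} = S_{chamber}\,P_{chamber}, \qquad S_{chamber} = S_{pump}\,\frac{P_{pump}}{P_{chamber}} = 1000 \times \frac{1 \times 10^{-3}}{7 \times 10^{-3}} \approx 143\ \text{liters/sec}$$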

The pumping speed to be delivered to the chamber must be 1/7th of the full rated speed of the pump or about 143 liters/sec. This is not, of course, the value of the conductance of the throttle valve. That must be found using the relation

$$\frac{1}{S_{chamber}} = \frac{1}{S_{pump}} + \frac{1}{C}$$

from which we find that

$$C = \frac{S_{pump}\,S_{chamber}}{S_{pump} - S_{chamber}} = \frac{1000 \times 143}{1000 - 143}$$

yielding C ≈ 167 liters/sec.
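The arithmetic of this example can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in the text (a 1000 liters/sec pump, 1 × 10⁻³ torr maximum at the pump throat, 7 × 10⁻³ torr in the chamber); the function and variable names are ours and are not part of any vacuum library.

```python
def throughput(speed_l_per_s, pressure_torr):
    """Q = S * P, in torr-liters/sec."""
    return speed_l_per_s * pressure_torr


def effective_speed(q_torr_l_per_s, chamber_pressure_torr):
    """Speed that must be delivered to the chamber to carry Q at the chamber pressure."""
    return q_torr_l_per_s / chamber_pressure_torr


def throttle_conductance(pump_speed, chamber_speed):
    """Solve 1/S_chamber = 1/S_pump + 1/C for the series conductance C."""
    if chamber_speed >= pump_speed:
        raise ValueError("chamber speed must be below the pump speed")
    return pump_speed * chamber_speed / (pump_speed - chamber_speed)


# Values taken from the example in the text.
S_PUMP = 1000.0      # liters/sec, nominal turbopump speed
P_THROAT = 1e-3      # torr, maximum pressure at the pump throat
P_CHAMBER = 7e-3     # torr, desired sputtering pressure

q_max = throughput(S_PUMP, P_THROAT)
s_chamber = effective_speed(q_max, P_CHAMBER)
c_throttle = throttle_conductance(S_PUMP, s_chamber)

print(f"Q = {q_max:.2f} torr-liters/sec")
print(f"S_chamber = {s_chamber:.0f} liters/sec")
print(f"C = {c_throttle:.0f} liters/sec")
```

Changing P_CHAMBER shows how quickly the required conductance falls as the process pressure rises above the pump's throat limit.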

If this is the only UHV pump on the system, then the partial pressure of each background contaminant gas will rise by a factor of 7 upon actuation of the throttle valve.

Lest the author be accused of favoring either the cryopump or the turbopump, we could perform a similar exercise, but this time for the cryopump, setting the criterion by considering heating of the charcoal adsorber. Although the exact result depends upon how one selects the maximum throughput, one will generally come out with a similar number. A little care must be exercised here, however: The adsorption process is critically dependent on the temperature of the adsorbing surface. That is not necessarily the temperature of the bulk adsorber, due to the thermal conductivity of any previously adsorbed layers (crudely, think frost).

Our situation is now this: We pumped our system to some pressure, and then we throttled back to a reduced pumping speed. At the same time we may have also introduced the process gas. The background contamination partial pressure is now

$$P_{bg}' = P_{bg}\,\frac{S_{pump}}{S_{chamber}} = 7\,P_{bg}$$

If we pumped down to 2 × 10⁻⁷ torr, then the background will increase to about 1.4 × 10⁻⁶ torr. But didn't we help ourselves by pumping to that lower pressure first? Sadly, the answer is no. This is what the author likes to call the "myth of scrubbing pressure": it is the pressure after throttling has been applied, not the pressure before, that indicates the cleanliness of the vacuum environment.
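A short Python sketch of this scaling follows. It assumes, as the argument above does, that the background gas load is unchanged by throttling, so that the background partial pressure simply scales inversely with the pumping speed delivered to the chamber; the function name is illustrative only.

```python
def throttled_background(base_pressure_torr, full_speed, throttled_speed):
    """Background pressure after throttling, assuming a constant gas load Q = P * S."""
    gas_load = base_pressure_torr * full_speed   # torr-liters/sec
    return gas_load / throttled_speed            # torr


# Numbers from the text: 1000 liters/sec throttled to 1/7th of full speed.
p_after = throttled_background(2e-7, 1000.0, 1000.0 / 7.0)
print(f"background after throttling: {p_after:.1e} torr")
```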

This naturally leads us to ask whether there is any reason for the throttle valve to be adjustable. Wouldn't a fixed conductance work just as well? The answers depend upon your criteria. From a vacuum standpoint, there is little to be gained from a variable throttle valve beyond the ability to change easily to a different conductance in order to compensate for differing process pressure requirements. As usual, there is more to it than that. For example, if pressure control is important for your process, you may want to implement a control scheme based upon upstream control of flow (i.e., throughput) and downstream control of pressure. A variable throttle valve will do this for you.


Figure 2. A basic load-locked PVD system.

Batch to cluster tool
There is an implicit assumption in the foregoing section; namely, that the processing chamber is exposed to lab ambient during each process cycle. For many of us, this limitation went away with the advent of the modern cluster tool in the mid-1980s. For others, it remains to this day.

The basic 1970s batch system (Fig. 1), sans load lock, is exposed to water vapor adsorption from the lab or fab with each processing cycle. For most thin film work, this configuration has all but disappeared, for the simple reason that the pumpdown time is measured in many hours or days.

About this same time, the load lock was introduced; with it the process chamber could remain under vacuum for long periods of time, being opened primarily for maintenance. This was a vast improvement, because pumpdown time for the load lock was usually measured in minutes to (at worst) tens of minutes. But there was a hitch. With each deposition cycle, the system pallet would be vented to atmosphere to load a new batch of wafers. Upon pumpdown and transfer to the process chamber, the pallet and wafers would then contribute a large amount of outgassing to the otherwise reasonably good vacuum there. Since outgassing depends upon surface area and the specific outgassing rate, the fact that the pallet and wafers were a nontrivial fraction of the total process chamber area tended to reduce the benefit of the design configuration (Fig. 2).
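To make the area argument concrete, here is a minimal Python sketch of the gas-load bookkeeping (load = area × specific outgassing rate). Every number in it is a hypothetical placeholder chosen only for illustration; none comes from the article or from measured data, and real specific outgassing rates vary widely with material and history.

```python
def outgassing_load(area_cm2, specific_rate):
    """Gas load in torr-liters/sec = area (cm^2) * specific rate (torr-liters/sec-cm^2)."""
    return area_cm2 * specific_rate


# Hypothetical placeholders: (area in cm^2, specific rate in torr-liters/sec-cm^2).
surfaces = {
    "chamber walls (long under vacuum)": (10_000.0, 1e-10),
    "pallet and wafers (freshly vented)": (2_000.0, 1e-8),
}

loads = {name: outgassing_load(area, rate) for name, (area, rate) in surfaces.items()}
total_load = sum(loads.values())

for name, load in loads.items():
    share = 100.0 * load / total_load
    print(f"{name}: {load:.1e} torr-liters/sec ({share:.0f}% of total)")
```

With these assumed inputs the freshly vented surfaces dominate the load even though they are the smaller area, which is the qualitative point of the paragraph above.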


Figure 3. Basic cluster tool configuration.

Since it was recognized that the lion's share of the outgassing was water vapor, large cryo-adsorbing surfaces, commonly called Meissner traps, were added. These were a mixed blessing. Prior to process start, the vacuum would appear to be vastly improved. Since there was generally no conductance limit applied to them, many thousands of liters/sec of additional pumping speed could be easily and inexpensively achieved. Sadly, with the start of the process, things were not so simple. The process plasma could easily desorb the (relatively) weakly physisorbed gas and return it to the process environment at the worst possible time and in a way that was not easily discernible by the system operator.

With the cluster tool, of course, a processing tool comprises multiple process modules surrounding a central handler module. The central handler, or cluster core as it is commonly called, is accessed through one or more load-unload lock chambers. Of all these, only the load-unload lock chambers are cyclically exposed to the fab or lab air (Fig. 3). Since there is no pallet at all, the additional outgassing contribution to the process chamber comes from each new wafer. This is a very different situation "by the numbers," but not so much as one might think in terms of process contamination from adsorbed water vapor.

Within the cluster tool process chamber
Consider now the process chamber of a modern cluster tool, "living" at a background pressure of high 10⁻⁸ torr to low 10⁻⁷ torr with 173 liters/sec available pumping speed. When we introduce a wafer that, only minutes before, was in the fab exposed to ambient air, we are now primed to expect to see the background pressure skyrocket. It doesn't. Why not?

Upon a moment's reflection, the answer becomes apparent: We are in a very clean processing system and we have just completed processing the preceding wafer seconds (or at most a few tens of seconds) before. How long does it take for a monolayer of adsorbed water to form at such pressures? Again, by crude estimate, the result is a few tens of seconds to, perhaps, one hundred seconds; so, just because you don't see anything happening on a pressure gauge, you should not feel safe concluding that nothing of consequence is occurring.
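The crude monolayer estimate can be reproduced from kinetic theory. The sketch below is ours, not the author's calculation, and rests on assumptions the text does not state: room-temperature gas (300 K), a sticking coefficient of unity, and roughly 10¹⁵ adsorption sites per cm² in a monolayer.

```python
import math

K_BOLTZMANN = 1.380649e-23   # J/K
AMU = 1.66053907e-27         # kg
PA_PER_TORR = 133.322        # Pa per torr


def impingement_rate(pressure_torr, molar_mass_amu, temperature_k=300.0):
    """Molecular flux onto a surface, molecules per cm^2 per second."""
    pressure_pa = pressure_torr * PA_PER_TORR
    mass_kg = molar_mass_amu * AMU
    flux_per_m2 = pressure_pa / math.sqrt(2.0 * math.pi * mass_kg * K_BOLTZMANN * temperature_k)
    return flux_per_m2 * 1e-4   # convert from per m^2 to per cm^2


def monolayer_time(pressure_torr, molar_mass_amu=18.015, sticking=1.0, sites_per_cm2=1e15):
    """Seconds to accumulate one monolayer (assumed site density and sticking)."""
    return sites_per_cm2 / (sticking * impingement_rate(pressure_torr, molar_mass_amu))


t_1e7 = monolayer_time(1e-7)
for p in (8e-8, 1e-7, 2e-7):
    print(f"water partial pressure {p:.0e} torr: about {monolayer_time(p):.0f} s per monolayer")
```

Under these assumptions a water partial pressure near 10⁻⁷ torr gives a monolayer time on the order of tens of seconds, in line with the estimate in the text.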


Figure 4. A side view of the process module for a new UHV/XHV cluster tool configuration is shown. Insert shows a representative cross-section of the pump combination.

Let us also ask, "How large is this pump?" To make this estimate, we note that the initial effective pumping speed per unit area is just the product of the condensation coefficient and the aperture conductance for water vapor. For air, the aperture conductance is about 11.6 liters/sec-cm². For water vapor, this increases to about 14.5 liters/sec-cm². The process chamber of a typical modern cluster tool has an interior surface area, including shielding, sources, and fixturing, of at least 10⁴ cm², and the initial sticking coefficient may be near unity; this implies an enormous pumping speed, even if the condensation coefficient begins to drop significantly as surface coverage increases. That is why we do not see the pressure rise.
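As a rough illustration of the size of this "pump," the estimate can be written out in Python. The aperture conductance and surface area are the figures quoted above; the reduced coefficients are arbitrary values we chose to show how the speed falls as coverage builds, not measured data.

```python
WATER_APERTURE_CONDUCTANCE = 14.5   # liters/sec-cm^2, value quoted in the text
CHAMBER_AREA_CM2 = 1.0e4            # cm^2, lower bound quoted in the text


def wall_pumping_speed(condensation_coefficient, area_cm2=CHAMBER_AREA_CM2):
    """Effective water pumping speed of the chamber surfaces, liters/sec."""
    return condensation_coefficient * WATER_APERTURE_CONDUCTANCE * area_cm2


initial_speed = wall_pumping_speed(1.0)
for coefficient in (1.0, 0.1, 0.01):   # illustrative values only
    print(f"coefficient {coefficient}: {wall_pumping_speed(coefficient):,.0f} liters/sec")
```

Even with the coefficient reduced a hundredfold, the walls still outpump a turbopump of a few hundred liters/sec for water vapor.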

Isn't this a "good thing?" Isn't it just an additional "free" pump, and a large one at that? Yes and no. The difficulty arises simply from the fact that the outgassing water is going onto the previously deposited material. After the initiation of deposition, it will continue to do that onto the surface of the growing film, generally at a greatly increased rate due to substrate heating. The growing film becomes contaminated. So, consider that our purpose is not to make an ion gauge read some comforting indicated pressure.

Our purpose is to deposit good films. But wait! Doesn't the growing film cover over the surface, and so reduce the outgassing? For the front side, that is indeed the case. For the back side, though, it is not. The process gas is not particularly effective in blowing away the outgassed water, so some of it, via gas phase back scattering, will condense on the surface of your growing film, react, and remain. Generally, the oxygen remains chemically bound within the growing film and the hydrogen is reintroduced to the processing environment. With some films, that is largely the end of it. For others, the hydrogen itself can still work some mischief. One possible example of this may be degradation of barrier properties via enlargement of grain boundaries.

The water pump, again
Since it is well known that adsorbed water vapor constitutes most of the outgassing from a wafer newly introduced to a UHV environment, the water pump has been rediscovered. The modern embodiment of the device is as an appendage pump, chilled by a refrigerator of some sort rather than by liquid nitrogen, as with the Meissner trap. Both Gifford-McMahon and Stirling cycle refrigerators are in use.

Adsorption of very thin layers of water vapor on a cryo-surface is well known to change its properties dramatically in the infrared [2]. This means that such pumps are not only sensitive to desorption by low-energy ion and electron bombardment from a process plasma, but heating by radiative transfer may also become an important issue. It also means, as a practical matter, that the initial advantage of polished low-emission cryo-surfaces with respect to heat load due to radiative transfer is quickly lost.

The matter of heating will generally come down to engineering tradeoffs. These tradeoffs may be very different for differing processes. Electron beam or thermal evaporation, for example, with its high radiative heat loads, is a very different optimization problem from that presented by a sputter deposition system or a pulsed laser deposition system with an 850°C substrate heater in an oxygen ambient.

Optimal implementation of these pumps is critically dependent upon these specific considerations. Once the relevant conditions have been properly analyzed and shielding implemented, there is no reasonable purpose for making them adjustable simply to improve apparent performance in the "clean, dry and empty" state. Nothing gets done there.

Combining turbo and water pumps
Once we have optimized the water pump installation, we are left with the question of noncondensable gases. This is where the turbopump comes in. The initial question might be, "Should the pumps be in parallel or in series?" Available ports on existing systems will probably determine the answer to that question. Given that the conductance to the turbopump will probably need to be limited, possibly by an adjustable throttle valve, the required port may be comparatively small. On the other hand, the conductance to the water pump should probably be as large as possible, commensurate with shielding requirements associated with heat and any process plasma present.

A series arrangement has a potential advantage in that the water pump will selectively remove water vapor from the gas stream. That being the case, the water vapor will not be available for adsorption on the blades of the turbopump. A new UHV/XHV cluster tool configuration includes a nominal 6-in.-dia. entrance aperture into the water pump and a 275 liter/sec turbopump (Fig. 4); this all-metal sealed (excluding the gate seal) system offers routine process background levels in the mid to low 10⁻¹⁰ torr region.

Another obvious question is, "How do I determine the required speed and throughput?" As always, we look to the process. For nonreactive evaporation processes, or for most PLD processing, throughput is not a consideration and the designer is free to select a modest to small turbopump. In most cases, a pumping speed on the order of 150 liters/sec to, perhaps, 400 liters/sec will be adequate.

If the process is sputtering and therefore requires a specific throughput of a noncondensable gas, then the turbopump should be selected based upon its throughput capability, and a throttle valve should be installed between the turbopump and the water pump in the series configuration. For the parallel configuration, the conductance limit should only be applied to the turbopump. This is because there is generally no reason to conductance limit the water pump, per se.

One exception to the wide-open water pump may be for any plasma process that has a large effluent of atomic hydrogen. Recombination of atomic hydrogen in the gas phase is precluded by energy and momentum considerations. When recombination of atomic hydrogen occurs on surfaces, it is highly exothermic, thereby heating those surfaces considerably. A water-cooled baffle surface for such recombination should be provided in series with the water pump. This will ensure that the water pump is exposed only to molecular hydrogen. Such a baffle should have a high conductance while assuring that the probability of colliding with a surface is high; guidelines for the optimization process have been published [3].

If the process is reactive sputtering, then considerations not dissimilar to those of CVD come into play. Specifically, one must consider consumption and distribution of the reactants. The selections of both the turbopump and the cryopump are likely to be affected by such matters.

Conclusion
The day of the generic one-size-fits-all pumping platform for PVD processing is, or soon should be, passing from us. In its place, we need to implement pumping solutions that are more closely coupled to our process. The CVD people have known this for years.

References

  1. J.F. O'Hanlon, A User's Guide to Vacuum Technology, 2nd edition, Wiley-Interscience, New York, 1989.
  2. R.P. Caren, A.S. Gilcrest, C.A. Zierman, Adv. Cryogenic Eng., 9, K.D. Timmerhaus, ed., Plenum, New York, p. 457, 1964.
  3. C. Benvenuti, D. Blechschmidt, G. Passardi, J. Vac. Sci. Technol., 19 (1), May/June 1981.

Lawrence T. Lamont Jr. holds his AB and MA in physics and DSc in applied physics. He is president and chief scientist at Cougar Labs Inc., 988 Hanson Ct., Milpitas, CA 95035.