Input parameter SPC for diffusion furnaces

02/01/1998

Stephen Munley, National Semiconductor (UK) Ltd., Greenock, Scotland

Jon Goldman, Jon Goldman Associates Inc., Orange, California

Lack of access to important information in the diffusion area made improving the processes and systems very difficult. Hours were spent troubleshooting nonproblems, while the real instabilities often went unnoticed. To improve the productivity of our troubleshooting and process improvement activities, we began a project to implement a real-time input parameter statistical process control (SPC) system. To date, the project has helped maintain tighter process control and facilitated preventive maintenance operations.

The term "special cause" refers to process variations resulting from a nonroutine occurrence: a component failure, an inexperienced operator running a process incorrectly, or a routine maintenance procedure carried out improperly. Shewhart described the principles for identifying special causes more than 65 years ago [1]. The method generally tracks output variables (film thickness, resistivity), but the same method can also be applied to input variables. Transducer signals can be thought of as the output variables (results) of furnace components. Component-level catastrophic failure is sometimes unpredictable. More often, the ailing component warns of trouble, provided we read the correct signals. We describe a system to track input variables and use the information to remedy special-cause incidents before they turn 200-wafer furnace loads into scrap. Although the system required significant time to set up and maintain, dealing with special-cause variability can and should become an activity for maintenance or operations personnel, as Deming envisioned [2].

The other type of process variability, "common cause," refers to structural inconsistencies in the manufacturing processes. Some furnace-related examples include failure to control important variables affecting the process (wafer load size, wafer spacing, and tube pressure), failure to verify that profiling was successful, and inconsistent unloading time interval. Correcting common-cause variability requires procedural changes and equipment modification. Reducing common-cause variability is more difficult than the routine operations required to fix special-cause problems once they are identified.

Figure 1. Schematic of SPC system showing RS-232 interface to furnace control system and approximate LAN configuration.

The present study collected data through an RS-232 port from a range of different components making up a Thermco TMX series furnace: MFCs, thermocouples (TCs), pressure transducers, positional transducers on boat loaders, and digital data from valves. Trend techniques applied to this data helped us predict or highlight component failures before they occurred (special cause). The techniques also helped us clarify some common causes of process variation, which were dealt with by changing processes or procedures.

Since furnaces use many components found in other equipment - pressure transducers, MFCs and TCs to name a few - the methods used here to detect sources of process variability, including early component failure, can be translated directly to other types of equipment. Shewhart's method is broadly applicable, regardless of the type of system involved. Because furnace processes are so well understood, interpretation of the data is very straightforward.

System configuration

A data capture/analysis system was incrementally installed at National Semiconductor (NSC) Fab 1, Greenock, Scotland, over approximately three years. By January 1996, the system consisted of a dedicated four-tube workstation on each of 12 furnace stacks, a data server, and an analysis system (Fig. 1). The analysis system provided remote access from the engineering office to any process on any tube in any part of the fab. The system also provided recipe management capability and had an interface to the factory network.

The software included a series of "data capture modules" that collected raw data on a run-by-run basis and stored each run in an individual condensed binary file. This method facilitated run-to-run comparisons. A second module, the SPC administrator, gave users significant flexibility in extracting critical event information from raw process data. Critical events are discussed in the next section. A third module, the trend analyzer, allowed users to view critical event trends, either individually or in groups.

SPC implementation

The first step in implementing an input parameter SPC system was identifying the conditions (events) worth monitoring. For example, the average values and standard deviations of pressure and gas flow rates inside the process tube during deposition show the condition of the flow controllers, pressure control system, and pump. The time rate of pressure change during leak check identifies outgassing problems and failing vacuum components. The tube pressure at the precise instant when the auxiliary gate valve closes and the main gate valve opens illustrates the state of the vacuum pump, fore-lines, and valves. Moreover, the long-term behavior of these variables can be used to predict "critical events" such as component failure and normal system drift. Critical events can, and often do, involve more than one variable. Alert conditions, a further subdivision of critical events, involve the extraction of a single parameter for monitoring. NSC worked with Jon Goldman Associates Inc. to refine the system further, giving additional flexibility to monitor process conditions as needed.

Control limits - guard bands around the alert conditions - had to be set for every recipe and tube type. The limits were initially set impossibly tight so as to force the system to generate numerous "SPC violations" per run. These run reports identified typical values, and we modified the control limits accordingly. We monitored and fine-tuned each change where necessary until sensible limits had been set up for every process and tube type. Initially, the system generated more than 500 alert-condition trend charts on a weekly basis.
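As an illustration only, the extraction of alert conditions and the comparison against control limits might be sketched as follows. The function names, signal values, units, and limit values are all hypothetical; the article does not describe the internals of the SPC administrator module.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class ControlLimits:
    """Guard band around one alert condition (hypothetical structure)."""
    lower: float
    upper: float


def step_mean_and_std(samples):
    """Mean and standard deviation of one signal over one recipe step."""
    return mean(samples), pstdev(samples)


def leak_rate(times_s, pressures):
    """Least-squares slope of pressure versus time during leak check."""
    t_bar, p_bar = mean(times_s), mean(pressures)
    num = sum((t - t_bar) * (p - p_bar) for t, p in zip(times_s, pressures))
    den = sum((t - t_bar) ** 2 for t in times_s)
    return num / den


def violates(value, limits):
    """True when an alert condition falls outside its control limits."""
    return not (limits.lower <= value <= limits.upper)


# Made-up samples for one run (pressure in mTorr, time in seconds).
deposition_pressure = [250.0, 251.0, 249.0, 250.0]
leak_times = [0.0, 10.0, 20.0, 30.0]
leak_pressures = [5.0, 6.0, 7.0, 8.0]

avg, sd = step_mean_and_std(deposition_pressure)
rate = leak_rate(leak_times, leak_pressures)

# Made-up control limits for this recipe and tube type.
limits = {
    "pressure_mean": ControlLimits(248.0, 252.0),
    "leak_rate": ControlLimits(0.0, 0.05),
}
conditions = {"pressure_mean": avg, "leak_rate": rate}
alerts = [name for name, value in conditions.items()
          if violates(value, limits[name])]
print(alerts)
```

Each alert condition reduces a whole run to a single number, which is what makes run-to-run trend charts possible.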

Results

The value of the system immediately became apparent as NSC began to generate the alert-condition trend charts. The following case studies are classic examples of problems generated by common and special causes. When these examples appeared in the weekly trend plots, the relevant process engineers were alerted.

Mass flow control components are found in many different types of equipment. The first example shows how input parameter SPC for MFCs may be applied in a furnace process. Figure 2 shows a mean flow rate trend taken from a low-temperature oxide deposition process. Part a) shows the mean value of flow rate to each of two SiH4 MFCs. Each point represents one run. Averaging all the instantaneous flow-rate data during the deposition step gave the mean flow rate. The flows appeared to be stable until the mean flow for one of the MFCs fell dramatically to less than 30 sccm.

Figure 2. Protracted failure of an SiH4 MFC. In a), the mean flow rate through two MFCs shows no indication of problems until catastrophic failure. In b), data reported by an MFM in series with the MFCs shows gradual diminution of flow followed by catastrophic failure.

Part b) tells a completely different story. This plot shows the SiH4 mean flow through an upstream flow meter connected in series to the two MFCs in part a). The SiH4 mass flow meter (MFM) mean flow dropped for several runs (nearly a week) prior to the failure. One of the two SiH4 MFCs, although providing no indication that its flow signal was in error, actually flowed less gas than it was reporting. Responding to SPC violations in this case would have saved a 66-wafer run. This example also illustrates the value of MFMs in series with MFCs to cross-check critical gas flow rates.
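The cross-check between the MFM and the MFCs can be sketched as below. This is a minimal sketch, assuming that the single upstream MFM carries the combined flow of both MFCs, so that its reading should equal the sum of the two reported MFC flows. The tolerance and flow values are invented for illustration.

```python
def mfm_discrepancy(mfc_means, mfm_mean):
    """Sum of the reported MFC mean flows minus the MFM mean flow (sccm)."""
    return sum(mfc_means) - mfm_mean


def flag_runs(runs, tolerance_sccm):
    """Indices of runs where MFC total and MFM reading disagree too much."""
    return [i for i, (mfcs, mfm) in enumerate(runs)
            if abs(mfm_discrepancy(mfcs, mfm)) > tolerance_sccm]


# Each run: ((MFC 1 mean, MFC 2 mean), MFM mean). Values are made up.
# The MFCs keep reporting their setpoints while the MFM reading sags.
runs = [
    ((50.0, 50.0), 100.0),
    ((50.0, 50.0), 99.5),
    ((50.0, 50.0), 96.0),
    ((50.0, 50.0), 92.0),
]
flagged = flag_runs(runs, tolerance_sccm=2.0)
print(flagged)
```

A check of this kind flags the drifting runs even though neither MFC signal moves, which is the situation shown in Fig. 2.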

The next example again applies input parameter SPC to MFCs. Figure 3 plots a run-to-run trend chart of the average flow from an H2 MFC during a steam oxidation process. The H2 flow trended upward from about 4541 sccm to a high of about 4568 sccm. This drift was not huge, but since the film thickness data from this process had also drifted out of spec, we recalibrated the MFC and the furnace controller's 5-V internal reference. The post-maintenance flow data (last two points) returned to normal, as did the film thickness data.

The heart of a diffusion furnace is the resistance-heated element that surrounds the quartz tube holding the wafers. Run-to-run consistency (or lack thereof) of power drawn by the element during processing was an important indicator of problems in the temperature control system. Figure 4 shows a six-week trend of the mean power applied to one of the end (source) zones of a furnace during a typical high-temperature processing step. The average power consumption during each run increased significantly during the last week of the plot. A catastrophic break in the element winding occurred at this point. Element power consumption appeared to increase prior to a catastrophic element failure, so SPC alerts based on the average power consumption could predict element failure.

Figure 3. Drift of H2 MFC over a 65-run time period.

The data in Fig. 4 appears bimodal because it includes two recipes that run at different temperatures. Both recipes exhibited deviations in power consumption before the element failure.
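Because the two recipes draw different power, a single baseline pooled over both would be too wide to show the deviation. One possible approach, sketched here with made-up numbers in arbitrary units, is to build a separate baseline for each recipe from its first few runs and compare later runs against it. The baseline length and the 3-standard-deviation threshold are assumptions, not values from the study.

```python
from collections import defaultdict
from statistics import mean, stdev


def per_recipe_alerts(runs, baseline_runs=4, k=3.0):
    """Flag runs whose mean power deviates from their own recipe's baseline.

    runs is a list of (recipe_name, mean_power) in run order. The first
    baseline_runs runs of each recipe define that recipe's baseline.
    """
    by_recipe = defaultdict(list)
    for i, (recipe, power) in enumerate(runs):
        by_recipe[recipe].append((i, power))

    alerts = []
    for points in by_recipe.values():
        base = [p for _, p in points[:baseline_runs]]
        if len(base) < baseline_runs:
            continue  # not enough history for this recipe yet
        centre, sigma = mean(base), stdev(base)
        for i, p in points[baseline_runs:]:
            if abs(p - centre) > k * sigma:
                alerts.append(i)
    return sorted(alerts)


# Two interleaved recipes, A and B, with a rise in the last run of each.
runs = [
    ("A", 40.0), ("B", 55.0), ("A", 40.2), ("B", 55.3),
    ("A", 39.8), ("B", 54.7), ("A", 40.0), ("B", 55.0),
    ("A", 40.1), ("B", 55.1), ("A", 42.0), ("B", 57.5),
]
separate = per_recipe_alerts(runs)

# Pooling both recipes into one population hides the same rise.
pooled = per_recipe_alerts([("all", p) for _, p in runs], baseline_runs=8)
print(separate, pooled)
```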

The next example requires a short digression to explain the common practice of "profiling" a furnace. Typically, furnaces use two sets of TCs - profile TCs, which are located inside the process tube, and spike TCs, embedded in the furnace element. Since the profile TCs are inside the tube, they provide a relatively accurate picture of wafer temperature, but they are difficult to use for furnace control because they are far from the element heat source. Spike TCs are more commonly used for control.

Profiling is an iterative process to determine the value of the spike TC that gives rise to the desired temperature measured with the profile TC. This profile-spike offset heavily depends on temperature, gas flow, and other process conditions. Offset values are typically stored in "profile tables." Moreover, the offset values are usually automatically added to the actual spike temperatures by furnace process controllers, so that, for a well-profiled furnace, they match the values recorded by the profile TCs. In the following example, significant temperature problems arise when furnaces are profiled under one set of conditions, and run using another.

Figure 4. Trend showing increase in power consumption prior to catastrophic furnace element failure.

Figure 5. Trend charts showing difference in temperature between spike and profile TCs for two furnace tubes. In a well-profiled furnace, this value should be close to zero. The error changed sign about two-thirds of the way across the plot, when the furnace cell completed one recipe qualification and started another, which used a different gas flow.

Figure 6. Trend of spike-profile temperature difference, illustrating unsuccessful furnace profiling.

Figure 5 shows two plots of ΔT, the difference between spike and profile temperatures, for two different tubes. The trend data shows the effect of running processes with different gas flows from those used in the profile recipe. A high spike-profile ΔT occurred during the setup of this process. The problem persisted on both tubes despite repeated rerunning of the profile recipe. Once the profile recipe was changed to use the same gas flows as the production recipes, the performance improved dramatically (as seen at the end of the plot).

The last example, Fig. 6, shows the effect of running an unsuccessful profile recipe. The tube performed well until it was profiled after the eighth run. Then, the ΔT increased markedly. The furnace was reprofiled after the twentieth run, and performance returned to normal. Profiling a furnace does not always result in improved temperature performance. It may be wiser to reprofile a furnace only when needed, and then to check the results.
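Checking the result of a profiling run could be as simple as comparing the spike-profile temperature difference over a few runs before and after the profile. The sketch below assumes the spike readings already include the profile-table offset; the window size and temperature values are hypothetical.

```python
from statistics import mean


def spike_profile_difference(spike_corrected, profile):
    """Mean difference between offset-corrected spike and profile readings."""
    return mean(s - p for s, p in zip(spike_corrected, profile))


def profiling_helped(differences, first_run_after, window=3):
    """Compare the mean absolute difference before and after a profile.

    differences holds one value per run; first_run_after is the index of
    the first run made with the new profile table.
    """
    if first_run_after < window:
        raise ValueError("not enough runs before the profile")
    before = differences[first_run_after - window:first_run_after]
    after = differences[first_run_after:first_run_after + window]
    return mean(abs(d) for d in after) <= mean(abs(d) for d in before)


# One run: three samples from a single zone, in degrees C (made up).
one_run = spike_profile_difference([900.5, 900.4, 900.6],
                                   [900.0, 900.0, 900.0])

# Per-run differences; the furnace was profiled after the eighth run.
differences = [0.2, -0.1, 0.3, 0.1, -0.2, 0.2, 0.1, -0.1,
               2.5, 2.8, 2.6, 2.7]
helped = profiling_helped(differences, first_run_after=8)
print(round(one_run, 2), helped)
```

A result of False after a profile would be the cue to reprofile before more production runs are made.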

Conclusion

The SPC system made it practical to "phrase the right questions," and collect and observe data to obtain the answers. It illustrated the conditions that result in equipment failures and helped NSC identify and replace components in the early stages of failure.

In all of the previous examples, SPC alerts were generated when monitored parameters exceeded hard alert limits that had been manually set. Another approach - Western Electric Rules [3] - does not require the manual setting of alert limits, but determines the limits based on previous performance. These rules are a series of criteria for determining whether observed variability in a data set is normal or abnormal. The simplest and most widely applied rule, the 3-σ rule, states that more than 99% of all normal data should be within 3σ of the mean value. Virtually all data that lies outside the 3-σ limit is suspect.

These rules assume that the data fits a normal distribution if the process is operating properly. Some alert conditions, such as MFC zero drift, or flow rate standard deviation, lend themselves to the Western Electric Rules because they should remain stable from run to run. Western Electric Rules should pinpoint the onset of component failures at the earliest detectable moment. Other alert conditions, like spike-profile temperature differences or leak rates, may drift in predictable ways associated with normal system operation. Western Electric Rules are not useful in these cases because the data is not expected to fit a normal distribution.
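For a stable alert condition such as MFC zero drift, the rules can be sketched as follows. The centre line and standard deviation come from earlier runs rather than from manually set limits. The four zone rules coded here follow the commonly cited formulation of the Western Electric Rules; the article itself names only the 3-σ rule, and the readings are invented.

```python
from statistics import mean, stdev

# (rule number, points required, window length, zone boundary in sigma)
ZONE_RULES = (
    (2, 2, 3, 2.0),  # 2 of 3 consecutive points beyond 2 sigma, same side
    (3, 4, 5, 1.0),  # 4 of 5 consecutive points beyond 1 sigma, same side
    (4, 8, 8, 0.0),  # 8 consecutive points on one side of the centre line
)


def western_electric_violations(baseline, values):
    """Return (index, rule) pairs for each rule violated in values.

    baseline is earlier data from the same alert condition, used to
    estimate the centre line and standard deviation.
    """
    centre, sigma = mean(baseline), stdev(baseline)
    z = [(v - centre) / sigma for v in values]
    found = []
    for i, zi in enumerate(z):
        if abs(zi) > 3.0:  # rule 1: a single point beyond 3 sigma
            found.append((i, 1))
        side = 1.0 if zi > 0 else -1.0
        for rule, needed, window, zone in ZONE_RULES:
            if i + 1 < window:
                continue  # not enough points yet for this rule
            recent = z[i + 1 - window:i + 1]
            hits = sum(1 for x in recent if side * x > zone)
            if side * zi > zone and hits >= needed:
                found.append((i, rule))
    return found


# Made-up MFC zero readings from earlier runs.
baseline = [0.0, 0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.0]

sudden = western_electric_violations(baseline, [0.05, -0.02, 0.30])
drift = western_electric_violations(baseline, [0.04] * 8)
print(sudden, drift)
```

The second case shows why the run rules matter: a small, persistent offset never crosses the 3-σ limit, yet eight consecutive points on one side of the centre line still raise an alert.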

Future generations of SPC tools should generate trend charts automatically. They should also provide automatic feedback to the furnace when SPC violations are discovered. Further refinements should incorporate Western Electric rules to identify component deterioration and failure, as well as hard alert limits. While the SPC techniques described are quite old, the semiconductor industry still has a long way to go in learning to use them correctly.

We created a web site to publicize the results of the initial three months' work. This was originally only for other NSC sites, but has recently been published on the web for all interested engineers to view: www.ourworld.compuserve.com/homepages/stephen_munley/.

References

1. W.A. Shewhart, Economic Control of Quality of Manufactured Product, D. Van Nostrand, Princeton, N.J., 1931.

2. W. Edwards Deming, Out of the Crisis, Massachusetts Institute of Technology, Center for Advanced Engineering Study, pp. 314-315, 1986.

3. Statistical Quality Control Handbook, AT&T Technologies, Select Code 700-444, Indianapolis, IN, pp. 25-28, 1956.

STEPHEN MUNLEY has an HNC in electrical and electronic engineering. He has been employed by National Semiconductor Corp. at its Greenock, Scotland, manufacturing site since 1979. He has worked as an equipment engineer, specializing in epi, CVD, and diffusion processes, since 1984, and is staff equipment engineer in the deposition section at the facility.

JON GOLDMAN received his BS degree in metallurgical engineering from Drexel Institute of Technology, and his PhD degree in materials science from MIT. After graduation, he joined Motorola's Semiconductor Research and Development Labs, where he worked in the LPCVD process development program. Goldman subsequently worked at Thermco Systems as technical director. In 1989, he founded Jon Goldman Associates Inc., where he has been working in thin-film SPC. Jon Goldman Associates Inc., 2237 N. Batavia St., Orange, CA 92665-3105; ph 714/283-5889, fax 714/283-2884.