Beyond AEC and APC: Wafer quality control
04/01/2005
In semiconductor processing, the final yield is often a result of intensive interactions between process recipe parameters and manufacturing tool health. An approach using model-based process control, along with neural network technology, provides a comprehensive view of both recipe and tool health parameters by bridging the gap between advanced process control and automatic equipment control. This article describes an approach for analyzing near real-time detection of both tool faults and process excursions using “urgency metric” measurements.
Advanced process control (APC) and automatic equipment control (AEC) are traditionally considered separate tools in wafer-processing fabs. APC focuses on recipe and process optimization, while AEC concentrates on fault detection, fault prevention, and maintenance scheduling. However, final wafer yields often hinge on both the control of process recipe parameters and manufacturing tool performance. It can be risky to deal with process control and equipment health independently.
A model-based process control concept combining process and maintenance information has been developed by engineers at NeuMath to connect APC and AEC systems. The accuracy of the control models is greatly enhanced with input from a range of in situ sensors, in-line sensors, metrology, and sampled data. After the model is trained, it can serve as a run-to-run controller, a fault detector, and a process simulator, depending on the needs of the process engineers. This capability is used in a tool called the dynamic neural controller (DNC), which applies advanced artificial intelligence methods, such as neural networks and genetic algorithms, to adapt to ever-changing semiconductor processes [1-3]. The wide range of inputs greatly enhances the accuracy of the model, especially when the processes are interrupted by maintenance actions [3].
The ‘urgency’ concept
A major challenge facing process engineers is the interpretation of maintenance recommendations made by run-to-run controllers at the individual wafer level. First, even a simple process may have more than 60 control parameters (inputs), which typically have nonlinear relationships with the process results (outputs). The interactions between the inputs and outputs are hard to comprehend, even with the help of computer programs. Second, normal process deviations at the wafer level may cause model recommendations to vary from wafer to wafer.
To deal with this challenge, the concept of “urgency metrics” has been developed to track chronic problems associated with processes. The urgency metric system provides a quick analysis and summary of recent process conditions, enabling process engineers to make rapid decisions. All recommendations made by the system’s optimizer, including process parameter adjustments and maintenance actions, are analyzed using cumulative summation (CUSUM) statistics for the total wafer risk reduction on a wafer-by-wafer basis. The CUSUM value moves higher when corrective action is recommended consistently over longer periods of time. It also grows when greater cost savings are possible from corrective actions. The CUSUM statistic is called the urgency metric because its value indicates how urgent it is to take these corrective actions.
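The article does not publish the exact urgency formula, so the following is only a minimal sketch of the general idea, assuming a standard one-sided CUSUM. The function name `urgency_cusum`, the `allowance` parameter, and the sample values are hypothetical; each input value stands for the relative cost saving the optimizer projects for one action on one wafer (zero when the action is not recommended).

```python
def urgency_cusum(savings, allowance=0.0):
    """Return a one-sided CUSUM of per-wafer projected savings for one action.

    savings   -- relative cost saving recommended for each wafer, in run order
    allowance -- hypothetical slack subtracted per wafer so that isolated,
                 small recommendations decay instead of accumulating
    """
    urgency = []
    total = 0.0
    for saving in savings:
        # Accumulate the saving, but never let the statistic go below zero.
        total = max(0.0, total + saving - allowance)
        urgency.append(total)
    return urgency


# Sporadic recommendations stay low; consistent ones build up.
sporadic_then_steady = urgency_cusum([0, 0.5, 0, 0.5, 0.5, 0.5], allowance=0.25)
# Larger projected savings make the statistic climb faster.
large_savings = urgency_cusum([1.0, 1.0, 1.0], allowance=0.25)
```

Under these assumptions the statistic rises both when a recommendation persists across wafers and when the projected saving per wafer is larger, which matches the two behaviors described above.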
The urgency metric system is designed to summarize cost savings or risk reduction achieved by performing a single action. The concept of cost is a relative value, assigned by process engineers, to identify costly or risky maintenance actions, as well as costly or risky metrology results or process set points. Cost (as used in this system) is not directly related to actual dollar costs. The complexity of the input/output relationship is transparently transferred to total relative cost savings for the wafer being processed. Therefore, it is clear to the process engineers that a high level of urgency on a maintenance action means the action will lead to certain cost savings. The engineer can then use the DNC model as a simulator to see the exact effect of the action if performed. The system can also display a set of actions with the maximum cost savings in the “urgency” user interface.
Because the CUSUM method smooths out values, it can deal with the problem of random noise at the individual wafer level. It also provides a way to evaluate performance over time. The cumulative nature of the calculation makes it more sensitive to small but consistent shifts. Therefore, with the urgency concept, any trend toward the control limits will be noted long before traditional statistical process control (SPC) methods could detect it, as shown in Fig. 1.
Traditionally, process excursions are identified by SPC methods, which trigger an alarm when variables move outside defined control limits, as indicated by the circles in all three charts in Fig. 1. Based on urgency calculations, alarms are set off long before a process is actually out of control because the system uses trends over time. Examples of urgency-calculated alarms are shown in the control charts with the letter “X.” When a process is running at marginal levels but still within the control limits (as shown in Fig. 1c), urgency calculations can be especially beneficial compared to traditional SPC methods. In these cases, traditional SPC methods would not identify potential problems because the process remained within the control limits. However, most catastrophic failures are preceded by long-term drift toward the control limits. Typically, urgency methods will signal the need for maintenance about 100 wafers earlier than SPC methods.
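The difference between a limit-based alarm and a cumulative one can be illustrated on synthetic data. The sketch below is not the DNC algorithm and does not use fab data: it assumes a noise-free, slowly drifting variable expressed in standard-deviation units, a 3-sigma limit rule, and a one-sided CUSUM with the common textbook settings of 0.5 for the allowance and 5 for the decision threshold. All names and values are hypothetical.

```python
def first_shewhart_alarm(z_scores, limit=3.0):
    """Index of the first point outside the +/- limit, or None if none is."""
    for index, z in enumerate(z_scores):
        if abs(z) > limit:
            return index
    return None


def first_cusum_alarm(z_scores, allowance=0.5, threshold=5.0):
    """Index at which a one-sided (upward) CUSUM first exceeds the threshold."""
    total = 0.0
    for index, z in enumerate(z_scores):
        # Accumulate only the part of each deviation above the allowance.
        total = max(0.0, total + z - allowance)
        if total > threshold:
            return index
    return None


# Synthetic drift: the variable moves up by 1/50 of a sigma per wafer.
drift = [wafer / 50 for wafer in range(200)]

shewhart_index = first_shewhart_alarm(drift)
cusum_index = first_cusum_alarm(drift)
wafers_gained = shewhart_index - cusum_index
```

On this synthetic drift the cumulative rule fires well before the limit rule does. The size of the gap depends entirely on the assumed drift rate, noise level, and settings, so it should not be read as a reproduction of the fab results.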
Examples of urgency usage
The interactions between process parameters and tool health parameters demonstrate the importance and effectiveness of comprehensive tool control. This article provides examples of two different situations. The first shows how urgency indicates tool health issues and requests maintenance actions. In the second, high urgency for the process parameters of a family of recipes indicates the need for recipe adjustment. The results were generated from production data from a Lam Research TCP metal etcher operated by National Semiconductor Corp. at its wafer fab in South Portland, ME.
Figure 2 shows an example of urgency for a maintenance action to calibrate an endpoint detector. The urgency curve for this maintenance action started to rise consistently around March 19. It stayed high until March 22 when the maintenance action was carried out (indicated by the pink vertical line). After the action, the urgency level dropped immediately and stayed low afterward, indicating the maintenance action was effective.
Figure 2. Calibration of the endpoint detector reduced the urgency. The pink line (at the end) indicates when an endpoint calibration is performed.
This example shows that the urgency curve actually identified the problem fairly early, giving the engineers ample time to react and to avoid a catastrophic failure. In this case, the engineers reacted within 48 hr. Similar cases were observed on other production tools where the DNC was installed. An analysis of retrospective data shows some maintenance actions were performed without any indications of high urgency. This analysis also shows that some maintenance actions were performed too late, and, as a result, the quality of many lots was affected. The use of urgency charts for maintenance actions would have given engineers a tool for deciding when to perform actions.
The urgency concept also applies to process parameters. Figure 3 provides a screenshot for recipe parameter analysis. To facilitate decision-making, the urgency for process parameters is summarized in a Pareto chart. For each process parameter that showed high urgency, all recipes using this parameter are listed. In this example, the Process ESC voltage monitor shows high urgency across the entire C8 recipe family, with one exception. The urgency count (the number of wafers with high urgency for this parameter) ranges from 17 to 183. The other recipe families (C7 and C9) both have very low urgency for this parameter. The high urgency count alerts the engineers to look into the C8 recipes, which may not be optimized. On the right side of the screen are recommended set points. The system indicates that if the set points are adjusted from 700 (current target) up to a value between 705 and 712, the average cost (in relative terms) may be reduced by approximately 4%. This estimate is based on a comparison of the average value of the wafer’s current predicted cost with the projected cost after the recommended change is made.
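The two summaries described here, the urgency count per recipe and the relative cost reduction, can be sketched in a few lines. This is only an illustration under assumed data structures: the record layout, the `threshold` for "high" urgency, the recipe names, and all numbers are hypothetical and are not taken from the DNC software or the fab data.

```python
from collections import Counter


def urgency_pareto(records, threshold):
    """Count wafers with high urgency per (parameter, recipe), largest first.

    records -- iterable of (parameter, recipe, urgency) tuples, one per wafer
    """
    counts = Counter()
    for parameter, recipe, urgency in records:
        if urgency > threshold:
            counts[(parameter, recipe)] += 1
    # most_common() sorts by count, which is the ordering a Pareto chart uses.
    return counts.most_common()


def relative_cost_reduction(current_costs, projected_costs):
    """Fractional drop in average relative cost after a set point change."""
    current = sum(current_costs) / len(current_costs)
    projected = sum(projected_costs) / len(projected_costs)
    return (current - projected) / current


# Hypothetical wafer records for one parameter across three recipes.
records = [
    ("ESC voltage", "C8-A", 5.0),
    ("ESC voltage", "C8-A", 6.0),
    ("ESC voltage", "C8-B", 7.0),
    ("ESC voltage", "C7-A", 0.5),
]
pareto = urgency_pareto(records, threshold=1.0)
saving = relative_cost_reduction([100.0, 100.0], [96.0, 96.0])
```

Here `saving` is a fraction of the current average relative cost, so a value of 0.04 corresponds to the kind of "approximately 4%" estimate quoted above.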
Fab installation results
This control methodology has been installed on five Lam Research TCP metal etch tools at National Semiconductor’s fab in South Portland. An urgency Pareto chart gives the engineers a daily summary of process and tool health. The engineers can carry out detailed analysis on the fly, if needed. At times, the urgency also helps engineers find the root cause of novel situations.
An analysis compares the fab processes before and after applying this technology. After the installation, most of the quality metrics show significant improvement in process capability. As illustrated in Fig. 4, the distribution of the process chamber valve angle (for a family of recipes) becomes much tighter after installation. The mean also moved toward the target (the brown vertical line). Over all parameters measured, we observed a Cpk improvement of 19% over a two-month period across all five machines, after implementation [1].
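For readers less familiar with the capability index cited here, Cpk is conventionally defined as the distance from the process mean to the nearer specification limit, divided by three standard deviations. The sketch below applies that standard definition; the function name, the specification limits, and the sample values are hypothetical and do not come from the fab data.

```python
import statistics


def cpk(values, lower_spec, upper_spec):
    """Process capability index: min(USL - mean, mean - LSL) / (3 * sigma)."""
    mean = statistics.mean(values)
    sigma = statistics.stdev(values)  # sample standard deviation
    return min(upper_spec - mean, mean - lower_spec) / (3 * sigma)


# Hypothetical measurements of one parameter and hypothetical spec limits.
example_cpk = cpk([9.0, 10.0, 11.0], lower_spec=4.0, upper_spec=13.0)
```

A tighter distribution (smaller sigma) or a mean closer to the center of the specification window both raise Cpk, which is why the changes illustrated in Fig. 4 show up as a capability improvement.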
Another area of improvement is maintenance. Fewer maintenance actions have been reported since the installation (an average of 66 fewer maintenance actions per tool per year). The maintenance actions also shifted from high-cost to low-cost ones.
Most important, all major tool-related scrap was eliminated from the five etch tools after DNC installation, during the date range studied. In contrast with fault detection systems, this methodology identifies fault patterns before the actual fault occurs. Thus, the methodology improves upon traditional fault detection and classification systems, eliminates scrap, and also eliminates the unscheduled downtime associated with the occurrence of a fault.
Conclusion
Fab installation of a model-based control approach to bridging APC and AEC systems can play a crucial role in reducing equipment maintenance while optimizing process recipes.
References
1. J. Card, L. Laurin, “Using Neural Networks for Intelligent Plasma Etch Process Control,” Solid State Technology, pp. 33-36, Nov. 2002.
2. J. Hyde, J. Doxsey, J. Card, et al., “The Use of Unified APC/FD in the Control of a Metal Etch Area,” Proc. ASMC 2004, pp. 237-240.
3. A. Cao, J. Card, W. Chan, “Using Maintenance Input Data to Increase the Prediction Accuracy of APC Strategies,” Micro, July 2003.
Jill Card is founder and CEO of Ibex Process Technology, a division of NeuMath Inc., 57 Wingate St., Suite 301, Haverhill, MA 01832; ph 978/556-0367, fax 978/556-1539, e-mail [email protected]. Wai Chan, An Cao, Bill Martin, Joyce Hyde, Ibex Process Technology, Haverhill, Massachusetts
Paul Fearon is an operations engineering manager at National Semiconductor Corp., 5 Feden Rd., M/S 04-05, South Portland, ME 04102; ph 207/541-4671, e-mail [email protected].