Category Archives: Metrology

SUNNYVALE, Calif.—June 16, 2000—Nanometrics Inc. has developed Windows NT-based software for the film metrology equipment market that takes advantage of the operating system's support for multitasking.

By pennNET Staff

By Mark A. DeSorbo

WOBURN, MA—One down, 10 to go.

That's one way to sum up the draft process for the new cleanroom standards being hammered out by Technical Committee 209 of the International Organization for Standardization (ISO; Geneva).

Eight documents make up the 14644 standard, “Cleanrooms and Associated Controlled Environments,” and one of them, 14644-1, “Classification of Air Cleanliness,” has already been issued. The industry is now moving toward passage of the other seven documents as well as three sections of the 14698 standard on biocontamination control.

Both standards were the basis of a presentation by Richard Matthews, a CleanRooms Editorial Advisory Board member and chairman of Technical Committee 209. Matthews, founder of Filtration Technology Inc. (Greensboro, NC) and president of Micron Video International, gave his presentation, “The New ISO Cleanroom Standards: Mandatory Criteria vs. Voluntary Criteria,” at a New England Chapter meeting of the Institute of Environmental Sciences and Technology (IEST; Mount Prospect, IL) in mid-March.

“[Matthews] gives a very open presentation,” says Roger Diener, a 209E committee and IEST member, who works as a contamination control engineer for Analog Devices (Wilmington, MA). “Questions will differ, depending on the needs of users who attend. This standard provides a cleanroom baseline because everybody's needs are different.”

And accommodating the diversity of the contamination control industry is just what the standards aim to do, says Matthews, an advocate for the sunset of Fed-Std-209E (see “What is Fed-Std-209E's fate?” Feb. 2000, p. 46). “It is generic. There is flexibility, and you had it with Fed-Std 209E, but no one used it,” he adds. “Nothing we do is industry specific.”

While the air cleanliness portion of the standard has already been issued, several other portions are in various stages of the approval process. The document 14644-2, “Specifications for testing and monitoring to prove continued compliance with 14644-1,” is presently out for final vote. The “Design and Construction” segment of the standard, 14644-4, is also out for final vote. If accepted, both will be published as international standards within two months of the vote.

When comparing 14644-1 to Fed-Std 209E, Matthews says the ISO standard has tighter specifications for 0.1-micron particles, while provisions have been relaxed for 5.0-micron particles. Generally, he says, fewer sample locations are required for cleanroom classification.

For compliance testing, the 14644-2 document stipulates that particle counts in cleanrooms at ISO Class 5 (Class 100) or cleaner must be conducted every six months, while in cleanrooms less stringent than ISO Class 5, tests should be conducted annually. Air pressure difference and airflow tests, according to the document, are to be conducted yearly regardless of the cleanroom class.

The design and construction provision of the standard, 14644-4, provides comprehensive parameters for clean and support utility spaces. There is also a section that outlines recommendations on materials selection and provides specifications on entry, storage and waste areas. “Millions of dollars,” Matthews says, “ride on what is in this document.”

The drafts of the third, fifth and seventh sections of the standard are scheduled to be sent out for comment by the third quarter of this year. The documents address metrology and test methods, cleanroom operations and enhanced clean devices (ECDs).

The third document of the standard focuses on metrology and test methods. Recommended primary tests include the airborne particle count, and under that umbrella classification fall ultrafine-particle and macroparticle tests. There are also a number of tests that are recommended yet optional, including filter leakage, flow visualization and direction, temperature and humidity, electrostatic and ion generation, particle deposition, recovery and containment leak. “These tests are your call,” Matthews adds.

Cleanroom operations, 14644-5, specifies the basic requirements for operating and maintaining cleanrooms and associated controlled environments, Matthews explains. The section includes requirements, operational systems, garments, personnel, equipment, materials and cleanroom integrity. “These are things you must concern yourself with,” he says.

Enhanced clean devices (ECDs) are addressed in the 14644-7 document. Its scope, Matthews says, centers on the minimum requirements for the design, construction, installation, testing and approval of ECDs where they differ from cleanrooms.

Documents six and eight will go out for review during the third quarter of 2001. Terms and definitions, a so-called glossary of terms for the standard, will be outlined in 14644-6, while molecular contamination is addressed under 14644-8. “[Molecular contamination] cuts across all industries. It's a big issue, especially for the pharmaceutical people,” he adds.

The three documents of the ISO 14698 standard on biocontamination are expected to be finalized and approved by the fourth quarter of 2000. The standard features a general principles section for analysis, the evaluation and interpretation of data, as well as a methodology for cleaning and disinfecting.

Surface Height Mapper


April 1, 2000

The NanoMapper metrology tool provides whole-wafer topography data for 200-mm and 300-mm wafer sizes, using non-contact optical measurement to quantify nanometer-scale surface height variations. It responds to process development needs for leading-edge semiconductor devices down to 0.1 micron. Tracking the phase of the optical signal with proprietary interferometric technology yields the sub-nanometer resolution of the measurement.
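As a rough illustration of why phase tracking can reach sub-nanometer resolution (this is the generic relation for reflection interferometry, not a description of ADE's proprietary method), the surface height change corresponding to a measured phase change is

\Delta h = \frac{\lambda}{4\pi}\,\Delta\phi

so for an assumed visible wavelength near 600 nm, resolving the phase to about a thousandth of a cycle corresponds to a height resolution of roughly 0.3 nm.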

ADE Phase Shift

a subsidiary of ADE Corp.

Tucson, Ariz.

Specifically, Hasper says ASMI calculations show that cycle-time improvements garnered through the use of integrated thin film metrology yield cost-per-wafer savings of 8%. He notes ASMI has seen cycle times for the furnace processing area drop from eight days to seven and one-half days. “That's real money,” Hasper says. “If you are able to make a shorter loop from metrology to your furnace, it will have a tremendous impact on cycle time.”

Noting that business is now on firm ground, FSI International, Minneapolis, MN, and Varian Semiconductor Equipment, Gloucester, MA, say two top executives will step out of long-held positions.

Following the sale of its real estate and shoe division, Oerlikon-Bührle Holding AG (OBH) is focused on cutting a path for itself in the semiconductor industry and hopes that its newest acquisition of Plasma-Therm, Inc., St. Petersburg, FL, will lead the way.

By Christine Lunday, Online Editor

In a move that will allow it to bolster offerings with integrated metrology capabilities, ASM International N.V., Bilthoven, The Netherlands, has purchased a 24% interest in NanoPhotonics AG, a Mainz, Germany, maker of precision thin film metrology tooling. The firm also plans to deliver its first 300mm vertical furnace equipped with integrated metrology early this year.

Under an ongoing collaboration between NanoPhotonics and ASMI, the firms have integrated NanoPhotonics' ultra-compact ellipsometer, dubbed the Ellipson, into ASMI's 300mm furnace. Further development work is on tap, according to Albert Hasper, ASMI's worldwide product manager for vertical furnaces, who says customers are eyeing integrated metrology as a way to lower costs per wafer by improving throughput and cycle times, reducing usage of test wafers, and, potentially, enhancing yields. This is particularly true at the 300mm production mark, where the cost of wafers is intimidating to many a fab manager.

By Lise Laurin, special to SST

The economics of 300mm wafer processing will demand more effective process control of equipment and tools, according to William Rozich, IBM's director of equipment technology. Improved overall equipment efficiency (OEE) will be essential to cover the higher cost of wafers and the tools needed for 300mm production. Rozich points out that the area of a 300mm wafer is 2.25 times that of a 200mm wafer, but the raw material is expected to cost three times as much.

Speaking at an advanced equipment control/advanced process control (AEC/APC) symposium organized by Sematech in Vail, Colorado, in September, keynote speaker Rozich made it clear that IBM wants equipment manufacturers (OEMs) to integrate the necessary sensors on their tools and take responsibility for communications. He stated that companies like Applied Materials, which supply both metrology and wafer processing tools, should be in a good position to do the integration. He gave an example of some cluster tools where, he said, “integration is occurring regardless of the suppliers involved.”

Other integration alternatives, in Rozich's view, are cooperative efforts between several suppliers, and perhaps a merger of suppliers.

The conference covered automated fault detection, run-to-run control, and real-time process control. Most of these control systems require the integration of specialized sensors on the tool. Some of the sensors measure process state (most end-point detectors fall into this category), while others measure actual wafer parameters, such as film thickness. Many of the sensors falling into this latter category are termed “integrated metrology” because they replace existing off-line metrology tools.

IBM's commitment to AEC/APC is demonstrated by its establishment of a 300mm Integration Facility, dedicated to testing and qualifying advanced control systems.

Other attendees from chipmakers such as AMD, TI, Infineon, and Motorola (including the White Oak Semiconductor facility near Richmond, VA) presented work on both development- and production-scale control schemes. Other IC manufacturers, such as Samsung, Lucent, and Intel, also sent representatives, showing the increased interest in advanced control systems.

IBM's Rozich and others tied sensor integration to OEE. In a real-time control project on epitaxial deposition, for example, reduced Cpk, yield improvements, test wafer reduction, and downtime and set-up time reduction resulted in a 5.3% increase in OEE. This work was co-sponsored by On-Line Technologies, Wacker, Applied Materials, and MIT.

Cost of ownership was another, somewhat related, target for advanced control. The implementation of an endpoint detector on an oxide etch process eliminated two process steps, resulting in savings of $10 per wafer, according to Steve Gunther of CETAC. Christopher Bode of AMD presented data on a run-to-run controller for photolithography overlay that eliminated test wafers and tool qualifications, two major steps in cost reduction.

Applied Materials has taken a unique approach to AEC/APC by using sensors already integrated into its tools. Lalitha Balasubramhanya showed how the pressure-control throttle-valve position (a parameter already available on most vacuum systems) can be used as an endpoint detector for a chamber etch. Terry Reiss, also of Applied Materials, presented an overall equipment index, composed of existing tool parameters, that has successfully detected numerous faults.

Attendees involved in CMP were quite interested in a paper by Todd Cerni of Particle Measuring Systems describing an on-line, continuous process monitor of slurry particle size. Initial data showed promise that this technique might prevent scratching caused by larger particles in a slurry.

While all this talk of advances in APC seems to indicate that model-based, real-time control, complete with advanced sensors and integrated metrology, must be right around the corner, behind-the-scenes discussions did not paint so rosy a future. While chipmakers such as IBM, TI, and AMD are making tremendous strides in APC integration, they have not been so eager to share the details of their technology with others. In some cases, they are finding that these controls can give them an important edge in manufacturing. This makes it difficult for the OEMs to make much progress on integrating the new technology effectively into tools.

Only a few OEMs say they are seeing high demand from fabs for APC or even sensor integration, and when there is a desire for these enhancements, there is little willingness to pay their true cost. One example was an offer of $15,000 for sensor technology that the OEM said would actually cost $60,000 or more to add to the tool.

Another stumbling block is communications standards. While DeviceNet is emerging as a predominant standard, some of the major players (including IBM) are seriously considering an Ethernet/TCP/IP-based network. In furnace processing, network protocols have been rejected in favor of simple 0-to-5-volt analog signals for feedback control. In spite of the best efforts of Sematech, it appears a true communications standard will continue to elude the community for a few years yet.

Many of the sensor manufacturers are hearing of the need for their technology, but they aren't getting many orders. They attribute this partly to the recent weakness in the industry. They understand that OEMs don't want to add cost to their equipment without significant demand, particularly in a down market. But another major reason for the slow adoption of APC and sensor integration may be that the engineers making the equipment decisions don't know about the options available to them. Of the 40 papers presented at last year's AEC/APC conference, only a handful made it into print outside of the symposium.

Some 250 people attended the little-known meeting in Colorado this year, and some of the papers reported important progress. Organizers say the conference is set to expand substantially next year.

This story is adapted from WaferNews, Solid State Technology's weekly news briefing for the semiconductor equipment and materials community. For more information, or to request a sample issue, see www.wafernews.com.

Lise Laurin began her career as a process engineer at Intel and has held a number of positions in semiconductor processing and marketing over the last 18 years. She founded Clear Tech in 1996 to provide technical marketing and consulting services to the semiconductor supplier community. She holds a Bachelor of Science degree in Physics from Yale University, and is an active member of the Semiconductor Safety Association and the SEMI New England Committee.

Richard Kittler, Weidong Wang, Yield Dynamics, Inc., Santa Clara, California


The complexity of semiconductor manufacturing and its wealth of data provide an ideal environment for the application of data mining to discover patterns of behavior leading to knowledge. The demands are, however, such that data-mining technology needs to be extended further to achieve its full potential. Companies successful in developing and harnessing this technology will be rewarded with powerful new capabilities for diagnosing process control, yield, and equipment problems.

Each generation of semiconductor process technology demands more detailed data to support process monitoring and improvement. This is being driven by additional process steps and by new forms of metrology and flexibility in sampling plans. In turn, the wealth of data drives storage, integration, analysis, and automation trends in wafer processing.

Data storage. Islands of engineering data, generated and stored at the tool or cell level, are beginning to be integrated into repositories for analysis packages. These repositories make possible new types of queries that request data from across regions of the process. Thus, new sources of process variation can be traced to their root cause, leading to quicker yield ramps.

Data integration. To access data repositories, integration must allow analysis tools to pull data from across the network. This can be done via a configurable data integration language or forced consolidation of all data into a proprietary in-house or third party database. Both techniques will remain in use until specific standards are developed to allow plug-and-play among commercial data storage, access, and analysis components.


Data analysis. Analysis tools range from those provided on process tools to domain-specific applications in statistical process control (SPC), yield improvement, and equipment maintenance to ad hoc applications with third-party software packages (e.g., MS Excel, SAS JMP, and SPLUS). Many large IC manufacturers have built in-house domain-specific applications because comprehensive commercial products were limited.

Data analysis automation. Data volume and the number of relationships being monitored dictate a need for improved automation of line and yield monitoring. It is no longer sufficient to generate volumes of trend charts; charts must be analyzed automatically for exceptions and action must be taken if anomalies are found.
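As a minimal sketch of the kind of automated exception checking described here, the monitor below applies two standard control-chart rules to each new point. The rules, data, and limits are illustrative assumptions, not any particular fab's monitoring scheme.

import numpy as np

def chart_exceptions(values, center, sigma):
    """Flag two common control-chart rule violations on a trend chart."""
    values = np.asarray(values, dtype=float)
    flags = []
    for i, v in enumerate(values):
        # Rule 1: a single point beyond the 3-sigma control limits.
        if abs(v - center) > 3 * sigma:
            flags.append((i, "beyond 3-sigma limit"))
        # Rule 2: eight consecutive points on one side of the center line.
        if i >= 7:
            window = values[i - 7:i + 1] - center
            if np.all(window > 0) or np.all(window < 0):
                flags.append((i, "8 points on one side of center"))
    return flags

# Example: a drifting film-thickness trend (numbers are made up).
readings = [100.1, 99.8, 100.3, 100.9, 101.2, 101.5, 101.8, 102.0, 102.4, 103.8]
for idx, rule in chart_exceptions(readings, center=100.0, sigma=1.0):
    print(f"point {idx}: {rule}")

In an automated system, a hit on either rule would open an exception and route it to the appropriate action plan rather than waiting for someone to scan the chart.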

Users and suppliers of manufacturing execution systems (MES) tend to look at these problems from the top down; equipment suppliers from the bottom up. MES suppliers must create and maintain such capabilities at the factory level. Tool suppliers see market potential in increasing the communications capabilities of tools and packaging localized automation solutions within a work cell. Automated cells will slowly become more prevalent, similar to the evolution of multichamber tools. This will drive the need for cell management systems that treat the cell as a virtual tool. SPC used within the cell will need to be comprehensive, and the cell will need to be capable of tracing local process and defect related SPC failures to root causes. Eventually, the cell will need to communicate with neighboring cells to elicit information needed to maintain local process targets and anticipate tool wear and breakdown before they have an impact on the process.

Such automated systems will require infrastructure and behavioral models for their various components. The infrastructure is emerging through Sematech's work on a CIM Framework [1]. Behavioral models are already in use at the cell level for such applications as run-to-run process control of overlay, critical dimensions (CDs), and chemical mechanical planarization (CMP) uniformity [2].

Developing control models involves tedious manual steps: gathering data, cleaning it, and repeatedly exploring it until an analytical model can be established. Techniques — specifically involving data-mining technologies (see “Data mining in brief,” page 48) — are evolving, however, that allow some such modeling to be derived, or at least revised, automatically. The promise of such methods is that automated systems guiding tools, cells, and eventually perhaps even factories, will be able to learn and adapt their own control schemes to best achieve high-level directives. We believe that data mining is a key component of the technologies needed to achieve quicker yield ramps, fewer yield busts, higher capacities, and higher productivity metrics.

Current uses of data mining

Data mining is being used extensively in the retail, finance, security, medicine, and insurance industries. Many of these applications were built originally as expert systems. Today, a subset of such systems can be built automatically using data-mining technology to derive rules from contextual historical data. This has opened up new vistas of modeling heretofore deemed impractical because of the volumes of data involved and the transient nature of the models, especially in retail.

Some applications in these industries have analogies in manufacturing. As in the medical industry, for example, use of data mining in manufacturing requires that the derived models have physical understanding associated with their components. It is not enough to predict system behavior, since the underlying root cause of the phenomena usually needs to be identified to drive corrective actions. Also, as with medicine, most applications in manufacturing are for diagnostic systems. Manufacturing groups would like to know the circumstances under which they will encounter certain types of process control excursions, equipment events, and final quality levels. Having a system that can crunch historical data and establish related models improves response time when such events occur and follow-up is needed.

Caveats for manufacturing

Statistical significance does not always imply causality. Given enough variables, one or more will show up as significant regardless of the question asked. In other cases, the answer to a problem may be in the data, but the analytical methods are too narrow to uncover it. Both types of errors are more easily tolerated in off-line than in real-time systems.

Avoidance of false signals requires ongoing work to improve models by better “cleaning” of the data, linking to previous knowledge, or including certain variables only if certain conditions are true. Today, statisticians or engineers carry out this weeding-out process with the benefit of considerable domain knowledge beyond that contained in the dataset being analyzed.

When data mining misses the correct model, it is usually the result of the narrowness of the algorithms used. For instance, the data-mining algorithm may not be robust to “outliers” and therefore be thrown off-track by “dirty” data. Or, in the decision tree method, models at each node may not be complete enough to catch a behavior (e.g., cyclic vs. linear behaviors).

Data mining can also encounter problems when there are correlations among the variables used in the modeling. Such is often the case when the model involves variables related to the same region of the process. In these cases, the data-mining routine might choose to use one of the correlated variables based on its analytical criteria, even though a statistician or engineer would deem it to have no causal relationship and wonder why the proper variable was not selected. Data-mining tools that offer features for engineers to interact with the analysis and override variable selections help alleviate such black-box limitations.

Given the variety of caveats, the use of data mining in manufacturing is just beginning. The current interactive use will need to reach a level of maturity before being followed by embedding in real-time applications. To be viable for real-time use, data-mining engines must be applied to well-understood phenomena so that output is meaningful and trusted. To do so may require new forms of boundary conditions to define the limits for the solution space. Such rules define what is required of the process outside of the immediate context and what knowledge has been gained outside of the immediate dataset. Today such knowledge is referenced and used through participation of real engineers in the engineering change process.

Applications to wafer processing

In semiconductor manufacturing, possible uses for data mining include process and tool control, yield improvement, and equipment maintenance — analyzing historical data for patterns to derive physically meaningful models for predicting future behavior.

Process control is ripe for use of data mining because of the complexities of problem-solving processes and the richness of the data sources that need to be brought together. SPC is usually supplemented with troubleshooting guides, known as out-of-control action plans (OCAPs). An OCAP is derived from protocol and engineering experience in solving previous problems. Data mining can assist in solving new problems by looking for commonalities in the processing history of previous occurrences, for example, diagnosing excursions of in-line defect metrology and wafer electrical test (WET) data.

In-line defect metrology data is fraught with analytical difficulties. Sophisticated kill-ratio schemes have been derived by correlating with sort data to assist in prioritizing the many out-of-control signals obtained from application of standard SPC methods to random defect data. Once the decision is made to work on a given defect type, analytical information from defect metrology, context, and history of the affected product, and other data, are assimilated by knowledgeable engineers to suggest and eliminate possible causes. This proceeds until the defect goes away or is traced to a root cause.

Given access to this same data, it should be possible to develop a data-mining application to assist in this complex task. Previous results would be embodied in an expert system against which new results would be compared. With a new failure, the system would interrogate other automation components and metrology tools to gather the data needed to suggest a likely cause.

Another possible application is diagnosis of excursion failures at WET. When lots fail at WET, the engineer responsible for disposition must assimilate the pattern of failing tests and either recognize the problem from previous experience or begin an analysis of process history to find the root cause. In the latter case, knowledge of the nature of the failure narrows the process steps to be investigated. Here, a data-mining application could encapsulate relationships between failing tests and the process flow, together with a methodology for gathering data on failing lots and analyzing it for commonalities.

Advanced process control applications close the loop. Process drifts are not only detected, but corrected by feeding process corrections either forward or backward in the process flow. Run-to-run feedback control applications have become critical to achieving production-worthy processes for CD, overlay, and CMP uniformity [2]. Such model-based process control applications require well-founded models for factors that cause drift in the response, tight integration with the factory control system, and an appropriate control algorithm.
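The article does not spell out a particular algorithm, so the following is only a sketch of one common form of run-to-run control: an EWMA estimate of the process disturbance fed back into the next recipe setting. The process gain, EWMA weight, and the CMP polish-time example are assumptions.

class EwmaRunToRunController:
    """Simple EWMA run-to-run controller for a linear process model."""

    def __init__(self, target, gain, lam=0.3):
        self.target = target    # desired response, e.g. nanometers removed by CMP
        self.gain = gain        # assumed process gain: response per unit of setting
        self.lam = lam          # EWMA weight on the newest observation
        self.offset = 0.0       # current estimate of the process disturbance

    def next_setting(self):
        # Choose the recipe setting that cancels the estimated disturbance.
        return (self.target - self.offset) / self.gain

    def update(self, setting, measured):
        # Disturbance estimate = measurement minus what the linear model predicted.
        residual = measured - self.gain * setting
        self.offset = self.lam * residual + (1 - self.lam) * self.offset

# Example: polish-time control toward 500 nm removal on a slowly drifting tool.
ctl = EwmaRunToRunController(target=500.0, gain=10.0)  # assume 10 nm removed per second
drift = 0.0
for run in range(5):
    t = ctl.next_setting()
    drift += 3.0                       # simulated tool drift of 3 nm per run
    measured = 10.0 * t + drift
    ctl.update(t, measured)
    print(f"run {run}: polish time {t:.2f} s, measured {measured:.1f} nm")

The same loop structure applies to overlay or CD control; analyzing the residuals it leaves behind, as described below, is where data mining can refine the model.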

The Sematech APC Framework [3] is the foundation for an infrastructure to build such applications. Once this infrastructure is in place, the process model is crucial for achieving success. Modeling needs to take into account the primary factors influencing the response and secondary factors that determine under what sets of conditions the model shall remain valid. For the most part, statisticians and highly knowledgeable engineers have derived these factors through exploratory data analysis. Data mining offers the opportunity to increase the productivity of the tasks leading to the initial model as well as the means to improve it once in place. For instance, the residual errors from a run-to-run overlay control system could be analyzed to look for patterns that would allow the model to be improved further.

Equipment predictions

On-board tool control systems manage the tool as a self-contained electromechanical system. With the challenge of each succeeding generation of wafer processing technology, these systems have become increasingly complex. This complexity demands sophisticated control systems that monitor and react to problems. During the course of development, extensive data is collected on the failure modes of each subsystem as well as that of the integrated tool.

Data mining has potential application in building models based on historical failure data to detect the precursors of failures. This would permit systems to increase their capabilities for self-repair or graceful shutdown before compromising the process.

Yield management

Yield management covers a diverse set of tasks in both front-end and back-end processing. Yield ramps on new technologies and yield busts on existing technologies demand powerful routines to profile yield loss and trace it to root causes. These types of correlation activities can be tedious and time-consuming because of the time it takes to consolidate data for analysis. Sometimes, it is still a matter of trial and error before a signal is found.

Although various companies have developed automated search routines based on traditional methods like ANOVA and regression, these are of limited use when an interaction exists between two tools or when time is a factor. In such cases, data mining has potential to supplement existing techniques. This is especially true when the data-mining tool is embedded within an exploratory data-analysis environment and can be integrated with existing automation methods.

Macros could be developed to search for certain types of phenomena on a daily or weekly basis and if found, to run data mining to pinpoint the source of the process variation. Similar to an in-line process control application, an example of this might be a monitor that looks at the level of a certain failing bin. When the level exceeds a threshold, the monitor would trigger a data-mining analysis of the last 50 lots to determine the cause of the high counts. This type of system could be extended to trigger data-mining analysis when certain spatial patterns are detected on the wafers.
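As a sketch of such a monitor, the fragment below checks a failing bin against a threshold and, when it trips, performs a simple commonality analysis over the last 50 lots. The DataFrame layout, column names (bin7_count, tool_at_*), and threshold are hypothetical.

import pandas as pd

def check_bin_monitor(lot_history, bin_col="bin7_count", threshold=150, window=50):
    """If the failing bin is elevated, rank tools by average bin count."""
    recent = lot_history.tail(window)           # the last `window` lots processed
    if recent[bin_col].mean() <= threshold:
        return None                             # bin level is normal; nothing to mine
    suspects = {}
    for step_col in [c for c in recent.columns if c.startswith("tool_at_")]:
        by_tool = recent.groupby(step_col)[bin_col].mean().sort_values(ascending=False)
        suspects[step_col] = (by_tool.index[0], round(float(by_tool.iloc[0]), 1))
    return suspects

# lot_history would hold one row per lot, e.g. columns:
# lot_id, bin7_count, tool_at_etch, tool_at_dep, tool_at_litho, ...

A real implementation would hand the flagged lots to a data-mining engine rather than a simple group-by, but the trigger logic is the same.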

Fault detection, classification

Fault-detection systems [4] supplement those built within process tools to monitor trace data from tool sensors and look for anomalies. They allow IC manufacturers to customize patterns being monitored and associated action plans.

Data mining is useful in correlating the presence and severity of such faults to downstream quality metrics. This provides an ability to prioritize the response to detected faults. As with excursions detected in defect metrology data, some signals from fault detection on trace data are cosmetic, without a detrimental impact on the wafer; others are indicators of trouble.

Equipment maintenance

Equipment maintenance involves diagnosis and repair of failures, and preventative maintenance (PM). Diagnosis of a tool failure can be a difficult task requiring several pieces of information to be assimilated as clues until a consistent picture can be drawn and the faults isolated. Here again, data mining can make use of historical data to derive expert system-like diagnostic rules to suggest next steps and allow faults to be isolated more efficiently.

PM anticipates the effects of wear and tear so that parts can be replaced before they fail and compromise the process. PM frequencies and special PMs are often dependent on the processes and recipe changes being performed. For instance, there may be interactions between recipe changes and the process performance immediately after a change. These types of effects are difficult for a semiconductor equipment supplier to anticipate and test. They can have major effects on a manufacturer's success with a tool, however.

Data mining would provide the opportunity to look for such interactions between recipes, as well as establish correlation between the number of lots processed since the last PM and downstream quality measures. Once such models are established they can be used to customize PM procedures and frequencies to improve process performance under given load conditions.

Conclusion and future trends

Although the need for data-mining tools is great, and their value has been proven in other industries, data-mining capabilities needed for semiconductor manufacturing are just beginning to be developed. As with applications in medicine, potential manufacturing applications are hard to picture without involving humans. Current practice necessarily involves an engineer or statistician to guide, interpret, and evaluate the results of data mining.

Today, data mining remains a complementary tool to more traditional statistical and graphical methods for exploratory data analysis and model building. Automated results still need to be reviewed and interpreted prior to direct application in the form of a process change or a new control algorithm for a tool.

Increasingly, however, methods will be developed to reduce these limitations by increasing the breadth of models that can be developed and by reducing the frequency of false signals. Eventually it will be possible to couple data mining to repositories of knowledge as well as data. This will lead to self-optimizing systems of increasing complexity as witnessed by the emerging trend of new offerings for work cell management from the major tool suppliers. This trend will continue and lead to larger centers of automation that will rely on data mining to anticipate and recover from potential problems.

The rate at which this vision is achieved will hinge on the development and adoption of standards for data storage and access between and within tools. Such standards will also need to be extended to the more difficult topic of knowledge storage. Scripting languages will need to be developed or enhanced to manage the complex interactions among tool, cell, and factory control systems and data-mining engines.

Data-mining technology presents an opportunity to increase significantly the rate at which the volumes of data generated on the manufacturing process can be turned into information. Truly, its time has come!

Acknowledgments

Special thanks to Li-Sue Chen, Bob Anderson, and Jon Buckheit of Yield Dynamics for their help in preparing this article.

References

  1. CIM Framework Specification, Vol. 2.0, Sematech document #93061697J-ENG, 1998.
  2. Semi/Sematech AEC/APC Symposium X Proceedings, October 11-16, 1998.
  3. APCFI proposal and summary, Sematech document #96093181A-ENG, 1996.
  4. P.J. O’Sullivan, “Using UPM for Real-Time Multivariate Modeling of Semiconductor Manufacturing Equipment,” Semi/Sematech AEC/APC Workshop VII, November 5-8, 1995.


Richard Kittler received his PhD in solid-state physics from the University of California at Berkeley. He later joined AMD, where he led the development of internal yield management systems. Kittler is currently VP of product development at Yield Dynamics Inc., 2855 Kifer Rd., Santa Clara, CA 95051; ph 408/330-9320, fax 408/330-9326, e-mail [email protected].


Weidong Wang received his PhD in statistics from Stanford University. He is currently a member of the technical staff at Yield Dynamics Inc.; e-mail [email protected].


Data mining in brief

Data-mining methodologies find hidden patterns in large sets of data to help explain the behavior of one or more response variables. Unlike traditional statistical methods, there is no preconceived model to test; a model is sought using a pre-set range of explanatory variables that may have a variety of different data types and include “outliers” and missing data. Some variables may be highly correlated, and the underlying relationships may be nonlinear and include interactions.

Some data-mining techniques involve the use of traditional statistics, others more exotic techniques such as neural nets, association rules, Bayesian networks, and decision trees.

Neural nets model data in a way loosely analogous to the learning patterns of the human brain. The model is a network of nodes on input and output layers separated by one or more hidden layers. Each input and output layer node is associated with a variable in the dataset and has connections to all nodes in adjacent layers. Response functions on each hidden-layer node determine how a signal from the input direction is propagated to nodes in the output direction. Adjusting the weights on the hidden-layer nodes trains the network to minimize error in the outputs across a set of training data. The trained network can then be used to make predictions for new data, an approach called supervised learning.
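Since the article names no particular software, the sketch below uses scikit-learn's MLPRegressor as one concrete example of the supervised learning just described: a single hidden layer trained on synthetic, purely illustrative data.

import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 3))                    # e.g. temperature, pressure, time
y = 2.0 * X[:, 0] + np.sin(3 * X[:, 1]) + 0.1 * rng.normal(size=200)

net = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)
net.fit(X[:150], y[:150])                         # train on the first 150 runs
print("holdout R^2:", round(net.score(X[150:], y[150:]), 3))

The fitted weights have no direct physical meaning, which is exactly the interpretability limitation noted below.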

Under certain conditions a model with a single hidden layer is equivalent to multiple linear regression. Neural net models are useful when large amounts of data need to be modeled and a physical model is not known well enough to use statistical methods. But with this approach it is difficult to make a physical interpretation of model parameters. Also, predicted outcomes of the model are limited to the scope of the training set used. Because they are not able to discover new relationships in the data, neural nets are not true data mining.

Bayesian networks build a network of relationships using a training dataset where the weights on the links between nodes are constructed from conditional probability distributions. The networks can be built interactively or by searching a database. Though Bayesian networks are more physical than neural networks because their nodes are measured variables, it is still difficult to extract elements of a physical model from the network or to effectively visualize the relationships embodied in it.

Association rules look for patterns of coincidence in data (e.g., how often failing lots go through various deposition and etch combinations). The simplest form is the same as a contingency table; advanced forms account for the order of events (e.g., how often failing lots go through two reworks at first metal and are then etched on a certain etcher). Association-rule analysis discovers patterns of behavior, but does not produce a predictive model.
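A minimal sketch of the simplest, contingency-table form follows; the tool names, lot data, and failure flag are made up for illustration.

import pandas as pd

lots = pd.DataFrame({
    "dep_tool":  ["DEP1", "DEP1", "DEP2", "DEP2", "DEP1", "DEP2"],
    "etch_tool": ["ETCH3", "ETCH4", "ETCH3", "ETCH3", "ETCH3", "ETCH4"],
    "failed":    [1, 0, 1, 1, 1, 0],
})

combo = lots.groupby(["dep_tool", "etch_tool"])["failed"].agg(["count", "sum"])
combo["support"] = combo["count"] / len(lots)         # how common the combination is
combo["confidence"] = combo["sum"] / combo["count"]   # fraction of those lots that failed
print(combo.sort_values("confidence", ascending=False))

As the text notes, this identifies suspicious combinations but does not by itself predict how a new lot will behave.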

Decision trees are tools for developing hierarchical models of behavior. The tree is built by iteratively evaluating which variable explains the most variability of the response based on the best rule involving the variable. Classes of rules include linear models, binary partitions, and classification groups. For example, the binary partition “if e-test variable T1234 < 1.23 x 10^-5” would partition data into two groups — true and false. The root node of a tree is a rule that explains the most variability of the response. Child nodes find other variables that explain the most variability of the data subsetted by the first node. The process stops at a point of diminishing returns, similar to automated step-wise regression. Decision trees are useful when relationships are not known and broad categorical classifications or predictions are needed. They are less useful for precise predictions for a continuous variable.
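The sketch below illustrates the idea with scikit-learn's decision tree on synthetic data built so that the made-up e-test variable T1234 drives the response; the feature names and threshold echo the example in the text but are otherwise invented.

import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3)) * 1e-5               # fake e-test variables T1234..T1236
yield_pct = np.where(X[:, 0] < 1.23e-5, 80.0, 92.0) + rng.normal(scale=2.0, size=300)

tree = DecisionTreeRegressor(max_depth=2, min_samples_leaf=20)
tree.fit(X, yield_pct)
print(export_text(tree, feature_names=["T1234", "T1235", "T1236"]))

The printed tree reads as a hierarchy of rules of the form "if T1234 <= ...", which is what makes this method comparatively easy to interpret physically.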

Bayesian networks and decision trees offer the most power in detecting hidden trends in data, in being most physical, and in offering the predictive capability needed to understand patterns of behavior. Both can discover new relationships in data, hence, both are capable of data mining. Of the two, decision trees are easier to interpret physically and to visualize behaviors embodied in their models.

Knowledge-based systems

Knowledge-based systems, such as expert systems, signature analysis, and classification trees, encapsulate knowledge capable of being derived either whole or in part by data mining.

Expert systems create hierarchical knowledge systems given a set of rules. They guide a user through a decision making or diagnostic process. Many have been built following in-depth interviews with experts. Data mining provides another method of deriving rules for expert systems.

Signature analysis is specifically designed to assimilate clues associated with diagnostic data to fingerprint a process failure. Data mining can be used to discover patterns that associate a given failure with a set of process conditions. Once associations are known, they can be applied to new data through signature analysis to implicate likely process conditions that led to the failure. Classification trees are a special case of data mining when the response variable is categorical. They can be built with or without use of data-mining technology if the knowledge can be obtained through other means.

Requirements for data mining

Data mining requires data availability, efficient access methods, robustness to data problems, efficient algorithms, a high performance application server, and flexibility in delivering results.

The sensitivity of various data-mining methods to data problems must be considered when choosing the product or method for an application. Data problems of concern are those due to gaps and measurements beyond the range of normal observation — so-called outliers. Though data-mining methods have robustness to missing data, results will be improved when gaps can be avoided; but outliers are present in most real-world data sets. Data-mining algorithms that use nonparametric methods (i.e., those that do not rely on normality of the underlying distribution) are less sensitive to outliers. “Cleaning” data prior to analysis can also avoid false signals due to outliers.
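As one concrete (and assumed, not prescribed) form of nonparametric cleaning, the helper below drops points that sit far from the median in units of the scaled median absolute deviation, which is far less sensitive to the outliers themselves than a mean-and-sigma rule.

import numpy as np

def clean_outliers(x, k=5.0):
    """Drop points more than k scaled MADs from the median."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med)) * 1.4826   # scaling makes MAD comparable to sigma
    if mad == 0:
        return x
    return x[np.abs(x - med) <= k * mad]

raw = [1.01, 0.99, 1.02, 1.00, 0.98, 7.5]       # one gross measurement error
print(clean_outliers(raw))                      # the 7.5 reading is removed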

Performance of hardware and software in a data-mining system is important when large amounts of data are being mined or response time is critical. Simple data-mining algorithms will by nature be more efficient, but less capable of finding patterns outside a narrow range of behaviors. As the algorithms are made more complex (e.g., evaluate more forms of models at the nodes of a decision tree), efficiency decreases. Once algorithms are chosen, hardware must be sized to deliver the analyses within the required time. For interactive use, it may be sufficient to deliver output to a diagram together with the ability to interact with the diagram to visualize underlying graphical relationships. In other cases, it may be necessary to deliver output to the web. In embedded applications, the output needs to be fed to other systems that can act on the results or message subsystems to perform corrective actions.

Bob Mielke was the Institute's technical vice president of the Contamination Control Division, completing a two-year term that ended on July 1. A senior metrology engineer for Abbott Laboratories, he has been involved in the Institute's standards and practices program since its inception in 1982. He was chairman of Standards and Practices for the Contamination Control Division from 1987 to 1995 and is also the secretary for ISO TC209, the International Organization for Standardization's technical committee for establishing global cleanroom standards.

Q: How did the original cleanroom standards evolve?

A: The original standards were written starting in the late 1950s by corporations. Corporation A might be a contractor for three different corporations, each one having its own cleanroom, or “white room,” specifications. They would have to try to conform to all three standards, and that wasn't good. So the Air Force came out with a standard in 1961, U.S. Air Force T.O. 00-25-203. Soon thereafter, in January 1962, the first laminar-flow cleanroom was announced by Sandia National Labs. The GSA (General Services Administration) then contracted with Sandia to write a federal standard on cleanrooms, Fed-Std-209. That first one was put together in roughly a year to a year and a half.

Fed-Std-209 was a pretty comprehensive standard that set up classes plus testing methods in the appendices. Then a number of military and NASA standards came out about that same time. It was pretty active between 1962 and 1968. Then the standards activity went into a lull. A2C2 wrote about five recommended practices that were published as tentative Recommended Practices, but that's about the time that A2C2 kind of faltered. Then the Institute of Environmental Sciences came on the scene and Bob Peck took the initiative in putting standards back on the front burner. He and Gabe Danch (a former president of IES) contacted the GSA and told them that Federal Standard 209 was somewhat outdated. They got permission from GSA to redo it, and with that, the Recommended Practice Program on the contamination control side of the Institute took hold. The whole program took off then, including the rewrite of 209; by that time there had been versions A and B. The first version the Institute of Environmental Sciences gave to GSA was Version C, which was published by GSA on October 27, 1987. Soon thereafter, it was discovered that there were numerous errors in the document, and revision D was published on June 15, 1988. Subsequent to that, the Institute of Environmental Sciences went through Federal Standard 209D and issued another version that GSA published on September 11, 1992 — today's Federal Standard 209E.

Q: What is happening with European standards?

A: At the beginning of the 1990s, the European Union decided that all European nations would have to conform to one standard, be it for toasters, refrigerators, or cleanrooms. The European Committee for Standardization formed Technical Committee 243 (CEN/TC243). Although other countries had standards that were similar to Federal Standard 209, they were not exactly like it, so it became the de facto standard. We realized in the U.S. that it only made sense to get an international standard together. So the U.S., through the Institute of Environmental Sciences, made the effort to establish an ISO technical committee. Otherwise, we probably would have had a U.S. standard, a European standard, and a Japanese standard, and they'd all be different, and we'd all be having this constant battle. That's when the ISO activity began. The Institute of Environmental Sciences petitioned ANSI, which then petitioned ISO, and it was put to a ballot. Thirty nations responded, and twenty-nine voted in favor of the ISO proposal to establish the ISO committee. Nine nations offered to become participating members.

ISO designated a technical committee on cleanrooms (TC209) — a total coincidence in the numbering between that and Federal Standard 209. The first meeting was held in Chicago in November 1993, and there have been five subsequent meetings so far of ISO Technical Committee 209. To date, ISO has seven working groups encompassing the following topics: airborne contamination, biocontamination, metrology, design, operations, glossary, and minienvironments.