Comprehensive risk management for IC fabs
02/01/1998
Brian Sherin, Environmental & Occupational Risk Management Inc., San Jose, California
Due to several major recent fires in Taiwan, a Semiconductor Facility Accident Prevention Conference was held in Hsinchu on Dec. 4-5, 1997. This conference was sponsored by the Industrial Technology Research Institute (ITRI), the Taiwan Semiconductor Industry Association (TSIA), Taiwan's National Safety Council (NSC), and the Ministry of Economic Affairs (MOEA). This article discusses some of the issues presented at that conference.
October 1996. A fire at Winbond's newly completed 200-mm fab in Taiwan caused extensive damage and closed the facility. Fab III was in its pilot run stages, and scheduled to go into production by the first quarter of 1997. The total damage in the fab was initially estimated to be between US$80 and $100 million. Settlements with Winbond's insurance companies eventually reached $222 million; however, the company filed suit against its insurers seeking an additional $206 million. The fire was of unknown origin [1].
September 1997. A fire occurred at Chartered Semiconductor Manufacturing's Fab 2, causing "some delays in production" but no impact on production capacity. The fire occurred due to excessive silane flowing into a burn box. The burn box was located outside the fab and cleanroom, and no customers' wafers were affected. No loss estimates were available for damages or business interruption [2].
September 1997. United Integrated Circuits Corp.'s (UICC) new 200-mm wafer fab in Hsinchu, Taiwan, sustained heavy damage from a fire. According to a Bloomberg News report, the UICC fab will be shut down until at least 1999, a move that could cost the company more than NT$15 billion (US$470 million). The cause of the fire has still not been released. UICC's wafer fab, a joint-venture subsidiary of United Microelectronics Corporation (UMC), had begun volume production this summer, processing wafers with 0.35-µm technology [3].
November 1997. Taiwan's semiconductor industry experienced another wafer fab fire, the third such accident in 13 months. Hsinchu-based Advanced Microelectronic Products Inc. (AMPI), a small IC company whose investors include the wife of Taiwan's president, reported that a fire caused up to $66 million worth of damage to the company facility [4]. (According to sources in Taiwan, this figure has been adjusted downward to $10 million.)
Several major and numerous minor fires at Asian semiconductor manufacturing facilities raise questions about the overall safety of operations within the industry. Although there have been no injuries in these incidents, the total losses exceed $900 million, apparently with significant portions not covered by insurance companies.
Brand new wafer fabrication facilities now cost well over $1 billion to construct and fit-up, and projected costs for new 300-mm fabs are $3 billion. If fabs are not designed and operated correctly, the risk to capital investments is tremendous, and insurance companies may seriously question the feasibility of covering these facilities. Insurance concerns are understandable considering that capital investments are projected to ramp up from $41 billion to $91 billion between 1997 and 2001. Furthermore, all of the recently reported fires have occurred in the Asia-Pacific region (excluding Japan) - the region slated for the largest growth in capital investment.
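As a rough illustration of how steep the projected ramp is, the sketch below converts the two investment figures quoted above into an implied constant yearly growth rate. It assumes four annual compounding periods between 1997 and 2001; that framing, and the function name, are assumptions made for illustration and are not part of the cited projection.

```python
# Capital investment figures quoted in the article (US$ billions).
INVESTMENT_1997 = 41.0
INVESTMENT_2001 = 91.0
YEARS = 2001 - 1997  # assumption: four annual compounding periods


def compound_annual_growth_rate(start, end, years):
    """Constant yearly growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1.0 / years) - 1.0


cagr = compound_annual_growth_rate(INVESTMENT_1997, INVESTMENT_2001, YEARS)
print(f"Implied growth rate: {cagr:.1%} per year")
```

Under that assumption, the projection corresponds to growth of roughly 22% per year.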
Industry loss experiences
Factory Mutual Research Corporation (FMRC) compiled a comprehensive database of loss events in the industry over the past 20 years [5]. This database includes incidents involving losses due to fire, fluid leakage (e.g., chemical, cooling water, etc.), explosion, service interruption (e.g., loss of utilities), and other events that result in actual damage to the physical plant, equipment, product, or in business interruption.
From 1977 through November 1997, there were 407 reported incidents worldwide. For every incident that is reported, there are estimated to be five additional unreported cases; therefore, the total number of incidents is probably close to 2500 (407 × 6 ≈ 2440). From 1977 until 1992, there was a general upward trend in incidents (Fig. 1), while the number of cases appears to have declined over the past five years.
The likely reason for this downward trend is the cumulative effect of the major safety developments applied to the industry. The 1980s regulatory requirements, and the 1990s standard industry practices, began to control risks as older fabs and equipment were phased out or upgraded. FMRC's data shows that fire incidents (the major cause of loss) dropped to close to zero in 1994, but over the past three years have shown a significant rise. Half of these incidents occurred outside the US.
Figure 1. Total number of semiconductor facility loss incidents/year. Data from Factory Mutual Research Corp.
Although the number of losses over the past several years appears to have decreased, their cost has gone up greatly. From 1986 until 1990, the average cost for all incidents was $373,235. During the next five years, the average cost/incident jumped to $1,265,797, resulting in a 10-year average of $812,650. Factoring in the recent losses from the past two years, the average cost/incident is now over $4,000,000. Even excluding the catastrophic Winbond and UICC fires, the likely causes of this increase are the increasing complexity of the facilities and equipment and the cost of business interruption.
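The 10-year figure is not the simple mean of the two five-year averages, which suggests it is weighted by the number of incidents in each period. The sketch below works backward from the three quoted averages to the share of incidents that would have to fall in 1986-1990. The incident-weighting interpretation is an assumption, and the function name is illustrative only.

```python
# Average cost per incident quoted in the article (US$).
AVG_1986_1990 = 373_235
AVG_1991_1995 = 1_265_797
AVG_10_YEAR = 812_650


def implied_share_of_first_period(avg_a, avg_b, combined):
    """Fraction of incidents in period A implied by an incident-weighted average.

    Solves combined = w * avg_a + (1 - w) * avg_b for w.
    """
    return (avg_b - combined) / (avg_b - avg_a)


share = implied_share_of_first_period(AVG_1986_1990, AVG_1991_1995, AVG_10_YEAR)
simple_mean = (AVG_1986_1990 + AVG_1991_1995) / 2

print(f"Simple mean of the two periods: ${simple_mean:,.0f}")
print(f"Implied share of incidents in 1986-1990: {share:.1%}")
```

If the weighting assumption holds, slightly more than half of the incidents in the decade occurred in the earlier, cheaper period, which is consistent with the reported decline in incident counts.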
Figure 2. Breakdown of the types of semiconductor facility loss incidents between 1977 and 1997. Data from Factory Mutual Research Corp.
Figure 3. Breakdown of the causes of semiconductor facility fire incidents between 1977 and 1997. Data from Factory Mutual Research Corp.
In a separate report that tracked semiconductor loss experience from 1986-1995 within the Factory Mutual System [6], the dollar losses exceeded $211 million. The greatest dollar losses involved liquid leakage, fire, and service interruption (Fig. 2). More than half of the incidents involved fire or explosion, where the leading causes were flammable and/or pyrophoric gas releases (e.g., silane, hydrogen), electrical components, and process liquid heating system failures (Fig. 3). Fire losses reached a peak in 1984 (with 23 total) and dropped to a low of only one loss in 1994. However, there has been an increase over the last three years, with the number of cases reaching 11 this year.
With fluid leakage losses, the most common identified cause is defective or inadequate equipment (66%), followed by contractor error (17%), operator error (10%), and miscellaneous causes (7%). The single worst incident involving fluid leakage was a leak of 416 liters of hydrochloric acid that contaminated several thousand in-process wafers, and corroded several diffusion furnaces and other robotic equipment over an area of 464 m² [6]. This incident resulted in almost $12 million in property damage and in $20 million of business interruption. This one incident accounted for 42% of the total dollar losses involving liquid leakage from 1986-1995.
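The figures for this one incident allow a back-of-the-envelope estimate of total liquid-leakage losses in the report. The sketch below treats the quoted values as exact, although the text gives them as "almost $12 million" and "exceeded $211 million", so the results are approximations only. All variable names are illustrative.

```python
# Figures quoted in the article (US$ millions); treated as exact for this estimate.
PROPERTY_DAMAGE = 12.0         # "almost $12 million"
BUSINESS_INTERRUPTION = 20.0
SHARE_OF_LEAKAGE_LOSSES = 0.42
TOTAL_REPORTED_LOSSES = 211.0  # "exceeded $211 million"

incident_loss = PROPERTY_DAMAGE + BUSINESS_INTERRUPTION

# If one incident is 42% of all liquid-leakage losses, the total follows by division.
total_leakage_losses = incident_loss / SHARE_OF_LEAKAGE_LOSSES

# Approximate share of all reported losses attributable to liquid leakage.
leakage_share_of_all = total_leakage_losses / TOTAL_REPORTED_LOSSES

print(f"Single incident: ${incident_loss:.0f} million")
print(f"Estimated liquid-leakage total: ${total_leakage_losses:.0f} million")
print(f"Estimated share of all losses: {leakage_share_of_all:.0%}")
```

On these numbers, liquid leakage would account for roughly a third of the dollar losses in the report, with business interruption making up the larger part of the worst incident.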
Figure 4. Semiconductor process equipment involved in loss incidents between 1977 and 1997. Data from Factory Mutual Research Corp.
The equipment types most often involved in these losses have been process wet benches, followed by furnaces (Fig. 4). For incidents in which the process tool was specified, wet benches accounted for more of the losses than all other equipment combined.
Materials of construction
One of the leading reasons for these losses is the choice of the materials from which wet benches are constructed. Traditionally, process wet benches are manufactured from polypropylene (PP) because of its resistance to chemical corrosion and its relatively low cost. However, polypropylene is a thermoplastic with a fairly low ignition temperature and it can generate significant amounts of heat and smoke.
Combustion tests performed by FMRC show that flames from a 1.8-m, 273-kg PP wet bench can reach heights of 6.1 m and generate heat release rates of 11.5 megawatts within 15 minutes (heat release rates two to three times greater than ordinary combustibles such as paper or wood). The smoke produced is very dense, sooty, and black, causing extensive damage in cleanroom environments.
Thermoplastics also tend to melt and flow when heated, producing flaming drips and liquid pool fires that are especially hazardous in cleanroom environments with open-grille flooring and sub-fabs. A fire can rapidly spread to other parts of the facility through this route or through exhaust ducts, especially if the ducts are also constructed from thermoplastics.
Fire-resistant polypropylene (FRPP) has come into use over the last several years. Plastics in this family are impregnated with chemicals that make them more resistant to ignition. In addition, many of these materials meet the requirements of UL94 (Tests for Flammability of Plastic Materials for Parts in Devices and Appliances), which is specified in the current version of the SEMI S2-93 Environmental, Health, and Safety Guideline For Semiconductor Manufacturing Equipment. The flame retardants in FRPP do make the plastic somewhat more difficult to ignite; however, once it does ignite, it will continue to burn, and it releases much more smoke (which can be corrosive) than PP.
Another plastic material sometimes used is polyvinyl chloride (PVC), but it releases toxic gases such as hydrogen chloride (HCl) during combustion. HCl is extremely corrosive and can cause potentially severe damage to electronic equipment and metal surfaces. Newly developed tests can identify fire-safe materials that have low flame spread, smoke generation, and corrosive generation properties [7]. SEMI is currently drafting a new fire protection guideline, recommending the use of noncombustible materials, fire-safe materials, and/or fire detection and suppression, depending upon risk [8].
Safe facility design and construction
The US semiconductor industry has had a variety of "tools" at its disposal since the mid-1980s to reduce the risk of loss. In 1984, the Uniform Building and Fire Codes began to include specifications for the design and operation of semiconductor facilities with the addition of the H-6 occupancy category and Article 51. In 1987, there was a major re-write of the Uniform Fire Code requirements for hazardous materials (Article 80). In 1991, all of the jurisdictions in Santa Clara County, CA, enacted the Toxic Gas Ordinance. Most of these requirements focused on the prevention of hazardous production material (HPM) accidents and fires. These codes have administrative controls, engineering controls, and emergency response elements.
Administrative controls are procedural in nature and require that documentation be in place and appropriate training be performed. Examples of administrative control requirements include the following:
storage plans/hazardous materials inventory statements,
separation of incompatible materials,
security, and
placards and labeling.
Engineering controls are designed into the structure of the facility or include systems that either prevent an incident from occurring or minimize the damage should an incident occur. Examples of engineering control requirements include the following:
fire protection systems,
occupancy separation,
secondary containment,
gas cabinets,
ventilation and treatment systems,
detection and shut-off controls,
excess flow control and reduced flow orifices for HPM gases,
compatible process piping, and
use of noncombustible or fire-resistant materials of construction.
Emergency response controls are intended to respond rapidly to - and control - an emergency incident in the event that the administrative and engineering controls fail. Emergency response control requirements include the following:
fire access and water supply,
emergency alarms and emergency control stations,
trained on-site emergency response teams (ERT),
emergency response equipment, and
spill control, drainage and containment.
The requirements detailed in the fire and building codes are considered to be "minimum" codes, that is, these are the minimum requirements for compliance. These codes are primarily intended to protect lives (workers and emergency responders) and the environment. Although they may also provide some protection against business losses, that is not their main purpose. The control of risks through best engineering practices should be used to provide additional levels of protection.
Safety sign-off programs
Reducing the risk of incidents in a facility is the joint responsibility of the safety, facilities, risk-management, purchasing, manufacturing, and many other departments. The overall concept is that of "Design for Safety," which incorporates safety engineering and risk management into the entire timeline of the process of manufacturing semiconductor products (Fig. 5). Through this integration of environmental health and safety (EHS) into the fab's design and operation, the frequency and severity of incidents are reduced in a much more timely and cost-effective fashion (Fig. 6).
The overall process needs to start with education. All key players in the construction and fitting up of the facility (architects, planners, project managers, procurement, facilities engineers, process engineers, etc.) must be informed of the process and their responsibilities. Furthermore, clear direction from senior management defining the logic and reasons behind this type of program is essential to its success.
The process starts with the EHS input on the base design of the facility. Issues such as aisle widths, clearance distances around tools, fire protection and smoke control systems, and the location of emergency control systems are addressed. Incorporating EHS review during this phase will often reduce the number of cycles necessary for the design to receive permit approval from the local jurisdictions, thereby reducing construction delays.
Occupancy-specific checklists (e.g., fabrication areas, sub-fabs, chemical utility buildings) aid in this process. During the construction phase of the facility shell and its key support equipment, EHS evaluations should ensure that the designs have been converted into reality; otherwise there could again be regulatory delays or future operational problems.
SEMI S2-93, which is currently undergoing an extensive revision, is intended to "help eliminate known safety and health hazards inherent in the operation and maintenance of process equipment" [9]. Many semiconductor operations throughout the US (including all member companies of SEMATECH) have adopted SEMI S2 as the compliance guideline and the expectation for the acceptance of new semiconductor manufacturing equipment in their facilities. Many companies require that these reviews be performed by independent third parties. This overall risk management program requires that all equipment manufacturers that supply hazardous equipment (i.e., high voltage, hazardous materials, or ionizing radiation systems) provide a SEMI S2 compliance report. This report should be reviewed by the end-users (or their agents) to ensure that adequate engineering rationale has been used in accepting the design. In the event that items are not in full compliance, it is up to the end-user to determine whether the risk is acceptable [10].
Figure 5. Key steps in the semiconductor process and manufacturing flow where EHS considerations are necessary. (Reprinted with permission of EORM)
This evaluation is a critical step in the entire equipment safety program, because failure to catch issues prior to accepting delivery of the equipment makes corrective action more difficult and often more expensive, and delays production start-up. Since most major US (and some international) end-users require that S2 evaluations be performed, there should not be a significant additional charge by the equipment manufacturer for providing this necessary safety documentation (since the work will have had to be performed anyway).
The final phase in the equipment safety process is to ensure that the actual installation of the equipment is performed safely and that it is evaluated for safe operation. The core of this program is the Equipment Sign-Off team, consisting of the project manager, the process tool "owner," the equipment vendor, a facilities engineer, and an EHS professional.
The team should use a multistep approach to evaluate the tool for readiness; EORM uses a three-step approach. Also, core checklists that are specific to the type of process equipment (e.g., etchers, ion implanters, wet stations) identify key elements for evaluation. These core checklists are further customized by adding specific information gleaned from the SEMI S2 evaluations to ensure that they are tool-specific.
The first level of evaluation readies the tool for electrical power. Key safety elements such as hazardous energy isolation (i.e., "lock-out") devices, emergency shut-off buttons, checks of voltage and phases, and proper earth grounding are evaluated. This step allows for the tool to be safely energized, the robotics calibrated, and the software loaded, and prepares for the subsequent hook-up steps.
The second level verifies that all internal and external safety systems and facilities are in place for the tool to receive HPMs safely. This includes elements such as exhaust service (velocity measurement, flow patterns), drainage systems (proper connections, pressure tests, line labeling), liquid leak detection, verification of gas (toxic and flammable) monitoring system annunciation, implementation of emergency response capabilities, development of an interlock matrix and verification testing of critical safety interlocks, integrity testing and verification of labeling for HPM delivery plumbing, confirmation of life safety system interfaces, and verification of seismic restraints. This step allows for the introduction of HPMs and additional physical hazards for the purpose of process qualification.
The third and last level of assessment allows for production acceptance. This includes a review of operating and maintenance specifications to ensure that critical EHS information is included, such as procedures for the control of hazardous energies (lockout/tag-out) and required personal protective equipment (PPE); any necessary pre-production industrial hygiene baseline exposure assessments for chemical and physical agents (e.g., RF, microwave, infrared, and ultraviolet radiation); and approval of all outstanding Level 1 and 2 punchlist items and supplemental information.
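The three-level sign-off described above behaves like a gated checklist: a tool cannot clear a level until every item at that level and at all lower levels is complete. The following is a minimal sketch of that gating logic only. The class, function, and checklist item names are hypothetical, the items are a small subset taken from the text, and the sketch does not represent EORM's actual checklists or procedure.

```python
from enum import IntEnum


class Level(IntEnum):
    ELECTRICAL_POWER = 1       # tool may be energized
    HPM_INTRODUCTION = 2       # tool may receive hazardous production materials
    PRODUCTION_ACCEPTANCE = 3  # tool may be released to production


def highest_level_cleared(checklist):
    """Return the highest level whose items, and all lower levels' items, are done.

    Returns 0 if Level 1 is incomplete. A level with no items is not cleared.
    """
    cleared = 0
    for level in Level:  # iterates in definition order: 1, 2, 3
        items = checklist.get(level)
        if not items or not all(items.values()):
            break
        cleared = int(level)
    return cleared


# Hypothetical checklist for one tool; item names paraphrase the article.
etcher = {
    Level.ELECTRICAL_POWER: {
        "lock-out devices in place": True,
        "emergency shut-off buttons verified": True,
        "voltage and phases checked": True,
        "earth grounding verified": True,
    },
    Level.HPM_INTRODUCTION: {
        "exhaust velocity measured": True,
        "liquid leak detection verified": True,
        "gas monitoring annunciation verified": True,
        "critical safety interlocks tested": False,
        "seismic restraints verified": True,
    },
    Level.PRODUCTION_ACCEPTANCE: {
        "operating and maintenance specifications reviewed": True,
        "industrial hygiene baseline completed": False,
        "Level 1 and 2 punchlist items closed": False,
    },
}

status_before = highest_level_cleared(etcher)
etcher[Level.HPM_INTRODUCTION]["critical safety interlocks tested"] = True
status_after = highest_level_cleared(etcher)
print(status_before, status_after)
```

Stopping at the first incomplete level mirrors the requirement that Level 3 approval depends on all outstanding Level 1 and 2 items being closed.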
Although design and procedures can reduce risk, situations may still occur in which emergency response is necessary. Another major concern for facilities in Asia is the apparent lack of infrastructure to deal with emergency situations. One of the factors identified as probably contributing to the severity of the losses at both the Winbond and UICC fires was the lack of rapid and efficient fire department response.
Figure 6. Facility design and equipment safety program overview. (Reprinted with permission of EORM)
The Hsinchu Science Based Industrial Park Fire Brigade is staffed with five personnel on duty. The private ERTs within the park lack equipment and adequate training and do not coordinate activities with the fire department. Many of the facilities within the Science Park lack adequate fire protection. The Hsinchu city Fire Department has approximately 80 personnel to protect a population of nearly 500,000. They are neither equipped for, nor trained in, hazardous materials response. Response times are poor because of the heavy traffic congestion that exists between Hsinchu and the Science Park. They do not have automatic or mutual aid agreements or train regularly with neighboring fire departments [11].
In contrast, the City of Palo Alto, California, has 122 personnel to protect a population of 80,000, which includes Stanford University and the Stanford Linear Accelerator Center. The department has nine personnel trained to the level of "HazMat specialists," with state-of-the-art equipment and a dedicated HazMat response vehicle. There is a county-wide mutual aid agreement that could bring four or five additional HazMat response units in a short period of time. Response time to facilities in Palo Alto is four minutes or less, 90% of the time.
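The staffing comparison can be put on a common scale by normalizing to fire department personnel per 10,000 residents. The sketch below uses the figures quoted above as if they were exact, although the Hsinchu numbers are given as approximate ("approximately 80", "nearly 500,000"). The function name is illustrative.

```python
def staff_per_10k(personnel, population):
    """Fire department personnel per 10,000 residents."""
    return personnel / population * 10_000


# Figures quoted in the article; the Hsinchu values are approximate.
hsinchu = staff_per_10k(80, 500_000)
palo_alto = staff_per_10k(122, 80_000)
ratio = palo_alto / hsinchu

print(f"Hsinchu:   {hsinchu:.2f} per 10,000 residents")
print(f"Palo Alto: {palo_alto:.2f} per 10,000 residents")
print(f"Ratio:     {ratio:.1f}x")
```

On these numbers, Palo Alto has between nine and ten times as many fire personnel per resident as Hsinchu, before differences in training and equipment are considered.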
As a result of the recent event at UICC, UMC pledged to provide several million dollars to equip and better train the local fire department. This, as well as improved ERT training, needs to be completed quickly, and such programs should also be put in place at the new Tainan Industrial Park and in Malaysia's KHTP.
Conclusion
If adequate engineering design and risk management systems are not implemented, future investments in increasingly complex and expensive facilities are at great risk, and the ability or desire of casualty insurance companies to cover this risk is in jeopardy. The greatest risk will be in Asia, primarily because there is no model of local jurisdictional regulatory influence comparable to that in the US.
In Taiwan alone, $67 billion will be spent over the next decade to construct new fabs - most at a new science park in the city of Tainan. Taiwanese companies plan the construction of four 300-mm fabs in the 1999/2000 timeframe [12]. Singapore also plans the creation of 10-12 fabs over the next 10 years, and Malaysia has plans for three new facilities on which construction will begin in 1998 in the Kulim High Technology Park (KHTP).
Structured design-for-safety and equipment-safety programs provide several direct and indirect benefits. First, all of the causes of fire and fluid leakage losses should be detectable and preventable using this process. Integration of these types of programs ensures that critical tests have occurred prior to start-up and that these tests have been documented. They provide historical (i.e., base-line) documentation for future reference to determine operational parameters at time of installation. For companies with multiple facilities, these programs ensure multisite consistency and a plan-of-record, thereby reducing the time to bring a new facility on-line.
Thirteen years ago, I started as an EHS specialist at a newly constructed fabrication facility where the EHS department was not involved in any of the aspects of the design or fit-up. This was typical throughout the industry at that time. The first 12 months of the operation of that facility were fraught with problems: chemical and water leaks and spills, fires, and the discovery of defective hazardous gas piping. That facility's initial operations were well below expected efficiencies.
By making these programs an integral part of the overall design-build project, safety can be achieved without adding a step. The result is decreased cycle time to start up the facility, increased production efficiency, and a greatly reduced risk of catastrophic losses and threats to personnel and the environment. The result is "faster, better, cheaper, and safer." With the latest fabs designed to produce 30,000-40,000 wafers/month, cutting even one week off of the time to start up the facility can translate into millions of dollars in increased production revenue, while at the same time protecting a capital investment of billions of dollars.
Acknowledgment
EORM is a registered trademark of Environmental & Occupational Risk Management Inc.
References
1. Semiconductor Business News, CMP Media Inc., 5/8/97 and 7/31/97.
2. BT Online, 10/7/97.
3. BT Online, 10/7/97, Semiconductor Business News, CMP Media Inc., 10/3/97, 10/6/97, 10/7/97, and Electronics Buyers News, CMP Media Inc. 11/20/97.
4. Semiconductor Business News, CMP Media Inc., 11/20/97.
5. Vincent DeGiorgio, Factory Mutual Engineering, Semiconductor Facility Accident Prevention Conference, Hsinchu, Taiwan, December 4-5, 1997.
6. Factory Mutual System, Engineering Loss Experience; 1986-1995 Semiconductor Manufacturing Plant Loss Experience, April 1996.
7. Factory Mutual Research Corp., FMRC Clean Room Materials Flammability Test Protocol (Class 4910).
8. SEMI Doc. 2697, draft rev.13, Safety Guideline for Fire Protection of Semiconductor Manufacturing Equipment.
9. SEMI S2-93: Environmental, Health, and Safety Guideline For Semiconductor Manufacturing Equipment, SEMI, Mountain View, CA.
10. SEMI has also published SEMI S10-1296: Safety Guidelines for Risk Assessment for evaluating risks of semiconductor equipment operations.
11. Chief Ruben Grijalva, Palo Alto Fire Department, personal communication, 12/10/97.
12. E. Korczynski, "SEMICON/Taiwan 97 sets new records," Solid State Technology, pp. 60-65, Nov. 1997.
BRIAN SHERIN received his BS degree in genetics from the University of California, Davis, and a master's degree in biology/toxicology from San Jose State University. He is a Certified Safety Professional (CSP). Sherin is CEO and a managing principal of Environmental & Occupational Risk Management Inc., an environmental, safety and health consulting firm. He has 16 years of experience in safety engineering and management, industrial hygiene, and environmental health and chemistry within the semiconductor, electronics, and chemical industries. He is a member of the Safety and Fire Protection Standards Committees for Semiconductor Equipment and Materials International. EORM Inc., 2460 N. First St., Suite 280, San Jose, CA 95131-1024; ph 800/648-1506, fax 408/436-1136.