The
07/01/2001
Henry Watts, Mark Johnston, PRI Automation, Billerica, Massachusetts
Overview
Changes in semiconductor manufacturing accompanying the move to 300mm wafers and the continual shrinkage of critical dimensions are placing new demands on essentially all fab systems. In particular, connections between the manufacturing execution system (MES), the transport system, the equipment integration systems, and the scheduler must become richer and more robust. What is finally being approached as a standard for manufacturing is the reduction or elimination of human fab staff for material movement, dispatching decisions and, in many ways, communication. The control systems, richly connected together, will increasingly make the actual decisions about next lot, processing flow, etc. In this environment there are few alternatives to a scheduling system that, in a visible and controllable manner, accurately translates manufacturing and business objectives into a plan of work.
Wafer fab equipment idle time can never be recaptured, so it is important to arrange fab activities to use key resources efficiently. One can ensure that there is enough work in process (WIP) that wafers are always waiting in front of all important equipment. However, this approach increases cycle time and has negative effects on the economic resources invested in WIP, on business responsiveness and, often, on fab yield. An optimizing scheduler can achieve short cycle times, low levels of WIP and high equipment utilization.
We have developed a scheduling software engine that provides comprehensive, fab-wide scheduling using Interval Logic Corp.'s Repair-Based approaches. In our work, we started with the question: "What constitutes a good schedule?" The answer required some precision and detail that was more difficult than might be expected because any fab is subject to conflicting goals.
A "good schedule" must...
Obviously, a fab schedule must arrange for as much work as possible during a period. To accomplish this, it must optimize wafer batching and minimize equipment setups. In addition, a good fab schedule must:
- assure lots with due dates move fast enough to ship on time,
- move higher priority lots faster, and
- support single-tool requirements at key layers.
Also, factories with automated material handling systems (AMHS) must schedule so transportation capacity is used efficiently. In most fabs, some operations can be carried out in more than one location. An efficient schedule may artfully choose to transport lots to alternate available equipment, but must balance this approach with the transportation system's capacity.
Additional objectives
Line balance is another key measure of schedule quality. To simultaneously achieve fast cycle times and optimal equipment utilization, WIP must be minimized and must either be kept actively in process or carefully staged. When there is just the right amount of material at each point and in front of each equipment type, the line is "in WIP balance." A factory in balance can produce more work in a period than one that is not. A schedule that does not attend to line balance may maximize activity temporarily, but make it impossible for the factory to perform well in subsequent periods.
One indirect measure of schedule quality is related to how fab personnel and computerized systems use a schedule. What is important, when schedules are regenerated to account for changing conditions in the fab, is that each subsequent schedule be, as much as possible, similar to the previous schedule.
For example, a scheduling system can indicate when specific reticles will be needed at specific steppers over 24 hrs. However, if a schedule produced at 9:00am is radically different from that produced at 7:00am, operators who have been using the first schedule may find they have wasted some work or may have to do other work very quickly to meet changing requirements.
A good fab schedule is therefore a plan that properly balances competing fab objectives. The objectives are well known, but the degree to which a given fab should emphasize one over another is a business decision. Imagine producing a proposed schedule for a wafer fab with a specific current WIP and a specific set of current objectives. Then, ask knowledgeable manufacturing professionals to evaluate the schedule. You would get a variety of responses, depending on how each elects to balance competing objectives.
Thus it is important for a user to have convenient methods of modifying weight given to various objectives and good metrics to see the effects of tradeoffs. In this way scheduler performance can be easily and quickly modified as business conditions and fab objectives change over time.
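One way to picture adjustable objective weights is as a weighted average of normalized schedule metrics. The sketch below is illustrative only: the metric names, the 0-to-1 scaling, and the `schedule_score` function are assumptions made for this example, not the scoring method used by the scheduling engine.

```python
def schedule_score(metrics, weights):
    """Weighted average of objective metrics (each assumed normalized to 0..1)."""
    total_weight = sum(weights.values())
    if total_weight == 0:
        raise ValueError("at least one objective needs a nonzero weight")
    return sum(weights[name] * metrics[name] for name in weights) / total_weight

# Two hypothetical trial schedules evaluated on the same three objectives.
trial_a = {"on_time_delivery": 0.9, "utilization": 0.6, "line_balance": 0.7}
trial_b = {"on_time_delivery": 0.7, "utilization": 0.9, "line_balance": 0.8}

# Two hypothetical weight settings a user might choose.
delivery_focused = {"on_time_delivery": 3.0, "utilization": 1.0, "line_balance": 1.0}
output_focused = {"on_time_delivery": 1.0, "utilization": 3.0, "line_balance": 1.0}

for label, weights in [("delivery", delivery_focused), ("output", output_focused)]:
    print(label, schedule_score(trial_a, weights), schedule_score(trial_b, weights))
```

With these made-up numbers, the delivery-focused weights favor trial A and the output-focused weights favor trial B, which is the kind of tradeoff a user would want to see before committing to a setting.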
Repair-Based scheduling
Repair-Based scheduling produces highly optimized schedules quickly and in response to flexible requirements. It begins with the presumption that all tasks can be performed exactly when and how a user desires. It then uses specific techniques for eliminating constraint violations (e.g., how many lots can be processed by a given tool, etc.) in ways that maintain or improve overall schedule quality.
Below we discuss data modeling requirements, and steps to integrate a scheduling system into an existing wafer-fab control system. We then review the process of defining an idealized schedule for each lot, considerations for initial resource assignment, the process of resolving constraints while maintaining and improving schedule quality, and what it takes to run such a system in real time in a wafer fab.
Modeling requirements
Producing an optimal schedule for a fab requires that the scheduling system have a relatively complete picture of fab operations and current status. Core information is derived from the MES and is routinely updated by MES changes. This information includes products, process flows, specific operations used at process flow steps, durables (e.g., reticles), current lots, types of lots, lot priorities, equipment units and types, valid equipment setups, definitions of physical areas and locations, calendars of working shifts and days, and maintenance schedules.
Figure 1. Underlying detailed data, setup-to-setup time, and difficulty, mediated by current user preferences, drive the scheduling decision process.
The MES contains much of this information in a complex or indirect form. Key processing values may have base-level defaults with available overrides depending on the current process flow, the current equipment ID, the product in question, or even a specific lot. Timing constraints may exist between specific pairs of processing steps. Choices of equipment may be mediated by defined equipment processing capabilities rather than just choosing from equipment units. Depending on the process flow, product, and lot, alternates may exist for process steps, equipment, recipes, and durables to be used. These alternates may be equally preferred or less preferred to some degree.
In many cases (e.g., PROMIS native procedure stack and SiView basic record scripts) processing logic is embedded in formal MES decision structures. Two modeling approaches may be used in such cases. Where the processing logic is resolvable in advance, it will be the presumed path of any lots on the affected flows. Where the processing logic will be resolved in real time, the default path will be used until such time as the data is available to revise the disposition.
Some data is not likely to be complete in the MES, including detailed attributes of equipment (e.g., lots processed in batches and how many at once, beginning a lot before a previous lot completes, timing values for first and subsequent wafers, etc.). Transportation matrices that define time and difficulty of getting from one location to another are also needed to support optimization goals.
Similarly, matrices are needed to define the time and effort to move a piece of equipment from one setup state to another (Fig. 1).
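A setup-to-setup matrix can be pictured as a lookup keyed by (from, to) setup states. The following is a minimal sketch under stated assumptions: the setup names, the minute values, and the `total_setup_minutes` helper are hypothetical, and only time (not difficulty) is modeled.

```python
# Hypothetical setup-to-setup matrix for one implanter, in minutes.
# Keys are (current_setup, required_setup); values are changeover times.
SETUP_MINUTES = {
    ("boron", "boron"): 0,
    ("boron", "arsenic"): 45,
    ("arsenic", "boron"): 60,
    ("arsenic", "arsenic"): 0,
}

def total_setup_minutes(current_setup, required_setups):
    """Sum the changeover time for running tasks in the given order."""
    total = 0
    for needed in required_setups:
        total += SETUP_MINUTES[(current_setup, needed)]
        current_setup = needed
    return total

# Alternating setups versus grouping like setups together.
print(total_setup_minutes("boron", ["arsenic", "boron", "arsenic"]))  # 150
print(total_setup_minutes("boron", ["boron", "arsenic", "arsenic"]))  # 45
```

The same three tasks cost 150 min of changeover in one order and 45 min in another, which is why the scheduler needs this matrix to minimize setups.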
Manufacturing standards (e.g., yields, cycle times, lead times, etc.) are needed for all process steps.
In addition to basic values for these model variables, time-phasing is also used. Semiconductor manufacturing facilities rely on a continual process of manufacturing improvement. It is, therefore, not enough to state what current yields and cycle times are, but rather to be able to plot a standard of improving yield and cycle time performance. For this reason, most variables in the model including standards for yield and cycle time, periods of equipment availability, quantities of equipment, etc. are entered as time-phasable values.
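A time-phased value can be represented as a list of (effective date, value) pairs, where the value in effect on a given day is the latest one whose date has been reached. This is a sketch only; the `TimePhased` class and the sample yield standards are assumptions for illustration.

```python
from bisect import bisect_right
from datetime import date

class TimePhased:
    """A model value that changes on known effective dates (hypothetical helper)."""

    def __init__(self, entries):
        # entries: iterable of (effective_date, value) pairs
        ordered = sorted(entries)
        self._dates = [d for d, _ in ordered]
        self._values = [v for _, v in ordered]

    def value_on(self, day):
        # Index of the first effective date strictly after `day`
        i = bisect_right(self._dates, day)
        if i == 0:
            raise ValueError("no value in effect on %s" % day)
        return self._values[i - 1]

# Made-up yield standard that improves at mid-year.
yield_standard = TimePhased([(date(2001, 1, 1), 0.92), (date(2001, 7, 1), 0.94)])
print(yield_standard.value_on(date(2001, 3, 15)))  # 0.92
print(yield_standard.value_on(date(2001, 7, 1)))   # 0.94
```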
Figure 2. User controls are directly related to business objectives so trial schedules can be run to determine the effect of tradeoffs under current circumstances.
Integration requirements
Integrating Repair-Based scheduling into the wafer fab environment is not conceptually difficult, but it involves much detailed work. Most data can be extracted from the MES, and the MES can be configured to provide routine updates. Where the MES is not configurable for such updates, taps can be set on the interface between the MES and its database.
Additional data can be configured in the MES as parameter or attribute fields or by using custom tables. As a practical matter, however, MES users may be justifiably cautious about making large changes to a successfully running MES, and may organize the additional data in a separate database.
User preference data are also needed. Preferences expected to be static on a long-term basis can simply be put into user-accessible configuration files. Data that changes more frequently need to be easy to adjust. The key information requested from a user is, by intentional design, along the same dimensions discussed above where we explored the characteristics of a good schedule. The same issues that arise when determining the important manufacturing characteristics of a good schedule are exactly the top-level controls provided to a user (Fig. 2).
Lot projection
The first step in a Repair-Based approach to fab scheduling is to define the ideal outcome, initially ignoring most constraints.
This is a critical step, as the process of doing very fast optimization is aided by having a reasonable starting point.
Since the sequence of operations performed on the lot is the dominant characteristic of the schedule, we put all tasks for each active lot (i.e., process steps defined in the MES and the scheduling model) in order on a time line. Details of each task include the lot and a list of all possible resource assignments for this task, based on equipment type or capability requirements with restrictions to a specific piece of equipment if required. In addition, tasks have a specific subset of preferred equipment, the equipment configuration required, and any timing constraints relative to other tasks (e.g., a time limit between pre-clean and oxidation). Finally, the task includes the type and number of durables needed, any alternate steps that might be available, and process durations.
To define the proposed timing of the projected tasks, we attend to additional lot characteristics. For lots with specific due dates, the entire set of tasks necessary to complete the lot will be projected to complete at the appropriate time, shrinking or stretching inter-operation lead times as necessary, without violating minimum lead time constraints or maximum and minimum inter-operation constraints. For the remaining lots, projections will be at normal lead times for the current priority of the lot, with higher priority lots using shorter lead times.
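The stretching and shrinking of lead times for a due-date lot can be sketched as a single scale factor applied to the normal inter-operation lead times, clamped at each step's minimum. This is an illustration under simplifying assumptions: the `project_lot` function and its tuple layout are hypothetical, times are in minutes, and maximum inter-operation limits are ignored.

```python
def project_lot(now, due, steps):
    """Lay out a lot's remaining steps so it finishes at `due` when possible.

    steps: list of (process_min, normal_lead_min, min_lead_min) tuples.
    Returns a list of (start, end) times in minutes.
    """
    process_total = sum(p for p, _, _ in steps)
    normal_lead_total = sum(n for _, n, _ in steps)
    slack = (due - now) - process_total  # time left over for lead times
    scale = slack / normal_lead_total if normal_lead_total else 1.0
    timeline, clock = [], now
    for process, normal_lead, min_lead in steps:
        # Stretch or shrink the normal lead time, but never below the minimum
        lead = max(min_lead, normal_lead * scale)
        start = clock + lead
        end = start + process
        timeline.append((start, end))
        clock = end
    return timeline

steps = [(60, 120, 30), (30, 60, 10)]
print(project_lot(0, 360, steps))  # lead times stretched by 1.5x
print(project_lot(0, 180, steps))  # lead times shrunk to 0.5x
print(project_lot(0, 100, steps))  # minimums bind, so the lot projects late
```

In the last call the minimum lead times cannot be compressed further, so the projection finishes after the due date; resolving that is left to later phases.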
For a wafer fab running 40,000 to 50,000 wafer starts/month on a 6-week cycle time and using ~400 process steps, scheduling for 24 hrs includes roughly 30,000 tasks.
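A rough steady-state check of this task count: every lot started eventually passes through every process step, so the number of step completions per day is about the number of lots started per day times the number of steps. The lot sizes and the 30-day month below are assumptions, not figures from the text.

```python
def tasks_per_day(wafer_starts_per_month, process_steps, wafers_per_lot,
                  days_per_month=30):
    """Approximate daily task count for a fab running at steady state."""
    lots_started_per_day = wafer_starts_per_month / days_per_month / wafers_per_lot
    return lots_started_per_day * process_steps

# Midpoint of the 40,000 to 50,000 wafer-start range, ~400 steps.
print(tasks_per_day(45_000, 400, wafers_per_lot=25))  # assumed full 25-wafer lots
print(tasks_per_day(45_000, 400, wafers_per_lot=20))  # assumed 20-wafer average
```

Under these assumptions the estimate is 24,000 to 30,000 tasks per day, consistent with the rough figure quoted above.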
During lot projection we are not yet concerned about assigning specific resources, resource limitations, conserving equipment setups, etc. We are simply laying out work we would like the fab to accomplish. The order in which lots are selected for projection does not matter.
Initial resource assignments
Translating a Repair-Based scheduling approach into a production-ready control tool for semiconductor manufacturing requires some cleverness in initially assigning each task to a specific resource from among its valid choices. It is not a matter of being stuck with initial choices, as the system will work through the repair process to optimize the solution, but a good initial assignment will make the solution easier to find.
As it turns out, optimized solutions are reached more quickly and surely if the main emphasis of the initial resource assignment is on preferences (assigning tasks in such a way as to optimize the overall goals stated for this scheduling run) rather than on constraints. Constraints will be resolved soon enough. This is done using Preference Calculus (see the "Preference Calculus" sidebar).
Preference Calculus is given a different set of biases for each scheduling phase where it is used. For initial assignment, certain constraints (e.g., task sequence, timing constraints, etc.) are given strong emphasis, but others (e.g., equipment capacity) are strongly de-emphasized, while most of the preferences are given full weight. Later, in the repair phase, a different weighting is used to guide the repair process in line with real limitations and user preferences.
The sequence of lots getting initial resource assignment is controlled, with high priority lots and highly constrained lots, especially in terms of inter-step timing, being assigned first.
One particular challenge must be dealt with explicitly: It may be that some heavily constrained tasks (e.g., a short time window and a specific process tool) may end up unassignable once other high-preference tasks have already been assigned to the needed resource. So, while the initial assignment process is progressing, highly constrained tasks are constantly monitored. Once available options have shrunk beyond a certain level, the assignment of these tasks takes on additional urgency. It becomes less a question of whether tasks will be assigned in a way that optimizes a number of functions, but whether they are going to be assigned at all. Depending upon expressed user preferences, these tasks may be assigned with less regard for their preference value just to assure they get onto the schedule.
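The monitoring of highly constrained tasks can be sketched as a selection rule: normally pick the task with the best achievable preference, but switch to the most constrained task once its remaining options fall to a threshold. The function name, the threshold, and the data layout here are assumptions for illustration only.

```python
def next_task_to_assign(unassigned, open_options, preference, urgency_threshold=1):
    """Choose the next task for initial assignment (hypothetical rule).

    unassigned: iterable of task ids
    open_options: task id -> set of still-free (tool, slot) placements
    preference: task id -> best achievable preference value
    """
    # Tasks with no options left cannot be placed now; leave them for repair.
    placeable = [t for t in unassigned if open_options[t]]
    if not placeable:
        return None
    at_risk = [t for t in placeable if len(open_options[t]) <= urgency_threshold]
    if at_risk:
        # Urgent case: the task with the fewest remaining options goes first
        return min(at_risk, key=lambda t: len(open_options[t]))
    # Normal case: the task with the highest preference value goes first
    return max(placeable, key=lambda t: preference[t])

open_options = {
    "A": {("STP1", 0), ("STP1", 1), ("STP2", 0)},
    "B": {("STP3", 4)},            # one placement left: at risk
    "C": {("STP1", 2), ("STP2", 2)},
}
preference = {"A": 0.9, "B": 0.2, "C": 0.5}
print(next_task_to_assign(["A", "B", "C"], open_options, preference))  # B
print(next_task_to_assign(["A", "C"], open_options, preference))       # A
```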
Resolving conflicts, further optimization
Lot projection and initial assignment described above are necessary to define the problem space, but do not produce an optimized schedule. While relatively simple to describe, the repair process is exceedingly challenging to implement for real-world cases. It involves an approach of identifying specific tasks that are creating great conflicts and moving them in time or to a resource where they are creating fewer or no conflicts.
Each task has, as attributes, certain constraints (e.g., a certain tool or a given durable). In addition, it is related to other tasks by specific constraints (e.g., a time limit after a clean): it cannot happen before the preceding task for this lot completes, and it must be done before succeeding tasks for this lot begin.
Some constraints will be understandable only in aggregate; for example, a lot may currently be in contention for a given tool, but this is visible only when tasks for all competing lots are being considered. To support the repair process, a large collection of equipment-based indices and counters must be maintained.
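An equipment-based counter of this kind can be sketched as a tally of how many tasks occupy each (tool, time slot) pair, from which each task's share of the contention can be read off. The structure below is an assumption for illustration; the task names, the single-lot capacity, and the helper functions are hypothetical.

```python
from collections import Counter

def build_load_index(assignments):
    """Count tasks per (tool, slot). assignments: task -> (tool, start_slot, n_slots)."""
    load = Counter()
    for tool, start, n_slots in assignments.values():
        for slot in range(start, start + n_slots):
            load[(tool, slot)] += 1
    return load

def task_conflicts(task, assignments, load, capacity=1):
    """Total over-capacity load across the slots this task occupies."""
    tool, start, n_slots = assignments[task]
    return sum(max(0, load[(tool, slot)] - capacity)
               for slot in range(start, start + n_slots))

assignments = {
    "L1-litho": ("STP1", 0, 3),  # slots 0, 1, 2
    "L2-litho": ("STP1", 2, 2),  # slots 2, 3 (overlaps L1 at slot 2)
    "L3-litho": ("STP2", 0, 2),
}
load = build_load_index(assignments)
for task in assignments:
    print(task, task_conflicts(task, assignments, load))
```

Neither overlapping task shows a conflict when examined alone; the contention only appears once the shared counter has been built from all tasks.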
The engine considers current quantitative task conflicts and potential preference improvements, and then makes an extensive series of adjustments to the schedule. These adjustments both resolve conflicts and improve the net value of a user's preferences, step by step. Adjustments to the schedule are of the form one would expect: single or group tasks may be moved in time, assigned to different equipment, added to or removed from batches, or possibly deferred. Currently unassigned tasks, if any, may be placed onto the schedule by assigning them to times and equipment.
The net effect is a process of removing constraint violations to improve schedule quality, both locally and overall.
Not all schedule repairs and adjustments can accomplish both conflict resolution and preference improvement at the same time. Some portions of the search work solely on conflicts that must be resolved to make the schedule feasible. Other portions optimize the schedule by improving preference measures.
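The conflict-resolution portion of such a search can be illustrated with a generic min-conflicts-style repair loop, a well-known heuristic from the constraint-satisfaction literature. This is a toy sketch, not the engine's algorithm: it handles only tool/slot collisions, ignores preferences, and every name in it is hypothetical.

```python
import random

def conflicts(task, placement, schedule):
    """Number of other tasks occupying the same (tool, slot) placement."""
    return sum(1 for other, p in schedule.items()
               if other != task and p == placement)

def repair(schedule, options, max_steps=1000, seed=0):
    """Move conflicted tasks until no two tasks share a (tool, slot).

    schedule: task -> (tool, slot); options: task -> list of allowed placements.
    Returns a repaired copy; gives up after max_steps moves.
    """
    rng = random.Random(seed)
    schedule = dict(schedule)  # leave the caller's schedule untouched
    for _ in range(max_steps):
        conflicted = [t for t in schedule
                      if conflicts(t, schedule[t], schedule) > 0]
        if not conflicted:
            break
        task = rng.choice(conflicted)
        # Move the task to whichever allowed placement has the fewest conflicts
        schedule[task] = min(options[task],
                             key=lambda p: conflicts(task, p, schedule))
    return schedule

start = {"A": ("STP1", 0), "B": ("STP1", 0), "C": ("STP1", 0)}
options = {t: [("STP1", 0), ("STP1", 1), ("STP2", 0)] for t in start}
print(repair(start, options))
```

Starting from an idealized schedule in which all three tasks want the same stepper at the same time, the loop spreads them over the three allowed placements.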
An example
Consider two lots vying for processing on the same stepper. This conflict is currently the most severe in the schedule. One lot must be adjusted in time, either forward or backwards, or moved to another stepper.
The first step is to determine alternatives for moving each task in a way that removes or significantly reduces constraint violations. For each alternative, there are several factors to consider. By calculating from the underlying model and emphasizing or de-emphasizing according to a user's settings in the strategy panel, we know for each task the net preference of a possible new location on the timeline based on the effect that move would have on all of the currently relevant factors, including: improving line balance, leaving reticles at a stepper as long as possible, making this schedule be as much like the previous schedule as possible, optimizing equipment utilization, optimizing batch sizes, minimizing setups, moving high priority lots rapidly, and moving lots in accordance with due dates.
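Choosing among the alternatives can be sketched as scoring each candidate move by its weighted effect on the relevant factors. The factor names, effect values, and weights below are invented for illustration, and ranking by remaining conflicts first and net preference second is an assumption rather than the engine's documented rule.

```python
def net_preference(effects, weights):
    """Weighted sum of a move's effects; positive means the schedule improves."""
    return sum(weights.get(factor, 0.0) * delta for factor, delta in effects.items())

def best_move(candidates, weights):
    """candidates: list of (description, remaining_conflicts, effects) tuples."""
    # Fewest remaining conflicts wins; net preference breaks ties
    return min(candidates, key=lambda c: (c[1], -net_preference(c[2], weights)))

candidates = [
    ("delay lot B by 30 min on stepper 1", 0,
     {"due_dates": -0.2, "reticle_stays": 0.5, "schedule_stability": -0.1}),
    ("move lot B to stepper 2", 0,
     {"reticle_stays": -0.5, "schedule_stability": -0.3, "transport": -0.2}),
    ("leave both lots in place", 1, {}),
]

balanced = {"due_dates": 1.0, "reticle_stays": 1.0,
            "schedule_stability": 1.0, "transport": 1.0}
due_date_driven = dict(balanced, due_dates=10.0)

print(best_move(candidates, balanced)[0])         # delay on stepper 1
print(best_move(candidates, due_date_driven)[0])  # move to stepper 2
```

Changing a single weight flips the decision, which is the effect the strategy-panel settings are meant to have.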
Final steps
Ultimately, schedule repair and optimization ends when an analysis indicates insufficient additional benefit is likely to be obtained from further adjustments. No assurance can be made that all conflicts can be resolved, any more than one could expect that all desired work would happen within a single scheduling period. The lower-priority tasks that cannot be scheduled are deferred to the next scheduling run.
The schedule is ready for output to a dispatch list at a user interface, and, as needed, monitoring tools (Fig. 3).
We have produced a complete fab schedule without traditionally defined rules. Rather than leaving it to a user to try to construct and maintain a complex system of rules that embodies desired business practices, the entire system works to translate business objectives directly into detailed activity planning. All of the desired results can be controlled through the user interface.
Real time
It is desirable that scheduling and dispatching happen very rapidly within the fab. Despite all efforts to control variance and create absolutely predictable manufacturing, wafer fabs remain somewhat chaotic. Lots frequently go on or come off hold, and equipment becomes available for production or goes into an unpredicted down state. All these events must be accounted for in an optimized schedule. This is less of a problem for rule-based dispatching, which is basically a process of priority-sorting a queue in front of an equipment type. Such queue sorting can be done extremely rapidly, even with complex rules, but does not lead to overall fab optimization.
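For contrast, rule-based dispatching of the kind described here amounts to sorting one queue by a rule. A minimal sketch, with a hypothetical rule (priority, then due time, then arrival time) and made-up lots:

```python
def dispatch_order(queue):
    """Sort a tool queue by priority (1 = most urgent), then due time, then arrival."""
    return sorted(queue, key=lambda lot: (lot["priority"], lot["due_hr"], lot["arrived_hr"]))

queue = [
    {"lot": "L101", "priority": 2, "due_hr": 40, "arrived_hr": 1},
    {"lot": "L102", "priority": 1, "due_hr": 90, "arrived_hr": 3},
    {"lot": "L103", "priority": 2, "due_hr": 25, "arrived_hr": 2},
]
print([lot["lot"] for lot in dispatch_order(queue)])  # ['L102', 'L103', 'L101']
```

The sort is fast and local: it sees only the lots already waiting at this equipment type, not upstream or downstream conditions.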
Figure 4. Graphic view of Dual-Span scheduling.
Parts of the real-time requirements are relatively easy for an optimizing scheduler to meet. When lots or equipment become unavailable for processing, the current dispatch list can be adjusted without difficulty, and an overall correction to the schedule will be produced with the next schedule generated. That presumes, however, that the next overall schedule will be generated very soon. Because of this presumption, the scheduling system does not, as is often imagined, wait until some extraordinary event happens and then trigger a reschedule. Rather, as soon as one schedule is published, work begins on the next. In this way the schedule is always as up to date as possible.
Our target time was a fully optimized schedule every three minutes. Meeting this was a severe challenge. In a moderately high-volume fab, there will be 30,000 tasks to schedule, about 4.3 million possible combinations of task-timings in 5-min increments. Add to this variability in specific equipment assignments by lot and the number rapidly grows to 40 to 70 million possible schedules.
A number of steps have been taken to constrain the problem size and increase the solution speed. First, the system runs completely in memory and is CPU-constrained. Therefore one should choose the fastest available Windows 2000 system with enough memory to house the problem space. For practical purposes the system will run comfortably on a fast, small-server class machine with 1GB of RAM. Since the scheduling engine completely consumes CPU capacity, additional system performance can be achieved by linking two PCs: one for file maintenance and communication with the rest of the factory control system, and one dedicated to running the scheduler engine.
The scheduling system runs more effectively if the smallest unit of time is not too small, so, for most fab purposes this is set to 5 or 10 min, rather than 1 min or less. There are interesting considerations related to the schedule time span. A task cannot be optimized on the timeline unless it fits completely within the scheduling window; for tasks that hang off the end of the schedule, one wouldn't know what conflicts or preferences might exist in the area outside the scheduling span. The time span therefore needs to be long enough to account for the long-cycle diffusions and other long operations that need to be scheduled in the next day or so.
On the other hand, we would like to constrain the schedule time span as much as possible so that optimization can complete quickly. The answer lies partially in the schedule continuity capability described earlier.
For fabs where extremely short schedule turnaround is needed, two small servers can be dedicated to this task using Dual-Span scheduling. One server is chartered with long-haul scheduling, creating an optimized schedule for the next two to four days. Upcoming long tasks and certain key batching and setup decisions generated in this environment can be fed across to a second server that is generating schedules for the next 6-12 hrs (Fig. 4); the imported tasks from the long-haul scheduler are treated as valid and unlikely to be modified. This method produces precise schedules at intervals as short as 1-2 min, fast enough to support full automation even in very complex fabs.
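The hand-off between the two spans can be pictured as a filter over the long-haul schedule that selects long tasks starting inside the short window and marks them as pinned. The selection criteria, the 4-hr duration cutoff, and all names below are assumptions for illustration; the actual hand-off rules are not specified here.

```python
def import_pinned_tasks(long_haul_schedule, short_span_hours, min_duration_hours=4.0):
    """Pick long tasks that start within the short span and mark them pinned.

    long_haul_schedule: list of dicts with 'task', 'tool', 'start_hr', 'duration_hr',
    where start_hr is measured from the current time.
    """
    pinned = []
    for entry in long_haul_schedule:
        starts_in_span = 0 <= entry["start_hr"] < short_span_hours
        is_long = entry["duration_hr"] >= min_duration_hours
        if starts_in_span and is_long:
            # Copy the entry so the long-haul schedule itself is not modified
            pinned.append(dict(entry, pinned=True))
    return pinned

long_haul = [
    {"task": "L7-diffusion", "tool": "FUR2", "start_hr": 5.0, "duration_hr": 9.0},
    {"task": "L9-diffusion", "tool": "FUR1", "start_hr": 30.0, "duration_hr": 8.0},
    {"task": "L4-litho", "tool": "STP1", "start_hr": 2.0, "duration_hr": 0.5},
]
print([e["task"] for e in import_pinned_tasks(long_haul, short_span_hours=12)])
```

Only the furnace run that begins within the next 12 hrs is carried across; the short-span scheduler would then plan around it rather than move it.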
Conclusion
Repair-Based scheduling provides real-time, highly optimized schedules that fully comprehend the manufacturing realities and requirements of a wafer fab. Schedules generated are effectively aware of upstream and downstream issues and will take proactive steps to optimize manufacturing over time. This approach does not require laborious creation and maintenance of business rules, but rather works directly from a manufacturer's business objectives. The requirements of such a system are large amounts of realistic data and the ability of a manufacturer's computer technical staff to maintain this model.
Acknowledgments
Preference Calculus, Repair-Based, and Dual Span are trademarks of Interval Logic Corp. PROMIS is a registered trademark of PRI Automation. SiView is a trademark of IBM.
References
- H. Watts, "Improving Fab Performance," Future Fab, Vol. 9, Summer 1999.
- M.D. Johnston, S. Minton, "Analyzing a Heuristic Strategy for Constraint-Satisfaction and Scheduling," Intelligent Scheduling, Zweben and Fox, eds, 1994.
- M.D. Johnston, G.E. Miller, "Intelligent Scheduling of Hubble Space Telescope Observations," Intelligent Scheduling, Zweben and Fox, eds, 1994.
Henry Watts received his MS in socio-technical systems from UCLA. He is VP and GM of the Advanced Planning and Scheduling Division of PRI Automation, 805 Middlesex Turnpike, Billerica MA 01821-3986; ph 978/679-4270, fax 978/663-9755.
Mark Johnston received his BA from Princeton and PhD in physics from MIT. Johnston is VP and CTO of the Advanced Planning and Scheduling Division of PRI Automation.
_______________________________
Preference Calculus
Preference Calculus is a mechanism for mathematically combining a set of user-expressed preferences into a measure of value. It allows any given task placement and resource usage to be compared to any other, providing guidance to both initial assignment and repair phases. This structure and caching of pre-calculated values is critical for speed. This is not just a simple mathematical summary, but is a time-phased projection of preferences. A 24-hr schedule detailed in 5-min segments and a choice of 10 resources can be visualized as a 288x10 value-merit matrix, constantly being adjusted based on other changes in the solution.
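The 288x10 matrix follows from 24 hrs x 12 five-minute segments per hour. The sketch below builds such a grid for one task and looks up its best placement; the merit values, the `slot_index` and `best_placement` helpers, and the tie-breaking rule are assumptions for illustration only.

```python
SLOT_MINUTES = 5
N_SLOTS = 24 * 60 // SLOT_MINUTES  # 288 five-minute segments in 24 hrs
N_RESOURCES = 10

# merit[slot][resource]: value of placing one task at that time on that resource
merit = [[0.0] * N_RESOURCES for _ in range(N_SLOTS)]

def slot_index(hour, minute):
    """Convert a time of day to its 5-min slot number."""
    return (hour * 60 + minute) // SLOT_MINUTES

def best_placement(merit):
    """Return the (slot, resource) with the highest merit; earliest slot wins ties."""
    cells = ((s, r) for s in range(len(merit)) for r in range(len(merit[s])))
    return max(cells, key=lambda cell: merit[cell[0]][cell[1]])

# Made-up values: resource 3 at 9:00 is attractive, resource 7 at 14:30 less so.
merit[slot_index(9, 0)][3] = 0.8
merit[slot_index(14, 30)][7] = 0.6
print(N_SLOTS, best_placement(merit))  # 288 (108, 3)
```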
Preferences and constraints are based on basic fab data entered into the facility model by the user, such as setup difficulties, transportation time requirements, line balance targets, etc., and mediated by current settings a user has made on a strategy control panel.
Preference Calculus is not restricted to the movement of a single task. Moving one task may require that other tasks move as well. It effectively resolves the entire potential effect of a task movement or resource adjustment to a time-phased figure of merit. The operation of Preference Calculus may best be imagined as the functioning of an artificial neural network.