Category Archives: 2013

Gigaphoton Inc., a lithography light source manufacturer, announced today that it has completed development of an electricity-reduction technology for its flagship “GT Series” of argon fluoride (ArF) immersion lasers used for semiconductor lithography processing. Based on the continuous evolution of its leading-edge “Green Innovations” environmental technologies, Gigaphoton is unveiling its “eGRYCOS (e-GIGAPHOTON Recycle Chamber Operation System)” product, which enhances laser efficiency and reduces electricity consumption by 15 percent.

The semiconductor industry has been growing faster than other sectors, and one of the key drivers is the continuing improvement in manufacturing equipment. Because of their use as light sources in leading-edge lithography applications, ArF immersion lasers require increased output power to support new enhancements of the scanners. Current high-volume production lasers are running at 60 W output, but the latest requirement has reached an output of 120 W. As industry demand for higher power grows, the electricity that lasers consume will continue to increase as well.

Gigaphoton has addressed this issue through its EcoPhoton program, and has continued to work on developing a highly efficient laser chamber design. In a laser chamber, excimer gas flows between two electrodes; as the flow speed increases, the discharge becomes more stable, resulting in better laser performance. Gigaphoton’s redesigned chamber features a hydrodynamically optimized gas flow channel shape, and enables the same speed of gas flow while consuming less electricity. In addition, the newly designed pre-ionization process enables uniform distribution of ions in the main discharge region, providing laser discharge that is 1.2 times more efficient (compared with existing products). As a result, “eGRYCOS” reduces electricity consumption by 15 percent (compared with existing products) without compromising laser performance.

“Gigaphoton has consistently focused on green innovations to support environmentally conscious ‘green fabs’,” said Hitoshi Tomaru, President and CEO of Gigaphoton Inc. “The ‘eGRYCOS’ product is an example of the success of our EcoPhoton program. We will continue to provide our global customers with innovative technologies that enable increased laser performance with lower energy consumption to meet the demands of today’s leading-edge lithography applications.”

We have an interesting dynamic in the world of semiconductor training that has been in play since the financial crisis in 2008-2009. In order to pull through the especially dire conditions, most companies in the space dramatically reduced their expenses by implementing huge reductions in headcount, travel and training. Now that the industry is bouncing back, one would think that hiring, and therefore the need to train new individuals, would also return. That has not been the case. While it’s true that the industry has resumed hiring, the recruiting strategy and composition of those hired have been different. First, the industry discovered that it could simply avoid replacing many of these positions and still maintain output. The remaining people were just told to do more. They were compelled to oblige, since the alternative was to be out of work. Second, the industry was able to tap experienced individuals who were previously let go from other companies. These people did not necessarily need much training to be productive in new positions. Third, companies compensated by increasing automation and outsourcing. The result is that the amount of training has generally been greatly reduced post-recession. In some other industries there has been some bounce-back to previous levels, but not in our industry. For example, the attendance at semiconductor-related short courses (1-5 days at a hotel or other training facility) has fallen off dramatically since 2008. Has the need for training gone down, or is there an unseen need that is reducing quality and productivity?

One partial explanation is that there is a shift in the responsibility for training. For example, automation in the factory has pushed more training out of the semiconductor manufacturer and into the hands of the equipment suppliers. As our tools grow more complex, we require more extensive training to operate and maintain them. Moving the fixed cost for in-house training to a supplier’s expenses just sweetens the deal for the chipmaker.

A much more positive shift is toward what we call “performance support.” Performance support is learning at the “point of need.” Rather than attending a conference or a short course on a topic, then waiting for months to use the information, one accesses a system with the needed information in-hand when the problem occurs. Many engineers and scientists naturally work this way, jumping onto the Internet to look for papers or discussion boards that might address their pressing needs. The problem with this approach is that it is poorly structured and yields inconsistent results. “Googling” can yield some relevant hits, but this information is often of unknown, or even suspect, quality. Further, the search results tend to be a “mile wide and an inch deep.” Some sites like IEEE Xplore provide information that is highly specific and detailed. It is, in essence, a “mile deep and an inch wide.” While highly experienced engineers can navigate through this mass of data and potentially find the answers they need, the sheer volume can easily stymie others. They need knowledge that is more structured, focused and at the right level to address their immediate questions. They need a system that is able to adapt to their changing needs while “on the line,” using the tool.

Another important shift we see coming is the rise of the simulator for training in the semiconductor industry. Our tools are now incredibly expensive, so much so that a tool like a state-of-the-art immersion lithography system now costs more than a passenger airliner or jet fighter. We use simulators for jet aircraft, and we will likely need simulators for our manufacturing tools so that users can learn in an environment without the fear or danger of damaging the tool and/or expensive wafer lots.

In conclusion, while it may seem like training is on the decline, there are compelling reasons why we need to continue learning, and even step up our efforts in this area. We believe there are new approaches that can lower training costs, reduce risk, and provide engineers with the knowledge they need to be successful on the job. •

CHRIS HENDERSON, President, Semitracks, Inc.

Common thermal considerations in LEDs include test point temperature and thermal power.

One characteristic typically associated with LEDs is that they run cool. While it’s true that LEDs are cool relative to the filaments found in incandescent and halogen lamps, they do generate heat within the semiconductor structure, so the system must be designed in such a way that the heat is safely dissipated. The waste heat white LEDs generate in normal operation can damage both the LED and its phosphor coating (which converts the LED’s native blue color to white) unless it’s properly channeled away from the light source.

A luminaire’s thermal design is specified to support continuous operation without heat damage and oftentimes separates the LEDs from temperature-sensitive electronics, which provides an important advantage over individual LED replacement bulbs.

Test point temperature
Test point temperature (Tc) is one characteristic that plays an important role during integration to determine the amount of heat sinking, or cooling, that the luminaire design requires. In general, the higher the Tc limit compared to worst-case ambient temperature (Ta), the more flexibility a luminaire manufacturer will have in designing or selecting a cooling solution.

The worst-case ambient temperature is usually 40°C or higher, so a module with a low Tc rating (e.g., 65°C) doesn’t have much headroom above the already hot ambient temperature. Trying to keep a module at Tc 65°C when the Ta is 40°C and dissipating 40W thermal power is very difficult to do with a passive heat sink, so a fan or other active heat sink will likely be required. On the other hand, a module with a Tc rating of 90°C or higher (while still meeting lumen maintenance and warranty specifications) has at least 50°C headroom over the ambient temperature and should be able to make use of a reasonably sized passive heat sink.
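As a rough illustration of the headroom argument above, the largest thermal resistance a cooling solution can have follows from dividing the Tc-to-Ta headroom by the thermal power. The sketch below assumes a simple steady-state model with a single lumped resistance from test point to ambient; the function name is hypothetical, and the figures are the ones quoted above.

```python
def max_thermal_resistance(tc_limit_c, ambient_c, thermal_power_w):
    """Largest test-point-to-ambient thermal resistance (degC/W) that keeps Tc within its limit.

    Assumes steady state and one lumped resistance (an illustrative simplification).
    """
    if thermal_power_w <= 0:
        raise ValueError("thermal power must be positive")
    headroom_c = tc_limit_c - ambient_c
    if headroom_c <= 0:
        raise ValueError("Tc limit must exceed the ambient temperature")
    return headroom_c / thermal_power_w


# Figures from the text: 40 W thermal power at a 40 degC worst-case ambient.
low_tc = max_thermal_resistance(65, 40, 40)   # 25 degC headroom -> 0.625 degC/W
high_tc = max_thermal_resistance(90, 40, 40)  # 50 degC headroom -> 1.25 degC/W
print(low_tc, high_tc)
```

Under this simplified model, doubling the headroom doubles the thermal resistance the cooling solution is allowed to have, which is the flexibility a higher Tc rating buys.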

However, the higher you can push the test point on the LED module, the smaller the heat sink you need. It’s dependent on the Ta – if the module can’t withstand a high enough maximum temperature, it’s impossible to cool below Ta unless you have a refrigerated system, regardless of the size or effectiveness of the heat sink. Stretching the difference between Tc and Ta as much as possible will give you greater room to deviate from the norm and be creative in your heat sink selection.

Xicato is using its Corrected Cold Phosphor technology to lower the thermal resistance along the path from the phosphor to the heat sink, so that the phosphor does not have to be cooled through the hot LEDs. Today, the module output is at 4000 lumens, which wouldn’t have been possible five years ago.

The bottom-line considerations with respect to test point temperature are really flexibility and cost. If a module with a high Tc rating is chosen, there will be more options for design and cost savings than are provided by a module with a low Tc rating, assuming the same power dissipation.

Figure 1: Xicato XSM module family sample passive heat sink matrix showing suitable module usage for a range of thermal classes.

Thermal power
Another key characteristic, thermal power (load), has always been a difficult number to deal with. LED module manufacturers don’t always provide the information required to calculate thermal power because this value can change based on such variables as lumen package, Color Rendering Index (CRI), correlated color temperature (CCT), etc. Cooling solutions are often rated for performance in terms of degrees Celsius per watt, which, unfortunately, necessitates calculating the thermal power.

To address this problem, Xicato has developed a “class system,” through which each module variation is evaluated and assigned a “thermal class.” With this system, determining the appropriate cooling solution is as simple as referencing the thermal class from the module’s data sheet to a matrix of heat sinks. FIGURE 1 is a sample passive heat sink thermal class matrix for the Xicato XSM module family.

Let’s take, as an example, a 1300 lumen module with a thermal class rating of “F.” According to the matrix, for an ambient condition of 40°C, the best choice of heat sink would be one that is 70 mm in diameter and 40 mm tall. Validation testing is still required for each luminaire during the design phase, as variations in trims, optics, and mechanical structures can affect performance. Looking at the example module, if a manufacturer were to design a luminaire around this class “F” heat sink and nine months later a new, higher-flux class “F” module were released, the same luminaire would be able to support the higher-lumen module without the need for additional thermal testing. The thermal-class approach supports good design practice, speeds development and product portfolio expansion, and provides a future-proof approach to thermal design and integration.


Most specification sheets cite an electrical requirement for the module and the lumen output. Electrical input is basically the voltage the module will require and the current needed to drive it; the product of these two variables is power. The problem with output is that it’s always displayed in lumens – a lumen is not a measure of power, but rather a unit that quantifies light according to the response of the human eye. It’s calibrated specifically on what the human eye sees, so perceived brightness can’t easily be tied back to electrical power. There’s no way to figure out exactly how much thermal power is being dissipated by the module from these two figures – power “in” is electrical power (voltage × current), while power “out” is split among non-visible electromagnetic, visible electromagnetic, and thermal power. None of this breakdown is shown in datasheets.

This intangible factor creates a challenge – for most customers, a watt is a watt, but in reality, there are thermal watts, electrical watts and optical watts; not all are easily determined. The customer can attempt calculations – e.g., how to cool 10 thermal watts – but the fact is that people don’t generally think that way. Many customers don’t have engineers on staff, and those that do often use rough approximations to determine compatibility.
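The kind of rough approximation mentioned above can be sketched as an energy balance: thermal watts are whatever electrical watts do not leave the module as light. Converting lumens to optical watts needs a luminous efficacy of radiation for the module’s spectrum, which is not a datasheet value; the 300 lm/W figure, the 40 V and 0.5 A drive conditions, and the function name are all assumptions chosen only for illustration.

```python
def estimate_thermal_power_w(voltage_v, current_a, lumens, lm_per_optical_watt):
    """Rough thermal-power estimate: electrical watts minus estimated optical watts.

    lm_per_optical_watt is an assumed luminous efficacy of radiation for the
    module's spectrum; it is not taken from any datasheet.
    """
    if lm_per_optical_watt <= 0:
        raise ValueError("luminous efficacy must be positive")
    electrical_w = voltage_v * current_a          # power "in"
    optical_w = lumens / lm_per_optical_watt      # estimated radiant power "out"
    thermal_w = electrical_w - optical_w          # remainder is dissipated as heat
    if thermal_w < 0:
        raise ValueError("inputs imply more light out than power in")
    return thermal_w


# Hypothetical 1300 lumen module driven at 40 V and 0.5 A (20 electrical watts).
thermal = estimate_thermal_power_w(40.0, 0.5, 1300, 300.0)
print(f"{thermal:.1f} thermal watts")
```

The result is only as good as the assumed efficacy, which is one reason a class-based rating is easier to work with than a calculated wattage.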

Xicato has defined modules that go up to Class U. The Tc rating, while independent of module flux package, is interrelated. Class A modules, in general, don’t need a heat sink; lower power modules usually achieve about 300 lumens. On the other hand, an XLM 95 CRI product is a Class U product that requires either a passive heat sink or an active heat sink. Once the module and heat sink have been selected and integrated into the luminaire, the next step is thermal validation, which Xicato performs for the specific fixture utilizing an intensive testing process that includes detailed requirements that must be met by the luminaire maker when submitting a fixture for validation (see Table 1 for a partial summary).

The validation is based not on lumens, but on the thermal class model, and the fixture rating is also based on thermal class, rather than wattage, because watts differ. With this approach, an upgrade can be made easily without having to do any retesting. •

JOHN YRIBERRI is the director of Global Application Engineering, Xicato, Inc., San Jose, CA. John joined Xicato in November of 2007 and was the project leader for Xicato’s first LED platform, the Xicato Spot Module (XSM).

Serial product

This paper concerns an alternative method of generating and propagating binary signals (Quantum Cellular Automata, QCA) that, in most of the literature, is implemented with quantum dots or quantum wells [1],[2]. By using new approaches based on graphene structures, the signal processing capabilities of QCA assemblies may be obtained at significantly reduced complexity compared to conventional quantum-based QCA assemblies, which typically operate at very low temperatures. A two-layer graphene structure is presented in order to overcome technological and operating limitations that affect traditional approaches.

Figure 1: Cell configuration conventionally defined as “0” and “1” logic. 

State-of-the-art QCA: Quantum dots

The quantum-dot [3] is an “artificial atom” obtained by including a small quantity of material in a substrate. The definition “artificial atom” for a quantum-dot is intended in an electrical sense, allowing the transition of a single electron (or a single hole) among a couple of them. In this field, the most common technology is based on an indium deposition on a GaAs substrate.

This happens because the reticular structure of InAs is quite different from gallium and therefore indium tends to concentrate in very small islands. By using several layers of GaAs, pillow structures can be realized. Considering this vertical structure, location probabilities of electrons and holes can be considered as “distributed” on several dots.

Figure 2: Device cell is driven to “1” due to majority effect. 

Consider that four dots realized on the same layer constitute a QCA cell [2]. In each cell, two extra electrons can assume different locations by tunneling between the dots and providing the cell with a certain polarization. Coulombic repulsion causes the two electrons to occupy antipodal sites within the cell (see FIGURE 1). The dimension of the cells may be around 10 nm.

The array of interacting quantum cells can be considered a Quantum-dot Cellular Automata. However, it must be noted that no tunneling occurs between cells and the polarization of the cell is determined only by Coulombic interaction of its neighboring cells.

QCA operating principles

The status of each cell can therefore be, according to Fig. 1, only in “0” or “1” configuration, depending on the influence of its neighbor, producing a “majority” effect. In other words, the status which is more present at the border of the cell “wins” and it is copied on it due to polarization. An example is given in FIGURE 2.

Figure 3: Examples of QCA structure. 
Figure 4: QCA logic gates. 

Signal propagation happens as a “domino” effect with a very low power consumption. Simple structures can be easily arranged, as in FIGURE 3.

By fixing the polarization in one of the cells in a “majority” crossing structure, AND and OR gates can be easily obtained (FIGURE 4).
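The way a majority structure yields AND and OR gates can be checked with a small logical model. The sketch below abstracts each cell to its logic value and ignores the underlying electrostatics entirely; the function names are hypothetical.

```python
def majority(a, b, c):
    """Logic value that wins at a three-input majority crossing (inputs are 0 or 1)."""
    return 1 if a + b + c >= 2 else 0


def qca_and(a, b):
    """AND gate: one input cell of the majority structure is fixed to 0."""
    return majority(a, b, 0)


def qca_or(a, b):
    """OR gate: one input cell of the majority structure is fixed to 1."""
    return majority(a, b, 1)


# Print the truth tables of both gates.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", qca_and(a, b), "OR:", qca_or(a, b))
```

With the third input pinned to 0, the output can only be 1 when both free inputs are 1; with it pinned to 1, a single free input at 1 is enough.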

Quantum dots: Current technology

One of the most promising technologies for implementing quantum dots, and therefore quantum cellular automata, is Bose-Einstein Condensates [4]. This approach overcomes the traditional one based on an Indium deposition on a GaAs substrate (small indium islands aggregation due to reticular diversity).

Bose-Einstein Condensates (BEC) are made by ultra-cold atom aggregation (typically rubidium or sodium isotopes) confined using laser manipulations and magnetic fields.

BECs have quite atypical properties and are therefore described as the “fifth phase of matter,” after solids, liquids, gases and plasmas. Every atom in a BEC has the same quantum state, and therefore, a BEC can be considered a “macroscopic atom.” Tunnelling and quantum effects also occur at a macroscopic scale, with advantages for state definition and detection. A major drawback is the very low operating temperature (around 1 K), which may constitute a limit for physical implementation.

Figure 5: Structure of a graphene layer. Selected area is about 4 nm².

Proposed technology

In recent years, an increasing interest has been devoted to new materials, whose properties seem to be very promising for nanoscale circuit applications. Graphene [5] is a 0.3 nm thin layer of carbon atoms having a honeycomb structure, whose properties of conductivity, flexibility, transparency could have a deep impact on future integration technology (FIGURE 5).

Figure 6: An example of a graphene layer with four hemispherical “hills”.  

Graphene could also be doped as usual semiconductors are (despite the fact that, from its electrical properties, it can be considered a pure conductor), and therefore, it can be used to build nanometric transistors. However, the most interesting features that suggest graphene as a good material for QCA cells are the following:

      1. In contrast to metallic or semiconductor QCA, the dimensions of molecular automata allow for operation at ambient temperatures because they have greater electrostatic energy [6],[7].
      2. Low power requirements and low heat dissipation allow high density cell disposition [8],[9].
      3. Structure flexibility and physical bandgap arrangement allow cells to be built with the bistable behaviour of a two-charge system.

Different techniques are currently available to reshape a graphene layer. Despite the fact that industrial processes are not yet implemented, it is arguable that a serial production of a QCA graphene cell could be possible, and simple, well-defined process steps for the single cell are identified.

Figure 7: Graphene based QCA cell structure. The four hemispherical cavities allow the two negative charges to be hosted in both configurations of logic states.  

Idea for structure and process steps

The basic idea is to realize a square structure with four cavities in which two negative charges (suitable ions or molecules) could be placed and moved depending on neighborhood polarization. Graphene manipulation may allow dimensions of the cells that are quite comparable to the traditional semiconductor Q-dots approach (for a solid state single electron transition cell, the distance among dots is typically 20 nm, and the average distance among interacting cells is 60 nm). However, in order to cope with the chosen ion charge (Coulombic interaction can be stronger if compared with single electrons) and with process requirements, a slight increase of distances is also possible. The structure is based on a two layer graphene arrangement (see FIGURE 7).

The top layer (Layer 1) needs some more process steps in order to realize the four hemispherical cavities. The different energy levels among the layers (obtained by establishing different potentials for the two conductors) force negative charges, in the absence of external polarization, to stay on the bottom of the holes. Supposing a dimension of 15-20 nm for each cavity in order to host suitable electronegative molecules or ions (e.g., Cl⁻, F⁻, SO₄²⁻), the process steps could be the following:

Layer 2 definition (bottom layer process steps):

    1. Graphene chemical vapor deposition (CVD) on copper.
    2. Graphene (layer 2) transfer on the target substrate (through copper wet etching and standard transfer techniques).

Layer 1 definition (top layer process steps):

    1. Graphene chemical vapor deposition (CVD) on copper.
    2. Resist spin-coating (ex: PolyMethylMethAcrylate, PMMA) on graphene CVD (on copper).
    3. E-Beam lithography for hemispherical cavities definition.
    4. Resist selective removal (ex: TetraMethyl Ammonium Hydroxide chemistry).
    5. Graphene etching (plasma O2).
    6. Resist removal (ex: acetone).
    7. Graphene (layer 1) transfer on layer 2 (through copper wet etching and standard transfer techniques).

In addition to the “physical” bandgap realized with this structure, an electronic bandgap could be created on Layer 1 during the third process step. Defects induced in hemispherical cavities may allow a bandgap of 1.2 eV to be reached.

Signal transduction of the resulting logic level

After the signal processing performed by the QCA network, the resulting logic state is stored in the last QCA cell. In order to be used by other electronic devices, this information has to be converted into a suitable voltage level. From an operative point of view, it is sufficient to detect a negative charge in the upper-right position of the last cell; if it is present, according to Fig. 1, the logic state is “1”, otherwise it is “0”. This operation is not trivial, due to the small quantity of charge to be detected and to the small dimension of its location. To this end, among several different strategies, two approaches could be suitable: the ion approach and the optical approach.

Ion approach. This approach can be performed by using channel electron multipliers (or channeltrons), which are ion detectors with high amplification (10⁸). Every ion can generate a cascade of electrons inside the detector, and therefore, consistent charge pulses that can be counted. In our case, there is no ion flux across a surface, and therefore, counting is not needed (information is only ion presence or absence). However, the detection area is very small (quarter cell). This problem could be solved by attaching carbon nanotubes (e.g., 10 nm diameter each) to charge pulse amplifier terminals, in order to increase their resolution, acting as nano-guides.

Optical approach. The basic principle of this approach is in theory quite simple: in order to detect an object of nanometric dimensions like molecules or ions, radiation of a suitable wavelength should be used. For the described application, X-ray radiation seems to be the most appropriate, with wavelengths ranging from 1 pm to 10 nm. However, the complexity of the detection set (high precision is needed in order to minimize bit error) and the huge number of transducers needed for large numbers of bit conversions may in some cases make this solution too expensive with respect to the ion approach. •


1. W. Porod, World Scientific Series on Nonlinear Science, 26, 495 (1999).

2. I. Amlani et al., Science, 284, 289 (1999).

3. G. Tóth et al., J. Appl. Phys., 85, 2977 (1999).

4. J.R. Ensher et al., Phys. Rev. Lett., 77, 1996 (1996).

5. K. Novoselov et al., Science, 306, 666 (2004).

6. X. Du et al., Nature, 462, 192 (2009).

7. K. I. Bolotin et al., Phys. Rev. Lett., 101, 096802 (2008).

8. F. Schwierz, Nat. Nanotechnol., 5, 487 (2010).

9. D. Frank et al., IEEE Electron Dev. Lett., 19, 385 (1998).

DOMENICO MASSIMO PORTO is a systems analysis specialist staff engineer, audio and body division technical marketing, automotive product group, STMicroelectronics, Milan, Italy.

Power device characterization and reliability testing require instrumentation capable of sourcing higher voltages and more sensitive current measurements than ever before.

Silicon carbide (SiC), gallium nitride (GaN), and similar wide bandgap semiconductor materials offer physical properties superior to those of silicon, which allows for power semiconductor devices based on these materials to withstand high voltages and temperatures. These properties also permit higher frequency response, greater current density and faster switching. These emerging power devices have great potential, but the technologies necessary to create and refine them are less mature than silicon technology. For IC fabricators, this presents significant challenges associated with designing and characterizing these devices, as well as process monitoring and reliability issues.

Before wide bandgap devices can gain commercial acceptance, their reliability must be proven, and the demand for higher reliability is growing. The continuous drive for greater power density at the device and package levels creates consequences in terms of higher temperatures and temperature gradients across the package. New application areas often mean more severe ambient conditions. For example, in automotive hybrid traction systems, the temperature of the cooling liquid for the combustion engine may reach up to 120°C. In order to provide sufficient margin, this means the maximum junction temperature (TJMAX) must be increased from 150°C to 175°C. In safety-critical applications such as aircraft, the zero defect concept has been proposed to meet stricter reliability requirements.

HTRB reliability testing
Along with the drain-source voltage (VDS) ramp test, the High Temperature Reverse Bias (HTRB) test is one of the most common reliability tests for power devices. In a VDS ramp test, as the drain-source voltage is stepped from a low voltage to a voltage that’s higher than the rated maximum drain-source voltage, specified device parameters are evaluated. This test is useful for tuning the design and process conditions, as well as verifying that devices deliver the performance specified on their data sheets. For example, Dynamic RDS(ON), monitored using a VDS ramp test, provides a measurement of how much a device’s ON-resistance increases after being subjected to a drain bias. A VDS ramp test offers a quick form of parametric verification; in contrast, an HTRB test evaluates long-term stability under high drain-source bias. HTRB tests are intended to accelerate failure mechanisms that are thermally activated through the use of biased operating conditions. During an HTRB test, the device samples are stressed at or slightly less than the maximum rated reverse breakdown voltage (usually 80 or 100% of VRRM) at an ambient temperature close to their maximum rated junction temperature (TJMAX) over an extended period (usually 1,000 hours).

Because HTRB tests stress the die, they can lead to junction leakage. There can also be parametric changes resulting from the release of ionic impurities onto the die surface, from either the package or the die itself. This test’s high temperature accelerates failure mechanisms according to the Arrhenius equation, which describes the temperature dependence of reaction rates. The test therefore simulates one conducted for a much longer period at a lower temperature. The leakage current is continuously monitored throughout the HTRB test, and a fairly constant leakage current is generally required to pass it. Because it combines electrical and thermal stress, this test can be used to check the junction integrity, crystal defects and ionic-contamination level, which can reveal weaknesses or degradation effects in the field depletion structures at the device edges and in the passivation.
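The acceleration implied by the Arrhenius equation can be estimated with a short calculation. The sketch below uses the standard acceleration-factor form AF = exp[(Ea/k)(1/T_use - 1/T_stress)], with temperatures in kelvin. The activation energy of 0.7 eV and the 55°C use temperature are assumed values for illustration only, since the real activation energy depends on the failure mechanism; the function name is hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration_factor(ea_ev, use_temp_c, stress_temp_c):
    """Ratio of the thermally activated failure rate at stress vs. use temperature."""
    t_use_k = use_temp_c + 273.15
    t_stress_k = stress_temp_c + 273.15
    exponent = (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k)
    return math.exp(exponent)


# Assumed: Ea = 0.7 eV and a 55 degC use temperature; 175 degC is the TJMAX quoted in the text.
af = arrhenius_acceleration_factor(0.7, 55.0, 175.0)
equivalent_use_hours = 1000 * af  # a 1,000-hour stress scaled to use conditions
print(f"acceleration factor: {af:.0f}, equivalent use hours: {equivalent_use_hours:.0f}")
```

Because the factor is exponential in the activation energy, a small change in the assumed Ea changes the equivalent lifetime considerably, so the result should be read as an order-of-magnitude estimate.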

Instrument and measurement considerations
Power device characterization and reliability testing require instrumentation capable of sourcing higher voltages and more sensitive current measurements than ever before. During operation, power semiconductor devices undergo both electrical and thermal stress: when in the ON state, they have to pass tens or hundreds of amps with minimal loss (low voltage, high current); when they are OFF, they have to block thousands of volts with minimal leakage currents (high voltage, low current). Additionally, during the switching transient, they are subject to a brief period of both high voltage and high current. The high current experienced during the ON state generates a large amount of heat, which may degrade device reliability if it is not dissipated efficiently.

Reliability tests typically involve high voltages, long test times, and often multiple devices under test (wafer level testing). As a result, to avoid breaking devices, damaging equipment, and losing test data, properly designed test systems and measurement plans are essential. Consider the following factors when configuring test systems and plans for executing VDS ramp and HTRB reliability tests: device connections, current limit control, stress control, proper test abort design, and data management.

Device connections: Depending on the number of instruments and devices or the probe card type, various connection schemes can be used to achieve the desired stress configurations. When testing a single device, a user can apply voltage at the drain only for VDS stress and measurement, which requires only one source measure unit (SMU) instrument per device. Alternatively, a user can connect each gate and source to an SMU instrument for more control: measuring current at all terminals, extending the range of VDS stress, and setting voltage on the gate to simulate a practical circuit situation. For example, to evaluate the device in the OFF state (including HTRB test), the gate-source voltage (VGS) might be set to VGS > 0 for a P-channel device, or VGS = 0 for an enhancement-mode device. Careful consideration of device connections is essential for multi-device testing. In a vertical device structure, the drain is common; therefore, it is not used for stress sourcing so that stress will not be terminated in case a single device breaks down. Instead, the source and gate are used to control stress.

Current limit control: Current limit should allow for adjustment at breakdown to avoid damage to the probe card and device. The current limit is usually set by estimating the maximum current during the entire stress process, for example, the current at the beginning of the stress. However, when a device breakdown occurs, the current limit should be lowered accordingly to avoid the high level current, which would clamp to the limit, melting the probe card tips and damaging the devices over an extended time. Some modern solutions offer dynamic limit change capabilities, which allow setting a varying current limit for the system’s SMU instruments when applying the voltage. When this function is enabled, the output current is clamped to the limit (compliance value) to prevent damage to the device under test (DUT).
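The limit-lowering behavior described above can be sketched as a small state holder that latches a reduced compliance once breakdown is detected. The class, its thresholds, and the detection rule (measured current reaching the stress-phase limit) are hypothetical and are not the interface of any particular SMU instrument or test software.

```python
class DynamicCurrentLimit:
    """Tracks which compliance value to apply; lowers it permanently after breakdown."""

    def __init__(self, stress_limit_a, post_breakdown_limit_a):
        if post_breakdown_limit_a >= stress_limit_a:
            raise ValueError("post-breakdown limit must be lower than the stress limit")
        self.stress_limit_a = stress_limit_a
        self.post_breakdown_limit_a = post_breakdown_limit_a
        self.broken_down = False

    def update(self, measured_current_a):
        """Return the compliance to apply after seeing this reading."""
        # Assumed detection rule: current clamped at the stress limit means breakdown.
        if abs(measured_current_a) >= self.stress_limit_a:
            self.broken_down = True
        return self.post_breakdown_limit_a if self.broken_down else self.stress_limit_a


# Hypothetical readings: normal leakage, then a device that hits the 1 mA clamp.
limiter = DynamicCurrentLimit(stress_limit_a=1e-3, post_breakdown_limit_a=1e-5)
limits = [limiter.update(i) for i in (2e-6, 3e-6, 1e-3, 1e-5)]
print(limits)
```

Latching the lower limit keeps a failed device from sitting at the higher clamp for the remainder of a long stress run.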

Stress control: The high voltage stress must be well controlled to avoid overstressing the device, which can lead to unexpected device breakdown. Newer systems may offer a “soft bias” function that allows the forced voltage or current to reach the desired value by ramping gradually at the start or the end of the stress, or when aborting the test, instead of changing suddenly. This helps to prevent in-rush currents and unexpected device breakdowns. In addition, it serves as a timing control over the process of applying stress.
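A gradual ramp of the kind a “soft bias” function provides can be sketched as a list of intermediate setpoints between the starting value and the target. The equal-step scheme and the function name are assumptions for illustration; actual instruments may ramp differently.

```python
def soft_bias_steps(start_v, target_v, n_steps):
    """Intermediate voltage setpoints that reach target_v in n_steps equal increments."""
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    delta_v = (target_v - start_v) / n_steps
    steps = [start_v + delta_v * i for i in range(1, n_steps)]
    steps.append(float(target_v))  # finish exactly on the target, free of rounding drift
    return steps


# Hypothetical ramp up to a 1000 V stress level, then back down when aborting.
ramp_up = soft_bias_steps(0.0, 1000.0, 4)
ramp_down = soft_bias_steps(1000.0, 0.0, 4)
print(ramp_up)
print(ramp_down)
```

The same helper covers the start of the stress, the end of the stress, and an abort, since each is just a ramp between two levels.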

Proper test abort design: The test program must be designed in a way that allows the user to abort the test (that is, terminate the test early) without losing the data already acquired. Test configurations with a “soft abort” function offer the advantage that test data will not be lost at the termination of the test program, which is especially useful for those users who do not want to continue the test as planned. For instance, imagine that 20 devices are being evaluated over the course of 10 hours in a breakdown test and one of the tested devices exhibits abnormal behavior (such as substantial leakage current). Typically, that user will want to stop the test and redesign the test plan without losing the data already acquired.

Data management: Reliability tests can run over many hours, days, or weeks, and have the potential to amass enormous datasets, especially when testing multiple sites. Rather than collecting all the data produced, systems with data compression functions allow logging only the data important to that particular work. The user can choose when to start data compression and how the data will be recorded. For example, data points can be logged when the current shift exceeds a specified percentage as compared to previously logged data and when the current is higher than a specified noise level.
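The logging rule in the example above can be sketched as follows. This is only an illustration of the two conditions described (a percentage shift relative to the last logged point, and a noise floor); the function name, thresholds and sample values are hypothetical.

```python
def compress_log(samples, shift_pct, noise_floor_a):
    """Keep a (time, current) sample only when it is above the noise floor
    and differs from the previously logged current by more than shift_pct.

    The first sample above the noise floor is always logged.
    """
    logged = []
    for t, current in samples:
        if abs(current) <= noise_floor_a:
            continue  # below the noise level: not worth recording
        if not logged:
            logged.append((t, current))
            continue
        last = logged[-1][1]
        if abs(current - last) / abs(last) * 100.0 > shift_pct:
            logged.append((t, current))
    return logged


# Illustrative readings: (time in s, leakage current in A)
readings = [(0, 1e-12), (1, 1.0e-9), (2, 1.02e-9), (3, 1.2e-9), (4, 1.21e-9)]
print(compress_log(readings, shift_pct=10.0, noise_floor_a=1e-11))
# [(1, 1e-09), (3, 1.2e-09)]
```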

A comprehensive hardware and software solution is essential to address these test considerations effectively, ideally one that supports high power semiconductor characterization at the device, wafer and cassette levels. The measurement considerations described above, although very important, are too often left unaddressed in commercial software implementations. The software should also offer sufficient flexibility to allow users to switch easily between manual operation for lab use and fully automated operation for production settings, using the same test plan. It should also be compatible with a variety of sourcing and measurement hardware, typically various models of SMU instruments equipped with sufficient dynamic range to address the application’s high power testing levels.

With the right programming environment, system designers can readily configure test systems with anything from a few instruments on a benchtop to an integrated, fully automated rack of instruments on a production floor, complete with standard automatic probers. For example, Keithley’s Automated Characterization Suite (ACS) integrated test plan and wafer description function allow setting up single or multiple test plans on one wafer and selectively executing them later, either manually or automatically. This test environment is compatible with many advanced SMU instruments, including low current SMU instruments capable of sourcing up to 200V and measuring with 0.1fA resolution and high power SMU instruments capable of sourcing up to 3kV and measuring with 1fA resolution.

Figure 1: Example of a stress vs. time diagram for Vds_Vramp test for a single device and the associated device connection. Drain, gate and source are each connected to an SMU instrument respectively. The drain is used for VDS stress and measure; the VDS range is extended by a positive bias on drain and a negative bias on source. A soft bias (gradual change of stress) is enabled at the beginning and end of the stress (initial bias and post bias). Measurements are performed at the “x” points.

The test development environment includes a VDS breakdown test module that’s designed to apply two different stress tests across the drain and source of the MOSFET structure (or across the collector and emitter of an IGBT) for VDS ramp and HTRB reliability assessment.

Vds_Vramp – This test sequence is useful for evaluating the effect of a drain-source bias on the device’s parameters and offers a quick method of parametric verification (FIGURE 1). It has three stages: optional pre-test, main stress-measure, and optional post-test. During the pre-test, a constant voltage is applied to verify the initial integrity of the body diode of the MOSFET; if the body diode is determined to be good, the test proceeds to the main stress-measure stage. Starting at a lower level, the drain-source voltage stress is applied to the device and ramps linearly to a point higher than the rated maximum voltage or until the user-specified breakdown criterion is reached. If the tested device has not broken down during the main stress stage, the test proceeds to the next step, the post-test, in which a constant voltage is applied to evaluate the state of the device, similar to the pre-test. The measurements throughout the test sequence are made at both source and gate for multi-device testing (or at the drain for the single device case), and the breakdown criterion is based on the current measured at the source (or at the drain for a single device).
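The main stress-measure stage can be pictured as a simple loop: step the voltage linearly, measure, and stop as soon as the breakdown criterion is met. The sketch below is not the ACS module itself; `measure_current` is a hypothetical callback standing in for the SMU measurement, and the toy device and its 700 V breakdown are invented for the example. The pre-test and post-test stages are omitted.

```python
def vramp_main_stage(measure_current, start_v, stop_v, step_v, breakdown_a):
    """Linear voltage ramp with a current-based breakdown criterion.

    Returns (breakdown_voltage, data), where breakdown_voltage is None if
    the ramp reached stop_v without the current reaching breakdown_a, and
    data is the list of (voltage, current) points taken.
    """
    data = []
    n_points = int(round((stop_v - start_v) / step_v)) + 1
    for k in range(n_points):
        v = start_v + k * step_v
        i = measure_current(v)
        data.append((v, i))
        if abs(i) >= breakdown_a:
            return v, data  # stop stressing as soon as breakdown is seen
    return None, data


# Toy device model (assumption): 1 nA leakage until it breaks down at 700 V.
def toy_device(v):
    return 1e-9 if v < 700 else 1e-3


bv, points = vramp_main_stage(toy_device, 0, 1000, 100, breakdown_a=1e-6)
print(bv, len(points))   # 700 8
```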

Vds_Constant – This test sequence can be set up for reliability testing over an extended period and at elevated temperature, such as an HTRB test (FIGURE 2). The Vds_Constant test sequence has a structure similar to that of the Vds_Vramp with a constant voltage stress applied to the device during the stress stage and different breakdown settings. The stability of the leakage current (IDSS) is monitored throughout the test.

FIGURE 3. Example of stress vs. time diagram for Vds_Constant test sequence for vertical structure and multi-device case and the associated device connection. Common drain, gate and source are each connected to an SMU instrument respectively. The source is used for VDS stress and measure; the VDS range is extended by a positive bias on the drain and a negative bias on the source. A soft bias (gradual change of stress) is enabled at the beginning and end of the stress (initial bias and post bias). Measurements are performed at the “x” points.


HTRB testing offers wide bandgap device developers invaluable insights into the long-term reliability and performance of their designs. •

LISHAN WENG is an applications engineer at Keithley Instruments, Inc. in Cleveland, Ohio, which is part of the Tektronix test and measurement portfolio.

A wide array of package level integration technologies now available to chip and system designers is reviewed.

As the technical challenges of shrinking transistors per Moore’s Law become ever harder and costlier to overcome, fewer semiconductor manufacturers are able to upgrade to the next lower process nodes (e.g., 20nm). Therefore, various alternative schemes to cram more transistors within a given footprint without having to shrink individual devices are being actively pursued. Many of these involve 3D stacking to reduce both footprint and the length of interconnect between the devices.

A leading memory manufacturer has just announced 3D NAND products in which circuitry is fabricated in layers, one over the other, on the same wafer, resulting in higher device density on an area basis without having to develop smaller transistors. However, such integration may not be readily feasible when irregular non-memory structures, such as sensors and CPUs, are to be integrated in 3D. Similar limits would also apply to 3D integration of devices that require very different process flows, such as analog with digital processor and memory.

For applications where integration of chips with such heterogeneous designs and processes is required, integration at the package level becomes a viable alternative. For package level integration, 3D stacking of individual chips is the ultimate configuration in terms of reducing footprint and improving performance by shrinking interconnect length between individual chips in the stack. Such packages are already in mass production for camera modules that require tight coupling of the image sensor to a signal processor. Other applications, such as 3D stacks of DRAM chips and CPU/memory stacks, are under development. For these applications 3D modules have been chosen so as to reduce not just the form factor but also the length of interconnects between individual chips.

Figure 1: Equivalent circuit for interconnect between DRAM and SoC chips in a PoP package.

Interconnects a necessary evil

To a chip or system designer, the interconnect between transistors or the wiring between chips is a necessary evil: both introduce parasitic R, L and C into the signal path. For die level interconnects, this problem was recognized at least two decades ago, when RC delay in such interconnects for CPUs became a roadblock to operation over 2GHz. This prompted major changes in materials for wafer level interconnects. For the conductors, the shift was from aluminum to lower resistance copper, which enabled a shrink in geometries. For the surrounding interlayer dielectric, which affects the parasitic capacitance, silicon dioxide was replaced by various low and even ultra low k (dielectric constant) materials, in spite of their poorer mechanical properties. Similar changes were made even earlier in the chip packaging arena, when ceramic substrates were replaced by lower-k organic substrates that also reduced costs. Interconnects in packages and PCBs also introduce parasitic capacitance that contributes to signal distortion and may limit the maximum bandwidth possible. The power lost to the parasitic capacitance of interconnects while transmitting digital signals through them depends linearly on the capacitance as well as the bandwidth. With the rise in bandwidth even in battery driven consumer electronics, such as smart phones, power loss in the package or PCBs becomes ever more significant (30%) as losses in the chips themselves are reduced through better design (e.g., ESD structures with lower capacitance).

Improving the performance of package level interconnects

Over a decade ago the chip packaging world went through a round of reducing interconnect length and increasing interconnect density when, for high performance chips such as CPUs, traditional peripheral wirebond technology was replaced by solder-bumped area-array flip chip technology. The interconnect length was reduced by at least an order of magnitude, with a corresponding reduction in the parasitics and rise in the bandwidth for data transfer to adjacent chips, such as the DRAM cache. However, this improvement in electrical performance came at the expense of mechanical complications, as the tighter coupling of the silicon chip to a substrate with a much larger coefficient of thermal expansion (6-10X that of Si) exposed the solder bump interconnects between them to cyclic stress and transmitted some stress to the chip itself. The resulting Chip Package Interaction (CPI) gets worse with larger chips and weaker low-k dielectrics on the chip.

The latest innovation in chip packaging technology is 3D stacking with through silicon vias (TSVs), where numerous vias (5µm in diameter and getting smaller) are etched in the silicon wafer and filled with a conductive metal, such as Cu or W. The wafers or singulated chips are then stacked vertically and bonded to one another. 3D stacking with TSVs provides the shortest interconnect length between chips in the stack, with improvements in bandwidth, efficiency of power required to transmit data, and footprint. However, as we shall see later, the 3D TSV technology is delayed not only because of the complex logistics issues that are often discussed, but also because of actual technical issues rooted in choices made for the most common variant: TSVs filled with Cu, with parallel wafer thinning.

Figure 2: Breakdown of capacitance contributions from various elements of intra-package interconnect in a PoP. The total may exceed 2 pF.

Equivalent circuit for packages

PoP (package-on-package) is a pseudo-3D package that uses current non-TSV technologies and is ubiquitous in smart phones. In a PoP, two packages (DRAM and SoC) are stacked one over the other and connected vertically by peripheral solder balls or columns. The PoP package is often talked about as a target for replacement by TSV-based 3D stacks. The SoC to DRAM interconnect in the PoP has 4 separate elements (wirebond in the DRAM package, vertical interconnect between the top and bottom packages, substrate trace, and flip chip in the bottom package for the SoC) in series. The equivalent circuit for package level interconnect in a typical PoP is shown in FIGURE 1.

From FIGURE 2 it is seen that interconnect capacitance in a PoP package is dominated by not just wire bonds (DRAM) but the lateral traces in the substrate of the flip chip package (SoC) as well. Both of these large contributions are eliminated in a TSV based 3D stack.

In a 3D package using TSVs the elimination of substrate traces and wire bonds between the CPU and DRAM leads to a 75% reduction in interconnect capacitance (FIGURE 3) with consequent improvement in maximum bandwidth and power efficiency.
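To put the 75% figure in context, the sketch below uses the standard CMOS switching-power relation P = α·C·V²·f, which is linear in capacitance and in toggle rate, as noted earlier. Only the roughly 2 pF PoP total and the 75% reduction come from the text; the 1.2 V swing, 800 MHz toggle rate and activity factor of 0.5 are assumptions chosen purely for illustration.

```python
def io_switching_power_w(cap_f, swing_v, toggle_rate_hz, activity=0.5):
    """Dynamic power dissipated charging one interconnect: alpha * C * V^2 * f."""
    return activity * cap_f * swing_v ** 2 * toggle_rate_hz


POP_CAP_F = 2e-12                        # ~2 pF per interconnect (from FIGURE 2)
TSV_CAP_F = POP_CAP_F * (1 - 0.75)       # 75% reduction quoted for the TSV stack

SWING_V = 1.2                            # assumed signal swing
RATE_HZ = 800e6                          # assumed toggle rate

pop_w = io_switching_power_w(POP_CAP_F, SWING_V, RATE_HZ)
tsv_w = io_switching_power_w(TSV_CAP_F, SWING_V, RATE_HZ)

print(f"PoP: {pop_w * 1e3:.2f} mW per line, TSV: {tsv_w * 1e3:.2f} mW per line")
print(f"ratio: {tsv_w / pop_w:.2f}")
```

Because the relation is linear in C, the power ratio equals the capacitance ratio (0.25) whatever swing and rate are assumed.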

Effect of parasitics

Not only do interconnect parasitics cause power loss during data transmission, but they also affect the waveform of the digital signal. For chips with given input/output buffer characteristics, higher capacitance slows down the rising and falling edges [1,2]. Inductance causes more noise and constricts the eye diagram. Higher interconnect parasitics therefore limit the maximum bandwidth for error free data transmission through a package or PCB.

TSV-based 3D stacking

As has been previously stated, a major reason for developing TSV technology is to use it to improve data transmission (measured by bandwidth and power efficiency) between chips and go beyond the bandwidth limits imposed by conventional interconnect. Recently a national lab in western Europe reported results [3] of stacking a single DRAM chip on a purpose-designed SoC with TSVs in a 4 x 128 bit wide I/O format at a clock rate of just 200MHz. They were able to demonstrate a bandwidth of 12.8 GB/s (2X that in a PoP with LP DDR3 running at 800MHz). Not surprisingly, the energy per bit reported for data transfer (0.9 pJ/bit) was only a quarter of that for the PoP case.
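The bandwidth comparison follows from simple arithmetic on bus width and clock rate. In the sketch below, the wide I/O parameters (4 x 128 bits, 200MHz) and the 0.9 pJ/bit figure come from the text, and the result agrees with the 12.8 GByte/s in the title of reference [3]. The LP DDR3 configuration (one 32-bit channel, two transfers per clock) is an assumption that happens to reproduce the stated 2X ratio.

```python
def peak_bandwidth_gbytes(channels, bits_per_channel, clock_hz, transfers_per_clock):
    """Peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bits_per_second = channels * bits_per_channel * clock_hz * transfers_per_clock
    return bits_per_second / 8 / 1e9


# Wide I/O over TSVs: 4 channels x 128 bits, 200 MHz, one transfer per clock.
wide_io = peak_bandwidth_gbytes(4, 128, 200e6, 1)

# LP DDR3 in a PoP (assumed: one 32-bit channel, 800 MHz, double data rate).
lpddr3 = peak_bandwidth_gbytes(1, 32, 800e6, 2)

# I/O power implied by 0.9 pJ/bit at the wide I/O peak rate.
io_power_w = 0.9e-12 * wide_io * 1e9 * 8

print(f"wide I/O: {wide_io:.1f} GB/s, LP DDR3: {lpddr3:.1f} GB/s")
print(f"ratio: {wide_io / lpddr3:.1f}x, I/O power: {io_power_w * 1e3:.0f} mW")
```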

Despite a string of encouraging results over the last three years from several such test vehicles, TSV-based 3D stacking technology is not yet mature for volume production. This is true for the TSV and manufacturing technology chosen by a majority of developers, namely filling the TSVs with copper and thinning the wafers in parallel but separately, which requires bonding/debonding to carrier wafers. The problems with filling the TSVs with copper have been apparent for several years and affect electrical design [4]. The problem arises from the large thermal expansion mismatch between copper and silicon and the stress caused by it in the area surrounding copper-filled TSVs, which alters electron mobility and circuit performance. The immediate solution is to maintain keep-out zones around the TSVs; however, this affects routing and the length of on-die interconnect. Since the stress field around copper-filled TSVs depends on the square of the via diameter, smaller diameter TSVs are now being developed to shrink the keep-out zone.

Only now are the problems of debonding thinned wafers with TSVs, such as fracturing, and of subsequent handling being addressed, through the development of new adhesive materials that can be depolymerized by laser so that thinned wafers can be removed from the back-up wafer without stress.

The above problems were studied and avoided by the pioneering manufacturer of 3D memory stacks. They changed the via fill material from copper to tungsten, which has a small CTE mismatch with silicon, and opted for a sequential bond/thin process for stacked wafers, thereby totally avoiding any issues from bond/debond or thin wafer handling.
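A rough comparison of the thermal mismatch for the two fill metals is sketched below. The CTE values are approximate room-temperature handbook figures that are not given in the article, and the 100 K temperature excursion is an arbitrary assumption; the point is only the relative size of the mismatch with silicon.

```python
# Approximate linear CTE values in ppm/K (assumed handbook figures).
CTE_PPM_PER_K = {"Si": 2.6, "W": 4.5, "Cu": 17.0}


def mismatch_strain(fill, delta_t_k, substrate="Si"):
    """Unconstrained thermal mismatch strain between a via fill and the substrate."""
    delta_cte = (CTE_PPM_PER_K[fill] - CTE_PPM_PER_K[substrate]) * 1e-6
    return delta_cte * delta_t_k


DELTA_T_K = 100.0   # assumed temperature excursion

for metal in ("Cu", "W"):
    print(f"{metal}: {mismatch_strain(metal, DELTA_T_K):.2e}")

ratio = mismatch_strain("Cu", DELTA_T_K) / mismatch_strain("W", DELTA_T_K)
print(f"Cu/W mismatch ratio: {ratio:.1f}")
```

With these assumed values the copper mismatch is several times that of tungsten, which is consistent with the choice of tungsten fill described above.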

It is baffling why such alternative materials and process flows for TSVs are not being pursued even by U.S. based foundries that seem to take their technical cues instead from a national laboratory in a small European nation with no commercial production of semiconductors!

Figure 3: When TSVs (labeled VI) replace the conventional interconnect in a PoP package, the parasitic capacitance of interconnect between chips, such as SoC and DRAM, is reduced by 75%.

Options for CPU to memory integration

Given the delay in getting 3D TSV technology ready at foundries, it is natural that alternatives like 2.5D, such as planar MCMs on high density silicon substrates with TSVs, have garnered a lot of attention. However, the additional cost of the silicon substrate in 2.5D must be justified from a performance and/or footprint standpoint. Interconnect parasitics due to wiring between two adjacent chips in a 2.5D module are significantly smaller than those in a system built on PCBs with packaged chips, but they are orders of magnitude larger than what is possible in a true 3D stack with TSVs. Therefore, building a 2.5D module of a CPU and an adjacent stack of memory chips with TSVs would reduce the size and cost of the silicon substrate but won’t deliver performance anywhere near an all-TSV 3D stack of CPU and memory.


Alternatives to TSVs for package level integration

Integrating a non-custom CPU with memory chips in a 3D stack would require the addition of redistribution layers, with a consequent increase in interconnection length and degradation of performance. In such cases it may be preferable to avoid adding TSVs to the CPU chips altogether and integrate the CPU with a 3D memory stack via a substrate in a double-sided package configuration. The substrate used is silicon with TSVs and high-density interconnects. Test vehicles for such an integration scheme have been built and their electrical parameters evaluated [5,6]. For cost driven applications, e.g., smart phones, the cost of the large silicon substrates used above may be prohibitive and the conventional PoP package may need to be upgraded. One approach is to shrink the pitch of the vertical interconnects between the top and bottom packages and quadruple the number of these interconnects and the width of the memory bus [7,8]. While this mechanical approach would allow an increase in bandwidth, unlike TSV based solutions it would not reduce the I/O power consumption, as nothing is done to reduce the parasitic capacitance of the interconnect previously discussed (FIGURE 3).

A novel concept of “Active Interconnects” has been proposed and developed at APSTL. This concept employs a more electrical approach to equal the performance of TSVs [1] and replace these mechanically complex intrusions into live silicon chips. Compensation circuits on additional ICs are inserted into the interconnect path of a conventional PoP package for a smart phone (FIGURE 4) to create the SuperPoP package, with bandwidth and power efficiency approaching that of TSV-based 3D stacks without having to insert any troublesome TSVs into the active chips themselves.

Figure 4: Cross-section of an APSTL SuperPoP package under development to equal the performance of TSV based 3D stacks. An integrated circuit with compensation circuits for each interconnect is inserted between the two layers of a PoP for smart phones. This chip contains through vias and avoids insertion of TSVs in high value dice for SoC or DRAM.

A wide array of package level integration technologies now available to chip and system designers has been discussed. The performance of package level interconnect has become ever more important for system performance in terms of bandwidth and power efficiency. The traditional approach of improving package electrical performance by shrinking interconnect length and increasing interconnect density continues with the latest iteration, namely TSVs. Like previous innovations, TSVs too suffer from mechanical complications, only now more magnified due to the stress effects of TSVs on device performance. Further development of TSV technology must not only solve all remaining problems of the current mainstream technology (including Cu-filled vias and parallel thinning of wafers) but also simplify the process where possible. This includes adopting more successful material (Cu-capped W vias) and process choices (sequential wafer bond and thin) already in production. In the meantime, innovative concepts like Active Interconnect, which avoids TSVs altogether, and the APSTL SuperPoP, which uses this concept, show promise for cost-driven, power-sensitive applications like smart phones. •

1. Gupta, D., “A novel non-TSV approach to enhancing the bandwidth in 3D packages for processor-memory modules,” IEEE ECTC 2013, pp. 124-128.

2. Karim, M. et al., “Power Comparison of 2D, 3D and 2.5D Interconnect Solutions and Power Optimization of Interposer Interconnects,” IEEE ECTC 2013, pp. 860-866.

3. Dutoit, D. et al., “A 0.9 pJ/bit, 12.8 GByte/s WideIO Memory Interface in a 3D-IC NoC-based MPSoC,” 2013 Symposium on VLSI Circuits Digest of Technical Papers.

4. Yang, J-S. et al., “TSV Stress Aware Timing Analysis with Applications to 3D-IC Layout Optimization,” 47th ACM/IEEE Design Automation Conference (DAC), June 2010.

5. Tzeng, P-J. et al., “Process Integration of 3D Si Interposer with Double-Sided Active Chip Attachments,” IEEE ECTC 2013, pp. 86-93.

6. Beyene, W. et al., “Signal and Power Integrity Analysis of a 256-GB/s Double-Sided IC Package with a Memory Controller and 3D Stacked DRAM,” IEEE ECTC 2013, pp. 13-21.

7. Mohammed, I. et al., “Package-on-Package with Very Fine Pitch Interconnects for High Bandwidth,” IEEE ECTC 2013, pp. 923-928.

8. Hu, D.C., “A PoP Structure to Support I/O over 1000,” IEEE ECTC 2013, pp. 412-416.

DEV GUPTA is the CTO of APSTL, Scottsdale, AZ.

Inside the Hybrid Memory Cube

September 18, 2013

The HMC provides a breakthrough solution that delivers unmatched performance with the utmost reliability.

Since the beginning of the computing era, memory technology has struggled to keep pace with CPUs. In the mid 1970s, CPU design and semiconductor manufacturing processes began to advance rapidly. CPUs have used these advances to increase core clock frequencies and transistor counts. Conversely, DRAM manufacturers have primarily used the advancements in process technology to rapidly and consistently scale DRAM capacity. But as more transistors were added to systems to increase performance, the memory industry was unable to keep pace in terms of designing memory systems capable of supporting these new architectures. In fact, the number of memory controllers per core decreased with each passing generation, increasing the burden on memory systems.

To address this challenge, in 2006 Micron tasked internal teams to look beyond memory performance. Their aim was to consider overall system-level requirements and create a balanced architecture for higher system level performance with more capable memory and I/O systems. The Hybrid Memory Cube (HMC), which blends the best of logic and DRAM processes into a heterogeneous 3D package, is the result of this effort. At its foundation is a small logic layer that sits below vertical stacks of DRAM die connected by through-silicon vias (TSVs), as depicted in FIGURE 1. An energy-optimized DRAM array provides access to memory bits via the internal logic layer and TSVs, resulting in an intelligent memory device optimized for performance and efficiency.

By placing intelligent memory on the same substrate as the processing unit, each system can do what it’s designed to do more efficiently than previous technologies. Specifically, processors can make use of all of their computational capability without being limited by the memory channel. The logic die, with high-performance transistors, is responsible for DRAM sequencing, refresh, data routing, error correction, and high-speed interconnect to the host. HMC’s abstracted memory decouples the memory interface from the underlying memory technology and allows memory systems with different characteristics to use a common interface. Memory abstraction insulates designers from the difficult parts of memory control, such as error correction, resiliency and refresh, while allowing them to take advantage of memory features such as performance and non-volatility. Because HMC supports up to 160 GB/s of sustained memory bandwidth, the biggest question becomes, “How fast do you want to run the interface?”

The HMC Consortium
A radically new technology like HMC requires a broad ecosystem of support for mainstream adoption. To address this challenge, Micron, Samsung, Altera, Open-Silicon, and Xilinx collaborated to form the HMC Consortium (HMCC), which was officially launched in October 2011. The Consortium’s goals included pulling together a wide range of OEMs, enablers, and tool vendors to work together to define an industry-adoptable serial interface specification for HMC. The consortium delivered on this goal within 17 months and introduced the world’s first HMC interface and protocol specification in April 2013.
The specification provides a short-reach (SR), very short-reach (VSR), and ultra short-reach (USR) interconnection across physical layers (PHYs) for applications requiring tightly coupled or close proximity memory support for FPGAs, ASICs and ASSPs, such as high-performance networking and computing along with test and measurement equipment.

FIGURE 1. The HMC employs a small logic layer that sits below vertical stacks of DRAM die connected by through-silicon-vias (TSVs).

The next goal for the consortium is to develop a second set of standards designed to increase data rate speeds. This next specification, which is expected to gain consortium agreement by 1Q14, shows SR speeds improving from 15 Gb/s to 28 Gb/s and VSR/USR interconnection speeds increasing from 10 to 15–28 Gb/s.

Architecture and Performance

Other elements that separate HMC from traditional memories include raw performance, simplified board routing, and unmatched RAS features. Unique DRAM within the HMC device is designed to support sixteen individual and self-supporting vaults. Each vault delivers 10 GB/s of sustained memory bandwidth for an aggregate cube bandwidth of 160 GB/s. Within each vault there are two banks per DRAM layer, for a total of 128 banks in a 2GB device or 256 banks in a 4GB device. The impact on system performance is significant, with lower queue delays and greater availability of data responses compared to conventional memories that run banks in lock-step. Not only is there massive parallelism, but HMC supports atomics that reduce external traffic and offload remedial tasks from the processor.
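The vault arithmetic can be checked with a short sketch. The vault count, per-vault bandwidth and banks per layer are taken from the text; the DRAM layer counts (4 for the 2GB device and 8 for the 4GB device) are not stated in the article and are inferred here only because they reproduce the quoted bank totals.

```python
VAULTS = 16                       # independent vaults per cube
VAULT_BANDWIDTH_GBS = 10          # sustained GB/s per vault
BANKS_PER_VAULT_PER_LAYER = 2     # banks in each vault on each DRAM layer


def aggregate_bandwidth_gbs():
    """Total sustained bandwidth across all vaults."""
    return VAULTS * VAULT_BANDWIDTH_GBS


def total_banks(dram_layers):
    """Bank count for a cube with the given number of DRAM layers."""
    return VAULTS * BANKS_PER_VAULT_PER_LAYER * dram_layers


print(aggregate_bandwidth_gbs())   # 160
print(total_banks(4))              # 128 (inferred layer count for the 2GB device)
print(total_banks(8))              # 256 (inferred layer count for the 4GB device)
```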

As previously mentioned, the abstracted interface is memory-agnostic and uses high-speed serial buses based on the HMCC protocol standard. Within this uncomplicated protocol, commands such as 128-byte WRITE (WR128), 64-byte READ (RD64), or dual 8-byte ADD IMMEDIATE (2ADD8), can be randomly mixed. This interface enables bandwidth and power scaling to suit practically any design—from “near memory,” mounted immediately adjacent to the CPU, to “far memory,” where HMC devices may be chained together in futuristic mesh-type networks. A near memory configuration is shown in FIGURE 2, and a far memory configuration is shown in FIGURE 3. JTAG and I2C sideband channels are also supported for optimization of device configuration, testing, and real-time monitors.

HMC board routing uses inexpensive, standard high-volume interconnect technologies, routes without complex timing relationships to other signals, and has significantly fewer signals. In fact, 160GB/s of sustained memory bandwidth is achieved using only 262 active signals (66 signals for a single link of up to 60GB/s of memory bandwidth).

FIGURE 2. The HMC communicates with the CPU using a protocol defined by the HMC consortium. A near memory configuration is shown.
FIGURE 3. A far memory communication configuration.

A single robust HMC package includes the memory, memory controller, and abstracted interface. This enables vault-controller parity and ECC correction with data scrubbing that is invisible to the user; self-correcting in-system lifetime memory repair; extensive device health-monitoring capabilities; and real-time status reporting. HMC also features a highly reliable external serializer/deserializer (SERDES) interface with exceptionally low bit error rates (BER) that supports cyclic redundancy check (CRC) and packet retry.

HMC will deliver 160 GB/s of bandwidth, a 15X improvement compared to a DDR3-1333 module running at 10.66 GB/s. With energy efficiency measured in picojoules per bit, HMC is targeted to operate in the 20 pJ/b range. Compared to DDR3-1333 modules that operate at about 60 pJ/b, this represents a 70% improvement in efficiency. HMC also features an almost-90% pin count reduction: 66 pins for HMC versus ~600 pins for a 4-channel DDR3 solution. Given these comparisons, it’s easy to see the significant gains in performance and the huge savings in both the footprint and power usage.
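These comparisons can be reproduced from the quoted figures with a few lines of arithmetic. The helper names below are illustrative. Note that the round 20 pJ/b and 60 pJ/b values give an energy saving of about two thirds, in the neighborhood of the quoted 70%; the DDR3 figure is itself stated as approximate.

```python
def speedup(new, old):
    """How many times larger the new value is than the old one."""
    return new / old


def reduction_pct(new, old):
    """Percentage by which the new value is lower than the old one."""
    return (1.0 - new / old) * 100.0


bandwidth_gain = speedup(160.0, 10.66)        # GB/s, HMC vs. DDR3-1333 module
energy_saving = reduction_pct(20.0, 60.0)     # pJ/b, HMC vs. DDR3-1333
pin_saving = reduction_pct(66, 600)           # pins, HMC vs. 4-channel DDR3

print(f"bandwidth: {bandwidth_gain:.1f}x")
print(f"energy per bit: {energy_saving:.0f}% lower")
print(f"pin count: {pin_saving:.0f}% lower")
```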

Market Potential

HMC will enable new levels of performance in applications ranging from large-scale core and leading-edge networking systems, to high-performance computing, industrial automation, and eventually, consumer products.

Embedded applications will benefit greatly from high-bandwidth and energy-efficient HMC devices, especially applications such as testing and measurement equipment and networking equipment that utilizes ASICs, ASSPs, and FPGA devices from both Xilinx and Altera, two Developer members of the HMC Consortium. Altera announced in September that it has demonstrated interoperability of its Stratix FPGAs with HMC to benefit next-generation designs.

According to research analysts at Yole Développement Group, TSV-enabled devices are projected to account for nearly $40B by 2017—which is 10% of the global chip business. To drive that growth, this segment will rely on leading technologies like HMC.

FIGURE 4. Engineering samples are set to debut in 2013, with 4GB production in 2014.

Production schedule
Micron is working closely with several customers to enable a variety of applications with HMC. HMC engineering samples of a 4 link 31X31X4mm package are expected later this year, with volume production beginning the first half of 2014. Micron’s 4GB HMC is also targeted for production in 2014.

Future stacks, multiple memories
Moving forward, we will see HMC technology evolve as volume production reduces costs for TSVs and HMC enters markets where traditional DDR-type of memory has resided. Beyond DDR4, we see this class of memory technology becoming mainstream, not only because of its extreme performance, but because of its ability to overcome the effects of process scaling as seen in the NAND industry. HMC Gen3 is on the horizon, with a performance target of 320 GB/s and an 8GB density. A packaged HMC is shown in FIGURE 4.

Among the benefits of this architectural breakthrough is the future ability to stack multiple memories onto one chip. •

THOMAS KINSLEY is a Memory Development Engineer and ARON LUNDE is the Product Program Manager at Micron Technology, Inc., Boise, ID.

Megasonics has been considered and used for many years to meet many cleaning challenges, but it has been shown to cause damage to nanoscale device structures such as polysilicon lines.

By Ahmed Busnaina, Northeastern University

Cavitation threshold, defined as the minimum pressure amplitude to induce cavitation, has been identified more than 40 years ago. A lower cavitation threshold indicates that cavitation (micro bubble implosion) occurs more readily. At megasonic frequencies, the cavitation threshold is very high which indicates that it’s unlikely to have cavitation at high (megasonic) frequencies. However, many have clearly shown that cavitation damage does occur at megasonic frequencies using commercial megasonic equipment.

If cavitation is not supposed to happen but does, why? The answer is that cavitation does not occur at megasonic frequencies (400 kHz and higher), as shown by many over the last few decades. Cavitation is instead caused by secondary frequencies as low as 40 kHz that exist in megasonic tanks with sufficiently high power to generate the ultrasonic cavitation responsible for damage.

Cavitation is the result of bubble implosion, which occurs at high pressure amplitudes, typically at low frequencies (below 100 kHz). Frequency and amplitude (power) measurements show that traditional megasonic transducers generate many frequencies as low as 40 kHz at high amplitude (power). For example, a commercial megasonic tank operating at 700 kHz shows significant pressure amplitude (close to the pressure at 700 kHz) at 100 kHz or less, on top of the transducer or next to it. However, a narrow bandwidth transducer operating at 600 kHz does not show any large pressure amplitudes at lower frequencies. Narrow band transducer is a term used to indicate that large amplitudes at lower frequencies (at or below 100 kHz) are minimized or eliminated. With these amplitudes minimized, damage does not occur even at high power.
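The comparison described above can be illustrated with a minimal sketch that checks a sampled pressure trace for strong components at or below 100 kHz relative to the drive frequency. Everything in it is an assumption made for illustration: the signals are synthetic, the amplitudes (0.8 and 0.01 relative to the drive tone) are invented rather than measured, and the function names and the 0.1 flagging threshold are hypothetical. The single-frequency amplitude estimate uses the standard Goertzel algorithm, not any method named in the measurements.

```python
import math

def goertzel_amplitude(samples, sample_rate_hz, target_hz):
    """Estimate the amplitude of one frequency component (Goertzel algorithm)."""
    n = len(samples)
    k = round(n * target_hz / sample_rate_hz)  # nearest DFT bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    power = s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2
    return 2.0 * math.sqrt(max(power, 0.0)) / n

def low_frequency_ratio(samples, sample_rate_hz, drive_hz, low_freqs_hz):
    """Largest low-frequency amplitude divided by the drive-frequency amplitude."""
    drive = goertzel_amplitude(samples, sample_rate_hz, drive_hz)
    low = max(goertzel_amplitude(samples, sample_rate_hz, f) for f in low_freqs_hz)
    return low / drive

def synth_trace(sample_rate_hz, n, tones):
    """Build a synthetic pressure trace from (frequency_hz, amplitude) pairs."""
    return [
        sum(a * math.sin(2.0 * math.pi * f * i / sample_rate_hz) for f, a in tones)
        for i in range(n)
    ]

# Assumed sampling setup: 4 MHz for 1 ms, so each DFT bin is 1 kHz wide.
SAMPLE_RATE_HZ = 4_000_000
N_SAMPLES = 4000
LOW_FREQS_HZ = [40e3, 60e3, 80e3, 100e3]  # band at or below 100 kHz
RATIO_THRESHOLD = 0.1                      # hypothetical flagging threshold

# Invented example traces: a 700 kHz tank with a strong 40 kHz component,
# and a 600 kHz narrow bandwidth transducer with almost none.
conventional = synth_trace(SAMPLE_RATE_HZ, N_SAMPLES, [(700e3, 1.0), (40e3, 0.8)])
narrow_band = synth_trace(SAMPLE_RATE_HZ, N_SAMPLES, [(600e3, 1.0), (40e3, 0.01)])

conventional_ratio = low_frequency_ratio(conventional, SAMPLE_RATE_HZ, 700e3, LOW_FREQS_HZ)
narrow_ratio = low_frequency_ratio(narrow_band, SAMPLE_RATE_HZ, 600e3, LOW_FREQS_HZ)

print("conventional flagged:", conventional_ratio > RATIO_THRESHOLD)
print("narrow band flagged:", narrow_ratio > RATIO_THRESHOLD)
```

With these invented inputs, the first trace is flagged for high-amplitude low-frequency content and the second is not. A real assessment would use measured hydrophone data and a threshold tied to the cavitation threshold of the cleaning liquid.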

FIGURE 1. SEM images of 120nm (A and C) and 350nm (B and D) lines after cleaning with 100% power for 5 minutes. While the single-wafer megasonic tank damages the structures, the narrow bandwidth transducer preserves the patterns. The investigated area is 1mm².

FIGURE 1 shows SEM images of 120nm (A and C) and 350nm (B and D) lines after cleaning with 100% power for 5 minutes. While a conventional single-wafer megasonic tank (3A and 3C) damages the structures, the narrow bandwidth transducer (3B and 3D) preserves the patterns. The investigated area is 1mm². This shows that damage in a conventional megasonic tank is the result of these low frequencies, and that by eliminating these low frequencies (with high amplitude), damage can be eliminated without sacrificing effective cleaning.

Transducers in conventional megasonic tanks that give rise to high pressure amplitudes at low frequencies are the main culprit and the cause of the observed damage. Frequencies measured in conventional megasonic tanks and in a narrow bandwidth megasonic tank show that high-amplitude low frequencies exist in typical megasonic tanks, even though they operate at high frequencies. Narrow bandwidth transducers do not have high pressure amplitudes at low frequencies. This leads to the reduction or elimination of the pattern damage that occurs in conventional megasonic tanks. Therefore, if tanks are made to generate high amplitudes only at high frequencies, the issue of damage could be resolved. •