Siliconica

Monthly Archives: January 2017

Intel’s 10nm Enigma

By Dick James

I’ve been looking back at the talk given by Mark Bohr and Zane Ball (Building Winning Products with Intel® Advanced Technologies and Custom Foundry Platforms) at the Intel Developer Forum (IDF) in August last year, and I’m a bit puzzled.

Mark Bohr presenting at the 2016 IDF in San Francisco (Source: Intel)


Mark announced the gate pitch as 54 nm, which I make as 0.77 x 70 nm (the 14-nm gate pitch):

Gate pitch

And he also said that their measure of scaling is gate pitch x cell height:

Cell size

and then he said that new design rules give even better scaling:

Cell size 2

“Our trend in reducing logic cell area has been about 0.46x per generation, a little bit faster than the typical Moore’s Law of 0.5x.

On our last two generations, 14 nm and now on 10 nm, we’re actually scaling our logic cell area a little bit faster than what that simple metric suggests; there are some other tricks that we’re doing on 10 nm that is providing even faster than normal logic cell area scaling, so although 0.46x was the long-term trend over the past four generations, it’s actually a bit faster on our 14 and now again on our 10-nm technology.”

If you look at the numbers in the revised graph, then we appear to have a scaling factor of ~0.37 per generation, which is indeed quite impressive!
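A quick way to see what these scaling factors imply is to split the area factor into its gate-pitch and cell-height parts, using Intel's own metric of gate pitch x cell height. The sketch below is only that arithmetic; the constant and function names are illustrative, and the 0.37 figure is an approximate reading of the graph.

```python
# Figures quoted in the post; names are illustrative, not Intel's.
GATE_PITCH_14NM = 70.0        # nm
GATE_PITCH_10NM = 54.0        # nm
AREA_SCALING_TREND = 0.46     # long-term trend quoted by Mark Bohr
AREA_SCALING_REVISED = 0.37   # approximate, read off the revised graph


def implied_height_scaling(area_scaling: float, pitch_scaling: float) -> float:
    """Cell-height factor needed when cell area = gate pitch x cell height."""
    return area_scaling / pitch_scaling


pitch_scaling = GATE_PITCH_10NM / GATE_PITCH_14NM
height_trend = implied_height_scaling(AREA_SCALING_TREND, pitch_scaling)
height_revised = implied_height_scaling(AREA_SCALING_REVISED, pitch_scaling)

print(f"gate pitch scaling:          {pitch_scaling:.2f}x")
print(f"cell height scaling (0.46x): {height_trend:.2f}x")
print(f"cell height scaling (0.37x): {height_revised:.2f}x")
```

With these inputs the gate pitch only accounts for about 0.77x, so a 0.37x area factor needs the cell height to shrink to a little under half of its 14-nm value.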

The cell height can also be measured by the number of metal tracks that are needed for routing for the cell; in recent nodes we have gone from 12-track (12T), to 9T, to ~7.5T in the latest 14- and 16-nm processes. So we can also describe cell height as the number of metal pitches (MPP) in a cell.

That leads me to the enigma – if you take the 10-nm number from the above graph, ~11,000 nm2, and divide the 54-nm gate pitch into it to get the cell height, and then divide that by the minimum SADP (self-aligned double patterning) metal pitch of ~40 nm to get the number of tracks, then you get a five-track cell, which seems really ambitious for a one-generation shrink.

If you go the other way and plug in 54 nm and a six-track cell, then the MPP comes out at 34 nm, which presumably means SAQP (self-aligned quadruple patterning), which again sounds really ambitious.
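The two back-of-envelope calculations above can be written out as a short sketch. The inputs are the approximate numbers quoted in the text (the ~11,000 nm2 area is read off a graph), and the function names are mine, purely for illustration.

```python
CELL_AREA_10NM = 11_000.0   # nm^2, approximate value read from the graph
GATE_PITCH = 54.0           # nm, announced 10-nm gate pitch
SADP_METAL_PITCH = 40.0     # nm, approximate SADP minimum metal pitch


def cell_height(area_nm2: float, gate_pitch_nm: float) -> float:
    """Cell height implied by area = gate pitch x cell height."""
    return area_nm2 / gate_pitch_nm


def track_count(height_nm: float, metal_pitch_nm: float) -> float:
    """Number of metal tracks that fit in the cell height."""
    return height_nm / metal_pitch_nm


def metal_pitch_for_tracks(height_nm: float, tracks: float) -> float:
    """Metal pitch needed to fit a given number of tracks."""
    return height_nm / tracks


height = cell_height(CELL_AREA_10NM, GATE_PITCH)
tracks_at_sadp = track_count(height, SADP_METAL_PITCH)
pitch_for_6t = metal_pitch_for_tracks(height, 6)

print(f"cell height:        {height:.1f} nm")
print(f"tracks at 40 nm:    {tracks_at_sadp:.2f}")
print(f"pitch for 6 tracks: {pitch_for_6t:.1f} nm")
```

The cell height comes out at roughly 204 nm, which is about 5.1 tracks at a 40-nm pitch, or a ~34-nm pitch for six tracks.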

The 5T cell is more in keeping with “design rule enhancements” but if that is the case, that also requires a reduction in the number of fins per transistor, which implies taller fins or other tweaks to maintain transistor performance; or SAQP for fin definition, to allow increased fin density. Given that the 14-nm fin pitch was ~42 nm, already close to the SADP limit, the latter may be a real possibility (a 76% linear shrink would be ~32 nm).

If they’ve done any of these, I guess it could account for the increased time between generations!

Mark also broke with the current convention of showing performance plots with the dreaded “arbitrary units”; later in the talk he showed the four transistor options available in their 10-nm process:

Transistor options

According to Mark, the four options will use the same 54nm gate pitch. NMOS drive currents are still higher than PMOS, which to me suggests two things – there is a seventh-generation strain mechanism at work for NMOS, and it seems unlikely that we have a different channel material such as SiGe in the PMOS devices.

In keeping with their foundry ambitions, there will be three evolutions of the 10-nm process, with the initial launch of 10, then 10+ and 10++, as well as a SoC version of the process with high-voltage and analog elements, and three interconnect stacks. In any case, these numbers do support the Intel claim that their process is a true shrink from 14 nm, not just an improved 14-nm process.

The increased shrink allows Intel to stay ahead of the cost curve, so that we still have improved PPAC (performance/power/area/cost) numbers.

Transistor cost

We will see if any more information comes from the quarterly call this week, or at the Investor Meeting next month, but in the meantime, we have our mystery – do we have a five-track cell, or am I missing something?

IEDM 2016 – Setting the Stage for 7/5 nm

By Dick James

At IEDM last month, there was much ado about the adjacent 7-nm late-news papers from TSMC and the GLOBALFOUNDRIES/IBM/Samsung consortium from the Albany Nanotechnology Center, and with less ado, Samsung gave a 5-nm presentation later in the conference. Here we discuss all three talks, and try to make some comparisons.

TSMC 7 nm

In paper 2.6 [1], TSMC announced the “world’s first 7nm CMOS platform technology for mobile system-on-a-chip (SoC) applications, featuring FinFET transistors”. They claimed the world’s smallest-ever SRAM cell at 0.027 µm2, and 3x the gate density of the 16-nm (16 FF+) process, together with a 35 – 40% speed gain or over 65% power reduction. In addition, the process uses 193 nm immersion lithography, dual raised source/drain epi, a novel contact technique, and a 12-layer copper/low-k interconnect stack.

The fourth-generation fin profile and width are “carefully optimized” for the fifth-generation HKMG gate-last, dual gate oxide process, with an effective gate length (leff) centered around 16.5 nm. Sub-threshold swing has been pushed down to ~65mV/decade, and DIBL is ~40 mV/V.

TSMC SRAM IEDM

Four Vt options are available in the TSMC 7-nm technology [1]

There are four device Vt options with a range of ~200 mV.

The contacted poly pitch (CPP) is not stated (Scotten Jones speculates that it is 54 nm, the same as Intel’s 10-nm process), but it is enabled by a novel contact process; we also have “novel strain engineering and new process knobs”, which boost mobility and reduce parasitic resistance to give increased drive current (at least in arbitrary units).

The 1x metal pitch is 40 nm for M0 to M4, and M5 – M9 are 1.9x (76 nm). The paper states that “single patterning is adopted for metal layers with 2X minimum metal pitch and above” – which makes me wonder if they’ve managed to push single patterning to the 76-nm pitch, or whether they are going with double patterning for the first nine levels.
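The reason for the puzzlement is easier to see with the numbers side by side. This is a minimal sketch of the comparison, taking the quoted “2X minimum metal pitch and above” rule literally; the constant names are illustrative.

```python
MIN_METAL_PITCH = 40.0            # nm, M0 to M4
UPPER_PITCH_FACTOR = 1.9          # M5 to M9, as stated in the paper
SINGLE_PATTERNING_FACTOR = 2.0    # "2X minimum metal pitch and above"

upper_pitch = UPPER_PITCH_FACTOR * MIN_METAL_PITCH
single_patterning_threshold = SINGLE_PATTERNING_FACTOR * MIN_METAL_PITCH

# Taken literally, a layer qualifies for single patterning only at or
# above the 2X threshold.
upper_layers_qualify = upper_pitch >= single_patterning_threshold

print(f"M5-M9 pitch:                 {upper_pitch:.0f} nm")
print(f"single-patterning threshold: {single_patterning_threshold:.0f} nm")
print(f"M5-M9 meet the 2X rule:      {upper_layers_qualify}")
```

Read literally, the 76-nm layers fall just short of the 80-nm threshold, which is why the wording leaves it open whether they are single- or double-patterned.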

An earlier SRAM paper was given in June at the VLSI meeting [2], as a sub-0.03 µm2 bitcell, aimed at a “beyond 10-nm node”, so likely the same SRAM. It also has an leff centered on ~16.5 nm, and claims similar performance figures. Some details of the inter-well spacing are also included [2]:

TSMC SRAM VLSI

Which allows us to speculate about device sizings, at least in the SRAM cell itself. Typically, a 6-transistor (6T) SRAM unit cell is 2 x CPP high, so if we take the guesstimate of 54 nm for CPP, the 0.027 µm2 cell should have a height of 108 nm. Dividing that into 27,000 nm2, we get a cell width of 250 nm. The 16FF cell was 0.070 µm2 [3], 2.6 x the area of the 7-nm cell, confirming the claim of a 2.6 x array density increase in the paper.
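The bitcell arithmetic can be sketched as follows. The 54-nm CPP is the guesstimate discussed above, the 27,000 nm2 area is the 0.027 µm2 from the title of [1], and the 16FF area is taken as 0.070 µm2, which is the value consistent with the 2.6x ratio; the names are illustrative.

```python
CPP_GUESS = 54.0                # nm, speculated contacted poly pitch
BITCELL_AREA_7NM = 27_000.0     # nm^2 (0.027 um^2, from the title of [1])
BITCELL_AREA_16FF = 70_000.0    # nm^2 (0.070 um^2, assumed for 16FF)

# A 6T SRAM unit cell is typically two contacted poly pitches high.
bitcell_height = 2 * CPP_GUESS
bitcell_width = BITCELL_AREA_7NM / bitcell_height
area_ratio = BITCELL_AREA_16FF / BITCELL_AREA_7NM

print(f"bitcell height: {bitcell_height:.0f} nm")
print(f"bitcell width:  {bitcell_width:.0f} nm")
print(f"16FF / 7 nm:    {area_ratio:.2f}x")
```

If the real CPP differs from the 54-nm guess, the width changes in inverse proportion, so the 250-nm figure is only as good as that assumption.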

I don’t have a plan-view image of the TSMC 16-nm cell, which I assume is a 1:1:1 PU:PG:PD cell (i.e. one fin for each of the pull-up/pass gate/pull-down transistors), but Intel kindly provided one of their 14-nm cell in a JSSC paper [4]:

TEM image of Intel 14-nm SRAM cell [4]


The Intel 14-nm cell size is 140 x 360 nm, to give a cell size of ~0.050 µm2, considerably smaller than the 16FF cell; we can see that each transistor uses one fin, and there are four fins in the cell. In this case the fin pitch is ~80 nm, instead of the nominal 42 nm, but we have to allow space between the fins and the edge of the N-well that the PMOS pull-up transistors sit in. Theoretically the two pull-up transistor fins could use the minimum pitch, but Intel have chosen not to do that here.

Applying these considerations to TSMC’s cell, if we use the maximum fin-well edge spacing of 23 nm shown above, plus (say) 8 nm for the fin width, then we get a PU – PD/PG spacing of 2 x 23 = 46, + 8 = 54 nm; if we assume a SADP (self-aligned double patterning) minimum fin pitch of 40 nm between the PU transistors, then we get a total of 148 nm for the center of fin 1 – fin 4, which leaves us 51 nm at each end of the 250-nm cell for the PD/PG – PG contact spacing. If the PU/PU pitch is also 54 nm, that only leaves 44 nm at the end of the cell, which is pushing the limits for double-patterned contact spacing. Which gives us something like this – just guessing!

Speculative layout of TSMC 7-nm SRAM bitcell
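The spacing budget behind this guess can be laid out explicitly. This is a sketch under the same assumptions as the text (23-nm fin-to-well-edge spacing, an assumed 8-nm fin width, and a 250-nm cell width derived from a 54-nm CPP guess); the names are illustrative.

```python
FIN_TO_WELL_EDGE = 23.0   # nm, maximum spacing from the VLSI paper figure
FIN_WIDTH = 8.0           # nm, assumed
SADP_FIN_PITCH = 40.0     # nm, assumed SADP minimum fin pitch
CELL_WIDTH = 250.0        # nm, from 27,000 nm^2 / (2 x 54 nm)

# Pull-up to pull-down/pass-gate fin pitch across the well edge.
pu_to_pd_pitch = 2 * FIN_TO_WELL_EDGE + FIN_WIDTH


def end_margin(pu_pu_pitch: float) -> float:
    """Space left at each end of the cell outside the four-fin span."""
    fin_span = 2 * pu_to_pd_pitch + pu_pu_pitch
    return (CELL_WIDTH - fin_span) / 2


margin_min_pitch = end_margin(SADP_FIN_PITCH)     # PU/PU at SADP minimum
margin_relaxed_pitch = end_margin(pu_to_pd_pitch)  # PU/PU also at 54 nm

print(f"PU to PD/PG pitch:        {pu_to_pd_pitch:.0f} nm")
print(f"end margin, 40 nm PU/PU:  {margin_min_pitch:.0f} nm")
print(f"end margin, 54 nm PU/PU:  {margin_relaxed_pitch:.0f} nm")
```

With these inputs the margins come out at roughly 51 nm and 44 nm per side; both move directly with the assumed cell width.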


GLOBALFOUNDRIES/IBM/Samsung 7 nm

The other 7-nm paper [5] from Albany was clearly a research paper, but illuminating in that it shows other possible directions, not the least being the use of EUV lithography, SiGe channels for PMOS, and stress applied to the channels using a strain-relaxed buffer (SRB) substrate.

The application of a SRB substrate to generate channel stress takes me back 15 – 20 years, to the late 90’s and the turn of the millennium, when a lot of work was published on the topic by Stanford, MIT, and IBM. If a silicon epitaxial layer is grown on a SiGe substrate, then the lattice mismatch creates biaxial tensile stress in the layer, and the greater the Ge content, the greater the stress. The earliest reports I can find date back to 1992/4/5 [6, 7, 8], but the effect is nicely summarized in this plot from IEDM 2003 [9]:

Mobility enhancement vs. strain and Ge % in strained Si/relaxed SiGe MOSFETs [9].


As we can see, low Ge concentration gives a large increase in electron mobility, but a high Ge content is required to enhance hole mobility.

In this paper, we have the following structure:

Schematic (center) of dual-stressed channel materials on the SRB with a super-steep retrograde well (SSRW), along with dark-field TEM images of (a) the tensile-strained silicon fin and (b) the compressively-strained SiGe fin on a common SRB [5]


This gets around the weak PMOS improvement in silicon from the SRB by using 25% Ge in the SRB and growing a 50% Ge fin; if silicon is tensile-stressed, then a layer with more Ge than the SRB will be compressively stressed; and as we know, compressive stress is a big lever for PMOS performance. The authors claim that this combination gives ~1.6 GPa enhancement stress in both NMOS and PMOS devices. SiGe also has a higher hole mobility, compounding the performance gain.
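The sign of the strain in each fin follows from the lattice mismatch with the buffer. The sketch below estimates it with Vegard's law, which is a linear approximation (real SiGe deviates slightly), textbook lattice constants for Si and Ge, and the assumptions that the SRB is fully relaxed and the fins are fully strained to it. None of this comes from the paper, and it does not attempt to reproduce the quoted stress values.

```python
A_SI = 5.431   # angstrom, textbook lattice constant of silicon
A_GE = 5.658   # angstrom, textbook lattice constant of germanium


def lattice_constant(ge_fraction: float) -> float:
    """Relaxed Si(1-x)Ge(x) lattice constant by Vegard's law (linear)."""
    return A_SI + ge_fraction * (A_GE - A_SI)


def in_plane_strain(layer_ge: float, buffer_ge: float) -> float:
    """Strain of a layer stretched or squeezed to match a relaxed buffer.

    Positive values are tensile, negative values are compressive.
    """
    a_layer = lattice_constant(layer_ge)
    a_buffer = lattice_constant(buffer_ge)
    return (a_buffer - a_layer) / a_layer


SRB_GE = 0.25   # Ge fraction in the strain-relaxed buffer
si_fin_strain = in_plane_strain(0.0, SRB_GE)     # silicon NMOS fin
sige_fin_strain = in_plane_strain(0.50, SRB_GE)  # 50% Ge PMOS fin

print(f"Si fin on 25% SRB:      {si_fin_strain:+.2%}")
print(f"SiGe 50% fin on 25% SRB: {sige_fin_strain:+.2%}")
```

Under these assumptions the two fins see strains of about 1% with opposite signs, tensile for silicon and compressive for the 50% Ge layer, which fits the idea of a single 25% Ge buffer serving both device types.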

As I remember it, SRB stress never made it into production, likely for two reasons – it was difficult to get rid of the dislocations formed in the SRB, and they propagated through into the sSi; and more production-friendly sources of uniaxial stress could be supplied by tensile nitride and embedded SiGe source/drains.

Now that we are in the finFET era, and twenty years on, we have the advantage of better process control (so likely lower defect density), and any defects that are formed cannot propagate up the fin because of its high aspect ratio. In addition, fins formed in the correct orientation on a SRB use only one axis of the biaxial stress, giving the uniaxial stress that we are used to; so maybe this technique can become the stress mechanism for the 7/5 nm nodes.

Biaxial stressed layer becomes uniaxially-stressed finFET [10]


If I read the paper correctly, the SSRW is grown epitaxially as part of the SRB (“An epi based SSRW technique is utilized to improve sub-fin isolation” [5]), before the strained silicon (sSi) epi is grown; the sSi is then etched back and the 50% SiGe layer is formed and (presumably) polished back to separate the sSi and SiGe regions before fin etch [11].

Self-aligned quadruple patterning (SAQP) was used for the fins (my notes say the fin pitch was 27 nm), and SADP for the gates with a CPP of 44/48 nm. EUV was reserved for the middle-of-line (MOL) and lower metal levels, with a minimum metal pitch of 36 nm.

(a) Schematic flow for SAQP fin patterning (b) top-down SEM of fins before cut/block mask [5]

Top-down SEMs of (a) BEOL M1 lines with 36nm pitch, and typical MOL trenches with (c) 45°, (d) 90° cross-couples, (24nm trench width), all patterned by EUV lithography [5]

The EUV process was presented at last year’s IITC/AMC conference [12]; a metal hard mask was used to pattern lines and self-aligned vias into an ultra-low-k dielectric (k~2.45), and a TaN/Ru liner stack was filled conventionally with a CVD Cu seed and plating. A Co cap and SiCN/SiNO layer sealed the interconnect, giving acceptable TDDB (time-dependent dielectric breakdown) and electro-migration results.

M1 – M3 stack in test die used in [12]


Cross section and elemental mapping of M1 Cu lines with TaN/Ru barrier and selective Co cap [12]


Contacts are self-aligned, with the use of a M0 level, CA/CB contacts, and a sub-contact (TS) for source/drains. The CA/CB/M0 metallization is dual-damascene cobalt, lowering line resistance, while the TS sub-contacts appear to be tungsten. Before the TS contacts are filled, Si:P and SiGe:B epi is grown in the contact trenches, and then implanted and annealed, to give improved contact resistance.

Middle-of-line architecture (left), and dark-field TEM of M0/CA/TS stack [5]


The gate profile was modified by etching back the high-k layer before depositing the work-function metal (WFM), which helps isolate the high-k from the self-aligned contact process reactants, and improves control of the WFM recess before metal fill and dielectric deposition.

Cross-section TEM of <17 nm gate showing etch-backs of high-k and WFM [5]

As you can see from the above, there was quite a bit of detail in this presentation, which can be summarized in the process sequence shown [5]:

IBM flow

Strangely, despite the tighter pitches when compared with the TSMC SRAM bitcell, the size is the same, ~0.027 µm2, though we don’t know if this is a 1:1:1 cell or not – if (say) a 1:2:1 configuration is used, that would add at least ~0.0052 µm2 to the cell size, assuming the 48 nm CPP. (The paper does not state bitcell size, but in the Q & A’s, we were told that it was 50% of the 10-nm bitcell, which was quoted at ~0.053 µm2.) The Q & A’s also mentioned that there were three flavours of Vt, and high-Vt I/O transistors were not studied, reinforcing the research nature of the paper.
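The area penalty for a 1:2:1 cell can be estimated from the pitches given earlier. This sketch assumes one extra fin on each side of the cell at the 27-nm fin pitch from my notes, a cell height of 2 x CPP as in the TSMC estimate, and a base bitcell of 27,000 nm2 (0.027 µm2); the names are illustrative and the real layout may well differ.

```python
FIN_PITCH = 27.0          # nm, SAQP fin pitch (from my notes)
CPP = 48.0                # nm, the larger of the 44/48 nm CPP values
BASE_BITCELL = 27_000.0   # nm^2, i.e. 0.027 um^2
EXTRA_FINS = 2            # assumed: one extra pass-gate fin per side

# A 6T bitcell is typically two contacted poly pitches high.
bitcell_height = 2 * CPP
extra_area = EXTRA_FINS * FIN_PITCH * bitcell_height
relative_increase = extra_area / BASE_BITCELL

print(f"bitcell height: {bitcell_height:.0f} nm")
print(f"extra area:     {extra_area:.0f} nm^2 ({extra_area / 1e6:.4f} um^2)")
print(f"increase:       {relative_increase:.0%}")
```

On these assumptions the extra fins cost a little over 5,000 nm2, or roughly a fifth of the bitcell area, which is why the fin configuration matters when comparing the two cells.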

Samsung 5 nm

Later in the conference (paper 28.1), Samsung presented a “co-integration scheme for 5nm logic” [13], which clearly drew on the 7-nm work from Albany detailed above and illuminated some more of the development problems that must have been seen in Albany.

A SRB substrate is used with a SiGe fin for PMOS, and a common interlayer, high-k, and work-function materials. The SiGe fin, combined with e-SiGe source/drains, gives an estimated 1 GPa compressive stress, and the SRB applies similar tensile stress to the NMOS channel. As with the Albany process, the Ge concentration increases as we go from SRB to fin to e-SiGe source/drains.

Schematic of Samsung 5-nm CMOS design concept [13]


Defect density from the SRB was definitely a concern; the threading dislocation density (TDD) was reduced to 5e4/cm2, and SRAM results then demonstrated that leakage levels are comparable with those of a reference SRAM structure on bulk Si. My notes say that a thicker SRB was used, but no actual thicknesses were mentioned.

SiGe SRB TDD evolution with lowest TDD of 5e4/cm2 (left), 128M finFET SRAM with TDD 2.3e5 /cm2 (right) shows comparable yield and leakage with reference SRAM [13]


Another problem was migration of the Ge to the surface of the SiGe fin (shifting Vt and degrading interface state density), because of later thermal processing, as shown in this LEAP (local electrode atom probe) image:

Ge (blue) in SiGe fin, showing higher Ge content at the surface [13]


Careful optimization of the thermal sequencing reduced this to about a 4% variation. Since the STI penetrates into the buffer layer, and we have a SiGe fin, a new STI formation process had to be developed, to reduce any side-effects from oxidation.

More details were given of the stress development; the presenter showed that the strain was uniaxially transferred to the fin, and also that the source/drain recess etch relaxed the channel stress – in the PMOS device the e-SiGe epi restored the stress, but for NMOS a non-recessed S/D was used.

Geometric phase analysis (GPA) shows uniaxial (along fin) tensile strain induced by SRB, and strain fully relaxed perpendicular to fin (a). Strain profile along fin depth shows uniaxial strain along fin but almost fully relaxed strain across fin (b) [13]


When it comes to the electrical results, long- and short-channel plots were shown, but with no numbers for either device size or measurements, so we have to trust that they actually fit 5-nm node dimensions, or at least are for smaller pitches than the 7-nm papers detailed. However, as an integration scheme it is interesting, as it gives us some clues as to what we might see from Samsung and GLOBALFOUNDRIES as we go from 10 – 7 – 5 nm.

Given the lack of detail in TSMC’s presentation, we don’t know what their novel contact and strain engineering and process knobs are – could they be contact epi and SRB strain? I guess we’ll see in a couple of years or so.

N/PFET channel average stress evolution during processing – the relative strain clearly shows relaxation from the S/D recess process. After recess optimization, relaxation was minimized for the NFET (but non-recessed process chosen), and recovered with eSiGe process [13]


References

  [1] S-Y Wu, et al., “A 7 nm CMOS Platform Technology Featuring 4th Generation FinFET Transistors with a 0.027 um2 High Density 6-T SRAM cell for Mobile SoC Applications”, IEDM 2016, pp. 43 – 46
  [2] S-Y Wu, et al., “Demonstration of a sub-0.03 um2 High Density 6-T SRAM with Scaled Bulk FinFETs for Mobile SOC Applications Beyond 10nm Node”, VLSI 2016, pp. 92 – 93
  [3] S-Y Wu, et al., “An Enhanced 16nm CMOS Technology Featuring 2nd Generation FinFET Transistors and Advanced Cu/low-k Interconnect for Low Power and High Performance Applications”, IEDM 2014, pp. 48 – 51
  [4] Karl, et al., “A 0.6 V, 1.5 GHz 84 Mb SRAM in 14 nm FinFET CMOS Technology With Capacitive Charge-Sharing Write Assist Circuitry”, IEEE JSSC, vol. 51, no. 1 (Jan 2016), pp. 222 – 228
  [5] Xie, et al., “A 7nm FinFET Technology Featuring EUV Patterning and Dual Strained High Mobility Channels”, IEDM 2016, pp. 47 – 50
  [6] Welser, et al., “NMOS and PMOS Transistors Fabricated in Strained Silicon-Relaxed Silicon-Germanium Structures”, IEDM 1992, pp. 1000 – 1002
  [7] Welser, et al., “Strain Dependence of the Performance Enhancement in Strained-Si n-MOSFETs”, IEDM 1994, pp. 373 – 376
  [8] Rim, et al., “Enhanced Hole Mobilities in Surface-channel Strained-Si p-MOSFETs”, IEDM 1995, pp. 517 – 520
  [9] Rim, et al., “Fabrication and Mobility Characteristics of Ultra-thin Strained Si Directly on Insulator (SSDOI) MOSFETs”, IEDM 2003
  [10] IEDM 2016 Short Course, “Technology Options at the 5 Nanometer Node”, session 3, N. Collaert, “Novel channel materials for high-performance and low-power CMOS”, sl. 17
  [11] Guo, et al., “FINFET Technology Featuring High Mobility SiGe Channel for 10nm and Beyond”, VLSI 2016, pp. 14 – 15
  [12] Standaert, et al., “BEOL Process Integration for the 7 nm Technology Node”, IITC/AMC 2016, pp. 2 – 4
  [13] D. Bae, et al., “A novel tensile Si (n) and compressive SiGe (p) dual-channel CMOS FinFET co-integration scheme for 5nm logic applications and beyond”, IEDM 2016, pp. 683 – 686