Publication
VLSI 2022
Conference paper

A 72GS/s, 8-bit DAC-based Wireline Transmitter in 4nm FinFET CMOS for 200+Gb/s Serial Links

Abstract

A DAC-based SST transmitter for wireline applications is reported in a 4nm FinFET technology. 8b resolution and high analog output bandwidth (BW) are achieved by employing a segmented architecture along with a single-ended LSB. Hybrid analog/digital tuning is used in the DAC LSB segments, resulting in well-matched MSB/LSB segments with -0.63/0.67 LSB INL and -0.16/0.43 LSB DNL. 216Gb/s PAM8 and 212Gb/s QAM64 OFDM operation are demonstrated at 288mW from a 0.95V supply.

Authors’ notes

Serializer-deserializer (SerDes) systems enable high-bandwidth communication in modern data centers. Over the past two decades, the per-lane data rate across serial links has doubled about every four to five years, as exemplified by the trend in Ethernet lane rates in Figure 1 [1]. 100G Ethernet is expected to be formally standardized through IEEE 802.3ck in 2022, and interest in 200Gb/s serial links is emerging. However, as lane rates increase, wireline electrical channel impairments become a larger obstacle to reliable transmission and reception of high-speed data. Limited channel bandwidth results in inter-symbol interference (ISI), where energy from a time-domain data pulse interferes with other data pulses, either in adjacent symbols or even several (10 or more) symbol periods away. As a result, modern wireline communication links employ heavy signal processing, commonly referred to as equalization, to remove ISI and enable signal detection with low bit error rates.

Fig 1. Trends of Ethernet Lane Data Rates (from [1])

High-speed wireline transmitters often include feed-forward equalization (FFE), which is a finite impulse response (FIR) filter used to shape the frequency response of transmitted pulses. Figure 2 shows two high-level approaches to transmitter architectures with FFE. An “analog” TX architecture is shown at the top of the figure, where the equalization along with data modulation (for example, 4-level pulse amplitude modulation or PAM4) is implemented using high-speed analog circuitry. As data rates exceed 56Gb/s, this approach has two drawbacks. First, a large number of FFE “taps” are often required to compensate multiple ISI terms. Second, multiple sub-drivers are needed to support multi-level time-domain modulation such as PAM4.

Both of these drawbacks increase the complexity and power consumption of high-speed analog circuitry. DAC-based architectures have become popular choices for high-speed transmitters, as they move this complexity from the high-speed analog circuitry into the digital signal processing element. The DSP provides pre-equalized and modulated data which is serialized up to the TX symbol rate and fed to the input of a DAC-based driver.
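
The pre-equalization performed in the DSP can be pictured as an ordinary FIR filter applied to the symbol stream before it reaches the DAC. The following is a minimal sketch under stated assumptions: the function name, the zero-padding at the sequence edges, and the tap weights are all illustrative and are not taken from this work.

```python
def apply_ffe(symbols, taps, n_pre):
    """Apply an FIR feed-forward equalizer to a list of symbol values.

    taps[n_pre] is the main cursor. Taps before it are pre-cursors (they
    weight symbols that come later in time); taps after it are post-cursors.
    Symbols outside the sequence are treated as zero (an assumption).
    """
    out = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, weight in enumerate(taps):
            idx = n - (k - n_pre)  # pre-cursor taps reach ahead, post-cursors look back
            if 0 <= idx < len(symbols):
                acc += weight * symbols[idx]
        out.append(acc)
    return out


# Hypothetical 3-tap FFE (one pre-cursor, main cursor, one post-cursor)
# applied to a short PAM4 sequence drawn from the levels {-3, -1, 1, 3}.
example_taps = [-0.1, 0.7, -0.2]
example_out = apply_ffe([3, 3, -3, -3, 1], example_taps, n_pre=1)
```

Feeding a single isolated symbol through `apply_ffe` returns the tap weights themselves, which is a quick way to confirm the cursor ordering.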

As link margins are studied for future 200Gb/s serial links, other more-advanced modulation formats could become candidates to increase the data rate without a commensurate rise in the link symbol rate. Moving to a DAC transmitter architecture also gives the flexibility to support multiple data modulation formats, both in the time domain (such as 8-level PAM), or perhaps even in the frequency domain using orthogonal frequency division multiplexed (OFDM) sub-carriers. However, these modulation schemes require higher DAC linearity than simpler time-domain modulation approaches like non-return-to-zero (NRZ) or PAM4, which can be difficult to achieve in nanoscale FinFET CMOS technologies while simultaneously achieving high DAC sampling rate.

The IBM Research and Samsung approach

In this work, IBM Research and Samsung Electronics partnered to architect and design a high-speed 72GS/s DAC-based transmitter with 8b resolution in a 4nm FinFET CMOS technology to support future 200+Gb/s serial links.

Fig 2. High-Level Wireline Transmitter Architectures and Candidate Modulation Formats. (Top) Analog TX architecture. (Bottom) DAC + DSP TX architecture. (Right) PAM4, PAM8, and OFDM data modulation formats.

The TX test chip architecture is depicted in the block diagram of Figure 3. The test chip receives quarter-rate input clocks (i.e., clocks with a frequency equal to one-quarter of the DAC output symbol rate) and internally generates all lower-frequency clocks required for data serialization. Transmit data patterns are loaded into a 32kB SRAM-based pattern generator. The parallel 64th-rate 8b data (i.e., 1/64th of the transmitter symbol rate) is serialized to 8th rate, and the upper 2b of the DAC data are encoded to 3b thermometer coding in order to reduce DAC glitching and improve static differential nonlinearity (DNL).
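
As an illustration of the encoding step described above, the sketch below converts the upper 2b of an 8b code into a 3b thermometer code and leaves the lower 6b in binary. The function name and the ordering of the thermometer bits are assumptions made for this example; the text does not specify them.

```python
def thermometer_encode_msbs(code):
    """Split an 8b DAC code into 3 thermometer bits and 6 binary LSBs.

    The upper 2 bits (value 0..3) become a 3-bit thermometer code in which
    the number of ones equals that value. Bit ordering is an assumption.
    """
    if not 0 <= code <= 255:
        raise ValueError("code must fit in 8 bits")
    upper = code >> 6      # upper 2 bits
    lower = code & 0x3F    # lower 6 bits stay binary
    thermo = [1 if upper > i else 0 for i in range(3)]
    return thermo, lower
```

With thermometer coding, a one-step increase in the upper 2b only ever turns on one additional equally weighted group, rather than swapping a large binary-weighted group against smaller ones, which is why it helps with glitching and DNL.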

The 8th-rate data is then fed to the DAC macro, whose details are shown in Figure 4. This macro includes further 8:4 serialization of the data, followed by a block that staggers the quarter-rate data by 1 unit interval (UI, equivalent to the symbol period) before delivering the data to the DAC segments. As seen in Figure 5, each DAC segment consists of a 4:1 multiplexer (MUX), a pre-driver, and a series-source terminated (SST) driver. Architectures of the 4:1 MUX are discussed in a previous IBM publication [2].

Fig 3. Wireline Transmitter Test Chip Architecture with 72GS/s 8b DAC.
Fig 4. 8b DAC Architecture with SST Driver Segments.
Fig 5. DAC segments.

A common approach to the design of a high linearity 8b DAC is to use 255 equally weighted segments to produce the 256 DAC analog output levels. Such a large number of segments has a severe drawback for high-speed operation, as it would present heavy parasitic capacitive loading to the DAC output and degrade the output analog bandwidth.

Our approach reduces the number of DAC segments required to achieve 8b resolution through two techniques. First, two SST driver segments (referred to as A and B) are employed, where the A driver segment has 8x the drive strength of a B segment. Second, a single-ended DAC least significant bit (LSB) is used. This adds an extra bit of resolution to the differential DAC without a commensurate doubling of the number of DAC segments, as the single-ended LSB segment contributes half of the differential voltage of its differential counterpart employed in the next highest weighted DAC bit. In total, our approach achieves 8b resolution (256 analog output levels) using only 15 A segments and eight B segments.
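
The segment arithmetic can be checked with a short sketch. The text gives the segment counts and the 8:1 strength ratio but not the bit-to-segment mapping, so the allocation below is an assumption: it is simply one mapping that reproduces both counts, with the single-ended B segment taken as 1 LSB, a differential B segment as 2 LSB, and an A segment as 16 LSB.

```python
# Segment strengths in LSB units (assumed normalization).
STRENGTH = {"A": 16, "B": 2, "B_single_ended": 1}

# (segment type, number of segments) per control bit, MSB side first.
# This mapping is hypothetical; it is not stated in the source.
GROUPS = [
    ("A", 4), ("A", 4), ("A", 4),   # three thermometer bits from the upper 2b
    ("A", 2), ("A", 1),             # binary bits 5 and 4
    ("B", 4), ("B", 2), ("B", 1),   # binary bits 3, 2 and 1
    ("B_single_ended", 1),          # bit 0, the single-ended LSB
]
WEIGHTS = [STRENGTH[kind] * count for kind, count in GROUPS]


def segment_counts():
    """Return the total number of (A, B) segments in the assumed allocation."""
    a = sum(count for kind, count in GROUPS if kind == "A")
    b = sum(count for kind, count in GROUPS if kind.startswith("B"))
    return a, b


def dac_level(code):
    """Ideal output level, in LSB, produced by the assumed allocation."""
    upper, lower = code >> 6, code & 0x3F
    controls = [1 if upper > i else 0 for i in range(3)]
    controls += [(lower >> bit) & 1 for bit in (5, 4, 3, 2, 1, 0)]
    return sum(on * weight for on, weight in zip(controls, WEIGHTS))
```

Under this assumed mapping the totals come to 15 A segments and eight B segments, and every 8b code maps to a distinct level from 0 to 255, compared with the 255 unit segments of a fully thermometer-coded design.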

For such an approach to work, the relative strength of the B segments (associated with DAC LSBs) must be calibrated against the A segments (associated with DAC MSBs) to achieve the desired 8:1 drive strength ratio in the presence of random device mismatch. The schematics of the A and B SST driver segments are seen in Figure 6. Coarse adjustment of the segment strengths is controlled through digital tuning in the header and footer devices.

However, a separate fine analog adjustment is included for precise control of the strength of segment B. This technique overcomes limitations of pure digital tuning, the accuracy of which is hindered by minimum transistor device sizes. Again, referring to Figure 5, note that the MSB segments are driven by 4:1 MUX and pre-driver circuitry that has twice the drive strength of those in the LSB segments. Since the SST drivers have an 8:1 strength ratio, additional dummy loading was added to the output of the Segment B pre-driver to maintain a 2:1 load ratio between Segments A and B and avoid systematic skew.

Fig 6. SST Driver Schematics. (Left) Segment A. (Right) Segment B.

Experimental Results

The transmitter test chip was fabricated in Samsung 4LPP 4nm FinFET CMOS technology. The test chip die photo along with layout details are seen in Figure 7. The chip was flip-chip mounted to a GL102F 4-2-4 substrate, which was subsequently mounted to a Nelco4000-13 test card. Transmitter outputs were taken through connectors attached to the package and measured using a high-speed oscilloscope. The DAC achieves a differential nonlinearity (DNL) of between -0.16 and +0.43 LSBs, and an integral nonlinearity (INL) of between -0.63 and +0.67 LSBs.

DNL measures the uniformity of the DAC output voltage step size as a function of the input code. INL measures the cumulative output error as a function of the DAC input code and is important for multi-level PAM as it more directly relates to the uniformity of the PAM levels. For a 101-MHz output sinewave, the DAC achieves a signal-to-noise-and-distortion ratio (SNDR) of 44.5dB and a spurious free dynamic range (SFDR) of 57.5dB.

These results confirm the effectiveness of our approach to tuning the relative strengths of SST segments A and B via hybrid digital and analog controls, thereby enabling 8b DAC resolution with a low number of DAC elements.
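
For readers who want to relate the DNL and INL figures to raw measurements, the sketch below computes both from a list of measured output levels, one per input code. It assumes an endpoint fit, where 1 LSB is the average step between the first and last level; the source does not state which fit was used, and the function name is illustrative.

```python
def dnl_inl(levels):
    """Return (dnl, inl) in LSB for measured DAC levels, one per input code.

    Assumes an endpoint fit: 1 LSB is the average step between the first
    and last measured level.
    """
    n = len(levels)
    if n < 2:
        raise ValueError("need at least two levels")
    lsb = (levels[-1] - levels[0]) / (n - 1)
    # DNL: deviation of each step from the ideal 1 LSB step.
    dnl = [(levels[i + 1] - levels[i]) / lsb - 1.0 for i in range(n - 1)]
    # INL: deviation of each level from the ideal straight line.
    inl = [(levels[i] - levels[0]) / lsb - i for i in range(n)]
    return dnl, inl
```

For the toy input `[0.0, 1.0, 2.5, 3.0]`, the third level sits half an LSB too high, so the step into it is 0.5 LSB too large and the step out of it is 0.5 LSB too small.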

Fig 7. Transmitter test chip die photo (left) and layout (right).

Time-domain eye diagram measurements are shown in Figure 8 for the DAC operating at 72GS/s. These include 72Gb/s NRZ, 144Gb/s PAM4, 180Gb/s PAM6, and 216Gb/s PAM8. In all cases, an 8-tap FFE (three pre-cursors and four post-cursors) was applied to the transmit waveform to compensate approximately 9dB of loss at the Nyquist rate through the package, cables, and oscilloscope remote sampling heads. For PAM4, PAM6, and PAM8, the ratio of level mismatch (RLM) is 98.8%, 98.1%, and 97.8%, again demonstrating excellent DAC linearity.
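
The quoted line rates follow directly from the 72GBaud symbol rate and the number of bits carried per symbol. A minimal sketch of the arithmetic is given below; the 2.5 bits per symbol for PAM6 is inferred from the reported 180Gb/s (a common mapping encodes 5 bits across two PAM6 symbols, though the text does not say which mapping was used), and the names are illustrative.

```python
SYMBOL_RATE_GBAUD = 72  # DAC operating at 72GS/s, one sample per symbol

# Bits per symbol for each format. The PAM6 value is inferred from the
# reported 180Gb/s line rate, not stated explicitly in the text.
BITS_PER_SYMBOL = {"NRZ": 1, "PAM4": 2, "PAM6": 2.5, "PAM8": 3}


def line_rate_gbps(modulation):
    """Line rate in Gb/s for a given modulation at the 72GBaud symbol rate."""
    return SYMBOL_RATE_GBAUD * BITS_PER_SYMBOL[modulation]
```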

Fig 8. 72GBaud TX output eye diagrams for various time-domain modulation formats.

In a final experiment, QAM64 data was loaded onto 255 OFDM subcarriers from 140MHz to 35.9GHz, for an aggregate line rate of 212Gb/s. The OFDM frames employ block coded modulation with an inner binary product code and an outer Reed-Solomon KR4 code, yielding a payload data rate of 184Gb/s. Transmitter output symbols were captured at 32 points per symbol using the sub-sampling oscilloscope, and statistical analysis was performed to decode the data and estimate OFDM SNDR as a function of subchannel.
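
The subcarrier range and rates quoted above can be sanity-checked with a few lines of arithmetic. The sketch assumes a 512-point transform at the 72GS/s sampling rate, which is inferred from the roughly 140MHz subcarrier spacing and is not stated in the text; all names are illustrative.

```python
SAMPLE_RATE_HZ = 72e9   # DAC sampling rate from the text
FFT_SIZE = 512          # assumption: inferred from the ~140MHz spacing
N_SUBCARRIERS = 255     # data-bearing subcarriers from the text
BITS_PER_QAM64 = 6      # log2(64)


def subcarrier_frequency_hz(k):
    """Center frequency of subcarrier k (k = 1..255) for the assumed FFT size."""
    return k * SAMPLE_RATE_HZ / FFT_SIZE


def raw_rate_without_overhead_bps():
    """Bit rate if OFDM symbols were sent back to back with no prefix or framing."""
    ofdm_symbol_rate = SAMPLE_RATE_HZ / FFT_SIZE
    return N_SUBCARRIERS * BITS_PER_QAM64 * ofdm_symbol_rate


def overall_code_rate(payload_gbps=184.0, line_gbps=212.0):
    """Payload-to-line-rate ratio implied by the reported figures."""
    return payload_gbps / line_gbps
```

Under these assumptions the subcarriers span about 140.6MHz to 35.86GHz, matching the quoted range, and the back-to-back rate is about 215.2Gb/s. The gap to the reported 212Gb/s would be consistent with a small per-symbol overhead such as a cyclic prefix, which the text does not detail. The reported payload and line rates imply an overall code rate of roughly 0.87.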

The resulting signal power and SNDR are shown on the left side of Figure 9, along with the estimated QAM64 constellation on the right side of the figure. A raw bit error rate of 0.43E-3 was achieved, along with an average subchannel SNDR of 24.3dB. This SNDR is 1.8dB better than that achieved for PAM8 data. The performance is sufficient to decode the block code modulated data with no bit errors.

Fig 9. OFDM experimental data. (Left) Signal power and SNDR vs OFDM subchannel. (Right) Estimated QAM64 constellation.

Significance

The 72GS/s DAC-based transmitter described here is the first to achieve a sampling rate of greater than 56GS/s using a voltage-mode SST-style driver. While higher speeds have been achieved using current mode logic (CML) based drivers, the SST topology offers advantages for SerDes ESD protection, since the passive resistor at the driver output attenuates CDM ESD pulses and so improves protection of the driver transistors. Our use of SST segments A and B to lower the required number of driver segments while still achieving 8b resolution is a key architectural innovation enabling the speed achieved in this work.

This work shows promise for the use of frequency domain modulation approaches such as OFDM in high data rate wireline communication links. Transmitter and receiver architectures to enable OFDM links require lower sampling rate ADCs and DACs than architectures for PAM4, suggesting lower power analog circuitry in the SerDes. Furthermore, the data has demonstrated that OFDM can achieve a higher SNR than PAM8 at the same line rate. The 4nm CMOS DAC described in this work is the first to demonstrate sufficient speed and linearity to support frequency domain modulation.

[1] Beyond 400 Gb/s Ethernet. IEEE 802.3 NEA Ad hoc. 29 Oct 2020.

[2] M. Kossel, V. Khatri, M. Braendli, P. A. Francese, T. Morf, S. Yonar, M. Prathapan, E. Lukes, R. Richetta, and C. Cox, “An 8b DAC-based SST TX using metal gate resistors with 1.4pJ/b efficiency at 112Gb/s PAM4 and 8-taps FFE in 7nm CMOS,” IEEE International Solid-State Circuits Conference, pp. 130-131, February 2021.

Date

12 Jun 2022
