

### Ph.D. Thesis

# MULTI LOOK-UP TABLE DIGITAL PREDISTORTION FOR RF POWER AMPLIFIER LINEARIZATION

Author: Pere Lluis Gilabert Pinal

Advisors: Dr. Eduard Bertran Albertí and Dr. Gabriel Montoro López

Control Monitoring and Communications Group, Department of Signal Theory and Communications, Universitat Politècnica de Catalunya

Barcelona, December 2007

# Chapter 6

# Experimental Results

## 6.1 Introduction

Digital predistortion (DPD) linearization with compensation of PA memory effects has been the subject of multiple publications [Mar03, Kim06, Din04, Cho05, Liu06, He06, Zhu02, Kim05], which demonstrate the effectiveness of a variety of approaches to counteract both the memory effects and the nonlinear behavior of the RF PA. However, little attention has been paid to practical implementations of such systems [Ces07a, Ces07b, Myo04, Hel06, SR05, BN05, Fra05]. Laboratory demonstrators built around Vector Signal Generators/Analyzers with delayed off-line data processing leave several DPD prototyping questions unanswered, such as:

- Implementation: suitable real-time architectures, practical implementations, DPD complexity dependence vs. the memory effects time span.
- Efficiency: power consumption of the DPD itself and its impact on the transmitter efficiency.
- DPD adaptation: Memory effects dependence on the specific signal, and DPD ability to maintain linearity performances through signal changes in multicarrier and variable bandwidth systems.

This Chapter presents the hardware implementation of the predictive DPD which, as described in the previous Chapter, can be derived from the NARMA PA behavioral model and then mapped into a set of scalable LUTs. The objective is to validate the models and configurations presented for the NARMA-based DPD, taking external adaptation (DSP or host PC) as the selected configuration to update the multi-LUT structure in the FPGA. The FPGA implementation of a real-time adaptive configuration is outside the scope of this thesis and remains as future work.

To corroborate the theory, different scenarios using test signals with flexible transmission bandwidths have been considered to validate the reliability of the proposed DPD. Experimental results on the PA efficiency and linearity performance achieved by this DPD are provided throughout this Chapter. In addition, FPGA implementation issues that are overlooked in Matlab simulations, such as adaptation policies, power consumption or the possibility of enhancing the overall efficiency by degrading the PA operation mode, are also studied. In short, this Chapter aims to offer an insight into actual prototyping scenarios for DPD systems. It is organized as follows:

- First, the experimental setup and the procedures deployed to validate the proposed multi-LUT NARMA-based DPD are described.
- Then, experimental results of the proposed NARMA DPD are provided. This section also discusses underlying practical topics such as the DPD power consumption, adaptation stability and reliability.
- Finally, the contents of the preceding section are extended by focusing on the impact on system performance of degrading the PA operating point to improve the overall efficiency. The implications of using DPD to counteract this more efficient, but less linear, degraded PA behaviour are discussed.

## 6.2 Experimental Setup

#### 6.2.1 Base Band Setup

Although today's digital signal processors offer high performance (DSP clocks up to 1 GHz) to handle current transmission bandwidths (fast envelope signals), in practice a single DSP device offers limited parallelism for the implementation of the DPD and its adaptation procedures. FPGAs, on the other hand, are essentially custom programmable hardware, so it is possible to trade off area against speed to meet requirements. Thanks to the parallel processing capabilities of the FPGA, it is therefore feasible to implement both the DPD and the real-time adaptation processes in a single device. As an example, an FPGA may only need to run at 250 MHz to perform four parallel computations for a signal processing function, whereas the same function in a DSP may require clock speeds of up to 1 GHz.

Alternatively, it is reasonable to consider a mixed DSP-FPGA architecture as in [BN05] where, to allow a high data throughput, the FPGA is in charge of the real-time DPD processing, while the DSP performs more complex, less time-constrained functions, such as the adaptation process for DPD LUT/parameter extraction. To enhance flexibility during the prototyping procedures, the DSP device has been replaced by an external host PC in which Matlab is in charge of the adaptation, as schematically depicted in Fig. 6.1.

Figure 6.1: Experimental DPD platform under study.

The FPGA is a Xilinx Virtex-IV XC4VSX35, with the developed DPD core in charge of predistortion running at 105 MHz. An overview of the Virtex-IV family specifications can be found in [vir07]. The linearization process itself is open-loop controlled and works independently of the adaptation process. A feedback loop from the PA output towards the FPGA, through the demodulator and A/D converters, is also included to capture the data needed by the adaptation process. In the proposed implementation, the FPGA provides the external host with buffers of predistorted and PA output data, of 2048 I/Q samples each, from which the NARMA model is derived. The complex gains $G_{LUT}$ are then computed and loaded into the FPGA in the LUT form required by the BPCs $(LUT\_\hat{f}_{0}^{-1}, LUT\_\hat{f}_{i}, LUT\_\hat{g}_{j})$. The D/A and A/D converters handle 14-bit data, at 105 MSPS as well, covering a bandwidth of 52.5 MHz at baseband. In general, the sampling rate of the complex baseband signal has to be high enough to contain all products up to $K^{th}$ order without aliasing [Mor06]: $KB/f_{s} \leq 1$, where $B$ is the bandwidth and $f_{s}$ is the sampling rate. The maximum allowed signal bandwidth for IMD3 coverage is thus 35 MHz, whereas for full IMD5 coverage it is 11.6 MHz. Fig. 6.2 shows a picture composition of the implemented baseband setup.
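The aliasing criterion $KB/f_s \leq 1$ can be evaluated directly. The following minimal sketch does so for the 105 MSPS converter rate quoted above; the function name is hypothetical and only the inequality as written is implemented.

```python
def max_alias_free_bandwidth(fs_hz: float, order: int) -> float:
    """Largest bandwidth B that satisfies K * B / fs <= 1."""
    if order < 1:
        raise ValueError("order must be a positive integer")
    return fs_hz / order


FS_HZ = 105e6  # converter sampling rate quoted in the text (105 MSPS)

for k in (3, 5):
    b_max = max_alias_free_bandwidth(FS_HZ, k)
    print(f"K = {k}: B <= {b_max / 1e6:.1f} MHz")
```

For $K = 3$ the inequality reproduces the 35 MHz figure. For $K = 5$ it yields $f_s/5 = 21$ MHz, so the 11.6 MHz quoted in the text, which coincides with $f_s/9$, presumably includes an additional margin that the inequality alone does not capture; this reading is an assumption.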



Figure 6.2: Baseband setup.

#### 6.2.2 RF Setup

Tests have been performed with several kinds of modulated signals of different bandwidths in order to verify, on the one hand, the dependence of the DPD function on the specific signal and, on the other hand, its reliability against possible changes of the RF input signal. Typically, the considered signals have bandwidths in the range 5-20 MHz and PAPRs of 5-10 dB, aiming to mimic the statistical properties of different representative scenarios, i.e., one- and two-carrier WCDMA, single-carrier DVB-T and WiMAX. In all cases, random filtered baseband data is generated in the host PC and fed into the FPGA, where the real-time DPD function takes place before Digital-to-Analog (D/A) conversion, up-conversion, and transmission. The RF chain under study in this work uses as final stage a 170 W peak power PA based on the Freescale MRF7S21170H MOSFET transistor (see Fig. 6.3). A medium power PA based on the Freescale MRF21010 transistor (10 W peak power) and two highly linear amplifiers from Mini-Circuits (ZRL-2300) precede the main output amplifier, acting as driver amplifiers. Before the insertion of the PAs in the transmitter chain, a set of measurements and a calibration procedure to eliminate DC offsets were performed to ensure that no significant degradation was added by components in the closed-loop configuration. The whole experimental setup, including the baseband processing part and the RF chain, is depicted in Fig. 6.4.



Figure 6.3: RF amplification chain.



Figure 6.4: Experimental test-bench.



Figure 6.5: Detailed structure of a basic predistortion cell (BPC).

#### 6.2.3 Implemented LUT Configuration

As stated in the previous Chapter, the Basic Predistortion Cells (BPCs) are the fundamental building blocks used to implement the predictive NARMA-based DPD in the FPGA. A BPC requires simple hardware blocks: a complex multiplier, a dual-port RAM acting as LUT and an address calculator, as schematically depicted in Fig. 6.5. The selected LUT configuration uses uniform (amplitude) spacing with $2^9 = 512$ bins, i.e., 9 bits for the addressing. It provides good enough results in comparison to the Cavers optimal companding function with reduced complexity, but a square root is still necessary to compute the address. This operation can take several clock cycles to execute in an FPGA, adding undesired latency. This may not be of major concern in non-recursive DPD structures, because sub-block latencies can be compensated in the parallel data paths by explicit delays, and they translate directly into a system input-to-output delay. However, in the proposed recursive DPD implementation, address computation latencies act as a bottleneck, limiting the minimum delay order of the recursive part of the NARMA-based DPD.

Although it is possible to use an intermediate LUT, addressed through a power companding function, to store the appropriate amplitude-dependent addresses (see Fig. 6.5), LUT accesses also add latency. For that reason, the addressing in the proposed implementation is simply based on the power of the complex input signal. To properly fill the LUTs in this power-addressing case, the complex gains, originally defined to be accessed on the basis of the amplitude, have to be redefined as a function of the power. That is, a new set of coefficients $G''_{LUT}$ has to be obtained from $G_{LUT}$. Supposing that each LUT has $L$ entries, the $r$-th entry of the corresponding $\{f,g\}\_i$ LUT is obtained as

$$G''_{LUT\_\{f,g\}\_i}(r) = G_{LUT\_\{f,g\}\_i}(|z|)\Big|_{|z|=\sqrt{r}} \tag{6.1}$$

with $|z|$ being the amplitude of the signal envelope and $r = 1, 2, \ldots, L$.
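The remapping in (6.1) and the resulting square-root-free BPC operation can be sketched as follows. This is only an illustration under stated assumptions: amplitudes are normalized to [0, 1], so that power bin $r$ corresponds to the normalized power $r/L$ (the scale factor is omitted in (6.1)), the nearest amplitude bin is selected without interpolation, and the function names are hypothetical.

```python
import math


def power_addressed_lut(g_amp):
    """Re-index an amplitude-addressed LUT so that it is addressed by power.

    g_amp[k - 1] holds the complex gain for normalized amplitude k / L.
    The returned list holds, at index r - 1, the gain for power r / L.
    """
    L = len(g_amp)
    g_pow = []
    for r in range(1, L + 1):
        amp = math.sqrt(r / L)                 # |z| corresponding to power bin r
        k = min(L, max(1, round(amp * L)))     # nearest amplitude bin (1-based)
        g_pow.append(g_amp[k - 1])
    return g_pow


def bpc(sample, g_pow):
    """Basic predistortion cell: power address, LUT read, complex multiply."""
    L = len(g_pow)
    power = sample.real ** 2 + sample.imag ** 2    # no square root required
    r = min(L, max(1, math.ceil(power * L)))       # 1-based power address
    return sample * g_pow[r - 1]


# Toy amplitude LUT whose gain equals the bin amplitude (illustrative only).
L_BINS = 512
g_amp = [complex((k + 1) / L_BINS, 0.0) for k in range(L_BINS)]
g_pow = power_addressed_lut(g_amp)
print(bpc(complex(0.5, 0.0), g_pow))
```

The remapping is computed once, off-line, when the LUTs are filled, so the real-time path only needs the squared magnitude of the sample to form the address.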

The implemented multi-LUT (set of BPCs) architecture has the advantage of being easily scalable and reconfigurable. It makes it possible to trade off relevant implementation issues such as linearity (adding more BPCs to the DPD), power consumption (reducing the number of BPCs) and signal bandwidth (operating with a slower FPGA internal clock), depending on the operation mode and on the degree of impairment introduced by the transmitter chain in each case.

#### 6.2.4 Assessment Metrics and Definitions

In the experiments, the transmitter performance with and without DPD is continuously compared. When DPD is performed, we distinguish between memoryless DPD, when just one BPC is active, and memory compensation DPD, when several BPCs are active. In the latter case, we further specify whether non-recursive BPCs (BPC-FIR) or recursive BPCs (BPC-IIR) are used. Specifically, non-recursive BPCs are denoted as "d" BPC-FIR, with "d" being the number of non-recursive LUTs used (ranging from 1 to D), and recursive BPCs are denoted as "n" BPC-IIR, with "n" being the number of recursive BPCs used (ranging from 1 to N). In the following, additional metrics and the criteria used in the experiments are described.

The main metric used to check the fidelity of the transmitted signal in the time domain is the EVM, as defined in (6.2), in which the unmodulated (unfiltered) raw error between baseband waveforms is computed taking into account all the available data within the 2048-sample I/Q data buffers.

$$EVM_{raw}(\%) = \sqrt{\frac{\frac{1}{L} \sum_{k=0}^{L-1} \left[ \left(x_T^I(k) - y_A^I(k)\right)^2 + \left(x_T^Q(k) - y_A^Q(k)\right)^2 \right]}{x_{T,\text{max}}^2}} \cdot 100 \tag{6.2}$$

where $x_T^I(k)$ and $x_T^Q(k)$ are the in-phase and quadrature components of the reference baseband signal to transmit, and $y_A^I(k)$ and $y_A^Q(k)$ are the in-phase and quadrature components of the baseband PA output after downconversion. When DPD is not active, $y_{A,i}$ is replaced by its most suitable linear transformation $y'_{A,i}$,

$$y'_{A,i} = \kappa \ y_{A,i} e^{j\phi} \tag{6.3}$$

which pre-compensates the gain mismatches and phase offsets associated with closed-loop misalignments and thus minimizes the numerator in (6.2). When DPD is active, $y_{A,i}$ is expected to converge to $x_T$, and no further prearrangement is necessary.
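As a minimal sketch of how (6.2) and (6.3) can be evaluated on a captured buffer, the code below computes the raw EVM and, optionally, the complex scalar $\kappa e^{j\phi}$ that minimizes the squared error in the least-squares sense. Two points are assumptions rather than statements from the text: $x_{T,\text{max}}$ is interpreted as the peak magnitude of the reference buffer, and the closed-form least-squares fit is one possible way of obtaining $\kappa$ and $\phi$. The function names are hypothetical.

```python
import math


def best_linear_fit(x_ref, y_out):
    """Complex scalar c = kappa * exp(j*phi) minimizing sum |x - c*y|^2."""
    num = sum(x * y.conjugate() for x, y in zip(x_ref, y_out))
    den = sum(abs(y) ** 2 for y in y_out)
    return num / den


def evm_raw_percent(x_ref, y_out, align=False):
    """Raw EVM as in (6.2); align=True applies the transformation (6.3) first."""
    if align:
        c = best_linear_fit(x_ref, y_out)
        y_out = [c * y for y in y_out]
    n = len(x_ref)
    mean_sq_err = sum(abs(x - y) ** 2 for x, y in zip(x_ref, y_out)) / n
    x_max = max(abs(x) for x in x_ref)     # assumed meaning of x_{T,max}
    return 100.0 * math.sqrt(mean_sq_err) / x_max


# Toy buffer: the "PA output" is the reference scaled by 0.5 and rotated by 90 degrees.
x_t = [1 + 0j, 0 + 1j, -1 + 0j, 0 - 1j]
y_a = [0.5j * x for x in x_t]

print(evm_raw_percent(x_t, y_a))              # large: gain and phase mismatch
print(evm_raw_percent(x_t, y_a, align=True))  # close to zero after alignment
```

In this toy case the whole error is a pure gain and phase mismatch, so the aligned EVM vanishes; with a real PA capture, the residual after alignment is what reflects nonlinear distortion, memory effects and noise.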

In the frequency domain, signal fidelity degradation is observed as spectral regrowth on both sides of the RF carrier signal. Where applicable, the single-carrier 3GPP WCDMA forward link ACPR conformance test [3GP99] has been used, whereas in the remaining scenarios under test, direct spectrum inspection provided a measure of spectral regrowth as a framework for comparison.

To fairly assess the benefits of DPD, the PA output power has to be the same in all the scenarios under comparison. In the following measurements, a power meter ensures that comparisons are established between signals of equal mean power. Furthermore, the power measurement, together with the DC power consumption, which is directly obtained from the measurement of the supply current, provides a reliable means to compute the PA drain efficiency. To provide an insight into the contribution of DPD to the overall efficiency, the DPD power consumption has been considered as well. However, PA bias voltages and currents are not taken into account in these efficiency computations.
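The efficiency bookkeeping described above can be sketched as follows. The drain efficiency is taken as the ratio of mean RF output power to DC supply power, and the overall figure simply adds the DPD consumption to the denominator. The supply voltage, supply current and DPD power used in the example are placeholders chosen for illustration, not measured values from this work, and the function names are hypothetical.

```python
def drain_efficiency(p_out_w, v_dd, i_dd):
    """Mean RF output power divided by the DC power drawn from the drain supply."""
    return p_out_w / (v_dd * i_dd)


def overall_efficiency(p_out_w, v_dd, i_dd, p_dpd_w):
    """Same ratio, with the DPD power consumption added to the DC power."""
    return p_out_w / (v_dd * i_dd + p_dpd_w)


# Placeholder operating point (hypothetical values, for illustration only).
P_OUT_W = 12.0   # mean output power
V_DD = 28.0      # drain supply voltage
I_DD = 2.0       # measured supply current
P_DPD_W = 1.0    # DPD power consumption

print(f"drain efficiency:   {100 * drain_efficiency(P_OUT_W, V_DD, I_DD):.1f} %")
print(f"overall efficiency: {100 * overall_efficiency(P_OUT_W, V_DD, I_DD, P_DPD_W):.1f} %")
```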

## 6.3 Experimental Results

This section assesses, through experimental verification, the performance of the described predictive NARMA-based DPD and of the implemented FPGA architecture, on the basis of the experimental setup and procedures stated above.

#### 6.3.1 General Testing

A first set of measurements was performed without focusing on a particular transmission standard, with the aim of evaluating the main unwanted effects of the PA and different DPD configurations. Fig. 6.6 shows the transmitted spectra of a 20 MHz bandwidth signal with 10 dB of PAPR and a mean output power of 12 W, for the following cases: without any DPD, with memoryless DPD (1 BPC) and with memory compensation (1 memoryless BPC + 2 BPC-FIR). The benefits of using DPD are shown in terms of out-of-band distortion reduction. In the time domain, the AM-AM characteristic provides additional information on the DPD operation, as shown in Fig. 6.7. Similarly to the simulation results presented in Chapter 5, it reveals a linearized AM-AM characteristic when DPD is applied and, moreover, a reduced dispersion when memory effects are compensated using 3 BPCs. This dispersion compensation in the AM-AM characteristic translates directly into the EVM metric, as shown in Fig. 6.8, where a significant EVM reduction is achieved for a 16-QAM modulated signal.

Specifically, the amplified signal constellation in Fig. 6.8 presents an EVM of 12 %, which is slightly reduced when applying memoryless DPD compensation (EVM = 8 %), and halved again when the DPD takes memory effects compensation into account, thus achieving an EVM of 4 %. As expected, the EVM values achieved in simulations are not easily reachable in a real experimental scenario, since additive noise, misalignments and possible latencies in both the forward and feedback loops introduce additional distortion that is not taken into account in a more simplified simulation scenario.

**Figure 6.6:** Power spectra of a 20 MHz bandwidth signal with 10 dB PAPR and 12 W mean power for: (i) PA without DPD, (ii) Memoryless DPD (1 BPC), (iii) Dynamic DPD with 3 BPCs.

The unlinearized (no DPD) AM-AM characteristic in Fig. 6.7 exhibits higher gain than the DPD linearized characteristic, although the peak amplitude levels with and without DPD meet at the PA saturation point. Linear amplification with DPD can only be achieved up to saturation, since no further correction is possible beyond that compression point. Therefore, the maximum available linear gain  $(G_{lin})$  for the DPD + PA chain, has been experimentally tuned to be the ratio between the maximum PA output power and the corresponding PA input power level,

$$G_{lin}(dB) = P_{out,RF,sat}(dB) - P_{in,RF,sat}(dB) \tag{6.4}$$

This reasoning is illustrated in Fig. 6.9: although the overall gain is reduced with respect to the nominal PA gain $G_{PA}$ ($G_{lin} < G_{PA}$), DPD allows linear amplification up to the PA saturation point, while the mean output power is maintained because the histogram of the PA input signal is reshaped. Following this criterion, to perform fair comparisons between signals, ensuring that the mean output power is the same with and without DPD, the following Input Back-Off (IBO) has to be applied to the unlinearized signal:

$$IBO(dB) = G_{PA}(dB) - G_{lin}(dB) \tag{6.5}$$

This criterion has been respected in all the results shown in this Chapter (except, for illustration purposes, in Fig. 6.7). It prevents the actual DPD linearization performance from being exaggerated by comparing against an unlinearized PA operated with less back-off.
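The gain and back-off definitions in (6.4) and (6.5) reduce to simple differences in logarithmic units, as the following minimal sketch shows. The saturation powers and the nominal PA gain used in the example are hypothetical placeholders, not measured values of the PA under test, and the function names are likewise illustrative.

```python
def linear_gain_db(p_out_sat_dbm, p_in_sat_dbm):
    """Maximum linear gain of the DPD + PA chain, as in (6.4)."""
    return p_out_sat_dbm - p_in_sat_dbm


def input_back_off_db(g_pa_db, g_lin_db):
    """Back-off applied to the unlinearized signal, as in (6.5)."""
    if g_lin_db > g_pa_db:
        raise ValueError("G_lin is expected not to exceed the nominal PA gain")
    return g_pa_db - g_lin_db


# Hypothetical figures, for illustration only.
P_OUT_SAT_DBM = 52.0   # PA output power at saturation
P_IN_SAT_DBM = 40.0    # PA input power that drives it to saturation
G_PA_DB = 15.0         # nominal (small-signal) PA gain

g_lin = linear_gain_db(P_OUT_SAT_DBM, P_IN_SAT_DBM)
ibo = input_back_off_db(G_PA_DB, g_lin)
print(f"G_lin = {g_lin:.1f} dB, IBO = {ibo:.1f} dB")
```

With these placeholder numbers, the unlinearized signal would be backed off by 3 dB at the PA input so that both cases deliver the same mean output power.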



Figure 6.7: AM-AM characteristics for: PA without DPD, memoryless DPD (1 BPC) and dynamic DPD with 3 BPCs.



**Figure 6.8:** Memory effects manifestation in the 16-QAM constellation for: (i) PA without DPD, (ii) Memoryless DPD-1 BPC-, (iii) Dynamic DPD with 3 BPCs.



**Figure 6.9:** Effects of choosing a proper linear gain  $(G_{lin})$  for the DPD.

It has been shown that memoryless DPD fails to deliver appropriate levels of signal fidelity at the transmitter antenna, because it is unable to properly compensate PA linear distortion. This has been evidenced mainly in terms of EVM, but also in terms of out-of-band distortion. Linearization performance can indeed be improved by including means to compensate memory effects, that is, by including additional BPCs in our NARMA-based DPD. Therefore, assuming that memoryless DPD is insufficient, our goal is to compare the linearization performance achieved with recursive and non-recursive NARMA DPD arrangements. The following three configurations are now compared: 2 BPC-FIR, 3 BPC-FIR, and 2 BPC-FIR + 1 BPC-IIR.

Although all the considered configurations yield similar EVM figures (around 4 %), slight differences can be appreciated in the ACPR improvement, as depicted in Fig. 6.10. It shows the linearized power spectra of a 10 MHz filtered noisy signal, with a high PAPR aimed at statistically emulating a two-carrier WCDMA scenario, for the aforementioned NARMA configurations. As can be observed in Fig. 6.10, the best ACPR is obtained by taking advantage of the recursive operation of the NARMA DPD (2 BPC-FIR + 1 BPC-IIR).

Figure 6.10: Linearized output spectra of a wideband noisy signal considering: (i) 2 BPC-FIR (3 BPCs); (ii) 3 BPC-FIR (4 BPCs) and (iii) 2 BPC-FIR + 1 BPC-IIR (4 BPCs).

In addition, during our experiments we found that, to ensure a reliable DPD performance, it is important to identify the BPC-LUT contents using a wideband signal capable of exciting the maximum number of memory states of the PA [Hay91]. Training the DPD with a wide, spectrally rich signal makes it possible to maintain the linearity performance when a narrower band signal is later applied, without the need to retrain the DPD, thus providing the desired independence [Ped05] of the specific signal applied. This is an important feature in variable bandwidth transmission schemes such as WiMAX or in multicarrier configurations, where the signal statistics in terms of PAPR and bandwidth may not be known a priori. This is experimentally highlighted in Fig. 6.11 and Fig. 6.12, which show the linearized power spectra of RF signals with different bandwidths (20 MHz, 12 MHz and 8 MHz) for memoryless DPD and for DPD with recursive memory compensation, respectively. The DPD has been trained using the widest bandwidth signal (20 MHz), which permits a robust DPD operation with narrower signal bandwidths, as shown in Fig. 6.11 and Fig. 6.12. Moreover, a better ACPR reduction can again be observed when using memory compensation in the DPD (2 BPC-FIR + 1 BPC-IIR, 4 BPCs) than when using a simple memoryless DPD, even without retraining between signal changes. Experimental results also show that, if adaptation is performed on the reduced bandwidth signal, the DPD performance is degraded when a wider signal is later applied, and further adaptation is then required.

#### 6.3.2 Single Carrier WCDMA Signal Test

To summarize the experimental results, we have considered the linearization of a single-carrier WCDMA signal. For that purpose, we first estimated the LUT contents of the DPD with a 10 MHz noisy wideband signal, as depicted in Fig. 6.10, for the following BPC arrangements: memoryless DPD, 2 BPC-FIR, 3 BPC-FIR and 2 BPC-FIR + 1 BPC-IIR. Once the DPD had been trained for each considered configuration, and the corresponding LUTs had been stored in the PC memory, the adaptation procedures were stopped.

We then check the linearization performance achieved when a signal different from the one used for the DPD identification is fed to the PA. Table 6.1 reports the results measured when applying a 5 MHz, 8 dB PAPR WCDMA signal, in terms of ACPR and EVM, for all the BPC combinations considered above. For each arrangement, the suitable BPCs are activated and properly filled with the LUT values derived during the adaptation procedure. Note that in Table 6.1, for the sake of an equivalent power comparison, a non-linearized back-off operation has also been considered, with an IBO defined as in (6.5).

**Figure 6.11:** WiMAX variable bandwidth and DPD reliability against signal bandwidth changes (20 MHz - 12 MHz - 8 MHz) for memoryless DPD.

Figure 6.12: WiMAX variable bandwidth and DPD reliability against signal bandwidth changes (20 MHz - 12 MHz - 8 MHz) for DPD with 2 BPC-FIR + 1 BPC-IIR (4 BPCs).

| DPD                           | ACPR (Left) | ACPR (Right) | EVM in Tx. |
|-------------------------------|-------------|--------------|------------|
| NO (PA back-off)              | -38.5 dB    | -38.0 dB     | 23 %       |
| 1 BPC (Memoryless)            | -39.5 dB    | -38.6 dB     | 10 %       |
| 3 BPC (2 BPC-FIR)             | -41.0 dB    | -40.1 dB     | 3 %        |
| 4 BPC (3 BPC-FIR)             | -45.0 dB    | -43.1 dB     | 3 %        |
| 4 BPC (2 BPC-FIR + 1 BPC-IIR) | -46.3 dB    | -45.4 dB     | 3 %        |

Table 6.1: 1 Carrier WCDMA: Output power = 40.8 dBm (12 W)

In addition, Fig. 6.13, Fig. 6.14, Fig. 6.15 and Fig. 6.16 show the measured output power spectra for the following DPD configurations: no DPD, memoryless DPD, 3 BPC-FIR and 2 BPC-FIR + 1 BPC-IIR, respectively. From the EVM point of view, DPD with memory compensation is clearly necessary to significantly reduce in-band distortion. In addition, better ACPR figures are achieved when considering more than 2 BPCs in the DPD structure and, among these solutions, the one combining 2 BPC-FIR + 1 BPC-IIR exhibits the best ACPR reduction.

#### 6.3.3 The Adaptation Policy

As discussed in Chapter 5, if the data from which the polynomial coefficients are derived does not cover all the PA dynamic range, the LS estimation can be underdetermined and thus no reliable estimations of the PA behavioral model can be ensured.

One possible solution to avoid these uncertainties consists in performing a selective adaptation procedure, in which only data buffers presenting PA input values above a certain power threshold are taken into account to perform the adaptation. Otherwise, the data buffers are rejected and a new set of data buffers is recorded. In this way, the PA model functions are estimated only when the stimuli are complete enough, in the sense that they cover a wide part of the PA dynamic range, thereby reducing the uncertainty and resulting in a reliable PA identification and later DPD operation. Moreover, it is possible to dynamically adjust the threshold to trade off accuracy against adaptation rate. A low threshold lowers the chances of data buffer rejection, but at the risk of underdetermination. Conversely, an excessive threshold value will result in a high buffer rejection rate, postponing the estimation.
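The selective acceptance rule described above can be sketched as a simple test on each captured buffer. The criterion used here, namely that the peak instantaneous power of the buffer reaches the threshold, is one possible reading of "values above a certain power threshold" and is an assumption, as are the function names and the normalized power scale.

```python
def buffer_is_acceptable(pa_input, power_threshold):
    """Accept a buffer only if its peak instantaneous power reaches the threshold."""
    peak_power = max(abs(s) ** 2 for s in pa_input)
    return peak_power >= power_threshold


def first_acceptable_buffer(buffers, power_threshold):
    """Return the first buffer that passes the test and the number of rejections."""
    rejected = 0
    for buf in buffers:
        if buffer_is_acceptable(buf, power_threshold):
            return buf, rejected
        rejected += 1
    return None, rejected


# Toy captures with normalized amplitudes (hypothetical data).
low_drive = [0.1 + 0.1j, 0.2 - 0.1j, 0.3 + 0.0j]
high_drive = [0.1 + 0.1j, 0.9 + 0.0j, 0.4 - 0.2j]

chosen, n_rejected = first_acceptable_buffer([low_drive, high_drive], 0.8)
print(chosen is high_drive, n_rejected)
```

Raising the threshold increases the number of rejected buffers before an estimation is performed, which is the accuracy versus adaptation rate trade-off mentioned in the text.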

On the other hand, because the LS estimation procedure does not take initial conditions into account, the result at each estimation step depends only on the current data, as no information about the past state is explicitly considered within the adaptation process. This dependence on the data used for the estimation can momentarily lead to undesired PA estimations, especially when using buffers of 2048 data samples, which may not be statistically representative. To avoid this, a degree of recursion is included in order to take past estimations into account. The new estimated coefficients of the polynomial are therefore calculated as a weighted sum of the past estimation state and the estimation resulting from the current data. This issue may not be of concern when laboratory setups are used for delayed, off-line DPD [Kim06, Mor06, Liu06], where large acquisition capabilities may allow a reliable one-step estimation without the need for recursion.

Figure 6.13: Output power spectra of a WCDMA signal without DPD.

**Figure 6.14:** Output power spectra of a WCDMA signal considering memoryless DPD.

Figure 6.15: Output power spectra of a WCDMA signal considering DPD with 3 BPC-FIR.

**Figure 6.16:** Output power spectra of a WCDMA signal considering DPD with 2 BPC-FIR + 1 BPC-IIR.

Figure 6.17: Flow diagram of the DPD adaptation procedure.

The whole recursive estimation/adaptation procedure is illustrated in the flowchart depicted in Fig. 6.17. The current estimation state is represented by the tag EState, where $EState_i$ represents the LS solution for $\hat{\delta}$ (the solution array containing the coefficients of the polynomials defining the NARMA PA model) attained at the $i$-th adaptation step, while $\mu$ is the recursion forgetting factor. Concurrently with the estimation, a continuous flow of data is being predistorted and transmitted with the current EState settings, of which only a small fraction is taken into account for estimation purposes. With the adaptive procedure described here, a good adaptive behavior is observed while the DPD reliability is reinforced. Moreover, the system converges very fast, as shown in Fig. 6.18, where the EVM evolution is tracked for each adaptation step, reaching a steady state within 2 to 4 steps. The raw EVM, calculated from the unmodulated raw signal, is around 4 % - 5 % for all the DPD configurations that take memory effects into account, while the memoryless DPD is unable to provide EVM values lower than 11 %.

A significant issue that can affect the robustness of the DPD presented here is related to possible instabilities associated with the recursive IIR terms. As explained in previous sections, a small-gain test can be performed in order to preserve the overall DPD stability. This test was performed during the preliminary PA characterization stages, when identifying the optimal delays defining the PA memory effects, ensuring that the nonlinear functions associated with the recursive BPCs were bounded below a certain threshold that guaranteed stability. For instance, the small-gain test suggested that some delays were to be avoided in order to prevent instabilities.
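The weighted update of the estimation state can be sketched as follows. The text only states that the new coefficients are a weighted sum of the past state and the current LS solution, with $\mu$ as forgetting factor; the specific convention used below, $EState_i = \mu \, EState_{i-1} + (1 - \mu)\,\hat{\delta}_i$, is an assumption, and the function name is hypothetical. The LS solver itself is not reproduced here.

```python
def update_estate(prev_estate, ls_solution, mu):
    """Blend the previous estimation state with the current LS solution.

    Assumed convention: EState_i = mu * EState_{i-1} + (1 - mu) * delta_hat_i.
    With no previous state, the current LS solution is adopted as is.
    """
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu is expected to lie in [0, 1]")
    if prev_estate is None:
        return list(ls_solution)
    if len(prev_estate) != len(ls_solution):
        raise ValueError("coefficient arrays must have the same length")
    return [mu * p + (1.0 - mu) * c for p, c in zip(prev_estate, ls_solution)]


# Toy coefficient arrays (hypothetical values) over three adaptation steps.
state = None
for delta_hat in ([1.0, 2.0], [3.0, 4.0], [3.0, 4.0]):
    state = update_estate(state, delta_hat, mu=0.5)
    print(state)
```

A larger $\mu$ gives more weight to past estimations and smooths out buffers that are not statistically representative, at the cost of a slower reaction to actual changes in the PA behavior.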



Figure 6.18: EVM raw of a wideband signal for different DPD configurations.

#### 6.3.4 DPD Power Consumption

This subsection evaluates the energy cost of the DPD, measured on the presented FPGA implementation. Although the power consumption of digital circuits depends strongly on each particular implementation, on the target device (ASIC or FPGA) and on the CMOS technology parameters, the particular results shown here are aimed at assessing the relative contribution of the DPD to the overall energy balance of the transmitter.

In FPGA devices, power consumption has both static and dynamic contributions and depends on the supply level, as stated by the classical CMOS power consumption approximation:

$$P_{CMOS} = P_{static} + P_{dyn} \propto \sum i_{leak} \cdot V_{DD} + \sum_{i=1}^{N} \rho_i \cdot f_{clock,i} \cdot C_{load,i} \cdot V_{DD,i}^2 \tag{6.6}$$

Static power $(P_{static})$ consumption is due to leakage currents $(i_{leak})$ in the FPGA transistors, and depends mainly on the device size. Dynamic power $(P_{dyn})$ consumption, due to gates being switched between low and high logic states, depends on the number of gates within the design $(N)$, which in our case depends on the number of BPCs. For each gate, consumption depends on its activity profile $(\rho)$, clock frequency $(f_{clock})$ and load capacitance $(C_{load})$. In our measurements, a $\rho = 50$ % transition profile has been considered for the DPD signal vectors involved. Incidentally, because the nonlinear functions are mapped into the BPC LUTs, the DPD consumption does not depend on the polynomial degree of the PA estimator, but rather on the number of BPCs, as stated before.

Figure 6.19: DPD core power consumption vs. DPD clock and number of BPCs.

The following results on DPD power consumption have been obtained with the Xilinx XPower utility. As a first approach, the measurements are performed over the placed and routed design of the DPD core only, and do not include the remaining non-DPD-related logic included in the FPGA device (mainly devoted to communications and data exchange with the host PC). Fig. 6.19 shows the dependence of the DPD core power consumption on the DPD clock and the number of BPCs. At a DPD clock frequency of 105 MHz, an increase of 36 mW per BPC is reported, whereas at 50 MHz the increase is 21 mW per BPC. Note that increasing the BPC count results in a relatively low power increase when the 1 BPC case is taken as a reference. This is due to the different supply domains within the FPGA device [vir07]. Most of the computing-intensive DPD logic is placed in low supply internal banks (1.2 V), where $C_{load}$ is also low, and therefore contributes little to the dynamic consumption in (6.6). In contrast, the power consumption is dominated by a few signals switching in and out of the DPD core, mainly the I and Q predistorted data vectors feeding the D/A converters, because of their higher supply (3.3 V) and load capacitances. To provide a qualitative framework for the overall DPD energy cost, Table 6.2 reports the main contributions to power consumption in the proposed DPD design. Clearly, the adaptive functionalities are the main sources of power consumption: the A/D converters, the non-DPD-related FPGA logic and the adaptation algorithm running on a PC or DSP. Nevertheless, their contribution can reasonably be neglected during regular DPD operation, when no adaptation has to be performed for most of the time, and hence only the DPD-related FPGA logic is active.
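The weight of the supply voltage in the dynamic term of (6.6) can be illustrated with a short calculation. The sketch below evaluates $\rho \cdot f_{clock} \cdot C_{load} \cdot V_{DD}^2$ for one net in each supply domain; the 1 pF load capacitance is a hypothetical placeholder chosen equal in both cases so that only the supply effect is visible, and the function name is illustrative.

```python
def dynamic_power(nets):
    """Sum of rho * f_clock * C_load * V_DD^2 over a list of switching nets."""
    return sum(n["rho"] * n["f_hz"] * n["c_load_f"] * n["v_dd"] ** 2 for n in nets)


# One net per supply domain, same activity, clock and (hypothetical) load.
core_net = {"rho": 0.5, "f_hz": 105e6, "c_load_f": 1e-12, "v_dd": 1.2}
io_net = {"rho": 0.5, "f_hz": 105e6, "c_load_f": 1e-12, "v_dd": 3.3}

p_core = dynamic_power([core_net])
p_io = dynamic_power([io_net])
print(f"core net: {p_core * 1e6:.1f} uW, I/O net: {p_io * 1e6:.1f} uW")
print(f"ratio: {p_io / p_core:.2f}")
```

For equal loads, the 3.3 V net dissipates $(3.3/1.2)^2 \approx 7.6$ times more than the 1.2 V net. Since the text indicates that the I/O loads are also larger than the internal ones, the actual gap would be wider, which is consistent with the few I/O signals dominating the consumption.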

To sum up, the impact of the DPD on the overall efficiency can be regarded as almost negligible in high power applications where the PA power capabilities exceed tens of watts, as is the case in the experimental work presented here. In view of this, and given that DPD may be unavoidable to counteract memory effects, one may consider varying the PA operation point in order to increase its efficiency, and let the DPD compensate for the distortion introduced when operating at this less linear PA operation point. This topic is discussed in the following section.

| Component              | Power consumption              | Remarks                                                         |
|------------------------|--------------------------------|-----------------------------------------------------------------|
| FPGA DPD-related logic | 1 W max                        | Active whenever a signal is transmitted                         |
| FPGA complete design   | 4 W max                        | Worst case, to take into account only when adaptation is active |
| A/D converters         | 1.5 W (x 2) = 3 W              | Can enter low power mode when adaptation is not active          |
| Downconverter          | 300 mW                         | Can enter low power mode when adaptation is not active          |
| Adaptation block       | 1-2 W (DSP)<br>~ 100's W (PC)  | To take into account only when adaptation is active             |
| Overall                | 1 W max                        | Considering that no adaptation is performed                     |

Table 6.2: DPD ENERGETIC COST

# 6.4 DPD as Enabler to Improve PA Efficiency

DPD linearization techniques are widely recognized as enablers of PA efficiency. By extending the usable linear dynamic range of a PA up to its compression point, DPD implicitly contributes to efficiency, since it avoids resorting to an oversized, more backed-off and less efficient PA device to produce the desired output power and linearity levels. This reasoning is illustrated in Table 6.3, which presents the measured linearity and efficiency figures when amplifying a single WCDMA carrier, with and without DPD, for the same experimental setup as in the preceding sections. The PA delivering a given RF output power (42 dBm) without linearization consumes less than the DPD-linearized PA delivering the same RF output power. Although this result may seem contradictory, since the non-linearized PA appears to be more efficient than the linearized one, the ACPR figures show that this misleading efficiency improvement is obtained at the price of poorer linearity, and thus the two cases are not directly comparable.

Therefore, taking compliance with a certain standardized ACPR level (e.g. -44 dBr) as the reference for comparison, it is clear that the PA without linearization has to operate with significant back-off, dramatically reducing its efficiency. Moreover, its output power

| DPD                            | Output power | $P_{DC}$ ($V_{DC} = 28$ V) | ACPR    | Efficiency |
|--------------------------------|--------------|----------------------------|---------|------------|
| 4 BPCs (2 BPC-FIR + 1 BPC-IIR) | 42 dBm       | 126 W\*                    | -44 dBr | 12.58 %    |
| NO (PA alone)                  | 42 dBm       | 120.4 W                    | -39 dBr | 13.16 %    |
| NO (PA back-off)               | 37 dBm       | 81.2 W                     | -44 dBr | 6.17 %     |

Table 6.3: Single Carrier WCDMA: Linearity vs Efficiency

capability is reduced by approximately 5 dB.
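The efficiency column of Table 6.3 can be cross-checked from the other two measured columns. The short sketch below assumes that efficiency is defined as RF output power divided by DC power; the helper names are illustrative, and the numbers are those of the table.

```python
def dbm_to_watts(p_dbm):
    """Convert a power level in dBm to watts."""
    return 10 ** ((p_dbm - 30.0) / 10.0)


def efficiency_percent(p_out_dbm, p_dc_w):
    """Efficiency as RF output power over DC power, in percent."""
    return 100.0 * dbm_to_watts(p_out_dbm) / p_dc_w


# (output power in dBm, DC power in W), as reported in Table 6.3
rows = {
    "DPD, 4 BPCs": (42.0, 126.0),
    "PA alone": (42.0, 120.4),
    "PA back-off": (37.0, 81.2),
}

efficiencies = {name: efficiency_percent(p, dc) for name, (p, dc) in rows.items()}
for name, eff in efficiencies.items():
    print(f"{name}: {eff:.2f} %")
```

Under that definition the three table entries (12.58 %, 13.16 % and 6.17 %) are recovered, and the linearized PA turns out to be roughly twice as efficient as the backed-off PA at the same ACPR.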

Besides, DPD is commonly used as an explicit efficiency enabler in another way: by increasing the overall linear gain $G_{lin}$ (see Fig. 6.9) when a certain level of signal clipping can be tolerated. That is, for a signal whose peak power is rarely reached, it is possible to increase $G_{lin}$, and with it the output power and the efficiency. The result is linear amplification up to compression; on the rare signal peaks in which the PA saturates, the energy contribution to the average power spectral density will be negligible as long as the clipping probability is kept small.
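To give a feeling for how quickly the clipping probability falls as the saturation level moves away from the average power, the following sketch assumes that the signal behaves like a complex Gaussian process, so that its instantaneous power is exponentially distributed. This is a common approximation for multicarrier signals and is an assumption of the example, not a property measured in this work; the function name is illustrative.

```python
import math


def clipping_probability(clip_to_avg_db):
    """Probability that the instantaneous power exceeds the clipping level.

    Assumes a complex Gaussian signal, whose instantaneous power follows an
    exponential distribution: P(p > x * P_avg) = exp(-x).
    clip_to_avg_db is the clipping level relative to the average power, in dB.
    """
    ratio = 10 ** (clip_to_avg_db / 10.0)
    return math.exp(-ratio)


for level_db in (6.0, 8.0, 10.0):
    print(f"clipping level {level_db:.0f} dB above average: "
          f"P(clip) = {clipping_probability(level_db):.2e}")
```

Under this assumption each additional dB of margin between the average power and the saturation level reduces the clipping probability sharply, which is why a small increase of $G_{lin}$ can be tolerated while a large one cannot.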

In the following, we focus on yet another explicit possibility to exploit DPD as an efficiency enabler. Given that DPD is advisable, at least to counteract memory effects in the time domain, it seems reasonable to consider adjusting the PA quiescent point in order to increase its efficiency. That is, a class-AB PA is shifted towards class-B like operation, and the DPD is then left to compensate for the extra linearity degradation caused by the change of quiescent point. As depicted in Fig. 6.20, the AM-AM characteristic of the PA presents crossover distortion, superimposed on the dispersion caused by memory effects, which cannot be corrected with a simple memoryless DPD function. However, the NARMA-DPD with 6 BPCs is capable of both linearizing the crossover characteristic and reducing the scattering present in the AM-AM characteristic.

As expected, in the class-B like operation mode the PA consumes less power. Therefore, for a given output power level (i.e. 40.5 dBm), the DPD makes it possible to achieve the same linearity level (ACPR = -44 dBr) provided by the PA in class-A like mode of operation while improving efficiency, as shown in Fig. 6.21. In the frequency domain, Fig. 6.22 shows the spectral regrowth compensation achieved by the NARMA-DPD for

<sup>\*</sup> The power consumption of the DPD circuitry has been neglected since it is small compared to the PA consumption.




Figure 6.20: AM/AM characteristics of PA operating in class-B like mode.

both class-A like and class-B like operation modes, considering an 8 MHz 64-QAM modulated signal with a PAPR of approximately 7 dB. In class-B like operation, compensating crossover distortion not only reduces spectral regrowth but also makes it possible to reach higher peak powers than without DPD.

Clearly, this quiescent point manipulation is limited by the progressive drop in maximum output power as the quiescent point moves towards class-B operation. Nevertheless, the study presented here shows that DPD can successfully counteract the excess nonlinearity, suggesting that DPD can be coupled with variable biasing strategies to boost PA efficiency, for example during periods in which the maximum nominal output power is not required. From the DPD point of view, this could be done simply by downloading into the BPC LUTs the gain values corresponding to each particular bias point and, when appropriate, switching BPCs on or off to cover the desired memory effects compensation span.
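The bias-dependent reconfiguration described above can be sketched in software terms. The example below is purely illustrative: `BiasProfile` and `DpdCore` are hypothetical stand-ins, not the interface of the FPGA design, and the LUT gains are placeholders rather than identified values.

```python
from dataclasses import dataclass, field


@dataclass
class BiasProfile:
    """LUT contents for one PA bias point (hypothetical container)."""
    name: str
    lut_gains: list  # one list of complex gains per BPC, identified offline


@dataclass
class DpdCore:
    """Stand-in for the FPGA DPD core: one LUT and one enable flag per BPC."""
    num_bpcs: int
    luts: list = field(default_factory=list)
    enabled: list = field(default_factory=list)

    def apply_profile(self, profile, active_bpcs):
        """Download the LUTs of a bias profile and enable the first active_bpcs BPCs."""
        if not 1 <= active_bpcs <= self.num_bpcs:
            raise ValueError("active_bpcs out of range")
        if len(profile.lut_gains) < active_bpcs:
            raise ValueError("profile does not provide enough LUTs")
        self.luts = [list(gains) for gains in profile.lut_gains[:active_bpcs]]
        self.enabled = [i < active_bpcs for i in range(self.num_bpcs)]


# Placeholder gains only; real values would come from the adaptation stage.
class_b_like = BiasProfile("class-B like", [[1.0 + 0.0j] * 4 for _ in range(6)])

core = DpdCore(num_bpcs=6)
core.apply_profile(class_b_like, active_bpcs=4)
print(core.enabled)
```

Keeping one stored profile per bias point means that a change of quiescent point only requires a LUT download and an update of the enable flags, with no re-identification at switching time. This is a design assumption of the sketch rather than a measured result.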

# 6.5 Summary

This Chapter has aimed at experimentally validating the NARMA-based predictive DPD described in Chapter 5. The DPD function is structured in an FPGA as a set of BPCs, responsible for the real-time predistortion. A host PC, in turn, not only exchanges data buffers with the FPGA but is also responsible for adapting the LUTs that make up the BPCs. The RF chain is composed of three pre-amplifiers and a final class-AB power amplifier from Freescale operating at 1.9 GHz.



**Figure 6.21:** Measured power consumption and efficiency for both class-A like and class-B like PA modes of operation with DPD.



**Figure 6.22:** Unlinearized and linearized (dynamic DPD) power spectra for both class-A like (left) and class-B like (right) PA operation modes.


As a first step, some preliminary results have been presented to show the scalability and reliability of the implemented DPD. On the one hand, the DPD is scalable, since the number of active BPCs contributing to the DPD process can be selected. On the other hand, results have shown that by training the DPD function with an appropriate signal (in terms of BW and PAPR), it is possible to later handle signals with variable bandwidth and high PAPR.

Later, particularizing for a WCDMA carrier and taking into account aspects related to power consumption, the linearity and efficiency performance achieved by our NARMA-based DPD has been highlighted, and some interesting conclusions can be drawn:

- As expected, the EVM and ACPR results obtained in simulation cannot be fully matched experimentally. Simulations rely on a PA model that, despite being accurate enough to reproduce the PA nonlinear dynamics, remains a simplification of reality. Nevertheless, simulation results are of significant importance for validating algorithms and anticipating some problems, for example those related to the adaptation policy.
- Results have shown that, when PA dynamics are taken into account, a memoryless DPD is insufficient to compensate EVM distortion at the transmitter antenna. Therefore, the use of multiple BPCs (NARMA DPD), which complement the nonlinear compensation with dynamic pre-compensation, is a must when dealing with high-speed envelope signals.
- Recursive BPCs (the so-called BPC-IIR) have the advantage of being more accurate than direct BPCs (BPC-FIR), but their use may lead to instabilities in the DPD function. Therefore, a prior study determining the best delays for the NARMA DPD function (thus avoiding possible instabilities) has to be carried out.
- Compared to the power consumed by the amplification chain (pre-amplifiers and main amplifier), the FPGA power consumption needed to perform the DPD function is negligible. Moreover, the NARMA DPD can be used to handle the classical trade-off between efficiency and linearity by forcing the PA to operate close to saturation but in a linear manner.

Finally, results have shown that it is possible to trade PA linearity for efficiency by biasing the PA at a more efficient operating point, and to compensate for the resulting degradation by means of the proposed NARMA-based predictive DPD.