Mobile Communications

Author:
Jerry D. Gibson


Contents

PART I Basic Principles

1 Complex Envelope Representations for Modulated Signals, Leon W. Couch, II
2 Sampling, Hwei P. Hsu
3 Pulse Code Modulation, Leon W. Couch, II
4 Baseband Signalling and Pulse Shaping, Michael L. Honig and Melbourne Barton
5 Channel Equalization, John G. Proakis
6 Line Coding, Joseph L. LoCicero and Bhasker P. Patel
7 Echo Cancellation, Giovanni Cherubini
8 Pseudonoise Sequences, Tor Helleseth and P. Vijay Kumar
9 Optimum Receivers, Geoffrey C. Orsak
10 Forward Error Correction Coding, V.K. Bhargava and I.J. Fair
11 Spread Spectrum Communications, Laurence B. Milstein and Marvin K. Simon
12 Diversity, Arogyaswami J. Paulraj
13 Digital Communication System Performance, Bernard Sklar
14 Telecommunications Standardization, Spiros Dimolitsas and Michael Onufry

PART II Wireless

15 Wireless Personal Communications: A Perspective, Donald C. Cox
16 Modulation Methods, Gordon L. Stüber
17 Access Methods, Bernd-Peter Paris
18 Rayleigh Fading Channels, Bernard Sklar
19 Space-Time Processing, Arogyaswami J. Paulraj
20 Location Strategies for Personal Communications Services, Ravi Jain, Yi-Bing Lin, and Seshadri Mohan
21 Cell Design Principles, Michel Daoud Yacoub
22 Microcellular Radio Communications, Raymond Steele
23 Fixed and Dynamic Channel Assignment, Bijan Jabbari
24 Radiolocation Techniques, Gordon L. Stüber and James J. Caffery, Jr.
25 Power Control, Roman Pichna and Qiang Wang
26 Enhancements in Second Generation Systems, Marc Delprat and Vinod Kumar
27 The Pan-European Cellular System, Lajos Hanzo
28 Speech and Channel Coding for North American TDMA Cellular Systems, Paul Mermelstein
29 The British Cordless Telephone Standard: CT-2, Lajos Hanzo
30 Half-Rate Standards, Wai-Yip Chan, Ira Gerson, and Toshio Miki
31 Wireless Video Communications, Madhukar Budagavi and Raj Talluri
32 Wireless LANs, Suresh Singh
33 Wireless Data, Allen H. Levesque and Kaveh Pahlavan
34 Wireless ATM: Interworking Aspects, Melbourne Barton, Matthew Cheng, and Li Fung Chang

© 1999 by CRC Press LLC

Hsu, H.P. “Sampling” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Sampling

Hwei P. Hsu
Fairleigh Dickinson University

2.1 Introduction
2.2 Instantaneous Sampling
    Ideal Sampled Signal • Band-Limited Signals
2.3 Sampling Theorem
2.4 Sampling of Sinusoidal Signals
2.5 Sampling of Bandpass Signals
2.6 Practical Sampling
    Natural Sampling • Flat-Top Sampling
2.7 Sampling Theorem in the Frequency Domain
2.8 Summary and Discussion
Defining Terms
References
Further Information

2.1

Introduction

To transmit analog message signals, such as speech signals or video signals, by digital means, the signal has to be converted into digital form. This process is known as analog-to-digital conversion. The sampling process is the first process performed in this conversion, and it converts a continuous-time signal into a discrete-time signal or a sequence of numbers. Digital transmission of analog signals is possible by virtue of the sampling theorem, and the sampling operation is performed in accordance with the sampling theorem. In this chapter, using the Fourier transform technique, we present this remarkable sampling theorem and discuss the operation of sampling and practical aspects of sampling.

2.2

Instantaneous Sampling

Suppose we sample an arbitrary analog signal m(t) shown in Fig. 2.1(a) instantaneously at a uniform rate, once every Ts seconds. As a result of this sampling process, we obtain an infinite sequence of samples {m(nTs)}, where n takes on all possible integers. This form of sampling is called instantaneous sampling. We refer to Ts as the sampling interval, and its reciprocal 1/Ts = fs as the sampling rate. Sampling rate (samples per second) is often cited in terms of sampling frequency expressed in hertz.

FIGURE 2.1: Illustration of instantaneous sampling and sampling theorem. [Panels (a)–(j) show the time-domain signals m(t), δTs(t), and ms(t) alongside their spectra M(ω), F[δTs(t)], and Ms(ω).]

2.2.1

Ideal Sampled Signal

Let ms(t) be obtained by multiplication of m(t) by the unit impulse train δTs(t) with period Ts [Fig. 2.1(c)], that is,

$$
m_s(t) = m(t)\,\delta_{T_s}(t) = m(t)\sum_{n=-\infty}^{\infty}\delta(t-nT_s)
= \sum_{n=-\infty}^{\infty} m(t)\,\delta(t-nT_s)
= \sum_{n=-\infty}^{\infty} m(nT_s)\,\delta(t-nT_s) \tag{2.1}
$$

where we used the property of the δ function, m(t)δ(t − t0) = m(t0)δ(t − t0). The signal ms(t) [Fig. 2.1(e)] is referred to as the ideal sampled signal.

2.2.2

Band-Limited Signals

A real-valued signal m(t) is called a band-limited signal if its Fourier transform M(ω) satisfies the condition

$$
M(\omega) = 0 \quad \text{for } |\omega| > \omega_M \tag{2.2}
$$

where ωM = 2πfM [Fig. 2.1(b)]. A band-limited signal specified by Eq. (2.2) is often referred to as a low-pass signal.

2.3

Sampling Theorem

The sampling theorem states that a band-limited signal m(t) specified by Eq. (2.2) can be uniquely determined from its values m(nTs) sampled at uniform interval Ts if Ts ≤ π/ωM = 1/(2fM). In fact, when Ts = π/ωM, m(t) is given by

$$
m(t) = \sum_{n=-\infty}^{\infty} m(nT_s)\,\frac{\sin \omega_M (t-nT_s)}{\omega_M (t-nT_s)} \tag{2.3}
$$

which is known as the Nyquist–Shannon interpolation formula; it is also sometimes called the cardinal series. The sampling interval Ts = 1/(2fM) is called the Nyquist interval, and the minimum rate fs = 1/Ts = 2fM is known as the Nyquist rate. Illustration of the instantaneous sampling process and the sampling theorem is shown in Fig. 2.1. The Fourier transform of the unit impulse train is given by [Fig. 2.1(d)]

$$
\mathcal{F}\left[\delta_{T_s}(t)\right] = \omega_s \sum_{n=-\infty}^{\infty} \delta(\omega - n\omega_s), \qquad \omega_s = 2\pi/T_s \tag{2.4}
$$

Then, by the convolution property of the Fourier transform, the Fourier transform Ms(ω) of the ideal sampled signal ms(t) is given by

$$
M_s(\omega) = \frac{1}{2\pi}\left[ M(\omega) * \omega_s \sum_{n=-\infty}^{\infty} \delta(\omega - n\omega_s) \right]
= \frac{1}{T_s} \sum_{n=-\infty}^{\infty} M(\omega - n\omega_s) \tag{2.5}
$$

where ∗ denotes convolution and we used the convolution property of the δ-function M(ω) ∗ δ(ω − ω0) = M(ω − ω0). Thus, the sampling has produced images of M(ω) along the frequency axis. Note that Ms(ω) will repeat periodically without overlap as long as ωs ≥ 2ωM or fs ≥ 2fM [Fig. 2.1(f)]. It is clear from Fig. 2.1(f) that we can recover M(ω) and, hence, m(t) by passing the sampled signal ms(t) through an ideal low-pass filter having frequency response

$$
H(\omega) = \begin{cases} T_s, & |\omega| \le \omega_M \\ 0, & \text{otherwise} \end{cases} \tag{2.6}
$$

where ωM = π/Ts. Then

$$
M(\omega) = M_s(\omega)\,H(\omega) \tag{2.7}
$$

Taking the inverse Fourier transform of Eq. (2.6), we obtain the impulse response h(t) of the ideal low-pass filter as

$$
h(t) = \frac{\sin \omega_M t}{\omega_M t} \tag{2.8}
$$

Taking the inverse Fourier transform of Eq. (2.7), we obtain

$$
m(t) = m_s(t) * h(t)
= \sum_{n=-\infty}^{\infty} m(nT_s)\,\delta(t-nT_s) * \frac{\sin \omega_M t}{\omega_M t}
= \sum_{n=-\infty}^{\infty} m(nT_s)\,\frac{\sin \omega_M (t-nT_s)}{\omega_M (t-nT_s)} \tag{2.9}
$$

which is Eq. (2.3).

The situation shown in Fig. 2.1(j) corresponds to the case where fs < 2fM. In this case there is an overlap between M(ω) and M(ω − ωs). This overlap of the spectra is known as aliasing or foldover. When aliasing occurs, the signal is distorted and it is impossible to recover the original signal m(t) from the sampled signal. To avoid aliasing, in practice, the signal is sampled at a rate slightly higher than the Nyquist rate. If fs > 2fM, then as shown in Fig. 2.1(f), there is a gap between the upper limit ωM of M(ω) and the lower limit ωs − ωM of M(ω − ωs). This range from ωM to ωs − ωM is called a guard band.

As an example, speech transmitted via telephone is generally limited to fM = 3.3 kHz (by passing the speech signal through a low-pass filter before sampling). The Nyquist rate is, thus, 6.6 kHz. For digital transmission, the speech is normally sampled at the rate fs = 8 kHz. The guard band is then fs − 2fM = 1.4 kHz. The use of a sampling rate higher than the Nyquist rate also has the desirable effect of making it somewhat easier to design the low-pass reconstruction filter so as to recover the original signal from the sampled signal.
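The interpolation formula of Eq. (2.3) and the guard-band arithmetic can be tried numerically. The following is a minimal sketch, not part of the original chapter: the function names, the 1 Hz cosine test signal, and the truncation of the infinite sum to a finite list of samples are all assumptions made for illustration, so the reconstructed value is only approximate.

```python
import math

def sinc_interpolate(samples, n_start, Ts, t):
    """Evaluate a truncated form of Eq. (2.3) at time t.

    samples[k] holds m((n_start + k) * Ts); the infinite sum is cut off
    at the ends of the list, so the result is only approximate.
    """
    w_M = math.pi / Ts  # Eq. (2.3) assumes Ts = pi / w_M
    total = 0.0
    for k, m_n in enumerate(samples):
        x = w_M * (t - (n_start + k) * Ts)
        total += m_n * (1.0 if x == 0.0 else math.sin(x) / x)
    return total

def guard_band(fs, fM):
    """Width fs - 2*fM of the gap between adjacent spectral images."""
    return fs - 2.0 * fM

# Hypothetical test signal: a 1 Hz cosine sampled at fs = 8 Hz.
fs, f0, N = 8.0, 1.0, 2000
Ts = 1.0 / fs
samples = [math.cos(2.0 * math.pi * f0 * n * Ts) for n in range(-N, N + 1)]
t = 0.0625  # halfway between two sampling instants

# The two printed values should agree closely (truncation error only).
print(sinc_interpolate(samples, -N, Ts, t), math.cos(2.0 * math.pi * f0 * t))
print(guard_band(8.0, 3.3))  # telephone speech example, in kHz
```

Because the terms of the series decay only as 1/n, many samples are needed for a close match; this slow convergence is one practical reason for sampling above the Nyquist rate and using a realizable reconstruction filter.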

2.4

Sampling of Sinusoidal Signals

A special case is the sampling of a sinusoidal signal having the frequency fM. In this case we require that fs > 2fM rather than fs ≥ 2fM. To see that this condition is necessary, let fs = 2fM. Now, if an initial sample is taken at the instant the sinusoidal signal is zero, then all successive samples will also be zero. This situation is avoided by requiring fs > 2fM.
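The degenerate case described above is easy to reproduce. This short sketch, with a hypothetical helper `sample_sine` and an arbitrary 1 Hz test frequency, samples a sine wave starting at a zero crossing, first at exactly fs = 2fM and then at a slightly higher rate.

```python
import math

def sample_sine(f, fs, n_samples, phase=0.0):
    """Samples of sin(2*pi*f*t + phase) at t = n/fs, n = 0 .. n_samples-1."""
    return [math.sin(2.0 * math.pi * f * n / fs + phase) for n in range(n_samples)]

f_M = 1.0  # hypothetical sinusoid frequency in Hz

at_nyquist = sample_sine(f_M, 2.0 * f_M, 16)     # fs = 2 fM, first sample at a zero crossing
above_nyquist = sample_sine(f_M, 2.5 * f_M, 16)  # fs > 2 fM

print(max(abs(s) for s in at_nyquist))     # zero up to floating-point rounding
print(max(abs(s) for s in above_nyquist))  # clearly nonzero
```

With a different starting phase the samples at fs = 2fM are not all zero, but their amplitude then depends on that phase, so the sinusoid still cannot be recovered unambiguously.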

2.5

Sampling of Bandpass Signals

A real-valued signal m(t) is called a bandpass signal if its Fourier transform M(ω) satisfies the condition

$$
M(\omega) = 0 \quad \text{except for} \quad
\begin{cases} \omega_1 < \omega < \omega_2 \\ -\omega_2 < \omega < -\omega_1 \end{cases} \tag{2.10}
$$

where ω1 = 2πf1 and ω2 = 2πf2 [Fig. 2.2(a)]. The sampling theorem for a band-limited signal has shown that a sampling rate of 2f2 or greater is adequate for a low-pass signal having the highest frequency f2. Therefore, treating m(t) specified by Eq. (2.10) as a special case of such a low-pass signal, we conclude that a sampling rate of 2f2 is adequate for the sampling of the bandpass signal m(t). But it is not necessary to sample this fast. The minimum allowable sampling rate depends on f1, f2, and the bandwidth fB = f2 − f1.

FIGURE 2.2: (a) Spectrum of a bandpass signal; (b) Shifted spectra of M−(ω). [Panel (a) shows the negative-side spectrum M−(ω) on (−ω2, −ω1) and the positive-side spectrum M+(ω) on (ω1, ω2), each of width ωB. Panel (b) shows the shifted copies M−[ω − (k − 1)ωs] and M−(ω − kωs), with band edges at (k − 1)ωs − ω1 and kωs − ω2, on either side of the band (ω1, ω2).]

Let us consider the direct sampling of the bandpass signal specified by Eq. (2.10). The spectrum of the sampled signal is periodic with the period ωs = 2πfs, where fs is the sampling frequency, as in Eq. (2.4). Shown in Fig. 2.2(b) are the two right-shifted spectra of the negative-side spectrum M−(ω). If the bandpass signal is to be recovered by passing the sampled signal through an ideal bandpass filter covering the frequency bands (−ω2, −ω1) and (ω1, ω2), it is necessary that there be no aliasing problem. From Fig. 2.2(b), it is clear that to avoid overlap it is necessary that

$$
\omega_s \ge 2(\omega_2 - \omega_1) \tag{2.11}
$$

$$
(k-1)\,\omega_s - \omega_1 \le \omega_1 \tag{2.12}
$$

and

$$
k\,\omega_s - \omega_2 \ge \omega_2 \tag{2.13}
$$

where ω1 = 2πf1, ω2 = 2πf2, and k is an integer (k = 1, 2, . . .). Since f1 = f2 − fB, these constraints can be expressed as

$$
1 \le k \le \frac{f_2}{f_B} \le \frac{k}{2}\,\frac{f_s}{f_B} \tag{2.14}
$$

and

$$
\frac{k-1}{2}\,\frac{f_s}{f_B} \le \frac{f_2}{f_B} - 1 \tag{2.15}
$$

A graphical description of Eqs. (2.14) and (2.15) is illustrated in Fig. 2.3. The unshaded regions represent where the constraints are satisfied, whereas the shaded regions represent the regions where the constraints are not satisfied and overlap will occur. The solid line in Fig. 2.3 shows the locus of the minimum sampling rate. The minimum sampling rate is given by

$$
\min\{f_s\} = \frac{2f_2}{m} \tag{2.16}
$$

where m is the largest integer not exceeding f2/fB. Note that if the ratio f2/fB is an integer, then the minimum sampling rate is 2fB. As an example, consider a bandpass signal with f1 = 1.5 kHz and f2 = 2.5 kHz. Here fB = f2 − f1 = 1 kHz, and f2/fB = 2.5. Then from Eq. (2.16) and Fig. 2.3 we see that the minimum sampling rate is 2f2/2 = f2 = 2.5 kHz, and allowable ranges of sampling rate are 2.5 kHz ≤ fs ≤ 3 kHz and fs ≥ 5 kHz (= 2f2).

FIGURE 2.3: Minimum and permissible sampling rates for a bandpass signal. [Axes: fs/fB versus f2/fB; permissible regions are labeled k = 1, k = 2, and k = 3.]
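The constraints of Eqs. (2.14) and (2.15) can be turned into a small calculator for the permissible sampling rates. The sketch below is illustrative only: the function names are hypothetical, and the lower and upper limits for each k are obtained by rearranging Eqs. (2.14) and (2.15) as fs ≥ 2f2/k and fs ≤ 2f1/(k − 1), with no upper limit for k = 1.

```python
import math

def bandpass_sampling_ranges(f1, f2):
    """Permissible (k, fs_min, fs_max) triples for a band f1..f2.

    For each integer k with 1 <= k <= f2/fB, Eq. (2.14) gives
    fs >= 2*f2/k and Eq. (2.15) gives fs <= 2*f1/(k - 1).
    """
    fB = f2 - f1
    k_max = math.floor(f2 / fB)  # may be off by one if f2/fB suffers rounding error
    ranges = []
    for k in range(k_max, 0, -1):
        lo = 2.0 * f2 / k
        hi = math.inf if k == 1 else 2.0 * f1 / (k - 1)
        ranges.append((k, lo, hi))
    return ranges

def min_bandpass_rate(f1, f2):
    """Minimum sampling rate of Eq. (2.16): 2*f2/m with m = floor(f2/fB)."""
    m = math.floor(f2 / (f2 - f1))
    return 2.0 * f2 / m

# Example from the text: f1 = 1.5 kHz, f2 = 2.5 kHz (values in kHz).
print(bandpass_sampling_ranges(1.5, 2.5))
print(min_bandpass_rate(1.5, 2.5))
```

For the example in the text this reproduces the two allowable ranges, 2.5 to 3 kHz (k = 2) and 5 kHz upward (k = 1).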

2.6

Practical Sampling

In practice, the sampling of an analog signal is performed by means of high-speed switching circuits, and the sampling process takes the form of natural sampling or flat-top sampling.

2.6.1

Natural Sampling

Natural sampling of a band-limited signal m(t) is shown in Fig. 2.4. The sampled signal mns(t) can be expressed as

$$
m_{ns}(t) = m(t)\,x_p(t) \tag{2.17}
$$

where xp(t) is the periodic train of rectangular pulses with fundamental period Ts, and each rectangular pulse in xp(t) has duration d and unit amplitude [Fig. 2.4(b)]. Observe that the sampled signal mns(t) consists of a sequence of pulses of varying amplitude whose tops follow the waveform of the signal m(t) [Fig. 2.4(c)].

FIGURE 2.4: Natural sampling. [Panels: (a) m(t); (b) the pulse train xp(t) with unit amplitude, pulse duration d, and period Ts; (c) mns(t).]

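The Fourier transform of xp(t) is

$$
X_p(\omega) = 2\pi \sum_{n=-\infty}^{\infty} c_n\,\delta(\omega - n\omega_s), \qquad \omega_s = 2\pi/T_s \tag{2.18}
$$

where

$$
c_n = \frac{d}{T_s}\,\frac{\sin(n\omega_s d/2)}{n\omega_s d/2}\,e^{-jn\omega_s d/2} \tag{2.19}
$$

Then the Fourier transform of mns(t) is given by

$$
M_{ns}(\omega) = \frac{1}{2\pi}\,M(\omega) * X_p(\omega) = \sum_{n=-\infty}^{\infty} c_n\,M(\omega - n\omega_s) \tag{2.20}
$$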

from which we see that the effect of the natural sampling is to multiply the nth shifted spectrum M(ω − nωs ) by a constant cn . Thus, the original signal m(t) can be reconstructed from mns (t) with no distortion by passing mns (t) through an ideal low-pass filter if the sampling rate fs is equal to or greater than the Nyquist rate 2fM .
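As a rough numerical illustration of Eq. (2.19), the sketch below evaluates the coefficients cn for a hypothetical pulse train with d = 0.2 and Ts = 1 (arbitrary units). The function name and the parameter values are assumptions for illustration only.

```python
import cmath
import math

def natural_sampling_coeff(n, d, Ts):
    """Coefficient c_n of Eq. (2.19) for pulse width d and period Ts."""
    x = n * (2.0 * math.pi / Ts) * d / 2.0  # n * w_s * d / 2
    sinc = 1.0 if n == 0 else math.sin(x) / x
    return (d / Ts) * sinc * cmath.exp(-1j * x)

d, Ts = 0.2, 1.0  # hypothetical pulse width and sampling interval
for n in range(-2, 3):
    print(n, abs(natural_sampling_coeff(n, d, Ts)))
```

The n = 0 coefficient equals the duty cycle d/Ts, so the baseband image recovered by the low-pass filter is a copy of M(ω) scaled by d/Ts, with no frequency-dependent distortion.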

2.6.2

Flat-Top Sampling

The sampled waveform produced by practical sampling devices of the sample-and-hold type has the form [Fig. 2.5(c)]

$$
m_{fs}(t) = \sum_{n=-\infty}^{\infty} m(nT_s)\,p(t - nT_s) \tag{2.21}
$$

where p(t) is a rectangular pulse of duration d with unit amplitude [Fig. 2.5(a)]. This type of sampling is known as flat-top sampling. Using the ideal sampled signal ms(t) of Eq. (2.1), mfs(t) can be expressed as

$$
m_{fs}(t) = p(t) * \left[ \sum_{n=-\infty}^{\infty} m(nT_s)\,\delta(t - nT_s) \right] = p(t) * m_s(t) \tag{2.22}
$$

Using the convolution property of the Fourier transform and Eq. (2.5), the Fourier transform of mfs(t) is given by

$$
M_{fs}(\omega) = P(\omega)\,M_s(\omega) = \frac{1}{T_s} \sum_{n=-\infty}^{\infty} P(\omega)\,M(\omega - n\omega_s) \tag{2.23}
$$

where

$$
P(\omega) = d\,\frac{\sin(\omega d/2)}{\omega d/2}\,e^{-j\omega d/2} \tag{2.24}
$$

From Eq. (2.23) we see that by using flat-top sampling we have introduced amplitude distortion and time delay, and the primary effect is an attenuation of high-frequency components. This effect is known as the aperture effect. The aperture effect can be compensated by an equalizing filter with a frequency response Heq(ω) = 1/P(ω). If the pulse duration d is chosen such that d ≪ Ts, however, then P(ω) is essentially constant over the baseband and no equalization may be needed.

FIGURE 2.5: Flat-top sampling. [Panels: (a) the pulse p(t) with unit amplitude and duration d; (b) m(t); (c) mfs(t).]
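The size of the aperture effect can be estimated directly from the magnitude of P(ω) in Eq. (2.24). The following sketch is illustrative only: the function names are hypothetical, and the numbers reuse the telephone speech example (fM = 3.3 kHz, fs = 8 kHz) with two assumed pulse widths, d = Ts and d = Ts/10.

```python
import math

def aperture_gain(f, d):
    """|P(w)| / d from Eq. (2.24) at frequency f (Hz): the gain relative to DC."""
    x = math.pi * f * d  # w*d/2 with w = 2*pi*f
    return 1.0 if x == 0.0 else abs(math.sin(x) / x)

def aperture_loss_db(f, d):
    """Attenuation in dB at frequency f; valid below the first null f = 1/d."""
    return -20.0 * math.log10(aperture_gain(f, d))

Ts = 1.0 / 8000.0  # sampling interval for fs = 8 kHz
f_M = 3300.0       # band edge of telephone speech

print(aperture_loss_db(f_M, Ts))         # full-width hold, d = Ts
print(aperture_loss_db(f_M, Ts / 10.0))  # narrow pulse, d = Ts/10
```

With d = Ts the band edge is attenuated by roughly 2.6 dB, which an equalizer with response 1/P(ω) would have to restore; with d = Ts/10 the loss is well under 0.1 dB, consistent with the remark that no equalization may be needed when d ≪ Ts.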

2.7

Sampling Theorem in the Frequency Domain

The sampling theorem expressed in Eq. (2.3) is the time-domain sampling theorem. There is a dual to this time-domain sampling theorem, i.e., the sampling theorem in the frequency domain.

Time-limited signals: A continuous-time signal m(t) is called time limited if

$$
m(t) = 0 \quad \text{for } |t| > |T_0| \tag{2.25}
$$

Frequency-domain sampling theorem: The frequency-domain sampling theorem states that the Fourier transform M(ω) of a time-limited signal m(t) specified by Eq. (2.25) can be uniquely determined from its values M(nωs) sampled at a uniform rate ωs if ωs ≤ π/T0. In fact, when ωs = π/T0, then M(ω) is given by

$$
M(\omega) = \sum_{n=-\infty}^{\infty} M(n\omega_s)\,\frac{\sin T_0(\omega - n\omega_s)}{T_0(\omega - n\omega_s)} \tag{2.26}
$$

2.8

Summary and Discussion

The sampling theorem is the fundamental principle of digital communications. We state the sampling theorem in two parts.

THEOREM 2.1 If the signal contains no frequency higher than fM Hz, it is completely described by specifying its samples taken at instants of time spaced 1/(2fM) s apart.

THEOREM 2.2 The signal can be completely recovered from its samples taken at the rate of 2fM samples per second or higher.

The preceding sampling theorem assumes that the signal is strictly band limited. It is known that if a signal is band limited it cannot be time limited and vice versa. In many practical applications, the signal to be sampled is time limited and, consequently, it cannot be strictly band limited. Nevertheless, we know that the frequency components of physically occurring signals attenuate rapidly beyond some defined bandwidth, and for practical purposes we consider these signals to be band limited. This approximation of real signals by band-limited ones introduces no significant error in the application of the sampling theorem. When such a signal is sampled, we band limit the signal by filtering before sampling and sample at a rate slightly higher than the nominal Nyquist rate.

Defining Terms

Band-limited signal: A signal whose frequency content (Fourier transform) is equal to zero above some specified frequency.

Bandpass signal: A signal whose frequency content (Fourier transform) is nonzero only in a band of frequencies not including the origin.

Flat-top sampling: Sampling with finite width pulses that maintain a constant value for a time period less than or equal to the sampling interval. The constant value is the amplitude of the signal at the desired sampling instant.

Ideal sampled signal: A signal sampled using an ideal impulse train.

Nyquist rate: The minimum allowable sampling rate of 2fM samples per second, to reconstruct a signal band limited to fM hertz.

Nyquist-Shannon interpolation formula: The infinite series representing a time domain waveform in terms of its ideal samples taken at uniform intervals.

Sampling interval: The time between samples in uniform sampling.

Sampling rate: The number of samples taken per second (expressed in hertz and equal to the reciprocal of the sampling interval).

Time-limited: A signal that is zero outside of some specified time interval.

References

[1] Brown, J.L. Jr., First order sampling of bandpass signals—A new approach, IEEE Trans. Information Theory, IT-26(5), 613–615, 1980.
[2] Byrne, C.L. and Fitzgerald, R.M., Time-limited sampling theorem for band-limited signals, IEEE Trans. Information Theory, IT-28(5), 807–809, 1982.
[3] Hsu, H.P., Applied Fourier Analysis, Harcourt Brace Jovanovich, San Diego, CA, 1984.
[4] Hsu, H.P., Analog and Digital Communications, McGraw-Hill, New York, 1993.
[5] Hulthén, R., Restoring causal signals by analytical continuation: A generalized sampling theorem for causal signals, IEEE Trans. Acoustics, Speech, and Signal Processing, ASSP-31(5), 1294–1298, 1983.
[6] Jerri, A.J., The Shannon sampling theorem—Its various extensions and applications: A tutorial review, Proc. IEEE, 65(11), 1565–1596, 1977.

Further Information

For a tutorial review of the sampling theorem, historical notes, and earlier references see Jerri [6].

Couch, II, L.W. “Pulse Code Modulation” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999

Pulse Code Modulation¹

Leon W. Couch, II
University of Florida

3.1 Introduction
3.2 Generation of PCM
3.3 Percent Quantizing Noise
3.4 Practical PCM Circuits
3.5 Bandwidth of PCM
3.6 Effects of Noise
3.7 Nonuniform Quantizing: µ-Law and A-Law Companding
3.8 Example: Design of a PCM System
Defining Terms
References
Further Information

3.1

Introduction

Pulse code modulation (PCM) is analog-to-digital conversion of a special type where the information contained in the instantaneous samples of an analog signal is represented by digital words in a serial bit stream. If we assume that each of the digital words has n binary digits, there are M = 2^n unique code words that are possible, each code word corresponding to a certain amplitude level. Each sample value from the analog signal, however, can be any one of an infinite number of levels, so that the digital word that represents the amplitude closest to the actual sampled value is used. This is called quantizing. That is, instead of using the exact sample value of the analog waveform, the sample is replaced by the closest allowed value, where there are M allowed values, and each allowed value corresponds to one of the code words.

PCM is very popular because of the many advantages it offers. Some of these advantages are as follows.

• Relatively inexpensive digital circuitry may be used extensively in the system.
• PCM signals derived from all types of analog sources (audio, video, etc.) may be time-division multiplexed with data signals (e.g., from digital computers) and transmitted over a common high-speed digital communication system.
• In long-distance digital telephone systems requiring repeaters, a clean PCM waveform can be regenerated at the output of each repeater, where the input consists of a noisy PCM waveform. The noise at the input, however, may cause bit errors in the regenerated PCM output signal.
• The noise performance of a digital system can be superior to that of an analog system. In addition, the probability of error for the system output can be reduced even further by the use of appropriate coding techniques.

These advantages usually outweigh the main disadvantage of PCM: a much wider bandwidth than that of the corresponding analog signal.

¹ Source: Leon W. Couch, II. 1997. Digital and Analog Communication Systems, 5th ed., Prentice Hall, Upper Saddle River, NJ. With permission.

3.2

Generation of PCM

The PCM signal is generated by carrying out three basic operations: sampling, quantizing, and encoding (see Fig. 3.1). The sampling operation generates an instantaneously-sampled flat-top pulse-amplitude modulated (PAM) signal. The quantizing operation is illustrated in Fig. 3.2 for the M = 8 level case. This quantizer is said to be uniform since all of the steps are of equal size. Since we are approximating the analog sample values by using a finite number of levels (M = 8 in this illustration), error is introduced into the recovered output analog signal because of the quantizing effect. The error waveform is illustrated in Fig. 3.2c. The quantizing error consists of the difference between the analog signal at the sampler input and the output of the quantizer. Note that the peak value of the error (±1) is one-half of the quantizer step size (2). If we sample at the Nyquist rate (2B, where B is the absolute bandwidth, in hertz, of the input analog signal) or faster and there is negligible channel noise, there will still be noise, called quantizing noise, on the recovered analog waveform due to this error. The quantizing noise can also be thought of as a round-off error. The quantizer output is a quantized (i.e., only M possible amplitude values) PAM signal.

FIGURE 3.1: A PCM transmitter. Source: Couch, L.W. II 1997. Digital and Analog Communication Systems, 5th ed., Prentice Hall, Upper Saddle River, NJ, p. 138. With permission.

The PCM signal is obtained from the quantized PAM signal by encoding each quantized sample value into a digital word. It is up to the system designer to specify the exact code word that will represent a particular quantized level. If a Gray code of Table 3.1 is used, the resulting PCM signal is shown in Fig. 3.2d where the PCM word for each quantized sample is strobed out of the encoder by the next clock pulse. The Gray code was chosen because it has only 1-b change for each step change in the quantized level. Consequently, single errors in the received PCM code word will cause minimum errors in the recovered analog level, provided that the sign bit is not in error.

FIGURE 3.2: Illustration of waveforms in a PCM system. Source: Couch, L.W. II 1997. Digital and Analog Communication Systems, 5th ed., Prentice Hall, Upper Saddle River, NJ, p. 139. With permission.

Here we have described PCM systems that represent the quantized analog sample values by binary code words. Of course, it is possible to represent the quantized analog samples by digital words using other than base 2. That is, for base q, the number of quantized levels allowed is M = q^n, where n is the number of q base digits in the code word. We will not pursue this topic since binary (q = 2) digital circuits are most commonly used.

TABLE 3.1 3-b Gray Code for M = 8 Levels

Quantized Sample Voltage    Gray Code Word (PCM Output)
+7                          110
+5                          111
+3                          101
+1                          100
                            (mirror image except for sign bit)
−1                          000
−3                          001
−5                          011
−7                          010

Source: Couch, L.W., II. 1997. Digital and Analog Communication Systems, 5th ed., Prentice Hall, Upper Saddle River, NJ, p. 140. With permission.
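The quantizing and encoding steps for the M = 8 case can be sketched in a few lines. The mapping below is copied from Table 3.1; the function names, the tie-breaking rule, and the sample values are assumptions made for illustration and are not taken from the source.

```python
# Quantized level (volts) -> 3-b Gray code word, as listed in Table 3.1.
GRAY_CODE = {+7: "110", +5: "111", +3: "101", +1: "100",
             -1: "000", -3: "001", -5: "011", -7: "010"}

def quantize(sample, levels=tuple(GRAY_CODE)):
    """Return the allowed level closest to the sample.

    Samples beyond +/-8 V saturate at +/-7; an exact tie goes to the
    level listed first (an arbitrary choice for this sketch).
    """
    return min(levels, key=lambda level: abs(sample - level))

def encode(samples):
    """Serial PCM bit string for a sequence of analog sample values."""
    return "".join(GRAY_CODE[quantize(s)] for s in samples)

print(quantize(2.4), quantize(-6.2))  # nearest allowed levels
print(encode([2.4, -6.2]))            # concatenated code words
```

For samples within the ±8 V range the quantizing error never exceeds 1 V, one-half of the step size, in line with the discussion of Fig. 3.2.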

3.3

Percent Quantizing Noise

The quantizer at the PCM encoder produces an error signal at the PCM decoder output as illustrated in Fig. 3.2c. The peak value of this error signal may be expressed as a percentage of the maximum possible analog signal amplitude. Referring to Fig. 3.2c, a peak error of 1 V occurs for a maximum analog signal amplitude of M = 8 V as shown in Fig. 3.1c. Thus, in general,

$$
\frac{2P}{100} = \frac{1}{M} = \frac{1}{2^n}
$$

or

$$
2^n = \frac{50}{P} \tag{3.1}
$$

where P is the peak percentage error for a PCM system that uses n-bit code words. The design value of n needed in order to have less than P percent error is obtained by taking the base 2 logarithm of both sides of Eq. (3.1), where it is realized that log2(x) = [log10(x)]/log10(2) = 3.32 log10(x). That is,

$$
n \ge 3.32 \log_{10}\left(\frac{50}{P}\right) \tag{3.2}
$$

where n is the number of bits needed in the PCM word in order to obtain less than P percent error in the recovered analog signal (i.e., decoded PCM signal).
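Equation (3.2) amounts to rounding log2(50/P) up to the next integer. A minimal sketch, with a hypothetical function name and example percentages chosen only for illustration:

```python
import math

def bits_for_peak_error(P):
    """Smallest integer n satisfying Eq. (3.2), n >= log2(50 / P), for P in percent."""
    if P <= 0:
        raise ValueError("P must be a positive percentage")
    return math.ceil(math.log2(50.0 / P))

for P in (6.25, 1.0, 0.1):
    print(P, bits_for_peak_error(P))
```

The value P = 6.25 corresponds to the M = 8 illustration, since 2P/100 = 1/8, and gives n = 3 as expected.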

3.4

Practical PCM Circuits

Three techniques are used to implement the analog-to-digital converter (ADC) encoding operation. These are the counting or ramp, serial or successive approximation, and parallel or flash encoders.

In the counting encoder, at the same time that the sample is taken, a ramp generator is energized and a binary counter is started. The output of the ramp generator is continuously compared to the sample value; when the value of the ramp becomes equal to the sample value, the binary value of the counter is read. This count is taken to be the PCM word. The binary counter and the ramp generator are then reset to zero and are ready to be reenergized at the next sampling time. This technique requires only a few components, but the speed of this type of ADC is usually limited by the speed of the counter. The Maxim ICL7126 CMOS ADC integrated circuit uses this technique.

The serial encoder compares the value of the sample with trial quantized values. Successive trials depend on whether the past comparator outputs are positive or negative. The trial values are chosen first in large steps and then in small steps so that the process will converge rapidly. The trial voltages are generated by a series of voltage dividers that are configured by (on-off) switches. These switches are controlled by digital logic. After the process converges, the value of the switch settings is read out as the PCM word. This technique requires more precision components (for the voltage dividers) than the ramp technique. The speed of the feedback ADC technique is determined by the speed of the switches. The National Semiconductor ADC0804 8-b ADC uses this technique.

The parallel encoder uses a set of parallel comparators with reference levels that are the permitted quantized values. The sample value is fed into all of the parallel comparators simultaneously. The high or low level of the comparator outputs determines the binary PCM word with the aid of some digital logic. This is a fast ADC technique but requires more hardware than the other two methods. The Harris CA3318 8-b ADC integrated circuit is an example of the technique.

All of the integrated circuits listed as examples have parallel digital outputs that correspond to the digital word that represents the analog sample value. For generation of PCM, the parallel output (digital word) needs to be converted to serial form for transmission over a two-wire channel. This is accomplished by using a parallel-to-serial converter integrated circuit, which is also known as a serial-input-output (SIO) chip. The SIO chip includes a shift register that is set to contain the parallel data (usually, from 8 or 16 input lines). Then the data are shifted out of the last stage of the shift register bit by bit onto a single output line to produce the serial format. Furthermore, the SIO chips are usually full duplex; that is, they have two sets of shift registers, one that functions for data flowing in each direction. One shift register converts parallel input data to serial output data for transmission over the channel, and, simultaneously, the other shift register converts received serial data from another input to parallel data that are available at another output.

Three types of SIO chips are available: the universal asynchronous receiver/transmitter (UART), the universal synchronous receiver/transmitter (USRT), and the universal synchronous/asynchronous receiver/transmitter (USART). The UART transmits and receives asynchronous serial data, the USRT transmits and receives synchronous serial data, and the USART combines both a UART and a USRT on one chip.

At the receiving end the PCM signal is decoded back into an analog signal by using a digital-to-analog converter (DAC) chip. If the DAC chip has a parallel data input, the received serial PCM data are first converted to a parallel form using a SIO chip as described in the preceding paragraph. The parallel data are then converted to an approximation of the analog sample value by the DAC chip. This conversion is usually accomplished by using the parallel digital word to set the configuration of electronic switches on a resistive current (or voltage) divider network so that the analog output is produced. This is called a multiplying DAC since the analog output voltage is directly proportional to the divider reference voltage multiplied by the value of the digital word. The Motorola MC1408 and the National Semiconductor DAC0808 8-b DAC chips are examples of this technique. The DAC chip outputs samples of the quantized analog signal that approximates the analog sample values. This may be smoothed by a low-pass reconstruction filter to produce the analog output.

The Communications Handbook [6, pp. 107–117] and The Electrical Engineering Handbook [5, pp. 771–782] give more details on ADC, DAC, and PCM circuits.
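The trial-and-compare procedure of the serial (successive approximation) encoder can be imitated in software. The sketch below is an idealized model under stated assumptions: a unipolar input range 0 to v_ref, a natural binary output code, and ideal trial voltages. The function and parameter names are hypothetical, and the code does not describe the behavior of any particular ADC chip named in the text.

```python
def successive_approximation(sample, v_ref, n_bits):
    """Idealized successive-approximation encoding of 0 <= sample < v_ref.

    Trial values are tested from the largest step (most significant bit)
    down to the smallest; a bit is kept when the trial voltage does not
    exceed the sample. Returns the n-bit code as an integer.
    """
    code = 0
    for bit in range(n_bits - 1, -1, -1):
        trial = code | (1 << bit)                        # tentatively set this bit
        trial_voltage = v_ref * trial / (1 << n_bits)    # ideal divider output
        if sample >= trial_voltage:
            code = trial                                 # comparator says keep the bit
    return code

# Hypothetical 3-bit converter with an 8 V reference.
print(format(successive_approximation(5.3, 8.0, 3), "03b"))
```

The loop makes exactly n comparisons per sample, which is why this approach converges faster than the counting encoder, whose ramp may need up to 2^n counter steps.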

3.5

Bandwidth of PCM

A good question to ask is: What is the spectrum of a PCM signal? For the case of PAM signalling, the spectrum of the PAM signal could be obtained as a function of the spectrum of the input analog signal because the PAM signal is a linear function of the analog signal. This is not the case for PCM. As shown in Figs. 3.1 and 3.2, the PCM signal is a nonlinear function of the input signal. Consequently, the spectrum of the PCM signal is not directly related to the spectrum of the input analog signal. It can be shown that the spectrum of the PCM signal depends on the bit rate, the correlation of the PCM data, and on the PCM waveform pulse shape (usually rectangular) used to describe the bits [2, 3]. From Fig. 3.2, the bit rate is (3.3) R = nfs where n is the number of bits in the PCM word (M = 2n ) and fs is the sampling rate. For no aliasing we require fs ≥ 2B where B is the bandwidth of the analog signal (that is to be converted to the PCM signal). The dimensionality theorem [2, 3] shows that the bandwidth of the PCM waveform is bounded by 1 1 (3.4) BPCM ≥ R = nfs 2 2 where equality is obtained if a (sin x)/x type of pulse shape is used to generate the PCM waveform. The exact spectrum for the PCM waveform will depend on the pulse shape that is used as well as on the type of line encoding. For example, if one uses a rectangular pulse shape with polar nonreturn to zero (NRZ) line coding, the first null bandwidth is simply BPCM = R = nfs Hz

(3.5)

Table 3.2 presents a tabulation of this result for the case of the minimum sampling rate, $f_s = 2B$. Note that Eq. (3.4) demonstrates that the bandwidth of the PCM signal has a lower bound given by

$$B_{\mathrm{PCM}} \ge nB \qquad (3.6)$$

where $f_s \ge 2B$ and $B$ is the bandwidth of the corresponding analog signal. Thus, for reasonable values of $n$, the bandwidth of the PCM signal will be significantly larger than the bandwidth of the corresponding analog signal that it represents. For the example shown in Fig. 3.2 where $n = 3$, the PCM signal bandwidth will be at least three times wider than that of the corresponding analog signal. Furthermore, if the bandwidth of the PCM signal is reduced by improper filtering or by passing the PCM signal through a system that has a poor frequency response, the filtered pulses will be elongated (stretched in width) so that pulses corresponding to any one bit will smear into adjacent bit slots. If this condition becomes too serious, it will cause errors in the detected bits. This pulse smearing effect is called intersymbol interference (ISI).
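The relations in Eqs. (3.3) to (3.5) can be checked with a few lines of Python. This is only a minimal sketch: the function name `pcm_bandwidths` and the 4-kHz analog bandwidth are illustrative assumptions, not part of the original text.

```python
def pcm_bandwidths(n_bits, fs_hz):
    """Return (bit rate R, minimum bandwidth, first-null bandwidth) in Hz.

    Hypothetical helper for illustration only.
    """
    bit_rate = n_bits * fs_hz   # Eq. (3.3): R = n * fs
    b_min = bit_rate / 2.0      # Eq. (3.4) with equality, (sin x)/x pulses
    b_null = float(bit_rate)    # Eq. (3.5): rectangular pulses, polar NRZ
    return bit_rate, b_min, b_null


# Assumed example: analog bandwidth B = 4 kHz sampled at the minimum rate fs = 2B.
B = 4000.0
for n in (3, 8):
    R, b_min, b_null = pcm_bandwidths(n, 2 * B)
    print(f"n={n}: R={R:.0f} b/s, B_min={b_min:.0f} Hz ({b_min / B:.0f}B), "
          f"first null={b_null:.0f} Hz ({b_null / B:.0f}B)")
```

With $f_s = 2B$ the minimum bandwidth comes out as $nB$, matching Eq. (3.6), and the first-null bandwidth as $2nB$, matching the third column of Table 3.2.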

3.6 Effects of Noise

TABLE 3.2 Performance of a PCM System with Uniform Quantizing and No Channel Noise

Columns: M = number of quantizer levels used; n = length of the PCM word (bits); B_PCM = bandwidth of PCM signal (first null bandwidth)^a; (S/N)out = recovered analog signal power-to-quantizing noise power ratio (dB).

M         n     B_PCM    (S/N)out (dB)
2         1     2B       6.0
4         2     4B       12.0
8         3     6B       18.1
16        4     8B       24.1
32        5     10B      30.1
64        6     12B      36.1
128       7     14B      42.1
256       8     16B      48.2
512       9     18B      54.2
1,024     10    20B      60.2
2,048     11    22B      66.2
4,096     12    24B      72.2
8,192     13    26B      78.3
16,384    14    28B      84.3
32,768    15    30B      90.3
65,536    16    32B      96.3

^a B is the absolute bandwidth of the input analog signal.
Source: Couch, L.W. II 1997. Digital and Analog Communication Systems, 5th ed., Prentice Hall, Upper Saddle River, NJ, p. 142. With permission.

The analog signal that is recovered at the PCM system output is corrupted by noise. Two main effects produce this noise or distortion: 1) quantizing noise that is caused by the M-step quantizer at the PCM transmitter and 2) bit errors in the recovered PCM signal. The bit errors are caused by channel noise as well as improper channel filtering, which causes ISI. In addition, if the input analog signal is not strictly band limited, there will be some aliasing noise on the recovered analog signal [12]. Under certain assumptions, it can be shown that the ratio of the recovered analog average signal power to the average noise power [2] is

$$\left(\frac{S}{N}\right)_{\mathrm{out}} = \frac{M^2}{1 + 4\left(M^2 - 1\right)P_e} \qquad (3.7)$$

where $M$ is the number of uniformly spaced quantizer levels used in the PCM transmitter and $P_e$ is the probability of bit error in the recovered binary PCM signal at the receiver DAC before it is converted back into an analog signal. Most practical systems are designed so that $P_e$ is negligible. Consequently, if we assume that there are no bit errors due to channel noise (i.e., $P_e = 0$), the S/N due only to quantizing errors is

$$\left(\frac{S}{N}\right)_{\mathrm{out}} = M^2 \qquad (3.8)$$

Numerical values for these S/N ratios are given in Table 3.2. To realize these S/N ratios, one critical assumption is that the peak-to-peak level of the analog waveform at the input to the PCM encoder is set to the design level of the quantizer. For example, referring to Fig. 3.2, this corresponds to the input traversing the range $-V$ to $+V$ volts where $V = 8$ V is the design level of the quantizer. Equation (3.7) was derived for waveforms with equally likely values, such as a triangle waveshape, that have a peak-to-peak value of $2V$ and an rms value of $V/\sqrt{3}$, where $V$ is the design peak level of the quantizer.

From a practical viewpoint, the quantizing noise at the output of the PCM decoder can be categorized into four types depending on the operating conditions. The four types are overload noise, random noise, granular noise, and hunting noise. As discussed earlier, the level of the analog waveform at the input of the PCM encoder needs to be set so that its peak level does not exceed the design peak of $V$ volts. If the peak input does exceed $V$, the recovered analog waveform at the output of the PCM system will have flat tops near the peak values. This produces overload noise. The flat tops are easily seen on an oscilloscope, and the recovered analog waveform sounds distorted since the flat topping produces unwanted harmonic components.
For example, this type of distortion can be heard on PCM telephone systems when there are high levels such as dial tones, busy signals, or off-hook warning signals.

The second type of noise, random noise, is produced by the random quantization errors in the PCM system under normal operating conditions when the input level is properly set. This type of condition is assumed in Eq. (3.8). Random noise has a white hissing sound. If the input level is not sufficiently large, the S/N will deteriorate from that given by Eq. (3.8); the quantizing noise will still remain more or less random.

If the input level is reduced further to a relatively small value with respect to the design level, the error values are not equally likely from sample to sample, and the noise has a harsh sound resembling gravel being poured into a barrel. This is called granular noise. This type of noise can be randomized (noise power decreased) by increasing the number of quantization levels and, consequently, increasing the PCM bit rate. Alternatively, granular noise can be reduced by using a nonuniform quantizer, such as the µ-law or A-law quantizers that are described in Section 3.7.

The fourth type of quantizing noise that may occur at the output of a PCM system is hunting noise. It can occur when the input analog waveform is nearly constant, including when there is no signal (i.e., zero level). For these conditions the sample values at the quantizer output (see Fig. 3.2) can oscillate between two adjacent quantization levels, causing an undesired sinusoidal type tone of frequency $f_s/2$ at the output of the PCM system. Hunting noise can be reduced by filtering out the tone or by designing the quantizer so that there is no vertical step at the constant value of the inputs, such as at 0-V input for the no signal case. For the no signal case, the hunting noise is also called idle channel noise.
Idle channel noise can be reduced by using a horizontal step at the origin of the quantizer output–input characteristic instead of a vertical step as shown in Fig. 3.2.

Recalling that $M = 2^n$, we may express Eq. (3.8) in decibels by taking $10\log_{10}(\cdot)$ of both sides of the equation,

$$\left(\frac{S}{N}\right)_{\mathrm{dB}} = 6.02n + \alpha \qquad (3.9)$$

where $n$ is the number of bits in the PCM word and $\alpha = 0$. This equation, called the 6-dB rule, points out the significant performance characteristic for PCM: an additional 6-dB improvement in S/N is obtained for each bit added to the PCM word. This is illustrated in Table 3.2. Equation (3.9) is valid for a wide variety of assumptions (such as various types of input waveshapes and quantization characteristics), although the value of $\alpha$ will depend on these assumptions [7]. Of course, it is assumed that there are no bit errors and that the input signal level is large enough to range over a significant number of quantizing levels.

One may use Table 3.2 to examine the design requirements in a proposed PCM system. For example, high fidelity enthusiasts are turning to digital audio recording techniques. Here PCM signals are recorded instead of the analog audio signal to produce superb sound reproduction. For a dynamic range of 90 dB, it is seen that at least 15-b PCM words would be required. Furthermore, if the analog signal had a bandwidth of 20 kHz, the first null bandwidth for rectangular bit-shape PCM would be 2 × 20 kHz × 15 = 600 kHz. Consequently, video-type tape recorders are needed to record and reproduce high-quality digital audio signals. Although this type of recording technique might seem ridiculous at first, it is realized that expensive high-quality analog recording devices are hard pressed to reproduce a dynamic range of 70 dB. Thus, digital audio is one way to achieve improved performance. This is being proven in the marketplace with the popularity of the digital compact disk (CD). The CD uses a 16-b PCM word and a sampling rate of 44.1 kHz on each stereo channel [9, 10]. Reed–Solomon coding with interleaving is used to correct burst errors that occur as a result of scratches and fingerprints on the compact disk.
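As a rough numerical check of Eqs. (3.7) to (3.9) and of the 15-b figure quoted for a 90-dB dynamic range, the following Python sketch evaluates the output S/N directly. The function names and the bit-error probability of 1e-4 are assumptions made for illustration.

```python
import math


def snr_out(M, pe=0.0):
    """Eq. (3.7): output S/N (power ratio) for M levels and bit-error probability pe."""
    return M ** 2 / (1.0 + 4.0 * (M ** 2 - 1) * pe)


def snr_out_db(n_bits, pe=0.0):
    """Output S/N in dB for an n-bit PCM word (M = 2**n)."""
    return 10.0 * math.log10(snr_out(2 ** n_bits, pe))


def min_bits(target_db):
    """Smallest word length whose quantizing-noise-only S/N meets target_db."""
    n = 1
    while snr_out_db(n) < target_db:
        n += 1
    return n


print(f"8-b PCM, Pe = 0:    {snr_out_db(8):.1f} dB")
print(f"8-b PCM, Pe = 1e-4: {snr_out_db(8, 1e-4):.1f} dB")  # assumed Pe
print(f"bits needed for 90 dB: {min_bits(90.0)}")
```

With $P_e = 0$ the result reduces to $20n\log_{10}2 \approx 6.02n$ dB, which is Eq. (3.9) with $\alpha = 0$; a nonzero $P_e$ can only lower the S/N.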

3.7 Nonuniform Quantizing: µ-Law and A-Law Companding

Voice analog signals are more likely to have amplitude values near zero than at the extreme peak values allowed. For example, when digitizing voice signals, if the peak value allowed is 1 V, weak passages may have voltage levels on the order of 0.1 V (20 dB down). For signals such as these with nonuniform amplitude distribution, the granular quantizing noise will be a serious problem if the step size is not reduced for amplitude values near zero and increased for extremely large values. This is called nonuniform quantizing since a variable step size is used. An example of a nonuniform quantizing characteristic is shown in Fig. 3.3. The effect of nonuniform quantizing can be obtained by first passing the analog signal through a compression (nonlinear) amplifier and then into the PCM circuit that uses a uniform quantizer. In the U.S., a µ-law type of compression characteristic is used. It is defined [11] by

$$|w_2(t)| = \frac{\ln\left(1 + \mu\,|w_1(t)|\right)}{\ln(1 + \mu)} \qquad (3.10)$$

where the allowed peak values of $w_1(t)$ are ±1 (i.e., $|w_1(t)| \le 1$) and µ is a positive constant that is a parameter. This compression characteristic is shown in Fig. 3.3(b) for several values of µ, and it is noted that µ → 0 corresponds to linear amplification (uniform quantization overall). In the United States, Canada, and Japan, the telephone companies use a µ = 255 compression characteristic in their PCM systems [4]. Another compression law, used mainly in Europe, is the A-law characteristic. It is defined [1] by

$$|w_2(t)| = \begin{cases} \dfrac{A\,|w_1(t)|}{1 + \ln A}, & 0 \le |w_1(t)| \le \dfrac{1}{A} \\[2ex] \dfrac{1 + \ln\left(A\,|w_1(t)|\right)}{1 + \ln A}, & \dfrac{1}{A} \le |w_1(t)| \le 1 \end{cases} \qquad (3.11)$$

where $|w_1(t)| \le 1$ and $A$ is a positive constant. The A-law compression characteristic is shown in Fig. 3.3(c). The typical value for $A$ is 87.6.

FIGURE 3.3: Compression characteristics (first quadrant shown). Source: Couch, L.W. II 1997. Digital and Analog Communication Systems, 5th ed., Prentice Hall, Upper Saddle River, NJ, p. 147. With permission.

When compression is used at the transmitter, expansion (i.e., decompression) must be used at the receiver output to restore signal levels to their correct relative values. The expandor characteristic is the inverse of the compression characteristic, and the combination of a compressor and an expandor is called a compandor. Once again, it can be shown that the output S/N follows the 6-dB law [2]

$$\left(\frac{S}{N}\right)_{\mathrm{dB}} = 6.02n + \alpha \qquad (3.12)$$

where for uniform quantizing

$$\alpha = 4.77 - 20\log\left(V/x_{\mathrm{rms}}\right) \qquad (3.13)$$

and for sufficiently large input levels² for µ-law companding

$$\alpha \approx 4.77 - 20\log\left[\ln(1 + \mu)\right] \qquad (3.14)$$

and for A-law companding [7]

$$\alpha \approx 4.77 - 20\log\left[1 + \ln A\right] \qquad (3.15)$$

Here $n$ is the number of bits used in the PCM word, $V$ is the peak design level of the quantizer, and $x_{\mathrm{rms}}$ is the rms value of the input analog signal. Notice that the output S/N is a function of the input level for the uniform quantizing (no companding) case but is relatively insensitive to input level for µ-law and A-law companding, as shown in Fig. 3.4. The ratio $V/x_{\mathrm{rms}}$ is called the loading factor. The input level is often set for a loading factor of 4 (12 dB) to ensure that the overload quantizing noise will be negligible. In practice this gives $\alpha = -7.3$ for the case of uniform encoding as compared to $\alpha = 0$, which was obtained for the ideal conditions associated with Eq. (3.8).

² See Lathi, 1998 [8] for a more complicated expression that is valid for any input level.

FIGURE 3.4: Output S/N of 8-b PCM systems with and without companding. Source: Couch, L.W. II 1997. Digital and Analog Communication Systems, 5th ed., Prentice Hall, Upper Saddle River, NJ, p. 149. With permission.
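The compression laws of Eqs. (3.10) and (3.11) can be sketched in Python as follows. The equations define only the magnitude $|w_2(t)|$; carrying the sign of the input through to the output is an assumption made here (it is the usual convention), and the function names are illustrative.

```python
import math


def mu_compress(w1, mu=255.0):
    """Eq. (3.10) applied to |w1| <= 1, with the sign of w1 restored (assumption)."""
    return math.copysign(math.log1p(mu * abs(w1)) / math.log1p(mu), w1)


def mu_expand(w2, mu=255.0):
    """Inverse of mu_compress: the expandor characteristic."""
    return math.copysign(math.expm1(abs(w2) * math.log1p(mu)) / mu, w2)


def a_compress(w1, A=87.6):
    """Eq. (3.11) applied to |w1| <= 1, with the sign of w1 restored (assumption)."""
    x = abs(w1)
    if x <= 1.0 / A:
        y = A * x / (1.0 + math.log(A))
    else:
        y = (1.0 + math.log(A * x)) / (1.0 + math.log(A))
    return math.copysign(y, w1)


# A weak passage 20 dB below the peak is raised well above 0.1 before quantizing.
for w in (0.01, 0.1, 1.0):
    print(f"w1={w}: mu-law -> {mu_compress(w):.3f}, A-law -> {a_compress(w):.3f}, "
          f"round trip -> {mu_expand(mu_compress(w)):.3f}")
```

Small inputs are mapped to much larger outputs, so a uniform quantizer that follows the compressor sees more quantizing levels for weak signals, which is the effect of the nonuniform step size described above.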

3.8 Example: Design of a PCM System

Assume that an analog voice-frequency signal, which occupies a band from 300 to 3400 Hz, is to be transmitted over a binary PCM system. The minimum sampling frequency would be 2 × 3.4 = 6.8 kHz. In practice the signal is oversampled, and in the U.S. a sampling frequency of 8 kHz is the standard used for voice-frequency signals in telephone communication systems. Assume that each sample value is represented by 8 b; then the bit rate of the PCM signal is

$$R = (f_s\ \text{samples/s})(n\ \text{b/sample}) = (8\text{k samples/s})(8\ \text{b/sample}) = 64\ \text{kb/s} \qquad (3.16)$$

Referring to the dimensionality theorem [Eq. (3.4)], we realize that the theoretically minimum absolute bandwidth of the PCM signal is

$$B_{\min} = \tfrac{1}{2} R = 32\ \text{kHz} \qquad (3.17)$$

and this is realized if the PCM waveform consists of (sin x)/x pulse shapes. If rectangular pulse shaping is used, the absolute bandwidth is infinity, and the first null bandwidth [Eq. (3.5)] is

$$B_{\mathrm{null}} = R = \frac{1}{T_b} = 64\ \text{kHz} \qquad (3.18)$$

That is, we require a bandwidth of 64 kHz to transmit this digital voice PCM signal where the bandwidth of the original analog voice signal was, at most, 4 kHz. Using $n = 8$ in Eq. (3.1), the error on the recovered analog signal is ±0.2%. Using Eqs. (3.12) and (3.13) for the case of uniform quantizing with a loading factor, $V/x_{\mathrm{rms}}$, of 10 (20 dB), we get for uniform quantizing

$$\left(\frac{S}{N}\right)_{\mathrm{dB}} = 32.9\ \text{dB} \qquad (3.19)$$

Using Eqs. (3.12) and (3.14) for the case of µ = 255 companding, we get

$$\left(\frac{S}{N}\right)_{\mathrm{dB}} = 38.05\ \text{dB} \qquad (3.20)$$

These results are illustrated in Fig. 3.4.
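The numbers in this design example can be reproduced with a short Python sketch built on Eqs. (3.12) to (3.16). The helper names are hypothetical and chosen only for this illustration.

```python
import math


def alpha_uniform(loading_factor):
    """Eq. (3.13): alpha for uniform quantizing, loading factor V/x_rms."""
    return 4.77 - 20.0 * math.log10(loading_factor)


def alpha_mu_law(mu):
    """Eq. (3.14): alpha for mu-law companding (sufficiently large input levels)."""
    return 4.77 - 20.0 * math.log10(math.log(1.0 + mu))


def alpha_a_law(A):
    """Eq. (3.15): alpha for A-law companding."""
    return 4.77 - 20.0 * math.log10(1.0 + math.log(A))


def snr_db(n_bits, alpha):
    """Eq. (3.12): the 6-dB law."""
    return 6.02 * n_bits + alpha


n, fs = 8, 8000                    # 8-b words, 8-kHz sampling
R = n * fs                         # Eq. (3.16): 64 kb/s
print("bit rate R =", R, "b/s; B_min =", R / 2, "Hz; first null =", R, "Hz")
print(f"uniform, loading factor 10: {snr_db(n, alpha_uniform(10.0)):.1f} dB")
print(f"mu = 255 companding:        {snr_db(n, alpha_mu_law(255.0)):.2f} dB")
print(f"alpha at loading factor 4:  {alpha_uniform(4.0):.1f}")
```

The A-law helper is included for completeness; the text quotes no A-law result for this example, so no value is asserted for it here.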

Defining Terms

Intersymbol interference: Filtering of a digital waveform so that a pulse corresponding to 1 b will smear (stretch in width) into adjacent bit slots.

Pulse amplitude modulation: An analog signal is represented by a train of pulses where the pulse amplitudes are proportional to the analog signal amplitude.

Pulse code modulation: A serial bit stream that consists of binary words which represent quantized sample values of an analog signal.

Quantizing: Replacing a sample value with the closest allowed value.

References

[1] Cattermole, K.W., Principles of Pulse-code Modulation, American Elsevier, New York, NY, 1969.
[2] Couch, L.W., Digital and Analog Communication Systems, 5th ed., Prentice Hall, Upper Saddle River, NJ, 1997.
[3] Couch, L.W., Modern Communication Systems: Principles and Applications, Macmillan Publishing, New York, NY, 1995.
[4] Dammann, C.L., McDaniel, L.D., and Maddox, C.L., D2 Channel Bank: Multiplexing and Coding. B. S. T. J., 12(10), 1675–1700, 1972.
[5] Dorf, R.C., The Electrical Engineering Handbook, CRC Press, Inc., Boca Raton, FL, 1993.
[6] Gibson, J.D., The Communications Handbook, CRC Press, Inc., Boca Raton, FL, 1997.
[7] Jayant, N.S. and Noll, P., Digital Coding of Waveforms, Prentice Hall, Englewood Cliffs, NJ, 1984.
[8] Lathi, B.P., Modern Digital and Analog Communication Systems, 3rd ed., Oxford University Press, New York, NY, 1998.
[9] Miyaoka, S., Digital Audio is Compact and Rugged. IEEE Spectrum, 21(3), 35–39, 1984.
[10] Peek, J.B.H., Communication Aspects of the Compact Disk Digital Audio System. IEEE Comm. Mag., 23(2), 7–15, 1985.
[11] Smith, B., Instantaneous Companding of Quantized Signals. B. S. T. J., 36(5), 653–709, 1957.
[12] Spilker, J.J., Digital Communications by Satellite, Prentice Hall, Englewood Cliffs, NJ, 1977.

Further Information

Many practical design situations and applications of PCM transmission via twisted-pair T-1 telephone lines, fiber optic cable, microwave relay, and satellite systems are given in [2] and [3].


Honig, M.L. & Barton, M.“Baseband Signalling and Pulse Shaping” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Baseband Signalling and Pulse Shaping

Michael L. Honig, Northwestern University
Melbourne Barton, Bellcore

4.1 Communications System Model
4.2 Intersymbol Interference and the Nyquist Criterion
    Raised Cosine Pulse
4.3 Nyquist Criterion with Matched Filtering
4.4 Eye Diagrams
    Vertical Eye Opening • Horizontal Eye Opening • Slope of the Inner Eye
4.5 Partial-Response Signalling
    Precoding
4.6 Additional Considerations
    Average Transmitted Power and Spectral Constraints • Peak-to-Average Power • Channel and Receiver Characteristics • Complexity • Tolerance to Interference • Probability of Intercept and Detection
4.7 Examples
    Global System for Mobile Communications (GSM) • U.S. Digital Cellular (IS-136) • Interim Standard-95 • Personal Access Communications System (PACS)
Defining Terms
References
Further Information

Many physical communications channels, such as radio channels, accept a continuous-time waveform as input. Consequently, a sequence of source bits, representing data or a digitized analog signal, must be converted to a continuous-time waveform at the transmitter. In general, each successive group of bits taken from this sequence is mapped to a particular continuous-time pulse. In this chapter we discuss the basic principles involved in selecting such a pulse for channels that can be characterized as linear and time invariant with finite bandwidth.

4.1 Communications System Model

Figure 4.1a shows a simple block diagram of a communications system. The sequence of source bits $\{b_i\}$ is grouped into sequential blocks (vectors) of $m$ bits $\{\mathbf{b}_i\}$, and each binary vector $\mathbf{b}_i$ is mapped to one of $2^m$ pulses, $p(\mathbf{b}_i; t)$, which is transmitted over the channel. The transmitted signal as a function of time can be written as

$$s(t) = \sum_i p(\mathbf{b}_i;\, t - iT) \qquad (4.1)$$

where $1/T$ is the rate at which each group of $m$ bits, or pulses, is introduced to the channel. The information (bit) rate is therefore $m/T$.

Figure 4.1a Communication system model. The source bits are grouped into binary vectors, which are mapped to a sequence of pulse shapes.

Figure 4.1b Channel model consisting of a linear, time-invariant system (transfer function) followed by additive noise.

The channel in Fig. 4.1a can be a radio link, which may distort the input signal $s(t)$ in a variety of ways. For example, it may introduce pulse dispersion (due to finite bandwidth) and multipath, as well as additive background noise. The output of the channel is denoted as $x(t)$, which is processed by the receiver to determine estimates of the source bits. The receiver can be quite complicated; however, for the purpose of this discussion, it is sufficient to assume only that it contains a front-end filter and a sampler, as shown in Fig. 4.1a. This assumption is valid for a wide variety of detection strategies. The purpose of the receiver filter is to remove noise outside of the transmitted frequency band and to compensate for the channel frequency response.

A commonly used channel model is shown in Fig. 4.1b and consists of a linear, time-invariant filter, denoted as $G(f)$, followed by additive noise $n(t)$. The channel output is, therefore,

$$x(t) = [g(t) * s(t)] + n(t) \qquad (4.2)$$

where $g(t)$ is the channel impulse response associated with $G(f)$, and the asterisk denotes convolution,

$$g(t) * s(t) = \int_{-\infty}^{\infty} g(t - \tau)\, s(\tau)\, d\tau$$

This channel model accounts for all linear, time-invariant channel impairments, such as finite bandwidth and time-invariant multipath. It does not account for time-varying impairments, such as rapid fading due to time-varying multipath. Nevertheless, this model can be considered valid over short time periods during which the multipath parameters remain constant.

In Figs. 4.1a and 4.1b, it is assumed that all signals are baseband signals, which means that the frequency content is centered around $f = 0$ (DC). The channel passband, therefore, partially coincides with the transmitted spectrum. In general, this condition requires that the transmitted signal be modulated by an appropriate carrier frequency and demodulated at the receiver. In that case, the model in Figs. 4.1a and 4.1b still applies; however, baseband-equivalent signals must be derived from their modulated (passband) counterparts. Baseband signalling and pulse shaping refers to the way in which a group of source bits is mapped to a baseband transmitted pulse.

As a simple example of baseband signalling, we can take $m = 1$ (map each source bit to a pulse), assign a 0 bit to a pulse $p(t)$, and a 1 bit to the pulse $-p(t)$. Perhaps the simplest example of a baseband pulse is the rectangular pulse given by $p(t) = 1$, $0 < t \le T$, and $p(t) = 0$ elsewhere. In this case, we can write the transmitted signal as

$$s(t) = \sum_i A_i\, p(t - iT) \qquad (4.3)$$

where each symbol $A_i$ takes on a value of +1 or −1, depending on the value of the $i$th bit, and $1/T$ is the symbol rate, namely, the rate at which the symbols $A_i$ are introduced to the channel. The preceding example is called binary pulse amplitude modulation (PAM), since the data symbols $A_i$ are binary valued, and they amplitude modulate the transmitted pulse $p(t)$. The information rate (bits per second) in this case is the same as the symbol rate $1/T$. As a simple extension of this signalling technique, we can increase $m$ and choose $A_i$ from one of $M = 2^m$ values to transmit at bit rate $m/T$. This is known as M-ary PAM. For example, letting $m = 2$, each pair of bits can be mapped to a pulse in the set $\{p(t), -p(t), 3p(t), -3p(t)\}$.

In general, the transmitted symbols $\{A_i\}$, the baseband pulse $p(t)$, and channel impulse response $g(t)$ can be complex valued. For example, each successive pair of bits might select a symbol from the set $\{1, -1, j, -j\}$, where $j = \sqrt{-1}$. This is a consequence of considering the baseband equivalent of passband modulation (that is, generating a transmitted spectrum which is centered around a carrier frequency $f_c$). Here we are not concerned with the relation between the passband and baseband equivalent models and simply point out that the discussion and results in this chapter apply to complex-valued symbols and pulse shapes.

As an example of a signalling technique which is not PAM, let $m = 1$ and

$$p(0; t) = \begin{cases} \sqrt{2}\,\sin(2\pi f_1 t), & 0 < t < T \\ 0, & \text{elsewhere} \end{cases} \qquad p(1; t) = \begin{cases} \sqrt{2}\,\sin(2\pi f_2 t), & 0 < t < T \\ 0, & \text{elsewhere} \end{cases} \qquad (4.4)$$

where $f_1$ and $f_2 \ne f_1$ are fixed frequencies selected so that $f_1 T$ and $f_2 T$ (number of cycles for each bit) are multiples of 1/2. These pulses are orthogonal, namely,

$$\int_0^T p(1; t)\, p(0; t)\, dt = 0$$
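The orthogonality of the two FSK pulses in Eq. (4.4) can be verified numerically. The sketch below uses a midpoint-rule integral; the frequencies $f_1 = 1/T$ and $f_2 = 1.5/T$, the step count, and the function names are assumptions chosen for illustration.

```python
import math


def fsk_pulse(bit, t, f1, f2, T=1.0):
    """Eq. (4.4): sqrt(2) sin(2 pi f t) on 0 < t < T, zero elsewhere."""
    if not (0.0 < t < T):
        return 0.0
    f = f1 if bit == 0 else f2
    return math.sqrt(2.0) * math.sin(2.0 * math.pi * f * t)


def inner_product(bit_a, bit_b, f1, f2, T=1.0, steps=20000):
    """Midpoint-rule approximation of the integral of p(a;t) p(b;t) over [0, T]."""
    dt = T / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        total += fsk_pulse(bit_a, t, f1, f2, T) * fsk_pulse(bit_b, t, f1, f2, T)
    return total * dt


f1, f2 = 1.0, 1.5   # f1*T and f2*T are multiples of 1/2 (assumed values)
print("cross term :", inner_product(0, 1, f1, f2))
print("energy p(0):", inner_product(0, 0, f1, f2))
print("energy p(1):", inner_product(1, 1, f1, f2))
```

The cross term vanishes to within rounding error, while each pulse has energy $T$, which is what the $\sqrt{2}$ scaling provides when $f_1 T$ and $f_2 T$ are multiples of 1/2.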

This choice of pulse shapes is called binary frequency-shift keying (FSK). Another example of a set of orthogonal pulse shapes for $m = 2$ bits/$T$ is shown in Fig. 4.2. Because these pulses may have as many as three transitions within a symbol period, the transmitted spectrum occupies roughly four times the transmitted spectrum of binary PAM with a rectangular pulse shape. The spectrum is, therefore, spread across a much larger band than the smallest required for reliable transmission, assuming a data rate of $2/T$. This type of signalling is referred to as spread-spectrum. Spread-spectrum signals are more robust with respect to interference from other transmitted signals than are narrowband signals.¹

¹ This example can also be viewed as coded binary PAM. Namely, each pair of source bits is mapped to 4 coded bits, which are transmitted via binary PAM with a rectangular pulse. The current IS-95 air interface uses an extension of this signalling method in which groups of 6 bits are mapped to 64 orthogonal pulse shapes with as many as 63 transitions during a symbol.

FIGURE 4.2: Four orthogonal spread-spectrum pulse shapes.

4.2 Intersymbol Interference and the Nyquist Criterion

Consider the transmission of a PAM signal illustrated in Fig. 4.3. The source bits $\{b_i\}$ are mapped to a sequence of levels $\{A_i\}$, which modulate the transmitter pulse $p(t)$. The channel input is, therefore, given by Eq. (4.3) where $p(t)$ is the impulse response of the transmitter pulse-shaping filter $P(f)$ shown in Fig. 4.3. The input to the transmitter filter $P(f)$ is the modulated sequence of delta functions $\sum_i A_i\, \delta(t - iT)$. The channel is represented by the transfer function $G(f)$ (plus noise), which has impulse response $g(t)$, and the receiver filter has transfer function $R(f)$ with associated impulse response $r(t)$.

FIGURE 4.3: Baseband model of a pulse amplitude modulation system.

Let $h(t)$ be the overall impulse response of the combined transmitter, channel, and receiver, which has transfer function $H(f) = P(f)G(f)R(f)$. We can write $h(t) = p(t) * g(t) * r(t)$. The output of the receiver filter is then

$$y(t) = \sum_i A_i\, h(t - iT) + \tilde{n}(t) \qquad (4.5)$$

where $\tilde{n}(t) = r(t) * n(t)$ is the output of the filter $R(f)$ with input $n(t)$. Assuming that samples are collected at the output of the filter $R(f)$ at the symbol rate $1/T$, we can write the $k$th sample of $y(t)$ as

$$y(kT) = \sum_i A_i\, h(kT - iT) + \tilde{n}(kT) = A_k\, h(0) + \sum_{i \ne k} A_i\, h(kT - iT) + \tilde{n}(kT) \qquad (4.6)$$

The first term on the right-hand side of Eq. (4.6) is the $k$th transmitted symbol scaled by the system impulse response at $t = 0$. If this were the only term on the right side of Eq. (4.6), we could obtain the source bits without error by scaling the received samples by $1/h(0)$. The second term on the right-hand side of Eq. (4.6) is called intersymbol interference, which reflects the view that neighboring symbols interfere with the detection of each desired symbol. One possible criterion for choosing the transmitter and receiver filters is to minimize intersymbol interference. Specifically, if we choose $p(t)$ and $r(t)$ so that

$$h(kT) = \begin{cases} 1, & k = 0 \\ 0, & k \ne 0 \end{cases} \qquad (4.7)$$

then the $k$th received sample is

$$y(kT) = A_k + \tilde{n}(kT) \qquad (4.8)$$

In this case, the intersymbol interference has been eliminated. This choice of $p(t)$ and $r(t)$ is called a zero-forcing solution, since it forces the intersymbol interference to zero. Depending on the type of detection scheme used, a zero-forcing solution may not be desirable. This is because the probability of error also depends on the noise intensity, which generally increases when intersymbol interference is suppressed. It is instructive, however, to examine the properties of the zero-forcing solution.

We now view Eq. (4.7) in the frequency domain. Since $h(t)$ has Fourier transform

$$H(f) = P(f)\, G(f)\, R(f) \qquad (4.9)$$

where $P(f)$ is the Fourier transform of $p(t)$, the bandwidth of $H(f)$ is limited by the bandwidth of the channel $G(f)$. We will assume that $G(f) = 0$, $|f| > W$. The sampled impulse response $h(kT)$ can, therefore, be written as the inverse Fourier transform

$$h(kT) = \int_{-W}^{W} H(f)\, e^{j2\pi f kT}\, df$$

Through a series of manipulations, this integral can be rewritten as an inverse discrete Fourier transform,

$$h(kT) = T \int_{-1/(2T)}^{1/(2T)} H_{eq}\!\left(e^{j2\pi fT}\right) e^{j2\pi f kT}\, df \qquad (4.10a)$$

where

$$H_{eq}\!\left(e^{j2\pi fT}\right) = \frac{1}{T} \sum_k H\!\left(f + \frac{k}{T}\right) = \frac{1}{T} \sum_k P\!\left(f + \frac{k}{T}\right) G\!\left(f + \frac{k}{T}\right) R\!\left(f + \frac{k}{T}\right) \qquad (4.10b)$$

This relation states that $H_{eq}(z)$, $z = e^{j2\pi fT}$, is the discrete Fourier transform of the sequence $\{h_k\}$, where $h_k = h(kT)$. Sampling the impulse response $h(t)$ therefore changes the transfer function $H(f)$ to the aliased frequency response $H_{eq}(e^{j2\pi fT})$. From Eqs. (4.10a), (4.10b), and (4.6) we conclude that $H_{eq}(z)$ is the transfer function that relates the sequence of input data symbols $\{A_i\}$ to the sequence of received samples $\{y_i\}$, where $y_i = y(iT)$, in the absence of noise. This is illustrated in Fig. 4.4. For this reason, $H_{eq}(z)$ is called the equivalent discrete-time transfer function for the overall system transfer function $H(f)$.

FIGURE 4.4: Equivalent discrete-time channel for the PAM system shown in Fig. 4.3 [$y_i = y(iT)$, $\tilde{n}_i = \tilde{n}(iT)$].

Since $H_{eq}(e^{j2\pi fT})$ is the discrete Fourier transform of the sequence $\{h_k\}$, the time-domain, or sequence, condition (4.7) is equivalent to the frequency-domain condition

$$H_{eq}\!\left(e^{j2\pi fT}\right) = 1 \qquad (4.11)$$

This relation is called the Nyquist criterion. From Eqs. (4.10b) and (4.11) we make the following observations.

1. To satisfy the Nyquist criterion, the channel bandwidth $W$ must be at least $1/(2T)$. Otherwise, $G(f + n/T) = 0$ for $f$ in some interval of positive length for all $n$, which implies that $H_{eq}(e^{j2\pi fT}) = 0$ for $f$ in the same interval.

2. For the minimum bandwidth $W = 1/(2T)$, Eqs. (4.10b) and (4.11) imply that $H(f) = T$ for $|f| < 1/(2T)$ and $H(f) = 0$ elsewhere. This implies that the system impulse response is given by

$$h(t) = \frac{\sin(\pi t/T)}{\pi t/T} \qquad (4.12)$$

(Since $\int_{-\infty}^{\infty} h^2(t)\, dt = T$, the transmitted signal $s(t) = \sum_i A_i\, h(t - iT)$ has power equal to the symbol variance $E[|A_i|^2]$.) The impulse response in Eq. (4.12) is called a minimum bandwidth or Nyquist pulse. The frequency band $[-1/(2T),\, 1/(2T)]$ [i.e., the passband of $H(f)$] is called the Nyquist band.

3. Suppose that the channel is bandlimited to twice the Nyquist bandwidth. That is, $G(f) = 0$ for $|f| > 1/T$. The condition (4.11) then becomes

$$H(f) + H\!\left(f - \frac{1}{T}\right) + H\!\left(f + \frac{1}{T}\right) = T \qquad (4.13)$$

Assume for the moment that $H(f)$ and $h(t)$ are both real valued, so that $H(f)$ is an even function of $f$ [$H(f) = H(-f)$]. This is the case when the receiver filter is the matched filter (see Section 4.3). We can then rewrite Eq. (4.13) as

$$H(f) + H\!\left(\frac{1}{T} - f\right) = T, \qquad 0 < f < \frac{1}{2T} \qquad (4.14)$$

which states that $H(f)$ must have odd symmetry about $f = 1/(2T)$. This is illustrated in Fig. 4.5, which shows two different transfer functions $H(f)$ that satisfy the Nyquist criterion.

4. The pulse shape $p(t)$ enters into Eq. (4.11) only through the product $P(f)R(f)$. Consequently, either $P(f)$ or $R(f)$ can be fixed, and the other filter can be adjusted or adapted to the particular channel. Typically, the pulse shape $p(t)$ is fixed, and the receiver filter is adapted to the (possibly time-varying) channel.
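The decomposition in Eq. (4.6) into a desired term and an intersymbol interference term can be illustrated with the Nyquist pulse of Eq. (4.12). The sketch below is noise free; the symbol sequence, the sampling offset of $0.1T$, and the function names are assumptions made for illustration.

```python
import math


def nyquist_pulse(t, T=1.0):
    """Eq. (4.12): h(t) = sin(pi t/T) / (pi t/T), with h(0) = 1."""
    if t == 0.0:
        return 1.0
    x = math.pi * t / T
    return math.sin(x) / x


def received_sample(symbols, k, h, T=1.0, offset=0.0):
    """Noise-free y(kT + offset) = sum_i A_i h((k - i)T + offset)."""
    return sum(a * h((k - i) * T + offset, T) for i, a in enumerate(symbols))


def isi_term(symbols, k, h, T=1.0, offset=0.0):
    """Second term of Eq. (4.6): contribution of all symbols other than A_k."""
    return sum(a * h((k - i) * T + offset, T)
               for i, a in enumerate(symbols) if i != k)


symbols = [1, -1, 1, 1, -1, -1, 1, -1]   # assumed binary PAM sequence
k = 3
print("ideal sampling :", received_sample(symbols, k, nyquist_pulse))
print("ISI, ideal     :", isi_term(symbols, k, nyquist_pulse))
print("ISI, 0.1T late :", isi_term(symbols, k, nyquist_pulse, offset=0.1))
```

When the samples are taken exactly at multiples of $T$, condition (4.7) holds and the interference term is zero to within rounding error. A small timing offset makes the neighboring symbols leak into the sample.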

FIGURE 4.5: Two examples of frequency responses that satisfy the Nyquist criterion.

4.2.1 Raised Cosine Pulse

Suppose that the channel is ideal with transfer function

$$G(f) = \begin{cases} 1, & |f| < W \\ 0, & |f| > W \end{cases} \qquad (4.15)$$

To maximize bandwidth efficiency, Nyquist pulses given by Eq. (4.12) should be used where $W = 1/(2T)$. This type of signalling, however, has two major drawbacks. First, Nyquist pulses are noncausal and of infinite duration. They can be approximated in practice by introducing an appropriate delay and truncating the pulse. The pulse, however, decays very slowly, namely, as $1/t$, so that the truncation window must be wide. This is equivalent to observing that the ideal bandlimited frequency response given by Eq. (4.15) is difficult to approximate closely. The second drawback, which is more important, is the fact that this type of signalling is not robust with respect to sampling jitter. Namely, a small sampling offset $\varepsilon$ produces the output sample

$$y(kT + \varepsilon) = \sum_i A_i\, \frac{\sin[\pi(k - i + \varepsilon/T)]}{\pi(k - i + \varepsilon/T)} \qquad (4.16)$$

Since the Nyquist pulse decays as $1/t$, this sum is not guaranteed to converge. A particular choice of symbols $\{A_i\}$ can, therefore, lead to very large intersymbol interference, no matter how small the offset. Minimum bandwidth signalling is therefore impractical. The preceding problem is generally solved in one of two ways in practice:

1. The pulse bandwidth is increased to provide a faster pulse decay than $1/t$.
2. A controlled amount of intersymbol interference is introduced at the transmitter, which can be subtracted out at the receiver.

The former approach sacrifices bandwidth efficiency, whereas the latter approach sacrifices power efficiency. We will examine the latter approach in Section 4.5. The most common example of a pulse that illustrates the first technique is the raised cosine pulse, given by

$$h(t) = \left[\frac{\sin(\pi t/T)}{\pi t/T}\right] \left[\frac{\cos(\alpha\pi t/T)}{1 - (2\alpha t/T)^2}\right] \qquad (4.17)$$

which has Fourier transform

$$H(f) = \begin{cases} T, & 0 \le |f| \le \dfrac{1-\alpha}{2T} \\[2ex] \dfrac{T}{2}\left\{1 + \cos\!\left[\dfrac{\pi T}{\alpha}\left(|f| - \dfrac{1-\alpha}{2T}\right)\right]\right\}, & \dfrac{1-\alpha}{2T} \le |f| \le \dfrac{1+\alpha}{2T} \\[2ex] 0, & |f| > \dfrac{1+\alpha}{2T} \end{cases} \qquad (4.18)$$

where $0 \le \alpha \le 1$. Plots of $p(t)$ and $P(f)$ are shown in Figs. 4.6a and 4.6b for different values of $\alpha$. It is easily verified that $h(t)$ satisfies the Nyquist criterion (4.7) and, consequently, $H(f)$ satisfies Eq. (4.11). When $\alpha = 0$, $H(f)$ corresponds to the Nyquist pulse with minimum bandwidth $1/(2T)$, and when $\alpha > 0$, $H(f)$ has bandwidth $(1 + \alpha)/(2T)$ with a raised cosine rolloff. The parameter $\alpha$, therefore, represents the additional, or excess, bandwidth as a fraction of the minimum bandwidth $1/(2T)$. For example, when $\alpha = 1$, we say that the pulse is a raised cosine pulse with 100% excess bandwidth. This is because the pulse bandwidth $1/T$ is twice the minimum bandwidth. Because the raised cosine pulse decays as $1/t^3$, performance is robust with respect to sampling offsets.

The raised cosine frequency response (4.18) applies to the combination of transmitter, channel, and receiver. If the transmitted pulse shape $p(t)$ is a raised cosine pulse, then $h(t)$ is a raised cosine pulse only if the combined receiver and channel frequency response is constant. Even with an ideal (transparent) channel, however, the optimum (matched) receiver filter response is generally not constant in the presence of additive Gaussian noise. An alternative is to transmit the square-root raised cosine pulse shape, which has frequency response $P(f)$ given by the square root of the raised cosine frequency response in Eq. (4.18). Assuming an ideal channel, setting the receiver frequency response $R(f) = P(f)$ then results in an overall raised cosine system response $H(f)$.

Figure 4.6a Raised cosine pulse.

Figure 4.6b Raised cosine spectrum.

4.3 Nyquist Criterion with Matched Filtering

Consider the transmission of an isolated pulse A0 δ(t). In this case the input to the receiver in Fig. 4.3 is

\[ x(t) = A_0 \tilde{g}(t) + n(t) \tag{4.19} \]

where g̃(t) is the inverse Fourier transform of the combined transmitter-channel transfer function G̃(f) = P(f)G(f). We will assume that the noise n(t) is white with spectrum N0/2. The output of the receiver filter is then

\[ y(t) = r(t) * x(t) = A_0 [r(t) * \tilde{g}(t)] + [r(t) * n(t)] \tag{4.20} \]

The first term on the right-hand side is the desired signal, and the second term is noise. Assuming that y(t) is sampled at t = 0, the ratio of signal energy to noise energy, or signal-to-noise ratio (SNR), at the sampling instant is

\[ \mathrm{SNR} = \frac{E\left[|A_0|^2\right] \left| \displaystyle\int_{-\infty}^{\infty} r(-t)\,\tilde{g}(t)\,dt \right|^2}{\dfrac{N_0}{2} \displaystyle\int_{-\infty}^{\infty} |r(t)|^2\,dt} \tag{4.21} \]

The receiver impulse response that maximizes this expression is r(t) = g̃*(−t) [the complex conjugate of g̃(−t)], which is known as the matched filter impulse response. The associated transfer function is R(f) = G̃*(f).

Choosing the receiver filter to be the matched filter is optimal in more general situations, such as when detecting a sequence of channel symbols with intersymbol interference (assuming the additive noise is Gaussian). We, therefore, reconsider the Nyquist criterion when the receiver filter is the matched filter. In this case, the baseband model is shown in Fig. 4.7, and the output of the receiver filter is given by

\[ y(t) = \sum_i A_i h(t - iT) + \tilde{n}(t) \tag{4.22} \]

where the baseband pulse h(t) is now the impulse response of the filter with transfer function |G̃(f)|² = |P(f)G(f)|². This impulse response is the autocorrelation of the impulse response of the combined transmitter-channel filter G̃(f),

\[ h(t) = \int_{-\infty}^{\infty} \tilde{g}^*(s)\,\tilde{g}(s+t)\,ds \tag{4.23} \]
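The matched filter result can be illustrated with a discrete-time stand-in for Eqs. (4.21) and (4.23). The sketch below is a minimal Python example under stated assumptions: NumPy is available, g̃(t) is replaced by an arbitrary random complex sequence on a time grid that is symmetric about t = 0, integrals become sums with step `dt`, and the names `sampled_snr`, `n0`, and `symbol_energy` are hypothetical. It compares the SNR obtained with r(t) = g̃*(−t) against randomly chosen receiver filters and forms h(t) as the autocorrelation of g̃(t).

```python
import numpy as np

def sampled_snr(r, g, n0=2.0, symbol_energy=1.0, dt=1.0):
    """Discrete-time version of Eq. (4.21); arrays cover t = -L*dt, ..., L*dt."""
    r = np.asarray(r, dtype=complex)
    g = np.asarray(g, dtype=complex)
    signal = np.abs(np.sum(r[::-1] * g) * dt) ** 2   # |integral of r(-t) g(t) dt|^2
    noise = 0.5 * n0 * np.sum(np.abs(r) ** 2) * dt   # (N0/2) * integral of |r(t)|^2 dt
    return symbol_energy * signal / noise

rng = np.random.default_rng(0)
n = 41                                               # samples at t = -20, ..., 20
g = rng.normal(size=n) + 1j * rng.normal(size=n)     # arbitrary stand-in for g~(t)

matched = np.conj(g[::-1])                           # r(t) = conj(g~(-t))
snr_matched = sampled_snr(matched, g)
snr_others = [
    sampled_snr(rng.normal(size=n) + 1j * rng.normal(size=n), g)
    for _ in range(200)
]

# Eq. (4.23): h(t) is the autocorrelation of g~(t); index n - 1 is lag zero.
h = np.correlate(g, g, mode="full")
# The same h(t) is obtained by passing g~(t) through the matched filter.
h_via_filter = np.convolve(matched, g)

print(snr_matched, max(snr_others))
```

By the Cauchy-Schwarz inequality no other filter can exceed the matched filter SNR, which with these default values equals the energy of g̃(t), that is, the zero-lag value of h(t).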

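FIGURE 4.7: Baseband PAM model with a matched filter at the receiver.

With a matched filter at the receiver, the equivalent discrete-time transfer function is

\[ H_{eq}(e^{j2\pi fT}) = \frac{1}{T}\sum_k \left|\tilde{G}\left(f - \frac{k}{T}\right)\right|^2 = \frac{1}{T}\sum_k \left|P\left(f - \frac{k}{T}\right) G\left(f - \frac{k}{T}\right)\right|^2 \tag{4.24} \]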

which relates the sequence of transmitted symbols {Ak} to the sequence of received samples {yk} in the absence of noise. Note that Heq(e^{j2πfT}) is positive, real valued, and an even function of f. If the channel is bandlimited to twice the Nyquist bandwidth, then H(f) = 0 for |f| > 1/T, and the Nyquist condition is given by Eq. (4.14) where H(f) = |G(f)P(f)|². The aliasing sum in Eq. (4.10b) can therefore be described as a folding operation in which the channel response |G̃(f)|² = |P(f)G(f)|² is folded around the Nyquist frequency 1/(2T). For this reason, Heq(e^{j2πfT}) with a matched receiver filter is often referred to as the folded channel spectrum.
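The folding operation can be sketched numerically. The Python example below assumes NumPy, an ideal channel, and a square-root raised cosine transmit pulse with a matched receiver, so that H(f) = |P(f)G(f)|² is the raised cosine spectrum of Eq. (4.18); the function names and the truncation of the aliasing sum to |k| ≤ `kmax` are illustrative assumptions. In this case the folded spectrum should be flat, which is the Nyquist condition.

```python
import numpy as np

def raised_cosine_spectrum(f, T=1.0, alpha=0.25):
    """Raised cosine frequency response H(f) of Eq. (4.18), for 0 < alpha <= 1."""
    f = np.abs(np.asarray(f, dtype=float))
    f1 = (1.0 - alpha) / (2.0 * T)
    f2 = (1.0 + alpha) / (2.0 * T)
    H = np.where(f <= f1, T, 0.0)
    rolloff = (f > f1) & (f <= f2)
    shaped = 0.5 * T * (1.0 + np.cos(np.pi * T / alpha * (f - f1)))
    return np.where(rolloff, shaped, H)

def folded_spectrum(H, f, T=1.0, kmax=3):
    """(1/T) * sum_k H(f - k/T), truncated to |k| <= kmax."""
    f = np.asarray(f, dtype=float)
    return sum(H(f - k / T) for k in range(-kmax, kmax + 1)) / T

f = np.linspace(-0.5, 0.5, 101)   # one period of the folded spectrum, with T = 1
for alpha in (0.25, 0.5, 1.0):
    Heq = folded_spectrum(lambda x: raised_cosine_spectrum(x, 1.0, alpha), f)
    print(alpha, Heq.min(), Heq.max())
```

Because the raised cosine spectrum is zero for |f| > 1/T, only the terms k = −1, 0, 1 contribute on this frequency range; the larger `kmax` is harmless.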

4.4 Eye Diagrams

One way to assess the severity of distortion due to intersymbol interference in a digital communications system is to examine the eye diagram. The eye diagram is illustrated in Figs. 4.8a and 4.8b, for a raised cosine pulse shape with 25% excess bandwidth and an ideal bandlimited channel. Figure 4.8a shows the data signal at the receiver

\[ y(t) = \sum_i A_i h(t - iT) + \tilde{n}(t) \tag{4.25} \]

where h(t) is given by Eq. (4.17), α = 1/4, each symbol Ai is independently chosen from the set {±1, ±3}, where each symbol is equally likely, and ñ(t) is bandlimited white Gaussian noise. (The received SNR is 30 dB.) The eye diagram is constructed from the time-domain data signal y(t) as follows (assuming nominal sampling times at kT, k = 0, 1, 2, . . .):

1. Partition the waveform y(t) into successive segments of length T starting from t = T/2.
2. Translate each of these waveform segments [y(t), (k + 1/2)T ≤ t ≤ (k + 3/2)T, k = 0, 1, 2, . . .] to the interval [−T/2, T/2], and superimpose.

The resulting picture is shown in Fig. 4.8b for the y(t) shown in Fig. 4.8a. (Partitioning y(t) into successive segments of length iT, i > 1, is also possible. This would result in i successive eye diagrams.) The number of eye openings is one less than the number of transmitted signal levels. In practice, the eye diagram is easily viewed on an oscilloscope by applying the received waveform y(t) to the vertical deflection plates of the oscilloscope and applying a sawtooth waveform at the symbol rate 1/T to the horizontal deflection plates. This causes successive symbol intervals to be translated into one interval on the oscilloscope display.

Each waveform segment y(t), (k + 1/2)T ≤ t ≤ (k + 3/2)T, depends on the particular sequence of channel symbols surrounding Ak. The number of channel symbols that affects a particular waveform segment depends on the extent of the intersymbol interference, shown in Eq. (4.6). This, in turn, depends on the duration of the impulse response h(t). For example, if h(t) has most of its energy in the interval 0 < t < mT, then each waveform segment depends on approximately m symbols. Assuming binary transmission, this implies that there are a total of 2^m waveform segments that can be superimposed in the eye diagram. (It is possible that only one sequence of channel symbols causes significant intersymbol interference, and this sequence occurs with very low probability.) In current digital wireless applications the impulse response typically spans only a few symbols.

The eye diagram has the following important features which measure the performance of a digital communications system.

4.4.1 Vertical Eye Opening

The vertical openings at any time t0, −T/2 ≤ t0 ≤ T/2, represent the separation between signal levels with worst-case intersymbol interference, assuming that y(t) is sampled at times t = kT + t0, k = 0, 1, 2, . . . . It is possible for the intersymbol interference to be large enough so that this vertical opening between some, or all, signal levels disappears altogether. In that case, the eye is said to be closed. Otherwise, the eye is said to be open. A closed eye implies that if the estimated bits are obtained by thresholding the samples y(kT), then the decisions will depend primarily on the intersymbol interference rather than on the desired symbol. The probability of error will, therefore, be close to 1/2. Conversely, wide vertical spacings between signal levels imply a large degree of immunity to additive noise. In general, y(t) should be sampled at the times kT + t0, k = 0, 1, 2, . . . , where t0 is chosen to maximize the vertical eye opening.

Figure 4.8a Received signal y(t).

Figure 4.8b Eye diagram for received signal shown in Fig. 4.8a.

4.4.2 Horizontal Eye Opening

The width of each opening indicates the sensitivity to timing offset. Specifically, a very narrow eye opening indicates that a small timing offset will result in sampling where the eye is closed. Conversely, a wide horizontal opening indicates that a large timing offset can be tolerated, although the error probability will depend on the vertical opening.

4.4.3 Slope of the Inner Eye

The slope of the inner eye indicates sensitivity to timing jitter or variance in the timing offset. Specifically, a very steep slope means that the eye closes rapidly as the timing offset increases. In this case, a significant amount of jitter in the sampling times significantly increases the probability of error.

The shape of the eye diagram is determined by the pulse shape. In general, the faster the baseband pulse decays, the wider the eye opening. For example, a rectangular pulse produces a box-shaped eye diagram (assuming binary signalling). The minimum bandwidth pulse shape Eq. (4.12) produces an eye diagram which is closed for all t except for t = 0. This is because, as shown earlier, an arbitrarily small timing offset can lead to an intersymbol interference term that is arbitrarily large, depending on the data sequence.
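The construction of the eye diagram and the measurement of its vertical opening can be sketched in a few lines of Python. This is a simplified illustration, not the setup used for Fig. 4.8: it assumes NumPy, binary symbols {±1} instead of four levels, no noise, a raised cosine pulse with α = 1/4, and a finite sequence of 200 symbols; the helper names, the oversampling factor `sps`, and the number of discarded edge segments `skip` are hypothetical.

```python
import numpy as np

def raised_cosine(t, T=1.0, alpha=0.25):
    """Raised cosine pulse of Eq. (4.17); this helper requires 0 < alpha <= 1."""
    x = np.asarray(t, dtype=float) / T
    denom = 1.0 - (2.0 * alpha * x) ** 2
    singular = np.isclose(denom, 0.0)
    h = np.sinc(x) * np.cos(np.pi * alpha * x) / np.where(singular, 1.0, denom)
    limit = (np.pi / 4.0) * np.sinc(1.0 / (2.0 * alpha))  # value at t = +/- T/(2 alpha)
    return np.where(singular, limit, h)

def pam_waveform(symbols, sps, T=1.0, alpha=0.25):
    """Noise-free y(t) = sum_i A_i h(t - iT), sampled sps times per symbol."""
    t = np.arange(len(symbols) * sps) * T / sps
    y = np.zeros(len(t))
    for i, a in enumerate(symbols):
        y += a * raised_cosine(t - i * T, T, alpha)
    return y

def eye_segments(y, sps, skip):
    """Row k holds y(t) for (k + 1/2)T <= t <= (k + 3/2)T, re-centred on [-T/2, T/2]."""
    half = sps // 2
    n_sym = len(y) // sps
    rows = [y[k * sps + half:k * sps + half + sps + 1]
            for k in range(skip, n_sym - skip - 1)]
    return np.array(rows)

rng = np.random.default_rng(0)
sps, skip = 16, 8
symbols = rng.choice([-1.0, 1.0], size=200)
y = pam_waveform(symbols, sps)
segments = eye_segments(y, sps, skip)        # superimpose these rows to draw the eye

# The centre of row k is the nominal sampling time (k + 1)T, i.e., symbol A_{k+1}.
centre = symbols[skip + 1:len(symbols) - skip]
# Vertical opening versus offset t0: lowest "+1" trace minus highest "-1" trace.
opening = segments[centre > 0].min(axis=0) - segments[centre < 0].max(axis=0)
print(np.round(opening, 3))
```

Plotting each row of `segments` against a common time axis from −T/2 to T/2 gives the eye diagram; the array `opening` is largest at the centre sample (t0 = 0) and shrinks toward the edges, where the eye closes.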

4.5 Partial-Response Signalling

To avoid the problems associated with Nyquist signalling over an ideal bandlimited channel, bandwidth and/or power efficiency must be compromised. Raised cosine pulses compromise bandwidth efficiency to gain robustness with respect to timing errors. Another possibility is to introduce a controlled amount of intersymbol interference at the transmitter, which can be removed at the receiver. This approach is called partial-response (PR) signalling. The terminology reflects the fact that the sampled system impulse response does not have the full response given by the Nyquist condition Eq. (4.7). To illustrate PR signalling, suppose that the Nyquist condition Eq. (4.7) is replaced by the condition

\[ h_k = \begin{cases} 1, & k = 0, 1 \\ 0, & \text{all other } k \end{cases} \tag{4.26} \]

The kth received sample is then

\[ y_k = A_k + A_{k-1} + \tilde{n}_k \tag{4.27} \]

so that there is intersymbol interference from one neighboring transmitted symbol. For now we focus on the spectral characteristics of PR signalling and defer discussion of how to detect the transmitted sequence {Ak} in the presence of intersymbol interference. The equivalent discrete-time transfer function in this case is the discrete Fourier transform of the sequence in Eq. (4.26),

\[ H_{eq}(e^{j2\pi fT}) = \frac{1}{T}\sum_k H\left(f + \frac{k}{T}\right) = 1 + e^{-j2\pi fT} = 2e^{-j\pi fT}\cos(\pi fT) \tag{4.28} \]

As in the full-response case, for Eq. (4.28) to be satisfied, the minimum bandwidth of the channel G(f) and transmitter filter P(f) is W = 1/(2T). Assuming P(f) has this minimum bandwidth implies

\[ H(f) = \begin{cases} 2T e^{-j\pi fT}\cos(\pi fT), & |f| < 1/(2T) \\ 0, & |f| > 1/(2T) \end{cases} \tag{4.29a} \]

and

\[ h(t) = \mathrm{sinc}(t/T) + \mathrm{sinc}[(t - T)/T] \tag{4.29b} \]

where sinc x = (sin πx)/(πx). This pulse is called a duobinary pulse and is shown along with the associated H(f) in Fig. 4.9. [Notice that h(t) satisfies Eq. (4.26).] Unlike the ideal bandlimited frequency response, the transfer function H(f) in Eq. (4.29a) is continuous and is, therefore, easily approximated by a physically realizable filter. Duobinary PR was first proposed by Lender [7] and later generalized by Kretzmer [6].
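The sampled values of the duobinary pulse are easy to check numerically. The Python sketch below assumes NumPy and measures time in symbol periods (x = t/T, equivalently T = 1), so that any constant scale factor in front of the pulse is one; the function name `duobinary_pulse` is illustrative. It confirms that the samples at t = kT match Eq. (4.26) and compares the tail of the pulse with that of the minimum bandwidth (sinc) pulse.

```python
import numpy as np

def duobinary_pulse(x):
    """Duobinary pulse sinc(x) + sinc(x - 1), with x = t/T."""
    x = np.asarray(x, dtype=float)
    return np.sinc(x) + np.sinc(x - 1.0)

k = np.arange(-5, 6)
samples = duobinary_pulse(k)
# Expect 1 at k = 0 and k = 1 and (numerically) 0 at all other integers.
print(np.round(samples, 12))

# Midway between distant sampling instants the duobinary tail is much smaller
# than the tail of the minimum bandwidth pulse sinc(x).
x_far = 10.5
print(abs(duobinary_pulse(x_far)), abs(np.sinc(x_far)))
```

The faster decay follows from the identity sinc(x) + sinc(x − 1) = −sin(πx)/[πx(x − 1)], which falls off as 1/x² rather than 1/x.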

FIGURE 4.9: Duobinary frequency response and minimum bandwidth pulse.

The main advantage of the duobinary pulse Eq. (4.29b), relative to the minimum bandwidth pulse Eq. (4.12), is that signalling at the Nyquist symbol rate is feasible with zero excess bandwidth. Because the pulse decays much more rapidly than a Nyquist pulse, it is robust with respect to timing errors. Selecting the transmitter and receiver filters so that the overall system response is duobinary is appropriate in situations where the channel frequency response G(f) is near zero or has a rapid rolloff at the Nyquist band edge f = 1/(2T).

As another example of PR signalling, consider the modified duobinary partial response

\[ h_k = \begin{cases} 1, & k = -1 \\ -1, & k = 1 \\ 0, & \text{all other } k \end{cases} \tag{4.30} \]

which has equivalent discrete-time transfer function

\[ H_{eq}(e^{j2\pi fT}) = e^{j2\pi fT} - e^{-j2\pi fT} = j2\sin(2\pi fT) \tag{4.31} \]

With zero excess bandwidth, the overall system response is

\[ H(f) = \begin{cases} j2T\sin(2\pi fT), & |f| < 1/(2T) \\ 0, & |f| > 1/(2T) \end{cases} \tag{4.32a} \]

and

\[ h(t) = \mathrm{sinc}[(t + T)/T] - \mathrm{sinc}[(t - T)/T] \tag{4.32b} \]

These functions are plotted in Fig. 4.10. This pulse shape is appropriate when the channel response G(f) is near zero at both DC (f = 0) and at the Nyquist band edge. This is often the case for wire (twisted-pair) channels where the transmitted signal is coupled to the channel through a transformer. Like duobinary PR, modified duobinary allows minimum bandwidth signalling at the Nyquist rate.

FIGURE 4.10: Modified duobinary frequency response and minimum bandwidth pulse.

A particular partial response is often identified by the polynomial

\[ \sum_{k=0}^{K} h_k D^k \]

where D (for delay) takes the place of the usual z^{−1} in the z transform of the sequence {hk}. For example, duobinary is also referred to as 1 + D partial response. In general, more complicated system responses than those shown in Figs. 4.9 and 4.10 can be generated by choosing more nonzero coefficients in the sequence {hk}. This complicates detection, however, because of the additional intersymbol interference that is generated.

Rather than modulating a PR pulse h(t), a PR signal can also be generated by filtering the sequence of transmitted levels {Ai}. This is shown in Fig. 4.11. Namely, the transmitted levels are first passed through a discrete-time (digital) filter with transfer function Pd(e^{j2πfT}) (where the subscript d indicates discrete). [Note that Pd(e^{j2πfT}) can be selected to be Heq(e^{j2πfT}).] The outputs of this filter form the PAM signal, where the pulse shaping filter P(f) = 1, |f| < 1/(2T) and is zero elsewhere. If the transmitted levels {Ak} are selected independently and are identically distributed, then the transmitted spectrum is σ_A² |Pd(e^{j2πfT})|² for |f| < 1/(2T) and is zero for |f| > 1/(2T), where σ_A² = E[|Ak|²].

Shaping the transmitted spectrum to have nulls coincident with nulls in the channel response potentially offers significant performance advantages. By introducing intersymbol interference, however, PR signalling increases the number of received signal levels, which increases the complexity of the detector and may reduce immunity to noise. For example, the set of received signal levels for duobinary signalling is {0, ±2} from which the transmitted levels {±1} must be estimated. The performance of a particular PR scheme depends on the channel characteristics, as well as the type of detector used at the receiver. We now describe a simple suboptimal detection strategy.
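Both the spectral nulls and the enlarged set of received levels can be reproduced with a short Python sketch. It assumes NumPy; the function names are hypothetical, and the modified duobinary response is represented in its causal (delayed) form 1 − D², which has the same magnitude response as Eq. (4.31). Following Eq. (4.28), D corresponds to e^{−j2πfT}.

```python
import numpy as np

def pr_frequency_response(h, fT):
    """H_eq(e^{j 2 pi f T}) = sum_k h_k e^{-j 2 pi f T k} for taps h_0, ..., h_K."""
    fT = np.asarray(fT, dtype=float)
    k = np.arange(len(h))
    return np.exp(-2j * np.pi * np.outer(fT, k)) @ np.asarray(h, dtype=complex)

def pr_filter(levels, h):
    """Noise-free samples y_k = sum_i h_i A_{k-i}, omitting the start-up samples."""
    return np.convolve(levels, h)[len(h) - 1:len(levels)]

duobinary = [1, 1]              # 1 + D
modified_duobinary = [1, 0, -1] # 1 - D^2 (causal form)

fT = np.array([0.0, 0.25, 0.5])  # DC, mid-band, Nyquist band edge f = 1/(2T)
print(np.abs(pr_frequency_response(duobinary, fT)))           # null at the band edge
print(np.abs(pr_frequency_response(modified_duobinary, fT)))  # nulls at DC and band edge

rng = np.random.default_rng(1)
levels = rng.choice([-1.0, 1.0], size=1000)
received = pr_filter(levels, duobinary)
print(sorted(set(received.tolist())))   # three received levels from two transmitted levels
```

With independent, identically distributed levels, the squared magnitude returned by `pr_frequency_response` is, up to the factor σ_A², the shape of the transmitted spectrum inside the Nyquist band.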

FIGURE 4.11: Generation of PR signal.

4.5.1 Precoding

Consider the received signal sample Eq. (4.27) with duobinary signalling. If the receiver has correctly decoded the symbol Ak−1, then in the absence of noise Ak can be decoded by subtracting Ak−1 from the received sample yk. If an error occurs, however, then subtracting the preceding symbol estimate from the received sample will cause the error to propagate to successive detected symbols. To avoid this problem, the transmitted levels can be precoded in such a way as to compensate for the intersymbol interference introduced by the overall partial response.

FIGURE 4.12: Precoding for a PR channel.

TABLE 4.1 Example of Precoding for Duobinary PR.

{bi}:          1   0   0   1   1   1   0   0   1   0
{b′i}:     0   1   1   1   0   1   0   0   0   1   1
{Ai}:     −1   1   1   1  −1   1  −1  −1  −1   1   1
{yi}:          0   2   2   0   0   0  −2  −2   0   2

We first illustrate precoding for duobinary PR. The sequence of operations is illustrated in Fig. 4.12. Let {bk} denote the sequence of source bits where bk ∈ {0, 1}. This sequence is transformed to the sequence {b′k} by the operation

\[ b'_k = b_k \oplus b'_{k-1} \tag{4.33} \]

where ⊕ denotes modulo 2 addition (exclusive OR). The sequence {b′k} is mapped to the sequence of binary transmitted signal levels {Ak} according to

\[ A_k = 2b'_k - 1 \tag{4.34} \]

That is, b′k = 0 (b′k = 1) is mapped to the transmitted level Ak = −1 (Ak = 1). In the absence of noise, the received symbol is then

\[ y_k = A_k + A_{k-1} = 2\left(b'_k + b'_{k-1} - 1\right) \tag{4.35} \]

and combining Eqs. (4.33) and (4.35) gives

\[ b_k = \left(\tfrac{1}{2}\,y_k + 1\right) \bmod 2 \tag{4.36} \]

That is, if yk = ±2, then bk = 0, and if yk = 0, then bk = 1. Precoding, therefore, enables the detector to make symbol-by-symbol decisions that do not depend on previous decisions. Table 4.1 shows a sequence of transmitted bits {bi}, precoded bits {b′i}, transmitted signal levels {Ai}, and received samples {yi}.

The preceding precoding technique can be extended to multilevel PAM and to other PR channels. Suppose that the PR is specified by

\[ H_{eq}(D) = \sum_{k=0}^{K} h_k D^k \]

where the coefficients are integers and that the source symbols {bk} are selected from the set {0, 1, . . . , M − 1}. These symbols are transformed to the sequence {b′k} via the precoding operation

\[ b'_k = \left(b_k - \sum_{i=1}^{K} h_i b'_{k-i}\right) \bmod M \tag{4.37} \]

Because of the modulo operation, each symbol b′k is also in the set {0, 1, . . . , M − 1}. The kth transmitted signal level is given by

\[ A_k = 2b'_k - (M - 1) \tag{4.38} \]

so that the set of transmitted levels is {−(M − 1), . . . , (M − 1)} (i.e., a shifted version of the set of values assumed by bk). In the absence of noise the received sample is

\[ y_k = \sum_{i=0}^{K} h_i A_{k-i} \tag{4.39} \]

and it can be shown that the kth source symbol is given by

\[ b_k = \tfrac{1}{2}\left[y_k + (M - 1)\,H_{eq}(1)\right] \bmod M \tag{4.40} \]

Precoding the symbols {bk} in this manner, therefore, enables symbol-by-symbol decisions at the receiver. In the presence of noise, more sophisticated detection schemes (e.g., maximum likelihood) can be used with PR signalling to obtain improvements in performance.
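The precoding and decoding steps of Eqs. (4.37) through (4.40) can be traced with a small Python sketch using only the standard language. The function names are hypothetical, the precoder memory is assumed to start at b′ = 0 (as in Table 4.1), the channel is noise free, and the leading coefficient is assumed to be h0 = 1, which Eq. (4.40) needs in order to recover bk. With h = (1, 1) and M = 2 the general operations reduce to the duobinary case of Eqs. (4.33) through (4.36).

```python
def precode(symbols, h, M):
    """Eq. (4.37): b'_k = (b_k - sum_{i=1..K} h_i b'_{k-i}) mod M, with b'_k = 0 for k < 0."""
    K = len(h) - 1
    past = [0] * K                       # past[i - 1] holds b'_{k-i}
    out = []
    for b in symbols:
        feedback = sum(h[i] * past[i - 1] for i in range(1, K + 1))
        bp = (b - feedback) % M
        out.append(bp)
        past = ([bp] + past)[:K]
    return out

def to_levels(precoded, M):
    """Eq. (4.38): A_k = 2 b'_k - (M - 1)."""
    return [2 * bp - (M - 1) for bp in precoded]

def pr_channel(levels, h, M):
    """Eq. (4.39) without noise; levels before k = 0 correspond to b' = 0."""
    K = len(h) - 1
    padded = [-(M - 1)] * K + list(levels)
    return [sum(h[i] * padded[K + k - i] for i in range(K + 1))
            for k in range(len(levels))]

def decode(samples, h, M):
    """Eq. (4.40): b_k = (1/2)[y_k + (M - 1) H_eq(1)] mod M, with H_eq(1) = sum of h."""
    return [((y + (M - 1) * sum(h)) // 2) % M for y in samples]

# Duobinary example using the source bits of Table 4.1.
bits = [1, 0, 0, 1, 1, 1, 0, 0, 1, 0]
precoded = precode(bits, [1, 1], 2)
levels = to_levels(precoded, 2)
samples = pr_channel(levels, [1, 1], 2)
print(precoded, levels, samples, decode(samples, [1, 1], 2))

# Four-level example with the causal modified duobinary response 1 - D^2.
source = [3, 1, 0, 2, 2, 1, 3, 0, 1, 2]
rx = pr_channel(to_levels(precode(source, [1, 0, -1], 4), 4), [1, 0, -1], 4)
print(decode(rx, [1, 0, -1], 4) == source)
```

Each decoded symbol depends only on the current sample yk, so a single wrong decision does not propagate to later symbols.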

4.6 Additional Considerations

In many applications, bandwidth and intersymbol interference are not the only important considerations for selecting baseband pulses. Here we give a brief discussion of additional practical constraints that may influence this selection.

4.6.1 Average Transmitted Power and Spectral Constraints

The constraint on average transmitted power varies according to the application. For example, low average power is highly desirable for mobile wireless applications that use battery-powered transmitters. In many applications (e.g., digital subscriber loops, as well as digital radio), constraints are imposed to limit the amount of interference, or crosstalk, radiated into neighboring receivers and communications systems. Because this type of interference is frequency dependent, the constraint may take the form of a spectral mask that specifies the maximum allowable transmitted power as a function of frequency. For example, crosstalk in wireline channels is generally caused by capacitive coupling and increases as a function of frequency. Consequently, to reduce the amount of crosstalk generated at a particular transmitter, the pulse shaping filter generally attenuates high frequencies more than low frequencies.

In radio applications where signals are assigned different frequency bands, constraints on the transmitted spectrum are imposed to limit adjacent-channel interference. This interference is generated by transmitters assigned to adjacent frequency bands. Therefore, a constraint is needed to limit the amount of out-of-band power generated by each transmitter, in addition to an overall average power constraint. To meet this constraint, the transmitter filter in Fig. 4.3 must have a sufficiently steep rolloff at the edges of the assigned frequency band. (Conversely, if the transmitted signals are time multiplexed, then the duration of the system impulse response must be contained within the assigned time slot.)

4.6.2 Peak-to-Average Power

In addition to a constraint on average transmitted power, a peak-power constraint is often imposed as well. This constraint is important in practice for the following reasons:

1. The dynamic range of the transmitter is limited. In particular, saturation of the output amplifier will “clip” the transmitted waveform.
2. Rapid fades can severely distort signals with high peak-to-average power.
3. The transmitted signal may be subjected to nonlinearities. Saturation of the output amplifier is one example. Another example that pertains to wireline applications is the companding process in the voice telephone network [5]. Namely, the compander used to reduce quantization noise for pulse-code modulated voice signals introduces amplitude-dependent distortion in data signals.

The preceding impairments or constraints indicate that the transmitted waveform should have a low peak-to-average power ratio (PAR). For a transmitted waveform x(t), the PAR is defined as

\[ \mathrm{PAR} = \frac{\max |x(t)|^2}{E\left[|x(t)|^2\right]} \]

where E(·) denotes expectation. Using binary signalling with rectangular pulse shapes minimizes the PAR. However, this compromises bandwidth efficiency. In applications where PAR should be low, binary signalling with rounded pulses is often used. Operating RF power amplifiers with power back-off can also reduce PAR, but leads to inefficient amplification.

For an orthogonal frequency division multiplexing (OFDM) system, it is well known that the transmitted signal can exhibit a very high PAR compared to an equivalent single-carrier system. Hence more sophisticated approaches to PAR reduction are required for OFDM. Some proposed approaches are described in [8] and references therein. These include altering the set of transmitted symbols and setting aside certain OFDM tones specifically to minimize PAR.
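The PAR definition is straightforward to evaluate on a sampled waveform. In the Python sketch below (assuming NumPy), the time average over the record stands in for the expectation, the function name `par` is illustrative, and the multicarrier signal is a toy sum of 16 equal-amplitude, phase-aligned real tones rather than a standards-compliant OFDM waveform.

```python
import numpy as np

def par(x):
    """Peak-to-average power ratio: max |x|^2 divided by the time average of |x|^2."""
    power = np.abs(np.asarray(x)) ** 2
    return power.max() / power.mean()

rng = np.random.default_rng(2)

# Binary signalling with rectangular pulses: |x(t)| is constant, so the PAR is 1.
rectangular = np.repeat(rng.choice([-1.0, 1.0], size=64), 16)

# Toy multicarrier waveform: 16 phase-aligned tones over one fundamental period.
t = np.arange(1024) / 1024.0
multicarrier = sum(np.cos(2.0 * np.pi * n * t) for n in range(1, 17))

print(par(rectangular), par(multicarrier))
```

For N phase-aligned unit-amplitude tones the peak power is N² and the average power is N/2, so the PAR is 2N; this worst-case alignment is what PAR reduction schemes try to avoid.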

4.6.3 Channel and Receiver Characteristics

The type of channel impairments encountered and the type of detection scheme used at the receiver can also influence the choice of a transmitted pulse shape. For example, a constant amplitude pulse is appropriate for a fast fading environment with noncoherent detection. The ability to track channel characteristics, such as phase, may allow more bandwidth efficient pulse shapes in addition to multilevel signalling.

High-speed data communications over time-varying channels requires that the transmitter and/or receiver adapt to the changing channel characteristics. Adapting the transmitter to compensate for a time-varying channel requires a feedback channel through which the receiver can notify the transmitter of changes in channel characteristics. Because of this extra complication, adapting the receiver is often preferred to adapting the transmitter pulse shape. However, the following examples are notable exceptions.

1. The current IS-95 air interface for direct-sequence code-division multiple access adapts the transmitter power to control the amount of interference generated and to compensate for channel fades. This can be viewed as a simple form of adaptive transmitter pulse shaping in which a single parameter associated with the pulse shape is varied.
2. Multitone modulation divides the channel bandwidth into small subbands, and the transmitted power and source bits are distributed among these subbands to maximize the information rate. The received signal-to-noise ratio for each subband must be transmitted back to the transmitter to guide the allocation of transmitted bits and power [1].

In addition to multitone modulation, adaptive precoding (also known as Tomlinson–Harashima precoding [4, 11]) is another way in which the transmitter can adapt to the channel frequency response. Adaptive precoding is an extension of the technique described earlier for partial-response channels. Namely, the equivalent discrete-time channel impulse response is measured at the receiver and sent back to the transmitter, where it is used in a precoder. The precoder compensates for the intersymbol interference introduced by the channel, allowing the receiver to detect the data by a simple threshold operation. Both multitone modulation and precoding have been used with wireline channels (voiceband modems and digital subscriber loops).

4.6.4 Complexity

Generation of a bandwidth-efficient signal requires a filter with a sharp cutoff. In addition, bandwidth-efficient pulse shapes can complicate other system functions, such as timing and carrier recovery. If sufficient bandwidth is available, the cost can be reduced by using a rectangular pulse shape with a simple detection strategy (low-pass filter and threshold).

4.6.5 Tolerance to Interference

Interference is one of the primary channel impairments associated with digital radio. In addition to adjacent-channel interference described earlier, cochannel interference may be generated by other transmitters assigned to the same frequency band as the desired signal. Cochannel interference can be controlled through frequency (and perhaps time slot) assignments and by pulse shaping. For example, assuming fixed average power, increasing the bandwidth occupied by the signal lowers the power spectral density and decreases the amount of interference into a narrowband system that occupies part of the available bandwidth. Sufficient bandwidth spreading, therefore, enables wideband signals to be overlaid on top of narrowband signals without disrupting either service.

4.6.6 Probability of Intercept and Detection

The broadcast nature of wireless channels generally makes eavesdropping easier than for wired channels. A requirement for most commercial, as well as military applications, is to guarantee the privacy of user conversations (low probability of intercept). An additional requirement, in some applications, is that determining whether or not communications is taking place must be difficult (low probability of detection). Spread-spectrum waveforms are attractive in these applications since spreading the pulse energy over a wide frequency band decreases the power spectral density and, hence, makes the signal less visible. Power-efficient modulation combined with coding enables a further reduction in transmitted power for a target error rate.

4.7 Examples

We conclude this chapter with a brief description of baseband pulse shapes used in existing and emerging standards for digital mobile cellular and Personal Communications Services (PCS).

4.7.1 Global System for Mobile Communications (GSM)

The European GSM standard for digital mobile cellular communications operates in the 900-MHz frequency band, and is based on time-division multiple access (TDMA) [9]. The U.S. version operates at 1900 MHz, and is called PCS-1900. A special variant of binary FSK is used called Gaussian minimum-shift keying (GMSK). The GMSK modulator is illustrated in Fig. 4.13. The input to the modulator is a binary PAM signal s(t), given by Eq. (4.3), where the pulse p(t) is a Gaussian function and |s(t)| < 1. This waveform frequency modulates the carrier fc, so that the (passband) transmitted signal is

\[ w(t) = K\cos\left[2\pi f_c t + 2\pi f_d \int_{-\infty}^{t} s(\tau)\,d\tau\right] \]

The maximum frequency deviation from the carrier is fd = 1/(2T), which characterizes minimum-shift keying. This technique can be used with a noncoherent receiver that is easy to implement. Because the transmitted signal has a constant envelope, the data can be reliably detected in the presence of rapid fades that are characteristic of mobile radio channels.

FIGURE 4.13: Generation of GMSK signal; LPF is low-pass filter.

4.7.2 U.S. Digital Cellular (IS-136)

The IS-136 air interface (formerly IS-54) operates in the 800 MHz band and is based on TDMA [3]. There is also a 1900 MHz version of IS-136. The baseband signal is given by Eq. (4.3) where the symbols are complex-valued, corresponding to quadrature phase modulation. The pulse has a square-root raised cosine spectrum with 35% excess bandwidth.

4.7.3 Interim Standard-95

The IS-95 air interface for digital mobile cellular uses spread-spectrum signalling (CDMA) in the 800 MHz band [10]. There is also a 1900 MHz version of IS-95. The baseband transmitted pulse shapes are analogous to those shown in Fig. 4.2, where the number of square pulses (chips) per bit is 128. To improve spectral efficiency the (wideband) transmitted signal is filtered by an approximation to an ideal low-pass response with a small amount of excess bandwidth. This shapes the chips so that they resemble minimum bandwidth pulses.

4.7.4 Personal Access Communications System (PACS)

Both PACS and the Japanese personal handy phone (PHP) system are TDMA systems which have been proposed for personal communications systems (PCS), and operate near 2 GHz [2]. The baseband signal is given by Eq. (4.3) with four complex symbols representing four-phase quadrature modulation. The baseband pulse has a square-root raised cosine spectrum with 50% excess bandwidth.

Defining Terms

Baseband signal: A signal with frequency content centered around DC.

Equivalent discrete-time transfer function: A discrete-time transfer function (z transform) that relates the transmitted amplitudes to received samples in the absence of noise.

Excess bandwidth: That part of the baseband transmitted spectrum which is not contained within the Nyquist band.

Eye diagram: Superposition of segments of a received PAM signal that indicates the amount of intersymbol interference present.

Frequency-shift keying: A digital modulation technique in which the transmitted pulse is sinusoidal, where the frequency is determined by the source bits.

Intersymbol interference: The additive contribution (interference) to a received sample from transmitted symbols other than the symbol to be detected.

Matched filter: The receiver filter with impulse response equal to the time-reversed, complex conjugate impulse response of the combined transmitter filter-channel impulse response.

Nyquist band: The narrowest frequency band that can support a PAM signal without intersymbol interference (the interval [−1/(2T), 1/(2T)] where 1/T is the symbol rate).

Nyquist criterion: A condition on the overall frequency response of a PAM system that ensures the absence of intersymbol interference.

Orthogonal frequency division multiplexing (OFDM): Modulation technique in which the transmitted signal is the sum of low-bit-rate narrowband digital signals modulated on orthogonal carriers.

Partial-response signalling: A signalling technique in which a controlled amount of intersymbol interference is introduced at the transmitter in order to shape the transmitted spectrum.

Precoding: A transformation of source symbols at the transmitter that compensates for intersymbol interference introduced by the channel.

Pulse amplitude modulation (PAM): A digital modulation technique in which the source bits are mapped to a sequence of amplitudes that modulate a transmitted pulse.

Raised cosine pulse: A pulse shape with Fourier transform that decays to zero according to a raised cosine; see Eq. (4.18). The amount of excess bandwidth is conveniently determined by a single parameter (α).

Spread spectrum: A signalling technique in which the pulse bandwidth is many times wider than the Nyquist bandwidth.

Zero-forcing criterion: A design constraint which specifies that intersymbol interference be eliminated.

References

[1] Bingham, J.A.C., Multicarrier modulation for data transmission: an idea whose time has come. IEEE Commun. Mag., 28(May), 5–14, 1990.
[2] Cox, D.C., Wireless personal communications: what is it? IEEE Personal Comm., 2(2), 20–35, 1995.
[3] Electronic Industries Association/Telecommunications Industry Association. Recommended minimum performance standards for 800 MHz dual-mode mobile stations. Incorp. EIA/TIA 19B, EIA/TIA Project No. 2216, Mar., 1991.
[4] Harashima, H. and Miyakawa, H., Matched-transmission technique for channels with intersymbol interference. IEEE Trans. on Commun., COM-20(Aug.), 774–780, 1972.
[5] Kalet, I. and Saltzberg, B.R., QAM transmission through a companding channel-signal constellations and detection. IEEE Trans. on Comm., 42(2–4), 417–429, 1994.
[6] Kretzmer, E.R., Generalization of a technique for binary data communication. IEEE Trans. Comm. Tech., COM-14(Feb.), 67, 68, 1966.
[7] Lender, A., The duobinary technique for high-speed data transmission. AIEE Trans. on Comm. Electronics, 82(March), 214–218, 1963.
[8] Muller, S.H. and Huber, J.B., A comparison of peak power reduction schemes for OFDM. Proc. GLOBECOM ’97, (Mon.), 1–5, 1997.
[9] Rahnema, M., Overview of the GSM system and protocol architecture. IEEE Commun. Mag., (April), 92–100, 1993.
[10] Telecommunication Industry Association. Mobile station-base station compatibility standard for dual-mode wideband spread spectrum cellular system. TIA/EIA/IS-95-A. May, 1995.
[11] Tomlinson, M., New automatic equalizer employing modulo arithmetic. Electron. Lett., 7(March), 138, 139, 1971.

Further Information

Baseband signalling and pulse shaping are fundamental to the design of any digital communications system and are, therefore, covered in numerous texts on digital communications. For more advanced treatments see E.A. Lee and D.G. Messerschmitt, Digital Communication, Kluwer 1994, and J.G. Proakis, Digital Communications, McGraw-Hill 1995.

1999 by CRC Press LLC


Proakis, J.G. “Channel Equalization” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Channel Equalization

John G. Proakis Northeastern University

5.1 Characterization of Channel Distortion
5.2 Characterization of Intersymbol Interference
5.3 Linear Equalizers
Adaptive Linear Equalizers
5.4 Decision-Feedback Equalizer
5.5 Maximum-Likelihood Sequence Detection
5.6 Conclusions
Defining Terms
References
Further Information

5.1 Characterization of Channel Distortion

Many communication channels, including telephone channels and some radio channels, may be generally characterized as band-limited linear filters. Consequently, such channels are described by their frequency response $C(f)$, which may be expressed as

$$C(f) = A(f)\,e^{j\theta(f)} \tag{5.1}$$

where $A(f)$ is called the amplitude response and $\theta(f)$ is called the phase response. Another characteristic that is sometimes used in place of the phase response is the envelope delay or group delay, which is defined as

$$\tau(f) = -\frac{1}{2\pi}\,\frac{d\theta(f)}{df} \tag{5.2}$$

A channel is said to be nondistorting or ideal if, within the bandwidth $W$ occupied by the transmitted signal, $A(f)$ = const and $\theta(f)$ is a linear function of frequency [or the envelope delay $\tau(f)$ = const]. On the other hand, if $A(f)$ and $\tau(f)$ are not constant within the bandwidth occupied by the transmitted signal, the channel distorts the signal. If $A(f)$ is not constant, the distortion is called amplitude distortion, and if $\tau(f)$ is not constant, the distortion of the transmitted signal is called delay distortion.

As a result of the amplitude and delay distortion caused by the nonideal channel frequency response characteristic $C(f)$, a succession of pulses transmitted through the channel at rates comparable to the bandwidth $W$ are smeared to the point that they are no longer distinguishable as well-defined pulses at the receiving terminal. Instead, they overlap and, thus, we have intersymbol interference (ISI). As an example of the effect of delay distortion on a transmitted pulse, Fig. 5.1(a) illustrates a bandlimited pulse having zeros periodically spaced in time at points labeled $\pm T$, $\pm 2T$, etc. If information is conveyed by the pulse amplitude, as in pulse amplitude modulation (PAM), for example, then one can transmit a sequence of pulses, each of which has a peak at the periodic zeros of the other pulses. Transmission of the pulse through a channel modeled as having a linear envelope delay characteristic $\tau(f)$ [quadratic phase $\theta(f)$], however, results in the received pulse shown in Fig. 5.1(b), which has zero crossings that are no longer periodically spaced. Consequently, a sequence of successive pulses would be smeared into one another, and the peaks of the pulses would no longer be distinguishable. Thus, the channel delay distortion results in intersymbol interference. As will be discussed in this chapter, it is possible to compensate for the nonideal frequency response characteristic of the channel by use of a filter or equalizer at the demodulator. Figure 5.1(c) illustrates the output of a linear equalizer that compensates for the linear distortion in the channel.

The extent of the intersymbol interference on a telephone channel can be appreciated by observing a frequency response characteristic of the channel. Figure 5.2 illustrates the measured average amplitude and delay as a function of frequency for a medium-range (180–725 mi) telephone channel of the switched telecommunications network as given by Duffy and Thatcher, 1971. We observe that the usable band of the channel extends from about 300 Hz to about 3000 Hz. The corresponding impulse response of the average channel is shown in Fig. 5.3. Its duration is about 10 ms. In comparison, the transmitted symbol rates on such a channel may be of the order of 2500 pulses or symbols per second. Hence, intersymbol interference might extend over 20–30 symbols.

Besides telephone channels, there are other physical channels that exhibit some form of time dispersion and, thus, introduce intersymbol interference.
Radio channels, such as short-wave ionospheric propagation (HF), tropospheric scatter, and mobile cellular radio, are three examples of time-dispersive wireless channels. In these channels, time dispersion and, hence, intersymbol interference are the result of multiple propagation paths with different path delays. The number of paths and the relative time delays among the paths vary with time and, for this reason, these radio channels are usually called time-variant multipath channels. The time-variant multipath conditions give rise to a wide variety of frequency response characteristics. Consequently, the frequency response characterization that is used for telephone channels is inappropriate for time-variant multipath channels. Instead, these radio channels are characterized statistically in terms of the scattering function, which, in brief, is a two-dimensional representation of the average received signal power as a function of relative time delay and Doppler frequency (see Proakis [4]). For illustrative purposes, a scattering function measured on a medium-range (150 mi) tropospheric scatter channel is shown in Fig. 5.4. The total time duration (multipath spread) of the channel response is approximately 0.7 µs on the average, and the spread between half-power points in Doppler frequency is a little less than 1 Hz on the strongest path and somewhat larger on the other paths. Typically, if one is transmitting at a rate of $10^7$ symbols/s over such a channel, the multipath spread of 0.7 µs will result in intersymbol interference that spans about seven symbols.
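The two ISI-span figures quoted in this section follow from multiplying the duration of the channel response by the symbol rate. A minimal sketch of that arithmetic is given here; the function name `isi_span_in_symbols` is a hypothetical helper introduced only for illustration, and the product is a rough estimate, not an exact count.

```python
def isi_span_in_symbols(dispersion_seconds, symbol_rate):
    """Rough number of symbol intervals covered by the channel response.

    Assumption: the span is approximated as (response duration) x (symbol rate).
    """
    return dispersion_seconds * symbol_rate


# Telephone channel: about 10 ms impulse response at about 2500 symbols/s.
telephone_span = isi_span_in_symbols(10e-3, 2500)

# Tropospheric scatter channel: 0.7 microsecond multipath spread at 10^7 symbols/s.
tropo_span = isi_span_in_symbols(0.7e-6, 1e7)

print(telephone_span, tropo_span)
```

The first value falls inside the 20–30 symbol range quoted for the telephone channel, and the second reproduces the span of about seven symbols quoted for the tropospheric scatter channel.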

5.2 Characterization of Intersymbol Interference

FIGURE 5.1: Effect of channel distortion: (a) channel input, (b) channel output, (c) equalizer output.

FIGURE 5.2: Average amplitude and delay characteristics of medium-range telephone channel.

In a digital communication system, channel distortion causes intersymbol interference, as illustrated in the preceding section. In this section, we shall present a model that characterizes the ISI. The digital modulation methods to which this treatment applies are PAM, phase-shift keying (PSK), and quadrature amplitude modulation (QAM). The transmitted signal for these three types of modulation may be expressed as

$$s(t) = v_c(t)\cos 2\pi f_c t - v_s(t)\sin 2\pi f_c t = \mathrm{Re}\left[v(t)\,e^{j2\pi f_c t}\right] \tag{5.3}$$

where $v(t) = v_c(t) + j v_s(t)$ is called the equivalent low-pass signal, $f_c$ is the carrier frequency, and Re[ ] denotes the real part of the quantity in brackets. In general, the equivalent low-pass signal is expressed as

$$v(t) = \sum_{n=0}^{\infty} I_n\, g_T(t - nT) \tag{5.4}$$

where $g_T(t)$ is the basic pulse shape that is selected to control the spectral characteristics of the transmitted signal, $\{I_n\}$ is the sequence of transmitted information symbols selected from a signal constellation consisting of $M$ points, and $T$ is the signal interval ($1/T$ is the symbol rate). For PAM, PSK, and QAM, the values of $I_n$ are points from $M$-ary signal constellations. Figure 5.5 illustrates the signal constellations for the case of $M = 8$ signal points. Note that for PAM, the signal constellation is one dimensional. Hence, the equivalent low-pass signal $v(t)$ is real valued, i.e., $v_s(t) = 0$ and $v_c(t) = v(t)$. For $M$-ary ($M > 2$) PSK and QAM, the signal constellations are two dimensional and, hence, $v(t)$ is complex valued.

FIGURE 5.3: Impulse response of average channel with amplitude and delay shown in Fig.5.2.

FIGURE 5.4: Scattering function of a medium-range tropospheric scatter channel.

The signal $s(t)$ is transmitted over a bandpass channel that may be characterized by an equivalent low-pass frequency response $C(f)$. Consequently, the equivalent low-pass received signal can be represented as

$$r(t) = \sum_{n=0}^{\infty} I_n\, h(t - nT) + w(t) \tag{5.5}$$

FIGURE 5.5: M = 8 signal constellations for PAM, PSK, and QAM. Panels: (a) PAM, with amplitude levels −7, −5, −3, −1, 1, 3, 5, 7 and 3-bit labels 000, 001, 011, 010, 110, 111, 101, 100; (b) PSK; (c) QAM.

where $h(t) = g_T(t) * c(t)$, $c(t)$ is the impulse response of the equivalent low-pass channel, the asterisk denotes convolution, and $w(t)$ represents the additive noise in the channel. To characterize the ISI, suppose that the received signal is passed through a receiving filter and then sampled at the rate $1/T$ samples/s. In general, the optimum filter at the receiver is matched to the received signal pulse $h(t)$. Hence, the frequency response of this filter is $H^*(f)$. We denote its output as

$$y(t) = \sum_{n=0}^{\infty} I_n\, x(t - nT) + \nu(t) \tag{5.6}$$

where $x(t)$ is the signal pulse response of the receiving filter, i.e., $X(f) = H(f)H^*(f) = |H(f)|^2$, and $\nu(t)$ is the response of the receiving filter to the noise $w(t)$. Now, if $y(t)$ is sampled at times $t = kT$, $k = 0, 1, 2, \ldots$, we have

$$y(kT) \equiv y_k = \sum_{n=0}^{\infty} I_n\, x(kT - nT) + \nu(kT) = \sum_{n=0}^{\infty} I_n\, x_{k-n} + \nu_k, \qquad k = 0, 1, \ldots \tag{5.7}$$

The sample values $\{y_k\}$ can be expressed as

$$y_k = x_0\left(I_k + \frac{1}{x_0}\sum_{\substack{n=0 \\ n \neq k}}^{\infty} I_n\, x_{k-n}\right) + \nu_k, \qquad k = 0, 1, \ldots \tag{5.8}$$

The term $x_0$ is an arbitrary scale factor, which we set equal to unity for convenience. Then

$$y_k = I_k + \sum_{\substack{n=0 \\ n \neq k}}^{\infty} I_n\, x_{k-n} + \nu_k \tag{5.9}$$

The term $I_k$ represents the desired information symbol at the $k$th sampling instant, the term

$$\sum_{\substack{n=0 \\ n \neq k}}^{\infty} I_n\, x_{k-n} \tag{5.10}$$

represents the ISI, and $\nu_k$ is the additive noise variable at the $k$th sampling instant. The amount of ISI and noise in a digital communications system can be viewed on an oscilloscope. For PAM signals, we can display the received signal $y(t)$ on the vertical input with the horizontal sweep rate set at $1/T$. The resulting oscilloscope display is called an eye pattern because of its resemblance to the human eye. For example, Fig. 5.6 illustrates the eye patterns for binary and four-level PAM modulation. The effect of ISI is to cause the eye to close, thereby reducing the margin for additive noise to cause errors. Figure 5.7 graphically illustrates the effect of ISI in reducing the opening of a binary eye. Note that intersymbol interference distorts the position of the zero crossings and causes a reduction in the eye opening. Thus, it causes the system to be more sensitive to a synchronization error.
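The decomposition in Eq. (5.9) can be checked numerically. The sketch below is illustrative only: the three pulse samples in `x`, the binary symbol sequence, and the helper names are all assumptions chosen for the example, and the noise term is omitted. It computes one output sample, separates the desired symbol from the ISI term of Eq. (5.10), and compares the ISI with the worst case for binary ±1 symbols, which is the sum of the magnitudes of the interfering pulse samples.

```python
def sampled_output(symbols, x, k):
    """Noise-free y_k of Eq. (5.7): sum over n of I_n * x_{k-n}."""
    return sum(i_n * x.get(k - n, 0.0) for n, i_n in enumerate(symbols))


def isi_term(symbols, x, k):
    """ISI term of Eq. (5.10): the same sum with the n = k term left out."""
    return sum(i_n * x.get(k - n, 0.0)
               for n, i_n in enumerate(symbols) if n != k)


# Hypothetical sampled pulse with x_0 = 1 and one interfering sample on each side.
x = {-1: 0.2, 0: 1.0, 1: 0.3}
symbols = [1, -1, -1, 1, 1, -1]   # assumed binary PAM symbols

k = 2
y_k = sampled_output(symbols, x, k)
isi_k = isi_term(symbols, x, k)

# Worst-case ISI magnitude for +/-1 symbols: all interfering samples add in phase.
worst_case_isi = sum(abs(v) for m, v in x.items() if m != 0)

print(y_k, symbols[k], isi_k, worst_case_isi)
```

For these assumed values the ISI at k = 2 is well below the worst case, because the two neighbouring symbols partly cancel each other.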

FIGURE 5.6: Examples of eye patterns for binary and quaternary amplitude shift keying (or PAM).

FIGURE 5.7: Effect of intersymbol interference on eye opening. Labeled features: optimum sampling time, sensitivity to timing error, peak distortion, distortion of zero crossings, noise margin.

For PSK and QAM it is customary to display the eye pattern as a two-dimensional scatter diagram illustrating the sampled values $\{y_k\}$ that represent the decision variables at the sampling instants. Figure 5.8 illustrates such an eye pattern for an 8-PSK signal. In the absence of intersymbol interference and noise, the superimposed signals at the sampling instants would result in eight distinct points corresponding to the eight transmitted signal phases. Intersymbol interference and noise result in a deviation of the received samples $\{y_k\}$ from the desired 8-PSK signal. The larger the intersymbol interference and noise, the larger the scattering of the received signal samples relative to the transmitted signal points.

In practice, the transmitter and receiver filters are designed for zero ISI at the desired sampling times $t = kT$. Thus, if $G_T(f)$ is the frequency response of the transmitter filter and $G_R(f)$ is the frequency response of the receiver filter, then the product $G_T(f)\,G_R(f)$ is designed to yield zero ISI.

FIGURE 5.8: Two-dimensional digital eye patterns.

For example, the product $G_T(f)\,G_R(f)$ may be selected as

$$G_T(f)\,G_R(f) = X_{rc}(f) \tag{5.11}$$

where $X_{rc}(f)$ is the raised-cosine frequency response characteristic, defined as

$$X_{rc}(f) = \begin{cases} T, & 0 \le |f| \le \dfrac{1-\alpha}{2T} \\[2mm] \dfrac{T}{2}\left[1 + \cos\dfrac{\pi T}{\alpha}\left(|f| - \dfrac{1-\alpha}{2T}\right)\right], & \dfrac{1-\alpha}{2T} \le |f| \le \dfrac{1+\alpha}{2T} \\[2mm] 0, & |f| > \dfrac{1+\alpha}{2T} \end{cases} \tag{5.12}$$

where $\alpha$ is called the rolloff factor, which takes values in the range $0 \le \alpha \le 1$, and $1/T$ is the symbol rate. The frequency response $X_{rc}(f)$ is illustrated in Fig. 5.9(a) for $\alpha = 0$, $1/2$, and $1$. Note that when $\alpha = 0$, $X_{rc}(f)$ reduces to an ideal "brick wall," physically nonrealizable frequency response with bandwidth occupancy $1/2T$. The frequency $1/2T$ is called the Nyquist frequency. For $\alpha > 0$, the bandwidth occupied by the desired signal $X_{rc}(f)$ beyond the Nyquist frequency $1/2T$ is called the excess bandwidth, and is usually expressed as a percentage of the Nyquist frequency. For example, when $\alpha = 1/2$, the excess bandwidth is 50%, and when $\alpha = 1$, the excess bandwidth is 100%. The signal pulse $x_{rc}(t)$ having the raised-cosine spectrum is

$$x_{rc}(t) = \frac{\sin(\pi t/T)}{\pi t/T}\;\frac{\cos(\pi \alpha t/T)}{1 - 4\alpha^2 t^2/T^2} \tag{5.13}$$

Figure 5.9(b) illustrates $x_{rc}(t)$ for $\alpha = 0$, $1/2$, and $1$. Note that $x_{rc}(t) = 1$ at $t = 0$ and $x_{rc}(t) = 0$ at $t = kT$, $k = \pm 1, \pm 2, \ldots$. Consequently, at the sampling instants $t = kT$, $k \neq 0$, there is no ISI from adjacent symbols when there is no channel distortion. In the presence of channel distortion, however, the ISI given by Eq. (5.10) is no longer zero, and a channel equalizer is needed to minimize its effect on system performance.
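The zero crossings of the raised-cosine pulse in Eq. (5.13) can be verified directly. The following is a minimal sketch, not a production pulse-shaping routine; the function names are hypothetical, and the handling of the removable singularity at $t = \pm T/(2\alpha)$, where the denominator of Eq. (5.13) vanishes, uses the limiting value $(\pi/4)\,\mathrm{sinc}(1/2\alpha)$, which is obtained by a first-order expansion and is not stated in the text.

```python
import math


def sinc(u):
    """sin(pi*u)/(pi*u), with the limiting value 1 at u = 0."""
    if u == 0.0:
        return 1.0
    return math.sin(math.pi * u) / (math.pi * u)


def raised_cosine_pulse(t, T=1.0, alpha=0.5):
    """Evaluate x_rc(t) of Eq. (5.13) for rolloff factor 0 <= alpha <= 1."""
    u = t / T
    denom = 1.0 - (2.0 * alpha * u) ** 2
    if abs(denom) < 1e-12:
        # Removable singularity at t = +/- T/(2*alpha): use the limiting value.
        return (math.pi / 4.0) * sinc(1.0 / (2.0 * alpha))
    return sinc(u) * math.cos(math.pi * alpha * u) / denom


# Sample the pulse at the symbol instants t = kT for three rolloff factors.
for alpha in (0.0, 0.5, 1.0):
    samples = [raised_cosine_pulse(k * 1.0, T=1.0, alpha=alpha) for k in range(0, 5)]
    print(alpha, samples)
```

Up to floating-point rounding, each list starts with 1 at $k = 0$ and is zero at the other symbol instants, which is the zero-ISI property described above.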

FIGURE 5.9: Pulses having a raised cosine spectrum. Panels: $x_{rc}(t)$ versus $t$ and $X_{rc}(f)$ versus $f$, each for α = 0, 0.5, and 1.

5.3 Linear Equalizers

The most common type of channel equalizer used in practice to reduce ISI is a linear transversal filter with adjustable coefficients $\{c_i\}$, as shown in Fig. 5.10. On channels whose frequency response characteristics are unknown, but time invariant, we may measure the channel characteristics and adjust the parameters of the equalizer; once adjusted, the parameters remain fixed during the transmission of data. Such equalizers are called preset equalizers. On the other hand, adaptive equalizers update their parameters on a periodic basis during the transmission of data and, thus, they are capable of tracking a slowly time-varying channel response.

FIGURE 5.10: Linear transversal filter.

FIGURE 5.11: Block diagram of a system with an equalizer.

First, let us consider the design characteristics for a linear equalizer from a frequency domain viewpoint. Figure 5.11 shows a block diagram of a system that employs a linear filter as a channel equalizer. The demodulator consists of a receiver filter with frequency response $G_R(f)$ in cascade with a channel equalizing filter that has a frequency response $G_E(f)$. As indicated in the preceding section, the receiver filter response $G_R(f)$ is matched to the transmitter response, i.e., $G_R(f) = G_T^*(f)$, and the product $G_R(f)\,G_T(f)$ is usually designed so that there is zero ISI at the sampling instants as, for example, when $G_R(f)\,G_T(f) = X_{rc}(f)$. For the system shown in Fig. 5.11, in which the channel frequency response is not ideal, the desired condition for zero ISI is

$$G_T(f)\,C(f)\,G_R(f)\,G_E(f) = X_{rc}(f) \tag{5.14}$$

where $X_{rc}(f)$ is the desired raised-cosine spectral characteristic. Since $G_T(f)\,G_R(f) = X_{rc}(f)$ by design, the frequency response of the equalizer that compensates for the channel distortion is

$$G_E(f) = \frac{1}{C(f)} = \frac{1}{|C(f)|}\,e^{-j\theta_c(f)} \tag{5.15}$$

Thus, the amplitude response of the equalizer is $|G_E(f)| = 1/|C(f)|$ and its phase response is $\theta_E(f) = -\theta_c(f)$. In this case, the equalizer is said to be the inverse channel filter to the channel response. We note that the inverse channel filter completely eliminates ISI caused by the channel. Since it forces the ISI to be zero at the sampling instants $t = kT$, $k = 0, 1, \ldots$, the equalizer is called a zero-forcing equalizer. Hence, the input to the detector is simply

$$z_k = I_k + \eta_k, \qquad k = 0, 1, \ldots \tag{5.16}$$

where $\eta_k$ represents the additive noise and $I_k$ is the desired symbol.

In practice, the ISI caused by channel distortion is usually limited to a finite number of symbols on either side of the desired symbol. Hence, the number of terms that constitute the ISI in the summation given by Eq. (5.10) is finite. As a consequence, in practice the channel equalizer is implemented as a finite duration impulse response (FIR) filter, or transversal filter, with adjustable tap coefficients $\{c_n\}$, as illustrated in Fig. 5.10. The time delay $\tau$ between adjacent taps may be selected as large as $T$, the symbol interval, in which case the FIR equalizer is called a symbol-spaced equalizer. In this case, the input to the equalizer is the sampled sequence given by Eq. (5.7). We note that when the symbol rate $1/T < 2W$, however, frequencies in the received signal above the folding frequency $1/2T$ are aliased into frequencies below $1/2T$. In this case, the equalizer compensates for the aliased channel-distorted signal. On the other hand, when the time delay $\tau$ between adjacent taps is selected such that $1/\tau \ge 2W > 1/T$, no aliasing occurs and, hence, the inverse channel equalizer compensates for the true channel distortion. Since $\tau < T$, the channel equalizer is said to have fractionally spaced taps and it is called a fractionally spaced equalizer. In practice, $\tau$ is often selected as $\tau = T/2$. Notice that, in this case, the sampling rate at the output of the filter $G_R(f)$ is $2/T$. The impulse response of the FIR equalizer is

$$g_E(t) = \sum_{n=-N}^{N} c_n\,\delta(t - n\tau) \tag{5.17}$$

and the corresponding frequency response is

$$G_E(f) = \sum_{n=-N}^{N} c_n\,e^{-j2\pi f n\tau} \tag{5.18}$$

where $\{c_n\}$ are the $(2N + 1)$ equalizer coefficients and $N$ is chosen sufficiently large so that the equalizer spans the length of the ISI, i.e., $2N + 1 \ge L$, where $L$ is the number of signal samples spanned by the ISI. Since $X(f) = G_T(f)\,C(f)\,G_R(f)$ and $x(t)$ is the signal pulse corresponding to $X(f)$, the equalized output signal pulse is

$$q(t) = \sum_{n=-N}^{N} c_n\,x(t - n\tau) \tag{5.19}$$

The zero-forcing condition can now be applied to the samples of $q(t)$ taken at times $t = mT$. These samples are

$$q(mT) = \sum_{n=-N}^{N} c_n\,x(mT - n\tau), \qquad m = 0, \pm 1, \ldots, \pm N \tag{5.20}$$

Since there are $2N + 1$ equalizer coefficients, we can control only $2N + 1$ sampled values of $q(t)$. Specifically, we may force the conditions

$$q(mT) = \sum_{n=-N}^{N} c_n\,x(mT - n\tau) = \begin{cases} 1, & m = 0 \\ 0, & m = \pm 1, \pm 2, \ldots, \pm N \end{cases} \tag{5.21}$$

which may be expressed in matrix form as $\mathbf{X}\mathbf{c} = \mathbf{q}$, where $\mathbf{X}$ is a $(2N + 1) \times (2N + 1)$ matrix with elements $\{x(mT - n\tau)\}$, $\mathbf{c}$ is the $(2N + 1)$ coefficient vector, and $\mathbf{q}$ is the $(2N + 1)$ column vector with one nonzero element. Thus, we obtain a set of $2N + 1$ linear equations for the coefficients of the zero-forcing equalizer.

We should emphasize that the FIR zero-forcing equalizer does not completely eliminate ISI because it has a finite length. As N is increased, however, the residual ISI can be reduced, and in the limit as N → ∞, the ISI is completely eliminated.
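The residual ISI left by a finite-length zero-forcing equalizer can be seen in a small numerical sketch of Eq. (5.21). The setup here is an assumption made for illustration: symbol-spaced taps ($\tau = T$), $N = 1$, and a hypothetical sampled pulse `x` with one interfering sample on each side. The helper names are also hypothetical, and the linear system is solved with a plain Gaussian elimination to keep the example self-contained.

```python
def solve_linear(a, b):
    """Solve a c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for j in range(col, n + 1):
                m[r][j] -= factor * m[col][j]
    c = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * c[j] for j in range(i + 1, n))
        c[i] = (m[i][n] - tail) / m[i][i]
    return c


def zero_forcing_taps(x, N):
    """Set up X c = q of Eq. (5.21) for symbol-spaced taps and solve for c."""
    idx = list(range(-N, N + 1))
    big_x = [[x.get(m - n, 0.0) for n in idx] for m in idx]   # elements x_{m-n}
    q = [1.0 if m == 0 else 0.0 for m in idx]
    return dict(zip(idx, solve_linear(big_x, q)))


# Hypothetical sampled pulse at the equalizer input (symbol-spaced samples).
x = {-1: 0.2, 0: 1.0, 1: 0.3}
N = 1
c = zero_forcing_taps(x, N)

# Equalized pulse samples q_m = sum_n c_n x_{m-n}, including points outside +/-N.
q = {m: sum(c[n] * x.get(m - n, 0.0) for n in c) for m in range(-3, 4)}
print(c)
print(q)
```

In this sketch the samples at $m = \pm 1$ are forced to zero and the sample at $m = 0$ to one, while the samples at $m = \pm 2$, which lie outside the span controlled by the three taps, remain nonzero. Increasing $N$ pushes this residual further out and makes it smaller.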

EXAMPLE 5.1:

Consider a channel distorted pulse $x(t)$, at the input to the equalizer, given by the expression

$$x(t) = \frac{1}{1 + \left(\dfrac{2t}{T}\right)^2}$$

where $1/T$ is the symbol rate. The pulse is sampled at the rate $2/T$ and equalized by a zero-forcing equalizer. Determine the coefficients of a five-tap zero-forcing equalizer.

Solution 5.1

According to Eq. (5.21), the zero-forcing equalizer must satisfy the equations

$$q(mT) = \sum_{n=-2}^{2} c_n\,x(mT - nT/2) = \begin{cases} 1, & m = 0 \\ 0, & m = \pm 1, \pm 2 \end{cases}$$

The matrix $\mathbf{X}$ with elements $x(mT - nT/2)$ is given as

$$\mathbf{X} = \begin{bmatrix} \frac{1}{5} & \frac{1}{10} & \frac{1}{17} & \frac{1}{26} & \frac{1}{37} \\[1mm] 1 & \frac{1}{2} & \frac{1}{5} & \frac{1}{10} & \frac{1}{17} \\[1mm] \frac{1}{5} & \frac{1}{2} & 1 & \frac{1}{2} & \frac{1}{5} \\[1mm] \frac{1}{17} & \frac{1}{10} & \frac{1}{5} & \frac{1}{2} & 1 \\[1mm] \frac{1}{37} & \frac{1}{26} & \frac{1}{17} & \frac{1}{10} & \frac{1}{5} \end{bmatrix} \tag{5.22}$$

The coefficient vector $\mathbf{c}$ and the vector $\mathbf{q}$ are given as

$$\mathbf{c} = \begin{bmatrix} c_{-2} \\ c_{-1} \\ c_0 \\ c_1 \\ c_2 \end{bmatrix}, \qquad \mathbf{q} = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} \tag{5.23}$$

Then, the linear equations $\mathbf{X}\mathbf{c} = \mathbf{q}$ can be solved by inverting the matrix $\mathbf{X}$. Thus, we obtain

$$\mathbf{c}_{opt} = \mathbf{X}^{-1}\mathbf{q} = \begin{bmatrix} -2.2 \\ 4.9 \\ -3 \\ 4.9 \\ -2.2 \end{bmatrix} \tag{5.24}$$

One drawback to the zero-forcing equalizer is that it ignores the presence of additive noise. As a consequence, its use may result in significant noise enhancement. This is easily seen by noting that in a frequency range where $C(f)$ is small, the channel equalizer $G_E(f) = 1/C(f)$ compensates by placing a large gain in that frequency range. Consequently, the noise in that frequency range is greatly enhanced. An alternative is to relax the zero ISI condition and select the channel equalizer characteristic such that the combined power in the residual ISI and the additive noise at the output of the equalizer is minimized. A channel equalizer that is optimized based on the minimum mean square error (MMSE) criterion accomplishes the desired goal.

To elaborate, let us consider the noise corrupted output of the FIR equalizer, which is

$$z(t) = \sum_{n=-N}^{N} c_n\,y(t - n\tau) \tag{5.25}$$

where $y(t)$ is the input to the equalizer, given by Eq. (5.6). The equalizer output is sampled at times $t = mT$. Thus, we obtain

$$z(mT) = \sum_{n=-N}^{N} c_n\,y(mT - n\tau) \tag{5.26}$$

The desired response at the output of the equalizer at $t = mT$ is the transmitted symbol $I_m$. The error is defined as the difference between $I_m$ and $z(mT)$. Then, the mean square error (MSE) between the actual output sample $z(mT)$ and the desired value $I_m$ is

$$\begin{aligned} \mathrm{MSE} &= E\,|z(mT) - I_m|^2 = E\left|\sum_{n=-N}^{N} c_n\,y(mT - n\tau) - I_m\right|^2 \\ &= \sum_{n=-N}^{N}\sum_{k=-N}^{N} c_n c_k R_Y(n - k) - 2\sum_{k=-N}^{N} c_k R_{IY}(k) + E\left(|I_m|^2\right) \end{aligned} \tag{5.27}$$

where the correlations are defined as

$$\begin{aligned} R_Y(n - k) &= E\left[y^*(mT - n\tau)\,y(mT - k\tau)\right] \\ R_{IY}(k) &= E\left[y(mT - k\tau)\,I_m^*\right] \end{aligned} \tag{5.28}$$

and the expectation is taken with respect to the random information sequence $\{I_m\}$ and the additive noise. The minimum MSE solution is obtained by differentiating Eq. (5.27) with respect to the equalizer coefficients $\{c_n\}$. Thus, we obtain the necessary conditions for the minimum MSE as

$$\sum_{n=-N}^{N} c_n R_Y(n - k) = R_{IY}(k), \qquad k = 0, \pm 1, \pm 2, \ldots, \pm N \tag{5.29}$$

These are the $(2N + 1)$ linear equations for the equalizer coefficients. In contrast to the zero-forcing solution already described, these equations depend on the statistical properties (the autocorrelation) of the noise as well as the ISI through the autocorrelation $R_Y(n)$.

In practice, the autocorrelation matrix $R_Y(n)$ and the crosscorrelation vector $R_{IY}(n)$ are unknown a priori. These correlation sequences can be estimated, however, by transmitting a test signal over the channel and using the time-average estimates

$$\begin{aligned} \hat{R}_Y(n) &= \frac{1}{K}\sum_{k=1}^{K} y^*(kT - n\tau)\,y(kT) \\ \hat{R}_{IY}(n) &= \frac{1}{K}\sum_{k=1}^{K} y(kT - n\tau)\,I_k^* \end{aligned} \tag{5.30}$$

in place of the ensemble averages to solve for the equalizer coefficients given by Eq. (5.29).
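The procedure of estimating the correlations by time averages and then solving Eq. (5.29) can be sketched as follows. Everything specific in this example is an assumption made for illustration: a real-valued binary test sequence, a hypothetical three-sample channel pulse `x`, Gaussian noise of standard deviation 0.1, symbol-spaced taps ($\tau = T$), and $N = 2$. For real signals the conjugates in Eq. (5.30) drop out, and the sketch uses the even symmetry of the autocorrelation to fill the matrix.

```python
import random


def solve_linear(a, b):
    """Solve a c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for j in range(col, n + 1):
                m[r][j] -= factor * m[col][j]
    c = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * c[j] for j in range(i + 1, n))
        c[i] = (m[i][n] - tail) / m[i][i]
    return c


random.seed(1)
N = 2
x = {-1: 0.2, 0: 1.0, 1: 0.3}                       # hypothetical channel pulse
symbols = [random.choice((-1.0, 1.0)) for _ in range(4000)]   # known test signal
y = []
for m in range(len(symbols)):
    s = sum(v * symbols[m - j] for j, v in x.items() if 0 <= m - j < len(symbols))
    y.append(s + random.gauss(0.0, 0.1))            # received samples with noise

# Time-average estimates in the spirit of Eq. (5.30), over indices where all samples exist.
ks = range(2 * N, len(symbols) - 2 * N)
K = len(ks)
r_y = [sum(y[k - lag] * y[k] for k in ks) / K for lag in range(2 * N + 1)]
r_iy = {n: sum(y[k - n] * symbols[k] for k in ks) / K for n in range(-N, N + 1)}

# Linear equations of Eq. (5.29): sum_n c_n R_Y(n - k) = R_IY(k).
taps = list(range(-N, N + 1))
R = [[r_y[abs(n - k)] for n in taps] for k in taps]
d = [r_iy[k] for k in taps]
c = solve_linear(R, d)

# Compare the mean square error before and after equalization.
z = {k: sum(c[i] * y[k - n] for i, n in enumerate(taps)) for k in ks}
mse_raw = sum((y[k] - symbols[k]) ** 2 for k in ks) / K
mse_eq = sum((z[k] - symbols[k]) ** 2 for k in ks) / K
print(c)
print(mse_raw, mse_eq)
```

With these assumed values the equalized mean square error comes out well below the unequalized one, since the five taps remove most of the ISI at the cost of a modest increase in noise.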

5.3.1 Adaptive Linear Equalizers

We have shown that the tap coefficients of a linear equalizer can be determined by solving a set of linear equations. In the zero-forcing optimization criterion, the linear equations are given by Eq. (5.21). On the other hand, if the optimization criterion is based on minimizing the MSE, the optimum equalizer coefficients are determined by solving the set of linear equations given by Eq. (5.29). In both cases, we may express the set of linear equations in the general matrix form

$$\mathbf{B}\mathbf{c} = \mathbf{d} \tag{5.31}$$

where $\mathbf{B}$ is a $(2N + 1) \times (2N + 1)$ matrix, $\mathbf{c}$ is a column vector representing the $2N + 1$ equalizer coefficients, and $\mathbf{d}$ is a $(2N + 1)$-dimensional column vector. The solution of Eq. (5.31) yields

$$\mathbf{c}_{opt} = \mathbf{B}^{-1}\mathbf{d} \tag{5.32}$$

In practical implementations of equalizers, the solution of Eq. (5.31) for the optimum coefficient vector is usually obtained by an iterative procedure that avoids the explicit computation of the inverse of the matrix $\mathbf{B}$. The simplest iterative procedure is the method of steepest descent, in which one begins by choosing arbitrarily the coefficient vector $\mathbf{c}$, say $\mathbf{c}_0$. This initial choice of coefficients corresponds to a point on the criterion function that is being optimized. For example, in the case of the MSE criterion, the initial guess $\mathbf{c}_0$ corresponds to a point on the quadratic MSE surface in the $(2N + 1)$-dimensional space of coefficients. The gradient vector, defined as $\mathbf{g}_0$, which is the derivative of the MSE with respect to the $2N + 1$ filter coefficients, is then computed at this point on the criterion surface, and each tap coefficient is changed in the direction opposite to its corresponding gradient component. The change in the $j$th tap coefficient is proportional to the size of the $j$th gradient component. For example, the gradient vector denoted as $\mathbf{g}_k$, for the MSE criterion, found by taking the derivatives of the MSE with respect to each of the $2N + 1$ coefficients, is

$$\mathbf{g}_k = \mathbf{B}\mathbf{c}_k - \mathbf{d}, \qquad k = 0, 1, 2, \ldots \tag{5.33}$$

Then the coefficient vector $\mathbf{c}_k$ is updated according to the relation

$$\mathbf{c}_{k+1} = \mathbf{c}_k - \Delta\,\mathbf{g}_k \tag{5.34}$$

where $\Delta$ is the step-size parameter for the iterative procedure. To ensure convergence of the iterative procedure, $\Delta$ is chosen to be a small positive number. In such a case, the gradient vector $\mathbf{g}_k$ converges toward zero, i.e., $\mathbf{g}_k \to \mathbf{0}$ as $k \to \infty$, and the coefficient vector $\mathbf{c}_k \to \mathbf{c}_{opt}$, as illustrated in Fig. 5.12 based on two-dimensional optimization. In general, convergence of the equalizer tap coefficients to $\mathbf{c}_{opt}$ cannot be attained in a finite number of iterations with the steepest-descent method. The optimum solution $\mathbf{c}_{opt}$, however, can be approached as closely as desired in a few hundred iterations. In digital communication systems that employ channel equalizers, each iteration corresponds to a time interval for sending one symbol and, hence, a few hundred iterations to achieve convergence to $\mathbf{c}_{opt}$ corresponds to a fraction of a second.
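The iteration of Eqs. (5.33) and (5.34) is short enough to sketch directly. The two-dimensional matrix `B`, the vector `d`, the starting point, and the step size below are hypothetical values chosen for illustration; the only requirement assumed is that `B` is symmetric and positive definite, as an autocorrelation matrix would be.

```python
def steepest_descent(B, d, c0, delta, iterations):
    """Iterate c_{k+1} = c_k - delta * (B c_k - d), following Eqs. (5.33)-(5.34)."""
    c = list(c0)
    for _ in range(iterations):
        # Gradient vector g_k = B c_k - d of Eq. (5.33).
        g = [sum(B[i][j] * c[j] for j in range(len(c))) - d[i]
             for i in range(len(c))]
        # Move each coefficient against its gradient component, Eq. (5.34).
        c = [c_i - delta * g_i for c_i, g_i in zip(c, g)]
    return c


# Hypothetical two-coefficient problem (cf. the two-dimensional case of Fig. 5.12).
B = [[1.0, 0.5],
     [0.5, 1.0]]
d = [1.0, 0.0]

c = steepest_descent(B, d, c0=[0.0, 0.0], delta=0.1, iterations=300)
print(c)   # approaches the exact solution of B c = d, which is [4/3, -2/3]
```

For this example the exact solution is $\mathbf{B}^{-1}\mathbf{d} = [4/3, -2/3]$, and a few hundred iterations bring the coefficients very close to it without ever forming the inverse of $\mathbf{B}$.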

FIGURE 5.12: Examples of convergence characteristics of a gradient algorithm.

Adaptive channel equalization is required for channels whose characteristics change with time. In such a case, the ISI varies with time. The channel equalizer must track such time variations in the channel response and adapt its coefficients to reduce the ISI. In the context of the preceding discussion, the optimum coefficient vector $\mathbf{c}_{opt}$ varies with time due to time variations in the matrix $\mathbf{B}$ and, for the case of the MSE criterion, time variations in the vector $\mathbf{d}$. Under these conditions, the iterative method described can be modified to use estimates of the gradient components. Thus, the algorithm for adjusting the equalizer tap coefficients may be expressed as

$$\hat{\mathbf{c}}_{k+1} = \hat{\mathbf{c}}_k - \Delta\,\hat{\mathbf{g}}_k \tag{5.35}$$

where $\hat{\mathbf{g}}_k$ denotes an estimate of the gradient vector $\mathbf{g}_k$ and $\hat{\mathbf{c}}_k$ denotes the estimate of the tap coefficient vector. In the case of the MSE criterion, the gradient vector $\mathbf{g}_k$ given by Eq. (5.33) may also be expressed as

$$\mathbf{g}_k = -E\left(e_k\,\mathbf{y}_k^*\right)$$

An estimate $\hat{\mathbf{g}}_k$ of the gradient vector at the $k$th iteration is computed as

$$\hat{\mathbf{g}}_k = -e_k\,\mathbf{y}_k^* \tag{5.36}$$

where $e_k$ denotes the difference between the desired output from the equalizer at the $k$th time instant and the actual output $z(kT)$, and $\mathbf{y}_k$ denotes the column vector of $2N + 1$ received signal values contained in the equalizer at time instant $k$. The error signal $e_k$ is expressed as

$$e_k = I_k - z_k \tag{5.37}$$

where $z_k = z(kT)$ is the equalizer output given by Eq. (5.26) and $I_k$ is the desired symbol. Hence, by substituting Eq. (5.36) into Eq. (5.35), we obtain the adaptive algorithm for optimizing the tap coefficients (based on the MSE criterion) as

$$\hat{\mathbf{c}}_{k+1} = \hat{\mathbf{c}}_k + \Delta\,e_k\,\mathbf{y}_k^* \tag{5.38}$$

Since an estimate of the gradient vector is used in Eq. (5.38), the algorithm is called a stochastic gradient algorithm; it is also known as the LMS algorithm. A block diagram of an adaptive equalizer that adapts its tap coefficients according to Eq. (5.38) is illustrated in Fig. 5.13. Note that the difference between the desired output $I_k$ and the actual output $z_k$ from the equalizer is used to form the error signal $e_k$. This error is scaled by the step-size parameter $\Delta$, and the scaled error signal $\Delta e_k$ multiplies the received signal values $\{y(kT - n\tau)\}$ at the $2N + 1$ taps. The products $\Delta e_k\,y^*(kT - n\tau)$ at the $(2N + 1)$ taps are then added to the previous values of the tap coefficients to obtain the updated tap coefficients, according to Eq. (5.38). This computation is repeated as each new symbol is received. Thus, the equalizer coefficients are updated at the symbol rate.
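The per-symbol update described above can be sketched in a few lines. The example assumes real-valued binary symbols (so the conjugate in Eq. (5.38) has no effect), symbol-spaced taps, a hypothetical three-sample channel pulse with Gaussian noise, an all-zero initial coefficient vector, and a step size of 0.016; none of these choices comes from the text, and the function name `lms_train` is introduced only for illustration.

```python
import random


def lms_train(y, desired, N, delta):
    """Run the update of Eq. (5.38) once per symbol on real-valued samples."""
    c = [0.0] * (2 * N + 1)                      # all-zero start (an assumption)
    squared_errors = []
    for k in range(N, len(y) - N):
        window = [y[k - n] for n in range(-N, N + 1)]        # y(kT - n*tau), tau = T
        z_k = sum(c_n * y_n for c_n, y_n in zip(c, window))  # equalizer output
        e_k = desired[k] - z_k                               # error signal, Eq. (5.37)
        c = [c_n + delta * e_k * y_n for c_n, y_n in zip(c, window)]
        squared_errors.append(e_k ** 2)
    return c, squared_errors


random.seed(7)
x = {-1: 0.2, 0: 1.0, 1: 0.3}                    # hypothetical channel pulse
symbols = [random.choice((-1.0, 1.0)) for _ in range(3000)]   # training sequence
y = []
for m in range(len(symbols)):
    s = sum(v * symbols[m - j] for j, v in x.items() if 0 <= m - j < len(symbols))
    y.append(s + random.gauss(0.0, 0.1))

taps, sq = lms_train(y, symbols, N=5, delta=0.016)
early = sum(sq[:100]) / 100      # average squared error over the first 100 updates
late = sum(sq[-500:]) / 500      # average squared error over the last 500 updates
print(early, late)
```

The averaged squared error over the last updates is much smaller than over the first ones, which is the behaviour expected of the stochastic gradient algorithm once the coefficients have settled near their optimum values.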

FIGURE 5.13: Linear adaptive equalizer based on the MSE criterion.

Initially, the adaptive equalizer is trained by the transmission of a known pseudo-random sequence $\{I_m\}$ over the channel. At the demodulator, the equalizer employs the known sequence to adjust its coefficients. Upon initial adjustment, the adaptive equalizer switches from a training mode to a decision-directed mode, in which case the decisions at the output of the detector are sufficiently reliable so that the error signal is formed by computing the difference between the detector output and the equalizer output, i.e.,

$$e_k = \tilde{I}_k - z_k \tag{5.39}$$

where $\tilde{I}_k$ is the output of the detector. In general, decision errors at the output of the detector occur infrequently and, consequently, such errors have little effect on the performance of the tracking algorithm given by Eq. (5.38). A rule of thumb for selecting the step-size parameter so as to ensure convergence and good tracking capabilities in slowly varying channels is

$$\Delta = \frac{1}{5(2N + 1)P_R} \tag{5.40}$$

where $P_R$ denotes the received signal-plus-noise power, which can be estimated from the received signal (see Proakis [4]). The convergence characteristic of the stochastic gradient algorithm in Eq. (5.38) is illustrated in Fig. 5.14. These graphs were obtained from a computer simulation of an 11-tap adaptive equalizer operating on a channel with a rather modest amount of ISI. The input signal-plus-noise power $P_R$ was normalized to unity. The rule of thumb given in Eq. (5.40) for selecting the step size gives $\Delta = 0.018$. The effect of making $\Delta$ too large is illustrated by the large jumps in MSE as shown for $\Delta = 0.115$. As $\Delta$ is decreased, the convergence is slowed somewhat, but a lower MSE is achieved, indicating that the estimated coefficients are closer to $\mathbf{c}_{opt}$.
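The step size quoted for the 11-tap simulation follows directly from Eq. (5.40). A minimal sketch of the calculation is shown here; the function name `lms_step_size` is a hypothetical helper, not part of any library.

```python
def lms_step_size(num_taps, received_power):
    """Rule of thumb of Eq. (5.40): delta = 1 / (5 * (2N + 1) * P_R)."""
    return 1.0 / (5.0 * num_taps * received_power)


# 11-tap equalizer (2N + 1 = 11) with signal-plus-noise power normalized to unity.
delta = lms_step_size(num_taps=11, received_power=1.0)
print(round(delta, 3))
```

The result is 1/55, which rounds to the value of 0.018 quoted in the text. A larger received power or a longer equalizer calls for a proportionally smaller step size.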

FIGURE 5.14: Initial convergence characteristics of the LMS algorithm with different step sizes.

Although we have described in some detail the operation of an adaptive equalizer that is optimized on the basis of the MSE criterion, the operation of an adaptive equalizer based on the zero-forcing method is very similar. The major difference lies in the method for estimating the gradient vectors $\mathbf{g}_k$ at each iteration. A block diagram of an adaptive zero-forcing equalizer is shown in Fig. 5.15. For more details on the tap coefficient update method for a zero-forcing equalizer, the reader is referred to the papers by Lucky [2, 3], and the text by Proakis [4].

FIGURE 5.15: An adaptive zero-forcing equalizer.

5.4 Decision-Feedback Equalizer

The linear filter equalizers described in the preceding section are very effective on channels, such as wire line telephone channels, where the ISI is not severe. The severity of the ISI is directly related to the spectral characteristics and not necessarily to the time span of the ISI. For example, consider the ISI resulting from the two channels that are illustrated in Fig. 5.16. The time span for the ISI in channel A is 5 symbol intervals on each side of the desired signal component, which has a value of 0.72. On the other hand, the time span for the ISI in channel B is one symbol interval on each side of the desired signal component, which has a value of 0.815. The energy of the total response is normalized to unity for both channels. In spite of the shorter ISI span, channel B results in more severe ISI. This is evidenced in the frequency response characteristics of these channels, which are shown in Fig. 5.17. We observe that channel B has a spectral null [the frequency response $C(f) = 0$ for some frequencies in the band $|f| \le W$] at $f = 1/2T$, whereas this does not occur in the case of channel A. Consequently, a linear equalizer will introduce a large gain in its frequency response to compensate for the channel null. Thus, the noise in channel B will be enhanced much more than in channel A. This implies that the performance of the linear equalizer for channel B will be significantly poorer than that for channel A. This fact is borne out by the computer simulation results for the performance of the two linear equalizers shown in Fig. 5.18. Hence, the basic limitation of a linear equalizer is that it performs poorly on channels having spectral nulls. Such channels are often encountered in radio communications, such as ionospheric transmission at frequencies below 30 MHz and mobile radio channels, such as those used for cellular radio communications.
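The spectral null of channel B can be reproduced from the numbers given above, under one added assumption: that the two interfering samples on either side of the 0.815 center sample are equal. Unit total energy then fixes their value, and the sketch below evaluates the symbol-spaced frequency response at $f = 0$ and at $f = 1/2T$. The helper name and the symmetric-tap assumption are introduced for illustration only.

```python
import cmath
import math


def frequency_response(taps, f_times_T):
    """C(f) = sum_n x_n exp(-j 2 pi f n T) for symbol-spaced samples x_n."""
    return sum(v * cmath.exp(-2j * math.pi * f_times_T * n) for n, v in taps.items())


center = 0.815
# Assumption: equal side samples, with the total energy normalized to unity.
side = math.sqrt((1.0 - center ** 2) / 2.0)
channel_b = {-1: side, 0: center, 1: side}

gain_dc = abs(frequency_response(channel_b, 0.0))      # response at f = 0
gain_edge = abs(frequency_response(channel_b, 0.5))    # response at f = 1/(2T)
inverse_filter_gain = 1.0 / gain_edge                  # gain a 1/C(f) equalizer would apply

print(side, gain_dc, gain_edge, inverse_filter_gain)
```

Under this assumption the side samples come out close to 0.41, the response at $f = 1/2T$ is nearly zero, and an inverse channel filter would need a gain of a few hundred at that frequency, which illustrates the noise enhancement described in the paragraph above.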
FIGURE 5.16: Two channels with ISI.

A decision-feedback equalizer (DFE) is a nonlinear equalizer that employs previous decisions to eliminate the ISI caused by previously detected symbols on the current symbol to be detected. A simple block diagram for a DFE is shown in Fig. 5.19. The DFE consists of two filters. The first filter is called a feedforward filter and it is generally a fractionally spaced FIR filter with adjustable tap coefficients. This filter is identical in form to the linear equalizer already described. Its input is the received filtered signal y(t) sampled at some rate that is a multiple of the symbol rate, e.g., at rate 2/T. The second filter is a feedback filter. It is implemented as an FIR filter with symbol-spaced taps having adjustable coefficients. Its input is the set of previously detected symbols. The output of the feedback filter is subtracted from the output of the feedforward filter to form the input to the detector. Thus, we have

$$ z_m = \sum_{n=-N_1}^{0} c_n\, y(mT - n\tau) - \sum_{n=1}^{N_2} b_n\, \tilde{I}_{m-n} \tag{5.41} $$

where $\{c_n\}$ and $\{b_n\}$ are the adjustable coefficients of the feedforward and feedback filters, respectively, $\tilde{I}_{m-n}$, n = 1, 2, ..., $N_2$, are the previously detected symbols, $N_1 + 1$ is the length of the feedforward filter, and $N_2$ is the length of the feedback filter. Based on the input $z_m$, the detector determines which of the possible transmitted symbols is closest in distance to $z_m$. Thus, it makes its decision and outputs $\tilde{I}_m$. What makes the DFE nonlinear is the nonlinear characteristic of the detector that provides the input to the feedback filter. The tap coefficients of the feedforward and feedback filters are selected to optimize some desired performance measure. For mathematical simplicity, the MSE criterion is usually applied, and a stochastic gradient algorithm is commonly used to implement an adaptive DFE.

FIGURE 5.17: Amplitude spectra for (a) channel A shown in Fig. 5.16(a) and (b) channel B shown in Fig. 5.16(b).

Figure 5.20 illustrates the block diagram of an adaptive DFE whose tap coefficients are adjusted by means of the LMS stochastic gradient algorithm. Figure 5.21 illustrates the probability of error performance of the DFE, obtained by computer simulation, for binary PAM transmission over channel B. The gain in performance relative to that of a linear equalizer is clearly evident. We should mention that decision errors from the detector that are fed to the feedback filter have a small effect on the performance of the DFE. In general, a small loss in performance of one to two decibels is possible at error rates below $10^{-2}$, as illustrated in Fig. 5.21, but the decision errors in the feedback filter are not catastrophic.
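The structure of Eq. (5.41) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the adaptive equalizer of Fig. 5.20: the taps are fixed rather than adapted by LMS, the feedforward filter is symbol spaced (τ = T), the symbols are binary PAM (±1), and the function name `dfe_detect` and the two-tap test channel are hypothetical.

```python
def dfe_detect(y, c, b):
    """Fixed-tap, symbol-spaced DFE for binary PAM (symbols +1/-1).

    y: received samples, one per symbol (assumes tau = T)
    c: feedforward taps [c_{-N1}, ..., c_0]
    b: feedback taps [b_1, ..., b_N2]
    """
    n1 = len(c) - 1
    decisions = []
    for m in range(len(y)):
        # Feedforward part: sum over n = -N1..0 of c_n * y_{m-n}
        ff = 0.0
        for k, cn in enumerate(c):
            idx = m - (k - n1)
            if 0 <= idx < len(y):
                ff += cn * y[idx]
        # Feedback part: sum over n = 1..N2 of b_n * (previous decision)
        fb = 0.0
        for n, bn in enumerate(b, start=1):
            if m - n >= 0:
                fb += bn * decisions[m - n]
        z = ff - fb
        decisions.append(1.0 if z >= 0 else -1.0)   # nearest-symbol detector
    return decisions

# Hypothetical channel with a strong postcursor: y_m = I_m + 1.2 * I_{m-1}
symbols = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, 1.0, -1.0]
received = [s + 1.2 * (symbols[k - 1] if k > 0 else 0.0)
            for k, s in enumerate(symbols)]
with_feedback = dfe_detect(received, c=[1.0], b=[1.2])
without_feedback = dfe_detect(received, c=[1.0], b=[])
```

In this noiseless example the feedback filter cancels the postcursor exactly as long as the previous decisions are correct, whereas the same detector without feedback makes errors.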


FIGURE 5.18: Error-rate performance of linear MSE equalizer.

FIGURE 5.19: Block diagram of DFE.

5.5

Maximum-Likelihood Sequence Detection

Although the DFE outperforms a linear equalizer, it is not the optimum equalizer from the viewpoint of minimizing the probability of error in the detection of the information sequence $\{I_k\}$ from the received signal samples $\{y_k\}$ given in Eq. (5.5). In a digital communication system that transmits information over a channel that causes ISI, the optimum detector is a maximum-likelihood symbol sequence detector, which produces at its output the most probable symbol sequence $\{\tilde{I}_k\}$ for the given received sampled sequence $\{y_k\}$. That is, the detector finds the sequence $\{\tilde{I}_k\}$ that maximizes the likelihood function

$$ \Lambda(\{I_k\}) = \ln p(\{y_k\} \mid \{I_k\}) \tag{5.42} $$

where $p(\{y_k\} \mid \{I_k\})$ is the joint probability of the received sequence $\{y_k\}$ conditioned on $\{I_k\}$. The sequence of symbols $\{\tilde{I}_k\}$ that maximizes this joint conditional probability is the maximum-likelihood sequence estimate, and the detector that produces it is called the maximum-likelihood sequence detector. An algorithm that implements maximum-likelihood sequence detection (MLSD) is the Viterbi algorithm, which was originally devised for decoding convolutional codes. For a description of this algorithm in the context of sequence detection in the presence of ISI, the reader is referred to the paper by Forney [1] and the text by Proakis [4].

FIGURE 5.20: Adaptive DFE.

The major drawback of MLSD for channels with ISI is the exponential growth in computational complexity as a function of the span of the ISI. Consequently, MLSD is practical only for channels where the ISI spans only a few symbols and the ISI is severe, in the sense that it causes a severe degradation in the performance of a linear equalizer or a decision-feedback equalizer. For example, Fig. 5.22 illustrates the error probability performance of the Viterbi algorithm for a binary PAM signal transmitted through channel B (see Fig. 5.16). For purposes of comparison, we also illustrate the probability of error for a DFE. Both results were obtained by computer simulation. We observe that the performance of the maximum-likelihood sequence detector is about 4.5 dB better than that of the DFE at an error probability of $10^{-4}$. Hence, this is one example where the ML sequence detector provides a significant performance gain on a channel with a relatively short ISI span.
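To make the idea concrete, the following Python sketch runs a Viterbi search over the ISI trellis. It rests on several assumptions that are not in the text: the noise is additive white Gaussian, so maximizing Eq. (5.42) reduces to minimizing the squared Euclidean distance; the channel taps `h` are known to the receiver; symbols before time zero are taken as zero; and the names `channel_output` and `viterbi_mlsd` are hypothetical.

```python
def channel_output(symbols, h):
    """Noiseless symbol-spaced ISI channel: y_k = sum_j h[j] * I_{k-j}."""
    return [sum(h[j] * symbols[k - j] for j in range(len(h)) if k - j >= 0)
            for k in range(len(symbols))]

def viterbi_mlsd(y, h, alphabet=(-1.0, 1.0)):
    """Return the symbol sequence minimizing sum_k (y_k - sum_j h[j] I_{k-j})^2.

    A state is the tuple of the last L symbols (most recent first), where
    L = len(h) - 1 is the channel memory, so there are len(alphabet)**L states.
    """
    memory = len(h) - 1
    survivors = {(0.0,) * memory: (0.0, [])}      # state -> (metric, path)
    for yk in y:
        updated = {}
        for state, (metric, path) in survivors.items():
            for a in alphabet:
                expected = h[0] * a + sum(h[j] * state[j - 1]
                                          for j in range(1, memory + 1))
                candidate = metric + (yk - expected) ** 2
                nxt = ((a,) + state)[:memory]
                # Keep only the best path entering each state
                if nxt not in updated or candidate < updated[nxt][0]:
                    updated[nxt] = (candidate, path + [a])
        survivors = updated
    return min(survivors.values(), key=lambda mp: mp[0])[1]
```

The number of states grows as the alphabet size raised to the channel memory, which is the exponential complexity noted above. For a three-tap channel with binary symbols there are only four states.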

5.6

Conclusions

Channel equalizers are widely used in digital communication systems to mitigate the effects of ISI caused by channel distortion. Linear equalizers are widely used for high-speed modems that transmit data over telephone channels. For wireless (radio) transmission, such as in mobile cellular communications and interoffice communications, the multipath propagation of the transmitted signal results in severe ISI. Such channels require more powerful equalizers to combat the severe ISI. The decision-feedback equalizer and the MLSD are two nonlinear channel equalizers that are suitable for radio channels with severe ISI.

FIGURE 5.21: Performance of DFE with and without error propagation.

FIGURE 5.22: Comparison of performance between MLSE and decision-feedback equalization for channel B of Fig. 5.16.

Defining Terms

Adaptive equalizer: A channel equalizer whose parameters are updated automatically and adaptively during transmission of data.

Channel equalizer: A device that is used to reduce the effects of channel distortion in a received signal.

Decision-directed mode: Mode for adaptive adjustment of the equalizer coefficients based on the use of the detected symbols at the output of the detector.

Decision-feedback equalizer (DFE): An adaptive equalizer that consists of a feedforward filter and a feedback filter, where the latter is fed with previously detected symbols that are used to eliminate the intersymbol interference due to the tail in the channel impulse response.

Fractionally spaced equalizer: A tapped-delay line channel equalizer in which the delay between adjacent taps is less than the duration of a transmitted symbol.

Intersymbol interference: Interference in a received symbol from adjacent (nearby) transmitted symbols caused by channel distortion in data transmission.

LMS algorithm: See stochastic gradient algorithm.

Maximum-likelihood sequence detector: A detector for estimating the most probable sequence of data symbols by maximizing the likelihood function of the received signal.

Preset equalizer: A channel equalizer whose parameters are fixed (time-invariant) during transmission of data.

Stochastic gradient algorithm: An algorithm for adaptively adjusting the coefficients of an equalizer based on the use of (noise-corrupted) estimates of the gradients.

Symbol-spaced equalizer: A tapped-delay line channel equalizer in which the delay between adjacent taps is equal to the duration of a transmitted symbol.

Training mode: Mode for adjustment of the equalizer coefficients based on the transmission of a known sequence of transmitted symbols.

Zero-forcing equalizer: A channel equalizer whose parameters are adjusted to completely eliminate intersymbol interference in a sequence of transmitted data symbols.

References

[1] Forney, G.D., Jr., Maximum-likelihood sequence estimation of digital sequences in the presence of intersymbol interference. IEEE Trans. Inform. Theory, IT-18, 363–378, May 1972.
[2] Lucky, R.W., Automatic equalization for digital communications. Bell Syst. Tech. J., 44, 547–588, Apr. 1965.
[3] Lucky, R.W., Techniques for adaptive equalization of digital communication. Bell Syst. Tech. J., 45, 255–286, Feb. 1966.
[4] Proakis, J.G., Digital Communications, 3rd ed., McGraw-Hill, New York, 1995.

Further Information

For a comprehensive treatment of adaptive equalization techniques and their performance characteristics, the reader may refer to the book by Proakis [4]. The two papers by Lucky [2, 3] provide a treatment of linear equalizers based on the zero-forcing criterion. Additional information on decision-feedback equalizers may be found in the journal papers “An Adaptive Decision-Feedback Equalizer” by D.A. George, R.R. Bowen, and J.R. Storey, IEEE Transactions on Communications Technology, Vol. COM-19, pp. 281–293, June 1971, and “Feedback Equalization for Fading Dispersive Channels” by P. Monsen, IEEE Transactions on Information Theory, Vol. IT-17, pp. 56–64, January 1971. A thorough treatment of channel equalization based on maximum-likelihood sequence detection is given in the paper by Forney [1].


LoCicero, J.L. & Patel, B.P. “Line Coding” Mobile Communications Handbook Ed. Jerry D. Gibson Boca Raton: CRC Press LLC, 1999


Line Coding

Joseph L. LoCicero
Illinois Institute of Technology

Bhasker P. Patel
Illinois Institute of Technology

6.1 Introduction
6.2 Common Line Coding Formats
Unipolar NRZ (Binary On-Off Keying) • Unipolar RZ • Polar NRZ • Polar RZ [Bipolar, Alternate Mark Inversion (AMI), or Pseudoternary] • Manchester Coding (Split Phase or Digital Biphase)
6.3 Alternate Line Codes
Delay Modulation (Miller Code) • Split Phase (Mark) • Biphase (Mark) • Code Mark Inversion (CMI) • NRZ (I) • Binary N Zero Substitution (BNZS) • High-Density Bipolar N (HDBN) • Ternary Coding
6.4 Multilevel Signalling, Partial Response Signalling, and Duobinary Coding
Multilevel Signalling • Partial Response Signalling and Duobinary Coding
6.5 Bandwidth Comparison
6.6 Concluding Remarks
Defining Terms
References

6.1

Introduction

The terminology line coding originated in telephony with the need to transmit digital information across a copper telephone line; more specifically, binary data over a digital repeatered line. The concept of line coding, however, readily applies to any transmission line or channel. In a digital communication system, there exists a known set of symbols to be transmitted. These can be designated as $\{m_i\}$, i = 1, 2, ..., N, with a probability of occurrence $\{p_i\}$, i = 1, 2, ..., N, where the sequentially transmitted symbols are generally assumed to be statistically independent. The conversion or coding of these abstract symbols into real, temporal waveforms to be transmitted in baseband is the process of line coding. Since the most common type of line coding is for binary data, such a waveform can be succinctly termed a direct format for serial bits. The concentration in this section will be line coding for binary data.

Different channel characteristics, as well as different applications and performance requirements, have provided the impetus for the development and study of various types of line coding [1, 2]. For example, the channel might be ac coupled and, thus, could not support a line code with a dc component or large dc content. Synchronization or timing recovery requirements might necessitate a discrete component at the data rate. The channel bandwidth and crosstalk limitations might dictate the type of line coding employed. Even such factors as the complexity of the encoder and the economy of the decoder could determine the line code chosen.

Each line code has its own distinct properties. Depending on the application, one property may be more important than another. In what follows, we describe, in general, the most desirable features that are considered when choosing a line code. It is commonly accepted [1, 2, 5, 8] that the dominant considerations affecting the choice of a line code are: 1) timing, 2) dc content, 3) power spectrum, 4) performance monitoring, 5) probability of error, and 6) transparency. Each of these is detailed in the following paragraphs.

1) Timing: The waveform produced by a line code should contain enough timing information that the receiver can synchronize with the transmitter and decode the received signal properly. The timing content should be relatively independent of source statistics, i.e., a long string of 1s or 0s should not result in loss of timing or jitter at the receiver.

2) DC content: Since the repeaters used in telephony are ac coupled, it is desirable to have zero dc in the waveform produced by a given line code. If a signal with significant dc content is used in ac coupled lines, it will cause dc wander in the received waveform. That is, the received signal baseline will vary with time. Telephone lines do not pass dc due to ac coupling with transformers and capacitors to eliminate dc ground loops. Because of this, the telephone channel causes a droop in constant signals. This causes dc wander. It can be eliminated by dc restoration circuits, feedback systems, or with specially designed line codes.

3) Power spectrum: The power spectrum and bandwidth of the transmitted signal should be matched to the frequency response of the channel to avoid significant distortion. Also, the power spectrum should be such that most of the energy is contained in as small a bandwidth as possible. The smaller the bandwidth, the higher the transmission efficiency.

4) Performance monitoring: It is very desirable to detect errors caused by a noisy transmission channel. The error detection capability in turn allows performance monitoring while the channel is in use (i.e., without elaborate testing procedures that require suspending use of the channel).

5) Probability of error: The average error probability should be as small as possible for a given transmitter power. This reflects the reliability of the line code.

6) Transparency: A line code should allow all the possible patterns of 1s and 0s. If a certain pattern is undesirable due to other considerations, it should be mapped to a unique alternative pattern.

6.2

Common Line Coding Formats

A line coding format consists of a formal definition of the line code that specifies how a string of binary digits is converted to a line code waveform. There are two major classes of binary line codes: level codes and transition codes. Level codes carry information in their voltage level, which may be high or low for a full bit period or part of the bit period. Level codes are usually instantaneous since they typically encode a binary digit into a distinct waveform, independent of any past binary data. However, some level codes do exhibit memory. Transition codes carry information in the change in level appearing in the line code waveform. Transition codes may be instantaneous, but they generally have memory, using past binary data to dictate the present waveform. There are two common forms of level line codes: one is called return to zero (RZ) and the other is called nonreturn to zero (NRZ). In RZ coding, the level of the pulse returns to zero for a portion of the bit interval. In NRZ coding, the level of the pulse is maintained during the entire bit interval. Line coding formats are further classified according to the polarity of the voltage levels used to represent the data. If only one polarity of voltage level is used, i.e., positive or negative (in addition to the zero level), then it is called unipolar signalling. If both positive and negative voltage levels are being used, with or without a zero voltage level, then it is called polar signalling. The term bipolar signalling is used by some authors to designate a specific line coding scheme with positive, negative, and zero voltage levels. This will be described in detail later in this section. The formal definition of five common line codes is given in the following along with a representative waveform, the power spectral density (PSD), the probability of error, and a discussion of advantages and disadvantages. In some cases specific applications are noted.

6.2.1

Unipolar NRZ (Binary On-Off Keying)

In this line code, a binary 1 is represented by a nonzero voltage level and a binary 0 is represented by a zero voltage level as shown in Fig. 6.1(a). This is an instantaneous level code. The PSD of this code with equally likely 1s and 0s is given by [5, 8]

$$ S_1(f) = \frac{V^2 T}{4}\left(\frac{\sin \pi f T}{\pi f T}\right)^2 + \frac{V^2}{4}\,\delta(f) \tag{6.1} $$

where V is the binary 1 voltage level, T = 1/R is the bit duration, and R is the bit rate in bits per second. The spectrum of unipolar NRZ is plotted in Fig. 6.2a. This PSD is a two-sided even spectrum, although only half of the plot is shown for efficiency of presentation. If the probability of a binary 1 is p, and the probability of a binary 0 is (1 − p), then, in the most general case, the continuous portion of the PSD in Eq. (6.1) is scaled by the factor 4p(1 − p) and the discrete (dc) portion is scaled by the factor $4p^2$. Considering the frequency of the first spectral null as the bandwidth of the waveform, the bandwidth of unipolar NRZ is R in hertz. The error rate performance of this code, for equally likely data, with additive white Gaussian noise (AWGN) and optimum, i.e., matched filter, detection is given by [1, 5]

$$ P_e = \frac{1}{2}\,\mathrm{erfc}\left(\sqrt{\frac{E_b}{2N_0}}\right) \tag{6.2} $$

where $E_b/N_0$ is a measure of the signal-to-noise ratio (SNR) of the received signal. In general, $E_b$ is the energy per bit and $N_0/2$ is the two-sided PSD of the AWGN. More specifically, for unipolar NRZ, $E_b$ is the energy in a binary 1, which is $V^2 T$. The performance of the unipolar NRZ code is plotted in Fig. 6.3.

The principal advantages of unipolar NRZ are ease of generation, since it requires only a single power supply, and a relatively low bandwidth of R Hz. There are quite a few disadvantages of this line code. A loss of synchronization and timing jitter can result with a long sequence of 1s or 0s because no pulse transition is present. The code has no error detection capability and, hence, performance cannot be monitored. There is a significant dc component as well as a dc content. The error rate performance is not as good as that of polar line codes.

6.2.2

Unipolar RZ

FIGURE 6.1: Waveforms for different line codes: (a) unipolar NRZ, (b) unipolar RZ, (c) polar NRZ, (d) bipolar (AMI), (e) Manchester (bi-phase), (f) delay modulation, (g) split phase (mark), (h) split phase (space), (i) bi-phase (mark), (j) bi-phase (space), (k) code mark inversion, (l) NRZ (M), (m) NRZ (S). The example data sequence is 1 0 1 1 0 0 0 1 1 1 0, with the time axis marked in bit intervals from T to 11T.

FIGURE 6.2a: Power spectral density of different line codes, where R = 1/T is the bit rate.

In this line code, a binary 1 is represented by a nonzero voltage level during a portion of the bit duration, usually for half of the bit period, and a zero voltage level for the rest of the bit duration. A binary 0 is represented by a zero voltage level during the entire bit duration. Thus, this is an instantaneous level code. Figure 6.1(b) illustrates a unipolar RZ waveform in which the 1 is represented by a nonzero voltage level for half the bit period. The PSD of this line code, with equally likely binary digits, is given by [5, 6, 8]

$$ S_2(f) = \frac{V^2 T}{16}\left(\frac{\sin(\pi f T/2)}{\pi f T/2}\right)^2 + \frac{V^2}{4\pi^2}\left[\frac{\pi^2}{4}\,\delta(f) + \sum_{n=-\infty}^{\infty} \frac{1}{(2n+1)^2}\,\delta\bigl(f - (2n+1)R\bigr)\right] \tag{6.3} $$

where again V is the binary 1 voltage level, and T = 1/R is the bit period. The spectrum of this code is drawn in Fig. 6.2a. In the most general case, when the probability of a 1 is p, the continuous portion of the PSD in Eq. (6.3) is scaled by the factor 4p(1 − p) and the discrete portion is scaled by the factor $4p^2$. The first null bandwidth of unipolar RZ is 2R Hz. The error rate performance of this line code is the same as that of the unipolar NRZ provided we increase the voltage level of this code such that the energy in binary 1, $E_b$, is the same for both codes. The probability of error is given by Eq. (6.2) and identified in Fig. 6.3. If the voltage level and bit period are the same for unipolar NRZ and unipolar RZ, then the energy in a binary 1 for unipolar RZ will be $V^2 T/2$ and the probability of error is worse by 3 dB. The main advantages of unipolar RZ are, again, ease of generation since it requires a single power supply and the presence of a discrete spectral component at the symbol rate, which allows simple timing recovery. A number of disadvantages exist for this line code. It has a nonzero dc component and nonzero dc content, which can lead to dc wander. A long string of 0s will lack pulse transitions and could lead to loss of synchronization. There is no error detection capability and, hence, performance monitoring is not possible. The bandwidth requirement (2R Hz) is higher than that of NRZ signals. The error rate performance is worse than that of polar line codes.

FIGURE 6.2b: Power spectral density of different line codes, where R = 1/T is the bit rate.

Unipolar NRZ as well as unipolar RZ are examples of pulse/no-pulse type of signalling. In this type of signalling, the pulse for a binary 0, $g_2(t)$, is zero and the pulse for a binary 1 is specified generically as $g_1(t) = g(t)$. Using G(f) as the Fourier transform of g(t), the PSD of pulse/no-pulse signalling is given as [6, 7, 10]

$$ S_{PNP}(f) = p(1-p)R\,|G(f)|^2 + p^2 R^2 \sum_{n=-\infty}^{\infty} |G(nR)|^2\,\delta(f - nR) \tag{6.4} $$

where p is the probability of a binary 1, and R is the bit rate.

6.2.3

Polar NRZ

In this line code, a binary 1 is represented by a positive voltage +V and a binary 0 is represented by a negative voltage −V over the full bit period. This code is also referred to as NRZ (L), since a bit is represented by maintaining a level (L) during its entire period. A polar NRZ waveform is shown in Fig. 6.1(c). This is again an instantaneous level code. Alternatively, a 1 may be represented by a −V voltage level and a 0 by a +V voltage level, without changing the spectral characteristics and performance of the line code. The PSD of this line code with equally likely bits is given by [5, 8]

$$ S_3(f) = V^2 T\left(\frac{\sin \pi f T}{\pi f T}\right)^2 \tag{6.5} $$

This is plotted in Fig. 6.2b. When the probability of a 1 is p, and p is not 0.5, a dc component exists, and the PSD becomes [10]

$$ S_{3p}(f) = 4V^2 T\,p(1-p)\left(\frac{\sin \pi f T}{\pi f T}\right)^2 + V^2 (1-2p)^2\,\delta(f) \tag{6.6} $$

FIGURE 6.3: Bit error probability for different line codes.

The first null bandwidth for this line code is again R Hz, independent of p. The probability of error of this line code when p = 0.5 is given by [1, 5]

$$ P_e = \frac{1}{2}\,\mathrm{erfc}\left(\sqrt{\frac{E_b}{N_0}}\right) \tag{6.7} $$

The performance of polar NRZ is plotted in Fig. 6.3. This is better than the error performance of the unipolar codes by 3 dB. The advantages of polar NRZ include a low-bandwidth requirement, R Hz, comparable to unipolar NRZ, very good error probability, and greatly reduced dc because the waveform has a zero dc component when p = 0.5 even though the dc content is never zero. A few notable disadvantages are that there is no error detection capability, and that a long string of 1s or 0s could result in loss of synchronization, since there are no transitions during the string duration. Two power supplies are required to generate this code.
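The 3-dB difference between Eq. (6.2) and Eq. (6.7) can be checked directly with the complementary error function from the Python standard library. This is a small sketch; the function names are illustrative, and $E_b/N_0$ is passed as a linear ratio rather than in decibels.

```python
import math

def pe_unipolar(eb_n0):
    """Eq. (6.2): unipolar codes with matched-filter detection (linear Eb/N0)."""
    return 0.5 * math.erfc(math.sqrt(eb_n0 / 2.0))

def pe_polar(eb_n0):
    """Eq. (6.7): polar NRZ with matched-filter detection (linear Eb/N0)."""
    return 0.5 * math.erfc(math.sqrt(eb_n0))

def db_to_linear(db):
    """Convert a decibel value to a linear power ratio."""
    return 10.0 ** (db / 10.0)

# Polar NRZ at some Eb/N0 matches unipolar at twice that Eb/N0 (about 3 dB more).
snr = db_to_linear(8.0)
same_error_rate = (pe_polar(snr), pe_unipolar(2.0 * snr))
```

A factor of 2 in $E_b/N_0$ is 10 log10(2), or about 3.01 dB, which is the advantage quoted for polar signalling.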

6.2.4

Polar RZ [Bipolar, Alternate Mark Inversion (AMI), or Pseudoternary]

In this scheme, a binary 1 is represented by alternating the positive and negative voltage levels, which return to zero for a portion of the bit duration, generally half the bit period. A binary 0 is represented by a zero voltage level during the entire bit duration. This line coding scheme is often called alternate mark inversion (AMI) since 1s (marks) are represented by alternating positive and negative pulses. It is also called pseudoternary since three different voltage levels are used to represent binary data. Some authors designate this line code as bipolar RZ (BRZ). An AMI waveform is shown in Fig. 6.1(d). Note that this is a level code with memory. The AMI code is well known for its use in telephony. The PSD of this line code with memory is given by [1, 2, 7]

$$ S_{4p}(f) = 2p(1-p)R\,|G(f)|^2 \left(\frac{1 - \cos 2\pi f T}{1 + (2p-1)^2 + 2(2p-1)\cos 2\pi f T}\right) \tag{6.8} $$

where G(f) is the Fourier transform of the pulse used to represent a binary 1, and p is the probability of a binary 1. When p = 0.5 and square pulses with amplitude ±V and duration T/2 are used to represent binary 1s, the PSD becomes

$$ S_4(f) = \frac{V^2 T}{4}\left(\frac{\sin(\pi f T/2)}{\pi f T/2}\right)^2 \sin^2(\pi f T) \tag{6.9} $$

This PSD is plotted in Fig. 6.2a. The first null bandwidth of this waveform is R Hz. This is true for RZ rectangular pulses, independent of the value of p in Eq. (6.8). The error rate performance of this line code for equally likely binary data is given by [5]

$$ P_e \approx \frac{3}{4}\,\mathrm{erfc}\left(\sqrt{\frac{E_b}{2N_0}}\right), \qquad E_b/N_0 > 2 \tag{6.10} $$

This curve is plotted in Fig. 6.3 and is seen to be no more than 0.5 dB worse than the unipolar codes. The advantages of polar RZ (or AMI, as it is most commonly called) outweigh the disadvantages. This code has no dc component and zero dc content, completely avoiding the dc wander problem. Timing recovery is rather easy since squaring, or full-wave rectifying, this type of signal yields a unipolar RZ waveform with a discrete component at the bit rate, R Hz. Because of the alternating polarity pulses for binary 1s, this code has error detection and, hence, performance monitoring capability. It has a low-bandwidth requirement, R Hz, comparable to unipolar NRZ. The obvious disadvantage is that the error rate performance is worse than that of the unipolar and polar waveforms. A long string of 0s could result in loss of synchronization, and two power supplies are required for this code.
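The alternating-polarity rule and the error detection it enables can be sketched as follows. This is an illustrative Python example working at the symbol level (one value per bit, so the return to zero within the bit is not modeled); the polarity of the first mark and the function names are arbitrary choices, not part of the text.

```python
def ami_encode(bits, amplitude=1.0):
    """Map 0 -> 0 and each 1 -> a pulse whose polarity alternates.

    The first mark is sent as +amplitude (an arbitrary choice).
    """
    levels = []
    last_mark = -amplitude
    for bit in bits:
        if bit:
            last_mark = -last_mark
            levels.append(last_mark)
        else:
            levels.append(0.0)
    return levels

def bipolar_violations(levels):
    """Indices of pulses that repeat the polarity of the previous pulse."""
    violations = []
    previous = 0.0
    for i, level in enumerate(levels):
        if level != 0.0:
            if previous != 0.0 and (level > 0) == (previous > 0):
                violations.append(i)
            previous = level
    return violations

tx = ami_encode([1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0])
rx = list(tx)
rx[2] = 0.0          # a channel error erases the second mark
```

An erased or inverted pulse breaks the alternation, so the receiver can flag the violation without knowing the transmitted data, which is the performance-monitoring property noted above.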

6.2.5

Manchester Coding (Split Phase or Digital Biphase)

In this coding, a binary 1 is represented by a pulse that has positive voltage during the first half of the bit duration and negative voltage during the second half of the bit duration. A binary 0 is represented by a pulse that is negative during the first half of the bit duration and positive during the second half of the bit duration. The negative or positive midbit transition indicates a binary 1 or binary 0, respectively. Thus, a Manchester code is classified as an instantaneous transition code; it has no memory. The code is also called diphase because a square wave with a 0° phase is used to represent a binary 1 and a square wave with a phase of 180° is used to represent a binary 0, or vice versa. This line code is used in Ethernet local area networks (LANs). The waveform for Manchester coding is shown in Fig. 6.1(e). The PSD of a Manchester waveform with equally likely bits is given by [5, 8]

$$ S_5(f) = V^2 T\left(\frac{\sin(\pi f T/2)}{\pi f T/2}\right)^2 \sin^2(\pi f T/2) \tag{6.11} $$

where ±V are used as the positive/negative voltage levels for this code. Its spectrum is plotted in Fig. 6.2b. When the probability p of a binary 1 is not equal to one-half, the continuous portion of the PSD is reduced in amplitude and discrete components appear at integer multiples of the bit rate, R = 1/T. The resulting PSD is [6, 10]

$$ S_{5p}(f) = V^2 T\,4p(1-p)\left(\frac{\sin(\pi f T/2)}{\pi f T/2}\right)^2 \sin^2\left(\frac{\pi f T}{2}\right) + V^2 (1-2p)^2 \sum_{n=-\infty,\; n \neq 0}^{\infty} \left(\frac{2}{n\pi}\right)^2 \delta(f - nR) \tag{6.12} $$

The first null bandwidth of the waveform generated by a Manchester code is 2R Hz. The error rate performance of this waveform when p = 0.5 is the same as that of polar NRZ, given by Eq. (6.7), and plotted in Fig. 6.3. The advantages of this code include a zero dc content on an individual pulse basis, so no pattern of bits can cause dc buildup; midbit transitions are always present, making it easy to extract timing information; and it has good error rate performance, identical to polar NRZ. The main disadvantage of this code is a larger bandwidth than any of the other common codes. Also, it has no error detection capability and, hence, performance monitoring is not possible. Polar NRZ and Manchester coding are examples of the use of pure polar signalling where the pulse for a binary 0, $g_2(t)$, is the negative of the pulse for a binary 1, i.e., $g_2(t) = -g_1(t)$. This is also referred to as an antipodal signal set. For this broad type of polar binary line code, the PSD is given by [10]

$$ S_{BP}(f) = 4p(1-p)R\,|G(f)|^2 + (2p-1)^2 R^2 \sum_{n=-\infty}^{\infty} |G(nR)|^2\,\delta(f - nR) \tag{6.13} $$

where |G(f)| is the magnitude of the Fourier transform of either $g_1(t)$ or $g_2(t)$.
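The Manchester mapping described above is simple enough to express directly. The Python sketch below represents each bit by two half-bit levels; the function names are illustrative, and the polarity convention follows the text (a binary 1 is positive then negative).

```python
def manchester_encode(bits, v=1.0):
    """Two half-bit levels per bit: 1 -> (+V, -V), 0 -> (-V, +V)."""
    levels = []
    for bit in bits:
        levels.extend((v, -v) if bit else (-v, v))
    return levels

def manchester_decode(levels):
    """Recover each bit from the direction of its midbit transition."""
    return [1 if levels[i] > levels[i + 1] else 0
            for i in range(0, len(levels), 2)]

data = [1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0]
waveform = manchester_encode(data)
```

Every bit contributes one positive and one negative half, so the average level is zero for any data pattern, and every bit contains a midbit transition for timing recovery. The cost is twice as many level changes per bit, consistent with the 2R Hz first null bandwidth.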

A further generalization of the PSD of binary line codes can be given, wherein a continuous spectrum and a discrete spectrum are evident. Let a binary 1, with probability p, be represented by $g_1(t)$ over the T = 1/R second bit interval; and let a binary 0, with probability 1 − p, be represented by $g_2(t)$ over the same T second bit interval. The two-sided PSD for this general binary line code is [10]

$$ S_{GB}(f) = p(1-p)R\,|G_1(f) - G_2(f)|^2 + R^2 \sum_{n=-\infty}^{\infty} |pG_1(nR) + (1-p)G_2(nR)|^2\,\delta(f - nR) \tag{6.14} $$

where the Fourier transforms of $g_1(t)$ and $g_2(t)$ are given by $G_1(f)$ and $G_2(f)$, respectively.
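Equation (6.14) can be evaluated numerically to confirm that it reduces to the earlier special cases. The Python sketch below is illustrative: it assumes rectangular pulses centered at t = 0 so that their transforms are real (a common phase factor would not change the magnitudes in Eq. (6.14)), and the helper names are hypothetical.

```python
import math

def sinc(x):
    """sin(pi x)/(pi x), with the limit value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def rect_spectrum(v, width):
    """Transform of a rectangular pulse of height v and the given width,
    centered at t = 0 (assumption made so the transform is real)."""
    return lambda f: v * width * sinc(f * width)

def psd_general(g1, g2, p, rate, f, n_max=3):
    """Eq. (6.14): continuous PSD at f, and the weights of the spectral
    lines at n * rate for n = -n_max..n_max."""
    continuous = p * (1 - p) * rate * abs(g1(f) - g2(f)) ** 2
    lines = {n: rate ** 2 * abs(p * g1(n * rate) + (1 - p) * g2(n * rate)) ** 2
             for n in range(-n_max, n_max + 1)}
    return continuous, lines

V, T = 1.0, 1.0
R = 1.0 / T
nrz = rect_spectrum(V, T)
unipolar_cont, unipolar_lines = psd_general(nrz, lambda f: 0.0, 0.5, R, 0.3)
polar_cont, polar_lines = psd_general(nrz, lambda f: -nrz(f), 0.5, R, 0.3)
```

With $g_2(t) = 0$ the result matches the unipolar NRZ spectrum of Eq. (6.1), including the dc line of weight $V^2/4$; with $g_2(t) = -g_1(t)$ and p = 0.5 it matches Eq. (6.5) and the dc line vanishes.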

6.3

Alternate Line Codes

Most of the line codes discussed thus far were instantaneous level codes. Only AMI had memory, and Manchester was an instantaneous transition code. The alternate line codes presented in this section all have memory. The first four are transition codes, where binary data is represented as the presence or absence of a transition, or by the direction of transition, i.e., positive to negative or vice versa. The last four codes described in this section are level line codes with memory.

6.3.1

Delay Modulation (Miller Code)

In this line code, a binary 1 is represented by a transition at the midbit position, and a binary 0 is represented by no transition at the midbit position. If a 0 is followed by another 0, however, the signal transition also occurs at the end of the bit interval, that is, between the two 0s. An example of delay modulation is shown in Fig. 6.1(f). It is clear that delay modulation is a transition code with memory. This code achieves the goal of providing good timing content without sacrificing bandwidth. The PSD of the Miller code for equally likely data is given by [10]

$$ S_6(f) = \frac{V^2 T}{2(\pi f T)^2 (17 + 8\cos 2\pi f T)} \bigl(23 - 2\cos \pi f T - 22\cos 2\pi f T - 12\cos 3\pi f T + 5\cos 4\pi f T + 12\cos 5\pi f T + 2\cos 6\pi f T - 8\cos 7\pi f T + 2\cos 8\pi f T\bigr) \tag{6.15} $$

This spectrum is plotted in Fig. 6.2b. The advantages of this code are that it requires relatively low bandwidth, since most of the energy is contained in less than 0.5R. However, there is no distinct spectral null within the 2R-Hz band. It has low dc content and no dc component. It has very good timing content, and carrier tracking is easier than with Manchester coding. Error rate performance is comparable to that of the common line codes. One important disadvantage is that it has no error detection capability and, hence, performance cannot be monitored.
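The two transition rules of delay modulation can be sketched as below. This Python example is illustrative: it represents each bit by two half-bit levels, the starting level is chosen arbitrarily as +V, and the function names are hypothetical.

```python
def miller_encode(bits, v=1.0):
    """Delay modulation: a 1 toggles the level at midbit; a 0 has no midbit
    transition, but a 0 that follows a 0 toggles at the bit boundary."""
    level = v                     # arbitrary starting level
    levels = []
    previous_bit = None
    for bit in bits:
        if bit == 0 and previous_bit == 0:
            level = -level        # transition between two consecutive 0s
        first_half = level
        if bit == 1:
            level = -level        # midbit transition for a 1
        levels.extend((first_half, level))
        previous_bit = bit
    return levels

def miller_decode(levels):
    """A bit is 1 exactly when its two half-bit levels differ."""
    return [1 if levels[i] != levels[i + 1] else 0
            for i in range(0, len(levels), 2)]

encoded = miller_encode([1, 0, 0, 1, 1, 0])
```

Because transitions occur at most once per bit interval, the waveform changes level less often than Manchester for the same data, which is consistent with the lower bandwidth noted above.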

6.3.2

Split Phase (Mark)

This code is similar to Manchester in the sense that there are always midbit transitions. Hence, this code is relatively easy to synchronize and has no dc. Unlike Manchester, however, split phase (mark) encodes a binary digit into a midbit transition dependent on the midbit transition in the previous bit period [12]. Specifically, a binary 1 produces a reversal of midbit transition relative to the previous midbit transition. A binary 0 produces no reversal of the midbit transition. Certainly this is a transition code with memory. An example of a split phase (mark) coded waveform is shown in Fig. 6.1(g), where the waveform in the first bit period is chosen arbitrarily. Since this method encodes bits differentially, it avoids the 180° phase ambiguity associated with some line codes. This phase ambiguity may not be an issue in most baseband links but is important if the line code is modulated. Split phase (space) is very similar to split phase (mark), where the roles of the binary 1 and binary 0 are interchanged. An example of a split phase (space) coded waveform is given in Fig. 6.1(h); again, the first bit waveform is arbitrary.

6.3.3 Biphase (Mark)

This code, designated as Bi φ-M, is similar to a Miller code in that a binary 1 is represented by a midbit transition, and a binary 0 has no midbit transition. However, this code always has a transition at the beginning of a bit period [10]. Thus, the code is easy to synchronize and has no dc. An example of Bi φ-M is given in Fig. 6.1(i), where the direction of the transition at t = 0 is arbitrarily chosen. Biphase (space) or Bi φ-S is similar to Bi φ-M, except the role of the binary data is reversed. Here a binary 0 (space) produces a midbit transition, and a binary 1 does not have a midbit transition. A waveform example of Bi φ-S is shown in Fig. 6.1(j). Both Bi φ-S and Bi φ-M are transition codes with memory.

6.3.4 Code Mark Inversion (CMI)

This line code is used as the interface to a Consultative Committee on International Telegraphy and Telephony (CCITT) multiplexer and is very similar to Bi φ-S. A binary 1 is encoded as an NRZ pulse with alternate polarity, +V or −V . A binary 0 is encoded with a definitive midbit transition (or square wave phase) [1]. An example of this waveform is shown in Fig. 6.1(k) where a negative to positive transition (or 180◦ phase) is used for a binary 0. The voltage level of the first binary 1 in this example is chosen arbitrarily. This example waveform is identical to Bi φ-S shown in Fig. 6.1(j), except for the last bit. CMI has good synchronization properties and has no dc.

6.3.5 NRZ (I)

This type of line code uses an inversion (I) to designate binary digits, specifically, a change in level or no change in level. There are two variants of this code, NRZ mark (M) and NRZ space (S) [5, 12]. In NRZ (M), a change of level is used to indicate a binary 1, and no change of level is used to indicate a binary 0. In NRZ (S) a change of level is used to indicate a binary 0, and no change of level is used to indicate a binary 1. Waveforms for NRZ (M) and NRZ (S) are depicted in Fig. 6.1(l) and Fig. 6.1(m), respectively, where the voltage level of the first binary 1 in the example is chosen arbitrarily. These codes are level codes with memory. In general, line codes that use differential encoding, like NRZ (I), are insensitive to 180◦ phase ambiguity. Clock recovery with NRZ (I) is not particularly good, and dc wander is a problem as well. Its bandwidth is comparable to polar NRZ.
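The insensitivity of differential encoding to a 180° phase ambiguity can be checked with a short Python sketch of NRZ (M). The function names and the `start_level` reference are illustrative assumptions, not part of any standard; levels are represented as +1 and −1.

```python
def nrz_m_encode(bits, start_level=-1):
    """NRZ (M): a change of level marks a binary 1, no change marks a 0.

    start_level is the assumed line level before the first bit.
    """
    level = start_level
    levels = []
    for bit in bits:
        if bit == 1:
            level = -level
        levels.append(level)
    return levels


def nrz_m_decode(levels, start_level=-1):
    """Recover bits by comparing each level with the previous one."""
    previous = start_level
    bits = []
    for level in levels:
        bits.append(1 if level != previous else 0)
        previous = level
    return bits


data = [1, 0, 1, 1, 0, 0, 1]
line = nrz_m_encode(data)
inverted = [-level for level in line]   # 180-degree phase (polarity) inversion

print(nrz_m_decode(line) == data)
# Only the first bit depends on the assumed reference level; every later bit
# is decided from a level difference, which a polarity inversion preserves.
print(nrz_m_decode(inverted)[1:] == data[1:])
```

NRZ (S) follows by swapping the roles of 1 and 0 in both functions.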

6.3.6 Binary N Zero Substitution (BNZS)

The common bipolar code AMI has many desirable properties of a line code. Its major limitation, however, is that a long string of zeros can lead to loss of synchronization and timing jitter because there are no pulses in the waveform for relatively long periods of time. Binary N zero substitution (BNZS) attempts to improve AMI by substituting a special code of length N for all strings of N zeros. This special code contains pulses that look like binary 1s but purposely produce violations of the AMI pulse convention. Two consecutive pulses of the same polarity violate the AMI pulse convention, independent of the number of zeros between the two consecutive pulses. These violations can be detected at the receiver, and the special code replaced by N zeros. The special code contains pulses facilitating synchronization even when the original data has long strings of zeros. The special code is chosen such that the desirable properties of AMI coding, i.e., dc balance and error detection capability, are retained despite the AMI pulse convention violations. The only disadvantage of BNZS compared to AMI is a slight increase in crosstalk due to the increased number of pulses and, hence, an increase in the average energy in the code.

Choosing different values of N yields different BNZS codes. The value of N is chosen to meet the timing requirements of the application. In telephony, there are three commonly used BNZS codes: B6ZS, B3ZS, and B8ZS. All BNZS codes are level codes with memory. In a B6ZS code, a string of six consecutive zeros is replaced by one of two special codes according to the rule:

If the last pulse was positive (+), the special code is: 0 + − 0 − +.
If the last pulse was negative (−), the special code is: 0 − + 0 + −.

Here a zero indicates a zero voltage level for the bit period; a plus designates a positive pulse; and a minus indicates a negative pulse. This special code causes two AMI pulse violations: in its second bit position and in its fifth bit position. These violations are easily detected at the receiver and zeros resubstituted. If the number of consecutive zeros is 12, 18, 24, . . . , the substitution is repeated 2, 3, 4, . . . times. Since the number of violations is even, the B6ZS waveform is the same as the AMI waveform outside the special code, i.e., between special code sequences. There are four pulses introduced by the special code that facilitate timing recovery. Also, note that the special code is dc balanced. An example of the B6ZS code is given as follows, where the special code is enclosed in brackets.

Original data:   0 1  0 0 0 0 0 0  1 1 0 1  0 0 0 0 0 0  1 1
B6ZS format:     0 + [0 + − 0 − +] − + 0 − [0 − + 0 + −] + −
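The B6ZS substitution rule can be sketched as a small Python encoder. The sketch assumes symbols are represented as +1, 0, and −1, and that the polarity of the last pulse sent before the sequence (`last_pulse`, a hypothetical parameter) is known; it reproduces the example above when that polarity is negative.

```python
def b6zs_encode(bits, last_pulse=-1):
    """AMI encoder with B6ZS zero substitution (illustrative sketch).

    last_pulse is the assumed polarity of the pulse preceding the sequence.
    """
    bits = list(bits)
    out = []
    i = 0
    while i < len(bits):
        if bits[i:i + 6] == [0] * 6:
            p = last_pulse
            # 0 + - 0 - + after a positive pulse, 0 - + 0 + - after a negative one.
            # Violations fall in the second and fifth positions.
            out.extend([0, p, -p, 0, -p, p])
            # The special code ends on polarity p, so last_pulse is unchanged.
            i += 6
        elif bits[i] == 1:
            last_pulse = -last_pulse      # ordinary AMI alternation
            out.append(last_pulse)
            i += 1
        else:
            out.append(0)
            i += 1
    return out


def to_text(symbols):
    """Render +1, 0, -1 as '+', '0', '-'."""
    return "".join({1: "+", 0: "0", -1: "-"}[s] for s in symbols)


data = [0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1]
encoded = b6zs_encode(data)
print(to_text(encoded))
print(sum(encoded))   # net dc contribution of the encoded block
```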

The computation of the PSD of a B6ZS code is tedious. Its shape is given in Fig. 6.4, for comparison purposes with AMI, for the case of equally likely data.

In a B3ZS code, a string of three consecutive zeros is replaced by either B0V or 00V, where B denotes a pulse obeying the AMI (bipolar) convention and V denotes a pulse violating the AMI convention. B0V or 00V is chosen such that the number of bipolar (B) pulses between the violations is odd. The B3ZS rules are summarized in Table 6.1.

TABLE 6.1 B3ZS Substitution Rules

Number of B Pulses     Polarity of Last   Substitution   Substitution
Since Last Violation   B Pulse            Code           Code Form
Odd                    Negative (−)       00−            00V
Odd                    Positive (+)       00+            00V
Even                   Negative (−)       +0+            B0V
Even                   Positive (+)       −0−            B0V

FIGURE 6.4: Power spectral density of different line codes, where R = 1/T is the bit rate.

Observe that the violation always occurs in the third bit position of the substitution code, and so it can be easily detected and zero replacement made at the receiver. Also, the substitution code selection maintains dc balance. There are either one or two pulses in the substitution code, facilitating synchronization. The error detection capability of AMI is retained in B3ZS because a single channel error would make the number of bipolar pulses between violations even instead of being odd. Unlike B6ZS, the B3ZS waveform between violations may not be the same as the AMI waveform. B3ZS is used in the digital signal-3 (DS-3) signal interface in North America and also in the long distance-4 (LD-4) coaxial transmission system in Canada. Next is an example of a B3ZS code, using the symbols 0, +, and − for a zero level, a positive pulse, and a negative pulse, with each substitution code enclosed in brackets. The two B3ZS rows correspond to an even and an odd number of B pulses since the last violation at the start of the sequence.

Original data:           1 0 0 1  0 0 0  1 1  0 0 0  0 0 1  0 0 0  1
B3ZS format
  Even No. of B pulses:  + 0 0 − [+ 0 +] − + [− 0 −] 0 0 + [0 0 +] −
  Odd No. of B pulses:   + 0 0 − [0 0 −] + − [+ 0 +] 0 0 − [0 0 −] +
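The B3ZS rules of Table 6.1 can be sketched as a Python encoder that tracks two pieces of state: the polarity of the last pulse and the parity of the number of B pulses since the last violation. The function and parameter names are hypothetical, and the initial state (last pulse negative) is an assumption chosen so that the first binary 1 is sent as a positive pulse.

```python
def b3zs_encode(bits, last_pulse=-1, b_count_even=True):
    """AMI encoder with B3ZS substitution (illustrative sketch).

    last_pulse: assumed polarity of the pulse preceding the sequence.
    b_count_even: assumed parity of B pulses since the last violation.
    """
    bits = list(bits)
    out = []
    n_b = 0 if b_count_even else 1    # only the parity of this count matters
    i = 0
    while i < len(bits):
        if bits[i:i + 3] == [0, 0, 0]:
            if n_b % 2 == 1:
                # Odd: 00V, with V repeating the polarity of the last pulse.
                out.extend([0, 0, last_pulse])
            else:
                # Even: B0V, where B obeys AMI and V repeats the polarity of B.
                b = -last_pulse
                out.extend([b, 0, b])
                last_pulse = b
            n_b = 0
            i += 3
        elif bits[i] == 1:
            last_pulse = -last_pulse
            out.append(last_pulse)
            n_b += 1
            i += 1
        else:
            out.append(0)
            i += 1
    return out


def to_text(symbols):
    """Render +1, 0, -1 as '+', '0', '-'."""
    return "".join({1: "+", 0: "0", -1: "-"}[s] for s in symbols)


data = [1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1]
print(to_text(b3zs_encode(data, b_count_even=True)))
print(to_text(b3zs_encode(data, b_count_even=False)))
```

Running the sketch for both initial parities gives the two rows of the B3ZS example.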

The last BNZS code considered here uses N = 8. A B8ZS code is used to provide transparent channels for the Integrated Services Digital Network (ISDN) on T1 lines and is similar to the B6ZS code. Here a string of eight consecutive zeros is replaced by one of two special codes according to the following rule:

If the last pulse was positive (+), the special code is: 0 0 0 + − 0 − +.
If the last pulse was negative (−), the special code is: 0 0 0 − + 0 + −.

There are two bipolar violations in the special codes, at the fourth and seventh bit positions. The code is dc balanced, and the error detection capability of AMI is retained. The waveform between substitutions is the same as that of AMI. If the number of consecutive zeros is 16, 24, . . . , then the substitution is repeated 2, 3, . . . , times.

6.3.7 High-Density Bipolar N (HDBN)

This coding algorithm is a CCITT standard recommended by the Conference of European Posts and Telecommunications Administrations (CEPT), a European standards body. It is quite similar to BNZS coding. It is thus a level code with memory. Whenever there is a string of N + 1 consecutive zeros, they are replaced by a special code of length N + 1 containing AMI violations. Specific codes can be constructed for different values of N. A specific high-density bipolar N (HDBN) code, HDB3, is implemented as a CEPT primary digital signal. It is very similar to the B3ZS code. In this code, a string of four consecutive zeros is replaced by either B00V or 000V. B00V or 000V is chosen such that the number of bipolar (B) pulses between violations is odd. The HDB3 rules are summarized in Table 6.2.

TABLE 6.2 HDB3 Substitution Rules

Number of B Pulses     Polarity of Last   Substitution   Substitution
Since Last Violation   B Pulse            Code           Code Form
Odd                    Negative (−)       000−           000V
Odd                    Positive (+)       000+           000V
Even                   Negative (−)       +00+           B00V
Even                   Positive (+)       −00−           B00V

Here the violation always occurs in the fourth bit position of the substitution code, so that it can be easily detected and zero replacement made at the receiver. Also, the substitution code selection maintains dc balance. There are either one or two pulses in the substitution code, facilitating synchronization. The error detection capability of AMI is retained in HDB3 because a single channel error would make the number of bipolar pulses between violations even instead of being odd.
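Because the violation always sits in the fourth position, zero replacement at the receiver is simple: whenever a pulse repeats the polarity of the previous pulse, that pulse and the three preceding positions are zeros. The Python sketch below pairs such a decoder with an encoder following Table 6.2. The names and the initial state (`last_pulse`, `b_count_even`) are illustrative assumptions, and the decoder is assumed to know the same initial pulse polarity as the encoder.

```python
def hdb3_encode(bits, last_pulse=-1, b_count_even=True):
    """HDB3 encoder following Table 6.2 (illustrative sketch)."""
    bits = list(bits)
    out = []
    n_b = 0 if b_count_even else 1    # parity of B pulses since last violation
    i = 0
    while i < len(bits):
        if bits[i:i + 4] == [0, 0, 0, 0]:
            if n_b % 2 == 1:
                out.extend([0, 0, 0, last_pulse])      # 000V
            else:
                b = -last_pulse
                out.extend([b, 0, 0, b])               # B00V
                last_pulse = b
            n_b = 0
            i += 4
        elif bits[i] == 1:
            last_pulse = -last_pulse
            out.append(last_pulse)
            n_b += 1
            i += 1
        else:
            out.append(0)
            i += 1
    return out


def hdb3_decode(symbols, last_pulse=-1):
    """Replace every substitution code by four zeros (illustrative sketch)."""
    bits = []
    for s in symbols:
        if s != 0 and s == last_pulse:
            # AMI violation: this symbol and the three before it were zeros.
            bits[-3:] = [0, 0, 0]
            bits.append(0)
        else:
            bits.append(1 if s != 0 else 0)
            if s != 0:
                last_pulse = s
    return bits


data = [1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1]
line = hdb3_encode(data)
print(line)
print(hdb3_decode(line) == data)
```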

6.3.8 Ternary Coding

Many line coding schemes employ three symbols or levels to represent only one bit of information, like AMI. Theoretically, it should be possible to transmit information more efficiently with three symbols; specifically, the maximum efficiency is log2 3 ≈ 1.58 bits per symbol. Alternatively, the redundancy in the code signal space can be used to provide better error control. Two examples of ternary coding are described next [1, 2]: pair selected ternary (PST) and 4 binary 3 ternary (4B3T). The PST code has many of the desirable properties of line codes, but its transmission efficiency is still 1 bit per symbol. The 4B3T code also has many of the desirable properties of line codes, and it has increased transmission efficiency.

In the PST code, two consecutive bits, termed a binary pair, are grouped together to form a word. These binary pairs are assigned codewords consisting of two ternary symbols, where each ternary symbol can be +, −, or 0, just as in AMI. There are nine possible ternary codewords. Ternary codewords with identical elements, however, are avoided, i.e., ++, −−, and 00. The remaining six codewords are transmitted using two modes called + mode and − mode. The modes are switched whenever a codeword with a single pulse is transmitted. The PST code and mode switching rules are summarized in Table 6.3.

TABLE 6.3 PST Codeword Assignment and Mode Switching Rules

                Ternary Codewords     Mode
Binary Pair     + Mode     − Mode     Switching
11              +−         +−         No
10              +0         −0         Yes
01              0+         0−         Yes
00              −+         −+         No

PST is designed to maintain dc balance and include a strong timing component. One drawback of this code is that the bits must be framed into pairs. At the receiver, an out-of-frame condition is signalled when unused ternary codewords (++, −−, and 00) are detected. The mode switching property of PST provides error detection capability. PST can be classified as a level code with memory.

If the original data for PST coding contains only 1s or only 0s, an alternating sequence of +− +− · · · is transmitted. As a result, an out-of-frame condition cannot be detected. This problem can be minimized by using the modified PST code as shown in Table 6.4.

TABLE 6.4 Modified PST Codeword Assignment and Mode Switching Rules

                Ternary Codewords     Mode
Binary Pair     + Mode     − Mode     Switching
11              +0         0−         Yes
10              +−         +−         No
01              −+         −+         No
00              0+         −0         Yes
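The PST and modified PST assignments lend themselves to a table-driven encoder. The Python sketch below is a minimal illustration: the dictionaries copy the codeword tables, codewords are written with ASCII `+`, `-`, and `0`, and the starting mode is an assumption. Mode switching is triggered whenever the + mode and − mode codewords differ, which in both tables is exactly the single-pulse case.

```python
# (+ mode codeword, - mode codeword) for each binary pair.
PST = {
    (1, 1): ("+-", "+-"),
    (1, 0): ("+0", "-0"),
    (0, 1): ("0+", "0-"),
    (0, 0): ("-+", "-+"),
}
MODIFIED_PST = {
    (1, 1): ("+0", "0-"),
    (1, 0): ("+-", "+-"),
    (0, 1): ("-+", "-+"),
    (0, 0): ("0+", "-0"),
}


def pst_encode(bits, table=PST, mode="+"):
    """Encode framed binary pairs into ternary codewords (illustrative sketch).

    mode is the assumed starting mode.
    """
    if len(bits) % 2 != 0:
        raise ValueError("PST requires the bits to be framed into pairs")
    words = []
    for i in range(0, len(bits), 2):
        plus_word, minus_word = table[(bits[i], bits[i + 1])]
        words.append(plus_word if mode == "+" else minus_word)
        if plus_word != minus_word:
            # Single-pulse codeword sent: switch modes to keep dc balance.
            mode = "-" if mode == "+" else "+"
    return words


print(pst_encode([1, 0, 0, 1, 1, 1, 0, 0, 1, 0]))
print(pst_encode([1] * 6))                       # all 1s with PST
print(pst_encode([1] * 6, table=MODIFIED_PST))   # all 1s with modified PST
```

With the original table an all-1s input gives the same alternating pattern regardless of pair framing, whereas the modified table produces single-pulse codewords that alternate between modes.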

It is tedious to derive the PSD of a PST coded waveform. Again, Fig. 6.4 shows the PSD of the PST code along with the PSD of AMI and B6ZS for comparison purposes, all for equally likely binary data. Observe that PST has more power than AMI and, thus, a larger amount of energy per bit, which translates into slightly increased crosstalk.

In 4B3T coding, words consisting of four binary digits are mapped into three ternary symbols. Four bits imply 2^4 = 16 possible binary words, whereas three ternary symbols allow 3^3 = 27 possible ternary codewords. The binary-to-ternary conversion in 4B3T ensures dc balance and a strong timing component. The specific codeword assignment is as shown in Table 6.5.

TABLE 6.5 4B3T Codeword Assignment

                       Ternary Codewords
Binary Words    Column 1    Column 2    Column 3
0000            −−−                     +++
0001            −−0                     ++0
0010            −0−                     +0+
0011            0−−                     0++
0100            −−+                     ++−
0101            −+−                     +−+
0110            +−−                     −++
0111            −00                     +00
1000            0−0                     0+0
1001            00−                     00+
1010                        0+−
1011                        0−+
1100                        +0−
1101                        −0+
1110                        +−0
1111                        −+0

There are three types of codewords in Table 6.5, organized into three columns. The codewords in the first column have negative dc, codewords in the second column have zero dc, and those in the third column have positive dc. The encoder monitors the integer variable

$$
I = N_p - N_n, \tag{6.16}
$$

where N_p is the number of positive pulses transmitted and N_n is the number of negative pulses transmitted. Codewords are chosen so that I is driven back toward zero, according to the following rule:

If I < 0, choose the ternary codeword from columns 2 and 3.
If I > 0, choose the ternary codeword from columns 1 and 2.
If I = 0, choose the ternary codeword from column 2, and from column 1 if the previous I > 0 or from column 3 if the previous I < 0.

Note that the ternary codeword 000 is not used, but the remaining 26 codewords are used in a complementary manner. For example, the column 1 codeword for 0001 is −−0, whereas the column 3 codeword is ++0. The maximum transmission efficiency for the 4B3T code is 1.33 bits per symbol compared to 1 bit per symbol for the other line codes. The disadvantages of 4B3T are that framing is required and that performance monitoring is complicated. The 4B3T code is used in the T148 span line developed by ITT Telecommunications. This code allows transmission of 48 channels using only 50% more bandwidth than required by T1 lines, instead of 100% more bandwidth.
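A 4B3T encoder can be sketched in Python from the Table 6.5 codeword assignment. The sketch rests on several assumptions that are not spelled out in the text: codewords are chosen so that the running sum I = Np − Nn is pulled back toward zero (column 1 when I > 0, column 3 when I < 0), the "previous I" used when I = 0 is taken to be the last nonzero value of I, and the encoder starts as if the previous I had been positive. All names are hypothetical.

```python
# Column 1 (negative dc) and column 2 (zero dc) codewords of Table 6.5.
COLUMN_1 = {
    "0000": "---", "0001": "--0", "0010": "-0-", "0011": "0--", "0100": "--+",
    "0101": "-+-", "0110": "+--", "0111": "-00", "1000": "0-0", "1001": "00-",
}
COLUMN_2 = {
    "1010": "0+-", "1011": "0-+", "1100": "+0-",
    "1101": "-0+", "1110": "+-0", "1111": "-+0",
}


def complement(word):
    """Column 3 codeword: the column 1 codeword with polarities swapped."""
    return word.translate(str.maketrans("+-", "-+"))


def disparity(word):
    """Number of positive pulses minus number of negative pulses."""
    return word.count("+") - word.count("-")


def encode_4b3t(bits, previous_sign=1):
    """Encode a bit string; returns (codewords, running sums after each word)."""
    if len(bits) % 4 != 0:
        raise ValueError("4B3T requires the bits to be framed into 4-bit words")
    running = 0
    words, history = [], []
    for i in range(0, len(bits), 4):
        word = bits[i:i + 4]
        if word in COLUMN_2:
            code = COLUMN_2[word]
        else:
            sign = running if running != 0 else previous_sign
            code = COLUMN_1[word] if sign > 0 else complement(COLUMN_1[word])
        running += disparity(code)
        if running != 0:
            previous_sign = running
        words.append(code)
        history.append(running)
    return words, history


words, history = encode_4b3t("0000" * 4 + "1010" + "0111")
print(words)
print(history)
```

Under these assumptions the running sum never leaves the range −3 to +3, which is one way to see the dc balance claimed for the code.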

6.4 Multilevel Signalling, Partial Response Signalling, and Duobinary Coding

Ternary coding, such as 4B3T, is an example of the use of more than two levels to improve the transmission efficiency. To increase the transmission efficiency further, more levels and/or more signal processing is needed. Multilevel signalling allows an improvement in the transmission efficiency at the expense of an increase in the error rate, i.e., more transmitter power will be required to maintain a given probability of error. In partial response signalling, intersymbol interference is deliberately introduced by using pulses that are wider and, hence, require less bandwidth. The controlled amount of interference from each pulse can be removed at the receiver. This improves the transmission efficiency, at the expense of increased complexity. Duobinary coding, a special case of partial response signalling, requires only the minimum theoretical bandwidth of 0.5R Hz. In what follows these techniques are discussed in slightly more detail.

6.4.1 Multilevel Signalling

The number of levels that can be used for a line code is not restricted to two or three. Since more levels or symbols allow higher transmission efficiency, multilevel signalling can be considered in bandwidth-limited applications. Specifically, if the signalling rate or baud rate is R_s and the number of levels used is L, the equivalent transmission bit rate R_b is given by

$$
R_b = R_s \log_2 L. \tag{6.17}
$$

Alternatively, multilevel signalling can be used to reduce the baud rate, which in turn can reduce crosstalk for the same equivalent bit rate. The penalty, however, is that the SNR must increase to achieve the same error rate. The T1G carrier system of AT&T uses multilevel signalling with L = 4 and a baud rate of 3.152 mega-symbols/s to double the capacity of the T1C system from 48 channels to 96 channels. Also, a four-level signalling scheme at 80 kbaud is used to achieve 160 kb/s as a basic rate in a digital subscriber loop (DSL) for ISDN.
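The relation between baud rate, number of levels, and bit rate is easy to check numerically. The short Python sketch below applies it to the two four-level systems mentioned above; the function name is a hypothetical helper.

```python
import math


def equivalent_bit_rate(baud_rate, levels):
    """Equivalent bit rate R_b = R_s * log2(L) for L-level signalling."""
    return baud_rate * math.log2(levels)


# Four-level signalling at 80 kbaud (ISDN basic rate DSL).
print(equivalent_bit_rate(80e3, 4))      # bits per second

# Four-level signalling at 3.152 mega-symbols/s (T1G).
print(equivalent_bit_rate(3.152e6, 4))   # bits per second
```

Each four-level symbol carries two bits, so the first case corresponds to 160 kb/s and the second to 6.304 Mb/s.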

6.4.2 Partial Response Signalling and Duobinary Coding

This class of signalling is also called correlative coding because it purposely introduces a controlled or correlated amount of intersymbol interference in each symbol. At the receiver, the known amount of interference is effectively removed from each symbol. The advantage of this signalling is that wider pulses can be used requiring less bandwidth, but the SNR must be increased to realize a given error rate. Also, errors can propagate unless precoding is used. There are many commonly used partial response signalling schemes, often described in terms of the delay operator D, which represents one signalling interval delay. For example, in (1 + D) signalling the current pulse and the previous pulse are added. The T1D system of AT&T uses (1 + D) signalling with precoding, referred to as duobinary signalling, to convert binary (two level) data into ternary (three level) data at the same rate. This requires the minimum theoretical channel bandwidth without the deleterious effects of intersymbol interference and avoids error propagation. Complete details regarding duobinary coding are found in Lender [9] and Schwartz [11]. Some partial response signalling schemes, such as (1 − D), are used to shape the bandwidth rather than control it. Another interesting example of duobinary coding is (1 − D^2) signalling, which can be analyzed as the product (1 − D)(1 + D). It is used by GTE in its modified T carrier system. AT&T also uses (1 − D^2) with four input levels to achieve an equivalent data rate of 1.544 Mb/s in only a 0.5-MHz bandwidth.
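The role of precoding in (1 + D) signalling can be illustrated with a short Python sketch. It assumes a conventional modulo-2 precoder followed by a mapping of the precoded bits to ±1 levels, and an arbitrary initial precoder state `p_init`; these details and the function names are assumptions for illustration, not a description of the T1D implementation.

```python
def duobinary_encode(bits, p_init=0):
    """Precoded (1 + D) signalling: binary input, ternary output (sketch)."""
    p_prev = p_init                 # assumed initial precoder state
    a_prev = 2 * p_init - 1         # corresponding +/-1 level
    symbols = []
    for bit in bits:
        p = bit ^ p_prev            # precoder: modulo-2 sum with previous output
        a = 2 * p - 1               # map {0, 1} to {-1, +1}
        symbols.append(a + a_prev)  # (1 + D): current plus previous level
        p_prev, a_prev = p, a
    return symbols


def duobinary_decode(symbols):
    """Symbol-by-symbol decision: level 0 means 1, levels +/-2 mean 0."""
    return [1 if s == 0 else 0 for s in symbols]


data = [1, 0, 1, 1, 0, 0, 1]
line = duobinary_encode(data)
print(line)
print(duobinary_decode(line) == data)

# A single corrupted symbol changes only the corresponding decoded bit,
# because each decision uses one received symbol and no past decisions.
corrupted = list(line)
corrupted[2] = 2
print(duobinary_decode(corrupted))
```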

6.5 Bandwidth Comparison

We have provided the PSD expressions for most of the commonly used line codes. The actual bandwidth requirement, however, depends on the pulse shape used and the definition of bandwidth itself. There are many ways to define bandwidth, for example, as a percentage of the total power or the sidelobe suppression relative to the main lobe. Using the first null of the PSD of the code as the definition of bandwidth, Table 6.6 provides a useful bandwidth comparison.

TABLE 6.6 First Null Bandwidth Comparison

Bandwidth   Codes
R           Unipolar NRZ, Polar NRZ, Polar RZ (AMI), BNZS, HDBN, PST
2R          Unipolar RZ, Manchester, Split Phase, CMI

The notable omission in Table 6.6 is delay modulation (Miller code). It does not have a first null in the 2R-Hz band, but most of its power is contained in less than 0.5R Hz.

6.6 Concluding Remarks

An in-depth presentation of line coding, particularly applicable to telephony, has been included in this chapter. The most desirable characteristics of line codes were discussed. We introduced five common line codes and eight alternate line codes. Each line code was illustrated by an example waveform. In most cases expressions for the PSD and the probability of error were given and plotted. Advantages and disadvantages of all codes were included in the discussion, and some specific applications were noted. Line codes for optical fiber channels and networks built around them, such as fiber distributed data interface (FDDI) were not included in this section. A discussion of line codes for optical fiber channels, and other new developments in this topic area can be found in [1, 3, 4].

Defining Terms

Alternate mark inversion (AMI): A popular name for bipolar line coding using three levels: zero, positive, and negative.
Binary N zero substitution (BNZS): A class of coding schemes that attempts to improve AMI line coding.
Bipolar: A particular line coding scheme using three levels: zero, positive, and negative.
Crosstalk: An unwanted signal from an adjacent channel.
DC wander: The dc level variation in the received signal due to a channel that cannot support dc.
Duobinary coding: A coding scheme with binary input and ternary output requiring the minimum theoretical channel bandwidth.
4 Binary 3 Ternary (4B3T): A line coding scheme that maps four binary digits into three ternary symbols.
High-density bipolar N (HDBN): A class of coding schemes that attempts to improve AMI.
Level codes: Line codes carrying information in their voltage levels.
Line coding: The process of converting abstract symbols into real, temporal waveforms to be transmitted through a baseband channel.
Nonreturn to zero (NRZ): A signal that stays at a nonzero level for the entire bit duration.
Pair selected ternary (PST): A coding scheme based on selecting a pair of three level symbols.
Polar: A line coding scheme using both polarities of voltage, with or without a zero level.
Return to zero (RZ): A signal that returns to zero for a portion of the bit duration.
Transition codes: Line codes carrying information in voltage level transitions.
Unipolar: A line coding scheme using only one polarity of voltage, in addition to a zero level.

References

[1] Bellamy, J., Digital Telephony, John Wiley & Sons, New York, NY, 1991.
[2] Bell Telephone Laboratories Technical Staff Members, Transmission Systems for Communications, 4th ed., Western Electric Company, Technical Publications, Winston-Salem, NC, 1970.
[3] Bic, J.C., Duponteil, D., and Imbeaux, J.C., Elements of Digital Communication, John Wiley & Sons, New York, NY, 1991.
[4] Bylanski, P., Digital Transmission Systems, Peter Peregrinus, Herts, England, 1976.
[5] Couch, L.W., Modern Communication Systems: Principles and Applications, Prentice-Hall, Englewood Cliffs, NJ, 1994.
[6] Feher, K., Digital Modulation Techniques in an Interference Environment, EMC Encyclopedia Series, Vol. IX, Don White Consultants, Germantown, MD, 1977.
[7] Gibson, J.D., Principles of Analog and Digital Communications, MacMillan Publishing, New York, NY, 1993.
[8] Lathi, B.P., Modern Digital and Analog Communication Systems, Holt, Rinehart and Winston, Philadelphia, PA, 1989.
[9] Lender, A., Duobinary Techniques for High Speed Data Transmission, IEEE Trans. Commun. Electron., CE-82, 214–218, May 1963.
[10] Lindsey, W.C. and Simon, M.K., Telecommunication Systems Engineering, Prentice-Hall, Englewood Cliffs, NJ, 1973.
[11] Schwartz, M., Information Transmission, Modulation, and Noise, McGraw-Hill, New York, NY, 1980.
[12] Stremler, F.G., Introduction to Communication Systems, Addison-Wesley Publishing, Reading, MA, 1990.

Cherubini, G. “Echo Cancellation” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Echo Cancellation

Giovanni Cherubini IBM Zurich Research Laboratory

7.1 Introduction
7.2 Echo Cancellation for PAM Systems
7.3 Echo Cancellation for QAM Systems
7.4 Echo Cancellation for OFDM Systems
7.5 Summary and Conclusions
References
Further Information

7.1 Introduction

Full-duplex data transmission over a single twisted-pair cable permits the simultaneous flow of information in two directions when the same frequency band is used. Examples of applications of this technique are found in digital communications systems that operate over the telephone network. In a digital subscriber loop, at each end of the full-duplex link, a circuit known as a hybrid separates the two directions of transmission. To avoid signal reflections at the near- and far-end hybrid, a precise knowledge of the line impedance would be required. Since the line impedance depends on line parameters that, in general, are not exactly known, an attenuated and distorted replica of the transmit signal leaks to the receiver input as an echo signal. Data-driven adaptive echo cancellation mitigates the effects of impedance mismatch.

A similar problem is caused by crosstalk in transmission systems over voice-grade unshielded twisted-pair cables for local-area network applications, where multipair cables are used to physically separate the two directions of transmission. Crosstalk is a statistical phenomenon due to randomly varying differential capacitive and inductive coupling between adjacent two-wire transmission lines. At the rates of several megabits per second that are usually considered for local-area network applications, near-end crosstalk (NEXT) represents the dominant disturbance; hence adaptive NEXT cancellation must be performed to ensure reliable communications.

In voiceband data modems, the model for the echo channel is considerably different from the echo model adopted in baseband transmission. The transmitted signal is a passband signal obtained by quadrature amplitude modulation (QAM), and the far-end echo may exhibit significant carrier-phase jitter and carrier-frequency shift, which are caused by signal processing at intermediate points in the telephone network. Therefore, a digital adaptive echo canceller for voiceband modems needs to embody algorithms that account for the presence of such additional impairments.

In this chapter, we describe the echo channel models and adaptive echo canceller structures that are obtained for various digital communications systems, which are classified according to the employed modulation techniques. We also address the tradeoffs between complexity, speed of adaptation, and accuracy of cancellation in adaptive echo cancellers.

7.2 Echo Cancellation for Pulse–Amplitude Modulation (PAM) Systems

The model of a full-duplex baseband data transmission system employing pulse–amplitude modulation (PAM) and adaptive echo cancellation is shown in Fig. 7.1. To describe system operations, we consider one end of the full-duplex link. The configuration of an echo canceller for a PAM transmission system is shown in Fig. 7.2. The transmitted data consist of a sequence $\{a_n\}$ of independent and identically distributed (i.i.d.) real-valued symbols from the M-ary alphabet $A = \{\pm 1, \pm 3, \ldots, \pm(M - 1)\}$. The sequence $\{a_n\}$ is converted into an analog signal by a digital-to-analog (D/A) converter. The conversion to a staircase signal by a zero-order hold D/A converter is described by the frequency response $H_{D/A}(f) = T \sin(\pi f T)/(\pi f T)$, where T is the modulation interval. The D/A converter output is filtered by the analog transmit filter and is input to the channel through the hybrid.

FIGURE 7.1: Model of a full-duplex PAM transmission system.

The signal x(t) at the output of the low-pass analog receive filter has three components, namely, the signal from the far-end transmitter r(t), the echo u(t), and additive Gaussian noise w(t). The signal x(t) is given by

$$
x(t) = r(t) + u(t) + w(t) = \sum_{n=-\infty}^{\infty} a_n^R\, h(t - nT) + \sum_{n=-\infty}^{\infty} a_n\, h_E(t - nT) + w(t), \tag{7.1}
$$

where $\{a_n^R\}$ is the sequence of symbols from the remote transmitter, and $h(t)$ and $h_E(t) = \{h_{D/A} \otimes g_E\}(t)$ are the impulse responses of the overall channel and the echo channel, respectively. In the expression of $h_E(t)$, the function $h_{D/A}(t)$ is the inverse Fourier transform of $H_{D/A}(f)$, and the operator $\otimes$ denotes convolution. The signal obtained after echo cancellation is processed by a detector that outputs the sequence of estimated symbols $\{\hat{a}_n^R\}$.

FIGURE 7.2: Configuration of an echo canceller for a PAM transmission system.

In the case of full-duplex PAM data transmission over multi-pair cables for local-area network applications, where NEXT represents the main disturbance, the configuration of a digital NEXT canceller is obtained from Fig. 7.2, with the echo channel replaced by the crosstalk channel. For these applications, however, instead of mono-duplex transmission, where one pair is used to transmit only in one direction and the other pair to transmit only in the reverse direction, dual-duplex transmission may be adopted. Bi-directional transmission at rate ϱ over two pairs is then accomplished by full-duplex transmission of data streams at rate ϱ/2 over each of the two pairs. The lower modulation rate and/or spectral efficiency required per pair for achieving an aggregate rate equal to ϱ represents an advantage of dual-duplex over mono-duplex transmission. Dual-duplex transmission requires two transmitters and two receivers at each end of a link, as well as separation of the simultaneously transmitted and received signals on each pair, as illustrated in Fig. 7.3. In dual-duplex transceivers it is therefore necessary to suppress echoes returning from the hybrids and impedance discontinuities in the cable, as well as self NEXT, by adaptive digital echo and NEXT cancellation [3]. Although a dual-duplex scheme might appear to require higher implementation complexity than a mono-duplex scheme, it turns out that the two schemes are equivalent in terms of the number of multiply-and-add operations per second that are needed to perform the various filtering operations.

One of the transceivers in a full-duplex link will usually employ an externally provided reference clock for its transmit and receive operations. The other transceiver will extract timing from the received signal, and use this timing for its transmitter operations. This is known as loop timing, also illustrated in Fig. 7.3.
If signals were transmitted in opposite directions with independent clocks, signals received from the remote transmitter would generally shift in phase relative to the also received echo signals. To cope with this effect, some form of interpolation would be required that can significantly increase the transceiver complexity [2].

FIGURE 7.3: Model of a dual-duplex transmission system.

In general, we consider baseband signalling techniques such that the signal at the output of the overall channel has nonnegligible excess bandwidth, i.e., nonnegligible spectral components at frequencies larger than half of the modulation rate, $|f| \ge 1/(2T)$. Therefore, to avoid aliasing, the signal x(t) is sampled at twice the modulation rate or at a higher sampling rate. Assuming a sampling rate equal to m/T, m > 1, the ith sample during the nth modulation interval is given by

$$
\begin{aligned}
x\!\left((nm + i)\frac{T}{m}\right) = x_{nm+i} &= r_{nm+i} + u_{nm+i} + w_{nm+i}, \quad i = 0, \ldots, m - 1 \\
&= \sum_{k=-\infty}^{\infty} h_{km+i}\, a_{n-k}^R + \sum_{k=-\infty}^{\infty} h_{E,km+i}\, a_{n-k} + w_{nm+i},
\end{aligned} \tag{7.2}
$$

where $\{h_{nm+i}, i = 0, \ldots, m - 1\}$ and $\{h_{E,nm+i}, i = 0, \ldots, m - 1\}$ are the discrete-time impulse responses of the overall channel and the echo channel, respectively, and $\{w_{nm+i}, i = 0, \ldots, m - 1\}$ is a sequence of Gaussian noise samples with zero mean and variance $\sigma_w^2$. Equation (7.2) suggests that the sequence of samples $\{x_{nm+i}, i = 0, \ldots, m - 1\}$ be regarded as a set of m interleaved sequences, each with a sampling rate equal to the modulation rate. Similarly, the sequence of echo samples $\{u_{nm+i}, i = 0, \ldots, m - 1\}$ can be regarded as a set of m interleaved sequences that are output by m independent echo channels with discrete-time impulse responses $\{h_{E,nm+i}\}$, $i = 0, \ldots, m - 1$, and an identical sequence $\{a_n\}$ of input symbols [7]. Hence, echo cancellation can be performed by m interleaved echo cancellers, as shown in Fig. 7.4. Since the performance of each canceller is independent of the other m − 1 units, in the remaining part of this section we will consider the operations of a single echo canceller.

The echo canceller generates an estimate $\hat{u}_n$ of the echo signal. If we consider a transversal filter realization, $\hat{u}_n$ is obtained as the inner product of the vector of filter coefficients at time t = nT, $c_n = (c_{n,0}, \ldots, c_{n,N-1})'$, and the vector of signals stored in the echo canceller delay line at the same instant, $a_n = (a_n, \ldots, a_{n-N+1})'$, expressed by

$$
\hat{u}_n = c_n' a_n = \sum_{k=0}^{N-1} c_{n,k}\, a_{n-k}, \tag{7.3}
$$

where $c_n'$ denotes the transpose of the vector $c_n$. The estimate of the echo is subtracted from the received signal. The result is defined as the cancellation error signal

$$
z_n = x_n - \hat{u}_n = x_n - c_n' a_n. \tag{7.4}
$$

The echo attenuation that must be provided by the echo canceller to achieve proper system operation depends on the application. For example, for the Integrated Services Digital Network (ISDN) U-Interface transceiver, the echo attenuation must be larger than 55 dB [10]. It is then required that the echo signals outside of the time span of the echo canceller delay line be negligible, i.e., $h_{E,n} \approx 0$ for $n < 0$ and $n > N-1$.

FIGURE 7.4: A set of m interleaved echo cancellers.

As a measure of system performance, we consider the mean square error $\varepsilon_n^2$ at the output of the echo canceller at time $t = nT$, defined by

$$
\varepsilon_n^2 = E\{z_n^2\},
\tag{7.5}
$$

where $\{z_n\}$ is the error sequence and $E\{\cdot\}$ denotes the expectation operator. For a particular coefficient vector $\mathbf{c}_n$, substitution of Eq. (7.4) into Eq. (7.5) yields

$$
\varepsilon_n^2 = E\{x_n^2\} - 2\,\mathbf{c}_n' \mathbf{q} + \mathbf{c}_n' \mathbf{R}\, \mathbf{c}_n ,
\tag{7.6}
$$

where $\mathbf{q} = E\{x_n \mathbf{a}_n\}$ and $\mathbf{R} = E\{\mathbf{a}_n \mathbf{a}_n'\}$. With the assumption of i.i.d. transmitted symbols, the correlation matrix $\mathbf{R}$ is diagonal. The elements on the diagonal are equal to the variance of the transmitted symbols, $\sigma_a^2 = (M^2 - 1)/3$. The minimum mean square error is given by

$$
\varepsilon_{\min}^2 = E\{x_n^2\} - \mathbf{c}_{\mathrm{opt}}' \mathbf{R}\, \mathbf{c}_{\mathrm{opt}} ,
\tag{7.7}
$$

where the optimum coefficient vector is $\mathbf{c}_{\mathrm{opt}} = \mathbf{R}^{-1}\mathbf{q}$. We note that proper system operation is achieved only if the transmitted symbols are uncorrelated with the symbols from the remote transmitter. If this condition is satisfied, the optimum filter coefficients are given by the values of the discrete-time echo channel impulse response, i.e., $c_{\mathrm{opt},k} = h_{E,k}$, $k = 0, \ldots, N-1$.

By the decision-directed stochastic gradient algorithm, also known as the least mean square (LMS) algorithm, the coefficients of the echo canceller converge in the mean to $\mathbf{c}_{\mathrm{opt}}$. The LMS algorithm for an $N$-tap adaptive linear transversal filter is formulated as follows:

$$
\mathbf{c}_{n+1} = \mathbf{c}_n - \tfrac{1}{2}\,\alpha \nabla_{\mathbf{c}}\{z_n^2\} = \mathbf{c}_n + \alpha z_n \mathbf{a}_n ,
\tag{7.8}
$$

where $\alpha$ is the adaptation gain and

$$
\nabla_{\mathbf{c}}\{z_n^2\} = \left( \frac{\partial z_n^2}{\partial c_{n,0}}, \ldots, \frac{\partial z_n^2}{\partial c_{n,N-1}} \right)' = -2 z_n \mathbf{a}_n
$$

is the gradient of the squared error with respect to the vector of coefficients. The block diagram of an adaptive transversal filter echo canceller is shown in Fig. 7.5.
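The update of Eq. (7.8) can be illustrated with a short simulation. The following is a minimal sketch, not the chapter's implementation: the echo response `h_echo`, the gain `alpha = 0.02`, the quaternary alphabet, and the absence of noise and of a remote signal are all assumptions chosen so that the coefficients can be compared directly with the echo response.

```python
import random

def lms_echo_canceller(symbols, received, num_taps, alpha):
    """Adapt an N-tap transversal echo canceller with the LMS rule of Eq. (7.8).

    Returns the final coefficient vector and the list of cancellation errors z_n.
    """
    c = [0.0] * num_taps            # coefficient vector c_n, started at zero
    delay_line = [0.0] * num_taps   # a_n = (a_n, ..., a_{n-N+1})
    errors = []
    for a, x in zip(symbols, received):
        delay_line = [a] + delay_line[:-1]                          # shift in newest symbol
        u_hat = sum(ck * ak for ck, ak in zip(c, delay_line))       # Eq. (7.3)
        z = x - u_hat                                               # Eq. (7.4)
        c = [ck + alpha * z * ak for ck, ak in zip(c, delay_line)]  # Eq. (7.8)
        errors.append(z)
    return c, errors

# Illustrative setup: every value below is an assumption, not taken from the text.
random.seed(1)
h_echo = [0.5, -0.3, 0.2, 0.1]      # hypothetical echo response h_E, N = 4
symbols = [random.choice([-3, -1, 1, 3]) for _ in range(2000)]  # M = 4, variance 5
received = []
line = [0.0] * len(h_echo)
for a in symbols:
    line = [a] + line[:-1]
    received.append(sum(h * s for h, s in zip(h_echo, line)))   # echo only, no noise

# alpha = 0.02 is well below 2 / (N * sigma_a^2) = 0.1 for this setup.
coeffs, errors = lms_echo_canceller(symbols, received, num_taps=4, alpha=0.02)
```

In this idealized setting the coefficients approach the echo response itself, in line with the statement that the optimum coefficients equal the echo channel impulse response. With a remote signal present the error would not vanish, so a smaller gain would be the more realistic choice.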

FIGURE 7.5: Block diagram of an adaptive transversal filter echo canceller.

If we define the vector $\mathbf{p}_n = \mathbf{c}_{\mathrm{opt}} - \mathbf{c}_n$, the mean square error can be expressed as

$$
\varepsilon_n^2 = \varepsilon_{\min}^2 + \mathbf{p}_n' \mathbf{R}\, \mathbf{p}_n ,
\tag{7.9}
$$

where the term $\mathbf{p}_n' \mathbf{R}\, \mathbf{p}_n$ represents an 'excess mean square distortion' due to the misadjustment of the filter settings. The analysis of the convergence behavior of the excess mean square distortion was first proposed for adaptive equalizers [13] and later extended to adaptive echo cancellers [9]. Under the assumption that the vectors $\mathbf{p}_n$ and $\mathbf{a}_n$ are statistically independent, the dynamics of the mean square error are given by

$$
E\{\varepsilon_n^2\} = \varepsilon_0^2 \left[ 1 - \alpha \sigma_a^2 \left( 2 - \alpha N \sigma_a^2 \right) \right]^n + \frac{2\,\varepsilon_{\min}^2}{2 - \alpha N \sigma_a^2} ,
\tag{7.10}
$$

where $\varepsilon_0^2$ is determined by the initial conditions. The mean square error converges to a finite steady-state value $\varepsilon_\infty^2$ if the stability condition $0 < \alpha < 2/(N\sigma_a^2)$ is satisfied. The optimum adaptation gain that yields fastest convergence at the beginning of the adaptation process is $\alpha_{\mathrm{opt}} = 1/(N\sigma_a^2)$. The corresponding time constant and asymptotic mean square error are $\tau_{\mathrm{opt}} = N$ and $\varepsilon_\infty^2 = 2\varepsilon_{\min}^2$, respectively.

We note that a fixed adaptation gain equal to $\alpha_{\mathrm{opt}}$ could not be adopted in practice, since after echo cancellation the signal from the remote transmitter would be embedded in a residual echo having approximately the same power. If the time constant of the convergence mode is not a critical system parameter, an adaptation gain smaller than $\alpha_{\mathrm{opt}}$ will be adopted to achieve an asymptotic mean square error close to $\varepsilon_{\min}^2$. On the other hand, if fast convergence is required, a variable gain will be chosen.

Several techniques have been proposed to increase the speed of convergence of the LMS algorithm. In particular, for echo cancellation in data transmission, the speed of adaptation is reduced by the presence of the signal from the remote transmitter in the cancellation error. To mitigate this problem, the data signal can be adaptively removed from the cancellation error by a decision-directed algorithm [5].

Modified versions of the LMS algorithm have also been proposed to reduce system complexity. For example, the sign algorithm suggests that only the sign of the error signal be used to compute an approximation of the stochastic gradient [4]. An alternative means to reduce the implementation complexity of an adaptive echo canceller consists in the choice of a filter structure with a lower computational complexity than the transversal filter.

At high data rates, very large scale integration (VLSI) technology is needed for the implementation of transceivers for full-duplex data transmission. High-speed echo cancellers and near-end crosstalk cancellers that do not require multiplications represent an attractive solution because of their low complexity.
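The trade-off between convergence speed and steady-state error described by Eq. (7.10) can be made concrete by evaluating the formula. This is only a sketch: the function name `lms_mse` and the numbers used (32 taps, symbol variance 5, a transient amplitude of 1 and a minimum error of 1e-4) are assumptions for illustration.

```python
def lms_mse(n, eps0_sq, eps_min_sq, alpha, num_taps, var_a):
    """Mean square error after n iterations according to Eq. (7.10)."""
    if not 0 < alpha < 2 / (num_taps * var_a):
        raise ValueError("alpha violates the stability condition")
    decay = 1 - alpha * var_a * (2 - alpha * num_taps * var_a)
    steady_state = 2 * eps_min_sq / (2 - alpha * num_taps * var_a)
    return eps0_sq * decay ** n + steady_state

N, var_a = 32, 5.0                  # assumed: 32 taps, quaternary symbols
alpha_opt = 1 / (N * var_a)         # gain for fastest initial convergence

# Steady state with the optimum gain is twice the minimum error ...
fast = lms_mse(10_000, 1.0, 1e-4, alpha_opt, N, var_a)
# ... while a ten times smaller gain settles closer to the minimum error.
slow = lms_mse(10_000, 1.0, 1e-4, alpha_opt / 10, N, var_a)
```

With the optimum gain the per-iteration decay factor is 1 - 1/N, which is where the time constant of N iterations comes from.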
As an example of an architecture suitable for VLSI implementation, we consider echo cancellation by a distributed-arithmetic filter, where multiplications are replaced by table lookup and shift-and-add operations [12]. By segmenting the echo canceller into filter sections of shorter lengths, various tradeoffs concerning the number of operations per modulation interval and the number of memory locations needed to store the lookup tables are possible. Adaptivity is achieved by updating the values stored in the lookup tables by the LMS algorithm.

To describe the principles of operation of a distributed-arithmetic echo canceller, we assume that the number of elements in the alphabet of input symbols is a power of two, $M = 2^W$. Therefore, each symbol is represented by the vector $(a_n^{(0)}, \ldots, a_n^{(W-1)})$, where $a_n^{(i)}$, $i = 0, \ldots, W-1$, are independent binary random variables, i.e.,

$$
a_n = \sum_{w=0}^{W-1} \left( 2 a_n^{(w)} - 1 \right) 2^w = \sum_{w=0}^{W-1} b_n^{(w)}\, 2^w ,
\tag{7.11}
$$

where $b_n^{(w)} = (2 a_n^{(w)} - 1) \in \{-1, +1\}$. By substituting Eq. (7.11) into Eq. (7.3) and segmenting the delay line of the echo canceller into $L$ sections with $K = N/L$ delay elements each, we obtain

$$
\hat{u}_n = \sum_{\ell=0}^{L-1} \sum_{w=0}^{W-1} 2^w \left[ \sum_{k=0}^{K-1} b^{(w)}_{n-\ell K-k}\, c_{n,\ell K+k} \right] .
\tag{7.12}
$$

Equation (7.12) suggests that the filter output can be computed using a set of $L 2^K$ values that are stored in $L$ tables with $2^K$ memory locations each. The binary vectors $\mathbf{a}^{(w)}_{n,\ell} = (a^{(w)}_{n-(\ell+1)K+1}, \ldots, a^{(w)}_{n-\ell K})$, $w = 0, \ldots, W-1$, $\ell = 0, \ldots, L-1$, determine the addresses of the memory locations where the values that are needed to compute the filter output are stored. The filter output is obtained by $WL$ table lookup and shift-and-add operations.

We observe that $\mathbf{a}^{(w)}_{n,\ell}$ and its binary complement $\bar{\mathbf{a}}^{(w)}_{n,\ell}$ select two values that differ only in their sign. This symmetry is exploited to halve the number of values to be stored. To determine the output of a distributed-arithmetic filter with reduced memory size, we reformulate Eq. (7.12) as

$$
\hat{u}_n = \sum_{\ell=0}^{L-1} \sum_{w=0}^{W-1} 2^w\, b^{(w)}_{n-\ell K-k_0} \left[ c_{n,\ell K+k_0} + b^{(w)}_{n-\ell K-k_0} \sum_{\substack{k=0 \\ k \ne k_0}}^{K-1} b^{(w)}_{n-\ell K-k}\, c_{n,\ell K+k} \right] ,
\tag{7.13}
$$

where $k_0$ can be any element of the set $\{0, \ldots, K-1\}$. In the following, we take $k_0 = 0$. Then the binary symbols $b^{(w)}_{n-\ell K}$ determine whether the selected values are to be added or subtracted. Each table now has $2^{K-1}$ memory locations, and the filter output is given by

$$
\hat{u}_n = \sum_{\ell=0}^{L-1} \sum_{w=0}^{W-1} 2^w\, b^{(w)}_{n-\ell K}\, d_n\!\left( i^{(w)}_{n,\ell}, \ell \right) ,
\tag{7.14}
$$

where $d_n(k, \ell)$, $k = 0, \ldots, 2^{K-1}-1$, $\ell = 0, \ldots, L-1$, are the lookup values, and $i^{(w)}_{n,\ell}$, $w = 0, \ldots, W-1$, $\ell = 0, \ldots, L-1$, are the lookup indices computed as follows:

$$
i^{(w)}_{n,\ell} =
\begin{cases}
\displaystyle\sum_{k=1}^{K-1} a^{(w)}_{n-\ell K-k}\, 2^{k-1} & \text{if } a^{(w)}_{n-\ell K} = 1 \\[2ex]
\displaystyle\sum_{k=1}^{K-1} \bar{a}^{(w)}_{n-\ell K-k}\, 2^{k-1} & \text{if } a^{(w)}_{n-\ell K} = 0 .
\end{cases}
\tag{7.15}
$$
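The reduced-memory lookup of Eqs. (7.14) and (7.15) can be checked against a direct transversal filter. The sketch below is an illustration under assumptions: the helper names (`symbol_bits`, `build_tables`, `da_output`), the sizes N = 8, L = 2, W = 2, and the random coefficients are all hypothetical, and the tables are filled from a known coefficient vector rather than adapted.

```python
import random

def symbol_bits(a, W):
    """Bits a^(w) such that a = sum_w (2 * a^(w) - 1) * 2**w, as in Eq. (7.11)."""
    u = (a + (1 << W) - 1) // 2          # offset-binary value of the symbol
    return [(u >> w) & 1 for w in range(W)]

def build_tables(c, L, K):
    """Fill L tables of 2**(K-1) values, with k0 = 0 as in Eq. (7.14)."""
    tables = []
    for l in range(L):
        table = []
        for i in range(1 << (K - 1)):
            value = c[l * K]
            for k in range(1, K):
                bit = (i >> (k - 1)) & 1
                value += (2 * bit - 1) * c[l * K + k]
            table.append(value)
        tables.append(table)
    return tables

def da_output(tables, delay_line, L, K, W):
    """Filter output by table lookup and shift-and-add; delay_line[k] = a_{n-k}."""
    bits = [symbol_bits(a, W) for a in delay_line]
    out = 0.0
    for l in range(L):
        for w in range(W):
            lead = bits[l * K][w]            # a^(w)_{n - lK} selects add or subtract
            index = 0
            for k in range(1, K):
                bit = bits[l * K + k][w]
                if lead == 0:                # use complemented bits, Eq. (7.15)
                    bit = 1 - bit
                index |= bit << (k - 1)
            out += (1 << w) * (2 * lead - 1) * tables[l][index]
    return out

random.seed(0)
N, L, W = 8, 2, 2                            # assumed sizes; K = N / L = 4
K = N // L
c = [random.uniform(-1, 1) for _ in range(N)]
delay_line = [random.choice([-3, -1, 1, 3]) for _ in range(N)]
tables = build_tables(c, L, K)
direct = sum(ck * ak for ck, ak in zip(c, delay_line))   # Eq. (7.3)
via_tables = da_output(tables, delay_line, L, K, W)
```

Both computations agree to rounding error, while the table-based path needs only W·L lookups and no multiplications by coefficients (the factors 2**w are shifts in a hardware realization).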

We note that, as long as Eqs. (7.12) and (7.13) hold for some coefficient vector $(c_{n,0}, \ldots, c_{n,N-1})$, the distributed-arithmetic filter emulates the operation of a linear transversal filter. For arbitrary values $d_n(k, \ell)$, however, a nonlinear filtering operation results.

The expression of the LMS algorithm to update the values of a distributed-arithmetic echo canceller takes the form

$$
\mathbf{d}_{n+1} = \mathbf{d}_n - \tfrac{1}{2}\,\alpha \nabla_{\mathbf{d}}\{z_n^2\} = \mathbf{d}_n + \alpha z_n \mathbf{y}_n ,
\tag{7.16}
$$

where $\mathbf{d}_n' = [\mathbf{d}_n'(0), \ldots, \mathbf{d}_n'(L-1)]$, with $\mathbf{d}_n'(\ell) = [d_n(0, \ell), \ldots, d_n(2^{K-1}-1, \ell)]$, and $\mathbf{y}_n' = [\mathbf{y}_n'(0), \ldots, \mathbf{y}_n'(L-1)]$, with

$$
\mathbf{y}_n'(\ell) = \sum_{w=0}^{W-1} 2^w\, b^{(w)}_{n-\ell K} \left( \delta_{0,\, i^{(w)}_{n,\ell}}, \ldots, \delta_{2^{K-1}-1,\, i^{(w)}_{n,\ell}} \right) ,
$$

are $L 2^{K-1} \times 1$ vectors and where $\delta_{i,j}$ is the Kronecker delta. We note that at each iteration only those values that are selected to generate the filter output are updated. The block diagram of an adaptive distributed-arithmetic echo canceller with input symbols from a quaternary alphabet is shown in Fig. 7.6.

The analysis of the mean square error convergence behavior and steady-state performance has been extended to adaptive distributed-arithmetic echo cancellers [1]. The dynamics of the mean square error are given by

$$
E\{\varepsilon_n^2\} = \varepsilon_0^2 \left[ 1 - \frac{\alpha \sigma_a^2}{2^{K-1}} \left( 2 - \alpha L \sigma_a^2 \right) \right]^n + \frac{2\,\varepsilon_{\min}^2}{2 - \alpha L \sigma_a^2} .
\tag{7.17}
$$

The stability condition for the echo canceller is $0 < \alpha < 2/(L\sigma_a^2)$. For a given adaptation gain, echo canceller stability depends on the number of tables and on the variance of the transmitted symbols. Therefore, the time span of the echo canceller can be increased without affecting system stability, provided that the number $L$ of tables is kept constant. In that case, however, mean square error convergence will be slower. From Eq. (7.17), we find that the optimum adaptation gain that permits the fastest mean square error convergence at the beginning of the adaptation process is $\alpha_{\mathrm{opt}} = 1/(L\sigma_a^2)$. The time constant of the convergence mode is $\tau_{\mathrm{opt}} = L 2^{K-1}$. The smallest achievable time constant is proportional to the total number of values. The realization of a distributed-arithmetic echo canceller can be further simplified by updating at each iteration only the values that are addressed by the most significant bits of the symbols stored in the delay line. The complexity required for adaptation can thus be reduced at the price of a slower rate of convergence.

FIGURE 7.6: Block diagram of an adaptive distributed-arithmetic echo canceller.


7.3 Echo Cancellation for Quadrature Amplitude Modulation (QAM) Systems

Although most of the concepts presented in the preceding sections can be readily extended to echo cancellation for communications systems employing QAM, the case of full-duplex transmission over a voiceband data channel requires a specific discussion. We consider the system model shown in Fig. 7.7. The transmitter generates a sequence $\{a_n\}$ of i.i.d. complex-valued symbols from a two-dimensional constellation $\mathcal{A}$, which are modulated by the carrier $e^{j 2\pi f_c nT}$, where $T$ and $f_c$ denote the modulation interval and the carrier frequency, respectively. The discrete-time signal at the output of the transmit Hilbert filter may be regarded as an analytic signal, which is generated at the rate of $m/T$ samples/s, $m > 1$. The real part of the analytic signal is converted into an analog signal by a D/A converter and input to the channel. We note that by transmitting the real part of a complex-valued signal, positive- and negative-frequency components become folded. The image band attenuation of the transmit Hilbert filter thus determines the achievable echo suppression. In fact, the receiver cannot extract aliasing image-band components from desired passband frequency components, and the echo canceller is able to suppress only echo arising from transmitted passband components.

FIGURE 7.7: Configuration of an echo canceller for a QAM transmission system.

The output of the echo channel is represented as the sum of two contributions. The near-end echo $u^{NE}(t)$ arises from the impedance mismatch between the hybrid and the transmission line, as in the case of baseband transmission. The far-end echo $u^{FE}(t)$ represents the contribution due to echoes that are generated at intermediate points in the telephone network. These echoes are characterized by additional impairments, such as jitter and frequency shift, which are accounted for by introducing a carrier-phase rotation of an angle $\phi(t)$ in the model of the far-end echo.

At the receiver, samples of the signal at the channel output are obtained synchronously with the transmitter timing, at the sampling rate of $m/T$ samples/s. The discrete-time received signal is converted to a complex-valued baseband signal $\{x_{nm'+i},\ i = 0, \ldots, m'-1\}$, at the rate of $m'/T$ samples/s, $1 < m' < m$, through filtering by the receive Hilbert filter, decimation, and demodulation. From delayed transmit symbols, estimates of the near- and far-end echo signals after demodulation, $\{\hat{u}^{NE}_{nm'+i},\ i = 0, \ldots, m'-1\}$ and $\{\hat{u}^{FE}_{nm'+i},\ i = 0, \ldots, m'-1\}$, respectively, are generated using $m'$ interleaved near- and far-end echo cancellers. The cancellation error is given by

$$
z_\ell = x_\ell - \hat{u}^{NE}_\ell - \hat{u}^{FE}_\ell .
\tag{7.18}
$$

A different model is obtained if echo cancellation is accomplished before demodulation. In this case, two equivalent configurations for the echo canceller may be considered. In one configuration, the modulated symbols are input to the transversal filter, which approximates the passband echo response. Alternatively, the modulator can be placed after the transversal filter, which is then called a baseband transversal filter [14].

In the considered realization, the estimates of the echo signals after demodulation are given by

$$
\hat{u}^{NE}_{nm'+i} = \sum_{k=0}^{N_{NE}-1} c^{NE}_{n,km'+i}\, a_{n-k} , \qquad i = 0, \ldots, m'-1 ,
\tag{7.19}
$$

and

$$
\hat{u}^{FE}_{nm'+i} = \left[ \sum_{k=0}^{N_{FE}-1} c^{FE}_{n,km'+i}\, a_{n-k-D_{FE}} \right] e^{j \hat{\phi}_{nm'+i}} , \qquad i = 0, \ldots, m'-1 ,
\tag{7.20}
$$

where $(c^{NE}_{n,0}, \ldots, c^{NE}_{n,m'N_{NE}-1})$ and $(c^{FE}_{n,0}, \ldots, c^{FE}_{n,m'N_{FE}-1})$ are the coefficients of the $m'$ interleaved near- and far-end echo cancellers, respectively, $\{\hat{\phi}_{nm'+i},\ i = 0, \ldots, m'-1\}$ is the sequence of far-end echo phase estimates, and $D_{FE}$ denotes the bulk delay accounting for the round-trip delay from the transmitter to the point of echo generation. To prevent overlap of the time span of the near-end echo canceller with the time span of the far-end echo canceller, the condition $D_{FE} > N_{NE}$ must be satisfied. We also note that, because of the different nature of near- and far-end echo generation, the time span of the far-end echo canceller needs to be larger than the time span of the near-end echo canceller, i.e., $N_{FE} > N_{NE}$.

Adaptation of the filter coefficients in the near- and far-end echo cancellers by the LMS algorithm leads to

$$
c^{NE}_{n+1,km'+i} = c^{NE}_{n,km'+i} + \alpha\, z_{nm'+i}\, (a_{n-k})^{*} , \qquad k = 0, \ldots, N_{NE}-1, \quad i = 0, \ldots, m'-1 ,
\tag{7.21}
$$

and

$$
c^{FE}_{n+1,km'+i} = c^{FE}_{n,km'+i} + \alpha\, z_{nm'+i}\, (a_{n-k-D_{FE}})^{*}\, e^{-j \hat{\phi}_{nm'+i}} , \qquad k = 0, \ldots, N_{FE}-1, \quad i = 0, \ldots, m'-1 ,
\tag{7.22}
$$

respectively, where the asterisk denotes complex conjugation. The far-end echo phase estimate is computed by a second-order phase-lock loop algorithm, where the following stochastic gradient approach is adopted:

$$
\begin{aligned}
\hat{\phi}_{\ell+1} &= \hat{\phi}_\ell - \tfrac{1}{2}\,\gamma_{FE} \nabla_{\hat{\phi}} |z_\ell|^2 + \Delta\phi_\ell \pmod{2\pi} , \\
\Delta\phi_{\ell+1} &= \Delta\phi_\ell - \tfrac{1}{2}\,\zeta_{FE} \nabla_{\hat{\phi}} |z_\ell|^2 ,
\end{aligned}
\tag{7.23}
$$

where $\ell = nm' + i$, $i = 0, \ldots, m'-1$, $\gamma_{FE}$ and $\zeta_{FE}$ are step-size parameters, and

$$
\nabla_{\hat{\phi}} |z_\ell|^2 = \frac{\partial |z_\ell|^2}{\partial \hat{\phi}_\ell} = -2\, \mathrm{Im}\left\{ z_\ell \left( \hat{u}^{FE}_\ell \right)^{*} \right\} .
\tag{7.24}
$$

We note that algorithm (7.23) requires $m'$ iterations per modulation interval, i.e., we cannot resort to interleaving to reduce the complexity of the computation of the far-end echo phase estimate.
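The loop of Eqs. (7.23) and (7.24) can be sketched in isolation. The example below makes several simplifying assumptions that are not in the text: the far-end canceller coefficients are taken as already converged, so the unrotated estimate `b[l]` is known exactly; the near-end echo, remote signal, and noise are absent; and the far-end echo phase is a pure ramp with a hypothetical offset of 0.5 rad and a frequency shift of 0.01 rad per sample. The step sizes are likewise illustrative.

```python
import cmath
import math
import random

def track_far_end_phase(x, b, gamma, zeta):
    """Second-order phase-lock loop of Eqs. (7.23)-(7.24).

    x[l] is the received far-end echo sample and b[l] the echo estimate before
    phase rotation, so that the rotated estimate is b[l] * exp(j * phi_hat).
    """
    phi_hat, delta_phi = 0.0, 0.0
    for x_l, b_l in zip(x, b):
        u_fe = b_l * cmath.exp(1j * phi_hat)
        z = x_l - u_fe                                         # cancellation error
        grad = -2.0 * (z * u_fe.conjugate()).imag              # Eq. (7.24)
        new_phi = (phi_hat - 0.5 * gamma * grad + delta_phi) % (2 * math.pi)
        delta_phi = delta_phi - 0.5 * zeta * grad              # second-order term
        phi_hat = new_phi
    return phi_hat, delta_phi

# Hypothetical test signal: unit-modulus samples rotated by a phase ramp.
random.seed(2)
steps, phi0, omega = 2000, 0.5, 0.01
b = [random.choice([1, 1j, -1, -1j]) for _ in range(steps)]
x = [b_l * cmath.exp(1j * (phi0 + omega * l)) for l, b_l in enumerate(b)]

phi_hat, delta_phi = track_far_end_phase(x, b, gamma=0.1, zeta=0.005)
expected_phase = (phi0 + omega * steps) % (2 * math.pi)   # true phase of next sample
phase_error = abs(cmath.exp(1j * (phi_hat - expected_phase)) - 1)
```

In this idealized run the second state variable settles at the frequency shift, which is what lets a second-order loop follow a phase ramp without a standing error.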

7.4 Echo Cancellation for Orthogonal Frequency Division Multiplexing (OFDM) Systems

Orthogonal frequency division multiplexing (OFDM) is a modulation technique whereby blocks of $M$ symbols are transmitted in parallel over $M$ subchannels by employing $M$ orthogonal subcarriers. We consider a real-valued discrete-time channel impulse response $\{h_i,\ i = 0, \ldots, L\}$ having length $L + 1 \ll M$. To illustrate the basic principles of OFDM systems, let us consider a noiseless ideal channel with impulse response given by $\{h_i\} = \{\delta_i\}$, where $\{\delta_i\}$ is defined as the discrete-time delta function. Modulation of the complex-valued input symbols at the $n$-th modulation interval, denoted by the vector $\mathbf{A}_n = \{A_n(i),\ i = 0, \ldots, M-1\}$, is performed by an inverse discrete Fourier transform (IDFT), as shown in Fig. 7.8. We assume that $M$ is even, and that each block of symbols satisfies the Hermitian symmetry conditions, i.e., $A_n(0)$ and $A_n(M/2)$ are real valued, and $A_n(i) = A_n^{*}(M-i)$, $i = 1, \ldots, M/2-1$. Then the signals $\mathbf{a}_n = \{a_n(i),\ i = 0, \ldots, M-1\}$ obtained at the output of the IDFT are real valued. After parallel-to-serial conversion, the $M$ signals are sent over the channel at the given transmission rate $M/T$, where $T$ denotes the modulation interval. At the output of the channel, the noiseless signals are received without distortion. Serial-to-parallel conversion yields blocks of $M$ elements, with boundaries placed such that each block obtained at the modulator output is also presented at the demodulator input. Then demodulation performed by a discrete Fourier transform (DFT) will reproduce the blocks of $M$ input symbols. The overall input-output relationship is therefore equivalent to that of a bank of $M$ parallel, independent subchannels.
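The effect of the Hermitian symmetry conditions can be verified numerically. The following sketch uses a direct (slow) DFT so that it needs no external library; the block size M = 8, the symbol values, the helper names, and the 1/M scaling placed on the IDFT are assumptions made for illustration.

```python
import cmath
import random

def idft(block):
    """Inverse DFT with 1/M scaling (the scaling convention is an assumption)."""
    m = len(block)
    return [sum(block[i] * cmath.exp(2j * cmath.pi * i * k / m) for i in range(m)) / m
            for k in range(m)]

def dft(signal):
    m = len(signal)
    return [sum(signal[k] * cmath.exp(-2j * cmath.pi * i * k / m) for k in range(m))
            for i in range(m)]

def hermitian_block(m, rng):
    """Random block with A(0), A(m/2) real and A(i) = conj(A(m - i))."""
    block = [0j] * m
    block[0] = complex(rng.choice([-1, 1]), 0)
    block[m // 2] = complex(rng.choice([-1, 1]), 0)
    for i in range(1, m // 2):
        block[i] = complex(rng.choice([-1, 1]), rng.choice([-1, 1]))
        block[m - i] = block[i].conjugate()
    return block

rng = random.Random(0)
A = hermitian_block(8, rng)
a = idft(A)                      # modulator output over the ideal channel
A_back = dft(a)                  # demodulator output
max_imag = max(abs(v.imag) for v in a)
max_err = max(abs(u - v) for u, v in zip(A, A_back))
```

The imaginary parts of the modulator output vanish to rounding error, and the DFT at the receiver returns the transmitted block, as expected for the ideal noiseless channel.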

FIGURE 7.8: Block diagram of an OFDM system.

In the general case of a noisy channel with impulse response having length greater than one, $M$ independent subchannels are obtained by a variant of OFDM that is also known as discrete multitone modulation (DMT) [11]. In a DMT system, modulation by the IDFT is performed at the rate $1/T' = M/[(M+L)T] < 1/T$. After modulation, each block of $M$ signals is cyclically extended by copying the last $L$ signals in front of the block, and converted from parallel to serial. The resulting $L + M$ signals are sent over the channel. At the receiver, blocks of samples with length $L + M$ are taken. Block boundaries are placed such that the last $M$ samples depend only on the elements of one cyclically extended block of signals. The first $L$ samples are discarded, and the vector $\mathbf{x}_n$ of the last $M$ samples of the block received at the $n$-th modulation interval can be expressed as

$$
\mathbf{x}_n = \boldsymbol{\Gamma}_n \mathbf{h} + \mathbf{w}_n ,
\tag{7.25}
$$

where $\mathbf{h}$ is the vector of the impulse response extended with $M - L - 1$ zeros, $\mathbf{w}_n$ is a vector of additive white Gaussian noise samples, and $\boldsymbol{\Gamma}_n$ is an $M \times M$ circulant matrix given by

$$
\boldsymbol{\Gamma}_n =
\begin{bmatrix}
a_n(0) & a_n(M-1) & \cdots & a_n(1) \\
a_n(1) & a_n(0) & \cdots & a_n(2) \\
\vdots & \vdots & \ddots & \vdots \\
a_n(M-1) & a_n(M-2) & \cdots & a_n(0)
\end{bmatrix} .
\tag{7.26}
$$

Recalling that $\mathbf{F}_M \boldsymbol{\Gamma}_n \mathbf{F}_M^{-1} = \mathrm{diag}(\mathbf{A}_n)$, where $\mathbf{F}_M$ is the $M \times M$ DFT matrix defined as $\mathbf{F}_M = [(e^{-j 2\pi / M})^{km}]$, $k, m = 0, \ldots, M-1$, and $\mathrm{diag}(\mathbf{A}_n)$ denotes the diagonal matrix with elements on the diagonal given by $\mathbf{A}_n$, we find that the output of the demodulator is given by

$$
\mathbf{X}_n = \mathrm{diag}(\mathbf{A}_n)\, \mathbf{H} + \mathbf{W}_n ,
\tag{7.27}
$$

where $\mathbf{H}$ denotes the DFT of the vector $\mathbf{h}$, and $\mathbf{W}_n$ is a vector of independent Gaussian random variables. Equation (7.27) indicates that the sequence of transmitted symbol vectors can be detected by assuming a bank of $M$ independent subchannels, at the price of a decrease in the data rate by a factor $(M+L)/M$. Note that in practice the computationally more efficient inverse fast Fourier transform and fast Fourier transform are used instead of IDFT and DFT.

FIGURE 7.9: Configuration of an echo canceller for a DMT transmission system.

We discuss echo cancellation for OFDM with reference to a DMT system [6], as shown in Fig. 7.9. The real-valued discrete-time echo impulse response is $\{h_{E,i},\ i = 0, \ldots, N-1\}$, having length $N < M$. We initially assume $N \le L + 1$. Furthermore, we assume that the boundaries of the received blocks are placed such that the last $M$ samples of the $n$-th received block are expressed by the vector

$$
\mathbf{x}_n = \boldsymbol{\Gamma}_n^{R} \mathbf{h} + \boldsymbol{\Gamma}_n \mathbf{h}_E + \mathbf{w}_n ,
\tag{7.28}
$$

where $\boldsymbol{\Gamma}_n^{R}$ is the circulant matrix with elements given by the signals from the remote transmitter, and $\mathbf{h}_E$ is the vector of the echo impulse response extended with $M - N$ zeros. In the frequency domain, the echo is expressed as $\mathbf{U}_n = \mathrm{diag}(\mathbf{A}_n)\mathbf{H}_E$, where $\mathbf{H}_E$ denotes the DFT of the vector $\mathbf{h}_E$. In this case, the echo canceller provides an echo estimate that is given by $\hat{\mathbf{U}}_n = \mathrm{diag}(\mathbf{A}_n)\mathbf{C}_n$, where $\mathbf{C}_n$ denotes the DFT of the vector $\mathbf{c}_n$ of the $N$ coefficients of the echo canceller filter extended with $M - N$ zeros.

In practice, however, we need to consider the case $N > L + 1$. The expression of the cancellation error is then given by

$$
\mathbf{z}_n = \mathbf{x}_n - \boldsymbol{\Psi}_{n,n-1}\, \mathbf{c}_n ,
\tag{7.29}
$$

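where the vector of the last $M$ elements of the $n$-th received block is now $\mathbf{x}_n = \boldsymbol{\Gamma}_n^{R} \mathbf{h} + \boldsymbol{\Psi}_{n,n-1} \mathbf{h}_E + \mathbf{w}_n$, and $\boldsymbol{\Psi}_{n,n-1}$ is an $M \times M$ Toeplitz matrix given by

$$
\boldsymbol{\Psi}_{n,n-1} =
\begin{bmatrix}
a_n(0) & a_n(M-1) & \cdots & a_n(M-L) & a_{n-1}(M-1) & \cdots & a_{n-1}(L+1) \\
a_n(1) & a_n(0) & \cdots & a_n(M-L+1) & a_n(M-L) & \cdots & a_{n-1}(L+2) \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
a_n(M-1) & a_n(M-2) & \cdots & a_n(M-L-1) & a_n(M-L-2) & \cdots & a_n(0)
\end{bmatrix} .
\tag{7.30}
$$

In the frequency domain, the cancellation error can be expressed as

$$
\mathbf{Z}_n = \mathbf{F}_M \left( \mathbf{x}_n - \boldsymbol{\chi}_{n,n-1}\, \mathbf{c}_n \right) - \mathrm{diag}(\mathbf{A}_n)\, \mathbf{C}_n ,
\tag{7.31}
$$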

where $\boldsymbol{\chi}_{n,n-1} = \boldsymbol{\Psi}_{n,n-1} - \boldsymbol{\Gamma}_n$ is an $M \times M$ upper triangular Toeplitz matrix. Equation (7.31) suggests a computationally efficient, two-part echo cancellation technique. First, in the time domain, a short convolution is performed and the result subtracted from the received signals to compensate for the insufficient length of the cyclic extension. Second, in the frequency domain, cancellation of the residual echo is performed over a set of $M$ independent echo subchannels. Observing that Eq. (7.31) is equivalent to $\mathbf{Z}_n = \mathbf{X}_n - \tilde{\boldsymbol{\Psi}}_{n,n-1} \mathbf{C}_n$, where $\tilde{\boldsymbol{\Psi}}_{n,n-1} = \mathbf{F}_M \boldsymbol{\Psi}_{n,n-1} \mathbf{F}_M^{-1}$, the echo canceller adaptation by the LMS algorithm in the frequency domain takes the form

$$
\mathbf{C}_{n+1} = \mathbf{C}_n + \alpha\, \tilde{\boldsymbol{\Psi}}^{*}_{n,n-1}\, \mathbf{Z}_n ,
\tag{7.32}
$$

where $\alpha$ is the adaptation gain, and $\tilde{\boldsymbol{\Psi}}^{*}_{n,n-1}$ denotes the transpose conjugate of $\tilde{\boldsymbol{\Psi}}_{n,n-1}$. We note that, alternatively, echo canceller adaptation may also be performed by the algorithm $\mathbf{C}_{n+1} = \mathbf{C}_n + \alpha\, \mathrm{diag}(\mathbf{A}_n^{*}) \mathbf{Z}_n$, which entails a substantially lower computational complexity than the LMS algorithm, at the price of a slower rate of convergence.

In DMT systems it is essential that the length of the channel impulse response be much less than the number of subchannels, so that the reduction in data rate due to the cyclic extension may be considered negligible. Therefore, equalization is adopted in practice to shorten the length of the channel impulse response. From Eq. (7.31), however, we observe that transceiver complexity depends on the relative lengths of the echo and of the channel impulse responses. To reduce the length of the cyclic extension as well as the computational complexity of the echo canceller, various methods have been proposed to shorten both the channel and the echo impulse responses jointly [8].
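For the simpler case N ≤ L + 1, where the cyclic extension covers the whole echo response, both the per-tone structure of the echo and the per-tone update with diag(A_n*) can be tried out in a few lines. This is a sketch under assumptions, not the scheme of [6]: the block size, prefix length, echo response `h_e`, gain `alpha = 0.25`, and symbol alphabet are hypothetical; the blocks are complex without Hermitian symmetry for brevity; and no remote signal or noise is included.

```python
import cmath
import random

def dft(v):
    m = len(v)
    return [sum(v[k] * cmath.exp(-2j * cmath.pi * i * k / m) for k in range(m))
            for i in range(m)]

def idft(v):
    m = len(v)
    return [sum(v[i] * cmath.exp(2j * cmath.pi * i * k / m) for i in range(m)) / m
            for k in range(m)]

def echo_block(A, h_e, prefix_len):
    """Echo of one block in the frequency domain.

    Adds the cyclic extension, applies the echo response by linear convolution,
    keeps the last M samples, and takes the DFT. Requires len(h_e) <= prefix_len + 1.
    """
    a = idft(A)
    sent = a[-prefix_len:] + a                          # cyclic extension
    tail = [sum(h_e[k] * sent[p - k] for k in range(len(h_e)))
            for p in range(prefix_len, len(sent))]      # last M received samples
    return dft(tail)

M, L = 8, 3
h_e = [0.4, -0.2, 0.1]                   # assumed echo response, N = 3 <= L + 1
H_e = dft(h_e + [0.0] * (M - len(h_e)))  # DFT of h_E extended with M - N zeros
rng = random.Random(0)
C = [0j] * M                             # frequency-domain canceller coefficients
alpha = 0.25                             # assumed gain; |A(i)|^2 = 2 for these symbols
for _ in range(60):
    A = [complex(rng.choice([-1, 1]), rng.choice([-1, 1])) for _ in range(M)]
    U = echo_block(A, h_e, L)                               # equals diag(A) H_E
    Z = [u - a_i * c_i for u, a_i, c_i in zip(U, A, C)]     # per-tone error
    C = [c_i + alpha * a_i.conjugate() * z_i for c_i, a_i, z_i in zip(C, A, Z)]
max_err = max(abs(c_i - h_i) for c_i, h_i in zip(C, H_e))
```

Each tone adapts independently here, which is the property that makes the frequency-domain part of the canceller inexpensive; the time-domain correction term for N > L + 1 is not modelled in this sketch.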

7.5 Summary and Conclusions

Digital signal processing techniques for echo cancellation provide large echo attenuation, and eliminate the need for additional line interfaces and digital-to-analog and analog-to-digital converters that are required by echo cancellation in the analog signal domain. The realization of digital echo cancellers in transceivers for high-speed full-duplex data transmission is possible today at low cost thanks to the advances in VLSI technology. Digital techniques for echo cancellation are also appropriate for near-end crosstalk cancellation in transceivers for transmission over voice-grade cables at rates of several megabits per second for local-area network applications. In voiceband modems for data transmission over the telephone network, digital techniques for echo cancellation also allow a precise tracking of the carrier phase and frequency shift of far-end echoes.

References

[1] Cherubini, G., Analysis of the convergence behavior of adaptive distributed-arithmetic echo cancellers. IEEE Trans. Commun., 41(11), 1703–1714, 1993.
[2] Cherubini, G., Ölçer, S., and Ungerboeck, G., A quaternary partial-response class-IV transceiver for 125 Mbit/s data transmission over unshielded twisted-pair cables: Principles of operation and VLSI realization. IEEE J. Sel. Areas Commun., 13(9), 1656–1669, 1995.
[3] Cherubini, G., Creigh, J., Ölçer, S., Rao, S.K., and Ungerboeck, G., 100BASE-T2: A new standard for 100 Mb/s Ethernet transmission over voice-grade cables. IEEE Commun. Mag., 35(11), 115–122, 1997.
[4] Duttweiler, D.L., Adaptive filter performance with nonlinearities in the correlation multiplier. IEEE Trans. Acoust., Speech, Signal Processing, 30(8), 578–586, 1982.
[5] Falconer, D.D., Adaptive reference echo-cancellation. IEEE Trans. Commun., 30(9), 2083–2094, 1982.
[6] Ho, M., Cioffi, J.M., and Bingham, J.A.C., Discrete multitone echo cancellation. IEEE Trans. Commun., 44(7), 817–825, 1996.
[7] Lee, E.A. and Messerschmitt, D.G., Digital Communication, 2nd ed., Kluwer Academic Publishers, Boston, MA, 1994.
[8] Melsa, P.J.W., Younce, R.C., and Rohrs, C.E., Impulse response shortening for discrete multitone transceivers. IEEE Trans. Commun., 44(12), 1662–1672, 1996.
[9] Messerschmitt, D.G., Echo cancellation in speech and data transmission. IEEE J. Sel. Areas Commun., 2(2), 283–297, 1984.
[10] Messerschmitt, D.G., Design issues for the ISDN U-Interface transceiver. IEEE J. Sel. Areas Commun., 4(8), 1281–1293, 1986.
[11] Ruiz, A., Cioffi, J.M., and Kasturia, S., Discrete multiple tone modulation with coset coding for the spectrally shaped channel. IEEE Trans. Commun., 40(6), 1012–1029, 1992.
[12] Smith, M.J., Cowan, C.F.N., and Adams, P.F., Nonlinear echo cancellers based on transpose distributed arithmetic. IEEE Trans. Circuits and Systems, 35(1), 6–18, 1988.
[13] Ungerboeck, G., Theory on the speed of convergence in adaptive equalizers for digital communication. IBM J. Res. Develop., 16(6), 546–555, 1972.
[14] Weinstein, S.B., A passband data-driven echo-canceller for full-duplex transmission on two-wire circuits. IEEE Trans. Commun., 25(7), 654–666, 1977.

Further Information

For further information on adaptive transversal filters with application to echo cancellation, see Adaptive Filters: Structures, Algorithms, and Applications, M.L. Honig and D.G. Messerschmitt, Kluwer, 1984.

Helleseth, T. & Kumar, P.V. “Pseudonoise Sequences” Mobile Communications Handbook Ed. Jerry D. Gibson Boca Raton: CRC Press LLC, 1999


Pseudonoise Sequences

Tor Helleseth
University of Bergen

P. Vijay Kumar
University of Southern California

8.1 Introduction
8.2 m Sequences
8.3 The q-ary Sequences with Low Autocorrelation
8.4 Families of Sequences with Low Crosscorrelation
    Gold and Kasami Sequences • Quaternary Sequences with Low Crosscorrelation • Binary Kerdock Sequences
8.5 Aperiodic Correlation
    Barker Sequences • Sequences with High Merit Factor • Sequences with Low Aperiodic Crosscorrelation
8.6 Other Correlation Measures
    Partial-Period Correlation • Mean Square Correlation • Optical Orthogonal Codes
Defining Terms
References
Further Information

8.1 Introduction

Pseudonoise sequences (PN sequences), also referred to as pseudorandom sequences, are sequences that are deterministically generated and yet possess some properties that one would expect to find in randomly generated sequences. Applications of PN sequences include signal synchronization, navigation, radar ranging, random number generation, spread-spectrum communications, multipath resolution, cryptography, and signal identification in multiple-access communication systems. The correlation between two sequences {x(t)} and {y(t)} is the complex inner product of the first sequence with a shifted version of the second sequence. The correlation is called 1) an autocorrelation if the two sequences are the same, 2) a crosscorrelation if they are distinct, 3) a periodic correlation if the shift is a cyclic shift, 4) an aperiodic correlation if the shift is not cyclic, and 5) a partial-period correlation if the inner product involves only a partial segment of the two sequences. More precise definitions are given subsequently.

Binary m sequences, defined in the next section, are perhaps the best-known family of PN sequences. The balance, run-distribution, and autocorrelation properties of these sequences mimic those of random sequences. It is perhaps the random-like correlation properties of PN sequences that make them most attractive in a communications system, and it is common to refer to any collection of low-correlation sequences as a family of PN sequences.

Section 8.2 begins by discussing m sequences. Thereafter, the discussion continues with a description of sequences satisfying various correlation constraints along the lines of the accompanying self-explanatory figure, Fig. 8.1. Expanded tutorial discussions on pseudorandom sequences may be found in [14], in [15, Chapter 5] and in [6].

8.2 m Sequences

A binary {0, 1} shift-register sequence $\{s(t)\}$ is a sequence that satisfies a linear recurrence relation of the form

$$
\sum_{i=0}^{r} f_i\, s(t+i) = 0 , \qquad \text{for all } t \ge 0 ,
\tag{8.1}
$$

where $r \ge 1$ is the degree of the recursion; the coefficients $f_i$ belong to the finite field $GF(2) = \{0, 1\}$, and the leading coefficient is $f_r = 1$. Thus, both sequences $\{a(t)\}$ and $\{b(t)\}$ appearing in Fig. 8.2 are shift-register sequences. A sequence satisfying a recursion of the form in Eq. (8.1) is said to have characteristic polynomial $f(x) = \sum_{i=0}^{r} f_i x^i$. Thus, $\{a(t)\}$ and $\{b(t)\}$ have characteristic polynomials given by $f(x) = x^3 + x + 1$ and $f(x) = x^3 + x^2 + 1$, respectively.

FIGURE 8.1: Overview of pseudonoise sequences.

Since an $r$-bit binary shift register can assume a maximum of $2^r$ different states, it follows that every shift-register sequence $\{s(t)\}$ is eventually periodic with period $n \le 2^r$, i.e.,

$$
s(t) = s(t+n) , \qquad \text{for all } t \ge N ,
$$

for some integer $N$. In fact, the maximum period of a shift-register sequence is $2^r - 1$, since a shift register that enters the all-zero state will remain forever in that state. The upper shift register in Fig. 8.2, when initialized with starting state 0 0 1, generates the periodic sequence $\{a(t)\}$ given by

$$
0010111 \quad 0010111 \quad 0010111 \quad \cdots
\tag{8.2}
$$

of period $n = 7$. It follows then that this shift register generates sequences of maximal period starting from any nonzero initial state. An m sequence is simply a binary shift-register sequence having maximal period. For every $r \ge 1$, m sequences are known to exist.

FIGURE 8.2: An example Gold sequence generator. Here {a(t)} and {b(t)} are m sequences of length 7.

The periodic autocorrelation function $\theta_s$ of a binary {0, 1} sequence $\{s(t)\}$ of period $n$ is defined by

$$
\theta_s(\tau) = \sum_{t=0}^{n-1} (-1)^{s(t+\tau) - s(t)} , \qquad 0 \le \tau \le n-1 .
$$
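The recursion of Eq. (8.1) and the autocorrelation just defined are easy to evaluate directly. The sketch below generates the sequence of Eq. (8.2) from the characteristic polynomial x^3 + x + 1 and the starting state 0 0 1; the function names and the list-based coefficient convention `f = [f_0, ..., f_r]` are choices made for this illustration.

```python
def shift_register_sequence(f, initial_state, length):
    """Binary sequence satisfying sum_i f[i] * s(t + i) = 0 over GF(2), Eq. (8.1).

    f = [f_0, ..., f_r] with f_r = 1; initial_state = [s(0), ..., s(r - 1)].
    """
    r = len(f) - 1
    s = list(initial_state)
    while len(s) < length:
        t = len(s) - r
        # Solve the recursion for s(t + r); addition is modulo 2.
        s.append(sum(f[i] * s[t + i] for i in range(r)) % 2)
    return s[:length]

def periodic_autocorrelation(s, tau):
    """theta_s(tau) for one period s of a binary {0, 1} sequence."""
    n = len(s)
    return sum((-1) ** (s[(t + tau) % n] ^ s[t]) for t in range(n))

a = shift_register_sequence([1, 1, 0, 1], [0, 0, 1], 14)   # f(x) = x^3 + x + 1
one_period = a[:7]
theta = [periodic_autocorrelation(one_period, tau) for tau in range(7)]
```

Here `a` holds two periods of 0010111, and `theta` evaluates to 7 at zero shift and -1 at every other shift.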

An m sequence of length $2^r - 1$ has the following attributes.

1) Balance property: in each period of the m sequence there are $2^{r-1}$ ones and $2^{r-1} - 1$ zeros.
2) Run property: every nonzero binary s-tuple, s ≤ r, occurs $2^{r-s}$ times; the all-zero s-tuple occurs $2^{r-s} - 1$ times.
3) Two-level autocorrelation function:

\[ \theta_s(\tau) = \begin{cases} n & \text{if } \tau = 0 \\ -1 & \text{if } \tau \ne 0 \end{cases} \tag{8.3} \]

The first two properties follow immediately from the observation that every nonzero r-tuple occurs precisely once in each period of the m sequence. For the third property, consider the difference sequence {s(t + τ) − s(t)} for τ ≠ 0. This sequence satisfies the same recursion as the m sequence {s(t)} and is clearly not the all-zero sequence. It follows, therefore, that {s(t + τ) − s(t)} ≡ {s(t + τ′)} for some τ′, 0 ≤ τ′ ≤ n − 1, i.e., is a different cyclic shift of the m sequence {s(t)}. The balance property of the sequence {s(t + τ′)} then gives us attribute 3. The m sequence {a(t)} in Eq. (8.2) can be seen to have the three listed properties.

If {s(t)} is any sequence of period n and d is an integer, 1 ≤ d ≤ n, then the mapping {s(t)} → {s(dt)} is referred to as a decimation of {s(t)} by the integer d. If {s(t)} is an m sequence of period $n = 2^r - 1$ and d is an integer relatively prime to $2^r - 1$, then the decimated sequence {s(dt)} clearly also has period n. Interestingly, it turns out that the sequence {s(dt)} is always also an m sequence of the same period. For example, when {a(t)} is the sequence in Eq. (8.2), then

\[ a(3t) = 0011101\;0011101\;0011101\;\cdots \tag{8.4} \]

and

\[ a(2t) = 0111001\;0111001\;0111001\;\cdots \tag{8.5} \]

The sequence {a(3t)} is also an m sequence of period 7, since it satisfies the recursion

\[ s(t+3) + s(t+2) + s(t) = 0 \quad \text{for all } t \]

of degree r = 3. In fact {a(3t)} is precisely the sequence labeled {b(t)} in Fig. 8.2. The sequence {a(2t)} is simply a cyclically shifted version of {a(t)} itself; this property holds in general. If {s(t)} is any m sequence of period $2^r - 1$, then {s(2t)} will always be a shifted version of the same m sequence. Clearly, the same is true for decimations by any power of 2.

Starting from an m sequence of period $2^r - 1$, it turns out that one can generate all m sequences of the same period through decimations by integers d relatively prime to $2^r - 1$. The set of integers d, $1 \le d \le 2^r - 1$, satisfying $(d, 2^r - 1) = 1$ forms a group under multiplication modulo $2^r - 1$, with the powers $\{2^i \mid 0 \le i \le r-1\}$ of 2 forming a subgroup of order r. Since decimation by a power of 2 yields a shifted version of the same m sequence, it follows that the number of distinct m sequences of period $2^r - 1$ is $\phi(2^r - 1)/r$, where φ(n) denotes the number of integers d, 1 ≤ d ≤ n, relatively prime to n. For example, when r = 3, there are just two cyclically distinct m sequences of period 7, and these are precisely the sequences {a(t)} and {b(t)} discussed in the preceding paragraph. Tables provided in [12] can be used to determine the characteristic polynomial of the various m sequences obtainable through the decimation of a single given m sequence. The classical reference on m sequences is [4].

If one obtains a sequence of some large length n by repeatedly tossing an unbiased coin, then such a sequence will very likely satisfy the balance, run, and autocorrelation properties of an m sequence of comparable length. For this reason, it is customary to regard the extent to which a given sequence possesses these properties as a measure of randomness of the sequence. Quite apart from this, in many applications such as signal synchronization and radar ranging, it is desirable to have sequences {s(t)} with low autocorrelation sidelobes, i.e., $|\theta_s(\tau)|$ is small for τ ≠ 0.
Whereas m sequences are a prime example, there exist other methods of constructing binary sequences with low out-of-phase autocorrelation. Sequences {s(t)} of period n having an autocorrelation function identical to that of an m sequence, i.e., having $\theta_s$ satisfying Eq. (8.3), correspond to well-studied combinatorial objects known as cyclic Hadamard difference sets. Known infinite families fall into three classes: 1) Singer and Gordon, Mills and Welch, 2) quadratic residue, and 3) twin-prime difference sets. These correspond, respectively, to sequences of period n of the form $n = 2^r - 1$, r ≥ 1; n prime; and n = p(p + 2) with both p and p + 2 being prime in the last case. For a detailed treatment of cyclic difference sets, see [2]. A recent observation by Maschietti in [9] provides additional families of cyclic Hadamard difference sets that also correspond to sequences of period $n = 2^r - 1$.
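The m-sequence properties and the decimation examples of Eqs. (8.4) and (8.5) can be checked numerically. The following is a minimal sketch, not taken from the original text: it assumes the recursion s(t + 3) = s(t + 1) + s(t) (characteristic polynomial x³ + x + 1) and an initial state 0, 0, 1, chosen here so that the decimated outputs line up with Eqs. (8.4) and (8.5). All function names are illustrative.

```python
from math import gcd

def lfsr_sequence(taps, init, length):
    """Binary sequence obeying s(t+r) = sum of s(t+i) for i in taps (mod 2)."""
    r = len(init)
    s = list(init)
    while len(s) < length:
        t = len(s) - r
        s.append(sum(s[t + i] for i in taps) % 2)
    return s[:length]

def periodic_autocorrelation(s, tau):
    """theta_s(tau): sum over one period of (-1)^(s(t+tau) - s(t))."""
    n = len(s)
    return sum(1 - 2 * (s[(t + tau) % n] ^ s[t]) for t in range(n))

def decimate(s, d):
    """One period of the decimated sequence {s(dt)}."""
    n = len(s)
    return [s[(d * t) % n] for t in range(n)]

def is_cyclic_shift(x, y):
    """True if x equals some cyclic shift of y."""
    return any(x == y[k:] + y[:k] for k in range(len(y)))

def count_m_sequences(r):
    """phi(2^r - 1) / r, the number of cyclically distinct m sequences."""
    n = 2 ** r - 1
    phi = sum(1 for d in range(1, n + 1) if gcd(d, n) == 1)
    return phi // r

n = 7
# Assumed phase: initial state (0, 0, 1) with s(t+3) = s(t+1) + s(t).
a = lfsr_sequence(taps=(0, 1), init=(0, 0, 1), length=n)
autocorr = [periodic_autocorrelation(a, tau) for tau in range(n)]
a3 = decimate(a, 3)
a2 = decimate(a, 2)

print("a(t)  =", "".join(map(str, a)))    # 0010111
print("a(3t) =", "".join(map(str, a3)))   # 0011101
print("a(2t) =", "".join(map(str, a2)))   # 0111001
print("autocorrelation:", autocorr)
print("distinct m sequences for r = 3:", count_m_sequences(3))
```

Decimation by 2 returns a cyclic shift of {a(t)}, whereas decimation by 3 returns a cyclically distinct sequence, which is consistent with the count φ(7)/3 = 2.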

8.3

The q -ary Sequences with Low Autocorrelation

As defined earlier, the autocorrelation of a binary {0, 1} sequence {s(t)} leads to the computation of the inner product of a {−1, +1} sequence $\{(-1)^{s(t)}\}$ with a cyclically shifted version $\{(-1)^{s(t+\tau)}\}$ of itself. The {−1, +1} sequence is transmitted as a phase shift by either 0° or 180° of a radio-frequency carrier, i.e., using binary phase-shift keying (PSK) modulation. If the modulation is q-ary PSK, then one is led to consider sequences {s(t)} with symbols in the set $Z_q$, i.e., the set of integers modulo q. The relevant autocorrelation function $\theta_s(\tau)$ is now defined by

\[ \theta_s(\tau) = \sum_{t=0}^{n-1} \omega^{s(t+\tau)-s(t)} \]

where n is the period of {s(t)} and ω is a complex primitive qth root of unity. It is possible to construct sequences {s(t)} over $Z_q$ whose autocorrelation function satisfies

\[ \theta_s(\tau) = \begin{cases} n & \text{if } \tau = 0 \\ 0 & \text{if } \tau \ne 0 \end{cases} \]

For obvious reasons, such sequences are said to have an ideal autocorrelation function. We provide without proof two sample constructions. The sequences in the first construction are given by

\[ s(t) = \begin{cases} t^2/2 \pmod{n} & \text{when } n \text{ is even} \\ t(t+1)/2 \pmod{n} & \text{when } n \text{ is odd} \end{cases} \]

Thus, this construction provides sequences with ideal autocorrelation for any period n. Note that the size q of the sequence symbol alphabet equals n when n is odd and 2n when n is even.

The second construction also provides sequences over $Z_q$ of period n but requires that n be a perfect square. Let $n = r^2$ and let π be an arbitrary permutation of the elements in the subset {0, 1, 2, . . . , r − 1} of $Z_n$. Let g be an arbitrary function defined on the subset {0, 1, 2, . . . , r − 1} of $Z_n$. Then any sequence of the form

\[ s(t) = r\,t_1\,\pi(t_2) + g(t_2) \pmod{n} \]

where $t = r t_1 + t_2$ with $0 \le t_1, t_2 \le r - 1$ is the base-r decomposition of t, has an ideal autocorrelation function. When the alphabet size q equals or divides the period n of the sequence, ideal-autocorrelation sequences also go by the name generalized bent functions. For details, see [6].
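Both constructions can be checked numerically. The sketch below is illustrative rather than part of the original text. For even n it assumes the first construction is read as the complex symbol exp(iπt²/n), i.e., exponent t² with a primitive 2n-th root of unity, which is one way to interpret the half-integer exponent t²/2 together with the stated alphabet size 2n. The permutation and the function g used for the second construction are arbitrary choices.

```python
import cmath

def periodic_autocorr(x, tau):
    """Periodic autocorrelation of a complex sequence x at shift tau."""
    n = len(x)
    return sum(x[(t + tau) % n] * x[t].conjugate() for t in range(n))

def chirp_sequence(n):
    """First construction, returned directly as complex PSK symbols."""
    if n % 2 == 0:
        # Assumed reading: exponent t^2 on a primitive 2n-th root of unity.
        return [cmath.exp(1j * cmath.pi * ((t * t) % (2 * n)) / n) for t in range(n)]
    # n odd: exponent t(t+1)/2 on a primitive n-th root of unity.
    return [cmath.exp(2j * cmath.pi * ((t * (t + 1) // 2) % n) / n) for t in range(n)]

def square_period_sequence(r, perm, g):
    """Second construction: s(t) = r*t1*perm[t2] + g[t2] (mod r^2), t = r*t1 + t2."""
    n = r * r
    s = []
    for t in range(n):
        t1, t2 = divmod(t, r)
        s.append((r * t1 * perm[t2] + g[t2]) % n)
    return s

def to_psk(s, q):
    """Map symbols in Z_q to powers of a primitive q-th root of unity."""
    return [cmath.exp(2j * cmath.pi * v / q) for v in s]

def is_ideal(x, tol=1e-9):
    """True if the autocorrelation is n at shift 0 and 0 at every other shift."""
    n = len(x)
    in_phase_ok = abs(periodic_autocorr(x, 0) - n) < tol
    return in_phase_ok and all(abs(periodic_autocorr(x, tau)) < tol for tau in range(1, n))

# Arbitrary illustrative parameters for the second construction (r = 3, n = 9).
example = to_psk(square_period_sequence(3, perm=[2, 0, 1], g=[0, 5, 7]), q=9)

print(is_ideal(chirp_sequence(8)), is_ideal(chirp_sequence(9)), is_ideal(example))
```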

8.4

Families of Sequences with Low Crosscorrelation

Given two sequences $\{s_1(t)\}$ and $\{s_2(t)\}$ over $Z_q$ of period n, their crosscorrelation function $\theta_{1,2}(\tau)$ is defined by

\[ \theta_{1,2}(\tau) = \sum_{t=0}^{n-1} \omega^{s_1(t+\tau)-s_2(t)} \]

where ω is a primitive qth root of unity. The crosscorrelation function is important in code-division multiple-access (CDMA) communication systems. Here, each user is assigned a distinct signature sequence and, to minimize interference due to the other users, it is desirable that the signature sequences have pairwise low values of the crosscorrelation function. To provide the system in addition with a self-synchronizing capability, it is desirable that the signature sequences have low values of the autocorrelation function as well.

Let $F = \{\{s_i(t)\} \mid 1 \le i \le M\}$ be a family of M sequences $\{s_i(t)\}$ over $Z_q$, each of period n. Let $\theta_{i,j}(\tau)$ denote the crosscorrelation between the ith and jth sequence at shift τ, i.e.,

\[ \theta_{i,j}(\tau) = \sum_{t=0}^{n-1} \omega^{s_i(t+\tau)-s_j(t)}, \qquad 0 \le \tau \le n-1 \]

The classical goal in sequence design for CDMA systems has been minimization of the parameter

\[ \theta_{\max} = \max\left\{ |\theta_{i,j}(\tau)| \;\middle|\; \text{either } i \ne j \text{ or } \tau \ne 0 \right\} \]

for fixed n and M. It should be noted though that, in practice, because of data modulation the correlations that one runs into are typically of an aperiodic rather than a periodic nature (see Section 8.5). The problem of designing for low aperiodic correlation, however, is a more difficult one. A typical approach, therefore, has been to design based on periodic correlation, and then to analyze the resulting design for its aperiodic correlation properties. Again, in many practical systems, the mean square correlation properties are of greater interest than the worst-case correlation represented by a parameter such as $\theta_{\max}$. The mean square correlation is discussed in Section 8.6.

Bounds on the minimum possible value of θmax for given period n, family size M, and alphabet size q are available that can be used to judge the merits of a particular sequence design. The most efficient bounds are those due to Welch, Sidelnikov, and Levenshtein; see [6]. In CDMA systems, there is greatest interest in designs in which the parameter θmax is in the range √n ≤ θmax ≤ 2√n. Accordingly, Table 8.1 uses the Welch, Sidelnikov, and Levenshtein bounds to provide an order-of-magnitude upper bound on the family size M for certain θmax in the cited range. Practical considerations dictate that q be small. The bit-oriented nature of electronic hardware makes it preferable to have q a power of 2. With this in mind, some efficient sequence families having low auto- and crosscorrelation values and alphabet sizes q = 2 and q = 4 are described next.

TABLE 8.1 Bounds on Family Size M for Given n, θmax

θmax        Upper bound on M (q = 2)    Upper bound on M (q > 2)
√n          n/2                         n
√(2n)       n                           n²/2
2√n         3n²/10                      n³/2

8.4.1

Gold and Kasami Sequences

Given the low autocorrelation sidelobes of an m sequence, it is natural to attempt to construct families of low correlation sequences starting from m sequences. Two of the better known constructions of this type are the families of Gold and Kasami sequences.

Let r be odd and $d = 2^k + 1$ where k, 1 ≤ k ≤ r − 1, is an integer satisfying (k, r) = 1. Let {s(t)} be a cyclic shift of an m sequence of period $n = 2^r - 1$ that satisfies $s(dt) \not\equiv 0$, and let G be the Gold family of $2^r + 1$ sequences given by

\[ G = \{s(t)\} \cup \{s(dt)\} \cup \{\{s(t) + s(d[t+\tau])\} \mid 0 \le \tau \le n-1\} \]

Then each sequence in G has period $2^r - 1$ and the maximum-correlation parameter θmax of G satisfies

\[ \theta_{\max} \le \sqrt{2^{r+1}} + 1 \]

An application of the Sidelnikov bound coupled with the information that θmax must be an odd integer yields that for the family G, θmax is as small as it can possibly be. In this sense the family G is an optimal family. We remark that these comments remain true even when d is replaced by the integer $d = 2^{2k} - 2^k + 1$ with the conditions on k remaining unchanged. The Gold family remains the best-known family of m sequences having low crosscorrelation. Applications include the Navstar Global Positioning System whose signals are based on Gold sequences.

The family of Kasami sequences has a similar description. Let r = 2v and $d = 2^v + 1$. Let {s(t)} be a cyclic shift of an m sequence of period $n = 2^r - 1$ that satisfies $s(dt) \not\equiv 0$, and consider the family of Kasami sequences given by

\[ K = \{s(t)\} \cup \{\{s(t) + s(d[t+\tau])\} \mid 0 \le \tau \le 2^v - 2\} \]

Then the Kasami family K contains $2^v$ sequences of period $2^r - 1$. It can be shown that in this case

\[ \theta_{\max} = 1 + 2^v \]

This time an application of the Welch bound and the fact that θmax is an integer shows that the Kasami family is optimal in terms of having the smallest possible value of θmax for given n and M.
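The two constructions can be tried out on the smallest cases. The sketch below is illustrative and not from the original text: it assumes the m sequences generated by x³ + x + 1 (r = 3, d = 3) for the Gold family and by x⁴ + x + 1 (r = 4, v = 2, d = 5) for the Kasami family, and it searches for a cyclic shift satisfying the condition s(dt) ≢ 0. Function names are hypothetical.

```python
def lfsr_sequence(taps, init, length):
    """Binary sequence obeying s(t+r) = sum of s(t+i) for i in taps (mod 2)."""
    r = len(init)
    s = list(init)
    while len(s) < length:
        t = len(s) - r
        s.append(sum(s[t + i] for i in taps) % 2)
    return s[:length]

def correlation(x, y, tau):
    """Periodic crosscorrelation of two binary {0, 1} sequences at shift tau."""
    n = len(x)
    return sum(1 - 2 * (x[(t + tau) % n] ^ y[t]) for t in range(n))

def theta_max(family):
    """Largest |theta_ij(tau)| over all pairs and shifts with i != j or tau != 0."""
    n = len(family[0])
    return max(
        abs(correlation(x, y, tau))
        for i, x in enumerate(family)
        for j, y in enumerate(family)
        for tau in range(n)
        if i != j or tau != 0
    )

def shift_until_decimation_nonzero(s, d):
    """Cyclic shift of s whose decimation {s(dt)} is not identically zero."""
    n = len(s)
    for c in range(n):
        shifted = s[c:] + s[:c]
        if any(shifted[(d * t) % n] for t in range(n)):
            return shifted
    raise ValueError("no cyclic shift gives a nonzero decimation")

def sum_sequences(s, d, shifts):
    """The sequences {s(t) + s(d[t + tau])} (mod 2) for tau in shifts."""
    n = len(s)
    return [[s[t] ^ s[(d * (t + tau)) % n] for t in range(n)] for tau in shifts]

# Gold family: r = 3 (odd), k = 1, d = 2^k + 1 = 3, n = 7.
n = 7
s = shift_until_decimation_nonzero(lfsr_sequence((0, 1), (0, 0, 1), n), 3)
gold = [s, [s[(3 * t) % n] for t in range(n)]] + sum_sequences(s, 3, range(n))
gold_theta = theta_max(gold)

# Kasami family: r = 4 = 2v, v = 2, d = 2^v + 1 = 5, n = 15.
v, m = 2, 15
u = shift_until_decimation_nonzero(lfsr_sequence((0, 1), (0, 0, 0, 1), m), 5)
kasami = [u] + sum_sequences(u, 5, range(2 ** v - 1))
kasami_theta = theta_max(kasami)

print(len(gold), gold_theta)      # family size 2^r + 1, bound sqrt(2^(r+1)) + 1 = 5
print(len(kasami), kasami_theta)  # family size 2^v, value 1 + 2^v = 5
```

The shift search matters for the Kasami case: for one phase of the period-15 m sequence the decimation by 5 is identically zero, which is exactly the situation the condition s(dt) ≢ 0 rules out.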

8.4.2

Quaternary Sequences with Low Crosscorrelation

The entries in Table 8.1 suggest that nonbinary (i.e., q > 2) designs may be used for improved performance. A family of quaternary sequences that outperforms the Gold and Kasami sequences is discussed next.

Let f(x) be the characteristic polynomial of a binary m sequence of length $2^r - 1$ for some integer r. The coefficients of f(x) are either 0 or 1. Now, regard f(x) as a polynomial over $Z_4$ and form the product $(-1)^r f(x) f(-x)$. This can be seen to be a polynomial in $x^2$. Define the polynomial g(x) of degree r by setting $g(x^2) = (-1)^r f(x) f(-x)$. Let $g(x) = \sum_{i=0}^{r} g_i x^i$ and consider the set of all quaternary sequences {a(t)} satisfying the recursion $\sum_{i=0}^{r} g_i a(t+i) = 0$ for all t.

It turns out that, with the exception of the all-zero sequence, all of the sequences generated in this way have period $2^r - 1$. Thus, the recursion generates a family A of $2^r + 1$ cyclically distinct quaternary sequences. Closer study reveals that the maximum correlation parameter θmax of this family satisfies $\theta_{\max} \le 1 + \sqrt{2^r}$. Thus, in comparison to the family of Gold sequences, the family A offers a lower value of θmax (by a factor of $\sqrt{2}$) for the same family size. In comparison to the set of Kasami sequences, it offers a much larger family size for the same bound on θmax. Family A sequences are discussed in [16, 3].

We illustrate with an example. Let $f(x) = x^3 + x + 1$ be the characteristic polynomial of the m sequence {a(t)} in Eq. (8.1). Then over $Z_4$

\[ g(x^2) = (-1)^3 f(x) f(-x) = x^6 + 2x^4 + x^2 + 3 \]

so that $g(x) = x^3 + 2x^2 + x + 3$. Thus, the sequences in family A are generated by the recursion

\[ s(t+3) + 2s(t+2) + s(t+1) + 3s(t) = 0 \pmod 4 \]

The corresponding shift register is shown in Fig. 8.3. By varying initial conditions, this shift register can be made to generate nine cyclically distinct sequences, each of length 7. In this case $\theta_{\max} \le 1 + \sqrt{8}$.

FIGURE 8.3: Shift register that generates family A quaternary sequences {s(t)} of period 7.
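The family A example can be reproduced by brute force. The following sketch is illustrative only: it runs the recursion s(t + 3) + 2s(t + 2) + s(t + 1) + 3s(t) = 0 (mod 4) from every nonzero initial state, groups the outputs into cyclic equivalence classes, and evaluates the correlation parameter with ω = i. The helper names are hypothetical.

```python
from itertools import product
from math import sqrt

OMEGA = [1, 1j, -1, -1j]  # powers of the primitive 4th root of unity omega = i

def z4_sequence(g, init, length):
    """Z4 sequence with sum_i g[i] * s(t+i) = 0 (mod 4); g is monic of degree len(init)."""
    r = len(init)
    s = list(init)
    while len(s) < length:
        t = len(s) - r
        s.append(-sum(g[i] * s[t + i] for i in range(r)) % 4)
    return s[:length]

def canonical(s):
    """Representative of the cyclic equivalence class of one period s."""
    return min(tuple(s[k:] + s[:k]) for k in range(len(s)))

def correlation(x, y, tau):
    """theta(tau) = sum_t omega^(x(t+tau) - y(t)) for sequences over Z4."""
    n = len(x)
    return sum(OMEGA[(x[(t + tau) % n] - y[t]) % 4] for t in range(n))

g = [3, 1, 2, 1]  # coefficients g0..g3 of g(x) = x^3 + 2x^2 + x + 3
n = 7
nonzero_inits = [init for init in product(range(4), repeat=3) if any(init)]
family_a = sorted({canonical(z4_sequence(g, init, n)) for init in nonzero_inits})

theta = max(
    abs(correlation(x, y, tau))
    for i, x in enumerate(family_a)
    for j, y in enumerate(family_a)
    for tau in range(n)
    if i != j or tau != 0
)

print(len(family_a), "cyclically distinct sequences")
print("theta_max =", round(theta, 4), "bound =", round(1 + sqrt(8), 4))
```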

8.4.3

Binary Kerdock Sequences

The Gold and Kasami families of sequences are closely related to binary linear cyclic codes. It is well known in coding theory that there exist nonlinear binary codes whose performance exceeds that of the best possible linear code. Surprisingly, some of these examples come from binary codes that are images of linear quaternary (q = 4) codes under the Gray map: 0 → 00, 1 → 01, 2 → 11, 3 → 10. A prime example of this is the Kerdock code, which recently has been shown to be the Gray image of a quaternary linear code. Thus, it is not surprising that the Kerdock code yields binary sequences that significantly outperform the family of Kasami sequences.

The Kerdock sequences may be constructed as follows: let f(x) be the characteristic polynomial of an m sequence of period $2^r - 1$, r odd. As before, regarding f(x) as a polynomial over $Z_4$ (which happens to have {0, 1} coefficients), let the polynomial g(x) over $Z_4$ be defined via $g(x^2) = -f(x) f(-x)$. [Thus, g(x) is the characteristic polynomial of a family A sequence set of period $2^r - 1$.] Set $h(x) = -g(-x) = \sum_{i=0}^{r} h_i x^i$, and let S be the set of all $Z_4$ sequences satisfying the recursion $\sum_{i=0}^{r} h_i s(t+i) = 0$. Then S contains $4^r$ distinct sequences corresponding to all possible distinct initializations of the shift register. Let T denote the subset of S of size $2^r$ consisting of those sequences corresponding to initializations of the shift register only using the symbols 0 and 2 in $Z_4$. Then the set S − T of size $4^r - 2^r$ contains a set U of $2^{r-1}$ cyclically distinct sequences each of period $2(2^r - 1)$.

Given $x = a + 2b \in Z_4$ with a, b ∈ {0, 1}, let µ denote the most significant bit (MSB) map µ(x) = b. Let $K_E$ denote the family of $2^{r-1}$ binary sequences obtained by applying the map µ to each sequence in U. It turns out that each sequence in $K_E$ also has period $2(2^r - 1)$ and that, furthermore, for the family $K_E$, $\theta_{\max} \le 2 + \sqrt{2^{r+1}}$. Thus, $K_E$ is a much larger family than the Kasami family, while having almost exactly the same value of θmax.

For example, taking r = 3 and $f(x) = x^3 + x + 1$, we have from the previous family A example that $g(x) = x^3 + 2x^2 + x + 3$, so that $h(x) = -g(-x) = x^3 + 2x^2 + x + 1$.
Applying the MSB map to the head of the shift register, and discarding initializations of the shift register involving only 0’s and 2’s yields a family of four cyclically distinct binary sequences of period 14. Kerdock sequences are discussed in [6, 11, 1, 17].
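The r = 3 example can be checked by enumeration. The sketch below is illustrative only: it runs the recursion defined by h(x) = x³ + 2x² + x + 1 over Z4 from every initial state containing at least one odd symbol, keeps one representative per cyclic class, applies the MSB map, and evaluates the binary correlation parameter. Helper names are hypothetical.

```python
from itertools import product

def z4_sequence(h, init, length):
    """Z4 sequence with sum_i h[i] * s(t+i) = 0 (mod 4); h is monic of degree len(init)."""
    r = len(init)
    s = list(init)
    while len(s) < length:
        t = len(s) - r
        s.append(-sum(h[i] * s[t + i] for i in range(r)) % 4)
    return s[:length]

def canonical(s):
    """Representative of the cyclic equivalence class of one period s."""
    s = list(s)
    return min(tuple(s[k:] + s[:k]) for k in range(len(s)))

def binary_correlation(x, y, tau):
    """Periodic correlation of two binary {0, 1} sequences at shift tau."""
    n = len(x)
    return sum(1 - 2 * (x[(t + tau) % n] ^ y[t]) for t in range(n))

h = [1, 1, 2, 1]  # coefficients h0..h3 of h(x) = x^3 + 2x^2 + x + 1
period = 14       # 2(2^r - 1) for r = 3

# Discard initial states that use only the symbols 0 and 2 (the subset T).
inits = [i for i in product(range(4), repeat=3) if any(v % 2 for v in i)]
classes = sorted({canonical(z4_sequence(h, init, period)) for init in inits})

# MSB map: x = a + 2b -> b.
kerdock = [[x >> 1 for x in u] for u in classes]

theta = max(
    abs(binary_correlation(x, y, tau))
    for i, x in enumerate(kerdock)
    for j, y in enumerate(kerdock)
    for tau in range(period)
    if i != j or tau != 0
)

for b in kerdock:
    print("".join(map(str, b)))
print("theta_max =", theta)  # compare with the bound 2 + sqrt(2^(r+1)) = 6
```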

8.5

Aperiodic Correlation

Let {x(t)} and {y(t)} be complex-valued sequences of length (or period) n, not necessarily distinct. Their aperiodic correlation values $\{\rho_{x,y}(\tau) \mid -(n-1) \le \tau \le n-1\}$ are given by

\[ \rho_{x,y}(\tau) = \sum_{t=\max\{0,-\tau\}}^{\min\{n-1,\,n-1-\tau\}} x(t+\tau)\, y^*(t) \]

where $y^*(t)$ denotes the complex conjugate of y(t). When x ≡ y, we will abbreviate and write $\rho_x$ in place of $\rho_{x,y}$. The sequences described next are perhaps the most famous example of sequences with low aperiodic autocorrelation values.

8.5.1

Barker Sequences

A binary {−1, +1} sequence {s(t)} of length n is said to be a Barker sequence if the aperiodic autocorrelation values $\rho_s(\tau)$ satisfy $|\rho_s(\tau)| \le 1$ for all τ ≠ 0, −(n − 1) ≤ τ ≤ n − 1. The Barker property is preserved under the following transformations:

\[ s(t) \to -s(t), \qquad s(t) \to (-1)^t s(t), \qquad s(t) \to s(n-1-t) \]

as well as under compositions of the preceding transformations. Only the following Barker sequences are known:

n = 2     + +
n = 3     + + −
n = 4     + + + −
n = 5     + + + − +
n = 7     + + + − − + −
n = 11    + + + − − − + − − + −
n = 13    + + + + + − − + + − + − +

where + denotes +1 and − denotes −1, and further sequences are generated from these via the transformations already discussed. It is known that if any other Barker sequence exists, it must have a length n > 1,898,884 that is a multiple of 4. For an upper bound to the maximum out-of-phase aperiodic autocorrelation of an m sequence, see [13].
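The Barker property of the listed sequences, and its invariance under the three transformations, can be verified directly from the definition of the aperiodic autocorrelation. The following sketch is illustrative; the dictionary and function names are not from the original text.

```python
BARKER = {
    2: "++",
    3: "++-",
    4: "+++-",
    5: "+++-+",
    7: "+++--+-",
    11: "+++---+--+-",
    13: "+++++--++-+-+",
}

def to_pm1(pattern):
    """Convert a string of '+' and '-' into a list of +1 / -1 values."""
    return [1 if c == "+" else -1 for c in pattern]

def aperiodic_autocorrelation(s, tau):
    """rho_s(tau) for a real sequence; rho_s(-tau) = rho_s(tau)."""
    tau = abs(tau)
    return sum(s[t + tau] * s[t] for t in range(len(s) - tau))

def is_barker(s):
    """True if every out-of-phase aperiodic autocorrelation has magnitude <= 1."""
    return all(abs(aperiodic_autocorrelation(s, tau)) <= 1 for tau in range(1, len(s)))

def transformations(s):
    """The three Barker-preserving maps: negate, alternate signs, reverse."""
    return [
        [-v for v in s],
        [(-1) ** t * v for t, v in enumerate(s)],
        s[::-1],
    ]

for n, pattern in BARKER.items():
    s = to_pm1(pattern)
    sidelobes = [aperiodic_autocorrelation(s, tau) for tau in range(1, n)]
    print(n, is_barker(s), sidelobes)
```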

8.5.2

Sequences with High Merit Factor

The merit factor F of a {−1, +1} sequence {s(t)} is defined by

\[ F = \frac{n^2}{2 \sum_{\tau=1}^{n-1} \rho_s^2(\tau)} \]

Since $\rho_s(\tau) = \rho_s(-\tau)$ for 1 ≤ |τ| ≤ n − 1 and $\rho_s(0) = n$, the factor F may be regarded as the ratio of the square of the in-phase autocorrelation to the sum of the squares of the out-of-phase aperiodic autocorrelation values. Thus, the merit factor is one measure of the aperiodic autocorrelation properties of a binary {−1, +1} sequence. It is also closely connected with the signal to self-generated noise ratio of a communication system in which coded pulses are transmitted and received.

Let $F_n$ denote the largest merit factor of any binary {−1, +1} sequence of length n. For example, at length n = 13, the Barker sequence of length 13 has a merit factor $F = F_{13} = 14.08$. Assuming a certain ergodicity postulate, it was established by Golay that $\lim_{n\to\infty} F_n = 12.32$. Exhaustive computer searches carried out for n ≤ 40 have revealed the following.

1. For 1 ≤ n ≤ 40, n ≠ 11, 13: $3.3 \le F_n \le 9.85$.
2. $F_{11} = 12.1$, $F_{13} = 14.08$.

The value $F_{11}$ is also achieved by a Barker sequence. From partial searches, for lengths up to 117, the highest known merit factor is between 8 and 9.56; for lengths from 118 to 200, the best-known factor is close to 6. For lengths > 200, statistical search methods have failed to yield a sequence having merit factor exceeding 5.

An offset sequence is one in which a fraction θ of the elements of a sequence of length n are chopped off at one end and appended to the other end, i.e., an offset sequence is a cyclic shift of the original sequence by nθ symbols. It turns out that the asymptotic merit factor of m sequences is equal to 3 and is independent of the particular offset of the m sequence. There exist offsets of sequences associated with quadratic-residue and twin-prime difference sets that achieve a larger merit factor of 6. Details may be found in [7].
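As a quick check of the quoted values F11 = 12.1 and F13 = 14.08, the merit factor can be evaluated directly from its definition. This is a minimal sketch; the function names and the '+'/'-' string encoding are illustrative choices.

```python
def aperiodic_autocorrelation(s, tau):
    """rho_s(tau) for a real {-1, +1} sequence and tau >= 0."""
    return sum(s[t + tau] * s[t] for t in range(len(s) - tau))

def merit_factor(s):
    """F = n^2 / (2 * sum over tau = 1..n-1 of rho_s(tau)^2)."""
    n = len(s)
    sidelobe_energy = sum(aperiodic_autocorrelation(s, tau) ** 2 for tau in range(1, n))
    return n * n / (2 * sidelobe_energy)

def to_pm1(pattern):
    """Convert a string of '+' and '-' into a list of +1 / -1 values."""
    return [1 if c == "+" else -1 for c in pattern]

barker11 = to_pm1("+++---+--+-")
barker13 = to_pm1("+++++--++-+-+")

f11 = merit_factor(barker11)   # 121 / 10
f13 = merit_factor(barker13)   # 169 / 12
print(round(f11, 2), round(f13, 2))
```

For a Barker sequence every sidelobe is 0 or ±1, so the denominator simply counts the nonzero sidelobes twice: five of them at n = 11 and six at n = 13.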


8.5.3

Sequences with Low Aperiodic Crosscorrelation

If {u(t)} and {v(t)} are sequences of length 2n − 1 defined by

\[ u(t) = \begin{cases} x(t) & \text{if } 0 \le t \le n-1 \\ 0 & \text{if } n \le t \le 2n-2 \end{cases} \]

and

\[ v(t) = \begin{cases} y(t) & \text{if } 0 \le t \le n-1 \\ 0 & \text{if } n \le t \le 2n-2 \end{cases} \]

then

\[ \{\rho_{x,y}(\tau) \mid -(n-1) \le \tau \le n-1\} = \{\theta_{u,v}(\tau) \mid 0 \le \tau \le 2n-2\} \tag{8.6} \]

Given a collection

\[ U = \{\{x_i(t)\} \mid 1 \le i \le M\} \]

of sequences of length n over $Z_q$, let us define

\[ \rho_{\max} = \max\left\{ |\rho_{a,b}(\tau)| \;\middle|\; a, b \in U, \text{ either } a \ne b \text{ or } \tau \ne 0 \right\} \]

It is clear from Eq. (8.6) how bounds on the periodic correlation parameter θmax can be adapted to give bounds on ρmax. Translation of the Welch bound gives that for every integer k ≥ 1,

\[ \rho_{\max}^{2k} \ge \frac{n^{2k}}{M(2n-1)-1} \left\{ \frac{M(2n-1)}{\binom{2n+k-2}{k}} - 1 \right\} \]

Setting k = 1 in the preceding bound gives

\[ \rho_{\max} \ge n \sqrt{\frac{M-1}{M(2n-1)-1}} \]

Thus, for fixed M and large n, Welch's bound gives

\[ \rho_{\max} \ge O\!\left(n^{1/2}\right) \]

There exist sequence families which asymptotically achieve $\rho_{\max} \approx O(n^{1/2})$ [10].
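The zero-padding identity of Eq. (8.6) and the k = 1 form of the bound can be illustrated numerically. In this sketch, which is not part of the original text, the periodic correlation of the padded complex-valued sequences is taken to be the inner product Σ u(t + τ) v*(t), and a negative shift τ of the aperiodic correlation is matched with the periodic shift τ + (2n − 1). The test sequences are arbitrary.

```python
from math import comb, sqrt

def aperiodic_correlation(x, y, tau):
    """rho_{x,y}(tau) for -(n-1) <= tau <= n-1."""
    n = len(x)
    lo, hi = max(0, -tau), min(n - 1, n - 1 - tau)
    return sum(x[t + tau] * y[t].conjugate() for t in range(lo, hi + 1))

def periodic_correlation(u, v, tau):
    """theta_{u,v}(tau) as a complex inner product with cyclic indexing."""
    N = len(u)
    return sum(u[(t + tau) % N] * v[t].conjugate() for t in range(N))

def zero_pad(x):
    """Append n - 1 zeros so that the padded length is 2n - 1."""
    return list(x) + [0] * (len(x) - 1)

def welch_aperiodic_bound(n, M, k=1):
    """Lower bound on rho_max from the translated Welch bound."""
    N = 2 * n - 1
    value = n ** (2 * k) / (M * N - 1) * (M * N / comb(N + k - 1, k) - 1)
    return max(value, 0.0) ** (1 / (2 * k))

# Arbitrary quaternary-PSK test sequences of length n = 5.
x = [1, 1j, -1, 1, -1j]
y = [1, -1, 1j, 1j, 1]
n = len(x)
u, v = zero_pad(x), zero_pad(y)

mismatch = max(
    abs(aperiodic_correlation(x, y, tau) - periodic_correlation(u, v, tau % (2 * n - 1)))
    for tau in range(-(n - 1), n)
)
print("largest mismatch between the two sides of Eq. (8.6):", mismatch)
print("Welch bound, n = 5, M = 3, k = 1:", welch_aperiodic_bound(5, 3))
```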

8.6

Other Correlation Measures

8.6.1

Partial-Period Correlation

The partial-period (p-p) correlation between the sequences {u(t)} and {v(t)} is the collection $\{\Delta_{u,v}(l, \tau, t_0) \mid 1 \le l \le n,\ 0 \le \tau \le n-1,\ 0 \le t_0 \le n-1\}$ of inner products

\[ \Delta_{u,v}(l, \tau, t_0) = \sum_{t=t_0}^{t_0+l-1} u(t+\tau)\, v^*(t) \]

where l is the length of the partial period and the sum t + τ is again computed modulo n.

In direct-sequence CDMA systems, the pseudorandom signature sequences used by the various users are often very long for reasons of data security. In such situations, to minimize receiver hardware complexity, correlation over a partial period of the signature sequence is often used to demodulate data, as well as to achieve synchronization. For this reason, the p-p correlation properties of a sequence are of interest. Researchers have attempted to determine the moments of the p-p correlation. Here the main tool is the application of the Pless power-moment identities of coding theory [8]. The identities often allow the first and second p-p correlation moments to be completely determined. For example, this is true in the case of m sequences (the remaining moments turn out to depend upon the specific characteristic polynomial of the m sequence). Further details may be found in [15].
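A direct implementation of the partial-period correlation makes the definition concrete. The sketch below is illustrative: it uses the {−1, +1} image of a period-7 m sequence (an assumed test sequence) and checks the elementary fact that averaging Δ over all starting points t0 gives l/n times the full-period correlation, which follows from linearity of the sum and is not a substitute for the moment analysis cited above.

```python
def partial_period_correlation(u, v, l, tau, t0):
    """Delta_{u,v}(l, tau, t0): l-term inner product starting at t0, indices mod n."""
    n = len(u)
    return sum(u[(t + tau) % n] * v[t % n].conjugate() for t in range(t0, t0 + l))

# {-1, +1} image of the m sequence 0010111 (assumed test sequence).
bits = [0, 0, 1, 0, 1, 1, 1]
u = [1 - 2 * b for b in bits]
n = len(u)

l, tau = 3, 2
values = [partial_period_correlation(u, u, l, tau, t0) for t0 in range(n)]
full = partial_period_correlation(u, u, n, tau, 0)

print("partial-period values over t0:", values)
print("mean:", sum(values) / n, " l/n times full-period value:", l * full / n)
```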

8.6.2

Mean Square Correlation

Frequently in practice, there is a greater interest in the mean-square correlation distribution of a sequence family than in the parameter θmax . Quite often in sequence design, the sequence family is derived from a linear, binary cyclic code of length n by picking a set of cyclically distinct sequences of period n. The families of Gold and Kasami sequences are so constructed. In this case, as pointed out by Massey, the mean square correlation of the family can be shown to be either optimum or close to optimum, under certain easily satisfied conditions, imposed on the minimum distance of the dual code. A similar situation holds even when the sequence family does not come from a linear cyclic code. In this sense, mean square correlation is not a very discriminating measure of the correlation properties of a family of sequences. An expanded discussion of this issue may be found in [5].

8.6.3

Optical Orthogonal Codes

Given a pair of {0, 1} sequences $\{s_1(t)\}$ and $\{s_2(t)\}$ each having period n, we define the Hamming correlation function $\theta_{12}(\tau)$, 0 ≤ τ ≤ n − 1, by

\[ \theta_{12}(\tau) = \sum_{t=0}^{n-1} s_1(t+\tau)\, s_2(t) \]

Such correlations are of interest, for instance, in optical communication systems where the 1's and 0's in a sequence correspond to the presence or absence of pulses of transmitted light.

An (n, w, λ) optical orthogonal code (OOC) is a family $F = \{\{s_i(t)\} \mid i = 1, 2, \ldots, M\}$ of M {0, 1} sequences of period n and constant Hamming weight w, where w is an integer lying between 1 and n − 1, satisfying $\theta_{ij}(\tau) \le \lambda$ whenever either i ≠ j or τ ≠ 0. Note that the Hamming distance $d_{a,b}$ between a period of the corresponding codewords {a(t)}, {b(t)}, 0 ≤ t ≤ n − 1, in an (n, w, λ) OOC having Hamming correlation ρ, 0 ≤ ρ ≤ λ, is given by $d_{a,b} = 2(w - \rho)$, and, thus, OOCs are closely related to constant-weight error correcting codes. Given an (n, w, λ) OOC, by enlarging the OOC to include every cyclic shift of each sequence in the code, one obtains a constant-weight code with minimum distance $d_{\min} \ge 2(w - \lambda)$. Conversely, given a constant-weight cyclic code of length n, weight w, and minimum distance $d_{\min}$, one can derive an (n, w, λ) OOC with $\lambda \le w - d_{\min}/2$ by partitioning the code into cyclic equivalence classes and then picking precisely one representative from each equivalence class of size n.

By making use of this connection, one can derive bounds on the size of an OOC from known bounds on the size of constant-weight codes. The bound given next follows directly from the Johnson bound for constant weight codes [8]. The number M(n, w, λ) of codewords in an (n, w, λ) OOC satisfies

\[ M(n, w, \lambda) \le \left\lfloor \frac{1}{w} \left\lfloor \frac{n-1}{w-1} \left\lfloor \cdots \left\lfloor \frac{n-\lambda+1}{w-\lambda+1} \left\lfloor \frac{n-\lambda}{w-\lambda} \right\rfloor \right\rfloor \cdots \right\rfloor \right\rfloor \right\rfloor \]

An OOC that achieves the Johnson bound is said to be optimal. A family $\{F_n\}$ of OOCs indexed by the parameter n and arising from a common construction is said to be asymptotically optimum if

\[ \lim_{n\to\infty} \frac{|F_n|}{M(n, w, \lambda)} = 1 \]

Constructions for optical orthogonal codes are available for the cases when λ = 1 and λ = 2. For larger values of λ, there exist constructions which are asymptotically optimum. Further details may be found in [6].
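The Hamming correlation and the Johnson bound are easy to evaluate for a small code. The sketch below is illustrative: the (13, 3, 1) code with pulse positions {0, 1, 4} and {0, 2, 7} is an example chosen here, not one given in the original text, and the function names are hypothetical.

```python
def hamming_correlation(s1, s2, tau):
    """theta_12(tau) = sum_t s1(t + tau) * s2(t) for {0, 1} sequences of period n."""
    n = len(s1)
    return sum(s1[(t + tau) % n] * s2[t] for t in range(n))

def johnson_bound(n, w, lam):
    """Nested-floor Johnson bound on the size of an (n, w, lam) OOC."""
    bound = (n - lam) // (w - lam)             # innermost floor
    for i in range(lam - 1, 0, -1):            # work outward: i = lam-1, ..., 1
        bound = ((n - i) * bound) // (w - i)
    return bound // w                          # outermost factor 1/w

def codeword(n, support):
    """{0, 1} sequence of period n with ones at the given positions."""
    return [1 if t in support else 0 for t in range(n)]

n, w = 13, 3
code = [codeword(n, {0, 1, 4}), codeword(n, {0, 2, 7})]  # assumed example code

lam = max(
    hamming_correlation(a, b, tau)
    for i, a in enumerate(code)
    for j, b in enumerate(code)
    for tau in range(n)
    if i != j or tau != 0
)
rho = hamming_correlation(code[0], code[1], 0)
distance = sum(x != y for x, y in zip(code[0], code[1]))

print("lambda =", lam, " Johnson bound =", johnson_bound(n, w, lam))
print("Hamming distance", distance, "= 2(w - rho) =", 2 * (w - rho))
```

In this example the code size equals the Johnson bound, so the code is optimal in the sense defined above.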

Defining Terms

Autocorrelation of a sequence: The complex inner product of the sequence with a shifted version of itself.
Crosscorrelation of two sequences: The complex inner product of the first sequence with a shifted version of the second sequence.
m Sequence: A periodic binary {0, 1} sequence that is generated by a shift register with linear feedback and which has maximal possible period given the number of stages in the shift register.
Pseudonoise sequences: Also referred to as pseudorandom sequences (PN), these are sequences that are deterministically generated and yet possess some properties that one would expect to find in randomly generated sequences.
Shift-register sequence: A sequence with symbols drawn from a field, which satisfies a linear recurrence relation and which can be implemented using a shift register.

References

[1] Barg, A. On small families of sequences with low periodic correlation, Lecture Notes in Computer Science, 781, 154–158, Berlin, Springer-Verlag, 1994.
[2] Baumert, L.D. Cyclic Difference Sets, Lecture Notes in Mathematics 182, Springer-Verlag, New York, 1971.
[3] Boztaş, S., Hammons, R., and Kumar, P.V. 4-phase sequences with near-optimum correlation properties, IEEE Trans. Inform. Theory, IT-38, 1101–1113, 1992.
[4] Golomb, S.W. Shift Register Sequences, Aegean Park Press, San Francisco, CA, 1982.
[5] Hammons, A.R., Jr. and Kumar, P.V. On a recent 4-phase sequence design for CDMA. IEICE Trans. Commun., E76-B(8), 1993.
[6] Helleseth, T. and Kumar, P.V. (planned). Sequences with low correlation. In Handbook of Coding Theory, ed., V.S. Pless and W.C. Huffman, Elsevier Science Publishers, Amsterdam, 1998.
[7] Jensen, J.M., Jensen, H.E., and Høholdt, T. The merit factor of binary sequences related to difference sets. IEEE Trans. Inform. Theory, IT-37(May), 617–626, 1991.
[8] MacWilliams, F.J. and Sloane, N.J.A. The Theory of Error-Correcting Codes, North-Holland, Amsterdam, 1977.
[9] Maschietti, A. Difference sets and hyperovals, Designs, Codes and Cryptography, 14, 89–98, 1998.
[10] Mow, W.H. On McEliece's open problem on minimax aperiodic correlation. In Proc. IEEE Intern. Symp. Inform. Theory, 75, 1994.
[11] Nechaev, A. The Kerdock code in a cyclic form, Discrete Math. Appl., 1, 365–384, 1991.
[12] Peterson, W.W. and Weldon, E.J., Jr. Error-Correcting Codes, 2nd ed. MIT Press, Cambridge, MA, 1972.
[13] Sarwate, D.V. An upper bound on the aperiodic autocorrelation function for a maximal-length sequence. IEEE Trans. Inform. Theory, IT-30(July), 685–687, 1984.
[14] Sarwate, D.V. and Pursley, M.B. Crosscorrelation properties of pseudorandom and related sequences. Proc. IEEE, 68(May), 593–619, 1980.
[15] Simon, M.K., Omura, J.K., Scholtz, R.A., and Levitt, B.K. Spread Spectrum Communications Handbook, revised ed., McGraw Hill, New York, 1994.
[16] Solé, P. A quaternary cyclic code and a family of quadriphase sequences with low correlation properties, Coding Theory and Applications, Lecture Notes in Computer Science, 388, 193–201, Berlin, Springer-Verlag, 1989.
[17] Udaya, P. and Siddiqi, M. Optimal biphase sequences with large linear complexity derived from sequences over Z4, IEEE Trans. Inform. Theory, IT-42(Jan), 206–216, 1996.

Further Information

A more in-depth treatment of pseudonoise sequences may be found in the following.

[1] Golomb, S.W. Shift Register Sequences, Aegean Park Press, San Francisco, 1982.
[2] Helleseth, T. and Kumar, P.V. Sequences with Low Correlation, in Handbook of Coding Theory, edited by V.S. Pless and W.C. Huffman, Elsevier Science Publishers, Amsterdam, 1998 (planned).
[3] Sarwate, D.V. and Pursley, M.B. Crosscorrelation Properties of Pseudorandom and Related Sequences, Proc. IEEE, 68, May, 593–619, 1980.
[4] Simon, M.K., Omura, J.K., Scholtz, R.A., and Levitt, B.K. Spread Spectrum Communications Handbook, revised ed., McGraw Hill, New York, 1994.


Orsak, G.C. “Optimum Receivers” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Optimum Receivers

Geoffrey C. Orsak
Southern Methodist University

9.1 Introduction
9.2 Preliminaries
9.3 Karhunen–Loève Expansion
9.4 Detection Theory
9.5 Performance
9.6 Signal Space
9.7 Standard Binary Signalling Schemes
9.8 M-ary Optimal Receivers
9.9 More Realistic Channels
    Random Phase Channels • Rayleigh Channel
9.10 Dispersive Channels
Defining Terms
References
Further Information

9.1

Introduction

Every engineer strives for optimality in design. This is particularly true for communications engineers since in many cases implementing suboptimal receivers and sources can result in dramatic losses in performance. As such, this chapter focuses on design principles leading to the implementation of optimum receivers for the most common communication environments. The main objective in digital communications is to transmit a sequence of bits to a remote location with the highest degree of accuracy. This is accomplished by first representing bits (or more generally short bit sequences) by distinct waveforms of finite time duration. These time-limited waveforms are then transmitted (broadcast) to the remote sites in accordance with the data sequence. Unfortunately, because of the nature of the communication channel, the remote location receives a corrupted version of the concatenated signal waveforms. The most widely accepted model for the communication channel is the so-called additive white Gaussian noise¹ channel (AWGN channel).

¹ For those unfamiliar with AWGN, a random process (waveform) is formally said to be white Gaussian noise if all collections of instantaneous observations of the process are jointly Gaussian and mutually independent. An important consequence of this property is that the power spectral density of the process is a constant with respect to frequency variation (spectrally flat). For more on AWGN, see Papoulis [4].

Mathematical arguments based upon the central limit theorem [7], together with supporting empirical evidence, demonstrate that many common communication channels are accurately modeled by this abstraction. Moreover, from the design perspective, this is quite fortuitous since design and analysis with respect to this channel model is relatively straightforward.

9.2

Preliminaries

To better describe the digital communications process, we shall first elaborate on so-called binary communications. In this case, when the source wishes to transmit a bit value of 0, the transmitter broadcasts a specified waveform $s_0(t)$ over the bit interval t ∈ [0, T]. Conversely, if the source seeks to transmit the bit value of 1, the transmitter alternatively broadcasts the signal $s_1(t)$ over the same bit interval. The received waveform R(t) corresponding to the first bit is then appropriately described by the following hypothesis testing problem:

\[ \begin{aligned} H_0 &: R(t) = s_0(t) + \eta(t) \\ H_1 &: R(t) = s_1(t) + \eta(t) \end{aligned} \qquad 0 \le t \le T \tag{9.1} \]

where, as stated previously, η(t) corresponds to AWGN with spectral height nominally given by $N_0/2$. It is the objective of the receiver to determine the bit value, i.e., the most accurate hypothesis, from the received waveform R(t). The optimality criterion of choice in digital communication applications is the total probability of error, normally denoted as $P_e$. This scalar quantity is expressed as

\[ P_e = \Pr(\text{declaring } 1 \mid 0 \text{ transmitted}) \Pr(0 \text{ transmitted}) + \Pr(\text{declaring } 0 \mid 1 \text{ transmitted}) \Pr(1 \text{ transmitted}) \tag{9.2} \]

The problem of determining the optimal binary receiver with respect to the probability of error is solved by applying stochastic representation theory [10] to detection theory [5, 9]. The specific waveform representation of relevance in this application is the Karhunen–Loève (KL) expansion.

9.3

Karhunen–Loève Expansion

The Karhunen–Loève expansion is a generalization of the Fourier series designed to represent a random process in terms of deterministic basis functions and uncorrelated random variables derived from the process. Whereas the Fourier series allows one to model or represent deterministic time-limited energy signals in terms of linear combinations of complex exponential waveforms, the Karhunen–Loève expansion allows us to represent a second-order random process in terms of a set of orthonormal basis functions scaled by a sequence of random variables. The objective in this representation is to choose the basis of time functions so that the coefficients in the expansion are mutually uncorrelated random variables.

To be more precise, if R(t) is a zero mean second-order random process defined over [0, T] with covariance function KR(t, s), then so long as the basis of deterministic functions satisfies certain integral constraints [9], one may write R(t) as

$$
R(t) = \sum_{i=1}^{\infty} R_i \phi_i(t), \qquad 0 \le t \le T \tag{9.3}
$$

where

$$
R_i = \int_0^T R(t)\,\phi_i(t)\,dt
$$

In this case the Ri will be mutually uncorrelated random variables with the φi being deterministic basis functions that are complete in the space of square integrable time functions over [0, T]. Importantly, in this case, equality is to be interpreted as mean-square equivalence, i.e.,

$$
\lim_{N \to \infty} E\left[\left(R(t) - \sum_{i=1}^{N} R_i \phi_i(t)\right)^{2}\right] = 0
$$

for all 0 ≤ t ≤ T.

FACT 9.1  If R(t) is AWGN, then any basis of the vector space of square integrable signals over [0, T] results in uncorrelated and therefore independent Gaussian random variables.

The use of Fact 9.1 allows for a conversion of a continuous time detection problem into a finite-dimensional detection problem. Proceeding, to derive the optimal binary receiver, we first construct our set of basis functions as the set of functions defined over t ∈ [0, T] beginning with the signals of interest s0(t) and s1(t), that is, s0(t), s1(t), plus a countable number of functions which complete the basis. In order to ensure that the basis is orthonormal, we must apply the Gram–Schmidt procedure² [6] to the full set of functions beginning with s0(t) and s1(t) to arrive at our final choice of basis {φi(t)}.

FACT 9.2  Let {φi(t)} be the resultant set of basis functions. Then for all i > 2, the φi(t) are orthogonal to s0(t) and s1(t). That is,

$$
\int_0^T \phi_i(t)\,s_j(t)\,dt = 0
$$

for all i > 2 and j = 0, 1.

Using this fact in conjunction with Eq. (9.3), one may recognize that only the coefficients R1 and R2 are functions of our signals of interest. Moreover, since the Ri are mutually independent, the optimal receiver will, therefore, only be a function of these two values. Thus, through the application of the KL expansion, we arrive at an equivalent hypothesis testing problem to that given in Eq. (9.1),

$$
\begin{aligned}
H_0 &: \mathbf{R} = \begin{bmatrix} \int_0^T \phi_1(t)\,s_0(t)\,dt \\ \int_0^T \phi_2(t)\,s_0(t)\,dt \end{bmatrix} + \begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix} \\
H_1 &: \mathbf{R} = \begin{bmatrix} \int_0^T \phi_1(t)\,s_1(t)\,dt \\ \int_0^T \phi_2(t)\,s_1(t)\,dt \end{bmatrix} + \begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix}
\end{aligned}
\tag{9.4}
$$

2 The Gram–Schmidt procedure is a deterministic algorithm that simply converts an arbitrary set of basis functions (vectors) into an equivalent set of orthonormal basis functions (vectors).

where it is easily shown that η1 and η2 are mutually independent, zero-mean, Gaussian random variables with variance given by N0 /2, and where φ1 and φ2 are the first two functions from our orthonormal set of basis functions. Thus, the design of the optimal binary receiver reduces to a simple two-dimensional detection problem that is readily solved through the application of detection theory.
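The reduction to a two-dimensional problem can be sketched numerically. The following is a minimal illustration, assuming sampled waveforms and a Riemann-sum approximation of the integrals; the helper names (`inner`, `gram_schmidt`) and the two example signals are hypothetical and not taken from the text.

```python
import math

def inner(x, y, dt):
    """Riemann-sum approximation of the integral of x(t) y(t) over [0, T]."""
    return sum(a * b for a, b in zip(x, y)) * dt

def gram_schmidt(signals, dt, tol=1e-9):
    """Return orthonormal sampled basis functions spanning `signals`."""
    basis = []
    for s in signals:
        residual = list(s)
        # Subtract the components along the basis functions found so far.
        for phi in basis:
            c = inner(residual, phi, dt)
            residual = [r - c * p for r, p in zip(residual, phi)]
        norm = math.sqrt(inner(residual, residual, dt))
        if norm > tol:  # a signal that adds no new dimension is skipped
            basis.append([r / norm for r in residual])
    return basis

# Hypothetical example signals on [0, T], sampled at interval midpoints.
T, n = 1.0, 1000
dt = T / n
t = [(k + 0.5) * dt for k in range(n)]
s0 = [math.sin(2 * math.pi * 2 * tk) for tk in t]
s1 = [math.sin(2 * math.pi * 2 * tk) + 0.5 * math.sin(2 * math.pi * 3 * tk)
      for tk in t]

basis = gram_schmidt([s0, s1], dt)            # phi_1(t), phi_2(t)
# Signal vectors (s_{j,1}, s_{j,2}), the noise-free part of Eq. (9.4).
s0_vec = [inner(s0, phi, dt) for phi in basis]
s1_vec = [inner(s1, phi, dt) for phi in basis]
```

Because φ1 is built from s0(t) alone, the second coordinate of `s0_vec` is zero, and the remaining basis functions never need to be computed.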

9.4

Detection Theory

It is well known from detection theory [5] that under the minimum Pe criterion, the optimal detector is given by the maximum a posteriori (MAP) rule,

$$
\text{choose the } i \text{ with the largest } p_{H_i \mid \mathbf{R}}(H_i \mid \mathbf{R} = \mathbf{r}) \tag{9.5}
$$

i.e., determine the hypothesis that is most likely, given that our observation vector is r. By a simple application of Bayes theorem [4], we immediately arrive at the central result in detection theory: the optimal binary detector is given by the likelihood ratio test (LRT),

$$
L(\mathbf{R}) = \frac{p_{\mathbf{R} \mid H_1}(\mathbf{R})}{p_{\mathbf{R} \mid H_0}(\mathbf{R})} \underset{H_0}{\overset{H_1}{\gtrless}} \frac{\pi_0}{\pi_1} \tag{9.6}
$$

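where the πi are the a priori probabilities of the hypotheses Hi being true. Since in this case we have assumed that the noise is white and Gaussian, the LRT can be written as

$$
L(\mathbf{R}) = \frac{\prod_{i=1}^{2} \frac{1}{\sqrt{\pi N_0}} \exp\left(-\frac{1}{2}\,\frac{(R_i - s_{1,i})^2}{N_0/2}\right)}{\prod_{i=1}^{2} \frac{1}{\sqrt{\pi N_0}} \exp\left(-\frac{1}{2}\,\frac{(R_i - s_{0,i})^2}{N_0/2}\right)} \underset{H_0}{\overset{H_1}{\gtrless}} \frac{\pi_0}{\pi_1} \tag{9.7}
$$

where

$$
s_{j,i} = \int_0^T \phi_i(t)\,s_j(t)\,dt
$$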

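By taking the logarithm and cancelling common terms, it is easily shown that the optimum binary receiver can be written as

$$
\frac{2}{N_0} \sum_{i=1}^{2} R_i \left(s_{1,i} - s_{0,i}\right) - \frac{1}{N_0} \sum_{i=1}^{2} \left(s_{1,i}^2 - s_{0,i}^2\right) \underset{H_0}{\overset{H_1}{\gtrless}} \ln \frac{\pi_0}{\pi_1} \tag{9.8}
$$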

This finite-dimensional version of the optimal receiver can be converted back into a continuous time receiver by the direct application of Parseval’s theorem [4] where it is easily shown that

$$
\sum_{i=1}^{2} R_i s_{k,i} = \int_0^T R(t)\,s_k(t)\,dt, \qquad \sum_{i=1}^{2} s_{k,i}^2 = \int_0^T s_k^2(t)\,dt \tag{9.9}
$$

By applying Eq. (9.9) to Eq. (9.8) the final receiver structure is then given by

$$
\int_0^T R(t)\left[s_1(t) - s_0(t)\right] dt - \frac{1}{2}\left(E_1 - E_0\right) \underset{H_0}{\overset{H_1}{\gtrless}} \frac{N_0}{2} \ln \frac{\pi_0}{\pi_1} \tag{9.10}
$$

where E1 and E0 are the energies of signals s1 (t) and s0 (t), respectively. (See Fig. 9.1 for a block diagram.) Importantly, if the signals are equally likely (π0 = π1 ), the optimal receiver is independent of the typically unknown spectral height of the background noise.
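The decision rule of Eq. (9.10) can be sketched on sampled waveforms as follows. This is only an illustration under stated assumptions: the integrals are approximated by Riemann sums, and the function name `correlation_receiver`, the antipodal test signals, and the numerical values are all hypothetical.

```python
import math

def correlation_receiver(r, s0, s1, dt, n0, pi0=0.5, pi1=0.5):
    """Return 1 if H1 is declared and 0 otherwise, following Eq. (9.10)."""
    # Correlate the received samples with the difference signal s1 - s0.
    corr = sum(rk * (a - b) for rk, a, b in zip(r, s1, s0)) * dt
    e0 = sum(a * a for a in s0) * dt          # energy of s0(t)
    e1 = sum(a * a for a in s1) * dt          # energy of s1(t)
    statistic = corr - 0.5 * (e1 - e0)
    threshold = 0.5 * n0 * math.log(pi0 / pi1)
    return 1 if statistic > threshold else 0

# Hypothetical antipodal pair on [0, 1], sampled at interval midpoints.
n = 1000
dt = 1.0 / n
t = [(k + 0.5) * dt for k in range(n)]
s1 = [math.sin(2 * math.pi * 4 * tk) for tk in t]
s0 = [-x for x in s1]

decision_equal_priors = correlation_receiver(s1, s0, s1, dt, n0=2.0)
decision_skewed_priors = correlation_receiver(s1, s0, s1, dt, n0=2.0,
                                              pi0=0.9, pi1=0.1)
```

With equal priors the threshold is zero and N0 drops out of the rule, as noted above. With the skewed priors of the second call the threshold rises to (N0/2) ln 9, and the same received waveform is assigned to H0.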

FIGURE 9.1: Optimal correlation receiver structure for binary communications.

One can readily observe that the optimal binary communication receiver correlates the received waveform with the difference signal s1(t) − s0(t) and then compares the statistic to a threshold. This operation can be interpreted as identifying the signal waveform si(t) that best correlates with the received signal R(t). Based on this interpretation, the receiver is often referred to as the correlation receiver.

As an alternate means of implementing the correlation receiver, we may reformulate the computation of the left-hand side of Eq. (9.10) in terms of standard concepts in filtering. Let h(t) be the impulse response of a linear, time-invariant (LTI) system. By letting h(t) = s1(T − t) − s0(T − t), it is easily verified that passing R(t) through an LTI system with impulse response h(t) and sampling the output at time t = T gives the desired result. (See Fig. 9.2 for a block diagram.) Since the impulse response is matched to the signal waveforms, this implementation is often referred to as the matched filter receiver.
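The equivalence between the correlator and the matched filter sampled at t = T can be checked numerically. The sketch below assumes uniformly sampled waveforms and a discrete convolution sum in place of the convolution integral; the function names and the test waveform are hypothetical.

```python
import math

def correlator_output(r, s0, s1, dt):
    """Integral of R(t) [s1(t) - s0(t)] over [0, T], as a Riemann sum."""
    return sum(rk * (a - b) for rk, a, b in zip(r, s1, s0)) * dt

def matched_filter_output_at_T(r, s0, s1, dt):
    """Output at t = T of an LTI filter with h(t) = s1(T - t) - s0(T - t)."""
    n = len(r)
    # Samples of h(t): the difference signal reversed in time.
    h = [s1[n - 1 - k] - s0[n - 1 - k] for k in range(n)]
    # Convolution sum evaluated at the final sample, i.e., y(T).
    return sum(r[k] * h[n - 1 - k] for k in range(n)) * dt

# Hypothetical signals and a deterministic "received" waveform.
n = 500
dt = 1.0 / n
t = [(k + 0.5) * dt for k in range(n)]
s0 = [math.sin(2 * math.pi * 2 * tk) for tk in t]
s1 = [math.sin(2 * math.pi * 3 * tk) for tk in t]
r = [b + 0.2 * math.cos(2 * math.pi * 7 * tk) for b, tk in zip(s1, t)]

corr = correlator_output(r, s0, s1, dt)
mf = matched_filter_output_at_T(r, s0, s1, dt)
```

Both computations sum the same products, so `corr` and `mf` agree to rounding error. In this sketch the two structures differ only in how the sum is organized.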

FIGURE 9.2: Optimal matched filter receiver structure for binary communications. In this case h(t) = s1(T − t) − s0(T − t).


9.5

Performance

Because of the nature of the statistics of the channel and the relative simplicity of the receiver, performance analysis of the optimal binary receiver in AWGN is a straightforward task. Since the log likelihood ratio is a Gaussian random variable when conditioned on either hypothesis, the probability of error can be computed directly in terms of the Gaussian Q function³ as

$$
P_e = Q\left(\frac{\|\mathbf{s}_0 - \mathbf{s}_1\|}{\sqrt{2N_0}}\right)
$$

where the s_i are the two-dimensional signal vectors obtained from Eq. (9.4), and where ‖x‖ denotes the Euclidean length of the vector x. Thus, ‖s_0 − s_1‖ is best interpreted as the distance between the respective signal representations. Since the Q function is monotonically decreasing with an increasing argument, one may recognize that the probability of error for the optimal receiver decreases with an increasing separation between the signal representations, i.e., the more dissimilar the signals, the lower the Pe.

9.6

Signal Space

The concept of a signal space allows one to view the signal classification problem (receiver design) within a geometrical framework. This offers two primary benefits: first it supplies an often more intuitive perspective on the receiver characteristics (e.g., performance) and second it allows for a straightforward generalization to standard M-ary signalling schemes. To demonstrate this, in Fig. 9.3, we have plotted an arbitrary signal space for the binary signal classification problem. The axes are given in terms of the basis functions φ1 (t) and φ2 (t). Thus, every point in the signal space is a time function constructed as a linear combination of the two basis functions. By Fact 9.2, we recall that both signals s0 (t) and s1 (t) can be constructed as a linear combination of φ1 (t) and φ2 (t) and as such we may identify these two signals in this figure as two points. Since the decision statistic given in Eq. (9.8) is a linear function of the observed vector R which is also located in the signal space, it is easily shown that the set of vectors under which the receiver declares hypothesis Hi is bounded by a line in the signal space. This so-called decision boundary is obtained by solving the equation ln[L(R)] = 0. (Here again we have assumed equally likely hypotheses.) In the case under current discussion, this decision boundary is simply the hyperplane separating the two signals in signal space. Because of the generality of this formulation, many problems in communication system design are best cast in terms of the signal space, that is, signal locations and decision boundaries.

9.7

Standard Binary Signalling Schemes

The framework just described allows us to readily analyze the most popular signalling schemes in binary communications: amplitude-shift keying (ASK), frequency-shift keying (FSK), and phase-shift keying (PSK). Each of these examples simply constitutes a different selection for signals s0(t) and s1(t).

In the case of ASK, s0(t) = 0, while s1(t) = √(2E/T) sin(2πfc t), where E denotes the energy of the waveform and fc denotes the frequency of the carrier wave with fc T being an integer. Because s0(t) is the null signal, the signal space is a one-dimensional vector space with φ1(t) = √(2/T) sin(2πfc t). This, in turn, implies that ‖s_0 − s_1‖ = √E. Thus, the corresponding probability of error for ASK is

$$
P_e(\text{ASK}) = Q\left(\sqrt{\frac{E}{2N_0}}\right)
$$

For FSK, the signals are given by equal amplitude sinusoids with distinct center frequencies, that is, si(t) = √(2E/T) sin(2πfi t) with fi T being two distinct integers. In this case, it is easily verified that the signal space is a two-dimensional vector space with φi(t) = √(2/T) sin(2πfi t), resulting in ‖s_0 − s_1‖ = √(2E). The corresponding error rate is given by

$$
P_e(\text{FSK}) = Q\left(\sqrt{\frac{E}{N_0}}\right)
$$

Finally, with regard to PSK signalling, the most frequently utilized binary PSK signal set is an example of an antipodal signal set. Specifically, the antipodal signal set results in the greatest separation between the signals in the signal space subject to an energy constraint on both signals. This, in turn, translates into the energy constrained signal set with the minimum Pe. In this case, the si(t) are typically given by √(2E/T) sin[2πfc t + θ(i)], where θ(0) = 0 and θ(1) = π. As in the ASK case, this results in a one-dimensional signal space; however, in this case ‖s_0 − s_1‖ = 2√E, resulting in a probability of error given by

$$
P_e(\text{PSK}) = Q\left(\sqrt{\frac{2E}{N_0}}\right)
$$

In all three of the described cases, one can readily observe that the resulting performance is a function of only the signal-to-noise ratio E/N0. In the more general case, the performance will be a function of the intersignal energy to noise ratio.

3 The Q function is the probability that a standard normal random variable exceeds a specified constant, i.e., $Q(x) = \int_x^{\infty} \frac{1}{\sqrt{2\pi}} \exp(-z^2/2)\,dz$.

FIGURE 9.3: Signal space and decision boundary for optimal binary receiver.
To gauge the relative difference in performance of the three signalling schemes, in Fig. 9.4, we have plotted the Pe as a function of the SNR. Please note the large variation in performance between the three schemes for even moderate values of SNR.
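The three error-rate expressions can be evaluated directly. The sketch below assumes the identity Q(x) = erfc(x/√2)/2 for the Q function of footnote 3; the function names are illustrative only.

```python
import math

def q_function(x):
    """Tail probability of a standard normal random variable."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def pe_binary(distance, n0):
    """Pe = Q(||s0 - s1|| / sqrt(2 N0)) for the optimal binary receiver."""
    return q_function(distance / math.sqrt(2.0 * n0))

def pe_ask(e, n0):
    return pe_binary(math.sqrt(e), n0)          # ||s0 - s1|| = sqrt(E)

def pe_fsk(e, n0):
    return pe_binary(math.sqrt(2.0 * e), n0)    # ||s0 - s1|| = sqrt(2E)

def pe_psk(e, n0):
    return pe_binary(2.0 * math.sqrt(e), n0)    # ||s0 - s1|| = 2 sqrt(E)

# Error rates over a range of E/N0 values in dB (N0 normalized to 1).
for snr_db in (0, 4, 8, 12):
    e = 10.0 ** (snr_db / 10.0)
    print(snr_db, pe_ask(e, 1.0), pe_fsk(e, 1.0), pe_psk(e, 1.0))
```

Doubling E (a 3-dB increase) moves ASK onto the FSK curve and FSK onto the PSK curve, since each step doubles the squared distance between the signal points.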

FIGURE 9.4: Pe vs. the signal-to-noise ratio in decibels [dB = 10 log10(E/N0)] for amplitude-shift keying, frequency-shift keying, and phase-shift keying; note that there is a 3-dB difference in performance from ASK to FSK to PSK.

9.8

M-ary Optimal Receivers

In binary signalling schemes, one seeks to transmit a single bit over the bit interval [0, T]. This is to be contrasted with M-ary signalling schemes where one transmits multiple bits simultaneously over the so-called symbol interval [0, T]. For example, using a signal set with 16 separate waveforms will allow one to transmit a length four-bit sequence per symbol (waveform). Examples of M-ary waveforms are quadrature phase-shift keying (QPSK) and quadrature amplitude modulation (QAM).

The derivation of the optimum receiver structure for M-ary signalling requires the straightforward application of fundamental results in detection theory. As with binary signalling, the Karhunen–Loève expansion is the mechanism utilized to convert a hypotheses testing problem based on continuous waveforms into a vector classification problem. Depending on the complexity of the M waveforms, the signal space can be as large as an M-dimensional vector space. By extending results from the binary signalling case, it is easily shown that the optimum M-ary receiver computes

$$
\xi_i[R(t)] = \int_0^T s_i(t)\,R(t)\,dt - \frac{E_i}{2} + \frac{N_0}{2} \ln \pi_i, \qquad i = 1, \ldots, M
$$

where, as before, the si(t) constitute the signal set with the πi being the corresponding a priori probabilities. After computing M separate values of ξi, the minimum probability of error receiver simply chooses the largest amongst this set. Thus, the M-ary receiver is implemented with a bank of correlation or matched filters followed by choose-largest decision logic.

In many cases of practical importance, the signal sets are selected so that the resulting signal space is a two-dimensional vector space irrespective of the number of signals. This simplifies the receiver structure in that the sufficient statistics are obtained by implementing only two matched filters. Both QPSK and QAM signal sets fit into this category. As an example, in Fig. 9.5, we have depicted the signal locations for standard 16-QAM signalling with the associated decision boundaries. In this case we have assumed an equally likely signal set. As can be seen, the optimal decision rule selects the signal representation that is closest to the received signal representation in this two-dimensional signal space.
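The choose-largest rule can be sketched in the two-dimensional signal space, where the correlation integral becomes a dot product between the received vector and each signal vector. The following assumes a square 16-QAM grid with hypothetical spacing; the function names are illustrative and not from the text.

```python
import math

def qam16_points(spacing=2.0):
    """Signal-space coordinates of a square 16-QAM constellation."""
    levels = [spacing * (k - 1.5) for k in range(4)]
    return [(x, y) for x in levels for y in levels]

def mary_decision(r, points, n0=1.0, priors=None):
    """Index i maximizing xi_i = <r, s_i> - E_i/2 + (N0/2) ln(pi_i)."""
    m = len(points)
    if priors is None:
        priors = [1.0 / m] * m                 # equally likely signals

    def xi(i):
        sx, sy = points[i]
        energy = sx * sx + sy * sy
        return (r[0] * sx + r[1] * sy - 0.5 * energy
                + 0.5 * n0 * math.log(priors[i]))

    return max(range(m), key=xi)

def nearest_point(r, points):
    """Index of the constellation point closest to r (Euclidean distance)."""
    return min(range(len(points)),
               key=lambda i: (r[0] - points[i][0]) ** 2
               + (r[1] - points[i][1]) ** 2)

points = qam16_points()
received = (0.9, -2.7)                          # hypothetical observation
decided = points[mary_decision(received, points)]
```

For equally likely signals the two functions return the same index, since maximizing ⟨r, s_i⟩ − E_i/2 is the same as minimizing ‖r − s_i‖².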

9.9

More Realistic Channels

Unfortunately, many channels of practical interest are not accurately modeled as simply an AWGN channel. Often these channels impose nonlinear effects on the transmitted signals. The best examples are channels that impose a random phase and a random amplitude onto the signal. This typically occurs in applications such as mobile communications, where one often experiences rapidly changing path lengths from source to receiver.

Fortunately, by the judicious choice of signal waveforms, it can be shown that the selection of the φi in the Karhunen–Loève transformation is often independent of these unwanted parameters. In these situations, the random amplitude serves only to scale the signals in signal space, whereas the random phase simply imposes a rotation on the signals in signal space. Since the Karhunen–Loève basis functions typically do not depend on the unknown parameters, we may again convert the continuous time classification problem to a vector channel problem where the received vector R is computed as in Eq. (9.3). Since this vector is a function of both the unknown parameters (i.e., in this case amplitude A and phase ν), to obtain a likelihood ratio test independent of A and ν, we simply apply Bayes theorem to obtain the following form for the LRT:

$$
L(\mathbf{R}) = \frac{E\left[p_{\mathbf{R} \mid H_1, A, \nu}(\mathbf{R} \mid H_1, A, \nu)\right]}{E\left[p_{\mathbf{R} \mid H_0, A, \nu}(\mathbf{R} \mid H_0, A, \nu)\right]} \underset{H_0}{\overset{H_1}{\gtrless}} \frac{\pi_0}{\pi_1}
$$

where the expectations are taken with respect to A and ν, and where p_{R|Hi,A,ν} are the conditional probability density functions of the signal representations. Assuming that the background noise is AWGN, it can be shown that the LRT simplifies to choosing the largest amongst

$$
\xi_i[R(t)] = \pi_i \int_{A,\nu} \exp\left\{\frac{2}{N_0} \int_0^T R(t)\,s_i(t \mid A, \nu)\,dt - \frac{E_i(A, \nu)}{N_0}\right\} p_{A,\nu}(A, \nu)\,dA\,d\nu, \qquad i = 1, \ldots, M \tag{9.11}
$$

It should be noted that in Eq. (9.11) we have explicitly shown the dependence of the transmitted signals si on the parameters A and ν. The final receiver structures, together with their corresponding performance, are thus a function of both the choice of signal sets and the probability density functions of the random amplitude and random phase.

9.9.1

Random Phase Channels

If we consider first the special case where the channel simply imposes a uniform random phase on the signal, then it can be easily shown that the so-called in-phase and quadrature statistics obtained from the received signal R(t) (denoted by RI and RQ, respectively) are sufficient statistics for the signal classification problem. These quantities are computed as

$$
R_I(i) = \int_0^T R(t) \cos\left[2\pi f_c(i)\,t\right] dt
$$

and

$$
R_Q(i) = \int_0^T R(t) \sin\left[2\pi f_c(i)\,t\right] dt
$$

where in this case the index i corresponds to the center frequencies of hypotheses Hi (e.g., FSK signalling). The optimum receiver selects the largest from amongst

$$
\xi_i[R(t)] = \pi_i \exp\left(-\frac{E_i}{N_0}\right) I_0\left(\frac{2}{N_0} \sqrt{R_I^2(i) + R_Q^2(i)}\right), \qquad i = 1, \ldots, M
$$

where I0 is a zeroth-order, modified Bessel function of the first kind. If the signals have equal energy and are equally likely (e.g., FSK signalling), then the optimum binary receiver is given by

$$
R_I^2(1) + R_Q^2(1) \underset{H_0}{\overset{H_1}{\gtrless}} R_I^2(0) + R_Q^2(0)
$$

One may readily observe that the optimum receiver bases its decision on the values of the two envelopes of the received signal, $\sqrt{R_I^2(i) + R_Q^2(i)}$, and, as a consequence, is often referred to as an envelope or square-law detector. Moreover, it should be observed that the computation of the envelope is independent of the underlying phase of the signal, and the receiver is as such known as a noncoherent receiver. The computation of the error rate for this detector is a relatively straightforward exercise resulting in

$$
P_e(\text{noncoherent}) = \frac{1}{2} \exp\left(-\frac{E}{2N_0}\right)
$$

As before, note that the error rate for the noncoherent receiver is simply a function of the SNR.

FIGURE 9.5: Signal space representation of 16-QAM signal set. Optimal decision regions for equally likely signals are also noted.

FIGURE 9.6: Optimum receiver structure for noncoherent (random or unknown phase) ASK demodulation.
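A minimal sketch of the square-law detector for equal-energy, equally likely FSK follows. It assumes sampled waveforms, Riemann-sum integrals, and a noise-free received tone with an arbitrary phase chosen for illustration; the function names and numerical values are hypothetical.

```python
import math

def quadrature_statistics(r, t, fc, dt):
    """In-phase and quadrature statistics R_I, R_Q at center frequency fc."""
    r_i = sum(rk * math.cos(2 * math.pi * fc * tk) for rk, tk in zip(r, t)) * dt
    r_q = sum(rk * math.sin(2 * math.pi * fc * tk) for rk, tk in zip(r, t)) * dt
    return r_i, r_q

def squared_envelope(r, t, fc, dt):
    r_i, r_q = quadrature_statistics(r, t, fc, dt)
    return r_i * r_i + r_q * r_q

def noncoherent_decision(r, t, freqs, dt):
    """Index of the hypothesis with the largest squared envelope."""
    return max(range(len(freqs)),
               key=lambda i: squared_envelope(r, t, freqs[i], dt))

def pe_noncoherent(e, n0):
    """Error rate of the binary noncoherent receiver given above."""
    return 0.5 * math.exp(-e / (2.0 * n0))

# Hypothetical FSK tone on [0, 1] with a phase the receiver does not know.
n = 1000
dt = 1.0 / n
t = [(k + 0.5) * dt for k in range(n)]
freqs = [3.0, 5.0]
unknown_phase = 1.234
r = [math.sin(2 * math.pi * freqs[1] * tk + unknown_phase) for tk in t]

decision = noncoherent_decision(r, t, freqs, dt)
```

Changing `unknown_phase` moves energy between R_I and R_Q but leaves their sum of squares unchanged, which is why the decision does not depend on the phase.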

9.9.2

Rayleigh Channel

As an important generalization of the described random phase channel, many communication systems are designed under the assumption that the channel introduces both a random amplitude and a random phase on the signal. Specifically, if the original signal sets are of the form si(t) = mi(t) cos(2πfc t), where mi(t) is the baseband version of the message (i.e., what distinguishes one signal from another), then the so-called Rayleigh channel introduces random distortion in the received signal of the following form:

$$
s_i(t) = A\,m_i(t) \cos\left(2\pi f_c t + \nu\right)
$$

where the amplitude A is a Rayleigh random variable⁴ and where the random phase ν is uniformly distributed between zero and 2π.

4 The density of a Rayleigh random variable is given by $p_A(a) = (a/\sigma^2) \exp(-a^2/2\sigma^2)$ for a ≥ 0.

To determine the optimal receiver under this distortion, we must first construct an alternate statistical model for si(t). To begin, it can be shown from the theory of random variables [4] that if XI and XQ are statistically independent, zero mean, Gaussian random variables with variance given by σ², then

$$
A\,m_i(t) \cos\left(2\pi f_c t + \nu\right) = m_i(t)\,X_I \cos\left(2\pi f_c t\right) + m_i(t)\,X_Q \sin\left(2\pi f_c t\right)
$$

Equality here is to be interpreted as implying that both A and ν will be the appropriate random variables. From this, we deduce that the combined uncertainty in the amplitude and phase of the signal is incorporated into the Gaussian random variables XI and XQ. The in-phase and quadrature components of the signal si(t) are given by sIi(t) = mi(t) cos(2πfc t) and sQi(t) = mi(t) sin(2πfc t), respectively. By appealing to Eq. (9.11), it can be shown that the optimum receiver selects the largest from

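$$
\xi_i[R(t)] = \frac{\pi_i}{1 + \frac{2E_i}{N_0}\sigma^2} \exp\left\{\frac{\frac{2\sigma^2}{N_0^2}\left[\langle R(t), s_{Ii}(t)\rangle^2 + \langle R(t), s_{Qi}(t)\rangle^2\right]}{1 + \frac{2E_i}{N_0}\sigma^2}\right\}
$$

where the inner product

$$
\langle R(t), s_i(t)\rangle = \int_0^T R(t)\,s_i(t)\,dt
$$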

Further, if we impose the conditions that the signals be equally likely with equal energy over the symbol interval, then the optimum receiver selects the largest amongst

$$
\xi_i[R(t)] = \sqrt{\langle R(t), s_{Ii}(t)\rangle^2 + \langle R(t), s_{Qi}(t)\rangle^2}
$$

Thus, much like for the random phase channel, the optimum receiver for the Rayleigh channel computes the projection of the received waveform onto the in-phase and quadrature components of the hypothetical signals. From a signal space perspective, this is akin to computing the length of the received vector in the subspace spanned by the hypothetical signal. The optimum receiver then chooses the largest amongst these lengths. As with the random phase channel, computing the performance is a straightforward task resulting in (for the equally likely, equal energy case)

$$
P_e(\text{Rayleigh}) = \frac{1/2}{1 + \dfrac{E\sigma^2}{N_0}}
$$

Interestingly, in this case the performance depends not only on the SNR, but also on the variance (spread) of the Rayleigh amplitude A. Thus, if the amplitude spread is large, we expect to often experience what is known as deep fades in the amplitude of the received waveform and as such expect a commensurate loss in performance.
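As a consistency check, the Rayleigh error rate can be compared with a numerical average of the random-phase error rate over the amplitude density of footnote 4. This sketch rests on two assumptions that are not stated in the text: that a fade of amplitude a scales the received energy to a²E, and that the conditional error rate for a given amplitude is the noncoherent expression (1/2) exp(−a²E/2N0). The function names and the midpoint-rule integration are illustrative.

```python
import math

def pe_rayleigh(e, sigma2, n0):
    """Closed-form error rate for the equally likely, equal energy case."""
    return 0.5 / (1.0 + e * sigma2 / n0)

def rayleigh_pdf(a, sigma2):
    """Density of the Rayleigh amplitude (footnote 4)."""
    return (a / sigma2) * math.exp(-a * a / (2.0 * sigma2))

def averaged_noncoherent_pe(e, sigma2, n0, steps=20000):
    """Midpoint-rule average of 0.5 exp(-a^2 E / 2 N0) over the fade."""
    upper = 12.0 * math.sqrt(sigma2)   # the density is negligible beyond this
    da = upper / steps
    total = 0.0
    for k in range(steps):
        a = (k + 0.5) * da
        conditional_pe = 0.5 * math.exp(-(a * a * e) / (2.0 * n0))
        total += conditional_pe * rayleigh_pdf(a, sigma2) * da
    return total

closed_form = pe_rayleigh(4.0, 1.0, 1.0)
numerical = averaged_noncoherent_pe(4.0, 1.0, 1.0)
```

Under these assumptions the two values agree, and the closed form shows the error rate falling only inversely with Eσ²/N0 rather than exponentially as on the unfaded channel.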

9.10

Dispersive Channels

The dispersive channel model assumes that the channel not only introduces AWGN but also distorts the signal through a filtering process. This model incorporates physical realities such as multipath effects and frequency selective fading. In particular, the standard model adopted is depicted in the block diagram given in Fig. 9.7. As can be seen, the receiver observes a filtered version of the signal plus AWGN. If the impulse response of the channel is known, then we arrive at the optimum receiver design by applying the previously presented theory. Unfortunately, the duration of the filtered signal can be a complicating factor. More often than not, the channel will increase the duration of the transmitted signals, hence, leading to the description, dispersive channel.

FIGURE 9.7: Standard model for dispersive channel. The time varying impulse response of the channel is denoted by hc(t, τ).

However, if the designers take this into account by shortening the duration of si(t) so that the duration of the filtered signal si*(t) is less than T, then the optimum receiver chooses the largest amongst

$$
\xi_i(R(t)) = \frac{N_0}{2} \ln \pi_i + \langle R(t), s_i^*(t)\rangle - \frac{1}{2} E_i^*
$$

where Ei* denotes the energy of si*(t). If we limit our consideration to equally likely binary signal sets, then the minimum Pe receiver matches the received waveform to the filtered versions of the signal waveforms. The resulting error rate is given by

$$
P_e(\text{dispersive}) = Q\left(\frac{\|\mathbf{s}_0^* - \mathbf{s}_1^*\|}{\sqrt{2N_0}}\right)
$$

Thus, in this case the minimum Pe is a function of the separation of the filtered version of the signals in the signal space. The problem becomes substantially more complex if we cannot ensure that the filtered signal durations are less than the symbol lengths. In this case we experience what is known as intersymbol interference (ISI). That is, observations over one symbol interval contain not only the symbol information of interest but also information from previous symbols. In this case we must appeal to optimum sequence estimation [5] to take full advantage of the information in the waveform. The basis for this procedure is the maximization of the joint likelihood function conditioned on the sequence of symbols. This procedure not only defines the structure of the optimum receiver under ISI but also is critical in the decoding of convolutional codes and coded modulation. Alternate adaptive techniques to solve this problem involve the use of channel equalization.

Defining Terms

Additive white Gaussian noise (AWGN) channel: The channel whose model is that of corrupting a transmitted waveform by the addition of white (i.e., spectrally flat) Gaussian noise.

Bit (symbol) interval: The period of time over which a single symbol is transmitted.

Communication channel: The medium over which communication signals are transmitted. Examples are fiber optic cables, free space, or telephone lines.

Correlation or matched filter receiver: The optimal receiver structure for digital communications in AWGN.

Decision boundary: The boundary in signal space between the various regions where the receiver declares Hi. Typically a hyperplane when dealing with AWGN channels.

Dispersive channel: A channel that elongates and distorts the transmitted signal. Normally modeled as a time-varying linear system.

Intersymbol interference: The ill-effect of one symbol smearing into adjacent symbols thus interfering with the detection process. This is a consequence of the channel filtering the transmitted signals and therefore elongating their duration, see dispersive channel.

Karhunen–Loève expansion: A representation for second-order random processes. Allows one to express a random process in terms of a superposition of deterministic waveforms. The scale values are uncorrelated random variables obtained from the waveform.

Mean-square equivalence: Two random vectors or time-limited waveforms are mean-square equivalent if and only if the expected value of their mean-square error is zero.

Orthonormal: The property of two or more vectors or time-limited waveforms being mutually orthogonal and individually having unit length. Orthogonality and length are typically measured by the standard Euclidean inner product.

Rayleigh channel: A channel that randomly scales the transmitted waveform by a Rayleigh random variable while adding an independent uniform phase to the carrier.

Signal space: An abstraction for representing a time limited waveform in a low-dimensional vector space. Usually arrived at through the application of the Karhunen–Loève transformation.
Total probability of error: The probability of classifying the received waveform into any of the symbols that were not transmitted over a particular bit interval.

References

[1] Gibson, J.D., Principles of Digital and Analog Communications, 2nd ed., Macmillan, New York, 1993.
[2] Haykin, S., Communication Systems, 3rd ed., John Wiley & Sons, New York, 1994.
[3] Lee, E.A. and Messerschmitt, D.G., Digital Communication, Kluwer Academic Publishers, Norwell, MA, 1988.
[4] Papoulis, A., Probability, Random Variables, and Stochastic Processes, 3rd ed., McGraw-Hill, New York, 1991.
[5] Poor, H.V., An Introduction to Signal Detection and Estimation, Springer-Verlag, New York, 1988.
[6] Proakis, J.G., Digital Communications, 2nd ed., McGraw-Hill, New York, 1989.
[7] Shiryayev, A.N., Probability, Springer-Verlag, New York, 1984.
[8] Sklar, B., Digital Communications, Fundamentals and Applications, Prentice Hall, Englewood Cliffs, NJ, 1988.
[9] Van Trees, H.L., Detection, Estimation, and Modulation Theory, Part I, John Wiley & Sons, New York, 1968.
[10] Wong, E. and Hajek, B., Stochastic Processes in Engineering Systems, Springer-Verlag, New York, 1985.
[11] Wozencraft, J.M. and Jacobs, I., Principles of Communication Engineering, reissue, Waveland Press, Prospect Heights, Illinois, 1990.
[12] Ziemer, R.E. and Peterson, R.L., Introduction to Digital Communication, Macmillan, New York, 1992.

Further Information

The fundamentals of receiver design were put in place by Wozencraft and Jacobs in their seminal book. Since that time, there have been many outstanding textbooks in this area. For a sampling see [1, 2, 3, 8, 12]. For a complete treatment on the use and application of detection theory in communications see [5, 9]. For deeper insights into the Karhunen–Loève expansion and its use in communications and signal processing see [10].


Bhargava, V.K. “Forward Error Correction Coding” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Forward Error Correction Coding

V.K. Bhargava University of Victoria

I.J. Fair University of Alberta

10.1 Introduction
10.2 Fundamentals of Block Coding
10.3 Structure and Decoding of Block Codes
10.4 Important Classes of Block Codes
10.5 Principles of Convolutional Coding
10.6 Decoding of Convolutional Codes
10.7 Trellis-Coded Modulation
10.8 Additional Measures
10.9 Turbo Codes
10.10 Applications
Defining Terms
References
Further Information

10.1

Introduction

In 1948, Claude Shannon issued a challenge to communications engineers by proving that communication systems could be made arbitrarily reliable as long as a fixed percentage of the transmitted signal was redundant [9]. He showed that limits exist only on the rate of communication and not its accuracy, and went on to prove that errorless transmission could be achieved in an additive white Gaussian noise (AWGN) environment with infinite bandwidth if the ratio of energy per data bit to noise power spectral density exceeds the Shannon Limit. He did not, however, indicate how this could be achieved.

Subsequent research has led to a number of techniques that introduce redundancy to allow for correction of errors without retransmission. These techniques, collectively known as forward error correction (FEC) coding techniques, are used in systems where a reverse channel is not available for requesting retransmission, the delay with retransmission would be excessive, the expected number of errors would require a large number of retransmissions, or retransmission would be awkward to implement [10].

A simplified model of a digital communication system which incorporates FEC coding is shown in Fig. 10.1. The FEC code acts on a discrete data channel comprising all system elements between the encoder output and decoder input. The encoder maps the source data to q-ary code symbols which are modulated and transmitted. During transmission, this signal can be corrupted, causing errors to arise in the demodulated symbol sequence. The FEC decoder attempts to correct these errors and restore the original source data.

FIGURE 10.1: Block diagram of a digital communication system with forward error correction.

A demodulator which outputs only a value for the q-ary symbol received during each symbol interval is said to make hard decisions. In the binary symmetric channel (BSC), hard decisions are made on binary symbols and the probability of error is independent of the value of the symbol. One example of a BSC is the coherently demodulated binary phase-shift-keyed (BPSK) signal corrupted by AWGN. The conditional probability density functions which result with this system are depicted in Fig. 10.2. The probability of error is given by the area under the density functions that lies across the decision threshold, and is a function of the symbol energy Es and the one-sided noise power spectral density N0 .

FIGURE 10.2: Hard and soft decision demodulation of a coherently demodulated BPSK signal corrupted by AWGN. f(z | 1) and f(z | 0) are the Gaussian conditional probability density functions at the threshold device.

Alternatively, the demodulator can make soft decisions or output an estimate of the symbol value along with an indication of its confidence in this estimate. For example, if the BPSK demodulator uses three-bit quantization, the two least significant bits can be taken as a confidence measure. Possible soft-decision thresholds for the BPSK signal are depicted in Fig. 10.2. In practice, there is little to be gained by using many soft-decision quantization levels.

Block and convolutional codes introduce redundancy by adding parity symbols to the message data. They map k source symbols to n code symbols and are said to have code rate R = k/n. With fixed information rates, this redundancy results in increased bandwidth and lower energy per transmitted symbol. At low signal-to-noise ratios, these codes cannot compensate for these impairments, and performance is degraded. At higher ratios of information symbol energy Eb to noise spectral density N0, however, there is coding gain since the performance improvement offered by coding more than compensates for these impairments. Coding gain is usually defined as the reduction in required Eb/N0 to achieve a specific error rate in an error-control coded system over one without coding.

In contrast to block and convolutional codes, trellis-coded modulation introduces redundancy by expanding the size of the signal set rather than increasing the number of symbols transmitted, and so offers the advantages of coding to band-limited systems.
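The energy penalty of adding parity symbols can be illustrated with a short calculation for the hard-decision BPSK channel. The sketch assumes that the symbol energy is Es = (k/n) Eb at a fixed information rate and that the symbol error probability of coherent BPSK in AWGN is Q(√(2Es/N0)); the function names and the (7, 4) example rate are hypothetical choices for illustration.

```python
import math

def q_function(x):
    """Tail probability of a standard normal random variable."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def bsc_crossover(eb_over_n0_db, k=1, n=1):
    """Crossover probability of the BSC formed by hard-decision BPSK.

    Assumes Es = (k/n) * Eb, i.e., the energy of each information symbol
    is spread over n/k transmitted code symbols.
    """
    eb_over_n0 = 10.0 ** (eb_over_n0_db / 10.0)
    es_over_n0 = (k / n) * eb_over_n0
    return q_function(math.sqrt(2.0 * es_over_n0))

uncoded = bsc_crossover(6.0)            # no redundancy, Es = Eb
coded = bsc_crossover(6.0, k=4, n=7)    # rate 4/7 code at the same Eb/N0
```

The raw symbol error probability seen by the decoder is higher in the coded case. Coding gain arises only when the decoder's error correction more than offsets this increase.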

Each of these coding techniques is considered in turn. Following a discussion of interleaving and concatenated coding, this chapter gives an overview of a recent and significant advance in coding, the development of Turbo codes, and concludes with a brief overview of FEC applications.

10.2 Fundamentals of Block Coding

In block codes there is a one-to-one mapping between k-symbol source words and n-symbol codewords. With q-ary signalling, $q^k$ out of the $q^n$ possible n-tuples are valid code vectors. The set of all n-tuples forms a vector space in which the $q^k$ code vectors are distributed. The Hamming distance between any two code vectors is the number of symbols in which they differ; the minimum distance $d_{\min}$ of the code is the smallest Hamming distance between any two codewords.

There are two contradictory objectives of block codes. The first is to distribute the code vectors in the vector space such that the distance between them is maximized. Then, if the decoder receives a corrupted vector, by evaluating the nearest valid code vector it will decode the correct word with high probability. The second is to pack the vector space with as many code vectors as possible to reduce the redundancy in transmission. When code vectors differ in at least $d_{\min}$ positions, a decoder which evaluates the nearest code vector to each received word is guaranteed to correct up to t random symbol errors per word if

$$d_{\min} \ge 2t + 1 \qquad (10.1)$$

Alternatively, all $q^n - q^k$ illegal words can be detected, including all error patterns with $d_{\min} - 1$ or fewer errors. In general, a block code can correct all patterns of t or fewer errors and detect all patterns of u or fewer errors provided that $u \ge t$ and

$$d_{\min} \ge t + u + 1 \qquad (10.2)$$
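The distance conditions in Eqs. (10.1) and (10.2) can be checked numerically on a small example. The sketch below is illustrative only: the four-word binary code `toy_code` and the helper names are assumptions introduced here, not taken from the chapter.

```python
from itertools import combinations

def hamming_distance(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def minimum_distance(code):
    """Smallest Hamming distance between any two distinct codewords."""
    return min(hamming_distance(a, b) for a, b in combinations(code, 2))

def can_correct_and_detect(d_min, t, u):
    """Eq. (10.2): correct t or fewer errors and detect u >= t or fewer errors."""
    return u >= t and d_min >= t + u + 1

# Hypothetical (5, 2) binary code, chosen only to illustrate the definitions.
toy_code = ["00000", "01011", "10101", "11110"]

d_min = minimum_distance(toy_code)
t_max = (d_min - 1) // 2   # largest t that satisfies Eq. (10.1)
print(d_min, t_max)
```

With this toy code, the same minimum distance can be spent either on correcting a single error or on detecting two errors without correction, which is the trade-off that Eq. (10.2) expresses.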

If q = 2, knowledge of the positions of the errors is sufficient for their correction; if q > 2, the decoder must determine both the positions and values of the errors. If the demodulator indicates positions in which the symbol values are unreliable, the decoder can assume their value unknown and has only to solve for the value of these symbols. These positions are called erasures. A block code can correct up to t errors and v erasures in each word if

$$d_{\min} \ge 2t + v + 1 \qquad (10.3)$$

10.3 Structure and Decoding of Block Codes

Shannon showed that the performance limit of codes with fixed code rate improves as the block length increases. As n and k increase, however, practical implementation requires that the mapping from message to code vector not be arbitrary but that an underlying structure to the code exist. The structures developed to date limit the error correcting capability of these codes to below what Shannon proved possible, on average, for a code with random codeword assignments. Although Turbo codes have made significant strides towards approaching the Shannon Limit, the search for good constructive codes continues.

A property which simplifies implementation of the coding operations is that of code linearity. A code is linear if the addition of any two code vectors forms another code vector, which implies that the code vectors form a subspace of the vector space of n-tuples. This subspace, which contains the all-zero vector, is spanned by any set of k linearly independent code vectors. Encoding can be described as the multiplication of the information k-tuple by a generator matrix $G$, of dimension $k \times n$, which contains these basis vectors as rows. That is, a message vector $m_i$ is mapped to a code vector $c_i$ according to

$$c_i = m_i G, \qquad i = 0, 1, \ldots, q^k - 1 \qquad (10.4)$$

where elementwise arithmetic is defined in the finite field GF(q). In general, this encoding procedure results in code vectors with nonsystematic form in that the values of the message symbols cannot be determined by inspection of the code vector. However, if $G$ has the form $[I_k, P]$ where $I_k$ is the $k \times k$ identity matrix and $P$ is a $k \times (n-k)$ matrix of parity checks, then the k most significant symbols of each code vector are identical to the message vector and the code has systematic form. This notation assumes that vectors are written with their most significant or first symbols in time on the left, a convention used throughout this chapter.

For each generator matrix there is an $(n-k) \times n$ parity check matrix $H$ whose rows are orthogonal to the rows in $G$, i.e., $GH^T = 0$. If the code is systematic, $H = [-P^T, I_{n-k}]$. Since all codewords are linear sums of the rows in $G$, it follows that $c_i H^T = 0$ for all i, $i = 0, 1, \ldots, q^k - 1$, and that the validity of the demodulated vectors can be checked by performing this multiplication. If a codeword $c$ is corrupted during transmission so that the hard-decision demodulator outputs the vector $\hat{c} = c + e$, where $e$ is a nonzero error pattern, the result of this multiplication is an (n−k)-tuple that is indicative of the validity of the sequence. This result, called the syndrome $s$, is dependent only on the error pattern since

$$s = \hat{c}H^T = (c + e)H^T = cH^T + eH^T = eH^T \qquad (10.5)$$
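As a concrete illustration of Eqs. (10.4) and (10.5), the following sketch builds a systematic binary (7, 4) Hamming code from $G = [I_k, P]$ and $H = [-P^T, I_{n-k}]$. The particular parity matrix `P` and all function names are assumptions chosen for this example; over GF(2), $-P^T$ equals $P^T$.

```python
def mat_vec_gf2(v, M):
    """Row vector v times matrix M, with arithmetic in GF(2)."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) % 2
            for j in range(len(M[0]))]

K, N = 4, 7
# Assumed parity submatrix; its rows are distinct and have weight >= 2,
# so the columns of H are the seven distinct nonzero 3-tuples.
P = [[1, 1, 0],
     [0, 1, 1],
     [1, 1, 1],
     [1, 0, 1]]

# G = [I_k, P] has dimension k x n.
G = [[int(i == j) for j in range(K)] + P[i] for i in range(K)]
# H = [P^T, I_{n-k}] has dimension (n-k) x n.
H = [[P[i][r] for i in range(K)] + [int(r == j) for j in range(N - K)]
     for r in range(N - K)]
H_T = [list(col) for col in zip(*H)]   # n x (n-k)

def encode(m):
    """Eq. (10.4): c = mG."""
    return mat_vec_gf2(m, G)

def syndrome(r):
    """Eq. (10.5): s = rH^T, which depends only on the error pattern."""
    return mat_vec_gf2(r, H_T)

def correct_single_error(r):
    """Flip the bit whose column of H matches the syndrome, if any."""
    s = syndrome(r)
    if not any(s):
        return list(r)
    pos = H_T.index(s)
    fixed = list(r)
    fixed[pos] ^= 1
    return fixed

codeword = encode([1, 0, 1, 1])
received = list(codeword)
received[2] ^= 1                      # inject a single bit error
print(codeword, syndrome(received), correct_single_error(received))
```

Because the code is systematic, the first four symbols of `codeword` reproduce the message, and the syndrome of the corrupted word equals the row of $H^T$ at the error position.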

If the error pattern is a code vector, the errors go undetected. For all other error patterns, however, the syndrome is nonzero. Since there are $q^{n-k} - 1$ nonzero syndromes, $q^{n-k} - 1$ error patterns can be corrected. When these patterns include all those with t or fewer errors and no others, the code is said to be a perfect code. Few codes are perfect; most codes are capable of correcting some patterns with more than t errors. Standard array decoders use lookup tables to associate each syndrome with an error pattern but become impractical as the block length and number of parity symbols increase. Algebraic decoding algorithms have been developed for codes with stronger structure. These algorithms are simplified with imperfect codes if the patterns corrected are limited to those with t or fewer errors, a simplification called bounded distance decoding.

Cyclic codes are a subclass of linear block codes with an algebraic structure that enables encoding to be implemented with a linear feedback shift register and decoding to be implemented without a lookup table. As a result, most block codes in use today are cyclic or are closely related to cyclic codes. These codes are best described if vectors are interpreted as polynomials and the arithmetic follows the rules for polynomials where the elementwise operations are defined in GF(q). In a cyclic code, all codeword polynomials are multiples of a generator polynomial $g(x)$ of degree $n - k$. This polynomial is chosen to be a divisor of $x^n - 1$ so that a cyclic shift of a code vector yields another code vector, giving this class of codes its name. A message polynomial $m_i(x)$ can be mapped to a codeword polynomial $c_i(x)$ in nonsystematic form as

$$c_i(x) = m_i(x)g(x), \qquad i = 0, 1, \ldots, q^k - 1 \qquad (10.6)$$

In systematic form, codeword polynomials have the form

$$c_i(x) = m_i(x)x^{n-k} - r_i(x), \qquad i = 0, 1, \ldots, q^k - 1 \qquad (10.7)$$

where $r_i(x)$ is the remainder of $m_i(x)x^{n-k}$ divided by $g(x)$. Polynomial multiplication and division can be easily implemented with shift registers [5].

The first step in decoding the demodulated word is to determine if the word is a multiple of $g(x)$. This is done by dividing it by $g(x)$ and examining the remainder. Since polynomial division is a linear operation, the resulting syndrome $s(x)$ depends only on the error pattern. If $s(x)$ is the all-zero polynomial, transmission is errorless or an undetectable error pattern has occurred. If $s(x)$ is nonzero, at least one error has occurred. This is the principle of the cyclic redundancy check (CRC). It remains to determine the most likely error pattern that could have generated this syndrome.

Single error correcting binary codes can use the syndrome to immediately locate the bit in error. More powerful codes use this information to determine the locations and values of multiple errors. The most prominent approach to doing so is the iterative technique developed by Berlekamp. This technique, which involves computing an error-locator polynomial and solving for its roots, was subsequently interpreted by Massey in terms of the design of a minimum-length shift register. Once the location and values of the errors are known, Chien’s search algorithm efficiently corrects them. The implementation complexity of these decoders increases only as the square of the number of errors to be corrected [4], but the decoders do not generalize easily to accommodate soft-decision information. Other decoding techniques, including Chase’s algorithm and threshold decoding, are easier to implement with soft-decision input [6].

Berlekamp’s algorithm can be used in conjunction with transform-domain decoding, which involves transforming the received block with a finite field Fourier-like transform and solving for errors in the transform domain. Since the implementation complexity of these decoders depends on the block length rather than the number of symbols corrected, this approach results in simpler circuitry for codes with high redundancy [13].
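The systematic cyclic encoding rule and the remainder-based syndrome check described above can be sketched for a small binary code. This is a minimal illustration that assumes the generator polynomial $g(x) = x^3 + x + 1$ of a (7, 4) cyclic code, and represents each polynomial as a Python integer whose bit i holds the coefficient of $x^i$; both choices are assumptions made here for brevity.

```python
def poly_mod_gf2(a, g):
    """Remainder of a(x) divided by g(x) over GF(2)."""
    deg_g = g.bit_length() - 1
    while a.bit_length() - 1 >= deg_g:
        # Cancel the leading term of a(x) with a shifted copy of g(x).
        a ^= g << (a.bit_length() - 1 - deg_g)
    return a

N, K = 7, 4
G_POLY = 0b1011   # assumed g(x) = x^3 + x + 1, a divisor of x^7 - 1

def encode_systematic(m):
    """c(x) = m(x) x^(n-k) - r(x); subtraction equals addition in GF(2)."""
    shifted = m << (N - K)
    return shifted ^ poly_mod_gf2(shifted, G_POLY)

def syndrome(r):
    """s(x) is the remainder of the received polynomial divided by g(x)."""
    return poly_mod_gf2(r, G_POLY)

def cyclic_shift(c):
    """Multiply by x modulo x^n - 1, i.e., rotate the n-bit word."""
    return ((c << 1) | (c >> (N - 1))) & ((1 << N) - 1)

c = encode_systematic(0b1011)
print(format(c, "07b"), syndrome(c), syndrome(c ^ (1 << 5)))
```

A zero syndrome indicates a valid codeword, which is the CRC principle; a corrupted word yields the syndrome of the error pattern alone because division by $g(x)$ is linear.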
Other block codes have also been constructed, including codes that are based on transform-domain spectral properties, codes that are designed specifically for correction of burst errors, and codes that are decodable with straightforward threshold or majority logic decoders [5, 6, 7].

10.4 Important Classes of Block Codes

When errors occur independently, Bose–Chaudhuri–Hocquenghem (BCH) codes provide one of the best performances of known codes for a given block length and code rate. They are cyclic codes with $n = q^m - 1$, where m is any integer greater than 2. They are designed to correct up to t errors per word and so have designed distance $d = 2t + 1$; the minimum distance may be greater. Generator polynomials for these codes are listed in many texts, including [6]. These polynomials are of degree less than or equal to mt, and so $k \ge n - mt$. BCH codes can be shortened to accommodate system requirements by deleting positions for information symbols.

Some subclasses of these codes are of special interest. Hamming codes are perfect single error correcting binary BCH codes. Full length codes have $n = 2^m - 1$ and $k = n - m$ for any m greater than 2. The duals of these codes are maximal-length codes, with $n = 2^m - 1$, $k = m$, and $d_{\min} = 2^{m-1}$. All $2^m - 1$ nonzero code vectors in these codes are cyclic shifts of a single nonzero code vector.

Reed–Solomon (RS) codes are nonbinary BCH codes defined over GF(q), where q is often taken as a power of two so that symbols can be represented by a sequence of bits. In these cases, correction of even a single symbol allows for correction of a burst of bit errors. The block length is $n = q - 1$, and the minimum distance $d_{\min} = 2t + 1$ is achieved using only 2t parity symbols. Since RS codes meet the Singleton bound of $d_{\min} \le n - k + 1$, they have the largest possible minimum distance for these values of n and k and are called maximum distance separable codes.

The Golay codes are the only nontrivial perfect codes that can correct more than one error. The (11, 6) ternary Golay code has minimum distance 5. The (23, 12) binary code is a triple error correcting BCH code with $d_{\min} = 7$. To simplify implementation, it is often extended to a (24, 12) code through the addition of an extra parity bit. The extended code has $d_{\min} = 8$.
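The statements that the Hamming and Golay codes are perfect, and that RS codes meet the Singleton bound, can be checked with a few lines of arithmetic. The sketch below uses the standard sphere-packing count: a code is perfect when the spheres of radius t around the $q^k$ codewords exactly fill the space of $q^n$ words. The function names are assumptions introduced for this example.

```python
from math import comb

def sphere_size(n, t, q=2):
    """Number of q-ary n-tuples within Hamming distance t of a given word."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(t + 1))

def is_perfect(n, k, t, q=2):
    """True if radius-t spheres around all q^k codewords exactly fill q^n."""
    return sphere_size(n, t, q) * q ** k == q ** n

def rs_min_distance(n, k):
    """Minimum distance of a code that meets the Singleton bound with equality."""
    return n - k + 1

print(is_perfect(7, 4, 1))          # (7, 4) Hamming code
print(is_perfect(23, 12, 3))        # (23, 12) binary Golay code
print(is_perfect(11, 6, 2, q=3))    # (11, 6) ternary Golay code
print(rs_min_distance(255, 223))    # (255, 223) RS code
```

For the (255, 223) RS code, the resulting minimum distance of 33 corresponds to t = 16 correctable symbol errors under $d_{\min} = 2t + 1$.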

The (23, 12) Golay code is also a binary quadratic residue code. These cyclic codes have prime length of the form $n = 8m \pm 1$, with $k = (n + 1)/2$ and $d_{\min} \ge \sqrt{n}$. Some of these codes are as good as the best codes known with these values of n and k, but it is unknown if there are good quadratic residue codes with large n [5].

Reed-Muller codes are equivalent to binary cyclic codes with an additional overall parity bit. For any m, the rth-order Reed-Muller code has $n = 2^m$, $k = \sum_{i=0}^{r} \binom{m}{i}$, and $d_{\min} = 2^{m-r}$. The rth-order and (m − r − 1)th-order codes are duals, and the first-order codes are similar to maximal-length codes. These codes, and the closely related Euclidean geometry and projective geometry codes, can be decoded with threshold decoding.

The performance of several of these block codes is shown in Fig. 10.3 in terms of decoded bit error probability vs. $E_b/N_0$ for systems using coherent, hard-decision demodulated BPSK signalling. Many other block codes have also been developed, including Goppa codes, quasicyclic codes, burst error correcting Fire codes, and other lesser known codes.
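The Reed-Muller parameters quoted above are simple to tabulate. The following sketch, with a hypothetical helper name, evaluates n, k, and the minimum distance for a given order r and parameter m, and checks the duality relation numerically: the dimensions of the rth-order and (m − r − 1)th-order codes sum to n.

```python
from math import comb

def reed_muller_params(r, m):
    """Return (n, k, d_min) of the rth-order Reed-Muller code of length 2^m."""
    n = 2 ** m
    k = sum(comb(m, i) for i in range(r + 1))
    d_min = 2 ** (m - r)
    return n, k, d_min

for r in range(0, 4):
    print(r, reed_muller_params(r, 4))
```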

10.5 Principles of Convolutional Coding

Convolutional codes map successive information k-tuples to a series of n-tuples such that the sequence of n-tuples has distance properties that allow for detection and correction of errors. Although these codes can be defined over any alphabet, their implementation has largely been restricted to binary signals, and only binary convolutional codes are considered here.

In addition to the code rate $R = k/n$, the constraint length K is an important parameter for these codes. Definitions vary; we will use the definition that K equals the number of k-tuples that affect formation of each n-tuple during encoding. That is, the value of an n-tuple depends on the k-tuple that arrives at the encoder during that encoding interval as well as the K − 1 previous information k-tuples.

Binary convolutional encoders can be implemented with kK-stage shift registers and n modulo-2 adders, an example of which is given in Fig. 10.4(a) for a rate 1/2, constraint length 3 code. The encoder shifts in a new k-tuple during each encoding interval and samples the outputs of the adders sequentially to form the coded output. Although connection diagrams similar to that of Fig. 10.4(a) completely describe the code, a more concise description can be given by stating the values of n, k, and K and giving the adder connections in the form of vectors or polynomials. For instance, the rate 1/2 code has the generator vectors $g_1 = 111$ and $g_2 = 101$, or equivalently, the generator polynomials $g_1(x) = x^2 + x + 1$ and $g_2(x) = x^2 + 1$. Alternatively, a convolutional code can be characterized by its impulse response, the coded sequence generated due to input of a single logic-1. It is straightforward to verify that the circuit in Fig. 10.4(a) has the impulse response 111011. Since modulo-2 addition is a linear operation, convolutional codes are linear, and the coded output can be viewed as the convolution of the input sequence with the impulse response, hence the name of this coding technique.
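The rate 1/2, constraint length 3 encoder described above can be sketched in a few lines. The code below assumes k = 1 and uses the generator vectors 111 and 101 from the text; the function name and the register layout (newest bit first) are choices made for this illustration.

```python
G1, G2 = (1, 1, 1), (1, 0, 1)   # generator vectors g1 = 111, g2 = 101

def conv_encode(bits, generators=(G1, G2)):
    """Rate 1/n binary convolutional encoder for k = 1.

    The shift register has K = len(generator) stages and starts cleared;
    register[0] holds the newest input bit.
    """
    K = len(generators[0])
    register = [0] * K
    out = []
    for b in bits:
        register = [b] + register[:-1]          # shift in the new bit
        for g in generators:                    # sample each modulo-2 adder
            out.append(sum(x & y for x, y in zip(g, register)) % 2)
    return out

# Impulse response: a single logic-1 followed by K - 1 zeros.
print(conv_encode([1, 0, 0]))   # [1, 1, 1, 0, 1, 1], i.e., 111011
```

Because the encoder is linear, the output for any input is the modulo-2 sum of shifted copies of this impulse response.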
Shifted versions of the impulse response or generator vectors can be combined to form an infinite-order generator matrix which also describes the code. Shift register circuits can be modeled as finite state machines. A Mealy machine description of a convolutional encoder requires $2^{k(K-1)}$ states, each describing a different value of the K − 1 k-tuples which have most recently entered the shift register. Each state has $2^k$ exit paths which correspond to the value of the incoming k-tuple. A state machine description for the rate 1/2 encoder depicted in Fig. 10.4(a) is given in Fig. 10.4(b). States are labeled with the contents of the two leftmost register stages; edges are labeled with information bit values and their corresponding coded output. The dimension of time is added to the description of the encoder with tree and trellis diagrams.

FIGURE 10.3: Block code performance. Source: Sklar, B., 1988, Digital Communications: Fundamentals and Applications, © 1988, p. 300. Reprinted by permission of Prentice-Hall, Inc., Englewood Cliffs, NJ.

The tree diagram for the rate 1/2 convolutional code is given in Fig. 10.4(c), assuming the shift register is initially clear. Each node represents an encoding interval, from which the upper branch is taken if the input bit is a 0 and the lower branch is taken if the input bit is a 1. Each branch is labeled with the corresponding output bit sequence. A drawback of the tree representation is that it grows without bound as the length of the input sequence increases. This is overcome with the trellis diagram depicted in Fig. 10.4(d). Again, encoding results in left-to-right movement, where the upper of the two branches is taken whenever the input is a 0, the lower branch is taken when the input is a 1, and the output is the bit sequence which weights the branch taken. Each level of nodes corresponds to a state of the encoder as shown on the left-hand side of the diagram.

FIGURE 10.4: A rate 1/2, constraint length 3 convolutional code.

If the received sequence contains errors, it may no longer depict a valid path through the tree or trellis. It is the job of the decoder to determine the original path. In doing so, the decoder does not so much correct errors as find the closest valid path to the received sequence. As a result, the error correcting capability of a convolutional code is more difficult to quantify than that of a block code; it depends on how valid paths differ. One measure of this difference is the column distance $d_c(i)$, the minimum Hamming distance between all coded sequences generated over i encoding intervals which differ in the first interval. The nondecreasing sequence of column distance values is the distance profile of the code. The column distance after K intervals is the minimum distance of the code and is important for evaluating the performance of a code that uses threshold decoding. As i increases, $d_c(i)$ approaches the free distance of the code, $d_{\mathrm{free}}$, which is the minimum Hamming distance in the set of arbitrarily long paths that diverge and then remerge in the trellis. With maximum likelihood decoding, convolutional codes can generally correct up to t errors within three to five constraint lengths, depending on how the errors are distributed, where

$$d_{\mathrm{free}} \ge 2t + 1 \qquad (10.8)$$

The free distance can be calculated by exhaustively searching for the minimum-weight path that returns to the all-zero state, or evaluating the term of lowest degree in the generating function of the code. The objective of a convolutional code is to maximize these distance properties. They generally improve as the constraint length of the code increases, and nonsystematic codes generally have better properties than systematic ones. Good codes have been found by computer search and are tabulated in many texts, including [6].

Convolutional codes with high code rate can be constructed by puncturing or periodically deleting coded symbols from a low rate code. A list of low rate codes and perforation matrices that result in good high rate codes can be found in many sources, including [13]. The performance of good punctured codes approaches that of the best convolutional codes known with similar rate, and decoder implementation is significantly less complex.

Convolutional codes can be catastrophic, having the potential to generate an unlimited number of decoded bit errors in response to a finite number of errors in the demodulated bit sequence. Catastrophic error propagation is avoided if the code has generator polynomials with a greatest common divisor of the form $x^a$ for any a or, equivalently, if there are no closed-loop paths in the state diagram with all-zero output other than the one taken with all-zero input. Systematic codes are not catastrophic.
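The search for the minimum-weight path that leaves and later returns to the all-zero state can be sketched as a shortest-path problem on the state diagram. The example below is a minimal illustration for the rate 1/2, constraint length 3 code with generators 111 and 101 used earlier in this section; it uses Dijkstra's algorithm rather than a fully exhaustive search, which is an implementation choice made here, and the function names are hypothetical.

```python
import heapq

def step(state, bit):
    """One encoder transition for generators 111 and 101.

    state = (s1, s2) holds the two most recent input bits, newest first.
    Returns the next state and the Hamming weight of the output pair.
    """
    s1, s2 = state
    out1 = bit ^ s1 ^ s2      # generator 111
    out2 = bit ^ s2           # generator 101
    return (bit, s1), out1 + out2

def free_distance():
    """Minimum output weight over paths that diverge from and remerge with
    the all-zero path."""
    start, weight = step((0, 0), 1)        # forced divergence with input 1
    best = {start: weight}
    heap = [(weight, start)]
    while heap:
        w, state = heapq.heappop(heap)
        if state == (0, 0):                # remerged with the all-zero path
            return w
        if w > best.get(state, float("inf")):
            continue
        for bit in (0, 1):
            nxt, dw = step(state, bit)
            if w + dw < best.get(nxt, float("inf")):
                best[nxt] = w + dw
                heapq.heappush(heap, (w + dw, nxt))
    return None

d_free = free_distance()
print(d_free, (d_free - 1) // 2)
```

For this code the search returns a free distance of 5, so by Eq. (10.8) up to t = 2 errors can generally be corrected within a few constraint lengths.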

10.6 Decoding of Convolutional Codes

In 1967, Viterbi developed a maximum likelihood decoding algorithm that takes advantage of the trellis structure to reduce the complexity of the evaluation. This algorithm has become known as the Viterbi algorithm. With each received n-tuple, the decoder computes a metric or measure of likelihood for all paths that could have been taken during that interval and discards all but the most likely to terminate on each node. An arbitrary decision is made if path metrics are equal. The metrics can be formed using either hard or soft decision information with little difference in implementation complexity. If the message has finite length and the encoder is subsequently flushed with zeros, a single decoded path remains. With a BSC, this path corresponds to the valid code sequence with minimum Hamming distance from the demodulated sequence. Full-length decoding becomes impractical as the length of the message sequence increases. The most likely paths tend to have a common stem, however, and selecting the trace value four or five times the constraint length prior to the present decoding depth results in near-optimum performance. Since the number of paths examined during each interval increases exponentially with the constraint length, the Viterbi algorithm also becomes impractical for codes with large constraint length. To date, Viterbi decoding has been implemented for codes with constraint lengths up to ten. Other decoding techniques, such as sequential and threshold decoding, can be used with larger constraint lengths. Sequential decoding was proposed by Wozencraft, and the most widely used algorithm was developed by Fano. Rather than tracking multiple paths through the trellis, the sequential decoder operates on a single path while searching the code tree for a path with high probability. 
It makes tentative decisions regarding the transmitted sequence, computes a metric between its proposed path and the demodulated sequence, and moves forward through the tree as long as the metric indicates that the path is likely. If the likelihood of the path becomes low, the decoder moves backward, searching other paths until it finds one with high probability. The number of computations involved in this procedure is almost independent of the constraint length and is typically quite small, but it can be highly variable, depending on the channel. Buffers must be provided to store incoming sequences as the decoder searches the tree. Their overflow is a significant limiting factor in the performance of these decoders. Figure 10.5 compares the performance of the Viterbi and sequential decoding algorithms for several convolutional codes operating on coherently demodulated BPSK signals corrupted by AWGN. Other decoding algorithms have also been developed, including syndrome decoding methods such as table look-up feedback decoding and threshold decoding [6]. These algorithms are easily implemented but offer suboptimal performance. Techniques such as the one discussed by [1] have been developed to support both soft input and soft output, but these decoding techniques typically increase decoder complexity.
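The add-compare-select procedure of the Viterbi algorithm described in this section can be sketched with hard decisions as follows. This is a minimal illustration for the rate 1/2, constraint length 3 code with generators 111 and 101, assuming a finite message that is flushed with K − 1 zeros and full-length decoding (no truncated traceback); the function names are hypothetical.

```python
def branch(state, bit):
    """Next state and output pair for generators 111 and 101."""
    s1, s2 = state
    return (bit, s1), (bit ^ s1 ^ s2, bit ^ s2)

def encode(bits):
    """Encode a message and flush the encoder with two zeros."""
    state, out = (0, 0), []
    for b in list(bits) + [0, 0]:
        state, pair = branch(state, b)
        out.extend(pair)
    return out

def viterbi_decode(received):
    """Hard-decision Viterbi decoding with Hamming-distance path metrics."""
    # survivors maps each state to (metric, input bits of the surviving path)
    survivors = {(0, 0): (0, [])}
    for i in range(0, len(received), 2):
        r = received[i:i + 2]
        new = {}
        for state, (metric, path) in survivors.items():
            for bit in (0, 1):
                nxt, out = branch(state, bit)
                m = metric + (out[0] != r[0]) + (out[1] != r[1])
                # Keep only the most likely path terminating on each node;
                # ties are resolved arbitrarily in favor of the first path found.
                if nxt not in new or m < new[nxt][0]:
                    new[nxt] = (m, path + [bit])
        survivors = new
    _, path = survivors[(0, 0)]            # flushed encoder ends in state 00
    return path[:-2]                       # drop the two flush bits

message = [1, 0, 1, 1, 0, 0, 1]
rx = encode(message)
rx[3] ^= 1
rx[10] ^= 1                                # two channel errors
print(viterbi_decode(rx) == message)
```

Storing the full input history with each survivor is wasteful for long messages; practical decoders instead trace back over a window of a few constraint lengths, as noted in the text.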

10.7 Trellis-Coded Modulation

Trellis-coded modulation (TCM) has received considerable attention since its development by Ungerboeck in the late 1970s [11]. Unlike block and convolutional codes, TCM schemes achieve coding gain by increasing the size of the signal alphabet and using multilevel/phase signalling. Like convolutional codes, sequences of coded symbols are restricted to certain valid patterns. In TCM, these patterns are chosen to have large Euclidean distance from one another so that a large number of corrupted sequences can be corrected. The Viterbi algorithm is often used to decode these sequences. Since the symbol transmission rate does not increase, coded and uncoded signals require the same transmission bandwidth. If transmission power is held constant, the signal constellation of the coded signal is denser. The loss in symbol separation, however, is more than overcome by the error correction capability of the code.

FIGURE 10.5: Convolutional code performance. Source: Omura, J.K. and Levitt, B.K., “Coded Error Probability Evaluation for Antijam Communication Systems,” IEEE Trans. Commun., vol. COM-30, no. 5, pp. 896–903, © 1982 IEEE. Reprinted by permission of IEEE.

Ungerboeck investigated the increase in channel capacity that can be obtained by increasing the size of the signal set and restricting the pattern of transmitted symbols, and concluded that almost all of the additional capacity can be gained by doubling the number of points in the signal constellation. This is accomplished by encoding the binary data with a rate $R = k/(k + 1)$ code and mapping sequences of k + 1 coded bits to points in a constellation of $2^{k+1}$ symbols. For example, the rate 2/3 encoder of Fig. 10.6(a) encodes pairs of source bits to three coded bits. Figure 10.6(b) depicts one stage in the trellis of the coded output where, as with the convolutional code, the state of the encoder is defined by the values of the two most recent bits to enter the shift register. Note that unlike the trellis for the convolutional code, this trellis contains parallel paths between nodes.

FIGURE 10.6: Rate 2/3 trellis-coded modulation.

The key to improving performance with TCM is to map the coded bits to points in the signal space such that the Euclidean distance between transmitted sequences is maximized. A method that ensures improved Euclidean distance is the method of set partitioning. This involves separating all parallel paths on the trellis with maximum distance and assigning the next greatest distance to paths that diverge from or merge onto the same node. Figures 10.6(c) and 10.6(d) give examples of mappings for the rate 2/3 code with 8-PSK and 8-PAM signal constellations, respectively.

As with convolutional codes, the free distance of a TCM code is defined as the minimum distance between paths through the trellis, where the distance of concern is now Euclidean distance rather than Hamming distance. The free distance of an uncoded signal is defined as the distance between the closest signal points. When coded and uncoded signals have the same average power, the coding gain of the TCM system is defined as

$$\text{coding gain} = 20 \log_{10} \frac{d_{\mathrm{free,\,coded}}}{d_{\mathrm{free,\,uncoded}}} \qquad (10.9)$$

It can be shown that the simple, rate 2/3, 8-ary phase-shift keying (PSK) and 8-ary pulse-amplitude modulation (PAM) TCM systems provide gains of 3 dB and 3.3 dB, respectively [6]. More complex TCM systems yield gains up to 6 dB. Tables of good codes are given in [11].
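The 3 dB figure for the 8-PSK scheme can be reproduced from Eq. (10.9) with a short calculation. The sketch below computes the minimum Euclidean distance at each level of a set partition of a unit-energy 8-PSK constellation. It then assumes, as a labeled simplification, that the free distance of the coded system is set by the parallel transitions, which are assigned antipodal points, and that the uncoded reference is QPSK at the same average power. The encoder of Fig. 10.6 is not reproduced here, so this is an illustration of the definition rather than an analysis of that specific code.

```python
import math
from itertools import combinations

def psk_points(M):
    """Unit-energy M-PSK constellation as complex numbers."""
    return [complex(math.cos(2 * math.pi * i / M), math.sin(2 * math.pi * i / M))
            for i in range(M)]

def min_distance(points):
    """Smallest Euclidean distance between any two points in the set."""
    return min(abs(a - b) for a, b in combinations(points, 2))

def coding_gain_db(d_free_coded, d_free_uncoded):
    """Eq. (10.9)."""
    return 20 * math.log10(d_free_coded / d_free_uncoded)

points = psk_points(8)
delta0 = min_distance(points)         # full 8-PSK set
delta1 = min_distance(points[0::2])   # first partition: a QPSK subset
delta2 = min_distance(points[0::4])   # second partition: an antipodal pair

# Assumption: d_free of the coded system equals delta2 (parallel transitions),
# and the uncoded QPSK reference has d_free equal to delta1.
gain = coding_gain_db(delta2, delta1)
print(round(delta0, 3), round(delta1, 3), round(delta2, 3), round(gain, 2))
```

The intra-subset distance grows at each partition level, which is why assigning the most distant points to parallel paths protects the transitions that the trellis structure cannot.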

10.8 Additional Measures

When the demodulated sequence contains bursts of errors, the performance of codes designed to correct independent errors improves if coded sequences are interleaved prior to transmission and deinterleaved prior to decoding. Deinterleaving separates the burst errors, making them appear more random and increasing the likelihood of accurate decoding. It is generally sufficient to interleave several block lengths of a block coded signal or several constraint lengths of a convolutionally encoded signal. Block interleaving is the most straightforward approach, but delay and memory requirements are halved with convolutional and helical interleaving techniques. Periodicity in the way sequences are combined is avoided with pseudorandom interleaving. Serially concatenated codes, first investigated by Forney, use two levels of coding to achieve a level of performance with less complexity than a single coding stage would require. The inner code interfaces with the modulator and demodulator and corrects the majority of the errors; the outer code corrects errors that appear at the output of the inner-code decoder. A convolutional code with Viterbi decoding is usually chosen as the inner code, and an RS code is often chosen as the outer code due to its ability to correct the bursts of bit errors which can result with incorrect decoding of trellis-coded sequences. Interleaving and deinterleaving outer-code symbols between coding stages offers further protection against the burst error output of the inner code. Product codes effectively place the data in a two dimensional array and use FEC techniques over both the rows and columns of this array. Not only do these codes result in error protection in two dimensions, but the manner in which the array is constructed can offer advantages similar to those achieved through interleaving.
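Block interleaving, the most straightforward of the approaches mentioned above, can be sketched as writing symbols into an array by rows and reading them out by columns. The example below is illustrative; the array dimensions and function names are assumptions. With one codeword per row, a channel burst no longer than the number of rows touches each codeword at most once after deinterleaving.

```python
def block_interleave(symbols, rows, cols):
    """Write row by row into a rows x cols array, read column by column."""
    assert len(symbols) == rows * cols
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]

def block_deinterleave(symbols, rows, cols):
    """Inverse operation: write column by column, read row by row."""
    assert len(symbols) == rows * cols
    return [symbols[c * rows + r] for r in range(rows) for c in range(cols)]

ROWS, COLS = 4, 7                      # e.g., four codewords of length 7
data = list(range(ROWS * COLS))
tx = block_interleave(data, ROWS, COLS)

rx = list(tx)
for j in range(5, 9):                  # burst of four consecutive channel errors
    rx[j] = -1                         # -1 marks a corrupted symbol

out = block_deinterleave(rx, ROWS, COLS)
rows_hit = sorted(i // COLS for i, v in enumerate(out) if v == -1)
print(rows_hit)                        # one corrupted symbol in each codeword
```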

10.9 Turbo Codes

The most recent significant achievement in FEC coding is the development of Turbo codes [3]. The principle of this coding technique is to encode the data with two or more constituent codes concatenated in parallel form. The received sequence is decoded in an iterative, serial approach using soft-input, soft-output decoders. This iterative decoding approach involves feedback of information in a manner similar to processes within the turbo engine, giving this coding technique its name. Turbo codes effectively result in the construction of relatively long codewords with few codewords being close in terms of Hamming distance, while at the same time constraining the implementation complexity of the decoder to practical limits.

The first Turbo codes developed used recursive systematic convolutional codes as the constituent codes, and punctured them to improve the code rate. The use of other constituent codes has since been considered. Two or more of these codes are concatenated in parallel, where code concatenation is combined with interleaving in order to increase the independence of the data sequences encoded by the constituent encoders. This apparent increase in randomness, implemented with simple interleavers, is an important contributing factor to the excellent performance of the decoders.
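The parallel concatenation described above can be sketched for the encoder side only. The example below is a simplified illustration under stated assumptions: the constituent code is a four-state recursive systematic convolutional code with feedback polynomial $1 + D + D^2$ and feedforward polynomial $1 + D^2$, the interleaver is a seeded pseudorandom permutation, and no puncturing or trellis termination is applied, giving rate 1/3. None of these specific choices are taken from the chapter, and iterative decoding is not shown.

```python
import random

def rsc_parity(bits):
    """Parity sequence of an assumed recursive systematic convolutional
    encoder with feedback 1 + D + D^2 and feedforward 1 + D^2."""
    s1 = s2 = 0
    parity = []
    for u in bits:
        a = u ^ s1 ^ s2            # recursive (feedback) bit
        parity.append(a ^ s2)      # feedforward taps at 1 and D^2
        s1, s2 = a, s1
    return parity

def turbo_encode(bits, seed=0):
    """Rate 1/3 parallel concatenation: systematic bit, parity of the
    original order, parity of the interleaved order."""
    perm = list(range(len(bits)))
    random.Random(seed).shuffle(perm)          # hypothetical interleaver
    interleaved = [bits[i] for i in perm]
    p1 = rsc_parity(bits)
    p2 = rsc_parity(interleaved)
    out = []
    for u, a, b in zip(bits, p1, p2):
        out.extend((u, a, b))
    return out, perm

message = [1, 0, 1, 1, 0, 0, 1, 0]
coded, perm = turbo_encode(message)
print(len(coded), coded[0::3] == message)
```

Because the constituent encoder is recursive, a single input 1 produces a parity sequence that does not die out, which is one reason interleaving between the two encoders tends to yield few low-weight codewords.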

As in other multi-stage coding techniques, the complexity of the decoder is limited through use of separate decoding stages for each constituent code. The input to the first stage is the soft output of the demodulator for a finite-length received symbol sequence. Subsequent stages use both the demodulator output and an output of the previous decoding stage which is indicative of the reliability of the symbols. This information, gleaned from soft-output decoders, is called extrinsic information. Decoding proceeds by iterating through constituent decoders, each forwarding updated extrinsic information to the next decoder, until a predefined number of iterations has been completed or the extrinsic information indicates that high reliability has been achieved.

This approach results in very good performance at low values of $E_b/N_0$. Simulations have demonstrated error rates of $10^{-5}$ at signal-to-noise ratios appreciably less than 1 dB. At higher values of $E_b/N_0$, however, the performance curves can exhibit flattening if constituent codes are chosen in a manner that results in an overall small Hamming distance for the code.

Although this coding technique has shown great promise, there remains considerable work with regard to optimizing code parameters. Great strides have been made over the last few years in understanding the structure of these codes and relating them to serially concatenated and product codes, but many researchers are still examining these codes in order to advance their development. With this research will come optimization of the Turbo code process and application of these codes in various communication systems.

10.10 Applications

FEC coding remained of theoretical interest until advances in digital technology and improvements in decoding algorithms made their implementation possible. It has since become an attractive alternative to improving other system components or boosting transmission power. FEC codes are commonly used in digital storage systems, deep-space and satellite communication systems, terrestrial radio and band limited wireline systems, and have also been proposed for fiber optic transmission. Accordingly, the theory and practice of error correcting codes now occupies a prominent position in the field of communications engineering. Deep-space systems began using forward error correction in the early 1970s to reduce transmission power requirements, and used multiple error correcting RS codes for the first time in 1977 to protect against corruption of compressed image data in the Voyager missions [12]. The Consultative Committee for Space Data Systems (CCSDS) has since recommended use of a concatenated coding system which uses a rate 1/2, constraint length 7 convolutional inner code and a (255, 223) RS outer code. Coding is now commonly used in satellite systems to reduce power requirements and overall hardware costs and to allow closer orbital spacing of geosynchronous satellites [2]. FEC codes play integral roles in the VSAT, MSAT, INTELSAT, and INMARSAT systems [13]. Further, a (31, 15) RS code is used in the joint tactical information distribution system (JTIDS), a (7, 2) RS code is used in the air force satellite communication system (AFSATCOM), and a (204, 192) RS code has been designed specifically for satellite time division multiple access (TDMA) systems. Another code designed for military applications involves concatenation of a Golay and RS code with interleaving to ensure an imbalance of 1’s and 0’s in the transmitted symbol sequence and enhance signal recovery under severe noise and interference [2]. 
TCM has become commonplace in transmission of data over voiceband telephone channels. Modems developed since 1984 use trellis-coded QAM to provide robust communication at rates above 9.6 kb/s. Various coding techniques are used in the new digital cellular and personal communication standards, with an emphasis on convolutional and cyclic redundancy check codes [8]. FEC codes have also been widely used in digital recording systems, most prominently in the compact disc digital audio system. This system uses two levels of coding and interleaving in the cross-interleaved RS coding (CIRC) system to correct errors that result from disc imperfections and dirt and scratches which accumulate during use. Steps are also taken to mute uncorrectable sequences [12].

Defining Terms

Binary symmetric channel: A memoryless discrete data channel with binary signalling, hard-decision demodulation, and channel impairments that do not depend on the value of the symbol transmitted.
Bounded distance decoding: Limiting the error patterns which are corrected in an imperfect code to those with t or fewer errors.
Catastrophic code: A convolutional code in which a finite number of code symbol errors can cause an unlimited number of decoded bit errors.
Code rate: The ratio of source word length to codeword length, indicative of the amount of information transmitted per encoded symbol.
Coding gain: The reduction in signal-to-noise ratio required for specified error performance in a block or convolutional coded system over an uncoded system with the same information rate, channel impairments, and modulation and demodulation techniques. In TCM, the ratio of the squared free distance in the coded system to that of the uncoded system.
Column distance: The minimum Hamming distance between convolutionally encoded sequences of a specified length with different leading n-tuples.
Constituent codes: Two or more FEC codes that are combined in concatenated coding techniques.
Cyclic code: A block code in which cyclic shifts of code vectors are also code vectors.
Cyclic redundancy check: When the syndrome of a cyclic block code is used to detect errors.
Designed distance: The guaranteed minimum distance of a BCH code designed to correct up to t errors.
Discrete data channel: The concatenation of all system elements between FEC encoder output and decoder input.
Distance profile: The minimum Hamming distance after each encoding interval of convolutionally encoded sequences which differ in the first interval.
Erasure: A position in the demodulated sequence where the symbol value is unknown.
Extrinsic information: The output of a constituent soft decision decoder that is forwarded as input to the next decoding stage in iterative decoding of Turbo codes.
Finite field: A finite set of elements and operations of addition and multiplication that satisfy specific properties. Often called Galois fields and denoted GF(q), where q is the number of elements in the field. Finite fields exist for all q which are prime or the power of a prime.
Free distance: The minimum Hamming weight of convolutionally encoded sequences that diverge and remerge in the trellis. Equals the maximum column distance and the limiting value of the distance profile.
Generator matrix: A matrix used to describe a linear code. Code vectors equal the information vectors multiplied by this matrix.
Generator polynomial: The polynomial that is a divisor of all codeword polynomials in a cyclic block code; a polynomial that describes circuit connections in a convolutional encoder.
Hamming distance: The number of symbols in which codewords differ.
Hard decision: Demodulation that outputs only a value for each received symbol.
Interleaving: Shuffling the coded bit sequence prior to modulation and reversing this operation following demodulation. Used to separate and redistribute burst errors over several codewords (block codes) or constraint lengths (trellis codes) for higher probability of correct decoding by codes designed to correct random errors.
Linear code: A code whose code vectors form a vector space. Equivalently, a code where the addition of any two code vectors forms another code vector.
Maximum distance separable: A code with the largest possible minimum distance given the block length and code rate. These codes meet the Singleton bound $d_{min} \le n - k + 1$ with equality.
Metric: A measure of goodness against which items are judged. In the Viterbi algorithm, an indication of the probability of a path being taken given the demodulated symbol sequence.
Minimum distance: In a block code, the smallest Hamming distance between any two codewords. In a convolutional code, the column distance after K intervals.
Parity check matrix: A matrix whose rows are orthogonal to the rows in the generator matrix of a linear code. Errors can be detected by multiplying the received vector by this matrix.
Perfect code: A t error correcting (n, k) block code in which $q^{n-k} - 1 = \sum_{i=1}^{t} \binom{n}{i}(q-1)^i$, which for binary codes ($q = 2$) reduces to $2^{n-k} - 1 = \sum_{i=1}^{t} \binom{n}{i}$.
Puncturing: Periodic deletion of code symbols from the sequence generated by a convolutional encoder for purposes of constructing a higher rate code. Also, deletion of parity bits in a block code.
Set partitioning: Rules for mapping coded sequences to points in the signal constellation that always result in a larger Euclidean distance for a TCM system than an uncoded system, given appropriate construction of the trellis.
Shannon Limit: The ratio of energy per data bit $E_b$ to one-sided noise power spectral density $N_0$ in an AWGN channel above which errorless transmission is possible when bandwidth limitations are not placed on the signal and transmission is at channel capacity. This limit has the value $\ln 2 \approx 0.693$, or about −1.6 dB.
Soft decision: Demodulation that outputs an estimate of the received symbol value along with an indication of the reliability of this value. Usually implemented by quantizing the received signal to more levels than there are symbol values.
Standard array decoding: Association of an error pattern with each syndrome by way of a lookup table.
Syndrome: An indication of whether or not errors are present in the demodulated symbol sequence.
Systematic code: A code in which the values of the message symbols can be identified by inspection of the code vector.
Vector space: An algebraic structure comprised of a set of elements in which operations of vector addition and scalar multiplication are defined. For our purposes, a set of n-tuples consisting of symbols from GF(q) with addition and multiplication defined in terms of elementwise operations from this finite field.
Viterbi algorithm: A maximum-likelihood decoding algorithm for trellis codes that discards low-probability paths at each stage of the trellis, thereby reducing the total number of paths that must be considered.

References

[1] Bahl, L.R., Cocke, J., Jelinek, F., and Raviv, J., Optimal decoding of linear codes for minimizing symbol error rate. IEEE Transactions on Information Theory, 20, 284–287, 1974.
[2] Berlekamp, E.R., Peile, R.E., and Pope, S.P., The application of error control to communications. IEEE Commun. Mag., 25(4), 44–57, 1987.
[3] Berrou, C., Glavieux, A., and Thitimajshima, P., Near Shannon limit error-correcting coding and decoding: Turbo codes. Proceedings of ICC'93, Geneva, Switzerland, 1064–1070, 1993. Later expanded and published as: Berrou, C. and Glavieux, A., Near optimum error correcting coding and decoding. IEEE Transactions on Communications, 44(10), 1261–1271, 1996.
[4] Bhargava, V.K., Forward error correction schemes for digital communications. IEEE Commun. Mag., 21(1), 11–19, 1983.
[5] Blahut, R.E., Theory and Practice of Error Control Codes, Addison-Wesley, Reading, MA, 1983.
[6] Clark, G.C. Jr. and Cain, J.B., Error Correction Coding for Digital Communications, Plenum Press, New York, 1981.
[7] Lin, S. and Costello, D.J. Jr., Error Control Coding: Fundamentals and Applications, Prentice-Hall, Englewood Cliffs, NJ, 1983.
[8] Rappaport, T.S., Wireless Communications: Principles and Practice, Prentice-Hall and IEEE Press, NJ, 1996.
[9] Shannon, C.E., A mathematical theory of communication. Bell Syst. Tech. J., 27(3), 379–423 and 623–656, 1948.
[10] Sklar, B., Digital Communications: Fundamentals and Applications, Prentice-Hall, Englewood Cliffs, NJ, 1988.
[11] Ungerboeck, G., Trellis-coded modulation with redundant signal sets. IEEE Commun. Mag., 25(2), 5–11 and 12–21, 1987.
[12] Wicker, S.B. and Bhargava, V.K., Reed-Solomon Codes and Their Applications, IEEE Press, NJ, 1994.
[13] Wu, W.W., Haccoun, D., Peile, R., and Hirata, Y., Coding for satellite communication. IEEE J. Selected Areas in Commun., SAC-5(4), 724–748, 1987.

Further Information

There is now a large amount of literature on the subject of FEC coding. An introduction to the philosophy and limitations of these codes can be found in the second chapter of Lucky's book Silicon Dreams: Information, Man, and Machine, St. Martin's Press, New York, 1989. More practical introductions can be found in overview chapters of many communications texts. The number of texts devoted entirely to this subject also continues to grow. Although these texts summarize the algebra underlying block codes, more in-depth treatments can be found in mathematical texts. Survey papers appear occasionally in the literature, but the interested reader is directed to the seminal papers by Shannon, Hamming, Reed and Solomon, Bose and Chaudhuri, Hocquenghem, Wozencraft, Fano, Forney, Berlekamp, Massey, Viterbi, Ungerboeck, Berrou and Glavieux, among others. The most recent advances in the theory and implementation of error control codes are published in IEEE Transactions on Information Theory, IEEE Transactions on Communications, and special issues of IEEE Journal on Selected Areas in Communications.


Milstein, L.B. & Simon, M.K. “Spread Spectrum Communications” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Spread Spectrum Communications

Laurence B. Milstein, University of California
Marvin K. Simon, Jet Propulsion Laboratory

11.1 A Brief History
11.2 Why Spread Spectrum?
11.3 Basic Concepts and Terminology
11.4 Spread Spectrum Techniques
Direct Sequence Modulation • Frequency Hopping Modulation • Time Hopping Modulation • Hybrid Modulations
11.5 Applications of Spread Spectrum
Military • Commercial
Defining Terms
References

11.1 A Brief History

Spread spectrum (SS) has its origin in the military arena where the friendly communicator is 1) susceptible to detection/interception by the enemy and 2) vulnerable to intentionally introduced unfriendly interference (jamming). Communication systems that employ spread spectrum to reduce the communicator’s detectability and combat the enemy-introduced interference are respectively referred to as low probability of intercept (LPI) and antijam (AJ) communication systems. With the change in the current world political situation wherein the U.S. Department of Defense (DOD) has reduced its emphasis on the development and acquisition of new communication systems for the original purposes, a host of new commercial applications for SS has evolved, particularly in the area of cellular mobile communications. This shift from military to commercial applications of SS has demonstrated that the basic concepts that make SS techniques so useful in the military can also be put to practical peacetime use. In the next section, we give a simple description of these basic concepts using the original military application as the basis of explanation. The extension of these concepts to the mentioned commercial applications will be treated later on in the chapter.

11.2 Why Spread Spectrum?

Spread spectrum is a communication technique wherein the transmitted modulation is spread (increased) in bandwidth prior to transmission over the channel and then despread (decreased) in bandwidth by the same amount at the receiver. If it were not for the fact that the communication channel introduces some form of narrowband (relative to the spread bandwidth) interference, the receiver performance would be transparent to the spreading and despreading operations (assuming that they are exact inverses of each other). That is, after despreading the received signal would be identical to the transmitted signal prior to spreading. In the presence of narrowband interference, however, there is a significant advantage to employing the spreading/despreading procedure described. The reason for this is as follows. Since the interference is introduced after the transmitted signal is spread, the despreading operation at the receiver shrinks the desired signal back to its original bandwidth while at the same time spreading the undesired signal (interference) in bandwidth by the same amount, thus reducing its power spectral density. This, in turn, serves to diminish the effect of the interference on the receiver performance, which depends on the amount of interference power in the despread bandwidth. This very simple explanation is at the heart of all spread spectrum techniques.

11.3 Basic Concepts and Terminology

To describe this process analytically and at the same time introduce some terminology that is common in spread spectrum parlance, we proceed as follows. Consider a communicator that desires to send a message using a transmitted power of $S$ watts (W) at an information rate of $R_b$ bits/s (bps). By introducing an SS modulation, the bandwidth of the transmitted signal is increased from $R_b$ Hz to $W_{ss}$ Hz, where $W_{ss} \gg R_b$ denotes the spread spectrum bandwidth. Assume that the channel introduces, in addition to the usual thermal noise (assumed to have a single-sided power spectral density (PSD) equal to $N_0$ W/Hz), an additive interference (jamming) having power $J$ distributed over some bandwidth $W_J$. After despreading, the desired signal bandwidth is once again equal to $R_b$ Hz and the interference PSD is now $N_J = J/W_{ss}$. Note that since the thermal noise is assumed to be white, i.e., it is uniformly distributed over all frequencies, its PSD is unchanged by the despreading operation and, thus, remains equal to $N_0$. Regardless of the signal and interferer waveforms, the equivalent bit energy-to-total noise spectral density ratio is, in terms of the given parameters,

$$\frac{E_b}{N_t} = \frac{E_b}{N_0 + N_J} = \frac{S/R_b}{N_0 + J/W_{ss}} \tag{11.1}$$

For most practical scenarios, the jammer limits performance and, thus, the effects of receiver noise in the channel can be ignored. Thus, assuming $N_J \gg N_0$, we can rewrite Eq. (11.1) as

$$\frac{E_b}{N_t} \cong \frac{E_b}{N_J} = \frac{S/R_b}{J/W_{ss}} = \frac{S}{J}\,\frac{W_{ss}}{R_b} \tag{11.2}$$

where the ratio $J/S$ is the jammer-to-signal power ratio and the ratio $W_{ss}/R_b$ is the spreading ratio and is defined as the processing gain of the system. Since the ultimate error probability performance of the communication receiver depends on the ratio $E_b/N_J$, we see that from the communicator's viewpoint the goal should be to minimize $J/S$ (by choice of $S$) and maximize the processing gain (by choice of $W_{ss}$ for a given desired information rate). The possible strategies for the jammer will be discussed in the section on military applications dealing with AJ communications.
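As a quick numerical check of Eqs. (11.1) and (11.2), the following sketch evaluates both expressions for one set of parameters. All numbers are illustrative assumptions chosen so that the jammer dominates the thermal noise, and the helper names `eb_over_nt` and `to_db` are hypothetical.

```python
import math


def eb_over_nt(S, Rb, N0, J, Wss):
    """Eq. (11.1): bit energy over total (thermal plus despread jammer) noise density."""
    Eb = S / Rb        # energy per bit, joules
    NJ = J / Wss       # jammer PSD after despreading, W/Hz
    return Eb / (N0 + NJ)


def to_db(x):
    """Convert a power ratio to decibels."""
    return 10.0 * math.log10(x)


# Assumed example values, not taken from the text.
S, J = 1.0, 100.0        # watts; the jammer is 20 dB stronger than the signal
Rb, Wss = 10e3, 10e6     # 10 kb/s of data spread over 10 MHz
N0 = 1e-9                # W/Hz, small enough that the jammer dominates

processing_gain = Wss / Rb
exact = eb_over_nt(S, Rb, N0, J, Wss)
approx = (S / J) * processing_gain     # Eq. (11.2), thermal noise ignored

print(f"processing gain   = {to_db(processing_gain):.1f} dB")
print(f"Eb/Nt, Eq. (11.1) = {to_db(exact):.2f} dB")
print(f"Eb/NJ, Eq. (11.2) = {to_db(approx):.2f} dB")
```

With these assumed values the processing gain more than offsets the jammer's power advantage, and the two expressions agree closely because $N_J \gg N_0$.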

11.4 Spread Spectrum Techniques

By far the two most popular spreading techniques are direct sequence (DS) modulation and frequency hopping (FH) modulation. In the following subsections, we present a brief description of each.


FIGURE 11.1: A DS-BPSK system (complex form).

11.4.1 Direct Sequence Modulation

A direct sequence modulation $c(t)$ is formed by linearly modulating the output sequence $\{c_n\}$ of a pseudorandom number generator onto a train of pulses, each having a duration $T_c$ called the chip time. In mathematical form,

$$c(t) = \sum_{n=-\infty}^{\infty} c_n\, p(t - nT_c) \tag{11.3}$$

where $p(t)$ is the basic pulse shape and is assumed to be of rectangular form. This type of modulation is usually used with binary phase-shift-keyed (BPSK) information signals, which have the complex form $d(t)\exp\{j(2\pi f_c t + \theta_c)\}$, where $d(t)$ is a binary-valued data waveform of rate $1/T_b$ bits/s and $f_c$ and $\theta_c$ are the frequency and phase of the data-modulated carrier, respectively. As such, a DS/BPSK signal is formed by multiplying the BPSK signal by $c(t)$ (see Fig. 11.1), resulting in the real transmitted signal

$$x(t) = \mathrm{Re}\{c(t)\,d(t)\exp[j(2\pi f_c t + \theta_c)]\} \tag{11.4}$$

Since $T_c$ is chosen so that $T_b \gg T_c$, then relative to the bandwidth of the BPSK information signal, the bandwidth of the DS/BPSK signal (see footnote 1) is effectively increased by the ratio $T_b/T_c = W_{ss}/(2R_b)$, which is one-half the spreading factor or processing gain of the system. At the receiver, the sum of the transmitted DS/BPSK signal and the channel interference $I(t)$ (as discussed before, we ignore the presence of the additive thermal noise) is ideally multiplied by the identical DS modulation (this operation is known as despreading), which returns the DS/BPSK signal to its original BPSK form whereas the real interference signal is now the real wideband signal $\mathrm{Re}\{I(t)c(t)\}$.

In the previous sentence, we used the word ideally, which implies that the PN waveform used for despreading at the receiver is identical to that used for spreading at the transmitter. This simple implication covers up a multitude of tasks that a practical DS receiver must perform. In particular, the receiver must first acquire the PN waveform. That is, the local PN generator that produces the PN waveform used for despreading at the receiver must be aligned (synchronized) to within one chip of the PN waveform of the received DS/BPSK signal. This is accomplished by employing some sort of search algorithm which typically steps the local PN waveform sequentially in time by a fraction of a chip (e.g., half a chip) and at each position searches for a high degree of correlation between the received and local PN reference waveforms. The search terminates when the correlation exceeds a given threshold, which is an indication that the alignment has been achieved. After bringing the two PN waveforms into coarse alignment, a tracking algorithm is employed to maintain fine alignment. The most popular forms of tracking loops are the continuous time delay-locked loop and its time-multiplexed version, the tau-dither loop. It is the difficulty in synchronizing the receiver PN generator to subnanosecond accuracy that limits PN chip rates to values on the order of hundreds of Mchips/s, which implies the same limitation on the DS spread spectrum bandwidth $W_{ss}$.

Footnote 1: For the usual case of a rectangular spreading pulse $p(t)$, the PSD of the DS/BPSK modulation will have $(\sin x/x)^2$ form with first zero crossing at $1/T_c$, which is nominally taken as one-half the spread spectrum bandwidth $W_{ss}$.
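The spreading, despreading, and serial-search acquisition steps described above can be sketched at baseband as follows. This is a minimal illustration under several assumptions that are not part of the text: the carrier is omitted, the chips come from a length-31 maximal-length shift-register sequence, one data bit spans exactly one code period, the channel is noiseless, and the threshold of 0.9 is arbitrary. The function names are hypothetical.

```python
def m_sequence(length=31):
    """+/-1 chips from the recurrence a[n] = a[n-5] XOR a[n-3] (period 31)."""
    a = [1, 0, 0, 0, 0]
    while len(a) < length:
        a.append(a[-5] ^ a[-3])
    return [1 - 2 * bit for bit in a[:length]]   # map 0 -> +1, 1 -> -1


def spread(bits, chips):
    """Multiply each +/-1 data bit by one full period of the chip sequence."""
    return [b * c for b in bits for c in chips]


def despread(samples, chips):
    """Multiply by the aligned local chips and integrate over each bit."""
    n = len(chips)
    decisions = []
    for k in range(0, len(samples), n):
        acc = sum(s * c for s, c in zip(samples[k:k + n], chips))
        decisions.append(1 if acc > 0 else -1)
    return decisions


def acquire(received, local, threshold=0.9):
    """Serial search: step the local reference one sample at a time and stop
    when the normalized correlation exceeds the threshold."""
    n = len(local)
    for shift in range(n):
        corr = sum(received[i] * local[(i - shift) % n] for i in range(n)) / n
        if corr > threshold:
            return shift
    return None


chips = m_sequence()
bits = [1, -1, -1, 1]
print("recovered bits:", despread(spread(bits, chips), chips))

# Two samples per chip, so a one-sample step is a half-chip step.
local = [c for c in chips for _ in range(2)]
true_offset = 17                                  # unknown to the receiver
received = [local[(i - true_offset) % len(local)] for i in range(len(local))]
print("acquired offset (samples):", acquire(received, local))
```

In a practical receiver the correlation would be computed on noisy samples and the coarse estimate handed to a tracking loop; the sketch only shows why the correlation peak singles out the correct alignment.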

11.4.2 Frequency Hopping Modulation

A frequency hopping (FH) modulation $c(t)$ is formed by nonlinearly modulating a train of pulses with a sequence of pseudorandomly generated frequency shifts $\{f_n\}$. In mathematical terms, $c(t)$ has the complex form

$$c(t) = \sum_{n=-\infty}^{\infty} \exp\{j(2\pi f_n t + \phi_n)\}\, p(t - nT_h) \tag{11.5}$$

where $p(t)$ is again the basic pulse shape having a duration $T_h$, called the hop time, and $\{\phi_n\}$ is a sequence of random phases associated with the generation of the hops. FH modulation is traditionally used with multiple-frequency-shift-keyed (MFSK) information signals, which have the complex form $\exp\{j[2\pi(f_c + d(t))t]\}$, where $d(t)$ is an $M$-level digital waveform ($M$ denotes the symbol alphabet size) representing the information frequency modulation at a rate $1/T_s$ symbols/s (sps). As such, an FH/MFSK signal is formed by complex multiplying the MFSK signal by $c(t)$, resulting in the real transmitted signal

$$x(t) = \mathrm{Re}\{c(t)\exp\{j[2\pi(f_c + d(t))t]\}\} \tag{11.6}$$

In reality, $c(t)$ is never generated in the transmitter. Rather, $x(t)$ is obtained by applying the sequence of pseudorandom frequency shifts $\{f_n\}$ directly to the frequency synthesizer that generates the carrier frequency $f_c$ (see Fig. 11.2). In terms of the actual implementation, successive (not necessarily disjoint) $k$-chip segments of a PN sequence drive a frequency synthesizer, which hops the carrier over $2^k$ frequencies. In view of the large bandwidths over which the frequency synthesizer must operate, it is difficult to maintain phase coherence from hop to hop, which explains the inclusion of the sequence $\{\phi_n\}$ in the Eq. (11.5) model for $c(t)$.

FIGURE 11.2: An FH-MFSK system.

On a short term basis, e.g., within a given hop, the signal bandwidth is identical to that of the MFSK information modulation, which is typically much smaller than $W_{ss}$. On the other hand, when averaged over many hops, the signal bandwidth is equal to $W_{ss}$, which can be on the order of several GHz, i.e., an order of magnitude larger than that of implementable DS bandwidths. The exact relation between $W_{ss}$, $T_h$, $T_s$ and the number of frequency shifts in the set $\{f_n\}$ will be discussed shortly.

At the receiver, the sum of the transmitted FH/MFSK signal and the channel interference $I(t)$ is ideally complex multiplied by the identical FH modulation (this operation is known as dehopping), which returns the FH/MFSK signal to its original MFSK form, whereas the real interference signal is now the wideband (in the average sense) signal $\mathrm{Re}\{I(t)c(t)\}$. Analogous to the DS case, the receiver must acquire and track the FH signal so that the dehopping waveform is as close to the hopping waveform $c(t)$ as possible.

FH systems are traditionally classified in accordance with the relationship between $T_h$ and $T_s$. Fast frequency-hopped (FFH) systems are ones in which there are one or more hops per data symbol, that is, $T_s = NT_h$ ($N$ an integer), whereas slow frequency-hopped (SFH) systems are ones in which there is more than one symbol per hop, that is, $T_h = NT_s$. It is customary in SS parlance to refer to the FH/MFSK tone of shortest duration as a "chip", despite the same usage for the PN chips associated with the code generator that drives the frequency synthesizer. Keeping this distinction in mind, in an FFH system where, as already stated, there are multiple hops per data symbol, a chip is equal to a hop. For SFH, where there are multiple data symbols per hop, a chip is equal to an MFSK symbol.
Combining these two statements, the chip rate $R_c$ in an FH system is given by the larger of $R_h = 1/T_h$ and $R_s = 1/T_s$ and, as such, is the highest system clock rate. The frequency spacing between the FH/MFSK tones is governed by the chip rate $R_c$ and is, thus, dependent on whether the FH modulation is FFH or SFH. In particular, for SFH where $R_c = R_s$, the spacing between FH/MFSK tones is equal to the spacing between the MFSK tones themselves. For noncoherent detection (the most commonly encountered in FH/MFSK systems), the separation of the MFSK symbols necessary to provide orthogonality (see footnote 2) is an integer multiple of $R_s$. Assuming the minimum spacing, i.e., $R_s$, the entire spread spectrum band is then partitioned into a total of $N_t = W_{ss}/R_s = W_{ss}/R_c$ equally spaced FH tones. One arrangement, which is by far the most common, is to group these $N_t$ tones into $N_b = N_t/M$ contiguous, nonoverlapping bands, each with bandwidth $MR_s = MR_c$; see Fig. 11.3a. Assuming symmetric MFSK modulation around the carrier frequency, then the center frequencies of the $N_b = 2^k$ bands represent the set of hop carriers, each of which is assigned to a given $k$-tuple of the PN code generator. In this fixed arrangement, each of the $N_t$ FH/MFSK tones corresponds to the combination of a unique hop carrier (PN code $k$-tuple) and a unique MFSK symbol. Another arrangement, which provides more protection against the sophisticated interferer (jammer), is to overlap adjacent $M$-ary bands by an amount equal to $R_c$; see Fig. 11.3b. Assuming again that the center frequency of each band corresponds to a possible hop carrier, then since all but $M - 1$ of the $N_t$ tones are available as center frequencies, the number of hop carriers has been increased from $N_t/M$ to $N_t - (M - 1)$, which for $N_t \gg M$ is approximately an increase in randomness by a factor of $M$.

Figure 11.3a: Frequency distribution for FH-4FSK, nonoverlapping bands. Dashed lines indicate location of hop frequencies.

Figure 11.3b: Frequency distribution for FH-4FSK, overlapping bands.

For FFH, where $R_c = R_h$, the spacing between FH/MFSK tones is equal to the hop rate. Thus, the entire spread spectrum band is partitioned into a total of $N_t = W_{ss}/R_h = W_{ss}/R_c$ equally spaced FH tones, each of which is assigned to a unique $k$-tuple of the PN code generator that drives the frequency synthesizer. Since for FFH there are $R_h/R_s$ hops per symbol, the metric used to make a noncoherent decision on a particular symbol is obtained by summing up $R_h/R_s$ detected chip (hop) energies, resulting in a so-called noncoherent combining loss.

Footnote 2: An optimum noncoherent MFSK detector consists of a bank of energy detectors, each matched to one of the $M$ frequencies in the MFSK set. In terms of this structure, the notion of orthogonality implies that for a given transmitted frequency there will be no crosstalk (energy spillover) in any of the other $M - 1$ energy detectors.
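The bookkeeping in this subsection (chip rate, number of tones, and number of hop carriers for each band arrangement) can be collected into a small helper. The sketch below assumes minimum orthogonal tone spacing, integer-valued rates in Hz that divide evenly, and example numbers that are not taken from the text; `fh_tone_plan` is a hypothetical name.

```python
def fh_tone_plan(Wss, Rs, Rh, M):
    """Tone and hop-carrier counts for FH/MFSK with minimum orthogonal spacing.

    Wss: spread bandwidth (Hz), Rs: symbol rate, Rh: hop rate, M: MFSK alphabet size.
    """
    Rc = max(Rs, Rh)               # chip rate: the larger of the hop and symbol rates
    Nt = Wss // Rc                 # equally spaced FH tones across the band
    plan = {"mode": "FFH" if Rh >= Rs else "SFH", "Rc": Rc, "Nt": Nt}
    if plan["mode"] == "SFH":
        plan["carriers_nonoverlapping"] = Nt // M       # arrangement of Fig. 11.3a
        plan["carriers_overlapping"] = Nt - (M - 1)     # arrangement of Fig. 11.3b
    else:
        plan["hops_per_symbol"] = Rh // Rs              # chip energies summed per decision
    return plan


# Assumed example: 10 MHz band, 4-FSK at 10 ksymbols/s.
print(fh_tone_plan(Wss=10_000_000, Rs=10_000, Rh=1_000, M=4))    # slow hopping
print(fh_tone_plan(Wss=10_000_000, Rs=10_000, Rh=40_000, M=4))   # fast hopping
```

For the slow-hopping case the overlapping arrangement yields nearly $M$ times as many hop carriers as the nonoverlapping one, which is the increase in randomness noted above.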

11.4.3 Time Hopping Modulation

Time hopping (TH) is to spread spectrum modulation what pulse position modulation (PPM) is to information modulation. In particular, consider segmenting time into intervals of $T_f$ seconds and further segmenting each $T_f$ interval into $M_T$ increments of width $T_f/M_T$. Assuming a pulse of maximum duration equal to $T_f/M_T$, then a time hopping spread spectrum modulation would take the form

$$c(t) = \sum_{n=-\infty}^{\infty} p\!\left(t - \left(n + \frac{a_n}{M_T}\right)T_f\right) \tag{11.7}$$

where $a_n$ denotes the pseudorandom position (one of $M_T$ uniformly spaced locations) of the pulse within the $T_f$-second interval. For DS and FH, we saw that multiplicative modulation, that is, the transmitted signal is the product of the SS and information signals, was the natural choice. For TH, delay modulation is the natural choice. In particular, a TH-SS modulation takes the form

$$x(t) = \mathrm{Re}\{c(t - d(t))\exp[j(2\pi f_c t + \phi_T)]\} \tag{11.8}$$

where $d(t)$ is a digital information modulation at a rate $1/T_s$ sps. Finally, the dehopping procedure at the receiver consists of removing the sequence of delays introduced by $c(t)$, which restores the information signal back to its original form and spreads the interferer.
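To make the pulse timing of Eq. (11.7) concrete, the sketch below lists the start time of the pulse in each frame. It assumes that $a_n$ takes integer values from 0 to $M_T - 1$ and uses Python's seeded `random` module as a stand-in for the PN generator; both choices, and the name `th_pulse_starts`, are illustrative assumptions.

```python
import random


def th_pulse_starts(num_frames, Tf, MT, seed=0):
    """Start time of the pulse in each frame, (n + a_n / MT) * Tf, as in Eq. (11.7)."""
    rng = random.Random(seed)      # stand-in for the PN generator (assumption)
    positions = [rng.randrange(MT) for _ in range(num_frames)]   # a_n in 0..MT-1
    return [(n + a / MT) * Tf for n, a in enumerate(positions)]


# Eight frames of 1 ms, each divided into 16 slots (assumed values).
for n, start in enumerate(th_pulse_starts(8, Tf=1e-3, MT=16)):
    print(f"frame {n}: pulse starts at {start * 1e3:.4f} ms")
```

Each pulse stays inside its own frame and occupies one of the $M_T$ slots, so a receiver that knows the sequence $\{a_n\}$ can remove the pseudorandom delays.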

11.4.4 Hybrid Modulations

By blending together several of the previous types of SS modulation, one can form hybrid modulations that, depending on the system design objectives, can achieve a better performance against the interferer than can any of the SS modulations acting alone. One possibility is to multiply several of the $c(t)$ wideband waveforms [now denoted by $c^{(i)}(t)$ to distinguish them from one another], resulting in an SS modulation of the form

$$c(t) = \prod_i c^{(i)}(t) \tag{11.9}$$

Such a modulation may embrace the advantages of the various $c^{(i)}(t)$, while at the same time mitigating their individual disadvantages.

11.5 Applications of Spread Spectrum

11.5.1 Military

Antijam (AJ) Communications

As already noted, one of the key applications of spread spectrum is for antijam communications in a hostile environment. The basic mechanism by which a direct sequence spread spectrum receiver attenuates a noise jammer was illustrated in Section 11.3. Therefore, in this section, we will concentrate on tone jamming. Assume the received signal, denoted $r(t)$, is given by

$$r(t) = A\,x(t) + I(t) + n_w(t) \tag{11.10}$$

where $x(t)$ is given in Eq. (11.4), $A$ is a constant amplitude,

$$I(t) = \alpha\cos(2\pi f_c t + \theta) \tag{11.11}$$

and $n_w(t)$ is additive white Gaussian noise (AWGN) having two-sided spectral density $N_0/2$. In Eq. (11.11), $\alpha$ is the amplitude of the tone jammer and $\theta$ is a random phase uniformly distributed in $[0, 2\pi]$. If we employ the standard correlation receiver of Fig. 11.4, it is straightforward to show that the final test statistic out of the receiver is given by

$$g(T_b) = AT_b + \alpha\cos\theta \int_0^{T_b} c(t)\,dt + N(T_b) \tag{11.12}$$

where $N(T_b)$ is the contribution to the test statistic due to the AWGN. Noting that, for rectangular chips, we can express

$$\int_0^{T_b} c(t)\,dt = T_c \sum_{i=1}^{M} c_i \tag{11.13}$$

where

$$M \triangleq \frac{T_b}{T_c} \tag{11.14}$$

is one-half of the processing gain, it is straightforward to show that, for a given value of $\theta$, the signal-to-noise-plus-interference ratio, denoted by $S/N_{total}$, is given by

$$\frac{S}{N_{total}} = \frac{1}{\dfrac{N_0}{2E_b} + \dfrac{J}{MS}\cos^2\theta} \tag{11.15}$$

FIGURE 11.4: Standard correlation receiver.

In Eq. (11.15), the jammer power is

$$J \triangleq \frac{\alpha^2}{2} \tag{11.16}$$

and the signal power is

$$S \triangleq \frac{A^2}{2} \tag{11.17}$$

If we look at the second term in the denominator of Eq. (11.15), we see that the ratio $J/S$ is divided by $M$. Realizing that $J/S$ is the ratio of the jammer power to the signal power before despreading, and $J/(MS)$ is the ratio of the same quantity after despreading, we see that, as was the case for noise jamming, the benefit of employing direct sequence spread spectrum signalling in the presence of tone jamming is to reduce the effect of the jammer by an amount on the order of the processing gain. Finally, one can show that an estimate of the average probability of error of a system of this type is given by

$$P_e = \frac{1}{2\pi}\int_0^{2\pi} \phi\!\left(-\sqrt{\frac{S}{N_{total}}}\right) d\theta \tag{11.18}$$

where

$$\phi(x) \triangleq \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-y^2/2}\,dy \tag{11.19}$$

If Eq. (11.18) is evaluated numerically and plotted, the results are as shown in Fig. 11.5. It is clear from this figure that a large initial power advantage of the jammer can be overcome by a sufficiently large value of the processing gain.

FIGURE 11.5: Plotted results of Eq. (11.18).
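A numerical evaluation of Eq. (11.18) along the lines suggested above might look like the following sketch. It averages Eq. (11.15) over the jammer phase with a simple midpoint rule and expresses $\phi(-x)$ through the complementary error function. The operating point ($E_b/N_0$ of 10 dB, $J/S$ of 20 dB), the step count, and the function names are assumptions for illustration, and the printed values are not claimed to reproduce the curves of Fig. 11.5.

```python
import math


def q_function(x):
    """Gaussian tail probability, equal to phi(-x) in the notation of Eq. (11.19)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def tone_jam_pe(ebn0_db, js_db, M, steps=2000):
    """Average error probability of Eq. (11.18) for a tone jammer.

    ebn0_db: Eb/N0 in dB, js_db: J/S in dB, M: Tb/Tc of Eq. (11.14).
    """
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    js = 10.0 ** (js_db / 10.0)
    total = 0.0
    for i in range(steps):
        theta = 2.0 * math.pi * (i + 0.5) / steps      # midpoint of each phase cell
        snr = 1.0 / (1.0 / (2.0 * ebn0) + (js / M) * math.cos(theta) ** 2)  # Eq. (11.15)
        total += q_function(math.sqrt(snr))
    return total / steps


# Jammer 20 dB above the signal; increase the spreading and watch Pe fall.
for M in (10, 100, 1000):
    print(f"M = {M:5d}: Pe = {tone_jam_pe(10.0, 20.0, M):.3e}")
```

As $M$ grows, the jammer term in Eq. (11.15) shrinks and the result approaches the jammer-free BPSK error probability $\phi(-\sqrt{2E_b/N_0})$.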

Low-Probability of Intercept (LPI)

The opposite side of the AJ problem is that of LPI, that is, the desire to hide your signal from detection by an intelligent adversary so that your transmissions will remain unnoticed and, thus, neither jammed nor exploited in any manner. An LPI system can be designed in a variety of ways, including transmitting at the smallest possible power level and limiting the transmission time to as short an interval as possible. The choice of signal design is also important, however, and it is here that spread spectrum techniques become relevant. The basic mechanism is reasonably straightforward; if we start with a conventional narrowband signal, say a BPSK waveform having a spectrum as shown in Fig. 11.6a, and then spread it so that its new spectrum is as shown in Fig. 11.6b, the peak amplitude of the spectrum after spreading has been reduced by an amount on the order of the processing gain relative to what it was before spreading. Indeed, a sufficiently large processing gain will result in the spectrum of the signal after spreading falling below the ambient thermal noise level. Thus, there is no easy way for an unintended listener to determine that a transmission is taking place.

Figure 11.6a

Figure 11.6b

That is not to say the spread signal cannot be detected, however, merely that it is more difficult for an adversary to learn of the transmission. Indeed, there are many forms of so-called intercept receivers that are specifically designed to accomplish this very task. By way of example, probably the best known and simplest to implement is a radiometer, which is just a device that measures the total power present in the received signal. In the case of our intercept problem, even though we have lowered the power spectral density of the transmitted signal so that it falls below the noise floor, we have not lowered its power (i.e., we have merely spread its power over a wider frequency range). Thus, if the radiometer integrates over a sufficiently long period of time, it will eventually determine the presence of the transmitted signal buried in the noise. The key point, of course, is that the use of the spreading makes the interceptor’s task much more difficult, since he has no knowledge of the spreading code and, thus, cannot despread the signal.
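How long a radiometer must integrate can be estimated with a common Gaussian approximation for an energy detector. This model is an assumption added here for illustration and is not derived in this chapter: with $N$ independent real noise samples of variance $\sigma^2$, the summed energy has a noise-only standard deviation of $\sigma^2\sqrt{2N}$, while a signal of power $P$ raises its mean by $NP$, so the normalized separation (deflection) is $(P/\sigma^2)\sqrt{N/2}$. The function names and the target deflection of 3 are hypothetical choices.

```python
import math


def radiometer_deflection(snr, num_samples):
    """Mean shift of the energy statistic in units of its noise-only standard deviation.

    snr is the in-band signal-to-noise power ratio P / sigma^2 (linear, not dB).
    """
    return snr * math.sqrt(num_samples / 2.0)


def samples_needed(snr, target_deflection=3.0):
    """Smallest sample count (to within rounding) reaching the target deflection."""
    return math.ceil(2.0 * (target_deflection / snr) ** 2)


for snr_db in (-10, -20, -30):
    snr = 10.0 ** (snr_db / 10.0)
    print(f"in-band SNR {snr_db} dB -> about {samples_needed(snr)} samples")
```

Under this model every additional 10 dB by which spreading pushes the signal below the noise floor costs the interceptor roughly 100 times more integration time, which illustrates why the interceptor's task becomes much more difficult without becoming impossible.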

11.5.2 Commercial

Multiple Access Communications

From the perspective of commercial applications, probably the most important use of spread spectrum communications is as a multiple accessing technique. When used in this manner, it becomes an alternative to either frequency division multiple access (FDMA) or time division multiple access (TDMA) and is typically referred to as either code division multiple access (CDMA) or spread spectrum multiple access (SSMA). When using CDMA, each signal in the set is given its own spreading sequence. In FDMA, all users occupy disjoint frequency bands but transmit simultaneously in time; in TDMA, all users occupy the same bandwidth but transmit in disjoint intervals of time. In CDMA, by contrast, all signals occupy the same bandwidth and are transmitted simultaneously in time; the different waveforms are distinguished from one another at the receiver by the specific spreading codes they employ.

Since most CDMA detectors are correlation receivers, it is important when deploying such a system to have a set of spreading sequences with relatively low pairwise cross-correlation between any two sequences in the set. Further, there are two fundamental types of operation in CDMA, synchronous and asynchronous. In the former case, the symbol transition times of all of the users are aligned; this allows orthogonal sequences to be used as the spreading sequences and, thus, eliminates interference from one user to another. Alternatively, if no effort is made to align the sequences, the system operates asynchronously; in this latter mode, multiple access interference limits the ultimate channel capacity, but the system design exhibits much more flexibility.

CDMA has been of particular interest recently for applications in wireless communications. These applications include cellular communications, personal communications services (PCS), and wireless local area networks. This popularity is due primarily to the performance that spread spectrum waveforms display when transmitted over a multipath fading channel. To illustrate this idea, consider DS signalling. As long as the duration of a single chip of the spreading sequence is less than the multipath delay spread, the use of DS waveforms provides the system designer with one of two options. First, the multipath can be treated as a form of interference, which means the receiver should attempt to attenuate it as much as possible. Indeed, under this condition, all of the multipath returns that arrive at the receiver with a time delay greater than a chip duration from the multipath return to which the receiver is synchronized (usually the first return) will be attenuated because of the processing gain of the system. Alternatively, the multipath returns that are separated by more than a chip duration from the main path represent independent “looks” at the received signal and can be used constructively to enhance the overall performance of the receiver. That is, because all of the multipath returns contain information regarding the data that is being sent, that information can be extracted by an appropriately designed receiver. Such a receiver, typically referred to as a RAKE receiver, attempts to resolve as many individual multipath returns as possible and then to sum them coherently. This results in an implicit diversity gain, comparable to the use of explicit diversity, such as receiving the signal with multiple antennas.
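The attenuation of multipath returns delayed by more than a chip can be seen in the autocorrelation of the spreading code itself. The following is a minimal sketch, assuming a length-31 maximal-length sequence generated by a 5-stage shift register; the polynomial, the code length, and the function names are illustrative choices, and the idealized periodic correlation with whole-chip delays ignores data modulation and fractional-chip offsets.

```python
def m_sequence(n_chips=31):
    """One period of a maximal-length sequence from a 5-stage LFSR.

    The recurrence a[n+5] = a[n+2] XOR a[n] (polynomial x^5 + x^2 + 1)
    is an illustrative choice of primitive polynomial.
    """
    reg = [1, 0, 0, 0, 0]              # any nonzero initial state works
    chips = []
    for _ in range(n_chips):
        chips.append(1 - 2 * reg[-1])  # map bit 0/1 to chip +1/-1
        feedback = reg[4] ^ reg[2]
        reg = [feedback] + reg[:-1]
    return chips

def periodic_autocorrelation(chips, lag):
    """Correlate the code with a cyclically delayed copy of itself."""
    n = len(chips)
    return sum(chips[i] * chips[(i + lag) % n] for i in range(n))

code = m_sequence()
for lag in range(4):
    print(lag, periodic_autocorrelation(code, lag))
```

The on-time return correlates to the full code length, while a return delayed by one or more whole chips correlates to a much smaller value. A RAKE receiver, in this picture, places a separate correlator at each resolvable delay and sums the outputs coherently.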
The condition under which the two options are available can be stated in an alternate manner. If one envisions what is taking place in the frequency domain, it is straightforward to show that the condition of the chip duration being smaller than the multipath delay spread is equivalent to requiring that the spread bandwidth of the transmitted waveform exceed what is called the coherence bandwidth of the channel. This latter quantity is simply the inverse of the multipath delay spread and is a measure of the range of frequencies that fade in a highly correlated manner. Indeed, anytime the coherence bandwidth of the channel is less than the spread bandwidth of the signal, the channel is said to be frequency selective with respect to the signal. Thus, we see that to take advantage of DS signalling when used over a multipath fading channel, that signal should be designed such that it makes the channel appear frequency selective. In addition to the desirable properties that spread spectrum signals display over multipath channels, there are two other reasons why such signals are of interest in cellular-type applications. The first has to do with a concept known as the reuse factor. In conventional cellular systems, either analog or digital, in order to avoid excessive interference from one cell to its neighbor cells, the frequencies used by a given cell are not used by its immediate neighbors (i.e., the system is designed so that there is a certain spatial separation between cells that use the same carrier frequencies). For CDMA, however, such spatial isolation is typically not needed, so that so-called universal reuse is possible. Further, because CDMA systems tend to be interference limited, for those applications involving voice transmission, an additional gain in the capacity of the system can be achieved by the use of voice activity detection. That is, in any given two-way telephone conversation, each user is typically talking only about 50% of the time. 
During the time when a user is quiet, he is not contributing to the instantaneous interference. Thus, if a sufficiently large number of users can be supported by the system, statistically only about one-half of them will be active simultaneously, and the effective capacity can be doubled.
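The requirement for a "sufficiently large number of users" can be made concrete with a simple binomial model. The sketch below assumes each user talks independently with probability 0.5 and asks how often more than 60% of the users are active at once; the independence assumption, the 60% threshold, and the function name are illustrative and not part of the text.

```python
import math

def prob_active_exceeds(n_users, threshold_fraction, p_talk=0.5):
    """P(number of simultaneously talking users > threshold_fraction * n_users),
    assuming each user talks independently with probability p_talk."""
    limit = math.floor(threshold_fraction * n_users)
    return sum(
        math.comb(n_users, k) * p_talk**k * (1 - p_talk)**(n_users - k)
        for k in range(limit + 1, n_users + 1)
    )

for n in (10, 50, 200):
    print(n, prob_active_exceeds(n, 0.6))
```

As the user population grows, the fraction of active users concentrates near one-half, so the chance of an interference peak well above the average becomes small. With only a few users the averaging is weak and much less of the voice activity gain can be counted on.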


Interference Rejection

In addition to providing multiple accessing capability, spread spectrum techniques are of interest in the commercial sector for basically the same reasons they are in the military community, namely their AJ and LPI characteristics. However, the motivations for such interest differ. For example, whereas the military is interested in ensuring that systems they deploy are robust to interference generated by an intelligent adversary (i.e., exhibit jamming resistance), the interference of concern in commercial applications is unintentional. It is sometimes referred to as cochannel interference (CCI) and arises naturally as the result of many services using the same frequency band at the same time. And while such scenarios almost always allow for some type of spatial isolation between the interfering waveforms, such as the use of narrow-beam antenna patterns, at times the use of the inherent interference suppression property of a spread spectrum signal is also desired. Similarly, whereas the military is very much interested in the LPI property of a spread spectrum waveform, as indicated in Section 11.3, there are applications in the commercial segment where the same characteristic can be used to advantage. To illustrate these two ideas, consider a scenario whereby a given band of frequencies is somewhat sparsely occupied by a set of conventional (i.e., nonspread) signals. To increase the overall spectral efficiency of the band, a set of spread spectrum waveforms can be overlaid on the same frequency band, thus forcing the two sets of users to share common spectrum. Clearly, this scheme is feasible only if the mutual interference that one set of users imposes on the other is within tolerable limits. Because of the interference suppression properties of spread spectrum waveforms, the despreading process at each spread spectrum receiver will attenuate the components of the final test statistic due to the overlaid narrowband signals. 
Similarly, because of the LPI characteristics of spread spectrum waveforms, the increase in the overall noise level as seen by any of the conventional signals, due to the overlay, can be kept relatively small.

Defining Terms

Antijam communication system: A communication system designed to resist intentional jamming by the enemy.
Chip time (interval): The duration of a single pulse in a direct sequence modulation; typically much smaller than the information symbol interval.
Coarse alignment: The process whereby the received signal and the despreading signal are aligned to within a single chip interval.
Dehopping: Despreading using a frequency-hopping modulation.
Delay-locked loop: A particular implementation of a closed-loop technique for maintaining fine alignment.
Despreading: The notion of decreasing the bandwidth of the received (spread) signal back to its information bandwidth.
Direct sequence modulation: A signal formed by linearly modulating the output sequence of a pseudorandom number generator onto a train of pulses.
Direct sequence spread spectrum: A spreading technique achieved by multiplying the information signal by a direct sequence modulation.
Fast frequency-hopping: A spread spectrum technique wherein the hop time is less than or equal to the information symbol interval, i.e., there exist one or more hops per data symbol.
Fine alignment: The state of the system wherein the received signal and the despreading signal are aligned to within a small fraction of a single chip interval.
Frequency-hopping modulation: A signal formed by nonlinearly modulating a train of pulses with a sequence of pseudorandomly generated frequency shifts.
Hop time (interval): The duration of a single pulse in a frequency-hopping modulation.
Hybrid spread spectrum: A spreading technique formed by blending together several spread spectrum techniques, e.g., direct sequence, frequency-hopping, etc.
Low-probability-of-intercept communication system: A communication system designed to operate in a hostile environment wherein the enemy tries to detect the presence and perhaps characteristics of the friendly communicator’s transmission.
Processing gain (spreading ratio): The ratio of the spread spectrum bandwidth to the information data rate.
Radiometer: A device used to measure the total energy in the received signal.
Search algorithm: A means for coarse aligning (synchronizing) the despreading signal with the received spread spectrum signal.
Slow frequency-hopping: A spread spectrum technique wherein the hop time is greater than the information symbol interval, i.e., there exists more than one data symbol per hop.
Spread spectrum bandwidth: The bandwidth of the transmitted signal after spreading.
Spreading: The notion of increasing the bandwidth of the transmitted signal by a factor far in excess of its information bandwidth.
Tau–dither loop: A particular implementation of a closed-loop technique for maintaining fine alignment.
Time-hopping spread spectrum: A spreading technique that is analogous to pulse position modulation.
Tracking algorithm: An algorithm (typically closed loop) for maintaining fine alignment.

References

[1] Cook, C.F., Ellersick, F.W., Milstein, L.B., and Schilling, D.L., Spread Spectrum Communications, IEEE Press, 1983.
[2] Dixon, R.C., Spread Spectrum Systems, 3rd ed., John Wiley and Sons, Inc., 1994.
[3] Holmes, J.K., Coherent Spread Spectrum Systems, John Wiley and Sons, Inc., 1982.
[4] Simon, M.K., Omura, J.K., Scholtz, R.A., and Levitt, B.K., Spread Spectrum Communications Handbook, McGraw Hill, 1994 (previously published as Spread Spectrum Communications, Computer Science Press, 1985).
[5] Ziemer, R.E. and Peterson, R.L., Digital Communications and Spread Spectrum Techniques, Macmillan, 1985.


Paulraj, A.J. “Diversity” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Diversity

Arogyaswami J. Paulraj
Stanford University

12.1 Introduction
12.2 Diversity Schemes
Space Diversity • Polarization Diversity • Angle Diversity • Frequency Diversity • Path Diversity • Time Diversity • Transformed Diversity
12.3 Diversity Combining Techniques
Selection Combining • Maximal Ratio Combining • Equal Gain Combining • Loss of Diversity Gain Due to Branch Correlation and Unequal Branch Powers
12.4 Effect of Diversity Combining on Bit Error Rate
12.5 Concluding Remarks
Defining Terms
References

12.1 Introduction

Diversity is a commonly used technique in mobile radio systems to combat signal fading. The basic principle of diversity is as follows. If several replicas of the same information-carrying signal are received over multiple channels with comparable strengths, which exhibit independent fading, then there is a good likelihood that at least one of these received signals will not be in a fade at any given instant in time, thus making it possible to deliver adequate signal level to the receiver. Without diversity techniques, in noise limited conditions, the transmitter would have to deliver a much higher power level to protect the link during the short intervals when the channel is severely faded. In mobile radio, the power available on the reverse link is severely limited by the battery capacity of hand-held subscriber units. Diversity methods play a crucial role in reducing transmit power needs. Also, cellular communication networks are mostly interference limited and, once again, mitigation of channel fading through use of diversity can translate into reduced variability of carrier-to-interference ratio (C/I), which in turn means lower C/I margin and hence better reuse factors and higher system capacity.

The basic principles of diversity have been known since 1927 when the first experiments in space diversity were reported. There are many techniques for obtaining independently fading branches, and these can be subdivided into two main classes. The first are explicit techniques where explicit redundant signal transmission is used to exploit diversity channels. Use of dual polarized signal transmission and reception in many point-to-point radios is an example of explicit diversity. Clearly such redundant signal transmission involves a penalty in frequency spectrum or additional power. In the second class are implicit diversity techniques: the signal is transmitted only once, but the decorrelating effects in the propagation medium such as multipaths are exploited to receive signals over multiple diversity channels. A good example of implicit diversity is the RAKE receiver in code division multiple access (CDMA) systems, which uses independent fading of resolvable multipaths to achieve diversity gain.

Figure 12.1 illustrates the principle of diversity where two independently fading signals are shown along with the selection diversity output signal which selects the stronger signal. The fades in the resulting signal have been substantially smoothed out while also yielding higher average power.

FIGURE 12.1: Example of diversity combining. Two independently fading signals 1 and 2. The signal 3 is the result of selecting the strongest signal.
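The behaviour shown in the figure can be reproduced qualitatively with a small Monte Carlo experiment. The sketch below assumes two independent Rayleigh-fading branches with unit mean power (so that the instantaneous power in each branch is exponentially distributed) and draws independent samples rather than a time-correlated fading record; the sample count, seed, and the -10 dB fade threshold are arbitrary illustrative choices.

```python
import random

def simulate_selection(n_samples=100_000, seed=1):
    """Draw instantaneous powers for two branches and the selected output."""
    rng = random.Random(seed)
    single_power = []
    selected_power = []
    for _ in range(n_samples):
        # Rayleigh envelope <=> exponentially distributed instantaneous power.
        p1 = rng.expovariate(1.0)
        p2 = rng.expovariate(1.0)
        single_power.append(p1)
        selected_power.append(max(p1, p2))   # pick the stronger branch
    return single_power, selected_power

def fraction_below(powers, threshold):
    """Fraction of samples whose power is under the given threshold."""
    return sum(threshold > p for p in powers) / len(powers)

single, selected = simulate_selection()
print("mean power, single branch  :", sum(single) / len(single))
print("mean power, selected       :", sum(selected) / len(selected))
print("time below -10 dB, single  :", fraction_below(single, 0.1))
print("time below -10 dB, selected:", fraction_below(selected, 0.1))
```

Under these assumptions the selected signal spends far less time in deep fades than either branch alone, and its average power is also higher, which is the qualitative behaviour described for signal 3 in the figure.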

If multiple antennas are used at the transmitter, they too can be exploited for diversity. If the transmit channel is known, the antennas can be driven with complex conjugate channel weighting to co-phase the signals at the receive antenna. If the forward channel is not known, several methods are available to convert space selective fading at the transmit antennas to other forms of diversity exploitable in the receiver. Exploiting diversity needs careful design of the communication link. In explicit diversity, multiple copies of the same signal are transmitted in channels using either a frequency, time, or polarization dimension. At the receiver end we need arrangements to receive the different diversity branches (this is true for both explicit and implicit diversity). The different diversity branches are then combined to reduce signal outage probability or bit error rate. In practice, the signals in the diversity branches may not show completely independent fading.


The envelope cross correlation ρ between these signals is a measure of their independence:

\[
\rho = \frac{E\left[(r_1 - \bar{r}_1)(r_2 - \bar{r}_2)\right]}{\sqrt{E\left|r_1 - \bar{r}_1\right|^2 \, E\left|r_2 - \bar{r}_2\right|^2}}
\]

where r1 and r2 represent the instantaneous envelope levels of the normalized signals at the two receivers and r̄1 and r̄2 are their respective means. It has been shown that a cross correlation as high as 0.7 [3] between signal envelopes still provides a reasonable degree of diversity gain. Depending on the type of diversity employed, these diversity channels must be sufficiently separated along the appropriate diversity dimension. For spatial diversity, the antennas should be separated by more than the coherence distance to ensure a cross correlation of less than 0.7. Likewise in frequency diversity, the frequency separation must be larger than the coherence bandwidth, and in time diversity the separation between channel reuse in time should be longer than the coherence time. These coherence factors in turn depend on the channel characteristics. The coherence distance, coherence bandwidth, and coherence time vary inversely as the angle spread, delay spread, and Doppler spread, respectively.

If the receiver has a number of diversity branches, it has to combine these branches to maximize the signal level. Several techniques have been studied for diversity combining. We will describe three main techniques: selection combining, equal gain combining, and maximal ratio combining.

Finally, we should note that diversity is primarily used to combat fading, and if the signal does not show significant fading in the first place, for example when there is a direct path component, diversity combining may not provide significant diversity gain. In the case of antenna diversity, array gain proportional to the number of antennas will still be available.
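In practice ρ is estimated from measured envelope records by replacing the expectations with sample averages. The following is a minimal sketch of that estimate; the function name and the short sample records are made up for illustration and do not represent measured data.

```python
import math

def envelope_cross_correlation(r1, r2):
    """Sample estimate of rho for two equal-length envelope records."""
    n = len(r1)
    m1 = sum(r1) / n
    m2 = sum(r2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(r1, r2)) / n
    var1 = sum((a - m1) ** 2 for a in r1) / n
    var2 = sum((b - m2) ** 2 for b in r2) / n
    return cov / math.sqrt(var1 * var2)

# Hypothetical envelope samples from two diversity branches.
branch_a = [0.9, 1.4, 0.2, 1.1, 0.7, 1.6]
branch_b = [1.0, 1.2, 0.5, 0.6, 1.3, 1.5]
print(envelope_cross_correlation(branch_a, branch_b))
```

A value close to 1 indicates that the two branches fade together and offer little diversity, while a value at or below the 0.7 guideline suggests that the branches are usefully decorrelated.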

12.2 Diversity Schemes

There are several techniques for obtaining diversity branches, sometimes also known as diversity dimensions. The most important of these are discussed in the following sections.

12.2.1 Space Diversity

This has historically been the most common form of diversity in mobile radio base stations. It is easy to implement and does not require additional frequency spectrum resources. Space diversity is exploited on the reverse link at the base station receiver by spacing antennas apart so as to obtain sufficient decorrelation. The key to obtaining sufficiently uncorrelated fading at the antenna outputs is adequate antenna spacing. The required spacing depends on the degree of multipath angle spread. For example, if the multipath signals arrive from all directions in the azimuth, as is usually the case at the mobile, antenna spacing (coherence distance) of the order of 0.5λ to 0.8λ is quite adequate [5]. On the other hand, if the multipath angle spread is small, as in the case of base stations, the coherence distance is much larger. Empirical measurements also show a strong coupling between antenna height and spatial correlation: larger antenna heights imply larger coherence distances. Typically 10λ to 20λ separation is adequate to achieve ρ = 0.7 at base stations in suburban settings when the signals arrive from the broadside direction. The coherence distance can be 3 to 4 times larger for endfire arrivals. The endfire problem is averted in base stations with trisectored antennas, as each sector needs to handle only signals arriving within ±60° of broadside. The coherence distance depends strongly on the terrain. Large multipath angle spread means smaller coherence distance. Base stations normally use space diversity in the horizontal plane only. Separation in the vertical plane can also be used, and the necessary spacing depends upon the vertical multipath angle spread. This can be small for distant mobiles, making vertical plane diversity less attractive in most applications.
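To give a feel for what these spacings mean physically, the sketch below converts separations expressed in wavelengths into metres. The 900 MHz carrier frequency is an assumed example value, and the function name is illustrative.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def spacing_m(carrier_hz, wavelengths):
    """Antenna separation in metres for a spacing given in wavelengths."""
    return wavelengths * SPEED_OF_LIGHT / carrier_hz

carrier = 900e6  # assumed carrier frequency, Hz
cases = [
    ("mobile, 0.5 lambda", 0.5),
    ("mobile, 0.8 lambda", 0.8),
    ("base station, 10 lambda", 10),
    ("base station, 20 lambda", 20),
]
for label, n_lambda in cases:
    print(f"{label:25s} {spacing_m(carrier, n_lambda):6.2f} m")
```

At this assumed frequency the mobile-side spacing is a fraction of a metre, whereas the base station figures correspond to several metres of horizontal separation, which is one reason compact alternatives such as polarization diversity are of interest.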


Space diversity is also exploitable at the transmitter. If the forward channel is known, it works much like receive space diversity. If it is not known, then space diversity can be transformed to another form of diversity exploitable at the receiver (see Section 12.2.7). If multiple antennas are used at both transmit and receive, the M transmit and N receive antennas both contribute to diversity. It can be shown that if simple weighting is used without additional bandwidth or time/memory processing, then maximum diversity gain is obtained if the transmitter and receiver use the left and right singular vectors of the M × N channel matrix, respectively. However, approaching the maximum diversity order of M × N will require the use of additional bandwidth or time/memory-based methods.

12.2.2 Polarization Diversity

In mobile radio environments, signals transmitted on orthogonal polarizations exhibit low fade correlation and, therefore, offer potential for diversity combining. Polarization diversity can be obtained by either explicit or implicit techniques. Note that with polarization only two diversity branches are available, whereas with space diversity several branches can be obtained using multiple antennas. In explicit polarization diversity, the signal is transmitted and received in two orthogonal polarizations. For a fixed total transmit power, the power in each branch will be 3 dB lower than if a single polarization is used. In the implicit polarization technique, the signal is launched in a single polarization, but is received with cross-polarized antennas. The propagation medium couples some energy into the cross-polarization plane. The observed cross-polarization coupling factor lies between 8 and 12 dB in mobile radio [8, 1]. The cross-polarization envelope decorrelation has been found to be adequate. However, the large branch imbalance reduces the available diversity gain.

With hand-held phones, the handset can be held at random orientations during a call. This results in energy being launched with varying polarization angles ranging from vertical to horizontal. This further increases the advantage of cross-polarized antennas at the base station, since the two antennas can be combined to match the received signal polarization, and makes polarization diversity even more attractive. Recent work [4] has shown that with variable launch polarization, a cross-polarized antenna can give comparable overall (matching plus diversity) performance to a vertically polarized space diversity antenna. Finally, we should note that cross-polarized antennas can be deployed in a compact antenna assembly and do not need the large physical separation required by space diversity antennas. This is an important advantage in PCS base stations, where low profile antennas are needed.

12.2.3 Angle Diversity

In situations where the angle spread is very high, such as indoors or at the mobile unit in urban locations, signals collected from multiple nonoverlapping beams offer low fade correlation with balanced power in the diversity branches. Clearly, since directional beams imply use of antenna aperture, angle diversity is closely related to space diversity. Angle diversity has been utilized in indoor wireless LANs, where its use allows a substantial increase in LAN throughput [2].

12.2.4 Frequency Diversity

Another technique to obtain decorrelated diversity branches is to transmit the same signal over different frequencies. The frequency separation between carriers should be larger than the coherence bandwidth. The coherence bandwidth, of course, depends on the multipath delay spread of the channel. The larger the delay spread, the smaller the coherence bandwidth and the more closely we can space the frequency diversity channels. Clearly, frequency diversity is an explicit diversity technique and needs additional frequency spectrum.

A common form of frequency diversity is multicarrier (also known as multitone) modulation. This technique involves sending redundant data over a number of closely spaced carriers to benefit from frequency diversity, which is then exploited by applying interleaving and channel coding/forward error correction across the carriers. Another technique is frequency hopping, wherein the interleaved and channel coded data stream is transmitted on widely separated frequencies from burst to burst. The wide frequency separation is chosen to guarantee independent fading from burst to burst.

12.2.5 Path Diversity

This implicit diversity is available if the signal bandwidth is much larger than the channel coherence bandwidth. The basis for this method is that when the multipath arrivals can be resolved in the receiver, the independently fading paths provide diversity gain. In CDMA systems, the multipath arrivals must be separated by more than one chip period, and the RAKE receiver provides the diversity [9]. In TDMA systems, the multipath arrivals must be separated by more than one symbol period, and the MLSE receiver provides the diversity.

12.2.6 Time Diversity

In mobile communications channels, the mobile motion together with scattering in the vicinity of the mobile causes time selective fading of the signal with Rayleigh fading statistics for the signal envelope. Signal fade levels separated by the coherence time show low correlation and can be used as diversity branches if the same signal can be transmitted at multiple instants separated by the coherence time. The coherence time depends on the Doppler spread of the signal, which in turn is a function of the mobile speed and the carrier frequency.

Time diversity is usually exploited via interleaving, forward error correction (FEC) coding, and automatic repeat request (ARQ). These are sophisticated techniques to exploit channel coding and time diversity. One fundamental drawback with time diversity approaches is the delay needed to collect the repeated or interleaved transmissions. If the coherence time is large, as for example when the vehicle is slow moving, the required delay becomes too large to be acceptable for interactive voice conversation.

The statistical properties of fading signals depend on the field component used by the antenna, the vehicular speed, and the carrier frequency. For an idealized case of a mobile surrounded by scatterers in all directions, the autocorrelation function of the received signal x(t) (note this is not the envelope r(t)) can be shown to be

\[
E\left[x(t)\,x(t+\tau)\right] = J_0\!\left(2\pi\tau v/\lambda\right)
\]

where J0 is the Bessel function of the first kind of order zero, v is the mobile velocity, and λ is the carrier wavelength.
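The delay problem at low speeds can be seen by evaluating this autocorrelation numerically. The sketch below computes J0 from its standard integral representation using only the Python standard library; the 900 MHz carrier, the two vehicle speeds, and the function names are assumptions chosen for illustration.

```python
import math

def bessel_j0(x, n_steps=200):
    """J0(x) from the integral (1/pi) * int_0^pi cos(x sin t) dt,
    evaluated with the midpoint rule."""
    h = math.pi / n_steps
    total = sum(math.cos(x * math.sin((k + 0.5) * h)) for k in range(n_steps))
    return total / n_steps

def fading_autocorrelation(tau_s, speed_mps, wavelength_m):
    """Normalized autocorrelation of the received signal at lag tau."""
    return bessel_j0(2.0 * math.pi * tau_s * speed_mps / wavelength_m)

wavelength = 299_792_458.0 / 900e6     # assumed 900 MHz carrier
for speed in (1.0, 30.0):              # assumed pedestrian and vehicular speeds, m/s
    for tau_ms in (1.0, 4.0, 16.0):
        rho = fading_autocorrelation(tau_ms * 1e-3, speed, wavelength)
        print(f"v = {speed:4.1f} m/s  tau = {tau_ms:4.1f} ms  correlation = {rho:+.3f}")
```

For the faster speed the correlation falls off within a few milliseconds, while for the slower speed the signal remains highly correlated over the same lags, so repeated or interleaved transmissions must be spread over a much longer interval to obtain independent fades.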

12.2.7 Transformed Diversity

In transformed diversity, the space diversity branches at the transmitter are transformed into other forms of diversity branches exploitable at the receiver. This is used when the forward channel is not known and shifts the responsibility of diversity combining to the receiver, which has the necessary channel knowledge.


Space to Frequency

• Antenna-delay. Here the signal is transmitted from two or more antennas with delays of the order of a chip or symbol period in CDMA or TDMA, respectively. The different transmissions simulate resolved path arrivals that can be used as diversity branches by the RAKE or MLSE equalizer.
• Multicarrier modulation. The data stream after interleaving and coding is modulated as a multicarrier output using an inverse DFT. The carriers are then mapped to the different antennas. The space selective fading at the antennas is now transformed to frequency selective fading and diversity is obtained during decoding.

Space to Time

• Antenna hopping/phase rolling. In this method the data stream after coding and interleaving is switched randomly from antenna to antenna. The space selective fading at the transmitter is converted into a time selective fading at the receiver. This is a form of “active” fading.
• Space-time coding. The approach in space-time coding is to split the encoded data into multiple data streams, each of which is modulated and simultaneously transmitted from a different antenna. The received signal is a superposition of the multiple transmitted signals. Channel decoding can be used to recover the data sequence. Since the encoded data arrive over uncorrelated fade branches, diversity gain can be realized.

12.3 Diversity Combining Techniques

Several diversity combining methods are known. We describe three main techniques: selection, maximal ratio, and equal gain. They can be used with each of the diversity schemes discussed above.

12.3.1 Selection Combining

This is the simplest and perhaps the most frequently used form of diversity combining. In this technique, the diversity branch with the highest carrier-to-noise ratio (C/N) is connected to the output. See Fig. 12.2(a). The performance improvement due to selection diversity can be seen as follows. Let the signal in each branch exhibit Rayleigh fading with mean power σ². The density function of the envelope is given by

\[
p(r_i) = \frac{r_i}{\sigma^2}\, e^{-r_i^2 / 2\sigma^2} \tag{12.1}
\]

where r_i is the signal envelope in each branch. If we define two new variables

\[
\gamma_i = \frac{\text{Instantaneous signal power in each branch}}{\text{Mean noise power}}, \qquad
\Gamma = \frac{\text{Mean signal power in each branch}}{\text{Mean noise power}}
\]

then the probability that the C/N is less than or equal to some specified value γ_s is

\[
\mathrm{Prob}\left[\gamma_i \le \gamma_s\right] = 1 - e^{-\gamma_s/\Gamma} \tag{12.2}
\]

FIGURE 12.2: Diversity combining methods for two diversity branches.

The probability that γ_i in all M branches with independent fading will be simultaneously less than or equal to γ_s is then

\[
\mathrm{Prob}\left[\gamma_1, \gamma_2, \ldots, \gamma_M \le \gamma_s\right] = \left(1 - e^{-\gamma_s/\Gamma}\right)^M \tag{12.3}
\]

This is the distribution of the best signal from the M diversity branches. Figure 12.3 shows the distribution of the combiner output C/N for M = 1, 2, 3, and 4 branches. The improvement in signal quality is significant. For example, at the 99% reliability level, the improvement in C/N is 10 dB for two branches and 16 dB for four branches. Selection combining also increases the mean C/N of the combiner output, which can be shown to be [3]

\[
\mathrm{Mean}(\gamma_s) = \Gamma \sum_{k=1}^{M} \frac{1}{k} \tag{12.4}
\]

This indicates that with 4 branches, for example, the mean C/N of the selected branch is 2.08 times (about 3.2 dB) better than the mean C/N in any one branch.
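The quoted gains can be checked directly from (12.3) and (12.4). The following is a small sketch, with C/N expressed relative to the mean branch C/N and with illustrative function names; it inverts (12.3) in closed form to find the level exceeded 99% of the time.

```python
import math

def selection_cdf(x, m):
    """Eq. (12.3) with x = gamma_s / (mean branch C/N) and M = m branches."""
    return (1.0 - math.exp(-x)) ** m

def selection_level_db(outage, m):
    """C/N (dB relative to the mean branch C/N) exceeded with probability 1 - outage."""
    x = -math.log(1.0 - outage ** (1.0 / m))   # closed-form inverse of (12.3)
    return 10.0 * math.log10(x)

def selection_mean_factor(m):
    """Eq. (12.4): mean output C/N divided by the mean branch C/N."""
    return sum(1.0 / k for k in range(1, m + 1))

ref = selection_level_db(0.01, 1)
for m in (1, 2, 3, 4):
    gain = selection_level_db(0.01, m) - ref
    print(f"M = {m}: gain at 99% = {gain:5.1f} dB, "
          f"mean factor = {selection_mean_factor(m):.3f}")
```

The computed gains at the 99% level agree with the rounded figures quoted above for two and four branches, and the mean factor for four branches is 25/12, or about 2.08.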

FIGURE 12.3: Probability distribution of signal envelope for selection combining.

12.3.2 Maximal Ratio Combining

In this technique the M diversity branches are first co-phased and then weighted proportionally to their signal level before summing. See Fig. 12.2(b). The distribution of the output of the maximal ratio combiner has been shown to be [5]

\[
\mathrm{Prob}\left[\gamma \le \gamma_m\right] = 1 - e^{-\gamma_m/\Gamma} \sum_{k=1}^{M} \frac{(\gamma_m/\Gamma)^{k-1}}{(k-1)!} \tag{12.5}
\]

where Γ is again the mean C/N in each branch. The distribution of the output of a maximal ratio combiner is shown in Fig. 12.4. Maximal ratio combining is known to be optimal in the sense that it yields the best statistical reduction of fading of any linear diversity combiner. At the 99% reliability level, the maximal ratio combiner provides an 11.5 dB gain for two branches and a 19 dB gain for four branches, an improvement of 1.5 and 3 dB, respectively, over the selection diversity combiner. The mean C/N of the combined signal may be easily shown to be

\[
\mathrm{Mean}(\gamma_m) = M\,\Gamma \tag{12.6}
\]

Therefore, the combiner output mean varies linearly with M. This confirms the intuitive result that the output C/N averaged over fades should provide gain proportional to the number of diversity branches. This is a situation similar to conventional beamforming.
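The gains at the 99% reliability level can be estimated from (12.5) by solving numerically for the C/N level at which the distribution equals 0.01. The sketch below uses simple bisection; C/N is expressed relative to the mean branch C/N, and the function names and search interval are illustrative choices.

```python
import math

def mrc_cdf(x, m):
    """Eq. (12.5) with x = gamma_m / (mean branch C/N) and M = m branches."""
    series = sum(x ** (k - 1) / math.factorial(k - 1) for k in range(1, m + 1))
    return 1.0 - math.exp(-x) * series

def mrc_level_db(outage, m):
    """C/N (dB relative to the mean branch C/N) exceeded with probability 1 - outage."""
    lo, hi = 0.0, 50.0
    for _ in range(100):               # bisection on the monotonic CDF
        mid = 0.5 * (lo + hi)
        if outage > mrc_cdf(mid, m):
            lo = mid
        else:
            hi = mid
    return 10.0 * math.log10(0.5 * (lo + hi))

ref = mrc_level_db(0.01, 1)
for m in (1, 2, 3, 4):
    print(f"M = {m}: gain at 99% = {mrc_level_db(0.01, m) - ref:5.1f} dB")
```

The values obtained this way are close to the 11.5 dB and 19 dB figures quoted above; small differences are to be expected since the quoted numbers are rounded.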

12.3.3

Equal Gain Combining

In some applications it may be difficult to estimate the amplitude accurately. In that case the combining gains may all be set to unity and the diversity branches merely summed after co-phasing [see Fig. 12.2(c)]. The distribution of the equal gain combiner output does not have a neat closed-form expression and has been computed by numerical evaluation. Its performance has been shown to be within about a decibel of maximal ratio combining. The mean C/N can be shown to be [3]

$$\mathrm{Mean}(\gamma_e) = \Gamma \left[ 1 + (M - 1)\frac{\pi}{4} \right] \tag{12.7}$$

FIGURE 12.4: Probability distribution for signal envelope for maximal ratio combining.

Like maximal ratio combining, the mean C/N for equal gain combining grows almost linearly with M and is approximately only one decibel poorer than maximal ratio combiner even with an infinite number of branches.
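The mean-C/N results for the three combiners can be compared side by side with a few lines of code. This is a minimal sketch: the function names are hypothetical, the gains are normalized by Γ, and the selection-combining mean is assumed to be the harmonic sum Σ 1/k, which reproduces the factor of 2.08 quoted earlier for four branches.

```python
import math


def mean_gain_selection(branches: int) -> float:
    """Mean C/N of the selected branch relative to one branch (assumed harmonic sum)."""
    return sum(1.0 / k for k in range(1, branches + 1))


def mean_gain_mrc(branches: int) -> float:
    """Eq. (12.6): maximal ratio combining, Mean = M * Gamma."""
    return float(branches)


def mean_gain_egc(branches: int) -> float:
    """Eq. (12.7): equal gain combining, Mean = Gamma * [1 + (M - 1) * pi / 4]."""
    return 1.0 + (branches - 1) * math.pi / 4.0


def to_db(ratio: float) -> float:
    return 10.0 * math.log10(ratio)


if __name__ == "__main__":
    for m in (1, 2, 4, 8):
        print(
            f"M = {m}: selection {to_db(mean_gain_selection(m)):5.2f} dB, "
            f"EGC {to_db(mean_gain_egc(m)):5.2f} dB, "
            f"MRC {to_db(mean_gain_mrc(m)):5.2f} dB"
        )
```

For very large M the ratio of Eq. (12.6) to Eq. (12.7) tends to 4/π, which is the roughly one-decibel penalty mentioned above.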

12.3.4

Loss of Diversity Gain Due to Branch Correlation and Unequal Branch Powers

The above analysis assumed that the fading signals in the diversity branches were all uncorrelated and of equal power. In practice, this may be difficult to achieve and, as we saw earlier, a branch cross-correlation coefficient of ρ = 0.7 is considered to be acceptable. Also, equal mean powers in diversity branches are rarely available. In such cases we can expect a certain loss of diversity gain. However, since most of the damage in fading is due to deep fades, and since the chance of coincidental deep fades is small even for moderate branch correlation, one can expect a reasonable tolerance to branch correlation. The distribution of the output signal envelope of the maximal ratio combiner has been shown to be [6]:

$$\mathrm{Prob}(\gamma_m) = \sum_{n=1}^{M} \frac{A_n}{2\lambda_n}\, e^{-\gamma_m/2\lambda_n} \tag{12.8}$$

where $\lambda_n$ are the eigenvalues of the M × M branch envelope covariance matrix whose elements are defined by

$$R_{ij} = E\left[ r_i r_j^{*} \right] \tag{12.9}$$

and $A_n$ is defined by

$$A_n = \prod_{\substack{k=1 \\ k \ne n}}^{M} \frac{1}{1 - \lambda_k/\lambda_n} \tag{12.10}$$

12.4

Effect of Diversity Combining on Bit Error Rate

So far we have studied the distribution of the instantaneous envelope or C/N after diversity combining. We will now briefly survey how diversity combining affects BER performance in digital radio links; we assume maximal ratio combining. To begin, let us first examine the effect of Rayleigh fading on the BER performance of digital transmission links. This has been studied by several authors and is summarized in [7]. Table 12.1 gives the BER expressions in the large $E_b/N_0$ case for coherent binary PSK and coherent binary orthogonal FSK for unfaded and Rayleigh faded additive white Gaussian noise (AWGN) channels. $\bar{E}_b/N_0$ represents the average $E_b/N_0$ for the fading channel.

TABLE 12.1 Comparison of BER Performance for Unfaded and Rayleigh Faded Signals

Modulation    Unfaded BER                                            Faded BER
Coh BPSK      $\frac{1}{2}\,\mathrm{erfc}\sqrt{E_b/N_0}$              $\frac{1}{4\,\bar{E}_b/N_0}$
Coh FSK       $\frac{1}{2}\,\mathrm{erfc}\sqrt{\frac{1}{2}E_b/N_0}$   $\frac{1}{2\,\bar{E}_b/N_0}$

Observe that error rates decrease only inversely with SNR, as against the exponential decrease for the unfaded channel. Also note that for fading channels, coherent binary PSK is 3 dB better than coherent binary FSK, exactly the same advantage as in the unfaded case. Even for the modest target BER of $10^{-2}$ that is usually needed in mobile communications, the loss due to fading can be very high: 17.2 dB. To obtain the BER with maximal ratio diversity combining we have to average the expression for the unfaded BER over the distribution obtained for the maximal ratio combiner given in (12.5). Analytical expressions have been derived for these in [7]. For a branch SNR greater than 10 dB, the BER after maximal ratio diversity combining is given in Table 12.2. We observe that the probability of error varies as $1/(\bar{E}_b/N_0)$ raised to the Lth power, where L is the number of diversity branches. Thus, diversity reduces the error rate exponentially as the number of independent branches increases.
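The high-SNR expressions summarized from [7] can be evaluated directly to see how quickly the error rate falls with the number of branches. The sketch below assumes the asymptotic form $(1/(c\,\bar{E}_b/N_0))^L \binom{2L-1}{L}$ with c = 4 for coherent BPSK and c = 2 for coherent FSK, which reduces to the faded entries of Table 12.1 when L = 1. The function name and arguments are illustrative only.

```python
import math


def rayleigh_ber_mrc(mean_ebn0_db: float, branches: int, modulation: str = "bpsk") -> float:
    """High-SNR BER after maximal ratio combining of `branches` independent Rayleigh branches.

    mean_ebn0_db is the average branch Eb/N0 in dB; the approximation is
    intended for branch SNRs above roughly 10 dB.
    """
    scale = {"bpsk": 4.0, "fsk": 2.0}[modulation]
    mean_ebn0 = 10.0 ** (mean_ebn0_db / 10.0)
    return math.comb(2 * branches - 1, branches) * (1.0 / (scale * mean_ebn0)) ** branches


if __name__ == "__main__":
    for branches in (1, 2, 4):
        ber = rayleigh_ber_mrc(20.0, branches)
        print(f"L = {branches}: BPSK BER at 20 dB mean branch Eb/N0 = {ber:.2e}")
```

Each added branch multiplies the BER by a further factor on the order of $1/\bar{E}_b/N_0$, which is the Lth-power behaviour noted above.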

TABLE 12.2 BER Performance for Coherent BPSK and FSK with Diversity

Modulation       Post-Diversity BER
Coherent BPSK    $\left(\frac{1}{4\,\bar{E}_b/N_0}\right)^{L} \binom{2L-1}{L}$
Coherent FSK     $\left(\frac{1}{2\,\bar{E}_b/N_0}\right)^{L} \binom{2L-1}{L}$

12.5

Concluding Remarks

Diversity provides a powerful technique for combating fading in mobile communication systems. Diversity techniques seek to generate and exploit multiple branches over which the signal shows low fade correlation. To obtain the best diversity performance, the multiple access, modulation, coding and antenna design of the wireless link must all be carefully chosen so as to provide a rich and reliable level of well-balanced, low-correlation diversity branches in the target propagation environment. Successful diversity exploitation can impact a mobile network in several ways. Reduced power requirements can result in increased coverage or improved battery life. Low signal outage improves voice quality and handoff performance. Finally, reduced fade margins directly translate to better reuse factors and, hence, increased system capacity.

Defining Terms

Automatic request for repeat: An error control mechanism in which received packets that cannot be corrected are retransmitted.

Channel coding/Forward error correction: A technique that inserts redundant bits during transmission to help detect and correct bit errors during reception.

Fading: Fluctuation in the signal level due to shadowing and multipath effects.

Frequency hopping: A technique where the signal bursts are transmitted at different frequencies separated by random spacings that are multiples of the signal bandwidth.

Interleaving: A form of data scrambling that spreads bursts of bit errors evenly over the received data, allowing efficient forward error correction.

Outage probability: The probability that the signal level falls below a specified minimum level.

PCS: Personal Communications Services.

RAKE receiver: A receiver used with direct sequence spread spectrum signals. The receiver extracts the energy in each path and then adds the paths together with appropriate weighting and delay.

References

[1] Adachi, F., Feeney, M.T., Williamson, A.G., and Parsons, J.D., Crosscorrelation between the envelopes of 900 MHz signals received at a mobile radio base station site. Proc. IEE, 133(6), 506–512, 1986.
[2] Freeburg, T.A., Enabling technologies for in-building network communications: four technical challenges and four solutions. IEEE Trans. Veh. Tech., 29(4), 58–64, 1991.
[3] Jakes, W.C., Microwave Mobile Communications, John Wiley & Sons, New York, 1974.
[4] Jefford, P.A., Turkmani, A.M.D., Arowojulu, A.A., and Kellet, C.J., An experimental evaluation of the performance of the two branch space and polarization schemes at 1800 MHz. IEEE Trans. Veh. Tech., VT-44(2), 318–326, 1995.
[5] Lee, W.C.Y., Mobile Communications Engineering, McGraw-Hill, New York, 1982.
[6] Pahlavan, K. and Levesque, A.H., Wireless Information Networks, John Wiley & Sons, New York, 1995.
[7] Proakis, J.G., Digital Communications, McGraw-Hill, New York, 1989.
[8] Vaughan, R.G., Polarization diversity system in mobile communications. IEEE Trans. Veh. Tech., VT-39(3), 177–186, 1990.
[9] Viterbi, A.J., CDMA: Principle of Spread Spectrum Communications, Addison-Wesley, Reading, MA, 1995.


Sklar, B. “Digital Communication System Performance” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Digital Communication System Performance¹

Bernard Sklar
Communications Engineering Services

13.1 Introduction
The Channel • The Link

13.2 Bandwidth and Power Considerations
The Bandwidth Efficiency Plane • M-ary Signalling • Bandwidth-Limited Systems • Power-Limited Systems • Minimum Bandwidth Requirements for MPSK and MFSK Signalling

13.3 Example 1: Bandwidth-Limited Uncoded System
Solution to Example 1

13.4 Example 2: Power-Limited Uncoded System
Solution to Example 2

13.5 Example 3: Bandwidth-Limited and Power-Limited Coded System
Solution to Example 3 • Calculating Coding Gain

13.6 Example 4: Direct-Sequence Spread-Spectrum Coded System
Processing Gain • Channel Parameters for Example 13.4 • Solution to Example 13.4

13.7 Conclusion

Appendix: Received $E_b/N_0$ Is Independent of the Code Parameters
References
Further Information

13.1

Introduction

In this section we examine some fundamental tradeoffs among bandwidth, power, and error performance of digital communication systems. The criteria for choosing modulation and coding schemes, based on whether a system is bandwidth limited or power limited, are reviewed for several system examples. Emphasis is placed on the subtle but straightforward relationships we encounter when transforming from data-bits to channel-bits to symbols to chips.

1 A version of this chapter has appeared as a paper in the IEEE Communications Magazine, November 1993, under the title “Defining, Designing, and Evaluating Digital Communication Systems.”

The design or definition of any digital communication system begins with a description of the communication link. The link is the name given to the communication transmission path from the modulator and transmitter, through the channel, and up to and including the receiver and demodulator. The channel is the name given to the propagating medium between the transmitter and receiver. A link description quantifies the average signal power that is received, the available bandwidth, the noise statistics, and other impairments, such as fading. Also needed to define the system are basic requirements, such as the data rate to be supported and the error performance.

13.1.1

The Channel

For radio communications, the concept of free space assumes a channel region free of all objects that might affect radio frequency (RF) propagation by absorption, reflection, or refraction. It further assumes that the atmosphere in the channel is perfectly uniform and nonabsorbing, and that the earth is infinitely far away or its reflection coefficient is negligible. The RF energy arriving at the receiver is assumed to be a function of distance from the transmitter (simply following the inverse-square law of optics). In practice, of course, propagation in the atmosphere and near the ground results in refraction, reflection, and absorption, which modify the free space transmission.

13.1.2

The Link

A radio transmitter is characterized by its average output signal power $P_t$ and the gain of its transmitting antenna $G_t$. The product $P_t G_t$, referenced to an isotropic antenna, is called the effective isotropic radiated power (EIRP) and is expressed in watts (or dBW). The predetection average signal power S arriving at the output of the receiver antenna can be described as a function of the EIRP, the gain of the receiving antenna $G_r$, the path loss (or space loss) $L_s$, and other losses $L_o$, as follows [14, 15]:

$$S = \frac{\mathrm{EIRP}\; G_r}{L_s L_o} \tag{13.1}$$

The path loss $L_s$ can be written as follows [15]:

$$L_s = \left( \frac{4\pi d}{\lambda} \right)^{2} \tag{13.2}$$

where d is the distance between the transmitter and receiver and λ is the wavelength. We restrict our discussion to those links distorted by the mechanism of additive white Gaussian noise (AWGN) only. Such a noise assumption is a very useful model for a large class of communication systems. A valid approximation for the average received noise power N that this model introduces is written as follows [5, 9]:

$$N \cong k T^{\circ} W \tag{13.3}$$

where k is Boltzmann’s constant ($1.38 \times 10^{-23}$ joule/K), $T^{\circ}$ is the effective temperature in kelvin, and W is the bandwidth in hertz. Dividing Eq. (13.3) by bandwidth enables us to write the received noise-power spectral density $N_0$ as follows:

$$N_0 = \frac{N}{W} = k T^{\circ} \tag{13.4}$$

Dividing Eq. (13.1) by $N_0$ yields the received average signal-power to noise-power spectral density $S/N_0$ as

$$\frac{S}{N_0} = \frac{\mathrm{EIRP}\; G_r/T^{\circ}}{k L_s L_o} \tag{13.5}$$

where $G_r/T^{\circ}$ is often referred to as the receiver figure of merit. A link budget analysis is a compilation of the power gains and losses throughout the link; it is generally computed in decibels, and thus takes on the bookkeeping appearance of a business enterprise, highlighting the assets and liabilities of the link. Once the value of $S/N_0$ is specified or calculated from the link parameters, we then shift our attention to optimizing the choice of signalling types for meeting system bandwidth and error performance requirements. Given the received $S/N_0$, we can write the received bit-energy to noise-power spectral density $E_b/N_0$, for any desired data rate R, as follows:

$$\frac{E_b}{N_0} = \frac{S T_b}{N_0} = \frac{S}{N_0} \left( \frac{1}{R} \right) \tag{13.6}$$

Equation (13.6) follows from the basic definitions that received bit energy is equal to received average signal power times the bit duration and that bit rate is the reciprocal of bit duration. Received $E_b/N_0$ is a key parameter in defining a digital communication system. Its value indicates the apportionment of the received waveform energy among the bits that the waveform represents. At first glance, one might think that a system specification should entail the symbol-energy to noise-power spectral density $E_s/N_0$ associated with the arriving waveforms. We will show, however, that for a given $S/N_0$ the value of $E_s/N_0$ is a function of the modulation and coding. The reason for defining systems in terms of $E_b/N_0$ stems from the fact that $E_b/N_0$ depends only on $S/N_0$ and R and is unaffected by any system design choices, such as modulation and coding.
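Because a link budget is usually tallied in decibels, Eqs. (13.1) through (13.6) turn into sums and differences. The following sketch shows one way to do the bookkeeping; the function names are hypothetical and every numerical value in the usage section (EIRP, G/T, range, frequency, other losses) is an assumed figure chosen only for illustration, not a value from the text.

```python
import math

# Boltzmann's constant in dBW/(K*Hz), about -228.6
BOLTZMANN_DBW = 10.0 * math.log10(1.38e-23)


def path_loss_db(distance_m: float, wavelength_m: float) -> float:
    """Eq. (13.2) in decibels: Ls = (4*pi*d / lambda)^2."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m / wavelength_m)


def received_s_over_n0_dbhz(eirp_dbw: float, g_over_t_dbk: float,
                            ls_db: float, lo_db: float) -> float:
    """Eq. (13.5) in decibels: S/N0 = EIRP + Gr/T - k - Ls - Lo."""
    return eirp_dbw + g_over_t_dbk - BOLTZMANN_DBW - ls_db - lo_db


def ebn0_db(s_over_n0_dbhz: float, bit_rate_bps: float) -> float:
    """Eq. (13.6) in decibels: Eb/N0 = S/N0 - R."""
    return s_over_n0_dbhz - 10.0 * math.log10(bit_rate_bps)


if __name__ == "__main__":
    # Assumed example link: 10 dBW EIRP, -20 dB/K receiver G/T,
    # 10 km range at 900 MHz, 3 dB of miscellaneous losses.
    wavelength = 3.0e8 / 900.0e6
    ls = path_loss_db(10.0e3, wavelength)
    s_n0 = received_s_over_n0_dbhz(10.0, -20.0, ls, 3.0)
    print(f"Path loss = {ls:.1f} dB, S/N0 = {s_n0:.1f} dB-Hz")
    print(f"Eb/N0 at 9600 b/s = {ebn0_db(s_n0, 9600):.1f} dB")
```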

13.2

Bandwidth and Power Considerations

Two primary communications resources are the received power and the available transmission bandwidth. In many communication systems, one of these resources may be more precious than the other and, hence, most systems can be classified as either bandwidth limited or power limited. In bandwidth-limited systems, spectrally efficient modulation techniques can be used to save bandwidth at the expense of power; in power-limited systems, power efficient modulation techniques can be used to save power at the expense of bandwidth. In both bandwidth- and power-limited systems, error-correction coding (often called channel coding) can be used to save power or to improve error performance at the expense of bandwidth. Recently, trellis-coded modulation (TCM) schemes have been used to improve the error performance of bandwidth-limited channels without any increase in bandwidth [17], but these methods are beyond the scope of this chapter.

13.2.1

The Bandwidth Efficiency Plane

In Fig. 13.1 the abscissa is the ratio of bit-energy to noise-power spectral density $E_b/N_0$ (in decibels), and the ordinate is the throughput R (in bits per second) that can be transmitted per hertz in a given bandwidth W. The ratio R/W is called bandwidth efficiency, since it reflects how efficiently the bandwidth resource is utilized. The plot stems from the Shannon–Hartley capacity theorem [12, 13, 15], which can be stated as

$$C = W \log_2 \left( 1 + \frac{S}{N} \right) \tag{13.7}$$

where S/N is the ratio of received average signal power to noise power. When the logarithm is taken to the base 2, the capacity C is given in bits per second. The capacity of a channel defines the maximum number of bits that can be reliably sent per second over the channel. For the case where the data (information) rate R is equal to C, the curve separates a region of practical communication systems from a region where such communication systems cannot operate reliably [12, 15].

FIGURE 13.1: Bandwidth-efficiency plane.
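The R = C boundary of the bandwidth-efficiency plane can be traced numerically. Substituting $S = E_b R$ and $N = N_0 W$ into Eq. (13.7) with R = C gives $E_b/N_0 = (2^{R/W} - 1)/(R/W)$; this rearrangement is a standard step that is not spelled out in the text, and the function name below is hypothetical.

```python
import math


def shannon_min_ebn0_db(bandwidth_efficiency: float) -> float:
    """Smallest Eb/N0 (dB) permitting reliable operation at R/W = bandwidth_efficiency.

    Obtained from C = W*log2(1 + S/N) with R = C, S = Eb*R and N = N0*W.
    """
    eta = bandwidth_efficiency
    return 10.0 * math.log10((2.0 ** eta - 1.0) / eta)


if __name__ == "__main__":
    for eta in (1 / 16, 1 / 4, 1, 2, 4, 8):
        print(f"R/W = {eta:6.3f} (b/s)/Hz -> Eb/N0 >= {shannon_min_ebn0_db(eta):6.2f} dB")
```

As R/W becomes very small, the bound approaches 10 log10(ln 2), roughly -1.6 dB, which is the familiar Shannon limit.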

13.2.2

M-ary Signalling

Each symbol in an M-ary alphabet can be related to a unique sequence of m bits, expressed as

$$M = 2^{m} \quad \text{or} \quad m = \log_2 M \tag{13.8}$$

where M is the size of the alphabet. In the case of digital transmission, the term symbol refers to the member of the M-ary alphabet that is transmitted during each symbol duration $T_s$. To transmit the symbol, it must be mapped onto an electrical voltage or current waveform. Because the waveform represents the symbol, the terms symbol and waveform are sometimes used interchangeably. Since one of M symbols or waveforms is transmitted during each symbol duration $T_s$, the data rate R in bits per second can be expressed as

$$R = \frac{m}{T_s} = \frac{\log_2 M}{T_s} \tag{13.9}$$

Data-bit-time duration is the reciprocal of data rate. Similarly, symbol-time duration is the reciprocal of symbol rate. Therefore, from Eq. (13.9), we write the effective time duration $T_b$ of each bit in terms of the symbol duration $T_s$ or the symbol rate $R_s$ as

$$T_b = \frac{1}{R} = \frac{T_s}{m} = \frac{1}{m R_s} \tag{13.10}$$

Then, using Eqs. (13.8) and (13.10), we can express the symbol rate $R_s$ in terms of the bit rate R as follows:

$$R_s = \frac{R}{\log_2 M} \tag{13.11}$$

From Eqs. (13.9) and (13.10), any digital scheme that transmits $m = \log_2 M$ bits in $T_s$ seconds, using a bandwidth of W hertz, operates at a bandwidth efficiency of

$$\frac{R}{W} = \frac{\log_2 M}{W T_s} = \frac{1}{W T_b} \quad \text{(b/s)/Hz} \tag{13.12}$$

where $T_b$ is the effective time duration of each data bit.

13.2.3

Bandwidth-Limited Systems

From Eq. (13.12), the smaller the $W T_b$ product, the more bandwidth efficient will be any digital communication system. Thus, signals with small $W T_b$ products are often used with bandwidth-limited systems. For example, the European digital mobile telephone system known as Global System for Mobile Communications (GSM) uses Gaussian minimum shift keying (GMSK) modulation having a $W T_b$ product equal to 0.3 Hz/(b/s), where W is the 3-dB bandwidth of a Gaussian filter [4]. For uncoded bandwidth-limited systems, the objective is to maximize the transmitted information rate within the allowable bandwidth, at the expense of $E_b/N_0$ (while maintaining a specified value of bit-error probability $P_B$). The operating points for coherent M-ary phase-shift keying (MPSK) at $P_B = 10^{-5}$ are plotted on the bandwidth-efficiency plane of Fig. 13.1. We assume Nyquist (ideal rectangular) filtering at baseband [10]. Thus, for MPSK, the required double-sideband (DSB) bandwidth at an intermediate frequency (IF) is related to the symbol rate as follows:

$$W = \frac{1}{T_s} = R_s \tag{13.13}$$

where $T_s$ is the symbol duration and $R_s$ is the symbol rate. The use of Nyquist filtering results in the minimum required transmission bandwidth that yields zero intersymbol interference; such ideal filtering gives rise to the name Nyquist minimum bandwidth. From Eqs. (13.12) and (13.13), the bandwidth efficiency of MPSK modulated signals using Nyquist filtering can be expressed as

$$R/W = \log_2 M \quad \text{(b/s)/Hz} \tag{13.14}$$

The MPSK points in Fig. 13.1 confirm the relationship shown in Eq. (13.14). Note that MPSK modulation is a bandwidth-efficient scheme. As M increases in value, R/W also increases. MPSK modulation can be used for realizing an improvement in bandwidth efficiency at the cost of increased $E_b/N_0$. Although beyond the scope of this chapter, many highly bandwidth-efficient modulation schemes are under investigation [1].

13.2.4

Power-Limited Systems

Operating points for noncoherent orthogonal M-ary FSK (MFSK) modulation at $P_B = 10^{-5}$ are also plotted on Fig. 13.1. For MFSK, the IF minimum bandwidth is as follows [15]:

$$W = \frac{M}{T_s} = M R_s \tag{13.15}$$

where $T_s$ is the symbol duration and $R_s$ is the symbol rate. With MFSK, the required transmission bandwidth is expanded M-fold over binary FSK since there are M different orthogonal waveforms, each requiring a bandwidth of $1/T_s$. Thus, from Eqs. (13.12) and (13.15), the bandwidth efficiency of noncoherent orthogonal MFSK signals can be expressed as

$$\frac{R}{W} = \frac{\log_2 M}{M} \quad \text{(b/s)/Hz} \tag{13.16}$$

The MFSK points plotted in Fig. 13.1 confirm the relationship shown in Eq. (13.16). Note that MFSK modulation is a bandwidth-expansive scheme. As M increases, R/W decreases. MFSK modulation can be used for realizing a reduction in required $E_b/N_0$ at the cost of increased bandwidth. In Eqs. (13.13) and (13.14) for MPSK, and Eqs. (13.15) and (13.16) for MFSK, and for all the points plotted in Fig. 13.1, ideal filtering has been assumed. Such filters are not realizable! For realistic channels and waveforms, the required transmission bandwidth must be increased in order to account for realizable filters.

In the examples that follow, we will consider radio channels that are disturbed only by additive white Gaussian noise (AWGN) and have no other impairments, and for simplicity, we will limit the modulation choice to constant-envelope types, i.e., either MPSK or noncoherent orthogonal MFSK. For an uncoded system, MPSK is selected if the channel is bandwidth limited, and MFSK is selected if the channel is power limited. When error-correction coding is considered, modulation selection is not as simple, because coding techniques can provide power-bandwidth tradeoffs more effectively than would be possible through the use of any M-ary modulation scheme considered in this chapter [3].
In the most general sense, M-ary signalling can be regarded as a waveform-coding procedure, i.e., when we select an M-ary modulation technique instead of a binary one, we in effect have replaced the binary waveforms with better waveforms—either better for bandwidth performance (MPSK) or better for power performance (MFSK). Even though orthogonal MFSK signalling can be thought of as being a coded system, i.e., a first-order Reed-Muller code [8], we restrict our use of the term coded system to those traditional error-correction codes using redundancies, e.g., block codes or convolutional codes.

13.2.5

Minimum Bandwidth Requirements for MPSK and MFSK Signalling

The basic relationship between the symbol (or waveform) transmission rate $R_s$ and the data rate R was shown in Eq. (13.11). Using this relationship together with Eqs. (13.13–13.16) and R = 9600 b/s, a summary of symbol rate, minimum bandwidth, and bandwidth efficiency for MPSK and noncoherent orthogonal MFSK was compiled for M = 2, 4, 8, 16, and 32 (Table 13.1). Values of $E_b/N_0$ required to achieve a bit-error probability of $10^{-5}$ for MPSK and MFSK are also given for each value of M. These entries (which were computed using relationships that are presented later in this chapter) corroborate the tradeoffs shown in Fig. 13.1. As M increases, MPSK signalling provides more bandwidth efficiency at the cost of increased $E_b/N_0$, whereas MFSK signalling allows for a reduction in $E_b/N_0$ at the cost of increased bandwidth.

TABLE 13.1 Symbol Rate, Minimum Bandwidth, Bandwidth Efficiency, and Required $E_b/N_0$ for MPSK and Noncoherent Orthogonal MFSK Signalling at 9600 bit/s

                                   MPSK         MPSK    MPSK Eb/N0      MFSK         MFSK    MFSK Eb/N0
 M    m   R (b/s)   Rs (symb/s)   Min BW (Hz)   R/W     (dB)            Min BW (Hz)  R/W     (dB)
 2    1   9600      9600          9600          1       9.6             19,200       1/2     13.4
 4    2   9600      4800          4800          2       9.6             19,200       1/2     10.6
 8    3   9600      3200          3200          3       13.0            25,600       3/8     9.1
16    4   9600      2400          2400          4       17.5            38,400       1/4     8.1
32    5   9600      1920          1920          5       22.4            61,440       5/32    7.4

Note: $E_b/N_0$ values are those required for $P_B = 10^{-5}$; the MFSK columns refer to noncoherent orthogonal MFSK.
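The rate and bandwidth columns of Table 13.1 follow mechanically from Eqs. (13.11) and (13.13) through (13.16). The sketch below regenerates them for R = 9600 b/s; the function name and dictionary keys are illustrative, and the required $E_b/N_0$ columns are not computed here because they depend on error-probability relationships introduced later in the chapter.

```python
from fractions import Fraction


def mary_row(alphabet_size: int, bit_rate: int = 9600) -> dict:
    """Symbol rate, minimum bandwidths and bandwidth efficiencies for one value of M."""
    m = alphabet_size.bit_length() - 1          # m = log2(M) for M a power of two
    if 2 ** m != alphabet_size:
        raise ValueError("M must be a power of two")
    symbol_rate = Fraction(bit_rate, m)         # Eq. (13.11)
    return {
        "M": alphabet_size,
        "m": m,
        "Rs": symbol_rate,
        "mpsk_bw": symbol_rate,                 # Eq. (13.13): W = Rs
        "mpsk_eff": Fraction(m),                # Eq. (13.14): R/W = log2(M)
        "mfsk_bw": alphabet_size * symbol_rate, # Eq. (13.15): W = M * Rs
        "mfsk_eff": Fraction(m, alphabet_size), # Eq. (13.16): R/W = log2(M) / M
    }


if __name__ == "__main__":
    for M in (2, 4, 8, 16, 32):
        row = mary_row(M)
        print(
            f"M={row['M']:2d}  Rs={int(row['Rs']):5d}  MPSK W={int(row['mpsk_bw']):5d} Hz  "
            f"R/W={row['mpsk_eff']}  MFSK W={int(row['mfsk_bw']):6d} Hz  R/W={row['mfsk_eff']}"
        )
```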

13.3

Example 1: Bandwidth-Limited Uncoded System

Suppose we are given a bandwidth-limited AWGN radio channel with an available bandwidth of W = 4000 Hz. Also, suppose that the link constraints (transmitter power, antenna gains, path loss, etc.) result in the ratio of received average signal-power to noise-power spectral density $S/N_0$ being equal to 53 dB-Hz. Let the required data rate R be equal to 9600 b/s, and let the required bit-error performance $P_B$ be at most $10^{-5}$. The goal is to choose a modulation scheme that meets the required performance. In general, an error-correction coding scheme may be needed if none of the allowable modulation schemes can meet the requirements. In this example, however, we shall find that the use of error-correction coding is not necessary.

13.3.1

Solution to Example 1

For any digital communication system, the relationship between received $S/N_0$ and received bit-energy to noise-power spectral density $E_b/N_0$ was given in Eq. (13.6) and is briefly rewritten as

$$\frac{S}{N_0} = \frac{E_b}{N_0} R \tag{13.17}$$

Solving for $E_b/N_0$ in decibels, we obtain

$$\begin{aligned}
\frac{E_b}{N_0}\ (\mathrm{dB}) &= \frac{S}{N_0}\ (\text{dB-Hz}) - R\ (\text{dB-b/s}) \\
&= 53\ \text{dB-Hz} - (10 \times \log_{10} 9600)\ \text{dB-b/s} \\
&= 13.2\ \mathrm{dB}\ (\text{or } 20.89)
\end{aligned} \tag{13.18}$$

Since the required data rate of 9600 b/s is much larger than the available bandwidth of 4000 Hz, the channel is bandwidth limited. We therefore select MPSK as our modulation scheme. We have confined the possible modulation choices to be constant-envelope types; without such a restriction, we would be able to select a modulation type with greater bandwidth efficiency. To conserve power, we compute the smallest possible value of M such that the MPSK minimum bandwidth does not exceed the available bandwidth of 4000 Hz. Table 13.1 shows that the smallest value of M meeting this requirement is M = 8. Next we determine whether the required bit-error performance of $P_B \le 10^{-5}$ can be met by using 8-PSK modulation alone or whether it is necessary to use an error-correction coding scheme. Table 13.1 shows that 8-PSK alone will meet the requirements, since the required $E_b/N_0$ listed for 8-PSK is less than the received $E_b/N_0$ derived in Eq. (13.18). Let us imagine that we do not have Table 13.1, however, and evaluate whether or not error-correction coding is necessary.
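The search for the smallest usable M can also be carried out without the table, since the MPSK Nyquist minimum bandwidth equals the symbol rate R / log2 M. The following is a minimal sketch with a hypothetical function name and an arbitrary upper limit on the alphabet size.

```python
from typing import Optional


def smallest_mpsk_order(bit_rate: float, bandwidth_hz: float, max_bits: int = 10) -> Optional[int]:
    """Smallest M = 2^m whose MPSK minimum bandwidth R / m fits in bandwidth_hz.

    Returns None if no alphabet up to 2^max_bits fits.
    """
    for m in range(1, max_bits + 1):
        if bit_rate / m <= bandwidth_hz:
            return 2 ** m
    return None


if __name__ == "__main__":
    order = smallest_mpsk_order(9600, 4000)
    print(f"Smallest MPSK alphabet that fits 9600 b/s into 4000 Hz: M = {order}")
```

Choosing the smallest M that fits is the power-conserving choice, because the required $E_b/N_0$ for MPSK grows with M.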

Figure 13.2 shows the basic modulator/demodulator (MODEM) block diagram summarizing the functional details of this design. At the modulator, the transformation from data bits to symbols yields an output symbol rate $R_s$ that is a factor $\log_2 M$ smaller than the input data-bit rate R, as is seen in Eq. (13.11). Similarly, at the input to the demodulator, the symbol-energy to noise-power spectral density $E_s/N_0$ is a factor $\log_2 M$ larger than $E_b/N_0$, since each symbol is made up of $\log_2 M$ bits. Because $E_s/N_0$ is larger than $E_b/N_0$ by the same factor that $R_s$ is smaller than R, we can expand Eq. (13.17) as follows:

$$\frac{S}{N_0} = \frac{E_b}{N_0} R = \frac{E_s}{N_0} R_s \tag{13.19}$$

FIGURE 13.2: Basic modulator/demodulator (MODEM) without channel coding.

The demodulator receives a waveform (in this example, one of M = 8 possible phase shifts) during each time interval $T_s$. The probability that the demodulator makes a symbol error $P_E(M)$ is well approximated by the following equation for M > 2 [6]:

$$P_E(M) \cong 2Q\left[ \sqrt{\frac{2E_s}{N_0}}\, \sin\left( \frac{\pi}{M} \right) \right] \tag{13.20}$$

where Q(x), sometimes called the complementary error function, represents the probability under the tail of a zero-mean unit-variance Gaussian density function. It is defined as follows [18]:

$$Q(x) = \frac{1}{\sqrt{2\pi}} \int_{x}^{\infty} \exp\left( -\frac{u^{2}}{2} \right) du \tag{13.21}$$

A good approximation for Q(x), valid for x > 3, is given by the following equation [2]:

$$Q(x) \cong \frac{1}{x\sqrt{2\pi}} \exp\left( -\frac{x^{2}}{2} \right) \tag{13.22}$$

In Fig. 13.2 and all of the figures that follow, rather than show explicit probability relationships, the generalized notation f(x) has been used to indicate some functional dependence on x. A traditional way of characterizing communication efficiency in digital systems is in terms of the received $E_b/N_0$ in decibels. This $E_b/N_0$ description has become standard practice, but recall that there are no bits at the input to the demodulator; there are only waveforms that have been assigned bit meanings. The received $E_b/N_0$ represents a bit-apportionment of the arriving waveform energy. To solve for $P_E(M)$ in Eq. (13.20), we first need to compute the ratio of received symbol-energy to noise-power spectral density $E_s/N_0$. Since from Eq. (13.18)

$$\frac{E_b}{N_0} = 13.2\ \mathrm{dB}\ (\text{or } 20.89)$$

and because each symbol is made up of $\log_2 M$ bits, we compute the following using M = 8:

$$\frac{E_s}{N_0} = (\log_2 M) \frac{E_b}{N_0} = 3 \times 20.89 = 62.67 \tag{13.23}$$

Using the results of Eq. (13.23) in Eq. (13.20) yields the symbol-error probability $P_E = 2.2 \times 10^{-5}$. To transform this to bit-error probability, we use the relationship between bit-error probability $P_B$ and symbol-error probability $P_E$ for multiple-phase signalling [8], for $P_E \ll 1$, as follows:

$$P_B \cong \frac{P_E}{\log_2 M} = \frac{P_E}{m} \tag{13.24}$$

which is a good approximation when Gray coding is used for the bit-to-symbol assignment [6]. This last computation yields $P_B = 7.3 \times 10^{-6}$, which meets the required bit-error performance. No error-correction coding is necessary, and 8-PSK modulation represents the design choice to meet the requirements of the bandwidth-limited channel, which we had predicted by examining the required $E_b/N_0$ values in Table 13.1.
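The Example 1 calculation can be repeated numerically as a sanity check. The sketch below implements Q(x) through the standard identity Q(x) = ½ erfc(x/√2) and then applies Eqs. (13.20), (13.23), and (13.24); the function names are hypothetical. Evaluating Q(x) exactly rather than from a chart or the approximation of Eq. (13.22) may give figures that differ somewhat from the rounded values quoted in the text, but the conclusion can be checked against the $10^{-5}$ requirement in the same way.

```python
import math


def q_function(x: float) -> float:
    """Gaussian tail probability, Eq. (13.21), via Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def mpsk_symbol_error(es_over_n0: float, alphabet_size: int) -> float:
    """Eq. (13.20): approximate MPSK symbol-error probability for M > 2."""
    return 2.0 * q_function(math.sqrt(2.0 * es_over_n0) * math.sin(math.pi / alphabet_size))


def mpsk_bit_error(symbol_error: float, alphabet_size: int) -> float:
    """Eq. (13.24): Gray-coded bit-error probability, valid for small symbol-error rates."""
    return symbol_error / math.log2(alphabet_size)


if __name__ == "__main__":
    M = 8
    eb_over_n0 = 20.89                       # 13.2 dB, from Eq. (13.18)
    es_over_n0 = math.log2(M) * eb_over_n0   # Eq. (13.23)
    pe = mpsk_symbol_error(es_over_n0, M)
    pb = mpsk_bit_error(pe, M)
    print(f"Es/N0 = {es_over_n0:.2f}, PE = {pe:.2e}, PB = {pb:.2e}")
    print("Requirement PB <= 1e-5 met:", pb <= 1e-5)
```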

13.4

Example 2: Power-Limited Uncoded System

Now, suppose that we have exactly the same data rate and bit-error probability requirements as in Example 1, but let the available bandwidth W be equal to 45 kHz, and the available S/N0 be equal to 48 dB-Hz. The goal is to choose a modulation or modulation/coding scheme that yields the required performance. We shall again find that error-correction coding is not required.

13.4.1

Solution to Example 2

The channel is clearly not bandwidth limited since the available bandwidth of 45 kHz is more than adequate for supporting the required data rate of 9600 bit/s. We find the received $E_b/N_0$ from Eq. (13.18), as follows:

$$\frac{E_b}{N_0}\ (\mathrm{dB}) = 48\ \text{dB-Hz} - (10 \times \log_{10} 9600)\ \text{dB-b/s} = 8.2\ \mathrm{dB}\ (\text{or } 6.61) \tag{13.25}$$

Since there is abundant bandwidth but a relatively small $E_b/N_0$ for the required bit-error probability, we consider that this channel is power limited and choose MFSK as the modulation scheme. To conserve power, we search for the largest possible M such that the MFSK minimum bandwidth is not expanded beyond our available bandwidth of 45 kHz. A search results in the choice of M = 16 (Table 13.1). Next, we determine whether the required error performance of $P_B \le 10^{-5}$ can be met by using 16-FSK alone, i.e., without error-correction coding. Table 13.1 shows that 16-FSK alone meets the requirements, since the required $E_b/N_0$ listed for 16-FSK is less than the received $E_b/N_0$ derived in Eq. (13.25). Let us imagine again that we do not have Table 13.1, and evaluate whether or not error-correction coding is necessary.

The block diagram in Fig. 13.2 summarizes the relationships between symbol rate $R_s$ and bit rate R, and between $E_s/N_0$ and $E_b/N_0$, which are identical to the respective relationships in Example 1. The 16-FSK demodulator receives a waveform (one of 16 possible frequencies) during each symbol time interval $T_s$. For noncoherent orthogonal MFSK, the probability that the demodulator makes a symbol error $P_E(M)$ is approximated by the following upper bound [20]:

$$P_E(M) \le \frac{M - 1}{2} \exp\left( -\frac{E_s}{2N_0} \right) \tag{13.26}$$

To solve for $P_E(M)$ in Eq. (13.26), we compute $E_s/N_0$, as in Example 1. Using the results of Eq. (13.25) in Eq. (13.23), with M = 16, we get

$$\frac{E_s}{N_0} = (\log_2 M) \frac{E_b}{N_0} = 4 \times 6.61 = 26.44 \tag{13.27}$$

Next, using the results of Eq. (13.27) in Eq. (13.26) yields the symbol-error probability $P_E = 1.4 \times 10^{-5}$. To transform this to bit-error probability $P_B$, we use the relationship between $P_B$ and $P_E$ for orthogonal signalling [20], given by

$$P_B = \frac{2^{m-1}}{2^{m} - 1} P_E \tag{13.28}$$

This last computation yields $P_B = 7.3 \times 10^{-6}$, which meets the required bit-error performance. Thus, we can meet the given specifications for this power-limited channel by using 16-FSK modulation, without any need for error-correction coding, as we had predicted by examining the required $E_b/N_0$ values in Table 13.1.
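Example 2 can be reproduced in a few lines: first the search for the largest M whose MFSK minimum bandwidth, M·R / log2 M, fits the channel, then the bound of Eq. (13.26) and the conversion of Eq. (13.28). This is a sketch with hypothetical function names and an arbitrary cap on the alphabet size; it uses the rounded value $E_b/N_0$ = 6.61 from Eq. (13.25).

```python
import math
from typing import Optional


def largest_mfsk_order(bit_rate: float, bandwidth_hz: float, max_bits: int = 10) -> Optional[int]:
    """Largest M = 2^m whose MFSK minimum bandwidth M * R / m fits in bandwidth_hz."""
    best = None
    for m in range(1, max_bits + 1):
        alphabet_size = 2 ** m
        if alphabet_size * bit_rate / m <= bandwidth_hz:
            best = alphabet_size
    return best


def mfsk_symbol_error_bound(es_over_n0: float, alphabet_size: int) -> float:
    """Eq. (13.26): upper bound for noncoherent orthogonal MFSK."""
    return 0.5 * (alphabet_size - 1) * math.exp(-es_over_n0 / 2.0)


def orthogonal_bit_error(symbol_error: float, bits_per_symbol: int) -> float:
    """Eq. (13.28): PB = 2^(m-1) / (2^m - 1) * PE for orthogonal signalling."""
    return (2 ** (bits_per_symbol - 1)) / (2 ** bits_per_symbol - 1) * symbol_error


if __name__ == "__main__":
    M = largest_mfsk_order(9600, 45_000)
    m = int(math.log2(M))
    es_over_n0 = m * 6.61                    # Eq. (13.27)
    pe = mfsk_symbol_error_bound(es_over_n0, M)
    pb = orthogonal_bit_error(pe, m)
    print(f"M = {M}, Es/N0 = {es_over_n0:.2f}, PE <= {pe:.2e}, PB <= {pb:.2e}")
```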

13.5

Example 3: Bandwidth-Limited and Power-Limited Coded System

We start with the same channel parameters as in Example 1 (W = 4000 Hz, $S/N_0$ = 53 dB-Hz, and R = 9600 b/s), with one exception. In this example, we specify that $P_B$ must be at most $10^{-9}$. Table 13.1 shows that the system is both bandwidth limited and power limited, based on the available bandwidth of 4000 Hz and the available $E_b/N_0$ of 13.2 dB, from Eq. (13.18); 8-PSK is the only possible choice to meet the bandwidth constraint; however, the available $E_b/N_0$ of 13.2 dB is certainly insufficient to meet the required $P_B$ of $10^{-9}$. For this small value of $P_B$, we need to consider the performance improvement that error-correction coding can provide within the available bandwidth. In general, one can use convolutional codes or block codes. The Bose–Chaudhuri–Hocquenghem (BCH) codes form a large class of powerful error-correcting cyclic (block) codes [7]. To simplify the explanation, we shall choose a block code from the BCH family. Table 13.2 presents a partial catalog of the available BCH codes in terms of n, k, and t, where k represents the number of information (or data) bits that the code transforms into a longer block of n coded bits (or channel bits), and t represents the largest number of incorrect channel bits that the code can correct within each n-sized block. The rate of a code is defined as the ratio k/n; its inverse represents a measure of the code’s redundancy [7].

TABLE 13.2 BCH Codes (Partial Catalog)

  n      k     t
  7      4     1
 15     11     1
         7     2
         5     3
 31     26     1
        21     2
        16     3
        11     5
 63     57     1
        51     2
        45     3
        39     4
        36     5
        30     6
127    120     1
       113     2
       106     3
        99     4
        92     5
        85     6
        78     7
        71     9
        64    10

13.5.1

Solution to Example 3

Since this example has the same bandwidth-limited parameters given in Example 1, we start with the same 8-PSK modulation used to meet the stated bandwidth constraint. We now employ error-correction coding, however, so that the bit-error probability can be lowered to PB ≤ 10^{-9}. To make the optimum code selection from Table 13.2, we are guided by the following goals.

1. The output bit-error probability of the combined modulation/coding system must meet the system error requirement.
2. The rate of the code must not expand the required transmission bandwidth beyond the available channel bandwidth.
3. The code should be as simple as possible. Generally, the shorter the code, the simpler will be its implementation.

The uncoded 8-PSK minimum bandwidth requirement is 3200 Hz (Table 13.1) and the allowable channel bandwidth is 4000 Hz, so the uncoded signal bandwidth can be increased by no more than a factor of 1.25 (i.e., an expansion of 25%). The very first step in this (simplified) code selection example is to eliminate the candidates in Table 13.2 that would expand the bandwidth by more than 25%. The remaining entries form a much reduced set of bandwidth-compatible codes (Table 13.3). In Table 13.3, a column designated Coding Gain G (for MPSK at PB = 10^{-9}) has been added. Coding gain in decibels is defined as follows:

$$G\,(\mathrm{dB}) = \left(\frac{E_b}{N_0}\right)_{\mathrm{uncoded}}(\mathrm{dB}) - \left(\frac{E_b}{N_0}\right)_{\mathrm{coded}}(\mathrm{dB}) \tag{13.29}$$

G can be described as the reduction in the required Eb/N0 (in decibels) that the error-performance properties of the channel coding make possible. G is a function of the modulation type and bit-error probability, and it has been computed for MPSK at PB = 10^{-9} (Table 13.3). For MPSK modulation, G is relatively independent of the value of M. Thus, for a particular bit-error probability, a given code will provide about the same coding gain when used with any of the MPSK modulation schemes. Coding gains were calculated using a procedure outlined in the subsequent Calculating Coding Gain section.

TABLE 13.3 Bandwidth-Compatible BCH Codes

n      k      t      Coding Gain, G (dB), MPSK, PB = 10^{-9}
31     26     1      2.0
63     57     1      2.2
63     51     2      3.1
127    120    1      2.2
127    113    2      3.3
127    106    3      3.9

A block diagram summarizes this system, which contains both modulation and coding (Fig. 13.3). The introduction of encoder/decoder blocks brings about additional transformations. The relationships that exist when transforming from R b/s to Rc channel-b/s to Rs symbol/s are shown at the encoder/modulator. Regarding the channel-bit rate Rc, some authors prefer to use the units of channel-symbol/s (or code-symbol/s). The benefit is that error-correction coding is often described more efficiently with nonbinary digits. We reserve the term symbol for that group of bits mapped onto an electrical waveform for transmission, and we designate the units of Rc to be channel-b/s (or coded-b/s).
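
The bandwidth screening that reduces Table 13.2 to Table 13.3 can be sketched in a few lines of Python. This is only an illustrative sketch: the names `BCH_CODES` and `bandwidth_compatible` are hypothetical, and the code assumes, as the text does, that the transmission bandwidth grows in proportion to n/k.

```python
# Partial BCH catalog transcribed from Table 13.2 as (n, k, t) tuples.
BCH_CODES = [
    (7, 4, 1),
    (15, 11, 1), (15, 7, 2), (15, 5, 3),
    (31, 26, 1), (31, 21, 2), (31, 16, 3), (31, 11, 5),
    (63, 57, 1), (63, 51, 2), (63, 45, 3), (63, 39, 4), (63, 36, 5), (63, 30, 6),
    (127, 120, 1), (127, 113, 2), (127, 106, 3), (127, 99, 4), (127, 92, 5),
    (127, 85, 6), (127, 78, 7), (127, 71, 9), (127, 64, 10),
]


def bandwidth_compatible(codes, uncoded_bw_hz, channel_bw_hz):
    """Keep the codes whose bandwidth expansion n/k fits in the channel.

    Assumption: the coded signal needs n/k times the uncoded bandwidth.
    """
    max_expansion = channel_bw_hz / uncoded_bw_hz  # 1.25 in Example 3
    return [(n, k, t) for (n, k, t) in codes if n / k <= max_expansion]


# Example 3: uncoded 8-PSK needs 3200 Hz and the channel offers 4000 Hz.
compatible = bandwidth_compatible(BCH_CODES, uncoded_bw_hz=3200.0, channel_bw_hz=4000.0)
for n, k, t in compatible:
    print(f"({n}, {k}) t={t} expansion={n / k:.3f}")
```

The six surviving entries are the (n, k, t) rows of Table 13.3; the coding-gain column is not computed here.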

FIGURE 13.3: MODEM with channel coding.

We assume that our communication system cannot tolerate any message delay, so that the channel-bit rate Rc must exceed the data-bit rate R by the factor n/k. Further, each symbol is made up of log2 M channel bits, and so the symbol rate Rs is less than Rc by the factor log2 M. For a system containing both modulation and coding, we summarize the rate transformations as follows:

$$R_c = \left(\frac{n}{k}\right) R \tag{13.30}$$

$$R_s = \frac{R_c}{\log_2 M} \tag{13.31}$$

At the demodulator/decoder in Fig. 13.3, the transformations among data-bit energy, channel-bit energy, and symbol energy are related (in a reciprocal fashion) by the same factors as shown among the rate transformations in Eqs. (13.30) and (13.31). Since the encoding transformation has replaced k data bits with n channel bits, the ratio of channel-bit energy to noise-power spectral density Ec/N0 is computed by reducing the value of Eb/N0 by the factor k/n. Also, since each transmission symbol is made up of log2 M channel bits, Es/N0, which is needed in Eq. (13.20) to solve for PE, is computed by increasing Ec/N0 by the factor log2 M. For a system containing both modulation and coding, we summarize the energy to noise-power spectral density transformations as follows:

$$\frac{E_c}{N_0} = \left(\frac{k}{n}\right)\frac{E_b}{N_0} \tag{13.32}$$

$$\frac{E_s}{N_0} = (\log_2 M)\,\frac{E_c}{N_0} \tag{13.33}$$

Using Eqs. (13.30) and (13.31), we can now expand the expression for S/N0 in Eq. (13.19), as follows (Appendix).

$$\frac{S}{N_0} = \frac{E_b}{N_0} R = \frac{E_c}{N_0} R_c = \frac{E_s}{N_0} R_s \tag{13.34}$$

As before, a standard way of describing the link is in terms of the received Eb/N0 in decibels. However, there are no data bits at the input to the demodulator, and there are no channel bits; there are only waveforms that have bit meanings and, thus, the waveforms can be described in terms of bit-energy apportionments. Since S/N0 and R were given as 53 dB-Hz and 9600 b/s, respectively, we find as before, from Eq. (13.18), that the received Eb/N0 = 13.2 dB. The received Eb/N0 is fixed and independent of n, k, and t (Appendix).

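
As a numerical check on Eqs. (13.30) through (13.34), the following Python sketch applies the rate and energy transformations to the Example 3 parameters with the (63, 51) code and 8-PSK. The function and variable names are hypothetical, chosen only for illustration. Because the sketch works from the exact value of 53 dB-Hz rather than rounded intermediate values, its Eb/N0 comes out near 13.18 dB, which the text rounds to 13.2 dB.

```python
import math


def db_to_ratio(db):
    """Convert decibels to a linear power ratio."""
    return 10.0 ** (db / 10.0)


def ratio_to_db(ratio):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(ratio)


def link_transformations(s_over_n0_dbhz, data_rate, n, k, M):
    """Apply the rate and energy transformations of Eqs. (13.30)-(13.33)."""
    bits_per_symbol = math.log2(M)
    s_over_n0 = db_to_ratio(s_over_n0_dbhz)
    rc = (n / k) * data_rate            # Eq. (13.30): channel-bit rate
    rs = rc / bits_per_symbol           # Eq. (13.31): symbol rate
    eb_n0 = s_over_n0 / data_rate       # from S/N0 = (Eb/N0) R
    ec_n0 = (k / n) * eb_n0             # Eq. (13.32)
    es_n0 = bits_per_symbol * ec_n0     # Eq. (13.33)
    return {"S/N0": s_over_n0, "Rc": rc, "Rs": rs,
            "Eb/N0": eb_n0, "Ec/N0": ec_n0, "Es/N0": es_n0}


link = link_transformations(53.0, 9600.0, n=63, k=51, M=8)
print(f"Rc = {link['Rc']:.0f} channel-b/s, Rs = {link['Rs']:.0f} symbol/s")
print(f"Eb/N0 = {ratio_to_db(link['Eb/N0']):.2f} dB")
# Eq. (13.34): each energy ratio times its own rate gives back S/N0.
print(link["Ec/N0"] * link["Rc"], link["Es/N0"] * link["Rs"], link["S/N0"])
```

The last line illustrates the point of Eq. (13.34): however the received power is apportioned (per data bit, per channel bit, or per symbol), the product of energy ratio and rate is the same S/N0.
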
As we search Table 13.3 for the ideal code to meet the specifications, we can iteratively repeat the computations suggested in Fig. 13.3. It might be useful to program on a personal computer (or calculator) the following four steps as a function of n, k, and t. Step 1 starts by combining Eqs. (13.32) and (13.33), as follows.

Step 1:

$$\frac{E_s}{N_0} = (\log_2 M)\,\frac{E_c}{N_0} = (\log_2 M)\left(\frac{k}{n}\right)\frac{E_b}{N_0} \tag{13.35}$$

Step 2:

$$P_E(M) \cong 2Q\left[\sqrt{\frac{2E_s}{N_0}}\,\sin\left(\frac{\pi}{M}\right)\right] \tag{13.36}$$

which is the approximation for symbol-error probability PE rewritten from Eq. (13.20). At each symbol-time interval, the demodulator makes a symbol decision, but it delivers a channel-bit sequence representing that symbol to the decoder. When the channel-bit output of the demodulator is quantized to two levels, 1 and 0, the demodulator is said to make hard decisions. When the output is quantized to more than two levels, the demodulator is said to make soft decisions [15]. Throughout this paper, we shall assume hard-decision demodulation.

Now that we have a decoder block in the system, we designate the channel-bit-error probability out of the demodulator and into the decoder as pc, and we reserve the notation PB for the bit-error probability out of the decoder. We rewrite Eq. (13.24) in terms of pc for PE ≪ 1 as follows.

Step 3:

$$p_c \cong \frac{P_E}{\log_2 M} = \frac{P_E}{m} \tag{13.37}$$

relating the channel-bit-error probability to the symbol-error probability out of the demodulator, assuming Gray coding, as referenced in Eq. (13.24). For traditional channel-coding schemes and a given value of received S/N0, the value of Es/N0 with coding will always be less than the value of Es/N0 without coding. Since the demodulator with coding receives less Es/N0, it makes more errors! When coding is used, however, the system error performance depends not only on the performance of the demodulator but also on the performance of the decoder. For error-performance improvement due to coding, the decoder must provide enough error correction to more than compensate for the poor performance of the demodulator. The final output decoded bit-error probability PB depends on the particular code, the decoder, and the channel-bit-error probability pc. It can be expressed by the following approximation [11].

Step 4:

$$P_B \cong \frac{1}{n}\sum_{j=t+1}^{n} j\binom{n}{j}\,p_c^{\,j}\,(1-p_c)^{n-j} \tag{13.38}$$

where t is the largest number of channel bits that the code can correct within each block of n bits. Using Eqs. (13.35–13.38) in the four steps, we can compute the decoded bit-error probability PB as a function of n, k, and t for each of the codes listed in Table 13.3. The entry that meets the stated error requirement with the largest possible code rate and the smallest value of n is the double-error correcting (63, 51) code. The computations are as follows.

Step 1:

$$\frac{E_s}{N_0} = 3\left(\frac{51}{63}\right)20.89 = 50.73$$

where M = 8, and the received Eb/N0 = 13.2 dB (or 20.89).

Step 2:

$$P_E \cong 2Q\left[\sqrt{101.5}\times\sin\left(\frac{\pi}{8}\right)\right] = 2Q(3.86) = 1.2\times10^{-4}$$

Step 3:

$$p_c \cong \frac{1.2\times10^{-4}}{3} = 4\times10^{-5}$$

Step 4:

$$P_B \cong \frac{3}{63}\binom{63}{3}\left(4\times10^{-5}\right)^{3}\left(1-4\times10^{-5}\right)^{60} + \frac{4}{63}\binom{63}{4}\left(4\times10^{-5}\right)^{4}\left(1-4\times10^{-5}\right)^{59} + \cdots = 1.2\times10^{-10}$$

where the bit-error-correcting capability of the code is t = 2. For the computation of PB in step 4, we need only consider the first two terms in the summation of Eq. (13.38) since the other terms have a vanishingly small effect on the result. Now that we have selected the (63, 51) code, we can compute the values of channel-bit rate Rc and symbol rate Rs using Eqs. (13.30) and (13.31), with M = 8,

$$R_c = \left(\frac{n}{k}\right)R = \left(\frac{63}{51}\right)9600 \approx 11{,}859\ \text{channel-b/s}$$

$$R_s = \frac{R_c}{\log_2 M} = \frac{11859}{3} = 3953\ \text{symbol/s}$$
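
The text suggests programming the four steps as a function of n, k, and t. One possible way to do so is sketched below in Python. The function names are hypothetical, and two choices are assumptions of this sketch rather than the text's procedure: Q(x) is evaluated exactly through the complementary error function, and the full sum in Eq. (13.38) is computed instead of its first two terms. The resulting PB for the (63, 51) code is therefore close to, but not identical with, the hand-computed 1.2 × 10^{-10}.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x), computed from erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def decoded_bit_error_probability(n, k, t, eb_n0, M):
    """Four-step estimate of the decoded bit-error probability for MPSK."""
    m = math.log2(M)
    es_n0 = m * (k / n) * eb_n0                                  # Step 1, Eq. (13.35)
    p_e = 2.0 * q_function(math.sqrt(2.0 * es_n0) * math.sin(math.pi / M))  # Step 2, Eq. (13.36)
    p_c = p_e / m                                                # Step 3, Eq. (13.37)
    total = sum(j * math.comb(n, j) * p_c ** j * (1.0 - p_c) ** (n - j)
                for j in range(t + 1, n + 1))
    return total / n                                             # Step 4, Eq. (13.38)


# Bandwidth-compatible codes from Table 13.3 as (n, k, t).
TABLE_13_3 = [(31, 26, 1), (63, 57, 1), (63, 51, 2),
              (127, 120, 1), (127, 113, 2), (127, 106, 3)]

eb_n0 = 10.0 ** (13.2 / 10.0)   # received Eb/N0 of 13.2 dB as a ratio
required_pb = 1.0e-9

passing = [(n, k, t) for (n, k, t) in TABLE_13_3
           if decoded_bit_error_probability(n, k, t, eb_n0, M=8) <= required_pb]
# Goal 3 (simplicity): prefer the shortest code; break ties by higher rate.
best = min(passing, key=lambda code: (code[0], -code[1] / code[0]))
pb_best = decoded_bit_error_probability(*best, eb_n0, M=8)
print(best, pb_best)
```

Under these assumptions the single-error-correcting codes in Table 13.3 miss the 10^{-9} requirement, and the shortest code that meets it is the (63, 51) code selected in the text.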

13.5.2

Calculating Coding Gain

Perhaps a more direct way of finding the simplest code that meets the specified error performance is to first compute how much coding gain G is required in order to yield PB = 10^{-9} when using 8-PSK modulation alone; then, from Table 13.3, we can simply choose the code that provides this performance improvement. First, we find the uncoded Es/N0 that yields an error probability of PB = 10^{-9}, by writing from Eqs. (13.24) and (13.36), the following:

$$P_B \cong \frac{P_E}{\log_2 M} \cong \frac{2Q\left[\sqrt{\dfrac{2E_s}{N_0}}\,\sin\left(\dfrac{\pi}{M}\right)\right]}{\log_2 M} = 10^{-9} \tag{13.39}$$

At this low value of bit-error probability, it is valid to use Eq. (13.22) to approximate Q(x) in Eq. (13.39). By trial and error (on a programmable calculator), we find that the uncoded Es/N0 = 120.67 = 20.8 dB, and since each symbol is made up of log2 8 = 3 bits, the required (Eb/N0)uncoded = 120.67/3 = 40.22 = 16 dB. From the given parameters and Eq. (13.18), we know that the received (Eb/N0)coded = 13.2 dB. Using Eq. (13.29), the required coding gain to meet the bit-error performance of PB = 10^{-9} in decibels is

$$G = \left(\frac{E_b}{N_0}\right)_{\mathrm{uncoded}} - \left(\frac{E_b}{N_0}\right)_{\mathrm{coded}} = 16 - 13.2 = 2.8$$

To be precise, each of the Eb/N0 values in the preceding computation must correspond to exactly the same value of bit-error probability (which they do not). They correspond to PB = 10^{-9} and PB = 1.2 × 10^{-10}, respectively. At these low probability values, however, even with such a discrepancy, this computation still provides a good approximation of the required coding gain. In searching Table 13.3 for the simplest code that will yield a coding gain of at least 2.8 dB, we see that the choice is the (63, 51) code, which corresponds to the same code choice that we made earlier.
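
The trial-and-error search for the uncoded Es/N0 can also be automated. The Python sketch below replaces the calculator search with a bisection on Eq. (13.39); the function names and the search bracket are hypothetical choices made for illustration. It evaluates Q(x) exactly through the complementary error function rather than through the approximation of Eq. (13.22), so it finds an Es/N0 slightly below the 120.67 quoted in the text, while the required coding gain still rounds to about 2.8 dB.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x), computed from erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def uncoded_mpsk_ber(es_n0, M):
    """Uncoded MPSK bit-error probability, per Eq. (13.39)."""
    p_e = 2.0 * q_function(math.sqrt(2.0 * es_n0) * math.sin(math.pi / M))
    return p_e / math.log2(M)


def required_es_n0(target_ber, M, low=1.0, high=1.0e4, iterations=200):
    """Bisect (geometrically) for the Es/N0 at which the BER equals the target.

    The BER falls monotonically as Es/N0 rises, so bisection is safe
    as long as the bracket [low, high] contains the solution.
    """
    for _ in range(iterations):
        mid = math.sqrt(low * high)
        if uncoded_mpsk_ber(mid, M) > target_ber:
            low = mid      # still too many errors: need more Es/N0
        else:
            high = mid
    return high


M = 8
es_n0 = required_es_n0(1.0e-9, M)
eb_n0_uncoded_db = 10.0 * math.log10(es_n0 / math.log2(M))
eb_n0_coded_db = 13.2                                   # received Eb/N0, Eq. (13.18)
required_gain_db = eb_n0_uncoded_db - eb_n0_coded_db    # Eq. (13.29)
print(f"uncoded Eb/N0 = {eb_n0_uncoded_db:.2f} dB, required G = {required_gain_db:.2f} dB")
```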

13.6

Example 4: Direct-Sequence (DS) Spread-Spectrum Coded System

Spread-spectrum systems are not usually classified as being bandwidth- or power-limited. They are generally perceived to be power-limited systems, however, because the bandwidth occupancy of the information is much larger than the bandwidth that is intrinsically needed for the information transmission. In a direct-sequence spread-spectrum (DS/SS) system, spreading the signal bandwidth by some factor permits lowering the signal-power spectral density by the same factor (the total average signal power is the same as before spreading). The bandwidth spreading is typically accomplished by multiplying a relatively narrowband data signal by a wideband spreading signal. The spreading signal or spreading code is often referred to as a pseudorandom code or PN code.

13.6.1

Processing Gain

A typical DS/SS radio system is often described as a two-step BPSK modulation process. In the first step, the carrier wave is modulated by a bipolar data waveform having a value +1 or −1 during each data-bit duration; in the second step, the output of the first step is multiplied (modulated) by a bipolar PN-code waveform having a value +1 or −1 during each PN-code-bit duration. In reality, DS/SS systems are usually implemented by first multiplying the data waveform by the PN-code waveform and then making a single pass through a BPSK modulator. For this example, however, it is useful to characterize the modulation process in two separate steps—the outer modulator/demodulator for the data, and the inner modulator/demodulator for the PN code (Fig. 13.4).

FIGURE 13.4: Direct-sequence spread-spectrum MODEM with channel coding.

A spread-spectrum system is characterized by a processing gain Gp that is defined in terms of the spread-spectrum bandwidth Wss and the data rate R as follows [20]:

$$G_p = \frac{W_{ss}}{R} \tag{13.40}$$

For a DS/SS system, the PN-code bit has been given the name chip, and the spread-spectrum signal bandwidth can be shown to be about equal to the chip rate Rch, so that:

$$G_p = \frac{R_{ch}}{R} \tag{13.41}$$

Some authors define processing gain to be the ratio of the spread-spectrum bandwidth to the symbol rate. This definition separates the system performance that is due to bandwidth spreading from the performance that is due to error-correction coding. Since we ultimately want to relate all of the coding mechanisms to the information source, we shall conform to the most commonly accepted definition for processing gain, as expressed in Eqs. (13.40) and (13.41).

A spread-spectrum system can be used for interference rejection and for multiple access (allowing multiple users to access a communications resource simultaneously). The benefits of DS/SS signals are best achieved when the processing gain is very large; in other words, the chip rate of the spreading (or PN) code is much larger than the data rate. In such systems, the large value of Gp allows the signalling chips to be transmitted at a power level well below that of the thermal noise. We will use a value of Gp = 1000. At the receiver, the despreading operation correlates the incoming signal with a synchronized copy of the PN code and, thus, accumulates the energy from multiple (Gp) chips to yield the energy per data bit. The value of Gp has a major influence on the performance of the spread-spectrum system application. We shall see, however, that the value of Gp has no effect on the received Eb/N0. In other words, spread-spectrum techniques offer no error-performance advantage over thermal noise. For DS/SS systems, there is no disadvantage either! Sometimes such spread-spectrum radio systems are employed only to enable the transmission of very small power-spectral densities and thus avoid the need for FCC licensing [16].

13.6.2

Channel Parameters for Example 4

Consider a DS/SS radio system that uses the same (63, 51) code as in the previous example. Instead of using MPSK for the data modulation, we shall use BPSK. Also, we shall use BPSK for modulating the PN-code chips. Let the received S/N0 = 48 dB-Hz, the data rate R = 9600 b/s, and the required PB ≤ 10^{-6}. For simplicity, assume that there are no bandwidth constraints. Our task is simply to determine whether or not the required error performance can be achieved using the given system architecture and design parameters. In evaluating the system, we will use the same type of transformations used in the previous examples.

13.6.3

Solution to Example 4

A typical DS/SS system can be implemented more simply than the one shown in Fig. 13.4. The data and the PN code would be combined at baseband, followed by a single pass through a BPSK modulator. We will, however, assume the existence of the individual blocks in Fig. 13.4 because they enhance our understanding of the transformation process. The relationships in transforming from data bits, to channel bits, to symbols, and to chips (Fig. 13.4) have the same pattern of subtle but straightforward transformations in rates and energies as previous relationships (Figs. 13.2 and 13.3). The values of Rc, Rs, and Rch can now be calculated immediately since the (63, 51) BCH code has already been selected. From Eq. (13.30) we write

$$R_c = \left(\frac{n}{k}\right)R = \left(\frac{63}{51}\right)9600 \approx 11{,}859\ \text{channel-b/s}$$

Since the data modulation considered here is BPSK, from Eq. (13.31) we write

$$R_s = R_c \approx 11{,}859\ \text{symbol/s}$$

and from Eq. (13.41), with an assumed value of Gp = 1000,

$$R_{ch} = G_p R = 1000 \times 9600 = 9.6\times10^{6}\ \text{chip/s}$$

Since we have been given the same S/N0 and the same data rate as in Example 2, we find the value of received Eb/N0 from Eq. (13.25) to be 8.2 dB (or 6.61). At the demodulator, we can now expand the expression for S/N0 in Eq. (13.34) and the Appendix as follows:

$$\frac{S}{N_0} = \frac{E_b}{N_0}R = \frac{E_c}{N_0}R_c = \frac{E_s}{N_0}R_s = \frac{E_{ch}}{N_0}R_{ch} \tag{13.42}$$

Corresponding to each transformed entity (data bit, channel bit, symbol, or chip) there is a change in rate and, similarly, a reciprocal change in energy-to-noise spectral density for that received entity. Equation (13.42) is valid for any such transformation when the rate and energy are modified in a reciprocal way. There is a kind of conservation of power (or energy) phenomenon that exists in the transformations. The total received average power (or total received energy per symbol duration) is fixed regardless of how it is computed, on the basis of data bits, channel bits, symbols, or chips. The ratio Ech/N0 is much lower in value than Eb/N0. This can be seen from Eqs. (13.42) and (13.41), as follows:

$$\frac{E_{ch}}{N_0} = \frac{S}{N_0}\left(\frac{1}{R_{ch}}\right) = \frac{S}{N_0}\left(\frac{1}{G_p R}\right) = \left(\frac{1}{G_p}\right)\frac{E_b}{N_0} \tag{13.43}$$

But, even so, the despreading function (when properly synchronized) accumulates the energy contained in a quantity Gp of the chips, yielding the same value Eb/N0 = 8.2 dB, as was computed earlier from Eq. (13.25). Thus, the DS spreading transformation has no effect on the error performance of an AWGN channel [15], and the value of Gp has no bearing on the value of PB in this example. From Eq. (13.43), we can compute, in decibels,

$$\frac{E_{ch}}{N_0}\,(\mathrm{dB}) = \frac{E_b}{N_0}\,(\mathrm{dB}) - G_p\,(\mathrm{dB}) = 8.2 - 10\log_{10}1000 = -21.8 \tag{13.44}$$
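
The chip-level bookkeeping in Eqs. (13.41), (13.43), and (13.44) is simple enough to verify numerically. The short Python sketch below does so for the parameters of this example; the variable names are hypothetical and the received Eb/N0 of 8.2 dB is taken from the text rather than recomputed.

```python
import math

data_rate = 9600.0          # R in b/s
processing_gain = 1000.0    # Gp, the value assumed in this example
eb_n0_db = 8.2              # received Eb/N0 in dB, from the text

chip_rate = processing_gain * data_rate                # Eq. (13.41): Rch = Gp * R
gp_db = 10.0 * math.log10(processing_gain)             # Gp expressed in dB
ech_n0_db = eb_n0_db - gp_db                           # Eq. (13.44)

# Despreading accumulates the energy of Gp chips into one data bit,
# which restores the original Eb/N0 (Eq. (13.43) read in reverse).
ech_n0 = 10.0 ** (ech_n0_db / 10.0)
recovered_eb_n0_db = 10.0 * math.log10(processing_gain * ech_n0)

print(f"Rch = {chip_rate:.3g} chip/s")
print(f"Ech/N0 = {ech_n0_db:.1f} dB, despread Eb/N0 = {recovered_eb_n0_db:.1f} dB")
```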

The chosen value of processing gain (Gp = 1000) enables the DS/SS system to operate at a value of chip energy well below the thermal noise, with the same error performance as without spreading. Since BPSK is the data modulation selected in this example, each message symbol therefore corresponds to a single channel bit, and we can write

$$\frac{E_s}{N_0} = \frac{E_c}{N_0} = \left(\frac{k}{n}\right)\frac{E_b}{N_0} = \frac{51}{63}\times 6.61 = 5.35 \tag{13.45}$$

where the received Eb/N0 = 8.2 dB (or 6.61). Out of the BPSK data demodulator, the symbol-error probability PE (and the channel-bit error probability pc) is computed as follows [15]:

$$p_c = P_E = Q\left(\sqrt{\frac{2E_c}{N_0}}\right) \tag{13.46}$$

Using the results of Eq. (13.45) in Eq. (13.46) yields

$$p_c = Q(3.27) = 5.8\times10^{-4}$$

Finally, using this value of pc in Eq. (13.38) for the (63, 51) double-error correcting code yields the output bit-error probability of PB = 3.6 × 10^{-7}. We can, therefore, verify that for the given architecture and design parameters of this example the system does, in fact, achieve the required error performance.
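
The verification for Example 4 can be reproduced with a short Python sketch that chains Eqs. (13.45), (13.46), and (13.38). The names below are hypothetical. As an assumption of this sketch, Q(x) is evaluated exactly through the complementary error function, which gives a somewhat smaller pc and PB than the values quoted in the text (the quoted pc is consistent with an approximation to Q(x)); the conclusion that PB ≤ 10^{-6} is unchanged.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x), computed from erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def block_decoded_ber(n, t, p_c):
    """Decoded bit-error probability approximation of Eq. (13.38)."""
    total = sum(j * math.comb(n, j) * p_c ** j * (1.0 - p_c) ** (n - j)
                for j in range(t + 1, n + 1))
    return total / n


n, k, t = 63, 51, 2                      # the (63, 51) double-error correcting code
eb_n0 = 10.0 ** (8.2 / 10.0)             # received Eb/N0 of 8.2 dB as a ratio

ec_n0 = (k / n) * eb_n0                  # Eq. (13.45); equals Es/N0 for BPSK
p_c = q_function(math.sqrt(2.0 * ec_n0)) # Eq. (13.46)
p_b = block_decoded_ber(n, t, p_c)       # Eq. (13.38)
meets_requirement = p_b <= 1.0e-6

print(f"Ec/N0 = {ec_n0:.2f}, pc = {p_c:.2e}, PB = {p_b:.2e}, ok = {meets_requirement}")
```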

13.7

Conclusion

The goal of this section has been to review fundamental relationships used in evaluating the performance of digital communication systems. First, we described the concept of a link and a channel and examined a radio system from its transmitting segment up through the output of the receiving antenna. We then examined the concept of bandwidth-limited and power-limited systems and how such conditions influence the system design when the choices are confined to MPSK and MFSK modulation. Most important, we focused on the definitions and computations involved in transforming from data bits to channel bits to symbols to chips. In general, most digital communication systems share these concepts; thus, understanding them should enable one to evaluate other such systems in a similar way.

Appendix: Received Eb/N0 Is Independent of the Code Parameters

Starting with the basic concept that the received average signal power S is equal to the received symbol or waveform energy, Es, divided by the symbol-time duration, Ts (or multiplied by the symbol rate, Rs), we write

$$\frac{S}{N_0} = \frac{E_s/T_s}{N_0} = \frac{E_s}{N_0}R_s \tag{A13.1}$$

where N0 is noise-power spectral density. Using Eqs. (13.33) and (13.31), rewritten as

$$\frac{E_s}{N_0} = (\log_2 M)\,\frac{E_c}{N_0} \quad\text{and}\quad R_s = \frac{R_c}{\log_2 M}$$

let us make substitutions into Eq. (A13.1), which yields

$$\frac{S}{N_0} = \frac{E_c}{N_0}R_c \tag{A13.2}$$

Next, using Eqs. (13.32) and (13.30), rewritten as

$$\frac{E_c}{N_0} = \left(\frac{k}{n}\right)\frac{E_b}{N_0} \quad\text{and}\quad R_c = \left(\frac{n}{k}\right)R$$

let us now make substitutions into Eq. (A13.2), which yields the relationship expressed in Eq. (13.11)

$$\frac{S}{N_0} = \frac{E_b}{N_0}R \tag{A13.3}$$

Hence, the received Eb /N0 is only a function of the received S/N0 and the data rate R. It is independent of the code parameters, n, k, and t. These results are summarized in Fig. 13.3.

References

[1] Anderson, J.B. and Sundberg, C.-E.W., Advances in constant envelope coded modulation, IEEE Commun. Mag., 29(12), 36–45, 1991.
[2] Borjesson, P.O. and Sundberg, C.E., Simple approximations of the error function Q(x) for communications applications, IEEE Trans. Comm., COM-27, 639–642, Mar. 1979.
[3] Clark Jr., G.C. and Cain, J.B., Error-Correction Coding for Digital Communications, Plenum Press, New York, 1981.
[4] Hodges, M.R.L., The GSM radio interface, British Telecom Technol. J., 8(1), 31–43, 1990.
[5] Johnson, J.B., Thermal agitation of electricity in conductors, Phys. Rev., 32, 97–109, Jul. 1928.
[6] Korn, I., Digital Communications, Van Nostrand Reinhold Co., New York, 1985.
[7] Lin, S. and Costello Jr., D.J., Error Control Coding: Fundamentals and Applications, Prentice-Hall, Englewood Cliffs, NJ, 1983.
[8] Lindsey, W.C. and Simon, M.K., Telecommunication Systems Engineering, Prentice-Hall, Englewood Cliffs, NJ, 1973.
[9] Nyquist, H., Thermal agitation of electric charge in conductors, Phys. Rev., 32, 110–113, Jul. 1928.
[10] Nyquist, H., Certain topics on telegraph transmission theory, Trans. AIEE, 47, 617–644, Apr. 1928.
[11] Odenwalder, J.P., Error Control Coding Handbook, Linkabit Corp., San Diego, CA, Jul. 15, 1976.
[12] Shannon, C.E., A mathematical theory of communication, BSTJ, 27, 379–423, 623–657, 1948.
[13] Shannon, C.E., Communication in the presence of noise, Proc. IRE, 37(1), 10–21, 1949.
[14] Sklar, B., What the system link budget tells the system engineer or how I learned to count in decibels, Proc. of the Intl. Telemetering Conf., San Diego, CA, Nov. 1979.
[15] Sklar, B., Digital Communications: Fundamentals and Applications, Prentice-Hall, Englewood Cliffs, NJ, 1988.
[16] Title 47, Code of Federal Regulations, Part 15 Radio Frequency Devices.
[17] Ungerboeck, G., Trellis-coded modulation with redundant signal sets, Pt. I and II, IEEE Comm. Mag., 25, 5–21, Feb. 1987.
[18] Van Trees, H.L., Detection, Estimation, and Modulation Theory, Pt. I, John Wiley & Sons, New York, 1968.
[19] Viterbi, A.J., Principles of Coherent Communication, McGraw-Hill, New York, 1966.
[20] Viterbi, A.J., Spread spectrum communications: myths and realities, IEEE Comm. Mag., 11–18, May 1979.

Further Information

A useful compilation of selected papers can be found in: Cellular Radio & Personal Communications – A Book of Selected Readings, edited by Theodore S. Rappaport, Institute of Electrical and Electronics Engineers, Inc., Piscataway, New Jersey, 1995. Fundamental design issues, such as propagation, modulation, channel coding, speech coding, multiple-accessing and networking, are well represented in this volume.

Another useful sourcebook that covers the fundamentals of mobile communications in great detail is: Mobile Radio Communications, edited by Raymond Steele, Pentech Press, London, 1992. This volume is also available through the Institute of Electrical and Electronics Engineers, Inc., Piscataway, New Jersey.

For spread spectrum systems, an excellent reference is: Spread Spectrum Communications Handbook, by Marvin K. Simon, Jim K. Omura, Robert A. Scholtz, and Barry K. Levitt, McGraw-Hill Inc., New York, 1994.


Dimolitsas, S. & Onufry, M. “Telecommunications Standardization” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Telecommunications Standardization

Spiros Dimolitsas
Lawrence Livermore National Laboratory

Michael Onufry
COMSAT Laboratories

14.1 Introduction
14.2 Global Standardization
ITU-T • ITU-R • BDT • ISO/IEC JTC 1
14.3 Regional Standardization
CITEL
14.4 National Standardization
ANSI T1 • TIA • TTC
14.5 Intellectual Property
14.6 Standards Coordination
14.7 Scientific
14.8 Standards Development Cycle
Defining Terms
Further Information

14.1

Introduction

National economies are increasingly becoming information based, where networking and information transport provide a foundation for productivity and economic growth. Concurrently, many countries are rapidly adopting deregulation policies that are resulting in a telecommunications industry that is increasingly multicarrier and multivendor based, and where interconnectivity and compatibility between different networks is emerging as key to the success of this technological and regulatory transition. The communications industry has, consequently, become more interested in standardization; standards give manufacturers, service providers, and users freedom of choice at reasonable cost.

In this chapter, a review is provided of the primary telecommunications standards setting bodies. As will be seen, these bodies are often driven by slightly different underlying philosophies, but the output of their activities, i.e., the standards, possesses essentially the same characteristics. An all-encompassing review of standardization bodies is not attempted here; this would clearly take many volumes to describe. Furthermore, as country after country increasingly deregulates its telecommunication industry, new standards setting bodies emerge to fill the void left by the de facto (but no longer existing) standards setting bodies: the national telecommunications administrations.

The principal communications standards bodies that will be covered are the following: the International Telecommunications Union (ITU); the United States ANSI Committee T1 on Telecommunications; the Telecommunications Industry Association (TIA); the European Telecommunications Standards Institute (ETSI); the Inter-American Telecommunications Commission (CITEL); the Japanese Telecommunications Technology Committee (TTC); and the Institute of Electrical and Electronics Engineers (IEEE).

Not addressed explicitly are other standards setting bodies that are either national or regional in character, even though it is recognized that sometimes there is overlap in scope with the bodies explicitly covered here. Most notably, standards setting bodies that are not covered, but that are worth noting, include: the United States ANSI Committee X3; the International Standards Organization (ISO); the International Electrotechnical Commission (IEC) [except ISO/IEC joint technical committee (JTC) 1]; the Telecommunications Standards Advisory Council of Canada (TSACC); the Australian Telecommunications Standardization Committee (ATSC); the Telecommunication Technology Association (TTA) in Korea; and several forums (whose scope is, in principle, somewhat different) such as the asynchronous transfer mode (ATM) forum, the frame relay forum, the integrated services digital network (ISDN) users' forum, and Telocator. As will be described later, many of these bodies operate in a coherent fashion through a mechanism developed by the Interregional Telecommunications Standards Conference (ITSC) and its successor, the Global Standards Collaboration (GSC).

14.2

Global Standardization

When it comes to setting global communications standards, the ITU comes to the forefront. The ITU is an intergovernmental organization in which each sovereign state that is a member of the United Nations may become a member. Member governments (in most cases represented by their telecommunications administrations) are constitutional members with a right to vote. Other organizations, such as network and service providers, manufacturers, and scientific and industrial organizations, also participate in ITU activities but with a lower legal status. The ITU traces its history back to 1865 in the era of telegraphy. The supreme organ of the ITU is the plenipotentiary conference, which is held not less than every five years and plays a major role in the management of the ITU.

In 1993 the ITU, as a U.N.-specialized agency, was reorganized into three sectors (see Fig. 14.1): the telecommunications standardization sector (ITU-T), the radiocommunications sector (ITU-R), and the development sector (BDT). These sectors' activities are, respectively, standardization of telecommunications, including radio communications; regulation of telecommunications (mainly for radio communications); and development of telecommunications. It should be noted that, in general, the ITU-T is the successor of the international telephone and telegraph consultative committee (CCITT) of the ITU with additional responsibilities for standardization of network-related radio communications. Similarly, the ITU-R is the successor of the international radio consultative committee (CCIR) and the international frequency registration bureau (IFRB) of the ITU (after transferring some of its standardization activities to the ITU-T). The BDT is a new sector, which became operational in 1989.

14.2.1 ITU-T Within the ITU structure, standardization work is undertaken by a number of study groups (SG) dealing with specific areas of communications. There are currently 14 study groups, as shown in Table 14.1. Study groups develop standards for their respective work areas, which then have to be agreed upon by consensus—a process that for the time being is reserved to administrations only. The standards so developed are called recommendations to indicate their legal nonbinding nature. Technically, 1999 by CRC Press LLC

c

FIGURE 14.1: The ITU structure.

however, there is no distinction between recommendations developed by the ITU and standards developed by other standards setting bodies. The study groups’ work is undertaken by delegation members, sent or sponsored by their national administrations, and delegates from recognized private operating organizations (RPOA), scientific and industrial organizations (SIO) or international organizations. Because an ITU-T study group can typically have from 100 to more than 500 participating members and deal with 20–50 project standards, the work of each study group is often divided among working parties (WP). Such working parties are usually split further into experts’ groups led by a chair or “rapporteur” with responsibility for leading the work defined in an approved active question or subelement of a question. To coordinate standardization work that spans several study groups, two joint coordination groups (JCG) have also been established (not shown in Fig. 14.1): International Mobile Communications (IMT-2000) and Satellite Matters. Such groups do not have executive powers but are merely there to coordinate work of pervasive interest within the ITU-T sector. Also part of the ITU-T structure is the telecommunications standardization bureau (TSB) or, as it was formerly called, the CCITT secretariat. The TSB is responsible for the organization of numerous meetings held by the sector each year as well as all other support services required to 1999 by CRC Press LLC

c

TABLE 14.1 ITU-T Study Group Structure

SG 2   Network and service operation. Lead SG on Service definition, Numbering, Routing and Global Mobility
SG 3   Tariff and accounting principles
SG 4   TMN and network maintenance. Lead SG on Telecommunication management network (TMN) studies
SG 5   Protection against electromagnetic environmental effects
SG 6   Outside plant
SG 7   Data networks and open systems communications. Lead SG on Open Distributed Processing (ODP), Frame Relay and for Communications System Security
SG 8   Characteristics of telematic services. Lead SG on Facsimile
SG 9   Television and sound transmission
SG 10  Languages and general software aspects for telecommunications systems
SG 11  Signalling requirements and protocols. Lead SG on Intelligent Network and IMT-2000
SG 12  End-to-end transmission performance of networks and terminals
SG 13  General network aspects. Lead SG on General network aspects, Global Information Infrastructures and Broadband ISDN
SG 15  Transport networks, systems and equipment. Lead SG on Access Network Transport
SG 16  Transmission systems and equipment. Lead SG on Multimedia services and systems

ensure the smooth and efficient operation of the sector (including, but not limited to, document production and distribution). The TSB is headed by a director, who holds the executive power and, in collaboration with the study groups, bears full responsibility for the ITU-T activities. In this structure, unlike other U.N. organizations, the secretary general is the legal representative of the ITU, with the executive powers being vested in the director. Finally, the ITU-T is supported by an advisory group, the telecommunications standardization advisory group (TSAG), which, together with interested ITU members, the ITU-T director, and the ITU-T SG chairmen, guides standardization activities.

14.2.2 ITU-R

The radiocommunications sector emphasizes the regulatory and pure radio-interface aspects. The functional structure of the ITU-R currently includes eight study groups (shown in Table 14.2), a radiocommunications bureau, and an advisory board. The role of the latter two elements is very similar to that of their ITU-T counterparts and, thus, need not be repeated here. As within the ITU-T, there are areas of pervasive interest, and so areas of common interest can be found between the ITU-T and ITU-R where activities need to be coordinated. To achieve this objective, two intersector coordination groups (ICG) have been established (not shown in Fig. 14.1) dealing with international mobile telecommunications (IMT-2000) and satellite matters. Three major special activities have been organized within ITU-R:

• IMT-2000 (formerly known as Future Public Land Mobile Telecommunications Systems, FPLMTS). The objective of the International Mobile Telecommunications (IMT-2000) activity is to provide seamless satellite and terrestrial operation of mobile terminals


TABLE 14.2 ITU-R Study Group Structure

SG 1   Spectrum management
SG 3   Radio wave propagation
SG 4   Fixed satellite service
SG 7   Science services
SG 8   Mobile, radio determination, amateur and related satellite services
SG 9   Fixed service
SG 10  Broadcasting services: sound
SG 11  Broadcasting services: television

throughout the world, anywhere and anytime, where communication coverage requires interoperation of satellite and terrestrial networks. This is to be accomplished using technology available around the year 2000.
• Mobile-satellite and radionavigation-satellite service (MSS-RNSS). The rapid growth of service in these areas has created a need to focus attention on interference and spectrum allocation.
• Wireless Access Systems (WAS). This is an application of radio technology and personal communications systems directed toward lowering the installation and maintenance cost of the local access network. The traditional high cost has prevented penetration of basic telephone service in evolving and developing countries of the world. Overcoming this barrier is an objective of the BDT, described next.

14.2.3 BDT

Unlike the ITU-T (and to some extent ITU-R), which deals with standardization, the BDT deals with aspects that promote the integration and deployment of communications in developing countries. Typical outputs from this sector include implementation guides that expand the utility of ITU recommendations and ensure their expeditious implementation. Communications has been recognized as a necessary element for economic growth. The BDT also seeks to arrange special financing involving communication suppliers and governments or authorized carriers within developing countries to enable provision of basic communications service where otherwise it would not be possible.

14.2.4 ISO/IEC JTC 1

Two global organizations are active in the information processing systems area, the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), particularly through the Joint Technical Committee 1 (JTC 1). The ISO comprises national standards bodies, which have the responsibility for promoting and distributing ISO standards within their own countries. ISO technical work is carried out by some 200 technical committees (TC). Technical committees are established by the ISO council and their work program is approved by the technical board on behalf of the council. The IEC comprises national committees (one from each country) and deals with almost all spheres of electrotechnology, including power, electronics, telecommunications, and nuclear energy. IEC technical work is performed by some 200 TCs set up by its council and some 700 working groups. Part of this organization, a President’s Advisory Committee on future technology (PACT), advises the IEC president on new technologies which require preliminary or immediate standardization work.


PACT is designed to form a direct link with private and public research and development activities, keeping the IEC abreast of accelerating technological changes and the accompanying demand for new standards. Small industrial project teams examine new work initiatives which can be introduced into the regular IEC working structure. In 1987 a joint technical committee was established incorporating ISO TC97, IEC TC83, and subcommittee 47B to deal with generic information technology. The international standards developed by JTC 1 are published under the ISO and IEC logos. The activities of ISO/IEC/JTC 1 are listed in Table 14.3, expressed in terms of its subcommittees (SC).

TABLE 14.3 ISO/IEC/JTC 1 Subcommittees

SC 1   Vocabulary
SC 2   Coded character sets
SC 6   Telecommunications information exchange between systems
SC 7   Software engineering
SC 11  Flexible magnetic media for digital data interchange
SC 17  Identification cards and related devices
SC 22  Programming languages, their environments and systems software interfaces
SC 23  Optical disk cartridges for information interchange
SC 24  Computer graphics and image processing
SC 25  Interconnection information technology management
SC 26  Microprocessor systems
SC 27  IT security techniques
SC 28  Office equipment
SC 29  Coding of audio, picture, multimedia and hypermedia information
SC 31  Automatic data capture
SC 32  Data management services
SC 33  Distributed application services

The ISO and IEC jointly issue directions for the work of the technical committees. The scope (or area of activity) of each technical committee (TC)/subcommittee (SC) is defined by the TC/SC itself, and then submitted to the Committee of Action (CA)/parent TC for approval. The TCs/SCs prepare technical documents on specific subjects within their respective scopes, which are then submitted to the National Committees for voting with a view to their approval as international standards.

14.3 Regional Standardization

Today the ETSI comes closest to being a true regional standards setting body, together with CITEL, the regional (Latin-American) standardization body. ETSI is the result of the Single Act of the European Community and the EC commission green paper in 1987 that analyzed the consequences of the Single Act and recommended that a European telecommunications standards body be created to develop common standards for telecommunications equipment and networks. Out of this recommendation, the Committee for Harmonization (CCH) and the European Conference for Post and Telecommunications (CEPT) evolved into ETSI, which formally came into being in March 1988. It should be noted, however, that even though ETSI attributes at least part of its existence to the European Community, its membership is wider than just the European Union nations.


Because of the way ETSI came into being, ETSI is characterized by a unique aspect, namely, it is often called upon by the European Commission to develop standards that are necessary to implement legislation. Such standards, which are referred to as technical basis reports (TBR) and whose application is usually mandatory, are often needed in public procurements, as well as in provisioning for open network interconnection as national telecommunications administrations are being deregulated. Like ITU, however, ETSI also develops voluntary standards in accordance with common international understanding against which industry is not obliged to produce conforming products. These standards fall into either the European technical standard (ETS) class when fully approved, or into the interim-ETS class, when not fully stable or proven. ETSI standards are typically sought when either the subject matter is not studied at the global level (such as when it may be required to support some piece of legislation), or the development of the standard is justified by market needs that exist in Europe and not in other parts of the world. In some cases, it may be necessary to adapt ITU standards for the European continent, although a simple endorsement of an ITU standard as a European standard is also possible. A more delicate case arises when both the ITU and ETSI are pursuing parallel standards activities, in which case close coordination with the ITU is sought either through member countries that may input ETSI standards to the ITU for consideration or through the global standards collaboration process. The highest authority of ETSI is the general assembly, which determines ETSI’s policy, appoints its director and deputy, adopts the budget, and approves the audited accounts. The more technical issues are addressed by the technical assembly, which approves technical standards, advises on the work to be undertaken, and sets priorities. 
The ETSI technical committees are listed in Table 14.4.

TABLE 14.4 ETSI Technical Committees

TCEE                   Environmental engineering
TCHF                   Human factors
TCMTS                  Methods for testing and specification
TCSEC                  Security
TCSPS                  Signalling protocols and switching
TCTM                   Transmission and multiplexing
TCERM                  EMC and radio spectrum matters
TCICC                  Integrated circuit cards
TCNA                   Network aspects
TCSES                  Satellite earth stations and systems
TCSTQ                  Speech processing, transmission and quality
TCTMN                  Telecommunications management networks
ECMA TC32              Communication, networks and systems interconnection
EBU/CENELEC/ETSI JTC   Joint technical committee

It can be seen that ETSI currently comprises 14 technical committees reporting to the technical assembly. These committees are responsible for the development of technical standards. In addition, these committees are responsible for prestandardization activities, that is, activities that lead to ETSI technical reports (ETR) that eventually become the basis for future standards. In addition to the technical assembly, a strategic review committee (SRC) is responsible for prospective examination of a single technical domain, whereas an intellectual property rights committee defines ETSI’s policy in the area of intellectual property. Although by no means unique to ETSI,


the rapid pace of technological progress has resulted in more standards being adopted that embrace technologies that are still under patent protection. This creates a fundamental conflict between the private, exclusive nature of industrial property rights, and the open, public nature of standards. Harmonizing those conflicting claims has emerged as a thorny issue in all standards organizations; ETSI has established a formal function for this purpose. Finally, the ETSI/EBU technical committee coordinates activities with the European Broadcasting Union (EBU), whereas the ISDN committee is in charge of managing and coordinating the standardization process for narrowband ISDN.

14.3.1 CITEL

On June 11, 1993, the Organization of American States (OAS) General Assembly revised the existing Inter-American Telecommunication Commission (CITEL), strengthening and reorganizing its activities, creating a position for the executive secretariat of CITEL, and opening the doors, as associate members, to enterprises, organisms, and private telecommunication organizations, to act as observers of the permanent consultative committees of CITEL and its working groups.

CITEL’s objectives include facilitating and promoting the continuous development of telecommunications in the hemisphere. It serves as the organization’s principal advisory body on matters related to telecommunications. The commission represents all the member states. It has a permanent executive committee consisting of 11 members, and three permanent consultative committees. The permanent consultative committees, whose members are all member states of the organization, also have associate members that represent various private telecommunications agencies or companies.

The general assembly of CITEL, through resolution CITEL Res.8(I-94), established the following specific mandates for the three permanent consultative committees and the steering committee.

Permanent Consultative Committee I: Public Telecommunication Services. To promote and watch over the integration and strengthening of networks and public telecommunication services operating in the countries of the Americas, taking into account the need for modernization of networks and promotion of universal telephone basic services, as well as for increasing the public availability of specialized services and the promotion of the use of international ITU standards and radio regulations.

Permanent Consultative Committee II: Broadcasting. To stimulate and encourage the regional presence of broadcasting services, promoting the use of modern technologies and improving the public availability of such communication media, including audio and video systems, and the promotion of the use of international ITU standards and radio regulations.

Permanent Consultative Committee III: Radiocommunications. To promote the harmonization of radiocommunication services, bearing especially in mind the need to reduce to a minimum those factors that may cause harmful interference in the performance and operation of networks and services. To promote the use of modern technologies and the application of the ITU radio regulations and standards.

Steering Committee. The Steering Committee shall be formed by the chairman and vice-chairman of COM/CITEL and the chairmen of the PCCs. The committee will be responsible for the revision and proposal to COM/CITEL of the continuous updating of the regulations, mandates, and work programs of CITEL bodies; the executive secretary of CITEL will act as the secretary of said committee.

14.4 National Standardization

As standardization moves from global to regional and then to national levels, the number of actual participating entities rapidly grows. Here, the functions of two national standards bodies are reviewed,


primarily because these have been in existence the longest and secondarily because they also represent major markets for commercial communications.

14.4.1 ANSI T1

Unlike the ETSI, which came into being partly as a consequence of legislative recommendations, the ANSI Committee T1 on telecommunications came into being as a result of the realization that with the breakup of the Bell System, de facto standards could no longer be expected. In fact, T1 came into being the very same year (1984) that the breakup of the Bell System came into effect. The T1 membership comprises four types of interest groups: users and general interest groups, manufacturers, interexchange carriers, and exchange carriers. This rather broad membership is reflected, to some extent, by the scope to which T1 standards are being applied; this means that nontraditional telecommunications service providers are utilizing the technologies standardized by committee T1. This situation is the result of the rapid evolution and convergence of the telecommunications, computer, and cable television industries in the United States, and advances in wireless technology. Committee T1 currently addresses approximately 150 approved projects, which led to the establishment of six, primarily functionally oriented, technical subcommittees (TSC), as shown in Table 14.5 and Fig. 14.2 [although not evident from Table 14.5, subcommittee T1P1 has primary responsibility for management of activities on personal communications systems (PCS)]. In turn, each of these six subcommittees is divided into a number of subtending working groups and subworking groups.

TABLE 14.5 T1 Subcommittee Structure

TSC: T1A1  Performance and signal processing
TSC: T1E1  Network interfaces and environmental considerations
TSC: T1M1  Interwork operations, administration, maintenance, and provisioning
TSC: T1P1  Systems engineering, standards planning, and program management
TSC: T1S1  Services, architecture, and signalling
TSC: T1X1  Digital hierarchy and synchronization

FIGURE 14.2: T1 committee structure.


Committee T1 also has an advisory group (T1AG) made up of elected representatives from each of the four interest groups to carry out committee T1 directives and to develop proposals for consideration by the T1 membership. In parallel to serving as the forum that establishes ANSI telecommunications network standards, committee T1 technical subcommittees draft candidate U.S. technical contributions to the ITU. These contributions are submitted to the U.S. Department of State National Committee for the ITU, which administers U.S. participation and contributions to the ITU (see Fig. 14.3). In this manner, activities within T1 are coordinated with those of the ITU. This coordination with other standards setting bodies is also reflected in T1’s involvement with Latin-American standards, through the formation of an ad hoc group with CITEL’s permanent technical committee 1 (PTC 1/T1). Further coordination with ETSI and other standards setting bodies is accomplished through the global standards collaboration process.

FIGURE 14.3: Committee T1 output.

14.4.2 TIA

The TIA is a full-service trade organization that provides its members with numerous services including government relations, market support activities, educational programs, and standards setting activities. TIA is a member-driven organization. Policy is formulated by 25 board members selected from member companies, and is carried out by a permanent professional staff located in Washington, D.C. TIA comprises six issue-oriented standing committees, each of which is chaired by a board member. The six committees are membership scope and development, international, marketing and trade shows, public policy and government relations, and technical. It is this last committee that in 1992 was accredited by ANSI in the United States to standardize telecommunications products. Technology standardization activities are reflected by TIA’s four product-oriented divisions, namely, user premises equipment, network equipment, mobile and personal communications equipment, and fiber optics. In these divisions the legislative and regulatory concerns of product manufacturers and the preparation of standards dealing with performance testing and compatibility are addressed. For example, modem and telematic standards, as well as much of the cellular standards technology, have been standardized in the United States under the mandate of TIA.


14.4.3 TTC

The third national committee to be addressed is the TTC in Japan. TTC was established in October 1985 to develop and disseminate Japanese domestic standards for deregulated technical items and protocols. It is a nongovernmental, nonprofit standards setting organization established to ensure fair and transparent standardization procedures. TTC’s primary emphasis is to conduct studies and research on, and to develop and disseminate, protocols and standards for the connection of telecommunications networks. TTC is organized along six technical subcommittees that report to a board of directors through a technical assembly (see Fig. 14.4).

FIGURE 14.4: Organization of TTC.

The TTC organization comprises a general assembly, which is in charge of matters such as business plans and budgets. The councilors meeting examines standards development procedures in order to assure impartiality and clarity. The secretariat provides overall support to the organization; the technical assembly develops standards and handles technical matters including surveys and research. Each technical subcommittee is partitioned into two or more working groups (WG). The coordination committee handles all issues in or between the TSCs and WGs, and it assures the smooth running of all technical committee meetings. Under the coordination committee, a subcommittee examines users’ requests and studies their applicability to the five-year standardization-project plan. This subcommittee also conducts user-request surveys. The areas of involvement of each of the five subcommittees are shown in Table 14.6.

TABLE 14.6 TTC Subcommittees

Strategic Research and Planning Committee  Technical survey and international collaboration
TSC 1  Network-to-network interfaces, mobile communications
TSC 2  User-network interfaces
TSC 3  PBX, LAN
TSC 4  Higher level protocols
TSC 5  Voice and video signal coding scheme and systems

TTC membership is divided into four categories: type I telecommunications carriers, that is, those carriers that own telecommunications circuits and facilities; type II telecommunications carriers, that is, those with telecommunications circuits leased from type I carriers; related equipment manufacturers; and others, including users. Underlying objectives that guide TTC’s approach to standards development are 1) to conform to international recommendations or standards; 2) to standardize items where either international recommendations or standards are not clear, or where national standards need to be set, and where a consensus is achieved; and 3) to conduct further studies into any of the items just mentioned whenever the technical assembly is unable to arrive at a consensus. These objectives, which give highest priority to developing standards that are compatible with international recommendations or standards, have often driven TTC to adapt international standards for national use through the use of supplements that:

• Give guidelines for users of TTC standards on how to apply them
• Help clarify the contents of standards
• Help with the implementation of standards in terminal equipment and adaptors
• Assure interconnection between terminal equipment and adaptors
• Provide background information regarding the content of standards
• Assure interconnection.

These supplements also include questions and answers that help in implementing the standards, including encoding examples of various parameters and explanation of the practical meaning of a standard.

14.5 Intellectual Property

In the deregulating telecommunication arena, patents have become increasingly important. New ideas that are incorporated in standards often have global market potential, and patent holders are seeking to obtain an income from their intellectual property as well as from products. In addition, the general effort to develop standards quickly places them closer to the leading edge of technology. There are some cases, for example speech encoding algorithms, where terms of reference for performance are typically set as objectives that no one can meet when the objectives are defined. The state of the art is being pushed by goals of the standards development organization. In this environment, incorporation of some intellectual property in standards is practically unavoidable.

With regard to intellectual property rights in the ITU, the TSB has developed a “code of practice” which may be summarized as follows. The TSB requests members putting forth standards to draw the attention of the TSB to any known patent or pending patent application relevant to the developing standard. Where such information has been declared to the TSB, a log of registered patent holders for each affected recommendation is maintained for the convenience of users of ITU standards. If a recommendation, which is a nonbinding international standard, is developed and contains patented intellectual property, there are three situations that may arise.

• The patent holder waives the rights and the recommendation is freely accessible to everybody.
• The patent holder will not waive the rights but is willing to negotiate licenses with other parties on a nondiscriminatory basis and on reasonable terms and conditions. What is reasonable is not defined, and the ITU-T will not participate in such negotiations.
• The patent holder is not willing to comply with either of the above two situations, in which case the ITU-T will not approve a recommendation containing such intellectual property.

The patent policy of the American National Standards Institute (ANSI), which governs all standards development organizations accredited by ANSI, is defined in ANSI procedures 1.2.11. It is similar to that of the ITU in that it requires a statement from patent holders or identified parties to indicate granting of a royalty-free license, willingness to license on reasonable and nondiscriminatory terms and conditions, or a disclaimer of no patent. Unlike the ITU, ANSI advises that it is prepared to get involved in resolving disputes of what is considered “nondiscriminatory” and “reasonable.” Additional information on ANSI patent guidelines can be found at http://web.ansi.org/public/library/guides/ppguide.html.

As mentioned earlier, ETSI produces a combination of mandatory and voluntary standards. This can create additional complications when intellectual property issues are encapsulated within the standards. To formally address these issues, an intellectual property rights committee defines ETSI’s policy in the area of intellectual property.

Given the different patent policies adopted by various standards organizations, it is recommended that companies developing products based on standards investigate and understand the patent policy of the associated standards body and the patent statements filed regarding the standard being implemented.
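The three situations listed for the ITU code of practice earlier in this section amount to a simple decision rule, which can be sketched in a few lines of Python. This is an illustration only and not an ITU procedure: the names PatentStance and itu_t_can_approve are hypothetical, and treating a recommendation with no declared patents as approvable is an assumption of the sketch.

```python
from enum import Enum, auto


class PatentStance(Enum):
    """Hypothetical labels for the three situations in the code of practice."""
    WAIVES_RIGHTS = auto()     # holder waives rights; recommendation freely accessible
    LICENSES_ON_RAND = auto()  # holder licenses on nondiscriminatory, reasonable terms
    REFUSES_BOTH = auto()      # holder accepts neither of the two options above


def itu_t_can_approve(stances):
    """Return True if no declared patent holder refuses both options.

    `stances` is an iterable of PatentStance values, one per declared patent.
    An empty iterable (no declared patents) is treated as approvable, which
    is an assumption of this sketch rather than something the text states.
    """
    return all(s is not PatentStance.REFUSES_BOTH for s in stances)


if __name__ == "__main__":
    declared = [PatentStance.WAIVES_RIGHTS, PatentStance.LICENSES_ON_RAND]
    print(itu_t_can_approve(declared))
    print(itu_t_can_approve(declared + [PatentStance.REFUSES_BOTH]))
```

The sketch deliberately leaves out what counts as "reasonable" licensing terms, since the text notes that the ITU-T neither defines this nor takes part in the negotiations.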

14.6 Standards Coordination

The pace of technological advancements coupled with deregulation has given rise to increased global telecommunications standards activities. At the same time, a growth of regional standards bodies has occurred, which has increased the potential for duplication of work, wasting resources, and creating conflicting standards. This potentially adverse situation was addressed by a number of interregional telecommunications standardization conferences (ITSCs) that were held in the early 1990s. A global standards collaboration (GSC) group was established to oversee collaborative activities including electronic document handling (EDH) and five high-interest standards subjects:

• Broadband integrated services digital network (B-ISDN)
• Intelligent networks (IN)
• Telecommunication management network (TMN)
• Universal personal telecommunications (UPT)
• Synchronous digital hierarchy/synchronous optical network (SDH/SONET)

This early activity was successful in avoiding duplication of effort and coordinating activities on these major standardization efforts. Today, cooperative activities, again motivated by the pressure to avoid wasting valuable resources and to reach agreed standards more rapidly, are being driven to lower levels through the use of liaison statements between regional standards groups and by permitting “documents of information” to flow between standards development organizations. The processes for this information flow are evolving, and the electronic addresses provided at the end of this chapter should be consulted for the current interstandards organization communication mechanisms.


14.7 Scientific

Another global, scientifically based organization that has been particularly active in standards development (more recently emphasizing information processing) is the IEEE. Responsibility for standards adoption within the IEEE lies with the IEEE standards board. The board is supported by nine standing committees (see Fig. 14.5).

FIGURE 14.5: IEEE standards board organization.

Proposed standards are normally developed in the technical committees of the IEEE societies. There are occasions, however, when the scope of activity is too broad to be encompassed by a single society or where the societies are not able to do so for other reasons. In this case the standards board establishes its own standards developing committees, namely, the standards coordinating committees (SCC), to perform this function. The adoption of IEEE standards is based on projects that have been approved by the IEEE standards board, and each project is the responsibility of a sponsor. A sponsor need not be an SCC; it can also be a technical committee of an IEEE society, a standards committee or standards coordinating committee of an IEEE society, an accredited standards committee, or another organization approved by the IEEE standards board.

14.8 Standards Development Cycle

Although the manner in which standards are developed and approved varies somewhat between standards organizations, there are common characteristics to be found. For most standards, first a set of requirements is defined. This may be done either by the standards committee actually developing the standard or by another entity in collaboration with such a committee. Subsequently, the technical details of a standard are developed. The actual entity developing a standard may be a member of the standards committee, or the actual standards committee


itself. Outsiders may also contribute to standards development but, typically, only if sponsored by a committee member. Membership in the standards committee and the right to contribute technical information towards the development of the standard differ among the various standards organizations, as indicated. This process is illustrated in Fig. 14.6.

FIGURE 14.6: Typical standards development and approval process.

Finally, once the standard has been fully developed, it is placed under an approval cycle. Each standards setting body typically has precisely defined and often complex procedures for reviewing and then approving proposed standards, which, although different in detail, are typically consensus driven.

Defining Terms

ANSI: The American National Standards Institute.
CCIR: The International Radio Consultative Committee, the predecessor of the ITU-R.
CCITT: The International Telephone and Telegraph Consultative Committee, the predecessor of the ITU-T.
CEPT: The European Conference for Post and Telecommunications, a predecessor of ETSI.
CITEL: The Inter-American Telecommunication Commission, a standards setting body for the Americas.
ETS: A European (ETSI) technical standard.
ETSI: The European Telecommunications Standards Institute.
GSC: The Global Standards Collaboration group.
ICG: Intersector Coordination Group, a group which coordinates activities between the ITU-T and ITU-R.
IEC: The International Electrotechnical Commission.
IEEE: The Institute of Electrical and Electronics Engineers.
ISO: The International Organization for Standardization.
ITU: The International Telecommunication Union, an international treaty organization, which is part of the United Nations.
ITU-R: The radiocommunications sector of the ITU, the successor of the CCIR.
ITU-T: The standardization sector of the ITU, the successor of the CCITT.
JCG: The Joint Coordination Group, which oversees the coordination of common work between ITU-T study groups.
Recommendation: An ITU technical standard.
SCC: A standards coordinating committee within the IEEE organization.
Standard: A publicly approved technical specification.
T1: An ANSI-approved standards body, which develops telecommunications standards in the United States.
T1AG: The primary advisory group within ANSI Committee T1 on Telecommunications.
TIA: The Telecommunications Industry Association, which is an ANSI-approved standards body that develops terminal equipment standards.
TTC: The Telecommunications Technology Committee, a Japanese standards setting body.

Further Information

[1] Irmer, T., Shaping future telecommunications: the challenge of global standardization, IEEE Comm. Mag., 32(1), 20–28, 1994.
[2] Matute, M.A., CITEL: formulating telecommunications in the Americas, IEEE Comm. Mag., 32(1), 38–39, 1994.
[3] Robin, G., The European perspective for telecommunications standards, IEEE Comm. Mag., 32(1), 40–50, 1994.
[4] Reilly, A.K., A U.S. perspective on standards development, IEEE Comm. Mag., 32(1), 30–36, 1994.
[5] Iida, T., Domestic standards in a changing world, IEEE Comm. Mag., 32(1), 46–50, 1994.
[6] Habara, K., Cooperation in standardization, IEEE Comm. Mag., 32(1), 78–84, 1994.
[7] IEEE Standards Board Bylaws, Institute of Electrical and Electronics Engineers, Dec. 1993.
[8] Chiarottino, W. and Pirani, G., International telecommunications standards organizations, CSELT Tech. Repts., XXI(2), 207–236, 1993.
[9] ITU, Book No. 1. Resolutions; Recommendations on the organization of the work of ITU-T (series A); study groups and other groups; list of study questions (1993-1996). World Standardization Conf. Helsinki, 1–12, Mar. 1993.
[10] Standards Committee T1, Telecommunications. Procedures Manual. 7th Iss. Jun. 1992.

The standards organizations often undergo structural and substantive changes. It is recommended that the following web sites be visited for the most up-to-date information.

ANSI: http://www.ansi.org/
CITEL: http://www.oas.org
ETSI: http://www.etsi.org
IEC: http://www.iec.ch
IEEE: http://www.ieee.org
ISO: http://www.iso.ch
ITU: http://www.itu.ch
T1: http://www.t1.org
TIA: http://www.tia.org
TTC: http://www.ttc.or.jp

Cox, D.C. “Wireless Personal Communications: A Perspective” Mobile Communications Handbook Ed. Jerry D. Gibson Boca Raton: CRC Press LLC, 1999


Wireless Personal Communications: A Perspective

Donald C. Cox
Stanford University

15.1 Introduction
15.2 Background and Issues
Mobility and Freedom from Tethers
15.3 Evolution of Technologies, Systems, and Services
Cordless Telephones • Cellular Mobile Radio Systems • Wide-Area Wireless Data Systems • High-Speed Wireless Local-Area Networks (WLANs) • Paging/Messaging Systems • Satellite-Based Mobile Systems • { Fixed Point-to-Multipoint Wireless Loops } • Reality Check
15.4 Evolution Toward the Future and to Low-Tier Personal Communications Services
15.5 Comparisons with Other Technologies
Complexity/Coverage Area Comparisons • { Coverage, Range, Speed, and Environments }
15.6 Quality, Capacity, and Economic Issues
Capacity, Quality, and Complexity • Economics, System Capacity, and Coverage Area Size • { Loop Evolution and Economics }
15.7 Other Issues
Improvement of Batteries • People Only Want One Handset • Other Environments • Speech Quality Issues • New Technology • High-Tier to Low-Tier or Low-Tier to High-Tier Dual Mode
15.8 Infrastructure Networks
15.9 Conclusion
References

{ This chapter has been updated using { } as indicators of inserts into the text of the original chapter of the same title that appeared in the first edition of this Handbook in 1996. }

15.1 Introduction

Wireless personal communications has captured the attention of the media and with it, the imagination of the public. Hardly a week goes by without one seeing an article on the subject appearing in a popular U.S. newspaper or magazine. Articles ranging from a short paragraph to many pages regularly appear in local newspapers, as well as in nationwide print media, e.g., The Wall Street Journal, The New York Times, Business Week, and U.S. News and World Report. Countless marketing surveys
continue to project enormous demand, often projecting that at least half of the households, or half of the people, want wireless personal communications. Trade magazines, newsletters, conferences, and seminars on the subject by many different names have become too numerous to keep track of, and technical journals, magazines, conferences, and symposia continue to proliferate and to have ever increasing attendance and numbers of papers presented. It is clear that wireless personal communications is, by any measure, the fastest growing segment of telecommunications. { The explosive growth of wireless personal communications has continued unabated worldwide. Cellular and high-tier PCS pocketphones, pagers, and cordless telephones have become so common in many countries that few people even notice them anymore. These items have become an expected part of everyday life in most developed countries and in many developing countries around the world. } If you look carefully at the seemingly endless discussions of the topic, however, you cannot help but note that they are often describing different things, i.e., different versions of wireless personal communications [29, 50]. Some discuss pagers, or messaging, or data systems, or access to the national information infrastructure, whereas others emphasize cellular radio, or cordless telephones, or dense systems of satellites. Many make reference to popular fiction entities such as Dick Tracy, Maxwell Smart, or Star Trek. { In addition to the things noted above, the topic of wireless loops [24], [30], [32] has also become popular in the widespread discussions of wireless communications. As discussed in [30], this topic includes several fixed wireless applications as well as the low-tier PCS application that was discussed originally under the wireless loop designation [24, 32]. 
The fixed wireless applications are aimed at reducing the cost of wireline loop-ends, i.e., the so-called “last mile” or “last km” of wireline telecommunications. } Thus, it appears that almost everyone wants wireless personal communications, but What is it? There are many different ways to segment the complex topic into different communications applications, modes, functions, extent of coverage, or mobility [29, 30, 50]. The complexity of the issues has resulted in considerable confusion in the industry, as evidenced by the many different wireless systems, technologies, and services being offered, planned, or proposed. Many different industry groups and regulatory entities are becoming involved. The confusion is a natural consequence of the massive dislocations that are occurring, and will continue to occur, as we progress along this large change in the paradigm of the way we communicate. Among the different changes that are occurring in our communications paradigm, perhaps the major constituent is the change from wired fixed place-to-place communications to wireless mobile person-to-person communications. Within this major change are also many other changes, e.g., an increase in the significance of data and message communications, a perception of possible changes in video applications, and changes in the regulatory and political climates. { The fixed wireless loop applications noted earlier do not fit the new mobile communications paradigm. After many years of decline of fixed wireless communications applications, e.g., intercontinental HF radio and later satellites, point-to-point terrestrial microwave radio, and tropospheric scatter, it is interesting to see this rebirth of interest in fixed wireless applications. This rebirth is riding on the gigantic “wireless wave” resulting from the rapid public acceptance of mobile wireless communications. 
It will be interesting to observe this rebirth to see if communications history repeats; certainly mobility is wireless, but there is also considerable historical evidence that wireless is also mobility. } This chapter attempts to identify different issues and to put many of the activities in wireless into a framework that can provide perspective on what is driving them, and perhaps even to yield some indication of where they appear to be going in the future. Like any attempt to categorize many complex interrelated issues, however, there are some that do not quite fit into neat categories, and so there will remain some dangling loose ends. Like any major paradigm shift, there will continue to be considerable confusion as many entities attempt to interpret the different needs and expectations associated with the new paradigm.

15.2 Background and Issues

15.2.1 Mobility and Freedom from Tethers

Perhaps the clearest constituents in all of the wireless personal communications activity are the desire for mobility in communications and the companion desire to be free from tethers, i.e., from physical connections to communications networks. These desires are clear from the very rapid growth of mobile technologies that provide primarily two-way voice services, even though economical wireline voice services are readily available. For example, cellular mobile radio has experienced rapid growth. Growth rates have been between 35 and 60% per year in the United States for a decade, with the total number of subscribers reaching 20 million by year-end 1994. The often neglected wireless companions to cellular radio, i.e., cordless telephones, have experienced even more rapid, but harder to quantify, growth with sales rates often exceeding 10 million sets a year in the United States, and with an estimated usage significantly exceeding 50 million in 1994. Telephones in airlines have also become commonplace. Similar or even greater growth in these wireless technologies has been experienced throughout the world. { The explosive growth in cellular and its identical companion, high-tier PCS, has continued to about 55 million subscribers in the U.S. at year-end 1997 and a similar number worldwide. In Sweden the penetration of cellular subscribers by 1997 was over one-third of the total population, i.e., the total including every man, woman, and child! And the growth has continued since. Similar penetrations of mobile wireless services are seen in some other developed nations, e.g., Japan. The growth in users of cordless telephones also has continued to the point that they have become the dominant subscriber terminal on wireline telephone loops in the U.S. It would appear that, taking into account cordless telephones and cellular and high-tier PCS phones, half of all telephone calls in the U.S. terminate with at least one end on a wireless device. 
} { Perhaps the most significant event in wireless personal communications since the writing of this original chapter was the widespread deployment and start of commercial service of personal handphone (PHS) in Japan in July of 1995 and its very rapid early acceptance by the consumer market [53]. By year-end 1996 there were 5 million PHS subscribers in Japan with the growth rate exceeding one-half million/month for some months. The PHS “phenomena” was one of the fastest adoptions of a new technology ever experienced. However, the PHS success story [41] peaked at a little over 7 million subscribers in 1997 and has declined slightly to a little under 7 million in mid-1998. This was the first mass deployment of a low-tier-like PCS technology (see later sections of this chapter), but PHS has some significant limitations. Perhaps the most significant limitation is the inability to successfully handoff at vehicular speeds. This handoff limitation is a result of the cumbersome radio link structure and control algorithms used to implement dynamic channel allocation (DCA) in PHS. DCA significantly increases channel occupancy (base station capacity) but incurs considerable complexity in implementing handoff. Another significant limitation of the PHS standard has been insufficient receiver sensitivity to permit “adequate” coverage from a “reasonably” dense deployment of base stations. These technology deficiencies coupled with heavy price cutting by the cellular service providers to compete with the rapid advancing of the PHS market were significant contributors to the leveling out of PHS growth. It is again evident, as with CT-2 phone point discussed in a later section, that low-tier PCS has very attractive features that can attract many subscribers, but it must also provide vehicle speed handoff and widespread coverage of highways as well as populated regions. 
Others might point out the deployment and start of service of CDMA systems as a significant event since the first edition. However, the major significance of this CDMA activity is that it confirmed that CDMA performance was no better than other less-complex technologies and that those, including this author, who had been branded as “unbelieving skeptics” were correct in their assessments of
the shortcomings of this technology. The overwhelming failure of CDMA technology to live up to the early claims for it can hardly be seen as a significant positive event in the evolution of wireless communication. It was, of course, a significant negative event. After years of struggling with the problems of this technology, service providers still have significantly fewer subscribers on CDMA worldwide than there are PHS subscribers in Japan alone! CDMA issues are discussed more in later sections dealing with technology issues. } Paging and associated messaging, although not providing two-way voice, do provide a form of tetherless mobile communications to many subscribers worldwide. These services have also experienced significant growth { and have continued to grow since 1996. } There is even a glimmer of a market in the many different specialized wireless data applications evident in the many wireless local area network (WLAN) products on the market, the several wide area data services being offered, and the specialized satellite-based message services being provided to trucks on highways. { Wireless data technologies still have many supporters, but they still have fallen far short of the rapid deployment and growth of the more voice oriented wireless technologies. However, hope appears to be eternal in the wireless data arena. } The topics discussed in the preceding two paragraphs indicate a dominant issue separating the different evolutions of wireless personal communications. That issue is the voice versus data communications issue that permeates all of communications today; this division also is very evident in fixed networks. The packet-oriented computer communications community and the circuit-oriented voice telecommunications (telephone) community hardly talk to each other and often speak different languages in addressing similar issues. 
Although they often converge to similar overall solutions at large scales (e.g., hierarchical routing with exceptions for embedded high-usage routes), the small-scale initial solutions are frequently quite different. Asynchronous transfer mode (ATM) based networks are an attempt to integrate, at least partially, the needs of both the packet-data and circuit-oriented communities.

Superimposed on the voice-data issue is an issue of competing modes of communications that exist in both fixed and mobile forms. These different modes include the following.

Messaging is where the communication is not real time but is by way of message transmission, storage, and retrieval. This mode is represented by voice mail, electronic facsimile (fax), and electronic mail (e-mail), the latter of which appears to be a modern automated version of an evolution that includes telegraph and telex. Radio paging systems often provide limited one-way messaging, ranging from transmitting only the number of a calling party to longer alpha-numeric text messages.

Real-time two-way communications are represented by the telephone, cellular mobile radio telephone, and interactive text (and graphics) exchange over data networks. Two-way video phone always captures significant attention and fits into this mode; however, its benefit/cost ratio has yet to exceed a value that customers are willing to pay.

Paging, i.e., broadcast with no return channel, alerts a paged party that someone wants to communicate with him/her. Paging is like the ringer on a telephone without having the capability for completing the communications.

Agents are new high-level software applications or entities being incorporated into some computer networks. When launched into a data network, an agent is aimed at finding information by some title or characteristic and returning the information to the point from which the agent was launched. { The rapid growth of the worldwide web is based on this mode of communications.
} There are still other ways in which wireless communications have been segmented in attempts to optimize a technology to satisfy the needs of some particular group. Examples include 1) user location, which can be differentiated by indoors or outdoors, or on an airplane or a train and 2) degree of mobility, which can be differentiated either by speed, e.g., vehicular, pedestrian, or stationary, or
by size of area throughout which communications are provided. { As noted earlier, wireless local loop with stationary terminals has become a major segment in the pursuit of wireless technology. }

At this point one should again ask: wireless personal communications, what is it? The evidence suggests that what is being sought by users, and produced by providers, can be categorized according to the following two main characteristics.

Communications portability and mobility on many different scales:
• Within a house or building (cordless telephone, WLANs)
• Within a campus, a town, or a city (cellular radio, WLANs, wide area wireless data, radio paging, extended cordless telephone)
• Throughout a state or region (cellular radio, wide area wireless data, radio paging, satellite-based wireless)
• Throughout a large country or continent (cellular radio, paging, satellite-based wireless)
• Throughout the world?

Communications by many different modes for many different applications:
• Two-way voice
• Data
• Messaging
• Video?

Thus, it is clear why wireless personal communications today is not one technology, not one system, and not one service but encompasses many technologies, systems, and services optimized for different applications.

15.3 Evolution of Technologies, Systems, and Services

Technologies and systems [27, 29, 30, 39, 50, 59, 67, 87], that are currently providing, or are proposed to provide, wireless communications services can be grouped into about seven relatively distinct groups, { the seven previous groups are still evident in the technology but with the addition of the fixed point-to-multipoint wireless loops there are now eight, } although there may be some disagreement on the group definitions, and in what group some particular technology or system belongs. All of the technologies and systems are evolving as technology advances and perceived needs change. Some trends are becoming evident in the evolutions. In this section, different groups and evolutionary trends are explored along with factors that influence the characteristics of members of the groups. The grouping is generally with respect to scale of mobility and communications applications or modes.

15.3.1 Cordless Telephones

Cordless telephones [29, 39, 50] generally can be categorized as providing low-mobility, low-power, two-way tetherless voice communications, with low mobility applying both to the range and the user’s speed. Cordless telephones using analog radio technologies appeared in the late 1970s, and have experienced spectacular growth. They have evolved to digital radio technologies in the forms of second-generation cordless telephone (CT-2), and digital European cordless telephone (DECT)
standards in Europe, and several different industrial scientific medical (ISM) band technologies in the United States.1 { Personal handyphone (PHS) noted earlier and discussed in later sections and inserts can be considered either as a quite advanced digital cordless telephone similar to DECT or as a somewhat limited low-tier PCS technology. It has most of the attributes of similarity of the digital cordless telephones listed later in this section except that PHS uses π/4 QPSK modulation. } Cordless telephones were originally aimed at providing economical, tetherless voice communications inside residences, i.e., at using a short wireless link to replace the cord between a telephone base unit and its handset. The most significant considerations in design compromises made for these technologies are to minimize total cost, while maximizing the talk time away from the battery charger. For digital cordless phones intended to be carried away from home in a pocket, e.g., CT-2 or DECT, handset weight and size are also major factors. These considerations drive designs toward minimizing complexity and minimizing the power used for signal processing and for transmitting. Cordless telephones compete with wireline telephones. Therefore, high circuit quality has become a requirement. Early cordless sets had marginal quality. They were purchased by the millions, and discarded by the millions, until manufacturers produced higher-quality sets. Cordless telephones sales then exploded. Their usage has become commonplace, approaching, and perhaps exceeding, usage of corded telephones. The compromises accepted in cordless telephone design in order to meet the cost, weight, and talk-time objectives are the following. 
• Few users per megahertz
• Few users per base unit (many link together a particular handset and base unit)
• Large number of base units per unit area; one or more base units per wireline access line (in high-rise apartment buildings the density of base units is very large)
• Short transmission range

There is no added network complexity since a base unit looks to a telephone network like a wireline telephone. These issues are also discussed in [29, 50].

Digital cordless telephones in Europe have been evolving for a few years to extend their domain of use beyond the limits of inside residences. Cordless telephone, second generation, (CT-2) has evolved to provide telepoint or phone-point services. Base units are located in places where people congregate, e.g., along city streets and in shopping malls, train stations, etc. Handsets registered with the phone-point provider can place calls when within range of a telepoint. CT-2 does not provide capability for transferring (handing off) active wireless calls from one phone point to another if a user moves out of range of the one to which the call was initiated. A CT-2+ technology, evolved from CT-2 and providing limited handoff capability, is being deployed in Canada. { CT-2+ deployment was never completed. } Phone-point service was introduced in the United Kingdom twice, but failed to attract enough customers to become a viable service. In Singapore and Hong Kong, however, CT-2 phone point has grown rapidly, reaching over 150,000 subscribers in Hong Kong [75] in mid-1994. The reasons for success in some places and failure in others are still being debated, but it is clear that the compactness of the Hong Kong and Singapore populations makes the service more widely available, using fewer base stations than in more spread-out cities. Complaints of CT-2 phone-point

1 These ISM technologies either use spread spectrum techniques (direct sequence or frequency hopping) or very low transmitter power ( 10 ms.

Simple Frequency-Shift Modulation and Noncoherent Detection: Although still being low in complexity, the slightly more complex 4QPSK modulation with coherent detection provides significantly more spectrum efficiency, range, and interference immunity.

Dynamic Channel Allocation: Although this technique has potential for improved system capacity, the cordless-telephone implementations do not take full advantage of this feature for handoff and, thus, cannot reap the full benefit for moving users [15, 19].

Time Division Duplex (TDD): This technique permits the use of a single contiguous frequency band and implementation of diversity from one end of a radio link. Unless all base station transmissions are synchronized in time, however, it can incur severe cochannel interference penalties in outside environments [15, 16]. Of course, for cordless telephones used inside with base stations not having a propagation advantage, this is not a problem. Also, for small indoor PBX networks, synchronization of base station transmission is easier than is synchronization throughout a widespread outdoor network, which can have many adjacent base stations connected to different geographic locations for central control and switching.
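
The general idea behind dynamic channel allocation noted above can be illustrated with a minimal sketch. It assumes a generic "least-interfered channel" rule: the set measures interference on every channel and picks the quietest one below an occupancy threshold. The function name, the threshold, and the measurement values are hypothetical and are not taken from the CT-2, DECT, or PHS specifications.

```python
# Illustrative only: a generic least-interfered-channel selection rule.
# The threshold and measurements are hypothetical, not from any standard.

def select_channel(interference_dbm, busy_threshold_dbm=-80.0):
    """Return the index of the quietest usable channel, or None if all are busy.

    interference_dbm: measured interference power per channel, in dBm.
    busy_threshold_dbm: channels at or above this level are treated as occupied.
    """
    candidates = [
        (level, index)
        for index, level in enumerate(interference_dbm)
        if level < busy_threshold_dbm
    ]
    if not candidates:
        return None  # no usable channel, so the call attempt is blocked
    # min() on (level, index) picks the lowest level, breaking ties by index
    return min(candidates)[1]


measurements = [-72.0, -95.5, -88.0, -101.2, -79.9]
print(select_channel(measurements))  # index of the quietest channel
```

A rule of this kind is made once at call setup; the handoff difficulty described above arises because a moving user would need the selection repeated quickly and reliably at each new base station.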

15.3.2 Cellular Mobile Radio Systems

Cellular mobile radio systems are becoming known in the United States as high-tier personal communications service (PCS), particularly when implemented in the new 1.9-GHz PCS bands [20]. These systems generally can be categorized as providing high-mobility, wide-ranging, two-way tetherless voice communications. In these systems, high mobility refers to vehicular speeds, and also to widespread regional to nationwide coverage [27, 29, 50]. Mobile radio has been evolving for over 50 years. Cellular radio integrates wireless access with large-scale networks having sophisticated intelligence to manage mobility of users. Cellular radio was designed to provide voice service to wide-ranging vehicles on streets and highways [29, 39, 50, 82], and generally uses transmitter power on the order of 100 times that of cordless telephones (≈ 2 W for cellular). Thus, cellular systems can only provide reduced service to handheld sets that are disadvantaged by using somewhat lower transmitter power (< 0.5 W) and less efficient antennas than vehicular sets. Handheld sets used inside buildings have the further disadvantage of attenuation through walls that is not taken into account in system design. Cellular radio or high-tier PCS has experienced large growth as noted earlier. In spite of the

TABLE 15.1 Wireless PCS Technologies

High-Power Systems, Digital Cellular (High-Tier PCS): IS-54, IS-95 (DS), GSM, DCS-1800
Low-Power Systems, Low-Tier PCS: WACS/PACS, Handi-Phone
Low-Power Systems, Digital Cordless: DECT, CT-2

| System | IS-54 | IS-95 (DS) | GSM | DCS-1800 | WACS/PACS | Handi-Phone | DECT | CT-2 |
|---|---|---|---|---|---|---|---|---|
| Region | USA | USA | Eur. | UK | USA | Japan | Eur. | Eur. and Asia |
| Multiple access | TDMA/FDMA | CDMA/FDMA | TDMA/FDMA | TDMA/FDMA | TDMA/FDMA | TDMA/FDMA | TDMA/FDMA | FDMA |
| Freq. band, MHz (uplink / downlink) | 824–849 / 869–894 | 824–849 / 869–894 | 890–915 / 935–960 | 1710–1785 / 1805–1880 | Emerg. Tech.∗ | 1895–1907 | 1880–1900 | 864–868 |
| RF ch. spacing, kHz (downlink / uplink) | 30 / 30 | 1250 / 1250 | 200 / 200 | 200 / 200 | 300 / 300 | 300 | 1728 | 100 |
| Modulation | π/4 DQPSK | BPSK/QPSK | GMSK | GMSK | π/4 QPSK | π/4 DQPSK | GFSK | GFSK |
| Portable transmit power, max./avg. | 600 mW/200 mW | 600 mW | 1 W/125 mW | 1 W/125 mW | 200 mW/25 mW | 80 mW/10 mW | 250 mW/10 mW | 10 mW/5 mW |
| Speech coding | VSELP | QCELP | RPE-LTP | RPE-LTP | ADPCM | ADPCM | ADPCM | ADPCM |
| Speech rate, kb/s | 7.95 | 8 (var.) | 13 | 13 | 32/16/8 | 32 | 32 | 32 |
| Speech ch./RF ch. | 3 | - | 8 | 8 | 8/16/32 | 4 | 12 | 1 |
| Ch. bit rate, kb/s (uplink / downlink) | 48.6 / 48.6 | | 270.833 / 270.833 | 270.833 / 270.833 | 384 / 384 | 384 | 1152 | 72 |
| Ch. coding | 1/2 rate conv. | 1/2 rate fwd., 1/3 rate rev. | 1/2 rate conv. | 1/2 rate conv. | CRC | CRC | CRC (control) | None |
| Frame, ms | 40 | 20 | 4.615 | 4.615 | 2.5 | 5 | 10 | 2 |

∗ Spectrum is 1.85–2.2 GHz allocated by the FCC for emerging technologies; DS is direct sequence.

limitations on usage of handheld sets already noted, handheld cellular sets have become very popular, with their sales becoming comparable to the sales of vehicular sets. Frequent complaints from handheld cellular users are that batteries are too large and heavy, and both talk time and standby time are inadequate. { Cellular and high-tier PCS pocket handsets have continued to decrease in size and weight and more efficient lithium batteries have been incorporated. This has increased their attractiveness (more on this in the later section “Reality Check”). For several years there have been many more pocket handsets sold than vehicular mounted sets every year. However, despite the improvements in these handsets and batteries, the complaints of weight and limited talk time still persist. The electronics have become essentially weightless compared to the batteries required for these high-tier PCS and cellular handsets. }

Cellular radio at 800 MHz has evolved to digital radio technologies [29, 39, 50] in the forms of the deployed systems standards
• Global System for Mobile Communications (GSM) in Europe
• Japanese or personal digital cellular (JDC or PDC) in Japan
• U.S. TDMA digital cellular known as USDC or IS-54

and in the form of the code division multiple access (CDMA) standard, IS-95, which is under development but not yet deployed. { Since the first edition was published, CDMA systems have been deployed in the U.S., Korea, Hong Kong, and other countries after many months (years) of redesign, reprogramming, and adjustment. These CDMA issues are discussed later in the section “New Technology.” }

The most significant consideration in the design compromises made for the U.S. digital cellular or high-tier PCS systems was the high cost of cell sites (base stations). A figure often quoted is U.S. $1 million for a cell site. This consideration drove digital system designs to maximize users per megahertz and to maximize the users per cell site.
Because of the need to cover highways running through low-population-density regions between cities, the relatively high transmitter power requirement was retained to provide maximum range from high antenna locations. Compromises that were accepted while maximizing the two parameters just cited are as follows.
• High transmitter power consumption.
• High user-set complexity, and thus high signal-processing power consumption.
• Low circuit quality.
• High network complexity, e.g., the new IS-95 technology will require complex new switching and control equipment in the network, as well as high-complexity wireless-access technology.
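
The emphasis on users per megahertz can be made concrete with a rough calculation from the Table 15.1 figures. The sketch below computes raw duplex speech channels per megahertz from the RF channel spacing and the number of speech channels per RF channel. It assumes that systems listed with paired uplink and downlink entries (FDD) occupy one carrier in each direction, that the remaining systems (TDD) share a single carrier, and that frequency reuse is ignored; these are simplifying assumptions made for illustration, so the results are not system capacities.

```python
# Raw duplex speech channels per MHz, from Table 15.1 spacings and
# speech-channels-per-carrier figures.
# Assumptions: FDD systems use one carrier per direction, TDD systems share
# one carrier, and frequency reuse between cells is ignored.

SYSTEMS = {
    # name: (carrier spacing in kHz, speech channels per carrier, duplexing)
    "IS-54": (30.0, 3, "FDD"),
    "GSM": (200.0, 8, "FDD"),
    "WACS/PACS": (300.0, 8, "FDD"),   # 8 channels at 32 kb/s
    "Handi-Phone": (300.0, 4, "TDD"),
    "DECT": (1728.0, 12, "TDD"),
    "CT-2": (100.0, 1, "TDD"),
}


def channels_per_mhz(spacing_khz, channels, duplexing):
    """Duplex speech channels per MHz of total occupied spectrum."""
    carriers = 2 if duplexing == "FDD" else 1
    return channels / (carriers * spacing_khz / 1000.0)


for name, params in SYSTEMS.items():
    print(f"{name:12s} {channels_per_mhz(*params):6.1f} duplex channels/MHz")
```

Under these assumptions IS-54 comes out at about 50 duplex channels per megahertz, against about 10 for CT-2 and about 7 for DECT, which is consistent with the different design priorities described for cellular and cordless technologies.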

Cellular radio or high-tier PCS has also been evolving for a few years in a different direction, toward very small coverage areas or microcells. This evolution provides increased capacity in areas having high user density, as well as improved coverage of shadowed areas. Some microcell base stations are being installed inside, in conference center lobbies and similar places of high user concentrations. Of course, microcells also permit lower transmitter power that conserves battery power when power control is implemented, and base stations inside buildings circumvent the outside wall attenuation. Low-complexity microcell base stations also are considerably less expensive than conventional cell sites, perhaps two orders of magnitude less expensive. Thus, the use of microcell base stations provides large increases in overall system capacity, while also reducing the cost per available radio channel and the battery drain on portable subscriber equipment. This microcell evolution, illustrated in Fig. 15.1,
moves handheld cellular sets in a direction similar to that of the expanded-coverage evolution of cordless telephones to phone points and wireless PBX.

Some of the characteristics of digital-cellular or high-tier PCS technologies are listed in Table 15.1 for IS-54, IS-95, and GSM at 900 MHz, and DCS-1800, which is GSM at 1800 MHz. { The technology listed here as IS-54 has also become known as IS-136 having more sophisticated digital control channels. These technologies, IS-54/IS-136 are also sometimes known as DAMPS (i.e., Digital AMPS), as U.S. TDMA or North American TDMA, or sometimes just as “TDMA.” } Additional information can be found in [29, 39, 50]. The JDC or PDC technology, not listed, is similar to IS-54. As with the digital cordless technologies, there are significant differences among these cellular technologies, e.g., modulation type, multiple access technology, and channel bit rate. There are also many similarities, however, that are fundamental to the design objectives discussed earlier. These similarities and their implications are as follows.

Low Bit-Rate Speech Coding ≤13 kb/s with Some ≤8 kb/s: Low bit-rate speech coding obviously increases the number of users per megahertz and per cell site. However, it also significantly reduces speech quality [29], and does not permit speech encodings in tandem while traversing a network; see also the section on Other Issues later in this chapter.

Some Implementations Make Use of Speech Inactivity: This further increases the number of users per cell site, i.e., the cell-site capacity. It also further reduces speech quality [29], however, because of the difficulty of detecting the onset of speech. This problem is even worse in an acoustically noisy environment like an automobile.

High Transmission Delay; ≈200-ms Round Trip: This is another important circuit-quality issue. Such large delay is about the same as one-way transmission through a synchronous-orbit communications satellite.
A voice circuit with digital cellular technology on both ends will experience the delay of a full satellite circuit. It should be recalled that one reason long-distance circuits have been removed from satellites and put onto fiber-optic cable is that customers find the delay objectionable. This delay in digital cellular technology results both from computation for speech bit-rate reduction and from complex signal processing, e.g., bit interleaving, error correction decoding, and multipath mitigation [equalization or spread spectrum code division multiple access (CDMA)].

High-Complexity Signal Processing, Both for Speech Encoding and for Demodulation: Signal processing has been allowed to grow without bound and is about a factor of 10 greater than that used in the low-complexity digital cordless telephones [29]. Since several watts are required from a battery to produce the high transmitter power in a cellular or high-tier PCS set, signal-processing power is not as significant as it is in the low-power cordless telephones; see also the section on Complexity/Coverage Area Comparisons later in this chapter.

Fixed Channel Allocation: The difficulties associated with implementing capacity-increasing dynamic channel allocation to work with handoff [15, 19] have impeded its adoption in systems requiring reliable and frequent handoff.

Frequency Division Duplex (FDD): Cellular systems have already been allocated paired-frequency bands suitable for FDD. Thus, the network or system complexity required for providing synchronized transmissions [15, 16] from all cell sites for TDD has not been embraced in these digital cellular systems. Note that TDD has not been employed in IS-95 even though such synchronization is required for other reasons.

Mobile/Portable Set Power Control: The benefits of increased capacity from lower overall cochannel interference and reduced battery drain have been sought by incorporating power control in the digital cellular technologies.


15.3.3 Wide-Area Wireless Data Systems

Existing wide-area data systems generally can be categorized as providing high mobility, wide-ranging, low-data-rate digital data communications to both vehicles and pedestrians [29, 50]. These systems have not experienced the rapid growth that the two-way voice technologies have, even though they have been deployed in many cities for a few years and have established a base of customers in several countries. Examples of these packet data systems are shown in Table 15.2.

TABLE 15.2 Wide-Area Wireless Packet Data Systems

                      CDPD(1)         RAM Mobile      ARDIS(2)       Metricom
                                      (Mobitex)       (KDT)          (MDN)(3)
Data rate, kb/s       19.2            8 (19.2)        4.8 (19.2)     76
Modulation            GMSK BT = 0.5   GMSK            GMSK           GMSK
Frequency, MHz        800             900             800            915
Chan. spacing, kHz    30              12.5            25             160
Status                1994 service    Full service    Full service   In service
Access means          Unused AMPS     Slotted Aloha   -              FH SS (ISM)
                      channels        CSMA
Transmit power, W     -               -               40             1

Note: Data in parentheses ( ) indicates proposed.
(1) Cellular Digital Packet Data
(2) Advanced Radio Data Information Service
(3) Microcellular Data Network
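As a rough illustration of how the Table 15.2 entries relate to one another, the following Python sketch divides each listed channel bit rate by its channel spacing. The resulting bits-per-second-per-hertz figure is not given in the text; it is only a crude indicator that ignores frequency reuse, hopping overhead, and protocol overhead, and the dictionary and function names are hypothetical.

```python
# Channel bit rate (kb/s) and channel spacing (kHz) as listed in Table 15.2.
# The names below are illustrative only.
SYSTEMS = {
    "CDPD": (19.2, 30.0),
    "RAM Mobile (Mobitex)": (8.0, 12.5),
    "ARDIS (KDT)": (4.8, 25.0),
    "Metricom (MDN)": (76.0, 160.0),
}


def bits_per_second_per_hertz(rate_kbps, spacing_khz):
    """Channel bit rate divided by channel spacing (kb/s per kHz = b/s per Hz)."""
    return rate_kbps / spacing_khz


efficiency = {
    name: bits_per_second_per_hertz(rate, spacing)
    for name, (rate, spacing) in SYSTEMS.items()
}

for name, eta in efficiency.items():
    print(f"{name:22s} {eta:.2f} b/s/Hz")
```

Under this crude measure the initial ARDIS rate stands out as the lowest, which is consistent with the evolution of both ARDIS and RAM toward 19.2 kb/s channel rates.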

The earliest and best known of these systems in the United States are the ARDIS network developed and run by Motorola, and the RAM mobile data network based on Ericsson Mobitex Technology. These technologies were designed to make use of standard, two-way voice, land mobile-radio channels, with 12.5- or 25-kHz channel spacing. In the United States these are specialized mobile radio services (SMRS) allocations around 450 MHz and 900 MHz. Initially, the data rates were low: 4.8 kb/s for ARDIS and 8 kb/s for RAM. The systems use high transmitter power (several tens of watts) to cover large regions from a few base stations having high antennas. The relatively low data capacity of a relatively expensive base station has resulted in economics that have not favored rapid growth. The wide-area mobile data systems also are evolving in several different directions in an attempt to improve base station capacity, economics, and the attractiveness of the service. The technologies used in both the ARDIS and RAM networks are evolving to higher channel bit rates of 19.2 kb/s. The cellular carriers and several manufacturers in the United States are developing and deploying a new wide area packet data network as an overlay to the cellular radio networks. This cellular digital packet data (CDPD) technology shares the 30-kHz spaced 800-MHz voice channels used by the analog FM advanced mobile phone service (AMPS) systems. Data rate is 19.2 kb/s. The CDPD base station equipment also shares cell sites with the voice cellular radio system. The aim is to reduce the cost of providing packet data service by sharing the costs of base stations with the better established and higher cell-site capacity cellular systems. 
This is a strategy similar to that used by nationwide fixed wireline packet data networks that could not provide an economically viable data service if they did not share costs by leasing a small amount of the capacity of the interexchange networks that are paid for largely by voice traffic. { CDPD has been deployed in many U.S. cities for several years. However, it has not lived up to early expectations and has become
“just another” wireless data service with some subscribers, but not with the large growth envisioned earlier. }

Another evolutionary path in wide-area wireless packet data networks is toward smaller coverage areas or microcells. This evolutionary path also is indicated in Fig. 15.1. The microcell data networks are aimed at stationary or low-speed users. The design compromises are aimed at reducing service costs by making very small and inexpensive base stations that can be attached to utility poles, the sides of buildings, and inside buildings, and that can be widely distributed throughout a region. Base-station-to-base-station wireless links are used to reduce the cost of the interconnecting data network. In one network this decreases the overall capacity to serve users, since it uses the same radio channels that are used to provide service. Capacity is expected to be made up by increasing the number of base stations that have connections to a fixed-distribution network as service demand increases. Another such network uses other dedicated radio channels to interconnect base stations. In the high-capacity limit, these networks will look more like a conventional cellular network architecture, with closely spaced, small, inexpensive base stations, i.e., microcells, connected to a fixed infrastructure.

Specialized wireless data networks have been built to provide metering and control of electric power distribution, e.g., Celldata and Metricom in California. A large microcell network of small inexpensive base stations has been installed in the lower San Francisco Bay Area by Metricom, and public packet-data service was offered during early 1994. Most of the small (shoe-box size) base stations are mounted on street light poles. Reliable data rates are about 75 kb/s. The technology is based on slow frequency-hopped spread spectrum in the 902–928 MHz U.S. ISM band.
Transmitter power is 1 W maximum, and power control is used to minimize interference and maximize battery lifetime. { The Metricom network has been improved and significantly expanded in the San Francisco Bay Area and has been deployed in Washington, D.C. and a few other places in the U.S. However, like all wireless data services so far, it has failed to grow as rapidly or to attract as many subscribers as was originally expected. Wireless data overall has had only very limited success compared to that of the more voice-oriented technologies, systems, and services. }

15.3.4 High-Speed Wireless Local-Area Networks (WLANs)

Wireless local-area data networks can be categorized as providing low-mobility high-data-rate data communications within a confined region, e.g., a campus or a large building. Coverage range from a wireless data terminal is short, tens to hundreds of feet, like cordless telephones. Coverage is limited to within a room or to several rooms in a building. WLANs have been evolving for a few years, but overall the situation is chaotic, with many different products being offered by many different vendors [29, 59]. There is no stable definition of the needs or design objectives for WLANs, with data rates ranging from hundreds of kb/s to more than 10 Mb/s, and with several products providing 1 or 2 Mb/s wireless link rates. The best description of the WLAN evolutionary process is: having severe birth pains. An IEEE standards committee, 802.11, has been attempting to put some order into this topic, but its success has been somewhat limited. A partial list of some advertised products is given in Table 15.3. Users of WLANs are not nearly as numerous as the users of more voice-oriented wireless systems. Part of the difficulty stems from these systems being driven by the computer industry, which views the wireless system as just another plug-in interface card, without giving sufficient consideration to the vagaries and needs of a reliable radio system. { This section still describes the WLAN situation in spite of some attempts at standards in the U.S. and Europe, and continuing industry efforts. Some of the products in Table 15.3 have been discontinued because of lack of market and some new products have been offered, but the manufacturers still continue to struggle to find enough customers to support their efforts. Optimism remains high in the WLAN
community that “eventually” they will find the “right” technology, service, or application to make WLANs “take off,” but the world still waits. Success is still quite limited. }

There are two overall network architectures pursued by WLAN designers. One is a centrally coordinated and controlled network that resembles other wireless systems. There are base stations in these networks that exercise overall control over channel access [44]. The other type of network architecture is the self-organizing and distributed controlled network where every terminal has the same function as every other terminal, and networks are formed ad hoc by communications exchanges among terminals. Such ad hoc networks are more like citizen band (CB) radio networks, with similar expected limitations if they were ever to become very widespread.

Nearly all WLANs in the United States have attempted to use one of the ISM frequency bands for unlicensed operation under Part 15 of the FCC rules. These bands are 902–928 MHz, 2400–2483.5 MHz, and 5725–5850 MHz, and they require users to accept interference from any interfering source that may also be using the frequency. The use of ISM bands has further handicapped WLAN development because of the requirement for use of either frequency hopping or direct sequence spread spectrum as an access technology, if transmitter power is to be adequate to cover more than a few feet. One exception to the ISM band implementations is the Motorola ALTAIR, which operates in a licensed band at 18 GHz. { It appears that ALTAIR has been discontinued because of the limited market. } The technical and economic challenges of operation at 18 GHz have hampered the adoption of this 10–15 Mb/s technology. The frequency-spectrum constraints have been improved in the United States with the recent FCC allocation of spectrum from 1910–1930 MHz for unlicensed data PCS applications.
Use of this new spectrum requires implementation of an access etiquette incorporating listen before transmit in an attempt to provide some coordination of an otherwise potentially chaotic, uncontrolled environment [68]. Also, since spread spectrum is not a requirement, access technologies and multipath mitigation techniques more compatible with the needs of packet-data transmission [59], e.g., multipath equalization or multicarrier transmission, can be incorporated into new WLAN designs. { The FCC is allocating spectrum at 5 GHz for wideband wireless data for internet and next generation data network access, BUT it remains to be seen whether this initiative is any more successful than past wireless data attempts. Optimism is again high, BUT... }

Three other widely different WLAN activities also need mentioning. One is a large European Telecommunications Standards Institute (ETSI) activity to produce a standard for high performance radio local area network (HIPERLAN), a 20-Mb/s WLAN technology to operate near 5 GHz. The other activities are large U.S. Advanced Research Projects Agency (ARPA) sponsored WLAN research projects at the University of California at Berkeley (UCB) and at Los Angeles (UCLA). The UCB Infopad project is based on a coordinated network architecture with fixed coordinating nodes and direct-sequence spread spectrum (CDMA), whereas the UCLA project is aimed at peer-to-peer networks and uses frequency hopping. Both ARPA-sponsored projects are concentrated on the 900-MHz ISM band.

As computers shrink in size from desktop to laptop to palmtop, mobility in data network access is becoming more important to the user. This fact, coupled with the availability of more usable frequency spectrum, and perhaps some progress on standards, may speed the evolution and adoption of wireless mobile access to WLANs. From the large number of companies making products, it is obvious that many believe in the future of this market.
{ It should be noted that the objective of 10 Mb/s data service with widespread coverage from a sparse distribution of widely separated base stations equivalent to cellular is unrealistic and unrealizable. This can be readily seen by considering a simple example. Consider a cellular coverage area that requires full cellular power of 0.5 watt to cover from a handset. Consider the handset to use a typical digital cellular bit rate of about 10 kb/s (perhaps 8 kb/s speech coding + overhead). With all else in the system the same, e.g., antennas, antenna height, receiver noise figure, detection sensitivity, etc., the 10 Mb/s data would require 10 Mb/s ÷ 10 kb/s
TABLE 15.3 Partial List of WLAN Products

Product

No. of chan.

Company

Freq.,

Link Rate,

or Spread

Location

MHz

Mb/s

User Rate

Protocol(s)

Mod./

Factor

Coding

Power, mW

Altair Plus Motorola Arlington Hts, IL

18–19 GHz

15

5.7 Mb/s

Ethernet

Topology

4-level FSK

25 peak

Eight devices/ radio; radio to base to ethernet

WaveLAN NCR/AT&T Dayton, OH

902–928

2

1.6 Mb/s

Ethernet-like

DS SS

DQPSK

250

Peer-to-peer

AirLan Solectek San Diego, CA

902–928

2 Mb/s

Ethernet

DS SS

DQPSK

250

PCMCIA w/ant.; radio to hub

Freeport Windata Inc. Northboro, MA

902–928

5.7 Mb/s

Ethernet

DS SS

16 PSK trellis coding

650

Hub

Intersect Persoft Inc. Madison, WI

902–928

2 Mb/s

Ethernet token ring

DS SS

DQPSK

250

Hub

LAWN O’Neill Comm. Horsham, PA

902–928

38.4 kb/s

AX.25

SS

20

Peer-to-peer

WILAN Wi-LAN Inc. Calgary, Alberta

902–928

30

Peer-to-peer

RadioPort ALPS Electric USA

100

Peer-to-peer

1W max

PCs with ant.; radio to hub

16

Access

32 chips/bit

20

Network

users/chan.; max. 4 chan. 20

1.5 Mb/s/ chan.

Ethernet, token ring

CDMA/ TDMA

3 chan. 10–15 links each

902–928

242 kb/s

Ethernet

SS

7/3 channels

ArLAN 600 Telesys. SLW Don Mills, Ont.

902–928; 2.4 GHz

1.35 Mb/s

Ethernet

SS

Radio Link Cal. Microwave Sunnyvale, CA

902–928; 2.4 GHz

Range LAN Proxim, Inc. Mountain View, CA

902–928

RangeLAN 2 Proxim, Inc. Mountain View, CA

2.4 GHz

1.6

Netwave Xircom Calabasas, CA

2.4 GHz

1/adaptor

Freelink Cabletron Sys. Rochester, NH

2.4 and 5.8 GHz

250 kb/s

64 kb/s

FH SS

250 ms/hop 500 kHz space

unconventional

Hub

242 kb/s

Ethernet, token ring

DS SS

3 chan.

100

50 kb/s max.

Ethernet, token ring

FH SS

10 chan. at 5 kb/s; 15 sub-ch. each

100

Ethernet, token ring

FH SS

82 1-MHz chn. or “hops”

Ethernet

DS SS

32 chips/bit

5.7 Mb/s

Peer-to-peer bridge Hub

16 PSK trellis coding

100

Hub

= 1000 times as much power as the 10 kb/s cellular. Thus, it would require 0.5 × 1000 = 500 watts for the wireless data transmitter. This is a totally unrealistic situation. If the data system operates at a higher frequency (e.g., 5 GHz) than the cellular system (e.g., 1 or 2 GHz), then even more power will be required to overcome the additional loss at the higher frequency. The sometimes expressed desire by the wireless data community for a system to provide network access to users in and around buildings and to provide 10 Mb/s over 10 miles with 10 milliwatts of transmitter power and costing $10.00 is totally impossible. It requires violation of the “laws of physics.” }
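The arithmetic of this example can be written out as a short Python sketch. It assumes, as the text does, that everything except bit rate is held fixed, so that the required transmitter power scales in direct proportion to bit rate; the function and variable names are hypothetical.

```python
import math


def required_power_w(ref_power_w, ref_rate_bps, new_rate_bps):
    """Scale transmitter power in proportion to bit rate, all else being equal."""
    return ref_power_w * (new_rate_bps / ref_rate_bps)


cellular_power_w = 0.5     # full handset power quoted in the text
cellular_rate_bps = 10e3   # about 10 kb/s digital cellular bit rate
data_rate_bps = 10e6       # 10 Mb/s wireless data objective

ratio = data_rate_bps / cellular_rate_bps
ratio_db = 10.0 * math.log10(ratio)
data_power_w = required_power_w(cellular_power_w, cellular_rate_bps, data_rate_bps)

print(f"rate ratio: {ratio:.0f} ({ratio_db:.0f} dB)")
print(f"required transmitter power: {data_power_w:.0f} W")
```

The same proportionality explains why any additional path loss at a higher carrier frequency only adds to the 30 dB already required by the rate increase.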

15.3.5 Paging/Messaging Systems

Radio paging began many years ago as a one-bit messaging system. The one bit was: someone wants to communicate with you. More generally, paging can be categorized as one-way messaging over wide areas. The one-way radio link is optimized to take advantage of the asymmetry. High transmitter power (hundreds of watts to kilowatts) and high antennas at the fixed base stations permit low-complexity, very low-power-consumption, pocket paging receivers that provide long usage time from small batteries. This combination provides the large radio-link margins needed to penetrate walls of buildings without burdening the user set battery. Paging has experienced steady rapid growth for many years and serves about 15 million subscribers in the United States.

Paging also has evolved in several different directions. It has changed from analog tone coding for user identification to digitally encoded messages. It has evolved from the 1-b message, someone wants you, to multibit messages from, first, the calling party’s telephone number to, now, short e-mail text messages. This evolution is noted in Fig. 15.1. The region over which a page is transmitted has also increased from 1) local, around one transmitting antenna; to 2) regional, from multiple widely-dispersed antennas; to 3) nationwide, from large networks of interconnected paging transmitters. The integration of paging with CT-2 user sets for phone-point call alerting was noted previously.

Another evolutionary paging route sometimes proposed is two-way paging. This is an ambiguous and unrealizable concept, however, since the requirement for two-way communications destroys the asymmetrical link advantage so well exploited by paging. Two-way paging puts a transmitter in the user’s set and brings along with it all of the design compromises that must be faced in such a two-way radio system. Thus, the word paging is not appropriate to describe a system that provides two-way communications.
{ The two-way paging situation is as unrealistic as that noted earlier for wide-area, high-speed, low-power wireless data. This can be seen by looking at the asymmetry situation in paging. In order to achieve comparable coverage uplink and downlink, a 500-watt paging transmitter downlink advantage must be overcome in the uplink. Even considering the relatively high cellular handset transmit power levels on the order of 0.5 watt results in a factor of 1000 disadvantage, and 0.5 watt is completely incompatible with the low power paging receiver power requirements. If the same uplink and downlink coverage is required for an equivalent set of system parameters, then the only variable left to work with is bandwidth. If the paging link bit rate is taken to be 10 kb/s (much higher than many paging systems), then the usable uplink rate is 10 kb/s ÷ 1000 = 10 b/s, an unusably low rate. Of course, some uplink benefit can be gained because of better base station receiver noise figure and by using forward error correction and perhaps ARQ. However, this is unlikely to raise the allowable rate to greater than 100 b/s, which, even though likely overoptimistic, is still unrealistically low, and we have assumed an unrealistically high transmit power in the two-way “pager!” }
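The two-way paging argument reduces to a power ratio applied to the bit rate, which the following Python sketch makes explicit. It assumes equal coverage and otherwise equivalent link parameters, as in the text; the optional extra_gain_db parameter is a hypothetical stand-in for the combined benefit of a better base station noise figure, forward error correction, and ARQ, and the 10 dB used below is chosen only because it reproduces the 100 b/s ceiling the text mentions.

```python
def uplink_rate_bps(downlink_rate_bps, downlink_power_w, uplink_power_w,
                    extra_gain_db=0.0):
    """Uplink bit rate that gives the same coverage as the downlink.

    The rate is reduced by the transmitter power disadvantage and then
    raised by any extra uplink gain (an assumed, illustrative parameter).
    """
    power_disadvantage = downlink_power_w / uplink_power_w
    extra_gain = 10.0 ** (extra_gain_db / 10.0)
    return downlink_rate_bps / power_disadvantage * extra_gain


# Figures from the text: 500 W paging transmitter, 0.5 W handset, 10 kb/s link.
basic_rate = uplink_rate_bps(10e3, 500.0, 0.5)
# Assumed 10 dB of uplink improvement (noise figure, coding, ARQ).
optimistic_rate = uplink_rate_bps(10e3, 500.0, 0.5, extra_gain_db=10.0)

print(f"uplink rate with no extra gain: {basic_rate:.0f} b/s")
print(f"uplink rate with 10 dB extra gain: {optimistic_rate:.0f} b/s")
```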

15.3.6 Satellite-Based Mobile Systems

Satellite-based mobile systems are the epitome of wide-area coverage, expensive base station systems. They generally can be categorized as providing two-way (or one-way) limited quality voice and/or very limited data or messaging to very wide-ranging vehicles (or fixed locations). These systems can provide very widespread, often global, coverage, e.g., to ships at sea by INMARSAT. There are a few messaging systems in operation, e.g., to trucks on highways in the United States by Qualcomm’s Omnitracs system. A few large-scale mobile satellite systems have been proposed and are being pursued: perhaps the best known is Motorola’s Iridium; others include Odyssey, Globalstar, and Teledesic. The strength of satellite systems is their ability to provide large regional or global coverage to users outside buildings. However, it is very difficult to provide adequate link margin to cover inside buildings, or even to cover locations shadowed by buildings, trees, or mountains. A satellite system’s weakness is also its large coverage area. It is very difficult to provide from Earth orbit the small coverage cells that are necessary for providing high overall systems capacity from frequency reuse. This fact, coupled with the high cost of the orbital base stations, results in low capacity along with the wide overall coverage but also in expensive service. Thus, satellite systems are not likely to compete favorably with terrestrial systems in populated areas or even along well-traveled highways. They can complement terrestrial cellular or PCS systems in low-population-density areas. It remains to be seen whether there will be enough users with enough money in low-population-density regions of the world to make satellite mobile systems economically viable. { Some of the mobile satellite systems have been withdrawn, e.g., Odyssey. Some satellites in the Iridium and Globalstar systems have been launched. The industry will soon find out whether these systems are economically viable. 
} Proposed satellite systems range from 1) low-Earth-orbit systems (LEOS) having tens to hundreds of satellites through 2) intermediate- or medium-height systems (MEOS) to 3) geostationary or geosynchronous orbit systems (GEOS) having fewer than ten satellites. LEOS require more, but less expensive, satellites to cover the Earth, but they can more easily produce smaller coverage areas and, thus, provide higher capacity within a given spectrum allocation. Also, their transmission delay is significantly less (perhaps two orders of magnitude!), providing higher quality voice links, as discussed previously. On the other hand, GEOS require only a few, somewhat more expensive, satellites (perhaps only three) and are likely to provide lower capacity within a given spectrum allocation and suffer severe transmission-delay impairment on the order of 0.5 s. Of course, MEOS fall in between these extremes. The possible evolution of satellite systems to complement high-tier PCS is indicated in Fig. 15.1.
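The delay figures quoted for GEOS and LEOS can be checked with a simple propagation calculation. The Python sketch below is an approximation that assumes the satellite is directly overhead and counts propagation time only, with no processing or switching delay. The 780 km LEO altitude is an assumed example value, not one given in the text, and the names are hypothetical.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458

GEO_ALTITUDE_KM = 35_786.0  # geostationary orbit altitude
LEO_ALTITUDE_KM = 780.0     # assumed example LEO altitude, for illustration only


def hop_delay_s(altitude_km):
    """One-way ground-to-satellite-to-ground propagation delay, satellite overhead."""
    return 2.0 * altitude_km / SPEED_OF_LIGHT_KM_S


geo_one_way_s = hop_delay_s(GEO_ALTITUDE_KM)
leo_one_way_s = hop_delay_s(LEO_ALTITUDE_KM)
geo_round_trip_s = 2.0 * geo_one_way_s
delay_ratio = geo_one_way_s / leo_one_way_s

print(f"GEO one-way hop:  {geo_one_way_s * 1e3:.0f} ms")
print(f"GEO round trip:   {geo_round_trip_s * 1e3:.0f} ms")
print(f"LEO one-way hop:  {leo_one_way_s * 1e3:.1f} ms")
print(f"GEO/LEO ratio:    {delay_ratio:.0f}")
```

Under these assumptions the GEO round trip comes out just under 0.5 s, in line with the figure in the text, and the one-way GEO hop is comparable to the roughly 200-ms round-trip delay noted earlier for digital cellular. The GEO-to-LEO ratio depends strongly on the LEO altitude chosen.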

15.3.7 {Fixed Point-to-Multipoint Wireless Loops

Wideband point-to-multipoint wireless loop technologies were sometimes referred to earlier as “wireless cable” when they were proposed as an approach for providing interactive video services to homes [30]. However, as the video application started to appear less attractive, the application emphasis shifted to providing wideband data access for the Internet, the World Wide Web, and future wideband data networks. Potentially lower costs are the motivation for this wireless application. As such, these technologies will have to compete with existing coaxial cable and fiber/coax distribution by CATV companies, with satellites, and with fiber and fiber/coax systems being installed or proposed by telephone companies and other entities [30]. Another competitor is asymmetric digital subscriber line technology, which uses advanced digital signal processing to provide high-bandwidth digital distribution over twisted copper wire pairs. In the U.S. two widely different frequency bands are being pursued for fixed point-to-multipoint
wireless loops. These bands are at 28 GHz for local multipoint distribution systems or services (LMDS) [52] and 2.5 to 2.7 GHz for microwave or metropolitan distribution systems (MMDS) [74]. The goal of low-cost fixed wireless loops is based on the low cost of point-to-multipoint line-of-sight wireless technology. However, significant challenges are presented by the inevitable blockage by trees, terrain, and houses, and by buildings in heavily built-up residential areas. Attenuation in rainstorms presents an additional problem at 28 GHz in some localities. Even at the 2.5-GHz MMDS frequencies, the large bandwidth required for distribution of many video channels presents a challenge to provide adequate radio-link margin over obstructed paths. From mobile satellite investigations it is known that trees can often produce over 15 dB additional path attenuation [38]. Studies of blockage by buildings in cities have shown that it is difficult to have line-of-sight access to more than 60% of the buildings from a single base station [55]. Measurements in a region in Brooklyn, NY [60], suggest that access from a single base station can range from 25% to 85% for subscriber antenna heights of 10 to 35 ft and a base station height of about 290 ft. While less blockage by houses could be expected in residential areas, such numbers would suggest that greater than 90% access to houses could be difficult, even from multiple elevated locations, when mixes of one- and two-story houses, trees, and hills are present. In regions where tree cover is heavy, e.g., the eastern half of the U.S., tree cover in many places will present a significant obstacle. Heavy rainfall is an additional problem at 28 GHz in some regions. In spite of these challenges, the lure of low-cost wireless loops is attracting many participants, both service providers and equipment manufacturers. }

15.3.8 Reality Check

Before we go on to consider other applications and compromises, perhaps it would be helpful to see if there is any indication that the previous discussion is valid. For this check, we could look at cordless telephones for telepoint use (i.e., pocketphones) and at pocket cellular telephones that existed in the 1993 time frame. Two products from one United States manufacturer are good for this comparison. One is a third-generation hand-portable analog FM cellular phone from this manufacturer that represents their second generation of pocketphones. The other is a first-generation digital cordless phone built to the United Kingdom CT-2 common air interface (CAI) standard. Both units are of flip phone type with the earpiece on the main handset body and the mouthpiece formed by or on the flip-down part. Both operate near 900 MHz and have 1/4 wavelength pull-out antennas. Both are fully functional within their class of operation (i.e., full number of U.S. cellular channels, full number of CT-2 channels, automatic channel setup, etc.). Table 15.4 compares characteristics of these two wireless access pocketphones from the same manufacturer. The following are the most important items to note in the Table 15.4 comparison.

1. The talk time of the low-power pocketphone is four times that of the high-power pocketphone.
2. The battery inside the low-power pocketphone is about one-half the weight and size of the battery attached to the high-power pocketphone.
3. The battery-usage ratio, talk time/weight of battery, is eight times greater, almost an order of magnitude, for the low-power pocketphone compared to the high-power pocketphone!
4. Additionally, the lower power (5 mW) digital cordless pocketphone is slightly smaller and lighter than the high-power (500 mW) analog FM cellular mobile pocketphone.

{ Similar comparisons can be made between PHS advanced cordless/low-tier PCS phones and advanced cellular/high-tier PCS pocketphones. New lithium batteries have permitted increased talk
TABLE 15.4 Comparison of CT-2 and Cellular Pocket Size Flip-Phones from the Same Manufacturer

Characteristics/Parameter       CT-2                         Cellular

Weight, oz
  Flip phone only               5.2                          4.2
  Battery(1) only               1.9                          3.6
  Total unit                    7.1                          7.8

Size (max. dimensions), in
  Flip phone only               5.9 × 2.2 × 0.95; 8.5 in3    5.5 × 2.4 × 0.9; -
  Battery(1) only               1.9 × 1.3 × 0.5, internal    4.7 × 2.3 × 0.4, external
  Total unit                    5.9 × 2.2 × 0.95; 8.5 in3    5.9 × 2.4 × 1.1; 11.6 in3

Talk-time, min (h)
  Rechargeable battery(2)       180 (3)                      45
  Nonrechargeable battery       600 (10)                     N/A

Standby time, h
  Rechargeable battery          30                           8
  Nonrechargeable battery       100                          N/A

Speech quality                  32 kb/s telephone quality    30 kHz FM; depends on channel quality

Transmit power avg., W          0.005                        0.5

(1) Rechargeable battery.
(2) Ni-cad battery.

time in pocketphones. Digital control/paging channels facilitate significantly extended standby time. Advances in solid-state circuits have reduced the size and weight of cellular pocketphone electronics so that they are almost insignificant compared to the battery required for the high power transmitter and complex digital signal processing. However, even with all these changes, there is still a very significant weight and talk time benefit in the low complexity PHS handsets compared to the most advanced cellular/high-tier PCS handsets. Picking typical minimum size and weight handsets for both technologies results in the following comparisons.

                            PHS        Cellular
Weight, oz (total unit)     3          4.5
Size (total unit)           4.2 in3    -
Talk-time, h                8          3
Standby time, h             600        48

From the table, the battery usage ratio has been reduced to a factor of about 4 from a factor of 8, but this is based on total weight, not battery weight alone as used for the earlier CT-2 and cellular comparison. Thus, there is still a large talk time and weight benefit for low-power, low-complexity, low-tier PCS compared to higher power, high-complexity, high-tier PCS and cellular. }

The following should also be noted.

1. The room for technology improvement of the CT-2 cordless phone is greater since it is first generation and the cellular phone is second/third generation.
2. A digital cellular phone built to the IS-54, GSM, or JDC standard, or in the proposed United States CDMA technology, would either have less talk time or be heavier and larger than the analog FM phone, because: a) the low-bit-rate digital speech coder is more complex and will consume more power than the analog speech processing circuits; b) the digital units have complex digital signal-processing circuits for forward error correction—either for delay dispersion equalizing or for spread-spectrum processing— that will consume significant amounts of power and that have no equivalents in the analog FM unit; and c) power amplifiers for the shaped-pulse nonconstant-envelope digital signals will be less efficient than the amplifiers for constant-envelope analog FM. Although it may be suggested that transmitter power control will reduce the weight and size of a CDMA handset and battery, if that handset is to be capable of operating at full power in fringe areas, it will have to have capabilities similar to other cellular sets. Similar power control applied to a CT-2-like low-maximum-power set would also reduce its power consumption and thus also its weight and size. The major difference in size, weight, and talk time between the two pocketphones is directly attributable to the two orders of magnitude difference in average transmitter power. The generation of transmitter power dominates power consumption in the analog cellular phone. Power consumption in the digital CT-2 phone is more evenly divided between transmitter-power generation and digital signal processing. Therefore, power consumption in complex digital signal processing would have more impact on talk time in small low-power personal communicators than in cellular handsets where the transmitter-power generation is so large. 
Other than reducing power consumption for both functions, the only alternative for increasing talk time and reducing battery weight is to invent new battery technology having greater density; see section on Other Issues later in this chapter. In contrast, lowering the transmitter power requirement, modestly applying digital signal processing, and shifting some of the radio coverage burden to a higher density of small, low-power, low-complexity, low-cost fixed radio ports has the effect of shifting some of the talk time, weight, and cost constraints from battery technology to solid state electronics technology, which continues to experience orders-of-magnitude improvements in the span of several years. Digital signal-processing complexity, however, cannot be permitted to overwhelm power consumption in low-power handsets; whereas small differences in complexity will not matter much, orders-of-magnitude differences in complexity will continue to be significant. Thus, it can be seen from Table 15.4 that the size, weight, and quality arguments in the preceding sections generally hold for these examples. It also is evident from the preceding paragraphs that they will be even more notable when comparing digital cordless pocketphones with digital cellular pocketphones of the same development generations.

15.4 Evolution Toward the Future and to Low-Tier Personal Communications Services

After looking at the evolution of several wireless technologies and systems in the preceding sections, it appears appropriate to ask again: wireless personal communications, what is it? All of the technologies in the preceding sections claim to provide wireless personal communications, and all do to some extent. All have significant limitations, however, and all are evolving in attempts to overcome the limitations. It seems appropriate to ask, what are the likely endpoints? Perhaps some hint of the endpoints can be found by exploring what users see as limitations of existing technologies and systems and by looking at the evolutionary trends.
In order to do so, we summarize some important clues from the preceding sections and project them, along with some U.S. standards activity, toward the future.

Digital Cordless Telephones

• Strengths: good circuit quality; long talk time; small lightweight battery; low-cost sets and service.
• Limitations: limited range; limited usage regions.
• Evolutionary trends: phone points in public places; wireless PBX in business.
• Remaining limitations and issues: limited usage regions and coverage holes; limited or no handoff; limited range. { Experience with PHS and CT-2 phone point has provided more emphasis on the need for vehicle speed handoff and continuous widespread coverage of populated areas and of highways in between. }

Digital Cellular Pocket Handsets

• Strength: widespread service availability.
• Limitations: limited talk time; large heavy batteries; high-cost sets and service; marginal circuit quality; holes in coverage and poor in-building coverage; limited data capabilities; complex technologies.
• Evolutionary trends: microcells to increase capacity and in-building coverage and to reduce battery drain; satellite systems to extend coverage.
• Remaining limitations and issues: limited talk time and large battery; marginal circuit quality; complex technologies.

Wide Area Data

• Strength: digital messages.
• Limitations: no voice; limited data rate; high cost.
• Evolutionary trends: microcells to increase capacity and reduce cost; share facilities with voice systems to reduce cost.
• Remaining limitations and issues: no voice; limited capacity.

Wireless Local Area Networks (WLANs)

• Strength: high data rate.
• Limitations: insufficient capacity for voice; limited coverage; no standards; chaos.
• Evolutionary trends: hard to discern from all of the churning.

Paging/Messaging

• Strengths: widespread coverage; long battery life; small lightweight sets and batteries; economical.
• Limitations: one-way message only; limited capacity.
• Evolutionary desire: two-way messaging and/or voice; capacity.
• Limitations and issues: two-way link cannot exploit the advantages of one-way link asymmetry.

{ Fixed Wireless Loops

• Strength: high data rates.
• Limitations: no mobility. }

There is a strong trajectory evident in these systems and technologies aimed at providing the following features.

High Quality Voice and Data

• To small, lightweight, pocket carried communicators.
• Having small lightweight batteries.
• Having long talk time and long standby battery life.
• Providing service over large coverage regions.
• For pedestrians in populated areas (but not requiring high population density).
• Including low to moderate speed mobility with handoff. { It has become evident from the experience with PHS and CT-2 phone point that vehicle speed handoff is essential so that handsets can be used in vehicles also. }

Economical Service

• Low subscriber-set cost.
• Low network-service cost.

Privacy and Security of Communications

• Encrypted radio links.

This trajectory is evident in all of the evolving technologies but can only be partially satisfied by any of the existing and evolving systems and technologies! Trajectories from all of the evolving technologies and systems are illustrated in Fig. 15.1 as being aimed at low-tier personal communications systems or services, i.e., low-tier PCS. Taking characteristics from cordless, cellular, wide-area data, and at least moderate-rate WLANs suggests the following attributes for this low-tier PCS.

1. 32 kb/s ADPCM speech encoding in the near future to take advantage of the low complexity and low power consumption, and to provide low-delay high-quality speech.
2. Flexible radio link architecture that will support multiple data rates from several kilobits per second. This is needed to permit evolution in the future to lower bit rate speech as technology improvements permit high quality without excessive power consumption or transmission delay and to provide multiple data rates for data transmission and messaging.
3. Low transmitter power (≤ 25 mW average) with adaptive power control to maximize talk time and data transmission time. This incurs short radio range that requires many base stations to cover a large region. Thus, base stations must be small and inexpensive, like cordless telephone phone points or the Metricom wireless data base stations. { The lower power will require somewhat closer spacing of base stations in cluttered environments with many buildings, etc. This issue is dealt with in more detail in Section 15.5. The issues associated with greater base station spacing along highways are also considered in Section 15.5. }
4. Low-complexity signal processing to minimize power consumption. Complexity one-tenth that of digital cellular or high-tier PCS technologies is required [29]. With only several tens of milliwatts (or less under power control) required for transmitter power, signal processing power becomes significant.
5. Low cochannel interference and high coverage area design criteria. In order to provide high-quality service over a large region, at least 99% of any covered area must receive good or better coverage and be below acceptable cochannel interference limits. This implies less than 1% of a region will receive marginal service. This is an order-of-magnitude higher service requirement than the 10% of a region permitted to receive marginal service in vehicular cellular system (high-tier PCS) design criteria.
6. Four-level phase modulation with coherent detection to maximize radio link performance and capacity with low complexity.
7. Frequency division duplexing to relax the requirement for synchronizing base station transmissions over a large region. { PHS uses time division duplexing and requires base station synchronization. In first deployments, one provider did not implement this synchronization. The expected serious performance degradation prompted system upgrades to provide the needed synchronization. While this is not a big issue, it does add complexity to the system and decreases the overall robustness. }
8. { As noted previously, experience with PHS and CT-2 phone point has emphasized the need for vehicular speed handoff in these low-tier PCS systems. Such handoff is readily implemented in PACS and has been demonstrated in the field [51]. This issue is discussed in more detail later in this section. }

Such technologies and systems have been designed, prototyped, and laboratory and field tested and evaluated for several years [7, 23, 24, 25, 26, 27, 28, 29, 31, 32, 50].

The viewpoint expressed here is consistent with the progress in the Joint Technical Committee (JTC) of the U.S. standards bodies, Telecommunications Industry Association (TIA) and Committee T1 of the Alliance for Telecommunications Industry Solutions (ATIS). Many technologies and systems were submitted to the JTC for consideration for wireless PCS in the new 1.9-GHz frequency bands for use in the United States [20]. Essentially all of the technologies and systems listed in Table 15.1, and some others, were submitted in late 1993. It was evident that there were at least two and perhaps three distinctly different classes of submissions. No systems optimized for packet data were submitted, but some of the technologies are optimized for voice.

One class of submissions was the group labeled high-power systems, digital cellular (high-tier PCS) in Table 15.1. These are the technologies discussed previously in this chapter. They are highly optimized for low-bit-rate voice and, therefore, have somewhat limited capability for serving packet-data applications. Since it is clear that wireless services to wide ranging high-speed mobiles will continue to be needed, and that the technology already described for low-tier PCS may not be optimum for such services, Fig. 15.1 shows a continuing evolution and need in the future for high-tier PCS systems that are the equivalent of today's cellular radio. There are more than 100 million vehicles in the United States alone. In the future, most, if not all, of these will be equipped with high-tier cellular mobile phones. Therefore, there will be a continuing and rapidly expanding market for high-tier systems.
Another class of submissions to the JTC [20] included the Japanese personal handyphone system (PHS) and a technology and system originally developed at Bellcore but carried forward to prototypes and submitted to the JTC by Motorola and Hughes Network Systems. This system was known as wireless access communications systems (WACS; see footnote 2). These two submissions were so similar in their design objectives and system characteristics that, with the agreement of the delegations from Japan and the United States, the PHS and WACS submissions were combined under a new name, personal access communication systems (PACS), that was to incorporate the best features of both. This advanced, low-power wireless access system, PACS, was to be known as low-tier PCS. Both WACS/PACS and Handyphone (PHS) are shown in Table 15.1 as low-tier PCS and represent the evolution to low-tier PCS in Fig. 15.1. The WACS/PACS/UDPC system and technology are discussed in [7, 23, 24, 25, 26, 28, 29, 31, 32, 50].

In the JTC, submissions for PCS of DECT and CT-2 and their variations were also lumped under the class of low-tier PCS, even though these advanced digital cordless telephone technologies were somewhat more limited in their ability to serve all of the low-tier PCS needs. They are included under digital cordless technologies in Table 15.1. Other technologies and systems were also submitted to the JTC for high-tier and low-tier applications, but they have not received widespread industry support.

One wireless access application discussed earlier that is not addressed by either high-tier or low-tier PCS is the high-speed WLAN application. Specialized high-speed WLANs also are likely to find a place in the future. Therefore, their evolution is also continued in Fig. 15.1. The figure also recognizes that widespread low-tier PCS can support data at several hundred kilobits per second and, thus, can satisfy many of the needs of WLAN users. It is not clear what the future roles are for paging/messaging, cordless telephone appliances, or wide-area packet-data networks in an environment with widespread contiguous coverage by low-tier and high-tier PCS. Thus, their extensions into the future are indicated with a question mark in Fig. 15.1.
Those who may object to the separation of wireless PCS into high-tier and low-tier should review this section again, and note that we have two tiers of PCS now. On the voice side there is cellular radio, i.e., high-tier PCS, and cordless telephone, i.e., an early form of low-tier PCS. On the data side there is wide-area data, i.e., high-tier data PCS, and WLANs, i.e., perhaps a form of low-tier data PCS. In their evolutions, these all have the trajectories discussed and shown in Fig. 15.1 that point surely toward low-tier PCS. It is this low-tier PCS that marketing studies continue to project is wanted by more than half of the U.S. households or by half of the people, a potential market of over 100 million subscribers in the United States alone. Similar projections have been made worldwide. { PACS technology [6] has been prototyped by several manufacturers. In 1995 field demonstrations were run in Boulder, CO at a U.S. West test site using radio ports (base stations) and radio port control units made by NEC. “Handset” prototypes made by Motorola and Panasonic were trialed. The handsets and ports were brought together for the first time in Boulder. The highly successful trial demonstrated the ease of integrating the subsystems of the low-complexity PACS technology and the overall advantages of PACS from a user’s perspective as noted throughout this chapter. Effective vehicular speed operation was demonstrated in these tests. Also, Hughes Network Systems (HNS) has developed and tested many sets of PACS infrastructure technology with different handsets in several settings and has many times demonstrated highly reliable high vehicular speed (in excess of 70 mi/hr) operation and handoff among several radio ports. Motorola also has demonstrated PACS equipment in several settings at vehicular speeds as well as for wireless loop applications. 
Highly successful demonstrations of PACS prototypes have been conducted from Alaska to Florida, from New York to California, and in China and elsewhere. A PACS deployment in China using NEC equipment started to provide service in 1998. The U.S. service provider, 21st Century Telesis, is poised to begin a PACS deployment in several states in the U.S. using infrastructure equipment from HNS and handsets and switching equipment from different suppliers. Perhaps, with a little more support of these deployments, the public will finally be able to obtain the benefits of low-tier PCS. }

2 WACS was known previously as Universal Digital Portable Communications (UDPC).

15.5 Comparisons with Other Technologies

15.5.1 Complexity/Coverage Area Comparisons

Experimental research prototypes of radio ports and subscriber sets [64, 66] have been constructed to demonstrate the technical feasibility of the radio link requirements in [7]. These WACS prototypes generally have the characteristics and parameters previously noted, with the exceptions that 1) the portable transmitter power is lower (10 mW average, 100 mW peak), 2) dynamic power control and automatic time slot transfer are not implemented, and 3) a rudimentary automatic link-transfer implementation is based only on received power. The experimental base stations transmit near 2.17 GHz; the experimental subscriber sets transmit near 2.12 GHz. Both operated under a Bellcore experimental license. The experimental prototypes incorporate application-specific, very large-scale integrated circuits (see footnote 3) fabricated to demonstrate the feasibility of the low-complexity high-performance digital signal-processing techniques [63, 64] for symbol timing and coherent bit detection. These techniques permit the efficient short TDMA bursts having only 100 b that are necessary for low-delay TDMA implementations. Other digital signal-processing functions in the prototypes are implemented in programmable logic devices. All of the digital signal-processing functions combined require about 1/10 of the logic gates that are required for digital signal processing in vehicular digital cellular mobile implementations [42, 62, 63]; that is, this low-complexity PCS implementation, having no delay-dispersion-compensating circuits and no forward error-correction decoding, is about 1/10 as complex as the digital cellular implementations that include these functions (see footnote 4). The 32 kb/s ADPCM speech encoding in the low-complexity PCS implementation is also about 1/10 as complex as the less than 10-kb/s speech encoding used in digital cellular implementations. This significantly lower complexity will continue to translate into lower power consumption and cost.
It is particularly important for low-power pocket personal communicators with power control in which the DC power expended for radio frequency transmitting can be only tens of milliwatts for significant lengths of time.

3 Application-specific integrated circuits (ASIC), very large-scale integration (VLSI).
4 Some indication of VLSI complexity can be seen by the number of people required to design the circuits. For the low-complexity TDMA ASIC set, only one person part time plus a student part time were required; the complex CDMA ASIC has six authors on the paper alone.

The experimental radio links have been tested in the laboratory for detection sensitivity [bit error rate (BER) vs SNR] [18, 61, 66] and for performance against cochannel interference [1] and intersymbol interference caused by multipath delay spread [66]. These laboratory tests confirm the performance of the radio-link techniques. In addition to the laboratory tests, qualitative tests have been made in several PCS environments to compare these experimental prototypes with several United States CT-1 cordless telephones at 50 MHz, with CT-2 cordless telephones at 900 MHz, and with DCT900 cordless telephones at 900 MHz. Some of these comparisons have been reported [8, 71, 84, 85]. In general, depending on the criteria, e.g., either no degradation or limited degradation of circuit quality, these WACS experimental prototypes covered areas inside buildings that ranged from 1.4 to 4 times the areas covered by the other technologies. The coverage areas for the experimental prototypes were always substantially limited in two or three directions by the outside walls of the buildings. These area factors could be expected to be even larger if the coverage were not limited by walls, i.e., once all of a building is covered in one direction, no more area can be covered no matter what the radio link margin. The earlier comparisons [8, 84, 85] were made with only two-branch uplink diversity, before subscriber-set transmitting antenna switching was implemented, and with only one radio port, before automatic radio-link transfer was implemented. The later tests [71] included these implementations. These reported comparisons agree with similar unreported comparisons made in a Bellcore Laboratory building. Similar coverage comparison results have been noted for a 900-MHz ISM-band cordless telephone compared to the 2-GHz experimental prototype. The area coverage factors (e.g., ×1.4 to ×4) could be expected to be even greater if the cordless technologies had also been operated at 2 GHz since attenuation inside buildings between similar small antennas is about 7 dB greater at 2 GHz than at 900 MHz [35, 36] and the 900 MHz handsets transmitted only 3 dB less average power than the 2-GHz experimental prototypes.

The greater area coverage demonstrated for this technology is expected because of the different compromises noted earlier; the following, in particular.

1. Coherent detection of QAM provides more detection sensitivity than noncoherent detection of frequency-shift modulations [17].
2. Antenna diversity mitigates bursts of errors from multipath fading [66].
3. Error detection and blanking of TDMA bursts having errors significantly improves perceived speech quality [72]. (Undetected errors in the most significant bit cause sharp audio pops that seriously degrade perceived speech quality.)
4. Robust symbol timing and burst and frame synchronization reduce the number of frames in error due to imperfect timing and synchronization [66].
5. Transmitting more power from the radio port compared to the subscriber set offsets the less sensitive subscriber set receiver compared to the port receiver that results from power and complexity compromises made in a portable set.

Of course, as expected, the low-power (10-mW) radio links cover less area than high-power (0.5-W) cellular mobile pocketphone radio links because of the 17-dB transmitter power difference resulting from the compromises discussed previously. In the case of vehicular mounted sets, even more radio-link advantage accrues to the mobile set because of the higher gain of vehicle-mounted antennas and higher transmitter power (3 W).

{ The power difference between a low-tier PACS handset and a high-tier PCS or cellular pocket handset is not as significant in limiting range as is often portrayed. Other differences in deployment scenarios for low-tier and high-tier systems are as large or even larger factors, e.g., base station antenna height and antenna gain. This can be seen by considering using the same antennas and receiver noise figures at base stations and looking at the range of high-tier and low-tier handsets. High-tier handsets typically transmit a maximum average power of 0.5 watt. The PACS handset average transmit power is 25 milliwatts (peak power is higher for TDMA, but equal comparisons can be made considering average power and equivalent receiver sensitivities). This power ratio of ×20 translates to a range reduction by a factor of about 0.5 for an environment with a distance dependence of 1/d^4 or a factor of about 0.4 for a 1/d^3.5 environment. These represent typical values of distance dependence for PCS and cellular environments. Thus, if the high-tier handset would provide a range of 5 miles in some environment, the low-tier handset would provide a range of 2 to 2.5 miles in the same environment, if the base station antennas and receiver noise figures were the same. This difference in range is no greater than the difference in range between high-tier PCS handsets used in 1.9 GHz systems and cellular handsets with the same power used in 800 MHz systems. Range factors are discussed further in the next section. }
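The range comparison above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming a simple power-law path-loss model in which received power falls off as 1/d^n and all other link parameters (antennas, receiver noise figures) are equal; the function names are illustrative and do not come from the chapter.

```python
import math

def range_factor(power_ratio: float, path_loss_exponent: float) -> float:
    """Factor by which range shrinks when transmit power is divided by
    power_ratio, assuming received power varies as 1/d**path_loss_exponent."""
    return power_ratio ** (-1.0 / path_loss_exponent)

def ratio_to_db(power_ratio: float) -> float:
    """Express a power ratio in decibels."""
    return 10.0 * math.log10(power_ratio)

high_tier_w = 0.5    # 0.5 W maximum average power (value quoted in the text)
low_tier_w = 0.025   # 25 mW PACS average power (value quoted in the text)
ratio = high_tier_w / low_tier_w

print(f"power ratio x{ratio:.0f} = {ratio_to_db(ratio):.1f} dB")
for n in (4.0, 3.5):
    f = range_factor(ratio, n)
    print(f"1/d^{n}: range factor {f:.2f}, so 5 mi becomes {5.0 * f:.1f} mi")
```

Under these assumptions the factors come out slightly below 0.5 and slightly above 0.4, in line with the rounded values in the text. The same `ratio_to_db` helper applied to the 10-mW versus 0.5-W prototype comparison (a ratio of 50) gives about 17 dB.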

15.5.2 {Coverage, Range, Speed, and Environments

Interest has been expressed in having greater range for low-tier PCS technology for low-population-density areas. One should first note that the range of a wireless link is highly dependent on the amount of clutter or obstructions in the environment in which it is operated. For example, radio link calculations that result in a 1400-ft base station (radio-port) separation at 1.9 GHz contain over 50-dB margin for shadowing from obstructions and multipath effects [25, 37]. Thus, in an environment without obstructions, e.g., along a highway, the base station separation can be increased at least by a factor of 4 to over a mile, i.e., 25 dB for an attenuation characteristic of d^-4, while providing the same quality of service, without any changes to the base station or subscriber transceivers, and while still allowing over 25-dB margin for multipath and some shadowing. This remaining margin allows for operation of a handset inside an automobile. In such an unobstructed environment, multipath RMS delay spread [21, 33] will still be less than the 0.5 µs in which PACS was designed to operate [28].

Operation at still greater range along unobstructed highways or at a range of a mile along more obstructed streets can be obtained in several ways. Additional link gain of 6 dB can be obtained by going from omnidirectional antennas at base stations to 90° sectored antennas (four sectors). Another 6 dB can be obtained by raising base station antennas by a factor of 2 from 27 ft to 55 ft in height. This additional 12 dB will allow another factor of 2 increase in range to 2-mile base station separation along highways, or to about 3000-ft separation in residential areas. Even higher-gain and taller antennas could be used to concentrate coverage along highways, particularly in rural areas. Of course, range could be further increased by increasing the power transmitted.
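The relation between decibels of link margin and range used in the preceding paragraphs can be sketched as follows. This is an illustrative calculation only, assuming the d^-4 attenuation characteristic mentioned in the text; the function names are hypothetical.

```python
import math

def range_multiplier(extra_db: float, path_loss_exponent: float = 4.0) -> float:
    """Range increase bought by extra_db of link gain or freed margin,
    assuming attenuation varies as d**(-path_loss_exponent)."""
    return 10.0 ** (extra_db / (10.0 * path_loss_exponent))

def db_for_range_multiplier(multiplier: float, path_loss_exponent: float = 4.0) -> float:
    """Link gain in dB needed to stretch range by the given multiplier."""
    return 10.0 * path_loss_exponent * math.log10(multiplier)

# Freeing about 25 dB of the shadowing margin on an unobstructed highway.
base_separation_ft = 1400.0
print(f"x4 range needs {db_for_range_multiplier(4.0):.1f} dB")
print(f"25 dB gives x{range_multiplier(25.0):.2f}, "
      f"about {base_separation_ft * range_multiplier(25.0) / 5280.0:.2f} mi")

# 6 dB from four-sector antennas plus 6 dB from doubling antenna height.
print(f"12 dB gives x{range_multiplier(12.0):.2f}")
```

With these assumptions, a factor of 4 in range costs roughly 24 dB, which matches the "25 dB" quoted, and 12 dB of added gain yields very nearly a factor of 2.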
As the range of the low-tier PACS technology is extended in cluttered areas by increasing link gain, increased RMS delay spread is likely to be encountered. This will require increasing complexity in receivers. A factor of 2 in tolerance of delay spread can be obtained by interference-canceling signal combining [76, 77, 78, 79, 80] from two antennas instead of the simpler selection diversity combining originally used in PACS. This will provide adequate delay-spread tolerance for most suburban environments [21, 33]. The PACS downlink contains synchronization words that could be used to train a conventional delay-spread equalizer in subscriber set receivers. Constant-modulus (blind) equalization will provide greater tolerance to delay spread in base station receivers on the uplink [45, 46, 47, 48] than can be obtained by interference-cancellation combining from only two antennas. The use of more basestation antennas and receivers can also help mitigate uplink delay spread. Thus, with some added complexity, the low-tier PACS technology can work effectively in the RMS delay spreads expected in cluttered environments for base station separations of 2 miles or so. The guard time in the PACS TDMA uplink is adequate for 1-mile range, i.e., 2-mile separation between base station and subscriber transceivers. A separation of up to 3 miles between transceivers could be allowed if some statistical outage were accepted for the few times when adjacent uplink timeslots are occupied by subscribers at the extremes of range (near–far). With some added complexity in assigning timeslots, the assignment of subscribers at very different ranges to adjacent timeslots could be avoided, and the base station separation could be increased to several miles without incurring adjacent slot interference. 
A simple alternative in low-density (rural) areas, where lower capacity could be acceptable and greater range could be desirable, would be to use every other timeslot to ensure adequate guard time for range differences of many tens of miles. Also, the capability of transmitter time advance has been added to the PACS standard in order to increase the range of operation. Such time advance is applied in the cellular TDMA technologies.
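The link between guard time and range can be illustrated with a short calculation. This is a sketch under stated assumptions: a subscriber without time advance synchronizes to the downlink, so its uplink burst arrives late by the round-trip propagation delay. The 8-slots-per-frame figure is an assumption made for illustration (the text gives only the 2.5-ms frame length), and the function names are hypothetical.

```python
C_M_PER_S = 299_792_458.0   # speed of light
M_PER_MILE = 1609.344

def round_trip_delay_us(range_miles: float) -> float:
    """Uplink burst lateness, in microseconds, for a subscriber at the given
    range relative to one at zero range (no transmitter time advance)."""
    return 2.0 * range_miles * M_PER_MILE / C_M_PER_S * 1e6

def max_range_miles(guard_time_us: float) -> float:
    """Largest range difference a given guard time can absorb."""
    return guard_time_us * 1e-6 * C_M_PER_S / (2.0 * M_PER_MILE)

FRAME_US = 2500.0        # 2.5-ms PACS frame (from the text)
SLOTS_PER_FRAME = 8      # ASSUMPTION for illustration, not stated in the text
slot_us = FRAME_US / SLOTS_PER_FRAME

print(f"1 mi  -> {round_trip_delay_us(1.0):.1f} us of timing offset")
print(f"3 mi  -> {round_trip_delay_us(3.0):.1f} us of timing offset")
print(f"idle slot of {slot_us:.1f} us absorbs about {max_range_miles(slot_us):.0f} mi")
```

Each mile of range difference costs a little under 11 µs of guard time, which is why leaving a whole timeslot idle extends the tolerable range difference to tens of miles under the assumed slot length.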


The synchronization, carrier recovery, and detection in the low-complexity PACS transceivers will perform well at highway speeds. The two-receiver diversity used in uplink transceivers also will perform well at highway speeds. The performance of the single-receiver selection diversity used in the low-complexity PACS downlink transceivers begins to deteriorate at speeds above about 30 mi/h. However, at any speed, the performance is always at least as good as that of a single transceiver without the low-complexity diversity. Also, fading in the relatively uncluttered environment of a highway is likely to have a less severe Ricean distribution, so diversity will be less needed for mitigating the fading. Cellular handsets do not have diversity. Of course, more complex two-receiver diversity could be added to downlink transceivers to provide two-branch diversity performance at highway speeds. It should be noted that the very short 2.5-ms TDMA frames incorporated into PACS to provide low transmission delay (for high speech quality) also make the technology significantly less sensitive to high-speed fading than the longer-frame-period cellular technologies. The short frame also facilitates the rapid coordination needed to make reliable high-speed handoffs between base stations. Measurements on radio links to potential handoff base stations can be made rapidly, i.e., a measurement on at least one radio link every 2.5 ms. Once a handoff decision is made, signalling exchanges every 2.5 ms ensure that the radio link handoff is completed quickly. In contrast, the long frame periods in the high-tier (cellular) technologies prolong the time it takes to complete a handoff. As noted earlier, high speed handoff has been demonstrated many times with PACS technology and at speeds over 70 mi/hr. }
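A rough feel for why a short frame helps at highway speeds comes from comparing the frame length with the maximum Doppler frequency. The sketch below assumes the classical relation f_d = v f_c / c and a 1.9-GHz carrier; the 20-ms comparison frame is a hypothetical value chosen only to represent "a longer frame" and is not taken from the text.

```python
C_M_PER_S = 299_792_458.0   # speed of light
MPH_TO_MPS = 0.44704

def max_doppler_hz(speed_mph: float, carrier_hz: float) -> float:
    """Maximum Doppler shift for a terminal moving at speed_mph."""
    return speed_mph * MPH_TO_MPS * carrier_hz / C_M_PER_S

def fade_cycles_per_frame(speed_mph: float, carrier_hz: float, frame_s: float) -> float:
    """Doppler-frame product: how many Doppler cycles fit in one frame.
    Smaller values mean the channel changes less between bursts."""
    return max_doppler_hz(speed_mph, carrier_hz) * frame_s

CARRIER_HZ = 1.9e9
for label, frame_s in (("2.5-ms PACS frame", 2.5e-3),
                       ("hypothetical 20-ms frame", 20e-3)):
    product = fade_cycles_per_frame(70.0, CARRIER_HZ, frame_s)
    print(f"{label}: Doppler-frame product {product:.2f} at 70 mi/h")
```

At 70 mi/h the maximum Doppler shift is about 200 Hz, so the channel goes through roughly half a Doppler cycle per 2.5-ms frame, and several cycles over a frame that is eight times longer.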

15.6 Quality, Capacity, and Economic Issues

Although the several trajectories toward low-tier PCS discussed in the preceding section are clear, it does not fit the existing wireless communications paradigms. Thus, low-tier PCS has attracted less attention than the systems and technologies that are compatible with the existing paradigms. Some examples are cited in the following paragraphs. The need for intense interaction with an intelligent network infrastructure in order to manage mobility is not compatible with the cordless telephone appliance paradigm. In that paradigm, independence of network intelligence and base units that mimic wireline telephones are paramount. Wireless data systems often do not admit to the dominance of wireless voice communications and, thus, do not take advantage of the economics of sharing network infrastructure and base station equipment. Also, wireless voice systems often do not recognize the importance of data and messaging and, thus, only add them in as bandaids to systems. The need for a dense collection of many low-complexity, low-cost, low-tier PCS base stations interconnected with inexpensive fixed-network facilities (copper or fiber based) does not fit the cellular high-tier paradigm that expects sparsely distributed $1 million cell sites. Also, the need for high transmission quality to compete with wireline telephones is not compatible with the drive toward maximizing users-per-cell-site and per megahertz to minimize the number of expensive cell sites. These concerns, of course, ignore the hallmark of frequency-reusing cellular systems. That hallmark is the production of almost unlimited overall system capacity by reducing the separation between base stations. The cellular paradigm does not recognize the fact that almost all houses in the U.S. have inexpensive copper wires connecting telephones to the telephone network. 
The use of low-tier PCS base stations that concentrate individual user services before backhauling in the network will result in fewer fixed interconnecting facilities than exist now for wireline telephones. Thus, inexpensive techniques for interconnecting many low-tier base stations are already deployed to provide wireline telephones to almost all houses. { The cost of backhaul to many base stations (radio ports) in a low-tier system is often cited as an economic disadvantage that cannot be overcome.
However, this perception is based on existing tariffs for T1 digital lines which are excessive considering current digital subscriber line technology. These tariffs were established many years ago when digital subscriber line electronics were very expensive. With modern low-cost high-rate digital subscriber line (HDSL) electronics, the cost of backhaul could be greatly reduced. If efforts were made to revise tariffs for digital line backhaul based on low cost electronics and copper loops like residential loops, the resulting backhaul costs would more nearly approach the cost of residential telephone lines. As it is now, backhaul costs are calculated based on antiquated high T1 line tariffs that were established for “antique” high cost electronics. } This list could be extended, but the preceding examples are sufficient, along with the earlier sections of the paper, to indicate the many complex interactions among circuit quality, spectrum utilization, complexity (circuit and network), system capacity, and economics that are involved in the design compromises for a large, high-capacity wireless-access system. Unfortunately, the tendency has been to ignore many of the issues and focus on only one, e.g., the focus on cell site capacity that drove the development of digital-cellular high-tier systems in the United States. Interactions among circuit quality, complexity, capacity, and economics are considered in the following sections.

15.6.1 Capacity, Quality, and Complexity

Although capacity comparisons frequently are made without regard to circuit quality, complexity, or cost per base station, such comparisons are not meaningful. An example in Table 15.5 compares capacity factors for U.S. cellular or high-tier PCS technologies with the low-tier PCS technology, PACS/WACS. The mean opinion scores (MOS) (noted in Table 15.5) for speech coding are discussed later. Detection of speech activity and turning off the transmitter during times of no activity is implemented in IS-95. Its impact on MOS also is noted later. A similar technique has been proposed as E-TDMA for use with IS-54 and is discussed with respect to TDMA systems in [29]. Note that the use of low-bit-rate speech coding combined with speech activity degrades the high-tier system’s quality by nearly one full MOS point on the five-point MOS scale when compared to 32 kb/s ADPCM. Tandem encoding is discussed in a later section. These speech quality degrading factors alone provide a base station capacity increasing factor of ×4 × 2.5 = ×10 over the high-speech-quality low-tier system! Speech coding, of course, directly affects base station capacity and, thus, overall system capacity by its effect on the number of speech channels that can fit into a given bandwidth.

TABLE 15.5 Comparison of Cellular (IS-54/IS-95) and Low-Tier PCS (WACS/PACS). Capacity Comparisons Made without Regard to Quality Factors, Complexity, and Cost per Base Station Are not Meaningful

Parameter                             Cellular (High-Tier)            Low-Tier PCS                  Capacity Factor
Speech coding, kb/s                   8 (MOS 3.4), no tandem coding   32 (MOS 4.1), 3 or 4 tandem   ×4
Speech activity                       Yes (MOS 3.2)                   No (MOS 4.1)                  ×2.5
Percentage of good areas, %           90                              99                            ×2
Propagation σ, dB                     8                               10                            ×1.5
Total: trading quality for capacity                                                                 ×30

The allowance of extra system margin to provide coverage of 99% of an area for low-tier PCS versus 90% coverage for high-tier is discussed in the previous section and [29]. This additional quality factor costs a capacity factor of ×2. The last item in Table 15.5 does not change the actual system, but only changes the way that frequency reuse is calculated. The additional 2-dB margin in standard deviation σ, allowed for coverage into houses and small buildings for low-tier PCS, costs yet another factor of ×1.5 in calculation only. Frequency reuse factors affect the number of sets of frequencies required and, thus, the bandwidth available for use at each base station. Thus, these factors also affect the base station capacity and the overall system capacity. For the example in Table 15.5, significant speech and coverage quality has been traded for a factor of ×30 in base station capacity!

Whereas base station capacity affects overall system capacity directly, it should be remembered that overall system capacity can be increased arbitrarily by decreasing the spacing between base stations. Thus, if the PACS low-tier PCS technology were to start with a base station capacity of ×0.5 of AMPS cellular⁵ (a much lower figure than the ×0.8 sometimes quoted [20]), and then were degraded in quality as described above to yield the ×30 capacity factor, it would have a resulting capacity of ×15 of AMPS! Thus, it is obvious that making such a base station capacity comparison without including quality is not meaningful.
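The arithmetic behind Table 15.5 and the ×15 AMPS figure can be checked with a short sketch. The factor values are taken from the table; the dictionary keys and variable names are illustrative only, and the ×0.5 starting point is the arbitrary figure used in the text, not a measured value.

```python
from math import prod

# Capacity factors from Table 15.5 (quality traded for base station capacity).
# The key names are descriptive labels chosen for this sketch.
capacity_factors = {
    "speech coding (32 -> 8 kb/s)": 4.0,
    "speech activity detection": 2.5,
    "good areas (99% -> 90%)": 2.0,
    "propagation sigma (10 -> 8 dB)": 1.5,
}

# Factor obtained from the two speech-quality items alone.
speech_only_factor = (
    capacity_factors["speech coding (32 -> 8 kb/s)"]
    * capacity_factors["speech activity detection"]
)

# Product of all four factors: the total quality-for-capacity trade.
total_factor = prod(capacity_factors.values())

# Arbitrary illustrative starting point from the text: PACS at x0.5 of AMPS.
pacs_relative_to_amps = 0.5
degraded_pacs_relative_to_amps = pacs_relative_to_amps * total_factor

print(f"speech-only factor: x{speech_only_factor:g}")
print(f"total factor:       x{total_factor:g}")
print(f"degraded PACS:      x{degraded_pacs_relative_to_amps:g} of AMPS")
```

Because the factors multiply, dropping any single quality concession removes its whole multiplier from the total, which is why the comparison is so sensitive to what is held equal.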

15.6.2 Economics, System Capacity, and Coverage Area Size

Claims are sometimes made that low-tier PCS cannot be provided economically, even though it is what the user wants. These claims are often made based on economic estimates from the cellular paradigm. These include the following.

• Very low estimates of market penetration, much less than cordless telephones, and often even less than cellular.
• High estimates of base station costs more appropriate to high-complexity, high-cost cellular technology than to low-complexity, low-cost, low-tier technology.
• Very low estimates of circuit usage time more appropriate to cellular than to cordless/wireline telephone usage, which is more likely for low-tier PCS.
• { Backhaul costs based on existing T1 line tariffs that are based on “antique” high cost digital loop electronics. (See discussion in fourth paragraph at start of Section 15.6.) }

Such economic estimates are often done by making absolute economic calculations based on very uncertain input data. The resulting estimates for low-tier and high-tier are often closer together than the large uncertainties in the input data. A perhaps more realistic approach for comparing such systems is to vary only one or two parameters while holding all others fixed and then to look at relative economics between high-tier and low-tier systems. This is the approach used in the following examples.

EXAMPLE 15.1:

In the first example (see Table 15.6), the number of channels per megahertz is held constant for cellular and for low-tier PCS. Only the spacing between base stations is varied, e.g., cell sites for cellular and radio ports for low-tier PCS, to account for the differences in transmitter power, antenna height, etc. In this example, overall system capacity varies inversely as the square of base station spacing, but base station capacity is the same for both cellular and low-tier PCS. For the typical values in the example, the resulting low-tier system capacity is ×400 greater, only because of the closer base station spacing. If the two systems were to cost the same, the equivalent low-tier PCS base stations would have to cost less than $2,500.

⁵ Note that the ×0.5 factor is an arbitrary factor taken for illustrating this example. The so-called ×AMPS factors are only with regard to base station capacity, although they are often misused as system capacity.

TABLE 15.6 System Capacity/Coverage Area Size/Economics

Example 15.1
  Assume channels/MHz are the same for cellular and PCS
  Cell site: spacing = 20,000 ft, cost = $1 M
  PCS port:  spacing = 1,000 ft
  PCS system capacity is (20000/1000)^2 = 400 × cellular capacity
  Then, for the system costs to be the same:
    Port cost = ($1 M/400) = $2,500, a reasonable figure
  If cell site and port each have 180 channels:
    Cellular cost/circuit = $1 M/180 = $5,555/circuit
    PCS cost/circuit = $2500/180 = $14/circuit

Example 15.2
  Assume equal cellular and PCS system capacity
  Cell site: spacing = 20,000 ft
  PCS port:  spacing = 1,000 ft
  If a cell site has 180 channels, then, for equal system capacity, a PCS port needs 180/400 < 1 channel/port

Example 15.3 Quality/cost trade
  Cell site: spacing = 20,000 ft, cost = $1 M, channels = 180
  PCS port:  spacing = 1,000 ft, cost = $2,500
  Cellular to PCS, base station spacing capacity factor = ×400
  PCS to cellular quality reduction factors:
    32 to 8 kb/s speech                     ×4
    Voice activity (buying)                 ×2
    99–90% good areas                       ×2
    Both in same environment (same σ)       ×1
    Capacity factor traded                  ×16
  180 ch/16 = 11.25 channels/port
  then, $2500/11.25 = $222/circuit
  and remaining is ×400/16 = ×25 system capacity of PCS over cellular

This cost is well within the range of estimates for such base stations, including equivalent infrastructure. These low-tier PCS base stations are of comparable or lower complexity than cellular vehicular subscriber sets, and large-scale manufacture will be needed to produce the millions that will be required. Also, land, building, antenna tower and legal fees for zoning approval, or rental of expensive space on top of commercial buildings, represent large expenses for cellular cell sites. Low-tier PCS base stations that are mounted on utility poles and sides of buildings will not incur such large additional expenses. Therefore, costs of the order of magnitude indicated seem reasonable in large quantities. Note that, with these estimates, the per-wireless-circuit cost of the low-tier PCS circuits would be only $14/circuit compared to $5,555/circuit for the high-tier circuits. Even if there were a factor of 10 error in cost estimates, or a reduction of channels per radio port of a factor of 10, the per-circuit cost of low-tier PCS would still be only $140/circuit, which is still much less than the per-circuit cost of high-tier.

EXAMPLE 15.2:

In the second example (see Table 15.6), the overall system capacity is held constant, and the number of channels/port, i.e., channels/(base station) is varied. In this example, less than 1/2 channel/port is needed, again indicating the tremendous capacity that can be produced with close-spaced low-complexity base stations.

EXAMPLE 15.3:

Since the first two examples are somewhat extreme, the third example (see Table 15.6) uses a more moderate, intermediate approach. In this example, some of the cellular high-tier channels/(base station) are traded to yield higher quality low-tier PCS as in the previous subsection. This reduces the channels/port to 11+, with an accompanying increase in cost/circuit up to $222/circuit, which is still much less than the $5,555/circuit for the high-tier system. Note, also, that the low-tier system still has ×25 the capacity of the high-tier system! Low-tier base station (port) cost would have to exceed $62,500 for the low-tier per-circuit cost to exceed that of the high-tier cellular system. Such a high port cost far exceeds any existing realistic estimate of low-tier system costs.

It can be seen from these examples, and particularly Example 15.3, that the circuit economics of low-tier PCS are significantly better than for high-tier PCS, if the user demand and density are sufficient to make use of the large system capacity. Considering the high penetration of cordless telephones, the rapid growth of cellular handsets, and the enormous market projections for wireless PCS noted earlier in this chapter, filling such high capacity in the future would appear to be certain. The major problem is providing rapidly the widespread coverage (buildout) required by the FCC in the United States. If this unrealistic regulatory demand can be overcome, low-tier wireless PCS promises to provide the wireless personal communications that everyone wants.
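The three examples reduce to a few lines of arithmetic, sketched below. All input values are those listed in Table 15.6; the constant and function names are illustrative choices for this sketch and do not come from the chapter.

```python
# Inputs from Table 15.6.
CELL_SITE_COST = 1_000_000.0     # dollars per cellular cell site
CELL_SPACING_FT = 20_000.0       # cellular base station spacing
PORT_SPACING_FT = 1_000.0        # low-tier PCS radio port spacing
CHANNELS_PER_CELL_SITE = 180


def spacing_capacity_factor(cell_spacing_ft, port_spacing_ft):
    """System capacity gain from closer spacing: the spacing ratio squared."""
    return (cell_spacing_ft / port_spacing_ft) ** 2


spacing_factor = spacing_capacity_factor(CELL_SPACING_FT, PORT_SPACING_FT)

# Example 15.1: same channels per base station, equal total system cost.
port_cost = CELL_SITE_COST / spacing_factor
cellular_cost_per_circuit = CELL_SITE_COST / CHANNELS_PER_CELL_SITE
pcs_cost_per_circuit_ex1 = port_cost / CHANNELS_PER_CELL_SITE

# Example 15.2: equal system capacity, so far fewer channels per port.
channels_per_port_ex2 = CHANNELS_PER_CELL_SITE / spacing_factor

# Example 15.3: trade a x16 capacity factor for quality (x4 * x2 * x2 * x1).
quality_factor_traded = 4 * 2 * 2 * 1
channels_per_port_ex3 = CHANNELS_PER_CELL_SITE / quality_factor_traded
pcs_cost_per_circuit_ex3 = port_cost / channels_per_port_ex3
remaining_capacity_factor = spacing_factor / quality_factor_traded

# Port cost at which low-tier cost per circuit would equal the cellular figure.
break_even_port_cost = cellular_cost_per_circuit * channels_per_port_ex3

print(f"spacing capacity factor:        x{spacing_factor:g}")
print(f"Ex 15.1 port cost:              ${port_cost:,.0f}")
print(f"Ex 15.1 cost/circuit (PCS):     ${pcs_cost_per_circuit_ex1:,.2f}")
print(f"Ex 15.2 channels/port:          {channels_per_port_ex2:g}")
print(f"Ex 15.3 cost/circuit (PCS):     ${pcs_cost_per_circuit_ex3:,.2f}")
print(f"Ex 15.3 remaining capacity:     x{remaining_capacity_factor:g}")
print(f"break-even port cost:           ${break_even_port_cost:,.0f}")
```

Only relative quantities are computed here, which matches the approach stated earlier in the section: vary one or two parameters and compare, rather than attempt an absolute cost estimate.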

15.6.3 {Loop Evolution and Economics

It is interesting to note that several wireless loop applications are aimed at reducing cost by replacing parts of wireline or CATV loops with wireless links between transceivers. The economics of these applications are driven by the replacing of labor-intensive wireline and cable technologies with mass-produced solid-state electronics in transceivers.

Consider first a cordless telephone base unit. The cordless base-unit transceiver usually serves one or, at most, two handsets at the end of one wireline loop. Now consider moving such a base unit back along the copper-wire-pair loop end a distance that can be reliably covered by a low-power wireless link [25, 31], i.e., several hundred to a thousand feet or so, and mounting it on a utility pole or a street light pole. This replaces the copper loop end with the wireless link. Many additional copper loop ends to other subscribers will be contained within a circle around the pole having a maximum usable radius of this wireless link. Replace all of the copper loop ends within the circle with cordless base units on the same pole. Note that this process replaces the most expensive parts of these many loops, i.e., the many individual loop ends, with the wireless links from cordless handsets to “equivalent” cordless base units on a pole. Of course, being mounted outside will require somewhat stronger enclosures and means of powering the base units, but these additional costs are considerably more than offset by eliminating the many copper wire drops.

It is instructive to consider how many subscribers could be collected at a pole containing base units. Consider, as an example, a coverage square of 1400 ft on a side (PACS will provide good coverage over this range, i.e., for base unit pole separations of about 1400 ft, at 1.9 GHz). Within this square will be 45 houses for a 1 house/acre density typical of low-housing-density areas, or 180 houses for 4 house/acre density more typical of high-density single-family housing areas. These represent significant concentration of traffic at a pole.
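The house counts quoted above follow from the area of the coverage square. The sketch below reproduces them; the only value not stated in the text is the standard conversion of 43,560 square feet per acre, and the function name is illustrative.

```python
SQUARE_FEET_PER_ACRE = 43_560  # standard U.S. survey conversion


def houses_in_coverage_square(side_ft, houses_per_acre):
    """Number of houses inside a square coverage area of the given side."""
    acres = side_ft ** 2 / SQUARE_FEET_PER_ACRE
    return acres * houses_per_acre


low_density = houses_in_coverage_square(1400, 1)    # 1 house/acre
high_density = houses_in_coverage_square(1400, 4)   # 4 houses/acre

print(f"low density:  about {round(low_density)} houses")
print(f"high density: about {round(high_density)} houses")
```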


Because of the trunking advantage of the significant number of subscribers concentrated at a pole, they can share a smaller number of base units, i.e., wireless base unit transceivers, than there are wireless subscriber sets. Therefore, the total cost compared with having a cordless base unit per subscriber also is reduced by the concentration of users. A single PACS transceiver will support simultaneously eight TDMA channels or circuits at 32 kb/s (or 16 at 16 kb/s or 32 at 8 kb/s) [56]. Of these, one channel is reserved for system control. The cost of such moderate-rate transceivers is relatively insensitive to the number of channels supported; i.e., the cost of such an 8-channel (or 16 or 32) transceiver will be significantly less than twice the cost of a similar one-channel transceiver. Thus, another economic advantage accrues to this wireless loop approach from using time-multiplexed (TDMA) transceivers instead of single-channel-per-transceiver cordless telephone base units.

For an offered traffic of 0.06 Erlang, a typical busy-hour value for a wireline subscriber, a seven-channel transceiver could serve about 40 subscribers at 1% blocking, based on the Erlang B queuing discipline. From the earlier example, such a transceiver could serve most of the 45 houses within a 1400-ft square. Considering partial penetration, the transceiver capacity is more than adequate for the low-density housing.⁶ Considering the high-density example of 4 houses/acre, a seven-channel transceiver could serve only about 20% of the subscribers within a 1400-ft square. If the penetration became greater than about 20%, either additional transceivers, perhaps those of other service providers, or closer transceiver spacing would be required.

Another advantageous economic factor for wireless loops results when considering time-multiplexed transmission in the fixed distribution facilities.
For copper or fiber digital subscriber loop carrier (SLC), e.g., T1 or high-rate digital subscriber line (HDSL), a demultiplexing/multiplexing terminal and drop interface are required at the end of the time-multiplexed SLC line to provide the individual circuits for each subscriber loop-end circuit, i.e., for each drop. The most expensive part of such an SLC terminating unit is the subscriber line cards that provide per-line interfaces for each subscriber drop. Terminating a T1 or HDSL line on a wireless loop transceiver eliminates all per-line interfaces, i.e., all line cards, the most expensive part of an SLC line termination. Thus, the greatly simplified SLC termination can be incorporated within a TDMA wireless loop transceiver, resulting in another cost savings over the conventional copper-wire-pair telephone loop end.

The purpose of the previous discussions is not to give an exact system design or economic analysis, but to illustrate the inherent economic advantages of low-power wireless loops over copper loop ends and over copper loop ends with cordless telephone base units. Some economic analyses have found wireless loop ends to be more economical than copper loop ends when subscribers use low-power wireless handsets. Rizzo and Sollenberger [56] have also discussed the advantageous economics of PACS wireless loop technology in the context of low-tier PCS.

The discussions in this section can be briefly summarized as follows. Replacing copper wire telephone loop ends with low-complexity wireless loop technology like PACS can produce economic benefits in at least four ways. These are:

1. Replacing the most expensive part of a loop, the per-subscriber loop-end, with a wireless link.
2. Taking advantage of trunking in concentrating many wireless subscriber loops into a smaller number of wireless transceiver channels.
3. Reducing the cost of wireless transceivers by time multiplexing (TDMA) a few (7, 15, or 31) wireless loop circuits (channels) in each transceiver.
4. Eliminating per-line interface cards in digital subscriber line terminations by terminating time-multiplexed subscriber lines in the wireless loop transceivers. }

⁶ The range could be extended by using higher base unit antennas, by using higher-gain directional (sectored) antennas, and/or by increasing the maximum power that can be transmitted.
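The trunking figures quoted in this section (about 40 subscribers on seven traffic channels at 1% blocking, or roughly 20% of a 180-house square) can be reproduced with the standard Erlang B recursion. The sketch below is a minimal illustration, assuming each subscriber offers an independent 0.06 Erlang; the function and constant names are chosen for this example only.

```python
def erlang_b(offered_erlangs, channels):
    """Blocking probability from the Erlang B recursion."""
    blocking = 1.0  # with zero channels every call is blocked
    for n in range(1, channels + 1):
        blocking = offered_erlangs * blocking / (n + offered_erlangs * blocking)
    return blocking


def max_subscribers(channels, erlangs_per_subscriber, blocking_target):
    """Largest subscriber count whose total offered load meets the target."""
    count = 0
    while erlang_b((count + 1) * erlangs_per_subscriber, channels) <= blocking_target:
        count += 1
    return count


TRAFFIC_CHANNELS = 7            # eight TDMA channels minus one for control
ERLANGS_PER_SUBSCRIBER = 0.06   # busy-hour value quoted in the text
BLOCKING_TARGET = 0.01          # 1% blocking
HOUSES_HIGH_DENSITY = 180       # 4 houses/acre in a 1400-ft square

served = max_subscribers(TRAFFIC_CHANNELS, ERLANGS_PER_SUBSCRIBER, BLOCKING_TARGET)
fraction_high_density = served / HOUSES_HIGH_DENSITY

print(f"subscribers served at 1% blocking: {served}")
print(f"share of a 180-house square:       {fraction_high_density:.0%}")
```

The same two functions can be rerun with 15 or 31 traffic channels to see how the trunking advantage grows faster than the channel count.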

15.7 Other Issues

Several issues in addition to those addressed in the previous two sections continue to be raised with respect to low-tier PCS. These are treated in this section.

15.7.1 Improvement of Batteries

Frequently, the suggestion is made that battery technology will improve so that high-power handsets will be able to provide the desired 5 or 6 hours of talk time in addition to 10 or 12 hours of standby time, and still weigh less than one-fourth of the weight of today’s smallest cellular handset batteries. This hope does not take into account the maturity of battery technology, and the long history (many decades) of concerted attempts to improve it. Increases in battery capacity have come in small increments, a few percent, and very slowly over many years, and the shortfall is well over a factor of 10. In contrast, integrated electronics and radio frequency devices needed for low-power low-tier PCS continue to improve and to decrease in cost by factors of greater than 2 in time spans on the order of a year or so. It also should be noted that, as the energy density of a battery is increased, the energy release rate per volume must also increase in order to supply the same amount of power. If energy storage density and release rate are increased significantly, a battery and a bomb become indistinguishable! The likelihood of a ×10 improvement in battery capacity appears to be essentially zero. If even a modest improvement in battery capacity were possible, many people would be driving electric vehicles.

{ As noted in the addition to the “Reality Check” section, new lithium batteries have become the batteries of choice for the smallest cellular/high-tier PCS handsets. While these lithium batteries have higher energy density than earlier nickel cadmium batteries, they still fall far short of the factor of 10 improvement that was needed to make long talk time, small size, and low weight possible. With the much larger advance in electronics, the battery is even more dominant in the size and weight of the newest cellular handsets.
The introduction of these batteries incurred considerable startup pain because of the greater fire and explosive hazard associated with lithium materials, i.e., closer approach to a bomb. Further attempts in this direction will be even more hazardous. }

15.7.2 People Only Want One Handset

This issue is often raised in support of high-tier cellular handsets over low-tier handsets. Whereas the statement is likely true, the assumption that the handset must work with high-tier cellular is not. Such a statement follows from the current large usage of cellular handsets; but such usage results because that is the only form of widespread wireless service currently available, not because it is what people want. The statement assumes inadequate coverage of a region by low-tier PCS, and that low-tier handsets will not work in vehicles. The only way that high-tier handsets could serve the desires of people discussed earlier would be for an unlikely breakthrough in battery technology to occur. A low-tier system, however, can cover economically any large region having some people in it. (It will not cover rural or isolated areas but, by definition, there is essentially no one there to want communications anyway.) Low-tier handsets will work in vehicles on village and city streets at speeds up to 30 or 40 mi/h, and the required handoffs make use of computer technology that is rapidly becoming inexpensive. { As noted earlier, vehicular speed handoff is readily accomplished with PACS. Reliable handoff has been demonstrated for PACS at speeds in excess of 70 mi/h. } Highways between populated areas, and also streets within them, will need to be covered by high-tier cellular PCS, but users are likely to use vehicular sets in these cellular systems. Frequently the vehicular mobile user will want a different communications device anyway, e.g., a hands-free phone. The use of hands-free phones in vehicles is becoming a legal requirement in some places now and is likely to become a requirement in many more places in the future. Thus, handsets may not be legally usable in vehicles anyway. With widespread deployment of low-tier PCS systems, the one handset of choice will be the low-power, low-tier PCS pocket handset or voice/data communicator. { As discussed in earlier sections, it is quite feasible economically to cover highways between cities with low-tier systems, if the low-tier base stations have antennas with the same height and gain as used for cellular and high-tier PCS systems. (The range penalty for the lower power was noted earlier to be only on the order of 1/2, or about the same as the range penalty in going from 800 MHz cellular to 1.9 GHz high-tier PCS.) }

There are approaches for integrating low-tier pocket phones or pocket communicators with high-tier vehicular cellular mobile telephones. The user’s identity could be contained either in memory in the low-tier set or in a small smart card inserted into the set, as is a feature of the European GSM system.
When entering an automobile, the small low-tier communicator or card could be inserted into a receptacle in a high-tier vehicular cellular set installed in the automobile.7 The user’s identity would then be transferred to the mobile set. { “Car adapters” that have a cradle for a small cellular handset providing battery charging and connection to an outside antenna are quite common — e.g., in Sweden use of such adapters is commonplace. Thus, this concept has already evolved significantly, even for the disadvantaged cellular handsets when they are used in vehicles. } The mobile set could then initiate a data exchange with the high-tier system, indicating that the user could now receive calls at that mobile set. This information about the user’s location would then be exchanged between the network intelligence so that calls to the user could be correctly routed.8 In this approach the radio sets are optimized for their specific environments, high-power, high-tier vehicular or low-power, low-tier pedestrian, as discussed earlier, and the network access and call routing is coordinated by the interworking of network intelligence. This approach does not compromise the design of either radio set or radio system. It places the burden on network intelligence technology that benefits from the large and rapid advances in computer technology. The approach of using different communications devices for pedestrians than for vehicles is consistent with what has actually happened in other applications of technology in similarly different environments. For example, consider the case of audio cassette tape players. Pedestrians often carry and listen to small portable tape players with lightweight headsets (e.g., a Walkman).9 When one of these people enters an automobile, he or she often removes the tape from the Walkman and inserts it into a tape player installed in the automobile. The automobile player has speakers that fill the car

⁷ Inserting the small personal communicator in the vehicular set would also facilitate charging the personal communicator’s battery.
⁸ This is a feature proposed for FPLMTS in CCIR Rec. 687.
⁹ Walkman is a registered trademark of Sony Corporation.


with sound. The Walkman is optimized for a pedestrian, whereas the vehicular-mounted player is optimized for an automobile. Both use the same tape, but they have separate tape heads, tape transports, audio preamps, etc. They do not attempt to share electronics. In this example, the tape cassette is the information-carrying entity similar to the user identification in the personal communications example discussed earlier. The main points are that the information is shared among different devices but that the devices are optimized for their environments and do not share electronics. Similarly, a high-tier vehicular-cellular set does not need to share oscillators, synthesizers, signal processing, or even frequency bands or protocols with a low-tier pocket-size communicator. Only the information identifying the user and where he or she can be reached needs to be shared among the intelligence elements, e.g., routing logic, databases, and common channel signalling [26, 29] of the infrastructure networks. This information exchange between network intelligence functions can be standardized and coordinated among infrastructure subnetworks owned and operated by different business entities (e.g., vehicular cellular mobile radio networks and intelligent low-tier PCS networks). Such standardization and coordination are the same as are required today to pass intelligence among local exchange networks and interexchange carrier networks.

15.7.3 Other Environments

Low-tier personal communications can be provided to occupants of airplanes, trains, and buses by installing compatible low-tier radio access ports inside these vehicles. The ports can be connected to high-power, high-tier vehicular cellular mobile sets or to special air-ground or satellite-based mobile communications sets. Intelligence between the internal ports and mobile sets could interact with cellular mobile, air-ground, or satellite networks in one direction, using protocols and spectrum allocated for that purpose, and with low-tier personal communicators in the other direction to exchange user identification and route calls to and from users inside these large vehicles. Radio isolation between the low-power units inside the large metal vehicles and low-power systems outside the vehicles can be ensured by using windows that are opaque to the radio frequencies. Such an approach also has been considered for automobiles, i.e., a radio port for low-tier personal communications connected to a cellular mobile set in a vehicle so that the low-tier personal communicator can access a high-tier cellular network. (This could be done in the United States using unlicensed PCS frequencies within the vehicle.)

15.7.4 Speech Quality Issues

All of the PCS and cordless telephone technologies that use CCITT standardized 32-kb/s ADPCM speech encoding can provide similar error-free speech distortion quality. This quality often is rated on a five-point subjective mean opinion score (MOS) with 5 excellent, 4 good, 3 fair, 2 poor, and 1 very poor. The error-free MOS of 32-kb/s ADPCM is about 4.1 and degrades very slightly with tandem encodings. Tandem encodings could be expected in going from a digital-radio PCS access link, through a network using analog transmission or 64-kb/s PCM, and back to another digital-radio PCS access link on the other end of the circuit. In contrast, a low-bit-rate ( W . Hence, all of the signal’s spectral components will be affected by the channel in a similar manner (e.g., fading or no fading); this is illustrated in Fig. 18.8(b). Flat-fading does not introduce channel-induced ISI distortion, but performance degradation can still be expected due to the loss in SNR whenever the signal is fading. In order to avoid channel-induced ISI distortion, the channel is required to exhibit flat fading by ensuring that

$$f_0 > W \approx \frac{1}{T_s} \qquad (18.14)$$

Hence, the channel coherence bandwidth f0 sets an upper limit on the transmission rate that can be used without incorporating an equalizer in the receiver.

For the flat-fading case, where f0 > W (or Tm < Ts ), Fig. 18.8(b) shows the usual flat-fading pictorial representation. However, as a mobile radio changes its position, there will be times when the received signal experiences frequency-selective distortion even though f0 > W . This is seen in Fig. 18.8(c), where the null of the channel’s frequency transfer function occurs at the center of the signal band. Whenever this occurs, the baseband pulse will be especially mutilated by deprivation of its DC component.
One consequence of the loss of DC (zero mean value) is the absence of a reliable pulse peak on which to establish the timing synchronization, or from which to sample the carrier phase carried by the pulse [18]. Thus, even though a channel is categorized as flat fading (based on rms relationships), it can still manifest frequency-selective fading on occasions. It is fair to say that a mobile radio channel, classified as having flat-fading degradation, cannot exhibit flat fading all of the time. As f0 becomes much larger than W (or Tm becomes much smaller than Ts ), less time will be spent in conditions approximating Fig. 18.8(c). By comparison, it should be clear that in Fig. 18.8(a) the fading is independent of the position of the signal band, and frequency-selective fading occurs all the time, not just occasionally.
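The condition in Eq. (18.14) can be turned into a simple check, sketched below. The numerical values are hypothetical and chosen only to exercise both outcomes; the function names are illustrative. As the text cautions, a channel that passes this check is flat fading only in the nominal sense and can still show frequency-selective behavior on occasion.

```python
def signal_bandwidth_hz(symbol_time_s):
    """Approximate signal bandwidth W as 1/Ts, following Eq. (18.14)."""
    return 1.0 / symbol_time_s


def is_nominally_flat_fading(coherence_bandwidth_hz, symbol_time_s):
    """True when f0 > W, i.e., the channel is categorized as flat fading."""
    return coherence_bandwidth_hz > signal_bandwidth_hz(symbol_time_s)


def max_symbol_rate_without_equalizer(coherence_bandwidth_hz):
    """Upper limit on symbol rate implied by f0 > W with W taken as 1/Ts."""
    return coherence_bandwidth_hz


# Hypothetical channel: coherence bandwidth f0 = 100 kHz (assumed value).
f0_hz = 100e3

slow_signalling = is_nominally_flat_fading(f0_hz, symbol_time_s=50e-6)  # W = 20 kHz
fast_signalling = is_nominally_flat_fading(f0_hz, symbol_time_s=1e-6)   # W = 1 MHz

print(f"Ts = 50 us: flat fading? {slow_signalling}")
print(f"Ts = 1 us:  flat fading? {fast_signalling}")
print(f"rate limit without equalizer: {max_symbol_rate_without_equalizer(f0_hz):g} symbols/s")
```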

18.6 Typical Examples of Flat Fading and Frequency-Selective Fading Manifestations

Figure 18.9 shows some examples of flat fading and frequency-selective fading for a direct-sequence spread-spectrum (DS/SS) system [20, 22]. In Fig. 18.9, there are three plots of the output of a pseudonoise (PN) code correlator vs. delay as a function of time (transmission or observation time). Each amplitude vs. delay plot is akin to S(τ ) vs. τ shown in Fig. 18.7(a). The key difference is that the amplitudes shown in Fig. 18.9 represent the output of a correlator; hence, the waveshapes are a function not only of the impulse response of the channel, but also of the impulse response of the correlator. The delay time is expressed in units of chip durations (chips), where the chip is defined as the spread-spectrum minimal-duration keying element. For each plot, the observation time is shown on an axis perpendicular to the amplitude vs. time-delay plane. Figure 18.9 is drawn from a satellite-to-ground communications link exhibiting scintillation because of atmospheric disturbances. However, Fig. 18.9 is still a useful illustration of three different channel conditions that might apply to a mobile radio situation. A mobile radio that moves along the observation-time axis is affected by changing multipath profiles along the route, as seen in the figure. The scale along the observation-time axis is also in units of chips.

In Fig. 18.9(a), the signal dispersion (one “finger” of return) is on the order of a chip time duration, Tch . In a typical DS/SS system, the spread-spectrum signal bandwidth is approximately equal to 1/Tch ; hence, the normalized coherence bandwidth f0 Tch of approximately unity in Fig. 18.9(a) implies that the coherence bandwidth is about equal to the spread-spectrum bandwidth. This describes a channel that can be called frequency-nonselective or slightly frequency-selective. In Fig. 18.9(b), where f0 Tch = 0.25, the signal dispersion is more pronounced. There is definite interchip interference, and the coherence bandwidth is approximately equal to 25% of the spread-spectrum bandwidth. In Fig. 18.9(c), where f0 Tch = 0.1, the signal dispersion is even more pronounced, with greater interchip-interference effects, and the coherence bandwidth is approximately equal to 10% of the spread-spectrum bandwidth. The channels of Figs. 18.9(b) and (c) can be categorized as moderately and highly frequency-selective, respectively, with respect to the basic signalling element, the chip. Later, we show that a DS/SS system operating over a frequency-selective channel at the chip level does not necessarily experience frequency-selective distortion at the symbol level.

FIGURE 18.9: DS/SS Matched-filter output time-history examples for three levels of channel conditions, where Tch is the time duration of a chip.

18.7

Time Variance Viewed in the Time Domain: Figure 18.1, Block 13—The Spaced-Time Correlation Function

Until now, we have described signal dispersion and coherence bandwidth, parameters that describe the channel’s time-spreading properties in a local area. However, they do not offer information about the time-varying nature of the channel caused by relative motion between a transmitter and receiver, or by movement of objects within the channel. For mobile-radio applications, the channel is time variant because motion between the transmitter and receiver results in propagation-path changes. Thus, for a transmitted continuous wave (CW) signal, as a result of such motion, the radio receiver sees variations in the signal’s amplitude and phase. Assuming that all scatterers making up the channel are stationary, then whenever motion ceases, the amplitude and phase of the received signal remain constant; that is, the channel appears to be time invariant. Whenever motion begins again, the channel appears time variant. Since the channel characteristics are dependent on the positions of the transmitter and receiver, time variance in this case is equivalent to spatial variance. Figure 18.7(c) shows the function R(Δt), designated the spaced-time correlation function; it is the autocorrelation function of the channel’s response to a sinusoid. This function specifies the extent to which there is correlation between the channel’s response to a sinusoid sent at time t1 and the response to a similar sinusoid sent at time t2, where Δt = t2 − t1. The coherence time, T0, is a measure of the expected time duration over which the channel’s response is essentially invariant. Earlier, we made measurements of signal dispersion and coherence bandwidth by using wideband signals. Now, to measure the time-variant nature of the channel, we use a narrowband signal. To measure R(Δt) we can transmit a single sinusoid (Δf = 0) and determine the autocorrelation function of the received signal. The function R(Δt) and the parameter T0 provide us with knowledge about the fading rapidity of the channel.
Note that for an ideal time-invariant channel (e.g., a mobile radio exhibiting no motion at all), the channel’s response would be highly correlated for all values of Δt, and R(Δt) would be a constant function. When using the dense-scatterer channel model described earlier, with constant velocity of motion, and an unmodulated CW signal, the normalized R(Δt) is described as

$$R(\Delta t) = J_0(kV\,\Delta t) \tag{18.15}$$

where J0(·) is the zero-order Bessel function of the first kind, V is velocity, V Δt is distance traversed, and k = 2π/λ is the free-space phase constant (transforming distance to radians of phase). Coherence time can be measured in terms of either time or distance traversed (assuming some fixed velocity of motion). Amoroso described such a measurement using a CW signal and a dense-scatterer channel model [18]. He measured the statistical correlation between the combination of received magnitude and phase sampled at a particular antenna location x0, and the corresponding combination sampled at some displaced location x0 + ζ, with displacement measured in units of wavelength λ. For a displacement ζ of 0.38λ between two antenna locations, the combined magnitudes and phases of the received CW are statistically uncorrelated. In other words, the state of the signal at x0 says nothing about the state of the signal at x0 + ζ. For a given velocity of motion, this displacement is readily transformed into units of time (coherence time).
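As a rough numerical check of Eq. (18.15), the following sketch evaluates the Bessel-function correlation as a function of displacement and shows that it is close to zero near 0.38λ. The function names and the midpoint-rule evaluation of J0 are illustrative choices, not part of the original treatment; a library routine for J0 could be substituted.

```python
import math

def bessel_j0(x: float, steps: int = 2000) -> float:
    """Zero-order Bessel function of the first kind, computed from the
    integral form J0(x) = (1/pi) * integral_0^pi cos(x sin(theta)) dtheta
    using the midpoint rule (an illustrative numerical choice)."""
    h = math.pi / steps
    total = sum(math.cos(x * math.sin((i + 0.5) * h)) for i in range(steps))
    return total * h / math.pi

def spaced_time_correlation(velocity_mps: float, wavelength_m: float,
                            delta_t_s: float) -> float:
    """Normalized R(delta_t) = J0(k V delta_t) with k = 2*pi/lambda, Eq. (18.15)."""
    k = 2.0 * math.pi / wavelength_m
    return bessel_j0(k * velocity_mps * delta_t_s)

def correlation_vs_displacement(zeta_wavelengths: float) -> float:
    """Same correlation written in distance: zeta = V * delta_t / lambda."""
    return bessel_j0(2.0 * math.pi * zeta_wavelengths)

if __name__ == "__main__":
    for zeta in (0.0, 0.1, 0.2, 0.38, 0.5):
        r = correlation_vs_displacement(zeta)
        print(f"displacement = {zeta:4.2f} wavelengths -> R = {r:+.3f}")
```

The correlation starts at unity for zero displacement and passes through (approximately) zero near 0.38λ, which is the displacement quoted above for statistically uncorrelated samples.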

18.7.1

The Concept of Duality

Two operators (functions, elements, or systems) are dual when the behavior of one with reference to a time-related domain (time or time-delay) is identical to the behavior of the other with reference to the corresponding frequency-related domain (frequency or Doppler shift).

In Fig. 18.7, we can identify functions that exhibit similar behavior across domains. For understanding the fading channel model, it is useful to refer to such functions as duals. For example, R(Δf) in Fig. 18.7(b), characterizing signal dispersion in the frequency domain, yields knowledge about the range of frequency over which two spectral components of a received signal have a strong potential for amplitude and phase correlation. R(Δt) in Fig. 18.7(c), characterizing fading rapidity in the time domain, yields knowledge about the span of time over which two received signals have a strong potential for amplitude and phase correlation. We have labeled these two correlation functions as duals. This is also noted in Fig. 18.1 as the duality between blocks 10 and 13, and in Fig. 18.6 as the duality between the time-spreading mechanism in the frequency domain and the time-variant mechanism in the time domain.

18.7.2

Degradation Categories due to Time Variance Viewed in the Time Domain

The time-variant nature of the channel or fading rapidity mechanism can be viewed in terms of two degradation categories as listed in Fig. 18.6: fast fading and slow fading. The terminology “fast fading” is used for describing channels in which T0 < Ts , where T0 is the channel coherence time and Ts is the time duration of a transmission symbol. Fast fading describes a condition where the time duration in which the channel behaves in a correlated manner is short compared to the time duration of a symbol. Therefore, it can be expected that the fading character of the channel will change several times during the time that a symbol is propagating, leading to distortion of the baseband pulse shape. Analogous to the distortion previously described as channel-induced ISI, here distortion takes place because the received signal’s components are not all highly correlated throughout time. Hence, fast fading can cause the baseband pulse to be distorted, resulting in a loss of SNR that often yields an irreducible error rate. Such distorted pulses cause synchronization problems (failure of phase-locked-loop receivers), in addition to difficulties in adequately defining a matched filter. A channel is generally referred to as introducing slow fading if T0 > Ts . Here, the time duration that the channel behaves in a correlated manner is long compared to the time duration of a transmission symbol. Thus, one can expect the channel state to virtually remain unchanged during the time in which a symbol is transmitted. The propagating symbols will likely not suffer from the pulse distortion described above. The primary degradation in a slow-fading channel, as with flat fading, is loss in SNR.

18.8

Time Variance Viewed in the Doppler-Shift Domain: Figure 18.1, Block 16—The Doppler Power Spectrum

A completely analogous characterization of the time-variant nature of the channel can begin in the Doppler-shift (frequency) domain. Figure 18.7(d) shows a Doppler power spectral density, S(v), plotted as a function of Doppler-frequency shift, v. For the case of the dense-scatterer model, a vertical receive antenna with constant azimuthal gain, a uniform distribution of signals arriving at all arrival angles throughout the range (0, 2π), and an unmodulated CW signal, the signal spectrum at the antenna terminals is [19]

$$S(v) = \frac{1}{\pi f_d \sqrt{1 - \left(\dfrac{v - f_c}{f_d}\right)^2}} \tag{18.16}$$

The equality holds for frequency shifts of v that are in the range ±fd about the carrier frequency fc and would be zero outside that range. The shape of the RF Doppler spectrum described by Eq. (18.16) is classically bowl-shaped, as seen in Fig. 18.7(d). Note that the spectral shape is a result of the dense-scatterer channel model. Equation (18.16) has been shown to match experimental data gathered for mobile radio channels [23]; however, different applications yield different spectral shapes. For example, the dense-scatterer model does not hold for the indoor radio channel; the channel model for an indoor area assumes S(v) to be a flat spectrum [24]. In Fig. 18.7(d), the sharpness and steepness of the boundaries of the Doppler spectrum are due to the sharp upper limit on the Doppler shift produced by a vehicular antenna traveling among the stationary scatterers of the dense-scatterer model. The largest magnitude (infinite) of S(v) occurs when the scatterer is directly ahead of the moving antenna platform or directly behind it. In that case the magnitude of the frequency shift is given by

$$f_d = \frac{V}{\lambda} \tag{18.17}$$

where V is relative velocity and λ is the signal wavelength. fd is positive when the transmitter and receiver move toward each other and negative when moving away from each other. For scatterers directly broadside of the moving platform, the magnitude of the frequency shift is zero. The fact that Doppler components arriving at exactly 0° and 180° have an infinite power spectral density is not a problem, since the angle of arrival is continuously distributed and the probability of components arriving at exactly these angles is zero [3, 19]. S(v) is the Fourier transform of R(Δt). We know that the Fourier transform of the autocorrelation function of a time series is the magnitude squared of the Fourier transform of the original time series. Therefore, measurements can be made by simply transmitting a sinusoid (narrowband signal) and using Fourier analysis to generate the power spectrum of the received amplitude [16]. This Doppler power spectrum of the channel yields knowledge about the spectral spreading of a transmitted sinusoid (impulse in frequency) in the Doppler-shift domain. As indicated in Fig. 18.7, S(v) can be regarded as the dual of the multipath intensity profile, S(τ), since the latter yields knowledge about the time spreading of a transmitted impulse in the time-delay domain. This is also noted in Fig. 18.1 as the duality between blocks 7 and 16, and in Fig. 18.6 as the duality between the time-spreading mechanism in the time-delay domain and the time-variant mechanism in the Doppler-shift domain. Knowledge of S(v) allows us to glean how much spectral broadening is imposed on the signal as a function of the rate of change in the channel state. The width of the Doppler power spectrum is referred to as the spectral broadening or Doppler spread, denoted by fd, and sometimes called the fading bandwidth of the channel. Equation (18.17) describes the Doppler frequency shift.
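The bowl shape of Eq. (18.16) can be made concrete with a short sketch. The function names below are illustrative, and returning infinity exactly at the band edges is a convention chosen here to mirror the unbounded density described in the text.

```python
import math

def max_doppler_shift(velocity_mps: float, wavelength_m: float) -> float:
    """Eq. (18.17): fd = V / lambda."""
    return velocity_mps / wavelength_m

def doppler_psd(v_hz: float, fc_hz: float, fd_hz: float) -> float:
    """Eq. (18.16) for the dense-scatterer model.
    Zero outside fc +/- fd; unbounded (returned as inf) exactly at the edges."""
    x = (v_hz - fc_hz) / fd_hz
    if abs(x) > 1.0:
        return 0.0
    if abs(x) == 1.0:
        return math.inf
    return 1.0 / (math.pi * fd_hz * math.sqrt(1.0 - x * x))

if __name__ == "__main__":
    fc, fd = 900e6, 100.0  # example values only
    for offset in (-150.0, -99.0, -50.0, 0.0, 50.0, 99.0, 150.0):
        print(f"v - fc = {offset:+7.1f} Hz -> S(v) = {doppler_psd(fc + offset, fc, fd):.5f}")
```

The density is smallest at the carrier, 1/(π fd), rises toward the edges at fc ± fd, and vanishes outside that range.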
In a typical multipath environment, the received signal arrives from several reflected paths with different path distances and different angles of arrival, and the Doppler shift of each arriving path is generally different from that of another path. The effect on the received signal is seen as a Doppler spreading or spectral broadening of the transmitted signal frequency, rather than a shift. Note that the Doppler spread, fd, and the coherence time, T0, are reciprocally related (within a multiplicative constant). Therefore, we show the approximate relationship between the two parameters as

$$T_0 \approx \frac{1}{f_d} \tag{18.18}$$

Hence, the Doppler spread fd or 1/T0 is regarded as the typical fading rate of the channel. Earlier, T0 was described as the expected time duration over which the channel’s response to a sinusoid is essentially invariant. When T0 is defined more precisely as the time duration over which the channel’s response to a sinusoid has a correlation of at least 0.5, the relationship between T0 and fd is approximately [4]

$$T_0 \approx \frac{9}{16\pi f_d} \tag{18.19}$$

A popular “rule of thumb” is to define T0 as the geometric mean of Eqs. (18.18) and (18.19). This yields

$$T_0 = \sqrt{\frac{9}{16\pi f_d^2}} = \frac{0.423}{f_d} \tag{18.20}$$

For the case of a 900 MHz mobile radio, Fig. 18.10 illustrates the typical effect of Rayleigh fading on a signal’s envelope amplitude vs. time [3]. The figure shows that the distance traveled by the mobile in the time interval corresponding to two adjacent nulls (small-scale fades) is on the order of a half-wavelength (λ/2) [3]. Thus, from Fig. 18.10 and Eq. (18.17), the time (approximately, the coherence time) required to traverse a distance λ/2 when traveling at a constant velocity, V, is:

$$T_0 \approx \frac{\lambda/2}{V} = \frac{0.5}{f_d} \tag{18.21}$$

FIGURE 18.10: A typical Rayleigh fading envelope at 900 MHz.

Thus, when the interval between fades is taken to be λ/2, as in Fig. 18.10, the resulting expression for T0 in Eq. (18.21) is quite close to the rule-of-thumb shown in Eq. (18.20). Using Eq. (18.21), with the parameters shown in Fig. 18.10 (velocity = 120 km/hr, and carrier frequency = 900 MHz), it is straightforward to compute that the coherence time is approximately 5 ms and the Doppler spread (channel fading rate) is approximately 100 Hz. Therefore, if this example represents a voice-grade channel with a typical transmission rate of 10^4 symbols/s, the fading rate is considerably less than the symbol rate. Under such conditions, the channel would manifest slow-fading effects. Note that if the abscissa of Fig. 18.10 were labeled in units of wavelength instead of time, the figure would look the same for any radio frequency and any antenna speed.
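The arithmetic behind the 5 ms and 100 Hz figures can be reproduced with a few lines. This is a minimal sketch; the function names and dictionary labels are illustrative, and the only inputs are the 120 km/hr velocity and 900 MHz carrier quoted for Fig. 18.10.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s

def doppler_spread(velocity_kmh: float, carrier_hz: float) -> float:
    """fd = V / lambda, Eq. (18.17), with V converted from km/hr to m/s."""
    wavelength = SPEED_OF_LIGHT / carrier_hz
    return (velocity_kmh / 3.6) / wavelength

def coherence_times(fd_hz: float) -> dict:
    """Coherence time under the four definitions of Eqs. (18.18)-(18.21)."""
    return {
        "reciprocal (18.18)": 1.0 / fd_hz,
        "0.5 correlation (18.19)": 9.0 / (16.0 * math.pi * fd_hz),
        "geometric mean (18.20)": math.sqrt(9.0 / (16.0 * math.pi)) / fd_hz,
        "half wavelength (18.21)": 0.5 / fd_hz,
    }

if __name__ == "__main__":
    fd = doppler_spread(120.0, 900e6)
    print(f"Doppler spread fd = {fd:.1f} Hz")
    for name, t0 in coherence_times(fd).items():
        print(f"T0, {name}: {t0 * 1e3:.2f} ms")
    symbol_rate = 1e4  # symbols/s, the voice-grade example above
    print("slow fading" if symbol_rate > fd else "fast fading")
```

With these inputs the Doppler spread is close to 100 Hz, the half-wavelength definition gives about 5 ms, and the rule of thumb of Eq. (18.20) gives a slightly shorter value, consistent with the text.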

18.9

Analogy Between Spectral Broadening in Fading Channels and Spectral Broadening in Digital Signal Keying

Help is often needed in understanding why spectral broadening of the signal is a function of fading rate of the channel. Figure 18.11 uses the keying of a digital signal (such as amplitude-shift-keying or frequency-shift-keying) to illustrate an analogous case. Figure 18.11(a) shows that a single tone, cos 2πfc t (−∞ < t < ∞) that exists for all time is characterized in the frequency domain in terms of impulses (at ±fc ). This frequency domain representation is ideal (i.e., zero bandwidth), since the tone is pure and neverending. In practical applications, digital signalling involves switching (keying) signals on and off at a required rate. The keying operation can be viewed as multiplying the infinite-duration tone in Fig. 18.11(a) by an ideal rectangular (switching) function in Fig. 18.11(b). The frequency-domain description of the ideal rectangular function is of the form (sin f )/f . In Fig. 18.11(c), the result of the multiplication yields a tone, cos 2πfc t, that is time-duration limited in the interval −T /2 < t < T /2. The resulting spectrum is obtained by convolving the spectral impulses in part (a) with the (sin f )/f function in part (b), yielding the broadened spectrum in part (c). It is further seen that, if the signalling occurs at a faster rate characterized by the rectangle of shorter duration in part (d), the resulting spectrum of the signal in part (e) exhibits greater spectral broadening. The changing state of a fading channel is somewhat analogous to the keying on and off of digital signals. The channel behaves like a switch, turning the signal “on” and “off.” The greater the rapidity of the change in the channel state, the greater the spectral broadening of the received signals. The analogy is not exact because the on and off switching of signals may result in phase discontinuities, but the typical multipath-scatterer environment induces phase-continuous effects.
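The keying analogy can be checked numerically. The sketch below assumes an ideal unit-height rectangular gate of duration T, whose transform is T sin(πfT)/(πfT), and compares the null-to-null main-lobe width for two gate durations; the function names and the example durations are illustrative only.

```python
import math

def gate_spectrum(f_hz: float, duration_s: float) -> float:
    """Fourier transform of a unit-height rectangular gate of duration T:
    T * sin(pi f T) / (pi f T)."""
    x = math.pi * f_hz * duration_s
    return duration_s if x == 0.0 else duration_s * math.sin(x) / x

def keyed_tone_spectrum(f_hz: float, fc_hz: float, duration_s: float) -> float:
    """Spectrum of cos(2 pi fc t) gated to |t| < T/2: the gate spectrum
    convolved with impulses at +/- fc, each of weight 1/2."""
    return 0.5 * (gate_spectrum(f_hz - fc_hz, duration_s)
                  + gate_spectrum(f_hz + fc_hz, duration_s))

def main_lobe_width(duration_s: float) -> float:
    """Null-to-null width of the main lobe around fc: 2 / T."""
    return 2.0 / duration_s

if __name__ == "__main__":
    for t_gate in (1e-3, 0.5e-3):
        print(f"T = {t_gate * 1e3:.1f} ms -> main lobe {main_lobe_width(t_gate):.0f} Hz wide")
```

Halving the gate duration doubles the main-lobe width, which is the sense in which faster keying (or, by analogy, faster channel variation) produces greater spectral broadening.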

18.10

Degradation Categories due to Time Variance, Viewed in the Doppler-Shift Domain

A channel is referred to as fast fading if the symbol rate, 1/Ts (approximately equal to the signalling rate or bandwidth W), is less than the fading rate, 1/T0 (approximately equal to fd); that is, fast fading is characterized by

$$W < f_d \tag{18.22a}$$

or

$$T_s > T_0 \tag{18.22b}$$

Conversely, a channel is referred to as slow fading if the signalling rate is greater than the fading rate. Thus, in order to avoid signal distortion caused by fast fading, the channel must be made to exhibit slow fading by ensuring that the signalling rate exceeds the channel fading rate. That is,

$$W > f_d \tag{18.23a}$$

or

$$T_s < T_0 \tag{18.23b}$$

FIGURE 18.11: Analogy between spectral broadening in fading and spectral broadening in keying a digital signal.

In Eq. (18.14), it was shown that due to signal dispersion, the coherence bandwidth, f0, sets an upper limit on the signalling rate which can be used without suffering frequency-selective distortion. Similarly, Eqs. (18.23a) and (18.23b) show that due to Doppler spreading, the channel fading rate, fd, sets a lower limit on the signalling rate that can be used without suffering fast-fading distortion. For HF communication systems, when teletype or Morse-coded messages were transmitted at a low data rate, the channels were often fast fading. However, most present-day terrestrial mobile-radio channels can generally be characterized as slow fading. Equations (18.23a) and (18.23b) do not go far enough in describing what we desire of the channel. A better way to state the requirement for mitigating the effects of fast fading would be that we desire W ≫ fd (or Ts ≪ T0). If this condition is not satisfied, the random frequency modulation (FM) due to varying Doppler shifts will limit the system performance significantly. The Doppler effect yields an irreducible error rate that cannot be overcome by simply increasing Eb/N0 [25]. This irreducible error rate is most pronounced for any modulation that involves switching the carrier phase. A single specular Doppler path, without scatterers, registers an instantaneous frequency shift, classically calculated as fd = V/λ. However, a combination of specular and multipath components yields a rather complex time dependence of instantaneous frequency which can cause much larger frequency swings than ±V/λ when detected by an instantaneous frequency detector (a nonlinear device) [26]. Ideally, coherent demodulators that lock onto and track the information signal should suppress the effect of this FM noise and thus cancel the impact of Doppler shift. However, for large values of fd, carrier recovery becomes a problem because very wideband (relative to the data rate) phase-lock loops (PLLs) need to be designed.
For voice-grade applications with bit-error rates of 10^−3 to 10^−4, a large value of Doppler shift is considered to be on the order of 0.01 × W. Therefore, to avoid fast-fading distortion and the Doppler-induced irreducible error rate, the signalling rate should exceed the fading rate by a factor of 100 to 200 [27]. The exact factor depends on the signal modulation, receiver design, and required error-rate [3], [26]–[29]. Davarian [29] showed that a frequency-tracking loop can help lower, but not completely remove, the irreducible error rate in a mobile system when using differential minimum-shift keyed (DMSK) modulation.
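As a small worked example of the 100 to 200 guideline, the sketch below turns a fading rate into the corresponding range of signalling rates and flags a Doppler shift as "large" when it reaches about 0.01 × W. The function names and default factors are illustrative; the exact factor depends on modulation, receiver design, and required error rate, as noted above.

```python
def signalling_rate_bounds(fd_hz: float, low_factor: float = 100.0,
                           high_factor: float = 200.0) -> tuple:
    """Signalling rates that exceed the fading rate by the quoted factors."""
    return low_factor * fd_hz, high_factor * fd_hz

def doppler_is_large(fd_hz: float, w_hz: float, threshold: float = 0.01) -> bool:
    """True when the Doppler shift is on the order of threshold * W or more."""
    return fd_hz >= threshold * w_hz

if __name__ == "__main__":
    fd = 100.0  # Hz, e.g. 120 km/hr at 900 MHz
    low, high = signalling_rate_bounds(fd)
    print(f"fd = {fd:.0f} Hz -> signalling rate of roughly {low:.0f} to {high:.0f} symbols/s")
    for w in (5e3, 2e4):
        print(f"W = {w:.0f} symbols/s: Doppler large? {doppler_is_large(fd, w)}")
```

For a 100 Hz fading rate this gives roughly 10 to 20 ksymbols/s as the region where the Doppler-induced error floor stops dominating.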

18.11

Mitigation Methods

Figure 18.12, subtitled “The Good, The Bad, and The Awful,” highlights three major performance categories in terms of bit-error probability, PB, vs. Eb/N0. The leftmost exponentially-shaped curve represents the performance that can be expected when using any nominal modulation type in AWGN. Observe that with a reasonable amount of Eb/N0, good performance results. The middle curve, referred to as the Rayleigh limit, shows the performance degradation resulting from a loss in SNR that is characteristic of flat fading or slow fading when there is no line-of-sight signal component present. The curve is a function of the reciprocal of Eb/N0 (an inverse-linear function), so for reasonable values of SNR, performance will generally be “bad.” In the case of Rayleigh fading, parameters with overbars are often introduced to indicate that a mean is being taken over the “ups” and “downs” of the fading experience. Therefore, one often sees such bit-error probability plots with mean parameters denoted by PB and Eb/N0 written with overbars. The curve that reaches an irreducible level, sometimes called an error floor, represents “awful” performance, where the bit-error probability can approach the value of 0.5. This shows the severe distorting effects of frequency-selective fading or fast fading.

FIGURE 18.12: Error performance: The good, the bad, and the awful.
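To give a feel for the gap between the "good" and "bad" curves, the following sketch compares two standard closed-form results. It assumes coherent BPSK, which is not specified in the text (the figure refers to any nominal modulation type), so the numbers are illustrative rather than a reproduction of Fig. 18.12. The AWGN expression is PB = Q(√(2 Eb/N0)); the slow, flat Rayleigh expression is PB = ½[1 − √(γ/(1 + γ))], with γ the mean Eb/N0, which approaches 1/(4γ) at high SNR.

```python
import math

def ber_bpsk_awgn(ebn0_db: float) -> float:
    """Coherent BPSK in AWGN (assumed modulation): Q(sqrt(2 Eb/N0))."""
    gamma = 10.0 ** (ebn0_db / 10.0)
    return 0.5 * math.erfc(math.sqrt(gamma))

def ber_bpsk_rayleigh(mean_ebn0_db: float) -> float:
    """Coherent BPSK in slow, flat Rayleigh fading (assumed modulation),
    averaged over the fading: 0.5 * (1 - sqrt(g / (1 + g)))."""
    gamma = 10.0 ** (mean_ebn0_db / 10.0)
    return 0.5 * (1.0 - math.sqrt(gamma / (1.0 + gamma)))

if __name__ == "__main__":
    for db in (0, 10, 20, 30):
        print(f"Eb/N0 = {db:2d} dB: AWGN {ber_bpsk_awgn(db):.2e}, "
              f"Rayleigh {ber_bpsk_rayleigh(db):.2e}")
```

In the Rayleigh case each additional 10 dB buys only about one decade of error probability, which is the inverse-linear behavior described above, whereas the AWGN curve falls off exponentially.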

If the channel introduces signal distortion as a result of fading, the system performance can exhibit an irreducible error rate; when larger than the desired error rate, no amount of Eb /N0 will help achieve the desired level of performance. In such cases, the general approach for improving performance is to use some form of mitigation to remove or reduce the distortion. The mitigation method depends on whether the distortion is caused by frequency-selective fading or fast fading. Once the distortion has been mitigated, the PB vs. Eb /N0 performance should have transitioned from the “awful” bottoming out curve to the merely “bad” Rayleigh limit curve. Next, we can further ameliorate the effects of fading and strive to approach AWGN performance by using some form of diversity to provide the receiver with a collection of uncorrelated samples of the signal, and by using a powerful error-correction code. In Fig. 18.13, several mitigation techniques for combating the effects of both signal distortion and loss in SNR are listed. Just as Figs. 18.1 and 18.6 serve as a guide for characterizing fading phenomena and their effects, Fig. 18.13 can similarly serve to describe mitigation methods that can be used to ameliorate the effects of fading. The mitigation approach to be used should follow two basic steps: first, provide distortion mitigation; second, provide diversity.

18.11.1

Mitigation to Combat Frequency-Selective Distortion

• Equalization can compensate for the channel-induced ISI that is seen in frequency-selective fading. That is, it can help move the operating point from the error-performance curve that is “awful” in Fig. 18.12 to the one that is “bad.” The process of equalizing the ISI involves some method of gathering the dispersed symbol energy back together into its original time interval. In effect, equalization involves insertion of a filter to make the combination of channel and filter yield a flat response with linear phase. The phase linearity is achieved by making the equalizer filter the complex conjugate of the time reverse of the dispersed pulse [30]. Because in a mobile system the channel response varies with time, the equalizer filter must also change or adapt to the time-varying channel. Such equalizer filters are, therefore, called adaptive equalizers. An equalizer accomplishes more than distortion mitigation; it also provides diversity. Since distortion mitigation is achieved by gathering the dispersed symbol’s energy back into the symbol’s original time interval so that it doesn’t hamper the detection of other symbols, the equalizer is simultaneously providing each received symbol with energy that would otherwise be lost.
• The decision feedback equalizer (DFE) has a feedforward section that is a linear transversal filter [30] whose length and tap weights are selected to coherently combine virtually all of the current symbol’s energy. The DFE also has a feedback section which removes energy that remains from previously detected symbols [14], [30]–[32]. The basic idea behind the DFE is that once an information symbol has been detected, the ISI that it induces on future symbols can be estimated and subtracted before the detection of subsequent symbols.
• The maximum-likelihood sequence estimation (MLSE) equalizer tests all possible data sequences (rather than decoding each received symbol by itself) and chooses the data sequence that is the most probable of the candidates. The MLSE equalizer was first proposed by Forney [33] when he implemented the equalizer using the Viterbi decoding algorithm [34]. The MLSE is optimal in the sense that it minimizes the probability of a sequence error. Because the Viterbi decoding algorithm is the way in which the MLSE equalizer is typically implemented, the equalizer is often referred to as the Viterbi equalizer. Later in this chapter, we illustrate the adaptive equalization performed in the Global System for Mobile Communications (GSM) using the Viterbi equalizer.

FIGURE 18.13: Basic mitigation types.

• Spread-spectrum techniques can be used to mitigate frequency-selective ISI distortion because the hallmark of any spread-spectrum system is its capability to reject interference, and ISI is a type of interference. Consider a direct-sequence spread-spectrum (DS/SS) binary phase shift keying (PSK) communication channel comprising one direct path and one reflected path. Assume that the propagation from transmitter to receiver results in a multipath wave that is delayed by τk compared to the direct wave. If the receiver is synchronized to the waveform arriving via the direct path, the received signal, r(t), neglecting noise, can be expressed as

$$r(t) = A x(t) g(t) \cos(2\pi f_c t) + \alpha A x(t - \tau_k)\, g(t - \tau_k) \cos(2\pi f_c t + \Theta) \tag{18.24}$$

where x(t) is the data signal, g(t) is the pseudonoise (PN) spreading code, and τk is the differential time delay between the two paths. The angle Θ is a random phase, assumed to be uniformly distributed in the range (0, 2π), and α is the attenuation of the multipath signal relative to the direct path signal. The receiver multiplies the incoming r(t) by the code g(t). If the receiver is synchronized to the direct path signal, multiplication by the code signal yields

$$A x(t) g^2(t) \cos(2\pi f_c t) + \alpha A x(t - \tau_k)\, g(t)\, g(t - \tau_k) \cos(2\pi f_c t + \Theta) \tag{18.25}$$

where g²(t) = 1, and if τk is greater than the chip duration, then

$$\int g^*(t)\, g(t - \tau_k)\, dt \ll \int g^*(t)\, g(t)\, dt \tag{18.26}$$

over some appropriate interval of integration (correlation), where ∗ indicates complex conjugate, and τk is equal to or larger than the PN chip duration. Thus, the spread spectrum system effectively eliminates the multipath interference by virtue of its code-correlation receiver. Even though channel-induced ISI is typically transparent to DS/SS systems, such systems suffer from the loss in energy contained in all the multipath components not seen by the receiver. The need to gather up this lost energy belonging to the received chip was the motivation for developing the Rake receiver [35]–[37]. The Rake receiver dedicates a separate correlator to each multipath component (finger). It is able to coherently add the energy from each finger by selectively delaying them (the earliest component gets the longest delay) so that they can all be coherently combined.
• Earlier, we described a channel that could be classified as flat fading, but occasionally exhibits frequency-selective distortion when the null of the channel’s frequency transfer function occurs at the center of the signal band. The use of DS/SS is a good way to mitigate such distortion because the wideband SS signal would span many lobes of the selectively faded frequency response. Hence, a great deal of pulse energy would then be passed by the scatterer medium, in contrast to the nulling effect on a relatively narrowband signal [see Fig. 18.8(c)] [18].
• Frequency-hopping spread-spectrum (FH/SS) can be used to mitigate the distortion due to frequency-selective fading, provided the hopping rate is at least equal to the symbol rate. Compared to DS/SS, mitigation takes place through a different mechanism. FH receivers avoid multipath losses by rapid changes in the transmitter frequency band, thus avoiding the interference by changing the receiver band position before the arrival of the multipath signal.

• Orthogonal frequency-division multiplexing (OFDM) can be used in frequency-selective fading channels to avoid the use of an equalizer by lengthening the symbol duration. The signal band is partitioned into multiple subbands, each one exhibiting a lower symbol rate than the original band. The subbands are then transmitted on multiple orthogonal carriers. The goal is to reduce the symbol rate (signalling rate), W ≈ 1/Ts, on each carrier to be less than the channel’s coherence bandwidth f0. OFDM was originally referred to as Kineplex. The technique has been implemented in the U.S. in mobile radio systems [38], and has been chosen by the European community under the name Coded OFDM (COFDM), for high-definition television (HDTV) broadcasting [39].
• Pilot signal is the name given to a signal intended to facilitate the coherent detection of waveforms. Pilot signals can be implemented in the frequency domain as an in-band tone [40], or in the time domain as a pilot sequence, which can also provide information about the channel state and thus improve performance in fading [41].
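The code-correlation argument of Eqs. (18.25) and (18.26) can be illustrated with a toy baseband model. The sketch below rests on several assumptions that are not in the text: a 31-chip maximal-length sequence generated by the recurrence a[n] = a[n−2] XOR a[n−5], an integer-chip multipath delay, the same data bit on both paths, periodic wrap-around of the code, and no carrier or random phase. All function names are illustrative.

```python
def pn_sequence_31() -> list:
    """31-chip maximal-length sequence (assumed generator: a[n] = a[n-2] XOR a[n-5]),
    mapped from bits {0, 1} to chips {+1, -1}."""
    bits = [1, 0, 0, 0, 0]  # any nonzero starting state works
    while len(bits) < 31:
        bits.append(bits[-2] ^ bits[-5])
    return [1 - 2 * b for b in bits]

def periodic_correlation(code: list, shift: int) -> int:
    """Sum over one period of g(t) * g(t - shift), the discrete analog of Eq. (18.26)."""
    n = len(code)
    return sum(code[i] * code[(i - shift) % n] for i in range(n))

def despread(code: list, alpha: float, delay_chips: int, data_bit: int = 1) -> float:
    """Correlator output for a direct path plus one delayed path of relative
    amplitude alpha, with the receiver synchronized to the direct path."""
    n = len(code)
    received = [data_bit * code[i] + alpha * data_bit * code[(i - delay_chips) % n]
                for i in range(n)]
    return sum(received[i] * code[i] for i in range(n))

if __name__ == "__main__":
    g = pn_sequence_31()
    print("in-phase correlation:", periodic_correlation(g, 0))
    print("correlation at a 3-chip offset:", periodic_correlation(g, 3))
    print("despread output with alpha = 0.5, delay = 3 chips:", despread(g, 0.5, 3))
```

In this toy model the in-phase correlation equals the code length, while any offset of one chip or more contributes only a small residual, so the delayed path barely perturbs the despread output. The same property is what leaves the multipath energy unused unless a Rake receiver assigns a separate correlator to each delay.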

18.11.2

Mitigation to Combat Fast-Fading Distortion

• For fast-fading distortion, use a robust modulation (noncoherent or differentially coherent) that does not require phase tracking, and reduce the detector integration time [20].
• Increase the symbol rate, W ≈ 1/Ts, to be greater than the fading rate, fd ≈ 1/T0, by adding signal redundancy.
• Error-correction coding and interleaving can provide mitigation because instead of providing more signal energy, a code reduces the required Eb/N0. For a given Eb/N0, with coding present, the error floor will be lowered compared to the uncoded case.
• An interesting filtering technique can provide mitigation in the event of fast-fading distortion and frequency-selective distortion occurring simultaneously. The frequency-selective distortion can be mitigated by the use of an OFDM signal set. Fast fading, however, will typically degrade conventional OFDM because the Doppler spreading corrupts the orthogonality of the OFDM subcarriers. A polyphase filtering technique [42] is used to provide time-domain shaping and duration extension to reduce the spectral sidelobes of the signal set and thus help preserve its orthogonality. The process introduces known ISI and adjacent channel interference (ACI) which are then removed by a post-processing equalizer and canceling filter [43].

18.11.3

Mitigation to Combat Loss in SNR

After implementing some form of mitigation to combat the possible distortion (frequency-selective or fast fading), the next step is to use some form of diversity to move the operating point from the error-performance curve labeled as “bad” in Fig. 18.12 to a curve that approaches AWGN performance. The term “diversity” denotes the various methods available for providing the receiver with uncorrelated renditions of the signal. Uncorrelated is the important feature here, since it would not help the receiver to have additional copies of the signal if the copies were all equally poor. Listed below are some of the ways in which diversity can be implemented.

• Time diversity: Transmit the signal on $L$ different time slots with time separation of at least $T_0$. Interleaving, often used with error-correction coding, is a form of time diversity.

• Frequency diversity: Transmit the signal on $L$ different carriers with frequency separation of at least $f_0$. Bandwidth expansion is a form of frequency diversity. The signal bandwidth, $W$, is expanded to be greater than $f_0$, thus providing the receiver with several independently fading signal replicas. This achieves frequency diversity of the order $L = W/f_0$. Whenever $W$ is made larger than $f_0$, there is the potential for frequency-selective distortion unless we further provide some mitigation such as equalization. Thus, an expanded bandwidth can improve system performance (via diversity) only if the frequency-selective distortion the diversity may have introduced is mitigated.

• Spread spectrum is a form of bandwidth expansion that excels at rejecting interfering signals. In the case of direct-sequence spread-spectrum (DS/SS), it was shown earlier that multipath components are rejected if they are delayed by more than one chip duration. However, in order to approach AWGN performance, it is necessary to compensate for the loss in energy contained in those rejected components. The Rake receiver (described later) makes it possible to coherently combine the energy from each of the multipath components arriving along different paths. Thus, used with a Rake receiver, DS/SS modulation can be said to achieve path diversity. The Rake receiver is needed in phase-coherent reception, but in differentially coherent bit detection, a simple delay line (one bit long) with complex conjugation suffices [44].

• Frequency-hopping spread-spectrum (FH/SS) is sometimes used as a diversity mechanism. The GSM system uses slow FH (217 hops/s) to compensate for those cases where the mobile user is moving very slowly (or not at all) and happens to be in a spectral null.

• Spatial diversity is usually accomplished through the use of multiple receive antennas, separated by a distance of at least 10 wavelengths for a base station (much less for a mobile station). Signal processing must be employed to choose the best antenna output or to coherently combine all the outputs. Systems have also been implemented with multiple spaced transmitters; an example is the Global Positioning System (GPS).

• Polarization diversity [45] is yet another way to achieve additional uncorrelated samples of the signal.

• Any diversity scheme may be viewed as a trivial form of repetition coding in space or time. However, there exist techniques for recovering the loss in SNR in a fading channel that are more efficient and more powerful than repetition coding. Error-correction coding represents a unique mitigation technique, because instead of providing more signal energy it reduces the $E_b/N_0$ required to achieve the desired error performance. Error-correction coding coupled with interleaving [20], [46]–[51] is probably the most prevalent of the mitigation schemes used to provide improved performance in a fading environment.
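To see why uncorrelated branches matter, a rough numerical sketch may help. It assumes $L$ independent Rayleigh-fading branches with the same mean SNR and a receiver that simply selects the strongest branch; these modelling choices, the function name, and the numbers are illustrative assumptions rather than anything specified in this chapter.

```python
import math

def outage_probability(threshold_snr, mean_snr, branches):
    """Probability that all `branches` independent Rayleigh-fading branches
    fall below `threshold_snr` at the same time (linear units, not dB).

    Assumption: each branch SNR is exponentially distributed with mean
    `mean_snr`, the usual model for Rayleigh fading.
    """
    single_branch = 1.0 - math.exp(-threshold_snr / mean_snr)
    return single_branch ** branches

# Hypothetical case: mean SNR is 20 dB above the outage threshold.
for L in (1, 2, 4):
    print(L, outage_probability(1.0, 100.0, L))
```

Under these assumptions the outage probability falls as the $L$th power of the single-branch value, which is the benefit that fully correlated copies would not provide.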

18.12

Summary of the Key Parameters Characterizing Fading Channels

We summarize the conditions that must be met so that the channel does not introduce frequency-selective distortion and fast-fading distortion. Combining the inequalities of Eqs. (18.14) and (18.23a–18.23b), we obtain

$$f_0 > W > f_d \tag{18.27a}$$

or

$$T_m < T_s < T_0 \tag{18.27b}$$

In other words, we want the channel coherence bandwidth to exceed our signalling rate, which in turn should exceed the fading rate of the channel. Recall that without distortion mitigation, f0 sets an upper limit on signalling rate, and fd sets a lower limit on it.
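The two inequalities in Eq. (18.27a) can be tested separately, which gives a simple way to label a channel. The following is a minimal sketch; the function name and the example numbers are hypothetical.

```python
def classify_channel(f0_hz, w_hz, fd_hz):
    """Label a channel using the two inequalities of Eq. (18.27a), f0 > W > fd.

    f0_hz: coherence bandwidth, w_hz: signalling rate (signal bandwidth),
    fd_hz: fading rate of the channel. Returns (dispersion, rapidity).
    """
    dispersion = "flat" if f0_hz > w_hz else "frequency-selective"
    rapidity = "slow" if w_hz > fd_hz else "fast"
    return dispersion, rapidity

# Hypothetical example: 100 kHz coherence bandwidth, 200 kHz signal
# bandwidth, 170 Hz fading rate.
print(classify_channel(f0_hz=100e3, w_hz=200e3, fd_hz=170.0))
```

A channel that is both "flat" and "slow" satisfies Eq. (18.27a); any other combination calls for some form of mitigation.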

18.12.1

Fast-Fading Distortion: Example #1

If the inequalities of Eq. (18.27a–18.27b) are not met and distortion mitigation is not provided, distortion will result. Consider the fast-fading case where the signalling rate is less than the channel fading rate, that is,

$$f_0 > W < f_d \tag{18.28}$$

Mitigation consists of using one or more of the following methods (see Fig. 18.13).

• Choose a modulation/demodulation technique that is most robust under fast-fading conditions. That means, for example, avoiding carrier recovery with PLLs, since the fast fading could keep a PLL from achieving lock conditions.

• Incorporate sufficient redundancy so that the transmission symbol rate exceeds the channel fading rate. As long as the transmission symbol rate does not exceed the coherence bandwidth, the channel can be classified as flat fading. However, even flat-fading channels will experience frequency-selective distortion whenever a channel null appears at the band center. Since this happens only occasionally, mitigation might be accomplished by adequate error-correction coding and interleaving.

• The above two mitigation approaches should result in the demodulator operating at the Rayleigh limit [20] (see Fig. 18.12). However, there may be an irreducible floor in the error-performance vs. $E_b/N_0$ curve due to the FM noise that results from the random Doppler spreading. The use of an in-band pilot tone and a frequency-control loop can lower this irreducible performance level.

• To avoid this error floor caused by random Doppler spreading, increase the signalling rate above the fading rate still further (100–200 × fading rate) [27]. This is one architectural motive behind time-division multiple access (TDMA) mobile systems.

• Incorporate error-correction coding and interleaving to lower the floor and approach AWGN performance.

18.12.2

Frequency-Selective Fading Distortion: Example #2

Consider the frequency-selective case where the coherence bandwidth is less than the symbol rate; that is,

$$f_0 < W > f_d \tag{18.29}$$

Mitigation consists of using one or more of the following methods (see Fig. 18.13).

• Since the transmission symbol rate exceeds the channel-fading rate, there is no fast-fading distortion. Mitigation of frequency-selective effects is necessary. One or more of the following techniques may be considered:

• Adaptive equalization, spread spectrum (DS or FH), OFDM, pilot signal. The European GSM system uses a midamble training sequence in each transmission time slot so that the receiver can learn the impulse response of the channel. It then uses a Viterbi equalizer (explained later) for mitigating the frequency-selective distortion.

• Once the distortion effects have been reduced, introduce some form of diversity and error-correction coding and interleaving in order to approach AWGN performance. For direct-sequence spread-spectrum (DS/SS) signalling, a Rake receiver (explained later) may be used to provide diversity by coherently combining multipath components that would otherwise be lost.

18.12.3

Fast-Fading and Frequency-Selective Fading Distortion: Example #3

Consider the case where the coherence bandwidth is less than the signalling rate, which in turn is less than the fading rate. The channel exhibits both fast-fading and frequency-selective fading, which is expressed as

$$f_0 < W < f_d \tag{18.30a}$$

or

$$f_0 < f_d \tag{18.30b}$$

Recalling from Eq. (18.27a–18.27b) that $f_0$ sets an upper limit on signalling rate and $f_d$ sets a lower limit on it, this is a difficult design problem because, unless distortion mitigation is provided, the maximum allowable signalling rate is (in the strict terms of the above discussion) less than the minimum allowable signalling rate. Mitigation in this case is similar to the initial approach outlined in example #1.

• Choose a modulation/demodulation technique that is most robust under fast-fading conditions.

• Use transmission redundancy in order to increase the transmitted symbol rate.

• Provide some form of frequency-selective mitigation in a manner similar to that outlined in example #2.

• Once the distortion effects have been reduced, introduce some form of diversity and error-correction coding and interleaving in order to approach AWGN performance.

18.13

The Viterbi Equalizer as Applied to GSM

Figure 18.14 shows the GSM time-division multiple access (TDMA) frame, having a duration of 4.615 ms and comprising 8 slots, one assigned to each active mobile user. A normal transmission burst occupying one slot of time contains 57 message bits on each side of a 26-bit midamble called a training or sounding sequence. The slot-time duration is 0.577 ms (or the slot rate is 1733 slots/s). The purpose of the midamble is to assist the receiver in estimating the impulse response of the channel in an adaptive way (during the time duration of each 0.577 ms slot). In order for the technique to be effective, the fading behavior of the channel should not change appreciably during the time interval of one slot. In other words, there should not be any fast-fading degradation during a slot time when the receiver is using knowledge from the midamble to compensate for the channel’s fading behavior.

FIGURE 18.14: The GSM TDMA frame and time-slot containing a normal burst.

Consider the example of a GSM receiver used aboard a high-speed train, traveling at a constant velocity of 200 km/hr (55.56 m/s). Assume the carrier frequency to be 900 MHz (the wavelength is $\lambda = 0.33$ m). From Eq. (18.21), we can calculate that a half-wavelength is traversed in approximately the time (coherence time)

$$T_0 \approx \frac{\lambda/2}{V} \approx 3\ \text{ms} \tag{18.31}$$

Therefore, the channel coherence time is over 5 times greater than the slot time of 0.577 ms. The time needed for a significant change in fading behavior is relatively long compared to the time duration of one slot. Note that the choices made in the design of the GSM TDMA slot time and midamble were undoubtedly influenced by the need to preclude fast fading with respect to a slot-time duration, as in this example.

The GSM symbol rate (or bit rate, since the modulation is binary) is 271 kilosymbols/s and the bandwidth is $W = 200$ kHz. If we consider that the typical rms delay spread in an urban environment is on the order of $\sigma_\tau = 2\ \mu$s, then using Eq. (18.13) the resulting coherence bandwidth is $f_0 \approx 100$ kHz. It should therefore be apparent that since $f_0 < W$, the GSM receiver must utilize some form of mitigation to combat frequency-selective distortion. To accomplish this goal, the Viterbi equalizer is typically implemented.

Figure 18.15 illustrates the basic functional blocks used in a GSM receiver for estimating the channel impulse response, which is then used to provide the detector with channel-corrected reference waveforms [52]. In the final step, the Viterbi algorithm is used to compute the maximum-likelihood sequence estimate (MLSE) of the message. As stated in Eq. (18.2), a received signal can be described in terms of the transmitted signal convolved with the impulse response of the channel, $h_c(t)$.
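The coherence-time and coherence-bandwidth figures quoted for the high-speed-train example can be checked with a few lines of arithmetic. This is only a sketch: the helper names are hypothetical, and the form $f_0 \approx 1/(5\sigma_\tau)$ is assumed for Eq. (18.13) because it reproduces the 100 kHz value quoted in the text.

```python
C = 3.0e8  # speed of light in m/s (rounded)

def coherence_time_half_wavelength(speed_mps, carrier_hz):
    """Time needed to traverse half a wavelength: T0 ~ (lambda / 2) / V."""
    wavelength = C / carrier_hz
    return (wavelength / 2.0) / speed_mps

def coherence_bandwidth(rms_delay_spread_s):
    """f0 ~ 1 / (5 * sigma_tau); assumed form of Eq. (18.13)."""
    return 1.0 / (5.0 * rms_delay_spread_s)

speed = 200.0 / 3.6                 # 200 km/hr expressed in m/s
t0 = coherence_time_half_wavelength(speed, 900e6)
f0 = coherence_bandwidth(2e-6)
slot = 0.577e-3                     # GSM slot duration in seconds

print(f"T0 = {t0 * 1e3:.2f} ms, T0/slot = {t0 / slot:.1f}, f0 = {f0 / 1e3:.0f} kHz")
```

The ratio of coherence time to slot time comes out slightly above 5, and the coherence bandwidth is half the 200 kHz signal bandwidth, consistent with the conclusions drawn in the text.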
FIGURE 18.15: The Viterbi equalizer as applied to GSM.

Using the notation $r_{tr}(t)$ for the received training sequence and $s_{tr}(t)$ for the transmitted training sequence, the received training sequence is the transmitted one convolved with the channel impulse response:

$$r_{tr}(t) = s_{tr}(t) * h_c(t) \tag{18.32}$$

where $*$ denotes convolution. At the receiver, $r_{tr}(t)$ is extracted from the normal burst and sent to a filter having impulse response $h_{mf}(t)$ that is matched to $s_{tr}(t)$. This matched filter yields at its output an estimate of $h_c(t)$, denoted $h_e(t)$, developed from Eq. (18.32) as follows:

$$h_e(t) = r_{tr}(t) * h_{mf}(t) = s_{tr}(t) * h_c(t) * h_{mf}(t) = R_s(t) * h_c(t) \tag{18.33}$$

where $R_s(t)$ is the autocorrelation function of $s_{tr}(t)$. If $R_s(t)$ is a highly peaked (impulse-like) function, then $h_e(t) \approx h_c(t)$. Next, using a windowing function, $w(t)$, we truncate $h_e(t)$ to form a computationally affordable function, $h_w(t)$. The window length must be large enough to compensate for the effect of typical channel-induced ISI. The required observation interval $L_0$ for the window can be expressed as the sum of two contributions. The interval of length $L_{CISI}$ is due to the controlled ISI caused by Gaussian filtering of the baseband pulses, which are then MSK modulated. The interval of length $L_C$ is due to the channel-induced ISI caused by multipath propagation; therefore, $L_0$ can be written as

$$L_0 = L_{CISI} + L_C \tag{18.34}$$
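The matched-filter estimate of Eq. (18.33) and the subsequent windowing can be sketched in discrete time as follows. The Barker-13 code is used here only because its autocorrelation is impulse-like; it is a stand-in and not the GSM midamble. The channel taps and the window length are likewise hypothetical.

```python
import numpy as np

# Barker-13 code: a stand-in training sequence with small autocorrelation
# sidelobes. It is NOT the actual GSM training sequence.
s_tr = np.array([1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1], dtype=float)

h_c = np.array([1.0, 0.5, 0.2])      # hypothetical symbol-spaced channel taps
r_tr = np.convolve(s_tr, h_c)        # Eq. (18.32): r_tr = s_tr * h_c

h_mf = s_tr[::-1]                    # matched filter: time-reversed sequence
y = np.convolve(r_tr, h_mf)          # Eq. (18.33): equals R_s * h_c

peak = len(s_tr) - 1                 # index of the zero-lag term of R_s
L0 = 3                               # window length in taps (assumed)
h_w = y[peak:peak + L0] / len(s_tr)  # rectangular window, normalized by R_s(0)

print(np.round(h_w, 3))
```

The estimate differs from the true taps only by the leakage through the autocorrelation sidelobes, which is why a highly peaked $R_s(t)$ is desirable.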

The GSM system is required to provide mitigation for distortion due to signal dispersions of approximately 15–20 µs. The bit duration is 3.69 µs. Thus, the Viterbi equalizer used in GSM has a memory of 4–6 bit intervals. For each $L_0$-bit interval in the message, the function of the Viterbi equalizer is to find the most likely $L_0$-bit sequence out of the $2^{L_0}$ possible sequences that might have been transmitted. Determining the most likely $L_0$-bit sequence requires that $2^{L_0}$ meaningful reference waveforms be created by modifying (or disturbing) the $2^{L_0}$ ideal waveforms in the same way that the channel has disturbed the transmitted message. Therefore, the $2^{L_0}$ reference waveforms are convolved with the windowed estimate of the channel impulse response, $h_w(t)$, in order to derive the disturbed or channel-corrected reference waveforms. Next, the channel-corrected reference waveforms are compared against the received data waveforms to yield metric calculations. However, before the comparison takes place, the received data waveforms are convolved with the known windowed autocorrelation function $w(t)R_s(t)$, transforming them in a manner comparable to that applied to the reference waveforms. This filtered message signal is compared to all possible $2^{L_0}$ channel-corrected reference signals, and metrics are computed as required by the Viterbi decoding algorithm (VDA). The VDA yields the maximum likelihood estimate of the transmitted sequence [34].
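The comparison against channel-corrected reference waveforms can be illustrated with an exhaustive search over all candidate sequences. This is a simplified sketch and not the GSM receiver: it works on symbol-rate samples, omits the pre-filtering of the received data by $w(t)R_s(t)$, and enumerates every candidate directly, whereas the Viterbi algorithm reaches the same decision far more efficiently by exploiting the trellis structure. The tap values, noise level, and function name are hypothetical.

```python
import itertools
import numpy as np

def mlse_exhaustive(received, h_w, n_bits):
    """Return the +/-1 sequence whose channel-corrected reference waveform
    is closest (squared Euclidean distance) to `received`.

    All 2**n_bits candidates are tested, so this is only practical for
    short blocks.
    """
    best_seq, best_metric = None, float("inf")
    for candidate in itertools.product((-1.0, 1.0), repeat=n_bits):
        reference = np.convolve(candidate, h_w)   # channel-corrected reference
        metric = float(np.sum((received - reference) ** 2))
        if metric < best_metric:
            best_seq, best_metric = candidate, metric
    return np.array(best_seq)

h_w = np.array([1.0, 0.5, 0.2])                   # hypothetical windowed estimate
sent = np.array([1.0, -1.0, -1.0, 1.0, 1.0])
rng = np.random.default_rng(0)
noise = 0.1 * rng.standard_normal(len(sent) + len(h_w) - 1)
detected = mlse_exhaustive(np.convolve(sent, h_w) + noise, h_w, len(sent))
print(detected)
```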

18.14

The Rake Receiver Applied to Direct-Sequence Spread-Spectrum (DS/SS) Systems

Interim Standard 95 (IS-95) describes a DS/SS cellular system that uses a Rake receiver [35]–[37] to provide path diversity. In Fig. 18.16, five instances of chip transmissions corresponding to the code sequence 1 0 1 1 1 are shown, with the transmission or observation times labeled $t_{-4}$ for the earliest transmission and $t_0$ for the latest. Each abscissa shows three “fingers” of a signal that arrive at the receiver with delay times $\tau_1$, $\tau_2$, and $\tau_3$. Assume that the intervals between the $t_i$ transmission times and the intervals between the $\tau_i$ delay times are each one chip long. From this, one can conclude that the finger arriving at the receiver at time $t_{-4}$, with delay $\tau_3$, is time coincident with two other fingers, namely the fingers arriving at times $t_{-3}$ and $t_{-2}$ with delays $\tau_2$ and $\tau_1$, respectively. Since, in this example, the delayed components are separated by exactly one chip time, they are just resolvable. At the receiver, there must be a sounding device that is dedicated to estimating the $\tau_i$ delay times. Note that for a terrestrial mobile radio system, the fading rate is relatively slow (milliseconds), or the channel coherence time is large compared to the chip time ($T_0 > T_{ch}$). Hence, the changes in $\tau_i$ occur slowly enough that the receiver can readily adapt to them. Once the $\tau_i$ delays are estimated, a separate correlator is dedicated to processing each finger. In this example, there would be three such dedicated correlators, each one processing a delayed version of the same chip sequence 1 0 1 1 1. In Fig. 18.16, each correlator receives chips with power profiles represented by the sequence of fingers shown along a diagonal line. Each correlator attempts to match these arriving chips with the same PN code, similarly delayed in time. At the end of a symbol interval (typically there may be hundreds or thousands of chips per symbol), the outputs of the correlators are coherently combined, and a symbol detection is made.

At the chip level, the Rake receiver resembles an equalizer, but its real function is to provide diversity. The interference-suppression nature of DS/SS systems stems from the fact that a code sequence arriving at the receiver merely one chip time late will be approximately orthogonal to the particular PN code with which the sequence is correlated. Therefore, any code chips that are delayed by one or more chip times will be suppressed by the correlator. The delayed chips only contribute to raising the noise floor (correlation sidelobes). The mitigation provided by the Rake receiver can be termed path diversity, since it allows the energy of a chip that arrives via multiple paths to be combined coherently. Without the Rake receiver, this energy would be transparent and therefore lost to the DS/SS system. In Fig. 18.16, looking vertically above point $\tau_3$, it is clear that there is interchip interference due to different fingers arriving simultaneously. The spread-spectrum processing gain allows the system to endure such interference at the chip level. No other equalization is deemed necessary in IS-95.
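A three-finger Rake receiver of the kind described above can be sketched for a single symbol as follows. This is an illustration under stated assumptions: the Barker-13 code stands in for a (much longer) PN code, the path delays and complex gains are hypothetical and taken as already known from the sounding device, and noise is omitted.

```python
import numpy as np

# Barker-13 stands in for a PN code segment; real systems use far longer codes.
code = np.array([1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1], dtype=float)

delays = [0, 1, 2]                       # finger delays in chips (assumed known)
gains = np.array([1.0, 0.6j, -0.4])      # hypothetical complex path gains
symbol = -1.0                            # transmitted BPSK symbol

# Received chips: sum of delayed, scaled copies of the spread symbol.
rx = np.zeros(len(code) + max(delays), dtype=complex)
for d, g in zip(delays, gains):
    rx[d:d + len(code)] += g * symbol * code

# One correlator per finger, each aligned to its own path delay.
fingers = np.array([np.dot(rx[d:d + len(code)], code) for d in delays])

# Coherent combining: weight each finger output by its conjugate path gain.
combined = np.sum(np.conj(gains) * fingers)
decision = 1.0 if combined.real > 0 else -1.0

print(decision, abs(fingers[0]), abs(combined))
```

In this sketch the combined output is larger in magnitude than the output of the strongest finger alone, which is the path-diversity gain; the chips from the other paths appear in each correlator only as small sidelobe terms.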

18.15

Conclusion

In this chapter, the major elements that contribute to fading in a communication channel have been characterized. Figure 18.1 was presented as a guide for the characterization of fading phenomena. Two types of fading, large-scale and small-scale, were described. Two manifestations of small-scale fading (signal dispersion and fading rapidity) were examined, and the examination involved two views, time and frequency. Two degradation categories were defined for dispersion: frequency-selective fading and flat-fading. Two degradation categories were defined for fading rapidity: fast and slow. The small-scale fading degradation categories were summarized in Fig. 18.6. A mathematical model using correlation and power density functions was presented in Fig. 18.7. This model yields a nice symmetry, a kind of “poetry” to help us view the Fourier transform and duality relationships that describe the fading phenomena. Further, mitigation techniques for ameliorating the effects of each degradation category were treated, and these techniques were summarized in Fig. 18.13. Finally, mitigation methods that have been implemented in two system types, GSM and CDMA systems meeting IS-95, were described.

FIGURE 18.16: Example of received chips seen by a 3-finger rake receiver.

References

[1] Sklar, B., Digital Communications: Fundamentals and Applications, Prentice-Hall, Englewood Cliffs, NJ, Ch. 4, 1988.
[2] Van Trees, H.L., Detection, Estimation, and Modulation Theory, Part I, John Wiley & Sons, New York, Ch. 4, 1968.
[3] Rappaport, T.S., Wireless Communications, Prentice-Hall, Upper Saddle River, NJ, Chs. 3 and 4, 1996.
[4] Greenwood, D. and Hanzo, L., Characterisation of Mobile Radio Channels, Mobile Radio Communications, Steele, R., Ed., Pentech Press, London, Ch. 2, 1994.
[5] Lee, W.C.Y., Elements of cellular mobile radio systems, IEEE Trans. Vehicular Technol., VT-35(2), 48–56, May 1986.
[6] Okumura, Y. et al., Field strength and its variability in VHF and UHF land mobile radio service, Rev. Elec. Comm. Lab., 16(9–10), 825–873, 1968.
[7] Hata, M., Empirical formulæ for propagation loss in land mobile radio services, IEEE Trans. Vehicular Technol., VT-29(3), 317–325, 1980.
[8] Seidel, S.Y. et al., Path loss, scattering and multipath delay statistics in four European cities for digital cellular and microcellular radiotelephone, IEEE Trans. Vehicular Technol., 40(4), 721–730, Nov. 1991.
[9] Cox, D.C., Murray, R., and Norris, A., 800 MHz attenuation measured in and around suburban houses, AT&T Bell Laboratory Technical Journal, 63(6), 921–954, Jul.–Aug. 1984.
[10] Schilling, D.L. et al., Broadband CDMA for personal communications systems, IEEE Commun. Mag., 29(11), 86–93, Nov. 1991.
[11] Andersen, J.B., Rappaport, T.S., and Yoshida, S., Propagation measurements and models for wireless communications channels, IEEE Commun. Mag., 33(1), 42–49, Jan. 1995.
[12] Amoroso, F., Investigation of signal variance, bit error rates and pulse dispersion for DSPN signalling in a mobile dense scatterer ray tracing model, Intl. J. Satellite Commun., 12, 579–588, 1994.
[13] Bello, P.A., Characterization of randomly time-variant linear channels, IEEE Trans. Commun. Syst., 360–393, Dec. 1963.
[14] Proakis, J.G., Digital Communications, McGraw-Hill, New York, Ch. 7, 1983.
[15] Green, P.E., Jr., Radar astronomy measurement techniques, MIT Lincoln Laboratory, Lexington, MA, Tech. Report No. 282, Dec. 1962.
[16] Pahlavan, K. and Levesque, A.H., Wireless Information Networks, John Wiley & Sons, New York, Chs. 3 and 4, 1995.
[17] Lee, W.C.Y., Mobile Cellular Communications, McGraw-Hill, New York, 1989.
[18] Amoroso, F., Use of DS/SS signalling to mitigate Rayleigh fading in a dense scatterer environment, IEEE Personal Commun., 3(2), 52–61, Apr. 1996.

[19] Clarke, R.H., A statistical theory of mobile radio reception, Bell Syst. Tech. J., 47(6), 957–1000, Jul.–Aug. 1968.
[20] Bogusch, R.L., Digital Communications in Fading Channels: Modulation and Coding, Mission Research Corp., Santa Barbara, California, Report No. MRC-R-1043, Mar. 11, 1987.
[21] Amoroso, F., The bandwidth of digital data signals, IEEE Commun. Mag., 18(6), 13–24, Nov. 1980.
[22] Bogusch, R.L. et al., Frequency selective propagation effects on spread-spectrum receiver tracking, Proc. IEEE, 69(7), 787–796, Jul. 1981.
[23] Jakes, W.C., Ed., Microwave Mobile Communications, John Wiley & Sons, New York, 1974.
[24] Joint Technical Committee of Committee T1 R1P1.4 and TIA TR46.3.3/TR45.4.4 on Wireless Access, Draft Final Report on RF Channel Characterization, Paper No. JTC(AIR)/94.01.17-238R4, Jan. 17, 1994.
[25] Bello, P.A. and Nelin, B.D., The influence of fading spectrum on the binary error probabilities of incoherent and differentially coherent matched filter receivers, IRE Trans. Commun. Syst., CS-10, 160–168, Jun. 1962.
[26] Amoroso, F., Instantaneous frequency effects in a Doppler scattering environment, IEEE International Conference on Communications, 1458–1466, Jun. 7–10, 1987.
[27] Bateman, A.J. and McGeehan, J.P., Data transmission over UHF fading mobile radio channels, IEE Proc., 131, Pt. F(4), 364–374, Jul. 1984.
[28] Feher, K., Wireless Digital Communications, Prentice-Hall, Upper Saddle River, NJ, 1995.
[29] Davarian, F., Simon, M., and Sumida, J., DMSK: A Practical 2400-bps Receiver for the Mobile Satellite Service, Jet Propulsion Laboratory Publication 85-51 (MSAT-X Report No. 111), Jun. 15, 1985.
[30] Rappaport, T.S., Wireless Communications, Prentice-Hall, Upper Saddle River, NJ, Ch. 6, 1996.
[31] Bogusch, R.L., Guigliano, F.W., and Knepp, D.L., Frequency-selective scintillation effects and decision feedback equalization in high data-rate satellite links, Proc. IEEE, 71(6), 754–767, Jun. 1983.
[32] Qureshi, S.U.H., Adaptive equalization, Proc. IEEE, 73(9), 1340–1387, Sept. 1985.
[33] Forney, G.D., The Viterbi algorithm, Proc. IEEE, 61(3), 268–278, Mar. 1973.
[34] Sklar, B., Digital Communications: Fundamentals and Applications, Prentice-Hall, Englewood Cliffs, NJ, Ch. 6, 1988.
[35] Price, R. and Green, P.E., Jr., A communication technique for multipath channels, Proc. IRE, 555–570, Mar. 1958.
[36] Turin, G.L., Introduction to spread-spectrum antimultipath techniques and their application to urban digital radio, Proc. IEEE, 68(3), 328–353, Mar. 1980.
[37] Simon, M.K., Omura, J.K., Scholtz, R.A., and Levitt, B.K., Spread Spectrum Communications Handbook, McGraw-Hill, New York, 1994.
[38] Birchler, M.A. and Jasper, S.C., A 64 kbps Digital Land Mobile Radio System Employing M16QAM, Proceedings of the 1992 IEEE Intl. Conference on Selected Topics in Wireless Communications, Vancouver, British Columbia, 158–162, Jun. 25–26, 1992.
[39] Sari, H., Karam, G., and Jeanclaude, I., Transmission techniques for digital terrestrial TV broadcasting, IEEE Commun. Mag., 33(2), 100–109, Feb. 1995.
[40] Cavers, J.K., The performance of phase locked transparent tone-in-band with symmetric phase detection, IEEE Trans. Commun., 39(9), 1389–1399, Sept. 1991.
[41] Moher, M.L. and Lodge, J.H., TCMP: A modulation and coding strategy for Rician fading channel, IEEE J. Selected Areas Commun., 7(9), 1347–1355, Dec. 1989.

[42] Harris, F., On the Relationship Between Multirate Polyphase FIR Filters and Windowed, Overlapped FFT Processing, Proceedings of the Twenty Third Annual Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, California, 485–488, Oct. 30 to Nov. 1, 1989.
[43] Lowdermilk, R.W. and Harris, F., Design and Performance of Fading Insensitive Orthogonal Frequency Division Multiplexing (OFDM) using Polyphase Filtering Techniques, Proceedings of the Thirtieth Annual Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, California, Nov. 3–6, 1996.
[44] Kavehrad, M. and Bodeep, G.E., Design and experimental results for a direct-sequence spread-spectrum radio using differential phase-shift keying modulation for indoor wireless communications, IEEE JSAC, SAC-5(5), 815–823, Jun. 1987.
[45] Hess, G.C., Land-Mobile Radio System Engineering, Artech House, Boston, 1993.
[46] Hagenauer, J. and Lutz, E., Forward error correction coding for fading compensation in mobile satellite channels, IEEE JSAC, SAC-5(2), 215–225, Feb. 1987.
[47] McLane, P.I. et al., PSK and DPSK trellis codes for fast fading, shadowed mobile satellite communication channels, IEEE Trans. Commun., 36(11), 1242–1246, Nov. 1988.
[48] Schlegel, C. and Costello, D.J., Jr., Bandwidth efficient coding for fading channels: code construction and performance analysis, IEEE JSAC, 7(9), 1356–1368, Dec. 1989.
[49] Edbauer, F., Performance of interleaved trellis-coded differential 8–PSK modulation over fading channels, IEEE J. Selected Areas Commun., 7(9), 1340–1346, Dec. 1989.
[50] Soliman, S. and Mokrani, K., Performance of coded systems over fading dispersive channels, IEEE Trans. Commun., 40(1), 51–59, Jan. 1992.
[51] Divsalar, D. and Pollara, F., Turbo Codes for PCS Applications, Proc. ICC’95, Seattle, Washington, 54–59, Jun. 18–22, 1995.
[52] Hanzo, L. and Stefanov, J., The Pan-European Digital Cellular Mobile Radio System, known as GSM, Mobile Radio Communications, Steele, R., Ed., Pentech Press, London, Ch. 8, 1992.

Paulraj, A.J. “Space-Time Processing” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Space-Time Processing

Arogyaswami J. Paulraj
Stanford University

19.1 Introduction
19.2 The Space-Time Wireless Channel
Multipath Propagation • Space-Time Channel Model
19.3 Signal Models
Signal Model at Base Station (Reverse Link) • Signal Model at Mobile (Forward Link) • Discrete Time Signal Model • Signal-Plus-Interference Model
19.4 ST Receive Processing (Base)
Receive Channel Estimation (Base) • Multiuser ST Receive Algorithms • Single-User ST Receive Algorithms
19.5 ST Transmit Processing (Base)
Transmit Channel Estimation (Base) • ST Transmit Processing • Forward Link Processing at the Mobile
19.6 Summary
Defining Terms
References

19.1

Introduction

Mobile radio signal processing includes modulation and demodulation, channel coding and decoding, equalization, and diversity. Current cellular modems mainly use temporal signal processing. Spatio-temporal signal processing can improve average signal power, mitigate fading, and reduce cochannel and intersymbol interference, which can significantly improve the capacity, coverage, and quality of wireless networks. A space-time processing radio operates simultaneously on multiple antennas by processing signal samples both in space and time. On receive, space-time (ST) processing can increase array gain and spatial and temporal diversity, and reduce cochannel and intersymbol interference. On transmit, the spatial dimension can enhance array gain, improve diversity, and reduce the generation of cochannel and intersymbol interference.

19.2

The Space-Time Wireless Channel

19.2.1 Multipath Propagation

Multipath scattering gives rise to a number of propagation effects described below.

Scatterers Local to Mobile

Scattering local to the mobile is caused by buildings and other scatterers in the vicinity of the mobile (a few tens of meters). Mobile motion and local scattering give rise to Doppler spread, which causes time-selective fading. For a mobile traveling at 65 mph, the Doppler spread is about 200 Hz in the 1900 MHz band. While local scatterers contribute to Doppler spread, the delay spread they contribute is usually insignificant because of the small scattering radius. Likewise, the angle spread induced at the base station is also small.

Remote Scatterers

The emerging wavefront from the local scatterers may then travel directly to the base or may be scattered toward the base by remote dominant scatterers, giving rise to specular multipaths. These remote scatterers can be either terrain features or high-rise building complexes. Remote scattering can cause significant delay and angle spreads.

Scatterers Local to Base

Once these multiple wavefronts reach the base station, they may be scattered further by local structures such as buildings or other structures in the vicinity of the base. Such scattering will be more pronounced for low-elevation and below-rooftop antennas. Scattering local to the base can cause severe angle spread, which in turn causes space-selective fading. See Figure 19.1 for a depiction of different types of scattering.

FIGURE 19.1: Multipath propagation in macrocells.

The forward link channel is affected in similar ways by these scatterers, but in a reverse order.
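The Doppler figure quoted for scattering local to the mobile can be checked with the standard relation $f_d = v f_c / c$ for the maximum Doppler shift. The short sketch below uses a hypothetical helper name and a rounded value for the speed of light.

```python
def max_doppler_hz(speed_mps, carrier_hz, c=3.0e8):
    """Maximum Doppler shift f_d = v * f_c / c for a mobile moving at speed v."""
    return speed_mps * carrier_hz / c

speed = 65 * 0.44704                 # 65 mph converted to m/s
fd = max_doppler_hz(speed, 1.9e9)    # 1900 MHz band
print(round(fd))
```

The result is somewhat under 200 Hz, of the same order as the value quoted in the text.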

19.2.2

Space-Time Channel Model

The effect of delay, Doppler, and angle spreads makes the channel selective in frequency, time, and space. Figure 19.2 shows plots of the frequency response at each branch of a four-antenna receiver operating with a 200 kHz bandwidth. We can see that the channel is highly frequency-selective, since the delay spread reaches 10 to 15 µs. Also, an angle spread of 30° causes variations in the channel from antenna to antenna. The channel variation in time depends upon the Doppler spread. As expected, the plots show negligible channel variation between adjacent time slots, despite the high velocity of the mobile (100 km/h). Use of longer time slots, such as in IS-136, will result in significant channel variations over the slot period. Therefore, space-time processing should address the effect of the three spreads on the signal.

FIGURE 19.2: ST channel.

19.3

Signal Models

We develop signal models for nonspread modulation used in time division multiple access (TDMA) systems.

19.3.1

Signal Model at Base Station (Reverse Link)

We assume that antenna arrays are used at the base station only and that the mobile has a single omni antenna. The mobile transmits a channel coded and modulated signal which does not incorporate any spatial (or indeed any special temporal) processing. See Figure 19.3. The baseband signal $x_i(t)$ received by the base station at the $i$th element of an $m$-element antenna array is given by

$$x_i(t) = \sum_{l=1}^{L} a_i(\theta_l)\,\alpha_l^R(t)\,u(t - \tau_l) + n_i(t) \tag{19.1}$$

where $L$ is the number of multipaths, $a_i(\theta_l)$ is the response of the $i$th element for the $l$th path from direction $\theta_l$, $\alpha_l^R(t)$ is the complex path fading, $\tau_l$ is the path delay, $n_i(t)$ is the additive noise, and $u(\cdot)$ is the transmitted signal, which depends on the modulation waveform and the information data stream.

FIGURE 19.3: ST Processing Model.

For a linear modulation, the baseband transmitted signal is given by

$$u(t) = \sum_{k} g(t - kT)\,s(k) \tag{19.2}$$

where $g(\cdot)$ is the pulse shaping waveform and $s(k)$ represents the information bits. In the above model we have assumed that the inverse signal bandwidth is large compared to the travel time across the array. Therefore, the complex envelopes of the signals received by different antennas from a given path are identical except for phase and amplitude differences that depend on the path angle-of-arrival, array geometry, and the element pattern. This angle-of-arrival dependent phase and amplitude response at the $i$th element is $a_i(\theta_l)$. We collect all the element responses to a path arriving from angle $\theta_l$ into an $m$-dimensional vector, called the array response vector, defined as

$$\mathbf{a}(\theta_l) = [a_1(\theta_l)\ \ a_2(\theta_l)\ \ \ldots\ \ a_m(\theta_l)]^T$$

$$\mathbf{x}(t) = \sum_{l=1}^{L} \mathbf{a}(\theta_l)\,\alpha_l^R(t)\,u(t - \tau_l) + \mathbf{n}(t) \tag{19.3}$$

where $\mathbf{x}(t)$ and $\mathbf{n}(t)$ are $m$-dimensional complex vectors. The fading $|\alpha_l^R(t)|$ is Rayleigh or Rician distributed depending on the propagation model.
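The vector model of Eq. (19.3) can be simulated directly once an array geometry is chosen. The sketch below assumes a uniform linear array with half-wavelength spacing, integer-sample path delays, and path fading held constant over the block; none of these choices comes from the chapter, and the function names are hypothetical.

```python
import numpy as np

def ula_response(theta_rad, m, spacing_wavelengths=0.5):
    """Array response vector a(theta) for an m-element uniform linear array.

    The ULA geometry is an assumption of this sketch; Eq. (19.3) itself
    holds for any array geometry and element pattern.
    """
    k = np.arange(m)
    return np.exp(-2j * np.pi * spacing_wavelengths * k * np.sin(theta_rad))

def array_output(u, thetas, alphas, delays, m, noise_std=0.0, rng=None):
    """Sampled version of Eq. (19.3): x = sum_l a(theta_l) alpha_l u(t - tau_l) + n."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = len(u)
    x = np.zeros((m, n), dtype=complex)
    for theta, alpha, d in zip(thetas, alphas, delays):
        delayed = np.concatenate([np.zeros(d), u[:n - d]])   # u(t - tau_l)
        x += np.outer(ula_response(theta, m), alpha * delayed)
    noise = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
    return x + noise_std * noise / np.sqrt(2)

u = np.sign(np.random.default_rng(1).standard_normal(16))    # toy BPSK stream
x = array_output(u, thetas=[0.0, np.pi / 6], alphas=[1.0, 0.5j],
                 delays=[0, 2], m=4)
print(x.shape)
```

Each row of `x` is one antenna output $x_i(t)$ from Eq. (19.1); the rows differ only through the array response, as the narrowband assumption in the text implies.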

19.3.2

Signal Model at Mobile (Forward Link)

In this model, the base station transmits different signals from each antenna with a defined relationship between them. In the case of a two-element array, some examples of transmitted signals $u_i(t)$, $i = 1, 2$, can be: (a) delay diversity: $u_2(t) = u_1(t - T)$, where $T$ is the symbol period; (b) Doppler diversity: $u_2(t) = u_1(t)e^{j\omega t}$, where $\omega$ is the differential carrier offset; (c) beamforming: $u_2(t) = w_2 u_1(t)$, where $w_2$ is a complex scalar; and (d) space-time coding: $u_1(t) = \sum_k g(t - kT)\,s_1(k)$, $u_2(t) = \sum_k g(t - kT)\,s_2(k)$, where $s_1(k)$ and $s_2(k)$ are related to the symbol sequence $s(k)$ through coding. The received signal at the mobile is then given by

$$x(t) = \sum_{i=1}^{m} \sum_{l=1}^{L} a_i(\theta_l)\,\alpha_l^F(t)\,u_i(t - \tau_l) + n(t) \tag{19.4}$$

where the path delay $\tau_l$ and angle parameters $\theta_l$ are the same as those of the reverse link, and $\alpha_l^F(t)$ is the complex fading on the forward link. In (fast) TDD systems $\alpha_l^F(t)$ will be identical to the reverse link complex fading $\alpha_l^R(t)$. In an FDD system $\alpha_l^F(t)$ and $\alpha_l^R(t)$ will usually have the same statistics but will in general be uncorrelated with each other. We assume $a_i(\theta_l)$ is the same for both links. This is only approximately true in FDD systems. If simple beamforming alone is used in transmit, the signals radiated from the antennas are related by a complex scalar and result in a directional transmit beam, which may selectively couple into the multipath environment and differentially scale the power in each path. The signal received by the mobile in this case can be written as

$$x(t) = \sum_{l=1}^{L} \mathbf{w}^H \mathbf{a}(\theta_l)\,\alpha_l^F(t)\,u(t - \tau_l) + n(t) \tag{19.5}$$

where w is the beamforming vector.
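The way a transmit beam differentially scales the paths in Eq. (19.5) can be seen by evaluating the factor $\mathbf{w}^H \mathbf{a}(\theta_l)\,\alpha_l^F$ for each path. The sketch below assumes a uniform linear array with half-wavelength spacing and a beam steered at one of the paths; the geometry, angles, and fading values are all hypothetical.

```python
import numpy as np

def ula_response(theta_rad, m, spacing_wavelengths=0.5):
    """a(theta) for a uniform linear array (geometry assumed for illustration)."""
    k = np.arange(m)
    return np.exp(-2j * np.pi * spacing_wavelengths * k * np.sin(theta_rad))

m = 4
w = ula_response(np.deg2rad(20.0), m) / m        # beam steered to 20 degrees

# Hypothetical path angles and forward-link fading values.
path_angles = np.deg2rad([20.0, -40.0])
alphas = np.array([1.0, 0.8])

# Effective complex gain of each path in Eq. (19.5): w^H a(theta_l) alpha_l^F.
# np.vdot conjugates its first argument, so vdot(w, a) equals w^H a.
effective = np.array([np.vdot(w, ula_response(th, m)) for th in path_angles]) * alphas
print(np.round(np.abs(effective), 3))
```

The path lying in the beam direction keeps its full amplitude, while the path arriving from well outside the main lobe is strongly attenuated.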

19.3.3 Discrete Time Signal Model

The channel model described above uses physical path parameters such as path gain, delay, and angle of arrival. In practice these are not known, and the discrete-time received signal uses a more convenient discretized “symbol response” channel model. We derive a discrete-time signal model at the base station antenna array. Let the continuous-time output from the receive antenna array $\mathbf{x}(t)$ be sampled at the symbol rate at instants $t = t_o + kT$. Then the vector array output may be written as

$$\mathbf{x}(k) = \mathbf{H}^R \mathbf{s}(k) + \mathbf{n}(k) \qquad (19.6)$$

where $\mathbf{H}^R$ is the reverse link symbol response channel (an $m \times N$ matrix) that captures the effects of the array response, symbol waveform, and path fading. $m$ is the number of antennas, $N$ is the channel length in symbol periods, and $\mathbf{n}(k)$ is the sampled vector of additive noise. Note that $\mathbf{n}(k)$ may be colored in space and time, as discussed later. $\mathbf{H}^R$ is assumed to be time invariant. $\mathbf{s}(k)$ is a vector of $N$ consecutive elements of the data sequence and is defined as

$$\mathbf{s}(k) = \begin{bmatrix} s(k) \\ \vdots \\ s(k-N+1) \end{bmatrix} \qquad (19.7)$$

Note that we have assumed a sampling rate of one sample per symbol. Higher sampling rates may be used. Also, $\mathbf{H}^R$ is given by

$$\mathbf{H}^R = \sum_{l=1}^{L} \mathbf{a}(\theta_l)\,\alpha_l^R\,\mathbf{g}^T(\tau_l) \qquad (19.8)$$

where $\mathbf{g}(\tau_l)$ is a vector defined by $T$-spaced sampling of the pulse shaping function $g(\cdot)$ with an offset of $\tau_l$. Likewise the forward discrete signal model at the mobile is given by

$$x(k) = \sum_{i=1}^{m} \mathbf{h}_i^F \mathbf{s}(k) + n(k) \qquad (19.9)$$

where $\mathbf{h}_i^F$ is a $1 \times N$ composite channel from the symbol sequence via the $i$th antenna to the mobile receiver, which includes the effect of transmit ST processing at the base station. In the case of two-antenna delay diversity, $\mathbf{h}_i^F$ is given by

$$\mathbf{h}_1^F = \sum_{l=1}^{L} a_1(\theta_l)\,\alpha_l^F\,\mathbf{g}^T(\tau_l) \qquad (19.10)$$

and

$$\mathbf{h}_2^F = \sum_{l=1}^{L} a_2(\theta_l)\,\alpha_l^F\,\mathbf{g}^T(\tau_l - T) \qquad (19.11)$$

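If spatial beamforming alone is used, the signal model becomes

$$x(k) = \mathbf{w}^H \mathbf{H}^F \mathbf{s}(k) + n(k) \qquad (19.12)$$

where $\mathbf{H}^F$ is the intrinsic forward (F) channel given by

$$\mathbf{H}^F = \begin{bmatrix} \mathbf{h}_1^F \\ \mathbf{h}_2^F \end{bmatrix} = \sum_{l=1}^{L} \mathbf{a}(\theta_l)\,\alpha_l^F\,\mathbf{g}^T(\tau_l) \qquad (19.13)$$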
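Equations (19.8) and (19.13) share the same structure: a sum over paths of the outer product of an array response vector and a sampled pulse vector, scaled by the path fading. The following is a minimal sketch of that construction, not a definitive implementation. The uniform linear array with half-wavelength spacing, the sinc pulse, and the function names `array_response`, `pulse_samples`, and `symbol_response_channel` are all illustrative assumptions that do not come from the text.

```python
import numpy as np

def array_response(theta, m):
    """Assumed array model: uniform linear array, half-wavelength spacing."""
    return np.exp(-1j * np.pi * np.arange(m) * np.sin(theta))

def pulse_samples(tau, N, T=1.0):
    """T-spaced samples g(nT - tau), n = 0..N-1, of an assumed sinc pulse."""
    n = np.arange(N)
    return np.sinc((n * T - tau) / T)

def symbol_response_channel(thetas, alphas, taus, m, N, T=1.0):
    """H = sum_l a(theta_l) * alpha_l * g^T(tau_l), an m x N matrix."""
    H = np.zeros((m, N), dtype=complex)
    for theta, alpha, tau in zip(thetas, alphas, taus):
        H += alpha * np.outer(array_response(theta, m), pulse_samples(tau, N, T))
    return H
```

With a single broadside path of unit gain and zero delay, only the first column of the result is nonzero, which is a quick sanity check on the indexing.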

19.3.4 Signal-Plus-Interference Model

The overall received signal-plus-interference-and-noise model at the base station antenna array can be written as

$$\mathbf{x}(k) = \mathbf{H}_s^R \mathbf{s}_s(k) + \sum_{q=1}^{Q-1} \mathbf{H}_q^R \mathbf{s}_q(k) + \mathbf{n}(k) \qquad (19.14)$$

where $\mathbf{H}_s^R$ and $\mathbf{H}_q^R$ are channels for signal and CCI, respectively, while $\mathbf{s}_s$ and $\mathbf{s}_q$ are the corresponding data sequences. Note that Eq. (19.14) appears to suggest that the signal and interference are baud synchronous. However, this can be relaxed and the time offsets can be absorbed into the channel $\mathbf{H}_q^R$. Similarly, the signal at the mobile can also be extended to include CCI. Note that in this case, the source of interference is from other base stations (in TDMA) and the channel is between the interfering base station and the desired mobile. It is often convenient to handle signals in blocks. Collecting $M$ consecutive snapshots of $\mathbf{x}(\cdot)$ corresponding to time instants $k, \ldots, k + M - 1$ (and dropping subscripts for a moment), we get

$$\mathbf{X}(k) = \mathbf{H}^R \mathbf{S}(k) + \mathbf{N}(k) \qquad (19.15)$$

where $\mathbf{X}(k)$, $\mathbf{S}(k)$ and $\mathbf{N}(k)$ are defined appropriately. Similarly the received signal at the mobile in the forward link has a block representation using a row vector.

19.4 ST Receive Processing (Base)

The base station receiver receives the signals from the array antenna, which consist of the signals from the desired mobile and the cochannel signals along with associated intersymbol interference and fading. The task of the receiver is to maximize signal power and mitigate fading, CCI, and ISI. There are two broad approaches for doing this: one is multiuser detection, wherein we demodulate both the cochannel and desired signals jointly; the other is to cancel CCI. The structure of the receiver depends on the nature of the channel estimates available and the tolerable receiver complexity. There are a number of options and we discuss only a few salient cases. Before discussing the receiver processing, we discuss how the receive channel is estimated.

19.4.1 Receive Channel Estimation (Base)

In many mobile communications standards, such as GSM and IS-54, explicit training signals are inserted inside the TDMA data bursts. Let $\mathbf{T}$ be the training sequence arranged in a matrix form ($\mathbf{T}$ is arranged to be a Toeplitz matrix). Then, during the training burst, the received data is given by

$$\mathbf{X} = \mathbf{H}^R \mathbf{T} + \mathbf{N} \qquad (19.16)$$

Clearly $\mathbf{H}^R$ can be estimated using least squares as

$$\hat{\mathbf{H}}^R = \mathbf{X}\mathbf{T}^{\dagger} \qquad (19.17)$$

where $\mathbf{T}^{\dagger} = \mathbf{T}^H \left(\mathbf{T}\mathbf{T}^H\right)^{-1}$. The use of training consumes spectrum resource. In GSM, for example, about 20% of the bits are dedicated to training. Moreover, in rapidly varying mobile channels, we may have to retrain frequently, resulting in even poorer spectral efficiency. There is, therefore, increased interest in blind methods that can estimate a channel without an explicit training signal.
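The least-squares estimate of Eq. (19.17) can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than a reference implementation: the helper names `training_matrix` and `ls_channel_estimate` are hypothetical, and the training matrix is built so that each column is the stacked symbol vector of Eq. (19.7), which makes it Toeplitz.

```python
import numpy as np

def training_matrix(t, N):
    """Arrange training symbols t into an N x M Toeplitz matrix.

    Column k holds [t[k+N-1], ..., t[k]]^T, i.e. the stacked symbol
    vector of Eq. (19.7) for a channel of length N.
    """
    t = np.asarray(t)
    M = len(t) - N + 1
    return np.array([t[k + N - 1 - np.arange(N)] for k in range(M)]).T

def ls_channel_estimate(X, T):
    """H_hat = X T^H (T T^H)^{-1}, the least-squares solution of X = H T + N."""
    Th = T.conj().T
    return X @ Th @ np.linalg.inv(T @ Th)
```

In the noiseless case the estimate reproduces the channel exactly, provided the training matrix has full row rank. With noise, the estimation error shrinks as the training burst gets longer, which is the spectral-efficiency trade-off noted above.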

19.4.2 Multiuser ST Receive Algorithms

In multiuser (MU) algorithms, we address the problem of jointly demodulating the multiple signals. Recall the received signal is given by

$$\mathbf{X} = \mathbf{H}^R \mathbf{S} + \mathbf{N} \qquad (19.18)$$

where $\mathbf{H}^R$ and $\mathbf{S}$ are suitably defined to include multiple users and are of dimensions $m \times NQ$ and $NQ \times M$, respectively. If the channels for all the arriving signals are known, then we jointly demodulate all the user data sequences using multiuser maximum likelihood sequence estimation (MLSE). Starting with the data model in Eq. (19.18), we can then search for multiple user data sequences that minimize the ML cost function

$$\min_{\mathbf{S}} \left\| \mathbf{X} - \mathbf{H}^R \mathbf{S} \right\|_F^2 \qquad (19.19)$$

The multiuser MLSE will have a large number of states in the trellis. Efficient techniques for implementing this complex receiver are needed. Multiuser MLSE detection schemes outperform all other receivers.
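To make the cost function of Eq. (19.19) concrete, here is a small sketch that minimizes it by exhaustive search. It deliberately simplifies the problem: it assumes a memoryless channel ($N = 1$), so that the Frobenius norm decouples across snapshots, and it enumerates every symbol combination instead of running a trellis search. The function name `joint_ml_detect` and the BPSK default alphabet are assumptions made for illustration.

```python
import itertools
import numpy as np

def joint_ml_detect(X, H, alphabet=(-1.0, 1.0)):
    """Jointly detect Q users by brute force, assuming N = 1 (no ISI).

    X: m x M received block, H: m x Q channel (one column per user).
    Returns the Q x M symbol matrix minimizing ||X - H S||_F^2.
    """
    Q = H.shape[1]
    # Every possible symbol combination across users: Q x C, C = |alphabet|^Q
    candidates = np.array(list(itertools.product(alphabet, repeat=Q))).T
    # Residual for every (candidate, snapshot) pair: m x C x M
    err = X[:, None, :] - (H @ candidates)[:, :, None]
    cost = np.sum(np.abs(err) ** 2, axis=0)          # C x M
    return candidates[:, np.argmin(cost, axis=0)]    # Q x M
```

The number of candidates grows as the alphabet size raised to the power $Q$ (and to $NQ$ once ISI is included), which is why the text calls for efficient implementations of the full multiuser MLSE.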

19.4.3 Single-User ST Receive Algorithms

In this scheme we only demodulate the desired user and cancel the CCI. Therefore, after CCI cancellation we can use MLSE receivers to handle diversity and ISI. In this method there is potential conflict between CCI mitigation and diversity maximization. We are forced to allocate the available degrees of freedom (antennas) to the competing requirements. One approach is to cancel CCI by a space-time filter followed by an MLSE receiver to handle ISI. We do this by reformulating the MLSE criterion to arrive at a joint solution for the ST-MMSE filter and the effective channel for the scalar MLSE. Another approach is to use an ST-MMSE receiver to handle both CCI and ISI. In a space-time filter (equalizer-beamformer), $\mathbf{W}$ has the following form

$$\mathbf{W}(k) = \begin{bmatrix} w_{11}(k) & \cdots & w_{1M}(k) \\ \vdots & \cdots & \vdots \\ w_{m1}(k) & \cdots & w_{mM}(k) \end{bmatrix} \qquad (19.20)$$

In order to obtain a convenient formulation for the space-time filter output, we introduce the quantities $\mathcal{W}(k)$ and $\mathcal{X}(k)$ as follows

$$\mathcal{X}(k) = \mathrm{vec}\left(\mathbf{X}(k)\right) \ (mM \times 1), \qquad \mathcal{W}(k) = \mathrm{vec}\left(\mathbf{W}(k)\right) \ (mM \times 1) \qquad (19.21)$$

where the operator $\mathrm{vec}(\cdot)$ is defined as:

$$\mathrm{vec}\left([\mathbf{v}_1 \cdots \mathbf{v}_M]\right) = \begin{bmatrix} \mathbf{v}_1 \\ \vdots \\ \mathbf{v}_M \end{bmatrix}$$

The ST-MMSE filter chooses the filter weights to achieve the minimum mean square error. The ST-MMSE filter takes the familiar form

$$\mathcal{W} = \mathbf{R}_{\mathcal{X}\mathcal{X}}^{-1} \mathcal{H}^R \qquad (19.22)$$

where $\mathcal{H}^R$ is one column of $\mathrm{vec}(\mathbf{H}^R)$. In ST-MMSE, CCI cancellation and spatial diversity compete for the spatial degrees of freedom. Likewise, temporal diversity and ISI cancellation compete for the temporal degrees of freedom.
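A sample-based version of the ST-MMSE solution in Eq. (19.22) can be sketched as follows. Two assumptions are made here that are not in the text: the covariance is replaced by its sample estimate over a training block, and the channel vector is replaced by the sample cross-correlation between the stacked data and the known training symbols (the two coincide in expectation when the symbols are uncorrelated with unit power). The names `stack_snapshots` and `st_mmse_weights` are hypothetical.

```python
import numpy as np

def stack_snapshots(X, M):
    """Build vec() of every M consecutive snapshots of the m x K array output.

    Returns an mM x (K - M + 1) matrix; column k stacks X[:, k], ..., X[:, k+M-1].
    """
    m, K = X.shape
    cols = [X[:, k:k + M].reshape(-1, order="F") for k in range(K - M + 1)]
    return np.array(cols).T

def st_mmse_weights(Xs, d):
    """Solve R w = r with sample estimates R = E[x x^H] and r = E[x d*]."""
    K = Xs.shape[1]
    R = Xs @ Xs.conj().T / K
    r = Xs @ d.conj() / K
    return np.linalg.solve(R, r)
```

The filter output for stacked data `Xs` is `w.conj() @ Xs`. With two antennas, one interferer, and no noise, the weights null the interferer exactly, which uses up the single spare spatial degree of freedom and leaves none for diversity; this is the conflict described above.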

19.5 ST Transmit Processing (Base)

The goal in ST transmit processing is to maximize the average signal power and diversity at the receiver as well as to minimize cochannel interference generated toward other mobiles. Note that the base station transmission cannot directly affect the CCI seen by its intended mobile. Transmit space-time processing needs channel knowledge, but it is carried out prior to transmission and, therefore, before the signal encounters the channel. This is different from the reverse link, where the space-time processing is carried out after the channel has affected the signal. Note that the mobile receiver will, of course, need to know the channel for signal demodulation, but since it sees the signal after transmission through the channel, it can estimate the forward link channel using training signals transmitted from the individual transmitter antennas.

19.5.1 Transmit Channel Estimation (Base)

The estimation at the base of the vector forward channel can be done via feedback or by use of reciprocity principles. In a TDD system, if the duplexing time is small compared to the coherence time of the channel, both channels are the same and the base station can use its estimate of the reverse channel as the forward channel; i.e., $\mathbf{H}^F = \mathbf{H}^R$, where $\mathbf{H}^R$ is the reverse channel (we have added superscript R to emphasize the receive channel). In FDD systems, the forward and reverse channels can potentially be very different. This arises from differences in instantaneous complex path gains, $\alpha^R \neq \alpha^F$. The other channel components $\mathbf{a}(\theta_l)$ and $\mathbf{g}(\tau_l)$ are very nearly equal. A direct approach to estimating the forward channel is to feed back the signal from the mobile unit and then estimate the channel. We can do this by transmitting orthogonal training signals through each base station antenna. We can feed back from the mobile to the base the received signal for each transmitted signal and thus estimate the channel.

19.5.2 ST Transmit Processing

The primary goals at the transmitter are to maximize diversity in the link and to reduce CCI generation to other mobiles. The diversity maximization depends on the inherent diversity at the antenna array and cannot be created at the transmitter. The role of ST processing is limited to maximizing the exploitability of this diversity at the receiver. This usually leads to use of orthogonal or near orthogonal signalling at each antenna: $\int u_1(t)\,u_2(t)\,dt \approx 0$. Orthogonality ensures that the transmitted signals are separable at the mobile, which can now combine these signals after appropriate weighting to attain maximum diversity. In order to minimize CCI, our goal is to use the beamforming vector $\mathbf{w}$ to steer the radiated energy and therefore minimize the interference at the other mobiles while maximizing the signal level at one’s own mobile. Note that the CCI at the reference mobile is not controlled by its own base station but is generated by other base stations. Reducing CCI at one’s own mobile requires the cooperation of the other base stations. Therefore we choose $\mathbf{w}$ such that

$$\max_{\mathbf{w}} \ \frac{E\left(\mathbf{w}^H \mathbf{H}^F \mathbf{s}(k)\,\mathbf{s}(k)^H \mathbf{H}^{FH} \mathbf{w}\right)}{\sum_{q=1}^{Q-1} \mathbf{w}^H \mathbf{H}_q^F \mathbf{H}_q^{FH} \mathbf{w}} \qquad (19.23)$$

where $Q-1$ is the number of susceptible outer cell mobiles. $\mathbf{H}_q^F$ is the channel from the base station to the $q$th outer cell mobile. In order to solve the above equation, we need to know the forward link channel $\mathbf{H}^F$ to the reference mobile and $\mathbf{H}_q^F$ to cochannel mobiles. In general, such complete channel knowledge may not be available and suboptimum receivers must be designed. Furthermore, we need to find a receiver that harmonizes maximization of diversity and reduction of CCI. Use of transmit ST processing affects $\mathbf{H}^F$ and thus can be incorporated.
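When all the channels in Eq. (19.23) are known, the ratio is a generalized Rayleigh quotient, and one standard way to maximize it is to take the principal generalized eigenvector of the numerator and denominator matrices. The sketch below does this with a Cholesky whitening step. It assumes uncorrelated unit-power symbols (so the expectation reduces to $\mathbf{H}^F \mathbf{H}^{FH}$) and adds a small diagonal loading term to keep the denominator matrix invertible. Both are assumptions of this example, as are the names `transmit_beamformer` and `signal_to_interference`.

```python
import numpy as np

def transmit_beamformer(H_sig, H_int_list, loading=1e-6):
    """Return unit-norm w maximizing (w^H A w) / (w^H B w).

    A = H_sig H_sig^H (assumes E[s s^H] = I), B = sum_q H_q H_q^H + loading*I.
    """
    m = H_sig.shape[0]
    A = H_sig @ H_sig.conj().T
    B = sum(Hq @ Hq.conj().T for Hq in H_int_list) + loading * np.eye(m)
    L = np.linalg.cholesky(B)                # B = L L^H
    Linv = np.linalg.inv(L)
    C = Linv @ A @ Linv.conj().T             # whitened, still Hermitian
    _, vecs = np.linalg.eigh(C)              # eigenvalues in ascending order
    w = Linv.conj().T @ vecs[:, -1]          # map principal eigenvector back
    return w / np.linalg.norm(w)

def signal_to_interference(w, H_sig, H_int_list):
    """Evaluate the ratio of Eq. (19.23) for a given weight vector."""
    num = np.linalg.norm(w.conj() @ H_sig) ** 2
    den = sum(np.linalg.norm(w.conj() @ Hq) ** 2 for Hq in H_int_list)
    return num / den
```

The diagonal loading is a pragmatic choice: with fewer interfering channels than antennas the denominator matrix is singular, and the unloaded ratio is unbounded in the interference null space.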

19.5.3 Forward Link Processing at the Mobile

The mobile will receive the composite signal from all the base station transmit antennas and will need to demodulate the signal to estimate the symbol sequence. In doing so it usually needs to estimate the individual channels from each base station antenna to itself. This is usually done via the use of training signals on each transmit antenna. Note that as the number of transmit antennas increases, there is a greater burden of training requirements. The use of transmit ST processing reduces the CCI power observed by the mobile as well as enhances the diversity available.

19.6 Summary

Use of space-time processing can significantly improve average signal power, mitigate fading, and reduce cochannel and intersymbol interference in wireless networks. This can in turn result in significantly improved capacity, coverage, and quality of wireless networks. In this chapter we have discussed applications of ST processing to TDMA systems. The applications to CDMA systems follow similar principles, but differences arise due to the nature of the signal and interference models.

Defining Terms

ISI: Intersymbol interference is caused by multipath propagation where one symbol interferes with other symbols.

CCI: Cochannel interference arises from neighboring cells where the frequency channel is reused.

Maximum Likelihood Sequence Estimation: A technique for channel equalization based on determining the best symbol sequence that matches the received signal.

References

[1] Lindskog, E. and Paulraj, A., A taxonomy of space-time signal processing, IEE Trans. Radar and Sonar, 25–31, Feb. 1998.
[2] Ng, B.C. and Paulraj, A., Space-time processing for PCS, IEEE PCS Magazine, 5(1), 36–48, Feb. 1998.
[3] Paulraj, A. and Papadias, C.B., Space-time processing for wireless communications, IEEE Signal Processing Magazine, 14(5), 49–83, Nov. 1997.
[4] Paulraj, A., Papadias, C., Reddy, V.U., and Van der Veen, A., A Review of Space-Time Signal Processing for Wireless Communications, in Signal Processing for Wireless Communications, V. Poor, Ed., Prentice Hall, 179–210, Dec. 1997.


Jain, R.; Lin, Y. & Mohan, S. “Location Strategies for Personal Communications Services” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Location Strategies for Personal Communications Services

20.1 Introduction

20.2 An Overview of PCS

Aspects of Mobility—Example 20.1 • A Model for PCS

20.3 IS-41 Preliminaries

Terminal/Location Registration • Call Delivery

20.4 Global System for Mobile Communications

Architecture • User Location Strategy

20.5 Analysis of Database Traffic Rate for IS-41 and GSM

The Mobility Model for PCS Users • Additional Assumptions • Analysis of IS-41 • Analysis of GSM

20.6 Reducing Signalling During Call Delivery

20.7 Per-User Location Caching

20.8 Caching Threshold Analysis

20.9 Techniques for Estimating Users’ LCMR

The Running Average Algorithm • The Reset-K Algorithm • Comparison of the LCMR Estimation Algorithms

20.10 Discussion

Conditions When Caching Is Beneficial • Alternative Network Architectures • LCMR Estimation and Caching Policy

20.11 Conclusions

Acknowledgment

References

Ravi Jain, Bell Communications Research

Yi-Bing Lin, Bell Communications Research

Seshadri Mohan¹, Bell Communications Research

¹ Address correspondence to: Seshadri Mohan, MCC-1A216B, Bellcore, 445 South St, Morristown, NJ 07960; Phone: 973-829-5160, Fax: 973-829-5888, e-mail: [email protected]

© 1996 by Bell Communications Research, Inc. Used with permission. The material in this chapter appeared originally in the following IEEE publications: S. Mohan and R. Jain. 1994. Two user location strategies for personal communications services, IEEE Personal Communications: The Magazine of Nomadic Communications and Computing, pp. 42-50, Feb., and R. Jain, C.N. Lo, and S. Mohan. 1994. A caching strategy to reduce network impacts of PCS, J-SAC Special Issue on Wireless and Mobile Networks, Aug.

20.1 Introduction

The vision of nomadic personal communications is the ubiquitous availability of services to facilitate exchange of information (voice, data, video, image, etc.) between nomadic end users independent of time, location, or access arrangements. To realize this vision, it is necessary to locate users that move from place to place. The strategies commonly proposed are two-level hierarchical strategies, which maintain a system of mobility databases, home location registers (HLR) and visitor location registers (VLR), to keep track of user locations. Two standards exist for carrying out two-level hierarchical strategies using HLRs and VLRs. The standard commonly used in North America is the EIA/TIA Interim Standard 41 (IS 41) [6] and in Europe the Global System for Mobile Communications (GSM) [15, 18]. In this chapter, we refer to these two strategies as basic location strategies. We introduce these two strategies for locating users and provide a tutorial on their usage. We then analyze and compare these basic location strategies with respect to load on mobility databases and signalling network. Next we propose an auxiliary strategy, called the per-user caching or, simply, the caching strategy, that augments the basic location strategies to reduce the signalling and database loads.

The outline of this chapter is as follows. In Section 20.2 we discuss different forms of mobility in the context of personal communications services (PCS) and describe a reference model for a PCS architecture. In Sections 20.3 and 20.4, we describe the user location strategies specified in the IS-41 and GSM standards, respectively, and in Section 20.5, using a simple example, we present a simplified analysis of the database loads generated by each strategy. In Section 20.6, we briefly discuss possible modifications to these protocols that are likely to result in significant benefits by either reducing query and update rate to databases or reducing the signalling traffic or both.
Section 20.7 introduces the caching strategy followed by an analysis in the next two sections. This idea attempts to exploit the spatial and temporal locality in calls received by users, similar to the idea of exploiting locality of file access in computer systems [20]. A feature of the caching location strategy is that it is useful only for certain classes of PCS users, those meeting certain call and mobility criteria. We encapsulate this notion in the definition of the user’s call-to-mobility ratio (CMR), and local CMR (LCMR), in Section 20.8. We then use this definition and our PCS network reference architecture to quantify the costs and benefits of caching and the threshold LCMR for which caching is beneficial, thus characterizing the classes of users for which caching should be applied. In Section 20.9 we describe two methods for estimating users’ LCMR and compare their effectiveness when call and mobility patterns are fairly stable, as well as when they may be variable. In Section 20.10, we briefly discuss alternative architectures and implementation issues of the strategy proposed and mention other auxiliary strategies that can be designed. Section 20.11 provides some conclusions and discussion of future work. The choice of platforms on which to realize the two location strategies (IS-41 and GSM) may vary from one service provider to another. In this paper, we describe a possible realization of these protocols based on the advanced intelligent network (AIN) architecture (see [2, 5]), and signalling system 7 (SS7). It is also worthwhile to point out that several strategies have been proposed in the literature for locating users, many of which attempt to reduce the signalling traffic and database loads imposed by the need to locate users in PCS.

20.2 An Overview of PCS

This section explains different aspects of mobility in PCS using an example of two nomadic users who wish to communicate with each other. It also describes a reference model for PCS.

20.2.1

Aspects of Mobility—Example 20.1

PCS can involve two possible types of mobility, terminal mobility and personal mobility, that are explained next. Terminal Mobility: This type of mobility allows a terminal to be identified by a unique terminal identifier independent of the point of attachment to the network. Calls intended for that terminal can therefore be delivered to that terminal regardless of its network point of attachment. To facilitate terminal mobility, a network must provide several functions, which include those that locate, identify, and validate a terminal and provide services (e.g., deliver calls) to the terminal based on the location information. This implies that the network must store and maintain the location information of the terminal based on a unique identifier assigned to that terminal. An example of a terminal identifier is the IS-41 EIA/TIA cellular industry term mobile identification number (MIN), which is a North American Numbering Plan (NANP) number that is stored in the terminal at the time of manufacture and cannot be changed. A similar notion exists in GSM (see Section 20.4). Personal Mobility: This type of mobility allows a PCS user to make and receive calls independent of both the network point of attachment and a specific PCS terminal. This implies that the services that a user has subscribed to (stored in that user’s service profile) are available to the user even if the user moves or changes terminal equipment. Functions needed to provide personal mobility include those that identify (authenticate) the end user and provide services to an end user independent of both the terminal and the location of the user. An example of a functionality needed to provide personal mobility for voice calls is the need to maintain a user’s location information based on a unique number, called the universal personal telecommunications (UPT) number, assigned to that user. UPT numbers are also NANP numbers. 
Another example is one that allows end users to define and manage their service profiles to enable users to tailor services to suit their needs. In Section 20.4, we describe how GSM caters to personal mobility via smart cards. For the purposes of the example that follows, the terminal identifiers (TID) and UPT numbers are NANP numbers, the distinction being TIDs address terminal mobility and UPT numbers address personal mobility. Though we have assigned two different numbers to address personal and terminal mobility concerns, the same effect could be achieved by a single identifier assigned to the terminal that varies depending on the user that is currently utilizing the terminal. For simplicity we assume that two different numbers are assigned. Figure 20.1 illustrates the terminal and personal mobility aspects of PCS, which will be explained via an example. Let us assume that users Kate and Al have, respectively, subscribed to PCS services from PCS service provider (PSP) A and PSP B. Kate receives the UPT number, say, 500 111 4711, from PSP A. She also owns a PCS terminal with TID 200 777 9760. Al too receives his UPT number 500 222 4712 from PSP B, and he owns a PCS terminal with TID 200 888 5760. Each has been provided a personal identification number (PIN) by their respective PSP when subscription began. We assume that the two PSPs have subscribed to PCS access services from a certain network provider such as, for example, a local exchange carrier (LEC). (Depending on the capabilities of the PSPs, the access services provided may vary. Examples of access services include translation of UPT number to a routing number, terminal and personal registration, and call delivery. Refer to Bellcore, [3], for further details). When Kate plugs in her terminal to the network, or when she activates it, the terminal registers itself with the network by providing its TID to the network. 
The network creates an entry for the terminal in an appropriate database, which, in this example, is entered in the terminal mobility database (TMDB) A. The entry provides a mapping of her terminal’s TID, 200 777 9760, to a routing number (RN), RN1. All of these activities happen without Kate being aware of them. After activating her terminal, Kate registers herself at that terminal by entering her UPT number (500 111 4711) to inform the network that all calls to her UPT number are to be delivered to her at the terminal. For security reasons, the network may want to authenticate her and she may be prompted to enter her PIN into her terminal. (Alternatively, if the terminal is equipped with a smart card reader, she may insert her smart card into the reader. Other techniques, such as voice recognition, may be employed.) Assuming that she is authenticated, Kate has now registered herself. As a result of personal registration by Kate, the network creates an entry for her in the personal mobility database (PMDB) A that maps her UPT number to the TID of the terminal at which she registered. Similarly, when Al activates his terminal and registers himself, appropriate entries are created in TMDB B and PMDB B. Now Al wishes to call Kate and, hence, he dials Kate’s UPT number (500 111 4711). The network carries out the following tasks.

1. The switch analyzes the dialed digits, recognizes the need for AIN service, determines that the dialed UPT number needs to be translated to an RN, and therefore queries PMDB A.
2. PMDB A searches its database and determines that the person with UPT number 500 111 4711 is currently registered at the terminal with TID 200 777 9760.
3. PMDB A then queries TMDB A for the RN of the terminal with TID 200 777 9760. TMDB A returns the RN (RN1).
4. PMDB A returns the RN (RN1) to the originating switch.
5. The originating switch directs the call to the switch RN1, which then alerts Kate’s terminal. The call is completed when Kate picks up her terminal.

Kate may take her terminal wherever she goes and perform registration at her new location. From then on, the network will deliver all calls for her UPT number to her terminal at the new location. In fact, she may actually register on someone else’s terminal too. For example, suppose that Kate and Al agree to meet at Al’s place to discuss a school project they are working on together. Kate may register herself on Al’s terminal (TID 200 888 9534). The network will now modify the entry corresponding to 4711 in PMDB A to point to B 9534. Subsequent calls to Kate will be delivered to Al’s terminal. The scenario given here is used only to illustrate the key aspects of terminal and personal mobility; an actual deployment of these services may be implemented in ways different from those suggested here. We will not discuss personal registration further. The analyses that follow consider only terminal mobility but may easily be modified to include personal mobility.
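The two-stage lookup in this example (UPT number to TID in the PMDB, then TID to routing number in the TMDB) can be sketched with plain dictionaries. This is a toy model for illustration only. The class name `PcsNetwork`, the PIN values, and the routing number `RN2` for Al's terminal are hypothetical, and the home provider is passed in explicitly instead of being derived from the dialed digits. Al's terminal uses the TID quoted in the re-registration step.

```python
class PcsNetwork:
    """Toy model of the personal and terminal mobility databases."""

    def __init__(self, pins):
        self.tmdb = {"A": {}, "B": {}}   # provider -> {TID: routing number}
        self.pmdb = {"A": {}, "B": {}}   # provider -> {UPT: (provider, TID)}
        self.pins = dict(pins)           # UPT -> PIN (hypothetical values)

    def register_terminal(self, provider, tid, rn):
        # Terminal registration: map the TID to a routing number.
        self.tmdb[provider][tid] = rn

    def register_person(self, home, upt, pin, term_provider, tid):
        # Personal registration: authenticate, then map the UPT to a TID.
        if self.pins.get(upt) != pin:
            raise PermissionError("authentication failed")
        if tid not in self.tmdb[term_provider]:
            raise KeyError("terminal is not registered")
        self.pmdb[home][upt] = (term_provider, tid)

    def route(self, home, upt):
        # PMDB lookup (UPT -> TID) followed by TMDB lookup (TID -> RN).
        term_provider, tid = self.pmdb[home][upt]
        return self.tmdb[term_provider][tid]


net = PcsNetwork({"500 111 4711": "1234", "500 222 4712": "5678"})
net.register_terminal("A", "200 777 9760", "RN1")
net.register_terminal("B", "200 888 9534", "RN2")
net.register_person("A", "500 111 4711", "1234", "A", "200 777 9760")
```

After these calls, `net.route("A", "500 111 4711")` resolves to Kate's own terminal. Registering her on Al's terminal only rewrites the PMDB entry, and later calls resolve to the routing number of Al's terminal without any change to either TMDB.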

20.2.2 A Model for PCS

Figure 20.2 illustrates the reference model used for the comparative analysis. The model assumes that the HLR resides in a service control point (SCP) connected to a regional signal transfer point (RSTP). The SCP is a storehouse of the AIN service logic, i.e., functionality used to perform the processing required to provide advanced services, such as speed calling, outgoing call screening, etc., in the AIN architecture (see Bellcore, [2] and Berman and Brewster, [5]). The RSTP and the local STP (LSTP) are packet switches, connected together by various links such as A links or D links, that perform the signalling functions of the SS7 network. Such functions include, for example, global title translation for routing messages between the AIN switching system, which is also referred to as the service switching point (SSP), and SCP and IS-41 messages [6]. Several SSPs may be connected to an LSTP. The reference model in Fig. 20.2 introduces several terms which are explained next. We have tried to keep the terms and discussions fairly general. Wherever possible, however, we point to equivalent cellular terms from IS-41 or GSM.

FIGURE 20.1: Illustrating terminal and personal mobility.

For our purposes, the geographical area served by a PCS system is partitioned into a number of radio port coverage areas (or cells, in cellular terms) each of which is served by a radio port (or, equivalently, base station) that communicates with PCS terminals in that cell. A registration area (also known in the cellular world as location area) is composed of a number of cells. The base stations of all cells in a registration area are connected by wireline links to a mobile switching center (MSC). We assume that each registration area is served by a single VLR. The MSC of a registration area is responsible for maintaining and accessing the VLR and for switching between radio ports. The VLR associated with a registration area is responsible for maintaining a subset of the user information contained in the HLR. The terminal registration process is initiated by terminals whenever they move into a new registration area. The base stations of a registration area periodically broadcast an identifier associated with that area. The terminals periodically compare an identifier they have stored with the identifier of the registration area being broadcast. If the two identifiers differ, the terminal recognizes that it has moved from one registration area to another and will, therefore, generate a registration message. It also replaces the previous registration area identifier with that of the new one. Movement of a terminal within the same registration area will not generate registration messages. Registration messages may also be generated when the terminals are switched on. Similarly, messages are generated to deregister them when they are switched off.

FIGURE 20.2: Example of a reference model for a PCS.

PCS services may be provided by different types of commercial service vendors. Bellcore, [3] describes three different types of PSPs and the different access services that a public network may provide to them. For example, a PSP may have full network capabilities with its own switching, radio management, and radio port capabilities. Certain others may not have switching capabilities, and others may have only radio port capabilities. The model in Fig. 20.2 assumes full PSP capabilities. The analysis in Section 20.5 is based on this model and modifications may be necessary for other types of PSPs. It is also quite possible that one or more registration areas may be served by a single PSP. The PSP may have one or more HLRs for serving its service area. In such a situation users that move within the PSP’s serving area may generate traffic to the PSP’s HLR (not shown in Fig. 20.2) but not to the network’s HLR (shown in Fig. 20.2). In the interest of keeping the discussions simple, we have assumed that there is one-to-one correspondence between SSPs and MSCs and also between MSCs, registration areas, and VLRs. One impact of locating the SSP, MSC, and VLR in separate physical sites connected by SS7 signalling links would be to increase the required signalling message volume on the SS7 network. Our model assumes that the messages between the SSP and the associated MSC and VLR do not add to signalling load on the public network. Other configurations and assumptions could be studied for which the analysis may need to be suitably modified. The underlying analysis techniques will not, however, differ significantly.

20.3 IS-41 Preliminaries

We now describe the message flow for call origination, call delivery, and terminal registration, sometimes called location registration, based on the IS-41 protocol. This protocol is described in detail in EIA/TIA, [6]. Only an outline is provided here.

20.3.1 Terminal/Location Registration

During IS-41 registration, signalling is performed between the following pairs of network elements:

• New serving MSC and the associated database (or VLR)
• New database (VLR) in the visited area and the HLR in the public network
• HLR and the VLR in former visited registration area or the old MSC serving area.

Figure 20.3 shows the signalling message flow diagram for IS-41 registration activity, focusing only on the essential elements of the message flow relating to registration; for details of variations from the basic registration procedure, see Bellcore, [3].

FIGURE 20.3: Signalling flow diagram for registration in IS-41.

The following steps describe the activities that take place during registration.

1. Once a terminal enters a new registration area, the terminal sends a registration request to the MSC of that area.
2. The MSC sends an authentication request (AUTHRQST) message to its VLR to authenticate the terminal, which in turn sends the request to the HLR. The HLR sends its response in the authrqst message.
3. Assuming the terminal is authenticated, the MSC sends a registration notification (REGNOT) message to its VLR.
4. The VLR in turn sends a REGNOT message to the HLR serving the terminal. The HLR updates the location entry corresponding to the terminal to point to the new serving MSC/VLR. The HLR sends a response back to the VLR, which may contain relevant parts of the user’s service profile. The VLR stores the service profile in its database and also responds to the serving MSC.
5. If the user/terminal was registered previously in a different registration area, the HLR sends a registration cancellation (REGCANC) message to the previously visited VLR. On receiving this message, the VLR erases all entries for the terminal from the record and sends a REGCANC message to the previously visited MSC, which then erases all entries for the terminal from its memory.

The protocol shows authentication request and registration notification as separate messages. If the two messages can be packaged into one message, then the rate of queries to HLR may be cut in half. This does not necessarily mean that the total number of messages is cut in half.
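The registration steps above can be traced with a small simulation. This sketch is not an IS-41 implementation: the class names `Hlr` and `Vlr`, the message log format, and the profile contents are assumptions made for illustration, and the MSC is folded into its VLR for brevity. The point is to show the HLR update, the cancellation at the old VLR, and the two HLR queries per registration.

```python
class Hlr:
    def __init__(self, profiles):
        self.profiles = dict(profiles)   # MIN -> service profile
        self.serving_vlr = {}            # MIN -> VLR currently serving it
        self.queries = 0                 # number of queries reaching the HLR

    def authrqst(self, min_):
        self.queries += 1
        return min_ in self.profiles

    def regnot(self, min_, new_vlr, log):
        self.queries += 1
        old_vlr = self.serving_vlr.get(min_)
        self.serving_vlr[min_] = new_vlr          # point to new MSC/VLR
        if old_vlr is not None and old_vlr is not new_vlr:
            log.append(("HLR", old_vlr.name, "REGCANC"))
            old_vlr.regcanc(min_)
        return self.profiles[min_]                # profile sent to the VLR


class Vlr:
    def __init__(self, name, hlr):
        self.name = name
        self.hlr = hlr
        self.records = {}                # MIN -> cached service profile

    def register(self, min_, log):
        log.append((self.name, "HLR", "AUTHRQST"))
        if not self.hlr.authrqst(min_):
            return False
        log.append((self.name, "HLR", "REGNOT"))
        self.records[min_] = self.hlr.regnot(min_, self, log)
        return True

    def regcanc(self, min_):
        self.records.pop(min_, None)     # erase entries for the terminal
```

Each successful registration costs two HLR queries in this model (AUTHRQST and REGNOT), which is the count that would be halved if the two messages were packaged into one.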

20.3.2 Call Delivery

The signalling message flow diagram for IS-41 call delivery is shown in Fig. 20.4. The following steps describe the activities that take place during call delivery.

1. A call origination is detected and the number of the called terminal (for example, MIN) is received by the serving MSC. Observe that the call could have originated from within the public network from a wireline phone or from a wireless terminal in an MSC/VLR serving area. (If the call originated within the public network, the AIN SSP analyzes the dialed digits and sends a query to the SCP.)
2. The MSC determines the associated HLR serving the called terminal and sends a location request (LOCREQ) message to the HLR.
3. The HLR determines the serving VLR for that called terminal and sends a routing address request (ROUTEREQ) to the VLR, which forwards it to the MSC currently serving the terminal.
4. Assuming that the terminal is idle, the serving MSC allocates a temporary identifier, called a temporary local directory number (TLDN), to the terminal and returns a response to the HLR containing this information. The HLR forwards this information to the originating SSP/MSC in response to its LOCREQ message.
5. The originating SSP requests call setup to the serving MSC of the called terminal via the SS7 signalling network using the usual call setup protocols.

Similar to the considerations for reducing signalling traffic for location registration, the VLR and HLR functions could be united in a single logical database for a given serving area and collocated; further, the database and switch can be integrated into the same piece of physical equipment or be collocated. In this manner, a significant portion of the messages exchanged between the switch, HLR and VLR as shown in Fig. 20.4 will not contribute to signalling traffic.
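The call delivery steps can be traced with a short sketch. It assumes a single HLR, one VLR per serving MSC, and a per-call TLDN drawn from a simple counter; the class names, the log strings, and the deliver_call helper are hypothetical and only mirror the message names used in the text.

```python
class ServingMSC:
    """MSC currently serving the called terminal."""

    def __init__(self):
        self.busy = set()       # terminals that cannot accept a call
        self.tldn_of = {}       # TLDN -> terminal, valid for one call
        self._count = 0

    def allocate_tldn(self, terminal):
        # Step 4: allocate a temporary local directory number if idle.
        if terminal in self.busy:
            return None
        self._count += 1
        tldn = f"TLDN-{self._count}"
        self.tldn_of[tldn] = terminal
        return tldn


class VLR:
    def __init__(self, msc):
        self.msc = msc

    def routereq(self, terminal, log):
        # Step 3 (second half): the VLR forwards the request to its MSC.
        log.append("ROUTEREQ: VLR -> serving MSC")
        return self.msc.allocate_tldn(terminal)


class HLR:
    def __init__(self):
        self.serving_vlr = {}   # terminal -> VLR where it is registered

    def locreq(self, terminal, log):
        # Step 3 (first half): the HLR asks the serving VLR for a routing address.
        log.append("ROUTEREQ: HLR -> VLR")
        return self.serving_vlr[terminal].routereq(terminal, log)


def deliver_call(called, hlr, log):
    """Steps 2 to 4: return the TLDN used for call setup, or None if busy."""
    log.append("LOCREQ: originating MSC -> HLR")
    tldn = hlr.locreq(called, log)
    log.append("locreq: HLR -> originating MSC")
    return tldn


msc = ServingMSC()
hlr = HLR()
hlr.serving_vlr["MIN-1"] = VLR(msc)
trace = []
tldn = deliver_call("MIN-1", hlr, trace)
```

The originating switch would then use the returned TLDN for ordinary SS7 call setup (step 5), which is outside the scope of this sketch.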

FIGURE 20.4: Signalling flow diagram for call delivery in IS-41.

20.4 Global System for Mobile Communications

In this section we describe the user location strategy proposed in the European Global System for Mobile Communications (GSM) standard and its offshoot, digital cellular system 1800 (DCS1800). There has recently been increased interest in GSM in North America, since it is possible that early deployment of PCS will be facilitated by using the communication equipment already available from European manufacturers who use the GSM standard. Since the GSM standard is relatively unfamiliar to North American readers, we first give some background and introduce the various abbreviations. The reader will find additional details in Mouly and Pautet, [18]. For an overview of GSM, refer to Lycksell, [15].

The abbreviation GSM originally stood for Groupe Spécial Mobile, a committee created within the pan-European standardization body Conférence Européenne des Postes et Télécommunications (CEPT) in 1982. There were numerous national cellular communication systems and standards in Europe at the time, and the aim of GSM was to specify a uniform standard around the newly reserved 900-MHz frequency band with a bandwidth of twice 25 MHz. The phase 1 specifications of this standard were frozen in 1990. Also in 1990, at the request of the United Kingdom, specification of a version of GSM adapted to the 1800-MHz frequency band, with a bandwidth of twice 75 MHz, was begun. This variant is referred to as DCS1800; the abbreviation GSM900 is sometimes used to distinguish between the two variations, with the abbreviation GSM being used to encompass both GSM900 and DCS1800. The motivation for DCS1800 is to provide higher capacities in densely populated urban areas, particularly for PCS. The DCS1800 specifications were frozen in 1991, and by 1992 all major European GSM900 operators had begun operation.

At the end of 1991, activities concerning the post-GSM generation of mobile communications were begun by the standardization committee, using the name universal mobile telecommunications system (UMTS) for this effort. In 1992, the name of the standardization committee was changed from GSM to special mobile group (SMG) to distinguish it from the 900-MHz system itself, and the term GSM was chosen as the commercial trademark of the European 900-MHz system, where GSM now stands for global system for mobile communications.
The GSM standard has now been widely adopted in Europe and is under consideration in several non-European countries, including the United Arab Emirates, Hong Kong, and New Zealand. In 1992, Australian operators officially adopted GSM.

FIGURE 20.5: Flow diagram for registration in GSM.

20.4.1 Architecture

In this section we describe the GSM architecture, focusing on those aspects that differ from the architecture assumed in the IS-41 standard. A major goal of the GSM standard was to enable users to move across national boundaries and still be able to communicate. It was considered desirable, however, that the operational network within each country be operated independently. Each of the operational networks is called a public land mobile network (PLMN), and its commercial coverage area is confined to the borders of one country (although some radio coverage overlap at national boundaries may occur); each country may have several competing PLMNs.

A GSM customer subscribes to a single PLMN called the home PLMN, and subscription information includes the services the customer subscribes to. During normal operation, a user may elect to choose other PLMNs as their service becomes available (either as the user moves or as new operators enter the marketplace). The user’s terminal [GSM calls the terminal a mobile station (MS)] assists the user in choosing a PLMN in this case, either presenting a list of possible PLMNs to the user using explicit names (e.g., DK Sonofon for the Danish PLMN) or choosing automatically based on a list of preferred PLMNs stored in the terminal’s memory. This PLMN selection process allows users to choose between the services and tariffs of several competing PLMNs. Note that the PLMN selection process differs from the cell selection and handoff process, which a terminal carries out automatically, without any possibility of user intervention, typically on the basis of received radio signal strengths; PLMN selection thus requires additional intelligence and functionality in the terminal.

The geographical area covered by a PLMN is partitioned into MSC serving areas, and a registration area is constrained to be a subset of a single MSC serving area. The PLMN operator has complete freedom to allocate cells to registration areas. Each PLMN has, logically speaking, a single HLR, although this may be implemented as several physically distributed databases, as for IS-41. Each MSC also has a VLR, and a VLR may serve one or several MSCs. As for IS-41, it is interesting to consider how the VLR should be viewed in this context. The VLR can be viewed either as simply a database offloading the query and signalling load on the HLR and, hence, logically tightly coupled to the HLR, or as an ancillary processor to the MSC. This distinction is not academic; in the first view, it would be natural to implement a VLR as serving several MSCs, whereas in the second each VLR would serve one MSC and be physically closely coupled to it. For GSM, the MSC implements most of the signalling protocols, and at present all switch manufacturers implement a combined MSC and VLR, with one VLR per MSC [18].

A GSM mobile station is split into two parts, one containing the hardware and software for the radio interface and the other containing subscriber-specific and location information. The latter, called the subscriber identity module (SIM), can be removed from the terminal and is the size of a credit card or smaller. The SIM is assigned a unique identity within the GSM system, called the international mobile subscriber identity (IMSI), which is used by the user location strategy as described in the next subsection. The SIM also stores authentication information, services lists, PLMN selection lists, etc., and can itself be protected by password or PIN.

The SIM can be used to implement a form of large-scale mobility called SIM roaming. The GSM specifications standardize the interface between the SIM and the terminal, so that a user carrying his or her SIM can move between different terminals and use the SIM to personalize the terminal. This capability is particularly useful for users who move between PLMNs that have different radio interfaces. The user can use the appropriate terminal for each PLMN coverage area while obtaining the personalized facilities specified in his or her SIM. Thus, SIMs address personal mobility. In the European context, the usage of two closely related standards at different frequencies, namely, GSM900 and DCS1800, makes this capability an especially important one and facilitates interworking between the two systems.

20.4.2 User Location Strategy

We present a synopsis of the user location strategy in GSM using call flow diagrams similar to those used to describe the strategy in IS-41. In order to describe the registration procedure, it is first useful to clarify the different identifiers used in this procedure. The SIM of the terminal is assigned a unique identity, called the IMSI, as already mentioned. To increase confidentiality and make more efficient use of the radio bandwidth, however, the IMSI is not normally transmitted over the radio link. Instead, the terminal is assigned a temporary mobile subscriber identity (TMSI) by the VLR when it enters a new registration area. The TMSI is valid only within a given registration area and is shorter than the IMSI. The IMSI and TMSI are identifiers that are internal to the system and assigned to a terminal or SIM and should not be confused with the user’s number that would be dialed by a calling party; the latter is a separate number called the mobile subscriber integrated services digital network (ISDN) number (MSISDN), and is similar to the usual telephone number in a fixed network.

We now describe the procedure during registration. The terminal can detect when it has moved into the cell of a new registration area from the system information broadcast by the base station in the new cell. The terminal initiates a registration update request to the new base station; this request includes the identity of the old registration area and the TMSI of the terminal in the old area. The request is forwarded to the MSC, which, in turn, forwards it to the new VLR. Since the new VLR cannot translate the TMSI to the IMSI of the terminal, it sends a request to the old VLR to send the IMSI of the terminal corresponding to that TMSI. In its response, the old VLR also provides the required authentication information. The new VLR then initiates procedures to authenticate the terminal. If the authentication succeeds, the VLR uses the IMSI to determine the address of the terminal’s HLR.

The ensuing protocol is then very similar to that in IS-41, except for the following differences. When the new VLR receives the registration affirmation (similar to REGNOT in IS-41) from the HLR, it assigns a new TMSI to the terminal for the new registration area. The HLR also provides the new VLR with all relevant subscriber profile information required for call handling (e.g., call screening lists, etc.) as part of the affirmation message. Thus, in contrast with IS-41, authentication and subscriber profile information are obtained from both the HLR and old VLR and not just the HLR.

The procedure for delivering calls to mobile users in GSM is very similar to that in IS-41. The sequence of messages between the caller and called party’s MSC/VLRs and the HLR is identical to that shown in the call flow diagrams for IS-41, although the names, contents and lengths of messages may be different and, hence, the details are left out. The interested reader is referred to Mouly and Pautet, [18], or Lycksell, [15], for further details.
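The distinguishing feature of GSM registration, namely that the new VLR resolves the old TMSI by asking the old VLR rather than the HLR, can be sketched as follows. This is an illustrative model only: authentication is omitted, TMSIs are formed from the registration area name and a counter, and all class and method names (attach, register, resolve, update_location, cancel) are assumptions rather than GSM message names.

```python
class HLR:
    def __init__(self, profiles):
        self.profiles = profiles    # IMSI -> subscriber profile
        self.serving_vlr = {}       # IMSI -> VLR currently serving it

    def update_location(self, imsi, vlr):
        old = self.serving_vlr.get(imsi)
        self.serving_vlr[imsi] = vlr
        if old is not None and old is not vlr:
            old.cancel(imsi)        # deregistration at the old VLR
        return self.profiles[imsi]  # profile sent with the affirmation


class VLR:
    def __init__(self, area, hlr):
        self.area = area
        self.hlr = hlr
        self.imsi_of = {}           # TMSI -> IMSI, valid in this area only
        self.profiles = {}
        self._count = 0

    def _new_tmsi(self, imsi):
        self._count += 1
        tmsi = f"{self.area}-{self._count}"
        self.imsi_of[tmsi] = imsi
        return tmsi

    def resolve(self, tmsi):
        # Old VLR answers the new VLR's request for the IMSI.
        return self.imsi_of[tmsi]

    def cancel(self, imsi):
        self.imsi_of = {t: i for t, i in self.imsi_of.items() if i != imsi}
        self.profiles.pop(imsi, None)

    def attach(self, imsi):
        # First registration: no old TMSI exists, so the IMSI is used.
        self.profiles[imsi] = self.hlr.update_location(imsi, self)
        return self._new_tmsi(imsi)

    def register(self, old_tmsi, old_vlr):
        # The new VLR cannot translate the old TMSI itself.
        imsi = old_vlr.resolve(old_tmsi)
        self.profiles[imsi] = self.hlr.update_location(imsi, self)
        return self._new_tmsi(imsi)


hlr = HLR({"imsi-1": {"call_screening": []}})
vlr1, vlr2 = VLR("RA1", hlr), VLR("RA2", hlr)
tmsi1 = vlr1.attach("imsi-1")
tmsi2 = vlr2.register(tmsi1, vlr1)
```

Note that in this sketch the HLR is contacted only once per move (the location update), whereas the IS-41 flow also queries it for authentication.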

20.5 Analysis of Database Traffic Rate for IS-41 and GSM

In the two subsections that follow, we state the common set of assumptions on which we base our comparison of the two strategies.

20.5.1 The Mobility Model for PCS Users

In the analysis of IS-41 and GSM that follows, we assume a simple mobility model for the PCS users. The model, which is described in [23], assumes that PCS users carrying terminals are moving at an average velocity of v and that their direction of movement is uniformly distributed over [0, 2π]. Assuming that the PCS users are uniformly distributed with a density of ρ and that the registration area boundary is of length L, it has been shown that the rate of registration area crossing R is given by

$$R = \frac{\rho v L}{\pi} \tag{20.1}$$

Using Eq. (20.1), we can calculate the signalling traffic due to registration, call origination, and delivery. We now need a set of assumptions so that we may proceed to derive the traffic rate to the databases using the model in Fig. 20.2.

20.5.2 Additional Assumptions

The following assumptions are made in performing the analysis.

• 128 total registration areas
• Square registration area size: (7.575 km)² = 57.4 km², with border length L = 30.3 km
• Average call origination rate = average call termination (delivery) rate = 1.4/h/terminal
• Mean density of mobile terminals = ρ = 390/km²
• Total number of mobile terminals = 128 × 57.4 × 390 = 2.87 × 10⁶
• Average speed of a mobile, v = 5.6 km/h
• Fluid flow mobility model

The assumption regarding the total number of terminals may also be obtained by assuming that a certain public network provider serves 19.15 × 10⁶ users and that 15% (or 2.87 × 10⁶) of the users also subscribe to PCS services from various PSPs. Note that we have adopted a simplified model that ignores situations where PCS users turn their handsets on and off, which generates additional registration and deregistration traffic. The model also ignores wireline registrations. These activities will increase the total number of queries and updates to the HLR and VLRs.
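As a quick numerical check of these assumptions, the short sketch below evaluates Eq. (20.1) with the listed parameter values. The function and variable names are illustrative choices, and the only inputs are the figures given in the assumption list.

```python
import math


def crossing_rate(density_per_km2, speed_km_h, boundary_km):
    """Eq. (20.1): registration-area crossings per hour, R = rho * v * L / pi."""
    return density_per_km2 * speed_km_h * boundary_km / math.pi


side_km = 7.575                     # side of a square registration area
area_km2 = side_km ** 2             # about 57.4 km^2
boundary_km = 4 * side_km           # border length L = 30.3 km
terminals = 128 * area_km2 * 390    # about 2.87e6 terminals in the network

# Crossings per second out of (and hence into) one registration area.
r_per_second = crossing_rate(390, 5.6, boundary_km) / 3600
```

Dividing by 3600 converts the hourly rate implied by v in km/h into the per-second rate used in the traffic analysis.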

20.5.3 Analysis of IS-41

Using Eq. (20.1) and the parameter values assumed in the preceding subsection, we can compute the traffic due to registration. The registration traffic is generated by mobile terminals moving into a new registration area, and this must equal the rate at which mobile terminals move out of the registration area, which per second is

$$R_{\mathrm{reg,\,VLR}} = \frac{390 \times 30.3 \times 5.6}{3600\pi} = 5.85$$

This must also be equal to the number of deregistrations (registration cancellations),

$$R_{\mathrm{dereg,\,VLR}} = 5.85$$

The total number of registration messages per second arriving at the HLR will be

$$R_{\mathrm{reg,\,HLR}} = R_{\mathrm{reg,\,VLR}} \times \text{total number of registration areas} = 749$$

The HLR should, therefore, be able to handle, roughly, 750 updates per second. We observe from Fig. 20.3 that authenticating terminals generates as many queries to the VLR and HLR as the respective number of updates generated by registration notification messages.

The number of queries that the HLR must handle during call origination and delivery can be similarly calculated. Queries to the HLR are generated when a call is made to a PCS user. The SSP that receives the request for a call generates a location request (LOCREQ) query to the SCP controlling the HLR. The rate per second of such queries must be equal to the rate of calls made to PCS users. This is calculated as

$$R_{\mathrm{CallDeliv,\,HLR}} = \text{call rate per user} \times \text{total number of users} = \frac{1.4 \times 2.87 \times 10^{6}}{3600} = 1116$$

For calls originated from a mobile terminal by PCS users, the switch authenticates the terminal by querying the VLR. The rate per second of such queries is determined by the rate of calls originating in an SSP serving area, which is also a registration area (RA). This is given by

$$R_{\mathrm{CallOrig,\,VLR}} = \frac{1116}{128} = 8.7$$

This is also the number of queries per second needed to authenticate terminals of PCS users to which calls are delivered:

$$R_{\mathrm{CallDeliv,\,VLR}} = 8.7$$

Table 20.1 summarizes the calculations.

TABLE 20.1 IS-41 Query and Update Rates to HLR and VLR

| Activity | HLR Updates/s | VLR Updates/s | HLR Queries/s | VLR Queries/s |
|---|---|---|---|---|
| Mobility-related activities at registration | 749 | 5.85 | 749 | 5.85 |
| Mobility-related activities at deregistration | - | 5.85 | - | - |
| Call origination | - | - | - | 8.7 |
| Call delivery | - | - | 1116 | 8.7 |
| Total (per RA) | 5.85 | 11.7 | 14.57 | 23.25 |
| Total (Network) | 749 | 1497.6 | 1865 | 2976 |

20.5.4 Analysis of GSM

Calculations for query and update rates for GSM may be performed in the same manner as for IS-41, and they are summarized in Table 20.2. The difference between this table and Table 20.1 is that in GSM the new serving VLR does not query the HLR separately in order to authenticate the terminal during registration and, hence, there are no HLR queries during registration. Instead, the entry (749 queries) under HLR queries in Table 20.1, corresponding to mobility-related authentication activity at registration, gets equally divided between the 128 VLRs. Observe that with either protocol the total database traffic rates are conserved, where the total database traffic for the entire network is given by the sum of all of the entries in the last row, Total (Network), i.e.,

HLR updates + VLR updates + HLR queries + VLR queries

From Tables 20.1 and 20.2 we see that this quantity equals 7087. The conclusion is independent of any variations we may make to the assumptions stated earlier in the section. For example, if the PCS penetration (the percentage of the total users subscribing to PCS services) were to increase from 15 to 30%, all of the entries in the two tables would double and, hence, the total database traffic generated by the two protocols would still be equal.
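The conservation argument can be checked numerically. The sketch below recomputes the network-wide rates for both protocols from the stated assumptions; the function name network_totals and the way the authentication queries are attributed to the HLR (IS-41) or to the VLRs (GSM) are a paraphrase of the tables, not code from the source. Because it carries unrounded intermediate values, its totals differ slightly from the rounded table entries.

```python
import math

RAS = 128                                       # registration areas (one VLR each)
r_reg = 390 * 30.3 * 5.6 / (3600 * math.pi)     # registrations/s per VLR
r_deliv_hlr = 1.4 * 2.87e6 / 3600               # call deliveries/s seen by the HLR
r_call_vlr = r_deliv_hlr / RAS                  # originations (= deliveries)/s per VLR


def network_totals(protocol):
    """Network-wide (HLR updates, VLR updates, HLR queries, VLR queries) per second."""
    hlr_updates = r_reg * RAS                   # registration notifications
    vlr_updates = 2 * r_reg * RAS               # registration + deregistration
    hlr_queries = r_deliv_hlr                   # location requests
    vlr_queries = (r_reg + 2 * r_call_vlr) * RAS
    auth = r_reg * RAS                          # authentication at registration
    if protocol == "IS-41":
        hlr_queries += auth                     # HLR is queried to authenticate
    elif protocol == "GSM":
        vlr_queries += auth                     # old VLR supplies authentication data
    else:
        raise ValueError(f"unknown protocol: {protocol}")
    return hlr_updates, vlr_updates, hlr_queries, vlr_queries


is41 = network_totals("IS-41")
gsm = network_totals("GSM")
total_is41, total_gsm = sum(is41), sum(gsm)
```

Only the placement of the authentication queries changes between the two calls, so the two sums agree regardless of the parameter values chosen.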

TABLE 20.2 GSM Query and Update Rates to HLR and VLR

| Activity | HLR Updates/s | VLR Updates/s | HLR Queries/s | VLR Queries/s |
|---|---|---|---|---|
| Mobility-related activities at registration | 749 | 5.85 | - | 11.7 |
| Mobility-related activities at deregistration | - | 5.85 | - | - |
| Call origination | - | - | - | 8.7 |
| Call delivery | - | - | 1116 | 8.7 |
| Total (per VLR) | 749 | 11.7 | 1116 | 29.1 |
| Total (Network) | 749 | 1497.6 | 1116 | 3724.8 |

20.6 Reducing Signalling During Call Delivery

In the preceding section, we provided a simplified analysis of some scenarios associated with user location strategies and the associated database queries and updates required. Previous studies [13, 16] indicate that the signalling traffic and database queries associated with PCS due to user mobility are likely to grow to levels well in excess of those associated with a conventional call. It is, therefore, desirable to study modifications to the two protocols that would result in reduced signalling and database traffic. We now provide some suggestions.

For both GSM and IS-41, delivery of calls to a mobile user involves four messages: from the caller’s VLR to the called party’s HLR, from the HLR to the called party’s VLR, from the called party’s VLR to the HLR, and from the HLR to the caller’s VLR. The last two of these messages involve the HLR, whose role is simply to relay the routing information provided by the called party’s VLR to the caller’s VLR. An obvious modification to the protocol would be to have the called VLR directly send the routing information to the calling VLR. This would reduce the total load on the HLR and on signalling network links substantially. Such a modification to the protocol may not be easy, of course, due to administrative, billing, legal, or security concerns. Besides, this would violate the query/response model adopted in IS-41, requiring further analysis.

A related question is whether the routing information obtained from the called party’s VLR could instead be stored in the HLR. This routing information could be provided to the HLR, for example, whenever a terminal registers in a new registration area. If this were possible, two of the four messages involved in call delivery could be eliminated. This point was discussed at length by the GSM standards body, and the present strategy was arrived at. The reason for this decision was to reduce the number of temporary routing numbers allocated by VLRs to terminals in their registration area. If a temporary routing number (TLDN in IS-41 or MSRN in GSM) is allocated to a terminal for the whole duration of its stay in a registration area, the quantity of numbers required is much greater than if a number is assigned on a per-call basis.

Other strategies may be employed to reduce signalling and database traffic via intelligent paging or by storing users’ mobility behavior in user profiles (see, for example, Tabbane, [22]). A discussion of these techniques is beyond the scope of the paper.

20.7 Per-User Location Caching

The basic idea behind per-user location caching is that the volume of SS7 message traffic and database accesses required in locating a called subscriber can be reduced by maintaining local storage, or cache, of user location information at a switch. At any switch, location caching for a given user should be employed only if a large number of calls originate for that user from that switch, relative to the user’s mobility. Note that the cached information is kept at the switch from which calls originate, which may or may not be the switch where the user is currently registered. Location caching involves the storage of location pointers at the originating switch; these point to the VLR (and the associated switch) where the user is currently registered.

We refer to the procedure of locating a PCS user as a FIND operation, borrowing the terminology from Awerbuch and Peleg, [1]. We define a basic FIND, or BasicFIND( ), as one where the following sequence of steps takes place.

1. The incoming call to a PCS user is directed to the nearest switch.
2. Assuming that the called party is not located within the immediate RA, the switch queries the HLR for routing information.
3. The HLR contains a pointer to the VLR in whose associated RA the subscriber is currently situated and launches a query to that VLR.
4. The VLR, in turn, queries the MSC to determine whether the user terminal is capable of receiving the call (i.e., is idle) and, if so, the MSC returns a routable address (TLDN in IS-41) to the VLR.
5. The VLR relays the routing address back to the originating switch via the HLR.

At this point, the originating switch can route the call to the destination switch. Alternately, BasicFIND( ) can be described by pseudocode as follows. (We observe that a more formal method of specifying PCS protocols may be desirable.)

BasicFIND( ){
    Call to PCS user is detected at local switch;
    if called party is in same RA then return;
    Switch queries called party’s HLR;
    Called party’s HLR queries called party’s current VLR, V;
    V returns called party’s location to HLR;
    HLR returns location to calling switch;
}

In the FIND procedure involving the use of location caching, or CacheFIND( ), each switch contains a local memory (cache) that stores location information for subscribers. When the switch receives a call origination (from either a wireline or wireless caller) directed to a PCS subscriber, it first checks its cache to see if location information for the called party is maintained. If so, a query is launched to the pointed VLR; if not, BasicFIND( ), as just described, is followed. If a cache entry exists and the pointed VLR is queried, two situations are possible. If the user is still registered at the RA of the pointed VLR (i.e., we have a cache hit), the pointed VLR returns the user’s routing address. Otherwise, the pointed VLR returns a cache miss.

CacheFIND( ){
    Call to PCS user is detected at local switch;
    if called party is in same RA then return;
    if there is no cache entry for called user
        then invoke BasicFIND( ) and return;
    Switch queries the VLR, V, specified in the cache entry;
    if called party is at V, then
        V returns called party’s location to calling switch;
    else {
        V returns “miss” to calling switch;
        Calling switch invokes BasicFIND( );
    }
}

When a cache hit occurs we save one query to the HLR [a VLR query is involved in both CacheFIND( ) and BasicFIND( )], and we also save traffic along some of the signalling links; instead of four message transmissions, as in BasicFIND( ), only two are needed. In steady-state operation, the cached pointer for any given user is updated only upon a miss.

Note that the BasicFIND( ) procedure differs from that specified for roaming subscribers in the IS-41 standard (EIA/TIA, [6]). In the IS-41 standard, the second line in the BasicFIND( ) procedure is omitted, i.e., every call results in a query of the called user’s HLR. Thus, in fact, the procedure specified in the standard will result in an even higher network load than the BasicFIND( ) procedure specified here. To make a fair assessment of the benefits of CacheFIND( ), however, we have compared it against BasicFIND( ). Thus, the benefits of CacheFIND( ) investigated here depend specifically on the use of caching and not simply on the availability of user location information at the local VLR.
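The two procedures can be made concrete with a small executable sketch that counts database queries. It is a toy model under stated assumptions: a single dictionary stands in for the HLR pointer and for the ground truth of where each user is registered, the "same RA" test is modelled as a free local lookup, and the class and method names (Network, Switch, basic_find, cache_find) are illustrative rather than part of either standard.

```python
class Network:
    """Toy signalling network that counts HLR and VLR queries."""

    def __init__(self):
        self.location = {}      # user -> VLR where the user is registered
        self.hlr_queries = 0
        self.vlr_queries = 0

    def move(self, user, vlr):
        self.location[user] = vlr

    def query_vlr(self, vlr, user):
        """Return a routing address, or None if the user is not registered at vlr."""
        self.vlr_queries += 1
        if self.location.get(user) == vlr:
            return f"route:{vlr}:{user}"
        return None

    def basic_find(self, user):
        """BasicFIND: one HLR query followed by one VLR query."""
        self.hlr_queries += 1
        vlr = self.location[user]
        return vlr, self.query_vlr(vlr, user)


class Switch:
    def __init__(self, network, local_vlr):
        self.network = network
        self.local_vlr = local_vlr
        self.cache = {}         # user -> VLR last known to serve the user

    def cache_find(self, user):
        """CacheFIND: try the cached VLR first, fall back to BasicFIND."""
        net = self.network
        if net.location.get(user) == self.local_vlr:
            return "local"                      # same RA: no signalling needed
        pointed = self.cache.get(user)
        if pointed is not None:
            address = net.query_vlr(pointed, user)
            if address is not None:
                return address                  # cache hit: HLR not queried
        vlr, address = net.basic_find(user)     # no entry, or cache miss
        self.cache[user] = vlr                  # pointer refreshed on a miss
        return address


net = Network()
net.move("u", "V1")
switch = Switch(net, "V0")
first = switch.cache_find("u")      # no cache entry: BasicFIND
second = switch.cache_find("u")     # cache hit: one VLR query only
net.move("u", "V2")
third = switch.cache_find("u")      # cache miss: extra VLR query, then BasicFIND
```

The query counters show the trade-off analysed in the next section: a hit avoids the HLR query, while a miss costs one additional VLR query on top of a full BasicFIND.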

20.8 Caching Threshold Analysis

In this section we investigate the classes of users for which the caching strategy yields net reductions in signalling traffic and database loads. We characterize classes of users by their call-to-mobility ratio (CMR). The CMR of a user is the average number of calls to a user per unit time, divided by the average number of times the user changes registration areas per unit time. We also define a local CMR (LCMR), which is the average number of calls to a user from a given originating switch per unit time, divided by the average number of times the user changes registration areas per unit time.

For each user, the amount of savings due to caching is a function of the probability that the cached pointer correctly points to the user’s location and increases with the user’s LCMR. In this section we quantify the minimum value of LCMR for caching to be worthwhile. This caching threshold is parameterized with respect to costs of traversing signalling network elements and network databases and can be used as a guide to select the subset of users to whom caching should be applied. The analysis in this section shows that estimating users’ LCMRs, preferably dynamically, is very important in order to apply the caching strategy. The next section will discuss methods for obtaining this estimate.

From the pseudocode for BasicFIND( ), the signalling network cost incurred in locating a PCS user in the event of an incoming call is the sum of the cost of querying the HLR (and receiving the response) and the cost of querying the VLR to which the HLR points (and receiving the response). Let

α = cost of querying the HLR and receiving a response
β = cost of querying the pointed VLR and receiving a response

Then, the cost of a BasicFIND( ) operation is

$$C_B = \alpha + \beta \tag{20.2}$$

To quantify this further, assume costs for traversing various network elements as follows.

A_l = cost of transmitting a location request or response message on an A link between SSP and LSTP
D = cost of transmitting a location request or response message on a D link
A_r = cost of transmitting a location request or response message on an A link between RSTP and SCP
L = cost of processing and routing a location request or response message by LSTP
R = cost of processing and routing a location request or response message by RSTP
H_Q = cost of a query to the HLR to obtain the current VLR location
V_Q = cost of a query to the VLR to obtain the routing address

Then, using the PCS reference network architecture (Fig. 20.2),

$$\alpha = 2(A_l + D + A_r + L + R) + H_Q \tag{20.3}$$

$$\beta = 2(A_l + D + A_r + L + R) + V_Q \tag{20.4}$$

From Eqs. (20.2)–(20.4)

$$C_B = 4(A_l + D + A_r + L + R) + H_Q + V_Q \tag{20.5}$$

We now calculate the cost of CacheFIND( ). We define the hit ratio as the relative frequency with which the cached pointer correctly points to the user’s location when it is consulted. Let

p = cache hit ratio
C_H = cost of the CacheFIND( ) procedure when there is a hit
C_M = cost of the CacheFIND( ) procedure when there is a miss

Then the cost of CacheFIND( ) is

$$C_C = p\,C_H + (1 - p)\,C_M \tag{20.6}$$

For CacheFIND( ), the signalling network costs incurred in locating a user in the event of an incoming call depend on the hit ratio as well as on the cost of querying the VLR that is stored in the cache; this VLR query may or may not involve traversing the RSTP. In the following, we say a VLR is a local VLR if it is served by the same LSTP as the originating switch, and a remote VLR otherwise. Let

q = Prob (VLR in originating switch’s cache is a local VLR)
δ = cost of querying a local VLR
ε = cost of querying a remote VLR
η = cost of updating the cache upon a miss

Then,

$$\delta = 4A_l + 2L + V_Q \tag{20.7}$$

$$\epsilon = 4(A_l + D + L) + 2R + V_Q \tag{20.8}$$

$$C_H = q\delta + (1 - q)\epsilon \tag{20.9}$$

Since updating the cache involves an operation to a fast local memory rather than a database operation, we shall assume in the following that η = 0. Then,

$$C_M = C_H + C_B = q\delta + (1 - q)\epsilon + \alpha + \beta \tag{20.10}$$

From Eqs. (20.6), (20.9) and (20.10) we have

$$C_C = \alpha + \beta + \epsilon - p(\alpha + \beta) + q(\delta - \epsilon) \tag{20.11}$$

For net cost savings we require C_C < C_B, or that the hit ratio exceeds a hit ratio threshold p_T, derived using Eqs. (20.6), (20.9), and (20.2),

$$p > p_T = \frac{C_H}{C_B} = \frac{\epsilon + q(\delta - \epsilon)}{\alpha + \beta} \tag{20.12}$$

$$p_T = \frac{4A_l + 4D + 4L + 2R + V_Q - q(4D + 2L + 2R)}{4A_l + 4D + 4A_r + 4L + 4R + H_Q + V_Q} \tag{20.13}$$
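The cost expressions above translate directly into a few functions. The sketch below is a hypothetical helper, not part of the source: the Costs container and the function names are assumptions, the cost unit is arbitrary, and η is taken as 0 as in the text.

```python
from dataclasses import dataclass


@dataclass
class Costs:
    """Per-message and per-query costs in one arbitrary common unit."""
    Al: float = 0.0     # A link between SSP and LSTP
    D: float = 0.0      # D link
    Ar: float = 0.0     # A link between RSTP and SCP
    L: float = 0.0      # LSTP processing
    R: float = 0.0      # RSTP processing
    HQ: float = 0.0     # HLR query
    VQ: float = 0.0     # VLR query


def basic_find_cost(c):
    """Eq. (20.5): C_B."""
    return 4 * (c.Al + c.D + c.Ar + c.L + c.R) + c.HQ + c.VQ


def hit_cost(c, q):
    """Eqs. (20.7)-(20.9): C_H for local-VLR probability q."""
    delta = 4 * c.Al + 2 * c.L + c.VQ
    epsilon = 4 * (c.Al + c.D + c.L) + 2 * c.R + c.VQ
    return q * delta + (1 - q) * epsilon


def cache_find_cost(c, q, p):
    """Eq. (20.6) with C_M = C_H + C_B (cache update cost eta = 0)."""
    c_h = hit_cost(c, q)
    return p * c_h + (1 - p) * (c_h + basic_find_cost(c))


def hit_ratio_threshold(c, q):
    """Eqs. (20.12)-(20.13): p_T = C_H / C_B."""
    return hit_cost(c, q) / basic_find_cost(c)


unit = Costs(Al=1, D=1, Ar=1, L=1, R=1, HQ=1, VQ=1)   # illustrative unit costs
p_t = hit_ratio_threshold(unit, q=0.25)
```

At p = p_T the cost of CacheFIND( ) equals that of BasicFIND( ), which gives a simple sanity check on the algebra; setting all costs but one to zero reproduces the single-dominant-term cases considered next.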

Equation (20.13) specifies the hit ratio threshold for a user, evaluated at a given switch, for which local maintenance of a cached location entry produces cost savings. As pointed out earlier, a given user’s hit ratio may be location dependent, since the rates of calls destined for that user may vary widely across switches.

The hit ratio threshold in Eq. (20.13) comprises heterogeneous cost terms, i.e., transmission link utilization, packet switch processing, and database access costs. Therefore, numerical evaluation of the hit ratio threshold requires either detailed knowledge of these individual quantities or some form of simplifying assumptions. Based on the latter approach, the following two possible methods of evaluation may be employed.

1. Assume one or more cost terms dominate, and simplify Eq. (20.13) by setting the remaining terms to zero.
2. Establish a common unit of measure for all cost terms, for example, time delay. In this case, A_l, A_r, and D may represent transmission delays of fixed transmission speed (e.g., 56 kb/s) signalling links, L and R may constitute the sum of queueing and service delays of packet switches (i.e., STPs), and H_Q and V_Q the transaction delays for database queries.

In this section we adopt the first method and evaluate Eq. (20.13) assuming a single term dominates. (In Section 20.9 we present results using the second method.) Table 20.3 shows the hit ratio threshold required to obtain net cost savings, for each case in which one of the cost terms is dominant.

TABLE 20.3 Minimum Hit Ratios and LCMRs for Various Individual Dominant Signalling Network Cost Terms

| Dominant Cost Term | Hit Ratio Threshold, p_T | LCMR Threshold, LCMR_T | LCMR Threshold (q = 0.043) | LCMR Threshold (q = 0.25) |
|---|---|---|---|---|
| A_l | 1 | ∞ | ∞ | ∞ |
| A_r | 0 | 0 | 0 | 0 |
| D | 1 − q | 1/q − 1 | 22 | 3 |
| L | 1 − q/2 | 2/q − 1 | 45 | 7 |
| R | 1 − q/2 | 2/q − 1 | 45 | 7 |
| H_Q | 0 | 0 | 0 | 0 |
| V_Q | 1 | ∞ | ∞ | ∞ |

In Table 20.3 we see that if the cost of querying a VLR or of traversing a local A link is the dominant cost, caching for users who may move is never worthwhile, regardless of users’ call reception and mobility patterns. This is because the caching strategy essentially distributes the functionality of the HLR to the VLRs. Thus, the load on the VLR and the local A link is always increased, since any move by a user results in a cache miss. On the other hand, for a fixed user (or telephone), caching is always worthwhile. We also observe that if the remote A links or HLR querying are the bottlenecks, caching is worthwhile even for users with very low hit ratios.

As a simple average-case calculation, consider the net network benefit of caching when HLR access and update is the performance bottleneck. Consider a scenario where u = 50% of PCS users receive c = 80% of their calls from s = 5 RAs where their hit ratio p > 0, and s′ = 4 of the SSPs at those RAs contain sufficiently large caches. Assume that caching is applied only to this subset of users and to no other users. Suppose that the average hit ratio for these users is p = 80%, so that 80% of the HLR accesses for calls to these users from these RAs are avoided. Then the net saving in the accesses to the system’s HLR is H = (u c s′ p)/s, or about 25%.

We discuss other quantities in Table 20.3 next. It is first useful to relate the cache hit ratio to users’ calling and mobility patterns directly via the LCMR. Doing so requires making assumptions about the distribution of the user’s calls and moves. We consider the steady state where the incoming call stream from an SSP to a user is a Poisson process with arrival rate λ, and the time that the user resides in an RA has a general distribution F(t) with mean 1/µ. Thus,

$$\mathrm{LCMR} = \frac{\lambda}{\mu} \tag{20.14}$$

Let t be the time interval between two consecutive calls from the SSP to the user and t1 be the time interval between the first call and the time when the user moves to a new RA. From the random observer property of the arrival call stream [7], the hit ratio is

$$p = \Pr[t < t_1] = \int_{t=0}^{\infty} \lambda e^{-\lambda t} \int_{t_1=t}^{\infty} \mu \left[1 - F(t_1)\right] dt_1 \, dt$$

If F(t) is an exponential distribution, then

$$p = \frac{\lambda}{\lambda + \mu} \qquad (20.15)$$

and we can derive the LCMR threshold, the minimum LCMR required for caching to be beneficial assuming incoming calls are a Poisson process and intermove times are exponentially distributed,

$$\mathrm{LCMR}_T = \frac{p_T}{1 - p_T} \qquad (20.16)$$
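Equations (20.15) and (20.16) can be checked numerically. The following is a minimal sketch, assuming Poisson call arrivals and exponentially distributed residence times as in the text; the function names and the sample size are illustrative choices, not part of the original analysis.

```python
import random


def analytic_hit_ratio(lam, mu):
    """Eq. (20.15): p = lambda / (lambda + mu) for exponential residence times."""
    return lam / (lam + mu)


def lcmr_threshold(p_t):
    """Eq. (20.16): minimum LCMR for a hit ratio threshold p_T (requires p_T < 1)."""
    return p_t / (1.0 - p_t)


def simulated_hit_ratio(lam, mu, n_calls=200_000, seed=1):
    """Estimate Pr[t < t1]: the next call arrives before the user moves.

    Because the exponential distribution is memoryless, the residual
    residence time seen at a call arrival is again exponential with rate mu.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_calls):
        t_call = rng.expovariate(lam)  # time until the next call
        t_move = rng.expovariate(mu)   # time until the user leaves the RA
        if t_call < t_move:
            hits += 1
    return hits / n_calls
```

For example, with λ = 5 and µ = 1 (LCMR = 5) the simulated value should lie close to the analytic hit ratio 5/6, and applying `lcmr_threshold` to that hit ratio recovers the LCMR of 5.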

Equation (20.16) is used to derive LCMR thresholds assuming various dominant cost terms, as shown in Table 20.3. Several values for LCMRT in Table 20.3 involve the term q, i.e., the probability that the pointed VLR is a local VLR. These values may be numerically evaluated under simplifying assumptions. For example, assume that all of the SSPs in the network are uniformly distributed amongst l LSTPs. Also, assume that all of the PCS subscribers are uniformly distributed in location across all SSPs and that each subscriber exhibits the same incoming call rate at every SSP. Under those conditions, q is simply 1/l. Consider the case of the public switched telephone network. Given that there are a total of 160 local access transport areas (LATAs) across the 7 Regional Bell Operating Company (RBOC) regions [4], the average number of LATAs per region, or l, is 160/7, or approximately 23. Table 20.3 shows the results with q = 1/l in this case.

We observe that the assumption that all users receive calls uniformly from all switches in the network is extremely conservative. In practice, we expect that user call reception patterns would display significantly more locality, so that q would be larger and the LCMR thresholds required to make caching worthwhile would be smaller. It is also worthwhile to consider the case of an RBOC region with PCS deployed in a few LATAs only, a likely initial scenario, say, 4 LATAs. In either case the value of q would be significantly higher; Table 20.3 shows the LCMR threshold when q = 0.25.

It is possible to quantify the net costs and benefits of caching in terms of signalling network impacts in this way and to determine the hit ratio and LCMR threshold above which users should have the caching strategy applied. Applying caching to users whose hit ratio and LCMR are below this threshold results in net increases in network impacts. It is, thus, important to estimate users’ LCMRs accurately. The next section discusses how to do so.
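The LCMR columns of Table 20.3 follow from the hit ratio thresholds and Eq. (20.16). The sketch below recomputes them; the dictionary layout and function names are illustrative assumptions, and q = 1/23 is used as the exact value behind the rounded q = 0.043.

```python
import math

# Hit ratio thresholds p_T from Table 20.3, expressed as functions of q.
P_T = {
    "Al": lambda q: 1.0,
    "Ar": lambda q: 0.0,
    "D": lambda q: 1.0 - q,
    "L": lambda q: 1.0 - q / 2.0,
    "R": lambda q: 1.0 - q / 2.0,
    "HQ": lambda q: 0.0,
    "VQ": lambda q: 1.0,
}


def lcmr_threshold(p_t):
    """Eq. (20.16); a threshold of p_T = 1 can never be met, so return infinity."""
    return math.inf if p_t >= 1.0 else p_t / (1.0 - p_t)


def table_row(term, q):
    """LCMR threshold when the given cost term dominates."""
    return lcmr_threshold(P_T[term](q))


if __name__ == "__main__":
    for term in P_T:
        print(term, table_row(term, 1.0 / 23.0), table_row(term, 0.25))
```

With q = 1/23 the D row gives 22 and the L and R rows give 45; with q = 0.25 they give 3 and 7, matching the table.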

20.9 Techniques for Estimating Users’ LCMR

Here we sketch some methods of estimating users’ LCMR. A simple and attractive policy is to not estimate these quantities on a per-user basis at all. For instance, if the average LCMR over all users in a PCS system is high enough (and from Table 20.3, it need not be high depending on which network elements are the dominant costs), then caching could be used at every SSP to yield net system-wide benefits. Alternatively, if it is known that at any given SSP the average LCMR over all users is high enough, a cache can be installed at that SSP. Other variations can be designed. One possibility for deciding about caching on a per-user basis is to maintain information about a user’s calling and mobility pattern at the HLR and to download it periodically to selected SSPs during off-peak hours. It is easy to envision numerous variations on this idea. In this section we investigate two possible techniques for estimating LCMR on a per-user basis when caching is to be deployed. The first algorithm, called the running average algorithm, simply maintains a running average of the hit ratio for each user. The second algorithm, called the reset-K algorithm, attempts to obtain a measure of the hit ratio over the recent history of the user’s movements. We describe the two algorithms next and evaluate their effectiveness using a stochastic analysis taking into account user calling and mobility patterns.

20.9.1 The Running Average Algorithm

The running average algorithm maintains, for every user that has a cache entry, the running average of the hit ratio. A running count is kept of the number of calls to a given user, and, regardless of the FIND procedure used to locate the user, a running count of the number of times that the user was at the same location for any two consecutive calls; the ratio of these numbers provides the measured running average of the hit ratio. We denote the measured running average of the hit ratio by pM; in steady state, we expect that pM = p. The user’s previous location as stored in the cache entry is used only if the running average of the hit ratio pM is greater than the cache hit threshold pT. Recall that the cache scheme outperforms the basic scheme if p > pT = CH/CB. Thus, in steady state, the running average algorithm will outperform the basic scheme when pM > pT.

We consider, as before, the steady state where the incoming call stream from an SSP to a user is a Poisson process with arrival rate λ, and the time that the user resides in an RA has an exponential distribution with mean 1/µ. Thus LCMR = λ/µ [Eq. (20.14)] and the location tracking cost at steady state is

$$C_C = \begin{cases} p_M C_H + (1 - p_M)\, C_B, & p_M > p_T \\ C_B, & \text{otherwise} \end{cases} \qquad (20.17)$$

Figure 20.6 plots the cost ratio CC/CB from Eq. (20.17) against LCMR. (This corresponds to assigning uniform units to all cost terms in Eq. (20.13), i.e., the second evaluation method as discussed in Section 20.8. Thus, the ratio CC/CB may represent the percentage reduction in user location time with the caching strategy compared to the basic strategy.) The figure indicates that in the steady state, the caching strategy with the running average algorithm for estimating LCMR can significantly outperform the basic scheme if LCMR is sufficiently large. For instance, with LCMR ∼ 5, caching can lead to cost savings of 20–60% over the basic strategy.

Equation (20.17) (cf. solid curves in Fig. 20.6) is validated against a simple Monte Carlo simulation (cf. dashed curves in Fig. 20.6). In the simulation, the confidence interval for the 95% confidence level of the output measure CC/CB is within 3% of the mean value. This simulation model will later be used to study the running average algorithm when the mean of the movement distribution changes from time to time [which cannot be modeled by using Eq. (20.17)].

FIGURE 20.6: The location tracking cost for the running average algorithm.

One problem with the running average algorithm is that the parameter p is measured from the entire past history of the user’s movement, and the algorithm may not be sufficiently dynamic to adequately reflect the recent history of the user’s mobility patterns.
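The bookkeeping of the running average algorithm and the steady-state cost of Eq. (20.17) can be sketched as follows. This is an illustrative sketch only: the class and function names are hypothetical, and counting each pair of consecutive calls as one observation is an assumption about how the two running counts are kept.

```python
class RunningAverageEstimator:
    """Tracks the measured hit ratio p_M for one user at one SSP."""

    def __init__(self, p_threshold):
        self.p_threshold = p_threshold  # p_T = C_H / C_B
        self.pairs = 0                  # consecutive-call pairs observed
        self.same_location = 0          # pairs with the user in the same RA
        self.last_location = None

    def record_call(self, location):
        """Update the counts; the first call has no predecessor to compare with."""
        if self.last_location is not None:
            self.pairs += 1
            if location == self.last_location:
                self.same_location += 1
        self.last_location = location

    @property
    def p_measured(self):
        return self.same_location / self.pairs if self.pairs else 0.0

    def use_cache(self):
        """The cached location is used only while p_M exceeds p_T."""
        return self.p_measured > self.p_threshold


def steady_state_cost_ratio(lcmr, c_h, c_b):
    """C_C / C_B from Eq. (20.17), taking p_M = p = LCMR / (1 + LCMR)."""
    p = lcmr / (1.0 + lcmr)
    p_t = c_h / c_b
    if p > p_t:
        return (p * c_h + (1.0 - p) * c_b) / c_b
    return 1.0
```

Feeding the estimator the location sequence A, A, B, B, B yields three hits out of four call pairs, so pM = 0.75. Because the counts are never reset, early history keeps influencing pM, which is the limitation noted above.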

20.9.2 The Reset-K Algorithm

We may modify the running average algorithm such that p is measured from the recent history. Define every K incoming calls as a cycle. The modified algorithm, which is referred to as the reset-K algorithm, counts the number of cache hits n in a cycle. If the measured hit ratio for a user, pM = n/K ≥ pT, then the cache is enabled for that user, and the cached information is always used to locate the user in the next cycle. Otherwise, the cache is disabled for that user, and the basic scheme is used. At the beginning of a cycle, the cache hit count is reset, and a new pM value is measured during the cycle.

To study the performance of the reset-K algorithm, we model the number of cache misses in a cycle by a Markov process. Assume as before that the call arrivals are a Poisson process with arrival rate λ and the time period the user resides in an RA has an exponential distribution with mean 1/µ. A pair (i, j), where i > j, represents the state that there are j cache misses before the first i incoming phone calls in a cycle. A pair (i, j)∗, where i ≥ j ≥ 1, represents the state that there are j − 1 cache misses before the first i incoming phone calls in a cycle, and the user moves between the ith and the (i + 1)th phone calls. The difference between (i, j) and (i, j)∗ is that if the Markov process is in the state (i, j) and the user moves, then the process moves into the state (i, j + 1)∗. On the other hand, if the process is in state (i, j)∗ when the user moves, the process remains in (i, j)∗ because at most one cache miss occurs between two consecutive phone calls.

Figure 20.7(a) illustrates the transitions for state (i, 0) where 2 < i < K + 1. The Markov process moves from (i − 1, 0) to (i, 0) if a phone call arrives before the user moves out. The rate is λ. The process moves from (i, 0) to (i, 1)∗ if the user moves to another RA before the (i + 1)th call arrival. Let π(i, j) denote the probability of the process being in state (i, j). Then the transition equation is

$$\pi(i, 0) = \frac{\lambda}{\lambda + \mu}\,\pi(i - 1, 0), \qquad 2 \ldots$$

FIGURE 20.7: State transitions.

[...]

... 1.5 with M = 8), but an increased range in the downlink is also needed to get an effective reduction of the number of cell sites. An increased downlink range can be achieved using adaptive beamforming (but with a much higher complexity compared to the uplink-only implementation), a multibeam antenna (i.e., a phased array doing fixed beamforming), or an increased transmit power of the base station. However, the success of smart antenna techniques for range extension applications in second generation systems has been slowed down by their complexity of implementation and by operational constraints (multiple feeders, large antenna panels).

26.5 High Bit Rate Data Transmission

26.5.1 Circuit Mode Techniques

All second generation wireless systems support circuit mode data services with basic rates typically ranging from 9.6 kb/s (in cellular systems) to 32 kb/s (in cordless systems) for a single physical radio resource. With the growing need for higher rates, new services have been developed based on multiple allocation or grouping of physical resources. In GSM, HSCSD (High Speed Circuit Switched Data) enables multiple Full Rate Traffic Channels (TCH/F) to be allocated to a call so that a mobile subscriber can use n times the transmission capacity of a single TCH/F channel (Fig. 26.2). The n full rate channels over which the user data stream is split are handled completely independently in the physical layer and for layer 1 error control. The HSCSD channel resulting from the logical combination of n TCH/F channels is controlled as a single radio link during cellular operations such as handover. At the A interface, calls will be limited to a single 64 kb/s circuit. Thus HSCSD will support transparent (up to 64 kb/s) and nontransparent modes (up to 4 × 9.6 = 38.4 kb/s and, later, 4 × 14.4 = 57.6 kb/s). The initial allocation can be changed during a call if required by the user and authorized by the network. Initially the network allocates an appropriate HSCSD connection according to the requested user bit rate over the air interface. Both symmetric and asymmetric configurations for bidirectional HSCSD operation are authorized. The required TCH/F channels are allocated over consecutive or nonconsecutive timeslots.

FIGURE 26.2: Simplified GSM network configuration for HSCSD.

Similar multislot schemes are envisaged or standardized for other TDMA systems. In IS-54 and PDC, where radio channels are relatively narrowband, no more than three time slots can be used per carrier and the achievable data rate is therefore limited to, say, 32 kb/s. In contrast, in DECT up to 12 time slots can be used at 32 kb/s each, yielding a maximum data rate of 384 kb/s. Moreover, the TDD access mode of DECT allows asymmetric time slot allocation between uplink and downlink, thus enabling even higher data rates in one direction.
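The multislot arithmetic in this subsection can be captured in a few lines. This is a hedged sketch: the function names are illustrative, and modelling the A interface constraint as a simple cap on the aggregate rate is an assumption made for the example.

```python
A_INTERFACE_LIMIT_KBPS = 64.0  # single 64 kb/s circuit at the A interface (from the text)


def hscsd_rate_kbps(n_channels, per_channel_kbps):
    """Aggregate HSCSD rate for n TCH/F channels, capped by the A interface circuit."""
    if n_channels < 1:
        raise ValueError("at least one TCH/F channel is required")
    return min(n_channels * per_channel_kbps, A_INTERFACE_LIMIT_KBPS)


def dect_rate_kbps(n_slots, per_slot_kbps=32.0, max_slots=12):
    """Aggregate DECT rate for n time slots at 32 kb/s each (up to 12 slots)."""
    if not 1 <= n_slots <= max_slots:
        raise ValueError("slot count out of range")
    return n_slots * per_slot_kbps


if __name__ == "__main__":
    print(hscsd_rate_kbps(4, 9.6))   # nontransparent mode, 9.6 kb/s channels
    print(hscsd_rate_kbps(4, 14.4))  # nontransparent mode, 14.4 kb/s channels
    print(dect_rate_kbps(12))        # DECT maximum quoted in the text
```

Four channels at 9.6 kb/s and 14.4 kb/s reproduce the 38.4 kb/s and 57.6 kb/s figures, and twelve DECT slots give 384 kb/s.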

26.5.2 Packet Mode Techniques

There is a growing interest in packet data services in second generation wireless systems to support data applications with intermittent and bursty transmission requirements like the Internet, with a better usage of available radio resources, thanks to the multiplexing of data from several mobile users on the same physical channel. Cellular Digital Packet Data (CDPD) has been defined in the U.S. as a radio access overlay for AMPS or D-AMPS (IS-54) systems, allowing packet data transmission on available radio channels. However, CDPD is optimized for short data transmission and the bit rate is limited to 19.2 kb/s. A CDMA packet data standard has also been defined (IS-657) which supports CDPD and Internet protocols with a similar bit rate limitation but allowing use of the same backhaul as for voice traffic.

In Europe, ETSI has almost completed the standardization of GPRS (General Packet Radio Service) for GSM. A GPRS subscriber will be able to send and receive in an end-to-end packet transfer mode. Both point-to-point and point-to-multipoint modes are defined. A GPRS network coexists with a GSM PLMN as an autonomous network. In fact, the Serving GPRS Support Node (SGSN) interfaces with the GSM Base Station Controller (BSC), an MSC, and a Gateway GPRS Support Node (GGSN). In turn, the GGSN interfaces with the GGSNs of other GPRS networks and with public Packet Data Networks (PDN). Typically, GPRS traffic can be set up through the common control channels of GSM, which are accessed in slotted ALOHA mode. The layer 2 protocol data units, which are about 2 kbytes in length, are segmented and transmitted over the air interface using one of the four possible channel coding schemes. The system is highly scalable as it allows from one mobile using 8 radio time slots up to 16 mobiles per time slot, with separate allocation in up- and downlink. The resulting peak data rate per user ranges from 9 kb/s up to 170 kb/s. Time slot concatenation and variable channel coding to maximize the user information bit rate are envisaged for future implementations. This is indicated by the mobile station, which provides information concerning the desire to initiate in-call modifications and the channel coding schemes that can be used during the call set up phase. It is expected that use of the GPRS service will initially be limited and traffic growth will depend on the introduction of GPRS capable subscriber terminals. Easy scalability of the GPRS backbone (e.g., by introducing parallel GGSNs) is an essential feature of the system architecture (Fig. 26.3).

FIGURE 26.3: Simplified view of the GPRS architecture.
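The quoted GPRS peak rate range of roughly 9 to 170 kb/s per user can be reproduced from per-slot coding scheme rates and the number of time slots. The per-slot values below are the commonly cited figures for the four GPRS coding schemes; they are not stated in this chapter and should be read as an assumption, as should the function name.

```python
# Commonly cited per-slot rates for the four GPRS coding schemes (assumed values,
# not taken from this chapter).
CS_RATE_KBPS = {"CS-1": 9.05, "CS-2": 13.4, "CS-3": 15.6, "CS-4": 21.4}


def gprs_peak_rate_kbps(coding_scheme, n_slots):
    """Peak user rate for one mobile using n of the 8 time slots of a carrier."""
    if not 1 <= n_slots <= 8:
        raise ValueError("a GSM carrier offers 1 to 8 time slots")
    return CS_RATE_KBPS[coding_scheme] * n_slots


if __name__ == "__main__":
    print(gprs_peak_rate_kbps("CS-1", 1))  # most robust coding, single slot
    print(gprs_peak_rate_kbps("CS-4", 8))  # least protected coding, all 8 slots
```

Under these assumed rates, a single slot with the most robust scheme gives about 9 kb/s and eight slots with the least protected scheme give about 171 kb/s, in line with the range quoted above.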

26.5.3 New Modulation Schemes

New modulation schemes are being studied as an option in several second generation wireless standards. The aim is to offer higher rate data services equivalent or close to the 2 Mb/s objective of the forthcoming third generation standards. Multilevel modulations (i.e., several bits per modulated symbol) represent a straightforward means to increase the carrier bit rate. However, they represent a significant change in the air interface characteristics, and the increased bit rate is achieved at the expense of a higher operational signal-to-noise plus interference ratio, which is not compatible with large cell dimensions. Therefore, the new high bit rate data services are mainly targeting urban areas, and the effective bit rate allocated to data users will depend on the system load.

Such a new air interface option is being standardized for GSM under the name of EDGE (Enhanced Data rates for GSM Evolution). The selected modulation scheme is 8-PSK; suitable coding schemes are under study, whereas the other air interface parameters (carrier spacing, TDMA frame structure, ...) are kept unchanged. Reusing HSCSD (for circuit data) and GPRS (for packet data) protocols and service capabilities, EDGE will provide similar ECSD and EGPRS services but with a three-fold increase of the user bit rate. The higher-level modulation demands better radio link performance: typically, sensitivity degrades by 3 to 4 dB and the required C/I increases by 6 to 7 dB. Operation will also be restricted to environments with limited time dispersion and limited mobile speed. Nevertheless, EGPRS will roughly double the mean throughput compared to GPRS (for the same average transmitted power). EDGE will also increase the maximum achievable data rate in a GSM system to 553.6 kb/s in multislot (unprotected) operation. Six different protection schemes are foreseen in EGPRS using convolutional coding with a rate ranging from 1/3 to 1 and corresponding to user rates between 22.8 and 69.2 kb/s per time slot. This is in addition to the four coding schemes already defined for GPRS. An intelligent link adaptation algorithm will dynamically select the most appropriate modulation and coding schemes, i.e., those yielding the highest throughput for a given channel quality. The first phase of EDGE standardization should be completed by the end of 1999.

It should be noted that a similar EDGE option is being studied for IS-54/IS-136 (and their PCS derivatives). Initially, the 30 kHz channel spacing will be maintained and then extension to a 200 kHz channel will be provided in order to offer a convergence with its GSM counterpart.

A higher bit rate option is also under standardization for DECT. Here it is seen as an essential requirement to maintain backward compatibility with existing equipment, so the new multilevel modulation will only affect the payload part of the bursts, keeping the control and signalling parts unchanged. This ensures that equipment with basic modulation and equipment with a higher rate option can efficiently share a common base station infrastructure. Only 4-level and 8-level modulations are considered and the symbol length, carrier spacing, and slot structure remain unchanged. The requirements on transmitter modulation accuracy need to be more stringent for 4- and 8-level modulation than for the current 2-level scheme. An increased accuracy can provide for coherent demodulation, whereby some (or most) of the sensitivity and C/I loss when using the multilevel mode can be regained. In combination with other new air interface features like forward error correction and double slots (with reduced overhead), the new modulation scheme will provide a wide range of data rates up to 2 Mb/s. For instance, using π/4-DQPSK modulation (a possible/suitable choice), an unprotected connection with two double slots in each direction gives a data rate of 384 kb/s. Asymmetric connections with a maximum of 11 double slots in one direction will also be supported.
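The EDGE figures above and the idea of throughput-driven link adaptation can be illustrated with a short sketch. The throughput model (user rate times one minus the block error rate), the scheme labels, and the error rate values are hypothetical and serve only to show the selection principle; only the 22.8 and 69.2 kb/s per-slot rates come from the text.

```python
def edge_multislot_peak_kbps(per_slot_kbps=69.2, n_slots=8):
    """Peak unprotected EDGE rate when all 8 time slots carry 69.2 kb/s each."""
    return per_slot_kbps * n_slots


def select_scheme(rates_kbps, block_error_rate):
    """Pick the scheme with the highest expected throughput.

    rates_kbps:       scheme label -> user rate per time slot (kb/s)
    block_error_rate: scheme label -> block error rate at the current channel
                      quality (hypothetical inputs to the adaptation rule)
    """
    return max(
        rates_kbps,
        key=lambda name: rates_kbps[name] * (1.0 - block_error_rate[name]),
    )


if __name__ == "__main__":
    rates = {"rate-1/3": 22.8, "unprotected": 69.2}  # endpoints quoted in the text
    print(edge_multislot_peak_kbps())
    print(select_scheme(rates, {"rate-1/3": 0.0, "unprotected": 0.1}))
    print(select_scheme(rates, {"rate-1/3": 0.01, "unprotected": 0.8}))
```

Eight slots at 69.2 kb/s reproduce the 553.6 kb/s maximum. In the illustrative good-channel case the unprotected scheme wins, while under heavy errors the strongly coded scheme delivers more throughput.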

26.6 Conclusion

Since their introduction in the early 1990s, most of the second generation systems have been enjoying exponential growth. With more than 100 million subscribers acquired worldwide in less than ten years of lifetime, the systems based on the GSM family of standards have demonstrated the most spectacular development. Despite a more regional implementation of other second generation systems, each one of those can boast a multimillion subscriber base in mobile or fixed wireless networks. A variety of service requirements of third generation mobile communication systems are already being met by the upcoming enhancements of second generation systems. Two important trends are reflected by this:

• The introduction of third generation systems like Universal Mobile Telecommunication System (UMTS) or International Mobile Telecommunication-2000 (IMT-2000) might be delayed to a point in time where the evolutionary capabilities of second generation systems have been exhausted.
• The deployment of networks based on third generation systems will be progressive. Any new radio interface will be imposed worldwide if and only if it provides substantial advantages as compared to the present systems. Another essential requirement is the capability of downward compatibility with second generation systems.

Defining Terms

Capacity: In a mobile network it can be defined as the Erlangs throughput by a cell, a cluster of cells, or by a portion of a network. For a given radio interface, the achievable capacity is a function of the robustness of the physical layer, the effectiveness of the medium access control (MAC) layer and the multiple access technique. Moreover, it is strongly dependent on the radio spectrum available for network planning.

Cellular: Refers to public land mobile radio networks for generally wide area (e.g., national) coverage, to be used with medium- or high-power vehicular mobiles or portable stations and for providing mobile access to the Public Switched Telephone Network (PSTN). The network implementation exhibits a cellular architecture which enables frequency reuse in nonadjacent cells.

Cordless: These are systems to be used with simple low power portable stations operating within a short range of a base station and providing access to fixed public or private networks. There are three main applications, namely, residential (at home, for Plain Old Telephone Service, POTS), public-access (in public places and crowded areas, also called Telepoint), and Wireless Private Automatic Branch eXchange (WPABX, providing cordless access in the office environment), plus emerging applications like radio access for local loop.

Coverage quality: It is the percentage of the served area where a communication can be established. It is determined by the acceptable path loss of the radio link and by the propagation characteristics in the area. The radio link budget generally includes some margin depending on the type of terrain (for shadowing effects) and on operator’s requirements (for indoor penetration). A coverage quality of 90% is a typical value for cellular networks.

Speech quality: It strongly depends on the intrinsic performance of the speech coder and its evaluation normally requires intensive listening tests. When it is comparable to the quality achieved on modern wire-line telephone networks, it is called “toll quality.” In wireless systems it is also influenced by other parameters linked to the communication characteristics like radio channel impairments (bit error rate), transmission delay, echo, background noise, and tandeming (i.e., when several coding/decoding operations are involved in the link).

References

[1] Anderson, S., Antenna Arrays in Mobile Communication Systems, Proc. Second Workshop on Smart Antennas in Wireless Mobile Communications, Stanford University, Jul. 1995.
[2] Budagavi, M. and Gibson, J.D., Speech coding in mobile radio communications, Proceedings of the IEEE, 86(7), 1402–1412, Jul. 1998.
[3] Cox, D.C., Wireless network access for personal communications, IEEE Communications Magazine, 96–115, Dec. 1992.
[4] DECT, Digital European Cordless Telecommunications Common Interface, ETS-300-175, ETSI, 1992.
[5] Fingscheidt, T. and Vary, P., Robust Speech Decoding: A Universal Approach to Bit Error Concealment, Proc. IEEE ICASSP, 1667–1670, Apr. 1997.
[6] Ganz, A., et al., On optimal design of multitier wireless cellular systems, IEEE Communications Magazine, 88–93, Feb. 1997.
[7] GSM, GSM Recommendations Series 01-12, ETSI, 1990.
[8] IS-54, Cellular System, Dual-Mode Mobile Station-Base Station Compatibility Standard, EIA/TIA Interim Standard, 1991.
[9] IS-95, Mobile Station-Base Station Compatibility Standard for Dual-Mode Wideband Spread Spectrum Cellular System, EIA/TIA Interim Standard, 1993.
[10] Kuhn, A., et al., Validation of the Feature Frequency Hopping in a Live GSM Network, Proc. 46th IEEE Vehic. Tech. Conf., 321–325, Apr. 1996.
[11] Lagrange, X., Multitier cell design, IEEE Communications Magazine, 60–64, Aug. 1997.
[12] Lee, D. and Xu, C., The effect of narrowbeam antenna and multiple tiers on system capacity in CDMA wireless local loop, IEEE Communications Magazine, 110–114, Sep. 1997.
[13] Olofsson, H., et al., Interference Diversity as Means for Increased Capacity in GSM, Proc. EPMCC’95, 97–102, Nov. 1995.
[14] PDC, Personal Digital Cellular System Common Air Interface, RCR-STD27B, 1991.
[15] PHS, Personal Handy Phone System: Second Generation Cordless Telephone System Standard, RCR-STD28, 1993.
[16] Tuttlebee, W.H.W., Cordless personal communications, IEEE Communications Magazine, 42–53, Dec. 1992.

Further Information

European standards (GSM, CT2, DECT, TETRA) are published by ETSI Secretariat, 06921 Sophia Antipolis Cedex, France. U.S. standards (IS-54, IS-95, APCO) are published by Electronic Industries Association, Engineering Department, 2001 Eye Street, N.W., Washington D.C. 20006, U.S.A. Japanese standards (PDC, PHS) are published by RCR (Research and Development Center for Radio Systems), 1-5-16, Toranomon, Minato-ku, Tokyo 105, Japan.


Hanzo, L. “The Pan-European Cellular System” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


The Pan-European Cellular System

Lajos Hanzo University of Southampton

27.1 Introduction
27.2 Overview
27.3 Logical and Physical Channels
27.4 Speech and Data Transmission
27.5 Transmission of Control Signals
27.6 Synchronization Issues
27.7 Gaussian Minimum Shift Keying Modulation
27.8 Wideband Channel Models
27.9 Adaptive Link Control
27.10 Discontinuous Transmission
27.11 Summary
Defining Terms
References

27.1 Introduction

Following the standardization and launch of the Pan-European digital mobile cellular radio system known as GSM, it is of practical merit to provide a rudimentary introduction to the system’s main features for the communications practitioner. Since GSM operating licenses have been allocated to 126 service providers in 75 countries, it is justifiable that the GSM system is often referred to as the Global System for Mobile communications. The GSM specifications were released as 13 sets of recommendations [1], which are summarized in Table 27.1, covering various aspects of the system [3].

After a brief system overview in Section 27.2 and the introduction of physical and logical channels in Section 27.3, we describe how logical channels are mapped onto physical resources for speech and control channels in Sections 27.4 and 27.5, respectively. These details can be found in recommendations R.05.02 and R.05.03. These recommendations and all subsequently enumerated ones are to be found in [1]. Synchronization issues are considered in Section 27.6. Modulation (R.05.04), transmission via the standardized wideband GSM channel models (R.05.05), as well as adaptive radio link control (R.05.06 and R.05.08), discontinuous transmission (DTX) (R.06.31), and voice activity detection (VAD) (R.06.32) are highlighted in Sections 27.7–27.10, whereas a summary of the fundamental GSM features is offered in Section 27.11.

TABLE 27.1 GSM Recommendations [R.01.01]

R.00  Preamble to the GSM recommendations
R.01  General structure of the recommendations, description of a GSM network, associated recommendations, vocabulary, etc.
R.02  Service aspects: bearer-, tele- and supplementary services, use of services, types and features of mobile stations (MS), licensing and subscription, as well as transferred and international accounting, etc.
R.03  Network aspects, including network functions and architecture, call routing to the MS, technical performance, availability and reliability objectives, handover and location registration procedures, as well as discontinuous reception and cryptological algorithms, etc.
R.04  Mobile/base station (BS) interface and protocols, including specifications for layer 1 and 3 aspects of the open systems interconnection (OSI) seven-layer structure.
R.05  Physical layer on the radio path, incorporating issues of multiplexing and multiple access, channel coding and modulation, transmission and reception, power control, frequency allocation and synchronization aspects, etc.
R.06  Speech coding specifications, such as functional, computational and verification procedures for the speech codec and its associated voice activity detector (VAD) and other optional features.
R.07  Terminal adaptors for MSs, including circuit and packet mode as well as voiceband data services.
R.08  Base station and mobile switching center (MSC) interface, and transcoder functions.
R.09  Network interworking with the public switched telephone network (PSTN), integrated services digital network (ISDN), and packet data networks.
R.10  Service interworking, short message service.
R.11  Equipment specification and type approval specification as regards to MSs, BSs, MSCs, home (HLR) and visited location register (VLR), as well as system simulator.
R.12  Operation and maintenance, including subscriber, routing tariff and traffic administration, as well as BS, MSC, HLR and VLR maintenance issues.

27.2 Overview

The system elements of a GSM public land mobile network (PLMN) are portrayed in Fig. 27.1, where their interconnections via the standardized interfaces A and Um are indicated as well. The mobile station (MS) communicates with the serving and adjacent base stations (BS) via the radio interface Um, whereas the BSs are connected to the mobile switching center (MSC) through the network interface A. As seen in Fig. 27.1, the MS includes a mobile termination (MT) and a terminal equipment (TE). The TE may be constituted, for example, by a telephone set and fax machine. The MT performs functions needed to support the physical channel between the MS and the base station, such as radio transmissions, radio channel management, channel coding/decoding, speech encoding/decoding, and so forth. The BS is divided functionally into a number of base transceiver stations (BTS) and a base station controller (BSC). The BS is responsible for channel allocation (R.05.09), link quality and power budget control (R.05.06 and R.05.08), signalling and broadcast traffic control, frequency hopping (FH) (R.05.02), handover (HO) initiation (R.03.09 and R.05.08), etc. The MSC represents the gateway to other networks, such as the public switched telephone network (PSTN), integrated services digital network (ISDN) and packet data networks using the interworking functions standardized in recommendation R.09. The MSC’s further functions include paging, MS location updating (R.03.12), HO control (R.03.09), etc. The MS’s mobility management is assisted by the home location register (HLR) (R.03.12), storing part of the MS’s location information and routing incoming calls to the visitor location register (VLR) (R.03.12) in charge of the area, where the paged MS roams. Location update is asked for by the MS, whenever it detects from the received and decoded broadcast control channel (BCCH) messages that it entered a new location area. 
The HLR contains, amongst a number of other parameters, the international mobile subscriber identity (IMSI), which is used for the authentication (R.03.20) of the subscriber by his authentication center (AUC). This enables the system to confirm that the subscriber is allowed to access it. Every subscriber belongs to a home network and the specific services that the subscriber is allowed to use are entered into his HLR. The equipment identity register (EIR) allows for stolen, fraudulent, or faulty mobile stations to be identified by the network operators. The VLR is the functional unit that attends to an MS operating outside the area of its HLR. The visiting MS is automatically registered at the nearest MSC, and the VLR is informed of the MS’s arrival. A roaming number is then assigned to the MS, and this enables calls to be routed to it. The operations and maintenance center (OMC), network management center (NMC) and administration center (ADC) are the functional entities through which the system is monitored, controlled, maintained and managed (R.12).

FIGURE 27.1: Simplified structure of GSM PLMN. © ETT [4].

The MS initiates a call by searching for a BS with a sufficiently high received signal level on the BCCH carrier; it will await and recognize a frequency correction burst and synchronize to it (R.05.08). Now the BS allocates a bidirectional signalling channel and also sets up a link with the MSC via the network. How the control frame structure assists in this process will be highlighted in Section 27.5. The MSC uses the IMSI received from the MS to interrogate its HLR and sends the data obtained to the serving VLR. After authentication (R.03.20) the MS provides the destination number, the BS allocates a traffic channel, and the MSC routes the call to its destination. If the MS moves to another cell, it is reassigned to another BS, and a handover occurs. If both BSs in the handover process are controlled by the same BSC, the handover takes place under the control of the BSC; otherwise it is performed by the MSC. In case of incoming calls the MS must be paged by the BSC. A paging signal is transmitted on a paging channel (PCH), which is monitored continuously by all MSs and covers the location area in which the MS roams. In response to the paging signal, the MS performs an access procedure identical to that employed when the MS initiates a call.

27.3 Logical and Physical Channels

The GSM logical traffic and control channels are standardized in recommendation R.05.02, whereas their mapping onto physical channels is the subject of recommendations R.05.02 and R.05.03. The GSM system’s prime objective is to transmit the logical traffic channel’s (TCH) speech or data information. Their transmission via the network requires a variety of logical control channels. The set of logical traffic and control channels defined in the GSM system is summarized in Table 27.2. There are two general forms of speech and data traffic channels: the full-rate traffic channels (TCH/F), which carry information at a gross rate of 22.8 kb/s, and the half-rate traffic channels (TCH/H), which communicate at a gross rate of 11.4 kb/s. A physical channel carries either a full-rate traffic channel, or two half-rate traffic channels. In the former, the traffic channel occupies one timeslot, whereas in the latter the two half-rate traffic channels are mapped onto the same timeslot, but in alternate frames. TABLE 27.2

GSM Logical Channels, © ETT [4]

Traffic channels (TCH), duplex BS ↔ MS

  Type               Full rate (22.8 kb/s)            Half rate (11.4 kb/s)
  FEC-coded speech   TCH/F                            TCH/H
  FEC-coded data     TCH/F9.6, TCH/F4.8, TCH/F2.4     TCH/H4.8, TCH/H2.4

Control channels (CCH)

  Category                                      Logical channels
  Broadcast CCH (BCCH), BS → MS                 Freq. corr. ch. (FCCH); synchron. ch. (SCH); general inf.
  Common CCH (CCCH)                             Paging ch. (PCH), BS → MS; random access ch. (RACH), MS → BS; access grant ch. (AGCH), BS → MS
  Stand-alone dedicated CCH (SDCCH), BS ↔ MS    SDCCH/4; SDCCH/8
  Associated CCH (ACCH), BS ↔ MS                Fast ACCH: FACCH/F, FACCH/H; slow ACCH: SACCH/TF, SACCH/TH, SACCH/C4, SACCH/C8

For a summary of the logical control channels carrying signalling or synchronisation data, see Table 27.2. There are four categories of logical control channels, known as the BCCH, the common control channel (CCCH), the stand-alone dedicated control channel (SDCCH), and the associated control channel (ACCH). The purpose and deployment of the logical traffic and control channels will be explained by highlighting how they are mapped onto physical channels in assisting high-integrity communications.

A physical channel in a time division multiple access (TDMA) system is defined as a timeslot with a timeslot number (TN) in a sequence of TDMA frames. The GSM system, however, deploys TDMA combined with frequency hopping (FH) and, hence, the physical channel is partitioned in both time and frequency. Frequency hopping (R.05.02) combined with interleaving is known to be very efficient in combatting channel fading, and it results in near-Gaussian performance even over hostile Rayleigh-fading channels. The principle of FH is that each TDMA burst is transmitted via a different RF channel (RFCH). If the present TDMA burst happens to be in a deep fade, then the next burst most probably will not be. Consequently, the physical channel is defined as a sequence of radio frequency channels and timeslots. Each carrier frequency supports eight physical channels mapped onto eight timeslots within a TDMA frame. A given physical channel always uses the same TN in every TDMA frame. Therefore, a timeslot sequence is defined by a TN and a TDMA frame number (FN) sequence.

27.4 Speech and Data Transmission

The speech coding standard is recommendation R.06.10, whereas issues of mapping the logical speech traffic channel’s information onto the physical channel constituted by a timeslot of a certain carrier are specified in recommendation R.05.02. Since the error correction coding represents part of this mapping process, recommendation R.05.03 is also relevant to these discussions. The example of the full-rate speech traffic channel (TCH/FS) is used here to highlight how this logical channel is mapped onto the physical channel constituted by a so-called normal burst (NB) of the TDMA frame structure. This mapping is explained by referring to Figs. 27.2 and 27.3. Then this example will be extended to other physical bursts such as the frequency correction (FCB), synchronization (SB), access (AB), and dummy burst (DB) carrying logical control channels, as well as to their TDMA frame structures, as seen in Figs. 27.2 and 27.6.

The regular pulse excited (RPE) speech encoder is fully characterized in [3, 5, 7]. Because of its complexity, its description is beyond the scope of this chapter. Suffice it to say that, as can be seen in Fig. 27.3, it delivers 260 b/20 ms at a bit rate of 13 kb/s, and these bits are divided into three significance classes: class 1a (50 b), class 1b (132 b), and class 2 (78 b). The class-1a bits are encoded by a systematic (53, 50) cyclic error detection code by adding three parity bits. Then the bits are reordered and four zero tailing bits are added to periodically reset the memory of the subsequent half-rate, constraint length five convolutional codec (CC) CC(2, 1, 5), as portrayed in Fig. 27.3. The resulting 50 + 3 + 132 + 4 = 189 b are thus encoded into 378 b. Now the unprotected 78 class-2 bits are concatenated to yield a block of 456 b/20 ms, which implies an encoded bit rate of 22.8 kb/s. This frame is partitioned into eight 57-b subblocks that are block diagonally interleaved before undergoing intraburst interleaving.

At this stage each 57-b subblock is combined with a similar subblock of the previous 456-b frame to construct a 116-b burst, where the flag bits hl and hu are included to classify whether the current burst is really a TCH/FS burst or has been stolen by an urgent fast associated control channel (FACCH) message. Now the bits are encrypted and positioned in a NB, as depicted at the bottom of Fig. 27.2, where three tailing bits (TB) are added at both ends of the burst to reset the memory of the Viterbi channel equalizer (VE), which is responsible for removing both the channel-induced and the intentional controlled intersymbol interference [6]. The 8.25-b interval duration guard period (GP) at the bottom of Fig. 27.2 is provided to prevent burst overlapping due to propagation delay fluctuations. Finally, a 26-b equalizer training segment is included in the center of the normal traffic burst. This segment is constructed from a 16-b Viterbi channel equalizer training pattern surrounded by five quasiperiodically repeated bits on both sides. Since the MS has to be informed about which BS it communicates with, for neighboring BSs one of eight different training patterns is used, associated with the so-called BS color codes, which assist in identifying the BSs.

This 156.25-b duration TCH/FS NB constitutes the basic timeslot of the TDMA frame structure, which is input to the Gaussian minimum shift keying (GMSK) modulator to be highlighted in Section 27.7, at a bit rate of approximately 271 kb/s. Since the bit interval is 1/(271 kb/s) = 3.69 µs, the timeslot duration is 156.25 · 3.69 µs ≈ 0.577 ms. Eight such normal bursts of eight appropriately staggered TDMA users are multiplexed onto one RF carrier, giving a TDMA frame of 8 · 0.577 ≈ 4.615-ms duration, as shown in Fig. 27.2. The physical channel as characterized earlier provides a physical timeslot with a throughput of 114 b/4.615 ms = 24.7 kb/s, which is sufficiently high to transmit the 22.8 kb/s TCH/FS information.
It even has a reserved capacity of 24.7 − 22.8 = 1.9 kb/s, which can be exploited to transmit slow control information associated with this specific traffic channel, i.e., to construct a so-called slow associated control channel (SACCH), constituted by the SACCH TDMA frames, interspersed with traffic frames at the multiframe level of the hierarchy, as seen in Fig. 27.2.

Mapping logical data traffic channels onto a physical channel is essentially carried out by the channel codecs [8], as specified in recommendation R.05.03. The full- and half-rate data traffic channels standardized in the GSM system are TCH/F9.6, TCH/F4.8, and TCH/F2.4, as well as TCH/H4.8 and TCH/H2.4, as was shown earlier in Table 27.2. Note that the numbers in these acronyms represent the data transmission rate in kilobits per second. Without considering the details of these mapping processes we now focus our attention on control signal transmission issues.

FIGURE 27.2: The GSM TDMA frame structure, © ETT [4]. [Frame hierarchy shown: 1 hyperframe = 2048 superframes = 2,715,648 TDMA frames (3 hours, 28 minutes, ...); 1 superframe = 1326 TDMA frames (6.12 s); 1 multiframe = 26 TDMA frames (120 ms), e.g., TCH/FS, or 51 TDMA frames (235 ms), e.g., BCCH; 1 TDMA frame = 8 timeslots (4.615 ms); 1 timeslot = 156.25 bit durations ≈ 0.577 ms (1 bit duration ≈ 3.69 µs). Normal burst shown: TB 3, 58 encrypted bits, 26-bit training segment, 58 encrypted bits, TB 3, GP 8.25.]

FIGURE 27.3: Mapping the TCH/FS logical channel onto a physical channel, © ETT [4]. [Stages shown: 260 bits/20 ms = 13 kbps, split into C1a (50 bits), C1b (132 bits), and C2 (78 bits); the parity check adds 3 bits to C1a; 50 + 3 + 132 + 4 = 189 bits enter the convolutional code (r = 1/2, k = 5); the 78 C2 bits are appended; subblocks 0 to 7 of blocks (n − 1) and (n) are combined into 114-bit bursts of the form 57, hl, hu, 57.]
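The bit budget and timing figures quoted for the TCH/FS channel can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only numbers stated in the text; the function and constant names are illustrative and are not taken from the recommendations.

```python
# Bit budget of the full-rate speech channel (TCH/FS), using figures quoted in the text.
CLASS_1A, CLASS_1B, CLASS_2 = 50, 132, 78   # speech bits per 20-ms frame
PARITY_BITS, TAIL_BITS = 3, 4               # (53, 50) block code, CC tailing bits
FRAME_MS = 20.0                             # speech frame duration


def tchfs_bit_budget():
    """Return the bit counts and rates of one 20-ms TCH/FS speech frame."""
    speech_bits = CLASS_1A + CLASS_1B + CLASS_2                # 260 b
    cc_input = CLASS_1A + PARITY_BITS + CLASS_1B + TAIL_BITS   # 189 b
    coded_bits = 2 * cc_input + CLASS_2                        # rate-1/2 CC plus class 2
    return {
        "speech_bits": speech_bits,
        "cc_input": cc_input,
        "coded_bits": coded_bits,
        "speech_rate_kbps": speech_bits / FRAME_MS,
        "coded_rate_kbps": coded_bits / FRAME_MS,
        "subblocks_of_57": coded_bits // 57,
    }


def burst_timing(bit_rate_kbps=271.0):
    """Return bit interval (us), timeslot (ms), TDMA frame (ms), slot throughput (kb/s)."""
    bit_us = 1e3 / bit_rate_kbps
    slot_ms = 156.25 * bit_us / 1e3
    frame_ms = 8 * slot_ms
    throughput_kbps = 114 / frame_ms       # 114 payload bits per burst
    return bit_us, slot_ms, frame_ms, throughput_kbps


if __name__ == "__main__":
    print(tchfs_bit_budget())
    print(burst_timing())
```

With the rounded bit rate of 271 kb/s the results agree with the quoted 3.69 µs, 0.577 ms, 4.615 ms, and 24.7 kb/s to within rounding.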

27.5 Transmission of Control Signals

The exact derivation, forward error correcting (FEC) coding, and mapping of logical control channel information is beyond the scope of this chapter, and the interested reader is referred to ETSI, 1988 (R.05.02 and R.05.03) and Hanzo and Stefanov, 1992, for a detailed discussion. As an example, the mapping of the 184-b SACCH, FACCH, BCCH, SDCCH, PCH, and access grant control channel (AGCH) messages onto a 456-b block, i.e., onto four 114-b bursts, is demonstrated in Fig. 27.4. A double-layer concatenated Fire-code/convolutional code scheme generates 456 bits, using an overall coding rate of R = 184/456, which gives stronger protection for control channels than the error protection of traffic channels.

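FIGURE 27.4: FEC in SACCH, FACCH, BCCH, SDCCH, PCH and AGCH, © ETT [4]. [Stages shown: 184 information bits enter the Fire code (224, 184) with generator polynomial G5(D) = D^40 + D^26 + D^23 + D^17 + D^3 + 1, which appends 40 parity bits; 4 tailing bits are added; the CC(2, 1, 5) code then outputs 456 bits.]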
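The outer code of Fig. 27.4 can be illustrated by ordinary systematic cyclic encoding with the quoted generator polynomial. The sketch below is only an illustration under that assumption: it appends the remainder of a GF(2) polynomial division as 40 parity bits, and it does not attempt to reproduce the bit ordering or any further parity processing laid down in R.05.03. All function names are hypothetical.

```python
# Generator polynomial exponents as quoted in Fig. 27.4.
G_EXPONENTS = (40, 26, 23, 17, 3, 0)
G = sum(1 << e for e in G_EXPONENTS)    # polynomial stored as an integer bit mask


def poly_mod(value, gen=G):
    """Remainder of value(D) divided by gen(D), with coefficients in GF(2)."""
    gen_deg = gen.bit_length() - 1
    while value.bit_length() - 1 >= gen_deg:
        value ^= gen << (value.bit_length() - 1 - gen_deg)
    return value


def fire_encode(info_bits):
    """Systematic encoding: 184 information bits followed by 40 parity bits."""
    assert len(info_bits) == 184
    msg = int("".join(map(str, info_bits)), 2) << 40   # info(D) * D^40
    word = msg | poly_mod(msg)                         # append the remainder
    return [(word >> i) & 1 for i in range(223, -1, -1)]


def control_block_size(info=184, parity=40, tail=4):
    """Bits leaving the rate-1/2 CC(2, 1, 5) encoder for one control message."""
    return 2 * (info + parity + tail)


if __name__ == "__main__":
    codeword = fire_encode([1, 0, 1, 1] * 46)
    print(len(codeword), control_block_size())
```

Every codeword produced this way is divisible by G5(D), and the block size works out as 2 · (184 + 40 + 4) = 456 b, i.e., the overall rate R = 184/456 quoted in the text.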

Returning to Fig. 27.2 we will now show how the SACCH is accommodated by the TDMA frame structure. The TCH/FS TDMA frames of the eight users are multiplexed into multiframes carrying 24 TCH/FS TDMA frames; the 13th frame of the multiframe carries a SACCH message rather than a TCH/FS frame, whereas the 26th frame is an idle or dummy frame, as seen at the left-hand side of Fig. 27.2 at the multiframe level of the traffic channel hierarchy. The general control channel frame structure shown at the right of Fig. 27.2 is discussed later. This way 24 TCH/FS frames are sent in a 26-frame multiframe during 26 · 4.615 = 120 ms. This reduces the traffic throughput to (24/26) · 24.7 = 22.8 kb/s required by TCH/FS, allocates (1/26) · 24.7 = 950 b/s to the SACCH, and wastes 950 b/s in the idle frame. Observe that the SACCH frame has eight timeslots to transmit the eight 950-b/s SACCHs of the eight users on the same carrier. The 950-b/s idle capacity will be used in the case of half-rate channels, where 16 users will be multiplexed onto alternate frames of the TDMA structure to increase system capacity. Then 16 encoded half-rate speech TCHs of 11.4 kb/s each will be transmitted in a 120-ms multiframe, where 16 SACCHs are also available.

The FACCH messages are transmitted via the physical channels provided by bits stolen from their own host traffic channels. The construction of the FACCH bursts from 184 control bits is identical to that of the SACCH, as also shown in Fig. 27.4, but its 456-b frame is mapped onto eight consecutive 114-b TDMA traffic bursts, exactly as specified for TCH/FS. This is carried out by stealing the even bits of the first four and the odd bits of the last four bursts, which is signalled by setting hu = 1, hl = 0 and hu = 0, hl = 1 in the first and last bursts, respectively. The unprotected FACCH information rate is 184 b/20 ms = 9.2 kb/s, which is transmitted after concatenated error protection at a rate of 22.8 kb/s. The repetition delay is 20 ms, and the interleaving delay is 8 · 4.615 = 37 ms, resulting in a total delay of 57 ms.

In Fig. 27.2 at the next hierarchical level, 51 TCH/FS multiframes are multiplexed into one superframe lasting 51 · 120 ms = 6.12 s, which contains 26 · 51 = 1326 TDMA frames. With only 1326 TDMA frames, however, the frame number would be limited to 0 ≤ FN ≤ 1325, and an encryption rule relying on such a limited range of FN values would not be sufficiently secure. Therefore, 2048 superframes were amalgamated to form a hyperframe of 1326 · 2048 = 2,715,648 TDMA frames lasting 2048 · 6.12 s ≈ 3 h 28 min, allowing a sufficiently high FN value to be used in the encryption algorithm. The uplink and downlink traffic-frame structures are identical with a shift of three timeslots between them, which relieves the MS from having to transmit and receive simultaneously, preventing high-level transmitted power leakage back to the sensitive receiver. The received power of adjacent BSs can be monitored during unallocated timeslots.

In contrast to duplex traffic and associated control channels, the simplex BCCH and CCCH logical channels of all MSs roaming in a specific cell share the physical channel provided by timeslot zero of the so-called BCCH carriers available in the cell. Furthermore, as demonstrated by the right-hand side section of Fig. 27.2, 51 BCCH and CCCH TDMA frames are mapped onto a 51 · 4.615 = 235-ms duration multiframe, rather than onto a 26-frame, 120-ms duration multiframe. In order to compensate for the extended multiframe length of 235 ms, 26 multiframes constitute a 1326-frame superframe of 6.12-s duration. Note in Fig. 27.5 that the allocation of the uplink and downlink frames is different, since these control channels exist only in one direction.
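The throughput split of the traffic multiframe and the durations of the higher hierarchy levels follow directly from the numbers above. The sketch below recomputes them, assuming only the 120-ms, 26-frame multiframe and the 114 payload bits per burst quoted in the text; the names are illustrative.

```python
FRAME_MS = 120.0 / 26            # one TDMA frame, about 4.615 ms
SLOT_KBPS = 114 / FRAME_MS       # one timeslot carries 114 payload bits per frame


def traffic_multiframe_split():
    """Return (TCH/FS, SACCH, idle) shares of one timeslot in kb/s."""
    tch = 24 / 26 * SLOT_KBPS
    sacch = 1 / 26 * SLOT_KBPS
    idle = 1 / 26 * SLOT_KBPS
    return tch, sacch, idle


def hierarchy():
    """Return frame counts and durations (s) of the superframe and hyperframe."""
    superframe_frames = 26 * 51
    hyperframe_frames = superframe_frames * 2048
    superframe_s = superframe_frames * FRAME_MS / 1e3
    hyperframe_s = hyperframe_frames * FRAME_MS / 1e3
    return superframe_frames, hyperframe_frames, superframe_s, hyperframe_s


def hms(seconds):
    """Split a duration in seconds into whole hours, minutes, and seconds."""
    s = int(seconds)
    return s // 3600, (s % 3600) // 60, s % 60


if __name__ == "__main__":
    print(traffic_multiframe_split())
    print(hierarchy(), hms(hierarchy()[3]))
```

The same arithmetic also gives the FACCH delay quoted above: 20 ms of repetition delay plus eight frames of interleaving, 8 · 4.615 ≈ 37 ms.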

FIGURE 27.5: The control multiframe, © ETT [4]. [Each multiframe spans 51 TDMA frames, 51 · 4.615 = 235 ms. (a) Uplink direction: all 51 frames carry R. (b) Downlink direction: F S B B B B C C C C, followed by four groups of F S C C C C C C C C, and a final I. Legend: R: random access channel; F: frequency correction channel; S: synchronisation channel; B: broadcast control channel; C: access grant/paging channel; I: idle frame.]
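The frame allocation of Fig. 27.5 is regular enough to be generated programmatically. The following sketch builds both 51-frame sequences from the legend of the figure; the single-letter labels follow the figure, whereas the function names are hypothetical.

```python
def downlink_control_multiframe():
    """Return the 51 frame labels of the downlink control multiframe."""
    frames = []
    for block in range(5):
        frames += ["F", "S"]                   # FCCH then SCH open every block of ten
        if block == 0:
            frames += ["B"] * 4 + ["C"] * 4    # four BCCH frames, four AGCH/PCH frames
        else:
            frames += ["C"] * 8                # eight AGCH/PCH frames
    frames.append("I")                         # the idle frame closes the multiframe
    return frames


def uplink_control_multiframe():
    """Return the 51 frame labels of the uplink control multiframe."""
    return ["R"] * 51                          # every uplink frame is a RACH opportunity


if __name__ == "__main__":
    print("".join(downlink_control_multiframe()))
    print("".join(uplink_control_multiframe()))
```

In this layout the SCH frames fall at positions 1, 11, 21, 31, and 41 of the multiframe when counting from zero.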

Specifically, the random access channel (RACH) is only used by the MSs in the uplink direction if they request, for example, a bidirectional SDCCH to be mapped onto an RF channel to register with the network and set up a call. The uplink RACH has a low capacity, carrying messages of 8 b per 235-ms multiframe, which is equivalent to an unprotected control information rate of 34 b/s. These messages are concatenated FEC coded to a rate of 36 b/235 ms = 153 b/s. They are not transmitted by the NB derived for TCH/FS, SACCH, or FACCH logical channels, but by the AB, depicted in Fig. 27.6 in comparison to a NB and other types of bursts to be described later. The FEC-coded, encrypted 36-b AB messages of Fig. 27.6 contain, among other parameters, the encoded 6-b BS identifier code (BSIC) constituted by the 3-b PLMN color code and 3-b BS color code for unique BS identification. These 36 b are positioned after the 41-b synchronization sequence, which has a long word length in order to ensure reliable access burst recognition and a low probability of being emulated by interfering stray data. These messages have no interleaving delay, while they are transmitted with a repetition delay of one control multiframe length, i.e., 235 ms.

FIGURE 27.6: GSM burst structures, © ETT [4]. [1 TDMA frame = 8 timeslots (0 to 7); 1 timeslot = 156.25 bit durations. Field lengths in bit durations:
Normal burst: tail bits 3, encrypted bits 58, training sequence 26, encrypted bits 58, tail bits 3, guard period 8.25.
Frequency correction burst: tail bits 3, fixed bits 142, tail bits 3, guard period 8.25.
Synchronisation burst: tail bits 3, encrypted sync bits 39, extended training sequence 64, encrypted sync bits 39, tail bits 3, guard period 8.25.
Access burst: tail bits 8, synchro sequence 41, encrypted bits 36, tail bits 3, guard period 68.25.]
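As a consistency check on Fig. 27.6, each of the four burst types should fill exactly one 156.25-b timeslot. The sketch below tabulates the field lengths read from the figure and sums them; the dictionary layout and names are illustrative only.

```python
# Field lengths in bit durations, as listed in Fig. 27.6.
BURSTS = {
    "normal": [("tail", 3), ("encrypted", 58), ("training", 26),
               ("encrypted", 58), ("tail", 3), ("guard", 8.25)],
    "frequency_correction": [("tail", 3), ("fixed", 142),
                             ("tail", 3), ("guard", 8.25)],
    "synchronisation": [("tail", 3), ("encrypted_sync", 39), ("ext_training", 64),
                        ("encrypted_sync", 39), ("tail", 3), ("guard", 8.25)],
    "access": [("tail", 8), ("synchro", 41), ("encrypted", 36),
               ("tail", 3), ("guard", 68.25)],
}
BIT_US = 3.69    # bit interval quoted in the text


def burst_length(name):
    """Total length of a burst, guard period included, in bit durations."""
    return sum(length for _, length in BURSTS[name])


def guard_period_us(name):
    """Guard period of a burst in microseconds."""
    return dict(BURSTS[name])["guard"] * BIT_US


if __name__ == "__main__":
    for name in BURSTS:
        print(name, burst_length(name), round(guard_period_us(name), 1))
```

The access burst trades payload for its 68.25-b guard period, which is about 252 µs, whereas the other bursts keep only 8.25 b, about 30 µs.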

Adaptive time frame alignment is a technique designed to equalize propagation delay differences between MSs at different distances. The GSM system is designed to allow for cell sizes up to 35 km radius. The time a radio signal takes to travel the 70 km from the base station to the mobile station and back again is 233.3 µs. As signals from all the mobiles in the cell must reach the base station without overlapping each other, a long guard period of 68.25 b (252 µs) is provided in the access burst, which exceeds the maximum possible propagation delay of 233.3 µs. This long guard period in the access burst is needed when the mobile station attempts its first access to the base station or after a handover has occurred. When the base station detects a 41-b random access synchronization sequence with a long guard period, it measures the received signal delay relative to the expected signal from a mobile station of zero range. This delay, called the timing advance, is signalled using a 6-b number to the mobile station, which advances its timebase over the range of 0–63 b, i.e., in units of 3.69 µs. By this process the TDMA bursts arrive at the BS in their correct timeslots and do not overlap with adjacent ones. This process allows the guard period in all other bursts to be reduced to only 8.25 b, i.e., 8.25 · 3.69 µs ≈ 30.46 µs. During normal operation, the BS continuously monitors the signal delay from the MS and, if necessary, it will instruct the MS to update its timing advance parameter. In very large cells there is an option to actively utilize only every second timeslot in order to cope with higher propagation delays; this is spectrally inefficient, but admissible in these large, low-traffic rural cells.

As demonstrated by Fig. 27.2, the downlink multiframe transmitted by the BS is shared amongst a number of BCCH and CCCH logical channels. In particular, the last frame is an idle frame (I), whereas the remaining 50 frames are divided into five blocks of ten frames, where each block starts with a frequency correction channel (FCCH) followed by a synchronization channel (SCH). In the first block of ten frames the FCCH and SCH frames are followed by four BCCH frames and by either four AGCH or four PCH frames. In the remaining four blocks of ten frames, the last eight frames are devoted to either PCHs or AGCHs, which are mutually exclusive for a specific MS being either paged or granted a control channel.

The FCCH, SCH, and RACH require special transmission bursts, tailored to their missions, as depicted in Fig. 27.6. The FCCH uses frequency correction bursts (FCB) hosting a specific 142-b pattern. In partial response GMSK it is possible to design a modulating data sequence that results in a near-sinusoidal modulated signal imitating an unmodulated carrier exhibiting a fixed frequency offset from the RF carrier utilized. The synchronization channel transmits SBs hosting a 16 · 4 = 64-b extended sequence exhibiting a high correlation peak in order to allow frame alignment with a quarter-bit accuracy. Furthermore, the SB contains 2 · 39 = 78 encrypted FEC-coded synchronization bits, hosting the BS and PLMN color codes, each representing one of eight legitimate identifiers. Lastly, the ABs contain an extended 41-b synchronization sequence, and they are invoked to facilitate initial access to the system. Their long guard space of 68.25-b duration prevents frame overlap before the MS’s distance, i.e., the propagation delay, becomes known to the BS and can be compensated for by adjusting the MS’s timing advance.
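The relationship between cell radius, round-trip delay, and the 6-b timing advance can be made concrete with a short calculation. The sketch below assumes a propagation speed of 300,000 km/s and the 3.69-µs bit interval used in the text; rounding the delay to the nearest bit is an assumption of this illustration, not a statement of how the recommendations quantize the value.

```python
C_KM_PER_US = 0.3     # speed of light: 300,000 km/s = 0.3 km/us
BIT_US = 3.69         # bit interval quoted in the text


def round_trip_us(distance_km):
    """BS-MS-BS propagation delay in microseconds."""
    return 2 * distance_km / C_KM_PER_US


def timing_advance(distance_km):
    """Timing advance in bit intervals, clipped to the 6-b range 0..63."""
    ta = round(round_trip_us(distance_km) / BIT_US)   # nearest-bit rounding (assumed)
    return max(0, min(63, ta))


def max_range_km():
    """One-way distance corresponding to the largest timing advance of 63 b."""
    return 63 * BIT_US * C_KM_PER_US / 2


if __name__ == "__main__":
    for d in (0, 1, 10, 35):
        print(d, round(round_trip_us(d), 1), timing_advance(d))
    print(round(max_range_km(), 2))
```

A MS at the 35-km cell edge needs the full 63-b advance, and its uncompensated round-trip delay of 233.3 µs still fits inside the 68.25 · 3.69 ≈ 252-µs guard period of the access burst.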

27.6 Synchronization Issues

Although some synchronization issues are standardized in recommendations R.05.02 and R.05.03, the GSM recommendations do not specify the exact BS-MS synchronization algorithms to be used; these are left to the equipment manufacturers. A unique set of timebase counters, however, is defined in order to ensure perfect BS-MS synchronism. The BS sends FCB and SB on specific timeslots of the BCCH carrier to the MS to ensure that the MS’s frequency standard is perfectly aligned with that of the BS, as well as to inform the MS about the required initial state of its internal counters. The MS transmits its uniquely numbered traffic and control bursts staggered by three timeslots with respect to those of the BS to prevent simultaneous MS transmission and reception, and also takes into account the required timing advance (TA) to cater for different BS-MS-BS round-trip delays.

The timebase counters used to uniquely describe the internal timing states of BSs and MSs are the quarter-bit number (QN = 0–624) counting the quarter-bit intervals in bursts, bit number (BN = 0–156), timeslot number (TN = 0–7), and TDMA frame number (FN = 0 to 26 · 51 · 2048 − 1), given in the order of increasing interval duration. The MS sets up its timebase counters after receiving a SB by determining QN from the 64-b extended training sequence in the center of the SB, setting TN = 0, and decoding the 78 encrypted, protected bits carrying the 25 SCH control bits. The SCH carries frame synchronization information as well as BS identification information to the MS, as seen in Fig. 27.7, and it is provided solely to support the operation of the radio subsystem. The first 6 b of the 25-b segment consist of three PLMN color code bits and three BS color code bits supplying a unique BS identifier code (BSIC) to inform the MS which BS it is communicating with. The second, 19-b segment is the so-called reduced TDMA frame number (RFN) derived from the full TDMA frame number FN, constrained to the range 0 to (26 · 51 · 2048) − 1 = 2,715,647, in terms of three subsegments T1, T2, and T3′. These subsegments are computed as follows:

T1 (11 b) = FN div (26 · 51),
T2 (5 b) = FN mod 26,
T3′ (3 b) = (T3 − 1) div 10, where T3 = FN mod 51,

whereas div and mod represent the integer division and modulo operations, respectively. Explicitly, in Fig. 27.7 T1 determines the superframe index in a hyperframe, T2 the multiframe index in a superframe, and T3 the frame index in a multiframe, whereas T3′ is the so-called signalling block index [1–5] of a frame in a specific 51-frame control multiframe, and their roles are best understood by referring to Fig. 27.2. Once the MS has received the SB, it readily computes the FN required in various control algorithms, such as encryption, handover, etc., as

FN = 51 · [(T3 − T2) mod 26] + T3 + 51 · 26 · T1, where T3 = 10 · T3′ + 1.

FIGURE 27.7: Synchronization channel (SCH) message format, © ETT [4]. [Fields shown: BSIC (6 bits) = PLMN colour (3 bits) + BS colour (3 bits); RFN (19 bits) = T1, superframe index (11 bits) + T2, multiframe index (5 bits) + T3′, block frame index (3 bits).]
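The reduced frame number can be exercised in both directions to confirm that the reconstruction formula inverts the three subsegment definitions. The sketch below assumes that the SB is sent in frames with FN mod 51 equal to 1, 11, 21, 31, or 41, which is what T3 = 10 · T3′ + 1 implies; the function names are hypothetical.

```python
SUPERFRAME = 26 * 51               # 1326 TDMA frames
HYPERFRAME = SUPERFRAME * 2048     # 2,715,648 TDMA frames


def sch_fields(fn):
    """Split the frame number of an SCH frame into (T1, T2, T3')."""
    t3 = fn % 51
    if t3 % 10 != 1:
        raise ValueError("not an SCH frame: FN mod 51 must be 1, 11, 21, 31 or 41")
    t1 = fn // SUPERFRAME          # superframe index, 11 b
    t2 = fn % 26                   # 5-b field
    t3_prime = (t3 - 1) // 10      # signalling block index, 3 b
    return t1, t2, t3_prime


def frame_number(t1, t2, t3_prime):
    """Rebuild FN from the reduced frame number carried by the SB."""
    t3 = 10 * t3_prime + 1
    return 51 * ((t3 - t2) % 26) + t3 + 51 * 26 * t1


if __name__ == "__main__":
    fn = 2 * SUPERFRAME + 11
    fields = sch_fields(fn)
    print(fields, frame_number(*fields) == fn)
```

The inversion works because 26 and 51 have no common factor, so the pair (FN mod 26, FN mod 51) identifies one frame within a superframe. Python's % operator returns a non-negative result for a positive modulus, which matches the mod operation of the text when T3 − T2 is negative.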

27.7 Gaussian Minimum Shift Keying Modulation

The GSM system uses constant envelope partial response GMSK modulation [6] specified in recommendation R.05.04. Constant envelope, continuous-phase modulation schemes are robust against signal fading as well as interference and have good spectral efficiency. The slower and smoother the phase changes are, the better the spectral efficiency is, since the signal is allowed to change less abruptly, requiring lower frequency components. The effect of an input bit, however, is spread over several bit periods, leading to a so-called partial response system, which requires a channel equalizer in order to remove this controlled, intentional intersymbol interference (ISI) even in the absence of uncontrolled channel dispersion.

The widely employed partial response GMSK scheme is derived from the full response minimum shift keying (MSK) scheme. In MSK the phase changes between adjacent bit periods are piecewise linear, which results in a discontinuous phase derivative, i.e., instantaneous frequency, at the signalling instants, and hence widens the spectrum. Smoothing these phase changes by a filter having a Gaussian impulse response [6], which is known to have the lowest possible bandwidth, circumvents this problem, using the schematic of Fig. 27.8, where the GMSK signal is generated by modulating and adding two quadrature carriers. The key parameter of GMSK in controlling both bandwidth and interference resistance is the 3-dB down filter-bandwidth × bit interval product (B · T), referred to as normalized bandwidth. It was found that as the B · T product is increased from 0.2 to 0.5, the interference resistance is improved by approximately 2 dB at the cost of increased bandwidth occupancy, and the best compromise was achieved for B · T = 0.3. This corresponds to spreading the effect of 1 b over approximately 3-b intervals. The spectral efficiency gain due to higher interference tolerance and, hence, more dense frequency reuse was found to be more significant than the spectral loss caused by wider GMSK spectral lobes.
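The statement that B · T = 0.3 spreads the effect of one bit over roughly three bit intervals can be checked numerically. The sketch below uses the usual textbook closed form of the GMSK frequency pulse, i.e., a one-bit rectangular pulse passed through a Gaussian filter of 3-dB bandwidth B; this expression and the normalization T = 1 are assumptions of the illustration rather than quotations from R.05.04.

```python
import math


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def gmsk_frequency_pulse(t, bt=0.3):
    """Frequency pulse g(t), with time t measured in bit intervals (T = 1)."""
    a = 2.0 * math.pi * bt / math.sqrt(math.log(2.0))
    return 0.5 * (q_func(a * (t - 0.5)) - q_func(a * (t + 0.5)))


def pulse_area(lo, hi, bt=0.3, steps=2000):
    """Trapezoidal integral of g(t) between lo and hi."""
    h = (hi - lo) / steps
    total = 0.5 * (gmsk_frequency_pulse(lo, bt) + gmsk_frequency_pulse(hi, bt))
    for k in range(1, steps):
        total += gmsk_frequency_pulse(lo + k * h, bt)
    return total * h


if __name__ == "__main__":
    total = pulse_area(-5.0, 5.0)
    print("within 1 bit :", pulse_area(-0.5, 0.5) / total)
    print("within 3 bits:", pulse_area(-1.5, 1.5) / total)
```

For B · T = 0.3 only about two thirds of the pulse area lies within the bit's own interval, whereas more than 99% lies within three bit intervals. Repeating the calculation with bt=0.5 shows a more compact pulse, i.e., less controlled ISI but a wider spectrum.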

FIGURE 27.8: GMSK modulator schematic diagram, © ETT [4]. [Blocks shown: Gaussian filter (frequency pulse shaping); integration over dt (phase pulse shaping) yielding φ(t, αn); cos[φ(t, αn)] and sin[φ(t, αn)] branches multiplied by cos ωt and −sin ωt, respectively; summed output cos[ωt + φ(t, αn)].]

The channel separation at the TDMA burst rate of 271 kb/s is 200 kHz, and the modulated spectrum must be 40 dB down at both adjacent carrier frequencies. When TDMA bursts are transmitted in an on-off keyed mode, further spectral spillage arises, which is mitigated by a smooth power ramp up and down envelope at the leading and trailing edges of the transmission bursts, attenuating the signal by 70 dB during a 28- and 18-µs interval, respectively.

27.8 Wideband Channel Models

The set of 6-tap GSM impulse responses [2] specified in recommendation R.05.05 is depicted in Fig. 27.9, where the individual propagation paths are independent Rayleigh fading paths, weighted by the appropriate coefficients hi corresponding to their relative powers portrayed in the figure. In simple terms the wideband channel’s impulse response is measured by transmitting an impulse and detecting the received echoes at the channel’s output in every D-spaced so-called delay bin. In some bins no delayed and attenuated multipath component is received, whereas in others significant energy is detected, depending on the typical reflecting objects and their distance from the receiver. The path delay can be easily related to the distance of the reflecting objects, since radio waves travel at the speed of light. For example, at a speed of 300,000 km/s, a reflecting object situated at a distance of 0.15 km yields a multipath component at a round-trip delay of 1 µs.

The typical urban (TU) impulse response spreads over a delay interval of 5 µs, which is almost two 3.69-µs bit intervals in duration and, therefore, results in serious ISI. In simple terms, it can be treated as a two-path model, where the reflected path has a length of 0.75 km, corresponding to a reflector located at a distance of about 375 m. The hilly terrain (HT) model has a sharply decaying short-delay section due to local reflections and a long-delay path around 15 µs due to distant reflections. Therefore, in practical terms it can be considered a two- or three-path model having reflections from a distance of about 2 km. The rural area (RA) response seems the least hostile amongst all standardized responses, decaying rapidly inside one bit interval and, therefore, is expected to be easily combated by the channel equalizer. Although the type of the equalizer is not standardized, partial response systems typically use VEs. Since the RA channel effectively behaves as a single-path nondispersive channel, it would not require an equalizer. The fourth standardized impulse response is artificially contrived in order to test the equalizer’s performance and is constituted by six equidistant unit-amplitude impulses representing six equal-powered independent Rayleigh-fading paths with a delay spread over 16 µs.

With these impulse responses in mind, the required channel is simulated by summing the appropriately delayed and weighted received signal components. In all but one case the individual components are assumed to have a Rayleigh amplitude distribution, whereas in the RA model the main tap at zero delay is supposed to have a Rician distribution owing to the presence of a dominant line-of-sight path.

FIGURE 27.9: Typical GSM channel impulse responses, © ETT [4]. [Four panels: typical urban (TU), hilly terrain (HT), rural area (RA), and equaliser test (EQ) impulse responses; each plots relative power (0 to 1.2) against delay (0 to 20 µs).]
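The delay-to-distance conversion and the tapped-delay-line simulation described above can be sketched in a few lines. The example below models only the equaliser test (EQ) response, taking six unit-power taps spaced evenly from 0 to 16 µs; that spacing, the zero-mean complex Gaussian tap model for Rayleigh fading, and all names are assumptions made for illustration, and the standardized tap values for the TU, HT, and RA responses are not reproduced here.

```python
import math
import random

C_KM_PER_US = 0.3     # 300,000 km/s expressed in km per microsecond


def reflector_distance_km(round_trip_delay_us):
    """Distance of a reflector that causes the given round-trip delay."""
    return C_KM_PER_US * round_trip_delay_us / 2


def eq_test_profile(taps=6, spread_us=16.0):
    """(delay in us, mean power) pairs of the equaliser test response."""
    spacing = spread_us / (taps - 1)
    return [(k * spacing, 1.0) for k in range(taps)]


def rayleigh_snapshot(profile, rng):
    """Draw one complex tap gain per path; |gain| is Rayleigh distributed."""
    taps = []
    for delay_us, power in profile:
        sigma = math.sqrt(power / 2)     # per-component standard deviation
        taps.append((delay_us, complex(rng.gauss(0, sigma), rng.gauss(0, sigma))))
    return taps


if __name__ == "__main__":
    print(reflector_distance_km(1.0))    # reflector distance for a 1-us echo, in km
    rng = random.Random(0)
    for delay_us, gain in rayleigh_snapshot(eq_test_profile(), rng):
        print(delay_us, abs(gain))
```

The received signal would then be formed by summing copies of the transmitted signal, each delayed by a tap delay and multiplied by the corresponding complex gain. A Rician main tap, as in the RA model, would add a constant line-of-sight term to the first gain.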

27.9 Adaptive Link Control

The adaptive link control algorithm portrayed in Fig. 27.10 and specified in recommendation R.05.08 allows for the MS to favor that specific traffic cell which provides the highest probability of reliable communications associated with the lowest possible path loss. It also decreases interference with other cochannel users and, through dense frequency reuse, improves spectral efficiency, whilst maintaining an adequate communications quality, and facilitates a reduction in power consumption, which is particularly important in hand-held MSs. The handover process maintains a call in progress as the MS moves between cells, or when there is an unacceptable transmission quality degradation caused by interference, in which case an intracell handover to another carrier in the same cell is performed. A radio-link failure occurs when a call with an unacceptable voice or data quality cannot be improved either by RF power control or by handover. The reasons for the link failure may be loss of radio coverage or very high-interference levels. The link control procedures rely on measurements of the received RF signal strength (RXLEV), the received signal quality (RXQUAL), and the absolute distance between base and mobile stations (DISTANCE). RXLEV is evaluated by measuring the received level of the BCCH carrier which is continuously transmitted by the BS on all time slots of the B frames in Fig. 27.5 and without variations of the RF level. A MS measures the received signal level from the serving cell and from the BSs in all adjacent cells by tuning and listening to their BCCH carriers. The root mean squared level of the received signal is measured over a dynamic range from −103 to −41 dBm for intervals of one SACCH multiframe (480 ms). The received signal level is averaged over at least 32 SACCH frames (≈15 s) and mapped to give RXLEV values between 0 and 63 to cover the range from −103 to −41 dBm in steps of 1 dB. 
The RXLEV parameters are then coded into 6-b words for transmission to the serving BS via the SACCH. RXQUAL is estimated by measuring the bit error ratio (BER) before channel decoding, using the Viterbi channel equalizer’s metrics [6] and/or those of the Viterbi convolutional decoder [8]. Eight values of RXQUAL span the logarithmically scaled BER range of 0.2–12.8% before channel decoding. The absolute DISTANCE between base and mobile stations is measured using the timing advance parameter. The timing advance is coded as a 6-b number corresponding to a propagation delay from 0 to 63 · 3.69 µs = 232.6 µs, characteristic of a cell radius of 35 km. While roaming, the MS needs to identify which potential target BS it is measuring, and the BCCH carrier frequency may not be sufficient for this purpose, since in small cluster sizes the same BCCH frequency may be used in more than one surrounding cell. To avoid ambiguity a 6-b BSIC is transmitted on each BCCH carrier in the SB of Fig. 27.6. Two other parameters transmitted in the BCCH data provide additional information about the BS. The binary flag called PLMN PERMITTED indicates whether the measured BCCH carrier belongs to a PLMN that the MS is permitted to access. The second Boolean flag, CELL BAR ACCESS, indicates whether the cell is barred for access by the MS, although it belongs to a permitted PLMN.

FIGURE 27.10: Initial cell selection by the MS, © ETT [4].

A MS in idle mode, i.e., after it has just been switched on or after it has lost contact with the network, searches all 125 RF channels and takes readings of RXLEV on each of them. Then it tunes to the carrier with the highest RXLEV and searches for FCB in order to determine whether or not the carrier is a BCCH carrier. If it is not, then the MS tunes to the next highest carrier, and so on, until it finds a BCCH carrier, synchronizes to it and decodes the parameters BSIC, PLMN PERMITTED and CELL BAR ACCESS in order to decide whether to continue the search. The MS may store the BCCH carrier frequencies used in the network accessed, in which case the search time would be reduced. Again, the process described is summarized in the flowchart of Fig. 27.10.

The adaptive power control is based on RXLEV measurements. In every SACCH multiframe the BS compares the RXLEV readings reported by the MS or obtained by the base station with a set of thresholds. The exact strategy for RF power control is determined by the network operator with the aim of providing an adequate quality of service for speech and data transmissions while keeping interferences low. Clearly, adequate quality must be achieved at the lowest possible transmitted power to keep cochannel interferences low, which implies contradictory requirements in terms of transmitted power.
The criteria for reporting radio link failure are based on the measurements of RXLEV and RXQUAL performed by both the mobile and base stations, and the procedures for handling link failures result in the re-establishment or the release of the call, depending on the network operator’s strategy. The handover process involves the most complex set of procedures in the radio-link control. Handover decisions are based on results of measurements performed both by the base and mobile stations. The base station measures RXLEV, RXQUAL, DISTANCE, and also the interference level in unallocated time slots, whereas the MS measures and reports to the BS the values of RXLEV and RXQUAL for the serving cell and RXLEV for the adjacent cells. When the MS moves away from the BS, the RXLEV and RXQUAL parameters for the serving station become lower, whereas RXLEV for one of the adjacent cells increases.
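The RXLEV quantization and the timing-advance range quoted earlier in this section can be checked numerically. The following is a minimal sketch rather than the procedure of the recommendation: the function names are hypothetical, the treatment of the end bins (code 0 for anything below −103 dBm, code 63 for anything at or above −41 dBm) is an assumed reading of the 0 to 63 mapping, and the bit period of 48/13 µs is an assumption chosen to agree with the quoted 3.69 µs.

```python
import math

RXLEV_FLOOR_DBM = -103.0        # lower edge of the range quoted in the text
BIT_PERIOD_US = 48.0 / 13.0     # assumed bit period, about 3.69 microseconds
SPEED_OF_LIGHT_M_S = 299_792_458.0


def rxlev_from_dbm(level_dbm: float) -> int:
    """Map a received level in dBm to a 6-bit RXLEV code in 1-dB steps.

    Assumption: code 0 collects everything below -103 dBm and code 63
    everything at or above -41 dBm, so codes 1..62 each span 1 dB.
    """
    code = math.floor(level_dbm - RXLEV_FLOOR_DBM) + 1
    return max(0, min(63, code))


def timing_advance_delay_us(ta: int) -> float:
    """Propagation delay represented by a 6-bit timing advance value."""
    if not 0 <= ta <= 63:
        raise ValueError("timing advance is a 6-bit number (0..63)")
    return ta * BIT_PERIOD_US


def timing_advance_range_km(ta: int) -> float:
    """One-way distance, since the timing advance covers the round trip."""
    delay_s = timing_advance_delay_us(ta) * 1e-6
    return delay_s * SPEED_OF_LIGHT_M_S / 2.0 / 1000.0
```

With the maximum timing advance of 63 the sketch gives a delay close to the quoted 232.6 µs and a one-way distance just under 35 km, which is consistent with the cell radius stated above.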

27.10

Discontinuous Transmission

Discontinuous transmission (DTX) issues are standardized in recommendation R.06.31, whereas the associated problems of voice activity detection (VAD) are specified by R.06.32. Assuming an average speech activity of 50% and a high number of interferers combined with frequency hopping to randomize the interference load, significant spectral efficiency gains can be achieved when deploying discontinuous transmissions due to decreasing interferences, while reducing power dissipation as well. Because of the reduction in power consumption, full DTX operation is mandatory for MSs, but in BSs, only receiver DTX functions are compulsory. The fundamental problem in voice activity detection is how to differentiate between speech and noise, while keeping false noise triggering and speech spurt clipping as low as possible. In vehicle-mounted MSs the severity of the speech/noise recognition problem is aggravated by the excessive vehicle background noise. This problem is resolved by deploying a combination of threshold comparisons and spectral domain techniques [1, 3]. Another important associated problem is the introduction of noiseless inactive segments, which is mitigated by comfort noise insertion (CNI) in these segments at the receiver.

27.11

Summary

Following the standardization and launch of the GSM system its salient features were summarized in this brief review. Time division multiple access (TDMA) with eight users per carrier is used at a multiuser rate of 271 kb/s, demanding a channel equalizer to combat dispersion in large cell environments. The error protected bit rate of the full-rate traffic channels is 22.8 kb/s, whereas in half-rate channels it is 11.4 kb/s. Apart from the full- and half-rate speech traffic channels, there are 5 different rate data traffic channels and 14 various control and signalling channels to support the system’s operation. A moderately complex, 13 kb/s regular pulse excited speech codec with long term predictor (LTP) is used, combined with an embedded three-class error correction codec and multilayer interleaving to provide sensitivity-matched unequal error protection for the speech bits. An overall speech delay of 57.5 ms is maintained. Slow frequency hopping at 217 hops/s yields substantial performance gains for slowly moving pedestrians.

Constant envelope partial response GMSK with a channel spacing of 200 kHz is deployed to support 125 duplex channels in the 890–915-MHz up-link and 935–960-MHz down-link bands, respectively. At a transmission rate of 271 kb/s a spectral efficiency of 1.35 b/s/Hz is achieved. The controlled GMSK-induced and uncontrolled channel-induced intersymbol interferences are removed by the channel equalizer. The set of standardized wideband GSM channels was introduced in order to provide benchmarks for performance comparisons. Efficient power budgeting and minimum cochannel interferences are ensured by the combination of adaptive power and handover control based on weighted averaging of up to eight up-link and down-link system parameters. Discontinuous transmissions assisted by reliable spectral-domain voice activity detection and comfort-noise insertion further reduce interferences and power consumption. Because of ciphering, no unprotected information is sent via the radio link. As a result, spectrally efficient, high-quality mobile communications with a variety of services and international roaming is possible in cells of up to 35 km radius for signal-to-noise and interference ratios in excess of 10–12 dB. The key system features are summarized in Table 27.3.

TABLE 27.3 Summary of GSM Features

System feature                    Specification
Up-link bandwidth, MHz            890–915 = 25
Down-link bandwidth, MHz          935–960 = 25
Total GSM bandwidth, MHz          50
Carrier spacing, kHz              200
No. of RF carriers                125
Multiple access                   TDMA
No. of users/carrier              8
Total No. of channels             1000
TDMA burst rate, kb/s             271
Modulation                        GMSK with BT = 0.3
Bandwidth efficiency, b/s/Hz      1.35
Channel equalizer                 yes
Speech coding rate, kb/s          13
FEC coded speech rate, kb/s       22.8
FEC coding                        Embedded block/convolutional
Frequency hopping, hop/s          217
DTX and VAD                       yes
Maximum cell radius, km           35
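Several entries of Table 27.3 follow from one another by simple arithmetic. The sketch below recomputes them as a consistency check. The function and constant names are hypothetical, and two values are assumptions that do not appear in this section: a time slot of 156.25 bit periods and an exact bit rate of 13 MHz / 48, which the table rounds to 271 kb/s. One hop per TDMA frame is assumed for the hopping rate.

```python
UPLINK_BAND_MHZ = (890.0, 915.0)   # from Table 27.3
CARRIER_SPACING_KHZ = 200.0        # from Table 27.3
USERS_PER_CARRIER = 8              # from Table 27.3
BURST_RATE_KBPS = 271.0            # from Table 27.3

# Assumptions, not stated in this section:
BIT_PERIODS_PER_SLOT = 156.25      # assumed slot length in bit periods
EXACT_BIT_RATE_BPS = 13e6 / 48     # assumed exact rate, about 270.833 kb/s


def gsm_summary() -> dict:
    """Recompute the derived figures of Table 27.3 from its basic entries."""
    bandwidth_khz = (UPLINK_BAND_MHZ[1] - UPLINK_BAND_MHZ[0]) * 1000.0
    carriers = int(bandwidth_khz / CARRIER_SPACING_KHZ)
    frame_duration_s = USERS_PER_CARRIER * BIT_PERIODS_PER_SLOT / EXACT_BIT_RATE_BPS
    return {
        "carriers": carriers,
        "channels": carriers * USERS_PER_CARRIER,
        "efficiency_bps_per_hz": BURST_RATE_KBPS / CARRIER_SPACING_KHZ,
        "frame_duration_ms": frame_duration_s * 1000.0,
        "hops_per_second": 1.0 / frame_duration_s,  # one hop per TDMA frame
    }
```

Under these assumptions the result reproduces the 125 carriers and 1000 channels of the table, a bandwidth efficiency of 1.355 b/s/Hz (listed as 1.35), and a hopping rate that rounds to 217 hops/s.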

Defining Terms

A3: Authentication algorithm
A5: Cyphering algorithm
A8: Confidential algorithm to compute the cyphering key
AB: Access burst
ACCH: Associated control channel
ADC: Administration center
AGCH: Access grant control channel
AUC: Authentication center
AWGN: Additive white Gaussian noise
BCCH: Broadcast control channel
BER: Bit error ratio
BFI: Bad frame indicator flag
BN: Bit number
BS: Base station
BS-PBGT: BS power budget: to be evaluated for power budget motivated handovers
BSIC: Base station identifier code
CC: Convolutional codec
CCCH: Common control channel
CELL BAR ACCESS: Boolean flag to indicate whether the MS is permitted to access the specific traffic cell
CNC: Comfort noise computation
CNI: Comfort noise insertion
CNU: Comfort noise update state in the DTX handler
DB: Dummy burst
DL: Down link
DSI: Digital speech interpolation to improve link efficiency
DTX: Discontinuous transmission for power consumption and interference reduction
EIR: Equipment identity register
EOS: End of speech flag in the DTX handler
FACCH: Fast associated control channel
FCB: Frequency correction burst
FCCH: Frequency correction channel
FEC: Forward error correction
FH: Frequency hopping
FN: TDMA frame number
GMSK: Gaussian minimum shift keying
GP: Guard space
HGO: Hangover in the VAD
HLR: Home location register
HO: Handover
HOCT: Hangover counter in the VAD
HO MARGIN: Handover margin to facilitate hysteresis
HSN: Hopping sequence number: frequency hopping algorithm’s input variable
IMSI: International mobile subscriber identity
ISDN: Integrated services digital network
LAI: Location area identifier
LAR: Logarithmic area ratio
LTP: Long term predictor
MA: Mobile allocation: set of legitimate RF channels, input variable in the frequency hopping algorithm
MAI: Mobile allocation index: output variable of the FH algorithm
MAIO: Mobile allocation index offset: initial RF channel offset, input variable of the FH algorithm
MS: Mobile station
MSC: Mobile switching center
MSRN: Mobile station roaming number
MS TXPWR MAX: Maximum permitted MS transmitted power on a specific traffic channel in a specific traffic cell
MS TXPWR MAX(n): Maximum permitted MS transmitted power on a specific traffic channel in the nth adjacent traffic cell
NB: Normal burst
NMC: Network management center
NUFR: Receiver noise update flag
NUFT: Noise update flag to ask for SID frame transmission
OMC: Operation and maintenance center
PARCOR: Partial correlation
PCH: Paging channel
PCM: Pulse code modulation
PIN: Personal identity number for MSs
PLMN: Public land mobile network
PLMN PERMITTED: Boolean flag to indicate whether the MS is permitted to access the specific PLMN
PSTN: Public switched telephone network
QN: Quarter bit number
R: Random number in the authentication process
RA: Rural area channel impulse response
RACH: Random access channel
RF: Radio frequency
RFCH: Radio frequency channel
RFN: Reduced TDMA frame number: equivalent representation of the TDMA frame number that is used in the synchronization channel
RNTABLE: Random number table utilized in the frequency hopping algorithm
RPE: Regular pulse excited
RPE-LTP: Regular pulse excited codec with long term predictor
RS-232: Serial data transmission standard equivalent to CCITT V.24 interface
RXLEV: Received signal level: parameter used in handovers
RXQUAL: Received signal quality: parameter used in handovers
S: Signed response in the authentication process
SACCH: Slow associated control channel
SB: Synchronization burst
SCH: Synchronization channel
SCPC: Single channel per carrier
SDCCH: Stand-alone dedicated control channel
SE: Speech extrapolation
SID: Silence identifier
SIM: Subscriber identity module in MSs
SPRX: Speech received flag
SPTX: Speech transmit flag in the DTX handler
STP: Short term predictor
TA: Timing advance
TB: Tailing bits
TCH: Traffic channel
TCH/F: Full-rate traffic channel
TCH/F2.4: Full-rate 2.4-kb/s data traffic channel
TCH/F4.8: Full-rate 4.8-kb/s data traffic channel
TCH/F9.6: Full-rate 9.6-kb/s data traffic channel
TCH/FS: Full-rate speech traffic channel
TCH/H: Half-rate traffic channel
TCH/H2.4: Half-rate 2.4-kb/s data traffic channel
TCH/H4.8: Half-rate 4.8-kb/s data traffic channel
TDMA: Time division multiple access
TMSI: Temporary mobile subscriber identifier
TN: Time slot number
TU: Typical urban channel impulse response
TXFL: Transmit flag in the DTX handler
UL: Up link
VAD: Voice activity detection
VE: Viterbi equalizer
VLR: Visiting location register

References

[1] European Telecommunications Standardization Institute. Group Speciale Mobile or Global System of Mobile Communication (GSM) Recommendation, ETSI Secretariat, Sophia Antipolis Cedex, France, 1988.
[2] Greenwood, D. and Hanzo, L., Characterisation of mobile radio channels. In Mobile Radio Communications, Steele, R., Ed., Chap. 2, 92–185, IEEE Press–Pentech Press, London, 1992.
[3] Hanzo, L. and Stefanov, J., The Pan-European digital cellular mobile radio system—known as GSM. In Mobile Radio Communications, Steele, R., Ed., Chap. 8, 677–773, IEEE Press–Pentech Press, London, 1992.
[4] Hanzo, L. and Steele, R., The Pan-European mobile radio system, Pts. 1 and 2, European Trans. on Telecomm., 5(2), 245–276, 1994.
[5] Salami, R.A., Hanzo, L., et al., Speech coding. In Mobile Radio Communications, Steele, R., Ed., Chap. 3, 186–346, IEEE Press–Pentech Press, London, 1992.
[6] Steele, R., Ed., Mobile Radio Communications, IEEE Press–Pentech Press, London, 1992.
[7] Vary, P. and Sluyter, R.J., MATS-D speech codec: Regular-pulse excitation LPC, Proceedings of Nordic Conference on Mobile Radio Communications, 257–261, 1986.
[8] Wong, K.H.H. and Hanzo, L., Channel coding. In Mobile Radio Communications, Steele, R., Ed., Chap. 4, 347–488, IEEE Press–Pentech Press, London, 1992.


Mermelstein, P. “Speech and Channel Coding for North American TDMA Cellular Systems” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Speech and Channel Coding for North American TDMA Cellular Systems

Paul Mermelstein
INRS-Télécommunications, University of Québec

28.1 Introduction
28.2 Modulation of Digital Voice and Data Signals
28.3 Speech Coding Fundamentals
28.4 Channel Coding Considerations
28.5 VSELP Encoder
28.6 Linear Prediction Analysis and Quantization
28.7 Bandwidth Expansion
28.8 Quantizing and Encoding the Reflection Coefficients
28.9 VSELP Codebook Search
28.10 Long-Term Filter Search
28.11 Orthogonalization of the Codebooks
28.12 Quantizing the Excitation and Signal Gains
28.13 Channel Coding and Interleaving
28.14 Bad Frame Masking
28.15 ACELP Encoder
28.16 Algebraic Codebook Structure and Search
28.17 Quantization of the Gains for ACELP Encoding
28.18 Channel Coding for ACELP Encoding
28.19 Conclusions
Defining Terms
References
Further Information

28.1

Introduction

The goals of this chapter are to give the reader a tutorial introduction and high-level understanding of the techniques employed for speech transmission by the IS-54 digital cellular standard. It builds on the information provided in the standards document but is not meant to be a replacement for it. Separate standards cover the control channel used for the setup of calls and their handoff to neighboring cells, as well as the encoding of data signals for transmission. For detailed implementation information the reader should consult the most recent standards document [9].

IS-54 provides for encoding bidirectional speech signals digitally and transmitting them over cellular and microcellular mobile radio systems. It retains the 30-kHz channel spacing of the earlier advanced mobile telephone service (AMPS), which uses analog frequency modulation for speech transmission and frequency shift keying for signalling. The two directions of transmission use frequencies some 45 MHz apart in the band between 824 and 894 MHz. AMPS employs one channel per conversation in each direction, a technique known as frequency division multiple access (FDMA). IS-54 employs time division multiple access (TDMA) by allowing three, and in the future six, simultaneous transmissions to share each frequency band. Because the overall 30-kHz channelization of the allocated 25 MHz of spectrum in each direction is retained, it is also known as an FDMA-TDMA system. In contrast, the later IS-95 standard employs code division multiple access (CDMA) over bands of 1.23 MHz by combining several 30-kHz frequency channels.

Each frequency channel provides for transmission at a digital bit rate of 48.6 kb/s through use of differential quadrature phase-shift keying (DQPSK) modulation at a 24.3-kBd channel rate. The channel is divided into six time slots every 40 ms. The full-rate voice coder employs every third time slot and utilizes 13 kb/s for combined speech and channel coding. The six slots provide for an eventual half-rate channel occupying one slot per 40-ms frame and utilizing only about 6.5 kb/s for each call. Thus, the simultaneous call-carrying capacity with IS-54 is increased by a factor of 3 (a factor of 6 in the future) above that of AMPS. All-digital transmission is expected to result in a reduction in transmitted power. The resulting reduction in intercell interference may allow more frequent reuse of the same frequency channels than the reuse pattern of seven cells for AMPS.
Additional increases in erlang capacity (the total call-carrying capacity at a given blocking rate) may be available from the increased trunking efficiency achieved by the larger number of simultaneously available channels. The first systems employing dual-mode AMPS and TDMA service were put into operation in 1993. In 1996 the TIA introduced the IS-641 enhanced full rate codec. This codec consists of 7.4 kb/s speech coding following the algebraic code-excited linear prediction (ACELP) technique [7], and 5.6 kb/s channel coding. The 13 kb/s coded information replaces the combined 13 kb/s for speech and channel coding introduced by the IS-54 standard. The new codec provides significant enhancements in terms of speech quality and robustness to transmission errors. The quality enhancement for clear channels results from the improved modeling of the stochastic excitation by means of an algebraic codebook instead of the two trained VSELP codebooks. Improved robustness to transmission errors is achieved by employing predictive quantization techniques for the linear-prediction filter and gain parameters, and increasing the number of bits protected by forward error correction.
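The rates quoted in this introduction fit together as follows. This is only an arithmetic sketch with hypothetical names. It assumes two bits per DQPSK symbol and that a full-rate user receives one slot every 20 ms (every third slot of six per 40-ms frame). The bits of each slot beyond the 13 kb/s of coded speech are not broken down in this chapter, so they are only reported as a remainder.

```python
SYMBOL_RATE_BAUD = 24_300     # channel rate from the text
BITS_PER_SYMBOL = 2           # assumption: DQPSK carries one dibit per symbol
FRAME_MS = 40                 # frame duration from the text
SLOTS_PER_FRAME = 6           # from the text
FULL_RATE_SLOT_STEP = 3       # the full-rate coder uses every third slot
CODED_SPEECH_RATE_BPS = 13_000
IS641_SPEECH_BPS = 7_400
IS641_CHANNEL_BPS = 5_600


def is54_rates() -> dict:
    """Derive per-slot bit counts from the channel and coder rates."""
    bit_rate = SYMBOL_RATE_BAUD * BITS_PER_SYMBOL
    bits_per_frame = bit_rate * FRAME_MS // 1000
    bits_per_slot = bits_per_frame // SLOTS_PER_FRAME
    slot_interval_ms = FRAME_MS * FULL_RATE_SLOT_STEP // SLOTS_PER_FRAME
    coded_bits_per_slot = CODED_SPEECH_RATE_BPS * slot_interval_ms // 1000
    return {
        "channel_bit_rate_bps": bit_rate,
        "bits_per_slot": bits_per_slot,
        "coded_bits_per_slot": coded_bits_per_slot,
        "other_bits_per_slot": bits_per_slot - coded_bits_per_slot,
        "users_per_carrier": FULL_RATE_SLOT_STEP,
        "is641_total_bps": IS641_SPEECH_BPS + IS641_CHANNEL_BPS,
    }
```

The sketch recovers the 48.6 kb/s channel rate, 260 coded bits per slot for a full-rate user, and confirms that the IS-641 split of 7.4 and 5.6 kb/s fills the same 13 kb/s.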

28.2

Modulation of Digital Voice and Data Signals

The modulation method used in IS-54 is π/4 shifted differentially encoded quadrature phase-shift keying (DQPSK). Symbols are transmitted as changes in phase rather than their absolute values. The binary data stream is converted to two binary streams $X_k$ and $Y_k$ formed from the odd- and even-numbered bits, respectively. The quadrature streams $I_k$ and $Q_k$ are formed according to

\[
\begin{aligned}
I_k &= I_{k-1}\cos[\Delta\phi(X_k, Y_k)] - Q_{k-1}\sin[\Delta\phi(X_k, Y_k)] \\
Q_k &= I_{k-1}\sin[\Delta\phi(X_k, Y_k)] + Q_{k-1}\cos[\Delta\phi(X_k, Y_k)]
\end{aligned}
\]

where $I_{k-1}$ and $Q_{k-1}$ are the amplitudes at the previous pulse time. The phase change $\Delta\phi$ takes the values π/4, 3π/4, −π/4, and −3π/4 for the dibit $(X_k, Y_k)$ symbols (0,0), (0,1), (1,0) and (1,1), respectively. This results in a rotation by π/4 between the constellations for odd and even symbols.

The differential encoding avoids the problem of 180° phase ambiguity that may otherwise result in estimation of the carrier phase. The signals $I_k$ and $Q_k$ at the output of the differential phase encoder can take one of five values, 0, ±1, ±1/√2, as indicated in the constellation of Fig. 28.1. The corresponding impulses are applied to the inputs of the I and Q baseband filters, which have linear phase and square root raised cosine frequency responses. The generic modulator circuit is shown in Fig. 28.2. The rolloff factor α determines the width of the transition band and its value is 0.35. With T the symbol period, the filter response is

\[
|H(f)| =
\begin{cases}
1, & 0 \le f \le (1-\alpha)/2T \\
\sqrt{\tfrac{1}{2}\{1 - \sin[\pi(2fT - 1)/2\alpha]\}}, & (1-\alpha)/2T \le f \le (1+\alpha)/2T \\
0, & f > (1+\alpha)/2T
\end{cases}
\]
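The differential encoder and the baseband filter response described in this section can be sketched in a few lines. This is an illustration under stated assumptions, not the reference implementation of the standard: the function names are hypothetical and the starting state (I, Q) = (1, 0) before the first symbol is an arbitrary choice.

```python
import math

# Phase change for each dibit (X_k, Y_k), as listed in the text
PHASE_STEP = {
    (0, 0): math.pi / 4,
    (0, 1): 3 * math.pi / 4,
    (1, 0): -math.pi / 4,
    (1, 1): -3 * math.pi / 4,
}


def dqpsk_encode(bits, i0=1.0, q0=0.0):
    """Return the (I_k, Q_k) pairs for an even-length bit sequence.

    Odd-numbered bits (1st, 3rd, ...) form X_k, even-numbered bits form Y_k.
    Assumption: the state before the first symbol is (i0, q0) = (1, 0).
    """
    if len(bits) % 2:
        raise ValueError("need an even number of bits")
    i_prev, q_prev = i0, q0
    symbols = []
    for k in range(0, len(bits), 2):
        dphi = PHASE_STEP[(bits[k], bits[k + 1])]
        i_k = i_prev * math.cos(dphi) - q_prev * math.sin(dphi)
        q_k = i_prev * math.sin(dphi) + q_prev * math.cos(dphi)
        symbols.append((i_k, q_k))
        i_prev, q_prev = i_k, q_k
    return symbols


def sqrt_raised_cosine(f, symbol_period, alpha=0.35):
    """Magnitude response |H(f)| of the square root raised cosine filter."""
    f = abs(f)
    lower = (1.0 - alpha) / (2.0 * symbol_period)
    upper = (1.0 + alpha) / (2.0 * symbol_period)
    if f <= lower:
        return 1.0
    if f > upper:
        return 0.0
    inner = 0.5 * (1.0 - math.sin(math.pi * (2.0 * f * symbol_period - 1.0) / (2.0 * alpha)))
    return math.sqrt(max(0.0, inner))  # guard against rounding just below zero
```

Encoding the four dibits (0,0), (0,1), (1,0), (1,1) in sequence from the assumed starting state visits phases π/4, π, 3π/4 and 0, so the outputs alternate between the two constellations and take only the values 0, ±1 and ±1/√2.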

FIGURE 28.1: Constellation for π/4 shifted QPSK modulation. Source: TIA, 1992. Cellular System Dual-mode Mobile Station–Base Station Compatibility Standard TIA/EIA IS-54. With permission.

FIGURE 28.2: Generic modulation circuit for digital voice and data signals. Source: TIA, 1992. Cellular System Dual-mode Mobile Station–Base Station Compatibility Standard TIA/EIA IS-54.

28.3

Speech Coding Fundamentals

The IS-54 standard employs a vector-sum excited linear prediction (VSELP) coding technique. It represents a specific formulation of the much larger class of code-excited linear prediction (CELP) coders [2] that have proved effective in recent years for the coding of speech at moderate rates in the range 4–16 kb/s. VSELP provides reconstructed speech with a quality that is comparable to that available with frequency modulation and analog transmission over the AMPS system. The coding rate employed is 7.95 kb/s. Each of the six slots per frame carries 260 b of speech and channel coding information for a gross information rate of 13 kb/s. The 260 b correspond to 20 ms of real-time speech, transmitted as a single burst. For an excellent recent review of speech coding techniques for transmission, the reader is referred to Gersho, 1994 [3].

Most modern speech coders use a form of analysis-by-synthesis coding where the encoder determines the coded signal one segment at a time by feeding candidate excitation segments into a replica of a synthesis filter and selecting the segment that minimizes the distortion between the original and reproduced signals. Linear prediction coding (LPC) techniques [1] encode the speech signal by first finding an optimum linear filter to remove the short-time correlation, passing the signal through that LPC filter to obtain a residual signal, and encoding this residual using far fewer bits than would have been required to code the original signal with the same fidelity. In most cases the coding of the residual is divided into two steps. First, the long-time correlation due to the periodic pitch excitation is removed by means of an optimum one-tap filter with adjustable gain and lag. Next, the remaining residual signal, which now closely resembles a white-noise signal, is encoded.

Code-excited linear predictors use one or more codebooks from which they select replicas of the residual of the input signal by means of a closed-loop error-minimization technique. The index of the codebook entry as well as the parameters of all the filters are transmitted to allow the speech signal to be reconstructed at the receiver. Most code-excited coders use trained codebooks. Starting with a codebook containing Gaussian signal segments, entries that are found to be used rarely in coding a large body of speech data are iteratively eliminated to result in a smaller codebook that is considered more effective.

The speech signal can be considered quasistationary or stationary for the duration of the speech frame, of the order of 20 ms. The parameters of the short-term filter, the LPC coefficients, are determined by analysis of the autocorrelation function of a suitably windowed segment of the input signal. To allow accurate determination of the time-varying pitch lag as well as simplify the computations, each speech frame is divided into four 5-ms subframes. Independent pitch filter computations and residual coding operations are carried out for each subframe.

The speech decoder attempts to reconstruct the speech signal from the received information as well as possible. It employs a codebook identical to that of the encoder for excitation generation and, in the absence of transmission errors, would produce an exact replica of the signal that produced the minimized error at the encoder. Transmission errors do occur, however, due to signal fading and excessive interference. Since any attempt at retransmission would incur unacceptable signal delays, sufficient error protection is provided to allow correction of most transmission errors.
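The closed-loop (analysis-by-synthesis) selection described in this section can be illustrated with a toy search. The sketch below is deliberately simplified and is not the IS-54 procedure: it uses a single untrained Gaussian codebook, omits the pitch filter and any perceptual weighting, and all function names and parameters are hypothetical. Each candidate is passed through an all-pole synthesis filter, scaled by its least-squares gain, and the entry with the smallest squared error is kept.

```python
import random


def synthesize(excitation, alpha):
    """All-pole synthesis filter: y[n] = x[n] + sum_i alpha[i-1] * y[n-i].

    The filter starts from a zero state (a simplifying assumption).
    """
    y = []
    for n, x in enumerate(excitation):
        acc = x
        for i, a in enumerate(alpha, start=1):
            if n - i >= 0:
                acc += a * y[n - i]
        y.append(acc)
    return y


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def search_codebook(target, codebook, alpha):
    """Return (index, gain, error) of the best optimally scaled entry."""
    best = None
    target_energy = dot(target, target)
    for index, entry in enumerate(codebook):
        y = synthesize(entry, alpha)
        energy = dot(y, y)
        if energy == 0.0:
            continue
        corr = dot(target, y)
        gain = corr / energy                          # least-squares gain
        error = target_energy - corr * corr / energy  # residual squared error
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best


def make_gaussian_codebook(size, length, seed=0):
    """Untrained codebook of Gaussian segments (illustrative only)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(length)] for _ in range(size)]
```

As a usage example, if the target is built as 2.5 times the synthesized output of entry 3 of a 16-entry codebook of 40-sample segments, the search returns index 3 with a gain close to 2.5 and an error near zero.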

28.4

Channel Coding Considerations

The sharp limitations on available bandwidth for error protection argue for careful consideration of the sensitivity of the speech coding parameters to transmission errors. Pairwise interleaving of coded blocks and convolutional coding of a subset of the parameters permit correction of a limited number of transmission errors. In addition, a cyclic redundancy check (CRC) is used to determine whether the error correction was successful. The coded information is divided into three blocks of varying sensitivity to errors. Group 1 contains the most sensitive bits, mainly the parameters of the LPC filter and frame energy, and is protected by both error detection and correction bits. Group 2 is provided with error correction only. The third group, comprising mostly the fixed codebook indices, is not protected at all. The speech signal contains significant temporal redundancy. Thus, speech frames within which errors have been detected may be reconstructed with the aid of previously correctly received information. A bad-frame masking procedure attempts to hide the effects of short fades by extrapolating the previously received parameters. Of course, if the errors persist, the decoded signal must be muted while an attempt is made to hand off the connection to a base station to/from which the mobile may experience better reception.

28.5

VSELP Encoder

A block diagram of the VSELP speech encoder [4] is shown in Fig. 28.3. The excitation signal is generated from three components: the output of a long-term or pitch filter, and entries from two codebooks. A weighted synthesis filter generates a synthesized approximation to the frequency-weighted input signal. The weighted mean square error between these two signals is used to drive the error minimization process. This weighted error is considered to be a better approximation to the perceptually important noise components than the unweighted mean square error. The total weighted square error is minimized by adjusting the pitch lag and the codebook indices as well as their gains. The decoder follows the encoder closely and generates the excitation signal identically to the encoder but uses an unweighted linear-prediction synthesis filter to generate the decoded signal. A spectral postfilter is added after the synthesis filter to enhance the quality of the reconstructed speech. The precise data rate of the speech coder is 7950 b/s or 159 b per time slot, each corresponding to 20 ms of signal in real time. These 159 b are allocated as follows: 1) short-term filter coefficients, 38 bits; 2) frame energy, 5 bits; 3) pitch lag, 28 bits; 4) codewords, 56 bits; and 5) gain values, 32 bits.
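The bit allocation above can be checked against the rates quoted elsewhere in the chapter. In this sketch the names are hypothetical, and the even split of the pitch lag, codeword and gain bits over the four 5-ms subframes is an assumption suggested by the per-subframe processing described in Section 28.3.

```python
FRAME_MS = 20
SUBFRAMES_PER_FRAME = 4
GROSS_BITS_PER_SLOT = 260   # speech plus channel coding, from Section 28.3

ALLOCATION = {
    "short_term_filter": 38,
    "frame_energy": 5,
    "pitch_lag": 28,
    "codewords": 56,
    "gains": 32,
}


def speech_bit_budget() -> dict:
    """Total the per-frame allocation and derive rates and remainders."""
    total = sum(ALLOCATION.values())
    per_subframe = {
        # Assumption: these parameters are spread evenly over the subframes.
        name: ALLOCATION[name] // SUBFRAMES_PER_FRAME
        for name in ("pitch_lag", "codewords", "gains")
    }
    return {
        "total_bits": total,
        "speech_rate_bps": total * 1000 // FRAME_MS,
        "per_subframe": per_subframe,
        "channel_coding_bits": GROSS_BITS_PER_SLOT - total,
    }
```

The totals come to 159 bits and 7950 b/s, leaving 101 of the 260 bits per slot for channel coding. Under the even-split assumption each subframe carries 7 bits of pitch lag, 14 bits of codewords (two 7-b indices) and 8 bits of gain information.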

28.6

Linear Prediction Analysis and Quantization

The purpose of the LPC analysis filter is to whiten the spectrum of the input signal so that it can be better matched by the codebook outputs. The corresponding LPC synthesis filter A(z) restores the short-time speech spectrum characteristics to the output signal. The transfer function of the tenth-order synthesis filter is given by

\[
A(z) = \frac{1}{1 - \sum_{i=1}^{N_p} \alpha_i z^{-i}}
\]

The filter predictor parameters $\alpha_1, \ldots, \alpha_{N_p}$ are not transmitted directly. Instead, a set of reflection coefficients $r_1, \ldots, r_{N_p}$ are computed and quantized. The predictor parameters are determined from the reflection coefficients using a well-known backward recursion algorithm [6]. A variety of algorithms are known that determine a set of reflection coefficients from a windowed input signal. One such algorithm is the fixed point covariance lattice, FLAT, which builds an optimum inverse lattice stage by stage. At each stage j, the sum of the mean-squared forward and backward residuals is minimized by selection of the best reflection coefficient $r_j$. The analysis window used is 170 samples long, centered with respect to the middle of the fourth 5-ms subframe of the 20-ms frame. Since this centerpoint is 20 samples from the end of the frame, 65 samples from the next frame to be coded are used in computing the reflection coefficient of the current frame. This introduces a lookahead delay of 8.125 ms.

FIGURE 28.3: Block diagram of the speech encoder in VSELP. Source: TIA, 1992. Cellular System Dual-mode Mobile Station–Base Station Compatibility Standard TIA/EIA IS-54.

The FLAT algorithm first computes the covariance matrix of the input speech for $N_A = 170$ and $N_p = 10$,

\[
\phi(i, k) = \sum_{n=N_p}^{N_A - 1} s(n - i)\, s(n - k), \qquad 0 \le i, k \le N_p
\]

Define the forward residual out of stage j as $f_j(n)$ and the backward residual as $b_j(n)$. Then the autocorrelation of the initial forward residual $F_0(i, k)$ is given by $\phi(i, k)$. The autocorrelation of the initial backward residual $B_0(i, k)$ is given by $\phi(i + 1, k + 1)$ and the initial cross correlation of the two residuals is given by $C_0(i, k) = \phi(i, k + 1)$ for $0 \le i, k \le N_p - 1$. Initially j is set to 1. The reflection coefficient at each stage is determined as the ratio of the cross correlation to the mean of the autocorrelations. A block diagram of the computations is shown in Fig. 28.4. By quantizing the reflection coefficients within the computation loops, reflection coefficients at subsequent stages are computed taking into account the quantization errors of the previous stages. Specifically,

\[
\begin{aligned}
C'_{j-1} &= C_{j-1}(0, 0) + C_{j-1}(N_p - j, N_p - j) \\
F'_{j-1} &= F_{j-1}(0, 0) + F_{j-1}(N_p - j, N_p - j) \\
B'_{j-1} &= B_{j-1}(0, 0) + B_{j-1}(N_p - j, N_p - j)
\end{aligned}
\]

and

\[
r_j = \frac{-2\, C'_{j-1}}{F'_{j-1} + B'_{j-1}}
\]

FIGURE 28.4: Block diagram for lattice covariance computations.

Use of two sets of correlation values separated by $N_p - j$ samples provides additional stability to the computed reflection coefficients in case the input signal changes form rapidly. Once a quantized reflection coefficient $r_j$ has been determined, the resulting auto- and cross correlations can be determined iteratively as

\[
\begin{aligned}
F_j(i, k) &= F_{j-1}(i, k) + r_j\,[C_{j-1}(i, k) + C_{j-1}(k, i)] + r_j^2\, B_{j-1}(i, k) \\
B_j(i, k) &= B_{j-1}(i + 1, k + 1) + r_j\,[C_{j-1}(i + 1, k + 1) + C_{j-1}(k + 1, i + 1)] + r_j^2\, F_{j-1}(i + 1, k + 1)
\end{aligned}
\]

and

\[
C_j(i, k) = C_{j-1}(i, k + 1) + r_j\,[B_{j-1}(i, k + 1) + F_{j-1}(i, k + 1)] + r_j^2\, C_{j-1}(k + 1, i)
\]

These computations are carried out iteratively for $r_j$, $j = 1, \ldots, N_p$.
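The recursion of this section can be written out directly. The following is a floating-point sketch that follows the equations for φ, F, B, C and r_j given above. It is not the fixed-point FLAT routine of the standard: the quantization of each r_j inside the loop and the bandwidth-expansion window are omitted, the function name is hypothetical, and a zero denominator is handled by returning a zero coefficient, which is an assumption.

```python
def flat_reflection_coefficients(s, order=10):
    """Reflection coefficients r_1..r_order from the covariance lattice.

    s is the windowed input segment (170 samples in the text).
    """
    n_a = len(s)
    if n_a <= order:
        raise ValueError("segment must be longer than the predictor order")

    # Covariance matrix phi(i, k), 0 <= i, k <= order
    phi = [[sum(s[n - i] * s[n - k] for n in range(order, n_a))
            for k in range(order + 1)] for i in range(order + 1)]

    # Initial correlations, indices 0 .. order-1
    f = [[phi[i][k] for k in range(order)] for i in range(order)]
    b = [[phi[i + 1][k + 1] for k in range(order)] for i in range(order)]
    c = [[phi[i][k + 1] for k in range(order)] for i in range(order)]

    coeffs = []
    for j in range(1, order + 1):
        m = order - j  # offset between the two sets of correlation values
        c_sum = c[0][0] + c[m][m]
        f_sum = f[0][0] + f[m][m]
        b_sum = b[0][0] + b[m][m]
        denom = f_sum + b_sum
        r = -2.0 * c_sum / denom if denom > 0.0 else 0.0
        coeffs.append(r)

        # Update the correlations; the arrays shrink by one per stage.
        f_next = [[f[i][k] + r * (c[i][k] + c[k][i]) + r * r * b[i][k]
                   for k in range(m)] for i in range(m)]
        b_next = [[b[i + 1][k + 1] + r * (c[i + 1][k + 1] + c[k + 1][i + 1])
                   + r * r * f[i + 1][k + 1]
                   for k in range(m)] for i in range(m)]
        c_next = [[c[i][k + 1] + r * (b[i][k + 1] + f[i][k + 1])
                   + r * r * c[k + 1][i]
                   for k in range(m)] for i in range(m)]
        f, b, c = f_next, b_next, c_next
    return coeffs
```

For a first-order check, a decaying exponential s(n) = aⁿ gives r₁ = −2a/(1 + a²), the value that minimizes the sum of forward and backward residual energies for that signal. Because each r_j is a ratio of a cross correlation to the mean of two autocorrelations, its magnitude does not exceed one in exact arithmetic.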

28.7

Bandwidth Expansion

Poles with very narrow bandwidths may introduce undesirable distortions into the synthesized signal. Use of a binomial window with effective bandwidth of 80 Hz suffices to limit the ringing of the LPC filter and reduce the effect of the LPC filter selected for one frame on the signal reconstructed for subsequent frames. To achieve this, prior to searching for the reflection coefficients, the φ(i, k) is modified by use of a window function w(j), j = 1, . . . , 10, as follows:

\[
\phi'(i, k) = \phi(i, k)\, w(|i - k|)
\]

28.8

Quantizing and Encoding the Reflection Coefficients

The distortion introduced into the overall spectrum by quantizing the reflection coefficients diminishes as we move to higher orders in the reflection coefficients. Accordingly, more bits are assigned to the lower order coefficients. Specifically, 6, 5, 5, 4, 4, 3, 3, 3, 3, and 2 b are assigned to r1 , . . . , r10 , respectively. Scalar quantization of the reflection coefficients is used in IS-54 because it is particularly simple. Vector quantization achieves additional quantizing efficiencies at the cost of significant added complexity. It is important to preserve the smooth time evolution of the linear prediction filter. Both the encoder and decoder linearly interpolate the coefficients αi for the first, second and third subframes of each frame using the coefficients determined for the previous and current frames. The fourth subframe uses the values computed for that frame.
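The bit assignment and the subframe interpolation described above can be sketched as follows. The interpolation weights are an assumption: the text states only that the first three subframes are interpolated linearly between the previous and current frames and that the fourth subframe uses the current values, so equal steps of one quarter are taken here for illustration. The names are hypothetical.

```python
# Bits assigned to r_1 .. r_10, from the text
REFLECTION_BITS = [6, 5, 5, 4, 4, 3, 3, 3, 3, 2]

# Assumed weight on the current frame's coefficients for subframes 1..4.
# Only the last value (1.0) is fixed by the text; the rest are illustrative.
CURRENT_FRAME_WEIGHT = [0.25, 0.5, 0.75, 1.0]


def interpolate_subframe_coefficients(previous, current):
    """Return four predictor coefficient sets, one per 5-ms subframe."""
    if len(previous) != len(current):
        raise ValueError("coefficient sets must have the same length")
    return [
        [(1.0 - w) * p + w * c for p, c in zip(previous, current)]
        for w in CURRENT_FRAME_WEIGHT
    ]
```

The assigned bits add up to 38, which matches the allocation for the short-term filter coefficients given in Section 28.5.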

28.9 VSELP Codebook Search

The codebook search operation selects indices for the long-term filter (pitch lag L) and the two codebooks I and H so as to minimize the total weighted error. This closed-loop search is the most computationally complex part of the encoding operation, and significant effort has been invested to minimize the complexity of these operations without degrading performance. To reduce complexity, simultaneous optimization of the codebook selections is replaced by a sequential optimization procedure, which considers the long-term filter search as the most significant and therefore executes it first. The two vector-sum codebooks are considered to contribute successively less to the minimization of the error, and their search follows in sequence. Subdivision of the total codebook into two vector sums simplifies the processing and makes the result less sensitive to errors in decoding the individual bits arising from transmission errors. Entries from each of the two vector-sum codebooks can be expressed as the sum of basis vectors. By orthogonalizing these basis vectors to the previously selected codebook component(s), one ensures that the newly introduced components reduce the remaining errors.

The subframes over which the codebook search is carried out are 5 ms or 40 samples long. An optimal search would need exploration of a 40-dimensional space. The vector-sum approximation limits the search to 14 dimensions after the optimal pitch lag has been selected. The search is further divided into two stages of 7 dimensions each. The two codebooks are specified in terms of the fourteen 40-dimensional basis vectors stored at the encoder and decoder. The two 7-b indices indicate the required weights on the basis vectors to arrive at the two optimum codewords.

The codebook search can be viewed as selecting the three best directions in 40-dimensional space, which when summed result in the best approximation to the weighted input signal. The gains of the three components are determined through a separate error minimization process.

28.10 Long-Term Filter Search

The long-term filter is optimized by selection of a lag value that minimizes the error between the weighted input signal p(n) and the past excitation signal filtered by the current weighted synthesis filter H(z). There are 127 possible coded lag values, corresponding to lags of 20–146 samples. One value is reserved for the case when all correlations between the input and the lagged residuals are negative and use of no long-term filter output would be best. To simplify the convolution operation between the impulse response of the weighted synthesis filter and the past excitation, the impulse response is truncated to 21 samples or 2.5 ms. Once the lag is determined, the untruncated impulse response is used to compute the weighted long-term lag vector.
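
A minimal sketch of such a lag search is given below. It is not the IS-54 procedure itself. It assumes that the error for each lag is minimized with an optimal, unquantized gain, in which case minimizing the squared error is equivalent to maximizing the squared correlation divided by the candidate energy. The function names, the periodic extension used for lags shorter than the subframe, and the use of `None` for the reserved "no long-term filter" case are all illustrative assumptions.

```python
def filter_truncated(x, h):
    """Zero-state convolution of x with the (truncated) impulse response h."""
    return [sum(h[k] * x[n - k] for k in range(min(n, len(h) - 1) + 1))
            for n in range(len(x))]


def lag_candidate(past_excitation, lag, n_samples):
    """Excitation delayed by `lag`, repeated periodically if lag < n_samples."""
    start = len(past_excitation) - lag
    return [past_excitation[start + (n % lag)] for n in range(n_samples)]


def search_lag(p, past_excitation, h, lags=range(20, 147)):
    """Return the lag with the largest corr^2 / energy, or None.

    p is the weighted input for the subframe, h the truncated impulse
    response of the weighted synthesis filter. None is returned when no
    lag gives a positive correlation with the input.
    """
    best_lag, best_score = None, 0.0
    for lag in lags:
        b = filter_truncated(lag_candidate(past_excitation, lag, len(p)), h)
        corr = sum(pn * bn for pn, bn in zip(p, b))
        energy = sum(bn * bn for bn in b)
        if corr > 0.0 and energy > 0.0:
            score = corr * corr / energy
            if score > best_score:
                best_lag, best_score = lag, score
    return best_lag
```

The history buffer `past_excitation` must hold at least 146 samples so that the longest lag can be formed.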

28.11 Orthogonalization of the Codebooks

Prior to the search of the first codebook, each filtered basis vector may be made orthogonal to the long-term filter output, that is, the zero-state response of the weighted synthesis filter H(z) to the long-term prediction vector. Each orthogonalized filtered basis vector is computed by subtracting its projection onto the long-term filter output from itself. Similarly, the basis vectors of the second codebook can be orthogonalized with respect to both the long-term filter output and the first codebook output, that is, the zero-state response of H(z) to the previously selected summation of first-codebook basis vectors. In each case the codebook excitation can be reconstituted as

$$u_{k,i}(n) = \sum_{m=1}^{M} \theta_{im}\, v_{k,m}(n)$$

where k = 1, 2 for the two codebooks, i = I or H is the 7-b code vector received, $v_{k,m}$ are the two sets of basis vectors, and $\theta_{im} = +1$ if bit m of codeword i is 1 and −1 if bit m of codeword i is 0. Orthogonalization is not required at the decoder since the gains of the codebook outputs are determined with respect to the weighted nonorthogonalized code vectors.
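
The two operations described in this section, removing a projection and forming a code vector as a signed sum of basis vectors, can be sketched as follows. The helper names are hypothetical, and the mapping of "bit m" to the (m − 1)th least significant bit of an integer index is an assumption made only for illustration; the standard defines the actual bit ordering.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def remove_projection(v, ref):
    """Return v minus its projection onto ref (v unchanged if ref is zero)."""
    energy = dot(ref, ref)
    if energy == 0.0:
        return list(v)
    scale = dot(v, ref) / energy
    return [x - scale * y for x, y in zip(v, ref)]


def code_vector(index, basis):
    """Form u(n) = sum_m theta_m * v_m(n) from an M-bit codeword.

    theta_m is +1 when the corresponding bit of `index` is 1 and -1 when
    it is 0. Assumption: bit m is read least significant bit first.
    """
    thetas = [1 if (index >> m) & 1 else -1 for m in range(len(basis))]
    length = len(basis[0])
    return [sum(t * v[n] for t, v in zip(thetas, basis)) for n in range(length)]
```

With M = 7 basis vectors of 40 samples each, `code_vector` enumerates the 128 codewords of one codebook. Complementing every bit of the index simply negates the code vector.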

28.12 Quantizing the Excitation and Signal Gains

The three codebook gain values β, γ1, and γ2 are transformed to three new parameters GS, P0, and P1 for quantization purposes. GS is an energy offset parameter that equalizes the input and output signal energies. It adjusts the energy of the output of the LPC synthesis filter to equal the energy computed for the same subframe at the encoder input. P0 is the energy contribution of the long-term prediction vector as a fraction of the total excitation energy within the subframe. Similarly, P1 is the energy contribution of the code vector selected from the first codebook as a fraction of the total excitation energy of the subframe. The transformation reduces the dynamic range of the parameters to be encoded. An 8-b vector quantizer efficiently encodes the appropriate (GS, P0, P1) vectors by selecting the vector which minimizes the weighted error. The received and decoded values β, γ1, and γ2 are computed from the received (GS, P0, P1) vector and applied to reconstitute the decoded signal.

28.13 Channel Coding and Interleaving

The goal of channel coding is to reduce the impairments in the reconstructed speech due to transmission errors. The 159 b characterizing each 20-ms block of speech are divided into two classes, 77 in class 1 and 82 in class 2. Class 1 includes the bits in which errors result in a more significant impairment, whereas the speech quality is considered less sensitive to the class-2 bits. Class 1 generally includes the gain, pitch lag, and more significant reflection coefficient bits. In addition, a 7-b cyclic redundancy check is applied to the 12 most perceptually significant bits of class 1 to indicate whether the error correction was successful. Failure of the CRC check at the receiver suggests that the received information is so erroneous that it would be better to discard it than use it. The error correction coding is illustrated in Fig. 28.5.

[Figure 28.5 block labels: Speech Coder; 77 Class-1 bits; 82 Class-2 bits; 12 Most Perceptually Significant Bits; 7-bit CRC Computation; 5 Tail Bits; Rate 1/2 Convolutional Coding; 178 Coded Class-1 bits; Voice cipher; 260; 2-Slot interleaver; speech frames x and y, speech frames y and z; 40 msec]

FIGURE 28.5: Error correction insertion for speech coder. Source: TIA, 1992. Cellular Systems Dual-Mode Mobile Station–Base Station Compatibility Standards. TIA/EIA IS-54. With permission.

The error correction technique used is rate 1/2 convolutional coding with a constraint length of 5 [5]. A tail of 5 b is appended to the 84 b to be convolutionally encoded (77 class-1 bits plus 7 CRC bits) to result in a 178-b output. Inclusion of the tail bits ensures independent decoding of successive time slots and no propagation of errors between slots.

Interleaving the bits to be transmitted over two time slots is introduced to diminish the effects of short deep fades and to improve the error-correction capabilities of the channel coding technique. Two speech frames, the previous and the present, are interleaved so that the bits from each speech block span two transmission time slots separated by 20 ms. The interleaving attempts to separate the convolutionally coded class-1 bits from one frame as much as possible in time by inserting noncoded class-2 bits between them.
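
The bit counts quoted in this section can be tied together with a little arithmetic. The sketch below uses only the figures given in the text and in Fig. 28.5; the variable names are illustrative.

```python
CLASS1_BITS = 77        # error-sensitive speech bits per 20-ms frame
CLASS2_BITS = 82        # unprotected speech bits per frame
CRC_BITS = 7            # CRC over the 12 most significant class-1 bits
TAIL_BITS = 5           # tail appended before convolutional coding
FRAMES_PER_SECOND = 50  # one frame every 20 ms

speech_bits = CLASS1_BITS + CLASS2_BITS               # 159
encoder_input = CLASS1_BITS + CRC_BITS + TAIL_BITS    # 84 + 5 = 89
coded_class1 = 2 * encoder_input                      # rate 1/2 -> 178
slot_bits = coded_class1 + CLASS2_BITS                # 260

source_rate_bps = speech_bits * FRAMES_PER_SECOND     # 7950 b/s
gross_rate_bps = slot_bits * FRAMES_PER_SECOND        # 13000 b/s
print(speech_bits, encoder_input, coded_class1, slot_bits)
print(source_rate_bps, gross_rate_bps)
```

The 260 b obtained here match the slot payload shown in Fig. 28.5, so the channel coding raises the 7.95 kb/s source rate to a gross rate of 13 kb/s.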


28.14 Bad Frame Masking

A CRC failure indicates that the received data is unusable, either due to transmission errors resulting from a fade, or from pre-emption of the time slot by a control message (fast associated control channel, FACCH). To mask the effects that may result from leaving a gap in the speech signal, a masking operation based on the temporal redundancy between adjacent speech blocks has been proposed. Such masking can at best bridge over short gaps but cannot recover loss of signal of longer duration. The bad frame masking operation may follow a finite state machine where each state indicates an operation appropriate to the elapsed duration of the fade to which it corresponds. The masking operation consists of copying the previous LPC information and attenuating the gain of the signal. State 6 corresponds to error sequences exceeding 100 ms, for which the output signal is muted. The result of such a masking operation is generation of an extrapolation in the gap to the previously received signal, significantly reducing the perceptual effects of short fades. No additional delay is introduced in the reconstructed signal. At the same time, the receiver will report a high frequency of bad frames leading the system to explore handoff possibilities immediately. A quick successful handoff will result in rapid signal recovery.
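
The finite state machine can be sketched as a simple counter of consecutive bad frames. The text only states that the state reflects the elapsed duration of the fade, that masking repeats the previous LPC information with attenuated gain, and that state 6 mutes the output. The one-step-per-bad-frame transition, the reset to state 0 on any good frame, and the action labels below are assumptions made for illustration.

```python
MUTE_STATE = 6  # reached after six consecutive bad 20-ms frames (> 100 ms)


def next_state(state, crc_ok):
    """Advance the masking state for one received frame (assumed rule)."""
    if crc_ok:
        return 0                       # good frame: normal decoding resumes
    return min(state + 1, MUTE_STATE)  # bad frame: move one state deeper


def frame_action(state):
    """Map a state to the masking action suggested by the text."""
    if state == 0:
        return "decode"
    if state == MUTE_STATE:
        return "mute"
    return "repeat_and_attenuate"      # reuse previous LPC data, reduce gain


# Example: one good frame, seven bad frames, then recovery.
state = 0
for ok in [True] + [False] * 7 + [True]:
    state = next_state(state, ok)
    print(state, frame_action(state))
```

Because the state depends only on frames already received, the masking adds no delay, which is consistent with the remark in the text.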

28.15 ACELP Encoder

The ACELP encoder employs linear prediction analysis and quantization techniques similar to those used in VSELP and discussed in Section 28.6. The frame structure of 20-ms frames and 5-ms subframes is preserved. Linear prediction analysis is carried out for every frame. The ACELP encoder uses a long-term filter similar to the one discussed in Section 28.10 and represented as an adaptive codebook. The nonpredictable part of the LPC residual is represented in terms of ACELP codebooks, which replace the two VSELP codebooks shown in Fig. 28.3.

Instead of encoding the reflection coefficients as in VSELP, the information is transformed into line-spectral frequency pairs (LSP) [8]. The LSPs can be derived from linear prediction coefficients, a 10th order analysis generating 10 line-spectral frequencies (LSF), 5 poles, and 5 zeroes. The LSFs can be vector quantized and the LPC coefficients recalculated from the quantized LSFs. As long as the interleaved order of the poles and zeroes is preserved, quantization of the LSPs preserves the stability of the LPC synthesis filters.

The LSPs of any frame can be predicted from the values calculated and transmitted for previous frames, which yields additional coding efficiency. The long-term means of the LSPs are calculated for a large body of speech data and stored at both the encoder and decoder. First-order moving-average prediction is then used for the mean-removed LSPs. The time-prediction technique also permits use of predicted values for the LSPs in case uncorrectable transmission errors are encountered, resulting in reduced speech degradation.

To simplify the vector quantization operations, each LSP vector is split into 3 subvectors of dimensions 3, 3, and 4. The three subvectors are quantized with 8, 9, and 9 bits, respectively, corresponding to a total bit assignment of 26 bits per frame for LPC information.

28.16 Algebraic Codebook Structure and Search

Algebraic codebooks contain relatively few pulses having nonzero values, leading to rapid search of the possible innovation vectors, that is, the vectors which together with the adaptive codebook (ACB) output form the excitation of the LPC filter for the current subframe. In this implementation the 40-position innovation vector contains only four nonzero pulses and each can take on only values +1 and −1. The 40 positions are divided into four tracks and one pulse is selected from each track. The tracks are generally equally spaced but differ in their starting value; thus the first pulse can take on positions 0, 5, 10, 15, 20, 25, 30, or 35 and the second has possible positions 1, 6, 11, 16, 21, 26, 31, or 36. The first three pulse positions are coded with 3 bits each and the fourth pulse position (starting positions 3 or 4) with 4 bits, resulting in a 17-bit sequence for the algebraic code of each subframe.

The algebraic codebook is searched by minimizing the mean square error between the weighted input speech and the weighted synthesized speech over the time span of each subframe. In each case the weighting is that produced by a perceptual weighting filter that has the effect of shaping the spectrum of the synthesis error signal so that it is better masked by the spectrum of the current speech signal.
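
The track structure can be made concrete with a short sketch. The first two tracks and the starting positions of the fourth are taken from the text; the third track (positions 2, 7, ..., 37) is inferred from the pattern, and the assumption that the 17 bits consist of 13 position bits plus one sign bit per pulse is an interpretation of the ±1 pulse amplitudes rather than a statement from the source. Function names are illustrative.

```python
SUBFRAME = 40

TRACKS = [
    list(range(0, SUBFRAME, 5)),   # 0, 5, ..., 35 (from the text)
    list(range(1, SUBFRAME, 5)),   # 1, 6, ..., 36 (from the text)
    list(range(2, SUBFRAME, 5)),   # 2, 7, ..., 37 (inferred)
    sorted(list(range(3, SUBFRAME, 5)) + list(range(4, SUBFRAME, 5))),
]                                  # fourth track starts at 3 or 4


def position_bits(track):
    """Bits needed to index one position within a track."""
    return (len(track) - 1).bit_length()


def innovation(positions, signs):
    """Build the 40-sample innovation vector from 4 positions and 4 signs."""
    if len(positions) != 4 or len(signs) != 4:
        raise ValueError("one position and one sign per track are required")
    vec = [0] * SUBFRAME
    for track, pos, sign in zip(TRACKS, positions, signs):
        if pos not in track:
            raise ValueError(f"position {pos} is not on its track")
        if sign not in (1, -1):
            raise ValueError("pulse amplitudes are restricted to +1 and -1")
        vec[pos] = sign
    return vec


pos_bits = [position_bits(t) for t in TRACKS]   # [3, 3, 3, 4]
total_bits = sum(pos_bits) + len(TRACKS)        # 13 position + 4 sign bits
print(pos_bits, total_bits)
```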

28.17 Quantization of the Gains for ACELP Encoding

The adaptive codebook gain and the fixed (algebraic) codebook gain are vector quantized using a 7-bit codebook. The gain codebook search is performed by minimizing the mean-square weighted error between the original and the reconstructed speech, expressed as a function of the adaptive codebook gain and a fixed codebook correction factor. This correction factor represents the log energy difference between a predicted gain and an estimated gain. The predicted gain is computed using fourth-order moving-average prediction with fixed coefficients on the innovation energy of each subframe. The result is a smoothed energy profile even in the presence of modest quantization errors. As discussed above for the LSP quantization, the moving-average prediction serves to provide predicted values even when the current frame information is lost due to transmission errors. Degradations resulting from loss of one or two frames of information are thereby mitigated.

28.18 Channel Coding for ACELP Encoding

The channel coding and interleaving operations for ACELP speech coding are similar to those discussed in Section 28.13 for VSELP coding. The number of bits protected by both error-detection (parity) and error-correction convolutional coding is increased to 48 from 12. Rate 1/2 convolutional coding is used on the 108 more significant bits, namely 96 class-1 bits, 7 CRC bits, and the 5 tail bits of the convolutional coder, resulting in 216 coded class-1 bits. Eight of the 216 bits are dropped by puncturing, yielding 208 coded class-1 bits, which are then combined with 52 nonprotected class-2 bits. As compared to the channel coding of the VSELP encoder, the number of protected bits is increased and the number of unprotected bits is reduced while keeping the overall coding structure unchanged.
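
The comparison with the VSELP scheme can be checked numerically. The helper below is a hypothetical bookkeeping function, not part of either standard; it uses the class sizes, CRC length, tail length, and puncturing count quoted in this section and in Section 28.13.

```python
def coded_slot_bits(class1, class2, crc=7, tail=5, punctured=0):
    """Return (encoder input bits, coded class-1 bits, total slot bits)."""
    encoder_input = class1 + crc + tail
    coded = 2 * encoder_input - punctured   # rate 1/2, then puncturing
    return encoder_input, coded, coded + class2


acelp = coded_slot_bits(96, 52, punctured=8)
vselp = coded_slot_bits(77, 82)
print(acelp)  # (108, 208, 260)
print(vselp)  # (89, 178, 260)
```

Both coders fill the same 260-b slot payload. The ACELP coder spends 96 + 52 = 148 b per 20-ms frame on speech, compared with 159 b for VSELP, and uses the difference for additional protection.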

28.19 Conclusions

The IS-54 digital cellular standard specifies modulation and speech coding techniques for mobile cellular systems that allow the interoperation of terminals built by a variety of manufacturers and systems operated across the country by a number of different service providers. It permits speech communication with good quality in a transmission environment characterized by frequent multipath fading and significant intercell interference. Generally, the quality of the IS-54 decoded speech is better at the edges of a cell than the corresponding AMPS transmission due to the error mitigation resulting from channel coding. Near a base station or in the absence of significant fading and interference, the IS-54 speech quality is reported to be somewhat worse than AMPS due to the inherent limitations of the analysis–synthesis model in reconstructing arbitrary speech signals with limited bits. The IS-641 standard coder achieves higher speech quality, particularly at the edges of heavily occupied cells where transmission errors may be more numerous.

At this time no new systems following the IS-54 standard are being introduced. Most base stations have been converted to transmit and receive on the IS-641 standard as well, and use of IS-54 transmissions is dropping rapidly. At the time of its introduction in 1996 the IS-641 coder represented the state of the art in terms of toll quality speech coding near 8 kb/s, a significant improvement over the IS-54 coder introduced in 1990. These standards represent reasonable engineering compromises between high performance and complexity sufficiently low to permit single-chip implementations in mobile terminals.

Both IS-54 and IS-641 are considered second generation cellular standards. Third generation cellular systems promise higher call capacities through better exploitation of the time-varying transmission requirements of speech conversations, as well as improved modulation and coding in wider spectrum bandwidths that achieve similar bit-error ratios but reduce the required transmitted power. Until such systems are introduced, the second generation TDMA systems can be expected to provide many years of successful cellular and personal communications services.

Defining Terms

Codebook: A set of signal vectors available to both the encoder and decoder.

Covariance lattice algorithm: An algorithm for reduction of the covariance matrix of the signal consisting of several lattice stages, each stage implementing an optimal first-order filter with a single coefficient.

Reflection coefficient: A parameter of each stage of the lattice linear prediction filter that determines (1) a forward residual signal at the output of the filter stage by subtracting from the forward residual at the input a linear function of the backward residual, and (2) a backward residual at the output of the filter stage by subtracting a linear function of the forward residual from the backward residual at the input.

Vector quantizer: A quantizer that assigns quantized vectors to a vector of parameters based on their current values by minimizing some error criterion.

References

[1] Atal, B.S. and Hanauer, S.L., Speech analysis and synthesis by linear prediction of the speech wave. J. Acoust. Soc. Am., 50, 637–655, 1971.
[2] Atal, B.S. and Schroeder, M., Stochastic coding of speech signals at very low bit rates. Proc. Int. Conf. Comm., 1610–1613, 1984.
[3] Gersho, A., Advances in speech and audio compression. Proc. IEEE, 82, 900–918, 1994.
[4] Gerson, I.A. and Jasiuk, M.A., Vector sum excited linear prediction (VSELP) speech coding at 8 kbps. Int. Conf. Acoust. Speech and Sig. Proc., ICASSP90, 461–464, 1990.
[5] Lin, S. and Costello, D., Error Control Coding: Fundamentals and Application, Prentice Hall, Englewood Cliffs, NJ, 1983.
[6] Makhoul, J., Linear prediction, a tutorial review. Proc. IEEE, 63, 561–580, 1975.
[7] Salami, R., Laflamme, C., Adoul, J.P., and Massaloux, D., A toll quality 8 kb/s speech codec for the personal communication system (PCS). IEEE Trans. Vehic. Tech., 43, 808–816, 1994.
[8] Soong, F.K. and Juang, B.H., Line spectrum pair (LSP) and speech data compression. Proc. ICASSP’84, 1.10.1–1.10.4, 1984.
[9] Telecommunications Industry Association, EIA/TIA Interim Standard, Cellular System Dual-Mode Mobile Station–Base Station Compatibility Standard IS-54B, TIA/EIA, Washington, D.C., 1992.

Further Information

For a general treatment of speech coding for telecommunications, see N.S. Jayant and P. Noll, Digital Coding of Waveforms, Prentice Hall, Englewood Cliffs, NJ, 1984. For a more detailed treatment of linear prediction techniques, see J. Markel and A. Gray, Linear Prediction of Speech, Springer-Verlag, NY, 1976.


Hanzo, L. “The British Cordless Telephone Standard: CT-2” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


The British Cordless Telephone Standard: CT-2

Lajos Hanzo
University of Southampton

29.1 History and Background
29.2 The CT-2 Standard
29.3 The Radio Interface
Transmission Issues • Multiple Access and Burst Structure • Power Ramping, Guard Period, and Propagation Delay • Power Control
29.4 Burst Formats
29.5 Signalling Layer Two (L2)
General Message Format • Fixed Format Packet
29.6 CPP-Initiated Link Setup Procedures
29.7 CFP-Initiated Link Setup Procedures
29.8 Handshaking
29.9 Main Features of the CT-2 System
Defining Terms
References

29.1 History and Background

Following a decade of world-wide research and development (R&D), cordless telephones (CT) are now becoming widespread consumer products, and they are paving the way towards ubiquitous, low-cost personal communications networks (PCN) [7, 8]. The two best-known European representatives of CTs are the digital European cordless telecommunications (DECT) system [1, 5] and the CT-2 system [2, 6]. Three potential application areas have been identified, namely, domestic, business, and public access, the last of which is often referred to as telepoint (TP). In addition to conventional voice communications, CTs have been conceived with additional data services and local area network (LAN) applications in mind.

The fundamental difference between conventional mobile radio systems and CT systems is that CTs have been designed for small to very small cells, where typically benign low-dispersion, dominant line-of-sight (LOS) propagation conditions prevail. Therefore, CTs can usually dispense with channel equalizers and complex low-rate speech codecs, since the low signal dispersion allows for the employment of higher bit rates before the effect of channel dispersion becomes a limiting factor. On the same note, the LOS propagation scenario is associated with mild fading or near-constant received signal level, and when combined with appropriate small-cell power-budget design, it ensures a high average signal-to-noise ratio (SNR). These prerequisites facilitate the employment of high-rate, low-complexity speech codecs, which maintain a low battery drain. Furthermore, the deployment of forward error correction codecs can often also be avoided, which reduces both the bandwidth requirement and the power consumption of the portable station (PS).

A further difference between public land mobile radio (PLMR) systems [3] and CTs is that whereas the former endeavor to standardize virtually all system features, the latter seek to offer a so-called access technology, specifying the common air interface (CAI), access and signalling protocols, and some network architecture features, but leaving many other characteristics unspecified. By the same token, whereas PLMR systems typically have a rigid frequency allocation scheme and fixed cell structure, CTs use dynamic channel allocation (DCA) [4]. The DCA principle allows for a more intelligent and judicious channel assignment, where the base station (BS) and PS select an appropriate traffic channel on the basis of the prevailing traffic and channel quality conditions, thus minimizing, for example, the effect of cochannel interference or channel blocking probability.

In contrast to PLMR schemes, such as the Pan-European global system for mobile communications (GSM) [3], CT systems typically dispense with sophisticated mobility management, which accounts for the bulk of the cost of PLMR call charges, although they may facilitate limited hand-over capabilities. Whereas in residential applications CTs are the extension of the public switched telephone network (PSTN), the concept of omitting mobility management functions, such as location update, etc., leads to telepoint CT applications where users are able to initiate but not to receive calls. This fact drastically reduces the network operating costs and, ultimately, the call charge, at a concomitant reduction of the services rendered.

Having considered some of the fundamental differences between PLMR and CT systems, let us now review the basic features of the CT-2 system.

29.2 The CT-2 Standard

The European CT-2 recommendation has evolved from the British standard MPT-1375 with the aim of ensuring the compatibility of various manufacturers’ systems as well as setting performance requirements, which would encourage the development of cost-efficient implementations. Further standardization objectives were to enable future evolution of the system, for example, by reserving signalling messages for future applications, and to maintain a low PS complexity even at the expense of higher BS costs. The CT-2 or MPT 1375 CAI recommendation is constituted by the following four parts.

1. Radio interface: Standardizes the radio frequency (RF) parameters, such as legitimate channel frequencies, the modulation method, the transmitter power control, and the required receiver sensitivity as well as the carrier-to-interference ratio (CIR) and the time division duplex (TDD) multiple access scheme. Furthermore, the transmission burst and master/slave timing structures to be used are also laid down, along with the scrambling procedures to be applied.

2. Signalling layers one and two: Defines how the bandwidth is divided among signalling, traffic data, and synchronization information. The description of the first signalling layer includes the dynamic channel allocation strategy, calling channel detection, as well as link setup and establishment algorithms. The second layer is concerned with issues of various signalling message formats, as well as link establishment and re-establishment procedures.

3. Signalling layer three: The third signalling layer description includes a range of message sequence diagrams regarding call setup to telepoint BSs and private BSs, as well as the call clear down procedures.

4. Speech coding and transmission: The last part of the standard is concerned with the algorithmic and performance features of the audio path, including frequency responses, clipping, distortion, noise, and delay characteristics.

Having briefly reviewed the structure of the CT-2 recommendations, let us now turn our attention to its main constituent parts and consider specific issues of the system’s operation.

29.3 The Radio Interface

29.3.1 Transmission Issues

In our description of the system we will adopt the terminology used in the recommendation, where the PS is called the cordless portable part (CPP), whereas the BS is referred to as the cordless fixed part (CFP). The channel bandwidth and the channel spacing are 100 kHz, and the allocated system bandwidth is 4 MHz, which is hosted in the range of 864.15–868.15 MHz. Accordingly, a total of 40 RF channels can be utilized by the system. The accuracy of the radio frequency must be maintained within ±10 kHz of its nominal value for both the CFP and CPP over the entire specified supply voltage and ambient temperature range. To counteract the maximum possible frequency drift of 20 kHz, automatic frequency correction (AFC) may be used in both the CFP and CPP receivers. The AFC may be allowed to control the transmission frequency of only the CPP, however, in order to prevent the misalignment of both transmission frequencies.

Binary frequency shift keying (FSK) is proposed, and the signal must be shaped by an approximately Gaussian filter in order to maintain the lowest possible frequency occupancy. The resulting scheme is referred to as Gaussian frequency shift keying (GFSK), which is closely related to Gaussian minimum shift keying (GMSK) [7] used in the DECT [1] and GSM [3] systems. Suffice it to say that in M-ary FSK modems the carrier’s frequency is modulated in accordance with the information to be transmitted, where the modulated signal is given by

$$S_i(t) = \sqrt{\frac{2E}{T}}\, \cos\left[\omega_i t + \Phi\right], \quad i = 1, \ldots, M$$

and E represents the bit energy, T the signalling interval length, $\omega_i$ has M discrete values, whereas the phase $\Phi$ is constant.
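
The FSK signal model can be illustrated numerically. The sketch below samples one signalling interval of the unfiltered signal and confirms that its energy equals E when the interval contains a whole number of carrier cycles. Gaussian shaping is not modelled. The function names, the zero-based symbol index, and the normalized frequencies in the example are assumptions chosen only for illustration.

```python
import math


def fsk_symbol(i, E, T, omegas, phase=0.0, n_samples=1000):
    """Samples of S_i(t) = sqrt(2E/T) cos(omega_i t + phase) over [0, T)."""
    amplitude = math.sqrt(2.0 * E / T)
    dt = T / n_samples
    return [amplitude * math.cos(omegas[i] * k * dt + phase)
            for k in range(n_samples)]


def symbol_energy(samples, T):
    """Approximate the integral of S_i(t)^2 over one interval."""
    dt = T / len(samples)
    return sum(s * s for s in samples) * dt


# Illustrative binary case (M = 2) with normalized T and two tones.
T = 1.0
E = 2.0
omegas = [2.0 * math.pi * 4.0, 2.0 * math.pi * 6.0]
for i in range(len(omegas)):
    print(i, symbol_energy(fsk_symbol(i, E, T, omegas), T))
```

The amplitude factor $\sqrt{2E/T}$ is what makes the energy per interval come out as E, independently of which of the M frequencies is selected.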

29.3.2 Multiple Access and Burst Structure

The so-called TDD multiple access scheme is used, which is demonstrated in Fig. 29.1. The simple principle is to use the same radio frequency for both uplink and downlink transmissions between the CPP and the CFP, respectively, but with a certain staggering in time. This figure reveals further details of the burst structure, indicating that 66 or 68 b per TDD frame are transmitted in both directions. There is a 3.5- or 5.5-b duration guard period (GP) between the uplink and downlink transmissions, and half of the time the CPP (the other half of the time the CFP) is transmitting with the other party listening, accordingly. Although the guard period wastes some channel capacity, it allows a finite time for both the CPP and CFP for switching from transmission to reception and vice versa. The burst structure of Fig. 29.1 is used during normal operation across an established link for the transmission of adaptive differential pulse code modulated (ADPCM) speech at 32 kb/s according to the CCITT G.721 standard in a so-called B channel or bearer channel. The D channel, or signalling channel, is used for the transmission of link control signals. This specific burst structure is referred to as a multiplex one (M1) frame.

FIGURE 29.1: M1 burst and TDD frame structure.

Since the speech signal is encoded according to the CCITT G.721 recommendation at 32 kb/s, the TDD bit rate must be in excess of 64 kb/s in order to be able to provide the idle guard space of 3.5- or 5.5-b interval duration plus some signalling capacity. This is how channel capacity is sacrificed to provide the GP. Therefore, the transmission bit rate is stipulated to be 72 kb/s and the TDD frame length is 2 ms, during which 144 bit intervals can be accommodated. As was demonstrated in Fig. 29.1, 66 or 68 b are transmitted in both the uplink and downlink bursts, and taking into account the guard spaces, the total transmission frame is constituted by (2·68) + 3.5 + 4.5 = 144 b or, equivalently, by (2·66) + 5.5 + 6.5 = 144 b. The 66-b transmission format is compulsory, whereas the 68-b format is optional. In the 66-b burst there is one D bit dedicated to signalling at both ends of the burst, whereas in the 68-b burst the two additional bits are also assigned to signalling. Accordingly, the signalling rate becomes 2 b/2 ms or 4 b/2 ms, corresponding to 1 kb/s or 2 kb/s signalling rates.
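
The frame budget can be verified with a few lines of arithmetic. The sketch below uses the 72 kb/s bit rate, the 2-ms frame, and the 32 kb/s speech rate given in the text; the function name and the way the guard time is reported (as the total left over per frame after both bursts) are illustrative choices.

```python
BIT_RATE_BPS = 72_000
FRAME_MS = 2
SPEECH_RATE_BPS = 32_000

# Speech (B-channel) bits that each burst must carry per 2-ms frame.
B_BITS_PER_BURST = SPEECH_RATE_BPS * FRAME_MS // 1000   # 64


def frame_budget(burst_bits):
    """Return (frame bits, total guard bits, D bits per burst, D rate in b/s)."""
    frame_bits = BIT_RATE_BPS * FRAME_MS // 1000         # 144
    guard_total = frame_bits - 2 * burst_bits            # both guard periods
    d_bits = burst_bits - B_BITS_PER_BURST                # signalling bits
    d_rate_bps = d_bits * 1000 // FRAME_MS
    return frame_bits, guard_total, d_bits, d_rate_bps


print(frame_budget(68))  # (144, 8, 4, 2000)
print(frame_budget(66))  # (144, 12, 2, 1000)
```

The 8 b of total guard time for the 68-b format agree with the 3.5-b and 4.5-b guard periods, and the shorter 66-b bursts leave 12 b in total.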

29.3.3 Power Ramping, Guard Period, and Propagation Delay

As mentioned before and suggested by Fig. 29.1, there is a 3.5- or 5.5-b interval duration GP between transmitted and received bursts. Since the transmission bit rate is 72 kb/s, the bit interval becomes about 1/(72 kb/s) ≈ 13.9 µs and, hence, the GP duration is about 49 µs or 76 µs. This GP serves a number of purposes. Primarily, the GP allows the transmitter to ramp up and ramp down the transmitted signal level smoothly over a finite time interval at the beginning and end of the transmitted burst. This is necessary because toggling the transmitted signal instantaneously is equivalent to multiplying it by a rectangular time-domain window function, which corresponds in the frequency domain to convolving the transmitted spectrum with a sinc function. This convolution would result in spectral side-lobes over a very wide frequency range, which would interfere with adjacent channels. Furthermore, due to the introduction of the guard period, both the CFP and CPP can tolerate a limited propagation delay, but the entire transmitted burst must arrive within the receiver’s window; otherwise the last transmitted bits cannot be decoded.


29.3.4 Power Control

In order to minimize the battery drain and the cochannel interference load imposed upon cochannel users, the CT-2 system provides a power control option. The CPPs must be able to transmit at two different power levels, namely, either between 1 and 10 mW or at a level between 12 and 20 dB lower. The mechanism for invoking the lower CPP transmission level is based on the received signal level at the CFP. If the CFP detects a received signal strength more than 90 dB relative to 1 µV/m, it may instruct the CPP to drop its transmitted level by the specified 12–20 dB. Since the 90-dB gain factor corresponds to a ratio of about 31,623, this received signal strength would be equivalent for a 10-cm antenna length to an antenna output voltage of about 3.16 mV. A further beneficial ramification of using power control is that by powering down CPPs that are in the vicinity of a telepoint-type multiple-transceiver CFP, the CFP’s receiver will not be so prone to being desensitised by the high-powered close-in CPPs, which would severely degrade the reception quality of more distant CPPs.
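
The decibel conversions in this paragraph can be reproduced as follows. Field strength is an amplitude quantity, so 20 dB corresponds to a factor of 10, whereas the transmit power reduction uses 10 dB per factor of 10. Multiplying the field strength by the 10-cm antenna length follows the simplification used in the text. The function names and the two worked power examples are illustrative.

```python
def field_ratio(db):
    """Amplitude (field strength or voltage) ratio for a level in dB."""
    return 10 ** (db / 20.0)


def power_ratio(db):
    """Power ratio for a level in dB."""
    return 10 ** (db / 10.0)


# 90 dB relative to 1 uV/m, picked up by a 10-cm antenna.
threshold_uV_per_m = field_ratio(90.0)                  # about 31,623 uV/m
antenna_mV = threshold_uV_per_m * 0.10 / 1000.0         # about 3.16 mV

# Examples of the reduced transmit level (illustrative corner cases).
low_from_10mW_by_12dB = 10.0 / power_ratio(12.0)        # about 0.63 mW
low_from_1mW_by_20dB = 1.0 / power_ratio(20.0)          # 0.01 mW

print(round(threshold_uV_per_m), round(antenna_mV, 2))
print(round(low_from_10mW_by_12dB, 2), low_from_1mW_by_20dB)
```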

29.4

Burst Formats

As already mentioned in the previous section on the radio interface, there are three different subchannels assisting the operation of the CT-2 system, namely, the voice/data channel or B channel, the signalling channel or D channel, and the burst synchronization channel or SYN channel. According to the momentary system requirements, a variable fraction of the overall channel capacity or, equivalently, a variable fraction of the bandwidth can be allocated to any of these channels. Each channel capacity or bandwidth allocation mode is associated with a different burst structure and accordingly bears a different name. The corresponding burst structures are termed multiplex one (M1), multiplex two (M2), and multiplex three (M3), of which multiplex one, used during the normal operation of established links, has already been described in the previous section. Multiplex two and three are used extensively during link setup and establishment, as will be seen in subsequent sections, where further details of the system’s operation are unravelled.

Signalling layer one (L1) defines the burst formats multiplex one to three just mentioned and outlines the calling channel detection procedures, as well as the link setup and establishment techniques. Layer two (L2) deals with issues of acknowledged and unacknowledged information transfer over the radio link, error detection and correction by retransmission, correct ordering of messages, and link maintenance aspects.

The burst structure multiplex two is shown in Fig. 29.2. It is constituted by two 16-b D-channel segments on either side of the 10-b preamble (P) and the 24-b frame synchronization pattern (SYN), and its signalling capacity is 32 b/2 ms = 16 kb/s. Note that the M2 burst does not carry any B-channel information; it is dedicated to synchronization purposes. The 32-b D-channel message is split into two 16-b segments in order to prevent any 24-b fraction of the 32-b word from emulating the 24-b SYN segment, which would result in synchronization misalignment.

Since the CFP plays the role of the master in a telepoint scenario communicating with many CPPs, all of the CPP’s actions must be synchronized to those of the CFP. Therefore, if the CPP attempts to initiate a call, the CFP will reinitiate it using the M2 burst, while imposing its own timing structure. The 10-b preamble consists of an alternating zero/one sequence and assists the operation of the clock recovery circuitry, which has to recover the clock frequency before the arrival of the SYN sequence in order to be able to detect it. The SYN sequence is a unique word determined by computer search, which has a sharp autocorrelation peak, and its function is discussed later. The way the M2 and M3 burst formats are used for signalling purposes will be made explicit in our further discussions when considering the link setup procedures.
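The M2 bit budget can be sketched as follows. The field order follows the description of Fig. 29.2 given in the text; the names and the choice of the first preamble bit are our own assumptions, since the text states only that the preamble alternates.

```python
# M2 burst layout as described in the text: D(16) | P(10) | SYN(24) | D(16).
M2_LAYOUT = [("D", 16), ("P", 10), ("SYN", 24), ("D", 16)]
FRAME_MS = 2  # one M2 burst per 2-ms TDD frame


def total_bits(layout):
    return sum(n for _, n in layout)


def d_channel_rate_bps(layout, frame_ms=FRAME_MS):
    d_bits = sum(n for name, n in layout if name == "D")
    return d_bits * 1000 / frame_ms


def longest_run(layout, name):
    """Longest contiguous run of bits belonging to the field called `name`."""
    best = run = 0
    for field, n in layout:
        run = run + n if field == name else 0
        best = max(best, run)
    return best


def preamble(n=10, first=1):
    """Alternating one/zero preamble; the value of the first bit is an assumption."""
    return [(first + i) % 2 for i in range(n)]


if __name__ == "__main__":
    print(total_bits(M2_LAYOUT), d_channel_rate_bps(M2_LAYOUT))
    print(longest_run(M2_LAYOUT, "D"))  # 16 < 24, so D bits alone cannot mimic SYN
```

Because no more than 16 D bits are ever contiguous, a 24-b window can never consist of D-channel data only, which is the point of splitting the 32-b word.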


FIGURE 29.2: CT2 multiplex two burst structure.

The specific SYN sequences used by the CFP and the CPP are shown in Table 29.1 along with the so-called channel marker (CHM) sequences used for synchronization purposes by the M3 burst format. Their differences will be made explicit during our further discourse. Observe from the table that the sequences used by the CFP and CPP, namely, SYNF, CHMF and SYNP, CHMP, respectively (SYNF and SYNP are labelled SYNCF and SYNCP in the table), are each other’s bit-wise inverses. This was introduced in order to prevent CPPs and CFPs from calling each other directly. The CHM sequences are used, for instance, in residential applications, where the CFP can issue an M2 burst containing a 24-b CHMF sequence and a so-called poll message mapped on to the D-channel bits in order to wake up the specific CPP called. When the called CPP responds, the CFP changes the CHMF to SYNF in order to prevent waking up further CPPs unnecessarily.

TABLE 29.1 CT-2 Synchronization Patterns

         MSB (sent last)                    LSB (sent first)
CHMF     1011   1110   0100   1110   0101   0000
CHMP     0100   0001   1011   0001   1010   1111
SYNCF    1110   1011   0001   1011   0000   0101
SYNCP    0001   0100   1110   0100   1111   1010
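The bit-wise inverse relationship between the CFP and CPP patterns of Table 29.1 is easy to verify. The short check below copies the four patterns from the table; the helper names are ours.

```python
# Patterns copied from Table 29.1 (leftmost nibble is the MSB, sent last).
TABLE_29_1 = {
    "CHMF":  "1011 1110 0100 1110 0101 0000",
    "CHMP":  "0100 0001 1011 0001 1010 1111",
    "SYNCF": "1110 1011 0001 1011 0000 0101",
    "SYNCP": "0001 0100 1110 0100 1111 1010",
}


def bits(pattern: str) -> str:
    """Strip the nibble separators."""
    return pattern.replace(" ", "")


def invert(pattern: str) -> str:
    """Bit-wise inverse of a pattern given as a string of 0s and 1s."""
    return "".join("1" if b == "0" else "0" for b in bits(pattern))


if __name__ == "__main__":
    for fixed, portable in (("CHMF", "CHMP"), ("SYNCF", "SYNCP")):
        print(fixed, portable, invert(TABLE_29_1[fixed]) == bits(TABLE_29_1[portable]))
```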

Since the CT-2 system does not entail mobility functions, such as registration of visiting CPPs in other than their own home cells, in telepoint applications all calls must be initiated by the CPPs. Hence, in this scenario when the CPP attempts to set up a link, it uses the so-called multiplex three burst format displayed in Fig. 29.3. The design of the M3 burst reflects the fact that the CPP initiating the call is unaware of the timing structure of the potentially suitable target CFP, which can detect access attempts only during its receive window, but not while it is transmitting. Therefore, the M3 format appears rather complex at first sight, but it is well structured, as we will show in our further discussions.

Observe in the figure that in the M3 format there are five consecutive 2-ms long 144-b transmitted bursts, followed by two idle frames, during which the CPP listens in order to determine whether its 24-b CHMP sequence has been detected and acknowledged by the CFP. This process can be followed by consulting Fig. 29.6, which will be described in depth after considering the detailed construction of the M3 burst. The first four of the five 2-ms bursts are identical D-channel bursts, whereas the fifth one serves as a synchronization message and has a different construction. Observe, furthermore, that both the first four 144-b bursts and the fifth one contain four so-called submultiplex segments, each of which hosts a total of (6 + 10 + 8 + 10 + 2) = 36 b. In the first four 144-b bursts each submultiplex segment contains (6 + 8 + 2) = 16 one/zero clock-synchronizing P bits and (10 + 10) = 20 D bits or signalling bits. Since the D-channel message is constituted by two 10-b half-messages, the first half of the D-message is marked by the + sign in the figure.

As mentioned in the context of M2, the D-channel bits are split into two halves and interspersed with the preamble segments in order to ensure that these bits do not emulate valid CHM sequences. Without splitting the D bits this could happen upon concatenating the one/zero P bits with the D bits, since the tail of the SYNF and SYNP sequences is also a one/zero segment. In the fifth 144-b M3 burst, each of the four submultiplex segments is constituted by 12 preamble bits and 24 CPP channel marker (CHMP) bits. The four-fold submultiplex M3 structure ensures that irrespective of how the CFP’s receive window is aligned with the CPP’s transmission window, the CFP will be able to capture one of the four submultiplex segments of the fifth M3 burst, establish clock synchronization during the preamble, and lock on to the CHMP sequence. Once the CFP has successfully locked on to one of the CHMP words, the corresponding D-channel messages comprising the CPP identifier can be decoded. If the CPP identifier has been recognized, the CFP can attempt to reinitialize the link using its own master synchronization.

FIGURE 29.3: CT2 multiplex three burst structure.
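The M3 bit budget described above can be tallied in a short sketch. The field sizes are those quoted in the text; the placement of the preamble ahead of the CHMP word in the fifth burst and all identifier names are our assumptions.

```python
# Submultiplex segment of the first four M3 bursts: P6 | D10 | P8 | D10 | P2.
D_SUBMUX = [("P", 6), ("D", 10), ("P", 8), ("D", 10), ("P", 2)]
# Submultiplex segment of the fifth (synchronization) burst; field order assumed.
SYNC_SUBMUX = [("P", 12), ("CHMP", 24)]

SUBMUX_PER_BURST = 4
TX_BURSTS = 5      # four D-channel bursts plus one synchronization burst
IDLE_FRAMES = 2    # the CPP listens for the CFP's response
FRAME_MS = 2


def count(layout, name):
    return sum(n for field, n in layout if field == name)


def segment_bits(layout):
    return sum(n for _, n in layout)


def m3_cycle_ms():
    """Repetition period of the complete M3 pattern."""
    return (TX_BURSTS + IDLE_FRAMES) * FRAME_MS


if __name__ == "__main__":
    print(segment_bits(D_SUBMUX), segment_bits(SYNC_SUBMUX))   # bits per segment
    print(SUBMUX_PER_BURST * segment_bits(D_SUBMUX))           # bits per burst
    print(count(D_SUBMUX, "P"), count(D_SUBMUX, "D"), m3_cycle_ms())
```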

29.5

Signalling Layer Two (L2)

29.5.1 General Message Format

The signalling L2 is responsible for acknowledged and unacknowledged information transfer over the air interface, error detection and correction by retransmission, as well as for the correct ordering of messages in the acknowledged mode. Its further functions are the link end-point identification and link maintenance for both CPP and CFP, as well as the definition of the L2 and L3 interface. Compliance with the L2 specifications will ensure the adequate transport of messages between the terminals of an established link. The L2 recommendations, however, do not define the meaning of messages; this is specified by L3 messages, although some of the messages are left undefined in order to accommodate future system improvements.

The L3 messages are broken down into a number of standard packets, each constituted by one or more codewords (CW), as shown in Fig. 29.4. The codewords have a standard length of eight octets, and each packet contains up to six codewords. The first codeword in a packet is the so-called address codeword (ACW) and the subsequent ones, if present, are data codewords (DCW). The first octet of the ACW of each packet contains a variety of parameters, of which the binary flag L3 END is indicated in Fig. 29.4, and it is set to zero in the last packet. If the L3 message transmitted is mapped onto more than one packet, the packets must be numbered up to N. The address codeword is always preceded by a 16-b D-channel frame synchronization word SYNCD.

Furthermore, each eight-octet CW is protected by a 16-b parity-check word occupying its last two octets. The binary Bose–Chaudhuri–Hocquenghem BCH(63,48) code is used to encode the first six octets or 48 b by adding 15 parity b to yield 63 b. Then bit 7 of octet 8 is inverted and bit 8 of octet 8 is added such that the 64-b codeword has even parity. If there are no D-channel packets to send, a 3-octet idle message IDLE D constituted by zero/one reversals is transmitted.

The 8-octet format of the ACWs and DCWs is made explicit in Fig. 29.5, where the two parity check octets occupy octets 7 and 8. The first octet hosts a number of control bits. Specifically, bit 1 is set to logical one for an ACW and to zero for a DCW, whereas bit 2 represents the so-called format type (FT) bit. FT = 1 indicates that the variable-length packet format is used for the transfer of L3 messages, whereas FT = 0 implies that the fixed-length link setup format is used for link end-point addressing and service requests. FT is only relevant to ACWs, and in DCWs it has to be set to one.
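The codeword and packet sizes quoted above can be captured in a small sketch. It does not implement the BCH(63,48) encoder, whose generator polynomial is not given here; it only models the bit budget and the final even-parity bit, and all names are our own.

```python
INFO_BITS = 48            # first six octets
BCH_PARITY_BITS = 15      # added by the BCH(63,48) code
OVERALL_PARITY_BITS = 1   # bit 8 of octet 8
CODEWORD_BITS = INFO_BITS + BCH_PARITY_BITS + OVERALL_PARITY_BITS
SYNCD_BITS = 16           # D-channel frame synchronization word before the ACW
MAX_CODEWORDS = 6


def even_parity_bit(bits63):
    """Bit to append to 63 bits so that the 64-b codeword has even parity."""
    if len(bits63) != 63:
        raise ValueError("expected the 63 BCH-encoded bits")
    return sum(bits63) % 2


def packet_bits(n_codewords):
    """On-air size of one packet: SYNCD followed by one ACW and optional DCWs."""
    if not 1 <= n_codewords <= MAX_CODEWORDS:
        raise ValueError("a packet holds one to six codewords")
    return SYNCD_BITS + n_codewords * CODEWORD_BITS


if __name__ == "__main__":
    print(CODEWORD_BITS, packet_bits(1), packet_bits(MAX_CODEWORDS))
```

A single-codeword packet therefore occupies 80 D-channel bits, a figure that reappears in the link setup procedures.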

FIGURE 29.4: General L2 and L3 message format.


FIGURE 29.5: Fixed format packets mapped on M1, M2, and M3 during link initialization and on M1 and M2 during handshake.

29.5.2 Fixed Format Packet

As an example, let us focus our attention on the fixed format scenario associated with FT = 0. The corresponding codeword format defined for use in M1, M2, and M3 for link initiation and in M1 and M2 for handshaking is displayed in Fig. 29.5. Bits 1 and 2 have already been discussed, whereas the 2-bit link status (LS) field is used during call setup and handshaking. The encoding of the four possible LS messages is given in Table 29.2. The aim of these LS messages will become more explicit during our further discussions with reference to Fig. 29.6 and Fig. 29.7. Specifically, link request is transmitted from the CPP to the CFP either in an M3 burst as the first packet during CPP-initiated call setup and link re-establishment, or returned as a poll response in an M2 burst from the CPP to the CFP, when the CPP is responding to a call. Link grant is sent by the CFP in response to a link request originating from the CPP. In octets 5 and 6 it hosts the so-called link identification (LID) code, which is used by the CPP, for example, to address a specific CFP or a requested service. The LID is also used to maintain link reference during handshake exchanges and link re-establishment. The two remaining link status handshake messages, namely, ID OK and ID lost, are used to report to the far end whether a positive confirmation of adequate link quality has been received within the required time-out period. These issues will be revisited during our further elaborations.

Returning to Fig. 29.5, we note that the fixed packet format (FT = 0) also contains a 19-b handset identification code (HIC) and an 8-b manufacturer identification code (MIC). The concatenated HIC and MIC fields jointly form the unique 27-b portable identity code (PIC), serving as a link end-point identifier. Lastly, we have to note that bit 5 of octet 1 represents the signalling rate (SR) request/response bit, which is used by the calling party to specify the choice of the 66- or 68-b M1 format. Specifically, SR = 1 represents the four bit/burst M1 signalling format. The first 6 octets are then protected by the parity check information contained in octets 7 and 8.
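The fixed-format fields can be illustrated with a small sketch. The link status values are those of Table 29.2; the bit ordering used to concatenate HIC and MIC, the 64 B-channel bits per M1 burst (32 kb/s over a 2-ms frame), and the interpretation of SR = 0 as the two bit/burst format are our assumptions.

```python
# (LS1, LS0) -> message, as listed in Table 29.2.
LINK_STATUS = {
    (0, 0): "Link request",
    (0, 1): "Link grant",
    (1, 0): "ID OK",
    (1, 1): "ID lost",
}

HIC_BITS, MIC_BITS = 19, 8
B_BITS_PER_M1_BURST = 64  # assumption: 32 kb/s speech over a 2-ms frame


def make_pic(hic: int, mic: int) -> int:
    """27-b portable identity code; HIC is placed in the upper bits (assumed order)."""
    if not (0 <= hic < 1 << HIC_BITS and 0 <= mic < 1 << MIC_BITS):
        raise ValueError("HIC must fit in 19 bits and MIC in 8 bits")
    return (hic << MIC_BITS) | mic


def split_pic(pic: int):
    return pic >> MIC_BITS, pic & ((1 << MIC_BITS) - 1)


def m1_burst_bits(sr: int) -> int:
    """M1 burst length selected by the SR bit (SR = 0 meaning assumed)."""
    return B_BITS_PER_M1_BURST + (4 if sr == 1 else 2)


if __name__ == "__main__":
    pic = make_pic(hic=0x12345, mic=0xA7)
    print(pic.bit_length() <= 27, split_pic(pic), LINK_STATUS[(1, 0)])
    print(m1_burst_bits(0), m1_burst_bits(1))
```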


TABLE 29.2 Encoding of Link Status Messages

LS1    LS0    Message
0      0      Link request
0      1      Link grant
1      0      ID OK
1      1      ID lost

29.6

CPP-Initiated Link Setup Procedures

Calls can be initiated at both the CPP and CFP, and the call initiation and detection procedures invoked depend on which party initiated the call. Let us first consider calling channel detection at the CFP, which ensues as follows. Under the instruction of the CFP control scheme, the RF synthesizer tunes to a legitimate RF channel and after a certain settling time commences reception. Upon receiving the M3 bursts from the CPP, the automatic gain control (AGC) circuitry adjusts its gain factor, and during the 12-b preamble in the fifth M3 burst, bit synchronization is established. This specific 144-b M3 burst is transmitted every 14 ms, corresponding to every seventh 144-b burst. Now the CFP is ready to bit-synchronously correlate the received sequences with its locally stored CHMP word in order to identify any CHMP word arriving from the CPP. If no valid CHMP word is detected, the CFP may retune itself to the next legitimate RF channel, etc.

As mentioned, the call identification and link initialization process is shown in the flowchart of Fig. 29.6. If a valid 24-b CHMP word is identified, D-channel frame synchronization can take place using the 16-b SYNCD sequence, and the next 8-octet L2 D-channel message delivering the link request handshake portrayed earlier in Fig. 29.5 and Table 29.2 is decoded by the CFP. The required 16 + 64 = 80 D bits are accommodated in this scenario by the 4·20 = 80 D bits of the next four 144-b bursts of the M3 structure, where the 20 D bits of the four submultiplex segments are transmitted four times within the same burst before the D message changes. If the decoded LID code of Fig. 29.5 is recognized by the CFP, the link may be reinitialized based on the master’s timing information using the M2 burst associated with SYNF and containing the link grant message addressed to the specific CPP identified by its PIC. Otherwise the CFP returns to its scanning mode and attempts to detect the next CHMP message.

The reception of the CFP’s 24-b SYNF segment embedded in the M2 message shown previously in Fig. 29.2 allows the CPP to identify the position of the CFP’s transmit and receive windows and, hence, the CPP can now respond with another M2 burst within the receive window of the CFP. Following a number of M2 message exchanges, the CFP then sends an L3 message to instruct the CPP to switch to M1 bursts, which marks the commencement of normal voice communications and the end of the link setup session.
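The channel marker detection step can be pictured as a sliding correlation against the locally stored CHMP word. The following is only a sketch of such a detector under our own assumptions: the received bits are taken as hard decisions, the word is transmitted LSB first as indicated in Table 29.1, and the tolerated number of bit errors (max_errors) is a hypothetical parameter that the text does not specify.

```python
CHMP_TABLE = "0100 0001 1011 0001 1010 1111"  # Table 29.1, MSB on the left


def transmitted_order(pattern: str):
    """Bits in the order they are sent (LSB first, i.e., the table read right to left)."""
    return [int(b) for b in reversed(pattern.replace(" ", ""))]


def find_marker(received, marker, max_errors=0):
    """Return the first offset where `marker` matches with at most max_errors mismatches."""
    span = len(marker)
    for offset in range(len(received) - span + 1):
        window = received[offset:offset + span]
        mismatches = sum(r != m for r, m in zip(window, marker))
        if mismatches <= max_errors:
            return offset
    return None


if __name__ == "__main__":
    chmp = transmitted_order(CHMP_TABLE)
    preamble = [1, 0] * 6                 # 12-b alternating preamble (first bit assumed)
    segment = preamble + chmp             # one submultiplex segment of the fifth M3 burst
    print(find_marker(segment, chmp))     # offset of the CHMP word in the segment
    print(find_marker(preamble * 3, chmp))  # no marker present
```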

29.7

CFP-Initiated Link Setup Procedures

Similar procedures are followed when the CPP is being polled. The CFP transmits the 24-b CHMF words hosted by the 24-b SYN segment of the M2 burst shown in Fig. 29.2 in order to indicate that one or more CPPs are being paged. This process is displayed in the flowchart of Fig. 29.7, as well as in the timing diagram displayed in Fig. 29.8. The M2 D-channel messages convey the identifiers of the polled CPPs. The CPPs keep scanning all 40 legitimate RF channels in order to pinpoint any 24-b CHMF words.

FIGURE 29.6: Flowchart of the CT-2 link initialization by the CPP.

Explicitly, the CPP control scheme notifies the RF synthesizer to retune to the next legitimate RF channel if no CHMF words have been found on the current one. The synthesizer needs a finite time to settle on the new center frequency and then starts receiving again. Observe in Fig. 29.8 that at this stage only the CFP is transmitting the M2 bursts; hence, the uplink half of the 2-ms TDD frame is unused. Since the M2 burst commences with the D-channel bits arriving from the CFP, the CPP receiver’s AGC will have to settle during this 16-b interval, which corresponds to about 16 × 1/(72 kb/s) ≈ 0.22 ms. Upon the arrival of the 10 alternating one/zero preamble bits, bit synchronization is established. Now the CPP is ready to detect the CHMF word using a simple correlator circuit, which establishes the appropriate frame synchronization. If, however, no CHMF word is detected within the receive window, the synthesizer will be retuned to the next RF channel, and the same procedure is repeated until a CHMF word is detected.

FIGURE 29.7: Flowchart of the CT-2 link initialization by the CFP.

FIGURE 29.8: CT-2 call detection by the CPP.

When a CHMF word is correctly decoded by the CPP, the CPP is capable of frame- and bit-synchronously decoding the D-channel bits. Upon decoding the D-channel message of the M2 burst, the CPP identifier (ID) constituted by the LID and PIC segments of Fig. 29.5 is detected and compared to the CPP’s own ID in order to decide whether the call is for this specific CPP. If so, the CPP ID is reflected back to the CFP along with a SYNP word, which is included in the SYN segment of an uplink M2 burst. This channel scanning and retuning process continues until a legitimate incoming call is detected or the CPP intends to initiate a call. More precisely, if the specific CPP in question is polled and its own ID is recognized, the CPP sends its poll response message in three consecutive M2 bursts, since the capacity of a single M2 burst is 32 D bits only, while the handshake messages of Fig. 29.5 and Table 29.2 require 8 octets preceded by a 16-b SYNCD segment. If by this time all paged CPPs have responded, the CFP changes the CHMF word to a SYNF word in order to prevent activating dormant CPPs that are not being paged. If any of the paged CPPs intends to set up the link, then it will change its poll response to an L2 link request message, in response to which the CFP will issue an M2 link grant message, as seen in Fig. 29.7, and from now on the procedure is identical to that of the CPP-initiated link setup portrayed in Fig. 29.6.
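The two numerical statements in this section, namely the AGC settling interval and the three M2 bursts needed for a poll response, follow directly from the burst parameters. A minimal check, with function names of our own choosing:

```python
import math


def agc_settling_time_ms(bits: int = 16, bit_rate_bps: int = 72_000) -> float:
    """Duration of the leading D-channel segment during which the AGC must settle."""
    return bits * 1000 / bit_rate_bps


def m2_bursts_needed(message_bits: int = 64, syncd_bits: int = 16,
                     d_bits_per_burst: int = 32) -> int:
    """Number of M2 bursts needed to carry SYNCD plus one 8-octet codeword."""
    return math.ceil((syncd_bits + message_bits) / d_bits_per_burst)


if __name__ == "__main__":
    print(round(agc_settling_time_ms(), 3))  # in ms
    print(m2_bursts_needed())
```

The 80 bits fill two and a half bursts, so the third M2 burst is only partly used by the poll response.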

29.8

Handshaking

Having established the link, voice communications is maintained using M1 bursts, and the link quality is monitored by sending handshaking (HS) signalling messages using the D-channel bits. Handshake messages must be sent at intervals of between 400 ms and 1000 ms. The CT-2 codewords ID OK, ID lost, link request, and link grant of Table 29.2 all represent valid handshakes. When using M1 bursts, however, the transmission of these 8-octet messages using the 2- or 4-b/2 ms D-channel segment must be spread over 32 or 16 M1 bursts, corresponding to 64 or 32 ms, respectively.
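The time needed to convey one 8-octet handshake codeword over the thin M1 D channel can be computed as follows. This is a sketch with our own function name, and it counts the 64-b codeword only, as the text does.

```python
def m1_handshake_span(d_bits_per_burst: int, codeword_bits: int = 64, frame_ms: int = 2):
    """Return (number of M1 bursts, duration in ms) needed for one codeword."""
    if codeword_bits % d_bits_per_burst:
        raise ValueError("codeword does not divide evenly into bursts")
    bursts = codeword_bits // d_bits_per_burst
    return bursts, bursts * frame_ms


if __name__ == "__main__":
    for d_bits in (2, 4):
        print(d_bits, m1_handshake_span(d_bits))
```

Either duration is well below the minimum 400-ms handshake interval, so the D channel is not saturated by handshaking.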

Let us now focus our attention on the handshake protocol shown in Fig. 29.9. Suppose that the CPP’s handshake interval of Thtx_p = 0.4 s since the start of the last transmitted handshake has expired, and hence the CPP prepares to send a handshake message HS_p. If the CPP has received a valid HS_f message from the CFP within the last Thrx_p = 1 s, the CPP sends an HS_p = ID OK message to the CFP, otherwise an HS_p = ID Lost message. Furthermore, if the valid handshake was HS_f = ID OK, the CPP will reset its HS_f lost timer Thlost_p to 10 s. The CFP will maintain a 1-s timer referred to as Thrx_f, which is reset to its initial value upon the reception of a valid HS_p from the CPP.

FIGURE 29.9: CT-2 handshake algorithms.

The CFP’s actions also follow the structure of Fig. 29.9 upon simply interchanging CPP with CFP and the descriptor p with f. If the Thrx_f = 1 s timer expires without the reception of a valid HS_p from the CPP, then the CFP will send its ID Lost HS_f message to the CPP instead of the ID OK message and will not reset the Thlost_f = 10 s timer. If, however, the CFP happens to detect a valid HS_p, which can be any of the ID OK, ID Lost, link request, and link grant messages of Table 29.2, arriving from the CPP, the CFP will reset its Thrx_f = 1 s timer and resume transmitting the ID OK HS_f message instead of the ID Lost. Should any of the HS messages go astray for more than 3 s, the CPP or the CFP may try to re-establish the link on the current or another RF channel. Again, although any of the ID OK, ID Lost, link request, and link grant messages represent valid handshakes, only the reception of the ID OK HS message is allowed to reset the Thlost = 10 s timer at both the CPP and CFP. If this timer expires, the link will be relinquished and the call dropped.

The handshake mechanism is further illustrated in Fig. 29.10, where two different scenarios are exemplified, portraying the situation when the HS message sent by the CPP to the CFP is lost or, conversely, when that transmitted by the CFP is corrupted.

FIGURE 29.10: Handshake loss scenarios.

Considering the first scenario, during error-free communications the CPP sends HS_p = ID OK, and upon receiving it the CFP resets its Thlost_f timer to 10 s. In due course it sends an HS_f = ID OK acknowledgement, which also arrives free from errors. The CPP resets its Thlost_p timer to 10 s and, after the elapse of the 0.4–1 s handshake interval, issues an HS_p = ID OK message, which does not reach the CFP. Hence, the Thlost_f timer is now reduced to 9 s and an HS_f = ID Lost message is sent to the CPP. Upon reception of this, the CPP cannot reset its Thlost_p timer to 10 s but can respond with an HS_p = ID OK message, which again goes astray, forcing the CFP to further reduce its Thlost_f timer to 8 s. The CFP issues the valid handshake HS_f = ID Lost, which arrives at the CPP, where the lack of HS_f = ID OK reduces Thlost_p to 8 s. Now the corruption of the issued HS_p = ID OK reduces Thlost_f to 7 s, in which event the link may be reinitialized using the M3 burst. The second example of Fig. 29.10 can be followed similarly for the scenario in which the HS_f message is corrupted.
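The timer rules of the handshake protocol can be summarized in a small state model. This is a simplified sketch of one link end, not the algorithm of Fig. 29.9 itself: the class and method names are ours, time is advanced explicitly, and the 3-s re-establishment rule is omitted.

```python
from dataclasses import dataclass

ID_OK, ID_LOST = "ID OK", "ID lost"
LINK_REQUEST, LINK_GRANT = "Link request", "Link grant"
VALID_HANDSHAKES = {ID_OK, ID_LOST, LINK_REQUEST, LINK_GRANT}


@dataclass
class HandshakeEnd:
    """One end of the link (CPP or CFP) with its Thrx and Thlost timers."""
    t_hrx: float = 1.0      # a valid handshake must have arrived within this window
    t_hlost: float = 10.0   # the call is dropped after this long without ID OK
    now: float = 0.0
    last_valid_rx: float = 0.0
    last_id_ok_rx: float = 0.0

    def advance(self, seconds: float) -> None:
        self.now += seconds

    def receive(self, message: str) -> None:
        if message in VALID_HANDSHAKES:
            self.last_valid_rx = self.now      # any valid handshake resets Thrx
        if message == ID_OK:
            self.last_id_ok_rx = self.now      # only ID OK resets Thlost

    def next_handshake(self) -> str:
        """Message to send when the handshake transmit interval expires."""
        heard_recently = self.now - self.last_valid_rx <= self.t_hrx
        return ID_OK if heard_recently else ID_LOST

    def link_lost(self) -> bool:
        return self.now - self.last_id_ok_rx >= self.t_hlost


if __name__ == "__main__":
    cpp = HandshakeEnd()
    cpp.receive(ID_OK)
    cpp.advance(0.4)
    print(cpp.next_handshake())   # far end heard recently
    cpp.advance(1.0)
    print(cpp.next_handshake())   # nothing valid received for more than 1 s
```

Note that receiving ID lost keeps the end reporting ID OK, yet it does not stop the 10-s countdown, which is the asymmetry stressed in the text.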

29.9

Main Features of the CT-2 System

In our previous discourse we have given an insight into the algorithmic procedures of the CT-2 MPT 1375 recommendation. We have briefly highlighted the four-part structure of the standard dealing with the radio interface, signalling layers 1 and 2, signalling layer 3, and the speech coding issues, respectively. There are forty 100-kHz wide RF channels in the band 864.15–868.15 MHz, and the 72 kb/s bit stream modulates a Gaussian filtered FSK modem. The multiple access technique is TDD, transmitting 2-ms duration, 144-b M1 bursts during normal voice communications, which deliver the 32-kb/s ADPCM-coded speech signal. During link establishment the M2 and M3 bursts are used, which were also portrayed in this treatise, along with a range of handshaking messages and scenarios.

Defining Terms

AFC: Automatic frequency correction
CAI: Common air interface
CFP: Cordless fixed part
CHM: Channel marker sequence
CHMF: CFP channel marker
CHMP: CPP channel marker
CPP: Cordless portable part
CT: Cordless telephone
DCA: Dynamic channel allocation
DCW: Data code word
DECT: Digital European cordless telecommunications system
FT: Frame format type bit
GFSK: Gaussian frequency shift keying
GP: Guard period
HIC: Handset identification code
HS: Handshaking
ID: Identifier
L2: Signalling layer 2
L3: Signalling layer 3
LAN: Local area network
LID: Link identification
LOS: Line of sight
LS: Link status
M1: Multiplex one burst format
M2: Multiplex two burst format
M3: Multiplex three burst format
MIC: Manufacturer identification code
MPT-1375: British CT2 standard
PCN: Personal communications network
PIC: Portable identification code
PLMR: Public land mobile radio
SNR: Signal-to-noise ratio
SR: Signalling rate bit
SYN: Synchronization sequence
SYNCD: 16-b D-channel frame synchronization word
TDD: Time division duplex multiple access scheme
TP: Telepoint

References

[1] Asghar, S., Digital European cordless telephone (DECT), In The Mobile Communications Handbook, Chap. 30, CRC Press, Boca Raton, FL, 1995.
[2] Gardiner, J.G., Second generation cordless (CT-2) telephony in the UK: telepoint services and the common air-interface, Elec. & Comm. Eng. J., 71–78, Apr. 1990.
[3] Hanzo, L., The Pan-European mobile radio system, In The Mobile Communications Handbook, Chap. 25, CRC Press, Boca Raton, FL, 1995.
[4] Jabbari, B., Dynamic channel assignment, In The Mobile Communications Handbook, Chap. 21, CRC Press, Boca Raton, FL, 1995.
[5] Ochsner, H., The digital European cordless telecommunications specification DECT. In Cordless Telecommunication in Europe, Tuttlebee, W.H.W., Ed., 273–285, Springer-Verlag, 1990.
[6] Steedman, R.A.J., The Common Air Interface MPT 1375. In Cordless Telecommunication in Europe, Tuttlebee, W.H.W., Ed., 261–272, Springer-Verlag, 1990.
[7] Steele, R., Ed., Mobile Radio Communications, Pentech Press, London, 1992.
[8] Tuttlebee, W.H.W., Ed., Cordless Telecommunication in Europe, Springer-Verlag, 1990.


Chan, W.; Gerson, I. & Miki, T. “Half-Rate Standards” Mobile Communications Handbook Ed. Jerry D. Gibson Boca Raton: CRC Press LLC, 1999


Half-Rate Standards

Wai-Yip Chan Illinois Institute of Technology

Ira Gerson Motorola Corporate Systems Research Laboratories

Toshio Miki NTT Mobile Communication Network, Inc.

30.1 Introduction
30.2 Speech Coding for Cellular Mobile Radio Communications
30.3 Codec Selection and Performance Requirements
30.4 Speech Coding Techniques in the Half-Rate Standards
30.5 Channel Coding Techniques in the Half-Rate Standards
30.6 The Japanese Half-Rate Standard
30.7 The European GSM Half-Rate Standard
30.8 Conclusions
Defining Terms
Acknowledgment
References
Further Information

30.1

Introduction

A half-rate speech coding standard specifies a procedure for digital transmission of speech signals in a digital cellular radio system. The speech processing functions that are specified by a half-rate standard are depicted in Fig. 30.1. An input speech signal is processed by a speech encoder to generate a digital representation at a net bit rate of Rs bits per second. The encoded bit stream representing the input speech signal is processed by a channel encoder to generate another bit stream at a gross bit rate of Rc bits per second, where Rc > Rs. The channel encoded bit stream is organized into data frames, and each frame is transmitted as payload data by a radio-link access controller and modulator. The net bit rate Rs counts the number of bits used to describe the speech signal, and the difference between the gross and net bit rates (Rc − Rs) counts the number of error protection bits needed by the channel decoder to correct and detect transmission errors. The output of the channel decoder is given to the speech decoder to generate a quantized version of the speech encoder’s input signal. In current digital cellular radio systems that use time-division multiple access (TDMA), a voice connection is allocated a fixed transmission rate (i.e., Rc is a constant). The operations performed by the speech and channel encoders and decoders and their input and output data formats are governed by the half-rate standards.

FIGURE 30.1: Digital speech transmission for digital cellular radio. Boxes with solid outlines represent processing modules that are specified by the half-rate standards.

Globally, three major TDMA cellular radio systems have been developed and deployed. The initial digital speech services offered by these cellular systems were governed by full-rate standards. Because of the rapid growth in demand for cellular services, the available transmission capacity in some areas is frequently saturated, eroding customer satisfaction. By providing essentially the same voice quality but at half the gross bit rates of the full-rate standards, half-rate standards can readily double the number of callers that can be serviced by the cellular systems. The gross bit rates of the full-rate and half-rate standards for the European Groupe Spécial Mobile (GSM), Japanese Personal Digital Cellular¹ (PDC), and North American cellular (IS-54) systems are listed in Table 30.1. The three systems were developed and deployed under different timetables. Their disparate full- and half-rate bit rates partly reflect this difference.

At the time of writing (January, 1995), the European and the Japanese systems have each selected an algorithm for their respective half-rate codec. Standardization of the North American half-rate codec has not reached a conclusion, as none of the candidate algorithms has fully satisfied the standard’s requirements. Thus, we focus here on the Japanese and European half-rate standards and will only touch upon the requirements of the North American standard.

TABLE 30.1 Gross Bit Rates Used for Digital Speech Transmission in Three TDMA Cellular Radio Systems

                                                                  Gross Bit Rate, b/s
Standard Organization and Digital Cellular System                Full Rate    Half Rate
European Telecommunications Standards Institute (ETSI), GSM      22,800       11,400
Research & Development Center for Radio Systems (RCR), PDC       11,200       5,600
Telecommunication Industries Association (TIA), IS-54            13,000       6,500

30.2

Speech Coding for Cellular Mobile Radio Communications

Unlike the relatively benign transmission media commonly used in the public-switched telephone network (PSTN) for analog and digital transmission of speech signals, mobile radio channels are impaired by various forms of fading and interference effects. Whereas proper engineering of the radio link elements (modulation, power control, diversity, equalization, frequency allocation, etc.) ameliorates fading effects, burst and isolated bit errors still occur frequently. The net effect is such that speech communication may be required to be operational even for bit-error rates greater than 1%. In order to furnish reliable voice communication, typically half of the transmitted payload bits are devoted to error correction and detection.

It is common for low-bit-rate speech codecs to process samples of the input speech signal one frame at a time, e.g., 160 samples processed once every 20 ms. Thus, a certain amount of time is required to gather a block of speech samples, encode them, perform channel encoding, transport the encoded data over the radio channel, and perform channel decoding and speech synthesis. These processing steps of the speech codec add to the overall end-to-end transmission delay. Long transmission delay hampers conversational interaction. Moreover, if the cellular system is interconnected with the PSTN and a four-wire to two-wire (analog) circuit conversion is performed in the network, feedback in the form of echoes may be generated across the conversion circuit. The echoes can be heard by the originating talker as a delayed and distorted version of his/her speech and can be quite annoying. The annoyance level increases with the transmission delay and may necessitate (at additional cost) the deployment of echo cancellers.

A consequence of user mobility is that the level and other characteristics of the acoustic background noise can be highly variable. Though acoustic noise can be minimized through suitable acoustic transduction design and the use of adaptive filtering/cancellation techniques [9, 13, 15], the speech encoding algorithm still needs to be robust against background noise of various levels and kinds (e.g., babble, music, noise bursts, and colored noise).

Processing complexity directly impacts the viability of achieving a circuit realization that is compact and has low power consumption, two key enabling factors of equipment portability for the end user. Factors that tend to result in low complexity are fixed-point instead of floating-point computation, lack of complicated arithmetic operations (division, square roots, transcendental functions), regular algorithm structure, small data memory, and small program memory.

¹ Personal Digital Cellular was formerly Japanese Digital Cellular (JDC).
Since, in general, better speech quality can be achieved with increasing speech and channel coding delay and complexity, the digital cellular mobile-radio environment imposes conflicting and challenging requirements on the speech codec.
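To relate the frame-based processing described in this section to the gross bit rates of Table 30.1, the sketch below computes the sampling rate implied by the 160-sample, 20-ms example and the number of gross bits available per frame. The 20-ms frame is merely the example quoted in the text; the actual frame lengths of the individual standards are not stated here, so applying it to all three systems is our assumption, as are the names used.

```python
# (full rate, half rate) gross bit rates in b/s, from Table 30.1.
GROSS_BIT_RATES = {
    "GSM": (22_800, 11_400),
    "PDC": (11_200, 5_600),
    "IS-54": (13_000, 6_500),
}


def sampling_rate_hz(samples_per_frame: int = 160, frame_ms: int = 20) -> float:
    return samples_per_frame * 1000 / frame_ms


def gross_bits_per_frame(rate_bps: int, frame_ms: int = 20) -> float:
    """Gross (speech plus error protection) bits carried per frame; frame length assumed."""
    return rate_bps * frame_ms / 1000


if __name__ == "__main__":
    print(sampling_rate_hz())
    for system, (full, half) in GROSS_BIT_RATES.items():
        print(system, gross_bits_per_frame(full), gross_bits_per_frame(half))
```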

30.3

Codec Selection and Performance Requirements

The half-rate speech coding standards are drawn up through competitive testing and selection. From a set of candidate codec algorithms submitted by contending organizations, the one algorithm that meets basic selection criteria and offers the best performance is selected to form the standard. The codec performance measures and codec testing and selection procedures are set out in a test plan under the auspices of the organization (Table 30.1) responsible for the standardization process (see, e.g., [16]). Major codec characteristics evaluated are speech quality, delay, and complexity. The full-rate codec is also evaluated as a reference codec, and its evaluation scores form part of the selection criteria for the codec candidates.

The speech quality of each candidate codec is evaluated through listening tests. To conduct the tests, each candidate codec is required to process speech signals and/or encoded bit streams that have been preprocessed to simulate a range of operating conditions: variations in speaker voice and level, acoustic background noise type and level, channel error rate, and stages of tandem coding. During the tests, subjects listen to processed speech signals and judge their quality levels or annoyance levels on a five-point opinion scale. The opinion scores collected from the tests are suitably averaged over all trials and subjects for each test condition (see [11] for mean opinion score (MOS) and degradation mean opinion score).

The categorical opinion scales of the subjects are also calibrated using modulated noise reference units (MNRUs) [3]. Modulated noise better resembles the distortions created by speech codecs than noise that is uncorrelated with the speech signal. Modulated noise is generated by multiplying the speech signal with a noise signal. The resultant modulated noise is scaled to a desired power level and then added to the uncoded (clean) speech signal. The ratio between the power level of the speech signal and that of the modulated noise is expressed in decibels
and given the notation dBQ. Under each test condition, subjects are presented with speech signals processed by the codecs as well as speech signals corrupted by modulated noise. Through presenting a range of modulated-noise levels, the subjects’ opinions are calibrated on the dBQ scale. Thereafter, the mean opinion scores obtained for the codecs can also be expressed on that scale. For each codec candidate, a profile of scores is compiled, consisting of speech quality scores, delay measurements, and complexity estimates. Each candidate’s score profile is compared with that of the reference codec, ensuring that basic requirements are satisfied (see, e.g., [12]). An overall figure of merit for each candidate is also computed from the profile. The candidates, if any, that meet the basic requirements then compete on the basis of maximizing the figure of merit. Basic performance requirements for each of the three half-rate standards are summarized in Table 30.2. In terms of speech quality, the GSM and PDC half-rate codecs are permitted to underperform their respective full-rate codecs by no more than 1 dBQ averaging over all test conditions and no more than 3 dBQ within each test condition. More stringently, the North American half-rate codec is required to furnish a speech-quality profile that is statistically equivalent to that of the North American full-rate codec as determined by a specific statistical procedure for multiple comparisons [16]. Since various requirements on the half-rate standards are set relative to their full-rate counterparts, an indication of the relative speech quality between the three half-rate standards can be deduced from the test results of De Martino [2] comparing the three full-rate codecs. The maximum delays in Table 30.2 apply to the total of the delays through the speech and channel encoders and decoders (Fig. 30.1). 
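The modulated-noise reference described earlier in this section can be sketched as follows. This is a minimal illustration and not the standardized MNRU procedure: the function names, the Gaussian noise source, the seed, and the test tone are all assumptions made for the example. The sketch multiplies the speech by noise, scales the product so that the speech-to-modulated-noise power ratio equals a requested value in dB, and adds the result to the clean speech.

```python
import math
import random


def mean_power(signal):
    """Average power (mean of squared samples) of a sequence."""
    return sum(s * s for s in signal) / len(signal)


def modulated_noise_reference(speech, q_db, seed=0):
    """Return (degraded speech, modulated noise) at a power ratio of q_db.

    Hypothetical helper: Gaussian noise and the seed are illustrative choices.
    """
    rng = random.Random(seed)
    # Modulated noise: the speech multiplied sample by sample with noise.
    modulated = [s * rng.gauss(0.0, 1.0) for s in speech]
    # Scale so that 10*log10(P_speech / P_noise) equals q_db.
    target_power = mean_power(speech) / (10.0 ** (q_db / 10.0))
    scale = math.sqrt(target_power / mean_power(modulated))
    noise = [scale * m for m in modulated]
    degraded = [s + n for s, n in zip(speech, noise)]
    return degraded, noise


def ratio_db(speech, noise):
    """Speech-to-noise power ratio in decibels."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))


# Stand-in for speech: a 200-Hz tone sampled at 8 kHz (assumed values).
tone = [math.sin(2.0 * math.pi * 200.0 * n / 8000.0) for n in range(800)]
degraded, noise = modulated_noise_reference(tone, q_db=20.0)
```

Because the noise is a product with the speech, it vanishes wherever the speech is silent, which is the property that makes it resemble codec distortion more closely than additive uncorrelated noise.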
Codec complexity is computed using a formula that counts the computational operations and memory usage of the codec algorithm. The complexity of the half-rate codecs is limited to 3 or 4 times that of their full-rate counterparts.

TABLE 30.2 Basic Performance Requirements for the Three Half-Rate Standards

                            Basic performance requirements
Digital Cellular Systems    Min. Speech Quality, dBQ Rel. to Full Rate    Max. Delay, ms    Max. Complexity Rel. to Full Rate
Japanese (PDC)              −1 average, −3 maximum                        94.8              3×
European (GSM)              −1 average, −3 maximum                        90                4×
North American (IS-54)      Statistically equivalent                      100               4×

30.4

Speech Coding Techniques in the Half-Rate Standards

Existing half-rate and full-rate standard coders can be characterized as linear-prediction based analysis-by-synthesis (LPAS) speech coders [4]. LPAS coding entails using a time-varying all-pole filter in the decoder to synthesize the quantized speech signal. A short segment of the signal is synthesized by driving the filter with an excitation signal that is either quasiperiodic (for voiced speech) or random (for unvoiced speech). In either case, the excitation signal has a spectral envelope that is relatively flat. The synthesis filter serves to shape the spectrum of the excitation input so that the spectral envelope of the synthesized output resembles the filter’s magnitude frequency response. The magnitude response often has prominent peaks; they render the formants that give a speech signal its phonetic character. The synthesis filter has to be adapted to the current frame of input speech signal. This is accomplished with the encoder performing a linear prediction (LP) analysis of the frame: the inverse of the all-pole synthesis filter is applied as an LP error filter to the frame, and the values of the filter parameters are
computed to minimize the energy of the filter’s output error signal. The resultant filter parameters are quantized and conveyed to the decoder for it to update the synthesis filter. Having executed an LP analysis and quantized the synthesis filter parameters, the LPAS encoder performs analysis-by-synthesis (ABS) on the input signal to find a suitable excitation signal. An ABS encoder maintains a copy of the decoder. The encoder examines the possible outputs that can be produced by the decoder copy in order to determine how best to instruct (using transmitted information) the actual decoder so that it would output (synthesize) a good approximation of the input speech signal. The decoder copy tracks the state of the actual decoder, since the latter evolves (under ideal channel conditions) according to information received from the encoder. The details of the ABS procedure vary with the particular excitation model employed in a specific coding scheme. One of the earliest seminal LPAS schemes is code excited linear prediction (CELP) [4]. In CELP, the excitation signal is obtained from a codebook of code vectors, each of which is a candidate for the excitation signal. The encoder searches the codebook to find the one code vector that would result in a best match between the resultant synthesis output signal and the encoder’s input speech signal. The matching is considered best when the energy of the difference between the two signals being matched is minimized. A perceptual weighting filter is usually applied to the difference signal (prior to energy integration) to make the minimization more relevant to human perception of speech fidelity. Regions in the frequency spectrum where human listeners are more sensitive to distortions are given relatively stronger weighting by the filter and vice versa. 
For instance, the concentration of spectral energy around the formant frequencies gives rise to stronger masking of coder noise (i.e., rendering the noise less audible) and, therefore, weaker weighting can be applied to the formant frequency regions. For masking to be effective, the weighting filter has to be adapted to the time-varying speech spectrum. Adaptation is achieved usually by basing the weighting filter parameters on the synthesis filter parameters.

The CELP framework has evolved to form the basis of a great variety of speech coding algorithms, including all existing full- and half-rate standard algorithms for digital cellular systems. We outline next the basic CELP encoder-processing steps, in a form suited to our subsequent detailed descriptions of the PDC and GSM half-rate coders. These steps have accounted for various computational efficiency considerations and may, therefore, deviate from a conceptual functional description of the encoder constituents.

1. LP analysis on the current frame of input speech to determine the coefficients of the all-pole synthesis filter;
2. quantization of the LP filter parameters;
3. determination of the open-loop pitch period or lag;
4. adapting the perceptual weighting filter to the current LP information (and also pitch information when appropriate) and applying the adapted filter to the input speech signal;
5. formation of a filter cascade (which we shall refer to as perceptually weighted synthesis filter) consisting of the LP synthesis filter, as specified by the quantized parameters in step 2, followed by the perceptual weighting filter;
6. subtraction of the zero-input response of the perceptually weighted synthesis filter (the filter’s decaying response due to past input) from the perceptually weighted input speech signal obtained in step 4;
7. an adaptive codebook is searched to find the most suitable periodic excitation, i.e., when the perceptually weighted synthesis filter is driven by the best code vector from the adaptive codebook, the output of the filter cascade should best match the difference signal obtained in step 6;
8. one or more nonadaptive excitation codebooks are searched to find the most suitable random excitation vectors that, when added to the best periodic excitation as determined in step 7 and with the resultant sum signal driving the filter cascade, would result in an output signal best matching the difference signal obtained in step 6.

Steps 1–6 are executed once per frame. Steps 7 and 8 are executed once for each of the subframes that together constitute a frame. Step 7 may be skipped depending on the pitch information from step 3, or if step 7 were always executed, a nonperiodic excitation decision would be one of the possible outcomes of the search process in step 7. Integral to steps 7 and 8 is the determination of gain (scaling) parameters for the excitation vectors. For each frame of input speech, the filter and excitation and gain parameters determined as outlined are conveyed as encoded bits to the speech decoder.

In a properly designed system, the data conveyed by the channel decoder to the speech decoder should be free of errors most of the time, and the speech signal synthesized by the speech decoder would be identical to that as determined in the speech encoder’s ABS operation. It is common to enhance the quality of the synthesized speech by using an adaptive postfilter to attenuate coder noise in the perceptually sensitive regions of the spectrum. The postfilter of the decoder and the perceptual weighting filter of the encoder may seem to be functionally identical. The weighting filter, however, influences the selection of the best excitation among available choices, whereas the postfilter actually shapes the spectrum of the synthesized signal. Since postfiltering introduces its own distortion, its advantage may be diminished if tandem coding occurs along the end-to-end communication path. Nevertheless, proper design can ensure that the net effect of postfiltering is a reduction in the amount of audible codec noise [1].
Excepting postfiltering, all other speech synthesis operations of an LPAS decoder are (effectively) duplicated in the encoder (though the converse is not true). Using this fact, we shall illustrate each coder in the sequel by exhibiting only a block diagram of its encoder or decoder but not both.
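The closed-loop search at the heart of steps 7 and 8 can be sketched as below. The sketch is deliberately reduced and is not the procedure of any of the standards: it omits perceptual weighting, the zero-input response, and the adaptive codebook, and the filter coefficient, codebook, and target are made-up values. It only shows the analysis-by-synthesis idea: every candidate code vector is passed through a copy of the decoder's all-pole synthesis filter, the optimal gain is computed, and the candidate with the smallest error energy is kept.

```python
def synthesize(excitation, lpc):
    """All-pole synthesis filter with zero initial state.

    Assumed convention: s[n] = e[n] - sum_k lpc[k] * s[n - 1 - k].
    """
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc -= a * out[n - 1 - k]
        out.append(acc)
    return out


def search_codebook(target, codebook, lpc):
    """Return (index, gain, error energy) of the best-matching code vector."""
    target_energy = sum(t * t for t in target)
    best = None
    for index, vector in enumerate(codebook):
        y = synthesize(vector, lpc)          # what the decoder would output
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue
        corr = sum(t * v for t, v in zip(target, y))
        gain = corr / energy                 # gain minimizing the error energy
        error = target_energy - corr * corr / energy
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best


# Toy data (hypothetical): one-pole synthesis filter and a four-entry codebook.
lpc = [-0.9]
codebook = [[1, 0, 0, 0], [0, 1, 0, 0], [1, -1, 0, 0], [0, 0, 1, 1]]
target = synthesize([2 * v for v in codebook[2]], lpc)
index, gain, error = search_codebook(target, codebook, lpc)
```

Here the target was built from the third code vector scaled by 2, so the search recovers that entry and gain with essentially zero residual error.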

30.5

Channel Coding Techniques in the Half-Rate Standards

Crucial to the maintenance of quality speech communication is the ability to transport coded speech data across the radio channel with minimal errors. Low-bit-rate LPAS coders are particularly sensitive to channel errors; errors in the bits representing the LP parameters in one frame, for instance, could result in the synthesis of nonsensical sounds for longer than a frame duration. The error rate of a digital cellular radio channel with no channel coding can be catastrophically high for LPAS coders. The amount of tolerable transmission delay is limited by the requirement of interactive communication and, consequently, forward error control is used to remedy transmission errors. “Forward” means that channel errors are remedied in the receiver, with no additional information from the transmitter and, hence, no additional transmission delay. To enable the channel decoder to correct channel errors, the channel encoder conveys more bits than the amount generated by the speech encoder. The additional bits are for error protection, as errors may or may not occur in any particular transmission epoch. The ratio of the number of encoder input (information) bits to the number of encoder output (code) bits is called the (channel) coding rate. This is a number no more than one and generally decreases as the error protection power increases. Though a lower channel coding rate gives more error protection, fewer bits will be available for speech coding. When the channel is in good condition and, hence, less error protection is needed, the received speech quality could be better if bits devoted to channel coding were used for speech coding. On the other hand, if a high channel coding rate were used, there would be uncorrected errors under poor channel conditions and speech quality would suffer.
Thus, when nonadaptive forward error protection is used over channels with nonstationary statistics, there is an inevitable tradeoff between quality degradation due to uncorrected errors and that due to expending bits on error protection (instead of on speech encoding). Both the GSM and PDC half-rate coders use convolutional coding [14] for error correction. Convolutional codes are sliding or sequential codes. The encoder of a rate m/n, m < n convolutional code can be realized using m shift registers. For every m information bits input to the encoder (one bit to each of the m shift registers), n code bits are output to the channel. Each code bit is computed as a modulo-2 sum of a subset of the bits in the shift registers. Error protection overhead can be reduced by exploiting the unequal sensitivity of speech quality to errors in different positions of the encoded bit stream. A family of rate-compatible punctured convolutional codes (RCPCCs) [10] is a collection of related convolutional codes; all of the codes in the collection except the one with the lowest rate are derived by puncturing (dropping) code bits from the convolutional code with the lowest rate. With an RCPCC, the channel coding rate can be varied on the fly (i.e., variable-rate coding) while a sequence of information bits is being encoded through the shift registers, thereby imparting on different segments in the sequence different degrees of error protection. For decoding a convolutional coded bit stream, the Viterbi algorithm [14] is a computationally efficient procedure. Given the output of the demodulator, the algorithm determines the most likely sequence of data bits sent by the channel encoder. To fully utilize the error correction power of the convolutional code, the amplitude of the demodulated channel symbol can be quantized to more bits than the minimum number required, i.e., for subsequent soft decision decoding. 
The minimum number of bits is given by the number of channel-coded bits mapped by the modulator onto each channel symbol; decoding based on the minimum-rate bit stream is called hard decision decoding. Although soft decoding gives better error protection, decoding complexity is also increased. Whereas convolutional codes are most effective against randomly scattered bit errors, errors on cellular radio channels often occur in bursts of bits. These bursts can be broken up if the bits put into the channel are rearranged after demodulation. Thus, in block interleaving, encoded bits are read into a matrix by row and then read out of the matrix by column (or vice versa) and then passed on to the modulator; the reverse operation is performed by a deinterleaver following demodulation. Interleaving increases the transmission delay to the extent that enough bits need to be collected in order to fill up the matrix. Owing to the severe nature of the cellular radio channel and limited available transmission capacity, uncorrected errors often remain in the decoded data. A common countermeasure is to append an error detection code to the speech data stream prior to channel coding. When residual channel errors are detected, the speech decoder can take various remedial measures to minimize the negative impact on speech quality. Common measures are repetition of speech parameters from the most recent good frames and gradual muting of the possibly corrupted synthesized speech. The PDC and GSM half-rate standard algorithms together embody some of the latest advances in speech coding techniques, including: multimodal coding where the coder configuration and bit allocation change with the type of speech input; vector quantization (VQ) [5] of the LP filter parameters; higher precision and improved coding efficiency for pitch-periodic excitation; and postfiltering with improved tandeming performance. We next explore the more distinctive features of the PDC and GSM speech coders.
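The channel-coding building blocks described in this section can be sketched as follows. Everything specific in the sketch is an assumption for illustration: the rate-1/2 generators (octal 7 and 5, constraint length 3) are a common textbook choice, the puncturing pattern and the 4 × 3 interleaver matrix are arbitrary, and none of these values is taken from the PDC or GSM standards. The sketch shows a single-shift-register convolutional encoder, puncturing to raise the coding rate, and block interleaving that spreads a burst of channel errors.

```python
def conv_encode(bits, generators=(0b111, 0b101)):
    """Rate 1/n convolutional encoder using one shift register.

    Each output bit is the modulo-2 sum of the register bits selected
    by one generator mask (most significant bit = newest input bit).
    """
    constraint = max(g.bit_length() for g in generators)
    register = 0
    out = []
    for bit in bits:
        # Shift the new bit in at the most significant position.
        register = (bit << (constraint - 1)) | (register >> 1)
        for g in generators:
            out.append(bin(register & g).count("1") % 2)
    return out


def puncture(code_bits, pattern):
    """Drop the code bits at positions where the repeating pattern is 0."""
    return [b for i, b in enumerate(code_bits) if pattern[i % len(pattern)]]


def block_interleave(bits, rows, cols):
    """Write row by row into a rows x cols matrix, read column by column."""
    assert len(bits) == rows * cols
    return [bits[r * cols + c] for c in range(cols) for r in range(rows)]


def block_deinterleave(bits, rows, cols):
    """Inverse of block_interleave."""
    assert len(bits) == rows * cols
    return [bits[c * rows + r] for r in range(rows) for c in range(cols)]


impulse_response = conv_encode([1, 0, 0])
punctured = puncture(conv_encode([1, 0, 0, 1]), (1, 1, 1, 0))  # rate 1/2 -> 2/3

# Track positions rather than bit values to see where a burst lands.
positions = list(range(12))
on_channel = block_interleave(positions, rows=4, cols=3)
burst = [4, 5, 6]                              # three adjacent channel errors
hit_after_deinterleaving = sorted(on_channel[p] for p in burst)
```

In this toy setting the three adjacent channel errors end up three positions apart in the deinterleaved stream, which is the scattered pattern a convolutional decoder handles best.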

30.6

The Japanese Half-Rate Standard

An algorithm was selected for the Japanese half-rate standard in April 1993, following the evaluation of 12 submissions in a first round, and four final candidates in a second round [12]. The selected algorithm, called pitch synchronous innovation CELP² (PSI-CELP), met all of the basic selection criteria and scored the highest among all candidates evaluated. A block diagram of the PSI-CELP encoder is shown in Fig. 30.2, and bit allocations are summarized in Table 30.3. The complexity of the coder is estimated to be approximately 2.4 times that of the PDC full-rate coder. The frame size of the coder is 40 ms, and its subframe size is 10 ms. These sizes are longer than those used in most existing CELP-type standard coders. However, LP analysis is performed twice per frame in the PSI-CELP coder.

FIGURE 30.2: Basic structure of the PSI-CELP encoder.

A distinctive feature of the PSI-CELP coder is the use of an adaptive noise canceller [13, 15] to suppress noise in the input signal prior to coding. The input signal is classified into various modes, depending on the presence or absence of background noise and speech and their relative power levels. The current active mode determines whether Kalman filtering [9] is applied to the input signal and whether the parameters of the Kalman filter are adapted. Kalman filtering is applied when a significant amount of background noise is present or when both background noise and speech are strongly present. The filter parameters are adapted to the statistics of the speech and noise signals in accordance with whether they are both present or only noise is present.

² There were two candidate algorithms named PSI-CELP in the PDC half-rate competition. The algorithm described here was contributed by NTT Mobile Communications Network, Inc. (NTT DoCoMo).

TABLE 30.3 Bit Allocations for the PSI-CELP Half-Rate PDC Speech Coder

Parameter                Bits      Error Protected Bits
LP synthesis filter      31        15
Frame energy             7         7
Periodic excitation      8 × 4     8 × 4
Stochastic excitation    10 × 4    0
Gain                     7 × 4     3 × 4
Total                    138       66

The LP filter parameters in the PSI-CELP coder are encoded using VQ. A tenth-order LP analysis is performed every 20 ms. The resultant filter parameters are converted to 10 line spectral frequencies (LSFs).³ The LSF parameters have a naturally increasing order, and together are treated as the ordered components of a vector. Since the speech spectral envelope tends to evolve slowly with time, there is intervector dependency between adjacent LSF vectors that can be exploited. Thus, the two LSF vectors for each 40-ms frame are paired together and jointly encoded. Each LSF vector in the pair is split into three subvectors. The pair of subvectors that cover the same vector component indexes are combined into one composite vector and vector quantized. Altogether, 31 b are used to encode a pair of LSF vectors. This three-way split VQ⁴ scheme embodies a compromise between the prohibitively high complexity of using a large vector dimension and the performance gain from exploiting intra- and intervector dependency.

The PSI-CELP encoder uses a perceptual weighting filter consisting of a cascade of two filter sections. The sections exploit the pitch-harmonic structure and the LP spectral-envelope structure of the speech signal, respectively. The pitch-harmonic section has four parameters, a pitch lag and three coefficients, whose values are determined from an analysis of the periodic structure of the input speech signal. Pitch-harmonic weighting reduces the amount of noise in between the pitch harmonics by aggregating coder noise to be closer to the harmonic frequencies of the speech signal.
In high-pitched voice, the harmonics are spaced relatively farther apart, and pitch-harmonic weighting becomes correspondingly more important. The excitation vector x (Fig. 30.2) is updated once every subframe interval (10 ms) and is constructed as a linear combination of two vectors

$$ x = g_0 y + g_1 z \qquad (30.1) $$

where $g_0$ and $g_1$ are scalar gains, y is labeled as the periodic component of the excitation and z as the stochastic or random component. When the input speech is voiced, the ABS operation would find a value for y from the adaptive codebook (Fig. 30.2). The codebook is constructed out of past samples of the excitation signal x; hence, there is a feedback path into the adaptive codebook in Fig. 30.2. Each code vector in the adaptive codebook corresponds to one of the 192 possible pitch lag L values available for encoding; the code vector is populated with samples of x beginning with the Lth sample backward in time. L is not restricted to be an integer, i.e., fractional pitch period is permitted. Successive values of L are more closely spaced for smaller values of L; short, medium, and long lags are quantized to one-quarter, one-half, and one sampling-period resolution, respectively. As a result, the relative quantization error in the encoded pitch frequency (which is the reciprocal of the encoded pitch lag) remains roughly constant with increasing pitch frequency.

³ Also known as line spectrum pairs (LSPs).
⁴ Matrix quantization is another possible description.

When the input speech is unvoiced, y would be obtained from the fixed codebook (Fig. 30.2). To find the best value for y, the encoder searches through the aggregate of 256 code vectors from both the adaptive and fixed codebooks. The code vector that results in a synthesis output most resembling the input speech is selected. The best code vector thus chosen also implicitly determines the voicing condition (voiced/unvoiced) and the pitch lag value $L^*$ most appropriate to the current subframe of input speech. These parameters are said to be determined in a closed-loop search.

The stochastic excitation z is formed as a sum of two code vectors, each selected from a conjugate codebook (Fig. 30.2) [13]. Using a pair of conjugate codebooks each of size 16 code vectors (4 b) has been found to improve robustness against channel errors, in comparison with using one single codebook of size 256 code vectors (8 b). The synthesis output due to z can be decomposed into a sum of two orthogonal components, one of which points in the same direction as the synthesis output due to the periodic excitation y and the other component points in a direction orthogonal to the synthesis output due to y. The latter synthesis output component of z is kept, whereas the former component is discarded. Such decomposition enables the two gain factors $g_0$ and $g_1$ to be separately quantized. For voiced speech, the conjugate code vectors are preprocessed to produce a set of pitch synchronous innovation (PSI) vectors. The first $L^*$ samples of each code vector are treated as a fundamental period of samples.
The fundamental period is replicated until there are enough samples to populate a subframe. If L∗ is not an integer, interpolated samples of the code vectors are used (upsampled versions of the code vectors can be precomputed). PSI has been found to reinforce the periodicity and substantially improve the quality of synthesized voiced speech. The postfilter in the PSI-CELP decoder has three sections, for enhancing the formants, the pitch harmonics, and the high frequencies of the synthesized speech, respectively. Pitch-harmonic enhancement is applied only when the adaptive codebook has been used. Formant enhancement makes use of the decoded LP synthesis filter parameters, whereas a refined pitch analysis is performed on the synthesized speech to obtain the values for the parameters of the pitch-harmonic section of the postfilter. A first-order high-pass filter section compensates for the low-pass spectral tilt [1] of the formant enhancement section. Of the 138 speech data bits generated by the speech encoder every 40-ms frame, 66 b (Table 30.3) receive error protection and the remaining 72 speech data bits of the frame are not error protected. An error detection code of 9 cyclic redundancy check (CRC) bits is appended to the 66 b and then submitted to a rate 1/2, punctured convolutional encoder to generate a sequence of 152 channel coded bits. Of the unprotected 72 b, the 40 b that index the excitation codebooks (Table 30.3) are remapped or pseudo-Gray coded [17] so as to equalize their channel error sensitivity. As a result, a bit error occurring in an index word is likely to cause about the same amount of degradation regardless of the bit error position in the index word. For each speech frame, the channel encoder emits 224 b of payload data. The payload data from two adjacent frames are interleaved before transmission over the radio link. Uncorrected errors in the most critical 66 b are detected with high probability as a CRC error. 
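The bit counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text and in Table 30.3; the variable names are illustrative, and the sketch does not model how puncturing (and any termination bits, which the text does not describe) yields exactly 152 coded bits.

```python
FRAME_MS = 40                 # PSI-CELP frame duration

speech_bits = 138             # bits from the speech encoder per frame
protected_bits = 66           # bits that receive error protection
crc_bits = 9                  # error detection bits appended to the 66 b
coded_bits = 152              # output of the punctured convolutional encoder

unprotected_bits = speech_bits - protected_bits
payload_bits = coded_bits + unprotected_bits

# Bits per millisecond equals kilobits per second.
speech_rate_kbps = speech_bits / FRAME_MS
gross_rate_kbps = payload_bits / FRAME_MS

# Ratio of encoder input bits to output bits for the protected class.
effective_code_rate = (protected_bits + crc_bits) / coded_bits
```

The totals agree with the text: 72 unprotected bits, a 224-b payload per frame, a 3.45-kb/s speech rate within a 5.6-kb/s gross rate, and an effective coding rate for the protected class just under the nominal 1/2.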
A finite state machine keeps track of the recent history of CRC errors. When a sequence of CRC errors is encountered, the power level of the synthesized speech is progressively suppressed, so that muting is reached after four consecutive CRC errors. Conversely, following the cessation of a sequence of CRC errors, the power level of the synthesized speech is ramped up gradually.
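The muting behavior just described can be sketched as a small state machine. Only the figure of four consecutive CRC errors comes from the text; the linear attenuation ladder, the one-step-per-frame recovery, and the function names are assumptions, since the actual gain schedule of the standard is not given here.

```python
MUTE_AFTER = 4  # consecutive CRC errors until full muting (from the text)


def update_state(state, crc_error):
    """Advance the muting state (0 = full level, MUTE_AFTER = muted) by one frame."""
    if crc_error:
        return min(state + 1, MUTE_AFTER)
    return max(state - 1, 0)


def output_gain(state):
    """Assumed linear ladder: each step removes a quarter of the level."""
    return 1.0 - state / MUTE_AFTER


def gain_trace(crc_errors):
    """Gain applied to each frame for a sequence of CRC error flags."""
    state = 0
    gains = []
    for crc_error in crc_errors:
        state = update_state(state, crc_error)
        gains.append(output_gain(state))
    return gains


# One good frame, five bad frames, then four good frames.
trace = gain_trace([False, True, True, True, True, True,
                    False, False, False, False])
```

The trace falls to zero on the fourth consecutive error, stays muted while errors persist, and climbs back gradually once good frames resume.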

30.7

The European GSM Half-Rate Standard

A vector sum excited linear prediction (VSELP) coder, contributed by Motorola, Inc., was selected in January 1994 by the main GSM technical committee as a basis for the GSM half-rate standard. The standard was finally approved in January 1995. VSELP is a generic name for a family of algorithms from Motorola; the North American full-rate and the Japanese full-rate standards are also based on VSELP. All VSELP coders make use of the basic idea of representing the excitation signal by a linear combination of basis vectors [6]. This representation renders the excitation codebook search procedure very computationally efficient. A block diagram of the GSM half-rate decoder is depicted in Fig. 30.3 and bit allocations are tabulated in Table 30.4. The coder’s frame size is 20 ms, and each frame comprises four subframes of 5 ms each. The coder has been optimized for execution on a processor with 16-b word length and 32-b accumulator. The GSM standard is a bit exact specification: in addition to specifying the codec’s processing steps, the numerical formats and precisions of the codec’s variables are also specified.

FIGURE 30.3: Basic structure of the GSM VSELP decoder. Top is for mode 0 and bottom is for modes 1, 2, and 3.

TABLE 30.4 Bit Allocations for the VSELP Half-Rate GSM Coder

Parameter                               Bits/subframe    Bits/frame
LP synthesis filter                                      28
Soft interpolation                                       1
Frame energy                                             5
Mode selection                                           2
Mode 0
  Excitation code I                     7                28
  Excitation code H                     7                28
  Gain code Gs, P0                      5                20
Mode 1, 2, and 3
  Pitch lag L (first subframe)                           8
  Difference lag (subframes 2, 3, 4)    4                12
  Excitation code J                     9                36
  Gain code Gs, P0                      5                20
Total                                                    112
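As a consistency check on Table 30.4, the per-frame totals for the unvoiced mode and the voiced modes can be recomputed from the per-subframe allocations. The sketch assumes the reading of the table in which the first four rows are common to all modes; the dictionary and variable names are illustrative.

```python
SUBFRAMES = 4
FRAME_MS = 20

# Bits per frame common to all modes.
common = {"LP synthesis filter": 28, "Soft interpolation": 1,
          "Frame energy": 5, "Mode selection": 2}

# Bits per subframe in mode 0 (two trained codebooks plus gains).
mode0_per_subframe = {"Excitation code I": 7, "Excitation code H": 7,
                      "Gain code": 5}

# Modes 1, 2, and 3: adaptive codebook lag plus one trained codebook.
first_lag_bits = 8                     # pitch lag of the first subframe
difference_lag_bits = 4                # each of subframes 2, 3, and 4
voiced_per_subframe = {"Excitation code J": 9, "Gain code": 5}

common_bits = sum(common.values())
mode0_bits = common_bits + SUBFRAMES * sum(mode0_per_subframe.values())
voiced_bits = (common_bits + first_lag_bits
               + (SUBFRAMES - 1) * difference_lag_bits
               + SUBFRAMES * sum(voiced_per_subframe.values()))

rate_kbps = mode0_bits / FRAME_MS      # bits per millisecond = kb/s
```

Both configurations come to 112 b per 20-ms frame, i.e., 5.6 kb/s, so the mode switch changes how the bits are spent without changing the bit rate.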

The synthesis filter coefficients in GSM VSELP are encoded using the fixed point lattice technique (FLAT) [8] and vector quantization. FLAT is based on the lattice filter representation of the linear prediction error filter. The tenth-order lattice filter has 10 stages, with the $i$th stage, $i \in \{1, \ldots, 10\}$, containing a reflection coefficient parameter $r_i$. The lattice filter has an order-recursion property such that the best prediction error filters of all orders less than ten are all embedded in the best tenth-order lattice filter. This means that once the values of the lower order reflection coefficients have been optimized, they do not have to be reoptimized when a higher order predictor is desired; in other words, the coefficients can be optimized sequentially from low to high orders. On the other hand, if the lower order coefficients were suboptimal (as in the case when the coefficients are quantized), the higher order coefficients could still be selected to minimize the prediction residual (or error) energy at the output of the higher order stages; in effect, the higher order stages can compensate for the suboptimality of lower order stages.

In the GSM VSELP coder, the ten reflection coefficients $\{r_1, \ldots, r_{10}\}$ that have to be encoded for each frame are grouped into three coefficient vectors $v_1 = [r_1\ r_2\ r_3]$, $v_2 = [r_4\ r_5\ r_6]$, $v_3 = [r_7\ r_8\ r_9\ r_{10}]$. The vectors are quantized sequentially, from $v_1$ to $v_3$, using a $b_i$-bit VQ codebook $C_i$ for $v_i$, where $b_i$, $i = 1, 2, 3$, are 11, 9, and 8 b, respectively. The vector $v_i$ is quantized to minimize the prediction error energy at the output of the $j$th stage of the lattice filter, where $r_j$ is the highest order coefficient in the vector $v_i$. The computational complexity associated with quantizing $v_i$ is reduced by searching only a small subset of the code vectors in $C_i$. The subset is determined by first searching a prequantizer codebook of size $c_i$ bits, where $c_i$, $i = 1, \ldots, 3$, are 6, 5, and 4 b, respectively. Each code vector in the prequantizer codebook is associated with $2^{b_i - c_i}$ code vectors in the target codebook. The subset is obtained by pooling together all of the code vectors in $C_i$ that are associated with the top few best matching prequantizer code vectors. In this way, a factor of reduction in computational complexity of nearly $2^{b_i - c_i}$ is obtained for the quantization of $v_i$.

The half-rate GSM coder changes its configuration of excitation generation (Fig. 30.3) in accordance with a voicing mode [7]. For each frame, the coder selects one of four possible voicing modes depending on the values of the open-loop pitch-prediction gains computed for the frame and its four subframes. Open loop refers to determining the pitch lag and the pitch-predictor coefficient(s) via a direct analysis of the input speech signal or, in the case of the half-rate GSM coder, the perceptually weighted (LP-weighting only) input signal. Open-loop analysis can be regarded as the opposite of closed-loop analysis, which in our context is synonymous with ABS. When the pitch-prediction gain for the frame is weak, the input speech signal is deemed to be unvoiced and mode 0 is used. In this mode, two 7-b trained codebooks (excitation codebooks 1 and 2 in Fig. 30.3) are used, and the excitation signal for each subframe is formed as a linear combination of two code vectors, one from
each of the codebooks. A trained codebook is one designed by applying the coder to a representative set of speech signals while optimizing the codebook to suit the set. Mode 1, 2, or 3 is chosen depending on the strength of the pitch-prediction gains for the frame and its subframes. In these modes, the excitation signal is formed as a linear combination of a code vector from an 8-b adaptive codebook and a code vector from a 9-b trained codebook (Fig. 30.3). The code vectors that are summed together to form the excitation signal for a subframe are each scaled by a gain factor (β and γ in Fig. 30.3). Each mode uses a gain VQ codebook specific to that mode.

As depicted in Fig. 30.3, the decoder contains an adaptive pitch prefilter for the voiced modes and an adaptive postfilter for all modes. The filters enhance the perceptual quality of the decoded speech and are not present in the encoder. It is more conventional to locate the pitch prefilter as a section of the postfilter; the distinctive placement of the pitch prefilter in VSELP was chosen to reduce artifacts caused by the time-varying nature of the filter. In mode 0, the encoder uses an LP spectral weighting filter in its ABS search of the two excitation codebooks. In the other modes, the encoder uses a pitch-harmonic weighting filter in cascade with an LP spectral weighting filter for searching excitation codebook 0, whereas only LP spectral weighting is used for searching the adaptive codebook. The pitch-harmonic weighting filter has two parameters, a pitch lag and a coefficient, whose values are determined in the aforementioned open-loop pitch analysis.

A code vector in the 8-b adaptive codebook has a dimension of 40 (the duration of a subframe) and is populated with past samples of the excitation signal beginning with the Lth sample back from the present time. L can take on one of 256 different integer and fractional values.
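How an adaptive code vector is populated from the excitation history can be sketched as follows. The sketch handles integer lags only (fractional lags would need interpolation, which is omitted), and the periodic extension used when the lag is shorter than the subframe is a common convention assumed here rather than something stated in the text. The function name and the toy history are illustrative.

```python
def adaptive_code_vector(past_excitation, lag, length=40):
    """Build a code vector from past excitation, starting `lag` samples back.

    Integer lags only. When lag < length, samples already placed in the
    vector are reused, which extends the segment periodically (assumption).
    """
    if not 1 <= lag <= len(past_excitation):
        raise ValueError("lag must lie within the stored excitation history")
    history = list(past_excitation)
    vector = []
    for _ in range(length):
        sample = history[len(history) - lag]   # the lag-th sample back from "now"
        vector.append(sample)
        history.append(sample)                 # "now" advances by one sample
    return vector


# Toy history: sample values equal to their time index, newest last.
past = list(range(100))
long_lag_vector = adaptive_code_vector(past, lag=50, length=40)
short_lag_vector = adaptive_code_vector(past, lag=3, length=8)
```

With a lag of 50 the vector is simply a 40-sample slice of the history; with a lag of 3 the last three samples repeat, giving the quasiperiodic excitation needed for high-pitched voices.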
The best adaptive code vector for each subframe can be selected via a complete ABS; the required exhaustive search of the adaptive codebook is, however, computationally expensive. To reduce computation, the GSM VSELP coder makes use of the aforementioned open-loop pitch analysis to produce a list of candidate lag values. The open-loop pitch-prediction gains are ranked in decreasing order, and only the lags corresponding to top-ranked gains are kept as candidates. The final decisions for the four L values of the four subframes in a frame are made jointly. On the assumption that the four L values cannot vary over the entire range of 256 possible values within the short duration of a frame, the L of the first subframe is coded using 8 b, and the L of each of the other three subframes is coded differentially using 4 b. The 4 b represent 16 possible values of deviation relative to the lag of the previous subframe. The four lags in a frame trace out a trajectory where the change from one time point to the next is restricted; consequently, only 8 + 3 × 4 = 20 b are needed instead of 4 × 8 = 32 b for encoding the four lags. Candidate trajectories are constructed by linking top-ranked lags that are commensurate with differential encoding. The best trajectory among the candidates is then selected via ABS.

The trained excitation codebooks of VSELP have a special vector sum structure that facilitates fast searching [6]. Each of the $2^b$ code vectors in a b-bit trained codebook is formed as a linear combination of b basis vectors. Each of the b scalar weights in the linear combination is restricted to have a binary value of either 1 or −1. The $2^b$ code vectors in the codebook are obtained by taking all $2^b$ possible combinations of values of the weights. A substantial storage saving is achieved by storing only b basis vectors instead of $2^b$ code vectors. Computational saving is another advantage of the vector-sum structure.
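The vector-sum structure can be illustrated with a small sketch. The basis vectors, the dimension, and the short FIR filter used as a stand-in for the synthesis filter are all hypothetical values chosen for illustration; they are not taken from the GSM half-rate codec. The sketch builds every code vector from ±1 weighted sums of the basis vectors and checks that filtering a code vector gives the same result as combining the separately filtered basis vectors with the same weights.

```python
from itertools import product


def vector_sum_codebook(basis):
    """Map each +/-1 weight tuple to its code vector (2**b entries for b basis vectors)."""
    dim = len(basis[0])
    codebook = {}
    for weights in product((-1, 1), repeat=len(basis)):
        codebook[weights] = [
            sum(w * v[k] for w, v in zip(weights, basis)) for k in range(dim)
        ]
    return codebook


def fir_filter(h, x):
    """Zero-state FIR filtering truncated to len(x); an assumed stand-in for synthesis filtering."""
    return [
        sum(h[j] * x[n - j] for j in range(min(len(h), n + 1)))
        for n in range(len(x))
    ]


# Hypothetical toy data: b = 3 basis vectors of dimension 4, integer-valued for exact checks.
basis = [[1, 0, 2, -1], [0, 3, -1, 1], [2, -2, 0, 1]]
h = [1, 2, 1]

codebook = vector_sum_codebook(basis)
filtered_basis = [fir_filter(h, v) for v in basis]


def linearity_holds():
    """Filtering a code vector equals the same weighted sum of the filtered basis vectors."""
    for weights, vec in codebook.items():
        direct = fir_filter(h, vec)
        combined = [
            sum(w * fv[k] for w, fv in zip(weights, filtered_basis))
            for k in range(len(vec))
        ]
        if direct != combined:
            return False
    return True


# Storage comparison for a 9-b codebook of dimension 40 (the sizes quoted in the text).
stored_full = (2 ** 9) * 40   # values needed to store every code vector
stored_basis = 9 * 40         # values needed to store only the basis vectors
```

Only the b basis vectors need to be filtered once per subframe; every candidate output then follows from sign changes and additions, which is where the computational saving comes from.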
Since filtering is a linear operation, the synthesis output due to each code vector is a linear combination of the synthesis outputs due to the individual basis vectors, where the same weight values are used in the output linear combination as in forming the code vector. A vector sum codebook can be searched by first performing synthesis filtering on its b basis vectors. If, for the present subframe, another trained codebook (mode 0) or an adaptive codebook (modes 1, 2, and 3) had been searched, the filtered basis vectors are further orthogonalized with respect to the signal synthesized from that codebook, i.e., each filtered basis vector is replaced by its own component that is orthogonal to the synthesized signal. Further complexity reduction is obtained by examining the code vectors in a sequence such that two successive code vectors differ in only one of the b scalar weight values; that is, the entire set of $2^b$ code vectors is searched in a Gray coded sequence. With successive code vectors differing in only one term in the linear combination, it is only necessary in the codebook search computation to progressively track the difference [6].

The total energy of a speech frame is encoded with 5 b (Table 30.4). The two gain factors (β and γ in Fig. 30.3) for each subframe are computed after the excitation codebooks have been searched and are then transformed to parameters Gs and P0 to be vector quantized. Each mode has its own 5-b gain VQ codebook. Gs represents the energy of the subframe relative to the total frame energy, and P0 represents the fraction of the subframe energy due to the first excitation source (excitation codebook 1 in mode 0, or the adaptive codebook in the other modes).

An interpolation bit (Table 30.4) transmitted for each frame specifies to the decoder whether the LP synthesis filter parameters for each subframe should be obtained from interpolating between the decoded filter parameters for the current and the previous frames. The encoder determines the value of this bit according to whether interpolation or no interpolation results in a lower prediction residual energy for the frame. The postfilter in the decoder operates in concordance with the actual LP parameters used for synthesis.

The speech encoder generates 112 b of encoded data (Table 30.4) for every 20-ms frame of the speech signal. These bits are processed by the channel encoder to improve, after channel decoding at the receiver, the uncoded bit-error rate and the detectability of uncorrected errors. Error detection coding in the form of 3 CRC bits is applied to the most critical 22 data bits. The combined 25 b plus an additional 73 speech data bits and 6 tail bits are input to an RCPCC encoder (the tail bits serve to bring the channel encoder and decoder to a fixed terminal state at the end of the payload data stream).
The 3 CRC bits are encoded at rate 1/3 and the other 101 b are encoded at rate 1/2, generating a total of 211 channel coded bits. These are finally combined with the remaining 17 (uncoded) speech data bits to form a total of 228 b for the payload data of a speech frame. The payload data from two speech frames are interleaved for transmission over four timeslots of the GSM TDMA channel. With the Viterbi algorithm, the channel decoder performs soft decision decoding on the demodulated and deinterleaved channel data. Uncorrected channel errors may still be present in the decoded speech data after Viterbi decoding. Thus, the channel decoder classifies each frame into three integrity categories: bad, unreliable, and reliable, in order to assist the speech decoder in undertaking error concealment measures. A frame is considered bad if the CRC check fails or if the received channel data is close to more than one candidate sequence. The latter evaluation is based on applying an adaptive threshold to the metric values produced by the Viterbi algorithm over the course of decoding the most critical 22 speech data bits and their 3 CRC bits. Frames that are not bad may be classified as unreliable, depending on the metric values produced by the Viterbi algorithm and on channel reliability information supplied by the demodulator. Depending on the recent history of decoded data integrity, the speech decoder can take various error concealment measures. The onset of bad frames is concealed by repetition of parameters from previous reliable frames, whereas the persistence of bad frames results in power attenuation and ultimately muting of the synthesized speech. Unreliable frames are decoded with normality constraints applied to the energy of the synthesized speech.
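The bit budget described above can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the text; the variable names are illustrative, and the code rates are treated as simple ratios of input to output bits.

```python
# Per-frame figures quoted in the text (one frame = 20 ms of speech).
SPEECH_BITS = 112        # bits produced by the speech encoder
FRAME_MS = 20

critical_bits = 22       # most critical speech bits, protected by the CRC
crc_bits = 3
other_coded_bits = 73    # additional speech bits that are channel coded
tail_bits = 6

# Bits entering the convolutional encoder.
encoder_input = critical_bits + crc_bits + other_coded_bits + tail_bits

# The 3 CRC bits are coded at rate 1/3, the remaining encoder input at rate 1/2.
rate_third_input = crc_bits
rate_half_input = encoder_input - rate_third_input
coded_bits = 3 * rate_third_input + 2 * rate_half_input

# Speech bits that bypass the channel encoder.
uncoded_bits = SPEECH_BITS - critical_bits - other_coded_bits
payload_bits = coded_bits + uncoded_bits

# Bits per millisecond is numerically equal to kb/s.
speech_rate_kbps = SPEECH_BITS / FRAME_MS
gross_rate_kbps = payload_bits / FRAME_MS
```

The totals reproduce the 101 b, 211 b, 17 b, and 228 b quoted in the text, and correspond to a speech rate of 5.6 kb/s within a gross payload rate of 11.4 kb/s.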

30.8

Conclusions

The half-rate standards employ some of the latest techniques in speech and channel coding to meet the challenges posed by the severe transmission environment of digital cellular radio systems. By halving the bit rate, the voice transmission capacity of existing full-rate digital cellular systems can be doubled. Although advances are still being made that can address the needs of quarter-rate speech transmission, much effort is currently devoted to enhancing the speech quality and robustness of full-rate (GSM and IS-54) systems, aiming to be closer to toll quality. On the other hand, the imminent introduction of competing wireless systems that use different modulation schemes [e.g., code division multiple access (CDMA)] and/or different radio frequencies [e.g., personal communications systems (PCS)] is poised to alleviate congestion in high-user-density areas.

Defining Terms

Codebook: An ordered collection of all possible values that can be assigned to a scalar or vector variable. Each unique scalar or vector value in a codebook is called a codeword, or code vector where appropriate.

Codec: A contraction of (en)coder–decoder, used synonymously with the word coder. The encoder and decoder are often designed and deployed as a pair. A half-rate standard codec performs speech as well as channel coding.

Echo canceller: A signal processing device that, given the source signal causing the echo signal, generates an estimate of the echo signal and subtracts the estimate from the signal being interfered with by the echo signal. The device is usually based on a discrete-time adaptive filter.

Pitch period: The fundamental period of a voiced speech waveform that can be regarded as periodic over a short-time interval (quasiperiodic). The reciprocal of pitch period is pitch frequency or simply, pitch.

Tandem coding: Having more than one encoder–decoder pair in an end-to-end transmission path. In cellular radio communications, having a radio link at each end of the communication path could subject the speech signal to two passes of speech encoding–decoding. In general, repeated encoding and decoding increases the distortion.

Acknowledgment

The authors would like to thank Erdal Paksoy and Mark A. Jasiuk for their valuable comments.

References

[1] Chen, J.-H. and Gersho, A., Adaptive postfiltering for quality enhancement of coded speech. IEEE Trans. Speech & Audio Proc., 3(1), 59–71, 1995.
[2] De Martino, E., Speech quality evaluation of the European, North-American and Japanese speech codec standards for digital cellular systems. In Speech and Audio Coding for Wireless and Network Applications, Atal, B.S., Cuperman, V., and Gersho, A., Eds., 55–58, Kluwer Academic Publishers, Norwell, MA, 1993.
[3] Dimolitsas, S., Corcoran, F.L., and Baraniecki, M.R., Transmission quality of North American cellular, personal communications, and public switched telephone networks. IEEE Trans. Veh. Tech., 43(2), 245–251, 1994.
[4] Gersho, A., Advances in speech and audio compression. Proc. IEEE, 82(6), 900–918, 1994.
[5] Gersho, A. and Gray, R.M., Vector Quantization and Signal Compression, Kluwer Academic Publishers, Norwell, MA, 1991.
[6] Gerson, I.A. and Jasiuk, M.A., Vector sum excited linear prediction (VSELP) speech coding at 8 kbps. In Proceedings, IEEE Intl. Conf. Acoustics, Speech, & Sig. Proc., 461–464, April, 1990.
[7] Gerson, I.A. and Jasiuk, M.A., Techniques for improving the performance of CELP-type speech coders. IEEE J. Sel. Areas Comm., 10(5), 858–865, 1992.
[8] Gerson, I.A., Jasiuk, M.A., Nowack, J.M., Winter, E.H., and Müller, J.-M., Speech and channel coding for the half-rate GSM channel. In Proceedings, ITG-Report 130 on Source and Channel Coding, 225–232. Munich, Germany, Oct., 1994.
[9] Gibson, J.D., Koo, B., and Gray, S.D., Filtering of colored noise for speech enhancement and coding. IEEE Trans. Sig. Proc., 39(8), 1732–1742, 1991.
[10] Hagenauer, J., Rate-compatible punctured convolutional codes (RCPC codes) and their applications. IEEE Trans. Comm., 36(4), 389–400, 1988.
[11] Jayant, N.S. and Noll, P., Digital Coding of Waveforms, Prentice-Hall, Englewood Cliffs, NJ, 1984.
[12] Masui, F. and Oguchi, M., Activity of the half rate speech codec algorithm selection for the personal digital cellular system. Tech. Rept. of IEICE, RCS93-77(11), 55–62 (in Japanese), 1993.
[13] Ohya, T., Suda, H., and Miki, T., 5.6 kbits/s PSI-CELP of the half-rate PDC speech coding standard. In Proceedings, IEEE Veh. Tech. Conf., 1680–1684, June, 1994.
[14] Proakis, J.G., Digital Communications, 3rd ed., McGraw-Hill, New York, 1995.
[15] Suda, H., Ikeda, K., and Ikedo, J., Error protection and speech enhancement schemes of PSI-CELP, NTT R & D. (Special issue on PSI-CELP speech coding system for mobile communications), 43(4), 373–380, (in Japanese), 1994.
[16] Telecommunication Industries Association (TIA). Half-rate speech codec test plan V6.0. TR45.3.5/93.05.19.01, 1993.
[17] Zeger, K. and Gersho, A., Pseudo-Gray coding. IEEE Trans. Comm., 38(12), 2147–2158, 1990.

Further Information

Additional technical information on speech coding can be found in the books, periodicals, and conference proceedings that appear in the list of references. Other relevant publications not represented in the list are Speech Communication, Elsevier Science Publishers; Advances in Speech Coding, B. S. Atal, V. Cuperman, and A. Gersho, eds., Kluwer Academic Publishers; and Proceedings of the IEEE Workshop on Speech Coding.


Budagavi, M. & Talluri, R. “Wireless Video Communications” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Wireless Video Communications

Madhukar Budagavi, Texas Instruments
Raj Talluri, Texas Instruments

31.1 Introduction
31.2 Wireless Video Communications
Recommendation H.223
31.3 Error Resilient Video Coding
A Standard Video Coder • Error Resilient Video Decoding • Classification of Error-Resilience Techniques
31.4 MPEG-4 Error Resilience Tools
Resynchronization • Data Partitioning • Reversible Variable Length Codes (RVLCs) • Header Extension Code (HEC) • Adaptive Intra Refresh (AIR)
31.5 H.263 Error Resilience Tools
Slice Structure Mode (Annex K) • Independent Segment Decoding (ISD) (Annex R) • Error Tracking (Appendix I) • Reference Picture Selection (Annex N)
31.6 Discussion
Defining Terms
References
Further Information

31.1

Introduction

Recent advances in technology have resulted in a rapid growth in mobile communications. With this explosive growth, the need for reliable transmission of mixed media information (audio, video, text, graphics, and speech data) over wireless links is becoming an increasingly important application requirement. The bandwidth requirements of raw video data are very high (a 176 × 144 pixel color video sequence requires over 8 Mb/s). Since the amount of bandwidth available on current wireless channels is limited, the video data has to be compressed before it can be transmitted on the wireless channel. The techniques used for video compression typically utilize predictive coding schemes to remove redundancy in the video signal. They also employ variable length coding schemes, such as Huffman codes, to achieve further compression. The wireless channel is a noisy fading channel characterized by long bursts of errors [8]. When compressed video data is transmitted over wireless channels, the effect of channel errors on the video can be severe. The variable length coding schemes make the compressed bitstream sensitive to channel errors. As a result, the video decoder that is decoding the corrupted video bitstream can easily lose synchronization with the encoder. Predictive coding techniques, such as block motion compensation, which are used in current video compression standards, make the matter worse by


quickly propagating the effects of channel errors across the video sequence and rapidly degrading the video quality. This may render the video sequence totally unusable. Error control coding [5], in the form of Forward Error Correction (FEC) and/or Automatic Repeat reQuest (ARQ), is usually employed on wireless channels to improve the channel conditions. FEC techniques prove to be quite effective against random bit errors, but their performance is usually not adequate against longer duration burst errors. FEC techniques also come with an increased overhead in terms of the overall bitstream size; hence, some of the coding efficiency gains achieved by video compression are lost. ARQ techniques typically increase the delay and, therefore, might not be suitable for real-time videoconferencing. Thus, in practical video communication schemes, error control coding is typically used only to provide a certain level of error protection to the compressed video bitstream, and it becomes necessary for the video coder to accept some level of errors in the video bitstream. Error-resilience tools are introduced in the video codec to handle these residual errors that remain after error correction. The emphasis in this chapter is on discussing relevant international standards that are making wireless video communications possible. We will concentrate on both the error control and source coding aspects of the problem. In the next section, we give an overview of a wireless video communication system that is a part of a complete wireless multimedia communication system. The International Telecommunication Union—Telecommunications Standardization Sector (ITU-T) H.223 [1] standard that describes a method of providing error protection to the video data before it is transmitted is also described. 
It should be noted that the main function of H.223 is to multiplex/demultiplex the audio, video, text, graphics, etc., which are typically communicated together in a videoconferencing application; error protection of the transmitted data becomes a requirement to support this functionality on error-prone channels. In Section 31.3, an overview of error-resilient video coding is given. The specific tools adopted into the International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) Moving Picture Experts Group (MPEG) version 4 (i.e., MPEG-4) [7] and the ITU-T H.263 [3] video coding standards to improve the error robustness of the video coder are described in Sections 31.4 and 31.5, respectively. Table 31.1 provides a listing of some of the standards that are described or referred to in this chapter.
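The raw bit rate quoted earlier in this introduction for a 176 × 144 color sequence can be checked with a short calculation. The frame rate (30 frames/s), the sample depth (8 bits), and the 4:2:0 chroma subsampling (1.5 samples per pixel) are assumptions made for this sketch; the text does not state them.

```python
def raw_video_bitrate(width, height, fps, bits_per_sample=8, samples_per_pixel=1.5):
    """Uncompressed bit rate in bits per second.

    samples_per_pixel = 1.5 models 4:2:0 sampling (one luma sample per pixel
    plus two chroma planes at quarter resolution); this is an assumption.
    """
    return width * height * samples_per_pixel * bits_per_sample * fps


# 176 x 144 picture at an assumed 30 frames/s.
qcif_bps = raw_video_bitrate(176, 144, 30)
qcif_mbps = qcif_bps / 1e6
```

Under these assumptions the raw rate is roughly 9.1 Mb/s, which is consistent with the "over 8 Mb/s" figure; other frame rates or chroma formats would give different values.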

31.2

Wireless Video Communications

Figure 31.1 shows the basic block diagram of a wireless video communication system [10]. Input video is compressed by the video encoder to generate a compressed bitstream. The transport coder converts the compressed video bitstream into data units suitable for transmission over wireless channels. Typical operations carried out in the transport coder include channel coding, framing of data, modulation, and control operations required for accessing the wireless channel. At the receiver side, the inverse operations are performed to reconstruct the video signal for display.

In practice, the video communication system is part of a complete multimedia communication system and needs to interact with other system components to achieve the desired functionality. Hence, it becomes necessary to understand the other components of a multimedia communication system in order to design a good video communication system. Figure 31.2 shows the block diagram of a wireless multimedia terminal based on the ITU-T H.324 set of standards [4]. We use the H.324 standard as an example because it is the first videoconferencing standard for which mobile extensions were added to facilitate use on wireless channels. The system components of a multimedia terminal can be grouped into three processing blocks: (1) audio, video, and data (the word data is used here to mean still images/slides, shared files, documents, etc.), (2) control, and (3) multiplex-demultiplex blocks.


TABLE 31.1 List of Relevant Standards

Standard                            Title
ISO/IEC 14496-2 (MPEG-4)            Information Technology - Coding of Audio-Visual Objects: Visual
H.263 (Version 1 and Version 2)     Video coding for low bitrate communication
H.261                               Video codec for audiovisual services at p × 64 kbit/s
H.223                               Multiplexing protocol for low bitrate multimedia communication
H.324                               Terminal for low bitrate multimedia communication
H.245                               Control protocol for multimedia communication
G.723.1                             Dual rate speech coder for multimedia communication transmitting at 5.3 and 6.3 kbit/s

FIGURE 31.1: A wireless video communication system.

1. Audio, video, and data processing blocks: These blocks basically produce/consume the multimedia information that is communicated. The aggregate bitrate generated by these blocks is restricted due to limitations of the wireless channel and, therefore, the total rate allowed has to be judiciously allocated among these blocks. Typically, the video blocks use up the highest percentage of the aggregate rate, followed by audio and then data. H.324 specifies the use of H.261/H.263 for video coding and G.723.1 for audio coding.

2. Control block: This block has a wide variety of responsibilities all aimed at setting up and maintaining a multimedia call. The control block facilitates the set-up of compression methods and preferred bitrates for audio, video, and data to be used in the multimedia call. It is also responsible for end-to-network signalling for accessing the network and end-to-end signalling for reliable operation of the multimedia call. H.245 is the control protocol in the H.324 suite of standards that specifies the control messages to achieve the above functionality.

3. Multiplex-Demultiplex (MUX) block: This block multiplexes the resulting audio, video, data, and control signals into a single stream before transmission on the network. Similarly, the received bitstream is demultiplexed to obtain the audio, video, data, and control signals, which are then passed to their respective processing blocks. The MUX block accesses the network through a suitable network interface. The H.223 standard is the multiplexing scheme used in H.324.

FIGURE 31.2: Configuration of a wireless multimedia terminal.

Proper functioning of the MUX is crucial to the operation of the video communication system, as all the multimedia data/signals flow through it. On wireless channels, transmission errors can lead to a breakdown of the MUX resulting in, for example, nonvideo data being channeled to the video decoder or corrupted video data being passed on to the video decoder. Three annexes were specifically added to H.223 to enable its operation in error-prone environments. Below, we give a more detailed overview of H.223 and point out the levels of error protection provided by H.223 and its three annexes. It should also be noted that MPEG-4 does not specify a lower-level MUX like H.223, and thus H.223 can also be used to transmit MPEG-4 video data.

31.2.1

Recommendation H.223

Video, audio, data, and control information is transmitted in H.324 on distinct logical channels. H.223 determines the way in which the logical channels are mixed into a single bitstream before transmission over the physical channel (e.g., the wireless channel). The H.223 multiplex consists of two layers, the multiplex layer and the adaptation layer, as shown in Fig. 31.2. The multiplex layer is responsible for multiplexing the various logical channels. It transmits the multiplexed stream in the form of packets. The adaptation layer adapts the information stream provided by the applications above it to the multiplex layer below it by adding, where appropriate, additional octets for the purposes of error control and sequence numbering. The type of error control used depends on the type of information (audio/video/data/control) being conveyed in the stream. The adaptation layer provides error control support in the form of both FEC and ARQ.

H.223 was initially targeted for use on the benign general switched telephone network (GSTN). Later on, to enable its use on wireless channels, three annexes (referred to as Levels 1–3, respectively) were defined to provide improved levels of error protection. The initial specification of H.223 is referred to as Level 0. Together, Levels 0–3 provide for a trade-off of error robustness against the overhead required, with Level 0 being the least robust and using the least amount of overhead and Level 3 being the most robust and also using the most amount of overhead.

1. H.223 Level 0: Default mode. In this mode the transmitted packets are of variable length and are delimited by an 8-bit HDLC (High-level Data Link Control) flag (01111110). Each packet consists of a 1-octet header followed by the payload, which consists of a variable number of information octets. The header octet includes a Multiplex Code (MC) which specifies, by indexing to a multiplex table, the logical channels to which each octet in the information field belongs. To prevent emulation of the HDLC flag in the payload, bitstuffing is adopted.

2. H.223 Level 1 (Annex A): Communication over low error-prone channels. The use of bitstuffing leads to poor performance in the presence of errors; therefore, in Level 1, bitstuffing is not performed. The other improvement incorporated in Level 1 is the use of a longer 16-bit pseudo-noise synchronization flag to allow for more reliable detection of packet boundaries. The input bitstream is correlated with the synchronization flag and the output of the correlator is compared with a correlation threshold. Whenever the correlator output is equal to or greater than the threshold, a flag is detected. Since bitstuffing is not performed, it is possible to have this flag emulated in the payload. However, the probability of such an emulation is low and is outweighed by the improvement gained by not using bitstuffing over error-prone channels.

3. H.223 Level 2 (Annex B): Communication over moderately error-prone channels. When compared to the Level 1 operation, Level 2 increases the protection on the packet header. A Multiplex Payload Length (MPL) field, which gives the length of the payload in bytes, is introduced into the header to provide additional redundancy for detecting the length of the video packet. A (24,12,8) extended Golay code is used to protect the MC and the MPL fields. Use of error protection in the header enables robust delineation of packet boundaries. Note that the payload data is not protected in Level 2.

4. H.223 Level 3 (Annex C): Communication over highly error-prone channels. Level 3 goes one step above Level 2 and provides for protection of the payload data. Rate Compatible Punctured Convolutional (RCPC) codes, various CRC polynomials, and ARQ techniques are used for protection of the payload data. Level 3 allows for the payload error protection overhead to vary depending on the channel conditions. RCPC codes are used for achieving this adaptive level of error protection because RCPC codes use the same channel decoder architecture for all the allowed levels of error protection, thereby reducing the complexity of the MUX.
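The two framing approaches of Levels 0 and 1 can be sketched as follows. The bitstuffing functions follow the usual HDLC rule (a 0 is inserted after five consecutive 1s). The 16-bit pattern `PN_FLAG` is a placeholder chosen for illustration and should not be read as the flag defined in Annex A; the threshold values are likewise assumptions.

```python
HDLC_FLAG = [0, 1, 1, 1, 1, 1, 1, 0]


def bit_stuff(bits):
    """Insert a 0 after every run of five 1s so the payload cannot emulate the HDLC flag."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == 1 else 0
        if run == 5:
            out.append(0)
            run = 0
    return out


def bit_unstuff(bits):
    """Remove the 0 that follows every run of five 1s (inverse of bit_stuff)."""
    out, run, i = [], 0, 0
    while i < len(bits):
        out.append(bits[i])
        run = run + 1 if bits[i] == 1 else 0
        if run == 5:
            i += 1          # skip the stuffed 0
            run = 0
        i += 1
    return out


def contains(bits, pattern):
    """True if pattern occurs anywhere in bits."""
    n = len(pattern)
    return any(bits[i:i + n] == pattern for i in range(len(bits) - n + 1))


def detect_flags(bits, flag, threshold):
    """Return offsets where the number of bit agreements with flag meets the threshold."""
    n = len(flag)
    hits = []
    for i in range(len(bits) - n + 1):
        agreements = sum(1 for a, b in zip(bits[i:i + n], flag) if a == b)
        if agreements >= threshold:
            hits.append(i)
    return hits


# Level 0 style: a payload that happens to contain the flag pattern.
payload = [0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0]
stuffed = bit_stuff(payload)

# Level 1 style: placeholder 16-bit flag (hypothetical), received with one bit error.
PN_FLAG = [1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1]
corrupted_flag = PN_FLAG.copy()
corrupted_flag[3] ^= 1
stream = [0] * 8 + corrupted_flag + [0] * 8
```

With an exact-match requirement (threshold 16) the corrupted flag at offset 8 is missed, whereas a threshold of 15 still detects it. Lowering the threshold improves tolerance to channel errors at the cost of a higher chance of false detection in the payload.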

31.3

Error Resilient Video Coding

Even after error control and correction, some amount of residual errors still exist in the compressed bitstream fed to the video decoder in the receiver. Therefore, the video decoder should be robust to these errors


Hsu, H.P. “Sampling” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Sampling

Hwei P. Hsu, Fairleigh Dickinson University

2.1 Introduction
2.2 Instantaneous Sampling
Ideal Sampled Signal • Band-Limited Signals
2.3 Sampling Theorem
2.4 Sampling of Sinusoidal Signals
2.5 Sampling of Bandpass Signals
2.6 Practical Sampling
Natural Sampling • Flat-Top Sampling
2.7 Sampling Theorem in the Frequency Domain
2.8 Summary and Discussion
Defining Terms
References
Further Information

2.1

Introduction

To transmit analog message signals, such as speech signals or video signals, by digital means, the signal has to be converted into digital form. This process is known as analog-to-digital conversion. The sampling process is the first process performed in this conversion, and it converts a continuous-time signal into a discrete-time signal or a sequence of numbers. Digital transmission of analog signals is possible by virtue of the sampling theorem, and the sampling operation is performed in accordance with the sampling theorem. In this chapter, using the Fourier transform technique, we present this remarkable sampling theorem and discuss the operation of sampling and practical aspects of sampling.

2.2

Instantaneous Sampling

Suppose we sample an arbitrary analog signal m(t) shown in Fig. 2.1(a) instantaneously at a uniform rate, once every Ts seconds. As a result of this sampling process, we obtain an infinite sequence of samples {m(nTs)}, where n takes on all integer values. This form of sampling is called instantaneous sampling. We refer to Ts as the sampling interval, and its reciprocal 1/Ts = fs as the sampling rate. Sampling rate (samples per second) is often cited in terms of sampling frequency expressed in hertz.

FIGURE 2.1: Illustration of instantaneous sampling and sampling theorem. [Panels: (a) m(t); (b) M(ω); (c) δTs(t); (d) F[δTs(t)]; (e) ms(t); (f) Ms(ω); (g) δTs(t); (h) F[δTs(t)]; (i) ms(t); (j) Ms(ω). Time-domain panels are plotted against t and frequency-domain panels against ω.]

2.2.1

Ideal Sampled Signal

Let $m_s(t)$ be obtained by multiplication of $m(t)$ by the unit impulse train $\delta_{T_s}(t)$ with period $T_s$ [Fig. 2.1(c)], that is,

$$
\begin{aligned}
m_s(t) &= m(t)\,\delta_{T_s}(t) = m(t) \sum_{n=-\infty}^{\infty} \delta(t - nT_s) \\
       &= \sum_{n=-\infty}^{\infty} m(t)\,\delta(t - nT_s) = \sum_{n=-\infty}^{\infty} m(nT_s)\,\delta(t - nT_s)
\end{aligned}
\tag{2.1}
$$

where we used the property of the δ function, m(t)δ(t − t0 ) = m(t0 )δ(t − t0 ). The signal ms (t) [Fig. 2.1(e)] is referred to as the ideal sampled signal.

2.2.2

Band-Limited Signals

A real-valued signal $m(t)$ is called a band-limited signal if its Fourier transform $M(\omega)$ satisfies the condition

$$
M(\omega) = 0 \quad \text{for } |\omega| > \omega_M
\tag{2.2}
$$

where $\omega_M = 2\pi f_M$ [Fig. 2.1(b)]. A band-limited signal specified by Eq. (2.2) is often referred to as a low-pass signal.

2.3

Sampling Theorem

The sampling theorem states that a band-limited signal $m(t)$ specified by Eq. (2.2) can be uniquely determined from its values $m(nT_s)$ sampled at uniform interval $T_s$ if $T_s \le \pi/\omega_M = 1/(2f_M)$. In fact, when $T_s = \pi/\omega_M$, $m(t)$ is given by

$$
m(t) = \sum_{n=-\infty}^{\infty} m(nT_s)\, \frac{\sin \omega_M (t - nT_s)}{\omega_M (t - nT_s)}
\tag{2.3}
$$

which is known as the Nyquist–Shannon interpolation formula and is also sometimes called the cardinal series. The sampling interval $T_s = 1/(2f_M)$ is called the Nyquist interval and the minimum rate $f_s = 1/T_s = 2f_M$ is known as the Nyquist rate. Illustration of the instantaneous sampling process and the sampling theorem is shown in Fig. 2.1. The Fourier transform of the unit impulse train is given by [Fig. 2.1(d)]

$$
\mathcal{F}\left[\delta_{T_s}(t)\right] = \omega_s \sum_{n=-\infty}^{\infty} \delta(\omega - n\omega_s), \qquad \omega_s = 2\pi/T_s
\tag{2.4}
$$

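Then, by the convolution property of the Fourier transform, the Fourier transform $M_s(\omega)$ of the ideal sampled signal $m_s(t)$ is given by

$$
\begin{aligned}
M_s(\omega) &= \frac{1}{2\pi} \left[ M(\omega) * \omega_s \sum_{n=-\infty}^{\infty} \delta(\omega - n\omega_s) \right] \\
            &= \frac{1}{T_s} \sum_{n=-\infty}^{\infty} M(\omega - n\omega_s)
\end{aligned}
\tag{2.5}
$$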

where ∗ denotes convolution and we used the convolution property of the δ-function, M(ω) ∗ δ(ω − ω0) = M(ω − ω0). Thus, sampling has produced images of M(ω) along the frequency axis. Note that Ms(ω) will repeat periodically without overlap as long as ωs ≥ 2ωM or fs ≥ 2fM [Fig. 2.1(f)]. It is clear from Fig. 2.1(f) that we can recover M(ω) and, hence, m(t) by passing the sampled signal ms(t) through an ideal low-pass filter having frequency response

\[
H(\omega) =
\begin{cases}
T_s, & |\omega| \le \omega_M \\
0,   & \text{otherwise}
\end{cases}
\tag{2.6}
\]

where ωM = π/Ts. Then

\[
M(\omega) = M_s(\omega)\,H(\omega)
\tag{2.7}
\]

Taking the inverse Fourier transform of Eq. (2.6), we obtain the impulse response h(t) of the ideal low-pass filter as

\[
h(t) = \frac{\sin \omega_M t}{\omega_M t}
\tag{2.8}
\]

Taking the inverse Fourier transform of Eq. (2.7), we obtain

\[
m(t) = m_s(t) * h(t)
     = \sum_{n=-\infty}^{\infty} m(nT_s)\,\delta(t - nT_s) * \frac{\sin \omega_M t}{\omega_M t}
     = \sum_{n=-\infty}^{\infty} m(nT_s)\,\frac{\sin \omega_M (t - nT_s)}{\omega_M (t - nT_s)}
\tag{2.9}
\]

which is Eq. (2.3).

The situation shown in Fig. 2.1(j) corresponds to the case where fs < 2fM. In this case there is an overlap between M(ω) and M(ω − ωs). This overlap of the spectra is known as aliasing or foldover. When aliasing occurs, the signal is distorted and it is impossible to recover the original signal m(t) from the sampled signal. To avoid aliasing, in practice, the signal is sampled at a rate slightly higher than the Nyquist rate. If fs > 2fM, then as shown in Fig. 2.1(f), there is a gap between the upper limit ωM of M(ω) and the lower limit ωs − ωM of M(ω − ωs). This range from ωM to ωs − ωM is called a guard band.

As an example, speech transmitted via telephone is generally limited to fM = 3.3 kHz (by passing the speech signal through a low-pass filter before sampling). The Nyquist rate is, thus, 6.6 kHz. For digital transmission, the speech is normally sampled at the rate fs = 8 kHz. The guard band is then fs − 2fM = 1.4 kHz. The use of a sampling rate higher than the Nyquist rate also has the desirable effect of making it somewhat easier to design the low-pass reconstruction filter so as to recover the original signal from the sampled signal.
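The interpolation formula of Eq. (2.9) and the guard-band arithmetic can be checked numerically. The following is a minimal sketch, not a definitive implementation: the function names, the 1 kHz test tone, and the truncation of the infinite series to 1001 terms are all assumptions made for illustration, and the truncation means the reconstruction is only approximate between sampling instants.

```python
import math


def sinc_kernel(t, n, ts):
    """sin(w_M (t - n Ts)) / (w_M (t - n Ts)) with w_M = pi / Ts, as in Eq. (2.3)."""
    x = math.pi * (t - n * ts) / ts
    return 1.0 if x == 0.0 else math.sin(x) / x


def reconstruct(samples, t, ts):
    """Truncated cardinal series; `samples` maps the integer n to m(n Ts)."""
    return sum(m_n * sinc_kernel(t, n, ts) for n, m_n in samples.items())


def guard_band(fs, f_m):
    """Width fs - 2 f_M of the gap between M(w) and its first image (same units as fs)."""
    return fs - 2.0 * f_m


# Hypothetical test signal (an assumption): a 1 kHz cosine sampled at 8 kHz.
FS = 8000.0
TS = 1.0 / FS
F0 = 1000.0
SAMPLES = {n: math.cos(2.0 * math.pi * F0 * n * TS) for n in range(-500, 501)}

t_mid = 0.4 * TS                                  # an instant between two samples
estimate = reconstruct(SAMPLES, t_mid, TS)        # value from the truncated series
exact = math.cos(2.0 * math.pi * F0 * t_mid)      # value of the original signal

telephone_guard_band = guard_band(8000.0, 3300.0)  # telephone example, in Hz
```

At a sampling instant the kernel is 1 for the matching sample and essentially 0 for all others, so the series returns the sample itself; between instants the truncated sum approaches the exact value as more terms are included.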

2.4 Sampling of Sinusoidal Signals

A special case is the sampling of a sinusoidal signal having the frequency fM. In this case we require that fs > 2fM rather than fs ≥ 2fM. To see that this condition is necessary, let fs = 2fM. Now, if an initial sample is taken at the instant the sinusoidal signal is zero, then all successive samples will also be zero. This situation is avoided by requiring fs > 2fM.
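This degenerate case is easy to reproduce. The sketch below samples a sine wave starting at a zero crossing, once at exactly fs = 2fM and once at a slightly higher rate; the tone frequency, the factor 2.5, and the function name are illustrative assumptions.

```python
import math


def sample_sine(f_m, fs, count, phase=0.0):
    """Samples of sin(2*pi*f_m*t + phase) taken at t = n/fs for n = 0..count-1."""
    return [math.sin(2.0 * math.pi * f_m * n / fs + phase) for n in range(count)]


F_M = 1000.0  # hypothetical tone frequency in Hz

# fs = 2 f_M with the first sample at a zero crossing: every sample is (numerically) zero.
at_nyquist = sample_sine(F_M, 2.0 * F_M, 16)

# fs > 2 f_M: the samples sweep through the waveform and reveal the sinusoid.
above_nyquist = sample_sine(F_M, 2.5 * F_M, 16)

peak_at_nyquist = max(abs(s) for s in at_nyquist)
peak_above_nyquist = max(abs(s) for s in above_nyquist)
```

In floating-point arithmetic the samples at fs = 2fM are not exactly zero but are on the order of rounding error, which is why the check compares against a small tolerance.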

2.5 Sampling of Bandpass Signals

A real-valued signal m(t) is called a bandpass signal if its Fourier transform M(ω) satisfies the condition

\[
M(\omega) = 0 \quad \text{except for} \quad
\begin{cases}
\omega_1 < \omega < \omega_2 \\
-\omega_2 < \omega < -\omega_1
\end{cases}
\tag{2.10}
\]

where ω1 = 2πf1 and ω2 = 2πf2 [Fig. 2.2(a)].

FIGURE 2.2: (a) Spectrum of a bandpass signal; (b) Shifted spectra of M−(ω).

The sampling theorem for a band-limited signal has shown that a sampling rate of 2f2 or greater is adequate for a low-pass signal having the highest frequency f2. Therefore, treating m(t) specified by Eq. (2.10) as a special case of such a low-pass signal, we conclude that a sampling rate of 2f2 is adequate for the sampling of the bandpass signal m(t). But it is not necessary to sample this fast. The minimum allowable sampling rate depends on f1, f2, and the bandwidth fB = f2 − f1.

Let us consider the direct sampling of the bandpass signal specified by Eq. (2.10). The spectrum of the sampled signal is periodic with the period ωs = 2πfs, where fs is the sampling frequency, as in Eq. (2.4). Shown in Fig. 2.2(b) are the two right-shifted spectra of the negative-side spectrum M−(ω). If the bandpass signal is recovered by passing the sampled signal through an ideal bandpass filter covering the frequency bands (−ω2, −ω1) and (ω1, ω2), there must be no aliasing. From Fig. 2.2(b), it is clear that to avoid overlap it is necessary that

\[
\omega_s \ge 2(\omega_2 - \omega_1)
\tag{2.11}
\]

\[
(k - 1)\,\omega_s - \omega_1 \le \omega_1
\tag{2.12}
\]

and

\[
k\,\omega_s - \omega_2 \ge \omega_2
\tag{2.13}
\]

where ω1 = 2πf1, ω2 = 2πf2, and k is an integer (k = 1, 2, ...). Since f1 = f2 − fB, these constraints can be expressed as

\[
1 \le k \le \frac{f_2}{f_B} \le \frac{k}{2}\,\frac{f_s}{f_B}
\tag{2.14}
\]

and

\[
\frac{k - 1}{2}\,\frac{f_s}{f_B} \le \frac{f_2}{f_B} - 1
\tag{2.15}
\]

A graphical description of Eqs. (2.14) and (2.15) is given in Fig. 2.3. The unshaded regions represent where the constraints are satisfied, whereas the shaded regions represent the regions where the constraints are not satisfied and overlap will occur. The solid line in Fig. 2.3 shows the locus of the minimum sampling rate. The minimum sampling rate is given by

\[
\min\{f_s\} = \frac{2 f_2}{m}
\tag{2.16}
\]

where m is the largest integer not exceeding f2/fB. Note that if the ratio f2/fB is an integer, then the minimum sampling rate is 2fB. As an example, consider a bandpass signal with f1 = 1.5 kHz and f2 = 2.5 kHz. Here fB = f2 − f1 = 1 kHz, and f2/fB = 2.5. Then from Eq. (2.16) and Fig. 2.3 we see that the minimum sampling rate is 2f2/2 = f2 = 2.5 kHz, and allowable ranges of sampling rate are 2.5 kHz ≤ fs ≤ 3 kHz and fs ≥ 5 kHz (= 2f2).

FIGURE 2.3: Minimum and permissible sampling rates for a bandpass signal (fs/fB plotted against f2/fB, with boundaries for k = 1, 2, 3).
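The constraints of Eqs. (2.12), (2.13), and (2.16) translate directly into a short routine. The following sketch lists, for each admissible k, the interval 2f2/k ≤ fs ≤ 2f1/(k − 1); the function names are assumptions, and frequencies may be given in any consistent unit. Because it relies on floating-point division, a ratio f2/fB that is meant to be an exact integer should be supplied with values that divide exactly.

```python
import math


def bandpass_sampling_ranges(f1, f2):
    """Allowable sampling-rate intervals (low, high) for a bandpass signal on (f1, f2).

    For each integer k with 1 <= k <= f2/fB, Eq. (2.13) gives fs >= 2*f2/k and
    Eq. (2.12) gives fs <= 2*f1/(k - 1). The k = 1 interval has no upper limit.
    Intervals are returned from the lowest rates to the highest.
    """
    fb = f2 - f1
    m = math.floor(f2 / fb)          # largest integer not exceeding f2/fB
    ranges = []
    for k in range(m, 0, -1):
        low = 2.0 * f2 / k
        high = 2.0 * f1 / (k - 1) if k > 1 else math.inf
        ranges.append((low, high))
    return ranges


def min_sampling_rate(f1, f2):
    """Minimum sampling rate 2*f2/m of Eq. (2.16)."""
    return bandpass_sampling_ranges(f1, f2)[0][0]


# Example from the text, in kHz: f1 = 1.5, f2 = 2.5.
example_ranges = bandpass_sampling_ranges(1.5, 2.5)
example_minimum = min_sampling_rate(1.5, 2.5)
```

For the example in the text this yields the two intervals 2.5 to 3 kHz and 5 kHz upward, with a minimum rate of 2.5 kHz.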

2.6 Practical Sampling

In practice, the sampling of an analog signal is performed by means of high-speed switching circuits, and the sampling process takes the form of natural sampling or flat-top sampling.

2.6.1 Natural Sampling

Natural sampling of a band-limited signal m(t) is shown in Fig. 2.4. The sampled signal mns(t) can be expressed as

\[
m_{ns}(t) = m(t)\,x_p(t)
\tag{2.17}
\]

where xp(t) is the periodic train of rectangular pulses with fundamental period Ts, and each rectangular pulse in xp(t) has duration d and unit amplitude [Fig. 2.4(b)]. Observe that the sampled signal mns(t) consists of a sequence of pulses of varying amplitude whose tops follow the waveform of the signal m(t) [Fig. 2.4(c)].

FIGURE 2.4: Natural sampling: (a) m(t); (b) xp(t); (c) mns(t).

The Fourier transform of xp(t) is

\[
X_p(\omega) = 2\pi \sum_{n=-\infty}^{\infty} c_n\,\delta(\omega - n\omega_s),
\qquad \omega_s = 2\pi/T_s
\tag{2.18}
\]

where

\[
c_n = \frac{d}{T_s}\,\frac{\sin(n\omega_s d/2)}{n\omega_s d/2}\,e^{-jn\omega_s d/2}
\tag{2.19}
\]

Then the Fourier transform of mns(t) is given by

\[
M_{ns}(\omega) = \frac{1}{2\pi}\,M(\omega) * X_p(\omega)
               = \sum_{n=-\infty}^{\infty} c_n\,M(\omega - n\omega_s)
\tag{2.20}
\]

from which we see that the effect of natural sampling is to multiply the nth shifted spectrum M(ω − nωs) by a constant cn. Thus, the original signal m(t) can be reconstructed from mns(t) with no distortion by passing mns(t) through an ideal low-pass filter if the sampling rate fs is equal to or greater than the Nyquist rate 2fM.
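The coefficients of Eq. (2.19) can be evaluated directly to see how each spectral image is weighted. The sketch below is illustrative only; the function name, the 125 µs period, and the 25 percent duty cycle are assumptions.

```python
import cmath
import math


def natural_sampling_coeff(n, d, ts):
    """Fourier coefficient c_n of Eq. (2.19) for pulses of width d and period ts."""
    if n == 0:
        return complex(d / ts, 0.0)      # limit of sin(x)/x as x -> 0
    x = n * math.pi * d / ts             # n * w_s * d / 2 with w_s = 2*pi/ts
    return (d / ts) * (math.sin(x) / x) * cmath.exp(-1j * x)


TS = 125e-6      # hypothetical sampling period in seconds
D = TS / 4       # hypothetical pulse width (25 percent duty cycle)

coeffs = {n: natural_sampling_coeff(n, D, TS) for n in range(-4, 5)}
```

The baseband image is scaled by c0 = d/Ts, so the reconstruction filter needs a gain of Ts/d to restore the original amplitude; with a 25 percent duty cycle the fourth image vanishes because sin(nωs d/2) is zero there.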

2.6.2 Flat-Top Sampling

The sampled waveform produced by practical sampling devices of the sample-and-hold type has the form [Fig. 2.5(c)]

\[
m_{fs}(t) = \sum_{n=-\infty}^{\infty} m(nT_s)\,p(t - nT_s)
\tag{2.21}
\]

where p(t) is a rectangular pulse of duration d with unit amplitude [Fig. 2.5(a)]. This type of sampling is known as flat-top sampling. Using the ideal sampled signal ms(t) of Eq. (2.1), mfs(t) can be expressed as

\[
m_{fs}(t) = p(t) * \left[ \sum_{n=-\infty}^{\infty} m(nT_s)\,\delta(t - nT_s) \right] = p(t) * m_s(t)
\tag{2.22}
\]

Using the convolution property of the Fourier transform and Eq. (2.5), the Fourier transform of mfs(t) is given by

\[
M_{fs}(\omega) = P(\omega)\,M_s(\omega)
               = \frac{1}{T_s} \sum_{n=-\infty}^{\infty} P(\omega)\,M(\omega - n\omega_s)
\tag{2.23}
\]

where

\[
P(\omega) = d\,\frac{\sin(\omega d/2)}{\omega d/2}\,e^{-j\omega d/2}
\tag{2.24}
\]

From Eq. (2.23) we see that by using flat-top sampling we have introduced amplitude distortion and time delay, and the primary effect is an attenuation of high-frequency components. This effect is known as the aperture effect. The aperture effect can be compensated by an equalizing filter with a frequency response Heq(ω) = 1/P(ω). If the pulse duration d is chosen such that d ≪ Ts, however, then P(ω) is essentially constant over the baseband and no equalization may be needed.

FIGURE 2.5: Flat-top sampling: (a) p(t); (b) m(t); (c) mfs(t).
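The size of the aperture effect follows from the magnitude of P(ω) in Eq. (2.24). The sketch below evaluates the attenuation at the band edge for a full-width hold (d = Ts) and for a short pulse (d = 0.1Ts), reusing the telephone figures fs = 8 kHz and fM = 3.3 kHz from Section 2.3; the function names and the choice of pulse widths are assumptions.

```python
import math


def aperture_gain(f, d):
    """|P(2*pi*f)| / d from Eq. (2.24): the sin(x)/x factor with x = pi*f*d."""
    x = math.pi * f * d
    return 1.0 if x == 0.0 else abs(math.sin(x) / x)


def aperture_loss_db(f, d):
    """Attenuation in dB at frequency f relative to the gain at f = 0.

    Undefined at the nulls of P (f*d a nonzero integer), where the gain is zero.
    """
    return -20.0 * math.log10(aperture_gain(f, d))


def equalizer_gain(f, d):
    """Magnitude of Heq = 1/P, normalized so that the gain at f = 0 is 1."""
    return 1.0 / aperture_gain(f, d)


FS = 8000.0      # sampling rate in Hz
TS = 1.0 / FS
F_M = 3300.0     # band edge in Hz

loss_full_hold = aperture_loss_db(F_M, TS)            # d = Ts
loss_short_pulse = aperture_loss_db(F_M, 0.1 * TS)    # d = 0.1 Ts
```

With a full-width hold the band edge is attenuated by roughly 2.6 dB, which an equalizer would need to restore, whereas with d = 0.1Ts the loss is a few hundredths of a decibel, consistent with the remark that no equalization may be needed when d ≪ Ts.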

2.7 Sampling Theorem in the Frequency Domain

The sampling theorem expressed in Eq. (2.3) is the time-domain sampling theorem. There is a dual to this time-domain sampling theorem, i.e., the sampling theorem in the frequency domain.

Time-limited signals: A continuous-time signal m(t) is called time limited if

\[
m(t) = 0 \quad \text{for } |t| > |T_0|
\tag{2.25}
\]

Frequency-domain sampling theorem: The frequency-domain sampling theorem states that the Fourier transform M(ω) of a time-limited signal m(t) specified by Eq. (2.25) can be uniquely determined from its values M(nωs) sampled at uniform intervals ωs if ωs ≤ π/T0. In fact, when ωs = π/T0, M(ω) is given by

\[
M(\omega) = \sum_{n=-\infty}^{\infty} M(n\omega_s)\,
            \frac{\sin T_0(\omega - n\omega_s)}{T_0(\omega - n\omega_s)}
\tag{2.26}
\]

2.8 Summary and Discussion

The sampling theorem is the fundamental principle of digital communications. We state the sampling theorem in two parts.

THEOREM 2.1 If the signal contains no frequency higher than fM Hz, it is completely described by specifying its samples taken at instants of time spaced 1/(2fM) s apart.

THEOREM 2.2 The signal can be completely recovered from its samples taken at the rate of 2fM samples per second or higher.

The preceding sampling theorem assumes that the signal is strictly band limited. It is known that if a signal is band limited it cannot be time limited and vice versa. In many practical applications, the signal to be sampled is time limited and, consequently, it cannot be strictly band limited. Nevertheless, we know that the frequency components of physically occurring signals attenuate rapidly beyond some defined bandwidth, and for practical purposes we consider these signals to be band limited. This approximation of real signals by band-limited ones introduces no significant error in the application of the sampling theorem. When such a signal is sampled, we band limit the signal by filtering before sampling and sample at a rate slightly higher than the nominal Nyquist rate.

Defining Terms

Band-limited signal: A signal whose frequency content (Fourier transform) is equal to zero above some specified frequency.
Bandpass signal: A signal whose frequency content (Fourier transform) is nonzero only in a band of frequencies not including the origin.
Flat-top sampling: Sampling with finite width pulses that maintain a constant value for a time period less than or equal to the sampling interval. The constant value is the amplitude of the signal at the desired sampling instant.
Ideal sampled signal: A signal sampled using an ideal impulse train.
Nyquist rate: The minimum allowable sampling rate of 2fM samples per second, to reconstruct a signal band limited to fM hertz.
Nyquist–Shannon interpolation formula: The infinite series representing a time domain waveform in terms of its ideal samples taken at uniform intervals.
Sampling interval: The time between samples in uniform sampling.
Sampling rate: The number of samples taken per second (expressed in hertz and equal to the reciprocal of the sampling interval).
Time-limited: A signal that is zero outside of some specified time interval.

References

[1] Brown, J.L., Jr., First order sampling of bandpass signals: A new approach. IEEE Trans. Information Theory, IT-26(5), 613–615, 1980.
[2] Byrne, C.L. and Fitzgerald, R.M., Time-limited sampling theorem for band-limited signals. IEEE Trans. Information Theory, IT-28(5), 807–809, 1982.
[3] Hsu, H.P., Applied Fourier Analysis, Harcourt Brace Jovanovich, San Diego, CA, 1984.
[4] Hsu, H.P., Analog and Digital Communications, McGraw-Hill, New York, 1993.
[5] Hulthén, R., Restoring causal signals by analytical continuation: A generalized sampling theorem for causal signals. IEEE Trans. Acoustics, Speech, and Signal Processing, ASSP-31(5), 1294–1298, 1983.
[6] Jerri, A.J., The Shannon sampling theorem: Its various extensions and applications: A tutorial review. Proc. IEEE, 65(11), 1565–1596, 1977.

Further Information

For a tutorial review of the sampling theorem, historical notes, and earlier references see Jerri [6].


Couch, II, L.W. “Pulse Code Modulation” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999

Pulse Code Modulation¹

Leon W. Couch, II
University of Florida

¹Source: Leon W. Couch, II. 1997. Digital and Analog Communication Systems, 5th ed., Prentice Hall, Upper Saddle River, NJ. With permission.

3.1 Introduction
3.2 Generation of PCM
3.3 Percent Quantizing Noise
3.4 Practical PCM Circuits
3.5 Bandwidth of PCM
3.6 Effects of Noise
3.7 Nonuniform Quantizing: µ-Law and A-Law Companding
3.8 Example: Design of a PCM System
Defining Terms
References
Further Information

3.1 Introduction

Pulse code modulation (PCM) is analog-to-digital conversion of a special type where the information contained in the instantaneous samples of an analog signal is represented by digital words in a serial bit stream. If we assume that each of the digital words has n binary digits, there are M = 2^n unique code words that are possible, each code word corresponding to a certain amplitude level. Each sample value from the analog signal, however, can be any one of an infinite number of levels, so that the digital word that represents the amplitude closest to the actual sampled value is used. This is called quantizing. That is, instead of using the exact sample value of the analog waveform, the sample is replaced by the closest allowed value, where there are M allowed values, and each allowed value corresponds to one of the code words.

PCM is very popular because of the many advantages it offers. Some of these advantages are as follows.

• Relatively inexpensive digital circuitry may be used extensively in the system.
• PCM signals derived from all types of analog sources (audio, video, etc.) may be time-division multiplexed with data signals (e.g., from digital computers) and transmitted over a common high-speed digital communication system.
• In long-distance digital telephone systems requiring repeaters, a clean PCM waveform can be regenerated at the output of each repeater, where the input consists of a noisy PCM waveform. The noise at the input, however, may cause bit errors in the regenerated PCM output signal.
• The noise performance of a digital system can be superior to that of an analog system. In addition, the probability of error for the system output can be reduced even further by the use of appropriate coding techniques.

These advantages usually outweigh the main disadvantage of PCM: a much wider bandwidth than that of the corresponding analog signal.

3.2 Generation of PCM

The PCM signal is generated by carrying out three basic operations: sampling, quantizing, and encoding (see Fig. 3.1). The sampling operation generates an instantaneously-sampled flat-top pulse-amplitude modulated (PAM) signal. The quantizing operation is illustrated in Fig. 3.2 for the M = 8 level case. This quantizer is said to be uniform since all of the steps are of equal size. Since we are approximating the analog sample values by using a finite number of levels (M = 8 in this illustration), error is introduced into the recovered output analog signal because of the quantizing effect. The error waveform is illustrated in Fig. 3.2c. The quantizing error consists of the difference between the analog signal at the sampler input and the output of the quantizer. Note that the peak value of the error (±1) is one-half of the quantizer step size (2). If we sample at the Nyquist rate (2B, where B is the absolute bandwidth, in hertz, of the input analog signal) or faster and there is negligible channel noise, there will still be noise, called quantizing noise, on the recovered analog waveform due to this error. The quantizing noise can also be thought of as a round-off error. The quantizer output is a quantized (i.e., only M possible amplitude values) PAM signal.

FIGURE 3.1: A PCM transmitter. Source: Couch, L.W. II 1997. Digital and Analog Communication Systems, 5th ed., Prentice Hall, Upper Saddle River, NJ, p. 138. With permission.

The PCM signal is obtained from the quantized PAM signal by encoding each quantized sample value into a digital word. It is up to the system designer to specify the exact code word that will represent a particular quantized level. If the Gray code of Table 3.1 is used, the resulting PCM signal is shown in Fig. 3.2d, where the PCM word for each quantized sample is strobed out of the encoder by the next clock pulse. The Gray code was chosen because it has only a 1-b change for each step change in the quantized level. Consequently, single errors in the received PCM code word will cause minimum errors in the recovered analog level, provided that the sign bit is not in error.

FIGURE 3.2: Illustration of waveforms in a PCM system. Source: Couch, L.W. II 1997. Digital and Analog Communication Systems, 5th ed., Prentice Hall, Upper Saddle River, NJ, p. 139. With permission.

Here we have described PCM systems that represent the quantized analog sample values by binary code words. Of course, it is possible to represent the quantized analog samples by digital words using other than base 2. That is, for base q, the number of quantized levels allowed is M = q^n, where n is the number of q base digits in the code word. We will not pursue this topic since binary (q = 2) digital circuits are most commonly used.

TABLE 3.1 3-b Gray Code for M = 8 Levels

Quantized Sample Voltage    Gray Code Word (PCM Output)
+7                          110
+5                          111
+3                          101
+1                          100
        (Mirror image except for sign bit)
−1                          000
−3                          001
−5                          011
−7                          010

Source: Couch, L.W., II. 1997. Digital and Analog Communication Systems, 5th ed., Prentice Hall, Upper Saddle River, NJ, p. 140. With permission.
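The quantizing and encoding steps for the M = 8 case can be sketched in a few lines. The code-word mapping below is taken from Table 3.1, and the 2 V step size from the description of Fig. 3.2; the function names, the clipping of out-of-range inputs to the extreme levels, and the sample values are assumptions made for illustration.

```python
import math

# Table 3.1: quantized sample voltage -> 3-b Gray code word
GRAY_CODE = {7: "110", 5: "111", 3: "101", 1: "100",
             -1: "000", -3: "001", -5: "011", -7: "010"}

STEP = 2.0  # quantizer step size in volts


def quantize(sample):
    """Map a sample to an allowed level in {-7, -5, ..., +5, +7} V."""
    index = math.floor(sample / STEP)   # which 2 V interval the sample falls in
    index = max(-4, min(3, index))      # clip inputs beyond the design range
    return 2 * index + 1                # centre of that interval


def encode(samples):
    """Return the serial PCM bit stream for a sequence of analog samples."""
    return "".join(GRAY_CODE[quantize(s)] for s in samples)


LEVELS = sorted(GRAY_CODE)              # -7, -5, ..., +7
bit_changes = [sum(a != b for a, b in zip(GRAY_CODE[lo], GRAY_CODE[hi]))
               for lo, hi in zip(LEVELS, LEVELS[1:])]

stream = encode([0.3, 2.5, -4.1])       # hypothetical sample values in volts
```

The list `bit_changes` counts the bits that differ between code words of adjacent levels; for the Gray code every entry is 1, which is the property that limits the effect of a single bit error.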

3.3 Percent Quantizing Noise

The quantizer at the PCM encoder produces an error signal at the PCM decoder output as illustrated in Fig. 3.2c. The peak value of this error signal may be expressed as a percentage of the maximum possible analog signal amplitude. Referring to Fig. 3.2c, a peak error of 1 V occurs for a maximum analog signal amplitude of M = 8 V, as shown in Fig. 3.2. Thus, in general,

\[
\frac{2P}{100} = \frac{1}{M} = \frac{1}{2^n}
\]

or

\[
2^n = \frac{50}{P}
\tag{3.1}
\]

where P is the peak percentage error for a PCM system that uses n-bit code words. The design value of n needed in order to have less than P percent error is obtained by taking the base 2 logarithm of both sides of Eq. (3.1), where it is realized that log2(x) = [log10(x)]/log10(2) = 3.32 log10(x). That is,

\[
n \ge 3.32 \log_{10}\left(\frac{50}{P}\right)
\tag{3.2}
\]

where n is the number of bits needed in the PCM word in order to obtain less than P percent error in the recovered analog signal (i.e., decoded PCM signal).
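Equations (3.1) and (3.2) amount to a one-line design calculation. A minimal sketch follows; the function names and the lower limit of one bit are assumptions.

```python
import math


def bits_for_peak_error(p_percent):
    """Smallest integer n satisfying Eq. (3.2), n >= log2(50 / P)."""
    if p_percent <= 0:
        raise ValueError("P must be a positive percentage")
    # At least one bit is always needed, hence the lower limit of 1 (an assumption).
    return max(1, math.ceil(math.log2(50.0 / p_percent)))


def peak_error_percent(n_bits):
    """Eq. (3.1) solved for P: P = 50 / 2**n."""
    return 50.0 / 2 ** n_bits


bits_for_one_percent = bits_for_peak_error(1.0)
error_for_three_bits = peak_error_percent(3)
```

For example, a 1 percent peak error requires 6 bits, since 2^5 = 32 falls short of 50 while 2^6 = 64 exceeds it, and a 3-b word corresponds to P = 6.25 percent.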

3.4 Practical PCM Circuits

Three techniques are used to implement the analog-to-digital converter (ADC) encoding operation. These are the counting or ramp, serial or successive approximation, and parallel or flash encoders.

In the counting encoder, at the same time that the sample is taken, a ramp generator is energized and a binary counter is started. The output of the ramp generator is continuously compared to the sample value; when the value of the ramp becomes equal to the sample value, the binary value of the counter is read. This count is taken to be the PCM word. The binary counter and the ramp generator are then reset to zero and are ready to be reenergized at the next sampling time. This technique requires only a few components, but the speed of this type of ADC is usually limited by the speed of the counter. The Maxim ICL7126 CMOS ADC integrated circuit uses this technique.

The serial encoder compares the value of the sample with trial quantized values. Successive trials depend on whether the past comparator outputs are positive or negative. The trial values are chosen first in large steps and then in small steps so that the process will converge rapidly. The trial voltages are generated by a series of voltage dividers that are configured by (on-off) switches. These switches are controlled by digital logic. After the process converges, the value of the switch settings is read out as the PCM word. This technique requires more precision components (for the voltage dividers) than the ramp technique. The speed of the feedback ADC technique is determined by the speed of the switches. The National Semiconductor ADC0804 8-b ADC uses this technique.

The parallel encoder uses a set of parallel comparators with reference levels that are the permitted quantized values. The sample value is fed into all of the parallel comparators simultaneously. The high or low level of the comparator outputs determines the binary PCM word with the aid of some digital logic. This is a fast ADC technique but requires more hardware than the other two methods. The Harris CA3318 8-b ADC integrated circuit is an example of the technique.

All of the integrated circuits listed as examples have parallel digital outputs that correspond to the digital word that represents the analog sample value. For generation of PCM, the parallel output (digital word) needs to be converted to serial form for transmission over a two-wire channel. This is accomplished by using a parallel-to-serial converter integrated circuit, which is also known as a serial-input-output (SIO) chip. The SIO chip includes a shift register that is set to contain the parallel data (usually, from 8 or 16 input lines). Then the data are shifted out of the last stage of the shift register bit by bit onto a single output line to produce the serial format. Furthermore, the SIO chips are usually full duplex; that is, they have two sets of shift registers, one that functions for data flowing in each direction. One shift register converts parallel input data to serial output data for transmission over the channel, and, simultaneously, the other shift register converts received serial data from another input to parallel data that are available at another output.

Three types of SIO chips are available: the universal asynchronous receiver/transmitter (UART), the universal synchronous receiver/transmitter (USRT), and the universal synchronous/asynchronous receiver/transmitter (USART). The UART transmits and receives asynchronous serial data, the USRT transmits and receives synchronous serial data, and the USART combines both a UART and a USRT on one chip.

At the receiving end the PCM signal is decoded back into an analog signal by using a digital-to-analog converter (DAC) chip. If the DAC chip has a parallel data input, the received serial PCM data are first converted to a parallel form using an SIO chip as described in the preceding paragraph. The parallel data are then converted to an approximation of the analog sample value by the DAC chip. This conversion is usually accomplished by using the parallel digital word to set the configuration of electronic switches on a resistive current (or voltage) divider network so that the analog output is produced. This is called a multiplying DAC since the analog output voltage is directly proportional to the divider reference voltage multiplied by the value of the digital word. The Motorola MC1408 and the National Semiconductor DAC0808 8-b DAC chips are examples of this technique. The DAC chip outputs samples of the quantized analog signal that approximate the analog sample values. This output may be smoothed by a low-pass reconstruction filter to produce the analog output.

The Communications Handbook [6, pp. 107–117] and The Electrical Engineering Handbook [5, pp. 771–782] give more details on ADC, DAC, and PCM circuits.
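The serial (successive approximation) encoder and the parallel-to-serial conversion described above can be modeled in software. The following is a behavioral sketch only, not a description of any of the named chips: the unipolar 0 to v_ref input range, the 8-b word length, the MSB-first bit order, and the function names are all assumptions.

```python
def successive_approximation(sample, v_ref=1.0, n_bits=8):
    """Binary search for the code word, trying large steps first and then small ones."""
    code = 0
    for bit in range(n_bits - 1, -1, -1):
        trial = code | (1 << bit)                        # tentatively switch this bit on
        trial_voltage = v_ref * trial / (1 << n_bits)    # divider output for the trial code
        if sample >= trial_voltage:                      # comparator decision
            code = trial                                 # keep the bit
    return code


def parallel_to_serial(code, n_bits=8):
    """Shift the parallel word out one bit at a time, most significant bit first."""
    return [(code >> bit) & 1 for bit in range(n_bits - 1, -1, -1)]


word = successive_approximation(0.3)     # hypothetical 0.3 V sample, 1 V reference
serial_bits = parallel_to_serial(word)
```

Each pass through the loop corresponds to one comparator decision, so an n-bit conversion takes n trials, compared with up to 2^n counter steps for the ramp encoder.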

c

3.5

Bandwidth of PCM

A good question to ask is: What is the spectrum of a PCM signal? For the case of PAM signalling, the spectrum of the PAM signal could be obtained as a function of the spectrum of the input analog signal because the PAM signal is a linear function of the analog signal. This is not the case for PCM. As shown in Figs. 3.1 and 3.2, the PCM signal is a nonlinear function of the input signal. Consequently, the spectrum of the PCM signal is not directly related to the spectrum of the input analog signal. It can be shown that the spectrum of the PCM signal depends on the bit rate, the correlation of the PCM data, and on the PCM waveform pulse shape (usually rectangular) used to describe the bits [2, 3]. From Fig. 3.2, the bit rate is (3.3) R = nfs where n is the number of bits in the PCM word (M = 2n ) and fs is the sampling rate. For no aliasing we require fs ≥ 2B where B is the bandwidth of the analog signal (that is to be converted to the PCM signal). The dimensionality theorem [2, 3] shows that the bandwidth of the PCM waveform is bounded by 1 1 (3.4) BPCM ≥ R = nfs 2 2 where equality is obtained if a (sin x)/x type of pulse shape is used to generate the PCM waveform. The exact spectrum for the PCM waveform will depend on the pulse shape that is used as well as on the type of line encoding. For example, if one uses a rectangular pulse shape with polar nonreturn to zero (NRZ) line coding, the first null bandwidth is simply BPCM = R = nfs Hz

(3.5)

Table 3.2 presents a tabulation of this result for the case of the minimum sampling rate, fs = 2B. Note that Eq. (3.4) demonstrates that the bandwidth of the PCM signal has a lower bound given by BPCM ≥ nBψ

(3.6)

where fs > 2B and B is the bandwidth of the corresponding analog signal. Thus, for reasonable values of n, the bandwidth of the PCM signal will be significantly larger than the bandwidth of the corresponding analog signal that it represents. For the example shown in Fig. 3.2 where n = 3, the PCM signal bandwidth will be at least three times wider than that of the corresponding analog signal. Furthermore, if the bandwidth of the PCM signal is reduced by improper filtering or by passing the PCM signal through a system that has a poor frequency response, the filtered pulses will be elongated (stretched in width) so that pulses corresponding to any one bit will smear into adjacent bit slots. If this condition becomes too serious, it will cause errors in the detected bits. This pulse smearing effect is called intersymbol interference (ISI).
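Equations (3.3) through (3.5) can be collected into a small calculator. The sketch below is illustrative; the function names are assumptions, and the 8-b, 8 kHz figures are hypothetical voice-grade values chosen only to exercise the formulas.

```python
def pcm_bit_rate(n_bits, fs):
    """Eq. (3.3): R = n * fs, in bits per second when fs is in hertz."""
    return n_bits * fs


def pcm_min_bandwidth(n_bits, fs):
    """Eq. (3.4) with equality: R/2, reached with (sin x)/x pulse shapes."""
    return 0.5 * pcm_bit_rate(n_bits, fs)


def pcm_first_null_bandwidth(n_bits, fs):
    """Eq. (3.5): R, for rectangular pulses with polar NRZ line coding."""
    return pcm_bit_rate(n_bits, fs)


# Hypothetical example: 8-b words at fs = 8 kHz.
rate = pcm_bit_rate(8, 8000.0)
min_bw = pcm_min_bandwidth(8, 8000.0)
null_bw = pcm_first_null_bandwidth(8, 8000.0)

# Row n = 3 of Table 3.2, with bandwidth expressed in multiples of B (fs = 2B).
table_row_n3 = pcm_first_null_bandwidth(3, 2.0)
```

With fs = 2B the first null bandwidth is 2nB, which reproduces the bandwidth column of Table 3.2 (for instance 6B for n = 3).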

3.6 Effects of Noise

The analog signal that is recovered at the PCM system output is corrupted by noise. Two main effects produce this noise or distortion: 1) quantizing noise that is caused by the M-step quantizer at the PCM transmitter and 2) bit errors in the recovered PCM signal. The bit errors are caused by channel noise as well as improper channel filtering, which causes ISI. In addition, if the input analog signal is not strictly band limited, there will be some aliasing noise on the recovered analog signal [12].

TABLE 3.2 Performance of a PCM System with Uniform Quantizing and No Channel Noise

Number of Quantizer   Length of the PCM   Bandwidth of PCM Signal      Recovered Analog Signal Power-to-Quantizing
Levels Used, M        Word, n (bits)      (First Null Bandwidth)^a     Noise Power Ratio, (S/N)out (dB)
2                     1                   2B                           6.0
4                     2                   4B                           12.0
8                     3                   6B                           18.1
16                    4                   8B                           24.1
32                    5                   10B                          30.1
64                    6                   12B                          36.1
128                   7                   14B                          42.1
256                   8                   16B                          48.2
512                   9                   18B                          54.2
1,024                 10                  20B                          60.2
2,048                 11                  22B                          66.2
4,096                 12                  24B                          72.2
8,192                 13                  26B                          78.3
16,384                14                  28B                          84.3
32,768                15                  30B                          90.3
65,536                16                  32B                          96.3

^a B is the absolute bandwidth of the input analog signal.
Source: Couch, L.W. II 1997. Digital and Analog Communication Systems, 5th ed., Prentice Hall, Upper Saddle River, NJ, p. 142. With permission.

Under certain assumptions, it can be shown that the ratio of the recovered analog average signal power to the average noise power [2] is

\[
\left(\frac{S}{N}\right)_{\mathrm{out}} = \frac{M^2}{1 + 4\left(M^2 - 1\right) P_e}
\tag{3.7}
\]

where M is the number of uniformly spaced quantizer levels used in the PCM transmitter and Pe is the probability of bit error in the recovered binary PCM signal at the receiver DAC before it is converted back into an analog signal. Most practical systems are designed so that Pe is negligible. Consequently, if we assume that there are no bit errors due to channel noise (i.e., Pe = 0), the S/N due only to quantizing errors is

\[
\left(\frac{S}{N}\right)_{\mathrm{out}} = M^2
\tag{3.8}
\]

Numerical values for these S/N ratios are given in Table 3.2. To realize these S/N ratios, one critical assumption is that the peak-to-peak level of the analog waveform at the input to the PCM encoder is set to the design level of the quantizer. For example, referring to Fig. 3.2, this corresponds to the input traversing the range −V to +V volts where V = 8 V is the design level of the quantizer. Equation (3.7) was derived for waveforms with equally likely values, such as a triangle waveshape, that have a peak-to-peak value of 2V and an rms value of V/√3, where V is the design peak level of the quantizer.

From a practical viewpoint, the quantizing noise at the output of the PCM decoder can be categorized into four types depending on the operating conditions. The four types are overload noise, random noise, granular noise, and hunting noise. As discussed earlier, the level of the analog waveform at the input of the PCM encoder needs to be set so that its peak level does not exceed the design peak of V volts. If the peak input does exceed V, the recovered analog waveform at the output of the PCM system will have flat tops near the peak values. This produces overload noise. The flat tops are easily seen on an oscilloscope, and the recovered analog waveform sounds distorted since the flat topping produces unwanted harmonic components.
For example, this type of distortion can be heard on PCM telephone systems when there are high levels such as dial tones, busy signals, or off-hook warning signals.

The second type of noise, random noise, is produced by the random quantization errors in the PCM system under normal operating conditions when the input level is properly set. This type of condition is assumed in Eq. (3.8). Random noise has a white hissing sound. If the input level is not sufficiently large, the S/N will deteriorate from that given by Eq. (3.8); the quantizing noise will still remain more or less random.

If the input level is reduced further to a relatively small value with respect to the design level, the error values are not equally likely from sample to sample, and the noise has a harsh sound resembling gravel being poured into a barrel. This is called granular noise. This type of noise can be randomized (noise power decreased) by increasing the number of quantization levels and, consequently, increasing the PCM bit rate. Alternatively, granular noise can be reduced by using a nonuniform quantizer, such as the µ-law or A-law quantizers that are described in Section 3.7.

The fourth type of quantizing noise that may occur at the output of a PCM system is hunting noise. It can occur when the input analog waveform is nearly constant, including when there is no signal (i.e., zero level). For these conditions the sample values at the quantizer output (see Fig. 3.2) can oscillate between two adjacent quantization levels, causing an undesired sinusoidal type tone of frequency fs/2 at the output of the PCM system. Hunting noise can be reduced by filtering out the tone or by designing the quantizer so that there is no vertical step at the constant value of the inputs, such as at 0-V input for the no signal case. For the no signal case, the hunting noise is also called idle channel noise. Idle channel noise can be reduced by using a horizontal step at the origin of the quantizer output–input characteristic instead of a vertical step as shown in Fig. 3.2.

Recalling that M = 2^n, we may express Eq. (3.8) in decibels by taking 10 log10(·) of both sides of the equation,

\[
\left(\frac{S}{N}\right)_{\mathrm{dB}} = 6.02n + \alpha
\tag{3.9}
\]

where n is the number of bits in the PCM word and α = 0. This equation—called the 6-dB rule— points out the significant performance characteristic for PCM: an additional 6-dB improvement in S/N is obtained for each bit added to the PCM word. This is illustrated in Table 3.2. Equation (3.9) is valid for a wide variety of assumptions (such as various types of input waveshapes and quantification characteristics), although the value of α will depend on these assumptions [7]. Of course, it is assumed that there are no bit errors and that the input signal level is large enough to range over a significant number of quantizing levels. One may use Table 3.2 to examine the design requirements in a proposed PCM system. For example, high fidelity enthusiasts are turning to digital audio recording techniques. Here PCM signals are recorded instead of the analog audio signal to produce superb sound reproduction. For a dynamic range of 90 dB, it is seen that at least 15-b PCM words would be required. Furthermore, if the analog signal had a bandwidth of 20 kHz, the first null bandwidth for rectangular bit-shape PCM would be 2 × 20 kHz ×15 = 600 kHz. Consequently, video-type tape recorders are needed to record and reproduce high-quality digital audio signals. Although this type of recording technique might seem ridiculous at first, it is realized that expensive high-quality analog recording devices are hard pressed to reproduce a dynamic range of 70 dB. Thus, digital audio is one way to achieve improved performance. This is being proven in the marketplace with the popularity of the digital compact disk (CD). The CD uses a 16-b PCM word and a sampling rate of 44.1 kHz on each stereo 1999 by CRC Press LLC

c

channel [9, 10]. Reed–Solomon coding with interleaving is used to correct burst errors that occur as a result of scratches and fingerprints on the compact disk.
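The digital audio example can be reproduced with a few lines of arithmetic. The following is a minimal sketch of Eq. (3.9) with α = 0; the function names are illustrative and are not taken from the text, and the bandwidth helper assumes sampling at exactly twice the analog bandwidth, as in the example above.

```python
import math


def pcm_snr_db(n_bits, alpha_db=0.0):
    """Eq. (3.9): output S/N in dB for an n-bit PCM word."""
    return 6.02 * n_bits + alpha_db


def bits_for_dynamic_range(target_db, alpha_db=0.0):
    """Smallest word length whose S/N from Eq. (3.9) reaches target_db."""
    return math.ceil((target_db - alpha_db) / 6.02)


def first_null_bandwidth_hz(analog_bw_hz, n_bits):
    """First-null bandwidth of rectangular-pulse binary PCM.

    Assumes sampling at twice the analog bandwidth, so the bit rate
    (and hence the first null) is 2 * B * n.
    """
    return 2.0 * analog_bw_hz * n_bits


n = bits_for_dynamic_range(90.0)
print("bits needed for 90 dB:", n)
print("first-null bandwidth (Hz):", first_null_bandwidth_hz(20e3, n))
print("S/N of a 16-b word (dB):", pcm_snr_db(16))
```

With these assumptions the sketch returns 15 bits and 600 kHz, matching the figures quoted for the 90-dB, 20-kHz case.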

3.7

Nonuniform Quantizing: µ-Law and A-Law Companding

Voice analog signals are more likely to have amplitude values near zero than at the extreme peak values allowed. For example, when digitizing voice signals, if the peak value allowed is 1 V, weak passages may have voltage levels on the order of 0.1 V (20 dB down). For signals such as these with nonuniform amplitude distribution, the granular quantizing noise will be a serious problem if the step size is not reduced for amplitude values near zero and increased for extremely large values. This is called nonuniform quantizing since a variable step size is used. An example of a nonuniform quantizing characteristic is shown in Fig. 3.3. The effect of nonuniform quantizing can be obtained by first passing the analog signal through a compression (nonlinear) amplifier and then into the PCM circuit that uses a uniform quantizer. In the U.S., a µ-law type of compression characteristic is used. It is defined [11] by

\[
|w_2(t)| = \frac{\ln\left(1 + \mu |w_1(t)|\right)}{\ln(1 + \mu)} \tag{3.10}
\]

where the allowed peak values of $w_1(t)$ are ±1 (i.e., $|w_1(t)| \le 1$), and µ is a positive constant that is a parameter. This compression characteristic is shown in Fig. 3.3(b) for several values of µ, and it is noted that µ → 0 corresponds to linear amplification (uniform quantization overall). In the United States, Canada, and Japan, the telephone companies use a µ = 255 compression characteristic in their PCM systems [4]. Another compression law, used mainly in Europe, is the A-law characteristic. It is defined [1] by

\[
|w_2(t)| =
\begin{cases}
\dfrac{A |w_1(t)|}{1 + \ln A}, & 0 \le |w_1(t)| \le \dfrac{1}{A} \\[2ex]
\dfrac{1 + \ln\left(A |w_1(t)|\right)}{1 + \ln A}, & \dfrac{1}{A} \le |w_1(t)| \le 1
\end{cases} \tag{3.11}
\]

where $|w_1(t)| \le 1$ and A is a positive constant. The A-law compression characteristic is shown in Fig. 3.3(c). The typical value for A is 87.6. When compression is used at the transmitter, expansion (i.e., decompression) must be used at the receiver output to restore signal levels to their correct relative values. The expandor characteristic is the inverse of the compression characteristic, and the combination of a compressor and an expandor is called a compandor. Once again, it can be shown that the output S/N follows the 6-dB law [2]

\[
\left(\frac{S}{N}\right)_{\mathrm{dB}} = 6.02n + \alpha \tag{3.12}
\]

where for uniform quantizing

\[
\alpha = 4.77 - 20 \log\left(V / x_{\mathrm{rms}}\right) \tag{3.13}
\]

and for sufficiently large input levels² for µ-law companding

\[
\alpha \approx 4.77 - 20 \log\left[\ln(1 + \mu)\right] \tag{3.14}
\]

and for A-law companding [7]

\[
\alpha \approx 4.77 - 20 \log\left[1 + \ln A\right] \tag{3.15}
\]

Here n is the number of bits used in the PCM word, V is the peak design level of the quantizer, and $x_{\mathrm{rms}}$ is the rms value of the input analog signal. Notice that the output S/N is a function of the input level for the uniform quantizing (no companding) case but is relatively insensitive to input level for µ-law and A-law companding, as shown in Fig. 3.4. The ratio $V / x_{\mathrm{rms}}$ is called the loading factor. The input level is often set for a loading factor of 4 (12 dB) to ensure that the overload quantizing noise will be negligible. In practice this gives α = −7.3 for the case of uniform encoding as compared to α = 0, which was obtained for the ideal conditions associated with Eq. (3.8).

² See Lathi, 1998 [8] for a more complicated expression that is valid for any input level.

FIGURE 3.3: Compression characteristics (first quadrant shown). Source: Couch, L.W. II 1997. Digital and Analog Communication Systems, 5th ed., Prentice Hall, Upper Saddle River, NJ, p. 147. With permission.

FIGURE 3.4: Output S/N of 8-b PCM systems with and without companding. Source: Couch, L.W. II 1997. Digital and Analog Communication Systems, 5th ed., Prentice Hall, Upper Saddle River, NJ, p. 149. With permission.
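The compressor and expandor pair described in this section can be sketched directly from Eqs. (3.10) and (3.11). The code below is an illustration only: the function names are hypothetical, the input is assumed to be normalized to |w1| ≤ 1, and the sign of the input is carried through explicitly because the equations are written for magnitudes. The expandors are obtained by inverting the two equations algebraically.

```python
import math

MU = 255.0  # mu-law parameter quoted in the text
A = 87.6    # typical A-law parameter quoted in the text


def mu_compress(w1, mu=MU):
    """Eq. (3.10), with the sign of the input restored."""
    return math.copysign(math.log1p(mu * abs(w1)) / math.log1p(mu), w1)


def mu_expand(w2, mu=MU):
    """Inverse of Eq. (3.10)."""
    return math.copysign(math.expm1(abs(w2) * math.log1p(mu)) / mu, w2)


def a_compress(w1, a=A):
    """Eq. (3.11): linear below 1/A, logarithmic above."""
    x = abs(w1)
    denom = 1.0 + math.log(a)
    y = a * x / denom if x <= 1.0 / a else (1.0 + math.log(a * x)) / denom
    return math.copysign(y, w1)


def a_expand(w2, a=A):
    """Inverse of Eq. (3.11); the breakpoint maps to 1 / (1 + ln A)."""
    y = abs(w2)
    denom = 1.0 + math.log(a)
    x = y * denom / a if y <= 1.0 / denom else math.exp(y * denom - 1.0) / a
    return math.copysign(x, w2)


for level in (0.01, 0.1, -0.5, 1.0):
    print(level, mu_compress(level), a_compress(level))
```

A weak input such as 0.1 is mapped to more than half of full scale by either law, which is why the small-signal step size of the overall quantizer is effectively reduced.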

3.8

Example: Design of a PCM System

Assume that an analog voice-frequency signal, which occupies a band from 300 to 3400 Hz, is to be transmitted over a binary PCM system. The minimum sampling frequency would be 2 × 3.4 = 6.8 kHz. In practice the signal is oversampled, and in the U.S. a sampling frequency of 8 kHz is the standard used for voice-frequency signals in telephone communication systems. Assume that each sample value is represented by 8 b; then the bit rate of the PCM signal is

\[
R = (f_s\ \text{samples/s})(n\ \text{b/sample}) = (8\ \text{k samples/s})(8\ \text{b/sample}) = 64\ \text{kb/s} \tag{3.16}
\]

Referring to the dimensionality theorem [Eq. (3.4)], we realize that the theoretically minimum absolute bandwidth of the PCM signal is

\[
B_{\min} = \frac{1}{2} D = 32\ \text{kHz} \tag{3.17}
\]

and this is realized if the PCM waveform consists of (sin x)/x pulse shapes. If rectangular pulse shaping is used, the absolute bandwidth is infinity, and the first null bandwidth [Eq. (3.5)] is

\[
B_{\text{null}} = R = \frac{1}{T_b} = 64\ \text{kHz} \tag{3.18}
\]

That is, we require a bandwidth of 64 kHz to transmit this digital voice PCM signal where the bandwidth of the original analog voice signal was, at most, 4 kHz. Using n = 8 in Eq. (3.1), the error on the recovered analog signal is ±0.2%. Using Eqs. (3.12) and (3.13) for the case of uniform quantizing with a loading factor, $V / x_{\mathrm{rms}}$, of 10 (20 dB), we get

\[
\left(\frac{S}{N}\right)_{\mathrm{dB}} = 32.9\ \text{dB} \tag{3.19}
\]

Using Eqs. (3.12) and (3.14) for the case of µ = 255 companding, we get

\[
\left(\frac{S}{N}\right)_{\mathrm{dB}} = 38.05\ \text{dB} \tag{3.20}
\]

These results are illustrated in Fig. 3.4.
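The numbers in this design example follow from Eqs. (3.12) to (3.18) by direct substitution. The sketch below repeats the arithmetic; the variable and function names are illustrative, and the A-law value is included only for comparison under the typical A = 87.6 (it is not computed in the example above).

```python
import math


def alpha_uniform_db(loading_factor):
    """Eq. (3.13): uniform quantizing, loading factor V / x_rms."""
    return 4.77 - 20.0 * math.log10(loading_factor)


def alpha_mu_law_db(mu):
    """Eq. (3.14): mu-law companding, sufficiently large input level."""
    return 4.77 - 20.0 * math.log10(math.log(1.0 + mu))


def alpha_a_law_db(a):
    """Eq. (3.15): A-law companding."""
    return 4.77 - 20.0 * math.log10(1.0 + math.log(a))


fs_hz = 8000.0           # sampling frequency
n_bits = 8               # bits per sample
bit_rate = fs_hz * n_bits            # Eq. (3.16), b/s
b_min_hz = bit_rate / 2.0            # Eq. (3.17), (sin x)/x pulses
b_null_hz = bit_rate                 # Eq. (3.18), rectangular pulses

snr_uniform_db = 6.02 * n_bits + alpha_uniform_db(10.0)   # Eq. (3.19)
snr_mu_law_db = 6.02 * n_bits + alpha_mu_law_db(255.0)    # Eq. (3.20)
snr_a_law_db = 6.02 * n_bits + alpha_a_law_db(87.6)       # comparison only

print(bit_rate, b_min_hz, b_null_hz)
print(round(snr_uniform_db, 2), round(snr_mu_law_db, 2), round(snr_a_law_db, 2))
```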

Defining Terms

Intersymbol interference: Filtering of a digital waveform so that a pulse corresponding to 1 b will smear (stretch in width) into adjacent bit slots.
Pulse amplitude modulation: An analog signal is represented by a train of pulses where the pulse amplitudes are proportional to the analog signal amplitude.
Pulse code modulation: A serial bit stream that consists of binary words which represent quantized sample values of an analog signal.
Quantizing: Replacing a sample value with the closest allowed value.

References

[1] Cattermole, K.W., Principles of Pulse-code Modulation, American Elsevier, New York, NY, 1969.
[2] Couch, L.W., Digital and Analog Communication Systems, 5th ed., Prentice Hall, Upper Saddle River, NJ, 1997.
[3] Couch, L.W., Modern Communication Systems: Principles and Applications, Macmillan Publishing, New York, NY, 1995.
[4] Dammann, C.L., McDaniel, L.D., and Maddox, C.L., D2 Channel Bank—Multiplexing and Coding. B. S. T. J., 12(10), 1675–1700, 1972.
[5] Dorf, R.C., The Electrical Engineering Handbook, CRC Press, Inc., Boca Raton, FL, 1993.
[6] Gibson, J.D., The Communications Handbook, CRC Press, Inc., Boca Raton, FL, 1997.
[7] Jayant, N.S. and Noll, P., Digital Coding of Waveforms, Prentice Hall, Englewood Cliffs, NJ, 1984.
[8] Lathi, B.P., Modern Digital and Analog Communication Systems, 3rd ed., Oxford University Press, New York, NY, 1998.
[9] Miyaoka, S., Digital Audio is Compact and Rugged. IEEE Spectrum, 21(3), 35–39, 1984.
[10] Peek, J.B.H., Communication Aspects of the Compact Disk Digital Audio System. IEEE Comm. Mag., 23(2), 7–15, 1985.
[11] Smith, B., Instantaneous Companding of Quantized Signals. B. S. T. J., 36(5), 653–709, 1957.
[12] Spilker, J.J., Digital Communications by Satellite, Prentice Hall, Englewood Cliffs, NJ, 1977.

Further Information Many practical design situations and applications of PCM transmission via twisted-pair T-1 telephone lines, fiber optic cable, microwave relay, and satellite systems are given in [2] and [3].

Honig, M.L. & Barton, M. “Baseband Signalling and Pulse Shaping,” Mobile Communications Handbook, Ed. Suthan S. Suthersan, Boca Raton: CRC Press LLC, 1999

Baseband Signalling and Pulse Shaping

Michael L. Honig, Northwestern University
Melbourne Barton, Bellcore

4.1 Communications System Model
4.2 Intersymbol Interference and the Nyquist Criterion
    Raised Cosine Pulse
4.3 Nyquist Criterion with Matched Filtering
4.4 Eye Diagrams
    Vertical Eye Opening • Horizontal Eye Opening • Slope of the Inner Eye
4.5 Partial-Response Signalling
    Precoding
4.6 Additional Considerations
    Average Transmitted Power and Spectral Constraints • Peak-to-Average Power • Channel and Receiver Characteristics • Complexity • Tolerance to Interference • Probability of Intercept and Detection
4.7 Examples
    Global System for Mobile Communications (GSM) • U.S. Digital Cellular (IS-136) • Interim Standard-95 • Personal Access Communications System (PACS)
Defining Terms
References
Further Information

Many physical communications channels, such as radio channels, accept a continuous-time waveform as input. Consequently, a sequence of source bits, representing data or a digitized analog signal, must be converted to a continuous-time waveform at the transmitter. In general, each successive group of bits taken from this sequence is mapped to a particular continuous-time pulse. In this chapter we discuss the basic principles involved in selecting such a pulse for channels that can be characterized as linear and time invariant with finite bandwidth.

4.1

Communications System Model

Figure 4.1a shows a simple block diagram of a communications system. The sequence of source bits $\{b_i\}$ is grouped into sequential blocks (vectors) of m bits $\{\mathbf{b}_i\}$, and each binary vector $\mathbf{b}_i$ is mapped to one of $2^m$ pulses, $p(\mathbf{b}_i; t)$, which is transmitted over the channel. The transmitted signal as a function of time can be written as

\[
s(t) = \sum_i p(\mathbf{b}_i; t - iT) \tag{4.1}
\]

where 1/T is the rate at which each group of m bits, or pulses, is introduced to the channel. The information (bit) rate is therefore m/T.

Figure 4.1a Communication system model. The source bits are grouped into binary vectors, which are mapped to a sequence of pulse shapes.

Figure 4.1b Channel model consisting of a linear, time-invariant system (transfer function) followed by additive noise.

The channel in Fig. 4.1a can be a radio link, which may distort the input signal s(t) in a variety of ways. For example, it may introduce pulse dispersion (due to finite bandwidth) and multipath, as well as additive background noise. The output of the channel is denoted as x(t), which is processed by the receiver to determine estimates of the source bits. The receiver can be quite complicated; however, for the purpose of this discussion, it is sufficient to assume only that it contains a front-end filter and a sampler, as shown in Fig. 4.1a. This assumption is valid for a wide variety of detection strategies. The purpose of the receiver filter is to remove noise outside of the transmitted frequency band and to compensate for the channel frequency response. A commonly used channel model is shown in Fig. 4.1b and consists of a linear, time-invariant filter, denoted as G(f), followed by additive noise n(t). The channel output is, therefore,

\[
x(t) = [g(t) * s(t)] + n(t) \tag{4.2}
\]

where g(t) is the channel impulse response associated with G(f), and the asterisk denotes convolution,

\[
g(t) * s(t) = \int_{-\infty}^{\infty} g(t - \tau)\, s(\tau)\, d\tau
\]

This channel model accounts for all linear, time-invariant channel impairments, such as finite bandwidth and time-invariant multipath. It does not account for time-varying impairments, such as rapid fading due to time-varying multipath. Nevertheless, this model can be considered valid over short time periods during which the multipath parameters remain constant. In Figs. 4.1a and 4.1b, it is assumed that all signals are baseband signals, which means that the frequency content is centered around f = 0 (DC). The channel passband, therefore, partially coincides with the transmitted spectrum. In general, this condition requires that the transmitted signal be modulated by an appropriate carrier frequency and demodulated at the receiver. In that case, the model in Figs. 4.1a and 4.1b still applies; however, baseband-equivalent signals must be derived from their modulated (passband) counterparts. Baseband signalling and pulse shaping refers to the way in which a group of source bits is mapped to a baseband transmitted pulse. As a simple example of baseband signalling, we can take m = 1 (map each source bit to a pulse), assign a 0 bit to a pulse p(t), and a 1 bit to the pulse −p(t). Perhaps the simplest example of a baseband pulse is the rectangular pulse given by p(t) = 1, 0 < t ≤ T, and p(t) = 0 elsewhere. In this case, we can write the transmitted signal as

\[
s(t) = \sum_i A_i\, p(t - iT) \tag{4.3}
\]

where each symbol $A_i$ takes on a value of +1 or −1, depending on the value of the ith bit, and 1/T is the symbol rate, namely, the rate at which the symbols $A_i$ are introduced to the channel. The preceding example is called binary pulse amplitude modulation (PAM), since the data symbols $A_i$ are binary valued, and they amplitude modulate the transmitted pulse p(t). The information rate (bits per second) in this case is the same as the symbol rate 1/T. As a simple extension of this signalling technique, we can increase m and choose $A_i$ from one of $M = 2^m$ values to transmit at bit rate m/T. This is known as M-ary PAM. For example, letting m = 2, each pair of bits can be mapped to a pulse in the set {p(t), −p(t), 3p(t), −3p(t)}. In general, the transmitted symbols $\{A_i\}$, the baseband pulse p(t), and channel impulse response g(t) can be complex valued. For example, each successive pair of bits might select a symbol from the set {1, −1, j, −j}, where $j = \sqrt{-1}$. This is a consequence of considering the baseband equivalent of passband modulation (that is, generating a transmitted spectrum which is centered around a carrier frequency $f_c$). Here we are not concerned with the relation between the passband and baseband equivalent models and simply point out that the discussion and results in this chapter apply to complex-valued symbols and pulse shapes. As an example of a signalling technique which is not PAM, let m = 1 and

\[
p(0; t) =
\begin{cases}
\sqrt{2}\, \sin(2\pi f_1 t), & 0 < t < T \\
0, & \text{elsewhere}
\end{cases}
\qquad
p(1; t) =
\begin{cases}
\sqrt{2}\, \sin(2\pi f_2 t), & 0 < t < T \\
0, & \text{elsewhere}
\end{cases} \tag{4.4}
\]

where $f_1$ and $f_2 \ne f_1$ are fixed frequencies selected so that $f_1 T$ and $f_2 T$ (number of cycles for each bit) are multiples of 1/2. These pulses are orthogonal, namely,

\[
\int_0^T p(1; t)\, p(0; t)\, dt = 0
\]

This choice of pulse shapes is called binary frequency-shift keying (FSK). Another example of a set of orthogonal pulse shapes for m = 2 bits/T is shown in Fig. 4.2. Because these pulses may have as many as three transitions within a symbol period, the transmitted spectrum occupies roughly four times the transmitted spectrum of binary PAM with a rectangular pulse shape. The spectrum is, therefore, spread across a much larger band than the smallest required for reliable transmission, assuming a data rate of 2/T. This type of signalling is referred to as spread-spectrum.

FIGURE 4.2: Four orthogonal spread-spectrum pulse shapes.

Spread-spectrum signals are more robust with respect to interference from other transmitted signals than are narrowband signals.¹
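The orthogonality of the binary FSK pulses in Eq. (4.4) can be checked numerically. The sketch below approximates the inner product with a midpoint rule; the symbol period and the particular frequencies are arbitrary choices made for illustration, with f1·T = 1 and f2·T = 1.5 satisfying the multiple-of-1/2 condition and f·T = 1.25 deliberately violating it.

```python
import math


def fsk_pulse(freq_hz, t, T):
    """One of the pulses in Eq. (4.4): sqrt(2) sin(2 pi f t) on 0 < t < T."""
    if 0.0 < t < T:
        return math.sqrt(2.0) * math.sin(2.0 * math.pi * freq_hz * t)
    return 0.0


def inner_product(f_a, f_b, T, steps=20000):
    """Midpoint-rule approximation of the integral of p_a(t) p_b(t) over [0, T]."""
    dt = T / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        total += fsk_pulse(f_a, t, T) * fsk_pulse(f_b, t, T)
    return total * dt


T = 1e-3                 # hypothetical symbol period (s)
f1 = 1.0 / T             # f1 * T = 1
f2 = 1.5 / T             # f2 * T = 1.5
f_bad = 1.25 / T         # f * T = 1.25, not a multiple of 1/2

print("cross term, valid spacing  :", inner_product(f1, f2, T))
print("cross term, invalid spacing:", inner_product(f1, f_bad, T))
print("pulse energy               :", inner_product(f1, f1, T))
```

With the valid spacing the cross term vanishes to within rounding, whereas the quarter-cycle spacing leaves a residual comparable to the pulse energy.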

4.2

Intersymbol Interference and the Nyquist Criterion

Consider the transmission of a PAM signal illustrated in Fig. 4.3. The source bits $\{b_i\}$ are mapped to a sequence of levels $\{A_i\}$, which modulate the transmitter pulse p(t). The channel input is, therefore, given by Eq. (4.3) where p(t) is the impulse response of the transmitter pulse-shaping filter P(f) shown in Fig. 4.3. The input to the transmitter filter P(f) is the modulated sequence of delta functions $\sum_i A_i \delta(t - iT)$. The channel is represented by the transfer function G(f) (plus noise), which has impulse response g(t), and the receiver filter has transfer function R(f) with associated impulse response r(t).

¹ This example can also be viewed as coded binary PAM. Namely, each pair of source bits is mapped to 4 coded bits, which are transmitted via binary PAM with a rectangular pulse. The current IS-95 air interface uses an extension of this signalling method in which groups of 6 bits are mapped to 64 orthogonal pulse shapes with as many as 63 transitions during a symbol.

FIGURE 4.3: Baseband model of a pulse amplitude modulation system.

Let h(t) be the overall impulse response of the combined transmitter, channel, and receiver, which has transfer function H(f) = P(f)G(f)R(f). We can write h(t) = p(t) ∗ g(t) ∗ r(t). The output of the receiver filter is then

\[
y(t) = \sum_i A_i\, h(t - iT) + \tilde{n}(t) \tag{4.5}
\]

where $\tilde{n}(t) = r(t) * n(t)$ is the output of the filter R(f) with input n(t). Assuming that samples are collected at the output of the filter R(f) at the symbol rate 1/T, we can write the kth sample of y(t) as

\[
\begin{aligned}
y(kT) &= \sum_i A_i\, h(kT - iT) + \tilde{n}(kT) \\
      &= A_k h(0) + \sum_{i \ne k} A_i\, h(kT - iT) + \tilde{n}(kT)
\end{aligned} \tag{4.6}
\]

The first term on the right-hand side of Eq. (4.6) is the kth transmitted symbol scaled by the system impulse response at t = 0. If this were the only term on the right side of Eq. (4.6), we could obtain the source bits without error by scaling the received samples by 1/h(0). The second term on the right-hand side of Eq. (4.6) is called intersymbol interference, which reflects the view that neighboring symbols interfere with the detection of each desired symbol. One possible criterion for choosing the transmitter and receiver filters is to minimize intersymbol interference. Specifically, if we choose p(t) and r(t) so that

\[
h(kT) =
\begin{cases}
1, & k = 0 \\
0, & k \ne 0
\end{cases} \tag{4.7}
\]

then the kth received sample is

\[
y(kT) = A_k + \tilde{n}(kT) \tag{4.8}
\]

In this case, the intersymbol interference has been eliminated. This choice of p(t) and r(t) is called a zero-forcing solution, since it forces the intersymbol interference to zero. Depending on the type of detection scheme used, a zero-forcing solution may not be desirable. This is because the probability of error also depends on the noise intensity, which generally increases when intersymbol interference is suppressed. It is instructive, however, to examine the properties of the zero-forcing solution. We now view Eq. (4.7) in the frequency domain. Since h(t) has Fourier transform

\[
H(f) = P(f)\, G(f)\, R(f) \tag{4.9}
\]

where P(f) is the Fourier transform of p(t), the bandwidth of H(f) is limited by the bandwidth of the channel G(f). We will assume that G(f) = 0, |f| > W. The sampled impulse response h(kT) can, therefore, be written as the inverse Fourier transform

\[
h(kT) = \int_{-W}^{W} H(f)\, e^{j 2\pi f kT}\, df
\]

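Through a series of manipulations, this integral can be rewritten as an inverse discrete Fourier transform,

\[
h(kT) = T \int_{-1/(2T)}^{1/(2T)} H_{\mathrm{eq}}\!\left(e^{j 2\pi f T}\right) e^{j 2\pi f kT}\, df \tag{4.10a}
\]

where

\[
\begin{aligned}
H_{\mathrm{eq}}\!\left(e^{j 2\pi f T}\right)
  &= \frac{1}{T} \sum_k H\!\left(f + \frac{k}{T}\right) \\
  &= \frac{1}{T} \sum_k P\!\left(f + \frac{k}{T}\right) G\!\left(f + \frac{k}{T}\right) R\!\left(f + \frac{k}{T}\right)
\end{aligned} \tag{4.10b}
\]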

This relation states that $H_{\mathrm{eq}}(z)$, $z = e^{j 2\pi f T}$, is the discrete Fourier transform of the sequence $\{h_k\}$, where $h_k = h(kT)$. Sampling the impulse response h(t) therefore changes the transfer function H(f) to the aliased frequency response $H_{\mathrm{eq}}(e^{j 2\pi f T})$. From Eqs. (4.10a), (4.10b), and (4.6) we conclude that $H_{\mathrm{eq}}(z)$ is the transfer function that relates the sequence of input data symbols $\{A_i\}$ to the sequence of received samples $\{y_i\}$, where $y_i = y(iT)$, in the absence of noise. This is illustrated in Fig. 4.4. For this reason, $H_{\mathrm{eq}}(z)$ is called the equivalent discrete-time transfer function for the overall system transfer function H(f).
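The equivalent discrete-time view of Eq. (4.6) amounts to a convolution of the symbol sequence with the sampled pulse. The following minimal sketch, written for the noiseless case, separates each received sample into its desired term and its intersymbol interference term; the tap values and function names are hypothetical and chosen only for illustration.

```python
def received_samples(symbols, h_taps, center):
    """Noiseless Eq. (4.6): y_k = sum_i A_i h((k - i)T).

    h_taps[center + m] holds h(mT), so h_taps[center] is h(0).
    """
    samples = []
    for k in range(len(symbols)):
        total = 0.0
        for i, a in enumerate(symbols):
            idx = k - i + center
            if 0 <= idx < len(h_taps):
                total += a * h_taps[idx]
        samples.append(total)
    return samples


def isi_terms(symbols, h_taps, center):
    """Second term of Eq. (4.6): everything except A_k h(0)."""
    y = received_samples(symbols, h_taps, center)
    return [yk - a * h_taps[center] for yk, a in zip(y, symbols)]


symbols = [1.0, -1.0, 1.0, 1.0]
dispersive = [0.1, 1.0, 0.2]     # hypothetical h(-T), h(0), h(T)
zero_forced = [0.0, 1.0, 0.0]    # satisfies Eq. (4.7)

print(received_samples(symbols, dispersive, center=1))
print(isi_terms(symbols, dispersive, center=1))
print(received_samples(symbols, zero_forced, center=1))
```

When the taps satisfy Eq. (4.7), the received samples equal the transmitted symbols, as in Eq. (4.8) without noise.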

FIGURE 4.4: Equivalent discrete-time channel for the PAM system shown in Fig. 4.3 [$y_i = y(iT)$, $\tilde{n}_i = \tilde{n}(iT)$].

Since $H_{\mathrm{eq}}(e^{j 2\pi f T})$ is the discrete Fourier transform of the sequence $\{h_k\}$, the time-domain, or sequence condition (4.7) is equivalent to the frequency-domain condition

\[
H_{\mathrm{eq}}\!\left(e^{j 2\pi f T}\right) = 1 \tag{4.11}
\]

This relation is called the Nyquist criterion. From Eqs. (4.10b) and (4.11) we make the following observations.

1. To satisfy the Nyquist criterion, the channel bandwidth W must be at least 1/(2T). Otherwise, G(f + n/T) = 0 for f in some interval of positive length for all n, which implies that $H_{\mathrm{eq}}(e^{j 2\pi f T}) = 0$ for f in the same interval.

2. For the minimum bandwidth W = 1/(2T), Eqs. (4.10b) and (4.11) imply that H(f) = T for |f| < 1/(2T) and H(f) = 0 elsewhere. This implies that the system impulse response is given by

\[
h(t) = \frac{\sin(\pi t / T)}{\pi t / T} \tag{4.12}
\]

(Since $\int_{-\infty}^{\infty} h^2(t)\, dt = T$, the transmitted signal $s(t) = \sum_i A_i h(t - iT)$ has power equal to the symbol variance $E[|A_i|^2]$.) The impulse response in Eq. (4.12) is called a minimum bandwidth or Nyquist pulse. The frequency band [−1/(2T), 1/(2T)] [i.e., the passband of H(f)] is called the Nyquist band.

3. Suppose that the channel is bandlimited to twice the Nyquist bandwidth. That is, G(f) = 0 for |f| > 1/T. The condition (4.11) then becomes

\[
H(f) + H\!\left(f - \frac{1}{T}\right) + H\!\left(f + \frac{1}{T}\right) = T \tag{4.13}
\]

Assume for the moment that H(f) and h(t) are both real valued, so that H(f) is an even function of f [H(f) = H(−f)]. This is the case when the receiver filter is the matched filter (see Section 4.3). We can then rewrite Eq. (4.13) as

\[
H(f) + H\!\left(\frac{1}{T} - f\right) = T, \qquad 0 < f < \frac{1}{2T} \tag{4.14}
\]

which states that H(f) must have odd symmetry about f = 1/(2T). This is illustrated in Fig. 4.5, which shows two different transfer functions H(f) that satisfy the Nyquist criterion.

4. The pulse shape p(t) enters into Eq. (4.11) only through the product P(f)R(f). Consequently, either P(f) or R(f) can be fixed, and the other filter can be adjusted or adapted to the particular channel. Typically, the pulse shape p(t) is fixed, and the receiver filter is adapted to the (possibly time-varying) channel.

FIGURE 4.5: Two examples of frequency responses that satisfy the Nyquist criterion.
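The aliasing sum of Eq. (4.10b) and the observations above can be illustrated numerically. The sketch below uses a hypothetical trapezoidal response (flat to (1 − α)/(2T), then a linear rolloff to zero at (1 + α)/(2T)), which has the odd symmetry about 1/(2T) required by Eq. (4.14), and a brick-wall response narrower than 1/(2T), which cannot satisfy the criterion. Neither response is taken from the text; they are assumptions chosen to keep the example short.

```python
def trapezoid_spectrum(f, T, alpha):
    """Hypothetical H(f): equal to T in the flat region, linear rolloff outside."""
    lo = (1.0 - alpha) / (2.0 * T)
    hi = (1.0 + alpha) / (2.0 * T)
    af = abs(f)
    if af <= lo:
        return T
    if af >= hi:
        return 0.0
    return T * (hi - af) / (hi - lo)


def brickwall_spectrum(f, T, W):
    """Hypothetical H(f): equal to T for |f| < W and zero elsewhere."""
    return T if abs(f) < W else 0.0


def folded_response(H, f, T, k_max=2):
    """Eq. (4.10b): (1/T) * sum over k of H(f + k/T), truncated to |k| <= k_max."""
    return sum(H(f + k / T) for k in range(-k_max, k_max + 1)) / T


T = 1.0


def h_trapezoid(f):
    return trapezoid_spectrum(f, T, alpha=0.5)


def h_narrow(f):
    return brickwall_spectrum(f, T, W=0.4 / T)


freqs = [i / (40.0 * T) for i in range(-19, 20)]   # points inside the Nyquist band
folded = [folded_response(h_trapezoid, f, T) for f in freqs]
print("min/max of folded trapezoid:", min(folded), max(folded))
print("folded narrow channel at 0.45/T:", folded_response(h_narrow, 0.45 / T, T))
```

The folded trapezoid is flat at 1 across the Nyquist band, so Eq. (4.11) holds, while the narrow channel folds to zero near the band edge, as stated in the first observation.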

4.2.1

Raised Cosine Pulse

Suppose that the channel is ideal with transfer function

\[
G(f) =
\begin{cases}
1, & |f| < W \\
0, & |f| > W
\end{cases} \tag{4.15}
\]

To maximize bandwidth efficiency, Nyquist pulses given by Eq. (4.12) should be used where W = 1/(2T). This type of signalling, however, has two major drawbacks. First, Nyquist pulses are noncausal and of infinite duration. They can be approximated in practice by introducing an appropriate delay, and truncating the pulse. The pulse, however, decays very slowly, namely, as 1/t, so that the truncation window must be wide. This is equivalent to observing that the ideal bandlimited frequency response given by Eq. (4.15) is difficult to approximate closely. The second drawback, which is more important, is the fact that this type of signalling is not robust with respect to sampling jitter. Namely, a small sampling offset ε produces the output sample

\[
y(kT + \varepsilon) = \sum_i A_i\, \frac{\sin[\pi(k - i + \varepsilon/T)]}{\pi(k - i + \varepsilon/T)} \tag{4.16}
\]

Since the Nyquist pulse decays as 1/t, this sum is not guaranteed to converge. A particular choice of symbols $\{A_i\}$ can, therefore, lead to very large intersymbol interference, no matter how small the offset. Minimum bandwidth signalling is therefore impractical. The preceding problem is generally solved in one of two ways in practice:

1. The pulse bandwidth is increased to provide a faster pulse decay than 1/t.
2. A controlled amount of intersymbol interference is introduced at the transmitter, which can be subtracted out at the receiver.

The former approach sacrifices bandwidth efficiency, whereas the latter approach sacrifices power efficiency. We will examine the latter approach in Section 4.5. The most common example of a pulse, which illustrates the first technique, is the raised cosine pulse, given by

\[
h(t) = \left[\frac{\sin(\pi t / T)}{\pi t / T}\right] \left[\frac{\cos(\alpha \pi t / T)}{1 - (2 \alpha t / T)^2}\right] \tag{4.17}
\]

which has Fourier transform

\[
H(f) =
\begin{cases}
T, & 0 \le |f| \le \dfrac{1 - \alpha}{2T} \\[2ex]
\dfrac{T}{2} \left\{ 1 + \cos\!\left[ \dfrac{\pi T}{\alpha} \left( |f| - \dfrac{1 - \alpha}{2T} \right) \right] \right\}, & \dfrac{1 - \alpha}{2T} \le |f| \le \dfrac{1 + \alpha}{2T} \\[2ex]
0, & |f| > \dfrac{1 + \alpha}{2T}
\end{cases} \tag{4.18}
\]

where 0 ≤ α ≤ 1. Plots of p(t) and P(f) are shown in Figs. 4.6a and 4.6b for different values of α. It is easily verified that h(t) satisfies the Nyquist criterion (4.7) and, consequently, H(f) satisfies Eq. (4.11). When α = 0, H(f) is the Nyquist pulse with minimum bandwidth 1/(2T), and when α > 0, H(f) has bandwidth (1 + α)/(2T) with a raised cosine rolloff. The parameter α, therefore, represents the additional, or excess bandwidth as a fraction of the minimum bandwidth 1/(2T). For example, when α = 1, we say that the pulse is a raised cosine pulse with 100% excess bandwidth. This is because the pulse bandwidth 1/T is twice the minimum bandwidth. Because the raised cosine pulse decays as $1/t^3$, performance is robust with respect to sampling offsets. The raised cosine frequency response (4.18) applies to the combination of transmitter, channel, and receiver. If the transmitted pulse shape p(t) is a raised cosine pulse, then h(t) is a raised cosine pulse only if the combined receiver and channel frequency response is constant. Even with an ideal (transparent) channel, however, the optimum (matched) receiver filter response is generally not constant in the presence of additive Gaussian noise. An alternative is to transmit the square-root raised cosine pulse shape, which has frequency response P(f) given by the square-root of the raised cosine frequency response in Eq. (4.18). Assuming an ideal channel, setting the receiver frequency response R(f) = P(f) then results in an overall raised cosine system response H(f).
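Both claims in this subsection, that the raised cosine pulse satisfies Eq. (4.7) and that it is far less sensitive to a sampling offset than the Nyquist pulse, can be examined numerically. The following sketch evaluates Eq. (4.17) and sums the magnitudes of the interfering terms of Eq. (4.16) for binary symbols chosen adversarially. The offset of 0.05T, the truncation lengths, and the function names are assumptions made for illustration.

```python
import math


def raised_cosine_pulse(t, T, alpha):
    """Eq. (4.17); alpha = 0 reduces to the Nyquist pulse of Eq. (4.12)."""
    x = t / T
    sinc = 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)
    denom = 1.0 - (2.0 * alpha * x) ** 2
    if abs(denom) < 1e-9:
        # Removable singularity at |t| = T / (2 alpha): the second factor
        # of Eq. (4.17) tends to pi / 4 there.
        return sinc * math.pi / 4.0
    return sinc * math.cos(alpha * math.pi * x) / denom


def worst_case_isi(T, alpha, offset, n_symbols):
    """Sum of |h(mT + offset)| over 0 < |m| <= n_symbols.

    For symbols of +/-1 this is the largest interference that any data
    pattern of that length can produce at the offset sampling instant.
    """
    return sum(abs(raised_cosine_pulse(m * T + offset, T, alpha))
               for m in range(-n_symbols, n_symbols + 1) if m != 0)


T = 1.0
offset = 0.05 * T
for alpha in (0.0, 0.5):
    for n in (100, 1000):
        print(alpha, n, worst_case_isi(T, alpha, offset, n))
```

For α = 0 the worst-case sum keeps growing as more symbols are included, in line with the 1/t decay, whereas for α = 0.5 it has essentially converged after a few symbols.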

4.3

Nyquist Criterion with Matched Filtering

Figure 4.6a Raised cosine pulse.

Figure 4.6b Raised cosine spectrum.

Consider the transmission of an isolated pulse $A_0 \delta(t)$. In this case the input to the receiver in Fig. 4.3 is

\[
x(t) = A_0\, \tilde{g}(t) + n(t) \tag{4.19}
\]

where $\tilde{g}(t)$ is the inverse Fourier transform of the combined transmitter-channel transfer function $\tilde{G}(f) = P(f) G(f)$. We will assume that the noise n(t) is white with spectrum $N_0/2$. The output of the receiver filter is then

\[
y(t) = r(t) * x(t) = A_0 [r(t) * \tilde{g}(t)] + [r(t) * n(t)] \tag{4.20}
\]

The first term on the right-hand side is the desired signal, and the second term is noise. Assuming that y(t) is sampled at t = 0, the ratio of signal energy to noise energy, or signal-to-noise ratio (SNR) at the sampling instant, is

\[
\mathrm{SNR} = \frac{E\!\left[|A_0|^2\right] \left| \displaystyle\int_{-\infty}^{\infty} r(-t)\, \tilde{g}(t)\, dt \right|^2}{\dfrac{N_0}{2} \displaystyle\int_{-\infty}^{\infty} |r(t)|^2\, dt} \tag{4.21}
\]

The receiver impulse response that maximizes this expression is $r(t) = \tilde{g}^*(-t)$ [the complex conjugate of $\tilde{g}(-t)$], which is known as the matched filter impulse response. The associated transfer function is $R(f) = \tilde{G}^*(f)$.

Choosing the receiver filter to be the matched filter is optimal in more general situations, such as when detecting a sequence of channel symbols with intersymbol interference (assuming the additive noise is Gaussian). We, therefore, reconsider the Nyquist criterion when the receiver filter is the matched filter. In this case, the baseband model is shown in Fig. 4.7, and the output of the receiver filter is given by

\[
y(t) = \sum_i A_i\, h(t - iT) + \tilde{n}(t) \tag{4.22}
\]

where the baseband pulse h(t) is now the impulse response of the filter with transfer function $|\tilde{G}(f)|^2 = |P(f) G(f)|^2$. This impulse response is the autocorrelation of the impulse response of the combined transmitter-channel filter $\tilde{G}(f)$,

\[
h(t) = \int_{-\infty}^{\infty} \tilde{g}^*(s)\, \tilde{g}(s + t)\, ds \tag{4.23}
\]

FIGURE 4.7: Baseband PAM model with a matched filter at the receiver.

With a matched filter at the receiver, the equivalent discrete-time transfer function is

\[
\begin{aligned}
H_{\mathrm{eq}}\!\left(e^{j 2\pi f T}\right)
  &= \frac{1}{T} \sum_k \left| \tilde{G}\!\left(f - \frac{k}{T}\right) \right|^2 \\
  &= \frac{1}{T} \sum_k \left| P\!\left(f - \frac{k}{T}\right) G\!\left(f - \frac{k}{T}\right) \right|^2
\end{aligned} \tag{4.24}
\]

which relates the sequence of transmitted symbols $\{A_k\}$ to the sequence of received samples $\{y_k\}$ in the absence of noise. Note that $H_{\mathrm{eq}}(e^{j 2\pi f T})$ is positive, real valued, and an even function of f. If the channel is bandlimited to twice the Nyquist bandwidth, then H(f) = 0 for |f| > 1/T, and the Nyquist condition is given by Eq. (4.14) where $H(f) = |G(f) P(f)|^2$. The aliasing sum in Eq. (4.10b) can therefore be described as a folding operation in which the channel response $|H(f)|^2$ is folded around the Nyquist frequency 1/(2T). For this reason, $H_{\mathrm{eq}}(e^{j 2\pi f T})$ with a matched receiver filter is often referred to as the folded channel spectrum.
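The claim that the matched filter maximizes Eq. (4.21) can be illustrated with a discrete approximation of the two integrals. In the sketch below the combined transmitter-channel pulse is a hypothetical decaying exponential, and the two competing receiver filters (an integrate-and-dump filter and a truncated copy of the matched filter) are arbitrary alternatives chosen for comparison; none of these choices comes from the text.

```python
import math


def sampled_snr(reversed_filter, pulse, dt, symbol_energy=1.0, n0=1.0):
    """Riemann-sum approximation of Eq. (4.21).

    reversed_filter[n] stands for r(-n * dt) and pulse[n] for g~(n * dt).
    """
    corr = sum(q * g for q, g in zip(reversed_filter, pulse)) * dt
    noise = (n0 / 2.0) * sum(abs(q) ** 2 for q in reversed_filter) * dt
    return symbol_energy * abs(corr) ** 2 / noise


dt = 0.01
pulse = [math.exp(-n * dt) for n in range(500)]     # hypothetical g~(t) = exp(-t), 0 <= t < 5

matched = [g.conjugate() for g in pulse]            # r(-t) = conj(g~(t))
integrate_and_dump = [1.0] * len(pulse)             # flat filter over the same window
truncated = [g if n < 100 else 0.0 for n, g in enumerate(pulse)]

pulse_energy = sum(abs(g) ** 2 for g in pulse) * dt

snr_matched = sampled_snr(matched, pulse, dt)
snr_dump = sampled_snr(integrate_and_dump, pulse, dt)
snr_truncated = sampled_snr(truncated, pulse, dt)

print("matched           :", snr_matched)
print("integrate-and-dump:", snr_dump)
print("truncated matched :", snr_truncated)
print("2 * Eg / N0       :", 2.0 * pulse_energy)
```

With the matched filter the ratio reduces to the pulse energy divided by N0/2, and by the Cauchy–Schwarz inequality no other filter can exceed it.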

c

4.4

Eye Diagrams

One way to assess the severity of distortion due to intersymbol interference in a digital communications system is to examine the eye diagram. The eye diagram is illustrated in Figs. 4.8a and 4.8b, for a raised cosine pulse shape with 25% excess bandwidth and an ideal bandlimited channel. Figure 4.8a shows the data signal at the receiver X Ai h(t − iT ) + n(t)ψ ˜ (4.25) y(t) = i

where h(t) is given by Eq. (4.17), α = 1/4, each symbol Ai is independently chosen from the set {±1, ±3}, where each symbol is equally likely, and n(t) ˜ is bandlimited white Gaussian noise. (The received SNR is 30 dB.) The eye diagram is constructed from the time-domain data signal y(t) as follows (assuming nominal sampling times at kT , k = 0, 1, 2, . . .): 1. Partition the waveform y(t) into successive segments of length T starting from t = T /2. 2. Translate each of these waveform segments [y(t), (k + 1/2)T ≤ t ≤ (k + 3/2)T , k = 0, 1, 2, . . .] to the interval [−T /2, T /2], and superimpose. The resulting picture is shown in Fig. 4.8b for the y(t) shown in Fig. 4.8a. (Partitioning y(t) into successive segments of length iT , i > 1, is also possible. This would result in i successive eye diagrams.) The number of eye openings is one less than the number of transmitted signal levels. In practice, the eye diagram is easily viewed on an oscilloscope by applying the received waveform y(t) to the vertical deflection plates of the oscilloscope and applying a sawtooth waveform at the symbol rate 1/T to the horizontal deflection plates. This causes successive symbol intervals to be translated into one interval on the oscilloscope display. Each waveform segment y(t), (k+1/2)T ≤ t ≤ (k+3/2)T , depends on the particular sequence of channel symbols surrounding Ak . The number of channel symbols that affects a particular waveform segment depends on the extent of the intersymbol interference, shown in Eq. (4.6). This, in turn, depends on the duration of the impulse response h(t). For example, if h(t) has most of its energy in the interval 0 < t < mT , then each waveform segment depends on approximately m symbols. Assuming binary transmission, this implies that there are a total of 2m waveform segments that can be superimposed in the eye diagram. 
(It is possible that only one sequence of channel symbols causes significant intersymbol interference, and this sequence occurs with very low probability.) In current digital wireless applications the impulse response typically spans only a few symbols. The eye diagram has the following important features which measure the performance of a digital communications system.
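The two-step construction described above can be sketched numerically. The following is a minimal illustration, not the procedure used to produce Fig. 4.8: it assumes T = 1, 16 samples per symbol, a truncated raised cosine pulse in its standard closed form (assumed to agree with Eq. (4.17)), and it omits the noise term. The function names are hypothetical.

```python
import numpy as np

def raised_cosine(t, T=1.0, alpha=0.25):
    """Raised cosine pulse with excess bandwidth alpha (unit peak at t = 0)."""
    x = np.asarray(t, dtype=float) / T
    denom = 1.0 - (2.0 * alpha * x) ** 2
    # At |t| = T/(2*alpha) the closed form is 0/0; substitute the known limit.
    singular = np.isclose(denom, 0.0)
    safe = np.where(singular, 1.0, denom)
    h = np.sinc(x) * np.cos(np.pi * alpha * x) / safe
    limit = (np.pi / 4.0) * np.sinc(1.0 / (2.0 * alpha))
    return np.where(singular, limit, h)

def eye_segments(y, sps):
    """Step 1: cut y into length-T segments starting at t = T/2.
    Step 2: stack them so every row covers the interval [-T/2, T/2)."""
    start = sps // 2
    n_seg = (len(y) - start) // sps
    return y[start:start + n_seg * sps].reshape(n_seg, sps)

rng = np.random.default_rng(0)
sps, n_sym, span = 16, 200, 8            # samples/symbol, symbols, pulse half-length in T
symbols = rng.choice([-3, -1, 1, 3], size=n_sym)

t = np.arange(-span * sps, span * sps + 1) / sps
h = raised_cosine(t)

up = np.zeros(n_sym * sps)
up[::sps] = symbols                       # impulses A_i at t = iT
y = np.convolve(up, h)[span * sps: span * sps + n_sym * sps]

segs = eye_segments(y, sps)               # each row is one trace of the eye diagram
print(segs.shape)
```

Plotting every row of `segs` against a common time axis gives the superimposed picture; the middle column holds the samples taken at the nominal sampling instants.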

4.4.1 Vertical Eye Opening

The vertical openings at any time t0 , −T /2 ≤ t0 ≤ T /2, represent the separation between signal levels with worst-case intersymbol interference, assuming that y(t) is sampled at times t = kT + t0 , k = 0, 1, 2, . . . . It is possible for the intersymbol interference to be large enough so that this vertical opening between some, or all, signal levels disappears altogether. In that case, the eye is said to be closed. Otherwise, the eye is said to be open. A closed eye implies that if the estimated bits are obtained by thresholding the samples y(kT ), then the decisions will depend primarily on the intersymbol interference rather than on the desired symbol. The probability of error will, therefore, be close to 1/2. Conversely, wide vertical spacings between signal levels imply a large degree of immunity to additive noise. In general, y(t) should be sampled at the times kT + t0 , k = 0, 1, 2, . . . , where t0 is chosen to maximize the vertical eye opening.
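As a rough numerical companion to this definition, the worst-case opening can be computed from the samples of the overall impulse response at the chosen sampling phase. The sketch below assumes noise-free reception and levels ±1, ±3, . . . , ±(M − 1); it uses the standard peak-distortion bound, in which every interfering symbol takes its largest magnitude with the least favorable sign. The function name and the sample values are hypothetical.

```python
import numpy as np

def vertical_opening(h_samples, center, M=2):
    """Worst-case gap between adjacent received levels.

    h_samples : samples h(t0 + kT) of the overall impulse response
    center    : index of the desired (k = 0) sample
    M         : number of transmitted levels, taken as ±1, ±3, ..., ±(M-1)
    """
    h = np.asarray(h_samples, dtype=float)
    main = abs(h[center])
    isi = np.sum(np.abs(h)) - main          # peak ISI per unit symbol amplitude
    # Adjacent levels are 2*main apart; worst-case ISI eats (M-1)*isi from each side.
    return max(0.0, 2.0 * (main - (M - 1) * isi))

binary_gap = vertical_opening([0.1, 1.0, -0.2], center=1)
four_level_gap = vertical_opening([0.1, 1.0, -0.2], center=1, M=4)
closed = vertical_opening([0.6, 1.0, -0.6], center=1)
print(binary_gap, four_level_gap, closed)
```

A return value of zero corresponds to a closed eye; evaluating the function over a range of sampling phases t0 and taking the maximum mirrors the choice of t0 described above.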

FIGURE 4.8a: Received signal y(t).

FIGURE 4.8b: Eye diagram for received signal shown in Fig. 4.8a.

4.4.2 Horizontal Eye Opening

The width of each opening indicates the sensitivity to timing offset. Specifically, a very narrow eye opening indicates that a small timing offset will result in sampling where the eye is closed. Conversely, a wide horizontal opening indicates that a large timing offset can be tolerated, although the error probability will depend on the vertical opening.


4.4.3 Slope of the Inner Eye

The slope of the inner eye indicates sensitivity to timing jitter or variance in the timing offset. Specifically, a very steep slope means that the eye closes rapidly as the timing offset increases. In this case, a significant amount of jitter in the sampling times significantly increases the probability of error.

The shape of the eye diagram is determined by the pulse shape. In general, the faster the baseband pulse decays, the wider the eye opening. For example, a rectangular pulse produces a box-shaped eye diagram (assuming binary signalling). The minimum bandwidth pulse shape Eq. (4.12) produces an eye diagram which is closed for all t except for t = 0. This is because, as shown earlier, an arbitrarily small timing offset can lead to an intersymbol interference term that is arbitrarily large, depending on the data sequence.
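The last remark can be checked numerically. The sketch below assumes that the minimum bandwidth pulse of Eq. (4.12) is the sinc pulse, binary symbols ±1, and T = 1; it sums the magnitudes of the interfering pulse samples for a fixed timing offset and shows that the total keeps growing as more neighboring symbols are included. The function name is hypothetical.

```python
import numpy as np

def worst_case_isi(eps, n_side):
    """Sum of |sinc(k + eps)| over k = ±1, ..., ±n_side.

    eps is the timing offset as a fraction of T; the sum is the ISI produced
    by the least favorable ±1 data sequence of that length.
    """
    k = np.arange(1, n_side + 1)
    return np.sum(np.abs(np.sinc(k + eps))) + np.sum(np.abs(np.sinc(-k + eps)))

eps = 0.1                                   # 10% timing offset
for n in (10, 100, 1000):
    print(n, worst_case_isi(eps, n), "desired sample:", np.sinc(eps))
```

With no offset the interfering samples vanish, while for any nonzero offset the sum grows roughly logarithmically with the number of symbols and eventually exceeds the desired sample, which is why the eye is closed away from t = 0.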

4.5 Partial-Response Signalling

To avoid the problems associated with Nyquist signalling over an ideal bandlimited channel, bandwidth and/or power efficiency must be compromised. Raised cosine pulses compromise bandwidth efficiency to gain robustness with respect to timing errors. Another possibility is to introduce a controlled amount of intersymbol interference at the transmitter, which can be removed at the receiver. This approach is called partial-response (PR) signalling. The terminology reflects the fact that the sampled system impulse response does not have the full response given by the Nyquist condition Eq. (4.7). To illustrate PR signalling, suppose that the Nyquist condition Eq. (4.7) is replaced by the condition

    h_k = \begin{cases} 1, & k = 0, 1 \\ 0, & \text{all other } k \end{cases}    (4.26)

The kth received sample is then

    y_k = A_k + A_{k-1} + \tilde{n}_k    (4.27)

so that there is intersymbol interference from one neighboring transmitted symbol. For now we focus on the spectral characteristics of PR signalling and defer discussion of how to detect the transmitted sequence {A_k} in the presence of intersymbol interference. The equivalent discrete-time transfer function in this case is the discrete Fourier transform of the sequence in Eq. (4.26),

    H_{eq}(e^{j 2\pi f T}) = \frac{1}{T} \sum_k H\left(f + \frac{k}{T}\right) = 1 + e^{-j 2\pi f T} = 2 e^{-j \pi f T} \cos(\pi f T)    (4.28)

As in the full-response case, for Eq. (4.28) to be satisfied, the minimum bandwidth of the channel G(f) and transmitter filter P(f) is W = 1/(2T). Assuming P(f) has this minimum bandwidth implies

    H(f) = \begin{cases} 2T e^{-j \pi f T} \cos(\pi f T), & |f| < 1/(2T) \\ 0, & |f| > 1/(2T) \end{cases}    (4.29a)

and

    h(t) = T \{ \mathrm{sinc}(t/T) + \mathrm{sinc}[(t - T)/T] \}    (4.29b)

where sinc x = (sin πx)/(πx). This pulse is called a duobinary pulse and is shown along with the associated H(f) in Fig. 4.9. [Notice that h(t) satisfies Eq. (4.26).] Unlike the ideal bandlimited frequency response, the transfer function H(f) in Eq. (4.29a) is continuous and is, therefore, easily approximated by a physically realizable filter. Duobinary PR was first proposed by Lender [7] and later generalized by Kretzmer [6].
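The properties claimed for the duobinary pulse can be checked with a few lines of code. This sketch normalizes T to 1 (so the leading factor in Eq. (4.29b) equals one), samples the pulse at the symbol instants, compares the two forms of Eq. (4.28), and contrasts the tail of the duobinary pulse with that of a sinc pulse. The function name is hypothetical.

```python
import numpy as np

T = 1.0   # symbol period, normalized to 1 for this check

def duobinary_pulse(t):
    """Eq. (4.29b) evaluated with the normalized symbol period T = 1."""
    t = np.asarray(t, dtype=float)
    return T * (np.sinc(t / T) + np.sinc((t - T) / T))

# Samples at t = kT: expect 1 for k = 0, 1 and 0 elsewhere, as in Eq. (4.26).
k = np.arange(-3, 5)
samples = duobinary_pulse(k * T)

# Both forms of Eq. (4.28) should have the same magnitude.
f = np.linspace(-0.5, 0.5, 101) / T
Heq = 1.0 + np.exp(-2j * np.pi * f * T)
same_magnitude = np.allclose(np.abs(Heq), 2.0 * np.abs(np.cos(np.pi * f * T)))

# Tail comparison midway between zero crossings, far from the main lobe.
tail_duobinary = abs(duobinary_pulse(10.5))
tail_sinc = abs(np.sinc(10.5))
print(np.round(samples, 12), same_magnitude, tail_duobinary, tail_sinc)
```

The much smaller tail of the duobinary pulse reflects its faster decay, which is what limits the intersymbol interference produced by a timing offset.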

FIGURE 4.9: Duobinary frequency response and minimum bandwidth pulse.

The main advantage of the duobinary pulse Eq. (4.29b), relative to the minimum bandwidth pulse Eq. (4.12), is that signalling at the Nyquist symbol rate is feasible with zero excess bandwidth. Because the pulse decays much more rapidly than a Nyquist pulse, it is robust with respect to timing errors. Selecting the transmitter and receiver filters so that the overall system response is duobinary is appropriate in situations where the channel frequency response G(f) is near zero or has a rapid rolloff at the Nyquist band edge f = 1/(2T). As another example of PR signalling, consider the modified duobinary partial response

    h_k = \begin{cases} 1, & k = -1 \\ -1, & k = 1 \\ 0, & \text{all other } k \end{cases}    (4.30)

which has equivalent discrete-time transfer function

    H_{eq}(e^{j 2\pi f T}) = e^{j 2\pi f T} - e^{-j 2\pi f T} = j 2 \sin(2\pi f T)    (4.31)

With zero excess bandwidth, the overall system response is

    H(f) = \begin{cases} j 2T \sin(2\pi f T), & |f| < 1/(2T) \\ 0, & |f| > 1/(2T) \end{cases}    (4.32a)

and

    h(t) = T \{ \mathrm{sinc}[(t + T)/T] - \mathrm{sinc}[(t - T)/T] \}    (4.32b)

These functions are plotted in Fig. 4.10. This pulse shape is appropriate when the channel response G(f) is near zero at both DC (f = 0) and at the Nyquist band edge. This is often the case for wire (twisted-pair) channels where the transmitted signal is coupled to the channel through a transformer. Like duobinary PR, modified duobinary allows minimum bandwidth signalling at the Nyquist rate. A particular partial response is often identified by the polynomial

    \sum_{k=0}^{K} h_k D^k

where D (for delay) takes the place of the usual z^{-1} in the z transform of the sequence {h_k}. For example, duobinary is also referred to as 1 + D partial response. In general, more complicated system responses than those shown in Figs. 4.9 and 4.10 can be generated by choosing more nonzero coefficients in the sequence {h_k}. This complicates detection, however, because of the additional intersymbol interference that is generated.

Rather than modulating a PR pulse h(t), a PR signal can also be generated by filtering the sequence of transmitted levels {A_i}. This is shown in Fig. 4.11. Namely, the transmitted levels are first passed through a discrete-time (digital) filter with transfer function P_d(e^{j2πfT}) (where the subscript d indicates discrete). [Note that P_d(e^{j2πfT}) can be selected to be H_eq(e^{j2πfT}).] The outputs of this filter form the PAM signal, where the pulse shaping filter P(f) = 1, |f| < 1/(2T) and is zero elsewhere. If the transmitted levels {A_k} are selected independently and are identically distributed, then the transmitted spectrum is σ_A^2 |P_d(e^{j2πfT})|^2 for |f| < 1/(2T) and is zero for |f| > 1/(2T), where σ_A^2 = E[|A_k|^2]. Shaping the transmitted spectrum to have nulls coincident with nulls in the channel response potentially offers significant performance advantages. By introducing intersymbol interference, however, PR signalling increases the number of received signal levels, which increases the complexity of the detector and may reduce immunity to noise. For example, the set of received signal levels for duobinary signalling is {0, ±2} from which the transmitted levels {±1} must be estimated. The performance of a particular PR scheme depends on the channel characteristics, as well as the type of detector used at the receiver. We now describe a simple suboptimal detection strategy.

FIGURE 4.10: Modified duobinary frequency response and minimum bandwidth pulse.

FIGURE 4.11: Generation of PR signal.
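The location of the spectral nulls for the two partial responses discussed above can be illustrated with a short calculation. The sketch evaluates σ_A^2 |P_d(e^{j2πfT})|^2 inside the Nyquist band, taking P_d equal to H_eq, T = 1 and σ_A^2 = 1; the function name and the index offset `k0` (the index of the first tap) are hypothetical conveniences.

```python
import numpy as np

def pr_power_spectrum(h, f, T=1.0, var_a=1.0, k0=0):
    """sigma_A^2 |P_d(e^{j 2 pi f T})|^2 for taps h_{k0}, h_{k0+1}, ..."""
    h = np.asarray(h, dtype=float)
    f = np.atleast_1d(np.asarray(f, dtype=float))
    k = k0 + np.arange(len(h))
    # P_d(e^{j 2 pi f T}) = sum_k h_k e^{-j 2 pi f k T}
    Pd = np.sum(h[:, None] * np.exp(-2j * np.pi * np.outer(k, f) * T), axis=0)
    return var_a * np.abs(Pd) ** 2

f = np.array([0.0, 0.25, 0.5])                     # DC, mid-band, band edge 1/(2T)
duo = pr_power_spectrum([1, 1], f)                 # duobinary, Eq. (4.26)
mod = pr_power_spectrum([1, 0, -1], f, k0=-1)      # modified duobinary, Eq. (4.30)
print(np.round(duo, 12), np.round(mod, 12))
```

Duobinary places a single null at the band edge, whereas modified duobinary places nulls at both DC and the band edge, matching the channel conditions for which each is recommended.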

4.5.1 Precoding

Consider the received signal sample Eq. (4.27) with duobinary signalling. If the receiver has correctly decoded the symbol A_{k−1}, then in the absence of noise A_k can be decoded by subtracting A_{k−1} from the received sample y_k. If an error occurs, however, then subtracting the preceding symbol estimate from the received sample will cause the error to propagate to successive detected symbols. To avoid this problem, the transmitted levels can be precoded in such a way as to compensate for the intersymbol interference introduced by the overall partial response.

FIGURE 4.12: Precoding for a PR channel.

TABLE 4.1: Example of Precoding for Duobinary PR.

{b_i}:          1    0    0    1    1    1    0    0    1    0
{b'_i}:    0    1    1    1    0    1    0    0    0    1    1
{A_i}:    −1    1    1    1   −1    1   −1   −1   −1    1    1
{y_i}:          0    2    2    0    0    0   −2   −2    0    2

We first illustrate precoding for duobinary PR. The sequence of operations is illustrated in Fig. 4.12. Let {b_k} denote the sequence of source bits where b_k ∈ {0, 1}. This sequence is transformed to the sequence {b'_k} by the operation

    b'_k = b_k \oplus b'_{k-1}    (4.33)

where ⊕ denotes modulo 2 addition (exclusive OR). The sequence {b'_k} is mapped to the sequence of binary transmitted signal levels {A_k} according to

    A_k = 2 b'_k - 1    (4.34)

That is, b'_k = 0 (b'_k = 1) is mapped to the transmitted level A_k = −1 (A_k = 1). In the absence of noise, the received symbol is then

    y_k = A_k + A_{k-1} = 2 \left( b'_k + b'_{k-1} - 1 \right)    (4.35)

and combining Eqs. (4.33) and (4.35) gives

    b_k = \left( \frac{1}{2} y_k + 1 \right) \bmod 2    (4.36)

That is, if y_k = ±2, then b_k = 0, and if y_k = 0, then b_k = 1. Precoding, therefore, enables the detector to make symbol-by-symbol decisions that do not depend on previous decisions. Table 4.1 shows a sequence of transmitted bits {b_i}, precoded bits {b'_i}, transmitted signal levels {A_i}, and received samples {y_i}.

The preceding precoding technique can be extended to multilevel PAM and to other PR channels. Suppose that the PR is specified by

    H_{eq}(D) = \sum_{k=0}^{K} h_k D^k

where the coefficients are integers and that the source symbols {b_k} are selected from the set {0, 1, . . . , M − 1}. These symbols are transformed to the sequence {b'_k} via the precoding operation

    b'_k = \left( b_k - \sum_{i=1}^{K} h_i b'_{k-i} \right) \bmod M    (4.37)

Because of the modulo operation, each symbol b'_k is also in the set {0, 1, . . . , M − 1}. The kth transmitted signal level is given by

    A_k = 2 b'_k - (M - 1)    (4.38)

so that the set of transmitted levels is {−(M − 1), . . . , (M − 1)} (i.e., a shifted version of the set of values assumed by b_k). In the absence of noise the received sample is

    y_k = \sum_{i=0}^{K} h_i A_{k-i}    (4.39)

and it can be shown that the kth source symbol is given by

    b_k = \frac{1}{2} \left[ y_k + (M - 1) \cdot H_{eq}(1) \right] \bmod M    (4.40)

Precoding the symbols {bk } in this manner, therefore, enables symbol-by-symbol decisions at the receiver. In the presence of noise, more sophisticated detection schemes (e.g., maximum likelihood) can be used with PR signalling to obtain improvements in performance.
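The precoding chain of Eqs. (4.37) through (4.40) can be sketched as follows. This is a minimal noise-free illustration under two assumptions that are not stated in the text: the precoder memory starts at zero, and the leading coefficient is h_0 = 1. The function names are hypothetical. With M = 2 and the 1 + D response the sketch regenerates the rows of Table 4.1.

```python
import numpy as np

def pr_precode(b, h, M):
    """Eq. (4.37): b'_k = (b_k - sum_{i=1..K} h_i b'_{k-i}) mod M.

    The K initial memory values are assumed to be zero and are kept at the
    front of the returned sequence.
    """
    K = len(h) - 1
    bp = [0] * K
    for bk in b:
        feedback = sum(h[i] * bp[-i] for i in range(1, K + 1))
        bp.append((bk - feedback) % M)
    return np.array(bp)

def pr_channel(bp, h, M):
    """Eqs. (4.38)-(4.39): map to levels, then apply the noiseless PR response."""
    K = len(h) - 1
    A = 2 * bp - (M - 1)
    y = np.convolve(A, h)[K:len(A)]        # keep samples with complete memory
    return A, y

def pr_decode(y, h, M):
    """Eq. (4.40): memoryless, symbol-by-symbol recovery of b_k."""
    return ((y + (M - 1) * sum(h)) // 2) % M

# Duobinary example from Table 4.1 (M = 2, 1 + D).
b = np.array([1, 0, 0, 1, 1, 1, 0, 0, 1, 0])
bp = pr_precode(b, [1, 1], 2)
A, y = pr_channel(bp, [1, 1], 2)
b_hat = pr_decode(y, [1, 1], 2)
print(bp, A, y, b_hat, sep="\n")

# Four-level example with the causal form 1 - D^2 of modified duobinary.
rng = np.random.default_rng(1)
b4 = rng.integers(0, 4, size=50)
_, y4 = pr_channel(pr_precode(b4, [1, 0, -1], 4), [1, 0, -1], 4)
b4_hat = pr_decode(y4, [1, 0, -1], 4)
```

Because the decoder uses only the current sample y_k, a single corrupted sample affects a single decision, which is the point of precoding.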

4.6 Additional Considerations

In many applications, bandwidth and intersymbol interference are not the only important considerations for selecting baseband pulses. Here we give a brief discussion of additional practical constraints that may influence this selection.

4.6.1 Average Transmitted Power and Spectral Constraints

The constraint on average transmitted power varies according to the application. For example, low average power is highly desirable for mobile wireless applications that use battery-powered transmitters. In many applications (e.g., digital subscriber loops, as well as digital radio), constraints are imposed to limit the amount of interference, or crosstalk, radiated into neighboring receivers and communications systems. Because this type of interference is frequency dependent, the constraint may take the form of a spectral mask that specifies the maximum allowable transmitted power as a function of frequency. For example, crosstalk in wireline channels is generally caused by capacitive coupling and increases as a function of frequency. Consequently, to reduce the amount of crosstalk generated at a particular transmitter, the pulse shaping filter generally attenuates high frequencies more than low frequencies.

In radio applications where signals are assigned different frequency bands, constraints on the transmitted spectrum are imposed to limit adjacent-channel interference. This interference is generated by transmitters assigned to adjacent frequency bands. Therefore, a constraint is needed to limit the amount of out-of-band power generated by each transmitter, in addition to an overall average power constraint. To meet this constraint, the transmitter filter in Fig. 4.3 must have a sufficiently steep rolloff at the edges of the assigned frequency band. (Conversely, if the transmitted signals are time multiplexed, then the duration of the system impulse response must be contained within the assigned time slot.)

4.6.2 Peak-to-Average Power

In addition to a constraint on average transmitted power, a peak-power constraint is often imposed as well. This constraint is important in practice for the following reasons:

1. The dynamic range of the transmitter is limited. In particular, saturation of the output amplifier will “clip” the transmitted waveform.
2. Rapid fades can severely distort signals with high peak-to-average power.
3. The transmitted signal may be subjected to nonlinearities. Saturation of the output amplifier is one example. Another example that pertains to wireline applications is the companding process in the voice telephone network [5]. Namely, the compander used to reduce quantization noise for pulse-code modulated voice signals introduces amplitude-dependent distortion in data signals.

The preceding impairments or constraints indicate that the transmitted waveform should have a low peak-to-average power ratio (PAR). For a transmitted waveform x(t), the PAR is defined as

    PAR = \frac{\max |x(t)|^2}{E\left( |x(t)|^2 \right)}

where E(·) denotes expectation. Using binary signalling with rectangular pulse shapes minimizes the PAR. However, this compromises bandwidth efficiency. In applications where PAR should be low, binary signalling with rounded pulses is often used. Operating RF power amplifiers with power back-off keeps a high-PAR signal out of saturation, but leads to inefficient amplification. For an orthogonal frequency division multiplexing (OFDM) system, it is well known that the transmitted signal can exhibit a very high PAR compared to an equivalent single-carrier system. Hence more sophisticated approaches to PAR reduction are required for OFDM. Some proposed approaches are described in [8] and references therein. These include altering the set of transmitted symbols and setting aside certain OFDM tones specifically to minimize PAR.
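The PAR definition can be illustrated with a short sketch that replaces the expectation by a time average over a finite record. The two test waveforms are assumptions chosen for illustration: a binary signal with rectangular pulses, and one OFDM symbol built from 64 randomly chosen four-phase tones through an inverse FFT. The function name is hypothetical.

```python
import numpy as np

def par(x):
    """Peak-to-average power ratio, with E(.) replaced by a time average."""
    p = np.abs(np.asarray(x)) ** 2
    return p.max() / p.mean()

rng = np.random.default_rng(0)

# Binary signalling with rectangular pulses: |x(t)| is constant.
rect = np.repeat(rng.choice([-1.0, 1.0], size=64), 8)

# One OFDM symbol: 64 tones, each carrying a four-phase symbol.
tones = rng.choice([-1.0, 1.0], size=64) + 1j * rng.choice([-1.0, 1.0], size=64)
ofdm = np.fft.ifft(tones)

par_rect, par_ofdm = par(rect), par(ofdm)
print(par_rect, par_ofdm)
```

The rectangular binary waveform attains the minimum value PAR = 1, while the sum of many independently modulated tones occasionally adds coherently and produces a much larger peak.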

4.6.3 Channel and Receiver Characteristics

The type of channel impairments encountered and the type of detection scheme used at the receiver can also influence the choice of a transmitted pulse shape. For example, a constant amplitude pulse is appropriate for a fast fading environment with noncoherent detection. The ability to track channel characteristics, such as phase, may allow more bandwidth efficient pulse shapes in addition to multilevel signalling. High-speed data communications over time-varying channels requires that the transmitter and/or receiver adapt to the changing channel characteristics. Adapting the transmitter to compensate for a time-varying channel requires a feedback channel through which the receiver can notify the transmitter of changes in channel characteristics. Because of this extra complication, adapting the receiver is often preferred to adapting the transmitter pulse shape. However, the following examples are notable exceptions.

1. The current IS-95 air interface for direct-sequence code-division multiple access adapts the transmitter power to control the amount of interference generated and to compensate for channel fades. This can be viewed as a simple form of adaptive transmitter pulse shaping in which a single parameter associated with the pulse shape is varied.
2. Multitone modulation divides the channel bandwidth into small subbands, and the transmitted power and source bits are distributed among these subbands to maximize the information rate. The received signal-to-noise ratio for each subband must be transmitted back to the transmitter to guide the allocation of transmitted bits and power [1].

In addition to multitone modulation, adaptive precoding (also known as Tomlinson–Harashima precoding [4, 11]) is another way in which the transmitter can adapt to the channel frequency response. Adaptive precoding is an extension of the technique described earlier for partial-response channels. Namely, the equivalent discrete-time channel impulse response is measured at the receiver and sent back to the transmitter, where it is used in a precoder. The precoder compensates for the intersymbol interference introduced by the channel, allowing the receiver to detect the data by a simple threshold operation. Both multitone modulation and precoding have been used with wireline channels (voiceband modems and digital subscriber loops).

4.6.4 Complexity

Generation of a bandwidth-efficient signal requires a filter with a sharp cutoff. In addition, bandwidth-efficient pulse shapes can complicate other system functions, such as timing and carrier recovery. If sufficient bandwidth is available, the cost can be reduced by using a rectangular pulse shape with a simple detection strategy (low-pass filter and threshold).

4.6.5 Tolerance to Interference

Interference is one of the primary channel impairments associated with digital radio. In addition to adjacent-channel interference described earlier, cochannel interference may be generated by other transmitters assigned to the same frequency band as the desired signal. Cochannel interference can be controlled through frequency (and perhaps time slot) assignments and by pulse shaping. For example, assuming fixed average power, increasing the bandwidth occupied by the signal lowers the power spectral density and decreases the amount of interference into a narrowband system that occupies part of the available bandwidth. Sufficient bandwidth spreading, therefore, enables wideband signals to be overlaid on top of narrowband signals without disrupting either service.

4.6.6 Probability of Intercept and Detection

The broadcast nature of wireless channels generally makes eavesdropping easier than for wired channels. A requirement for most commercial, as well as military applications, is to guarantee the privacy of user conversations (low probability of intercept). An additional requirement, in some applications, is that determining whether or not communications is taking place must be difficult (low probability of detection). Spread-spectrum waveforms are attractive in these applications since spreading the pulse energy over a wide frequency band decreases the power spectral density and, hence, makes the signal less visible. Power-efficient modulation combined with coding enables a further reduction in transmitted power for a target error rate.

4.7 Examples

We conclude this chapter with a brief description of baseband pulse shapes used in existing and emerging standards for digital mobile cellular and Personal Communications Services (PCS).

4.7.1 Global System for Mobile Communications (GSM)

The European GSM standard for digital mobile cellular communications operates in the 900-MHz frequency band, and is based on time-division multiple access (TDMA) [9]. The U.S. version operates at 1900 MHz, and is called PCS-1900. A special variant of binary FSK is used called Gaussian minimum-shift keying (GMSK). The GMSK modulator is illustrated in Fig. 4.13. The input to the modulator is a binary PAM signal s(t), given by Eq. (4.3), where the pulse p(t) is a Gaussian function and |s(t)| < 1. This waveform frequency modulates the carrier f_c, so that the (passband) transmitted signal is

    w(t) = K \cos\left[ 2\pi f_c t + 2\pi f_d \int_{-\infty}^{t} s(\tau)\, d\tau \right]

The maximum frequency deviation from the carrier is f_d = 1/(4T), so that the two signalling frequencies are separated by 1/(2T), which characterizes minimum-shift keying. This technique can be used with a noncoherent receiver that is easy to implement. Because the transmitted signal has a constant envelope, the data can be reliably detected in the presence of rapid fades that are characteristic of mobile radio channels.
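The constant-envelope property can be illustrated with a complex-baseband sketch of the modulator. Several choices below are assumptions of the sketch rather than statements from the text: T = 1, a Gaussian filter with bandwidth-time product 0.3, rectangular data pulses before filtering, and a deviation f_d chosen so that a constant input advances the phase by π/2 per symbol, the usual minimum-shift condition. The function name is hypothetical.

```python
import numpy as np

def gmsk_baseband(bits, sps=8, bt=0.3, span=4):
    """Return the filtered PAM signal s and the complex envelope exp(j*phi)."""
    T = 1.0
    fd = 1.0 / (4.0 * T)                       # assumed: pi/2 phase change per symbol
    nrz = np.repeat(2 * np.asarray(bits) - 1, sps).astype(float)

    # Gaussian low-pass filter, normalized to unit DC gain so that |s| <= 1.
    t = np.arange(-span * sps, span * sps + 1) / sps
    sigma = np.sqrt(np.log(2.0)) / (2.0 * np.pi * bt)
    g = np.exp(-t ** 2 / (2.0 * sigma ** 2))
    g /= g.sum()

    s = np.convolve(nrz, g, mode="same")       # Gaussian-filtered binary PAM signal
    phase = 2.0 * np.pi * fd * np.cumsum(s) / sps   # running integral of s(tau)
    return s, np.exp(1j * phase)

rng = np.random.default_rng(0)
s, envelope = gmsk_baseband(rng.integers(0, 2, size=20))
print(np.max(np.abs(s)), np.min(np.abs(envelope)), np.max(np.abs(envelope)))
```

Because the data only enter through the phase, the magnitude of the complex envelope stays at one regardless of the bit pattern; multiplying by the carrier and taking the real part gives the passband signal w(t).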

FIGURE 4.13: Generation of GMSK signal; LPF is low-pass filter.

4.7.2 U.S. Digital Cellular (IS-136)

The IS-136 air interface (formerly IS-54) operates in the 800 MHz band and is based on TDMA [3]. There is also a 1900 MHz version of IS-136. The baseband signal is given by Eq. (4.3) where the symbols are complex-valued, corresponding to quadrature phase modulation. The pulse has a square-root raised cosine spectrum with 35% excess bandwidth.

4.7.3 Interim Standard-95

The IS-95 air interface for digital mobile cellular uses spread-spectrum signalling (CDMA) in the 800 MHz band [10]. There is also a 1900 MHz version of IS-95. The baseband transmitted pulse shapes are analogous to those shown in Fig. 4.2, where the number of square pulses (chips) per bit is 128. To improve spectral efficiency the (wideband) transmitted signal is filtered by an approximation to an ideal low-pass response with a small amount of excess bandwidth. This shapes the chips so that they resemble minimum bandwidth pulses.

4.7.4 Personal Access Communications System (PACS)

Both PACS and the Japanese personal handy phone (PHP) system are TDMA systems which have been proposed for personal communications systems (PCS), and operate near 2 GHz [2]. The baseband signal is given by Eq. (4.3) with four complex symbols representing four-phase quadrature modulation. The baseband pulse has a square-root raised cosine spectrum with 50% excess bandwidth.

Defining Terms

Baseband signal: A signal with frequency content centered around DC.
Equivalent discrete-time transfer function: A discrete-time transfer function (z transform) that relates the transmitted amplitudes to received samples in the absence of noise.
Excess bandwidth: That part of the baseband transmitted spectrum which is not contained within the Nyquist band.
Eye diagram: Superposition of segments of a received PAM signal that indicates the amount of intersymbol interference present.
Frequency-shift keying: A digital modulation technique in which the transmitted pulse is sinusoidal, where the frequency is determined by the source bits.
Intersymbol interference: The additive contribution (interference) to a received sample from transmitted symbols other than the symbol to be detected.
Matched filter: The receiver filter with impulse response equal to the time-reversed, complex conjugate impulse response of the combined transmitter filter-channel impulse response.
Nyquist band: The narrowest frequency band that can support a PAM signal without intersymbol interference (the interval [−1/(2T), 1/(2T)] where 1/T is the symbol rate).
Nyquist criterion: A condition on the overall frequency response of a PAM system that ensures the absence of intersymbol interference.
Orthogonal frequency division multiplexing (OFDM): Modulation technique in which the transmitted signal is the sum of low-bit-rate narrowband digital signals modulated on orthogonal carriers.
Partial-response signalling: A signalling technique in which a controlled amount of intersymbol interference is introduced at the transmitter in order to shape the transmitted spectrum.
Precoding: A transformation of source symbols at the transmitter that compensates for intersymbol interference introduced by the channel.
Pulse amplitude modulation (PAM): A digital modulation technique in which the source bits are mapped to a sequence of amplitudes that modulate a transmitted pulse.
Raised cosine pulse: A pulse shape with Fourier transform that decays to zero according to a raised cosine; see Eq. (4.18). The amount of excess bandwidth is conveniently determined by a single parameter (α).
Spread spectrum: A signalling technique in which the pulse bandwidth is many times wider than the Nyquist bandwidth.
Zero-forcing criterion: A design constraint which specifies that intersymbol interference be eliminated.

References

[1] Bingham, J.A.C., Multicarrier modulation for data transmission: an idea whose time has come. IEEE Commun. Mag., 28(May), 5–14, 1990.
[2] Cox, D.C., Wireless personal communications: what is it? IEEE Personal Comm., 2(2), 20–35, 1995.
[3] Electronic Industries Association/Telecommunications Industry Association. Recommended minimum performance standards for 800 MHz dual-mode mobile stations. Incorp. EIA/TIA 19B, EIA/TIA Project No. 2216, Mar., 1991.
[4] Harashima, H. and Miyakawa, H., Matched-transmission technique for channels with intersymbol interference. IEEE Trans. on Commun., COM-20(Aug.), 774–780, 1972.
[5] Kalet, I. and Saltzberg, B.R., QAM transmission through a companding channel - signal constellations and detection. IEEE Trans. on Comm., 42(2–4), 417–429, 1994.
[6] Kretzmer, E.R., Generalization of a technique for binary data communication. IEEE Trans. Comm. Tech., COM-14 (Feb.), 67, 68, 1966.
[7] Lender, A., The duobinary technique for high-speed data transmission. AIEE Trans. on Comm. Electronics, 82 (March), 214–218, 1963.
[8] Muller, S.H. and Huber, J.B., A comparison of peak power reduction schemes for OFDM. Proc. GLOBECOM ’97, (Mon.), 1–5, 1997.
[9] Rahnema, M., Overview of the GSM system and protocol architecture. IEEE Commun. Mag., (April), 92–100, 1993.
[10] Telecommunication Industry Association. Mobile station-base station compatibility standard for dual-mode wideband spread spectrum cellular system. TIA/EIA/IS-95-A. May, 1995.
[11] Tomlinson, M., New automatic equalizer employing modulo arithmetic. Electron. Lett., 7 (March), 138, 139, 1971.

Further Information

Baseband signalling and pulse shaping is fundamental to the design of any digital communications system and is, therefore, covered in numerous texts on digital communications. For more advanced treatments see E.A. Lee and D.G. Messerschmitt, Digital Communication, Kluwer 1994, and J.G. Proakis, Digital Communications, McGraw-Hill 1995.

Proakis, J.G. “Channel Equalization” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Channel Equalization

John G. Proakis
Northeastern University

5.1 Characterization of Channel Distortion
5.2 Characterization of Intersymbol Interference
5.3 Linear Equalizers
    Adaptive Linear Equalizers
5.4 Decision-Feedback Equalizer
5.5 Maximum-Likelihood Sequence Detection
5.6 Conclusions
Defining Terms
References
Further Information

5.1 Characterization of Channel Distortion

Many communication channels, including telephone channels, and some radio channels, may be generally characterized as band-limited linear filters. Consequently, such channels are described by their frequency response C(f), which may be expressed as

    C(f) = A(f) e^{j\theta(f)}    (5.1)

where A(f) is called the amplitude response and θ(f) is called the phase response. Another characteristic that is sometimes used in place of the phase response is the envelope delay or group delay, which is defined as

    \tau(f) = -\frac{1}{2\pi} \frac{d\theta(f)}{df}    (5.2)

A channel is said to be nondistorting or ideal if, within the bandwidth W occupied by the transmitted signal, A(f) = const and θ(f) is a linear function of frequency [or the envelope delay τ(f) = const]. On the other hand, if A(f) and τ(f) are not constant within the bandwidth occupied by the transmitted signal, the channel distorts the signal. If A(f) is not constant, the distortion is called amplitude distortion and if τ(f) is not constant, the distortion on the transmitted signal is called delay distortion.

As a result of the amplitude and delay distortion caused by the nonideal channel frequency response characteristic C(f), a succession of pulses transmitted through the channel at rates comparable to the bandwidth W are smeared to the point that they are no longer distinguishable as well-defined pulses at the receiving terminal. Instead, they overlap and, thus, we have intersymbol interference (ISI). As an example of the effect of delay distortion on a transmitted pulse, Fig. 5.1(a) illustrates a bandlimited pulse having zeros periodically spaced in time at points labeled ±T, ±2T, etc. If information
is conveyed by the pulse amplitude, as in pulse amplitude modulation (PAM), for example, then one can transmit a sequence of pulses, each of which has a peak at the periodic zeros of the other pulses. Transmission of the pulse through a channel modeled as having a linear envelope delay characteristic τ (f ) [quadratic phase θ(f )], however, results in the received pulse shown in Fig. 5.1(b) having zero crossings that are no longer periodically spaced. Consequently a sequence of successive pulses would be smeared into one another, and the peaks of the pulses would no longer be distinguishable. Thus, the channel delay distortion results in intersymbol interference. As will be discussed in this chapter, it is possible to compensate for the nonideal frequency response characteristic of the channel by use of a filter or equalizer at the demodulator. Figure 5.1(c) illustrates the output of a linear equalizer that compensates for the linear distortion in the channel. The extent of the intersymbol interference on a telephone channel can be appreciated by observing a frequency response characteristic of the channel. Figure 5.2 illustrates the measured average amplitude and delay as a function of frequency for a medium-range (180–725 mi) telephone channel of the switched telecommunications network as given by Duffy and Tratcher, 1971. We observe that the usable band of the channel extends from about 300 Hz to about 3000 Hz. The corresponding impulse response of the average channel is shown in Fig. 5.3. Its duration is about 10 ms. In comparison, the transmitted symbol rates on such a channel may be of the order of 2500 pulses or symbols per second. Hence, intersymbol interference might extend over 20–30 symbols. Besides telephone channels, there are other physical channels that exhibit some form of time dispersion and, thus, introduce intersymbol interference. 
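The relation between a quadratic phase response and a linear envelope delay, Eq. (5.2), can be checked numerically. The sketch below uses a hypothetical channel whose delay is τ(f) = τ0 + βf over a voiceband-like frequency range; the numerical values of τ0 and β are illustrative only and are not taken from Fig. 5.2.

```python
import numpy as np

def group_delay(theta, f):
    """Envelope delay tau(f) = -(1/(2*pi)) d(theta)/df, by finite differences."""
    return -np.gradient(theta, f, edge_order=2) / (2.0 * np.pi)

f = np.linspace(300.0, 3000.0, 271)           # frequency grid in Hz
tau0, beta = 1.0e-3, 2.0e-7                   # hypothetical: tau(f) = tau0 + beta * f

# Integrating tau(f) gives a phase that is quadratic in f.
theta = -2.0 * np.pi * (tau0 * f + 0.5 * beta * f ** 2)

tau = group_delay(theta, f)
print(tau[0], tau[-1])                        # delay at the band edges, in seconds
```

A delay that varies across the band, as here, is exactly the delay distortion described above; a purely linear phase would give a constant τ(f) and no distortion.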
Radio channels, such as short-wave ionospheric propagation (HF), tropospheric scatter, and mobile cellular radio are three examples of time-dispersive wireless channels. In these channels, time dispersion and, hence, intersymbol interference is the result of multiple propagation paths with different path delays. The number of paths and the relative time delays among the paths vary with time and, for this reason, these radio channels are usually called time-variant multipath channels. The time-variant multipath conditions give rise to a wide variety of frequency response characteristics. Consequently, the frequency response characterization that is used for telephone channels is inappropriate for time-variant multipath channels. Instead, these radio channels are characterized statistically in terms of the scattering function, which, in brief, is a two-dimensional representation of the average received signal power as a function of relative time delay and Doppler frequency (see Proakis [4]). For illustrative purposes, a scattering function measured on a medium-range (150 mi) tropospheric scatter channel is shown in Fig. 5.4. The total time duration (multipath spread) of the channel response is approximately 0.7 µs on the average, and the spread between half-power points in Doppler frequency is a little less than 1 Hz on the strongest path and somewhat larger on the other paths. Typically, if one is transmitting at a rate of 10^7 symbols/s over such a channel, the multipath spread of 0.7 µs will result in intersymbol interference that spans about seven symbols.
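The two ISI spans quoted in this section follow from multiplying the duration of the channel response by the symbol rate. The short sketch below repeats that arithmetic for the telephone channel (10 ms response, 2500 symbols/s) and the tropospheric scatter channel (0.7 µs multipath spread, 10^7 symbols/s); the function name is hypothetical.

```python
def isi_span_symbols(dispersion_seconds, symbol_rate):
    """Approximate number of symbol intervals covered by the channel response."""
    return dispersion_seconds * symbol_rate

telephone_span = isi_span_symbols(10e-3, 2500.0)     # within the quoted 20-30 symbols
tropo_span = isi_span_symbols(0.7e-6, 1.0e7)         # about seven symbols
print(telephone_span, tropo_span)
```

The same product gives a quick first estimate of how many taps an equalizer must span for a given channel and signalling rate.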

5.2

Characterization of Intersymbol Interference

In a digital communication system, channel distortion causes intersymbol interference, as illustrated in the preceding section. In this section, we shall present a model that characterizes the ISI. The digital modulation methods to which this treatment applies are PAM, phase-shift keying (PSK) and quadrature amplitude modulation (QAM). The transmitted signal for these three types of modulation may be expressed as

$$
\begin{aligned}
s(t) &= v_c(t)\cos 2\pi f_c t - v_s(t)\sin 2\pi f_c t \\
     &= \mathrm{Re}\left[ v(t)\, e^{j 2\pi f_c t} \right]
\end{aligned}
\tag{5.3}
$$

FIGURE 5.1: Effect of channel distortion: (a) channel input, (b) channel output, (c) equalizer output.


FIGURE 5.2: Average amplitude and delay characteristics of medium-range telephone channel.

where $v(t) = v_c(t) + j v_s(t)$ is called the equivalent low-pass signal, $f_c$ is the carrier frequency, and Re[ ] denotes the real part of the quantity in brackets. In general, the equivalent low-pass signal is expressed as

$$
v(t) = \sum_{n=0}^{\infty} I_n\, g_T(t - nT)
\tag{5.4}
$$

where $g_T(t)$ is the basic pulse shape that is selected to control the spectral characteristics of the transmitted signal, $\{I_n\}$ is the sequence of transmitted information symbols selected from a signal constellation consisting of M points, and T is the signal interval (1/T is the symbol rate). For PAM, PSK, and QAM, the values of $I_n$ are points from M-ary signal constellations. Figure 5.5 illustrates the signal constellations for the case of M = 8 signal points. Note that for PAM, the signal constellation is one dimensional. Hence, the equivalent low-pass signal $v(t)$ is real valued, i.e., $v_s(t) = 0$ and $v_c(t) = v(t)$. For M-ary (M > 2) PSK and QAM, the signal constellations are two dimensional and, hence, $v(t)$ is complex valued.
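The superposition in Eq. (5.4) can be sketched numerically. The following is a minimal illustration, assuming a rectangular pulse for $g_T(t)$ and a sampling grid of `sps` samples per symbol interval; the function name `lowpass_signal`, the parameter `sps`, and the choice of pulse are hypothetical and not taken from the text.

```python
import numpy as np

def lowpass_signal(symbols, pulse, sps):
    """Sampled v(t) = sum_n I_n g_T(t - nT), with sps samples per interval T."""
    symbols = np.asarray(symbols)
    # Impulse train: symbol I_n at sample n*sps, zeros in between.
    impulses = np.zeros(len(symbols) * sps, dtype=symbols.dtype)
    impulses[::sps] = symbols
    # Convolution with g_T superimposes one shifted, scaled pulse per symbol.
    return np.convolve(impulses, pulse)

# 8-PAM amplitude levels as listed for Fig. 5.5(a)
pam_levels = np.array([-7.0, -5.0, -3.0, -1.0, 1.0, 3.0, 5.0, 7.0])
rect_pulse = np.ones(4)                      # hypothetical rectangular g_T(t)
v = lowpass_signal(pam_levels[[0, 7, 3]], rect_pulse, sps=4)
```

For PAM the symbols are real, so `v` is real valued; passing complex PSK or QAM symbols to the same function yields a complex `v`, in line with the remark above.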

FIGURE 5.3: Impulse response of average channel with amplitude and delay shown in Fig.5.2.

FIGURE 5.4: Scattering function of a medium-range tropospheric scatter channel.

The signal $s(t)$ is transmitted over a bandpass channel that may be characterized by an equivalent low-pass frequency response $C(f)$. Consequently, the equivalent low-pass received signal can be represented as

$$
r(t) = \sum_{n=0}^{\infty} I_n\, h(t - nT) + w(t)
\tag{5.5}
$$

FIGURE 5.5: M = 8 signal constellations for (a) PAM, (b) PSK, and (c) QAM. In panel (a) the amplitude levels −7, −5, −3, −1, 1, 3, 5, 7 are shown with the 3-bit labels 000, 001, 011, 010, 110, 111, 101, 100.


where $h(t) = g_T(t) * c(t)$, $c(t)$ is the impulse response of the equivalent low-pass channel, the asterisk denotes convolution, and $w(t)$ represents the additive noise in the channel. To characterize the ISI, suppose that the received signal is passed through a receiving filter and then sampled at the rate 1/T samples/s. In general, the optimum filter at the receiver is matched to the received signal pulse $h(t)$. Hence, the frequency response of this filter is $H^*(f)$. We denote its output as

$$
y(t) = \sum_{n=0}^{\infty} I_n\, x(t - nT) + \nu(t)
\tag{5.6}
$$

where $x(t)$ is the signal pulse response of the receiving filter, i.e., $X(f) = H(f)H^*(f) = |H(f)|^2$, and $\nu(t)$ is the response of the receiving filter to the noise $w(t)$. Now, if $y(t)$ is sampled at times t = kT, k = 0, 1, 2, ..., we have

$$
\begin{aligned}
y(kT) \equiv y_k &= \sum_{n=0}^{\infty} I_n\, x(kT - nT) + \nu(kT) \\
                 &= \sum_{n=0}^{\infty} I_n\, x_{k-n} + \nu_k, \qquad k = 0, 1, \ldots
\end{aligned}
\tag{5.7}
$$

The sample values $\{y_k\}$ can be expressed as

$$
y_k = x_0 \left( I_k + \frac{1}{x_0} \sum_{\substack{n=0 \\ n \neq k}}^{\infty} I_n\, x_{k-n} \right) + \nu_k, \qquad k = 0, 1, \ldots
\tag{5.8}
$$

The term $x_0$ is an arbitrary scale factor, which we set equal to unity for convenience. Then

$$
y_k = I_k + \sum_{\substack{n=0 \\ n \neq k}}^{\infty} I_n\, x_{k-n} + \nu_k
\tag{5.9}
$$

The term $I_k$ represents the desired information symbol at the kth sampling instant, the term

$$
\sum_{\substack{n=0 \\ n \neq k}}^{\infty} I_n\, x_{k-n}
\tag{5.10}
$$

represents the ISI, and $\nu_k$ is the additive noise variable at the kth sampling instant.

The amount of ISI and noise in a digital communications system can be viewed on an oscilloscope. For PAM signals, we can display the received signal $y(t)$ on the vertical input with the horizontal sweep rate set at 1/T. The resulting oscilloscope display is called an eye pattern because of its resemblance to the human eye. For example, Fig. 5.6 illustrates the eye patterns for binary and four-level PAM modulation. The effect of ISI is to cause the eye to close, thereby reducing the margin for additive noise to cause errors. Figure 5.7 graphically illustrates the effect of ISI in reducing the opening of a binary eye. Note that intersymbol interference distorts the position of the zero crossings and causes a reduction in the eye opening. Thus, it causes the system to be more sensitive to a synchronization error.
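The decomposition of $y_k$ into a desired term, an ISI term, and noise, Eqs. (5.9) and (5.10), can be checked with a short sketch. It assumes a finite pulse with $x_0 = 1$ and symbols that are zero outside the transmitted block; the sample values of `x` and the index convention `k0` are hypothetical choices for illustration.

```python
import numpy as np

def sampled_output(symbols, x, k0, noise=None):
    """y_k = sum_n I_n x_{k-n} + nu_k (Eq. 5.7) for k = 0..len(symbols)-1.

    x  : symbol-rate samples of the overall pulse x(t)
    k0 : index in x that holds x_0 (hypothetical convention)
    """
    y = np.convolve(symbols, x)[k0:k0 + len(symbols)]
    return y if noise is None else y + noise

def isi_term(symbols, x, k0):
    """ISI of Eq. (5.10): everything in the noiseless y_k except x_0 * I_k."""
    y = sampled_output(symbols, x, k0)
    return y - x[k0] * np.asarray(symbols)

I = np.array([1.0, -1.0, 1.0, 1.0, -1.0])   # binary PAM symbols
x = np.array([0.1, 1.0, 0.25])              # x_{-1}, x_0, x_1 with x_0 = 1
y = sampled_output(I, x, k0=1)
isi = isi_term(I, x, k0=1)
```

Here `y - isi` reproduces the transmitted symbols exactly, while `isi` alone shows how much each sample is displaced by its neighbors, which is the quantity that closes the eye.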

FIGURE 5.6: Examples of eye patterns for binary and quaternary amplitude shift keying (or PAM).

FIGURE 5.7: Effect of intersymbol interference on eye opening. Labeled features: optimum sampling time, sensitivity to timing error, peak distortion, distortion of zero crossings, and noise margin.

For PSK and QAM it is customary to display the eye pattern as a two-dimensional scatter diagram illustrating the sampled values $\{y_k\}$ that represent the decision variables at the sampling instants. Figure 5.8 illustrates such an eye pattern for an 8-PSK signal. In the absence of intersymbol interference and noise, the superimposed signals at the sampling instants would result in eight distinct points corresponding to the eight transmitted signal phases. Intersymbol interference and noise result in a deviation of the received samples $\{y_k\}$ from the desired 8-PSK signal. The larger the intersymbol interference and noise, the larger the scattering of the received signal samples relative to the transmitted signal points.

In practice, the transmitter and receiver filters are designed for zero ISI at the desired sampling times t = kT. Thus, if $G_T(f)$ is the frequency response of the transmitter filter and $G_R(f)$ is the frequency response of the receiver filter, then the product $G_T(f)G_R(f)$ is designed to yield zero ISI.

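FIGURE 5.8: Two-dimensional digital eye patterns.

For example, the product $G_T(f)G_R(f)$ may be selected as

$$
G_T(f)\,G_R(f) = X_{rc}(f)
\tag{5.11}
$$

where $X_{rc}(f)$ is the raised-cosine frequency response characteristic, defined as

$$
X_{rc}(f) =
\begin{cases}
T, & 0 \le |f| \le \dfrac{1-\alpha}{2T} \\[2ex]
\dfrac{T}{2}\left[ 1 + \cos \dfrac{\pi T}{\alpha}\left( |f| - \dfrac{1-\alpha}{2T} \right) \right], & \dfrac{1-\alpha}{2T} \le |f| \le \dfrac{1+\alpha}{2T} \\[2ex]
0, & |f| > \dfrac{1+\alpha}{2T}
\end{cases}
\tag{5.12}
$$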

where $\alpha$ is called the rolloff factor, which takes values in the range 0 ≤ α ≤ 1, and 1/T is the symbol rate. The frequency response $X_{rc}(f)$ is illustrated in Fig. 5.9(a) for α = 0, 1/2, and 1. Note that when α = 0, $X_{rc}(f)$ reduces to an ideal "brick wall" frequency response, which is physically nonrealizable, with bandwidth occupancy 1/2T. The frequency 1/2T is called the Nyquist frequency. For α > 0, the bandwidth occupied by the desired signal $X_{rc}(f)$ beyond the Nyquist frequency 1/2T is called the excess bandwidth, and is usually expressed as a percentage of the Nyquist frequency. For example, when α = 1/2, the excess bandwidth is 50% and when α = 1, the excess bandwidth is 100%. The signal pulse $x_{rc}(t)$ having the raised-cosine spectrum is

$$
x_{rc}(t) = \frac{\sin(\pi t/T)}{\pi t/T} \cdot \frac{\cos(\pi \alpha t/T)}{1 - 4\alpha^2 t^2/T^2}
\tag{5.13}
$$

Figure 5.9(b) illustrates $x_{rc}(t)$ for α = 0, 1/2, and 1. Note that $x_{rc}(t) = 1$ at t = 0 and $x_{rc}(t) = 0$ at t = kT, k = ±1, ±2, .... Consequently, at the sampling instants t = kT, k ≠ 0, there is no ISI from adjacent symbols when there is no channel distortion. In the presence of channel distortion, however, the ISI given by Eq. (5.10) is no longer zero, and a channel equalizer is needed to minimize its effect on system performance.
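The zero-crossing property of the raised-cosine pulse in Eq. (5.13) can be verified numerically. The sketch below is one possible implementation, not taken from the text; the function name is hypothetical, and the handling of the removable singularity at t = ±T/(2α), where the formula becomes 0/0, uses the limiting value (π/4)·sinc(1/(2α)) obtained by L'Hôpital's rule.

```python
import numpy as np

def raised_cosine_pulse(t, T, alpha):
    """x_rc(t) of Eq. (5.13). Note np.sinc(u) = sin(pi u) / (pi u)."""
    t = np.asarray(t, dtype=float)
    denom = 1.0 - (2.0 * alpha * t / T) ** 2
    singular = np.isclose(denom, 0.0)
    safe = np.where(singular, 1.0, denom)       # avoid dividing by zero
    x = np.sinc(t / T) * np.cos(np.pi * alpha * t / T) / safe
    if alpha > 0:
        # Limit of the 0/0 form at t = +-T/(2 alpha).
        limit = (np.pi / 4.0) * np.sinc(1.0 / (2.0 * alpha))
        x = np.where(singular, limit, x)
    return x

# Samples at the symbol instants t = kT, k = -3..3, for 50% excess bandwidth
samples = raised_cosine_pulse(np.arange(-3, 4) * 1.0, T=1.0, alpha=0.5)
```

Up to floating-point rounding, `samples` is 1 at k = 0 and 0 at every other symbol instant, which is the zero-ISI condition stated above.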


FIGURE 5.9: Pulses having a raised cosine spectrum. The panels show the time-domain pulse $x_{rc}(t)$ and the spectrum $X_{rc}(f)$ for α = 0, 0.5, and 1.

5.3

Linear Equalizers

The most common type of channel equalizer used in practice to reduce ISI is a linear transversal filter with adjustable coefficients $\{c_i\}$, as shown in Fig. 5.10. On channels whose frequency response characteristics are unknown, but time invariant, we may measure the channel characteristics and adjust the parameters of the equalizer; once adjusted, the parameters remain fixed during the transmission of data. Such equalizers are called preset equalizers. On the other hand, adaptive equalizers update their parameters on a periodic basis during the transmission of data and, thus, they are capable of tracking a slowly time-varying channel response.

FIGURE 5.10: Linear transversal filter.

FIGURE 5.11: Block diagram of a system with an equalizer.

First, let us consider the design characteristics for a linear equalizer from a frequency domain viewpoint. Figure 5.11 shows a block diagram of a system that employs a linear filter as a channel equalizer. The demodulator consists of a receiver filter with frequency response $G_R(f)$ in cascade with a channel equalizing filter that has a frequency response $G_E(f)$. As indicated in the preceding section, the receiver filter response $G_R(f)$ is matched to the transmitter response, i.e., $G_R(f) = G_T^*(f)$, and the product $G_R(f)G_T(f)$ is usually designed so that there is zero ISI at the sampling instants as, for example, when $G_R(f)G_T(f) = X_{rc}(f)$. For the system shown in Fig. 5.11, in which the channel frequency response is not ideal, the desired condition for zero ISI is

$$
G_T(f)\,C(f)\,G_R(f)\,G_E(f) = X_{rc}(f)
\tag{5.14}
$$

where $X_{rc}(f)$ is the desired raised-cosine spectral characteristic. Since $G_T(f)G_R(f) = X_{rc}(f)$ by design, the frequency response of the equalizer that compensates for the channel distortion is

$$
G_E(f) = \frac{1}{C(f)} = \frac{1}{|C(f)|}\, e^{-j\theta_c(f)}
\tag{5.15}
$$

Thus, the amplitude response of the equalizer is $|G_E(f)| = 1/|C(f)|$ and its phase response is $\theta_E(f) = -\theta_c(f)$. In this case, the equalizer is said to be the inverse channel filter to the channel response. We note that the inverse channel filter completely eliminates ISI caused by the channel. Since it forces the ISI to be zero at the sampling instants t = kT, k = 0, 1, ..., the equalizer is called a zero-forcing equalizer. Hence, the input to the detector is simply

$$
z_k = I_k + \eta_k, \qquad k = 0, 1, \ldots
\tag{5.16}
$$

where $\eta_k$ represents the additive noise and $I_k$ is the desired symbol.

In practice, the ISI caused by channel distortion is usually limited to a finite number of symbols on either side of the desired symbol. Hence, the number of terms that constitute the ISI in the summation given by Eq. (5.10) is finite. As a consequence, in practice the channel equalizer is implemented as a finite duration impulse response (FIR) filter, or transversal filter, with adjustable tap coefficients $\{c_n\}$, as illustrated in Fig. 5.10. The time delay $\tau$ between adjacent taps may be selected as large as T, the symbol interval, in which case the FIR equalizer is called a symbol-spaced equalizer. In this case, the input to the equalizer is the sampled sequence given by Eq. (5.7). We note that when the symbol rate 1/T < 2W, however, frequencies in the received signal above the folding frequency 1/T are aliased into frequencies below 1/T. In this case, the equalizer compensates for the aliased channel-distorted signal.

On the other hand, when the time delay $\tau$ between adjacent taps is selected such that 1/τ ≥ 2W > 1/T, no aliasing occurs and, hence, the inverse channel equalizer compensates for the true channel distortion. Since τ < T, the channel equalizer is said to have fractionally spaced taps and it is called a fractionally spaced equalizer. In practice, τ is often selected as τ = T/2. Notice that, in this case, the sampling rate at the output of the filter $G_R(f)$ is 2/T. The impulse response of the FIR equalizer is

$$
g_E(t) = \sum_{n=-N}^{N} c_n\, \delta(t - n\tau)
\tag{5.17}
$$

and the corresponding frequency response is

$$
G_E(f) = \sum_{n=-N}^{N} c_n\, e^{-j 2\pi f n \tau}
\tag{5.18}
$$

where $\{c_n\}$ are the (2N + 1) equalizer coefficients and N is chosen sufficiently large so that the equalizer spans the length of the ISI, i.e., 2N + 1 ≥ L, where L is the number of signal samples spanned by the ISI. Since $X(f) = G_T(f)C(f)G_R(f)$ and $x(t)$ is the signal pulse corresponding to $X(f)$, then the equalized output signal pulse is

$$
q(t) = \sum_{n=-N}^{N} c_n\, x(t - n\tau)
\tag{5.19}
$$

The zero-forcing condition can now be applied to the samples of $q(t)$ taken at times t = mT. These samples are

$$
q(mT) = \sum_{n=-N}^{N} c_n\, x(mT - n\tau), \qquad m = 0, \pm 1, \ldots, \pm N
\tag{5.20}
$$

Since there are 2N + 1 equalizer coefficients, we can control only 2N + 1 sampled values of $q(t)$. Specifically, we may force the conditions

$$
q(mT) = \sum_{n=-N}^{N} c_n\, x(mT - n\tau) =
\begin{cases}
1, & m = 0 \\
0, & m = \pm 1, \pm 2, \ldots, \pm N
\end{cases}
\tag{5.21}
$$

which may be expressed in matrix form as $\mathbf{X}\mathbf{c} = \mathbf{q}$, where $\mathbf{X}$ is a (2N + 1) × (2N + 1) matrix with elements $\{x(mT - n\tau)\}$, $\mathbf{c}$ is the (2N + 1) coefficient vector and $\mathbf{q}$ is the (2N + 1) column vector with one nonzero element. Thus, we obtain a set of 2N + 1 linear equations for the coefficients of the zero-forcing equalizer.

We should emphasize that the FIR zero-forcing equalizer does not completely eliminate ISI because it has a finite length. As N is increased, however, the residual ISI can be reduced, and in the limit as N → ∞, the ISI is completely eliminated.

EXAMPLE 5.1:

Consider a channel distorted pulse $x(t)$, at the input to the equalizer, given by the expression

$$
x(t) = \frac{1}{1 + \left( \dfrac{2t}{T} \right)^2}
$$

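where 1/T is the symbol rate. The pulse is sampled at the rate 2/T and equalized by a zero-forcing equalizer. Determine the coefficients of a five-tap zero-forcing equalizer.

Solution 5.1

According to Eq. (5.21), the zero-forcing equalizer must satisfy the equations

$$
q(mT) = \sum_{n=-2}^{2} c_n\, x(mT - nT/2) =
\begin{cases}
1, & m = 0 \\
0, & m = \pm 1, \pm 2
\end{cases}
$$

The matrix $\mathbf{X}$ with elements $x(mT - nT/2)$ is given as

$$
\mathbf{X} =
\begin{bmatrix}
\frac{1}{5} & \frac{1}{10} & \frac{1}{17} & \frac{1}{26} & \frac{1}{37} \\[0.5ex]
1 & \frac{1}{2} & \frac{1}{5} & \frac{1}{10} & \frac{1}{17} \\[0.5ex]
\frac{1}{5} & \frac{1}{2} & 1 & \frac{1}{2} & \frac{1}{5} \\[0.5ex]
\frac{1}{17} & \frac{1}{10} & \frac{1}{5} & \frac{1}{2} & 1 \\[0.5ex]
\frac{1}{37} & \frac{1}{26} & \frac{1}{17} & \frac{1}{10} & \frac{1}{5}
\end{bmatrix}
\tag{5.22}
$$

The coefficient vector $\mathbf{c}$ and the vector $\mathbf{q}$ are given as

$$
\mathbf{c} =
\begin{bmatrix} c_{-2} \\ c_{-1} \\ c_0 \\ c_1 \\ c_2 \end{bmatrix},
\qquad
\mathbf{q} =
\begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}
\tag{5.23}
$$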
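As a numerical cross-check of this example, the matrix with elements $x(mT - nT/2)$ and the vector with a single 1 at m = 0 can be built directly from the pulse and the linear system solved. This is a sketch with hypothetical function and variable names; it uses `np.linalg.solve` rather than forming the matrix inverse explicitly, which is a numerical convenience and not something the text prescribes.

```python
import numpy as np

def zero_forcing_taps(x, N, T=1.0, tau=0.5):
    """Solve Xc = q of Eq. (5.21) for the 2N+1 taps c_{-N}, ..., c_N.

    x is a callable returning the pulse x(t); X has elements x(mT - n*tau).
    """
    idx = np.arange(-N, N + 1)
    X = np.array([[x(m * T - n * tau) for n in idx] for m in idx])
    q = (idx == 0).astype(float)          # 1 at m = 0, 0 at m = +-1, ..., +-N
    return np.linalg.solve(X, q), X, q

def pulse(t, T=1.0):
    """Channel-distorted pulse of Example 5.1: x(t) = 1 / (1 + (2t/T)^2)."""
    return 1.0 / (1.0 + (2.0 * t / T) ** 2)

c, X, q = zero_forcing_taps(pulse, N=2)   # five taps, tau = T/2
print(np.round(c, 1))
```

The taps come out symmetric about the center tap, as expected from the even symmetry of the pulse, and the product `X @ c` reproduces `q` to within rounding.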

Then, the linear equations $\mathbf{X}\mathbf{c} = \mathbf{q}$ can be solved by inverting the matrix $\mathbf{X}$. Thus, we obtain

$$
\mathbf{c}_{opt} = \mathbf{X}^{-1}\mathbf{q} =
\begin{bmatrix} -2.2 \\ 4.9 \\ -3 \\ 4.9 \\ -2.2 \end{bmatrix}
\tag{5.24}
$$

One drawback to the zero-forcing equalizer is that it ignores the presence of additive noise. As a consequence, its use may result in significant noise enhancement. This is easily seen by noting that in a frequency range where $C(f)$ is small, the channel equalizer $G_E(f) = 1/C(f)$ compensates by placing a large gain in that frequency range. Consequently, the noise in that frequency range is greatly enhanced. An alternative is to relax the zero ISI condition and select the channel equalizer characteristic such that the combined power in the residual ISI and the additive noise at the output of the equalizer is minimized. A channel equalizer that is optimized based on the minimum mean square error (MMSE) criterion accomplishes the desired goal.

To elaborate, let us consider the noise corrupted output of the FIR equalizer, which is

$$
z(t) = \sum_{n=-N}^{N} c_n\, y(t - n\tau)
\tag{5.25}
$$

where $y(t)$ is the input to the equalizer, given by Eq. (5.6). The equalizer output is sampled at times t = mT. Thus, we obtain

$$
z(mT) = \sum_{n=-N}^{N} c_n\, y(mT - n\tau)
\tag{5.26}
$$

The desired response at the output of the equalizer at t = mT is the transmitted symbol $I_m$. The error is defined as the difference between $I_m$ and $z(mT)$. Then, the mean square error (MSE) between the actual output sample $z(mT)$ and the desired values $I_m$ is

$$
\begin{aligned}
\text{MSE} &= E\,\bigl| z(mT) - I_m \bigr|^2 \\
&= E\,\Bigl| \sum_{n=-N}^{N} c_n\, y(mT - n\tau) - I_m \Bigr|^2 \\
&= \sum_{n=-N}^{N} \sum_{k=-N}^{N} c_n c_k R_Y(n - k) - 2 \sum_{k=-N}^{N} c_k R_{IY}(k) + E\bigl( |I_m|^2 \bigr)
\end{aligned}
\tag{5.27}
$$

where the correlations are defined as

$$
\begin{aligned}
R_Y(n - k) &= E\bigl[ y^*(mT - n\tau)\, y(mT - k\tau) \bigr] \\
R_{IY}(k) &= E\bigl[ y(mT - k\tau)\, I_m^* \bigr]
\end{aligned}
\tag{5.28}
$$

and the expectation is taken with respect to the random information sequence $\{I_m\}$ and the additive noise. The minimum MSE solution is obtained by differentiating Eq. (5.27) with respect to the equalizer coefficients $\{c_n\}$. Thus, we obtain the necessary conditions for the minimum MSE as

$$
\sum_{n=-N}^{N} c_n R_Y(n - k) = R_{IY}(k), \qquad k = 0, \pm 1, \pm 2, \ldots, \pm N
\tag{5.29}
$$

These are the (2N + 1) linear equations for the equalizer coefficients. In contrast to the zero-forcing solution already described, these equations depend on the statistical properties (the autocorrelation) of the noise as well as the ISI through the autocorrelation $R_Y(n)$.

In practice, the autocorrelation matrix $R_Y(n)$ and the crosscorrelation vector $R_{IY}(n)$ are unknown a priori. These correlation sequences can be estimated, however, by transmitting a test signal over the channel and using the time-average estimates

$$
\begin{aligned}
\hat{R}_Y(n) &= \frac{1}{K} \sum_{k=1}^{K} y^*(kT - n\tau)\, y(kT) \\
\hat{R}_{IY}(n) &= \frac{1}{K} \sum_{k=1}^{K} y(kT - n\tau)\, I_k^*
\end{aligned}
\tag{5.30}
$$

in place of the ensemble averages to solve for the equalizer coefficients given by Eq. (5.29).
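The procedure of replacing ensemble averages by time averages over a training block and then solving the linear equations (5.29) can be sketched as follows. This is a simplified illustration under stated assumptions, not the text's implementation: signals are real valued (binary PAM), the taps are symbol spaced (τ = T), and the channel pulse samples, noise level, and all names are hypothetical.

```python
import numpy as np

def mmse_taps_from_training(y, I, N):
    """Estimate the 2N+1 MMSE taps c_{-N}..c_N from training data.

    y : received samples at the equalizer input, I : known training symbols.
    """
    y = np.asarray(y, dtype=float)
    I = np.asarray(I, dtype=float)
    ks = np.arange(N, len(y) - N)            # instants where all taps are filled
    # Row for instant k holds y_{k-n} for n = -N..N, i.e. y_{k+N}, ..., y_{k-N}.
    Y = np.array([y[k + N::-1][:2 * N + 1] for k in ks])
    R = Y.T @ Y / len(ks)                    # time-average autocorrelation estimate
    p = Y.T @ I[ks] / len(ks)                # time-average crosscorrelation estimate
    return np.linalg.solve(R, p)

rng = np.random.default_rng(0)
I = rng.choice([-1.0, 1.0], size=2000)       # training symbols
x = np.array([0.1, 1.0, 0.25])               # hypothetical pulse x_{-1}, x_0, x_1
y = np.convolve(I, x)[1:1 + len(I)] + 0.05 * rng.standard_normal(len(I))
N = 3
c = mmse_taps_from_training(y, I, N)
z = np.convolve(y, c)[N:N + len(y)]          # z_k = sum_n c_n y_{k-n}
mse_raw = np.mean((y[N:-N] - I[N:-N]) ** 2)  # error with no equalizer
mse_eq = np.mean((z[N:-N] - I[N:-N]) ** 2)   # error after equalization
```

Comparing `mse_eq` with `mse_raw` gives a quick indication of how much of the ISI the estimated taps remove on the training block.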

5.3.1

Adaptive Linear Equalizers

We have shown that the tap coefficients of a linear equalizer can be determined by solving a set of linear equations. In the zero-forcing optimization criterion, the linear equations are given by Eq. (5.21). On the other hand, if the optimization criterion is based on minimizing the MSE, the optimum equalizer coefficients are determined by solving the set of linear equations given by Eq. (5.29). In both cases, we may express the set of linear equations in the general matrix form

$$
\mathbf{B}\mathbf{c} = \mathbf{d}
\tag{5.31}
$$

where $\mathbf{B}$ is a (2N + 1) × (2N + 1) matrix, $\mathbf{c}$ is a column vector representing the 2N + 1 equalizer coefficients, and $\mathbf{d}$ a (2N + 1)-dimensional column vector. The solution of Eq. (5.31) yields

$$
\mathbf{c}_{opt} = \mathbf{B}^{-1}\mathbf{d}
\tag{5.32}
$$

In practical implementations of equalizers, the solution of Eq. (5.31) for the optimum coefficient vector is usually obtained by an iterative procedure that avoids the explicit computation of the inverse of the matrix $\mathbf{B}$. The simplest iterative procedure is the method of steepest descent, in which one begins by choosing arbitrarily the coefficient vector $\mathbf{c}$, say $\mathbf{c}_0$. This initial choice of coefficients corresponds to a point on the criterion function that is being optimized. For example, in the case of the MSE criterion, the initial guess $\mathbf{c}_0$ corresponds to a point on the quadratic MSE surface in the (2N + 1)-dimensional space of coefficients. The gradient vector, defined as $\mathbf{g}_0$, which is the derivative of the MSE with respect to the 2N + 1 filter coefficients, is then computed at this point on the criterion surface, and each tap coefficient is changed in the direction opposite to its corresponding gradient component. The change in the jth tap coefficient is proportional to the size of the jth gradient component. For example, the gradient vector denoted as $\mathbf{g}_k$, for the MSE criterion, found by taking the derivatives of the MSE with respect to each of the 2N + 1 coefficients, is

$$
\mathbf{g}_k = \mathbf{B}\mathbf{c}_k - \mathbf{d}, \qquad k = 0, 1, 2, \ldots
\tag{5.33}
$$

Then the coefficient vector $\mathbf{c}_k$ is updated according to the relation

$$
\mathbf{c}_{k+1} = \mathbf{c}_k - \Delta\, \mathbf{g}_k
\tag{5.34}
$$

where $\Delta$ is the step-size parameter for the iterative procedure. To ensure convergence of the iterative procedure, $\Delta$ is chosen to be a small positive number. In such a case, the gradient vector $\mathbf{g}_k$ converges toward zero, i.e., $\mathbf{g}_k \to \mathbf{0}$ as $k \to \infty$, and the coefficient vector $\mathbf{c}_k \to \mathbf{c}_{opt}$ as illustrated in Fig. 5.12 based on two-dimensional optimization. In general, convergence of the equalizer tap coefficients to $\mathbf{c}_{opt}$ cannot be attained in a finite number of iterations with the steepest-descent method. The optimum solution $\mathbf{c}_{opt}$, however, can be approached as closely as desired in a few hundred iterations. In digital communication systems that employ channel equalizers, each iteration corresponds to a time interval for sending one symbol and, hence, a few hundred iterations to achieve convergence to $\mathbf{c}_{opt}$ corresponds to a fraction of a second.
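The steepest-descent recursion of Eqs. (5.33) and (5.34) can be illustrated on a small system. The 2 × 2 matrix, the vector, and the step size below are hypothetical values chosen only so that the iteration converges; they do not come from the text.

```python
import numpy as np

def steepest_descent(B, d, step, iterations, c0=None):
    """Iterate c_{k+1} = c_k - step * (B c_k - d), Eqs. (5.33)-(5.34)."""
    c = np.zeros(len(d)) if c0 is None else np.array(c0, dtype=float)
    for _ in range(iterations):
        g = B @ c - d          # gradient vector g_k
        c = c - step * g       # move against the gradient
    return c

B = np.array([[1.0, 0.4],
              [0.4, 1.0]])     # hypothetical symmetric positive-definite matrix
d = np.array([0.5, 0.2])
c_hat = steepest_descent(B, d, step=0.1, iterations=300)
c_opt = np.linalg.solve(B, d)  # direct solution, for comparison only
```

After a few hundred iterations `c_hat` agrees closely with the direct solution `c_opt`, without the matrix inverse ever being formed, which mirrors the behavior described above.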

FIGURE 5.12: Examples of convergence characteristics of a gradient algorithm.

Adaptive channel equalization is required for channels whose characteristics change with time. In such a case, the ISI varies with time. The channel equalizer must track such time variations in the channel response and adapt its coefficients to reduce the ISI. In the context of the preceding discussion, the optimum coefficient vector $\mathbf{c}_{opt}$ varies with time due to time variations in the matrix $\mathbf{B}$ and, for the case of the MSE criterion, time variations in the vector $\mathbf{d}$. Under these conditions, the iterative method described can be modified to use estimates of the gradient components. Thus, the algorithm for adjusting the equalizer tap coefficients may be expressed as

$$
\hat{\mathbf{c}}_{k+1} = \hat{\mathbf{c}}_k - \Delta\, \hat{\mathbf{g}}_k
\tag{5.35}
$$

where $\hat{\mathbf{g}}_k$ denotes an estimate of the gradient vector $\mathbf{g}_k$ and $\hat{\mathbf{c}}_k$ denotes the estimate of the tap coefficient vector. In the case of the MSE criterion, the gradient vector $\mathbf{g}_k$ given by Eq. (5.33) may also be expressed as

$$
\mathbf{g}_k = -E\bigl( e_k\, \mathbf{y}_k^* \bigr)
$$

An estimate $\hat{\mathbf{g}}_k$ of the gradient vector at the kth iteration is computed as

$$
\hat{\mathbf{g}}_k = -e_k\, \mathbf{y}_k^*
\tag{5.36}
$$

where $e_k$ denotes the difference between the desired output from the equalizer at the kth time instant and the actual output $z(kT)$, and $\mathbf{y}_k$ denotes the column vector of 2N + 1 received signal values contained in the equalizer at time instant k. The error signal $e_k$ is expressed as

$$
e_k = I_k - z_k
\tag{5.37}
$$

where $z_k = z(kT)$ is the equalizer output given by Eq. (5.26) and $I_k$ is the desired symbol. Hence, by substituting Eq. (5.36) into Eq. (5.35), we obtain the adaptive algorithm for optimizing the tap coefficients (based on the MSE criterion) as

$$
\hat{\mathbf{c}}_{k+1} = \hat{\mathbf{c}}_k + \Delta\, e_k\, \mathbf{y}_k^*
\tag{5.38}
$$

Since an estimate of the gradient vector is used in Eq. (5.38), the algorithm is called a stochastic gradient algorithm; it is also known as the LMS algorithm. A block diagram of an adaptive equalizer that adapts its tap coefficients according to Eq. (5.38) is illustrated in Fig. 5.13. Note that the difference between the desired output $I_k$ and the actual output $z_k$ from the equalizer is used to form the error signal $e_k$. This error is scaled by the step-size parameter $\Delta$, and the scaled error signal $\Delta e_k$ multiplies the received signal values $\{y(kT - n\tau)\}$ at the 2N + 1 taps. The products $\Delta e_k y^*(kT - n\tau)$ at the (2N + 1) taps are then added to the previous values of the tap coefficients to obtain the updated tap coefficients, according to Eq. (5.38). This computation is repeated as each new symbol is received. Thus, the equalizer coefficients are updated at the symbol rate.
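The update of Eq. (5.38) can be sketched in a few lines. The following is a minimal training-mode illustration, assuming symbol-spaced taps (τ = T), binary PAM symbols, a hypothetical three-sample channel pulse, and a pass-through initial tap vector; the step size follows the rule of thumb that the text gives later as Eq. (5.40), with the received power estimated from the samples.

```python
import numpy as np

def lms_equalizer(y, desired, N, step):
    """Run c_{k+1} = c_k + step * e_k * conj(y_k), Eq. (5.38).

    y       : received samples at the equalizer input
    desired : training symbols I_k (or detector decisions)
    Returns the final taps and the squared-error history.
    """
    y = np.asarray(y)
    c = np.zeros(2 * N + 1, dtype=y.dtype)
    c[N] = 1.0                                # start from a pass-through filter
    sq_err = []
    for k in range(N, len(y) - N):
        yk = y[k - N:k + N + 1][::-1]         # y_{k-n} for n = -N..N
        zk = np.dot(c, yk)                    # equalizer output, Eq. (5.26)
        ek = desired[k] - zk                  # error signal, Eq. (5.37)
        c = c + step * ek * np.conj(yk)       # stochastic-gradient update
        sq_err.append(abs(ek) ** 2)
    return c, np.array(sq_err)

rng = np.random.default_rng(1)
I = rng.choice([-1.0, 1.0], size=3000)        # known training sequence
x = np.array([0.1, 1.0, 0.25])                # hypothetical channel pulse samples
y = np.convolve(I, x)[1:1 + len(I)] + 0.05 * rng.standard_normal(len(I))
N = 5                                         # 11 taps
step = 1.0 / (5 * (2 * N + 1) * np.mean(y ** 2))
c, sq_err = lms_equalizer(y, I, N, step)
mse_raw = np.mean((y - I) ** 2)               # error with no equalizer
mse_lms = sq_err[-500:].mean()                # error after adaptation
```

Plotting a running average of `sq_err` against the iteration index gives a convergence curve of the kind the text describes; replacing `desired` by detector decisions would correspond to decision-directed operation.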

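FIGURE 5.13: Linear adaptive equalizer based on the MSE criterion. The diagram shows the input $\{y_k\}$ feeding a tapped delay line with delays τ and coefficients $c_{-N}, \ldots, c_N$, the summed output $\{z_k\}$, the detector output $\{I_k\}$, and the error $\{e_k\}$ scaled by Δ.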

Initially, the adaptive equalizer is trained by the transmission of a known pseudo-random sequence $\{I_m\}$ over the channel. At the demodulator, the equalizer employs the known sequence to adjust its coefficients. Upon initial adjustment, the adaptive equalizer switches from a training mode to a decision-directed mode, in which case the decisions at the output of the detector are sufficiently reliable so that the error signal is formed by computing the difference between the detector output and the equalizer output, i.e.,

$$
e_k = \tilde{I}_k - z_k
\tag{5.39}
$$

where $\tilde{I}_k$ is the output of the detector. In general, decision errors at the output of the detector occur infrequently and, consequently, such errors have little effect on the performance of the tracking algorithm given by Eq. (5.38).

A rule of thumb for selecting the step-size parameter so as to ensure convergence and good tracking capabilities in slowly varying channels is

$$
\Delta = \frac{1}{5(2N + 1)P_R}
\tag{5.40}
$$

where $P_R$ denotes the received signal-plus-noise power, which can be estimated from the received signal (see Proakis [4]).

The convergence characteristic of the stochastic gradient algorithm in Eq. (5.38) is illustrated in Fig. 5.14. These graphs were obtained from a computer simulation of an 11-tap adaptive equalizer operating on a channel with a rather modest amount of ISI. The input signal-plus-noise power $P_R$ was normalized to unity. The rule of thumb given in Eq. (5.40) for selecting the step size gives Δ = 0.018. The effect of making Δ too large is illustrated by the large jumps in MSE as shown for Δ = 0.115. As Δ is decreased, the convergence is slowed somewhat, but a lower MSE is achieved, indicating that the estimated coefficients are closer to $\mathbf{c}_{opt}$.
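The arithmetic behind the quoted step size is short enough to reproduce. The helper below simply evaluates Eq. (5.40); its name and argument names are hypothetical.

```python
def lms_step_size(num_taps, received_power):
    """Rule of thumb of Eq. (5.40): step = 1 / (5 * (2N+1) * P_R)."""
    return 1.0 / (5.0 * num_taps * received_power)

# 11-tap equalizer with the received power normalized to unity
step = lms_step_size(num_taps=11, received_power=1.0)   # 1/55, about 0.018
```

With 2N + 1 = 11 and $P_R$ = 1 the rule gives 1/55, which rounds to the value 0.018 quoted in the text.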

FIGURE 5.14: Initial convergence characteristics of the LMS algorithm with different step sizes.

Although we have described in some detail the operation of an adaptive equalizer that is optimized on the basis of the MSE criterion, the operation of an adaptive equalizer based on the zero-forcing method is very similar. The major difference lies in the method for estimating the gradient vectors $\mathbf{g}_k$ at each iteration. A block diagram of an adaptive zero-forcing equalizer is shown in Fig. 5.15. For more details on the tap coefficient update method for a zero-forcing equalizer, the reader is referred to the papers by Lucky [2, 3], and the text by Proakis [4].

FIGURE 5.15: An adaptive zero-forcing equalizer.

5.4

Decision-Feedback Equalizer

The linear filter equalizers described in the preceding section are very effective on channels, such as wire line telephone channels, where the ISI is not severe. The severity of the ISI is directly related to the spectral characteristics and not necessarily to the time span of the ISI. For example, consider the ISI resulting from the two channels that are illustrated in Fig. 5.16. The time span for the ISI in channel A is 5 symbol intervals on each side of the desired signal component, which has a value of 0.72. On the other hand, the time span for the ISI in channel B is one symbol interval on each side of the desired signal component, which has a value of 0.815. The energy of the total response is normalized to unity for both channels.

In spite of the shorter ISI span, channel B results in more severe ISI. This is evidenced in the frequency response characteristics of these channels, which are shown in Fig. 5.17. We observe that channel B has a spectral null [the frequency response $C(f) = 0$ for some frequencies in the band |f| ≤ W] at f = 1/2T, whereas this does not occur in the case of channel A. Consequently, a linear equalizer will introduce a large gain in its frequency response to compensate for the channel null. Thus, the noise in channel B will be enhanced much more than in channel A. This implies that the performance of the linear equalizer for channel B will be significantly poorer than that for channel A. This fact is borne out by the computer simulation results for the performance of the two linear equalizers shown in Fig. 5.18. Hence, the basic limitation of a linear equalizer is that it performs poorly on channels having spectral nulls. Such channels are often encountered in radio communications, such as ionospheric transmission at frequencies below 30 MHz and mobile radio channels, such as those used for cellular radio communications.
FIGURE 5.16: Two channels with ISI.

A decision-feedback equalizer (DFE) is a nonlinear equalizer that employs previous decisions to eliminate the ISI caused by previously detected symbols on the current symbol to be detected. A simple block diagram for a DFE is shown in Fig. 5.19. The DFE consists of two filters. The first filter is called a feedforward filter and it is generally a fractionally spaced FIR filter with adjustable tap coefficients. This filter is identical in form to the linear equalizer already described. Its input is the received filtered signal $y(t)$ sampled at some rate that is a multiple of the symbol rate, e.g., at rate 2/T. The second filter is a feedback filter. It is implemented as an FIR filter with symbol-spaced taps having adjustable coefficients. Its input is the set of previously detected symbols. The output of the feedback filter is subtracted from the output of the feedforward filter to form the input to the detector. Thus, we have

$$
z_m = \sum_{n=-N_1}^{0} c_n\, y(mT - n\tau) - \sum_{n=1}^{N_2} b_n\, \tilde{I}_{m-n}
\tag{5.41}
$$

where $\{c_n\}$ and $\{b_n\}$ are the adjustable coefficients of the feedforward and feedback filters, respectively, $\tilde{I}_{m-n}$, n = 1, 2, ..., $N_2$ are the previously detected symbols, $N_1 + 1$ is the length of the feedforward filter, and $N_2$ is the length of the feedback filter. Based on the input $z_m$, the detector determines which of the possible transmitted symbols is closest in distance to $z_m$. Thus, it makes its decision and outputs $\tilde{I}_m$. What makes the DFE nonlinear is the nonlinear characteristic of the detector that provides the input to the feedback filter.

The tap coefficients of the feedforward and feedback filters are selected to optimize some desired performance measure. For mathematical simplicity, the MSE criterion is usually applied, and a stochastic gradient algorithm is commonly used to implement an adaptive DFE. Figure 5.20 illustrates the block diagram of an adaptive DFE whose tap coefficients are adjusted by means of the LMS stochastic gradient algorithm. Figure 5.21 illustrates the probability of error performance of the DFE, obtained by computer simulation, for binary PAM transmission over channel B. The gain in performance relative to that of a linear equalizer is clearly evident.

We should mention that decision errors from the detector that are fed to the feedback filter have a small effect on the performance of the DFE. In general, a small loss in performance of one to two decibels is possible at error rates below $10^{-2}$, as illustrated in Fig. 5.21, but the decision errors in the feedback filters are not catastrophic.

FIGURE 5.17: Amplitude spectra for (a) channel A shown in Fig. 5.16(a) and (b) channel B shown in Fig. 5.16(b).


FIGURE 5.18: Error-rate performance of linear MSE equalizer.

FIGURE 5.19: Block diagram of DFE.
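The detector input of Eq. (5.41) and the feedback of past decisions can be sketched as below. This is a simplified, non-adaptive illustration: the feedforward filter is taken to be symbol spaced (τ = T) rather than fractionally spaced, the tap values are fixed by hand, and the channel with two strong trailing echoes is a hypothetical example chosen so that a plain threshold detector fails while the DFE does not.

```python
import numpy as np

def dfe_detect(y, c_ff, b_fb, levels):
    """Symbol-by-symbol DFE following Eq. (5.41), with tau = T assumed.

    c_ff   : feedforward taps c_{-N1}, ..., c_0 (acting on y_{m+N1}, ..., y_m)
    b_fb   : feedback taps b_1, ..., b_{N2} (acting on past decisions)
    levels : candidate symbol values for the detector
    """
    y = np.asarray(y, dtype=float)
    levels = np.asarray(levels, dtype=float)
    N1, N2 = len(c_ff) - 1, len(b_fb)
    decisions = np.zeros(len(y))
    for m in range(len(y) - N1):
        ff = np.dot(c_ff, y[m:m + N1 + 1][::-1])          # feedforward sum
        past = [decisions[m - n] if m - n >= 0 else 0.0   # earlier decisions
                for n in range(1, N2 + 1)]
        z = ff - np.dot(b_fb, past)                       # detector input z_m
        decisions[m] = levels[np.argmin(np.abs(levels - z))]  # nearest symbol
    return decisions[:len(y) - N1]

rng = np.random.default_rng(3)
I = rng.choice([-1.0, 1.0], size=200)
y = np.convolve(I, [1.0, 0.8, 0.5])[:len(I)]   # hypothetical noiseless channel
decisions = dfe_detect(y, c_ff=[1.0], b_fb=[0.8, 0.5], levels=[-1.0, 1.0])
slicer_only = np.sign(y)                        # threshold detector, no equalizer
```

In this noiseless setting the feedback filter cancels the two trailing echoes exactly as long as the past decisions are correct, so `decisions` matches `I`, whereas `slicer_only` contains errors because the echoes together exceed the main tap.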

5.5

Maximum-Likelihood Sequence Detection

Although the DFE outperforms a linear equalizer, it is not the optimum equalizer from the viewpoint of minimizing the probability of error in the detection of the information sequence $\{I_k\}$ from the received signal samples $\{y_k\}$ given in Eq. (5.7). In a digital communication system that transmits information over a channel that causes ISI, the optimum detector is a maximum-likelihood symbol sequence detector which produces at its output the most probable symbol sequence $\{\tilde{I}_k\}$ for the given received sampled sequence $\{y_k\}$. That is, the detector finds the sequence $\{\tilde{I}_k\}$ that maximizes the likelihood function

$$
\Lambda\bigl(\{I_k\}\bigr) = \ln p\bigl(\{y_k\} \mid \{I_k\}\bigr)
\tag{5.42}
$$

where $p(\{y_k\} \mid \{I_k\})$ is the joint probability of the received sequence $\{y_k\}$ conditioned on $\{I_k\}$. The detector that outputs the sequence of symbols $\{\tilde{I}_k\}$ maximizing this joint conditional probability is called the maximum-likelihood sequence detector.

FIGURE 5.20: Adaptive DFE.

An algorithm that implements maximum-likelihood sequence detection (MLSD) is the Viterbi algorithm, which was originally devised for decoding convolutional codes. For a description of this algorithm in the context of sequence detection in the presence of ISI, the reader is referred to the paper by Forney [1] and the text by Proakis [4].

The major drawback of MLSD for channels with ISI is the exponential behavior in computational complexity as a function of the span of the ISI. Consequently, MLSD is practical only for channels where the ISI spans only a few symbols and the ISI is severe, in the sense that it causes a severe degradation in the performance of a linear equalizer or a decision-feedback equalizer. For example, Fig. 5.22 illustrates the error probability performance of the Viterbi algorithm for a binary PAM signal transmitted through channel B (see Fig. 5.16). For purposes of comparison, we also illustrate the probability of error for a DFE. Both results were obtained by computer simulation. We observe that the performance of the maximum likelihood sequence detector is about 4.5 dB better than that of the DFE at an error probability of $10^{-4}$. Hence, this is one example where the ML sequence detector provides a significant performance gain on a channel with a relatively short ISI span.
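A compact sketch of sequence detection with the Viterbi algorithm is given below. It rests on assumptions that the text does not spell out: the received samples follow a real-valued discrete-time model $y_k = \sum_j f_j I_{k-j} + \text{noise}$ with white Gaussian noise, so that maximizing the likelihood is equivalent to minimizing the accumulated squared Euclidean distance. The taps (0.407, 0.815, 0.407) are an assumed set that is consistent with the center value 0.815 and unit energy quoted for channel B; the side values are not given in the text.

```python
import numpy as np

def viterbi_mlsd(y, f, levels=(-1.0, 1.0)):
    """Find the symbol sequence minimizing sum_k (y_k - sum_j f_j I_{k-j})^2.

    f holds the L+1 taps of the assumed channel model; symbols before
    k = 0 are taken to be zero.
    """
    L = len(f) - 1
    # A trellis state is the tuple (I_{k-1}, ..., I_{k-L}). For each state
    # keep only the best (metric, path) pair, i.e. the survivor.
    survivors = {(0.0,) * L: (0.0, [])}
    for yk in y:
        updated = {}
        for state, (metric, path) in survivors.items():
            for sym in levels:
                predicted = f[0] * sym + sum(fj * s for fj, s in zip(f[1:], state))
                new_metric = metric + (yk - predicted) ** 2
                next_state = ((sym,) + state)[:L]
                if next_state not in updated or new_metric < updated[next_state][0]:
                    updated[next_state] = (new_metric, path + [sym])
        survivors = updated
    best_metric, best_path = min(survivors.values(), key=lambda entry: entry[0])
    return np.array(best_path)

rng = np.random.default_rng(2)
f = np.array([0.407, 0.815, 0.407])        # assumed taps for channel B
I = rng.choice([-1.0, 1.0], size=200)      # binary PAM symbols
y_clean = np.convolve(I, f)[:len(I)]       # noiseless received samples
detected = viterbi_mlsd(y_clean, f)
```

The number of states is $M^L$ for M symbol levels and an ISI span of L symbols, which is the exponential growth in complexity noted above; with L = 2 and binary symbols there are only four states per stage.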

5.6 Conclusions

Channel equalizers are widely used in digital communication systems to mitigate the effects of ISI caused by channel distortion. Linear equalizers are widely used for high-speed modems that transmit data over telephone channels. For wireless (radio) transmission, such as in mobile cellular communications and interoffice communications, the multipath propagation of the transmitted signal results in severe ISI. Such channels require more powerful equalizers to combat the severe ISI. The decision-feedback equalizer and the MLSD are two nonlinear channel equalizers that are suitable for radio channels with severe ISI.

FIGURE 5.21: Performance of DFE with and without error propagation.

FIGURE 5.22: Comparison of performance between MLSE and decision-feedback equalization for channel B of Fig. 5.16.

Defining Terms

Adaptive equalizer: A channel equalizer whose parameters are updated automatically and adaptively during transmission of data.

Channel equalizer: A device that is used to reduce the effects of channel distortion in a received signal.

Decision-directed mode: Mode for adjustment of the equalizer coefficients adaptively based on the use of the detected symbols at the output of the detector.

Decision-feedback equalizer (DFE): An adaptive equalizer that consists of a feedforward filter and a feedback filter, where the latter is fed with previously detected symbols that are used to eliminate the intersymbol interference due to the tail in the channel impulse response.

Fractionally spaced equalizer: A tapped-delay line channel equalizer in which the delay between adjacent taps is less than the duration of a transmitted symbol.

Intersymbol interference: Interference in a received symbol from adjacent (nearby) transmitted symbols caused by channel distortion in data transmission.

LMS algorithm: See stochastic gradient algorithm.

Maximum-likelihood sequence detector: A detector for estimating the most probable sequence of data symbols by maximizing the likelihood function of the received signal.

Preset equalizer: A channel equalizer whose parameters are fixed (time-invariant) during transmission of data.

Stochastic gradient algorithm: An algorithm for adaptively adjusting the coefficients of an equalizer based on the use of (noise-corrupted) estimates of the gradients.

Symbol-spaced equalizer: A tapped-delay line channel equalizer in which the delay between adjacent taps is equal to the duration of a transmitted symbol.

Training mode: Mode for adjustment of the equalizer coefficients based on the transmission of a known sequence of transmitted symbols.

Zero-forcing equalizer: A channel equalizer whose parameters are adjusted to completely eliminate intersymbol interference in a sequence of transmitted data symbols.

References

[1] Forney, G.D., Jr., Maximum-likelihood sequence estimation of digital sequences in the presence of intersymbol interference. IEEE Trans. Inform. Theory, IT-18, 363–378, May 1972.

[2] Lucky, R.W., Automatic equalization for digital communications. Bell Syst. Tech. J., 44, 547–588, Apr. 1965.

[3] Lucky, R.W., Techniques for adaptive equalization of digital communication. Bell Syst. Tech. J., 45, 255–286, Feb. 1966.

[4] Proakis, J.G., Digital Communications, 3rd ed., McGraw-Hill, New York, 1995.

Further Information

For a comprehensive treatment of adaptive equalization techniques and their performance characteristics, the reader may refer to the book by Proakis [4]. The two papers by Lucky [2, 3] provide a treatment of linear equalizers based on the zero-forcing criterion. Additional information on decision-feedback equalizers may be found in the journal papers “An Adaptive Decision-Feedback Equalizer” by D.A. George, R.R. Bowen, and J.R. Storey, IEEE Transactions on Communications Technology, Vol. COM-19, pp. 281–293, June 1971, and “Feedback Equalization for Fading Dispersive Channels” by P. Monsen, IEEE Transactions on Information Theory, Vol. IT-17, pp. 56–64, January 1971. A thorough treatment of channel equalization based on maximum-likelihood sequence detection is given in the paper by Forney [1].


LoCicero, J.L. & Patel, B.P. “Line Coding” Mobile Communications Handbook Ed. Jerry D. Gibson Boca Raton: CRC Press LLC, 1999


Line Coding

Joseph L. LoCicero
Illinois Institute of Technology

Bhasker P. Patel
Illinois Institute of Technology

6.1 Introduction
6.2 Common Line Coding Formats
Unipolar NRZ (Binary On-Off Keying) • Unipolar RZ • Polar NRZ • Polar RZ [Bipolar, Alternate Mark Inversion (AMI), or Pseudoternary] • Manchester Coding (Split Phase or Digital Biphase)
6.3 Alternate Line Codes
Delay Modulation (Miller Code) • Split Phase (Mark) • Biphase (Mark) • Code Mark Inversion (CMI) • NRZ (I) • Binary N Zero Substitution (BNZS) • High-Density Bipolar N (HDBN) • Ternary Coding
6.4 Multilevel Signalling, Partial Response Signalling, and Duobinary Coding
Multilevel Signalling • Partial Response Signalling and Duobinary Coding
6.5 Bandwidth Comparison
6.6 Concluding Remarks
Defining Terms
References

6.1 Introduction

The terminology line coding originated in telephony with the need to transmit digital information across a copper telephone line; more specifically, binary data over a digital repeatered line. The concept of line coding, however, readily applies to any transmission line or channel. In a digital communication system, there exists a known set of symbols to be transmitted. These can be designated as $\{m_i\}$, i = 1, 2, . . . , N, with a probability of occurrence $\{p_i\}$, i = 1, 2, . . . , N, where the sequentially transmitted symbols are generally assumed to be statistically independent. The conversion or coding of these abstract symbols into real, temporal waveforms to be transmitted in baseband is the process of line coding. Since the most common type of line coding is for binary data, such a waveform can be succinctly termed a direct format for serial bits. This section concentrates on line coding for binary data.

Different channel characteristics, as well as different applications and performance requirements, have provided the impetus for the development and study of various types of line coding [1, 2]. For example, the channel might be ac coupled and, thus, could not support a line code with a dc component or large dc content. Synchronization or timing recovery requirements might necessitate a discrete component at the data rate. The channel bandwidth and crosstalk limitations might dictate the type of line coding employed. Even such factors as the complexity of the encoder and the economy of the decoder could determine the line code chosen. Each line code has its own distinct properties. Depending on the application, one property may be more important than another. In what follows, we describe, in general, the most desirable features that are considered when choosing a line code.

It is commonly accepted [1, 2, 5, 8] that the dominant considerations affecting the choice of a line code are: 1) timing, 2) dc content, 3) power spectrum, 4) performance monitoring, 5) probability of error, and 6) transparency. Each of these is detailed in the following paragraphs.

1) Timing: The waveform produced by a line code should contain enough timing information that the receiver can synchronize with the transmitter and decode the received signal properly. The timing content should be relatively independent of source statistics, i.e., a long string of 1s or 0s should not result in loss of timing or jitter at the receiver.

2) DC content: Since the repeaters used in telephony are ac coupled, it is desirable to have zero dc in the waveform produced by a given line code. If a signal with significant dc content is used in ac coupled lines, it will cause dc wander in the received waveform. That is, the received signal baseline will vary with time. Telephone lines do not pass dc due to ac coupling with transformers and capacitors to eliminate dc ground loops. Because of this, the telephone channel causes a droop in constant signals. This causes dc wander. It can be eliminated by dc restoration circuits, feedback systems, or with specially designed line codes.

3) Power spectrum: The power spectrum and bandwidth of the transmitted signal should be matched to the frequency response of the channel to avoid significant distortion. Also, the power spectrum should be such that most of the energy is contained in as small a bandwidth as possible. The smaller the bandwidth, the higher the transmission efficiency.

4) Performance monitoring: It is very desirable to detect errors caused by a noisy transmission channel. The error detection capability in turn allows performance monitoring while the channel is in use (i.e., without elaborate testing procedures that require suspending use of the channel).

5) Probability of error: The average error probability should be as small as possible for a given transmitter power. This reflects the reliability of the line code.

6) Transparency: A line code should allow all the possible patterns of 1s and 0s. If a certain pattern is undesirable due to other considerations, it should be mapped to a unique alternative pattern.

6.2 Common Line Coding Formats

A line coding format consists of a formal definition of the line code that specifies how a string of binary digits is converted to a line code waveform. There are two major classes of binary line codes: level codes and transition codes. Level codes carry information in their voltage level, which may be high or low for a full bit period or part of the bit period. Level codes are usually instantaneous since they typically encode a binary digit into a distinct waveform, independent of any past binary data. However, some level codes do exhibit memory. Transition codes carry information in the change in level appearing in the line code waveform. Transition codes may be instantaneous, but they generally have memory, using past binary data to dictate the present waveform.

There are two common forms of level line codes: one is called return to zero (RZ) and the other is called nonreturn to zero (NRZ). In RZ coding, the level of the pulse returns to zero for a portion of the bit interval. In NRZ coding, the level of the pulse is maintained during the entire bit interval.

Line coding formats are further classified according to the polarity of the voltage levels used to represent the data. If only one polarity of voltage level is used, i.e., positive or negative (in addition to the zero level), then it is called unipolar signalling. If both positive and negative voltage levels are used, with or without a zero voltage level, then it is called polar signalling. The term bipolar signalling is used by some authors to designate a specific line coding scheme with positive, negative, and zero voltage levels. This will be described in detail later in this section.

The formal definition of five common line codes is given in the following along with a representative waveform, the power spectral density (PSD), the probability of error, and a discussion of advantages and disadvantages. In some cases specific applications are noted.

6.2.1 Unipolar NRZ (Binary On-Off Keying)

In this line code, a binary 1 is represented by a nonzero voltage level and a binary 0 is represented by a zero voltage level as shown in Fig. 6.1(a). This is an instantaneous level code. The PSD of this code with equally likely 1s and 0s is given by [5, 8]

$$S_1(f) = \frac{V^2 T}{4}\left(\frac{\sin \pi f T}{\pi f T}\right)^2 + \frac{V^2}{4}\,\delta(f) \tag{6.1}$$

where V is the binary 1 voltage level, T = 1/R is the bit duration, and R is the bit rate in bits per second. The spectrum of unipolar NRZ is plotted in Fig. 6.2a. This PSD is a two-sided even spectrum, although only half of the plot is shown for efficiency of presentation. If the probability of a binary 1 is p, and the probability of a binary 0 is (1 − p), then the PSD of this code, in the most general case, is $4p(1 - p)\,S_1(f)$. Considering the frequency of the first spectral null as the bandwidth of the waveform, the bandwidth of unipolar NRZ is R in hertz. The error rate performance of this code, for equally likely data, with additive white Gaussian noise (AWGN) and optimum, i.e., matched filter, detection is given by [1, 5]

$$P_e = \frac{1}{2}\,\mathrm{erfc}\left(\sqrt{\frac{E_b}{2N_0}}\right) \tag{6.2}$$

where $E_b/N_0$ is a measure of the signal-to-noise ratio (SNR) of the received signal. In general, $E_b$ is the energy per bit and $N_0/2$ is the two-sided PSD of the AWGN. More specifically, for unipolar NRZ, $E_b$ is the energy in a binary 1, which is $V^2 T$. The performance of the unipolar NRZ code is plotted in Fig. 6.3.

The principal advantages of unipolar NRZ are ease of generation, since it requires only a single power supply, and a relatively low bandwidth of R Hz. There are quite a few disadvantages of this line code. A loss of synchronization and timing jitter can result with a long sequence of 1s or 0s because no pulse transition is present. The code has no error detection capability and, hence, performance cannot be monitored. There is a significant dc component as well as a dc content. The error rate performance is not as good as that of polar line codes.
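As a quick numerical check of the first-null bandwidth, the continuous part of Eq. (6.1) can be evaluated directly. The sketch below is only illustrative; the function name and default values are assumptions, and the impulse at f = 0 (weight $V^2/4$) is not represented.

```python
import math

def unipolar_nrz_psd(f, V=1.0, T=1.0):
    """Continuous part of Eq. (6.1): (V^2 T / 4) * (sin(pi f T) / (pi f T))^2.

    The dc impulse (V^2 / 4) * delta(f) is omitted because it cannot be
    sampled as an ordinary function value.
    """
    x = math.pi * f * T
    sinc = 1.0 if x == 0.0 else math.sin(x) / x   # limit of sin(x)/x at x = 0
    return (V ** 2) * T / 4.0 * sinc ** 2

# With T = 1 (so R = 1), the spectrum peaks at f = 0 and is numerically
# zero at f = R, the first spectral null.
peak = unipolar_nrz_psd(0.0)
at_first_null = unipolar_nrz_psd(1.0)
```

Evaluating the function on a grid of frequencies between 0 and 2R gives the shape of the unipolar NRZ curve described for Fig. 6.2a, up to the missing impulse.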

6.2.2 Unipolar RZ

FIGURE 6.1: Waveforms for different line codes. Bit sequence shown: 1 0 1 1 0 0 0 1 1 1 0; time axis marked from T to 11T. Panels: (a) Unipolar NRZ, (b) Unipolar RZ, (c) Polar NRZ, (d) Bipolar (AMI), (e) Manchester (Bi-phase), (f) Delay Modulation, (g) Split Phase (Mark), (h) Split Phase (Space), (i) Bi-Phase (Mark), (j) Bi-Phase (Space), (k) Code Mark Inversion, (l) NRZ (M), (m) NRZ (S).

FIGURE 6.2a: Power spectral density of different line codes, where R = 1/T is the bit rate.

In this line code, a binary 1 is represented by a nonzero voltage level during a portion of the bit duration, usually for half of the bit period, and a zero voltage level for the rest of the bit duration. A binary 0 is represented by a zero voltage level during the entire bit duration. Thus, this is an instantaneous level code. Figure 6.1(b) illustrates a unipolar RZ waveform in which the 1 is represented by a nonzero voltage level for half the bit period. The PSD of this line code, with equally likely binary digits, is given by [5, 6, 8]

$$S_2(f) = \frac{V^2 T}{16}\left(\frac{\sin(\pi f T/2)}{\pi f T/2}\right)^2 + \frac{V^2}{4\pi^2}\left[\frac{\pi^2}{4}\,\delta(f) + \sum_{n=-\infty}^{\infty} \frac{1}{(2n+1)^2}\,\delta\bigl(f - (2n+1)R\bigr)\right] \tag{6.3}$$

where again V is the binary 1 voltage level, and T = 1/R is the bit period. The spectrum of this code is drawn in Fig. 6.2a. In the most general case, when the probability of a 1 is p, the continuous portion of the PSD in Eq. (6.3) is scaled by the factor $4p(1 - p)$ and the discrete portion is scaled by the factor $4p^2$. The first null bandwidth of unipolar RZ is 2R Hz. The error rate performance of this line code is the same as that of the unipolar NRZ provided we increase the voltage level of this code such that the energy in binary 1, $E_b$, is the same for both codes. The probability of error is given by Eq. (6.2) and identified in Fig. 6.3. If the voltage level and bit period are the same for unipolar NRZ and unipolar RZ, then the energy in a binary 1 for unipolar RZ will be $V^2 T/2$ and the probability of error is worse by 3 dB.

The main advantages of unipolar RZ are, again, ease of generation, since it requires a single power supply, and the presence of a discrete spectral component at the symbol rate, which allows simple timing recovery. A number of disadvantages exist for this line code. It has a nonzero dc component and nonzero dc content, which can lead to dc wander. A long string of 0s will lack pulse transitions and could lead to loss of synchronization. There is no error detection capability and, hence, performance monitoring is not possible. The bandwidth requirement (2R Hz) is higher than that of NRZ signals. The error rate performance is worse than that of polar line codes.

Unipolar NRZ as well as unipolar RZ are examples of pulse/no-pulse type of signalling. In this type of signalling, the pulse for a binary 0, $g_2(t)$, is zero and the pulse for a binary 1 is specified generically as $g_1(t) = g(t)$. Using G(f) as the Fourier transform of g(t), the PSD of pulse/no-pulse signalling is given as [6, 7, 10]

$$S_{PNP}(f) = p(1 - p)R\,|G(f)|^2 + p^2 R^2 \sum_{n=-\infty}^{\infty} |G(nR)|^2\,\delta(f - nR) \tag{6.4}$$

where p is the probability of a binary 1, and R is the bit rate.

FIGURE 6.2b: Power spectral density of different line codes, where R = 1/T is the bit rate.
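As a consistency check, Eq. (6.4) can be specialized to the unipolar RZ case of Eq. (6.3). The steps below are a sketch under the stated assumptions: a rectangular pulse of amplitude V and width T/2, equally likely bits (p = 1/2), and R = 1/T. Only the magnitude of G(f) is needed, so the linear phase factor is omitted.

$$|G(f)| = \frac{VT}{2}\left|\frac{\sin(\pi f T/2)}{\pi f T/2}\right|$$

Continuous part:

$$p(1-p)R\,|G(f)|^2 = \frac{1}{4}\cdot\frac{1}{T}\cdot\frac{V^2T^2}{4}\left(\frac{\sin(\pi f T/2)}{\pi f T/2}\right)^2 = \frac{V^2T}{16}\left(\frac{\sin(\pi f T/2)}{\pi f T/2}\right)^2$$

Discrete part, evaluated at f = nR:

$$p^2R^2\,|G(nR)|^2 = \frac{V^2}{16}\left(\frac{\sin(n\pi/2)}{n\pi/2}\right)^2 = \begin{cases} \dfrac{V^2}{16}, & n = 0 \\[2mm] \dfrac{V^2}{4\pi^2 n^2}, & n \text{ odd} \\[2mm] 0, & n \text{ even},\ n \neq 0 \end{cases}$$

Since $\frac{V^2}{4\pi^2}\cdot\frac{\pi^2}{4} = \frac{V^2}{16}$, these weights agree term by term with the dc impulse and the odd-harmonic impulses in Eq. (6.3).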

6.2.3 Polar NRZ

In this line code, a binary 1 is represented by a positive voltage +V and a binary 0 is represented by a negative voltage −V over the full bit period. This code is also referred to as NRZ (L), since a bit is represented by maintaining a level (L) during its entire period. A polar NRZ waveform is shown in Fig. 6.1(c). This is again an instantaneous level code. Alternatively, a 1 may be represented by a −V voltage level and a 0 by a +V voltage level, without changing the spectral characteristics and performance of the line code. The PSD of this line code with equally likely bits is given by [5, 8]

$$S_3(f) = V^2 T\left(\frac{\sin \pi f T}{\pi f T}\right)^2 \tag{6.5}$$

This is plotted in Fig. 6.2b. When the probability of a 1 is p, and p is not 0.5, a dc component exists, and the PSD becomes [10]

$$S_{3p}(f) = 4V^2 T\,p(1 - p)\left(\frac{\sin \pi f T}{\pi f T}\right)^2 + V^2(1 - 2p)^2\,\delta(f) \tag{6.6}$$

The first null bandwidth for this line code is again R Hz, independent of p. The probability of error of this line code when p = 0.5 is given by [1, 5]

$$P_e = \frac{1}{2}\,\mathrm{erfc}\left(\sqrt{\frac{E_b}{N_0}}\right) \tag{6.7}$$

The performance of polar NRZ is plotted in Fig. 6.3. This is better than the error performance of the unipolar codes by 3 dB.

The advantages of polar NRZ include a low-bandwidth requirement, R Hz, comparable to unipolar NRZ, very good error probability, and greatly reduced dc because the waveform has a zero dc component when p = 0.5 even though the dc content is never zero. A few notable disadvantages are that there is no error detection capability, and that a long string of 1s or 0s could result in loss of synchronization, since there are no transitions during the string duration. Two power supplies are required to generate this code.

FIGURE 6.3: Bit error probability for different line codes.
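The 3 dB difference between Eq. (6.2) and Eq. (6.7) can be seen numerically. The sketch below simply evaluates the two expressions as written, taking $E_b/N_0$ as a linear ratio; the function names and the sample SNR values are illustrative choices, not part of the original text.

```python
import math

def pe_unipolar(ebn0):
    """Eq. (6.2): Pe = 0.5 * erfc(sqrt(Eb / (2 N0))); ebn0 is a linear ratio."""
    return 0.5 * math.erfc(math.sqrt(ebn0 / 2.0))

def pe_polar(ebn0):
    """Eq. (6.7): Pe = 0.5 * erfc(sqrt(Eb / N0)); ebn0 is a linear ratio."""
    return 0.5 * math.erfc(math.sqrt(ebn0))

def db_to_linear(db):
    """Convert a power ratio in dB to a linear ratio."""
    return 10.0 ** (db / 10.0)

# Tabulate both expressions at a few illustrative SNR values.
for snr_db in (4.0, 8.0, 12.0):
    x = db_to_linear(snr_db)
    print(f"{snr_db:4.1f} dB  unipolar {pe_unipolar(x):.3e}  polar {pe_polar(x):.3e}")
```

Because the two arguments differ only by a factor of 2 in $E_b/N_0$, `pe_polar(x)` equals `pe_unipolar(2 * x)` for any x, and a factor of 2 in power is $10\log_{10} 2 \approx 3$ dB.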

6.2.4 Polar RZ [Bipolar, Alternate Mark Inversion (AMI), or Pseudoternary]

In this scheme, a binary 1 is represented by alternating the positive and negative voltage levels, which return to zero for a portion of the bit duration, generally half the bit period. A binary 0 is represented by a zero voltage level during the entire bit duration. This line coding scheme is often called alternate mark inversion (AMI) since 1s (marks) are represented by alternating positive and negative pulses. It is also called pseudoternary since three different voltage levels are used to represent binary data. Some authors designate this line code as bipolar RZ (BRZ). An AMI waveform is shown in Fig. 6.1(d). Note that this is a level code with memory. The AMI code is well known for its use in telephony. The PSD of this line code with memory is given by [1, 2, 7]

$$S_{4p}(f) = 2p(1 - p)R\,|G(f)|^2\,\frac{1 - \cos 2\pi f T}{1 + (2p - 1)^2 + 2(2p - 1)\cos 2\pi f T} \tag{6.8}$$

where G(f) is the Fourier transform of the pulse used to represent a binary 1, and p is the probability of a binary 1. When p = 0.5 and square pulses with amplitude ±V and duration T/2 are used to represent binary 1s, the PSD becomes

$$S_4(f) = \frac{V^2 T}{4}\left(\frac{\sin(\pi f T/2)}{\pi f T/2}\right)^2 \sin^2(\pi f T) \tag{6.9}$$

This PSD is plotted in Fig. 6.2a. The first null bandwidth of this waveform is R Hz. This is true for RZ rectangular pulses, independent of the value of p in Eq. (6.8). The error rate performance of this line code for equally likely binary data is given by [5]

$$P_e \approx \frac{3}{4}\,\mathrm{erfc}\left(\sqrt{\frac{E_b}{2N_0}}\right), \qquad E_b/N_0 > 2 \tag{6.10}$$

This curve is plotted in Fig. 6.3 and is seen to be no more than 0.5 dB worse than the unipolar codes.

The advantages of polar RZ (or AMI, as it is most commonly called) outweigh the disadvantages. This code has no dc component and zero dc content, completely avoiding the dc wander problem. Timing recovery is rather easy since squaring, or full-wave rectifying, this type of signal yields a unipolar RZ waveform with a discrete component at the bit rate, R Hz. Because of the alternating polarity pulses for binary 1s, this code has error detection and, hence, performance monitoring capability. It has a low-bandwidth requirement, R Hz, comparable to unipolar NRZ. The obvious disadvantage is that the error rate performance is worse than that of the unipolar and polar waveforms. A long string of 0s could result in loss of synchronization, and two power supplies are required for this code.
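The alternating-polarity rule, and the way it enables error monitoring, can be sketched as follows. This is an illustrative model only: each bit is mapped to a single polarity value (+1, −1, or 0) rather than to a sampled RZ pulse, and the function names and the `first_polarity` parameter are assumptions.

```python
def ami_encode(bits, first_polarity=+1):
    """Map bits to pulse polarities: 0 -> 0, 1 -> alternating +1 / -1."""
    out, polarity = [], first_polarity
    for b in bits:
        if b:
            out.append(polarity)
            polarity = -polarity          # next mark uses the opposite polarity
        else:
            out.append(0)
    return out

def ami_violations(pulses):
    """Return indices where a nonzero pulse repeats the previous pulse polarity."""
    last, bad = 0, []
    for i, p in enumerate(pulses):
        if p != 0:
            if p == last:                 # two like-polarity pulses in a row
                bad.append(i)
            last = p
    return bad
```

A correctly encoded sequence produces an empty violation list. If channel noise turns a zero into a pulse, or inverts a pulse, at least one pair of consecutive pulses ends up with the same polarity, which `ami_violations` reports; this is the performance monitoring capability mentioned above.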

6.2.5 Manchester Coding (Split Phase or Digital Biphase)

In this coding, a binary 1 is represented by a pulse that has positive voltage during the first half of the bit duration and negative voltage during the second half of the bit duration. A binary 0 is represented by a pulse that is negative during the first half of the bit duration and positive during the second half of the bit duration. The negative or positive midbit transition indicates a binary 1 or binary 0, respectively. Thus, a Manchester code is classified as an instantaneous transition code; it has no memory. The code is also called diphase because a square wave with a 0° phase is used to represent a binary 1 and a square wave with a phase of 180° is used to represent a binary 0, or vice versa. This line code is used in Ethernet local area networks (LANs). The waveform for Manchester coding is shown in Fig. 6.1(e). The PSD of a Manchester waveform with equally likely bits is given by [5, 8]

$$S_5(f) = V^2 T\left(\frac{\sin(\pi f T/2)}{\pi f T/2}\right)^2 \sin^2(\pi f T/2) \tag{6.11}$$

where ±V are used as the positive/negative voltage levels for this code. Its spectrum is plotted in Fig. 6.2b. When the probability p of a binary 1 is not equal to one-half, the continuous portion of the PSD is reduced in amplitude and discrete components appear at integer multiples of the bit rate, R = 1/T. The resulting PSD is [6, 10]

$$S_{5p}(f) = V^2 T\,4p(1 - p)\left(\frac{\sin(\pi f T/2)}{\pi f T/2}\right)^2 \sin^2\left(\frac{\pi f T}{2}\right) + V^2(1 - 2p)^2 \sum_{\substack{n=-\infty \\ n \neq 0}}^{\infty} \left(\frac{2}{n\pi}\right)^2 \delta(f - nR) \tag{6.12}$$

The first null bandwidth of the waveform generated by a Manchester code is 2R Hz. The error rate performance of this waveform when p = 0.5 is the same as that of polar NRZ, given by Eq. (6.7), and plotted in Fig. 6.3.

The advantages of this code include a zero dc content on an individual pulse basis, so no pattern of bits can cause dc buildup; midbit transitions are always present, making it easy to extract timing information; and it has good error rate performance, identical to polar NRZ. The main disadvantage of this code is a larger bandwidth than any of the other common codes. Also, it has no error detection capability and, hence, performance monitoring is not possible.

Polar NRZ and Manchester coding are examples of the use of pure polar signalling where the pulse for a binary 0, $g_2(t)$, is the negative of the pulse for a binary 1, i.e., $g_2(t) = -g_1(t)$. This is also referred to as an antipodal signal set. For this broad type of polar binary line code, the PSD is given by [10]

$$S_{BP}(f) = 4p(1 - p)R\,|G(f)|^2 + (2p - 1)^2 R^2 \sum_{n=-\infty}^{\infty} |G(nR)|^2\,\delta(f - nR) \tag{6.13}$$

where |G(f)| is the magnitude of the Fourier transform of either $g_1(t)$ or $g_2(t)$.

A further generalization of the PSD of binary line codes can be given, wherein a continuous spectrum and a discrete spectrum are evident. Let a binary 1, with probability p, be represented by $g_1(t)$ over the T = 1/R second bit interval; and let a binary 0, with probability 1 − p, be represented by $g_2(t)$ over the same T second bit interval. The two-sided PSD for this general binary line code is [10]

$$S_{GB}(f) = p(1 - p)R\,|G_1(f) - G_2(f)|^2 + R^2 \sum_{n=-\infty}^{\infty} |pG_1(nR) + (1 - p)G_2(nR)|^2\,\delta(f - nR) \tag{6.14}$$

where the Fourier transforms of $g_1(t)$ and $g_2(t)$ are given by $G_1(f)$ and $G_2(f)$, respectively.
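The two special cases given earlier follow from Eq. (6.14) by direct substitution. The short derivation below is a sketch that uses only the definitions already stated, with $G(f)$ denoting the transform of $g_1(t)$.

Pulse/no-pulse signalling, $g_2(t) = 0$, so $G_2(f) = 0$ and $G_1(f) = G(f)$:

$$S_{GB}(f) = p(1-p)R\,|G(f)|^2 + R^2 \sum_{n=-\infty}^{\infty} p^2\,|G(nR)|^2\,\delta(f - nR) = S_{PNP}(f)$$

which is Eq. (6.4).

Antipodal signalling, $g_2(t) = -g_1(t)$, so $G_1(f) - G_2(f) = 2G(f)$ and $pG_1(nR) + (1-p)G_2(nR) = (2p - 1)G(nR)$:

$$S_{GB}(f) = 4p(1-p)R\,|G(f)|^2 + (2p-1)^2 R^2 \sum_{n=-\infty}^{\infty} |G(nR)|^2\,\delta(f - nR) = S_{BP}(f)$$

which is Eq. (6.13). In the antipodal case the discrete lines vanish when p = 1/2, consistent with Eqs. (6.5) and (6.11) containing no impulses.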

6.3 Alternate Line Codes

Most of the line codes discussed thus far were instantaneous level codes. Only AMI had memory, and Manchester was an instantaneous transition code. The alternate line codes presented in this section all have memory. The first four are transition codes, where binary data is represented as the presence or absence of a transition, or by the direction of transition, i.e., positive to negative or vice versa. The last four codes described in this section are level line codes with memory.

6.3.1 Delay Modulation (Miller Code)

In this line code, a binary 1 is represented by a transition at the midbit position, and a binary 0 is represented by no transition at the midbit position. If a 0 is followed by another 0, however, the signal transition also occurs at the end of the bit interval, that is, between the two 0s. An example of delay modulation is shown in Fig. 6.1(f). It is clear that delay modulation is a transition code with memory. This code achieves the goal of providing good timing content without sacrificing bandwidth. The PSD of the Miller code for equally likely data is given by [10]

$$S_6(f) = \frac{V^2 T}{2(\pi f T)^2\,(17 + 8\cos 2\pi f T)} \times \bigl(23 - 2\cos \pi f T - 22\cos 2\pi f T - 12\cos 3\pi f T + 5\cos 4\pi f T + 12\cos 5\pi f T + 2\cos 6\pi f T - 8\cos 7\pi f T + 2\cos 8\pi f T\bigr) \tag{6.15}$$

This spectrum is plotted in Fig. 6.2b. The advantages of this code are that it requires relatively low bandwidth, since most of the energy is contained in less than 0.5R. However, there is no distinct spectral null within the 2R-Hz band. It has low dc content and no dc component. It has very good timing content, and carrier tracking is easier than with Manchester coding. Error rate performance is comparable to that of the common line codes. One important disadvantage is that it has no error detection capability and, hence, performance cannot be monitored.
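The encoding rule for delay modulation can be sketched as a small state machine. The representation is an assumption made for illustration: each bit is emitted as two half-bit levels of +1 or −1, and the starting level is arbitrary.

```python
def miller_encode(bits, start_level=1):
    """Return two half-bit levels (+1/-1) per input bit for delay modulation."""
    level, out = start_level, []
    for i, b in enumerate(bits):
        if b == 1:
            out += [level, -level]        # 1: transition at the midbit position
            level = -level
        else:
            out += [level, level]         # 0: no midbit transition
            if i + 1 < len(bits) and bits[i + 1] == 0:
                level = -level            # 0 followed by 0: transition at bit end
    return out

def transition_spacings(half_bits):
    """Distances between successive level changes, in half-bit units."""
    edges = [i for i in range(1, len(half_bits)) if half_bits[i] != half_bits[i - 1]]
    return [b - a for a, b in zip(edges, edges[1:])]
```

With this rule, successive transitions are separated by between two and four half-bit intervals (T to 2T), which is one way to see why the code combines good timing content with a comparatively narrow spectrum.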

6.3.2 Split Phase (Mark)

This code is similar to Manchester in the sense that there are always midbit transitions. Hence, this code is relatively easy to synchronize and has no dc. Unlike Manchester, however, split phase (mark) encodes a binary digit into a midbit transition dependent on the midbit transition in the previous bit period [12]. Specifically, a binary 1 produces a reversal of midbit transition relative to the previous midbit transition. A binary 0 produces no reversal of the midbit transition. Certainly this is a transition code with memory. An example of a split phase (mark) coded waveform is shown in Fig. 6.1(g), where the waveform in the first bit period is chosen arbitrarily. Since this method encodes bits differentially, it avoids the 180° phase ambiguity associated with some line codes. This phase ambiguity may not be an issue in most baseband links but is important if the line code is modulated. Split phase (space) is very similar to split phase (mark), with the roles of the binary 1 and binary 0 interchanged. An example of a split phase (space) coded waveform is given in Fig. 6.1(h); again, the first bit waveform is arbitrary.

6.3.3 Biphase (Mark)

This code, designated as Bi φ-M, is similar to a Miller code in that a binary 1 is represented by a midbit transition, and a binary 0 has no midbit transition. However, this code always has a transition at the beginning of a bit period [10]. Thus, the code is easy to synchronize and has no dc. An example of Bi φ-M is given in Fig. 6.1(i), where the direction of the transition at t = 0 is arbitrarily chosen. Biphase (space) or Bi φ-S is similar to Bi φ-M, except the role of the binary data is reversed. Here a binary 0 (space) produces a midbit transition, and a binary 1 does not have a midbit transition. A waveform example of Bi φ-S is shown in Fig. 6.1(j). Both Bi φ-S and Bi φ-M are transition codes with memory.
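The Bi φ-M rule can be sketched in the same half-bit representation. As before, this is an illustrative model with assumed names; `start_level` is the (arbitrary) level just before the first bit, and each bit is emitted as two half-bit levels.

```python
def biphase_mark_encode(bits, start_level=1):
    """Return two half-bit levels (+1/-1) per bit for biphase (mark)."""
    level, out = start_level, []
    for b in bits:
        level = -level                    # transition at the start of every bit
        first_half = level
        if b == 1:
            level = -level                # a 1 adds a midbit transition
        out += [first_half, level]
    return out

def biphase_space_encode(bits, start_level=1):
    """Biphase (space): same rule with the roles of 1 and 0 interchanged."""
    return biphase_mark_encode([1 - b for b in bits], start_level)
```

Because a level change is forced at every bit boundary, the waveform never stays constant for more than one bit period regardless of the data pattern.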

6.3.4 Code Mark Inversion (CMI)

This line code is used as the interface to a Consultative Committee on International Telegraphy and Telephony (CCITT) multiplexer and is very similar to Bi φ-S. A binary 1 is encoded as an NRZ pulse with alternate polarity, +V or −V. A binary 0 is encoded with a definitive midbit transition (or square wave phase) [1]. An example of this waveform is shown in Fig. 6.1(k) where a negative to positive transition (or 180° phase) is used for a binary 0. The voltage level of the first binary 1 in this example is chosen arbitrarily. This example waveform is identical to Bi φ-S shown in Fig. 6.1(j), except for the last bit. CMI has good synchronization properties and has no dc.

6.3.5 NRZ (I)

This type of line code uses an inversion (I) to designate binary digits, specifically, a change in level or no change in level. There are two variants of this code, NRZ mark (M) and NRZ space (S) [5, 12]. In NRZ (M), a change of level is used to indicate a binary 1, and no change of level is used to indicate a binary 0. In NRZ (S), a change of level is used to indicate a binary 0, and no change of level is used to indicate a binary 1. Waveforms for NRZ (M) and NRZ (S) are depicted in Fig. 6.1(l) and Fig. 6.1(m), respectively, where the voltage level of the first binary 1 in the example is chosen arbitrarily. These codes are level codes with memory. In general, line codes that use differential encoding, like NRZ (I), are insensitive to 180° phase ambiguity. Clock recovery with NRZ (I) is not particularly good, and dc wander is a problem as well. Its bandwidth is comparable to polar NRZ.
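The differential nature of NRZ (M), and its insensitivity to a polarity inversion, can be illustrated with a short sketch. One level value (+1 or −1) is used per bit; the function names and the `start_level` reference are assumptions for the example.

```python
def nrz_m_encode(bits, start_level=-1):
    """NRZ (M): a 1 changes the level, a 0 leaves it unchanged."""
    level, out = start_level, []
    for b in bits:
        if b == 1:
            level = -level
        out.append(level)
    return out

def nrz_m_decode(levels, start_level=-1):
    """Recover bits by comparing each level with the previous one."""
    prev, bits = start_level, []
    for lv in levels:
        bits.append(1 if lv != prev else 0)
        prev = lv
    return bits
```

If the whole received waveform is inverted (a 180° phase ambiguity), the level changes are still in the same places, so only the first decoded bit, which depends on the assumed reference level, can be wrong. NRZ (S) follows by interchanging the roles of 1 and 0.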

6.3.6 Binary N Zero Substitution (BNZS)

The common bipolar code AMI has many desirable properties of a line code. Its major limitation, however, is that a long string of zeros can lead to loss of synchronization and timing jitter because there are no pulses in the waveform for relatively long periods of time. Binary N zero substitution (BNZS) attempts to improve AMI by substituting a special code of length N for all strings of N zeros. This special code contains pulses that look like binary 1s but purposely produce violations of the AMI pulse convention. Two consecutive pulses of the same polarity violate the AMI pulse convention, independent of the number of zeros between the two consecutive pulses. These violations can be detected at the receiver, and the special code replaced by N zeros. The special code contains pulses facilitating synchronization even when the original data has a long string of zeros. The special code is chosen such that the desirable properties of AMI coding are retained despite the AMI pulse convention violations, i.e., dc balance and error detection capability. The only disadvantage of BNZS compared to AMI is a slight increase in crosstalk due to the increased number of pulses and, hence, an increase in the average energy in the code.

Choosing different values of N yields different BNZS codes. The value of N is chosen to meet the timing requirements of the application. In telephony, there are three commonly used BNZS codes: B6ZS, B3ZS, and B8ZS. All BNZS codes are level codes with memory.

In a B6ZS code, a string of six consecutive zeros is replaced by one of two special codes according to the rule:

If the last pulse was positive (+), the special code is: 0 + − 0 − +.
If the last pulse was negative (−), the special code is: 0 − + 0 + −.

Here a zero indicates a zero voltage level for the bit period; a plus designates a positive pulse; and a minus indicates a negative pulse. This special code causes two AMI pulse violations: in its second bit position and in its fifth bit position. These violations are easily detected at the receiver and zeros resubstituted. If the number of consecutive zeros is 12, 18, 24, . . . , the substitution is repeated 2, 3, 4, . . . times. Since the number of violations is even, the B6ZS waveform is the same as the AMI waveform outside the special code, i.e., between special code sequences. The special code introduces four pulses, which facilitate timing recovery. Also, note that the special code is dc balanced. An example of the B6ZS code is given as follows, where the special code is enclosed in square brackets.

Original data: 0 1 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 1
B6ZS format:   0 + [0 + − 0 − +] − + 0 − [0 − + 0 + −] + −
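The B6ZS substitution rule can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `b6zs_encode`, the representation of pulses as +1, −1, and 0, and the start-up assumption that the pulse preceding the data was negative (so that the first 1 is sent as +) are choices made here, not part of the text.

```python
def b6zs_encode(bits, last_pulse=-1):
    """Encode a list of 0/1 bits into B6ZS symbols (+1, -1, 0).

    last_pulse is the polarity of the pulse sent before the data starts
    (assumed negative here, so the first 1 becomes a positive pulse).
    """
    # Special codes from the text, indexed by the polarity of the last pulse.
    special = {+1: [0, +1, -1, 0, -1, +1],
               -1: [0, -1, +1, 0, +1, -1]}
    out = []
    i = 0
    while i < len(bits):
        if bits[i:i + 6] == [0] * 6:
            # Six consecutive zeros: substitute the special code.
            code = special[last_pulse]
            out.extend(code)
            last_pulse = code[-1]
            i += 6
        elif bits[i] == 1:
            # Ordinary AMI rule: alternate the pulse polarity.
            last_pulse = -last_pulse
            out.append(last_pulse)
            i += 1
        else:
            out.append(0)
            i += 1
    return out


data = [0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1]
encoded = b6zs_encode(data)
print(" ".join({1: "+", -1: "-", 0: "0"}[s] for s in encoded))
```

Run on the 20-bit data sequence of the example above, the sketch gives the same B6ZS symbols, and the symbols sum to zero, which illustrates the dc balance of the code.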

The computation of the PSD of a B6ZS code is tedious. Its shape is given in Fig. 6.4, for comparison purposes with AMI, for the case of equally likely data.

In a B3ZS code, a string of three consecutive zeros is replaced by either B0V or 00V, where B denotes a pulse obeying the AMI (bipolar) convention and V denotes a pulse violating the AMI convention. B0V or 00V is chosen such that the number of bipolar (B) pulses between the violations is odd. The B3ZS rules are summarized in Table 6.1.

TABLE 6.1 B3ZS Substitution Rules

Number of B Pulses     Polarity of      Substitution   Substitution
Since Last Violation   Last B Pulse     Code           Code Form
Odd                    Negative (−)     00−            00V
Odd                    Positive (+)     00+            00V
Even                   Negative (−)     +0+            B0V
Even                   Positive (+)     −0−            B0V

Observe that the violation always occurs in the third bit position of the substitution code, so it can be easily detected and zero replacement made at the receiver. Also, the substitution code selection maintains dc balance. There are either one or two pulses in the substitution code, facilitating synchronization. The error detection capability of AMI is retained in B3ZS because a single channel error would make the number of bipolar pulses between violations even instead of odd. Unlike B6ZS, the B3ZS waveform between violations may not be the same as the AMI waveform. B3ZS is used in the digital signal-3 (DS-3) signal interface in North America and also in the long distance-4 (LD-4) coaxial transmission system in Canada.

FIGURE 6.4: Power spectral density of different line codes, where R = 1/T is the bit rate.

Next is an example of a B3ZS code, using the same symbol meaning as in the B6ZS code. The two encoded lines correspond to an even and an odd number of B pulses since the last violation at the start of the data; each substitution code is enclosed in square brackets.

Original data:        1 0 0 1 0 0 0 1 1 0 0 0 0 1 0 0 0 1
B3ZS format:
Even No. of B pulses: + 0 0 − [+ 0 +] − + [− 0 −] 0 + [0 0 +] −
Odd No. of B pulses:  + 0 0 − [0 0 −] + − [+ 0 +] 0 − [0 0 −] +
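The rules of Table 6.1 can be sketched as a small state machine that tracks the polarity of the last pulse and the parity of the number of B pulses since the last violation. The function name `b3zs_encode`, the +1/−1/0 symbol representation, and the two start-up parameters are assumptions made for this sketch.

```python
def b3zs_encode(bits, last_pulse=-1, b_count_even=True):
    """Encode a list of 0/1 bits into B3ZS symbols (+1, -1, 0).

    last_pulse: polarity of the pulse preceding the data (assumed negative).
    b_count_even: whether an even number of B pulses has occurred since the
    last violation when the data starts (a start-up assumption).
    """
    out = []
    b_count = 0 if b_count_even else 1   # only the parity matters
    i = 0
    while i < len(bits):
        if bits[i:i + 3] == [0, 0, 0]:
            if b_count % 2 == 1:
                # Odd count: 00V, where V repeats the last pulse polarity.
                out.extend([0, 0, last_pulse])
            else:
                # Even count: B0V, where B obeys AMI and V repeats B.
                b = -last_pulse
                out.extend([b, 0, b])
                last_pulse = b
            b_count = 0                  # restart the count after a violation
            i += 3
        elif bits[i] == 1:
            last_pulse = -last_pulse     # ordinary AMI pulse
            out.append(last_pulse)
            b_count += 1
            i += 1
        else:
            out.append(0)
            i += 1
    return out


data = [1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1]
for even in (True, False):
    symbols = b3zs_encode(data, b_count_even=even)
    print("even" if even else "odd ", " ".join({1: "+", -1: "-", 0: "0"}[s] for s in symbols))
```

With the 18-bit data sequence 1 0 0 1 0 0 0 1 1 0 0 0 0 1 0 0 0 1, the sketch produces one waveform for each start-up parity, which shows how the same data can map to different B3ZS waveforms depending on the encoder state.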

The last BNZS code considered here uses N = 8. A B8ZS code is used to provide transparent channels for the Integrated Services Digital Network (ISDN) on T1 lines and is similar to the B6ZS code. Here a string of eight consecutive zeros is replaced by one of two special codes according to the following rule:

If the last pulse was positive (+), the special code is: 0 0 0 + − 0 − +.
If the last pulse was negative (−), the special code is: 0 0 0 − + 0 + −.

There are two bipolar violations in the special codes, at the fourth and seventh bit positions. The code is dc balanced, and the error detection capability of AMI is retained. The waveform between substitutions is the same as that of AMI. If the number of consecutive zeros is 16, 24, . . . , then the substitution is repeated 2, 3, . . . , times.

6.3.7 High-Density Bipolar N (HDBN)

This coding algorithm is a CCITT standard recommended by the Conference of European Posts and Telecommunications Administrations (CEPT), a European standards body. It is quite similar to BNZS coding. It is thus a level code with memory. Whenever there is a string of N + 1 consecutive zeros, they are replaced by a special code of length N + 1 containing AMI violations. Specific codes can be constructed for different values of N. A specific high-density bipolar N (HDBN) code, HDB3, is implemented as a CEPT primary digital signal. It is very similar to the B3ZS code. In this code, a string of four consecutive zeros is replaced by either B00V or 000V. B00V or 000V is chosen such that the number of bipolar (B) pulses between violations is odd. The HDB3 rules are summarized in Table 6.2.

TABLE 6.2 HDB3 Substitution Rules

Number of B Pulses     Polarity of      Substitution   Substitution
Since Last Violation   Last B Pulse     Code           Code Form
Odd                    Negative (−)     000−           000V
Odd                    Positive (+)     000+           000V
Even                   Negative (−)     +00+           B00V
Even                   Positive (+)     −00−           B00V

Here the violation always occurs in the fourth bit position of the substitution code, so that it can be easily detected and zero replacement made at the receiver. Also, the substitution code selection maintains dc balance. There are either one or two pulses in the substitution code, facilitating synchronization. The error detection capability of AMI is retained in HDB3 because a single channel error would make the number of bipolar pulses between violations even instead of odd.
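Because the violation always sits in the fourth position of the substitution code, the receiver-side zero replacement can be sketched very compactly. The following is a minimal sketch that assumes an error-free HDB3 symbol stream; the function name `hdb3_decode` and the +1/−1/0 symbol representation are assumptions made here.

```python
def hdb3_decode(symbols):
    """Recover 0/1 bits from an error-free HDB3 symbol stream (+1, -1, 0)."""
    bits = []
    last_pulse = 0          # polarity of the most recent pulse (0 = none yet)
    for s in symbols:
        if s != 0 and s == last_pulse:
            # Same polarity as the previous pulse: this is the violation V,
            # so it and the three symbols before it stood for four zeros.
            bits[-3:] = [0, 0, 0]
            bits.append(0)
        else:
            bits.append(1 if s != 0 else 0)
        if s != 0:
            last_pulse = s
    return bits


# Hand-encoded waveform: + 000V(+) - + B00V(-) 0 +
waveform = [1, 0, 0, 0, 1, -1, 1, -1, 0, 0, -1, 0, 1]
print(hdb3_decode(waveform))
```

In the B00V case the B pulse is first read as a binary 1 and is then overwritten with a 0 once the violation three positions later is seen; this is why the decoder rewrites the last three decoded bits rather than only the current one.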

6.3.8 Ternary Coding

Many line coding schemes, like AMI, employ three symbols or levels to represent only one bit of information. Theoretically, it should be possible to transmit information more efficiently with three symbols; specifically, the maximum efficiency is log2 3 = 1.58 bits per symbol. Alternatively, the redundancy in the code signal space can be used to provide better error control. Two examples of ternary coding are described next [1, 2]: pair selected ternary (PST) and 4 binary 3 ternary (4B3T). The PST code has many of the desirable properties of line codes, but its transmission efficiency is still 1 bit per symbol. The 4B3T code also has many of the desirable properties of line codes, and it has increased transmission efficiency.

In the PST code, two consecutive bits, termed a binary pair, are grouped together to form a word. These binary pairs are assigned codewords consisting of two ternary symbols, where each ternary symbol can be +, −, or 0, just as in AMI. There are nine possible ternary codewords. Ternary codewords with identical elements, however, are avoided, i.e., ++, −−, and 00. The remaining six codewords are transmitted using two modes called + mode and − mode. The modes are switched whenever a codeword with a single pulse is transmitted. The PST code and mode switching rules are summarized in Table 6.3.

TABLE 6.3 PST Codeword Assignment and Mode Switching Rules

              Ternary Codewords
Binary Pair   + Mode   − Mode   Mode Switching
11            +−       +−       No
10            +0       −0       Yes
01            0+       0−       Yes
00            −+       −+       No
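The codeword assignment and mode switching of Table 6.3 can be sketched as a lookup table plus a one-bit mode state. The names `PST_TABLE` and `pst_encode`, the +1/−1/0 symbol representation, and the choice of starting in + mode are assumptions made for this sketch.

```python
# binary pair -> (+ mode codeword, - mode codeword, switch mode afterwards?)
PST_TABLE = {
    (1, 1): ((+1, -1), (+1, -1), False),
    (1, 0): ((+1, 0), (-1, 0), True),
    (0, 1): ((0, +1), (0, -1), True),
    (0, 0): ((-1, +1), (-1, +1), False),
}


def pst_encode(bits, mode=+1):
    """Encode 0/1 bits (even length) into PST ternary symbols (+1, -1, 0).

    mode is +1 for + mode and -1 for - mode; starting in + mode is an
    assumption of this sketch.
    """
    if len(bits) % 2:
        raise ValueError("PST needs the bits framed into pairs")
    out = []
    for i in range(0, len(bits), 2):
        plus_word, minus_word, switch = PST_TABLE[(bits[i], bits[i + 1])]
        out.extend(plus_word if mode > 0 else minus_word)
        if switch:
            mode = -mode     # a single-pulse codeword toggles the mode
    return out


print(pst_encode([1, 0, 0, 1, 1, 1, 0, 0, 1, 0]))
```

Because single-pulse codewords alternate in polarity through the mode switch and the two-pulse codewords are balanced in themselves, the running sum of the output stays bounded, which is how the code maintains dc balance.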

PST is designed to maintain dc balance and include a strong timing component. One drawback of this code is that the bits must be framed into pairs. At the receiver, an out-of-frame condition is signalled when unused ternary codewords (++, −−, and 00) are detected. The mode switching property of PST provides error detection capability. PST can be classified as a level code with memory.

If the original data for PST coding contains only 1s or only 0s, an alternating sequence of +− +− · · · is transmitted. As a result, an out-of-frame condition cannot be detected. This problem can be minimized by using the modified PST code as shown in Table 6.4.

TABLE 6.4 Modified PST Codeword Assignment and Mode Switching Rules

              Ternary Codewords
Binary Pair   + Mode   − Mode   Mode Switching
11            +0       0−       Yes
10            +−       +−       No
01            −+       −+       No
00            0+       −0       Yes

It is tedious to derive the PSD of a PST coded waveform. Again, Fig. 6.4 shows the PSD of the PST code along with the PSD of AMI and B6ZS for comparison purposes, all for equally likely binary data. Observe that PST has more power than AMI and, thus, a larger amount of energy per bit, which translates into slightly increased crosstalk.

In 4B3T coding, words consisting of four binary digits are mapped into three ternary symbols. Four bits imply 2^4 = 16 possible binary words, whereas three ternary symbols allow 3^3 = 27 possible ternary codewords. The binary-to-ternary conversion in 4B3T ensures dc balance and a strong timing component. The specific codeword assignment is as shown in Table 6.5.

TABLE 6.5 4B3T Codeword Assignment

               Ternary Codewords
Binary Words   Column 1   Column 2   Column 3
0000           −−−                   +++
0001           −−0                   ++0
0010           −0−                   +0+
0011           0−−                   0++
0100           −−+                   ++−
0101           −+−                   +−+
0110           +−−                   −++
0111           −00                   +00
1000           0−0                   0+0
1001           00−                   00+
1010                      0+−
1011                      0−+
1100                      +0−
1101                      −0+
1110                      +−0
1111                      −+0

There are three types of codewords in Table 6.5, organized into three columns. The codewords in the first column have negative dc, codewords in the second column have zero dc, and those in the third column have positive dc. The encoder monitors the integer variable

$$ I = N_p - N_n, \tag{6.16} $$

where $N_p$ is the number of positive pulses transmitted and $N_n$ is the number of negative pulses transmitted. Codewords are chosen according to the following rule, which always drives I back toward zero:

If I < 0, choose the ternary codeword from columns 2 and 3.
If I > 0, choose the ternary codeword from columns 1 and 2.
If I = 0, choose the ternary codeword from column 2, and from column 1 if the previous I > 0 or from column 3 if the previous I < 0.

Note that the ternary codeword 000 is not used, but the remaining 26 codewords are used in a complementary manner. For example, the column 1 codeword for 0001 is −−0, whereas the column 3 codeword is ++0. The maximum transmission efficiency for the 4B3T code is 1.33 bits per symbol compared to 1 bit per symbol for the other line codes. The disadvantages of 4B3T are that framing is required and that performance monitoring is complicated. The 4B3T code is used in the T148 span line developed by ITT Telecommunications. This code allows transmission of 48 channels using only 50% more bandwidth than required by T1 lines, instead of 100% more bandwidth.
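The 4B3T encoder can be sketched as a table lookup driven by the running pulse imbalance I of Eq. (6.16). The sketch below assumes that the encoder always picks the column that drives I back toward zero (the positive-dc column 3 when I < 0, the negative-dc column 1 when I > 0), since that is what dc balance requires. It also interprets "the previous I" as the sign of the most recent nonzero I, and it starts arbitrarily as if the previous I had been negative. The names `COLUMN_1`, `COLUMN_2`, `invert`, and `encode_4b3t` are chosen here for illustration.

```python
# Column 1 (negative dc) and column 2 (zero dc) of the 4B3T table;
# column 3 is the polarity-inverted version of column 1.
COLUMN_1 = {"0000": "---", "0001": "--0", "0010": "-0-", "0011": "0--",
            "0100": "--+", "0101": "-+-", "0110": "+--", "0111": "-00",
            "1000": "0-0", "1001": "00-"}
COLUMN_2 = {"1010": "0+-", "1011": "0-+", "1100": "+0-",
            "1101": "-0+", "1110": "+-0", "1111": "-+0"}


def invert(word):
    """Swap + and - in a ternary codeword."""
    return word.translate(str.maketrans("+-", "-+"))


def encode_4b3t(bits):
    """Map a bit string (length a multiple of 4) to a list of ternary words."""
    if len(bits) % 4:
        raise ValueError("4B3T needs the bits framed into 4-bit words")
    out = []
    imbalance = 0     # I = Np - Nn
    last_sign = -1    # sign of the last nonzero I (start-up assumption)
    for i in range(0, len(bits), 4):
        word = bits[i:i + 4]
        if word in COLUMN_2:
            code = COLUMN_2[word]                    # balanced codeword
        else:
            sign = imbalance if imbalance != 0 else last_sign
            # Negative history -> positive-dc column 3, otherwise column 1.
            code = invert(COLUMN_1[word]) if sign < 0 else COLUMN_1[word]
        out.append(code)
        imbalance += code.count("+") - code.count("-")
        if imbalance != 0:
            last_sign = 1 if imbalance > 0 else -1
    return out


print(encode_4b3t("00000000101000010111"))
```

Under these assumptions the imbalance never exceeds three pulses in magnitude, because each unbalanced codeword carries at most three pulses of one polarity and is always chosen against the current imbalance.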

6.4 Multilevel Signalling, Partial Response Signalling, and Duobinary Coding

Ternary coding, such as 4B3T, is an example of the use of more than two levels to improve the transmission efficiency. To increase the transmission efficiency further, more levels and/or more signal processing is needed. Multilevel signalling allows an improvement in the transmission efficiency at the expense of an increase in the error rate, i.e., more transmitter power will be required to maintain a given probability of error. In partial response signalling, intersymbol interference is deliberately introduced by using pulses that are wider and, hence, require less bandwidth. The controlled amount of interference from each pulse can be removed at the receiver. This improves the transmission efficiency, at the expense of increased complexity. Duobinary coding, a special case of partial response signalling, requires only the minimum theoretical bandwidth of 0.5R Hz. In what follows these techniques are discussed in slightly more detail.

6.4.1 Multilevel Signalling

The number of levels that can be used for a line code is not restricted to two or three. Since more levels or symbols allow higher transmission efficiency, multilevel signalling can be considered in bandwidth-limited applications. Specifically, if the signalling rate or baud rate is $R_s$ and the number of levels used is $L$, the equivalent transmission bit rate $R_b$ is given by

$$ R_b = R_s \log_2 L. \tag{6.17} $$

Alternatively, multilevel signalling can be used to reduce the baud rate, which in turn can reduce crosstalk for the same equivalent bit rate. The penalty, however, is that the SNR must increase to achieve the same error rate. The T1G carrier system of AT&T uses multilevel signalling with L = 4 and a baud rate of 3.152 mega-symbols/s to double the capacity of the T1C system from 48 channels to 96 channels. Also, a four-level signalling scheme at 80 kbaud is used to achieve 160 kb/s as a basic rate in a digital subscriber loop (DSL) for ISDN.
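As a quick check of Eq. (6.17), the ISDN basic-rate figures quoted above can be worked through directly. The worked lines below use only the numbers given in the text.

```latex
% Four-level signalling at 80 kbaud (ISDN basic-rate DSL):
R_b = R_s \log_2 L = 80~\text{kbaud} \times \log_2 4 = 80 \times 2 = 160~\text{kb/s}

% Conversely, for a fixed bit rate the baud rate is
R_s = \frac{R_b}{\log_2 L},
% so moving from L = 2 to L = 4 halves the required baud rate.
```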

6.4.2 Partial Response Signalling and Duobinary Coding

This class of signalling is also called correlative coding because it purposely introduces a controlled or correlated amount of intersymbol interference in each symbol. At the receiver, the known amount of interference is effectively removed from each symbol. The advantage of this signalling is that wider pulses can be used requiring less bandwidth, but the SNR must be increased to realize a given error rate. Also, errors can propagate unless precoding is used.

There are many commonly used partial response signalling schemes, often described in terms of the delay operator D, which represents one signalling interval delay. For example, in (1 + D) signalling the current pulse and the previous pulse are added. The T1D system of AT&T uses (1 + D) signalling with precoding, referred to as duobinary signalling, to convert binary (two level) data into ternary (three level) data at the same rate. This requires the minimum theoretical channel bandwidth without the deleterious effects of intersymbol interference and avoids error propagation. Complete details regarding duobinary coding are found in Lender [9] and Schwartz [11].

Some partial response signalling schemes, such as (1 − D), are used to shape the bandwidth rather than control it. Another interesting example of duobinary coding is (1 − D^2) signalling, which can be analyzed as the product (1 − D)(1 + D). It is used by GTE in its modified T carrier system. AT&T also uses (1 − D^2) with four input levels to achieve an equivalent data rate of 1.544 Mb/s in only a 0.5-MHz bandwidth.
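The effect of precoding in (1 + D) signalling can be illustrated with a short sketch. It uses the usual textbook form of precoded duobinary, in which the data bit is XORed with the previous precoded bit before the (1 + D) sum; the text above does not spell out the precoder, so this form, the function names, and the initial precoder state `p0` are assumptions and are not claimed to match the T1D system in detail.

```python
def duobinary_encode(bits, p0=0):
    """Precoded (1 + D) signalling: binary input, three-level output.

    p_k = d_k XOR p_{k-1} (precoding), b_k = 2*p_k - 1 (polar mapping),
    c_k = b_k + b_{k-1}, which takes values in {-2, 0, +2}.
    """
    levels = []
    prev_p = p0
    for d in bits:
        p = d ^ prev_p
        levels.append((2 * p - 1) + (2 * prev_p - 1))
        prev_p = p
    return levels


def duobinary_decode(levels):
    """Symbol-by-symbol decision: level 0 means 1, levels +/-2 mean 0."""
    return [1 if c == 0 else 0 for c in levels]


data = [1, 0, 1, 1, 0, 0, 1]
tx = duobinary_encode(data)
print(tx, duobinary_decode(tx))
```

Each decision depends only on the current received level, so a single wrong level affects a single bit. This is the sense in which precoding avoids error propagation.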

6.5 Bandwidth Comparison

We have provided the PSD expressions for most of the commonly used line codes. The actual bandwidth requirement, however, depends on the pulse shape used and the definition of bandwidth itself. There are many ways to define bandwidth, for example, as a percentage of the total power or the sidelobe suppression relative to the main lobe. Using the first null of the PSD of the code as the definition of bandwidth, Table 6.6 provides a useful bandwidth comparison.

TABLE 6.6 First Null Bandwidth Comparison

Bandwidth   Codes
R           Unipolar NRZ, Polar NRZ, Polar RZ (AMI), BNZS, HDBN, PST
2R          Unipolar RZ, Split Phase, Manchester, CMI

The notable omission in Table 6.6 is delay modulation (Miller code). It does not have a first null in the 2R-Hz band, but most of its power is contained in less than 0.5R Hz.

6.6 Concluding Remarks

An in-depth presentation of line coding, particularly applicable to telephony, has been included in this chapter. The most desirable characteristics of line codes were discussed. We introduced five common line codes and eight alternate line codes. Each line code was illustrated by an example waveform. In most cases expressions for the PSD and the probability of error were given and plotted. Advantages and disadvantages of all codes were included in the discussion, and some specific applications were noted. Line codes for optical fiber channels and networks built around them, such as fiber distributed data interface (FDDI) were not included in this section. A discussion of line codes for optical fiber channels, and other new developments in this topic area can be found in [1, 3, 4].

Defining Terms

Alternate mark inversion (AMI): A popular name for bipolar line coding using three levels: zero, positive, and negative.
Binary N zero substitution (BNZS): A class of coding schemes that attempts to improve AMI line coding.
Bipolar: A particular line coding scheme using three levels: zero, positive, and negative.
Crosstalk: An unwanted signal from an adjacent channel.
DC wander: The dc level variation in the received signal due to a channel that cannot support dc.
Duobinary coding: A coding scheme with binary input and ternary output requiring the minimum theoretical channel bandwidth.
4 Binary 3 Ternary (4B3T): A line coding scheme that maps four binary digits into three ternary symbols.
High-density bipolar N (HDBN): A class of coding schemes that attempts to improve AMI.
Level codes: Line codes carrying information in their voltage levels.
Line coding: The process of converting abstract symbols into real, temporal waveforms to be transmitted through a baseband channel.
Nonreturn to zero (NRZ): A signal that stays at a nonzero level for the entire bit duration.
Pair selected ternary (PST): A coding scheme based on selecting a pair of three level symbols.
Polar: A line coding scheme using both polarities of voltage, with or without a zero level.
Return to zero (RZ): A signal that returns to zero for a portion of the bit duration.
Transition codes: Line codes carrying information in voltage level transitions.
Unipolar: A line coding scheme using only one polarity of voltage, in addition to a zero level.

References

[1] Bellamy, J., Digital Telephony, John Wiley & Sons, New York, NY, 1991.
[2] Bell Telephone Laboratories Technical Staff Members, Transmission Systems for Communications, 4th ed., Western Electric Company, Technical Publications, Winston-Salem, NC, 1970.
[3] Bic, J.C., Duponteil, D., and Imbeaux, J.C., Elements of Digital Communication, John Wiley & Sons, New York, NY, 1991.
[4] Bylanski, P., Digital Transmission Systems, Peter Peregrinus, Herts, England, 1976.
[5] Couch, L.W., Modern Communication Systems: Principles and Applications, Prentice-Hall, Englewood Cliffs, NJ, 1994.
[6] Feher, K., Digital Modulation Techniques in an Interference Environment, EMC Encyclopedia Series, Vol. IX, Don White Consultants, Germantown, MD, 1977.
[7] Gibson, J.D., Principles of Analog and Digital Communications, MacMillan Publishing, New York, NY, 1993.
[8] Lathi, B.P., Modern Digital and Analog Communication Systems, Holt, Rinehart and Winston, Philadelphia, PA, 1989.
[9] Lender, A., Duobinary Techniques for High Speed Data Transmission, IEEE Trans. Commun. Electron., CE-82, 214–218, May 1963.
[10] Lindsey, W.C. and Simon, M.K., Telecommunication Systems Engineering, Prentice-Hall, Englewood Cliffs, NJ, 1973.
[11] Schwartz, M., Information Transmission, Modulation, and Noise, McGraw-Hill, New York, NY, 1980.
[12] Stremler, F.G., Introduction to Communication Systems, Addison-Wesley Publishing, Reading, MA, 1990.


Cherubini, G. “Echo Cancellation” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Echo Cancellation

Giovanni Cherubini
IBM Zurich Research Laboratory

7.1 Introduction
7.2 Echo Cancellation for PAM Systems
7.3 Echo Cancellation for QAM Systems
7.4 Echo Cancellation for OFDM Systems
7.5 Summary and Conclusions
References
Further Information

7.1 Introduction

Full-duplex data transmission over a single twisted-pair cable permits the simultaneous flow of information in two directions when the same frequency band is used. Examples of applications of this technique are found in digital communications systems that operate over the telephone network. In a digital subscriber loop, at each end of the full-duplex link, a circuit known as a hybrid separates the two directions of transmission. To avoid signal reflections at the near- and far-end hybrid, a precise knowledge of the line impedance would be required. Since the line impedance depends on line parameters that, in general, are not exactly known, an attenuated and distorted replica of the transmit signal leaks to the receiver input as an echo signal. Data-driven adaptive echo cancellation mitigates the effects of impedance mismatch.

A similar problem is caused by crosstalk in transmission systems over voice-grade unshielded twisted-pair cables for local-area network applications, where multipair cables are used to physically separate the two directions of transmission. Crosstalk is a statistical phenomenon due to randomly varying differential capacitive and inductive coupling between adjacent two-wire transmission lines. At the rates of several megabits per second that are usually considered for local-area network applications, near-end crosstalk (NEXT) represents the dominant disturbance; hence adaptive NEXT cancellation must be performed to ensure reliable communications.

In voiceband data modems, the model for the echo channel is considerably different from the echo model adopted in baseband transmission. The transmitted signal is a passband signal obtained by quadrature amplitude modulation (QAM), and the far-end echo may exhibit significant carrier-phase jitter and carrier-frequency shift, which are caused by signal processing at intermediate points in the telephone network. Therefore, a digital adaptive echo canceller for voiceband modems needs to embody algorithms that account for the presence of such additional impairments.

In this chapter, we describe the echo channel models and adaptive echo canceller structures that are obtained for various digital communications systems, which are classified according to the employed modulation techniques. We also address the tradeoffs between complexity, speed of adaptation, and accuracy of cancellation in adaptive echo cancellers.

7.2 Echo Cancellation for Pulse–Amplitude Modulation (PAM) Systems

The model of a full-duplex baseband data transmission system employing pulse–amplitude modulation (PAM) and adaptive echo cancellation is shown in Fig. 7.1. To describe system operations, we consider one end of the full-duplex link. The configuration of an echo canceller for a PAM transmission system is shown in Fig. 7.2. The transmitted data consist of a sequence $\{a_n\}$ of independent and identically distributed (i.i.d.) real-valued symbols from the M-ary alphabet $\mathcal{A} = \{\pm 1, \pm 3, \ldots, \pm(M-1)\}$. The sequence $\{a_n\}$ is converted into an analog signal by a digital-to-analog (D/A) converter. The conversion to a staircase signal by a zero-order hold D/A converter is described by the frequency response $H_{D/A}(f) = T \sin(\pi f T)/(\pi f T)$, where $T$ is the modulation interval. The D/A converter output is filtered by the analog transmit filter and is input to the channel through the hybrid.

FIGURE 7.1: Model of a full-duplex PAM transmission system.

The signal $x(t)$ at the output of the low-pass analog receive filter has three components, namely, the signal from the far-end transmitter $r(t)$, the echo $u(t)$, and additive Gaussian noise $w(t)$. The signal $x(t)$ is given by

$$ x(t) = r(t) + u(t) + w(t) = \sum_{n=-\infty}^{\infty} a_n^R\, h(t - nT) + \sum_{n=-\infty}^{\infty} a_n\, h_E(t - nT) + w(t), \tag{7.1} $$

where $\{a_n^R\}$ is the sequence of symbols from the remote transmitter, and $h(t)$ and $h_E(t) = \{h_{D/A} \otimes g_E\}(t)$ are the impulse responses of the overall channel and the echo channel, respectively. In the expression of $h_E(t)$, the function $h_{D/A}(t)$ is the inverse Fourier transform of $H_{D/A}(f)$, and the operator $\otimes$ denotes convolution. The signal obtained after echo cancellation is processed by a detector that outputs the sequence of estimated symbols $\{\hat{a}_n^R\}$.

FIGURE 7.2: Configuration of an echo canceller for a PAM transmission system.

In the case of full-duplex PAM data transmission over multi-pair cables for local-area network applications, where NEXT represents the main disturbance, the configuration of a digital NEXT canceller is obtained from Fig. 7.2, with the echo channel replaced by the crosstalk channel. For these applications, however, instead of mono-duplex transmission, where one pair is used to transmit only in one direction and the other pair to transmit only in the reverse direction, dual-duplex transmission may be adopted. Bi-directional transmission at rate $\varrho$ over two pairs is then accomplished by full-duplex transmission of data streams at rate $\varrho/2$ over each of the two pairs. The lower modulation rate and/or spectral efficiency required per pair for achieving an aggregate rate equal to $\varrho$ represents an advantage of dual-duplex over mono-duplex transmission. Dual-duplex transmission requires two transmitters and two receivers at each end of a link, as well as separation of the simultaneously transmitted and received signals on each pair, as illustrated in Fig. 7.3. In dual-duplex transceivers it is therefore necessary to suppress echoes returning from the hybrids and impedance discontinuities in the cable, as well as self NEXT, by adaptive digital echo and NEXT cancellation [3]. Although a dual-duplex scheme might appear to require higher implementation complexity than a mono-duplex scheme, it turns out that the two schemes are equivalent in terms of the number of multiply-and-add operations per second that are needed to perform the various filtering operations.

One of the transceivers in a full-duplex link will usually employ an externally provided reference clock for its transmit and receive operations. The other transceiver will extract timing from the received signal, and use this timing for its transmitter operations. This is known as loop timing, also illustrated in Fig. 7.3.
If signals were transmitted in opposite directions with independent clocks, signals received from the remote transmitter would generally shift in phase relative to the also received echo signals. To cope with this effect, some form of interpolation would be required that can significantly increase the transceiver complexity [2].

FIGURE 7.3: Model of a dual-duplex transmission system.

In general, we consider baseband signalling techniques such that the signal at the output of the overall channel has nonnegligible excess bandwidth, i.e., nonnegligible spectral components at frequencies larger than half of the modulation rate, $|f| \ge 1/(2T)$. Therefore, to avoid aliasing, the signal $x(t)$ is sampled at twice the modulation rate or at a higher sampling rate. Assuming a sampling rate equal to $m/T$, $m > 1$, the $i$th sample during the $n$th modulation interval is given by

$$ x\!\left((nm+i)\frac{T}{m}\right) = x_{nm+i} = r_{nm+i} + u_{nm+i} + w_{nm+i} = \sum_{k=-\infty}^{\infty} h_{km+i}\, a^R_{n-k} + \sum_{k=-\infty}^{\infty} h_{E,km+i}\, a_{n-k} + w_{nm+i}, \quad i = 0, \ldots, m-1, \tag{7.2} $$

where $\{h_{nm+i},\ i = 0, \ldots, m-1\}$ and $\{h_{E,nm+i},\ i = 0, \ldots, m-1\}$ are the discrete-time impulse responses of the overall channel and the echo channel, respectively, and $\{w_{nm+i},\ i = 0, \ldots, m-1\}$ is a sequence of Gaussian noise samples with zero mean and variance $\sigma_w^2$. Equation (7.2) suggests that the sequence of samples $\{x_{nm+i},\ i = 0, \ldots, m-1\}$ be regarded as a set of $m$ interleaved sequences, each with a sampling rate equal to the modulation rate. Similarly, the sequence of echo samples $\{u_{nm+i},\ i = 0, \ldots, m-1\}$ can be regarded as a set of $m$ interleaved sequences that are output by $m$ independent echo channels with discrete-time impulse responses $\{h_{E,nm+i}\}$, $i = 0, \ldots, m-1$, and an identical sequence $\{a_n\}$ of input symbols [7]. Hence, echo cancellation can be performed by $m$ interleaved echo cancellers, as shown in Fig. 7.4. Since the performance of each canceller is independent of the other $m-1$ units, in the remaining part of this section we will consider the operations of a single echo canceller.

The echo canceller generates an estimate $\hat{u}_n$ of the echo signal. If we consider a transversal filter realization, $\hat{u}_n$ is obtained as the inner product of the vector of filter coefficients at time $t = nT$, $\mathbf{c}_n = (c_{n,0}, \ldots, c_{n,N-1})'$, and the vector of signals stored in the echo canceller delay line at the same instant, $\mathbf{a}_n = (a_n, \ldots, a_{n-N+1})'$, expressed by

$$ \hat{u}_n = \mathbf{c}_n' \mathbf{a}_n = \sum_{k=0}^{N-1} c_{n,k}\, a_{n-k}, \tag{7.3} $$

where $\mathbf{c}_n'$ denotes the transpose of the vector $\mathbf{c}_n$. The estimate of the echo is subtracted from the received signal. The result is defined as the cancellation error signal

$$ z_n = x_n - \hat{u}_n = x_n - \mathbf{c}_n' \mathbf{a}_n. \tag{7.4} $$

The echo attenuation that must be provided by the echo canceller to achieve proper system operation depends on the application. For example, for the Integrated Services Digital Network (ISDN) U-Interface transceiver, the echo attenuation must be larger than 55 dB [10]. It is then required that the echo signals outside of the time span of the echo canceller delay line be negligible, i.e., $h_{E,n} \approx 0$ for $n < 0$ and $n > N-1$.

FIGURE 7.4: A set of m interleaved echo cancellers.

As a measure of system performance, we consider the mean square error $\varepsilon_n^2$ at the output of the echo canceller at time $t = nT$, defined by

$$ \varepsilon_n^2 = E\{z_n^2\}, \tag{7.5} $$

where $\{z_n\}$ is the error sequence and $E\{\cdot\}$ denotes the expectation operator. For a particular coefficient vector $\mathbf{c}_n$, substitution of Eq. (7.4) into Eq. (7.5) yields

$$ \varepsilon_n^2 = E\{x_n^2\} - 2\,\mathbf{c}_n' \mathbf{q} + \mathbf{c}_n' \mathbf{R}\, \mathbf{c}_n, \tag{7.6} $$

where $\mathbf{q} = E\{x_n \mathbf{a}_n\}$ and $\mathbf{R} = E\{\mathbf{a}_n \mathbf{a}_n'\}$. With the assumption of i.i.d. transmitted symbols, the correlation matrix $\mathbf{R}$ is diagonal. The elements on the diagonal are equal to the variance of the transmitted symbols, $\sigma_a^2 = (M^2 - 1)/3$. The minimum mean square error is given by

$$ \varepsilon_{\min}^2 = E\{x_n^2\} - \mathbf{c}_{opt}' \mathbf{R}\, \mathbf{c}_{opt}, \tag{7.7} $$

where the optimum coefficient vector is $\mathbf{c}_{opt} = \mathbf{R}^{-1} \mathbf{q}$. We note that proper system operation is achieved only if the transmitted symbols are uncorrelated with the symbols from the remote transmitter. If this condition is satisfied, the optimum filter coefficients are given by the values of the discrete-time echo channel impulse response, i.e., $c_{opt,k} = h_{E,k}$, $k = 0, \ldots, N-1$.

By the decision-directed stochastic gradient algorithm, also known as the least mean square (LMS) algorithm, the coefficients of the echo canceller converge in the mean to $\mathbf{c}_{opt}$. The LMS algorithm for an N-tap adaptive linear transversal filter is formulated as follows:

$$ \mathbf{c}_{n+1} = \mathbf{c}_n - \frac{1}{2}\,\alpha \nabla_{\mathbf{c}}\{z_n^2\} = \mathbf{c}_n + \alpha z_n \mathbf{a}_n, \tag{7.8} $$

where $\alpha$ is the adaptation gain and

$$ \nabla_{\mathbf{c}}\{z_n^2\} = \left( \frac{\partial z_n^2}{\partial c_{n,0}}, \ldots, \frac{\partial z_n^2}{\partial c_{n,N-1}} \right)' = -2 z_n \mathbf{a}_n $$

is the gradient of the squared error with respect to the vector of coefficients. The block diagram of an adaptive transversal filter echo canceller is shown in Fig. 7.5.
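The update of Eq. (7.8) can be tried out with a small simulation. The following is a minimal sketch under simplifying assumptions that are not taken from the text: binary symbols (M = 2, so that the symbol variance is 1), a short hypothetical echo response `h_echo`, a weak binary far-end signal acting as the only disturbance, and no additive noise. The function name and all parameter values are illustrative.

```python
import random


def lms_echo_canceller(h_echo, n_iter=5000, alpha=0.01, far_end_amp=0.1, seed=0):
    """Adapt an N-tap transversal echo canceller with the LMS algorithm.

    h_echo: hypothetical discrete-time echo channel response (length N).
    Returns the coefficient vector after n_iter symbol intervals.
    """
    rng = random.Random(seed)
    n_taps = len(h_echo)
    c = [0.0] * n_taps            # canceller coefficients c_n
    a = [0] * n_taps              # delay line (a_n, ..., a_{n-N+1})
    for _ in range(n_iter):
        a = [rng.choice((-1, 1))] + a[:-1]             # new local symbol
        u = sum(h * s for h, s in zip(h_echo, a))      # true echo sample
        r = far_end_amp * rng.choice((-1, 1))          # far-end signal
        x = u + r                                      # received sample
        u_hat = sum(ck * s for ck, s in zip(c, a))     # Eq. (7.3)
        z = x - u_hat                                  # Eq. (7.4)
        c = [ck + alpha * z * s for ck, s in zip(c, a)]  # Eq. (7.8)
    return c


h_echo = [0.5, -0.3, 0.2, 0.1]
print(lms_echo_canceller(h_echo))
```

In this setting the adapted coefficients settle close to `h_echo`, in line with the statement that the optimum coefficients equal the echo channel impulse response. The small residual fluctuation comes from the far-end signal, which appears in the cancellation error and perturbs every update.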

FIGURE 7.5: Block diagram of an adaptive transversal filter echo canceller.

If we define the vector $\mathbf{p}_n = \mathbf{c}_{opt} - \mathbf{c}_n$, the mean square error can be expressed as

$$ \varepsilon_n^2 = \varepsilon_{\min}^2 + \mathbf{p}_n' \mathbf{R}\, \mathbf{p}_n, \tag{7.9} $$

where the term $\mathbf{p}_n' \mathbf{R}\, \mathbf{p}_n$ represents an 'excess mean square distortion' due to the misadjustment of the filter settings. The analysis of the convergence behavior of the excess mean square distortion was first proposed for adaptive equalizers [13] and later extended to adaptive echo cancellers [9]. Under the assumption that the vectors $\mathbf{p}_n$ and $\mathbf{a}_n$ are statistically independent, the dynamics of the mean square error are given by

$$ E\{\varepsilon_n^2\} = \varepsilon_0^2 \left[ 1 - \alpha \sigma_a^2 \left( 2 - \alpha N \sigma_a^2 \right) \right]^n + \frac{2\,\varepsilon_{\min}^2}{2 - \alpha N \sigma_a^2}, \tag{7.10} $$

where $\varepsilon_0^2$ is determined by the initial conditions. The mean square error converges to a finite steady-state value $\varepsilon_\infty^2$ if the stability condition $0 < \alpha < 2/(N \sigma_a^2)$ is satisfied. The optimum adaptation gain that yields fastest convergence at the beginning of the adaptation process is $\alpha_{opt} = 1/(N \sigma_a^2)$. The corresponding time constant and asymptotic mean square error are $\tau_{opt} = N$ and $\varepsilon_\infty^2 = 2\,\varepsilon_{\min}^2$, respectively.

We note that a fixed adaptation gain equal to $\alpha_{opt}$ could not be adopted in practice, since after echo cancellation the signal from the remote transmitter would be embedded in a residual echo having approximately the same power. If the time constant of the convergence mode is not a critical system parameter, an adaptation gain smaller than $\alpha_{opt}$ will be adopted to achieve an asymptotic mean square error close to $\varepsilon_{\min}^2$. On the other hand, if fast convergence is required, a variable gain will be chosen.

Several techniques have been proposed to increase the speed of convergence of the LMS algorithm. In particular, for echo cancellation in data transmission, the speed of adaptation is reduced by the presence of the signal from the remote transmitter in the cancellation error. To mitigate this problem, the data signal can be adaptively removed from the cancellation error by a decision-directed algorithm [5]. Modified versions of the LMS algorithm have been also proposed to reduce system complexity. For example, the sign algorithm suggests that only the sign of the error signal be used to compute an approximation of the stochastic gradient [4].

An alternative means to reduce the implementation complexity of an adaptive echo canceller consists in the choice of a filter structure with a lower computational complexity than the transversal filter. At high data rates, very large scale integration (VLSI) technology is needed for the implementation of transceivers for full-duplex data transmission. High-speed echo cancellers and near-end crosstalk cancellers that do not require multiplications represent an attractive solution because of their low complexity.
As an example of an architecture suitable for VLSI implementation, we consider echo cancellation by a distributed-arithmetic filter, where multiplications are replaced by table lookup and shift-and-add operations [12]. By segmenting the echo canceller into filter sections of shorter lengths, various tradeoffs concerning the number of operations per modulation interval and the number of memory locations needed to store the lookup tables are possible. Adaptivity is achieved by updating the values stored in the lookup tables by the LMS algorithm.

To describe the principles of operations of a distributed-arithmetic echo canceller, we assume that the number of elements in the alphabet of input symbols is a power of two, $M = 2^W$. Therefore, each symbol is represented by the vector $(a_n^{(0)}, \ldots, a_n^{(W-1)})$, where $a_n^{(i)}$, $i = 0, \ldots, W-1$, are independent binary random variables, i.e.,

$$a_n = \sum_{w=0}^{W-1} \left(2a_n^{(w)} - 1\right) 2^w = \sum_{w=0}^{W-1} b_n^{(w)} 2^w, \tag{7.11}$$

where $b_n^{(w)} = (2a_n^{(w)} - 1) \in \{-1, +1\}$. By substituting Eq. (7.11) into Eq. (7.1) and segmenting the delay line of the echo canceller into $L$ sections with $K = N/L$ delay elements each, we obtain

$$\hat{u}_n = \sum_{\ell=0}^{L-1} \sum_{w=0}^{W-1} 2^w \left[ \sum_{k=0}^{K-1} b_{n-\ell K-k}^{(w)} \, c_{n,\ell K+k} \right]. \tag{7.12}$$

Equation (7.12) suggests that the filter output can be computed using a set of $L2^K$ values that are stored in $L$ tables with $2^K$ memory locations each. The binary vectors $\mathbf{a}_{n,\ell}^{(w)} = (a_{n-(\ell+1)K+1}^{(w)}, \ldots, a_{n-\ell K}^{(w)})$, $w = 0, \ldots, W-1$, $\ell = 0, \ldots, L-1$, determine the addresses of the memory locations where the values that are needed to compute the filter output are stored. The filter output is obtained by $WL$ table lookup and shift-and-add operations.
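The table-lookup computation suggested by Eq. (7.12) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the architecture of [12]: the function names, the bit ordering used to form a table address, and the random coefficients are all hypothetical. The check at the end compares the distributed-arithmetic output with the output of the equivalent transversal filter.

```python
import random

def build_tables(c, K):
    """One table per section l; entry `addr` holds sum_k b_k * c[l*K + k],
    where bit k of `addr` is the binary digit a_k and b_k = 2*a_k - 1.
    (The bit-to-address mapping is an assumption of this sketch.)"""
    L = len(c) // K
    tables = []
    for l in range(L):
        table = []
        for addr in range(2 ** K):
            value = 0.0
            for k in range(K):
                b = 2 * ((addr >> k) & 1) - 1
                value += b * c[l * K + k]
            table.append(value)
        tables.append(table)
    return tables

def da_output(bits, tables, K, W):
    """bits[w][j] is the bit a^{(w)}_{n-j} of the symbol j intervals in the past."""
    out = 0.0
    for l, table in enumerate(tables):
        for w in range(W):
            addr = sum(bits[w][l * K + k] << k for k in range(K))
            out += (2 ** w) * table[addr]       # shift-and-add of looked-up value
    return out

def transversal_output(bits, c, W):
    """Reference: rebuild the symbols with Eq. (7.11) and filter directly."""
    symbols = [sum((2 * bits[w][j] - 1) * 2 ** w for w in range(W))
               for j in range(len(c))]
    return sum(a * ck for a, ck in zip(symbols, c))

rng = random.Random(0)
N, K, W = 8, 4, 2                                # L = N / K = 2 tables of 2^K entries
c = [rng.uniform(-1.0, 1.0) for _ in range(N)]   # hypothetical coefficients
bits = [[rng.randint(0, 1) for _ in range(N)] for _ in range(W)]
tables = build_tables(c, K)
print(da_output(bits, tables, K, W), transversal_output(bits, c, W))
```

Here the two printed values should agree up to rounding, and the filter output requires $WL = 4$ lookups instead of $N = 8$ multiplications.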


We observe that $\mathbf{a}_{n,\ell}^{(w)}$ and its binary complement $\bar{\mathbf{a}}_{n,\ell}^{(w)}$ select two values that differ only in their sign. This symmetry is exploited to halve the number of values to be stored. To determine the output of a distributed-arithmetic filter with reduced memory size, we reformulate Eq. (7.12) as

$$\hat{u}_n = \sum_{\ell=0}^{L-1} \sum_{w=0}^{W-1} 2^w \, b_{n-\ell K-k_0}^{(w)} \left[ c_{n,\ell K+k_0} + b_{n-\ell K-k_0}^{(w)} \sum_{\substack{k=0 \\ k \neq k_0}}^{K-1} b_{n-\ell K-k}^{(w)} \, c_{n,\ell K+k} \right], \tag{7.13}$$

where $k_0$ can be any element of the set $\{0, \ldots, K-1\}$. In the following, we take $k_0 = 0$. Then the binary symbols $b_{n-\ell K}^{(w)}$ determine whether the selected values are to be added or subtracted. Each table has now $2^{K-1}$ memory locations, and the filter output is given by

$$\hat{u}_n = \sum_{\ell=0}^{L-1} \sum_{w=0}^{W-1} 2^w \, b_{n-\ell K}^{(w)} \, d_n\!\left(i_{n,\ell}^{(w)}, \ell\right), \tag{7.14}$$

where $d_n(k, \ell)$, $k = 0, \ldots, 2^{K-1} - 1$, $\ell = 0, \ldots, L-1$, are the lookup values, and $i_{n,\ell}^{(w)}$, $w = 0, \ldots, W-1$, $\ell = 0, \ldots, L-1$, are the lookup indices computed as follows:

$$i_{n,\ell}^{(w)} = \begin{cases} \displaystyle\sum_{k=1}^{K-1} a_{n-\ell K-k}^{(w)} \, 2^{k-1} & \text{if } a_{n-\ell K}^{(w)} = 1 \\[2ex] \displaystyle\sum_{k=1}^{K-1} \bar{a}_{n-\ell K-k}^{(w)} \, 2^{k-1} & \text{if } a_{n-\ell K}^{(w)} = 0. \end{cases} \tag{7.15}$$
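The reduced-memory lookup of Eqs. (7.14) and (7.15) can be checked numerically. The Python sketch below is only an illustration: the helper names and the random coefficients are hypothetical. It fills the halved tables from a given coefficient vector with $k_0 = 0$, so that the output should coincide with that of the transversal filter.

```python
import random

def build_half_tables(c, K):
    """d[l][i] = c[l*K] + sum_{k=1}^{K-1} (2*bit_{k-1}(i) - 1) * c[l*K + k]."""
    L = len(c) // K
    return [[c[l * K] + sum((2 * ((i >> (k - 1)) & 1) - 1) * c[l * K + k]
                            for k in range(1, K))
             for i in range(2 ** (K - 1))]
            for l in range(L)]

def lookup_index(section_bits):
    """Eq. (7.15): section_bits[k] is a^{(w)}_{n-lK-k}, k = 0, ..., K-1."""
    lead = section_bits[0]
    index = 0
    for k in range(1, len(section_bits)):
        # Use the bit itself if the leading bit is 1, its complement otherwise
        bit = section_bits[k] if lead == 1 else 1 - section_bits[k]
        index += bit << (k - 1)
    return index

def da_output_half(bits, d, K, W):
    """Eq. (7.14): add or subtract the selected value according to b^{(w)}_{n-lK}."""
    out = 0.0
    for l, table in enumerate(d):
        for w in range(W):
            section = bits[w][l * K:(l + 1) * K]
            sign = 2 * section[0] - 1
            out += (2 ** w) * sign * table[lookup_index(section)]
    return out

def transversal_output(bits, c, W):
    """Reference output of the equivalent linear transversal filter."""
    symbols = [sum((2 * bits[w][j] - 1) * 2 ** w for w in range(W))
               for j in range(len(c))]
    return sum(a * ck for a, ck in zip(symbols, c))

rng = random.Random(5)
N, K, W = 8, 4, 2
c = [rng.uniform(-1.0, 1.0) for _ in range(N)]   # hypothetical coefficients
bits = [[rng.randint(0, 1) for _ in range(N)] for _ in range(W)]
d = build_half_tables(c, K)
print(da_output_half(bits, d, K, W), transversal_output(bits, c, W))
```

Each table now holds $2^{K-1} = 8$ values instead of $2^K = 16$.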

We note that, as long as Eqs. (7.12) and (7.13) hold for some coefficient vector $(c_{n,0}, \ldots, c_{n,N-1})$, the distributed-arithmetic filter emulates the operation of a linear transversal filter. For arbitrary values $d_n(k, \ell)$, however, a nonlinear filtering operation results.

The expression of the LMS algorithm to update the values of a distributed-arithmetic echo canceller takes the form

$$\mathbf{d}_{n+1} = \mathbf{d}_n - \tfrac{1}{2} \alpha \nabla_{\mathbf{d}} \left\{ z_n^2 \right\} = \mathbf{d}_n + \alpha z_n \mathbf{y}_n, \tag{7.16}$$

where $\mathbf{d}'_n = [\mathbf{d}'_n(0), \ldots, \mathbf{d}'_n(L-1)]$, with $\mathbf{d}'_n(\ell) = [d_n(0, \ell), \ldots, d_n(2^{K-1} - 1, \ell)]$, and $\mathbf{y}'_n = [\mathbf{y}'_n(0), \ldots, \mathbf{y}'_n(L-1)]$, with

$$\mathbf{y}'_n(\ell) = \sum_{w=0}^{W-1} 2^w \, b_{n-\ell K}^{(w)} \left[ \delta_{0,\, i_{n,\ell}^{(w)}}, \ldots, \delta_{2^{K-1}-1,\, i_{n,\ell}^{(w)}} \right],$$

are $L2^{K-1} \times 1$ vectors and where $\delta_{i,j}$ is the Kronecker delta. We note that at each iteration only those values that are selected to generate the filter output are updated. The block diagram of an adaptive distributed-arithmetic echo canceller with input symbols from a quaternary alphabet is shown in Fig. 7.6.

The analysis of the mean square error convergence behavior and steady-state performance has been extended to adaptive distributed-arithmetic echo cancellers [1]. The dynamics of the mean square error are given by

$$E\left\{ \varepsilon_n^2 \right\} = \varepsilon_0^2 \left[ 1 - \frac{\alpha \sigma_a^2}{2^{K-1}} \left( 2 - \alpha L \sigma_a^2 \right) \right]^n + \frac{2 \varepsilon_{min}^2}{2 - \alpha L \sigma_a^2}. \tag{7.17}$$

The stability condition for the echo canceller is $0 < \alpha < 2/(L\sigma_a^2)$. For a given adaptation gain, echo canceller stability depends on the number of tables and on the variance of the transmitted symbols. Therefore, the time span of the echo canceller can be increased without affecting system stability, provided that the number $L$ of tables is kept constant. In that case, however, mean square error convergence will be slower. From Eq. (7.17), we find that the optimum adaptation gain that permits the fastest mean square error convergence at the beginning of the adaptation process is $\alpha_{opt} = 1/(L\sigma_a^2)$. The time constant of the convergence mode is $\tau_{opt} = L2^{K-1}$. The smallest achievable time constant is proportional to the total number of values. The realization of a distributed-arithmetic echo canceller can be further simplified by updating at each iteration only the values that are addressed by the most significant bits of the symbols stored in the delay line. The complexity required for adaptation can thus be reduced at the price of a slower rate of convergence.
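The statements about the optimum gain and the time constant can be read off Eq. (7.17) numerically. The following Python sketch simply evaluates the mean square error expression of Eq. (7.17); the parameter values ($L = 4$, $K = 3$, $\sigma_a^2 = 1$, $\varepsilon_0^2 = 1$, $\varepsilon_{min}^2 = 10^{-3}$) are arbitrary assumptions chosen for illustration.

```python
def decay_factor(alpha, L, K, sigma_a_sq):
    """Per-iteration factor multiplying the transient term of Eq. (7.17)."""
    return 1.0 - (alpha * sigma_a_sq / 2 ** (K - 1)) * (2.0 - alpha * L * sigma_a_sq)

def mse(n, alpha, eps0_sq, eps_min_sq, L, K, sigma_a_sq):
    """Mean square error after n iterations according to Eq. (7.17)."""
    steady_state = 2.0 * eps_min_sq / (2.0 - alpha * L * sigma_a_sq)
    return eps0_sq * decay_factor(alpha, L, K, sigma_a_sq) ** n + steady_state

L, K, sigma_a_sq = 4, 3, 1.0            # assumed example values
eps0_sq, eps_min_sq = 1.0, 1e-3         # assumed example values
alpha_opt = 1.0 / (L * sigma_a_sq)      # fastest initial convergence
tau_opt = L * 2 ** (K - 1)              # time constant at alpha_opt

for alpha in (0.5 * alpha_opt, alpha_opt, 1.5 * alpha_opt):
    print(alpha, decay_factor(alpha, L, K, sigma_a_sq),
          mse(10_000, alpha, eps0_sq, eps_min_sq, L, K, sigma_a_sq))
```

At $\alpha = \alpha_{opt}$ the decay factor equals $1 - 1/\tau_{opt}$ and the steady-state term equals $2\varepsilon_{min}^2$; a smaller gain converges more slowly but approaches $\varepsilon_{min}^2$ more closely.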

FIGURE 7.6: Block diagram of an adaptive distributed-arithmetic echo canceller.


7.3 Echo Cancellation for Quadrature Amplitude Modulation (QAM) Systems

Although most of the concepts presented in the preceding sections can be readily extended to echo cancellation for communications systems employing QAM, the case of full-duplex transmission over a voiceband data channel requires a specific discussion. We consider the system model shown in Fig. 7.7. The transmitter generates a sequence $\{a_n\}$ of i.i.d. complex-valued symbols from a two-dimensional constellation $\mathcal{A}$, which are modulated by the carrier $e^{j2\pi f_c nT}$, where $T$ and $f_c$ denote the modulation interval and the carrier frequency, respectively. The discrete-time signal at the output of the transmit Hilbert filter may be regarded as an analytic signal, which is generated at the rate of $m/T$ samples/s, $m > 1$. The real part of the analytic signal is converted into an analog signal by a D/A converter and input to the channel. We note that by transmitting the real part of a complex-valued signal, positive- and negative-frequency components become folded. The image-band attenuation of the transmit Hilbert filter thus determines the achievable echo suppression. In fact, the receiver cannot extract aliasing image-band components from desired passband frequency components, and the echo canceller is able to suppress only echo arising from transmitted passband components.

FIGURE 7.7: Configuration of an echo canceller for a QAM transmission system.

The output of the echo channel is represented as the sum of two contributions. The near-end echo $u^{NE}(t)$ arises from the impedance mismatch between the hybrid and the transmission line, as in the case of baseband transmission. The far-end echo $u^{FE}(t)$ represents the contribution due to echos that are generated at intermediate points in the telephone network. These echos are characterized by additional impairments, such as jitter and frequency shift, which are accounted for by introducing a carrier-phase rotation of an angle $\phi(t)$ in the model of the far-end echo.

At the receiver, samples of the signal at the channel output are obtained synchronously with the transmitter timing, at the sampling rate of $m/T$ samples/s. The discrete-time received signal is converted to a complex-valued baseband signal $\{x_{nm'+i}, i = 0, \ldots, m'-1\}$, at the rate of $m'/T$ samples/s, $1 < m' < m$, through filtering by the receive Hilbert filter, decimation, and demodulation. From delayed transmit symbols, estimates of the near- and far-end echo signals after demodulation, $\{\hat{u}^{NE}_{nm'+i}, i = 0, \ldots, m'-1\}$ and $\{\hat{u}^{FE}_{nm'+i}, i = 0, \ldots, m'-1\}$, respectively, are generated using $m'$ interleaved near- and far-end echo cancellers. The cancellation error is given by

$$z_\ell = x_\ell - \hat{u}^{NE}_\ell - \hat{u}^{FE}_\ell. \tag{7.18}$$

A different model is obtained if echo cancellation is accomplished before demodulation. In this case, two equivalent configurations for the echo canceller may be considered. In one configuration, the modulated symbols are input to the transversal filter, which approximates the passband echo response. Alternatively, the modulator can be placed after the transversal filter, which is then called a baseband transversal filter [14].

In the considered realization, the estimates of the echo signals after demodulation are given by

$$\hat{u}^{NE}_{nm'+i} = \sum_{k=0}^{N_{NE}-1} c^{NE}_{n,km'+i} \, a_{n-k}, \qquad i = 0, \ldots, m'-1, \tag{7.19}$$

and

$$\hat{u}^{FE}_{nm'+i} = \left[ \sum_{k=0}^{N_{FE}-1} c^{FE}_{n,km'+i} \, a_{n-k-D_{FE}} \right] e^{j\hat{\phi}_{nm'+i}}, \qquad i = 0, \ldots, m'-1, \tag{7.20}$$

where $(c^{NE}_{n,0}, \ldots, c^{NE}_{n,m'N_{NE}-1})$ and $(c^{FE}_{n,0}, \ldots, c^{FE}_{n,m'N_{FE}-1})$ are the coefficients of the $m'$ interleaved near- and far-end echo cancellers, respectively, $\{\hat{\phi}_{nm'+i}, i = 0, \ldots, m'-1\}$ is the sequence of far-end echo phase estimates, and $D_{FE}$ denotes the bulk delay accounting for the round-trip delay from the transmitter to the point of echo generation. To prevent overlap of the time span of the near-end echo canceller with the time span of the far-end echo canceller, the condition $D_{FE} > N_{NE}$ must be satisfied. We also note that, because of the different nature of near- and far-end echo generation, the time span of the far-end echo canceller needs to be larger than the time span of the near-end echo canceller, i.e., $N_{FE} > N_{NE}$.

Adaptation of the filter coefficients in the near- and far-end echo cancellers by the LMS algorithm leads to

$$c^{NE}_{n+1,km'+i} = c^{NE}_{n,km'+i} + \alpha z_{nm'+i} \, (a_{n-k})^*, \qquad k = 0, \ldots, N_{NE}-1, \quad i = 0, \ldots, m'-1, \tag{7.21}$$

and

$$c^{FE}_{n+1,km'+i} = c^{FE}_{n,km'+i} + \alpha z_{nm'+i} \, (a_{n-k-D_{FE}})^* \, e^{-j\hat{\phi}_{nm'+i}}, \qquad k = 0, \ldots, N_{FE}-1, \quad i = 0, \ldots, m'-1, \tag{7.22}$$

respectively, where the asterisk denotes complex conjugation. The far-end echo phase estimate is computed by a second-order phase-lock loop algorithm, where the following stochastic gradient approach is adopted:

$$\begin{aligned} \hat{\phi}_{\ell+1} &= \hat{\phi}_\ell - \tfrac{1}{2} \gamma_{FE} \nabla_{\hat{\phi}} |z_\ell|^2 + \Delta\phi_\ell \pmod{2\pi}, \\ \Delta\phi_{\ell+1} &= \Delta\phi_\ell - \tfrac{1}{2} \zeta_{FE} \nabla_{\hat{\phi}} |z_\ell|^2, \end{aligned} \tag{7.23}$$

where $\ell = nm' + i$, $i = 0, \ldots, m'-1$, $\gamma_{FE}$ and $\zeta_{FE}$ are step-size parameters, and

$$\nabla_{\hat{\phi}} |z_\ell|^2 = \frac{\partial |z_\ell|^2}{\partial \hat{\phi}_\ell} = -2 \, \mathrm{Im}\left\{ z_\ell \left( \hat{u}^{FE}_\ell \right)^* \right\}. \tag{7.24}$$

We note that algorithm (7.23) requires $m'$ iterations per modulation interval, i.e., we cannot resort to interleaving to reduce the complexity of the computation of the far-end echo phase estimate.
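The phase recursion of Eqs. (7.23) and (7.24) can be tried out on a toy signal. The Python sketch below is a minimal illustration under strong assumptions that are not part of the chapter: the far-end canceller output before rotation is a fixed hypothetical value `v`, the received signal consists only of that value rotated by a constant phase `true_phase`, there is no near-end echo, remote signal, or noise, and the step sizes are chosen arbitrarily.

```python
import cmath
import math

def pll_step(phi_hat, delta_phi, z, u_fe_hat, gamma_fe, zeta_fe):
    """One iteration of the second-order loop of Eq. (7.23)."""
    grad = -2.0 * (z * u_fe_hat.conjugate()).imag          # Eq. (7.24)
    new_phi = (phi_hat - 0.5 * gamma_fe * grad + delta_phi) % (2.0 * math.pi)
    new_delta = delta_phi - 0.5 * zeta_fe * grad
    return new_phi, new_delta

true_phase = 0.5              # assumed constant far-end echo phase (radians)
v = 1.0 + 0.0j                # hypothetical filter output before phase rotation
phi_hat, delta_phi = 0.0, 0.0
for _ in range(3000):
    x = v * cmath.exp(1j * true_phase)          # received far-end echo
    u_fe_hat = v * cmath.exp(1j * phi_hat)      # rotated echo estimate, cf. Eq. (7.20)
    z = x - u_fe_hat                            # cancellation error, cf. Eq. (7.18)
    phi_hat, delta_phi = pll_step(phi_hat, delta_phi, z, u_fe_hat, 0.1, 0.001)

print(phi_hat, delta_phi)
```

In this idealized setting the gradient reduces to $-2|v|^2 \sin(\phi - \hat{\phi}_\ell)$, so the phase estimate is driven toward the true phase while the second accumulator, which would absorb a frequency shift, settles near zero.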

7.4 Echo Cancellation for Orthogonal Frequency Division Multiplexing (OFDM) Systems

Orthogonal frequency division multiplexing (OFDM) is a modulation technique whereby blocks of $M$ symbols are transmitted in parallel over $M$ subchannels by employing $M$ orthogonal subcarriers. We consider a real-valued discrete-time channel impulse response $\{h_i, i = 0, \ldots, L\}$ having length $L + 1 \ll M$. To illustrate the basic principles of OFDM systems, let us consider a noiseless ideal channel with impulse response given by $\{h_i\} = \{\delta_i\}$, where $\{\delta_i\}$ is defined as the discrete-time delta function. Modulation of the complex-valued input symbols at the $n$-th modulation interval, denoted by the vector $\mathbf{A}_n = \{A_n(i), i = 0, \ldots, M-1\}$, is performed by an inverse discrete Fourier transform (IDFT), as shown in Fig. 7.8. We assume that $M$ is even, and that each block of symbols satisfies the Hermitian symmetry conditions, i.e., $A_n(0)$ and $A_n(M/2)$ are real valued, and $A_n(i) = A_n^*(M-i)$, $i = 1, \ldots, M/2 - 1$. Then the signals $\mathbf{a}_n = \{a_n(i), i = 0, \ldots, M-1\}$ obtained at the output of the IDFT are real valued. After parallel-to-serial conversion, the $M$ signals are sent over the channel at the given transmission rate $M/T$, where $T$ denotes the modulation interval. At the output of the channel, the noiseless signals are received without distortion. Serial-to-parallel conversion yields blocks of $M$ elements, with boundaries placed such that each block obtained at the modulator output is also presented at the demodulator input. Then demodulation performed by a discrete Fourier transform (DFT) will reproduce the blocks of $M$ input symbols. The overall input-output relationship is therefore equivalent to that of a bank of $M$ parallel, independent subchannels.
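The effect of the Hermitian symmetry conditions can be verified directly. The following Python sketch uses plain $O(M^2)$ DFT and IDFT routines; the helper names and the example symbols are hypothetical. It builds one block with $M = 8$, checks that the IDFT output is real valued, and demodulates it again over the ideal channel.

```python
import cmath

def dft(x):
    """Unnormalized DFT, matching the DFT matrix [(e^{-j 2 pi / M})^{km}]."""
    M = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / M) for m in range(M))
            for k in range(M)]

def idft(X):
    """Inverse DFT with the 1/M normalization."""
    M = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * m / M) for k in range(M)) / M
            for m in range(M)]

def hermitian_block(half):
    """Build A_n of even length M from A_n(0), ..., A_n(M/2).

    The first and last entries of `half` must be real valued.
    """
    M = 2 * (len(half) - 1)
    block = list(half) + [0j] * (M - len(half))
    for i in range(1, M // 2):
        block[M - i] = half[i].conjugate()       # A_n(M - i) = A_n(i)^*
    return block

A = hermitian_block([1 + 0j, 1 - 1j, -1 + 1j, 1 + 1j, -1 + 0j])   # M = 8
a = idft(A)                                      # modulator output
recovered = dft([v.real for v in a])             # ideal channel, then demodulation
print(max(abs(v.imag) for v in a))
```

The printed maximum imaginary part should be at rounding-error level, and `recovered` should reproduce the block `A`.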

FIGURE 7.8: Block diagram of an OFDM system.

In the general case of a noisy channel with impulse response having length greater than one, $M$ independent subchannels are obtained by a variant of OFDM that is also known as discrete multitone modulation (DMT) [11]. In a DMT system, modulation by the IDFT is performed at the rate $1/T' = M/[(M + L)T] < 1/T$. After modulation, each block of $M$ signals is cyclically extended by copying the last $L$ signals in front of the block, and converted from parallel to serial. The resulting $L + M$ signals are sent over the channel. At the receiver, blocks of samples with length $L + M$ are taken. Block boundaries are placed such that the last $M$ samples depend only on the elements of one cyclically extended block of signals. The first $L$ samples are discarded, and the vector $\mathbf{x}_n$ of the last $M$ samples of the block received at the $n$-th modulation interval can be expressed as

$$\mathbf{x}_n = \boldsymbol{\Gamma}_n \mathbf{h} + \mathbf{w}_n, \tag{7.25}$$

where $\mathbf{h}$ is the vector of the impulse response extended with $M - L - 1$ zeros, $\mathbf{w}_n$ is a vector of additive white Gaussian noise samples, and $\boldsymbol{\Gamma}_n$ is an $M \times M$ circulant matrix given by

$$\boldsymbol{\Gamma}_n = \begin{bmatrix} a_n(0) & a_n(M-1) & \cdots & a_n(1) \\ a_n(1) & a_n(0) & \cdots & a_n(2) \\ \vdots & \vdots & \ddots & \vdots \\ a_n(M-1) & a_n(M-2) & \cdots & a_n(0) \end{bmatrix}. \tag{7.26}$$

Recalling that $\mathbf{F}_M \boldsymbol{\Gamma}_n \mathbf{F}_M^{-1} = \mathrm{diag}(\mathbf{A}_n)$, where $\mathbf{F}_M$ is the $M \times M$ DFT matrix defined as $\mathbf{F}_M = [(e^{-j2\pi/M})^{km}]$, $k, m = 0, \ldots, M-1$, and $\mathrm{diag}(\mathbf{A}_n)$ denotes the diagonal matrix with elements on the diagonal given by $\mathbf{A}_n$, we find that the output of the demodulator is given by

$$\mathbf{X}_n = \mathrm{diag}(\mathbf{A}_n) \mathbf{H} + \mathbf{W}_n, \tag{7.27}$$

where $\mathbf{H}$ denotes the DFT of the vector $\mathbf{h}$, and $\mathbf{W}_n$ is a vector of independent Gaussian random variables. Equation (7.27) indicates that the sequence of transmitted symbol vectors can be detected by assuming a bank of $M$ independent subchannels, at the price of a decrease in the data rate by a factor $(M + L)/M$. Note that in practice the computationally more efficient inverse fast Fourier transform and fast Fourier transform are used instead of IDFT and DFT.

FIGURE 7.9: Configuration of an echo canceller for a DMT transmission system.

We discuss echo cancellation for OFDM with reference to a DMT system [6], as shown in Fig. 7.9. The real-valued discrete-time echo impulse response is $\{h_{E,i}, i = 0, \ldots, N-1\}$, having length $N < M$. We initially assume $N \le L + 1$. Furthermore, we assume that the boundaries of the received blocks are placed such that the last $M$ samples of the $n$-th received block are expressed by the vector

$$\mathbf{x}_n = \boldsymbol{\Gamma}_n^R \mathbf{h} + \boldsymbol{\Gamma}_n \mathbf{h}_E + \mathbf{w}_n, \tag{7.28}$$

where $\boldsymbol{\Gamma}_n^R$ is the circulant matrix with elements given by the signals from the remote transmitter, and $\mathbf{h}_E$ is the vector of the echo impulse response extended with $M - N$ zeros. In the frequency domain, the echo is expressed as $\mathbf{U}_n = \mathrm{diag}(\mathbf{A}_n) \mathbf{H}_E$, where $\mathbf{H}_E$ denotes the DFT of the vector $\mathbf{h}_E$. In this case, the echo canceller provides an echo estimate that is given by $\hat{\mathbf{U}}_n = \mathrm{diag}(\mathbf{A}_n) \mathbf{C}_n$, where $\mathbf{C}_n$ denotes the DFT of the vector $\mathbf{c}_n$ of the $N$ coefficients of the echo canceller filter extended with $M - N$ zeros.

In practice, however, we need to consider the case $N > L + 1$. The expression of the cancellation error is then given by

$$\mathbf{z}_n = \mathbf{x}_n - \boldsymbol{\Psi}_{n,n-1} \mathbf{c}_n, \tag{7.29}$$

where the vector of the last $M$ elements of the $n$-th received block is now $\mathbf{x}_n = \boldsymbol{\Gamma}_n^R \mathbf{h} + \boldsymbol{\Psi}_{n,n-1} \mathbf{h}_E + \mathbf{w}_n$, and $\boldsymbol{\Psi}_{n,n-1}$ is an $M \times M$ Toeplitz matrix given by

$$\boldsymbol{\Psi}_{n,n-1} = \begin{bmatrix} a_n(0) & a_n(M-1) & \cdots & a_n(M-L) & a_{n-1}(M-1) & \cdots & a_{n-1}(L+1) \\ a_n(1) & a_n(0) & \cdots & a_n(M-L+1) & a_n(M-L) & \cdots & a_{n-1}(L+2) \\ \vdots & \vdots & & \vdots & \vdots & & \vdots \\ a_n(M-1) & a_n(M-2) & \cdots & a_n(M-L-1) & a_n(M-L-2) & \cdots & a_n(0) \end{bmatrix}. \tag{7.30}$$

In the frequency domain, the cancellation error can be expressed as

$$\mathbf{Z}_n = \mathbf{F}_M \left( \mathbf{x}_n - \boldsymbol{\chi}_{n,n-1} \mathbf{c}_n \right) - \mathrm{diag}(\mathbf{A}_n) \mathbf{C}_n, \tag{7.31}$$

where $\boldsymbol{\chi}_{n,n-1} = \boldsymbol{\Psi}_{n,n-1} - \boldsymbol{\Gamma}_n$ is an $M \times M$ upper triangular Toeplitz matrix. Equation (7.31) suggests a computationally efficient, two-part echo cancellation technique. First, in the time domain, a short convolution is performed and the result subtracted from the received signals to compensate for the insufficient length of the cyclic extension. Second, in the frequency domain, cancellation of the residual echo is performed over a set of $M$ independent echo subchannels. Observing that Eq. (7.31) is equivalent to $\mathbf{Z}_n = \mathbf{X}_n - \tilde{\boldsymbol{\Psi}}_{n,n-1} \mathbf{C}_n$, where $\tilde{\boldsymbol{\Psi}}_{n,n-1} = \mathbf{F}_M \boldsymbol{\Psi}_{n,n-1} \mathbf{F}_M^{-1}$, the echo canceller adaptation by the LMS algorithm in the frequency domain takes the form

$$\mathbf{C}_{n+1} = \mathbf{C}_n + \alpha \tilde{\boldsymbol{\Psi}}^*_{n,n-1} \mathbf{Z}_n, \tag{7.32}$$

where $\alpha$ is the adaptation gain, and $\tilde{\boldsymbol{\Psi}}^*_{n,n-1}$ denotes the transpose conjugate of $\tilde{\boldsymbol{\Psi}}_{n,n-1}$. We note that, alternatively, echo canceller adaptation may also be performed by the algorithm $\mathbf{C}_{n+1} = \mathbf{C}_n + \alpha \, \mathrm{diag}(\mathbf{A}_n^*) \mathbf{Z}_n$, which entails a substantially lower computational complexity than the LMS algorithm, at the price of a slower rate of convergence.

In DMT systems it is essential that the length of the channel impulse response be much less than the number of subchannels, so that the reduction in data rate due to the cyclic extension may be considered negligible. Therefore, equalization is adopted in practice to shorten the length of the channel impulse response. From Eq. (7.31), however, we observe that transceiver complexity depends on the relative lengths of the echo and of the channel impulse responses. To reduce the length of the cyclic extension as well as the computational complexity of the echo canceller, various methods have been proposed to shorten both the channel and the echo impulse responses jointly [8].
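The frequency-domain view of DMT echo cancellation can be illustrated for the simple case $N \le L + 1$, in which the cyclic extension makes the echo of each block a circular convolution. The Python sketch below is an illustration under stated assumptions, not the scheme of [6]: the echo response `h_e`, the block length, the gain, and the symbol alphabet are hypothetical, and the received signal contains only the echo (no remote signal, no noise). It first forms the echo in the time domain, then adapts the canceller with the simplified rule $\mathbf{C}_{n+1} = \mathbf{C}_n + \alpha \, \mathrm{diag}(\mathbf{A}_n^*) \mathbf{Z}_n$.

```python
import cmath
import random

def dft(x):
    M = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / M) for m in range(M))
            for k in range(M)]

def idft(X):
    M = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * m / M) for k in range(M)) / M
            for m in range(M)]

def random_hermitian_block(M, rng):
    """Random symbol block satisfying the Hermitian symmetry conditions."""
    pm = (-1.0, 1.0)
    A = [0j] * M
    A[0] = complex(rng.choice(pm), 0.0)
    A[M // 2] = complex(rng.choice(pm), 0.0)
    for i in range(1, M // 2):
        A[i] = complex(rng.choice(pm), rng.choice(pm))
        A[M - i] = A[i].conjugate()
    return A

def received_echo(A, h_e, L):
    """Echo of one cyclically extended block, last M samples, in the DFT domain."""
    M = len(A)
    a = [v.real for v in idft(A)]
    ext = a[M - L:] + a                              # cyclic extension of length L
    tail = [sum(h_e[i] * ext[j - i] for i in range(len(h_e)))
            for j in range(L, L + M)]                # first L samples discarded
    return dft(tail)

M, L = 8, 3
h_e = [0.5, -0.3, 0.2, 0.1]                          # assumed echo response, N = 4 <= L + 1
H_e = dft(h_e + [0.0] * (M - len(h_e)))
rng = random.Random(3)
C = [0j] * M                                         # frequency-domain canceller
alpha = 0.25
for _ in range(200):
    A = random_hermitian_block(M, rng)
    U = received_echo(A, h_e, L)                     # equals diag(A) H_e here
    Z = [U[i] - A[i] * C[i] for i in range(M)]       # cancellation error per subchannel
    C = [C[i] + alpha * A[i].conjugate() * Z[i] for i in range(M)]

print(max(abs(C[i] - H_e[i]) for i in range(M)))
```

Because each subchannel is adapted independently, the coefficient error on subchannel $i$ shrinks by the factor $1 - \alpha |A_n(i)|^2$ per block in this noise-free setting. For $N > L + 1$ the time-domain correction by $\boldsymbol{\chi}_{n,n-1}$ described above would have to be added.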


7.5 Summary and Conclusions

Digital signal processing techniques for echo cancellation provide large echo attenuation, and eliminate the need for additional line interfaces and digital-to-analog and analog-to-digital converters that are required by echo cancellation in the analog signal domain. The realization of digital echo cancellers in transceivers for high-speed full-duplex data transmission today is possible at a low cost thanks to the advances in VLSI technology. Digital techniques for echo cancellation are also appropriate for near-end crosstalk cancellation in transceivers for transmission over voice-grade cables at rates of several megabits per second for local-area network applications. In voiceband modems for data transmission over the telephone network, digital techniques for echo cancellation also allow a precise tracking of the carrier phase and frequency shift of far-end echos.

References

[1] Cherubini, G., Analysis of the convergence behavior of adaptive distributed-arithmetic echo cancellers. IEEE Trans. Commun., 41(11), 1703–1714, 1993.
[2] Cherubini, G., Ölçer, S., and Ungerboeck, G., A quaternary partial-response class-IV transceiver for 125 Mbit/s data transmission over unshielded twisted-pair cables: Principles of operation and VLSI realization. IEEE J. Sel. Areas Commun., 13(9), 1656–1669, 1995.
[3] Cherubini, G., Creigh, J., Ölçer, S., Rao, S.K., and Ungerboeck, G., 100BASE-T2: A new standard for 100 Mb/s Ethernet transmission over voice-grade cables. IEEE Commun. Mag., 35(11), 115–122, 1997.
[4] Duttweiler, D.L., Adaptive filter performance with nonlinearities in the correlation multiplier. IEEE Trans. Acoust., Speech, Signal Processing, 30(8), 578–586, 1982.
[5] Falconer, D.D., Adaptive reference echo-cancellation. IEEE Trans. Commun., 30(9), 2083–2094, 1982.
[6] Ho, M., Cioffi, J.M., and Bingham, J.A.C., Discrete multitone echo cancellation. IEEE Trans. Commun., 44(7), 817–825, 1996.
[7] Lee, E.A. and Messerschmitt, D.G., Digital Communication, 2nd ed., Kluwer Academic Publishers, Boston, MA, 1994.
[8] Melsa, P.J.W., Younce, R.C., and Rohrs, C.E., Impulse response shortening for discrete multitone transceivers. IEEE Trans. Commun., 44(12), 1662–1672, 1996.
[9] Messerschmitt, D.G., Echo cancellation in speech and data transmission. IEEE J. Sel. Areas Commun., 2(2), 283–297, 1984.
[10] Messerschmitt, D.G., Design issues for the ISDN U-Interface transceiver. IEEE J. Sel. Areas Commun., 4(8), 1281–1293, 1986.
[11] Ruiz, A., Cioffi, J.M., and Kasturia, S., Discrete multiple tone modulation with coset coding for the spectrally shaped channel. IEEE Trans. Commun., 40(6), 1012–1029, 1992.
[12] Smith, M.J., Cowan, C.F.N., and Adams, P.F., Nonlinear echo cancellers based on transpose distributed arithmetic. IEEE Trans. Circuits and Systems, 35(1), 6–18, 1988.
[13] Ungerboeck, G., Theory on the speed of convergence in adaptive equalizers for digital communication. IBM J. Res. Develop., 16(6), 546–555, 1972.
[14] Weinstein, S.B., A passband data-driven echo-canceller for full-duplex transmission on two-wire circuits. IEEE Trans. Commun., 25(7), 654–666, 1977.


Further Information

For further information on adaptive transversal filters with application to echo cancellation, see Adaptive Filters: Structures, Algorithms, and Applications, M.L. Honig and D.G. Messerschmitt, Kluwer, 1984.


Helleseth, T. & Kumar, P.V. “Pseudonoise Sequences” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Pseudonoise Sequences

Tor Helleseth, University of Bergen
P. Vijay Kumar, University of Southern California

8.1 Introduction
8.2 m Sequences
8.3 The q-ary Sequences with Low Autocorrelation
8.4 Families of Sequences with Low Crosscorrelation
    Gold and Kasami Sequences • Quaternary Sequences with Low Crosscorrelation • Binary Kerdock Sequences
8.5 Aperiodic Correlation
    Barker Sequences • Sequences with High Merit Factor • Sequences with Low Aperiodic Crosscorrelation
8.6 Other Correlation Measures
    Partial-Period Correlation • Mean Square Correlation • Optical Orthogonal Codes
Defining Terms
References
Further Information

8.1 Introduction

Pseudonoise sequences (PN sequences), also referred to as pseudorandom sequences, are sequences that are deterministically generated and yet possess some properties that one would expect to find in randomly generated sequences. Applications of PN sequences include signal synchronization, navigation, radar ranging, random number generation, spread-spectrum communications, multipath resolution, cryptography, and signal identification in multiple-access communication systems.

The correlation between two sequences {x(t)} and {y(t)} is the complex inner product of the first sequence with a shifted version of the second sequence. The correlation is called 1) an autocorrelation if the two sequences are the same, 2) a crosscorrelation if they are distinct, 3) a periodic correlation if the shift is a cyclic shift, 4) an aperiodic correlation if the shift is not cyclic, and 5) a partial-period correlation if the inner product involves only a partial segment of the two sequences. More precise definitions are given subsequently.

Binary m sequences, defined in the next section, are perhaps the best-known family of PN sequences. The balance, run-distribution, and autocorrelation properties of these sequences mimic those of random sequences. It is perhaps the random-like correlation properties of PN sequences that make them most attractive in a communications system, and it is common to refer to any collection of low-correlation sequences as a family of PN sequences.

Section 8.2 begins by discussing m sequences. Thereafter, the discussion continues with a description of sequences satisfying various correlation constraints along the lines of the accompanying self-explanatory figure, Fig. 8.1. Expanded tutorial discussions on pseudorandom sequences may be found in [14], in [15, Chapter 5] and in [6].

8.2 m Sequences

A binary {0, 1} shift-register sequence {s(t)} is a sequence that satisfies a linear recurrence relation of the form

$$\sum_{i=0}^{r} f_i \, s(t + i) = 0, \qquad \text{for all } t \ge 0, \tag{8.1}$$

where $r \ge 1$ is the degree of the recursion; the coefficients $f_i$ belong to the finite field $GF(2) = \{0, 1\}$, where the leading coefficient $f_r = 1$. Thus, both sequences {a(t)} and {b(t)} appearing in Fig. 8.2 are shift-register sequences. A sequence satisfying a recursion of the form in Eq. (8.1) is said to have characteristic polynomial $f(x) = \sum_{i=0}^{r} f_i x^i$. Thus, {a(t)} and {b(t)} have characteristic polynomials given by $f(x) = x^3 + x + 1$ and $f(x) = x^3 + x^2 + 1$, respectively.
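The recursion of Eq. (8.1) translates directly into a software shift register. The Python sketch below is an illustration with a hypothetical function name; it solves Eq. (8.1) for $s(t + r)$ using $f_r = 1$ and generates the sequences belonging to the two characteristic polynomials above, both started, by assumption, from the state 0 0 1.

```python
def shift_register_sequence(f, initial_state, length):
    """Generate s(0), ..., s(length - 1) satisfying Eq. (8.1) over GF(2).

    f = [f_0, ..., f_r] with f_r = 1; initial_state = [s(0), ..., s(r - 1)].
    """
    r = len(f) - 1
    s = list(initial_state)
    while len(s) < length:
        t = len(s) - r
        # With f_r = 1, Eq. (8.1) gives s(t + r) = sum_{i < r} f_i s(t + i) (mod 2)
        s.append(sum(f[i] * s[t + i] for i in range(r)) % 2)
    return s[:length]

a = shift_register_sequence([1, 1, 0, 1], [0, 0, 1], 14)   # f(x) = x^3 + x + 1
b = shift_register_sequence([1, 0, 1, 1], [0, 0, 1], 14)   # f(x) = x^3 + x^2 + 1
print("".join(map(str, a)))   # 00101110010111
print("".join(map(str, b)))   # 00111010011101
```

Both outputs repeat with period 7, the largest period possible for a recursion of degree 3.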

FIGURE 8.1: Overview of pseudonoise sequences.

Since an $r$-bit binary shift register can assume a maximum of $2^r$ different states, it follows that every shift-register sequence {s(t)} is eventually periodic with period $n \le 2^r$, i.e.,

$$s(t) = s(t + n), \qquad \text{for all } t \ge N,$$

for some integer $N$. In fact, the maximum period of a shift-register sequence is $2^r - 1$, since a shift register that enters the all-zero state will remain forever in that state. The upper shift register in Fig. 8.2 when initialized with starting state 0 0 1 generates the periodic sequence {a(t)} given by

$$0010111 \;\; 0010111 \;\; 0010111 \;\; \cdots \tag{8.2}$$

of period $n = 7$. It follows then that this shift register generates sequences of maximal period starting from any nonzero initial state. An m sequence is simply a binary shift-register sequence having maximal period. For every $r \ge 1$, m sequences are known to exist. The periodic autocorrelation function $\theta_s$ of a binary {0, 1} sequence {s(t)} of period $n$ is defined by

$$\theta_s(\tau) = \sum_{t=0}^{n-1} (-1)^{s(t+\tau) - s(t)}, \qquad 0 \le \tau \le n - 1.$$

FIGURE 8.2: An example Gold sequence generator. Here {a(t)} and {b(t)} are m sequences of length 7.

An m sequence of length $2^r - 1$ has the following attributes.

1) Balance property: in each period of the m sequence there are $2^{r-1}$ ones and $2^{r-1} - 1$ zeros.
2) Run property: every nonzero binary $s$-tuple, $s \le r$, occurs $2^{r-s}$ times; the all-zero $s$-tuple occurs $2^{r-s} - 1$ times.
3) Two-level autocorrelation function:

$$\theta_s(\tau) = \begin{cases} n & \text{if } \tau = 0 \\ -1 & \text{if } \tau \ne 0 \end{cases} \tag{8.3}$$

The first two properties follow immediately from the observation that every nonzero $r$-tuple occurs precisely once in each period of the m sequence. For the third property, consider the difference sequence $\{s(t + \tau) - s(t)\}$ for $\tau \ne 0$. This sequence satisfies the same recursion as the m sequence {s(t)} and is clearly not the all-zero sequence. It follows, therefore, that $\{s(t + \tau) - s(t)\} \equiv \{s(t + \tau')\}$ for some $\tau'$, $0 \le \tau' \le n - 1$, i.e., is a different cyclic shift of the m sequence {s(t)}. The balance property of the sequence $\{s(t + \tau')\}$ then gives us attribute 3. The m sequence {a(t)} in Eq. (8.2) can be seen to have the three listed properties.

If {s(t)} is any sequence of period $n$ and $d$ is an integer, $1 \le d \le n$, then the mapping $\{s(t)\} \to \{s(dt)\}$ is referred to as a decimation of {s(t)} by the integer $d$. If {s(t)} is an m sequence of period $n = 2^r - 1$ and $d$ is an integer relatively prime to $2^r - 1$, then the decimated sequence {s(dt)} clearly also has period $n$. Interestingly, it turns out that the sequence {s(dt)} is always also an m sequence of the same period. For example, when {a(t)} is the sequence in Eq. (8.2), then

$$a(3t) = 0011101 \;\; 0011101 \;\; 0011101 \;\; \cdots \tag{8.4}$$

and

$$a(2t) = 0111001 \;\; 0111001 \;\; 0111001 \;\; \cdots \tag{8.5}$$

The sequence {a(3t)} is also an m sequence of period 7, since it satisfies the recursion

$$s(t + 3) + s(t + 2) + s(t) = 0 \qquad \text{for all } t$$

of degree $r = 3$. In fact {a(3t)} is precisely the sequence labeled {b(t)} in Fig. 8.2. The sequence {a(2t)} is simply a cyclically shifted version of {a(t)} itself; this property holds in general. If {s(t)} is any m sequence of period $2^r - 1$, then {s(2t)} will always be a shifted version of the same m sequence. Clearly, the same is true for decimations by any power of 2.

Starting from an m sequence of period $2^r - 1$, it turns out that one can generate all m sequences of the same period through decimations by integers $d$ relatively prime to $2^r - 1$. The set of integers $d$, $1 \le d \le 2^r - 1$, satisfying $\gcd(d, 2^r - 1) = 1$ forms a group under multiplication modulo $2^r - 1$, with the powers $\{2^i \mid 0 \le i \le r - 1\}$ of 2 forming a subgroup of order $r$. Since decimation by a power of 2 yields a shifted version of the same m sequence, it follows that the number of distinct m sequences of period $2^r - 1$ is $\phi(2^r - 1)/r$, where $\phi(n)$ denotes the number of integers $d$, $1 \le d \le n$, relatively prime to $n$. For example, when $r = 3$, there are just two cyclically distinct m sequences of period 7, and these are precisely the sequences {a(t)} and {b(t)} discussed in the preceding paragraph. Tables provided in [12] can be used to determine the characteristic polynomial of the various m sequences obtainable through the decimation of a single given m sequence. The classical reference on m sequences is [4].

If one obtains a sequence of some large length $n$ by repeatedly tossing an unbiased coin, then such a sequence will very likely satisfy the balance, run, and autocorrelation properties of an m sequence of comparable length. For this reason, it is customary to regard the extent to which a given sequence possesses these properties as a measure of randomness of the sequence. Quite apart from this, in many applications such as signal synchronization and radar ranging, it is desirable to have sequences {s(t)} with low autocorrelation sidelobes, i.e., $|\theta_s(\tau)|$ is small for $\tau \ne 0$.
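The properties listed above are easy to confirm for the length-7 example. The following Python sketch, with hypothetical helper names, checks the balance property and the two-level autocorrelation of {a(t)}, forms the decimations by 2 and 3, and counts the cyclically distinct m sequences for $r = 3$.

```python
from math import gcd

a = [0, 0, 1, 0, 1, 1, 1]            # one period of the m sequence in Eq. (8.2)

def autocorrelation(s, tau):
    """Periodic autocorrelation of a binary {0, 1} sequence at shift tau."""
    n = len(s)
    # (-1)^(s(t+tau) - s(t)) is +1 when the bits agree and -1 otherwise
    return sum(1 - 2 * (s[(t + tau) % n] ^ s[t]) for t in range(n))

def decimate(s, d):
    """One period of the decimated sequence {s(dt)}."""
    n = len(s)
    return [s[(d * t) % n] for t in range(n)]

def is_cyclic_shift(x, y):
    return any(x == y[k:] + y[:k] for k in range(len(y)))

def euler_phi(m):
    return sum(1 for d in range(1, m + 1) if gcd(d, m) == 1)

r = 3
print(sum(a))                                     # 4 ones, i.e., 2^(r-1)
print([autocorrelation(a, tau) for tau in range(7)])
print(decimate(a, 3), decimate(a, 2))
print(euler_phi(2 ** r - 1) // r)                 # number of distinct m sequences
```

The decimation by 2 returns a cyclic shift of {a(t)}, whereas the decimation by 3 returns the other m sequence of period 7.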
Whereas m sequences are a prime example, there exist other methods of constructing binary sequences with low out-of-phase autocorrelation. Sequences {s(t)} of period $n$ having an autocorrelation function identical to that of an m sequence, i.e., having $\theta_s$ satisfying Eq. (8.3), correspond to well-studied combinatorial objects known as cyclic Hadamard difference sets. Known infinite families fall into three classes: 1) Singer and Gordon, Mills and Welch, 2) quadratic residue, and 3) twin-prime difference sets. These correspond, respectively, to sequences of period $n$ of the form $n = 2^r - 1$, $r \ge 1$; $n$ prime; and $n = p(p + 2)$ with both $p$ and $p + 2$ being prime in the last case. For a detailed treatment of cyclic difference sets, see [2]. A recent observation by Maschietti in [9] provides additional families of cyclic Hadamard difference sets that also correspond to sequences of period $n = 2^r - 1$.
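As a small illustration of the quadratic residue class, the Python sketch below builds the binary sequence whose ones mark the nonzero squares modulo a prime $n$ and evaluates its periodic autocorrelation. It assumes the usual additional condition that $n$ is congruent to 3 modulo 4, which the text above does not state explicitly; the function names are hypothetical.

```python
def quadratic_residue_sequence(p):
    """s(t) = 1 if t is a nonzero square modulo p, else 0.

    Assumes p is a prime with p % 4 == 3.
    """
    squares = {(x * x) % p for x in range(1, p)}
    return [1 if t in squares else 0 for t in range(p)]

def autocorrelation(s, tau):
    """Periodic autocorrelation of a binary {0, 1} sequence at shift tau."""
    n = len(s)
    return sum(1 - 2 * (s[(t + tau) % n] ^ s[t]) for t in range(n))

for p in (7, 11, 19):
    s = quadratic_residue_sequence(p)
    print(p, [autocorrelation(s, tau) for tau in range(p)])
```

For each of these primes the out-of-phase values all equal -1, as required by Eq. (8.3).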

8.3 The q-ary Sequences with Low Autocorrelation

As defined earlier, the autocorrelation of a binary {0, 1} sequence {s(t)} leads to the computation of the inner product of a {−1, +1} sequence $\{(-1)^{s(t)}\}$ with a cyclically shifted version $\{(-1)^{s(t+\tau)}\}$ of itself. The {−1, +1} sequence is transmitted as a phase shift of either 0° or 180° of a radio-frequency carrier, i.e., using binary phase-shift keying (PSK) modulation. If the modulation is $q$-ary PSK, then one is led to consider sequences {s(t)} with symbols in the set $Z_q$, i.e., the set of integers modulo $q$. The relevant autocorrelation function $\theta_s(\tau)$ is now defined by

$$\theta_s(\tau) = \sum_{t=0}^{n-1} \omega^{s(t+\tau) - s(t)},$$

where $n$ is the period of {s(t)} and $\omega$ is a complex primitive $q$th root of unity. It is possible to construct sequences {s(t)} over $Z_q$ whose autocorrelation function satisfies

$$\theta_s(\tau) = \begin{cases} n & \text{if } \tau = 0 \\ 0 & \text{if } \tau \ne 0. \end{cases}$$

For obvious reasons, such sequences are said to have an ideal autocorrelation function. We provide without proof two sample constructions. The sequences in the first construction are given by

$$s(t) = \begin{cases} t^2/2 \pmod{n} & \text{when } n \text{ is even} \\ t(t+1)/2 \pmod{n} & \text{when } n \text{ is odd.} \end{cases}$$

Thus, this construction provides sequences with ideal autocorrelation for any period $n$. Note that the size $q$ of the sequence symbol alphabet equals $n$ when $n$ is odd and $2n$ when $n$ is even.

The second construction also provides sequences over $Z_q$ of period $n$ but requires that $n$ be a perfect square. Let $n = r^2$ and let $\pi$ be an arbitrary permutation of the elements in the subset $\{0, 1, 2, \ldots, r - 1\}$ of $Z_n$. Let $g$ be an arbitrary function defined on the subset $\{0, 1, 2, \ldots, r - 1\}$ of $Z_n$. Then any sequence of the form

$$s(t) = r t_1 \pi(t_2) + g(t_2) \pmod{n},$$

where $t = r t_1 + t_2$ with $0 \le t_1, t_2 \le r - 1$ is the base-$r$ decomposition of $t$, has an ideal autocorrelation function. When the alphabet size $q$ equals or divides the period $n$ of the sequence, ideal-autocorrelation sequences also go by the name generalized bent functions. For details, see [6].
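Both constructions can be checked numerically for small periods. The Python sketch below is illustrative only: the function names, the permutation `[2, 0, 1]`, and the function values `[0, 1, 5]` are arbitrary assumptions. The symbol transmitted at time $t$ is taken to be $e^{j2\pi s(t)/n}$, which for even $n$ in the first construction corresponds to an alphabet of size $2n$.

```python
import cmath

def autocorr_from_phases(phases, n):
    """Periodic autocorrelation of the sequence exp(j*2*pi*s(t)/n), all shifts."""
    x = [cmath.exp(2j * cmath.pi * p / n) for p in phases]
    return [sum(x[(t + tau) % n] * x[t].conjugate() for t in range(n))
            for tau in range(n)]

def first_construction(n):
    """s(t) = t^2/2 (mod n) for even n, t(t+1)/2 (mod n) for odd n."""
    if n % 2 == 0:
        return [(t * t / 2) % n for t in range(n)]      # may be a half-integer
    return [(t * (t + 1) // 2) % n for t in range(n)]

def second_construction(r, perm, g):
    """s(t) = r*t1*perm[t2] + g[t2] (mod r^2), with t = r*t1 + t2."""
    n = r * r
    return [(r * t1 * perm[t2] + g[t2]) % n
            for t1 in range(r) for t2 in range(r)]

for n in (8, 9):
    theta = autocorr_from_phases(first_construction(n), n)
    print(n, [round(abs(v), 9) for v in theta])

theta = autocorr_from_phases(second_construction(3, [2, 0, 1], [0, 1, 5]), 9)
print([round(abs(v), 9) for v in theta])
```

In each case the magnitude at zero shift equals the period and all other magnitudes vanish up to rounding error.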

8.4 Families of Sequences with Low Crosscorrelation

Given two sequences $\{s_1(t)\}$ and $\{s_2(t)\}$ over $Z_q$ of period n, their crosscorrelation function $\theta_{1,2}(\tau)$ is defined by

$$\theta_{1,2}(\tau) = \sum_{t=0}^{n-1} \omega^{s_1(t+\tau)-s_2(t)}$$

where ω is a primitive qth root of unity. The crosscorrelation function is important in code-division multiple-access (CDMA) communication systems. Here, each user is assigned a distinct signature sequence and, to minimize interference due to the other users, it is desirable that the signature sequences have pairwise low values of the crosscorrelation function. To provide the system in addition with a self-synchronizing capability, it is desirable that the signature sequences have low values of the autocorrelation function as well.

Let $F = \{\{s_i(t)\} \mid 1 \le i \le M\}$ be a family of M sequences $\{s_i(t)\}$ over $Z_q$, each of period n. Let $\theta_{i,j}(\tau)$ denote the crosscorrelation between the ith and jth sequence at shift τ, i.e.,

$$\theta_{i,j}(\tau) = \sum_{t=0}^{n-1} \omega^{s_i(t+\tau)-s_j(t)}, \qquad 0 \le \tau \le n-1$$

The classical goal in sequence design for CDMA systems has been minimization of the parameter

$$\theta_{\max} = \max\{\, |\theta_{i,j}(\tau)| : \text{either } i \neq j \text{ or } \tau \neq 0 \,\}$$

for fixed n and M. It should be noted, though, that in practice, because of data modulation, the correlations that one runs into are typically of an aperiodic rather than a periodic nature (see Section 8.5). The problem of designing for low aperiodic correlation, however, is a more difficult one. A typical approach, therefore, has been to design based on periodic correlation, and then to analyze the resulting design for its aperiodic correlation properties. Again, in many practical systems, the mean square correlation properties are of greater interest than the worst-case correlation represented by a parameter such as $\theta_{\max}$. The mean square correlation is discussed in Section 8.6.

Bounds on the minimum possible value of θmax for given period n, family size M, and alphabet size q are available that can be used to judge the merits of a particular sequence design. The most efficient bounds are those due to Welch, Sidelnikov, and Levenshtein; see [6]. In CDMA systems, there is greatest interest in designs in which the parameter θmax is in the range $\sqrt{n} \le \theta_{\max} \le 2\sqrt{n}$. Accordingly, Table 8.1 uses the Welch, Sidelnikov, and Levenshtein bounds to provide an order-of-magnitude upper bound on the family size M for certain θmax in the cited range. Practical considerations dictate that q be small. The bit-oriented nature of electronic hardware makes it preferable to have q a power of 2. With this in mind, some efficient sequence families having low auto- and crosscorrelation values and alphabet sizes q = 2 and q = 4 are described next.

TABLE 8.1 Bounds on Family Size M for Given n, θmax

θmax      Upper bound on M, q = 2    Upper bound on M, q > 2
√n        n/2                        n
√(2n)     n                          n²/2
2√n       3n²/10                     n³/2

8.4.1 Gold and Kasami Sequences

Given the low autocorrelation sidelobes of an m sequence, it is natural to attempt to construct families of low correlation sequences starting from m sequences. Two of the better known constructions of this type are the families of Gold and Kasami sequences.

Let r be odd and $d = 2^k + 1$ where k, $1 \le k \le r-1$, is an integer satisfying $(k, r) = 1$. Let $\{s(t)\}$ be a cyclic shift of an m sequence of period $n = 2^r - 1$ that satisfies $s(dt) \not\equiv 0$ and let G be the Gold family of $2^r + 1$ sequences given by

$$G = \{s(t)\} \cup \{s(dt)\} \cup \{\{s(t) + s(d[t+\tau])\} \mid 0 \le \tau \le n-1\}$$

Then each sequence in G has period $2^r - 1$ and the maximum-correlation parameter θmax of G satisfies

$$\theta_{\max} \le \sqrt{2^{r+1}} + 1$$

An application of the Sidelnikov bound coupled with the information that θmax must be an odd integer yields that for the family G, θmax is as small as it can possibly be. In this sense the family G is an optimal family. We remark that these comments remain true even when d is replaced by the integer $d = 2^{2k} - 2^k + 1$ with the conditions on k remaining unchanged. The Gold family remains the best-known family of m sequences having low crosscorrelation. Applications include the Navstar Global Positioning System, whose signals are based on Gold sequences.

The family of Kasami sequences has a similar description. Let $r = 2v$ and $d = 2^v + 1$. Let $\{s(t)\}$ be a cyclic shift of an m sequence of period $n = 2^r - 1$ that satisfies $s(dt) \not\equiv 0$, and consider the family of Kasami sequences given by

$$K = \{s(t)\} \cup \{\{s(t) + s(d[t+\tau])\} \mid 0 \le \tau \le 2^v - 2\}$$

Then the Kasami family K contains $2^v$ sequences of period $2^r - 1$. It can be shown that in this case

$$\theta_{\max} = 1 + 2^v$$

This time an application of the Welch bound and the fact that θmax is an integer shows that the Kasami family is optimal in terms of having the smallest possible value of θmax for given n and M.
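As an illustration, the Gold and Kasami constructions can be reproduced for small parameters. This is a sketch under stated assumptions rather than anything given in the source: the primitive polynomials $x^5 + x^2 + 1$ and $x^4 + x + 1$, the initial register state, and all function names are choices made here for the example.

```python
def m_sequence(r, taps):
    """One period of s(t+r) = sum of s(t+i) for i in taps (mod 2)."""
    s = [1] + [0] * (r - 1)              # any nonzero initial state
    for t in range(2 ** r - 1 - r):
        s.append(sum(s[t + i] for i in taps) % 2)
    return s

def build_family(s, d, num_shifts, include_decimation):
    """{s(t)}, optionally {s(dt)}, and {s(t) + s(d[t+tau])} for tau < num_shifts."""
    n = len(s)
    dec = [s[(d * t) % n] for t in range(n)]          # s(dt)
    assert any(dec), "s(dt) must not be identically zero"
    family = [s] + ([dec] if include_decimation else [])
    for tau in range(num_shifts):
        family.append([(s[t] + dec[(t + tau) % n]) % 2 for t in range(n)])
    return family

def theta_max(family):
    """Largest |periodic correlation| over all pairs, excluding in-phase autocorrelation."""
    n = len(family[0])
    pm = [[1 - 2 * b for b in seq] for seq in family]  # map 0/1 to +1/-1
    best = 0
    for i, x in enumerate(pm):
        for j, y in enumerate(pm):
            for tau in range(n):
                if i == j and tau == 0:
                    continue
                c = sum(x[(t + tau) % n] * y[t] for t in range(n))
                best = max(best, abs(c))
    return best

# Gold: r = 5, k = 1, d = 2^k + 1 = 3, recursion from x^5 + x^2 + 1
gold = build_family(m_sequence(5, taps=(0, 2)), d=3, num_shifts=31,
                    include_decimation=True)
# Kasami: r = 4, v = 2, d = 2^v + 1 = 5, recursion from x^4 + x + 1
kasami = build_family(m_sequence(4, taps=(0, 1)), d=5, num_shifts=3,
                      include_decimation=False)
gold_theta, kasami_theta = theta_max(gold), theta_max(kasami)
print(len(gold), gold_theta)      # family size 2^5 + 1; bound sqrt(2^6) + 1 = 9
print(len(kasami), kasami_theta)  # family size 2^2; theta_max = 1 + 2^2 = 5
```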

8.4.2 Quaternary Sequences with Low Crosscorrelation

The entries in Table 8.1 suggest that nonbinary (i.e., q > 2) designs may be used for improved performance. A family of quaternary sequences that outperforms the Gold and Kasami sequences is discussed next.

Let f(x) be the characteristic polynomial of a binary m sequence of length $2^r - 1$ for some integer r. The coefficients of f(x) are either 0 or 1. Now, regard f(x) as a polynomial over $Z_4$ and form the product $(-1)^r f(x) f(-x)$. This can be seen to be a polynomial in $x^2$. Define the polynomial g(x) of degree r by setting $g(x^2) = (-1)^r f(x) f(-x)$. Let $g(x) = \sum_{i=0}^{r} g_i x^i$ and consider the set of all quaternary sequences $\{a(t)\}$ satisfying the recursion $\sum_{i=0}^{r} g_i a(t+i) = 0$ for all t.

It turns out that with the exception of the all-zero sequence, all of the sequences generated in this way have period $2^r - 1$. Thus, the recursion generates a family A of $2^r + 1$ cyclically distinct quaternary sequences. Closer study reveals that the maximum correlation parameter θmax of this family satisfies $\theta_{\max} \le 1 + \sqrt{2^r}$. Thus, in comparison to the family of Gold sequences, the family A offers a lower value of θmax (by a factor of $\sqrt{2}$) for the same family size. In comparison to the set of Kasami sequences, it offers a much larger family size for the same bound on θmax. Family A sequences are discussed in [16, 3].

We illustrate with an example. Let $f(x) = x^3 + x + 1$ be the characteristic polynomial of the m sequence $\{a(t)\}$ in Eq. (8.1). Then over $Z_4$

$$g(x^2) = (-1)^3 f(x) f(-x) = x^6 + 2x^4 + x^2 + 3$$

so that $g(x) = x^3 + 2x^2 + x + 3$. Thus, the sequences in family A are generated by the recursion

$$s(t+3) + 2s(t+2) + s(t+1) + 3s(t) = 0 \pmod 4$$

The corresponding shift register is shown in Fig. 8.3. By varying initial conditions, this shift register can be made to generate nine cyclically distinct sequences, each of length 7. In this case $\theta_{\max} \le 1 + \sqrt{8}$.

FIGURE 8.3: Shift register that generates family A quaternary sequences {s(t)} of period 7.
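The family A example can be reproduced by running the recursion for every nonzero initial state and grouping the outputs into cyclic classes. The sketch below is illustrative only: the helper names are hypothetical, the coefficient tuple encodes $g(x) = x^3 + 2x^2 + x + 3$ from the example, and the correlation uses $\omega = i$.

```python
from itertools import product
from math import sqrt

G = (3, 1, 2, 1)   # g_0..g_3 of g(x) = x^3 + 2x^2 + x + 3

def generate(init, g, length):
    """Run sum_i g_i s(t+i) = 0 (mod 4) forward from the initial state."""
    r = len(g) - 1
    s = list(init)
    while len(s) < length:
        t = len(s) - r
        s.append(-sum(g[i] * s[t + i] for i in range(r)) % 4)
    return s

def cyclic_classes(g, period):
    """One representative per cyclic-shift class of the nonzero sequences."""
    r = len(g) - 1
    reps = set()
    for init in product(range(4), repeat=r):
        if any(init):
            s = generate(init, g, period + r)
            assert tuple(s[period:]) == init       # the state repeats after one period
            s = s[:period]
            reps.add(min(tuple(s[k:] + s[:k]) for k in range(period)))
    return sorted(reps)

def theta_max(family):
    """Largest |correlation| with omega = i, excluding in-phase autocorrelation."""
    n = len(family[0])
    w = (1, 1j, -1, -1j)                           # powers of omega
    best = 0.0
    for i, x in enumerate(family):
        for j, y in enumerate(family):
            for tau in range(n):
                if i == j and tau == 0:
                    continue
                c = sum(w[(x[(t + tau) % n] - y[t]) % 4] for t in range(n))
                best = max(best, abs(c))
    return best

family_a = cyclic_classes(G, 7)
family_a_theta = theta_max(family_a)
print(len(family_a))                               # 2^3 + 1 = 9 classes
print(family_a_theta <= 1 + sqrt(8) + 1e-9)
```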

8.4.3 Binary Kerdock Sequences

The Gold and Kasami families of sequences are closely related to binary linear cyclic codes. It is well known in coding theory that there exist nonlinear binary codes whose performance exceeds that of the best possible linear code. Surprisingly, some of these examples come from binary codes that are images of linear quaternary (q = 4) codes under the Gray map: 0 → 00, 1 → 01, 2 → 11, 3 → 10. A prime example of this is the Kerdock code, which recently has been shown to be the Gray image of a quaternary linear code. Thus, it is not surprising that the Kerdock code yields binary sequences that significantly outperform the family of Kasami sequences.

The Kerdock sequences may be constructed as follows: let f(x) be the characteristic polynomial of an m sequence of period $2^r - 1$, r odd. As before, regarding f(x) as a polynomial over $Z_4$ (which happens to have {0, 1} coefficients), let the polynomial g(x) over $Z_4$ be defined via $g(x^2) = -f(x) f(-x)$. [Thus, g(x) is the characteristic polynomial of a family A sequence set of period $2^r - 1$.] Set $h(x) = -g(-x) = \sum_{i=0}^{r} h_i x^i$, and let S be the set of all $Z_4$ sequences satisfying the recursion $\sum_{i=0}^{r} h_i s(t+i) = 0$. Then S contains $4^r$ distinct sequences corresponding to all possible distinct initializations of the shift register.

Let T denote the subset of S of size $2^r$ consisting of those sequences corresponding to initializations of the shift register using only the symbols 0 and 2 in $Z_4$. Then the set S − T of size $4^r - 2^r$ contains a set U of $2^{r-1}$ cyclically distinct sequences, each of period $2(2^r - 1)$. Given $x = a + 2b \in Z_4$ with $a, b \in \{0, 1\}$, let µ denote the most significant bit (MSB) map $\mu(x) = b$. Let $K_E$ denote the family of $2^{r-1}$ binary sequences obtained by applying the map µ to each sequence in U. It turns out that each sequence in $K_E$ also has period $2(2^r - 1)$ and that, furthermore, for the family $K_E$, $\theta_{\max} \le 2 + \sqrt{2^{r+1}}$. Thus, $K_E$ is a much larger family than the Kasami family, while having almost exactly the same value of θmax.

For example, taking r = 3 and $f(x) = x^3 + x + 1$, we have from the previous family A example that $g(x) = x^3 + 2x^2 + x + 3$, so that $h(x) = -g(-x) = x^3 + 2x^2 + x + 1$. Applying the MSB map to the head of the shift register, and discarding initializations of the shift register involving only 0's and 2's, yields a family of four cyclically distinct binary sequences of period 14. Kerdock sequences are discussed in [6, 11, 1, 17].
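The r = 3 example can be traced in code. The following sketch is an illustration with hypothetical helper names; it runs the recursion defined by $h(x) = x^3 + 2x^2 + x + 1$ over $Z_4$, drops the initializations that use only 0 and 2, keeps one representative per cyclic class, and applies the MSB map.

```python
from itertools import product

H = (1, 1, 2, 1)   # h_0..h_3 of h(x) = x^3 + 2x^2 + x + 1

def generate(init, h, length):
    """Run sum_i h_i s(t+i) = 0 (mod 4) forward from the initial state."""
    r = len(h) - 1
    s = list(init)
    while len(s) < length:
        t = len(s) - r
        s.append(-sum(h[i] * s[t + i] for i in range(r)) % 4)
    return s

def kerdock_family(h, period):
    """MSB images of one representative per cyclic class of S - T."""
    r = len(h) - 1
    reps = set()
    for init in product(range(4), repeat=r):
        if any(v % 2 for v in init):               # skip states using only 0 and 2
            s = generate(init, h, period + r)
            assert tuple(s[period:]) == init       # the state repeats after `period`
            s = s[:period]
            reps.add(min(tuple(s[k:] + s[:k]) for k in range(period)))
    return [[v >> 1 for v in rep] for rep in sorted(reps)]   # mu(a + 2b) = b

def theta_max(family):
    """Largest |periodic correlation| of the +1/-1 images, excluding in-phase autocorrelation."""
    n = len(family[0])
    pm = [[1 - 2 * b for b in seq] for seq in family]
    best = 0
    for i, x in enumerate(pm):
        for j, y in enumerate(pm):
            for tau in range(n):
                if i == j and tau == 0:
                    continue
                c = sum(x[(t + tau) % n] * y[t] for t in range(n))
                best = max(best, abs(c))
    return best

k_e = kerdock_family(H, 14)
k_e_theta = theta_max(k_e)
print(len(k_e), k_e_theta)   # compare with 2^(r-1) = 4 and the bound 2 + sqrt(2^4) = 6
```

Because the bound is well below the period 14, meeting it also confirms that the four binary sequences are cyclically distinct and have full period.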

8.5 Aperiodic Correlation

Let $\{x(t)\}$ and $\{y(t)\}$ be complex-valued sequences of length (or period) n, not necessarily distinct. Their aperiodic correlation values $\{\rho_{x,y}(\tau) \mid -(n-1) \le \tau \le n-1\}$ are given by

$$\rho_{x,y}(\tau) = \sum_{t=\max\{0,-\tau\}}^{\min\{n-1,\,n-1-\tau\}} x(t+\tau)\, y^*(t)$$

where $y^*(t)$ denotes the complex conjugate of y(t). When x ≡ y, we will abbreviate and write $\rho_x$ in place of $\rho_{x,y}$. The sequences described next are perhaps the most famous example of sequences with low aperiodic autocorrelation values.

8.5.1 Barker Sequences

A binary {−1, +1} sequence $\{s(t)\}$ of length n is said to be a Barker sequence if the aperiodic autocorrelation values $\rho_s(\tau)$ satisfy $|\rho_s(\tau)| \le 1$ for all τ ≠ 0, −(n − 1) ≤ τ ≤ n − 1. The Barker property is preserved under the following transformations:

$$s(t) \to -s(t), \qquad s(t) \to (-1)^t s(t), \qquad s(t) \to s(n-1-t)$$

as well as under compositions of the preceding transformations. Only the following Barker sequences are known:

n = 2    + +
n = 3    + + −
n = 4    + + + −
n = 5    + + + − +
n = 7    + + + − − + −
n = 11   + + + − − − + − − + −
n = 13   + + + + + − − + + − + − +

where + denotes +1 and − denotes −1, and further sequences are generated from these via the transformations already discussed. It is known that if any other Barker sequence exists, it must have a length n > 1,898,884 that is a multiple of 4. For an upper bound on the maximum out-of-phase aperiodic autocorrelation of an m sequence, see [13].
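The Barker property of the listed sequences, and its invariance under the three transformations, can be confirmed directly from the definition of aperiodic correlation. This is a small sketch with hypothetical names; since the sequences are real, the complex conjugate is omitted.

```python
def aperiodic_correlation(x, y, tau):
    """rho_{x,y}(tau) for real sequences of equal length n."""
    n = len(x)
    lo, hi = max(0, -tau), min(n - 1, n - 1 - tau)
    return sum(x[t + tau] * y[t] for t in range(lo, hi + 1))

def is_barker(s):
    """True if every out-of-phase aperiodic autocorrelation has magnitude <= 1."""
    n = len(s)
    return all(abs(aperiodic_correlation(s, s, tau)) <= 1
               for tau in range(-(n - 1), n) if tau != 0)

BARKER = ["++", "++-", "+++-", "+++-+", "+++--+-",
          "+++---+--+-", "+++++--++-+-+"]
sequences = [[1 if c == "+" else -1 for c in word] for word in BARKER]

def transforms(s):
    """The three property-preserving transformations from the text."""
    n = len(s)
    return [[-v for v in s],
            [(-1) ** t * s[t] for t in range(n)],
            [s[n - 1 - t] for t in range(n)]]

print(all(is_barker(s) for s in sequences))
print(all(is_barker(u) for s in sequences for u in transforms(s)))
```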

8.5.2 Sequences with High Merit Factor

The merit factor F of a {−1, +1} sequence $\{s(t)\}$ is defined by

$$F = \frac{n^2}{2 \sum_{\tau=1}^{n-1} \rho_s^2(\tau)}$$

Since $\rho_s(\tau) = \rho_s(-\tau)$ for 1 ≤ |τ| ≤ n − 1 and $\rho_s(0) = n$, the factor F may be regarded as the ratio of the square of the in-phase autocorrelation to the sum of the squares of the out-of-phase aperiodic autocorrelation values. Thus, the merit factor is one measure of the aperiodic autocorrelation properties of a binary {−1, +1} sequence. It is also closely connected with the signal to self-generated noise ratio of a communication system in which coded pulses are transmitted and received.

Let $F_n$ denote the largest merit factor of any binary {−1, +1} sequence of length n. For example, at length n = 13, the Barker sequence of length 13 has a merit factor $F = F_{13} = 14.08$. Assuming a certain ergodicity postulate, it was established by Golay that $\lim_{n\to\infty} F_n = 12.32$. Exhaustive computer searches carried out for n ≤ 40 have revealed the following.

1. For 1 ≤ n ≤ 40, n ≠ 11, 13, $3.3 \le F_n \le 9.85$.
2. $F_{11} = 12.1$, $F_{13} = 14.08$.

The value $F_{11}$ is also achieved by a Barker sequence. From partial searches, for lengths up to 117, the highest known merit factor is between 8 and 9.56; for lengths from 118 to 200, the best-known factor is close to 6. For lengths > 200, statistical search methods have failed to yield a sequence having merit factor exceeding 5.

An offset sequence is one in which a fraction θ of the elements of a sequence of length n are chopped off at one end and appended to the other end, i.e., an offset sequence is a cyclic shift of the original sequence by nθ symbols. It turns out that the asymptotic merit factor of m sequences is equal to 3 and is independent of the particular offset of the m sequence. There exist offsets of sequences associated with quadratic-residue and twin-prime difference sets that achieve a larger merit factor of 6. Details may be found in [7].
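The quoted values $F_{11} = 12.1$ and $F_{13} = 14.08$ follow directly from the definition applied to the Barker sequences of those lengths. A minimal sketch, with a hypothetical function name:

```python
def merit_factor(s):
    """F = n^2 / (2 * sum of squared out-of-phase aperiodic autocorrelations)."""
    n = len(s)
    sidelobes = [sum(s[t + tau] * s[t] for t in range(n - tau))
                 for tau in range(1, n)]
    return n * n / (2 * sum(c * c for c in sidelobes))

barker11 = [1 if c == "+" else -1 for c in "+++---+--+-"]
barker13 = [1 if c == "+" else -1 for c in "+++++--++-+-+"]

print(round(merit_factor(barker11), 2))   # 12.1  (121 / 10)
print(round(merit_factor(barker13), 2))   # 14.08 (169 / 12)
```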

8.5.3 Sequences with Low Aperiodic Crosscorrelation

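If $\{u(t)\}$ and $\{v(t)\}$ are sequences of length 2n − 1 defined by

$$u(t) = \begin{cases} x(t) & \text{if } 0 \le t \le n-1 \\ 0 & \text{if } n \le t \le 2n-2 \end{cases}$$

and

$$v(t) = \begin{cases} y(t) & \text{if } 0 \le t \le n-1 \\ 0 & \text{if } n \le t \le 2n-2 \end{cases}$$

then

$$\{\rho_{x,y}(\tau) \mid -(n-1) \le \tau \le n-1\} = \{\theta_{u,v}(\tau) \mid 0 \le \tau \le 2n-2\} \tag{8.6}$$

Given a collection

$$U = \{\{x_i(t)\} \mid 1 \le i \le M\}$$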

of sequences of length n over $Z_q$, let us define

$$\rho_{\max} = \max\{\, |\rho_{a,b}(\tau)| : a, b \in U, \text{ either } a \neq b \text{ or } \tau \neq 0 \,\}$$

It is clear from Eq. (8.6) how bounds on the periodic correlation parameter θmax can be adapted to give bounds on ρmax. Translation of the Welch bound gives that for every integer k ≥ 1,

$$\rho_{\max}^{2k} \ge \frac{n^{2k}}{M(2n-1)-1} \left\{ \frac{M(2n-1)}{\binom{2n+k-2}{k}} - 1 \right\}$$

Setting k = 1 in the preceding bound gives

$$\rho_{\max} \ge n \sqrt{\frac{M-1}{M(2n-1)-1}}$$

Thus, for fixed M and large n, Welch's bound gives

$$\rho_{\max} \ge O\!\left(n^{1/2}\right)$$

There exist sequence families which asymptotically achieve $\rho_{\max} \approx O(n^{1/2})$ [10].
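The zero-padding identity of Eq. (8.6) and the k = 1 form of the bound are easy to evaluate numerically. The sketch below uses two arbitrary quaternary-PSK sequences chosen only for illustration; the function names are hypothetical. For a shift τ between n and 2n − 2, the periodic value $\theta_{u,v}(\tau)$ corresponds to the aperiodic value at the negative shift τ − (2n − 1).

```python
from math import sqrt

def aperiodic(x, y, tau):
    """rho_{x,y}(tau) = sum of x(t+tau) y*(t) over the overlapping indices."""
    n = len(x)
    lo, hi = max(0, -tau), min(n - 1, n - 1 - tau)
    return sum(x[t + tau] * y[t].conjugate() for t in range(lo, hi + 1))

def periodic(u, v, tau):
    """theta_{u,v}(tau) with the index t + tau taken modulo the period."""
    N = len(u)
    return sum(u[(t + tau) % N] * v[t].conjugate() for t in range(N))

def welch_k1(n, M):
    """Lower bound on rho_max obtained by setting k = 1."""
    return n * sqrt((M - 1) / (M * (2 * n - 1) - 1))

x = [1 + 0j, 1j, -1 + 0j, 1 + 0j]        # illustrative sequences of length n = 4
y = [1 + 0j, -1 + 0j, -1j, 1j]
n = len(x)
u = x + [0j] * (n - 1)                   # zero-padded to length 2n - 1
v = y + [0j] * (n - 1)

rho = {tau: aperiodic(x, y, tau) for tau in range(-(n - 1), n)}
theta = {tau: periodic(u, v, tau) for tau in range(2 * n - 1)}
matches = all(theta[tau] == rho[tau if tau < n else tau - (2 * n - 1)]
              for tau in range(2 * n - 1))
print(matches)
print(welch_k1(n=127, M=129))
```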

8.6 Other Correlation Measures

8.6.1 Partial-Period Correlation

The partial-period (p-p) correlation between the sequences $\{u(t)\}$ and $\{v(t)\}$ is the collection $\{\Delta_{u,v}(l, \tau, t_0) \mid 1 \le l \le n,\ 0 \le \tau \le n-1,\ 0 \le t_0 \le n-1\}$ of inner products

$$\Delta_{u,v}(l, \tau, t_0) = \sum_{t=t_0}^{t_0+l-1} u(t+\tau)\, v^*(t)$$

where l is the length of the partial period and the sum t + τ is again computed modulo n.

In direct-sequence CDMA systems, the pseudorandom signature sequences used by the various users are often very long for reasons of data security. In such situations, to minimize receiver hardware complexity, correlation over a partial period of the signature sequence is often used to demodulate data, as well as to achieve synchronization. For this reason, the p-p correlation properties of a sequence are of interest. Researchers have attempted to determine the moments of the p-p correlation. Here the main tool is the application of the Pless power-moment identities of coding theory [8]. The identities often allow the first and second p-p correlation moments to be completely determined. For example, this is true in the case of m sequences (the remaining moments turn out to depend upon the specific characteristic polynomial of the m sequence). Further details may be found in [15].

8.6.2 Mean Square Correlation

Frequently in practice, there is a greater interest in the mean-square correlation distribution of a sequence family than in the parameter θmax . Quite often in sequence design, the sequence family is derived from a linear, binary cyclic code of length n by picking a set of cyclically distinct sequences of period n. The families of Gold and Kasami sequences are so constructed. In this case, as pointed out by Massey, the mean square correlation of the family can be shown to be either optimum or close to optimum, under certain easily satisfied conditions, imposed on the minimum distance of the dual code. A similar situation holds even when the sequence family does not come from a linear cyclic code. In this sense, mean square correlation is not a very discriminating measure of the correlation properties of a family of sequences. An expanded discussion of this issue may be found in [5].

8.6.3 Optical Orthogonal Codes

Given a pair of {0, 1} sequences $\{s_1(t)\}$ and $\{s_2(t)\}$ each having period n, we define the Hamming correlation function $\theta_{12}(\tau)$, 0 ≤ τ ≤ n − 1, by

$$\theta_{12}(\tau) = \sum_{t=0}^{n-1} s_1(t+\tau)\, s_2(t)$$

Such correlations are of interest, for instance, in optical communication systems where the 1's and 0's in a sequence correspond to the presence or absence of pulses of transmitted light. An (n, w, λ) optical orthogonal code (OOC) is a family $F = \{\{s_i(t)\} \mid i = 1, 2, \ldots, M\}$ of M {0, 1} sequences of period n and constant Hamming weight w, where w is an integer lying between 1 and n − 1, satisfying $\theta_{ij}(\tau) \le \lambda$ whenever either i ≠ j or τ ≠ 0.

Note that the Hamming distance $d_{a,b}$ between a period of the corresponding codewords $\{a(t)\}$, $\{b(t)\}$, 0 ≤ t ≤ n − 1, in an (n, w, λ) OOC having Hamming correlation ρ, 0 ≤ ρ ≤ λ, is given by $d_{a,b} = 2(w - \rho)$, and, thus, OOCs are closely related to constant-weight error correcting codes. Given an (n, w, λ) OOC, by enlarging the OOC to include every cyclic shift of each sequence in the code, one obtains a constant-weight code with minimum distance $d_{\min} \ge 2(w - \lambda)$. Conversely, given a constant-weight cyclic code of length n, weight w, and minimum distance $d_{\min}$, one can derive an (n, w, λ) OOC with $\lambda \le w - d_{\min}/2$ by partitioning the code into cyclic equivalence classes and then picking precisely one representative from each equivalence class of size n.

By making use of this connection, one can derive bounds on the size of an OOC from known bounds on the size of constant-weight codes. The bound given next follows directly from the Johnson bound for constant weight codes [8]. The number M(n, w, λ) of codewords in an (n, w, λ) OOC satisfies

$$M(n, w, \lambda) \le \left\lfloor \frac{1}{w} \left\lfloor \frac{n-1}{w-1} \left\lfloor \cdots \left\lfloor \frac{n-\lambda+1}{w-\lambda+1} \left\lfloor \frac{n-\lambda}{w-\lambda} \right\rfloor \right\rfloor \cdots \right\rfloor \right\rfloor \right\rfloor$$

An OOC that achieves the Johnson bound is said to be optimal. A family $\{F_n\}$ of OOCs indexed by the parameter n and arising from a common construction is said to be asymptotically optimum if

$$\lim_{n\to\infty} \frac{|F_n|}{M(n, w, \lambda)} = 1$$

Constructions for optical orthogonal codes are available for the cases when λ = 1 and λ = 2. For larger values of λ, there exist constructions which are asymptotically optimum. Further details may be found in [6].
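To make the bound concrete, the sketch below evaluates the nested expression with integer floor division and checks a small code against it. The (13, 3, 1) code with pulse positions {0, 1, 4} and {0, 2, 7} is an example chosen here, not one taken from the source, and the function names are hypothetical.

```python
def johnson_bound(n, w, lam):
    """Nested-floor Johnson bound on the size of an (n, w, lam) OOC."""
    bound = 1
    for i in range(lam, 0, -1):          # innermost factor (n - lam)/(w - lam) first
        bound = ((n - i) * bound) // (w - i)
    return bound // w                    # outermost factor 1/w

def hamming_correlation(s1, s2, tau):
    """theta_12(tau) = sum over one period of s1(t + tau) * s2(t)."""
    n = len(s1)
    return sum(s1[(t + tau) % n] * s2[t] for t in range(n))

def max_off_peak(code):
    """Largest Hamming correlation over all pairs, excluding in-phase autocorrelation."""
    n = len(code[0])
    return max(hamming_correlation(a, b, tau)
               for i, a in enumerate(code) for j, b in enumerate(code)
               for tau in range(n) if i != j or tau != 0)

def indicator(support, n):
    """{0, 1} sequence of period n with ones at the given positions."""
    return [1 if t in support else 0 for t in range(n)]

code = [indicator({0, 1, 4}, 13), indicator({0, 2, 7}, 13)]
print(johnson_bound(13, 3, 1))   # 2
print(max_off_peak(code))        # 1
```

With two codewords meeting a bound of two, this small code is optimal in the sense defined above.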

Defining Terms

Autocorrelation of a sequence: The complex inner product of the sequence with a shifted version of itself.

Crosscorrelation of two sequences: The complex inner product of the first sequence with a shifted version of the second sequence.

m Sequence: A periodic binary {0, 1} sequence that is generated by a shift register with linear feedback and which has maximal possible period given the number of stages in the shift register.

Pseudonoise sequences: Also referred to as pseudorandom sequences (PN), these are sequences that are deterministically generated and yet possess some properties that one would expect to find in randomly generated sequences.

Shift-register sequence: A sequence with symbols drawn from a field, which satisfies a linear-recurrence relation and which can be implemented using a shift register.

References

[1] Barg, A. On small families of sequences with low periodic correlation, Lecture Notes in Computer Science, 781, 154–158, Berlin, Springer-Verlag, 1994.
[2] Baumert, L.D. Cyclic Difference Sets, Lecture Notes in Mathematics 182, Springer-Verlag, New York, 1971.
[3] Boztaş, S., Hammons, R., and Kumar, P.V. 4-phase sequences with near-optimum correlation properties, IEEE Trans. Inform. Theory, IT-38, 1101–1113, 1992.
[4] Golomb, S.W. Shift Register Sequences, Aegean Park Press, San Francisco, CA, 1982.
[5] Hammons, A.R., Jr. and Kumar, P.V. On a recent 4-phase sequence design for CDMA. IEICE Trans. Commun., E76-B(8), 1993.
[6] Helleseth, T. and Kumar, P.V. (planned). Sequences with low correlation. In Handbook of Coding Theory, ed., V.S. Pless and W.C. Huffman, Elsevier Science Publishers, Amsterdam, 1998.
[7] Jensen, J.M., Jensen, H.E., and Høholdt, T. The merit factor of binary sequences related to difference sets. IEEE Trans. Inform. Theory, IT-37(May), 617–626, 1991.
[8] MacWilliams, F.J. and Sloane, N.J.A. The Theory of Error-Correcting Codes, North-Holland, Amsterdam, 1977.
[9] Maschietti, A. Difference sets and hyperovals, Designs, Codes and Cryptography, 14, 89–98, 1998.
[10] Mow, W.H. On McEliece's open problem on minimax aperiodic correlation. In Proc. IEEE Intern. Symp. Inform. Theory, 75, 1994.
[11] Nechaev, A. The Kerdock code in a cyclic form, Discrete Math. Appl., 1, 365–384, 1991.
[12] Peterson, W.W. and Weldon, E.J., Jr. Error-Correcting Codes, 2nd ed. MIT Press, Cambridge, MA, 1972.
[13] Sarwate, D.V. An upper bound on the aperiodic autocorrelation function for a maximal-length sequence. IEEE Trans. Inform. Theory, IT-30(July), 685–687, 1984.
[14] Sarwate, D.V. and Pursley, M.B. Crosscorrelation properties of pseudorandom and related sequences. Proc. IEEE, 68(May), 593–619, 1980.
[15] Simon, M.K., Omura, J.K., Scholtz, R.A., and Levitt, B.K. Spread Spectrum Communications Handbook, revised ed., McGraw Hill, New York, 1994.
[16] Solé, P. A quaternary cyclic code and a family of quadriphase sequences with low correlation properties, Coding Theory and Applications, Lecture Notes in Computer Science, 388, 193–201, Berlin, Springer-Verlag, 1989.
[17] Udaya, P. and Siddiqi, M. Optimal biphase sequences with large linear complexity derived from sequences over Z4, IEEE Trans. Inform. Theory, IT-42 (Jan), 206–216, 1996.

Further Information

A more in-depth treatment of pseudonoise sequences may be found in the following.

[1] Golomb, S.W. Shift Register Sequences, Aegean Park Press, San Francisco, 1982.
[2] Helleseth, T. and Kumar, P.V. Sequences with Low Correlation, in Handbook of Coding Theory, edited by V.S. Pless and W.C. Huffman, Elsevier Science Publishers, Amsterdam, 1998 (planned).
[3] Sarwate, D.V. and Pursley, M.B. Crosscorrelation Properties of Pseudorandom and Related Sequences, Proc. IEEE, 68, May, 593–619, 1980.
[4] Simon, M.K., Omura, J.K., Scholtz, R.A., and Levitt, B.K. Spread Spectrum Communications Handbook, revised ed., McGraw Hill, New York, 1994.

Orsak, G.C. “Optimum Receivers” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Optimum Receivers

Geoffrey C. Orsak
Southern Methodist University

9.1 Introduction
9.2 Preliminaries
9.3 Karhunen–Loève Expansion
9.4 Detection Theory
9.5 Performance
9.6 Signal Space
9.7 Standard Binary Signalling Schemes
9.8 M-ary Optimal Receivers
9.9 More Realistic Channels
    Random Phase Channels • Rayleigh Channel
9.10 Dispersive Channels
Defining Terms
References
Further Information

9.1 Introduction

Every engineer strives for optimality in design. This is particularly true for communications engineers since in many cases implementing suboptimal receivers and sources can result in dramatic losses in performance. As such, this chapter focuses on design principles leading to the implementation of optimum receivers for the most common communication environments.

The main objective in digital communications is to transmit a sequence of bits to a remote location with the highest degree of accuracy. This is accomplished by first representing bits (or more generally short bit sequences) by distinct waveforms of finite time duration. These time-limited waveforms are then transmitted (broadcast) to the remote sites in accordance with the data sequence. Unfortunately, because of the nature of the communication channel, the remote location receives a corrupted version of the concatenated signal waveforms. The most widely accepted model for the communication channel is the so-called additive white Gaussian noise¹ channel (AWGN channel).

¹ For those unfamiliar with AWGN, a random process (waveform) is formally said to be white Gaussian noise if all collections of instantaneous observations of the process are jointly Gaussian and mutually independent. An important consequence of this property is that the power spectral density of the process is a constant with respect to frequency variation (spectrally flat). For more on AWGN, see Papoulis [4].

Mathematical arguments based upon the central limit theorem [7], together with supporting empirical evidence, demonstrate that many common communication channels are accurately modeled by this abstraction. Moreover, from the design perspective, this is quite fortuitous since design and analysis with respect to this channel model is relatively straightforward.

9.2 Preliminaries

To better describe the digital communications process, we shall first elaborate on so-called binary communications. In this case, when the source wishes to transmit a bit value of 0, the transmitter broadcasts a specified waveform $s_0(t)$ over the bit interval t ∈ [0, T]. Conversely, if the source seeks to transmit the bit value of 1, the transmitter alternatively broadcasts the signal $s_1(t)$ over the same bit interval. The received waveform R(t) corresponding to the first bit is then appropriately described by the following hypothesis testing problem:

$$\begin{aligned} H_0 &: R(t) = s_0(t) + \eta(t) \\ H_1 &: R(t) = s_1(t) + \eta(t) \end{aligned} \qquad 0 \le t \le T \tag{9.1}$$

where, as stated previously, η(t) corresponds to AWGN with spectral height nominally given by $N_0/2$. It is the objective of the receiver to determine the bit value, i.e., the most accurate hypothesis, from the received waveform R(t). The optimality criterion of choice in digital communication applications is the total probability of error, normally denoted as $P_e$. This scalar quantity is expressed as

$$P_e = \Pr(\text{declaring 1} \mid \text{0 transmitted}) \Pr(\text{0 transmitted}) + \Pr(\text{declaring 0} \mid \text{1 transmitted}) \Pr(\text{1 transmitted}) \tag{9.2}$$

The problem of determining the optimal binary receiver with respect to the probability of error is solved by applying stochastic representation theory [10] to detection theory [5, 9]. The specific waveform representation of relevance in this application is the Karhunen–Loève (KL) expansion.

9.3 Karhunen–Loève Expansion

The Karhunen–Loève expansion is a generalization of the Fourier series designed to represent a random process in terms of deterministic basis functions and uncorrelated random variables derived from the process. Whereas the Fourier series allows one to model or represent deterministic time-limited energy signals in terms of linear combinations of complex exponential waveforms, the Karhunen–Loève expansion allows us to represent a second-order random process in terms of a set of orthonormal basis functions scaled by a sequence of random variables. The objective in this representation is to choose the basis of time functions so that the coefficients in the expansion are mutually uncorrelated random variables.

To be more precise, if R(t) is a zero mean second-order random process defined over [0, T] with covariance function $K_R(t, s)$, then so long as the basis of deterministic functions satisfies certain integral constraints [9], one may write R(t) as

$$R(t) = \sum_{i=1}^{\infty} R_i \phi_i(t), \qquad 0 \le t \le T \tag{9.3}$$

where

$$R_i = \int_0^T R(t) \phi_i(t)\, dt$$

In this case the $R_i$ will be mutually uncorrelated random variables with the $\phi_i$ being deterministic basis functions that are complete in the space of square integrable time functions over [0, T]. Importantly, in this case, equality is to be interpreted as mean-square equivalence, i.e.,

$$\lim_{N \to \infty} E\left[ \left( R(t) - \sum_{i=1}^{N} R_i \phi_i(t) \right)^{2} \right] = 0$$

for all 0 ≤ t ≤ T.

FACT 9.1 If R(t) is AWGN, then any orthonormal basis of the vector space of square integrable signals over [0, T] results in uncorrelated and therefore independent Gaussian random variables.

The use of Fact 9.1 allows for a conversion of a continuous time detection problem into a finite-dimensional detection problem. Proceeding, to derive the optimal binary receiver, we first construct our set of basis functions as the set of functions defined over t ∈ [0, T] beginning with the signals of interest $s_0(t)$ and $s_1(t)$, that is, $s_0(t)$, $s_1(t)$, plus a countable number of functions which complete the basis. In order to ensure that the basis is orthonormal, we must apply the Gram–Schmidt procedure² [6] to the full set of functions beginning with $s_0(t)$ and $s_1(t)$ to arrive at our final choice of basis $\{\phi_i(t)\}$.

FACT 9.2 Let $\{\phi_i(t)\}$ be the resultant set of basis functions. Then for all i > 2, the $\phi_i(t)$ are orthogonal to $s_0(t)$ and $s_1(t)$. That is,

$$\int_0^T \phi_i(t) s_j(t)\, dt = 0$$

for all i > 2 and j = 0, 1.

Using this fact in conjunction with Eq. (9.3), one may recognize that only the coefficients $R_1$ and $R_2$ are functions of our signals of interest. Moreover, since the $R_i$ are mutually independent, the optimal receiver will, therefore, only be a function of these two values. Thus, through the application of the KL expansion, we arrive at an equivalent hypothesis testing problem to that given in Eq. (9.1),

$$\begin{aligned} H_0 &: \mathbf{R} = \begin{bmatrix} \int_0^T \phi_1(t) s_0(t)\, dt \\ \int_0^T \phi_2(t) s_0(t)\, dt \end{bmatrix} + \begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix} \\ H_1 &: \mathbf{R} = \begin{bmatrix} \int_0^T \phi_1(t) s_1(t)\, dt \\ \int_0^T \phi_2(t) s_1(t)\, dt \end{bmatrix} + \begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix} \end{aligned} \tag{9.4}$$

² The Gram–Schmidt procedure is a deterministic algorithm that simply converts an arbitrary set of basis functions (vectors) into an equivalent set of orthonormal basis functions (vectors).

where it is easily shown that η1 and η2 are mutually independent, zero-mean, Gaussian random variables with variance given by N0 /2, and where φ1 and φ2 are the first two functions from our orthonormal set of basis functions. Thus, the design of the optimal binary receiver reduces to a simple two-dimensional detection problem that is readily solved through the application of detection theory.
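The reduction to two dimensions can be illustrated on sampled waveforms. The following sketch rests on assumptions that are not in the source: the two signals (one sine cycle and a half-width rectangular pulse), the sampling grid, and every function name are chosen only for the example, and integrals are approximated by Riemann sums.

```python
import math

K, T = 1000, 1.0
dt = T / K
t = [(k + 0.5) * dt for k in range(K)]                     # midpoint sampling grid

def inner(f, g):
    """Riemann-sum approximation of the integral of f(t) g(t) over [0, T]."""
    return sum(a * b for a, b in zip(f, g)) * dt

def gram_schmidt(signals):
    """Orthonormalize the sampled signals in the order given."""
    basis = []
    for s in signals:
        resid = list(s)
        for phi in basis:
            c = inner(resid, phi)
            resid = [r - c * p for r, p in zip(resid, phi)]
        norm = math.sqrt(inner(resid, resid))
        if norm > 1e-12:                                   # skip dependent signals
            basis.append([r / norm for r in resid])
    return basis

s0 = [math.sin(2 * math.pi * tk / T) for tk in t]          # assumed example signal
s1 = [1.0 if tk < T / 2 else 0.0 for tk in t]              # assumed example signal

phi = gram_schmidt([s0, s1])                               # phi_1, phi_2
coords = [[inner(s, p) for p in phi] for s in (s0, s1)]    # s_{j,i} as in Eq. (9.4)
energies = [inner(s, s) for s in (s0, s1)]

print(coords)
print([sum(c * c for c in cj) for cj in coords], energies)
```

The squared length of each coordinate vector matches the corresponding signal energy, since both signals lie entirely in the span of the first two basis functions.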

9.4 Detection Theory

It is well known from detection theory [5] that under the minimum $P_e$ criterion, the optimal detector is given by the maximum a posteriori (MAP) rule,

$$\text{choose the } i \text{ with the largest } p_{H_i \mid \mathbf{R}}(H_i \mid \mathbf{R} = \mathbf{r}) \tag{9.5}$$

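i.e., determine the hypothesis that is most likely, given that our observation vector is r. By a simple application of Bayes theorem [4], we immediately arrive at the central result in detection theory: the optimal binary detector is given by the likelihood ratio test (LRT),

$$L(\mathbf{R}) = \frac{p_{\mathbf{R} \mid H_1}(\mathbf{R})}{p_{\mathbf{R} \mid H_0}(\mathbf{R})} \underset{H_0}{\overset{H_1}{\gtrless}} \frac{\pi_0}{\pi_1} \tag{9.6}$$

where the $\pi_i$ are the a priori probabilities of the hypotheses $H_i$ being true. Since in this case we have assumed that the noise is white and Gaussian, the LRT can be written as

$$L(\mathbf{R}) = \frac{\prod_{i=1}^{2} \frac{1}{\sqrt{\pi N_0}} \exp\left( -\frac{1}{2} \frac{(R_i - s_{1,i})^2}{N_0/2} \right)}{\prod_{i=1}^{2} \frac{1}{\sqrt{\pi N_0}} \exp\left( -\frac{1}{2} \frac{(R_i - s_{0,i})^2}{N_0/2} \right)} \underset{H_0}{\overset{H_1}{\gtrless}} \frac{\pi_0}{\pi_1} \tag{9.7}$$

where

$$s_{j,i} = \int_0^T \phi_i(t) s_j(t)\, dt$$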

By taking the logarithm and cancelling common terms, it is easily shown that the optimum binary receiver can be written as

$$\frac{2}{N_0} \sum_{i=1}^{2} R_i \left( s_{1,i} - s_{0,i} \right) - \frac{1}{N_0} \sum_{i=1}^{2} \left( s_{1,i}^2 - s_{0,i}^2 \right) \underset{H_0}{\overset{H_1}{\gtrless}} \ln \frac{\pi_0}{\pi_1} \tag{9.8}$$

This finite-dimensional version of the optimal receiver can be converted back into a continuous time receiver by the direct application of Parseval's theorem [4], where it is easily shown that

$$\sum_{i=1}^{2} R_i s_{k,i} = \int_0^T R(t) s_k(t)\, dt, \qquad \sum_{i=1}^{2} s_{k,i}^2 = \int_0^T s_k^2(t)\, dt \tag{9.9}$$

By applying Eq. (9.9) to Eq. (9.8), the final receiver structure is then given by

$$\int_0^T R(t) \left[ s_1(t) - s_0(t) \right] dt - \frac{1}{2} \left( E_1 - E_0 \right) \underset{H_0}{\overset{H_1}{\gtrless}} \frac{N_0}{2} \ln \frac{\pi_0}{\pi_1} \tag{9.10}$$

where E1 and E0 are the energies of signals s1 (t) and s0 (t), respectively. (See Fig. 9.1 for a block diagram.) Importantly, if the signals are equally likely (π0 = π1 ), the optimal receiver is independent of the typically unknown spectral height of the background noise.

FIGURE 9.1: Optimal correlation receiver structure for binary communications.

One can readily observe that the optimal binary communication receiver correlates the received waveform with the difference signal $s_1(t) - s_0(t)$ and then compares the statistic to a threshold. This operation can be interpreted as identifying the signal waveform $s_i(t)$ that best correlates with the received signal R(t). Based on this interpretation, the receiver is often referred to as the correlation receiver.

As an alternate means of implementing the correlation receiver, we may reformulate the computation of the left-hand side of Eq. (9.10) in terms of standard concepts in filtering. Let h(t) be the impulse response of a linear, time-invariant (LTI) system. By letting $h(t) = s_1(T - t) - s_0(T - t)$, it is easily verified that the response of an LTI system with impulse response h(t) to the input R(t), sampled at time t = T, gives the desired result. (See Fig. 9.2 for a block diagram.) Since the impulse response is matched to the signal waveforms, this implementation is often referred to as the matched filter receiver.

FIGURE 9.2: Optimal matched filter receiver structure for binary communications. In this case h(t) = s1 (T − t) − s0 (t − t).
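The equivalence between the correlation and matched filter forms can be checked numerically. The following is a minimal discrete-time sketch, not taken from the original text: the sample count, the antipodal example waveforms, and all function names are illustrative assumptions, and the integrals of Eq. (9.10) are approximated by Riemann sums.

```python
import math

def correlate(r, s1, s0, dt):
    """Riemann-sum approximation of the integral of R(t)[s1(t) - s0(t)] over [0, T]."""
    return sum(rk * (a - b) for rk, a, b in zip(r, s1, s0)) * dt

def matched_filter_output(r, s1, s0, dt):
    """Filter r with h(t) = s1(T - t) - s0(T - t) and sample the output at t = T."""
    n = len(r)
    # Time-reversed difference signal (discrete version of h).
    h = [s1[n - 1 - k] - s0[n - 1 - k] for k in range(n)]
    # Convolution evaluated at the final sample index only.
    return sum(r[k] * h[n - 1 - k] for k in range(n)) * dt

def decide(r, s1, s0, dt, n0, pi0=0.5, pi1=0.5):
    """Threshold test of Eq. (9.10); returns 1 for H1 and 0 for H0."""
    e1 = sum(x * x for x in s1) * dt
    e0 = sum(x * x for x in s0) * dt
    statistic = correlate(r, s1, s0, dt) - 0.5 * (e1 - e0)
    return 1 if statistic > 0.5 * n0 * math.log(pi0 / pi1) else 0

# Hypothetical example: antipodal sinusoids on [0, 1], observed without noise.
N, T = 200, 1.0
dt = T / N
s1 = [math.sin(2 * math.pi * 2 * k * dt) for k in range(N)]
s0 = [-x for x in s1]
```

With equal priors the threshold is zero, so the noise level `n0` does not affect the decision, in line with the remark following Eq. (9.10).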


9.5 Performance

Because of the nature of the statistics of the channel and the relative simplicity of the receiver, performance analysis of the optimal binary receiver in AWGN is a straightforward task. Since the log likelihood ratio is conditionally Gaussian under each hypothesis, the probability of error can be computed directly in terms of the Q function (see footnote 3) as

$$P_e = Q\left(\frac{\|s_0 - s_1\|}{\sqrt{2N_0}}\right)$$

where the $s_i$ are the two-dimensional signal vectors obtained from Eq. (9.4), and where $\|x\|$ denotes the Euclidean length of the vector x. Thus, $\|s_0 - s_1\|$ is best interpreted as the distance between the respective signal representations. Since the Q function is monotonically decreasing with an increasing argument, one may recognize that the probability of error for the optimal receiver decreases with an increasing separation between the signal representations, i.e., the more dissimilar the signals, the lower the $P_e$.
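As a small illustration of this error-rate expression, the sketch below evaluates it for arbitrary signal vectors. The function names are hypothetical; the only identity relied on is the standard relation between the Q function and the complementary error function, $Q(x) = \tfrac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$.

```python
import math

def q_function(x):
    """Tail probability of a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def binary_error_probability(s0, s1, n0):
    """Pe = Q(||s0 - s1|| / sqrt(2 * N0)) for two signal-space vectors."""
    distance = math.dist(s0, s1)
    return q_function(distance / math.sqrt(2.0 * n0))
```

For instance, `binary_error_probability((0.0, 0.0), (2.0, 0.0), 1.0)` evaluates $Q(\sqrt{2})$, and moving the two points further apart lowers the result.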

9.6 Signal Space

The concept of a signal space allows one to view the signal classification problem (receiver design) within a geometrical framework. This offers two primary benefits: first it supplies an often more intuitive perspective on the receiver characteristics (e.g., performance) and second it allows for a straightforward generalization to standard M-ary signalling schemes. To demonstrate this, in Fig. 9.3, we have plotted an arbitrary signal space for the binary signal classification problem. The axes are given in terms of the basis functions φ1 (t) and φ2 (t). Thus, every point in the signal space is a time function constructed as a linear combination of the two basis functions. By Fact 9.2, we recall that both signals s0 (t) and s1 (t) can be constructed as a linear combination of φ1 (t) and φ2 (t) and as such we may identify these two signals in this figure as two points. Since the decision statistic given in Eq. (9.8) is a linear function of the observed vector R which is also located in the signal space, it is easily shown that the set of vectors under which the receiver declares hypothesis Hi is bounded by a line in the signal space. This so-called decision boundary is obtained by solving the equation ln[L(R)] = 0. (Here again we have assumed equally likely hypotheses.) In the case under current discussion, this decision boundary is simply the hyperplane separating the two signals in signal space. Because of the generality of this formulation, many problems in communication system design are best cast in terms of the signal space, that is, signal locations and decision boundaries.
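The decision boundary can be worked out explicitly. The short derivation below is a sketch that starts from the log likelihood ratio of Eq. (9.8) under the equal-prior assumption stated above; writing the sums as vector inner products is a notational choice made here, not notation from the original text.

```latex
\ln L(R) = \frac{2}{N_0} R^{T}(s_1 - s_0) - \frac{1}{N_0}\left(\|s_1\|^2 - \|s_0\|^2\right) = 0
\;\Longleftrightarrow\;
(s_1 - s_0)^{T}\left(R - \frac{s_0 + s_1}{2}\right) = 0
\;\Longleftrightarrow\;
\|R - s_1\|^2 = \|R - s_0\|^2
```

The boundary is therefore the set of points equidistant from $s_0$ and $s_1$, that is, the perpendicular bisector of the segment joining the two signal points. With unequal priors the same steps give a parallel line shifted toward the less likely signal.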

FIGURE 9.3: Signal space and decision boundary for optimal binary receiver.

3 The Q function is the probability that a standard normal random variable exceeds a specified constant, i.e., $Q(x) = \int_x^{\infty} \frac{1}{\sqrt{2\pi}} \exp(-z^2/2)\,dz$.

9.7 Standard Binary Signalling Schemes

The framework just described allows us to readily analyze the most popular signalling schemes in binary communications: amplitude-shift keying (ASK), frequency-shift keying (FSK), and phase-shift keying (PSK). Each of these examples simply constitutes a different selection for signals $s_0(t)$ and $s_1(t)$.

In the case of ASK, $s_0(t) = 0$, while $s_1(t) = \sqrt{2E/T}\,\sin(2\pi f_c t)$, where E denotes the energy of the waveform and $f_c$ denotes the frequency of the carrier wave with $f_c T$ being an integer. Because $s_0(t)$ is the null signal, the signal space is a one-dimensional vector space with $\phi_1(t) = \sqrt{2/T}\,\sin(2\pi f_c t)$. This, in turn, implies that $\|s_0 - s_1\| = \sqrt{E}$. Thus, the corresponding probability of error for ASK is

$$P_e(\text{ASK}) = Q\left(\sqrt{\frac{E}{2N_0}}\right)$$

For FSK, the signals are given by equal amplitude sinusoids with distinct center frequencies, that is, $s_i(t) = \sqrt{2E/T}\,\sin(2\pi f_i t)$ with $f_i T$ being two distinct integers. In this case, it is easily verified that the signal space is a two-dimensional vector space with $\phi_i(t) = \sqrt{2/T}\,\sin(2\pi f_i t)$, resulting in $\|s_0 - s_1\| = \sqrt{2E}$. The corresponding error rate is

$$P_e(\text{FSK}) = Q\left(\sqrt{\frac{E}{N_0}}\right)$$

Finally, with regard to PSK signalling, the most frequently utilized binary PSK signal set is an example of an antipodal signal set. Specifically, the antipodal signal set results in the greatest separation between the signals in the signal space subject to an energy constraint on both signals. This, in turn, translates into the energy constrained signal set with the minimum $P_e$. In this case, the $s_i(t)$ are typically given by $\sqrt{2E/T}\,\sin[2\pi f_c t + \theta(i)]$, where $\theta(0) = 0$ and $\theta(1) = \pi$. As in the ASK case, this results in a one-dimensional signal space; however, in this case $\|s_0 - s_1\| = 2\sqrt{E}$, resulting in a probability of error given by

$$P_e(\text{PSK}) = Q\left(\sqrt{\frac{2E}{N_0}}\right)$$

In all three of the described cases, one can readily observe that the resulting performance is a function of only the signal-to-noise ratio $E/N_0$. In the more general case, the performance will be a function of the intersignal energy to noise ratio.

To gauge the relative difference in performance of the three signalling schemes, in Fig. 9.4, we have plotted the $P_e$ as a function of the SNR. Please note the large variation in performance between the three schemes for even moderate values of SNR.

FIGURE 9.4: $P_e$ vs. the signal-to-noise ratio in decibels [dB = $10\log_{10}(E/N_0)$] for amplitude-shift keying, frequency-shift keying, and phase-shift keying; note that there is a 3-dB difference in performance from ASK to FSK to PSK.
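The 3-dB spacing noted in the caption can be reproduced from the three error-rate formulas of this section. The sketch below is illustrative only: the bisection search, its bracketing interval, and the helper names are assumptions introduced here, and the search simply finds the $E/N_0$ (in dB) at which each scheme reaches a chosen target $P_e$.

```python
import math

def q_function(x):
    """Q(x) expressed through the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

# Pe as a function of snr = E/N0 (linear scale), one entry per scheme.
SCHEMES = {
    "ASK": lambda snr: q_function(math.sqrt(snr / 2.0)),
    "FSK": lambda snr: q_function(math.sqrt(snr)),
    "PSK": lambda snr: q_function(math.sqrt(2.0 * snr)),
}

def required_snr_db(scheme, target_pe, lo_db=-10.0, hi_db=40.0):
    """Bisection on E/N0 in dB; valid because Pe decreases monotonically with SNR."""
    pe = SCHEMES[scheme]
    for _ in range(100):
        mid_db = 0.5 * (lo_db + hi_db)
        if pe(10.0 ** (mid_db / 10.0)) > target_pe:
            lo_db = mid_db   # error rate still too high: need more SNR
        else:
            hi_db = mid_db
    return 0.5 * (lo_db + hi_db)
```

Because the three arguments of Q differ only by factors of two in $E/N_0$, the required SNR values differ by $10\log_{10} 2 \approx 3.01$ dB at any target error rate.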

9.8 M-ary Optimal Receivers

In binary signalling schemes, one seeks to transmit a single bit over the bit interval [0, T]. This is to be contrasted with M-ary signalling schemes, where one transmits multiple bits simultaneously over the so-called symbol interval [0, T]. For example, using a signal set with 16 separate waveforms will allow one to transmit a four-bit sequence per symbol (waveform). Examples of M-ary waveforms are quadrature phase-shift keying (QPSK) and quadrature amplitude modulation (QAM).

The derivation of the optimum receiver structure for M-ary signalling requires the straightforward application of fundamental results in detection theory. As with binary signalling, the Karhunen–Loève expansion is the mechanism utilized to convert a hypothesis testing problem based on continuous waveforms into a vector classification problem. Depending on the complexity of the M waveforms, the signal space can be as large as an M-dimensional vector space. By extending results from the binary signalling case, it is easily shown that the optimum M-ary receiver computes

$$\xi_i[R(t)] = \int_0^T s_i(t)\,R(t)\,dt - \frac{E_i}{2} + \frac{N_0}{2}\ln \pi_i, \qquad i = 1, \dots, M$$

where, as before, the $s_i(t)$ constitute the signal set with the $\pi_i$ being the corresponding a priori probabilities. After computing M separate values of $\xi_i$, the minimum probability of error receiver simply chooses the largest amongst this set. Thus, the M-ary receiver is implemented with a bank of correlation or matched filters followed by choose-largest decision logic.

In many cases of practical importance, the signal sets are selected so that the resulting signal space is a two-dimensional vector space irrespective of the number of signals. This simplifies the receiver structure in that the sufficient statistics are obtained by implementing only two matched filters. Both QPSK and QAM signal sets fit into this category. As an example, in Fig. 9.5, we have depicted the signal locations for standard 16-QAM signalling with the associated decision boundaries. In this case we have assumed an equally likely signal set. As can be seen, the optimal decision rule selects the signal representation that is closest to the received signal representation in this two-dimensional signal space.
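The choose-largest rule and its nearest-point interpretation can be sketched in signal-space coordinates, where the correlation integral becomes an inner product and $E_i = \|s_i\|^2$. The constellation scaling on the grid {-3, -1, 1, 3} and all function names below are illustrative assumptions, not values taken from the text.

```python
import math

def qam16_constellation():
    """Square 16-QAM points on the grid {-3, -1, 1, 3}^2 (an assumed scaling)."""
    levels = (-3.0, -1.0, 1.0, 3.0)
    return [(a, b) for a in levels for b in levels]

def mary_decide(r, signals, n0, priors=None):
    """Index i maximizing xi_i = <r, s_i> - E_i / 2 + (N0 / 2) ln(pi_i)."""
    m = len(signals)
    priors = priors or [1.0 / m] * m

    def xi(i):
        s = signals[i]
        correlation = sum(rk * sk for rk, sk in zip(r, s))
        energy = sum(sk * sk for sk in s)
        return correlation - 0.5 * energy + 0.5 * n0 * math.log(priors[i])

    return max(range(m), key=xi)

def nearest_point(r, signals):
    """Index of the constellation point closest to r in Euclidean distance."""
    return min(range(len(signals)), key=lambda i: math.dist(r, signals[i]))
```

For equal priors the two functions agree, since $\|r - s_i\|^2 = \|r\|^2 - 2\langle r, s_i\rangle + E_i$ and the first term is common to all hypotheses.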

9.9 More Realistic Channels

As is unfortunately often the case, many channels of practical interest are not accurately modeled as simply an AWGN channel. These channels often impose nonlinear effects on the transmitted signals. The best example of this is a channel that imposes a random phase and a random amplitude onto the signal. This typically occurs in applications such as mobile communications, where one often experiences rapidly changing path lengths from source to receiver.

Fortunately, by the judicious choice of signal waveforms, it can be shown that the selection of the $\phi_i$ in the Karhunen–Loève transformation is often independent of these unwanted parameters. In these situations, the random amplitude serves only to scale the signals in signal space, whereas the random phase simply imposes a rotation on the signals in signal space. Since the Karhunen–Loève basis functions typically do not depend on the unknown parameters, we may again convert the continuous time classification problem to a vector channel problem where the received vector R is computed as in Eq. (9.3). Since this vector is a function of both the unknown parameters (i.e., in this case amplitude A and phase ν), to obtain a likelihood ratio test independent of A and ν, we simply apply Bayes' theorem to obtain the following form for the LRT:

$$L(R) = \frac{E\left[p_{R|H_1,A,\nu}(R \mid H_1, A, \nu)\right]}{E\left[p_{R|H_0,A,\nu}(R \mid H_0, A, \nu)\right]} \underset{H_0}{\overset{H_1}{\gtrless}} \frac{\pi_0}{\pi_1}$$

where the expectations are taken with respect to A and ν, and where $p_{R|H_i,A,\nu}$ are the conditional probability density functions of the signal representations. Assuming that the background noise is AWGN, it can be shown that the LRT simplifies to choosing the largest amongst

$$\xi_i[R(t)] = \pi_i \int_{A,\nu} \exp\left\{\frac{2}{N_0}\int_0^T R(t)\,s_i(t \mid A, \nu)\,dt - \frac{E_i(A,\nu)}{N_0}\right\} p_{A,\nu}(A,\nu)\,dA\,d\nu, \qquad i = 1, \dots, M \tag{9.11}$$

It should be noted that in Eq. (9.11) we have explicitly shown the dependence of the transmitted signals $s_i$ on the parameters A and ν. The final receiver structures, together with their corresponding performance, are thus a function of both the choice of signal sets and the probability density functions of the random amplitude and random phase.

9.9.1 Random Phase Channels

If we consider first the special case where the channel simply imposes a uniform random phase on the signal, then it can be easily shown that the so-called in-phase and quadrature statistics obtained from the received signal R(t) (denoted by $R_I$ and $R_Q$, respectively) are sufficient statistics for the signal classification problem. These quantities are computed as

$$R_I(i) = \int_0^T R(t)\cos[2\pi f_c(i)\,t]\,dt$$

and

$$R_Q(i) = \int_0^T R(t)\sin[2\pi f_c(i)\,t]\,dt$$

where in this case the index i corresponds to the center frequencies of hypotheses $H_i$ (e.g., FSK signalling). The optimum receiver selects the largest from amongst

$$\xi_i[R(t)] = \pi_i \exp\left(-\frac{E_i}{N_0}\right) I_0\left[\frac{2}{N_0}\sqrt{R_I^2(i) + R_Q^2(i)}\right], \qquad i = 1, \dots, M$$

where $I_0$ is a zeroth-order, modified Bessel function of the first kind. If the signals have equal energy and are equally likely (e.g., FSK signalling), then the optimum binary receiver is given by

$$R_I^2(1) + R_Q^2(1) \underset{H_0}{\overset{H_1}{\gtrless}} R_I^2(0) + R_Q^2(0)$$

One may readily observe that the optimum receiver bases its decision on the values of the two envelopes of the received signal $\sqrt{R_I^2(i) + R_Q^2(i)}$ and, as a consequence, is often referred to as an envelope or square-law detector. Moreover, it should be observed that the computation of the envelope is independent of the underlying phase of the signal and is as such known as a noncoherent receiver. The computation of the error rate for this detector is a relatively straightforward exercise resulting in

$$P_e(\text{noncoherent}) = \frac{1}{2}\exp\left(-\frac{E}{2N_0}\right)$$

As before, note that the error rate for the noncoherent receiver is simply a function of the SNR.

FIGURE 9.5: Signal space representation of 16-QAM signal set. Optimal decision regions for equally likely signals are also noted.

FIGURE 9.6: Optimum receiver structure for noncoherent (random or unknown phase) ASK demodulation.
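A discrete-time sketch of the envelope (square-law) detector described in this section is given below for the equal-energy, equally likely case. The sampling grid, the example tone frequencies, and the function names are assumptions made for illustration; the integrals are approximated by Riemann sums.

```python
import math

def quadrature_statistics(r, f, dt):
    """R_I and R_Q: correlate r(t) with cos and sin at frequency f (Riemann sums)."""
    ri = sum(rk * math.cos(2 * math.pi * f * k * dt) for k, rk in enumerate(r)) * dt
    rq = sum(rk * math.sin(2 * math.pi * f * k * dt) for k, rk in enumerate(r)) * dt
    return ri, rq

def noncoherent_fsk_decide(r, freqs, dt):
    """Pick the hypothesis whose squared envelope R_I^2 + R_Q^2 is largest."""
    def envelope_sq(f):
        ri, rq = quadrature_statistics(r, f, dt)
        return ri * ri + rq * rq
    return max(range(len(freqs)), key=lambda i: envelope_sq(freqs[i]))

def noncoherent_pe(snr):
    """Pe = 0.5 * exp(-E / (2 * N0)), with snr = E / N0 on a linear scale."""
    return 0.5 * math.exp(-snr / 2.0)
```

Because the decision uses only $R_I^2 + R_Q^2$, a tone at the second candidate frequency is classified the same way for any carrier phase, which is the sense in which the receiver is noncoherent.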

9.9.2 Rayleigh Channel

As an important generalization of the described random phase channel, many communication systems are designed under the assumption that the channel introduces both a random amplitude and a random phase on the signal. Specifically, if the original signal sets are of the form $s_i(t) = m_i(t)\cos(2\pi f_c t)$, where $m_i(t)$ is the baseband version of the message (i.e., what distinguishes one signal from another), then the so-called Rayleigh channel introduces random distortion in the received signal of the following form:

$$s_i(t) = A\,m_i(t)\cos(2\pi f_c t + \nu)$$

where the amplitude A is a Rayleigh random variable (see footnote 4) and where the random phase ν is uniformly distributed between zero and 2π.

4 The density of a Rayleigh random variable is given by $p_A(a) = (a/\sigma^2)\exp(-a^2/2\sigma^2)$ for $a \ge 0$.

To determine the optimal receiver under this distortion, we must first construct an alternate statistical model for $s_i(t)$. To begin, it can be shown from the theory of random variables [4] that if $X_I$ and $X_Q$ are statistically independent, zero mean, Gaussian random variables with variance given by $\sigma^2$, then

$$A\,m_i(t)\cos(2\pi f_c t + \nu) = m_i(t)\,X_I\cos(2\pi f_c t) + m_i(t)\,X_Q\sin(2\pi f_c t)$$

Equality here is to be interpreted in distribution: with this construction, the implied amplitude A is Rayleigh distributed and the implied phase ν is uniformly distributed. From this, we deduce that the combined uncertainty in the amplitude and phase of the signal is incorporated into the Gaussian random variables $X_I$ and $X_Q$. The in-phase and quadrature components of the signal $s_i(t)$ are given by $s_{I,i}(t) = m_i(t)\cos(2\pi f_c t)$ and $s_{Q,i}(t) = m_i(t)\sin(2\pi f_c t)$, respectively. By appealing to Eq. (9.11), it can be shown that the optimum receiver selects the largest from

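$$\xi_i[R(t)] = \frac{\pi_i}{1 + \dfrac{2E_i}{N_0}\sigma^2}\,\exp\left\{\frac{\sigma^2\left[\langle R(t), s_{I,i}(t)\rangle^2 + \langle R(t), s_{Q,i}(t)\rangle^2\right]}{\dfrac{N_0^2}{2}\left(1 + \dfrac{2E_i}{N_0}\sigma^2\right)}\right\}$$

where the inner product is

$$\langle R(t), s_i(t)\rangle = \int_0^T R(t)\,s_i(t)\,dt$$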

Further, if we impose the conditions that the signals be equally likely with equal energy over the symbol interval, then the optimum receiver selects the largest amongst

$$\xi_i[R(t)] = \sqrt{\langle R(t), s_{I,i}(t)\rangle^2 + \langle R(t), s_{Q,i}(t)\rangle^2}$$

Thus, much like for the random phase channel, the optimum receiver for the Rayleigh channel computes the projection of the received waveform onto the in-phase and quadrature components of the hypothetical signals. From a signal space perspective, this is akin to computing the length of the received vector in the subspace spanned by the hypothetical signal. The optimum receiver then chooses the largest amongst these lengths. As with the random phase channel, computing the performance is a straightforward task resulting in (for the equally likely, equal energy case)

$$P_e(\text{Rayleigh}) = \frac{1/2}{1 + \dfrac{E\sigma^2}{N_0}}$$

Interestingly, in this case the performance depends not only on the SNR, but also on the variance (spread) of the Rayleigh amplitude A. Thus, if the amplitude spread is large, we expect to often experience what is known as deep fades in the amplitude of the received waveform and as such expect a commensurate loss in performance.
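One way to see where the Rayleigh error rate comes from is to average the random-phase result $\tfrac{1}{2}\exp(-E/2N_0)$ over the Rayleigh amplitude, with the received energy scaling as $A^2 E$. The sketch below performs that average numerically and compares it with the closed form. This is a consistency check under the assumption that both formulas describe the same equal-energy noncoherent scheme; the integration limits, step count, and function names are illustrative.

```python
import math

def noncoherent_pe(snr):
    """Random-phase channel: Pe = 0.5 * exp(-snr / 2), with snr = E / N0."""
    return 0.5 * math.exp(-snr / 2.0)

def rayleigh_pe(snr, sigma2):
    """Closed form for the Rayleigh channel: Pe = 0.5 / (1 + snr * sigma^2)."""
    return 0.5 / (1.0 + snr * sigma2)

def rayleigh_pdf(a, sigma2):
    """Rayleigh density (a / sigma^2) * exp(-a^2 / (2 sigma^2)) for a >= 0."""
    return (a / sigma2) * math.exp(-a * a / (2.0 * sigma2))

def averaged_pe(snr, sigma2, steps=20000):
    """Trapezoidal average of the random-phase Pe over the Rayleigh amplitude."""
    upper = 12.0 * math.sqrt(sigma2)   # the density is negligible beyond this point
    h = upper / steps
    total = 0.0
    for k in range(steps + 1):
        a = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * noncoherent_pe(a * a * snr) * rayleigh_pdf(a, sigma2)
    return total * h
```

The exponential decay with SNR on the random-phase channel becomes only an inverse-linear decay after averaging, which is the loss attributed to deep fades above.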

9.10 Dispersive Channels

The dispersive channel model assumes that the channel not only introduces AWGN but also distorts the signal through a filtering process. This model incorporates physical realities such as multipath effects and frequency selective fading. In particular, the standard model adopted is depicted in the block diagram given in Fig. 9.7. As can be seen, the receiver observes a filtered version of the signal plus AWGN. If the impulse response of the channel is known, then we arrive at the optimum receiver design by applying the previously presented theory. Unfortunately, the duration of the filtered signal can be a complicating factor. More often than not, the channel will increase the duration of the transmitted signals, hence leading to the description, dispersive channel.

FIGURE 9.7: Standard model for dispersive channel. The time varying impulse response of the channel is denoted by $h_c(t, \tau)$.

However, if the designers take this into account by shortening the duration of $s_i(t)$ so that the duration of the channel-filtered signal $s_i^*(t)$ is less than T, then the optimum receiver chooses the largest amongst

$$\xi_i(R(t)) = \frac{N_0}{2}\ln \pi_i + \langle R(t), s_i^*(t)\rangle - \frac{1}{2}E_i^*$$

where $E_i^*$ is the energy of $s_i^*(t)$. If we limit our consideration to equally likely binary signal sets, then the minimum $P_e$ receiver matches the received waveform to the filtered versions of the signal waveforms. The resulting error rate is given by

$$P_e(\text{dispersive}) = Q\left(\frac{\|s_0^* - s_1^*\|}{\sqrt{2N_0}}\right)$$

Thus, in this case the minimum $P_e$ is a function of the separation of the filtered versions of the signals in the signal space.

The problem becomes substantially more complex if we cannot ensure that the filtered signal durations are less than the symbol lengths. In this case we experience what is known as intersymbol interference (ISI). That is, observations over one symbol interval contain not only the symbol information of interest but also information from previous symbols. In this case we must appeal to optimum sequence estimation [5] to take full advantage of the information in the waveform. The basis for this procedure is the maximization of the joint likelihood function conditioned on the sequence of symbols. This procedure not only defines the structure of the optimum receiver under ISI but also is critical in the decoding of convolutional codes and coded modulation. Alternate adaptive techniques to solve this problem involve the use of channel equalization.
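Returning to the ISI-free binary case, the error rate depends only on the distance between the channel-filtered signals. The sketch below evaluates that distance for sampled waveforms and a known, time-invariant channel impulse response. The discrete-time approximation, the time-invariance assumption, and all names are simplifications introduced here, not part of the original model, which allows a time-varying $h_c(t, \tau)$.

```python
import math

def q_function(x):
    """Q(x) expressed through the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def convolve(x, h, dt):
    """Discrete approximation of the convolution integral of x and h."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj * dt
    return y

def dispersive_pe(s0, s1, channel, dt, n0):
    """Pe = Q(||s0* - s1*|| / sqrt(2 N0)), where s* denotes a channel-filtered signal."""
    f0 = convolve(s0, channel, dt)
    f1 = convolve(s1, channel, dt)
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(f0, f1)) * dt)
    return q_function(distance / math.sqrt(2.0 * n0))
```

An ideal channel (a single tap of height `1/dt`, approximating a unit impulse) reproduces the AWGN result, while a channel that reduces the separation of the filtered signals raises the error rate.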

Defining Terms

Additive white Gaussian noise (AWGN) channel: The channel whose model is that of corrupting a transmitted waveform by the addition of white (i.e., spectrally flat) Gaussian noise.
Bit (symbol) interval: The period of time over which a single symbol is transmitted.
Communication channel: The medium over which communication signals are transmitted. Examples are fiber optic cables, free space, or telephone lines.
Correlation or matched filter receiver: The optimal receiver structure for digital communications in AWGN.
Decision boundary: The boundary in signal space between the various regions where the receiver declares $H_i$. Typically a hyperplane when dealing with AWGN channels.
Dispersive channel: A channel that elongates and distorts the transmitted signal. Normally modeled as a time-varying linear system.
Intersymbol interference: The ill-effect of one symbol smearing into adjacent symbols, thus interfering with the detection process. This is a consequence of the channel filtering the transmitted signals and therefore elongating their duration; see dispersive channel.
Karhunen–Loève expansion: A representation for second-order random processes. Allows one to express a random process in terms of a superposition of deterministic waveforms. The scale values are uncorrelated random variables obtained from the waveform.
Mean-square equivalence: Two random vectors or time-limited waveforms are mean-square equivalent if and only if the expected value of their mean-square error is zero.
Orthonormal: The property of two or more vectors or time-limited waveforms being mutually orthogonal and individually having unit length. Orthogonality and length are typically measured by the standard Euclidean inner product.
Rayleigh channel: A channel that randomly scales the transmitted waveform by a Rayleigh random variable while adding an independent uniform phase to the carrier.
Signal space: An abstraction for representing a time-limited waveform in a low-dimensional vector space. Usually arrived at through the application of the Karhunen–Loève transformation.
Total probability of error: The probability of classifying the received waveform into any of the symbols that were not transmitted over a particular bit interval.

References

[1] Gibson, J.D., Principles of Digital and Analog Communications, 2nd ed., Macmillan, New York, 1993.
[2] Haykin, S., Communication Systems, 3rd ed., John Wiley & Sons, New York, 1994.
[3] Lee, E.A. and Messerschmitt, D.G., Digital Communication, Kluwer Academic Publishers, Norwell, MA, 1988.
[4] Papoulis, A., Probability, Random Variables, and Stochastic Processes, 3rd ed., McGraw-Hill, New York, 1991.
[5] Poor, H.V., An Introduction to Signal Detection and Estimation, Springer-Verlag, New York, 1988.
[6] Proakis, J.G., Digital Communications, 2nd ed., McGraw-Hill, New York, 1989.
[7] Shiryayev, A.N., Probability, Springer-Verlag, New York, 1984.
[8] Sklar, B., Digital Communications, Fundamentals and Applications, Prentice Hall, Englewood Cliffs, NJ, 1988.
[9] Van Trees, H.L., Detection, Estimation, and Modulation Theory, Part I, John Wiley & Sons, New York, 1968.
[10] Wong, E. and Hajek, B., Stochastic Processes in Engineering Systems, Springer-Verlag, New York, 1985.
[11] Wozencraft, J.M. and Jacobs, I., Principles of Communication Engineering, reissue, Waveland Press, Prospect Heights, IL, 1990.
[12] Ziemer, R.E. and Peterson, R.L., Introduction to Digital Communication, Macmillan, New York, 1992.

Further Information

The fundamentals of receiver design were put in place by Wozencraft and Jacobs in their seminal book [11]. Since that time, there have been many outstanding textbooks in this area. For a sampling see [1, 2, 3, 8, 12]. For a complete treatment on the use and application of detection theory in communications see [5, 9]. For deeper insights into the Karhunen–Loève expansion and its use in communications and signal processing see [10].


Bhargava, V.K. “Forward Error Correction Coding” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Forward Error Correction Coding

V.K. Bhargava University of Victoria

I.J. Fair University of Alberta

10.1 Introduction
10.2 Fundamentals of Block Coding
10.3 Structure and Decoding of Block Codes
10.4 Important Classes of Block Codes
10.5 Principles of Convolutional Coding
10.6 Decoding of Convolutional Codes
10.7 Trellis-Coded Modulation
10.8 Additional Measures
10.9 Turbo Codes
10.10 Applications
Defining Terms
References
Further Information

10.1 Introduction

In 1948, Claude Shannon issued a challenge to communications engineers by proving that communication systems could be made arbitrarily reliable as long as a fixed percentage of the transmitted signal was redundant [9]. He showed that limits exist only on the rate of communication and not its accuracy, and went on to prove that errorless transmission could be achieved in an additive white Gaussian noise (AWGN) environment with infinite bandwidth if the ratio of energy per data bit to noise power spectral density exceeds the Shannon Limit. He did not, however, indicate how this could be achieved.

Subsequent research has led to a number of techniques that introduce redundancy to allow for correction of errors without retransmission. These techniques, collectively known as forward error correction (FEC) coding techniques, are used in systems where a reverse channel is not available for requesting retransmission, the delay with retransmission would be excessive, the expected number of errors would require a large number of retransmissions, or retransmission would be awkward to implement [10].

A simplified model of a digital communication system which incorporates FEC coding is shown in Fig. 10.1. The FEC code acts on a discrete data channel comprising all system elements between the encoder output and decoder input. The encoder maps the source data to q-ary code symbols which are modulated and transmitted. During transmission, this signal can be corrupted, causing errors to arise in the demodulated symbol sequence. The FEC decoder attempts to correct these errors and restore the original source data.

FIGURE 10.1: Block diagram of a digital communication system with forward error correction.

A demodulator which outputs only a value for the q-ary symbol received during each symbol interval is said to make hard decisions. In the binary symmetric channel (BSC), hard decisions are made on binary symbols and the probability of error is independent of the value of the symbol. One example of a BSC is the coherently demodulated binary phase-shift-keyed (BPSK) signal corrupted by AWGN. The conditional probability density functions which result with this system are depicted in Fig. 10.2. The probability of error is given by the area under the density functions that lies across the decision threshold, and is a function of the symbol energy Es and the one-sided noise power spectral density N0 .

FIGURE 10.2: Hard and soft decision demodulation of a coherently demodulated BPSK signal corrupted by AWGN. f(z | 1) and f(z | 0) are the Gaussianly distributed conditional probability density functions at the threshold device.

Alternatively, the demodulator can make soft decisions, that is, output an estimate of the symbol value along with an indication of its confidence in this estimate. For example, if the BPSK demodulator uses three-bit quantization, the two least significant bits can be taken as a confidence measure. Possible soft-decision thresholds for the BPSK signal are depicted in Fig. 10.2. In practice, there is little to be gained by using many soft-decision quantization levels.

Block and convolutional codes introduce redundancy by adding parity symbols to the message data. They map k source symbols to n code symbols and are said to have code rate R = k/n. With fixed information rates, this redundancy results in increased bandwidth and lower energy per transmitted symbol. At low signal-to-noise ratios, these codes cannot compensate for these impairments, and performance is degraded. At higher ratios of information symbol energy $E_b$ to noise spectral density $N_0$, however, there is coding gain since the performance improvement offered by coding more than compensates for these impairments. Coding gain is usually defined as the reduction in required $E_b/N_0$ to achieve a specific error rate in an error-control coded system over one without coding. In contrast to block and convolutional codes, trellis-coded modulation introduces redundancy by expanding the size of the signal set rather than increasing the number of symbols transmitted, and so offers the advantages of coding to band-limited systems.
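The three-bit soft-decision example mentioned above can be sketched as follows. The uniform thresholds over an assumed full-scale range, the natural binary labelling of the eight levels, and the folding of the two least significant bits into a 0 to 3 confidence value are all illustrative assumptions; Fig. 10.2 may use different thresholds.

```python
def soft_decision(z, full_scale=1.0):
    """Map a demodulator output z to (hard_bit, confidence) using 8 uniform levels.

    The uniform thresholds over [-full_scale, +full_scale] are an assumption
    made for illustration only.
    """
    level = int((z + full_scale) / (2.0 * full_scale) * 8)
    level = max(0, min(7, level))      # clip to the 3-bit range 0..7
    hard_bit = level >> 2              # most significant bit is the hard decision
    lsbs = level & 0b11                # two least significant bits
    # Fold the LSBs so that 3 always means "far from the decision threshold".
    confidence = lsbs if hard_bit else 3 - lsbs
    return hard_bit, confidence
```

A hard-decision demodulator would keep only `hard_bit`; a soft-decision decoder can additionally weight each symbol by `confidence`.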

Each of these coding techniques is considered in turn. Following a discussion of interleaving and concatenated coding, this chapter gives an overview of a recent and significant advance in coding, the development of Turbo codes, and concludes with a brief overview of FEC applications.

10.2 Fundamentals of Block Coding

In block codes there is a one-to-one mapping between k-symbol source words and n-symbol codewords. With q-ary signalling, $q^k$ out of the $q^n$ possible n-tuples are valid code vectors. The set of all n-tuples forms a vector space in which the $q^k$ code vectors are distributed. The Hamming distance between any two code vectors is the number of symbols in which they differ; the minimum distance $d_{min}$ of the code is the smallest Hamming distance between any two codewords.

There are two contradictory objectives of block codes. The first is to distribute the code vectors in the vector space such that the distance between them is maximized. Then, if the decoder receives a corrupted vector, by evaluating the nearest valid code vector it will decode the correct word with high probability. The second is to pack the vector space with as many code vectors as possible to reduce the redundancy in transmission.

When code vectors differ in at least $d_{min}$ positions, a decoder which evaluates the nearest code vector to each received word is guaranteed to correct up to t random symbol errors per word if

$$d_{min} \ge 2t + 1 \tag{10.1}$$

Alternatively, all $q^n - q^k$ illegal words can be detected, including all error patterns with $d_{min} - 1$ or fewer errors. In general, a block code can correct all patterns of t or fewer errors and detect all patterns of u or fewer errors provided that $u \ge t$ and

$$d_{min} \ge t + u + 1 \tag{10.2}$$
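The distance conditions of Eqs. (10.1) and (10.2) are easy to evaluate for a small code. The sketch below uses a length-5 binary repetition code as a hypothetical example; the function names are likewise illustrative.

```python
from itertools import combinations

def hamming_distance(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def minimum_distance(code):
    """Smallest Hamming distance between any two distinct codewords."""
    return min(hamming_distance(a, b) for a, b in combinations(code, 2))

def correctable_errors(d_min):
    """Largest t satisfying d_min >= 2t + 1, as in Eq. (10.1)."""
    return (d_min - 1) // 2

def can_correct_and_detect(d_min, t, u):
    """Condition of Eq. (10.2): u >= t and d_min >= t + u + 1."""
    return u >= t and d_min >= t + u + 1

# Hypothetical example: the binary repetition code of length 5.
REPETITION_5 = ["00000", "11111"]
```

For this code $d_{min} = 5$, so it can correct two errors, or alternatively correct one error while detecting up to three.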

If q = 2, knowledge of the positions of the errors is sufficient for their correction; if q > 2, the decoder must determine both the positions and values of the errors. If the demodulator indicates positions in which the symbol values are unreliable, the decoder can assume their value unknown and has only to solve for the value of these symbols. These positions are called erasures. A block code can correct up to t errors and v erasures in each word if

$$d_{min} \ge 2t + v + 1 \tag{10.3}$$

10.3 Structure and Decoding of Block Codes

Shannon showed that the performance limit of codes with fixed code rate improves as the block length increases. As n and k increase, however, practical implementation requires that the mapping from message to code vector not be arbitrary but that an underlying structure to the code exist. The structures developed to date limit the error correcting capability of these codes to below what Shannon proved possible, on average, for a code with random codeword assignments. Although Turbo codes have made significant strides towards approaching the Shannon Limit, the search for good constructive codes continues.

A property which simplifies implementation of the coding operations is that of code linearity. A code is linear if the addition of any two code vectors forms another code vector, which implies that the code vectors form a subspace of the vector space of n-tuples. This subspace, which contains the all-zero vector, is spanned by any set of k linearly independent code vectors. Encoding can be described as the multiplication of the information k-tuple by a generator matrix G, of dimension k × n, which contains these basis vectors as rows. That is, a message vector $m_i$ is mapped to a code vector $c_i$ according to

$$c_i = m_i G, \qquad i = 0, 1, \dots, q^k - 1 \tag{10.4}$$

where elementwise arithmetic is defined in the finite field GF(q). In general, this encoding procedure results in code vectors with nonsystematic form in that the values of the message symbols cannot be determined by inspection of the code vector. However, if G has the form $[I_k, P]$, where $I_k$ is the k × k identity matrix and P is a k × (n − k) matrix of parity checks, then the k most significant symbols of each code vector are identical to the message vector and the code has systematic form. This notation assumes that vectors are written with their most significant or first symbols in time on the left, a convention used throughout this chapter.

For each generator matrix there is an (n − k) × n parity check matrix H whose rows are orthogonal to the rows in G, i.e., $GH^T = 0$. If the code is systematic, $H = [-P^T, I_{n-k}]$. Since all codewords are linear sums of the rows in G, it follows that $c_i H^T = 0$ for all i, $i = 0, 1, \dots, q^k - 1$, and that the validity of the demodulated vectors can be checked by performing this multiplication. If a codeword c is corrupted during transmission so that the hard-decision demodulator outputs the vector $\hat{c} = c + e$, where e is a nonzero error pattern, the result of this multiplication is an (n − k)-tuple that is indicative of the validity of the sequence. This result, called the syndrome s, is dependent only on the error pattern since

$$s = \hat{c}H^T = (c + e)H^T = cH^T + eH^T = eH^T \tag{10.5}$$
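Systematic encoding and syndrome computation can be illustrated with a small binary code. The sketch below uses one possible parity submatrix P for a (7,4) code; this particular P, and the function names, are assumptions chosen for illustration and do not come from the text. Over GF(2), $-P^T$ equals $P^T$, so $H = [P^T, I_{n-k}]$.

```python
# Illustrative parity submatrix P (k x (n - k)) for a binary (7,4) code; an assumed choice.
P = [[1, 1, 0],
     [0, 1, 1],
     [1, 1, 1],
     [1, 0, 1]]
K, R = 4, 3   # message length k and number of parity symbols n - k

def encode(message):
    """c = m G with G = [I_k, P]: the message symbols followed by the parity symbols."""
    parity = [sum(message[i] * P[i][j] for i in range(K)) % 2 for j in range(R)]
    return list(message) + parity

def syndrome(word):
    """s = word * H^T with H = [P^T, I_(n-k)], computed over GF(2)."""
    return [(sum(word[i] * P[i][j] for i in range(K)) + word[K + j]) % 2
            for j in range(R)]
```

Every valid codeword gives the all-zero syndrome, and flipping a single message symbol in any codeword gives the corresponding row of P as the syndrome, which shows that the syndrome depends on the error pattern rather than on the transmitted codeword.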

If the error pattern is a code vector, the errors go undetected. For all other error patterns, however, the syndrome is nonzero. Since there are $q^{n-k} - 1$ nonzero syndromes, $q^{n-k} - 1$ error patterns can be corrected. When these patterns include all those with t or fewer errors and no others, the code is said to be a perfect code. Few codes are perfect; most codes are capable of correcting some patterns with more than t errors. Standard array decoders use lookup tables to associate each syndrome with an error pattern but become impractical as the block length and number of parity symbols increase. Algebraic decoding algorithms have been developed for codes with stronger structure. These algorithms are simplified with imperfect codes if the patterns corrected are limited to those with t or fewer errors, a simplification called bounded distance decoding.

Cyclic codes are a subclass of linear block codes with an algebraic structure that enables encoding to be implemented with a linear feedback shift register and decoding to be implemented without a lookup table. As a result, most block codes in use today are cyclic or are closely related to cyclic codes. These codes are best described if vectors are interpreted as polynomials and the arithmetic follows the rules for polynomials where the elementwise operations are defined in GF(q). In a cyclic code, all codeword polynomials are multiples of a generator polynomial g(x) of degree n − k. This polynomial is chosen to be a divisor of $x^n - 1$ so that a cyclic shift of a code vector yields another code vector, giving this class of codes its name. A message polynomial $m_i(x)$ can be mapped to a codeword polynomial $c_i(x)$ in nonsystematic form as

$$c_i(x) = m_i(x) g(x), \qquad i = 0, 1, \ldots, q^k - 1 \tag{10.6}$$

In systematic form, codeword polynomials have the form

$$c_i(x) = m_i(x) x^{n-k} - r_i(x), \qquad i = 0, 1, \ldots, q^k - 1 \tag{10.7}$$

where $r_i(x)$ is the remainder of $m_i(x) x^{n-k}$ divided by g(x). Polynomial multiplication and division can be easily implemented with shift registers [5].
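The two encoding rules in (10.6) and (10.7) can be sketched in Python for the binary case. This is a minimal illustration, assuming a (7, 4) cyclic code with g(x) = x^3 + x + 1 (a divisor of x^7 − 1); the code parameters, the integer representation of polynomials (bit i holds the coefficient of x^i), and the function names are assumptions made for the example rather than anything prescribed by the text.

```python
def gf2_mod(dividend, divisor):
    """Remainder of polynomial division with coefficients in GF(2)."""
    dlen = divisor.bit_length()
    while dividend.bit_length() >= dlen:
        shift = dividend.bit_length() - dlen
        dividend ^= divisor << shift        # subtract a shifted copy of the divisor
    return dividend


def gf2_mul(a, b):
    """Product of two polynomials with coefficients in GF(2)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


N, K = 7, 4
G_POLY = 0b1011                             # g(x) = x^3 + x + 1, degree n - k = 3


def encode_nonsystematic(m):
    """c(x) = m(x) g(x), as in (10.6)."""
    return gf2_mul(m, G_POLY)


def encode_systematic(m):
    """c(x) = m(x) x^(n-k) - r(x), as in (10.7); subtraction is XOR in GF(2)."""
    shifted = m << (N - K)
    return shifted ^ gf2_mod(shifted, G_POLY)


for m in (0b0001, 0b1011, 0b1111):
    c = encode_systematic(m)
    # The top k bits reproduce the message and the word is a multiple of g(x).
    print(f"{m:04b} -> {c:07b}", c >> (N - K) == m, gf2_mod(c, G_POLY) == 0)
```

Both encoders generate the same set of 16 codewords; they differ only in which message is assigned to which codeword.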


The first step in decoding the demodulated word is to determine if the word is a multiple of g(x). This is done by dividing it by g(x) and examining the remainder. Since polynomial division is a linear operation, the resulting syndrome s(x) depends only on the error pattern. If s(x) is the all-zero polynomial, transmission is errorless or an undetectable error pattern has occurred. If s(x) is nonzero, at least one error has occurred. This is the principle of the cyclic redundancy check (CRC). It remains to determine the most likely error pattern that could have generated this syndrome.

Single error correcting binary codes can use the syndrome to immediately locate the bit in error. More powerful codes use this information to determine the locations and values of multiple errors. The most prominent approach to doing so is the iterative technique developed by Berlekamp. This technique, which involves computing an error-locator polynomial and solving for its roots, was subsequently interpreted by Massey in terms of the design of a minimum-length shift register. Once the location and values of the errors are known, Chien’s search algorithm efficiently corrects them. The implementation complexity of these decoders increases only as the square of the number of errors to be corrected [4], but the decoders do not generalize easily to accommodate soft-decision information. Other decoding techniques, including Chase’s algorithm and threshold decoding, are easier to implement with soft-decision input [6]. Berlekamp’s algorithm can be used in conjunction with transform-domain decoding, which involves transforming the received block with a finite field Fourier-like transform and solving for errors in the transform domain. Since the implementation complexity of these decoders depends on the block length rather than the number of symbols corrected, this approach results in simpler circuitry for codes with high redundancy [13].
Other block codes have also been constructed, including codes that are based on transform-domain spectral properties, codes that are designed specifically for correction of burst errors, and codes that are decodable with straightforward threshold or majority logic decoders [5, 6, 7].
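The syndrome test described in this section, and the way a single error correcting binary code can locate the bit in error directly from the syndrome, can be sketched as follows. The sketch assumes the (7, 4) binary cyclic Hamming code with g(x) = x^3 + x + 1 and represents polynomials as Python integers; these choices, the table-based lookup, and the function names are illustrative assumptions and not a description of any particular decoder in the literature.

```python
def gf2_mod(dividend, divisor):
    """Remainder of polynomial division with coefficients in GF(2)."""
    dlen = divisor.bit_length()
    while dividend.bit_length() >= dlen:
        dividend ^= divisor << (dividend.bit_length() - dlen)
    return dividend


N = 7
G_POLY = 0b1011                             # g(x) = x^3 + x + 1 (assumed example)

# Syndrome of each single-bit error pattern e(x) = x^j.
SYNDROME_TO_POSITION = {gf2_mod(1 << j, G_POLY): j for j in range(N)}


def check_and_correct(received):
    """Return (corrected_word, error_position or None)."""
    s = gf2_mod(received, G_POLY)           # syndrome s(x) = received(x) mod g(x)
    if s == 0:
        return received, None               # errorless, or an undetectable pattern
    j = SYNDROME_TO_POSITION[s]             # every nonzero syndrome maps to one bit
    return received ^ (1 << j), j


codeword = 0b1011000                        # x^3 g(x), a multiple of g(x)
print(check_and_correct(codeword))          # no error detected
print(check_and_correct(codeword ^ 0b0010000))  # error in position 4 is located
```

For this perfect code every nonzero syndrome corresponds to exactly one single-bit error, so the lookup never fails; a pattern of two or more errors would be miscorrected rather than rejected.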

10.4 Important Classes of Block Codes

When errors occur independently, Bose–Chaudhuri–Hocquenghem (BCH) codes provide one of the best performances of known codes for a given block length and code rate. They are cyclic codes with $n = q^m - 1$, where m is any integer greater than 2. They are designed to correct up to t errors per word and so have designed distance d = 2t + 1; the minimum distance may be greater. Generator polynomials for these codes are listed in many texts, including [6]. These polynomials are of degree less than or equal to mt, and so k ≥ n − mt. BCH codes can be shortened to accommodate system requirements by deleting positions for information symbols.

Some subclasses of these codes are of special interest. Hamming codes are perfect single error correcting binary BCH codes. Full length codes have $n = 2^m - 1$ and k = n − m for any m greater than 2. The duals of these codes are maximal-length codes, with $n = 2^m - 1$, k = m, and $d_{min} = 2^{m-1}$. All $2^m - 1$ nonzero code vectors in these codes are cyclic shifts of a single nonzero code vector.

Reed–Solomon (RS) codes are nonbinary BCH codes defined over GF(q), where q is often taken as a power of two so that symbols can be represented by a sequence of bits. In these cases, correction of even a single symbol allows for correction of a burst of bit errors. The block length is n = q − 1, and the minimum distance $d_{min} = 2t + 1$ is achieved using only 2t parity symbols. Since RS codes meet the Singleton bound of $d_{min} \le n - k + 1$, they have the largest possible minimum distance for these values of n and k and are called maximum distance separable codes.

The Golay codes are the only nontrivial perfect codes that can correct more than one error. The (11, 6) ternary Golay code has minimum distance 5. The (23, 12) binary code is a triple error correcting BCH code with $d_{min} = 7$. To simplify implementation, it is often extended to a (24, 12) code through the addition of an extra parity bit. The extended code has $d_{min} = 8$.
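The parameter relations quoted above are easy to check numerically. The sketch below tests the perfect-code condition (the error patterns with t or fewer errors exactly exhaust the $q^{n-k}$ syndromes, counting the all-zero pattern) for the Hamming and Golay codes, and derives RS parameters from q and t. The function names are hypothetical, and the count of weight-i patterns over GF(q) is taken as C(n, i)(q − 1)^i.

```python
from math import comb


def is_perfect(n, k, t, q=2):
    """True if all patterns of weight <= t exactly fill the q^(n-k) syndromes."""
    patterns = sum(comb(n, i) * (q - 1) ** i for i in range(t + 1))
    return patterns == q ** (n - k)


def rs_parameters(q, t):
    """(n, k, d_min) of a t error correcting Reed-Solomon code over GF(q)."""
    n = q - 1                  # block length
    k = n - 2 * t              # only 2t parity symbols are needed
    d_min = n - k + 1          # Singleton bound met with equality
    return n, k, d_min


print(is_perfect(7, 4, 1))         # (7, 4) Hamming code
print(is_perfect(23, 12, 3))       # (23, 12) binary Golay code
print(is_perfect(11, 6, 2, q=3))   # (11, 6) ternary Golay code
print(is_perfect(15, 7, 2))        # a double error correcting BCH code, not perfect
print(rs_parameters(256, 16))      # RS code over GF(256) correcting 16 symbols
```

With q = 256 and t = 16 the last line gives a (255, 223) code with minimum distance 33, which has the same n and k as the RS outer code mentioned later in the chapter.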


The (23, 12) Golay code is also a binary quadratic residue code. These cyclic codes have prime length of the form n = 8m ± 1, with k = (n + 1)/2 and $d_{min} \ge \sqrt{n}$. Some of these codes are as good as the best codes known with these values of n and k, but it is unknown if there are good quadratic residue codes with large n [5].

Reed-Muller codes are equivalent to binary cyclic codes with an additional overall parity bit. For any m, the rth-order Reed-Muller code has $n = 2^m$, $k = \sum_{i=0}^{r} \binom{m}{i}$, and $d_{min} = 2^{m-r}$. The rth-order and (m − r − 1)th-order codes are duals, and the first-order codes are similar to maximal-length codes. These codes, and the closely related Euclidean geometry and projective geometry codes, can be decoded with threshold decoding.

The performance of several of these block codes is shown in Fig. 10.3 in terms of decoded bit error probability vs. $E_b/N_0$ for systems using coherent, hard-decision demodulated BPSK signalling. Many other block codes have also been developed, including Goppa codes, quasicyclic codes, burst error correcting Fire codes, and other lesser known codes.

10.5 Principles of Convolutional Coding

Convolutional codes map successive information k-tuples to a series of n-tuples such that the sequence of n-tuples has distance properties that allow for detection and correction of errors. Although these codes can be defined over any alphabet, their implementation has largely been restricted to binary signals, and only binary convolutional codes are considered here.

In addition to the code rate R = k/n, the constraint length K is an important parameter for these codes. Definitions vary; we will use the definition that K equals the number of k-tuples that affect formation of each n-tuple during encoding. That is, the value of an n-tuple depends on the k-tuple that arrives at the encoder during that encoding interval as well as the K − 1 previous information k-tuples.

Binary convolutional encoders can be implemented with kK-stage shift registers and n modulo-2 adders, an example of which is given in Fig. 10.4(a) for a rate 1/2, constraint length 3 code. The encoder shifts in a new k-tuple during each encoding interval and samples the outputs of the adders sequentially to form the coded output. Although connection diagrams similar to that of Fig. 10.4(a) completely describe the code, a more concise description can be given by stating the values of n, k, and K and giving the adder connections in the form of vectors or polynomials. For instance, the rate 1/2 code has the generator vectors $g_1 = 111$ and $g_2 = 101$, or equivalently, the generator polynomials $g_1(x) = x^2 + x + 1$ and $g_2(x) = x^2 + 1$. Alternatively, a convolutional code can be characterized by its impulse response, the coded sequence generated due to input of a single logic-1. It is straightforward to verify that the circuit in Fig. 10.4(a) has the impulse response 111011. Since modulo-2 addition is a linear operation, convolutional codes are linear, and the coded output can be viewed as the convolution of the input sequence with the impulse response, hence the name of this coding technique.
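The shift register encoder and its impulse response can be checked with a few lines of Python. This is a sketch for the k = 1 case only, using the generator vectors 111 and 101 given above; the function name and the list-based register are assumptions made for illustration.

```python
GENERATORS = ("111", "101")                # generator vectors g1 and g2 from the text


def conv_encode(bits, generators=GENERATORS):
    """Rate 1/n binary convolutional encoder (k = 1)."""
    K = len(generators[0])                 # constraint length
    taps = [[int(ch) for ch in g] for g in generators]
    register = [0] * K                     # register[0] holds the newest bit
    coded = []
    for b in bits:
        register = [b] + register[:-1]     # shift in the new information bit
        for tap in taps:                   # sample each modulo-2 adder in turn
            coded.append(sum(t * r for t, r in zip(tap, register)) % 2)
    return coded


impulse_response = conv_encode([1, 0, 0])
print(impulse_response)                    # [1, 1, 1, 0, 1, 1], i.e. 111011

# Linearity: the response to 1100 is the modulo-2 sum of the impulse response
# and a copy of it delayed by one encoding interval (two coded bits).
direct = conv_encode([1, 1, 0, 0])
shifted_sum = [a ^ b for a, b in zip(impulse_response + [0, 0],
                                     [0, 0] + impulse_response)]
print(direct == shifted_sum)               # True
```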
Shifted versions of the impulse response or generator vectors can be combined to form an infinite-order generator matrix which also describes the code.

Shift register circuits can be modeled as finite state machines. A Mealy machine description of a convolutional encoder requires $2^{k(K-1)}$ states, each describing a different value of the K − 1 k-tuples which have most recently entered the shift register. Each state has $2^k$ exit paths which correspond to the value of the incoming k-tuple. A state machine description for the rate 1/2 encoder depicted in Fig. 10.4(a) is given in Fig. 10.4(b). States are labeled with the contents of the two leftmost register stages; edges are labeled with information bit values and their corresponding coded output. The dimension of time is added to the description of the encoder with tree and trellis diagrams.
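A state table of the kind drawn in a state diagram can be generated mechanically from the generator vectors. The sketch below does this for the rate 1/2, constraint length 3 code with generator vectors 111 and 101; it assumes k = 1 and labels each state with the K − 1 most recent input bits, newest first. The function name and the string-based representation are illustrative choices only.

```python
from itertools import product


def state_table(generators=("111", "101")):
    """Mealy machine for a rate 1/n encoder (k = 1).

    Maps (state, input_bit) -> (next_state, output), where a state is the
    string of the K - 1 most recent input bits, newest first.
    """
    K = len(generators[0])
    table = {}
    for state_bits in product("01", repeat=K - 1):
        state = "".join(state_bits)
        for bit in "01":
            register = bit + state                     # K bits, newest on the left
            output = "".join(
                str(sum(int(g) & int(r) for g, r in zip(gen, register)) % 2)
                for gen in generators
            )
            table[(state, bit)] = (register[:-1], output)
    return table


table = state_table()
for (state, bit), (nxt, out) in sorted(table.items()):
    print(f"state {state}, input {bit} -> state {nxt}, output {out}")
```

The table has 2^(K−1) = 4 states with 2 exit paths each, matching the state and path counts given above for k = 1.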


FIGURE 10.3: Block code performance. Source: Sklar, B., 1988, Digital Communications: Fundamentals and Applications, © 1988, p. 300. Reprinted by permission of Prentice-Hall, Inc., Englewood Cliffs, NJ.

The tree diagram for the rate 1/2 convolutional code is given in Fig. 10.4(c), assuming the shift register is initially clear. Each node represents an encoding interval, from which the upper branch is taken if the input bit is a 0 and the lower branch is taken if the input bit is a 1. Each branch is labeled with the corresponding output bit sequence. A drawback of the tree representation is that it grows without bound as the length of the input sequence increases. This is overcome with the trellis diagram depicted in Fig. 10.4(d). Again, encoding results in left-to-right movement, where the upper of the two branches is taken whenever the input is a 0, the lower branch is taken when the input is a 1, and the output is the bit sequence which labels the branch taken. Each level of nodes corresponds to a state of the encoder as shown on the left-hand side of the diagram.


FIGURE 10.4: A rate 1/2, constraint length 3 convolutional code.

If the received sequence contains errors, it may no longer depict a valid path through the tree or trellis. It is the job of the decoder to determine the original path. In doing so, the decoder does not so much correct errors as find the closest valid path to the received sequence. As a result, the error correcting capability of a convolutional code is more difficult to quantify than that of a block code; it depends on how valid paths differ. One measure of this difference is the column distance $d_c(i)$, the minimum Hamming distance between all coded sequences generated over i encoding intervals which differ in the first interval. The nondecreasing sequence of column distance values is the distance profile of the code. The column distance after K intervals is the minimum distance of the code and is important for evaluating the performance of a code that uses threshold decoding. As i increases, $d_c(i)$ approaches the free distance of the code, $d_{free}$, which is the minimum Hamming distance in the set of arbitrarily long paths that diverge and then remerge in the trellis. With maximum likelihood decoding, convolutional codes can generally correct up to t errors within three to five constraint lengths, depending on how the errors are distributed, where

$$d_{free} \ge 2t + 1 \tag{10.8}$$
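For a small code, the free distance in (10.8) can be found by a shortest-path search over the encoder state diagram: leave the all-zero state with an input 1, then look for the minimum-weight path that remerges with the all-zero state. The sketch below uses Dijkstra's algorithm for that search. It assumes k = 1, generator vectors written as integers with the newest bit on the left (0b111 and 0b101 for the rate 1/2, constraint length 3 code of this section), and hypothetical function names.

```python
import heapq


def branch(state, bit, generators, K):
    """Return (next_state, output Hamming weight) for one encoding interval.

    `state` holds the K - 1 previous input bits, most recent in the high bit.
    """
    register = (bit << (K - 1)) | state              # K bits, newest on the left
    weight = sum(bin(register & g).count("1") % 2 for g in generators)
    return register >> 1, weight


def free_distance(generators, K):
    """Minimum weight of a path that diverges from and remerges with state 0."""
    start, w0 = branch(0, 1, generators, K)          # diverge with an input 1
    best = {start: w0}
    heap = [(w0, start)]
    while heap:
        w, state = heapq.heappop(heap)
        if state == 0:                               # remerged with all-zero path
            return w
        if w > best.get(state, float("inf")):
            continue                                 # stale heap entry
        for bit in (0, 1):
            nxt, bw = branch(state, bit, generators, K)
            if w + bw < best.get(nxt, float("inf")):
                best[nxt] = w + bw
                heapq.heappush(heap, (w + bw, nxt))
    return None


d_free = free_distance((0b111, 0b101), K=3)
t = (d_free - 1) // 2                                # largest t with d_free >= 2t + 1
print(d_free, t)
```

For this code the search returns a free distance of 5, so t = 2 in (10.8).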

The free distance can be calculated by exhaustively searching for the minimum-weight path that returns to the all-zero state, or evaluating the term of lowest degree in the generating function of the code.

The objective of a convolutional code is to maximize these distance properties. They generally improve as the constraint length of the code increases, and nonsystematic codes generally have better properties than systematic ones. Good codes have been found by computer search and are tabulated in many texts, including [6]. Convolutional codes with high code rate can be constructed by puncturing or periodically deleting coded symbols from a low rate code. A list of low rate codes and perforation matrices that result in good high rate codes can be found in many sources, including [13]. The performance of good punctured codes approaches that of the best convolutional codes known with similar rate, and decoder implementation is significantly less complex.

Convolutional codes can be catastrophic, having the potential to generate an unlimited number of decoded bit errors in response to a finite number of errors in the demodulated bit sequence. Catastrophic error propagation is avoided if the code has generator polynomials with a greatest common divisor of the form $x^a$ for any a or, equivalently, if there are no closed-loop paths in the state diagram with all-zero output other than the one taken with all-zero input. Systematic codes are not catastrophic.
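The greatest common divisor test for catastrophic codes stated above can be sketched for rate 1/n binary codes as follows. Polynomials over GF(2) are stored as Python integers (bit i is the coefficient of x^i), and a gcd of the form x^a is recognized by its having exactly one nonzero coefficient. The helper names are hypothetical, and the second pair of generators is an assumed textbook-style example of a catastrophic code, not one taken from this chapter.

```python
def gf2_mod(a, b):
    """Remainder of a(x) divided by b(x), coefficients in GF(2); b must be nonzero."""
    blen = b.bit_length()
    while a.bit_length() >= blen:
        a ^= b << (a.bit_length() - blen)
    return a


def gf2_gcd(a, b):
    """Greatest common divisor of two GF(2) polynomials (Euclid's algorithm)."""
    while b:
        a, b = b, gf2_mod(a, b)
    return a


def is_catastrophic(generators):
    """True unless the gcd of the generator polynomials has the form x^a."""
    g = 0
    for poly in generators:
        g = gf2_gcd(g, poly)
    return bin(g).count("1") != 1       # x^a has a single nonzero coefficient


# g1(x) = x^2 + x + 1 and g2(x) = x^2 + 1: gcd is 1, so not catastrophic.
print(is_catastrophic((0b111, 0b101)))  # False
# g1(x) = x + 1 and g2(x) = x^2 + 1 = (x + 1)^2: gcd is x + 1, catastrophic.
print(is_catastrophic((0b011, 0b101)))  # True
```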


10.6 Decoding of Convolutional Codes

In 1967, Viterbi developed a maximum likelihood decoding algorithm that takes advantage of the trellis structure to reduce the complexity of the evaluation. This algorithm has become known as the Viterbi algorithm. With each received n-tuple, the decoder computes a metric or measure of likelihood for all paths that could have been taken during that interval and discards all but the most likely to terminate on each node. An arbitrary decision is made if path metrics are equal. The metrics can be formed using either hard or soft decision information with little difference in implementation complexity. If the message has finite length and the encoder is subsequently flushed with zeros, a single decoded path remains. With a BSC, this path corresponds to the valid code sequence with minimum Hamming distance from the demodulated sequence. Full-length decoding becomes impractical as the length of the message sequence increases. The most likely paths tend to have a common stem, however, and selecting the trace value four or five times the constraint length prior to the present decoding depth results in near-optimum performance. Since the number of paths examined during each interval increases exponentially with the constraint length, the Viterbi algorithm also becomes impractical for codes with large constraint length. To date, Viterbi decoding has been implemented for codes with constraint lengths up to ten. Other decoding techniques, such as sequential and threshold decoding, can be used with larger constraint lengths. Sequential decoding was proposed by Wozencraft, and the most widely used algorithm was developed by Fano. Rather than tracking multiple paths through the trellis, the sequential decoder operates on a single path while searching the code tree for a path with high probability. 
It makes tentative decisions regarding the transmitted sequence, computes a metric between its proposed path and the demodulated sequence, and moves forward through the tree as long as the metric indicates that the path is likely. If the likelihood of the path becomes low, the decoder moves backward, searching other paths until it finds one with high probability. The number of computations involved in this procedure is almost independent of the constraint length and is typically quite small, but it can be highly variable, depending on the channel. Buffers must be provided to store incoming sequences as the decoder searches the tree. Their overflow is a significant limiting factor in the performance of these decoders. Figure 10.5 compares the performance of the Viterbi and sequential decoding algorithms for several convolutional codes operating on coherently demodulated BPSK signals corrupted by AWGN. Other decoding algorithms have also been developed, including syndrome decoding methods such as table look-up feedback decoding and threshold decoding [6]. These algorithms are easily implemented but offer suboptimal performance. Techniques such as the one discussed by [1] have been developed to support both soft input and soft output, but these decoding techniques typically increase decoder complexity.
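The Viterbi algorithm described at the start of this section can be sketched compactly for hard-decision input. The example below assumes the rate 1/2, constraint length 3 code with generator vectors 111 and 101, a message that is flushed with K − 1 zeros, and full-length decoding (no truncated traceback). Whole survivor paths are stored per state for clarity, which a practical decoder would not do; all names are illustrative.

```python
G = (0b111, 0b101)                 # generator vectors g1 = 111, g2 = 101
K = 3                              # constraint length
N_STATES = 1 << (K - 1)


def step(state, bit):
    """One encoding interval: return (next_state, output n-tuple)."""
    register = (bit << (K - 1)) | state
    out = tuple(bin(register & g).count("1") % 2 for g in G)
    return register >> 1, out


def encode(bits):
    """Encode the message and flush the encoder with K - 1 zeros."""
    state, coded = 0, []
    for b in list(bits) + [0] * (K - 1):
        state, out = step(state, b)
        coded.extend(out)
    return coded


def viterbi_decode(received):
    """Hard-decision Viterbi decoding using Hamming distance as the metric."""
    n = len(G)
    inf = float("inf")
    metrics = [0] + [inf] * (N_STATES - 1)          # encoder starts in state 0
    paths = [[] for _ in range(N_STATES)]
    for t in range(0, len(received), n):
        r = received[t:t + n]
        new_metrics = [inf] * N_STATES
        new_paths = [None] * N_STATES
        for state in range(N_STATES):
            if metrics[state] == inf:
                continue                             # state not yet reachable
            for bit in (0, 1):
                nxt, out = step(state, bit)
                m = metrics[state] + sum(a != b for a, b in zip(out, r))
                if m < new_metrics[nxt]:             # ties keep the first survivor
                    new_metrics[nxt] = m
                    new_paths[nxt] = paths[state] + [bit]
        metrics, paths = new_metrics, new_paths
    return paths[0][:-(K - 1)]                       # end in state 0, drop flush bits


message = [1, 0, 1, 1, 0, 0, 1]
coded = encode(message)
corrupted = list(coded)
corrupted[3] ^= 1                                    # two channel errors
corrupted[10] ^= 1
print(viterbi_decode(corrupted) == message)          # True
```

Because this code has free distance 5, any two errors in the terminated sequence leave the transmitted path as the unique closest valid path, so the message is recovered.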

10.7 Trellis-Coded Modulation

FIGURE 10.5: Convolutional code performance. Source: Omura, J.K. and Levitt, B.K., “Coded Error Probability Evaluation for Antijam Communication Systems,” IEEE Trans. Commun., vol. COM-30, no. 5, pp. 896–903, © 1982 IEEE. Reprinted by permission of IEEE.

Trellis-coded modulation (TCM) has received considerable attention since its development by Ungerboeck in the late 1970s [11]. Unlike block and convolutional codes, TCM schemes achieve coding gain by increasing the size of the signal alphabet and using multilevel/phase signalling. Like convolutional codes, sequences of coded symbols are restricted to certain valid patterns. In TCM, these patterns are chosen to have large Euclidean distance from one another so that a large number of corrupted sequences can be corrected. The Viterbi algorithm is often used to decode these sequences. Since the symbol transmission rate does not increase, coded and uncoded signals require the same transmission bandwidth. If transmission power is held constant, the signal constellation of the coded signal is denser. The loss in symbol separation, however, is more than overcome by the error correction capability of the code.

Ungerboeck investigated the increase in channel capacity that can be obtained by increasing the size of the signal set and restricting the pattern of transmitted symbols, and concluded that almost all of the additional capacity can be gained by doubling the number of points in the signal constellation. This is accomplished by encoding the binary data with a rate R = k/(k + 1) code and mapping sequences of k + 1 coded bits to points in a constellation of $2^{k+1}$ symbols. For example, the rate 2/3 encoder of Fig. 10.6(a) encodes pairs of source bits to three coded bits. Figure 10.6(b) depicts one stage in the trellis of the coded output where, as with the convolutional code, the state of the encoder is defined by the values of the two most recent bits to enter the shift register. Note that unlike the trellis for the convolutional code, this trellis contains parallel paths between nodes.

FIGURE 10.6: Rate 2/3 trellis-coded modulation.

The key to improving performance with TCM is to map the coded bits to points in the signal space such that the Euclidean distance between transmitted sequences is maximized. A method that ensures improved Euclidean distance is the method of set partitioning. This involves separating all parallel paths on the trellis with maximum distance and assigning the next greatest distance to paths that diverge from or merge onto the same node. Figures 10.6(c) and 10.6(d) give examples of mappings for the rate 2/3 code with 8-PSK and 8-PAM signal constellations, respectively.
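The effect of set partitioning on Euclidean distance can be seen numerically for 8-PSK. The sketch below assumes a unit-energy constellation and the usual partitioning rule of splitting each subset into two by taking alternate points; it reports the minimum distance within the subsets at each level. It illustrates only the distance growth, not the specific bit mappings of Figs. 10.6(c) and 10.6(d), and all names are hypothetical.

```python
import cmath
import math


def psk_points(M):
    """Unit-energy M-PSK constellation as complex numbers."""
    return [cmath.exp(2j * math.pi * i / M) for i in range(M)]


def min_distance(points):
    """Smallest Euclidean distance between any two points in the list."""
    return min(abs(a - b) for i, a in enumerate(points) for b in points[i + 1:])


def partition(indices):
    """Split a subset into two by taking alternate points."""
    return indices[0::2], indices[1::2]


points = psk_points(8)
level = [list(range(8))]               # level 0: the full constellation
distances = []
while len(level[0]) > 1:
    # Minimum intra-subset distance at this partition level.
    distances.append(min(min_distance([points[i] for i in s]) for s in level))
    level = [half for s in level for half in partition(s)]

print([round(d, 3) for d in distances])   # [0.765, 1.414, 2.0]
```

Parallel paths are assigned points from the final two-point subsets (distance 2), while paths that diverge from or merge onto the same node draw from the four-point subsets (distance √2).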

As with convolutional codes, the free distance of a TCM code is defined as the minimum distance between paths through the trellis, where the distance of concern is now Euclidean distance rather than Hamming distance. The free distance of an uncoded signal is defined as the distance between the closest signal points. When coded and uncoded signals have the same average power, the coding gain of the TCM system is defined as

$$\text{coding gain} = 20 \log_{10} \left( \frac{d_{\text{free, coded}}}{d_{\text{free, uncoded}}} \right) \tag{10.9}$$

It can be shown that the simple, rate 2/3 8 phase-shift keying (PSK) and 8 pulse-amplitude modulation (PAM) TCM systems provide gains of 3 dB and 3.3 dB, respectively [6]. More complex TCM systems yield gains up to 6 dB. Tables of good codes are given in [11].
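The 3 dB figure for the simple 8-PSK system can be reproduced from (10.9) under two stated assumptions: both constellations have unit energy, and the free distance of the coded system is set by its parallel paths, which set partitioning assigns to antipodal 8-PSK points (distance 2). The uncoded reference is 4-PSK, whose closest points are √2 apart. The function name is hypothetical, and this is a back-of-the-envelope check rather than a derivation.

```python
import math


def coding_gain_db(d_free_coded, d_free_uncoded):
    """Coding gain in dB as defined in (10.9)."""
    return 20 * math.log10(d_free_coded / d_free_uncoded)


d_uncoded_qpsk = math.sqrt(2)   # closest points of unit-energy uncoded 4-PSK
d_coded_8psk = 2.0              # assumed free distance: antipodal parallel paths

gain = coding_gain_db(d_coded_8psk, d_uncoded_qpsk)
print(f"{gain:.2f} dB")         # 3.01 dB
```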

10.8 Additional Measures

When the demodulated sequence contains bursts of errors, the performance of codes designed to correct independent errors improves if coded sequences are interleaved prior to transmission and deinterleaved prior to decoding. Deinterleaving separates the burst errors, making them appear more random and increasing the likelihood of accurate decoding. It is generally sufficient to interleave several block lengths of a block coded signal or several constraint lengths of a convolutionally encoded signal. Block interleaving is the most straightforward approach, but delay and memory requirements are halved with convolutional and helical interleaving techniques. Periodicity in the way sequences are combined is avoided with pseudorandom interleaving. Serially concatenated codes, first investigated by Forney, use two levels of coding to achieve a level of performance with less complexity than a single coding stage would require. The inner code interfaces with the modulator and demodulator and corrects the majority of the errors; the outer code corrects errors that appear at the output of the inner-code decoder. A convolutional code with Viterbi decoding is usually chosen as the inner code, and an RS code is often chosen as the outer code due to its ability to correct the bursts of bit errors which can result with incorrect decoding of trellis-coded sequences. Interleaving and deinterleaving outer-code symbols between coding stages offers further protection against the burst error output of the inner code. Product codes effectively place the data in a two dimensional array and use FEC techniques over both the rows and columns of this array. Not only do these codes result in error protection in two dimensions, but the manner in which the array is constructed can offer advantages similar to those achieved through interleaving.
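The block interleaving mentioned above can be sketched as a rows × cols array that is written row by row and read column by column. In the example, four codewords of length 7 are interleaved, a burst of four consecutive channel errors is applied, and deinterleaving leaves one error in each codeword. The array dimensions, burst position, and function names are assumptions chosen for illustration.

```python
def interleave(symbols, rows, cols):
    """Write row by row into a rows x cols array, read column by column."""
    assert len(symbols) == rows * cols
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(symbols, rows, cols):
    """Inverse of interleave: write column by column, read row by row."""
    assert len(symbols) == rows * cols
    return [symbols[c * rows + r] for r in range(rows) for c in range(cols)]


rows, cols = 4, 7                                # four codewords of block length 7
data = list(range(rows * cols))
tx = interleave(data, rows, cols)

burst = set(range(9, 13))                        # four consecutive channel positions
hit = [i in burst for i in range(rows * cols)]   # error flags as seen on the channel
rx_flags = deinterleave(hit, rows, cols)

# Count the errors that land in each codeword (each row of the array).
errors_per_codeword = [sum(rx_flags[r * cols:(r + 1) * cols]) for r in range(rows)]
print(errors_per_codeword)                       # [1, 1, 1, 1]
```

A single error correcting code applied to each row would therefore correct this burst, whereas without interleaving all four errors would fall in one or two codewords.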

10.9 Turbo Codes

The most recent significant achievement in FEC coding is the development of Turbo codes [3]. The principle of this coding technique is to encode the data with two or more constituent codes concatenated in parallel form. The received sequence is decoded in an iterative, serial approach using soft-input, soft-output decoders. This iterative decoding approach involves feedback of information in a manner similar to processes within the turbo engine, giving this coding technique its name. Turbo codes effectively result in the construction of relatively long codewords with few codewords being close in terms of Hamming distance, while at the same time constraining the implementation complexity of the decoder to practical limits.

The first Turbo codes developed used recursive systematic convolutional codes as the constituent codes, and punctured them to improve the code rate. The use of other constituent codes has since been considered. Two or more of these codes are concatenated in parallel, where code concatenation is combined with interleaving in order to increase the independence of the data sequences encoded by the constituent encoders. This apparent increase in randomness, implemented with simple interleavers, is an important contributing factor to the excellent performance of the decoders.


As in other multi-stage coding techniques, the complexity of the decoder is limited through use of separate decoding stages for each constituent code. The input to the first stage is the soft output of the demodulator for a finite-length received symbol sequence. Subsequent stages use both the demodulator output and an output of the previous decoding stage which is indicative of the reliability of the symbols. This information, gleaned from soft-output decoders, is called extrinsic information. Decoding proceeds by iterating through constituent decoders, each forwarding updated extrinsic information to the next decoder, until a predefined number of iterations has been completed or the extrinsic information indicates that high reliability has been achieved.

This approach results in very good performance at low values of $E_b/N_0$. Simulations have demonstrated error rates of $10^{-5}$ at signal-to-noise ratios appreciably less than 1 dB. At higher values of $E_b/N_0$, however, the performance curves can exhibit flattening if constituent codes are chosen in a manner that results in an overall small Hamming distance for the code.

Although this coding technique has shown great promise, there remains considerable work with regard to optimizing code parameters. Great strides have been made over the last few years in understanding the structure of these codes and relating them to serially concatenated and product codes, but many researchers are still examining these codes in order to advance their development. With this research will come optimization of the Turbo code process and application of these codes in various communication systems.

10.10 Applications

FEC coding remained of theoretical interest until advances in digital technology and improvements in decoding algorithms made their implementation possible. It has since become an attractive alternative to improving other system components or boosting transmission power. FEC codes are commonly used in digital storage systems, deep-space and satellite communication systems, terrestrial radio and band limited wireline systems, and have also been proposed for fiber optic transmission. Accordingly, the theory and practice of error correcting codes now occupies a prominent position in the field of communications engineering. Deep-space systems began using forward error correction in the early 1970s to reduce transmission power requirements, and used multiple error correcting RS codes for the first time in 1977 to protect against corruption of compressed image data in the Voyager missions [12]. The Consultative Committee for Space Data Systems (CCSDS) has since recommended use of a concatenated coding system which uses a rate 1/2, constraint length 7 convolutional inner code and a (255, 223) RS outer code. Coding is now commonly used in satellite systems to reduce power requirements and overall hardware costs and to allow closer orbital spacing of geosynchronous satellites [2]. FEC codes play integral roles in the VSAT, MSAT, INTELSAT, and INMARSAT systems [13]. Further, a (31, 15) RS code is used in the joint tactical information distribution system (JTIDS), a (7, 2) RS code is used in the air force satellite communication system (AFSATCOM), and a (204, 192) RS code has been designed specifically for satellite time division multiple access (TDMA) systems. Another code designed for military applications involves concatenation of a Golay and RS code with interleaving to ensure an imbalance of 1’s and 0’s in the transmitted symbol sequence and enhance signal recovery under severe noise and interference [2]. 
TCM has become commonplace in transmission of data over voiceband telephone channels. Modems developed since 1984 use trellis coded QAM modulation to provide robust communication at rates above 9.6 kb/s. Various coding techniques are used in the new digital cellular and personal communication standards, with an emphasis on convolutional and cyclic redundancy check codes [8].

FEC codes have also been widely used in digital recording systems, most prominently in the compact disc digital audio system. This system uses two levels of coding and interleaving in the cross-interleaved RS coding (CIRC) system to correct errors that result from disc imperfections and dirt and scratches which accumulate during use. Steps are also taken to mute uncorrectable sequences [12].

Defining Terms

Binary symmetric channel: A memoryless discrete data channel with binary signalling, hard-decision demodulation, and channel impairments that do not depend on the value of the symbol transmitted.
Bounded distance decoding: Limiting the error patterns which are corrected in an imperfect code to those with t or fewer errors.
Catastrophic code: A convolutional code in which a finite number of code symbol errors can cause an unlimited number of decoded bit errors.
Code rate: The ratio of source word length to codeword length, indicative of the amount of information transmitted per encoded symbol.
Coding gain: The reduction in signal-to-noise ratio required for specified error performance in a block or convolutional coded system over an uncoded system with the same information rate, channel impairments, and modulation and demodulation techniques. In TCM, the ratio of the squared free distance in the coded system to that of the uncoded system.
Column distance: The minimum Hamming distance between convolutionally encoded sequences of a specified length with different leading n-tuples.
Constituent codes: Two or more FEC codes that are combined in concatenated coding techniques.
Cyclic code: A block code in which cyclic shifts of code vectors are also code vectors.
Cyclic redundancy check: Use of the syndrome of a cyclic block code to detect errors.
Designed distance: The guaranteed minimum distance of a BCH code designed to correct up to t errors.
Discrete data channel: The concatenation of all system elements between FEC encoder output and decoder input.
Distance profile: The minimum Hamming distance after each encoding interval of convolutionally encoded sequences which differ in the first interval.
Erasure: A position in the demodulated sequence where the symbol value is unknown.
Extrinsic information: The output of a constituent soft decision decoder that is forwarded as input to the next decoding stage in iterative decoding of Turbo codes.
Finite field: A finite set of elements and operations of addition and multiplication that satisfy specific properties. Often called Galois fields and denoted GF(q), where q is the number of elements in the field. Finite fields exist for all q which are prime or the power of a prime.
Free distance: The minimum Hamming weight of convolutionally encoded sequences that diverge and remerge in the trellis. Equals the maximum column distance and the limiting value of the distance profile.
Generator matrix: A matrix used to describe a linear code. Code vectors equal the information vectors multiplied by this matrix.
Generator polynomial: The polynomial that is a divisor of all codeword polynomials in a cyclic block code; a polynomial that describes circuit connections in a convolutional encoder.
Hamming distance: The number of symbols in which codewords differ.
Hard decision: Demodulation that outputs only a value for each received symbol.
Interleaving: Shuffling the coded bit sequence prior to modulation and reversing this operation following demodulation. Used to separate and redistribute burst errors over several codewords (block codes) or constraint lengths (trellis codes) for higher probability of correct decoding by codes designed to correct random errors.
Linear code: A code whose code vectors form a vector space. Equivalently, a code where the addition of any two code vectors forms another code vector.
Maximum distance separable: A code with the largest possible minimum distance given the block length and code rate. These codes meet the Singleton bound of $d_{min} \le n - k + 1$.
Metric: A measure of goodness against which items are judged. In the Viterbi algorithm, an indication of the probability of a path being taken given the demodulated symbol sequence.
Minimum distance: In a block code, the smallest Hamming distance between any two codewords. In a convolutional code, the column distance after K intervals.
Parity check matrix: A matrix whose rows are orthogonal to the rows in the generator matrix of a linear code. Errors can be detected by multiplying the received vector by this matrix.
Perfect code: A t error correcting (n, k) block code in which $q^{n-k} - 1 = \sum_{i=1}^{t} \binom{n}{i} (q - 1)^i$.
Puncturing: Periodic deletion of code symbols from the sequence generated by a convolutional encoder for purposes of constructing a higher rate code. Also, deletion of parity bits in a block code.
Set partitioning: Rules for mapping coded sequences to points in the signal constellation that always result in a larger Euclidean distance for a TCM system than an uncoded system, given appropriate construction of the trellis.
Shannon Limit: The ratio of energy per data bit $E_b$ to one-sided noise power spectral density $N_0$ in an AWGN channel above which errorless transmission is possible when bandwidth limitations are not placed on the signal and transmission is at channel capacity. This limit has the value ln 2 = 0.693 = −1.6 dB.
Soft decision: Demodulation that outputs an estimate of the received symbol value along with an indication of the reliability of this value. Usually implemented by quantizing the received signal to more levels than there are symbol values.
Standard array decoding: Association of an error pattern with each syndrome by way of a lookup table.
Syndrome: An indication of whether or not errors are present in the demodulated symbol sequence.
Systematic code: A code in which the values of the message symbols can be identified by inspection of the code vector.
Vector space: An algebraic structure comprised of a set of elements in which operations of vector addition and scalar multiplication are defined. For our purposes, a set of n-tuples consisting of symbols from GF(q) with addition and multiplication defined in terms of elementwise operations from this finite field.
Viterbi algorithm: A maximum-likelihood decoding algorithm for trellis codes that discards low-probability paths at each stage of the trellis, thereby reducing the total number of paths that must be considered.

References

[1] Bahl, L.R., Cocke, J., Jelinek, F., and Raviv, J., Optimal decoding of linear codes for minimizing symbol error rate. IEEE Transactions on Information Theory, 20, 284–287, 1974.
[2] Berlekamp, E.R., Peile, R.E., and Pope, S.P., The application of error control to communications. IEEE Commun. Mag., 25(4), 44–57, 1987.
[3] Berrou, C., Glavieux, A., and Thitimajshima, P., Near Shannon limit error-correcting coding and decoding: turbo codes. Proceedings of ICC'93, Geneva, Switzerland, 1064–1070, 1993. Later expanded and published as: Berrou, C. and Glavieux, A., Near optimum error correcting coding and decoding. IEEE Transactions on Communications, 44(10), 1261–1271, 1996.
[4] Bhargava, V.K., Forward error correction schemes for digital communications. IEEE Commun. Mag., 21(1), 11–19, 1983.
[5] Blahut, R.E., Theory and Practice of Error Control Codes, Addison-Wesley, Reading, MA, 1983.
[6] Clark, G.C. Jr. and Cain, J.B., Error Correction Coding for Digital Communications, Plenum Press, New York, 1981.
[7] Lin, S. and Costello, D.J. Jr., Error Control Coding: Fundamentals and Applications, Prentice-Hall, Englewood Cliffs, NJ, 1983.
[8] Rappaport, T.S., Wireless Communications, Principles and Practice, Prentice-Hall and IEEE Press, NJ, 1996.
[9] Shannon, C.E., A mathematical theory of communication. Bell Syst. Tech. J., 27(3), 379–423 and 623–656, 1948.
[10] Sklar, B., Digital Communications: Fundamentals and Applications, Prentice-Hall, Englewood Cliffs, NJ, 1988.
[11] Ungerboeck, G., Trellis-coded modulation with redundant signal sets. IEEE Commun. Mag., 25(2), 5–11 and 12–21, 1987.
[12] Wicker, S.B. and Bhargava, V.K., Reed-Solomon Codes and Their Applications, IEEE Press, NJ, 1994.
[13] Wu, W.W., Haccoun, D., Peile, R., and Hirata, Y., Coding for satellite communication. IEEE J. Selected Areas in Commun., SAC-5(4), 724–748, 1987.

Further Information

There is now a large amount of literature on the subject of FEC coding. An introduction to the philosophy and limitations of these codes can be found in the second chapter of Lucky’s book Silicon Dreams: Information, Man, and Machine, St. Martin’s Press, New York, 1989. More practical introductions can be found in overview chapters of many communications texts. The number of texts devoted entirely to this subject also continues to grow. Although these texts summarize the algebra underlying block codes, more in-depth treatments can be found in mathematical texts. Survey papers appear occasionally in the literature, but the interested reader is directed to the seminal papers by Shannon, Hamming, Reed and Solomon, Bose and Chaudhuri, Hocquenghem, Wozencraft, Fano, Forney, Berlekamp, Massey, Viterbi, Ungerboeck, Berrou and Glavieux, among others. The most recent advances in the theory and implementation of error control codes are published in IEEE Transactions on Information Theory, IEEE Transactions on Communications, and special issues of IEEE Journal on Selected Areas in Communications.

Milstein, L.B. & Simon, M.K. “Spread Spectrum Communications” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999

Spread Spectrum Communications

Laurence B. Milstein University of California

Marvin K. Simon Jet Propulsion Laboratory

11.1 A Brief History
11.2 Why Spread Spectrum?
11.3 Basic Concepts and Terminology
11.4 Spread Spectrum Techniques
Direct Sequence Modulation • Frequency Hopping Modulation • Time Hopping Modulation • Hybrid Modulations
11.5 Applications of Spread Spectrum
Military • Commercial
Defining Terms
References

11.1 A Brief History

Spread spectrum (SS) has its origin in the military arena where the friendly communicator is 1) susceptible to detection/interception by the enemy and 2) vulnerable to intentionally introduced unfriendly interference (jamming). Communication systems that employ spread spectrum to reduce the communicator’s detectability and combat the enemy-introduced interference are respectively referred to as low probability of intercept (LPI) and antijam (AJ) communication systems. With the change in the current world political situation wherein the U.S. Department of Defense (DOD) has reduced its emphasis on the development and acquisition of new communication systems for the original purposes, a host of new commercial applications for SS has evolved, particularly in the area of cellular mobile communications. This shift from military to commercial applications of SS has demonstrated that the basic concepts that make SS techniques so useful in the military can also be put to practical peacetime use. In the next section, we give a simple description of these basic concepts using the original military application as the basis of explanation. The extension of these concepts to the mentioned commercial applications will be treated later on in the chapter.

11.2 Why Spread Spectrum?

Spread spectrum is a communication technique wherein the transmitted modulation is spread (increased) in bandwidth prior to transmission over the channel and then despread (decreased) in bandwidth by the same amount at the receiver. If it were not for the fact that the communication channel introduces some form of narrowband (relative to the spread bandwidth) interference, the receiver performance would be transparent to the spreading and despreading operations (assuming that they are identical inverses of each other). That is, after despreading the received signal would be identical to the transmitted signal prior to spreading. In the presence of narrowband interference, however, there is a significant advantage to employing the spreading/despreading procedure described. The reason is as follows. Because the interference is introduced after the transmitted signal is spread, the despreading operation at the receiver shrinks the desired signal back to its original bandwidth while at the same time spreading the undesired signal (interference) in bandwidth by the same amount, thus reducing its power spectral density. This, in turn, diminishes the effect of the interference on the receiver performance, which depends on the amount of interference power in the despread bandwidth. This simple explanation is at the heart of all spread spectrum techniques.

11.3 Basic Concepts and Terminology

To describe this process analytically and at the same time introduce some terminology that is common in spread spectrum parlance, we proceed as follows. Consider a communicator that desires to send a message using a transmitted power $S$ watts (W) at an information rate $R_b$ bits/s (bps). By introducing an SS modulation, the bandwidth of the transmitted signal is increased from $R_b$ Hz to $W_{ss}$ Hz, where $W_{ss} \gg R_b$ denotes the spread spectrum bandwidth. Assume that the channel introduces, in addition to the usual thermal noise (assumed to have a single-sided power spectral density (PSD) equal to $N_0$ W/Hz), an additive interference (jamming) having power $J$ distributed over some bandwidth $W_J$. After despreading, the desired signal bandwidth is once again equal to $R_b$ Hz and the interference PSD is now $N_J = J/W_{ss}$. Note that since the thermal noise is assumed to be white, i.e., it is uniformly distributed over all frequencies, its PSD is unchanged by the despreading operation and, thus, remains equal to $N_0$. Regardless of the signal and interferer waveforms, the equivalent bit energy-to-total noise spectral density ratio is, in terms of the given parameters,

$$\frac{E_b}{N_t} = \frac{E_b}{N_0 + N_J} = \frac{S/R_b}{N_0 + J/W_{ss}} \tag{11.1}$$

For most practical scenarios, the jammer limits performance and, thus, the effects of receiver noise in the channel can be ignored. Thus, assuming $N_J \gg N_0$, we can rewrite Eq. (11.1) as

$$\frac{E_b}{N_t} \cong \frac{E_b}{N_J} = \frac{S/R_b}{J/W_{ss}} = \frac{S}{J}\,\frac{W_{ss}}{R_b} \tag{11.2}$$

where the ratio $J/S$ is the jammer-to-signal power ratio and the ratio $W_{ss}/R_b$ is the spreading ratio and is defined as the processing gain of the system. Since the ultimate error probability performance of the communication receiver depends on the ratio $E_b/N_J$, the communicator's goal should be to minimize $J/S$ (by choice of $S$) and maximize the processing gain (by choice of $W_{ss}$ for a given desired information rate). The possible strategies for the jammer will be discussed in the section on military applications dealing with AJ communications.
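As a numerical illustration of Eqs. (11.1) and (11.2), the following minimal Python sketch evaluates both expressions. The function names and the link parameters (a jammer 20 dB stronger than the signal, a processing gain of 30 dB) are hypothetical values chosen only for illustration; they do not come from the text.

```python
import math

def eb_over_nt(S, Rb, Wss, J, N0=0.0):
    """Eq. (11.1): Eb/Nt = (S/Rb) / (N0 + J/Wss), all quantities in linear units."""
    Eb = S / Rb          # energy per bit (J)
    NJ = J / Wss         # jammer PSD after despreading (W/Hz)
    return Eb / (N0 + NJ)

def to_db(x):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(x)

# Hypothetical example values (assumptions, not from the text)
S, J = 1.0, 100.0        # signal and jammer power in watts: J/S = 20 dB
Rb, Wss = 10e3, 10e6     # bit rate and spread bandwidth: Wss/Rb = 30 dB

exact = eb_over_nt(S, Rb, Wss, J)    # Eq. (11.1) with thermal noise neglected
approx = (S / J) * (Wss / Rb)        # Eq. (11.2)
print(f"Eb/Nt = {to_db(exact):.1f} dB, Eq. (11.2) gives {to_db(approx):.1f} dB")
```

With these numbers the 30 dB processing gain offsets the 20 dB jammer advantage and leaves roughly 10 dB of $E_b/N_t$; passing a nonzero `N0` shows how thermal noise erodes that margin.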

11.4 Spread Spectrum Techniques

By far the two most popular spreading techniques are direct sequence (DS) modulation and frequency hopping (FH) modulation. In the following subsections, we present a brief description of each.

FIGURE 11.1: A DS-BPSK system (complex form).

11.4.1 Direct Sequence Modulation

A direct sequence modulation $c(t)$ is formed by linearly modulating the output sequence $\{c_n\}$ of a pseudorandom number generator onto a train of pulses, each having a duration $T_c$ called the chip time. In mathematical form,

$$c(t) = \sum_{n=-\infty}^{\infty} c_n\, p(t - nT_c) \tag{11.3}$$

where $p(t)$ is the basic pulse shape and is assumed to be of rectangular form. This type of modulation is usually used with binary phase-shift-keyed (BPSK) information signals, which have the complex form $d(t)\exp\{j(2\pi f_c t + \theta_c)\}$, where $d(t)$ is a binary-valued data waveform of rate $1/T_b$ bits/s and $f_c$ and $\theta_c$ are the frequency and phase of the data-modulated carrier, respectively. As such, a DS/BPSK signal is formed by multiplying the BPSK signal by $c(t)$ (see Fig. 11.1), resulting in the real transmitted signal

$$x(t) = \mathrm{Re}\left\{c(t)\,d(t)\exp\left[j(2\pi f_c t + \theta_c)\right]\right\} \tag{11.4}$$

Since $T_c$ is chosen so that $T_b \gg T_c$, then relative to the bandwidth of the BPSK information signal, the bandwidth of the DS/BPSK signal¹ is effectively increased by the ratio $T_b/T_c = W_{ss}/2R_b$, which is one-half the spreading factor or processing gain of the system. At the receiver, the sum of the transmitted DS/BPSK signal and the channel interference $I(t)$ (as discussed before, we ignore the presence of the additive thermal noise) is ideally multiplied by the identical DS modulation (this operation is known as despreading), which returns the DS/BPSK signal to its original BPSK form, whereas the real interference signal is now the real wideband signal $\mathrm{Re}\{I(t)c(t)\}$.

In the previous sentence, we used the word ideally, which implies that the PN waveform used for despreading at the receiver is identical to that used for spreading at the transmitter. This simple implication covers up a multitude of tasks that a practical DS receiver must perform. In particular, the receiver must first acquire the PN waveform. That is, the local PN generator that produces the despreading waveform at the receiver must be aligned (synchronized) to within one chip of the PN waveform of the received DS/BPSK signal. This is accomplished by employing some sort of search algorithm, which typically steps the local PN waveform sequentially in time by a fraction of a chip (e.g., half a chip) and at each position searches for a high degree of correlation between the received and local PN reference waveforms. The search terminates when the correlation exceeds a given threshold, which is an indication that the alignment has been achieved. After bringing the two PN waveforms into coarse alignment, a tracking algorithm is employed to maintain fine alignment. The most popular forms of tracking loops are the continuous time delay-locked loop and its time-multiplexed version, the tau–dither loop. It is the difficulty in synchronizing the receiver PN generator to subnanosecond accuracy that limits PN chip rates to values on the order of hundreds of Mchips/s, which implies the same limitation on the DS spread spectrum bandwidth $W_{ss}$.

¹ For the usual case of a rectangular spreading pulse $p(t)$, the PSD of the DS/BPSK modulation will have $(\sin x/x)^2$ form with first zero crossing at $1/T_c$, which is nominally taken as one-half the spread spectrum bandwidth $W_{ss}$.
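The spreading, despreading, and coarse acquisition steps described above can be sketched at baseband with one sample per chip. This is a simplified illustration under stated assumptions, not the receiver of Fig. 11.1: the PN code is a 31-chip m-sequence from a five-stage shift register (our choice), the jammer is modeled as a constant offset (a tone at the carrier frequency after downconversion), and the search steps in whole chips over an unmodulated pilot rather than in half-chip steps. All function names are hypothetical.

```python
def m_sequence():
    """One period (31 chips) of a +/-1 m-sequence from a[n] = a[n-5] XOR a[n-2]."""
    bits = [1, 1, 1, 1, 1]                    # any nonzero initial state works
    while len(bits) < 31:
        bits.append(bits[-5] ^ bits[-2])
    return [1 - 2 * b for b in bits]          # map bit 0 -> +1, bit 1 -> -1

def spread(data, code):
    """Multiply each +/-1 data bit by one full period of the chip sequence."""
    return [d * c for d in data for c in code]

def despread(chips, code):
    """Correlate each bit interval with the local code and decide on the sign."""
    n = len(code)
    decisions = []
    for start in range(0, len(chips), n):
        corr = sum(x * c for x, c in zip(chips[start:start + n], code))
        decisions.append(1 if corr > 0 else -1)
    return decisions

def serial_search(received, code, threshold=0.5):
    """Step the local code one chip at a time; stop at the first offset whose
    normalized correlation exceeds the threshold (coarse alignment)."""
    n = len(code)
    for offset in range(len(received) - n + 1):
        corr = sum(x * c for x, c in zip(received[offset:offset + n], code))
        if corr / n > threshold:
            return offset
    return None

code = m_sequence()
data = [1, -1, -1, 1]
tx = spread(data, code)
jammed = [x + 3.0 for x in tx]       # constant interferer, 3x the signal amplitude
recovered = despread(jammed, code)   # despreading averages the jammer over 31 chips

pilot = (code * 3)[7:]               # unmodulated code arriving with unknown phase
offset = serial_search(pilot, code)  # number of chips to slip before alignment
```

Because the chips of this code sum to −1 over a period, the jammer contributes only −3 to each correlation against a desired term of ±31, so the data decisions are unaffected. A practical search would also have to cope with data modulation, noise, and fractional-chip timing, none of which is modeled here.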

11.4.2 Frequency Hopping Modulation

A frequency hopping (FH) modulation $c(t)$ is formed by nonlinearly modulating a train of pulses with a sequence of pseudorandomly generated frequency shifts $\{f_n\}$. In mathematical terms, $c(t)$ has the complex form

$$c(t) = \sum_{n=-\infty}^{\infty} \exp\{j(2\pi f_n t + \phi_n)\}\, p(t - nT_h) \tag{11.5}$$

where $p(t)$ is again the basic pulse shape having a duration $T_h$, called the hop time, and $\{\phi_n\}$ is a sequence of random phases associated with the generation of the hops. FH modulation is traditionally used with multiple-frequency-shift-keyed (MFSK) information signals, which have the complex form $\exp\{j[2\pi(f_c + d(t))t]\}$, where $d(t)$ is an $M$-level digital waveform ($M$ denotes the symbol alphabet size) representing the information frequency modulation at a rate $1/T_s$ symbols/s (sps). As such, an FH/MFSK signal is formed by complex multiplying the MFSK signal by $c(t)$, resulting in the real transmitted signal

$$x(t) = \mathrm{Re}\left\{c(t)\exp\{j[2\pi(f_c + d(t))t]\}\right\} \tag{11.6}$$

In reality, $c(t)$ is never generated in the transmitter. Rather, $x(t)$ is obtained by applying the sequence of pseudorandom frequency shifts $\{f_n\}$ directly to the frequency synthesizer that generates the carrier frequency $f_c$ (see Fig. 11.2). In terms of the actual implementation, successive (not necessarily disjoint) $k$-chip segments of a PN sequence drive a frequency synthesizer, which hops the carrier over $2^k$ frequencies. In view of the large bandwidths over which the frequency synthesizer must operate, it is difficult to maintain phase coherence from hop to hop, which explains the inclusion of the sequence $\{\phi_n\}$ in the Eq. (11.5) model for $c(t)$.

FIGURE 11.2: An FH-MFSK system.

On a short term basis, e.g., within a given hop, the signal bandwidth is identical to that of the MFSK information modulation, which is typically much smaller than $W_{ss}$. On the other hand, when averaged over many hops, the signal bandwidth is equal to $W_{ss}$, which can be on the order of several GHz, i.e., an order of magnitude larger than that of implementable DS bandwidths. The exact relation between $W_{ss}$, $T_h$, $T_s$ and the number of frequency shifts in the set $\{f_n\}$ will be discussed shortly.

At the receiver, the sum of the transmitted FH/MFSK signal and the channel interference $I(t)$ is ideally complex multiplied by the identical FH modulation (this operation is known as dehopping), which returns the FH/MFSK signal to its original MFSK form, whereas the real interference signal is now the wideband (in the average sense) signal $\mathrm{Re}\{I(t)c(t)\}$. Analogous to the DS case, the receiver must acquire and track the FH signal so that the dehopping waveform is as close to the hopping waveform $c(t)$ as possible.

FH systems are traditionally classified in accordance with the relationship between $T_h$ and $T_s$. Fast frequency-hopped (FFH) systems are ones in which there exist one or more hops per data symbol, that is, $T_s = NT_h$ ($N$ an integer), whereas slow frequency-hopped (SFH) systems are ones in which there exists more than one symbol per hop, that is, $T_h = NT_s$. It is customary in SS parlance to refer to the FH/MFSK tone of shortest duration as a “chip”, despite the same usage for the PN chips associated with the code generator that drives the frequency synthesizer. Keeping this distinction in mind, in an FFH system where, as already stated, there are multiple hops per data symbol, a chip is equal to a hop. For SFH, where there are multiple data symbols per hop, a chip is equal to an MFSK symbol. Combining these two statements, the chip rate $R_c$ in an FH system is given by the larger of $R_h = 1/T_h$ and $R_s = 1/T_s$ and, as such, is the highest system clock rate.

The frequency spacing between the FH/MFSK tones is governed by the chip rate $R_c$ and is, thus, dependent on whether the FH modulation is FFH or SFH. In particular, for SFH where $R_c = R_s$, the spacing between FH/MFSK tones is equal to the spacing between the MFSK tones themselves. For noncoherent detection (the most commonly encountered in FH/MFSK systems), the separation of the MFSK symbols necessary to provide orthogonality² is an integer multiple of $R_s$. Assuming the minimum spacing, i.e., $R_s$, the entire spread spectrum band is then partitioned into a total of $N_t = W_{ss}/R_s = W_{ss}/R_c$ equally spaced FH tones. One arrangement, which is by far the most common, is to group these $N_t$ tones into $N_b = N_t/M$ contiguous, nonoverlapping bands, each with bandwidth $MR_s = MR_c$; see Fig. 11.3a. Assuming symmetric MFSK modulation around the carrier frequency, the center frequencies of the $N_b = 2^k$ bands represent the set of hop carriers, each of which is assigned to a given $k$-tuple of the PN code generator. In this fixed arrangement, each of the $N_t$ FH/MFSK tones corresponds to the combination of a unique hop carrier (PN code $k$-tuple) and a unique MFSK symbol. Another arrangement, which provides more protection against the sophisticated interferer (jammer), is to overlap adjacent $M$-ary bands by an amount equal to $R_c$; see Fig. 11.3b. Assuming again that the center frequency of each band corresponds to a possible hop carrier, then since all but $M - 1$ of the $N_t$ tones are available as center frequencies, the number of hop carriers has been increased from $N_t/M$ to $N_t - (M - 1)$, which for $N_t \gg M$ is approximately an increase in randomness by a factor of $M$.

For FFH, where $R_c = R_h$, the spacing between FH/MFSK tones is equal to the hop rate. Thus, the entire spread spectrum band is partitioned into a total of $N_t = W_{ss}/R_h = W_{ss}/R_c$ equally spaced FH tones, each of which is assigned to a unique $k$-tuple of the PN code generator that drives the frequency synthesizer. Since for FFH there are $R_h/R_s$ hops per symbol, the metric used to make a noncoherent decision on a particular symbol is obtained by summing up $R_h/R_s$ detected chip (hop) energies, resulting in a so-called noncoherent combining loss.

² An optimum noncoherent MFSK detector consists of a bank of energy detectors each matched to one of the $M$ frequencies in the MFSK set. In terms of this structure, the notion of orthogonality implies that for a given transmitted frequency there will be no crosstalk (energy spillover) in any of the other $M - 1$ energy detectors.

Figure 11.3a: Frequency distribution for FH-4FSK (nonoverlapping bands). Dashed lines indicate location of hop frequencies.

Figure 11.3b: Frequency distribution for FH-4FSK (overlapping bands).
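The tone and hop-carrier counts discussed for the nonoverlapping and overlapping band arrangements can be checked with a short calculation. The sketch below is only an illustration; the function names and the example values (a 1.024 MHz band, 4FSK at 1000 symbols/s, 100 hops/s) are assumptions chosen so that the numbers divide evenly.

```python
def chip_rate(Rh, Rs):
    """Chip rate Rc: the larger of the hop rate Rh and the symbol rate Rs."""
    return max(Rh, Rs)

def fh_band_plan(Wss, Rc, M, overlapping=False):
    """Return (Nt, number of hop carriers) for tones spaced by Rc across Wss.

    Nonoverlapping M-ary bands give Nt / M hop carriers; overlapping bands
    (adjacent bands shifted by one tone) give Nt - (M - 1).
    """
    Nt = Wss // Rc
    carriers = Nt - (M - 1) if overlapping else Nt // M
    return Nt, carriers

# Hypothetical SFH example (integer Hz values keep the arithmetic exact)
Wss, Rs, Rh, M = 1_024_000, 1000, 100, 4
Rc = chip_rate(Rh, Rs)                           # SFH: Rc equals Rs
plan_fixed = fh_band_plan(Wss, Rc, M)            # arrangement of Fig. 11.3a
plan_overlap = fh_band_plan(Wss, Rc, M, True)    # arrangement of Fig. 11.3b
k = plan_fixed[1].bit_length() - 1               # PN chips per hop when Nb = 2**k
print(plan_fixed, plan_overlap, k)
```

In this example the fixed arrangement yields 256 hop carriers (so $k = 8$ PN chips select each hop), while overlapping raises the count to 1021, close to the factor-of-$M$ increase noted in the text.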

11.4.3 Time Hopping Modulation

Time hopping (TH) is to spread spectrum modulation what pulse position modulation (PPM) is to information modulation. In particular, consider segmenting time into intervals of $T_f$ seconds and further segmenting each $T_f$ interval into $M_T$ increments of width $T_f/M_T$. Assuming a pulse of maximum duration equal to $T_f/M_T$, a time hopping spread spectrum modulation would take the form

$$c(t) = \sum_{n=-\infty}^{\infty} p\left[t - \left(n + \frac{a_n}{M_T}\right)T_f\right] \tag{11.7}$$

where $a_n$ denotes the pseudorandom position (one of $M_T$ uniformly spaced locations) of the pulse within the $T_f$-second interval. For DS and FH, we saw that multiplicative modulation, in which the transmitted signal is the product of the SS and information signals, was the natural choice. For TH, delay modulation is the natural choice. In particular, a TH-SS modulation takes the form

$$x(t) = \mathrm{Re}\left\{c(t - d(t))\exp\left[j(2\pi f_c t + \phi_T)\right]\right\} \tag{11.8}$$

where $d(t)$ is a digital information modulation at a rate $1/T_s$ sps. Finally, the dehopping procedure at the receiver consists of removing the sequence of delays introduced by $c(t)$, which restores the information signal back to its original form and spreads the interferer.
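To make the pulse timing of Eq. (11.7) concrete, the following sketch lists the start time of the pulse in each frame. It assumes the slot index $a_n$ takes the values 0 to $M_T - 1$ and uses Python's seeded `random` module as a stand-in for the PN generator; both choices, and the function name, are illustrative assumptions rather than part of the original description.

```python
import random

def th_pulse_times(num_frames, Tf, MT, seed=0):
    """Start time of the pulse in each frame: (n + a_n / MT) * Tf, per Eq. (11.7).

    a_n is drawn from {0, ..., MT - 1} (assumed indexing); the seeded random
    generator stands in for the pseudorandom sequence generator.
    """
    rng = random.Random(seed)
    times = []
    for n in range(num_frames):
        a_n = rng.randrange(MT)
        times.append((n + a_n / MT) * Tf)
    return times

# Hypothetical frame structure: 1 ms frames divided into 8 slots
Tf, MT = 1e-3, 8
times = th_pulse_times(5, Tf, MT)
for n, t in enumerate(times):
    print(f"frame {n}: pulse starts at {t * 1e3:.3f} ms")
```

Each pulse stays inside its own frame, so a receiver that knows the sequence $\{a_n\}$ can remove the pseudorandom delays, whereas an interferer without it sees pulses at apparently random positions.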

11.4.4 Hybrid Modulations

By blending together several of the previous types of SS modulation, one can form hybrid modulations that, depending on the system design objectives, can achieve a better performance against the interferer than can any of the SS modulations acting alone. One possibility is to multiply several of the $c(t)$ wideband waveforms [now denoted by $c^{(i)}(t)$ to distinguish them from one another], resulting in an SS modulation of the form

$$c(t) = \prod_i c^{(i)}(t) \tag{11.9}$$

Such a modulation may embrace the advantages of the various $c^{(i)}(t)$, while at the same time mitigating their individual disadvantages.

11.5 Applications of Spread Spectrum

11.5.1 Military

Antijam (AJ) Communications

As already noted, one of the key applications of spread spectrum is for antijam communications in a hostile environment. The basic mechanism by which a direct sequence spread spectrum receiver attenuates a noise jammer was illustrated in Section 11.3. Therefore, in this section, we will concentrate on tone jamming. Assume the received signal, denoted $r(t)$, is given by

$$r(t) = A\,x(t) + I(t) + n_w(t) \tag{11.10}$$

where $x(t)$ is given in Eq. (11.4), $A$ is a constant amplitude,

$$I(t) = \alpha \cos(2\pi f_c t + \theta) \tag{11.11}$$

and $n_w(t)$ is additive white Gaussian noise (AWGN) having two-sided spectral density $N_0/2$. In Eq. (11.11), $\alpha$ is the amplitude of the tone jammer and $\theta$ is a random phase uniformly distributed in $[0, 2\pi]$. If we employ the standard correlation receiver of Fig. 11.4, it is straightforward to show that the final test statistic out of the receiver is given by

$$g(T_b) = A T_b + \alpha \cos\theta \int_0^{T_b} c(t)\,dt + N(T_b) \tag{11.12}$$

where $N(T_b)$ is the contribution to the test statistic due to the AWGN. Noting that, for rectangular chips, we can express

$$\int_0^{T_b} c(t)\,dt = T_c \sum_{i=1}^{M} c_i \tag{11.13}$$

where

$$M \triangleq \frac{T_b}{T_c} \tag{11.14}$$

is one-half of the processing gain, it is straightforward to show that, for a given value of $\theta$, the signal-to-noise-plus-interference ratio, denoted by $S/N_{total}$, is given by

$$\frac{S}{N_{total}} = \frac{1}{\dfrac{N_0}{2E_b} + \dfrac{J}{MS}\cos^2\theta} \tag{11.15}$$

FIGURE 11.4: Standard correlation receiver.

In Eq. (11.15), the jammer power is

$$J \triangleq \frac{\alpha^2}{2} \tag{11.16}$$

and the signal power is

$$S \triangleq \frac{A^2}{2} \tag{11.17}$$

If we look at the second term in the denominator of Eq. (11.15), we see that the ratio $J/S$ is divided by $M$. Realizing that $J/S$ is the ratio of the jammer power to the signal power before despreading, and $J/MS$ is the ratio of the same quantity after despreading, we see that, as was the case for noise jamming, the benefit of employing direct sequence spread spectrum signalling in the presence of tone jamming is to reduce the effect of the jammer by an amount on the order of the processing gain. Finally, one can show that an estimate of the average probability of error of a system of this type is given by

$$P_e = \frac{1}{2\pi}\int_0^{2\pi} \phi\left(-\sqrt{\frac{S}{N_{total}}}\right) d\theta \tag{11.18}$$

where

$$\phi(x) \triangleq \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-y^2/2}\,dy \tag{11.19}$$

If Eq. (11.18) is evaluated numerically and plotted, the results are as shown in Fig. 11.5. It is clear from this figure that a large initial power advantage of the jammer can be overcome by a sufficiently large value of the processing gain.
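One way to carry out the numerical evaluation of Eq. (11.18) is sketched below, using only the Python standard library. The midpoint-rule integration, the function names, and the example operating points are our assumptions for illustration; the sketch does not claim to reproduce the exact curves of Fig. 11.5.

```python
import math

def gaussian_cdf(x):
    """phi(x) of Eq. (11.19), written in terms of the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def snr_total(theta, eb_n0, j_over_s, M):
    """Eq. (11.15) in linear units for a given jammer phase theta."""
    return 1.0 / (1.0 / (2.0 * eb_n0) + (j_over_s / M) * math.cos(theta) ** 2)

def average_pe(eb_n0_db, j_over_s_db, M, steps=2000):
    """Eq. (11.18): average phi(-sqrt(S/N_total)) over theta with a midpoint rule."""
    eb_n0 = 10.0 ** (eb_n0_db / 10.0)
    j_over_s = 10.0 ** (j_over_s_db / 10.0)
    dtheta = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * dtheta
        total += gaussian_cdf(-math.sqrt(snr_total(theta, eb_n0, j_over_s, M)))
    return total * dtheta / (2.0 * math.pi)

# Hypothetical operating points: Eb/N0 = 10 dB, jammer 20 dB above the signal
for M in (10, 100, 1000):
    print(f"M = {M:5d}: Pe = {average_pe(10.0, 20.0, M):.3e}")
```

As $J/S$ tends to zero the result reduces to the familiar BPSK error probability $\phi(-\sqrt{2E_b/N_0})$, and increasing $M$ at fixed $J/S$ lowers the error probability, in line with the discussion above.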

FIGURE 11.5: Plotted results of Eq. (11.18).

Low-Probability of Intercept (LPI)

The opposite side of the AJ problem is that of LPI, that is, the desire to hide your signal from detection by an intelligent adversary so that your transmissions will remain unnoticed and, thus, neither jammed nor exploited in any manner. An LPI design can be achieved in a variety of ways, including transmitting at the smallest possible power level and keeping the transmission time as short as possible. The choice of signal design is also important, however, and it is here that spread spectrum techniques become relevant. The basic mechanism is reasonably straightforward; if we start with a conventional narrowband signal, say a BPSK waveform having a spectrum as shown in Fig. 11.6a, and then spread it so that its new spectrum is as shown in Fig. 11.6b, the peak amplitude of the spectrum after spreading has been reduced by an amount on the order of the processing gain relative to what it was before spreading. Indeed, a sufficiently large processing gain will result in the spectrum of the signal after spreading falling below the ambient thermal noise level. Thus, there is no easy way for an unintended listener to determine that a transmission is taking place.

Figure 11.6a

Figure 11.6b

That is not to say the spread signal cannot be detected, however, merely that it is more difficult for an adversary to learn of the transmission. Indeed, there are many forms of so-called intercept receivers that are specifically designed to accomplish this very task. By way of example, probably the best known and simplest to implement is a radiometer, which is just a device that measures the total power present in the received signal. In the case of our intercept problem, even though we have lowered the power spectral density of the transmitted signal so that it falls below the noise floor, we have not lowered its power (i.e., we have merely spread its power over a wider frequency range). Thus, if the radiometer integrates over a sufficiently long period of time, it will eventually determine the presence of the transmitted signal buried in the noise. The key point, of course, is that the use of the spreading makes the interceptor’s task much more difficult, since he has no knowledge of the spreading code and, thus, cannot despread the signal.
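The relationship between spreading, PSD level, and total power can be illustrated with a rough calculation. The sketch below assumes a flat spectrum (PSD approximated as power divided by bandwidth) and hypothetical link values; it is an order-of-magnitude illustration, not a model of the spectra in Figs. 11.6a and 11.6b.

```python
import math

def psd_vs_noise_db(S, bandwidth, N0):
    """Flat-spectrum approximation: signal PSD (S / bandwidth) relative to N0, in dB."""
    return 10.0 * math.log10((S / bandwidth) / N0)

# Hypothetical values: 1e-15 W received power, 10 kb/s data, 10 MHz spread bandwidth
S, Rb, Wss, N0 = 1e-15, 10e3, 10e6, 4e-21

before = psd_vs_noise_db(S, Rb, N0)    # narrowband signal relative to the noise floor
after = psd_vs_noise_db(S, Wss, N0)    # spread signal relative to the noise floor
print(f"before spreading: {before:+.1f} dB, after spreading: {after:+.1f} dB")
```

Here the PSD drops by the 30 dB ratio $W_{ss}/R_b$, moving the signal from about 14 dB above the noise floor to about 16 dB below it, while the total power $S$ that a radiometer integrates is unchanged.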

11.5.2 Commercial

Multiple Access Communications

From the perspective of commercial applications, probably the most important use of spread spectrum communications is as a multiple accessing technique. When used in this manner, it becomes an alternative to either frequency division multiple access (FDMA) or time division multiple access (TDMA) and is typically referred to as either code division multiple access (CDMA) or spread spectrum multiple access (SSMA). When using CDMA, each signal in the set is given its own spreading sequence. As opposed to either FDMA, wherein all users occupy disjoint frequency bands but are transmitted simultaneously in time, or TDMA, whereby all users occupy the same bandwidth but transmit in disjoint intervals of time, in CDMA, all signals occupy the same bandwidth and are transmitted simultaneously in time; the different waveforms in CDMA are distinguished from one another at the receiver by the specific spreading codes they employ.

Since most CDMA detectors are correlation receivers, it is important when deploying such a system to have a set of spreading sequences that have relatively low pairwise cross-correlation between any two sequences in the set. Further, there are two fundamental types of operation in CDMA, synchronous and asynchronous. In the former case, the symbol transition times of all of the users are aligned; this allows for orthogonal sequences to be used as the spreading sequences and, thus, eliminates interference from one user to another. Alternatively, if no effort is made to align the sequences, the system operates asynchronously; in this latter mode, multiple access interference limits the ultimate channel capacity, but the system design exhibits much more flexibility.

CDMA has been of particular interest recently for applications in wireless communications. These applications include cellular communications, personal communications services (PCS), and wireless local area networks. The reason for this popularity is primarily due to the performance that spread spectrum waveforms display when transmitted over a multipath fading channel. To illustrate this idea, consider DS signalling. As long as the duration of a single chip of the spreading sequence is less than the multipath delay spread, the use of DS waveforms provides the system designer with one of two options. First, the multipath can be treated as a form of interference, which means the receiver should attempt to attenuate it as much as possible. Indeed, under this condition, all of the multipath returns that arrive at the receiver with a time delay greater than a chip duration from the multipath return to which the receiver is synchronized (usually the first return) will be attenuated because of the processing gain of the system. Alternatively, the multipath returns that are separated by more than a chip duration from the main path represent independent “looks” at the received signal and can be used constructively to enhance the overall performance of the receiver. That is, because all of the multipath returns contain information regarding the data that is being sent, that information can be extracted by an appropriately designed receiver. Such a receiver, typically referred to as a RAKE receiver, attempts to resolve as many individual multipath returns as possible and then to sum them coherently. This results in an implicit diversity gain, comparable to the use of explicit diversity, such as receiving the signal with multiple antennas.

The condition under which the two options are available can be stated in an alternate manner. If one envisions what is taking place in the frequency domain, it is straightforward to show that the condition of the chip duration being smaller than the multipath delay spread is equivalent to requiring that the spread bandwidth of the transmitted waveform exceed what is called the coherence bandwidth of the channel. This latter quantity is simply the inverse of the multipath delay spread and is a measure of the range of frequencies that fade in a highly correlated manner. Indeed, anytime the coherence bandwidth of the channel is less than the spread bandwidth of the signal, the channel is said to be frequency selective with respect to the signal. Thus, we see that to take advantage of DS signalling when used over a multipath fading channel, that signal should be designed such that it makes the channel appear frequency selective.

In addition to the desirable properties that spread spectrum signals display over multipath channels, there are two other reasons why such signals are of interest in cellular-type applications. The first has to do with a concept known as the reuse factor. In conventional cellular systems, either analog or digital, in order to avoid excessive interference from one cell to its neighbor cells, the frequencies used by a given cell are not used by its immediate neighbors (i.e., the system is designed so that there is a certain spatial separation between cells that use the same carrier frequencies). For CDMA, however, such spatial isolation is typically not needed, so that so-called universal reuse is possible. Further, because CDMA systems tend to be interference limited, for those applications involving voice transmission, an additional gain in the capacity of the system can be achieved by the use of voice activity detection. That is, in any given two-way telephone conversation, each user is typically talking only about 50% of the time. During the time when a user is quiet, he is not contributing to the instantaneous interference. Thus, if a sufficiently large number of users can be supported by the system, statistically only about one-half of them will be active simultaneously, and the effective capacity can be doubled.
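The difference between synchronous and asynchronous CDMA operation can be illustrated with a small baseband sketch. It assumes Walsh (Hadamard) codes as the orthogonal spreading sequences, equal-power users, one symbol interval, and no noise; these choices and the function names are ours and are meant only as an example of the correlation receiver idea, not as a description of any particular system.

```python
def walsh_codes(order):
    """Rows of a 2**order x 2**order Hadamard matrix (+/-1), Sylvester construction."""
    H = [[1]]
    for _ in range(order):
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

def cdma_transmit(user_bits, codes):
    """Chip-by-chip sum of every user's spread symbol (synchronous, equal power)."""
    n = len(codes[0])
    return [sum(b * code[i] for b, code in zip(user_bits, codes)) for i in range(n)]

def cdma_receive(chips, code):
    """Correlation receiver: normalized correlation of the chips with one user's code."""
    return sum(x * c for x, c in zip(chips, code)) / len(code)

codes = walsh_codes(3)                    # 8 users, 8 chips per symbol
bits = [1, -1, 1, 1, -1, -1, 1, -1]       # one +/-1 symbol per user
composite = cdma_transmit(bits, codes)
detected = [cdma_receive(composite, c) for c in codes]

# Loss of orthogonality: user 2's code delayed by one chip (cyclic shift),
# then correlated with user 3's code
late_user2 = codes[2][-1:] + codes[2][:-1]
leak = cdma_receive(late_user2, codes[3])
print(detected, leak)
```

With aligned symbols every user's bit is recovered exactly, since the cross terms cancel. The one-chip delay makes user 2's waveform correlate fully (with inverted sign) against user 3's code, which is why asynchronous systems rely on sequences with low cross-correlation at all shifts rather than on strict orthogonality.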

Interference Rejection

In addition to providing multiple accessing capability, spread spectrum techniques are of interest in the commercial sector for basically the same reasons they are in the military community, namely their AJ and LPI characteristics. However, the motivations for such interest differ. For example, whereas the military is interested in ensuring that systems they deploy are robust to interference generated by an intelligent adversary (i.e., exhibit jamming resistance), the interference of concern in commercial applications is unintentional. It is sometimes referred to as cochannel interference (CCI) and arises naturally as the result of many services using the same frequency band at the same time. And while such scenarios almost always allow for some type of spatial isolation between the interfering waveforms, such as the use of narrow-beam antenna patterns, at times the use of the inherent interference suppression property of a spread spectrum signal is also desired. Similarly, whereas the military is very much interested in the LPI property of a spread spectrum waveform, as indicated in Section 11.3, there are applications in the commercial segment where the same characteristic can be used to advantage. To illustrate these two ideas, consider a scenario whereby a given band of frequencies is somewhat sparsely occupied by a set of conventional (i.e., nonspread) signals. To increase the overall spectral efficiency of the band, a set of spread spectrum waveforms can be overlaid on the same frequency band, thus forcing the two sets of users to share common spectrum. Clearly, this scheme is feasible only if the mutual interference that one set of users imposes on the other is within tolerable limits. Because of the interference suppression properties of spread spectrum waveforms, the despreading process at each spread spectrum receiver will attenuate the components of the final test statistic due to the overlaid narrowband signals. 
Similarly, because of the LPI characteristics of spread spectrum waveforms, the increase in the overall noise level as seen by any of the conventional signals, due to the overlay, can be kept relatively small.

Defining Terms

Antijam communication system: A communication system designed to resist intentional jamming by the enemy.
Chip time (interval): The duration of a single pulse in a direct sequence modulation; typically much smaller than the information symbol interval.
Coarse alignment: The process whereby the received signal and the despreading signal are aligned to within a single chip interval.
Dehopping: Despreading using a frequency-hopping modulation.
Delay-locked loop: A particular implementation of a closed-loop technique for maintaining fine alignment.
Despreading: The notion of decreasing the bandwidth of the received (spread) signal back to its information bandwidth.
Direct sequence modulation: A signal formed by linearly modulating the output sequence of a pseudorandom number generator onto a train of pulses.
Direct sequence spread spectrum: A spreading technique achieved by multiplying the information signal by a direct sequence modulation.
Fast frequency-hopping: A spread spectrum technique wherein the hop time is less than or equal to the information symbol interval, i.e., there exist one or more hops per data symbol.
Fine alignment: The state of the system wherein the received signal and the despreading signal are aligned to within a small fraction of a single chip interval.
Frequency-hopping modulation: A signal formed by nonlinearly modulating a train of pulses with a sequence of pseudorandomly generated frequency shifts.
Hop time (interval): The duration of a single pulse in a frequency-hopping modulation.
Hybrid spread spectrum: A spreading technique formed by blending together several spread spectrum techniques, e.g., direct sequence, frequency-hopping, etc.
Low-probability-of-intercept communication system: A communication system designed to operate in a hostile environment wherein the enemy tries to detect the presence and perhaps characteristics of the friendly communicator’s transmission.
Processing gain (spreading ratio): The ratio of the spread spectrum bandwidth to the information data rate.
Radiometer: A device used to measure the total energy in the received signal.
Search algorithm: A means for coarse aligning (synchronizing) the despreading signal with the received spread spectrum signal.
Slow frequency-hopping: A spread spectrum technique wherein the hop time is greater than the information symbol interval, i.e., there exists more than one data symbol per hop.
Spread spectrum bandwidth: The bandwidth of the transmitted signal after spreading.
Spreading: The notion of increasing the bandwidth of the transmitted signal by a factor far in excess of its information bandwidth.
Tau–dither loop: A particular implementation of a closed-loop technique for maintaining fine alignment.
Time-hopping spread spectrum: A spreading technique that is analogous to pulse position modulation.
Tracking algorithm: An algorithm (typically closed loop) for maintaining fine alignment.

References

[1] Cook, C.F., Ellersick, F.W., Milstein, L.B., and Schilling, D.L., Spread Spectrum Communications, IEEE Press, 1983.
[2] Dixon, R.C., Spread Spectrum Systems, 3rd ed., John Wiley and Sons, Inc., 1994.
[3] Holmes, J.K., Coherent Spread Spectrum Systems, John Wiley and Sons, Inc., 1982.
[4] Simon, M.K., Omura, J.K., Scholtz, R.A., and Levitt, B.K., Spread Spectrum Communications Handbook, McGraw Hill, 1994 (previously published as Spread Spectrum Communications, Computer Science Press, 1985).
[5] Ziemer, R.E. and Peterson, R.L., Digital Communications and Spread Spectrum Techniques, Macmillan, 1985.

Paulraj, A.J. “Diversity” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999

Diversity

Arogyaswami J. Paulraj
Stanford University

12.1 Introduction
12.2 Diversity Schemes
Space Diversity • Polarization Diversity • Angle Diversity • Frequency Diversity • Path Diversity • Time Diversity • Transformed Diversity
12.3 Diversity Combining Techniques
Selection Combining • Maximal Ratio Combining • Equal Gain Combining • Loss of Diversity Gain Due to Branch Correlation and Unequal Branch Powers
12.4 Effect of Diversity Combining on Bit Error Rate
12.5 Concluding Remarks
Defining Terms
References

12.1 Introduction

Diversity is a commonly used technique in mobile radio systems to combat signal fading. The basic principle of diversity is as follows. If several replicas of the same information-carrying signal are received over multiple channels with comparable strengths, which exhibit independent fading, then there is a good likelihood that at least one of these received signals will not be in a fade at any given instant in time, thus making it possible to deliver adequate signal level to the receiver. Without diversity techniques, in noise-limited conditions, the transmitter would have to deliver a much higher power level to protect the link during the short intervals when the channel is severely faded. In mobile radio, the power available on the reverse link is severely limited by the battery capacity of hand-held subscriber units. Diversity methods play a crucial role in reducing transmit power needs. Also, cellular communication networks are mostly interference limited and, once again, mitigation of channel fading through use of diversity can translate into reduced variability of carrier-to-interference ratio (C/I), which in turn means lower C/I margin and hence better reuse factors and higher system capacity.

The basic principles of diversity have been known since 1927 when the first experiments in space diversity were reported. There are many techniques for obtaining independently fading branches, and these can be subdivided into two main classes. The first are explicit techniques where explicit redundant signal transmission is used to exploit diversity channels. Use of dual polarized signal transmission and reception in many point-to-point radios is an example of explicit diversity. Clearly such redundant signal transmission involves a penalty in frequency spectrum or additional power. In the second class are implicit diversity techniques: the signal is transmitted only once, but the decorrelating effects in the propagation medium such as multipaths are exploited to receive signals over multiple diversity channels. A good example of implicit diversity is the RAKE receiver in code division multiple access (CDMA) systems, which uses independent fading of resolvable multipaths to achieve diversity gain.

Figure 12.1 illustrates the principle of diversity where two independently fading signals are shown along with the selection diversity output signal which selects the stronger signal. The fades in the resulting signal have been substantially smoothed out while also yielding higher average power.

FIGURE 12.1: Example of diversity combining. Two independently fading signals 1 and 2. The signal 3 is the result of selecting the strongest signal.
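The effect shown in Figure 12.1 can be reproduced numerically. The following is a minimal sketch, not the procedure used to generate the figure: it assumes two independent Rayleigh-fading branches of unit mean power (so the instantaneous power is exponentially distributed), draws independent samples rather than a time series, and uses an illustrative fade threshold 10 dB below the mean. The names `signal_1`, `signal_2`, and `signal_3` simply mirror the figure labels.

```python
import random

def exponential_power_samples(n, mean_power, rng):
    """Instantaneous power of a Rayleigh-fading signal is exponentially distributed."""
    return [rng.expovariate(1.0 / mean_power) for _ in range(n)]

def outage_fraction(powers, threshold):
    """Fraction of samples whose power falls below the threshold."""
    return sum(1 for p in powers if p < threshold) / len(powers)

rng = random.Random(1)   # fixed seed so the run is repeatable
n = 100_000

signal_1 = exponential_power_samples(n, 1.0, rng)
signal_2 = exponential_power_samples(n, 1.0, rng)
# Selection diversity: keep the stronger of the two branches at each instant.
signal_3 = [max(a, b) for a, b in zip(signal_1, signal_2)]

threshold = 0.1          # assumed fade threshold, 10 dB below mean branch power
outage_single = outage_fraction(signal_1, threshold)
outage_selected = outage_fraction(signal_3, threshold)
mean_single = sum(signal_1) / n
mean_selected = sum(signal_3) / n

print(f"fraction of time in fade, one branch : {outage_single:.4f}")
print(f"fraction of time in fade, selected   : {outage_selected:.4f}")
print(f"mean power, one branch / selected    : {mean_single:.3f} / {mean_selected:.3f}")
```

Under these assumptions the selected signal spends roughly a tenth as much time below the threshold as either branch alone, and its mean power is higher, which is the qualitative behavior the figure describes.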

If multiple antennas are used at the transmitter, they too can be exploited for diversity. If the transmit channel is known, the antennas can be driven with complex conjugate channel weighting to co-phase the signals at the receive antenna. If the forward channel is not known, we have several methods to convert space selective fading at the transmit antennas to other forms of diversity exploitable in the receiver.

Exploiting diversity needs careful design of the communication link. In explicit diversity, multiple copies of the same signal are transmitted in channels using either a frequency, time, or polarization dimension. At the receiver end we need arrangements to receive the different diversity branches (this is true for both explicit and implicit diversity). The different diversity branches are then combined to reduce signal outage probability or bit error rate. In practice, the signals in the diversity branches may not show completely independent fading. The envelope cross correlation ρ between these signals is a measure of their independence:

$$\rho = \frac{E\left[(r_1 - \bar{r}_1)(r_2 - \bar{r}_2)\right]}{\sqrt{E\left|r_1 - \bar{r}_1\right|^2 \, E\left|r_2 - \bar{r}_2\right|^2}}$$

where $r_1$ and $r_2$ represent the instantaneous envelope levels of the normalized signals at the two receivers and $\bar{r}_1$ and $\bar{r}_2$ are their respective means. It has been shown that a cross correlation as high as 0.7 [3] between signal envelopes still provides a reasonable degree of diversity gain. Depending on the type of diversity employed, these diversity channels must be sufficiently separated along the appropriate diversity dimension. For spatial diversity, the antennas should be separated by more than the coherence distance to ensure a cross correlation of less than 0.7. Likewise in frequency diversity, the frequency separation must be larger than the coherence bandwidth, and in time diversity the separation between channel reuse in time should be longer than the coherence time. These coherence factors in turn depend on the channel characteristics. The coherence distance, coherence bandwidth, and coherence time vary inversely as the angle spread, delay spread, and Doppler spread, respectively.

If the receiver has a number of diversity branches, it has to combine these branches to maximize the signal level. Several techniques have been studied for diversity combining. We will describe three main techniques: selection combining, equal gain combining, and maximal ratio combining. Finally, we should note that diversity is primarily used to combat fading and if the signal does not show significant fading in the first place, for example when there is a direct path component, diversity combining may not provide significant diversity gain. In the case of antenna diversity, array gain proportional to the number of antennas will still be available.
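The envelope cross correlation can be estimated directly from measured or simulated envelope samples. The sketch below is illustrative only: the helper `correlated_rayleigh_envelopes` and its `mix` parameter (the correlation imposed on the underlying complex Gaussian signals) are assumptions introduced here to produce test data, and are not part of the original text.

```python
import math
import random

def envelope_correlation(r1, r2):
    """Sample estimate of the envelope cross correlation rho."""
    n = len(r1)
    m1 = sum(r1) / n
    m2 = sum(r2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(r1, r2)) / n
    var1 = sum((a - m1) ** 2 for a in r1) / n
    var2 = sum((b - m2) ** 2 for b in r2) / n
    return cov / math.sqrt(var1 * var2)

def correlated_rayleigh_envelopes(n, mix, rng):
    """Two Rayleigh envelopes whose underlying complex Gaussians have correlation `mix`."""
    r1, r2 = [], []
    for _ in range(n):
        g1 = complex(rng.gauss(0, 1), rng.gauss(0, 1))
        g2 = complex(rng.gauss(0, 1), rng.gauss(0, 1))
        h2 = mix * g1 + math.sqrt(1 - mix * mix) * g2
        r1.append(abs(g1))
        r2.append(abs(h2))
    return r1, r2

rng = random.Random(3)
a, b = correlated_rayleigh_envelopes(50_000, 0.0, rng)
rho_indep = envelope_correlation(a, b)      # independent branches
c, d = correlated_rayleigh_envelopes(50_000, 0.85, rng)
rho_corr = envelope_correlation(c, d)       # strongly coupled branches
rho_same = envelope_correlation(a, a)       # identical branches

print(f"independent: {rho_indep:.3f}  coupled: {rho_corr:.3f}  identical: {rho_same:.3f}")
```

A branch pair whose estimate stays below about 0.7 would, by the criterion quoted above, still be expected to give a reasonable degree of diversity gain.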

12.2 Diversity Schemes

There are several techniques for obtaining diversity branches, sometimes also known as diversity dimensions. The most important of these are discussed in the following sections.

12.2.1 Space Diversity

This has historically been the most common form of diversity in mobile radio base stations. It is easy to implement and does not require additional frequency spectrum resources. Space diversity is exploited on the reverse link at the base station receiver by spacing antennas apart so as to obtain sufficient decorrelation. The key to obtaining sufficiently uncorrelated fading of the antenna outputs is adequate spacing of the antennas. The required spacing depends on the degree of multipath angle spread. For example, if the multipath signals arrive from all directions in azimuth, as is usually the case at the mobile, an antenna spacing (coherence distance) of the order of 0.5λ to 0.8λ is quite adequate [5]. On the other hand, if the multipath angle spread is small, as in the case of base stations, the coherence distance is much larger. Also, empirical measurements show a strong coupling between antenna height and spatial correlation. Larger antenna heights imply larger coherence distances. Typically 10λ to 20λ separation is adequate to achieve ρ = 0.7 at base stations in suburban settings when the signals arrive from the broadside direction. The coherence distance can be 3 to 4 times larger for endfire arrivals. The endfire problem is averted in base stations with trisectored antennas as each sector needs to handle only signals arriving within ±60° of broadside. The coherence distance depends strongly on the terrain. Large multipath angle spread means smaller coherence distance. Base stations normally use space diversity in the horizontal plane only. Separation in the vertical plane can also be used, and the necessary spacing depends upon vertical multipath angle spread. This can be small for distant mobiles, making vertical plane diversity less attractive in most applications.

Space diversity is also exploitable at the transmitter. If the forward channel is known, it works much like receive space diversity. If it is not known, then space diversity can be transformed to another form of diversity exploitable at the receiver (see Section 12.2.7 below). If multiple antennas are used at both transmit and receive, the M transmit and N receive antennas both contribute to diversity. It can be shown that if simple weighting is used without additional bandwidth or time/memory processing, then maximum diversity gain is obtained if the transmitter and receiver use the left and right singular vectors of the M × N channel matrix, respectively. However, approaching the maximum diversity order of M × N requires the use of additional bandwidth or time/memory-based methods.
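For the special case of a single receive antenna (N = 1), the channel matrix reduces to a vector and the known-channel weighting becomes the complex conjugate weighting mentioned in the introduction. The following sketch illustrates that special case only; the channel model (independent complex Gaussian gains of unit mean power), the unit-norm weight constraint, and the function names are assumptions made for illustration.

```python
import math
import random

def conjugate_weights(h):
    """Unit-norm transmit weights that co-phase the signals at the receive antenna."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in h))
    return [x.conjugate() / norm for x in h]

def received_power(h, w):
    """Power of the superposed signal at a single receive antenna."""
    return abs(sum(hi * wi for hi, wi in zip(h, w))) ** 2

rng = random.Random(7)
M = 4   # number of transmit antennas (illustrative)
# Assumed channel: independent complex Gaussian gains, unit mean power each.
h = [complex(rng.gauss(0, math.sqrt(0.5)), rng.gauss(0, math.sqrt(0.5)))
     for _ in range(M)]

w_matched = conjugate_weights(h)        # requires knowledge of the forward channel
w_equal = [1 / math.sqrt(M)] * M        # same total power, no channel knowledge

p_matched = received_power(h, w_matched)
p_equal = received_power(h, w_equal)
total_channel_power = sum(abs(x) ** 2 for x in h)

print(f"matched weights: {p_matched:.3f}  (sum of branch powers: {total_channel_power:.3f})")
print(f"equal weights  : {p_equal:.3f}")
```

With conjugate weighting the received power equals the sum of the individual branch powers, which by the Cauchy-Schwarz inequality is the most any unit-norm weight vector can deliver.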

12.2.2 Polarization Diversity

In mobile radio environments, signals transmitted on orthogonal polarizations exhibit low fade correlation and, therefore, offer potential for diversity combining. Polarization diversity can be obtained either by explicit or implicit techniques. Note that with polarization only two diversity branches are available, as against space diversity where several branches can be obtained using multiple antennas. In explicit polarization diversity, the signal is transmitted and received in two orthogonal polarizations. For a fixed total transmit power, the power in each branch will be 3 dB lower than if single polarization is used. In the implicit polarization technique, the signal is launched in a single polarization, but is received with cross-polarized antennas. The propagation medium couples some energy into the cross-polarization plane. The observed cross-polarization coupling factor lies between 8 and 12 dB in mobile radio [8, 1]. The cross-polarization envelope decorrelation has been found to be adequate. However, the large branch imbalance reduces the available diversity gain.

With hand-held phones, the handset can be held at random orientations during a call. This results in energy being launched with varying polarization angles ranging from vertical to horizontal. This further increases the advantage of cross-polarized antennas at the base station since the two antennas can be combined to match the received signal polarization. This makes polarization diversity even more attractive. Recent work [4] has shown that with variable launch polarization, a cross-polarized antenna can give comparable overall (matching plus diversity) performance to a vertically polarized space diversity antenna. Finally, we should note that cross-polarized antennas can be deployed in a compact antenna assembly and do not need the large physical separation needed in space diversity antennas. This is an important advantage in PCS base stations where low profile antennas are needed.

12.2.3 Angle Diversity

In situations where the angle spread is very high, such as indoors or at the mobile unit in urban locations, signals collected from multiple nonoverlapping beams offer low fade correlation with balanced power in the diversity branches. Clearly, since directional beams imply use of antenna aperture, angle diversity is closely related to space diversity. Angle diversity has been utilized in indoor wireless LANs, where its use allows substantial increase in LAN throughputs [2].

12.2.4 Frequency Diversity

Another technique to obtain decorrelated diversity branches is to transmit the same signal over different frequencies. The frequency separation between carriers should be larger than the coherence bandwidth. The coherence bandwidth, of course, depends on the multipath delay spread of the channel. The larger the delay spread, the smaller the coherence bandwidth and the more closely we can space the frequency diversity channels. Clearly, frequency diversity is an explicit diversity technique and needs additional frequency spectrum.

A common form of frequency diversity is multicarrier (also known as multitone) modulation. This technique involves sending redundant data over a number of closely spaced carriers to benefit from frequency diversity, which is then exploited by applying interleaving and channel coding/forward error correction across the carriers. Another technique is to use frequency hopping wherein the interleaved and channel coded data stream is transmitted with widely separated frequencies from burst to burst. The wide frequency separation is chosen to guarantee independent fading from burst to burst.

12.2.5 Path Diversity

This implicit diversity is available if the signal bandwidth is much larger than the channel coherence bandwidth. The basis for this method is that when the multipath arrivals can be resolved in the receiver, diversity gain can be obtained because the paths fade independently. In CDMA systems, the multipath arrivals must be separated by more than one chip period and the RAKE receiver provides the diversity [9]. In TDMA systems, the multipath arrivals must be separated by more than one symbol period and the MLSE receiver provides the diversity.
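The resolvability condition can be checked with a few lines of code. This is a simplified sketch: the greedy grouping rule, the function name `resolvable_paths`, the example delay profile, and the chip rate of 1.2288 Mchip/s are all illustrative assumptions, not values taken from the text.

```python
def resolvable_paths(delays_us, resolution_us):
    """Return the path delays a receiver could treat as separate diversity branches.

    A path counts as resolvable only if it arrives more than one resolution
    interval (chip period for CDMA, symbol period for TDMA) after the last
    path already accepted. Paths closer than that are merged with it.
    """
    accepted = []
    for delay in sorted(delays_us):
        if not accepted or delay - accepted[-1] > resolution_us:
            accepted.append(delay)
    return accepted

# Hypothetical multipath delay profile, in microseconds.
delays_us = [0.0, 0.3, 1.0, 2.5]

chip_period_us = 1.0 / 1.2288          # assumed CDMA chip rate of 1.2288 Mchip/s
fingers = resolvable_paths(delays_us, chip_period_us)

print(f"chip period: {chip_period_us:.3f} us, resolvable paths: {fingers}")
```

Here the arrival at 0.3 µs falls within one chip of the first path and so contributes no additional diversity branch, whereas the later arrivals do.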

12.2.6 Time Diversity

In mobile communications channels, the mobile motion together with scattering in the vicinity of the mobile causes time selective fading of the signal with Rayleigh fading statistics for the signal envelope. Signal fade levels separated by the coherence time show low correlation and can be used as diversity branches if the same signal can be transmitted at multiple instants separated by the coherence time. The coherence time depends on the Doppler spread of the signal, which in turn is a function of the mobile speed and the carrier frequency.

Time diversity is usually exploited via interleaving, forward-error correction (FEC) coding, and automatic request for repeat (ARQ). These are sophisticated techniques to exploit channel coding and time diversity. One fundamental drawback with time diversity approaches is the delay needed to collect the repeated or interleaved transmissions. If the coherence time is large, as for example when the vehicle is slow moving, the required delay becomes too large to be acceptable for interactive voice conversation.

The statistical properties of fading signals depend on the field component used by the antenna, the vehicular speed, and the carrier frequency. For an idealized case of a mobile surrounded by scatterers in all directions, the autocorrelation function of the received signal x(t) (note this is not the envelope r(t)) can be shown to be

$$E\left[x(t)\,x(t+\tau)\right] = J_0\!\left(2\pi\tau v/\lambda\right)$$

where $J_0$ is the zeroth-order Bessel function of the first kind, v is the mobile velocity, and λ is the carrier wavelength.
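To get a feel for the delays involved, the autocorrelation can be evaluated numerically. The sketch below assumes a 900 MHz carrier and two illustrative speeds; it evaluates $J_0$ from its integral representation so that only the standard library is needed, and it uses the first zero of $J_0$ (at an argument of about 2.405) as one possible, assumed, measure of the separation at which the signal has decorrelated.

```python
import math

def bessel_j0(x, steps=2000):
    """J0(x) = (1/pi) * integral over [0, pi] of cos(x sin(theta)), midpoint rule."""
    h = math.pi / steps
    total = sum(math.cos(x * math.sin((k + 0.5) * h)) for k in range(steps))
    return total * h / math.pi

def autocorrelation(tau_s, speed_mps, wavelength_m):
    """Normalized autocorrelation of the received signal at lag tau."""
    return bessel_j0(2 * math.pi * tau_s * speed_mps / wavelength_m)

def first_null_lag(speed_mps, wavelength_m):
    """Lag at which the autocorrelation first crosses zero (J0 zero near 2.405)."""
    return 2.405 * wavelength_m / (2 * math.pi * speed_mps)

wavelength_m = 3.0e8 / 900e6            # assumed 900 MHz carrier
lag_vehicle = first_null_lag(30.0, wavelength_m)     # about 108 km/h
lag_pedestrian = first_null_lag(1.0, wavelength_m)   # walking pace

print(f"vehicle   : first null at {lag_vehicle * 1e3:.1f} ms")
print(f"pedestrian: first null at {lag_pedestrian * 1e3:.1f} ms")
```

The lag scales inversely with speed, so under these assumptions a pedestrian user needs a separation thirty times longer than the vehicle does, which illustrates why time diversity becomes impractical for slow-moving users.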

12.2.7 Transformed Diversity

In transformed diversity, the space diversity branches at the transmitter are transformed into other forms of diversity branches exploitable at the receiver. This is used when the forward channel is not known and shifts the responsibility of diversity combining to the receiver which has the necessary channel knowledge.

Space to Frequency

• Antenna-delay. Here the signal is transmitted from two or more antennas with delays of the order of a chip or symbol period in CDMA or TDMA, respectively. The different transmissions simulate resolved path arrivals that can be used as diversity branches by the RAKE or MLSE equalizer.
• Multicarrier modulation. The data stream after interleaving and coding is modulated as a multicarrier output using an inverse DFT. The carriers are then mapped to the different antennas. The space selective fading at the antennas is now transformed to frequency selective fading and diversity is obtained during decoding.

Space to Time

• Antenna hopping/phase rolling. In this method the data stream after coding and interleaving is switched randomly from antenna to antenna. The space selective fading at the transmitter is converted into a time selective fading at the receiver. This is a form of “active” fading.
• Space-time coding. The approach in space-time coding is to split the encoded data into multiple data streams each of which is modulated and simultaneously transmitted from different antennas. The received signal is a superposition of the multiple transmitted signals. Channel decoding can be used to recover the data sequence. Since the encoded data arrive over uncorrelated fade branches, diversity gain can be realized.

12.3 Diversity Combining Techniques

Several diversity combining methods are known. We describe three main techniques: selection, maximal ratio, and equal gain. They can be used with each of the diversity schemes discussed above.

12.3.1 Selection Combining

This is the simplest and perhaps the most frequently used form of diversity combining. In this technique, the one of the M diversity branches with the highest carrier-to-noise ratio (C/N) is connected to the output. See Fig. 12.2(a). The performance improvement due to selection diversity can be seen as follows. Let the signal in each branch exhibit Rayleigh fading with mean power $\sigma^2$. The density function of the envelope is given by

$$p(r_i) = \frac{r_i}{\sigma^2}\, e^{-r_i^2 / 2\sigma^2} \tag{12.1}$$

where $r_i$ is the signal envelope in each branch. If we define two new variables

$$\gamma_i = \frac{\text{Instantaneous signal power in each branch}}{\text{Mean noise power}}, \qquad \Gamma = \frac{\text{Mean signal power in each branch}}{\text{Mean noise power}}$$

then the probability that the C/N is less than or equal to some specified value $\gamma_s$ is

$$\mathrm{Prob}\left[\gamma_i \le \gamma_s\right] = 1 - e^{-\gamma_s/\Gamma} \tag{12.2}$$

FIGURE 12.2: Diversity combining methods for two diversity branches.

The probability that $\gamma_i$ in all M branches with independent fading will be simultaneously less than or equal to $\gamma_s$ is then

$$\mathrm{Prob}\left[\gamma_1, \gamma_2, \ldots, \gamma_M \le \gamma_s\right] = \left(1 - e^{-\gamma_s/\Gamma}\right)^{M} \tag{12.3}$$

This is the distribution of the best signal envelope from the M diversity branches. Figure 12.3 shows the distribution of the combiner output C/N for M = 1, 2, 3, and 4 branches. The improvement in signal quality is significant. For example, at the 99% reliability level, the improvement in C/N is 10 dB for two branches and 16 dB for four branches. Selection combining also increases the mean C/N of the combiner output, which can be shown to be [3]

$$\mathrm{Mean}(\gamma_s) = \Gamma \sum_{k=1}^{M} \frac{1}{k} \tag{12.4}$$

This indicates that with 4 branches, for example, the mean C/N of the selected branch is a factor of 2.08 (about 3.2 dB) better than Γ, the mean C/N in any one branch.
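The figures quoted for selection combining follow directly from (12.3) and (12.4). The short sketch below inverts (12.3) at a 1% outage probability (99% reliability) and evaluates the harmonic sum in (12.4); all levels are expressed relative to the mean branch C/N, and the function names are illustrative.

```python
import math

def selection_cdf(x, branches):
    """Eq. (12.3): probability that the selected C/N is at most x times the mean branch C/N."""
    return (1.0 - math.exp(-x)) ** branches

def selection_level_db(outage, branches):
    """C/N level (dB relative to the mean branch C/N) exceeded with probability 1 - outage."""
    x = -math.log(1.0 - outage ** (1.0 / branches))   # closed-form inverse of (12.3)
    return 10.0 * math.log10(x)

def selection_mean_gain(branches):
    """Eq. (12.4): mean output C/N divided by the mean branch C/N."""
    return sum(1.0 / k for k in range(1, branches + 1))

outage = 0.01   # 99% reliability
reference_db = selection_level_db(outage, 1)
gain_2 = selection_level_db(outage, 2) - reference_db
gain_4 = selection_level_db(outage, 4) - reference_db
mean_gain_4 = selection_mean_gain(4)

print(f"improvement at 99% reliability: M=2 {gain_2:.1f} dB, M=4 {gain_4:.1f} dB")
print(f"mean C/N gain with M=4: {mean_gain_4:.2f} "
      f"({10 * math.log10(mean_gain_4):.1f} dB)")
```

The computed improvements round to the 10 dB and 16 dB values stated above.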

FIGURE 12.3: Probability distribution of signal envelope for selection combining.

12.3.2 Maximal Ratio Combining

In this technique the M diversity branches are first co-phased and then weighted proportionally to their signal level before summing. See Fig. 12.2(b). The distribution of the maximal ratio combiner has been shown to be [5]

$$\mathrm{Prob}\left[\gamma \le \gamma_m\right] = 1 - e^{-\gamma_m/\Gamma} \sum_{k=1}^{M} \frac{\left(\gamma_m/\Gamma\right)^{k-1}}{(k-1)!} \tag{12.5}$$

The distribution of the output of a maximal ratio combiner is shown in Fig. 12.4. Maximal ratio combining is known to be optimal in the sense that it yields the best statistical reduction of fading of any linear diversity combiner. At the 99% reliability level, the maximal ratio combiner provides an 11.5 dB gain for two branches and a 19 dB gain for four branches, an improvement of 1.5 and 3 dB, respectively, over the selection diversity combiner. The mean C/N of the combined signal may be easily shown to be

$$\mathrm{Mean}(\gamma_m) = M\,\Gamma \tag{12.6}$$

Therefore, combiner output mean varies linearly with M. This confirms the intuitive result that the output C/N averaged over fades should provide gain proportional to the number of diversity branches. This is a situation similar to conventional beamforming.
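The reliability gains quoted for maximal ratio combining can be checked against (12.5). Because (12.5) has no closed-form inverse for M > 1, the sketch below solves for the 1% outage level by bisection; the search interval and iteration count are arbitrary choices, and levels are again expressed relative to the mean branch C/N.

```python
import math

def mrc_cdf(x, branches):
    """Eq. (12.5): probability that the combined C/N is at most x times the mean branch C/N."""
    partial = sum(x ** (k - 1) / math.factorial(k - 1)
                  for k in range(1, branches + 1))
    return 1.0 - math.exp(-x) * partial

def mrc_level_db(outage, branches):
    """C/N level (dB relative to the mean branch C/N) exceeded with probability 1 - outage."""
    lo, hi = 0.0, 50.0                  # assumed search interval for x
    for _ in range(200):                # bisection on the monotone CDF
        mid = (lo + hi) / 2.0
        if mrc_cdf(mid, branches) < outage:
            lo = mid
        else:
            hi = mid
    return 10.0 * math.log10((lo + hi) / 2.0)

outage = 0.01   # 99% reliability
reference_db = mrc_level_db(outage, 1)
mrc_gain_2 = mrc_level_db(outage, 2) - reference_db
mrc_gain_4 = mrc_level_db(outage, 4) - reference_db

print(f"MRC improvement at 99% reliability: M=2 {mrc_gain_2:.1f} dB, M=4 {mrc_gain_4:.1f} dB")
```

The results land within a few tenths of a decibel of the 11.5 dB and 19 dB values given in the text.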

FIGURE 12.4: Probability distribution for signal envelope for maximal ratio combining.

12.3.3 Equal Gain Combining

In some applications it may be difficult to estimate the branch amplitudes accurately. In that case the combining gains may all be set to unity, and the diversity branches merely summed after co-phasing [see Fig. 12.2(c)]. The distribution of the equal gain combiner output does not have a neat expression and has been computed by numerical evaluation. Its performance has been shown to be within a decibel of maximal ratio combining. The mean C/N can be shown to be [3]

$$\mathrm{Mean}(\gamma_e) = \Gamma \left[ 1 + (M - 1)\,\frac{\pi}{4} \right] \tag{12.7}$$

Like maximal ratio combining, the mean C/N for equal gain combining grows almost linearly with M and is approximately only one decibel poorer than maximal ratio combiner even with an infinite number of branches.
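The size of the gap between the two combiners follows from comparing (12.6) with (12.7). A minimal sketch, with illustrative function names, evaluates the ratio of the mean C/N values in decibels for several branch counts:

```python
import math

def mrc_mean_gain(branches):
    """Eq. (12.6): mean output C/N of maximal ratio combining over the mean branch C/N."""
    return float(branches)

def egc_mean_gain(branches):
    """Eq. (12.7): mean output C/N of equal gain combining over the mean branch C/N."""
    return 1.0 + (branches - 1) * math.pi / 4.0

def egc_penalty_db(branches):
    """How far the equal gain mean C/N falls below the maximal ratio mean C/N, in dB."""
    return 10.0 * math.log10(mrc_mean_gain(branches) / egc_mean_gain(branches))

for m in (1, 2, 4, 8, 1000):
    print(f"M = {m:4d}: penalty = {egc_penalty_db(m):.2f} dB")

limit_db = 10.0 * math.log10(4.0 / math.pi)   # limiting value as M grows without bound
print(f"limit for large M: {limit_db:.2f} dB")
```

The penalty is zero for a single branch, grows with M, and approaches $10\log_{10}(4/\pi)$, which is about 1.05 dB.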

12.3.4 Loss of Diversity Gain Due to Branch Correlation and Unequal Branch Powers

The above analysis assumed that the fading signals in the diversity branches were all uncorrelated and of equal power. In practice, this may be difficult to achieve and, as we saw earlier, a branch cross-correlation coefficient as high as ρ = 0.7 is considered acceptable. Also, equal mean powers in diversity branches are rarely available. In such cases we can expect a certain loss of diversity gain. However, since most of the damage in fading is due to deep fades, and since the chance of coincidental deep fades is small even for moderate branch correlation, one can expect a reasonable tolerance to branch correlation. The distribution of the output signal envelope of the maximal ratio combiner has been shown to be [6]:

$$\mathrm{Prob}\left[\gamma_m\right] = \sum_{n=1}^{M} \frac{A_n}{2\lambda_n}\, e^{-\gamma_m / 2\lambda_n} \tag{12.8}$$

where $\lambda_n$ are the eigenvalues of the M × M branch envelope covariance matrix whose elements are defined by

$$R_{ij} = E\left[r_i\, r_j^{*}\right] \tag{12.9}$$

and $A_n$ is defined by

$$A_n = \prod_{\substack{k=1 \\ k \ne n}}^{M} \frac{1}{1 - \lambda_k/\lambda_n} \tag{12.10}$$

12.4 Effect of Diversity Combining on Bit Error Rate

So far we have studied the distribution of the instantaneous envelope or C/N after diversity combining. We will now briefly survey how diversity combining affects BER performance in digital radio links; we assume maximal ratio combining. To begin, let us first examine the effect of Rayleigh fading on the BER performance of digital transmission links. This has been studied by several authors and is summarized in [7]. Table 12.1 gives the BER expressions in the large $E_b/N_0$ case for coherent binary PSK and coherent binary orthogonal FSK for unfaded and Rayleigh faded additive white Gaussian noise (AWGN) channels. $\bar{E}_b/N_0$ represents the average $E_b/N_0$ for the fading channel.

TABLE 12.1 Comparison of BER Performance for Unfaded and Rayleigh Faded Signals

Modulation    Unfaded BER                                               Faded BER
Coh BPSK      $\frac{1}{2}\,\mathrm{erfc}\sqrt{E_b/N_0}$                $\frac{1}{4\,\bar{E}_b/N_0}$
Coh FSK       $\frac{1}{2}\,\mathrm{erfc}\sqrt{\frac{1}{2}\,E_b/N_0}$   $\frac{1}{2\,\bar{E}_b/N_0}$

Observe that error rates decrease only inversely with SNR as against exponential decreases for the unfaded channel. Also note that for fading channels, coherent binary PSK is 3 dB better than coherent binary FSK, exactly the same advantage as in the unfaded case. Even for a modest target BER of $10^{-2}$ that is usually needed in mobile communications, the loss due to fading can be very high: 17.2 dB.

To obtain the BER with maximal ratio diversity combining we have to average the BER expression for the unfaded BER with the distribution obtained for the maximal ratio combiner given in (12.5). Analytical expressions have been derived for these in [7]. For a branch SNR greater than 10 dB, the BER after maximal ratio diversity combining is given in Table 12.2. We observe that the probability of error varies as $1/(\bar{E}_b/N_0)$ raised to the Lth power. Thus, diversity reduces the error rate exponentially as the number of independent branches increases.
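The loss due to fading and the benefit of diversity can both be evaluated from the large-SNR expressions for coherent BPSK. The sketch below is illustrative only. It assumes the unfaded form $\frac{1}{2}\mathrm{erfc}\sqrt{E_b/N_0}$ and the faded form $1/(4\bar{E}_b/N_0)$ from Table 12.1, and for L-branch maximal ratio combining it assumes the high-SNR approximation $\left(1/(4\bar{E}_b/N_0)\right)^L \binom{2L-1}{L}$ found in standard treatments such as [7]. Function names and the bisection search range are arbitrary choices.

```python
import math

def ber_bpsk_unfaded(ebn0):
    """Coherent BPSK on an unfaded AWGN channel (Eb/N0 as a linear ratio)."""
    return 0.5 * math.erfc(math.sqrt(ebn0))

def ber_bpsk_diversity(mean_branch_ebn0, branches=1):
    """High-SNR approximation for Rayleigh fading with L-branch maximal ratio combining."""
    return (1.0 / (4.0 * mean_branch_ebn0)) ** branches * math.comb(2 * branches - 1, branches)

def required_db_unfaded(target_ber):
    """Eb/N0 in dB needed to reach the target BER without fading (bisection)."""
    lo, hi = -10.0, 30.0                       # assumed search range in dB
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if ber_bpsk_unfaded(10 ** (mid / 10.0)) > target_ber:
            lo = mid                           # still too many errors: need more SNR
        else:
            hi = mid
    return (lo + hi) / 2.0

def required_db_faded(target_ber):
    """Mean Eb/N0 in dB needed on a single Rayleigh branch, from BER = 1/(4 Eb/N0)."""
    return 10.0 * math.log10(1.0 / (4.0 * target_ber))

def fading_loss_db(target_ber):
    return required_db_faded(target_ber) - required_db_unfaded(target_ber)

loss_1e2 = fading_loss_db(1e-2)
loss_1e3 = fading_loss_db(1e-3)
print(f"fading loss at BER 1e-2: {loss_1e2:.1f} dB, at BER 1e-3: {loss_1e3:.1f} dB")

mean_branch_ebn0 = 10.0                        # 10 dB per branch
for branches in (1, 2, 4):
    print(f"L = {branches}: BER = {ber_bpsk_diversity(mean_branch_ebn0, branches):.2e}")
```

Evaluated this way, the coherent BPSK loss comes out near 9.7 dB at a target BER of $10^{-2}$ and near 17.2 dB at $10^{-3}$, which suggests that the 17.2 dB figure corresponds to the lower error rate. This is an observation from the asymptotic formulas rather than a statement from the original text.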

TABLE 12.2 BER Performance for Coherent BPSK and FSK with Diversity

Modulation       Post Diversity BER
Coherent BPSK    $\left(\frac{1}{4\,\bar{E}_b/N_0}\right)^{L} \binom{2L-1}{L}$
Coherent FSK     $\left(\frac{1}{2\,\bar{E}_b/N_0}\right)^{L} \binom{2L-1}{L}$

12.5 Concluding Remarks

Diversity provides a powerful technique for combating fading in mobile communication systems. Diversity techniques seek to generate and exploit multiple branches over which the signal shows low fade correlation. To obtain the best diversity performance, the multiple access, modulation, coding and antenna design of the wireless link must all be carefully chosen so as to provide a rich and reliable level of well-balanced, low-correlation diversity branches in the target propagation environment. Successful diversity exploitation can impact a mobile network in several ways. Reduced power requirements can result in increased coverage or improved battery life. Low signal outage improves voice quality and handoff performance. Finally, reduced fade margins directly translate to better reuse factors and, hence, increased system capacity.

Defining Terms

Automatic request for repeat: An error control mechanism in which received packets that cannot be corrected are retransmitted.
Channel coding/Forward error correction: A technique that inserts redundant bits during transmission to help detect and correct bit errors during reception.
Fading: Fluctuation in the signal level due to shadowing and multipath effects.
Frequency hopping: A technique where the signal bursts are transmitted at different frequencies separated by random spacings that are multiples of the signal bandwidth.
Interleaving: A form of data scrambling that spreads bursts of bit errors evenly over the received data, allowing efficient forward error correction.
Outage probability: The probability that the signal level falls below a specified minimum level.
PCS: Personal Communications Services.
RAKE receiver: A receiver used with direct sequence spread spectrum signals. The receiver extracts the energy in each path and then adds the paths together with appropriate weighting and delay.

References

[1] Adachi, F., Feeney, M.T., Williamson, A.G., and Parsons, J.D., Crosscorrelation between the envelopes of 900 MHz signals received at a mobile radio base station site. Proc. IEE, 133(6), 506–512, 1986.
[2] Freeburg, T.A., Enabling technologies for in-building network communications: four technical challenges and four solutions. IEEE Trans. Veh. Tech., 29(4), 58–64, 1991.
[3] Jakes, W.C., Microwave Mobile Communications, John Wiley & Sons, New York, 1974.
[4] Jefford, P.A., Turkmani, A.M.D., Arowojulu, A.A., and Kellet, C.J., An experimental evaluation of the performance of the two branch space and polarization schemes at 1800 MHz. IEEE Trans. Veh. Tech., VT-44(2), 318–326, 1995.
[5] Lee, W.C.Y., Mobile Communications Engineering, McGraw-Hill, New York, 1982.
[6] Pahlavan, K. and Levesque, A.H., Wireless Information Networks, John Wiley & Sons, New York, 1995.
[7] Proakis, J.G., Digital Communications, McGraw-Hill, New York, 1989.
[8] Vaughan, R.G., Polarization diversity system in mobile communications. IEEE Trans. Veh. Tech., VT-39(3), 177–186, 1990.
[9] Viterbi, A.J., CDMA: Principle of Spread Spectrum Communications, Addison-Wesley, Reading, MA, 1995.

Sklar, B. “Digital Communication System Performance” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999

Digital Communication System Performance¹

Bernard Sklar
Communications Engineering Services

13.1 Introduction
The Channel • The Link
13.2 Bandwidth and Power Considerations
The Bandwidth Efficiency Plane • M-ary Signalling • Bandwidth-Limited Systems • Power-Limited Systems • Minimum Bandwidth Requirements for MPSK and MFSK Signalling
13.3 Example 1: Bandwidth-Limited Uncoded System
Solution to Example 1
13.4 Example 2: Power-Limited Uncoded System
Solution to Example 2
13.5 Example 3: Bandwidth-Limited Power-Limited Coded System
Solution to Example 3 • Calculating Coding Gain
13.6 Example 4: Direct-Sequence Spread-Spectrum Coded System
Processing Gain • Channel Parameters for Example 13.4 • Solution to Example 13.4
13.7 Conclusion
Appendix: Received Eb/N0 Is Independent of the Code Parameters
References
Further Information

13.1 Introduction

In this section we examine some fundamental tradeoffs among bandwidth, power, and error performance of digital communication systems. The criteria for choosing modulation and coding schemes, based on whether a system is bandwidth limited or power limited, are reviewed for several system examples. Emphasis is placed on the subtle but straightforward relationships we encounter when transforming from data-bits to channel-bits to symbols to chips.

1 A version of this chapter has appeared as a paper in the IEEE Communications Magazine, November 1993, under the title “Defining, Designing, and Evaluating Digital Communication Systems.”

The design or definition of any digital communication system begins with a description of the communication link. The link is the name given to the communication transmission path from the modulator and transmitter, through the channel, and up to and including the receiver and demodulator. The channel is the name given to the propagating medium between the transmitter and receiver. A link description quantifies the average signal power that is received, the available bandwidth, the noise statistics, and other impairments, such as fading. Also needed to define the system are basic requirements, such as the data rate to be supported and the error performance.

13.1.1

The Channel

For radio communications, the concept of free space assumes a channel region free of all objects that might affect radio frequency (RF) propagation by absorption, reflection, or refraction. It further assumes that the atmosphere in the channel is perfectly uniform and nonabsorbing, and that the earth is infinitely far away or its reflection coefficient is negligible. The RF energy arriving at the receiver is assumed to be a function of distance from the transmitter (simply following the inverse-square law of optics). In practice, of course, propagation in the atmosphere and near the ground results in refraction, reflection, and absorption, which modify the free space transmission.

13.1.2

The Link

A radio transmitter is characterized by its average output signal power Pt and the gain of its transmitting antenna Gt. The name given to the product PtGt, with reference to an isotropic antenna, is effective isotropic radiated power (EIRP) in watts (or dBW). The predetection average signal power S arriving at the output of the receiver antenna can be described as a function of the EIRP, the gain of the receiving antenna Gr, the path loss (or space loss) Ls, and other losses, Lo, as follows [14, 15]:

$$S = \frac{\mathrm{EIRP}\; G_r}{L_s L_o} \tag{13.1}$$

The path loss Ls can be written as follows [15]:

$$L_s = \left(\frac{4\pi d}{\lambda}\right)^2 \tag{13.2}$$

where d is the distance between the transmitter and receiver and λ is the wavelength. We restrict our discussion to those links distorted by the mechanism of additive white Gaussian noise (AWGN) only. Such a noise assumption is a very useful model for a large class of communication systems. A valid approximation for the average received noise power N that this model introduces is written as follows [5, 9]:

$$N \cong k T^{\circ} W \tag{13.3}$$

where k is Boltzmann’s constant (1.38 × 10^-23 joule/K), T° is effective temperature in kelvin, and W is bandwidth in hertz. Dividing Eq. (13.3) by bandwidth enables us to write the received noise-power spectral density N0 as follows:

$$N_0 = \frac{N}{W} = k T^{\circ} \tag{13.4}$$

Dividing Eq. (13.1) by N0 yields the received average signal-power to noise-power spectral density S/N0 as

$$\frac{S}{N_0} = \frac{\mathrm{EIRP}\; G_r/T^{\circ}}{k L_s L_o} \tag{13.5}$$

where Gr/T° is often referred to as the receiver figure of merit. A link budget analysis is a compilation of the power gains and losses throughout the link; it is generally computed in decibels, and thus takes on the bookkeeping appearance of a business enterprise, highlighting the assets and liabilities of the link. Once the value of S/N0 is specified or calculated from the link parameters, we then shift our attention to optimizing the choice of signalling types for meeting system bandwidth and error performance requirements. Given the received S/N0, we can write the received bit-energy to noise-power spectral density Eb/N0, for any desired data rate R, as follows:

$$\frac{E_b}{N_0} = \frac{S T_b}{N_0} = \frac{S}{N_0}\left(\frac{1}{R}\right) \tag{13.6}$$

Equation (13.6) follows from the basic definitions that received bit energy is equal to received average signal power times the bit duration and that bit rate is the reciprocal of bit duration. Received Eb/N0 is a key parameter in defining a digital communication system. Its value indicates the apportionment of the received waveform energy among the bits that the waveform represents. At first glance, one might think that a system specification should entail the symbol-energy to noise-power spectral density Es/N0 associated with the arriving waveforms. We will show, however, that for a given S/N0 the value of Es/N0 is a function of the modulation and coding. The reason for defining systems in terms of Eb/N0 stems from the fact that Eb/N0 depends only on S/N0 and R and is unaffected by any system design choices, such as modulation and coding.
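Because link budgets are normally carried out in decibels, Eqs. (13.2), (13.5), and (13.6) reduce to sums and differences. The following Python sketch illustrates that bookkeeping. The function names and the numerical link values in the usage section (EIRP, Gr/T°, distance, wavelength, other losses) are illustrative assumptions and are not taken from this chapter.

```python
import math

# Boltzmann's constant expressed in decibels: 10*log10(1.38e-23), about -228.6
BOLTZMANN_DB = 10 * math.log10(1.38e-23)

def path_loss_db(distance_m, wavelength_m):
    """Eq. (13.2) in decibels: Ls = (4*pi*d/lambda)^2."""
    return 20 * math.log10(4 * math.pi * distance_m / wavelength_m)

def s_over_n0_dbhz(eirp_dbw, g_over_t_db, ls_db, lo_db):
    """Eq. (13.5) in decibels: S/N0 = EIRP (Gr/T) / (k Ls Lo)."""
    return eirp_dbw + g_over_t_db - BOLTZMANN_DB - ls_db - lo_db

def eb_over_n0_db(s_n0_dbhz, bit_rate):
    """Eq. (13.6) in decibels: Eb/N0 = (S/N0) / R."""
    return s_n0_dbhz - 10 * math.log10(bit_rate)

if __name__ == "__main__":
    # Hypothetical link parameters, chosen only to exercise the functions.
    ls = path_loss_db(distance_m=10e3, wavelength_m=0.3)
    s_n0 = s_over_n0_dbhz(eirp_dbw=0.0, g_over_t_db=-30.0, ls_db=ls, lo_db=3.0)
    print(f"Ls = {ls:.1f} dB, S/N0 = {s_n0:.1f} dB-Hz, "
          f"Eb/N0 at 9600 b/s = {eb_over_n0_db(s_n0, 9600):.1f} dB")
```

Keeping every quantity in decibels mirrors the assets-and-liabilities view of the link budget: gains are added and losses are subtracted.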

13.2

Bandwidth and Power Considerations

Two primary communications resources are the received power and the available transmission bandwidth. In many communication systems, one of these resources may be more precious than the other and, hence, most systems can be classified as either bandwidth limited or power limited. In bandwidth-limited systems, spectrally efficient modulation techniques can be used to save bandwidth at the expense of power; in power-limited systems, power efficient modulation techniques can be used to save power at the expense of bandwidth. In both bandwidth- and power-limited systems, error-correction coding (often called channel coding) can be used to save power or to improve error performance at the expense of bandwidth. Recently, trellis-coded modulation (TCM) schemes have been used to improve the error performance of bandwidth-limited channels without any increase in bandwidth [17], but these methods are beyond the scope of this chapter.

13.2.1

The Bandwidth Efficiency Plane

Figure 13.1 shows the abscissa as the ratio of bit-energy to noise-power spectral density Eb/N0 (in decibels) and the ordinate as the ratio of throughput, R (in bits per second), that can be transmitted per hertz in a given bandwidth W. The ratio R/W is called bandwidth efficiency, since it reflects how efficiently the bandwidth resource is utilized. The plot stems from the Shannon–Hartley capacity theorem [12, 13, 15], which can be stated as

$$C = W \log_2\left(1 + \frac{S}{N}\right) \tag{13.7}$$

where S/N is the ratio of received average signal power to noise power. When the logarithm is taken to the base 2, the capacity C is given in bits per second. The capacity of a channel defines the maximum number of bits that can be reliably sent per second over the channel. For the case where the data (information) rate R is equal to C, the curve separates a region of practical communication systems from a region where such communication systems cannot operate reliably [12, 15].
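The R = C boundary can be traced numerically. Substituting S = Eb R (from Eq. (13.6)) and N = N0 W (from Eq. (13.4)) into Eq. (13.7) with R = C gives R/W = log2[1 + (Eb/N0)(R/W)], so that Eb/N0 = (2^(R/W) − 1)/(R/W). The Python sketch below evaluates this expression; the function name is an assumption made for illustration.

```python
import math

def shannon_min_eb_n0_db(bandwidth_efficiency):
    """Eb/N0 (dB) on the R = C boundary for a given R/W in (b/s)/Hz.

    Follows from Eq. (13.7) with S/N = (Eb/N0)(R/W) and R = C.
    """
    r = bandwidth_efficiency
    return 10 * math.log10((2.0 ** r - 1.0) / r)

if __name__ == "__main__":
    for r in (0.125, 0.5, 1, 2, 4, 8):
        print(f"R/W = {r:>5} (b/s)/Hz -> Eb/N0 >= {shannon_min_eb_n0_db(r):6.2f} dB")
```

As R/W becomes very small the boundary approaches 10 log10(ln 2), about −1.6 dB, which is the familiar Shannon limit.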

FIGURE 13.1: Bandwidth-efficiency plane.

13.2.2

M-ary Signalling

Each symbol in an M-ary alphabet can be related to a unique sequence of m bits, expressed as

$$M = 2^m \quad \text{or} \quad m = \log_2 M \tag{13.8}$$

where M is the size of the alphabet. In the case of digital transmission, the term symbol refers to the member of the M-ary alphabet that is transmitted during each symbol duration Ts. To transmit the symbol, it must be mapped onto an electrical voltage or current waveform. Because the waveform represents the symbol, the terms symbol and waveform are sometimes used interchangeably. Since one of M symbols or waveforms is transmitted during each symbol duration Ts, the data rate R in bits per second can be expressed as

$$R = \frac{m}{T_s} = \frac{\log_2 M}{T_s} \tag{13.9}$$

Data-bit-time duration is the reciprocal of data rate. Similarly, symbol-time duration is the reciprocal of symbol rate. Therefore, from Eq. (13.9), we write that the effective time duration Tb of each bit in terms of the symbol duration Ts or the symbol rate Rs is

$$T_b = \frac{1}{R} = \frac{T_s}{m} = \frac{1}{m R_s} \tag{13.10}$$

Then, using Eqs. (13.8) and (13.10) we can express the symbol rate Rs in terms of the bit rate R as follows:

$$R_s = \frac{R}{\log_2 M} \tag{13.11}$$

From Eqs. (13.9) and (13.10), any digital scheme that transmits m = log2 M bits in Ts seconds, using a bandwidth of W hertz, operates at a bandwidth efficiency of

$$\frac{R}{W} = \frac{\log_2 M}{W T_s} = \frac{1}{W T_b} \quad \text{(b/s)/Hz} \tag{13.12}$$

where Tb is the effective time duration of each data bit.

13.2.3

Bandwidth-Limited Systems

From Eq. (13.12), the smaller the WTb product, the more bandwidth efficient will be any digital communication system. Thus, signals with small WTb products are often used with bandwidth-limited systems. For example, the European digital mobile telephone system known as Global System for Mobile Communications (GSM) uses Gaussian minimum shift keying (GMSK) modulation having a WTb product equal to 0.3 Hz/(b/s), where W is the 3-dB bandwidth of a Gaussian filter [4]. For uncoded bandwidth-limited systems, the objective is to maximize the transmitted information rate within the allowable bandwidth, at the expense of Eb/N0 (while maintaining a specified value of bit-error probability PB). The operating points for coherent M-ary phase-shift keying (MPSK) at PB = 10^-5 are plotted on the bandwidth-efficiency plane of Fig. 13.1. We assume Nyquist (ideal rectangular) filtering at baseband [10]. Thus, for MPSK, the required double-sideband (DSB) bandwidth at an intermediate frequency (IF) is related to the symbol rate as follows:

$$W = \frac{1}{T_s} = R_s \tag{13.13}$$

where Ts is the symbol duration and Rs is the symbol rate. The use of Nyquist filtering results in the minimum required transmission bandwidth that yields zero intersymbol interference; such ideal filtering gives rise to the name Nyquist minimum bandwidth. From Eqs. (13.12) and (13.13), the bandwidth efficiency of MPSK modulated signals using Nyquist filtering can be expressed as

$$\frac{R}{W} = \log_2 M \quad \text{(b/s)/Hz} \tag{13.14}$$

The MPSK points in Fig. 13.1 confirm the relationship shown in Eq. (13.14). Note that MPSK modulation is a bandwidth-efficient scheme. As M increases in value, R/W also increases. MPSK modulation can be used for realizing an improvement in bandwidth efficiency at the cost of increased Eb/N0. Although beyond the scope of this chapter, many highly bandwidth-efficient modulation schemes are under investigation [1].

13.2.4

Power-Limited Systems

Operating points for noncoherent orthogonal M-ary FSK (MFSK) modulation at PB = 10^-5 are also plotted on Fig. 13.1. For MFSK, the IF minimum bandwidth is as follows [15]:

$$W = \frac{M}{T_s} = M R_s \tag{13.15}$$

where Ts is the symbol duration and Rs is the symbol rate. With MFSK, the required transmission bandwidth is expanded M-fold over that of binary PSK at the same symbol rate, since there are M different orthogonal waveforms, each requiring a bandwidth of 1/Ts. Thus, from Eqs. (13.12) and (13.15), the bandwidth efficiency of noncoherent orthogonal MFSK signals can be expressed as

$$\frac{R}{W} = \frac{\log_2 M}{M} \quad \text{(b/s)/Hz} \tag{13.16}$$

The MFSK points plotted in Fig. 13.1 confirm the relationship shown in Eq. (13.16). Note that MFSK modulation is a bandwidth-expansive scheme. As M increases, R/W decreases. MFSK modulation can be used for realizing a reduction in required Eb/N0 at the cost of increased bandwidth.

In Eqs. (13.13) and (13.14) for MPSK, and Eqs. (13.15) and (13.16) for MFSK, and for all the points plotted in Fig. 13.1, ideal filtering has been assumed. Such filters are not realizable! For realistic channels and waveforms, the required transmission bandwidth must be increased in order to account for realizable filters.

In the examples that follow, we will consider radio channels that are disturbed only by additive white Gaussian noise (AWGN) and have no other impairments, and for simplicity, we will limit the modulation choice to constant-envelope types, i.e., either MPSK or noncoherent orthogonal MFSK. For an uncoded system, MPSK is selected if the channel is bandwidth limited, and MFSK is selected if the channel is power limited. When error-correction coding is considered, modulation selection is not as simple, because coding techniques can provide power-bandwidth tradeoffs more effectively than would be possible through the use of any M-ary modulation scheme considered in this chapter [3].
In the most general sense, M-ary signalling can be regarded as a waveform-coding procedure, i.e., when we select an M-ary modulation technique instead of a binary one, we in effect have replaced the binary waveforms with better waveforms—either better for bandwidth performance (MPSK) or better for power performance (MFSK). Even though orthogonal MFSK signalling can be thought of as being a coded system, i.e., a first-order Reed-Muller code [8], we restrict our use of the term coded system to those traditional error-correction codes using redundancies, e.g., block codes or convolutional codes.

13.2.5

Minimum Bandwidth Requirements for MPSK and MFSK Signalling

The basic relationship between the symbol (or waveform) transmission rate Rs and the data rate R was shown in Eq. (13.11). Using this relationship together with Eqs. (13.13) through (13.16) and R = 9600 b/s, a summary of symbol rate, minimum bandwidth, and bandwidth efficiency for MPSK and noncoherent orthogonal MFSK was compiled for M = 2, 4, 8, 16, and 32 (Table 13.1). Values of Eb/N0 required to achieve a bit-error probability of 10^-5 for MPSK and MFSK are also given for each value of M. These entries (which were computed using relationships that are presented later in this chapter) corroborate the tradeoffs shown in Fig. 13.1. As M increases, MPSK signalling provides more bandwidth efficiency at the cost of increased Eb/N0, whereas MFSK signalling allows for a reduction in Eb/N0 at the cost of increased bandwidth.

TABLE 13.1 Symbol Rate, Minimum Bandwidth, Bandwidth Efficiency, and Required Eb/N0 for MPSK and Noncoherent Orthogonal MFSK Signalling at 9600 bit/s

                                  MPSK       MPSK    MPSK        MFSK       MFSK    MFSK
  M    m   R (b/s)  Rs (symb/s)   min. BW    R/W     Eb/N0 (dB)  min. BW    R/W     Eb/N0 (dB)
                                  (Hz)                           (Hz)
  2    1   9600     9600          9600       1        9.6        19,200     1/2     13.4
  4    2   9600     4800          4800       2        9.6        19,200     1/2     10.6
  8    3   9600     3200          3200       3       13.0        25,600     3/8      9.1
 16    4   9600     2400          2400       4       17.5        38,400     1/4      8.1
 32    5   9600     1920          1920       5       22.4        61,440     5/32     7.4

Note: Eb/N0 values are those required for PB = 10^-5; the MFSK columns refer to noncoherent orthogonal MFSK.
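The rate and bandwidth columns of Table 13.1 follow directly from Eqs. (13.11) and (13.13) through (13.16). A minimal Python sketch that regenerates them is given here; the function name and dictionary keys are illustrative choices, and the required Eb/N0 columns are not reproduced because they depend on error-probability relationships presented later.

```python
def mary_row(bit_rate, M):
    """Symbol rate, minimum bandwidth, and R/W for MPSK and noncoherent MFSK.

    M is assumed to be a power of two, so m = log2(M) is an integer.
    """
    m = M.bit_length() - 1          # m = log2(M) for M a power of two
    rs = bit_rate / m               # Eq. (13.11)
    return {
        "M": M,
        "m": m,
        "Rs": rs,
        "mpsk_bw": rs,              # Eq. (13.13): W = Rs
        "mpsk_eff": m,              # Eq. (13.14): R/W = log2(M)
        "mfsk_bw": M * rs,          # Eq. (13.15): W = M * Rs
        "mfsk_eff": m / M,          # Eq. (13.16): R/W = log2(M) / M
    }

if __name__ == "__main__":
    for M in (2, 4, 8, 16, 32):
        row = mary_row(9600, M)
        print(f"M={row['M']:2d}  Rs={row['Rs']:6.0f}  "
              f"MPSK W={row['mpsk_bw']:6.0f} Hz  MFSK W={row['mfsk_bw']:6.0f} Hz")
```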

13.3

Example 1: Bandwidth-Limited Uncoded System

Suppose we are given a bandwidth-limited AWGN radio channel with an available bandwidth of W = 4000 Hz. Also, suppose that the link constraints (transmitter power, antenna gains, path loss, etc.) result in the ratio of received average signal-power to noise-power spectral density S/N0 being equal to 53 dB-Hz. Let the required data rate R be equal to 9600 b/s, and let the required bit-error performance PB be at most 10^-5. The goal is to choose a modulation scheme that meets the required performance. In general, an error-correction coding scheme may be needed if none of the allowable modulation schemes can meet the requirements. In this example, however, we shall find that the use of error-correction coding is not necessary.

13.3.1

Solution to Example 1

For any digital communication system, the relationship between received S/N0 and received bit-energy to noise-power spectral density Eb/N0 was given in Eq. (13.6) and is briefly rewritten as

$$\frac{S}{N_0} = \frac{E_b}{N_0} R \tag{13.17}$$

Solving for Eb/N0 in decibels, we obtain

$$\frac{E_b}{N_0}\,(\mathrm{dB}) = \frac{S}{N_0}\,(\text{dB-Hz}) - R\,(\text{dB-b/s}) = 53\ \text{dB-Hz} - 10 \log_{10} 9600\ \text{dB-b/s} = 13.2\ \mathrm{dB}\ (\text{or } 20.89) \tag{13.18}$$

Since the required data rate of 9600 b/s is much larger than the available bandwidth of 4000 Hz, the channel is bandwidth limited. We therefore select MPSK as our modulation scheme. We have confined the possible modulation choices to be constant-envelope types; without such a restriction, we would be able to select a modulation type with greater bandwidth efficiency. To conserve power, we compute the smallest possible value of M such that the MPSK minimum bandwidth does not exceed the available bandwidth of 4000 Hz. Table 13.1 shows that the smallest value of M meeting this requirement is M = 8. Next we determine whether the required bit-error performance of PB ≤ 10^-5 can be met by using 8-PSK modulation alone or whether it is necessary to use an error-correction coding scheme. Table 13.1 shows that 8-PSK alone will meet the requirements, since the required Eb/N0 listed for 8-PSK is less than the received Eb/N0 derived in Eq. (13.18). Let us imagine that we do not have Table 13.1, however, and evaluate whether or not error-correction coding is necessary.

Figure 13.2 shows the basic modulator/demodulator (MODEM) block diagram summarizing the functional details of this design. At the modulator, the transformation from data bits to symbols yields an output symbol rate Rs that is a factor log2 M smaller than the input data-bit rate R, as is seen in Eq. (13.11). Similarly, at the input to the demodulator, the symbol-energy to noise-power spectral density Es/N0 is a factor log2 M larger than Eb/N0, since each symbol is made up of log2 M bits. Because Es/N0 is larger than Eb/N0 by the same factor that Rs is smaller than R, we can expand Eq. (13.17), as follows:

$$\frac{S}{N_0} = \frac{E_b}{N_0} R = \frac{E_s}{N_0} R_s \tag{13.19}$$

The demodulator receives a waveform (in this example, one of M = 8 possible phase shifts) during each time interval Ts. The probability that the demodulator makes a symbol error PE(M) is well approximated by the following equation for M > 2 [6]:

$$P_E(M) \cong 2Q\left[\sqrt{\frac{2E_s}{N_0}}\,\sin\left(\frac{\pi}{M}\right)\right] \tag{13.20}$$

where Q(x), sometimes called the complementary error function (strictly, Q(x) = ½ erfc(x/√2)), represents the probability under the tail of a zero-mean unit-variance Gaussian density function. It is defined as follows [18]:

$$Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} \exp\left(-\frac{u^2}{2}\right) du \tag{13.21}$$

A good approximation for Q(x), valid for x > 3, is given by the following equation [2]:

$$Q(x) \cong \frac{1}{x\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right) \tag{13.22}$$
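Both forms of Q(x) are easy to evaluate numerically. The Python sketch below computes Eq. (13.21) through the standard-library complementary error function, using the identity Q(x) = ½ erfc(x/√2), and compares it with the approximation of Eq. (13.22). The function names are illustrative.

```python
import math

def q_function(x):
    """Gaussian tail probability of Eq. (13.21), via Q(x) = 0.5*erfc(x/sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def q_approx(x):
    """Approximation of Eq. (13.22); intended for x > 3."""
    return math.exp(-x * x / 2.0) / (x * math.sqrt(2.0 * math.pi))

if __name__ == "__main__":
    for x in (3.0, 4.0, 5.0, 6.0):
        exact, approx = q_function(x), q_approx(x)
        print(f"x={x:.1f}  Q={exact:.3e}  approx={approx:.3e}  ratio={approx / exact:.3f}")
```

The approximation slightly overestimates Q(x), and the ratio moves toward 1 as x grows, which is why it is recommended only for larger arguments.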

In Fig. 13.2 and all of the figures that follow, rather than show explicit probability relationships, the generalized notation f(x) has been used to indicate some functional dependence on x. A traditional way of characterizing communication efficiency in digital systems is in terms of the received Eb/N0 in decibels. This Eb/N0 description has become standard practice, but recall that there are no bits at the input to the demodulator; there are only waveforms that have been assigned bit meanings. The received Eb/N0 represents a bit-apportionment of the arriving waveform energy. To solve for PE(M) in Eq. (13.20), we first need to compute the ratio of received symbol-energy to noise-power spectral density Es/N0. Since from Eq. (13.18) Eb/N0 = 13.2 dB (or 20.89), and because each symbol is made up of log2 M bits, we compute the following using M = 8:

$$\frac{E_s}{N_0} = (\log_2 M)\frac{E_b}{N_0} = 3 \times 20.89 = 62.67 \tag{13.23}$$

FIGURE 13.2: Basic modulator/demodulator (MODEM) without channel coding.

Using the results of Eq. (13.23) in Eq. (13.20) yields the symbol-error probability PE = 2.2 × 10^-5. To transform this to bit-error probability, we use the relationship between bit-error probability PB and symbol-error probability PE for multiple-phase signalling [8], valid for PE ≪ 1, as follows:

$$P_B \cong \frac{P_E}{\log_2 M} = \frac{P_E}{m} \tag{13.24}$$

which is a good approximation when Gray coding is used for the bit-to-symbol assignment [6]. This last computation yields PB = 7.3 × 10^-6, which meets the required bit-error performance. No error-correction coding is necessary, and 8-PSK modulation represents the design choice to meet the requirements of the bandwidth-limited channel, which we had predicted by examining the required Eb/N0 values in Table 13.1.
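The solution to Example 1 can be retraced with a short Python sketch that strings together Eqs. (13.18), (13.11), (13.13), (13.20), (13.23), and (13.24). The function names are illustrative, and Q(x) is evaluated exactly through `math.erfc` rather than through Eq. (13.22), so the printed probabilities land near, but not exactly on, the rounded values quoted in the text.

```python
import math

def q_function(x):
    """Gaussian tail probability, Q(x) = 0.5*erfc(x/sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def mpsk_symbol_error(es_n0, M):
    """Eq. (13.20): approximate MPSK symbol-error probability for M > 2."""
    return 2.0 * q_function(math.sqrt(2.0 * es_n0) * math.sin(math.pi / M))

def example1(s_n0_dbhz=53.0, bit_rate=9600.0, bandwidth=4000.0):
    """Return (M, Eb/N0 as a ratio, PE, PB) for the uncoded MPSK design."""
    eb_n0 = 10 ** (s_n0_dbhz / 10.0) / bit_rate           # Eq. (13.18), as a ratio
    # Smallest M whose Nyquist minimum bandwidth Rs = R/log2(M) fits the channel.
    M = next(M for M in (2, 4, 8, 16, 32)
             if bit_rate / math.log2(M) <= bandwidth)
    m = int(math.log2(M))
    pe = mpsk_symbol_error(m * eb_n0, M)                  # Eqs. (13.23) and (13.20)
    pb = pe / m                                           # Eq. (13.24), Gray coding
    return M, eb_n0, pe, pb

if __name__ == "__main__":
    M, eb_n0, pe, pb = example1()
    print(f"M = {M}, Eb/N0 = {10 * math.log10(eb_n0):.1f} dB, "
          f"PE = {pe:.1e}, PB = {pb:.1e}")
```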

13.4

Example 2: Power-Limited Uncoded System

Now, suppose that we have exactly the same data rate and bit-error probability requirements as in Example 1, but let the available bandwidth W be equal to 45 kHz, and the available S/N0 be equal to 48 dB-Hz. The goal is to choose a modulation or modulation/coding scheme that yields the required performance. We shall again find that error-correction coding is not required.

13.4.1

Solution to Example 2

The channel is clearly not bandwidth limited since the available bandwidth of 45 kHz is more than adequate for supporting the required data rate of 9600 bit/s. We find the received Eb/N0 from Eq. (13.18), as follows:

$$\frac{E_b}{N_0}\,(\mathrm{dB}) = 48\ \text{dB-Hz} - 10 \log_{10} 9600\ \text{dB-b/s} = 8.2\ \mathrm{dB}\ (\text{or } 6.61) \tag{13.25}$$

Since there is abundant bandwidth but a relatively small Eb/N0 for the required bit-error probability, we consider that this channel is power limited and choose MFSK as the modulation scheme. To conserve power, we search for the largest possible M such that the MFSK minimum bandwidth is not expanded beyond our available bandwidth of 45 kHz. A search results in the choice of M = 16 (Table 13.1). Next, we determine whether the required error performance of PB ≤ 10^-5 can be met by using 16-FSK alone, i.e., without error-correction coding. Table 13.1 shows that 16-FSK alone meets the requirements, since the required Eb/N0 listed for 16-FSK is less than the received Eb/N0 derived in Eq. (13.25). Let us imagine again that we do not have Table 13.1, and evaluate whether or not error-correction coding is necessary.

The block diagram in Fig. 13.2 summarizes the relationships between symbol rate Rs and bit rate R, and between Es/N0 and Eb/N0, which are identical to the respective relationships in Example 1. The 16-FSK demodulator receives a waveform (one of 16 possible frequencies) during each symbol time interval Ts. For noncoherent orthogonal MFSK, the probability that the demodulator makes a symbol error PE(M) is approximated by the following upper bound [20]:

$$P_E(M) \le \frac{M-1}{2} \exp\left(-\frac{E_s}{2N_0}\right) \tag{13.26}$$

To solve for PE(M) in Eq. (13.26), we compute Es/N0, as in Example 1. Using the results of Eq. (13.25) in Eq. (13.23), with M = 16, we get

$$\frac{E_s}{N_0} = (\log_2 M)\frac{E_b}{N_0} = 4 \times 6.61 = 26.44 \tag{13.27}$$

Next, using the results of Eq. (13.27) in Eq. (13.26) yields the symbol-error probability PE = 1.4 × 10^-5. To transform this to bit-error probability, PB, we use the relationship between PB and PE for orthogonal signalling [20], given by

$$P_B = \frac{2^{m-1}}{2^m - 1} P_E \tag{13.28}$$

This last computation yields PB = 7.3 × 10^-6, which meets the required bit-error performance. Thus, we can meet the given specifications for this power-limited channel by using 16-FSK modulation, without any need for error-correction coding, as we had predicted by examining the required Eb/N0 values in Table 13.1.
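The same bookkeeping applies to Example 2. The Python sketch below follows Eqs. (13.25) through (13.28) and the bandwidth rule of Eq. (13.15); the function names are illustrative. Because the sketch carries Eb/N0 at full precision instead of the rounded ratio 6.61, its output differs slightly from the rounded figures in the text while leading to the same conclusion.

```python
import math

def mfsk_symbol_error_bound(es_n0, M):
    """Eq. (13.26): upper bound on noncoherent orthogonal MFSK symbol error."""
    return (M - 1) / 2.0 * math.exp(-es_n0 / 2.0)

def orthogonal_bit_error(pe, m):
    """Eq. (13.28): bit-error probability from symbol-error probability."""
    return 2 ** (m - 1) / (2 ** m - 1) * pe

def example2(s_n0_dbhz=48.0, bit_rate=9600.0, bandwidth=45e3):
    """Return (M, Eb/N0 as a ratio, PE bound, PB) for the uncoded MFSK design."""
    eb_n0 = 10 ** (s_n0_dbhz / 10.0) / bit_rate           # Eq. (13.25), as a ratio
    # Largest M whose minimum bandwidth M*Rs = M*R/log2(M) still fits the channel.
    M = max(M for M in (2, 4, 8, 16, 32)
            if M * bit_rate / math.log2(M) <= bandwidth)
    m = int(math.log2(M))
    pe = mfsk_symbol_error_bound(m * eb_n0, M)            # Eqs. (13.27) and (13.26)
    pb = orthogonal_bit_error(pe, m)                      # Eq. (13.28)
    return M, eb_n0, pe, pb

if __name__ == "__main__":
    M, eb_n0, pe, pb = example2()
    print(f"M = {M}, Eb/N0 = {10 * math.log10(eb_n0):.1f} dB, "
          f"PE <= {pe:.1e}, PB = {pb:.1e}")
```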

13.5

Example 3: Bandwidth-Limited and Power-Limited Coded System

We start with the same channel parameters as in Example 1 (W = 4000 Hz, S/N0 = 53 dB-Hz, and R = 9600 b/s), with one exception. In this example, we specify that PB must be at most 10^-9. Table 13.1 shows that the system is both bandwidth limited and power limited, based on the available bandwidth of 4000 Hz and the available Eb/N0 of 13.2 dB, from Eq. (13.18). 8-PSK is the only possible choice to meet the bandwidth constraint; however, the available Eb/N0 of 13.2 dB is certainly insufficient to meet the required PB of 10^-9. For this small value of PB, we need to consider the performance improvement that error-correction coding can provide within the available bandwidth. In general, one can use convolutional codes or block codes. The Bose–Chaudhuri–Hocquenghem (BCH) codes form a large class of powerful error-correcting cyclic (block) codes [7]. To simplify the explanation, we shall choose a block code from the BCH family. Table 13.2 presents a partial catalog of the available BCH codes in terms of n, k, and t, where k represents the number of information (or data) bits that the code transforms into a longer block of n coded bits (or channel bits), and t represents the largest number of incorrect channel bits that the code can correct within each n-sized block. The rate of a code is defined as the ratio k/n; its inverse represents a measure of the code’s redundancy [7].

TABLE 13.2 BCH Codes (Partial Catalog)

   n      k      t
   7      4      1
  15     11      1
  15      7      2
  15      5      3
  31     26      1
  31     21      2
  31     16      3
  31     11      5
  63     57      1
  63     51      2
  63     45      3
  63     39      4
  63     36      5
  63     30      6
 127    120      1
 127    113      2
 127    106      3
 127     99      4
 127     92      5
 127     85      6
 127     78      7
 127     71      9
 127     64     10

13.5.1

Solution to Example 3

Since this example has the same bandwidth-limited parameters given in Example 1, we start with the same 8-PSK modulation used to meet the stated bandwidth constraint. We now employ error-correction coding, however, so that the bit-error probability can be lowered to PB ≤ 10^-9. To make the optimum code selection from Table 13.2, we are guided by the following goals.

1. The output bit-error probability of the combined modulation/coding system must meet the system error requirement.
2. The rate of the code must not expand the required transmission bandwidth beyond the available channel bandwidth.
3. The code should be as simple as possible. Generally, the shorter the code, the simpler will be its implementation.

The uncoded 8-PSK minimum bandwidth requirement is 3200 Hz (Table 13.1) and the allowable channel bandwidth is 4000 Hz, and so the uncoded signal bandwidth can be increased by no more than a factor of 1.25 (i.e., an expansion of 25%). The very first step in this (simplified) code selection example is to eliminate the candidates in Table 13.2 that would expand the bandwidth by more than 25%. The remaining entries form a much reduced set of bandwidth-compatible codes (Table 13.3).

TABLE 13.3 Bandwidth-Compatible BCH Codes

   n      k      t     Coding Gain, G (dB), MPSK, PB = 10^-9
  31     26      1     2.0
  63     57      1     2.2
  63     51      2     3.1
 127    120      1     2.2
 127    113      2     3.3
 127    106      3     3.9

In Table 13.3, a column designated Coding Gain G (for MPSK at PB = 10^-9) has been added. Coding gain in decibels is defined as follows:

$$G\,(\mathrm{dB}) = \left(\frac{E_b}{N_0}\right)_{\text{uncoded}}(\mathrm{dB}) - \left(\frac{E_b}{N_0}\right)_{\text{coded}}(\mathrm{dB}) \tag{13.29}$$

G can be described as the reduction in the required Eb/N0 (in decibels) that results from the error-performance properties of the channel coding. G is a function of the modulation type and bit-error probability, and it has been computed for MPSK at PB = 10^-9 (Table 13.3). For MPSK modulation, G is relatively independent of the value of M. Thus, for a particular bit-error probability, a given code will provide about the same coding gain when used with any of the MPSK modulation schemes. Coding gains were calculated using a procedure outlined in the subsequent Calculating Coding Gain section.

A block diagram summarizes this system, which contains both modulation and coding (Fig. 13.3). The introduction of encoder/decoder blocks brings about additional transformations. The relationships that exist when transforming from R b/s to Rc channel-b/s to Rs symbol/s are shown at the encoder/modulator. Regarding the channel-bit rate Rc, some authors prefer to use the units of channel-symbol/s (or code-symbol/s). The benefit is that error-correction coding is often described more efficiently with nonbinary digits. We reserve the term symbol for that group of bits mapped onto an electrical waveform for transmission, and we designate the units of Rc to be channel-b/s (or coded-b/s).
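The bandwidth screening that reduces Table 13.2 to Table 13.3 amounts to keeping the codes whose bandwidth expansion n/k does not exceed the ratio of available to uncoded bandwidth. A minimal Python sketch follows; the list and function names are illustrative, and the catalog entries are copied from Table 13.2.

```python
# (n, k, t) entries copied from Table 13.2.
BCH_CODES = [
    (7, 4, 1),
    (15, 11, 1), (15, 7, 2), (15, 5, 3),
    (31, 26, 1), (31, 21, 2), (31, 16, 3), (31, 11, 5),
    (63, 57, 1), (63, 51, 2), (63, 45, 3), (63, 39, 4), (63, 36, 5), (63, 30, 6),
    (127, 120, 1), (127, 113, 2), (127, 106, 3), (127, 99, 4), (127, 92, 5),
    (127, 85, 6), (127, 78, 7), (127, 71, 9), (127, 64, 10),
]

def bandwidth_compatible(codes, uncoded_bw, available_bw):
    """Keep codes whose expansion factor n/k fits within the available bandwidth."""
    max_expansion = available_bw / uncoded_bw
    return [(n, k, t) for (n, k, t) in codes if n / k <= max_expansion]

if __name__ == "__main__":
    # 8-PSK needs 3200 Hz uncoded; the channel offers 4000 Hz (factor 1.25).
    for n, k, t in bandwidth_compatible(BCH_CODES, 3200.0, 4000.0):
        print(f"({n}, {k})  t = {t}  expansion = {n / k:.3f}")
```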

FIGURE 13.3: MODEM with channel coding.

We assume that our communication system cannot tolerate any message delay, so that the channel-bit rate Rc must exceed the data-bit rate R by the factor n/k. Further, each symbol is made up of log2 M channel bits, and so the symbol rate Rs is less than Rc by the factor log2 M. For a system containing both modulation and coding, we summarize the rate transformations as follows:

$$R_c = \left(\frac{n}{k}\right) R \tag{13.30}$$

$$R_s = \frac{R_c}{\log_2 M} \tag{13.31}$$

At the demodulator/decoder in Fig. 13.3, the transformations among data-bit energy, channel-bit energy, and symbol energy are related (in a reciprocal fashion) by the same factors as shown among the rate transformations in Eqs. (13.30) and (13.31). Since the encoding transformation has replaced k data bits with n channel bits, the ratio of channel-bit energy to noise-power spectral density Ec/N0 is computed by decrementing the value of Eb/N0 by the factor k/n. Also, since each transmission symbol is made up of log2 M channel bits, Es/N0, which is needed in Eq. (13.20) to solve for PE, is computed by incrementing Ec/N0 by the factor log2 M. For a system containing both modulation and coding, we summarize the energy to noise-power spectral density transformations as follows:

$$\frac{E_c}{N_0} = \left(\frac{k}{n}\right)\frac{E_b}{N_0} \tag{13.32}$$

$$\frac{E_s}{N_0} = (\log_2 M)\frac{E_c}{N_0} \tag{13.33}$$

Using Eqs. (13.30) and (13.31), we can now expand the expression for S/N0 in Eq. (13.19), as follows (Appendix):

$$\frac{S}{N_0} = \frac{E_b}{N_0} R = \frac{E_c}{N_0} R_c = \frac{E_s}{N_0} R_s \tag{13.34}$$

As before, a standard way of describing the link is in terms of the received Eb/N0 in decibels. However, there are no data bits at the input to the demodulator, and there are no channel bits; there are only waveforms that have bit meanings and, thus, the waveforms can be described in terms of bit-energy apportionments. Since S/N0 and R were given as 53 dB-Hz and 9600 b/s, respectively, we find as before, from Eq. (13.18), that the received Eb/N0 = 13.2 dB. The received Eb/N0 is fixed and independent of n, k, and t (Appendix).
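The reciprocal behavior of the rate and energy transformations in Eqs. (13.30) through (13.34) can be checked numerically. The Python sketch below uses the (63, 51) entry of Table 13.3 with 8-PSK purely as an illustrative input; the function names are assumptions.

```python
import math

def coded_rates(bit_rate, n, k, M):
    """Eqs. (13.30) and (13.31): channel-bit rate Rc and symbol rate Rs."""
    rc = bit_rate * n / k
    rs = rc / math.log2(M)
    return rc, rs

def coded_energies(eb_n0, n, k, M):
    """Eqs. (13.32) and (13.33): Ec/N0 and Es/N0 as ratios (not decibels)."""
    ec_n0 = eb_n0 * k / n
    es_n0 = ec_n0 * math.log2(M)
    return ec_n0, es_n0

if __name__ == "__main__":
    R, eb_n0 = 9600.0, 20.89
    rc, rs = coded_rates(R, n=63, k=51, M=8)
    ec_n0, es_n0 = coded_energies(eb_n0, n=63, k=51, M=8)
    print(f"Rc = {rc:.0f} channel-b/s, Rs = {rs:.0f} symbol/s")
    # Eq. (13.34): all three products equal S/N0.
    print(f"S/N0 = {eb_n0 * R:.0f} = {ec_n0 * rc:.0f} = {es_n0 * rs:.0f} Hz")
```

Whatever code parameters are supplied, the three products agree, which is the numerical counterpart of the statement that the received Eb/N0 is independent of n, k, and t.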
As we search, in Table 13.3 for the ideal code to meet the specifications, we can iteratively repeat the computations suggested in Fig. 13.3. It might be useful to program on a personal computer (or calculator) the following four steps as a function of n, k, and t. Step 1 starts by combining Eqs. (13.32) and (13.33), as follows. Step 1: Ec k Eb Es = log2 M = log2 M (13.35) N0 N0 n N0 Step 2:

"s PE (M) ∼ = 2Q

# π 2Es sin N0 M

(13.36)

which is the approximation for symbol-error probability PE rewritten from Eq. (13.20). At each symbol-time interval, the demodulator makes a symbol decision, but it delivers a channel-bit sequence representing that symbol to the decoder. When the channel-bit output of the demodulator is 1999 by CRC Press LLC

c

quantized to two levels, 1 and 0, the demodulator is said to make hard decisions. When the output is quantized to more than two levels, the demodulator is said to make soft decisions [15]. Throughout this paper, we shall assume hard-decision demodulation. Now that we have a decoder block in the system, we designate the channel-bit-error probability out of the demodulator and into the decoder as pc , and we reserve the notation PB for the bit-error probability out of the decoder. We rewrite Eq. (13.24) in terms of pc for PE 1 as follows. Step 3: PE PE pc ∼ (13.37) = = log2 M m relating the channel-bit-error probability to the symbol-error probability out of the demodulator, assuming Gray coding, as referenced in Eq. (13.24). For traditional channel-coding schemes and a given value of received S/N0 , the value of Es /N0 with coding will always be less than the value of Es /N0 without coding. Since the demodulator with coding receives less Es /N0 , it makes more errors! When coding is used, however, the system error-performance does not only depend on the performance of the demodulator, it also depends on the performance of the decoder. For error-performance improvement due to coding, the decoder must provide enough error correction to more than compensate for the poor performance of the demodulator. The final output decoded bit-error probability PB depends on the particular code, the decoder, and the channel-bit-error probability pc . It can be expressed by the following approximation [11]. Step 4: n n j 1 X ∼ PB = j pc (1 − pc )n−j (13.38) j n j =t+1

where $t$ is the largest number of channel bits that the code can correct within each block of $n$ bits. Using Eqs. (13.35–13.38) in the four steps, we can compute the decoded bit-error probability $P_B$ as a function of $n$, $k$, and $t$ for each of the codes listed in Table 13.3. The entry that meets the stated error requirement with the largest possible code rate and the smallest value of $n$ is the double-error correcting (63, 51) code. The computations are as follows.

Step 1:

$$\frac{E_s}{N_0} = 3 \left(\frac{51}{63}\right) 20.89 = 50.73$$

where $M = 8$, and the received $E_b/N_0 = 13.2$ dB (or 20.89).

Step 2:

$$P_E \cong 2Q\left[\sqrt{101.5} \times \sin\left(\frac{\pi}{8}\right)\right] = 2Q(3.86) = 1.2 \times 10^{-4}$$

Step 3:

$$p_c \cong \frac{1.2 \times 10^{-4}}{3} = 4 \times 10^{-5}$$

Step 4:

$$P_B \cong \frac{3}{63}\binom{63}{3}\left(4 \times 10^{-5}\right)^{3}\left(1 - 4 \times 10^{-5}\right)^{60} + \frac{4}{63}\binom{63}{4}\left(4 \times 10^{-5}\right)^{4}\left(1 - 4 \times 10^{-5}\right)^{59} + \cdots = 1.2 \times 10^{-10}$$

where the bit-error-correcting capability of the code is $t = 2$. For the computation of $P_B$ in step 4, we need only consider the first two terms in the summation of Eq. (13.38) since the other terms have a vanishingly small effect on the result. Now that we have selected the (63, 51) code, we can compute the values of channel-bit rate $R_c$ and symbol rate $R_s$ using Eqs. (13.30) and (13.31), with $M = 8$:

$$R_c = \left(\frac{n}{k}\right) R = \left(\frac{63}{51}\right) 9600 \approx 11{,}859 \text{ channel-b/s}$$

$$R_s = \frac{R_c}{\log_2 M} = \frac{11859}{3} = 3953 \text{ symbol/s}$$
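The four steps and the rate calculations above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original text: the function names are hypothetical, and `q_approx` assumes that the Q-function approximation of Eq. (13.22) is the usual large-argument form $Q(x) \approx e^{-x^2/2}/(x\sqrt{2\pi})$.

```python
import math


def q_approx(x: float) -> float:
    """Large-argument approximation of Q(x) (assumed form of Eq. 13.22)."""
    return math.exp(-x * x / 2.0) / (x * math.sqrt(2.0 * math.pi))


def decoded_ber(n: int, t: int, pc: float) -> float:
    """Approximate decoded bit-error probability, Eq. (13.38)."""
    total = sum(
        j * math.comb(n, j) * pc**j * (1.0 - pc) ** (n - j)
        for j in range(t + 1, n + 1)
    )
    return total / n


def coded_mpsk_link(eb_n0_db: float, n: int, k: int, t: int, M: int, R: float) -> dict:
    """Run steps 1 to 4 for a hard-decision (n, k) block code with MPSK."""
    m = math.log2(M)                      # bits per symbol
    eb_n0 = 10.0 ** (eb_n0_db / 10.0)     # dB -> ratio
    es_n0 = m * (k / n) * eb_n0           # Step 1: energy per coded symbol
    pe = 2.0 * q_approx(math.sqrt(2.0 * es_n0) * math.sin(math.pi / M))  # Step 2
    pc = pe / m                           # Step 3: Eq. (13.37), Gray coding
    pb = decoded_ber(n, t, pc)            # Step 4: Eq. (13.38)
    rc = (n / k) * R                      # channel-bit rate
    rs = rc / m                           # symbol rate
    return {"es_n0": es_n0, "pe": pe, "pc": pc, "pb": pb, "rc": rc, "rs": rs}


# Parameters of the (63, 51) example: 8-PSK, Eb/N0 = 13.2 dB, R = 9600 b/s.
result = coded_mpsk_link(eb_n0_db=13.2, n=63, k=51, t=2, M=8, R=9600.0)
for name, value in result.items():
    print(f"{name}: {value:.4g}")
```

The printed values should land close to the hand calculation; small differences in $P_B$ arise because the text rounds $p_c$ to $4 \times 10^{-5}$ before step 4.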

13.5.2 Calculating Coding Gain

Perhaps a more direct way of finding the simplest code that meets the specified error performance is to first compute how much coding gain $G$ is required in order to yield $P_B = 10^{-9}$ when using 8-PSK modulation alone; then, from Table 13.3, we can simply choose the code that provides this performance improvement. First, we find the uncoded $E_s/N_0$ that yields an error probability of $P_B = 10^{-9}$, by writing from Eqs. (13.24) and (13.36), the following:

$$P_B \cong \frac{P_E}{\log_2 M} \cong \frac{2Q\left[\sqrt{\dfrac{2E_s}{N_0}}\,\sin\left(\dfrac{\pi}{M}\right)\right]}{\log_2 M} = 10^{-9} \tag{13.39}$$

At this low value of bit-error probability, it is valid to use Eq. (13.22) to approximate $Q(x)$ in Eq. (13.39). By trial and error (on a programmable calculator), we find that the uncoded $E_s/N_0 = 120.67$ (20.8 dB), and since each symbol is made up of $\log_2 8 = 3$ bits, the required $(E_b/N_0)_{\text{uncoded}} = 120.67/3 = 40.22$ (16 dB). From the given parameters and Eq. (13.18), we know that the received $(E_b/N_0)_{\text{coded}} = 13.2$ dB. Using Eq. (13.29), the required coding gain to meet the bit-error performance of $P_B = 10^{-9}$ in decibels is

$$G\,(\text{dB}) = \left(\frac{E_b}{N_0}\right)_{\text{uncoded}}(\text{dB}) - \left(\frac{E_b}{N_0}\right)_{\text{coded}}(\text{dB}) = 16 - 13.2 = 2.8 \text{ dB}$$

To be precise, each of the $E_b/N_0$ values in the preceding computation must correspond to exactly the same value of bit-error probability (which they do not). They correspond to $P_B = 10^{-9}$ and $P_B = 1.2 \times 10^{-10}$, respectively. At these low probability values, however, even with such a discrepancy, this computation still provides a good approximation of the required coding gain. In searching Table 13.3 for the simplest code that will yield a coding gain of at least 2.8 dB, we see that the choice is the (63, 51) code, which corresponds to the same code choice that we made earlier.
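The trial-and-error search for the uncoded $E_s/N_0$ can be replaced by a simple bisection. The sketch below is illustrative only: the function names are hypothetical, and the Q-function approximation is assumed to be the large-argument form $e^{-x^2/2}/(x\sqrt{2\pi})$, which may differ in detail from Eq. (13.22).

```python
import math


def q_approx(x: float) -> float:
    """Large-argument approximation of Q(x) (assumed form of Eq. 13.22)."""
    return math.exp(-x * x / 2.0) / (x * math.sqrt(2.0 * math.pi))


def uncoded_mpsk_ber(es_n0: float, M: int) -> float:
    """Uncoded MPSK bit-error probability, Eq. (13.39), Gray coding assumed."""
    pe = 2.0 * q_approx(math.sqrt(2.0 * es_n0) * math.sin(math.pi / M))
    return pe / math.log2(M)


def solve_es_n0(target_ber: float, M: int, lo: float = 1.0, hi: float = 1000.0) -> float:
    """Bisection on Es/N0 (as a ratio); the BER decreases as Es/N0 grows."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if uncoded_mpsk_ber(mid, M) > target_ber:
            lo = mid   # too many errors: need more energy per symbol
        else:
            hi = mid
    return 0.5 * (lo + hi)


M = 8
es_n0 = solve_es_n0(1e-9, M)                      # uncoded Es/N0 (ratio)
eb_n0_uncoded_db = 10.0 * math.log10(es_n0 / math.log2(M))
eb_n0_coded_db = 13.2                             # received Eb/N0 from the example
gain_db = eb_n0_uncoded_db - eb_n0_coded_db       # required coding gain

print(f"uncoded Es/N0 = {es_n0:.1f} ({10.0 * math.log10(es_n0):.1f} dB)")
print(f"required coding gain = {gain_db:.2f} dB")
```

The bisection bracket of 1 to 1000 is an arbitrary choice that comfortably contains the solution for this example; a different target error rate or modulation order might need a wider bracket.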

13.6 Example 4: Direct-Sequence (DS) Spread-Spectrum Coded System

Spread-spectrum systems are not usually classified as being bandwidth- or power-limited. They are generally perceived to be power-limited systems, however, because the bandwidth occupancy of the information is much larger than the bandwidth that is intrinsically needed for the information transmission. In a direct-sequence spread-spectrum (DS/SS) system, spreading the signal bandwidth by some factor permits lowering the signal-power spectral density by the same factor (the total average signal power is the same as before spreading). The bandwidth spreading is typically accomplished by multiplying a relatively narrowband data signal by a wideband spreading signal. The spreading signal or spreading code is often referred to as a pseudorandom code or PN code.

13.6.1 Processing Gain

A typical DS/SS radio system is often described as a two-step BPSK modulation process. In the first step, the carrier wave is modulated by a bipolar data waveform having a value +1 or −1 during each data-bit duration; in the second step, the output of the first step is multiplied (modulated) by a bipolar PN-code waveform having a value +1 or −1 during each PN-code-bit duration. In reality, DS/SS systems are usually implemented by first multiplying the data waveform by the PN-code waveform and then making a single pass through a BPSK modulator. For this example, however, it is useful to characterize the modulation process in two separate steps—the outer modulator/demodulator for the data, and the inner modulator/demodulator for the PN code (Fig. 13.4).

FIGURE 13.4: Direct-sequence spread-spectrum MODEM with channel coding.

A spread-spectrum system is characterized by a processing gain $G_p$ that is defined in terms of the spread-spectrum bandwidth $W_{ss}$ and the data rate $R$ as follows [20]:

$$G_p = \frac{W_{ss}}{R} \tag{13.40}$$

For a DS/SS system, the PN-code bit has been given the name chip, and the spread-spectrum signal bandwidth can be shown to be about equal to the chip rate $R_{ch}$, so that

$$G_p = \frac{R_{ch}}{R} \tag{13.41}$$

Some authors define processing gain to be the ratio of the spread-spectrum bandwidth to the symbol rate. This definition separates the system performance that is due to bandwidth spreading from the performance that is due to error-correction coding. Since we ultimately want to relate all of the coding mechanisms to the information source, we shall conform to the most commonly accepted definition for processing gain, as expressed in Eqs. (13.40) and (13.41).

A spread-spectrum system can be used for interference rejection and for multiple access (allowing multiple users to access a communications resource simultaneously). The benefits of DS/SS signals are best achieved when the processing gain is very large; in other words, the chip rate of the spreading (or PN) code is much larger than the data rate. In such systems, the large value of $G_p$ allows the signalling chips to be transmitted at a power level well below that of the thermal noise. We will use a value of $G_p = 1000$. At the receiver, the despreading operation correlates the incoming signal with a synchronized copy of the PN code and, thus, accumulates the energy from multiple ($G_p$) chips to yield the energy per data bit.

The value of $G_p$ has a major influence on the performance of the spread-spectrum system application. We shall see, however, that the value of $G_p$ has no effect on the received $E_b/N_0$. In other words, spread-spectrum techniques offer no error-performance advantage over thermal noise. For DS/SS systems, there is no disadvantage either! Sometimes such spread-spectrum radio systems are employed only to enable the transmission of very small power-spectral densities and thus avoid the need for FCC licensing [16].

13.6.2 Channel Parameters for Example 13.4

Consider a DS/SS radio system that uses the same (63, 51) code as in the previous example. Instead of using MPSK for the data modulation, we shall use BPSK. Also, we shall use BPSK for modulating the PN-code chips. Let the received $S/N_0 = 48$ dB-Hz, the data rate $R = 9600$ b/s, and the required $P_B \le 10^{-6}$. For simplicity, assume that there are no bandwidth constraints. Our task is simply to determine whether or not the required error performance can be achieved using the given system architecture and design parameters. In evaluating the system, we will use the same type of transformations used in the previous examples.

13.6.3 Solution to Example 13.4

A typical DS/SS system can be implemented more simply than the one shown in Fig. 13.4. The data and the PN code would be combined at baseband, followed by a single pass through a BPSK modulator. We will, however, assume the existence of the individual blocks in Fig. 13.4 because they enhance our understanding of the transformation process. The relationships in transforming from data bits, to channel bits, to symbols, and to chips (Fig. 13.4) have the same pattern of subtle but straightforward transformations in rates and energies as previous relationships (Figs. 13.2 and 13.3). The values of $R_c$, $R_s$, and $R_{ch}$ can now be calculated immediately since the (63, 51) BCH code has already been selected. From Eq. (13.30) we write

$$R_c = \left(\frac{n}{k}\right) R = \left(\frac{63}{51}\right) 9600 \approx 11{,}859 \text{ channel-b/s}$$

Since the data modulation considered here is BPSK, then from Eq. (13.31) we write

$$R_s = R_c \approx 11{,}859 \text{ symbol/s}$$

and from Eq. (13.41), with an assumed value of $G_p = 1000$,

$$R_{ch} = G_p R = 1000 \times 9600 = 9.6 \times 10^{6} \text{ chip/s}$$

Since we have been given the same $S/N_0$ and the same data rate as in Example 2, we find the value of received $E_b/N_0$ from Eq. (13.25) to be 8.2 dB (or 6.61). At the demodulator, we can now expand the expression for $S/N_0$ in Eq. (13.34) and the Appendix as follows:

$$\frac{S}{N_0} = \frac{E_b}{N_0} R = \frac{E_c}{N_0} R_c = \frac{E_s}{N_0} R_s = \frac{E_{ch}}{N_0} R_{ch} \tag{13.42}$$

Corresponding to each transformed entity (data bit, channel bit, symbol, or chip) there is a change in rate and, similarly, a reciprocal change in energy-to-noise spectral density for that received entity. Equation (13.42) is valid for any such transformation when the rate and energy are modified in a reciprocal way. There is a kind of conservation of power (or energy) phenomenon that exists in the transformations. The total received average power (or total received energy per symbol duration) is fixed regardless of how it is computed, on the basis of data bits, channel bits, symbols, or chips. The ratio $E_{ch}/N_0$ is much lower in value than $E_b/N_0$. This can be seen from Eqs. (13.42) and (13.41), as follows:

$$\frac{E_{ch}}{N_0} = \frac{S}{N_0}\left(\frac{1}{R_{ch}}\right) = \frac{S}{N_0}\left(\frac{1}{G_p R}\right) = \left(\frac{1}{G_p}\right)\frac{E_b}{N_0} \tag{13.43}$$

But, even so, the despreading function (when properly synchronized) accumulates the energy contained in a quantity $G_p$ of the chips, yielding the same value $E_b/N_0 = 8.2$ dB, as was computed earlier from Eq. (13.25). Thus, the DS spreading transformation has no effect on the error performance of an AWGN channel [15], and the value of $G_p$ has no bearing on the value of $P_B$ in this example. From Eq. (13.43), we can compute, in decibels,

$$\frac{E_{ch}}{N_0}\,(\text{dB}) = \frac{E_b}{N_0}\,(\text{dB}) - G_p\,(\text{dB}) = 8.2 - 10 \times \log_{10} 1000 = -21.8 \text{ dB} \tag{13.44}$$

The chosen value of processing gain ($G_p = 1000$) enables the DS/SS system to operate at a value of chip energy well below the thermal noise, with the same error performance as without spreading. Since BPSK is the data modulation selected in this example, each message symbol therefore corresponds to a single channel bit, and we can write

$$\frac{E_s}{N_0} = \frac{E_c}{N_0} = \left(\frac{k}{n}\right)\frac{E_b}{N_0} = \left(\frac{51}{63}\right) \times 6.61 = 5.35 \tag{13.45}$$

where the received $E_b/N_0 = 8.2$ dB (or 6.61). Out of the BPSK data demodulator, the symbol-error probability $P_E$ (and the channel-bit error probability $p_c$) is computed as follows [15]:

$$p_c = P_E = Q\left(\sqrt{\frac{2E_c}{N_0}}\right) \tag{13.46}$$

Using the results of Eq. (13.45) in Eq. (13.46) yields

$$p_c = Q(3.27) = 5.8 \times 10^{-4}$$

Finally, using this value of $p_c$ in Eq. (13.38) for the (63, 51) double-error correcting code yields the output bit-error probability of $P_B = 3.6 \times 10^{-7}$. We can, therefore, verify that for the given architecture and design parameters of this example the system does, in fact, achieve the required error performance.
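The DS/SS calculation can be retraced numerically as a rough check. The following sketch is an illustration under stated assumptions rather than the author's procedure: variable and function names are hypothetical, and $Q(x)$ is evaluated with the large-argument approximation $e^{-x^2/2}/(x\sqrt{2\pi})$, which is assumed to match Eq. (13.22).

```python
import math


def q_approx(x: float) -> float:
    """Large-argument approximation of Q(x) (assumed form of Eq. 13.22)."""
    return math.exp(-x * x / 2.0) / (x * math.sqrt(2.0 * math.pi))


def decoded_ber(n: int, t: int, pc: float) -> float:
    """Approximate decoded bit-error probability, Eq. (13.38)."""
    total = sum(
        j * math.comb(n, j) * pc**j * (1.0 - pc) ** (n - j)
        for j in range(t + 1, n + 1)
    )
    return total / n


# Given parameters of the example.
s_n0_db = 48.0        # received S/N0 in dB-Hz
R = 9600.0            # data rate, b/s
n, k, t = 63, 51, 2   # (63, 51) double-error-correcting code
gp = 1000.0           # processing gain

# Rates: data bits -> channel bits -> symbols (BPSK) -> chips.
rc = (n / k) * R
rs = rc
rch = gp * R

# Energy-to-noise ratios from the chain S/N0 = (Eb/N0)R = ... = (Ech/N0)Rch.
s_n0 = 10.0 ** (s_n0_db / 10.0)
eb_n0 = s_n0 / R
ec_n0 = s_n0 / rc
ech_n0 = s_n0 / rch
eb_n0_db = 10.0 * math.log10(eb_n0)
ech_n0_db = 10.0 * math.log10(ech_n0)

# Error probabilities: Eq. (13.46), then Eq. (13.38).
pc = q_approx(math.sqrt(2.0 * ec_n0))
pb = decoded_ber(n, t, pc)

print(f"Eb/N0 = {eb_n0_db:.2f} dB, Ech/N0 = {ech_n0_db:.2f} dB")
print(f"pc = {pc:.2e}, PB = {pb:.2e}, requirement met: {pb <= 1e-6}")
```

Because the script carries $E_b/N_0$ at full precision instead of rounding to 8.2 dB, its $p_c$ and $P_B$ differ slightly from the figures quoted in the text, while the conclusion about meeting $P_B \le 10^{-6}$ is unchanged.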

13.7 Conclusion

The goal of this section has been to review fundamental relationships used in evaluating the performance of digital communication systems. First, we described the concept of a link and a channel and examined a radio system from its transmitting segment up through the output of the receiving antenna. We then examined the concept of bandwidth-limited and power-limited systems and how such conditions influence the system design when the choices are confined to MPSK and MFSK modulation. Most important, we focused on the definitions and computations involved in transforming from data bits to channel bits to symbols to chips. In general, most digital communication systems share these concepts; thus, understanding them should enable one to evaluate other such systems in a similar way.

Appendix: Received $E_b/N_0$ Is Independent of the Code Parameters

Starting with the basic concept that the received average signal power $S$ is equal to the received symbol or waveform energy, $E_s$, divided by the symbol-time duration, $T_s$ (or multiplied by the symbol rate, $R_s$), we write

$$\frac{S}{N_0} = \frac{E_s/T_s}{N_0} = \frac{E_s}{N_0} R_s \tag{A13.1}$$

where $N_0$ is noise-power spectral density. Using Eqs. (13.27) and (13.25), rewritten as

$$\frac{E_s}{N_0} = (\log_2 M)\frac{E_c}{N_0} \quad \text{and} \quad R_s = \frac{R_c}{\log_2 M}$$

let us make substitutions into Eq. (A13.1), which yields

$$\frac{S}{N_0} = \frac{E_c}{N_0} R_c \tag{A13.2}$$

Next, using Eqs. (13.26) and (13.24), rewritten as

$$\frac{E_c}{N_0} = \left(\frac{k}{n}\right)\frac{E_b}{N_0} \quad \text{and} \quad R_c = \left(\frac{n}{k}\right) R$$

let us now make substitutions into Eq. (A13.2), which yields the relationship expressed in Eq. (13.11)

$$\frac{S}{N_0} = \frac{E_b}{N_0} R \tag{A13.3}$$

Hence, the received $E_b/N_0$ is only a function of the received $S/N_0$ and the data rate $R$. It is independent of the code parameters, $n$, $k$, and $t$. These results are summarized in Fig. 13.3.
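The independence shown in Eqs. (A13.1) to (A13.3) can also be checked numerically. In the sketch below, the function name and the list of $(n, k)$ pairs and modulation orders are hypothetical choices made only for illustration; the $S/N_0$ and $R$ values are borrowed from Example 4.

```python
import math


def received_eb_n0(s_n0_db: float, R: float, n: int, k: int, M: int) -> float:
    """Walk the chain symbol -> channel bit -> data bit and return Eb/N0 (ratio)."""
    s_n0 = 10.0 ** (s_n0_db / 10.0)
    m = math.log2(M)
    rc = (n / k) * R          # channel-bit rate
    rs = rc / m               # symbol rate
    es_n0 = s_n0 / rs         # Eq. (A13.1)
    ec_n0 = es_n0 / m         # Es/N0 = (log2 M) Ec/N0
    return (n / k) * ec_n0    # Ec/N0 = (k/n) Eb/N0


s_n0_db, R = 48.0, 9600.0
expected = 10.0 ** (s_n0_db / 10.0) / R   # Eq. (A13.3): Eb/N0 = (S/N0) / R

# Illustrative (n, k) block codes and modulation orders.
cases = [(n, k, M) for (n, k) in [(31, 26), (63, 51), (127, 106)] for M in (2, 8, 16)]
values = [received_eb_n0(s_n0_db, R, n, k, M) for (n, k, M) in cases]

for (n, k, M), v in zip(cases, values):
    print(f"({n},{k}), M={M}: Eb/N0 = {10.0 * math.log10(v):.2f} dB")
```

Every case prints the same $E_b/N_0$ because the code rate and the bits-per-symbol factor cancel between the rate and energy transformations.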

References

[1] Anderson, J.B. and Sundberg, C.-E.W., Advances in constant envelope coded modulation, IEEE Commun. Mag., 29(12), 36–45, 1991.
[2] Borjesson, P.O. and Sundberg, C.E., Simple approximations of the error function Q(x) for communications applications, IEEE Trans. Comm., COM-27, 639–642, Mar. 1979.
[3] Clark Jr., G.C. and Cain, J.B., Error-Correction Coding for Digital Communications, Plenum Press, New York, 1981.
[4] Hodges, M.R.L., The GSM radio interface, British Telecom Technol. J., 8(1), 31–43, 1990.
[5] Johnson, J.B., Thermal agitation of electricity in conductors, Phys. Rev., 32, 97–109, Jul. 1928.
[6] Korn, I., Digital Communications, Van Nostrand Reinhold Co., New York, 1985.
[7] Lin, S. and Costello Jr., D.J., Error Control Coding: Fundamentals and Applications, Prentice-Hall, Englewood Cliffs, NJ, 1983.
[8] Lindsey, W.C. and Simon, M.K., Telecommunication Systems Engineering, Prentice-Hall, Englewood Cliffs, NJ, 1973.
[9] Nyquist, H., Thermal agitation of electric charge in conductors, Phys. Rev., 32, 110–113, Jul. 1928.
[10] Nyquist, H., Certain topics on telegraph transmission theory, Trans. AIEE, 47, 617–644, Apr. 1928.
[11] Odenwalder, J.P., Error Control Coding Handbook, Linkabit Corp., San Diego, CA, Jul. 15, 1976.
[12] Shannon, C.E., A mathematical theory of communication, BSTJ, 27, 379–423, 623–657, 1948.
[13] Shannon, C.E., Communication in the presence of noise, Proc. IRE, 37(1), 10–21, 1949.
[14] Sklar, B., What the system link budget tells the system engineer or how I learned to count in decibels, Proc. of the Intl. Telemetering Conf., San Diego, CA, Nov. 1979.
[15] Sklar, B., Digital Communications: Fundamentals and Applications, Prentice-Hall, Englewood Cliffs, NJ, 1988.
[16] Title 47, Code of Federal Regulations, Part 15 Radio Frequency Devices.
[17] Ungerboeck, G., Trellis-coded modulation with redundant signal sets, Pt. I and II, IEEE Comm. Mag., 25, 5–21, Feb. 1987.
[18] Van Trees, H.L., Detection, Estimation, and Modulation Theory, Pt. I, John Wiley & Sons, New York, 1968.
[19] Viterbi, A.J., Principles of Coherent Communication, McGraw-Hill, New York, 1966.
[20] Viterbi, A.J., Spread spectrum communications—myths and realities, IEEE Comm. Mag., 11– 18, May, 1979.

Further Information

A useful compilation of selected papers can be found in: Cellular Radio & Personal Communications: A Book of Selected Readings, edited by Theodore S. Rappaport, Institute of Electrical and Electronics Engineers, Inc., Piscataway, New Jersey, 1995. Fundamental design issues, such as propagation, modulation, channel coding, speech coding, multiple-accessing and networking, are well represented in this volume.

Another useful sourcebook that covers the fundamentals of mobile communications in great detail is: Mobile Radio Communications, edited by Raymond Steele, Pentech Press, London, 1992. This volume is also available through the Institute of Electrical and Electronics Engineers, Inc., Piscataway, New Jersey.

For spread spectrum systems, an excellent reference is: Spread Spectrum Communications Handbook, by Marvin K. Simon, Jim K. Omura, Robert A. Scholtz, and Barry K. Levitt, McGraw-Hill Inc., New York, 1994.


Dimolitsas, S. & Onufry, M. “Telecommunications Standardization” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Telecommunications Standardization

Spiros Dimolitsas
Lawrence Livermore National Laboratory

Michael Onufry
COMSAT Laboratories

14.1 Introduction
14.2 Global Standardization
    ITU-T • ITU-R • BDT • ISO/IEC JTC 1
14.3 Regional Standardization
    CITEL
14.4 National Standardization
    ANSI T1 • TIA • TTC
14.5 Intellectual Property
14.6 Standards Coordination
14.7 Scientific
14.8 Standards Development Cycle
Defining Terms
Further Information

14.1 Introduction

National economies are increasingly becoming information based, where networking and information transport provide a foundation for productivity and economic growth. Concurrently, many countries are rapidly adopting deregulation policies that are resulting in a telecommunications industry that is increasingly multicarrier and multivendor based, and where interconnectivity and compatibility between different networks is emerging as key to the success of this technological and regulatory transition. The communications industry has, consequently, become more interested in standardization; standards give manufacturers, service providers, and users freedom of choice at reasonable cost.

In this chapter, a review is provided of the primary telecommunications standards setting bodies. As will be seen, these bodies are often driven by slightly different underlying philosophies, but the output of their activities, i.e., the standards, possesses essentially the same characteristics. An all-encompassing review of standardization bodies is not attempted here; this would clearly take many volumes to describe. Furthermore, as country after country increasingly deregulates its telecommunication industry, new standards setting bodies emerge to fill the void left by the de facto (but no longer existing) standards setting bodies: the national telecommunications administrations.

The principal communications standards bodies that will be covered are the following: the International Telecommunication Union (ITU); the United States ANSI Committee T1 on Telecommunications; the Telecommunications Industry Association (TIA); the European Telecommunications Standards Institute (ETSI); the Inter-American Telecommunications Commission (CITEL); the Japanese Telecommunications Technology Committee (TTC); and the Institute of Electrical and Electronics Engineers (IEEE).

Not addressed explicitly are other standards setting bodies that are either national or regional in character, even though it is recognized that sometimes there is overlap in scope with the bodies explicitly covered here. Most notably, standards setting bodies that are not covered, but that are worth noting, include: the United States ANSI Committee X3; the International Organization for Standardization (ISO); the International Electrotechnical Commission (IEC) [except ISO/IEC joint technical committee (JTC) 1]; the Telecommunications Standards Advisory Council of Canada (TSACC); the Australian Telecommunications Standardization Committee (ATSC); the Telecommunication Technology Association (TTA) in Korea; and several forums (whose scope is, in principle, somewhat different) such as the asynchronous transfer mode (ATM) forum, the frame relay forum, the integrated services digital network (ISDN) users’ forum, and Telocator. As will be described later, many of these bodies operate in a coherent fashion through a mechanism developed by the Interregional Telecommunications Standards Conference (ITSC) and its successor, the Global Standards Collaboration (GSC).

14.2 Global Standardization

When it comes to setting global communications standards, the ITU comes to the forefront. The ITU is an intergovernmental organization, whereby each sovereign state that is a member of the United Nations may become a member of the ITU. Member governments (in most cases represented by their telecommunications administrations) are constitutional members with a right to vote. Other organizations, such as network and service providers, manufacturers, and scientific and industrial organizations also participate in ITU activities but with a lower legal status. ITU traces its history back to 1865 in the era of telegraphy. The supreme organ of the ITU is the plenipotentiary conference, which is held not less than every five years and plays a major role in the management of ITU. In 1993 the ITU as a U.N.-specialized agency was reorganized into three sectors (see Fig. 14.1): The telecommunications standardization sector (ITU-T), the radiocommunications sector (ITU-R), and the development sector (BDT). These sectors’ activities are, respectively, standardization of telecommunications, including radio communications; regulation of telecommunications (mainly for radio communications); and development of telecommunications. It should be noted that, in general, the ITU-T is the successor of the international telephone and telegraph consultative committee (CCITT) of the ITU with additional responsibilities for standardization of network-related radio communications. Similarly, the ITU-R is the successor of the international radio consultative committee (CCIR) and the international frequency registration bureau (IFRB) of the ITU (after transferring some of its standardization activities to the ITU-T). The BDT is a new sector, which became operational in 1989.

14.2.1 ITU-T

Within the ITU structure, standardization work is undertaken by a number of study groups (SG) dealing with specific areas of communications. There are currently 14 study groups, as shown in Table 14.1. Study groups develop standards for their respective work areas, which then have to be agreed upon by consensus, a process that for the time being is reserved to administrations only. The standards so developed are called recommendations to indicate their legally nonbinding nature. Technically, however, there is no distinction between recommendations developed by the ITU and standards developed by other standards setting bodies.

FIGURE 14.1: The ITU structure.

TABLE 14.1 ITU-T Study Group Structure

SG 2     Network and service operation. Lead SG on Service definition, Numbering, Routing and Global Mobility
SG 3     Tariff and accounting principles
SG 4     TMN and network maintenance. Lead SG on Telecommunication management network (TMN) studies
SG 5     Protection against electromagnetic environmental effects
SG 6     Outside plant
SG 7     Data networks and open systems communications. Lead SG on Open Distributed Processing (ODP), Frame Relay and for Communications System Security
SG 8     Characteristics of telematic services. Lead SG on Facsimile
SG 9     Television and sound transmission
SG 10    Languages and general software aspects for telecommunications systems
SG 11    Signalling requirements and protocols. Lead SG on Intelligent Network and IMT-2000
SG 12    End-to-end transmission performance of networks and terminals
SG 13    General network aspects. Lead SG on General network aspects, Global Information Infrastructures and Broadband ISDN
SG 15    Transport networks, systems and equipment. Lead SG on Access Network Transport
SG 16    Transmission systems and equipment. Lead SG on Multimedia services and systems

The study groups’ work is undertaken by delegation members, sent or sponsored by their national administrations, and delegates from recognized private operating organizations (RPOA), scientific and industrial organizations (SIO), or international organizations. Because an ITU-T study group can typically have from 100 to more than 500 participating members and deal with 20–50 project standards, the work of each study group is often divided among working parties (WP). Such working parties are usually split further into experts’ groups led by a chair or “rapporteur” with responsibility for leading the work defined in an approved active question or subelement of a question.

To coordinate standardization work that spans several study groups, two joint coordination groups (JCG) have also been established (not shown in Fig. 14.1): International Mobile Communications (IMT-2000) and Satellite Matters. Such groups do not have executive powers but are merely there to coordinate work of pervasive interest within the ITU-T sector.

Also part of the ITU-T structure is the telecommunications standardization bureau (TSB) or, as it was formerly called, the CCITT secretariat. The TSB is responsible for the organization of numerous meetings held by the sector each year as well as all other support services required to ensure the smooth and efficient operation of the sector (including, but not limited to, document production and distribution). The TSB is headed by a director, who holds the executive power and, in collaboration with the study groups, bears full responsibility for the ITU-T activities. In this structure, unlike other U.N. organizations, the secretary general is the legal representative of the ITU, with the executive powers being vested in the director.

Finally, the ITU-T is supported by an advisory group, i.e., the telecommunications standardization advisory group (TSAG), which together with interested ITU members, the ITU-T Director, and the ITU-T SG chairmen, guides standardization activities.

14.2.2 ITU-R

The radiocommunications sector emphasizes the regulatory and pure radio-interface aspects. The functional structure of the ITU-R currently includes eight study groups (shown in Table 14.2), a radiocommunications bureau, and an advisory board. The role of the latter two elements is very similar to that of their ITU-T counterparts and, thus, need not be repeated here.

TABLE 14.2 ITU-R Study Group Structure

SG 1     Spectrum management
SG 3     Radio wave propagation
SG 4     Fixed satellite service
SG 7     Science services
SG 8     Mobile, radio determination, amateur and related satellite services
SG 9     Fixed service
SG 10    Broadcasting services: sound
SG 11    Broadcasting services: television

As within the ITU-T, there are areas of pervasive interest, and there are also areas of common interest between the ITU-T and ITU-R where activities need to be coordinated. To achieve this objective, two intersector coordination groups (ICG) have been established (not shown in Fig. 14.1) dealing with international mobile telecommunications (IMT-2000) and satellite matters.

Three major special activities have been organized within ITU-R:

• IMT-2000 (formerly known as Future Public Land Mobile Telecommunications Systems, FPLMTS). The objective of the International Mobile Telecommunications (IMT-2000) activity is to provide seamless satellite and terrestrial operation of mobile terminals throughout the world (anywhere, anytime) where communication coverage requires interoperation of satellite and terrestrial networks. This is to be accomplished using technology available around the year 2000.
• Mobile-satellite and radionavigation-satellite service (MSS-RNSS). The rapid growth of service in these areas has created a need to focus attention on interference and spectrum allocation.
• Wireless Access Systems (WAS). This is an application of radio technology and personal communications systems directed toward lowering the installation and maintenance cost of the local access network. The traditional high cost has prevented penetration of basic telephone service in evolving and developing countries of the world. Overcoming this barrier is an objective of the BDT, described next.

14.2.3 BDT

Unlike the ITU-T (and to some extent ITU-R), which deals with standardization, the BDT deals with aspects that promote the integration and deployment of communications in developing countries. Typical outputs from this sector include implementation guides that expand the utility of ITU recommendations and ensure their expeditious implementation. Communications has been recognized as a necessary element for economic growth. The BDT also seeks to arrange special financing involving communication suppliers and governments or authorized carriers within developing countries to enable provision of basic communications service where otherwise it would not be possible.

14.2.4 ISO/IEC JTC 1

Two global organizations are active in the information processing systems area, the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), particularly through the Joint Technical Committee 1 (JTC 1). The ISO comprises national standards bodies, which have the responsibility for promoting and distributing ISO standards within their own countries. ISO technical work is carried out by some 200 technical committees (TC). Technical committees are established by the ISO council and their work program is approved by the technical board on behalf of the council.

The IEC comprises national committees (one from each country) and deals with almost all spheres of electrotechnology, including power, electronics, telecommunications, and nuclear energy. IEC technical work is performed by some 200 TCs set up by its council and some 700 working groups. Part of this organization, a President’s Advisory Committee on future technology (PACT), advises the IEC president on new technologies that require preliminary or immediate standardization work. PACT is designed to form a direct link with private and public research and development activities, keeping the IEC abreast of accelerating technological changes and the accompanying demand for new standards. Small industrial project teams examine new work initiatives which can be introduced into the regular IEC working structure.

In 1987 a joint technical committee was established incorporating ISO TC97, IEC TC83, and subcommittee 47B to deal with generic information technology. The international standards developed by JTC 1 are published under the ISO and IEC logos. The activities of ISO/IEC/JTC 1 are listed in Table 14.3, expressed in terms of its subcommittees (SC).

TABLE 14.3 ISO/IEC/JTC1 Subcommittees

SC 1     Vocabulary
SC 2     Coded character sets
SC 6     Telecommunications information exchange between systems
SC 7     Software engineering
SC 11    Flexible magnetic media for digital data interchange
SC 17    Identification cards and related devices
SC 22    Programming languages, their environments and systems software interfaces
SC 23    Optical disk cartridges for information interchange
SC 24    Computer graphics and image processing
SC 25    Interconnection information technology management
SC 26    Microprocessor systems
SC 27    IT security techniques
SC 28    Office equipment
SC 29    Coding of audio, picture, multimedia and hypermedia information
SC 31    Automatic data capture
SC 32    Data management services
SC 33    Distributed application services

The ISO and IEC jointly issue directions for the work of the technical committees. The scope (or area of activity) of each technical committee (TC)/subcommittee (SC) is defined by the TC/SC itself, and then submitted to the Committee of Action (CA)/parent TC for approval. The TCs/SCs prepare technical documents on specific subjects within their respective scopes, which are then submitted to the National Committees for voting with a view to their approval as international standards.

14.3 Regional Standardization

Today the ETSI comes closest to being a true regional standards setting body, together with CITEL, the regional (Latin-American) standardization body. ETSI is the result of the Single Act of the European Community and the EC commission green paper in 1987 that analyzed the consequences of the Single Act and recommended that a European telecommunications standards body be created to develop common standards for telecommunications equipment and networks. Out of this recommendation, the Committee for Harmonization (CCH) and the European Conference for Post and Telecommunications (CEPT) evolved into ETSI, which formally came into being in March 1988. It should be noted, however, that even though ETSI attributes at least part of its existence to the European Community, its membership is wider than just the European Union nations.

Because of the way ETSI came into being, it has a unique characteristic: it is often called upon by the European Commission to develop standards that are necessary to implement legislation. Such standards, which are referred to as technical bases for regulation (TBR) and whose application is usually mandatory, are often needed in public procurements, as well as in provisioning for open network interconnection as national telecommunications administrations are being deregulated. Like the ITU, however, ETSI also develops voluntary standards in accordance with common international understanding, to which industry is not obliged to conform. These standards fall into either the European technical standard (ETS) class, when fully approved, or the interim-ETS class, when not fully stable or proven. ETSI standards are typically sought when either the subject matter is not studied at the global level (such as when it may be required to support some piece of legislation), or the development of the standard is justified by market needs that exist in Europe and not in other parts of the world. In some cases, it may be necessary to adapt ITU standards for the European continent, although a simple endorsement of an ITU standard as a European standard is also possible. A more delicate case arises when both the ITU and ETSI are pursuing parallel standards activities, in which case close coordination with the ITU is sought, either through member countries that may input ETSI standards to the ITU for consideration or through the global standards collaboration process.

The highest authority of ETSI is the general assembly, which determines ETSI’s policy, appoints its director and deputy, adopts the budget, and approves the audited accounts. The more technical issues are addressed by the technical assembly, which approves technical standards, advises on the work to be undertaken, and sets priorities. The ETSI technical committees are listed in Table 14.4.

TABLE 14.4

ETSI Technical Committees

TCEE                    Environmental engineering
TCHF                    Human factors
TCMTS                   Methods for testing and specification
TCSEC                   Security
TCSPS                   Signalling protocols and switching
TCTM                    Transmission and multiplexing
TCERM                   EMC and radio spectrum matters
TCICC                   Integrated circuit cards
TCNA                    Network aspects
TCSES                   Satellite earth stations and systems
TCSTQ                   Speech processing, transmission and quality
TCTMN                   Telecommunications management networks
ECMA TC32               Communication, networks and systems interconnection
EBU/CENELEC/ETSI JTC    Joint technical committee

It can be seen that ETSI currently comprises 14 technical committees reporting to the technical assembly. These committees are responsible for the development of technical standards. They are also responsible for prestandardization activities, that is, activities that lead to ETSI technical reports (ETR), which eventually become the basis for future standards. In addition to the technical assembly, a strategic review committee (SRC) is responsible for prospective examination of a single technical domain, whereas an intellectual property rights committee defines ETSI’s policy in the area of intellectual property. Although by no means unique to ETSI, the rapid pace of technological progress has resulted in more standards being adopted that embrace technologies that are still under patent protection. This creates a fundamental conflict between the private, exclusive nature of industrial property rights and the open, public nature of standards. Harmonizing those conflicting claims has emerged as a thorny issue in all standards organizations; ETSI has established a formal function for this purpose. Finally, the ETSI/EBU technical committee coordinates activities with the European Broadcasting Union (EBU), whereas the ISDN committee is in charge of managing and coordinating the standardization process for narrowband ISDN.

14.3.1

CITEL

On June 11, 1993, the Organization of American States (OAS) General Assembly revised the existing Inter-American Telecommunication Commission (CITEL), strengthening and reorganizing the activities of CITEL, creating a position for the executive secretariat of CITEL, and opening the doors, as associate members, to enterprises, organisms, and private telecommunication organizations, to act as observers of the permanent consultative committees of CITEL and its working groups. CITEL’s objectives include facilitating and promoting the continuous development of telecommunications in the hemisphere. It serves as the principal advisory body of the OAS on matters related to telecommunications. The commission represents all the member states. It has a permanent executive committee consisting of 11 members, and three permanent consultative committees. The permanent consultative committees, whose members are all member states of the OAS, also have associate members that represent various private telecommunications agencies or companies. The general assembly of CITEL, through resolution CITEL Res.8(I-94), established the following specific mandates for the three permanent consultative committees and the steering committee.

Permanent Consultative Committee I: Public Telecommunication Services. To promote and watch over the integration and strengthening of networks and public telecommunication services operating in the countries of the Americas, taking into account the need for modernization of networks and promotion of universal basic telephone services, as well as for increasing the public availability of specialized services and the promotion of the use of international ITU standards and radio regulations.

Permanent Consultative Committee II: Broadcasting. To stimulate and encourage the regional presence of broadcasting services, promoting the use of modern technologies and improving the public availability of such communication media, including audio and video systems, and the promotion of the use of international ITU standards and radio regulations.

Permanent Consultative Committee III: Radiocommunications. To promote the harmonization of radiocommunication services, bearing especially in mind the need to reduce to a minimum those factors that may cause harmful interference in the performance and operation of networks and services. To promote the use of modern technologies and the application of the ITU radio regulations and standards.

Steering Committee. The Steering Committee shall be formed by the chairman and vice-chairman of COM/CITEL and the chairmen of the PCCs. The committee will be responsible for the revision and proposal to COM/CITEL of the continuous updating of the regulations, mandates, and work programs of CITEL bodies; the executive secretary of CITEL will act as the secretary of said committee.

14.4

National Standardization

As standardization moves from global to regional and then to national levels, the number of participating entities grows rapidly. Here, the functions of two national standards bodies are reviewed, primarily because these have been in existence the longest and secondarily because they also represent major markets for commercial communications.

14.4.1

ANSI T1

Unlike ETSI, which came into being partly as a consequence of legislative recommendations, the ANSI Committee T1 on telecommunications came into being as a result of the realization that, with the breakup of the Bell System, de facto standards could no longer be expected. In fact, T1 came into being the very same year (1984) that the breakup of the Bell System came into effect. The T1 membership comprises four types of interest groups: users and general interest groups, manufacturers, interexchange carriers, and exchange carriers. This rather broad membership is reflected, to some extent, in the scope to which T1 standards are being applied; nontraditional telecommunications service providers are also utilizing the technologies standardized by committee T1. This situation is the result of the rapid evolution and convergence of the telecommunications, computer, and cable television industries in the United States, and of advances in wireless technology. Committee T1 currently addresses approximately 150 approved projects, which led to the establishment of six, primarily functionally oriented, technical subcommittees (TSC), as shown in Table 14.5 and Fig. 14.2 [although not evident from Table 14.5, subcommittee T1P1 has primary responsibility for management of activities on personal communications systems (PCS)]. In turn, each of these six subcommittees is divided into a number of subtending working groups and subworking groups.

TABLE 14.5

T1 Subcommittee Structure

TSC: T1A1    Performance and signal processing
TSC: T1E1    Network interfaces and environmental considerations
TSC: T1M1    Interwork operations, administration, maintenance, and provisioning
TSC: T1P1    Systems engineering, standards planning, and program management
TSC: T1S1    Services, architecture, and signalling
TSC: T1X1    Digital hierarchy and synchronization

FIGURE 14.2: T1 committee structure.


Committee T1 also has an advisory group (T1AG) made up of elected representatives from each of the four interest groups to carry out committee T1 directives and to develop proposals for consideration by the T1 membership. In parallel to serving as the forum that establishes ANSI telecommunications network standards, committee T1 technical subcommittees draft candidate U.S. technical contributions to the ITU. These contributions are submitted to the U.S. Department of State National Committee for the ITU, which administers U.S. participation and contributions to the ITU (see Fig. 14.3). In this manner, activities within T1 are coordinated with those of the ITU. This coordination with other standards setting bodies is also reflected in T1’s involvement with Latin-American standards, through the formation of an ad hoc group with CITEL’s permanent technical committee 1 (PTC 1/T1). Further coordination with ETSI and other standards setting bodies is accomplished through the global standards collaboration process.

FIGURE 14.3: Committee T1 output.

14.4.2

TIA

The TIA is a full-service trade organization that provides its members with numerous services including government relations, market support activities, educational programs, and standards setting activities. TIA is a member-driven organization. Policy is formulated by 25 board members selected from member companies and is carried out by a permanent professional staff located in Washington, D.C. TIA comprises six issue-oriented standing committees, each of which is chaired by a board member. The six committees are membership scope and development, international, marketing and trade shows, public policy and government relations, and technical. It is this last committee that in 1992 was accredited by ANSI in the United States to standardize telecommunications products. Technology standardization activities are reflected in TIA’s four product-oriented divisions, namely, user premises equipment, network equipment, mobile and personal communications equipment, and fiber optics. These divisions address the legislative and regulatory concerns of product manufacturers and prepare standards dealing with performance testing and compatibility. For example, modem and telematic standards, as well as much of the cellular standards technology, have been standardized in the United States under the mandate of TIA.


14.4.3

TTC

The third national committee to be addressed is the TTC in Japan. TTC was established in October 1985 to develop and disseminate Japanese domestic standards for deregulated technical items and protocols. It is a nongovernmental, nonprofit standards setting organization established to ensure fair and transparent standardization procedures. TTC’s primary emphasis is to conduct studies and research, and to develop and disseminate protocols and standards, for the connection of telecommunications networks. TTC is organized along six technical subcommittees that report to a board of directors through a technical assembly (see Fig. 14.4).

FIGURE 14.4: Organization of TTC.

The TTC organization comprises a general assembly, which is in charge of matters such as business plans and budgets. The councilors meeting examines standards development procedures in order to assure impartiality and clarity. The secretariat provides overall support to the organization; the technical assembly develops standards and handles technical matters including surveys and research. Each technical subcommittee is partitioned into two or more working groups (WG). The coordination committee handles all issues in or between the TSCs and WGs, and it assures the smooth running of all technical committee meetings. Under the coordination committee, a subcommittee examines users’ requests and studies their applicability to the five-year standardization-project plan. This subcommittee also conducts user-request surveys. The areas of involvement of each of the five subcommittees are shown in Table 14.6.

TABLE 14.6

TTC Subcommittees

Strategic Research and Planning Committee    Technical survey and international collaboration
TSC 1                                        Network-to-network interfaces, mobile communications
TSC 2                                        User-network interfaces
TSC 3                                        PBX, LAN
TSC 4                                        Higher level protocols
TSC 5                                        Voice and video signal coding scheme and systems

TTC membership is divided into four categories: type I telecommunications carriers, that is, those carriers that own telecommunications circuits and facilities; type II telecommunications carriers, that is, those with telecommunications circuits leased from type I carriers; related equipment manufacturers; and others, including users.

Underlying objectives that guide TTC’s approach to standards development are 1) to conform to international recommendations or standards; 2) to standardize items where either international recommendations or standards are not clear, or where national standards need to be set, and where a consensus is achieved; and 3) to conduct further studies into any of the items just mentioned whenever the technical assembly is unable to arrive at a consensus. These objectives, which give highest priority to developing standards that are compatible with international recommendations or standards, have often driven TTC to adapt international standards for national use through the use of supplements that:

• Give guidelines for users of TTC standards on how to apply them
• Help clarify the contents of standards
• Help with the implementation of standards in terminal equipment and adaptors
• Assure interconnection between terminal equipment and adaptors
• Provide background information regarding the content of standards
• Assure interconnection.

These supplements also include questions and answers that help in implementing the standards, including encoding examples of various parameters and explanation of the practical meaning of a standard.

14.5

Intellectual Property

In the deregulating telecommunication arena, patents have become increasingly important. New ideas that are incorporated in standards often have global market potential, and patent holders are seeking to obtain an income from their intellectual property as well as from products. In addition, the general effort to develop standards quickly places them closer to the leading edge of technology. There are some cases, for example speech encoding algorithms, where terms of reference for performance are typically set as objectives that no one can meet when the objectives are defined. The state of the art is being pushed by the goals of the standards development organization. In this environment, incorporation of some intellectual property in standards is practically unavoidable.

With regard to intellectual property rights in the ITU, the TSB has developed a “code of practice,” which may be summarized as follows. The TSB requests members putting forth standards to draw the attention of the TSB to any known patent or pending patent application relevant to the developing standard. Where such information has been declared to the TSB, a log of registered patent holders for each affected recommendation is maintained for the convenience of users of ITU standards. If a recommendation, which is a nonbinding international standard, is developed and contains patented intellectual property, there are three situations that may arise.

• The patent holder waives the rights and the recommendation is freely accessible to everybody.
• The patent holder will not waive the rights but is willing to negotiate licenses with other parties on a nondiscriminatory basis and on reasonable terms and conditions. What is reasonable is not defined, and the ITU-T will not participate in such negotiations.
• The patent holder is not willing to comply with either of the above two situations, in which case the ITU-T will not approve a recommendation containing such intellectual property.

The patent policy of the American National Standards Institute (ANSI), which governs all standards development organizations accredited by ANSI, is defined in ANSI procedures 1.2.11. It is similar to that of the ITU in that it requires a statement from patent holders or identified parties to indicate granting of a royalty-free license, willingness to license on reasonable and nondiscriminatory terms and conditions, or a disclaimer of no patent. Unlike the ITU, ANSI advises that it is prepared to get involved in resolving disputes of what is considered “nondiscriminatory” and “reasonable.” Additional information on ANSI patent guidelines can be found at http://web.ansi.org/public/library/guides/ppguide.html.

As mentioned earlier, ETSI produces a combination of mandatory and voluntary standards. This can create additional complications when intellectual property issues are encapsulated within the standards. To formally address these issues, an intellectual property rights committee defines ETSI’s policy in the area of intellectual property.

Given the different patent policies adopted by various standards organizations, it is recommended that companies developing products based on standards investigate and understand the patent policy of the associated standards body and the patent statements filed regarding the standard being implemented.

14.6

Standards Coordination

The pace of technological advancements coupled with deregulation has given rise to increased global telecommunications standards activities. At the same time, a growth of regional standards bodies has occurred, which has increased the potential for duplication of work, wasted resources, and conflicting standards. This potentially adverse situation was addressed by a number of interregional telecommunications standardization conferences (ITSCs) that were held in the early 1990s. A global standards collaboration (GSC) group was established to oversee collaborative activities including electronic document handling (EDH) and five high-interest standards subjects:

• Broadband integrated services digital network (B-ISDN)
• Intelligent networks (IN)
• Telecommunications management network (TMN)
• Universal personal telecommunications (UPT)
• Synchronous digital hierarchy/synchronous optical network (SDH/SONET)

This early activity was successful in avoiding duplication of effort and coordinating activities on these major standardization efforts. Today the level of cooperative activities, again driven by the pressure to avoid wasting valuable resources and to reach agreed standards more rapidly, is being driven to lower levels through the use of liaison statements between regional standards groups and by permitting “documents of information” to flow between standards development organizations. The processes for this information flow are evolving, and the electronic addresses provided at the end of this chapter should be consulted for the current interstandards organization communication mechanisms.

14.7

Scientific

Another global, scientifically based organization that has been particularly active in standards development (more recently emphasizing information processing) is the IEEE. Responsibility for standards adoption within the IEEE lies with the IEEE standards board. The board is supported by nine standing committees (see Fig. 14.5).

FIGURE 14.5: IEEE standards board organization.

Proposed standards are normally developed in the technical committees of the IEEE societies. There are occasions, however, when the scope of activity is too broad to be encompassed by a single society or where the societies are not able to do so for other reasons. In this case the standards board establishes its own standards developing committees, namely, the standards coordinating committees (SCC), to perform this function. The adoption of IEEE standards is based on projects that have been approved by the IEEE standards board, and each project is the responsibility of a sponsor. A sponsor need not be an SCC; it can also be a technical committee of an IEEE society; a standards committee or standards coordinating committee of an IEEE society; an accredited standards committee; or another organization approved by the IEEE standards board.

14.8

Standards Development Cycle

Although the manner in which standards are developed and approved varies somewhat between standards organizations, there are common characteristics to be found. For most standards, first a set of requirements is defined. This may be done either by the standards committee actually developing the standard or by another entity in collaboration with such a committee. Subsequently, the technical details of a standard are developed. The actual entity developing a standard may be a member of the standards committee, or the standards committee itself. Outsiders may also contribute to standards development but, typically, only if sponsored by a committee member. Membership in the standards committee and the right to contribute technical information towards the development of the standard differ among the various standards organizations, as indicated. This process is illustrated in Fig. 14.6.

FIGURE 14.6: Typical standards development and approval process.

Finally, once the standard has been fully developed, it is placed under an approval cycle. Each standards setting body typically has precisely defined and often complex procedures for reviewing and then approving proposed standards, which, although different in detail, are typically consensus driven.

Defining Terms

ANSI: The American National Standards Institute.
CCIR: The International Radio Consultative Committee, the predecessor of the ITU-R.
CCITT: The International Telephone and Telegraph Consultative Committee, the predecessor of the ITU-T.
CEPT: The European Conference for Post and Telecommunications, a predecessor of ETSI.
CITEL: The Inter-American Telecommunication Commission, a standards setting body for the Americas.
ETS: A European (ETSI) technical standard.
ETSI: The European Telecommunications Standards Institute.
GSC: The Global Standards Collaboration group.
ICG: Intersector Coordination Group, a group which coordinates activities between the ITU-T and ITU-R.
IEC: The International Electrotechnical Commission.
IEEE: The Institute of Electrical and Electronics Engineers.
ISO: The International Organization for Standardization.
ITU: The International Telecommunication Union, an international treaty organization, which is part of the United Nations.
ITU-R: The radio communications sector of the ITU, the successor of the CCIR.
ITU-T: The standardization sector of the ITU, the successor of the CCITT.
JCG: The Joint Coordination Group, which oversees the coordination of common work between ITU-T study groups.
Recommendation: An ITU technical standard.
SCC: A standards coordinating committee within the IEEE organization.
Standard: A publicly approved technical specification.
T1: An ANSI-approved standards body, which develops telecommunications standards in the United States.
T1AG: The primary advisory group within ANSI Committee T1 on Telecommunications.
TIA: The Telecommunications Industry Association, which is an ANSI-approved standards body that develops terminal equipment standards.
TTC: The Telecommunications Technology Committee, a Japanese standards setting body.

Further Information

[1] Irmer, T., Shaping future telecommunications: the challenge of global standardization, IEEE Comm. Mag., 32(1), 20–28, 1994.
[2] Matute, M.A., CITEL: formulating telecommunications in the Americas, IEEE Comm. Mag., 32(1), 38–39, 1994.
[3] Robin, G., The European perspective for telecommunications standards, IEEE Comm. Mag., 32(1), 40–50, 1994.
[4] Reilly, A.K., A U.S. perspective on standards development, IEEE Comm. Mag., 32(1), 30–36, 1994.
[5] Iida, T., Domestic standards in a changing world, IEEE Comm. Mag., 32(1), 46–50, 1994.
[6] Habara, K., Cooperation in standardization, IEEE Comm. Mag., 32(1), 78–84, 1994.
[7] IEEE Standards Board Bylaws, Institute of Electrical and Electronics Engineers, Dec. 1993.
[8] Chiarottino, W. and Pirani, G., International telecommunications standards organizations, CSELT Tech. Repts., XXI(2), 207–236, 1993.
[9] ITU, Book No. 1. Resolutions; Recommendations on the organization of the work of ITU-T (series A); study groups and other groups; list of study questions (1993-1996). World Standardization Conf. Helsinki, 1–12, Mar. 1993.
[10] Standards Committee T1, Telecommunications. Procedures Manual. 7th Iss. Jun. 1992.

The standards organizations often undergo structural and substantive changes. It is recommended that the following web sites be visited for the most updated information.

ANSI     http://www.ansi.org/
CITEL    http://www.oas.org
ETSI     http://www.etsi.org
IEC      http://www.iec.ch
IEEE     http://www.ieee.org
ISO      http://www.iso.ch
ITU      http://www.itu.ch
T1       http://www.t1.org
TIA      http://www.tia.org
TTC      http://www.ttc.or.jp

Cox, D.C. “Wireless Personal Communications: A Perspective” Mobile Communications Handbook Ed. Jerry D. Gibson Boca Raton: CRC Press LLC, 1999


Wireless Personal Communications: A Perspective

Donald C. Cox
Stanford University

15.1 Introduction
15.2 Background and Issues
     Mobility and Freedom from Tethers
15.3 Evolution of Technologies, Systems, and Services
     Cordless Telephones • Cellular Mobile Radio Systems • Wide-Area Wireless Data Systems • High-Speed Wireless Local-Area Networks (WLANs) • Paging/Messaging Systems • Satellite-Based Mobile Systems • { Fixed Point-to-Multipoint Wireless Loops • Reality Check
15.4 Evolution Toward the Future and to Low-Tier Personal Communications Services
15.5 Comparisons with Other Technologies
     Complexity/Coverage Area Comparisons • { Coverage, Range, Speed, and Environments
15.6 Quality, Capacity, and Economic Issues
     Capacity, Quality, and Complexity • Economics, System Capacity, and Coverage Area Size • { Loop Evolution and Economics
15.7 Other Issues
     Improvement of Batteries • People Only Want One Handset • Other Environments • Speech Quality Issues • New Technology • High-Tier to Low-Tier or Low-Tier to High-Tier Dual Mode
15.8 Infrastructure Networks
15.9 Conclusion
References

{ This chapter has been updated using { } as indicators of inserts into the text of the original chapter of the same title that appeared in the first edition of this Handbook in 1996. }

15.1

Introduction

Wireless personal communications has captured the attention of the media and with it, the imagination of the public. Hardly a week goes by without one seeing an article on the subject appearing in a popular U.S. newspaper or magazine. Articles ranging from a short paragraph to many pages regularly appear in local newspapers, as well as in nationwide print media, e.g., The Wall Street Journal, The New York Times, Business Week, and U.S. News and World Report. Countless marketing surveys continue to project enormous demand, often projecting that at least half of the households, or half of the people, want wireless personal communications. Trade magazines, newsletters, conferences, and seminars on the subject by many different names have become too numerous to keep track of, and technical journals, magazines, conferences, and symposia continue to proliferate and to have ever increasing attendance and numbers of papers presented. It is clear that wireless personal communications is, by any measure, the fastest growing segment of telecommunications.

{ The explosive growth of wireless personal communications has continued unabated worldwide. Cellular and high-tier PCS pocketphones, pagers, and cordless telephones have become so common in many countries that few people even notice them anymore. These items have become an expected part of everyday life in most developed countries and in many developing countries around the world. }

If you look carefully at the seemingly endless discussions of the topic, however, you cannot help but note that they are often describing different things, i.e., different versions of wireless personal communications [29, 50]. Some discuss pagers, or messaging, or data systems, or access to the national information infrastructure, whereas others emphasize cellular radio, or cordless telephones, or dense systems of satellites. Many make reference to popular fiction entities such as Dick Tracy, Maxwell Smart, or Star Trek.

{ In addition to the things noted above, the topic of wireless loops [24, 30, 32] has also become popular in the widespread discussions of wireless communications. As discussed in [30], this topic includes several fixed wireless applications as well as the low-tier PCS application that was discussed originally under the wireless loop designation [24, 32].
The fixed wireless applications are aimed at reducing the cost of wireline loop-ends, i.e., the so-called “last mile” or “last km” of wireline telecommunications. }

Thus, it appears that almost everyone wants wireless personal communications, but what is it? There are many different ways to segment the complex topic into different communications applications, modes, functions, extent of coverage, or mobility [29, 30, 50]. The complexity of the issues has resulted in considerable confusion in the industry, as evidenced by the many different wireless systems, technologies, and services being offered, planned, or proposed. Many different industry groups and regulatory entities are becoming involved. The confusion is a natural consequence of the massive dislocations that are occurring, and will continue to occur, as we progress along this large change in the paradigm of the way we communicate. Among the different changes that are occurring in our communications paradigm, perhaps the major constituent is the change from wired fixed place-to-place communications to wireless mobile person-to-person communications. Within this major change are also many other changes, e.g., an increase in the significance of data and message communications, a perception of possible changes in video applications, and changes in the regulatory and political climates.

{ The fixed wireless loop applications noted earlier do not fit the new mobile communications paradigm. After many years of decline of fixed wireless communications applications, e.g., intercontinental HF radio and later satellites, point-to-point terrestrial microwave radio, and tropospheric scatter, it is interesting to see this rebirth of interest in fixed wireless applications. This rebirth is riding on the gigantic “wireless wave” resulting from the rapid public acceptance of mobile wireless communications.
It will be interesting to observe this rebirth to see if communications history repeats; certainly mobility is wireless, but there is also considerable historical evidence that wireless is also mobility. }

This chapter attempts to identify different issues and to put many of the activities in wireless into a framework that can provide perspective on what is driving them, and perhaps even to yield some indication of where they appear to be going in the future. Like any attempt to categorize many complex interrelated issues, however, there are some that do not quite fit into neat categories, and so there will remain some dangling loose ends. Like any major paradigm shift, there will continue to be considerable confusion as many entities attempt to interpret the different needs and expectations associated with the new paradigm.

15.2

Background and Issues

15.2.1

Mobility and Freedom from Tethers

Perhaps the clearest constituents in all of the wireless personal communications activity are the desire for mobility in communications and the companion desire to be free from tethers, i.e., from physical connections to communications networks. These desires are clear from the very rapid growth of mobile technologies that provide primarily two-way voice services, even though economical wireline voice services are readily available. For example, cellular mobile radio has experienced rapid growth. Growth rates have been between 35 and 60% per year in the United States for a decade, with the total number of subscribers reaching 20 million by year-end 1994. The often neglected wireless companions to cellular radio, i.e., cordless telephones, have experienced even more rapid, but harder to quantify, growth, with sales rates often exceeding 10 million sets a year in the United States, and with an estimated usage significantly exceeding 50 million in 1994. Telephones in airlines have also become commonplace. Similar or even greater growth in these wireless technologies has been experienced throughout the world.

{ The explosive growth in cellular and its identical companion, high-tier PCS, has continued to about 55 million subscribers in the U.S. at year-end 1997 and a similar number worldwide. In Sweden the penetration of cellular subscribers by 1997 was over one-third of the total population, i.e., the total including every man, woman, and child! And the growth has continued since. Similar penetrations of mobile wireless services are seen in some other developed nations, e.g., Japan. The growth in users of cordless telephones also has continued to the point that they have become the dominant subscriber terminal on wireline telephone loops in the U.S. It would appear that, taking into account cordless telephones and cellular and high-tier PCS phones, half of all telephone calls in the U.S. terminate with at least one end on a wireless device. }

{ Perhaps the most significant event in wireless personal communications since the writing of this original chapter was the widespread deployment and start of commercial service of the personal handyphone system (PHS) in Japan in July of 1995 and its very rapid early acceptance by the consumer market [53]. By year-end 1996 there were 5 million PHS subscribers in Japan, with the growth rate exceeding one-half million/month for some months. The PHS “phenomenon” was one of the fastest adoptions of a new technology ever experienced. However, the PHS success story [41] peaked at a little over 7 million subscribers in 1997 and has declined slightly to a little under 7 million in mid-1998. This was the first mass deployment of a low-tier-like PCS technology (see later sections of this chapter), but PHS has some significant limitations. Perhaps the most significant limitation is the inability to successfully hand off at vehicular speeds. This handoff limitation is a result of the cumbersome radio link structure and control algorithms used to implement dynamic channel allocation (DCA) in PHS. DCA significantly increases channel occupancy (base station capacity) but incurs considerable complexity in implementing handoff. Another significant limitation of the PHS standard has been insufficient receiver sensitivity to permit “adequate” coverage from a “reasonably” dense deployment of base stations. These technology deficiencies, coupled with heavy price cutting by the cellular service providers to compete with the rapidly advancing PHS market, were significant contributors to the leveling out of PHS growth. It is again evident, as with CT-2 phone point discussed in a later section, that low-tier PCS has very attractive features that can attract many subscribers, but it must also provide vehicle speed handoff and widespread coverage of highways as well as populated regions.
Others might point out the deployment and start of service of CDMA systems as a significant event since the first edition. However, the major significance of this CDMA activity is that it confirmed that CDMA performance was no better than other less-complex technologies and that those, including this author, who had been branded as “unbelieving skeptics” were correct in their assessments of
the shortcomings of this technology. The overwhelming failure of CDMA technology to live up to the early claims for it can hardly be seen as a significant positive event in the evolution of wireless communication. It was, of course, a significant negative event. After years of struggling with the problems of this technology, service providers still have significantly fewer subscribers on CDMA worldwide than there are PHS subscribers in Japan alone! CDMA issues are discussed more in later sections dealing with technology issues. } Paging and associated messaging, although not providing two-way voice, do provide a form of tetherless mobile communications to many subscribers worldwide. These services have also experienced significant growth { and have continued to grow since 1996. } There is even a glimmer of a market in the many different specialized wireless data applications evident in the many wireless local area network (WLAN) products on the market, the several wide area data services being offered, and the specialized satellite-based message services being provided to trucks on highways. { Wireless data technologies still have many supporters, but they still have fallen far short of the rapid deployment and growth of the more voice oriented wireless technologies. However, hope appears to be eternal in the wireless data arena. } The topics discussed in the preceding two paragraphs indicate a dominant issue separating the different evolutions of wireless personal communications. That issue is the voice versus data communications issue that permeates all of communications today; this division also is very evident in fixed networks. The packet-oriented computer communications community and the circuit-oriented voice telecommunications (telephone) community hardly talk to each other and often speak different languages in addressing similar issues. 
Although they often converge to similar overall solutions at large scales (e.g., hierarchical routing with exceptions for embedded high-usage routes), the small-scale initial solutions are frequently quite different. Asynchronous transfer mode (ATM)-based networks are an attempt to integrate, at least partially, the needs of both the packet-data and circuit-oriented communities.

Superimposed on the voice-data issue is an issue of competing modes of communications that exist in both fixed and mobile forms. These different modes include the following.

Messaging is where the communication is not real time but is by way of message transmission, storage, and retrieval. This mode is represented by voice mail, electronic facsimile (fax), and electronic mail (e-mail), the latter of which appears to be a modern automated version of an evolution that includes telegraph and telex. Radio paging systems often provide limited one-way messaging, ranging from transmitting only the number of a calling party to longer alpha-numeric text messages.

Real-time two-way communications are represented by the telephone, cellular mobile radio telephone, and interactive text (and graphics) exchange over data networks. Two-way video phone always captures significant attention and fits into this mode; however, its benefit/cost ratio has yet to exceed a value that customers are willing to pay.

Paging, i.e., broadcast with no return channel, alerts a paged party that someone wants to communicate with him/her. Paging is like the ringer on a telephone without having the capability for completing the communications.

Agents are new high-level software applications or entities being incorporated into some computer networks. When launched into a data network, an agent is aimed at finding information by some title or characteristic and returning the information to the point from which the agent was launched. { The rapid growth of the worldwide web is based on this mode of communications.
} There are still other ways in which wireless communications have been segmented in attempts to optimize a technology to satisfy the needs of some particular group. Examples include 1) user location, which can be differentiated by indoors or outdoors, or on an airplane or a train and 2) degree of mobility, which can be differentiated either by speed, e.g., vehicular, pedestrian, or stationary, or
by size of area throughout which communications are provided. { As noted earlier, wireless local loop with stationary terminals has become a major segment in the pursuit of wireless technology. }

At this point one should again ask: wireless personal communications, what is it? The evidence suggests that what is being sought by users, and produced by providers, can be categorized according to the following two main characteristics.

Communications portability and mobility on many different scales:
• Within a house or building (cordless telephone, WLANs)
• Within a campus, a town, or a city (cellular radio, WLANs, wide area wireless data, radio paging, extended cordless telephone)
• Throughout a state or region (cellular radio, wide area wireless data, radio paging, satellite-based wireless)
• Throughout a large country or continent (cellular radio, paging, satellite-based wireless)
• Throughout the world?

Communications by many different modes for many different applications:
• Two-way voice
• Data
• Messaging
• Video?

Thus, it is clear why wireless personal communications today is not one technology, not one system, and not one service but encompasses many technologies, systems, and services optimized for different applications.

15.3

Evolution of Technologies, Systems, and Services

Technologies and systems [27, 29, 30, 39, 50, 59, 67, 87], that are currently providing, or are proposed to provide, wireless communications services can be grouped into about seven relatively distinct groups, { the seven previous groups are still evident in the technology but with the addition of the fixed point-to-multipoint wireless loops there are now eight, } although there may be some disagreement on the group definitions, and in what group some particular technology or system belongs. All of the technologies and systems are evolving as technology advances and perceived needs change. Some trends are becoming evident in the evolutions. In this section, different groups and evolutionary trends are explored along with factors that influence the characteristics of members of the groups. The grouping is generally with respect to scale of mobility and communications applications or modes.

15.3.1

Cordless Telephones

Cordless telephones [29, 39, 50] generally can be categorized as providing low-mobility, low-power, two-way tetherless voice communications, with low mobility applying both to the range and the user’s speed. Cordless telephones using analog radio technologies appeared in the late 1970s, and have experienced spectacular growth. They have evolved to digital radio technologies in the forms of second-generation cordless telephone (CT-2), and digital European cordless telephone (DECT)
standards in Europe, and several different industrial scientific medical (ISM) band technologies in the United States.1 { Personal handyphone (PHS) noted earlier and discussed in later sections and inserts can be considered either as a quite advanced digital cordless telephone similar to DECT or as a somewhat limited low-tier PCS technology. It has most of the attributes of similarity of the digital cordless telephones listed later in this section except that PHS uses π/4 QPSK modulation. } Cordless telephones were originally aimed at providing economical, tetherless voice communications inside residences, i.e., at using a short wireless link to replace the cord between a telephone base unit and its handset. The most significant considerations in design compromises made for these technologies are to minimize total cost, while maximizing the talk time away from the battery charger. For digital cordless phones intended to be carried away from home in a pocket, e.g., CT-2 or DECT, handset weight and size are also major factors. These considerations drive designs toward minimizing complexity and minimizing the power used for signal processing and for transmitting. Cordless telephones compete with wireline telephones. Therefore, high circuit quality has become a requirement. Early cordless sets had marginal quality. They were purchased by the millions, and discarded by the millions, until manufacturers produced higher-quality sets. Cordless telephones sales then exploded. Their usage has become commonplace, approaching, and perhaps exceeding, usage of corded telephones. The compromises accepted in cordless telephone design in order to meet the cost, weight, and talk-time objectives are the following. 
• Few users per megahertz
• Few users per base unit (many link together a particular handset and base unit)
• Large number of base units per unit area; one or more base units per wireline access line (in high-rise apartment buildings the density of base units is very large)
• Short transmission range

There is no added network complexity since a base unit looks to a telephone network like a wireline telephone. These issues are also discussed in [29, 50].

Digital cordless telephones in Europe have been evolving for a few years to extend their domain of use beyond the limits of inside residences. Cordless telephone, second generation, (CT-2) has evolved to provide telepoint or phone-point services. Base units are located in places where people congregate, e.g., along city streets and in shopping malls, train stations, etc. Handsets registered with the phone-point provider can place calls when within range of a telepoint. CT-2 does not provide capability for transferring (handing off) active wireless calls from one phone point to another if a user moves out of range of the one to which the call was initiated. A CT-2+ technology, evolved from CT-2 and providing limited handoff capability, is being deployed in Canada. { CT-2+ deployment was never completed. } Phone-point service was introduced in the United Kingdom twice, but failed to attract enough customers to become a viable service. In Singapore and Hong Kong, however, CT-2 phone point has grown rapidly, reaching over 150,000 subscribers in Hong Kong [75] in mid-1994. The reasons for success in some places and failure in others are still being debated, but it is clear that the compactness of the Hong Kong and Singapore populations makes the service more widely available, using fewer base stations than in more spread-out cities. Complaints of CT-2 phone-point

1 These ISM technologies either use spread spectrum techniques (direct sequence or frequency hopping) or very low transmitter power ( 10 ms.

Simple Frequency-Shift Modulation and Noncoherent Detection: Although still being low in complexity, the slightly more complex π/4 QPSK modulation with coherent detection provides significantly more spectrum efficiency, range, and interference immunity.

Dynamic Channel Allocation: Although this technique has potential for improved system capacity, the cordless-telephone implementations do not take full advantage of this feature for handoff and, thus, cannot reap the full benefit for moving users [15, 19].

Time Division Duplex (TDD): This technique permits the use of a single contiguous frequency band and implementation of diversity from one end of a radio link. Unless all base station transmissions are synchronized in time, however, it can incur severe cochannel interference penalties in outside environments [15, 16]. Of course, for cordless telephones used inside with base stations not having a propagation advantage, this is not a problem. Also, for small indoor PBX networks, synchronization of base station transmission is easier than is synchronization throughout a widespread outdoor network, which can have many adjacent base stations connected to different geographic locations for central control and switching.
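The idea behind dynamic channel allocation can be illustrated with a minimal sketch: before setting up a link, a terminal or base unit measures interference on every candidate channel and takes the quietest one, provided it is quiet enough. This is only an illustration of the general principle, not the procedure specified for CT-2, DECT, or PHS; the function name and the −90 dBm acceptance threshold are hypothetical.

```python
from typing import Optional, Sequence


def pick_channel(interference_dbm: Sequence[float],
                 max_acceptable_dbm: float = -90.0) -> Optional[int]:
    """Return the index of the least-interfered channel.

    Returns None when there are no channels or when even the quietest
    channel exceeds the (hypothetical) acceptance threshold.
    """
    if not interference_dbm:
        return None
    # Index of the channel with the lowest measured interference power.
    best = min(range(len(interference_dbm)), key=lambda i: interference_dbm[i])
    if interference_dbm[best] > max_acceptable_dbm:
        return None
    return best


# Example measurements (made-up values, in dBm) on four channels.
measured = [-82.0, -101.5, -95.0, -88.0]
print(pick_channel(measured))
```

The handoff difficulty noted above arises because such a selection has to be repeated, and coordinated with the network, every time a moving user changes base stations.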

15.3.2

Cellular Mobile Radio Systems

Cellular mobile radio systems are becoming known in the United States as high-tier personal communications service (PCS), particularly when implemented in the new 1.9-GHz PCS bands [20]. These systems generally can be categorized as providing high-mobility, wide-ranging, two-way tetherless voice communications. In these systems, high mobility refers to vehicular speeds, and also to widespread regional to nationwide coverage [27, 29, 50]. Mobile radio has been evolving for over 50 years. Cellular radio integrates wireless access with large-scale networks having sophisticated intelligence to manage mobility of users. Cellular radio was designed to provide voice service to wide-ranging vehicles on streets and highways [29, 39, 50, 82], and generally uses transmitter power on the order of 100 times that of cordless telephones (≈ 2 W for cellular). Thus, cellular systems can only provide reduced service to handheld sets that are disadvantaged by using somewhat lower transmitter power (< 0.5 W) and less efficient antennas than vehicular sets. Handheld sets used inside buildings have the further disadvantage of attenuation through walls that is not taken into account in system design. Cellular radio or high-tier PCS has experienced large growth as noted earlier. In spite of the
TABLE 15.1 Wireless PCS Technologies

High-Power Systems: Digital Cellular (High-Tier PCS)

System                          IS-54             IS-95 (DS)        GSM               DCS-1800
Multiple access                 TDMA/FDMA         CDMA/FDMA         TDMA/FDMA         TDMA/FDMA
Freq. band, uplink, MHz         869–894           869–894           935–960           1710–1785
Freq. band, downlink, MHz       824–849           824–849           890–915           1805–1880
Region                          (USA)             (USA)             (Eur.)            (UK)
RF ch. spacing, kHz, down/up    30/30             1250/1250         200/200           200/200
Modulation                      π/4 DQPSK         BPSK/QPSK         GMSK              GMSK
Portable txmit power, max./avg. 600 mW/200 mW     600 mW            1 W/125 mW        1 W/125 mW
Speech coding                   VSELP             QCELP             RPE-LTP           RPE-LTP
Speech rate, kb/s               7.95              8 (var.)          13                13
Speech ch./RF ch.               3                 —                 8                 8
Ch. bit rate, kb/s, up/down     48.6/48.6                           270.833/270.833   270.833/270.833
Ch. coding                      1/2 rate conv.    1/2 rate fwd.,    1/2 rate conv.    1/2 rate conv.
                                                  1/3 rate rev.
Frame, ms                       40                20                4.615             4.615

Low-Power Systems: Low-Tier PCS and Digital Cordless

System                          WACS/PACS         Handi-Phone       DECT              CT-2
Class                           Low-Tier PCS      Low-Tier PCS      Digital Cordless  Digital Cordless
Multiple access                 TDMA/FDMA         TDMA/FDMA         TDMA/FDMA         FDMA
Freq. band, MHz                 Emerg. Tech.∗     1895–1907         1880–1990         864–868
Region                          (USA)             (Japan)           (Eur.)            (Eur. and Asia)
RF ch. spacing, kHz, down/up    300/300           300               1728              100
Modulation                      π/4 QPSK          π/4 DQPSK         GFSK              GFSK
Portable txmit power, max./avg. 200 mW/25 mW      80 mW/10 mW       250 mW/10 mW      10 mW/5 mW
Speech coding                   ADPCM             ADPCM             ADPCM             ADPCM
Speech rate, kb/s               32/16/8           32                32                32
Speech ch./RF ch.               8/16/32           4                 12                1
Ch. bit rate, kb/s, up/down     384/384           384               1152              72
Ch. coding                      CRC               CRC               CRC (control)     None
Frame, ms                       2.5               5                 10                2

∗ Spectrum is 1.85–2.2 GHz allocated by the FCC for emerging technologies; DS is direct sequence.

limitations on usage of handheld sets already noted, handheld cellular sets have become very popular, with their sales becoming comparable to the sales of vehicular sets. Frequent complaints from handheld cellular users are that batteries are too large and heavy, and both talk time and standby time are inadequate. { Cellular and high-tier PCS pocket handsets have continued to decrease in size and weight and more efficient lithium batteries have been incorporated. This has increased their attractiveness (more on this in the later section “Reality Check”). For several years there have been many more pocket handsets sold than vehicular mounted sets every year. However, despite the improvements in these handsets and batteries, the complaints of weight and limited talk time still persist. The electronics have become essentially weightless compared to the batteries required for these high-tier PCS and cellular handsets. }

Cellular radio at 800 MHz has evolved to digital radio technologies [29, 39, 50] in the forms of the deployed systems standards
• Global System for Mobile Communications (GSM) in Europe
• Japanese or personal digital cellular (JDC or PDC) in Japan
• U.S. TDMA digital cellular known as USDC or IS-54
and in the form of the code division multiple access (CDMA) standard, IS-95, which is under development but not yet deployed. { Since the first edition was published, CDMA systems have been deployed in the U.S., Korea, Hong Kong, and other countries after many months (years) of redesign, reprogramming, and adjustment. These CDMA issues are discussed later in the section “New Technology.” }

The most significant consideration in the design compromises made for the U.S. digital cellular or high-tier PCS systems was the high cost of cell sites (base stations). A figure often quoted is U.S. $1 million for a cell site. This consideration drove digital system designs to maximize users per megahertz and to maximize the users per cell site.
Because of the need to cover highways running through low-population-density regions between cities, the relatively high transmitter power requirement was retained to provide maximum range from high antenna locations. Compromises that were accepted while maximizing the two just cited parameters are as follows.
• High transmitter power consumption.
• High user-set complexity, and thus high signal-processing power consumption.
• Low circuit quality.
• High network complexity, e.g., the new IS-95 technology will require complex new switching and control equipment in the network, as well as high-complexity wireless-access technology.
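The users-per-megahertz objective can be made concrete with the entries of Table 15.1. The sketch below divides the number of speech channels per RF channel by the RF channel spacing. It is a rough per-carrier figure only: it ignores frequency reuse, guard bands, and the second carrier that frequency-division-duplex systems need for the other direction, so it should not be read as a capacity comparison. IS-95 is omitted because the table gives no channels-per-carrier entry, and the WACS/PACS figure assumes the 32-kb/s speech rate (8 channels per carrier). The function and dictionary names are illustrative.

```python
# (speech channels per RF channel, RF channel spacing in kHz), from Table 15.1.
TABLE_15_1 = {
    "IS-54": (3, 30.0),
    "GSM": (8, 200.0),
    "DCS-1800": (8, 200.0),
    "WACS/PACS": (8, 300.0),   # assumes 32 kb/s speech coding
    "Handi-Phone": (4, 300.0),
    "DECT": (12, 1728.0),
    "CT-2": (1, 100.0),
}


def speech_channels_per_mhz(channels_per_carrier: int, spacing_khz: float) -> float:
    """Speech channels carried per megahertz of one carrier's spectrum."""
    return channels_per_carrier * 1000.0 / spacing_khz


for system, (channels, spacing) in TABLE_15_1.items():
    print(f"{system:12s} {speech_channels_per_mhz(channels, spacing):6.1f}")
```

On this crude measure IS-54 packs in the most speech channels per megahertz and DECT the fewest, which is consistent with the different design priorities described for the high-tier and cordless technologies.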

Cellular radio or high-tier PCS has also been evolving for a few years in a different direction, toward very small coverage areas or microcells. This evolution provides increased capacity in areas having high user density, as well as improved coverage of shadowed areas. Some microcell base stations are being installed inside, in conference center lobbies and similar places of high user concentrations. Of course, microcells also permit lower transmitter power that conserves battery power when power control is implemented, and base stations inside buildings circumvent the outside wall attenuation. Low-complexity microcell base stations also are considerably less expensive than conventional cell sites, perhaps two orders of magnitude less expensive. Thus, the use of microcell base stations provides large increases in overall system capacity, while also reducing the cost per available radio channel and the battery drain on portable subscriber equipment. This microcell evolution, illustrated in Fig. 15.1,
moves handheld cellular sets in a direction similar to that of the expanded-coverage evolution of cordless telephones to phone points and wireless PBX.

Some of the characteristics of digital-cellular or high-tier PCS technologies are listed in Table 15.1 for IS-54, IS-95, and GSM at 900 MHz, and DCS-1800, which is GSM at 1800 MHz. { The technology listed here as IS-54 has also become known as IS-136 having more sophisticated digital control channels. These technologies, IS-54/IS-136, are also sometimes known as DAMPS (i.e., Digital AMPS), as U.S. TDMA or North American TDMA, or sometimes just as “TDMA.” } Additional information can be found in [29, 39, 50]. The JDC or PDC technology, not listed, is similar to IS-54. As with the digital cordless technologies, there are significant differences among these cellular technologies, e.g., modulation type, multiple access technology, and channel bit rate. There are also many similarities, however, that are fundamental to the design objectives discussed earlier. These similarities and their implications are as follows.

Low Bit-Rate Speech Coding ≤13 kb/s with Some ≤8 kb/s: Low bit-rate speech coding obviously increases the number of users per megahertz and per cell site. However, it also significantly reduces speech quality [29], and does not permit speech encodings in tandem while traversing a network; see also the section on Other Issues later in this chapter.

Some Implementations Make Use of Speech Inactivity: This further increases the number of users per cell site, i.e., the cell-site capacity. It also further reduces speech quality [29], however, because of the difficulty of detecting the onset of speech. This problem is even worse in an acoustically noisy environment like an automobile.

High Transmission Delay; ≈200-ms Round Trip: This is another important circuit-quality issue. Such large delay is about the same as one-way transmission through a synchronous-orbit communications satellite. A voice circuit with digital cellular technology on both ends will experience the delay of a full satellite circuit. It should be recalled that one reason long-distance circuits have been removed from satellites and put onto fiber-optic cable is because customers find the delay to be objectionable. This delay in digital cellular technology results from both computation for speech bit-rate reduction and from complex signal processing, e.g., bit interleaving, error correction decoding, and multipath mitigation [equalization or spread spectrum code division multiple access (CDMA)].

High-Complexity Signal Processing, Both for Speech Encoding and for Demodulation: Signal processing has been allowed to grow without bound and is about a factor of 10 greater than that used in the low-complexity digital cordless telephones [29]. Since several watts are required from a battery to produce the high transmitter power in a cellular or high-tier PCS set, signal-processing power is not as significant as it is in the low-power cordless telephones; see also the section on Complexity/Coverage Area Comparisons later in this chapter.

Fixed Channel Allocation: The difficulties associated with implementing capacity-increasing dynamic channel allocation to work with handoff [15, 19] have impeded its adoption in systems requiring reliable and frequent handoff.

Frequency Division Duplex (FDD): Cellular systems have already been allocated paired-frequency bands suitable for FDD. Thus, the network or system complexity required for providing synchronized transmissions [15, 16] from all cell sites for TDD has not been embraced in these digital cellular systems. Note that TDD has not been employed in IS-95 even though such synchronization is required for other reasons.

Mobile/Portable Set Power Control: The benefits of increased capacity from lower overall cochannel interference and reduced battery drain have been sought by incorporating power control in the digital cellular technologies.
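The comparison drawn above between the ≈200-ms cellular round-trip delay and a one-way satellite hop can be checked with a short calculation. The sketch assumes a geostationary altitude of 35,786 km and both earth stations directly beneath the satellite, so it gives the minimum propagation delay only and excludes any processing delay; the constant and function names are illustrative.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum
GEO_ALTITUDE_KM = 35_786.0          # geostationary altitude above the equator


def one_way_satellite_delay_ms(altitude_km: float = GEO_ALTITUDE_KM) -> float:
    """Minimum one-way delay: ground to satellite and back down to ground."""
    return 2.0 * altitude_km / SPEED_OF_LIGHT_KM_S * 1000.0


delay_ms = one_way_satellite_delay_ms()
print(f"one-way satellite hop: about {delay_ms:.0f} ms")
print(f"ratio of 200-ms cellular round trip to one hop: {200.0 / delay_ms:.2f}")
```

The result, roughly 240 ms, is of the same order as the ≈200-ms round trip quoted for digital cellular, which is the basis for the statement that two digital cellular ends together approach the delay of a full satellite circuit.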


15.3.3

Wide-Area Wireless Data Systems

Existing wide area data systems generally can be categorized as providing high mobility, wide-ranging, low-data-rate digital data communications to both vehicles and pedestrians [29, 50]. These systems have not experienced the rapid growth that the two-way voice technologies have, even though they have been deployed in many cities for a few years and have established a base of customers in several countries. Examples of these packet data systems are shown in Table 15.2.

TABLE 15.2 Wide-Area Wireless Packet Data Systems

                      CDPD¹                   RAM Mobile (Mobitex)    ARDIS² (KDT)            Metricom (MDN)³
Data rate, kb/s       19.2                    8 (19.2)                4.8 (19.2)              76
Modulation            GMSK BT = 0.5           GMSK                    GMSK                    GMSK
Frequency, MHz        800                     900                     800                     915
Chan. spacing, kHz    30                      12.5                    25                      160
Status                1994 service            Full service            Full service            In service
Access means          Unused AMPS channels    Slotted Aloha CSMA                              FH SS (ISM)
Transmit power, W                                                     40                      1

Note: Data in parentheses ( ) indicates proposed.
1 Cellular Digital Packet Data
2 Advanced Radio Data Information Service
3 Microcellular Data Network
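One way to read Table 15.2 is to divide each data rate by its channel spacing, which gives a crude measure of how many bits per second each system carries per hertz of channel. The sketch below uses the current (non-parenthesized) data rates from the table; it ignores protocol overhead, coding, and frequency reuse, and the names used are illustrative.

```python
# (data rate in kb/s, channel spacing in kHz), current values from Table 15.2.
TABLE_15_2 = {
    "CDPD": (19.2, 30.0),
    "RAM Mobile (Mobitex)": (8.0, 12.5),
    "ARDIS (KDT)": (4.8, 25.0),
    "Metricom (MDN)": (76.0, 160.0),
}


def bits_per_second_per_hertz(rate_kbps: float, spacing_khz: float) -> float:
    """Raw channel data rate divided by channel spacing (kb/s per kHz = b/s/Hz)."""
    return rate_kbps / spacing_khz


for system, (rate, spacing) in TABLE_15_2.items():
    print(f"{system:22s} {bits_per_second_per_hertz(rate, spacing):.3f} b/s/Hz")
```

All four systems come out below 1 b/s/Hz on this measure, which is in line with the text's description of these as low-data-rate systems.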

The earliest and best known of these systems in the United States are the ARDIS network developed and run by Motorola, and the RAM mobile data network based on Ericsson Mobitex Technology. These technologies were designed to make use of standard, two-way voice, land mobile-radio channels, with 12.5- or 25-kHz channel spacing. In the United States these are specialized mobile radio services (SMRS) allocations around 450 MHz and 900 MHz. Initially, the data rates were low: 4.8 kb/s for ARDIS and 8 kb/s for RAM. The systems use high transmitter power (several tens of watts) to cover large regions from a few base stations having high antennas. The relatively low data capacity of a relatively expensive base station has resulted in economics that have not favored rapid growth. The wide-area mobile data systems also are evolving in several different directions in an attempt to improve base station capacity, economics, and the attractiveness of the service. The technologies used in both the ARDIS and RAM networks are evolving to higher channel bit rates of 19.2 kb/s. The cellular carriers and several manufacturers in the United States are developing and deploying a new wide area packet data network as an overlay to the cellular radio networks. This cellular digital packet data (CDPD) technology shares the 30-kHz spaced 800-MHz voice channels used by the analog FM advanced mobile phone service (AMPS) systems. Data rate is 19.2 kb/s. The CDPD base station equipment also shares cell sites with the voice cellular radio system. The aim is to reduce the cost of providing packet data service by sharing the costs of base stations with the better established and higher cell-site capacity cellular systems. 
This is a strategy similar to that used by nationwide fixed wireline packet data networks that could not provide an economically viable data service if they did not share costs by leasing a small amount of the capacity of the interexchange networks that are paid for largely by voice traffic. { CDPD has been deployed in many U.S. cities for several years. However, it has not lived up to early expectations and has become
“just another” wireless data service with some subscribers, but not with the large growth envisioned earlier. } Another evolutionary path in wide-area wireless packet data networks is toward smaller coverage areas or microcells. This evolutionary path also is indicated in Fig. 15.1. The microcell data networks are aimed at stationary or low-speed users. The design compromises are aimed at reducing service costs by making very small and inexpensive base stations that can be attached to utility poles, the sides of buildings, and inside buildings, and can be widely distributed throughout a region. Base-station-to-base-station wireless links are used to reduce the cost of the interconnecting data network. In one network this decreases the overall capacity to serve users, since it uses the same radio channels that are used to provide service. Capacity is expected to be made up by increasing the number of base stations that have connections to a fixed-distribution network as service demand increases. Another such network uses other dedicated radio channels to interconnect base stations. In the high-capacity limit, these networks will look more like a conventional cellular network architecture, with closely spaced, small, inexpensive base stations, i.e., microcells, connected to a fixed infrastructure. Specialized wireless data networks have been built to provide metering and control of electric power distributions, e.g., Celldata and Metricom in California. A large microcell network of small inexpensive base stations has been installed in the lower San Francisco Bay Area by Metricom, and public packet-data service was offered during early 1994. Most of the small (shoe-box size) base stations are mounted on street light poles. Reliable data rates are about 75 kb/s. The technology is based on slow frequency-hopped spread spectrum in the 902–928 MHz U.S. ISM band.
Transmitter power is 1 W maximum, and power control is used to minimize interference and maximize battery lifetime. { The Metricom network has been improved and significantly expanded in the San Francisco Bay Area and has been deployed in Washington, D.C. and a few other places in the U.S. However, like all wireless data services so far, it has failed to grow as rapidly or to attract as many subscribers as was originally expected. Wireless data overall has had only very limited success compared to that of the more voice-oriented technologies, systems, and services. }
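As a rough illustration of slow frequency hopping in the 902–928 MHz band, the sketch below counts how many channels of the 160-kHz spacing listed in Table 15.2 fit in the band and then steps through them in a pseudo-random order. The hop pattern here is purely hypothetical (a seeded shuffle); the source does not describe the actual Metricom hopping sequence, and the function names are illustrative.

```python
import random
from typing import List


def hop_channels(band_low_mhz: float = 902.0,
                 band_high_mhz: float = 928.0,
                 spacing_khz: float = 160.0) -> List[float]:
    """Center frequencies (MHz) of the channels that fit wholly inside the band."""
    spacing_mhz = spacing_khz / 1000.0
    count = int((band_high_mhz - band_low_mhz) / spacing_mhz)
    return [band_low_mhz + (k + 0.5) * spacing_mhz for k in range(count)]


def hop_sequence(channels: List[float], seed: int, hops: int) -> List[float]:
    """Visit the channels in a repeating pseudo-random order (hypothetical pattern)."""
    rng = random.Random(seed)
    order = list(range(len(channels)))
    rng.shuffle(order)
    return [channels[order[i % len(order)]] for i in range(hops)]


channels = hop_channels()
print(len(channels), "channels of 160 kHz fit in 902-928 MHz")
print([round(f, 2) for f in hop_sequence(channels, seed=1, hops=5)])
```

A transmitter and receiver sharing the same seed would follow the same channel order, which is the basic requirement for a hopping link; synchronization, dwell time, and interference avoidance are all left out of this sketch.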

15.3.4

High-Speed Wireless Local-Area Networks (WLANs)

Wireless local-area data networks can be categorized as providing low-mobility high-data-rate data communications within a confined region, e.g., a campus or a large building. Coverage range from a wireless data terminal is short, tens to hundreds of feet, like cordless telephones. Coverage is limited to within a room or to several rooms in a building. WLANs have been evolving for a few years, but overall the situation is chaotic, with many different products being offered by many different vendors [29, 59]. There is no stable definition of the needs or design objectives for WLANs, with data rates ranging from hundreds of kb/s to more than 10 Mb/s, and with several products providing one or two Mb/s wireless link rates. The best description of the WLAN evolutionary process is: having severe birth pains. An IEEE standards committee, 802.11, has been attempting to put some order into this topic, but their success has been somewhat limited. A partial list of some advertised products is given in Table 15.3. Users of WLANs are not nearly as numerous as the users of more voice-oriented wireless systems. Part of the difficulty stems from these systems being driven by the computer industry that views the wireless system as just another plug-in interface card, without giving sufficient consideration to the vagaries and needs of a reliable radio system. { This section still describes the WLAN situation in spite of some attempts at standards in the U.S. and Europe, and continuing industry efforts. Some of the products in Table 15.3 have been discontinued because of lack of market and some new products have been offered, but the manufacturers still continue to struggle to find enough customers to support their efforts. Optimism remains high in the WLAN
community that “eventually” they will find the “right” technology, service, or application to make WLANs “take off ” — but the world still waits. Success is still quite limited. } There are two overall network architectures pursued by WLAN designers. One is a centrally coordinated and controlled network that resembles other wireless systems. There are base stations in these networks that exercise overall control over channel access [44]. The other type of network architecture is the self-organizing and distributed controlled network where every terminal has the same function as every other terminal, and networks are formed ad hoc by communications exchanges among terminals. Such ad hoc networks are more like citizen band (CB) radio networks, with similar expected limitations if they were ever to become very widespread. Nearly all WLANs in the United States have attempted to use one of the ISM frequency bands for unlicensed operation under part 15 of the FCC rules. These bands are 902–928 MHz, 2400–2483.5 MHz, and 5725–5850 MHz, and they require users to accept interference from any interfering source that may also be using the frequency. The use of ISM bands has further handicapped WLAN development because of the requirement for use of either frequency hopping or direct sequence spread spectrum as an access technology, if transmitter power is to be adequate to cover more than a few feet. One exception to the ISM band implementations is the Motorola ALTAIR, which operates in a licensed band at 18 GHz. { It appears that ALTAIR has been discontinued because of the limited market. } The technical and economic challenges of operation at 18 GHz have hampered the adoption of this 10–15 Mb/s technology. The frequency-spectrum constraints have been improved in the United States with the recent FCC allocation of spectrum from 1910–1930 MHz for unlicensed data PCS applications. 
Use of this new spectrum requires implementation of an access etiquette incorporating listen before transmit in an attempt to provide some coordination of an otherwise potentially chaotic, uncontrolled environment [68]. Also, since spread spectrum is not a requirement, access technologies and multipath mitigation techniques more compatible with the needs of packet-data transmission [59], e.g., multipath equalization or multicarrier transmission, can be incorporated into new WLAN designs. { The FCC is allocating spectrum at 5 GHz for wideband wireless data for Internet and next-generation data network access, BUT it remains to be seen whether this initiative is any more successful than past wireless data attempts. Optimism is again high, BUT... }

Three other widely different WLAN activities also need mentioning. One is a large European Telecommunications Standards Institute (ETSI) activity to produce a standard for high performance radio local area network (HIPERLAN), a 20-Mb/s WLAN technology to operate near 5 GHz. The others are large U.S. Advanced Research Projects Agency (ARPA)-sponsored WLAN research projects at the Universities of California at Berkeley (UCB) and at Los Angeles (UCLA). The UCB Infopad project is based on a coordinated network architecture with fixed coordinating nodes and direct-sequence spread spectrum (CDMA), whereas the UCLA project is aimed at peer-to-peer networks and uses frequency hopping. Both ARPA-sponsored projects are concentrated on the 900-MHz ISM band.

As computers shrink in size from desktop to laptop to palmtop, mobility in data network access is becoming more important to the user. This fact, coupled with the availability of more usable frequency spectrum, and perhaps some progress on standards, may speed the evolution and adoption of wireless mobile access to WLANs. From the large number of companies making products, it is obvious that many believe in the future of this market.
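The listen-before-transmit etiquette mentioned above for the 1910–1930 MHz unlicensed band can be illustrated with a generic sketch. This is not the FCC etiquette itself: the threshold of -85 dBm, the number of attempts, and the random back-off rule are hypothetical values chosen only to show the idea of sensing the channel and deferring when it is busy.

```python
import random


def listen_before_transmit(sense_dbm, threshold_dbm=-85.0, max_attempts=5, rng=None):
    """Generic listen-before-transmit loop (illustrative assumptions only).

    sense_dbm: callable returning the measured channel power in dBm.
    Returns (attempt_index, backoffs): the attempt on which the channel was
    found idle (None if it never was), and the back-off slot counts waited.
    """
    rng = rng or random.Random(0)
    backoffs = []
    for attempt in range(max_attempts):
        if sense_dbm() < threshold_dbm:
            # Channel is idle: the terminal may transmit now.
            return attempt, backoffs
        # Channel is busy: defer for a random number of slots,
        # widening the window after each failed listen.
        backoffs.append(rng.randint(1, 2 ** (attempt + 1)))
    return None, backoffs


# Usage: the channel is busy on the first two listens and idle on the third.
readings = iter([-60.0, -70.0, -95.0])
attempt, backoffs = listen_before_transmit(lambda: next(readings))
print(attempt, len(backoffs))
```

The sensing function is passed in as a callable so the same loop can be exercised with recorded measurements or with a simulated channel.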
{ It should be noted that the objective for 10 Mb/s data service with widespread coverage from a sparse distribution of widely separated base stations equivalent to cellular is unrealistic and unrealizable. This can be readily seen by considering a simple example. Consider a cellular coverage area that requires full cellular power of 0.5 watt to cover from a handset. Consider the handset to use a typical digital cellular bit rate of about 10 kb/s (perhaps 8 kb/s speech coding + overhead). With all else in the system the same, e.g., antennas, antenna height, receiver noise figure, detection sensitivity, etc., the 10 Mb/s data would require 10 Mb/s ÷ 10 kb/s

TABLE 15.3 Partial List of WLAN Products

Product

No. of chan.

Company

Freq.,

Link Rate,

or Spread

Location

MHz

Mb/s

User Rate

Protocol(s)

Mod./

Factor

Coding

Power, mW

Altair Plus Motorola Arlington Hts, IL

18–19 GHz

15

5.7 Mb/s

Ethernet

Topology

4-level FSK

25 peak

Eight devices/ radio; radio to base to ethernet

WaveLAN NCR/AT&T Dayton, OH

902–928

2

1.6 Mb/s

Ethernet-like

DS SS

DQPSK

250

Peer-to-peer

AirLan Solectek San Diego, CA

902–928

2 Mb/s

Ethernet

DS SS

DQPSK

250

PCMCIA w/ant.; radio to hub

Freeport Windata Inc. Northboro, MA

902–928

5.7 Mb/s

Ethernet

DS SS

16 PSK trellis coding

650

Hub

Intersect Persoft Inc. Madison, WI

902–928

2 Mb/s

Ethernet token ring

DS SS

DQPSK

250

Hub

LAWN O’Neill Comm. Horsham, PA

902–928

38.4 kb/s

AX.25

SS

20

Peer-to-peer

WILAN Wi-LAN Inc. Calgary, Alberta

902–928

30

Peer-to-peer

RadioPort ALPS Electric USA

100

Peer-to-peer

1W max

PCs with ant.; radio to hub

16

Access

32 chips/bit

20

Network

users/chan.; max. 4 chan. 20

1.5 Mb/s/ chan.

Ethernet, token ring

CDMA/ TDMA

3 chan. 10–15 links each

902–928

242 kb/s

Ethernet

SS

7/3 channels

ArLAN 600 Telesys. SLW Don Mills, Ont.

902–928; 2.4 GHz

1.35 Mb/s

Ethernet

SS

Radio Link Cal. Microwave Sunnyvale, CA

902–928; 2.4 GHz

Range LAN Proxim, Inc. Mountain View, CA

902–928

RangeLAN 2 Proxim, Inc. Mountain View, CA

2.4 GHz

1.6

Netwave Xircom Calabasas, CA

2.4 GHz

1/adaptor

Freelink Cabletron Sys. Rochester, NH

2.4 and 5.8 GHz

250 kb/s

64 kb/s

FH SS

250 ms/hop 500 kHz space

unconventional

Hub

242 kb/s

Ethernet, token ring

DS SS

3 chan.

100

50 kb/s max.

Ethernet, token ring

FH SS

10 chan. at 5 kb/s; 15 sub-ch. each

100

Ethernet, token ring

FH SS

82 1-MHz chn. or “hops”

Ethernet

DS SS

32 chips/bit

5.7 Mb/s

Peer-to-peer bridge Hub

16 PSK trellis coding

100

Hub

= 1000 times as much power as the 10 kb/s cellular. Thus, it would require 0.5 × 1000 = 500 watts for the wireless data transmitter. This is a totally unrealistic situation. If the data system operates at a higher frequency (e.g., 5 GHz) than the cellular system (e.g., 1 or 2 GHz), then even more power will be required to overcome the additional loss at the higher frequency. The sometimes expressed desire by the wireless data community for a system to provide network access to users in and around buildings and to provide 10 Mb/s over 10 miles with 10 milliwatts of transmitter power and costing $10.00 is totally impossible. It requires violation of the “laws of physics.” }
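The arithmetic of this example can be checked with a short sketch. It assumes, as the text does, that everything except the bit rate is held fixed, so that required transmitter power scales linearly with bit rate. The optional frequency term is an added assumption, not taken from the text: it uses the free-space result that loss between fixed-gain antennas grows as the square of the frequency. The function name is illustrative.

```python
import math


def scaled_tx_power_w(ref_power_w, ref_rate_bps, new_rate_bps,
                      ref_freq_hz=None, new_freq_hz=None):
    """Scale a reference transmit power to a new bit rate (same range, same receiver).

    Assumption: required power is proportional to bit rate. If both frequencies
    are given, a free-space, fixed-antenna-gain f^2 penalty is also applied.
    """
    scale = new_rate_bps / ref_rate_bps
    if ref_freq_hz is not None and new_freq_hz is not None:
        scale *= (new_freq_hz / ref_freq_hz) ** 2
    return ref_power_w * scale


# 0.5 W at 10 kb/s scaled to 10 Mb/s, as in the text.
same_band_w = scaled_tx_power_w(0.5, 10e3, 10e6)

# Same link moved from an assumed 2 GHz to an assumed 5 GHz.
higher_band_w = scaled_tx_power_w(0.5, 10e3, 10e6, ref_freq_hz=2e9, new_freq_hz=5e9)

# The bit-rate penalty alone, expressed in decibels.
rate_penalty_db = 10 * math.log10(10e6 / 10e3)

print(same_band_w, higher_band_w, rate_penalty_db)
```

Under these assumptions the 1000-fold rate increase is a 30 dB penalty, and moving up in frequency only makes the required power larger.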

15.3.5

Paging/Messaging Systems

Radio paging began many years ago as a one-bit messaging system. The one bit was: someone wants to communicate with you. More generally, paging can be categorized as one-way messaging over wide areas. The one-way radio link is optimized to take advantage of the asymmetry. High transmitter power (hundreds of watts to kilowatts) and high antennas at the fixed base stations permit low-complexity, very low-power-consumption, pocket paging receivers that provide long usage time from small batteries. This combination provides the large radio-link margins needed to penetrate walls of buildings without burdening the user set battery. Paging has experienced steady rapid growth for many years and serves about 15 million subscribers in the United States.

Paging also has evolved in several different directions. It has changed from analog tone coding for user identification to digitally encoded messages. It has evolved from the 1-bit message, someone wants you, to multibit messages from, first, the calling party’s telephone number to, now, short e-mail text messages. This evolution is noted in Fig. 15.1. The region over which a page is transmitted has also increased from 1) local, around one transmitting antenna; to 2) regional, from multiple widely-dispersed antennas; to 3) nationwide, from large networks of interconnected paging transmitters. The integration of paging with CT-2 user sets for phone-point call alerting was noted previously.

Another evolutionary paging route sometimes proposed is two-way paging. This is an ambiguous and unrealizable concept, however, since the requirement for two-way communications destroys the asymmetrical link advantage so well exploited by paging. Two-way paging puts a transmitter in the user’s set and brings along with it all of the design compromises that must be faced in such a two-way radio system. Thus, the word paging is not appropriate to describe a system that provides two-way communications.
{ The two-way paging situation is as unrealistic as that noted earlier for wide-area, high-speed, low-power wireless data. This can be seen by looking at the asymmetry situation in paging. In order to achieve comparable coverage uplink and downlink, a 500-watt paging transmitter downlink advantage must be overcome in the uplink. Even considering the relatively high cellular handset transmit power levels on the order of 0.5 watt results in a factor of 1000 disadvantage, and 0.5 watt is completely incompatible with the low power paging receiver power requirements. If the same uplink and downlink coverage is required for an equivalent set of system parameters, then the only variable left to work with is bandwidth. If the paging link bit rate is taken to be 10 kb/s (much higher than many paging systems), then the usable uplink rate is 10 kb/s ÷ 1000 = 10 b/s, an unusably low rate. Of course, some uplink benefit can be gained because of better base station receiver noise figure and by using forward error correction and perhaps ARQ. However, this is unlikely to raise the allowable rate to greater than 100 b/s, which, even though likely overoptimistic, is still unrealistically low, and we have assumed an unrealistically high transmit power in the two-way “pager!” }
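The uplink estimate above follows from a single proportionality, sketched here under the text's own assumption that all system parameters other than transmit power are equal in both directions, so that the usable bit rate scales with transmit power. The function name is illustrative.

```python
def usable_uplink_rate_bps(downlink_rate_bps, downlink_power_w, uplink_power_w):
    """Uplink rate giving the same coverage as the downlink.

    Assumption: all else equal, usable bit rate is proportional to transmit power.
    """
    return downlink_rate_bps * (uplink_power_w / downlink_power_w)


# 500 W paging transmitter versus a 0.5 W handset-class transmitter.
power_ratio = 500.0 / 0.5
uplink_bps = usable_uplink_rate_bps(10e3, 500.0, 0.5)
print(power_ratio, uplink_bps)
```

With these numbers the downlink power advantage is a factor of 1000, and a 10 kb/s downlink corresponds to about 10 b/s on the uplink, before any allowance for receiver noise figure or coding gains.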

15.3.6

Satellite-Based Mobile Systems

Satellite-based mobile systems are the epitome of wide-area coverage, expensive base station systems. They generally can be categorized as providing two-way (or one-way) limited quality voice and/or very limited data or messaging to very wide-ranging vehicles (or fixed locations). These systems can provide very widespread, often global, coverage, e.g., to ships at sea by INMARSAT. There are a few messaging systems in operation, e.g., to trucks on highways in the United States by Qualcomm’s Omnitracs system. A few large-scale mobile satellite systems have been proposed and are being pursued: perhaps the best known is Motorola’s Iridium; others include Odyssey, Globalstar, and Teledesic. The strength of satellite systems is their ability to provide large regional or global coverage to users outside buildings. However, it is very difficult to provide adequate link margin to cover inside buildings, or even to cover locations shadowed by buildings, trees, or mountains. A satellite system’s weakness is also its large coverage area. It is very difficult to provide from Earth orbit the small coverage cells that are necessary for providing high overall system capacity from frequency reuse. This fact, coupled with the high cost of the orbital base stations, results in low capacity and expensive service along with the wide overall coverage. Thus, satellite systems are not likely to compete favorably with terrestrial systems in populated areas or even along well-traveled highways. They can complement terrestrial cellular or PCS systems in low-population-density areas. It remains to be seen whether there will be enough users with enough money in low-population-density regions of the world to make satellite mobile systems economically viable. { Some of the mobile satellite systems have been withdrawn, e.g., Odyssey. Some satellites in the Iridium and Globalstar systems have been launched. The industry will soon find out whether these systems are economically viable.
} Proposed satellite systems range from 1) low-Earth-orbit systems (LEOS) having tens to hundreds of satellites through 2) intermediate- or medium-height systems (MEOS) to 3) geostationary or geosynchronous orbit systems (GEOS) having fewer than ten satellites. LEOS require more, but less expensive, satellites to cover the Earth, but they can more easily produce smaller coverage areas and, thus, provide higher capacity within a given spectrum allocation. Also, their transmission delay is significantly less (perhaps two orders of magnitude!), providing higher quality voice links, as discussed previously. On the other hand, GEOS require only a few, somewhat more expensive, satellites (perhaps only three) and are likely to provide lower capacity within a given spectrum allocation and suffer severe transmission-delay impairment on the order of 0.5 s. Of course, MEOS fall in between these extremes. The possible evolution of satellite systems to complement high-tier PCS is indicated in Fig. 15.1.
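The delay figures quoted above can be approximated from geometry alone. The sketch below considers propagation delay only, with the satellite directly overhead, and ignores processing, coding, and inter-satellite routing delays. The geostationary altitude is the standard value; the 780 km LEO altitude is an assumed example, not a figure from the text.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458
GEO_ALTITUDE_KM = 35_786.0   # standard geostationary altitude
LEO_ALTITUDE_KM = 780.0      # assumed example LEO altitude


def hop_delay_s(altitude_km):
    """Ground -> satellite -> ground propagation delay, satellite overhead."""
    return 2.0 * altitude_km / SPEED_OF_LIGHT_KM_S


# A conversational round trip crosses the satellite hop twice.
geo_round_trip_s = 2.0 * hop_delay_s(GEO_ALTITUDE_KM)
leo_round_trip_s = 2.0 * hop_delay_s(LEO_ALTITUDE_KM)
delay_ratio = geo_round_trip_s / leo_round_trip_s

print(round(geo_round_trip_s, 3), round(leo_round_trip_s, 4), round(delay_ratio, 1))
```

The geostationary round trip comes out just under 0.5 s, in line with the figure in the text. The ratio to the LEO delay is simply the ratio of altitudes, so it depends strongly on the orbit chosen and grows for lower orbits.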

15.3.7

{Fixed Point-to-Multipoint Wireless Loops

Wideband point-to-multipoint wireless loop technologies were sometimes referred to earlier as “wireless cable” when they were proposed as an approach for providing interactive video services to homes [30]. However, as the video application started to appear less attractive, the application emphasis shifted to providing wideband data access for the Internet, the World Wide Web, and future wideband data networks. Potentially lower costs are the motivation for this wireless application. As such, these technologies will have to compete with existing coaxial cable and fiber/coax distribution by CATV companies, with satellites, and with fiber and fiber/coax systems being installed or proposed by telephone companies and other entities [30]. Another competitor is asymmetric digital subscriber line technology, which uses advanced digital signal processing to provide high-bandwidth digital distribution over twisted copper wire pairs.

In the U.S. two widely different frequency bands are being pursued for fixed point-to-multipoint wireless loops. These bands are at 28 GHz for local multipoint distribution systems or services (LMDS) [52] and 2.5 to 2.7 GHz for microwave or metropolitan distribution systems (MMDS) [74]. The goal of low-cost fixed wireless loops is based on the low cost of point-to-multipoint line-of-sight wireless technology. However, significant challenges are presented by the inevitable blockage by trees, terrain, and houses, and by buildings in heavily built-up residential areas. Attenuation in rainstorms presents an additional problem at 28 GHz in some localities. Even at the 2.5-GHz MMDS frequencies, the large bandwidth required for distribution of many video channels presents a challenge to provide adequate radio-link margin over obstructed paths. From mobile satellite investigations it is known that trees can often produce over 15 dB additional path attenuation [38]. Studies of blockage by buildings in cities have shown that it is difficult to have line-of-sight access to more than 60% of the buildings from a single base station [55]. Measurements in a region in Brooklyn, NY [60], suggest that access from a single base station can range from 25% to 85% for subscriber antenna heights of 10 to 35 ft and a base station height of about 290 ft. While less blockage by houses could be expected in residential areas, such numbers would suggest that greater than 90% access to houses could be difficult, even from multiple elevated locations, when mixes of one- and two-story houses, trees, and hills are present. In regions where tree cover is heavy, e.g., the eastern half of the U.S., tree cover in many places will present a significant obstacle. Heavy rainfall is an additional problem at 28 GHz in some regions. In spite of these challenges, the lure of low-cost wireless loops is attracting many participants, both service providers and equipment manufacturers. }
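To put the 15 dB tree attenuation quoted above into linear terms, a decibel value can be converted to a power ratio as in the following minimal sketch. The conversion is standard; reading the result as the extra transmit power needed, with everything else held fixed, is an illustrative interpretation.

```python
def db_to_power_ratio(db):
    """Convert a decibel value to a linear power ratio."""
    return 10.0 ** (db / 10.0)


tree_loss_db = 15.0
extra_power_factor = db_to_power_ratio(tree_loss_db)
print(round(extra_power_factor, 1))  # 31.6
```

In other words, a path through foliage with 15 dB of added loss needs roughly thirty times the link margin of the same path in the clear.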

15.3.8

Reality Check

Before we go on to consider other applications and compromises, perhaps it would be helpful to see if there is any indication that the previous discussion is valid. For this check, we could look at cordless telephones for telepoint use (i.e., pocketphones) and at pocket cellular telephones that existed in the 1993 time frame. Two products from one United States manufacturer are good for this comparison. One is a third-generation hand-portable analog FM cellular phone from this manufacturer that represents their second generation of pocketphones. The other is a first-generation digital cordless phone built to the United Kingdom CT-2 common air interface (CAI) standard. Both units are of flip phone type with the earpiece on the main handset body and the mouthpiece formed by or on the flip-down part. Both operate near 900 MHz and have 1/4 wavelength pull-out antennas. Both are fully functional within their class of operation (i.e., full number of U.S. cellular channels, full number of CT-2 channels, automatic channel setup, etc.). Table 15.4 compares characteristics of these two wireless access pocketphones from the same manufacturer. The following are the most important items to note in the Table 15.4 comparison.
1. The talk time of the low-power pocketphone is four times that of the high-power pocketphone.
2. The battery inside the low-power pocketphone is about one-half the weight and size of the battery attached to the high-power pocketphone.
3. The battery-usage ratio, talk time/weight of battery, is eight times greater, almost an order of magnitude, for the low-power pocketphone compared to the high-power pocketphone!
4. Additionally, the lower power (5 mW) digital cordless pocketphone is slightly smaller and lighter than the high-power (500 mW) analog FM cellular mobile pocketphone.
{ Similar comparisons can be made between PHS advanced cordless/low-tier PCS phones and advanced cellular/high-tier PCS pocketphones.
New lithium batteries have permitted increased talk time in pocketphones. Digital control/paging channels facilitate significantly extended standby time. Advances in solid-state circuits have reduced the size and weight of cellular pocketphone electronics so that they are almost insignificant compared to the battery required for the high power transmitter and complex digital signal processing. However, even with all these changes, there is still a very significant weight and talk time benefit in the low complexity PHS handsets compared to the most advanced cellular/high-tier PCS handsets. Picking typical minimum size and weight handsets for both technologies results in the following comparisons.

                              PHS          Cellular
weight, oz (total unit)       3            4.5
size (total unit)             4.2 in³      —
talk-time, h                  8            3
standby time, h               600          48

From the table, the battery usage ratio has been reduced to a factor of about 4 from a factor of 8, but this is based on total weight, not battery weight alone as used for the earlier CT-2 and cellular comparison. Thus, there is still a large talk time and weight benefit for low-power low-complexity low-tier PCS compared to higher power high-complexity, high tier PCS and cellular. }

TABLE 15.4 Comparison of CT-2 and Cellular Pocket Size Flip-Phones from the Same Manufacturer

Characteristics/Parameter       CT-2                           Cellular
Weight, oz
  Flip phone only               5.2                            4.2
  Battery1 only                 1.9                            3.6
  Total unit                    7.1                            7.8
Size (max. dimensions), in
  Flip phone only               5.9 × 2.2 × 0.95 (8.5 in³)     5.5 × 2.4 × 0.9 (—)
  Battery1 only                 1.9 × 1.3 × 0.5 (internal)     4.7 × 2.3 × 0.4 (external)
  Total unit                    5.9 × 2.2 × 0.95 (8.5 in³)     5.9 × 2.4 × 1.1 (11.6 in³)
Talk-time, min (h)
  Rechargeable battery2         180 (3)                        45
  Nonrechargeable battery       600 (10)                       N/A
Standby time, h
  Rechargeable battery          30                             8
  Nonrechargeable battery       100                            N/A
Speech quality                  32 kb/s telephone quality      30 kHz FM depends on channel quality
Transmit power avg., W          0.005                          0.5

1 Rechargeable battery. 2 Ni-cad battery.

The following should also be noted.
1. The room for technology improvement of the CT-2 cordless phone is greater since it is first generation and the cellular phone is second/third generation.
2. A digital cellular phone built to the IS-54, GSM, or JDC standard, or in the proposed United States CDMA technology, would either have less talk time or be heavier and larger than the analog FM phone, because:
a) the low-bit-rate digital speech coder is more complex and will consume more power than the analog speech processing circuits;
b) the digital units have complex digital signal-processing circuits for forward error correction, either for delay dispersion equalizing or for spread-spectrum processing, that will consume significant amounts of power and that have no equivalents in the analog FM unit; and
c) power amplifiers for the shaped-pulse nonconstant-envelope digital signals will be less efficient than the amplifiers for constant-envelope analog FM.
Although it may be suggested that transmitter power control will reduce the weight and size of a CDMA handset and battery, if that handset is to be capable of operating at full power in fringe areas, it will have to have capabilities similar to other cellular sets. Similar power control applied to a CT-2-like low-maximum-power set would also reduce its power consumption and thus also its weight and size.

The major difference in size, weight, and talk time between the two pocketphones is directly attributable to the two orders of magnitude difference in average transmitter power. The generation of transmitter power dominates power consumption in the analog cellular phone. Power consumption in the digital CT-2 phone is more evenly divided between transmitter-power generation and digital signal processing. Therefore, power consumption in complex digital signal processing would have more impact on talk time in small low-power personal communicators than in cellular handsets where the transmitter-power generation is so large.
Other than reducing power consumption for both functions, the only alternative for increasing talk time and reducing battery weight is to invent new battery technology having greater energy density; see section on Other Issues later in this chapter. In contrast, lowering the transmitter power requirement, modestly applying digital signal processing, and shifting some of the radio coverage burden to a higher density of small, low-power, low-complexity, low-cost fixed radio ports has the effect of shifting some of the talk time, weight, and cost constraints from battery technology to solid state electronics technology, which continues to experience orders-of-magnitude improvements in the span of several years. Digital signal-processing complexity, however, cannot be permitted to overwhelm power consumption in low-power handsets; whereas small differences in complexity will not matter much, orders-of-magnitude differences in complexity will continue to be significant.

Thus, it can be seen from Table 15.4 that the size, weight, and quality arguments in the preceding sections generally hold for these examples. It also is evident from the preceding paragraphs that they will be even more notable when comparing digital cordless pocketphones with digital cellular pocketphones of the same development generations.
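The ratios cited in this section can be reproduced directly from the tabulated values. The sketch below uses the rechargeable-battery talk times and battery weights from Table 15.4, and the total-unit weights from the PHS comparison; the class and variable names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Handset:
    name: str
    talk_time_min: float
    weight_oz: float  # battery weight (Table 15.4) or total weight (PHS comparison)

    @property
    def usage_ratio(self) -> float:
        """Talk time per ounce, the 'battery-usage ratio' of the text."""
        return self.talk_time_min / self.weight_oz


# Table 15.4: rechargeable-battery talk time and battery-only weight.
ct2 = Handset("CT-2", 180.0, 1.9)
cellular = Handset("Analog FM cellular", 45.0, 3.6)

talk_time_gain = ct2.talk_time_min / cellular.talk_time_min
battery_weight_fraction = ct2.weight_oz / cellular.weight_oz
usage_gain = ct2.usage_ratio / cellular.usage_ratio

# Later comparison: talk time in hours converted to minutes, total-unit weight.
phs = Handset("PHS", 8 * 60.0, 3.0)
high_tier = Handset("Cellular/high-tier PCS", 3 * 60.0, 4.5)
phs_usage_gain = phs.usage_ratio / high_tier.usage_ratio

print(talk_time_gain, round(battery_weight_fraction, 2),
      round(usage_gain, 1), phs_usage_gain)
```

The results match the text: four times the talk time, a battery of about half the weight, a usage ratio close to eight for CT-2 versus analog cellular, and a factor of about four for PHS versus cellular on a total-weight basis.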

15.4

Evolution Toward the Future and to Low-Tier Personal Communications Services

After looking at the evolution of several wireless technologies and systems in the preceding sections it appears appropriate to ask again: wireless personal communications, What is it? All of the technologies in the preceding sections claim to provide wireless personal communications, and all do to some extent. All have significant limitations, however, and all are evolving in attempts to overcome the limitations. It seems appropriate to ask, what are the likely endpoints? Perhaps some hint of the endpoints can be found by exploring what users see as limitations of existing technologies and systems and by looking at the evolutionary trends. In order to do so, we summarize some important clues from the preceding sections and project them, along with some U.S. standards activity, toward the future.

Digital Cordless Telephones

• Strengths: good circuit quality; long talk time; small lightweight battery; low-cost sets and service.
• Limitations: limited range; limited usage regions.
• Evolutionary trends: phone points in public places; wireless PBX in business.
• Remaining limitations and issues: limited usage regions and coverage holes; limited or no handoff; limited range. { Experience with PHS and CT-2 phone point has provided more emphasis on the need for vehicle speed handoff and continuous widespread coverage of populated areas and of highways in between. }

Digital Cellular Pocket Handsets

• Strength: widespread service availability.
• Limitations: limited talk time; large heavy batteries; high-cost sets and service; marginal circuit quality; holes in coverage and poor in-building coverage; limited data capabilities; complex technologies.
• Evolutionary trends: microcells to increase capacity and in-building coverage and to reduce battery drain; satellite systems to extend coverage.
• Remaining limitations and issues: limited talk time and large battery; marginal circuit quality; complex technologies.

Wide Area Data

• Strength: digital messages.
• Limitations: no voice, limited data rate; high cost.
• Evolutionary trends: microcells to increase capacity and reduce cost; share facilities with voice systems to reduce cost.
• Remaining limitations and issues: no voice; limited capacity.

Wireless Local Area Networks (WLANs)

• Strength: high data rate.
• Limitations: insufficient capacity for voice, limited coverage; no standards; chaos.
• Evolutionary trends: hard to discern from all of the churning.

Paging/messaging

• Strengths: widespread coverage; long battery life; small lightweight sets and batteries; economical.
• Limitations: one-way message only; limited capacity.
• Evolutionary desire: two-way messaging and/or voice; capacity.
• Limitations and issues: two-way link cannot exploit the advantages of one-way link asymmetry.

{Fixed Wireless Loops

• Strength: High data rates.
• Limitations: no mobility. }

There is a strong trajectory evident in these systems and technologies aimed at providing the following features.

High Quality Voice and Data

• To small, lightweight, pocket carried communicators.
• Having small lightweight batteries.
• Having long talk time and long standby battery life.
• Providing service over large coverage regions.
• For pedestrians in populated areas (but not requiring high population density).
• Including low to moderate speed mobility with handoff. { It has become evident from the experience with PHS and CT-2 phone point that vehicle speed handoff is essential so that handsets can be used in vehicles also. }

Economical Service

• Low subscriber-set cost.
• Low network-service cost.

Privacy and Security of Communications

• Encrypted radio links.

This trajectory is evident in all of the evolving technologies but can only be partially satisfied by any of the existing and evolving systems and technologies! Trajectories from all of the evolving technologies and systems are illustrated in Fig. 15.1 as being aimed at low-tier personal communications systems or services, i.e., low-tier PCS. Taking characteristics from cordless, cellular, wide-area data and, at least moderate-rate, WLANs, suggests the following attributes for this low-tier PCS.

1. 32 kb/s ADPCM speech encoding in the near future to take advantage of the low complexity and low power consumption, and to provide low-delay high-quality speech.
2. Flexible radio link architecture that will support multiple data rates from several kilobits per second. This is needed to permit evolution in the future to lower bit rate speech as technology improvements permit high quality without excessive power consumption or transmission delay and to provide multiple data rates for data transmission and messaging.
3. Low transmitter power (≤ 25 mW average) with adaptive power control to maximize talk time and data transmission time. This incurs short radio range that requires many base stations to cover a large region. Thus, base stations must be small and inexpensive, like cordless telephone phone points or the Metricom wireless data base stations. { The lower power will require somewhat closer spacing of base stations in cluttered environments with many buildings, etc. This issue is dealt with in more detail in Section 15.5. The issues associated with greater base station spacing along highways are also considered in Section 15.5. }
4. Low-complexity signal processing to minimize power consumption. Complexity one-tenth that of digital cellular or high-tier PCS technologies is required [29]. With only several tens of milliwatts (or less under power control) required for transmitter power, signal processing power becomes significant.
5. Low cochannel interference and high coverage area design criteria. In order to provide high-quality service over a large region, at least 99% of any covered area must receive good or better coverage and be below acceptable cochannel interference limits. This implies less than 1% of a region will receive marginal service. This is an order-of-magnitude higher service requirement than the 10% of a region permitted to receive marginal service in vehicular cellular system (high-tier PCS) design criteria.
6. Four-level phase modulation with coherent detection to maximize radio link performance and capacity with low complexity.
7. Frequency division duplexing to relax the requirement for synchronizing base station transmissions over a large region. { PHS uses time division duplexing and requires base station synchronization. In first deployments, one provider did not implement this synchronization. The expected serious performance degradation prompted system upgrades to provide the needed synchronization. While this is not a big issue, it does add complexity to the system and decreases the overall robustness. }
8. { As noted previously, experience with PHS and CT-2 phone point has emphasized the need for vehicular speed handoff in these low-tier PCS systems. Such handoff is readily implemented in PACS and has been demonstrated in the field [51]. This issue is discussed in more detail later in this section. }

Such technologies and systems have been designed, prototyped, and laboratory and field tested and evaluated for several years [7, 23, 24, 25, 26, 27, 28, 29, 31, 32, 50]. The viewpoint expressed here is consistent with the progress in the Joint Technical Committee (JTC) of the U.S. standards bodies, Telecommunications Industry Association (TIA) and Committee T1 of the Alliance for Telecommunications Industry Solutions (ATIS). Many technologies and systems were submitted to the JTC for consideration for wireless PCS in the new 1.9-GHz frequency bands for use in the United States [20]. Essentially all of the technologies and systems listed in Table 15.1, and some others, were submitted in late 1993. It was evident that there were at least two and perhaps three distinctly different classes of submissions. No systems optimized for packet data were submitted, but some of the technologies are optimized for voice.

One class of submissions was the group labeled high-power systems, digital cellular (high-tier PCS) in Table 15.1. These are the technologies discussed previously in this chapter. They are highly optimized for low-bit-rate voice and, therefore, have somewhat limited capability for serving packet-data applications. Since it is clear that wireless services to wide ranging high-speed mobiles will continue to be needed, and that the technology already described for low-tier PCS may not be optimum for such services, Fig. 15.1 shows a continuing evolution and need in the future for high-tier PCS systems that are the equivalent of today’s cellular radio. There are more than 100 million vehicles in the United States alone. In the future, most, if not all, of these will be equipped with high-tier cellular mobile phones. Therefore, there will be a continuing and rapidly expanding market for high-tier systems.
Another class of submissions to the JTC [20] included the Japanese personal handyphone system (PHS) and a technology and system originally developed at Bellcore but carried forward to prototypes and submitted to the JTC by Motorola and Hughes Network Systems. This system was known as wireless access communications systems (WACS).² These two submissions were so similar in their design objectives and system characteristics that, with the agreement of the delegations from Japan and the United States, the PHS and WACS submissions were combined under a new name, personal access communication systems (PACS), that was to incorporate the best features of both. This advanced, low-power wireless access system, PACS, was to be known as low-tier PCS. Both WACS/PACS and Handyphone (PHS) are shown in Table 15.1 as low-tier PCS and represent the evolution to low-tier PCS in Fig. 15.1. The WACS/PACS/UDPC system and technology are discussed in [7, 23, 24, 25, 26, 28, 29, 31, 32, 50]. In the JTC, submissions for PCS of DECT and CT-2 and their variations were also lumped under the class of low-tier PCS, even though these advanced digital cordless telephone technologies were somewhat more limited in their ability to serve all of the low-tier PCS needs. They are included under digital cordless technologies in Table 15.1. Other technologies and systems were also submitted to the JTC for high-tier and low-tier applications, but they have not received widespread industry support.

One wireless access application discussed earlier that is not addressed by either high-tier or low-tier PCS is the high-speed WLAN application. Specialized high-speed WLANs also are likely to find a place in the future. Therefore, their evolution is also continued in Fig. 15.1. The figure also recognizes that widespread low-tier PCS can support data at several hundred kilobits per second and, thus, can satisfy many of the needs of WLAN users. It is not clear what the future roles are for paging/messaging, cordless telephone appliances, or wide-area packet-data networks in an environment with widespread contiguous coverage by low-tier and high-tier PCS. Thus, their extensions into the future are indicated with a question mark in Fig. 15.1.
Those who may object to the separation of wireless PCS into high-tier and low-tier should review this section again, and note that we have two tiers of PCS now. On the voice side there is cellular radio, i.e., high-tier PCS, and cordless telephone, i.e., an early form of low-tier PCS. On the data side there is wide-area data, i.e., high-tier data PCS, and WLANs, i.e., perhaps a form of low-tier data PCS. In their evolutions, these all have the trajectories discussed and shown in Fig. 15.1 that point surely toward low-tier PCS. It is this low-tier PCS that marketing studies continue to project is wanted by more than half of the U.S. households or by half of the people, a potential market of over 100 million subscribers in the United States alone. Similar projections have been made worldwide. { PACS technology [6] has been prototyped by several manufacturers. In 1995 field demonstrations were run in Boulder, CO at a U.S. West test site using radio ports (base stations) and radio port control units made by NEC. “Handset” prototypes made by Motorola and Panasonic were trialed. The handsets and ports were brought together for the first time in Boulder. The highly successful trial demonstrated the ease of integrating the subsystems of the low-complexity PACS technology and the overall advantages of PACS from a user’s perspective as noted throughout this chapter. Effective vehicular speed operation was demonstrated in these tests. Also, Hughes Network Systems (HNS) has developed and tested many sets of PACS infrastructure technology with different handsets in several settings and has many times demonstrated highly reliable high vehicular speed (in excess of 70 mi/hr) operation and handoff among several radio ports. Motorola also has demonstrated PACS equipment in several settings at vehicular speeds as well as for wireless loop applications. 
Highly successful demonstrations of PACS prototypes have been conducted from Alaska to Florida, from New York to California, and in China and elsewhere. A PACS deployment in China using NEC equipment started to provide service in 1998. The U.S. service provider, 21st Century Telesis, is poised to begin a PACS deployment in several states in the U.S. using infrastructure equipment from HNS and handsets and switching equipment from different suppliers. Perhaps, with a little more support of these deployments, the public will finally be able to obtain the benefits of low-tier PCS. }

² WACS was known previously as Universal Digital Portable Communications (UDPC).

15.5 Comparisons with Other Technologies

15.5.1 Complexity/Coverage Area Comparisons

Experimental research prototypes of radio ports and subscriber sets [64, 66] have been constructed to demonstrate the technical feasibility of the radio link requirements in [7]. These WACS prototypes generally have the characteristics and parameters previously noted, with the exceptions that 1) the portable transmitter power is lower (10 mW average, 100 mW peak), 2) dynamic power control and automatic time slot transfer are not implemented, and 3) a rudimentary automatic link-transfer implementation is based only on received power. The experimental base stations transmit near 2.17 GHz; the experimental subscriber sets transmit near 2.12 GHz. Both operated under a Bellcore experimental license. The experimental prototypes incorporate application-specific, very large-scale integrated circuits³ fabricated to demonstrate the feasibility of the low-complexity high-performance digital signal-processing techniques [63, 64] for symbol timing and coherent bit detection. These techniques permit the efficient short TDMA bursts having only 100 b that are necessary for low-delay TDMA implementations. Other digital signal-processing functions in the prototypes are implemented in programmable logic devices. All of the digital signal-processing functions combined require about 1/10 of the logic gates that are required for digital signal processing in vehicular digital cellular mobile implementations [42, 62, 63]; that is, this low-complexity PCS implementation, having no delay-dispersion-compensating circuits and no forward error-correction decoding, is about 1/10 as complex as the digital cellular implementations that include these functions.⁴ The 32 kb/s ADPCM speech encoding in the low-complexity PCS implementation is also about 1/10 as complex as the less than 10-kb/s speech encoding used in digital cellular implementations. This significantly lower complexity will continue to translate into lower power consumption and cost.
It is particularly important for low-power pocket personal communicators with power control in which the DC power expended for radio frequency transmitting can be only tens of milliwatts for significant lengths of time.

³ Application-specific integrated circuits (ASIC), very large-scale integration (VLSI).
⁴ Some indication of VLSI complexity can be seen by the number of people required to design the circuits. For the low-complexity TDMA ASIC set, only one person part time plus a student part time were required; the complex CDMA ASIC has six authors on the paper alone.

The experimental radio links have been tested in the laboratory for detection sensitivity [bit error rate (BER) vs SNR] [18, 61, 66] and for performance against cochannel interference [1] and intersymbol interference caused by multipath delay spread [66]. These laboratory tests confirm the performance of the radio-link techniques. In addition to the laboratory tests, qualitative tests have been made in several PCS environments to compare these experimental prototypes with several United States CT-1 cordless telephones at 50 MHz, with CT-2 cordless telephones at 900 MHz, and with DCT900 cordless telephones at 900 MHz. Some of these comparisons have been reported [8, 71, 84, 85]. In general, depending on the criteria, e.g., either no degradation or limited degradation of circuit quality, these WACS experimental prototypes covered areas inside buildings that ranged from 1.4 to 4 times the areas covered by the other technologies. The coverage areas for the experimental prototypes were always substantially limited in two or three directions by the outside walls of the buildings. These area factors could be expected to be even larger if the coverage were not limited by walls, i.e., once all of a building is covered in one direction, no more area can be covered no matter what the radio link margin. The earlier comparisons [8, 84, 85] were made with only two-branch uplink diversity, before subscriber-set transmitting antenna switching was implemented, and with only one radio port, before automatic radio-link transfer was implemented. The later tests [71] included these implementations. These reported comparisons agree with similar unreported comparisons made in a Bellcore Laboratory building. Similar coverage comparison results have been noted for a 900-MHz ISM-band cordless telephone compared to the 2-GHz experimental prototype. The area coverage factors (e.g., ×1.4 to ×4) could be expected to be even greater if the cordless technologies had also been operated at 2 GHz since attenuation inside buildings between similar small antennas is about 7 dB greater at 2 GHz than at 900 MHz [35, 36] and the 900 MHz handsets transmitted only 3 dB less average power than the 2-GHz experimental prototypes. The greater area coverage demonstrated for this technology is expected because of the different compromises noted earlier; in particular, the following.

1. Coherent detection of QAM provides more detection sensitivity than noncoherent detection of frequency-shift modulations [17].
2. Antenna diversity mitigates bursts of errors from multipath fading [66].
3. Error detection and blanking of TDMA bursts having errors significantly improves perceived speech quality [72]. (Undetected errors in the most significant bit cause sharp audio pops that seriously degrade perceived speech quality.)
4. Robust symbol timing and burst and frame synchronization reduce the number of frames in error due to imperfect timing and synchronization [66].
5. Transmitting more power from the radio port compared to the subscriber set offsets the less sensitive subscriber set receiver compared to the port receiver that results from power and complexity compromises made in a portable set.

Of course, as expected, the low-power (10-mW) radio links cover less area than high-power (0.5-W) cellular mobile pocketphone radio links because of the 17-dB transmitter power difference resulting from the compromises discussed previously. In the case of vehicle-mounted sets, even more radio-link advantage accrues to the mobile set because of the higher gain of vehicle-mounted antennas and higher transmitter power (3 W).

{ The power difference between a low-tier PACS handset and a high-tier PCS or cellular pocket handset is not as significant in limiting range as is often portrayed. Other differences in deployment scenarios for low-tier and high-tier systems are as large or even larger factors, e.g., base station antenna height and antenna gain. This can be seen by considering using the same antennas and receiver noise figures at base stations and looking at the range of high-tier and low-tier handsets. High-tier handsets typically transmit a maximum average power of 0.5 watt. The PACS handset average transmit power is 25 milliwatts (peak power is higher for TDMA, but equal comparisons can be made considering average power and equivalent receiver sensitivities). This power ratio of ×20 translates to a range reduction by a factor of about 0.5 for an environment with a distance dependence of 1/d^4, or a factor of about 0.4 for a 1/d^3.5 environment. These represent typical values of distance dependence for PCS and cellular environments.
Thus, if the high-tier handset would provide a range of 5 miles in some environment, the low-tier handset would provide a range of 2 to 2.5 miles in the same environment, if the base station antennas and receiver noise figures were the same. This difference in range is no greater than the difference in range between high-tier PCS handsets used in 1.9 GHz systems and cellular handsets with the same power used in 800 MHz systems. Range factors are discussed further in the next section. }
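The range arithmetic in the preceding paragraph can be checked with a short sketch. It assumes only what the text states: received power falls off as 1/d^n, and the base station antennas and receiver noise figures are identical for both handsets. The function and constant names are illustrative and do not come from the chapter.

```python
import math


def power_ratio_db(p_high_w: float, p_low_w: float) -> float:
    """Transmit power difference in dB between two average powers in watts."""
    return 10.0 * math.log10(p_high_w / p_low_w)


def range_factor(power_scale: float, path_loss_exponent: float) -> float:
    """Range multiplier when transmit power is scaled by power_scale.

    Assumes received power varies as 1/d**n and everything else is equal,
    so range scales as power_scale ** (1/n).
    """
    return power_scale ** (1.0 / path_loss_exponent)


HIGH_TIER_AVG_W = 0.5      # high-tier handset maximum average power (from the text)
PACS_AVG_W = 0.025         # PACS handset average power (from the text)
HIGH_TIER_RANGE_MI = 5.0   # illustrative high-tier range used in the text

print(f"power difference: {power_ratio_db(HIGH_TIER_AVG_W, PACS_AVG_W):.1f} dB")
for n in (4.0, 3.5):
    f = range_factor(PACS_AVG_W / HIGH_TIER_AVG_W, n)
    print(f"n = {n}: range factor {f:.2f}, "
          f"low-tier range {f * HIGH_TIER_RANGE_MI:.1f} mi")
```

Under these assumptions the factors come out near 0.47 and 0.42, which matches the rounded values of about 0.5 and 0.4 quoted above and the resulting 2 to 2.5 mile range.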

15.5.2 {Coverage, Range, Speed, and Environments

Interest has been expressed in having greater range for low-tier PCS technology for low-population-density areas. One should first note that the range of a wireless link is highly dependent on the amount of clutter or obstructions in the environment in which it is operated. For example, radio link calculations that result in a 1400-ft base station (radio-port) separation at 1.9 GHz contain over 50-dB margin for shadowing from obstructions and multipath effects [25, 37]. Thus, in an environment without obstructions, e.g., along a highway, the base station separation can be increased at least by a factor of 4 to over a mile, i.e., 25 dB for an attenuation characteristic of d^−4, while providing the same quality of service, without any changes to the base station or subscriber transceivers, and while still allowing over 25-dB margin for multipath and some shadowing. This remaining margin allows for operation of a handset inside an automobile. In such an unobstructed environment, multipath RMS delay spread [21, 33] will still be less than the 0.5 µs in which PACS was designed to operate [28].

Operation at still greater range along unobstructed highways or at a range of a mile along more obstructed streets can be obtained in several ways. Additional link gain of 6 dB can be obtained by going from omnidirectional antennas at base stations to 90° sectored antennas (four sectors). Another 6 dB can be obtained by raising base station antennas by a factor of 2 from 27 ft to 55 ft in height. This additional 12 dB will allow another factor of 2 increase in range to 2-mile base station separation along highways, or to about 3000-ft separation in residential areas. Even higher-gain and taller antennas could be used to concentrate coverage along highways, particularly in rural areas. Of course, range could be further increased by increasing the power transmitted.
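The decibel bookkeeping above can be reproduced with a minimal sketch, assuming a pure d^−4 attenuation law as in the text. The helper names are hypothetical; the 1400-ft separation and the two 6-dB gains are the figures quoted above.

```python
import math

FT_PER_MILE = 5280.0


def margin_db_for_range_factor(factor: float, n: float = 4.0) -> float:
    """Link margin in dB consumed by stretching range by `factor` under d**-n."""
    return 10.0 * n * math.log10(factor)


def range_factor_for_gain_db(gain_db: float, n: float = 4.0) -> float:
    """Range multiplier bought by `gain_db` of extra link gain under d**-n."""
    return 10.0 ** (gain_db / (10.0 * n))


BASE_SEPARATION_FT = 1400.0   # cluttered-environment separation (from the text)

# Trading margin for range along an unobstructed highway (factor of 4).
margin_used_db = margin_db_for_range_factor(4.0)
highway_ft = BASE_SEPARATION_FT * 4.0

# Sectored antennas (6 dB) plus doubled antenna height (6 dB).
extra_factor = range_factor_for_gain_db(6.0 + 6.0)

print(f"margin used: {margin_used_db:.1f} dB")
print(f"highway separation: {highway_ft / FT_PER_MILE:.2f} mi")
print(f"extra range factor from 12 dB: {extra_factor:.2f}")
print(f"highway with 12 dB more: {highway_ft * extra_factor / FT_PER_MILE:.2f} mi")
print(f"residential with 12 dB more: {BASE_SEPARATION_FT * extra_factor:.0f} ft")
```

A factor of 4 in range costs about 24 dB, which the text rounds to 25 dB, and 12 dB of added gain buys a range factor of almost exactly 2.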
As the range of the low-tier PACS technology is extended in cluttered areas by increasing link gain, increased RMS delay spread is likely to be encountered. This will require increasing complexity in receivers. A factor of 2 in tolerance of delay spread can be obtained by interference-canceling signal combining [76, 77, 78, 79, 80] from two antennas instead of the simpler selection diversity combining originally used in PACS. This will provide adequate delay-spread tolerance for most suburban environments [21, 33]. The PACS downlink contains synchronization words that could be used to train a conventional delay-spread equalizer in subscriber set receivers. Constant-modulus (blind) equalization will provide greater tolerance to delay spread in base station receivers on the uplink [45, 46, 47, 48] than can be obtained by interference-cancellation combining from only two antennas. The use of more base station antennas and receivers can also help mitigate uplink delay spread. Thus, with some added complexity, the low-tier PACS technology can work effectively in the RMS delay spreads expected in cluttered environments for base station separations of 2 miles or so.

The guard time in the PACS TDMA uplink is adequate for 1-mile range, i.e., 2-mile separation between base station and subscriber transceivers. A separation of up to 3 miles between transceivers could be allowed if some statistical outage were accepted for the few times when adjacent uplink timeslots are occupied by subscribers at the extremes of range (near–far). With some added complexity in assigning timeslots, the assignment of subscribers at very different ranges to adjacent timeslots could be avoided, and the base station separation could be increased to several miles without incurring adjacent slot interference.
A simple alternative in low-density (rural) areas, where lower capacity could be acceptable and greater range could be desirable, would be to use every other timeslot to ensure adequate guard time for range differences of many tens of miles. Also, the capability of transmitter time advance has been added to the PACS standard in order to increase the range of operation. Such time advance is applied in the cellular TDMA technologies.

The synchronization, carrier recovery, and detection in the low-complexity PACS transceivers will perform well at highway speeds. The two-receiver diversity used in uplink transceivers also will perform well at highway speeds. The performance of the single-receiver selection diversity used in the low-complexity PACS downlink transceivers begins to deteriorate at speeds above about 30 mi/h. However, at any speed, the performance is always at least as good as that of a single transceiver without the low-complexity diversity. Also, fading in the relatively uncluttered environment of a highway is likely to have a less severe Ricean distribution, so diversity will be less needed for mitigating the fading. Cellular handsets do not have diversity. Of course, more complex two-receiver diversity could be added to downlink transceivers to provide two-branch diversity performance at highway speeds. It should be noted that the very short 2.5-ms TDMA frames incorporated into PACS to provide low transmission delay (for high speech quality) also make the technology significantly less sensitive to high-speed fading than the longer-frame-period cellular technologies. The short frame also facilitates the rapid coordination needed to make reliable high-speed handoffs between base stations. Measurements on radio links to potential handoff base stations can be made rapidly, i.e., a measurement on at least one radio link every 2.5 ms. Once a handoff decision is made, signalling exchanges every 2.5 ms ensure that the radio link handoff is completed quickly. In contrast, the long frame periods in the high-tier (cellular) technologies prolong the time it takes to complete a handoff. As noted earlier, high speed handoff has been demonstrated many times with PACS technology and at speeds over 70 mi/hr. }
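To give a feel for why a short frame helps at highway speed, the sketch below computes the maximum Doppler shift and how far a handset moves, in carrier wavelengths, during one frame. The 70 mi/h speed and 2.5-ms frame are from the text. The 1.9-GHz carrier is taken from the band discussed in this chapter, and the 40-ms comparison frame is an assumed value chosen purely for illustration, not a figure from this section.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0
M_PER_MILE = 1609.344


def max_doppler_hz(speed_mi_h: float, carrier_hz: float) -> float:
    """Maximum Doppler shift f_d = v * f_c / c for a handset moving at speed v."""
    speed_m_s = speed_mi_h * M_PER_MILE / 3600.0
    return speed_m_s * carrier_hz / SPEED_OF_LIGHT_M_S


def wavelengths_per_frame(speed_mi_h: float, carrier_hz: float,
                          frame_s: float) -> float:
    """Distance travelled in one frame, expressed in carrier wavelengths."""
    return max_doppler_hz(speed_mi_h, carrier_hz) * frame_s


SPEED_MI_H = 70.0        # demonstrated handoff speed (from the text)
CARRIER_HZ = 1.9e9       # assumed carrier, the PCS band discussed in the chapter
PACS_FRAME_S = 2.5e-3    # PACS TDMA frame period (from the text)
LONG_FRAME_S = 40e-3     # hypothetical longer frame period, for comparison only

print(f"max Doppler: {max_doppler_hz(SPEED_MI_H, CARRIER_HZ):.0f} Hz")
for label, frame in (("2.5-ms frame", PACS_FRAME_S), ("40-ms frame", LONG_FRAME_S)):
    w = wavelengths_per_frame(SPEED_MI_H, CARRIER_HZ, frame)
    print(f"{label}: {w:.2f} wavelengths travelled per frame")
```

With these inputs the handset moves about half a wavelength per 2.5-ms frame, so the channel seen in consecutive frames changes far less than it would over a frame many times longer.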

15.6 Quality, Capacity, and Economic Issues

Although the several trajectories toward low-tier PCS discussed in the preceding section are clear, low-tier PCS does not fit the existing wireless communications paradigms. Thus, low-tier PCS has attracted less attention than the systems and technologies that are compatible with the existing paradigms. Some examples are cited in the following paragraphs. The need for intense interaction with an intelligent network infrastructure in order to manage mobility is not compatible with the cordless telephone appliance paradigm. In that paradigm, independence of network intelligence and base units that mimic wireline telephones are paramount. Wireless data systems often do not admit to the dominance of wireless voice communications and, thus, do not take advantage of the economics of sharing network infrastructure and base station equipment. Also, wireless voice systems often do not recognize the importance of data and messaging and, thus, only add them in as band-aids to systems. The need for a dense collection of many low-complexity, low-cost, low-tier PCS base stations interconnected with inexpensive fixed-network facilities (copper or fiber based) does not fit the cellular high-tier paradigm that expects sparsely distributed $1 million cell sites. Also, the need for high transmission quality to compete with wireline telephones is not compatible with the drive toward maximizing users-per-cell-site and per megahertz to minimize the number of expensive cell sites. These concerns, of course, ignore the hallmark of frequency-reusing cellular systems. That hallmark is the production of almost unlimited overall system capacity by reducing the separation between base stations. The cellular paradigm does not recognize the fact that almost all houses in the U.S. have inexpensive copper wires connecting telephones to the telephone network.
The use of low-tier PCS base stations that concentrate individual user services before backhauling in the network will result in fewer fixed interconnecting facilities than exist now for wireline telephones. Thus, inexpensive techniques for interconnecting many low-tier base stations are already deployed to provide wireline telephones to almost all houses. { The cost of backhaul to many base stations (radio ports) in a low-tier system is often cited as an economic disadvantage that cannot be overcome.

However, this perception is based on existing tariffs for T1 digital lines which are excessive considering current digital subscriber line technology. These tariffs were established many years ago when digital subscriber line electronics were very expensive. With modern low-cost high-rate digital subscriber line (HDSL) electronics, the cost of backhaul could be greatly reduced. If efforts were made to revise tariffs for digital line backhaul based on low cost electronics and copper loops like residential loops, the resulting backhaul costs would more nearly approach the cost of residential telephone lines. As it is now, backhaul costs are calculated based on antiquated high T1 line tariffs that were established for “antique” high cost electronics. } This list could be extended, but the preceding examples are sufficient, along with the earlier sections of the paper, to indicate the many complex interactions among circuit quality, spectrum utilization, complexity (circuit and network), system capacity, and economics that are involved in the design compromises for a large, high-capacity wireless-access system. Unfortunately, the tendency has been to ignore many of the issues and focus on only one, e.g., the focus on cell site capacity that drove the development of digital-cellular high-tier systems in the United States. Interactions among circuit quality, complexity, capacity, and economics are considered in the following sections.

15.6.1 Capacity, Quality, and Complexity

Although capacity comparisons frequently are made without regard to circuit quality, complexity, or cost per base station, such comparisons are not meaningful. An example in Table 15.5 compares capacity factors for U.S. cellular or high-tier PCS technologies with the low-tier PCS technology, PACS/WACS. The mean opinion scores (MOS) (noted in Table 15.5) for speech coding are discussed later. Detection of speech activity and turning off the transmitter during times of no activity is implemented in IS-95. Its impact on MOS also is noted later. A similar technique has been proposed as E-TDMA for use with IS-54 and is discussed with respect to TDMA system in [29]. Note that the use of low-bit-rate speech coding combined with speech activity degrades the high-tier system’s quality by nearly one full MOS point on the five-point MOS scale when compared to 32 kb/s ADPCM. Tandem encoding is discussed in a later section. These speech quality degrading factors alone provide a base station capacity increasing factor of ×4 × 2.5 = ×10 over the high-speech-quality low-tier system! Speech coding, of course, directly affects base station capacity and, thus, overall system capacity by its effect on the number of speech channels that can fit into a given bandwidth.

TABLE 15.5 Comparison of Cellular (IS-54/IS-95) and Low-Tier PCS (WACS/PACS). Capacity Comparisons Made without Regard to Quality Factors, Complexity, and Cost per Base Station Are not Meaningful

Parameter                            Cellular (High-Tier)            Low-Tier PCS                   Capacity Factor
Speech coding, kb/s                  8 (MOS 3.4), no tandem coding   32 (MOS 4.1), 3 or 4 tandem    ×4
Speech activity                      Yes (MOS 3.2)                   No (MOS 4.1)                   ×2.5
Percentage of good areas, %          90                              99                             ×2
Propagation σ, dB                    8                               10                             ×1.5
Total: trading quality for capacity                                                                 ×30

The allowance of extra system margin to provide coverage of 99% of an area for low-tier PCS versus 90% coverage for high-tier is discussed in the previous section and [29]. This additional quality factor costs a capacity factor of ×2. The last item in Table 15.5 does not change the actual system, but only changes the way that frequency reuse is calculated. The additional 2-dB margin in standard deviation σ, allowed for coverage into houses and small buildings for low-tier PCS, costs yet another factor of ×1.5 in calculation only. Frequency reuse factors affect the number of sets of frequencies required and, thus, the bandwidth available for use at each base station. Thus, these factors also affect the base station capacity and the overall system capacity. For the example in Table 15.5, significant speech and coverage quality has been traded for a factor of ×30 in base station capacity! Whereas base station capacity affects overall system capacity directly, it should be remembered that overall system capacity can be increased arbitrarily by decreasing the spacing between base stations. Thus, if the PACS low-tier PCS technology were to start with a base station capacity of ×0.5 of AMPS cellular⁵ (a much lower figure than the ×0.8 sometimes quoted [20]), and then were degraded in quality as described above to yield the ×30 capacity factor, it would have a resulting capacity of ×15 of AMPS! Thus, it is obvious that making such a base station capacity comparison without including quality is not meaningful.
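The capacity factors of Table 15.5 multiply, and the sketch below simply reproduces that product and the ×15 AMPS figure. The dictionary labels and function names are illustrative; the numeric factors and the arbitrary ×0.5 starting point are those given in the text.

```python
import math

# Capacity factors gained by giving up quality, as listed in Table 15.5.
QUALITY_FACTORS = {
    "speech coding, 32 -> 8 kb/s": 4.0,
    "speech activity detection": 2.5,
    "good areas, 99% -> 90%": 2.0,
    "propagation sigma, 10 -> 8 dB": 1.5,
}


def total_capacity_factor(factors: dict) -> float:
    """Overall base station capacity factor: the product of the individual factors."""
    return math.prod(factors.values())


def degraded_capacity_vs_amps(starting_factor: float, factors: dict) -> float:
    """Base station capacity relative to AMPS after trading quality for capacity."""
    return starting_factor * total_capacity_factor(factors)


print(f"total factor: x{total_capacity_factor(QUALITY_FACTORS):g}")
print(f"x0.5 AMPS becomes x{degraded_capacity_vs_amps(0.5, QUALITY_FACTORS):g} AMPS")
```

The product makes the point of the section concrete: the same technology can be quoted at ×0.5 or ×15 of AMPS depending only on which quality assumptions are applied.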

15.6.2 Economics, System Capacity, and Coverage Area Size

Claims are sometimes made that low-tier PCS cannot be provided economically, even though it is what the user wants. These claims are often made based on economic estimates from the cellular paradigm. These include the following.
• Very low estimates of market penetration, much less than cordless telephones, and often even less than cellular.
• High estimates of base station costs more appropriate to high-complexity, high-cost cellular technology than to low-complexity, low-cost, low-tier technology.
• Very low estimates of circuit usage time more appropriate to cellular than to cordless/wireline telephone usage, which is more likely for low-tier PCS.
• { Backhaul costs based on existing T1 line tariffs that are based on “antique” high cost digital loop electronics. (See discussion in fourth paragraph at start of Section 15.6.) }

Such economic estimates are often done by making absolute economic calculations based on very uncertain input data. The resulting estimates for low-tier and high-tier are often closer together than the large uncertainties in the input data. A perhaps more realistic approach for comparing such systems is to vary only one or two parameters while holding all others fixed and then looking at relative economics between high-tier and low-tier systems. This is the approach used in the following examples.

EXAMPLE 15.1:

In the first example (see Table 15.6), the number of channels per megahertz is held constant for cellular and for low-tier PCS. Only the spacing is varied between base stations, e.g., cell sites for cellular and radio ports for low-tier PCS, to account for the differences in transmitter power, antenna height, etc. In this example, overall system capacity varies directly as the square of base station spacing, but base station capacity is the same for both cellular and low-tier PCS. For the typical values in the example, the resulting low-tier system capacity is ×400 greater, only because of the closer base station spacing. If the two systems were to cost the same, the equivalent low-tier PCS base stations would have to cost less than $2,500.

TABLE 15.6 System Capacity/Coverage Area Size/Economics

Example 15.1
  Assume channels/MHz are the same for cellular and PCS
  Cell site: spacing = 20,000 ft, cost = $1 M
  PCS port: spacing = 1,000 ft
  PCS system capacity is (20,000/1,000)^2 = 400 × cellular capacity
  Then, for the system costs to be the same,
    Port cost = $1 M/400 = $2,500, a reasonable figure
  If cell site and port each have 180 channels,
    Cellular cost/circuit = $1 M/180 = $5,555/circuit
    PCS cost/circuit = $2,500/180 = $14/circuit

Example 15.2
  Assume equal cellular and PCS system capacity
  Cell site: spacing = 20,000 ft
  PCS port: spacing = 1,000 ft
  If a cell site has 180 channels, then, for equal system capacity,
    a PCS port needs 180/400 < 1 channel/port

Example 15.3 Quality/cost trade
  Cell site: spacing = 20,000 ft, cost = $1 M, channels = 180
  PCS port: spacing = 1,000 ft, cost = $2,500
  Cellular to PCS, base station spacing capacity factor = ×400
  PCS to cellular quality reduction factors:
    32 to 8 kb/s speech                     ×4
    Voice activity (buying)                 ×2
    99–90% good areas                       ×2
    Both in same environment (same σ)       ×1
    Capacity factor traded                  ×16
  180 ch/16 = 11.25 channels/port
  then, $2,500/11.25 = $222/circuit
  and remaining is ×400/16 = ×25 system capacity of PCS over cellular

⁵ Note that the ×0.5 factor is an arbitrary factor taken for illustrating this example. The so-called ×AMPS factors are only with regard to base station capacity, although they are often misused as system capacity.

This cost is well within the range of estimates for such base stations, including equivalent infrastructure. These low-tier PCS base stations are of comparable or lower complexity than cellular vehicular subscriber sets, and large-scale manufacture will be needed to produce the millions that will be required. Also, land, building, antenna tower and legal fees for zoning approval, or rental of expensive space on top of commercial buildings, represent large expenses for cellular cell sites. Low-tier PCS base stations that are mounted on utility poles and sides of buildings will not incur such large additional expenses. Therefore, costs of the order of magnitude indicated seem reasonable in large quantities. Note that, with these estimates, the per-wireless-circuit cost of the low-tier PCS circuits would be only $14/circuit compared to $5,555/circuit for the high-tier circuits. Even if there were a factor of 10 error in cost estimates, or a reduction of channels per radio port of a factor of 10, the per-circuit cost of low-tier PCS would still be only $140/circuit, which is still much less than the per-circuit cost of high-tier.

EXAMPLE 15.2:

In the second example (see Table 15.6), the overall system capacity is held constant, and the number of channels/port, i.e., channels/(base station) is varied. In this example, less than 1/2 channel/port is needed, again indicating the tremendous capacity that can be produced with close-spaced low-complexity base stations.

EXAMPLE 15.3:

Since the first two examples are somewhat extreme, the third example (see Table 15.6) uses a more moderate, intermediate approach. In this example, some of the cellular high-tier channels/(base station) are traded to yield higher quality low-tier PCS as in the previous subsection. This reduces the channels/port to 11+, with an accompanying increase in cost/circuit up to $222/circuit, which is still much less than the $5,555/circuit for the high-tier system. Note, also, that the low-tier system still has ×25 the capacity of the high-tier system! Low-tier base station (port) cost would have to exceed $62,500 for the low-tier per-circuit cost to exceed that of the high-tier cellular system. Such a high port cost far exceeds any existing realistic estimate of low-tier system costs. It can be seen from these examples, and particularly Example 15.3, that the circuit economics of low-tier PCS are significantly better than for high-tier PCS, if the user demand and density are sufficient to make use of the large system capacity. Considering the high penetration of cordless telephones, the rapid growth of cellular handsets, and the enormous market projections for wireless PCS noted earlier in this chapter, filling such high capacity in the future would appear to be certain. The major problem is rapidly providing the widespread coverage (buildout) required by the FCC in the United States. If this unrealistic regulatory demand can be overcome, low-tier wireless PCS promises to provide the wireless personal communications that everyone wants.
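The three examples of Table 15.6 reduce to a few lines of arithmetic, sketched below using only the inputs stated in the table. Variable and function names are illustrative. The break-even port cost is computed as the port cost at which the low-tier per-circuit cost equals the high-tier per-circuit cost.

```python
def spacing_capacity_factor(cell_spacing_ft: float, port_spacing_ft: float) -> float:
    """System capacity gain from closer base station spacing (varies as spacing squared)."""
    return (cell_spacing_ft / port_spacing_ft) ** 2


CELL_SPACING_FT = 20_000.0
PORT_SPACING_FT = 1_000.0
CELL_SITE_COST = 1_000_000.0      # dollars
CHANNELS_PER_CELL_SITE = 180

capacity_factor = spacing_capacity_factor(CELL_SPACING_FT, PORT_SPACING_FT)

# Example 15.1: same channels per base station, equal total system cost.
port_cost = CELL_SITE_COST / capacity_factor
cellular_cost_per_circuit = CELL_SITE_COST / CHANNELS_PER_CELL_SITE
pcs_cost_per_circuit_ex1 = port_cost / CHANNELS_PER_CELL_SITE

# Example 15.2: equal system capacity, so far fewer channels per port.
channels_per_port_ex2 = CHANNELS_PER_CELL_SITE / capacity_factor

# Example 15.3: trade x16 of capacity for quality (x4 speech, x2 voice
# activity, x2 good areas, x1 same environment).
quality_factor_traded = 4 * 2 * 2 * 1
channels_per_port_ex3 = CHANNELS_PER_CELL_SITE / quality_factor_traded
pcs_cost_per_circuit_ex3 = port_cost / channels_per_port_ex3
remaining_capacity_factor = capacity_factor / quality_factor_traded
break_even_port_cost = cellular_cost_per_circuit * channels_per_port_ex3

print(f"Ex. 15.1: x{capacity_factor:g} capacity, port cost ${port_cost:,.0f}, "
      f"${cellular_cost_per_circuit:,.0f} vs ${pcs_cost_per_circuit_ex1:,.0f} per circuit")
print(f"Ex. 15.2: {channels_per_port_ex2:.2f} channels/port")
print(f"Ex. 15.3: {channels_per_port_ex3:g} channels/port, "
      f"${pcs_cost_per_circuit_ex3:,.0f}/circuit, x{remaining_capacity_factor:g} capacity, "
      f"break-even port cost ${break_even_port_cost:,.0f}")
```

Holding everything but one or two parameters fixed, as the section recommends, makes the comparison insensitive to the absolute cost inputs: only the ratios above matter.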

15.6.3

{Loop Evolution and Economics

It is interesting to note that several wireless loop applications are aimed at reducing cost by replacing parts of wireline or CATV loops with wireless links between transceivers. The economics of these applications are driven by the replacing of labor-intensive wireline and cable technologies with mass-produced solid-state electronics in transceivers.

Consider first a cordless telephone base unit. The cordless base-unit transceiver usually serves one or, at most, two handsets at the end of one wireline loop. Now consider moving such a base unit back along the copper-wire-pair loop end a distance that can be reliably covered by a low-power wireless link [25, 31], i.e., several hundred to a thousand feet or so, and mounting it on a utility pole or a street light pole. This replaces the copper loop end with the wireless link. Many additional copper loop ends to other subscribers will be contained within a circle around the pole having a maximum usable radius of this wireless link. Replace all of the copper loop ends within the circle with cordless base units on the same pole. Note that this process replaces the most expensive parts of these many loops, i.e., the many individual loop ends, with the wireless links from cordless handsets to “equivalent” cordless base units on a pole. Of course, being mounted outside will require somewhat stronger enclosures and means of powering the base units, but these additional costs are considerably more than offset by eliminating the many copper wire drops.

It is instructive to consider how many subscribers could be collected at a pole containing base units. Consider, as an example, a coverage square of 1400 ft on a side (PACS will provide good coverage over this range, i.e., for base unit pole separations of about 1400 ft, at 1.9 GHz). Within this square will be 45 houses for a 1 house/acre density typical of low-housing-density areas, or 180 houses for a 4 houses/acre density more typical of high-density single-family housing areas. These represent a significant concentration of traffic at a pole.
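The house counts follow directly from the area of the coverage square. A minimal sketch of the arithmetic is given below; the function name is illustrative, and the only constant used is the 43,560 square feet in one acre.

```python
# Houses inside a square coverage area, given a housing density.
SQ_FT_PER_ACRE = 43_560  # square feet in one acre


def houses_in_square(side_ft: float, houses_per_acre: float) -> float:
    """Number of houses in a square of the given side length."""
    acres = side_ft ** 2 / SQ_FT_PER_ACRE
    return acres * houses_per_acre


low_density = houses_in_square(1400, 1)   # low-housing-density area
high_density = houses_in_square(1400, 4)  # high-density single-family area
print(round(low_density), round(high_density))
```

A 1400-ft square is almost exactly 45 acres, which is where the figures of 45 and 180 houses come from.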


Because of the trunking advantage of the significant number of subscribers concentrated at a pole, they can share a smaller number of base units, i.e., wireless base unit transceivers, than there are wireless subscriber sets. Therefore, the total cost compared with having a cordless base unit per subscriber also is reduced by the concentration of users. A single PACS transceiver will support simultaneously eight TDMA channels or circuits at 32 kb/s (or 16 at 16 kb/s or 32 at 8 kb/s) [56]. Of these, one channel is reserved for system control. The cost of such moderate-rate transceivers is relatively insensitive to the number of channels supported; i.e., the cost of such an 8-channel (or 16 or 32) transceiver will be significantly less than twice the cost of a similar one-channel transceiver. Thus, another economic advantage accrues to this wireless loop approach from using time-multiplexed (TDMA) transceivers instead of single-channel-per-transceiver cordless telephone base units.

For an offered traffic of 0.06 Erlang, a typical busy-hour value for a wireline subscriber, a seven-channel transceiver could serve about 40 subscribers at 1% blocking, based on the Erlang B queuing discipline. From the earlier example, such a transceiver could serve most of the 45 houses within a 1400-ft square. Considering partial penetration, the transceiver capacity is more than adequate for the low-density housing.6 Considering the high-density example of 4 houses/acre, a seven-channel transceiver could serve only about 20% of the subscribers within a 1400-ft square. If the penetration became greater than about 20%, either additional transceivers, perhaps those of other service providers, or closer transceiver spacing would be required.

Another advantageous economic factor for wireless loops results when considering time-multiplexed transmission in the fixed distribution facilities.
For copper or fiber digital subscriber loop carrier (SLC), e.g., T1 or high-rate digital subscriber line (HDSL), a demultiplexing/multiplexing terminal and drop interface are required at the end of the time-multiplexed SLC line to provide the individual circuits for each subscriber loop-end circuit, i.e., for each drop. The most expensive part of such an SLC terminating unit is the subscriber line cards that provide per-line interfaces for each subscriber drop. Terminating a T1 or HDSL line on a wireless loop transceiver eliminates all per-line interfaces, i.e., all line cards, the most expensive part of an SLC line termination. Thus, the greatly simplified SLC termination can be incorporated within a TDMA wireless loop transceiver, resulting in another cost savings over the conventional copper-wire-pair telephone loop end.

The purpose of the previous discussions is not to give an exact system design or economic analysis, but to illustrate the inherent economic advantages of low-power wireless loops over copper loop ends and over copper loop ends with cordless telephone base units. Some economic analyses have found wireless loop ends to be more economical than copper loop ends when subscribers use low-power wireless handsets. Rizzo and Sollenberger [56] have also discussed the advantageous economics of PACS wireless loop technology in the context of low-tier PCS.

The discussions in this section can be briefly summarized as follows. Replacing copper wire telephone loop ends with low-complexity wireless loop technology like PACS can produce economic benefits in at least four ways. These are:

1. Replacing the most expensive part of a loop, the per-subscriber loop-end, with a wireless link.
2. Taking advantage of trunking in concentrating many wireless subscriber loops into a smaller number of wireless transceiver channels.
3. Reducing the cost of wireless transceivers by time multiplexing (TDMA) a few (7, 15, or 31) wireless loop circuits (channels) in each transceiver.
4. Eliminating per-line interface cards in digital subscriber line terminations by terminating time-multiplexed subscriber lines in the wireless loop transceivers. }

6 The range could be extended by using higher base unit antennas, by using higher-gain directional (sectored) antennas, and/or by increasing the maximum power that can be transmitted.
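The trunking figure quoted in this section (about 40 subscribers on a seven-channel transceiver at 0.06 Erlang each and 1% blocking) can be reproduced with the standard Erlang B recursion. The sketch below is a minimal illustration; the function names are hypothetical, and the search over subscriber counts is simply a linear scan.

```python
def erlang_b(offered_load: float, channels: int) -> float:
    """Erlang B blocking probability via the standard recursion
    B(0) = 1, B(n) = A*B(n-1) / (n + A*B(n-1))."""
    blocking = 1.0
    for n in range(1, channels + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return blocking


def max_subscribers(channels: int, erlangs_per_subscriber: float,
                    target_blocking: float) -> int:
    """Largest subscriber count whose total offered load keeps
    blocking at or below the target."""
    count = 0
    while erlang_b((count + 1) * erlangs_per_subscriber, channels) <= target_blocking:
        count += 1
    return count


# Seven traffic channels (eight TDMA slots minus one control channel),
# 0.06 Erlang per subscriber, 1% blocking, as in the text.
served = max_subscribers(channels=7, erlangs_per_subscriber=0.06,
                         target_blocking=0.01)
print(served, f"{served / 180:.0%} of 180 houses")
```

The result is consistent with the "about 40 subscribers" in the text and with the observation that one transceiver covers only about a fifth of the 180 houses in the high-density case.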

15.7

Other Issues

Several issues in addition to those addressed in the previous two sections continue to be raised with respect to low-tier PCS. These are treated in this section.

15.7.1

Improvement of Batteries

Frequently, the suggestion is made that battery technology will improve so that high-power handsets will be able to provide the desired 5 or 6 hours of talk time in addition to 10 or 12 hours of standby time, and still weigh less than one-fourth of the weight of today’s smallest cellular handset batteries. This hope does not take into account the maturity of battery technology, and the long history (many decades) of concerted attempts to improve it. Increases in battery capacity have come in small increments, a few percent, and very slowly over many years, and the shortfall is well over a factor of 10. In contrast, integrated electronics and radio frequency devices needed for low-power low-tier PCS continue to improve and to decrease in cost by factors of greater than 2 in time spans on the order of a year or so. It also should be noted that, as the energy density of a battery is increased, the energy release rate per volume must also increase in order to supply the same amount of power. If energy storage density and release rate are increased significantly, a battery and a bomb become indistinguishable! The likelihood of a factor of 10 improvement in battery capacity appears to be essentially zero. If even a modest improvement in battery capacity were possible, many people would be driving electric vehicles.

{ As noted in the addition to the “Reality Check” section, new lithium batteries have become the batteries of choice for the smallest cellular/high-tier PCS handsets. While these lithium batteries have higher energy density than earlier nickel-cadmium batteries, they still fall far short of the factor of 10 improvement that was needed to make long talk time, small size, and low weight possible. With the much larger advance in electronics, the battery is even more dominant in the size and weight of the newest cellular handsets. The introduction of these batteries incurred considerable startup pain because of the greater fire and explosive hazard associated with lithium materials, i.e., closer approach to a bomb. Further attempts in this direction will be even more hazardous. }

15.7.2

People Only Want One Handset

This issue is often raised in support of high-tier cellular handsets over low-tier handsets. Whereas the statement is likely true, the assumption that the handset must work with high-tier cellular is not. Such a statement follows from the current large usage of cellular handsets; but such usage results because that is the only form of widespread wireless service currently available, not because it is what people want. The statement assumes inadequate coverage of a region by low-tier PCS, and that low-tier handsets will not work in vehicles. The only way that high-tier handsets could serve the desires of people discussed earlier would be for an unlikely breakthrough in battery technology to occur. A low-tier system, however, can cover economically any large region having some people in it. (It will not cover rural or isolated areas but, by definition, there is essentially no one there to want communications anyway.) Low-tier handsets will work in vehicles on village and city streets at speeds up to 30 or 40 mi/h, and the required handoffs make use of computer technology that is rapidly becoming inexpensive. { As noted earlier, vehicular speed handoff is readily accomplished with PACS. Reliable handoff has been demonstrated for PACS at speeds in excess of 70 mi/h. } Highways between populated areas, and also streets within them, will need to be covered by high-tier cellular PCS, but users are likely to use vehicular sets in these cellular systems. Frequently the vehicular mobile user will want a different communications device anyway, e.g., a hands-free phone. The use of hands-free phones in vehicles is becoming a legal requirement in some places now and is likely to become a requirement in many more places in the future. Thus, handsets may not be legally usable in vehicles anyway. With widespread deployment of low-tier PCS systems, the one handset of choice will be the low-power, low-tier PCS pocket handset or voice/data communicator. { As discussed in earlier sections, it is quite feasible economically to cover highways between cities with low-tier systems, if the low-tier base stations have antennas with the same height and gain as used for cellular and high-tier PCS systems. (The range penalty for the lower power was noted earlier to be only on the order of 1/2, or about the same as the range penalty in going from 800 MHz cellular to 1.9 GHz high-tier PCS.) }

There are approaches for integrating low-tier pocket phones or pocket communicators with high-tier vehicular cellular mobile telephones. The user’s identity could be contained either in memory in the low-tier set or in a small smart card inserted into the set, as is a feature of the European GSM system.
When entering an automobile, the small low-tier communicator or card could be inserted into a receptacle in a high-tier vehicular cellular set installed in the automobile.7 The user’s identity would then be transferred to the mobile set. { “Car adapters” that have a cradle for a small cellular handset providing battery charging and connection to an outside antenna are quite common; e.g., in Sweden use of such adapters is commonplace. Thus, this concept has already evolved significantly, even for the disadvantaged cellular handsets when they are used in vehicles. } The mobile set could then initiate a data exchange with the high-tier system, indicating that the user could now receive calls at that mobile set. This information about the user’s location would then be exchanged between the network intelligence so that calls to the user could be correctly routed.8 In this approach the radio sets are optimized for their specific environments, high-power, high-tier vehicular or low-power, low-tier pedestrian, as discussed earlier, and the network access and call routing is coordinated by the interworking of network intelligence. This approach does not compromise the design of either radio set or radio system. It places the burden on network intelligence technology that benefits from the large and rapid advances in computer technology.

The approach of using different communications devices for pedestrians than for vehicles is consistent with what has actually happened in other applications of technology in similarly different environments. For example, consider the case of audio cassette tape players. Pedestrians often carry and listen to small portable tape players with lightweight headsets (e.g., a Walkman).9 When one of these people enters an automobile, he or she often removes the tape from the Walkman and inserts it into a tape player installed in the automobile. The automobile player has speakers that fill the car with sound. The Walkman is optimized for a pedestrian, whereas the vehicular-mounted player is optimized for an automobile. Both use the same tape, but they have separate tape heads, tape transports, audio preamps, etc. They do not attempt to share electronics. In this example, the tape cassette is the information-carrying entity similar to the user identification in the personal communications example discussed earlier. The main points are that the information is shared among different devices but that the devices are optimized for their environments and do not share electronics.

Similarly, a high-tier vehicular-cellular set does not need to share oscillators, synthesizers, signal processing, or even frequency bands or protocols with a low-tier pocket-size communicator. Only the information identifying the user and where he or she can be reached needs to be shared among the intelligence elements, e.g., routing logic, databases, and common channel signalling [26, 29] of the infrastructure networks. This information exchange between network intelligence functions can be standardized and coordinated among infrastructure subnetworks owned and operated by different business entities (e.g., vehicular cellular mobile radio networks and intelligent low-tier PCS networks). Such standardization and coordination are the same as are required today to pass intelligence among local exchange networks and interexchange carrier networks.

7 Inserting the small personal communicator in the vehicular set would also facilitate charging the personal communicator’s battery.
8 This is a feature proposed for FPLMTS in CCIR Rec. 687.
9 Walkman is a registered trademark of Sony Corporation.

15.7.3

Other Environments

Low-tier personal communications can be provided to occupants of airplanes, trains, and buses by installing compatible low-tier radio access ports inside these vehicles. The ports can be connected to high-power, high-tier vehicular cellular mobile sets or to special air-ground or satellite-based mobile communications sets. Intelligence between the internal ports and mobile sets could interact with cellular mobile, air-ground, or satellite networks in one direction, using protocols and spectrum allocated for that purpose, and with low-tier personal communicators in the other direction to exchange user identification and route calls to and from users inside these large vehicles. Radio isolation between the low-power units inside the large metal vehicles and low-power systems outside the vehicles can be ensured by using windows that are opaque to the radio frequencies. Such an approach also has been considered for automobiles, i.e., a radio port for low-tier personal communications connected to a cellular mobile set in a vehicle so that the low-tier personal communicator can access a high-tier cellular network. (This could be done in the United States using unlicensed PCS frequencies within the vehicle.)

15.7.4

Speech Quality Issues

All of the PCS and cordless telephone technologies that use CCITT standardized 32-kb/s ADPCM speech encoding can provide similar error-free speech distortion quality. This quality often is rated on a five-point subjective mean opinion score (MOS) with 5 excellent, 4 good, 3 fair, 2 poor, and 1 very poor. The error-free MOS of 32-kb/s ADPCM is about 4.1 and degrades very slightly with tandem encodings. Tandem encodings could be expected in going from a digital-radio PCS access link, through a network using analog transmission or 64-kb/s PCM, and back to another digital-radio PCS access link on the other end of the circuit. In contrast, a low-bit-rate (

[...] W. Hence, all of the signal’s spectral components will be affected by the channel in a similar manner (e.g., fading or no fading); this is illustrated in Fig. 18.8(b). Flat fading does not introduce channel-induced ISI distortion, but performance degradation can still be expected due to the loss in SNR whenever the signal is fading. In order to avoid channel-induced ISI distortion, the channel is required to exhibit flat fading by ensuring that

$$f_0 > W \approx \frac{1}{T_s} \qquad (18.14)$$

Hence, the channel coherence bandwidth f0 sets an upper limit on the transmission rate that can be used without incorporating an equalizer in the receiver.

For the flat-fading case, where f0 > W (or Tm < Ts), Fig. 18.8(b) shows the usual flat-fading pictorial representation. However, as a mobile radio changes its position, there will be times when the received signal experiences frequency-selective distortion even though f0 > W. This is seen in Fig. 18.8(c), where the null of the channel’s frequency transfer function occurs at the center of the signal band. Whenever this occurs, the baseband pulse will be especially mutilated by deprivation of its DC component. One consequence of the loss of DC (zero mean value) is the absence of a reliable pulse peak on which to establish the timing synchronization, or from which to sample the carrier phase carried by the pulse [18]. Thus, even though a channel is categorized as flat fading (based on rms relationships), it can still manifest frequency-selective fading on occasions. It is fair to say that a mobile radio channel, classified as having flat-fading degradation, cannot exhibit flat fading all of the time. As f0 becomes much larger than W (or Tm becomes much smaller than Ts), less time will be spent in conditions approximating Fig. 18.8(c). By comparison, it should be clear that in Fig. 18.8(a) the fading is independent of the position of the signal band, and frequency-selective fading occurs all the time, not just occasionally.
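The flat-fading condition of Eq. (18.14) amounts to a single comparison between the coherence bandwidth and the signal bandwidth. The sketch below illustrates it; the function name and the numerical values (a 50-kHz coherence bandwidth and two symbol rates) are hypothetical and chosen only for illustration.

```python
def is_flat_fading(coherence_bandwidth_hz: float, symbol_rate_hz: float) -> bool:
    """Eq. (18.14): the channel is categorized as flat fading when
    f0 > W, with the signal bandwidth W approximated by 1/Ts."""
    signal_bandwidth_hz = symbol_rate_hz  # W ~ 1/Ts
    return coherence_bandwidth_hz > signal_bandwidth_hz


# Hypothetical channel with f0 = 50 kHz.
f0 = 50e3
narrowband_ok = is_flat_fading(f0, symbol_rate_hz=10e3)   # W well below f0
wideband_ok = is_flat_fading(f0, symbol_rate_hz=200e3)    # W above f0
print(narrowband_ok, wideband_ok)
```

As the text cautions, this is a classification based on rms quantities: a channel that passes the test can still show frequency-selective distortion on occasion.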

18.6

Typical Examples of Flat Fading and Frequency-Selective Fading Manifestations

Figure 18.9 shows some examples of flat fading and frequency-selective fading for a direct-sequence spread-spectrum (DS/SS) system [20, 22]. In Fig. 18.9, there are three plots of the output of a pseudonoise (PN) code correlator vs. delay as a function of time (transmission or observation time). Each amplitude vs. delay plot is akin to S(τ) vs. τ shown in Fig. 18.7(a). The key difference is that the amplitudes shown in Fig. 18.9 represent the output of a correlator; hence, the waveshapes are a function not only of the impulse response of the channel, but also of the impulse response of the correlator. The delay time is expressed in units of chip durations (chips), where the chip is defined as the spread-spectrum minimal-duration keying element. For each plot, the observation time is shown on an axis perpendicular to the amplitude vs. time-delay plane. Figure 18.9 is drawn from a satellite-to-ground communications link exhibiting scintillation because of atmospheric disturbances. However, Fig. 18.9 is still a useful illustration of three different channel conditions that might apply to a mobile radio situation. A mobile radio that moves along the observation-time axis is affected by changing multipath profiles along the route, as seen in the figure. The scale along the observation-time axis is also in units of chips.

In Fig. 18.9(a), the signal dispersion (one “finger” of return) is on the order of a chip time duration, Tch. In a typical DS/SS system, the spread-spectrum signal bandwidth is approximately equal to 1/Tch; hence, the normalized coherence bandwidth f0 Tch of approximately unity in Fig. 18.9(a) implies that the coherence bandwidth is about equal to the spread-spectrum bandwidth. This describes a channel that can be called frequency-nonselective or slightly frequency-selective. In Fig. 18.9(b), where f0 Tch = 0.25, the signal dispersion is more pronounced. There is definite interchip interference, and the coherence bandwidth is approximately equal to 25% of the spread-spectrum bandwidth. In Fig. 18.9(c), where f0 Tch = 0.1, the signal dispersion is even more pronounced, with greater interchip-interference effects, and the coherence bandwidth is approximately equal to 10% of the spread-spectrum bandwidth. The channels of Figs. 18.9(b) and (c) can be categorized as moderately and highly frequency-selective, respectively, with respect to the basic signalling element, the chip. Later, we show that a DS/SS system operating over a frequency-selective channel at the chip level does not necessarily experience frequency-selective distortion at the symbol level.

FIGURE 18.9: DS/SS Matched-filter output time-history examples for three levels of channel conditions, where Tch is the time duration of a chip.

18.7

Time Variance Viewed in the Time Domain: Figure 18.1, Block 13—The Spaced-Time Correlation Function

Until now, we have described signal dispersion and coherence bandwidth, parameters that describe the channel’s time-spreading properties in a local area. However, they do not offer information about the time-varying nature of the channel caused by relative motion between a transmitter and receiver, or by movement of objects within the channel. For mobile-radio applications, the channel is time variant because motion between the transmitter and receiver results in propagation-path changes. Thus, for a transmitted continuous wave (CW) signal, as a result of such motion, the radio receiver sees variations in the signal’s amplitude and phase. Assuming that all scatterers making up the channel are stationary, then whenever motion ceases, the amplitude and phase of the received signal remain constant; that is, the channel appears to be time invariant. Whenever motion begins again, the channel appears time variant. Since the channel characteristics are dependent on the positions of the transmitter and receiver, time variance in this case is equivalent to spatial variance.

Figure 18.7(c) shows the function R(Δt), designated the spaced-time correlation function; it is the autocorrelation function of the channel’s response to a sinusoid. This function specifies the extent to which there is correlation between the channel’s response to a sinusoid sent at time t1 and the response to a similar sinusoid sent at time t2, where Δt = t2 − t1. The coherence time, T0, is a measure of the expected time duration over which the channel’s response is essentially invariant. Earlier, we made measurements of signal dispersion and coherence bandwidth by using wideband signals. Now, to measure the time-variant nature of the channel, we use a narrowband signal. To measure R(Δt) we can transmit a single sinusoid (Δf = 0) and determine the autocorrelation function of the received signal. The function R(Δt) and the parameter T0 provide us with knowledge about the fading rapidity of the channel. Note that for an ideal time-invariant channel (e.g., a mobile radio exhibiting no motion at all), the channel’s response would be highly correlated for all values of Δt, and R(Δt) would be a constant function. When using the dense-scatterer channel model described earlier, with constant velocity of motion, and an unmodulated CW signal, the normalized R(Δt) is described as

$$R(\Delta t) = J_0(k V \Delta t) \qquad (18.15)$$

where J0(·) is the zero-order Bessel function of the first kind, V is velocity, VΔt is distance traversed, and k = 2π/λ is the free-space phase constant (transforming distance to radians of phase). Coherence time can be measured in terms of either time or distance traversed (assuming some fixed velocity of motion). Amoroso described such a measurement using a CW signal and a dense-scatterer channel model [18]. He measured the statistical correlation between the combination of received magnitude and phase sampled at a particular antenna location x0, and the corresponding combination sampled at some displaced location x0 + ζ, with displacement measured in units of wavelength λ. For a displacement ζ of 0.38λ between two antenna locations, the combined magnitudes and phases of the received CW are statistically uncorrelated. In other words, the state of the signal at x0 says nothing about the state of the signal at x0 + ζ. For a given velocity of motion, this displacement is readily transformed into units of time (coherence time).
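The 0.38λ displacement is consistent with the first zero of the Bessel function in Eq. (18.15): J0 first crosses zero near an argument of 2.405, and 2.405/(2π) ≈ 0.38. The sketch below checks this numerically using only the standard library. It evaluates J0 from its integral representation with a midpoint rule; the helper names and the bisection bracket [2, 3] are choices made for this illustration, not part of the original text.

```python
import math


def bessel_j0(x: float, steps: int = 200) -> float:
    """J0(x) = (1/pi) * integral over [0, pi] of cos(x*sin(theta)),
    evaluated with the midpoint rule."""
    h = math.pi / steps
    total = sum(math.cos(x * math.sin((i + 0.5) * h)) for i in range(steps))
    return total * h / math.pi


def spaced_time_correlation(velocity: float, wavelength: float,
                            delta_t: float) -> float:
    """Eq. (18.15): R(dt) = J0(k*V*dt) with k = 2*pi/lambda."""
    k = 2.0 * math.pi / wavelength
    return bessel_j0(k * velocity * delta_t)


def first_zero_displacement_in_wavelengths() -> float:
    """Displacement (in wavelengths) at which R first reaches zero,
    found by bisection on J0 between 2 and 3."""
    lo, hi = 2.0, 3.0  # J0(2) > 0 and J0(3) < 0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if bessel_j0(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) / (2.0 * math.pi)


zeta = first_zero_displacement_in_wavelengths()
print(f"first decorrelation displacement: {zeta:.3f} wavelengths")
```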

18.7.1

The Concept of Duality

Two operators (functions, elements, or systems) are dual when the behavior of one with reference to a time-related domain (time or time-delay) is identical to the behavior of the other with reference to the corresponding frequency-related domain (frequency or Doppler shift).

In Fig. 18.7, we can identify functions that exhibit similar behavior across domains. For understanding the fading channel model, it is useful to refer to such functions as duals. For example, R(Δf) in Fig. 18.7(b), characterizing signal dispersion in the frequency domain, yields knowledge about the range of frequency over which two spectral components of a received signal have a strong potential for amplitude and phase correlation. R(Δt) in Fig. 18.7(c), characterizing fading rapidity in the time domain, yields knowledge about the span of time over which two received signals have a strong potential for amplitude and phase correlation. We have labeled these two correlation functions as duals. This is also noted in Fig. 18.1 as the duality between blocks 10 and 13, and in Fig. 18.6 as the duality between the time-spreading mechanism in the frequency domain and the time-variant mechanism in the time domain.

18.7.2

Degradation Categories due to Time Variance Viewed in the Time Domain

The time-variant nature of the channel or fading rapidity mechanism can be viewed in terms of two degradation categories as listed in Fig. 18.6: fast fading and slow fading. The terminology “fast fading” is used for describing channels in which T0 < Ts , where T0 is the channel coherence time and Ts is the time duration of a transmission symbol. Fast fading describes a condition where the time duration in which the channel behaves in a correlated manner is short compared to the time duration of a symbol. Therefore, it can be expected that the fading character of the channel will change several times during the time that a symbol is propagating, leading to distortion of the baseband pulse shape. Analogous to the distortion previously described as channel-induced ISI, here distortion takes place because the received signal’s components are not all highly correlated throughout time. Hence, fast fading can cause the baseband pulse to be distorted, resulting in a loss of SNR that often yields an irreducible error rate. Such distorted pulses cause synchronization problems (failure of phase-locked-loop receivers), in addition to difficulties in adequately defining a matched filter. A channel is generally referred to as introducing slow fading if T0 > Ts . Here, the time duration that the channel behaves in a correlated manner is long compared to the time duration of a transmission symbol. Thus, one can expect the channel state to virtually remain unchanged during the time in which a symbol is transmitted. The propagating symbols will likely not suffer from the pulse distortion described above. The primary degradation in a slow-fading channel, as with flat fading, is loss in SNR.
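The two degradation categories reduce to a comparison of the coherence time with the symbol duration. A minimal sketch follows; the function name and the example numbers are hypothetical, and the handling of the case T0 = Ts, which the text does not address, is an assumption.

```python
def fading_rapidity(coherence_time_s: float, symbol_duration_s: float) -> str:
    """Classify the channel as fast fading (T0 < Ts) or slow fading (T0 > Ts).
    The text does not cover T0 == Ts; this sketch labels it 'borderline'."""
    if coherence_time_s < symbol_duration_s:
        return "fast fading"
    if coherence_time_s > symbol_duration_s:
        return "slow fading"
    return "borderline"


# Hypothetical values: a 4-ms coherence time against two symbol durations.
short_symbol = fading_rapidity(coherence_time_s=4e-3, symbol_duration_s=40e-6)
long_symbol = fading_rapidity(coherence_time_s=4e-3, symbol_duration_s=20e-3)
print(short_symbol, "|", long_symbol)
```

Shortening the symbol (raising the signalling rate) moves a channel from the fast-fading category toward the slow-fading one, which is the design lever implied by the definitions above.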

18.8

Time Variance Viewed in the Doppler-Shift Domain: Figure 18.1, Block 16—The Doppler Power Spectrum

A completely analogous characterization of the time-variant nature of the channel can begin in the Doppler-shift (frequency) domain. Figure 18.7(d) shows a Doppler power spectral density, S(v), plotted as a function of Doppler-frequency shift, v. For the case of the dense-scatterer model, a vertical receive antenna with constant azimuthal gain, a uniform distribution of signals arriving at all arrival angles throughout the range (0, 2π), and an unmodulated CW signal, the signal spectrum at the antenna terminals is [19]

$$S(v) = \frac{1}{\pi f_d \sqrt{1 - \left(\dfrac{v - f_c}{f_d}\right)^2}} \qquad (18.16)$$

The equality holds for frequency shifts of v that are in the range ±fd about the carrier frequency fc and would be zero outside that range. The shape of the RF Doppler spectrum described by Eq. (18.16) is classically bowl-shaped, as seen in Fig. 18.7(d). Note that the spectral shape is a result of the dense-scatterer channel model. Equation (18.16) has been shown to match experimental data gathered for mobile radio channels [23]; however, different applications yield different spectral shapes. For example, the dense-scatterer model does not hold for the indoor radio channel; the channel model for an indoor area assumes S(v) to be a flat spectrum [24]. In Fig. 18.7(d), the sharpness and steepness of the boundaries of the Doppler spectrum are due to the sharp upper limit on the Doppler shift produced by a vehicular antenna traveling among the stationary scatterers of the dense-scatterer model. The largest magnitude (infinite) of S(v) occurs when the scatterer is directly ahead of the moving antenna platform or directly behind it. In that case the magnitude of the frequency shift is given by

$$f_d = \frac{V}{\lambda} \qquad (18.17)$$

where V is relative velocity and λ is the signal wavelength. fd is positive when the transmitter and receiver move toward each other and negative when moving away from each other. For scatterers directly broadside of the moving platform, the magnitude of the frequency shift is zero. The fact that Doppler components arriving at exactly 0◦ and 180◦ have an infinite power spectral density is not a problem, since the angle of arrival is continuously distributed and the probability of components arriving at exactly these angles is zero [3, 19]. S(v) is the Fourier transform of R(1t). We know that the Fourier transform of the autocorrelation function of a time series is the magnitude squared of the Fourier transform of the original time series. Therefore, measurements can be made by simply transmitting a sinusoid (narrowband signal) and using Fourier analysis to generate the power spectrum of the received amplitude [16]. This Doppler power spectrum of the channel yields knowledge about the spectral spreading of a transmitted sinusoid (impulse in frequency) in the Doppler-shift domain. As indicated in Fig. 18.7, S(v) can be regarded as the dual of the multipath intensity profile, S(τ ), since the latter yields knowledge about the time spreading of a transmitted impulse in the time-delay domain. This is also noted in Fig. 18.1 as the duality between blocks 7 and 16, and in Fig. 18.6 as the duality between the time-spreading mechanism in the time-delay domain and the time-variant mechanism in the Doppler-shift domain. Knowledge of S(v) allows us to glean how much spectral broadening is imposed on the signal as a function of the rate of change in the channel state. The width of the Doppler power spectrum is referred to as the spectral broadening or Doppler spread, denoted by fd , and sometimes called the fading bandwidth of the channel. Equation (18.16) describes the Doppler frequency shift. 
In a typical multipath environment, the received signal arrives from several reflected paths with different path distances and different angles of arrival, and the Doppler shift of each arriving path is generally different from that of another path. The effect on the received signal is seen as a Doppler spreading or spectral broadening of the transmitted signal frequency, rather than a shift. Note that the Doppler spread, fd, and the coherence time, T0, are reciprocally related (within a multiplicative constant). Therefore, we show the approximate relationship between the two parameters as

$$T_0 \approx \frac{1}{f_d} \qquad (18.18)$$

Hence, the Doppler spread fd or 1/T0 is regarded as the typical fading rate of the channel. Earlier, T0 was described as the expected time duration over which the channel’s response to a sinusoid is essentially invariant. When T0 is defined more precisely as the time duration over which the channel’s response to a sinusoid has a correlation of at least 0.5, the relationship between T0 and fd is approximately [4]

$$T_0 \approx \frac{9}{16\pi f_d} \qquad (18.19)$$

A popular “rule of thumb” is to define T0 as the geometric mean of Eqs. (18.18) and (18.19). This yields

$$T_0 = \sqrt{\frac{9}{16\pi f_d^2}} = \frac{0.423}{f_d} \qquad (18.20)$$

For the case of a 900 MHz mobile radio, Fig. 18.10 illustrates the typical effect of Rayleigh fading on a signal’s envelope amplitude vs. time [3]. The figure shows that the distance traveled by the mobile in the time interval corresponding to two adjacent nulls (small-scale fades) is on the order of a half-wavelength (λ/2) [3]. Thus, from Fig. 18.10 and Eq. (18.17), the time (approximately, the coherence time) required to traverse a distance λ/2 when traveling at a constant velocity, V, is:

$$T_0 \approx \frac{\lambda/2}{V} = \frac{0.5}{f_d} \qquad (18.21)$$

FIGURE 18.10: A typical Rayleigh fading envelope at 900 MHz.

Thus, when the interval between fades is taken to be λ/2, as in Fig. 18.10, the resulting expression for T0 in Eq. (18.21) is quite close to the rule-of-thumb shown in Eq. (18.20). Using Eq. (18.21), with the parameters shown in Fig. 18.10 (velocity = 120 km/hr, and carrier frequency = 900 MHz), it is straightforward to compute that the coherence time is approximately 5 ms and the Doppler spread (channel fading rate) is approximately 100 Hz. Therefore, if this example represents a voice-grade channel with a typical transmission rate of $10^4$ symbols/s, the fading rate is considerably less than the symbol rate. Under such conditions, the channel would manifest slow-fading effects. Note that if the abscissa of Fig. 18.10 were labeled in units of wavelength instead of time, the figure would look the same for any radio frequency and any antenna speed.
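The arithmetic behind these figures can be checked with a short script. The sketch below is illustrative only: the function names, the rule labels, and the propagation speed c = 3 × 10^8 m/s are assumptions introduced here and are not part of the chapter. It evaluates Eq. (18.17) and the four coherence-time approximations of Eqs. (18.18)–(18.21).

```python
import math

# Multiplicative constant k in T0 ≈ k / fd for each approximation in the text.
COHERENCE_RULES = {
    "reciprocal": 1.0,                                    # Eq. (18.18)
    "correlation_0.5": 9.0 / (16.0 * math.pi),            # Eq. (18.19)
    "geometric_mean": math.sqrt(9.0 / (16.0 * math.pi)),  # Eq. (18.20), about 0.423
    "half_wavelength": 0.5,                               # Eq. (18.21)
}


def doppler_spread_hz(speed_kmh, carrier_hz, c=3.0e8):
    """Doppler spread fd = V / wavelength, Eq. (18.17). c is an assumed constant."""
    speed_ms = speed_kmh / 3.6
    wavelength_m = c / carrier_hz
    return speed_ms / wavelength_m


def coherence_time_s(fd_hz, rule="half_wavelength"):
    """Coherence time T0 ≈ k / fd for the chosen rule (hypothetical labels)."""
    return COHERENCE_RULES[rule] / fd_hz


fd = doppler_spread_hz(120.0, 900e6)
print(f"Doppler spread: {fd:.1f} Hz")
for rule in COHERENCE_RULES:
    print(f"T0 ({rule}): {coherence_time_s(fd, rule) * 1e3:.2f} ms")
```

With the Fig. 18.10 parameters, the half-wavelength rule gives the 5 ms quoted above, and the geometric-mean rule gives a slightly smaller value, consistent with the remark that the two are close.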

18.9 Analogy Between Spectral Broadening in Fading Channels and Spectral Broadening in Digital Signal Keying

Help is often needed in understanding why spectral broadening of the signal is a function of the fading rate of the channel. Figure 18.11 uses the keying of a digital signal (such as amplitude-shift-keying or frequency-shift-keying) to illustrate an analogous case. Figure 18.11(a) shows that a single tone, cos 2πfc t (−∞ < t < ∞), that exists for all time is characterized in the frequency domain in terms of impulses (at ±fc). This frequency domain representation is ideal (i.e., zero bandwidth), since the tone is pure and never-ending. In practical applications, digital signalling involves switching (keying) signals on and off at a required rate. The keying operation can be viewed as multiplying the infinite-duration tone in Fig. 18.11(a) by an ideal rectangular (switching) function in Fig. 18.11(b). The frequency-domain description of the ideal rectangular function is of the form (sin f)/f. In Fig. 18.11(c), the result of the multiplication yields a tone, cos 2πfc t, that is time-duration limited in the interval −T/2 < t < T/2. The resulting spectrum is obtained by convolving the spectral impulses in part (a) with the (sin f)/f function in part (b), yielding the broadened spectrum in part (c). It is further seen that, if the signalling occurs at a faster rate characterized by the rectangle of shorter duration in part (d), the resulting spectrum of the signal in part (e) exhibits greater spectral broadening.

The changing state of a fading channel is somewhat analogous to the keying on and off of digital signals. The channel behaves like a switch, turning the signal “on” and “off.” The greater the rapidity of the change in the channel state, the greater the spectral broadening of the received signals. The analogy is not exact because the on and off switching of signals may result in phase discontinuities, but the typical multipath-scatterer environment induces phase-continuous effects.
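The keying analogy can be made concrete with a small numerical sketch. The code below is an illustration under stated assumptions, not taken from the chapter: the function names are hypothetical, only the positive-frequency lobe of the gated tone is modelled (reasonable when fc is much larger than 1/T), and the null-to-null width 2/T is used as the measure of broadening.

```python
import math


def gated_tone_spectrum(f_hz, fc_hz, duration_s):
    """Magnitude of the positive-frequency lobe of cos(2*pi*fc*t) gated to |t| < T/2.

    The lobe is (T/2) * sin(x) / x with x = pi * (f - fc) * T; the mirror-image
    lobe centred at -fc is ignored (an assumption of this sketch).
    """
    x = math.pi * (f_hz - fc_hz) * duration_s
    if x == 0.0:
        return duration_s / 2.0
    return abs((duration_s / 2.0) * math.sin(x) / x)


def null_to_null_bandwidth_hz(duration_s):
    """Width of the main spectral lobe, from fc - 1/T to fc + 1/T."""
    return 2.0 / duration_s


for duration in (1e-3, 0.25e-3):  # slower and faster keying (hypothetical values)
    width = null_to_null_bandwidth_hz(duration)
    print(f"T = {duration * 1e3:.2f} ms -> main-lobe width {width:.0f} Hz")
```

Shortening the rectangle from 1 ms to 0.25 ms widens the main lobe by a factor of four, which is the behavior shown in parts (d) and (e) of Fig. 18.11.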

18.10 Degradation Categories due to Time Variance, Viewed in the Doppler-Shift Domain

A channel is referred to as fast fading if the symbol rate, 1/Ts (approximately equal to the signalling rate or bandwidth W), is less than the fading rate, 1/T0 (approximately equal to fd); that is, fast fading is characterized by

$$W < f_d \qquad (18.22a)$$

or

$$T_s > T_0 \qquad (18.22b)$$

Conversely, a channel is referred to as slow fading if the signalling rate is greater than the fading rate. Thus, in order to avoid signal distortion caused by fast fading, the channel must be made to exhibit slow fading by ensuring that the signalling rate exceeds the channel fading rate. That is,

$$W > f_d \qquad (18.23a)$$

or

$$T_s < T_0 \qquad (18.23b)$$

FIGURE 18.11: Analogy between spectral broadening in fading and spectral broadening in keying a digital signal.

In Eq. (18.14), it was shown that due to signal dispersion, the coherence bandwidth, f0, sets an upper limit on the signalling rate which can be used without suffering frequency-selective distortion. Similarly, Eq. (18.23a–18.23b) shows that due to Doppler spreading, the channel fading rate, fd, sets a lower limit on the signalling rate that can be used without suffering fast-fading distortion. For HF communication systems, when teletype or Morse-coded messages were transmitted at a low data rate, the channels were often fast fading. However, most present-day terrestrial mobile-radio channels can generally be characterized as slow fading.

Equation (18.23a–18.23b) doesn’t go far enough in describing what we desire of the channel. A better way to state the requirement for mitigating the effects of fast fading would be that we desire W ≫ fd (or Ts ≪ T0). If this condition is not satisfied, the random frequency modulation (FM) due to varying Doppler shifts will limit the system performance significantly. The Doppler effect yields an irreducible error rate that cannot be overcome by simply increasing Eb/N0 [25]. This irreducible error rate is most pronounced for any modulation that involves switching the carrier phase. A single specular Doppler path, without scatterers, registers an instantaneous frequency shift, classically calculated as fd = V/λ. However, a combination of specular and multipath components yields a rather complex time dependence of instantaneous frequency which can cause much larger frequency swings than ±V/λ when detected by an instantaneous frequency detector (a nonlinear device) [26]. Ideally, coherent demodulators that lock onto and track the information signal should suppress the effect of this FM noise and thus cancel the impact of Doppler shift. However, for large values of fd, carrier recovery becomes a problem because very wideband (relative to the data rate) phase-lock loops (PLLs) need to be designed.
For voice-grade applications with bit-error rates of $10^{-3}$ to $10^{-4}$, a large value of Doppler shift is considered to be on the order of 0.01 × W. Therefore, to avoid fast-fading distortion and the Doppler-induced irreducible error rate, the signalling rate should exceed the fading rate by a factor of 100 to 200 [27]. The exact factor depends on the signal modulation, receiver design, and required error-rate [3], [26]–[29]. Davarian [29] showed that a frequency-tracking loop can help lower, but not completely remove, the irreducible error rate in a mobile system when using differential minimum-shift keyed (DMSK) modulation.

18.11 Mitigation Methods

Figure 18.12, subtitled “The Good, The Bad, and The Awful,” highlights three major performance categories in terms of bit-error probability, PB, vs. Eb/N0. The leftmost exponentially-shaped curve represents the performance that can be expected when using any nominal modulation type in AWGN. Observe that with a reasonable amount of Eb/N0, good performance results. The middle curve, referred to as the Rayleigh limit, shows the performance degradation resulting from a loss in SNR that is characteristic of flat fading or slow fading when there is no line-of-sight signal component present. The curve is a function of the reciprocal of Eb/N0 (an inverse-linear function), so for reasonable values of SNR, performance will generally be “bad.” In the case of Rayleigh fading, parameters with overbars are often introduced to indicate that a mean is being taken over the “ups” and “downs” of the fading experience. Therefore, one often sees such bit-error probability plots with mean parameters denoted by $\bar{P}_B$ and $\bar{E}_b/N_0$. The curve that reaches an irreducible level, sometimes called an error floor, represents “awful” performance, where the bit-error probability can approach the value of 0.5. This shows the severe distorting effects of frequency-selective fading or fast fading.

FIGURE 18.12: Error performance: The good, the bad, and the awful.
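The gap between the “good” and “bad” curves can be illustrated numerically. The sketch below assumes coherent BPSK, which the chapter does not single out (it speaks of any nominal modulation type), and uses the standard closed-form expressions for BPSK in AWGN and in slow, flat Rayleigh fading. The function names are hypothetical, and the “awful” error-floor curve is not modelled.

```python
import math


def ber_bpsk_awgn(ebn0_db):
    """Bit-error probability of coherent BPSK in AWGN: Q(sqrt(2*Eb/N0))."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return 0.5 * math.erfc(math.sqrt(ebn0))


def ber_bpsk_rayleigh(mean_ebn0_db):
    """Mean bit-error probability of coherent BPSK in slow, flat Rayleigh fading.

    For large mean Eb/N0 this approaches 1 / (4 * mean Eb/N0), i.e. the
    inverse-linear behavior of the Rayleigh limit.
    """
    g = 10.0 ** (mean_ebn0_db / 10.0)
    return 0.5 * (1.0 - math.sqrt(g / (1.0 + g)))


for db in (0, 10, 20, 30):
    print(f"{db:2d} dB  AWGN {ber_bpsk_awgn(db):.3e}  Rayleigh {ber_bpsk_rayleigh(db):.3e}")
```

Each additional 10 dB lowers the Rayleigh-limit error probability by only about a factor of ten, whereas the AWGN curve falls off exponentially.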


If the channel introduces signal distortion as a result of fading, the system performance can exhibit an irreducible error rate; when larger than the desired error rate, no amount of Eb /N0 will help achieve the desired level of performance. In such cases, the general approach for improving performance is to use some form of mitigation to remove or reduce the distortion. The mitigation method depends on whether the distortion is caused by frequency-selective fading or fast fading. Once the distortion has been mitigated, the PB vs. Eb /N0 performance should have transitioned from the “awful” bottoming out curve to the merely “bad” Rayleigh limit curve. Next, we can further ameliorate the effects of fading and strive to approach AWGN performance by using some form of diversity to provide the receiver with a collection of uncorrelated samples of the signal, and by using a powerful error-correction code. In Fig. 18.13, several mitigation techniques for combating the effects of both signal distortion and loss in SNR are listed. Just as Figs. 18.1 and 18.6 serve as a guide for characterizing fading phenomena and their effects, Fig. 18.13 can similarly serve to describe mitigation methods that can be used to ameliorate the effects of fading. The mitigation approach to be used should follow two basic steps: first, provide distortion mitigation; second, provide diversity.

18.11.1 Mitigation to Combat Frequency-Selective Distortion

• Equalization can compensate for the channel-induced ISI that is seen in frequency-selective fading. That is, it can help move the operating point from the error-performance curve that is “awful” in Fig. 18.12 to the one that is “bad.” The process of equalizing the ISI involves some method of gathering the dispersed symbol energy back together into its original time interval. In effect, equalization involves insertion of a filter to make the combination of channel and filter yield a flat response with linear phase. The phase linearity is achieved by making the equalizer filter the complex conjugate of the time reverse of the dispersed pulse [30]. Because in a mobile system the channel response varies with time, the equalizer filter must also change or adapt to the time-varying channel. Such equalizer filters are, therefore, called adaptive equalizers. An equalizer accomplishes more than distortion mitigation; it also provides diversity. Since distortion mitigation is achieved by gathering the dispersed symbol’s energy back into the symbol’s original time interval so that it doesn’t hamper the detection of other symbols, the equalizer is simultaneously providing each received symbol with energy that would otherwise be lost.
• The decision feedback equalizer (DFE) has a feedforward section that is a linear transversal filter [30] whose length and tap weights are selected to coherently combine virtually all of the current symbol’s energy. The DFE also has a feedback section which removes energy that remains from previously detected symbols [14], [30]–[32]. The basic idea behind the DFE is that once an information symbol has been detected, the ISI that it induces on future symbols can be estimated and subtracted before the detection of subsequent symbols.
• The maximum-likelihood sequence estimation (MLSE) equalizer tests all possible data sequences (rather than decoding each received symbol by itself) and chooses the data sequence that is the most probable of the candidates. The MLSE equalizer was first proposed by Forney [33] when he implemented the equalizer using the Viterbi decoding algorithm [34]. The MLSE is optimal in the sense that it minimizes the probability of a sequence error. Because the Viterbi decoding algorithm is the way in which the MLSE equalizer is typically implemented, the equalizer is often referred to as the Viterbi equalizer. Later in this chapter, we illustrate the adaptive equalization performed in the Global System for Mobile Communications (GSM) using the Viterbi equalizer.

FIGURE 18.13: Basic mitigation types.

• Spread-spectrum techniques can be used to mitigate frequency-selective ISI distortion because the hallmark of any spread-spectrum system is its capability to reject interference, and ISI is a type of interference. Consider a direct-sequence spread-spectrum (DS/SS) binary phase shift keying (PSK) communication channel comprising one direct path and one reflected path. Assume that the propagation from transmitter to receiver results in a multipath wave that is delayed by τk compared to the direct wave. If the receiver is synchronized to the waveform arriving via the direct path, the received signal, r(t), neglecting noise, can be expressed as

$$r(t) = A x(t) g(t) \cos(2\pi f_c t) + \alpha A x(t - \tau_k) g(t - \tau_k) \cos(2\pi f_c t + \Theta) \qquad (18.24)$$

where x(t) is the data signal, g(t) is the pseudonoise (PN) spreading code, and τk is the differential time delay between the two paths. The angle Θ is a random phase, assumed to be uniformly distributed in the range (0, 2π), and α is the attenuation of the multipath signal relative to the direct path signal. The receiver multiplies the incoming r(t) by the code g(t). If the receiver is synchronized to the direct path signal, multiplication by the code signal yields

$$A x(t) g^2(t) \cos(2\pi f_c t) + \alpha A x(t - \tau_k) g(t) g(t - \tau_k) \cos(2\pi f_c t + \Theta) \qquad (18.25)$$

where g²(t) = 1, and if τk is greater than the chip duration, then

$$\left| \int g^*(t)\, g(t - \tau_k)\, dt \right| \ll \left| \int g^*(t)\, g(t)\, dt \right| \qquad (18.26)$$

over some appropriate interval of integration (correlation), where ∗ indicates complex conjugate, and τk is equal to or larger than the PN chip duration. Thus, the spread spectrum system effectively eliminates the multipath interference by virtue of its code-correlation receiver. Even though channel-induced ISI is typically transparent to DS/SS systems, such systems suffer from the loss in energy contained in all the multipath components not seen by the receiver. The need to gather up this lost energy belonging to the received chip was the motivation for developing the Rake receiver [35]–[37]. The Rake receiver dedicates a separate correlator to each multipath component (finger). It is able to coherently add the energy from each finger by selectively delaying them (the earliest component gets the longest delay) so that they can all be coherently combined.
• Earlier, we described a channel that could be classified as flat fading, but occasionally exhibits frequency-selective distortion when the null of the channel’s frequency transfer function occurs at the center of the signal band. The use of DS/SS is a good way to mitigate such distortion because the wideband SS signal would span many lobes of the selectively faded frequency response. Hence, a great deal of pulse energy would then be passed by the scatterer medium, in contrast to the nulling effect on a relatively narrowband signal [see Fig. 18.8(c)] [18].
• Frequency-hopping spread-spectrum (FH/SS) can be used to mitigate the distortion due to frequency-selective fading, provided the hopping rate is at least equal to the symbol rate. Compared to DS/SS, mitigation takes place through a different mechanism. FH receivers avoid multipath losses by rapid changes in the transmitter frequency band, thus avoiding the interference by changing the receiver band position before the arrival of the multipath signal.

• Orthogonal frequency-division multiplexing (OFDM) can be used in frequency-selective fading channels to avoid the use of an equalizer by lengthening the symbol duration. The signal band is partitioned into multiple subbands, each one exhibiting a lower symbol rate than the original band. The subbands are then transmitted on multiple orthogonal carriers. The goal is to reduce the symbol rate (signalling rate), W ≈ 1/Ts, on each carrier to be less than the channel’s coherence bandwidth f0. OFDM was originally referred to as Kineplex. The technique has been implemented in the U.S. in mobile radio systems [38], and has been chosen by the European community under the name Coded OFDM (COFDM), for high-definition television (HDTV) broadcasting [39].
• Pilot signal is the name given to a signal intended to facilitate the coherent detection of waveforms. Pilot signals can be implemented in the frequency domain as an in-band tone [40], or in the time domain as a pilot sequence, which can also provide information about the channel state and thus improve performance in fading [41].
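The multipath rejection expressed by Eqs. (18.24)–(18.26) in the spread-spectrum item above can be sketched at baseband. The code is a simplified illustration with several assumptions that are not in the chapter: a length-31 maximal-length PN code generated by the recurrence a[n] = a[n−5] XOR a[n−3], integer-chip delays, a cyclic (periodic) correlation over one code period, Θ = 0, and the same data symbol on both paths. All function names are hypothetical.

```python
def m_sequence(length=31):
    """±1 chips from a 5-stage linear-feedback recurrence (period 31)."""
    bits = [1, 0, 0, 0, 0]  # any nonzero seed works for a maximal-length code
    while len(bits) < length:
        bits.append(bits[-5] ^ bits[-3])
    return [1 - 2 * b for b in bits[:length]]  # map 0 -> +1, 1 -> -1


def cyclic_correlation(code, shift):
    """Correlation of the code with a copy of itself delayed by `shift` chips."""
    n = len(code)
    return sum(code[i] * code[(i - shift) % n] for i in range(n))


def despread(alpha, delay_chips, code):
    """Correlator outputs for the direct path and for one delayed path.

    Returns (direct_term, multipath_term), the two contributions that
    Eq. (18.26) compares; alpha is the relative multipath attenuation.
    """
    direct_term = cyclic_correlation(code, 0)
    multipath_term = alpha * cyclic_correlation(code, delay_chips)
    return direct_term, multipath_term


code = m_sequence()
print(despread(alpha=0.5, delay_chips=3, code=code))
```

For this code the direct path correlates to the full code length, while any path delayed by one or more chips contributes a term whose magnitude is smaller by a factor on the order of the code length.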

18.11.2 Mitigation to Combat Fast-Fading Distortion

• For fast fading distortion, use a robust modulation (noncoherent or differentially coherent) that does not require phase tracking, and reduce the detector integration time [20].
• Increase the symbol rate, W ≈ 1/Ts, to be greater than the fading rate, fd ≈ 1/T0, by adding signal redundancy.
• Error-correction coding and interleaving can provide mitigation because instead of providing more signal energy, a code reduces the required Eb/N0. For a given Eb/N0, with coding present, the error floor will be lowered compared to the uncoded case.
• An interesting filtering technique can provide mitigation in the event of fast-fading distortion and frequency-selective distortion occurring simultaneously. The frequency-selective distortion can be mitigated by the use of an OFDM signal set. Fast fading, however, will typically degrade conventional OFDM because the Doppler spreading corrupts the orthogonality of the OFDM subcarriers. A polyphase filtering technique [42] is used to provide time-domain shaping and duration extension to reduce the spectral sidelobes of the signal set and thus help preserve its orthogonality. The process introduces known ISI and adjacent channel interference (ACI) which are then removed by a post-processing equalizer and canceling filter [43].

18.11.3 Mitigation to Combat Loss in SNR

After implementing some form of mitigation to combat the possible distortion (frequency-selective or fast fading), the next step is to use some form of diversity to move the operating point from the error-performance curve labeled as “bad” in Fig. 18.12 to a curve that approaches AWGN performance. The term “diversity” is used to denote the various methods available for providing the receiver with uncorrelated renditions of the signal. Uncorrelated is the important feature here, since it would not help the receiver to have additional copies of the signal if the copies were all equally poor. Listed below are some of the ways in which diversity can be implemented.
• Time diversity—Transmit the signal on L different time slots with time separation of at least T0. Interleaving, often used with error-correction coding, is a form of time diversity.
• Frequency diversity—Transmit the signal on L different carriers with frequency separation of at least f0. Bandwidth expansion is a form of frequency diversity. The signal bandwidth, W, is expanded to be greater than f0, thus providing the receiver with several independently fading signal replicas. This achieves frequency diversity of the order L = W/f0. Whenever W is made larger than f0, there is the potential for frequency-selective distortion unless we further provide some mitigation such as equalization. Thus, an expanded bandwidth can improve system performance (via diversity) only if the frequency-selective distortion the diversity may have introduced is mitigated.
• Spread spectrum is a form of bandwidth expansion that excels at rejecting interfering signals. In the case of direct-sequence spread-spectrum (DS/SS), it was shown earlier that multipath components are rejected if they are delayed by more than one chip duration. However, in order to approach AWGN performance, it is necessary to compensate for the loss in energy contained in those rejected components. The Rake receiver (described later) makes it possible to coherently combine the energy from each of the multipath components arriving along different paths. Thus, used with a Rake receiver, DS/SS modulation can be said to achieve path diversity. The Rake receiver is needed in phase-coherent reception, but in differentially coherent bit detection, a simple delay line (one bit long) with complex conjugation will do the trick [44].
• Frequency-hopping spread-spectrum (FH/SS) is sometimes used as a diversity mechanism. The GSM system uses slow FH (217 hops/s) to compensate for those cases where the mobile user is moving very slowly (or not at all) and happens to be in a spectral null.
• Spatial diversity is usually accomplished through the use of multiple receive antennas, separated by a distance of at least 10 wavelengths for a base station (much less for a mobile station). Signal processing must be employed to choose the best antenna output or to coherently combine all the outputs. Systems have also been implemented with multiple spaced transmitters; an example is the Global Positioning System (GPS).
• Polarization diversity [45] is yet another way to achieve additional uncorrelated samples of the signal.
• Any diversity scheme may be viewed as a trivial form of repetition coding in space or time. However, there exist techniques for mitigating the loss in SNR in a fading channel that are more efficient and more powerful than repetition coding. Error-correction coding represents a unique mitigation technique, because instead of providing more signal energy it reduces the required Eb/N0 in order to accomplish the desired error performance. Error-correction coding coupled with interleaving [20], [46]–[51] is probably the most prevalent of the mitigation schemes used to provide improved performance in a fading environment.

18.12 Summary of the Key Parameters Characterizing Fading Channels

We summarize the conditions that must be met so that the channel does not introduce frequency-selective distortion and fast-fading distortion. Combining the inequalities of Eqs. (18.14) and (18.23a–18.23b), we obtain

$$f_0 > W > f_d \qquad (18.27a)$$

or

$$T_m < T_s < T_0 \qquad (18.27b)$$

In other words, we want the channel coherence bandwidth to exceed our signalling rate, which in turn should exceed the fading rate of the channel. Recall that without distortion mitigation, f0 sets an upper limit on signalling rate, and fd sets a lower limit on it.
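The combined condition of Eq. (18.27a) lends itself to a simple check. The following sketch is illustrative: the function name, the returned labels, and the numerical values in the usage lines are hypothetical, and the strict inequalities ignore the stronger requirement W ≫ fd discussed earlier.

```python
def classify_channel(f0_hz, w_hz, fd_hz):
    """Classify distortion from coherence bandwidth f0, signalling rate W,
    and fading rate fd, using the strict inequalities of Eq. (18.27a)."""
    frequency_selective = w_hz > f0_hz  # signalling rate exceeds coherence bandwidth
    fast_fading = w_hz < fd_hz          # signalling rate below the fading rate
    if frequency_selective and fast_fading:
        return "frequency-selective and fast fading"
    if frequency_selective:
        return "frequency-selective fading"
    if fast_fading:
        return "fast fading"
    return "flat and slow fading"


# Hypothetical values, chosen only to exercise each branch.
print(classify_channel(f0_hz=100e3, w_hz=10e3, fd_hz=100.0))
print(classify_channel(f0_hz=100e3, w_hz=50.0, fd_hz=100.0))
print(classify_channel(f0_hz=100e3, w_hz=200e3, fd_hz=100.0))
print(classify_channel(f0_hz=1e3, w_hz=5e3, fd_hz=10e3))
```

The last three calls correspond, in order, to the fast-fading, frequency-selective, and combined cases.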

18.12.1 Fast-Fading Distortion: Example #1

If the inequalities of Eq. (18.27a–18.27b) are not met and distortion mitigation is not provided, distortion will result. Consider the fast-fading case where the signalling rate is less than the channel fading rate, that is,

$$f_0 > W < f_d \qquad (18.28)$$

Mitigation consists of using one or more of the following methods. (See Fig. 18.13).
• Choose a modulation/demodulation technique that is most robust under fast-fading conditions. That means, for example, avoiding carrier recovery with PLLs since the fast fading could keep a PLL from achieving lock conditions.
• Incorporate sufficient redundancy so that the transmission symbol rate exceeds the channel fading rate. As long as the transmission symbol rate does not exceed the coherence bandwidth, the channel can be classified as flat fading. However, even flat-fading channels will experience frequency-selective distortion whenever a channel null appears at the band center. Since this happens only occasionally, mitigation might be accomplished by adequate error-correction coding and interleaving.
• The above two mitigation approaches should result in the demodulator operating at the Rayleigh limit [20] (see Fig. 18.12). However, there may be an irreducible floor in the error-performance vs. Eb/N0 curve due to the FM noise that results from the random Doppler spreading. The use of an in-band pilot tone and a frequency-control loop can lower this irreducible performance level.
• To avoid this error floor caused by random Doppler spreading, increase the signalling rate above the fading rate still further (100–200 × fading rate) [27]. This is one architectural motive behind time-division multiple access (TDMA) mobile systems.
• Incorporate error-correction coding and interleaving to lower the floor and approach AWGN performance.

18.12.2 Frequency-Selective Fading Distortion: Example #2

Consider the frequency-selective case where the coherence bandwidth is less than the symbol rate; that is,

$$f_0 < W > f_d \qquad (18.29)$$

Mitigation consists of using one or more of the following methods. (See Fig. 18.13).
• Since the transmission symbol rate exceeds the channel-fading rate, there is no fast-fading distortion. Mitigation of frequency-selective effects is necessary. One or more of the following techniques may be considered:
• Adaptive equalization, spread spectrum (DS or FH), OFDM, pilot signal. The European GSM system uses a midamble training sequence in each transmission time slot so that the receiver can learn the impulse response of the channel. It then uses a Viterbi equalizer (explained later) for mitigating the frequency-selective distortion.
• Once the distortion effects have been reduced, introduce some form of diversity and error-correction coding and interleaving in order to approach AWGN performance. For direct-sequence spread-spectrum (DS/SS) signalling, a Rake receiver (explained later) may be used for providing diversity by coherently combining multipath components that would otherwise be lost.

18.12.3 Fast-Fading and Frequency-Selective Fading Distortion: Example #3

Consider the case where the coherence bandwidth is less than the signalling rate, which in turn is less than the fading rate. The channel exhibits both fast-fading and frequency-selective fading which is expressed as

$$f_0 < W < f_d \qquad (18.30a)$$

or

$$f_0 < f_d \qquad (18.30b)$$

Recalling from Eq. (18.27a–18.27b) that f0 sets an upper limit on signalling rate and fd sets a lower limit on it, this is a difficult design problem because, unless distortion mitigation is provided, the maximum allowable signalling rate is (in the strict terms of the above discussion) less than the minimum allowable signalling rate. Mitigation in this case is similar to the initial approach outlined in example #1.
• Choose a modulation/demodulation technique that is most robust under fast-fading conditions.
• Use transmission redundancy in order to increase the transmitted symbol rate.
• Provide some form of frequency-selective mitigation in a manner similar to that outlined in example #2.
• Once the distortion effects have been reduced, introduce some form of diversity and error-correction coding and interleaving in order to approach AWGN performance.

18.13 The Viterbi Equalizer as Applied to GSM

Figure 18.14 shows the GSM time-division multiple access (TDMA) frame, having a duration of 4.615 ms and comprising 8 slots, one assigned to each active mobile user. A normal transmission burst occupying one slot of time contains 57 message bits on each side of a 26-bit midamble called a training or sounding sequence. The slot-time duration is 0.577 ms (or the slot rate is 1733 slots/s). The purpose of the midamble is to assist the receiver in estimating the impulse response of the channel in an adaptive way (during the time duration of each 0.577 ms slot). In order for the technique to be effective, the fading behavior of the channel should not change appreciably during the time interval of one slot. In other words, there should not be any fast-fading degradation during a slot time when the receiver is using knowledge from the midamble to compensate for the channel’s fading behavior.

FIGURE 18.14: The GSM TDMA frame and time-slot containing a normal burst.

Consider the example of a GSM receiver used aboard a high-speed train, traveling at a constant velocity of 200 km/hr (55.56 m/s). Assume the carrier frequency to be 900 MHz (the wavelength is λ = 0.33 m). From Eq. (18.21), we can calculate that a half-wavelength is traversed in approximately the time (coherence time)

$$T_0 \approx \frac{\lambda/2}{V} \approx 3\ \text{ms} \qquad (18.31)$$

Therefore, the channel coherence time is over 5 times greater than the slot time of 0.577 ms. The time needed for a significant change in fading behavior is relatively long compared to the time duration of one slot. Note that the choices made in the design of the GSM TDMA slot time and midamble were undoubtedly influenced by the need to preclude fast fading with respect to a slot-time duration, as in this example.

The GSM symbol rate (or bit rate, since the modulation is binary) is 271 kilosymbols/s and the bandwidth is W = 200 kHz. If we consider that the typical rms delay spread in an urban environment is on the order of στ = 2 µs, then using Eq. (18.13) the resulting coherence bandwidth is f0 ≈ 100 kHz. It should therefore be apparent that since f0 < W, the GSM receiver must utilize some form of mitigation to combat frequency-selective distortion. To accomplish this goal, the Viterbi equalizer is typically implemented.

Figure 18.15 illustrates the basic functional blocks used in a GSM receiver for estimating the channel impulse response, which is then used to provide the detector with channel-corrected reference waveforms [52]. In the final step, the Viterbi algorithm is used to compute the MLSE of the message. As stated in Eq. (18.2), a received signal can be described in terms of the transmitted signal convolved with the impulse response of the channel, hc(t).
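The GSM figures quoted above (a coherence time of about 3 ms, more than five slot times, and a coherence bandwidth of about 100 kHz) can be reproduced with a few lines of arithmetic. The sketch is illustrative and rests on two assumptions not confirmed by this section: that Eq. (18.13) has the form f0 ≈ 1/(5στ), which is consistent with the 100 kHz value quoted for στ = 2 µs, and that c = 3 × 10^8 m/s. The function and key names are hypothetical.

```python
def gsm_fading_check(speed_kmh=200.0, carrier_hz=900e6, slot_s=0.577e-3,
                     rms_delay_spread_s=2e-6, bandwidth_hz=200e3, c=3.0e8):
    """Coherence time vs. slot time and coherence bandwidth vs. signal bandwidth."""
    speed_ms = speed_kmh / 3.6
    wavelength_m = c / carrier_hz
    t0_s = (wavelength_m / 2.0) / speed_ms   # Eq. (18.21)
    f0_hz = 1.0 / (5.0 * rms_delay_spread_s)  # assumed form of Eq. (18.13)
    return {
        "coherence_time_s": t0_s,
        "slots_per_coherence_time": t0_s / slot_s,
        "coherence_bandwidth_hz": f0_hz,
        "frequency_selective": bandwidth_hz > f0_hz,
    }


for name, value in gsm_fading_check().items():
    print(name, value)
```

The result shows the two conclusions drawn in the text: the channel is slow fading with respect to one slot, and it is frequency selective with respect to the 200 kHz bandwidth.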
We show this convolution relationship below, using the notation of a received training sequence, rtr(t), and the transmitted training sequence, str(t), as follows:

$$r_{tr}(t) = s_{tr}(t) * h_c(t) \qquad (18.32)$$

where ∗ denotes convolution. At the receiver, rtr(t) is extracted from the normal burst and sent to a filter having impulse response, hmf(t), that is matched to str(t). This matched filter yields at its output an estimate of hc(t), denoted he(t), developed from Eq. (18.32) as follows.

$$h_e(t) = r_{tr}(t) * h_{mf}(t) = s_{tr}(t) * h_c(t) * h_{mf}(t) = R_s(t) * h_c(t) \qquad (18.33)$$

FIGURE 18.15: The Viterbi equalizer as applied to GSM.

where Rs(t) is the autocorrelation function of str(t). If Rs(t) is a highly peaked (impulse-like) function, then he(t) ≈ hc(t). Next, using a windowing function, w(t), we truncate he(t) to form a computationally affordable function, hw(t). The window length must be large enough to compensate for the effect of typical channel-induced ISI. The required observation interval L0 for the window can be expressed as the sum of two contributions. The interval of length LCISI is due to the controlled ISI caused by Gaussian filtering of the baseband pulses, which are then MSK modulated. The interval of length LC is due to the channel-induced ISI caused by multipath propagation; therefore, L0 can be written as

$$L_0 = L_{CISI} + L_C \qquad (18.34)$$

The GSM system is required to provide mitigation for distortion due to signal dispersions of approximately 15–20 µs. The bit duration is 3.69 µs. Thus, the Viterbi equalizer used in GSM has a memory of 4–6 bit intervals. For each $L_0$-bit interval in the message, the function of the Viterbi equalizer is to find the most likely $L_0$-bit sequence out of the $2^{L_0}$ possible sequences that might have been transmitted. Determining the most likely $L_0$-bit sequence requires that $2^{L_0}$ meaningful reference waveforms be created by modifying (or disturbing) the $2^{L_0}$ ideal waveforms in the same way that the channel has disturbed the transmitted message. Therefore, the $2^{L_0}$ reference waveforms are convolved with the windowed estimate of the channel impulse response, $h_w(t)$, in order to derive the disturbed or channel-corrected reference waveforms. Next, the channel-corrected reference waveforms are compared against the received data waveforms to yield metric calculations. However, before the comparison takes place, the received data waveforms are convolved with the known windowed autocorrelation function $w(t)R_s(t)$, transforming them in a manner comparable to that applied to the reference waveforms. This filtered message signal is compared to all possible $2^{L_0}$ channel-corrected reference signals, and metrics are computed as required by the Viterbi decoding algorithm (VDA). The VDA yields the maximum likelihood estimate of the transmitted sequence [34].
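The estimation and detection steps above can be illustrated with a small numerical sketch. It is not the GSM receiver itself: the Barker-13 code stands in for the real midamble, the 3-tap channel `h_c` and the 6-bit message are hypothetical, symbols are antipodal and symbol-spaced, and an exhaustive search over all $2^{L_0}$ candidates replaces the trellis search that the Viterbi algorithm performs efficiently. The prefiltering of the received data by $w(t)R_s(t)$ is omitted for brevity.

```python
import itertools
import numpy as np

# Barker-13 code as a stand-in training sequence (assumption: the real GSM
# midamble is a different sequence; Barker-13 is used for its peaked R_s).
s_tr = np.array([1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1], dtype=float)

# Hypothetical symbol-spaced channel impulse response h_c.
h_c = np.array([1.0, 0.5, 0.2])

# Eq. (18.32): received training sequence r_tr = s_tr * h_c.
r_tr = np.convolve(s_tr, h_c)

# Eq. (18.33): matched filter (time-reversed training sequence), normalized
# by the autocorrelation peak so that h_e approximates h_c.
h_mf = s_tr[::-1]
h_e = np.convolve(r_tr, h_mf) / len(s_tr)

# Windowing: keep three taps starting at the autocorrelation peak.
peak = len(s_tr) - 1
h_w = h_e[peak:peak + len(h_c)]

def mlse_exhaustive(rx, h_w, L0):
    """Pick the L0-symbol sequence whose channel-corrected reference is closest to rx."""
    best, best_metric = None, np.inf
    for cand in itertools.product([-1.0, 1.0], repeat=L0):
        ref = np.convolve(cand, h_w)       # channel-corrected reference waveform
        metric = np.sum((rx - ref) ** 2)   # squared Euclidean distance metric
        if metric < best_metric:
            best, best_metric = np.array(cand), metric
    return best

# Hypothetical short message passed through the true channel plus mild noise.
L0 = 6
rng = np.random.default_rng(1)
tx_bits = np.array([1, -1, -1, 1, 1, -1], dtype=float)
rx = np.convolve(tx_bits, h_c) + 0.05 * rng.standard_normal(L0 + len(h_c) - 1)

detected = mlse_exhaustive(rx, h_w, L0)
print("estimated taps:", np.round(h_w, 3))
print("detected bits :", detected)
```

Because the Barker-13 autocorrelation has small but nonzero sidelobes, `h_w` is close to, but not exactly equal to, `h_c`; this is the sense in which $h_e(t) \approx h_c(t)$ when $R_s(t)$ is impulse-like.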

18.14 The Rake Receiver Applied to Direct-Sequence Spread-Spectrum (DS/SS) Systems

Interim Standard 95 (IS-95) describes a DS/SS cellular system that uses a Rake receiver [35]–[37] to provide path diversity. In Fig. 18.16, five instances of chip transmissions corresponding to the code sequence 1 0 1 1 1 are shown, with the transmission or observation times labeled $t_{-4}$ for the earliest transmission and $t_0$ for the latest. Each abscissa shows three “fingers” of a signal that arrive at the receiver with delay times $\tau_1$, $\tau_2$, and $\tau_3$. Assume that the intervals between the $t_i$ transmission times and the intervals between the $\tau_i$ delay times are each one chip long. From this, one can conclude that the finger arriving at the receiver at time $t_{-4}$, with delay $\tau_3$, is time coincident with two other fingers, namely the fingers arriving at times $t_{-3}$ and $t_{-2}$ with delays $\tau_2$ and $\tau_1$, respectively. Since, in this example, the delayed components are separated by exactly one chip time, they are just resolvable.

At the receiver, there must be a sounding device that is dedicated to estimating the $\tau_i$ delay times. Note that for a terrestrial mobile radio system, the fading rate is relatively slow (milliseconds), or the channel coherence time is large compared to the chip time ($T_0 > T_{ch}$). Hence, the changes in $\tau_i$ occur slowly enough that the receiver can readily adapt to them. Once the $\tau_i$ delays are estimated, a separate correlator is dedicated to processing each finger. In this example, there would be three such dedicated correlators, each one processing a delayed version of the same chip sequence 1 0 1 1 1. In Fig. 18.16, each correlator receives chips with power profiles represented by the sequence of fingers shown along a diagonal line. Each correlator attempts to match these arriving chips with the same PN code, similarly delayed in time. At the end of a symbol interval (typically there may be hundreds or thousands of chips per symbol), the outputs of the correlators are coherently combined, and a symbol detection is made.

At the chip level, the Rake receiver resembles an equalizer, but its real function is to provide diversity. The interference-suppression nature of DS/SS systems stems from the fact that a code sequence arriving at the receiver merely one chip time late will be approximately orthogonal to the particular PN code with which the sequence is correlated. Therefore, any code chips that are delayed by one or more chip times will be suppressed by the correlator. The delayed chips only contribute to raising the noise floor (correlation sidelobes). The mitigation provided by the Rake receiver can be termed path diversity, since it allows the energy of a chip that arrives via multiple paths to be combined coherently. Without the Rake receiver, this energy would be transparent and therefore lost to the DS/SS system. In Fig. 18.16, looking vertically above point $\tau_3$, it is clear that there is interchip interference due to different fingers arriving simultaneously. The spread-spectrum processing gain allows the system to endure such interference at the chip level. No other equalization is deemed necessary in IS-95.
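A three-finger Rake receiver of the kind described above can be sketched as follows. Everything specific here is an assumption made for illustration: a length-31 m-sequence stands in for the PN code, the path gains are real and known to the receiver, the delays are exactly 0, 1, and 2 chips, noise is omitted, and the correlator outputs are combined with maximal-ratio weights (the text says only that they are combined coherently).

```python
import numpy as np

def m_sequence_31():
    """Length-31 PN code from a 5-stage LFSR with recurrence a[n] = a[n-2] XOR a[n-5]."""
    state = [1, 1, 1, 1, 1]
    chips = []
    for _ in range(31):
        chips.append(state[-1])
        feedback = state[1] ^ state[4]
        state = [feedback] + state[:-1]
    return 1.0 - 2.0 * np.array(chips)      # map bits {0, 1} to chips {+1, -1}

code = m_sequence_31()
symbols = np.array([1, -1, -1, 1, -1], dtype=float)
tx = np.kron(symbols, code)                 # spread every symbol by the PN code

delays = [0, 1, 2]                          # finger delays in chips (hypothetical)
gains = [0.8, 0.5, 0.3]                     # real path gains (hypothetical)

# Multipath channel: sum of delayed, scaled copies of the transmitted chips.
rx = np.zeros(len(tx) + max(delays))
for d, g in zip(delays, gains):
    rx[d:d + len(tx)] += g * tx

def rake_detect(rx, code, delays, gains, n_symbols):
    """One correlator per finger, then coherent (maximal-ratio) combining."""
    n_chips = len(code)
    decisions = []
    for k in range(n_symbols):
        combined = 0.0
        for d, g in zip(delays, gains):
            start = k * n_chips + d
            finger = np.dot(rx[start:start + n_chips], code)  # correlate with delayed code
            combined += g * finger                            # weight by the path gain
        decisions.append(1.0 if combined >= 0 else -1.0)
    return np.array(decisions)

detected = rake_detect(rx, code, delays, gains, len(symbols))
print(detected)
```

Each correlator sees the other two paths only through the code's correlation sidelobes, which is why chips delayed by one or more chip times are largely suppressed while the aligned path contributes its full energy.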

FIGURE 18.16: Example of received chips seen by a 3-finger rake receiver.

18.15 Conclusion

In this chapter, the major elements that contribute to fading in a communication channel have been characterized. Figure 18.1 was presented as a guide for the characterization of fading phenomena. Two types of fading, large-scale and small-scale, were described. Two manifestations of small-scale fading (signal dispersion and fading rapidity) were examined, and the examination involved two views, time and frequency. Two degradation categories were defined for dispersion: frequency-selective fading and flat-fading. Two degradation categories were defined for fading rapidity: fast and slow. The small-scale fading degradation categories were summarized in Fig. 18.6. A mathematical model using correlation and power density functions was presented in Fig. 18.7. This model yields a nice symmetry, a kind of “poetry” to help us view the Fourier transform and duality relationships that describe the fading phenomena. Further, mitigation techniques for ameliorating the effects of each degradation category were treated, and these techniques were summarized in Fig. 18.13. Finally, mitigation methods that have been implemented in two system types, GSM and CDMA systems meeting IS-95, were described.

References

[1] Sklar, B., Digital Communications: Fundamentals and Applications, Prentice-Hall, Englewood Cliffs, NJ, Ch. 4, 1988.
[2] Van Trees, H.L., Detection, Estimation, and Modulation Theory, Part I, John Wiley & Sons, New York, Ch. 4, 1968.
[3] Rappaport, T.S., Wireless Communications, Prentice-Hall, Upper Saddle River, NJ, Chs. 3 and 4, 1996.
[4] Greenwood, D. and Hanzo, L., Characterisation of Mobile Radio Channels, Mobile Radio Communications, Steele, R., Ed., Pentech Press, London, Ch. 2, 1994.
[5] Lee, W.C.Y., Elements of cellular mobile radio systems, IEEE Trans. Vehicular Technol., V35(2), 48–56, May 1986.
[6] Okumura, Y. et al., Field strength and its variability in VHF and UHF land mobile radio service, Rev. Elec. Comm. Lab., 16(9-10), 825–873, 1968.
[7] Hata, M., Empirical formulæ for propagation loss in land mobile radio services, IEEE Trans. Vehicular Technol., VT-29(3), 317–325, 1980.
[8] Seidel, S.Y. et al., Path loss, scattering and multipath delay statistics in four European cities for digital cellular and microcellular radiotelephone, IEEE Trans. Vehicular Technol., 40(4), 721–730, Nov. 1991.
[9] Cox, D.C., Murray, R., and Norris, A., 800 MHz Attenuation measured in and around suburban houses, AT&T Bell Laboratory Technical Journal, 63(6), 921–954, Jul.-Aug. 1984.
[10] Schilling, D.L. et al., Broadband CDMA for personal communications systems, IEEE Commun. Mag., 29(11), 86–93, Nov. 1991.
[11] Andersen, J.B., Rappaport, T.S., and Yoshida, S., Propagation measurements and models for wireless communications channels, IEEE Commun. Mag., 33(1), 42–49, Jan. 1995.
[12] Amoroso, F., Investigation of signal variance, bit error rates and pulse dispersion for DSPN signalling in a mobile dense scatterer ray tracing model, Intl. J. Satellite Commun., 12, 579–588, 1994.
[13] Bello, P.A., Characterization of randomly time-variant linear channels, IEEE Trans. Commun. Syst., 360–393, Dec. 1963.
[14] Proakis, J.G., Digital Communications, McGraw-Hill, New York, Ch. 7, 1983.
[15] Green, P.E., Jr., Radar astronomy measurement techniques, MIT Lincoln Laboratory, Lexington, MA, Tech. Report No. 282, Dec. 1962.
[16] Pahlavan, K. and Levesque, A.H., Wireless Information Networks, John Wiley & Sons, New York, Chs. 3 and 4, 1995.
[17] Lee, W.C.Y., Mobile Cellular Communications, McGraw-Hill, New York, 1989.
[18] Amoroso, F., Use of DS/SS signalling to mitigate Rayleigh fading in a dense scatterer environment, IEEE Personal Commun., 3(2), 52–61, Apr. 1996.
[19] Clarke, R.H., A statistical theory of mobile radio reception, Bell Syst. Tech. J., 47(6), 957–1000, Jul.-Aug. 1968.
[20] Bogusch, R.L., Digital Communications in Fading Channels: Modulation and Coding, Mission Research Corp., Santa Barbara, California, Report No. MRC-R-1043, Mar. 11, 1987.
[21] Amoroso, F., The bandwidth of digital data signals, IEEE Commun. Mag., 18(6), 13–24, Nov. 1980.
[22] Bogusch, R.L. et al., Frequency selective propagation effects on spread-spectrum receiver tracking, Proc. IEEE, 69(7), 787–796, Jul. 1981.
[23] Jakes, W.C., Ed., Microwave Mobile Communications, John Wiley & Sons, New York, 1974.
[24] Joint Technical Committee of Committee T1 R1P1.4 and TIA TR46.3.3/TR45.4.4 on Wireless Access, Draft Final Report on RF Channel Characterization, Paper No. JTC(AIR)/94.01.17-238R4, Jan. 17, 1994.
[25] Bello, P.A. and Nelin, B.D., The influence of fading spectrum on the binary error probabilities of incoherent and differentially coherent matched filter receivers, IRE Trans. Commun. Syst., CS-10, 160–168, Jun. 1962.
[26] Amoroso, F., Instantaneous frequency effects in a Doppler scattering environment, IEEE International Conference on Communications, 1458–1466, Jun. 7–10, 1987.
[27] Bateman, A.J. and McGeehan, J.P., Data transmission over UHF fading mobile radio channels, IEEE Proc., 131, Pt. F(4), 364–374, Jul. 1984.
[28] Feher, K., Wireless Digital Communications, Prentice-Hall, Upper Saddle River, NJ, 1995.
[29] Davarian, F., Simon, M., and Sumida, J., DMSK: A Practical 2400-bps Receiver for the Mobile Satellite Service, Jet Propulsion Laboratory Publication 85-51 (MSAT-X Report No. 111), Jun. 15, 1985.
[30] Rappaport, T.S., Wireless Communications, Prentice-Hall, Upper Saddle River, NJ, Ch. 6, 1996.
[31] Bogusch, R.L., Guigliano, F.W., and Knepp, D.L., Frequency-selective scintillation effects and decision feedback equalization in high data-rate satellite links, Proc. IEEE, 71(6), 754–767, Jun. 1983.
[32] Qureshi, S.U.H., Adaptive equalization, Proc. IEEE, 73(9), 1340–1387, Sept. 1985.
[33] Forney, G.D., The Viterbi algorithm, Proc. IEEE, 61(3), 268–278, Mar. 1973.
[34] Sklar, B., Digital Communications: Fundamentals and Applications, Prentice-Hall, Englewood Cliffs, NJ, Ch. 6, 1988.
[35] Price, R. and Green, P.E., Jr., A communication technique for multipath channels, Proc. IRE, 555–570, Mar. 1958.
[36] Turin, G.L., Introduction to spread-spectrum antimultipath techniques and their application to urban digital radio, Proc. IEEE, 68(3), 328–353, Mar. 1980.
[37] Simon, M.K., Omura, J.K., Scholtz, R.A., and Levitt, B.K., Spread Spectrum Communications Handbook, McGraw-Hill, New York, 1994.
[38] Birchler, M.A. and Jasper, S.C., A 64 kbps Digital Land Mobile Radio System Employing M16QAM, Proceedings of the 1992 IEEE Intl. Conference on Selected Topics in Wireless Communications, Vancouver, British Columbia, 158–162, Jun. 25–26, 1992.
[39] Sari, H., Karam, G., and Jeanclaude, I., Transmission techniques for digital terrestrial TV broadcasting, IEEE Commun. Mag., 33(2), 100–109, Feb. 1995.
[40] Cavers, J.K., The performance of phase locked transparent tone-in-band with symmetric phase detection, IEEE Trans. Commun., 39(9), 1389–1399, Sept. 1991.
[41] Moher, M.L. and Lodge, J.H., TCMP—A modulation and coding strategy for Rician fading channel, IEEE J. Selected Areas Commun., 7(9), 1347–1355, Dec. 1989.
[42] Harris, F., On the Relationship Between Multirate Polyphase FIR Filters and Windowed, Overlapped FFT Processing, Proceedings of the Twenty Third Annual Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, California, 485–488, Oct. 30 to Nov. 1, 1989.
[43] Lowdermilk, R.W. and Harris, F., Design and Performance of Fading Insensitive Orthogonal Frequency Division Multiplexing (OFDM) using Polyphase Filtering Techniques, Proceedings of the Thirtieth Annual Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, California, Nov. 3–6, 1996.
[44] Kavehrad, M. and Bodeep, G.E., Design and experimental results for a direct-sequence spread-spectrum radio using differential phase-shift keying modulation for indoor wireless communications, IEEE JSAC, SAC-5(5), 815–823, Jun. 1987.
[45] Hess, G.C., Land-Mobile Radio System Engineering, Artech House, Boston, 1993.
[46] Hagenauer, J. and Lutz, E., Forward error correction coding for fading compensation in mobile satellite channels, IEEE JSAC, SAC-5(2), 215–225, Feb. 1987.
[47] McLane, P.I. et al., PSK and DPSK trellis codes for fast fading, shadowed mobile satellite communication channels, IEEE Trans. Commun., 36(11), 1242–1246, Nov. 1988.
[48] Schlegel, C. and Costello, D.J., Jr., Bandwidth efficient coding for fading channels: code construction and performance analysis, IEEE JSAC, 7(9), 1356–1368, Dec. 1989.
[49] Edbauer, F., Performance of interleaved trellis-coded differential 8–PSK modulation over fading channels, IEEE J. Selected Areas Commun., 7(9), 1340–1346, Dec. 1989.
[50] Soliman, S. and Mokrani, K., Performance of coded systems over fading dispersive channels, IEEE Trans. Commun., 40(1), 51–59, Jan. 1992.
[51] Divsalar, D. and Pollara, F., Turbo Codes for PCS Applications, Proc. ICC’95, Seattle, Washington, 54–59, Jun. 18–22, 1995.
[52] Hanzo, L. and Stefanov, J., The Pan-European Digital Cellular Mobile Radio System—known as GSM, Mobile Radio Communications, Steele, R., Ed., Pentech Press, London, Ch. 8, 1992.

Paulraj, A.J. “Space-Time Processing” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Space-Time Processing

Arogyaswami J. Paulraj
Stanford University

19.1 Introduction
19.2 The Space-Time Wireless Channel
Multipath Propagation • Space-Time Channel Model
19.3 Signal Models
Signal Model at Base Station (Reverse Link) • Signal Model at Mobile (Forward Link) • Discrete Time Signal Model • Signal-Plus-Interference Model
19.4 ST Receive Processing (Base)
Receive Channel Estimation (Base) • Multiuser ST Receive Algorithms • Single-User ST Receive Algorithms
19.5 ST Transmit Processing (Base)
Transmit Channel Estimation (Base) • ST Transmit Processing • Forward Link Processing at the Mobile
19.6 Summary
Defining Terms
References

19.1 Introduction

Mobile radio signal processing includes modulation and demodulation, channel coding and decoding, equalization, and diversity. Current cellular modems mainly use temporal signal processing. Use of spatio-temporal signal processing can improve average signal power, mitigate fading, and reduce cochannel and intersymbol interference. This can significantly improve the capacity, coverage, and quality of wireless networks. A space-time processing radio operates simultaneously on multiple antennas by processing signal samples both in space and time. In receive, space-time (ST) processing can increase array gain and spatial and temporal diversity, and reduce cochannel interference and intersymbol interference. In transmit, the spatial dimension can enhance array gain, improve diversity, and reduce the generation of cochannel and intersymbol interference.

19.2 The Space-Time Wireless Channel

19.2.1 Multipath Propagation

Multipath scattering gives rise to a number of propagation effects described below.

Scatterers Local to Mobile

Scattering local to the mobile is caused by buildings and other scatterers in the vicinity of the mobile (a few tens of meters). Mobile motion and local scattering give rise to Doppler spread, which causes time-selective fading. For a mobile traveling at 65 mph, the Doppler spread is about 200 Hz in the 1900 MHz band. While local scatterers contribute to Doppler spread, the delay spread they contribute is usually insignificant because of the small scattering radius. Likewise, the angle spread induced at the base station is also small.

Remote Scatterers

The emerging wavefront from the local scatterers may then travel directly to the base or may be scattered toward the base by remote dominant scatterers, giving rise to specular multipaths. These remote scatterers can be either terrain features or high-rise building complexes. Remote scattering can cause significant delay and angle spreads.

Scatterers Local to Base

Once these multiple wavefronts reach the base station, they may be scattered further by local structures such as buildings or other structures in the vicinity of the base. Such scattering will be more pronounced for low elevation and below roof-top antennas. Scattering local to the base can cause severe angle spread, which in turn causes space-selective fading. See Figure 19.1 for a depiction of different types of scattering.

FIGURE 19.1: Multipath propagation in macrocells.

The forward link channel is affected in similar ways by these scatterers, but in a reverse order.
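As a rough check on the Doppler figure quoted above for scatterers local to the mobile, the maximum Doppler shift can be computed from the standard relation $f_d = v f_c / c$. The function name and the choice of reporting the maximum shift (rather than a particular definition of Doppler spread) are assumptions of this sketch.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def max_doppler_hz(speed_mps, carrier_hz):
    """Maximum Doppler shift f_d = v * f_c / c for a mobile moving at speed_mps."""
    return speed_mps * carrier_hz / SPEED_OF_LIGHT

speed_mps = 65 * 1609.344 / 3600.0      # 65 mph converted to m/s
f_d = max_doppler_hz(speed_mps, 1.9e9)  # carrier in the 1900 MHz band
print(f"maximum Doppler shift: {f_d:.0f} Hz")
```

The result is a little under 200 Hz, which is consistent with the order of magnitude stated in the text.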

19.2.2 Space-Time Channel Model

The effect of delay, Doppler, and angle spreads makes the channel selective in frequency, time, and space. Figure 19.2 shows plots of the frequency response at each branch of a four-antenna receiver operating with a 200 kHz bandwidth. We can see that the channel is highly frequency-selective since the delay spread reaches 10 to 15 µs. Also, an angle spread of 30° causes variations in the channel from antenna to antenna. The channel variation in time depends upon the Doppler spread. As expected, the plots show negligible channel variation between adjacent time slots, despite the high velocity of the mobile (100 km/h). Use of longer time slots, such as in IS-136, will result in significant channel variations over the slot period. Therefore, space-time processing should address the effect of the three spreads on the signal.

FIGURE 19.2: ST channel.

19.3 Signal Models

We develop signal models for nonspread modulation used in time division multiple access (TDMA) systems.

19.3.1 Signal Model at Base Station (Reverse Link)

We assume that antenna arrays are used at the base station only and that the mobile has a single omni antenna. The mobile transmits a channel coded and modulated signal which does not incorporate any spatial (or indeed any special temporal) processing. See Figure 19.3. The baseband signal $x_i(t)$ received by the base station at the $i$th element of an $m$-element antenna array is given by

$$x_i(t) = \sum_{l=1}^{L} a_i(\theta_l)\,\alpha_l^R(t)\,u(t-\tau_l) + n_i(t) \tag{19.1}$$

where $L$ is the number of multipaths, $a_i(\theta_l)$ is the response of the $i$th element for the $l$th path from direction $\theta_l$, $\alpha_l^R(t)$ is the complex path fading, $\tau_l$ is the path delay, $n_i(t)$ is the additive noise, and $u(\cdot)$ is the transmitted signal that depends on the modulation waveform and the information data stream.

FIGURE 19.3: ST Processing Model.

For a linear modulation, the baseband transmitted signal is given by

$$u(t) = \sum_{k} g(t-kT)\,s(k) \tag{19.2}$$

where $g(\cdot)$ is the pulse shaping waveform and $s(k)$ represents the information bits. In the above model we have assumed that the inverse signal bandwidth is large compared to the travel time across the array. Therefore, the complex envelopes of the signals received by different antennas from a given path are identical except for phase and amplitude differences that depend on the path angle-of-arrival, array geometry, and the element pattern. This angle-of-arrival dependent phase and amplitude response at the $i$th element is $a_i(\theta_l)$. We collect all the element responses to a path arriving from angle $\theta_l$ into an $m$-dimensional vector, called the array response vector, defined as

$$\begin{aligned}
\mathbf{a}(\theta_l) &= [a_1(\theta_l)\ a_2(\theta_l)\ \ldots\ a_m(\theta_l)]^T \\
\mathbf{x}(t) &= \sum_{l=1}^{L} \mathbf{a}(\theta_l)\,\alpha_l^R(t)\,u(t-\tau_l) + \mathbf{n}(t)
\end{aligned} \tag{19.3}$$

where $\mathbf{x}(t)$ and $\mathbf{n}(t)$ are $m$-dimensional complex vectors. The fading $|\alpha^R(t)|$ is Rayleigh or Rician distributed depending on the propagation model.
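The reverse-link model of Eq. (19.3) can be exercised with a short sketch. Several things here are assumptions not made in the text: the array is a uniform linear array with half-wavelength spacing and isotropic elements, path delays are whole sample periods, the fading $\alpha_l^R$ is held constant over the block, and noise is omitted. The function names `ula_response` and `array_output` are illustrative only.

```python
import numpy as np

def ula_response(theta_deg, m, spacing_wavelengths=0.5):
    """Array response a(theta) for an m-element uniform linear array (assumed geometry)."""
    phase = 2.0 * np.pi * spacing_wavelengths * np.sin(np.deg2rad(theta_deg))
    return np.exp(-1j * phase * np.arange(m))

def array_output(u, thetas_deg, alphas, delays, m):
    """Noise-free Eq. (19.3): sum over paths of a(theta_l) * alpha_l * u(t - tau_l)."""
    n = len(u)
    x = np.zeros((m, n), dtype=complex)
    for theta, alpha, d in zip(thetas_deg, alphas, delays):
        u_delayed = np.concatenate([np.zeros(d), u[:n - d]])  # delay by d samples
        x += np.outer(ula_response(theta, m), alpha * u_delayed)
    return x

# Hypothetical two-path scenario seen by a four-element array.
u = np.array([1, -1, -1, 1, 1, 1, -1, 1], dtype=float)
x = array_output(u, thetas_deg=[10.0, 40.0], alphas=[1.0, 0.4 + 0.3j],
                 delays=[0, 1], m=4)
print(x.shape, np.linalg.matrix_rank(x))
```

With two paths from distinct angles and distinct delays, the noise-free snapshot matrix has rank two: each path contributes one rank-one term, the outer product of its array response vector and its delayed, faded waveform.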

19.3.2 Signal Model at Mobile (Forward Link)

In this model, the base station transmits different signals from each antenna with a defined relationship between them. In the case of a two-element array, some examples of transmitted signals $u_i(t)$, $i = 1, 2$, can be: (a) delay diversity: $u_2(t) = u_1(t - T)$, where $T$ is the symbol period; (b) Doppler diversity: $u_2(t) = u_1(t)e^{j\omega t}$, where $\omega$ is the differential carrier offset; (c) beamforming: $u_2(t) = w_2 u_1(t)$, where $w_2$ is a complex scalar; and (d) space-time coding: $u_1(t) = \sum_k g(t-kT)s^1(k)$, $u_2(t) = \sum_k g(t-kT)s^2(k)$, where $s^1(k)$ and $s^2(k)$ are related to the symbol sequence $s(k)$ through coding. The received signal at the mobile is then given by

$$x(t) = \sum_{i=1}^{m}\sum_{l=1}^{L} a_i(\theta_l)\,\alpha_l^F(t)\,u_i(t-\tau_l) + n(t) \tag{19.4}$$

where the path delay $\tau_l$ and angle parameters $\theta_l$ are the same as those of the reverse link. $\alpha_l^F(t)$ is the complex fading on the forward link. In (fast) TDD systems $\alpha_l^F(t)$ will be identical to the reverse link complex fading $\alpha_l^R(t)$. In an FDD system $\alpha_l^F(t)$ and $\alpha_l^R(t)$ will usually have the same statistics but will in general be uncorrelated with each other. We assume $a_i(\theta_l)$ is the same for both links. This is only approximately true in FDD systems. If simple beamforming alone is used in transmit, the signals radiated from the antennas are related by a complex scalar and result in a directional transmit beam which may selectively couple into the multipath environment and differentially scale the power in each path. The signal received by the mobile in this case can be written as

$$x(t) = \sum_{l=1}^{L} \mathbf{w}^H \mathbf{a}(\theta_l)\,\alpha_l^F(t)\,u(t-\tau_l) + n(t) \tag{19.5}$$

where w is the beamforming vector.

19.3.3 Discrete Time Signal Model

The channel model described above uses physical path parameters such as path gain, delay, and angle of arrival. In practice these are not known, and the discrete time received signal uses a more convenient discretized “symbol response” channel model. We derive a discrete-time signal model at the base station antenna array. Let the continuous-time output from the receive antenna array $\mathbf{x}(t)$ be sampled at the symbol rate at instants $t = t_0 + kT$. Then the vector array output may be written as

$$\mathbf{x}(k) = \mathbf{H}^R \mathbf{s}(k) + \mathbf{n}(k) \tag{19.6}$$

where $\mathbf{H}^R$ is the reverse link symbol response channel (an $m \times N$ matrix) that captures the effects of the array response, symbol waveform, and path fading. $m$ is the number of antennas, $N$ is the channel length in symbol periods, and $\mathbf{n}(k)$ is the sampled vector of additive noise. Note that $\mathbf{n}(k)$ may be colored in space and time, as discussed later. $\mathbf{H}^R$ is assumed to be time invariant. $\mathbf{s}(k)$ is a vector of $N$ consecutive elements of the data sequence and is defined as

$$\mathbf{s}(k) = \begin{bmatrix} s(k) \\ \vdots \\ s(k-N+1) \end{bmatrix} \tag{19.7}$$

Note that we have assumed a sampling rate of one sample per symbol. Higher sampling rates may be used. Also, $\mathbf{H}^R$ is given by

$$\mathbf{H}^R = \sum_{l=1}^{L} \mathbf{a}(\theta_l)\,\alpha_l^R\,\mathbf{g}^T(\tau_l) \tag{19.8}$$

where $\mathbf{g}(\tau_l)$ is a vector defined by $T$-spaced sampling of the pulse shaping function $g(\cdot)$ with an offset of $\tau_l$. Likewise, the forward discrete signal model at the mobile is given by

$$x(k) = \sum_{i=1}^{m} \mathbf{h}_i^F \mathbf{s}(k) + n(k) \tag{19.9}$$

where $\mathbf{h}_i^F$ is a $1 \times N$ composite channel from the symbol sequence via the $i$th antenna to the mobile receiver, which includes the effect of transmit ST processing at the base station. In the case of two-antenna delay diversity, $\mathbf{h}_i^F$ is given by

$$\mathbf{h}_1^F = \sum_{l=1}^{L} a_1(\theta_l)\,\alpha_l^F\,\mathbf{g}(\tau_l) \tag{19.10}$$

and

$$\mathbf{h}_2^F = \sum_{l=1}^{L} a_2(\theta_l)\,\alpha_l^F\,\mathbf{g}(\tau_l - T) \tag{19.11}$$

If spatial beamforming alone is used, the signal model becomes

$$x(k) = \mathbf{w}^H \mathbf{H}^F \mathbf{s}(k) + n(k) \tag{19.12}$$

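where $\mathbf{H}^F$ is the intrinsic forward (F) channel given by

$$\mathbf{H}^F = \begin{bmatrix} \mathbf{h}_1^F \\ \mathbf{h}_2^F \end{bmatrix} = \sum_{l=1}^{L} \mathbf{a}(\theta_l)\,\alpha_l^F\,\mathbf{g}^T(\tau_l) \tag{19.13}$$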
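To make the symbol response channel concrete, the following sketch assembles $\mathbf{H}^R$ from physical path parameters as in Eq. (19.8) and forms one noise-free snapshot as in Eq. (19.6). The triangular pulse, the half-wavelength uniform linear array, the sampling offset $t_0 = 0$, and all numerical values are assumptions chosen for illustration; the text does not specify a pulse shape or array geometry.

```python
import numpy as np

def pulse(t, T=1.0):
    """Hypothetical triangular pulse of total width 2T, standing in for g(.)."""
    return np.maximum(0.0, 1.0 - np.abs(t) / T)

def g_vec(tau, N, T=1.0):
    """T-spaced samples of the pulse with offset tau: [g(-tau), g(T - tau), ...]."""
    return pulse(np.arange(N) * T - tau, T)

def ula_response(theta_deg, m):
    """Half-wavelength uniform linear array response (assumed geometry)."""
    phase = np.pi * np.sin(np.deg2rad(theta_deg))
    return np.exp(-1j * phase * np.arange(m))

def symbol_response(thetas_deg, alphas, taus, m, N, T=1.0):
    """Eq. (19.8): H = sum over paths of a(theta_l) * alpha_l * g^T(tau_l), an m x N matrix."""
    H = np.zeros((m, N), dtype=complex)
    for theta, alpha, tau in zip(thetas_deg, alphas, taus):
        H += alpha * np.outer(ula_response(theta, m), g_vec(tau, N, T))
    return H

m, N = 4, 3
thetas = [10.0, 40.0]
alphas = [1.0, 0.5j]
taus = [0.0, 1.5]            # path delays in symbol periods
H = symbol_response(thetas, alphas, taus, m, N)

# Eq. (19.6) without noise: s(k) stacks the N most recent symbols, newest first.
s_k = np.array([1.0, -1.0, 1.0])
x_k = H @ s_k
print(H.shape, x_k.shape)
```

A path whose delay falls between sampling instants, such as the second one here, spreads its energy over two adjacent columns of the matrix, which is how fractional delays show up as ISI in the discrete model.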

19.3.4 Signal-Plus-Interference Model

The overall received signal-plus-interference-and-noise model at the base station antenna array can be written as

$$\mathbf{x}(k) = \mathbf{H}_s^R \mathbf{s}_s(k) + \sum_{q=1}^{Q-1} \mathbf{H}_q^R \mathbf{s}_q(k) + \mathbf{n}(k) \tag{19.14}$$

where $\mathbf{H}_s^R$ and $\mathbf{H}_q^R$ are channels for signal and CCI, respectively, while $\mathbf{s}_s$ and $\mathbf{s}_q$ are the corresponding data sequences. Note that Eq. (19.14) appears to suggest that the signal and interference are baud synchronous. However, this can be relaxed and the time offsets can be absorbed into the channel $\mathbf{H}_q^R$. Similarly, the signal at the mobile can also be extended to include CCI. Note that in this case, the interference comes from other base stations (in TDMA) and the channel is between the interfering base station and the desired mobile. It is often convenient to handle signals in blocks. Collecting $M$ consecutive snapshots of $\mathbf{x}(\cdot)$ corresponding to time instants $k, \ldots, k + M - 1$ (and dropping subscripts for a moment), we get

$$\mathbf{X}(k) = \mathbf{H}^R \mathbf{S}(k) + \mathbf{N}(k) \tag{19.15}$$

where $\mathbf{X}(k)$, $\mathbf{S}(k)$, and $\mathbf{N}(k)$ are defined appropriately. Similarly, the received signal at the mobile in the forward link has a block representation using a row vector.

19.4 ST Receive Processing (Base)

The base station receiver receives the signals from the array antenna, which consist of the signals from the desired mobile and the cochannel signals along with associated intersymbol interference and fading. The task of the receiver is to maximize signal power and mitigate fading, CCI, and ISI. There are two broad approaches for doing this: one is multiuser detection, wherein we demodulate both the cochannel and desired signals jointly; the other is to cancel CCI. The structure of the receiver depends on the nature of the channel estimates available and the tolerable receiver complexity. There are a number of options and we discuss only a few salient cases. Before discussing the receiver processing, we discuss how the receive channel is estimated.

19.4.1 Receive Channel Estimation (Base)

In many mobile communications standards, such as GSM and IS-54, explicit training signals are inserted inside the TDMA data bursts. Let $\mathbf{T}$ be the training sequence arranged in a matrix form ($\mathbf{T}$ is arranged to be a Toeplitz matrix). Then, during the training burst, the received data is given by

$$\mathbf{X} = \mathbf{H}^R \mathbf{T} + \mathbf{N} \tag{19.16}$$

Clearly $\mathbf{H}^R$ can be estimated using least squares as

$$\hat{\mathbf{H}}^R = \mathbf{X}\mathbf{T}^{\dagger} \tag{19.17}$$

where $\mathbf{T}^{\dagger} = \mathbf{T}^H\left(\mathbf{T}\mathbf{T}^H\right)^{-1}$. The use of training consumes spectrum resource. In GSM, for example, about 20% of the bits are dedicated to training. Moreover, in rapidly varying mobile channels, we may have to retrain frequently, resulting in even poorer spectral efficiency. There is, therefore, increased interest in blind methods that can estimate a channel without an explicit training signal.
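The least-squares estimate of Eq. (19.17) can be tried out numerically as follows. This is a minimal sketch under stated assumptions: a Barker-13 code stands in for the training sequence (it is not the training sequence of GSM or IS-54), the channel is a randomly drawn $4 \times 3$ matrix, and the noise level is chosen arbitrarily. The helper `training_matrix` is a hypothetical name.

```python
import numpy as np

def training_matrix(t, N):
    """Toeplitz matrix whose columns are [t(k), t(k-1), ..., t(k-N+1)]^T."""
    return np.array([t[k - N + 1:k + 1][::-1] for k in range(N - 1, len(t))]).T

rng = np.random.default_rng(0)
m, N = 4, 3                                   # antennas, channel length in symbols

# Stand-in training symbols (assumption: Barker-13, chosen for low sidelobes).
t = np.array([1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1], dtype=float)
T = training_matrix(t, N)                     # N x P with P = len(t) - N + 1

# Hypothetical channel and noise for the training burst, Eq. (19.16).
H = (rng.standard_normal((m, N)) + 1j * rng.standard_normal((m, N))) / np.sqrt(2)
noise = 0.01 * (rng.standard_normal((m, T.shape[1]))
                + 1j * rng.standard_normal((m, T.shape[1])))
X = H @ T + noise

# Eq. (19.17): right pseudo-inverse of T, then the least-squares channel estimate.
T_pinv = T.conj().T @ np.linalg.inv(T @ T.conj().T)
H_hat = X @ T_pinv
print(np.max(np.abs(H_hat - H)))
```

The estimate requires $\mathbf{T}\mathbf{T}^H$ to be invertible, so the training sequence must be long enough and sufficiently "white" that its shifted copies are linearly independent.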

19.4.2 Multiuser ST Receive Algorithms

In multiuser (MU) algorithms, we address the problem of jointly demodulating the multiple signals. Recall the received signal is given by

$$\mathbf{X} = \mathbf{H}^R \mathbf{S} + \mathbf{N} \tag{19.18}$$

where $\mathbf{H}^R$ and $\mathbf{S}$ are suitably defined to include multiple users and are of dimensions $m \times NQ$ and $NQ \times M$, respectively. If the channels for all the arriving signals are known, then we jointly demodulate all the user data sequences using multiuser maximum likelihood sequence estimation (MLSE). Starting with the data model in Eq. (19.18), we can then search for multiple user data sequences that minimize the ML cost function

$$\min_{\mathbf{S}} \left\| \mathbf{X} - \mathbf{H}^R \mathbf{S} \right\|_F^2 \tag{19.19}$$

The multiuser MLSE will have a large number of states in the trellis. Efficient techniques for implementing this complex receiver are needed. Multiuser MLSE detection schemes outperform all other receivers.
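The cost function of Eq. (19.19) can be minimized by brute force for a toy problem, which shows what the trellis search computes and why the state count grows so quickly. The sketch assumes $Q = 2$ users, a flat channel ($N = 1$, so $\mathbf{S}$ has no Toeplitz structure to enforce), binary antipodal symbols, a known real-valued channel matrix, and $M = 4$ snapshots. All of these are simplifications introduced here, not choices made in the text.

```python
import itertools
import numpy as np

def multiuser_mlse(X, H, alphabet=(-1.0, 1.0)):
    """Exhaustive search for the S that minimizes ||X - H S||_F^2 (Eq. 19.19)."""
    n_rows, M = H.shape[1], X.shape[1]
    best_S, best_cost = None, np.inf
    for cand in itertools.product(alphabet, repeat=n_rows * M):
        S = np.array(cand).reshape(n_rows, M)
        cost = np.linalg.norm(X - H @ S, "fro") ** 2
        if cost < best_cost:
            best_S, best_cost = S, cost
    return best_S

rng = np.random.default_rng(0)

# Hypothetical known channel: 2 antennas, 2 cochannel users, flat fading (N = 1).
H = np.array([[1.0, 0.6],
              [0.4, -0.9]])
S_true = np.array([[1.0, -1.0, -1.0, 1.0],     # user 1 symbols
                   [-1.0, -1.0, 1.0, 1.0]])    # user 2 symbols
X = H @ S_true + 0.1 * rng.standard_normal((2, 4))

S_hat = multiuser_mlse(X, H)
print(S_hat)
```

Even this small case evaluates $2^{QM} = 256$ candidates, and the count doubles with every additional symbol per user, which is why efficient implementations are needed in practice.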

19.4.3 Single-User ST Receive Algorithms

In this scheme we only demodulate the desired user and cancel the CCI. Therefore, after CCI cancellation we can use MLSE receivers to handle diversity and ISI. In this method there is potential conflict between CCI mitigation and diversity maximization. We are forced to allocate the available degrees of freedom (antennas) to the competing requirements. One approach is to cancel CCI by a space-time filter followed by an MLSE receiver to handle ISI. We do this by reformulating the MLSE criterion to arrive at a joint solution for the ST-MMSE filter and the effective channel for the scalar MLSE. Another approach is to use an ST-MMSE receiver to handle both CCI and ISI. In a space-time filter (equalizer-beamformer), $\mathbf{W}$ has the following form

$$\mathbf{W}(k) = \begin{bmatrix} w_{11}(k) & \cdots & w_{1M}(k) \\ \vdots & \cdots & \vdots \\ w_{m1}(k) & \cdots & w_{mM}(k) \end{bmatrix} \tag{19.20}$$

In order to obtain a convenient formulation for the space-time filter output, we introduce the quantities $\mathcal{W}(k)$ and $\mathcal{X}(k)$ as follows

$$\begin{aligned}
\mathcal{X}(k) &= \operatorname{vec}\left(\mathbf{X}(k)\right) \quad (mM \times 1) \\
\mathcal{W}(k) &= \operatorname{vec}\left(\mathbf{W}(k)\right) \quad (mM \times 1)
\end{aligned} \tag{19.21}$$

where the operator $\operatorname{vec}(\cdot)$ is defined as:

$$\operatorname{vec}\left([\mathbf{v}_1 \cdots \mathbf{v}_M]\right) = \begin{bmatrix} \mathbf{v}_1 \\ \vdots \\ \mathbf{v}_M \end{bmatrix}$$

The ST-MMSE filter chooses the filter weights to achieve the minimum mean square error. The ST-MMSE filter takes the familiar form

$$\mathcal{W} = \mathbf{R}_{\mathcal{X}\mathcal{X}}^{-1}\,\mathcal{H}^R \tag{19.22}$$

where $\mathcal{H}^R$ is one column of $\operatorname{vec}(\mathbf{H}^R)$. In ST-MMSE the CCI and spatial diversity conflict for the spatial degrees of freedom. Likewise, temporal diversity and ISI cancellation conflict for the temporal degrees of freedom.
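The ST-MMSE solution of Eq. (19.22) can be illustrated with a training-based sketch. It departs from the text in ways that should be read as assumptions: the covariance $\mathbf{R}_{\mathcal{X}\mathcal{X}}$ and the cross-correlation with the desired symbol are estimated from samples rather than taken from a known channel (for unit-power, uncorrelated symbols the cross-correlation equals the corresponding stacked channel column in expectation), the array is a half-wavelength uniform linear array, and the two-path desired channel, single flat interferer, and noise level are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
m, M, K = 4, 2, 2000                     # antennas, temporal taps, snapshots

def ula(theta_deg, m):
    """Half-wavelength uniform linear array response (assumed geometry)."""
    phase = np.pi * np.sin(np.deg2rad(theta_deg))
    return np.exp(-1j * phase * np.arange(m))

s = rng.choice([-1.0, 1.0], size=K)      # desired user's symbols
q = rng.choice([-1.0, 1.0], size=K)      # cochannel interferer's symbols
s_prev = np.concatenate([[0.0], s[:-1]]) # s(k-1)

noise = 0.05 * (rng.standard_normal((m, K)) + 1j * rng.standard_normal((m, K)))
x = (np.outer(ula(0.0, m), s)                    # direct path of the desired user
     + 0.5 * np.outer(ula(-40.0, m), s_prev)     # delayed path, which causes ISI
     + np.outer(ula(20.0, m), q)                 # cochannel interference
     + noise)

# Stack M = 2 consecutive snapshots: column k is vec([x(k), x(k-1)]), length mM.
x_prev = np.concatenate([np.zeros((m, 1)), x[:, :-1]], axis=1)
X_st = np.vstack([x, x_prev])

# Sample covariance and sample cross-correlation with the target symbol s(k-1).
R = X_st @ X_st.conj().T / K
p = X_st @ s_prev / K
W = np.linalg.solve(R, p)                # Eq. (19.22) with sample statistics

y = W.conj() @ X_st                      # filter output W^H X for every snapshot
decisions = np.sign(y.real)
ber = np.mean(decisions[1:] != s_prev[1:])
print("bit error rate:", ber)
```

Targeting $s(k-1)$ rather than $s(k)$ lets the filter collect the symbol's energy from both paths within the two-tap window. The stacked vector has $mM = 8$ degrees of freedom to share among signal combining, ISI suppression, and CCI suppression, which is the allocation conflict noted above.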

19.5 ST Transmit Processing (Base)

The goal in ST transmit processing is to maximize the average signal power and diversity at the receiver as well as to minimize cochannel interference generation to other mobiles. Note that the base station transmission cannot directly affect the CCI seen by its intended mobile. Transmit space-time processing also needs channel knowledge, but it is carried out prior to transmission, before the signal encounters the channel. This differs from the reverse link, where the space-time processing is carried out after the channel has affected the signal. Note that the mobile receiver will, of course, need to know the channel for signal demodulation, but since it sees the signal after transmission through the channel, it can estimate the forward link channel using training signals transmitted from the individual transmitter antennas.

19.5.1 Transmit Channel Estimation (Base)

The transmit channel estimation at the base of the vector forward channel can be done via feedback or by use of reciprocity principles. In a TDD system, if the duplexing time is small compared to the coherence time of the channel, both channels are the same and the base station can use its estimate of the reverse channel as the forward channel; i.e., $\mathbf{H}^F = \mathbf{H}^R$, where $\mathbf{H}^R$ is the reverse channel (we have added superscript R to emphasize the receive channel). In FDD systems, the forward and reverse channels can potentially be very different. This arises from differences in the instantaneous complex path gains, $\alpha^R \neq \alpha^F$. The other channel components $\mathbf{a}(\theta_l)$ and $\mathbf{g}(\tau_l)$ are very nearly equal. A direct approach to estimating the forward channel is to feed back the signal from the mobile unit and then estimate the channel. We can do this by transmitting orthogonal training signals through each base station antenna. The mobile feeds back to the base the received signal for each transmitted signal, from which the base estimates the channel.

19.5.2 ST Transmit Processing

The primary goals at the transmitter are to maximize diversity in the link and to reduce CCI generation to other mobiles. The diversity maximization depends on the inherent diversity at the antenna array and cannot be created at the transmitter. The role of ST processing is limited to maximizing the exploitability of this diversity at the receiver. This usually leads to use of orthogonal or near-orthogonal signalling at each antenna: $\int u_1(t)\,u_2(t)\,dt \approx 0$. Orthogonality ensures that the transmitted signals are separable at the mobile, which can now combine these signals after appropriate weighting to attain maximum diversity. In order to minimize CCI, our goal is to use the beamforming vector $\mathbf{w}$ to steer the radiated energy and therefore minimize the interference at the other mobiles while maximizing the signal level at one’s own mobile. Note that the CCI at the reference mobile is not controlled by its own base station but is generated by other base stations. Reducing CCI at one’s own mobile requires the cooperation of the other base stations. Therefore we choose $\mathbf{w}$ such that

$$\max_{\mathbf{w}} \frac{E\left(\mathbf{w}^H \mathbf{H}^F \mathbf{s}(k)\,\mathbf{s}(k)^H \mathbf{H}^{F\,H} \mathbf{w}\right)}{\mathbf{w}^H \left(\sum_{q=1}^{Q-1} \mathbf{H}_q^F \mathbf{H}_q^{F\,H}\right) \mathbf{w}} \tag{19.23}$$

where $Q-1$ is the number of susceptible outer cell mobiles and $H_q^F$ is the channel from the base station to the $q$th outer cell mobile. In order to solve the above equation, we need to know the forward link channel $H^F$ to the reference mobile and $H_q^F$ to cochannel mobiles. In general, such complete channel knowledge may not be available and suboptimum receivers must be designed. Furthermore, we need to find a receiver that harmonizes maximization of diversity and reduction of CCI. Use of transmit ST processing affects $H^F$ and thus can be incorporated.
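As a hedged aside, the criterion in Eq. (19.23) has the form of a generalized Rayleigh quotient. The matrices $R_s$ and $R_I$ below are introduced here only for illustration and are not notation from the chapter. The closed form assumes that both channel sets are known and that $R_I$ is positive definite.

$$R_s = E\left(H^F s(k)\, s(k)^H H^{F\,H}\right), \qquad R_I = \sum_{q=1}^{Q-1} H_q^F H_q^{F\,H}$$

$$\max_{w} \; \frac{w^H R_s\, w}{w^H R_I\, w} \quad\Longrightarrow\quad R_s\, w = \lambda_{\max}\, R_I\, w$$

Under these assumptions the maximizing $w$ is the eigenvector of $R_I^{-1} R_s$ associated with its largest eigenvalue $\lambda_{\max}$, and $\lambda_{\max}$ is the achieved ratio. The scale of $w$ does not affect the ratio and would be fixed by a transmit power constraint.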

19.5.3

Forward Link Processing at the Mobile

The mobile will receive the composite signal from all the base station transmit antennas and will need to demodulate the signal to estimate the symbol sequence. In doing so it usually needs to estimate the individual channels from each base station antenna to itself. This is usually done via the use of training signals on each transmit antenna. Note that as the number of transmit antennas increases, there is a greater burden of training requirements. The use of transmit ST processing reduces the CCI power observed by the mobile as well as enhancing the diversity available.

19.6

Summary

Use of space-time processing can significantly improve average signal power, mitigate fading, and reduce cochannel and intersymbol interference in wireless networks. This can in turn result in significantly improved capacity, coverage, and quality of wireless networks. In this chapter we have discussed applications of ST processing to TDMA systems. The applications to CDMA systems follow similar principles, but differences arise due to the nature of the signal and interference models.

Defining Terms

ISI: Intersymbol interference is caused by multipath propagation where one symbol interferes with other symbols.
CCI: Cochannel interference arises from neighboring cells where the frequency channel is reused.
Maximum Likelihood Sequence Estimation: A technique for channel equalization based on determining the best symbol sequence that matches the received signal.



Jain, R.; Lin, Y. & Mohan, S. “Location Strategies for Personal Communications Services” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Location Strategies for Personal Communications Services

Ravi Jain, Bell Communications Research
Yi-Bing Lin, Bell Communications Research
Seshadri Mohan1, Bell Communications Research

20.1 Introduction
20.2 An Overview of PCS
Aspects of Mobility: Example 20.1 • A Model for PCS
20.3 IS-41 Preliminaries
Terminal/Location Registration • Call Delivery
20.4 Global System for Mobile Communications
Architecture • User Location Strategy
20.5 Analysis of Database Traffic Rate for IS-41 and GSM
The Mobility Model for PCS Users • Additional Assumptions • Analysis of IS-41 • Analysis of GSM
20.6 Reducing Signalling During Call Delivery
20.7 Per-User Location Caching
20.8 Caching Threshold Analysis
20.9 Techniques for Estimating Users’ LCMR
The Running Average Algorithm • The Reset-K Algorithm • Comparison of the LCMR Estimation Algorithms
20.10 Discussion
Conditions When Caching Is Beneficial • Alternative Network Architectures • LCMR Estimation and Caching Policy
20.11 Conclusions
Acknowledgment
References

1 Address correspondence to: Seshadri Mohan, MCC-1A216B, Bellcore, 445 South St, Morristown, NJ 07960; Phone: 973-829-5160, Fax: 973-829-5888, e-mail: [email protected]+

© 1996 by Bell Communications Research, Inc. Used with permission. The material in this chapter appeared originally in the following IEEE publications: S. Mohan and R. Jain. 1994. Two user location strategies for personal communications services, IEEE Personal Communications: The Magazine of Nomadic Communications and Computing, pp. 42–50, Feb., and R. Jain, C.N. Lo, and S. Mohan. 1994. A caching strategy to reduce network impacts of PCS, J-SAC Special Issue on Wireless and Mobile Networks, Aug.


20.1

Introduction

The vision of nomadic personal communications is the ubiquitous availability of services to facilitate exchange of information (voice, data, video, image, etc.) between nomadic end users independent of time, location, or access arrangements. To realize this vision, it is necessary to locate users that move from place to place. The strategies commonly proposed are two-level hierarchical strategies, which maintain a system of mobility databases, home location registers (HLR) and visitor location registers (VLR), to keep track of user locations. Two standards exist for carrying out two-level hierarchical strategies using HLRs and VLRs. The standard commonly used in North America is the EIA/TIA Interim Standard 41 (IS-41) [6] and in Europe the Global System for Mobile Communications (GSM) [15, 18]. In this chapter, we refer to these two strategies as basic location strategies. We introduce these two strategies for locating users and provide a tutorial on their usage. We then analyze and compare these basic location strategies with respect to the load on the mobility databases and the signalling network. Next we propose an auxiliary strategy, called the per-user caching or, simply, the caching strategy, that augments the basic location strategies to reduce the signalling and database loads.

The outline of this chapter is as follows. In Section 20.2 we discuss different forms of mobility in the context of personal communications services (PCS) and describe a reference model for a PCS architecture. In Sections 20.3 and 20.4, we describe the user location strategies specified in the IS-41 and GSM standards, respectively, and in Section 20.5, using a simple example, we present a simplified analysis of the database loads generated by each strategy. In Section 20.6, we briefly discuss possible modifications to these protocols that are likely to result in significant benefits by either reducing query and update rate to databases or reducing the signalling traffic or both. Section 20.7 introduces the caching strategy followed by an analysis in the next two sections. This idea attempts to exploit the spatial and temporal locality in calls received by users, similar to the idea of exploiting locality of file access in computer systems [20]. A feature of the caching location strategy is that it is useful only for certain classes of PCS users, those meeting certain call and mobility criteria. We encapsulate this notion in the definition of the user’s call-to-mobility ratio (CMR), and local CMR (LCMR), in Section 20.8. We then use this definition and our PCS network reference architecture to quantify the costs and benefits of caching and the threshold LCMR for which caching is beneficial, thus characterizing the classes of users for which caching should be applied. In Section 20.9 we describe two methods for estimating users’ LCMR and compare their effectiveness when call and mobility patterns are fairly stable, as well as when they may be variable. In Section 20.10, we briefly discuss alternative architectures and implementation issues of the strategy proposed and mention other auxiliary strategies that can be designed. Section 20.11 provides some conclusions and discussion of future work.

The choice of platforms on which to realize the two location strategies (IS-41 and GSM) may vary from one service provider to another. In this chapter, we describe a possible realization of these protocols based on the advanced intelligent network (AIN) architecture (see [2, 5]), and signalling system 7 (SS7). It is also worthwhile to point out that several strategies have been proposed in the literature for locating users, many of which attempt to reduce the signalling traffic and database loads imposed by the need to locate users in PCS.

20.2

An Overview of PCS

This section explains different aspects of mobility in PCS using an example of two nomadic users who wish to communicate with each other. It also describes a reference model for PCS.

20.2.1

Aspects of Mobility—Example 20.1

PCS can involve two possible types of mobility, terminal mobility and personal mobility, that are explained next.

Terminal Mobility: This type of mobility allows a terminal to be identified by a unique terminal identifier independent of the point of attachment to the network. Calls intended for that terminal can therefore be delivered to that terminal regardless of its network point of attachment. To facilitate terminal mobility, a network must provide several functions, which include those that locate, identify, and validate a terminal and provide services (e.g., deliver calls) to the terminal based on the location information. This implies that the network must store and maintain the location information of the terminal based on a unique identifier assigned to that terminal. An example of a terminal identifier is the IS-41 EIA/TIA cellular industry term mobile identification number (MIN), which is a North American Numbering Plan (NANP) number that is stored in the terminal at the time of manufacture and cannot be changed. A similar notion exists in GSM (see Section 20.4).

Personal Mobility: This type of mobility allows a PCS user to make and receive calls independent of both the network point of attachment and a specific PCS terminal. This implies that the services that a user has subscribed to (stored in that user’s service profile) are available to the user even if the user moves or changes terminal equipment. Functions needed to provide personal mobility include those that identify (authenticate) the end user and provide services to an end user independent of both the terminal and the location of the user. An example of a functionality needed to provide personal mobility for voice calls is the need to maintain a user’s location information based on a unique number, called the universal personal telecommunications (UPT) number, assigned to that user. UPT numbers are also NANP numbers. Another example is one that allows end users to define and manage their service profiles to enable users to tailor services to suit their needs. In Section 20.4, we describe how GSM caters to personal mobility via smart cards.

For the purposes of the example that follows, the terminal identifiers (TID) and UPT numbers are NANP numbers, the distinction being that TIDs address terminal mobility and UPT numbers address personal mobility. Though we have assigned two different numbers to address personal and terminal mobility concerns, the same effect could be achieved by a single identifier assigned to the terminal that varies depending on the user that is currently utilizing the terminal. For simplicity we assume that two different numbers are assigned.

Figure 20.1 illustrates the terminal and personal mobility aspects of PCS, which will be explained via an example. Let us assume that users Kate and Al have, respectively, subscribed to PCS services from PCS service provider (PSP) A and PSP B. Kate receives the UPT number, say, 500 111 4711, from PSP A. She also owns a PCS terminal with TID 200 777 9760. Al too receives his UPT number 500 222 4712 from PSP B, and he owns a PCS terminal with TID 200 888 5760. Each has been provided a personal identification number (PIN) by their respective PSP when subscription began. We assume that the two PSPs have subscribed to PCS access services from a certain network provider such as, for example, a local exchange carrier (LEC). (Depending on the capabilities of the PSPs, the access services provided may vary. Examples of access services include translation of UPT number to a routing number, terminal and personal registration, and call delivery. Refer to Bellcore [3] for further details.) When Kate plugs in her terminal to the network, or when she activates it, the terminal registers itself with the network by providing its TID to the network.
The network creates an entry for the terminal in an appropriate database, which, in this example, is the terminal mobility database (TMDB) A. The entry provides a mapping of her terminal’s TID, 200 777 9760, to a routing number (RN), RN1. All of these activities happen without Kate being aware of them.

After activating her terminal, Kate registers herself at that terminal by entering her UPT number (500 111 4711) to inform the network that all calls to her UPT number are to be delivered to her at the terminal. For security reasons, the network may want to authenticate her and she may be prompted to enter her PIN into her terminal. (Alternatively, if the terminal is equipped with a smart card reader, she may insert her smart card into the reader. Other techniques, such as voice recognition, may also be employed.) Assuming that she is authenticated, Kate has now registered herself. As a result of personal registration by Kate, the network creates an entry for her in the personal mobility database (PMDB) A that maps her UPT number to the TID of the terminal at which she registered. Similarly, when Al activates his terminal and registers himself, appropriate entries are created in TMDB B and PMDB B.

Now Al wishes to call Kate and, hence, he dials Kate’s UPT number (500 111 4711). The network carries out the following tasks.

1. The switch analyzes the dialed digits, recognizes the need for AIN service, determines that the dialed UPT number needs to be translated to an RN, and therefore queries PMDB A.
2. PMDB A searches its database and determines that the person with UPT number 500 111 4711 is currently registered at the terminal with TID 200 777 9760.
3. PMDB A then queries TMDB A for the RN of the terminal with TID 200 777 9760. TMDB A returns the RN (RN1).
4. PMDB A returns the RN (RN1) to the originating switch.
5. The originating switch directs the call to the switch RN1, which then alerts Kate’s terminal. The call is completed when Kate picks up her terminal.

Kate may take her terminal wherever she goes and perform registration at her new location. From then on, the network will deliver all calls for her UPT number to her terminal at the new location. In fact, she may actually register on someone else’s terminal too. For example, suppose that Kate and Al agree to meet at Al’s place to discuss a school project they are working on together. Kate may register herself on Al’s terminal (TID 200 888 9534). The network will now modify the entry corresponding to Kate’s UPT number (ending in 4711) in PMDB A to point to the terminal with TID ending in 9534, whose routing entry is held in TMDB B. Subsequent calls to Kate will be delivered to Al’s terminal.

The scenario given here is used only to illustrate the key aspects of terminal and personal mobility; an actual deployment of these services may be implemented in ways different from those suggested here. We will not discuss personal registration further. The analyses that follow consider only terminal mobility but may easily be modified to include personal mobility.
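The two-step translation in the example (UPT number to TID, then TID to routing number) can be sketched as follows. This is only an illustration: the dictionaries stand in for the mobility databases, the function names are invented here, and the routing number "RN2" for Al's terminal is hypothetical since the text does not name it.

```python
# Toy model of the databases in the Kate/Al example. Dictionaries stand in
# for the mobility databases; "RN2" is a hypothetical routing number.
tmdbs = {"A": {}, "B": {}}   # TMDB per provider: TID -> routing number
pmdb_a = {}                  # PMDB A: UPT number -> (provider, TID)

def register_terminal(provider, tid, routing_number):
    """Terminal registration: record where the terminal is attached."""
    tmdbs[provider][tid] = routing_number

def register_person(pmdb, upt, provider, tid):
    """Personal registration: bind a UPT number to a terminal."""
    pmdb[upt] = (provider, tid)

def translate(pmdb, upt):
    """Tasks 2-4 of call delivery: UPT number -> TID -> routing number."""
    provider, tid = pmdb[upt]
    return tmdbs[provider][tid]

KATE_UPT = "500 111 4711"

# Kate activates her own terminal and registers herself on it.
register_terminal("A", "200 777 9760", "RN1")
register_person(pmdb_a, KATE_UPT, "A", "200 777 9760")
rn_before = translate(pmdb_a, KATE_UPT)

# Kate later registers on Al's terminal, which is known to TMDB B.
register_terminal("B", "200 888 9534", "RN2")
register_person(pmdb_a, KATE_UPT, "B", "200 888 9534")
rn_after = translate(pmdb_a, KATE_UPT)

print(rn_before, rn_after)
```

Keeping the two mappings separate is what lets a change of terminal (personal mobility) and a change of attachment point (terminal mobility) be updated independently.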

20.2.2

A Model for PCS

Figure 20.2 illustrates the reference model used for the comparative analysis. The model assumes that the HLR resides in a service control point (SCP) connected to a regional signal transfer point (RSTP). The SCP is a storehouse of the AIN service logic, i.e., functionality used to perform the processing required to provide advanced services, such as speed calling, outgoing call screening, etc., in the AIN architecture (see Bellcore [2] and Berman and Brewster [5]). The RSTP and the local STP (LSTP) are packet switches, connected together by various links such as A links or D links, that perform the signalling functions of the SS7 network. Such functions include, for example, global title translation for routing messages between the AIN switching system, which is also referred to as the service switching point (SSP), and SCP and IS-41 messages [6]. Several SSPs may be connected to an LSTP. The reference model in Fig. 20.2 introduces several terms which are explained next. We have tried to keep the terms and discussions fairly general. Wherever possible, however, we point to equivalent cellular terms from IS-41 or GSM.

FIGURE 20.1: Illustrating terminal and personal mobility.

For our purposes, the geographical area served by a PCS system is partitioned into a number of radio port coverage areas (or cells, in cellular terms) each of which is served by a radio port (or, equivalently, base station) that communicates with PCS terminals in that cell. A registration area (also known in the cellular world as location area) is composed of a number of cells. The base stations of all cells in a registration area are connected by wireline links to a mobile switching center (MSC). We assume that each registration area is served by a single VLR. The MSC of a registration area is responsible for maintaining and accessing the VLR and for switching between radio ports. The VLR associated with a registration area is responsible for maintaining a subset of the user information contained in the HLR. The terminal registration process is initiated by terminals whenever they move into a new registration area. The base stations of a registration area periodically broadcast an identifier associated with that area. The terminals periodically compare an identifier they have stored with the identifier of the registration area being broadcast. If the two identifiers differ, the terminal recognizes that it has moved from one registration area to another and will, therefore, generate a registration message. It also replaces the previous registration area identifier with that of the new one. Movement of a terminal within the same registration area will not generate registration messages. Registration messages may also be generated when the terminals are switched on. Similarly, messages are generated to deregister them when they are switched off.

FIGURE 20.2: Example of a reference model for a PCS.

PCS services may be provided by different types of commercial service vendors. Bellcore [3] describes three different types of PSPs and the different access services that a public network may provide to them. For example, a PSP may have full network capabilities with its own switching, radio management, and radio port capabilities. Certain others may not have switching capabilities, and others may have only radio port capabilities. The model in Fig. 20.2 assumes full PSP capabilities. The analysis in Section 20.5 is based on this model and modifications may be necessary for other types of PSPs. It is also quite possible that one or more registration areas may be served by a single PSP. The PSP may have one or more HLRs for serving its service area. In such a situation users that move within the PSP’s serving area may generate traffic to the PSP’s HLR (not shown in Fig. 20.2) but not to the network’s HLR (shown in Fig. 20.2). In the interest of keeping the discussions simple, we have assumed that there is one-to-one correspondence between SSPs and MSCs and also between MSCs, registration areas, and VLRs. One impact of locating the SSP, MSC, and VLR in separate physical sites connected by SS7 signalling links would be to increase the required signalling message volume on the SS7 network. Our model assumes that the messages between the SSP and the associated MSC and VLR do not add to signalling load on the public network. Other configurations and assumptions could be studied for which the analysis may need to be suitably modified. The underlying analysis techniques will not, however, differ significantly.
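The terminal-side rule described in this section (compare the stored registration area identifier with the broadcast one, and register only when they differ) can be sketched as below. The class, the message layout, and the area names are assumptions made for illustration and are not taken from IS-41 or GSM.

```python
# Sketch of how a terminal decides when to generate a registration message.
# The message format and area identifiers are hypothetical.

class Terminal:
    def __init__(self, tid):
        self.tid = tid
        self.stored_area_id = None   # nothing stored until first power-on

    def on_broadcast(self, broadcast_area_id):
        """Return a registration message if the area changed, else None."""
        if broadcast_area_id == self.stored_area_id:
            return None              # same registration area: stay silent
        # Replace the stored identifier with that of the new area.
        self.stored_area_id = broadcast_area_id
        return {"type": "REGISTRATION", "tid": self.tid,
                "area": broadcast_area_id}

terminal = Terminal("200 777 9760")
broadcasts = ["RA1", "RA1", "RA2", "RA2", "RA1"]   # areas heard over time
messages = [m for m in map(terminal.on_broadcast, broadcasts) if m]
print([m["area"] for m in messages])
```

In this run the terminal registers once at power-on and once per area crossing, while repeated broadcasts heard inside the same registration area produce no signalling.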

20.3

IS-41 Preliminaries

We now describe the message flow for call origination, call delivery, and terminal registration, sometimes called location registration, based on the IS-41 protocol. This protocol is described in detail in EIA/TIA, [6]. Only an outline is provided here.

20.3.1

Terminal/Location Registration

During IS-41 registration, signalling is performed between the following pairs of network elements:

• New serving MSC and the associated database (or VLR)
• New database (VLR) in the visited area and the HLR in the public network
• HLR and the VLR in former visited registration area or the old MSC serving area.

Figure 20.3 shows the signalling message flow diagram for IS-41 registration activity, focusing only on the essential elements of the message flow relating to registration; for details of variations from the basic registration procedure, see Bellcore [3].

FIGURE 20.3: Signalling flow diagram for registration in IS-41.

The following steps describe the activities that take place during registration.

1. Once a terminal enters a new registration area, the terminal sends a registration request to the MSC of that area.
2. The MSC sends an authentication request (AUTHRQST) message to its VLR to authenticate the terminal, which in turn sends the request to the HLR. The HLR sends its response in the authrqst message.
3. Assuming the terminal is authenticated, the MSC sends a registration notification (REGNOT) message to its VLR.
4. The VLR in turn sends a REGNOT message to the HLR serving the terminal. The HLR updates the location entry corresponding to the terminal to point to the new serving MSC/VLR. The HLR sends a response back to the VLR, which may contain relevant parts of the user’s service profile. The VLR stores the service profile in its database and also responds to the serving MSC.
5. If the user/terminal was registered previously in a different registration area, the HLR sends a registration cancellation (REGCANC) message to the previously visited VLR. On receiving this message, the VLR erases all entries for the terminal from the record and sends a REGCANC message to the previously visited MSC, which then erases all entries for the terminal from its memory.

The protocol shows authentication request and registration notification as separate messages. If the two messages can be packaged into one message, then the rate of queries to the HLR may be cut in half. This does not necessarily mean that the total number of messages is cut in half.
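The registration steps above can be replayed in a small simulation that also counts how many requests reach the HLR. This is a sketch under stated assumptions: the classes, the "MIN-1" identifier, the profile contents, the membership check used as a stand-in for authentication, and the lowercase "regnot" and "regcanc" response names (following the AUTHRQST/authrqst pattern in the text) are all illustrative.

```python
# Sketch of the IS-41 registration message flow (steps 2-5).
# All names and data are illustrative; authentication is a simple stand-in.

class VLR:
    def __init__(self, name):
        self.name = name
        self.records = {}            # MIN -> cached service profile

class HLR:
    def __init__(self, profiles):
        self.profiles = profiles     # MIN -> service profile
        self.location = {}           # MIN -> VLR currently serving the terminal
        self.queries = 0             # requests the HLR has had to process

def register(min_, new_vlr, hlr, log):
    """Replay steps 2-5 for a terminal entering new_vlr's registration area."""
    # Step 2: authentication request travels MSC -> VLR -> HLR.
    log.append("AUTHRQST")
    hlr.queries += 1
    if min_ not in hlr.profiles:     # stand-in for real authentication
        return False
    log.append("authrqst")
    # Steps 3-4: registration notification travels MSC -> VLR -> HLR.
    log.append("REGNOT")
    hlr.queries += 1
    old_vlr = hlr.location.get(min_)
    hlr.location[min_] = new_vlr
    new_vlr.records[min_] = hlr.profiles[min_]
    log.append("regnot")
    # Step 5: cancel the registration in the previously visited area.
    if old_vlr is not None and old_vlr is not new_vlr:
        log.append("REGCANC")
        old_vlr.records.pop(min_, None)
        log.append("regcanc")
    return True

hlr = HLR({"MIN-1": {"services": ["call waiting"]}})
vlr1, vlr2 = VLR("VLR1"), VLR("VLR2")
log = []
register("MIN-1", vlr1, hlr, log)    # first registration, nothing to cancel
register("MIN-1", vlr2, hlr, log)    # move to a new registration area
print(log)
print("HLR queries per registration:", hlr.queries / 2)
```

In this model each registration costs the HLR two queries (AUTHRQST and REGNOT). Packaging them into one message would halve that count, while the cancellation traffic toward the old VLR is unaffected, which is why the total message count is not necessarily halved.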

20.3.2

Call Delivery

The signalling message flow diagram for IS-41 call delivery is shown in Fig. 20.4. The following steps describe the activities that take place during call delivery.

1. A call origination is detected and the number of the called terminal (for example, MIN) is received by the serving MSC. Observe that the call could have originated from within the public network from a wireline phone or from a wireless terminal in an MSC/VLR serving area. (If the call originated within the public network, the AIN SSP analyzes the dialed digits and sends a query to the SCP.)
2. The MSC determines the associated HLR serving the called terminal and sends a location request (LOCREQ) message to the HLR.
3. The HLR determines the serving VLR for that called terminal and sends a routing address request (ROUTEREQ) to the VLR, which forwards it to the MSC currently serving the terminal.
4. Assuming that the terminal is idle, the serving MSC allocates a temporary identifier, called a temporary local directory number (TLDN), to the terminal and returns a response to the HLR containing this information. The HLR forwards this information to the originating SSP/MSC in response to its LOCREQ message.
5. The originating SSP requests call setup to the serving MSC of the called terminal via the SS7 signalling network using the usual call setup protocols.

Similar to the considerations for reducing signalling traffic for location registration, the VLR and HLR functions could be united in a single logical database for a given serving area and collocated; further, the database and switch can be integrated into the same piece of physical equipment or be collocated. In this manner, a significant portion of the messages exchanged between the switch, HLR and VLR as shown in Fig. 20.4 will not contribute to signalling traffic.
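The call delivery steps can be sketched as a chain of lookups: LOCREQ to the HLR, ROUTEREQ to the serving VLR and MSC, and a TLDN returned along the same path. The classes, the "MIN-1" identifier, the MSC name, and the TLDN string format are hypothetical, and the sketch assumes the called terminal is idle.

```python
# Sketch of IS-41 call delivery (steps 2-4). Names and formats are illustrative.
import itertools

class ServingMSC:
    def __init__(self, name):
        self.name = name
        self.tldn_to_min = {}                 # TLDN -> called terminal's MIN
        self._next = itertools.count(1)

    def routereq(self, min_):
        """Step 4: allocate a TLDN for an (assumed idle) terminal."""
        tldn = f"{self.name}-TLDN-{next(self._next)}"   # hypothetical format
        self.tldn_to_min[tldn] = min_
        return tldn

class VLR:
    def __init__(self, msc):
        self.msc = msc

    def routereq(self, min_):
        """Step 3: forward the routing request to the serving MSC."""
        return self.msc.routereq(min_)

class HLR:
    def __init__(self):
        self.serving_vlr = {}                 # MIN -> VLR serving the terminal

    def locreq(self, min_):
        """Steps 2-4: find the serving VLR and relay the TLDN it returns."""
        return self.serving_vlr[min_].routereq(min_)

def deliver_call(hlr, called_min):
    """Originating side: obtain a TLDN to use for call setup (step 5)."""
    return hlr.locreq(called_min)

msc = ServingMSC("MSC7")
hlr = HLR()
hlr.serving_vlr["MIN-1"] = VLR(msc)           # set up by an earlier registration

tldn_first = deliver_call(hlr, "MIN-1")
tldn_second = deliver_call(hlr, "MIN-1")
print(tldn_first, tldn_second)
```

Every incoming call in this model costs one HLR query and one VLR query, which is the per-call database load that a location caching strategy tries to avoid.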

20.4

Global System for Mobile Communications

FIGURE 20.4: Signalling flow diagram for call delivery in IS-41.

In this section we describe the user location strategy proposed in the European Global System for Mobile Communications (GSM) standard and its offshoot, digital cellular system 1800 (DCS1800). There has recently been increased interest in GSM in North America, since it is possible that early deployment of PCS will be facilitated by using the communication equipment already available from European manufacturers who use the GSM standard. Since the GSM standard is relatively unfamiliar to North American readers, we first give some background and introduce the various abbreviations. The reader will find additional details in Mouly and Pautet [18]. For an overview of GSM, refer to Lycksell [15].

The abbreviation GSM originally stood for Groupe Spécial Mobile, a committee created within the pan-European standardization body Conférence Européenne des Postes et Télécommunications (CEPT) in 1982. There were numerous national cellular communication systems and standards in Europe at the time, and the aim of GSM was to specify a uniform standard around the newly reserved 900-MHz frequency band with a bandwidth of twice 25 MHz. The phase 1 specifications of this standard were frozen in 1990. Also in 1990, at the request of the United Kingdom, specification of a version of GSM adapted to the 1800-MHz frequency, with bandwidth of twice 75 MHz, was begun. This variant is referred to as DCS1800; the abbreviation GSM900 is sometimes used to distinguish between the two variations, with the abbreviation GSM being used to encompass both GSM900 and DCS1800. The motivation for DCS1800 is to provide higher capacities in densely populated urban areas, particularly for PCS. The DCS1800 specifications were frozen in 1991, and by 1992 all major GSM900 European operators began operation. At the end of 1991, activities concerning the post-GSM generation of mobile communications were begun by the standardization committee, using the name universal mobile telecommunications system (UMTS) for this effort. In 1992, the name of the standardization committee was changed from GSM to special mobile group (SMG) to distinguish it from the 900-MHz system itself, and the term GSM was chosen as the commercial trademark of the European 900-MHz system, where GSM now stands for global system for mobile communications.
The GSM standard has now been widely adopted in Europe and is under consideration in several non-European countries, including the United Arab Emirates, Hong Kong, and New Zealand. In 1992, Australian operators officially adopted GSM.

FIGURE 20.5: Flow diagram for registration in GSM.

20.4.1

Architecture

In this section we describe the GSM architecture, focusing on those aspects that differ from the architecture assumed in the IS-41 standard. A major goal of the GSM standard was to enable users to move across national boundaries and still be able to communicate. It was considered desirable, however, that the operational network within each country be operated independently. Each of the operational networks is called a public land mobile network (PLMN) and its commercial coverage area is confined to the borders of one country (although some radio coverage overlap at national boundaries may occur), and each country may have several competing PLMNs.

A GSM customer subscribes to a single PLMN called the home PLMN, and subscription information includes the services the customer subscribes to. During normal operation, a user may elect to choose other PLMNs as their service becomes available (either as the user moves or as new operators enter the marketplace). The user’s terminal [GSM calls the terminal a mobile station (MS)] assists the user in choosing a PLMN in this case, either presenting a list of possible PLMNs to the user using explicit names (e.g., DK Sonofon for the Danish PLMN) or choosing automatically based on a list of preferred PLMNs stored in the terminal’s memory. This PLMN selection process allows users to choose between the services and tariffs of several competing PLMNs. Note that the PLMN selection process differs from the cell selection and handoff process that a terminal carries out automatically without any possibility of user intervention, typically based on received radio signal strengths and, thus, requires additional intelligence and functionality in the terminal.

The geographical area covered by a PLMN is partitioned into MSC serving areas, and a registration area is constrained to be a subset of a single MSC serving area. The PLMN operator has complete freedom to allocate cells to registration areas. Each PLMN has, logically speaking, a single HLR, although this may be implemented as several physically distributed databases, as for IS-41. Each MSC also has a VLR, and a VLR may serve one or several MSCs. As for IS-41, it is interesting to consider how the VLR should be viewed in this context. The VLR can be viewed either as simply a database offloading the query and signalling load on the HLR and, hence, logically tightly coupled to the HLR, or as an ancillary processor to the MSC. This distinction is not academic; in the first view, it would be natural to implement a VLR as serving several MSCs, whereas in the second each VLR would serve one MSC and be physically closely coupled to it. For GSM, the MSC implements most of the signalling protocols, and at present all switch manufacturers implement a combined MSC and VLR, with one VLR per MSC [18].

A GSM mobile station is split in two parts, one containing the hardware and software for the radio interface and the other containing subscriber-specific and location information, called the subscriber identity module (SIM), which can be removed from the terminal and is the size of a credit card or smaller. The SIM is assigned a unique identity within the GSM system, called the international mobile subscriber identity (IMSI), which is used by the user location strategy as described in the next subsection. The SIM also stores authentication information, services lists, PLMN selection lists, etc., and can itself be protected by password or PIN. The SIM can be used to implement a form of large-scale mobility called SIM roaming. The GSM specifications standardize the interface between the SIM and the terminal, so that a user carrying his or her SIM can move between different terminals and use the SIM to personalize the terminal. This capability is particularly useful for users who move between PLMNs which have different radio interfaces. The user can use the appropriate terminal for each PLMN coverage area while obtaining the personalized facilities specified in his or her SIM. Thus, SIMs address personal mobility. In the European context, the usage of two closely related standards at different frequencies, namely, GSM900 and DCS1800, makes this capability an especially important one and facilitates interworking between the two systems.

20.4.2

User Location Strategy

We present a synopsis of the user location strategy in GSM using call flow diagrams similar to those used to describe the strategy in IS-41. In order to describe the registration procedure, it is first useful to clarify the different identifiers used in this procedure. The SIM of the terminal is assigned a unique identity, called the IMSI, as already mentioned. To increase confidentiality and make more efficient use of the radio bandwidth, however, the IMSI is not normally transmitted over the radio link. Instead, the terminal is assigned a temporary mobile subscriber identity (TMSI) by the VLR when it enters a new registration area. The TMSI is valid only within a given registration area and is shorter than the IMSI. The IMSI and TMSI are identifiers that are internal to the system and assigned to a terminal or SIM and should not be confused with the user’s number that would be dialed by a calling party; the latter is a separate number called the mobile subscriber integrated services digital network (ISDN) number (MSISDN), and is similar to the usual telephone number in a fixed network.

We now describe the procedure during registration. The terminal can detect when it has moved into the cell of a new registration area from the system information broadcast by the base station in the new cell. The terminal initiates a registration update request to the new base station; this request includes the identity of the old registration area and the TMSI of the terminal in the old area. The request is forwarded to the MSC, which, in turn, forwards it to the new VLR. Since the new VLR cannot translate the TMSI to the IMSI of the terminal, it sends a request to the old VLR to send the IMSI of the terminal corresponding to that TMSI. In its response, the old VLR also provides the required authentication information. The new VLR then initiates procedures to authenticate the terminal. If the authentication succeeds, the VLR uses the IMSI to determine the address of the terminal’s HLR. The ensuing protocol is then very similar to that in IS-41, except for the following differences. When the new VLR receives the registration affirmation (similar to regnot in IS-41) from the HLR, it assigns a new TMSI to the terminal for the new registration area. The HLR also provides the new VLR with all relevant subscriber profile information required for call handling (e.g., call screening lists, etc.) as part of the affirmation message. Thus, in contrast with IS-41, authentication and subscriber profile information are obtained from both the HLR and old VLR and not just the HLR.

The procedure for delivering calls to mobile users in GSM is very similar to that in IS-41. The sequence of messages between the caller and called party’s MSC/VLRs and the HLR is identical to that shown in the call flow diagrams for IS-41, although the names, contents and lengths of messages may be different and, hence, the details are left out.
The interested reader is referred to Mouly and Pautet, [18], or Lycksell, [15], for further details.
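The registration steps described above can be sketched as a toy Python model. This is only an illustration under simplifying assumptions: the class and method names (`HLR`, `VLR`, `register`, `identify`, `cancel`) are hypothetical, authentication is reduced to a placeholder callback, and the cancellation at the old VLR is assumed to be triggered by the HLR as in IS-41. It is not the GSM message set.

```python
class HLR:
    """Toy home location register: IMSI -> serving VLR, plus subscriber profiles."""

    def __init__(self, profiles):
        self.profiles = profiles      # IMSI -> profile dict
        self.serving_vlr = {}         # IMSI -> VLR object currently serving the user

    def update_location(self, imsi, new_vlr):
        old_vlr = self.serving_vlr.get(imsi)
        if old_vlr is not None and old_vlr is not new_vlr:
            old_vlr.cancel(imsi)      # assumed: deregistration at the old VLR
        self.serving_vlr[imsi] = new_vlr
        return self.profiles[imsi]    # the affirmation carries the subscriber profile


class VLR:
    """Toy visitor location register for one registration area."""

    def __init__(self, name):
        self.name = name
        self.tmsi_to_imsi = {}
        self.profiles = {}
        self._serial = 0

    def assign_tmsi(self, imsi):
        self._serial += 1
        tmsi = f"{self.name}-{self._serial}"
        self.tmsi_to_imsi[tmsi] = imsi
        return tmsi

    def identify(self, tmsi):
        """Old-VLR role: translate a TMSI it issued back to the IMSI."""
        return self.tmsi_to_imsi[tmsi]

    def cancel(self, imsi):
        self.profiles.pop(imsi, None)
        self.tmsi_to_imsi = {t: i for t, i in self.tmsi_to_imsi.items() if i != imsi}

    def register(self, old_vlr, old_tmsi, hlr, authenticate=lambda imsi: True):
        """New-VLR role: handle a registration update request."""
        imsi = old_vlr.identify(old_tmsi)       # the new VLR cannot translate the old TMSI itself
        if not authenticate(imsi):              # placeholder for the real authentication exchange
            raise PermissionError("authentication failed")
        self.profiles[imsi] = hlr.update_location(imsi, self)
        return self.assign_tmsi(imsi)           # fresh TMSI, valid only in this registration area


hlr = HLR({"IMSI-001": {"call_screening": []}})
vlr_a, vlr_b = VLR("A"), VLR("B")
hlr.serving_vlr["IMSI-001"] = vlr_a
tmsi_a = vlr_a.assign_tmsi("IMSI-001")          # user starts out registered in area A
tmsi_b = vlr_b.register(vlr_a, tmsi_a, hlr)     # user moves into area B
```

Note that the IMSI travels only between network databases in this sketch; the terminal presents its old TMSI and receives a new one.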

20.5

Analysis of Database Traffic Rate for IS-41 and GSM

In the two subsections that follow, we state the common set of assumptions on which we base our comparison of the two strategies.

20.5.1

The Mobility Model for PCS Users

In the analysis that follows in the IS-41 analysis subsection, we assume a simple mobility model for the PCS users. The model, which is described in [23], assumes that PCS users carrying terminals are moving at an average velocity of v and that their direction of movement is uniformly distributed over [0, 2π]. Assuming that the PCS users are uniformly populated with a density of ρ and the registration area boundary is of length L, it has been shown that the rate of registration area crossing R is given by

$$R = \frac{\rho v L}{\pi} \tag{20.1}$$

Using Eq. (20.1), we can calculate the signalling traffic due to registration, call origination, and delivery. We now need a set of assumptions so that we may proceed to derive the traffic rate to the databases using the model in Fig. 20.2.

20.5.2

Additional Assumptions

The following assumptions are made in performing the analysis.

• 128 total registration areas
• Square registration area size: (7.575 km)² = 57.4 km², with border length L = 30.3 km
• Average call origination rate = average call termination (delivery) rate = 1.4/h/terminal
• Mean density of mobile terminals = ρ = 390/km²
• Total number of mobile terminals = 128 × 57.4 × 390 = 2.87 × 10⁶
• Average speed of a mobile, v = 5.6 km/h
• Fluid flow mobility model

The assumptions regarding the total number of terminals may also be obtained by assuming that a certain public network provider serves 19.15 × 10⁶ users and that 15% (or 2.87 × 10⁶) of the users also subscribe to PCS services from various PSPs. Note that we have adopted a simplified model that ignores situations where PCS users turn their handsets on and off, which would generate additional registration and deregistration traffic. The model also ignores wireline registrations. These activities will increase the total number of queries and updates to HLR and VLRs.

20.5.3

Analysis of IS-41

Using Eq. (20.1) and the parameter values assumed in the preceding subsection, we can compute the traffic due to registration. The registration traffic is generated by mobile terminals moving into a new registration area, and this must equal the rate at which mobile terminals move out of the registration area, which per second is

$$R_{\text{reg, VLR}} = \frac{390 \times 30.3 \times 5.6}{3600\pi} = 5.85$$

This must also be equal to the number of deregistrations (registration cancellations),

$$R_{\text{dereg, VLR}} = 5.85$$

The total number of registration messages per second arriving at the HLR will be

$$R_{\text{reg, HLR}} = R_{\text{reg, VLR}} \times \text{total No. of registration areas} = 749$$

The HLR should, therefore, be able to handle, roughly, 750 updates per second. We observe from Fig. 20.3 that authenticating terminals generate as many queries to VLR and HLR as the respective number of updates generated due to registration notification messages.

The number of queries that the HLR must handle during call origination and delivery can be similarly calculated. Queries to HLR are generated when a call is made to a PCS user. The SSP that receives the request for a call generates a location request (LOCREQ) query to the SCP controlling the HLR. The rate per second of such queries must be equal to the rate of calls made to PCS users. This is calculated as

$$R_{\text{CallDeliv, HLR}} = \text{call rate per user} \times \text{total number of users} = \frac{1.4 \times 2.87 \times 10^{6}}{3600} = 1116$$

For calls originated from a mobile terminal by PCS users, the switch authenticates the terminal by querying the VLR. The rate per second of such queries is determined by the rate of calls originating in an SSP serving area, which is also a registration area (RA). This is given by

$$R_{\text{CallOrig, VLR}} = \frac{1116}{128} = 8.7$$

This is also the number of queries per second needed to authenticate terminals of PCS users to which calls are delivered:

$$R_{\text{CallDeliv, VLR}} = 8.7$$

Table 20.1 summarizes the calculations.

TABLE 20.1 IS-41 Query and Update Rates to HLR and VLR

Activity                                         HLR Updates/s   VLR Updates/s   HLR Queries/s   VLR Queries/s
Mobility-related activities at registration      749             5.85            749             5.85
Mobility-related activities at deregistration    -               5.85            -               -
Call origination                                 -               -               -               8.7
Call delivery                                    -               -               1116            8.7
Total (per RA)                                   5.85            11.7            14.57           23.25
Total (Network)                                  749             1497.6          1865            2976

20.5.4

Analysis of GSM

Calculations for query and update rates for GSM may be performed in the same manner as for IS-41, and they are summarized in Table 20.2. The difference between this table and Table 20.1 is that in GSM the new serving VLR does not query the HLR separately in order to authenticate the terminal during registration and, hence, there are no HLR queries during registration. Instead, the entry (749 queries) under HLR queries in Table 20.1, corresponding to mobility-related authentication activity at registration, gets equally divided between the 128 VLRs. Observe that with either protocol the total database traffic rates are conserved, where the total database traffic for the entire network is given by the sum of all of the entries in the last row, Total (Network), i.e.,

HLR updates + VLR updates + HLR queries + VLR queries

From Tables 20.1 and 20.2 we see that this quantity equals 7087. The conclusion is independent of any variations we may make to the assumptions stated earlier in this section. For example, if the PCS penetration (the percentage of the total users subscribing to PCS services) were to increase from 15 to 30%, all of the entries in the two tables would double and, hence, the total database traffic generated by the two protocols would still be equal.
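The arithmetic behind the two tables can be reproduced with a short Python sketch. It uses only the parameter values listed in Section 20.5.2; the function and variable names are illustrative choices, and the split of registration-time queries between HLR and VLR follows the description of Tables 20.1 and 20.2 above.

```python
import math

# Parameter values from Section 20.5.2.
RHO = 390.0          # terminals per km^2
SPEED = 5.6          # km/h
BORDER = 30.3        # km, boundary length of one registration area
N_RA = 128           # registration areas (one VLR each)
CALL_RATE = 1.4      # calls/h/terminal, origination and delivery alike
TERMINALS = 2.87e6

reg_per_ra = RHO * SPEED * BORDER / (3600 * math.pi)   # Eq. (20.1), per second
reg_net = reg_per_ra * N_RA                            # registrations/s, whole network
calls_net = CALL_RATE * TERMINALS / 3600               # calls/s, whole network


def network_totals(protocol):
    """Return network-wide database rates (per second) for 'IS-41' or 'GSM'."""
    totals = {
        "hlr_updates": reg_net,
        "vlr_updates": 2 * reg_net,      # registration + deregistration
        "hlr_queries": calls_net,        # call delivery
        "vlr_queries": 2 * calls_net,    # call origination + call delivery
    }
    if protocol == "IS-41":
        totals["hlr_queries"] += reg_net     # authentication query to the HLR
        totals["vlr_queries"] += reg_net     # authentication query to the VLR
    elif protocol == "GSM":
        totals["vlr_queries"] += 2 * reg_net  # both authentication queries go to VLRs
    else:
        raise ValueError(protocol)
    return totals


is41 = network_totals("IS-41")
gsm = network_totals("GSM")
print(round(reg_per_ra, 2), round(reg_net), round(calls_net))
print(round(sum(is41.values())), round(sum(gsm.values())))
```

The two network totals agree with each other exactly. They come out slightly above 7087 because the tables are built from per-RA entries that were rounded before being multiplied by 128.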

TABLE 20.2 GSM Query and Update Rates to HLR and VLR

Activity                                         HLR Updates/s   VLR Updates/s   HLR Queries/s   VLR Queries/s
Mobility-related activities at registration      749             5.85            -               11.7
Mobility-related activities at deregistration    -               5.85            -               -
Call origination                                 -               -               -               8.7
Call delivery                                    -               -               1116            8.7
Total (per VLR)                                  749             11.7            1116            29.1
Total (Network)                                  749             1497.6          1116            3724.8

20.6

Reducing Signalling During Call Delivery

In the preceding section, we provided a simplified analysis of some scenarios associated with user location strategies and the associated database queries and updates required. Previous studies [13, 16] indicate that the signalling traffic and database queries associated with PCS due to user mobility are likely to grow to levels well in excess of that associated with a conventional call. It is, therefore, desirable to study modifications to the two protocols that would result in reduced signalling and database traffic. We now provide some suggestions.

For both GSM and IS-41, delivery of calls to a mobile user involves four messages: from the caller’s VLR to the called party’s HLR, from the HLR to the called party’s VLR, from the called party’s VLR to the HLR, and from the HLR to the caller’s VLR. The last two of these messages involve the HLR, whose role is to simply relay the routing information provided by the called party’s VLR to the caller’s VLR. An obvious modification to the protocol would be to have the called VLR directly send the routing information to the calling VLR. This would reduce the total load on the HLR and on signalling network links substantially. Such a modification to the protocol may not be easy, of course, due to administrative, billing, legal, or security concerns. Besides, this would violate the query/response model adopted in IS-41, requiring further analysis.

A related question which arises is whether the routing information obtained from the called party’s VLR could instead be stored in the HLR. This routing information could be provided to the HLR, for example, whenever a terminal registers in a new registration area. If this were possible, two of the four messages involved in call delivery could be eliminated. This point was discussed at length by the GSM standards body, and the present strategy was arrived at. The reason for this decision was to reduce the number of temporary routing numbers allocated by VLRs to terminals in their registration area.
If a temporary routing number (TLDN in IS-41 or MSRN in GSM) is allocated to a terminal for the whole duration of its stay in a registration area, the quantity of numbers required is much greater than if a number is assigned on a per-call basis. Other strategies may be employed to reduce signalling and database traffic via intelligent paging or by storing user’s mobility behavior in user profiles (see, for example, Tabbane, [22]). A discussion of these techniques is beyond the scope of the paper.

20.7

Per-User Location Caching

The basic idea behind per-user location caching is that the volume of SS7 message traffic and database accesses required in locating a called subscriber can be reduced by maintaining local storage, or cache, of user location information at a switch. At any switch, location caching for a given user should be employed only if a large number of calls originate for that user from that switch, relative to the user’s mobility. Note that the cached information is kept at the switch from which calls originate, which may or may not be the switch where the user is currently registered. Location caching involves the storage of location pointers at the originating switch; these point to the VLR (and the associated switch) where the user is currently registered.

We refer to the procedure of locating a PCS user as a FIND operation, borrowing the terminology from Awerbuch and Peleg, [1]. We define a basic FIND, or BasicFIND( ), as one where the following sequence of steps takes place.

1. The incoming call to a PCS user is directed to the nearest switch.
2. Assuming that the called party is not located within the immediate RA, the switch queries the HLR for routing information.
3. The HLR contains a pointer to the VLR in whose associated RA the subscriber is currently situated and launches a query to that VLR.
4. The VLR, in turn, queries the MSC to determine whether the user terminal is capable of receiving the call (i.e., is idle) and, if so, the MSC returns a routable address (TLDN in IS-41) to the VLR.
5. The VLR relays the routing address back to the originating switch via the HLR. At this point, the originating switch can route the call to the destination switch.

Alternatively, BasicFIND( ) can be described by pseudocode as follows. (We observe that a more formal method of specifying PCS protocols may be desirable.)

    BasicFIND( ){
        Call to PCS user is detected at local switch;
        if called party is in same RA then return;
        Switch queries called party’s HLR;
        Called party’s HLR queries called party’s current VLR, V;
        V returns called party’s location to HLR;
        HLR returns location to calling switch;
    }

In the FIND procedure involving the use of location caching, or CacheFIND( ), each switch contains a local memory (cache) that stores location information for subscribers. When the switch receives a call origination (from either a wire-line or wireless caller) directed to a PCS subscriber, it first checks its cache to see if location information for the called party is maintained. If so, a query is launched to the pointed VLR; if not, BasicFIND( ), as just described, is followed. If a cache entry exists and the pointed VLR is queried, two situations are possible. If the user is still registered at the RA of the pointed VLR (i.e., we have a cache hit), the pointed VLR returns the user’s routing address. Otherwise, the pointed VLR returns a cache miss.

    CacheFIND( ){
        Call to PCS user is detected at local switch;
        if called is in same RA then return;
        if there is no cache entry for called user
            then invoke BasicFIND( ) and return;
        Switch queries the VLR, V, specified in the cache entry;
        if called is at V, then
            V returns called party’s location to calling switch;
        else {
            V returns “miss” to calling switch;
            Calling switch invokes BasicFIND( );
        }
    }

When a cache hit occurs we save one query to the HLR [a VLR query is involved in both CacheFIND( ) and BasicFIND( )], and we also save traffic along some of the signalling links; instead of four message transmissions, as in BasicFIND( ), only two are needed. In steady-state operation, the cached pointer for any given user is updated only upon a miss. Note that the BasicFIND( ) procedure differs from that specified for roaming subscribers in the IS-41 standard EIA/TIA, [6]. In the IS-41 standard, the second line in the BasicFIND( ) procedure is omitted, i.e., every call results in a query of the called user’s HLR. Thus, in fact, the procedure specified in the standard will result in an even higher network load than the BasicFIND( ) procedure specified here. To make a fair assessment of the benefits of CacheFIND( ), however, we have compared it against BasicFIND( ). Thus, the benefits of CacheFIND( ) investigated here depend specifically on the use of caching and not simply on the availability of user location information at the local VLR.
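The two procedures can be made concrete with a small Python simulation that counts database queries. This is a sketch under stated assumptions, not the protocol itself: the `Network` and `Switch` classes are hypothetical, a single dictionary stands in for both the HLR pointer and the VLR registrations, and the cache entry is loaded whenever BasicFIND( ) is invoked, which is one plausible reading of "updated only upon a miss".

```python
class Network:
    """Toy signalling network that counts HLR and VLR queries."""

    def __init__(self):
        self.location = {}     # user -> RA (VLR) where the user is currently registered
        self.hlr_queries = 0
        self.vlr_queries = 0

    def query_vlr(self, vlr, user):
        """Ask VLR `vlr` for a routing address; None models a "miss"."""
        self.vlr_queries += 1
        return vlr if self.location[user] == vlr else None

    def basic_find(self, user, local_ra):
        if self.location[user] == local_ra:      # called party is in the same RA
            return local_ra
        self.hlr_queries += 1                    # switch queries the called party's HLR
        return self.query_vlr(self.location[user], user)   # HLR queries the current VLR


class Switch:
    """Originating switch with a per-user location cache."""

    def __init__(self, ra, network):
        self.ra, self.network, self.cache = ra, network, {}

    def cache_find(self, user):
        net = self.network
        if net.location[user] == self.ra:
            return self.ra
        pointed = self.cache.get(user)
        if pointed is not None:
            address = net.query_vlr(pointed, user)
            if address is not None:              # cache hit: the HLR is not involved
                return address
        address = net.basic_find(user, self.ra)  # no entry, or cache miss
        self.cache[user] = address               # assumption: (re)load the pointer here
        return address


net = Network()
net.location["u"] = "B"
sw = Switch("A", net)
sw.cache_find("u")          # no cache entry: one HLR query, one VLR query
sw.cache_find("u")          # hit: one VLR query only
net.location["u"] = "C"     # user moves to another RA
found = sw.cache_find("u")  # miss: pointed VLR queried, then BasicFIND
print(found, net.hlr_queries, net.vlr_queries)
```

A miss costs one VLR query more than BasicFIND( ) alone, which is why caching pays off only when hits are frequent enough.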

20.8

Caching Threshold Analysis

In this section we investigate the classes of users for which the caching strategy yields net reductions in signalling traffic and database loads. We characterize classes of users by their CMR. The CMR of a user is the average number of calls to a user per unit time, divided by the average number of times the user changes registration areas per unit time. We also define a LCMR, which is the average number of calls to a user from a given originating switch per unit time, divided by the average number of times the user changes registration areas per unit time.

For each user, the amount of savings due to caching is a function of the probability that the cached pointer correctly points to the user’s location and increases with the user’s LCMR. In this section we quantify the minimum value of LCMR for caching to be worthwhile. This caching threshold is parameterized with respect to costs of traversing signalling network elements and network databases and can be used as a guide to select the subset of users to whom caching should be applied. The analysis in this section shows that estimating users’ LCMRs, preferably dynamically, is very important in order to apply the caching strategy. The next section will discuss methods for obtaining this estimate.

From the pseudocode for BasicFIND( ), the signalling network cost incurred in locating a PCS user in the event of an incoming call is the sum of the cost of querying the HLR (and receiving the response), and the cost of querying the VLR which the HLR points to (and receiving the response). Let

α = cost of querying the HLR and receiving a response
β = cost of querying the pointed VLR and receiving a response

Then, the cost of the BasicFIND( ) operation is

$$C_B = \alpha + \beta \tag{20.2}$$

To quantify this further, assume costs for traversing various network elements as follows.

$A_l$ = cost of transmitting a location request or response message on A link between SSP and LSTP
$D$ = cost of transmitting a location request or response message on D link
$A_r$ = cost of transmitting a location request or response message on A link between RSTP and SCP
$L$ = cost of processing and routing a location request or response message by LSTP
$R$ = cost of processing and routing a location request or response message by RSTP
$H_Q$ = cost of a query to the HLR to obtain the current VLR location
$V_Q$ = cost of a query to the VLR to obtain the routing address

Then, using the PCS reference network architecture (Fig. 20.2),

$$\alpha = 2(A_l + D + A_r + L + R) + H_Q \tag{20.3}$$

$$\beta = 2(A_l + D + A_r + L + R) + V_Q \tag{20.4}$$

From Eqs. (20.2)–(20.4)

$$C_B = 4(A_l + D + A_r + L + R) + H_Q + V_Q \tag{20.5}$$

We now calculate the cost of CacheFIND( ). We define the hit ratio as the relative frequency with which the cached pointer correctly points to the user’s location when it is consulted. Let

$p$ = cache hit ratio
$C_H$ = cost of the CacheFIND( ) procedure when there is a hit
$C_M$ = cost of the CacheFIND( ) procedure when there is a miss

Then the cost of CacheFIND( ) is

$$C_C = p\,C_H + (1 - p)\,C_M \tag{20.6}$$

For CacheFIND( ), the signalling network costs incurred in locating a user in the event of an incoming call depend on the hit ratio as well as the cost of querying the VLR that is stored in the cache; this VLR query may or may not involve traversing the RSTP. In the following, we say a VLR is a local VLR if it is served by the same LSTP as the originating switch, and a remote VLR otherwise. Let

q = Prob (VLR in originating switch’s cache is a local VLR)
δ = cost of querying a local VLR
ε = cost of querying a remote VLR
η = cost of updating the cache upon a miss

Then,

$$\delta = 4A_l + 2L + V_Q \tag{20.7}$$

$$\varepsilon = 4(A_l + D + L) + 2R + V_Q \tag{20.8}$$

$$C_H = q\delta + (1 - q)\varepsilon \tag{20.9}$$

Since updating the cache involves an operation to a fast local memory rather than a database operation, we shall assume in the following that η = 0. Then,

$$C_M = C_H + C_B = q\delta + (1 - q)\varepsilon + \alpha + \beta \tag{20.10}$$

From Eqs. (20.6), (20.9) and (20.10) we have

$$C_C = \alpha + \beta + \varepsilon - p(\alpha + \beta) + q(\delta - \varepsilon) \tag{20.11}$$

For net cost savings we require $C_C < C_B$, or that the hit ratio exceeds a hit ratio threshold $p_T$, derived using Eqs. (20.6), (20.9), and (20.2),

$$p > p_T = \frac{C_H}{C_B} = \frac{\varepsilon + q(\delta - \varepsilon)}{\alpha + \beta} \tag{20.12}$$

$$= \frac{4A_l + 4D + 4L + 2R + V_Q - q(4D + 2L + 2R)}{4A_l + 4D + 4A_r + 4L + 4R + H_Q + V_Q} \tag{20.13}$$

Equation (20.13) specifies the hit ratio threshold for a user, evaluated at a given switch, for which local maintenance of a cached location entry produces cost savings. As pointed out earlier, a given user’s hit ratio may be location dependent, since the rates of calls destined for that user may vary widely across switches. The hit ratio threshold in Eq. (20.13) is composed of heterogeneous cost terms, i.e., transmission link utilization, packet switch processing, and database access costs. Therefore, numerical evaluation of the hit ratio threshold requires either detailed knowledge of these individual quantities or some form of simplifying assumptions. Based on the latter approach, the following two possible methods of evaluation may be employed.

1. Assume one or more cost terms dominate, and simplify Eq. (20.13) by setting the remaining terms to zero.
2. Establish a common unit of measure for all cost terms, for example, time delay. In this case, Al, Ar, and D may represent transmission delays of fixed transmission speed (e.g., 56 kb/s) signalling links, L and R may constitute the sum of queueing and service delays of packet switches (i.e., STPs), and HQ and VQ the transaction delays for database queries.

In this section we adopt the first method and evaluate Eq. (20.13) assuming a single term dominates. (In Section 20.9 we present results using the second method.) Table 20.3 shows the hit ratio threshold required to obtain net cost savings, for each case in which one of the cost terms is dominant.

TABLE 20.3 Minimum Hit Ratios and LCMRs for Various Individual Dominant Signalling Network Cost Terms

Dominant Cost Term   Hit Ratio Threshold, pT   LCMR Threshold, LCMRT   LCMR Threshold (q = 0.043)   LCMR Threshold (q = 0.25)
Al                   1                         ∞                       ∞                            ∞
Ar                   0                         0                       0                            0
D                    1 − q                     1/q − 1                 22                           3
L                    1 − q/2                   2/q − 1                 45                           7
R                    1 − q/2                   2/q − 1                 45                           7
HQ                   0                         0                       0                            0
VQ                   1                         ∞                       ∞                            ∞

In Table 20.3 we see that if the cost of querying a VLR or of traversing a local A link is the dominant cost, caching for users who may move is never worthwhile, regardless of users’ call reception and mobility patterns. This is because the caching strategy essentially distributes the functionality of the HLR to the VLRs. Thus, the load on the VLR and the local A link is always increased, since any move by a user results in a cache miss. On the other hand, for a fixed user (or telephone), caching is always worthwhile. We also observe that if the remote A links or HLR querying are the bottlenecks, caching is worthwhile even for users with very low hit ratios.

As a simple average-case calculation, consider the net network benefit of caching when HLR access and update is the performance bottleneck. Consider a scenario where u = 50% of PCS users receive c = 80% of their calls from s = 5 RAs where their hit ratio p > 0, and s′ = 4 of the SSPs at those RAs contain sufficiently large caches. Assume that caching is applied only to this subset of users and to no other users. Suppose that the average hit ratio for these users is p = 80%, so that 80% of the HLR accesses for calls to these users from these RAs are avoided. Then the net saving in the accesses to the system’s HLR is H = (u c s′ p)/s = (0.5 × 0.8 × 4 × 0.8)/5 ≈ 25%.

We discuss other quantities in Table 20.3 next. It is first useful to relate the cache hit ratio to users’ calling and mobility patterns directly via the LCMR. Doing so requires making assumptions about the distribution of the user’s calls and moves. We consider the steady state where the incoming call stream from an SSP to a user is a Poisson process with arrival rate λ, and the time that the user resides in an RA has a general distribution F(t) with mean 1/µ. Thus,

$$\text{LCMR} = \frac{\lambda}{\mu} \tag{20.14}$$

Let t be the time interval between two consecutive calls from the SSP to the user and t1 be the time interval between the first call and the time when the user moves to a new RA. From the random observer property of the arrival call stream [7], the hit ratio is

$$p = \Pr[t < t_1] = \int_{t=0}^{\infty} \lambda e^{-\lambda t} \int_{t_1=t}^{\infty} \mu\,[1 - F(t_1)]\,dt_1\,dt$$

If F(t) is an exponential distribution, then

$$p = \frac{\lambda}{\lambda + \mu} \tag{20.15}$$

and we can derive the LCMR threshold, the minimum LCMR required for caching to be beneficial assuming incoming calls are a Poisson process and intermove times are exponentially distributed,

$$\text{LCMR}_T = \frac{p_T}{1 - p_T} \tag{20.16}$$
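For the exponential case, the steps from the double integral to Eq. (20.15) and on to Eq. (20.16) can be written out as follows. This is a short worked derivation under the Poisson and exponential assumptions already stated; it introduces no new symbols.

```latex
\begin{align*}
1 - F(t_1) &= e^{-\mu t_1}, \\
p &= \int_{0}^{\infty} \lambda e^{-\lambda t}
     \left( \int_{t}^{\infty} \mu e^{-\mu t_1}\, dt_1 \right) dt
   = \int_{0}^{\infty} \lambda e^{-(\lambda + \mu) t}\, dt
   = \frac{\lambda}{\lambda + \mu}, \\
p &= \frac{\lambda/\mu}{1 + \lambda/\mu}
   = \frac{\mathrm{LCMR}}{1 + \mathrm{LCMR}}
   \;\Longrightarrow\;
   \mathrm{LCMR} = \frac{p}{1 - p}.
\end{align*}
```

Because p/(1 − p) increases with p on [0, 1), the condition p > pT is equivalent to LCMR > pT/(1 − pT), which is the threshold of Eq. (20.16).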

Equation (20.16) is used to derive LCMR thresholds assuming various dominant cost terms, as shown in Table 20.3. Several values for LCMRT in Table 20.3 involve the term q, i.e., the probability that the pointed VLR is a local VLR. These values may be numerically evaluated under simplifying assumptions. For example, assume that all of the SSPs in the network are uniformly distributed amongst l LSTPs. Also, assume that all of the PCS subscribers are uniformly distributed in location across all SSPs and that each subscriber exhibits the same incoming call rate at every SSP. Under those conditions, q is simply 1/l. Consider the case of the public switched telephone network. Given that there are a total of 160 local access transport areas (LATAs) across the 7 Regional Bell Operating Company (RBOC) regions [4], the average number of LATAs per region, or l, is 160/7 or 23. Table 20.3 shows the results with q = 1/l in this case.

We observe that the assumption that all users receive calls uniformly from all switches in the network is extremely conservative. In practice, we expect that user call reception patterns would display significantly more locality, so that q would be larger and the LCMR thresholds required to make caching worthwhile would be smaller. It is also worthwhile to consider the case of an RBOC region with PCS deployed in a few LATAs only, a likely initial scenario, say, 4 LATAs. In either case the value of q would be significantly higher; Table 20.3 shows the LCMR threshold when q = 0.25.

It is possible to quantify the net costs and benefits of caching in terms of signalling network impacts in this way and to determine the hit ratio and LCMR threshold above which users should have the caching strategy applied. Applying caching to users whose hit ratio and LCMR are below this threshold results in net increases in network impacts. It is, thus, important to estimate users’ LCMRs accurately. The next section discusses how to do so.
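The threshold calculations of this section can be sketched in Python as follows. The function names are illustrative, `eps` stands for the remote-VLR query cost of Eq. (20.8), and the unit cost of 1.0 for the dominant term is arbitrary because it cancels in the ratio. The loop evaluates a few single-dominant-term cases for q = 0.25.

```python
import math


def hit_ratio_threshold(q, Al=0.0, D=0.0, Ar=0.0, L=0.0, R=0.0, HQ=0.0, VQ=0.0):
    """Hit ratio threshold p_T = C_H / C_B, Eqs. (20.12)-(20.13)."""
    delta = 4 * Al + 2 * L + VQ                       # local VLR query, Eq. (20.7)
    eps = 4 * (Al + D + L) + 2 * R + VQ               # remote VLR query, Eq. (20.8)
    c_hit = q * delta + (1 - q) * eps                 # Eq. (20.9)
    c_basic = 4 * (Al + D + Ar + L + R) + HQ + VQ     # Eq. (20.5)
    return c_hit / c_basic


def lcmr_threshold(p_t):
    """LCMR threshold of Eq. (20.16); infinite when p_T reaches 1."""
    return math.inf if p_t >= 1.0 else p_t / (1.0 - p_t)


results = {}
for name in ("D", "L", "HQ", "VQ"):
    p_t = hit_ratio_threshold(0.25, **{name: 1.0})    # only `name` has nonzero cost
    results[name] = (p_t, lcmr_threshold(p_t))
    print(name, p_t, lcmr_threshold(p_t))
```

With a common unit such as time delay (the second evaluation method), the same function can be called with all seven costs set to measured values instead of a single nonzero term.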

20.9

Techniques for Estimating Users’ LCMR

Here we sketch some methods of estimating users’ LCMR. A simple and attractive policy is to not estimate these quantities on a per-user basis at all. For instance, if the average LCMR over all users in a PCS system is high enough (and from Table 20.3, it need not be high depending on which network elements are the dominant costs), then caching could be used at every SSP to yield net system-wide benefits. Alternatively, if it is known that at any given SSP the average LCMR over all users is high enough, a cache can be installed at that SSP. Other variations can be designed. One possibility for deciding about caching on a per-user basis is to maintain information about a user’s calling and mobility pattern at the HLR and to download it periodically to selected SSPs during off-peak hours. It is easy to envision numerous variations on this idea. In this section we investigate two possible techniques for estimating LCMR on a per-user basis when caching is to be deployed. The first algorithm, called the running average algorithm, simply maintains a running average of the hit ratio for each user. The second algorithm, called the reset-K algorithm, attempts to obtain a measure of the hit ratio over the recent history of the user’s movements. We describe the two algorithms next and evaluate their effectiveness using a stochastic analysis taking into account user calling and mobility patterns.

20.9.1

The Running Average Algorithm

The running average algorithm maintains, for every user that has a cache entry, the running average of the hit ratio. A running count is kept of the number of calls to a given user, and, regardless of the FIND procedure used to locate the user, a running count of the number of times that the user was at the same location for any two consecutive calls; the ratio of these numbers provides the measured running average of the hit ratio. We denote the measured running average of the hit ratio by pM; in steady state, we expect that pM = p. The user’s previous location as stored in the cache entry is used only if the running average of the hit ratio pM is greater than the cache hit threshold pT. Recall that the cache scheme outperforms the basic scheme if p > pT = CH/CB. Thus, in steady state, the running average algorithm will outperform the basic scheme when pM > pT.

We consider, as before, the steady state where the incoming call stream from an SSP to a user is a Poisson process with arrival rate λ, and the time that the user resides in an RA has an exponential distribution with mean 1/µ. Thus LCMR = λ/µ [Eq. (20.14)] and the location tracking cost at steady state is

$$C_C = \begin{cases} p_M C_H + (1 - p_M)\,C_B, & p_M > p_T \\ C_B, & \text{otherwise} \end{cases} \tag{20.17}$$

Figure 20.6 plots the cost ratio CC/CB from Eq. (20.17) against LCMR. (This corresponds to assigning uniform units to all cost terms in Eq. (20.13), i.e., the second evaluation method as discussed in Section 20.8. Thus, the ratio CC/CB may represent the percentage reduction in user location time with the caching strategy compared to the basic strategy.) The figure indicates that in the steady state, the caching strategy with the running average algorithm for estimating LCMR can significantly outperform the basic scheme if LCMR is sufficiently large. For instance, with LCMR ≈ 5, caching can lead to cost savings of 20–60% over the basic strategy. Equation (20.17) (cf. solid curves in Fig. 20.6) is validated against a simple Monte Carlo simulation (cf. dashed curves in Fig. 20.6). In the simulation, the confidence interval for the 95% confidence level of the output measure CC/CB is within 3% of the mean value. This simulation model will later be used to study the running average algorithm when the mean of the movement distribution changes from time to time [which cannot be modeled by using Eq. (20.17)].

FIGURE 20.6: The location tracking cost for the running average algorithm.

One problem with the running average algorithm is that the parameter p is measured from the entire past history of the user’s movement, and the algorithm may not be sufficiently dynamic to adequately reflect the recent history of the user’s mobility patterns.
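A minimal Python sketch of the running average estimator described in this subsection follows. The class name and its methods are hypothetical, and the sketch assumes that the hit ratio is measured over pairs of consecutive calls, so the very first call to a user contributes no observation.

```python
class RunningAverageEstimator:
    """Tracks, over the whole call history, how often a user is found
    at the same location on two consecutive calls."""

    def __init__(self, p_threshold):
        self.p_threshold = p_threshold   # cache hit threshold p_T
        self.pairs = 0                   # consecutive-call pairs observed
        self.same_location = 0           # pairs with an unchanged location
        self.last_location = None

    def record_call(self, location):
        """Record where the user was found for the latest call."""
        if self.last_location is not None:
            self.pairs += 1
            if location == self.last_location:
                self.same_location += 1
        self.last_location = location

    @property
    def measured_hit_ratio(self):
        """Measured running average p_M (0.0 before any pair is seen)."""
        return self.same_location / self.pairs if self.pairs else 0.0

    def use_cache(self):
        """Use the cached pointer only when p_M exceeds p_T."""
        return self.measured_hit_ratio > self.p_threshold


est = RunningAverageEstimator(p_threshold=0.5)
for ra in ["A", "A", "B", "B", "B"]:
    est.record_call(ra)
print(est.measured_hit_ratio, est.use_cache())
```

Because the counters are never reset, a long history of past behavior can outweigh a recent change in the user's mobility, which is the weakness noted above.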

20.9.2

The Reset-K Algorithm

We may modify the running average algorithm such that p is measured from the recent history. Define every K incoming calls as a cycle. The modified algorithm, which is referred to as the reset-K algorithm, counts the number of cache hits n in a cycle. If the measured hit ratio for a user, pM = n/K ≥ pT , then the cache is enabled for that user, and the cached information is always used to locate the user in the next cycle. Otherwise, the cache is disabled for that user, and the basic scheme is used. At the beginning of a cycle, the cache hit count is reset, and a new pM value is measured during the cycle. To study the performance of the reset-K algorithm, we model the number of cache misses in a cycle by a Markov process. Assume as before that the call arrivals are a Poisson process with arrival rate λ and the time period the user resides in an RA has an exponential distribution with mean 1/µ. A pair (i, j ), where i > j , represents the state that there are j cache misses before the first i incoming 1999 by CRC Press LLC

c

1999 by CRC Press LLC

c

FIGURE 20.7: State transitions.

phone calls in a cycle. A pair (i, j)∗, where i ≥ j ≥ 1, represents the state that there are j − 1 cache misses before the first i incoming phone calls in a cycle, and the user moves between the ith and the (i + 1)th phone calls. The difference between (i, j) and (i, j)∗ is that if the Markov process is in the state (i, j) and the user moves, then the process moves into the state (i, j + 1)∗. On the other hand, if the process is in state (i, j)∗ when the user moves, the process remains in (i, j)∗ because at most one cache miss occurs between two consecutive phone calls. Figure 20.7(a) illustrates the transitions for state (i, 0) where 2 < i < K + 1. The Markov process moves from (i − 1, 0) to (i, 0) if a phone call arrives before the user moves out. The rate is λ. The process moves from (i, 0) to (i, 1)∗ if the user moves to another RA before the (i + 1)th call arrival. Let π(i, j) denote the probability of the process being in state (i, j). Then the transition equation is

$$\pi(i, 0) = \frac{\lambda}{\lambda + \mu}\,\pi(i - 1, 0),$$
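The bookkeeping of the reset-K algorithm described above can be sketched as follows. This is only an illustrative sketch: the class name `ResetKCache`, its attribute names, and the choice to start with the cache disabled are assumptions, and the caller is assumed to report for every incoming call whether the cached pointer was (or would have been) a hit.

```python
class ResetKCache:
    """Per-user cache on/off decision, re-evaluated every K incoming calls."""

    def __init__(self, K, p_threshold):
        self.K = K                        # cycle length in incoming calls
        self.p_threshold = p_threshold    # threshold hit ratio pT
        self.calls_in_cycle = 0
        self.hits_in_cycle = 0            # n, the cache hits counted in this cycle
        self.cache_enabled = False        # assumption: begin with the basic scheme

    def record_call(self, cache_hit):
        """Count one incoming call; at the end of a cycle, decide for the next one."""
        self.calls_in_cycle += 1
        if cache_hit:
            self.hits_in_cycle += 1
        if self.calls_in_cycle == self.K:
            p_measured = self.hits_in_cycle / self.K        # pM = n / K
            self.cache_enabled = p_measured >= self.p_threshold
            self.calls_in_cycle = 0                         # reset for the new cycle
            self.hits_in_cycle = 0


user = ResetKCache(K=4, p_threshold=0.5)
for hit in (True, True, False, False):    # pM = 2/4, so the cache is enabled
    user.record_call(hit)
print(user.cache_enabled)
```

The decision only changes at cycle boundaries, which is what makes the number of misses within one cycle a natural state variable for the Markov model.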

2 1.5 with M = 8), but an increased range in the downlink is also needed to get an effective reduction of the number of cell sites. An increased downlink range can be achieved using adaptive beamforming (but with a much higher complexity compared to the uplink-only implementation), a multibeam antenna (i.e., a phased array doing fixed beamforming), or an increased transmit power of the base station. However, the success of smart antenna techniques for range extension applications in second generation systems has been slowed down by their complexity of implementation and by operational constraints (multiple feeders, large antenna panels).

26.5

High Bit Rate Data Transmission

26.5.1

Circuit Mode Techniques

All second generation wireless systems support circuit mode data services with basic rates typically ranging from 9.6 kb/s (in cellular systems) to 32 kb/s (in cordless systems) for a single physical radio resource. With the growing needs for higher rates, new services have been developed based on multiple allocation or grouping of physical resources. In GSM, HSCSD (High Speed Circuit Switched Data) enables multiple Full Rate Traffic Channels (TCH/F) to be allocated to a call so that a mobile subscriber can use n times the transmission capacity of a single TCH/F channel (Fig. 26.2). The n full rate channels over which the user data stream is split are handled completely independently in the physical layer and for layer 1 error control. The HSCSD channel resulting from the logical combination of n TCH/F channels is controlled as a single radio link during cellular operations such as handover. At the A interface, calls will be limited to a single 64 kb/s circuit. Thus HSCSD will support transparent (up to 64 kb/s) and nontransparent modes (up to 4 × 9.6 = 38.4 kb/s and, later, 4 × 14.4 = 57.6 kb/s). Initially the network allocates an appropriate HSCSD connection according to the requested user bit rate over the air interface. This initial allocation can be changed during a call if required by the user and authorized by the network. Both symmetric and asymmetric configurations for bidirectional HSCSD operation are authorized. The required TCH/F channels are allocated over consecutive or nonconsecutive timeslots.

FIGURE 26.2: Simplified GSM network configuration for HSCSD.

Similar multislot schemes are envisaged or standardized for other TDMA systems. In IS-54 and PDC, where radio channels are relatively narrowband, no more than three time slots can be used per carrier and the achievable data rate is therefore limited to about 32 kb/s. In DECT, by contrast, up to 12 time slots can be used at 32 kb/s each, yielding a maximum data rate of 384 kb/s. Moreover, the TDD access mode of DECT allows asymmetric time slot allocation between uplink and downlink, thus enabling even higher data rates in one direction.
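The multislot rates quoted above all follow from the same arithmetic: the number of grouped slots times the per-slot rate, optionally capped by a bottleneck such as the 64 kb/s A-interface circuit. A minimal sketch, in which the function name `multislot_rate_kbps` and its `cap_kbps` parameter are illustrative assumptions:

```python
def multislot_rate_kbps(n_slots, per_slot_kbps, cap_kbps=None):
    """Aggregate rate of n grouped slots, limited by an optional cap."""
    rate = n_slots * per_slot_kbps
    return rate if cap_kbps is None else min(rate, cap_kbps)


A_INTERFACE_KBPS = 64.0   # single 64 kb/s circuit at the A interface

hscsd_initial = multislot_rate_kbps(4, 9.6, cap_kbps=A_INTERFACE_KBPS)   # 4 x 9.6
hscsd_later = multislot_rate_kbps(4, 14.4, cap_kbps=A_INTERFACE_KBPS)    # 4 x 14.4
dect_max = multislot_rate_kbps(12, 32.0)                                 # 12 x 32

print(round(hscsd_initial, 1), round(hscsd_later, 1), round(dect_max, 1))
```

With these inputs the cap is never reached, which is consistent with the nontransparent limits of 38.4 and 57.6 kb/s stated in the text.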

26.5.2

Packet Mode Techniques

There is a growing interest in packet data services in second generation wireless systems to support data applications with intermittent and bursty transmission requirements, such as Internet access, with a better usage of available radio resources, thanks to the multiplexing of data from several mobile users on the same physical channel. Cellular Digital Packet Data (CDPD) has been defined in the U.S. as a radio access overlay for AMPS or D-AMPS (IS-54) systems, allowing packet data transmission on available radio channels. However, CDPD is optimized for short data transmission and the bit rate is limited to 19.2 kb/s. A CDMA packet data standard has also been defined (IS-657) which supports CDPD and Internet protocols with a similar bit rate limitation but allowing use of the same backhaul as for voice traffic. In Europe, ETSI has almost completed the standardization of GPRS (General Packet Radio Service) for GSM. A GPRS subscriber will be able to send and receive in an end-to-end packet transfer mode. Both point-to-point and point-to-multipoint modes are defined. A GPRS network coexists with a GSM PLMN as an autonomous network. In fact, the Serving GPRS Support Node (SGSN) interfaces with the GSM Base Station Controller (BSC), an MSC and a Gateway GPRS Support Node (GGSN). In turn, the GGSN interfaces with the GGSNs of other GPRS networks and with public Packet Data Networks (PDN). Typically, GPRS traffic can be set up through the common control channels of GSM, which are accessed in slotted ALOHA mode. The layer 2 protocol data units, which are about 2 kbytes in length, are segmented and transmitted over the air interface using one of the four possible channel coding schemes. The system is highly scalable as it allows from one mobile using 8 radio time slots up to 16 mobiles per time slot, with separate allocation in up- and downlink. The resulting peak data rate per user ranges from 9 kb/s up to 170 kb/s. Time slot concatenation and variable channel coding to maximize the user information bit rate are envisaged for future implementations. During the call set-up phase, the mobile station indicates whether it wishes to initiate in-call modifications and which channel coding schemes can be used. It is expected that use of the GPRS service will initially be limited and traffic growth will depend on the introduction of GPRS capable subscriber terminals. Easy scalability of the GPRS backbone (e.g., by introducing parallel GGSNs) is an essential feature of the system architecture (Fig. 26.3).

FIGURE 26.3: Simplified view of the GPRS architecture.
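The quoted range of roughly 9 to 170 kb/s per user can be reproduced by combining the number of allocated time slots with the per-slot rate of the chosen coding scheme. The sketch below is hedged: the text only says that four coding schemes exist, so the per-slot rates in `CS_RATES_KBPS` are the commonly quoted GPRS values and should be treated as an assumption, as should the function name.

```python
# Assumed per-slot user rates of the four GPRS coding schemes (kb/s).
# These figures are not given in the text above.
CS_RATES_KBPS = {"CS-1": 9.05, "CS-2": 13.4, "CS-3": 15.6, "CS-4": 21.4}

MAX_SLOTS_PER_MOBILE = 8   # one mobile may use up to 8 radio time slots


def gprs_peak_rate_kbps(coding_scheme, n_slots):
    """Peak user rate for one mobile holding n_slots with a given coding scheme."""
    if not 1 <= n_slots <= MAX_SLOTS_PER_MOBILE:
        raise ValueError("a mobile can use between 1 and 8 time slots")
    return CS_RATES_KBPS[coding_scheme] * n_slots


lowest = gprs_peak_rate_kbps("CS-1", 1)    # strongest coding, single slot
highest = gprs_peak_rate_kbps("CS-4", 8)   # weakest coding, all eight slots
print(round(lowest, 2), round(highest, 1))
```

Under these assumed rates the extremes come out close to the 9 kb/s and 170 kb/s figures given in the text.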

26.5.3

New Modulation Schemes

New modulation schemes are being studied as an option in several second generation wireless standards. The aim is to offer higher rate data services equivalent or close to the 2 Mb/s objective of the forthcoming third generation standards. Multilevel modulations (i.e., several bits per modulated symbol) represent a straightforward means to increase the carrier bit rate. However, they represent a significant change in the air interface characteristics, and the increased bit rate is achieved at the expense of a higher operational signal-to-noise plus interference ratio, which is not compatible with large cell dimensions. Therefore, the new high bit rate data services are mainly targeting urban areas, and the effective bit rate allocated to data users will depend on the system load. Such a new air interface option is being standardized for GSM under the name of EDGE (Enhanced Data rates for GSM Evolution). The selected modulation scheme is 8-PSK and suitable coding schemes are under study, whereas the other air interface parameters (carrier spacing, TDMA frame structure, etc.) are kept unchanged. Reusing HSCSD (for circuit data) and GPRS (for packet data) protocols and service capabilities, EDGE will provide similar ECSD and EGPRS services but with a three-fold increase of the user bit rate. The higher level modulation requires better radio link performance: typically, sensitivity is degraded by 3 to 4 dB and the required C/I increases by 6 to 7 dB. Operation will also be restricted to environments with limited time dispersion and limited mobile speed. Nevertheless, EGPRS will roughly double the mean throughput compared to GPRS (for the same average transmitted power). EDGE will also increase the maximum achievable data rate in a GSM system to 553.6 kb/s in multislot (unprotected) operation. Six different protection schemes are foreseen in EGPRS using convolutional coding with a rate ranging from 1/3 to 1 and corresponding to user rates between 22.8 and 69.2 kb/s per time slot. This is in addition to the four coding schemes already defined for GPRS. An intelligent link adaptation algorithm will dynamically select the most appropriate modulation and coding schemes, i.e., those yielding the highest throughput for a given channel quality. The first phase of EDGE standardization should be completed by end 1999. It should be noted that a similar EDGE option is being studied for IS-54/IS-136 (and their PCS derivatives).
Initially, the 30 kHz channel spacing will be maintained and then extension to a 200 kHz channel will be provided in order to offer a convergence with its GSM counterpart. A higher bit rate option is also under standardization for DECT. Here it is seen as an essential requirement to maintain backward compatibility with existing equipment, so the new multilevel modulation will only affect the payload part of the bursts, keeping the control and signalling parts unchanged. This ensures that equipment with basic modulation and equipment with a higher rate option can efficiently share a common base station infrastructure. Only 4-level and 8-level modulations are considered and the symbol length, carrier spacing, and slot structure remain unchanged. The requirements on transmitter modulation accuracy need to be more stringent for 4- and 8-level modulation than for the current 2-level scheme. An increased accuracy can provide for coherent demodulation, whereby some (or most) of the sensitivity and C/I loss when using the multilevel mode can be regained. In combination with other new air interface features like forward error correction and double slots (with reduced overhead), the new modulation scheme will provide a wide range of data rates up to 2 Mb/s. For instance, using π/4-DQPSK modulation (a possible and suitable choice), an unprotected connection with two double slots in each direction gives a data rate of 384 kb/s. Asymmetric connections with a maximum of 11 double slots in one direction will also be supported.
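The link adaptation principle mentioned for EGPRS (pick the scheme with the highest throughput for the current channel quality) can be illustrated with a small sketch. Everything except the two per-slot user rates (22.8 and 69.2 kb/s) is hypothetical: the scheme labels, the block error rate figures, and the simple throughput model rate × (1 − block error rate) are assumptions made for illustration and are not taken from the EDGE specifications.

```python
def select_scheme(user_rates_kbps, block_error_rates):
    """Return (scheme, expected throughput) maximizing rate * (1 - BLER)."""
    best = None
    for scheme, rate in user_rates_kbps.items():
        throughput = rate * (1.0 - block_error_rates[scheme])
        if best is None or throughput > best[1]:
            best = (scheme, throughput)
    return best


# The lowest and highest per-slot user rates quoted in the text; labels are hypothetical.
RATES = {"most_protected": 22.8, "unprotected": 69.2}

good_channel = {"most_protected": 0.00, "unprotected": 0.10}   # hypothetical BLERs
poor_channel = {"most_protected": 0.02, "unprotected": 0.90}   # hypothetical BLERs

print(select_scheme(RATES, good_channel)[0])   # unprotected
print(select_scheme(RATES, poor_channel)[0])   # most_protected

# Multislot, unprotected operation: eight slots at the highest per-slot rate.
print(round(8 * RATES["unprotected"], 1))      # 553.6
```

The last line reproduces the 553.6 kb/s maximum quoted for multislot operation as 8 × 69.2 kb/s.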

26.6

Conclusion

Since their introduction in the early 1990s, most of the second generation systems have been enjoying exponential growth. With more than 100 million subscribers acquired worldwide in less than ten years of lifetime, the systems based on the GSM family of standards have demonstrated the most spectacular development. Despite a more regional implementation of other second generation systems, each one of those can boast a multimillion subscriber base in mobile or fixed wireless networks. A variety of service requirements of third generation mobile communication systems are already being met by the upcoming enhancements of second generation systems. Two important trends are reflected by this:

• The introduction of third generation systems like Universal Mobile Telecommunication System (UMTS) or International Mobile Telecommunication-2000 (IMT-2000) might be delayed to a point in time where the evolutionary capabilities of second generation systems have been exhausted.

• The deployment of networks based on third generation systems will be progressive. Any new radio interface will be imposed worldwide if and only if it provides substantial advantages as compared to the present systems. Another essential requirement is the capability of downward compatibility to second generation systems.

Defining Terms

Capacity: In a mobile network it can be defined as the Erlangs throughput by a cell, a cluster of cells, or by a portion of a network. For a given radio interface, the achievable capacity is a function of the robustness of the physical layer, the effectiveness of the medium access control (MAC) layer and the multiple access technique. Moreover, it is strongly dependent on the radio spectrum available for network planning.

Cellular: Refers to public land mobile radio networks for generally wide area (e.g., national) coverage, to be used with medium- or high-power vehicular mobiles or portable stations and for providing mobile access to the Public Switched Telephone Network (PSTN). The network implementation exhibits a cellular architecture which enables frequency reuse in nonadjacent cells.

Cordless: These are systems to be used with simple low power portable stations operating within a short range of a base station and providing access to fixed public or private networks. There are three main applications, namely, residential (at home, for Plain Old Telephone Service, POTS), public-access (in public places and crowded areas, also called Telepoint), and Wireless Private Automatic Branch eXchange (WPABX, providing cordless access in the office environment), plus emerging applications like radio access for local loop.

Coverage quality: It is the percentage of the served area where a communication can be established. It is determined by the acceptable path loss of the radio link and by the propagation characteristics in the area. The radio link budget generally includes some margin depending on the type of terrain (for shadowing effects) and on operator’s requirements (for indoor penetration). A coverage quality of 90% is a typical value for cellular networks.

Speech quality: It strongly depends on the intrinsic performance of the speech coder and its evaluation normally requires intensive listening tests. When it is comparable to the quality achieved on modern wire-line telephone networks, it is called “toll quality.” In wireless systems it is also influenced by other parameters linked to the communication characteristics like radio channel impairments (bit error rate), transmission delay, echo, background noise, and tandeming (i.e., when several coding/decoding operations are involved in the link).

References

[1] Anderson, S., Antenna Arrays in Mobile Communication Systems, Proc. Second Workshop on Smart Antennas in Wireless Mobile Communications, Stanford University, Jul. 1995.
[2] Budagavi, M. and Gibson, J.D., Speech coding in mobile radio communications, Proceedings of the IEEE, 86(7), 1402–1412, Jul. 1998.
[3] Cox, D.C., Wireless network access for personal communications, IEEE Communications Magazine, 96–115, Dec. 1992.
[4] DECT, Digital European Cordless Telecommunications Common Interface, ETS-300-175, ETSI, 1992.
[5] Fingscheidt, T. and Vary, P., Robust Speech Decoding: A Universal Approach to Bit Error Concealment, Proc. IEEE ICASSP, 1667–1670, Apr. 1997.
[6] Ganz, A., et al., On optimal design of multitier wireless cellular systems, IEEE Communications Magazine, 88–93, Feb. 1997.
[7] GSM, GSM Recommendations Series 01-12, ETSI, 1990.
[8] IS-54, Cellular System, Dual-Mode Mobile Station-Base Station Compatibility Standard, EIA/TIA Interim Standard, 1991.
[9] IS-95, Mobile Station-Base Station Compatibility Standard for Dual-Mode Wideband Spread Spectrum Cellular System, EIA/TIA Interim Standard, 1993.
[10] Kuhn, A., et al., Validation of the Feature Frequency Hopping in a Live GSM Network, Proc. 46th IEEE Vehic. Tech. Conf., 321–325, Apr. 1996.
[11] Lagrange, X., Multitier cell design, IEEE Communications Magazine, 60–64, Aug. 1997.
[12] Lee, D. and Xu, C., The effect of narrowbeam antenna and multiple tiers on system capacity in CDMA wireless local loop, IEEE Communications Magazine, 110–114, Sep. 1997.
[13] Olofsson, H., et al., Interference Diversity as Means for Increased Capacity in GSM, Proc. EPMCC’95, 97–102, Nov. 1995.
[14] PDC, Personal Digital Cellular System Common Air Interface, RCR-STD27B, 1991.
[15] PHS, Personal Handy Phone System: Second Generation Cordless Telephone System Standard, RCR-STD28, 1993.
[16] Tuttlebee, W.H.W., Cordless personal communications, IEEE Communications Magazine, 42–53, Dec. 1992.

Further Information European standards (GSM, CT2, DECT, TETRA) are published by ETSI Secretariat, 06921 Sophia Antipolis Cedex, France. U.S. standards (IS-54, IS-95, APCO) are published by Electronic Industries Association, Engineering Department, 2001 Eye Street, N.W., Washington D.C. 20006, U.S.A. Japanese standards (PDC, PHS) are published by RCR (Research and Development Center for Radio Systems), 1-5-16, Toranomon, Minato-ku, Tokyo 105, Japan.


Hanzo, L. “The Pan-European Cellular System” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


The Pan-European Cellular System

Lajos Hanzo University of Southampton

27.1 Introduction
27.2 Overview
27.3 Logical and Physical Channels
27.4 Speech and Data Transmission
27.5 Transmission of Control Signals
27.6 Synchronization Issues
27.7 Gaussian Minimum Shift Keying Modulation
27.8 Wideband Channel Models
27.9 Adaptive Link Control
27.10 Discontinuous Transmission
27.11 Summary
Defining Terms
References

27.1

Introduction

Following the standardization and launch of the Pan-European digital mobile cellular radio system known as GSM, it is of practical merit to provide a rudimentary introduction to the system’s main features for the communications practitioner. Since GSM operating licenses have been allocated to 126 service providers in 75 countries, it is justifiable that the GSM system is often referred to as the Global System for Mobile communications. The GSM specifications were released as 13 sets of recommendations [1], which are summarized in Table 27.1, covering various aspects of the system [3]. After a brief system overview in Section 27.2 and the introduction of physical and logical channels in Section 27.3, we describe how logical channels are mapped onto physical resources for speech and control channels in Sections 27.4 and 27.5, respectively. These details can be found in recommendations R.05.02 and R.05.03. These recommendations and all subsequently enumerated ones are to be found in [1]. Synchronization issues are considered in Section 27.6. Modulation (R.05.04), transmission via the standardized wideband GSM channel models (R.05.05), as well as adaptive radio link control (R.05.06 and R.05.08), discontinuous transmission (DTX) (R.06.31), and voice activity detection (VAD) (R.06.32) are highlighted in Sections 27.7–27.10, whereas a summary of the fundamental GSM features is offered in Section 27.11.


TABLE 27.1 GSM Recommendations [R.01.01]

R.00   Preamble to the GSM recommendations
R.01   General structure of the recommendations, description of a GSM network, associated recommendations, vocabulary, etc.
R.02   Service aspects: bearer-, tele- and supplementary services, use of services, types and features of mobile stations (MS), licensing and subscription, as well as transferred and international accounting, etc.
R.03   Network aspects, including network functions and architecture, call routing to the MS, technical performance, availability and reliability objectives, handover and location registration procedures, as well as discontinuous reception and cryptological algorithms, etc.
R.04   Mobile/base station (BS) interface and protocols, including specifications for layer 1 and 3 aspects of the open systems interconnection (OSI) seven-layer structure.
R.05   Physical layer on the radio path, incorporating issues of multiplexing and multiple access, channel coding and modulation, transmission and reception, power control, frequency allocation and synchronization aspects, etc.
R.06   Speech coding specifications, such as functional, computational and verification procedures for the speech codec and its associated voice activity detector (VAD) and other optional features.
R.07   Terminal adaptors for MSs, including circuit and packet mode as well as voiceband data services.
R.08   Base station and mobile switching center (MSC) interface, and transcoder functions.
R.09   Network interworking with the public switched telephone network (PSTN), integrated services digital network (ISDN) and packet data networks.
R.10   Service interworking, short message service.
R.11   Equipment specification and type approval specification as regards MSs, BSs, MSCs, home (HLR) and visited location register (VLR), as well as system simulator.
R.12   Operation and maintenance, including subscriber, routing tariff and traffic administration, as well as BS, MSC, HLR and VLR maintenance issues.

27.2

Overview

The system elements of a GSM public land mobile network (PLMN) are portrayed in Fig. 27.1, where their interconnections via the standardized interfaces A and Um are indicated as well. The mobile station (MS) communicates with the serving and adjacent base stations (BS) via the radio interface Um, whereas the BSs are connected to the mobile switching center (MSC) through the network interface A. As seen in Fig. 27.1, the MS includes a mobile termination (MT) and a terminal equipment (TE). The TE may be constituted, for example, by a telephone set and fax machine. The MT performs functions needed to support the physical channel between the MS and the base station, such as radio transmissions, radio channel management, channel coding/decoding, speech encoding/decoding, and so forth. The BS is divided functionally into a number of base transceiver stations (BTS) and a base station controller (BSC). The BS is responsible for channel allocation (R.05.09), link quality and power budget control (R.05.06 and R.05.08), signalling and broadcast traffic control, frequency hopping (FH) (R.05.02), handover (HO) initiation (R.03.09 and R.05.08), etc. The MSC represents the gateway to other networks, such as the public switched telephone network (PSTN), integrated services digital network (ISDN) and packet data networks using the interworking functions standardized in recommendation R.09. The MSC’s further functions include paging, MS location updating (R.03.12), HO control (R.03.09), etc. The MS’s mobility management is assisted by the home location register (HLR) (R.03.12), storing part of the MS’s location information and routing incoming calls to the visitor location register (VLR) (R.03.12) in charge of the area where the paged MS roams. Location update is asked for by the MS whenever it detects from the received and decoded broadcast control channel (BCCH) messages that it has entered a new location area. The HLR contains, amongst a number of other parameters, the international mobile subscriber identity (IMSI), which is used for the authentication (R.03.20) of the subscriber by his authentication center (AUC). This enables the system to confirm that the subscriber is allowed to access it. Every subscriber belongs to a home network and the specific services that the subscriber is allowed to use are entered into his HLR. The equipment identity register (EIR) allows for stolen, fraudulent, or faulty mobile stations to be identified by the network operators. The VLR is the functional unit that attends to an MS operating outside the area of its HLR. The visiting MS is automatically registered at the nearest MSC, and the VLR is informed of the MS’s arrival. A roaming number is then assigned to the MS, and this enables calls to be routed to it. The operations and maintenance center (OMC), network management center (NMC) and administration center (ADC) are the functional entities through which the system is monitored, controlled, maintained and managed (R.12).

FIGURE 27.1: Simplified structure of GSM PLMN. © ETT [4].

The MS initiates a call by searching for a BS with a sufficiently high received signal level on the BCCH carrier; it will await and recognize a frequency correction burst and synchronize to it (R.05.08). Now the BS allocates a bidirectional signalling channel and also sets up a link with the MSC via the network. How the control frame structure assists in this process will be highlighted in Section 27.5. The MSC uses the IMSI received from the MS to interrogate its HLR and sends the data obtained to the serving VLR. After authentication (R.03.20) the MS provides the destination number, the BS allocates a traffic channel, and the MSC routes the call to its destination. If the MS moves to another cell, it is reassigned to another BS, and a handover occurs. If both BSs in the handover process are controlled by the same BSC, the handover takes place under the control of the BSC; otherwise it is performed by the MSC. In case of incoming calls the MS must be paged by the BSC. A paging signal is transmitted on a paging channel (PCH), which is monitored continuously by all MSs and covers the location area in which the MS roams. In response to the paging signal, the MS performs an access procedure identical to that employed when the MS initiates a call.
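The handover rule stated above (the BSC handles the handover when both base stations belong to it, otherwise the MSC does) amounts to a simple lookup. In the sketch below the topology dictionary, the station identifiers, and the function name `handover_controller` are all hypothetical and serve only to illustrate the rule.

```python
# Hypothetical topology: which BSC controls each base transceiver station.
BSC_OF_BTS = {
    "BTS-1": "BSC-A",
    "BTS-2": "BSC-A",
    "BTS-3": "BSC-B",
}


def handover_controller(serving_bts, target_bts, bsc_of_bts=BSC_OF_BTS):
    """Return which network entity controls a handover between two stations."""
    if bsc_of_bts[serving_bts] == bsc_of_bts[target_bts]:
        return "BSC"   # both stations are under the same controller
    return "MSC"       # the stations belong to different controllers


print(handover_controller("BTS-1", "BTS-2"))   # BSC
print(handover_controller("BTS-2", "BTS-3"))   # MSC
```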


27.3

Logical and Physical Channels

The GSM logical traffic and control channels are standardized in recommendation R.05.02, whereas their mapping onto physical channels is the subject of recommendations R.05.02 and R.05.03. The GSM system’s prime objective is to transmit the logical traffic channel’s (TCH) speech or data information. Their transmission via the network requires a variety of logical control channels. The set of logical traffic and control channels defined in the GSM system is summarized in Table 27.2. There are two general forms of speech and data traffic channels: the full-rate traffic channels (TCH/F), which carry information at a gross rate of 22.8 kb/s, and the half-rate traffic channels (TCH/H), which communicate at a gross rate of 11.4 kb/s. A physical channel carries either a full-rate traffic channel, or two half-rate traffic channels. In the former, the traffic channel occupies one timeslot, whereas in the latter the two half-rate traffic channels are mapped onto the same timeslot, but in alternate frames.

TABLE 27.2 GSM Logical Channels. © ETT [4]

Traffic Channels: TCH (duplex BS ↔ MS)
   FEC-coded Speech:   TCH/F (22.8 kb/s); TCH/H (11.4 kb/s)
   FEC-coded Data:     TCH/F9.6, TCH/F4.8, TCH/F2.4 (22.8 kb/s); TCH/H4.8, TCH/H2.4 (11.4 kb/s)

Control Channels: CCH
   Broadcast CCH, BCCH (BS → MS):               Freq. Corr. Ch: FCCH; Synchron. Ch: SCH; General Inf.
   Common CCH, CCCH:                            Paging Ch: PCH (BS → MS); Random Access Ch: RACH (MS → BS); Access Grant Ch: AGCH (BS → MS)
   Stand-alone Dedicated CCH, SDCCH (BS ↔ MS):  SDCCH/4; SDCCH/8
   Associated CCH, ACCH (BS ↔ MS):              Fast ACCH: FACCH/F, FACCH/H; Slow ACCH: SACCH/TF, SACCH/TH, SACCH/C4, SACCH/C8

For a summary of the logical control channels carrying signalling or synchronisation data, see Table 27.2. There are four categories of logical control channels, known as the BCCH, the common control channel (CCCH), the stand-alone dedicated control channel (SDCCH), and the associated control channel (ACCH). The purpose and way of deployment of the logical traffic and control channels will be explained by highlighting how they are mapped onto physical channels in assisting high-integrity communications. A physical channel in a time division multiple access (TDMA) system is defined as a timeslot with a timeslot number (TN) in a sequence of TDMA frames. The GSM system, however, deploys TDMA combined with frequency hopping (FH) and, hence, the physical channel is partitioned in both time and frequency. Frequency hopping (R.05.02) combined with interleaving is known to be very efficient in combating channel fading, and it results in near-Gaussian performance even over hostile Rayleigh-fading channels. The principle of FH is that each TDMA burst is transmitted via a different RF channel (RFCH). If the present TDMA burst happened to be in a deep fade, then the next burst most probably will not be. Consequently, the physical channel is defined as a sequence of radio frequency channels and timeslots. Each carrier frequency supports eight physical channels mapped onto eight timeslots within a TDMA frame. A given physical channel always uses the same TN in every TDMA frame. Therefore, a timeslot sequence is defined by a TN and a TDMA frame number FN sequence.
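The definition of a frequency-hopped physical channel as a sequence of (RF channel, timeslot) pairs indexed by the frame number can be sketched as follows. This is a simplified illustration: the cyclic rule used to pick the RF channel, the `offset` parameter, and the RF channel numbers are assumptions, and the full hopping sequence generation of R.05.02 is not reproduced here.

```python
def physical_channel(tn, rf_channels, first_fn, n_frames, offset=0):
    """List (FN, RFCH, TN) triples for one physical channel with cyclic hopping.

    The timeslot number TN stays fixed, while the RF channel changes from one
    TDMA frame to the next by stepping cyclically through rf_channels.
    """
    if not 0 <= tn <= 7:
        raise ValueError("a TDMA frame has eight timeslots, numbered 0 to 7")
    n = len(rf_channels)
    return [(fn, rf_channels[(fn + offset) % n], tn)
            for fn in range(first_fn, first_fn + n_frames)]


# Hypothetical allocation of three RF channels to the hopping set.
for fn, rfch, tn in physical_channel(tn=3, rf_channels=[10, 20, 30],
                                     first_fn=0, n_frames=4):
    print(f"FN={fn} RFCH={rfch} TN={tn}")
```

Consecutive bursts of the same channel land on different carriers, which is what lets interleaving spread a deep fade on one carrier over several coded blocks.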

27.4

Speech and Data Transmission

The speech coding standard is recommendation R.06.10, whereas issues of mapping the logical speech traffic channel’s information onto the physical channel constituted by a timeslot of a certain carrier are specified in recommendation R.05.02. Since the error correction coding represents part of this mapping process, recommendation R.05.03 is also relevant to these discussions. The example of the full-rate speech traffic channel (TCH/FS) is used here to highlight how this logical channel is mapped onto the physical channel constituted by a so-called normal burst (NB) of the TDMA frame structure. This mapping is explained by referring to Figs. 27.2 and 27.3. Then this example will be extended to other physical bursts such as the frequency correction (FCB), synchronization (SB), access (AB), and dummy burst (DB) carrying logical control channels, as well as to their TDMA frame structures, as seen in Figs. 27.2 and 27.6. The regular pulse excited (RPE) speech encoder is fully characterized in the following references: [3, 5, 7]. Because of its complexity, its description is beyond the scope of this chapter. Suffice it to say that, as can be seen in Fig. 27.3, it delivers 260 b/20 ms at a bit rate of 13 kb/s, which are divided into three significance classes: class 1a (50 b), class 1b (132 b) and class 2 (78 b). The class-1a bits are encoded by a systematic (53, 50) cyclic error detection code by adding three parity bits. Then the class-1 bits are reordered and four zero tailing bits are added to periodically reset the memory of the subsequent half-rate, constraint length five convolutional codec (CC) CC(2, 1, 5), as portrayed in Fig. 27.3. The 53 + 132 + 4 = 189 bits at the codec input thus yield 378 coded bits. Now the unprotected 78 class-2 bits are concatenated to yield a block of 378 + 78 = 456 b/20 ms, which implies an encoded bit rate of 22.8 kb/s. This frame is partitioned into eight 57-b subblocks that are block diagonally interleaved before undergoing intraburst interleaving. At this stage each 57-b subblock is combined with a similar subblock of the previous 456-b frame to construct a 116-b burst, where the flag bits hl and hu are included to indicate whether the current burst is really a TCH/FS burst or has been stolen by an urgent fast associated control channel (FACCH) message. Now the bits are encrypted and positioned in an NB, as depicted at the bottom of Fig. 27.2, where three tailing bits (TB) are added at both ends of the burst to reset the memory of the Viterbi channel equalizer (VE), which is responsible for removing both the channel-induced and the intentional controlled intersymbol interference [6]. The guard period (GP) of 8.25-b interval duration at the bottom of Fig. 27.2 is provided to prevent burst overlapping due to propagation delay fluctuations. Finally, a 26-b equalizer training segment is included in the center of the normal traffic burst. This segment is constructed by a 16-b Viterbi channel equalizer training pattern surrounded by five quasiperiodically repeated bits on both sides. Since the MS has to be informed about which BS it communicates with, for neighboring BSs one of eight different training patterns is used, associated with the so-called BS color codes, which assist in identifying the BSs. This 156.25-b duration TCH/FS NB constitutes the basic timeslot of the TDMA frame structure, which is input to the Gaussian minimum shift keying (GMSK) modulator to be highlighted in Section 27.7, at a bit rate of approximately 271 kb/s. Since the bit interval is 1/(271 kb/s) = 3.69 µs, the timeslot duration is 156.25 · 3.69 ≈ 0.577 ms. Eight such normal bursts of eight appropriately staggered TDMA users are multiplexed onto one RF carrier, giving a TDMA frame of 8 · 0.577 ≈ 4.615-ms duration, as shown in Fig. 27.2. The physical channel as characterized earlier provides a physical timeslot with a throughput of 114 b/4.615 ms = 24.7 kb/s, which is sufficiently high to transmit the 22.8 kb/s TCH/FS information.
It even has a reserved capacity of 24.7 − 22.8 = 1.9 kb/s, which can be exploited to transmit slow control information associated with this specific traffic channel, i.e., to construct a so-called slow associated control channel (SACCH), constituted by the SACCH TDMA frames, interspersed with traffic frames at multiframe level of the hierarchy, as seen in Fig. 27.2.

FIGURE 27.2: The GSM TDMA frame structure, © ETT [4]. Levels shown: 1 hyperframe = 2048 superframes = 2,715,648 TDMA frames (3 hours, 28 minutes, ...); 1 superframe = 1326 TDMA frames (6.12 s); 1 multiframe = 26 TDMA frames (120 ms, e.g., TCH/FS) or 51 TDMA frames (235 ms, e.g., BCCH); 1 TDMA frame = 8 timeslots (4.615 ms); 1 timeslot = 156.25 bit durations (≈ 0.577 ms); 1 bit duration ≈ 3.69 µs. Normal burst fields: TB 3, 58 encrypted bits, 26-bit training segment, 58 encrypted bits, TB 3, GP 8.25.

FIGURE 27.3: Mapping the TCH/FS logical channel onto a physical channel, © ETT [4]. Labels shown: 260 bits/20 ms = 13 kbps; C1a 50 bits, C1b 132 bits, C2 78 bits; parity check (3 bits); 4 tailing bits; 189 bits entering the convolutional code r = 1/2, k = 5; blocks (n − 1) and (n), each divided into subblocks 0–7; 57-b subblocks combined with hl and hu into 114-b bursts.

Mapping logical data traffic channels onto a physical channel is essentially carried out by the channel codecs [8], as specified in recommendation R.05.03. The full- and half-rate data traffic channels standardized in the GSM system are: TCH/F9.6, TCH/F4.8, TCH/F2.4, as well as TCH/H4.8, TCH/H2.4, as was shown earlier in Table 27.2. Note that the numbers in these acronyms represent the data transmission rate in kilobits per second. Without considering the details of these mapping processes we now focus our attention on control signal transmission issues.
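The burst and frame timing figures quoted for the normal burst can be cross-checked with a short calculation. The sketch below is illustrative only; it assumes an exact bit rate of 1625/6 kb/s (about 270.833 kb/s), which the text rounds to 271 kb/s, and all names are hypothetical.

```python
from fractions import Fraction

# Assumption: exact bit rate of 1625/6 kb/s (~270.833 kb/s); the text quotes ~271 kb/s.
BIT_RATE_BPS = Fraction(1625, 6) * 1000
BITS_PER_SLOT = Fraction(625, 4)   # 156.25 bit durations per timeslot
SLOTS_PER_FRAME = 8
PAYLOAD_BITS_PER_BURST = 114       # two 57-b subblocks per burst
TCH_FS_KBPS = 22.8                 # FEC-coded full-rate speech rate


def burst_timing():
    """Return bit, timeslot and frame durations plus the per-timeslot throughput."""
    bit_us = float(1_000_000 / BIT_RATE_BPS)
    slot_ms = float(1000 * BITS_PER_SLOT / BIT_RATE_BPS)
    frame_ms = slot_ms * SLOTS_PER_FRAME
    throughput_kbps = PAYLOAD_BITS_PER_BURST / frame_ms  # bits per ms equals kb/s
    return {
        "bit_us": bit_us,
        "slot_ms": slot_ms,
        "frame_ms": frame_ms,
        "throughput_kbps": throughput_kbps,
        "reserve_kbps": throughput_kbps - TCH_FS_KBPS,
    }


if __name__ == "__main__":
    for name, value in burst_timing().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the values agree with the rounded figures in the text: a bit interval near 3.69 µs, a timeslot near 0.577 ms, a frame near 4.615 ms, and a reserve of about 1.9 kb/s.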

27.5

Transmission of Control Signals

The exact derivation, forward error correcting (FEC) coding and mapping of logical control channel information is beyond the scope of this chapter, and the interested reader is referred to ETSI, 1988 (R.05.02 and R.05.03) and Hanzo and Stefanov, 1992, for a detailed discussion. As an example, the mapping of the 184-b SACCH, FACCH, BCCH, SDCCH, PCH, and access grant control channel (AGCH) messages onto a 456-b block, i.e., onto four 114-b bursts, is demonstrated in Fig. 27.4. A double-layer concatenated Fire-code/convolutional code scheme generates 456 bits, using an overall coding rate of R = 184/456, which gives a stronger protection for control channels than the error protection of traffic channels.
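To make the bit budget of this concatenated scheme concrete, the following sketch appends 40 parity bits to a 184-b message by polynomial division over GF(2) and then counts the bits after rate-1/2 convolutional coding. It is a minimal illustration, not the standardized encoder: the generator polynomial is read from Fig. 27.4 as D^40 + D^26 + D^23 + D^17 + D^3 + 1, plain systematic cyclic encoding is assumed, and details that the recommendation may specify (bit ordering, any parity inversion) are not modeled. The function names are hypothetical.

```python
# Generator polynomial read from Fig. 27.4, stored bitwise in an integer.
G = (1 << 40) | (1 << 26) | (1 << 23) | (1 << 17) | (1 << 3) | 1


def gf2_mod(value: int, poly: int) -> int:
    """Remainder of polynomial division over GF(2); integers hold the coefficients."""
    poly_len = poly.bit_length()
    while value.bit_length() >= poly_len:
        value ^= poly << (value.bit_length() - poly_len)
    return value


def fire_encode(info: int) -> int:
    """Systematic (224, 184) encoding: append 40 parity bits so that G divides the codeword."""
    shifted = info << 40
    return shifted | gf2_mod(shifted, G)


def coded_block_bits(info_bits=184, parity_bits=40, tail_bits=4, inverse_rate=2):
    """Bits leaving the rate-1/2 convolutional coder for one control message."""
    return (info_bits + parity_bits + tail_bits) * inverse_rate


if __name__ == "__main__":
    message = (1 << 183) | 0x5A5A5A          # arbitrary 184-b test pattern
    codeword = fire_encode(message)
    print(gf2_mod(codeword, G) == 0, coded_block_bits())
```

The count (184 + 40 + 4) · 2 = 456 matches the 456-b block that is mapped onto four 114-b bursts.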

FIGURE 27.4: FEC in SACCH, FACCH, BCCH, SDCCH, PCH and AGCH, © ETT [4]. Stages shown: 184 information bits; Fire code (224, 184) with generator G5(D) = D^40 + D^26 + D^23 + D^17 + D^3 + 1, adding 40 parity bits; 4 tailing bits; convolutional code CC(2, 1, 5); 456 output bits.

Returning to Fig. 27.2 we will now show how the SACCH is accommodated by the TDMA frame structure. The TCH/FS TDMA frames of the eight users are multiplexed into 26-frame multiframes carrying 24 TCH/FS frames each: the 13th frame carries a SACCH message rather than TCH/FS information, whereas the 26th frame is an idle or dummy frame, as seen at the left-hand side of Fig. 27.2 at the multiframe level of the traffic channel hierarchy. The general control channel frame structure shown at the right of Fig. 27.2 is discussed later. This way 24 TCH/FS frames are sent in a 26-frame multiframe during 26 · 4.615 = 120 ms. This reduces the traffic throughput to (24/26) · 24.7 = 22.8 kb/s required by TCH/FS, allocates (1/26) · 24.7 kb/s = 950 b/s to the SACCH and wastes 950 b/s in the idle frame. Observe that the SACCH frame has eight timeslots to transmit the eight 950-b/s SACCHs of the eight users on the same carrier. The 950-b/s idle capacity will be used in the case of half-rate channels, where 16 users will be multiplexed onto alternate frames of the TDMA structure to increase system capacity. Then 16 encoded half-rate speech TCHs of 11.4 kb/s each will be transmitted in a 120-ms multiframe, where 16 SACCHs are also available.

The FACCH messages are transmitted via the physical channels provided by bits stolen from their own host traffic channels. The construction of the FACCH bursts from 184 control bits is identical to that of the SACCH, as also shown in Fig. 27.4, but its 456-b frame is mapped onto eight consecutive 114-b TDMA traffic bursts, exactly as specified for TCH/FS. This is carried out by stealing the even bits of the first four and the odd bits of the last four bursts, which is signalled by setting hu = 1, hl = 0 and hu = 0, hl = 1 in the first and last bursts, respectively. The unprotected FACCH information rate is 184 b/20 ms = 9.2 kb/s, which is transmitted after concatenated error protection at a rate of 22.8 kb/s. The repetition delay is 20 ms, and the interleaving delay is 8 · 4.615 = 37 ms, resulting in a total delay of 57 ms.

In Fig. 27.2 at the next hierarchical level, 51 TCH/FS multiframes are multiplexed into one superframe lasting 51 · 120 ms = 6.12 s, which contains 26 · 51 = 1326 TDMA frames. With only 1326 TDMA frames, however, the frame number would be limited to 0 ≤ FN ≤ 1325, and an encryption rule relying on such a limited range of FN values would not be sufficiently secure. Therefore 2048 superframes were amalgamated to form a hyperframe of 1326 · 2048 = 2,715,648 TDMA frames lasting 2048 · 6.12 s ≈ 3 h 28 min, allowing a sufficiently high FN value to be used in the encryption algorithm. The uplink and downlink traffic-frame structures are identical with a shift of three timeslots between them, which relieves the MS from having to transmit and receive simultaneously, preventing high-level transmitted power leakage back to the sensitive receiver. The received power of adjacent BSs can be monitored during unallocated timeslots.

In contrast to duplex traffic and associated control channels, the simplex BCCH and CCCH logical channels of all MSs roaming in a specific cell share the physical channel provided by timeslot zero of the so-called BCCH carriers available in the cell. Furthermore, as demonstrated by the right-hand side section of Fig. 27.2, 51 BCCH and CCCH TDMA frames are mapped onto a 51 · 4.615 = 235-ms duration multiframe, rather than onto a 26-frame, 120-ms duration multiframe. In order to compensate for the extended multiframe length of 235 ms, 26 multiframes constitute a 1326-frame superframe of 6.12-s duration. Note in Fig. 27.5 that the allocation of the uplink and downlink frames is different, since these control channels exist only in one direction.
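The capacity split of the 26-frame traffic multiframe, the FACCH rates, and the superframe and hyperframe sizes quoted above follow from a few lines of arithmetic. The sketch below is an illustrative check with hypothetical names; it takes the frame duration as exactly 120/26 ms, which the text rounds to 4.615 ms.

```python
FRAME_MS = 120 / 26                 # one TDMA frame, since 26 frames last 120 ms
SLOT_RATE_KBPS = 114 / FRAME_MS     # 114 payload bits per frame per timeslot


def traffic_multiframe_split():
    """Split the per-timeslot rate across 24 TCH/FS frames, 1 SACCH frame and 1 idle frame."""
    per_frame = SLOT_RATE_KBPS / 26
    return {"tch_fs": 24 * per_frame, "sacch": per_frame, "idle": per_frame}


def facch_rates(block_ms=20):
    """Unprotected and FEC-coded FACCH rates in kb/s for one 184-b message per block."""
    return 184 / block_ms, 456 / block_ms


def frame_hierarchy():
    """Frame counts and durations of the superframe and the hyperframe."""
    superframe_frames = 26 * 51
    hyperframe_frames = superframe_frames * 2048
    hyper_s = hyperframe_frames * FRAME_MS / 1000
    hours, rem = divmod(hyper_s, 3600)
    return {
        "superframe_frames": superframe_frames,
        "superframe_s": superframe_frames * FRAME_MS / 1000,
        "hyperframe_frames": hyperframe_frames,
        "hyperframe_h_min": (int(hours), int(rem // 60)),
    }


if __name__ == "__main__":
    print(traffic_multiframe_split())
    print(facch_rates())
    print(frame_hierarchy())
```

The same 26 · 51 frame count results whether a superframe is built from 51 traffic multiframes of 26 frames or from 26 control multiframes of 51 frames, which is why both hierarchies share one superframe length.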

FIGURE 27.5: The control multiframe, © ETT [4]. (a) Uplink direction: 51 time frames (51 · 4.615 = 235 ms), all allocated to R. (b) Downlink direction: 51 time frames (235 ms), arranged as one block F S B B B B C C C C, four blocks F S C C C C C C C C, and a final I. Legend: R, random access channel; F, frequency correction channel; S, synchronisation channel; B, broadcast control channel; C, access grant/paging channel; I, idle frame.

Specifically, the random access channel (RACH) is only used by the MSs in the uplink direction if they request, for example, a bidirectional SDCCH to be mapped onto an RF channel to register with the network and set up a call. The uplink RACH has a low capacity, carrying messages of 8 b per 235-ms multiframe, which is equivalent to an unprotected control information rate of 34 b/s. These messages are concatenated FEC coded to a rate of 36 b/235 ms = 153 b/s. They are not transmitted by the NB derived for TCH/FS, SACCH, or FACCH logical channels, but by the AB, depicted in Fig. 27.6 in comparison to a NB and other types of bursts to be described later. The FEC coded, encrypted 36-b AB messages of Fig. 27.6 contain, among other parameters, the encoded 6-b BS identifier code (BSIC) constituted by the 3-b PLMN color code and 3-b BS color code for unique BS identification. These 36 b are positioned after the 41-b synchronization sequence, which has a long wordlength in order to ensure reliable access burst recognition and a low probability of being emulated by interfering stray data. These messages have no interleaving delay, while they are transmitted with a repetition delay of one control multiframe length, i.e., 235 ms.

FIGURE 27.6: GSM burst structures, © ETT [4]. 1 TDMA frame = 8 timeslots (0–7); 1 timeslot = 156.25 bit durations.

Normal burst: tail bits 3, encrypted bits 58, training sequence 26, encrypted bits 58, tail bits 3, guard period 8.25.
Frequency correction burst: tail bits 3, fixed bits 142, tail bits 3, guard period 8.25.
Synchronisation burst: tail bits 3, encrypted sync bits 39, extended training sequence 64, encrypted sync bits 39, tail bits 3, guard period 8.25.
Access burst: tail bits 8, synchro sequence 41, encrypted bits 36, tail bits 3, guard period 68.25.
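A quick consistency check on the burst structures of Fig. 27.6 is that the fields of every burst type add up to the 156.25-b timeslot. The following sketch encodes the field lengths listed in the figure; the dictionary layout and function names are hypothetical, and the 3.69-µs bit interval is the rounded value used in the text.

```python
BIT_US = 3.69  # rounded bit interval from the text

# Field lengths in bit durations, in transmission order, as listed in Fig. 27.6.
BURSTS = {
    "normal": [("tail", 3), ("encrypted", 58), ("training", 26),
               ("encrypted", 58), ("tail", 3), ("guard", 8.25)],
    "frequency_correction": [("tail", 3), ("fixed", 142), ("tail", 3), ("guard", 8.25)],
    "synchronisation": [("tail", 3), ("encrypted_sync", 39), ("extended_training", 64),
                        ("encrypted_sync", 39), ("tail", 3), ("guard", 8.25)],
    "access": [("tail", 8), ("synchro", 41), ("encrypted", 36),
               ("tail", 3), ("guard", 68.25)],
}


def burst_length(name):
    """Total burst length in bit durations, guard period included."""
    return sum(bits for _, bits in BURSTS[name])


def guard_period_us(name):
    """Guard period of the named burst in microseconds."""
    return dict(BURSTS[name])["guard"] * BIT_US


if __name__ == "__main__":
    for name in BURSTS:
        print(name, burst_length(name), round(guard_period_us(name), 1))
```

All four burst types occupy the same timeslot length; they differ only in how the slot is divided between payload, synchronization, and guard time.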

Adaptive time frame alignment is a technique designed to equalize propagation delay differences between MSs at different distances. The GSM system is designed to allow for cell sizes up to 35 km radius. The time a radio signal takes to travel the 70 km from the base station to the mobile station and back again is 233.3 µs. As signals from all the mobiles in the cell must reach the base station without overlapping each other, a long guard period of 68.25 b (252 µs) is provided in the access burst, which exceeds the maximum possible propagation delay of 233.3 µs. This long guard period in the access burst is needed when the mobile station attempts its first access to the base station or after a handover has occurred. When the base station detects a 41-b random access synchronization sequence with a long guard period, it measures the received signal delay relative to the expected signal from a mobile station of zero range. This delay, called the timing advance, is signalled using a 6-b number to the mobile station, which advances its timebase over the range of 0–63 b, i.e., in units of 3.69 µs. By this process the TDMA bursts arrive at the BS in their correct timeslots and do not overlap with adjacent ones. This process allows the guard period in all other bursts to be reduced to only 8.25 · 3.69 µs ≈ 30.46 µs (8.25 b). During normal operation, the BS continuously monitors the signal delay from the MS and, if necessary, it will instruct the MS to update its timing advance parameter. In very large cells there is an option to actively utilize only every second timeslot in order to cope with higher propagation delays, which is spectrally inefficient, but in these large, low-traffic rural cells it is admissible.

As demonstrated by Fig. 27.2, the downlink multiframe transmitted by the BS is shared amongst a number of BCCH and CCCH logical channels. In particular, the last frame is an idle frame (I), whereas the remaining 50 frames are divided into five blocks of ten frames, where each block starts with a frequency correction channel (FCCH) frame followed by a synchronization channel (SCH) frame. In the first block of ten frames the FCCH and SCH frames are followed by four BCCH frames and by either four AGCH or four PCH frames. In the remaining four blocks of ten frames, the last eight frames are devoted to either PCHs or AGCHs, which are mutually exclusive for a specific MS being either paged or granted a control channel.
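The downlink control multiframe layout just described can be written out frame by frame. The sketch below builds the 51-frame sequence from the block structure in the text, using the single-letter legend of Fig. 27.5 (F, S, B, C, I); the function name is hypothetical.

```python
def downlink_control_multiframe():
    """Return the 51 frame labels of the downlink control multiframe."""
    frames = []
    for block in range(5):
        frames += ["F", "S"]                  # FCCH then SCH open every block
        if block == 0:
            frames += ["B"] * 4 + ["C"] * 4   # four BCCH, then four AGCH/PCH frames
        else:
            frames += ["C"] * 8               # eight AGCH/PCH frames
    frames.append("I")                        # the final idle frame
    return frames


if __name__ == "__main__":
    print(" ".join(downlink_control_multiframe()))
```

In this layout the SCH frames fall at positions 1, 11, 21, 31, and 41 of the multiframe when counting from zero.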
The FCCH, SCH, and RACH require special transmission bursts, tailored to their missions, as depicted in Fig. 27.6. The FCCH uses frequency correction bursts (FCB) hosting a specific 142-b pattern. In partial response GMSK it is possible to design a modulating data sequence that results in a near-sinusoidal modulated signal, imitating an unmodulated carrier exhibiting a fixed frequency offset from the RF carrier utilized. The synchronization channel transmits SBs hosting a 16 · 4 = 64-b extended training sequence exhibiting a high correlation peak in order to allow frame alignment with a quarter-bit accuracy. Furthermore, the SB contains 2 · 39 = 78 encrypted FEC-coded synchronization bits, hosting the BS and PLMN color codes, each representing one of eight legitimate identifiers. Lastly, the ABs contain an extended 41-b synchronization sequence, and they are invoked to facilitate initial access to the system. Their long guard space of 68.25-b duration prevents frame overlap before the MS's distance, i.e., the propagation delay, becomes known to the BS and can be compensated for by adjusting the MS's timing advance.
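The relation between MS distance, round-trip delay, and the 6-b timing advance can be sketched numerically. The code below is a minimal illustration using the figures from the text (300,000 km/s, 3.69-µs steps, range 0–63); rounding to the nearest step and clipping at 63 are assumptions, since the text does not state how the BS quantizes the measured delay, and the function names are hypothetical.

```python
SPEED_OF_LIGHT_KM_PER_S = 300_000.0
BIT_US = 3.69     # one timing advance step, i.e., one bit interval
MAX_TA = 63       # largest value of the 6-b timing advance


def round_trip_delay_us(distance_km):
    """BS-MS-BS propagation delay in microseconds."""
    return 2.0 * distance_km / SPEED_OF_LIGHT_KM_PER_S * 1e6


def timing_advance(distance_km):
    """Timing advance in bit intervals (assumed: nearest step, clipped to 0..63)."""
    steps = round(round_trip_delay_us(distance_km) / BIT_US)
    return max(0, min(MAX_TA, steps))


def distance_from_ta_km(ta):
    """One-way distance in km corresponding to a timing advance value."""
    return ta * BIT_US * 1e-6 * SPEED_OF_LIGHT_KM_PER_S / 2.0


if __name__ == "__main__":
    for d_km in (0, 1, 10, 35):
        print(d_km, round(round_trip_delay_us(d_km), 1), timing_advance(d_km))
```

Each step corresponds to roughly 550 m of one-way distance, so the largest value of 63 covers a radius of just under 35 km.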

27.6

Synchronization Issues

Although some synchronization issues are standardized in recommendations R.05.02 and R.05.03, the GSM recommendations do not specify the exact BS-MS synchronization algorithms to be used; these are left to the equipment manufacturers. A unique set of timebase counters, however, is defined in order to ensure perfect BS-MS synchronism. The BS sends FCB and SB on specific timeslots of the BCCH carrier to the MS to ensure that the MS's frequency standard is perfectly aligned with that of the BS, as well as to inform the MS about the required initial state of its internal counters. The MS transmits its uniquely numbered traffic and control bursts staggered by three timeslots with respect to those of the BS to prevent simultaneous MS transmission and reception, and also takes into account the required timing advance (TA) to cater for different BS-MS-BS round-trip delays.

The timebase counters used to uniquely describe the internal timing states of BSs and MSs are the quarter-bit number (QN = 0–624) counting the quarter-bit intervals in bursts, the bit number (BN = 0–156), the timeslot number (TN = 0–7), and the TDMA frame number (FN = 0–(26 · 51 · 2048 − 1)), given in the order of increasing interval duration. The MS sets up its timebase counters after receiving a SB by determining QN from the 64-b extended training sequence in the center of the SB, setting TN = 0 and decoding the 78 encrypted, protected bits carrying the 25 SCH control bits.

FIGURE 27.7: Synchronization channel (SCH) message format, © ETT [4]. BSIC (6 bits): PLMN colour (3 bits) and BS colour (3 bits). RFN (19 bits): T1, superframe index (11 bits); T2, multiframe index (5 bits); T3′, block frame index (3 bits).

The SCH carries frame synchronization information as well as BS identification information to the MS, as seen in Fig. 27.7, and it is provided solely to support the operation of the radio subsystem. The first 6 b of the 25-b segment consist of three PLMN color code bits and three BS color code bits supplying a unique BS identifier code (BSIC) to inform the MS which BS it is communicating with. The second, 19-b segment is the so-called reduced TDMA frame number RFN derived from the full TDMA frame number FN, constrained to the range of [0–(26 · 51 · 2048 − 1)] = (0–2,715,647), in terms of three subsegments T1, T2, and T3′. These subsegments are computed as follows: T1 (11 b) = FN div (26 · 51), T2 (5 b) = FN mod 26, and T3′ (3 b) = (T3 − 1) div 10, where T3 = FN mod 51, whereas div and mod represent the integer division and modulo operations, respectively. Explicitly, in Fig. 27.7 T1 determines the superframe index in a hyperframe, T2 the multiframe index in a superframe, T3 the frame index in a multiframe, whereas T3′ is the so-called signalling block index [1–5] of a frame in a specific 51-frame control multiframe, and their roles are best understood by referring to Fig. 27.2. Once the MS has received the SB, it readily computes the FN required in various control algorithms, such as encryption, handover, etc., as FN = 51 · [(T3 − T2) mod 26] + T3 + 51 · 26 · T1, where T3 = 10 · T3′ + 1.
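The reduced frame number fields and the recovery of FN from them can be checked with a short round-trip computation. The sketch below assumes T3 = FN mod 51 and restricts the check to frames carrying an SCH burst (T3 equal to 1, 11, 21, 31, or 41), for which T3 = 10 · T3′ + 1 holds exactly; the function names are hypothetical.

```python
SUPERFRAME = 26 * 51              # 1326 TDMA frames
HYPERFRAME = SUPERFRAME * 2048    # 2,715,648 TDMA frames


def sch_fields(fn):
    """Split a frame number into the SCH fields (T1, T2, T3')."""
    t1 = fn // SUPERFRAME         # superframe index, 11 b
    t2 = fn % 26                  # 5 b
    t3 = fn % 51
    t3_prime = (t3 - 1) // 10     # 3 b, meaningful on SCH frames only
    return t1, t2, t3_prime


def frame_number(t1, t2, t3_prime):
    """Recover FN from the SCH fields of a received synchronization burst."""
    t3 = 10 * t3_prime + 1
    # Python's % returns a non-negative result, as the mod operation here requires.
    return 51 * ((t3 - t2) % 26) + t3 + SUPERFRAME * t1


if __name__ == "__main__":
    fn = 1_000_001 - (1_000_001 % 51) + 11   # an arbitrary frame with T3 = 11
    print(fn, sch_fields(fn), frame_number(*sch_fields(fn)))
```

The reconstruction works because 26 and 51 have no common factor, so the pair (FN mod 26, FN mod 51) identifies a frame uniquely within one 1326-frame superframe.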

27.7

Gaussian Minimum Shift Keying Modulation

The GSM system uses constant envelope partial response GMSK modulation [6] specified in recommendation R.05.04. Constant envelope, continuous-phase modulation schemes are robust against signal fading as well as interference and have good spectral efficiency. The slower and smoother the phase changes, the better the spectral efficiency, since the signal is allowed to change less abruptly, requiring lower frequency components. The effect of an input bit, however, is spread over several bit periods, leading to a so-called partial response system, which requires a channel equalizer in order to remove this controlled, intentional intersymbol interference (ISI) even in the absence of uncontrolled channel dispersion. The widely employed partial response GMSK scheme is derived from the full response minimum shift keying (MSK) scheme. In MSK the phase changes between adjacent bit periods are piecewise linear, which results in a discontinuous phase derivative, i.e., instantaneous frequency, at the signalling instants, and hence widens the spectrum. This problem is circumvented by smoothing these phase changes with a filter having a Gaussian impulse response [6], which is known to have the lowest possible bandwidth, using the schematic of Fig. 27.8, where the GMSK signal is generated by modulating and adding two quadrature carriers. The key parameter of GMSK in controlling both bandwidth and interference resistance is the 3-dB down filter-bandwidth × bit interval product (B · T), referred to as normalized bandwidth. It was found that as the B · T product is increased from 0.2 to 0.5, the interference resistance is improved by approximately 2 dB at the cost of increased bandwidth occupancy, and the best compromise was achieved for B · T = 0.3. This corresponds to spreading the effect of 1 b over approximately 3-b intervals. The spectral efficiency gain due to higher interference tolerance and, hence, more dense frequency reuse was found to be more significant than the spectral loss caused by wider GMSK spectral lobes.
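The statement that B · T = 0.3 spreads one bit over about three bit intervals can be illustrated numerically. The sketch below evaluates a commonly used closed form of the Gaussian-filtered rectangular frequency pulse, g(t) = [Q(a(t − T/2)) − Q(a(t + T/2))]/(2T) with a = 2πB/√(ln 2), and integrates it with a simple midpoint rule. This closed form is a textbook expression assumed here rather than taken from the chapter, and the function names are hypothetical.

```python
import math


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def gmsk_freq_pulse(t, bt=0.3, bit_t=1.0):
    """Gaussian-filtered rectangular frequency pulse g(t); its total area is 1/2."""
    a = 2.0 * math.pi * (bt / bit_t) / math.sqrt(math.log(2.0))
    return (q_func(a * (t - bit_t / 2)) - q_func(a * (t + bit_t / 2))) / (2.0 * bit_t)


def pulse_area(lo, hi, bt=0.3, steps=4000):
    """Midpoint-rule integral of g(t) over [lo, hi], in units of the bit interval."""
    h = (hi - lo) / steps
    return h * sum(gmsk_freq_pulse(lo + (i + 0.5) * h, bt) for i in range(steps))


if __name__ == "__main__":
    total = pulse_area(-5.0, 5.0)
    inside = pulse_area(-1.5, 1.5)
    print(f"total area {total:.4f}, share within 3 bit intervals {inside / total:.4f}")
```

With a modulation index of 1/2, a total pulse area of 1/2 corresponds to a phase change of π/2 per bit, as in MSK. For B · T = 0.3 nearly all of that area falls within a window of three bit intervals, in line with the approximation quoted above.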

FIGURE 27.8: GMSK modulator schematic diagram, © ETT [4]. Blocks shown: frequency pulse shaping (Gaussian filter); integration (∫ dt) yielding the phase pulse shaping output φ(t, αn); cos[φ(t, αn)] and sin[φ(t, αn)] branches multiplied by cos ωt and −sin ωt, respectively, and summed to form cos[ωt + φ(t, αn)].

The channel separation at the TDMA burst rate of 271 kb/s is 200 kHz, and the modulated spectrum must be 40 dB down at both adjacent carrier frequencies. When TDMA bursts are transmitted in an on-off keyed mode, further spectral spillage arises, which is mitigated by a smooth power ramp up and down envelope at the leading and trailing edges of the transmission bursts, attenuating the signal by 70 dB during a 28- and 18-µs interval, respectively.

27.8

Wideband Channel Models

The set of 6-tap GSM impulse responses [2] specified in recommendation R.05.05 is depicted in Fig. 27.9, where the individual propagation paths are independent Rayleigh fading paths, weighted by the appropriate coefficients hi corresponding to their relative powers portrayed in the figure. In simple terms the wideband channel's impulse response is measured by transmitting an impulse and detecting the received echoes at the channel's output in every D-spaced so-called delay bin. In some bins no delayed and attenuated multipath component is received, whereas in others significant energy is detected, depending on the typical reflecting objects and their distance from the receiver. The path delay can be easily related to the distance of the reflecting objects, since radio waves travel at the speed of light. For example, at a speed of 300,000 km/s, a reflecting object situated at a distance of 0.15 km yields a multipath component at a round-trip delay of 1 µs.

FIGURE 27.9: Typical GSM channel impulse responses, © ETT [4]. Four panels, typical urban (TU), hilly terrain (HT), rural area (RA), and equaliser test (EQ), each plotting relative power (0–1.2) against delay (0–20 µs).

The typical urban (TU) impulse response spreads over a delay interval of 5 µs, which is almost two 3.69-µs bit intervals in duration and, therefore, results in serious ISI. In simple terms, it can be treated as a two-path model, where the reflected path has a length of 0.75 km, corresponding to a reflector located at a distance of about 375 m. The hilly terrain (HT) model has a sharply decaying short-delay section due to local reflections and a long-delay path around 15 µs due to distant reflections. Therefore, in practical terms it can be considered a two- or three-path model having reflections from a distance of about 2 km. The rural area (RA) response seems the least hostile amongst all standardized responses, decaying rapidly inside a 1-b interval and, therefore, is expected to be easily combated by the channel equalizer. Although the type of the equalizer is not standardized, partial response systems typically use VEs. Since the RA channel effectively behaves as a single-path nondispersive channel, it would not require an equalizer. The fourth standardized impulse response is artificially contrived in order to test the equalizer's performance and is constituted by six equidistant unit-amplitude impulses representing six equal-powered independent Rayleigh-fading paths with a delay spread over 16 µs.

With these impulse responses in mind, the required channel is simulated by summing the appropriately delayed and weighted received signal components. In all but one case the individual components are assumed to have a Rayleigh amplitude distribution, whereas in the RA model the main tap at zero delay is supposed to have a Rician distribution owing to the presence of a dominant line-of-sight path.
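The conversion between excess path delay and reflector distance used in this section is a one-line calculation. The sketch below applies it with the 300,000 km/s figure from the text; it treats the delay as a pure round trip to the reflector, which is a simplification, and the function names are hypothetical.

```python
C_KM_PER_US = 0.3   # 300,000 km/s expressed in km per microsecond
BIT_US = 3.69       # bit interval used in the text


def excess_path_km(delay_us):
    """Extra path length travelled by a component arriving delay_us late."""
    return delay_us * C_KM_PER_US


def reflector_distance_km(delay_us):
    """Reflector distance if the delay is a pure round trip to the reflector."""
    return excess_path_km(delay_us) / 2.0


def delay_for_path_us(path_km):
    """Delay in microseconds corresponding to a given extra path length."""
    return path_km / C_KM_PER_US


def delay_in_bit_intervals(delay_us):
    """Delay expressed in bit intervals, indicating how many symbols the ISI spans."""
    return delay_us / BIT_US


if __name__ == "__main__":
    for delay in (1.0, 5.0, 15.0):
        print(delay, reflector_distance_km(delay), round(delay_in_bit_intervals(delay), 2))
```

On this basis a 1-µs delay corresponds to a reflector at 0.15 km, a 0.75-km reflected path to a delay of 2.5 µs, and the 15-µs HT component to a reflector a little over 2 km away.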

27.9

Adaptive Link Control

The adaptive link control algorithm portrayed in Fig. 27.10 and specified in recommendation R.05.08 allows for the MS to favor that specific traffic cell which provides the highest probability of reliable communications associated with the lowest possible path loss. It also decreases interference with other cochannel users and, through dense frequency reuse, improves spectral efficiency, whilst maintaining an adequate communications quality, and facilitates a reduction in power consumption, which is particularly important in hand-held MSs. The handover process maintains a call in progress as the MS moves between cells, or when there is an unacceptable transmission quality degradation caused by interference, in which case an intracell handover to another carrier in the same cell is performed. A radio-link failure occurs when a call with an unacceptable voice or data quality cannot be improved either by RF power control or by handover. The reasons for the link failure may be loss of radio coverage or very high-interference levels. The link control procedures rely on measurements of the received RF signal strength (RXLEV), the received signal quality (RXQUAL), and the absolute distance between base and mobile stations (DISTANCE). RXLEV is evaluated by measuring the received level of the BCCH carrier which is continuously transmitted by the BS on all time slots of the B frames in Fig. 27.5 and without variations of the RF level. A MS measures the received signal level from the serving cell and from the BSs in all adjacent cells by tuning and listening to their BCCH carriers. The root mean squared level of the received signal is measured over a dynamic range from −103 to −41 dBm for intervals of one SACCH multiframe (480 ms). The received signal level is averaged over at least 32 SACCH frames (≈15 s) and mapped to give RXLEV values between 0 and 63 to cover the range from −103 to −41 dBm in steps of 1 dB. 
The RXLEV parameters are then coded into 6-b words for transmission to the serving BS via the SACCH. RXQUAL is estimated by measuring the bit error ratio (BER) before channel decoding, using the Viterbi channel equalizer's metrics [6] and/or those of the Viterbi convolutional decoder [8]. Eight values of RXQUAL span the logarithmically scaled BER range of 0.2–12.8% before channel decoding. The absolute DISTANCE between base and mobile stations is measured using the timing advance parameter. The timing advance is coded as a 6-b number corresponding to a propagation delay from 0 to 63 · 3.69 µs = 232.6 µs, characteristic of a cell radius of 35 km.

While roaming, the MS needs to identify which potential target BS it is measuring, and the BCCH carrier frequency may not be sufficient for this purpose, since in small cluster sizes the same BCCH frequency may be used in more than one surrounding cell. To avoid ambiguity a 6-b BSIC is transmitted on each BCCH carrier in the SB of Fig. 27.6. Two other parameters transmitted in the BCCH data provide additional information about the BS. The binary flag called PLMN PERMITTED indicates whether the measured BCCH carrier belongs to a PLMN that the MS is permitted to access. The second Boolean flag, CELL BAR ACCESS, indicates whether the cell is barred for access by the MS, although it belongs to a permitted PLMN.

An MS in idle mode, i.e., after it has just been switched on or after it has lost contact with the network, searches all 125 RF channels and takes readings of RXLEV on each of them. Then it tunes to the carrier with the highest RXLEV and searches for FCB in order to determine whether or not the carrier is a BCCH carrier. If it is not, then the MS tunes to the next highest carrier, and so on, until it finds a BCCH carrier, synchronizes to it and decodes the parameters BSIC, PLMN PERMITTED and CELL BAR ACCESS in order to decide whether to continue the search. The MS may store the BCCH carrier frequencies used in the network accessed, in which case the search time would be reduced. Again, the process described is summarized in the flowchart of Fig. 27.10.

FIGURE 27.10: Initial cell selection by the MS, © ETT [4]. Flowchart boxes include: switch on; home PLMN; MS selects new PLMN; BCCHs for PLMN known; measure (and store) RXLEV for all GSM carriers; hop to strongest (BCCH) carrier and await FCB; recognise FCB; synchronise and await BCCH data; decode BSIC (PLMN and BS colour bits); BCCH from selected PLMN; barred cell; pathloss acceptable; hop to next strongest carrier or BCCH; all BCCHs tested; all 124 carriers tested; any BCCH decoded; save BCCH list for this PLMN; time out; idle mode.

The adaptive power control is based on RXLEV measurements. In every SACCH multiframe the BS compares the RXLEV readings reported by the MS or obtained by the base station with a set of thresholds. The exact strategy for RF power control is determined by the network operator with the aim of providing an adequate quality of service for speech and data transmissions while keeping interference low. Clearly, adequate quality must be achieved at the lowest possible transmitted power to keep cochannel interference low, which implies contradictory requirements in terms of transmitted power.
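The quantization of the link measurements into RXLEV and RXQUAL codes can be sketched as two small mapping functions. This is an illustration under stated assumptions: the text gives only the ranges (0–63 covering −103 to −41 dBm in 1-dB steps, and eight RXQUAL values spanning 0.2–12.8% BER on a logarithmic scale), so the exact bin edges, the use of the end codes for out-of-range levels, and the doubling BER thresholds are assumed here. The function names are hypothetical.

```python
import bisect
import math

RXLEV_FLOOR_DBM = -103.0
# Assumed thresholds: each RXQUAL step doubles the BER, from 0.2% to 12.8%.
RXQUAL_EDGES_PERCENT = [0.2, 0.4, 0.8, 1.6, 3.2, 6.4, 12.8]


def rxlev(level_dbm):
    """Quantize a received level in dBm to a 6-b code in 1-dB steps.

    Assumed convention: code 0 means below -103 dBm, code 63 means -41 dBm or above.
    """
    code = math.floor(level_dbm - RXLEV_FLOOR_DBM) + 1
    return max(0, min(63, code))


def rxqual(ber_percent):
    """Map a pre-decoding BER in percent to one of eight quality codes (0 is best)."""
    return bisect.bisect_right(RXQUAL_EDGES_PERCENT, ber_percent)


if __name__ == "__main__":
    for level in (-110.0, -103.0, -75.5, -41.0):
        print(level, rxlev(level))
    for ber in (0.05, 0.3, 1.0, 20.0):
        print(ber, rxqual(ber))
```

A power control or handover rule would then compare averaged codes of this kind against operator-defined thresholds, which the recommendations leave to the network operator.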
The criteria for reporting radio link failure are based on the measurements of RXLEV and RXQUAL performed by both the mobile and base stations, and the procedures for handling link failures result in the re-establishment or the release of the call, depending on the network operator’s strategy. The handover process involves the most complex set of procedures in the radio-link control. Handover decisions are based on results of measurements performed both by the base and mobile stations. The base station measures RXLEV, RXQUAL, DISTANCE, and also the interference level in unallocated time slots, whereas the MS measures and reports to the BS the values of RXLEV and RXQUAL for the serving cell and RXLEV for the adjacent cells. When the MS moves away from the BS, the RXLEV and RXQUAL parameters for the serving station become lower, whereas RXLEV for one of the adjacent cells increases.

27.10

Discontinuous Transmission

Discontinuous transmission (DTX) issues are standardized in recommendation R.06.31, whereas the associated problems of voice activity detection (VAD) are specified by R.06.32. Assuming an average speech activity of 50% and a high number of interferers combined with frequency hopping to randomize the interference load, significant spectral efficiency gains can be achieved when deploying discontinuous transmissions due to decreasing interference, while reducing power dissipation as well. Because of the reduction in power consumption, full DTX operation is mandatory for MSs, but in BSs, only receiver DTX functions are compulsory.

The fundamental problem in voice activity detection is how to differentiate between speech and noise, while keeping false noise triggering and speech spurt clipping as low as possible. In vehicle-mounted MSs the severity of the speech/noise recognition problem is aggravated by the excessive vehicle background noise. This problem is resolved by deploying a combination of threshold comparisons and spectral domain techniques [1, 3]. Another important associated problem is the introduction of noiseless inactive segments, which is mitigated by comfort noise insertion (CNI) in these segments at the receiver.

27.11

Summary

Following the standardization and launch of the GSM system its salient features were summarized in this brief review. Time division multiple access (TDMA) with eight users per carrier is used at a multiuser rate of 271 kb/s, demanding a channel equalizer to combat dispersion in large cell environments. The error protected chip rate of the full-rate traffic channels is 22.8 kb/s, whereas in half-rate channels it is 11.4 kb/s. Apart from the full- and half-rate speech traffic channels, there are 5 different rate data traffic channels and 14 various control and signalling channels to support the system's operation. A moderately complex, 13 kb/s regular pulse excited speech codec with long term predictor (LTP) is used, combined with an embedded three-class error correction codec and multilayer interleaving to provide sensitivity-matched unequal error protection for the speech bits. An overall speech delay of 57.5 ms is maintained. Slow frequency hopping at 217 hops/s yields substantial performance gains for slowly moving pedestrians.

TABLE 27.3 Summary of GSM Features

System feature                   Specification
Up-link bandwidth, MHz           890–915 = 25
Down-link bandwidth, MHz         935–960 = 25
Total GSM bandwidth, MHz         50
Carrier spacing, kHz             200
No. of RF carriers               125
Multiple access                  TDMA
No. of users/carrier             8
Total No. of channels            1000
TDMA burst rate, kb/s            271
Modulation                       GMSK with BT = 0.3
Bandwidth efficiency, b/s/Hz     1.35
Channel equalizer                yes
Speech coding rate, kb/s         13
FEC coded speech rate, kb/s      22.8
FEC coding                       Embedded block/convolutional
Frequency hopping, hop/s         217
DTX and VAD                      yes
Maximum cell radius, km          35

Constant envelope partial response GMSK with a channel spacing of 200 kHz is deployed to support 125 duplex channels in the 890–915-MHz up-link and 935–960-MHz down-link bands, respectively. At a transmission rate of 271 kb/s a spectral efficiency of 1.35 b/s/Hz is achieved. The controlled GMSK-induced and uncontrolled channel-induced intersymbol interferences are removed by the channel equalizer. The set of standardized wideband GSM channels was introduced in order to provide benchmarks for performance comparisons. Efficient power budgeting and minimum cochannel interference are ensured by the combination of adaptive power and handover control based on weighted averaging of up to eight up-link and down-link system parameters. Discontinuous transmissions assisted by reliable spectral-domain voice activity detection and comfort-noise insertion further reduce interference and power consumption. Because of ciphering, no unprotected information is sent via the radio link. As a result, spectrally efficient, high-quality mobile communications with a variety of services and international roaming is possible in cells of up to 35 km radius for signal-to-noise and interference ratios in excess of 10–12 dB. The key system features are summarized in Table 27.3.
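Several entries of Table 27.3 are simple functions of one another, which the following sketch makes explicit. It is an illustrative cross-check only, using the rounded values quoted in the chapter; the function name is hypothetical.

```python
def gsm_summary():
    """Derive the dependent entries of Table 27.3 from the basic system parameters."""
    band_mhz = 915 - 890                         # up-link band; the down-link band is equal
    spacing_khz = 200
    carriers = band_mhz * 1000 // spacing_khz    # carriers per direction
    users_per_carrier = 8
    burst_rate_kbps = 271
    frame_ms = 4.615
    return {
        "carriers": carriers,
        "channels": carriers * users_per_carrier,
        "efficiency_bps_per_hz": burst_rate_kbps / spacing_khz,
        "hops_per_s": 1000 / frame_ms,           # one hop per TDMA frame
    }


if __name__ == "__main__":
    for name, value in gsm_summary().items():
        print(name, round(value, 2))
```

The hopping rate of about 217 hops/s is simply the TDMA frame rate, since the carrier frequency changes once per 4.615-ms frame.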

Defining Terms

A3: Authentication algorithm
A5: Cyphering algorithm
A8: Confidential algorithm to compute the cyphering key
AB: Access burst
ACCH: Associated control channel
ADC: Administration center
AGCH: Access grant control channel
AUC: Authentication center
AWGN: Additive white Gaussian noise
BCCH: Broadcast control channel
BER: Bit error ratio
BFI: Bad frame indicator flag
BN: Bit number
BS: Base station
BS-PBGT: BS power budget: to be evaluated for power budget motivated handovers
BSIC: Base station identifier code
CC: Convolutional codec
CCCH: Common control channel
CELL_BAR_ACCESS: Boolean flag to indicate whether the MS is permitted to access the specific traffic cell
CNC: Comfort noise computation
CNI: Comfort noise insertion
CNU: Comfort noise update state in the DTX handler
DB: Dummy burst
DL: Down link
DSI: Digital speech interpolation to improve link efficiency
DTX: Discontinuous transmission for power consumption and interference reduction
EIR: Equipment identity register
EOS: End of speech flag in the DTX handler
FACCH: Fast associated control channel
FCB: Frequency correction burst
FCCH: Frequency correction channel
FEC: Forward error correction
FH: Frequency hopping
FN: TDMA frame number
GMSK: Gaussian minimum shift keying
GP: Guard space
HGO: Hangover in the VAD
HLR: Home location register
HO: Handover
HOCT: Hangover counter in the VAD
HO_MARGIN: Handover margin to facilitate hysteresis
HSN: Hopping sequence number: frequency hopping algorithm’s input variable
IMSI: International mobile subscriber identity
ISDN: Integrated services digital network
LAI: Location area identifier
LAR: Logarithmic area ratio
LTP: Long term predictor
MA: Mobile allocation: set of legitimate RF channels, input variable in the frequency hopping algorithm
MAI: Mobile allocation index: output variable of the FH algorithm
MAIO: Mobile allocation index offset: initial RF channel offset, input variable of the FH algorithm
MS: Mobile station
MSC: Mobile switching center
MSRN: Mobile station roaming number
MS_TXPWR_MAX: Maximum permitted MS transmitted power on a specific traffic channel in a specific traffic cell
MS_TXPWR_MAX(n): Maximum permitted MS transmitted power on a specific traffic channel in the nth adjacent traffic cell
NB: Normal burst
NMC: Network management center
NUFR: Receiver noise update flag
NUFT: Noise update flag to ask for SID frame transmission
OMC: Operation and maintenance center
PARCOR: Partial correlation
PCH: Paging channel
PCM: Pulse code modulation
PIN: Personal identity number for MSs
PLMN: Public land mobile network
PLMN_PERMITTED: Boolean flag to indicate whether the MS is permitted to access the specific PLMN
PSTN: Public switched telephone network
QN: Quarter bit number
R: Random number in the authentication process
RA: Rural area channel impulse response
RACH: Random access channel
RF: Radio frequency
RFCH: Radio frequency channel
RFN: Reduced TDMA frame number: equivalent representation of the TDMA frame number that is used in the synchronization channel
RNTABLE: Random number table utilized in the frequency hopping algorithm
RPE: Regular pulse excited
RPE-LTP: Regular pulse excited codec with long term predictor
RS-232: Serial data transmission standard equivalent to the CCITT V.24 interface
RXLEV: Received signal level: parameter used in handovers
RXQUAL: Received signal quality: parameter used in handovers
S: Signed response in the authentication process
SACCH: Slow associated control channel
SB: Synchronization burst
SCH: Synchronization channel
SCPC: Single channel per carrier
SDCCH: Stand-alone dedicated control channel
SE: Speech extrapolation
SID: Silence identifier
SIM: Subscriber identity module in MSs
SPRX: Speech received flag
SPTX: Speech transmit flag in the DTX handler
STP: Short term predictor
TA: Timing advance
TB: Tailing bits
TCH: Traffic channel
TCH/F: Full-rate traffic channel
TCH/F2.4: Full-rate 2.4-kb/s data traffic channel
TCH/F4.8: Full-rate 4.8-kb/s data traffic channel
TCH/F9.6: Full-rate 9.6-kb/s data traffic channel
TCH/FS: Full-rate speech traffic channel
TCH/H: Half-rate traffic channel
TCH/H2.4: Half-rate 2.4-kb/s data traffic channel
TCH/H4.8: Half-rate 4.8-kb/s data traffic channel
TDMA: Time division multiple access
TMSI: Temporary mobile subscriber identifier
TN: Time slot number
TU: Typical urban channel impulse response
TXFL: Transmit flag in the DTX handler
UL: Up link
VAD: Voice activity detection
VE: Viterbi equalizer
VLR: Visiting location register

References

[1] European Telecommunications Standardization Institute. Group Speciale Mobile or Global System of Mobile Communication (GSM) Recommendation, ETSI Secretariat, Sophia Antipolis Cedex, France, 1988.
[2] Greenwood, D. and Hanzo, L., Characterisation of mobile radio channels. In Mobile Radio Communications, Steele, R., Ed., Chap. 2, 92–185. IEEE Press–Pentech Press, London, 1992.
[3] Hanzo, L. and Stefanov, J., The Pan-European digital cellular mobile radio system - known as GSM. In Mobile Radio Communications, Steele, R., Ed., Chap. 8, 677–773. IEEE Press–Pentech Press, London, 1992.
[4] Hanzo, L. and Steele, R., The Pan-European mobile radio system, Pts. 1 and 2, European Trans. on Telecomm., 5(2), 245–276, 1994.
[5] Salami, R.A., Hanzo, L., et al., Speech coding. In Mobile Radio Communications, Steele, R., Ed., Chap. 3, 186–346. IEEE Press–Pentech Press, London, 1992.
[6] Steele, R., Ed., Mobile Radio Communications, IEEE Press–Pentech Press, London, 1992.
[7] Vary, P. and Sluyter, R.J., MATS-D speech codec: Regular-pulse excitation LPC, Proceedings of Nordic Conference on Mobile Radio Communications, 257–261, 1986.
[8] Wong, K.H.H. and Hanzo, L., Channel coding. In Mobile Radio Communications, Steele, R., Ed., Chap. 4, 347–488. IEEE Press–Pentech Press, London, 1992.


Mermelstein, P. “Speech and Channel Coding for North American TDMA Cellular Systems” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


Speech and Channel Coding for North American TDMA Cellular Systems

Paul Mermelstein
INRS-Télécommunications, University of Québec

28.1 Introduction
28.2 Modulation of Digital Voice and Data Signals
28.3 Speech Coding Fundamentals
28.4 Channel Coding Considerations
28.5 VSELP Encoder
28.6 Linear Prediction Analysis and Quantization
28.7 Bandwidth Expansion
28.8 Quantizing and Encoding the Reflection Coefficients
28.9 VSELP Codebook Search
28.10 Long-Term Filter Search
28.11 Orthogonalization of the Codebooks
28.12 Quantizing the Excitation and Signal Gains
28.13 Channel Coding and Interleaving
28.14 Bad Frame Masking
28.15 ACELP Encoder
28.16 Algebraic Codebook Structure and Search
28.17 Quantization of the Gains for ACELP Encoding
28.18 Channel Coding for ACELP Encoding
28.19 Conclusions
Defining Terms
References
Further Information

28.1 Introduction

The goal of this chapter is to give the reader a tutorial introduction to, and a high-level understanding of, the techniques employed for speech transmission by the IS-54 digital cellular standard. It builds on the information provided in the standards document but is not meant to be a replacement for it. Separate standards cover the control channel used for the setup of calls and their handoff to neighboring cells, as well as the encoding of data signals for transmission. For detailed implementation information the reader should consult the most recent standards document [9].

IS-54 provides for encoding bidirectional speech signals digitally and transmitting them over cellular and microcellular mobile radio systems. It retains the 30-kHz channel spacing of the earlier advanced mobile phone service (AMPS), which uses analog frequency modulation for speech transmission and frequency shift keying for signalling. The two directions of transmission use frequencies some 45 MHz apart in the band between 824 and 894 MHz. AMPS employs one channel per conversation in each direction, a technique known as frequency division multiple access (FDMA). IS-54 employs time division multiple access (TDMA) by allowing three, and in the future six, simultaneous transmissions to share each frequency band. Because the overall 30-kHz channelization of the allocated 25 MHz of spectrum in each direction is retained, it is also known as an FDMA-TDMA system. In contrast, the later IS-95 standard employs code division multiple access (CDMA) over bands of 1.23 MHz by combining several 30-kHz frequency channels.

Each frequency channel provides for transmission at a digital bit rate of 48.6 kb/s through use of differential quadrature phase-shift keying (DQPSK) modulation at a 24.3-kBd channel rate. The channel is divided into six time slots every 40 ms. The full-rate voice coder employs every third time slot and utilizes 13 kb/s for combined speech and channel coding. The six slots provide for an eventual half-rate channel occupying one slot per 40-ms frame and utilizing only about 6.5 kb/s for each call. Thus, the simultaneous call-carrying capacity with IS-54 is increased by a factor of 3 (a factor of 6 in the future) above that of AMPS. All-digital transmission is expected to result in a reduction in transmitted power. The resulting reduction in intercell interference may allow more frequent reuse of the same frequency channels than the reuse pattern of seven cells for AMPS. Additional increases in erlang capacity (the total call-carrying capacity at a given blocking rate) may be available from the increased trunking efficiency achieved by the larger number of simultaneously available channels. The first systems employing dual-mode AMPS and TDMA service were put into operation in 1993.

In 1996 the TIA introduced the IS-641 enhanced full-rate codec. This codec consists of 7.4 kb/s speech coding following the algebraic code-excited linear prediction (ACELP) technique [7], and 5.6 kb/s channel coding. The 13 kb/s of coded information replaces the combined 13 kb/s for speech and channel coding introduced by the IS-54 standard. The new codec provides significant enhancements in terms of speech quality and robustness to transmission errors. The quality enhancement for clear channels results from the improved modeling of the stochastic excitation by means of an algebraic codebook instead of the two trained VSELP codebooks. Improved robustness to transmission errors is achieved by employing predictive quantization techniques for the linear-prediction filter and gain parameters, and by increasing the number of bits protected by forward error correction.
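The rate figures quoted in this introduction can be cross-checked with a few lines of arithmetic. The following Python sketch is illustrative only; the function and constant names are hypothetical, and the 260-b coded payload per slot is taken from Section 28.3. The difference between the gross slot size and the 260 coded bits is slot overhead that this chapter does not itemize.

```python
# Rate bookkeeping for one IS-54 carrier, using only figures quoted in the text.
SYMBOL_RATE_BAUD = 24_300     # 24.3-kBd channel rate
BITS_PER_SYMBOL = 2           # DQPSK carries one dibit per symbol
FRAME_MS = 40                 # six time slots every 40 ms
SLOTS_PER_FRAME = 6
CODED_BITS_PER_SLOT = 260     # speech plus channel coding bits per slot


def channel_bit_rate():
    """Gross bit rate of one 30-kHz carrier in b/s."""
    return SYMBOL_RATE_BAUD * BITS_PER_SYMBOL


def gross_bits_per_slot():
    """Total bits available in one time slot, overhead included."""
    bits_per_frame = channel_bit_rate() * FRAME_MS // 1000
    return bits_per_frame // SLOTS_PER_FRAME


def user_rate_bps(slots_used_per_frame):
    """Coded speech rate for a call occupying the given number of slots per 40 ms."""
    return CODED_BITS_PER_SLOT * slots_used_per_frame * 1000 // FRAME_MS
```

A full-rate call uses two of the six slots (every third slot), which gives `user_rate_bps(2)` equal to 13000 b/s; a half-rate call uses one slot and gives 6500 b/s.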

28.2 Modulation of Digital Voice and Data Signals

The modulation method used in IS-54 is π/4-shifted differentially encoded quadrature phase-shift keying (DQPSK). Symbols are transmitted as changes in phase rather than as absolute phase values. The binary data stream is converted to two binary streams $X_k$ and $Y_k$ formed from the odd- and even-numbered bits, respectively. The quadrature streams $I_k$ and $Q_k$ are formed according to

$$I_k = I_{k-1}\cos[\Delta\phi(X_k, Y_k)] - Q_{k-1}\sin[\Delta\phi(X_k, Y_k)]$$

$$Q_k = I_{k-1}\sin[\Delta\phi(X_k, Y_k)] + Q_{k-1}\cos[\Delta\phi(X_k, Y_k)]$$

where $I_{k-1}$ and $Q_{k-1}$ are the amplitudes at the previous pulse time. The phase change $\Delta\phi$ takes the values $\pi/4$, $3\pi/4$, $-\pi/4$, and $-3\pi/4$ for the dibit $(X_k, Y_k)$ symbols (0,0), (0,1), (1,0), and (1,1), respectively. This results in a rotation by $\pi/4$ between the constellations for odd and even symbols.
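The differential phase encoder described above can be sketched in a few lines of Python. This is a minimal illustration, not the standard's reference implementation: the function name, the starting point (I, Q) = (1, 0), and the convention that the first bit of each pair is $X_k$ are assumptions made for the example.

```python
import math

# Phase change for each dibit (X_k, Y_k), as listed in the text.
PHASE_CHANGE = {
    (0, 0): math.pi / 4,
    (0, 1): 3 * math.pi / 4,
    (1, 0): -math.pi / 4,
    (1, 1): -3 * math.pi / 4,
}


def dqpsk_encode(bits, i0=1.0, q0=0.0):
    """Map an even-length bit list to (I_k, Q_k) pairs by differential rotation.

    The initial amplitudes (i0, q0) are an assumed starting state.
    """
    if len(bits) % 2:
        raise ValueError("bit stream must contain whole dibits")
    i_prev, q_prev = i0, q0
    symbols = []
    for n in range(0, len(bits), 2):
        dphi = PHASE_CHANGE[(bits[n], bits[n + 1])]
        # Rotate the previous point by the phase change for this dibit.
        i_k = i_prev * math.cos(dphi) - q_prev * math.sin(dphi)
        q_k = i_prev * math.sin(dphi) + q_prev * math.cos(dphi)
        symbols.append((i_k, q_k))
        i_prev, q_prev = i_k, q_k
    return symbols
```

Starting from (1, 0), successive outputs alternate between the two constellations that are offset by π/4, so each amplitude is one of 0, ±1, or ±1/√2 up to rounding error.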

The differential encoding avoids the problem of 180° phase ambiguity that may otherwise arise in estimation of the carrier phase. The signals $I_k$ and $Q_k$ at the output of the differential phase encoder can take one of five values, $0$, $\pm 1$, $\pm 1/\sqrt{2}$, as indicated in the constellation of Fig. 28.1. The corresponding impulses are applied to the inputs of the I and Q baseband filters, which have linear phase and square root raised cosine frequency responses. The generic modulator circuit is shown in Fig. 28.2. The rolloff factor α determines the width of the transition band and its value is 0.35:

$$|H(f)| = \begin{cases} 1, & 0 \le f \le (1-\alpha)/2T \\ \sqrt{\tfrac{1}{2}\{1 - \sin[\pi(2fT - 1)/2\alpha]\}}, & (1-\alpha)/2T \le f \le (1+\alpha)/2T \\ 0, & f > (1+\alpha)/2T \end{cases}$$
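As a quick numerical check of the filter magnitude response, the piecewise definition can be evaluated directly. The sketch below is an assumption-laden illustration: the function name is hypothetical, and the response is taken to be symmetric in frequency, which the text does not state explicitly.

```python
import math


def rrc_magnitude(f, symbol_period, alpha=0.35):
    """Square root raised cosine magnitude |H(f)| for symbol period T.

    Negative frequencies are mirrored (an assumption; the text defines f >= 0).
    """
    f = abs(f)
    lower = (1 - alpha) / (2 * symbol_period)
    upper = (1 + alpha) / (2 * symbol_period)
    if f <= lower:
        return 1.0
    if f <= upper:
        arg = math.pi * (2 * f * symbol_period - 1) / (2 * alpha)
        return math.sqrt(0.5 * (1 - math.sin(arg)))
    return 0.0
```

With T = 1/24300 s, the response is flat up to about 7.9 kHz, passes through 1/√2 at 12.15 kHz (half the symbol rate), and vanishes above about 16.4 kHz.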

FIGURE 28.1: Constellation for π/4 shifted QPSK modulation. Source: TIA, 1992. Cellular System Dual-mode Mobile Station–Base Station Compatibility Standard TIA/EIA IS-54. With permission.

FIGURE 28.2: Generic modulation circuit for digital voice and data signals. [Diagram labels: $I_k$ and $Q_k$ inputs, baseband filters, two multipliers, carrier source with 90° phase shift supplying $A\cos(\omega_c t)$ and $-A\sin(\omega_c t)$, summer Σ, output $s(t)$.] Source: TIA, 1992. Cellular System Dual-mode Mobile Station–Base Station Compatibility Standard TIA/EIA IS-54.

28.3 Speech Coding Fundamentals

The IS-54 standard employs a vector-sum excited linear prediction (VSELP) coding technique. It represents a specific formulation of the much larger class of code-excited linear prediction (CELP) coders [2] that have proved effective in recent years for the coding of speech at moderate rates in the range 4–16 kb/s. VSELP provides reconstructed speech with a quality that is comparable to that available with frequency modulation and analog transmission over the AMPS system. The coding rate employed is 7.95 kb/s. Each of the six slots per frame carries 260 b of speech and channel coding information for a gross information rate of 13 kb/s. The 260 b correspond to 20 ms of real-time speech, transmitted as a single burst. For an excellent recent review of speech coding techniques for transmission, the reader is referred to Gersho, 1994 [3].

Most modern speech coders use a form of analysis-by-synthesis coding, where the encoder determines the coded signal one segment at a time by feeding candidate excitation segments into a replica of a synthesis filter and selecting the segment that minimizes the distortion between the original and reproduced signals. Linear prediction coding (LPC) techniques [1] encode the speech signal by first finding an optimum linear filter to remove the short-time correlation, passing the signal through that LPC filter to obtain a residual signal, and encoding this residual using far fewer bits than would have been required to code the original signal with the same fidelity. In most cases the coding of the residual is divided into two steps. First, the long-time correlation due to the periodic pitch excitation is removed by means of an optimum one-tap filter with adjustable gain and lag. Next, the remaining residual signal, which now closely resembles a white-noise signal, is encoded.

Code-excited linear predictors use one or more codebooks from which they select replicas of the residual of the input signal by means of a closed-loop error-minimization technique. The index of the codebook entry as well as the parameters of all the filters are transmitted to allow the speech signal to be reconstructed at the receiver. Most code-excited coders use trained codebooks. Starting with a codebook containing Gaussian signal segments, entries that are found to be used rarely in coding a large body of speech data are iteratively eliminated to result in a smaller codebook that is considered more effective.

The speech signal can be considered quasistationary or stationary for the duration of the speech frame, of the order of 20 ms. The parameters of the short-term filter, the LPC coefficients, are determined by analysis of the autocorrelation function of a suitably windowed segment of the input signal. To allow accurate determination of the time-varying pitch lag as well as to simplify the computations, each speech frame is divided into four 5-ms subframes. Independent pitch filter computations and residual coding operations are carried out for each subframe.

The speech decoder attempts to reconstruct the speech signal from the received information as well as possible. It employs a codebook identical to that of the encoder for excitation generation and, in the absence of transmission errors, would produce an exact replica of the signal that produced the minimized error at the encoder. Transmission errors do occur, however, due to signal fading and excessive interference. Since any attempt at retransmission would incur unacceptable signal delays, sufficient error protection is provided to allow correction of most transmission errors.

28.4 Channel Coding Considerations

The sharp limitations on available bandwidth for error protection argue for careful consideration of the sensitivity of the speech coding parameters to transmission errors. Pairwise interleaving of coded blocks and convolutional coding of a subset of the parameters permit correction of a limited number of transmission errors. In addition, a cyclic redundancy check (CRC) is used to determine whether the error correction was successful. The coded information is divided into three blocks of varying sensitivity to errors. Group 1 contains the most sensitive bits, mainly the parameters of the LPC filter and frame energy, and is protected by both error detection and correction bits. Group 2 is provided with error correction only. The third group, comprising mostly the fixed codebook indices, is not protected at all.

The speech signal contains significant temporal redundancy. Thus, speech frames within which errors have been detected may be reconstructed with the aid of previously correctly received information. A bad-frame masking procedure attempts to hide the effects of short fades by extrapolating the previously received parameters. Of course, if the errors persist, the decoded signal must be muted while an attempt is made to hand off the connection to a base station to/from which the mobile may experience better reception.

28.5 VSELP Encoder

A block diagram of the VSELP speech encoder [4] is shown in Fig. 28.3. The excitation signal is generated from three components: the output of a long-term or pitch filter, and entries from two codebooks. A weighted synthesis filter generates a synthesized approximation to the frequency-weighted input signal. The weighted mean square error between these two signals is used to drive the error minimization process. This weighted error is considered to be a better approximation to the perceptually important noise components than the unweighted mean square error. The total weighted square error is minimized by adjusting the pitch lag and the codebook indices as well as their gains. The decoder follows the encoder closely and generates the excitation signal identically to the encoder but uses an unweighted linear-prediction synthesis filter to generate the decoded signal. A spectral postfilter is added after the synthesis filter to enhance the quality of the reconstructed speech.

The precise data rate of the speech coder is 7950 b/s or 159 b per time slot, each corresponding to 20 ms of signal in real time. These 159 b are allocated as follows:

1) short-term filter coefficients, 38 bits;
2) frame energy, 5 bits;
3) pitch lag, 28 bits;
4) codewords, 56 bits; and
5) gain values, 32 bits.

28.6 Linear Prediction Analysis and Quantization

The purpose of the LPC analysis filter is to whiten the spectrum of the input signal so that it can be better matched by the codebook outputs. The corresponding LPC synthesis filter $A(z)$ restores the short-time speech spectrum characteristics to the output signal. The transfer function of the tenth-order synthesis filter is given by

$$A(z) = \frac{1}{1 - \sum_{i=1}^{N_p} \alpha_i z^{-i}}$$
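In the time domain, the all-pole transfer function above corresponds to the recursion $y(n) = x(n) + \sum_{i=1}^{N_p} \alpha_i\, y(n-i)$. The following Python sketch, with a hypothetical function name and a zero initial filter state assumed, shows this direct-form recursion.

```python
def lpc_synthesis(excitation, alpha):
    """All-pole synthesis: y[n] = x[n] + sum_i alpha[i-1] * y[n-i].

    `alpha` holds the predictor parameters alpha_1..alpha_Np; the filter
    state is assumed to be zero before the first sample.
    """
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for i, a in enumerate(alpha, start=1):
            if n - i >= 0:
                acc += a * out[n - i]
        out.append(acc)
    return out
```

For a first-order filter with $\alpha_1 = 0.5$, a unit impulse yields the decaying response 1, 0.5, 0.25, 0.125, and so on.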

The filter predictor parameters $\alpha_1, \ldots, \alpha_{N_p}$ are not transmitted directly. Instead, a set of reflection coefficients $r_1, \ldots, r_{N_p}$ is computed and quantized. The predictor parameters are determined from the reflection coefficients using a well-known backward recursion algorithm [6]. A variety of algorithms are known that determine a set of reflection coefficients from a windowed input signal. One such algorithm is the fixed-point covariance lattice, FLAT, which builds an optimum inverse lattice stage by stage. At each stage $j$, the sum of the mean-squared forward and backward residuals is minimized by selection of the best reflection coefficient $r_j$. The analysis window used is 170 samples long, centered with respect to the middle of the fourth 5-ms subframe of the 20-ms frame. Since this centerpoint is 20 samples from the end of the frame, 65 samples from the next frame to be coded are used in computing the reflection coefficients of the current frame. This introduces a lookahead delay of 8.125 ms.

FIGURE 28.3: Block diagram of the speech encoder in VSELP. [Diagram: the long-term filter state (lag $L$, gain $\beta$), codebook 1 (index $I$, gain $\gamma_1$), and codebook 2 (index $H$, gain $\gamma_2$) are summed to form the excitation $ex(n)$, which drives the weighted synthesis filter $H(z)$ to give $p'(n)$; the input $s(n)$ passes through the weighting filter $W(z)$ to give $p(n)$; the squared difference is summed to give the total weighted error, and the indices $L$, $I$, and $H$ are selected to minimize it.] Source: TIA, 1992. Cellular System Dual-mode Mobile Station–Base Station Compatibility Standard TIA/EIA IS-54.

The FLAT algorithm first computes the covariance matrix of the input speech for $N_A = 170$ and $N_p = 10$,

$$\phi(i,k) = \sum_{n=N_p}^{N_A-1} s(n-i)\,s(n-k), \qquad 0 \le i, k \le N_p.$$
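The covariance matrix is a direct double loop over the analysis window. A minimal Python sketch follows; the function name is hypothetical, and `s` is assumed to hold the $N_A$ windowed samples indexed from 0.

```python
def covariance_matrix(s, n_p=10, n_a=170):
    """Return phi[i][k] = sum_{n=Np}^{NA-1} s[n-i] * s[n-k] for 0 <= i, k <= Np."""
    if len(s) < n_a:
        raise ValueError("analysis window is shorter than n_a samples")
    return [
        [sum(s[n - i] * s[n - k] for n in range(n_p, n_a)) for k in range(n_p + 1)]
        for i in range(n_p + 1)
    ]
```

The result is an 11 × 11 symmetric matrix for the tenth-order analysis; each entry sums $N_A - N_p = 160$ products.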

Define the forward residual out of stage $j$ as $f_j(n)$ and the backward residual as $b_j(n)$. Then the autocorrelation of the initial forward residual $F_0(i,k)$ is given by $\phi(i,k)$. The autocorrelation of the initial backward residual $B_0(i,k)$ is given by $\phi(i+1,k+1)$, and the initial cross correlation of the two residuals is given by $C_0(i,k) = \phi(i,k+1)$ for $0 \le i, k \le N_p - 1$. Initially $j$ is set to 1. The reflection coefficient at each stage is determined as the ratio of the cross correlation to the mean of the autocorrelations. A block diagram of the computations is shown in Fig. 28.4. By quantizing the reflection coefficients within the computation loops, reflection coefficients at subsequent stages are computed taking into account the quantization errors of the previous stages. Specifically,

$$C'_{j-1} = C_{j-1}(0,0) + C_{j-1}(N_p - j, N_p - j)$$

$$F'_{j-1} = F_{j-1}(0,0) + F_{j-1}(N_p - j, N_p - j)$$

$$B'_{j-1} = B_{j-1}(0,0) + B_{j-1}(N_p - j, N_p - j)$$

and

$$r_j = \frac{-2\,C'_{j-1}}{F'_{j-1} + B'_{j-1}}$$

FIGURE 28.4: Block diagram for lattice covariance computations.

Use of two sets of correlation values separated by $N_p - j$ samples provides additional stability to the computed reflection coefficients in case the input signal changes form rapidly. Once a quantized reflection coefficient $r_j$ has been determined, the resulting auto- and cross correlations can be determined iteratively as

$$F_j(i,k) = F_{j-1}(i,k) + r_j\,[C_{j-1}(i,k) + C_{j-1}(k,i)] + r_j^2\,B_{j-1}(i,k)$$

$$B_j(i,k) = B_{j-1}(i+1,k+1) + r_j\,[C_{j-1}(i+1,k+1) + C_{j-1}(k+1,i+1)] + r_j^2\,F_{j-1}(i+1,k+1)$$

and

$$C_j(i,k) = C_{j-1}(i,k+1) + r_j\,[B_{j-1}(i,k+1) + F_{j-1}(i,k+1)] + r_j^2\,C_{j-1}(k+1,i)$$

These computations are carried out iteratively for $r_j$, $j = 1, \ldots, N_p$.
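The stage-by-stage recursion can be sketched as follows. This is a floating-point illustration under stated assumptions, not the fixed-point procedure of the standard: the function name is hypothetical, the quantization of each $r_j$ inside the loop is omitted, and the correlation arrays are assumed to shrink by one row and column per stage so that the indices $i+1$ and $k+1$ stay in range.

```python
def flat_reflection_coefficients(phi, n_p):
    """Return [r_1, ..., r_Np] from an (Np+1) x (Np+1) covariance matrix phi.

    Quantization of r_j within the loop is left out of this sketch.
    """
    size = n_p
    # Initial correlations for 0 <= i, k <= Np - 1.
    F = [[phi[i][k] for k in range(size)] for i in range(size)]
    B = [[phi[i + 1][k + 1] for k in range(size)] for i in range(size)]
    C = [[phi[i][k + 1] for k in range(size)] for i in range(size)]
    r = []
    for j in range(1, n_p + 1):
        m = n_p - j  # offset of the second set of correlation values
        c_sum = C[0][0] + C[m][m]
        f_sum = F[0][0] + F[m][m]
        b_sum = B[0][0] + B[m][m]
        denom = f_sum + b_sum
        r_j = -2.0 * c_sum / denom if denom else 0.0  # guard against silent input
        r.append(r_j)
        # Update the correlations for the next stage (indices 0 .. m - 1).
        F_new = [[F[i][k] + r_j * (C[i][k] + C[k][i]) + r_j * r_j * B[i][k]
                  for k in range(m)] for i in range(m)]
        B_new = [[B[i + 1][k + 1] + r_j * (C[i + 1][k + 1] + C[k + 1][i + 1])
                  + r_j * r_j * F[i + 1][k + 1]
                  for k in range(m)] for i in range(m)]
        C_new = [[C[i][k + 1] + r_j * (B[i][k + 1] + F[i][k + 1])
                  + r_j * r_j * C[k + 1][i]
                  for k in range(m)] for i in range(m)]
        F, B, C = F_new, B_new, C_new
    return r
```

Because each $r_j$ is the ratio of a cross correlation to the mean of two autocorrelations, its magnitude does not exceed one in exact arithmetic, which is the property that keeps the resulting synthesis filter stable.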

28.7 Bandwidth Expansion

Poles with very narrow bandwidths may introduce undesirable distortions into the synthesized signal. Use of a binomial window with effective bandwidth of 80 Hz suffices to limit the ringing of the LPC filter and reduce the effect of the LPC filter selected for one frame on the signal reconstructed for subsequent frames. To achieve this, prior to searching for the reflection coefficients, $\phi(i,k)$ is modified by use of a window function $w(j)$, $j = 1, \ldots, 10$, as follows:

$$\phi'(i,k) = \phi(i,k)\,w(|i-k|)$$

28.8 Quantizing and Encoding the Reflection Coefficients

The distortion introduced into the overall spectrum by quantizing the reflection coefficients diminishes as we move to higher orders in the reflection coefficients. Accordingly, more bits are assigned to the lower order coefficients. Specifically, 6, 5, 5, 4, 4, 3, 3, 3, 3, and 2 b are assigned to r1 , . . . , r10 , respectively. Scalar quantization of the reflection coefficients is used in IS-54 because it is particularly simple. Vector quantization achieves additional quantizing efficiencies at the cost of significant added complexity. It is important to preserve the smooth time evolution of the linear prediction filter. Both the encoder and decoder linearly interpolate the coefficients αi for the first, second and third subframes of each frame using the coefficients determined for the previous and current frames. The fourth subframe uses the values computed for that frame.
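The bit assignment and the subframe interpolation described above can be illustrated with a short Python sketch. The interpolation weights below are hypothetical, evenly spaced values chosen for the example; the text states only that the first three subframes interpolate linearly between the previous and current frames and that the fourth uses the current values. The function name is likewise illustrative.

```python
# Bits assigned to r_1 .. r_10, as listed in the text.
REFLECTION_BITS = [6, 5, 5, 4, 4, 3, 3, 3, 3, 2]

# Hypothetical weights on the current frame for subframes 1 to 4.
CURRENT_WEIGHTS = [0.25, 0.5, 0.75, 1.0]


def interpolate_subframe_coefficients(prev_alpha, curr_alpha):
    """Return four coefficient sets, one per 5-ms subframe."""
    return [
        [(1.0 - w) * p + w * c for p, c in zip(prev_alpha, curr_alpha)]
        for w in CURRENT_WEIGHTS
    ]
```

The ten bit counts sum to 38, which matches the short-term filter allocation given for the 159-b frame in Section 28.5.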

28.9 VSELP Codebook Search

The codebook search operation selects indices for the long-term filter (pitch lag $L$) and the two codebooks $I$ and $H$ so as to minimize the total weighted error. This closed-loop search is the most computationally complex part of the encoding operation, and significant effort has been invested to minimize the complexity of these operations without degrading performance. To reduce complexity, simultaneous optimization of the codebook selections is replaced by a sequential optimization procedure, which considers the long-term filter search as the most significant and therefore executes it first. The two vector-sum codebooks are considered to contribute progressively less to the minimization of the error, and their search follows in sequence. Subdivision of the total codebook into two vector sums simplifies the processing and makes the result less sensitive to errors in decoding the individual bits arising from transmission errors. Entries from each of the two vector-sum codebooks can be expressed as the sum of basis vectors. By orthogonalizing these basis vectors to the previously selected codebook component(s), one ensures that the newly introduced components reduce the remaining errors.

The subframes over which the codebook search is carried out are 5 ms or 40 samples long. An optimal search would need exploration of a 40-dimensional space. The vector-sum approximation limits the search to 14 dimensions after the optimal pitch lag has been selected. The search is further divided into two stages of 7 dimensions each. The two codebooks are specified in terms of the fourteen 40-dimensional basis vectors stored at the encoder and decoder. The two 7-b indices indicate the required weights on the basis vectors to arrive at the two optimum codewords.

The codebook search can be viewed as selecting the three best directions in 40-dimensional space, which when summed result in the best approximation to the weighted input signal. The gains of the three components are determined through a separate error minimization process.

28.10 Long-Term Filter Search

The long-term filter is optimized by selection of a lag value that minimizes the error between the weighted input signal $p(n)$ and the past excitation signal filtered by the current weighted synthesis filter $H(z)$. There are 127 possible coded lag values, corresponding to lags of 20–146 samples. One value is reserved for the case when all correlations between the input and the lagged residuals are negative, so that using no long-term filter output at all is best. To simplify the convolution operation between the impulse response of the weighted synthesis filter and the past excitation, the impulse response is truncated to 21 samples or 2.5 ms. Once the lag is determined, the untruncated impulse response is used to compute the weighted long-term lag vector.
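The structure of the lag search can be sketched under simplifying assumptions. In the Python example below, the weighted synthesis filtering by $H(z)$ is omitted, so candidate vectors are compared with the target directly; lags shorter than the subframe are handled by repeating the lagged segment periodically, which is an assumption of this sketch; and the function names are hypothetical. For each lag, the optimum gain reduces the squared error by (correlation)²/(energy), so the search maximizes that ratio over lags with positive correlation.

```python
def lagged_vector(past_excitation, lag, length=40):
    """Subframe-length vector taken `lag` samples back in the past excitation.

    When lag < length, the segment is repeated periodically (an assumption).
    """
    vec = []
    for n in range(length):
        if n < lag:
            vec.append(past_excitation[len(past_excitation) - lag + n])
        else:
            vec.append(vec[n - lag])
    return vec


def search_lag(target, past_excitation, min_lag=20, max_lag=146):
    """Return the lag in [min_lag, max_lag] giving the largest error reduction.

    Returns None when every correlation is non-positive, corresponding to the
    reserved code for "no long-term filter output".
    """
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        cand = lagged_vector(past_excitation, lag, len(target))
        corr = sum(t * c for t, c in zip(target, cand))
        energy = sum(c * c for c in cand)
        if corr <= 0.0 or energy == 0.0:
            continue
        score = corr * corr / energy
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

The past excitation buffer must hold at least `max_lag` samples. The range 20 to 146 contains exactly 127 lags, which together with the reserved value fills a 7-b code.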

28.11 Orthogonalization of the Codebooks

Prior to the search of the first codebook, each filtered basis vector may be made orthogonal to the long-term filter output, the zero-state response of the weighted synthesis filter $H(z)$ to the long-term prediction vector. Each orthogonalized filtered basis vector is computed by subtracting its projection onto the long-term filter output from itself. Similarly, the basis vectors of the second codebook can be orthogonalized with respect to both the long-term filter output and the first codebook output, the zero-state response of $H(z)$ to the previously selected summation of first-codebook basis vectors. In each case the codebook excitation can be reconstituted as

$$u_{k,i}(n) = \sum_{m=1}^{M} \theta_{im}\, v_{k,m}(n)$$

where $k = 1, 2$ for the two codebooks, $i = I$ or $H$ is the 7-b code vector received, $v_{k,m}$ are the two sets of basis vectors, and $\theta_{im} = +1$ if bit $m$ of codeword $i$ is 1 and $-1$ if bit $m$ of codeword $i$ is 0. Orthogonalization is not required at the decoder since the gains of the codebook outputs are determined with respect to the weighted nonorthogonalized code vectors.
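Both operations of this section, building a codeword as a signed sum of basis vectors and removing the projection onto a previously selected component, are short enough to sketch in Python. The function names are hypothetical, and the mapping of bit $m$ to the $m$th least significant bit of the index is an assumption of this example.

```python
def vector_sum_codeword(index, basis):
    """u(n) = sum_m theta_m * v_m(n), with theta_m = +1 if bit m is 1, else -1.

    Bit m is taken as the m-th least significant bit of `index` (an assumption).
    """
    length = len(basis[0])
    out = [0.0] * length
    for m, v in enumerate(basis):
        theta = 1.0 if (index >> m) & 1 else -1.0
        for n in range(length):
            out[n] += theta * v[n]
    return out


def orthogonalize(vector, reference):
    """Subtract from `vector` its projection onto `reference`."""
    ref_energy = sum(r * r for r in reference)
    if ref_energy == 0.0:
        return list(vector)
    scale = sum(x * r for x, r in zip(vector, reference)) / ref_energy
    return [x - scale * r for x, r in zip(vector, reference)]
```

With seven basis vectors per codebook, the 128 possible indices produce 64 codewords and their negatives, since complementing every bit of the index flips every sign.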

28.12 Quantizing the Excitation and Signal Gains

The three codebook gain values $\beta$, $\gamma_1$, and $\gamma_2$ are transformed to three new parameters GS, P0, and P1 for quantization purposes. GS is an energy offset parameter that equalizes the input and output signal energies. It adjusts the energy of the output of the LPC synthesis filter to equal the energy computed for the same subframe at the encoder input. P0 is the energy contribution of the long-term prediction vector as a fraction of the total excitation energy within the subframe. Similarly, P1 is the energy contribution of the code vector selected from the first codebook as a fraction of the total excitation energy of the subframe. The transformation reduces the dynamic range of the parameters to be encoded. An 8-b vector quantizer efficiently encodes the appropriate (GS, P0, P1) vectors by selecting the vector which minimizes the weighted error. The received and decoded values $\beta$, $\gamma_1$, and $\gamma_2$ are computed from the received (GS, P0, P1) vector and applied to reconstitute the decoded signal.

28.13

Channel Coding and Interleaving

77 Class-1 bits

5 Tail Bits

7 178

Rate 1/2 Convolutional Coding

Coded Class-1 bits

260

2-Slot interleaver

12 Most Perceptually Significant Bits 7-bit CRC Computation

Voice cipher

Speech Coder

The goals of channel coding are to reduce the impairments in the reconstructed speech due to transmission errors. The 159 b characterizing each 20-ms block of speech are divided into two classes, 77 in class 1 and 82 in class 2. Class 1 includes the bits in which errors result in a more significant impairment, whereas the speech quality is considered less sensitive to the class- 2 bits. Class 1 generally includes the gain, pitch lag, and more significant reflection coefficient bits. In addition, a 7-b cyclic redundancy check is applied to the 12 most perceptually significant bits of class 1 to indicate whether the error correction was successful. Failure of the CRC check at the receiver suggests that the received information is so erroneous that it would be better to discard it than use it. The error correction coding is illustrated in Fig. 28.5.

260

82 Class-2 bits

Speech frames x and y

speech frame y and z

40 msec

FIGURE 28.5: Error correction insertion for speech coder. Source TIA, 1992. Cellular Systems DualMode Mobile Station–Base Station Compatibility Standards. TIA/EIA IS-54. With permission. The error correction technique used is rate 1/2 convolutional coding with a constraint length of 5 [5]. A tail of 5 b is appended to the 84 b to be convolutionally encoded to result in a 178-b output. Inclusion of the tail bits ensures independent decoding of successive time slots and no propagation of errors between slots. Interleaving the bits to be transmitted over two time slots is introduced to diminish the effects of short deep fades and to improve the error-correction capabilities of the channel coding technique. Two speech frames, the previous and the present, are interleaved so that the bits from each speech block span two transmission time slots separated by 20 ms. The interleaving attempts to separate the convolutionally coded class-1 bits from one frame as much as possible in time by inserting noncoded class-2 bits between them. 1999 by CRC Press LLC


28.14 Bad Frame Masking

A CRC failure indicates that the received data is unusable, either due to transmission errors resulting from a fade, or from pre-emption of the time slot by a control message (fast associated control channel, FACCH). To mask the effects that may result from leaving a gap in the speech signal, a masking operation based on the temporal redundancy between adjacent speech blocks has been proposed. Such masking can at best bridge over short gaps but cannot recover loss of signal of longer duration. The bad frame masking operation may follow a finite state machine where each state indicates an operation appropriate to the elapsed duration of the fade to which it corresponds. The masking operation consists of copying the previous LPC information and attenuating the gain of the signal. State 6 corresponds to error sequences exceeding 100 ms, for which the output signal is muted. The result of such a masking operation is generation of an extrapolation in the gap to the previously received signal, significantly reducing the perceptual effects of short fades. No additional delay is introduced in the reconstructed signal. At the same time, the receiver will report a high frequency of bad frames leading the system to explore handoff possibilities immediately. A quick successful handoff will result in rapid signal recovery.
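The masking state machine can be sketched in a few lines of Python. The 20-ms frame length, the repetition of the previous LPC information, the gain attenuation, and the muting in state 6 follow the text; the rule that the state advances by one per bad frame, the reset on a good frame, and the attenuation step `ATTEN_DB_PER_STATE` are assumptions made only for illustration.

```python
# Sketch of a bad-frame masking state machine.
# From the text: masking repeats the previous LPC information and
# attenuates the gain; state 6 (errors lasting more than 100 ms, i.e.
# more than five 20-ms frames) mutes the output.
# ASSUMPTIONS: one state step per bad frame, reset to state 0 on a good
# frame, and a made-up attenuation step ATTEN_DB_PER_STATE.
MUTE_STATE = 6
ATTEN_DB_PER_STATE = 4.0  # hypothetical value

class BadFrameMasker:
    def __init__(self):
        self.state = 0
        self.last_lpc = None
        self.last_gain = 0.0

    def process(self, crc_ok, lpc=None, gain=0.0):
        """Return the (lpc, gain) pair the decoder should use for this frame."""
        if crc_ok:
            self.state = 0
            self.last_lpc, self.last_gain = lpc, gain
            return lpc, gain
        self.state = min(self.state + 1, MUTE_STATE)
        if self.state == MUTE_STATE:
            return self.last_lpc, 0.0            # muted output
        atten = 10 ** (-ATTEN_DB_PER_STATE * self.state / 20)
        return self.last_lpc, self.last_gain * atten

masker = BadFrameMasker()
masker.process(True, lpc=[0.5, -0.2], gain=1.0)      # one good frame
gains = [masker.process(False)[1] for _ in range(6)]  # six bad frames
print([round(g, 3) for g in gains])
```

Because the masker only reuses parameters that are already stored, it adds no delay to the reconstructed signal, in line with the description above.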

28.15 ACELP Encoder

The ACELP encoder employs linear prediction analysis and quantization techniques similar to those used in VSELP and discussed in Section 28.6. The frame structure of 20-ms frames and 5-ms subframes is preserved. Linear prediction analysis is carried out for every frame. The ACELP encoder uses a long-term filter similar to the one discussed in Section 28.10 and represented as an adaptive codebook. The nonpredictable part of the LPC residual is represented in terms of ACELP codebooks, which replace the two VSELP codebooks shown in Fig. 28.3. Instead of encoding the reflection coefficients as in VSELP, the information is transformed into line-spectral frequency pairs (LSP) [8]. The LSPs can be derived from linear prediction coefficients, a 10th-order analysis generating 10 line-spectral frequencies (LSF), 5 poles, and 5 zeroes. The LSFs can be vector quantized and the LPC coefficients recalculated from the quantized LSFs. As long as the interleaved order of the poles and zeroes is preserved, quantization of the LSPs preserves the stability of the LPC synthesis filters. The LSPs of any frame can be predicted well from the values calculated and transmitted for previous frames, which brings additional advantages. The long-term means of the LSPs are calculated for a large body of speech data and stored at both the encoder and decoder. First-order moving-average prediction is then used for the mean-removed LSPs. The time-prediction technique also permits use of predicted values for the LSPs in case uncorrectable transmission errors are encountered, resulting in reduced speech degradation. To simplify the vector quantization operations, each LSP vector is split into 3 subvectors of dimensions 3, 3, and 4. The three subvectors are quantized with 8, 9, and 9 bits, respectively, corresponding to a total bit assignment of 26 bits per frame for LPC information.
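The split quantization step can be sketched as follows. The subvector dimensions (3, 3, 4) and bit allocations (8, 9, 9) come from the text; the codebooks here are random placeholders rather than trained tables, and the input vector is an arbitrary example, so the sketch only illustrates the search structure and the bit count.

```python
import math
import random

# Sketch of split vector quantization of a 10-dimensional LSP vector.
# From the text: subvector dimensions 3, 3, 4 quantized with 8, 9, 9 bits.
# ASSUMPTION: the codebooks are random placeholders, not trained tables.
DIMS = (3, 3, 4)
BITS = (8, 9, 9)

rng = random.Random(0)
CODEBOOKS = [
    [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(2 ** b)]
    for d, b in zip(DIMS, BITS)
]

def split_vq(vector):
    """Return one codebook index per subvector (nearest-neighbour search)."""
    indices, start = [], 0
    for dim, book in zip(DIMS, CODEBOOKS):
        sub = vector[start:start + dim]
        best = min(range(len(book)), key=lambda i: math.dist(sub, book[i]))
        indices.append(best)
        start += dim
    return indices

# Arbitrary example of a mean-removed, prediction-error LSP vector
lsp_residual = [0.1, -0.3, 0.2, 0.0, 0.4, -0.1, 0.25, -0.2, 0.05, 0.3]
idx = split_vq(lsp_residual)
print(idx, sum(BITS))   # three indices; 26 bits per frame in total
```

Splitting keeps the search small: three searches over 256, 512, and 512 entries instead of a single search over 2^26 candidate vectors.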

28.16 Algebraic Codebook Structure and Search

Algebraic codebooks contain relatively few pulses having nonzero values, leading to rapid search of the possible innovation vectors, that is, the vectors which together with the adaptive codebook (ACB) output form the excitation of the LPC filter for the current subframe. In this implementation the 40-position innovation vector contains only four nonzero pulses, and each can take on only the values +1 and −1. The 40 positions are divided into four tracks and one pulse is selected from each track. The tracks are generally equally spaced but differ in their starting value; thus the first pulse can take on positions 0, 5, 10, 15, 20, 25, 30, or 35 and the second has possible positions 1, 6, 11, 16, 21, 26, 31, or 36. The first three pulse positions are coded with 3 bits each and the fourth pulse position (starting positions 3 or 4) with 4 bits. Together with one sign bit per pulse, this results in a 17-bit sequence for the algebraic code of each subframe. The algebraic codebook is searched by minimizing the mean square error between the weighted input speech and the weighted synthesized speech over the time span of each subframe. In each case the weighting is that produced by a perceptual weighting filter that has the effect of shaping the spectrum of the synthesis error signal so that it is better masked by the spectrum of the current speech signal.
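The track layout and the resulting bit count can be checked with a small Python sketch. The first two tracks and the 3/3/3/4-bit position coding are taken from the text; the third track (2, 7, ...), the use of one sign bit per pulse, and the packing order of the bits are assumptions chosen for illustration and may differ from the actual standard.

```python
# Sketch of the four-pulse algebraic codebook layout.
# From the text: 40 positions, four pulses of amplitude +1 or -1, 3 bits
# for each of the first three positions and 4 bits for the fourth.
# ASSUMPTIONS: the third track (2, 7, ...) follows the same spacing, the
# fourth pulse uses the tracks starting at 3 and 4, one sign bit is spent
# per pulse, and the bit packing order below is purely illustrative.
TRACKS = [
    list(range(0, 40, 5)),
    list(range(1, 40, 5)),
    list(range(2, 40, 5)),
    sorted(list(range(3, 40, 5)) + list(range(4, 40, 5))),
]
POS_BITS = [3, 3, 3, 4]

def encode_pulses(pulses):
    """Pack four (position, sign) pairs into one integer code."""
    code = 0
    for track, nbits, (pos, sign) in zip(TRACKS, POS_BITS, pulses):
        code = (code << nbits) | track.index(pos)   # position within track
        code = (code << 1) | (1 if sign < 0 else 0) # sign bit
    return code

def decode_pulses(code):
    """Rebuild the 40-sample innovation vector from the packed code."""
    vector = [0] * 40
    for track, nbits in reversed(list(zip(TRACKS, POS_BITS))):
        sign = -1 if code & 1 else 1
        code >>= 1
        vector[track[code & ((1 << nbits) - 1)]] = sign
        code >>= nbits
    return vector

code = encode_pulses([(10, +1), (36, -1), (2, +1), (39, -1)])
vector = decode_pulses(code)
print(sum(POS_BITS) + 4, [i for i, v in enumerate(vector) if v])
```

The small number of pulses is what makes the search fast: each candidate vector contributes only four signed samples to the synthesis filter input.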

28.17 Quantization of the Gains for ACELP Encoding

The adaptive codebook gain and the fixed (algebraic) codebook gain are vector quantized using a 7-bit codebook. The gain codebook search is performed by minimizing the mean square of the weighted error between the original and the reconstructed speech, expressed as a function of the adaptive codebook gain and a fixed codebook correction factor. This correction factor represents the log energy difference between a predicted gain and an estimated gain. The predicted gain is computed using fourth-order moving-average prediction with fixed coefficients on the innovation energy of each subframe. The result is a smoothed energy profile even in the presence of modest quantization errors. As discussed above in the case of the LSP quantization, the moving-average prediction serves to provide predicted values even when the current frame information is lost due to transmission errors. Degradations resulting from loss of one or two frames of information are thereby mitigated.

28.18 Channel Coding for ACELP Encoding

The channel coding and interleaving operations for ACELP speech coding are similar to those discussed in Section 28.13 for VSELP coding. The number of bits protected by both error-detection (parity) and error-correction convolutional coding is increased to 48 from 12. Rate 1/2 convolutional coding is applied to 108 bits, comprising the 96 more significant class-1 bits, 7 CRC bits, and the 5 tail bits of the convolutional coder, resulting in 216 coded class-1 bits. Eight of the 216 bits are dropped by puncturing, yielding 208 coded class-1 bits, which are then combined with 52 nonprotected class-2 bits. As compared to the channel coding of the VSELP encoder, the number of protected bits is increased and the number of unprotected bits is reduced while keeping the overall coding structure unchanged.
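The bit counts quoted above can be verified with a few lines of Python. All constants are taken from the text; the frame rate of 50 frames per second simply restates the 20-ms frame length.

```python
# Bit budget of one 20-ms ACELP speech frame, using the counts in the text.
CLASS1, CRC, TAIL = 96, 7, 5
PUNCTURED, CLASS2 = 8, 52
FRAMES_PER_SECOND = 50                        # one frame every 20 ms

encoder_input = CLASS1 + CRC + TAIL           # bits entering the encoder
coded = 2 * encoder_input - PUNCTURED         # rate 1/2, then puncturing
frame_bits = coded + CLASS2                   # coded plus unprotected bits
source_rate = (CLASS1 + CLASS2) * FRAMES_PER_SECOND

print(encoder_input, coded, frame_bits)       # 108 208 260
print(source_rate)                            # 7400 b/s of speech data
```

The total of 260 bits per frame matches the VSELP case (178 coded class-1 bits plus 82 class-2 bits), which is what allows the overall slot structure to remain unchanged.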

28.19 Conclusions

The IS-54 digital cellular standard specifies modulation and speech coding techniques for mobile cellular systems that allow the interoperation of terminals built by a variety of manufacturers and systems operated across the country by a number of different service providers. It permits speech communication with good quality in a transmission environment characterized by frequent multipath fading and significant intercell interference. Generally, the quality of the IS-54 decoded speech is better at the edges of a cell than the corresponding AMPS transmission due to the error mitigation resulting from channel coding. Near a base station or in the absence of significant fading and interference, the IS-54 speech quality is reported to be somewhat worse than AMPS due to the inherent limitations of the analysis–synthesis model in reconstructing arbitrary speech signals with limited bits. The IS-641 standard coder achieves higher speech quality, particularly at the edges of heavily occupied cells where transmission errors may be more numerous. At this time no new systems following the IS-54 standard are being introduced. Most base stations have been converted to transmit and receive on the IS-641 standard as well, and use of IS-54 transmissions is dropping rapidly. At the time of its introduction in 1996 the IS-641 coder represented the state of the art in terms of toll-quality speech coding near 8 kb/s, a significant improvement over the IS-54 coder introduced in 1990. These standards represent reasonable engineering compromises between high performance and complexity sufficiently low to permit single-chip implementations in mobile terminals. Both IS-54 and IS-641 are considered second generation cellular standards. Third generation cellular systems promise higher call capacities through better exploitation of the time-varying transmission requirements of speech conversations, as well as improved modulation and coding in wider spectrum bandwidths that achieve similar bit-error ratios but reduce the required transmitted power. Until such systems are introduced, the second generation TDMA systems can be expected to provide many years of successful cellular and personal communications services.

Defining Terms

Codebook: A set of signal vectors available to both the encoder and decoder.

Covariance lattice algorithm: An algorithm for reduction of the covariance matrix of the signal consisting of several lattice stages, each stage implementing an optimal first-order filter with a single coefficient.

Reflection coefficient: A parameter of each stage of the lattice linear prediction filter that determines (1) a forward residual signal at the output of the filter stage by subtracting from the forward residual at the input a linear function of the backward residual, and (2) a backward residual at the output of the filter stage by subtracting a linear function of the forward residual from the backward residual at the input.

Vector quantizer: A quantizer that assigns quantized vectors to a vector of parameters based on their current values by minimizing some error criterion.

References

[1] Atal, B.S. and Hanauer, S.L., Speech analysis and synthesis by linear prediction of the speech wave. J. Acoust. Soc. Am., 50, 637–655, 1971.
[2] Atal, B.S. and Schroeder, M., Stochastic coding of speech signals at very low bit rates. Proc. Int. Conf. Comm., 1610–1613, 1984.
[3] Gersho, A., Advances in speech and audio compression. Proc. IEEE, 82, 900–918, 1994.
[4] Gerson, I.A. and Jasiuk, M.A., Vector sum excited linear prediction (VSELP) speech coding at 8 kbps. Int. Conf. Acoust. Speech and Sig. Proc., ICASSP90, 461–464, 1990.
[5] Lin, S. and Costello, D., Error Control Coding: Fundamentals and Application, Prentice Hall, Englewood Cliffs, NJ, 1983.
[6] Makhoul, J., Linear prediction, a tutorial review. Proc. IEEE, 63, 561–580, 1975.
[7] Salami, R., Laflamme, C., Adoul, J.P., and Massaloux, D., A toll quality 8 kb/s speech codec for the personal communication system (PCS). IEEE Trans. Vehic. Tech., 43, 808–816, 1994.
[8] Soong, F.K. and Juang, B.H., Line spectrum pair (LSP) and speech data compression. Proc. ICASSP’84, 1.10.1–1.10.4, 1984.
[9] Telecommunications Industry Association, EIA/TIA Interim Standard, Cellular System Dual-Mode Mobile Station–Base Station Compatibility Standard IS-54B, TIA/EIA, Washington, D.C., 1992.

Further Information

For a general treatment of speech coding for telecommunications, see N.S. Jayant and P. Noll, Digital Coding of Waveforms, Prentice Hall, Englewood Cliffs, NJ, 1984. For a more detailed treatment of linear prediction techniques, see J. Markel and A. Gray, Linear Prediction of Speech, Springer–Verlag, NY, 1976.


Hanzo, L. “The British Cordless Telephone Standard: CT-2” Mobile Communications Handbook Ed. Suthan S. Suthersan Boca Raton: CRC Press LLC, 1999


The British Cordless Telephone Standard: CT-2

Lajos Hanzo
University of Southampton

29.1 History and Background
29.2 The CT-2 Standard
29.3 The Radio Interface
Transmission Issues • Multiple Access and Burst Structure • Power Ramping, Guard Period, and Propagation Delay • Power Control
29.4 Burst Formats
29.5 Signalling Layer Two (L2)
General Message Format • Fixed Format Packet
29.6 CPP-Initiated Link Setup Procedures
29.7 CFP-Initiated Link Setup Procedures
29.8 Handshaking
29.9 Main Features of the CT-2 System
Defining Terms
References

29.1 History and Background

Following a decade of world-wide research and development (R&D), cordless telephones (CT) are now becoming widespread consumer products, and they are paving the way towards ubiquitous, low-cost personal communications networks (PCN) [7, 8]. The two best-known European representatives of CTs are the digital European cordless telecommunications (DECT) system [1, 5] and the CT-2 system [2, 6]. Three potential application areas have been identified, namely, domestic, business, and public access, the last of which is often referred to as telepoint (TP). In addition to conventional voice communications, CTs have been conceived with additional data services and local area network (LAN) applications in mind. The fundamental difference between conventional mobile radio systems and CT systems is that CTs have been designed for small to very small cells, where typically benign low-dispersion, dominant line-of-sight (LOS) propagation conditions prevail. Therefore, CTs can usually dispense with channel equalizers and complex low-rate speech codecs, since the low signal dispersion allows for the employment of higher bit rates before the effect of channel dispersion becomes a limiting factor. On the same note, the LOS propagation scenario is associated with mild fading or near-constant received signal level, and when combined with appropriate small-cell power-budget design, it ensures a high average signal-to-noise ratio (SNR).


These prerequisites facilitate the employment of high-rate, low-complexity speech codecs, which maintain a low battery drain. Furthermore, the deployment of forward error correction codecs can often also be avoided, which reduces both the bandwidth requirement and the power consumption of the portable station (PS). A further difference between public land mobile radio (PLMR) systems [3] and CTs is that whereas the former endeavor to standardize virtually all system features, the latter seek to offer a so-called access technology, specifying the common air interface (CAI), access and signalling protocols, and some network architecture features, but leaving many other characteristics unspecified. By the same token, whereas PLMR systems typically have a rigid frequency allocation scheme and fixed cell structure, CTs use dynamic channel allocation (DCA) [4]. The DCA principle allows for a more intelligent and judicious channel assignment, where the base station (BS) and PS select an appropriate traffic channel on the basis of the prevailing traffic and channel quality conditions, thus minimizing, for example, the effect of cochannel interference or the channel blocking probability. In contrast to PLMR schemes, such as the Pan-European global system for mobile communications (GSM) [3], CT systems typically dispense with sophisticated mobility management, which accounts for the bulk of the cost of PLMR call charges, although they may facilitate limited hand-over capabilities. Whereas in residential applications CTs are the extension of the public switched telephone network (PSTN), the concept of omitting mobility management functions, such as location update, leads to telepoint CT applications where users are able to initiate but not to receive calls. This fact drastically reduces the network operating costs and, ultimately, the call charge, at a concomitant reduction of the services rendered. Having considered some of the fundamental differences between PLMR and CT systems, let us now review the basic features of the CT-2 system.

29.2 The CT-2 Standard

The European CT-2 recommendation has evolved from the British standard MPT-1375 with the aim of ensuring the compatibility of various manufacturers’ systems as well as setting performance requirements, which would encourage the development of cost-efficient implementations. Further standardization objectives were to enable future evolution of the system, for example, by reserving signalling messages for future applications, and to maintain a low PS complexity even at the expense of higher BS costs. The CT-2 or MPT 1375 CAI recommendation comprises the following four parts.

1. Radio interface: Standardizes the radio frequency (RF) parameters, such as legitimate channel frequencies, the modulation method, the transmitter power control, and the required receiver sensitivity as well as the carrier-to-interference ratio (CIR) and the time division duplex (TDD) multiple access scheme. Furthermore, the transmission burst and master/slave timing structures to be used are also laid down, along with the scrambling procedures to be applied.

2. Signalling layers one and two: Defines how the bandwidth is divided among signalling, traffic data, and synchronization information. The description of the first signalling layer includes the dynamic channel allocation strategy, calling channel detection, as well as link setup and establishment algorithms. The second layer is concerned with issues of various signalling message formats, as well as link establishment and re-establishment procedures.

3. Signalling layer three: The third signalling layer description includes a range of message sequence diagrams for call setup to telepoint BSs and private BSs, as well as for the call clear-down procedures.

4. Speech coding and transmission: The last part of the standard is concerned with the algorithmic and performance features of the audio path, including frequency responses, clipping, distortion, noise, and delay characteristics.

Having briefly reviewed the structure of the CT-2 recommendations, let us now turn our attention to its main constituent parts and consider specific issues of the system’s operation.

29.3 The Radio Interface

29.3.1 Transmission Issues

In our description of the system we will adopt the terminology used in the recommendation, where the PS is called cordless portable part (CPP), whereas the BS is referred to as cordless fixed part (CFP). The channel bandwidth and the channel spacing are 100 kHz, and the allocated system bandwidth is 4 MHz, which lies in the range of 864.15–868.15 MHz. Accordingly, a total of 40 RF channels can be utilized by the system. The accuracy of the radio frequency must be maintained within ±10 kHz of its nominal value for both the CFP and CPP over the entire specified supply voltage and ambient temperature range. To counteract the maximum possible frequency drift of 20 kHz, automatic frequency correction (AFC) may be used in both the CFP and CPP receivers. The AFC may be allowed to control the transmission frequency of only the CPP, however, in order to prevent the misalignment of both transmission frequencies. Binary frequency shift keying (FSK) is proposed, and the signal must be shaped by an approximately Gaussian filter in order to maintain the lowest possible frequency occupancy. The resulting scheme is referred to as Gaussian frequency shift keying (GFSK), which is closely related to Gaussian minimum shift keying (GMSK) [7] used in the DECT [1] and GSM [3] systems. Suffice it to say that in M-ary FSK modems the carrier’s frequency is modulated in accordance with the information to be transmitted, where the modulated signal is given by

\[ S_i(t) = \sqrt{\frac{2E}{T}} \cos\left[\omega_i t + \Phi\right], \quad i = 1, \ldots, M \]

and $E$ represents the bit energy, $T$ the signalling interval length, $\omega_i$ has $M$ discrete values, whereas the phase $\Phi$ is constant.

29.3.2 Multiple Access and Burst Structure

The so-called TDD multiple access scheme is used, which is demonstrated in Fig. 29.1. The simple principle is to use the same radio frequency for both uplink and downlink transmissions between the CPP and the CFP, but with a certain staggering in time. The figure reveals further details of the burst structure, indicating that 66 or 68 b per TDD frame are transmitted in both directions. There is a 3.5- or 5.5-b duration guard period (GP) between the uplink and downlink transmissions; half of the time the CPP is transmitting and the other half of the time the CFP is transmitting, with the other part listening. Although the guard period wastes some channel capacity, it allows a finite time for both the CPP and CFP to switch from transmission to reception and vice versa. The burst structure of Fig. 29.1 is used during normal operation across an established link for the transmission of adaptive differential pulse code modulated (ADPCM) speech at 32 kb/s according to the CCITT G.721 standard in a so-called B channel or bearer channel. The D channel, or signalling channel, is used for the transmission of link control signals. This specific burst structure is referred to as a multiplex one (M1) frame.

FIGURE 29.1: M1 burst and TDD frame structure.

Since the speech signal is encoded according to the CCITT G.721 recommendation at 32 kb/s, the TDD bit rate must be in excess of 64 kb/s in order to provide the idle guard space of 3.5- or 5.5-b interval duration plus some signalling capacity. This is how channel capacity is sacrificed to provide the GP. Therefore, the transmission bit rate is stipulated to be 72 kb/s and the TDD frame length is 2 ms, during which 144 bit intervals can be accommodated. As demonstrated in Fig. 29.1, 66 or 68 b are transmitted in both the uplink and downlink burst, and taking into account the guard spaces, the total transmission frame is constituted by (2·68) + 3.5 + 4.5 = 144 b or, equivalently, by (2·66) + 5.5 + 6.5 = 144 b. The 66-b transmission format is compulsory, whereas the 68-b format is optional. In the 66-b burst there is one D bit dedicated to signalling at both ends of the burst, whereas in the 68-b burst the two additional bits are also assigned to signalling. Accordingly, the signalling rate becomes 2 b/2 ms or 4 b/2 ms, corresponding to 1 kb/s or 2 kb/s signalling rates.
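The frame arithmetic can be reproduced with a short Python sketch. The bit rate, frame length, and burst sizes are taken from the text; the 64 B-channel bits per burst follow from 32 kb/s of ADPCM speech carried in one burst per 2-ms frame. The function name is illustrative only.

```python
# Frame budget of the M1 multiplex, using the figures quoted in the text.
BIT_RATE = 72_000        # transmission bit rate in b/s
FRAME_MS = 2             # TDD frame duration in ms
B_BITS = 64              # ADPCM speech bits per burst (32 kb/s over 2 ms)

frame_bits = BIT_RATE * FRAME_MS // 1000     # bit intervals per frame

def m1_budget(burst_bits):
    """Return (total guard bits per frame, D bits per burst, D rate in b/s)."""
    guard_bits = frame_bits - 2 * burst_bits  # both guard periods together
    d_bits = burst_bits - B_BITS              # signalling bits per burst
    d_rate = d_bits * 1000 // FRAME_MS        # signalling rate
    return guard_bits, d_bits, d_rate

print(frame_bits)                             # 144
for burst in (66, 68):
    print(burst, m1_budget(burst))
```

For the 68-b burst the two guard periods share 8 bit intervals, and for the 66-b burst they share 12, while the signalling rate comes out at 2 kb/s and 1 kb/s, respectively.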

29.3.3 Power Ramping, Guard Period, and Propagation Delay

As mentioned before and suggested by Fig. 29.1, there is a 3.5- or 5.5-b interval duration GP between transmitted and received bursts. Since the transmission bit rate is 72 kb/s, the bit interval becomes about 1/(72 kb/s) ≈ 13.9 µs and, hence, the GP duration is about 49 µs or 76 µs. This GP serves a number of purposes. Primarily, the GP allows the transmitter to ramp up and ramp down the transmitted signal level smoothly over a finite time interval at the beginning and end of the transmitted burst. This is necessary because toggling the transmitted signal instantaneously is equivalent to multiplying it by a rectangular time-domain window function, which corresponds in the frequency domain to convolving the transmitted spectrum with a sinc function. This convolution would result in spectral side-lobes over a very wide frequency range, which would interfere with adjacent channels. Furthermore, due to the introduction of the guard period, both the CFP and CPP can tolerate a limited propagation delay, but the entire transmitted burst must arrive within the receiver’s window, otherwise the last transmitted bits cannot be decoded.
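The guard period durations follow directly from the bit rate, as the following minimal Python check shows. Only the 72-kb/s rate and the two guard lengths are taken from the text; the function name is illustrative.

```python
# Guard period duration for the two guard lengths quoted in the text.
BIT_RATE = 72_000                      # transmission bit rate in b/s
BIT_US = 1e6 / BIT_RATE                # bit interval in microseconds

def guard_duration_us(guard_bits):
    """Duration in microseconds of a guard period of guard_bits intervals."""
    return guard_bits * BIT_US

print(round(BIT_US, 1))                # 13.9
for guard_bits in (3.5, 5.5):
    print(guard_bits, round(guard_duration_us(guard_bits), 1))
```

The results of roughly 48.6 µs and 76.4 µs agree with the rounded values of 49 µs and 76 µs given above.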


29.3.4 Power Control

In order to minimize the battery drain and the cochannel interference load imposed upon cochannel users, the CT-2 system provides a power control option. The CPPs must be able to transmit at two different power levels, namely, either between 1 and 10 mW or at a level between 12 and 20 dB lower. The mechanism for invoking the lower CPP transmission level is based on the received signal level at the CFP. If the CFP detects a received signal strength of more than 90 dB relative to 1 µV/m, it may instruct the CPP to drop its transmitted level by the specified 12–20 dB. Since 90 dB corresponds to a ratio of about 31,623, this received signal strength is about 31.6 mV/m, which for a 10-cm antenna length is equivalent to an antenna output voltage of about 3.16 mV. A further beneficial ramification of using power control is that by powering down CPPs that are in the vicinity of a telepoint-type multiple-transceiver CFP, the CFP’s receiver will not be so prone to being desensitised by the high-powered close-in CPPs, which would severely degrade the reception quality of more distant CPPs.
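The decibel conversions in this paragraph can be checked numerically. The sketch below follows the same simplification as the text, namely that the antenna output voltage is the field strength multiplied by a 10-cm antenna length; the function names are illustrative.

```python
# Field strength threshold converted to an antenna voltage.
# ASSUMPTION (as in the text): the induced voltage is simply the field
# strength multiplied by a 10-cm antenna length.
def db_to_ratio(db):
    """Convert a voltage (field-strength) level in dB to a linear ratio."""
    return 10 ** (db / 20)

threshold_uv_per_m = 1.0 * db_to_ratio(90)        # 90 dB relative to 1 uV/m
antenna_mv = threshold_uv_per_m * 0.10 / 1000     # 10 cm, then uV -> mV

print(round(threshold_uv_per_m), round(antenna_mv, 2))   # 31623 3.16

def reduced_power_mw(power_mw, drop_db):
    """Transmit power after a reduction of drop_db (power ratio, 10 log10)."""
    return power_mw / 10 ** (drop_db / 10)

# Range of the low-power setting for a CPP normally transmitting 10 mW
print(reduced_power_mw(10, 12), reduced_power_mw(10, 20))
```

Note that the field-strength ratio uses the 20 log10 convention for voltages, whereas the transmit power reduction uses the 10 log10 convention for powers.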

29.4 Burst Formats

As already mentioned in the previous section on the radio interface, there are three different subchannels assisting the operation of the CT-2 system, namely, the voice/data channel or B channel, the signalling channel or D channel, and the burst synchronization channel or SYN channel. According to the momentary system requirements, a variable fraction of the overall channel capacity or, equivalently, a variable fraction of the bandwidth can be allocated to any of these channels. Each different channel capacity or bandwidth allocation mode is associated with a different burst structure and accordingly bears a different name. The corresponding burst structures are termed multiplex one (M1), multiplex two (M2), and multiplex three (M3), of which multiplex one, used during the normal operation of established links, has already been described in the previous section. Multiplex two and three will be extensively used during link setup and establishment in subsequent sections, as further details of the system’s operation are unravelled. Signalling layer one (L1) defines the burst formats multiplex one–three just mentioned and outlines the calling channel detection procedures, as well as link setup and establishment techniques. Layer two (L2) deals with issues of acknowledged and unacknowledged information transfer over the radio link, error detection and correction by retransmission, correct ordering of messages, and link maintenance aspects. The burst structure multiplex two is shown in Fig. 29.2. It is constituted by two 16-b D-channel segments at both sides of the 10-b preamble (P) and the 24-b frame synchronization pattern (SYN), and its signalling capacity is 32 b/2 ms = 16 kb/s. Note that the M2 burst does not carry any B-channel information; it is dedicated to synchronization purposes. The 32-b D-channel message is split into two 16-b segments in order to prevent any 24-b fraction of the 32-b word from emulating the 24-b SYN segment, which would result in synchronization misalignment. Since the CFP plays the role of the master in a telepoint scenario communicating with many CPPs, all of the CPPs’ actions must be synchronized to those of the CFP. Therefore, if the CPP attempts to initiate a call, the CFP will reinitiate it using the M2 burst, while imposing its own timing structure. The 10-b preamble consists of an alternating zero/one sequence and assists in the operation of the clock recovery circuitry, which has to be able to recover the clock frequency before the arrival of the SYN sequence in order to be able to detect it. The SYN sequence is a unique word determined by computer search, which has a sharp autocorrelation peak, and its function is discussed later. The way the M2 and M3 burst formats are used for signalling purposes will be made explicit in our further discussions when considering the link setup procedures.


FIGURE 29.2: CT2 multiplex two burst structure.

The specific SYN sequences used by the CFP and the CPP are shown in Table 29.1 along with the so-called channel marker (CHM) sequences used for synchronization purposes by the M3 burst format. Their differences will be made explicit during our further discourse. Observe from the table that the sequences used by the CFP and CPP, namely, SYNF, CHMF and SYNP, CHMP, respectively, are each other’s bit-wise inverses. This was introduced in order to prevent CPPs and CFPs from calling each other directly. The CHM sequences are used, for instance, in residential applications, where the CFP can issue an M2 burst containing a 24-b CHMF sequence and a so-called poll message mapped on to the D-channel bits in order to wake up the specific CPP called. When the called CPP responds, the CFP changes the CHMF to SYNF in order to prevent waking up further CPPs unnecessarily.

TABLE 29.1 CT-2 Synchronization Patterns

         MSB (sent last)                 LSB (sent first)
CHMF     1011  1110  0100  1110  0101  0000
CHMP     0100  0001  1011  0001  1010  1111
SYNCF    1110  1011  0001  1011  0000  0101
SYNCP    0001  0100  1110  0100  1111  1010
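The bit-wise inverse relationship between the fixed-part and portable-part patterns of Table 29.1 can be confirmed with a short Python check. The pattern strings are copied from the table; the helper names are illustrative.

```python
# Patterns copied from Table 29.1 (written left to right, MSB to LSB).
PATTERNS = {
    "CHMF":  "1011 1110 0100 1110 0101 0000",
    "CHMP":  "0100 0001 1011 0001 1010 1111",
    "SYNCF": "1110 1011 0001 1011 0000 0101",
    "SYNCP": "0001 0100 1110 0100 1111 1010",
}

def to_int(pattern):
    """Convert a spaced binary string into an integer."""
    return int(pattern.replace(" ", ""), 2)

def inverse_24(value):
    """Bit-wise inverse of a 24-bit word."""
    return value ^ 0xFFFFFF

for fixed, portable in (("CHMF", "CHMP"), ("SYNCF", "SYNCP")):
    ok = inverse_24(to_int(PATTERNS[fixed])) == to_int(PATTERNS[portable])
    print(fixed, portable, ok)   # True for both pairs
```

Because each portable-part word is the exact complement of its fixed-part counterpart, a receiver correlating against its expected pattern sees the strongest possible mismatch when a unit of its own kind transmits.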

Since the CT-2 system does not entail mobility functions, such as registration of visiting CPPs in other than their own home cells, in telepoint applications all calls must be initiated by the CPPs. Hence, in this scenario when the CPP attempts to set up a link, it uses the so-called multiplex three burst format displayed in Fig. 29.3. The design of the M3 burst reflects that the CPP initiating the call is oblivious of the timing structure of the potentially suitable target CFP, which can detect access attempts only during its receive window, but not while the CFP is transmitting. Therefore, the M3 format is rather complex at first sight, but it is well structured, as we will show in our further discussions. Observe in the figure that in the M3 format there are five consecutive 2-ms long 144-b transmitted bursts, followed by two idle frames, during which the CPP listens in order to determine whether its 24-b CHMP sequence has been detected and acknowledged by the CFP. This process can be followed by consulting Fig. 29.6, which will be described in depth after considering the detailed construction of the M3 burst. The first four of the five 2-ms bursts are identical D-channel bursts, whereas the fifth one serves as a synchronization message and has a different construction. Observe, furthermore, that both the first four 144-b bursts as well as the fifth one contain four so-called submultiplex segments, each of which hosts a total of (6 + 10 + 8 + 10 + 2) = 36 b. In the first four 144-b bursts there are (6 + 8 + 2) = 16 one/zero clock-synchronizing P bits and (10 + 10) = 20 D bits or signalling bits. Since the D-channel message is constituted by two 10-b half-messages, the first half of the D-message is marked by the + sign in the figure.

FIGURE 29.3: CT2 multiplex three burst structure.

As mentioned in the context of M2, the D-channel bits are split into two halves and interspersed with the preamble segments in order to ensure that these bits do not emulate valid CHM sequences. Without splitting the D bits this could happen upon concatenating the one/zero P bits with the D bits, since the tail of the SYNF and SYNP sequences is also a one/zero segment. In the fifth 144-b M3 burst, each of the four submultiplex segments is constituted by 12 preamble bits and 24 CPP channel marker (CHMP) bits. The four-fold submultiplex M3 structure ensures that irrespective of how the CFP’s receive window is aligned with the CPP’s transmission window, the CFP will be able to capture one of the four submultiplex segments of the fifth M3 burst, establish clock synchronization during the preamble, and lock on to the CHMP sequence. Once the CFP has successfully locked on to one of the CHMP words, the corresponding D-channel messages comprising the CPP identifier can be decoded. If the CPP identifier has been recognized, the CFP can attempt to reinitialize the link using its own master synchronization.
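The bit counts of the M3 format quoted above can be tallied with a brief Python sketch. The field sizes are taken from the text; the ordering of the fields inside each submultiplex segment is an assumption read off the order of the sums (6 + 10 + 8 + 10 + 2) and (12 + 24), and the names are illustrative.

```python
# Bit accounting for the multiplex three (M3) format.
# ASSUMPTION: the field order within a segment follows the order in which
# the sizes are listed in the text; only the sizes themselves are given.
FRAME_BITS = 144
SUBMUX_D = [("P", 6), ("D", 10), ("P", 8), ("D", 10), ("P", 2)]  # bursts 1-4
SUBMUX_SYNC = [("P", 12), ("CHMP", 24)]                           # burst 5

def field_totals(layout):
    """Sum the bits of each field type in one submultiplex segment."""
    totals = {}
    for name, nbits in layout:
        totals[name] = totals.get(name, 0) + nbits
    return totals

for layout in (SUBMUX_D, SUBMUX_SYNC):
    totals = field_totals(layout)
    segment_bits = sum(totals.values())
    print(totals, segment_bits, 4 * segment_bits == FRAME_BITS)

# Five transmitted 2-ms frames followed by two idle (listening) frames
cycle_ms = (5 + 2) * 2
print(cycle_ms)   # 14
```

Both segment types occupy 36 b, so four of them fill a 144-b burst exactly, which is what allows a CFP receive window with arbitrary alignment to capture at least one complete segment.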

29.5 Signalling Layer Two (L2)

29.5.1 General Message Format

The signalling L2 is responsible for acknowledged and unacknowledged information transfer over the air interface, error detection and correction by retransmission, as well as for the correct ordering of messages in the acknowledged mode. Its further functions are the link end-point identification and link maintenance for both CPP and CFP, as well as the definition of the L2 and L3 interface. Compliance with the L2 specifications will ensure the adequate transport of messages between the terminals of an established link. The L2 recommendations, however, do not define the meaning of messages; this is specified by the L3 messages, although some of the messages are left undefined in order to accommodate future system improvements.

The L3 messages are broken down into a number of standard packets, each constituted by one or more codewords (CW), as shown in Fig. 29.4. The codewords have a standard length of eight octets, and each packet contains up to six codewords. The first codeword in a packet is the so-called address codeword (ACW) and the subsequent ones, if present, are data codewords (DCW). The first octet of the ACW of each packet contains a variety of parameters, of which the binary flag L3 END is indicated in Fig. 29.4, and it is set to zero in the last packet. If the L3 message transmitted is mapped onto more than one packet, the packets must be numbered up to N. The address codeword is always preceded by a 16-b D-channel frame synchronization word SYNCD. Furthermore, each eight-octet CW is protected by a 16-b parity-check word occupying its last two octets. The binary Bose–Chaudhuri–Hocquenghem BCH(63,48) code is used to encode the first six octets, or 48 b, by adding 15 parity b to yield 63 b. Then bit 7 of octet 8 is inverted and bit 8 of octet 8 is added such that the 64-b codeword has an even parity. If there are no D-channel packets to send, a 3-octet idle message IDLE D constituted by zero/one reversals is transmitted.

The 8-octet format of the ACWs and DCWs is made explicit in Fig. 29.5, where the two parity check octets occupy octets 7 and 8. The first octet hosts a number of control bits. Specifically, bit 1 is set to logical one for an ACW and to zero for a DCW, whereas bit 2 represents the so-called format type (FT) bit. FT = 1 indicates that the variable-length packet format is used for the transfer of L3 messages, whereas FT = 0 implies that a fixed-length link setup format is used for link end-point addressing and service requests. FT is only relevant to ACWs, and in DCWs it has to be set to one.
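The bit budget of a codeword and of a complete packet can be checked with a short sketch. The function names below are illustrative only, the BCH(63,48) parity bits are taken as given rather than computed (the generator polynomial is not stated in the text), and the inversion of bit 7 of octet 8 is deliberately left out.

```python
SYNCD_BITS = 16        # D-channel frame synchronization word
INFO_BITS = 48         # first six octets of a codeword
BCH_PARITY_BITS = 15   # BCH(63,48) parity, not computed in this sketch
CODEWORD_BITS = 64     # eight octets
MAX_CODEWORDS = 6      # one ACW plus up to five DCWs


def overall_parity_bit(bits63):
    """Return the extra bit that gives a 63-b word an even number of ones."""
    if len(bits63) != INFO_BITS + BCH_PARITY_BITS:
        raise ValueError("expected 63 bits")
    return sum(bits63) % 2


def packet_bits(n_codewords):
    """D bits occupied by SYNCD plus n_codewords codewords."""
    if not 1 <= n_codewords <= MAX_CODEWORDS:
        raise ValueError("a packet holds one to six codewords")
    return SYNCD_BITS + n_codewords * CODEWORD_BITS


if __name__ == "__main__":
    word = [1, 0, 1] + [0] * 60
    codeword = word + [overall_parity_bit(word)]
    print(len(codeword), sum(codeword) % 2)   # 64 bits, even parity
    print(packet_bits(1), packet_bits(6))     # shortest and longest packet
```

A single-codeword packet therefore occupies 80 D bits including SYNCD, which is the figure used in the link setup discussion later in the chapter.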

FIGURE 29.4: General L2 and L3 message format.

FIGURE 29.5: Fixed format packets mapped on M1, M2, and M3 during link initialization and on M1 and M2 during handshake.

29.5.2 Fixed Format Packet

As an example, let us focus our attention on the fixed format scenario associated with FT = 0. The corresponding codeword format defined for use in M1, M2, and M3 for link initiation and in M1 and M2 for handshaking is displayed in Fig. 29.5. Bits 1 and 2 have already been discussed, whereas the 2-bit link status (LS) field is used during call setup and handshaking. The encoding of the four possible LS messages is given in Table 29.2. The aim of these LS messages will become more explicit during our further discussions with reference to Fig. 29.6 and Fig. 29.7. Specifically, link request is transmitted from the CPP to the CFP either in an M3 burst as the first packet during CPP-initiated call setup and link re-establishment, or returned as a poll response in an M2 burst from the CPP to the CFP, when the CPP is responding to a call. Link grant is sent by the CFP in response to a link request originating from the CPP. In octets 5 and 6 it hosts the so-called link identification (LID) code, which is used by the CPP, for example, to address a specific CFP or a requested service. The LID is also used to maintain link reference during handshake exchanges and link re-establishment. The two remaining link status handshake messages, namely, ID OK and ID lost, are used to report to the far end whether a positive confirmation of adequate link quality has been received within the required time-out period. These issues will be revisited during our further elaborations.

Returning to Fig. 29.5, we note that the fixed packet format (FT = 0) also contains a 19-b handset identification code (HIC) and an 8-b manufacturer identification code (MIC). The concatenated HIC and MIC fields jointly form the unique 27-b portable identity code (PIC), serving as a link end-point identifier. Lastly, we have to note that bit 5 of octet 1 represents the signalling rate (SR) request/response bit, which is used by the calling party to specify the choice of the 66- or 68-b M1 format. Specifically, SR = 1 represents the four bit/burst M1 signalling format. The first 6 octets are then protected by the parity check information contained in octets 7 and 8.
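The field sizes and the link status encoding quoted above can be captured in a few lines. This is only a sketch: the names are hypothetical, and the placement of the HIC in the high-order bits of the PIC is an assumption, since the text gives the field widths but not their bit order.

```python
from enum import Enum

HIC_BITS = 19   # handset identification code
MIC_BITS = 8    # manufacturer identification code


class LinkStatus(Enum):
    """(LS1, LS0) pairs as listed in Table 29.2."""
    LINK_REQUEST = (0, 0)
    LINK_GRANT = (0, 1)
    ID_OK = (1, 0)
    ID_LOST = (1, 1)


def decode_link_status(ls1, ls0):
    """Map the two LS bits to the corresponding message."""
    return LinkStatus((ls1, ls0))


def make_pic(hic, mic):
    """Concatenate HIC and MIC into a 27-b PIC (assumed order: HIC first)."""
    if not 0 <= hic < (1 << HIC_BITS):
        raise ValueError("HIC must fit in 19 bits")
    if not 0 <= mic < (1 << MIC_BITS):
        raise ValueError("MIC must fit in 8 bits")
    return (hic << MIC_BITS) | mic


if __name__ == "__main__":
    print(decode_link_status(0, 1).name)
    print(make_pic(1, 2).bit_length() <= HIC_BITS + MIC_BITS)
```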

TABLE 29.2 Encoding of Link Status Messages

LS1   LS0   Message
0     0     Link request
0     1     Link grant
1     0     ID OK
1     1     ID lost

29.6

CPP-Initiated Link Setup Procedures

Calls can be initiated at both the CPP and CFP, and the call initiation and detection procedures invoked depend on which party initiated the call. Let us first consider calling channel detection at the CFP, which ensues as follows. Under the instruction of the CFP control scheme, the RF synthesizer tunes to a legitimate RF channel and after a certain settling time commences reception. Upon receiving the M3 bursts from the CPP, the automatic gain control (AGC) circuitry adjusts its gain factor, and during the 12-b preamble in the fifth M3 burst, bit synchronization is established. This specific 144-b M3 burst is transmitted every 14 ms, corresponding to every seventh 144-b burst. Now the CFP is ready to bit-synchronously correlate the received sequences with its locally stored CHMP word in order to identify any CHMP word arriving from the CPP. If no valid CHMP word is detected, the CFP may retune itself to the next legitimate RF channel, and so on.

As mentioned, the call identification and link initialization process is shown in the flowchart of Fig. 29.6. If a valid 24-b CHMP word is identified, D-channel frame synchronization can take place using the 16-b SYNCD sequence, and the next 8-octet L2 D-channel message, delivering the link request handshake portrayed earlier in Fig. 29.5 and Table 29.2, is decoded by the CFP. The required 16 + 64 = 80 D bits are accommodated in this scenario by the 4·20 = 80 D bits of the next four 144-b bursts of the M3 structure, where the 20 D bits of the four submultiplex segments are transmitted four times within the same burst before the D message changes. If the decoded LID code of Fig. 29.5 is recognized by the CFP, the link may be reinitialized based on the master’s timing information using the M2 burst associated with SYNF and containing the link grant message addressed to the specific CPP identified by its PID. Otherwise the CFP returns to its scanning mode and attempts to detect the next CHMP message.

The reception of the CFP’s 24-b SYNF segment embedded in the M2 message shown previously in Fig. 29.2 allows the CPP to identify the position of the CFP’s transmit and receive windows and, hence, the CPP now can respond with another M2 burst within the receive window of the CFP. Following a number of M2 message exchanges, the CFP then sends an L3 message to instruct the CPP to switch to M1 bursts, which marks the commencement of normal voice communications and the end of the link setup session.
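The CFP side of this procedure reduces to a simple decision made on each RF channel. The following is a minimal sketch of that decision logic only; the observation dictionary, the function names, and the action strings are all hypothetical stand-ins for the receiver hardware and the flowchart boxes, not part of the CT-2 recommendation.

```python
def cfp_scan_step(observation, known_lids):
    """Decide what the CFP does after listening on one RF channel.

    `observation` is a hypothetical summary of what the receiver saw:
    {"chmp": True/False, "lid": decoded LID or None}.
    """
    if not observation.get("chmp"):
        return "retune"            # no valid CHMP word on this channel
    if observation.get("lid") in known_lids:
        return "send link grant"   # M2 burst carrying SYNF and link grant
    return "resume scanning"       # LID not recognized


def cfp_scan(channel_observations, known_lids):
    """Return the first channel on which a link grant would be sent."""
    for channel, observation in channel_observations:
        if cfp_scan_step(observation, known_lids) == "send link grant":
            return channel
    return None


if __name__ == "__main__":
    seen = [(1, {"chmp": False}),
            (2, {"chmp": True, "lid": 7}),
            (3, {"chmp": True, "lid": 42})]
    print(cfp_scan(seen, known_lids={42}))
```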

29.7

CFP-Initiated Link Setup Procedures

Similar procedures are followed when the CPP is being polled. The CFP transmits the 24-b CHMF words hosted by the 24-b SYN segment of the M2 burst shown in Fig. 29.2 in order to indicate that one or more CPPs are being paged. This process is displayed in the flowchart of Fig. 29.7, as well as in the timing diagram displayed in Fig. 29.8. The M2 D-channel messages convey the identifiers of the polled CPPs. The CPPs keep scanning all 40 legitimate RF channels in order to pinpoint any 24-b CHMF words.

FIGURE 29.6: Flowchart of the CT-2 link initialization by the CPP.

FIGURE 29.7: Flowchart of the CT-2 link initialization by the CFP.

FIGURE 29.8: CT-2 call detection by the CPP.

Explicitly, the CPP control scheme notifies the RF synthesizer to retune to the next legitimate RF channel if no CHMF words have been found on the current one. The synthesizer needs a finite time to settle on the new center frequency and then starts receiving again. Observe in Fig. 29.8 that at this stage only the CFP is transmitting the M2 bursts; hence, the uplink half of the 2-ms TDD frame is unused. Since the M2 burst commences with the D-channel bits arriving from the CFP, the CPP receiver’s AGC will have to settle during this 16-b interval, which corresponds to about 16·1/[72 kb/s] ≈ 0.22 ms. Upon the arrival of the 10 alternating one–zero preamble bits, bit synchronization is established. Now the CPP is ready to detect the CHMF word using a simple correlator circuitry, which establishes the appropriate frame synchronization. If, however, no CHMF word is detected within the receive window, the synthesizer will be retuned to the next RF channel, and the same procedure is repeated until a CHMF word is detected.

When a CHMF word is correctly decoded by the CPP, the CPP is capable of decoding the D-channel bits frame- and bit-synchronously. Upon decoding the D-channel message of the M2 burst, the CPP identifier (ID) constituted by the LID and PID segments of Fig. 29.5 is detected and compared to the CPP’s own ID in order to decide whether the call is for this specific CPP. If so, the CPP ID is reflected back to the CFP along with a SYNP word, which is included in the SYN segment of an uplink M2 burst. This channel scanning and retuning process continues until a legitimate incoming call is detected or the CPP intends to initiate a call. More precisely, if the specific CPP in question is polled and its own ID is recognized, the CPP sends its poll response message in three consecutive M2 bursts, since the capacity of a single M2 burst is 32 D bits only, while the handshake messages of Fig. 29.5 and Table 29.2 require 8 octets preceded by a 16-b SYNCD segment. If by this time all paged CPPs have responded, the CFP changes the CHMF word to a SYNF word in order to prevent activating dormant CPPs that are not being paged. If any of the paged CPPs intends to set up the link, then it will change its poll response to an L2 link request message, in response to which the CFP will issue an M2 link grant message, as seen in Fig. 29.7, and from now on the procedure is identical to that of the CPP-initiated link setup portrayed in Fig. 29.6.
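Two of the figures quoted above follow from simple arithmetic, which the sketch below reproduces: the AGC settling interval at the 72 kb/s channel rate, and the number of M2 bursts needed for a poll response. The constant and function names are illustrative choices, not CT-2 terminology.

```python
import math

BIT_RATE_BPS = 72_000       # channel bit rate
M2_D_BITS_PER_BURST = 32    # D-channel capacity of one M2 burst
SYNCD_BITS = 16             # sync word preceding the codeword
CODEWORD_BITS = 8 * 8       # one 8-octet handshake codeword


def agc_settle_time_ms(bits=16):
    """Duration of `bits` channel bits in milliseconds."""
    return 1000.0 * bits / BIT_RATE_BPS


def m2_bursts_needed(codewords=1):
    """M2 bursts needed to carry SYNCD plus `codewords` codewords."""
    total = SYNCD_BITS + codewords * CODEWORD_BITS
    return math.ceil(total / M2_D_BITS_PER_BURST)


if __name__ == "__main__":
    print(round(agc_settle_time_ms(), 2), "ms")
    print(m2_bursts_needed(), "M2 bursts")
```

The 80 D bits of a poll response do not fill the third burst completely, which is why three bursts are needed rather than two and a half.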

29.8

Handshaking

Having established the link, voice communication is maintained using M1 bursts, and the link quality is monitored by sending handshaking (HS) signalling messages using the D-channel bits. Handshaking messages must be sent at intervals of between 400 ms and 1000 ms. The CT-2 codewords ID OK, ID lost, link request and link grant of Table 29.2 all represent valid handshakes. When using M1 bursts, however, the transmission of these 8-octet messages using the 2- or 4-b/2 ms D-channel segment must be spread over 32 or 16 M1 bursts, respectively, corresponding to 64 or 32 ms.
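The burst counts follow directly from the D-channel capacity of an M1 burst. As a quick check, assuming only the 64-b codeword is counted (as in the figures above) and using an illustrative function name:

```python
FRAME_MS = 2             # one M1 burst per 2-ms TDD frame
MESSAGE_BITS = 8 * 8     # one 8-octet handshake codeword


def m1_transfer(d_bits_per_burst):
    """Return (bursts, milliseconds) needed to send one codeword."""
    bursts = -(-MESSAGE_BITS // d_bits_per_burst)   # ceiling division
    return bursts, bursts * FRAME_MS


if __name__ == "__main__":
    for d_bits in (2, 4):
        bursts, ms = m1_transfer(d_bits)
        print(f"{d_bits} D bits/burst: {bursts} bursts, {ms} ms")
```

Even at the slower rate a codeword takes 64 ms, comfortably inside the minimum handshake interval of 400 ms.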

Let us now focus our attention on the handshake protocol shown in Fig. 29.9. Suppose that the CPP’s handshake interval of Thtx_p = 0.4 s since the start of the last transmitted handshake has expired, and hence the CPP prepares to send a handshake message HS_p. If the CPP has received a valid HS_f message from the CFP within the last Thrx_p = 1 s, the CPP sends an HS_p = ID OK message to the CFP, otherwise an HS_p = ID Lost message. Furthermore, if the valid handshake was HS_f = ID OK, the CPP will reset its HS_f lost timer Thlost_p to 10 s. The CFP will maintain a 1-s timer referred to as Thrx_f, which is reset to its initial value upon the reception of a valid HS_p from the CPP.

FIGURE 29.9: CT-2 handshake algorithms.

The CFP’s actions also follow the structure of Fig. 29.9 upon simply interchanging CPP with CFP and the descriptor p with f. If the Thrx_f = 1 s timer expires without the reception of a valid HS_p from the CPP, then the CFP will send its ID Lost HS_f message to the CPP instead of the ID OK message and will not reset the Thlost_f = 10 s timer. If, however, the CFP happens to detect a valid HS_p, which can be any of the ID OK, ID Lost, link request and link grant messages of Table 29.2, arriving from the CPP, the CFP will reset its Thrx_f = 1 s timer and resume transmitting the ID OK HS_f message instead of the ID Lost. Should any of the HS messages go astray for more than 3 s, the CPP or the CFP may try to re-establish the link on the current or another RF channel. Again, although any of the ID OK, ID Lost, link request and link grant messages represent valid handshakes, only the reception of the ID OK HS message is allowed to reset the Thlost = 10 s timer at both the CPP and CFP. If this timer expires, the link will be relinquished and the call dropped. The handshake mechanism is further illustrated in Fig. 29.10, where two different scenarios are exemplified, portraying the situation when the HS message sent by the CPP to the CFP is lost or, conversely, when that transmitted by the CFP is corrupted.
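The timer rules for one end of the link can be condensed into a small state sketch. The class and method names are hypothetical, and the 3-s re-establishment attempt is omitted; only the 1-s receive window and the 10-s loss timer taken from the text are modelled.

```python
from dataclasses import dataclass

VALID_HANDSHAKES = {"ID OK", "ID Lost", "link request", "link grant"}


@dataclass
class HandshakeEnd:
    """Handshake state of one link end (CPP or CFP)."""
    t_rx: float = 1.0      # Thrx: window for a valid far-end handshake, s
    t_lost: float = 10.0   # Thlost: link is relinquished after this, s
    since_valid: float = float("inf")   # time since last valid handshake
    since_id_ok: float = 0.0            # time since last ID OK

    def tick(self, dt):
        """Advance both timers by dt seconds."""
        self.since_valid += dt
        self.since_id_ok += dt

    def receive(self, message):
        """Any valid handshake restarts Thrx; only ID OK restarts Thlost."""
        if message in VALID_HANDSHAKES:
            self.since_valid = 0.0
            if message == "ID OK":
                self.since_id_ok = 0.0

    def next_handshake(self):
        """Message this end sends when its transmit interval expires."""
        return "ID OK" if self.since_valid <= self.t_rx else "ID Lost"

    def link_lost(self):
        return self.since_id_ok >= self.t_lost


if __name__ == "__main__":
    cpp = HandshakeEnd()
    cpp.receive("ID OK")
    cpp.tick(0.4)
    print(cpp.next_handshake())   # far end heard recently
    cpp.tick(0.7)
    print(cpp.next_handshake())   # nothing valid within the last second
```

Keeping two separate elapsed-time counters mirrors the distinction drawn in the text: a received ID Lost message is a valid handshake, yet it does not stop the 10-s loss timer from running down.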

FIGURE 29.10: Handshake loss scenarios.

Considering the first scenario, during error-free communications the CPP sends HS_p = ID OK, and upon receiving it the CFP resets its Thlost_f timer to 10 s. In due course it sends an HS_f = ID OK acknowledgement, which also arrives free from errors. The CPP resets its Thlost_p timer to 10 s and, after the elapse of the 0.4–1 s handshake interval, issues an HS_p = ID OK message, which does not reach the CFP. Hence, the Thlost_f timer is now reduced to 9 s and an HS_f = ID Lost message is sent to the CPP. Upon reception of this, the CPP cannot reset its Thlost_p timer to 10 s but can respond with an HS_p = ID OK message, which again goes astray, forcing the CFP to further reduce its Thlost_f timer to 8 s. The CFP issues the valid handshake HS_f = ID Lost, which arrives at the CPP, where the lack of HS_f = ID OK reduces Thlost_p to 8 s. Now the corruption of the issued HS_p = ID OK reduces Thlost_f to 7 s, in which event the link may be reinitialized using the M3 burst. The second example of Fig. 29.10, in which the HS_f message is corrupted, can be followed in the same way.

29.9

Main Features of the CT-2 System

In our previous discourse we have given an insight into the algorithmic procedures of the CT-2 MPT 1375 recommendation. We have briefly highlighted the four-part structure of the standard dealing with the radio interface, signalling layers 1 and 2, signalling layer 3, and the speech coding issues, respectively. There are forty 100-kHz wide RF channels in the band 864.15–868.15 MHz, and the 72-kb/s bit stream drives a Gaussian filtered FSK modulator. The multiple access technique is TDD, transmitting 2-ms duration, 144-b M1 bursts during normal voice communications, which deliver the 32-kb/s ADPCM-coded speech signal. During link establishment the M2 and M3 bursts are used, which were also portrayed in this treatise, along with a range of handshaking messages and scenarios.

Defining Terms

AFC: Automatic frequency correction
CAI: Common air interface
CFP: Cordless fixed part
CHM: Channel marker sequence
CHMF: CFP channel marker
CHMP: CPP channel marker
CPP: Cordless portable part
CT: Cordless telephone
DCA: Dynamic channel allocation
DCW: Data code word
DECT: Digital European cordless telecommunications system
FT: Frame format type bit
GFSK: Gaussian frequency shift keying
GP: Guard period
HIC: Handset identification code
HS: Handshaking
ID: Identifier
L2: Signalling layer 2
L3: Signalling layer 3
LAN: Local area network
LID: Link identification
LOS: Line of sight
LS: Link status
M1: Multiplex one burst format
M2: Multiplex two burst format
M3: Multiplex three burst format
MIC: Manufacturer identification code
MPT-1375: British CT2 standard
PCN: Personal communications network
PIC: Portable identification code
PLMR: Public land mobile radio
SNR: Signal-to-noise ratio
SR: Signalling rate bit
SYN: Synchronization sequence
SYNCD: 16-b D-channel frame synchronization word
TDD: Time division duplex multiple access scheme
TP: Telepoint

References

[1] Asghar, S., Digital European cordless telephone (DECT), In The Mobile Communications Handbook, Chap. 30, CRC Press, Boca Raton, FL, 1995.
[2] Gardiner, J.G., Second generation cordless (CT-2) telephony in the UK: telepoint services and the common air-interface, Elec. & Comm. Eng. J., 71–78, Apr. 1990.
[3] Hanzo, L., The Pan-European mobile radio system, In The Mobile Communications Handbook, Chap. 25, CRC Press, Boca Raton, FL, 1995.
[4] Jabbari, B., Dynamic channel assignment, In The Mobile Communications Handbook, Chap. 21, CRC Press, Boca Raton, FL, 1995.
[5] Ochsner, H., The digital European cordless telecommunications specification DECT, In Cordless Telecommunication in Europe, Tuttlebee, W.H.W., Ed., 273–285, Springer-Verlag, 1990.
[6] Steedman, R.A.J., The Common Air Interface MPT 1375, In Cordless Telecommunication in Europe, Tuttlebee, W.H.W., Ed., 261–272, Springer-Verlag, 1990.
[7] Steele, R., Ed., Mobile Radio Communications, Pentech Press, London, 1992.
[8] Tuttlebee, W.H.W., Ed., Cordless Telecommunication in Europe, Springer-Verlag, 1990.

Chan, W.; Gerson, I. & Miki, T. “Half-Rate Standards” Mobile Communications Handbook Ed. Jerry D. Gibson Boca Raton: CRC Press LLC, 1999

Half-Rate Standards

Wai-Yip Chan Illinois Institute of Technology

Ira Gerson Motorola Corporate Systems Research Laboratories

Toshio Miki NTT Mobile Communication Network, Inc.

30.1 Introduction
30.2 Speech Coding for Cellular Mobile Radio Communications
30.3 Codec Selection and Performance Requirements
30.4 Speech Coding Techniques in the Half-Rate Standards
30.5 Channel Coding Techniques in the Half-Rate Standards
30.6 The Japanese Half-Rate Standard
30.7 The European GSM Half-Rate Standard
30.8 Conclusions
Defining Terms
Acknowledgment
References
Further Information

30.1

Introduction

A half-rate speech coding standard specifies a procedure for digital transmission of speech signals in a digital cellular radio system. The speech processing functions that are specified by a half-rate standard are depicted in Fig. 30.1. An input speech signal is processed by a speech encoder to generate a digital representation at a net bit rate of Rs bits per second. The encoded bit stream representing the input speech signal is processed by a channel encoder to generate another bit stream at a gross bit rate of Rc bits per second, where Rc > Rs. The channel encoded bit stream is organized into data frames, and each frame is transmitted as payload data by a radio-link access controller and modulator. The net bit rate Rs counts the number of bits used to describe the speech signal, and the difference between the gross and net bit rates (Rc − Rs) counts the number of error protection bits needed by the channel decoder to correct and detect transmission errors. The output of the channel decoder is given to the speech decoder to generate a quantized version of the speech encoder’s input signal. In current digital cellular radio systems that use time-division multiple access (TDMA), a voice connection is allocated a fixed transmission rate (i.e., Rc is a constant). The operations performed by the speech and channel encoders and decoders and their input and output data formats are governed by the half-rate standards.

FIGURE 30.1: Digital speech transmission for digital cellular radio. Boxes with solid outlines represent processing modules that are specified by the half-rate standards.

Globally, three major TDMA cellular radio systems have been developed and deployed. The initial digital speech services offered by these cellular systems were governed by full-rate standards. Because of the rapid growth in demand for cellular services, the available transmission capacity in some areas is frequently saturated, eroding customer satisfaction. By providing essentially the same voice quality but at half the gross bit rates of the full-rate standards, half-rate standards can readily double the number of callers that can be serviced by the cellular systems. The gross bit rates of the full-rate and half-rate standards for the European Groupe Spécial Mobile (GSM), Japanese Personal Digital Cellular¹ (PDC), and North American cellular (IS-54) systems are listed in Table 30.1. The three systems were developed and deployed under different timetables. Their disparate full-rate and half-rate bit rates partly reflect this difference. At the time of writing (January, 1995), the European and the Japanese systems have each selected an algorithm for their respective half-rate codec. Standardization of the North American half-rate codec has not reached a conclusion, as none of the candidate algorithms has fully satisfied the standard’s requirements. Thus, we focus here on the Japanese and European half-rate standards and will only touch upon the requirements of the North American standard.

¹ Personal Digital Cellular was formerly Japanese Digital Cellular (JDC).

TABLE 30.1 Gross Bit Rates Used for Digital Speech Transmission in Three TDMA Cellular Radio Systems

                                                                         Gross Bit Rate, b/s
Standard Organization and Digital Cellular System                        Full Rate   Half Rate
European Telecommunications Standards Institute (ETSI), GSM              22,800      11,400
Research & Development Center for Radio Systems (RCR), PDC               11,200      5,600
Telecommunications Industry Association (TIA), IS-54                     13,000      6,500

30.2

Speech Coding for Cellular Mobile Radio Communications

Unlike the relatively benign transmission media commonly used in the public-switched telephone network (PSTN) for analog and digital transmission of speech signals, mobile radio channels are impaired by various forms of fading and interference effects. Whereas proper engineering of the radio link elements (modulation, power control, diversity, equalization, frequency allocation, etc.) ameliorates fading effects, burst and isolated bit errors still occur frequently. The net effect is such that speech communication may be required to be operational even for bit-error rates greater than 1%. In order to furnish reliable voice communication, typically half of the transmitted payload bits are devoted to error correction and detection.

It is common for low-bit-rate speech codecs to process samples of the input speech signal one frame at a time, e.g., 160 samples processed once every 20 ms. Thus, a certain amount of time is required to gather a block of speech samples, encode them, perform channel encoding, transport the encoded data over the radio channel, and perform channel decoding and speech synthesis. These processing steps of the speech codec add to the overall end-to-end transmission delay. Long transmission delay hampers conversational interaction. Moreover, if the cellular system is interconnected with the PSTN and a four-wire to two-wire (analog) circuit conversion is performed in the network, feedback signals called echoes may be generated across the conversion circuit. The echoes can be heard by the originating talker as a delayed and distorted version of his/her speech and can be quite annoying. The annoyance level increases with the transmission delay and may necessitate (at additional cost) the deployment of echo cancellers.

A consequence of user mobility is that the level and other characteristics of the acoustic background noise can be highly variable. Though acoustic noise can be minimized through suitable acoustic transduction design and the use of adaptive filtering/cancellation techniques [9, 13, 15], the speech encoding algorithm still needs to be robust against background noise of various levels and kinds (e.g., babble, music, noise bursts, and colored noise).

Processing complexity directly impacts the viability of achieving a circuit realization that is compact and has low power consumption, two key enabling factors of equipment portability for the end user. Factors that tend to result in low complexity are fixed-point instead of floating-point computation, lack of complicated arithmetic operations (division, square roots, transcendental functions), regular algorithm structure, small data memory, and small program memory. Since, in general, better speech quality can be achieved with increasing speech and channel coding delay and complexity, the digital cellular mobile-radio environment imposes conflicting and challenging requirements on the speech codec.
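To give a feel for the numbers involved, the sketch below converts the gross bit rates of Table 30.1 into bits per frame. The 20-ms frame is only the example quoted above; the actual frame lengths of the individual standards are not stated in this section, so the per-frame figures are illustrative rather than normative.

```python
# Gross bit rates in b/s from Table 30.1: (full rate, half rate)
GROSS_RATES = {
    "GSM": (22_800, 11_400),
    "PDC": (11_200, 5_600),
    "IS-54": (13_000, 6_500),
}


def bits_per_frame(rate_bps, frame_ms=20):
    """Gross bits available in one frame (20 ms assumed by default)."""
    return rate_bps * frame_ms // 1000


def sampling_rate_hz(samples_per_frame=160, frame_ms=20):
    """Sampling rate implied by the frame size and duration."""
    return samples_per_frame * 1000 // frame_ms


if __name__ == "__main__":
    print("sampling rate:", sampling_rate_hz(), "Hz")
    for system, (full, half) in GROSS_RATES.items():
        print(system, bits_per_frame(full), bits_per_frame(half))
```

Under this assumption a half-rate GSM frame would carry 228 gross bits to describe 160 speech samples, of which roughly half are typically spent on error protection.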

30.3

Codec Selection and Performance Requirements

The half-rate speech coding standards are drawn up through competitive testing and selection. From a set of candidate codec algorithms submitted by contending organizations, the one algorithm that meets basic selection criteria and offers the best performance is selected to form the standard. The codec performance measures and codec testing and selection procedures are set out in a test plan under the auspices of the organization (Table 30.1) responsible for the standardization process (see, e.g., [16]). Major codec characteristics evaluated are speech quality, delay, and complexity. The full-rate codec is also evaluated as a reference codec, and its evaluation scores form part of the selection criteria for the codec candidates.

The speech quality of each candidate codec is evaluated through listening tests. To conduct the tests, each candidate codec is required to process speech signals and/or encoded bit streams that have been preprocessed to simulate a range of operating conditions: variations in speaker voice and level, acoustic background noise type and level, channel error rate, and stages of tandem coding. During the tests, subjects listen to processed speech signals and judge their quality levels or annoyance levels on a five-point opinion scale. The opinion scores collected from the tests are suitably averaged over all trials and subjects for each test condition (see [11] for mean opinion score (MOS) and degradation mean opinion score).

The categorical opinion scales of the subjects are also calibrated using modulated noise reference units (MNRUs) [3]. Modulated noise better resembles the distortions created by speech codecs than noise that is uncorrelated with the speech signal. Modulated noise is generated by multiplying the speech signal with a noise signal. The resultant modulated noise is scaled to a desired power level and then added to the uncoded (clean) speech signal. The ratio between the power level of the speech signal and that of the modulated noise is expressed in decibels and given the notation dBQ. Under each test condition, subjects are presented with speech signals processed by the codecs as well as speech signals corrupted by modulated noise. Through presenting a range of modulated-noise levels, the subjects’ opinions are calibrated on the dBQ scale. Thereafter, the mean opinion scores obtained for the codecs can also be expressed on that scale.

For each codec candidate, a profile of scores is compiled, consisting of speech quality scores, delay measurements, and complexity estimates. Each candidate’s score profile is compared with that of the reference codec, ensuring that basic requirements are satisfied (see, e.g., [12]). An overall figure of merit for each candidate is also computed from the profile. The candidates, if any, that meet the basic requirements then compete on the basis of maximizing the figure of merit.

Basic performance requirements for each of the three half-rate standards are summarized in Table 30.2. In terms of speech quality, the GSM and PDC half-rate codecs are permitted to underperform their respective full-rate codecs by no more than 1 dBQ averaged over all test conditions and no more than 3 dBQ within each test condition. More stringently, the North American half-rate codec is required to furnish a speech-quality profile that is statistically equivalent to that of the North American full-rate codec as determined by a specific statistical procedure for multiple comparisons [16]. Since various requirements on the half-rate standards are set relative to their full-rate counterparts, an indication of the relative speech quality between the three half-rate standards can be deduced from the test results of De Martino [2] comparing the three full-rate codecs. The maximum delays in Table 30.2 apply to the total of the delays through the speech and channel encoders and decoders (Fig. 30.1).
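The modulated-noise construction described above can be sketched in a few lines. This follows the verbal description only (multiply, scale, add) and is not a substitute for the reference unit specified in [3]; the Gaussian noise source, the function names, and the test tone are assumptions made for illustration.

```python
import math
import random


def mean_power(x):
    return sum(v * v for v in x) / len(x)


def add_modulated_noise(speech, q_db, rng=None):
    """Return (degraded speech, modulated noise) at a speech-to-noise
    power ratio of q_db decibels (the dBQ value)."""
    if rng is None:
        rng = random.Random(0)   # fixed seed for repeatability
    # Noise multiplied by the speech, so it follows the speech envelope
    modulated = [s * rng.gauss(0.0, 1.0) for s in speech]
    # Scale the modulated noise to the desired power level
    target_power = mean_power(speech) / 10 ** (q_db / 10)
    scale = math.sqrt(target_power / mean_power(modulated))
    noise = [scale * m for m in modulated]
    return [s + n for s, n in zip(speech, noise)], noise


if __name__ == "__main__":
    tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(800)]
    degraded, noise = add_modulated_noise(tone, q_db=20.0)
    q = 10 * math.log10(mean_power(tone) / mean_power(noise))
    print(round(q, 6))
```

Because the noise is the speech multiplied by a random sequence, it vanishes during silence and grows with the speech level, which is the property that makes it resemble codec distortion.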
Codec complexity is computed using a formula that counts the computational operations and memory usage of the codec algorithm. The complexity of the half-rate codecs is limited to 3 or 4 times that of their full-rate counterparts.

TABLE 30.2 Basic Performance Requirements for the Three Half-Rate Standards

                          Basic performance requirements
Digital Cellular          Min. Speech Quality,         Max. Delay,   Max. Complexity
Systems                   dBQ Rel. to Full Rate        ms            Rel. to Full Rate
Japanese (PDC)            −1 average, −3 maximum       94.8          3×
European (GSM)            −1 average, −3 maximum       90            4×
North American (IS-54)    Statistically equivalent     100           4×

30.4

Speech Coding Techniques in the Half-Rate Standards

Existing half-rate and full-rate standard coders can be characterized as linear-prediction based analysisby-synthesis (LPAS) speech coders [4]. LPAS coding entails using a time-varying all-pole filter in the decoder to synthesize the quantized speech signal. A short segment of the signal is synthesized by driving the filter with an excitation signal that is either quasiperiodic (for voiced speech) or random (for unvoiced speech). In either case, the excitation signal has a spectral envelope that is relatively flat. The synthesis filter serves to shape the spectrum of the excitation input so that the spectral envelope of the synthesized output resembles the filter’s magnitude frequency response. The magnitude response often has prominent peaks; they render the formants that give a speech signal its phonetic character. The synthesis filter has to be adapted to the current frame of input speech signal. This is accomplished with the encoder performing a linear prediction (LP) analysis of the frame: the inverse of the all-pole synthesis filter is applied as an LP error filter to the frame, and the values of the filter parameters are 1999 by CRC Press LLC

c

computed to minimize the energy of the filter’s output error signal. The resultant filter parameters are quantized and conveyed to the decoder for it to update the synthesis filter. Having executed an LP analysis and quantized the synthesis filter parameters, the LPAS encoder performs analysis-by-synthesis (ABS) on the input signal to find a suitable excitation signal. An ABS encoder maintains a copy of the decoder. The encoder examines the possible outputs that can be produced by the decoder copy in order to determine how best to instruct (using transmitted information) the actual decoder so that it would output (synthesize) a good approximation of the input speech signal. The decoder copy tracks the state of the actual decoder, since the latter evolves (under ideal channel conditions) according to information received from the encoder. The details of the ABS procedure vary with the particular excitation model employed in a specific coding scheme. One of the earliest seminal LPAS schemes is code excited linear prediction (CELP) [4]. In CELP, the excitation signal is obtained from a codebook of code vectors, each of which is a candidate for the excitation signal. The encoder searches the codebook to find the one code vector that would result in a best match between the resultant synthesis output signal and the encoder’s input speech signal. The matching is considered best when the energy of the difference between the two signals being matched is minimized. A perceptual weighting filter is usually applied to the difference signal (prior to energy integration) to make the minimization more relevant to human perception of speech fidelity. Regions in the frequency spectrum where human listeners are more sensitive to distortions are given relatively stronger weighting by the filter and vice versa. 
For instance, the concentration of spectral energy around the formant frequencies gives rise to stronger masking of coder noise (i.e., rendering the noise less audible) and, therefore, weaker weighting can be applied to the formant frequency regions. For masking to be effective, the weighting filter has to be adapted to the time-varying speech spectrum. Adaptation is usually achieved by basing the weighting filter parameters on the synthesis filter parameters.

The CELP framework has evolved to form the basis of a great variety of speech coding algorithms, including all existing full- and half-rate standard algorithms for digital cellular systems. We outline next the basic CELP encoder-processing steps, in a form suited to our subsequent detailed descriptions of the PDC and GSM half-rate coders. These steps take various computational efficiency considerations into account and may, therefore, deviate from a conceptual functional description of the encoder constituents.

1. LP analysis on the current frame of input speech to determine the coefficients of the all-pole synthesis filter;
2. quantization of the LP filter parameters;
3. determination of the open-loop pitch period or lag;
4. adapting the perceptual weighting filter to the current LP information (and also pitch information when appropriate) and applying the adapted filter to the input speech signal;
5. formation of a filter cascade (which we shall refer to as perceptually weighted synthesis filter) consisting of the LP synthesis filter, as specified by the quantized parameters in step 2, followed by the perceptual weighting filter;
6. subtraction of the zero-input response of the perceptually weighted synthesis filter (the filter’s decaying response due to past input) from the perceptually weighted input speech signal obtained in step 4;
7. search of an adaptive codebook to find the most suitable periodic excitation, i.e., when the perceptually weighted synthesis filter is driven by the best code vector from the adaptive codebook, the output of the filter cascade should best match the difference signal obtained in step 6;
8. search of one or more nonadaptive excitation codebooks to find the most suitable random excitation vectors that, when added to the best periodic excitation as determined in step 7 and with the resultant sum signal driving the filter cascade, would result in an output signal best matching the difference signal obtained in step 6.

Steps 1–6 are executed once per frame. Steps 7 and 8 are executed once for each of the subframes that together constitute a frame. Step 7 may be skipped depending on the pitch information from step 3 or, if step 7 is always executed, a nonperiodic excitation decision is one of the possible outcomes of the search process in step 7. Integral to steps 7 and 8 is the determination of gain (scaling) parameters for the excitation vectors. For each frame of input speech, the filter, excitation, and gain parameters determined as outlined are conveyed as encoded bits to the speech decoder.

In a properly designed system, the data conveyed by the channel decoder to the speech decoder should be free of errors most of the time, and the speech signal synthesized by the speech decoder would be identical to that determined in the speech encoder’s ABS operation. It is common to enhance the quality of the synthesized speech by using an adaptive postfilter to attenuate coder noise in the perceptually sensitive regions of the spectrum. The postfilter of the decoder and the perceptual weighting filter of the encoder may seem to be functionally identical. The weighting filter, however, influences the selection of the best excitation among available choices, whereas the postfilter actually shapes the spectrum of the synthesized signal. Since postfiltering introduces its own distortion, its advantage may be diminished if tandem coding occurs along the end-to-end communication path. Nevertheless, proper design can ensure that the net effect of postfiltering is a reduction in the amount of audible codec noise [1].
Excepting postfiltering, all other speech synthesis operations of an LPAS decoder are (effectively) duplicated in the encoder (though the converse is not true). Using this fact, we shall illustrate each coder in the sequel by exhibiting only a block diagram of its encoder or decoder but not both.
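
The closed-loop search of steps 7 and 8 can be illustrated with a minimal Python sketch. It assumes that the perceptually weighted synthesis filter is represented by a short truncated impulse response `h` and that the target signal already has the zero-input response removed (step 6). The function names, the toy codebook, and the choice of computing a jointly optimal gain for each candidate are illustrative assumptions and are not taken from either standard.

```python
def zero_state_response(excitation, h):
    """Filter `excitation` through impulse response `h`, truncated to the subframe."""
    n = len(excitation)
    return [
        sum(h[k] * excitation[i - k] for k in range(min(i + 1, len(h))))
        for i in range(n)
    ]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def search_codebook(target, codebook, h):
    """Return (index, gain, error_energy) of the code vector whose scaled,
    filtered version is closest to `target` in the squared-error sense."""
    target_energy = dot(target, target)
    best = None
    for index, code_vector in enumerate(codebook):
        y = zero_state_response(code_vector, h)
        energy = dot(y, y)
        if energy == 0.0:
            continue  # an all-zero filter output cannot match anything
        correlation = dot(target, y)
        gain = correlation / energy  # optimal scalar gain for this candidate
        error = target_energy - correlation * correlation / energy
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best


# Toy example: the target is code vector 2, filtered and scaled by 2.
h = [1.0, 0.5, 0.25]
codebook = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, -1.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
]
target = [2.0 * v for v in zero_state_response(codebook[2], h)]
best_index, best_gain, best_error = search_codebook(target, codebook, h)
```

In this toy case the search recovers index 2 with a gain of 2 and zero residual error. A practical encoder would quantize the gain and would typically avoid refiltering every candidate from scratch.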

30.5 Channel Coding Techniques in the Half-Rate Standards

Crucial to the maintenance of quality speech communication is the ability to transport coded speech data across the radio channel with minimal errors. Low-bit-rate LPAS coders are particularly sensitive to channel errors; errors in the bits representing the LP parameters in one frame, for instance, could result in the synthesis of nonsensical sounds for longer than a frame duration. The error rate of a digital cellular radio channel with no channel coding can be catastrophically high for LPAS coders. The amount of tolerable transmission delay is limited by the requirement of interactive communication and, consequently, forward error control is used to remedy transmission errors. “Forward” means that channel errors are remedied in the receiver, with no additional information from the transmitter and, hence, no additional transmission delay. To enable the channel decoder to correct channel errors, the channel encoder conveys more bits than the amount generated by the speech encoder. The additional bits are for error protection, as errors may or may not occur in any particular transmission epoch. The ratio of the number of encoder input (information) bits to the number of encoder output (code) bits is called the (channel) coding rate. This is a number no more than one and generally decreases as the error protection power increases. Though a lower channel coding rate gives more error protection, fewer bits will be available for speech coding. When the channel is in good condition and, hence, less error protection is needed, the received speech quality could be better if bits devoted to channel coding were used for speech coding. On the other hand, if a high channel coding rate were used, there would be uncorrected errors under poor channel conditions and speech quality would suffer.
Thus, when nonadaptive forward error protection is used over channels with nonstationary statistics, there is an inevitable tradeoff between quality degradation due to uncorrected errors and that due to expending bits on error protection (instead of on speech encoding). Both the GSM and PDC half-rate coders use convolutional coding [14] for error correction. Convolutional codes are sliding or sequential codes. The encoder of a rate m/n, m < n convolutional code can be realized using m shift registers. For every m information bits input to the encoder (one bit to each of the m shift registers), n code bits are output to the channel. Each code bit is computed as a modulo-2 sum of a subset of the bits in the shift registers. Error protection overhead can be reduced by exploiting the unequal sensitivity of speech quality to errors in different positions of the encoded bit stream. A family of rate-compatible punctured convolutional codes (RCPCCs) [10] is a collection of related convolutional codes; all of the codes in the collection except the one with the lowest rate are derived by puncturing (dropping) code bits from the convolutional code with the lowest rate. With an RCPCC, the channel coding rate can be varied on the fly (i.e., variable-rate coding) while a sequence of information bits is being encoded through the shift registers, thereby imparting on different segments in the sequence different degrees of error protection. For decoding a convolutional coded bit stream, the Viterbi algorithm [14] is a computationally efficient procedure. Given the output of the demodulator, the algorithm determines the most likely sequence of data bits sent by the channel encoder. To fully utilize the error correction power of the convolutional code, the amplitude of the demodulated channel symbol can be quantized to more bits than the minimum number required, i.e., for subsequent soft decision decoding. 
The minimum number of bits is given by the number of channel-coded bits mapped by the modulator onto each channel symbol; decoding based on the minimum-rate bit stream is called hard decision decoding. Although soft decoding gives better error protection, decoding complexity is also increased. Whereas convolutional codes are most effective against randomly scattered bit errors, errors on cellular radio channels often occur in bursts of bits. These bursts can be broken up if the bits put into the channel are rearranged after demodulation. Thus, in block interleaving, encoded bits are read into a matrix by row and then read out of the matrix by column (or vice versa) and then passed on to the modulator; the reverse operation is performed by a deinterleaver following demodulation. Interleaving increases the transmission delay to the extent that enough bits need to be collected in order to fill up the matrix. Owing to the severe nature of the cellular radio channel and limited available transmission capacity, uncorrected errors often remain in the decoded data. A common countermeasure is to append an error detection code to the speech data stream prior to channel coding. When residual channel errors are detected, the speech decoder can take various remedial measures to minimize the negative impact on speech quality. Common measures are repetition of speech parameters from the most recent good frames and gradual muting of the possibly corrupted synthesized speech. The PDC and GSM half-rate standard algorithms together embody some of the latest advances in speech coding techniques, including: multimodal coding where the coder configuration and bit allocation change with the type of speech input; vector quantization (VQ) [5] of the LP filter parameters; higher precision and improved coding efficiency for pitch-periodic excitation; and postfiltering with improved tandeming performance. We next explore the more distinctive features of the PDC and GSM speech coders.
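
The convolutional encoding, puncturing, and block interleaving operations described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the constraint length 3 code with generators 7 and 5 (octal), the puncturing pattern, and the 3 × 4 interleaver are textbook choices made for the example and are not the codes or matrix sizes used in the GSM or PDC standards. All function names are hypothetical.

```python
def conv_encode(bits, generators=(0b111, 0b101), constraint_len=3):
    """Rate 1/n feedforward convolutional encoder with a single shift register.
    Each output bit is the modulo-2 sum of the register bits selected by a generator."""
    mask = (1 << constraint_len) - 1
    state = 0
    out = []
    for b in bits:
        state = ((state << 1) | b) & mask
        for g in generators:
            out.append(bin(state & g).count("1") % 2)
    return out


def puncture(code_bits, pattern):
    """Drop the code bits that fall on a 0 of the repeating pattern."""
    return [b for i, b in enumerate(code_bits) if pattern[i % len(pattern)]]


def block_interleave(bits, rows, cols):
    """Write row by row into a rows x cols matrix, read out column by column."""
    assert len(bits) == rows * cols
    return [bits[r * cols + c] for c in range(cols) for r in range(rows)]


def block_deinterleave(bits, rows, cols):
    """Inverse of block_interleave."""
    assert len(bits) == rows * cols
    return [bits[c * rows + r] for r in range(rows) for c in range(cols)]


info = [1, 0, 1, 1]
coded = conv_encode(info)                 # rate 1/2: 8 code bits
punctured = puncture(coded, (1, 1, 1, 0)) # rate 2/3: 6 code bits

# A burst of three consecutive channel errors is scattered by deinterleaving.
burst = [0] * 12
burst[4] = burst[5] = burst[6] = 1
scattered = block_deinterleave(burst, rows=3, cols=4)
error_positions = [i for i, e in enumerate(scattered) if e]
```

With the pattern `(1, 1, 1, 0)`, three of every four mother-code bits are kept, so the rate rises from 1/2 to 2/3; switching patterns within a frame is what gives different segments different degrees of protection. After deinterleaving, the three adjacent channel errors land at nonadjacent positions in the decoder input, which is the condition under which a convolutional code performs best.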

30.6 The Japanese Half-Rate Standard

An algorithm was selected for the Japanese half-rate standard in April 1993, following the evaluation of 12 submissions in a first round, and four final candidates in a second round [12]. The selected algorithm, called pitch synchronous innovation CELP (PSI-CELP; see footnote 2), met all of the basic selection criteria and scored the highest among all candidates evaluated. A block diagram of the PSI-CELP encoder is shown in Fig. 30.2, and bit allocations are summarized in Table 30.3. The complexity of the coder is estimated to be approximately 2.4 times that of the PDC full-rate coder. The frame size of the coder is 40 ms, and its subframe size is 10 ms. These sizes are longer than those used in most existing CELP-type standard coders. However, LP analysis is performed twice per frame in the PSI-CELP coder.

FIGURE 30.2: Basic structure of the PSI-CELP encoder.

2 There were two candidate algorithms named PSI-CELP in the PDC half-rate competition. The algorithm described here was contributed by NTT Mobile Communications Network, Inc. (NTT DoCoMo).

TABLE 30.3 Bit Allocations for the PSI-CELP Half-Rate PDC Speech Coder

Parameter               Bits      Error Protected Bits
LP synthesis filter     31        15
Frame energy            7         7
Periodic excitation     8 × 4     8 × 4
Stochastic excitation   10 × 4    0
Gain                    7 × 4     3 × 4
Total                   138       66

A distinctive feature of the PSI-CELP coder is the use of an adaptive noise canceller [13, 15] to suppress noise in the input signal prior to coding. The input signal is classified into various modes, depending on the presence or absence of background noise and speech and their relative power levels. The current active mode determines whether Kalman filtering [9] is applied to the input signal and whether the parameters of the Kalman filter are adapted. Kalman filtering is applied when a significant amount of background noise is present or when both background noise and speech are strongly present. The filter parameters are adapted to the statistics of the speech and noise signals in accordance with whether they are both present or only noise is present.

The LP filter parameters in the PSI-CELP coder are encoded using VQ. A tenth-order LP analysis is performed every 20 ms. The resultant filter parameters are converted to 10 line spectral frequencies (LSFs; see footnote 3). The LSF parameters have a naturally increasing order, and together are treated as the ordered components of a vector. Since the speech spectral envelope tends to evolve slowly with time, there is intervector dependency between adjacent LSF vectors that can be exploited. Thus, the two LSF vectors for each 40-ms frame are paired together and jointly encoded. Each LSF vector in the pair is split into three subvectors. The pair of subvectors that cover the same vector component indexes are combined into one composite vector and vector quantized. Altogether, 31 b are used to encode a pair of LSF vectors. This three-way split VQ scheme (see footnote 4) embodies a compromise between the prohibitively high complexity of using a large vector dimension and the performance gain from exploiting intra- and intervector dependency.

The PSI-CELP encoder uses a perceptual weighting filter consisting of a cascade of two filter sections. The sections exploit the pitch-harmonic structure and the LP spectral-envelope structure of the speech signal, respectively. The pitch-harmonic section has four parameters, a pitch lag and three coefficients, whose values are determined from an analysis of the periodic structure of the input speech signal. Pitch-harmonic weighting reduces the amount of noise in between the pitch harmonics by aggregating coder noise to be closer to the harmonic frequencies of the speech signal.
In high-pitched voice, the harmonics are spaced relatively farther apart, and pitch-harmonic weighting becomes correspondingly more important.

The excitation vector x (Fig. 30.2) is updated once every subframe interval (10 ms) and is constructed as a linear combination of two vectors

$$x = g_0 y + g_1 z \qquad (30.1)$$

where $g_0$ and $g_1$ are scalar gains, y is labeled as the periodic component of the excitation, and z as the stochastic or random component. When the input speech is voiced, the ABS operation would find a value for y from the adaptive codebook (Fig. 30.2). The codebook is constructed out of past samples of the excitation signal x; hence, there is a feedback path into the adaptive codebook in Fig. 30.2. Each code vector in the adaptive codebook corresponds to one of the 192 possible pitch lag L values available for encoding; the code vector is populated with samples of x beginning with the Lth sample backward in time. L is not restricted to be an integer, i.e., fractional pitch period is permitted. Successive values of L are more closely spaced for smaller values of L; short, medium, and long lags are quantized to one-quarter, one-half, and one sampling-period resolution, respectively. As a result, the relative quantization error in the encoded pitch frequency (which is the reciprocal of the encoded pitch lag) remains roughly constant with increasing pitch frequency.

3 Also known as line spectrum pairs (LSPs).
4 Matrix quantization is another possible description.

When the input speech is unvoiced, y would be obtained from the fixed codebook (Fig. 30.2). To find the best value for y, the encoder searches through the aggregate of 256 code vectors from both the adaptive and fixed codebooks. The code vector that results in a synthesis output most resembling the input speech is selected. The best code vector thus chosen also implicitly determines the voicing condition (voiced/unvoiced) and the pitch lag value L∗ most appropriate to the current subframe of input speech. These parameters are said to be determined in a closed-loop search.

The stochastic excitation z is formed as a sum of two code vectors, each selected from a conjugate codebook (Fig. 30.2) [13]. Using a pair of conjugate codebooks each of size 16 code vectors (4 b) has been found to improve robustness against channel errors, in comparison with using one single codebook of size 256 code vectors (8 b). The synthesis output due to z can be decomposed into a sum of two orthogonal components, one of which points in the same direction as the synthesis output due to the periodic excitation y and the other component points in a direction orthogonal to the synthesis output due to y. The latter synthesis output component of z is kept, whereas the former component is discarded. Such decomposition enables the two gain factors $g_0$ and $g_1$ to be separately quantized.

For voiced speech, the conjugate code vectors are preprocessed to produce a set of pitch synchronous innovation (PSI) vectors. The first L∗ samples of each code vector are treated as a fundamental period of samples.
The fundamental period is replicated until there are enough samples to populate a subframe. If L∗ is not an integer, interpolated samples of the code vectors are used (upsampled versions of the code vectors can be precomputed). PSI has been found to reinforce the periodicity and substantially improve the quality of synthesized voiced speech. The postfilter in the PSI-CELP decoder has three sections, for enhancing the formants, the pitch harmonics, and the high frequencies of the synthesized speech, respectively. Pitch-harmonic enhancement is applied only when the adaptive codebook has been used. Formant enhancement makes use of the decoded LP synthesis filter parameters, whereas a refined pitch analysis is performed on the synthesized speech to obtain the values for the parameters of the pitch-harmonic section of the postfilter. A first-order high-pass filter section compensates for the low-pass spectral tilt [1] of the formant enhancement section. Of the 138 speech data bits generated by the speech encoder every 40-ms frame, 66 b (Table 30.3) receive error protection and the remaining 72 speech data bits of the frame are not error protected. An error detection code of 9 cyclic redundancy check (CRC) bits is appended to the 66 b and then submitted to a rate 1/2, punctured convolutional encoder to generate a sequence of 152 channel coded bits. Of the unprotected 72 b, the 40 b that index the excitation codebooks (Table 30.3) are remapped or pseudo-Gray coded [17] so as to equalize their channel error sensitivity. As a result, a bit error occurring in an index word is likely to cause about the same amount of degradation regardless of the bit error position in the index word. For each speech frame, the channel encoder emits 224 b of payload data. The payload data from two adjacent frames are interleaved before transmission over the radio link. Uncorrected errors in the most critical 66 b are detected with high probability as a CRC error. 
A finite state machine keeps track of the recent history of CRC errors. When a sequence of CRC errors is encountered, the power level of the synthesized speech is progressively suppressed, so that muting is reached after four consecutive CRC errors. Conversely, following the cessation of a sequence of CRC errors, the power level of the synthesized speech is ramped up gradually.
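
Two PSI-CELP operations described earlier in this section lend themselves to a short sketch: building a pitch synchronous innovation vector by replicating the first L∗ samples of a code vector, and discarding the part of the stochastic synthesis output that lies along the periodic synthesis output. The Python below is a simplified illustration that assumes an integer lag (fractional lags would require the interpolated code vectors mentioned in the text) and uses hypothetical function names; it is not the standard's reference procedure.

```python
def psi_vector(code_vector, lag, subframe_len):
    """Replicate the first `lag` samples of `code_vector` to fill a subframe.
    Integer lags only; if lag exceeds the vector length the vector is reused as is."""
    period = code_vector[:lag]
    return [period[n % len(period)] for n in range(subframe_len)]


def remove_component(z_out, y_out):
    """Return the part of z_out that is orthogonal to y_out.
    z_out and y_out stand for the synthesis outputs due to z and y."""
    energy = sum(v * v for v in y_out)
    if energy == 0.0:
        return list(z_out)
    scale = sum(a * b for a, b in zip(z_out, y_out)) / energy
    return [a - scale * b for a, b in zip(z_out, y_out)]


# A toy 8-sample code vector made periodic with a lag of 3 samples.
periodic = psi_vector([1, 2, 3, 4, 5, 6, 7, 8], lag=3, subframe_len=8)

# The kept component of z is orthogonal to the output due to y.
kept = remove_component([3.0, 1.0], [1.0, 1.0])
inner_product = sum(a * b for a, b in zip(kept, [1.0, 1.0]))
```

Because the kept component has zero inner product with the periodic contribution, the energies of the two contributions add without a cross term, which is consistent with the text's remark that the two gains can be quantized separately.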

30.7 The European GSM Half-Rate Standard

A vector sum excited linear prediction (VSELP) coder, contributed by Motorola, Inc., was selected in January 1994 by the main GSM technical committee as a basis for the GSM half-rate standard. The standard was finally approved in January 1995. VSELP is a generic name for a family of algorithms from Motorola; the North American full-rate and the Japanese full-rate standards are also based on VSELP. All VSELP coders make use of the basic idea of representing the excitation signal by a linear combination of basis vectors [6]. This representation renders the excitation codebook search procedure very computationally efficient. A block diagram of the GSM half-rate decoder is depicted in Fig. 30.3 and bit allocations are tabulated in Table 30.4. The coder’s frame size is 20 ms, and each frame comprises four subframes of 5 ms each. The coder has been optimized for execution on a processor with 16-b word length and 32-b accumulator. The GSM standard is a bit exact specification: in addition to specifying the codec’s processing steps, the numerical formats and precisions of the codec’s variables are also specified.

FIGURE 30.3: Basic structure of the GSM VSELP decoder. Top is for mode 0 and bottom is for modes 1, 2, and 3.
TABLE 30.4 Bit Allocations for the VSELP Half-Rate GSM Coder

Parameter                              Bits/subframe   Bits/frame
LP synthesis filter                                    28
Soft interpolation                                     1
Frame energy                                           5
Mode selection                                         2
Mode 0
  Excitation code I                    7               28
  Excitation code H                    7               28
  Gain code Gs, P0                     5               20
Mode 1, 2, and 3
  Pitch lag L (first subframe)                         8
  Difference lag (subframes 2, 3, 4)   4               12
  Excitation code J                    9               36
  Gain code Gs, P0                     5               20
Total                                                  112
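
As a quick consistency check on Table 30.4, the short Python sketch below sums the allocations for mode 0 and for modes 1 to 3 and converts the frame total to a bit rate. The dictionary layout and function names are illustrative; the numbers are those of the table and the 20-ms frame with four subframes stated in the text.

```python
FRAME_MS = 20
SUBFRAMES = 4

# Parameters sent once per frame in every mode (bits/frame).
common = {
    "LP synthesis filter": 28,
    "Soft interpolation": 1,
    "Frame energy": 5,
    "Mode selection": 2,
}

# Mode-specific parameters, expressed as bits/frame.
mode_0 = {
    "Excitation code I": 7 * SUBFRAMES,
    "Excitation code H": 7 * SUBFRAMES,
    "Gain code Gs, P0": 5 * SUBFRAMES,
}
modes_1_2_3 = {
    "Pitch lag L (first subframe)": 8,
    "Difference lag (subframes 2, 3, 4)": 4 * (SUBFRAMES - 1),
    "Excitation code J": 9 * SUBFRAMES,
    "Gain code Gs, P0": 5 * SUBFRAMES,
}


def frame_bits(mode_specific):
    """Total speech bits per frame for one mode."""
    return sum(common.values()) + sum(mode_specific.values())


def bit_rate_bps(bits_per_frame, frame_ms=FRAME_MS):
    """Speech coder bit rate in bits per second."""
    return bits_per_frame * 1000 // frame_ms


bits_mode_0 = frame_bits(mode_0)
bits_voiced = frame_bits(modes_1_2_3)
rate = bit_rate_bps(bits_mode_0)
```

Both configurations come to 112 b per frame, so the frame format is the same size in every mode, and 112 b every 20 ms corresponds to a speech coding rate of 5600 b/s.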

The synthesis filter coefficients in GSM VSELP are encoded using the fixed point lattice technique (FLAT) [8] and vector quantization. FLAT is based on the lattice filter representation of the linear prediction error filter. The tenth-order lattice filter has 10 stages, with the ith stage, $i \in \{1, \ldots, 10\}$, containing a reflection coefficient parameter $r_i$. The lattice filter has an order-recursion property such that the best prediction error filters of all orders less than ten are all embedded in the best tenth-order lattice filter. This means that once the values of the lower order reflection coefficients have been optimized, they do not have to be reoptimized when a higher order predictor is desired; in other words, the coefficients can be optimized sequentially from low to high orders. On the other hand, if the lower order coefficients were suboptimal (as in the case when the coefficients are quantized), the higher order coefficients could still be selected to minimize the prediction residual (or error) energy at the output of the higher order stages; in effect, the higher order stages can compensate for the suboptimality of lower order stages.

In the GSM VSELP coder, the ten reflection coefficients $\{r_1, \ldots, r_{10}\}$ that have to be encoded for each frame are grouped into three coefficient vectors $v_1 = [r_1\ r_2\ r_3]$, $v_2 = [r_4\ r_5\ r_6]$, $v_3 = [r_7\ r_8\ r_9\ r_{10}]$. The vectors are quantized sequentially, from $v_1$ to $v_3$, using a $b_i$-bit VQ codebook $C_i$ for $v_i$, where $b_i$, $i = 1, 2, 3$, are 11, 9, and 8 b, respectively. The vector $v_i$ is quantized to minimize the prediction error energy at the output of the jth stage of the lattice filter, where $r_j$ is the highest order coefficient in the vector $v_i$. The computational complexity associated with quantizing $v_i$ is reduced by searching only a small subset of the code vectors in $C_i$. The subset is determined by first searching a prequantizer codebook of size $c_i$ bits, where $c_i$, $i = 1, 2, 3$, are 6, 5, and 4 b, respectively. Each code vector in the prequantizer codebook is associated with $2^{b_i - c_i}$ code vectors in the target codebook. The subset is obtained by pooling together all of the code vectors in $C_i$ that are associated with the top few best matching prequantizer code vectors. In this way, a factor of reduction in computational complexity of nearly $2^{b_i - c_i}$ is obtained for the quantization of $v_i$.

The half-rate GSM coder changes its configuration of excitation generation (Fig. 30.3) in accordance with a voicing mode [7]. For each frame, the coder selects one of four possible voicing modes depending on the values of the open-loop pitch-prediction gains computed for the frame and its four subframes. Open loop refers to determining the pitch lag and the pitch-predictor coefficient(s) via a direct analysis of the input speech signal or, in the case of the half-rate GSM coder, the perceptually weighted (LP-weighting only) input signal. Open-loop analysis can be regarded as the opposite of closed-loop analysis, which in our context is synonymous with ABS. When the pitch-prediction gain for the frame is weak, the input speech signal is deemed to be unvoiced and mode 0 is used. In this mode, two 7-b trained codebooks (excitation codebooks 1 and 2 in Fig. 30.3) are used, and the excitation signal for each subframe is formed as a linear combination of two code vectors, one from each of the codebooks. A trained codebook is one designed by applying the coder to a representative set of speech signals while optimizing the codebook to suit the set. Mode 1, 2, or 3 is chosen depending on the strength of the pitch-prediction gains for the frame and its subframes. In these modes, the excitation signal is formed as a linear combination of a code vector from an 8-b adaptive codebook and a code vector from a 9-b trained codebook (Fig. 30.3). The code vectors that are summed together to form the excitation signal for a subframe are each scaled by a gain factor (β and γ in Fig. 30.3). Each mode uses a gain VQ codebook specific to that mode.

As depicted in Fig. 30.3, the decoder contains an adaptive pitch prefilter for the voiced modes and an adaptive postfilter for all modes. The filters enhance the perceptual quality of the decoded speech and are not present in the encoder. It is more conventional to locate the pitch prefilter as a section of the postfilter; the distinctive placement of the pitch prefilter in VSELP was chosen to reduce artifacts caused by the time-varying nature of the filter. In mode 0, the encoder uses an LP spectral weighting filter in its ABS search of the two excitation codebooks. In the other modes, the encoder uses a pitch-harmonic weighting filter in cascade with an LP spectral weighting filter for searching excitation codebook 0, whereas only LP spectral weighting is used for searching the adaptive codebook. The pitch-harmonic weighting filter has two parameters, a pitch lag and a coefficient, whose values are determined in the aforementioned open-loop pitch analysis.

A code vector in the 8-b adaptive codebook has a dimension of 40 (the duration of a subframe) and is populated with past samples of the excitation signal beginning with the Lth sample back from the present time. L can take on one of 256 different integer and fractional values.
The best adaptive code vector for each subframe can be selected via a complete ABS; the required exhaustive search of the adaptive codebook is, however, computationally expensive. To reduce computation, the GSM VSELP coder makes use of the aforementioned open-loop pitch analysis to produce a list of candidate lag values. The open-loop pitch-prediction gains are ranked in decreasing order, and only the lags corresponding to top-ranked gains are kept as candidates. The final decisions for the four L values of the four subframes in a frame are made jointly. By assuming that the four L values cannot vary over the entire range of all possible 256 values in the short duration of a frame, the L of the first subframe is coded using 8 b, and the L of each of the other three subframes is coded differentially using 4 b. The 4 b represent 16 possible values of deviation relative to the lag of the previous subframe. The four lags in a frame trace out a trajectory where the change from one time point to the next is restricted; consequently, only 20 b are needed instead of 32 b for encoding the four lags. Candidate trajectories are constructed by linking top ranked lags that are commensurate with differential encoding. The best trajectory among the candidates is then selected via ABS.

The trained excitation codebooks of VSELP have a special vector sum structure that facilitates fast searching [6]. Each of the $2^b$ code vectors in a b-bit trained codebook is formed as a linear combination of b basis vectors. Each of the b scalar weights in the linear combination is restricted to have a binary value of either 1 or −1. The $2^b$ code vectors in the codebook are obtained by taking all $2^b$ possible combinations of values of the weights. A substantial storage saving is achieved by storing only b basis vectors instead of $2^b$ code vectors. Computational saving is another advantage of the vector-sum structure. Since filtering is a linear operation, the synthesis output due to each code vector is a linear combination of the synthesis outputs due to the individual basis vectors, where the same weight values are used in the output linear combination as in forming the code vector. A vector sum codebook can be searched by first performing synthesis filtering on its b basis vectors. If, for the present subframe, another trained codebook (mode 0) or an adaptive codebook (mode 1, 2, 3) had been searched, the filtered basis vectors are further orthogonalized with respect to the signal synthesized from that codebook, i.e., each filtered basis vector is replaced by its own component that is orthogonal to the synthesized signal. Further complexity reduction is obtained by examining the code vectors in a sequence such that two successive code vectors differ in only one of the b scalar weight values; that is, the entire set of $2^b$ code vectors is searched in a Gray coded sequence. With successive code vectors differing in only one term in the linear combination, it is only necessary in the codebook search computation to progressively track the difference [6].

The total energy of a speech frame is encoded with 5 b (Table 30.4). The two gain factors (β and γ in Fig. 30.3) for each subframe are computed after the excitation codebooks have been searched and are then transformed to parameters Gs and P0 to be vector quantized. Each mode has its own 5-b gain VQ codebook. Gs represents the energy of the subframe relative to the total frame energy, and P0 represents the fraction of the subframe energy due to the first excitation source (excitation codebook 1 in mode 0, or the adaptive codebook in the other modes). An interpolation bit (Table 30.4) transmitted for each frame specifies to the decoder whether the LP synthesis filter parameters for each subframe should be obtained from interpolating between the decoded filter parameters for the current and the previous frames. The encoder determines the value of this bit according to whether interpolation or no interpolation results in a lower prediction residual energy for the frame. The postfilter in the decoder operates in concordance with the actual LP parameters used for synthesis.

The speech encoder generates 112 b of encoded data (Table 30.4) for every 20-ms frame of the speech signal. These bits are processed by the channel encoder to improve, after channel decoding at the receiver, the uncoded bit-error rate and the detectability of uncorrected errors. Error detection coding in the form of 3 CRC bits is applied to the most critical 22 data bits. The combined 25 b plus an additional 73 speech data bits and 6 tail bits are input to an RCPCC encoder (the tail bits serve to bring the channel encoder and decoder to a fixed terminal state at the end of the payload data stream).
The 3 CRC bits are encoded at rate 1/3 and the other 101 b are encoded at rate 1/2, generating a total of 211 channel coded bits. These are finally combined with the remaining 17 (uncoded) speech data bits to form a total of 228 b for the payload data of a speech frame. The payload data from two speech frames are interleaved for transmission over four timeslots of the GSM TDMA channel. With the Viterbi algorithm, the channel decoder performs soft decision decoding on the demodulated and deinterleaved channel data. Uncorrected channel errors may still be present in the decoded speech data after Viterbi decoding. Thus, the channel decoder classifies each frame into three integrity categories: bad, unreliable, and reliable, in order to assist the speech decoder in undertaking error concealment measures. A frame is considered bad if the CRC check fails or if the received channel data is close to more than one candidate sequence. The latter evaluation is based on applying an adaptive threshold to the metric values produced by the Viterbi algorithm over the course of decoding the most critical 22 speech data bits and their 3 CRC bits. Frames that are not bad may be classified as unreliable, depending on the metric values produced by the Viterbi algorithm and on channel reliability information supplied by the demodulator. Depending on the recent history of decoded data integrity, the speech decoder can take various error concealment measures. The onset of bad frames is concealed by repetition of parameters from previous reliable frames, whereas the persistence of bad frames results in power attenuation and ultimately muting of the synthesized speech. Unreliable frames are decoded with normality constraints applied to the energy of the synthesized speech.
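
Returning to the vector-sum codebook structure described earlier in this section, the Gray coded search can be sketched as follows. The sketch assumes the b basis vectors have already been passed through the weighted synthesis filter, and it omits the orthogonalization against a previously searched codebook. The function name, the toy basis vectors, and the selection criterion (maximizing squared correlation over energy, which is equivalent to minimizing the error with an optimal gain) are assumptions made for illustration rather than the procedure specified in the standard.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def vector_sum_search(target, filtered_basis):
    """Search all 2**b sign combinations in Gray-code order.
    Returns (weights, correlation, energy) of the best combination."""
    b = len(filtered_basis)
    r = [dot(target, q) for q in filtered_basis]                 # target/basis correlations
    d = [[dot(p, q) for q in filtered_basis] for p in filtered_basis]  # basis cross terms
    theta = [-1] * b
    corr = sum(t * rm for t, rm in zip(theta, r))
    energy = sum(theta[m] * theta[j] * d[m][j] for m in range(b) for j in range(b))

    def score(c, e):
        return c * c / e if e > 0 else 0.0

    best = (list(theta), corr, energy)
    for i in range(1, 1 << b):
        k = (i & -i).bit_length() - 1   # the single weight that flips at this step
        cross = sum(theta[j] * d[k][j] for j in range(b) if j != k)
        corr -= 2 * theta[k] * r[k]     # update instead of recomputing from scratch
        energy -= 4 * theta[k] * cross
        theta[k] = -theta[k]
        if score(corr, energy) > score(best[1], best[2]):
            best = (list(theta), corr, energy)
    return best


# Toy example with b = 3 filtered basis vectors of dimension 4.
basis = [[1, 0, 0, 1], [0, 1, 0, -1], [0, 0, 1, 1]]
target = [1, -1, 1, 3]                  # equals basis[0] - basis[1] + basis[2]
weights, corr, energy = vector_sum_search(target, basis)
gain = corr / energy
synth = [gain * sum(w * q[n] for w, q in zip(weights, basis)) for n in range(4)]
```

Only the b filtered basis vectors and their pairwise inner products are computed up front; each of the remaining steps costs a few multiply-adds. Note that a weight pattern and its negation give the same score, since the sign can be absorbed into the gain, so in this toy case the search may return either pattern and the scaled synthesis still reproduces the target.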

30.8 Conclusions

The half-rate standards employ some of the latest techniques in speech and channel coding to meet the challenges posed by the severe transmission environment of digital cellular radio systems. By halving the bit rate, the voice transmission capacity of existing full-rate digital cellular systems can be doubled. Although advances are still being made that can address the needs of quarter-rate speech transmission, much effort is currently devoted to enhancing the speech quality and robustness of full-rate (GSM and IS-54) systems, aiming to be closer to toll quality. On the other hand, the imminent introduction of competing wireless systems that use different modulation schemes [e.g., code division multiple access (CDMA)] and/or different radio frequencies [e.g., personal communications systems (PCS)] is poised to alleviate congestion in high-user-density areas.

Defining Terms

Codebook: An ordered collection of all possible values that can be assigned to a scalar or vector variable. Each unique scalar or vector value in a codebook is called a codeword, or code vector where appropriate.

Codec: A contraction of (en)coder–decoder, used synonymously with the word coder. The encoder and decoder are often designed and deployed as a pair. A half-rate standard codec performs speech as well as channel coding.

Echo canceller: A signal processing device that, given the source signal causing the echo signal, generates an estimate of the echo signal and subtracts the estimate from the signal being interfered with by the echo signal. The device is usually based on a discrete-time adaptive filter.

Pitch period: The fundamental period of a voiced speech waveform that can be regarded as periodic over a short-time interval (quasiperiodic). The reciprocal of pitch period is pitch frequency or simply, pitch.

Tandem coding: Having more than one encoder–decoder pair in an end-to-end transmission path. In cellular radio communications, having a radio link at each end of the communication path could subject the speech signal to two passes of speech encoding–decoding. In general, repeated encoding and decoding increases the distortion.

Acknowledgment

The authors would like to thank Erdal Paksoy and Mark A. Jasiuk for their valuable comments.

References

[1] Chen, J.-H. and Gersho, A., Adaptive postfiltering for quality enhancement of coded speech. IEEE Trans. Speech & Audio Proc., 3(1), 59–71, 1995.
[2] De Martino, E., Speech quality evaluation of the European, North-American and Japanese speech codec standards for digital cellular systems. In Speech and Audio Coding for Wireless and Network Applications, Atal, B.S., Cuperman, V., and Gersho, A., Eds., 55–58, Kluwer Academic Publishers, Norwell, MA, 1993.
[3] Dimolitsas, S., Corcoran, F.L., and Baraniecki, M.R., Transmission quality of North American cellular, personal communications, and public switched telephone networks. IEEE Trans. Veh. Tech., 43(2), 245–251, 1994.
[4] Gersho, A., Advances in speech and audio compression. Proc. IEEE, 82(6), 900–918, 1994.
[5] Gersho, A. and Gray, R.M., Vector Quantization and Signal Compression, Kluwer Academic Publishers, Norwell, MA, 1991.
[6] Gerson, I.A. and Jasiuk, M.A., Vector sum excited linear prediction (VSELP) speech coding at 8 kbps. In Proceedings, IEEE Intl. Conf. Acoustics, Speech, & Sig. Proc., 461–464, April, 1990.
[7] Gerson, I.A. and Jasiuk, M.A., Techniques for improving the performance of CELP-type speech coders. IEEE J. Sel. Areas Comm., 10(5), 858–865, 1992.
[8] Gerson, I.A., Jasiuk, M.A., Nowack, J.M., Winter, E.H., and Müller, J.-M., Speech and channel coding for the half-rate GSM channel. In Proceedings, ITG-Report 130 on Source and Channel Coding, 225–232. Munich, Germany, Oct., 1994.
[9] Gibson, J.D., Koo, B., and Gray, S.D., Filtering of colored noise for speech enhancement and coding. IEEE Trans. Sig. Proc., 39(8), 1732–1742, 1991.
[10] Hagenauer, J., Rate-compatible punctured convolutional codes (RCPC codes) and their applications. IEEE Trans. Comm., 36(4), 389–400, 1988.
[11] Jayant, N.S. and Noll, P., Digital Coding of Waveforms, Prentice-Hall, Englewood Cliffs, NJ, 1984.
[12] Masui, F. and Oguchi, M., Activity of the half rate speech codec algorithm selection for the personal digital cellular system. Tech. Rept. of IEICE, RCS93-77(11), 55–62 (in Japanese), 1993.
[13] Ohya, T., Suda, H., and Miki, T., 5.6 kbits/s PSI-CELP of the half-rate PDC speech coding standard. In Proceedings, IEEE Veh. Tech. Conf., 1680–1684, June, 1994.
[14] Proakis, J.G., Digital Communications, 3rd ed., McGraw-Hill, New York, 1995.
[15] Suda, H., Ikeda, K., and Ikedo, J., Error protection and speech enhancement schemes of PSI-CELP. NTT R & D (Special issue on PSI-CELP speech coding system for mobile communications), 43(4), 373–380 (in Japanese), 1994.
[16] Telecommunication Industries Association (TIA). Half-rate speech codec test plan V6.0. TR45.3.5/93.05.19.01, 1993.
[17] Zeger, K. and Gersho, A., Pseudo-Gray coding. IEEE Trans. Comm., 38(12), 2147–2158, 1990.

Further Information

Additional technical information on speech coding can be found in the books, periodicals, and conference proceedings that appear in the list of references. Other relevant publications not represented in the list are Speech Communication, Elsevier Science Publishers; Advances in Speech Coding, B. S. Atal, V. Cuperman, and A. Gersho, Eds., Kluwer Academic Publishers; and Proceedings of the IEEE Workshop on Speech Coding.

Budagavi, M. & Talluri, R. “Wireless Video Communications” Mobile Communications Handbook Ed. Jerry D. Gibson Boca Raton: CRC Press LLC, 1999

Wireless Video Communications

Madhukar Budagavi
Texas Instruments

Raj Talluri
Texas Instruments

31.1 Introduction
31.2 Wireless Video Communications
    Recommendation H.223
31.3 Error Resilient Video Coding
    A Standard Video Coder • Error Resilient Video Decoding • Classification of Error-Resilience Techniques
31.4 MPEG-4 Error Resilience Tools
    Resynchronization • Data Partitioning • Reversible Variable Length Codes (RVLCs) • Header Extension Code (HEC) • Adaptive Intra Refresh (AIR)
31.5 H.263 Error Resilience Tools
    Slice Structure Mode (Annex K) • Independent Segment Decoding (ISD) (Annex R) • Error Tracking (Appendix I) • Reference Picture Selection (Annex N)
31.6 Discussion
Defining Terms
References
Further Information

31.1 Introduction

Recent advances in technology have resulted in a rapid growth in mobile communications. With this explosive growth, the need for reliable transmission of mixed media information (audio, video, text, graphics, and speech data) over wireless links is becoming an increasingly important application requirement. The bandwidth requirements of raw video data are very high (a 176 × 144 pixel color video sequence requires over 8 Mb/s). Since the bandwidth available on current wireless channels is limited, the video data has to be compressed before it can be transmitted on the wireless channel. The techniques used for video compression typically utilize predictive coding schemes to remove redundancy in the video signal. They also employ variable length coding schemes, such as Huffman codes, to achieve further compression.

The wireless channel is a noisy fading channel characterized by long bursts of errors [8]. When compressed video data is transmitted over wireless channels, the effect of channel errors on the video can be severe. The variable length coding schemes make the compressed bitstream sensitive to channel errors. As a result, the video decoder that is decoding the corrupted video bitstream can easily lose synchronization with the encoder. Predictive coding techniques, such as block motion compensation, which are used in current video compression standards, make matters worse by quickly propagating the effects of channel errors across the video sequence and rapidly degrading the video quality. This may render the video sequence totally unusable.

Error control coding [5], in the form of Forward Error Correction (FEC) and/or Automatic Repeat reQuest (ARQ), is usually employed on wireless channels to improve the channel conditions. FEC techniques prove to be quite effective against random bit errors, but their performance is usually not adequate against longer duration burst errors. FEC techniques also come with an increased overhead in terms of the overall bitstream size; hence, some of the coding efficiency gains achieved by video compression are lost. ARQ techniques typically increase the delay and, therefore, might not be suitable for real-time videoconferencing. Thus, in practical video communication schemes, error control coding is typically used only to provide a certain level of error protection to the compressed video bitstream, and it becomes necessary for the video coder to accept some level of errors in the video bitstream. Error-resilience tools are introduced in the video codec to handle the residual errors that remain after error correction.

The emphasis in this chapter is on discussing relevant international standards that are making wireless video communications possible. We will concentrate on both the error control and source coding aspects of the problem. In the next section, we give an overview of a wireless video communication system that is a part of a complete wireless multimedia communication system. The International Telecommunication Union Telecommunication Standardization Sector (ITU-T) H.223 [1] standard, which describes a method of providing error protection to the video data before it is transmitted, is also described. It should be noted that the main function of H.223 is to multiplex/demultiplex the audio, video, text, graphics, etc., which are typically communicated together in a videoconferencing application; error protection of the transmitted data becomes a requirement to support this functionality on error-prone channels. In Section 31.3, an overview of error-resilient video coding is given. The specific tools adopted into the International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) Moving Picture Experts Group (MPEG) v.4 (i.e., MPEG-4) [7] and the ITU-T H.263 [3] video coding standards to improve the error robustness of the video coder are described in Sections 31.4 and 31.5, respectively. Table 31.1 provides a listing of some of the standards that are described or referred to in this chapter.
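The raw-bitrate figure quoted in the introduction can be reproduced with a short calculation. The sketch below assumes 4:2:0 chroma subsampling, 8 bits per sample, and 30 frames/s. None of these parameters is stated in the text, so they are labeled as assumptions, and raw_bitrate is a hypothetical helper name.

```python
def raw_bitrate(width, height, fps, bits_per_sample=8, chroma="4:2:0"):
    """Uncompressed video bitrate in bits per second.

    Assumptions (not from the text): the chroma format determines the
    average number of samples per pixel, and every sample uses
    bits_per_sample bits.
    """
    samples_per_pixel = {"4:4:4": 3.0, "4:2:2": 2.0, "4:2:0": 1.5}[chroma]
    return width * height * samples_per_pixel * bits_per_sample * fps

# 176 x 144 color sequence, assumed 30 frames/s and 4:2:0 sampling
qcif_bps = raw_bitrate(176, 144, 30)
print(f"{qcif_bps / 1e6:.2f} Mb/s")  # about 9.12 Mb/s under these assumptions
```

With these assumed parameters the result is consistent with the "over 8 Mb/s" stated in the text. Other frame rates or chroma formats would give different figures.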

31.2 Wireless Video Communications

Figure 31.1 shows the basic block diagram of a wireless video communication system [10]. Input video is compressed by the video encoder to generate a compressed bitstream. The transport coder converts the compressed video bitstream into data units suitable for transmission over wireless channels. Typical operations carried out in the transport coder include channel coding, framing of data, modulation, and control operations required for accessing the wireless channel. At the receiver side, the inverse operations are performed to reconstruct the video signal for display.

In practice, the video communication system is part of a complete multimedia communication system and needs to interact with other system components to achieve the desired functionality. Hence, it becomes necessary to understand the other components of a multimedia communication system in order to design a good video communication system. Figure 31.2 shows the block diagram of a wireless multimedia terminal based on the ITU-T H.324 set of standards [4]. We use the H.324 standard as an example because it is the first videoconferencing standard for which mobile extensions were added to facilitate use on wireless channels. The system components of a multimedia terminal can be grouped into three processing blocks: (1) audio, video, and data (the word data is used here to mean still images/slides, shared files, documents, etc.), (2) control, and (3) multiplex-demultiplex blocks.

TABLE 31.1 List of Relevant Standards

Standard                          Title
ISO/IEC 14496-2 (MPEG-4)          Information Technology—Coding of Audio-Visual Objects: Visual
H.263 (Version 1 and Version 2)   Video coding for low bitrate communication
H.261                             Video codec for audiovisual services at p × 64 kbit/s
H.223                             Multiplexing protocol for low bitrate multimedia communication
H.324                             Terminal for low bitrate multimedia communication
H.245                             Control protocol for multimedia communication
G.723.1                           Dual rate speech coder for multimedia communication transmitting at 5.3 and 6.3 kbit/s

FIGURE 31.1: A wireless video communication system.

1. Audio, video, and data processing blocks: These blocks produce or consume the multimedia information that is communicated. The aggregate bitrate generated by these blocks is restricted due to limitations of the wireless channel and, therefore, the total rate allowed has to be judiciously allocated among these blocks. Typically, the video blocks use up the highest percentage of the aggregate rate, followed by audio and then data. H.324 specifies the use of H.261/H.263 for video coding and G.723.1 for audio coding.

2. Control block: This block has a wide variety of responsibilities, all aimed at setting up and maintaining a multimedia call. The control block facilitates the set-up of compression methods and preferred bitrates for audio, video, and data to be used in the multimedia call. It is also responsible for end-to-network signalling for accessing the network and end-to-end signalling for reliable operation of the multimedia call. H.245 is the control protocol in the H.324 suite of standards that specifies the control messages to achieve the above functionality.

3. Multiplex-Demultiplex (MUX) block: This block multiplexes the resulting audio, video, data, and control signals into a single stream before transmission on the network. Similarly, the received bitstream is demultiplexed to obtain the audio, video, data, and control signals, which are then passed to their respective processing blocks. The MUX block accesses the network through a suitable network interface. The H.223 standard is the multiplexing scheme used in H.324.

FIGURE 31.2: Configuration of a wireless multimedia terminal.

Proper functioning of the MUX is crucial to the operation of the video communication system, as all the multimedia data/signals flow through it. On wireless channels, transmission errors can lead to a breakdown of the MUX resulting in, for example, nonvideo data being channeled to the video decoder or corrupted video data being passed on to the video decoder. Three annexes were specifically added to H.223 to enable its operation in error-prone environments. Below, we give a more detailed overview of H.223 and point out the levels of error protection provided by H.223 and its three annexes. It should also be noted that MPEG-4 does not specify a lower-level MUX like H.223, and thus H.223 can also be used to transmit MPEG-4 video data.

31.2.1 Recommendation H.223

Video, audio, data, and control information is transmitted in H.324 on distinct logical channels. H.223 determines the way in which the logical channels are mixed into a single bitstream before transmission over the physical channel (e.g., the wireless channel). The H.223 multiplex consists of two layers, the multiplex layer and the adaptation layer, as shown in Fig. 31.2. The multiplex layer is responsible for multiplexing the various logical channels. It transmits the multiplexed stream in the form of packets. The adaptation layer adapts the information stream provided by the applications above it to the multiplex layer below it by adding, where appropriate, additional octets for the purposes of error control and sequence numbering. The type of error control used depends on the type of information (audio/video/data/control) being conveyed in the stream. The adaptation layer provides error control support in the form of both FEC and ARQ.

H.223 was initially targeted for use on the benign general switched telephone network (GSTN). Later on, to enable its use on wireless channels, three annexes (referred to as Levels 1–3, respectively) were defined to provide improved levels of error protection. The initial specification of H.223 is referred to as Level 0. Together, Levels 0–3 provide for a trade-off of error robustness against the overhead required, with Level 0 being the least robust and using the least overhead and Level 3 being the most robust and using the most overhead.

1. H.223 Level 0: Default mode. In this mode the transmitted packets are of variable length and are delimited by an 8-bit HDLC (High-level Data Link Control) flag (01111110). Each packet consists of a 1-octet header followed by the payload, which consists of a variable number of information octets. The header octet includes a Multiplex Code (MC) which specifies, by indexing to a multiplex table, the logical channels to which each octet in the information field belongs. To prevent emulation of the HDLC flag in the payload, bitstuffing is adopted.

2. H.223 Level 1 (Annex A): Communication over low error-prone channels. The use of bitstuffing leads to poor performance in the presence of errors; therefore, in Level 1, bitstuffing is not performed. The other improvement incorporated in Level 1 is the use of a longer 16-bit pseudo-noise synchronization flag to allow for more reliable detection of packet boundaries. The input bitstream is correlated with the synchronization flag and the output of the correlator is compared with a correlation threshold. Whenever the correlator output is equal to or greater than the threshold, a flag is detected. Since bitstuffing is not performed, it is possible to have this flag emulated in the payload. However, the probability of such an emulation is low and is outweighed by the improvement gained by not using bitstuffing over error-prone channels.

3. H.223 Level 2 (Annex B): Communication over moderately error-prone channels. When compared to the Level 1 operation, Level 2 increases the protection on the packet header. A Multiplex Payload Length (MPL) field, which gives the length of the payload in bytes, is introduced into the header to provide additional redundancy for detecting the length of the video packet. A (24,12,8) extended Golay code is used to protect the MC and the MPL fields. Use of error protection in the header enables robust delineation of packet boundaries. Note that the payload data is not protected in Level 2.

4. H.223 Level 3 (Annex C): Communication over highly error-prone channels. Level 3 goes one step above Level 2 and provides for protection of the payload data. Rate Compatible Punctured Convolutional (RCPC) codes, various CRC polynomials, and ARQ techniques are used for protection of the payload data. Level 3 allows for the payload error protection overhead to vary depending on the channel conditions. RCPC codes are used for achieving this adaptive level of error protection because RCPC codes use the same channel decoder architecture for all the allowed levels of error protection, thereby reducing the complexity of the MUX.
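The two packet delineation methods described for Levels 0 and 1 can be illustrated with a minimal Python sketch. It assumes standard HDLC zero-bit insertion (a 0 is inserted after every five consecutive 1s) for Level 0 and a simple bit-agreement count as the correlator for Level 1. The 16-bit pattern SYNC16, the threshold value, and all function names are illustrative assumptions and are not taken from H.223 or its annexes.

```python
FLAG = [0, 1, 1, 1, 1, 1, 1, 0]  # 8-bit HDLC flag 01111110 quoted in the text

def stuff(bits):
    """Level 0 style bitstuffing: insert a 0 after every five consecutive 1s."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == 1 else 0
        if run == 5:
            out.append(0)  # stuffed bit prevents six 1s in a row
            run = 0
    return out

def unstuff(bits):
    """Inverse of stuff(): drop the bit that follows five consecutive 1s."""
    out, run, i = [], 0, 0
    while i < len(bits):
        out.append(bits[i])
        run = run + 1 if bits[i] == 1 else 0
        if run == 5:
            i += 1  # skip the stuffed 0
            run = 0
        i += 1
    return out

def contains(bits, pattern):
    """True if pattern occurs as an exact contiguous match in bits."""
    n = len(pattern)
    return any(bits[i:i + n] == pattern for i in range(len(bits) - n + 1))

# Illustrative 16-bit pattern only; the real flag is defined in Annex A.
SYNC16 = [1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1]

def correlate(window, flag):
    """Correlator output: number of bit positions where window and flag agree."""
    return sum(1 for a, b in zip(window, flag) if a == b)

def find_flags(bits, flag, threshold):
    """Positions where the correlator output is equal to or above the threshold."""
    n = len(flag)
    return [i for i in range(len(bits) - n + 1)
            if correlate(bits[i:i + n], flag) >= threshold]

# Level 0: a payload that emulates the flag is made safe by bitstuffing.
payload = [0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1]
stuffed = stuff(payload)

# Level 1: the flag is still found at offset 10 despite one channel bit error.
received = [0] * 10 + SYNC16 + [1, 0, 1, 1]
received[13] ^= 1
hits = find_flags(received, SYNC16, threshold=15)
```

In this sketch a single bit error inside the 16-bit flag still yields a correlator output of 15, so a threshold of 15 tolerates it. Lowering the threshold tolerates more errors at the cost of more false detections in the payload, which is the trade-off the Level 1 description refers to.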

31.3 Error Resilient Video Coding

Even after error control and correction, some residual errors still exist in the compressed bitstream fed to the video decoder in the receiver. Therefore, the video decoder should be robust to these errors