Lecture 6

Digital Communications III (ECE 154C)
Introduction to Coding and Information Theory
Tara Javidi
These lecture notes were originally developed by the late Prof. J. K. Wolf.
UC San Diego
Spring 2014
Lossy Source Coding
• Lossy Coding
• Distortion
A/D & D/A
Scalar Quantization
Vector Quantization
Audio Compression
Coding With Distortion
Coding with Distortion
[Placeholder Figure]

ǫ² = lim_{T→∞} (1/T) ∫_{−T/2}^{T/2} E[(x(t) − x̂(t))²] dt = M.S.E.
Discrete-Time Signals
If signals are bandlimited, one can sample at the Nyquist rate and
convert continuous-time problem to discrete-time problem. This
sampling is part of the A/D converter.
[Placeholder Figure]

ǫ² = lim_{m→∞} (1/m) Σ_{i=1}^{m} E[(x_i − x̂_i)²]
A/D Conversion and D/A Conversion
A/D Conversion
• Assume a random variable X that falls in the range (Xmin, Xmax).
• The goal is to convert X into k binary digits. Let M = 2^k.
• The usual A/D converter first subdivides the interval (Xmin, Xmax) into M equal sub-intervals.
• The sub-intervals are of width ∆ = (Xmax − Xmin)/M.
• This is shown below for the case of k = 3 and M = 8.

[Placeholder Figure]

• We call the ith sub-interval ℜi.
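As a concrete sketch of this subdivision step, the following snippet (mine, not from the notes; the name `quantize_index` and the clamping of out-of-range samples are arbitrary choices) maps a sample to its k-bit sub-interval index:

```python
def quantize_index(x, x_min, x_max, k):
    """Map a sample x to the index of its sub-interval among M = 2**k."""
    M = 2 ** k
    delta = (x_max - x_min) / M        # sub-interval width
    i = int((x - x_min) / delta)       # which sub-interval x falls in
    return min(max(i, 0), M - 1)       # clamp out-of-range samples

# k = 3, M = 8 over (-4, 4): delta = 1, so x = 2.5 falls in sub-interval 6
print(quantize_index(2.5, -4.0, 4.0, 3))
```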
D/A Conversion
• Assume that X falls in the region ℜi, i.e. x ∈ ℜi.
• The D/A converter uses, as its estimate of X, the value X̂ = yi, which is the center of the ith region.
• The mean-squared error between X and X̂ is

ǫ² = E[(X − X̂)²] = ∫_{Xmin}^{Xmax} (x − x̂)² fX(x) dx

where fX(x) is the probability density function of the random variable X.
• Let fX|ℜi(x) be the conditional density function of X given that X falls in the region ℜi. Then

ǫ² = Σ_{i=1}^{M} P[x ∈ ℜi] ∫_{x∈ℜi} (x − yi)² fX|ℜi(x) dx
D/A Conversion
• Note that

Σ_{i=1}^{M} P[x ∈ ℜi] = 1,   and for i = 1, 2, ..., M,   ∫_{x∈ℜi} fX|ℜi(x) dx = 1.

• Make the further assumption that k is large enough that fX|ℜi(x) is a constant over the region ℜi.
• Then fX|ℜi(x) = 1/∆ for all i, and

∫_{x∈ℜi} (x − yi)² fX|ℜi(x) dx = (1/∆) ∫_{a}^{b} (x − (a+b)/2)² dx
                                = (1/∆) ∫_{−∆/2}^{∆/2} x² dx
                                = (1/∆) · (2/3) · (∆/2)³
                                = ∆²/12
D/A Conversion
• In other words,

ǫ² = Σ_{i=1}^{M} P[x ∈ ℜi] · (∆²/12) = ∆²/12

• If X has variance σx², the signal-to-noise ratio of the A/D (& D/A) converter is often defined as σx² / (∆²/12).
• If Xmin = −∞ and/or Xmax = +∞, then the first and last intervals can be infinite in extent.
• However, fX(x) is usually small enough in those intervals that the result is still approximately the same.

[Placeholder Figure]
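A quick numerical check of the ∆²/12 result, using a uniform source so that fX|ℜi(x) is exactly constant over each region (my sketch; all names are arbitrary):

```python
import random

def quantization_mse(samples, x_min, x_max, k):
    """Average (x - x_hat)^2 when x_hat is the center of x's sub-interval."""
    M = 2 ** k
    delta = (x_max - x_min) / M
    err2 = 0.0
    for x in samples:
        i = min(int((x - x_min) / delta), M - 1)
        x_hat = x_min + (i + 0.5) * delta     # mid-point reconstruction
        err2 += (x - x_hat) ** 2
    return err2 / len(samples)

random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(200_000)]
mse = quantization_mse(xs, -1.0, 1.0, 4)      # k = 4, so delta = 2/16 = 0.125
print(mse, 0.125 ** 2 / 12)                   # both close to 0.0013
```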
SCALAR QUANTIZATION of GAUSSIAN SAMPLES
Scalar Quantization
[Placeholder Figure]

• The ENCODER maps each interval to a 3-bit codeword; the DECODER maps each codeword to the mid-point of that interval:

  interval            codeword    output
  x ≤ −3b             000         −3.5b
  −3b < x ≤ −2b       001         −2.5b
  −2b < x ≤ −b        010         −1.5b
  −b < x ≤ 0          011         −0.5b
  0 < x ≤ b           100         +0.5b
  b < x ≤ 2b          101         +1.5b
  2b < x ≤ 3b         110         +2.5b
  3b < x              111         +3.5b
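The table can be implemented in a few lines. A sketch (mine; it uses floor division, so each interval is closed on the left rather than on the right as in the table, which matters only at exact boundary points):

```python
def encode(x, b):
    """3-bit codeword for the interval containing x (boundaries at multiples of b)."""
    i = int(x // b)                 # signed interval index via floor division
    i = min(max(i, -4), 3)          # clamp the two unbounded end intervals
    return format(i + 4, '03b')     # map -4..3 onto codewords 000..111

def decode(code, b):
    """Reconstruct at the interval mid-point, e.g. '000' -> -3.5b."""
    i = int(code, 2) - 4
    return (i + 0.5) * b

print(encode(2.3, 1.0))    # 2b < x <= 3b
print(decode('110', 1.0))  # mid-point of that interval
```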
Optimum Scalar Quantizer
• Let us construct boundaries bi (b0 = −∞, bM = +∞) and quantization levels ai such that

bi−1 ≤ x < bi −→ x̂ = ai,   i = 1, 2, ..., M

[Placeholder Figure]

• The question is how to optimize the {bi} and {ai} to minimize the distortion

ǫ² = Σ_{i=1}^{M} ∫_{bi−1}^{bi} (x − ai)² fX(x) dx
Optimum Scalar Quantizer
• To optimize ǫ² = Σ_{i=1}^{M} ∫_{bi−1}^{bi} (x − ai)² fX(x) dx, we take derivatives and set them equal to zero:

∂ǫ²/∂aj = 0,   ∂ǫ²/∂bj = 0

• And use Leibniz's rule:

(d/dt) ∫_{a(t)}^{b(t)} f(x, t) dx = f(b(t), t) (db/dt) − f(a(t), t) (da/dt) + ∫_{a(t)}^{b(t)} (∂f/∂t) dx
Optimum Scalar Quantizer
∂/∂bj Σ_{i=1}^{M} ∫_{bi−1}^{bi} (x − ai)² fX(x) dx
  = ∂/∂bj ( ∫_{bj−1}^{bj} (x − aj)² fX(x) dx + ∫_{bj}^{bj+1} (x − aj+1)² fX(x) dx )
  = (bj − aj)² fX(bj) − (bj − aj+1)² fX(bj) = 0

bj² − 2aj bj + aj² = bj² − 2bj aj+1 + aj+1²

2bj (aj+1 − aj) = aj+1² − aj²

bj = (aj+1 + aj)/2   (I)
Optimum Scalar Quantizer
∂/∂aj Σ_{i=1}^{M} ∫_{bi−1}^{bi} (x − ai)² fX(x) dx = −2 ∫_{bj−1}^{bj} (x − aj) fX(x) dx = 0

aj ∫_{bj−1}^{bj} fX(x) dx = ∫_{bj−1}^{bj} x fX(x) dx

aj = ( ∫_{bj−1}^{bj} x fX(x) dx ) / ( ∫_{bj−1}^{bj} fX(x) dx )   (II)
Optimum Scalar Quantizer
• Note that the {bi} can be found from (I) once the {ai} are known.
  ◦ In fact, the {bi} are the midpoints of the {ai}.
• The {ai} can also be solved from (II) once the {bi} are known.
  ◦ The {ai} are the centroids of the corresponding regions.
• Thus one can use a computer to iteratively solve for the {ai} and the {bi}:
  1. Start with an initial guess for the {bi}.
  2. Use (II) to solve for the {ai}.
  3. Use (I) to solve for the {bi}.
  4. Repeat steps 2 and 3 until the {ai} and the {bi} "stop changing".
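The iteration above is known as the Lloyd-Max algorithm. A sketch for a zero-mean, unit-variance Gaussian source (mine, not course code; for convenience it starts from a guess for the {ai} rather than the {bi}, and uses the closed-form Gaussian centroid ∫x fX = φ(bj−1) − φ(bj) over ∫fX = Φ(bj) − Φ(bj−1)):

```python
import math

def phi(x):
    """Standard normal pdf; evaluates to 0.0 at x = +/-inf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cdf via erf; Phi(-inf) = 0, Phi(+inf) = 1."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def lloyd_max(M, iters=100):
    """Iterate (I) and (II) for a zero-mean, unit-variance Gaussian source."""
    a = [-2.0 + (i + 0.5) * 4.0 / M for i in range(M)]   # initial guess for levels
    for _ in range(iters):
        # (I): each boundary is the mid-point of adjacent levels
        b = [-math.inf] + [(a[j] + a[j + 1]) / 2.0 for j in range(M - 1)] + [math.inf]
        # (II): each level is the centroid of its region
        a = [(phi(b[j]) - phi(b[j + 1])) / (Phi(b[j + 1]) - Phi(b[j]))
             for j in range(M)]
    return a

# For M = 2 the levels converge to about -/+0.7979 = -/+sqrt(2/pi)
print(lloyd_max(2))
```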
Comments on Optimum Scalar Quantizer
1. This works for any fX(x).
2. If fX(x) has only finite support, one adjusts b0 and bM to be the limits of the support.

[Placeholder Figure]

3. One needs to know ∫_α^β fX(x) dx and ∫_α^β x fX(x) dx (true for any fX(x)).
4. For a Gaussian, we can integrate by parts or let y = x²:

∫_α^β (1/√(2π)) e^{−x²/2} dx = Q(α) − Q(β)

∫_α^β x (1/√(2π)) e^{−x²/2} dx = ...
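Both Gaussian integrals have closed forms available through the standard library; a sketch (mine; writing Q in terms of erfc is a standard identity):

```python
import math

def Q(x):
    """Gaussian tail probability Q(x) = P[X > x] for standard normal X."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

alpha, beta = 0.0, 1.0
print(Q(alpha) - Q(beta))      # probability mass on (0, 1): ~0.3413
print(phi(alpha) - phi(beta))  # the first-moment integral over (0, 1)
```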
Comments on Optimum Scalar Quantizer
5. If M = 2^a one could use a binary digits to represent the quantized value. However, since the quantized values are not necessarily equally likely, one could use a HUFFMAN CODE to use fewer binary digits (on average).
6. After the {ai} and {bi} are known, one computes ǫ² from

ǫ² = Σ_{i=1}^{M} ∫_{bi−1}^{bi} (x − ai)² fX(x) dx

7. For M = 2 and fX(x) = (1/√(2πσ²)) e^{−x²/(2σ²)}, we have

b0 = −∞, b1 = 0, b2 = +∞, and a2 = −a1 = √(2σ²/π)

8. It is also easy to show that ǫ² = (1 − 2/π)σ² = 0.3634σ².
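Comment 8 can be verified by expanding the square, using E|X| = √(2/π)σ for a zero-mean Gaussian (a standard fact; the sketch below is mine):

```python
import math

sigma = 1.0
a2 = math.sqrt(2.0 * sigma**2 / math.pi)   # a2 = -a1 from comment 7

# eps^2 = E[(X - Xhat)^2] with Xhat = +a2 for X > 0 and -a2 for X < 0.
# Expanding: E[X^2] - 2*a2*E[|X|] + a2^2, with E[|X|] = sqrt(2/pi)*sigma.
eps2 = sigma**2 - 2.0 * a2 * math.sqrt(2.0 / math.pi) * sigma + a2**2
print(eps2)   # ~0.3634, i.e. (1 - 2/pi) * sigma^2
```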
Vector Quantization
Vector Quantization
• One can achieve a smaller ǫ² by quantizing several samples at a time.
• We would then use regions in an m-dimensional space.

[Placeholder Figure]

• Shannon characterized this in terms of the "rate-distortion formula," which tells us how small ǫ² can be (as m → ∞).
• For a Gaussian source with one binary digit per sample,

ǫ² ≥ σ²/4 = 0.25σ²

  ◦ This follows from the result on the next page.
  ◦ Contrast this with the scalar case: ǫ²s = (1 − 2/π)σ² = 0.3634σ².
VQ: Discrete Memoryless Gaussian Source
• Let the source produce i.i.d. Gaussian samples X1, X2, ... where

fX(x) = (1/√(2πσ²)) e^{−x²/(2σ²)}

• Let the source encoder produce a sequence of binary digits at a rate of R binary digits/source symbol.
  ◦ In our previous terminology, R = log₂ M.
• Let the source decoder produce the sequence X̂1, X̂2, ..., X̂i, ... such that the mean-squared error between {Xi} and {X̂i} is ǫ².

ǫ² = (1/n) Σ_{i=1}^{n} E[(Xi − X̂i)²]
VQ: Discrete Memoryless Gaussian Source
• Then one can prove that for any such system

R ≥ (1/2) log₂(σ²/ǫ²)   for ǫ² ≤ σ²

  ◦ Note that R = 0 for ǫ² ≥ σ². What does this mean?
  ◦ Note that for R = log₂ M = 1,

1 ≥ (1/2) log₂(σ²/ǫ²) ⇒ 2 ≥ log₂(σ²/ǫ²) ⇒ 4 ≥ σ²/ǫ² ⇒ ǫ² ≥ (1/4)σ²

• This is an example of "Rate-Distortion Theory."
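The bound can be written as a pair of helper functions (my sketch; the inverse form ǫ² = σ² · 4^(−R) follows by solving R = (1/2) log₂(σ²/ǫ²) for ǫ²):

```python
import math

def rate_gaussian(eps2, sigma2):
    """R(eps^2) = max(0, 0.5 * log2(sigma^2 / eps^2)) bits per sample."""
    if eps2 >= sigma2:
        return 0.0                   # zero rate suffices (just output 0)
    return 0.5 * math.log2(sigma2 / eps2)

def distortion_gaussian(R, sigma2):
    """Smallest achievable eps^2 at rate R: sigma^2 * 4**(-R)."""
    return sigma2 * 4.0 ** (-R)

print(distortion_gaussian(1.0, 1.0))   # the sigma^2/4 bound at R = 1
print(rate_gaussian(0.25, 1.0))
```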
Reduced Fidelity Audio Compression
Reduced Fidelity Audio Compression
• MP3 players use a form of audio compression called MPEG-1 Audio Layer 3.
• It takes advantage of a psycho-acoustic phenomenon whereby
  ◦ a loud tone at one frequency "masks" the presence of softer tones at neighboring frequencies;
  ◦ hence, these softer neighboring tones need not be stored (or transmitted).
• The compression efficiency of an audio compression scheme is usually described by the encoded bit rate (prior to the introduction of coding bits).
Reduced Fidelity Audio Compression
• The CD has a bit rate of (44.1 × 10³ × 2 × 16) = 1.41 × 10⁶ bits/second.
  ◦ The term 44.1 × 10³ is the sampling rate, which is approximately the Nyquist rate of the audio to be compressed.
  ◦ The term 2 comes from the fact that there are two channels in a stereo audio system.
  ◦ The term 16 comes from the 16-bit (or 2¹⁶ = 65536 level) A/D converter.
  ◦ Note that a slightly higher sampling rate, 48 × 10³ samples/second, is used for a DAT recorder.
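The arithmetic, spelled out:

```python
sample_rate = 44_100       # samples/second per channel (~ Nyquist rate of audio)
channels = 2               # stereo
bits_per_sample = 16       # 16-bit A/D converter

cd_bit_rate = sample_rate * channels * bits_per_sample
print(cd_bit_rate)         # 1411200, i.e. ~1.41e6 bits/second
```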
Reduced Fidelity Audio Compression
• Different standards are used in MP3 players.
• Several bit rates are specified in the MPEG-1, Layer 3 standard.
  ◦ These are 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160, 192, 224, 256 and 320 kilobits/sec.
  ◦ The sampling rates allowed are 32, 44.1 and 48 kHz, but the sampling rate of 44.1 × 10³ Hz is almost always used.
• The basic idea behind the scheme is as follows.
  ◦ A block of 576 time-domain samples is converted into 576 frequency-domain samples using a DFT.
  ◦ The coefficients are then modified using psycho-acoustic principles.
  ◦ The processed coefficients are then converted into a bit stream using various schemes, including Huffman encoding.
  ◦ The process is reversed at the receiver: bits −→ frequency-domain coefficients −→ time-domain samples.
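A minimal illustration of the time-to-frequency step (mine; a plain DFT of one 576-sample block, whereas the actual codec's transform and psycho-acoustic model are far more elaborate):

```python
import cmath
import math

BLOCK = 576   # samples per block, as in the scheme described above

def dft(block):
    """Plain (slow) DFT of one block of time-domain samples."""
    N = len(block)
    return [sum(block[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

# A pure tone at frequency bin 20: its energy concentrates in that bin, which
# is what a psycho-acoustic model exploits when deciding which coefficients
# (and masked neighbors) to keep or discard
block = [math.cos(2.0 * math.pi * 20.0 * n / BLOCK) for n in range(BLOCK)]
mags = [abs(c) for c in dft(block)]
peak = max(range(BLOCK // 2), key=lambda k: mags[k])
print(peak)   # the peak lands in bin 20
```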