Solutions

UCSD ECE154C
Prof. Young-Han Kim
Handout #13
Thursday, April 30, 2015
Solutions to Homework Set #4
1. Consider a source that produces an i.i.d. sequence of symbols from the alphabet
{A, B, C, D, E, F, G} with probabilities 0.22, 0.35, 0.15, 0.09, 0.09, 0.05, and 0.05, respectively. Find the binary Shannon-Fano code and compute the average number of
code symbols per source symbol.
Solution: The Shannon-Fano code construction is shown below.
source symbol probability codeword
B
0.35
00
0.22
01
A
C
0.15
100
D
0.09
101
E
0.09
110
F
0.05
1110
0.05
1111
G
The average number of code symbols per source symbol is
(0.35 + 0.22) ∗ 2 + (0.15 + 0.09 + 0.09) ∗ 3 + (0.05 + 0.05) ∗ 4 = 2.53.
2. Consider a source that produces an i.i.d. sequence of symbols from the alphabet {A, B}
with probabilities 15/16, 1/16, respectively.
(a) Find a Tunstall code for the source that encodes the source phrases into binary
codewords of length 3.
(b) Find a Tunstall code for the source that encodes the source phrases into binary
codewords of length 4.
Solution:
1
15 6
> 16
, the tree will keep splitting the leaf with symbol A until there
(a) Since 16
are 8 codewords.
The codebook is described in the following table.
1
source phrases
B
AB
AAB
AAAB
AAAAB
AAAAAB
AAAAAAB
AAAAAAA
codewords
000
001
010
011
100
101
110
111
14
1
> 16
, the tree keeps splitting the leaf with symbol
(b) Similar to part (a), since 15
16
A until there are 16 codewords.
Here is the codebook.
source phrases
codewords
B
0000
AB
0001
AAB
0010
AAAB
0011
AAAAB
0100
AAAAAB
0101
AAAAAAB
0110
AAAAAAAB
0111
AAAAAAAAB
1000
AAAAAAAAAB
1001
AAAAAAAAAAB
1010
AAAAAAAAAAAB
1011
AAAAAAAAAAAAB
1100
AAAAAAAAAAAAAB
1101
AAAAAAAAAAAAAAB
1110
AAAAAAAAAAAAAAA
1111
3. Using the Lempel Ziv window algorithm with a window size 32, encode the following
text message.
MARY_HAD_A_LITTLE_LAMB_LITTLE_LAMB_LITTLE_LAMB_MARY_HAD_
A_LITTLE_LAMB_WITH_FLEECE_AS_WHITE_AS_SNOW_WHERE_IS_MARY
_AND_HER_LITTLE_LAMB_HERE_IS_MARY_BUT_WHERE_IS_THE_LAMB
Use the following code for the length of the match: match length = 1 → 0, match
length = 2 → 10, match length = 3 → 110, match length = 4 → 1110, etc.
Count the average number of encoded binary digits per source letter.
Solution: Since the window size is 32, we need 5 bits to encode the position of the
matches. Refer to following pages for the procedure.
2
3
4
5
4. Using the algorithm derived in class, find the optimal quantization regions and representation points for the cases of 2, 3, 4 bit quantization for a random variable with
probability density function
1
fX (x) = e−|x|
2
or all x.
In each case, calculate the average mean squared error between the unquantized and
quantized values.
Solution: We will compute the case for 2 bits. The other cases can be done in the
6
same way. Let the initial values for ai ’s and bi ’s as the following:
a = [−1.66, −0.589, 0.589, 1.665],
b = [−∞, −1.177, 0.1.177, ∞].
Since fX (x) is symmetric, we only need to compute one side of the region. Using the
following formula for the Lloyd Algorithm, we acquire the iterated values of the ai ’s
and bi ’s:
R bj
xfX (x)dx
b
aj = Rj−1
bj
f (x)dx
bj−1 X
aj+1 + aj
.
bj =
2
The converged values are
a = [−2.5936, −0.5936, 0.5936, 2.5936]
b = [−∞, −1.5936, 0, 1.5936, ∞].
Now we compute average MSE
2
=
4 Z
X
j=1
bj
(x − aj )2 fX (x)dx = 0.3524.
bj−1
We can compute the case for 3 and 4 bits in the same way.
For the case of 3 bits,
a = [−4.3652, −2.3652, −1.1780, −0.3301, 0.3301, 1.1780, 2.3652, 4.3652]
b = [−∞, −3.3652, −1.7716, −0.7540, 0, 0.7540, 1.7716, 3.3652, ∞]
MSE = 0.1090.
For the case of 4 bits,
a = [−6.2665, −4.2665, −3.0792, −2.2313, −1.5712, −1.0305, −0.5725, −0.1753,
0.1753, 0.5725, 1.0305, 1.5712, 2.2313, 3.0792, 4.2665, 6.2665]
b = [−∞, −5.2665, −3.6729, −2.6553, −1.9012, −1.3008, −0.8015, −0.3739,
0, 0.3739, 0.8015, 1.3008, 1.9012, 2.6553, 3.6729, 5.2665, ∞]
MSE = 0.03070.
7