Solutions to Practice Midterm #2

UCSD ECE154C
Prof. Young-Han Kim
Handout #15
Thursday, April 30, 2015
Solutions to Practice Midterm (Spring 2012)
(Total: 160 points)
There are 4 problems, each with multiple parts, and each part is worth 10 points. Your
answers should be as clear and readable as possible.
1. Codes (30 pts). Consider the code {1, 10}.
(a) Is it instantaneous?
Solution: It is not instantaneous since 1 is a prefix of 10.
(b) Is it uniquely decodable?
Solution: It is uniquely decodable. Let source symbols A and B correspond
to codewords 1 and 10, respectively. Take any code string and scan it from the
beginning. Every codeword starts with 1, so the first bit must be a 1. If the next
bit is a 0, the codeword 10 is complete and B is decoded. If the next bit is a 1
(or the string ends), the first codeword must have been 1, so A is decoded. By
repeating this argument on the rest of the string, we can see that this code is
uniquely decodable.
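As an illustration (not part of the original solution), here is a minimal Python sketch of the sequential decoder described above; the function decode and the sample string are ours.

def decode(bits: str) -> str:
    out = []
    i = 0
    while i < len(bits):
        # Every codeword starts with 1, so a valid code string has a 1 here.
        assert bits[i] == "1"
        # One bit of lookahead resolves the ambiguity: a following 0
        # completes the codeword 10 (B); a following 1, or the end of
        # the string, means the 1 by itself was the codeword for A.
        if i + 1 < len(bits) and bits[i + 1] == "0":
            out.append("B")
            i += 2
        else:
            out.append("A")
            i += 1
    return "".join(out)

print(decode("1101"))  # ABA, since 1101 parses uniquely as 1 | 10 | 1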
(c) Is it nonsingular?
Solution: It is nonsingular because the two codewords are different.
2. Codeword lengths (20 pts).
(a) Can {1, 2, 2, 3} be codeword lengths of a binary uniquely decodable code? If so,
construct a code with given codeword lengths.
Solution: Since 2^{-1} + 2^{-2} + 2^{-2} + 2^{-3} = 9/8 > 1, by the McMillan theorem
{1, 2, 2, 3} cannot be codeword lengths of a binary uniquely decodable code.
(b) Can {1, 2, 3, 3} be codeword lengths of a binary uniquely decodable code? If so,
construct a code with given codeword lengths.
Solution: Since 2^{-1} + 2^{-2} + 2^{-3} + 2^{-3} = 1 satisfies the Kraft inequality, we can
construct an instantaneous code and hence a uniquely decodable code. Consider
a random variable X taking values in the set X = {A, B, C, D} with probabilities
1/2, 1/4, 1/8, 1/8, respectively. Following the Huffman code construction, we can obtain
the following table.
X    Probability    Codeword
A    0.5            1
B    0.25           01
C    0.125          001
D    0.125          000
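As a quick numerical check of parts (a) and (b) (not in the original handout), the Kraft sums can be computed directly; the helper kraft_sum is ours.

def kraft_sum(lengths, base=2):
    return sum(base ** -l for l in lengths)

print(kraft_sum([1, 2, 2, 3]))  # 1.125 = 9/8 > 1: no uniquely decodable code
print(kraft_sum([1, 2, 3, 3]))  # 1.0 <= 1: an instantaneous code exists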
3. Huffman coding (60 pts). Consider a source that produces an i.i.d. sequence of symbols
from the alphabet {A, B} with probabilities 0.7, 0.3, respectively.
(a) Find the binary entropy of the source.
Solution: The binary entropy of the source is
−0.7 × log2 0.7 − 0.3 × log2 0.3 = 0.8813 bits per source symbol.
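A short Python sketch (ours, not part of the handout) verifying this value, along with the ternary entropy needed in part (d); the helper entropy is ours.

from math import log

def entropy(probs, base=2):
    return -sum(p * log(p, base) for p in probs)

print(entropy([0.7, 0.3], base=2))  # ~0.8813, part (a)
print(entropy([0.7, 0.3], base=3))  # ~0.556, part (d)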
(b) Find a binary Huffman code for taking 2 source symbols at a time.
Solution:
0.49"
AA"
0.21"
AB"
0.21"
BA"
0.09"
BB"
1.00"
0.51"
0.30"
The constructed binary Huffman code is summarized in the following table.
source symbols    probabilities    codewords
AA                0.49             1
AB                0.21             00
BA                0.21             011
BB                0.09             010
(c) Find the average number of code symbols per source symbol for the Huffman code
found in part (b).
Solution: The average number of code symbols per source symbol for the Huffman code found in part (b) is
(1 × 0.49 + 2 × 0.21 + 3 × 0.21 + 3 × 0.09)/2 = 0.905.
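For reference, here is a minimal heap-based sketch (ours) of the binary Huffman construction on the pair probabilities. Heap tie-breaking may swap the codewords of the equiprobable pairs AB and BA relative to the table above, but the average length is unchanged.

import heapq

def huffman(probs):
    # Heap entries are (probability, tie-break counter, partial codebook).
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)  # two least probable nodes
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, count, merged))
        count += 1
    return heap[0][2]

pairs = {"AA": 0.49, "AB": 0.21, "BA": 0.21, "BB": 0.09}
code = huffman(pairs)
print(code)
print(sum(pairs[s] * len(code[s]) for s in pairs) / 2)  # ~0.905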
(d) Find the ternary entropy of the source.
Solution: The ternary entropy of the source is
−0.7 × log3 0.7 − 0.3 × log3 0.3 = 0.556 ternary digits per source symbol.
(e) Find a ternary Huffman code for taking 3 source symbols at a time.
Solution:
[Figure: ternary Huffman tree. A dummy symbol of probability 0 is added; the dummy, BBB (0.027), and BBA (0.063) merge into 0.09; BAB (0.063), ABB (0.063), and 0.09 merge into 0.216; BAA, ABA, and AAB (0.147 each) merge into 0.441; 0.216, AAA (0.343), and 0.441 merge into the root, 1.00.]
The constructed ternary Huffman code is summarized below.
source symbols    probabilities    codewords
AAA               0.343            1
AAB               0.147            22
ABA               0.147            21
BAA               0.147            20
ABB               0.063            01
BAB               0.063            00
BBA               0.063            022
BBB               0.027            021
(dummy)           0                020
(f) Find the average number of code symbols per source symbol for the Huffman code
found in part (e).
Solution: The average number of code symbols per source symbol for the Huffman code found in part (e) is
(1 × 0.343 + 2 × (3 × 0.147) + 2 × (2 × 0.063) + 3 × 0.063 + 3 × 0.027)/3 = 0.5823.
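A sketch (ours) of the D-ary Huffman construction with dummy symbols, as used in part (e); the function name and dummy labels are ours, and ties among the 0.063 phrases may be broken differently from the tree above without changing the average length.

import heapq

def huffman_dary(probs, D=3):
    items = list(probs.items())
    # Pad with zero-probability dummies until the count is 1 mod (D-1),
    # so that every merge combines exactly D nodes.
    while (len(items) - 1) % (D - 1) != 0:
        items.append(("_pad%d" % len(items), 0.0))
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(items)]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        total, merged = 0.0, {}
        for d in range(D):  # merge the D least probable nodes
            p, _, c = heapq.heappop(heap)
            total += p
            merged.update({s: str(d) + w for s, w in c.items()})
        heapq.heappush(heap, (total, count, merged))
        count += 1
    # Drop the dummy symbols from the finished codebook.
    return {s: w for s, w in heap[0][2].items() if not s.startswith("_pad")}

triples = {"AAA": 0.343, "AAB": 0.147, "ABA": 0.147, "BAA": 0.147,
           "ABB": 0.063, "BAB": 0.063, "BBA": 0.063, "BBB": 0.027}
code = huffman_dary(triples, D=3)
print(sum(triples[s] * len(code[s]) for s in triples) / 3)  # ~0.5823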
4. Variable-length coding (50 pts). Consider a source that produces an i.i.d. sequence of
symbols from the alphabet {A, B} with probabilities 0.7, 0.3, respectively.
(a) Consider the following code:
AA → 0,
ABA → 110,
ABB → 111,
B → 10.
Find the average number of code symbols per source symbol.
Solution: Let's put the relevant information in a table.

source phrases    probabilities    codewords
AA                0.49             0
ABA               0.147            110
ABB               0.063            111
B                 0.3              10
The average number of code symbols per source symbol is
(1 × 0.49 + 3 × 0.147 + 3 × 0.063 + 2 × 0.3)/(2 × 0.49 + 3 × 0.147 + 3 × 0.063 + 1 × 0.3) = 1.72/1.91 = 0.9005.
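The same ratio in a few lines of Python (ours), with the probabilities and codeword lengths taken from the table above:

phrases = {"AA": (0.49, 1), "ABA": (0.147, 3), "ABB": (0.063, 3), "B": (0.3, 2)}
avg_code = sum(p * l for p, l in phrases.values())             # 1.72
avg_source = sum(p * len(s) for s, (p, _) in phrases.items())  # 1.91
print(avg_code / avg_source)  # ~0.9005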
(b) Find a Tunstall code that encodes the source phrases into binary codewords of
length 2.
Solution:
A"
B"
0.7"
A"
0.49"
A"
0.343"
B"
0.21"
B"
0.147"
Since 0.7^2 = 0.49 > 0.3, the all-A leaf remains the most probable leaf, so the tree
keeps splitting it until there are 4 codewords.
The codebook is described in the following table.
source phrases    codewords
B                 00
AB                01
AAB               10
AAA               11
(c) Find the average number of code symbols per source symbol for the Tunstall code
found in part (b).
Solution: The average length of the source phrases for the Tunstall code found in
part (b) equals the sum of the probabilities of the internal nodes of the parsing
tree, namely 1 + 0.7 + 0.49 = 2.19. Since every codeword has length 2, the average
number of code symbols per source symbol is 2/2.19 = 0.9132.
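A sketch (ours) of the Tunstall construction from part (b): repeatedly split the most probable leaf until the number of leaves equals the number of codewords.

def tunstall(p, num_codewords):
    leaves = dict(p)  # start from the single-symbol phrases
    while len(leaves) < num_codewords:
        best = max(leaves, key=leaves.get)  # most probable leaf
        q = leaves.pop(best)
        for s, ps in p.items():  # split it into one child per source symbol
            leaves[best + s] = q * ps
    return leaves

leaves = tunstall({"A": 0.7, "B": 0.3}, num_codewords=4)
# leaves: B (0.3), AB (0.21), AAA (0.343), AAB (0.147)
avg_phrase = sum(prob * len(s) for s, prob in leaves.items())  # 2.19
print(2 / avg_phrase)  # ~0.9132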
(d) Now suppose that the Tunstall code found in part (b) is followed by a binary
Huffman code. Write down the resulting code.
Solution: From the code found in part (b), we can construct the Huffman tree
as follows.
[Figure: Huffman tree. AB (0.21) and AAB (0.147) merge into 0.357; B (0.3) and AAA (0.343) merge into 0.643; 0.643 and 0.357 merge into the root, 1.00.]
Every phrase again receives a codeword of length 2, so the resulting codebook is the same as that in part (b).
source phrases    probabilities    codewords
AAA               0.343            11
B                 0.3              00
AB                0.21             01
AAB               0.147            10
(e) Find the average number of code symbols per source symbol for the code found
in part (d). Compare it to your answer in part (a).
Solution: Since adding the Huffman code does not change the codeword lengths, the
answer is the same as that in part (c), which is worse than that in part (a). This
gives an example showing the suboptimality of Tunstall–Huffman coding as a
variable-length to variable-length lossless compression algorithm.