
Conditional entropy
Properties of entropy
CS349 Cryptography
Department of Computer Science
Wellesley College
Properties of entropy

o In this lecture, we prove some fundamental results concerning entropy, which we apply next time to cryptosystems.

o We begin with a result known as Jensen's inequality.
Concave functions

A real-valued function f is a concave function on an interval I if

\[ f\!\left(\frac{x + y}{2}\right) \ge \frac{f(x) + f(y)}{2} \]

for all x, y ∈ I. f is a strictly concave function on an interval I if

\[ f\!\left(\frac{x + y}{2}\right) > \frac{f(x) + f(y)}{2} \]

for all x, y ∈ I, x ≠ y.
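For instance, the function f(x) = log_2 x, which appears throughout this lecture, is strictly concave on the interval (0, ∞): it is twice differentiable with

\[ \frac{d^2}{dx^2} \log_2 x = -\frac{1}{x^2 \ln 2} < 0 \quad \text{for all } x > 0, \]

and a twice-differentiable function with a strictly negative second derivative on an interval is strictly concave there.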
Jensen's Inequality

Suppose f is a continuous strictly concave function on the interval I,

\[ \sum_{i=1}^{n} a_i = 1, \]

and a_i > 0, 1 ≤ i ≤ n. Then

\[ \sum_{i=1}^{n} a_i f(x_i) \le f\!\left( \sum_{i=1}^{n} a_i x_i \right), \]

where x_i ∈ I, 1 ≤ i ≤ n. Further, equality occurs if and only if x_1 = x_2 = ... = x_n.
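As a numerical illustration, the following Python sketch (the function and variable names are illustrative, not from the lecture) compares the two sides of Jensen's inequality for the strictly concave function f = log_2:

import math

def jensen_check(a, x):
    """Compare sum a_i*f(x_i) against f(sum a_i*x_i) for f = log2.

    Assumes the weights a are positive and sum to 1, and all x_i > 0.
    """
    assert abs(sum(a) - 1.0) < 1e-9 and all(ai > 0 for ai in a)
    lhs = sum(ai * math.log2(xi) for ai, xi in zip(a, x))
    rhs = math.log2(sum(ai * xi for ai, xi in zip(a, x)))
    return lhs, rhs

# Distinct x_i give strict inequality; equal x_i give equality (up to rounding).
print(jensen_check([0.5, 0.3, 0.2], [1.0, 4.0, 8.0]))   # lhs ~1.2 < rhs ~1.72
print(jensen_check([0.5, 0.3, 0.2], [4.0, 4.0, 4.0]))   # lhs ~2.0 = rhs ~2.0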
Maximum information content

Suppose X is a random variable having a probability distribution which takes on the values p_1, p_2, ..., p_n, where p_i > 0, 1 ≤ i ≤ n. Then

\[ H(X) \le \log_2 n, \]

with equality if and only if p_i = 1/n, 1 ≤ i ≤ n.
Applying Jensen's inequality* (with f = log_2, a_i = p_i, and x_i = 1/p_i), we have

\[
H(X) = -\sum_{i=1}^{n} p_i \log_2 p_i
     = \sum_{i=1}^{n} p_i \log_2 \frac{1}{p_i}
     \le \log_2 \left( \sum_{i=1}^{n} p_i \times \frac{1}{p_i} \right)
     = \log_2 n.
\]

*Further, equality occurs if and only if p_i = 1/n, 1 ≤ i ≤ n.
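The bound is easy to see numerically. A small Python sketch (illustrative, not part of the lecture) computes H(X) for a skewed distribution and for the uniform distribution on n = 4 values:

import math

def entropy(p):
    """Shannon entropy in bits of a probability distribution p (all p_i > 0)."""
    return -sum(pi * math.log2(pi) for pi in p)

skewed  = [0.7, 0.1, 0.1, 0.1]
uniform = [0.25, 0.25, 0.25, 0.25]

print(entropy(skewed), math.log2(4))   # ~1.357 < 2.0
print(entropy(uniform), math.log2(4))  # 2.0 = 2.0, equality at the uniform distribution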
Joint distributions

o The information content of a joint distribution is not more than the sum of the information contents of the individual distributions.

o In particular,

\[ H(X, Y) \le H(X) + H(Y), \]

with equality if and only if X and Y are independent random variables.
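A short Python sketch (illustrative only; the joint table is a made-up example) that checks this inequality for one dependent joint distribution, given as a table of probabilities Pr[x, y]:

import math

def H(probs):
    """Shannon entropy in bits; ignores zero-probability entries."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Joint distribution Pr[x, y] with dependent X and Y (rows indexed by x, columns by y).
joint = [[0.4, 0.1],
         [0.1, 0.4]]
px = [sum(row) for row in joint]          # marginal distribution of X
py = [sum(col) for col in zip(*joint)]    # marginal distribution of Y

H_xy = H([p for row in joint for p in row])
print(H_xy, H(px) + H(py))   # H(X,Y) ~1.722 < H(X) + H(Y) = 2.0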
Conditional entropy

Suppose X and Y are random variables. Then for any fixed value y of Y, we get a conditional probability distribution on X; we denote the associated random variable by X|y.

\[ H(X \mid y) = -\sum_{x} \Pr[x \mid y] \log_2 \Pr[x \mid y]. \]

Define the conditional entropy, H(X | Y), to be the weighted average (with respect to the probabilities Pr[y]) of the entropies H(X | y) over all possible values y. It is computed to be

\[ H(X \mid Y) = -\sum_{y} \sum_{x} \Pr[y] \Pr[x \mid y] \log_2 \Pr[x \mid y]. \]
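A minimal Python sketch (not from the slides; the joint table is just an example) computing H(X | Y) as the weighted average of the entropies H(X | y):

import math

def H(probs):
    """Shannon entropy in bits; ignores zero-probability entries."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Joint distribution Pr[x, y]: rows indexed by x, columns by y.
joint = [[0.4, 0.1],
         [0.1, 0.4]]
py = [sum(col) for col in zip(*joint)]   # Pr[y]

# H(X | Y) = sum over y of Pr[y] * H(X | y), where Pr[x | y] = Pr[x, y] / Pr[y].
H_X_given_Y = sum(
    py[j] * H([joint[i][j] / py[j] for i in range(len(joint))])
    for j in range(len(py))
)
print(H_X_given_Y)   # ~0.722 bits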
Two homework assignments

Theorem. H(X, Y) = H(Y) + H(X | Y).

Corollary. H(X | Y) ≤ H(X), with equality if and only if X and Y are independent.
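Before writing the proofs, both statements can be sanity-checked numerically. The following Python sketch (a check on one example, not a proof) reuses a small dependent joint distribution:

import math

def H(probs):
    """Shannon entropy in bits; ignores zero-probability entries."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

joint = [[0.4, 0.1],
         [0.1, 0.4]]
px = [sum(row) for row in joint]
py = [sum(col) for col in zip(*joint)]

H_xy = H([p for row in joint for p in row])
H_x_given_y = sum(py[j] * H([joint[i][j] / py[j] for i in range(2)])
                  for j in range(2))

print(H_xy, H(py) + H_x_given_y)   # both ~1.722: H(X,Y) = H(Y) + H(X|Y)
print(H_x_given_y, H(px))          # ~0.722 <= 1.0: H(X|Y) <= H(X)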