
Differential Privacy in the Streaming Model
Jalaj Upadhyay
David R. Cheriton School of Computer Science
University of Waterloo
April 24, 2015
Why We Need Privacy? Social Graph
• A natural question: how many people have friends outside their circle?
• The answer is the number of edges crossing the border of the set of vertices corresponding to those people
Friendships or "What You May Call" Between People
Suppose Facebook decides to reveal the relationship graph
Some people might end up in trouble
Disclaimer
I do not endorse any of the above infidelities
None of this work should be used in any of the above-cited or related scenarios
Neither Mr. Clinton, Mr. Woods, nor the NSA funded this research
Why We Need Privacy? Netflix Competition
Task: Deanonymize the data
Training set: 100,480,507 ratings
Qualifying set: 2,817,131 ratings
Winners were announced in September 2009
Differential Privacy: The Framework
• The analyst wishes to get some task done on the database
• The privacy guard protects the privacy of each individual in the database
• The privacy guard performs the task on the database
Differential Privacy: The Mathematical Formulation
The idea is that the absence or presence of an individual entry should not change the output "by much"

Definition. A randomized algorithm M gives (ε, δ)-differential privacy if, for all "neighboring databases" D and D̃, and for all S ⊆ Range(M),

    Pr[M(D) ∈ S] ≤ exp(ε) · Pr[M(D̃) ∈ S] + δ

We restrict how the privacy guard can access the database
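To make the definition concrete, here is a minimal Python sketch (not from the talk) of the standard Laplace mechanism for a counting query; a sensitivity-1 count with Laplace(1/ε) noise is a textbook instance of (ε, 0)-differential privacy under the definition above.

```python
import numpy as np

def laplace_count(database, epsilon, rng=np.random.default_rng()):
    """Release a count with Laplace noise.

    A counting query has sensitivity 1 (adding or removing one person
    changes the count by at most 1), so Laplace(1/epsilon) noise gives
    (epsilon, 0)-differential privacy -- a standard instance of the
    definition on this slide.
    """
    true_count = sum(database)
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Neighboring databases: they differ in one individual's entry.
D = [1, 0, 1, 1, 0, 1]          # 1 = "has a friend outside the circle"
D_neighbor = [1, 0, 1, 1, 0, 0]

# For any output set S, Pr[M(D) in S] <= exp(eps) * Pr[M(D') in S].
print(laplace_count(D, epsilon=0.5))
print(laplace_count(D_neighbor, epsilon=0.5))
```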
Differentially Private Streaming Model of Computation
[Figure: a private matrix is streamed, column by column, to the privacy guard]
The privacy guard:
• operates on the stream
• updates its data-structure (a small sketch of the matrix) after each streamed column
Differentially Private Streaming Model of Computation
• An analyst comes along and requests that a task be performed
• The privacy guard performs the task using only the stored sketch
• The analyst cannot figure out any individual's information: the privacy goal is achieved
Differentially Private Streaming Model of Computation
The extra parameters of interest are:
1. the number of passes over the matrix
2. the space required by the data-structures
3. the time required to update the data-structures

Two common methods for constructing the data-structure:
1. random sampling
2. sketch-based approaches
This Presentation in One Slide
Non-private Setting
Data-structure: a sketch generated using a random matrix
⇓
Efficient one-pass streaming algorithms
This Presentation in One Slide
Private Setting
A special distribution of random matrices
+
A sketch generated using a random matrix picked from this distribution
⇓
Differentially private one-pass streaming algorithms
with optimal-space data-structures and comparable update time
Outline of the Talk
• Part I covers space-efficient, privacy-preserving data-structures
• Part II improves the update efficiency of the data-structure
Part I
Space-Efficient Data-Structures¹

¹ J. Upadhyay. Differentially Private Linear Algebra in the Streaming Model. arXiv:1409.5414
Main Result
Streaming Private Sketch Generator (PSG1)
• Pick a random Gaussian matrix Φ
• Multiply Φ with each streamed column

Theorem. If the singular values of the matrix streamed to the PSG1 algorithm are at least σ₁ := 4√(r log(2/δ)) log(r/δ)/ε, then PSG1 preserves (ε, δ)-differential privacy.

A similar result was shown by [BBDS12] for non-streaming algorithms.
Main Result
Streaming Private Sketch Generator (PSG2)
• Pick a random Gaussian matrix Φ
• Multiply ΦᵀΦ with each streamed column

Theorem. If the singular values of the matrix streamed to the PSG2 algorithm are at least σ₂ := 4r log(r/δ)/ε, then PSG2 preserves (ε, δ)-differential privacy.
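A minimal numpy sketch of the two generators as I read these slides: each streamed column is multiplied by Φ (PSG1) or by ΦᵀΦ (PSG2). The scaling of Φ's entries is an assumption on my part, and the singular-value thresholds σ₁, σ₂ are preconditions on the input stream stated in the theorems above, not something the mechanism enforces.

```python
import numpy as np

def psg_sketch(columns, r, variant="PSG1", seed=0):
    """One-pass sketch of a streamed matrix, following the PSG1/PSG2 slides.

    columns : iterable of length-n column vectors (the stream)
    r       : sketch dimension (number of rows of Phi)
    variant : "PSG1" multiplies each column by Phi,
              "PSG2" multiplies each column by Phi^T Phi.
    """
    rng = np.random.default_rng(seed)
    phi = None
    out = []
    for a in columns:
        a = np.asarray(a, dtype=float)
        if phi is None:
            # Gaussian entries with variance 1/r (assumed scaling).
            phi = rng.normal(0.0, 1.0 / np.sqrt(r), size=(r, a.size))
        out.append(phi @ a if variant == "PSG1" else phi.T @ (phi @ a))
    return np.column_stack(out)

# Toy usage; the privacy theorems additionally require every singular value
# of the streamed matrix to be at least sigma_1 (PSG1) or sigma_2 (PSG2).
A = np.random.default_rng(1).normal(size=(50, 20))
Y = psg_sketch(A.T, r=10, variant="PSG1")   # stream A column by column
print(Y.shape)                               # (10, 20)
```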
A Meta Algorithm
• Get the stream in the form of column vectors
• Perturb each vector to lift the singular values
• Feed it to PSG1 or PSG2
• Perform any post-processing
Low-rank Approximation
Objective: Given an input matrix A, find an orthonormal matrix Ψ such that, for N ∈ {F, S},

    ‖A − ΨΨᵀA‖_N ≤ (1 + α) · min_{rank(A_k) ≤ k} ‖A − A_k‖_N + τ

with probability 1 − β
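A quick numerical illustration of the objective (my own check, not from the slides): if Ψ is taken to be the top-k left singular vectors of A, then ‖A − ΨΨᵀA‖ equals the best rank-k error, i.e. the objective is met with α = 0 and τ = 0.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(30, 20))
k = 5

# Best rank-k approximation error via the SVD.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
best_err = np.linalg.norm(A - A_k)           # Frobenius norm (N = F)

# Orthonormal Psi = top-k left singular vectors meets the objective exactly.
Psi = U[:, :k]
err = np.linalg.norm(A - Psi @ Psi.T @ A)
print(err, best_err)                         # the two errors coincide
```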
Why Do We Care?
Structured low-rank approximation arises across many areas:
• systems and control: model reduction, system identification
• signal processing: spectral estimation, image deblurring
• computational mathematics: approximate GCD, approximate factorization
• machine learning: recommender systems, manifold learning
Why Do We Care? Image Processing
Standard Prototype
Non-private constructions use the following two steps:
• Compute Y = ΦA, where Φ is picked from some distribution
• Compute a rank-k matrix B = Π_Y A, where Π_Y is the projection matrix corresponding to Y

A known private construction [HR12] does the following:
• Compute Y = ΦA + G, where Φ is picked from some distribution
• Compute a rank-k matrix B = Π_Y A + G′, where Π_Y is the projection matrix corresponding to Y

It seems like two applications of A are necessary.
We use linear algebra to emulate the second step using only Φ and Y.
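A small numpy sketch of the non-private prototype as I interpret it: with Y = ΦA, the natural projection "corresponding to Y" is onto the row space of Y, so B is A projected onto that row space. The orientation of Φ and this choice of projection are assumptions on my part; the slide only writes B = Π_Y A.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k, r = 100, 60, 5, 20

# A low-rank-plus-noise input matrix.
A = rng.normal(size=(n, k)) @ rng.normal(size=(k, d)) + 0.01 * rng.normal(size=(n, d))

# Step 1: sketch. Phi is an r x n random Gaussian matrix (one pass over A).
Phi = rng.normal(size=(r, n)) / np.sqrt(r)
Y = Phi @ A

# Step 2: project A onto the row space of Y (my reading of Pi_Y).
Q, _ = np.linalg.qr(Y.T)          # d x r orthonormal basis of rowspace(Y)
B = A @ Q @ Q.T                   # rank <= r approximation of A

print(np.linalg.norm(A - B) / np.linalg.norm(A))  # small relative error
```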
Private Low-rank Approximation in a Single Pass
Computation of Y
• Set w to be above the threshold of PSG2
• Pick a random Gaussian matrix Φ
On receiving the column A_i:
• Lift the singular values: Â_i = (w·e_i ; A_i), so that Â = (wI ; A)
• Compute the sketch Y_i = ΦÂ_i
Form Y = (Y_1 · · · Y_d)

Privacy follows from PSG1 as σ₁ ≤ w

Post-processing: Use Y and Φ to compute B = Π_Y A
• Let Π_Y = ΨΨᵀ
• Find a solution to BΨᵀΦ = ΨᵀY
• Compute and publish the required decomposition of B
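A sketch of the "computation of Y" step under my reading of this slide: each arriving column A_i is lifted by stacking w·e_i on top, which is the same as appending the block wI to A and guarantees every singular value of Â is at least w; the sketch Y = ΦÂ is then accumulated column by column in a single pass. The scaling of Φ is an assumption; the post-processing solve is not covered here.

```python
import numpy as np

def private_lra_sketch(columns, n, r, w, seed=0):
    """Single-pass computation of Y = Phi @ A_hat, where A_hat = [w*I; A].

    Lifting with w*I ensures every singular value of A_hat is at least w,
    which is the precondition the PSG theorems need (w is set above the
    PSG2 threshold on the slides).
    """
    rng = np.random.default_rng(seed)
    cols = list(columns)
    d = len(cols)
    Phi = rng.normal(0.0, 1.0 / np.sqrt(r), size=(r, d + n))  # acts on lifted columns
    Y = np.empty((r, d))
    for i, a in enumerate(cols):
        e_i = np.zeros(d)
        e_i[i] = 1.0
        a_hat = np.concatenate([w * e_i, np.asarray(a, float)])  # lifted column
        Y[:, i] = Phi @ a_hat
    return Y, Phi

# Toy usage: every singular value of [w*I; A] is at least w.
rng = np.random.default_rng(1)
A = rng.normal(size=(40, 10))                     # n = 40, d = 10
w = 25.0
A_hat = np.vstack([w * np.eye(10), A])
print(np.linalg.svd(A_hat, compute_uv=False).min() >= w)   # True
Y, Phi = private_lra_sketch(A.T, n=40, r=8, w=w)
print(Y.shape)                                    # (8, 10)
```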
Privacy and Utility Guarantees
Privacy follows from PSG1 and PSG2 as w = σ₂
The utility proof requires perturbation theory and the Johnson-Lindenstrauss property

The Johnson-Lindenstrauss property says that, using a special choice of random projection matrix, we can project a set of vectors to a lower-dimensional space while preserving Euclidean distances
Comparison of Results
[Table: comparison with previous works — [CSS13], [HR12], [HR13], [KT13], [HP14], [DTTZ14], and our two results — in terms of additive noise τ, number of passes, and the norm in which the guarantee holds (F stands for Frobenius norm, S stands for spectral norm). Our results use a single pass with additive noise O(√(nk ln(k/δ))/ε).]
Space Optimality Through the Meta Algorithm

Problem         Lower Bound               Our Result
Multiplication  Ω(dα⁻² log(nd))           Õ(dα⁻² log(nd) log(1/β))
Regression      Ω(d²α⁻¹ log(nd))          Õ(d²α⁻¹ log(nd) κ log(1/β))
Low-rank (F)    Ω(kα⁻¹(n + d) log(nd))    O(kε⁻¹(n + d) log(nd))
Low-rank (S)    −                         O(kε⁻¹(n + d) log(nd))
Part II
Update-Time and Space-Efficient Data-Structures²

² J. Upadhyay. Randomness Efficient Fast-Johnson-Lindenstrauss Transform with Applications in Differential Privacy and Compressed Sensing. arXiv:1410.2470
Privacy and Run-time
Let n = 1 billion = 10⁹, d = 10³, r = 100
Any of the above algorithms would take rnd = 10¹⁴ time
Construction of Φ
1. Pick {g₁, · · · , gₙ} ∼ N(0, 1)ⁿ
2. Divide it into r equal blocks of vectors Φ₁, · · · , Φᵣ and form the block-diagonal matrix

       P := diag(Φ₁, Φ₂, · · · , Φᵣ),   with every off-diagonal block equal to 0^{n/r}

3. Compute Φ = √(1/r) · PΠW, where W is a randomized Hadamard matrix and Π is a permutation matrix
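A numpy/scipy sketch of this construction as I read it: P is block-diagonal with the r Gaussian blocks, W is a randomized Hadamard transform (Hadamard matrix times random signs), Π is a random permutation, and Φ = √(1/r)·PΠW. The exact normalizations of W and of the blocks are assumptions and may differ from the paper (cf. the √(n/t) factor on the later slide).

```python
import numpy as np
from scipy.linalg import hadamard

def construct_phi(n, r, seed=0):
    """Phi = sqrt(1/r) * P * Pi * W, following the construction on this slide.

    Assumes n is a power of two (for the Hadamard matrix) and divisible by r.
    """
    rng = np.random.default_rng(seed)
    b = n // r                                   # block length n/r

    # Steps 1-2: n Gaussians, split into r blocks, placed on the diagonal of P.
    g = rng.normal(size=n)
    P = np.zeros((r, n))
    for i in range(r):
        P[i, i * b:(i + 1) * b] = g[i * b:(i + 1) * b]

    # Randomized Hadamard matrix W = (1/sqrt(n)) * H * D with random signs D.
    D = np.diag(rng.choice([-1.0, 1.0], size=n))
    W = hadamard(n) @ D / np.sqrt(n)

    # Random permutation matrix Pi.
    Pi = np.eye(n)[rng.permutation(n)]

    return np.sqrt(1.0 / r) * P @ Pi @ W

Phi = construct_phi(n=64, r=8)
x = np.random.default_rng(1).normal(size=64)
x /= np.linalg.norm(x)
print(Phi.shape, np.linalg.norm(Phi @ x))   # sketch shape and the image length of a unit vector
```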
Main Result
Streaming Private Sketch Generator (PSG1)
• Pick Φ
• Multiply Φ with each streamed column

Theorem. If the singular values of the matrix streamed to the PSG1 algorithm are at least σ₁ := 4√(r log(2/δ)) log(r/δ)/ε, then PSG1 preserves (ε, δ + negl)-differential privacy.
Main Result
Streaming Private Sketch Generator (PSG2)
• Pick Φ
• Multiply ΦᵀΦ with each streamed column

Theorem. If the singular values of the matrix streamed to the PSG2 algorithm are at least σ₂ := 4r log(r/δ)/ε, then PSG2 preserves (ε, δ + negl)-differential privacy.
Johnson-Lindenstrauss Property
We prove that Φ satisfies the Johnson-Lindenstrauss property

The Hanson-Wright inequality [HW71] controls quadratic forms in sub-gaussian random variables: for a symmetric matrix A and a sub-gaussian random vector g, gᵀAg is concentrated around its mean:

    Pr[ |gᵀAg − E[gᵀAg]| ≥ η ] ≤ exp( −min( c₁η²/‖A‖²_F , c₁η/‖A‖ ) )
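A quick Monte-Carlo illustration (mine, not from the talk) of the concentration the inequality describes: for a standard Gaussian vector g and symmetric A, E[gᵀAg] = tr(A), and the quadratic form stays within a few multiples of ‖A‖_F of its mean with high probability.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# A random symmetric matrix A.
M = rng.normal(size=(n, n))
A = (M + M.T) / 2.0

mean = np.trace(A)          # E[g^T A g] = tr(A) for g ~ N(0, I)
fro = np.linalg.norm(A)     # ||A||_F, the scale in the sub-gaussian regime

samples = np.array([g @ A @ g for g in rng.normal(size=(5000, n))])
deviations = np.abs(samples - mean)

# Empirical tail: the fraction of samples deviating by more than t*||A||_F.
for t in (1, 2, 3):
    print(t, np.mean(deviations > t * fro))
```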
Few More Results
• Graph sparsification composed with random projection also preserves differential privacy
• Differentially private linear regression
• Applications in differentially private manifold learning
• Known variants of the sparse Johnson-Lindenstrauss transform do not preserve privacy
• Simpler proofs for many variants of the Johnson-Lindenstrauss transform using the Hanson-Wright inequality
• Improved bounds in compressed sensing for a small number of measurements
Future Work
Three closely related open problems:
• Improve the bound for the sparse random matrix Φ for large values of r
• Better understand low-dimensional embeddings for normed spaces other than the ℓ₂-norm
• Investigate more applications in private learning theory
Thank you for your attention
Additional Slides: Johnson-Lindenstrauss Property
Recall

    P := diag(Φ₁, Φ₂, · · · , Φᵣ),   with every off-diagonal block equal to 0^{n/r}

Construct P₁ and P₂ which operate as follows:
• On input y, P₂ feeds √(n/t) · (y_{(j−1)n/r+1}, · · · , y_{jn/r}) to the j-th row of P₁
• Row i of P₁ is Φᵢ for 1 ≤ i ≤ r
Additional Slides: Johnson-Lindenstrauss Property
Recall Φ = √(1/r) · PΠW

The proof follows by proving the following four claims:
• W preserves isometry (unitary matrix)
• Wx has bounded infinity norm for all unit vectors x
• P₂Π preserves isometry if its input has bounded infinity norm
• P₁ preserves isometry (Hanson-Wright inequality)
Additional Slides
Differentially Private Linear Regression³

³ J. Upadhyay. Differentially Private Linear Algebra in the Streaming Model. arXiv:1409.5414
Linear Regression: Problem Statement
Objective: Given a private matrix A and a column vector b, output a vector x such that

    Pr[ ‖Ax − b‖_F ≤ (1 + α) · min_{y ∈ R^{d×1}} ‖Ay − b‖_F + τ ] ≥ 1 − β

Why do we care?
− People in finance will tell you about the capital asset pricing model
− Computer scientists will tell you about reinforcement learning
Linear Regression
• Set s above the threshold of PSG1
• Pick a random Gaussian matrix Φ

Data-structure update. On receiving the streamed column A_{:c}:
• Lift the singular values: Â_{:c} = (s·e_c ; 0^{n+d} ; A_{:c})
• Compute the sketch Y_{:c} of Â_{:c}
Form the matrix Y = (Y_{:1} · · · Y_{:n})

Stored data-structure: Y

Answering queries. On being queried with a vector b:
• Set b̂ = (0^d ; 0^{n+d} ; b)
• Compute the sketch b̃ of b̂
• Compute arg min_x ‖Yx − b̃‖

Privacy follows from PSG1 as s = σ₁
Utility follows using the Johnson-Lindenstrauss property
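A minimal numpy sketch of the query step as I read it: the queried vector b is sketched with the same Φ as the stored columns, and the regression is solved on the sketches. The singular-value lifting and the exact padding layout from the slide are omitted here, since that notation is only partially recoverable; this is a generic sketch-and-solve regression under those assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, r = 500, 20, 100

A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

# Sketch the matrix and the queried vector with the same Phi
# (lifting/padding from the slide omitted in this generic sketch).
Phi = rng.normal(size=(r, n)) / np.sqrt(r)
Y = Phi @ A                    # stored data-structure
b_tilde = Phi @ b              # sketch of the queried vector

x_sketch, *_ = np.linalg.lstsq(Y, b_tilde, rcond=None)   # arg min_x ||Yx - b_tilde||
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

# The sketched solution approximately minimizes the original regression cost.
print(np.linalg.norm(A @ x_sketch - b), np.linalg.norm(A @ x_exact - b))
```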
Additional Slides
Differentially Private Manifold Learning⁴

⁴ J. Upadhyay. Randomness Efficient Fast-Johnson-Lindenstrauss Transform with Applications in Differential Privacy and Compressed Sensing. arXiv:1410.2470
Differentially Private Manifold Learning
Input: Y = {y₁, · · · , y_m} in Rⁿ
• Set xᵢ = (σ·eᵢ ; yᵢ) for all 1 ≤ i ≤ m
• Let X = (x₁ · · · x_m)
while residual variance ≥ ζ:
• Sample Φ
• Run the Grassberger-Procaccia algorithm on ΦX to estimate the intrinsic dimension k̂
• Use k̂ to perform Isomap on ΦX
• Calculate the residual variance, increment r
Invoke the Isomap algorithm with k̂, {x₁, · · · , x_m}, and Φ

Privacy follows from PSG1
Analysis of Differentially Private Manifold Learning
Convergence requires proving two things. If

    r ≥ O( k log(nVκ) log(1/δ) / ε² ),

then with probability 1 − β:
• The residual variance of the embedded points is bounded by the residual variance of the sampled points:

    R_Φ ≤ R + O( ( cεd² + cε·√(16r log(2/β)) · ln(4/β) / α )² )

• The intrinsic dimension of the embedded points is within a multiplicative factor of that of the sampled points