
Atul Singh
Junior Undergraduate
CSE, IIT Kanpur


Dimension reduction is a technique used to
represent high dimensional data in a more
compact way using an alternate representation.
Many data generation processes produce large
data sets that are embedded in a low dimensional
manifold of a high dimensional space. Dimension
reduction can be applied to such data sets

In statistical pattern recognition, we often
encounter a data set in a high dimensional space

Often the data is correlated in such a manner
that there are very few independent dimensions

It is possible to represent the data using much
fewer dimensions. Some benefits are
◦ Compact representation
◦ Less processing time
◦ Visualization of high dimensional data
◦ Interpolation along meaningful dimensions

Linear Methods
◦ Principal Component Analysis (PCA)
◦ Independent Component Analysis (ICA)
◦ Multi-dimensional Scaling (MDS)

Non-linear Methods
◦ Global
 Isomap and its variants
◦ Local
 Locally Linear Embedding (LLE)
 Laplacian Eigenmaps

Principal Component Analysis
◦ Involves finding the directions along which the data
has the largest variance
◦ Express the data as a linear combination of the
eigenvectors along those directions
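
As a rough illustration of these two steps, here is a minimal NumPy sketch of PCA; the function name, the target dimension d, and the use of the sample covariance are my own choices, not from the slides.

```python
import numpy as np

def pca(X, d):
    """Project n points in R^D (rows of X) onto the d eigenvectors of the
    covariance matrix with the largest eigenvalues."""
    Xc = X - X.mean(axis=0)                      # center the data
    cov = Xc.T @ Xc / (len(Xc) - 1)              # D x D sample covariance matrix
    vals, vecs = np.linalg.eigh(cov)             # eigenvalues in ascending order
    top = vecs[:, np.argsort(vals)[::-1][:d]]    # d directions of largest variance
    return Xc @ top                              # low dimensional coordinates
```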

Multidimensional Scaling
◦ Keeps inter-point distances invariant
◦ Again a linear method
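
Likewise, a minimal sketch of classical MDS, assuming the input is an n × n matrix of pairwise distances (the function name and details are my assumptions):

```python
import numpy as np

def classical_mds(D, d):
    """Given an n x n matrix D of pairwise distances, find n points in R^d
    whose Euclidean distances approximate D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered (Gram) matrix
    vals, vecs = np.linalg.eigh(B)               # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:d]             # keep the d largest eigenvalues
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))
```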

Many data sets cannot be represented as linear
combinations of a few basis vectors
◦ Examples – the Swiss roll, face data, etc.


In general the low dimensional data is
embedded on some non-linear manifold
It is not possible to map these manifolds to a
low dimensional space using only translation,
rotation and rescaling




Linear methods have a preconceived notion
of dimension reduction
The goal is to automate the estimation (infer the
degrees of freedom of the data manifold) and
classification (embed the data in a low
dimensional space) process
So we need to go beyond linear methods
Non Linear Methods
◦ Isomap
◦ Locally Linear Embedding


Isomap is a global manifold learning algorithm
A Swiss roll (fig. a) embedded as a manifold in a high dimensional
space; the goal is to reduce it to two dimensions (fig. c)

Consider a human face
◦ How is it represented/stored in the brain?
◦ How is it represented/stored in a computer?


Do we need to store all the information (every
pixel)?
We just need to capture some important
structures

The basic steps involved are
◦ Construct the neighborhood graph: Determine the
neighbors of each point and assign edge weights
dX(i, j) to the graph thus formed
◦ Compute geodesic distances: Estimate the geodesic
distances dM(i, j) using Dijkstra's algorithm on the
neighborhood graph
◦ Reduce dimensions using MDS: Apply MDS to the
computed shortest-path distance matrix and thus
reduce the dimensionality

So Isomap is basically MDS, just using geodesic
distances rather than Euclidean distances
(a minimal sketch follows)
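
A minimal sketch of the three steps above, assuming a dense distance matrix, SciPy's Dijkstra implementation, and a connected neighborhood graph; the parameter names and the classical-MDS step at the end are my choices, not the authors' code.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import pdist, squareform

def isomap(X, k, d):
    """k-NN graph -> geodesic distances via Dijkstra -> classical MDS."""
    dx = squareform(pdist(X))                    # Euclidean distances dX(i, j)
    n = len(X)
    graph = np.full((n, n), np.inf)              # inf marks "no edge"
    for i in range(n):
        nbrs = np.argsort(dx[i])[1:k + 1]        # k nearest neighbors of point i
        graph[i, nbrs] = dx[i, nbrs]             # edge weight = Euclidean distance
    graph = np.minimum(graph, graph.T)           # symmetrize the neighborhood graph
    dm = shortest_path(graph, method='D')        # geodesic estimates dM(i, j)
    J = np.eye(n) - np.ones((n, n)) / n          # classical MDS on the geodesic matrix
    B = -0.5 * J @ (dm ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:d]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))
```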




Locally Linear Embedding (LLE) is another
non-linear dimension reduction algorithm
Follows a local approach to reduce the
dimensions
The idea is based on the assumption that a
point on the manifold resides on a hyperplane
determined by the point and some of its nearest
neighbors
Key question – How to combine these pieces of
hyperplanes and map them to a low
dimensional space?

The basic steps involved are
◦ Assign neighbors: To each point, assign neighbors
using a nearest neighbor approach
◦ Weight calculation: Compute weights Wij such
that Xi is best reconstructed from its neighbors
◦ Compute the low dimensional embedding: Using the
weight matrix computed above, find the corresponding
embedding vectors Yi in the lower dimensional space
by minimizing an error function (sketched below)
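
A minimal sketch of these three steps; the regularization of the local Gram matrix and the parameter names are my assumptions, not from the slides.

```python
import numpy as np

def lle(X, k, d, reg=1e-3):
    """Assign neighbors -> solve for reconstruction weights Wij ->
    find embedding vectors Yi from the bottom eigenvectors."""
    n = len(X)
    W = np.zeros((n, n))
    for i in range(n):
        dists = np.linalg.norm(X - X[i], axis=1)
        nbrs = np.argsort(dists)[1:k + 1]            # step 1: k nearest neighbors
        Z = X[nbrs] - X[i]                           # neighbors relative to Xi
        G = Z @ Z.T                                  # local Gram matrix
        G += reg * np.trace(G) * np.eye(k)           # regularize for numerical stability
        w = np.linalg.solve(G, np.ones(k))           # step 2: reconstruction weights
        W[i, nbrs] = w / w.sum()                     # weights sum to 1
    M = (np.eye(n) - W).T @ (np.eye(n) - W)          # step 3: embedding cost matrix
    vals, vecs = np.linalg.eigh(M)                   # smallest eigenvalues first
    return vecs[:, 1:d + 1]                          # skip the constant eigenvector
```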



The weights computed for
reconstruction are invariant
with respect to translation,
rotation and rescaling
The same weights should
reconstruct the map in
reduced dimensions
So we can conclude that the
local geometry is preserved

Shortcomings of Isomap
◦ Need a dense sampling of data on the manifold
◦ If k is chosen very small then residual error will
be too large
◦ If k is chosen very large then short-circuiting may
happen

Shortcomings of LLE
◦ Due to its local nature, it does not give a complete
picture of the data
◦ Again, problems with the selection of k

Short-circuiting
◦ “When the distance between the folds is very less or there is noise
such that a point from a different fold is chosen to be a neighbour
of the point, the distance computed does not represent geodesic
distance and hence the algorithm fails”

Insight
◦ This problem arises because neighbors are selected just on the
basis of their Euclidean distance. The basic selection criteria
(sketched below) are
 Select all the points within a ball of radius ε
 Select the K nearest neighbors
◦ Locally Linear Isomap overcomes this problem by
modifying the neighbor selection criterion
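
The two Euclidean selection criteria can be sketched as follows (the function and parameter names are mine):

```python
import numpy as np

def knn_neighbors(X, i, k):
    """Indices of the K nearest neighbors of point i (excluding i itself)."""
    dists = np.linalg.norm(X - X[i], axis=1)
    return np.argsort(dists)[1:k + 1]

def eps_ball_neighbors(X, i, eps):
    """Indices of all points within a ball of radius eps around point i."""
    dists = np.linalg.norm(X - X[i], axis=1)
    return np.flatnonzero((dists > 0) & (dists <= eps))
```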

Proposed algorithm – KLL Isomaps



Similar to Tenenbaum's Isomap except for the
selection of nearest neighbors
Previous algorithms (Isomap, LLE) consider only the
Euclidean distance to judge neighborhood
The proposed algorithm is
◦ Find a candidate neighborhood using the K-nn approach
◦ Reconstruct each data point from its candidate neighbors
(as in LLE) so as to minimize the reconstruction error
◦ Neighbors that are close in Euclidean distance and lie
on the locally linear patch of the manifold get higher
weights and hence are selected preferentially
◦ Now KLL ≤ K neighbors are chosen based on the weights
computed after reconstruction (see the sketch below)
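
A rough sketch of how this neighbor selection step might look, under my reading of the slides; the regularization and names are assumptions, and the rest of the algorithm then proceeds as in Isomap, using only the retained neighbors to build the graph.

```python
import numpy as np

def kll_neighbors(X, i, K, K_LL, reg=1e-3):
    """Pick K candidate neighbors of point i, compute LLE-style
    reconstruction weights, and keep the K_LL <= K neighbors
    with the largest weights."""
    dists = np.linalg.norm(X - X[i], axis=1)
    cand = np.argsort(dists)[1:K + 1]            # K candidate neighbors (Euclidean)
    Z = X[cand] - X[i]                           # candidates relative to Xi
    G = Z @ Z.T                                  # local Gram matrix
    G += reg * np.trace(G) * np.eye(K)           # regularize for numerical stability
    w = np.linalg.solve(G, np.ones(K))           # LLE-style reconstruction weights
    w /= w.sum()
    return cand[np.argsort(w)[::-1][:K_LL]]      # K_LL neighbors with highest weights
```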

KLL Isomap has been demonstrated to perform
better than Isomap under
◦ Sparsely sampled data
◦ Noisy data
◦ Dense data without noise

The metrics used for the analysis are “short-circuit
edges” and “residual variance”
A formal proof of the better performance of this
algorithm is yet to be established






Tenenbaum J.B., Silva V.d., Langford J.C.: A Global Geometric
Framework for Nonlinear Dimensionality Reduction
Roweis S.T., Saul L.K.: Nonlinear Dimensionality Reduction by
Locally Linear Embedding
Balasubramanian M., Schwartz E.L., Tenenbaum J.B., Silva
V.d., Langford J.C.: The Isomap Algorithm and Topological
Stability
Silva V.d., Tenenbaum J.B.: Global versus local methods in
nonlinear dimensionality reduction
Roweis S.T., Saul L.K.: An Introduction to Locally Linear
Embedding
Saxena A., Gupta A., Mukherjee A.: Non-linear Dimensionality
Reduction by Locally Linear Isomaps