Automatic Alignment of Local Representations
This post summarizes
Automatic Alignment of Local Representations (2003)
Yee Whye Teh, Sam Roweis
NIPS 2003.
The paper proposes a generic method for combining the coordinates produced by many local dimensionality reducers into a single global coordinate system, essentially “aligning” the local coordinates into a global one. The motivation is the inability of linear methods (e.g., PCA) to capture the structure of curved manifolds. A common way to deal with a curved manifold is to use a mixture of local dimensionality reducers, analogous to approximating a curve with a piecewise linear function. The problem with this approach is that there is no globally coherent coordinate system, because each local region has its own.
Proposal
Assume that we are given a set of points $X = [x_1, \dots, x_N]^\top$ in $D$-dimensional space, sampled from a $d \ll D$ dimensional manifold. Let $Y = [y_1, \dots, y_N]^\top$ be the reduced images of $X$ in the $d$-dimensional embedding space. Suppose that we have already trained $K$ local dimensionality reducers, the $k$th of which produces a representation $z_{nk} \in \mathbb{R}^{d_k}$ of point $x_n$. Associated with each output $z_{nk}$ is a responsibility $r_{nk} \ge 0$ describing the reliability of the $k$th reducer's representation of $x_n$. The algorithm aligns $\{z_{nk}\}_{n,k}$ and $\{r_{nk}\}_{n,k}$ to produce $Y$.
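For concreteness, below is one simple way to produce such local coordinates and responsibilities: a k-means partition followed by a per-cluster PCA, with soft Gaussian responsibilities. This is my own illustrative choice, not necessarily the paper's setup; the function name `local_reducers` and the `bandwidth` parameter are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def local_reducers(X, K=8, d_local=2, bandwidth=1.0):
    """Illustrative local reducers: k-means clusters + per-cluster PCA.

    Returns
    -------
    z_list : list of K arrays, z_list[k] of shape (N, d_local) holding the
             local coordinates z_nk of every point under the k-th reducer.
    r      : array of shape (N, K) of responsibilities r_nk (rows sum to 1).
    """
    km = KMeans(n_clusters=K, n_init=10).fit(X)
    centers = km.cluster_centers_
    z_list = []
    for k in range(K):
        # Local PCA fitted on the points of cluster k (assumes each cluster
        # has more than d_local points), then applied to all points.
        pca = PCA(n_components=d_local).fit(X[km.labels_ == k])
        z_list.append(pca.transform(X))
    # Soft responsibilities from squared distance to each cluster centre
    # (choose bandwidth on the scale of the data to avoid all-zero rows).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    r = np.exp(-d2 / (2.0 * bandwidth ** 2))
    r /= r.sum(axis=1, keepdims=True)
    return z_list, r
```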
Assume that the global coordinate is a linear function of $z_{nk}$. The paper proposes to obtain $Y$ from
$$y_n = \sum_k r_{nk}\left(L_k z_{nk} + l_{0k}\right) = \sum_k \sum_{i=0}^{d_k} r_{nk}\, z_{nk}^{i}\, l_{ik} = \sum_j u_{nj}\, l_j,$$

where $L_k = (l_{1k} \mid \cdots \mid l_{d_k k})$ is the linear projection associated with the $k$th reducer and $l_{0k}$ is an offset. If we set $z_{nk}^{0} := 1$, let $u_{nj} = r_{nk} z_{nk}^{i}$, and vectorize the index pairs $(i,k)$ into $j$, we have $Y = UL$, a linear system with $U$ fixed and $L$ unknown. To determine $L$, the authors propose a cost function $E$ based on the locally linear embedding (LLE) algorithm:

$$E(Y, W) = \sum_n \Big\| y_n - \sum_{m \in N_n} w_{nm} y_m \Big\|^2 = \operatorname{tr}\!\left(Y^\top (I - W)^\top (I - W)\, Y\right),$$

where $N_n$ is the set of nearest neighbours of $x_n$ and $\sum_{m \in N_n} w_{nm} = 1$.
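Before turning to how $W$ and $L$ are determined, here is a minimal NumPy sketch of how the fixed matrix $U$ can be assembled from the local coordinates and responsibilities; the helper name `build_U` is mine.

```python
import numpy as np

def build_U(z_list, r):
    """Assemble the fixed matrix U so that Y = U L.

    z_list : list of K arrays, z_list[k] of shape (N, d_k) holding z_nk.
    r      : array of shape (N, K) of responsibilities r_nk.

    Each column of U corresponds to one vectorized index pair (i, k),
    where i = 0 is the constant coordinate z^0_nk := 1 used for the offset.
    """
    blocks = []
    for k, z_k in enumerate(z_list):
        z_aug = np.hstack([np.ones((z_k.shape[0], 1)), z_k])  # prepend z^0_nk = 1
        blocks.append(r[:, [k]] * z_aug)                       # u_nj = r_nk * z^i_nk
    return np.hstack(blocks)                                   # shape (N, sum_k (d_k + 1))
```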
The local linear reconstruction weights $W$ are obtained by solving $\min_W E(X, W)$ (the same cost with $x_n$ in place of $y_n$) subject to the normalization constraint. With $W$ fixed, $Y$ is found by minimizing

$$E(Y, W) = \operatorname{tr}\!\left[L^\top U^\top (I - W)^\top (I - W)\, U L\right],$$

subject to the two constraints $\frac{1}{N}\mathbf{1}^\top Y = 0$ and $\frac{1}{N} Y^\top Y = I_d$, which break degeneracies. Since $Y = UL$, solving for $Y$ amounts to solving for $L$. The objective together with the two constraints forms a generalized eigenvalue problem whose solution is given by the 2nd through $(d+1)$st smallest generalized eigenvectors. The alignment problem therefore has no local optima.
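Putting the pieces together, the sketch below first solves for the reconstruction weights $W$ on $X$ and then solves the generalized eigenvalue problem for $L$, keeping the 2nd through $(d+1)$st smallest eigenvectors. The function names `lle_weights` and `align` are mine, and hyperparameters such as the neighbourhood size and the regularizer are illustrative, not taken from the paper.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.spatial import cKDTree

def lle_weights(X, n_neighbors=10, reg=1e-3):
    """LLE reconstruction weights: min_W E(X, W) with rows of W summing to 1."""
    N = X.shape[0]
    _, idx = cKDTree(X).query(X, k=n_neighbors + 1)        # column 0 is the point itself
    W = np.zeros((N, N))
    for n in range(N):
        nbrs = idx[n, 1:]
        G = X[nbrs] - X[n]                                  # local difference vectors
        C = G @ G.T
        C += reg * np.trace(C) * np.eye(n_neighbors)        # regularize for stability
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[n, nbrs] = w / w.sum()                            # enforce sum_m w_nm = 1
    return W

def align(U, W, d):
    """Minimize tr(L^T U^T (I-W)^T (I-W) U L) under the two constraints; return Y = U L."""
    N = U.shape[0]
    M = np.eye(N) - W
    A = U.T @ M.T @ M @ U    # quadratic form of the objective
    B = U.T @ U / N          # from the constraint (1/N) Y^T Y = I_d (assumes U has full column rank)
    vals, vecs = eigh(A, B)  # generalized symmetric eigenproblem, eigenvalues ascending
    L = vecs[:, 1:d + 1]     # discard the constant solution; keep the 2nd..(d+1)st eigenvectors
    return U @ L
```

With the earlier `local_reducers` and `build_U` sketches, the whole pipeline would read `Y = align(build_U(*local_reducers(X)), lle_weights(X), d)`.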