
Local Tangent Space Discriminant Analysis


Abstract

We propose a novel supervised dimensionality reduction method named local tangent space discriminant analysis (TSD) which is capable of utilizing the geometrical information from tangent spaces. The proposed method aims to seek an embedding space where the local manifold structure of the data belonging to the same class is preserved as much as possible, and the marginal data points with different class labels are better separated. Moreover, TSD has an analytic form of the solution and can be naturally extended to non-linear dimensionality reduction through the kernel trick. Experimental results on multiple real-world data sets demonstrate the effectiveness of the proposed method.


Notes

  1. This is because only \(k_1+1\) examples are available as the inputs of the local PCA (a small illustrative sketch follows these notes).

  2. This protein sequence data set is available at http://www.ebi.ac.uk/uniprot.
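For context on note 1: in methods of this family, the tangent space at a point is usually estimated by local PCA on the point together with its \(k_1\) nearest same-class neighbors, i.e. on \(k_1+1\) examples. The following sketch only illustrates that standard construction, assuming NumPy; the helper name and the SVD-based implementation are ours, not the paper's.

```python
import numpy as np

def local_tangent_basis(X, i, neighbor_idx, d):
    """Estimate a d-dimensional tangent basis at X[:, i] by local PCA.

    X            : (D, n) data matrix, one sample per column
    neighbor_idx : indices of the k_1 nearest same-class neighbors of sample i
    d            : assumed intrinsic (tangent space) dimension
    """
    # The k_1 + 1 local inputs: the point itself plus its neighbors (cf. note 1)
    local = X[:, [i] + list(neighbor_idx)]
    centered = local - local.mean(axis=1, keepdims=True)
    # The leading d left singular vectors span the estimated tangent space
    U, _, _ = np.linalg.svd(centered, full_matrices=False)
    return U[:, :d]          # a D x d orthonormal basis for T_{x_i}
```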


Acknowledgments

This work is supported by the National Natural Science Foundation of China under Projects 61370175 and 61075005, and Shanghai Knowledge Service Platform Project (No.ZF1213).

Author information


Corresponding author

Correspondence to Shiliang Sun.

Appendix 1: Detailed Derivation of S

To determine S, we decompose (10) into three additive terms as follows:

$$\begin{aligned} \varvec{f}^{\top } S \varvec{f}= & {} \underbrace{\sum _{i,j=1}^n W_{ij}^w((\varvec{x}_i - \varvec{x}_j)^{\top } \varvec{t})^2}_{\text{ term } \text{ one }} + \\&\underbrace{\sum _{i,j=1}^n W_{ij}^w \left( \varvec{w}_{\varvec{x}_j}^{\top } T_{\varvec{x}_j}^{\top } (\varvec{x}_i-\varvec{x}_j)\right) ^2}_{\text{ term } \text{ two }} + \\&\underbrace{\sum _{i,j=1}^n W_{ij}^w\left[ -2((\varvec{x}_i - \varvec{x}_j)^{\top } \varvec{t}) \varvec{w}_{\varvec{x}_j}^{\top } T_{\varvec{x}_j}^{\top } (\varvec{x}_i-\varvec{x}_j)\right] }_{\text{ term } \text{ three }}, \end{aligned}$$

and then examine their separate contributions to the whole S.

Term One

$$\begin{aligned}&\sum _{i,j=1}^n W_{ij}^w\left( (\varvec{x}_i - \varvec{x}_j)^{\top } \varvec{t}\right) ^2 = 2\varvec{t}^{\top } {X} (D^w-W^w) {X}^{\top } \varvec{t} = 2\varvec{t}^{\top } {X} L^w {X}^{\top } \varvec{t}, \end{aligned}$$

where \(D^w\) is a diagonal weight matrix with \(D_{ii}^w = \sum _{j=1}^n W_{ij}^w\), and \(L^w = D^w-W^w\) is the Laplacian matrix. We therefore have \(S_1 = 2 (D^w-W^w) = 2L^w\), and term one contributes \(X S_1 X^{\top }\) to (14).
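As a quick numerical check of this identity (not part of the paper; the variable names and the use of NumPy are our own), the sketch below verifies that the double sum over the within-class weights equals \(2\varvec{t}^{\top } X L^w X^{\top } \varvec{t}\) for a random symmetric weight matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 4                                      # n samples in R^m
X = rng.normal(size=(m, n))                      # data matrix, one column per sample
t = rng.normal(size=m)                           # projection direction
Ww = rng.random((n, n)); Ww = (Ww + Ww.T) / 2    # symmetric within-class weights W^w

# Term one computed directly from its definition
term_one = sum(Ww[i, j] * ((X[:, i] - X[:, j]) @ t) ** 2
               for i in range(n) for j in range(n))

# Laplacian form: 2 t^T X L^w X^T t with L^w = D^w - W^w
Dw = np.diag(Ww.sum(axis=1))
Lw = Dw - Ww
assert np.isclose(term_one, 2 * t @ X @ Lw @ X.T @ t)
```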

Term Two. Define \(B_{ji}=T_{\varvec{x}_j}^{\top } (\varvec{x}_i-\varvec{x}_j)\); then

$$\begin{aligned}&\sum _{i,j=1}^n W_{ij}^w \left( \varvec{w}_{\varvec{x}_j}^{\top } T_{\varvec{x}_j}^{\top }(\varvec{x}_i-\varvec{x}_j)\right) ^2 \\&\quad = \sum _{i,j=1}^n W_{ij}^w \left( \varvec{w}_{\varvec{x}_j}^{\top } B_{ji}\right) ^2 \\&\quad = \sum _{i,j=1}^n W_{ij}^w \varvec{w}_{\varvec{x}_j}^{\top } B_{ji}B_{ji}^{\top } \varvec{w}_{\varvec{x}_j}\\&\quad = \sum _{j=1}^n \varvec{w}_{\varvec{x}_j}^{\top } \left( \sum _{i=1}^n W_{ij}^w B_{ji}B_{ji}^{\top }\right) \varvec{w}_{\varvec{x}_j} = \sum _{i=1}^n \varvec{w}_{\varvec{x}_i}^{\top } H_i \varvec{w}_{\varvec{x}_i} , \end{aligned}$$

where we have defined matrices \(\{H_j\}_{j=1}^n\) with \(H_j=\sum _{i=1}^n W_{ij}^w B_{ji} B_{ji}^{\top }\).

Now define a block diagonal matrix \(S_3\) of size \(mn \times mn\) with block size \(m \times m\), and set its \((i,i)\)-th block \((i=1,\ldots ,n)\) to \(H_i\). The resulting \(S_3\) is the contribution of term two to S in (14).
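A minimal sketch of this construction (again our own illustration: the tangent bases, coefficient vectors, and the tangent dimension written here as d, which the paper's block size denotes by m, are all hypothetical) checks that term two equals the block-diagonal quadratic form \(\varvec{w}^{\top } S_3 \varvec{w}\):

```python
import numpy as np

rng = np.random.default_rng(1)
n, D, d = 5, 6, 2                                # n samples, ambient dim D, tangent dim d
X = rng.normal(size=(D, n))
Ww = rng.random((n, n)); Ww = (Ww + Ww.T) / 2    # symmetric within-class weights W^w
T = [np.linalg.qr(rng.normal(size=(D, d)))[0] for _ in range(n)]   # tangent bases T_{x_j}
w = [rng.normal(size=d) for _ in range(n)]                         # coefficient vectors w_{x_j}

B = {(j, i): T[j].T @ (X[:, i] - X[:, j]) for i in range(n) for j in range(n)}

# Term two computed directly from its definition
term_two = sum(Ww[i, j] * (w[j] @ B[(j, i)]) ** 2
               for i in range(n) for j in range(n))

# Block-diagonal form: put H_j = sum_i W_ij B_ji B_ji^T on the diagonal of S3
H = [sum(Ww[i, j] * np.outer(B[(j, i)], B[(j, i)]) for i in range(n)) for j in range(n)]
S3 = np.zeros((n * d, n * d))
for i in range(n):
    S3[i * d:(i + 1) * d, i * d:(i + 1) * d] = H[i]

w_stacked = np.concatenate(w)
assert np.isclose(term_two, w_stacked @ S3 @ w_stacked)
```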

Term Three. Define vectors \(\{F_j\}_{j=1}^n\) with \(F_j=\sum _{i=1}^n W_{ij}^w B_{ji}\); then term three can be rewritten as:

$$\begin{aligned}&\sum _{i,j=1}^n W_{ij}^w\left[ -2 \left( (\varvec{x}_i - \varvec{x}_j)^{\top } \varvec{t}\right) \varvec{w}_{\varvec{x}_j}^{\top } T_{\varvec{x}_j}^{\top } (\varvec{x}_i-\varvec{x}_j)\right] \\&\quad = \sum _{i,j=1}^n 2 W_{ij}^w\left[ \left( (\varvec{x}_j - \varvec{x}_i)^{\top } \varvec{t}\right) \varvec{w}_{\varvec{x}_j}^{\top } B_{ji}\right] \\&\quad = \sum _{i,j=1}^n W_{ij}^w\left( - \varvec{t}^{\top } \varvec{x}_i B_{ji}^{\top } \varvec{w}_{\varvec{x}_j}\right) + \sum _{i=1}^n \varvec{t}^{\top } \varvec{x}_i F_i^{\top } \varvec{w}_{\varvec{x}_i} + \\&\quad \sum _{i,j=1}^n W_{ij}^w\left( -\varvec{w}_{\varvec{x}_j}^{\top } B_{ji} \varvec{x}_i^{\top } \varvec{t}\right) + \sum _{i=1}^n \varvec{w}_{\varvec{x}_i}^{\top } F_i \varvec{x}_i^{\top } \varvec{t}. \end{aligned}$$

From this expression, we can obtain the formulation of \(S_2\); the \(S_2^{\top }\) appearing in (14) then follows immediately as its transpose.

Define two block matrices \(S_2^1\) and \(S_2^2\), each of size \(n\times mn\) with block size \(1\times m\), where \(S_2^2\) is block diagonal. Set the \((i,j)\)-th block \((i,j=1,\ldots ,n)\) of \(S_2^1\) to \(-W_{ij}^w B_{ji}^{\top }\), and the \((i,i)\)-th block \((i=1,\ldots ,n)\) of \(S_2^2\) to \(F_{i}^{\top }\). Term three can then be rewritten as \( \varvec{t}^{\top } X (S_2^1 + S_2^2) \varvec{w} + \varvec{w}^{\top } (S_2^1 + S_2^2)^{\top } X^{\top } \varvec{t}\), from which it is clear that \(S_2=S_2^1 + S_2^2\).
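Analogously, the sketch below (same hypothetical setup as above, with tangent dimension d standing in for the paper's m) assembles \(S_2^1\), \(S_2^2\), and \(S_2 = S_2^1 + S_2^2\) and verifies the rewritten form of term three numerically:

```python
import numpy as np

rng = np.random.default_rng(2)
n, D, d = 5, 6, 2
X = rng.normal(size=(D, n))
t = rng.normal(size=D)
Ww = rng.random((n, n)); Ww = (Ww + Ww.T) / 2
T = [np.linalg.qr(rng.normal(size=(D, d)))[0] for _ in range(n)]
w = [rng.normal(size=d) for _ in range(n)]
B = {(j, i): T[j].T @ (X[:, i] - X[:, j]) for i in range(n) for j in range(n)}
F = [sum(Ww[i, j] * B[(j, i)] for i in range(n)) for j in range(n)]   # F_j = sum_i W_ij B_ji

# Term three computed directly from its definition
term_three = sum(-2 * Ww[i, j] * ((X[:, i] - X[:, j]) @ t) * (w[j] @ B[(j, i)])
                 for i in range(n) for j in range(n))

# S2^1 with (i, j)-th 1 x d block -W_ij B_ji^T; block-diagonal S2^2 with (i, i)-th block F_i^T
S21 = np.zeros((n, n * d))
S22 = np.zeros((n, n * d))
for i in range(n):
    for j in range(n):
        S21[i, j * d:(j + 1) * d] = -Ww[i, j] * B[(j, i)]
    S22[i, i * d:(i + 1) * d] = F[i]
S2 = S21 + S22

w_stacked = np.concatenate(w)
matrix_form = t @ X @ S2 @ w_stacked + w_stacked @ S2.T @ X.T @ t
assert np.isclose(term_three, matrix_form)
```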


Cite this article

Zhou, Y., Sun, S. Local Tangent Space Discriminant Analysis. Neural Process Lett 43, 727–744 (2016). https://doi.org/10.1007/s11063-015-9443-4
