What is a distance metric learning algorithm?

A distance metric learning (DML) algorithm is an algorithm that learns a similarity measure or distance from the data. This distance can be used for many purposes, such as improving distance-based algorithms in supervised, semi-supervised or unsupervised learning. DML algorithms also have interesting applications in dimensionality reduction.

How to learn a distance

The (pseudo-)distances learned by distance metric learning algorithms are also known as Mahalanobis distances. These distances are determined by positive semidefinite matrices \(M \in \mathcal{M}_d(\mathbb{R})\), and can be calculated as \[ d(x,y) = \sqrt{(x-y)^TM(x-y)}, \] for \(x, y \in \mathbb{R}^d\). It is known that every PSD matrix \(M\) can be decomposed as \(M = L^TL\), where \(L \in \mathcal{M}_d(\mathbb{R})\) is a (non-unique) matrix. In this case, we have \[ d(x,y)^2 = (x-y)^TL^TL(x-y) = (L(x-y))^T(L(x-y)) = \|L(x-y)\|_2^2. \] So every Mahalanobis distance is equivalent to the Euclidean distance after applying the linear mapping \(L\).
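As a quick numerical check of this equivalence, here is a minimal sketch in Python with NumPy (chosen purely for illustration; rDML itself is used from R). It builds a PSD matrix \(M = L^TL\) from an arbitrary matrix \(L\) and verifies that the two formulas above produce the same value:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3

# An arbitrary linear map L and the PSD metric matrix it induces, M = L^T L.
L = rng.normal(size=(d, d))
M = L.T @ L

x, y = rng.normal(size=d), rng.normal(size=d)
diff = x - y

# Mahalanobis distance computed directly from the metric matrix M...
dist_M = np.sqrt(diff @ M @ diff)

# ...and as the Euclidean norm of the mapped difference L(x - y).
dist_L = np.linalg.norm(L @ diff)

print(dist_M, dist_L)            # both values coincide
assert np.isclose(dist_M, dist_L)
```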

The matrices \(M\) and \(L\) define the two approaches to learning a distance. We can either learn the metric matrix \(M\) that defines the distance directly, or learn the linear map \(L\) and compute ordinary Euclidean distances in the mapped space. Each DML algorithm follows one of these two approaches; the sketch below illustrates the second one.
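The following sketch (Python with NumPy and scikit-learn, again only for illustration) shows the second approach in use. The matrix L here is a hand-picked placeholder standing in for whatever a DML algorithm would learn from the data; the point is that, once L is available, any distance-based method can simply work on the transformed data:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Toy data: two classes in R^2. L is a placeholder for a learned linear map;
# a real DML algorithm would fit it from the data instead.
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
L = np.array([[2.0, 0.0],
              [0.5, 1.0]])

# Second approach: map the data with L, then use plain Euclidean k-NN.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X @ L.T, y)

x_new = np.array([[0.3, -0.2]])
print(knn.predict(x_new @ L.T))  # prediction under the learned metric
```

For comparison, the first approach would plug \(M\) into the distance computation itself; for instance, scikit-learn's nearest-neighbour estimators accept metric="mahalanobis" with metric_params={"VI": M} to the same effect.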

Additional functionalities

Examples

Get started with the following examples

See also

The pyDML library, which provides the underlying DML implementations used by rDML, and its documentation.
