Research Publications
Scalable Large-Margin Mahalanobis Distance Metric Learning

For many machine learning algorithms, such as k-Nearest Neighbor
(k-NN) classifiers and k-means clustering, success often depends
heavily on the metric used to calculate distances between
different data points. An effective solution for defining such a
metric is to learn it from a set of labeled training samples. In
this work, we propose a fast and scalable algorithm to learn a
Mahalanobis distance metric. The Mahalanobis metric can be viewed
as the Euclidean distance metric on the input data that have been
linearly transformed. By employing the principle of margin
maximization to achieve better generalization performance, the
algorithm formulates metric learning as a convex optimization
problem in which a positive semidefinite (p.s.d.) matrix is the unknown
variable. Based on an important theorem that a p.s.d. trace-one matrix
can always be represented as a convex combination of multiple
rank-one matrices, our algorithm accommodates any differentiable
loss function and solves the resulting optimization problem using a
specialized gradient descent procedure. During the course of
optimization, the proposed algorithm maintains the positive
semidefiniteness of the matrix variable, which is essential for a
Mahalanobis metric. Compared with conventional methods like standard
interior-point algorithms or the special solver used
in Large Margin Nearest Neighbor (LMNN), our algorithm
is much more efficient and scales better.
Experiments on benchmark data sets suggest that, compared with
state-of-the-art metric learning algorithms, our algorithm can
achieve comparable classification accuracy with reduced
computational complexity.

Keywords: Large-margin nearest neighbor, distance metric learning, Mahalanobis distance, semidefinite optimization
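
Two facts stated in the abstract can be checked numerically: a Mahalanobis distance equals the Euclidean distance on linearly transformed data, and a p.s.d. trace-one matrix can be written as a convex combination of rank-one matrices. The sketch below is only an illustration and is not the proposed learning algorithm. The transform `L`, the matrix `M`, and both helper functions are hypothetical names, and the rank-one decomposition is obtained here through an eigendecomposition, which is one of several possible choices.

```python
import numpy as np


def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y) for a p.s.d. matrix M."""
    d = x - y
    return float(d @ M @ d)


def rank_one_decomposition(M):
    """Return weights w_i and unit vectors u_i with M = sum_i w_i * u_i u_i^T.

    Assumption: M is symmetric p.s.d.; the decomposition used here is the
    eigendecomposition (other rank-one decompositions exist).
    """
    weights, vectors = np.linalg.eigh(M)
    weights = np.clip(weights, 0.0, None)  # remove tiny negative round-off
    return weights, vectors.T  # row i of vectors.T is the unit vector u_i


rng = np.random.default_rng(0)
L = rng.standard_normal((3, 3))  # hypothetical linear transform
M = L.T @ L                      # p.s.d. by construction
x, y = rng.standard_normal(3), rng.standard_normal(3)

# Mahalanobis distance under M equals Euclidean distance after mapping by L.
d_mahal = mahalanobis_sq(x, y, M)
d_euclid = float(np.sum((L @ x - L @ y) ** 2))

# Normalize to trace one; the weights then form a convex combination.
M1 = M / np.trace(M)
w, U = rank_one_decomposition(M1)
M_rebuilt = sum(wi * np.outer(ui, ui) for wi, ui in zip(w, U))
```

With a trace-one matrix the weights `w` are nonnegative and sum to one, and each `np.outer(ui, ui)` is itself a rank-one, trace-one p.s.d. matrix, so `M_rebuilt` is a convex combination of such matrices.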
