Effective and efficient image comparison plays a vital role in content-based image retrieval (CBIR). The earth mover’s distance (EMD) is an enticing measure for image comparison, offering intuitive geometric interpretation and modelling the human perceptions of similarity. Unfortunately, computing EMD, using the simplex method, has cubic complexity. FastEMD, based on min-cost flow, reduces the complexity to (O( )). Although both methods can obtain the optimal result, the high complexity prevents the application of EMD on large-scale image datasets. Thresholding the ground distance can make EMD faster and more robust, since it can decrease the impact of noise and reduce the range of transportation. In this paper, we present a new image distance metric, , which applies a threshold to the ground distance. To compute , the FastEMD approach can be employed. We also propose a novel linear approximation algorithm. Our algorithm achieves complexity with the benefit of qualified bins. Experimental results show that (1) our method is 2 to 3 orders of magnitude faster than EMD (computed by FastEMD) and 2 orders of magnitude faster than FastEMD and (2) the precision of our approximation algorithm is no less than the precision of FastEMD. 1. Introduction The development of advanced multimedia technology increases the importance of image retrieval, since large-scale image datasets are proliferating. Effective and efficient methods for computing the similarity between images is vitally important to content-based image retrieval (CBIR) systems. The earth mover’s distance (EMD) is a famous measure for image retrieval [1]. It not only takes the corresponding bins into account, but also considers the correlations between noncorresponding bins and thus is robust to histogram shift and rotation. The effectiveness of EMD and/or its variants is witnessed in many application domains, such as image retrieval [1–4], common pattern discovery [5, 6], mathematical symbol retrieval [7], contour matching [8], texture retrieval [1, 9], shape matching [10], graph matching [11], visual event recognition [12], face verification [13, 14], visual tracker [15], and uncertain or probability data [16, 17]. The EMD defines the comparison of two histograms as the transportation problem. (In [1], EMD is used to compute the distance between two signatures. For the convenience of statement, we use histogram in this paper. In fact, our algorithm can be used to both histogram and signature.) The minimum cost to transport one histogram to another one is taken as the distance of the two histograms.
References
[1]
Y. Rubner, C. Tomasi, and L. J. Guibas, “The earth mover's distance as a metric for image retrieval,” International Journal of Computer Vision, vol. 40, no. 2, pp. 99–121, 2000.
[2]
Q. Lv, M. Charikar, and K. Li, “Image similarity search with compact data structures,” in Proceedings of the 13th ACM Conference on Information and Knowledge Management (CIKM ’04), pp. 208–217, Washington, DC, USA, November 2004.
[3]
O. Pele and M. Werman, “A linear time histogram metric for improved sift matching,” in Computer Vision, D. Forsyth, P. Torr, and A. Zisserman, Eds., vol. 5304 of Lecture Notes in Computer Science, pp. 495–508, Springer, Berlin, Germany, 2008, Proceedings of the 10th European Conference on Computer Vision, 2008.
[4]
V. Ljosa, A. Bhattacharya, and K. A. Singh, “Indexing spatially sensitive distance measures using multi-resolution lower bounds,” in Advances in Database Technology, Y. Ioannidis, M. H. Scholl, J. W. Schmidt et al., Eds., vol. 3896 of Lecture Notes in Computer Science, pp. 865–883, Springer, Berlin, Germany, 2006, Proceedings of the 10th International Conference on Extending Database Technology, 2006.
[5]
H. K. Tan and C. W. Ngo, “Common pattern discovery using earth mover's distance and local flow maximization,” in Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV ’05), vol. 2, pp. 1222–1229, Beijing, China, October 2005.
[6]
H. K. Tan and C. W. Ngo, “Localized matching using earth mover's distance towards discovery of common patterns from small image samples,” Image and Vision Computing, vol. 27, no. 10, pp. 1470–1483, 2009.
[7]
S. Marinai, B. Miotti, and G. Soda, “Using earth mover's distance in the bag-of-visual-words model for mathematical symbol retrieval,” in Proceedings of the 11th International Conference on Document Analysis and Recognition (ICDAR ’11), pp. 1309–1313, Beijing, China, September 2011.
[8]
K. Grauman and T. Darrell, “Fast contour matching using approximate earth mover's distance,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR ’ 04), vol. 1, pp. I220–I227, July 2004.
[9]
S. Lazebnik, C. Schmid, and J. Ponce, “A sparse texture representation using affine-invariant regions,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR ’03), vol. 2, pp. 319–324, June 2003.
[10]
H. Ling and K. Okada, “An efficient earth mover's distance algorithm for robust histogram comparison,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 5, pp. 840–853, 2007.
[11]
M. F. Demirci, A. Shokoufandeh, S. J. Dickinson, Y. Keselman, and L. Bretzner, “Many-to-many feature matching using spherical coding of directed graphs,” in Proceedings of the 8th European Conference on Computer Vision (ECCV ’04), vol. 1, pp. 322–335, 2004.
[12]
D. Xu and S. F. Chang, “Visual event recognition in news video using kernel methods with multi-level temporal alignment,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'07), pp. 1–8, Minneapolis, Minn, USA, June 2007.
[13]
F. Wang and L. J. Guibas, “Supervised earth mover’ s distance learning and its computer vision applications,” in Computer Vision, A. Fitzgibbon, S. Lazebnik, P. Perona, Y. Sato, and C. Schmid, Eds., vol. 7572 of Lecture Notes in Computer Science, pp. 442–455, Springer, Berlin, Germany, 2012, Proceedings of the 12th European conference on Computer Vision, 2012.
[14]
D. Xu, S. Yan, and J. Luo, “Face recognition using spatially constrained earth mover's distance,” IEEE Transactions on Image Processing, vol. 17, no. 11, pp. 2256–2260, 2008.
[15]
I. Leichter, “Mean shift trackers with cross-bin metrics,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 4, pp. 695–706, 2012.
[16]
B. E. Ruttenberg and A. K. Singh, “Indexing the earth mover's distance using normal distributions,” Proceedings of the VLDB Endowment, vol. 5, no. 3, pp. 205–216, 2011.
[17]
J. Xu, Z. Zhang, A. K. H. Tung, and G. Yu, “Efficient and effective similarity search over probabilistic data based on earth mover's distance,” The VLDB Journal, vol. 21, no. 4, pp. 535–559, 2011.
[18]
F. Hillier and G. Lieberman, Introduction to Mathematical Programming, McGraw-Hill, New York, NY, USA, 2nd edition, 1977.
[19]
O. Pele and M. Werman, “Fast and robust earth mover's distances,” in Proceedings of the 12th International Conference on Computer Vision (ICCV ’09), pp. 460–467, Kyoto, Japan, October 2009.
[20]
A. Rahimi and R. Kiram, “How earth mover's distance comprares two bags,” Tech. Rep., Intel Labs Berkeley, 2007.
[21]
S. Shirdhonkar and D. W. Jacobs, “Approximate earth mover's distance in linear time,” in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’08), pp. 1–8, Anchorage, Alaska, USA, June 2008.
[22]
M.-H. Jang, S.-W. Kim, C. Faloutsos, and S. Park, “A linear-time approximation of the earth mover's distance,” in Proceedings of the 20th ACM Conference on Information and Knowledge Management (CIKM ’11), pp. 505–514, Scotland, UK, October 2011.
[23]
J. Z. Wang, J. Li, and G. Wiederhold, “Simplicity: semantics-sensitive integrated matching for picture libraries,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 9, pp. 947–963, 2001.
[24]
B. E. Flores, “A pragmatic view of accuracy measurement in forecasting,” Omega, vol. 14, no. 2, pp. 93–98, 1986.