Computational Visual Media. 2021 Dec 6;8(2):257–272. doi: 10.1007/s41095-021-0241-9

Unsupervised random forest for affinity estimation

Yunai Yi 1, Diya Sun 1, Peixin Li 1, Tae-Kyun Kim 2, Tianmin Xu 3, Yuru Pei 1
PMCID: PMC8645415  PMID: 34900375

Abstract

This paper presents an unsupervised clustering random-forest-based metric for affinity estimation in large and high-dimensional data. The criterion used for node splitting during forest construction can handle rank-deficiency when measuring cluster compactness. The binary forest-based metric is extended to continuous metrics by exploiting both the common traversal path and the smallest shared parent node.

The proposed forest-based metric estimates affinity efficiently by passing data pairs down the forest, using only a limited number of decision trees. A pseudo-leaf-splitting (PLS) algorithm is introduced to account for spatial relationships; it regularizes the affinity measures and overcomes inconsistent leaf assignments. The random-forest-based metric with PLS facilitates the establishment of consistent point-wise correspondences. The proposed method has been applied to automatic phrase recognition from color and depth videos and to point-wise correspondence. Extensive experiments demonstrate the effectiveness of the proposed method for affinity estimation in comparison with state-of-the-art methods.
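The difference between a binary forest-based metric and a continuous one based on the common traversal path can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the trees use random axis-aligned thresholds rather than the compactness-based splitting criterion of the paper, the names `build_forest`, `binary_affinity`, and `path_affinity` are hypothetical, the normalization of the shared path length is one plausible choice, and PLS is not modeled.

```python
import random


def build_tree(points, rng, depth=0, max_depth=8, min_size=2):
    """Recursively split points with a random axis-aligned threshold.

    Assumption: random splits stand in for the paper's splitting criterion.
    A leaf is represented by None, an internal node by a 4-tuple.
    """
    if depth >= max_depth or len(points) <= min_size:
        return None
    dim = rng.randrange(len(points[0]))
    values = [p[dim] for p in points]
    lo, hi = min(values), max(values)
    if lo == hi:
        return None
    thr = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < thr]
    right = [p for p in points if p[dim] >= thr]
    if not left or not right:
        return None
    return (dim, thr,
            build_tree(left, rng, depth + 1, max_depth, min_size),
            build_tree(right, rng, depth + 1, max_depth, min_size))


def build_forest(points, n_trees=10, seed=0):
    """Build a small forest of independently randomized trees."""
    rng = random.Random(seed)
    return [build_tree(points, rng) for _ in range(n_trees)]


def path(tree, x):
    """Branch decisions (0 = left, 1 = right) taken by x from root to leaf."""
    steps = []
    node = tree
    while node is not None:
        dim, thr, left, right = node
        go_right = x[dim] >= thr
        steps.append(1 if go_right else 0)
        node = right if go_right else left
    return tuple(steps)


def binary_affinity(forest, a, b):
    """Fraction of trees in which a and b reach the same leaf."""
    return sum(path(t, a) == path(t, b) for t in forest) / len(forest)


def path_affinity(forest, a, b):
    """Average shared path length, normalized by the longer path.

    The end of the shared prefix is the deepest node that both points
    visit, i.e. their smallest shared parent node in that tree.
    """
    total = 0.0
    for t in forest:
        pa, pb = path(t, a), path(t, b)
        longest = max(len(pa), len(pb))
        shared = 0
        for sa, sb in zip(pa, pb):
            if sa != sb:
                break
            shared += 1
        total += shared / longest if longest else 1.0
    return total / len(forest)


if __name__ == "__main__":
    rng = random.Random(1)
    data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50)]
    data += [(rng.gauss(6, 1), rng.gauss(6, 1)) for _ in range(50)]
    forest = build_forest(data, n_trees=10, seed=0)
    print(binary_affinity(forest, data[0], data[1]))
    print(path_affinity(forest, data[0], data[1]))
```

In this sketch the binary measure only records whether two points end in the same leaf, so pairs that separate just above the leaves score the same as pairs that separate at the root. The path-based measure keeps that information, which is why it never falls below the binary value for the same pair.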

Keywords: affinity estimation, forest-based metric, unsupervised clustering forest, pseudo-leaf-splitting (PLS)

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61876008 and 82071172, Beijing Natural Science Foundation under Grant No. 7192227, and the Research Center of Engineering and Technology for Digital Dentistry, the Ministry of Health.

Footnotes

Yunai Yi received her B.S. degree from the University of Electronic Science and Technology of China in 2014, and her M.S. degree from Peking University in 2017. She is currently an engineer at NetEase. Her research interests include computer graphics and machine learning.

Diya Sun received her B.S. degree in 2018 from the School of Electronics Engineering and Computer Science, Peking University. She is currently a master's degree student in the Key Laboratory of Machine Perception, MOE, Peking University. Her research interests include image processing, image registration, and 3D reconstruction.

Peixin Li received his B.Sc. degree in computer science from Xi’an Jiaotong University in 2018. Currently, he is working towards his M.Sc. degree in the School of Electronics Engineering and Computer Science at Peking University. His research interests include computer vision and image processing.

Tae-Kyun Kim received his Ph.D. degree from the University of Cambridge, UK, in 2008 and was a Junior Research Fellow at Sidney Sussex College, Cambridge from 2007 to 2010. He has been a lecturer in computer vision and learning at Imperial College, London since 2010. His research interests span object recognition and tracking, face recognition and surveillance, action and gesture recognition, semantic image segmentation and reconstruction, and man-machine interfaces. He has co-authored over 40 academic papers in top-tier conferences and journals, 6 MPEG-7 standard documents, and 17 international patents. His co-authored algorithm is an international standard in MPEG-7 ISO/IEC for face retrieval.

Tianmin Xu received his B.M. degree in stomatology from Nanjing Medical University, China and his M.D. degree in orthodontics from the Health Science Center, Peking University in 1986 and 1992, respectively. From 1994 to 1996, Dr. Xu was with the School of Dentistry, University of California, San Francisco as a postdoctoral researcher. He is now a professor of medicine in the School of Stomatology, and a professor of treatment in the Stomatology Hospital, Peking University. He is the associate director of the Department of Orthodontics, and the Oral and Craniofacial Growth and Development Center. His research interests include digitized orthodontics, clinical orthodontics theory and applications, oral and craniofacial growth and development, and clinical MBT techniques.

Yuru Pei received her B.S. degree from Central South University in 2000, her M.S. degree from Zhejiang University in 2003, and her Ph.D. degree from Peking University in 2006. She is now an associate professor in the Department of Machine Intelligence, Peking University. She was a visiting professor at Queen Mary, University of London, and at Imperial College, London, in 2011–2012. Her research interests include image processing and computer vision.

Contributor Information

Yunai Yi, Email: yiyunai521@126.com.

Diya Sun, Email: dysun@pku.edu.cn.

Peixin Li, Email: lipeixin@pku.edu.cn.

Tae-Kyun Kim, Email: tk.kim@imperial.ac.uk.

Tianmin Xu, Email: tmxuortho@163.com.

Yuru Pei, Email: yrpei@pku.edu.cn.

Articles from Computational Visual Media are provided here courtesy of Nature Publishing Group
