Abstract
Federated learning emerged in response to growing concern about privacy and security, as people's sensitive information is increasingly exposed in the era of big data. Rather than collecting users' raw data, it aggregates model parameters from each client and thereby protects users' privacy. Nonetheless, because of its inherently distributed nature, federated learning is more vulnerable to attacks, since users may upload malicious data to compromise the federated learning server. In addition, recent studies have shown that attackers can recover information from the parameters alone. Hence, there is still considerable room to improve current federated learning frameworks. In this survey, we briefly review state-of-the-art federated learning techniques and discuss in detail how federated learning can be improved. We discuss several open issues and existing solutions, and point out future research directions for federated learning.
Electronic Supplementary Material
Supplementary material is available in the online version of this article at 10.1007/s11704-021-0598-z.
Keywords: federated learning, privacy protection, security
Acknowledgements
This work was supported by Guangdong Provincial Key Laboratory (2020B121201001).
Footnotes
Kaiyue Zhang received the BE degree from the Department of Computer Science and Engineering, Southern University of Science and Technology, China, in 2019. She is currently pursuing the PhD degree with the Faculty of Engineering and Information Technology, University of Technology Sydney, Australia, and the Department of Computer Science and Engineering, Southern University of Science and Technology. Her research interests include human mobility modeling, urban computing, and privacy-preserving mechanisms in deep learning.
Xuan Song received the PhD degree from Peking University, China, in 2010. In 2017, he was selected as an Excellent Young Researcher of Japan MEXT. He has led and participated in many important projects as principal investigator or primary actor in Japan, such as the DIAS/GRENE Grant of MEXT; the Japan/US Big Data and Disaster Project of JST; the Young Scientists Grant and Scientific Research Grant of MEXT; the Research Grant of MLIT; and grants from JR EAST Company and Hitachi Company. He has served as Associate Editor, Guest Editor, Program Chair, Area Chair, Program Committee Member, or reviewer for many well-known journals and top-tier conferences, such as IMWUT, WWW Journal, ACM TIST, IEEE TKDE, UbiComp, ICCV, CVPR, and ICRA.
Chenhan Zhang received BEng degrees in Telecommunication Engineering from the University of Wollongong, Australia, and Zhengzhou University, China, in 2017 and 2018, respectively. He received the MS degree in Engineering Management from City University of Hong Kong, China, in 2019. He is currently a PhD student at the Faculty of Engineering and Information Technology, University of Technology Sydney, Australia. His research interests include deep learning, intelligent transportation systems, and privacy preservation in AI.
Shui Yu obtained his PhD from Deakin University, Australia, in 2004. He is currently a Professor in the School of Computer Science, University of Technology Sydney, Australia. He has published three monographs, edited two books, and authored more than 400 technical papers, including papers in top journals and conferences such as IEEE TPDS, TIFS, TMC, TKDE, ToN, and INFOCOM. Dr. Yu initiated the research field of networking for big data, and his research outputs have been widely adopted by industrial systems, such as Amazon cloud security. He currently serves on a number of prestigious editorial boards, including IEEE Communications Surveys and Tutorials (Area Editor) and IEEE Communications Magazine.
Contributor Information
Xuan Song, Email: songx@sustech.edu.cn.
Shui Yu, Email: shui.yu@uts.edu.au.