Abstract:

With the rise of “big data”, finding computationally efficient and privacy-preserving solutions for large-scale machine learning (ML) problems has gained paramount importance, especially for medical data, which modern healthcare systems collect in huge volumes. Because this data resides in different locations and is owned by different entities, the ethical, legal, economic, and technical challenges of accessing sufficient data while preserving privacy preclude medical data from being fully exploited by ML. To counter these challenges, we propose a novel blockchain-based Federated Learning architecture for healthcare consortia, which addresses these problems while highlighting the challenges and considerations that still need to be addressed. We suggest a multi-modular system comprising three main modules: a decentralized medical history module, a differentially private institutional analytics module, and a Federated Learning based patient prognosis module. We conduct extensive experiments using Logistic Regression and TabNet and achieve an accuracy of 83.82% under IID settings with a client fraction of 10%. Further, we show that TabNet outperforms Logistic Regression when the models are shown less data.