Abstract
Distributed learning plays a key role in reducing the training time of modern deep neural networks with massive datasets. In this article, we consider a distributed learning problem where gradient computation is carried out over a number of computing devices at the wireless edge. We propose hierarchical broadcast coding, a provable coding-theoretic framework to speed up distributed learning at the wireless edge. Our contributions are threefold. First, motivated by the hierarchical nature of real-world edge computing systems, we propose a layered code that mitigates the effects not only of packet losses at the wireless computing nodes but also of straggling access points (APs) or small base stations. Second, by strategically allocating data partitions to nodes in the overlapping areas between cells, our technique achieves the fundamental lower bound on the computational load required to combat stragglers. Finally, we exploit the broadcast nature of wireless networks, by which wireless devices in overlapping cell coverage areas broadcast to more than one AP; this further reduces the overall training time in the presence of straggling APs. Experimental results on Amazon EC2 confirm the advantage of the proposed methods in speeding up learning. Our design applies to any gradient-descent-based learning algorithm, including linear/logistic regression and deep learning.
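The straggler-tolerance claim above rests on gradient coding: to survive any s straggling workers out of n, each worker must process s + 1 data partitions, which is the fundamental lower bound on computational load the abstract refers to. The following is a minimal single-tier sketch of that idea using a fractional-repetition assignment (Tandon et al.) as a stand-in; it is not the paper's hierarchical broadcast code, and all names (`run_round`, `lsq_gradient`, the squared-loss objective) are illustrative assumptions.

```python
# Illustrative gradient coding sketch: n workers, up to s stragglers tolerated.
# Each worker processes s + 1 of the n data partitions (the computational-load
# lower bound) and sends one coded message (the sum of its partial gradients).
import numpy as np

def lsq_gradient(Xp, yp, w):
    """Gradient of the squared loss on one data partition (example objective)."""
    return Xp.T @ (Xp @ w - yp)

def run_round(X, y, w, n_workers, s, rng):
    """One coded gradient round that recovers the full gradient despite s stragglers."""
    assert n_workers % (s + 1) == 0
    n_groups = n_workers // (s + 1)
    # Split the data into n_workers partitions.  Every worker in group g
    # processes the same s + 1 partitions and sends the sum of their gradients.
    X_parts = np.array_split(X, n_workers)
    y_parts = np.array_split(y, n_workers)

    def coded_msg(group):
        parts = range(group * (s + 1), (group + 1) * (s + 1))
        return sum(lsq_gradient(X_parts[p], y_parts[p], w) for p in parts)

    worker_msgs = {wid: coded_msg(wid // (s + 1)) for wid in range(n_workers)}

    # Simulate s random stragglers whose messages never arrive.
    stragglers = set(rng.choice(n_workers, size=s, replace=False))
    received = {wid: g for wid, g in worker_msgs.items() if wid not in stragglers}

    # Decode: each group of s + 1 workers has at least one survivor, so pick
    # one message per group and sum them to obtain the exact full gradient.
    grad = np.zeros_like(w)
    for g in range(n_groups):
        wid = next(w_id for w_id in received if w_id // (s + 1) == g)
        grad += received[wid]
    return grad

# Usage: the full gradient is recovered exactly with s = 3 stragglers out of 12 workers.
rng = np.random.default_rng(0)
X, y, w = rng.normal(size=(120, 5)), rng.normal(size=120), np.zeros(5)
g_coded = run_round(X, y, w, n_workers=12, s=3, rng=rng)
g_exact = X.T @ (X @ w - y)
assert np.allclose(g_coded, g_exact)
```

The paper's scheme layers a second code across APs on top of this per-cell structure and reuses broadcasts from devices in overlapping coverage; the sketch only conveys the single-layer load/straggler trade-off.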
| Original language | English |
| --- | --- |
| Article number | 9285212 |
| Pages (from-to) | 2266-2281 |
| Number of pages | 16 |
| Journal | IEEE Transactions on Wireless Communications |
| Volume | 20 |
| Issue number | 4 |
| DOIs | |
| Publication status | Published - 2021 Apr |
Bibliographical note
Publisher Copyright: © 2002-2012 IEEE.
All Science Journal Classification (ASJC) codes
- Computer Science Applications
- Electrical and Electronic Engineering
- Applied Mathematics