Abstract
Deep learning has been extensively researched across many areas and has scaled up rapidly over the last decade. It has deeply permeated our daily lives through applications such as image classification, video synthesis, autonomous driving, voice recognition, and personalized recommendation systems. The main challenge for most deep learning models, including convolutional neural networks, recurrent neural networks, and recommendation models, is their large amount of computation. Fortunately, most computations in deep learning applications are parallelizable, so they can be handled effectively by throughput processors such as Graphics Processing Units (GPUs). GPUs offer high throughput, massive parallel processing performance, and high memory bandwidth, and have become the most widely adopted devices for deep learning. Indeed, many deep learning workloads, from mobile devices to data centers, are executed on GPUs. In particular, modern GPU systems provide specialized hardware modules and software stacks for deep learning workloads. In this chapter, we present a detailed analysis of the evolution of GPU architectures and the recent hardware and software support for more efficient acceleration of deep learning on GPUs. Furthermore, we introduce leading-edge research, challenges, and opportunities for running deep learning workloads on GPUs.
| Original language | English |
| --- | --- |
| Title of host publication | Hardware Accelerator Systems for Artificial Intelligence and Machine Learning |
| Editors | Shiho Kim, Ganesh Chandra Deka |
| Publisher | Academic Press Inc. |
| Pages | 167-215 |
| Number of pages | 49 |
| ISBN (Print) | 9780128231234 |
| DOIs | |
| Publication status | Published - 2021 Jan |
Publication series
| Name | Advances in Computers |
| --- | --- |
| Volume | 122 |
| ISSN (Print) | 0065-2458 |
Bibliographical note
Publisher Copyright: © 2021 Elsevier Inc.
All Science Journal Classification (ASJC) codes
- General Computer Science