Balanced Column-Wise Block Pruning for Maximizing GPU Parallelism

Cheonjun Park, Mincheol Park, Hyun Jae Oh, Minkyu Kim, Myung Kuk Yoon, Suhyun Kim, Won Woo Ro

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

Pruning is an effective way to reduce the number of computations and the memory requirement in deep learning. The choice of pruning unit plays an important role in exploiting GPU resources efficiently. The filter has been proposed as a simple pruning unit for structured pruning. However, because the filter is quite large as a pruning unit, the accuracy drop is considerable at high pruning ratios. GPUs rearrange the weight and input tensors into tiles (blocks) for efficient computation. To fully utilize GPU resources, this tile structure should be taken into account, which is the goal of block pruning. However, previous block pruning methods prune both row vectors and column vectors. Pruning row vectors in a tile corresponds to filter pruning, and it also interferes with column-wise block pruning of the following layer. In contrast, column vectors are much smaller than row vectors, so pruning them can achieve a lower accuracy drop. Additionally, if the pruning ratio differs from tile to tile, GPU utilization can be limited by the imbalanced workloads that irregular-sized blocks cause. Applying the same pruning ratio to the weight tiles processed in parallel lets the actual inference process fully utilize the resources without idle time. This paper proposes balanced column-wise block pruning, named BCBP, to satisfy two conditions: a column-wise, minimal-size pruning unit and balanced workloads. We demonstrate that BCBP is superior to previous pruning methods through comprehensive experiments.
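The two conditions named in the abstract (column vectors as the pruning unit, and an identical pruning ratio in every tile) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `balanced_column_block_prune`, the `tile_rows` parameter, the treatment of a tile as a horizontal strip of the weight matrix, and the use of the L2 norm as the importance score are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def balanced_column_block_prune(weight, tile_rows, prune_ratio):
    """Zero the weakest column vectors inside each row-tile of `weight`.

    weight      : 2-D array (assumed layout: filters x flattened inputs)
    tile_rows   : hypothetical tile height; must divide weight.shape[0]
    prune_ratio : fraction of column vectors removed in EVERY tile
    """
    rows, cols = weight.shape
    if rows % tile_rows != 0:
        raise ValueError("tile_rows must divide the number of rows")

    # Same number of pruned columns per tile, so all tiles carry equal work.
    n_prune = int(cols * prune_ratio)
    mask = np.ones_like(weight, dtype=bool)

    for start in range(0, rows, tile_rows):
        tile = weight[start:start + tile_rows]
        # One importance score per column vector (L2 norm is an assumption).
        norms = np.linalg.norm(tile, axis=0)
        drop = np.argsort(norms, kind="stable")[:n_prune]
        mask[start:start + tile_rows, drop] = False

    return weight * mask, mask

# Usage: an 8 x 16 weight matrix, tiles of 4 rows, half the columns pruned.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 16))
pruned_w, keep_mask = balanced_column_block_prune(w, tile_rows=4, prune_ratio=0.5)
```

Each tile may drop a different set of columns, but every tile keeps the same number of them, which is the balance property the abstract argues avoids idle time among tiles processed in parallel.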

Original language: English
Title of host publication: AAAI-23 Technical Tracks 8
Editors: Brian Williams, Yiling Chen, Jennifer Neville
Publisher: AAAI Press
Pages: 9398-9407
Number of pages: 10
ISBN (Electronic): 9781577358800
Publication status: Published - 2023 Jun 27
Event: 37th AAAI Conference on Artificial Intelligence, AAAI 2023 - Washington, United States
Duration: 2023 Feb 7 to 2023 Feb 14

Publication series

Name: Proceedings of the 37th AAAI Conference on Artificial Intelligence, AAAI 2023
Volume: 37

Conference

Conference: 37th AAAI Conference on Artificial Intelligence, AAAI 2023
Country/Territory: United States
City: Washington
Period: 23/2/7 to 23/2/14

Bibliographical note

Publisher Copyright:
Copyright © 2023, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
