Context Swap: Multi-PIM System Preventing Remote Memory Access for Large Embedding Model Acceleration

Hongju Kal, Cheolhwan Kim, Minjae Kim, Won Woo Ro

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Processing-in-Memory (PIM) is an attractive solution for accelerating memory-intensive neural network layers. In particular, PIM is efficient for layers that use embeddings, such as embedding layers and graph convolution layers, because of their large capacity and low arithmetic intensity. The embedding tables of such layers are stored across multiple memory nodes and processed by local PIM modules with sparse access patterns. To compute on data held by other memory nodes, a naive approach lets the local PIM retrieve the data from remote memory nodes. This approach incurs significant performance degradation due to the long latency of remote accesses. To avoid remote access, a PIM system can adopt a framework based on the MapReduce programming model, in which PIMs compute only on local data and CPUs combine the intermediate results produced by the PIMs. However, such a multi-PIM system still suffers from performance degradation because the framework runs on the CPU and incurs a long delay compared to PIM kernel execution. Therefore, we propose a context swap technique that prevents remote data access without relying on a high-latency framework. We observe that transferring PIM contexts to a remote PIM node generates far less data traffic than remotely accessing the data itself. In our PIM system, PIM nodes swap their context data with each other once they complete their own computation and have no local data left to process. Several context swaps occur until every PIM has processed all of its local data. A context swap is performed by the memory controller between PIMs in the same CPU socket and by simple software between PIMs in different CPU sockets. As a result, the proposed multi-PIM system outperforms a baseline PIM system that transfers remote data and a PIM system with the kernel-managing framework by 4.1× and 3.3×, respectively.
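The core mechanism described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names, the ring-rotation swap schedule, and the unit "row read" stand-in are assumptions made for illustration. The idea it shows is that a "context" (the lookup indices still to gather plus a running partial sum) rotates among PIM nodes, so each node only ever reads rows from its own local shard and no remote memory accesses are issued.

```python
# Conceptual sketch of context swapping among PIM nodes (illustrative only).
from dataclasses import dataclass

NUM_NODES = 4
ROWS_PER_NODE = 1024  # embedding rows held in each memory node's local shard


@dataclass
class Context:
    """Per-lookup state that travels between PIM nodes:
    the row indices still to gather and the running partial sum."""
    rows: list
    partial_sum: float = 0.0


@dataclass
class PIMNode:
    """One PIM module attached to a memory node storing a contiguous shard."""
    node_id: int

    def owns(self, row: int) -> bool:
        return row // ROWS_PER_NODE == self.node_id

    def process_local(self, ctx: Context) -> None:
        # Accumulate every requested row stored in the local shard; remote rows
        # stay in the context and wait for a context swap instead of a remote access.
        remaining = []
        for row in ctx.rows:
            if self.owns(row):
                ctx.partial_sum += 1.0  # stand-in for reading and accumulating the row
            else:
                remaining.append(row)
        ctx.rows = remaining


def run_context_swap(nodes, contexts):
    """Rotate contexts among PIM nodes until every row has been processed by
    the node that actually stores it (at most len(nodes) rounds)."""
    for _ in range(len(nodes)):
        for node, ctx in zip(nodes, contexts):
            node.process_local(ctx)
        if all(not ctx.rows for ctx in contexts):
            break
        # Context swap: hand each context to the next node in a ring.
        contexts = contexts[-1:] + contexts[:-1]


if __name__ == "__main__":
    nodes = [PIMNode(i) for i in range(NUM_NODES)]
    # Each node starts with one lookup whose rows are scattered across all shards.
    contexts = [
        Context(rows=[5, 1500, 2300, 3900]),
        Context(rows=[10, 1100]),
        Context(rows=[2050]),
        Context(rows=[3100, 700]),
    ]
    run_context_swap(nodes, contexts)
    print([c.partial_sum for c in contexts])  # -> [4.0, 2.0, 1.0, 2.0]
```

In this sketch the swap is a simple list rotation; per the abstract, the real system performs the swap through the memory controller within a CPU socket and through lightweight software across sockets.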

Original language: English
Title of host publication: AICAS 2023 - IEEE International Conference on Artificial Intelligence Circuits and Systems, Proceeding
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350332674
DOIs
Publication status: Published - 2023
Event: 5th IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2023 - Hangzhou, China
Duration: 2023 Jun 11 – 2023 Jun 13

Publication series

Name: AICAS 2023 - IEEE International Conference on Artificial Intelligence Circuits and Systems, Proceeding

Conference

Conference: 5th IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2023
Country/Territory: China
City: Hangzhou
Period: 23/6/11 – 23/6/13

Bibliographical note

Publisher Copyright:
© 2023 IEEE.

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Hardware and Architecture
  • Information Systems
  • Electrical and Electronic Engineering
