Needleman-Wunsch Attention: A Framework for Enhancing DNA Sequence Embedding

Kyelim Lee, Albert No

Research output: Contribution to journal, Article, peer-review

Abstract

Many biological studies that rely on DNA sequence data need to calculate the edit distance between two sequences. However, computing the edit distance requires dynamic programming, which can be computationally intensive. To address this challenge, numerous works have focused on embedding sequences into a vector space while preserving the distance metric, so that the edit distance between two sequences is approximated by the distance between their corresponding vectors. In this study, we propose a novel Needleman-Wunsch Attention (NWA) framework for sequence embedding that leverages the relationship between the Needleman-Wunsch (NW) matrix and attention maps. Our approach applies to any deep learning-based sequence embedding network and provides a general way to improve the accuracy and efficiency of edit distance approximation methods. We validate the proposed method by applying it to various existing embedding networks, demonstrating improved edit distance-preserving embedding on a real dataset. The code is publicly available at https://github.com/thisislim/nw-attention/.
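
For readers unfamiliar with the dynamic-programming cost mentioned in the abstract, the following is a minimal sketch of the textbook unit-cost edit distance (Levenshtein) recurrence. It is not the NWA framework or the authors' code; the function name `edit_distance` and the unit costs for insertion, deletion, and substitution are assumptions made for illustration. The nested loop makes the quadratic cost in sequence length visible, which is the cost that embedding-based approximations try to avoid.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance via the standard DP recurrence (illustrative only)."""
    # prev[j] is the distance between the previous prefix of `a` and b[:j].
    # Row 0: turning the empty string into b[:j] takes j insertions.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Column 0: turning a[:i] into the empty string takes i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ca == cb else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Compare two short DNA strings that differ by one substitution.
    print(edit_distance("ACGT", "AGGT"))
```

Only two rows of the DP table are kept at a time, so memory is linear in the length of `b`, but the running time remains proportional to the product of the two sequence lengths.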

Original language: English
Pages (from-to): 69087-69096
Number of pages: 10
Journal: IEEE Access
Volume: 12
Publication status: Published - 2024

Bibliographical note

Publisher Copyright:
© 2013 IEEE.

All Science Journal Classification (ASJC) codes

  • General Computer Science
  • General Materials Science
  • General Engineering
