With advances in high-throughput brain imaging at the cellular and sub-cellular level, there is growing demand for platforms that can support high-performance, large-scale brain data processing and analysis. In this paper, we present a novel pipeline that combines Accumulo, D4M, geohashing, and parallel programming to manage large-scale neuron connectivity graphs in a cloud environment. Our brain connectivity graph is represented using vertices (fiber start/end nodes), edges (fiber tracks), and the 3D coordinates of the fiber tracks. For optimal performance, we take the hybrid approach of storing vertices and edges in Accumulo and saving the fiber track 3D coordinates in flat files. Accumulo database operations offer low latency on sparse queries, while flat files offer high throughput for storing, querying, and analyzing bulk data. We evaluated our pipeline using 250 gigabytes of mouse neuron connectivity data. Benchmarking experiments on retrieving vertices and edges from Accumulo demonstrate a speedup of 1-2 orders of magnitude in retrieval time compared to the same operation on traditional flat files. The implementation of graph analytics such as Breadth First Search using Accumulo and D4M offers consistently good performance regardless of data size and density, and is thus scalable to very large datasets. Indexing of neuron subvolumes is simple and logical with geohashing-based binary tree encoding. This hybrid data management backend is used to drive an interactive web-based 3D graphical user interface, where users can examine the 3D connectivity map in a Google Maps-like viewer. Our pipeline is scalable and extensible to other data modalities.
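The abstract does not spell out the geohashing-based binary tree encoding, so the following is only a minimal sketch of the general idea, assuming a 3D bounding box that is halved along x, y, and z in turn. The function name `encode_subvolume`, the axis order, and the bit-string output are hypothetical choices for illustration, not the authors' implementation.

```python
def encode_subvolume(x, y, z, bounds, depth):
    """Return a bit string naming the subvolume that contains (x, y, z).

    Assumption for illustration: the volume is split in half along one
    axis per level, cycling x -> y -> z, and each split contributes one
    bit (0 = lower half, 1 = upper half).
    """
    lo = [bounds[0][0], bounds[1][0], bounds[2][0]]
    hi = [bounds[0][1], bounds[1][1], bounds[2][1]]
    point = (x, y, z)
    bits = []
    for level in range(depth):
        axis = level % 3                      # cycle through x, y, z
        mid = (lo[axis] + hi[axis]) / 2.0
        if point[axis] >= mid:
            bits.append("1")                  # point lies in the upper half
            lo[axis] = mid
        else:
            bits.append("0")                  # point lies in the lower half
            hi[axis] = mid
    return "".join(bits)


# Hypothetical 8 x 8 x 8 volume used only for this example.
volume = ((0, 8), (0, 8), (0, 8))
coarse = encode_subvolume(5, 1, 7, volume, depth=3)
fine = encode_subvolume(5, 1, 7, volume, depth=6)
```

In an encoding of this kind, the code of a coarse subvolume is a prefix of the codes of every finer subvolume it contains, so a prefix lookup on the key is enough to retrieve a spatial region at any level of the tree.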
|Title of host publication||2017 IEEE High Performance Extreme Computing Conference, HPEC 2017|
|Publisher||Institute of Electrical and Electronics Engineers Inc.|
|Publication status||Published - 2017 Oct 30|
|Event||2017 IEEE High Performance Extreme Computing Conference, HPEC 2017 - Waltham, United States|
Duration: 2017 Sept 12 → 2017 Sept 14
Bibliographical note
Publisher Copyright:
© 2017 IEEE.
All Science Journal Classification (ASJC) codes
- Computational Theory and Mathematics
- Hardware and Architecture
- Computer Networks and Communications