Abstract
Large-scale systems with arrays of solid-state disks (SSDs) have become increasingly common in many computing segments. To make such systems resilient, we can adopt erasure coding, such as Reed-Solomon (RS) codes, as an alternative to replication, because erasure coding offers a significantly lower storage cost than replication. To understand the impact of erasure coding on system performance and on other system aspects such as CPU utilization and network traffic, we build a storage cluster consisting of approximately one hundred processor cores and more than fifty high-performance SSDs, and evaluate the cluster with a popular open-source distributed parallel file system, Ceph. We then analyze the behavior of systems adopting erasure coding from the following five viewpoints, compared with systems using replication: (1) storage system I/O performance; (2) computing and software overheads; (3) I/O amplification; (4) network traffic among storage nodes; and (5) the impact of physical data layout on the performance of RS-coded SSD arrays. For all these analyses, we examine two representative RS configurations, which are used by the Google and Facebook file systems, and compare them with triple replication, which a typical parallel file system employs as its default fault-tolerance mechanism. Lastly, we collect 54 block-level traces from the cluster and make them available to other researchers.
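The storage-cost argument in the abstract can be illustrated with a short calculation. The sketch below compares the raw bytes stored per byte of user data under triple replication and under an RS(k, m) code with k data chunks and m coding chunks. The parameters RS(6,3) and RS(10,4) are assumptions: they are configurations commonly associated with the Google and Facebook file systems, but the abstract itself does not state them. The function names are hypothetical.

```python
from fractions import Fraction


def replication_overhead(copies: int) -> Fraction:
    """Raw bytes stored per user byte when every object is kept `copies` times."""
    return Fraction(copies)


def rs_overhead(k: int, m: int) -> Fraction:
    """Raw bytes stored per user byte for an RS(k, m) code:
    k data chunks plus m coding chunks are stored for every k chunks of user data."""
    return Fraction(k + m, k)


# Assumed configurations; the abstract does not name the RS parameters.
schemes = {
    "3-replication": replication_overhead(3),
    "RS(6,3)": rs_overhead(6, 3),
    "RS(10,4)": rs_overhead(10, 4),
}

for name, overhead in schemes.items():
    print(f"{name}: {float(overhead):.2f}x raw bytes per user byte")
```

Under these assumed parameters, both RS configurations store at most half the raw data that triple replication does. This covers capacity only; the I/O amplification, CPU, and network costs examined in the paper are not modeled here.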
Original language | English |
---|---|
Title of host publication | Proceedings of the 2017 IEEE International Symposium on Workload Characterization, IISWC 2017 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 76-86 |
Number of pages | 11 |
ISBN (Electronic) | 9781538612323 |
DOIs | |
Publication status | Published - 2017 Dec 5 |
Event | 2017 IEEE International Symposium on Workload Characterization, IISWC 2017 - Seattle, United States Duration: 2017 Oct 1 → 2017 Oct 3 |
Publication series
Name | Proceedings of the 2017 IEEE International Symposium on Workload Characterization, IISWC 2017 |
---|---|
Volume | 2017-January |
Other
Other | 2017 IEEE International Symposium on Workload Characterization, IISWC 2017 |
---|---|
Country/Territory | United States |
City | Seattle |
Period | 17/10/1 → 17/10/3 |
Bibliographical note
Funding Information: This research is mainly supported by NRF 2016R1C1B2015312. This work is also supported in part by IITP-2017-2017-0-01015, NRF-2015M3C4A7065645, DOE DE-AC02-05CH11231, and MemRay grant (2015-11-1731). Nam Sung Kim is supported in part by NSF 1640196 and SRC/NRC NERC 2016-NE-2697-A. Myoungsoo Jung is the corresponding author.
Publisher Copyright:
© 2017 IEEE.
All Science Journal Classification (ASJC) codes
- Hardware and Architecture
- Information Systems and Management