Hard Disk/Solid State Drive Synergy in Support of Data-Intensive Computing [electronic resource].
- Los Alamos, N.M. : Los Alamos National Laboratory, 2012.
Oak Ridge, Tenn. : Distributed by the Office of Scientific and Technical Information, U.S. Dept. of Energy.
- Additional Creators:
- Los Alamos National Laboratory and United States. Department of Energy. Office of Scientific and Technical Information
- Restrictions on Access:
- Free-to-read Unrestricted online access
- Data-intensive applications are becoming increasingly common in high-performance computing. Examples include combustion simulation, human genome analysis, and satellite image processing. Efficient access to data sets is critical to the performance of these applications. Because of the size of the data, today's economically feasible approach is to store the data files on an array of hard disks, or on data servers equipped with hard disks, managed by a parallel file system such as PVFS or Lustre, in which the data is striped over a (large) number of disks for high aggregate I/O throughput. With file striping, a request for a segment of logically contiguous file space is decomposed into multiple sub-requests, each to a different server. While the data unit for this striping is usually reasonably large to benefit disk efficiency, the first and/or last sub-requests can be much smaller than the striping unit if the request does not align with the striping pattern, severely compromising hard disk efficiency and thus application performance. We propose to exploit solid state drives (SSDs), whose efficiency is much less sensitive to small random accesses, to enable the alignment of requests to disk with the data striping pattern. In this scheme, hard disks mainly serve large, aligned, sequential requests, while SSDs serve small or unaligned requests, thus respecting the relative cost, performance, and durability characteristics of the two media and thereby achieving synergy in performance/cost. We will describe the design of the proposed scheme, its implementation on CCS-7's Darwin cluster, and performance results.
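The decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 64 KiB striping unit, the four servers, the round-robin placement, and the routing rule (full striping units go to hard disk, partial first or last fragments go to SSD) are all assumptions chosen for illustration, and the names `SubRequest` and `split_request` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SubRequest:
    server: int   # index of the data server holding this striping unit
    offset: int   # byte offset within the logical file
    length: int   # number of bytes in this sub-request
    target: str   # "hdd" for a full striping unit, "ssd" for a partial one


def split_request(offset: int, length: int,
                  stripe_unit: int = 65536,
                  num_servers: int = 4) -> List[SubRequest]:
    """Split a contiguous file request at striping-unit boundaries.

    Assumes round-robin placement of striping units over servers and an
    illustrative routing rule: only sub-requests that cover a whole
    striping unit are sent to hard disk.
    """
    subs = []
    end = offset + length
    pos = offset
    while pos < end:
        stripe_index = pos // stripe_unit
        stripe_end = (stripe_index + 1) * stripe_unit
        chunk_end = min(end, stripe_end)
        chunk_len = chunk_end - pos
        # A fragment shorter than the striping unit is unaligned at its
        # start, its end, or both.
        target = "hdd" if chunk_len == stripe_unit else "ssd"
        subs.append(SubRequest(stripe_index % num_servers, pos, chunk_len, target))
        pos = chunk_end
    return subs


if __name__ == "__main__":
    # A 200,000-byte request starting at byte 10,000 is not aligned with
    # the 64 KiB striping unit, so its first and last pieces are partial.
    for sub in split_request(offset=10000, length=200000):
        print(sub)
```

Under these assumptions, the unaligned request above yields a 55,536-byte head fragment and a 13,392-byte tail fragment routed to SSD, with the two full 65,536-byte units in between routed to hard disk.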
- Report Numbers:
- E 1.99:la-ur-12-23164
- Other Subject(s):
- Published through SciTech Connect.
Jiang, Song; Liu, Ke; Davis, Kei.
- Funding Information:
catkey: 14343220