Final Report [electronic resource] : Efficient Databases for MPC Microdata
- Washington, D.C. : United States. Dept. of Energy, 2011. and Oak Ridge, Tenn. : Distributed by the Office of Scientific and Technical Information, U.S. Dept. of Energy.
- Additional Creators:
- United States. Department of Energy and United States. Department of Energy. Office of Scientific and Technical Information
- Restrictions on Access:
- Free-to-read Unrestricted online access
- The purpose of this grant was to develop the theory and practice of high-performance databases for massive streamed datasets. Over the last three years, we have developed fast indexing technology, that is, technology for rapidly ingesting data and storing that data so that it can be efficiently queried and analyzed. During this project we developed the technology so that high-bandwidth data streams can be indexed and queried efficiently. Our technology has been proven to work data sets composed of tens of billions of rows when the data streams arrives at over 40,000 rows per second. We achieved these numbers even on a single disk driven by two cores. Our work comprised (1) new write-optimized data structures with better asymptotic complexity than traditional structures, (2) implementation, and (3) benchmarking. We furthermore developed a prototype of TokuFS, a middleware layer that can handle microdata I/O packaged up in an MPI-IO abstraction.
- Published through SciTech Connect., 08/31/2011., "doe/er25853-1", Michael A. Bender; Martin Farach-Colton; Bradley C. Kuszmaul., and Tokutek
- Type of Report and Period Covered Note:
- Funding Information:
View MARC record | catkey: 14653335