Algorithmic support for commodity-based parallel computing systems [electronic resource].
- Washington, D.C. : United States. Dept. of Energy, 2003.
Oak Ridge, Tenn. : Distributed by the Office of Scientific and Technical Information, U.S. Dept. of Energy.
- Physical Description:
- 94 pages : digital, PDF file
- Additional Creators:
- Sandia National Laboratories, United States. Department of Energy, and United States. Department of Energy. Office of Scientific and Technical Information
- Restrictions on Access:
- Free-to-read Unrestricted online access
- The Computational Plant or Cplant is a commodity-based distributed-memory supercomputer under development at Sandia National Laboratories. Distributed-memory supercomputers run many parallel programs simultaneously. Users submit their programs to a job queue. When a job is scheduled to run, it is assigned to a set of available processors. Job runtime depends not only on the number of processors but also on the particular set of processors assigned to it. Jobs should be allocated to localized clusters of processors to minimize communication costs and to avoid bandwidth contention caused by overlapping jobs. This report introduces new allocation strategies and performance metrics based on space-filling curves and one dimensional allocation strategies. These algorithms are general and simple. Preliminary simulations and Cplant experiments indicate that both space-filling curves and one-dimensional packing improve processor locality compared to the sorted free list strategy previously used on Cplant. These new allocation strategies are implemented in Release 2.0 of the Cplant System Software that was phased into the Cplant systems at Sandia by May 2002. Experimental results then demonstrated that the average number of communication hops between the processors allocated to a job strongly correlates with the job's completion time. This report also gives processor-allocation algorithms for minimizing the average number of communication hops between the assigned processors for grid architectures. The associated clustering problem is as follows: Given n points in ℜd, find k points that minimize their average pairwise L₁ distance. Exact and approximate algorithms are given for these optimization problems. One of these algorithms has been implemented on Cplant and will be included in Cplant System Software, Version 2.1, to be released. In more preliminary work, we suggest improvements to the scheduler separate from the allocator.
- Report Numbers:
- E 1.99:sand2003-3702
- Published through SciTech Connect.
Phillips, Cynthia Ann; Leung, Vitus Joseph; Bender, Michael A.; Bunde, David P.
- Funding Information:
View MARC record | catkey: 14346051