Chapter 1 [electronic resource] : Efficient communication primitives on mesh architectures with hardware routing
Published
Washington, D.C. : United States. Dept. of Energy, 1993. Oak Ridge, Tenn. : Distributed by the Office of Scientific and Technical Information, U.S. Dept. of Energy.
Several algorithms are discussed for implementing global combine on distributed memory computers using a two-dimensional mesh interconnect with wormhole routing. These include algorithms that are asymptotically optimal for short vectors (O(log p) for p processors) and for long vectors (O(n) for n data elements per processor). Performance models are developed that include startup and transfer costs, which can depend on the number of messages that each node must handle at once. The models are validated using experimental data from the Intel Touchstone DELTA computer.
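The short-vector versus long-vector trade-off described in the abstract can be illustrated with a rough cost sketch. This is not the report's model: it assumes the generic startup-plus-transfer (alpha-beta) model, a recursive-doubling exchange for short vectors, and a reduce-scatter followed by an allgather for long vectors. It also ignores the dependence of costs on the number of messages a node handles at once, which the report's models include. The function names and the parameters alpha (startup cost per message), beta (transfer cost per element), and gamma (combine cost per element) are hypothetical.

```python
def short_vector_cost(p, n, alpha, beta, gamma):
    """Assumed recursive-doubling combine: ceil(log2 p) steps, and in each
    step a node sends and combines the full n-element vector."""
    steps = (p - 1).bit_length()  # integer ceil(log2(p)) for p >= 1
    return steps * (alpha + n * beta + n * gamma)


def long_vector_cost(p, n, alpha, beta, gamma):
    """Assumed reduce-scatter followed by allgather: 2(p-1) messages,
    but the per-element terms do not grow with p."""
    frac = (p - 1) / p
    return 2 * (p - 1) * alpha + 2 * frac * n * beta + frac * n * gamma


def cheaper_algorithm(p, n, alpha, beta, gamma):
    """Return 'short' or 'long' depending on which assumed model is cheaper."""
    short = short_vector_cost(p, n, alpha, beta, gamma)
    long_ = long_vector_cost(p, n, alpha, beta, gamma)
    return "short" if short <= long_ else "long"


if __name__ == "__main__":
    # Illustrative parameter values only, not measurements from any machine.
    for n in (1, 100, 10_000):
        print(n, cheaper_algorithm(p=16, n=n, alpha=100.0, beta=1.0, gamma=0.0))
```

Under these assumptions the logarithmic algorithm is cheaper while startup cost dominates, and the long-vector algorithm is cheaper once the n-proportional transfer term dominates. The crossover point depends entirely on the chosen parameter values.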
Report Numbers
E 1.99:pnl-sa--22001 E 1.99:conf-9303204--1 conf-9303204--1 pnl-sa--22001
Published through SciTech Connect. 03/01/1993. "pnl-sa--22001" "conf-9303204--1" "DE93018791" 6. Society for Industrial and Applied Mathematics conference on parallel processing for scientific computing, Newport Beach, CA (United States), 22-24 Mar 1993. Payne, D.G.; Barnett, M.; Littlefield, R.; van de Geijn, R.