University: National Research University "MPEI"

Subject: Parallel Computing Systems

File: Exam preparation files / NAMD-BluegeneL.ppt

Added: 28.06.2014

Size: 3.27 MB

IBM Research

Network Progress Calls

NAMD makes progress engine calls from the compute loops

– Typical frequency: once every 10000 cycles, dynamically tunable

for (i = 0; i < (i_upper SELF(- 1)); ++i) {
    CmiNetworkProgress();   // give the network a chance to advance

    const CompAtom &p_i = p_0[i];
    // ……………………………

    // Compute pairlists
    for (k = 0; k < npairi; ++k) {
        // Compute forces
    }
}

void CmiNetworkProgress() {
    new_time = rts_get_timebase();
    if (new_time < lastProgress + PERIOD) {
        // Called again within PERIOD cycles of the previous call: skip
        lastProgress = new_time;
        return;
    }
    lastProgress = new_time;
    AdvanceCommunication();
}
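
The throttling idea behind CmiNetworkProgress can be sketched in isolation. The snippet below is a minimal illustration, not the Charm++ implementation: `ProgressThrottle`, `poll`, and the `advances` counter are hypothetical names, and the timebase is passed in as a plain integer instead of being read from hardware. Note one deliberate difference from the slide: this variant leaves the timestamp unchanged on a skipped call, so closely spaced calls still trigger an advance once a full period has accumulated, whereas the slide's version refreshes the timestamp on every call.

```cpp
#include <cstdint>

// Hypothetical stand-in for the progress state on the slide; not the Charm++ API.
struct ProgressThrottle {
    std::uint64_t period;    // minimum number of cycles between network advances
    std::uint64_t last;      // timebase value at the most recent advance
    std::uint64_t advances;  // how many times the network was advanced

    explicit ProgressThrottle(std::uint64_t p) : period(p), last(0), advances(0) {}

    // Returns true if the network was advanced on this call.
    bool poll(std::uint64_t now) {
        if (now - last < period) {
            return false;    // too soon since the last advance: skip
        }
        last = now;
        ++advances;          // a real driver would call its advance routine here
        return true;
    }
};
```

Calling `poll` from the inner compute loop with the current cycle count keeps the per-iteration cost to one subtraction and one comparison in the common case.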

MPI Scalability

Charm++ MPI Driver

–Iprobe based implementation

–Higher progress overhead of MPI_Test

–Statically pinned FIFOs for point to point communication

32	© 2005 IBM Corporation

Charm++ Native Driver

BGX Message Layer (developed by George Almasi)

–Lower progress overhead

–Active messages

• Easily design complex communication protocols

–Dynamic FIFO mapping

–Low overhead remote memory access

–Interrupts

–Charm++ BGX driver was developed by Chao Huang over this summer
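
To make the active-message idea concrete: each incoming packet names a handler, and the receiver runs that handler on the payload as soon as the packet is dispatched. The sketch below is an assumption-laden illustration only; `Packet`, `HandlerRegistry`, `add`, and `dispatch` are hypothetical names and do not reflect the BGX message layer's actual interface.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical active-message packet: a handler index plus a payload.
struct Packet {
    std::size_t handler_id;
    std::vector<std::uint8_t> payload;
};

// Hypothetical registry mapping handler indices to callbacks.
class HandlerRegistry {
public:
    using Handler = std::function<void(const std::vector<std::uint8_t>&)>;

    // Registers a handler and returns the index that senders put in the packet.
    std::size_t add(Handler h) {
        handlers_.push_back(std::move(h));
        return handlers_.size() - 1;
    }

    // Runs the handler named by the packet; returns false for an unknown index.
    bool dispatch(const Packet& p) const {
        if (p.handler_id >= handlers_.size()) {
            return false;
        }
        handlers_[p.handler_id](p.payload);
        return true;
    }

private:
    std::vector<Handler> handlers_;
};
```

Because the handler runs at dispatch time, a protocol step such as "acknowledge, then forward" can be written as an ordinary callback instead of a matched send/receive pair.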

BG/L Msglayer

[Diagram of the BG/L message layer. Recoverable labels: Messages (SpadMessage, TreeMessage, TorusMessage, TorusDirectMessage<> templates); Msg queues (Scratchpad, Collective, Torus; entries … n-1); Packets (TreePacket, TorusPacket); FIFO pinning (dynamically routed packet, deterministically routed packet); Network (Coll. network FIFO; Torus FIFOs I0 and I1: 0 1 2 H; R0 and R1: x+ x- y+ y- z+ z- H); Dispatching (Torus pkt. registry, Coll. pkt. disp.); Advance loop.]

(This slide is taken from G. Almási’s talk on the “new” msglayer.)

Optimized Multicast

pinFifo Algorithms

– Decide which of the 6 FIFOs to use when sending a message to {x,y,z,t}

– Cones, Chessboard

Dynamic FIFO mapping

– A special send queue from which a message can go into whichever FIFO is not full
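
The difference between pinning and dynamic mapping can be sketched with a toy model of the FIFOs. This is only an illustration under stated assumptions: the FIFO depth `kFifoCapacity`, the type `InjectionFifos`, and both function names are hypothetical, and the model tracks occupancy counts only. It does not implement the Cones or Chessboard pinning algorithms.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kNumFifos = 6;      // from the slide: 6 FIFOs
constexpr std::size_t kFifoCapacity = 8;  // assumed depth, for illustration only

struct InjectionFifos {
    std::array<std::size_t, kNumFifos> used{};  // packets currently queued per FIFO

    // Pinned send: only the given FIFO may be used. Returns false if it is full.
    bool send_pinned(std::size_t fifo) {
        if (used[fifo] >= kFifoCapacity) {
            return false;
        }
        ++used[fifo];
        return true;
    }

    // Dynamic send: try the preferred FIFO first, then any other FIFO with room.
    // Returns the index of the FIFO used, or kNumFifos if every FIFO is full.
    std::size_t send_dynamic(std::size_t preferred) {
        for (std::size_t i = 0; i < kNumFifos; ++i) {
            std::size_t f = (preferred + i) % kNumFifos;
            if (used[f] < kFifoCapacity) {
                ++used[f];
                return f;
            }
        }
        return kNumFifos;
    }
};
```

With pinning, a sender stalls as soon as its one FIFO fills; with dynamic mapping it stalls only when all six are full, which is the property the slide attributes to the special send queue.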

Communication Pattern in PME

108 procs

PME

Plane decomposition for 3D-FFT

PME objects placed close to patch objects on the torus

PME optimized through an asynchronous all-to-all with dynamic FIFO mapping
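
As a rough illustration of plane decomposition, the sketch below assigns contiguous planes of the FFT grid to PME objects and counts the point-to-point messages in one all-to-all transpose. The slides do not say how planes are assigned, so the even block distribution here is an assumption, and `owned_planes` and `transpose_messages` are hypothetical helper names. Both functions assume `objects` is at least 1.

```cpp
#include <cstddef>
#include <utility>

// Planes [first, last) owned by PME object `rank` when `planes` planes are split
// as evenly as possible over `objects` objects (assumed block distribution).
std::pair<std::size_t, std::size_t> owned_planes(std::size_t planes,
                                                 std::size_t objects,
                                                 std::size_t rank) {
    std::size_t base = planes / objects;
    std::size_t extra = planes % objects;  // the first `extra` ranks get one more plane
    std::size_t first = rank * base + (rank < extra ? rank : extra);
    std::size_t count = base + (rank < extra ? 1 : 0);
    return {first, first + count};
}

// Messages in one all-to-all transpose: every object sends to every other object.
std::size_t transpose_messages(std::size_t objects) {
    return objects * (objects - 1);
}
```

The quadratic message count is why the transpose benefits from an asynchronous all-to-all: with 108 objects, one transpose already involves 11556 point-to-point messages.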

Performance Results

BGX Message layer vs MPI

Fully non-blocking version performed below par on MPI

– Polling overhead high for a list of posted receives

BGX message layer works well with asynchronous communication

ApoA1 Benchmark

# Nodes	Cutoff: Msglayer	Cutoff: MPI*	With PME: Msglayer	With PME: MPI*
4	2250	2250
32	314	316	356	371
128	85	91.6	103
512	22.7	23.8	26.7	27.8
1024	13.2	13.9	14.4	17.3
2048	7.9	8.1	9.7	10.2
4096	4.8	4.9	6.8	7.3

NAMD Co-Processor Mode Performance (ms/step)

Message layer has sender side blocking communication here
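
One way to read the table is as parallel efficiency relative to a smaller run. The helper below is a hypothetical convenience function, not part of NAMD or Charm++; it divides the observed speedup by the ideal speedup implied by the node counts.

```cpp
// Efficiency of a run on `nodes` nodes relative to a baseline run on `base_nodes`:
// (observed speedup) / (ideal speedup). Times are in ms/step.
double relative_efficiency(double base_nodes, double base_ms,
                           double nodes, double ms) {
    double observed_speedup = base_ms / ms;
    double ideal_speedup = nodes / base_nodes;
    return observed_speedup / ideal_speedup;
}
```

For the Msglayer cutoff column, going from 32 nodes (314 ms/step) to 4096 nodes (4.8 ms/step) gives `relative_efficiency(32, 314, 4096, 4.8)`, which is roughly 0.51.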

Blocking vs Overlap

# Nodes	Cutoff: Blocking Sender	Cutoff: Non-Blocking	With PME: Blocking Sender	With PME: Non-Blocking
32	314	313	356	347
128	85	82	103	97.2
512	22.7	21.7	26.7	23.7
1024	13.2	11.9	14.4	13.8
2048	7.9	7.3	9.7	8.6
4096	4.8	4.3	6.8	6.2
8192	-	3.7	-	-

ApoA1 Benchmark in Co-Processor Mode (ms/step)
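
The benefit of overlap can be quantified as the fraction of step time saved by the non-blocking version. The function below is a hypothetical helper for that arithmetic, shown only to make the comparison explicit.

```cpp
// Fraction of the blocking step time saved by the non-blocking version.
// Both arguments are step times in the same unit.
double overlap_gain(double blocking, double non_blocking) {
    return (blocking - non_blocking) / blocking;
}
```

For example, at 4096 nodes with cutoff only, `overlap_gain(4.8, 4.3)` is about 0.10, and at 512 nodes with PME, `overlap_gain(26.7, 23.7)` is about 0.11.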
