Lupin / Lectures_2007 / Introduction to Parallel Programming (.pdf)
How to mix MPI and OpenMP* in one program?
A sequential program working on a data set:
– Replicate the program
– Add glue code
– Break up the data
• Create the MPI program with its data decomposition.
• Use OpenMP inside each MPI process.
*Other names and brands may be claimed as the property of others.
Pi program with MPI and OpenMP*
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

static long num_steps = 100000;

int main(int argc, char *argv[])
{
    int i, my_id, numprocs;
    long my_steps;
    double x, pi, step, sum = 0.0;

    step = 1.0/(double) num_steps;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_id);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    my_steps = num_steps/numprocs;

    #pragma omp parallel for private(x) reduction(+:sum)
    for (i = my_id*my_steps; i < (my_id+1)*my_steps; i++) {
        x = (i+0.5)*step;
        sum += 4.0/(1.0+x*x);
    }
    sum *= step;

    MPI_Reduce(&sum, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (my_id == 0) printf("pi = %f\n", pi);
    MPI_Finalize();
    return 0;
}
1D Heat Diffusion Equation – sequential
#include <stdio.h>
#include <stdlib.h>
#define NX 100

int main(void) {
    double ukArray[NX], ukp1Array[NX];
    double *uk = ukArray;
    double *ukp1 = ukp1Array;
    double dx = 1.0/NX;
    double dt = 0.5*dx*dx;
    double *temp;

    uk[0] = 1.0; uk[NX-1] = 10.0;
    ukp1[0] = 1.0; ukp1[NX-1] = 10.0;
    for (int i = 1; i < NX-1; ++i) uk[i] = 0.0;

    for (int k = 0; k < 10000; ++k) {
        for (int i = 1; i < NX-1; ++i) {
            ukp1[i] = uk[i] + (dt/(dx*dx))*(uk[i+1] - 2*uk[i] + uk[i-1]);
        }
        temp = ukp1; ukp1 = uk; uk = temp;
    }
    return 0;
}
1D Heat Diffusion Equation – OpenMP*
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>
#define NX 100

int main(void) {
    double ukArray[NX], ukp1Array[NX];
    double *uk = ukArray;
    double *ukp1 = ukp1Array;
    double dx = 1.0/NX;
    double dt = 0.5*dx*dx;
    double *temp;

    uk[0] = 1.0; uk[NX-1] = 10.0;
    ukp1[0] = 1.0; ukp1[NX-1] = 10.0;
    for (int i = 1; i < NX-1; ++i) uk[i] = 0.0;

    #pragma omp parallel
    for (int k = 0; k < 10000; ++k) {
        #pragma omp for
        for (int i = 1; i < NX-1; ++i) {
            ukp1[i] = uk[i] + (dt/(dx*dx))*(uk[i+1] - 2*uk[i] + uk[i-1]);
        }
        #pragma omp single
        { temp = ukp1; ukp1 = uk; uk = temp; }
    }
    return 0;
}
1D Heat Diffusion Equation – MPI
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#define NX 100
#define NSTEPS 10000

int main(int argc, char *argv[]) {
    double *uk, *ukp1, *temp;
    double dx = 1.0/NX;
    double dt = 0.5*dx*dx;
    int numProcs, myID, leftNbr, rightNbr, numPoints;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numProcs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myID);
    leftNbr  = myID - 1;  // ID of left "neighbor" process
    rightNbr = myID + 1;  // ID of right "neighbor" process
    numPoints = NX / numProcs;

    // Local points 1..numPoints, plus ghost cells at 0 and numPoints+1
    uk   = malloc(sizeof(double) * (numPoints+2));
    ukp1 = malloc(sizeof(double) * (numPoints+2));
    for (int i = 0; i < numPoints+2; ++i) uk[i] = ukp1[i] = 0.0;
    if (myID == 0)          uk[1] = ukp1[1] = 1.0;                  // left boundary
    if (myID == numProcs-1) uk[numPoints] = ukp1[numPoints] = 10.0; // right boundary

    for (int k = 0; k < NSTEPS; ++k) {
        // Exchange ghost cells with neighboring processes
        if (myID != 0)
            MPI_Send(&uk[1], 1, MPI_DOUBLE, leftNbr, 0, MPI_COMM_WORLD);
        if (myID != numProcs-1)
            MPI_Send(&uk[numPoints], 1, MPI_DOUBLE, rightNbr, 0, MPI_COMM_WORLD);
        if (myID != 0)
            MPI_Recv(&uk[0], 1, MPI_DOUBLE, leftNbr, 0, MPI_COMM_WORLD, &status);
        if (myID != numProcs-1)
            MPI_Recv(&uk[numPoints+1], 1, MPI_DOUBLE, rightNbr, 0, MPI_COMM_WORLD, &status);

        // Update interior points
        for (int i = 2; i < numPoints; ++i)
            ukp1[i] = uk[i] + (dt/(dx*dx))*(uk[i+1] - 2*uk[i] + uk[i-1]);

        // Update edge points, leaving the fixed physical boundaries alone
        if (myID != 0) {
            int i = 1;
            ukp1[i] = uk[i] + (dt/(dx*dx))*(uk[i+1] - 2*uk[i] + uk[i-1]);
        }
        if (myID != numProcs-1) {
            int i = numPoints;
            ukp1[i] = uk[i] + (dt/(dx*dx))*(uk[i+1] - 2*uk[i] + uk[i-1]);
        }
        temp = ukp1; ukp1 = uk; uk = temp;
    }
    free(uk); free(ukp1);
    MPI_Finalize();
    return 0;
}
Outline
Parallel programming, wetware, and software
Parallel programming APIs
–Thread Libraries
   –Win32 API
   –POSIX threads
–Compiler Directives
   –OpenMP*
–Message Passing
   –MPI
More complicated examples
Choosing which API is for you
Choosing a parallel programming language
Which should you use?
–If you need to run on clusters, SMPs, and many-core systems, use MPI.
–MPI is the assembly code of parallel programming. It demands very little of the hardware and hence runs “everywhere”.
–If you have very complex data structures that are hard to break into distinct chunks, use one of the shared-memory approaches.
–Both OpenMP* and the thread libraries are based on shared address spaces, so sharing big, complex data structures is easy.
–BUT BE CAREFUL … a shared address space means you might be sharing when you don’t know it, and that can mean race conditions.
–If you are writing complex applications and need to focus on the application, not the API, use OpenMP.
–If you are writing system software and need to control everything, use a thread library.
Summary
Most parallel programs today are written with:
–A low-level threading library (Pthreads or Windows threads)
–MPI
–OpenMP*
Pick one and start becoming familiar with parallel programming.
–Don’t stress too much over picking the best one. Programmers almost always work with multiple languages, and the same holds for parallel programmers and parallel languages.
Tools can help tremendously. Attend the next webinars in this series to learn about Intel’s parallel software tools.
Thank you for attending!
five more in this series… come interact with experts…
http://on24.com/event/36/88/3/rt/1/?eventid=36883
April 3  | A Gentle Introduction to Parallel Software | Dr. Tim Mattson
April 17 | Software Performance Analysis for Multi-Core CPUs and Windows Vista* | Gary Carleton
May 1    | Three Steps to Threading and Performance. Part 1 – Thread Correctness: Maintaining Deterministic Results in Developing, Maintaining and Tuning Threaded Software | Dr. David Mackay
May 15   | Three Steps to Threading and Performance. Part 2 – Expressing Parallelism: Case Studies with Intel® Threading Building Blocks | Victoria Gromova
June 5   | Three Steps to Threading and Performance. Part 3 – Tuning Threaded Software: Next Steps After Concurrency | Vasanth Tovinkere
June 19  | Using Intel® C++ and Fortran Compilers, Version 10.0 for Performance, Multithreading, and Security | Joe Wolf
Read detailed abstracts & sign-up
