
1.1. A condition variable is a kind of event used for signaling between two or more threads. One or more threads can wait on it until it is signaled, while another thread signals it.

The header file required for condition variables in C++11 is #include <condition_variable>.

How things actually work with a condition variable:

  • Thread 1 calls wait() on the condition variable, which internally acquires the mutex and checks whether the required condition is met.

  • If not, it releases the lock and waits for the condition variable to be signaled (the thread is blocked). The condition variable's wait() function performs both of these operations atomically.

  • Another thread, e.g. Thread 2, signals the condition variable when the condition is met.

  • Once the condition variable is signaled, Thread 1, which was waiting for it, resumes. It then acquires the mutex lock again and checks whether the condition associated with the condition variable is actually met, or whether the wakeup was spurious. If more than one thread was waiting, notify_one will unblock only one of them.

  • If the wakeup was spurious, it calls wait() again.

The main member functions of a condition variable are:

  • wait() - blocks the current thread until the condition variable is signaled or a spurious wakeup happens.

  • notify_one() - if any threads are waiting on the same condition variable object, unblocks one of the waiting threads.

  • notify_all() - if any threads are waiting on the same condition variable object, unblocks all of the waiting threads.

1.2. Describe ways to accelerate computation and the memory hierarchy. Memory is evaluated according to the following criteria:

  • Cost.

  • Capacity: the amount of data that can be stored per unit.

  • Speed: the time it takes to access data.

  • Reliability: increasingly important; reliability in the strict sense measures the time from initialization to the first/next failure event. It is measured in mean time to failure (MTTF).

  • Availability: in the strict sense, the proportion of time that a device is up and running, or, in an alternative interpretation, the probability of finding a device up and running. Availability equals MTTF / (MTTF + MTTR), where MTTR is the mean time to repair.

If we plot the size, speed, and costs of memory and storage devices, we obtain the following schematic picture:

Faster devices cost more and have lower capacity. Currently the largest-capacity storage devices are tapes, which still cost marginally less than hard drives. But tapes have access times measured in minutes. The most expensive memory is in-CPU memory such as registers and internal instruction queues. While it is difficult to calculate the price, it is clearly more than tens of dollars per kilobyte. This is the fastest memory available (access time is essentially the clock speed), but also one that cannot be provided in large quantities.

1.3. Describe the OpenMP programming model and compiling programs with OpenMP.

Shared Memory Model: OpenMP is designed for multi-processor/core, shared memory machines.

Thread Based Parallelism:

  • OpenMP programs accomplish parallelism exclusively through the use of threads.

  • A thread of execution is the smallest unit of processing that can be scheduled by an operating system. The idea of a subroutine that can be scheduled to run autonomously might help explain what a thread is.

  • Threads exist within the resources of a single process. Without the process, they cease to exist.

  • Typically, the number of threads matches the number of machine processors/cores. However, the actual use of threads is up to the application.

Explicit Parallelism:

  • OpenMP is an explicit (not automatic) programming model, offering the programmer full control over parallelization.

  • Parallelization can be as simple as taking a serial program and inserting compiler directives...

  • Or as complex as inserting subroutines to set multiple levels of parallelism, locks and even nested locks.

Compiler Directive Based:

  • Most OpenMP parallelism is specified through the use of compiler directives which are embedded in C/C++ or Fortran source code.

#pragma omp parallel default(shared) private(beta,pi)

1.4. Define the classification of computer architectures and the classification of parallel computing systems.

Symmetric multiprocessing (SMP) involves a symmetric multiprocessor system hardware and software architecture where two or more identical processors connect to a single, shared main memory, have full access to all I/O devices, and are controlled by a single operating system instance that treats all processors equally, reserving none for special purposes.

1.5. Describe how to set the number of parallel threads in OpenMP.

  • omp_get_thread_num() - get the thread rank in a parallel region (0 to omp_get_num_threads() - 1).

  • omp_set_num_threads(nthreads) - set the number of threads used in a parallel region.

  • omp_get_num_threads() - get the number of threads used in a parallel region.

The number of threads in a parallel region is determined by the following factors, in order of precedence:

1. Evaluation of the if clause.

2. Setting of the num_threads() clause.

3. Use of the omp_set_num_threads() library function.

4. Setting of the OMP_NUM_THREADS environment variable.

5. Implementation default - usually the number of cores on a node.

Threads are numbered from 0 (master thread) to N-1.

int nthreads, tid;

#pragma omp parallel num_threads(4) private(tid)

{

    tid = omp_get_thread_num();

}

The call to find the maximum number of threads that are available to do work is omp_get_max_threads() (from omp.h). This should not be confused with the similarly named omp_get_num_threads(). The 'max' call returns the maximum number of threads that can be put to use in a parallel region. There's a big difference between the two. In a serial region omp_get_num_threads will return 1; in a parallel region it will return the number of threads that are being used.

1.6. Define race conditions and C++11 mutexes.

A race condition is a kind of bug that occurs in multithreaded applications.

When two or more threads perform a set of operations in parallel that access the same memory location, and one or more of those threads modifies the data in that location, this can sometimes lead to unexpected results.

This is called a race condition.

Race conditions are usually hard to find and reproduce because they don't occur every time. They occur only when the relative order of execution of operations by two or more threads leads to an unexpected result.

class Wallet {

    int mMoney;

public:

    Wallet() : mMoney(0) {}

    int getMoney() { return mMoney; }

    void addMoney(int money) {

        for (int i = 0; i < money; ++i) { mMoney++; }

    }

};

Now let's create 5 threads that all share the same Wallet object and, in parallel, each add 1000 to the internal money using its addMoney() member function.

So, if the money in the wallet is initially 0, then after all threads finish executing, the money in the Wallet should be 5000.

But because all threads modify the shared data at the same time, in some scenarios the money in the wallet at the end may be much less than 5000.

A mutex is a lockable object designed to signal when critical sections of code need exclusive access, preventing other threads with the same protection from executing concurrently and accessing the same memory locations. The new concurrency library of C++11 comes with two different classes for managing mutex locks: std::lock_guard and std::unique_lock.

1.7. Describe barrier synchronization in C++11, OpenMP, and MPI.

Barrier: Each thread waits until all threads arrive.

A barrier defines a point in the code where all active threads will stop until all threads have arrived at that point. With this, you can guarantee that certain calculations are finished. For instance, in this code snippet, the computation of y cannot safely proceed until another thread has computed its value of x:

#pragma omp parallel

{

int mytid = omp_get_thread_num();

x[mytid] = some_calculation();

y[mytid] = x[mytid]+x[mytid+1];

}

This can be guaranteed with a barrier pragma:

#pragma omp parallel

{

int mytid = omp_get_thread_num();

x[mytid] = some_calculation();

#pragma omp barrier

y[mytid] = x[mytid]+x[mytid+1];

}