CHAPTER 2. THE ARCHITECTURE OF MODERN GRAPHICS PROCESSORS

[Figure 2.4 labels: Input Assembler, Vertex Shader, Geometry Shader, Setup & Rasterizer, Fragment Shader, Raster Operations, Unified Processor Array]

Figure 2.4: Cyclic approach of the graphics pipeline in the unified architectures.

The key contribution of this evolved architecture was the introduction of programmable stages, which became the core of current graphics architectures and their many fully programmable processing units. The appearance of these programmable units in the graphics pipeline is what turned GPUs into a feasible target for general-purpose computation.

In addition to the hardware update, the introduction of new APIs for programming the GPU brought renewed interest in GPGPU. Among those APIs, the most successful were Cg [64] and HLSL [124], jointly developed by NVIDIA and Microsoft.

2.2. The NVIDIA G80 as an example of the CUDA architecture

The mapping of this logical programmable pipeline onto the physical processor is what ultimately transformed the GPU computing scenario. In 2006, GPU vendors introduced a novel architectural design based on the idea of unified vertex and pixel processors. In this approach, there is no distinction between the units that perform vertex processing and those that perform pixel processing. From this generation of GPUs on, all programmable stages were executed by the same functional units, regardless of the nature of the calculation to be done.

From the graphics perspective, the aim of this transformation was to reduce the imbalance that frequently occurred between vertex and pixel processing. Because of this imbalance, many of the functional units inside the GPU sat idle for significant periods of time. In the unified architecture, there is only one type of processing unit, capable of executing both vertex and pixel operations. Thus, the sequential pipeline is transformed into a cyclic one, in which data recirculates through the processor. Data produced by one stage is used as input to subsequent stages, which run on the same computational resources reconfigured for a different behavior. Figure 2.4 illustrates this novel view of the graphics pipeline.
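The load-balancing argument can be made concrete with a toy model. The following Python sketch is only an illustration under stated assumptions, not a description of any real GPU: the unit counts (8 vertex units, 24 pixel units), the abstract "work items", and the function names are all hypothetical. It assumes every unit completes one work item per time step and that, in the non-unified design, the slower of the two stages determines the frame time.

```python
from math import ceil


def fixed_pipeline_time(vertex_work, pixel_work, vertex_units, pixel_units):
    """Time steps for a frame when each unit type can only run its own stage.

    Both stages run concurrently, so the slower stage sets the frame time
    and the units of the faster stage sit idle while they wait.
    """
    vertex_time = ceil(vertex_work / vertex_units)
    pixel_time = ceil(pixel_work / pixel_units)
    return max(vertex_time, pixel_time)


def unified_pipeline_time(vertex_work, pixel_work, total_units):
    """Time steps when every unit can execute either kind of work."""
    return ceil((vertex_work + pixel_work) / total_units)


def utilization(total_work, total_units, time_steps):
    """Fraction of the available unit-time slots that did useful work."""
    return total_work / (total_units * time_steps)


if __name__ == "__main__":
    # Hypothetical split of 32 units; not taken from any specific GPU.
    VERTEX_UNITS, PIXEL_UNITS = 8, 24
    TOTAL_UNITS = VERTEX_UNITS + PIXEL_UNITS

    # (vertex_work, pixel_work) in abstract work items; both scenes are made up.
    scenes = {"pixel-heavy": (100, 900), "vertex-heavy": (900, 100)}

    for name, (v_work, p_work) in scenes.items():
        t_fixed = fixed_pipeline_time(v_work, p_work, VERTEX_UNITS, PIXEL_UNITS)
        t_unified = unified_pipeline_time(v_work, p_work, TOTAL_UNITS)
        u_fixed = utilization(v_work + p_work, TOTAL_UNITS, t_fixed)
        u_unified = utilization(v_work + p_work, TOTAL_UNITS, t_unified)
        print(f"{name}: fixed {t_fixed} steps ({u_fixed:.0%} busy), "
              f"unified {t_unified} steps ({u_unified:.0%} busy)")
```

Under these assumptions, the vertex-heavy scene takes 113 steps on the fixed split because the 24 pixel units are mostly idle, while the unified pool finishes either scene in 32 steps. The model ignores scheduling overhead and data dependencies between stages, so it shows only the direction of the effect, not its real magnitude.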
