
PROGRAMMING WITH INTEL MMX TECHNOLOGY

Unsigned saturation arithmetic — With unsigned saturation arithmetic, out-of-range results are limited to the representable range of unsigned integers for the integer size. So, positive overflow when operating on unsigned byte integers results in FFH being returned and negative overflow results in 00H being returned.

Table 9-1. Data Range Limits for Saturation

Data Type        Lower Limit               Upper Limit
                 Hexadecimal   Decimal     Hexadecimal   Decimal
Signed Byte      80H           -128        7FH           127
Signed Word      8000H         -32,768     7FFFH         32,767
Unsigned Byte    00H           0           FFH           255
Unsigned Word    0000H         0           FFFFH         65,535

Saturation arithmetic provides an answer for many overflow situations. For example, in color calculations, saturation causes a color to remain pure black or pure white without allowing inversion. It also prevents wraparound artifacts from entering into computations when range checking of source operands is not used.

MMX instructions do not indicate overflow or underflow occurrence by generating exceptions or setting flags in the EFLAGS register.
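The difference between wraparound and saturation can be sketched in a few lines of Python. This is only an illustrative model of the arithmetic, not MMX code: the names `LIMITS`, `saturate`, and `wraparound` are hypothetical helpers introduced here, and the limits are copied from Table 9-1.

```python
# Saturation limits from Table 9-1, keyed by (signed, bits).
LIMITS = {
    (True, 8): (-128, 127),
    (True, 16): (-32768, 32767),
    (False, 8): (0, 255),
    (False, 16): (0, 65535),
}

def saturate(value, signed, bits):
    """Clamp an exact integer result to the representable range."""
    lo, hi = LIMITS[(signed, bits)]
    return max(lo, min(hi, value))

def wraparound(value, bits):
    """Keep only the low-order bits, as wraparound arithmetic does."""
    return value & ((1 << bits) - 1)

# Brightening a nearly white 8-bit color channel by 20H:
channel = 0xF0
print(hex(wraparound(channel + 0x20, 8)))       # 0x10, the bright value turns dark
print(hex(saturate(channel + 0x20, False, 8)))  # 0xff, the value stays pure white
```

In this model the overflow is visible only in the result value itself, which matches the statement above that no exception or flag reports it.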

9.4 MMX INSTRUCTIONS

The MMX instruction set consists of 47 instructions, grouped into the following categories:

Data transfer

Arithmetic

Comparison

Conversion

Unpacking

Logical

Shift

Empty MMX state instruction (EMMS)

Table 9-2 gives a summary of the instructions in the MMX instruction set. The following sections give a brief overview of the instructions within each group.

NOTES

The MMX instructions described in this chapter are those instructions that are available in an IA-32 processor when CPUID.01H:EDX.MMX[bit 23] = 1.

Section 10.4.4, “SSE 64-Bit SIMD Integer Instructions” and Section 11.4.2, “SSE2 64-Bit and 128-Bit SIMD Integer Instructions” list additional instructions included with SSE/SSE2 extensions that operate on the MMX registers but are not considered part of the MMX instruction set.
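As a small illustration of the feature-flag test, the following Python sketch checks bit 23 of an EDX value. It assumes the EDX value returned by CPUID leaf 01H has already been obtained by some other means, since Python cannot execute CPUID directly; `MMX_BIT` and `has_mmx` are hypothetical names.

```python
MMX_BIT = 23  # position of the MMX feature flag in EDX for CPUID leaf 01H

def has_mmx(cpuid_01h_edx):
    """Return True when the MMX feature flag is set in the given EDX value."""
    return bool((cpuid_01h_edx >> MMX_BIT) & 1)

# Made-up EDX values, used only to exercise the bit test:
print(has_mmx(0x00800000))  # True, bit 23 is set
print(has_mmx(0x007FFFFF))  # False, bit 23 is clear
```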

Table 9-2. MMX Instruction Set Summary

Category                                   Wraparound                       Signed Saturation   Unsigned Saturation
Arithmetic       Addition                  PADDB, PADDW, PADDD              PADDSB, PADDSW      PADDUSB, PADDUSW
                 Subtraction               PSUBB, PSUBW, PSUBD              PSUBSB, PSUBSW      PSUBUSB, PSUBUSW
                 Multiplication            PMULLW, PMULHW
                 Multiply and Add          PMADDWD
Comparison       Compare for Equal         PCMPEQB, PCMPEQW, PCMPEQD
                 Compare for Greater Than  PCMPGTB, PCMPGTW, PCMPGTD
Conversion       Pack                                                       PACKSSWB, PACKSSDW  PACKUSWB
Unpack           Unpack High               PUNPCKHBW, PUNPCKHWD, PUNPCKHDQ
                 Unpack Low                PUNPCKLBW, PUNPCKLWD, PUNPCKLDQ

Category                                   Packed                           Full Quadword
Logical          And                                                        PAND
                 And Not                                                    PANDN
                 Or                                                         POR
                 Exclusive OR                                               PXOR
Shift            Shift Left Logical        PSLLW, PSLLD                     PSLLQ
                 Shift Right Logical       PSRLW, PSRLD                     PSRLQ
                 Shift Right Arithmetic    PSRAW, PSRAD

Category                                   Doubleword Transfers             Quadword Transfers
Data Transfer    Register to Register      MOVD                             MOVQ
                 Load from Memory          MOVD                             MOVQ
                 Store to Memory           MOVD                             MOVQ
Empty MMX State                            EMMS

9.4.1 Data Transfer Instructions

The MOVD (Move 32 Bits) instruction transfers 32 bits of packed data from memory to an MMX register and vice versa; or from a general-purpose register to an MMX register and vice versa.

The MOVQ (Move 64 Bits) instruction transfers 64 bits of packed data from memory to an MMX register and vice versa; or transfers data between MMX registers.
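The widths involved in these transfers can be modeled with plain integers. The sketch below is an assumption-laden illustration rather than real register access: an MMX register is represented as a Python integer in the range 0 to 2^64 - 1, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def movd_to_mmx(value32):
    """MOVD into an MMX register: low 32 bits are loaded, upper 32 bits are zero."""
    return value32 & MASK32

def movd_from_mmx(mm):
    """MOVD out of an MMX register: only the low 32 bits are transferred."""
    return mm & MASK32

def movq(value64):
    """MOVQ: all 64 bits are transferred unchanged."""
    return value64 & MASK64

mm0 = movq(0x1122334455667788)
print(f"{movd_from_mmx(mm0):08X}")         # 55667788
print(f"{movd_to_mmx(0xDEADBEEF):016X}")   # 00000000DEADBEEF
```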

9.4.2 Arithmetic Instructions

The arithmetic instructions perform addition, subtraction, multiplication, and multiply/add operations on packed data types.

The PADDB/PADDW/PADDD (add packed integers) instructions and the PSUBB/PSUBW/PSUBD (subtract packed integers) instructions add or subtract the corresponding signed or unsigned data elements of the source and destination operands in wraparound mode. These instructions operate on packed byte, word, and doubleword data types.

The PADDSB/PADDSW (add packed signed integers with signed saturation) instructions and the PSUBSB/PSUBSW (subtract packed signed integers with signed saturation) instructions add or subtract the corresponding signed data elements of the source and destination operands and saturate the result to the limits of the signed data-type range. These instructions operate on packed byte and word data types.

The PADDUSB/PADDUSW (add packed unsigned integers with unsigned saturation) instructions and the PSUBUSB/PSUBUSW (subtract packed unsigned integers with unsigned saturation) instructions add or subtract the corresponding unsigned data elements of the source and destination operands and saturate the result to the limits of the unsigned data-type range. These instructions operate on packed byte and word data types.
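The three addition modes described above can be compared lane by lane with a short Python model. It is a sketch under stated assumptions: 64-bit operands are plain integers, `split`, `join`, `to_signed`, and `padd` are hypothetical helper names, and only addition is shown (subtraction follows the same pattern).

```python
def split(q, bits):
    """Split a 64-bit value into unsigned lanes, lowest-order lane first."""
    mask = (1 << bits) - 1
    return [(q >> i) & mask for i in range(0, 64, bits)]

def join(lanes, bits):
    """Reassemble lanes (lowest first) into one 64-bit value."""
    mask = (1 << bits) - 1
    q = 0
    for i, lane in enumerate(lanes):
        q |= (lane & mask) << (i * bits)
    return q

def to_signed(x, bits):
    """Reinterpret an unsigned lane as a two's complement signed integer."""
    return x - (1 << bits) if x >> (bits - 1) else x

def padd(a, b, bits, mode="wrap"):
    """mode: 'wrap' (PADDB/W/D), 'signed' (PADDSB/W), 'unsigned' (PADDUSB/W)."""
    out = []
    for x, y in zip(split(a, bits), split(b, bits)):
        if mode == "wrap":
            r = x + y  # join() discards the carry out of the lane
        elif mode == "signed":
            r = to_signed(x, bits) + to_signed(y, bits)
            r = max(-(1 << (bits - 1)), min((1 << (bits - 1)) - 1, r))
        else:
            r = min(x + y, (1 << bits) - 1)
        out.append(r)
    return join(out, bits)

# Byte lanes (low to high): 7FH+01H, FFH+01H, 80H+80H, remaining lanes zero.
a, b = 0x80FF7F, 0x800101
print(f"{padd(a, b, 8, 'wrap'):016X}")      # 0000000000000080
print(f"{padd(a, b, 8, 'signed'):016X}")    # 000000000080007F
print(f"{padd(a, b, 8, 'unsigned'):016X}")  # 0000000000FFFF80
```

The same bit patterns give three different results because only the saturating forms interpret the lanes as signed or unsigned before clamping.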

The PMULHW (multiply packed signed integers and store high result) and PMULLW (multiply packed signed integers and store low result) instructions perform a signed multiply of the corresponding words of the source and destination operands and write the high-order or low-order 16 bits of each of the results, respectively, to the destination operand.
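How the 32-bit product of each word pair is divided between the two instructions can be shown with a small Python model. The helper names `to_signed16` and `pmul_words` are hypothetical, and 64-bit operands are represented as plain integers.

```python
def to_signed16(x):
    """Reinterpret a 16-bit pattern as a two's complement signed integer."""
    return x - 0x10000 if x & 0x8000 else x

def pmul_words(a, b):
    """Return (PMULHW result, PMULLW result) for two packed-word operands."""
    high = low = 0
    for i in range(4):
        x = to_signed16((a >> (16 * i)) & 0xFFFF)
        y = to_signed16((b >> (16 * i)) & 0xFFFF)
        product = (x * y) & 0xFFFFFFFF           # 32-bit two's complement product
        high |= (product >> 16) << (16 * i)      # high-order 16 bits
        low |= (product & 0xFFFF) << (16 * i)    # low-order 16 bits
    return high, low

# Word lanes (low to high): 4000H * 4 and FFFFH (-1) * 2, remaining lanes zero.
high, low = pmul_words(0xFFFF4000, 0x00020004)
print(f"{high:016X}")  # 00000000FFFF0001
print(f"{low:016X}")   # 00000000FFFE0000
```

Running both instructions on the same operands and interleaving the two results is one way to recover the full 32-bit products.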

The PMADDWD (multiply and add packed integers) instruction computes the products of the corresponding signed words of the source and destination operands. The four intermediate 32-bit doubleword products are summed in pairs (high-order pair and low-order pair) to produce two 32-bit doubleword results.
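The pairing of products in PMADDWD can be modeled as follows. This is an illustrative Python sketch with hypothetical helper names; operands are plain 64-bit integers, and each sum is truncated to 32 bits.

```python
def to_signed16(x):
    """Reinterpret a 16-bit pattern as a two's complement signed integer."""
    return x - 0x10000 if x & 0x8000 else x

def pmaddwd(a, b):
    """Multiply four signed word pairs, then sum the products in adjacent pairs."""
    products = []
    for i in range(4):
        x = to_signed16((a >> (16 * i)) & 0xFFFF)
        y = to_signed16((b >> (16 * i)) & 0xFFFF)
        products.append(x * y)
    low = (products[0] + products[1]) & 0xFFFFFFFF    # low-order pair
    high = (products[2] + products[3]) & 0xFFFFFFFF   # high-order pair
    return (high << 32) | low

# Word lanes (low to high): a = 1, 2, 3, -4 and b = 10, 20, 30, 40.
a = 0xFFFC000300020001
b = 0x0028001E0014000A
print(f"{pmaddwd(a, b):016X}")  # FFFFFFBA00000032, that is -70 and 50
```

Because it produces sums of products directly, this operation maps naturally onto dot products and filter kernels.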

9.4.3 Comparison Instructions

The PCMPEQB/PCMPEQW/PCMPEQD (compare packed data for equal) instructions and the PCMPGTB/PCMPGTW/PCMPGTD (compare packed signed integers for greater than) instructions compare the corresponding signed data elements (bytes, words, or doublewords) in the source and destination operands for equal to or greater than, respectively.

These instructions generate a mask of ones or zeros which are written to the destination operand. Logical operations can use the mask to select packed elements. This can be used to implement a packed conditional move operation without a branch or a set of branch instructions. No flags in the EFLAGS register are affected.
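The branch-free select described above can be sketched as a packed signed maximum. This Python model is an illustration under assumptions: operands are 64-bit integers, the function names mirror the mnemonics but are hypothetical helpers, and `packed_max_words` is one possible way to combine them.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_signed16(x):
    """Reinterpret a 16-bit pattern as a two's complement signed integer."""
    return x - 0x10000 if x & 0x8000 else x

def pcmpgtw(dest, src):
    """Each word lane becomes FFFFH if dest > src (signed), otherwise 0000H."""
    mask = 0
    for i in range(4):
        d = to_signed16((dest >> (16 * i)) & 0xFFFF)
        s = to_signed16((src >> (16 * i)) & 0xFFFF)
        if d > s:
            mask |= 0xFFFF << (16 * i)
    return mask

def pand(dest, src):
    return dest & src

def pandn(dest, src):
    """AND NOT: the destination is inverted before the AND."""
    return (~dest & MASK64) & src

def por(dest, src):
    return dest | src

def packed_max_words(a, b):
    """Select a where a > b, else b, lane by lane, without branching per lane."""
    mask = pcmpgtw(a, b)
    return por(pand(mask, a), pandn(mask, b))

a = 0x0005FFFF80000001   # word lanes, low to high: 1, -32768, -1, 5
b = 0x000300007FFF0001   # word lanes, low to high: 1, 32767, 0, 3
print(f"{pcmpgtw(a, b):016X}")           # FFFF000000000000
print(f"{packed_max_words(a, b):016X}")  # 000500007FFF0001
```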

9.4.4 Conversion Instructions

The PACKSSWB (pack words into bytes with signed saturation) and PACKSSDW (pack doublewords into words with signed saturation) instructions convert signed words into signed bytes and signed doublewords into signed words, respectively, using signed saturation.

PACKUSWB (pack words into bytes with unsigned saturation) converts signed words into unsigned bytes, using unsigned saturation.
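The narrowing performed by the word-to-byte pack instructions can be modeled in Python as shown below. The sketch assumes that the four words of the destination fill the low four result bytes and the four words of the source fill the high four; `_pack_words` is a hypothetical shared helper.

```python
def to_signed16(x):
    """Reinterpret a 16-bit pattern as a two's complement signed integer."""
    return x - 0x10000 if x & 0x8000 else x

def _pack_words(dest, src, lo, hi):
    """Clamp eight signed words to [lo, hi] and pack them into eight bytes."""
    words = [(dest >> (16 * i)) & 0xFFFF for i in range(4)]
    words += [(src >> (16 * i)) & 0xFFFF for i in range(4)]
    result = 0
    for i, w in enumerate(words):
        value = max(lo, min(hi, to_signed16(w)))
        result |= (value & 0xFF) << (8 * i)
    return result

def packsswb(dest, src):
    return _pack_words(dest, src, -128, 127)   # signed saturation

def packuswb(dest, src):
    return _pack_words(dest, src, 0, 255)      # unsigned saturation

# Word lanes of dest, low to high: 100, 300, -1, -256; src is zero.
dest = 0xFF00FFFF012C0064
print(f"{packsswb(dest, 0):016X}")  # 0000000080FF7F64
print(f"{packuswb(dest, 0):016X}")  # 000000000000FF64
```

Note how the negative words saturate to 80H or stay as FFH under signed saturation, but clamp to 00H under unsigned saturation.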

9.4.5 Unpack Instructions

The PUNPCKHBW/PUNPCKHWD/PUNPCKHDQ (unpack high-order data elements) instructions and the PUNPCKLBW/PUNPCKLWD/PUNPCKLDQ (unpack low-order data elements) instructions unpack bytes, words, or doublewords from the high- or low-order data elements of the source and destination operands and interleave them in the destination operand. By placing all 0s in the source operand, these instructions can be used to convert byte integers to word integers, word integers to doubleword integers, or doubleword integers to quadword integers.
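The interleaving and the zero-extension trick can be illustrated for the byte-to-word case. The Python sketch below uses a hypothetical helper `punpck_bw` covering both PUNPCKLBW and PUNPCKHBW, with 64-bit operands as plain integers; destination bytes land in the low half of each result word and source bytes in the high half.

```python
def punpck_bw(dest, src, high=False):
    """Interleave four bytes of dest and src (PUNPCKLBW, or PUNPCKHBW if high)."""
    start = 4 if high else 0
    result = 0
    for i in range(4):
        d = (dest >> (8 * (start + i))) & 0xFF
        s = (src >> (8 * (start + i))) & 0xFF
        result |= d << (16 * i)        # destination byte: low half of the word
        result |= s << (16 * i + 8)    # source byte: high half of the word
    return result

dest = 0x8877665544332211
src = 0xA7A6A5A4A3A2A1A0
print(f"{punpck_bw(dest, src):016X}")           # A344A233A122A011
# With an all-zero source, the bytes become zero-extended words:
print(f"{punpck_bw(dest, 0):016X}")             # 0044003300220011
print(f"{punpck_bw(dest, 0, high=True):016X}")  # 0088007700660055
```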

9.4.6 Logical Instructions

PAND (bitwise logical AND), PANDN (bitwise logical AND NOT), POR (bitwise logical OR), and PXOR (bitwise logical exclusive OR) perform bitwise logical operations on the quadword source and destination operands.

9.4.7 Shift Instructions

The logical shift left, logical shift right and arithmetic shift right instructions shift each element by a specified number of bit positions.

The PSLLW/PSLLD/PSLLQ (shift packed data left logical) instructions and the PSRLW/PSRLD/PSRLQ (shift packed data right logical) instructions perform a logical left or right shift of the data elements and fill the empty high or low order bit positions with zeros. These instructions operate on packed words, doublewords, and quadwords.

The PSRAW/PSRAD (shift packed data right arithmetic) instructions perform an arithmetic right shift, copying the sign bit for each data element into empty bit positions on the upper end of each data element. These instructions operate on packed words and doublewords.
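The difference between the logical and arithmetic shifts is easiest to see on word lanes whose sign bit is set. The following Python model is a sketch only: `shift_words` is a hypothetical helper covering the word forms, and the shift count is assumed to be in the range 0 to 15.

```python
def to_signed16(x):
    """Reinterpret a 16-bit pattern as a two's complement signed integer."""
    return x - 0x10000 if x & 0x8000 else x

def shift_words(q, count, kind):
    """kind: 'sll' (PSLLW), 'srl' (PSRLW), or 'sra' (PSRAW)."""
    result = 0
    for i in range(4):
        w = (q >> (16 * i)) & 0xFFFF
        if kind == "sll":
            r = (w << count) & 0xFFFF            # zeros enter at the low end
        elif kind == "srl":
            r = w >> count                       # zeros enter at the high end
        else:
            r = (to_signed16(w) >> count) & 0xFFFF  # sign bit is replicated
        result |= r << (16 * i)
    return result

q = 0x80000001FFF00010   # word lanes, low to high: 0010H, FFF0H, 0001H, 8000H
print(f"{shift_words(q, 4, 'sll'):016X}")  # 00000010FF000100
print(f"{shift_words(q, 4, 'srl'):016X}")  # 080000000FFF0001
print(f"{shift_words(q, 4, 'sra'):016X}")  # F8000000FFFF0001
```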
