PROGRAMMING WITH INTEL® MMX™ TECHNOLOGY

7FFFH, which is the largest positive integer that can be represented in 16 bits; if negative overflow occurs, the result is saturated to 8000H.

Unsigned saturation arithmetic: With unsigned saturation arithmetic, out-of-range results are limited to the representable range of unsigned integers for the integer size. So, positive overflow when operating on unsigned byte integers results in FFH being returned and negative overflow results in 00H being returned.


Table 9-1. Data Range Limits for Saturation

Data Type       Lower Limit               Upper Limit
                Hexadecimal   Decimal     Hexadecimal   Decimal
Signed Byte     80H           -128        7FH           127
Signed Word     8000H         -32,768     7FFFH         32,767
Unsigned Byte   00H           0           FFH           255
Unsigned Word   0000H         0           FFFFH         65,535

Saturation arithmetic provides an answer for many overflow situations. For example, in color calculations, saturation causes a color to remain pure black or pure white without allowing inversion. It also prevents wraparound artifacts from entering into computations when range checking of source operands is not used.
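The difference between wraparound and saturation in the color example can be sketched with a scalar model of one unsigned byte. This is an illustration only: the helper names `saturate`, `add_u8_wrap`, and `add_u8_sat` are hypothetical and model the arithmetic, not the MMX instructions themselves.

```python
def saturate(value, lo, hi):
    """Clamp value to the inclusive range [lo, hi]."""
    return max(lo, min(hi, value))


def add_u8_wrap(a, b):
    """Unsigned byte add in wraparound mode: keep only the low 8 bits."""
    return (a + b) & 0xFF


def add_u8_sat(a, b):
    """Unsigned byte add with unsigned saturation (limits 00H and FFH)."""
    return saturate(a + b, 0x00, 0xFF)


# Brightening a near-white color channel (250) by 10:
print(add_u8_wrap(250, 10))  # 4: wraps around to near-black (inversion)
print(add_u8_sat(250, 10))   # 255: stays pure white
```

The same `saturate` helper covers the other rows of Table 9-1 by changing the limits, for example -32768 and 32767 for signed words.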

MMX instructions do not indicate overflow or underflow occurrence by generating exceptions or setting flags in the EFLAGS register.

9.4 MMX INSTRUCTIONS

The MMX instruction set consists of 47 instructions, grouped into the following categories:

• Data transfer
• Arithmetic
• Comparison
• Conversion
• Unpacking
• Logical
• Shift
• Empty MMX state instruction (EMMS)

Table 9-2 gives a summary of the instructions in the MMX instruction set. The following sections give a brief overview of the instructions within each group.


NOTES

The MMX instructions described in this chapter are those instructions that are available in an IA-32 processor when CPUID.01H:EDX.MMX[bit 23] = 1.

Section 10.4.4, “SSE 64-Bit SIMD Integer Instructions,” and Section 11.4.2, “SSE2 64-Bit and 128-Bit SIMD Integer Instructions,” list additional instructions included with SSE/SSE2 extensions that operate on the MMX registers but are not considered part of the MMX instruction set.
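As a small illustration of the feature check, the sketch below tests the MMX flag (bit 23) in an EDX value returned by CPUID leaf 01H. Python cannot execute CPUID directly, so the EDX value is assumed to have been obtained elsewhere. The name `has_mmx` and the sample values are hypothetical.

```python
MMX_BIT = 23  # position of the MMX feature flag in EDX for CPUID leaf 01H


def has_mmx(edx):
    """Return True if the MMX feature flag is set in a CPUID.01H EDX value."""
    return bool((edx >> MMX_BIT) & 1)


# Hypothetical EDX values, for illustration only.
print(has_mmx(1 << 23))  # True: bit 23 is set
print(has_mmx(0))        # False: bit 23 is clear
```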

Table 9-2. MMX Instruction Set Summary

Category                                   Wraparound                        Signed Saturation   Unsigned Saturation
Arithmetic       Addition                  PADDB, PADDW, PADDD               PADDSB, PADDSW      PADDUSB, PADDUSW
                 Subtraction               PSUBB, PSUBW, PSUBD               PSUBSB, PSUBSW      PSUBUSB, PSUBUSW
                 Multiplication            PMULLW, PMULHW
                 Multiply and Add          PMADDWD
Comparison       Compare for Equal         PCMPEQB, PCMPEQW, PCMPEQD
                 Compare for Greater Than  PCMPGTB, PCMPGTW, PCMPGTD
Conversion       Pack                                                        PACKSSWB, PACKSSDW  PACKUSWB
Unpack           Unpack High               PUNPCKHBW, PUNPCKHWD, PUNPCKHDQ
                 Unpack Low                PUNPCKLBW, PUNPCKLWD, PUNPCKLDQ

Category                                   Packed                            Full Quadword
Logical          And                                                         PAND
                 And Not                                                     PANDN
                 Or                                                          POR
                 Exclusive OR                                                PXOR
Shift            Shift Left Logical        PSLLW, PSLLD                      PSLLQ
                 Shift Right Logical       PSRLW, PSRLD                      PSRLQ
                 Shift Right Arithmetic    PSRAW, PSRAD

Category                                   Doubleword Transfers              Quadword Transfers
Data Transfer    Register to Register      MOVD                              MOVQ
                 Load from Memory          MOVD                              MOVQ
                 Store to Memory           MOVD                              MOVQ
Empty MMX State                            EMMS

9.4.1 Data Transfer Instructions

The MOVD (Move 32 Bits) instruction transfers 32 bits of packed data from memory to an MMX register and vice versa; or from a general-purpose register to an MMX register and vice versa.

The MOVQ (Move 64 Bits) instruction transfers 64 bits of packed data from memory to an MMX register and vice versa; or transfers data between MMX registers.
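The effect of the two transfer widths on a 64-bit MMX register can be sketched as a bit-level model. The function names are hypothetical. The zeroing of the upper 32 bits when MOVD loads an MMX register is taken from the instruction's reference description and is not stated in this section.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def movd_to_mmx(value32):
    """MOVD into an MMX register: low 32 bits loaded, upper 32 bits zeroed."""
    return value32 & MASK32


def movd_from_mmx(mm):
    """MOVD out of an MMX register: only the low 32 bits are transferred."""
    return mm & MASK32


def movq(value64):
    """MOVQ: all 64 bits are transferred unchanged."""
    return value64 & MASK64


mm0 = movq(0x1122334455667788)
print(hex(movd_from_mmx(mm0)))        # 0x55667788
print(hex(movd_to_mmx(0xDEADBEEF)))   # 0xdeadbeef (upper half is zero)
```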

9.4.2 Arithmetic Instructions

The arithmetic instructions perform addition, subtraction, multiplication, and multiply/add operations on packed data types.

The PADDB/PADDW/PADDD (add packed integers) instructions and the PSUBB/PSUBW/PSUBD (subtract packed integers) instructions add or subtract the corresponding signed or unsigned data elements of the source and destination operands in wraparound mode. These instructions operate on packed byte, word, and doubleword data types.

The PADDSB/PADDSW (add packed signed integers with signed saturation) instructions and the PSUBSB/PSUBSW (subtract packed signed integers with signed saturation) instructions add or subtract the corresponding signed data elements of the source and destination operands and saturate the result to the limits of the signed data-type range. These instructions operate on packed byte and word data types.

The PADDUSB/PADDUSW (add packed unsigned integers with unsigned saturation) instructions and the PSUBUSB/PSUBUSW (subtract packed unsigned integers with unsigned saturation) instructions add or subtract the corresponding unsigned data elements of the source and destination operands and saturate the result to the limits of the unsigned data-type range. These instructions operate on packed byte and word data types.
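The three addition modes can be compared side by side with a lane-by-lane model of a 64-bit operand. This is a sketch, not the hardware definition: `split_lanes`, `join_lanes`, and `packed_add` are hypothetical names, and the `mode` strings are an assumption of this example.

```python
def split_lanes(value, width, signed):
    """Split a 64-bit integer into lanes of `width` bits, lowest lane first."""
    lanes = []
    for i in range(64 // width):
        lane = (value >> (i * width)) & ((1 << width) - 1)
        if signed and lane >= 1 << (width - 1):
            lane -= 1 << width  # reinterpret the bits as two's complement
        lanes.append(lane)
    return lanes


def join_lanes(lanes, width):
    """Pack lanes (lowest first) back into one 64-bit integer."""
    value = 0
    for i, lane in enumerate(lanes):
        value |= (lane & ((1 << width) - 1)) << (i * width)
    return value


def packed_add(dst, src, width, mode):
    """mode: 'wrap' (like PADDB), 'signed' (like PADDSB), 'unsigned' (like PADDUSB)."""
    signed = mode == "signed"
    if signed:
        lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    else:
        lo, hi = 0, (1 << width) - 1
    out = []
    for a, b in zip(split_lanes(dst, width, signed), split_lanes(src, width, signed)):
        total = a + b
        if mode != "wrap":
            total = max(lo, min(hi, total))  # saturate to the data-type range
        out.append(total)
    return join_lanes(out, width)  # masking here produces the wraparound


# Two low byte lanes: FFH + 01H and 7FH + 01H.
print(hex(packed_add(0x7FFF, 0x0101, 8, "wrap")))      # 0x8000
print(hex(packed_add(0x7FFF, 0x0101, 8, "signed")))    # 0x7f00
print(hex(packed_add(0x7FFF, 0x0101, 8, "unsigned")))  # 0x80ff
```

Each lane is handled independently, so an overflow in one byte never carries into its neighbor.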


The PMULHW (multiply packed signed integers and store high result) and PMULLW (multiply packed signed integers and store low result) instructions perform a signed multiply of the corresponding words of the source and destination operands and write the high-order or low-order 16 bits of each of the results, respectively, to the destination operand.
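A single-lane model shows how the high and low halves relate to the full 32-bit product. The names `pmullw_lane`, `pmulhw_lane`, and `full_product` are hypothetical, and inputs are raw 16-bit patterns.

```python
def to_signed16(x):
    """Interpret the low 16 bits of x as a two's complement word."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def pmullw_lane(a, b):
    """Low-order 16 bits of the signed 16x16 product."""
    return (to_signed16(a) * to_signed16(b)) & 0xFFFF


def pmulhw_lane(a, b):
    """High-order 16 bits of the signed 16x16 product."""
    return ((to_signed16(a) * to_signed16(b)) >> 16) & 0xFFFF


def full_product(a, b):
    """Recombine the two halves into a signed 32-bit product."""
    combined = (pmulhw_lane(a, b) << 16) | pmullw_lane(a, b)
    return combined - (1 << 32) if combined & 0x80000000 else combined


a, b = -300 & 0xFFFF, 200
print(hex(pmulhw_lane(a, b)))  # 0xffff
print(hex(pmullw_lane(a, b)))  # 0x15a0
print(full_product(a, b))      # -60000
```

Recovering the full product this way takes both instructions plus an unpack step to interleave the halves.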

The PMADDWD (multiply and add packed integers) instruction computes the products of the corresponding signed words of the source and destination operands. The four intermediate 32-bit doubleword products are summed in pairs (high-order pair and low-order pair) to produce two 32-bit doubleword results.
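The pairwise summation can be sketched with lists of four signed words, lowest lane first. The names `pmaddwd` and `wrap32` are hypothetical. The 32-bit wrap is included because the sum can exceed the signed doubleword range in one corner case (all four words equal to 8000H).

```python
def wrap32(x):
    """Wrap an integer to a signed 32-bit doubleword."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def pmaddwd(dst_words, src_words):
    """Multiply four signed word pairs, then sum adjacent products.

    Returns two signed doubleword results, lowest first.
    """
    products = [a * b for a, b in zip(dst_words, src_words)]
    return [wrap32(products[0] + products[1]),
            wrap32(products[2] + products[3])]


print(pmaddwd([1, 2, 3, 4], [5, 6, 7, 8]))  # [17, 53]
```

Two such results added together give a four-element dot product, which is the typical use of this instruction.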

9.4.3 Comparison Instructions

The PCMPEQB/PCMPEQW/PCMPEQD (compare packed data for equal) instructions and the PCMPGTB/PCMPGTW/PCMPGTD (compare packed signed integers for greater than) instructions compare the corresponding signed data elements (bytes, words, or doublewords) in the source and destination operands for equal to or greater than, respectively.

These instructions generate a mask of ones or zeros, which is written to the destination operand. Logical operations can use the mask to select packed elements. This can be used to implement a packed conditional move operation without a branch or a set of branch instructions. No flags in the EFLAGS register are affected.
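A branch-free packed maximum illustrates the mask-and-select idiom: a greater-than compare produces the mask, and AND, AND NOT, and OR combine the two operands. This is a sketch over lists of raw 16-bit word patterns. The names `pcmpgtw` and `packed_max` are hypothetical.

```python
def to_signed16(x):
    """Interpret a 16-bit pattern as a two's complement word."""
    return x - 0x10000 if x & 0x8000 else x


def pcmpgtw(a, b):
    """Per-lane mask: FFFFH where a > b (signed compare), else 0000H."""
    return [0xFFFF if to_signed16(x) > to_signed16(y) else 0x0000
            for x, y in zip(a, b)]


def packed_max(a, b):
    """Select a where the mask is all ones, b where it is all zeros."""
    mask = pcmpgtw(a, b)
    return [(x & m) | (y & ~m & 0xFFFF) for m, x, y in zip(mask, a, b)]


a = [9, 0xFFFF, 100, 3]       # signed values: 9, -1, 100, 3
b = [7, 0xFFFE, 200, 0x8000]  # signed values: 7, -2, 200, -32768
print([hex(m) for m in pcmpgtw(a, b)])     # ['0xffff', '0xffff', '0x0', '0xffff']
print([hex(v) for v in packed_max(a, b)])  # ['0x9', '0xffff', '0xc8', '0x3']
```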

9.4.4 Conversion Instructions

The PACKSSWB (pack words into bytes with signed saturation) and PACKSSDW (pack doublewords into words with signed saturation) instructions convert signed words into signed bytes and signed doublewords into signed words, respectively, using signed saturation.

PACKUSWB (pack words into bytes with unsigned saturation) converts signed words into unsigned bytes, using unsigned saturation.
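The narrowing step can be sketched with lists of signed words, lowest lane first. The names are hypothetical. The lane order (destination words fill the low four result bytes, source words the high four) follows the instruction reference and is not stated in this section.

```python
def clamp(value, lo, hi):
    """Saturate value to the inclusive range [lo, hi]."""
    return max(lo, min(hi, value))


def packsswb(dst_words, src_words):
    """Eight signed words to eight signed bytes, with signed saturation."""
    return [clamp(w, -128, 127) for w in dst_words + src_words]


def packuswb(dst_words, src_words):
    """Eight signed words to eight unsigned bytes, with unsigned saturation."""
    return [clamp(w, 0, 255) for w in dst_words + src_words]


dst = [300, -300, 5, -5]
src = [127, 128, -128, -129]
print(packsswb(dst, src))  # [127, -128, 5, -5, 127, 127, -128, -128]
print(packuswb(dst, src))  # [255, 0, 5, 0, 127, 128, 0, 0]
```

Note that PACKUSWB treats its inputs as signed, so every negative word saturates to 0.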

9.4.5 Unpack Instructions

The PUNPCKHBW/PUNPCKHWD/PUNPCKHDQ (unpack high-order data elements) instructions and the PUNPCKLBW/PUNPCKLWD/PUNPCKLDQ (unpack low-order data elements) instructions unpack bytes, words, or doublewords from the high- or low-order data elements of the source and destination operands and interleave them in the destination operand. By placing all 0s in the source operand, these instructions can be used to convert byte integers to word integers, word integers to doubleword integers, or doubleword integers to quadword integers.
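The zero-extension trick can be seen in a byte-level model of the low-order unpack. Lists hold eight bytes, lowest lane first. The names `punpcklbw` and `bytes_to_words` are hypothetical.

```python
def punpcklbw(dst_bytes, src_bytes):
    """Interleave the low four bytes of dst and src, destination byte first."""
    out = []
    for d, s in zip(dst_bytes[:4], src_bytes[:4]):
        out += [d, s]
    return out


def bytes_to_words(byte_lanes):
    """View eight byte lanes as four little-endian word lanes."""
    return [byte_lanes[i] | (byte_lanes[i + 1] << 8)
            for i in range(0, len(byte_lanes), 2)]


dst = [0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88]
zero = [0] * 8  # source operand of all 0s
result = punpcklbw(dst, zero)
print([hex(w) for w in bytes_to_words(result)])  # ['0x11', '0x22', '0x33', '0x44']
```

Each source byte lands in the upper half of a word, so an all-zero source turns each low-order destination byte into a zero-extended word. The high-order variant would use the slice `[4:]` in place of `[:4]`.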
