Добавил:
Upload Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:
Лаб2012 / 319433-011.pdf
Скачиваний:
27
Добавлен:
02.02.2015
Размер:
2.31 Mб
Скачать

INSTRUCTION SET REFERENCE - VEX-ENCODED GPR INSTRUCTIONS

BZHI - Zero High Bits Starting with Specified Bit Position

Opcode/

Op/

64/32

CPUID

Description

Instruction

En

-bit

Feature

 

 

 

Mode

Flag

 

VEX.NDS1.LZ.0F38.W0 F5 /r

A

V/V

BMI2

Zero bits in r/m32 starting with the

BZHI r32a, r/m32, r32b

 

 

 

position in r32b, write result to

 

 

 

r32a.

 

 

 

 

VEX.NDS1.LZ.0F38.W1 F5 /r

A

V/N.E.

BMI2

Zero bits in r/m64 starting with the

BZHI r64a, r/m64, r64b

 

 

 

position in r64b, write result to

 

 

 

r64a.

 

 

 

 

 

 

 

 

 

NOTES:

1.ModRM:r/m is used to encode the first source operand (second operand) and VEX.vvvv encodes the second source operand (third operand).

Instruction Operand Encoding

Op/En

Operand 1

Operand 2

Operand 3

Operand 4

A

ModRM:reg (W)

ModRM:r/m (R)

VEX.vvvv (R)

NA

 

 

 

 

 

Description

BZHI copies the bits of the first source operand (the second operand) into the destination operand (the first operand) and clears the higher bits in the destination according to the INDEX value specified by the second source operand (the third operand). The INDEX is specified by bits 7:0 of the second source operand. The INDEX value is saturated at the value of OperandSize -1. CF is set, if the number contained in the 8 low bits of the third operand is greater than OperandSize -1.

This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.

Operation

N ← SRC2[7:0]

DEST ← SRC1

IF (N < OperandSize)

DEST[OperandSize-1:N] ← 0

FI

IF (N > OperandSize - 1)

CF ← 1

ELSE

CF ← 0

7-12

Ref. # 319433-011

INSTRUCTION SET REFERENCE - VEX-ENCODED GPR INSTRUCTIONS

FI

Flags Affected

ZF, CF and SF flags are updated based on the result. OF flag is cleared. AF and PF flags are undefined.

Intel C/C++ Compiler Intrinsic Equivalent

BZHI unsigned __int32 _bzhi_u32(unsigned __int32 src, unsigned __int32 index); BZHI unsigned __int64 _bzhi_u64(unsigned __int64 src, unsigned __int32 index);

SIMD Floating-Point Exceptions

None

Other Exceptions

See Table 2-22.

Ref. # 319433-011

7-13

INSTRUCTION SET REFERENCE - VEX-ENCODED GPR INSTRUCTIONS

LZCNT— Count the Number of Leading Zero Bits

Opcode/

Op/

64/32

CPUID

Description

Instruction

En

-bit

Feature

 

 

 

Mode

Flag

 

F3 0F BD /r

A

V/V

LZCNT

Count the number of leading zero

LZCNT r16, r/m16

 

 

 

bits in r/m16, return result in r16.

F3 0F BD /r

A

V/V

LZCNT

Count the number of leading zero

LZCNT r32, r/m32

 

 

 

bits in r/m32, return result in r32.

REX.W + F3 0F BD /r

A

V/N.E.

LZCNT

Count the number of leading zero

LZCNT r64, r/m64

 

 

 

bits in r/m64, return result in r64.

 

 

 

 

 

Instruction Operand Encoding

Op/En

Operand 1

Operand 2

Operand 3

Operand 4

A

ModRM:reg (W)

ModRM:r/m (R)

NA

NA

 

 

 

 

 

Description

Counts the number of leading most significant zero bits in a source operand (second operand) returning the result into a destination (first operand).

LZCNT is an extension of the BSR instruction. The key difference between LZCNT and BSR is that LZCNT provides operand size as output when source operand is zero, while in the case of BSR instruction, if source operand is zero, the content of destination operand are undefined. On processors that do not support LZCNT, the instruction byte encoding is executed as BSR.

In 64-bit mode 64-bit operand size requires REX.W=1.

Operation

temp ← OperandSize - 1 DEST ← 0

WHILE (temp >= 0) AND (Bit(SRC, temp) = 0) DO

temp ← temp - 1 DEST ← DEST+ 1

OD

IF DEST = OperandSize

CF ← 1

ELSE

7-14

Ref. # 319433-011

INSTRUCTION SET REFERENCE - VEX-ENCODED GPR INSTRUCTIONS

CF ← 0

FI

IF DEST = 0

ZF ← 1

ELSE

ZF ← 0

FI

Flags Affected

ZF flag is set to 1 in case of zero output (most significant bit of the source is set), and to 0 otherwise, CF flag is set to 1 if input was zero and cleared otherwise. OF, SF, PF and AF flags are undefined.

Intel C/C++ Compiler Intrinsic Equivalent

LZCNT unsigned __int32 _lzcnt_u32(unsigned __int32 src);

LZCNT unsigned __int64 _lzcnt_u64(unsigned __int64 src);

SIMD Floating-Point Exceptions

None

Other Exceptions

See Table 2-21.

Ref. # 319433-011

7-15

INSTRUCTION SET REFERENCE - VEX-ENCODED GPR INSTRUCTIONS

MULX — Unsigned Multiply Without Affecting Flags

Opcode/

Op/

64/32

CPUID

Description

Instruction

En

-bit

Feature

 

 

 

Mode

Flag

 

VEX.NDD.LZ.F2.0F38.W0 F6 /r

A

V/V

BMI2

Unsigned multiply of r/m32 with

MULX r32a, r32b, r/m32

 

 

 

EDX without affecting arithmetic

 

 

 

 

flags.

VEX.NDD.LZ.F2.0F38.W1 F6 /r

A

V/N.E.

BMI2

Unsigned multiply of r/m64 with

MULX r64a, r64b, r/m64

 

 

 

RDX without affecting arithmetic

 

 

 

 

flags.

Instruction Operand Encoding

Op/En

Operand 1

Operand 2

Operand 3

Operand 4

A

ModRM:reg (W)

VEX.vvvv (W)

ModRM:r/m (R)

RDX/EDX is implied 64/32

bits source

 

 

 

 

 

 

 

 

 

Description

Performs an unsigned multiplication of the implicit source operand (EDX/RDX) and the specified source operand (the third operand) and stores the low half of the result in the second destination (second operand), the high half of the result in the first destination operand (first operand), without reading or writing the arithmetic flags. This enables efficient programming where the software can interleave add with carry operations and multiplications.

If the first and second operand are identical, it will contain the high half of the multiplication result.

This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.

Operation

//DEST1: ModRM:reg

//DEST2: VEX.vvvv IF (OperandSize = 32)

SRC1 ← EDX;

DEST2 ← (SRC1*SRC2)[31:0]; DEST1 ← (SRC1*SRC2)[63:32];

ELSE IF (OperandSize = 64) SRC1 ← RDX;

DEST2 ← (SRC1*SRC2)[63:0];

7-16

Ref. # 319433-011

INSTRUCTION SET REFERENCE - VEX-ENCODED GPR INSTRUCTIONS

DEST1 ← (SRC1*SRC2)[127:64];

FI

Flags Affected

None

Intel C/C++ Compiler Intrinsic Equivalent

Auto-generated from high-level language.

SIMD Floating-Point Exceptions

None

Other Exceptions

See Table 2-22.

Ref. # 319433-011

7-17

INSTRUCTION SET REFERENCE - VEX-ENCODED GPR INSTRUCTIONS

PDEP — Parallel Bits Deposit

Opcode/

Op/

64/32

CPUID

Description

Instruction

En

-bit

Feature

 

 

 

Mode

Flag

 

VEX.NDS.LZ.F2.0F38.W0 F5 /r

A

V/V

BMI2

Parallel deposit of bits from r32b

PDEP r32a, r32b, r/m32

 

 

 

using mask in r/m32, result is writ-

 

 

 

 

ten to r32a.

VEX.NDS.LZ.F2.0F38.W1 F5 /r

A

V/N.E.

BMI2

Parallel deposit of bits from r64b

PDEP r64a, r64b, r/m64

 

 

 

using mask in r/m64, result is writ-

 

 

 

 

ten to r64a

Instruction Operand Encoding

Op/En

Operand 1

Operand 2

Operand 3

Operand 4

A

ModRM:reg (W)

VEX.vvvv (R)

ModRM:r/m (R)

NA

 

 

 

 

 

Description

PDEP uses a mask in the second source operand (the third operand) to transfer/scatter contiguous low order bits in the first source operand (the second operand) into the destination (the first operand). PDEP takes the low bits from the first source operand and deposit them in the destination operand at the corresponding bit locations that are set in the second source operand (mask). All other bits (bits not set in mask) in destination are set to zero.

SRC1

S31 S30

S29 S28 S27

SRC2

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

0

 

0

0

1

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(mask)

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

DEST

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

0

0

0

S3

0

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

bit 31

S7 S6 S5 S4

S3 S2 S1 S0

0

S

2

0

S1

0

0

S

0

0

0

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

bit 0

Figure 7-1. PDEP Example

7-18

Ref. # 319433-011

INSTRUCTION SET REFERENCE - VEX-ENCODED GPR INSTRUCTIONS

This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.

Operation

TEMP ← SRC1;

MASK ← SRC2; DEST ← 0 ; m← 0, k← 0;

DO WHILE m< OperandSize

IF MASK[ m] = 1 THEN DEST[ m] ← TEMP[ k]; k ← k+ 1;

FI

m ← m+ 1;

OD

Flags Affected

None.

Intel C/C++ Compiler Intrinsic Equivalent

PDEP unsigned __int32 _pdep_u32(unsigned __int32 src, unsigned __int32 mask);

PDEP unsigned __int64 _pdep_u64(unsigned __int64 src, unsigned __int32 mask);

SIMD Floating-Point Exceptions

None

Other Exceptions

See Table 2-22.

Ref. # 319433-011

7-19

INSTRUCTION SET REFERENCE - VEX-ENCODED GPR INSTRUCTIONS

PEXT— Parallel Bits Extract

Opcode/

Op/

64/32

CPUID

Description

Instruction

En

-bit

Feature

 

 

 

Mode

Flag

 

VEX.NDS.LZ.F3.0F38.W0 F5 /r

A

V/V

BMI2

Parallel extract of bits from r32b

PEXT r32a, r32b, r/m32

 

 

 

using mask in r/m32, result is writ-

 

 

 

 

ten to r32a.

VEX.NDS.LZ.F3.0F38.W1 F5 /r

A

V/N.E.

BMI2

Parallel extract of bits from r64b

PEXT r64a, r64b, r/m64

 

 

 

using mask in r/m64, result is writ-

 

 

 

 

ten to r64a.

Instruction Operand Encoding

Op/En

Operand 1

Operand 2

Operand 3

Operand 4

A

ModRM:reg (W)

VEX.vvvv (R)

ModRM:r/m (R)

NA

 

 

 

 

 

Description

PEXT uses a mask in the second source operand (the third operand) to transfer either contiguous or non-contiguous bits in the first source operand (the second operand) to contiguous low order bit positions in the destination (the first operand). For each bit set in the MASK, PEXT extracts the corresponding bits from the first source operand and writes them into contiguous lower bits of destination operand. The remaining upper bits of destination are zeroed.

SRC1

S

31

S

30

S

29

S

S

 

 

 

 

28

27

SRC2

0

 

0

0

1

0

(mask)

 

DEST

0

 

0

 

0

0

0

bit 31

 

 

 

 

 

S7 S6 S5 S4

S3 S2 S1 S0

1

0

1

0

0

1

0

0

0

0

0

0

S

S

S

5

S2

 

 

 

 

28

7

 

 

 

 

 

 

 

 

 

 

bit 0

Figure 7-2. PEXT Example

7-20

Ref. # 319433-011

INSTRUCTION SET REFERENCE - VEX-ENCODED GPR INSTRUCTIONS

This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.

Operation

TEMP ← SRC1;

MASK ← SRC2; DEST ← 0 ; m← 0, k← 0;

DO WHILE m< OperandSize

IF MASK[ m] = 1 THEN DEST[ k] ← TEMP[ m]; k ← k+ 1;

FI

m ← m+ 1;

OD

Flags Affected

None.

Intel C/C++ Compiler Intrinsic Equivalent

PEXT unsigned __int32 _pext_u32(unsigned __int32 src, unsigned __int32 mask);

PEXT unsigned __int64 _pext_u64(unsigned __int64 src, unsigned __int32 mask);

SIMD Floating-Point Exceptions

None

Other Exceptions

See Table 2-22.

Ref. # 319433-011

7-21

INSTRUCTION SET REFERENCE - VEX-ENCODED GPR INSTRUCTIONS

RORX — Rotate Right Logical Without Affecting Flags

Opcode/

Op/

64/32

CPUID

Description

Instruction

En

-bit

Feature

 

 

 

Mode

Flag

 

VEX.LZ.F2.0F3A.W0 F0 /r ib

A

V/V

BMI2

Rotate 32-bit r/m32 right imm8

RORX r32, r/m32, imm8

 

 

 

times without affecting arith-

 

 

 

 

metic flags.

VEX.LZ.F2.0F3A.W1 F0 /r ib

A

V/N.E.

BMI2

Rotate 64-bit r/m64 right imm8

RORX r64, r/m64, imm8

 

 

 

times without affecting arith-

 

 

 

 

metic flags.

Instruction Operand Encoding

Op/En

Operand 1

Operand 2

Operand 3

Operand 4

A

ModRM:reg (W)

ModRM:r/m (R)

NA

NA

 

 

 

 

 

Description

Rotates the bits of second operand right by the count value specified in imm8 without affecting arithmetic flags. The RORX instruction does not read or write the arithmetic flags.

This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.

Operation

IF (OperandSize = 32)

y ← imm8 AND 1FH;

DEST ← (SRC >> y) | (SRC << (32-y)); ELSEIF (OperandSize = 64 )

y ← imm8 AND 3FH;

DEST ← (SRC >> y) | (SRC << (64-y)); ENDIF

Flags Affected

None

Intel C/C++ Compiler Intrinsic Equivalent

Auto-generated from high-level language.

7-22

Ref. # 319433-011

INSTRUCTION SET REFERENCE - VEX-ENCODED GPR INSTRUCTIONS

SIMD Floating-Point Exceptions

None

Other Exceptions

See Table 2-22.

Ref. # 319433-011

7-23

INSTRUCTION SET REFERENCE - VEX-ENCODED GPR INSTRUCTIONS

SARX/SHLX/SHRXShift Without Affecting Flags

Opcode/

Op/

64/32

CPUID

Description

Instruction

En

-bit

Feature

 

 

 

Mode

Flag

 

VEX.NDS1.LZ.F3.0F38.W0 F7 /r

A

V/V

BMI2

Shift r/m32 arithmetically right

SARX r32a, r/m32, r32b

 

 

 

with count specified in r32b.

 

 

 

 

VEX.NDS1.LZ.66.0F38.W0 F7 /r

A

V/V

BMI2

Shift r/m32 logically left with

SHLX r32a, r/m32, r32b

 

 

 

count specified in r32b.

 

 

 

 

VEX.NDS1.LZ.F2.0F38.W0 F7 /r

A

V/V

BMI2

Shift r/m32 logically right with

SHRX r32a, r/m32, r32b

 

 

 

count specified in r32b.

 

 

 

 

VEX.NDS1.LZ.F3.0F38.W1 F7 /r

A

V/N.E.

BMI2

Shift r/m64 arithmetically right

SARX r64a, r/m64, r64b

 

 

 

with count specified in r64b.

 

 

 

 

VEX.NDS1.LZ.66.0F38.W1 F7 /r

A

V/N.E.

BMI2

Shift r/m64 logically left with

SHLX r64a, r/m64, r64b

 

 

 

count specified in r64b.

 

 

 

 

VEX.NDS1.LZ.F2.0F38.W1 F7 /r

A

V/N.E.

BMI2

Shift r/m64 logically right with

SHRX r64a, r/m64, r64b

 

 

 

count specified in r64b.

 

 

 

 

 

 

 

 

 

NOTES:

1.ModRM:r/m is used to encode the first source operand (second operand) and VEX.vvvv encodes the second source operand (third operand).

Instruction Operand Encoding

Op/En

Operand 1

Operand 2

Operand 3

Operand 4

A

ModRM:reg (W)

ModRM:r/m (R)

VEX.vvvv (R)

NA

 

 

 

 

 

Description

Shifts the bits of the first source operand (the second operand) to the left or right by a COUNT value specified in the second source operand (the third operand). The result is written to the destination operand (the first operand).

The shift arithmetic right (SARX) and shift logical right (SHRX) instructions shift the bits of the destination operand to the right (toward less significant bit locations), SARX keeps and propagates the most significant bit (sign bit) while shifting.

The logical shift left (SHLX) shifts the bits of the destination operand to the left (toward more significant bit locations).

7-24

Ref. # 319433-011

INSTRUCTION SET REFERENCE - VEX-ENCODED GPR INSTRUCTIONS

This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.

If the value specified in the first source operand exceeds OperandSize -1, the COUNT value is masked.

SARX,SHRX, and SHLX instructions do not update flags.

Operation

TEMP ← SRC1;

IF VEX.W1 and CS.L = 1 THEN

countMASK ←3FH; ELSE

countMASK ←1FH;

FI

COUNT ← (SRC2 AND countMASK)

DEST[OperandSize -1] = TEMP[OperandSize -1];

DO WHILE (COUNT != 0)

IF instruction is SHLX

THEN

DEST[] ← DEST *2;

ELSE IF instruction is SHRX

THEN

 

 

DEST[] ← DEST /2; //unsigned divide

;

ELSE

// SARX

DEST[] ← DEST /2; // signed divide, round toward negative infinity

FI;

COUNT ← COUNT - 1;

OD

Flags Affected

None.

Intel C/C++ Compiler Intrinsic Equivalent

Auto-generated from high-level language.

SIMD Floating-Point Exceptions

None

Ref. # 319433-011

7-25

INSTRUCTION SET REFERENCE - VEX-ENCODED GPR INSTRUCTIONS

Other Exceptions

See Table 2-22.

7-26

Ref. # 319433-011

INSTRUCTION SET REFERENCE - VEX-ENCODED GPR INSTRUCTIONS

TZCNT — Count the Number of Trailing Zero Bits

Opcode/

Op/

64/32

CPUID

Description

Instruction

En

-bit

Feature

 

 

 

Mode

Flag

 

F3 0F BC /r

A

V/V

BMI1

Count the number of trailing zero

TZCNT r16, r/m16

 

 

 

bits in r/m16, return result in r16.

F3 0F BC /r

A

V/V

BMI1

Count the number of trailing zero

TZCNT r32, r/m32

 

 

 

bits in r/m32, return result in r32

REX.W + F3 0F BC /r

A

V/N.E.

BMI1

Count the number of trailing zero

TZCNT r64, r/m64

 

 

 

bits in r/m64, return result in r64.

 

 

 

 

 

Instruction Operand Encoding

Op/En

Operand 1

Operand 2

Operand 3

Operand 4

A

ModRM:reg (W)

ModRM:r/m (R)

NA

NA

 

 

 

 

 

Description

TZCNT counts the number of trailing least significant zero bits in source operand (second operand) and returns the result in destination operand (first operand). TZCNT is an extension of the BSF instruction. The key difference between TZCNT and BSF instruction is that TZCNT provides operand size as output when source operand is zero while in the case of BSF instruction, if source operand is zero, the content of destination operand are undefined. On processors that do not support TZCNT, the instruction byte encoding is executed as BSF.

Operation

temp ← 0 DEST ← 0

DO WHILE ( (temp < OperandSize) and (SRC[ temp] = 0) )

temp ← temp +1 DEST ← DEST+ 1

OD

IF DEST = OperandSize

CF ← 1

ELSE

CF ← 0

FI

Ref. # 319433-011

7-27

INSTRUCTION SET REFERENCE - VEX-ENCODED GPR INSTRUCTIONS

IF DEST = 0

ZF ← 1

ELSE

ZF ← 0

FI

Flags Affected

ZF is set to 1 in case of zero output (least significant bit of the source is set), and to 0 otherwise, CF is set to 1 if the input was zero and cleared otherwise. OF, SF, PF and AF flags are undefined.

Intel C/C++ Compiler Intrinsic Equivalent

TZCNT unsigned __int32 _tzcnt_u32(unsigned __int32 src);

TZCNT unsigned __int64 _tzcnt_u64(unsigned __int64 src);

SIMD Floating-Point Exceptions

None

Other Exceptions

See Table 2-21.

7-28

Ref. # 319433-011

Соседние файлы в папке Лаб2012