INSTRUCTION SET REFERENCE - VEX-ENCODED GPR INSTRUCTIONS

BZHI - Zero High Bits Starting with Specified Bit Position

Opcode	Instruction	Op/En	64/32-bit Mode	CPUID Feature Flag	Description
VEX.NDS.LZ.0F38.W0 F5 /r (see Note 1)	BZHI r32a, r/m32, r32b	A	V/V	BMI2	Zero bits in r/m32 starting with the position in r32b, write result to r32a.
VEX.NDS.LZ.0F38.W1 F5 /r (see Note 1)	BZHI r64a, r/m64, r64b	A	V/N.E.	BMI2	Zero bits in r/m64 starting with the position in r64b, write result to r64a.

NOTES:

1. ModRM:r/m is used to encode the first source operand (second operand) and VEX.vvvv encodes the second source operand (third operand).

Instruction Operand Encoding

Op/En	Operand 1	Operand 2	Operand 3	Operand 4
A	ModRM:reg (W)	ModRM:r/m (R)	VEX.vvvv (R)	NA

Description

BZHI copies the bits of the first source operand (the second operand) into the destination operand (the first operand) and clears the higher bits in the destination according to the INDEX value specified by the second source operand (the third operand). The INDEX is specified by bits 7:0 of the second source operand. The INDEX value is saturated at the value of OperandSize - 1. CF is set if the number contained in the low 8 bits of the third operand is greater than OperandSize - 1; in that case, as the Operation section shows, no bits are cleared.

This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.

Operation

N ← SRC2[7:0]
DEST ← SRC1
IF (N < OperandSize)
    DEST[OperandSize-1:N] ← 0
IF (N > OperandSize - 1)
    CF ← 1
ELSE
    CF ← 0
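As a rough illustration of the Operation pseudocode above, the following C sketch models the 32-bit form of BZHI in portable code. The names bzhi32_model and struct bzhi_result are hypothetical, chosen only for this example; they are not part of any Intel header. The sketch models only the destination value and CF, not ZF, SF, or OF.

```c
#include <stdint.h>

/* Hypothetical result type for this sketch: destination value plus CF. */
struct bzhi_result {
    uint32_t value;
    int cf;
};

/* Portable model of 32-bit BZHI, following the Operation pseudocode. */
struct bzhi_result bzhi32_model(uint32_t src, uint32_t index)
{
    struct bzhi_result r;
    uint32_t n = index & 0xFFu;          /* N <- SRC2[7:0] */

    r.value = src;                       /* DEST <- SRC1 */
    if (n < 32u) {
        /* Clear bits 31..n. The shift is well defined because n < 32. */
        r.value &= ((uint32_t)1 << n) - 1u;
    }
    r.cf = (n > 31u);                    /* CF <- 1 when N > OperandSize - 1 */
    return r;
}
```

With this model, an index of 8 keeps only the low byte of the source, and an index of 32 or more returns the source unchanged with CF set.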

7-12	Ref. # 319433-011

Flags Affected

ZF, CF and SF flags are updated based on the result. OF flag is cleared. AF and PF flags are undefined.

Intel C/C++ Compiler Intrinsic Equivalent

BZHI unsigned __int32 _bzhi_u32(unsigned __int32 src, unsigned __int32 index);

BZHI unsigned __int64 _bzhi_u64(unsigned __int64 src, unsigned __int32 index);

SIMD Floating-Point Exceptions

None

Other Exceptions

See Table 2-22.

LZCNT - Count the Number of Leading Zero Bits

Opcode	Instruction	Op/En	64/32-bit Mode	CPUID Feature Flag	Description
F3 0F BD /r	LZCNT r16, r/m16	A	V/V	LZCNT	Count the number of leading zero bits in r/m16, return result in r16.
F3 0F BD /r	LZCNT r32, r/m32	A	V/V	LZCNT	Count the number of leading zero bits in r/m32, return result in r32.
REX.W + F3 0F BD /r	LZCNT r64, r/m64	A	V/N.E.	LZCNT	Count the number of leading zero bits in r/m64, return result in r64.

Instruction Operand Encoding

Op/En	Operand 1	Operand 2	Operand 3	Operand 4
A	ModRM:reg (W)	ModRM:r/m (R)	NA	NA

Description

Counts the number of leading most significant zero bits in a source operand (second operand) returning the result into a destination (first operand).

LZCNT is an extension of the BSR instruction. The key difference between LZCNT and BSR is that LZCNT returns the operand size when the source operand is zero, whereas with BSR the contents of the destination operand are undefined if the source operand is zero. On processors that do not support LZCNT, the instruction byte encoding is executed as BSR.

In 64-bit mode 64-bit operand size requires REX.W=1.

Operation

temp ← OperandSize - 1
DEST ← 0
WHILE (temp >= 0) AND (Bit(SRC, temp) = 0)
DO
    temp ← temp - 1
    DEST ← DEST + 1
OD
IF DEST = OperandSize
    CF ← 1
ELSE
    CF ← 0
IF DEST = 0
    ZF ← 1
ELSE
    ZF ← 0
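The loop above can be sketched in portable C for the 32-bit operand size. This is only a model of the pseudocode under the assumption of a 32-bit source; lzcnt32_model and struct lzcnt_result are hypothetical names introduced for this example.

```c
#include <stdint.h>

/* Hypothetical result type for this sketch: count plus the CF and ZF flags. */
struct lzcnt_result {
    uint32_t count;
    int cf;
    int zf;
};

/* Portable model of 32-bit LZCNT, following the Operation pseudocode. */
struct lzcnt_result lzcnt32_model(uint32_t src)
{
    struct lzcnt_result r = { 0u, 0, 0 };
    int temp = 31;                       /* temp <- OperandSize - 1 */

    /* Walk down from the most significant bit until a 1 bit is found. */
    while (temp >= 0 && ((src >> temp) & 1u) == 0u) {
        temp--;
        r.count++;
    }
    r.cf = (r.count == 32u);             /* CF <- 1 only for a zero source */
    r.zf = (r.count == 0u);              /* ZF <- 1 only when bit 31 is set */
    return r;
}
```

A zero source yields a count of 32 with CF set, which is the case where BSR would leave the destination undefined.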

Flags Affected

ZF is set to 1 if the output is zero (that is, the most significant bit of the source is set) and cleared to 0 otherwise. CF is set to 1 if the input was zero and cleared otherwise. The OF, SF, PF and AF flags are undefined.

Intel C/C++ Compiler Intrinsic Equivalent

LZCNT unsigned __int32 _lzcnt_u32(unsigned __int32 src);

LZCNT unsigned __int64 _lzcnt_u64(unsigned __int64 src);

SIMD Floating-Point Exceptions

None

Other Exceptions

See Table 2-21.

MULX — Unsigned Multiply Without Affecting Flags

Opcode	Instruction	Op/En	64/32-bit Mode	CPUID Feature Flag	Description
VEX.NDD.LZ.F2.0F38.W0 F6 /r	MULX r32a, r32b, r/m32	A	V/V	BMI2	Unsigned multiply of r/m32 with EDX without affecting arithmetic flags.
VEX.NDD.LZ.F2.0F38.W1 F6 /r	MULX r64a, r64b, r/m64	A	V/N.E.	BMI2	Unsigned multiply of r/m64 with RDX without affecting arithmetic flags.

Instruction Operand Encoding

Op/En	Operand 1	Operand 2	Operand 3	Operand 4
A	ModRM:reg (W)	VEX.vvvv (W)	ModRM:r/m (R)	RDX/EDX is implied 64/32 bits source

Description

Performs an unsigned multiplication of the implicit source operand (EDX/RDX) and the specified source operand (the third operand) and stores the low half of the result in the second destination (second operand), the high half of the result in the first destination operand (first operand), without reading or writing the arithmetic flags. This enables efficient programming where the software can interleave add with carry operations and multiplications.

If the first and second operands are identical, that register will contain the high half of the multiplication result.

Operation

// DEST1: ModRM:reg
// DEST2: VEX.vvvv
IF (OperandSize = 32)
    SRC1 ← EDX;
    DEST2 ← (SRC1*SRC2)[31:0];
    DEST1 ← (SRC1*SRC2)[63:32];
ELSE IF (OperandSize = 64)
    SRC1 ← RDX;
    DEST2 ← (SRC1*SRC2)[63:0];
    DEST1 ← (SRC1*SRC2)[127:64];
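A minimal C sketch of the 32-bit case of the pseudocode above follows. It passes the implicit EDX source as an ordinary parameter and returns both halves of the product; mulx32_model and struct mulx32_result are hypothetical names used only for this illustration.

```c
#include <stdint.h>

/* Hypothetical result type: hi models DEST1 (ModRM:reg), lo models DEST2 (VEX.vvvv). */
struct mulx32_result {
    uint32_t hi;
    uint32_t lo;
};

/* Portable model of 32-bit MULX. The edx parameter stands in for the implicit EDX source. */
struct mulx32_result mulx32_model(uint32_t edx, uint32_t src2)
{
    struct mulx32_result r;
    /* Widen before multiplying so the full 64-bit product is kept. */
    uint64_t product = (uint64_t)edx * (uint64_t)src2;

    r.lo = (uint32_t)product;            /* DEST2 <- product[31:0]  */
    r.hi = (uint32_t)(product >> 32);    /* DEST1 <- product[63:32] */
    return r;
}
```

No flag state appears in the model because the instruction neither reads nor writes the arithmetic flags.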

Flags Affected

None

Intel C/C++ Compiler Intrinsic Equivalent

Auto-generated from high-level language.

SIMD Floating-Point Exceptions

None

Other Exceptions

See Table 2-22.

PDEP — Parallel Bits Deposit

Opcode	Instruction	Op/En	64/32-bit Mode	CPUID Feature Flag	Description
VEX.NDS.LZ.F2.0F38.W0 F5 /r	PDEP r32a, r32b, r/m32	A	V/V	BMI2	Parallel deposit of bits from r32b using mask in r/m32, result is written to r32a.
VEX.NDS.LZ.F2.0F38.W1 F5 /r	PDEP r64a, r64b, r/m64	A	V/N.E.	BMI2	Parallel deposit of bits from r64b using mask in r/m64, result is written to r64a.

Instruction Operand Encoding

Op/En	Operand 1	Operand 2	Operand 3	Operand 4
A	ModRM:reg (W)	VEX.vvvv (R)	ModRM:r/m (R)	NA

Description

PDEP uses a mask in the second source operand (the third operand) to transfer/scatter contiguous low order bits in the first source operand (the second operand) into the destination (the first operand). PDEP takes the low bits from the first source operand and deposits them in the destination operand at the corresponding bit locations that are set in the second source operand (mask). All other bits (bits not set in mask) in the destination are set to zero.

[Figure: SRC1 holds bits S31 ... S0. DEST is zero everywhere except at the positions where the SRC2 mask has a 1; those positions receive S0, S1, S2, S3 in order from the lowest set mask bit upward, with S3 landing at bit 28.]

Figure 7-1. PDEP Example

Operation

TEMP ← SRC1;
MASK ← SRC2;
DEST ← 0;
m ← 0, k ← 0;
DO WHILE m < OperandSize
    IF MASK[m] = 1 THEN
        DEST[m] ← TEMP[k];
        k ← k + 1;
    FI
    m ← m + 1;
OD
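The deposit loop above translates almost line for line into portable C. The sketch below assumes the 32-bit operand size; pdep32_model is a hypothetical name for this example, and the mask value 0x100000A4 mentioned afterwards (bits 28, 7, 5 and 2 set) is an illustrative choice rather than a value taken from the text.

```c
#include <stdint.h>

/* Portable model of 32-bit PDEP, following the Operation pseudocode. */
uint32_t pdep32_model(uint32_t src, uint32_t mask)
{
    uint32_t dest = 0u;
    unsigned k = 0;                      /* index of the next low source bit */

    for (unsigned m = 0; m < 32; m++) {
        if ((mask >> m) & 1u) {
            /* DEST[m] <- TEMP[k]; k only advances on set mask bits. */
            dest |= ((src >> k) & 1u) << m;
            k++;
        }
    }
    return dest;
}
```

For example, depositing the source 0xF through the mask 0x100000A4 sets all four masked positions, so the result equals the mask itself.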

Flags Affected

None.

Intel C/C++ Compiler Intrinsic Equivalent

PDEP unsigned __int32 _pdep_u32(unsigned __int32 src, unsigned __int32 mask);

PDEP unsigned __int64 _pdep_u64(unsigned __int64 src, unsigned __int64 mask);

SIMD Floating-Point Exceptions

None

Other Exceptions

See Table 2-22.

PEXT - Parallel Bits Extract

Opcode	Instruction	Op/En	64/32-bit Mode	CPUID Feature Flag	Description
VEX.NDS.LZ.F3.0F38.W0 F5 /r	PEXT r32a, r32b, r/m32	A	V/V	BMI2	Parallel extract of bits from r32b using mask in r/m32, result is written to r32a.
VEX.NDS.LZ.F3.0F38.W1 F5 /r	PEXT r64a, r64b, r/m64	A	V/N.E.	BMI2	Parallel extract of bits from r64b using mask in r/m64, result is written to r64a.

Instruction Operand Encoding

Op/En	Operand 1	Operand 2	Operand 3	Operand 4
A	ModRM:reg (W)	VEX.vvvv (R)	ModRM:r/m (R)	NA

Description

PEXT uses a mask in the second source operand (the third operand) to transfer either contiguous or non-contiguous bits in the first source operand (the second operand) to contiguous low order bit positions in the destination (the first operand). For each bit set in the MASK, PEXT extracts the corresponding bits from the first source operand and writes them into contiguous lower bits of destination operand. The remaining upper bits of destination are zeroed.

[Figure: SRC1 holds bits S31 ... S0 and the SRC2 mask selects four of them, including bit 28. DEST receives the selected bits S28, S7, S5, S2 packed into its four low-order positions; all upper bits are 0.]

Figure 7-2. PEXT Example

Operation

TEMP ← SRC1;
MASK ← SRC2;
DEST ← 0;
m ← 0, k ← 0;
DO WHILE m < OperandSize
    IF MASK[m] = 1 THEN
        DEST[k] ← TEMP[m];
        k ← k + 1;
    FI
    m ← m + 1;
OD
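The extract loop above can be modelled the same way in portable C. The sketch assumes the 32-bit operand size; pext32_model is a hypothetical name for this example, and the mask 0x100000A4 used in the note below (bits 28, 7, 5 and 2 set) is an illustrative value.

```c
#include <stdint.h>

/* Portable model of 32-bit PEXT, following the Operation pseudocode. */
uint32_t pext32_model(uint32_t src, uint32_t mask)
{
    uint32_t dest = 0u;
    unsigned k = 0;                      /* next free low-order destination bit */

    for (unsigned m = 0; m < 32; m++) {
        if ((mask >> m) & 1u) {
            /* DEST[k] <- TEMP[m]; selected bits are packed toward bit 0. */
            dest |= ((src >> m) & 1u) << k;
            k++;
        }
    }
    return dest;
}
```

With the mask 0x100000A4, a source that has only bits 28 and 2 set (0x10000004) extracts to 0x9: bit 2 lands in result bit 0 and bit 28 lands in result bit 3.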

Flags Affected

None.

Intel C/C++ Compiler Intrinsic Equivalent

PEXT unsigned __int32 _pext_u32(unsigned __int32 src, unsigned __int32 mask);

PEXT unsigned __int64 _pext_u64(unsigned __int64 src, unsigned __int64 mask);

SIMD Floating-Point Exceptions

None

Other Exceptions

See Table 2-22.

RORX — Rotate Right Logical Without Affecting Flags

Opcode	Instruction	Op/En	64/32-bit Mode	CPUID Feature Flag	Description
VEX.LZ.F2.0F3A.W0 F0 /r ib	RORX r32, r/m32, imm8	A	V/V	BMI2	Rotate 32-bit r/m32 right imm8 times without affecting arithmetic flags.
VEX.LZ.F2.0F3A.W1 F0 /r ib	RORX r64, r/m64, imm8	A	V/N.E.	BMI2	Rotate 64-bit r/m64 right imm8 times without affecting arithmetic flags.

Instruction Operand Encoding

Op/En	Operand 1	Operand 2	Operand 3	Operand 4
A	ModRM:reg (W)	ModRM:r/m (R)	imm8	NA

Description

Rotates the bits of second operand right by the count value specified in imm8 without affecting arithmetic flags. The RORX instruction does not read or write the arithmetic flags.

Operation

IF (OperandSize = 32)
    y ← imm8 AND 1FH;
    DEST ← (SRC >> y) | (SRC << (32-y));
ELSEIF (OperandSize = 64)
    y ← imm8 AND 3FH;
    DEST ← (SRC >> y) | (SRC << (64-y));
ENDIF
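A direct transcription of the pseudocode above into C needs one adjustment: when y is 0, the expression SRC << (32-y) would shift a 32-bit value by 32, which is undefined behavior in C. The sketch below masks the left-shift count to stay in range. It assumes the 32-bit operand size, and rorx32_model is a hypothetical name for this example.

```c
#include <stdint.h>

/* Portable model of 32-bit RORX: rotate right by imm8 modulo 32. */
uint32_t rorx32_model(uint32_t src, uint8_t imm8)
{
    unsigned y = imm8 & 0x1Fu;           /* y <- imm8 AND 1FH */

    /* (32 - y) & 31 maps y == 0 to a left shift of 0 instead of 32,
       keeping both shift counts within 0..31. */
    return (src >> y) | (src << ((32u - y) & 31u));
}
```

Because the count is masked first, a rotate by 32 behaves like a rotate by 0, and a rotate by 36 behaves like a rotate by 4.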

Flags Affected

None

Intel C/C++ Compiler Intrinsic Equivalent

Auto-generated from high-level language.

SIMD Floating-Point Exceptions

None

Other Exceptions

See Table 2-22.

SARX/SHLX/SHRX - Shift Without Affecting Flags

Opcode	Instruction	Op/En	64/32-bit Mode	CPUID Feature Flag	Description
VEX.NDS.LZ.F3.0F38.W0 F7 /r (see Note 1)	SARX r32a, r/m32, r32b	A	V/V	BMI2	Shift r/m32 arithmetically right with count specified in r32b.
VEX.NDS.LZ.66.0F38.W0 F7 /r (see Note 1)	SHLX r32a, r/m32, r32b	A	V/V	BMI2	Shift r/m32 logically left with count specified in r32b.
VEX.NDS.LZ.F2.0F38.W0 F7 /r (see Note 1)	SHRX r32a, r/m32, r32b	A	V/V	BMI2	Shift r/m32 logically right with count specified in r32b.
VEX.NDS.LZ.F3.0F38.W1 F7 /r (see Note 1)	SARX r64a, r/m64, r64b	A	V/N.E.	BMI2	Shift r/m64 arithmetically right with count specified in r64b.
VEX.NDS.LZ.66.0F38.W1 F7 /r (see Note 1)	SHLX r64a, r/m64, r64b	A	V/N.E.	BMI2	Shift r/m64 logically left with count specified in r64b.
VEX.NDS.LZ.F2.0F38.W1 F7 /r (see Note 1)	SHRX r64a, r/m64, r64b	A	V/N.E.	BMI2	Shift r/m64 logically right with count specified in r64b.

NOTES:

1. ModRM:r/m is used to encode the first source operand (second operand) and VEX.vvvv encodes the second source operand (third operand).

Instruction Operand Encoding

Op/En	Operand 1	Operand 2	Operand 3	Operand 4
A	ModRM:reg (W)	ModRM:r/m (R)	VEX.vvvv (R)	NA

Description

Shifts the bits of the first source operand (the second operand) to the left or right by a COUNT value specified in the second source operand (the third operand). The result is written to the destination operand (the first operand).

The shift arithmetic right (SARX) and shift logical right (SHRX) instructions shift the bits of the destination operand to the right (toward less significant bit locations). SARX keeps and propagates the most significant bit (sign bit) while shifting.

The logical shift left (SHLX) shifts the bits of the destination operand to the left (toward more significant bit locations).

If the value specified in the second source operand exceeds OperandSize - 1, the COUNT value is masked.

SARX, SHRX, and SHLX instructions do not update flags.

Operation

TEMP ← SRC1;
IF VEX.W1 and CS.L = 1 THEN
    countMASK ← 3FH;
ELSE
    countMASK ← 1FH;
COUNT ← (SRC2 AND countMASK)
DEST[OperandSize -1] = TEMP[OperandSize -1];
DO WHILE (COUNT != 0)
    IF instruction is SHLX
    THEN
        DEST[] ← DEST *2;
    ELSE IF instruction is SHRX
    THEN
        DEST[] ← DEST /2; // unsigned divide
    ELSE // SARX
        DEST[] ← DEST /2; // signed divide, round toward negative infinity
    FI;
    COUNT ← COUNT - 1;
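The net effect of the loop above can be sketched in portable C for the 32-bit operand size, using a single shift by the masked count instead of repeated doubling or halving. The function names are hypothetical, chosen for this example. Since right-shifting a negative signed value is implementation-defined in C, the SARX model works on unsigned values and fills in the sign bits explicitly.

```c
#include <stdint.h>

/* Portable models of the 32-bit forms; the count is masked with 1FH as in the pseudocode. */

uint32_t shlx32_model(uint32_t src, uint32_t count)
{
    return src << (count & 0x1Fu);       /* logical shift left */
}

uint32_t shrx32_model(uint32_t src, uint32_t count)
{
    return src >> (count & 0x1Fu);       /* logical shift right */
}

uint32_t sarx32_model(uint32_t src, uint32_t count)
{
    unsigned c = count & 0x1Fu;
    uint32_t result = src >> c;          /* start from the logical shift */

    /* If the sign bit was set, replicate it into the c vacated upper bits. */
    if ((src & 0x80000000u) != 0u && c != 0u) {
        result |= ~((uint32_t)0xFFFFFFFFu >> c);
    }
    return result;
}
```

A count of 33 therefore shifts by 1, and a count of 32 leaves the source unchanged.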

Flags Affected

None.

Intel C/C++ Compiler Intrinsic Equivalent

Auto-generated from high-level language.

SIMD Floating-Point Exceptions

None

Other Exceptions

See Table 2-22.

TZCNT — Count the Number of Trailing Zero Bits

Opcode	Instruction	Op/En	64/32-bit Mode	CPUID Feature Flag	Description
F3 0F BC /r	TZCNT r16, r/m16	A	V/V	BMI1	Count the number of trailing zero bits in r/m16, return result in r16.
F3 0F BC /r	TZCNT r32, r/m32	A	V/V	BMI1	Count the number of trailing zero bits in r/m32, return result in r32.
REX.W + F3 0F BC /r	TZCNT r64, r/m64	A	V/N.E.	BMI1	Count the number of trailing zero bits in r/m64, return result in r64.

Instruction Operand Encoding

Op/En	Operand 1	Operand 2	Operand 3	Operand 4
A	ModRM:reg (W)	ModRM:r/m (R)	NA	NA

Description

TZCNT counts the number of trailing least significant zero bits in the source operand (second operand) and returns the result in the destination operand (first operand). TZCNT is an extension of the BSF instruction. The key difference between TZCNT and BSF is that TZCNT returns the operand size when the source operand is zero, whereas with BSF the contents of the destination operand are undefined if the source operand is zero. On processors that do not support TZCNT, the instruction byte encoding is executed as BSF.

Operation

temp ← 0
DEST ← 0
DO WHILE ( (temp < OperandSize) and (SRC[temp] = 0) )
    temp ← temp + 1
    DEST ← DEST + 1
OD
IF DEST = OperandSize
    CF ← 1
ELSE
    CF ← 0
IF DEST = 0
    ZF ← 1
ELSE
    ZF ← 0
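The trailing-zero loop above can likewise be sketched in portable C for the 32-bit operand size. This is only a model of the pseudocode; tzcnt32_model and struct tzcnt_result are hypothetical names introduced for this example.

```c
#include <stdint.h>

/* Hypothetical result type for this sketch: count plus the CF and ZF flags. */
struct tzcnt_result {
    uint32_t count;
    int cf;
    int zf;
};

/* Portable model of 32-bit TZCNT, following the Operation pseudocode. */
struct tzcnt_result tzcnt32_model(uint32_t src)
{
    struct tzcnt_result r = { 0u, 0, 0 };
    unsigned temp = 0;

    /* Walk up from bit 0 until a 1 bit is found or all 32 bits are seen. */
    while (temp < 32u && ((src >> temp) & 1u) == 0u) {
        temp++;
        r.count++;
    }
    r.cf = (r.count == 32u);             /* CF <- 1 only for a zero source */
    r.zf = (r.count == 0u);              /* ZF <- 1 only when bit 0 is set */
    return r;
}
```

As with the leading-zero count, a zero source gives the operand size (32) with CF set.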

Flags Affected

ZF is set to 1 if the output is zero (that is, the least significant bit of the source is set) and cleared to 0 otherwise. CF is set to 1 if the input was zero and cleared otherwise. The OF, SF, PF and AF flags are undefined.

Intel C/C++ Compiler Intrinsic Equivalent

TZCNT unsigned __int32 _tzcnt_u32(unsigned __int32 src);

TZCNT unsigned __int64 _tzcnt_u64(unsigned __int64 src);

SIMD Floating-Point Exceptions

None

Other Exceptions

See Table 2-21.
