- •Chapter 1 Intel® Advanced Vector Extensions
- •1.1 About This Document
- •1.2 Overview
- •1.3.2 Instruction Syntax Enhancements
- •1.3.3 VEX Prefix Instruction Encoding Support
- •1.4 Overview AVX2
- •1.5 Functional Overview
- •1.6 General Purpose Instruction Set Enhancements
- •2.1 Detection of PCLMULQDQ and AES Instructions
- •2.2 Detection of AVX and FMA Instructions
- •2.2.1 Detection of FMA
- •2.2.3 Detection of AVX2
- •2.3.1 FMA Instruction Operand Order and Arithmetic Behavior
- •2.4 Accessing YMM Registers
- •2.5 Memory alignment
- •2.7 Instruction Exception Specification
- •2.7.1 Exceptions Type 1 (Aligned memory reference)
- •2.7.2 Exceptions Type 2 (>=16 Byte Memory Reference, Unaligned)
- •2.7.3 Exceptions Type 3 (<16 Byte memory argument)
- •2.7.5 Exceptions Type 5 (<16 Byte mem arg and no FP exceptions)
- •2.7.7 Exceptions Type 7 (No FP exceptions, no memory arg)
- •2.7.8 Exceptions Type 8 (AVX and no memory argument)
- •2.8.1 Clearing Upper YMM State Between AVX and Legacy SSE Instructions
- •2.8.3 Unaligned Memory Access and Buffer Size Management
- •2.9 CPUID Instruction
- •3.1 YMM State, VEX Prefix and Supported Operating Modes
- •3.2 YMM State Management
- •3.2.1 Detection of YMM State Support
- •3.2.2 Enabling of YMM State
- •3.2.4 The Layout of XSAVE Area
- •3.2.5 XSAVE/XRSTOR Interaction with YMM State and MXCSR
- •3.2.6 Processor Extended State Save Optimization and XSAVEOPT
- •3.2.6.1 XSAVEOPT Usage Guidelines
- •3.3 Reset Behavior
- •3.4 Emulation
- •4.1 Instruction Formats
- •4.1.1 VEX and the LOCK prefix
- •4.1.2 VEX and the 66H, F2H, and F3H prefixes
- •4.1.3 VEX and the REX prefix
- •4.1.4 The VEX Prefix
- •4.1.4.1 VEX Byte 0, bits[7:0]
- •4.1.4.2 VEX Byte 1, bit [7] - ‘R’
- •4.1.5 Instruction Operand Encoding and VEX.vvvv, ModR/M
- •4.1.6 The Opcode Byte
- •4.1.7 The MODRM, SIB, and Displacement Bytes
- •4.1.8 The Third Source Operand (Immediate Byte)
- •4.1.9.1 Vector Length Transition and Programming Considerations
- •4.1.10 AVX Instruction Length
- •4.2 Vector SIB (VSIB) Memory Addressing
- •4.3 VEX Encoding Support for GPR Instructions
- •5.1 Interpreting Instruction Reference Pages
- •5.1.1 Instruction Format
- •5.1.2 Opcode Column in the Instruction Summary Table
- •5.1.3 Instruction Column in the Instruction Summary Table
- •5.1.4 Operand Encoding column in the Instruction Summary Table
- •5.1.5 64/32 bit Mode Support column in the Instruction Summary Table
- •5.1.6 CPUID Support column in the Instruction Summary Table
- •5.2 Summary of Terms
- •5.3 Instruction Set Reference
- •MPSADBW - Multiple Sum of Absolute Differences
- •PALIGNR - Byte Align
- •PBLENDW - Blend Packed Words
- •PHADDW/PHADDD - Packed Horizontal Add
- •PHADDSW - Packed Horizontal Add with Saturation
- •PHSUBW/PHSUBD - Packed Horizontal Subtract
- •PHSUBSW - Packed Horizontal Subtract with Saturation
- •PMOVSX - Packed Move with Sign Extend
- •PMOVZX - Packed Move with Zero Extend
- •PMULDQ - Multiply Packed Doubleword Integers
- •PMULHRSW - Multiply Packed Unsigned Integers with Round and Scale
- •PMULHUW - Multiply Packed Unsigned Integers and Store High Result
- •PMULHW - Multiply Packed Integers and Store High Result
- •PMULLW/PMULLD - Multiply Packed Integers and Store Low Result
- •PMULUDQ - Multiply Packed Unsigned Doubleword Integers
- •POR - Bitwise Logical Or
- •PSADBW - Compute Sum of Absolute Differences
- •PSHUFB - Packed Shuffle Bytes
- •PSHUFD - Shuffle Packed Doublewords
- •PSHUFLW - Shuffle Packed Low Words
- •PSIGNB/PSIGNW/PSIGND - Packed SIGN
- •PSLLDQ - Byte Shift Left
- •PSLLW/PSLLD/PSLLQ - Bit Shift Left
- •PSRAW/PSRAD - Bit Shift Arithmetic Right
- •PSRLDQ - Byte Shift Right
- •PSRLW/PSRLD/PSRLQ - Shift Packed Data Right Logical
- •PSUBB/PSUBW/PSUBD/PSUBQ - Packed Integer Subtract
- •PSUBSB/PSUBSW - Subtract Packed Signed Integers with Signed Saturation
- •PSUBUSB/PSUBUSW - Subtract Packed Unsigned Integers with Unsigned Saturation
- •PXOR - Exclusive Or
- •VPBLENDD - Blend Packed Dwords
- •VPERMD - Full Doublewords Element Permutation
- •VPERMPD - Permute Double-Precision Floating-Point Elements
- •VPERMPS - Permute Single-Precision Floating-Point Elements
- •VPERMQ - Qwords Element Permutation
- •VPSLLVD/VPSLLVQ - Variable Bit Shift Left Logical
- •VPSRAVD - Variable Bit Shift Right Arithmetic
- •VPSRLVD/VPSRLVQ - Variable Bit Shift Right Logical
- •VGATHERDPD/VGATHERQPD - Gather Packed DP FP values Using Signed Dword/Qword Indices
- •VGATHERDPS/VGATHERQPS - Gather Packed SP FP values Using Signed Dword/Qword Indices
- •VPGATHERDD/VPGATHERQD - Gather Packed Dword values Using Signed Dword/Qword Indices
- •VPGATHERDQ/VPGATHERQQ - Gather Packed Qword values Using Signed Dword/Qword Indices
- •6.1 FMA Instruction Set Reference
- •Chapter 7 Instruction Set Reference - VEX-Encoded GPR Instructions
- •7.1 Instruction Format
- •7.2 Instruction Set Reference
- •BZHI - Zero High Bits Starting with Specified Bit Position
- •INVPCID - Invalidate Processor Context ID
- •Chapter 8 Post-32nm Processor Instructions
- •8.1 Overview
- •8.2 CPUID Detection of New Instructions
- •8.4 Vector Instruction Exception Specification
- •8.6 Using RDRAND Instruction and Intrinsic
- •8.7 Instruction Reference
- •A.1 AVX Instructions
- •A.2 Promoted Vector Integer Instructions in AVX2
- •B.1 Using Opcode Tables
- •B.2 Key to Abbreviations
- •B.2.1 Codes for Addressing Method
- •B.2.2 Codes for Operand Type
- •B.2.3 Register Codes
- •B.2.4 Opcode Look-up Examples for One, Two, and Three-Byte Opcodes
- •B.2.4.1 One-Byte Opcode Instructions
- •B.2.4.2 Two-Byte Opcode Instructions
- •B.2.4.3 Three-Byte Opcode Instructions
- •B.2.4.4 VEX Prefix Instructions
- •B.2.5 Superscripts Utilized in Opcode Tables
- •B.3 One, Two, and Three-Byte Opcode Maps
- •B.4.1 Opcode Look-up Examples Using Opcode Extensions
- •B.4.2 Opcode Extension Tables
- •B.5 Escape Opcode Instructions
- •B.5.1 Opcode Look-up Examples for Escape Instruction Opcodes
- •B.5.2 Escape Opcode Instruction Tables
- •B.5.2.1 Escape Opcodes with D8 as First Byte
- •B.5.2.2 Escape Opcodes with D9 as First Byte
- •B.5.2.3 Escape Opcodes with DA as First Byte
- •B.5.2.4 Escape Opcodes with DB as First Byte
- •B.5.2.5 Escape Opcodes with DC as First Byte
- •B.5.2.6 Escape Opcodes with DD as First Byte
- •B.5.2.7 Escape Opcodes with DE as First Byte
- •B.5.2.8 Escape Opcodes with DF as First Byte
OPCODE MAP
B.5.2.4 Escape Opcodes with DB as First Byte
Tables B-13 and B-14 contain maps for escape instruction opcodes that begin with DBH. Table B-13 shows the map if the ModR/M byte is in the range 00H-BFH. Here, the value of bits 3-5 (the nnn field in Figure B-1) selects the instruction.
Table B-13. DB Opcode Map When ModR/M Byte is Within 00H to BFH *
nnn Field of ModR/M Byte

| 000B | 001B | 010B | 011B | 100B | 101B | 110B | 111B |
|---|---|---|---|---|---|---|---|
| FILD dword-integer | FISTTP dword-integer | FIST dword-integer | FISTP dword-integer | | FLD extended-real | | FSTP extended-real |

NOTES:
*All blanks in all opcode maps are reserved and must not be used. Do not depend on the operation of undefined or reserved locations.
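The lookup that Table B-13 describes can be sketched in a few lines. This is only an illustration under stated assumptions: the names `DB_MEMORY_FORM`, `nnn_field`, and `lookup_db_memory_form` are hypothetical, and reserved (blank) cells are represented as `None`.

```python
# Table B-13 indexed by the nnn field (ModR/M bits 3-5).
# None marks a blank (reserved) cell. All names here are illustrative.
DB_MEMORY_FORM = [
    ("FILD", "dword-integer"),    # 000B
    ("FISTTP", "dword-integer"),  # 001B
    ("FIST", "dword-integer"),    # 010B
    ("FISTP", "dword-integer"),   # 011B
    None,                         # 100B (reserved)
    ("FLD", "extended-real"),     # 101B
    None,                         # 110B (reserved)
    ("FSTP", "extended-real"),    # 111B
]


def nnn_field(modrm: int) -> int:
    """Return bits 3-5 of a ModR/M byte."""
    return (modrm >> 3) & 0b111


def lookup_db_memory_form(modrm: int):
    """Look up the DB-escape entry for a ModR/M byte in 00H-BFH."""
    if not 0x00 <= modrm <= 0xBF:
        raise ValueError("ModR/M byte is outside 00H-BFH; Table B-14 applies")
    return DB_MEMORY_FORM[nnn_field(modrm)]
```

For example, `lookup_db_memory_form(0x28)` extracts nnn = 101B and yields the FLD extended-real entry, while a byte whose nnn field is 100B or 110B yields `None` because those cells are blank.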
Table B-14 shows the map if the ModR/M byte is outside the range 00H-BFH (that is, C0H-FFH). Here, the first hexadecimal digit of the ModR/M byte selects the table row and the second hexadecimal digit selects the column.
Table B-14. DB Opcode Map When ModR/M Byte is Outside 00H to BFH *
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| C | FCMOVNB ST(0),ST(0) | FCMOVNB ST(0),ST(1) | FCMOVNB ST(0),ST(2) | FCMOVNB ST(0),ST(3) | FCMOVNB ST(0),ST(4) | FCMOVNB ST(0),ST(5) | FCMOVNB ST(0),ST(6) | FCMOVNB ST(0),ST(7) |
| D | FCMOVNBE ST(0),ST(0) | FCMOVNBE ST(0),ST(1) | FCMOVNBE ST(0),ST(2) | FCMOVNBE ST(0),ST(3) | FCMOVNBE ST(0),ST(4) | FCMOVNBE ST(0),ST(5) | FCMOVNBE ST(0),ST(6) | FCMOVNBE ST(0),ST(7) |
| E | | | FCLEX | FINIT | | | | |
| F | FCOMI ST(0),ST(0) | FCOMI ST(0),ST(1) | FCOMI ST(0),ST(2) | FCOMI ST(0),ST(3) | FCOMI ST(0),ST(4) | FCOMI ST(0),ST(5) | FCOMI ST(0),ST(6) | FCOMI ST(0),ST(7) |

| | 8 | 9 | A | B | C | D | E | F |
|---|---|---|---|---|---|---|---|---|
| C | FCMOVNE ST(0),ST(0) | FCMOVNE ST(0),ST(1) | FCMOVNE ST(0),ST(2) | FCMOVNE ST(0),ST(3) | FCMOVNE ST(0),ST(4) | FCMOVNE ST(0),ST(5) | FCMOVNE ST(0),ST(6) | FCMOVNE ST(0),ST(7) |
| D | FCMOVNU ST(0),ST(0) | FCMOVNU ST(0),ST(1) | FCMOVNU ST(0),ST(2) | FCMOVNU ST(0),ST(3) | FCMOVNU ST(0),ST(4) | FCMOVNU ST(0),ST(5) | FCMOVNU ST(0),ST(6) | FCMOVNU ST(0),ST(7) |
| E | FUCOMI ST(0),ST(0) | FUCOMI ST(0),ST(1) | FUCOMI ST(0),ST(2) | FUCOMI ST(0),ST(3) | FUCOMI ST(0),ST(4) | FUCOMI ST(0),ST(5) | FUCOMI ST(0),ST(6) | FUCOMI ST(0),ST(7) |
| F | | | | | | | | |

NOTES:
*All blanks in all opcode maps are reserved and must not be used. Do not depend on the operation of undefined or reserved locations.
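The row and column selection for Table B-14 can be illustrated as follows. This is a minimal sketch, not a decoder: the dictionary and function names are hypothetical, each eight-cell group of the table is keyed by its row digit and starting column, and blank cells are returned as `None`.

```python
# Eight-cell groups of Table B-14, keyed by (row digit, first column of group).
# Names are illustrative; cells absent from both maps are reserved.
DB_REGISTER_GROUPS = {
    (0xC, 0x0): "FCMOVNB",
    (0xC, 0x8): "FCMOVNE",
    (0xD, 0x0): "FCMOVNBE",
    (0xD, 0x8): "FCMOVNU",
    (0xE, 0x8): "FUCOMI",
    (0xF, 0x0): "FCOMI",
}
# Cells that hold a single instruction with no register operands.
DB_SINGLE_CELLS = {0xE2: "FCLEX", 0xE3: "FINIT"}


def row_and_column(modrm: int):
    """Split a ModR/M byte in C0H-FFH into its two hexadecimal digits."""
    if not 0xC0 <= modrm <= 0xFF:
        raise ValueError("ModR/M byte is within 00H-BFH; Table B-13 applies")
    return modrm >> 4, modrm & 0xF


def lookup_db_register_form(modrm: int):
    """Return the Table B-14 cell for a ModR/M byte, or None if blank."""
    row, col = row_and_column(modrm)
    if modrm in DB_SINGLE_CELLS:
        return DB_SINGLE_CELLS[modrm]
    mnemonic = DB_REGISTER_GROUPS.get((row, col & 0x8))
    if mnemonic is None:
        return None
    # Within a group, the low three bits give the ST(i) index.
    return f"{mnemonic} ST(0),ST({col & 0x7})"
```

With these definitions, the byte EAH selects row E, column A, so `lookup_db_register_form(0xEA)` gives `"FUCOMI ST(0),ST(2)"`.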
Ref. # 319433-011 |
B.5.2.5 Escape Opcodes with DC as First Byte
Tables B-15 and B-16 contain maps for escape instruction opcodes that begin with DCH. Table B-15 shows the map if the ModR/M byte is in the range 00H-BFH. Here, the value of bits 3-5 (the nnn field in Figure B-1) selects the instruction.
Table B-15. DC Opcode Map When ModR/M Byte is Within 00H to BFH *
nnn Field of ModR/M Byte (refer to Figure B-1)

| 000B | 001B | 010B | 011B | 100B | 101B | 110B | 111B |
|---|---|---|---|---|---|---|---|
| FADD double-real | FMUL double-real | FCOM double-real | FCOMP double-real | FSUB double-real | FSUBR double-real | FDIV double-real | FDIVR double-real |

NOTES:
*All blanks in all opcode maps are reserved and must not be used. Do not depend on the operation of undefined or reserved locations.
Table B-16 shows the map if the ModR/M byte is outside the range 00H-BFH (that is, C0H-FFH). In this case the first hexadecimal digit of the ModR/M byte selects the table row and the second hexadecimal digit selects the column.
Table B-16. DC Opcode Map When ModR/M Byte is Outside 00H to BFH *
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| C | FADD ST(0),ST(0) | FADD ST(1),ST(0) | FADD ST(2),ST(0) | FADD ST(3),ST(0) | FADD ST(4),ST(0) | FADD ST(5),ST(0) | FADD ST(6),ST(0) | FADD ST(7),ST(0) |
| D | | | | | | | | |
| E | FSUBR ST(0),ST(0) | FSUBR ST(1),ST(0) | FSUBR ST(2),ST(0) | FSUBR ST(3),ST(0) | FSUBR ST(4),ST(0) | FSUBR ST(5),ST(0) | FSUBR ST(6),ST(0) | FSUBR ST(7),ST(0) |
| F | FDIVR ST(0),ST(0) | FDIVR ST(1),ST(0) | FDIVR ST(2),ST(0) | FDIVR ST(3),ST(0) | FDIVR ST(4),ST(0) | FDIVR ST(5),ST(0) | FDIVR ST(6),ST(0) | FDIVR ST(7),ST(0) |

| | 8 | 9 | A | B | C | D | E | F |
|---|---|---|---|---|---|---|---|---|
| C | FMUL ST(0),ST(0) | FMUL ST(1),ST(0) | FMUL ST(2),ST(0) | FMUL ST(3),ST(0) | FMUL ST(4),ST(0) | FMUL ST(5),ST(0) | FMUL ST(6),ST(0) | FMUL ST(7),ST(0) |
| D | | | | | | | | |
| E | FSUB ST(0),ST(0) | FSUB ST(1),ST(0) | FSUB ST(2),ST(0) | FSUB ST(3),ST(0) | FSUB ST(4),ST(0) | FSUB ST(5),ST(0) | FSUB ST(6),ST(0) | FSUB ST(7),ST(0) |
| F | FDIV ST(0),ST(0) | FDIV ST(1),ST(0) | FDIV ST(2),ST(0) | FDIV ST(3),ST(0) | FDIV ST(4),ST(0) | FDIV ST(5),ST(0) | FDIV ST(6),ST(0) | FDIV ST(7),ST(0) |

NOTES:
*All blanks in all opcode maps are reserved and must not be used. Do not depend on the operation of undefined or reserved locations.
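Tables B-15 and B-16 together cover every ModR/M value that can follow DCH, so the two lookups can be combined into one function. The sketch below is an illustration only; `decode_dc` and the table names are hypothetical, and the mnemonics simply follow the tables as printed (so nnn = 100B is FSUB in the memory form, while row E, columns 0-7 is FSUBR in the register form).

```python
# Table B-15, indexed by the nnn field (ModR/M bits 3-5). Names are illustrative.
DC_MEMORY_FORM = ["FADD", "FMUL", "FCOM", "FCOMP", "FSUB", "FSUBR", "FDIV", "FDIVR"]

# Table B-16, keyed by (row digit, first column of the eight-cell group).
# Row D is entirely blank (reserved), so it has no entries.
DC_REGISTER_GROUPS = {
    (0xC, 0x0): "FADD",
    (0xC, 0x8): "FMUL",
    (0xE, 0x0): "FSUBR",
    (0xE, 0x8): "FSUB",
    (0xF, 0x0): "FDIVR",
    (0xF, 0x8): "FDIV",
}


def decode_dc(modrm: int):
    """Return the DC-escape table entry for a ModR/M byte, or None if blank."""
    if not 0x00 <= modrm <= 0xFF:
        raise ValueError("ModR/M must be a single byte")
    if modrm <= 0xBF:
        # Memory form: every nnn value is defined and takes a double-real operand.
        return f"{DC_MEMORY_FORM[(modrm >> 3) & 0b111]} double-real"
    row, col = modrm >> 4, modrm & 0xF
    mnemonic = DC_REGISTER_GROUPS.get((row, col & 0x8))
    if mnemonic is None:
        return None
    # Register form: destination is ST(i), source is ST(0).
    return f"{mnemonic} ST({col & 0x7}),ST(0)"
```

For instance, `decode_dc(0x08)` falls in the memory-form range and returns `"FMUL double-real"`, whereas `decode_dc(0xC9)` selects row C, column 9 and returns `"FMUL ST(1),ST(0)"`.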
B.5.2.6 Escape Opcodes with DD as First Byte
Tables B-17 and B-18 contain maps for escape instruction opcodes that begin with DDH. Table B-17 shows the map if the ModR/M byte is in the range 00H-BFH. Here, the value of bits 3-5 (the nnn field in Figure B-1) selects the instruction.
Table B-17. DD Opcode Map When ModR/M Byte is Within 00H to BFH *
nnn Field of ModR/M Byte

| 000B | 001B | 010B | 011B | 100B | 101B | 110B | 111B |
|---|---|---|---|---|---|---|---|
| FLD double-real | FISTTP integer64 | FST double-real | FSTP double-real | FRSTOR 98/108bytes | | FSAVE 98/108bytes | FSTSW 2 bytes |

NOTES:
*All blanks in all opcode maps are reserved and must not be used. Do not depend on the operation of undefined or reserved locations.
Table B-18 shows the map if the ModR/M byte is outside the range 00H-BFH (that is, C0H-FFH). The first hexadecimal digit of the ModR/M byte selects the table row and the second hexadecimal digit selects the column.
Table B-18. DD Opcode Map When ModR/M Byte is Outside 00H to BFH *
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| C | FFREE ST(0) | FFREE ST(1) | FFREE ST(2) | FFREE ST(3) | FFREE ST(4) | FFREE ST(5) | FFREE ST(6) | FFREE ST(7) |
| D | FST ST(0) | FST ST(1) | FST ST(2) | FST ST(3) | FST ST(4) | FST ST(5) | FST ST(6) | FST ST(7) |
| E | FUCOM ST(0),ST(0) | FUCOM ST(1),ST(0) | FUCOM ST(2),ST(0) | FUCOM ST(3),ST(0) | FUCOM ST(4),ST(0) | FUCOM ST(5),ST(0) | FUCOM ST(6),ST(0) | FUCOM ST(7),ST(0) |
| F | | | | | | | | |

| | 8 | 9 | A | B | C | D | E | F |
|---|---|---|---|---|---|---|---|---|
| C | | | | | | | | |
| D | FSTP ST(0) | FSTP ST(1) | FSTP ST(2) | FSTP ST(3) | FSTP ST(4) | FSTP ST(5) | FSTP ST(6) | FSTP ST(7) |
| E | FUCOMP ST(0) | FUCOMP ST(1) | FUCOMP ST(2) | FUCOMP ST(3) | FUCOMP ST(4) | FUCOMP ST(5) | FUCOMP ST(6) | FUCOMP ST(7) |
| F | | | | | | | | |

NOTES:
*All blanks in all opcode maps are reserved and must not be used. Do not depend on the operation of undefined or reserved locations.
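Although the text describes the C0H-FFH tables in terms of rows and columns, each eight-cell group corresponds to one value of ModR/M bits 3-5, because a byte in that range always has its top two bits set. The sketch below illustrates this observation for Table B-18; it is not taken from the text, and the names `register_group_index`, `DD_REGISTER_FORM`, and `decode_dd_register_form` are hypothetical.

```python
def register_group_index(modrm: int) -> int:
    """Map a ModR/M byte in C0H-FFH to its eight-cell group, numbered 0-7.

    Groups are numbered in table order: row C columns 0-7, row C columns 8-F,
    row D columns 0-7, and so on.
    """
    if not 0xC0 <= modrm <= 0xFF:
        raise ValueError("ModR/M byte is within 00H-BFH; Table B-17 applies")
    row, col = modrm >> 4, modrm & 0xF
    return ((row - 0xC) << 1) | (col >> 3)


# Table B-18 by group: (mnemonic, operand template), or None for a blank group.
DD_REGISTER_FORM = [
    ("FFREE", "ST({i})"),         # row C, columns 0-7
    None,                         # row C, columns 8-F
    ("FST", "ST({i})"),           # row D, columns 0-7
    ("FSTP", "ST({i})"),          # row D, columns 8-F
    ("FUCOM", "ST({i}),ST(0)"),   # row E, columns 0-7
    ("FUCOMP", "ST({i})"),        # row E, columns 8-F
    None,                         # row F, columns 0-7
    None,                         # row F, columns 8-F
]


def decode_dd_register_form(modrm: int):
    """Return the Table B-18 cell for a ModR/M byte, or None if blank."""
    entry = DD_REGISTER_FORM[register_group_index(modrm)]
    if entry is None:
        return None
    mnemonic, operands = entry
    return f"{mnemonic} {operands.format(i=modrm & 0b111)}"
```

The group index computed from the row and column digits equals `(modrm >> 3) & 7` for every byte in C0H-FFH, so a decoder can use the same bit field for both the memory-form and register-form tables and only switch which table it consults.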
B.5.2.7 Escape Opcodes with DE as First Byte
Tables B-19 and B-20 contain opcode maps for escape instruction opcodes that begin with DEH. Table B-19 shows the opcode map if the ModR/M byte is in the range 00H-BFH. In this case, the value of bits 3-5 (the nnn field in Figure B-1) selects the instruction.
Table B-19. DE Opcode Map When ModR/M Byte is Within 00H to BFH *
nnn Field of ModR/M Byte

| 000B | 001B | 010B | 011B | 100B | 101B | 110B | 111B |
|---|---|---|---|---|---|---|---|
| FIADD word-integer | FIMUL word-integer | FICOM word-integer | FICOMP word-integer | FISUB word-integer | FISUBR word-integer | FIDIV word-integer | FIDIVR word-integer |

NOTES:
*All blanks in all opcode maps are reserved and must not be used. Do not depend on the operation of undefined or reserved locations.
Table B-20 shows the opcode map if the ModR/M byte is outside the range 00H-BFH (that is, C0H-FFH). The first hexadecimal digit of the ModR/M byte selects the table row and the second hexadecimal digit selects the column.
Table B-20. DE Opcode Map When ModR/M Byte is Outside 00H to BFH *
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| C | FADDP ST(0),ST(0) | FADDP ST(1),ST(0) | FADDP ST(2),ST(0) | FADDP ST(3),ST(0) | FADDP ST(4),ST(0) | FADDP ST(5),ST(0) | FADDP ST(6),ST(0) | FADDP ST(7),ST(0) |
| D | | | | | | | | |
| E | FSUBRP ST(0),ST(0) | FSUBRP ST(1),ST(0) | FSUBRP ST(2),ST(0) | FSUBRP ST(3),ST(0) | FSUBRP ST(4),ST(0) | FSUBRP ST(5),ST(0) | FSUBRP ST(6),ST(0) | FSUBRP ST(7),ST(0) |
| F | FDIVRP ST(0),ST(0) | FDIVRP ST(1),ST(0) | FDIVRP ST(2),ST(0) | FDIVRP ST(3),ST(0) | FDIVRP ST(4),ST(0) | FDIVRP ST(5),ST(0) | FDIVRP ST(6),ST(0) | FDIVRP ST(7),ST(0) |

| | 8 | 9 | A | B | C | D | E | F |
|---|---|---|---|---|---|---|---|---|
| C | FMULP ST(0),ST(0) | FMULP ST(1),ST(0) | FMULP ST(2),ST(0) | FMULP ST(3),ST(0) | FMULP ST(4),ST(0) | FMULP ST(5),ST(0) | FMULP ST(6),ST(0) | FMULP ST(7),ST(0) |
| D | | FCOMPP | | | | | | |
| E | FSUBP ST(0),ST(0) | FSUBP ST(1),ST(0) | FSUBP ST(2),ST(0) | FSUBP ST(3),ST(0) | FSUBP ST(4),ST(0) | FSUBP ST(5),ST(0) | FSUBP ST(6),ST(0) | FSUBP ST(7),ST(0) |
| F | FDIVP ST(0),ST(0) | FDIVP ST(1),ST(0) | FDIVP ST(2),ST(0) | FDIVP ST(3),ST(0) | FDIVP ST(4),ST(0) | FDIVP ST(5),ST(0) | FDIVP ST(6),ST(0) | FDIVP ST(7),ST(0) |

NOTES:
*All blanks in all opcode maps are reserved and must not be used. Do not depend on the operation of undefined or reserved locations.
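Because most of Table B-20 consists of eight-cell groups that differ only in the ST(i) index, the whole grid can be regenerated from a compact description. The following sketch does that as a cross-check on the table layout; the names `DE_GROUPS`, `DE_SINGLE_CELLS`, `de_cell`, and `de_table` are hypothetical, and blank (reserved) cells are represented as empty strings.

```python
# Eight-cell groups of Table B-20, keyed by the ModR/M value of the first cell.
# Names are illustrative, not part of the reference.
DE_GROUPS = {
    0xC0: "FADDP",
    0xC8: "FMULP",
    0xE0: "FSUBRP",
    0xE8: "FSUBP",
    0xF0: "FDIVRP",
    0xF8: "FDIVP",
}
# Row D holds one instruction only, in column 9, with no explicit operands.
DE_SINGLE_CELLS = {0xD9: "FCOMPP"}


def de_cell(modrm: int) -> str:
    """Return the text of one Table B-20 cell ('' for a blank cell)."""
    if modrm in DE_SINGLE_CELLS:
        return DE_SINGLE_CELLS[modrm]
    mnemonic = DE_GROUPS.get(modrm & 0xF8)
    if mnemonic is None:
        return ""
    return f"{mnemonic} ST({modrm & 0x7}),ST(0)"


def de_table() -> dict:
    """Build the full grid as {row digit: [cells for columns 0 through F]}."""
    return {
        row: [de_cell((row << 4) | col) for col in range(16)]
        for row in range(0xC, 0x10)
    }
```

Indexing the result as `de_table()[0xC][0xA]` returns the cell at row C, column A, which is `"FMULP ST(2),ST(0)"`; row D contains a single non-blank cell, FCOMPP in column 9.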