OPCODE MAP
B.4 OPCODE EXTENSIONS FOR ONE-BYTE AND TWO-BYTE OPCODES
Some 1-byte and 2-byte opcodes use bits 3-5 of the ModR/M byte (the nnn field in Figure B-1) as an extension of the opcode.
| Bits 7-6 | Bits 5-3 | Bits 2-0 |
|---|---|---|
| mod | nnn | R/M |
Figure B-1. ModR/M Byte nnn Field (Bits 5, 4, and 3)
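The field layout in Figure B-1 can be sketched in a few lines of Python. This is only an illustration: the function name `split_modrm` is hypothetical and is not part of any Intel tool or library.

```python
def split_modrm(modrm):
    """Split an 8-bit ModR/M value into its (mod, nnn, rm) fields."""
    if not 0 <= modrm <= 0xFF:
        raise ValueError("ModR/M must fit in one byte")
    mod = (modrm >> 6) & 0b11    # bits 7-6
    nnn = (modrm >> 3) & 0b111   # bits 5-3: the opcode extension field
    rm = modrm & 0b111           # bits 2-0
    return mod, nnn, rm


# C3H is 11 000 011 in binary.
print(split_modrm(0xC3))  # (3, 0, 3)
```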
Opcodes that have opcode extensions are listed in Table B-6 and organized by group number. Group numbers (1 through 17, plus 1A, in the second column) provide a table entry point. The third column gives the encoding of the mod field (bits 7 and 6 of the ModR/M byte) that applies to each row.
B.4.1 Opcode Look-up Examples Using Opcode Extensions
Two examples are provided below.
Example B-4. Interpreting an ADD Instruction
An ADD instruction with a 1-byte opcode of 80H is a Group 1 instruction:
•Table B-6 indicates that the opcode extension field encoded in the ModR/M byte for this instruction is 000B.
•The mod field (bits 7 and 6) selects either a register operand (11B) or a memory operand using a specified addressing mode (for example: mem = 00B, 01B, 10B); the r/m field then identifies the register or the addressing form.
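As a rough sketch of Example B-4, the Group 1 row of Table B-6 can be written as an eight-entry tuple indexed by the nnn field. The names `GROUP1` and `decode_group1` are hypothetical and chosen only for this illustration. The sketch classifies the operand as register or memory from the mod field but does not decode the addressing mode itself.

```python
# Group 1 row of Table B-6 (opcodes 80H-83H), indexed by ModR/M bits 5,4,3.
GROUP1 = ("ADD", "OR", "ADC", "SBB", "AND", "SUB", "XOR", "CMP")


def decode_group1(opcode, modrm):
    """Return (mnemonic, operand kind) for a Group 1 opcode and ModR/M byte."""
    if not 0x80 <= opcode <= 0x83:
        raise ValueError("not a Group 1 opcode")
    mod = (modrm >> 6) & 0b11
    nnn = (modrm >> 3) & 0b111
    kind = "register" if mod == 0b11 else "memory"
    return GROUP1[nnn], kind


# Opcode 80H with ModR/M C0H (11 000 000B): extension 000B selects ADD.
print(decode_group1(0x80, 0xC0))  # ('ADD', 'register')
```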
Example B-5. Looking Up 0F01C3H
Look up opcode 0F01C3 for a VMRESUME instruction by using Table B-2, Table B-3 and Table B-6:
•0F tells us that this instruction is in the 2-byte opcode map.
•01 (row 0, column 1 in Table B-3) reveals that this opcode is in Group 7 of Table B-6.
•C3 is the ModR/M byte. The first two bits of C3 are 11B. This tells us to look at the second of the Group 7 rows in Table B-6.
•The Op/Reg bits [5,4,3] are 000B. This tells us to look in the 000 column for Group 7.
•Finally, the R/M bits [2,1,0] are 011B. This identifies the opcode as the VMRESUME instruction.
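The lookup steps of Example B-5 can be sketched as follows. The dictionary transcribes the Group 7 rows of Table B-6 as printed; the names `GROUP7_MEM`, `GROUP7_MOD11` and `lookup_0f01` are hypothetical. As a simplifying assumption, the sketch ignores the register forms implied by the Rv/Ew operands in the memory row and the 64-bit-only restriction on SWAPGS.

```python
# Group 7 (opcode 0F 01), memory row: indexed by ModR/M bits 5,4,3.
GROUP7_MEM = ("SGDT", "SIDT", "LGDT", "LIDT", "SMSW", None, "LMSW", "INVLPG")

# Group 7, mod = 11B row: keyed by (bits 5,4,3, bits 2,1,0).
GROUP7_MOD11 = {
    (0b000, 0b001): "VMCALL",
    (0b000, 0b010): "VMLAUNCH",
    (0b000, 0b011): "VMRESUME",
    (0b000, 0b100): "VMXOFF",
    (0b001, 0b000): "MONITOR",
    (0b001, 0b001): "MWAIT",
    (0b010, 0b000): "XGETBV",
    (0b010, 0b001): "XSETBV",
    (0b111, 0b000): "SWAPGS",
    (0b111, 0b001): "RDTSCP",
}


def lookup_0f01(encoding):
    """Return the Group 7 mnemonic for a 0F 01 /r encoding, or None if blank."""
    if len(encoding) < 3 or encoding[0] != 0x0F or encoding[1] != 0x01:
        raise ValueError("expected an encoding that starts with 0F 01")
    modrm = encoding[2]
    mod = (modrm >> 6) & 0b11
    nnn = (modrm >> 3) & 0b111
    rm = modrm & 0b111
    if mod == 0b11:
        return GROUP7_MOD11.get((nnn, rm))  # None marks a blank table cell
    return GROUP7_MEM[nnn]


print(lookup_0f01(bytes.fromhex("0F01C3")))  # VMRESUME
```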
B-20 |
Ref. # 319433-011 |
|
B.4.2 Opcode Extension Tables
See Table B-6 below.
Table B-6. Opcode Extensions for One- and Two-byte Opcodes by Group Number *

Columns 000 through 111 give the encoding of bits 5,4,3 of the ModR/M byte (bits 2,1,0 in parentheses).

| Opcode | Group | Mod 7,6 | pfx | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 80-83 | 1 | mem, 11B | | ADD | OR | ADC | SBB | AND | SUB | XOR | CMP |
| 8F | 1A | mem, 11B | | POP | | | | | | | |
| C0, C1 reg, imm; D0, D1 reg, 1; D2, D3 reg, CL | 2 | mem, 11B | | ROL | ROR | RCL | RCR | SHL/SAL | SHR | | SAR |
| F6, F7 | 3 | mem, 11B | | TEST Ib/Iz | | NOT | NEG | MUL AL/rAX | IMUL AL/rAX | DIV AL/rAX | IDIV AL/rAX |
| FE | 4 | mem, 11B | | INC Eb | DEC Eb | | | | | | |
| FF | 5 | mem, 11B | | INC Ev | DEC Ev | CALLN^f64 Ev | CALLF Ep | JMPN^f64 Ev | JMPF Mp | PUSH^d64 Ev | |
| 0F 00 | 6 | mem, 11B | | SLDT Rv/Mw | STR Rv/Mw | LLDT Ew | LTR Ew | VERR Ew | VERW Ew | | |
| 0F 01 | 7 | mem | | SGDT Ms | SIDT Ms | LGDT Ms | LIDT Ms | SMSW Mw/Rv | | LMSW Ew | INVLPG Mb |
| 0F 01 | 7 | 11B | | VMCALL (001); VMLAUNCH (010); VMRESUME (011); VMXOFF (100) | MONITOR (000); MWAIT (001) | XGETBV (000); XSETBV (001) | | | | | SWAPGS^o64 (000); RDTSCP (001) |
| 0F BA | 8 | mem, 11B | | | | | | BT | BTS | BTR | BTC |
| 0F C7 | 9 | mem | | | CMPXCHG8B Mq; CMPXCHG16B Mdq | | | | | VMPTRLD Mq | VMPTRST Mq |
| 0F C7 | 9 | mem | 66 | | | | | | | VMCLEAR Mq | |
| 0F C7 | 9 | mem | F3 | | | | | | | VMXON Mq | VMPTRST Mq |
| 0F C7 | 9 | 11B | | | | | | | | RDRAND Rv | |
| 0F B9 | 10 | mem | | | | | | | | | |
| 0F B9 | 10 | 11B | | | | | | | | | |
| C6 | 11 | mem, 11B | | MOV Eb, Ib | | | | | | | |
| C7 | 11 | mem, 11B | | MOV Ev, Iz | | | | | | | |
Table B-6. Opcode Extensions for One- and Two-byte Opcodes by Group Number * (continued)

Columns 000 through 111 give the encoding of bits 5,4,3 of the ModR/M byte (bits 2,1,0 in parentheses).

| Opcode | Group | Mod 7,6 | pfx | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0F 71 | 12 | mem | | | | | | | | | |
| 0F 71 | 12 | 11B | | | | psrlw Nq, Ib | | psraw Nq, Ib | | psllw Nq, Ib | |
| 0F 71 | 12 | 11B | 66 | | | vpsrlw Hx,Ux,Ib | | vpsraw Hx,Ux,Ib | | vpsllw Hx,Ux,Ib | |
| 0F 72 | 13 | mem | | | | | | | | | |
| 0F 72 | 13 | 11B | | | | psrld Nq, Ib | | psrad Nq, Ib | | pslld Nq, Ib | |
| 0F 72 | 13 | 11B | 66 | | | vpsrld Hx,Ux,Ib | | vpsrad Hx,Ux,Ib | | vpslld Hx,Ux,Ib | |
| 0F 73 | 14 | mem | | | | | | | | | |
| 0F 73 | 14 | 11B | | | | psrlq Nq, Ib | | | | psllq Nq, Ib | |
| 0F 73 | 14 | 11B | 66 | | | vpsrlq Hx,Ux,Ib | vpsrldq Hx,Ux,Ib | | | vpsllq Hx,Ux,Ib | vpslldq Hx,Ux,Ib |
| 0F AE | 15 | mem | | fxsave | fxrstor | ldmxcsr | stmxcsr | XSAVE | XRSTOR | XSAVEOPT | clflush |
| 0F AE | 15 | 11B | | | | | | | lfence | mfence | sfence |
| 0F AE | 15 | 11B | F3 | RDFSBASE Ry | RDGSBASE Ry | WRFSBASE Ry | WRGSBASE Ry | | | | |
| 0F 18 | 16 | mem | | prefetch NTA | prefetch T0 | prefetch T1 | prefetch T2 | | | | |
| 0F 18 | 16 | 11B | | | | | | | | | |
| VEX.0F38 F3 | 17 | mem, 11B | F3 | | BLSR^v By, Ey | BLSMSK^v By, Ey | BLSI^v By, Ey | | | | |
NOTES:
*All blanks in all opcode maps are reserved and must not be used. Do not depend on the operation of undefined or reserved locations.
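
One way a table-driven decoder might honor this note is to record blank cells explicitly and reject them. The sketch below does this for the Group 2 row of Table B-6 as printed, where the 110 column is blank. The names `GROUP2` and `group2_mnemonic` are hypothetical, and raising an error is an assumption about how a tool could react, not behavior defined by this document.

```python
# Group 2 row of Table B-6 (shift/rotate opcodes), indexed by ModR/M bits 5,4,3.
# None marks the blank (reserved) cell in the 110 column.
GROUP2 = ("ROL", "ROR", "RCL", "RCR", "SHL/SAL", "SHR", None, "SAR")


def group2_mnemonic(modrm):
    """Return the Group 2 mnemonic, rejecting the reserved table entry."""
    nnn = (modrm >> 3) & 0b111
    mnemonic = GROUP2[nnn]
    if mnemonic is None:
        raise ValueError("opcode extension {:03b}B is reserved in Group 2".format(nnn))
    return mnemonic


# ModR/M E0H is 11 100 000B, so the extension field is 100B.
print(group2_mnemonic(0xE0))  # SHL/SAL
```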