- •Chapter 1 Intel® Advanced Vector Extensions
- •1.1 About This Document
- •1.2 Overview
- •1.3.2 Instruction Syntax Enhancements
- •1.3.3 VEX Prefix Instruction Encoding Support
- •1.4 Overview AVX2
- •1.5 Functional Overview
- •1.6 General Purpose Instruction Set Enhancements
- •2.1 Detection of PCLMULQDQ and AES Instructions
- •2.2 Detection of AVX and FMA Instructions
- •2.2.1 Detection of FMA
- •2.2.3 Detection of AVX2
- •2.3.1 FMA Instruction Operand Order and Arithmetic Behavior
- •2.4 Accessing YMM Registers
- •2.5 Memory alignment
- •2.7 Instruction Exception Specification
- •2.7.1 Exceptions Type 1 (Aligned memory reference)
- •2.7.2 Exceptions Type 2 (>=16 Byte Memory Reference, Unaligned)
- •2.7.3 Exceptions Type 3 (<16 Byte memory argument)
- •2.7.5 Exceptions Type 5 (<16 Byte mem arg and no FP exceptions)
- •2.7.7 Exceptions Type 7 (No FP exceptions, no memory arg)
- •2.7.8 Exceptions Type 8 (AVX and no memory argument)
- •2.8.1 Clearing Upper YMM State Between AVX and Legacy SSE Instructions
- •2.8.3 Unaligned Memory Access and Buffer Size Management
- •2.9 CPUID Instruction
- •3.1 YMM State, VEX Prefix and Supported Operating Modes
- •3.2 YMM State Management
- •3.2.1 Detection of YMM State Support
- •3.2.2 Enabling of YMM State
- •3.2.4 The Layout of XSAVE Area
- •3.2.5 XSAVE/XRSTOR Interaction with YMM State and MXCSR
- •3.2.6 Processor Extended State Save Optimization and XSAVEOPT
- •3.2.6.1 XSAVEOPT Usage Guidelines
- •3.3 Reset Behavior
- •3.4 Emulation
- •4.1 Instruction Formats
- •4.1.1 VEX and the LOCK prefix
- •4.1.2 VEX and the 66H, F2H, and F3H prefixes
- •4.1.3 VEX and the REX prefix
- •4.1.4 The VEX Prefix
- •4.1.4.1 VEX Byte 0, bits[7:0]
- •4.1.4.2 VEX Byte 1, bit [7] - ‘R’
- •4.1.5 Instruction Operand Encoding and VEX.vvvv, ModR/M
- •4.1.6 The Opcode Byte
- •4.1.7 The MODRM, SIB, and Displacement Bytes
- •4.1.8 The Third Source Operand (Immediate Byte)
- •4.1.9.1 Vector Length Transition and Programming Considerations
- •4.1.10 AVX Instruction Length
- •4.2 Vector SIB (VSIB) Memory Addressing
- •4.3 VEX Encoding Support for GPR Instructions
- •5.1 Interpreting Instruction Reference Pages
- •5.1.1 Instruction Format
- •5.1.2 Opcode Column in the Instruction Summary Table
- •5.1.3 Instruction Column in the Instruction Summary Table
- •5.1.4 Operand Encoding column in the Instruction Summary Table
- •5.1.5 64/32 bit Mode Support column in the Instruction Summary Table
- •5.1.6 CPUID Support column in the Instruction Summary Table
- •5.2 Summary of Terms
- •5.3 Instruction Set Reference
- •MPSADBW - Multiple Sum of Absolute Differences
- •PALIGNR - Byte Align
- •PBLENDW - Blend Packed Words
- •PHADDW/PHADDD - Packed Horizontal Add
- •PHADDSW - Packed Horizontal Add with Saturation
- •PHSUBW/PHSUBD - Packed Horizontal Subtract
- •PHSUBSW - Packed Horizontal Subtract with Saturation
- •PMOVSX - Packed Move with Sign Extend
- •PMOVZX - Packed Move with Zero Extend
- •PMULDQ - Multiply Packed Doubleword Integers
- •PMULHRSW - Multiply Packed Unsigned Integers with Round and Scale
- •PMULHUW - Multiply Packed Unsigned Integers and Store High Result
- •PMULHW - Multiply Packed Integers and Store High Result
- •PMULLW/PMULLD - Multiply Packed Integers and Store Low Result
- •PMULUDQ - Multiply Packed Unsigned Doubleword Integers
- •POR - Bitwise Logical Or
- •PSADBW - Compute Sum of Absolute Differences
- •PSHUFB - Packed Shuffle Bytes
- •PSHUFD - Shuffle Packed Doublewords
- •PSHUFLW - Shuffle Packed Low Words
- •PSIGNB/PSIGNW/PSIGND - Packed SIGN
- •PSLLDQ - Byte Shift Left
- •PSLLW/PSLLD/PSLLQ - Bit Shift Left
- •PSRAW/PSRAD - Bit Shift Arithmetic Right
- •PSRLDQ - Byte Shift Right
- •PSRLW/PSRLD/PSRLQ - Shift Packed Data Right Logical
- •PSUBB/PSUBW/PSUBD/PSUBQ - Packed Integer Subtract
- •PSUBSB/PSUBSW - Subtract Packed Signed Integers with Signed Saturation
- •PSUBUSB/PSUBUSW - Subtract Packed Unsigned Integers with Unsigned Saturation
- •PXOR - Exclusive Or
- •VPBLENDD - Blend Packed Dwords
- •VPERMD - Full Doublewords Element Permutation
- •VPERMPD - Permute Double-Precision Floating-Point Elements
- •VPERMPS - Permute Single-Precision Floating-Point Elements
- •VPERMQ - Qwords Element Permutation
- •VPSLLVD/VPSLLVQ - Variable Bit Shift Left Logical
- •VPSRAVD - Variable Bit Shift Right Arithmetic
- •VPSRLVD/VPSRLVQ - Variable Bit Shift Right Logical
- •VGATHERDPD/VGATHERQPD - Gather Packed DP FP values Using Signed Dword/Qword Indices
- •VGATHERDPS/VGATHERQPS - Gather Packed SP FP values Using Signed Dword/Qword Indices
- •VPGATHERDD/VPGATHERQD - Gather Packed Dword values Using Signed Dword/Qword Indices
- •VPGATHERDQ/VPGATHERQQ - Gather Packed Qword values Using Signed Dword/Qword Indices
- •6.1 FMA Instruction Set Reference
- •Chapter 7 Instruction Set Reference - VEX-Encoded GPR Instructions
- •7.1 Instruction Format
- •7.2 INSTRUCTION SET REFERENCE
- •BZHI - Zero High Bits Starting with Specified Bit Position
- •INVPCID - Invalidate Processor Context ID
- •Chapter 8 Post-32nm Processor Instructions
- •8.1 Overview
- •8.2 CPUID Detection of New Instructions
- •8.4 Vector Instruction Exception Specification
- •8.6 Using RDRAND Instruction and Intrinsic
- •8.7 Instruction Reference
- •A.1 AVX Instructions
- •A.2 Promoted Vector Integer Instructions in AVX2
- •B.1 Using Opcode Tables
- •B.2 Key to Abbreviations
- •B.2.1 Codes for Addressing Method
- •B.2.2 Codes for Operand Type
- •B.2.3 Register Codes
- •B.2.4 Opcode Look-up Examples for One, Two, and Three-Byte Opcodes
- •B.2.4.1 One-Byte Opcode Instructions
- •B.2.4.2 Two-Byte Opcode Instructions
- •B.2.4.3 Three-Byte Opcode Instructions
- •B.2.4.4 VEX Prefix Instructions
- •B.2.5 Superscripts Utilized in Opcode Tables
- •B.3 One, Two, and Three-Byte Opcode Maps
- •B.4.1 Opcode Look-up Examples Using Opcode Extensions
- •B.4.2 Opcode Extension Tables
- •B.5 Escape Opcode Instructions
- •B.5.1 Opcode Look-up Examples for Escape Instruction Opcodes
- •B.5.2 Escape Opcode Instruction Tables
- •B.5.2.1 Escape Opcodes with D8 as First Byte
- •B.5.2.2 Escape Opcodes with D9 as First Byte
- •B.5.2.3 Escape Opcodes with DA as First Byte
- •B.5.2.4 Escape Opcodes with DB as First Byte
- •B.5.2.5 Escape Opcodes with DC as First Byte
- •B.5.2.6 Escape Opcodes with DD as First Byte
- •B.5.2.7 Escape Opcodes with DE as First Byte
- •B.5.2.8 Escape Opcodes with DF As First Byte
APPLICATION PROGRAMMING MODEL
2.7.5 Exceptions Type 5 (<16 Byte mem arg and no FP exceptions)
Table 2-14. Type 5 Class Exception Conditions
| Exception | Real | Virtual 80x86 | Protected and Compatibility | 64-bit | Cause of Exception |
|---|---|---|---|---|---|
| Invalid Opcode, #UD | X | X | | | VEX prefix |
| | | | X | X | VEX prefix: If XFEATURE_ENABLED_MASK[2:1] != ‘11b’. If CR4.OSXSAVE[bit 18]=0. |
| | X | X | X | X | Legacy SSE instruction: If CR0.EM[bit 2] = 1. If CR4.OSFXSR[bit 9] = 0. |
| | X | X | X | X | If preceded by a LOCK prefix (F0H) |
| | | | X | X | If any REX, F2, F3, or 66 prefixes precede a VEX prefix |
| | X | X | X | X | If any corresponding CPUID feature flag is ‘0’ |
| Device Not Available, #NM | X | X | X | X | If CR0.TS[bit 3]=1 |
| Stack, SS(0) | | | X | | For an illegal address in the SS segment |
| | | | | X | If a memory address referencing the SS segment is in a non-canonical form |
| General Protection, #GP(0) | | | X | | For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. |
| | | | | X | If the memory address is in a non-canonical form. |
| | X | X | | | If any part of the operand lies outside the effective address space from 0 to FFFFH |
| Page Fault #PF(fault-code) | | X | X | X | For a page fault |
| Alignment Check #AC(0) | | X | X | X | If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3. |
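The two OS-state conditions that Table 2-14 lists for VEX-prefixed instructions in protected, compatibility, and 64-bit modes are simple bit tests. The following is a minimal sketch of that check, modeling the register contents as plain integers; the function name and its parameters are illustrative assumptions, not part of any Intel-defined interface.

```python
def vex_state_causes_ud(cr4: int, xfeature_enabled_mask: int) -> bool:
    """Model the OS-state #UD conditions for a VEX-prefixed instruction.

    Hypothetical helper: both arguments are register values that the
    caller supplies as integers.
    """
    osxsave = (cr4 >> 18) & 1                           # CR4.OSXSAVE[bit 18]
    sse_avx_state = (xfeature_enabled_mask >> 1) & 0b11  # bits [2:1]
    # #UD if the OS has not enabled XSAVE, or has not enabled both
    # state components covered by bits [2:1].
    return osxsave == 0 or sse_avx_state != 0b11
```

The sketch covers only these two conditions; the other #UD causes in the table (LOCK prefix, illegal prefixes before VEX, CPUID feature flags) would need separate checks.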
Ref. # 319433-011 |
2.7.6 Exceptions Type 6 (VEX-Encoded Instructions Without Legacy SSE Analogues)
Note: At present, the AVX instructions in this category do not generate floating-point exceptions.
Table 2-15. Type 6 Class Exception Conditions
| Exception | Real | Virtual 80x86 | Protected and Compatibility | 64-bit | Cause of Exception |
|---|---|---|---|---|---|
| Invalid Opcode, #UD | X | X | | | VEX prefix |
| | | | X | X | If XFEATURE_ENABLED_MASK[2:1] != ‘11b’. If CR4.OSXSAVE[bit 18]=0. |
| | | | X | X | If preceded by a LOCK prefix (F0H) |
| | | | X | X | If any REX, F2, F3, or 66 prefixes precede a VEX prefix |
| | | | X | X | If any corresponding CPUID feature flag is ‘0’ |
| Device Not Available, #NM | | | X | X | If CR0.TS[bit 3]=1 |
| Stack, SS(0) | | | X | | For an illegal address in the SS segment |
| | | | | X | If a memory address referencing the SS segment is in a non-canonical form |
| General Protection, #GP(0) | | | X | | For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. |
| | | | | X | If the memory address is in a non-canonical form. |
| Page Fault #PF(fault-code) | | | X | X | For a page fault |
| Alignment Check #AC(0) | | | X | X | For 4 or 8 byte memory references if alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3. |
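The #AC row of Table 2-15 combines three conditions: the reference is 4 or 8 bytes wide, alignment checking is enabled, and the current privilege level is 3. A minimal sketch of that predicate follows. It assumes that "unaligned" means the address is not a multiple of the access size, and it collapses whatever enables alignment checking into a single boolean; the function name and parameters are hypothetical.

```python
def type6_alignment_check(address: int, size: int,
                          alignment_checking_enabled: bool, cpl: int) -> bool:
    """Return True if the modeled Type 6 access would raise #AC(0).

    Assumption: an access is 'unaligned' when address % size != 0.
    """
    if size not in (4, 8):
        # The table lists #AC only for 4 or 8 byte memory references.
        return False
    unaligned = address % size != 0
    return alignment_checking_enabled and cpl == 3 and unaligned
```

For example, a 4-byte reference at address 0x1002 made at privilege level 3 with checking enabled satisfies all three conditions, while the same reference at privilege level 0 does not.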
2.7.7 Exceptions Type 7 (No FP exceptions, no memory arg)
Table 2-16. Type 7 Class Exception Conditions
| Exception | Real | Virtual 80x86 | Protected and Compatibility | 64-bit | Cause of Exception |
|---|---|---|---|---|---|
| Invalid Opcode, #UD | X | X | | | VEX prefix |
| | | | X | X | VEX prefix: If XFEATURE_ENABLED_MASK[2:1] != ‘11b’. If CR4.OSXSAVE[bit 18]=0. |
| | X | X | X | X | Legacy SSE instruction: If CR0.EM[bit 2] = 1. If CR4.OSFXSR[bit 9] = 0. |
| | X | X | X | X | If preceded by a LOCK prefix (F0H) |
| | | | X | X | If any REX, F2, F3, or 66 prefixes precede a VEX prefix |
| | X | X | X | X | If any corresponding CPUID feature flag is ‘0’ |
| Device Not Available, #NM | | | X | X | If CR0.TS[bit 3]=1 |
2.7.8 Exceptions Type 8 (AVX and no memory argument)
Table 2-17. Type 8 Class Exception Conditions
| Exception | Real | Virtual 80x86 | Protected and Compatibility | 64-bit | Cause of Exception |
|---|---|---|---|---|---|
| Invalid Opcode, #UD | X | X | | | Always in Real or Virtual 80x86 mode |
| | | | X | X | If XFEATURE_ENABLED_MASK[2:1] != ‘11b’. If CR4.OSXSAVE[bit 18]=0. If CPUID.01H.ECX.AVX[bit 28]=0. If VEX.vvvv != 1111B. |
| | X | X | X | X | If preceded by a LOCK prefix (F0H) |
| Device Not Available, #NM | | | X | X | If CR0.TS[bit 3]=1. |
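Two of the #UD causes listed for Type 8, the CPUID AVX feature flag and the VEX.vvvv field, are plain bit tests. As a minimal sketch, assuming the caller already holds the CPUID.01H ECX value and the decoded VEX.vvvv field as integers (the function name and parameters are illustrative only):

```python
def type8_encoding_causes_ud(cpuid_01h_ecx: int, vex_vvvv: int) -> bool:
    """Return True if either modeled Type 8 #UD condition holds.

    Hypothetical helper covering only the CPUID and VEX.vvvv causes.
    """
    avx_flag = (cpuid_01h_ecx >> 28) & 1       # CPUID.01H.ECX.AVX[bit 28]
    vvvv_is_all_ones = (vex_vvvv & 0b1111) == 0b1111
    return avx_flag == 0 or not vvvv_is_all_ones
```

The remaining Type 8 causes (operating mode, OS-enabled state, LOCK prefix) are not modeled here.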
2.7.9 Exception Type 11 (VEX-only, mem arg no AC, floating-point exceptions)
Table 2-18. Type 11 Class Exception Conditions
| Exception | Real | Virtual 80x86 | Protected and Compatibility | 64-bit | Cause of Exception |
|---|---|---|---|---|---|
| Invalid Opcode, #UD | X | X | | | VEX prefix |
| | | | X | X | VEX prefix: If XFEATURE_ENABLED_MASK[2:1] != ‘11b’. If CR4.OSXSAVE[bit 18]=0. |
| | X | X | X | X | If preceded by a LOCK prefix (F0H) |
| | | | X | X | If any REX, F2, F3, or 66 prefixes precede a VEX prefix |
| | X | X | X | X | If any corresponding CPUID feature flag is ‘0’ |
| Device Not Available, #NM | X | X | X | X | If CR0.TS[bit 3]=1 |
| Stack, SS(0) | | | X | | For an illegal address in the SS segment |
| | | | | X | If a memory address referencing the SS segment is in a non-canonical form |
| General Protection, #GP(0) | | | X | | For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. |
| | | | | X | If the memory address is in a non-canonical form. |
| | X | X | | | If any part of the operand lies outside the effective address space from 0 to FFFFH |
| Page Fault #PF(fault-code) | | X | X | X | For a page fault |
| SIMD Floating-Point Exception, #XM | X | X | X | X | If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 1 |
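The #XM row of Table 2-18 depends on whether a detected SIMD floating-point condition is unmasked. The sketch below illustrates that test under one stated assumption that is not spelled out in this section: the MXCSR layout with the six exception mask bits in bits [12:7], in the same order as the six exception flags in bits [5:0]. The function name and the `detected` argument (a 6-bit set of conditions raised by the instruction) are hypothetical.

```python
def raises_xm(detected: int, mxcsr: int, cr4: int) -> bool:
    """Return True if the modeled instruction would raise #XM.

    Assumption: MXCSR mask bits occupy bits [12:7], ordered like the
    flag bits [5:0]; 'detected' uses the flag-bit ordering.
    """
    masks = (mxcsr >> 7) & 0x3F            # 1 = exception masked
    unmasked = detected & ~masks & 0x3F    # conditions with a clear mask bit
    osxmmexcpt = (cr4 >> 10) & 1           # CR4.OSXMMEXCPT[bit 10]
    return unmasked != 0 and osxmmexcpt == 1
```

With an MXCSR value of 0x1F80 all six mask bits are set, so no detected condition is reported as #XM in this model.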
2.7.10 Exception Type 12 (VEX-only, VSIB mem arg, no AC, no floating-point exceptions)
Table 2-19. Type 12 Class Exception Conditions
| Exception | Real | Virtual 80x86 | Protected and Compatibility | 64-bit | Cause of Exception |
|---|---|---|---|---|---|
| Invalid Opcode, #UD | X | X | | | VEX prefix |
| | | | X | X | VEX prefix: If XFEATURE_ENABLED_MASK[2:1] != ‘11b’. If CR4.OSXSAVE[bit 18]=0. |
| | X | X | X | X | If preceded by a LOCK prefix (F0H) |
| | | | X | X | If any REX, F2, F3, or 66 prefixes precede a VEX prefix |
| | X | X | X | NA | If address size attribute is 16 bit |
| | X | X | X | X | If ModR/M.mod = ‘11b’ |
| | X | X | X | X | If ModR/M.rm != ‘100b’ |
| | X | X | X | X | If any corresponding CPUID feature flag is ‘0’ |
| | X | X | X | X | If any vector register is used more than once between the destination register, mask register and the index register in VSIB addressing. |
| Device Not Available, #NM | X | X | X | X | If CR0.TS[bit 3]=1 |
| Stack, SS(0) | | | X | | For an illegal address in the SS segment |
| | | | | X | If a memory address referencing the SS segment is in a non-canonical form |
| General Protection, #GP(0) | | | X | | For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. |
| | | | | X | If the memory address is in a non-canonical form. |
| | X | X | | | If any part of the operand lies outside the effective address space from 0 to FFFFH |
| Page Fault #PF(fault-code) | | X | X | X | For a page fault |
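The encoding-related #UD causes that are specific to Type 12 (16-bit address size, ModR/M.mod, ModR/M.rm, and register reuse) can be checked from the decoded instruction fields. The following is a minimal sketch, assuming the standard ModR/M layout (mod in bits [7:6], rm in bits [2:0]) and that the destination, mask, and index registers are given as register numbers; the function name and parameters are hypothetical.

```python
def vsib_encoding_causes_ud(modrm: int, dest: int, mask: int, index: int,
                            address_size_bits: int = 32) -> bool:
    """Return True if any modeled Type 12 encoding condition causes #UD."""
    mod = (modrm >> 6) & 0b11
    rm = modrm & 0b111
    if address_size_bits == 16:
        return True                 # 16-bit address size attribute
    if mod == 0b11:
        return True                 # register form, no memory operand
    if rm != 0b100:
        return True                 # no SIB byte follows, so no VSIB
    if len({dest, mask, index}) < 3:
        return True                 # a vector register is used more than once
    return False
```

For instance, `vsib_encoding_causes_ud(0x04, dest=0, mask=1, index=2)` models a memory form with a SIB byte and three distinct registers, so none of the four conditions applies.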
2.7.11 Exception Conditions for VEX-Encoded GPR Instructions
The exception conditions applicable to VEX-encoded GPR instructions differ from those of legacy GPR instructions. Table 2-20 groups the instructions listed in Chapter 7 by exception class. Table 2-22 lists the exception conditions for VEX-encoded GPR instructions that have a default operand size of 32 bits and for which a 16-bit operand size is not encodable. Table 2-21 lists the exception conditions for those instructions that support operation on 16-bit operands.
Table 2-20. Exception Groupings for Instructions Listed in Chapter 7
| Exception Class | Instruction |
|---|---|
| See Table 2-21 | LZCNT, TZCNT |
| See Table 2-22 | ANDN, BLSI, BLSMSK, BLSR, BZHI, MULX, PDEP, PEXT, RORX, SARX, SHLX, SHRX |
(*) - Additional exception restrictions are present - see the Instruction description for details
Table 2-21. Exception Definition for LZCNT and TZCNT
| Exception | Real | Virtual 80x86 | Protected and Compatibility | 64-bit | Cause of Exception |
|---|---|---|---|---|---|
| Stack, SS(0) | X | X | X | | For an illegal address in the SS segment |
| | | | | X | If a memory address referencing the SS segment is in a non-canonical form |
| General Protection, #GP(0) | | | X | | For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. If the DS, ES, FS, or GS register is used to access memory and it contains a null segment selector. |
| | | | | X | If the memory address is in a non-canonical form. |
| | X | X | | | If any part of the operand lies outside the effective address space from 0 to FFFFH |
| Page Fault #PF(fault-code) | | X | X | X | For a page fault |
| Alignment Check #AC(0) | | X | X | X | If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3. |
Table 2-22. Exception Definition (VEX-Encoded GPR Instructions)
| Exception | Real | Virtual 80x86 | Protected and Compatibility | 64-bit | Cause of Exception |
|---|---|---|---|---|---|
| Invalid Opcode, #UD | X | X | X | X | If BMI1/BMI2 CPUID feature flag is ‘0’ |
| | X | X | | | If a VEX prefix is present |
| | | | X | X | If any REX, F2, F3, or 66 prefixes precede a VEX prefix |
| Stack, SS(0) | X | X | X | | For an illegal address in the SS segment |
| | | | | X | If a memory address referencing the SS segment is in a non-canonical form |
| General Protection, #GP(0) | | | X | | For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. If the DS, ES, FS, or GS register is used to access memory and it contains a null segment selector. |
| | | | | X | If the memory address is in a non-canonical form. |
| | X | X | | | If any part of the operand lies outside the effective address space from 0 to FFFFH |
| Page Fault #PF(fault-code) | | X | X | X | For a page fault |
| Alignment Check #AC(0) | | X | X | X | If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3. |
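Two address conditions recur throughout the exception tables in this section: a non-canonical address in 64-bit mode, and an operand that extends beyond the 0 to FFFFH effective address space in real or virtual-8086 mode. The sketch below models both. It assumes 48-bit linear addresses for the canonical check (bits 63:47 must be all zeros or all ones), which is an implementation-dependent detail not stated in this section; the function names and parameters are hypothetical.

```python
def is_canonical(address: int, linear_address_bits: int = 48) -> bool:
    """Return True if a 64-bit address is in canonical form.

    Assumption: the implementation uses 48-bit linear addresses unless
    the caller passes a different width.
    """
    address &= (1 << 64) - 1
    upper = address >> (linear_address_bits - 1)        # bits 63:47 for 48-bit
    all_ones = (1 << (64 - linear_address_bits + 1)) - 1
    return upper == 0 or upper == all_ones


def operand_outside_64k(offset: int, size: int) -> bool:
    """Return True if any byte of the operand lies outside 0..FFFFH."""
    last_byte = offset + size - 1
    return offset < 0 or last_byte > 0xFFFF
```

Under these assumptions, a 4-byte operand at offset FFFCH ends exactly at FFFFH and stays inside the range, while the same operand at offset FFFDH does not.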
2.8 PROGRAMMING CONSIDERATIONS WITH 128-BIT SIMD INSTRUCTIONS
VEX-encoded SIMD instructions generally operate on the 256-bit YMM register state. In contrast, non-VEX-encoded instructions (e.g., from SSE to AES) operating on XMM registers access only the lower 128 bits of the YMM registers. Processors supporting both 256-bit VEX-encoded instructions and legacy 128-bit SIMD instructions have internal state to manage the upper and lower halves of the YMM register states. Functionally, VEX-encoded SIMD instructions can be intermixed with legacy SSE instructions (non-VEX-encoded SIMD instructions operating on XMM registers). However, there is a