Program Flow Prediction

5.4 Memory Barriers

Memory barrier is the general term applied to an instruction, or sequence of instructions, used to force synchronization events by a processor with respect to retiring load/store instructions in a processor core. A memory barrier is used to guarantee completion of preceding load/store instructions to the programmer's model, flushing of any prefetched instructions prior to the event, or both. The ARMv6 architecture mandates three explicit barrier instructions in the System Control Coprocessor to support the memory order model (see the ARM Architecture Reference Manual) and requires these instructions to be available in both Privileged and User modes:

Data Memory Barrier, see Data Memory Barrier operation on page 3-85

Data Synchronization Barrier, see Data Synchronization Barrier operation on page 3-84

Prefetch Flush, see Flush operations on page 3-79.

Note

The Data Synchronization Barrier operation is synonymous with Drain Write Buffer and Data Write Barrier in earlier versions of the architecture.

These instructions might be sufficient on their own, or might have to be used in conjunction with cache and memory management maintenance operations that are only available in Privileged modes.
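The three barrier operations listed above can be wrapped as small C functions. The following is a minimal sketch, assuming a GCC or Clang toolchain with inline assembly and code built for ARM (not Thumb) state. The CP15 c7 encodings used are the ARMv6 ones (c10, 5 for Data Memory Barrier; c10, 4 for Data Synchronization Barrier; c5, 4 for Flush Prefetch Buffer). The function names, the macro, and the `publish` example are hypothetical, and on non-ARM builds the `__sync_synchronize()` builtin stands in only so that the sketch compiles.

```c
#include <stdint.h>

/* Hypothetical wrapper names; only the CP15 encodings come from ARMv6. */
#if defined(__arm__) && !defined(__thumb__)
/* Write zero (the register value Should Be Zero) to a CP15 c7 operation. */
#define CP15_C7_WRITE(crm, op2) \
    __asm__ volatile("mcr p15, 0, %0, c7, " #crm ", " #op2 \
                     : : "r"(0u) : "memory")
#else
/* Assumption: on other targets a full-barrier builtin is a stand-in. */
#define CP15_C7_WRITE(crm, op2) __sync_synchronize()
#endif

static inline void data_memory_barrier(void) { CP15_C7_WRITE(c10, 5); }
static inline void data_sync_barrier(void)   { CP15_C7_WRITE(c10, 4); }
static inline void prefetch_flush(void)      { CP15_C7_WRITE(c5, 4); }

/* Illustrative use: make the data visible before the flag that announces it. */
static volatile uint32_t shared_data;
static volatile uint32_t shared_flag;

static void publish(uint32_t value)
{
    shared_data = value;
    data_memory_barrier();   /* order the data store before the flag store */
    shared_flag = 1u;
}
```

A reader of `shared_flag` would pair this with its own barrier between reading the flag and reading the data.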

5.4.1 Instruction Memory Barriers (IMBs)

Because it is impossible to avoid self-modifying code entirely, it is necessary to define a sequence of operations that can be used in the middle of a self-modifying code sequence to make it execute reliably. This sequence is called an Instruction Memory Barrier (IMB), and might depend both on the ARM processor implementation and on the memory system implementation.

The IMB sequence must be executed after the new instructions have been stored to memory and before they are executed, for example, after a program has been loaded and before its entry point is branched to. Any self-modifying code sequence that does not use an IMB in this way has Unpredictable behavior.

An IMB might be included in-line where required. However, it is recommended that software is designed so that the IMB sequence is provided as a call to an easily replaceable system dependencies module. This eases porting across different architecture variants, ARM processors, and memory systems.
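One way to structure such a replaceable module is a small table of function pointers that a port fills in once. This is only a sketch: the `imb_ops` structure, the `imb_install`, `imb_all`, and `imb_region` functions, and the recording back end are all hypothetical names introduced for illustration, and the range call assumes an end address that is exclusive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical system-dependencies interface: one entry per IMB flavour. */
struct imb_ops {
    void (*imb)(void);
    void (*imb_range)(unsigned long start_addr, unsigned long end_addr);
};

static const struct imb_ops *current_ops;

/* Called once by the platform port to select its implementation. */
void imb_install(const struct imb_ops *ops)
{
    assert(ops != NULL && ops->imb != NULL && ops->imb_range != NULL);
    current_ops = ops;
}

/* Portable entry points used by the rest of the program. */
void imb_all(void)
{
    assert(current_ops != NULL);
    current_ops->imb();
}

void imb_region(const void *start, size_t len)
{
    /* Assumes unsigned long can hold an address, as on 32-bit ARM. */
    unsigned long start_addr = (unsigned long)start;

    assert(current_ops != NULL);
    current_ops->imb_range(start_addr, start_addr + len); /* end exclusive */
}

/* Trivial back end that only records calls, for example for host testing. */
static int full_calls;
static unsigned long last_start, last_end;

static void record_imb(void) { full_calls++; }

static void record_imb_range(unsigned long s, unsigned long e)
{
    last_start = s;
    last_end = e;
}

static const struct imb_ops recording_ops = { record_imb, record_imb_range };
```

Porting to a different processor or memory system then means supplying a different `imb_ops` table, with no change to the callers of `imb_all` and `imb_region`.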

IMB sequences can include operations that are only usable from Privileged processor modes, such as the cache cleaning and invalidation operations supplied by the system control coprocessor. To give User mode programs access to privileged IMB sequences, it is recommended that they are supplied as operating system calls, invoked by SVC instructions. For systems that use the 24-bit immediate in an SVC instruction to specify the required operating system service, the default values are as follows:

SVC 0xF00000 ; the general case

SVC 0xF00001 ; where the system can take advantage of specifying an affected address range

These are recommended for general use unless an operating system has good reason to choose differently, to align with a broader range of operating system specific system services.

The SVC 0xF00000 call takes no parameters, does not return a result, and, apart from the fact that a SVC instruction is used for the call, rather than a BL instruction, uses the same calling conventions as a call to a C function with prototype:

void IMB(void);

ARM DDI 0333H

Copyright © 2004-2009 ARM Limited. All rights reserved.

ID012410

Non-Confidential, Unrestricted Access

The SVC 0xF00001 call uses similar calling conventions to those used by a call to a C function with prototype:

void IMB_Range(unsigned long start_addr, unsigned long end_addr);

Where the address range runs from start_addr (inclusive) to end_addr (exclusive). When the standard ARM Procedure Call Standard is used, this means that start_addr is passed in R0 and end_addr in R1.
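The calling convention described above could be wrapped in C roughly as follows. This is a sketch under stated assumptions: GCC or Clang inline assembly, ARM (not Thumb) state, and an operating system that actually implements SVC 0xF00001 as recommended here, which many do not. The `IMB_VIA_SVC` macro and the `imb_sync_buffer` helper are hypothetical. When the macro is not defined, the sketch falls back to the compiler builtin `__builtin___clear_cache`, which serves a similar purpose on GCC and Clang. An `IMB()` wrapper would follow the same pattern with no register operands.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__arm__) && !defined(__thumb__) && defined(IMB_VIA_SVC)
/* Assumption: the OS services SVC 0xF00001 as an IMB over [R0, R1). */
static inline void IMB_Range(unsigned long start_addr, unsigned long end_addr)
{
    register unsigned long r0 __asm__("r0") = start_addr; /* inclusive */
    register unsigned long r1 __asm__("r1") = end_addr;   /* exclusive */

    __asm__ volatile("svc 0xF00001"
                     : "+r"(r0), "+r"(r1)
                     :
                     : "r2", "r3", "r12", "lr", "cc", "memory");
}
#else
/* Fallback for other builds: let the compiler runtime do the work. */
static inline void IMB_Range(unsigned long start_addr, unsigned long end_addr)
{
    __builtin___clear_cache((char *)start_addr, (char *)end_addr);
}
#endif

/* Hypothetical helper: turn (buffer, length) into the half-open range
 * [start_addr, end_addr) and return the exclusive end address. */
unsigned long imb_sync_buffer(const void *buf, size_t len)
{
    unsigned long start_addr = (unsigned long)buf;
    unsigned long end_addr = start_addr + len;

    assert(len > 0);
    IMB_Range(start_addr, end_addr);
    return end_addr;
}
```

Because end_addr is exclusive, a 32-byte buffer at address A is passed as the pair (A, A + 32), not (A, A + 31).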

The execution time cost of an IMB can be very large, many thousands of clock cycles, even when a small address range is specified. For small scale uses of self-modifying code, this is likely to lead to a major loss of performance. It is therefore recommended that self-modifying code is only used where it is unavoidable and/or it produces sufficiently large execution time benefits to offset the cost of the IMB.
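The trade-off can be made concrete with a break-even estimate. The sketch below is illustrative only: the function name and all cycle counts are hypothetical inputs to be measured on the target system, not figures from this manual.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest number of runs of the generated code at which the cycles saved
 * cover the one-off cost of building the code and issuing the IMB.
 * Returns UINT64_MAX if the generated code is not faster per run. */
uint64_t imb_break_even_runs(uint64_t build_cycles,
                             uint64_t imb_cycles,
                             uint64_t generic_cycles_per_run,
                             uint64_t generated_cycles_per_run)
{
    uint64_t overhead = build_cycles + imb_cycles;
    uint64_t saving;

    if (generated_cycles_per_run >= generic_cycles_per_run)
        return UINT64_MAX;                    /* never pays off */

    saving = generic_cycles_per_run - generated_cycles_per_run;
    return (overhead + saving - 1) / saving;  /* ceiling division */
}
```

For example, with an assumed 2000 cycles to build the code, 5000 cycles for the IMB, and a saving of 40 cycles per run (120 against 80), the generated code must run at least 7000 / 40 = 175 times before it is worthwhile.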


5.5 ARM1176JZ-S IMB implementation

For the ARM1176JZ-S processor:

executing the SVC instruction is sufficient to cause IMB operation

both the IMB and the IMBRange instructions flush all stored information about the instruction stream.

Note

The IMB implementation described here applies to the ARM1020T and later processors, including the ARM1176JZ-S.

This means that all IMB instructions can be implemented in the operating system by returning from the IMB or IMBRange service routine, and that the IMB and IMBRange service routines can be exactly the same. The following service routine code can be used:

IMB_SVC_handler
IMBRange_SVC_handler
        MOVS    PC, R14_svc     ; Return to the code after the SVC call
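How an operating system might route both service numbers to that shared routine can be modeled in C. This is a sketch, not the processor's or any particular operating system's dispatcher: the function and enum names are hypothetical. It relies only on the ARM-state SVC encoding, in which bits [27:24] are 1111 and bits [23:0] hold the 24-bit immediate. A real handler would typically fetch the instruction word from the address R14_svc - 4.

```c
#include <assert.h>
#include <stdint.h>

#define SVC_IMB       0xF00000u
#define SVC_IMB_RANGE 0xF00001u

/* Hypothetical outcome of decoding an SVC request. */
enum svc_action {
    SVC_ACTION_RETURN,  /* nothing more to do: just return to the caller */
    SVC_ACTION_OTHER    /* some other operating system service */
};

/* Extract the 24-bit immediate from an ARM-state SVC instruction word. */
static inline uint32_t svc_immediate(uint32_t instr)
{
    assert((instr & 0x0F000000u) == 0x0F000000u); /* bits [27:24] = 1111 */
    return instr & 0x00FFFFFFu;
}

/* Both IMB services share one action, mirroring the shared handler label. */
enum svc_action svc_dispatch(uint32_t instr)
{
    switch (svc_immediate(instr)) {
    case SVC_IMB:
    case SVC_IMB_RANGE:
        return SVC_ACTION_RETURN;
    default:
        return SVC_ACTION_OTHER;
    }
}
```

For instance, the word 0xEFF00001 is SVC 0xF00001 with the always (AL) condition, and decodes to the same action as 0xEFF00000.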

Note

In new code, you are strongly encouraged to use the IMBRange instruction whenever the changed area of code is small, even if there is no distinction between it and the IMB instruction on ARM1176JZ-S processors. Future processors might implement the IMBRange instruction in a more efficient and faster manner, and code migrated from the ARM1176JZ-S core is likely to benefit when executed on these processors.

ARM1176JZ-S processors implement a Flush Prefetch Buffer operation that is user-accessible and acts as an IMB. For more details see c7, Cache operations on page 3-69.

5.5.1 Execution of IMB instructions

This section comprises three examples that show what can happen during the execution of IMB instructions. The pseudo code in the square brackets shows what happens to execute the IMB (or IMBRange) instruction in the SVC handler.

Example 5-1 shows how code that loads a program from a disk, and then branches to the entry point of that program, must execute an IMB instruction between loading the program and trying to execute it.

Example 5-1 Loading code from disk

IMB     EQU     0xF00000
        .
        .
        ; code that loads program from disk
        .
        .
        SVC     IMB
                [branch to IMB service routine]
                [perform processor-specific operations to execute IMB]
                [return to code]
        .
        .
        MOV     PC, entry_point_of_loaded_program
        .
        .

Compiled BitBlt routines optimize large copy operations by constructing and executing a copying loop that has been optimized for the exact operation wanted. When writing such a routine an IMB is required between the code that constructs the loop and the actual execution of the constructed loop. Example 5-2 shows this.

Example 5-2 Running BitBlt code

IMBRange EQU    0xF00001
        .
        .
        ; code that constructs loop code
        ; load R0 with the start address of the constructed loop
        ; load R1 with the end address of the constructed loop
        SVC     IMBRange
                [branch to IMBRange service routine]
                [read registers R0 and R1 to set up address range parameters]
                [perform processor-specific operations to execute IMBRange]
                [within address range]
                [return to code]
        ; start of loop code
        .
        .

When writing a self-decompressing program, an IMB must be issued after the routine that decompresses the bulk of the code and before the decompressed code starts to be executed. Example 5-3 shows this.

Example 5-3 Self-decompressing code

IMB     EQU     0xF00000
        .
        .
        ; copy and decompress bulk of code
        SVC     IMB
        ; start of decompressed code
        .
        .
        .
