Добавил:
Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:
E77790.pdf
Скачиваний:
2
Добавлен:
16.05.2024
Размер:
1.29 Mб
Скачать

3.4 Options Reference

Sub-option

Meaning

 

 

 

size of the function being inlined into exceeds max_recursive_inst. The

 

settings of these two parameters control the degree of inlining of small

 

recursive functions.

 

 

If -xinline_param=default is specified, the compiler will set all the values of the sub-options to the default values.

If the option is not specified, the default is -xinline_param=default.

The list of values and options accumulate from left to right. So for a specification of - xinline_param=max_inst_hard:30,..,max_inst_hard:50, the value max_inst_hard:50 will be passed to the compiler.

If multiple -xinline_param options are specified on the command line, the list of sub-options likewise accumulate from left to right. For example, the effect of

-xinline_param=max_inst_hard:50,min_counter:70 ...

-xinline_param=max_growth:100,max_inst_hard:100

will be the same as that of

-xinline_param=max_inst_hard:100,min_counter:70,max_growth:100

3.4.134 -xinline_report[=n]

This option generates a report written to standard output on the inlining of functions by the compiler. The type of report depends on the value of n, which must be 0, 1, or 2.

0

No report is generated.

1

A summary report of default values of inlining parameters is generated.

2

A detailed report of inlining messages is generated, showing which

 

callsites are inlined and which are not, with a short reason for not inlining

 

a callsite. In some cases, this report will include suggested values for -

 

xinline_param that can be used to inline a callsite that is not inlined.

When -xinline_report is not specified, the default value for n is 0. When -xinline_report is specified without =n, the default value is 1.

When -xlinkopt is present, the inlining messages about the callsites that are not inlined might not be accurate.

128 Oracle Developer Studio 12.6: Fortran User's Guide • July 2017

3.4 Options Reference

The report is limited to inlining performed by the compiler that is subject to the heuristics controllable by the -xinline_param option. Callsites inlined by the compiler for other reasons may not be reported.

3.4.135 –xinstrument=[%no]datarace

Specify this option to compile and instrument your program for analysis by the Thread Analyzer.

(For information on the Thread Analyzer, see tha(1) for details.)

By compiling with this option you can then use the Performance Analyzer to run the instrumented program with collect -r races to create a data-race-detection experiment. You can run the instrumented code standalone but it runs more slowly.

Specify -xinstrument=no%datarace to turn off this feature. This is the default.

-xinstrument must be specified with an argument.

If you compile and link in separate steps, you must specify -xinstrument=datarace in both the compilation and linking steps.

This option defines the preprocessor token __THA_NOTIFY. You can specify #ifdef __THA_NOTIFY to guard calls to libtha(3) routines.

This option also sets -g.

-xinstrument cannot be used together with -xlinkopt.

3.4.136 –xipo[={0|1|2}]

Perform interprocedural optimizations.

Performs whole-program optimizations by invoking an interprocedural analysis pass. -xipo performs optimizations across all object files in the link step, and is not limited to just the source files on the compile command.

-xipo is particularly useful when compiling and linking large multi-file applications. Object files compiled with this flag have analysis information compiled within them that enables interprocedural analysis across source and pre-compiled program files. However, analysis and

Chapter 3 • Fortran Compiler Options

129

3.4 Options Reference

optimization is limited to the object files compiled with -xipo, and does not extend to object files on libraries.

-xipo=0 disables, and -xipo=1 enables, interprocedural analysis. -xipo=2 adds interprocedural aliasing analysis and memory allocation and layout optimizations to improve cache performance. The default is -xipo=0, and if -xipo is specified without a value, -xipo=1 is used.

When compiling with -xipo=2, there should be no calls from functions or subroutines compiled without -xipo=2 (for example, from libraries) to functions or subroutines compiled with -

xipo=2.

As an example, if you interpose on the function malloc() and compile your own version of malloc() with -xipo=2, all the functions that reference malloc() in any library linked with your code would also have to be compiled with -xipo=2. Since this might not be possible for system libraries, your version of malloc should not be compiled with -xipo=2.

When compiling and linking are performed in separate steps, -xipo must be specified in both steps to be effective.

Example using -xipo in a single compile/link step:

demo% f95 -xipo -xO4 -o prog part1.f part2.f part3.f

The optimizer performs crossfile inlining across all three source files. This is done in the final link step, so the compilation of the source files need not all take place in a single compilation and could be over a number of separate compilations, each specifying -xipo.

Example using -xipo in separate compile/link steps:

demo% f95 -xipo -xO4 -c part1.f part2.f demo% f95 -xipo -xO4 -c part3.f

demo% f95 -xipo -xO4 -o prog part1.o part2.o part3.o

The object files created in the compile steps have additional analysis information compiled within them to permit crossfile optimizations to take place at the link step.

A restriction is that libraries, even if compiled with -xipo do not participate in crossfile interprocedural analysis, as shown in this example:

demo% f95 -xipo -xO4 one.f two.f three.f demo% ar -r mylib.a one.o two.o three.o

...

demo% f95 -xipo -xO4 -o myprog main.f four.f mylib.a

Here interprocedural optimizations will be performed between one.f, two.f and three.f, and between main.f and four.f, but not between main.f or four.f and the routines on mylib.a.

130 Oracle Developer Studio 12.6: Fortran User's Guide • July 2017

3.4 Options Reference

(The first compilation may generate warnings about undefined symbols, but the interprocedural optimizations will be performed because it is a compile and link step.)

Other important information about -xipo:

requires at least optimization level -xO4

Building executables compiled with -xipo using a parallel make tool can cause problems if object files used in the build are common to the link steps running in parallel. Each link step should have its own copy of the object file being optimized prior to linking.

objects compiled without -xipo can be linked freely with objects compiled with -xipo.

The -xipo option generates significantly larger object files due to the additional information needed to perform optimizations across files. However, this additional information does not become part of the final executable binary file. Any increase in the size of the executable program will be due to the additional optimizations performed

If you have .o files compiled with the -xipo option from different compiler versions, mixing these files can result in failure with an error message about "IR version mismatch". When using the -xipo option, all the files should be compiled with the same version of the compiler.

In this release, crossfile subprogram inlining is the only interprocedural optimization performed by -xipo.

.s assembly language files do not participate in interprocedural analysis.

The -xipo flag is ignored if compiling with -S.

When Not To Compile With -xipo:

Working with the set of object files in the link step, the compiler tries to perform wholeprogram analysis and optimizations. For any function or subroutine foo() defined in this set of object files, the compiler makes the following two assumptions:

1.At runtime, foo() will not be called explicitly by another routine defined outside this set of object files, and

2.calls to foo() from any routine in the set of object files will be not be interposed upon by a different version of foo() defined outside this set of object files.

If assumption (1) is not true for the given application, do not compile with -xipo=2.If assumption (2) is not true, do not compile with either -xipo=1 or -xipo=2.

As an example, consider interposing on the function malloc() with your own source version and compiling with -xipo=2. Then all the functions in any library that reference malloc() that are linked with your code would have to also be compiled with -xipo=2 and their object files would need to participate in the link step. Since this might not be possible for system libraries, your version of malloc() should not be compiled with -xipo=2.

Chapter 3 • Fortran Compiler Options

131

Соседние файлы в предмете Информационные и сетевые технологии