For users interested in compiling their code to run reasonably fast, we provide this list of recommended default flags. For users interested in achieving peak performance, a list of tuning flags follows below.
PGI Recommended Default Flags
| Compiler | Flags |
|---|---|
| PGFORTRAN | -fast -Mipa=fast,inline |
| PGCC | -fast -Mipa=fast,inline -Msmartalloc |
| PGC++ | -fast -Mipa=fast,inline -Msmartalloc --zc_eh |
Where:
| Flag | Description |
|---|---|
| -fast | A generally optimal set of options including global optimization, SIMD vectorization, loop unrolling and cache optimizations. |
| -Mipa=fast,inline | Aggressive inter-procedural analysis and optimization, including automatic inlining. |
| -Msmartalloc | Use optimized memory allocation (Linux only). |
| --zc_eh | Generate low-overhead exception regions. |
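As a rough sketch of how these defaults might be used on a command line, the POSIX shell fragment below collects the flags from the table into variables and compiles only when the driver is installed and the source file exists; otherwise it prints the command it would have run. The source names (`solver.f90`, `app.c`, `app.cpp`), the variable names, and the helper `compile_if_present` are assumptions made for illustration.

```shell
#!/bin/sh
# Recommended default flags, copied from the table above.
PGI_FFLAGS="-fast -Mipa=fast,inline"
PGI_CFLAGS="-fast -Mipa=fast,inline -Msmartalloc"
PGI_CXXFLAGS="-fast -Mipa=fast,inline -Msmartalloc --zc_eh"

# compile_if_present DRIVER "FLAGS" SOURCE EXE   (hypothetical helper)
# Compiles only when DRIVER is on the PATH and SOURCE exists;
# otherwise prints the command line instead of running it.
compile_if_present() {
    driver=$1; flags=$2; src=$3; exe=$4
    if command -v "$driver" >/dev/null 2>&1 && [ -f "$src" ]; then
        # $flags is left unquoted on purpose so it splits into separate options.
        "$driver" $flags -o "$exe" "$src"
    else
        printf 'would run: %s %s -o %s %s\n' "$driver" "$flags" "$exe" "$src"
    fi
}

# Placeholder sources; substitute your own files.
compile_if_present pgfortran "$PGI_FFLAGS"   solver.f90 solver
compile_if_present pgcc      "$PGI_CFLAGS"   app.c      app_c
compile_if_present pgc++     "$PGI_CXXFLAGS" app.cpp    app_cxx
```

Keeping the flags in variables makes it easy to append tuning flags later without editing every compile line.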
PGI Tuning Flags
| Flag | Usage |
|---|---|
| -Mconcur | Enable auto-parallelization; for use with multi-core or multi-processor targets. |
| -mp | Enable OpenMP; honor user-inserted parallel programming directives and pragmas. |
| -Mprefetch | Control generation of prefetch instructions to improve memory performance in compute-intensive loops. |
| -Msafeptr | Ignore potential data dependencies between C/C++ pointers. |
| -Mfprelaxed | Relax floating point precision; trade accuracy for speed. |
| -tp x64 | Create a PGI Unified Binary which functions correctly on and is optimized for both Intel and AMD processors. |
| -Mpfi/-Mpfo | Profile Feedback Optimization; requires two compilation passes and an interim execution to generate a profile. |
Please see the PGI Compiler Reference Manual for detailed flag information. Find more specific information for tuning many popular community applications on the Porting & Tuning Guides page.
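The two-pass profile feedback workflow described for -Mpfi/-Mpfo can be sketched as the POSIX shell fragment below. It is only an outline: `app.c`, `app`, and `train.dat` are hypothetical names, combining the passes with `-fast` is an assumption, and the commands are merely echoed unless `pgcc` is installed and the source file exists.

```shell
#!/bin/sh
# Hypothetical names: app.c (source), app (executable), train.dat (training input).
SRC=app.c
EXE=app
INPUT=train.dat

# Echo each command; execute it only if pgcc is installed and the source exists.
run() {
    printf '+ %s\n' "$*"
    if command -v pgcc >/dev/null 2>&1 && [ -f "$SRC" ]; then
        "$@"
    fi
}

pfo_build() {
    # Pass 1: build an instrumented executable.
    run pgcc -fast -Mpfi -o "$EXE" "$SRC" || return 1
    # Interim execution: a representative run generates the profile.
    run "./$EXE" "$INPUT" || return 1
    # Pass 2: recompile, letting the compiler use the collected profile.
    run pgcc -fast -Mpfo -o "$EXE" "$SRC"
}

pfo_build
```

The training input should resemble a typical production run, since the second pass optimizes for the behavior recorded during the interim execution.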
Inline intrinsic functions map to actual x86 or x64 machine instructions. Intrinsics are inserted inline to avoid the overhead of a function call. The compiler has special knowledge of intrinsics, so using them may produce better code than extended inline assembly.
See the PGI Compiler User's Guide, Chapter 14, for information on how to use them.
There are a number of things to be aware of:
There are a couple of work-arounds:
Add a C function that defines and assigns xargv and xargc, and call it from the Fortran main program. For example:
zarg.c:
int xargc;
char **xargv;
extern int __argc_save;
extern char **__argv_save;
void zarg_() { /* call zarg() from fortran */
xargc = __argc_save;
xargv = __argv_save;
}
Recompile the file farg.f with 'pgf77 -c' and add the object to the link before -lmpich. If the second underscore is needed for global names, remember to include the second underscore option. The source for farg.f is:
      integer function mpir_iargc()
      mpir_iargc = iargc()
      return
      end

      subroutine mpir_getarg( i, s )
      integer i
      character*(*) s
      call getarg(i,s)
      return
      end
The environment variable NO_STOP_MESSAGE should be declared, with any value.
If any of the compilers occasionally fail with a SIGSEGV (signal 11), the cause may be your temporary directory.
To see which temporary directory the compiler uses, compile a dummy program with -dryrun:
% pgcc -c -dryrun x.c
Reading rcfile /usr/pgi/linux86/bin/.pgccrc
Reading rcfile /usr/pgi/linux86/bin/linux86rc
Reading rcfile /usr/pgi/linux86/bin/pgcomprc
Reading rcfile /usr/pgi/linux86/bin/.pgirc
/usr/pgi/linux86/bin/pgc x.c -x 122 0x40 -x 119 0x10000 \
-x 123 0x1000 -x 119 0xa00000 -x 127 4 -x 119 0x8000000 -inform warn \
-terse 1 -astype 0 -stdinc /usr/pgi/linux86/include:/usr/local/include: \
/usr/i386-redhat-linux/include: \
/usr/lib/gcc-lib/i386-redhat-linux/egcs-2.91.66/include:/usr/include \
-def unix -def i386 -def linux -def __unix__ -def __inline__= -def __i386__ \
-def __linux__ -def __unix -def __i386 -def __linux -def __NO_MATH_INLINES \
-def linux86 -def unix -predicate #machine(i386) #lint(off) #system(unix) \
#system(posix) #cpu(i386) -opt 1 -x 80 0x300 -y 80 0x1000 -asm /tmp/pgccprl4kS
/usr/bin/as -o x.o /tmp/pgccprl4kS
Unlinking /tmp/pgccprl4kS
^^^^^^^^^^^^^^^
The file /tmp/pgccprl4kS is a temporary file holding the assembler source created by the compiler. Therefore /tmp is the directory used by the compiler. You can change the temporary directory by setting the TMPDIR environment variable to another directory.
Whatever directory you use for temporary files, make sure it has enough free space and that you have permission to write to it. Otherwise you may see errors like the above, or truncated temp files, which lead to very strange behavior.
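A quick way to check the temporary directory before compiling is sketched below in POSIX shell. The function name `check_tmpdir` and the commented-out scratch path are assumptions for illustration; the fallback to /tmp mirrors the -dryrun output shown earlier.

```shell
#!/bin/sh
# check_tmpdir [DIR]   (hypothetical helper)
# Reports whether DIR (default: $TMPDIR, else /tmp) exists and is writable,
# then prints the free space that df reports for it.
check_tmpdir() {
    dir=${1:-${TMPDIR:-/tmp}}
    if [ ! -d "$dir" ]; then
        echo "$dir: not a directory"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "$dir: not writable"
        return 1
    fi
    echo "$dir: ok"
    df -k "$dir"
}

check_tmpdir || echo "fix the temporary directory before compiling"

# To point the compilers at another directory (hypothetical path):
#   export TMPDIR=/scratch/mytmp
```

Under csh, the equivalent of the `export` line would be `setenv TMPDIR /scratch/mytmp`.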
The statements
#include "filename"
include "filename"
are handled differently in pgf77 and pgf90. #include is a preprocessor statement, while the indented 'include' statement is handled by the front end of the compiler.
To handle files with #include statements, either rename the file from 'x.f' to 'x.F', or use the switch -Mpreprocess in your compile line.
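The two options can be sketched as a small POSIX shell helper that prints a suitable compile line for a given file name. The function name `preprocess_cmd` is hypothetical, and the choice of `pgf77 -c` is an assumption; the same pattern would apply to pgf90.

```shell
#!/bin/sh
# preprocess_cmd FILE   (hypothetical helper)
# Prints a compile line that ensures FILE passes through the preprocessor.
preprocess_cmd() {
    case $1 in
        # Upper-case .F: the rename described above is already enough.
        *.F) echo "pgf77 -c $1" ;;
        # Any other suffix: request preprocessing explicitly.
        *)   echo "pgf77 -Mpreprocess -c $1" ;;
    esac
}

preprocess_cmd x.F
preprocess_cmd x.f
```

Renaming is convenient for a single file, while adding -Mpreprocess to the compile line avoids touching file names in an existing build.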
Some users want to keep gcc, and call pgcc as 'cc'. The easiest way to do this is to create a script file named cc and make sure your path is set up to find the cc script file. The script file, using csh syntax, is:
#!/bin/csh
setenv PGI /usr/pgi   # or wherever pgcc is installed
set path=($PGI/linux86/bin $path)
pgcc $argv:q
If the PGI environment variable is already set, then delete the 'setenv' command.
We do not recommend renaming pgcc to cc. There are several changes that are necessary in order for this to work correctly, and each release changes the driver structure.
The SIZE intrinsic returns a result of the same type as the default INTEGER, which is 4 bytes with the 64-bit compilers. If you compile with -i8, the default integer size becomes 8 bytes, and SIZE then returns an 8-byte result.