Optimizing for the Intel® Xeon Phi Coprocessor
Argument | Comment |
-mmic | Builds an application that runs natively on Intel® Xeon Phi coprocessors. Off by default. |
-qopt-streaming-cache-evict=n | Controls whether the compiler generates a cache-line evict (clevict) instruction after a streaming store. n=0: no clevict; n=1: L1 clevict only; n=2: L2 clevict only (default); n=3: both L1 and L2 clevict. |
-qopt-assume-safe-padding | Asserts that the compiler may safely access up to 64 bytes beyond the end of any array or dynamically allocated object that the user program accesses. The user is responsible for providing the padding. Off by default. |
-qopt-threads-per-core=n | Hints to the compiler to optimize for n threads per physical core, where n=1, 2, 3, or 4. |
-qopt-prefetch=n | Selects the level of software prefetching, increasing in aggressiveness from n=0 (no prefetching) to n=4. The default is n=3 at optimization level -O2 or higher. |
-fimf-domain-exclusion=n | Specifies special-case argument domains for which math functions need not conform to the IEEE standard. Each bit of n selects one domain: bit 0: extreme values (e.g., very large, very small, or close to singularities); bit 1: NaNs; bit 2: infinities; bit 3: denormals; bit 4: zeros. |
-qopt-gather-scatter-unroll | Specifies an alternative loop unroll sequence for gather and scatter loops. |
-align array64byte | Seeks to align the start of each array at a memory address divisible by 64, enabling aligned loads and helping vectorization. Fortran only. |