Friday, October 14, 2011

Working with OpenMP and gcc vectorization output

When working with OpenMP for parallelisation, you can set the number of threads in a bash shell (Cygwin in my case) with

export OMP_NUM_THREADS=4

This is useful if you do not want to use all the logical "cores" that are detected automatically on a CPU with hyper-threading.
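As a small sketch of the two ways to set the variable: export it for the whole shell session, or prefix a single command so that only that command sees the value. The child command here is just a stand-in `sh -c` that prints the variable; in practice it would be your own OpenMP program.

```shell
# Set the thread count for the rest of this shell session.
export OMP_NUM_THREADS=4
echo "session value: $OMP_NUM_THREADS"

# Override it for one command only; the session value stays untouched.
# The sh -c call is a placeholder for a real OpenMP binary.
OMP_NUM_THREADS=2 sh -c 'echo "child sees: $OMP_NUM_THREADS"'

# Back in the session, the exported value is still in effect.
echo "session value afterwards: $OMP_NUM_THREADS"
```

The per-command form is handy for quick timing comparisons with different thread counts without changing the environment each time.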

To see which loops gcc has vectorized automatically, add the following compiler switch (the vectorizer itself must also be enabled, for example with -O3 or -ftree-vectorize):

-ftree-vectorizer-verbose=5

Other verbosity levels can also be used, but for me level 5 seems to give the most relevant information.
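A minimal sketch of trying the switch on a tiny test file. The file name vec_demo.c and the function add_arrays are made up for this illustration, and the exact report text depends on the gcc version. Note that newer gcc releases document this switch as deprecated in favour of -fopt-info-vec, so if nothing is printed, that is the option to try instead.

```shell
# Write a small C file with a loop that gcc can usually vectorize.
# vec_demo.c and add_arrays are hypothetical names for this sketch.
cat > vec_demo.c <<'EOF'
/* restrict tells the compiler that a and b do not overlap. */
void add_arrays(float *restrict a, const float *restrict b, int n)
{
    for (int i = 0; i < n; i++)
        a[i] += b[i];
}
EOF

if command -v gcc >/dev/null 2>&1; then
    # -O3 enables the vectorizer; the verbose switch makes it report
    # its decisions on stderr. On newer gcc, use -fopt-info-vec instead.
    gcc -std=c99 -O3 -ftree-vectorizer-verbose=5 -c vec_demo.c -o vec_demo.o \
        || echo "gcc rejected the flags; try -fopt-info-vec" >&2
else
    echo "gcc not found; skipping the compile step" >&2
fi
```

Compiling with -c keeps the experiment short, since only the vectorizer report matters here and no executable is needed.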