Appendix A. Improving PCA Performance

This appendix helps you improve PCA's performance for a particular application. Table A-1 lists common goals and suggests actions for achieving each one.

Table A-1. Improving PCA Performance

Goal: Recognize reductions and recurrences as safe to run in parallel.
Action: Turn on the roundoff option.

Goal: Convert more loops to run in parallel.
Action: Turn up the optimize option. Use the arl option. Turn on the roundoff option. Use directives.

Goal: Prevent PCA from converting to parallel execution a large number of inner loops that contain a small number of iterations.
Action: Use machine=o.

Goal: Eliminate dusty-deck transformations.
Action: Turn down the scalaropt option.

Goal: Create a more informative listing.
Action: Use -lo=ls or other listing options under the listoptions command-line option. (See the list option description for how to get a listing file.)

Goal: Force PCA to ignore assumed data dependences and convert the loop to run in parallel.
Action: Use #pragma concurrent.

Goal: Allow PCA to convert loops to run in parallel even though those loops contain function calls.
Action: Enable inlining or interprocedural analysis, or use the no side effects or concurrent call directives.
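
The first goal in Table A-1 pairs reductions with the roundoff option. A likely reason is that floating-point addition is not associative, so a sum computed in per-processor chunks can differ slightly from the same sum computed by the serial loop. The sketch below is plain C and does not invoke PCA. The names sum_serial and sum_chunked and the chunks parameter are hypothetical, and the chunked version only imitates how a parallel reduction might regroup the additions.

```c
#include <stddef.h>

/* Serial reduction: additions happen strictly left to right. */
double sum_serial(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/*
 * Hypothetical stand-in for a parallelized reduction: the array is
 * split into `chunks` contiguous slices, each slice is summed into its
 * own partial result, and the partial results are then added together.
 * The work still runs on one thread here; only the grouping of the
 * additions changes.
 */
double sum_chunked(const double *a, size_t n, size_t chunks)
{
    double total = 0.0;
    size_t step;

    if (chunks == 0)
        chunks = 1;
    step = (n + chunks - 1) / chunks;   /* slice length, rounded up */

    for (size_t c = 0; c < chunks; c++) {
        size_t lo = c * step;
        size_t hi = (lo + step < n) ? lo + step : n;
        double partial = 0.0;

        for (size_t i = lo; i < hi; i++)
            partial += a[i];
        total += partial;
    }
    return total;
}
```

For small integer-valued inputs the two functions agree. For an input such as {1e16, 1.0, -1e16, 1.0} split into two chunks, the results can differ on IEEE 754 double-precision hardware, because each 1.0 is absorbed by a different large term depending on the grouping. Accepting that kind of difference is the trade-off involved in letting a reduction run in parallel.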

PCA is a tool to optimize C code and, as with any tool, it performs best when you are familiar with its features and the details of how it works. The PCA default settings can usually improve the performance of your code significantly. However, you can sometimes get larger performance improvements if you know when to use directives and alternate option settings.
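
As one illustration of when a directive can help, consider a loop that writes through an index array. An analyzer generally cannot prove that the indices are distinct, so it has to assume a data dependence between iterations. The sketch below shows how #pragma concurrent from Table A-1 might be applied in that case. The function name scatter_add is hypothetical, and placing the pragma immediately before the loop is an assumption about the directive's syntax. Compilers that do not recognize the pragma ignore it, possibly with a warning.

```c
#include <stddef.h>

/*
 * Adds src[i] into dst[idx[i]] for each i.
 *
 * The caller promises that idx contains no duplicate values, so no two
 * iterations touch the same element of dst. The analyzer cannot verify
 * that promise on its own, which is why the directive is used.
 */
void scatter_add(double *dst, const double *src, const int *idx, size_t n)
{
    /* Assumed placement: the directive applies to the loop that follows. */
#pragma concurrent
    for (size_t i = 0; i < n; i++)
        dst[idx[i]] += src[i];
}
```

If idx does contain duplicates, the promise behind the directive is false, and a parallel version of the loop could produce wrong results. Use the directive only when you know the assumed dependence cannot actually occur.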